
String Split and Join With Escaping

Anders Abel talks escape characters and regular expressions.


.NET offers the simple string.Split() and string.Join() methods for splitting and joining separated strings. But what if there is no suitable separator character, that is, no character guaranteed never to occur in the strings themselves? Then the separator character must be escaped. And then the escape character must be escaped too… And this turns out to be quite an interesting algorithm to write.
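To make the problem concrete, here is a minimal sketch of the naive round trip. The class and method names (NaiveRoundTrip, RoundTrip) are made up for this illustration; only string.Join() and string.Split() are real framework methods.

```csharp
using System;

// Hypothetical demo class: shows what goes wrong without escaping.
public static class NaiveRoundTrip
{
    // Joins the items with ',' and immediately splits the result again.
    public static string[] RoundTrip(params string[] items)
    {
        return string.Join(",", items).Split(',');
    }

    public static void Main()
    {
        // Two strings go in, but the comma inside the first one is
        // indistinguishable from the separator once joined.
        string[] result = RoundTrip("a,b", "c");
        Console.WriteLine(result.Length); // prints 3
    }
}
```

Two strings were joined, but three come back, because the joined text "a,b,c" no longer records which comma was data and which was a separator.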

I thought that this functionality would be built in, but as far as I can tell it isn’t. If there is a built-in way, please leave a comment to educate me. Since this is string manipulation, regular expressions would be an option too, but…

Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems.

Jamie Zawinski

Solving this through a Regular Expression would require some black magic double look-behind assertion which I wouldn’t understand even when I wrote the code, much less later when I came back to fix some bug. So I went for implementing it myself.


Design Considerations

This is just a small helper that I’m writing as part of a bigger project. It is not performance critical, so I haven’t spent any time optimizing it. But I did think a bit about the performance implications. One approach that I ruled out immediately was to build up the split strings character by character while looping through the input string. It would make the implementation quite easy to follow, but it would allocate a new string for each character checked in the source string. That is a bit too much pressure on the garbage collector for my taste.

So I went for an iterative approach where I loop through the string, keeping track of where the current segment started and checking whether the end of the segment has been found. I think that the resulting code is fairly readable, but because of some edge cases it is more complex and has more quirks than I first imagined. With , as the delimiter and / as the escape character, consider the following escaped and joined strings:

  • aa,bb,cc
  • ,aa,,bb,
  • a/,b//c,/,,//,
  • a/,

Strings can be empty – even the final one. They can end with an escape sequence. And an escaped escape character can precede a delimiter where the string should be split.
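To show what these four inputs should split into, here is a small self-contained sketch. It is not the implementation presented in this article: it is a deliberately simple character-by-character variant that collects each segment in a StringBuilder, and the names EdgeCaseDemo and SplitEscaped are made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical demo class, not the implementation presented in this article.
public static class EdgeCaseDemo
{
    // Splits on ',' and treats '/' as the escape character, copying
    // characters one at a time into a reusable StringBuilder.
    public static List<string> SplitEscaped(string source)
    {
        var result = new List<string>();
        var current = new StringBuilder();

        for (int i = 0; i < source.Length; i++)
        {
            char c = source[i];
            if (c == '/' && i + 1 < source.Length)
            {
                // Escape sequence: keep the next character literally.
                current.Append(source[++i]);
            }
            else if (c == ',')
            {
                // Unescaped delimiter: the current segment is complete.
                result.Add(current.ToString());
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }

        // The last segment has no trailing delimiter; it may be empty.
        result.Add(current.ToString());
        return result;
    }

    public static void Main()
    {
        foreach (var input in new[] { "aa,bb,cc", ",aa,,bb,", "a/,b//c,/,,//,", "a/," })
        {
            Console.WriteLine("[" + string.Join("] [", SplitEscaped(input)) + "]");
        }
        // Prints:
        // [aa] [bb] [cc]
        // [] [aa] [] [bb] []
        // [a,b/c] [,] [/] []
        // [a,]
    }
}
```

Note that the third input ends in an escaped escape character followed by a real delimiter, so it yields a final empty string, while the fourth ends in an escaped delimiter and yields a single string.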

The Code

using System.Collections.Generic;
using System.Linq;

/// <summary>
/// Helpers for delimited strings, with support for escaping the delimiter
/// character.
/// </summary>
public static class DelimitedString
{
  const string DelimiterString = ",";
  const char DelimiterChar = ',';

  // Use a single / as escape char, avoid \ as that would require
  // all escape chars to be escaped in the source code...
  const char EscapeChar = '/';
  const string EscapeString = "/";

  /// <summary>
  /// Join strings with a delimiter and escape any occurrence of the
  /// delimiter and the escape character in the string.
  /// </summary>
  /// <param name="strings">Strings to join</param>
  /// <returns>Joined string</returns>
  public static string Join(params string[] strings)
  {
    // Escape the escape character first, so that the escape characters
    // added in front of delimiters are not doubled afterwards.
    return string.Join(
      DelimiterString,
      strings.Select(
        s => s
        .Replace(EscapeString, EscapeString + EscapeString)
        .Replace(DelimiterString, EscapeString + DelimiterString)));
  }

  /// <summary>
  /// Split delimited strings, respecting if the delimiter
  /// character is escaped.
  /// </summary>
  /// <param name="source">Joined string from <see cref="Join(string[])"/></param>
  /// <returns>Unescaped, split strings</returns>
  public static string[] Split(string source)
  {
    var result = new List<string>();

    int segmentStart = 0;
    for (int i = 0; i < source.Length; i++)
    {
      bool readEscapeChar = false;
      if (source[i] == EscapeChar)
      {
        // Skip past the escaped character so that it is never
        // interpreted as a delimiter or as another escape character.
        readEscapeChar = true;
        i++;
      }

      if (!readEscapeChar && source[i] == DelimiterChar)
      {
        result.Add(UnEscapeString(
          source.Substring(segmentStart, i - segmentStart)));
        segmentStart = i + 1;
      }

      if (i == source.Length - 1)
      {
        // End of input: emit the final segment, which may be empty.
        result.Add(UnEscapeString(source.Substring(segmentStart)));
      }
    }

    return result.ToArray();
  }

  static string UnEscapeString(string src)
  {
    return src.Replace(EscapeString + DelimiterString, DelimiterString)
      .Replace(EscapeString + EscapeString, EscapeString);
  }
}

The code is part of Kentor.AuthServices and also available at GitHub. The code is covered by tests.
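One detail of Join that is easy to get wrong is the order of the two Replace calls. The sketch below compares the order used in the listing with the reversed order; the class and method names (EscapeOrderDemo, EscapeCorrect, EscapeReversed) are made up for the illustration.

```csharp
using System;

// Hypothetical demo class illustrating why the order of the Replace calls matters.
public static class EscapeOrderDemo
{
    // Same order as Join in the article: escape character first, then delimiter.
    public static string EscapeCorrect(string s)
    {
        return s.Replace("/", "//").Replace(",", "/,");
    }

    // Reversed order: the '/' added in front of each ',' gets doubled afterwards.
    public static string EscapeReversed(string s)
    {
        return s.Replace(",", "/,").Replace("/", "//");
    }

    public static void Main()
    {
        Console.WriteLine(EscapeCorrect("a,b"));  // prints a/,b
        Console.WriteLine(EscapeReversed("a,b")); // prints a//,b
    }
}
```

With the reversed order, the output a//,b reads back as an escaped escape character followed by a real delimiter, so a single input string would be split into "a/" and "b".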


Published at DZone with permission of Anders Abel, DZone MVB. See the original article here.

