Java split is eating my characters

Use zero-width matching assertions:

    String str = "la$le\\$li$lo";
    System.out.println(java.util.Arrays.toString(
        str.split("(?<!\\\\)\\$")
    )); // prints "[la, le\$li, lo]"

The regex is essentially

(?<!\\)\$

It uses negative lookbehind to assert that there is not a preceding \.

See also


More examples of splitting on assertions

Simple sentence splitting, keeping punctuation marks:

    String str = "Really?Wow!This.Is.Awesome!";
    System.out.println(java.util.Arrays.toString(
        str.split("(?<=[.!?])")
    )); // prints "[Really?, Wow!, This., Is., Awesome!]"

Splitting a long string into fixed-length parts, using \G

    String str = "012345678901234567890";
    System.out.println(java.util.Arrays.toString(
        str.split("(?<=\\G.{4})")
    )); // prints "[0123, 4567, 8901, 2345, 6789, 0]"

Using a lookbehind/lookahead combo:

    String str = "HelloThereHowAreYou";
    System.out.println(java.util.Arrays.toString(
        str.split("(?<=[a-z])(?=[A-Z])")
    )); // prints "[Hello, There, How, Are, You]"

Related questions

Leave a Comment