Does MySQL REGEXP support Unicode matching?

  1. Does anyone know if MySQL’s REGEXP supports Unicode? I’ve been doing some research, and the majority of blogs etc. seem to indicate that there is a problem or that it’s not supported.

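    As documented under Regular Expressions (the warning applies to MySQL versions before 8.0.4, which moved REGEXP to the ICU library with Unicode support):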

    Warning

    The REGEXP and RLIKE operators work in byte-wise fashion, so they are not multi-byte safe and may produce unexpected results with multi-byte character sets. In addition, these operators compare characters by their byte values and accented characters may not compare as equal even if a given collation treats them as equal.
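
    To make the byte-wise behavior concrete, here is a small sketch in Python. It is only an analogy, not MySQL's own engine: it assumes the text is stored as UTF-8 and compares matching the same pattern against characters and against raw bytes.

```python
import re

text = "é"                   # one character
raw = text.encode("utf-8")   # two bytes: 0xC3 0xA9

# Character-wise matching: "." consumes the whole character.
char_match = re.fullmatch(r".", text) is not None

# Byte-wise matching: "." consumes a single byte, so one "." is not enough.
byte_match = re.fullmatch(rb".", raw) is not None
two_byte_match = re.fullmatch(rb"..", raw) is not None

# Byte values are compared directly, so "e" never matches inside "é".
plain_e_found = re.search(rb"e", raw) is not None

print(len(text), len(raw))                     # 1 2
print(char_match, byte_match, two_byte_match)  # True False True
print(plain_e_found)                           # False
```

    A byte-wise engine therefore miscounts multi-byte characters and does not treat accented and unaccented letters as equal, which is what the warning describes.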

  2. I’m wondering, then, whether it is best to use LIKE for Unicode pattern matching and REGEXP for more advanced, ASCII-only pattern matching?

    Yes, that would be best.

  3. I like the idea of being able to search for matches at the beginning or end of a string, but if REGEXP doesn’t support Unicode, then this could be difficult when my text is Unicode.

    One can do that with LIKE too:

    WHERE foo LIKE 'bar%'
    

    And:

    WHERE foo LIKE '%bar'
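
    If the search term comes from user input, any % or _ in it would act as a LIKE wildcard. The following is a minimal sketch of building the two patterns above on the application side. The helper names escape_like, prefix_pattern and suffix_pattern are hypothetical, and the sketch assumes the default LIKE escape character (backslash).

```python
def escape_like(term: str, escape: str = "\\") -> str:
    """Escape LIKE wildcards so they match literally (hypothetical helper)."""
    # Escape the escape character first so later replacements are not doubled.
    for ch in (escape, "%", "_"):
        term = term.replace(ch, escape + ch)
    return term


def prefix_pattern(term: str) -> str:
    """Pattern for values that start with term, as in LIKE 'bar%'."""
    return escape_like(term) + "%"


def suffix_pattern(term: str) -> str:
    """Pattern for values that end with term, as in LIKE '%bar'."""
    return "%" + escape_like(term)


print(prefix_pattern("bar"))      # bar%
print(suffix_pattern("50%_off"))  # %50\%\_off
```

    The resulting string is meant to be passed as a bound parameter to a query such as WHERE foo LIKE ?, rather than concatenated into the SQL text.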
    
