Java regex to support Unicode?
What you are looking for are Unicode properties, e.g. \p{L} matches any kind of letter from any language. So a regex to match such a Chinese word could be something like \p{L}+. There are many such properties; for more details see regular-expressions.info. Another option is to use the modifier Pattern.UNICODE_CHARACTER_CLASS. In Java 7 there is …
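As a rough illustration of the two options mentioned above, here is a minimal sketch. The class name UnicodeRegexDemo and its helper methods are made up for this example, and the sample strings are written as Unicode escapes so the file compiles regardless of source encoding. It assumes Java 7 or later, since Pattern.UNICODE_CHARACTER_CLASS is not available before that.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are illustrative only.
class UnicodeRegexDemo {
    // \p{L}+ : one or more letters from any script (Unicode "Letter" category).
    private static final Pattern LETTERS = Pattern.compile("\\p{L}+");

    // With UNICODE_CHARACTER_CLASS, predefined classes such as \w
    // follow the Unicode definitions instead of ASCII only.
    private static final Pattern UNICODE_WORD =
            Pattern.compile("\\w+", Pattern.UNICODE_CHARACTER_CLASS);

    // Default behaviour for comparison: \w is [a-zA-Z_0-9].
    private static final Pattern ASCII_WORD = Pattern.compile("\\w+");

    static boolean isLetters(String s) {
        return LETTERS.matcher(s).matches();
    }

    static boolean isUnicodeWord(String s) {
        return UNICODE_WORD.matcher(s).matches();
    }

    static boolean isAsciiWord(String s) {
        return ASCII_WORD.matcher(s).matches();
    }

    public static void main(String[] args) {
        String chinese = "\u4e2d\u6587"; // two Chinese characters
        System.out.println(isLetters(chinese));     // letters from any script
        System.out.println(isLetters("abc123"));    // digits are not \p{L}
        System.out.println(isUnicodeWord(chinese)); // \w with the Unicode flag
        System.out.println(isAsciiWord(chinese));   // \w without the flag
    }
}
```

Note that \p{L} accepts letters from every script, not only Chinese, so it is the broader of the two choices when the input may mix languages.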