Check out Unicode character properties: http://www.regular-expressions.info/unicode.html#prop. I think what you are looking for is probably
\p{L}
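which matches any single code point in the Letter category, including ideographs. An accented letter is sometimes encoded as a base letter followed by one or more combining marks, so to match those as well you could use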
\p{L}\p{M}*
In any case, all the different types of character properties are detailed in the first link.
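To make the difference between \p{L} and \p{L}\p{M}* concrete, here is a minimal Python sketch. Python's built-in re module does not support \p{...} (the third-party regex module does), so the sketch avoids regular expressions and emulates the pattern with the standard unicodedata module. The helper name letter_clusters is made up for illustration.

```python
import unicodedata


def letter_clusters(text):
    # Hypothetical helper: emulates the pattern \p{L}\p{M}* without a regex engine.
    clusters = []
    i = 0
    while i < len(text):
        # General categories starting with "L" are letters (Lu, Ll, Lt, Lm, Lo).
        if unicodedata.category(text[i]).startswith("L"):
            j = i + 1
            # Categories starting with "M" are marks (Mn, Mc, Me).
            while j < len(text) and unicodedata.category(text[j]).startswith("M"):
                j += 1
            clusters.append(text[i:j])
            i = j
        else:
            # Skip anything that is not a letter (digits, spaces, stray marks).
            i += 1
    return clusters


precomposed = "caf\u00e9"   # "é" stored as one code point (category Ll)
decomposed = "cafe\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT (Mn)

print(letter_clusters(precomposed))
print(letter_clusters(decomposed))
```

Both strings yield four clusters. In the decomposed form, the last cluster contains two code points, the base letter and its combining mark, which \p{L} on its own would split apart.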
Edit: You may also want to look at the Stack Overflow answer to "Does \w match all alphanumeric characters defined in the Unicode standard?", which discusses whether \w matches Unicode characters. It suggests that you could also use \p{Word} or \p{Alnum}, in regex flavors that support them.
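Whether \w covers non-ASCII letters depends on the regex engine, so it is worth testing in yours. As one data point, and not a statement about other flavors, this short sketch shows how Python 3's built-in re module behaves: \w is Unicode-aware for str patterns by default, and re.ASCII restricts it to [a-zA-Z0-9_].

```python
import re

text = "na\u00efve caf\u00e9 \u65e5\u672c\u8a9e"  # "naïve café 日本語"

# Default for str patterns in Python 3: \w matches Unicode word characters.
unicode_words = re.findall(r"\w+", text)

# With re.ASCII, \w only matches [a-zA-Z0-9_], so accented words are split.
ascii_words = re.findall(r"\w+", text, flags=re.ASCII)

# A combining mark (category Mn) is not a word character in Python's re,
# so a decomposed "é" loses its accent in the match.
decomposed_words = re.findall(r"\w+", "cafe\u0301")

print(unicode_words)
print(ascii_words)
print(decomposed_words)
```

The last case is the reason the \p{L}\p{M}* form can be preferable when the input may contain decomposed characters.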