Why does Java permit escaped unicode characters in the source code?

Unicode escape sequences allow you to store and transmit your source code in pure ASCII and still use the entire range of Unicode characters. This has two advantages:

  • No risk of non-ASCII characters getting broken by tools that can’t handle them. This was a real concern back in the early 1990s when Java was designed. Sending an email containing non-ASCII characters and having it arrive unmangled was the exception rather than the norm.

  • No need to tell the compiler and editor/IDE which encoding to use for interpreting the source code. This is still a very valid concern. Of course, a much better solution would have been to have the encoding as metadata in a file header (as in XML), but this hadn’t yet emerged as a best practice back then.

The first variant makes sense to me: it allows programmers to name variables and methods in an international language of their choice. However, I don’t see any practical application of the second approach.

Both will result in exactly the same byte code and have the same power as a language feature. The only difference is in the source code.
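
As a minimal sketch of that equivalence (the class and method names are made up for this example): the compiler translates Unicode escapes before it tokenizes the source, so an identifier or string spelled with an escape is the same thing as one spelled with the character itself.

```java
public class EscapeEquivalence {
    // The second statement spells the first letter of the identifier with an
    // escape for 'a'. After translation it is the same variable as "abc".
    static int increment() {
        int abc = 41;
        \u0061bc++;
        return abc;
    }

    // An escaped e-acute compared against the same character built from its
    // code point (0xE9), so this file itself stays pure ASCII.
    static boolean sameString() {
        String escaped = "caf\u00e9";
        String built = "caf" + (char) 0xE9;
        return escaped.equals(built) && escaped.length() == 4;
    }

    public static void main(String[] args) {
        System.out.println(increment());   // prints 42
        System.out.println(sameString());  // prints true
    }
}
```

Note that the file compiles the same way regardless of the encoding the compiler assumes, since it contains nothing but ASCII.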

First, a bad programmer could use it to secretly comment out bits of code, or create multiple ways of identifying the same variable.

If you’re concerned about a programmer deliberately sabotaging your code’s readability, this language feature is the least of your problems.
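
For reference, the kind of trick the question alludes to might look like the sketch below (hypothetical class and method names). In the first method, `\u002f\u002f` is translated to `//`, so the assignment is silently commented out. In the second, `\u000a` is translated to a line feed, which ends the comment early and lets the rest of the line run as code.

```java
public class HiddenComment {
    // The escaped pair below becomes two slashes, so the assignment
    // after it is part of a line comment and never runs.
    static int commentedOut() {
        int x = 1;
        \u002f\u002f x = 2;
        return x;
    }

    // The escape inside the comment becomes a line terminator, so the
    // assignment that follows it is real code and does run.
    static int sneakedIn() {
        int y = 1;
        // harmless-looking note \u000a y = 2;
        return y;
    }

    public static void main(String[] args) {
        System.out.println(commentedOut()); // prints 1
        System.out.println(sneakedIn());    // prints 2
    }
}
```

Both are easy to spot in a code review once you know to look for a backslash followed by `u`.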

Second, there seems to be a lack of support among IDEs.

That’s hardly the fault of the feature or its designers. But then, I don’t think it was ever intended to be used “manually”. Ideally, the IDE would have an option to let you enter the characters normally and have them displayed normally, but automatically save them as Unicode escape sequences. There may even already be plugins or configuration options that make IDEs behave that way.
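
What such a save step would have to do is simple enough. Here is a rough sketch, assuming a hypothetical helper named `toAscii` (this is not an existing IDE or JDK API, although the `native2ascii` tool that shipped with older JDKs did something comparable). It ignores corner cases such as backslashes already present in the source.

```java
public class AsciiEscaper {
    // Hypothetical helper: copies ASCII characters unchanged and replaces
    // every other UTF-16 code unit with a backslash, the letter u and
    // four lowercase hex digits.
    static String toAscii(String source) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            if (c < 128) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The e-acute is built from its code point to keep this file ASCII.
        String original = "String s = \"caf" + (char) 0xE9 + "\";";
        System.out.println(toAscii(original));
    }
}
```

Because the loop works on UTF-16 code units, a character outside the Basic Multilingual Plane comes out as two escapes (a surrogate pair), which is also how Java source expects it.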

In general, though, this feature seems to be very rarely used and is probably badly supported for that reason. But how could the people who designed Java around 1993 have known that?
