Unicode characters in servlet application are shown as question marks

Seeing ?????? instead of intelligible characters (or even instead of Mojibake) usually indicates that the component responsible for the data transfer is itself well aware of the encoding used in both the source and the destination: it knows that a character cannot be represented in the target encoding and substitutes a question mark. In the average web application there are only two places where this is the case: the point where data is transferred to/from the DB by JDBC, and the point where data is written to the HTTP response by response.getWriter() (as implicitly used by JSP).
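The difference between question marks and Mojibake can be reproduced with plain strings. The following is only a minimal sketch; the class and method names are made up for illustration and do not come from any servlet API. Encoding into a charset that lacks the character yields question marks, whereas decoding correct bytes with the wrong charset yields Mojibake.

```java
import java.nio.charset.StandardCharsets;

public class QuestionMarkVsMojibake {

    // Encoding into a charset that lacks the character: the encoder knows it
    // cannot represent it and substitutes '?'. The information is lost.
    static String encodeLossy(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.ISO_8859_1);
    }

    // Decoding UTF-8 bytes with the wrong charset: no byte is lost, but each
    // byte is misread as a separate character (Mojibake).
    static String decodeWrong(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Cyrillic "Privet", written with Unicode escapes so the example does
        // not depend on the encoding of the source file itself.
        String cyrillic = "\u041f\u0440\u0438\u0432\u0435\u0442";
        System.out.println(encodeLossy(cyrillic));          // ??????
        System.out.println(decodeWrong(cyrillic).length()); // 12
    }
}
```

Six Cyrillic characters become six question marks in the first case, while in the second case they become twelve unrelated characters, one per UTF-8 byte.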

In your particular case with properties files, there’s no DB involved, so the HTTP response remains the main suspect. This problem can happen when the server hasn’t been instructed to use UTF-8 to encode the characters which are being written to the HTTP response, and instead falls back to some default encoding, most commonly ISO-8859-1. This way any character in the source which is not covered by ISO-8859-1 will be turned into a question mark. As ISO-8859-1 covers only Latin (Western European) characters, this will affect all non-Latin characters such as Chinese, Japanese, Arabic, Cyrillic, Hebrew, Sanskrit, etc. They would all be written as question marks.
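To see this effect in isolation, here is a rough sketch that imitates a response writer with plain java.io classes. The write() helper is a hypothetical stand-in for response.getWriter(), not the real servlet API, and it assumes the client decodes the bytes with the same charset the server used.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ResponseWriterSketch {

    // Hypothetical stand-in for response.getWriter(): writes the text through
    // a writer bound to the given charset and returns the raw bytes "sent".
    static byte[] write(String text, Charset responseCharset) {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        try (PrintWriter writer =
                 new PrintWriter(new OutputStreamWriter(body, responseCharset))) {
            writer.print(text);
        } // closing the writer flushes the encoded bytes into 'body'
        return body.toByteArray();
    }

    // What a client sees when it decodes the bytes with the same charset.
    static String roundTrip(String text, Charset charset) {
        return new String(write(text, charset), charset);
    }

    public static void main(String[] args) {
        // "caf" + e-acute, a space, then two Japanese characters.
        String text = "caf\u00e9 \u65e5\u672c";
        // The accented Latin letter survives ISO-8859-1, the Japanese does not.
        System.out.println(roundTrip(text, StandardCharsets.ISO_8859_1));
        // With UTF-8 the text comes back unchanged.
        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text));
    }
}
```

The Latin letter with an accent passes through the ISO-8859-1 writer intact, while each Japanese character is replaced by a single question mark before the bytes ever leave the server. No client-side setting can recover them afterwards.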

This can be fixed on a per-JSP basis by adding the following to the very top of the JSP:

<%@page pageEncoding="UTF-8" %>

(note that you really need to put this in every JSP, including the include files!)

Or, better, fix it on an application-wide basis by adding the following entry to the webapp’s web.xml:

<jsp-config>
    <jsp-property-group>
        <url-pattern>*.jsp</url-pattern>
        <page-encoding>UTF-8</page-encoding>
    </jsp-property-group>
</jsp-config>
