HTTP URL Address Encoding in Java

The java.net.URI class can help; in the documentation of URL you find

Note, the URI class does perform escaping of its component fields in certain circumstances. The recommended way to manage the encoding and decoding of URLs is to use an URI

Use one of the constructors with more than one argument, like:

URI uri = new URI(
    "http", 
    "search.barnesandnoble.com", 
    "/booksearch/first book.pdf",
    null);
URL url = uri.toURL();
//or String request = uri.toString();

(the single-argument constructor of URI does NOT escape illegal characters)


Only illegal characters get escaped by above code – it does NOT escape non-ASCII characters (see fatih’s comment).
The toASCIIString method can be used to get a String only with US-ASCII characters:

URI uri = new URI(
    "http", 
    "search.barnesandnoble.com", 
    "/booksearch/é",
    null);
String request = uri.toASCIIString();

For an URL with a query like http://www.google.com/ig/api?weather=São Paulo, use the 5-parameter version of the constructor:

URI uri = new URI(
        "http", 
        "www.google.com", 
        "/ig/api",
        "weather=São Paulo",
        null);
String request = uri.toASCIIString();

Leave a Comment