How to detect the presence of a URL in a string

Use java.net.URL for that!

Why not use the core Java class for this, java.net.URL, and let it validate the URL?

The following code violates the golden principle “Use exceptions only for exceptional conditions”, but it makes no sense to me to reinvent the wheel for something that is very mature on the Java platform.

Here’s the code:

import java.net.URL;
import java.net.MalformedURLException;

// Replaces each URL in the input with an HTML anchor tag.
public class URLInString {
    public static void main(String[] args) {
        String s = args[0];
        // Split the input on whitespace (URLs don't contain spaces).
        String[] parts = s.split("\\s+");

        // Attempt to convert each item into a URL.
        for (String item : parts) {
            try {
                URL url = new URL(item);
                // It parsed as a URL, so wrap it in an anchor.
                System.out.print("<a href=\"" + url + "\">" + url + "</a> ");
            } catch (MalformedURLException e) {
                // Not a URL: print the item unchanged.
                System.out.print(item + " ");
            }
        }

        System.out.println();
    }
}

Using the following input:

"Please go to http://stackoverflow.com and then mailto:[email protected] to download a file from    ftp://user:pass@someserver/someFile.txt"

Produces the following output:

Please go to <a href="http://stackoverflow.com">http://stackoverflow.com</a> and then <a href="mailto:[email protected]">mailto:[email protected]</a> to download a file from <a href="ftp://user:pass@someserver/someFile.txt">ftp://user:pass@someserver/someFile.txt</a>

Of course, different protocols can be handled in different ways.
You can get all the details through the getters of the URL class, for instance

 url.getProtocol();

or the other components: host, port, file, query, ref, and so on.

http://java.sun.com/javase/6/docs/api/java/net/URL.html
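
To make the getters concrete, here is a minimal sketch that prints the main components of a URL. The class name UrlParts and the method describe are hypothetical names chosen for this illustration; only the java.net.URL getters themselves come from the JDK.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper class; the names UrlParts and describe are illustrative only.
class UrlParts {
    // Returns the main components of a URL as one readable line.
    static String describe(String spec) {
        try {
            URL url = new URL(spec);
            return "protocol=" + url.getProtocol()
                    + " userInfo=" + url.getUserInfo() // null when absent
                    + " host=" + url.getHost()
                    + " port=" + url.getPort()         // -1 when no port is given
                    + " path=" + url.getPath()
                    + " query=" + url.getQuery()       // null when absent
                    + " ref=" + url.getRef();          // null when absent
        } catch (MalformedURLException e) {
            // Wrapped in an unchecked exception to keep the sketch short.
            throw new IllegalArgumentException("Not a URL: " + spec, e);
        }
    }

    public static void main(String[] args) {
        // prints: protocol=ftp userInfo=user:pass host=someserver port=-1 path=/someFile.txt query=null ref=null
        System.out.println(describe("ftp://user:pass@someserver/someFile.txt"));
        // prints: protocol=http userInfo=null host=stackoverflow.com port=-1 path=/questions query=tab=newest ref=top
        System.out.println(describe("http://stackoverflow.com/questions?tab=newest#top"));
    }
}
```

With these values in hand, you could, for example, branch on getProtocol() to render mailto links differently from ftp links.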

This handles every protocol the Java platform is aware of. As an extra benefit, if a protocol that Java does not currently recognize is later added to the URL class through a library update, you get it transparently!
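
The flip side is that a token with no protocol, or with a protocol the platform has no handler for, is rejected. A small sketch of that check follows; UrlCheck and isUrl are hypothetical names for this illustration, and "foo" is a made-up protocol assumed not to be registered. Note that on recent JDKs the URL(String) constructor is marked deprecated, although it still works.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper; UrlCheck and isUrl are illustrative names.
class UrlCheck {
    // True when java.net.URL accepts the token, false otherwise.
    static boolean isUrl(String token) {
        try {
            new URL(token);
            return true;
        } catch (MalformedURLException e) {
            // No protocol at all, or a protocol without a registered handler.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isUrl("http://stackoverflow.com")); // true
        System.out.println(isUrl("stackoverflow.com"));        // false: no protocol
        System.out.println(isUrl("foo://example"));            // false: unknown protocol
    }
}
```

This also shows a limitation of the approach: bare host names such as stackoverflow.com are not detected unless they carry a protocol prefix.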
