A single regex to parse and breakup a
full URL including query parameters
and anchors e.g.https://www.google.com/dir/1/2/search.html?arg=0-a&arg1=1-b&arg3-c#hash
^((http[s]?|ftp):\/)?\/?([^:\/\s]+)((\/\w+)*\/)([\w\-\.]+[^#?\s]+)(.*)?(#[\w\-]+)?$
RexEx positions:
url: RegExp[‘$&’],
protocol:RegExp.$2,
host:RegExp.$3,
path:RegExp.$4,
file:RegExp.$6,
query:RegExp.$7,
hash:RegExp.$8
you could then further parse the host (‘.’ delimited) quite easily.
What I would do is use something like this:
/*
^(.*:)//([A-Za-z0-9\-\.]+)(:[0-9]+)?(.*)$
*/
proto $1
host $2
port $3
the-rest $4
the further parse ‘the rest’ to be as specific as possible. Doing it in one regex is, well, a bit crazy.