decodeURIComponent vs unescape, what is wrong with unescape?

What I want to know is: what is wrong with escape/unescape?

They’re not “wrong” as such, they’re just their own special string format which looks a bit like URI-parameter-encoding but actually isn’t. In particular:

  • ‘+’ means plus, not space
  • there is a special “%uNNNN” format for encoding UTF-16 code units, instead of encoding UTF-8 bytes (and characters from U+0080 to U+00FF become a single “%XX”, which is not UTF-8 either)

So if you use escape() to create URI parameter values you will get the wrong results for strings containing a plus, or any non-ASCII characters.
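To make the difference concrete, here is a minimal sketch that runs both encoders over the same input. The helper name compareEncoders is just something made up for this illustration; escape and encodeURIComponent are the standard globals.

```javascript
// Hypothetical helper: encode the same string both ways for comparison.
function compareEncoders(text) {
  return {
    legacy: escape(text),             // the escape() string format
    uri: encodeURIComponent(text),    // percent-encoded UTF-8 bytes
  };
}

// A plus sign, a Latin-1 character (é) and a character above U+00FF (€).
for (const sample of ["a+b", "\u00e9", "\u20ac"]) {
  const { legacy, uri } = compareEncoders(sample);
  console.log(JSON.stringify(sample), "escape:", legacy, "encodeURIComponent:", uri);
}
// "a+b"  escape: a+b     encodeURIComponent: a%2Bb
// "é"    escape: %E9     encodeURIComponent: %C3%A9
// "€"    escape: %u20AC  encodeURIComponent: %E2%82%AC
```

A server that decodes query strings as form data would read the escape() output of "a+b" as "a b", and would not understand %u20AC at all.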

escape() could be used as an internal JavaScript-only encoding scheme, for example to escape cookie values. However, now that all browsers support encodeURIComponent (which wasn’t originally the case), there’s no reason to use escape in preference to that.
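As a rough sketch of the cookie case using encodeURIComponent instead: the two function names below are invented for this example, and it only deals with a single name=value pair, not attributes like path or expires.

```javascript
// Hypothetical helpers for one cookie name=value pair.
function serializeCookiePair(name, value) {
  // Encoding the value keeps ';', '=', spaces and non-ASCII characters
  // from breaking the cookie syntax.
  return name + "=" + encodeURIComponent(value);
}

function parseCookiePair(pair) {
  // Split on the first '=' only; the encoded value contains no raw '='.
  const index = pair.indexOf("=");
  return {
    name: pair.slice(0, index),
    value: decodeURIComponent(pair.slice(index + 1)),
  };
}

const pair = serializeCookiePair("note", "a+b; c=\u20ac");
console.log(pair);                        // note=a%2Bb%3B%20c%3D%E2%82%AC
console.log(parseCookiePair(pair).value); // a+b; c=€
```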

There is only one modern use for escape/unescape that I know of, and that’s as a quick way to implement a UTF-8 encoder/decoder, by leveraging the UTF-8 processing in URIComponent handling:

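// Unicode string -> string with one character (0-255) per UTF-8 byte
utf8bytes = unescape(encodeURIComponent(unicodecharacters));
// UTF-8 byte string -> Unicode string
unicodecharacters = decodeURIComponent(escape(utf8bytes));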
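Here is the same trick wrapped up so you can see what comes out. The function names are my own for this sketch; the "byte string" is an ordinary JavaScript string in which each character's code is one UTF-8 byte.

```javascript
// Hypothetical wrappers around the two one-liners.
function toUtf8ByteString(text) {
  // encodeURIComponent emits %XX per UTF-8 byte; unescape turns each
  // %XX back into a single character with that code.
  return unescape(encodeURIComponent(text));
}

function fromUtf8ByteString(bytes) {
  // escape emits %XX for each byte-valued character; decodeURIComponent
  // then reads those sequences as UTF-8.
  return decodeURIComponent(escape(bytes));
}

const bytes = toUtf8ByteString("\u20ac"); // the euro sign
const codes = Array.from(bytes, (ch) => ch.charCodeAt(0).toString(16));
console.log(bytes.length, codes);         // 3 [ 'e2', '82', 'ac' ]
console.log(fromUtf8ByteString(bytes) === "\u20ac"); // true
```

Note that decodeURIComponent throws a URIError if the byte string is not valid UTF-8, so the decoding direction also works as a crude validity check.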
