Bad JSON – Keys are not quoted

What you have is effectively an HJSON document, so you can use the hjson project to parse it:

>>> import hjson
>>> hjson.loads('{javascript_style:"Look ma, no quotes!"}')
OrderedDict([('javascript_style', 'Look ma, no quotes!')])

HJSON is JSON without the requirement to quote object names or certain string values. It also adds support for comments and multi-line strings, and relaxes the rules on where commas are needed (including omitting commas altogether).

Or you could install and use the demjson library; it supports parsing valid JavaScript object syntax, including unquoted keys:

import demjson

result = demjson.decode(jsonp_payload)

Only when you set the strict=True flag does demjson refuse to parse your input:

>>> import demjson
>>> demjson.decode('{javascript_style:"Look ma, no quotes!"}')
{u'javascript_style': u'Look ma, no quotes!'}
>>> demjson.decode('{javascript_style:"Look ma, no quotes!"}', strict=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mjpieters/Development/venvs/stackoverflow-2.7/lib/python2.7/site-packages/demjson.py", line 5701, in decode
    return_stats=(return_stats or write_stats) )
  File "/Users/mjpieters/Development/venvs/stackoverflow-2.7/lib/python2.7/site-packages/demjson.py", line 4917, in decode
    raise errors[0]
demjson.JSONDecodeError: ('JSON does not allow identifiers to be used as strings', u'javascript_style')
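
If most of your input is already valid JSON, one option is to try the standard library's strict parser first and hand the text to a lenient parser only when that fails. The sketch below is an illustration under assumptions, not part of either library: load_relaxed and its fallback parameter are hypothetical names, and fake_lenient is a stand-in so the example runs without third-party packages. In practice you would pass hjson.loads or demjson.decode as the fallback.

```python
import json


def load_relaxed(text, fallback):
    """Parse strict JSON first; hand the text to `fallback` only if that fails.

    `fallback` is any callable taking a string and returning parsed data,
    e.g. hjson.loads or demjson.decode (assumed to be installed separately).
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return fallback(text)


# Stand-in for a lenient parser, recording what it was asked to parse.
calls = []


def fake_lenient(text):
    calls.append(text)
    return {"javascript_style": "Look ma, no quotes!"}


# Valid JSON never reaches the fallback.
strict_result = load_relaxed('{"a": 1}', fake_lenient)

# Unquoted keys fail json.loads, so the fallback is used.
lenient_result = load_relaxed('{javascript_style:"Look ma, no quotes!"}', fake_lenient)
```

This keeps the lenient parser away from input that is already well-formed, so any surprises in its relaxed rules only affect the documents that need them.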

You can also try to regex your way to valid JSON, but this can produce false positives. The pattern would be:

import re

valid_json = re.sub(r'(?<={|,)([a-zA-Z][a-zA-Z0-9]*)(?=:)', r'"\1"', jsonp_payload)

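This matches a JavaScript-style identifier (a letter, followed by more letters or digits) that comes directly after a { or , and directly before a : colon. Keys that contain an underscore, or that have whitespace around them, are not matched and stay unquoted. If your quoted string values contain any such pattern, they are rewritten too and you’ll get invalid JSON.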
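
As a rough check of what the pattern above does and does not handle, here is a minimal sketch using only the standard library. The helper name quote_keys and the sample payloads are made up for illustration.

```python
import json
import re

# Same pattern as above: an identifier directly after { or , and directly before :
KEY_PATTERN = re.compile(r'(?<={|,)([a-zA-Z][a-zA-Z0-9]*)(?=:)')


def quote_keys(payload):
    """Wrap bare object keys in double quotes (best effort, not a real parser)."""
    return KEY_PATTERN.sub(r'"\1"', payload)


# Works: bare keys made of letters and digits, with no surrounding whitespace.
good = '{name:"Ada",age:36,tags:["x","y"]}'
parsed = json.loads(quote_keys(good))

# False positive: the string value contains `,b:`, so `b` gets quoted as well.
bad = '{note:"a,b:c"}'
try:
    json.loads(quote_keys(bad))
    bad_parsed = True
except json.JSONDecodeError:
    bad_parsed = False

# Not matched at all: the underscore is outside the identifier character class.
untouched = quote_keys('{javascript_style:1}')
```

Here the first payload parses cleanly, the second is turned into broken JSON by the substitution, and the third is left unchanged, so json.loads would still reject it.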
