pickle.dumps or numpy.save encode all the information needed to reconstruct an arbitrary NumPy array, even in the presence of endianness issues, non-contiguous arrays, or weird structured dtypes. Endianness issues are probably the most important; you don’t want array([1]) to suddenly become array([16777216]) because you loaded your array on a big-endian machine. pickle is probably the more convenient option, though save has its own benefits, given in the npy format rationale.
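To make the endianness problem concrete, here is a minimal sketch of what goes wrong when only the raw buffer is shipped. The use of tobytes and frombuffer is my own illustration of a format that drops the dtype; reading the bytes with an explicit '>i4' dtype stands in for loading them on a big-endian machine.

```python
import numpy as np

# A single little-endian 32-bit integer with value 1.
a = np.array([1], dtype='<i4')

# tobytes() keeps only the raw buffer: dtype and byte order are not recorded.
raw = a.tobytes()

# Reading the same four bytes as big-endian reverses their significance.
wrong = np.frombuffer(raw, dtype='>i4')
# Reading them with the original byte order recovers the value.
right = np.frombuffer(raw, dtype='<i4')

print(wrong, right)
```

pickle.dumps and numpy.save avoid this because they record the dtype, including its byte order, alongside the data.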
I’m giving options for serializing to JSON or a bytestring, because the original questioner needed JSON-serializable output, but most people coming here probably don’t.
The pickle way:
import json
import pickle
import numpy
a = numpy.arange(5)  # stand-in for any NumPy array
# Bytestring option
serialized = pickle.dumps(a)
deserialized_a = pickle.loads(serialized)
# JSON option
# latin-1 maps byte n to unicode code point n
serialized_as_json = json.dumps(pickle.dumps(a).decode('latin-1'))
deserialized_from_json = pickle.loads(json.loads(serialized_as_json).encode('latin-1'))
numpy.save uses a binary format, and it needs to write to a file, but you can get around that with io.BytesIO:
import io
import json
import numpy
a = numpy.arange(5)  # stand-in for any NumPy array
memfile = io.BytesIO()
numpy.save(memfile, a)
serialized = memfile.getvalue()
# latin-1 maps byte n to unicode code point n
serialized_as_json = json.dumps(serialized.decode('latin-1'))
And to deserialize:
memfile = io.BytesIO()
# If you're deserializing from a bytestring:
memfile.write(serialized)
# Or if you're deserializing from JSON:
# memfile.write(json.loads(serialized_as_json).encode('latin-1'))
memfile.seek(0)
a = numpy.load(memfile)
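If you use the numpy.save route in more than one place, the steps above can be wrapped in a few small functions. This is only a sketch: the function names are hypothetical, io.BytesIO(data) is used as a shortcut for the write-then-seek sequence, and passing allow_pickle=False is my own choice here, which means object arrays will be rejected.

```python
import io
import json
import numpy as np

def array_to_bytes(arr):
    # Write the array in .npy format to an in-memory buffer.
    memfile = io.BytesIO()
    np.save(memfile, arr, allow_pickle=False)
    return memfile.getvalue()

def array_from_bytes(data):
    # A BytesIO built from the data starts at position 0, so no seek is needed.
    return np.load(io.BytesIO(data), allow_pickle=False)

def array_to_json(arr):
    # latin-1 maps byte n to unicode code point n
    return json.dumps(array_to_bytes(arr).decode('latin-1'))

def array_from_json(text):
    return array_from_bytes(json.loads(text).encode('latin-1'))

example = np.array([[1, 2], [3, 4]], dtype='>i4')
from_bytes = array_from_bytes(array_to_bytes(example))
from_json = array_from_json(array_to_json(example))
```

Both round trips return an array with the original shape, dtype, and byte order.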