Best method of saving data

If your data are pretty simple, like collections of collections of strings or numbers, I would use json. JSON is a string representation of simple data types and combinations of simple data types. Once you use the json module to convert your data to a string, you write that string to a file yourself.

It’s super simple:

>>> my_data = [list(range(5)) for i in range(5)]
>>> my_data
[[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]
>>> import json
>>> json.dumps(my_data)
'[[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]'

Then just write that string to a file. When you want to reload it, read the string back from the file and parse it like so:

>>> import json
>>> string_from_file
'[[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]'
>>> my_saved_data = json.loads(string_from_file)
>>> my_saved_data
[[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]
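The sessions above leave out the file step itself. Here is a minimal sketch of the full round trip, assuming a scratch file in the system temp directory (the file name my_data.json is made up for this example). It uses json.dump and json.load, which write to and read from an open file directly, so you can skip the intermediate string if you like:

```python
import json
import os
import tempfile

my_data = [list(range(5)) for _ in range(5)]

# Hypothetical location, chosen only for this example.
path = os.path.join(tempfile.gettempdir(), "my_data.json")

# Save: serialize straight into the open file.
with open(path, "w") as f:
    json.dump(my_data, f)

# Load: parse the file contents back into Python objects.
with open(path) as f:
    my_saved_data = json.load(f)

print(my_saved_data == my_data)
```

One thing to keep in mind: JSON has no tuple type and only string keys for objects, so tuples come back as lists and non-string dict keys come back as strings.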

If your data are more complicated and involve classes other than the built-in collection objects, pickle is a better choice. One very important thing to know about pickle is that it is not secure: unpickling can run arbitrary code, so it’s a bad idea to unpickle anything you yourself didn’t pickle. pickle is vulnerable to the same kind of security problems detailed in this article: http://www.kalzumeus.com/2013/01/31/what-the-rails-security-issue-means-for-your-startup/
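As a rough illustration of where pickle helps, here is a small sketch using a record (made up for this example) that holds a date, a Decimal, and a set. json refuses to serialize it, while pickle round-trips it unchanged. Instances of your own classes work the same way, as long as the class can be imported when you unpickle:

```python
import datetime
import decimal
import json
import pickle

# Hypothetical record holding types that JSON cannot represent.
record = {
    "created": datetime.date(2013, 1, 31),
    "price": decimal.Decimal("19.99"),
    "tags": {"a", "b"},
}

# json rejects these objects with a TypeError.
try:
    json.dumps(record)
    json_ok = True
except TypeError:
    json_ok = False

# pickle produces bytes, so files must be opened in binary mode ("wb" / "rb").
blob = pickle.dumps(record)
restored = pickle.loads(blob)  # only ever do this with data you pickled yourself

print(json_ok, restored == record)
```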

If the size of your data is very large, or you will be saving/loading it frequently, or for any reason using json and saving to a local file is inadequate, then a database will be the way to go.
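Which database to use depends on your situation. As one possible starting point, here is a sketch using sqlite3 from the standard library, which is my choice for the example and not the only option. The table and column names are invented, and the database lives in memory so the example cleans up after itself; pass a file name to sqlite3.connect to keep the data on disk:

```python
import sqlite3

my_data = [list(range(5)) for _ in range(5)]

# ":memory:" keeps the example self-contained; use a file path for real storage.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one table row per cell of the nested list.
conn.execute("CREATE TABLE cells (row_id INTEGER, col_id INTEGER, value INTEGER)")

cells = [
    (r, c, value)
    for r, row in enumerate(my_data)
    for c, value in enumerate(row)
]
conn.executemany("INSERT INTO cells VALUES (?, ?, ?)", cells)
conn.commit()

# Load back only the part you need instead of the whole data set.
cursor = conn.execute(
    "SELECT value FROM cells WHERE row_id = ? ORDER BY col_id", (2,)
)
third_row = [value for (value,) in cursor]
conn.close()

print(third_row)
```

The advantage over a json file is that you can read or update one piece of the data without loading and rewriting everything.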
