How to create a dataset in the same format as the FSNS dataset?

The format for storing training and test data is defined in the FSNS paper, https://arxiv.org/pdf/1702.03970.pdf (Table 4).

To write TFRecord files containing tf.Example protos, you can use tf.python_io.TFRecordWriter (tf.io.TFRecordWriter in TensorFlow 2.x). There is a nice tutorial, an existing answer on Stack Overflow, and a short gist.

Assume you have a NumPy ndarray img which holds num_of_views images stored side-by-side (see Fig. 3 in the paper) and the corresponding text in a variable text. You will need to define a function that converts a unicode string into a list of character ids, returning it both padded to a fixed length and unpadded. For example:

char_ids_padded, char_ids_unpadded = encode_utf8_string(
   text="abc", 
   charset={'a':0, 'b':1, 'c':2},
   length=5,
   null_char_id=3)

The result should be:

char_ids_padded = [0,1,2,3,3]
char_ids_unpadded = [0,1,2]
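
A minimal sketch of such a function is shown below. It assumes that every character of the text is present in charset and that a text longer than length should be rejected; both are assumptions made for illustration, not requirements taken from the FSNS paper.

```python
def encode_utf8_string(text, charset, length, null_char_id):
    """Convert a unicode string into padded and unpadded lists of char ids.

    Assumptions (illustrative only): every character is in `charset`,
    and texts longer than `length` are an error rather than truncated.
    """
    if len(text) > length:
        raise ValueError(
            "text has %d characters, but the fixed length is %d"
            % (len(text), length))
    # One id per character, in order.
    char_ids_unpadded = [charset[c] for c in text]
    # Fill the remaining positions with the null character id.
    padding = [null_char_id] * (length - len(char_ids_unpadded))
    char_ids_padded = char_ids_unpadded + padding
    return char_ids_padded, char_ids_unpadded
```

A character missing from charset raises a KeyError here; depending on your data you may prefer to map unknown characters to a dedicated id instead.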

If you use the functions _int64_feature and _bytes_feature defined in the gist, you can create an FSNS-compatible tf.Example proto with the following snippet:

char_ids_padded, char_ids_unpadded = encode_utf8_string(
   text, charset, length, null_char_id)
example = tf.train.Example(features=tf.train.Features(
  feature={
    'image/format': _bytes_feature(b"PNG"),
    # tobytes() stores raw pixel data; if the reader decodes this field
    # as PNG, pass PNG-encoded bytes here instead.
    'image/encoded': _bytes_feature(img.tobytes()),
    'image/class': _int64_feature(char_ids_padded),
    'image/unpadded_class': _int64_feature(char_ids_unpadded),
    'height': _int64_feature(img.shape[0]),
    'width': _int64_feature(img.shape[1]),
    # Integer division: the width of a single view must be an int.
    'orig_width': _int64_feature(img.shape[1] // num_of_views),
    # Bytes features need bytes, so encode the unicode text as UTF-8.
    'image/text': _bytes_feature(text.encode('utf-8'))
  }
))
