Need logic to create dict of dict by reading input file [closed]

To solve your problem you need a parser. There are various libraries that can be used for parsing in Python, see Python parsing tools by SO veteran Ned Batchelder for a list of what’s available.

However, your data format isn’t too complicated, so it’s easy enough to write a simple parser that doesn’t rely on any 3rd-party modules. To break the data up into individual tokens (aka lexical analysis) we can use the standard shlex module.
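As a quick illustration of the tokenizing step, here's a minimal sketch of what shlex produces for this kind of data. The `tokenize` helper and the sample string are made up for the example; adding `./:_` to `wordchars` is what keeps keys like `hostname:` and values like `0/1` together as single tokens.

```python
import shlex
import string

def tokenize(text):
    # Hypothetical helper: split config text into a list of tokens
    lex = shlex.shlex(text)
    # Treat '.', '/', ':' and '_' as parts of a word
    lex.wordchars = string.ascii_letters + string.digits + "./:_"
    return list(lex)

tokens = tokenize("router1 = { hostname: abcd interface: gigabit 0/1 }")
print(tokens)
# ['router1', '=', '{', 'hostname:', 'abcd', 'interface:', 'gigabit', '0/1', '}']
```

Characters that aren't word characters, whitespace or quotes (here `=`, `{` and `}`) come back as one-character tokens, which is exactly what a simple parser needs.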

The code below implements a very simple recursive descent parser. It was developed & tested on Python 2.6.6, but it should function correctly on Python 3. You might like to encapsulate it by putting it into a class. IMHO that’s probably not really necessary, but I guess that depends on your actual use case.

This code uses the json module to print the parsed dictionary; that’s not strictly necessary, but it does make it easy to print nested dictionaries nicely.
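To show what that buys you, here's a small sketch of `json.dumps` on a nested dictionary. The `cfg` dictionary is just sample data typed in by hand, not the output of the parser.

```python
import json

# Hand-written sample data, only for demonstrating the printing step
cfg = {
    "hostname": "abcd",
    "interfaces": {"name": "vlan1", "valn": "100"},
}

# indent=4 puts each key on its own line; sort_keys=True gives a stable order
text = json.dumps(cfg, indent=4, sort_keys=True)
print(text)
```

A plain `print(cfg)` would put everything on one line, which gets hard to read once the dictionaries are nested a few levels deep.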

The code incorporates some error checking, but it can easily be fooled into accepting weird data, so you may wish to enhance the error checking if you can’t guarantee that the input data will always be correct.
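One cheap way to tighten things up is to run a sanity check over the tokens before parsing. The sketch below only checks that the braces balance; `check_braces` is a hypothetical helper, not part of the parser, and a real check would probably want to look at more than braces.

```python
import shlex
import string

def check_braces(text):
    ''' Hypothetical pre-check: raise ValueError if braces don't balance '''
    lex = shlex.shlex(text)
    lex.wordchars = string.ascii_letters + string.digits + "./:_"
    depth = 0
    for token in lex:
        if token == '{':
            depth += 1
        elif token == '}':
            depth -= 1
            if depth < 0:
                # A closing brace with no matching opening brace
                raise ValueError('Line %d: Unexpected %r'
                    % (lex.lineno, token))
    if depth != 0:
        raise ValueError('Missing %d closing brace(s)' % depth)
    return True

print(check_braces("router1 = { hostname: abcd }"))
```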

#!/usr/bin/env python

''' Parse config file data; see below for example data format
    Written by PM 2Ring 2016.01.21
'''

from __future__ import print_function
import shlex
import string
import json

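data = '''
router1 = {
    hostname: abcd
        interfaces: {
            interface: gigabit 0/1
            valn: 100
            name: vlan1
            #ip_address value is a placeholder; the original was lost
            ip_address: 192.0.2.1
        }
    clear: clear config all
}
'''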

#Set up a simple lexer. `data` must be a file-/stream-like object
# with read() and readline() methods, or a string 
lex = shlex.shlex(data)
lex.wordchars = string.ascii_letters + string.digits + "./:_"

def t_is(ch):
    ''' verify that the next token is ch '''
    token = next(lex)
    if token != ch:
        raise ValueError('Line %d: Expected %r got %r' 
            % (lex.lineno, ch, token))

def get_word():
    ''' get next token if it's a word.
        Otherwise, push it back & return None
    '''
    token = next(lex)
    if token not in '{}':
        return token
    lex.push_token(token)

def is_key(token):
    return token[-1] == ':'

def get_value():
    ''' get value, which may be a list of words or a dict '''
    token = next(lex)
    if token == '{':
        #Value is a dict
        return get_dict()

    #Value consists of one or more non-key words
    value = [token]
    while True:
        token = get_word()
        if token is None:
            #Hit a brace, which get_word has pushed back
            break
        if is_key(token):
            #Hit the next key; push it back for get_dict
            lex.push_token(token)
            break
        value.append(token)
    return ' '.join(value)

def get_dict():
    ''' parse a dictionary '''
    d = {}
    while True:
        #get key, value pairs
        key = get_word()
        if key is None:
            #Next token is a brace; it must close this dict
            t_is('}')
            return d
        if not is_key(key):
            raise ValueError('Line %d: Bad key %r' 
                % (lex.lineno, key))
        d[key[:-1]] = get_value()

def get_cfg():
    ''' parse config data, returning the name and the dict '''
    name = get_word()
    if name is None:
        raise ValueError('Line %d: Expected name, got %r' 
            % (lex.lineno, next(lex)))
    #The name must be followed by = and the opening brace
    t_is('=')
    t_is('{')
    d = get_dict()
    return name, d


print(data)
print(20 * '- ' + '\n')
#for token in lex: print(token)

name, cfg = get_cfg()
print(json.dumps(cfg, indent=4, sort_keys=True))


router1 = {
    hostname: abcd
        interfaces: {
            interface: gigabit 0/1
            valn: 100
            name: vlan1
            #ip_address value is a placeholder; the original was lost
            ip_address: 192.0.2.1
        }
    clear: clear config all
}

- - - - - - - - - - - - - - - - - - - - 

{
    "clear": "clear config all",
    "hostname": "abcd",
    "interfaces": {
        "interface": "gigabit 0/1",
        "ip_address": "192.0.2.1",
        "name": "vlan1",
        "valn": "100"
    }
}

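For what it's worth, here's roughly how the class encapsulation mentioned earlier could look. Treat it as a sketch under a few assumptions: the class name `CfgParser`, its method names and the `sample` string are all made up for illustration, and it adds an end-of-input check and a missing-value check that the functions above don't have.

```python
import json
import shlex
import string


class CfgParser(object):
    ''' Hypothetical class wrapper around a shlex-based
        recursive descent parser for the brace-delimited format
    '''
    def __init__(self, data):
        # data may be a string or a file-like object
        self.lex = shlex.shlex(data)
        self.lex.wordchars = string.ascii_letters + string.digits + "./:_"

    def error(self, msg):
        raise ValueError('Line %d: %s' % (self.lex.lineno, msg))

    def next_token(self):
        ''' get the next token, complaining if the input has run out '''
        token = self.lex.get_token()
        if token == self.lex.eof:
            self.error('Unexpected end of input')
        return token

    def expect(self, ch):
        ''' verify that the next token is ch '''
        token = self.next_token()
        if token != ch:
            self.error('Expected %r got %r' % (ch, token))

    def get_word(self):
        ''' get next token if it's a word; push braces back, return None '''
        token = self.next_token()
        if token in ('{', '}'):
            self.lex.push_token(token)
            return None
        return token

    def get_value(self):
        ''' get value, which may be a run of words or a nested dict '''
        token = self.next_token()
        if token == '{':
            return self.get_dict()
        if token == '}' or token.endswith(':'):
            self.error('Missing value before %r' % token)
        value = [token]
        while True:
            token = self.get_word()
            if token is None:
                break
            if token.endswith(':'):
                # Start of the next key; leave it for get_dict
                self.lex.push_token(token)
                break
            value.append(token)
        return ' '.join(value)

    def get_dict(self):
        ''' parse key: value pairs up to and including the closing brace '''
        d = {}
        while True:
            key = self.get_word()
            if key is None:
                self.expect('}')
                return d
            if not key.endswith(':'):
                self.error('Bad key %r' % key)
            d[key[:-1]] = self.get_value()

    def parse(self):
        ''' parse config data, returning the name and the dict '''
        name = self.get_word()
        if name is None:
            self.error('Expected a name')
        self.expect('=')
        self.expect('{')
        return name, self.get_dict()


sample = '''
router1 = {
    hostname: abcd
    interfaces: {
        interface: gigabit 0/1
        valn: 100
        name: vlan1
    }
    clear: clear config all
}
'''

name, cfg = CfgParser(sample).parse()
print(name)
print(json.dumps(cfg, indent=4, sort_keys=True))
```

Keeping the lexer as an instance attribute means you can parse several files without the parsers stepping on each other, which is the main thing the class buys you over module-level functions.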