How to parse a boolean expression and load it into a class?

TL;DR: If you want to see the code, jump to the second portion of the answer.

I would build a tree from the expression to parse and then traverse it depth first. You can refer to the wikipedia article about Binary Expression Trees to get a feel for what I’m suggesting.

  1. Start by adding the omitted optional parentheses to make the next step easier
  2. When you read anything that is not an operator or a parenthese, create a LEAF type node
  3. When you read any operator (in your case not, and, or), create the corresponding operator node
  4. Binary operators get the previous and following nodes as children, unary operators only get the next one.

So, for your example ¬((A ∧ B) ∨ C ∨ D), the algorithm would go like this:

¬((A ∧ B) ∨ C ∨ D) becomes ¬(((A ∧ B) ∨ C) ∨ D)

  1. Create a NOT node, it’ll get the result of the following opening paren as a child.
  2. Create A LEAF node, AND node and B LEAF node. AND has A and B as children.
  3. Create OR node, it has the previously created AND as a child and a new LEAF node for C.
  4. Create OR node, it has the previously created OR and a new node for D as children.

At that point, your tree looks like this:

  NOT
   |
  OR
  /\
 OR D
 / \
AND C
/\
A B

You can then add a Node.Evaluate() method that evaluates recursively based on its type (polymorphism could be used here). For example, it could look something like this:

class LeafEx {
    bool Evaluate() {
        return Boolean.Parse(this.Lit);
    }
}

class NotEx {
    bool Evaluate() {
        return !Left.Evaluate();
    }
}

class OrEx {
    bool Evaluate() {
        return Left.Evaluate() || Right.Evaluate();
    }
}

And so on and so forth. To get the result of your expression, you then only need to call

bool result = Root.Evaluate();

Alright, since it’s not an assignment and it’s actually a fun thing to implement, I went ahead. Some of the code I’ll post here is not related to what I described earlier (and some parts are missing) but I’ll leave the top part in my answer for reference (nothing in there is wrong (hopefully!)).

Keep in mind this is far from optimal and that I made an effort to not modify your provided BoolExpr class. Modifying it could allow you to reduce the amount of code. There’s also no error checking at all.

Here’s the main method

static void Main(string[] args)
{
    //We'll use ! for not, & for and, | for or and remove whitespace
    string expr = @"!((A&B)|C|D)";
    List<Token> tokens = new List<Token>();
    StringReader reader = new StringReader(expr);

    //Tokenize the expression
    Token t = null;
    do
    {
        t = new Token(reader);
        tokens.Add(t);
    } while (t.type != Token.TokenType.EXPR_END);

    //Use a minimal version of the Shunting Yard algorithm to transform the token list to polish notation
    List<Token> polishNotation = TransformToPolishNotation(tokens);

    var enumerator = polishNotation.GetEnumerator();
    enumerator.MoveNext();
    BoolExpr root = Make(ref enumerator);

    //Request boolean values for all literal operands
    foreach (Token tok in polishNotation.Where(token => token.type == Token.TokenType.LITERAL))
    {
        Console.Write("Enter boolean value for {0}: ", tok.value);
        string line = Console.ReadLine();
        booleanValues[tok.value] = Boolean.Parse(line);
        Console.WriteLine();
    }

    //Eval the expression tree
    Console.WriteLine("Eval: {0}", Eval(root));

    Console.ReadLine();
}

The tokenization phase creates a Token object for all tokens of the expression. It helps keep the parsing separated from the actual algorithm. Here’s the Token class that performs this:

class Token
{
    static Dictionary<char, KeyValuePair<TokenType, string>> dict = new Dictionary<char, KeyValuePair<TokenType, string>>()
    {
        {
            '(', new KeyValuePair<TokenType, string>(TokenType.OPEN_PAREN, "(")
        },
        {
            ')', new KeyValuePair<TokenType, string>(TokenType.CLOSE_PAREN, ")")
        },
        {
            '!', new KeyValuePair<TokenType, string>(TokenType.UNARY_OP, "NOT")
        },
        {
            '&', new KeyValuePair<TokenType, string>(TokenType.BINARY_OP, "AND")
        },
        {
            '|', new KeyValuePair<TokenType, string>(TokenType.BINARY_OP, "OR")
        }
    };

    public enum TokenType
    {
        OPEN_PAREN,
        CLOSE_PAREN,
        UNARY_OP,
        BINARY_OP,
        LITERAL,
        EXPR_END
    }

    public TokenType type;
    public string value;

    public Token(StringReader s)
    {
        int c = s.Read();
        if (c == -1)
        {
            type = TokenType.EXPR_END;
            value = "";
            return;
        }

        char ch = (char)c;

        if (dict.ContainsKey(ch))
        {
            type = dict[ch].Key;
            value = dict[ch].Value;
        }
        else
        {
            string str = "";
            str += ch;
            while (s.Peek() != -1 && !dict.ContainsKey((char)s.Peek()))
            {
                str += (char)s.Read();
            }
            type = TokenType.LITERAL;
            value = str;
        }
    }
}

At that point, in the main method, you can see I transform the list of tokens in Polish Notation order. It makes the creation of the tree much easier and I use a modified implementation of the Shunting Yard Algorithm for this:

static List<Token> TransformToPolishNotation(List<Token> infixTokenList)
{
    Queue<Token> outputQueue = new Queue<Token>();
    Stack<Token> stack = new Stack<Token>();

    int index = 0;
    while (infixTokenList.Count > index)
    {
        Token t = infixTokenList[index];

        switch (t.type)
        {
            case Token.TokenType.LITERAL:
                outputQueue.Enqueue(t);
                break;
            case Token.TokenType.BINARY_OP:
            case Token.TokenType.UNARY_OP:
            case Token.TokenType.OPEN_PAREN:
                stack.Push(t);
                break;
            case Token.TokenType.CLOSE_PAREN:
                while (stack.Peek().type != Token.TokenType.OPEN_PAREN)
                {
                    outputQueue.Enqueue(stack.Pop());
                }
                stack.Pop();
                if (stack.Count > 0 && stack.Peek().type == Token.TokenType.UNARY_OP)
                {
                    outputQueue.Enqueue(stack.Pop());
                }
                break;
            default:
                break;
        }

        ++index;
    }
    while (stack.Count > 0)
    {
        outputQueue.Enqueue(stack.Pop());
    }

    return outputQueue.Reverse().ToList();
}

After this transformation, our token list becomes NOT, OR, OR, C, D, AND, A, B.

At this point, we’re ready to create the expression tree. The properties of Polish Notation allow us to just walk the Token List and recursively create the tree nodes (we’ll use your BoolExpr class) as we go:

static BoolExpr Make(ref List<Token>.Enumerator polishNotationTokensEnumerator)
{
    if (polishNotationTokensEnumerator.Current.type == Token.TokenType.LITERAL)
    {
        BoolExpr lit = BoolExpr.CreateBoolVar(polishNotationTokensEnumerator.Current.value);
        polishNotationTokensEnumerator.MoveNext();
        return lit;
    }
    else
    {
        if (polishNotationTokensEnumerator.Current.value == "NOT")
        {
            polishNotationTokensEnumerator.MoveNext();
            BoolExpr operand = Make(ref polishNotationTokensEnumerator);
            return BoolExpr.CreateNot(operand);
        }
        else if (polishNotationTokensEnumerator.Current.value == "AND")
        {
            polishNotationTokensEnumerator.MoveNext();
            BoolExpr left = Make(ref polishNotationTokensEnumerator);
            BoolExpr right = Make(ref polishNotationTokensEnumerator);
            return BoolExpr.CreateAnd(left, right);
        }
        else if (polishNotationTokensEnumerator.Current.value == "OR")
        {
            polishNotationTokensEnumerator.MoveNext();
            BoolExpr left = Make(ref polishNotationTokensEnumerator);
            BoolExpr right = Make(ref polishNotationTokensEnumerator);
            return BoolExpr.CreateOr(left, right);
        }
    }
    return null;
}

Now we’re golden! We have the expression tree that represents the expression so we’ll ask the user for the actual boolean values of each literal operand and evaluate the root node (which will recursively evaluate the rest of the tree as needed).

My Eval function follows, keep in mind I’d use some polymorphism to make this cleaner if I modified your BoolExpr class.

static bool Eval(BoolExpr expr)
{
    if (expr.IsLeaf())
    {
        return booleanValues[expr.Lit];
    }

    if (expr.Op == BoolExpr.BOP.NOT)
    {
        return !Eval(expr.Left);
    }

    if (expr.Op == BoolExpr.BOP.OR)
    {
        return Eval(expr.Left) || Eval(expr.Right);
    }

    if (expr.Op == BoolExpr.BOP.AND)
    {
        return Eval(expr.Left) && Eval(expr.Right);
    }

    throw new ArgumentException();
}

As expected, feeding our test expression ¬((A ∧ B) ∨ C ∨ D) with values false, true, false, true for A, B, C, D respectively yields the result false.

Leave a Comment