How to compare two rich text box contents and highlight the characters that are changed?

First kudos to ArtyomZzz for pointing to the great source of DiffMatchPatch!

Here is a piece of code the will paint the changed characters in two RichTextboxes upon a button click.

First download the diff-match-patchsource. (!See update below!) From the zip file copy ‘DiffMatchPatch.cs’ and also ‘COPY’ to your project and inlude the cs file in you project. Also add the namespace to your using clauses.

using DiffMatchPatch;

namespace RTF_diff
{
  public partial class Form1 : Form
  {
    public Form1()
    {
        InitializeComponent();
    }

    // this is the diff object;
    diff_match_patch DIFF = new diff_match_patch();

    // these are the diffs
    List<Diff> diffs;

    // chunks for formatting the two RTBs:
    List<Chunk> chunklist1; 
    List<Chunk> chunklist2;

    // two color lists:
    Color[] colors1 = new Color[3] { Color.LightGreen, Color.LightSalmon, Color.White };
    Color[] colors2 = new Color[3] { Color.LightSalmon, Color.LightGreen, Color.White };


    public struct Chunk
    {
        public int startpos;
        public int length;
        public Color BackColor;
    }


    private void button1_Click(object sender, EventArgs e)
    {
        diffs = DIFF.diff_main(RTB1.Text, RTB2.Text);
        DIFF.diff_cleanupSemanticLossless(diffs);      // <--- see note !

        chunklist1 = collectChunks(RTB1);
        chunklist2 = collectChunks(RTB2);

        paintChunks(RTB1, chunklist1);
        paintChunks(RTB2, chunklist2);

        RTB1.SelectionLength = 0;
        RTB2.SelectionLength = 0;
    }


    List<Chunk> collectChunks(RichTextBox RTB)
    {
        RTB.Text = "";
        List<Chunk> chunkList = new List<Chunk>();
        foreach (Diff d in diffs)
        {
            if (RTB == RTB2 && d.operation == Operation.DELETE) continue;  // **
            if (RTB == RTB1 && d.operation == Operation.INSERT) continue;  // **

            Chunk ch = new Chunk();
            int length = RTB.TextLength;
            RTB.AppendText(d.text);
            ch.startpos = length;
            ch.length = d.text.Length;
            ch.BackColor = RTB == RTB1 ? colors1[(int)d.operation]
                                       : colors2[(int)d.operation];
            chunkList.Add(ch);
        }
        return chunkList;

    }

    void paintChunks(RichTextBox RTB, List<Chunk> theChunks)
    {
        foreach (Chunk ch in theChunks)
        {
            RTB.Select(ch.startpos, ch.length);
            RTB.SelectionBackColor = ch.BackColor;
        }

    }

  }
}

At first I was trying to also color the changed lines as a whole in a lighter color; however that takes considerably more work, can’t be done (coloring the whole line as opposed to just the part with content), and was not part of your question in the first place..

Note: There are four different post-diff cleanup methods. Which is best suited will depend on the input and purpose. I suggest trying the cleanupSemanticLossless. I have added a 3rd screenshot showing how this cleanup works

Here is a screenshot of the output:
original output

And one of the new version:
new screenshot

Screenshot after cleanupSemanticLossless:
3rd screenshot

Update: The sources seem to have moved. This should help..

Leave a Comment