How does array_diff work?

user187291's suggestion to do it in PHP via hash tables is simply great! In a rush of adrenaline from this fantastic idea, I even found a way to speed it up a little more (PHP 5.3.1):

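function leo_array_diff($a, $b) {
    $map = array();
    // Use every value of $a as a key; hash lookups make later removal cheap.
    foreach($a as $val) $map[$val] = 1;
    // Drop every value that also occurs in $b.
    foreach($b as $val) unset($map[$val]);
    // The remaining keys are the values found only in $a.
    return array_keys($map);
}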

With the benchmark taken from user187291’s posting:

LEO=0.0322  leo_array_diff()
ME =0.1308  my_array_diff()
YOU=4.5051  your_array_diff()
PHP=45.7114 array_diff()

The array_diff() performance lag is evident even at 100 entries per array.
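For anyone who wants to repeat the comparison, here is a minimal timing sketch. It is not user187291's original benchmark: the array size, the half-overlapping integer ranges and the variable names are assumptions made for illustration, and the timings will vary with PHP version and machine.

```php
<?php
// Hash-based diff from this answer, repeated so the script runs on its own.
function leo_array_diff($a, $b) {
    $map = array();
    foreach ($a as $val) $map[$val] = 1;
    foreach ($b as $val) unset($map[$val]);
    return array_keys($map);
}

// Assumed test data: two shuffled integer ranges that overlap by half.
$n = 2000;
$a = range(0, $n - 1);
$b = range($n / 2, $n + $n / 2 - 1);
shuffle($a);
shuffle($b);

// Time the built-in function.
$start = microtime(true);
$builtin = array_diff($a, $b);
$timeBuiltin = microtime(true) - $start;

// Time the hash-based version.
$start = microtime(true);
$hashed = leo_array_diff($a, $b);
$timeHashed = microtime(true) - $start;

printf("LEO=%.4f  leo_array_diff()\n", $timeHashed);
printf("PHP=%.4f  array_diff()\n", $timeBuiltin);

// Sanity check: both results hold the same values once keys and order are normalized.
$builtin = array_values($builtin);
sort($builtin);
sort($hashed);
```

A single run like this is noisy; repeating each measurement in a loop and averaging gives steadier numbers.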

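Note: This solution assumes that the elements in the first array are unique; duplicates are collapsed into a single entry. This is typical for a hash-based solution. Because the values are used as array keys, they must also be integers or strings.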
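A small sketch of the uniqueness caveat, using made-up sample values: array_diff() keeps repeated values from the first array, while the hash version returns each value only once.

```php
<?php
// Hash-based diff from this answer, repeated so the example runs on its own.
function leo_array_diff($a, $b) {
    $map = array();
    foreach ($a as $val) $map[$val] = 1;
    foreach ($b as $val) unset($map[$val]);
    return array_keys($map);
}

// Sample data (chosen for illustration): 'x' occurs twice in the first array.
$a = array('x', 'y', 'x', 'z');
$b = array('z');

// Built-in: both occurrences of 'x' survive (keys dropped here for comparison).
$builtin = array_values(array_diff($a, $b));

// Hash version: 'x' appears only once because it was used as an array key.
$hashed = leo_array_diff($a, $b);

var_dump($builtin); // x, y, x
var_dump($hashed);  // x, y
```

If the duplicates matter for your use case, the hash version is not a drop-in replacement for array_diff().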

Note: The solution does not preserve indices. To preserve keys, store each value's original index in $map (array_flip() does exactly that) and call array_flip() again at the end:

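function array_diff_pk($a, $b) {
    // value => original key
    $map = array_flip($a);
    // Drop every value that also occurs in $b.
    foreach($b as $val) unset($map[$val]);
    // Flip back to original key => value.
    return array_flip($map);
}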
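A short usage sketch for the key-preserving variant, with sample data invented for illustration. For unique values the result matches array_diff() including keys. For duplicated values array_flip() keeps only the last key, so the uniqueness caveat still applies.

```php
<?php
// Key-preserving hash diff from this answer, repeated so the example runs on its own.
function array_diff_pk($a, $b) {
    $map = array_flip($a);
    foreach ($b as $val) unset($map[$val]);
    return array_flip($map);
}

// Sample data (chosen for illustration) with string keys.
$a = array('first' => 'apple', 'second' => 'banana', 'third' => 'cherry');
$b = array('banana');

$result = array_diff_pk($a, $b);
print_r($result); // 'first' => 'apple', 'third' => 'cherry'

// Duplicate values: 'a' sits at index 0 and 2, and only the last key (2) survives.
$dupes = array_diff_pk(array('a', 'b', 'a'), array());
print_r($dupes);
```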

PS: I found this while looking into an array_diff() paradox: array_diff() took three times longer for practically the same task when it was used twice in the script.
