Find documents whose array field contains at least n elements of a given array

Let say we have the following documents in our collection:

{ "_id" : ObjectId("5759658e654456bf4a014d01"), "a" : [ 1, 3, 9, 2, 9, 0 ] }
{ "_id" : ObjectId("5759658e654456bf4a014d02"), "a" : [ 0, 8, 1 ] }
{ "_id" : ObjectId("5759658e654456bf4a014d03"), "a" : [ 0, 8, 432, 9, 34, -3 ] }
{ "_id" : ObjectId("5759658e654456bf4a014d04"), "a" : [ 0, 0, 4, 3, 2, 7 ] }

and the following input array and n = 2

var inputArray = [1, 3, 0];

We can return those documents where the array field contains at least n elements of a given array using the aggregation framework.

The $match selects only those documents with the array’s length greater or equals to n. This reduce the amount of data to be processed in down in the pipeline.

The $redact pipeline operator use a logical condition processing using the $cond operator and the special operations $$KEEP to “keep” the document where the logical condition is true or $$PRUNE to “discard” the document where the condition is false.

In our case, the condition is $gte which returns true if the $size of the intersection of the two arrays, which we compute using the $setIntersection operator is greater than or equal 2.

db.collection.aggregate(
    [ 
        { "$match": { "a.1": { "$exists": true } } }, 
        { "$redact": { 
            "$cond": [ 
                { "$gte": [ 
                    { "$size": { "$setIntersection": [ "$a", inputArray ] } }, 
                    2
                ]},
                "$$KEEP", 
                "$$PRUNE" 
            ]
        }}
    ]
)

which produces:

{ "_id" : ObjectId("5759658e654456bf4a014d01"), "a" : [ 1, 3, 9, 2, 9, 0 ] }
{ "_id" : ObjectId("5759658e654456bf4a014d02"), "a" : [ 0, 8, 1 ] }
{ "_id" : ObjectId("5759658e654456bf4a014d04"), "a" : [ 0, 0, 4, 3, 2, 7 ] }

Leave a Comment