How to get back the Original document back after aggregation

Falling under the category of stupid aggregation tricks is a little technique that often gets overlooked.

The query doing all of it’s grouping around the document _id, being the unique identifier for this document. So the main point to think of is the whole document is actually a unique identifier already. So instead of just stashing in the _id key, use the whole document.

    {$project: { 
        _id: { _id: "$_id", name: "$name", forms: "$forms" }, forms: "$forms"}
    },

Where this is done anything that is rolled up by the _id retains the document in it’s original form. At the end of all other aggregation stages, issue a final $project in order to restore the true original document form:

    {$project: { _id: "$_id._id", name: "$_id.name", forms: "$_id.forms"}}

Then you will have the filtered results that you want. This technique can be very handy when used with advanced filtering such as in the case of this query, as it removes the need to issue an additional find on all of the results.

Also, in such a case where you know you are only looking for a set of results that are going to match a certain set of conditions, use a $match operator as the first stage of the aggregation pipeline. This is not only useful in reducing the working set size, but it is also the only stage at which you can make use of an index and where you can significantly increase query performance.

The whole process together:

db.forms.aggregate([
    {$match: { "forms.status": "closed" } },
    {$project: { 
        _id: { _id: "$_id", name: "$name", forms: "$forms" }, forms: "$forms"}
    },
    {$unwind: "$forms"},
    {$group: { _id: "$_id", status: {$addToSet: "$forms.status"}}},
    {$unwind: "$status"},
    {$sort: { _id: 1, status: -1} },
    {$group: { _id: "$_id", status: {$first: "$status"} }},
    {$match: { status: "closed"}},
    {$project: { _id: "$_id._id", name: "$_id.name", forms: "$_id.forms"}}
])

Leave a Comment