MongoDB Aggregation Framework: The SQL of NoSQL

No, it’s not really SQL. Its syntax resembles JSON, and it returns a collection of JSON documents. But the Aggregation Framework gives MongoDB what many have found lacking in non-relational databases: the capability for ad hoc queries that SQL performs so well.

Actually, MongoDB already supports a variety of query mechanisms: queries on object properties, regular expression queries, and complex queries using JavaScript methods. And for the aggregation queries that a SQL developer would write with a GROUP BY clause, a developer using MongoDB can use its built-in facilities for MapReduce, a programming pattern for grouping data into buckets and then processing each bucket.

But JavaScript-based queries require a developer to write a function. And MapReduce requires two functions, one for the map (grouping) and one for the reduce (analysis of the group). While doable, this is quite a bit of work compared to the ease of SQL. And it is this problem that the Aggregation Framework addresses.
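To make the two-function burden concrete, here is a minimal sketch of the MapReduce pattern in plain Python. It is an in-memory emulation for illustration only, not MongoDB's implementation (which takes JavaScript functions), and the `orders` data, field names, and function names are all hypothetical.

```python
from collections import defaultdict


def map_order(order):
    """Map step: choose the bucket (key) and the value to put in it."""
    return order["customer"], order["amount"]


def reduce_amounts(customer, amounts):
    """Reduce step: analyze one bucket, here by summing its values."""
    return sum(amounts)


def map_reduce(documents, map_fn, reduce_fn):
    """Group mapped values by key, then reduce each group to one result."""
    buckets = defaultdict(list)
    for doc in documents:
        key, value = map_fn(doc)
        buckets[key].append(value)
    return {key: reduce_fn(key, values) for key, values in buckets.items()}


# Hypothetical sample documents.
orders = [
    {"customer": "ann", "amount": 10},
    {"customer": "bob", "amount": 5},
    {"customer": "ann", "amount": 25},
]

totals = map_reduce(orders, map_order, reduce_amounts)
```

Even for a simple total per customer, the developer has to write and wire up both functions, where SQL would need a single GROUP BY.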

The Aggregation Framework provides a declarative syntax for creating query expressions. The declarative statements work together like piped commands in a UNIX shell: the output of one statement becomes the input of the next. The statements closely resemble their SQL counterparts. There is a match statement to identify records for inclusion, a sort statement for sorting, and a group statement for grouping. Grouping supports the aggregation functions one would expect, such as average, first, minimum, and maximum. And there are string and date extraction statements similar to those in SQL.
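As a rough illustration of the piped statements described above, a pipeline can be written as an ordered list of stage documents. This sketch assumes the stage operators are spelled `$match`, `$sort`, and `$group`, with `$avg`, `$max`, `$min`, and `$first` as the grouping functions; the `orders` collection and its `status`, `customer`, and `amount` fields are hypothetical.

```python
# Each stage is one document; stages run in order, like commands in a pipe.
pipeline = [
    # Keep only the documents of interest (compare SQL's WHERE).
    {"$match": {"status": "shipped"}},
    # Order the surviving documents, largest amount first (ORDER BY).
    {"$sort": {"amount": -1}},
    # Bucket by customer and compute aggregates per bucket (GROUP BY).
    {
        "$group": {
            "_id": "$customer",
            "average_amount": {"$avg": "$amount"},
            "largest_amount": {"$max": "$amount"},
            "smallest_amount": {"$min": "$amount"},
            # First value seen in each group, given the sort above.
            "first_amount": {"$first": "$amount"},
        }
    },
]

# Names of the stages in execution order.
stage_names = [next(iter(stage)) for stage in pipeline]
```

With the PyMongo driver, a list like this would typically be passed to a collection's `aggregate` method, for example `db.orders.aggregate(pipeline)`, where `db` is an assumed database handle.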

I found the unwind statement particularly interesting. Unwind pulls out elements from an array. MongoDB documents are JSON objects, which may include a hierarchy of child objects, including arrays of objects. The unwind statement returns a copy of the parent document for each element within the child array, but instead of the entire array, each document displays only the value of one array element. The result looks much like you would expect from a SQL join.
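The effect of unwind can be sketched in a few lines of plain Python. This is an in-memory emulation for illustration, not MongoDB's implementation; the `posts` documents and the `unwind` helper are hypothetical. In a pipeline, the corresponding stage would be written as `{"$unwind": "$tags"}`.

```python
from copy import deepcopy


def unwind(documents, field):
    """Yield one copy of each document per element of its array field.

    In each copy, the array is replaced by a single element. Documents
    whose array is empty or missing produce no output.
    """
    for doc in documents:
        for element in doc.get(field, []):
            flattened = deepcopy(doc)
            flattened[field] = element
            yield flattened


# Hypothetical sample documents.
posts = [
    {"title": "Aggregation", "tags": ["mongodb", "nosql"]},
    {"title": "Untagged", "tags": []},
]

rows = list(unwind(posts, "tags"))
```

Here `rows` holds two documents, both titled "Aggregation", one with `tags` set to `"mongodb"` and one with `tags` set to `"nosql"`, much like the rows a SQL join between a posts table and a tags table would return.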

The Aggregation Framework is not yet included in MongoDB, though it is targeted for inclusion in version 2.2 due out in March. Last night at the SF Bay MongoDB Meet-up, Chris Westin of 10gen gave a preview and demo to the crowd. For more information, see his slide deck at http://www.slideshare.net/cwestin63/mongodbs-new-aggregation-framework.

For any organization considering NoSQL databases, the Aggregation Framework will certainly ease the SQL-to-NoSQL transition.


I’m a solution engineer for Shape Security, an awesome web security startup in Mountain View that defends some of the world’s largest web sites from bot attacks. I see this blog as a learning tool. It gives me a chance to collect my thoughts on topics of interest and to share them with others. If you see a mistake or think I’m on the wrong track, please let me know. I appreciate comments. See my LinkedIn profile at http://www.linkedin.com/in/jamesdowney and follow me on Twitter at http://twitter.com/james_downey.

Posted in MongoDB, NoSQL
One comment on “MongoDB Aggregation Framework: The SQL of NoSQL”
  1. The MongoDB driver for XQuery also enables complex processing such as group by or windowing. An example can be found at http://28.io/mongodb
