MongoDB Query Language

Databases in MongoDB: creating, listing, current

Collections in MongoDB

A collection is a set of JSON objects, and is analogous to a table in a relational database.

Inserting in MongoDB

Querying

Introduction to find

The find method is used to perform queries in MongoDB. Querying returns a subset of documents in a collection, from no documents at all to the entire collection. Which documents get returned is determined by the first argument to find, which is a document specifying the query criteria.

An empty query document (i.e., {}) matches everything in the collection. If find isn't given a query document, it defaults to {}. For example, the following:

<db.c.find()

matches every document in the collection c (and returns these documents in batches).

When we start adding key/value pairs to the query document, we begin restricting our search. This works in a straightforward way for most types: numbers match numbers, booleans match booleans, and strings match strings. Querying for a simple type is as easy as specifying the value that you are looking for. For example, to find all documents where the value for age is 27, we can add that key/value pair to the query document:

> db.users.find({"age":27})

If we have a string we want to match, such as a "username" key with the value "joe", we use that key/value pair instead:

> db.users.find({"username":"joe"})

Multiple conditions can be strung together by adding more key/value pairs to the query document, which gets interpreted as "condition1 AND condition2 AND … AND conditionN." For instance, to get all users who are 27-year-olds with the username "joe," we can query for the following:

> db.users.find({"username":"joe","age":27})

Specifying Which Keys to Return

Sometimes you do not need all of the key/value pairs in a document returned. If this is the case, you can pass a second argument to find (or findOne) specifying the keys you want. This reduces both the amount of data sent over the wire and the time and memory used to decode documents on the client side.

For example, if you have a user collection and you are interested only in the username and email keys, you could return just those keys with the following query:

> db.users.find({},{"username":1,"email":1})
  {
  "_id":ObjectId("4ba0f0dfd22aa494fd523620"),
  "username":"joe",
  "email":"joe@example.com"
  }

As you can see from the previous output, the _id key is returned by default, even if it isn't specifically requested.

You can also use this second parameter to exclude specific key/value pairs from the results of a query. For instance, you may have documents with a variety of keys, and the only thing you know is that you never want to return the fatal_weakness key:

> db.users.find({},{"fatal_weakness":0})

The following can also prevent _id from being returned:

>db.users.find({},{"username":1,"_id":0})
  {
  "username":"joe",
  }

Limitations

There are some restrictions on queries. The value of a query document must be a constant as far as the database is concerned. (It can be a normal variable in your own code.) That is, it cannot refer to the value of another key in the document. For example, if we were keeping inventory and we had both in_stock and num_sold keys, we couldn't compare their values by querying the following:

> db.stock.find({"in_stock":"this.num_sold"}) // doesn't work

There are ways to do this (see $where Queries), but you will usually get better performance by restructuring your document slightly, such that a normal query will suffice. In this example, we could instead use the keys initial_stock and in_stock. Then, every time someone buys an item, we decrement the value of the in_stock key by one. Finally, we can do a simple query to check which items are out of stock:

> db.stock.find({"in_stock": 0})

Query Criteria

Queries can go beyond the exact matching described in the previous section; they can match more complex criteria, such as ranges, OR-clauses, and negation.

Query: Comparisons

"$lt", "$lte", "$gt", and "$gte" are all comparison operators, corresponding to <, <=, >, and >=, respectively. They can be combined to look for a range of values. For example, to look for users who are between the ages of 18 and 30, we can do this:

> db.users.find({"age":{"$gte":18,"$lte":30}})

This would find all documents where the age field was greater than or equal to 18 AND less than or equal to 30.


To query for documents where a key's value is not equal to a certain value, you must use another conditional operator, "$ne", which stands for "not equal." If you want to find all users who do not have the username joe, you can query for them using this:

> db.users.find({"username":{"$ne":"joe"}})

Actually, "$ne" can be used with any type.

Match by Date

These types of range queries are often useful for dates. For example, to find people who registered before January 1, 2007, we can do this:

> start = new Date("01/01/2007")
> db.users.find({"registered":{"$lt":start}})

Depending on how you create and store dates, an exact match might be less useful, since dates are stored with millisecond precision. Often you want a whole day, week, or month, making a range query necessary.

The following query filter uses the $gt operator and Date() method to find all documents where the dateCreated field value is later than June 22nd, 2000:

{ dateCreated: { $gt: new Date('2000-06-22') } }

Match by Multiple Conditions ($and)

The following query filter finds all documents where scores array contains the value 75, and the name is Greg Powell:

{ $and: [ { scores: 75, name: "Greg Powell" } ] }

Match by Multiple Possible Conditions ($in and $or)

There are two ways to do an OR query in MongoDB. "$in" can be used to query for a variety of values for a single key. "$or" is more general; it can be used to query for any of the given values across multiple keys.

If you have more than one possible value to match for a single key, use an array of criteria with "$in". For instance, suppose we're running a raffle and the winning ticket numbers are 725, 542, and 390. To find all three of these documents, we can construct the following query:

> db.raffle.find({"ticket_no":{"$in":[725,542,390]}})

"$in" is very flexible and allows you to specify criteria of different types as well as values. For example, if we are gradually migrating our schema to use usernames instead of user ID numbers, we can query for either by using this:

> db.users.find({"user_id":{"$in":[12345,"joe"]}})

This matches documents with a user_id equal to 12345 and documents with a user_id equal to joe.

If "$in" is given an array with a single value, it behaves the same as directly matching the value. For instance, {ticket_no : {$in : [725]}} matches the same documents as {ticket_no : 725}.

The opposite of "$in" is "$nin", which returns documents that don't match any of the criteria in the array. If we want to return all of the people who didn't win anything in the raffle, we can query for them with this:

> db.raffle.find({"ticket_no":{"$nin":[725,542,390]}})

This query returns everyone who did not have tickets with those numbers.

"$in" gives you an OR query for a single key, but what if we need to find documents where ticket_no is 725 or winner is true? For this type of query, we'll need to use the "$or" conditional. "$or" takes an array of possible criteria. In the raffle case, using "$or" would look like this:

> db.raffle.find({"$or":[{"ticket_no": 725},{"winner": true}]})

"$or" can contain other conditionals. If, for example, we want to match any of the three ticket_no values or the winner key, we can use this:

> db.raffle.find({"$or":[{"ticket_no":{"$in":[725,542,390]}}, ... {"winner":true}]})

With a normal AND-type query, you want to narrow down your results as far as possible in as few arguments as possible. OR-type queries are the opposite: they are most efficient if the first arguments match as many documents as possible.

While "$or" will always work, use "$in" whenever possible as the query optimizer handles it more efficiently.


The following query filter uses the $or operator to find documents where version is 4, or name is Andrea Le:

{ $or: [ { version: 4 }, { name: "Andrea Le" } ] }

Match by Exclusion ($not)

"$not" is a metaconditional: it can be applied on top of any other criteria. As an example, let's consider the modulus operator, "$mod". "$mod" queries for keys whose values, when divided by the first value given, have a remainder of the second value:

> db.users.find({"id_num":{"$mod":[5,1]}})

The previous query returns users with id_nums of 1, 6, 11, 16, and so on. If we want, instead, to return users with id_nums of 2, 3, 4, 5, 7, 8, 9, 10, 12, etc., we can use "$not":

> db.users.find({"id_num":{"$not":{"$mod":[5,1]}}})

"$not" can be particularly useful in conjunction with regular expressions to find all documents that don't match a given pattern (regular expression usage is described in the section Regular Expressions).


The following query filter uses the $not operator to find all documents where the value of the name field is not equal to "Andrea Le", or the name field does not exist:

{ name: { $not: { $eq: "Andrea Le" } } }

Type-Specific Queries

MongoDB has a wide variety of types that can be used in a document. Some of these types have special behavior when querying.

null

null behaves a bit strangely. It does match itself, so if we have a collection with the following documents:

> db.c.find()
{"_id":ObjectId("4ba0f0dfd22aa494fd523621"),"y":null}
{"_id":ObjectId("4ba0f0dfd22aa494fd523622"),"y":1}
{"_id":ObjectId("4ba0f148d22aa494fd523623"),"y":2}

we can query for documents whose y key is null in the expected way:

> db.c.find({"y":null})
{"_id":ObjectId("4ba0f0dfd22aa494fd523621"),"y":null}

However, null also matches does not exist. Thus, querying for a key with the value null will return all documents lacking that key:

> db.c.find({"z":null})
{"_id":ObjectId("4ba0f0dfd22aa494fd523621"),"y":null}
{"_id":ObjectId("4ba0f0dfd22aa494fd523622"),"y":1}
{"_id":ObjectId("4ba0f148d22aa494fd523623"),"y":2}

If we only want to find keys whose value is null, we can check that the key is null and exists using the "$exists" conditional:

> db.c.find({"z":{"$eq":null,"$exists":true}})

Match by Substring

The following query filter uses the $regex operator to find all documents where the value of email includes the term /andrea_le/ (where the underscores stands for any character):

{ email: { $regex: "andrea_le" } }

Regular Expression

"$regex" provides regular expression capabilities for pattern matching strings in queries. Regular expressions are useful for flexible string matching. For example, if we want to find all users with the name "Joe" or "joe," we can use a regular expression to do case-insensitive matching:

> db.users.find({"name":{"$regex":/joe/i}})

Regular expression flags (e.g., i) are allowed but not required. If we want to match not only various capitalizations of "joe," but also "joey," we can continue to improve our regular expression:

> db.users.find({"name":/joey?/i})

MongoDB uses the Perl Compatible Regular Expression (PCRE) library to match regular expressions; any regular expression syntax allowed by PCRE is allowed in MongoDB. It is a good idea to check your syntax with the JavaScript shell before using it in a query to make sure it matches what you think it matches.

MongoDB can leverage an index for queries on prefix regular expressions (e.g., /^joey/). Indexes cannot be used for case-insensitive searches (/^joey/i). A regular expression is a "prefix expression" when it starts with either a caret (^) or a left anchor (\A). If the regular expression uses a case-sensitive query, then if an index exists for the field, the matches can be conducted against values in the index. If it also is a prefix expression, then the search can be limited to the values within the range created by that prefix from the index.

Regular expressions can also match themselves. Very few people insert regular expressions into the database, but if you insert one, you can match it with itself:

> db.foo.insertOne({"bar":/baz/})
> db.foo.find({"bar":/baz/})
{
  "_id":ObjectId("4b23c3ca7525f35f94b60a2d"),
  "bar":/baz/
}

Querying Array

The following query filter uses the $elemMatch operator to find all documents where at least one value in the scores array is greater than 80 and less than 90:

{ scores: { $elemMatch: { $gt: 80, $lt: 90 } } }

Querying on Embedded Document

The following query filter finds the document with the school.name subfield of Northwestern:

{ "school.name": "Northwestern" }

$where Queries*

*

*

Cursors*

*

*

MongoDB Filters*

*

*

MongoDB Realm Filters

A filter modifies an incoming MongoDB query to return only a subset of the results matched by the query. Filters add additional query parameters and omit fields from query results before Realm runs the query.

Every filter has three components:

  • An Apply When expression that determines if the filter applies to an incoming request. You can use variables like %%user and %%request in this expression. However, you cannot use MongoDB variables like %%root because Realm applies filters before it reads any data.
  • An optional query document, which uses standard MongoDB query syntax and merges with the existing query of any request the filter applies to.
  • An optional projection document, which uses standard MongoDB projection syntax and merges with the existing projection of any request the filter applies to.

Indexes*

Special Index and Collection Types*

Introduction to the Aggregation Framework*

*

Updating Documents

Removing Documents

Outputing to and Inputing from a File

You can log all the shell session if you invoked the shell with tee command: mongo | tee session.txt. Or you can use the --eval switch like so: mongo localhost:27017/dbname --eval "printjson(db.MyCollection.find().pretty())" > sample.json.

Outputing to a File with mongoexport

  • To get all documents in collection crops, in database farming, into file farming.crops.json, issue: mongoexport --db=farming --collection=crops --out="farming.crops.json" --pretty
  • To get results as an array, add --jsonArray, as in: mongoexport --db=farming --collection=crops --out="farming.crops.json" --pretty --jsonArray
  • You can select which fields to get through -f, --fields=<field>[,<field>]* (a comma separated list of field names (required for exporting CSV) e.g. -f "name,age") or --fieldFile=<filename> (a file with field names, one per line)
  • You can specify a query filter through -q, --query=<json>, such as {x:{$gt:1}}, or set a path to a query file containing a query filter for JSON files, through --queryFile=<filename>

Inputing from a File with mongoimport

Switches --db, --collections, --fields, --fieldFile, --jsonArray are the same as with mongoexport.

Command mongoimport has some ingest switches, too:

--drop
drop collection before inserting documents
--ignoreBlanks
ignore fields with empty values in CSV and TSV
--maintainInsertionOrder
insert documents in the order of their appearance in the input source
-j, --numInsertionWorkers=<number>
number of insert operations to run concurrently (defaults to 1) (default: 1)
--stopOnError
stop importing at first insert/upsert error
--mode=[insert|upsert|merge]
insert: insert only; upsert: insert or replace existing documents; merge: insert or modify existing documents; defaults to insert
--upsertFields=<field>[,<field>]*
comma-separated fields for the query part when --mode is set to upsert or merge
--writeConcern=<write-concern-specifier>
write concern options e.g. --writeConcern majority, --writeConcern '{w: 3, wtimeout: 500, fsync: true, j: true}'
--bypassDocumentValidation
bypass document validation