Querying CouchDB via the _find Endpoint and Indexes

(From https://docs.couchdb.org/en/stable/api/database/find.html)

Find documents using a declarative JSON querying syntax. Queries will use custom indexes, specified using the _index endpoint, if available. Otherwise, when allowed, they use the built-in _all_docs index, which can be arbitrarily slow.

Structure of Request JSON Object

A Request JSON Object holds quite a few fields.

Using an Index

Example request body for finding documents using an index:

POST /movies/_find HTTP/1.1
Accept: application/json
Content-Type: application/json
Content-Length: 168
Host: localhost:5984

{
    "selector": {
        "year": {"$gt": 2010}
    },
    "fields": ["_id", "_rev", "year", "title"],
    "sort": [{"year": "asc"}],
    "limit": 2,
    "skip": 0,
    "execution_stats": true
}

Example response when finding documents using an index:

HTTP/1.1 200 OK
Cache-Control: must-revalidate
Content-Type: application/json
Date: Thu, 01 Sep 2016 15:41:53 GMT
Server: CouchDB (Erlang OTP)
Transfer-Encoding: chunked

{
    "docs": [
        {
            "_id": "176694",
            "_rev": "1-54f8e950cc338d2385d9b0cda2fd918e",
            "year": 2011,
            "title": "The Tragedy of Man"
        },
        {
            "_id": "780504",
            "_rev": "1-5f14bab1a1e9ac3ebdf85905f47fb084",
            "year": 2011,
            "title": "Drive"
        }
    ],
    "execution_stats": {
        "total_keys_examined": 200,
        "total_docs_examined": 200,
        "total_quorum_docs_examined": 0,
        "results_returned": 2,
        "execution_time_ms": 5.52
    }
}

Selectors

Selectors are expressed as a JSON object describing documents of interest. Within this structure, you can apply conditional logic using specially named fields.

Whilst selectors have some similarities with MongoDB query documents, these arise from a similarity of purpose and do not necessarily extend to commonality of function or result.

Elementary selector syntax requires you to specify one or more fields, and the corresponding values required for those fields. This selector matches all documents whose "director" field has the value Lars von Trier.

{ "director": "Lars von Trier" }

A simple selector, inspecting specific fields:

"selector": {
    "title": "Live And Let Die"
},
"fields": [
    "title",
    "cast"
]

You can create more complex selector expressions by combining operators. For best performance, it is best to combine ‘combination’ or ‘array logical’ operators, such as $regex, with an operator that defines a contiguous range of keys such as $eq, $gt, $gte, $lt, $lte, and $beginsWith (but not $ne). For more information about creating complex selector expressions, see creating selector expressions.

Selector with 2 fields

This selector matches any document with a name field containing "Paul", and that also has a location field with the value "Boston".

{
    "name": "Paul",
    "location": "Boston"
}

Subfields

A more complex selector enables you to specify the values for field of nested objects, or subfields. For example, you might use a standard JSON structure for specifying a field and subfield.

Example of a field and subfield selector, using a standard JSON structure:

{
    "imdb": {
        "rating": 8
    }
}

An abbreviated equivalent uses a dot notation to combine the field and subfield names into a single name.

{
    "imdb.rating": 8
}

Combination Operators

Combination operators are used to combine selectors. In addition to the common boolean operators found in most programming languages, there are three combination operators ($all, $elemMatch, and $allMatch) that help you work with JSON arrays and one that works with JSON maps ($keyMapMatch).

A combination operator takes a single argument. The argument is either another selector, or an array of selectors.

The list of combination operators:

Operator Argument Purpose
$and Array Matches if all the selectors in the array match.
$or Array Matches if any of the selectors in the array match. All selectors must use the same index.
$not Selector Matches if the given selector does not match.
$nor Array Matches if none of the selectors in the array match.
$all Array Matches an array value if it contains all the elements of the argument array.
$elemMatch Selector Matches and returns all documents that contain an array field with at least one element that matches all the specified query criteria.
$allMatch Selector Matches and returns all documents that contain an array field with all its elements matching all the specified query criteria.
$keyMapMatch Selector Matches and returns all documents that contain a map that contains at least one key that matches all the specified query criteria.
$text String Perform a text search

Sort Syntax

The sort field contains a list of field name and direction pairs, expressed as a basic array. The first field name and direction pair is the topmost level of sort. The second pair, if provided, is the next level of sort.

The field can be any field, using dotted notation if desired for sub-document fields.

The direction value is "asc" for ascending, and "desc" for descending. If you omit the direction value, the default "asc" is used.


Example, sorting by 2 fields:

[
  {"fieldName1": "desc"},
  {"fieldName2": "desc"}
]

Example, sorting by 2 fields, assuming default direction for both :

["fieldNameA", "fieldNameB"]

A typical requirement is to search for some content using a selector, then to sort the results according to the specified field, in the required direction.


To use sorting, ensure that:

  • At least one of the sort fields is included in the selector.
  • There is an index already defined, with all the sort fields in the same order.
  • Each object in the sort array has a single key.

If an object in the sort array does not have a single key, the resulting sort order is implementation specific and might change.

Find does not support multiple fields with different sort orders, so the directions must be either all ascending or all descending.

For field names in text search sorts, it is sometimes necessary for a field type to be specified, for example:

{
  "fieldname:string": "asc"
}

If possible, an attempt is made to discover the field type based on the selector. In ambiguous cases the field type must be provided explicitly.

The sorting order is undefined when fields contain different data types. This is an important difference between text and view indexes. Sorting behavior for fields with different data types might change in future versions.


A simple query, using sorting:

{
    "selector": {"Actor_name": "Robert De Niro"},
    "sort": [{"Actor_name": "asc"}, {"Movie_runtime": "asc"}]
}

Views and Map-Reduce

CouchDB uses views filtered through map-reduce to query all the documents of your database. Each view has a map- and optionally a reduce-function. Doctrine CouchDB ODM allows you to create and query views in your application.

MapReduce is a programming model and an associated implementation for processing and generating big data sets with a parallel, distributed algorithm on a cluster. A MapReduce program is composed of a Map() procedure (method) that performs filtering and sorting (such as sorting students by first name into queues, one queue for each name) and a Reduce() method that performs a summary operation (such as counting the number of students in each queue, yielding name frequencies).

A MapReduce view is created by adding it to the _design document for the database (which we will discuss later). You can also use Futon, which is accessible with the following URL once you have CouchDB set up: http://127.0.0.1:5984/_utils/ to create a Temporary View directly through the interface. So I would set up a map and a reduce function. The result which would be displayed at the bottom of the image is the result of the map step, which contains rows of data containing, say, the documents name as the key, and the age as the value. We can then enable the reduce step by ticking the reduce box just above the result set.


An example:

function(doc){

    if(doc.age > 29 && doc.age < 36){
        emit(doc.age, doc.income);
    }

}

whose result would be fed to a reduce function:

function(keys, values) {
  var averageIncome = sum(values) / values.length;
  return averageIncome;
}

We might actually create more realistic MapReduce views and add them to the design doc (rather than just creating temporary views)


When you query a view, CouchDB will run the MapReduce function against every document in the database. On the surface, that sounds like a bad idea – especially if you've got millions of documents. However, it only performs this process once to create the view initially, and when updates are made to the data it only needs to make updates to the resulting view for that specific document (it doesn't need to regenerate the entire view again).

To query a view, all you need to do it access its URL, which will look something like this (once you have added it to a design doc):

mydb/_design/nameOfDesigndoc/_view/nameOfView

You can also supply parameters to the URL to restrict the returned dataset by using things like by_date? with a start_key and an end_key. The process for this would go something like Start at the row that has a date that equals the start_key and then keep returning rows until you reach a key that matches the end_key.