Using Apache CouchDB

/{db}/_find

(From https://docs.couchdb.org/en/stable/api/database/find.html)

Find documents using a declarative JSON querying syntax. Queries will use custom indexes, specified using the _index endpoint, if available. Otherwise, when allowed, they use the built-in _all_docs index, which can be arbitrarily slow.

Structure of Request JSON Object

A Request JSON Object holds the following fields:

  • selector (object) – JSON object describing criteria used to select documents. More information provided in the section on selector syntax. Required
  • limit (number) – Maximum number of results returned. Default is 25. Optional
  • skip (number) – Skip the first ‘n’ results, where ‘n’ is the value specified. Optional
  • sort (array) – JSON array following sort syntax. Optional
  • fields (array) – JSON array specifying which fields of each object should be returned. If it is omitted, the entire object is returned. More information provided in the section on filtering fields. Optional
  • use_index (string|array) – Request a query to use a specific index. Specified either as "<design_document>" or ["<design_document>", "<index_name>"]. It is not guaranteed that the index will be actually used because if the index is not valid for the selector, fallback to a valid index is attempted. Therefore that is more like a hint. When fallback occurs, the details are given in the warning field of the response. Optional
  • allow_fallback (boolean) – Tell if it is allowed to fall back to another valid index. This can happen on running a query with an index specified by use_index which is not deemed usable, or when only the built-in _all_docs index would be picked in lack of indexes available to support the query. Disabling this fallback logic causes the endpoint immediately return an error in such cases. Default is true. Optional
  • conflicts (boolean) – Include conflicted documents if true. Intended use is to easily find conflicted documents, without an index or view. Default is false. Optional
  • r (number) – Read quorum needed for the result. This defaults to 1, in which case the document found in the index is returned. If set to a higher value, each document is read from at least that many replicas before it is returned in the results. This is likely to take more time than using only the document stored locally with the index. Optional, default: 1
  • bookmark (string) – A string that enables you to specify which page of results you require. Used for paging through result sets. Every query returns an opaque string under the bookmark key that can then be passed back in a query to get the next page of results. If any part of the selector query changes between requests, the results are undefined. Optional, default: null
  • update (boolean) – Whether to update the index prior to returning the result. Default is true. Optional
  • stable (boolean) – Whether or not the view results should be returned from a “stable” set of shards. Optional
  • stale (string) – Combination of update=false and stable=true options. Possible options: "ok", false (default). Optional Note that this parameter is deprecated. Use stable and update instead. See Views Generation for more details.
  • execution_stats (boolean) – Include execution statistics in the query response. Optional, default: false

Using an Index

Example request body for finding documents using an index:

POST /movies/_find HTTP/1.1
Accept: application/json
Content-Type: application/json
Content-Length: 168
Host: localhost:5984

{
    "selector": {
        "year": {"$gt": 2010}
    },
    "fields": ["_id", "_rev", "year", "title"],
    "sort": [{"year": "asc"}],
    "limit": 2,
    "skip": 0,
    "execution_stats": true
}

Example response when finding documents using an index:

HTTP/1.1 200 OK
Cache-Control: must-revalidate
Content-Type: application/json
Date: Thu, 01 Sep 2016 15:41:53 GMT
Server: CouchDB (Erlang OTP)
Transfer-Encoding: chunked

{
    "docs": [
        {
            "_id": "176694",
            "_rev": "1-54f8e950cc338d2385d9b0cda2fd918e",
            "year": 2011,
            "title": "The Tragedy of Man"
        },
        {
            "_id": "780504",
            "_rev": "1-5f14bab1a1e9ac3ebdf85905f47fb084",
            "year": 2011,
            "title": "Drive"
        }
    ],
    "execution_stats": {
        "total_keys_examined": 200,
        "total_docs_examined": 200,
        "total_quorum_docs_examined": 0,
        "results_returned": 2,
        "execution_time_ms": 5.52
    }
}

Selectors

Selectors are expressed as a JSON object describing documents of interest. Within this structure, you can apply conditional logic using specially named fields.

Whilst selectors have some similarities with MongoDB query documents, these arise from a similarity of purpose and do not necessarily extend to commonality of function or result.

Elementary selector syntax requires you to specify one or more fields, and the corresponding values required for those fields. This selector matches all documents whose "director" field has the value Lars von Trier.

{ "director": "Lars von Trier" }

A simple selector, inspecting specific fields:

"selector": {
    "title": "Live And Let Die"
},
"fields": [
    "title",
    "cast"
]

You can create more complex selector expressions by combining operators. For best performance, it is best to combine ‘combination’ or ‘array logical’ operators, such as $regex, with an operator that defines a contiguous range of keys such as $eq, $gt, $gte, $lt, $lte, and $beginsWith (but not $ne). For more information about creating complex selector expressions, see creating selector expressions.

Selector with 2 fields

This selector matches any document with a name field containing "Paul", and that also has a location field with the value "Boston".

{
    "name": "Paul",
    "location": "Boston"
}

Subfields

A more complex selector enables you to specify the values for field of nested objects, or subfields. For example, you might use a standard JSON structure for specifying a field and subfield.

Example of a field and subfield selector, using a standard JSON structure:

{
    "imdb": {
        "rating": 8
    }
}

An abbreviated equivalent uses a dot notation to combine the field and subfield names into a single name.

{
    "imdb.rating": 8
}

Combination Operators

Combination operators are used to combine selectors. In addition to the common boolean operators found in most programming languages, there are three combination operators ($all, $elemMatch, and $allMatch) that help you work with JSON arrays and one that works with JSON maps ($keyMapMatch).

A combination operator takes a single argument. The argument is either another selector, or an array of selectors.

The list of combination operators:

Operator Argument Purpose
$and Array Matches if all the selectors in the array match.
$or Array Matches if any of the selectors in the array match. All selectors must use the same index.
$not Selector Matches if the given selector does not match.
$nor Array Matches if none of the selectors in the array match.
$all Array Matches an array value if it contains all the elements of the argument array.
$elemMatch Selector Matches and returns all documents that contain an array field with at least one element that matches all the specified query criteria.
$allMatch Selector Matches and returns all documents that contain an array field with all its elements matching all the specified query criteria.
$keyMapMatch Selector Matches and returns all documents that contain a map that contains at least one key that matches all the specified query criteria.
$text String Perform a text search

            


            


          

*


            


            


          

*


            


            


          

*


            


            


          

*


            


            


          

*


            


            


          

*


            


            


          

*


            


            


          

Views and Map-Reduce

CouchDB uses views filtered through map-reduce to query all the documents of your database. Each view has a map- and optionally a reduce-function. Doctrine CouchDB ODM allows you to create and query views in your application.

MapReduce is a programming model and an associated implementation for processing and generating big data sets with a parallel, distributed algorithm on a cluster. A MapReduce program is composed of a Map() procedure (method) that performs filtering and sorting (such as sorting students by first name into queues, one queue for each name) and a Reduce() method that performs a summary operation (such as counting the number of students in each queue, yielding name frequencies).

A MapReduce view is created by adding it to the _design document for the database (which we will discuss later). You can also use Futon, which is accessible with the following URL once you have CouchDB set up:

http://127.0.0.1:5984/_utils/

to create a Temporary View directly through the interface. So I would set up a map and a reduce function. The result which would be displayed at the bottom of the image is the result of the map step, which contains rows of data containing, say, the documents name as the key, and the age as the value. We can then enable the reduce step by ticking the reduce box just above the result set.


An example:

function(doc){

    if(doc.age > 29 && doc.age < 36){
        emit(doc.age, doc.income);
    }

}

whose result would be fed to a reduce function:

function(keys, values) {
  var averageIncome = sum(values) / values.length;
  return averageIncome;
}

We might actually create more realistic MapReduce views and add them to the design doc (rather than just creating temporary views)


When you query a view, CouchDB will run the MapReduce function against every document in the database. On the surface, that sounds like a bad idea – especially if you've got millions of documents. However, it only performs this process once to create the view initially, and when updates are made to the data it only needs to make updates to the resulting view for that specific document (it doesn't need to regenerate the entire view again).

To query a view, all you need to do it access its URL, which will look something like this (once you have added it to a design doc):

mydb/_design/nameOfDesigndoc/_view/nameOfView

You can also supply parameters to the URL to restrict the returned dataset by using things like by_date? with a start_key and an end_key. The process for this would go something like Start at the row that has a date that equals the start_key and then keep returning rows until you reach a key that matches the end_key.