CouchDB Indexes
Mango indexes work like indexes in most other database systems: they use a little extra space to improve query performance.
They primarily consist of a list of fields to index, but can also contain a selector to create a partial index.
Mango indexes have a type, currently either json, text, or nouveau.
Index Definitions
Index definitions are JSON objects with the following fields:
- ddoc (string): ID of the design document the index belongs to. This ID can be used to retrieve the design document containing the index, by making a GET request to /{db}/ddoc, where ddoc is the value of this field.
- name (string): Name of the index.
- partitioned (boolean): Partitioned (true) or global (false) index.
- type (string): Type of the index. Can be "json", "text", "nouveau", or sometimes "special".
- def/index (object): Definition of the index, depending on the type. Which name is used depends on the context.
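To illustrate how the ddoc field maps to a URL, the helper below builds the address of the design document that holds an index. It is a sketch only: the function name, the base URL, and the sample definition are assumptions made for this example, and no request is sent.

```python
from urllib.parse import quote


def design_doc_url(base_url: str, db: str, index_def: dict) -> str:
    """Build the URL of the design document that contains an index.

    Assumes index_def["ddoc"] holds the full design document ID,
    for example "_design/type-not-archived".
    """
    ddoc = index_def.get("ddoc")
    if not ddoc:
        # Assumption: without a ddoc value there is no design document to fetch.
        raise ValueError("index definition has no ddoc value")
    # Keep "/" unescaped so the "_design/<name>" ID stays a two-segment path.
    return f"{base_url.rstrip('/')}/{quote(db, safe='')}/{quote(ddoc, safe='/')}"


# Hypothetical index definition, shaped like the fields listed above.
example = {
    "ddoc": "_design/type-not-archived",
    "name": "by-type",
    "partitioned": False,
    "type": "json",
    "def": {"fields": [{"type": "asc"}]},
}

print(design_doc_url("http://localhost:5984", "db", example))
# prints http://localhost:5984/db/_design/type-not-archived
```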
JSON Indexes
JSON Indexes are your standard structural indexes, used by the majority of selector operators.
Their definition consists of:
- fields (array): Array of field names following the sort syntax. Nested fields are also allowed, e.g. “person.name”.
- partial_filter_selector (object): A selector to apply to documents at indexing time, creating a partial index. Optional.
Example:
{
"type" : "json",
"index": {
"fields": ["foo"]
}
}
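A definition like the one above can also be assembled programmatically. The following is a minimal sketch, assuming a hypothetical helper named json_index_body; it only builds the JSON body and leaves sending it to the database to the caller.

```python
import json


def json_index_body(fields, partial_filter_selector=None, ddoc=None, name=None):
    """Assemble a body describing a "json" index.

    fields: list of field names; nested fields such as "person.name" are allowed.
    partial_filter_selector: optional selector that makes this a partial index.
    """
    index = {"fields": list(fields)}
    if partial_filter_selector is not None:
        index["partial_filter_selector"] = partial_filter_selector
    body = {"type": "json", "index": index}
    # ddoc and name are only included when the caller supplies them.
    if ddoc is not None:
        body["ddoc"] = ddoc
    if name is not None:
        body["name"] = name
    return body


# Reproduces the example definition shown above.
print(json.dumps(json_index_body(["foo"]), indent=2))

# A nested field plus a name chosen for illustration.
print(json.dumps(json_index_body(["person.name"], name="by-person-name")))
```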
Partial Indexes
Partial indexes allow documents to be filtered at indexing time, potentially offering significant performance improvements for query selectors that do not map cleanly to a range query on an index.
Let's look at an example query:
{
"selector": {
"status": {
"$ne": "archived"
},
"type": "user"
}
}
Without a partial index, this requires a full index scan to find all the documents of "type":"user" that do not have a status of "archived". This is because a normal index can only be used to match contiguous rows, and the "$ne" operator cannot guarantee that.
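The contiguity argument can be seen in a toy model. The sketch below is not how CouchDB stores indexes; it simply treats an index on "status" as a sorted list of rows (the documents and IDs are made up) and contrasts an equality lookup, which is one contiguous slice, with "$ne", which matches rows on both sides of the excluded key.

```python
from bisect import bisect_left, bisect_right

# Toy index on "status": (key, doc_id) rows kept in sorted order.
rows = sorted([
    ("active", "u1"), ("archived", "u2"), ("active", "u3"),
    ("pending", "u4"), ("archived", "u5"),
])
keys = [key for key, _ in rows]


def eq_scan(value):
    """Equality maps to a single contiguous slice of the sorted rows."""
    lo, hi = bisect_left(keys, value), bisect_right(keys, value)
    return [doc_id for _, doc_id in rows[lo:hi]]


def ne_scan(value):
    """$ne matches rows before and after the excluded key, so all rows are visited."""
    return [doc_id for key, doc_id in rows if key != value]


print(eq_scan("archived"))  # ['u2', 'u5']
print(ne_scan("archived"))  # ['u1', 'u3', 'u4']
```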
To improve response times, we can create an index that excludes archived documents at index time by setting the partial_filter_selector field to "status": { "$ne": "archived" }:
POST /db/_index HTTP/1.1
Content-Type: application/json
Content-Length: 144
Host: localhost:5984
{
"index": {
"partial_filter_selector": {
"status": {
"$ne": "archived"
}
},
"fields": ["type"]
},
"ddoc" : "type-not-archived",
"type" : "json"
}
Partial indexes are not currently used by the query planner unless specified by a "use_index" field, so we need to modify the original query:
{
"selector": {
"status": {
"$ne": "archived"
},
"type": "user"
},
"use_index": "type-not-archived"
}
Technically, we do not need to include the filter on the "status" field in the query selector - the partial index ensures this is always true - but including it makes the intent of the selector clearer and will make it easier to take advantage of future improvements to query planning (e.g. automatic selection of partial indexes).
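The effect of a partial index can be modelled in a few lines. This is an in-memory sketch under simplifying assumptions, not CouchDB's implementation: the sample documents and the helper names build_partial_index and lookup are invented, and the filter is a plain Python predicate standing in for the partial_filter_selector.

```python
from bisect import bisect_left, bisect_right

docs = [
    {"_id": "u1", "type": "user", "status": "active"},
    {"_id": "u2", "type": "user", "status": "archived"},
    {"_id": "o1", "type": "order", "status": "active"},
    {"_id": "u3", "type": "user", "status": "pending"},
]


def build_partial_index(docs, field, keep):
    """Index only the documents for which keep(doc) is true, sorted by field."""
    return sorted((doc[field], doc["_id"]) for doc in docs if keep(doc))


def lookup(index, value):
    """Equality lookup: one contiguous slice of the (already filtered) index."""
    keys = [key for key, _ in index]
    lo, hi = bisect_left(keys, value), bisect_right(keys, value)
    return [doc_id for _, doc_id in index[lo:hi]]


# Archived documents never enter the index, so the query needs no $ne scan.
index = build_partial_index(docs, "type", lambda d: d["status"] != "archived")
print(lookup(index, "user"))  # ['u1', 'u3']
```

Because the filter has already been applied when the index is built, the lookup on "type" returns only non-archived users without examining the "status" field again.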
Text Indexes
Mango can also interact with the Search and Nouveau search systems, using the $text selector and the appropriate index. These indexes can be queried using either $text or GET /{db}/_design/{ddoc}/_search/{index} or GET /{db}/_design/{ddoc}/_nouveau/{index}.
Example index:
{
"type": "nouveau",
"index": {
"fields": [
{"name": "foo", "type": "string"},
{"name": "bar", "type": "number"},
{"name": "baz", "type": "string"}
],
"default_analyzer": "keyword"
}
}
A Text or Nouveau index definition consists of:
- fields: The list of fields to index. Either "all_fields" or a list of objects, each with:
  - name (string): not blank
  - type (string): one of "text", "string", "number", "boolean"
- default_analyzer (string): Analyzer to use, defaults to "keyword". Optional.
- default_field: Enables the “default field” index. Either a boolean or an object with enabled and analyzer. Optional.
- partial_filter_selector (object): A selector, causing this to be a partial index. Optional.
- selector (object): A selector. Optional.
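The rules in this list can be turned into a small client-side check. The sketch below is hypothetical and mirrors only what the list states; it treats fields as required because the list does not mark it optional, which is an assumption, and the server may accept or reject more than this.

```python
ALLOWED_FIELD_TYPES = {"text", "string", "number", "boolean"}


def check_text_index(index: dict) -> list:
    """Return a list of problems found in a text or nouveau "index" object."""
    problems = []
    fields = index.get("fields")
    if isinstance(fields, list):
        for i, field in enumerate(fields):
            if not isinstance(field, dict):
                problems.append(f"fields[{i}] must be an object")
                continue
            name = field.get("name")
            if not isinstance(name, str) or not name.strip():
                problems.append(f"fields[{i}].name must be a non-blank string")
            ftype = field.get("type")
            if not isinstance(ftype, str) or ftype not in ALLOWED_FIELD_TYPES:
                problems.append(f"fields[{i}].type must be one of {sorted(ALLOWED_FIELD_TYPES)}")
    elif fields != "all_fields":
        problems.append('fields must be "all_fields" or a list of objects')
    # Both selector-valued options must be JSON objects when present.
    for key in ("partial_filter_selector", "selector"):
        if key in index and not isinstance(index[key], dict):
            problems.append(f"{key} must be an object")
    if "default_analyzer" in index and not isinstance(index["default_analyzer"], str):
        problems.append("default_analyzer must be a string")
    return problems


example = {
    "fields": [{"name": "foo", "type": "string"}, {"name": "bar", "type": "number"}],
    "default_analyzer": "keyword",
}
print(check_text_index(example))  # []
print(check_text_index({"fields": [{"name": " ", "type": "date"}]}))
```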