JMESPath JSON Query Language
jp
(From https://github.com/jmespath/jp)
The jp command is a command line interface to JMESPath, an expression language for manipulating JSON.
The most basic usage of jp is to accept input JSON data through stdin, apply the JMESPath expression you've provided as an argument to jp, and print the resulting JSON data to stdout.
$ echo '{"key": "aValue"}' | jp key
"aValue"
$ echo '{"foo": {"bar": ["a", "b", "c"]}}' | jp foo.bar[1]
"b"
To process a file, use the -f filename or --filename filename option:
$ jp --filename myJsonObj.json foo.bar[1]
"b"
Projections
Projections are one of the key features of JMESPath. They allow you to apply an expression to a collection of elements. There are five kinds of projections:
- List Projections
- Slice Projections
- Object Projections
- Flatten Projections
- Filter Projections
List and Slice Projections
A wildcard expression creates a list projection, which is a projection over a JSON array. This is best illustrated with an example. Let’s say we have a JSON document describing people, where each array element is a JSON object that has a first and a last key. Suppose we wanted a list of all the first names in our list.
Given an object:
{
"people": [
{"first": "James", "last": "d"},
{"first": "Jacob", "last": "e"},
{"first": "Jayden", "last": "f"},
{"missing": "different"}
],
"foo": {"bar": "baz"}
}
the expression people[*].first yields the result:
[ "James", "Jacob", "Jayden" ]
In the example above, the first expression, which is just an identifier, is applied to each element in the people array. The results are collected into a JSON array and returned as the result of the expression. The expression can be more complex than a basic identifier. For example, the expression foo[*].bar.baz[0] would project the bar.baz[0] expression to each element in the foo array.
There are a few things to keep in mind when working with projections. These are discussed in more detail in the wildcard expressions section of the spec, but the main points are:
- Projections are evaluated in two steps. The left hand side (LHS) creates a JSON array of initial values. The right hand side (RHS) of a projection is the expression to project for each element in the JSON array created by the left hand side. Each projection type has slightly different semantics when evaluating the left hand side, the right hand side, or both.
- If the result of the expression projected onto an individual array element is null, then that value is omitted from the collected set of results.
- You can stop a projection with a Pipe Expression (discussed later).
- A list projection is only valid for a JSON array. If the value is not a list, then the result of the expression is null.
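The rules above can be sketched in plain Python. This is only an illustration of the semantics, not the jmespath library; the helper name project_list is hypothetical, and Python None stands in for JSON null.

```python
# Plain-Python sketch of list-projection semantics (not the jmespath library).
def project_list(value, rhs):
    """Apply rhs to each element of a list, dropping None (JSON null) results."""
    if not isinstance(value, list):
        return None  # a list projection is only defined for JSON arrays
    projected = (rhs(element) for element in value)
    return [item for item in projected if item is not None]


document = {
    "people": [
        {"first": "James", "last": "d"},
        {"first": "Jacob", "last": "e"},
        {"first": "Jayden", "last": "f"},
        {"missing": "different"},
    ],
    "foo": {"bar": "baz"},
}

# Roughly people[*].first: the element without a "first" key yields None and is dropped.
first_names = project_list(document["people"], lambda person: person.get("first"))
print(first_names)  # ['James', 'Jacob', 'Jayden']

# Roughly foo[*]: foo is an object, not an array, so the projection gives None.
not_a_list = project_list(document["foo"], lambda item: item)
print(not_a_list)  # None
```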
Slice Projections
A slice projection is almost identical to a list projection, except that the left hand side is the result of evaluating the slice, which may not include all the elements in the original list. For example, given the people array above, the expression
people[:2].first
will yield the result:
[ "James", "Jacob" ]
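As a rough plain-Python analogue (an illustration only, not the jmespath library), the slice is evaluated first and the RHS is then projected onto the sliced list. The variable names are hypothetical.

```python
slice_people = [
    {"first": "James", "last": "d"},
    {"first": "Jacob", "last": "e"},
    {"first": "Jayden", "last": "f"},
    {"missing": "different"},
]

# LHS: evaluate the slice [:2], which keeps only the first two elements.
slice_lhs = slice_people[:2]

# RHS: project "first" onto each remaining element, dropping null results.
slice_result = [p["first"] for p in slice_lhs if p.get("first") is not None]
print(slice_result)  # ['James', 'Jacob']
```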
Object Projections
Whereas a list projection is defined for a JSON array, an object projection is defined for a JSON object. You can create an object projection using the * syntax. This will create a list of the values of the JSON object, and project the right hand side of the projection onto the list of values.
Given an object like:
{
"ops": {
"functionA": {"numArgs": 2},
"functionB": {"numArgs": 3},
"functionC": {"variadic": true}
}
}
then ops.*.numArgs will yield:
[ 2, 3 ]
In the example above the * creates a JSON array of the values associated with the ops JSON object. The RHS of the projection, numArgs, is then applied to the JSON array, resulting in the final array of [2, 3].
Below is a sample walkthrough of how an implementation might evaluate an object projection. First, the object projection can be broken down into its two components, the left hand side (LHS) and the right hand side (RHS):
- LHS: ops
- RHS: numArgs
First, the LHS is evaluated to create the initial array to be projected:
evaluate(ops, inputData) -> [{"numArgs": 2}, {"numArgs": 3}, {"variadic": true}]
Then the RHS is applied to each element in the array:
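evaluate(numArgs, {"numArgs": 2}) -> 2
evaluate(numArgs, {"numArgs": 3}) -> 3
evaluate(numArgs, {"variadic": true}) -> null
Null values are omitted from the collected results, so the entire expression evaluates to [2, 3].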
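The walkthrough above can be mirrored in plain Python. This is a sketch of the semantics rather than how any particular JMESPath implementation works; project_object is a hypothetical helper name.

```python
# Plain-Python sketch of object-projection semantics (not the jmespath library).
def project_object(value, rhs):
    """Project rhs onto the values of a dict, dropping None (JSON null) results."""
    if not isinstance(value, dict):
        return None  # an object projection is only defined for JSON objects
    lhs = list(value.values())  # step 1: the LHS becomes a list of the object's values
    return [item for item in (rhs(v) for v in lhs) if item is not None]  # step 2


ops_document = {
    "ops": {
        "functionA": {"numArgs": 2},
        "functionB": {"numArgs": 3},
        "functionC": {"variadic": True},
    }
}

# Roughly ops.*.numArgs: functionC has no numArgs key, so its result is dropped.
num_args = project_object(ops_document["ops"], lambda v: v.get("numArgs"))
print(num_args)  # [2, 3]
```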
Flatten Projections
More than one projection can be used in a JMESPath expression. In the case of a List/Object projection, the structure of the original document is preserved when creating a projection within a projection. For example, let’s take the expression reservations[*].instances[*].state. This expression is saying that the top level key reservations has an array as a value. For each of those array elements, project the instances[*].state expression. Within each list element, there’s an instances key whose value is itself an array, and we create a sub projection for each list element in that array. Here’s an example of that.
So, given the object:
{
"reservations": [
{
"instances": [
{"state": "running"},
{"state": "stopped"}
]
},
{
"instances": [
{"state": "terminated"},
{"state": "running"}
]
}
]
}
the expression reservations[*].instances[*].state will yield:
[
[
"running",
"stopped"
],
[
"terminated",
"running"
]
]
What if we just want a list of all the states of our instances? We’d ideally like a result ["running", "stopped", "terminated", "running"]. In this situation, we don’t care which reservation the instance belonged to, we just want a list of states.
This is the problem that a Flatten Projection solves. To get the desired result, you can use [] instead of [*] to flatten a list: reservations[].instances[].state. Try changing [*] to [] in the expression above and see how the result changes.
While the spec goes into more detail, a simple rule of thumb to use for the flatten operator, [], is that:
- It flattens sublists into the parent list (not recursively, just one level).
- It creates a projection, so anything on the RHS of the flatten projection is projected onto the newly created flattened list.
Applying [] on
[ [0, 1], 2, [3], 4, [5, [6, 7]] ]
will yield:
[
0,
1,
2,
3,
4,
5,
[
6,
7
]
]
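The two rules of thumb for the flatten operator can be sketched in plain Python. This is an illustration of the semantics only, not the jmespath library; flatten_once is a hypothetical helper name.

```python
# Plain-Python sketch of the flatten operator [] (not the jmespath library).
def flatten_once(value):
    """Merge sublists into the parent list, one level deep only."""
    if not isinstance(value, list):
        return None  # flattening is only defined for JSON arrays
    flat = []
    for item in value:
        if isinstance(item, list):
            flat.extend(item)  # a sublist is merged into the parent
        else:
            flat.append(item)  # non-list items are kept as they are
    return flat


nested = [[0, 1], 2, [3], 4, [5, [6, 7]]]
print(flatten_once(nested))  # [0, 1, 2, 3, 4, 5, [6, 7]]

reservations = [
    {"instances": [{"state": "running"}, {"state": "stopped"}]},
    {"instances": [{"state": "terminated"}, {"state": "running"}]},
]

# Roughly reservations[].instances[].state
instances = flatten_once([r["instances"] for r in flatten_once(reservations)])
states = [instance["state"] for instance in instances]
print(states)  # ['running', 'stopped', 'terminated', 'running']
```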
Filter Projections
Evaluating the RHS of a projection is a basic type of filter. If the result of the expression evaluated against an individual element results in null, then the element is excluded from the final result.
A filter projection allows you to filter the LHS of the projection before evaluating the RHS of a projection.
For example, let’s say we have a list of machines, each with a name and a state. We’d like the names of all machines that are running. In pseudocode, this would be:
result = []
foreach machine in inputData['machines']
    if machine['state'] == 'running'
        result.insert_at_end(machine['name'])
return result
A filter projection can be used to accomplish this. If we apply
machines[?state=='running'].name
to
{
"machines": [
{"name": "a", "state": "running"},
{"name": "b", "state": "stopped"},
{"name": "c", "state": "running"}
]
}
then we obtain:
[ "a", "c" ]
A filter expression is defined for an array and has the general form LHS [? expression comparator expression] RHS. The filter expression spec details exactly which comparators are available and how they work, but the standard comparators are supported, i.e., ==, !=, <, <=, >, >=.
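For comparison, the machines example can be written as a plain-Python list comprehension. This only mimics the filter projection; it is not the jmespath library, and the variable names are hypothetical.

```python
machines_document = {
    "machines": [
        {"name": "a", "state": "running"},
        {"name": "b", "state": "stopped"},
        {"name": "c", "state": "running"},
    ]
}

# Roughly machines[?state=='running'].name:
# filter the LHS first, then project "name" onto what is left.
running = [m for m in machines_document["machines"] if m.get("state") == "running"]
running_names = [m["name"] for m in running if m.get("name") is not None]
print(running_names)  # ['a', 'c']
```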
Pipe Expressions
Projections are an important concept in JMESPath. However, there are times when projection semantics are not what you want. A common scenario is when you want to operate on the result of a projection rather than projecting an expression onto each element in the array. For example, the expression people[*].first will give you an array containing the first names of everyone in the people array. What if you wanted the first element in that list? If you tried people[*].first[0], you would just evaluate first[0] for each element in the people array, and because indexing is not defined for strings, the final result would be an empty array, []. To accomplish the desired result, you can use a pipe expression, expression | expression, to indicate that a projection must stop. This is shown in the example below.
We shall run:
people[*].first | [0]
on
{
"people": [
{"first": "James", "last": "d"},
{"first": "Jacob", "last": "e"},
{"first": "Jayden", "last": "f"},
{"missing": "different"}
],
"foo": {"bar": "baz"}
}
to obtain:
"James"
In the example above, the RHS of the list projection is first. When a pipe is encountered, the result up to that point is passed to the RHS of the pipe expression. The pipe expression is evaluated as:
evaluate(people[*].first, inputData) -> ["James", "Jacob", "Jayden"]
evaluate([0], ["James", "Jacob", "Jayden"]) -> "James"
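The difference between people[*].first[0] and people[*].first | [0] can be illustrated in plain Python. This is a sketch of the semantics, not the jmespath library; index_or_none is a hypothetical helper that treats indexing as defined only for lists.

```python
def index_or_none(value, position):
    """Index into a list; anything that is not a list (or is too short) gives None."""
    if isinstance(value, list) and -len(value) <= position < len(value):
        return value[position]
    return None


pipe_people = [
    {"first": "James", "last": "d"},
    {"first": "Jacob", "last": "e"},
    {"first": "Jayden", "last": "f"},
    {"missing": "different"},
]

# people[*].first[0]: first[0] is evaluated per element; indexing a string gives
# None, and None results are dropped, so nothing is left.
without_pipe = [
    item
    for item in (index_or_none(p.get("first"), 0) for p in pipe_people)
    if item is not None
]
print(without_pipe)  # []

# people[*].first | [0]: the projection is finished first, then [0] is applied
# to the whole resulting list.
first_values = [p["first"] for p in pipe_people if p.get("first") is not None]
with_pipe = index_or_none(first_values, 0)
print(with_pipe)  # James
```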
MultiSelect
Up to this point, we’ve looked at JMESPath expressions that help to pare down a JSON document into just the elements you’re interested in. The next concept, multiselect lists and multiselect hashes, allows you to create JSON elements that don’t exist in the original JSON document. A multiselect list creates a list and a multiselect hash creates a JSON object.
MultiSelect List (Array)
This is an example of a multiselect list. Applying:
people[].[name, state.name]
on
{
"people": [
{
"name": "a",
"state": {"name": "up"}
},
{
"name": "b",
"state": {"name": "down"}
},
{
"name": "c",
"state": {"name": "up"}
}
]
}
yields result:
[
[
"a",
"up"
],
[
"b",
"down"
],
[
"c",
"up"
]
]
In the expression above, the [name, state.name] portion is a multiselect list. It says to create a list of two elements: the first element is the result of evaluating the name expression against the list element, and the second element is the result of evaluating state.name. Each list element will therefore create a two element list, and the final result of the entire expression is a list of two element lists.
Unlike a projection, the result of the expression is always included, even if the result is a null. If you change the above expression to people[].[foo, bar] each two element list will be [null, null].
MultiSelect Hash/Object
A multiselect hash has the same basic idea as a multiselect list, except that it creates a hash/object instead of an array. Using the same example above, if we instead wanted to create a two element hash with the keys Name and State, we could use this:
people[].{Name: name, State: state.name}
and get:
[
{
"Name": "a",
"State": "up"
},
{
"Name": "b",
"State": "down"
},
{
"Name": "c",
"State": "up"
}
]
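Both multiselect forms can be approximated with plain-Python comprehensions. This is an illustrative sketch only, not the jmespath library; the variable names are hypothetical, and None stands in for JSON null.

```python
multiselect_people = [
    {"name": "a", "state": {"name": "up"}},
    {"name": "b", "state": {"name": "down"}},
    {"name": "c", "state": {"name": "up"}},
]

# Roughly people[].[name, state.name]: build a new two element list per person.
as_lists = [[p.get("name"), p.get("state", {}).get("name")] for p in multiselect_people]
print(as_lists)  # [['a', 'up'], ['b', 'down'], ['c', 'up']]

# Roughly people[].{Name: name, State: state.name}: build a new object per person.
as_hashes = [
    {"Name": p.get("name"), "State": p.get("state", {}).get("name")}
    for p in multiselect_people
]
print(as_hashes[0])  # {'Name': 'a', 'State': 'up'}

# Roughly people[].[foo, bar]: null values are kept inside a multiselect.
all_nulls = [[p.get("foo"), p.get("bar")] for p in multiselect_people]
print(all_nulls)  # [[None, None], [None, None], [None, None]]
```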
JMESPath Examples
Filters and Multiselect Lists
One of the most common usage scenarios for JMESPath is taking a complex JSON document and simplifying it. The main features at work here are filters and multiselects. In the example below, we’re taking the array of people and, for any element with an age key whose value is greater than 20, we’re creating a sub list of the name and age values.
people[?age > `20`].[name, age]
Filters and Multiselect Hashes
In the previous example we took an array of hashes and simplified it down to an array of two element arrays containing a name and an age, including only list elements where the age key is greater than 20. If instead we want to keep the hash structure but only include the age and name keys, we can say:
people[?age > `20`].{name: name, age: age}
The last half of the above expression contains key value pairs which have the general form keyname: expression. In the above expression we’re just using a field as an expression, but they can be more advanced expressions. For example:
people[*].{name: name, tags: tags[0]}
Notice in the above example instead of applying a filter expression ([? expr ]), we’re selecting all array elements via [*].
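The three expressions in this part of the examples can be approximated in plain Python. The input data below is hypothetical, since the original example document is not reproduced here, and the helpers is_over and first_item are made-up names. This is a sketch of the semantics, not the jmespath library.

```python
# Hypothetical input data, chosen only to exercise the expressions.
sample_people = [
    {"name": "a", "age": 25, "tags": ["x", "y"]},
    {"name": "b", "age": 18, "tags": ["z"]},
    {"name": "c", "age": 30},
    {"name": "d"},
]


def is_over(person, limit):
    """Mimic `age > limit`: a missing or non-numeric age never matches."""
    age = person.get("age")
    return isinstance(age, (int, float)) and age > limit


def first_item(value):
    """Mimic tags[0]: None unless value is a non-empty list."""
    return value[0] if isinstance(value, list) and value else None


# Roughly people[?age > `20`].[name, age]
over_20_lists = [[p.get("name"), p.get("age")] for p in sample_people if is_over(p, 20)]
print(over_20_lists)  # [['a', 25], ['c', 30]]

# Roughly people[?age > `20`].{name: name, age: age}
over_20_hashes = [
    {"name": p.get("name"), "age": p.get("age")} for p in sample_people if is_over(p, 20)
]
print(over_20_hashes)  # [{'name': 'a', 'age': 25}, {'name': 'c', 'age': 30}]

# Roughly people[*].{name: name, tags: tags[0]}: no filter, every element is kept.
name_and_tag = [
    {"name": p.get("name"), "tags": first_item(p.get("tags"))} for p in sample_people
]
print(name_and_tag[0])  # {'name': 'a', 'tags': 'x'}
```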
Working with Nested Data
reservations[].instances[].[tags[?Key=='Name'].Values[] | [0], type, state.name]
The above example combines several JMESPath features including the flatten operator, multiselect lists, filters, and pipes.
The input data contains a top level key, “reservations”, which is a list. Within each list element, there is an “instances” key, which is also a list.
The first thing we’re doing here is creating a single list from multiple lists of instances. By using the Flatten Operator we can take the two instances from the first list and the two instances from the second list, and combine them into a single list. Try changing the above expression to just reservations[].instances[] to see what this flattened list looks like. Everything to the right of the reservations[].instances[] is about taking the flattened list and paring it down to contain only the data that we want. This expression is taking each element in the original list and transforming it into a three element sublist. The three elements are:
- In the tags list, select the first element in the flattened Values list whose Key has a value of Name.
- The type.
- The state.name of each instance.
The most interesting of those three expressions is the tags[?Key=='Name'].Values[] | [0] part. Let’s examine that further.
The first thing to notice is that we’re filtering down the list associated with the tags key. The expression tags[?Key=='Name'] tells us to only include list elements that contain a Key whose value is Name. From those filtered list elements we’re going to take the Values key and flatten the list. Finally, the | [0] part will take the entire list and extract the 0th element.
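The nested expression can be traced in plain Python. The input below is hypothetical (the original example data is not shown here), name_tag is a made-up helper, and the code only approximates the semantics; it is not the jmespath library.

```python
# Hypothetical input data shaped like the structure described above.
reservation_document = {
    "reservations": [
        {"instances": [
            {"type": "small", "state": {"name": "running"},
             "tags": [{"Key": "Name", "Values": ["Web"]},
                      {"Key": "version", "Values": ["1"]}]},
            {"type": "large", "state": {"name": "stopped"},
             "tags": [{"Key": "Name", "Values": ["Worker"]}]},
        ]},
        {"instances": [
            {"type": "medium", "state": {"name": "terminated"},
             "tags": [{"Key": "Name", "Values": ["DB"]}]},
        ]},
    ]
}


def name_tag(instance):
    """Roughly tags[?Key=='Name'].Values[] | [0]."""
    matching = [t for t in instance.get("tags", []) if t.get("Key") == "Name"]
    flattened = []
    for tag in matching:
        values = tag.get("Values")
        if isinstance(values, list):
            flattened.extend(values)  # [] merges each Values list into one list
        elif values is not None:
            flattened.append(values)
    return flattened[0] if flattened else None  # | [0] picks the first element


# reservations[].instances[] merges every instances list into one flat list.
flat_instances = [
    instance
    for reservation in reservation_document["reservations"]
    for instance in reservation.get("instances", [])
]

# The multiselect list builds one three element row per instance.
rows = [[name_tag(i), i.get("type"), i.get("state", {}).get("name")] for i in flat_instances]
print(rows)
# [['Web', 'small', 'running'], ['Worker', 'large', 'stopped'], ['DB', 'medium', 'terminated']]
```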
Filtering and Selecting Nested Data
In this example, we’re going to look at how you can filter nested hashes.
people[?general.id==`100`].general | [0]
In this example we’re searching through the people array. Each element in this array contains a hash of two elements, and each value in the hash is itself a hash. We’re trying to retrieve the value of the general key that contains an id key with a value of 100.
If we just had the expression people[?general.id==`100`], we’d have the result:
[{
"general": {
"id": 100,
"age": 20,
"other": "foo",
"name": "Bob"
},
"history": {
"first_login": "2014-01-01",
"last_login": "2014-01-02"
}
}]
Let’s walk through how we arrived at this result. In words, the people[?general.id==`100`] expression is saying: for each element in the people array, select the elements where general.id equals 100. If we trace the execution of this filtering process we have:
# First element:
{
"general": {
"id": 100,
"age": 20,
"other": "foo",
"name": "Bob"
},
"history": {
"first_login": "2014-01-01",
"last_login": "2014-01-02"
}
}
# Applying the expression general.id to this hash:
100
# Does 100==100?
true
# Add this first element (in its entirety) to the result list.
# Second element:
{
"general": {
"id": 101,
"age": 30,
"other": "bar",
"name": "Bill"
},
"history": {
"first_login": "2014-05-01",
"last_login": "2014-05-02"
}
}
# Applying the expression general.id to this element:
101
# Does 101==100?
false
# Do not add this element to the results list.
However, this still isn’t the final value we want, which is:
{
"id": 100,
"age": 20,
"other": "foo",
"name": "Bob"
}
In order to get to this value from our filtered results we need to first select the general key. This gives us a list of just the values of the general hash:
[{
"id": 100,
"age": 20,
"other": "foo",
"name": "Bob"
}]
From there, we use a pipe (|) to stop projections so that we can finally select the first element ([0]). Note that we are making the assumption that there’s only one hash that contains an id of 100.
Finally, it’s worth mentioning there is more than one way to write this expression. In this example we’ve decided that after we filter the list we’re going to select the value of the general key and then select the first element in that list. We could also reverse the order of those operations: take the filtered list, select the first element, and then extract the value associated with the general key. That expression would be:
people[?general.id==`100`] | [0].general
Both versions are equally valid.
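The two orderings can be checked against each other in plain Python, using the two elements from the trace above. This is a sketch of the semantics, not the jmespath library; the variable names are hypothetical.

```python
nested_people = [
    {
        "general": {"id": 100, "age": 20, "other": "foo", "name": "Bob"},
        "history": {"first_login": "2014-01-01", "last_login": "2014-01-02"},
    },
    {
        "general": {"id": 101, "age": 30, "other": "bar", "name": "Bill"},
        "history": {"first_login": "2014-05-01", "last_login": "2014-05-02"},
    },
]

# people[?general.id==`100`]
filtered = [p for p in nested_people if p.get("general", {}).get("id") == 100]

# ... .general | [0]: project general onto the filtered list, then take element 0.
general_values = [p["general"] for p in filtered if p.get("general") is not None]
project_then_index = general_values[0] if general_values else None

# ... | [0].general: take element 0 of the filtered list, then select general.
index_then_select = filtered[0].get("general") if filtered else None

print(project_then_index == index_then_select)  # True
print(project_then_index["name"])  # Bob
```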
Using Functions
JMESPath functions give you a lot of power and flexibility when working with JMESPath expressions. Below are some common expressions and functions used in JMESPath.
sort_by
sort_by(Contents, &Date)[*].{Key: Key, Size: Size}
The first interesting thing here is the use of the function sort_by. In this example we are sorting the Contents array by the value of the Date key in each element. The sort_by function takes two arguments. The first argument is an array, and the second argument describes the key that should be used to sort the array.
The second interesting thing in this expression is that the second argument starts with &, which creates an expression type. Think of this conceptually as a reference to an expression that can be evaluated later. If you are familiar with lambdas and anonymous functions, expression types are similar. The reason we use &Date instead of Date is that if the expression were Date, it would be evaluated before calling the function, and given there’s no Date key in the outer hash, the second argument would evaluate to null. Check out Function Evaluation in the specification for more information on how functions are evaluated in JMESPath. Also, note that we’re taking advantage of the fact that the dates are in ISO 8601 format, which can be sorted lexicographically.
And finally, the last interesting thing in this expression is the [*] immediately after the sort_by function call. The reason for this is that we want to apply the multiselect hash, the second half of the expression, to each element in the sorted array. In order to do this we need a projection. The [*] does exactly that, it takes the input array and creates a projection such that the multiselect hash {Key: Key, Size: Size} will be applied to each element in the list.
There are other functions that take expression types that are similar to sort_by including min_by and max_by.
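A loose Python analogy may help here: the key function passed to sorted, min, and max plays a role similar to an expression type such as &Date, because it is handed over unevaluated and called once per element. The data below is hypothetical, and this is a sketch rather than the jmespath library.

```python
# Hypothetical input data with ISO 8601 dates, which sort correctly as strings.
bucket = {
    "Contents": [
        {"Key": "b.txt", "Size": 20, "Date": "2014-12-21T05:18:08.000Z"},
        {"Key": "a.txt", "Size": 10, "Date": "2014-12-20T05:19:10.000Z"},
        {"Key": "c.txt", "Size": 30, "Date": "2014-12-19T05:19:12.000Z"},
    ]
}

# Roughly sort_by(Contents, &Date): the lambda is evaluated per element, later.
by_date = sorted(bucket["Contents"], key=lambda item: item["Date"])

# Roughly [*].{Key: Key, Size: Size}: build a new object for each sorted element.
summary = [{"Key": item["Key"], "Size": item["Size"]} for item in by_date]
print(summary)
# [{'Key': 'c.txt', 'Size': 30}, {'Key': 'a.txt', 'Size': 10}, {'Key': 'b.txt', 'Size': 20}]

# Roughly min_by(Contents, &Size) and max_by(Contents, &Size).
smallest = min(bucket["Contents"], key=lambda item: item["Size"])
largest = max(bucket["Contents"], key=lambda item: item["Size"])
print(smallest["Key"], largest["Key"])  # a.txt c.txt
```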
Examples using Pipes
Pipe expressions are useful for stopping projections. They can also be used to group expressions.
Let’s look at a modified version of the expression on the JMESPath front page.
locations[?state == 'WA'].name | sort(@)[-2:] | {WashingtonCities: join(', ', @)}
When applied to
{
"locations": [
{"name": "Seattle", "state": "WA"},
{"name": "New York", "state": "NY"},
{"name": "Bellevue", "state": "WA"},
{"name": "Olympia", "state": "WA"}
]
}
we get:
{
"WashingtonCities": "Olympia, Seattle"
}
We can think of this JMESPath expression as having three components, each separated by the pipe character (|). The first expression is familiar to us: it is a filter projection like the ones shown earlier. The second part of the expression, sort(@), is similar to the sort_by function we saw in the previous section. The @ token is used to refer to the current element. The sort function takes a single parameter, which is an array. If the input JSON document was a hash, and we wanted to sort the foo key, which was an array, we could just use sort(foo). In this scenario, the input JSON document is the array we want to sort. To refer to this value, we use the current element, @. We’re also only taking a subset of the sorted array. We’re using a slice ([-2:]) to indicate that we only want the last two elements in the sorted array to be passed through to the final third of this expression.
And finally, the third part of the expression, {WashingtonCities: join(', ', @)}, creates a multiselect hash. It takes the list of sorted city names as input and produces a hash with a single key, WashingtonCities, whose value is the input list (denoted by @) joined into a single string, with elements separated by a comma and a space.
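The three stages of this pipeline map onto three plain-Python steps. This is an illustration of the semantics only, not the jmespath library; the variable names are hypothetical.

```python
locations_document = {
    "locations": [
        {"name": "Seattle", "state": "WA"},
        {"name": "New York", "state": "NY"},
        {"name": "Bellevue", "state": "WA"},
        {"name": "Olympia", "state": "WA"},
    ]
}

# Stage 1, roughly locations[?state == 'WA'].name
wa_names = [
    loc["name"] for loc in locations_document["locations"] if loc.get("state") == "WA"
]

# Stage 2, roughly sort(@)[-2:]: sort the whole list, then keep the last two.
last_two = sorted(wa_names)[-2:]

# Stage 3, roughly {WashingtonCities: join(', ', @)}
washington = {"WashingtonCities": ", ".join(last_two)}
print(washington)  # {'WashingtonCities': 'Olympia, Seattle'}
```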