crucial jq insight which unlocked the tool for me: it's jsonl, not json.<p>it's a pipeline operating on a <i>stream</i> of independent json terms. The filter is reapplied to every element from the stream. Streams != lists; the latter are just a data type. `.` always points at the current element of the stream. Functions like `select` operate on separate items of the stream, while `map` operates on individual elements of a list. If you want a `map` over all elements of the stream: that's just what jq is, naturally :)<p>stream of a single element which is a list:<p><pre><code> echo '[1,2,3,4]' | jq .
# [1,2,3,4]
</code></pre>
unpack the list into a stream of separate elements:<p><pre><code> echo '[1,2,3,4]' | jq '.[]'
# 1
# 2
# 3
# 4
echo '[1,2,3,4]' | jq '.[] | .' # same: piping into `.` is a no-op
</code></pre>
only keep elements 2 and 4 from the <i>stream</i>, not from the array--there is no array left after .[] :<p><pre><code> echo '[1,2,3,4]' | jq '.[] | select(. % 2 == 0)'
# 2
# 4
</code></pre>
keep the array:<p><pre><code> echo '[1,2,3,4]' | jq 'map(. * 2)'
# [2,4,6,8]
</code></pre>
map over individual elements of a stream instead:<p><pre><code> echo '[1,2,3,4]' | jq '.[] | . * 2'
# 2
# 4
# 6
# 8
printf '1\n2\n3\n4\n' | jq '. * 2' # same
</code></pre>
This is how you can do things like<p><pre><code> printf '{"a":{"b":1}}\n{"a":{"b":2}}\n{"a":{"b":3}}\n' | jq 'select(.a.b % 2 == 0) | .a'
# {"b": 2}
</code></pre>
select creates a nested "scope" for the current element in its parens, but restores the outer scope when it exits.<p>Hope this helps someone else!
Doesn't the command-line utility `jq` already define a protocol for this? How do the syntaxes compare?<p>(LLMs are already very adept at using `jq` so I would think it was preferable to be able to prompt a system that implements querying inside of source code as "this command uses the same format as `jq`")
For convenience: <a href="https://en.wikipedia.org/wiki/Jq_(programming_language)" rel="nofollow">https://en.wikipedia.org/wiki/Jq_(programming_language)</a>
Mongo also has a good query language and a mongo DB can be seen as an array of documents
You just have to wrap your mind around jq. It's a) functional, b) has pervasive generators and backtracking. So when you write `.a[].b`, which is a lot like `(.a | .[] | .b)`, what you get is three generators strung together in an `and_then` fashion: `.a`, then `.[]`, and then `.b`. Here `.a` generates exactly one value for its input, as does `.b`, but `.[]` generates as many values as are in the value produced by `.a`. So `.b` won't run at all if `.a[]` produces no values, and `.b` will run once for _each_ value of `.a[]`. Once you begin to see the generators and the backtracking, everything begins to make sense.
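A rough way to picture this in JavaScript, assuming each jq filter is modeled as a generator function that takes one input and yields zero or more outputs. The names `field`, `iterate`, and `pipe` are made up for this sketch and are not part of jq, and real jq has richer semantics (for example, `.[]` also iterates over object values).

```javascript
// Each "filter" is a generator function: one input value in, zero or more values out.
// field, iterate and pipe are hypothetical helpers for illustration, not jq APIs.

// Models `.name`: yields exactly one value per input.
const field = (name) => function* (input) {
  yield input[name];
};

// Models `.[]` for arrays only: yields one value per element.
const iterate = function* (input) {
  yield* input;
};

// Models `f | g | ...`: every output of one stage is fed into the remaining stages.
const pipe = (...filters) => function* (input) {
  if (filters.length === 0) {
    yield input;
    return;
  }
  const [first, ...rest] = filters;
  for (const value of first(input)) {
    yield* pipe(...rest)(value);
  }
};

// Models `.a[].b`, i.e. `.a | .[] | .b`.
const query = pipe(field("a"), iterate, field("b"));

const results = [...query({ a: [{ b: 1 }, { b: 2 }, { b: 3 }] })];
const empty = [...query({ a: [] })];

console.log(results); // one output per element of .a
console.log(empty);   // the .b stage never runs when .a[] yields nothing
```

The `for ... of` loop inside `pipe` is where the "backtracking" shows up in this model: after the later stages finish with one value, control returns to the earlier generator to ask for the next one.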
DuckDB can read JSON - you can query JSON with normal SQL.[1]
I prefer the Malloy data language for querying, as it is 10x simpler than SQL.[2]<p>[1] - <a href="https://duckdb.org/docs/stable/data/json/overview" rel="nofollow">https://duckdb.org/docs/stable/data/json/overview</a>
[2] - <a href="https://www.malloydata.dev/" rel="nofollow">https://www.malloydata.dev/</a>
So can postgres, I tend to just use PG, since I have instances running basically everywhere, even locally, but duckdb works well too.
I read the man page of `jq` and learned how to use it. It's quite well-written and contains a good introduction.<p>I've observed that too many users of jq aren't willing to take a few minutes to understand how stream programming works. That investment pays off in spades.
Plugging a previous personal project for learning jq interactively: <a href="https://jqjake.com/" rel="nofollow">https://jqjake.com/</a>
I'm a big fan of jq but won't credit its man page with much. There were (ineffable) insights that I picked up through my own usage over time, that I couldn't glean from reading the man page alone. In other words, it's not doing its best to put the correct mental model out for a newish user.
Also, LLMs are good at spitting out filters, and you can learn what a filter does by looking up each piece in the docs afterwards. They often apply things in far more interesting and complex ways than the docs at jqlang.org do, which are often far too “foo bar baz” tier to truly explain the power of things.
I'd like to know how it compares to <a href="https://jsonata.org" rel="nofollow">https://jsonata.org</a>
JSONata looks to be more general purpose with its support for variables/statements, and custom functions. I'd probably still stick with JSONata
We wrote an article on this: "JQ vs. JSONata: Language and Tooling Compared".
<a href="https://dashjoin.medium.com/jq-vs-jsonata-language-and-tooling-compared-5f0f7acc778e" rel="nofollow">https://dashjoin.medium.com/jq-vs-jsonata-language-and-tooli...</a>
Can't you just visit both pages, build an understanding and compare them?
Maybe the author would be in a better place to do that, having the expertise already. Also, as a user I'm quite happy with jq already, so why expend the effort?
<a href="https://jsonpath.com/" rel="nofollow">https://jsonpath.com/</a> or <a href="https://jsonata.org/" rel="nofollow">https://jsonata.org/</a>
In the k8s world there's a random collection of json path, json query, some random expression language.<p>Just use jq. None of the other ones are as flexible or widespread and you just end up with frustrated users.
I have a similar use case in the app I'm working on. Initially I went with JSONata, which worked, but resulted in queries that indeed felt more like incantations and were difficult even for me to understand (let alone my users).<p>I then switched to JavaScript / TypeScript, which I found much better overall: it's understandable to basically every developer, and LLMs are very good at it. So now in my app I have a button wherever a TypeScript snippet is required that asks the LLM for its implementation, and even "weak" models one-shot it correctly 99% of the time.<p>It's definitely more difficult to set up, though, as it requires a sandbox where you can run the code without fear. In my app I use QuickJS, which works very well for my use case, but might not be performant enough in other contexts.
"JSON Query" is kind of a long name. You should find a way to shorten it. Maybe "jQuery" or something along those lines :P
I can't help myself and surely someone else has already done the same. But the query<p><pre><code> obj.friends.filter(x=>{ return x.city=='New York'})
.sort((a, b) => a.age - b.age)
.map(item => ({ name: item.name, age: item.age }));
</code></pre>
does exactly the same without any plugin.<p>am I missing something?
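For anyone who wants to run it, here is a self-contained version. The `obj` data below is invented for illustration; only the filter/sort/map chain comes from the snippet above.

```javascript
// Sample data is made up; the chain is the same filter/sort/map as above.
const obj = {
  friends: [
    { name: "Chris", age: 23, city: "New York" },
    { name: "Emily", age: 19, city: "Atlanta" },
    { name: "Joe", age: 32, city: "New York" },
    { name: "Sarah", age: 31, city: "New York" },
  ],
};

const result = obj.friends
  .filter((friend) => friend.city === "New York") // keep only New York friends
  .sort((a, b) => a.age - b.age)                  // youngest first
  .map(({ name, age }) => ({ name, age }));       // keep only name and age

console.log(result);
```

Because `filter` returns a new array, the in-place `sort` does not reorder `obj.friends` itself.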
A lot of people focus on the similarity to jq et al. I guess the author had his reasons for crafting his own.<p>Kudos for all the work, it's a nice language. I find writing parsers a very mind-expanding activity.
Most alternatives being talked about work on query strings (like `$.phoneNumbers[:1].type`), which is fine, but a string cannot easily be modeled or modified by code.<p>Things like <a href="https://jsonlogic.com/" rel="nofollow">https://jsonlogic.com/</a> work better if you wish to expose a REST API with a defined query schema instead of accepting a query `string`. This seems better, in that you have both a string format and a concrete JSON format, plus APIs to convert between them.<p>Also, if you are building a filter interface, having a structured representation helps:<p><a href="https://react-querybuilder.js.org/demo?outputMode=export&exportFormat=jsonlogic" rel="nofollow">https://react-querybuilder.js.org/demo?outputMode=export&exp...</a>
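To make the structured-representation point concrete, here is a minimal sketch of evaluating a JSON-shaped rule. The rule shape imitates JSON Logic's `{"operator": [args]}` and `{"var": "name"}` style, but `evaluate` is a toy written for this example that handles only `var`, `==`, `>` and `and`. It is not the json-logic-js library and does not claim to match its semantics (for instance, it uses strict equality and always returns a boolean from `and`).

```javascript
// Toy evaluator for JSON-shaped rules. Only var, ==, > and and are supported.
function evaluate(rule, data) {
  // Literals (strings, numbers, booleans, null, arrays) evaluate to themselves.
  if (rule === null || typeof rule !== "object" || Array.isArray(rule)) {
    return rule;
  }
  const [operator] = Object.keys(rule);
  const rawArgs = rule[operator];
  if (operator === "var") {
    return data[rawArgs]; // flat property lookup only, no dotted paths
  }
  const args = rawArgs.map((arg) => evaluate(arg, data));
  switch (operator) {
    case "==":
      return args[0] === args[1]; // strict equality, a simplification
    case ">":
      return args[0] > args[1];
    case "and":
      return args.every(Boolean);
    default:
      throw new Error(`Unsupported operator: ${operator}`);
  }
}

// Because the rule is plain data, code can build or modify it without string parsing.
const rule = {
  and: [
    { "==": [{ var: "city" }, "New York"] },
    { ">": [{ var: "age" }, 21] },
  ],
};

// Made-up sample records.
const friends = [
  { name: "Chris", age: 23, city: "New York" },
  { name: "Emily", age: 19, city: "Atlanta" },
  { name: "Joe", age: 32, city: "New York" },
];

const matches = friends.filter((friend) => evaluate(rule, friend));

// Tighten the age bound programmatically, e.g. from a filter UI.
rule.and[1] = { ">": [{ var: "age" }, 30] };
const older = friends.filter((friend) => evaluate(rule, friend));

console.log(matches.map((friend) => friend.name));
console.log(older.map((friend) => friend.name));
```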
.friends
| filter(.city == "New York")
| sort(.age)
| pick(.name, .age)<p>mapValues(mapKeys(substring(get(), 0, 10)))<p>This is all too cute. Why not just use JavaScript syntax? You can limit it to the exact amount of functionality you want for whatever reason it is you want to limit it.
I've been working on an ultra-token-efficient LLM-friendly query language.
<a href="https://memelang.net/09/" rel="nofollow">https://memelang.net/09/</a>
Cool idea! Although without looking closer I can't tell if "meme" is in reference to the technical or the colloquial meaning of meme.<p>Admittedly I don't know that much about LLM optimization/configuration, so apologies if I'm asking dumb questions. Isn't needing to copy/paste that prompt in front of your queries a huge drag on net token efficiency? Wouldn't you need to do hundreds or thousands of query translations just to break even? Maybe I don't understand what you've built.<p>Cool idea either way!
If you prefer JSONPath as a query language, oj from <a href="https://github.com/ohler55/ojg" rel="nofollow">https://github.com/ohler55/ojg</a> provides that functionality. It can also be installed with brew. (disclaimer, I'm the author of OjG)
There are a ridiculous number of JSON query/path languages. Wish all the authors got together and harmonized on a standard.
There is a standard in RFC 9535 (JSONPath)[1]. But as far as I can tell, it isn't very widely used, and it has more limited functionality than some of the alternatives.<p>[1]: <a href="https://datatracker.ietf.org/doc/html/rfc9535" rel="nofollow">https://datatracker.ietf.org/doc/html/rfc9535</a>
Don't forget the also standardized way of referring to a single value in JSON, "JSON Pointer": <a href="https://datatracker.ietf.org/doc/html/rfc6901" rel="nofollow">https://datatracker.ietf.org/doc/html/rfc6901</a>
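For a sense of how small JSON Pointer is, here is a minimal resolver sketch. `resolvePointer` is a name chosen for this example, the sample document is made up, and the sketch skips parts of the RFC such as the `-` array token and the URI fragment form.

```javascript
// Minimal JSON Pointer (RFC 6901) resolver; a sketch, not a complete implementation.
function resolvePointer(document, pointer) {
  if (pointer === "") return document; // the empty pointer refers to the whole document
  if (!pointer.startsWith("/")) {
    throw new Error(`Invalid JSON Pointer: ${pointer}`);
  }
  const tokens = pointer
    .slice(1)
    .split("/")
    // Unescape ~1 before ~0, so that "~01" becomes "~1" and not "/".
    .map((token) => token.replace(/~1/g, "/").replace(/~0/g, "~"));

  let current = document;
  for (const token of tokens) {
    if (current === null || typeof current !== "object") {
      throw new Error(`Cannot descend into a non-container at "${token}"`);
    }
    // Array tokens must be plain indexes without leading zeros.
    if (Array.isArray(current) && !/^(0|[1-9]\d*)$/.test(token)) {
      throw new Error(`Invalid array index "${token}"`);
    }
    if (!Object.prototype.hasOwnProperty.call(current, token)) {
      throw new Error(`No value at "${token}"`);
    }
    current = current[token];
  }
  return current;
}

// Made-up sample document.
const doc = {
  phoneNumbers: [{ type: "home" }, { type: "work" }],
  "a/b": 1,
  "m~n": 2,
};

console.log(resolvePointer(doc, "/phoneNumbers/0/type"));
console.log(resolvePointer(doc, "/a~1b")); // "~1" stands for "/"
console.log(resolvePointer(doc, "/m~0n")); // "~0" stands for "~"
```

Unlike JSONPath or jq, a pointer always names at most one value, which is why the resolver is a plain loop with no wildcards or filters.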
The issue with JSONPath is that it took 17 years for it to become a properly fleshed-out standard. The original idea came from a 2007 blog post [0], which was then extended and implemented subtly differently dozens of times, with the result that almost every JSONPath implementation out there is incompatible with the others.<p>[0] <a href="https://goessner.net/articles/JsonPath/" rel="nofollow">https://goessner.net/articles/JsonPath/</a>
Postgresql supports jsonpath, right?
The AWS CLI supports JMESPath (<a href="https://jmespath.org" rel="nofollow">https://jmespath.org</a>) for the `--query` flag. I don't think I've run into anything else that uses it. Pretty similar to JSONPath IIRC.
Plus, I feel like most, if not all, higher-level languages already come with everything you need to do that easily. Well, except for Go, which requires you to write your own filter function.
The standard is called jq, any new standard is just going to be a committee circle jerk that doesn't move the ball forward in any meaningful way.
Xkcd.gif
<a href="https://www.jsoniq.org" rel="nofollow">https://www.jsoniq.org</a> -- jq is ubiquitous, but I still prefer the jsoniq syntax.
jq is amazing (not only for querying json), I recommend going through the docs, it's fairly small.<p>I implemented one day of advent of code in jq to learn it: <a href="https://github.com/ivanjermakov/adventofcode/blob/master/aoc2024/src/day10/day10b.jq" rel="nofollow">https://github.com/ivanjermakov/adventofcode/blob/master/aoc...</a>
What do your users know? If they’re quite familiar with SQL for querying, I would look at duckdb.
Nice work with a jq-esque feel. The website is cut off on mobile devices, though.
Maybe JS directly?
Nice. I work on something similar but for .net.
I hate jq as much as the next guy but it’s ubiquitous and great for this sort of thing. If you want a single path style query language I’d highly recommend JsonPath. It’s so much nicer than jq for “I need every student’s gpa”.
not to be confused with jq for querying json?