[FOLIO-405] implement CQL-to-Postgres/JSONB translator Created: 06/Dec/16 Updated: 12/Nov/18 Resolved: 19/Dec/16 |
|
| Status: | Closed |
| Project: | FOLIO |
| Components: | None |
| Affects versions: | None |
| Fix versions: | None |
| Type: | New Feature | Priority: | P2 |
| Reporter: | Jakub Skoczen | Assignee: | Julian Ladisch |
| Resolution: | Done | Votes: | 0 |
| Labels: | sprint4, sprint5 | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | 1 hour | ||
| Original estimate: | Not Specified | ||
| Issue links: |
|
||||||||||||||||||||||||
| Sprint: | |||||||||||||||||||||||||
| Description |
|
To be implemented as a stand-alone tool with an API that can be, eventually, consumed in the raml-module-builder and/or particular modules. Possibly a command-line interface if it simplifies testing/demoing. Unless there is a reason not to, the tool should live in it's own git repo. In short, given a JSON Schema for a particular object/document and the CQL-query, the tool generates a matching SELECT or WHERE clause that can be executed on the JSONB column in Postgres containing the objects. The tool should be structured in a way that the JSON Schema traversal (to look up JSON object's key path(s) and type(s)) and Postgres query translation is as independent as possible to open up possibilities for implementation of translators to other query languages (e.g SOLR). In the first iteration a general structure of the boolean query (AND/OR/NOT) should be translated, search indices should be mapped to equivalent (same name) JSON fields, and the '=' relation opertor should be mapped to a substring match (_LIKE '%s%'). We will also want to have the most basic suppor for specifying sort order using the sortBy keyword (which should be translated into ORDER BY): sortBy fieldName. Repo: https://github.com/julianladisch/cql2pgjson-java |
| Comments |
| Comment by Mike Taylor [ 07/Dec/16 ] |
|
I assume you'll base this on the existing CQL-Java? |
| Comment by Julian Ladisch [ 07/Dec/16 ] |
| Comment by frances.webb@cornell.edu [ 07/Dec/16 ] |
|
That's definitely the plan. I was thinking that if the long term goal is to have the CQL-to-JSONB conversion as an instance of a general CQL-to-Native conversion interface, that it might be a good idea to actually add the capability to CQL-Java itself, as this seems like a fairly common need for users of CQL in Java. Although that seems like a decision that those more experienced with CQL-Java and its history could better decide. |
| Comment by Jakub Skoczen [ 07/Dec/16 ] |
|
Let's make that decision later. I'd say right now let's build this as a seperate tool. The reason not to make it part of the CQL-java is forcing all CQL-java users to pull new dependencies, like JSON Schema parser. |
| Comment by Jakub Skoczen [ 07/Dec/16 ] |
|
So to clarify – the tool should be packaged as a jar using Maven to pull dependencies (like CQL-java) |
| Comment by frances.webb@cornell.edu [ 07/Dec/16 ] |
|
Yes, and long term an XML parser as well. Good point. |
| Comment by Mike Taylor [ 07/Dec/16 ] |
|
When I originally wrote this parser, the idea was that any back-end could be supported just by walking the tree produced by `parser.parse(String)`. That works fine – but the translation to CQL, XCQL and BER is done with methods added to the tree-node classes. Either way is fine. BTW., I just updated the CQL-Java README a little, and translated it into Markdown. |
| Comment by frances.webb@cornell.edu [ 07/Dec/16 ] |
|
It looks like the JSON data specs like this one: https://github.com/folio-org/mod-users/blob/master/ramls/schemas/userdata.json |
| Comment by Jakub Skoczen [ 07/Dec/16 ] |
|
Yes, but I would discourage adding any new translation methods to the parse tree classes. This is generally bad design – doing so in opposition to the Single Responsibility Principle. The toXCQL method is special as it does not need any configuration and is just an alternative representation of the parse tree. The toPQF should most likely be moved to a different module, but that's ancient history not worth spending time on Beside the parse tree classes that you can traverse "manually" (and do explicit casting for node types), I have added the CQLNodeVisitor interface that implements the visitor pattern. Unfortunately it is optimized for generating "prefix" notations (like the already mentioned PQF). For "infix" notation like the WHERE SQL clause you would need to keep your own "stack" to figure out where in the tree are you. Give it a shot, you could also implement another visitor traversal method directly in the parser, if it makes sense. |
| Comment by Mike Taylor [ 07/Dec/16 ] |
|
I'd agree with all of that. My inclination is to make a CQLNodeVisitor.inOrderTraverse() method and use that. And, yes, we might later on decide for hygiene reasons to move some of the existing back-end methods out of the node classes. |
| Comment by Jakub Skoczen [ 13/Dec/16 ] |
|
Added info about sort. |
| Comment by shale99 [ 15/Dec/16 ] |
|
guys, good to have a peek at this - really important to get this into the RMB - currently there is a simple RMB postgres query language (not mongo native as was for the mongo client) which is translated into postgres jsonb queries by the postgres client wrapper in the RMB - this will need to replace that |
| Comment by Julian Ladisch [ 19/Dec/16 ] |
|
The basic CQL-to-Postgres support is complete: https://github.com/julianladisch/cql2pgjson-java |