Handling array structures in PostgreSQL JSON objects
(FOLIO-921)
|
|
| Status: | Closed |
| Project: | FOLIO |
| Components: | None |
| Affects versions: | None |
| Fix versions: | None | Parent: | Handling array structures in PostgreSQL JSON objects |
| Type: | Sub-task | Priority: | P2 |
| Reporter: | Niels Erik Nielsen | Assignee: | Unassigned |
| Resolution: | Duplicate | Votes: | 0 |
| Labels: | platform-backlog | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original estimate: | Not Specified | ||
| Issue links: |
|
||||||||||||||||||||
| Sprint: | |||||||||||||||||||||
| Development Team: | Core: Platform | ||||||||||||||||||||
| Description |
|
The issue is how to perform searches for records that contain certain values in the elements of an array. They can be search along the lines of object identities for sure, like asking for records that have an element that is an object with certain properties containing certain values: For example find within array objects with a property "value": "xyz" select * from diku_mod_inventory_storage.instance where jsonb->'identifiers' @> '[ {"value": "9780552142352"}]'; Or with two properties - in this example one of the properties is a reference to the identifier type "ISBN": select * from diku_mod_inventory_storage.instance where jsonb->'identifiers' @> '[ {"value": "9780552142352", "typeId": "8261054f-be78-422d-bd51-4ed9f33c3422"}]'; But the question is, can we also somehow search with truncation? That would obviously be needed for searching for authors in the 'creators' array. And how will it perform? |
| Comments |
| Comment by Julian Ladisch [ 08/Dec/17 ] |
|
This shell script searches for all instances where ISBN equals the passed argument $1. SERVER=http://localhost:9130
TOKEN=$( curl -D - -H "X-Okapi-Tenant: diku" -H "Content-type: application/json" \
-d '{"tenant":"diku","username":"diku_admin","password":"admin"}' $SERVER/authn/login | grep -i "^x-okapi-token: " )
ISBNID=$( curl -H "$TOKEN" -H "X-Okapi-Tenant: diku" $SERVER/identifier-types?query=name==ISBN | sed -E -n 's/.*"id"\s*:\s*"([^"]+)".*/\1/p' )
ISBN=$1
CQL=$(echo 'identifiers=="*\"value\": \"ISBN\", \"identifierTypeId\": \"ISBNID\"*"' | sed "s/ISBN/$ISBN/; s/ISBNID/$ISBNID/" );
echo $CQL
CQL=$(echo "$CQL" | sed 's/ /%20/g; s/"/%22/g; s/\\/%5c/g') # urlencode
curl -H "$TOKEN" -H "X-Okapi-Tenant: diku" $SERVER/instance-storage/instances?query=$CQL
This is an example CQL query: identifiers=="*\"value\": \"9783782791120\", \"identifierTypeId\": \"8261054f-be78-422d-bd51-4ed9f33c3422\"*" value is always before identifierTypeId because shorter keys are stored before longer keys: https://www.postgresql.org/docs/current/static/datatype-json.html |
| Comment by Julian Ladisch [ 18/Jan/19 ] |
|
We may use a trigger that automatically extracts the values and writes them into read only fields: UPDATE instance SET jsonb=jsonb_set(jsonb_set( jsonb, '{identifiervalues}', ( SELECT COALESCE(jsonb_agg(value), '[]') FROM jsonb_to_recordset(jsonb->'identifiers') AS x(key text, value text) WHERE value IS NOT NULL )), '{identifiervaluesisbn}', ( SELECT COALESCE(jsonb_agg(value), '[]') FROM jsonb_to_recordset(jsonb->'identifiers') AS y(key text, value text) WHERE key='26978ebf-eceb-4537-b5a7-9482c2cce893' ) ); A record like this
{
"identifiers":
[{"key": "26978ebf-eceb-4537-b5a7-9482c2cce893", "value": "020163385X"},
{"key": "26978ebf-eceb-4537-b5a7-9482c2cce893", "value": "9780201633856"},
{"key": "1e276ff3-48d5-41c3-483a-58298be393a8", "value": "ocn294998755"}
]
}
becomes
{
"identifiers":
[{"key": "26978ebf-eceb-4537-b5a7-9482c2cce893", "value": "020163385X"},
{"key": "26978ebf-eceb-4537-b5a7-9482c2cce893", "value": "9780201633856"},
{"key": "1e276ff3-48d5-41c3-483a-58298be393a8", "value": "ocn294998755"}
],
"identifiervalues": ["020163385X", "9780201633856", "ocn294998755"],
"identifiervaluesisbn": ["020163385X", "9780201633856"]
}
These new fields can be indexed and used with full text search including right truncation. |
| Comment by Jakub Skoczen [ 03/Apr/19 ] |
|
Duplicate of
|