[FOLIO-677] Sorting on names (rather than IDs) of fields that are linked into main record Created: 15/Jun/17  Updated: 14/Feb/22

Status: Draft
Project: FOLIO
Components: None
Affects versions: None
Fix versions: None

Type: New Feature Priority: P3
Reporter: Mike Taylor Assignee: shale99
Resolution: Unresolved Votes: 0
Labels: core, for-next-sprint, performance, sprint17, team2
Remaining Estimate: Not Specified
Time Spent: 1 hour
Original estimate: Not Specified

Issue links:
Blocks
is blocked by FOLIO-704 Investigate a central compound object... Draft
Relates
relates to FOLIO-705 implement intra-module joins/views to... Closed
relates to RMB-454 sortby does not work with target tabl... Closed
relates to STCON-30 Support including foreign records via... Closed
Sprint: CP: Roadmap backlog
Development Team: Core: Platform

 Description   

Consider a compound record such as the Item record returned from mod-inventory. The underlying record in inventory-storage has a materialTypeID field; and the compound record adds a {{materialType]} sub-record corresponding to that ID by consulting the material-types data.

That works fine for retrieval: the client can pick out materialType.name to display.

And it works OK for filtering, too: having retrieved the IDs-to-names mapping, the UI uses this to offer filtering where the MT names are displayed, but the query that gets generated when the filter is selected uses the corresponding ID.

But it fails for sorting: there is no way to way to instruct the back-end to sort a list of records by the name of the material-type. materialTypeID exists because it's in the underlying record in mod-inventory, but materialTypeName does not; materialType.name exists in the retrieval record, but not the underlying one used for database operations.

This issue will apply to many "controlled-vocabulary" fields, not just material-type. For example, location and patron-group.



 Comments   
Comment by Jakub Skoczen [ 23/Jun/17 ]

I'd like this to be a generic feature of the BL modules but they probably need to understand CQL. Thoughts?

Comment by Marc Johnson [ 23/Jun/17 ]

Jakub Skoczen

This is likely similar in nature to the aggregation feature of the RAML module builder that is currently being worked on. I think it is likely that the business logic modules will need to understand CQL.

Once we separate out the included records from the main record in responses, if we want to be able to refer to properties in the included records, we will likely need a way of addressing them (and probably change the way we address the primary record in business logic interfaces) and for splitting the CQL up into multiple CQL queries for each dependent interface.

For filtering, we need a way of applying the results of the filter on included records to the primary records. For sorting this likely becomes more involved, as I imagine the PostgreSQL CQL implementation likely does a sort across all records and then finds the appropriate page, I don't think we'll be able to do that when sorting on a property of an associated record).

Comment by Mike Taylor [ 23/Jun/17 ]

This sounds right to me. But the tools are all there, is ought to be easy enough. You can parse the initial CQL string to get a tree of nodes, then pick out which subtrees need to be addressed to which modules, and then decompile those subtrees into their own CQL strings ready to be passed into the servant modules.

Comment by shale99 [ 27/Jun/17 ]

Mike Taylor - just want to make sure i am following, the problem is requesting a sort for items using a field that exists in the materialTypes entry (name) that does not exist in the items object?

Comment by Mike Taylor [ 27/Jun/17 ]

Yes, exactly. We want to sort Items by materialTypeName, but the Item record only contains materialTypeId.

Comment by shale99 [ 27/Jun/17 ]

will i get egged if i refer us back to the biz logic key (how about a biz_logic_key-uuid combo?), but seriously, i dont really know how to do this correctly.

if i have 4 material types in the system and 1 million items.
lets say
1 book - with a uuid 1111...
1 article - with a uuid 444...
1 dvd - with a uuid 555...
999k videos - with a uuid 222...

how can i sort those if i have uuids in no sepcific order in the items table - what query do i generate to first sort by 111, 444, 555 and then 222 on the items? i cant first query the items and then sort those as those records may only contain videos for example...

there may be an answer that is obvious and i am missing (note i have not eaten in about 24 hours so its affecting the brain cells)

Comment by Mike Taylor [ 27/Jun/17 ]

I don't know what a "biz logic key" is.

I also don't know how to implement this in the general case. For the specific case of material-types, when there will be only a few of them it's easy. Read all the MTs from the database table, put them into an ID-to-name map id2name, then sort the main objects (items) by id2name[item.materialTypeId]. But obviously, if there are a million material-types that doesn't work so well.

Comment by shale99 [ 27/Jun/17 ]

ok, got whats your saying

but that would mean we have all items that match the filter (there may be no filter sent) at hand - and then we can iterate and sort. but that isnt necessarily the case - if we ask for all items where field_x = y --> returns 25k items - and then we want to sort by material type name. this means we first need all 25k items and then sort them. as a general mechanism to solve this problem this would not be efficient.

Comment by Mike Taylor [ 27/Jun/17 ]

Exactly.

In the short term, one solution would be to query the authority WSAPI (in this case MT) to find out how many records there are. If there are 100 or less, proceed as I suggested, otherwise reject the search with an informative error message.

That would buy us time to think about the harder general problem.

Comment by shale99 [ 27/Jun/17 ]

the best solution i see here would be to integrate an index (solr / elastic) into the system and index the composite record. this is a bit more complex as we would need to set up triggers on updates / inserts when any part of the composite is updated (and send to an indexing queue). looks like solr (since v4.0) supports updates on a per field basis - which would simplify the process as we would not need to resend full documents (hence , not requiring a read from the db all parts of the composite)

Comment by Julian Ladisch [ 14/Feb/22 ]

Yes, a de-normalized tables is required if we need an index for performance but the sort field is in a foreign key table.
If the library is bi-lingual two de-normalized tables are needed.

Generated at Thu Feb 08 23:07:32 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.