[MODINVSTOR-390] Adding an Instance with a longer title throws "Values larger than 1/3 of a buffer page cannot be indexed" Created: 06/Nov/19  Updated: 15/Sep/20  Resolved: 05/Jan/20

Status: Closed
Project: mod-inventory-storage
Components: None
Affects versions: None
Fix versions: 18.2.1, 19.0.0

Type: Bug Priority: P2
Reporter: Theodor Tolstoy (One-Group.se) Assignee: Julian Ladisch
Resolution: Done Votes: 0
Labels: back-end, platform-backlog, q4-2019, q4-2019-bugfix, q4-2019-bugfix-release-created, q4-2019-bugtest, q4-2019-deployed-to-bugfest
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Attachments: XML File 1-1.xml     PNG File Skärmavbild 2020-01-05 kl. 18.42.55.png     PNG File Skärmavbild 2020-01-05 kl. 18.43.15.png     PNG File Skärmavbild 2020-01-07 kl. 17.52.15.png     File looong_title.json     File looong_title2.json     PNG File screenshot-1.png    
Issue links:
Blocks
blocks FOLIO-2402 platform-related features and fixes f... Closed
is blocked by MODINVSTOR-418 Update RAML Module Builder (RMB) to 2... Closed
is blocked by RMB-498 Truncate b-tree string for 2712 index... Closed
Relates
relates to MODINVSTOR-379 Reindex of contributors fails with ma... Closed
relates to MODINVSTOR-395 Adding a Instance record with a few m... Closed
relates to MODINVSTOR-139 Indices for sorting: Only index first... Closed
Sprint: CP: sprint 79
Story Points: 1
Development Team: Core: Platform
Tester Assignee: Charlotte Whitt

 Description   

This behaviour has been observed both in Five Colleges tenant and in Snapshot-stable

Steps to reproduce:
1. Post the attached Instance record to /inventory/instances

What should happen:
Instance is created successfully

What actually happens:
An exception is thrown. Message from API:

ErrorMessage(fields=[(Severity, ERROR), (V, ERROR), (SQLSTATE, 54000), (Message, index row size 2792 exceeds maximum 2712 for index "instance_title_idx"), (Hint, Values larger than 1/3 of a buffer page cannot be indexed.
Consider a function index of an MD5 hash of the value, or use full text indexing.), (s, diku_mod_inventory_storage), (t, instance), (n, instance_title_idx), (File, nbtinsert.c), (Line, 584), (Routine, _bt_findinsertloc)])
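
For illustration only, here is a minimal Python sketch of the two ideas in the error text above: a value whose size exceeds the quoted limit of 2712 bytes, and the hinted MD5 alternative, which always produces a fixed-length key. The function names and the sample title are hypothetical and do not come from mod-inventory-storage or RMB. The size check looks only at the raw UTF-8 length of the value, so it is an approximation; it ignores per-row index overhead and any compression the database may apply.

```python
import hashlib

# Limit quoted in the error message above ("exceeds maximum 2712").
BTREE_MAX_ROW_BYTES = 2712


def utf8_size(value: str) -> int:
    """Return the number of bytes the value occupies when encoded as UTF-8."""
    return len(value.encode("utf-8"))


def may_exceed_btree_limit(value: str, limit: int = BTREE_MAX_ROW_BYTES) -> bool:
    """Rough check: is the raw value alone already larger than the limit?

    Approximation only: index row overhead and compression are not modelled.
    """
    return utf8_size(value) > limit


def md5_key(value: str) -> str:
    """Fixed-length stand-in for the value, as the hint in the error suggests."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()


# Hypothetical title with a long statement of responsibility appended.
long_title = "A very long title / " + "Contributor Name, " * 200

print(utf8_size(long_title), may_exceed_btree_limit(long_title))
print(md5_key(long_title))
```

An MD5 key like this supports equality lookups only; it cannot be used for sorting or prefix matching on the title.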



 Comments   
Comment by Theodor Tolstoy (One-Group.se) [ 06/Nov/19 ]

Anya Ann-Marie Breaux this might be of interest to you

Comment by Theodor Tolstoy (One-Group.se) [ 06/Nov/19 ]

I found a second example

Comment by Charlotte Whitt [ 06/Nov/19 ]

Marc Johnson - Is this a `ui-only` problem, or backend?

Cate Boerema I only gave it P2, but this bug might be as high as P1?

Comment by Theodor Tolstoy (One-Group.se) [ 06/Nov/19 ]

If a couple of records are failing for FC, I do not think it is that urgent, and it seems that we at EBSCO can work around the problem by removing the indexes. But Anya could help on the severity here.

This is a backend issue only I would say. Having a 250+ contributor stacked instance record might cause other issues in the UI though.

Comment by Cate Boerema (Inactive) [ 06/Nov/19 ]

Charlotte Whitt I'll put it at the top of the backlog

Comment by Ann-Marie Breaux (Inactive) [ 07/Nov/19 ]

Hi Cate Boerema and Charlotte Whitt Should this be moved to MODINV project? Also, do you think this is related to the other bug where a record with hundreds of contributors fails to load/create the Instance properly? MODINVSTOR-395 Closed and MODINVSTOR-379 Closed

Comment by Cate Boerema (Inactive) [ 07/Nov/19 ]

Ann-Marie Breaux I think those are questions for Marc Johnson.

Comment by Marc Johnson [ 07/Nov/19 ]

Cate Boerema Charlotte Whitt Ann-Marie Breaux Theodor Tolstoy (One-Group.se)

I think those are questions for Marc Johnson

Alas, I think I am ill equipped to answer them. I'll share my thoughts below.

I think it is likely that this issue would be more appropriately investigated by the core platform team, as they are more familiar with the database indexing done by RAML Module Builder. Jakub Skoczen what do you think?

Is this a `ui-only` problem, or backend?

This is not a UI issue, it is related to database indexes and hence is solely backend.

Should this be moved to MODINV project?

Database indexes are managed by the storage modules. It should be moved to the MODINVSTOR project.

it seems that we at EBSCO can work around the problem by removing the indexes

I suggest being cautious with this approach. Removing indexes might impede performance of common operations. For example, removing an index on title could cause title searching in inventory to either stop working or work sufficiently slowly for the impact to be similar.

It might also be that this needs doing every time a new version of the module is deployed.

do you think this is related to the other bug where a record with hundreds of contributors fails to load/create the Instance properly?

It may well be related, as they both appear to be related to constraints around database indexing.

Comment by Theodor Tolstoy (One-Group.se) [ 07/Nov/19 ]

When you are figuring out a solution, that solution needs to reflect the fact that there are Scientific reports/Articles etc, especially in Physics, where the number of contributors can reach well over 3000 names. I am not joking.

Usually these reports are kept in the Institutional repositories of the Universities, but I am sure they will end up in FOLIO eventually, so we need to take that into consideration.

Is it not very unusual and against common practices to add all the contributors to the title like in these examples? I do not think anyone would do that with a 3k long contributor list

Comment by Anya [ 07/Nov/19 ]

Steve Bischof adding

Comment by Charlotte Whitt [ 07/Nov/19 ]

Theodor Tolstoy (One-Group.se) re.

Is it not very unusual and against common practices to add all the contributors to the title like in these examples? I do not think anyone would do that with a 3k long contributor list

Do you have a MARC record from LIBRIS on that example with that list of 3k contributors?

I'll loop lew235 in too.

Comment by Theodor Tolstoy (One-Group.se) [ 07/Nov/19 ]

https://www.sciencemag.org/news/2015/05/physics-paper-sets-record-more-5000-authors

Comment by Theodor Tolstoy (One-Group.se) [ 07/Nov/19 ]

Do you have a MARC record from LIBRIS on that example with that list of 3k contributors?

No record in LIBRIS of that kind that I know of. As I wrote:

Usually these reports are kept in the Institutional repositories of the Universities,

Comment by Theodor Tolstoy (One-Group.se) [ 07/Nov/19 ]

Here is one: "only" 350+ contributors though 1-1.xml

Comment by Cate Boerema (Inactive) [ 08/Nov/19 ]

Theodor Tolstoy (One-Group.se) is there a reason this doesn't have a CHAL issue? What is the customer priority?

Comment by Cate Boerema (Inactive) [ 08/Nov/19 ]

Oh, nevermind. I see this was seen at 5 Colleges.

Comment by Bohdan Suprun (Inactive) [ 08/Nov/19 ]

Hi All,

It seems we have to revise all indexes for instance.title/contributors. We have three different DB indexes for these fields:

  • Simple B-tree index (that causes the issue);
  • GIN index;
  • Full text index;

I don't think that we really need the B-tree indexes, since we already have full text indexes. Removing them will resolve the issue (and, I hope, MODINVSTOR-395 Closed as well).

Marc Johnson, Jakub Skoczen - what do you think?

Best regards,
Bohdan

Comment by Marc Johnson [ 08/Nov/19 ]

Bohdan Suprun Thank you for investigating this.

I don't think that we really need the B-Tree index, since we already have full text indexes.

The use of indexes is highly coupled to the queries that RAML Module Builder generates (particularly for CQL to SQL conversion).

Which is why I think this question is better answered by the core platform team.

Comment by Cate Boerema (Inactive) [ 08/Nov/19 ]

Per Core Functional grooming, this bug belongs with Core Platform (CF is far less familiar with this stuff than CP). FYI Jakub Skoczen and Oleksii Popov since I think Jakub is OOO

Comment by Jakub Skoczen [ 19/Dec/19 ]

Theodor Tolstoy (One-Group.se) Cate Boerema this problem is fixed with RMB 29.1.4 and it will be shipped with mod-inventory-storage 18.2.0 – please review if the problem is gone on your end.
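
The linked issue RMB-498 is titled "Truncate b-tree string for 2712 index...". The following Python sketch shows the general idea of indexing only a prefix of the value so that the key stays under the limit. It is not the actual RMB change: the prefix length of 600 characters and the function names are assumptions chosen for illustration, and the arithmetic relies only on the fact that a UTF-8 character takes at most 4 bytes.

```python
# Limit quoted in the error message of this issue.
BTREE_MAX_ROW_BYTES = 2712
# Worst case for UTF-8: a single character can take up to 4 bytes.
MAX_UTF8_BYTES_PER_CHAR = 4
# Hypothetical prefix length, chosen here for illustration only.
PREFIX_CHARS = 600


def index_key(title: str, prefix_chars: int = PREFIX_CHARS) -> str:
    """Return the truncated value that the index would store in this sketch."""
    return title[:prefix_chars]


def worst_case_bytes(prefix_chars: int) -> int:
    """Upper bound on the UTF-8 size of a key with the given character count."""
    return prefix_chars * MAX_UTF8_BYTES_PER_CHAR


# 600 characters * 4 bytes = 2400 bytes, which is below the 2712-byte limit.
assert worst_case_bytes(PREFIX_CHARS) <= BTREE_MAX_ROW_BYTES

long_title = "Contributor Name, " * 300
print(len(long_title), len(index_key(long_title)))
```

Two titles that share the same first 600 characters get the same key in this sketch, so a query that needs an exact match would still have to compare the full value.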

Comment by Marc Johnson [ 20/Dec/19 ]

Jakub Skoczen Julian Ladisch Adam Dickmeiss Am I correct in understanding that there will need to be another release of RAML Module Builder to address the performance regressions, and mod-inventory-storage will need to be re-upgraded?

And this is needed to fix 2019 Q4 must fix bugs?

Comment by Cate Boerema (Inactive) [ 02/Jan/20 ]

Just catching up from the holidays. Marc Johnson did you ever get an answer to your question above?

Jakub Skoczen Julian Ladisch Adam Dickmeiss Am I correct in understanding that there will need to be another release of RAML Module Builder to address the performance regressions, and mod-inventory-storage will need to be re-upgraded?

Comment by Cate Boerema (Inactive) [ 02/Jan/20 ]

Theodor Tolstoy Cate Boerema this problem is fixed with RMB 29.1.4 and it will be shipped with mod-inventory-storage 18.2.0 – please review if the problem is gone on your end.

Thanks Jakub Skoczen. Looking at BugFest environment, it appears it's still using an older version of mod-inventory-storage

I guess I need to put in a request to have the environment updated to use mod-inventory-storage 18.2.0

Comment by Ann-Marie Breaux (Inactive) [ 03/Jan/20 ]

Hi Cate Boerema mod-inventory-storage 18.2.0 is on the BugFest enviro now, so I think you should be able to check this. Also just FYI, mod-circulation 17.0.1 was added to BugFest yesterday as well.

Comment by Charlotte Whitt [ 05/Jan/20 ]

Manual test in FOLIO Snapshot, version @folio/inventory 1.13.1000758, using Chrome.

Tested adding the title from looong_title.json; the instance record was created successfully.
Did not get any errors when adding the crazy long title as Resource title or as Index title. I'll close the ticket as done.


Comment by Charlotte Whitt [ 05/Jan/20 ]

Cate Boerema - please release it to BugFest.

CC: Holly Mistlebauer

Comment by Cate Boerema (Inactive) [ 06/Jan/20 ]

Thanks Charlotte Whitt. It does look like Marc Johnson created a bugfix release for this (18.2.1 https://github.com/folio-org/mod-inventory-storage/releases) and it is not yet on the BugFest environment but I am going to hold off on requesting that it be deployed there because there are other fixes included in that release which aren't yet closed. Namely:

  • MODINVSTOR-395 Closed (can you test this Charlotte Whitt? I am not sure how)
  • MODINVSTOR-379 Closed and MODINVSTOR-139 Closed (these ones aren't currently tagged with q4-2019 or q4-2019-bugfix but if we can't release without including them, I think we need to add those tags and get them tested - how can we test these?)
Comment by Charlotte Whitt [ 06/Jan/20 ]

Hi @cate:

  • MODINVSTOR-395 Closed - I tested this morning and verified the fix, and closed the ticket.
  • MODINVSTOR-379 Closed - here I loaded the attached problem_bibs.mrc file into the FOLIO environment where we are allowed to load big files through Data Import (https://folio-snapshot-load.aws.indexdata.com/) - all looks good, and I could load all 20 problematic records, and got no error message. Will add documentation in the relevant Jira ticket.
  • MODINVSTOR-139 Closed - I'm checking too (stay tuned)
Comment by Cate Boerema (Inactive) [ 06/Jan/20 ]

Deployed to bugfest: https://github.com/folio-org/mod-inventory-storage/releases/tag/v18.2.1

Comment by Charlotte Whitt [ 07/Jan/20 ]

Added instance with the looong_title (see json attached). Today's test in Q4 2019 bugfest passed successfully, and all is fine. I have removed retest label.
