Extracting upload into a separate thread

[Diagram: Spitfire-ongoing-indexing-seq.drawio]

  1. Add last_updated_date to the contributors table

  2. Create the sub_resources_locks table (id, entity_type, locked_flag, last_updated_date) and add one row for each subresource type

  3. Create a job that runs every minute, gets the list of tenants with select t.tenantjson -> 'descriptor' ->> 'id' from public.tenants t where t.tenantjson -> 'descriptor' ->> 'id' <> 'supertenant'; and then, for each tenant:

    1. Try to acquire the lock with update sub_resources_locks set locked_flag=true where entity_type='contributor' and locked_flag = false returning id, last_updated_date

    2. If an id is returned (the lock was acquired), run the contributors aggregate query with the clause where last_updated_date >= ? order by last_updated_date limit 100, passing the last_updated_date returned by the lock update. The limit of 100 is the number of records expected to be processable within a minute; if processing takes longer, the lock prevents the next scheduled run from starting.

    3. Load the aggregates into OpenSearch

    4. Release the lock with update sub_resources_locks set locked_flag=false, last_updated_date=:timestamp where entity_type='contributor' and locked_flag = true, where :timestamp is the latest timestamp among the records fetched in step 2

NB! In ECS mode the job should run only for the central tenant.