[Diagram: Spitfire-ongoing-indexing-seq.drawio]

The approach is as follows (using contributors as an example):

  1. Add a last_updated_date column to the contributors table.

  2. Create the sub_resources_locks table (id, entity_type, locked_flag, last_updated_date) and populate it with a row for each subresource type.

  3. Create a job that runs every minute, fetches the list of tenants with the query

       select t.tenantjson -> 'descriptor' ->> 'id' from public.tenants t where t.tenantjson -> 'descriptor' ->> 'id' <> 'supertenant';

     and then, for each tenant:

    1. Tries to acquire the lock: update sub_resources_locks set locked_flag=true where entity_type='contributor' and locked_flag = false returning id, last_updated_date

    2. If an id is returned, runs the contributors aggregate query with the clause where last_updated_date >= ? order by last_updated_date limit 100, using the returned last_updated_date as the parameter. The limit of 100 is the number of records expected to be processed within a minute; if a run takes longer than that, the lock prevents the next scheduled run from starting.

    3. Loads the aggregates into OpenSearch.

    4. Releases the lock: update sub_resources_locks set locked_flag=false, last_updated_date=current_timestamp where entity_type='contributor' and locked_flag = true

NB! In ECS mode the job should run only for the central tenant.
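The tenant selection, including the ECS rule, could look roughly like this. The function name, the ecs_mode flag and the central_tenant_id parameter are hypothetical; the text does not say how ECS mode or the central tenant is detected, so both are simply passed in.

```python
SUPERTENANT = "supertenant"


def tenants_to_process(tenant_ids, ecs_mode=False, central_tenant_id=None):
    """Return the tenants the job should visit on this run."""
    # Mirrors the <> 'supertenant' filter of the tenant query.
    candidates = [t for t in tenant_ids if t != SUPERTENANT]
    if not ecs_mode:
        return candidates
    # In ECS mode only the central tenant is processed.
    return [t for t in candidates if t == central_tenant_id]
```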