[UIHA-2] Processing of xml files by the harvester Created: 25/Aug/21  Updated: 06/Sep/21  Resolved: 06/Sep/21

Status: Closed
Project: ui-harvester
Components: None
Affects versions: None
Fix versions: None

Type: Bug Priority: P1
Reporter: Charlotte Whitt Assignee: Niels Erik Nielsen
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Sprint:
Development Team: Thor

 Description   

Overview:
The harvester runs its first job:
2021-08-19 08:49:09,224 DEBUG [BulkRecordHarvestJob(10001 GBV)] BulkRecordHarvestJob downloading list: http://ouf-minerva2/picaxml/
...
2021-08-19 08:49:11,943 INFO [BulkRecordHarvestJob(10001 GBV)] Committed 1 adds, 0 deletes. 0 in total (pending warming of index).
2021-08-19 08:49:11,943 WARN [BulkRecordHarvestJob(10001 GBV)] No default or custom recipients specified, notification will not be sent
It then sets its nextbegin timestamp to 08:49:12:
2021-08-19 10:32:30,675 INFO [BulkRecordHarvestJob(10001 GBV)] Conditional request If-Modified-Since: Thu, 19 Aug 2021 08:49:12 GMT
This is not correct: between 08:49:09 (retrieval of the XML file list) and 08:49:12 (end of the harvester job + 1 sec.), new XML files are created that are never processed.
The begin timestamp of the next harvester job needs to be the start timestamp of the current job + 1 second.
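
The gap described above can be illustrated with a minimal sketch. This is not the harvester's code: the function name, the file name "late.xml", and its modification time are hypothetical, and the timestamps are rounded to whole seconds from the log lines above. It assumes the server returns only files modified strictly after the If-Modified-Since cutoff.

```python
from datetime import datetime, timedelta, timezone


def files_to_harvest(files, modified_since):
    """Return names of files modified strictly after the cutoff.

    Assumption: this mimics how the server answers an If-Modified-Since request.
    """
    return [name for name, mtime in files if mtime > modified_since]


# Times taken from the log excerpt, rounded to whole seconds.
job_start = datetime(2021, 8, 19, 8, 49, 9, tzinfo=timezone.utc)
job_end = datetime(2021, 8, 19, 8, 49, 11, tzinfo=timezone.utc)

# Hypothetical file created after the list was downloaded but before the job ended.
files = [("late.xml", datetime(2021, 8, 19, 8, 49, 11, tzinfo=timezone.utc))]

# Behaviour reported in this issue: cutoff = end of job + 1 second (08:49:12).
cutoff_from_end = job_end + timedelta(seconds=1)
# Requested behaviour: cutoff derived from the start of the job.
cutoff_from_start = job_start

missed = files_to_harvest(files, cutoff_from_end)
picked = files_to_harvest(files, cutoff_from_start)

print(missed)  # the late file is skipped
print(picked)  # the late file is harvested on the next run
```

With the cutoff taken from the end of the job, the late file falls before the cutoff and is never requested. With the cutoff taken from the start of the job, the next run picks it up.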

Expected Results:
Having the timestamp with milliseconds should avoid corner cases.

Actual Results:
Between 08:49:09 (retrieval of the XML file list) and 08:49:12 (end of the harvester job + 1 sec.), new XML files are created that are never processed.

Additional Information:
URL:
Interested parties: Niels Erik Nielsen Dennis Benndorf



 Comments   
Comment by Charlotte Whitt [ 25/Aug/21 ]

Niels Erik Nielsen implemented an update to the harvester, which fixed the problem Dennis uncovered.
The solution was to use the start timestamp of the harvester job as the begin timestamp of the next harvester job.
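
As a rough illustration of that approach (not the actual harvester change), the conditional request header could be built from the job's start time. The function name is hypothetical. Note that the HTTP date format used by If-Modified-Since has one-second resolution, so the milliseconds of the start time are dropped.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def if_modified_since_header(previous_job_start):
    """Build the conditional header from the start time of the previous job.

    Hypothetical helper; previous_job_start must be a timezone-aware UTC datetime.
    """
    return {"If-Modified-Since": format_datetime(previous_job_start, usegmt=True)}


# Start of the first job, taken from the log excerpt in the description (08:49:09,224).
first_job_start = datetime(2021, 8, 19, 8, 49, 9, 224000, tzinfo=timezone.utc)
header = if_modified_since_header(first_job_start)
print(header["If-Modified-Since"])  # Thu, 19 Aug 2021 08:49:09 GMT
```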

Comment by Charlotte Whitt [ 25/Aug/21 ]

charles - tested the fix thoroughly on 8/24 and could not get it to fail.

Generated at Thu Feb 08 22:33:32 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.