Environment
- 61 back-end modules deployed in 110 ECS services
- 3 okapi ECS services
- 8 m5.large EC2 instances
- 2 db.r5.xlarge AWS RDS instances (1 reader, 1 writer)
Software version
mod-oai-pmh 3.4.2
Summary
We were able to harvest our full data set of 7.2 M records with different "Max records per response" parameter values.
There may be a memory leak on the mod-oai-pmh side, as we observed continuously growing memory and CPU usage (see the screenshots below).
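For context, an OAI-PMH harvest pages through ListRecords responses until the server returns an empty (or absent) resumptionToken; the "Max records per response" setting controls the page size. A minimal sketch of the token handling on the harvester side (this is illustrative, not the load generator actually used in these tests):

```python
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"

def next_resumption_token(xml_text: str):
    """Return the resumptionToken from an OAI-PMH ListRecords page,
    or None when the harvest is complete (empty or absent token)."""
    root = ET.fromstring(xml_text)
    token = root.find(f".//{{{OAI_NS}}}resumptionToken")
    if token is None or not (token.text or "").strip():
        return None
    return token.text.strip()

# A full harvest would loop GET requests such as
#   .../oai?verb=ListRecords&resumptionToken=<token>
# until next_resumption_token() returns None.
```

Per the protocol, the last page of a complete list carries an empty resumptionToken element, which the helper above maps to None.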
Tests and results
Test | Max records per response | Time to complete | Result | Issues |
---|---|---|---|---|
1 | 100 | 6 hours 26 minutes | All data harvested | Growing CPU/RAM usage |
2 | 300 | 2 hours 31 minutes | 5.5 M records harvested | Connection to the load generator was lost (not an oai-pmh issue) |
3 | 500 | 2 hours 27 minutes | All data harvested | Growing CPU/RAM usage |
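As a rough sanity check on the load implied by the table: a full harvest needs ceil(total / page size) ListRecords responses, and average throughput follows from the completion time. A back-of-the-envelope sketch (derived from the table, not separately measured):

```python
import math

def pages(total_records: int, page_size: int) -> int:
    """Number of ListRecords responses needed to cover the data set."""
    return math.ceil(total_records / page_size)

def records_per_second(total_records: int, seconds: int) -> float:
    """Average harvest throughput over the whole run."""
    return total_records / seconds

TOTAL = 7_200_000
# Test 1: page size 100, completed in 6 h 26 min
print(pages(TOTAL, 100))                                     # 72000 responses
print(round(records_per_second(TOTAL, 6 * 3600 + 26 * 60)))  # ~311 records/s
# Test 3: page size 500, completed in 2 h 27 min
print(pages(TOTAL, 500))                                     # 14400 responses
print(round(records_per_second(TOTAL, 2 * 3600 + 27 * 60)))  # ~816 records/s
```

So moving from 100 to 500 records per response cuts the request count fivefold and roughly halves the wall-clock time.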
Service CPU usage
Service Memory usage
Source-record-storage memory usage
Source-record-storage CPU usage
Source-record-manager memory usage
Source-record-manager CPU usage
mod-inventory-storage CPU usage
mod-inventory-storage memory usage
Heap Analysis
There are two leak suspects common to all heap dumps taken (one after each test):
io.vertx.core.http.impl.HttpClientImpl:
The number of instances grows: 7 347 → 13 248 → 20 664
io.vertx.core.http.impl.ConnectionManager:
The number of instances grows: 14 694 → 26 514 → 41 328
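The growth pattern above, with ConnectionManager counts staying close to twice the HttpClientImpl counts, is what you would expect if a new Vert.x HTTP client were created per request and never closed (each client owning its connection managers). A toy Python model of that anti-pattern, offered as an illustration of the hypothesis rather than actual FOLIO code (the two-managers-per-client ratio is an assumption inferred from the counts):

```python
class ConnectionManager:
    """Stand-in for io.vertx.core.http.impl.ConnectionManager."""
    live = 0

    def __init__(self):
        ConnectionManager.live += 1

class HttpClient:
    """Stand-in for io.vertx.core.http.impl.HttpClientImpl; each client
    owns its connection managers, so manager counts track client counts."""
    live = 0

    def __init__(self):
        HttpClient.live += 1
        # Assumed ~2:1 ratio, matching the heap-dump numbers above.
        self._managers = [ConnectionManager(), ConnectionManager()]

def handle_request_leaky():
    # Anti-pattern: build a fresh client per incoming request and never
    # close it, so heap-dump instance counts grow with every request.
    client = HttpClient()
    # ... perform the outgoing call with `client` ...

SHARED_CLIENT = HttpClient()

def handle_request_shared():
    # Fix: reuse one long-lived client; instance counts stay flat.
    client = SHARED_CLIENT
    # ... perform the outgoing call with `client` ...

for _ in range(1000):
    handle_request_leaky()
print(HttpClient.live, ConnectionManager.live)  # grows with request count
```

If this hypothesis holds, the fix in mod-oai-pmh would be to create the Vert.x client once and reuse it (or close it after use), which a follow-up heap dump could confirm by flat instance counts.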