[FOLIO-2276] Adjust memory settings source-record-storage, source-record-manager Created: 20/Sep/19 Updated: 03/Jun/20 Resolved: 27/Sep/19 |
|
| Status: | Closed |
| Project: | FOLIO |
| Components: | None |
| Affects versions: | None |
| Fix versions: | None |
| Type: | Task | Priority: | P2 |
| Reporter: | Ian Hardy | Assignee: | Ian Hardy |
| Resolution: | Done | Votes: | 0 |
| Labels: | ci, data-import, devops, platform-backlog | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original estimate: | Not Specified | ||
| Issue links: |
|
||||||||
| Sprint: | CP: sprint 73, CP: sprint 72 | ||||||||
| Story Points: | 2 | ||||||||
| Development Team: | Core: Platform | ||||||||
| Description |
|
Kateryna Senchenko reported source-record-manager was down on folio snapshot-load. After restarting source-record-manager with a higher container memory limit, the same load caused an OOMKill on source-record-storage. To remediate, I'd propose increasing the container memory limit in the module descriptor for these two modules, while leaving the heap size of 256 unchanged. However, I want to be mindful not to over-provision these after https://folio-org.atlassian.net/browse/FOLIO-2242 is completed. Open to other suggestions as well, David Crossley, Wayne Schneider. |
| Comments |
| Comment by Wayne Schneider [ 20/Sep/19 ] |
|
Can we do some testing without the -Xmx setting in a container with a memory limit set? I'm curious to see if Java does a better job managing the burst under those conditions. |
| Comment by Ian Hardy [ 20/Sep/19 ] |
|
Yes, I'll try that first. |
| Comment by Ian Hardy [ 20/Sep/19 ] |
|
Built folio-snapshot-test with the -Xmx setting removed for srs and source-record-manager. In a test uploading 5 files with 500 MARC records each, the source-record-manager container ends up getting OOMKilled. JAVA_OPTIONS are:

"JAVA_OPTIONS=-XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap"

For some quick and dirty logging I ran:

while true
do
  docker stats --no-stream d8d2b604bc2c >> srm.txt
  sleep 1
done

and saw it hit the limit:

CONTAINER      CPU %     MEM USAGE / LIMIT     MEM %    NET I/O           BLOCK I/O   PIDS
d8d2b604bc2c   364.21%   275.6MiB / 341.3MiB   80.75%   4.19MB / 585kB    0B / 0B     43
d8d2b604bc2c   103.85%   326.3MiB / 341.3MiB   95.60%   5.5MB / 6.2MB     0B / 0B     50
d8d2b604bc2c   2.24%     326.5MiB / 341.3MiB   95.65%   5.5MB / 6.2MB     0B / 0B     49
d8d2b604bc2c   451.66%   341.2MiB / 341.3MiB   99.95%   11.9MB / 6.65MB   0B / 0B     50
d8d2b604bc2c   0.00%     0B / 0B               0.00%    0B / 0B           0B / 0B     0

Will try leaving the Java options the same (no -Xmx) and bumping up the container limit. |
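The srm.txt log written by the loop above can be summarized rather than read by eye. The following is a minimal sketch, assuming the column layout shown in this comment (container ID, CPU %, then MEM USAGE as the third field, reported in MiB); the function name `peak_mem_mib` is made up for illustration, and newer Docker versions that add a NAME column would need a different field index.

```shell
#!/bin/sh
# Print the highest MEM USAGE value (in MiB) found in a docker stats log.
# Assumption: MEM USAGE is the third whitespace-separated field and is
# reported in MiB, as in the output pasted in this comment.
peak_mem_mib() {
    awk '
        $1 == "CONTAINER" { next }        # skip the repeated header rows
        $3 ~ /^[0-9.]+MiB$/ {             # ignore other units, e.g. "0B"
            value = $3
            sub(/MiB$/, "", value)        # strip the unit suffix
            if (value + 0 > peak) peak = value + 0
        }
        END { printf "%.1f\n", peak }     # prints 0.0 if no sample matched
    ' "$1"
}
```

Running `peak_mem_mib srm.txt` on the samples above would report 341.2, i.e. the container was essentially at its 341.3MiB limit just before it was killed.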
| Comment by Ian Hardy [ 20/Sep/19 ] |
|
Looks like I can get through uploading 10 files of 500 MARC records each with the host memory limit set to 536870912 bytes (512 MiB), leaving -Xmx out. Kateryna Senchenko, is this pretty close to a "typical" test you'll do of data import? How many files/records were you using for testing on Friday? |
| Comment by Kateryna Senchenko [ 23/Sep/19 ] |
|
Hi Ian Hardy, |
| Comment by Ian Hardy [ 25/Sep/19 ] |
|
I increased the memory limit on SRM and loaded 1 file of 30k records (this is the test that Kateryna Senchenko said failed this morning). Watching the memory usage of mod SRM, I saw it peak at about 570MiB. The limit is currently configured at 682MiB, which works out to around 715 MB. Let me know if this seems like a reasonable limit here. |
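The limits in this thread are quoted in bytes, MiB, and MB, so the arithmetic is easy to get wrong. Below is a small sketch of the conversions using integer shell arithmetic; the helper names are hypothetical and results are rounded down to whole units.

```shell
#!/bin/sh
# Convert between the units used for container memory limits.
# Integer arithmetic only, so results are truncated to whole units.

# MiB -> bytes (1 MiB = 1024 * 1024 bytes)
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# bytes -> MiB
bytes_to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

# bytes -> MB (1 MB = 1,000,000 bytes)
bytes_to_mb() {
    echo $(( $1 / 1000000 ))
}
```

For example, `bytes_to_mib 536870912` gives 512, and `bytes_to_mb "$(mib_to_bytes 682)"` gives 715, which matches the figures quoted in this thread.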
| Comment by Wayne Schneider [ 25/Sep/19 ] |
|
Ann-Marie Breaux suggests that additional problems occur when you load multiple files (simultaneously or consecutively?), so that may also be something you want to test. |
| Comment by Ian Hardy [ 25/Sep/19 ] |
|
Good point, Wayne Schneider. I did a batch of 3 files with 30,000 records, and one file of 60,000 just to test the limits of the current config. The 3 at 30,000 worked fine (now they queue up). After loading the 60,000 record file mod-inventory-storage crashed, but I'll consider that outside the scope of this issue. Ann-Marie Breaux, Kateryna Senchenko, does that seem like a reasonable test, and if so, shall we leave the memory settings where they are now? |
| Comment by Wayne Schneider [ 25/Sep/19 ] |
|
That's great, Ian Hardy! Ann-Marie Breaux David Crossley this is kind of an interesting documentation challenge. Is there user documentation for data import currently? If so, it would seem logical to add a section for system administrators explaining how the memory settings can be tuned to support larger record loads. |
| Comment by Ann-Marie Breaux (Inactive) [ 26/Sep/19 ] |
|
Hi Wayne Schneider I don't think we've documented recommended configuration/memory for Data Import, but it's something we could do. Oleksii Kuzminov Kateryna Senchenko What do you think about adding something here? https://folio-org.atlassian.net/wiki/display/FOLIJET/Data-import+user+guides |
| Comment by Oleksii Kuzminov [ 26/Sep/19 ] |
|
Ann-Marie Breaux Yes, we will update documentation |
| Comment by Taras Spashchenko [ 26/Sep/19 ] |
|
Ian Hardy, could you please add parameters for jvm |
| Comment by Ann-Marie Breaux (Inactive) [ 26/Sep/19 ] |
|
Perfect Oleksii Kuzminov - thank you! |
| Comment by David Crossley [ 26/Sep/19 ] |
|
It would be useful to document stuff at the git README. And link to your other documentation. |
| Comment by Taras Spashchenko [ 27/Sep/19 ] |
|
Hello all, not sure if Oleksii Kuzminov has already shared the link to the post regarding memory management for containerized Java processes. So we can reduce the memory limit for the container and use -XX:MinRAMPercentage and -XX:MaxRAMPercentage to allocate more memory for the Java heap. |
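To make the percentage approach concrete: on a container-aware JVM, -XX:MaxRAMPercentage caps the heap at roughly that share of the container memory limit, so the heap follows the limit instead of a fixed -Xmx. The sketch below estimates the resulting heap; the helper names and the 75% figure are illustrative assumptions, not values agreed in this issue, and the JVM's actual heap may differ slightly.

```shell
#!/bin/sh
# Estimate the maximum heap (whole MiB, rounded down) for a container
# memory limit when -XX:MaxRAMPercentage is used.
# $1 = container memory limit in MiB, $2 = percentage as a whole number.
heap_mib_for_limit() {
    echo $(( $1 * $2 / 100 ))
}

# Build a JAVA_OPTIONS value that uses the percentage flag instead of -Xmx.
# $1 = percentage as a whole number (hypothetical; tune per module).
java_options_for_percentage() {
    echo "JAVA_OPTIONS=-XX:MaxRAMPercentage=$1.0"
}
```

With the 682MiB limit mentioned earlier, `heap_mib_for_limit 682 75` gives 511, leaving about 171MiB of the container for non-heap memory (metaspace, thread stacks, direct buffers).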
| Comment by Wayne Schneider [ 27/Sep/19 ] |
|
Thanks, everyone, for all the work on this. Taras Spashchenko and Oleksii Kuzminov – you can set the Java options and memory for the container yourselves in the module descriptor template. If you have specific recommendations, at this point I'd suggest you:
|
| Comment by Ian Hardy [ 27/Sep/19 ] |
|
I'll close this one since the srm/srs/data import modules are no longer getting killed in the reference environment. Further changes can be made to memory settings in the launch descriptor if needed. |
| Comment by Ann-Marie Breaux (Inactive) [ 01/Oct/19 ] |
|
Thanks everyone for your analysis and attention to this - seems like last week was a big one for memory work! |