[FOLIO-968] Container RAM usage is getting out of control on blackbox VMs Created: 06/Dec/17  Updated: 12/Nov/18  Resolved: 03/Oct/18

Status: Closed
Project: FOLIO
Components: Continuous Integration
Affects versions: None
Fix versions: None

Type: Bug Priority: P2
Reporter: Wayne Schneider Assignee: Wayne Schneider
Resolution: Done Votes: 0
Labels: ci, sprint28, sprint29, sprint35, sprint36, sprint37, sprint48
Remaining Estimate: Not Specified
Time Spent: 7 hours, 30 minutes
Original estimate: Not Specified

Attachments: PNG File screenshot-1.png    
Issue links:
Blocks
blocks FOLIO-1532 Add mod-data-import and mod-source-re... Closed
blocks FOLIO-1531 Add mod-orders to vagrant box Closed
blocks MODVEND-12 Add mod-vendors to the FOLIO instance... Closed
is blocked by FOLIO-995 docker_entrypoint.sh script only acce... Closed
Relates
relates to FOLIO-977 Update default RAM allocation for Vag... Closed
relates to FOLIO-886 add java remote debugging as standard... Closed
relates to FOLIO-974 proxyRequestResponse failure Closed
Sprint:
Development Team: Prokopovych

 Description   

Each container wants something like 300MB of RAM. This is a little unreasonable, with the number of containers we're trying to run now. Vagrant boxes produced by CI (and folio-testing-backend01, maybe) are running into OOM issues.



 Comments   
Comment by Wayne Schneider [ 06/Dec/17 ]

Top from a recent "snapshot" VM sorted by memory usage shows:

 1158 folio     20   0 3516332 270676  20460 S   0.0  6.7   0:06.83 java
 1923 folio     20   0 3508048 270624  22272 S   0.3  6.7   0:05.62 java
 1506 folio     20   0 3506624 233480  20836 S   0.0  5.8   0:05.39 java
 2195 folio     20   0 3504180 226044  20512 S   0.3  5.6   0:04.20 java
 2102 folio     20   0 3504204 221220  20324 S   0.0  5.5   0:04.90 java
 1241 folio     20   0 3506496 218076  20560 S   0.0  5.4   0:05.20 java
  819 okapi     20   0 3509932 201276  18616 S   0.3  5.0   0:04.63 java
 2367 folio     20   0 3504472 198380  20788 S   0.0  4.9   0:04.31 java
 2014 folio     20   0 3504224 197752  20400 S   0.0  4.9   0:03.90 java
 1326 folio     20   0 3504452 193680  20748 S   0.0  4.8   0:04.00 java
 2277 folio     20   0 3504452 192240  20720 S   0.3  4.7   0:03.63 java
 1408 folio     20   0 3495496  75132  17872 S   0.3  1.9   0:01.58 java
 1683 folio     20   0 3493900  74584  17556 S   0.3  1.8   0:01.46 java
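
To put a single number on the listing above, here is a minimal sketch that totals the resident memory of the listed JVMs. It assumes the default top column order (PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND) with RES reported in KiB; the header line is not part of the paste, so that layout is an assumption.

```python
# Lines copied from the top output above, whitespace collapsed.
# Assumption: the 6th field is RES in KiB (default top layout).
TOP_LINES = """\
1158 folio 20 0 3516332 270676 20460 S 0.0 6.7 0:06.83 java
1923 folio 20 0 3508048 270624 22272 S 0.3 6.7 0:05.62 java
1506 folio 20 0 3506624 233480 20836 S 0.0 5.8 0:05.39 java
2195 folio 20 0 3504180 226044 20512 S 0.3 5.6 0:04.20 java
2102 folio 20 0 3504204 221220 20324 S 0.0 5.5 0:04.90 java
1241 folio 20 0 3506496 218076 20560 S 0.0 5.4 0:05.20 java
819 okapi 20 0 3509932 201276 18616 S 0.3 5.0 0:04.63 java
2367 folio 20 0 3504472 198380 20788 S 0.0 4.9 0:04.31 java
2014 folio 20 0 3504224 197752 20400 S 0.0 4.9 0:03.90 java
1326 folio 20 0 3504452 193680 20748 S 0.0 4.8 0:04.00 java
2277 folio 20 0 3504452 192240 20720 S 0.3 4.7 0:03.63 java
1408 folio 20 0 3495496 75132 17872 S 0.3 1.9 0:01.58 java
1683 folio 20 0 3493900 74584 17556 S 0.3 1.8 0:01.46 java
"""


def resident_kib(top_output):
    """Return the RES column (6th field) of each non-empty line, in KiB."""
    return [int(line.split()[5]) for line in top_output.splitlines() if line.strip()]


res = resident_kib(TOP_LINES)
total_mib = sum(res) / 1024
print(f"{len(res)} JVMs, {total_mib:.0f} MiB resident in total")
```

Under that assumption, the thirteen JVMs listed account for roughly 2.5 GiB of resident memory between them.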
Comment by Wayne Schneider [ 06/Dec/17 ]

Possible useful resources:
https://developers.redhat.com/blog/2017/03/14/java-inside-docker/
The “make JVM respect CPU and RAM limits” section of https://hub.docker.com/r/library/openjdk/ suggests there are some specialized parameters for running the JVM inside a container.

Comment by John Malconian [ 06/Dec/17 ]

Some of the issues can be partially addressed by completing FOLIO-886 and using a new base Docker image for server-side modules that has facilities for managing JVM memory in containers properly.

Comment by shale99 [ 07/Dec/17 ]


I've uploaded the output from `/admin/memory` for one of the modules (what Wayne Schneider sent on Slack). The summary (the last line) indicates that the OS has committed 166MB to the JVM and that the JVM is using about 106MB. This is heap memory only (which in Java apps is usually the bulk of the allocated memory); non-heap is at around 50MB. We can potentially do some tuning to cut a couple of tens of MB, but I am not sure it is worth the effort at this point, as this amount of memory isn't considered very high. So while we may be able to reduce the ~280MB, I wouldn't expect it to drop by much.
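
As a rough accounting of the figures quoted in this comment, the sketch below adds them up. The variable names are illustrative, and treating the remainder as JVM overhead outside heap and non-heap (thread stacks, native buffers and so on) is an assumption rather than something `/admin/memory` reports.

```python
# Figures quoted in the comment, in MB; all are approximate.
heap_committed_mb = 166   # committed to the JVM heap by the OS
heap_used_mb = 106        # heap actually in use
non_heap_mb = 50          # "around 50MB" of non-heap memory
resident_mb = 280         # "~280MB" resident size of the container's JVM

accounted_mb = heap_committed_mb + non_heap_mb
unaccounted_mb = resident_mb - accounted_mb        # assumed to be other JVM/native overhead
heap_headroom_mb = heap_committed_mb - heap_used_mb

print(f"heap + non-heap: {accounted_mb} MB")
print(f"not covered by these figures: {unaccounted_mb} MB")
print(f"committed but unused heap: {heap_headroom_mb} MB")
```

On these numbers, committed-but-unused heap is the largest single item that tuning could plausibly reclaim.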

Comment by Jakub Skoczen [ 07/Dec/17 ]

Assigning to John Malconian while Wayne Schneider is away. John, can you check whether you can deal with this by allocating more resources to the box? Also, is there anything specific that should be done to decrease resource usage for specific modules? Which modules are the heavy resource consumers?

Comment by Wayne Schneider [ 18/Dec/17 ]

One thing to try – set -Xmx for all Java-based modules. A look at /admin/memory on all the RMB-based modules suggests that 256m is a reasonable baseline; mod-permissions and mod-inventory-storage may need to be set to 512m, and mod-users to 384m.
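
The per-module heap limits proposed above could be kept in one small table, sketched here. The 256m baseline and the three overrides come from this comment; `mod-example` is a hypothetical module name standing in for any module that takes the baseline.

```python
# Heap ceilings proposed in the comment above.
DEFAULT_XMX = "256m"
XMX_OVERRIDES = {
    "mod-permissions": "512m",
    "mod-inventory-storage": "512m",
    "mod-users": "384m",
}


def xmx_flag(module):
    """Return the -Xmx flag for a module, falling back to the baseline."""
    return f"-Xmx{XMX_OVERRIDES.get(module, DEFAULT_XMX)}"


# "mod-example" is hypothetical; it stands in for any module without an override.
for module in ("mod-users", "mod-permissions", "mod-inventory-storage", "mod-example"):
    print(module, xmx_flag(module))
```

Keeping the overrides in a single mapping makes it easy to see which modules deviate from the baseline and to adjust them as `/admin/memory` readings change.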

John Malconian also asked that a couple of debugging flags (-XX:+PrintFlagsFinal and -XX:+PrintGCDetails) be set. The only problem is that there appears to be a bug in the docker_entrypoint.sh script, so that you can only set one flag in the JAVA_OPTS environment variable when launching the Docker container. That may need to be filed as a separate issue.
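
The cause of the entrypoint bug is not stated here, so the following only illustrates the intended behavior: a multi-flag option string should reach the JVM as one argument per flag. It is a sketch in Python rather than the script's own shell, and `module.jar` is a hypothetical file name.

```python
import shlex

# The heap limit plus the two debugging flags requested above.
java_opts = "-Xmx256m -XX:+PrintFlagsFinal -XX:+PrintGCDetails"


def java_command(opts, jar):
    """Build an argv list in which each JVM flag is its own argument."""
    return ["java", *shlex.split(opts), "-jar", jar]


# "module.jar" is a hypothetical jar name used only for illustration.
argv = java_command(java_opts, "module.jar")
print(argv)
```

If the whole string were instead passed as a single argument, the JVM would see one unrecognized option rather than three separate flags.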

Comment by Wayne Schneider [ 19/Dec/17 ]

Limiting heap size does seem to help (at least marginally), without obviously affecting performance. With the heap size limit, it appears we can configure the Vagrant boxes with 6GB of RAM (instead of 8), so that's an improvement. With JMX tooling in the Docker containers, we should be able to see easily whether GC is thrashing or fairly efficient.

Comment by John Malconian [ 20/Dec/17 ]

Note that all Docker images and their startup scripts have been updated. Use the environment variable 'JAVA_OPTIONS' to set JVM options in place of 'JAVA_OPTS'.
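
A minimal sketch of passing several JVM options through 'JAVA_OPTIONS' when starting a container. It only builds the `docker run` argument list and does not execute it; the image name `folioorg/mod-example:latest` is hypothetical, and the flag values are examples.

```python
def docker_run_args(image, jvm_flags):
    """Build a `docker run` argv that passes JVM flags via JAVA_OPTIONS."""
    java_options = " ".join(jvm_flags)
    # As an argv list, the space-separated value needs no extra shell quoting.
    return ["docker", "run", "-d", "-e", f"JAVA_OPTIONS={java_options}", image]


# Hypothetical image name; flags chosen only as an example.
args = docker_run_args("folioorg/mod-example:latest", ["-Xmx256m", "-XX:+PrintGCDetails"])
print(args)
```

The list can be handed to `subprocess.run(args, check=True)` on a host with Docker installed.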

Comment by Wayne Schneider [ 30/Mar/18 ]

Updated both testing and snapshot builds to include a -Xmx JVM option for all Java-based containers. This buys us a little more space. We are still very crowded in an 8GB VM.

Comment by Wayne Schneider [ 28/Sep/18 ]

We likely need to build a more basic VM and make it easy for devs to deploy the modules they need for development, while still building the full-stack for the AWS demo boxes.

Comment by Wayne Schneider [ 03/Oct/18 ]

This issue can be closed – we've implemented what controls we can on container RAM usage in the VMs.

Generated at Thu Feb 08 23:09:51 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.