[FOLIO-2250] RMB modules crash on tenant init with updated LaunchDescriptor Created: 10/Sep/19 Updated: 03/Jun/20 Resolved: 11/Sep/19 |
|
| Status: | Closed |
| Project: | FOLIO |
| Components: | None |
| Affects versions: | None |
| Fix versions: | None |
| Type: | Bug | Priority: | P2 |
| Reporter: | Wayne Schneider | Assignee: | Wayne Schneider |
| Resolution: | Done | Votes: | 0 |
| Labels: | devops, platform-backlog | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original estimate: | Not Specified | ||
| Issue links: |
|
| Sprint: | CP: sprint 72 |
| Story Points: | 2 |
| Development Team: | Core: Platform |
| Description |
|
Reported by David Crossley: "POST request for mod-notes-2.7.0-SNAPSHOT.134 /_/tenant failed with Connection refused: /10.36.1.114:9152" I can run a successful test locally on the "testing-backend" VM, upgrading from SNAPSHOT-133 to SNAPSHOT-134. That test, however, runs under Vagrant, not in the reference environment. |
| Comments |
| Comment by Wayne Schneider [ 10/Sep/19 ] |
|
Container crashed on tenant init. Container log:
10 Sep 2019 04:11:36:122 INFO TenantAPI [115771eqId] sending... postTenant for diku
10 Sep 2019 04:11:36:124 INFO PostgresClient [115773eqId] DB config read from environment variables
10 Sep 2019 04:11:36:135 INFO PostgresClient [115784eqId] postgreSQLClientConfig = {"maxPoolSize":5,"port":5432,"host":"10.36.1.114","username":"folio_admin","database":"okapi_modules","password":"..."}
10 Sep 2019 04:11:36:249 INFO BaseSQLClient [115898eqId] Creating configuration for 10.36.1.114:5432
Not very illuminating. |
| Comment by Wayne Schneider [ 10/Sep/19 ] |
|
The problem appears to be an OOM error on tenant init. I was not able to reproduce it on a Vagrant VM, but I can reproduce it reliably on the reference environment in AWS. I will do some further testing. It may be a problem that the container memory limit is set to the same as the max heap size – perhaps we need to give it more headroom? |
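The headroom concern can be checked with a little arithmetic. The following is a minimal sketch, not part of any FOLIO tooling: the helper names `parse_xmx` and `headroom_bytes` are hypothetical, and it assumes the heap ceiling is given as a plain `-Xmx<number><unit>` option. The JVM also uses memory outside the heap (for example metaspace and thread stacks), so a result of zero means the container limit leaves nothing for that.

```python
import re

# Multipliers for the size suffixes accepted by -Xmx (binary units).
_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3}


def parse_xmx(option: str) -> int:
    """Return the heap ceiling in bytes for an option such as '-Xmx256m'."""
    match = re.fullmatch(r"-Xmx(\d+)([kKmMgG]?)", option)
    if match is None:
        raise ValueError(f"not an -Xmx option: {option!r}")
    size, unit = match.groups()
    # No suffix means the number is already in bytes.
    return int(size) * _UNITS.get(unit.lower(), 1)


def headroom_bytes(container_limit: int, xmx_option: str) -> int:
    """Bytes left in the container for non-heap JVM memory."""
    return container_limit - parse_xmx(xmx_option)


# Container limit equal to the max heap, as suspected in this ticket.
print(headroom_bytes(256 * 1024**2, "-Xmx256m"))
```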
| Comment by Wayne Schneider [ 10/Sep/19 ] |
|
If I remove the -Xmx Java option from the command line, the container launches and stays up through tenant init. Monitoring memory usage shows that it runs fine with 256M set as the container limit. I believe we need to set the Memory key in the LaunchDescriptor to something like 1.33x the -Xmx setting, so that the max heap is roughly 75% of the memory available to the container. Unfortunately, this is more of a rule of thumb than anything. This would mean that -Xmx256m corresponds to a Memory value of 357913941 bytes (268435456 x 4/3). What do you think, David Crossley? |
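The rule of thumb above can be written out as a short calculation. This is only a sketch of the arithmetic: the function name `container_memory_for_heap` is hypothetical, and the 4/3 factor is the "1.33x" from the comment, with the result truncated to whole bytes.

```python
def container_memory_for_heap(heap_mib: int) -> int:
    """Container Memory value (bytes) that leaves the heap at about 75%.

    heap_mib is the -Xmx value in MiB, e.g. 256 for -Xmx256m.
    """
    heap_bytes = heap_mib * 1024 * 1024
    # Integer arithmetic: multiply by 4/3 and drop the fractional byte.
    return heap_bytes * 4 // 3


print(container_memory_for_heap(256))  # 357913941
```

For a 256 MiB heap this reproduces the value 357913941 quoted in the comment. Other heap sizes scale the same way.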
| Comment by David Crossley [ 10/Sep/19 ] |
|
Good discovery. Okay, I will document that and test with various modules today. |
| Comment by David Crossley [ 10/Sep/19 ] |
|
Wayne Schneider Is this a temporary fix, and these 1.33x memory settings can be reduced after the
|
| Comment by Wayne Schneider [ 11/Sep/19 ] |
|
Yes, this could be temporary, I think. I hate to have all that cleanup to do afterwards, though. |
| Comment by David Crossley [ 11/Sep/19 ] |
|
Probably going to revisit the MDs again anyway, after
Revising the LD Memory settings will help to bring the overall memory requirement (e.g. for folio-install) back down. |
| Comment by David Crossley [ 11/Sep/19 ] |
|
The new LaunchDescriptors are now in place for mod-notes, mod-users, mod-login, and mod-circulation. So it seems that this ticket can be closed. Thanks for your great work. |