[FOLIO-2741] login fails due to failure to get /service-points Created: 20/Aug/20 Updated: 01/Sep/20 Resolved: 01/Sep/20 |
|
| Status: | Closed |
| Project: | FOLIO |
| Components: | None |
| Affects versions: | None |
| Fix versions: | None |
| Type: | Bug | Priority: | P2 |
| Reporter: | Ian Hardy | Assignee: | Ian Hardy |
| Resolution: | Done | Votes: | 0 |
| Labels: | None |
| Remaining Estimate: | Not Specified |
| Time Spent: | Not Specified |
| Original estimate: | Not Specified |
| Attachments: |
|
| Issue links: |
|
| Sprint: | DevOps: Sprint 95, DevOps: Sprint 96 |
| Development Team: | FOLIO DevOps |
| Release: | Q3 2020 |
| Description |
|
Overview: Login fails due to a failure to GET /service-points.

Expected Results:

Observing the request ID in the okapi logs, notice the time difference between the early entries and the final 500:

ubuntu@ip-10-36-1-13:~$ grep -r 808230 /var/log/folio/okapi/
/var/log/folio/okapi/okapi-5.log:2020-08-20T11:22:37,786 INFO ProxyContext 808230/service-points REQ 10.36.1.47:63632 diku GET /service-points mod-authtoken-2.6.0-SNAPSHOT.73 mod-inventory-storage-19.4.0-SNAPSHOT.457
/var/log/folio/okapi/okapi-5.log:2020-08-20T11:22:37,894 INFO ProxyContext 808230/service-points RES 202 107370us mod-authtoken-2.6.0-SNAPSHOT.73 http://10.36.1.13:9178/service-points?limit=10000&query=cql.allRecords%3D1%20sortby%20name
/var/log/folio/okapi/okapi-10.log:2020-08-20T12:36:53,777 WARN ProxyContext 808230/service-points proxyRequestHttpClient failure: http://10.36.1.13:9168: Connection was closed
/var/log/folio/okapi/okapi-10.log:2020-08-20T12:36:53,777 INFO ProxyContext 808230/service-points RES 500 4455883219us okapi proxyClient failure: mod-inventory-storage-19.4.0-SNAPSHOT.457 http://10.36.1.13:9168: Connection was closed

See the attached sample from mod-inventory-storage, which shows the restart and the request coming back after the postgres client is closed. |
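The grep above can be reproduced as a small script. This is a sketch only: the log directory is a temporary one seeded with abridged copies of the two log lines from this ticket, standing in for a real /var/log/folio/okapi/ tree, and the request ID 808230 is just the example from the report.

```shell
#!/bin/sh
# Sketch: trace one Okapi request ID across rotated log files.
# LOGDIR stands in for /var/log/folio/okapi/; the log lines are
# abridged copies of the ones in this ticket, for illustration only.
LOGDIR=$(mktemp -d)

# Two rotated log files, roughly 74 minutes apart, as in the ticket.
echo '2020-08-20T11:22:37,786 INFO ProxyContext 808230/service-points REQ 10.36.1.47:63632 diku GET /service-points' > "$LOGDIR/okapi-5.log"
echo '2020-08-20T12:36:53,777 INFO ProxyContext 808230/service-points RES 500 4455883219us okapi proxyClient failure' > "$LOGDIR/okapi-10.log"

# grep -r follows the ticket's approach; -h drops filenames so that
# sort orders the matches purely by their leading timestamp.
MATCHES=$(grep -rh '808230/service-points' "$LOGDIR" | sort)
MATCH_COUNT=$(echo "$MATCHES" | wc -l)
echo "$MATCHES"

rm -rf "$LOGDIR"
```

Sorting by timestamp makes the gap obvious: the REQ entry and the final 500 are over an hour apart, which is what points at a hung backend rather than a slow query.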
| Comments |
| Comment by Ian Hardy [ 20/Aug/20 ] |
|
Happened again; didn't notice much new. One thing that is a little unusual was 107 open connections like this (sample from netstat):

tcp 0 0 81efab5fb944:sunproxyadmin ip-10-36-1-13.ec2.internal:58270 ESTABLISHED
tcp 0 0 81efab5fb944:sunproxyadmin ip-10-36-1-13.ec2.internal:58686 ESTABLISHED
tcp 0 0 81efab5fb944:sunproxyadmin ip-10-36-1-13.ec2.internal:58258 ESTABLISHED
tcp 0 0 81efab5fb944:sunproxyadmin ip-10-36-1-13.ec2.internal:58768 ESTABLISHED
tcp 0 0 81efab5fb944:sunproxyadmin ip-10-36-1-13.ec2.internal:58782 ESTABLISHED
tcp 0 0 81efab5fb944:sunproxyadmin ip-10-36-1-13.ec2.internal:58192 ESTABLISHED
tcp 0 0 81efab5fb944:sunproxyadmin ip-10-36-1-13.ec2.internal:58918 ESTABLISHED

After a restart those are cleared and netstat looks more normal:

Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 81efab5fb944:56306 ip-10-36-1-13.ec2.internal:postgresql ESTABLISHED
tcp 0 0 81efab5fb944:56310 ip-10-36-1-13.ec2.internal:postgresql ESTABLISHED
tcp 0 0 81efab5fb944:56302 ip-10-36-1-13.ec2.internal:postgresql ESTABLISHED
tcp 0 0 81efab5fb944:56304 ip-10-36-1-13.ec2.internal:postgresql ESTABLISHED
tcp 0 0 81efab5fb944:56308 ip-10-36-1-13.ec2.internal:postgresql ESTABLISHED

Watching the container after a reset, I see those connections opening and getting cleaned up. |
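Tallying netstat output per peer makes this kind of leak easy to spot. A minimal sketch, using an inlined sample of netstat-style lines rather than a live system; in practice you would pipe `netstat -tn` (or `ss -tn`) into the same awk filter.

```shell
#!/bin/sh
# Sketch: count ESTABLISHED connections per foreign host from
# netstat-style output. SAMPLE is an abridged excerpt in the ticket's
# format, standing in for `netstat -tn` output on a real host.
SAMPLE='tcp 0 0 81efab5fb944:sunproxyadmin ip-10-36-1-13.ec2.internal:58270 ESTABLISHED
tcp 0 0 81efab5fb944:sunproxyadmin ip-10-36-1-13.ec2.internal:58686 ESTABLISHED
tcp 0 0 81efab5fb944:56306 ip-10-36-1-13.ec2.internal:postgresql ESTABLISHED'

# Field 5 is the foreign address; strip the port so connections to the
# same peer are grouped, then count per peer.
COUNTS=$(echo "$SAMPLE" | awk '$6 == "ESTABLISHED" {split($5, a, ":"); peer[a[1]]++} END {for (p in peer) print peer[p], p}')
echo "$COUNTS"
```

A steadily growing count for one peer (here, over a hundred against the database host) is the signature of connections being opened and never closed, which matched the hung postgres client pool in this ticket.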
| Comment by Jakub Skoczen [ 21/Aug/20 ] |
|
Cate Boerema can we get some help from Core:functional devs to get this figured out? |
| Comment by Cate Boerema (Inactive) [ 21/Aug/20 ] |
|
Sure, Jakub Skoczen. What's the priority? Is this must-fix for the hotfix? Q3? |
| Comment by Jakub Skoczen [ 24/Aug/20 ] |
|
Cate Boerema It seems high priority given how frequently this breaks. |
| Comment by Jakub Skoczen [ 24/Aug/20 ] |
|
Julian Ladisch Do you have any ideas about this? There is clearly an NPE in the logs Ian attached. |
| Comment by Cate Boerema (Inactive) [ 25/Aug/20 ] |
|
Jakub Skoczen, do you still want help from Core Functional on this? If so, please assign it to us. Thanks |
| Comment by Ian Hardy [ 25/Aug/20 ] |
|
Cate Boerema yes please. mod-inventory-storage has been requiring at least one restart a day lately. Please let us know if we can gather any more information for this. |
| Comment by Jakub Skoczen [ 25/Aug/20 ] |
|
Ian Hardy we think this might be fixed once we migrate MIS over to RMB 30.2.6 and 31.0.2. Marc Johnson can you merge Adam's PR to bump the RMB version on master? |
| Comment by Marc Johnson [ 25/Aug/20 ] |
|
Sure, done. The Core Platform team can review and merge these changes autonomously (using the agreed development process). |
| Comment by Ian Hardy [ 26/Aug/20 ] |
|
Thanks, all. It looks like the update was merged. I'll assign this back to myself to watch and see whether that fixed it. Looks promising so far: netstat shows a "normal" number of connections (6 instead of 350), and no one has complained yet. If there are no more reports of mod-inventory-storage hanging, I'll close this on Friday. |
| Comment by Ian Hardy [ 01/Sep/20 ] |
|
The update to RMB resolved this. |