Investigate Time-out issue during FiscalYearRollover on eureka environment

Description

FYR fails with a Gateway Time-out even with a small amount of data (1,000 orders).

When starting FYR from the UI, we observe POST /finance/ledger-rollovers replying with a 504 error code (Gateway Time-out); the mod_finance container log below records the request as 500 Internal Server Error. This is followed by the OrderRolloverService message "Order Rollover failed".


Taken steps:

  • Increased connect Time-out for mod-finance from 1 to 2 minutes.

    • POST requests still time out.

  • Multiple tests with different numbers of orders

    • FYR ends with time-outs in different environments as well.

    • FYR on QCON with 1k orders finished successfully, but with 30k orders it hits the same issue (Time-out).

Results:

  • FYR 5 orders test QELC2

    • Successful

  • FYR 1k orders test QELC2

    • Failed with Time-out

  • FYR 1k orders test QCON

    • Successful

  • FYR 30k orders test QCON

    • Failed with Time-out

  • FYR 50k open orders + 50k pending orders test QCON

    • Successful after setting the parameter max_locks_per_transaction to 1024 in the DB cluster and instance configs

  • FYR 50k open orders + 50k pending orders test QELC2

    • Failed after 8 hours
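
For context on the max_locks_per_transaction change in the results above: per the PostgreSQL documentation, the shared lock table has room for roughly max_locks_per_transaction × (max_connections + max_prepared_transactions) object locks, and the default for max_locks_per_transaction is 64. A rough sketch of that arithmetic follows. The max_connections value is a hypothetical placeholder, not a setting taken from QCON or QELC2.

```python
def lock_table_capacity(max_locks_per_transaction: int,
                        max_connections: int,
                        max_prepared_transactions: int = 0) -> int:
    """Approximate number of object locks the shared lock table can track."""
    return max_locks_per_transaction * (max_connections + max_prepared_transactions)


# Hypothetical connection limit, for illustration only (not read from any environment).
ASSUMED_MAX_CONNECTIONS = 100

# 64 is the PostgreSQL default; 1024 is the value applied on QCON per the ticket.
default_capacity = lock_table_capacity(64, ASSUMED_MAX_CONNECTIONS)
tuned_capacity = lock_table_capacity(1024, ASSUMED_MAX_CONNECTIONS)

print(default_capacity, tuned_capacity, tuned_capacity // default_capacity)
# prints: 6400 102400 16
```

Whatever the real connection limit is, going from 64 to 1024 multiplies the lock table capacity by 16, which would fit a rollover that touches many objects inside one transaction. That reading is an assumption, not something the ticket confirms.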


Container logs:

17:04:58 [732108/finance] [cs00000int] [9f9d1c46-52e1-4bb7-9c6c-56e6bb945c42] [mod_finance] INFO LogUtil 127.0.0.1:43264 POST /finance/ledger-rollovers null HTTP_1_1 500 265 400027 tid=cs00000int Internal Server Error

20:03:33 [956595/orders] [cs00000int] [9f9d1c46-52e1-4bb7-9c6c-56e6bb945c42] [mod_orders] ERROR RestClient org.folio.rest.core.exceptions.HttpException: <html>

19:50:13 [956595/orders] [cs00000int] [9f9d1c46-52e1-4bb7-9c6c-56e6bb945c42] [mod_orders] INFO RestRouting invoking postOrdersRollover

19:50:13 [956595/orders] [cs00000int] [9f9d1c46-52e1-4bb7-9c6c-56e6bb945c42] [mod_orders] INFO LogUtil 127.0.0.1:46056 POST /orders/rollover null HTTP_1_1 204 0 352 tid=cs00000int No Content

 

19:56:53 [956595/orders] [cs00000int] [9f9d1c46-52e1-4bb7-9c6c-56e6bb945c42] [mod_orders] ERROR RestClient org.folio.rest.core.exceptions.HttpException: <html>

19:56:54 [956595/orders] [cs00000int] [9f9d1c46-52e1-4bb7-9c6c-56e6bb945c42] [mod_orders] ERROR OrderRolloverService Order Rollover failed
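
The timing in the excerpt above can be checked with a small script. This is only a sketch: it assumes that the three numbers after HTTP_1_1 in a LogUtil access line are status code, response size and elapsed time in milliseconds, and it treats the mod_orders lines as belonging to one rollover attempt. The mod_finance line carries a different timestamp (17:04:58), so it may come from another run or clock.

```python
import re
from datetime import datetime

# Lines copied from the container logs in this ticket.
FINANCE_ACCESS = (
    "17:04:58 [732108/finance] [cs00000int] [9f9d1c46-52e1-4bb7-9c6c-56e6bb945c42] "
    "[mod_finance] INFO LogUtil 127.0.0.1:43264 POST /finance/ledger-rollovers null "
    "HTTP_1_1 500 265 400027 tid=cs00000int Internal Server Error"
)
ORDERS_LINES = [
    "19:50:13 [956595/orders] [cs00000int] [9f9d1c46-52e1-4bb7-9c6c-56e6bb945c42] "
    "[mod_orders] INFO RestRouting invoking postOrdersRollover",
    "19:56:53 [956595/orders] [cs00000int] [9f9d1c46-52e1-4bb7-9c6c-56e6bb945c42] "
    "[mod_orders] ERROR RestClient org.folio.rest.core.exceptions.HttpException: <html>",
    "19:56:54 [956595/orders] [cs00000int] [9f9d1c46-52e1-4bb7-9c6c-56e6bb945c42] "
    "[mod_orders] ERROR OrderRolloverService Order Rollover failed",
]


def timestamp(line: str) -> datetime:
    """Parse the leading HH:MM:SS field of a log line."""
    return datetime.strptime(line.split()[0], "%H:%M:%S")


def seconds_between(start_line: str, end_line: str) -> int:
    """Whole seconds elapsed between two log lines from the same day."""
    return int((timestamp(end_line) - timestamp(start_line)).total_seconds())


start = next(l for l in ORDERS_LINES if "invoking postOrdersRollover" in l)
error = next(l for l in ORDERS_LINES if "ERROR RestClient" in l)
failed = next(l for l in ORDERS_LINES if "Order Rollover failed" in l)

# Assumed field order after HTTP_1_1: status, response size, elapsed milliseconds.
match = re.search(r"HTTP_1_1 (\d+) (\d+) (\d+) ", FINANCE_ACCESS)
status, size, elapsed_ms = map(int, match.groups())

print(seconds_between(start, error))   # 400
print(seconds_between(start, failed))  # 401
print(status, elapsed_ms / 1000)       # 500 400.027
```

Both the mod_orders failure and the mod_finance response land about 400 seconds after the request starts, which may point to a timeout of roughly 400 seconds somewhere in the call path. This is an inference from the excerpt, not a confirmed cause.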

In the database we observe slow queries, but no deadlocks.

Environment

QELC2

Potential Workaround

None

Attachments

14

Activity

Leonid Kolesnykov December 23, 2024 at 8:40 AM

Good to know. Thank you.

Boburbek Kadirkhodjaev December 20, 2024 at 9:48 AM

We can close this one. I tested locally with 50K open orders and it passed with no OOMs or RTR issues.

Leonid Kolesnykov November 8, 2024 at 9:30 AM

These issues with viewing budgets occur only on the Eureka-based environment. They become reproducible at more than 42k open orders: with 42k open orders the budget info was available in the UI, but in the 42k-46k range the page throws an error. The log CSV is attached.
Some OOMs (java.lang.OutOfMemoryError: Java heap space) follow the budget request. They happen at the sidecar level, where I also see this message: Failed to export spans. The request could not be executed. Full error message: Connection refused: localhost/127.0.0.1:4317

Boburbek Kadirkhodjaev November 7, 2024 at 5:17 AM
Edited

Is this issue with viewing budgets common across both Okapi and Eureka-based environments at a certain amount of data? Also, please attach any module logs and/or Okapi/Kong/Keycloak logs.

Leonid Kolesnykov November 6, 2024 at 1:28 PM
Edited

FYR with 50k open and 50k pending orders (2 PO lines per order) was performed on the QELC2 and QCON environments.
Eureka (QELC2) failed after 8 hours with ordersRolloverStatus - failed.
Observations:
Budgets became unavailable with this amount of data, which may be what causes FYR to fail.
The call finance/budget/2cd0f05d-d4c0-5210-9886-66c7e4dfe1a6/view/ fails while loading budget data. The transaction page throws this: /finance/transactions/budget/undefined

Non-Eureka (QCON) succeeded after 9 hours 7 minutes.
CC: let me know if anything can be tuned here.

Done

Details

Development Team

Thunderjet

Affected releases

Quesnelia (R1 2024)

Created October 11, 2024 at 1:31 PM
Updated December 23, 2024 at 8:40 AM
Resolved December 23, 2024 at 8:40 AM