Investigate mod-pubsub module memory growing during Check-in/Check-out

Leonid Kolesnykov January 31, 2025 at 2:24 PM
When we take a heap dump, we get one for each of the two containers (tasks of the module), so I provide the info from both containers here. By “1 container” I just mean the first of the two.
The results were taken from one test with 2 tasks of the mod-pubsub module.

Alexander Kurash January 30, 2025 at 3:49 PM (edited)
I can only see two heap dumps for “1 container” (BTW, does this mean that you ran one instance of mod-pubsub, as opposed to 2 instances in the “2 container” case?). Could it be that one instance was just overwhelmed by the number and frequency of requests? I mean, a growing memory trend doesn’t necessarily mean there’s a leak; maybe it would clear out once the “DoS-ing” stopped. I think that’s what you took the fourth heap dump for (20 minutes after the test finished).
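To illustrate the "overwhelmed, not leaking" hypothesis, here is a minimal Java sketch of how pending delayed tasks sit in a ScheduledThreadPoolExecutor's work queue until their delay elapses. It uses its own executor as a stand-in: CompletableFutureDelayScheduler is the thread of the JDK-internal scheduler behind CompletableFuture.orTimeout, completeOnTimeout, and delayedExecutor, and its queue is not directly accessible. The class name DelayedQueueSketch, the helper scheduleMany, and the task counts and delays are all hypothetical; whether mod-pubsub's growth actually follows this pattern is not established in this thread.

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class DelayedQueueSketch {

    /** Schedules `count` no-op tasks with the given delay and returns the executor (hypothetical helper). */
    static ScheduledThreadPoolExecutor scheduleMany(int count, long delayMillis) {
        ScheduledThreadPoolExecutor scheduler = new ScheduledThreadPoolExecutor(1);
        for (int i = 0; i < count; i++) {
            // Each call adds one RunnableScheduledFuture to the executor's DelayedWorkQueue.
            scheduler.schedule(() -> { }, delayMillis, TimeUnit.MILLISECONDS);
        }
        return scheduler;
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledThreadPoolExecutor scheduler = scheduleMany(10_000, 500);
        // While the delays are still running, the tasks are retained in the queue.
        System.out.println("queued under load: " + scheduler.getQueue().size());

        // Once the delays have elapsed and the tasks have run, the queue drains.
        Thread.sleep(1_500);
        System.out.println("queued after delays elapsed: " + scheduler.getQueue().size());

        scheduler.shutdownNow();
    }
}
```

If the real tasks have finite delays, a queue like this would shrink again some time after the load stops; if the retained size stays flat or keeps growing long after that, a leak becomes the more likely explanation.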

Leonid Kolesnykov January 30, 2025 at 1:53 PM (edited)
Today we carried out a 2-hour CI/CO test with 30 vUsers on mod-pubsub:2.15.4. We took 4 heap dumps: the first 15 minutes after the start, the second at 1 hour 40 minutes, the third at 1 hour 55 minutes, and the fourth 20 minutes after the test finished.
Module resource utilization (CPU, RAM) showed no spikes or clear growth trends.
There is no clear evidence that the problem still exists on the second container. Anyway, I provide the raw data here in case it is needed for future analysis.
But analysing the first container, I can still see the growing trend.
Raw results for 1 container:
First Heap dump
Problem Suspect 1
One instance of java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue loaded by <system class loader> occupies 15,675,992 (15.13%) bytes. The instance is referenced by java.lang.Thread @ 0xdf631158 CompletableFutureDelayScheduler , loaded by <system class loader>.
The thread java.lang.Thread @ 0xdf631158 CompletableFutureDelayScheduler keeps local variables with total size 128 (0.00%) bytes.
The memory is accumulated in one instance of java.util.concurrent.RunnableScheduledFuture[], loaded by <system class loader>, which occupies 15,675,960 (15.13%) bytes.
Second Heap dump
Problem Suspect 1
One instance of java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue loaded by <system class loader> occupies 91,061,464 (51.06%) bytes. The instance is referenced by java.lang.Thread @ 0xdee257b8 CompletableFutureDelayScheduler , loaded by <system class loader>.
The thread java.lang.Thread @ 0xdee257b8 CompletableFutureDelayScheduler keeps local variables with total size 128 (0.00%) bytes.
The memory is accumulated in one instance of java.util.concurrent.RunnableScheduledFuture[], loaded by <system class loader>, which occupies 91,061,432 (51.06%) bytes.
The stacktrace of this Thread is available. See stacktrace. See stacktrace with involved local variables.
Keywords
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue
java.util.concurrent.RunnableScheduledFuture[]
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take()Ljava/util/concurrent/RunnableScheduledFuture;
Unknow
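As a rough worked example, the two DelayedWorkQueue sizes above can be turned into an average growth rate. The sketch assumes that these are the dumps taken at 15 minutes and at 1 hour 40 minutes (100 minutes) into the test, which the comment suggests but does not state explicitly; the class and method names are hypothetical.

```java
public class HeapGrowth {

    /** Average growth in bytes per minute between two heap-dump measurements. */
    static double bytesPerMinute(long startBytes, long endBytes, double startMinute, double endMinute) {
        if (endMinute <= startMinute) {
            throw new IllegalArgumentException("endMinute must be after startMinute");
        }
        return (endBytes - startBytes) / (endMinute - startMinute);
    }

    public static void main(String[] args) {
        // Retained sizes of the DelayedWorkQueue from the first and second heap dumps.
        long firstDump = 15_675_992L;   // assumed to be taken at minute 15
        long secondDump = 91_061_464L;  // assumed to be taken at minute 100

        double rate = bytesPerMinute(firstDump, secondDump, 15, 100);
        System.out.printf("growth: %.0f bytes/min (%.2f MiB/min)%n", rate, rate / (1024.0 * 1024.0));
    }
}
```

Under these assumptions the queue grew by 75,385,472 bytes over 85 minutes, roughly 0.85 MiB per minute.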
Raw results for 2 container:
First Heap dump
Problem Suspect 1
239 instances of org.apache.kafka.common.telemetry.internals.ClientTelemetryReporter, loaded by jdk.internal.loader.ClassLoaders$AppClassLoader @ 0xd9950000 occupy 16,218,376 (17.58%) bytes.
Keywords
org.apache.kafka.common.telemetry.internals.ClientTelemetryReporter
jdk.internal.loader.ClassLoaders$AppClassLoader @ 0xd9950000
Second Heap dump
Problem Suspect 1
One instance of java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue loaded by <system class loader> occupies 36,144,152 (29.43%) bytes. The instance is referenced by java.lang.Thread @ 0xdf741190 CompletableFutureDelayScheduler , loaded by <system class loader>.
The thread java.lang.Thread @ 0xdf741190 CompletableFutureDelayScheduler keeps local variables with total size 128 (0.00%) bytes.
The memory is accumulated in one instance of java.util.concurrent.RunnableScheduledFuture[], loaded by <system class loader>, which occupies 36,144,120 (29.43%) bytes.
The stacktrace of this Thread is available. See stacktrace. See stacktrace with involved local variables.
Keywords
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue
java.util.concurrent.RunnableScheduledFuture[]
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take()Ljava/util/concurrent/RunnableScheduledFuture;
Unknow
Problem Suspect 2
239 instances of org.apache.kafka.common.telemetry.internals.ClientTelemetryReporter, loaded by jdk.internal.loader.ClassLoaders$AppClassLoader @ 0xd9950000 occupy 16,218,376 (13.21%) bytes.
Keywords
org.apache.kafka.common.telemetry.internals.ClientTelemetryReporter
jdk.internal.loader.ClassLoaders$AppClassLoader @ 0xd9950000
Third Heap dump
Problem Suspect 1
One instance of java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue loaded by <system class loader> occupies 41,735,928 (32.51%) bytes. The instance is referenced by java.lang.Thread @ 0xded41a90 CompletableFutureDelayScheduler , loaded by <system class loader>.
The thread java.lang.Thread @ 0xded41a90 CompletableFutureDelayScheduler keeps local variables with total size 128 (0.00%) bytes.
The memory is accumulated in one instance of java.util.concurrent.RunnableScheduledFuture[], loaded by <system class loader>, which occupies 41,735,896 (32.51%) bytes.
The stacktrace of this Thread is available. See stacktrace. See stacktrace with involved local variables.
Keywords
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue
java.util.concurrent.RunnableScheduledFuture[]
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take()Ljava/util/concurrent/RunnableScheduledFuture;
Unknow
Problem Suspect 2
239 instances of org.apache.kafka.common.telemetry.internals.ClientTelemetryReporter, loaded by jdk.internal.loader.ClassLoaders$AppClassLoader @ 0xd9950000 occupy 16,218,376 (12.63%) bytes.
Keywords
org.apache.kafka.common.telemetry.internals.ClientTelemetryReporter
jdk.internal.loader.ClassLoaders$AppClassLoader @ 0xd9950000
Fourth Heap dump
Problem Suspect 1
One instance of java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue loaded by <system class loader> occupies 41,949,944 (32.72%) bytes. The instance is referenced by java.lang.Thread @ 0xded41a90 CompletableFutureDelayScheduler , loaded by <system class loader>.
The thread java.lang.Thread @ 0xded41a90 CompletableFutureDelayScheduler keeps local variables with total size 128 (0.00%) bytes.
The memory is accumulated in one instance of java.util.concurrent.RunnableScheduledFuture[], loaded by <system class loader>, which occupies 41,949,912 (32.72%) bytes.
The stacktrace of this Thread is available. See stacktrace. See stacktrace with involved local variables.
Keywords
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue
java.util.concurrent.RunnableScheduledFuture[]
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take()Ljava/util/concurrent/RunnableScheduledFuture;
Unknow
Problem Suspect 2
239 instances of org.apache.kafka.common.telemetry.internals.ClientTelemetryReporter, loaded by jdk.internal.loader.ClassLoaders$AppClassLoader @ 0xd9950000 occupy 16,218,376 (12.65%) bytes.
Keywords
org.apache.kafka.common.telemetry.internals.ClientTelemetryReporter
jdk.internal.loader.ClassLoaders$AppClassLoader @ 0xd9950000
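One way to read the third and fourth dumps for this container is to check whether the DelayedWorkQueue shrank once the load stopped. The sketch below only does that arithmetic on the figures reported above; the class and method names are hypothetical.

```java
public class PostTestChange {

    /** Relative change from `before` to `after`, in percent. */
    static double percentChange(long before, long after) {
        return 100.0 * (after - before) / before;
    }

    public static void main(String[] args) {
        long duringTest = 41_735_928L; // third dump, 1 hour 55 minutes into the test
        long afterTest = 41_949_944L;  // fourth dump, 20 minutes after the test finished

        System.out.printf("change after load stopped: %+.2f%%%n", percentChange(duringTest, afterTest));
    }
}
```

The retained size changed by about +0.5%, so the queue did not shrink in that window. On its own this does not distinguish a leak from scheduled tasks whose delays are simply longer than 20 minutes.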

Anne Ekblad January 29, 2025 at 6:21 PM
Thank you. I’ll raise this with Vega tomorrow during our Daily. More soon.

Leonid Kolesnykov January 29, 2025 at 8:22 AM
PTF environment RCON: Ramsons release, consortia, Okapi-based. mod-pubsub:2.15.3 was the latest version.
Does mod-pubsub:2.15.4 include changes related to the identified issue? In any case, we need to schedule a test with the new version if needed.
Details
Assignee: Unassigned
Reporter: Leonid Kolesnykov
Priority: P2
Development Team: Vega
Release: Trillium (R2 2025)
RCA Group: TBD

Environment: PTF-ECS
Release: Ramsons
Version: mod-pubsub:2.15.3
Memory growth in mod-pubsub was detected during testing.
An additional 2-hour CI/CO test was carried out with 30 vUsers. Heap dump results:
at the beginning of the test
Problem Suspect 1
One instance of java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue loaded by <system class loader> occupies 10,633,688 (13.58%) bytes. The instance is referenced by java.lang.Thread @ 0xde1a2598 CompletableFutureDelayScheduler , loaded by <system class loader>.
The thread java.lang.Thread @ 0xde1a2598 CompletableFutureDelayScheduler keeps local variables with total size 128 (0.00%) bytes.
The memory is accumulated in one instance of java.util.concurrent.RunnableScheduledFuture[], loaded by <system class loader>, which occupies 10,633,656 (13.58%) bytes.
at the end of the test
Problem Suspect 1
One instance of java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue loaded by <system class loader> occupies 91,407,416 (57.49%) bytes. The instance is referenced by java.lang.Thread @ 0xdd9e95c0 CompletableFutureDelayScheduler , loaded by <system class loader>.
The thread java.lang.Thread @ 0xdd9e95c0 CompletableFutureDelayScheduler keeps local variables with total size 128 (0.00%) bytes.
The memory is accumulated in one instance of java.util.concurrent.RunnableScheduledFuture[], loaded by <system class loader>, which occupies 91,407,384 (57.49%) bytes.
Stacktrace:
Memory & CPU graph for the CI/CO longevity test with 30 vUsers, showing that one task restarted at 102% memory consumption.