[FOLIO-2435] Spike: Running vert.x HttpClient requests in parallel Created: 28/Jan/20  Updated: 24/Feb/22  Resolved: 24/Feb/22

Status: Closed
Project: FOLIO
Components: None
Affects versions: None
Fix versions: None

Type: Task
Priority: P2
Reporter: Julian Ladisch
Assignee: Steve Ellis
Resolution: Done
Votes: 0
Labels: platform-backlog
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Issue links:
Blocks
blocks CIRC-468 Location and Drools-Rules Fetch Perfo... Open
Sprint: CP: sprint 134
Story Points: 0.5
Development Team: Core: Platform

 Description   

Marc Johnson wrote in CIRC-468:

I think we need to be careful about doing HTTP requests in parallel. I don't know how the vert.x HttpClient handles this. There is likely a need to improve the logging to help correlate requests and responses.

High priority because many FOLIO modules already use parallel processing with CompositeFuture:
https://github.com/search?q=org%3Afolio-org+compositefuture&type=Code
HttpClient documentation: https://vertx.io/docs/apidocs/io/vertx/core/http/HttpClient.html



 Comments   
Comment by Marc Johnson [ 28/Jan/20 ]

Julian Ladisch Thank you for raising this issue.

Could you please expand on what you expect the output of the spike to be?

Comment by Julian Ladisch [ 28/Jan/20 ]

Is vert.x HttpClient designed to properly handle parallel HTTP requests? If yes, we can close this issue without any further action. If there are restrictions, they should be documented. If HttpClient is not designed to handle parallel HTTP requests, we need to change all existing code where HttpClient (or WebClient, which is based on HttpClient) is used in parallel.
See also "Calling the Service More Than Once" in "Clement Escoffier: Building Reactive Microservices in Java" https://www.oreilly.com/programming/free/files/building-reactive-microservices-in-java.pdf#page=35

Comment by Marc Johnson [ 28/Jan/20 ]

Julian Ladisch

Ok, thank you for that context.

Is vert.x HttpClient designed to properly handle parallel HTTP requests?

https://vertx.io/docs/vertx-web-client/java/ suggests that the library is asynchronous. Is that sufficient for the definition of parallel here or not?

See also "Calling the Service More Than Once" in "Clement Escoffier: Building Reactive Microservices in Java"

That example uses Vert.x's RxJava extensions. Are you suggesting that FOLIO needs to do the same?

Comment by Julian Ladisch [ 28/Jan/20 ]

HttpClient is asynchronous, allowing other tasks to execute while it waits for the response.
This does not answer the question whether one HttpClient instance is designed to handle two parallel HTTP requests, or if we need two HttpClient instances.
RxJava is a wrapper (syntactic sugar) that is automatically generated from the base vert.x code: https://vertx.io/docs/vertx-rx/java/#_rxified_api
That example from Clement Escoffier shows one HttpClient running two HTTP requests in parallel; I don't know whether this is an error or intended.

Comment by Marc Johnson [ 28/Jan/20 ]

Julian Ladisch

This does not answer the question whether one HttpClient instance is designed to handle two parallel HTTP requests, or if we need two HttpClient instances.

Ok, so by parallel, that means asynchronous and multiple requests in progress at the same time?

The documentation suggests that we should not use more than one instance of HttpClient or WebClient:

In most cases, a Web Client should be created once on application startup and then reused. Otherwise you lose a lot of benefits such as connection pooling and may leak resources if instances are not closed properly.

Comment by Julian Ladisch [ 28/Jan/20 ]

by parallel, that means asynchronous and multiple requests in progress at the same time?

Yes.
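
To make "multiple requests in progress at the same time on one client instance" concrete, here is a minimal sketch. It deliberately uses the JDK's java.net.http.HttpClient and a throwaway local server so that it runs without extra dependencies; it is not the vert.x HttpClient and says nothing about vert.x internals. The class name and the paths /locations and /rules are hypothetical, chosen only for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class SharedClientSketch {
  // Created once and reused for every request (JDK client, NOT the vert.x HttpClient).
  private static final HttpClient CLIENT = HttpClient.newHttpClient();

  // Starts a GET without blocking; the future completes with the response body.
  static CompletableFuture<String> fetch(int port, String path) {
    HttpRequest request =
        HttpRequest.newBuilder(URI.create("http://127.0.0.1:" + port + path)).build();
    return CLIENT.sendAsync(request, HttpResponse.BodyHandlers.ofString())
        .thenApply(HttpResponse::body);
  }

  // Hypothetical demo: a throwaway local server that echoes the request path.
  static List<String> fetchBoth() {
    HttpServer server;
    try {
      server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    server.createContext("/", exchange -> {
      byte[] body = exchange.getRequestURI().getPath().getBytes(StandardCharsets.UTF_8);
      exchange.sendResponseHeaders(200, body.length);
      try (OutputStream out = exchange.getResponseBody()) {
        out.write(body);
      }
    });
    server.start();
    try {
      int port = server.getAddress().getPort();
      // Both requests are started before either response is awaited.
      CompletableFuture<String> locations = fetch(port, "/locations");
      CompletableFuture<String> rules = fetch(port, "/rules");
      // Wait for both, similar in spirit to joining futures with CompositeFuture.
      CompletableFuture.allOf(locations, rules).join();
      return List.of(locations.join(), rules.join());
    } finally {
      server.stop(0);
    }
  }

  public static void main(String[] args) {
    System.out.println(fetchBoth());
  }
}
```

The point of the sketch is only the shape of the calling code: one long-lived client, two requests started back to back, and a single join on both results. Whether the vert.x client behaves the same way is the question this spike is meant to answer.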

Comment by Steve Ellis [ 24/Feb/22 ]

This came up in our backlog as being open and I said something like "It's vertx so it's async" and suggested closing it. I think there are two questions in the conversation here: 1) is HttpClient "properly" async, 2) are we instantiating HttpClient in the right way?

I think the answer to 1 has to be yes, considering how widely used the library is and that it follows the normal vertx pattern of using either callbacks or futures to handle IO. For example, you call send and then receive an AsyncResult in a callback.

I think the answer to 2 is that it depends on our code. As Marc mentions in the comments, the docs for this library suggest one instance per vertx process. In other words, it should be static or used as a singleton. Don't create a new HttpClient instance for every request.

Adam noticed this just yesterday in the tests for mod-inventory: too many open sockets, which came from too many instances of HttpClient.

Async IO is at its core about defining a way to wait for a result without blocking the thread. This is why in things like vertx and node the result arrives in a callback. There was also a question in the comments about parallelism. It helps to separate two things here. The number of concurrent requests (requests in flight at the same time) is not tied to the number of threads: a single event-loop thread can have many requests outstanding, and the practical limit is usually the client's connection pool settings rather than the core count. Parallelism in the strict sense (callbacks actually executing at the same instant) depends on how many threads the processor can run simultaneously, so 8 cores will give you more of it than 6. The beauty of vertx and all modern approaches to async is that the developer doesn't have to worry much about this in most cases. You just follow the pattern that the library or language defines (callbacks, async/await, coroutines, whatever) and you get the benefits of multicore processors.
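
As a rough illustration of waiting for a result without blocking the thread, the sketch below simulates requests with timers on a single scheduler thread. It uses only the JDK (CompletableFuture and a single-threaded ScheduledExecutorService), not vert.x; the class name, method name, and latency values are made up for the example, and the single thread merely stands in for an event loop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class SingleThreadConcurrencySketch {
  // Starts `count` simulated requests on ONE thread and returns the elapsed milliseconds.
  static long runSimulatedRequests(int count, long latencyMillis) {
    ScheduledExecutorService eventLoop = Executors.newSingleThreadScheduledExecutor();
    try {
      long start = System.nanoTime();
      List<CompletableFuture<Integer>> inFlight = new ArrayList<>();
      for (int i = 0; i < count; i++) {
        int id = i;
        CompletableFuture<Integer> response = new CompletableFuture<>();
        // Waiting occupies no thread; the callback runs only when the timer fires.
        eventLoop.schedule(() -> { response.complete(id); }, latencyMillis, TimeUnit.MILLISECONDS);
        inFlight.add(response);
      }
      // Block the caller (not the scheduler thread) until every simulated response arrived.
      CompletableFuture.allOf(inFlight.toArray(new CompletableFuture[0])).join();
      return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    } finally {
      eventLoop.shutdown();
    }
  }

  public static void main(String[] args) {
    long elapsed = runSimulatedRequests(1000, 200);
    System.out.println("1000 simulated requests finished in " + elapsed + " ms on one thread");
  }
}
```

If each simulated request blocked the thread for its full latency, the total time would be roughly count times latency. Because the waits overlap, the total stays close to a single latency, regardless of how many cores the machine has.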

Some references:

Generated at Thu Feb 08 23:20:37 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.