[FOLIO-2435] Spike: Running vert.x HttpClient requests in parallel Created: 28/Jan/20 Updated: 24/Feb/22 Resolved: 24/Feb/22 |
|
| Status: | Closed |
| Project: | FOLIO |
| Components: | None |
| Affects versions: | None |
| Fix versions: | None |
| Type: | Task | Priority: | P2 |
| Reporter: | Julian Ladisch | Assignee: | Steve Ellis |
| Resolution: | Done | Votes: | 0 |
| Labels: | platform-backlog | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original estimate: | Not Specified | ||
| Issue links: |
|
||||||||
| Sprint: | CP: sprint 134 | ||||||||
| Story Points: | 0.5 | ||||||||
| Development Team: | Core: Platform | ||||||||
| Description |
|
Marc Johnson wrote in CIRC-468:
High priority because many FOLIO modules already use parallel processing with CompositeFuture: |
| Comments |
| Comment by Marc Johnson [ 28/Jan/20 ] |
|
Julian Ladisch Thank you for raising this issue. Please can you expand upon what your expectations for the output of the spike are? |
| Comment by Julian Ladisch [ 28/Jan/20 ] |
|
Is vert.x HttpClient designed to properly handle parallel HTTP requests? If yes, we can close this issue without any further action. If there are restrictions they should be documented. If HttpClient is not designed to handle parallel HTTP requests we need to change all existing code where HttpClient (or WebClient based on HttpClient) is used in parallel. |
| Comment by Marc Johnson [ 28/Jan/20 ] |
|
Ok, thank you for that context.
https://vertx.io/docs/vertx-web-client/java/ suggests that the library is asynchronous. Is that sufficient for the definition of parallel here or not?
That example uses Vert.x's RxJava extensions. Are you suggesting that FOLIO needs to do the same? |
| Comment by Julian Ladisch [ 28/Jan/20 ] |
|
HttpClient is asynchronous, allowing other tasks to execute while it waits for the response. |
| Comment by Marc Johnson [ 28/Jan/20 ] |
Ok, so parallel here means asynchronous, with multiple requests in progress at the same time? The documentation suggests that we should not use more than one instance of HttpClient or WebClient:
|
| Comment by Julian Ladisch [ 28/Jan/20 ] |
Yes. |
| Comment by Steve Ellis [ 24/Feb/22 ] |
|
This came up in our backlog as still being open, and I said something like "It's vertx so it's async" and suggested closing it.
I think there are two questions in the conversation here: 1) is HttpClient "properly" async, and 2) are we instantiating HttpClient in the right way?
I think the answer to 1 has to be yes, considering how widely used the library is and that it follows the normal vertx pattern of using either callbacks or futures to handle IO. For example, it calls send and then gets an AsyncResult in a callback.
I think the answer to 2 is that it depends on our code. As Marc mentions in the comments, the docs for this library suggest one instance per vertx process. In other words, it should be static or used as a singleton. Don't new up an HttpClient instance for every request. Adam noticed this just yesterday in the tests for mod-inventory. Too many sockets = too many instances of HttpClient.
Async IO is at its core about defining a way to wait for a result without blocking the thread. This is why in things like vertx and node the result arrives in a callback.
There was also a question in the comments about parallelism. The number of requests that can be in progress at the same time is not tied to the core count, because a single event loop thread can have many requests waiting on IO; it is bounded by things like the client's connection pool size. The amount of true parallelism (handlers actually executing at the same moment) does depend on how many simultaneous threads a given processor can handle, so 8 cores will give you more parallelism than 6, etc. The beauty of vertx and all modern approaches to async is that the developer doesn't have to worry much about this in most cases. You just follow the pattern that the library or language defines (callbacks, async/await, coroutines, whatever) and you get the benefits of multicore processors.
Some references:
|
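To illustrate the pattern described in the last comment (one shared client, several requests in progress at once, results combined once all have completed), here is a minimal sketch. It deliberately uses the JDK's built-in java.net.http.HttpClient and CompletableFuture rather than Vert.x, so that it runs without any dependencies; CompletableFuture.allOf is only a rough stand-in for Vert.x's CompositeFuture. The class name, the throwaway local server, and the /item path are all hypothetical and exist only for this example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ParallelRequestsSketch {

    // One client for the whole process, shared by every request
    // (assumption: mirrors the "one instance, not one per request" advice).
    private static final HttpClient SHARED_CLIENT = HttpClient.newHttpClient();

    /** Starts every request before waiting on any of them, then returns the bodies in request order. */
    static List<String> fetchAll(List<URI> uris) {
        List<CompletableFuture<String>> pending = new ArrayList<>();
        for (URI uri : uris) {
            HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
            // sendAsync returns immediately; the response arrives later through the future.
            pending.add(SHARED_CLIENT
                .sendAsync(request, HttpResponse.BodyHandlers.ofString())
                .thenApply(HttpResponse::body));
        }
        // Wait until all requests have completed (rough analogue of a composite future).
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
        List<String> bodies = new ArrayList<>();
        for (CompletableFuture<String> future : pending) {
            bodies.add(future.join());
        }
        return bodies;
    }

    /** Runs fetchAll against a hypothetical local server so that no network access is needed. */
    static List<String> runDemo(int requestCount) {
        HttpServer server;
        try {
            // Port 0 lets the operating system pick a free port.
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        server.createContext("/item", exchange -> {
            // Echo the query string back so each response can be told apart.
            byte[] body = ("item " + exchange.getRequestURI().getQuery())
                .getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        ExecutorService serverThreads = Executors.newFixedThreadPool(4);
        server.setExecutor(serverThreads);
        server.start();
        try {
            int port = server.getAddress().getPort();
            List<URI> uris = new ArrayList<>();
            for (int i = 0; i < requestCount; i++) {
                uris.add(URI.create("http://127.0.0.1:" + port + "/item?id=" + i));
            }
            return fetchAll(uris);
        } finally {
            server.stop(0);
            serverThreads.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo(5));
    }
}
```

The point of the sketch is that the calling thread never blocks per request: all requests are started first and the wait happens once, at the end. In FOLIO code the same shape would use the Vert.x client and futures instead of the JDK ones.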