[OKAPI-248] Can't enable non-okapi deployed modules in OKAPI cluster mode Created: 06/Feb/17  Updated: 03/Jan/20  Resolved: 08/Feb/17

Status: Closed
Project: Okapi
Components: None
Affects versions: None
Fix versions: None

Type: Bug Priority: P2
Reporter: Hongwei Ji Assignee: Heikki Levanto
Resolution: Done Votes: 0
Labels: back-end, sprint8
Remaining Estimate: Not Specified
Time Spent: 3 hours, 30 minutes
Original estimate: Not Specified

Attachments: File 248.sh    
Issue links:
Relates
relates to OKAPI-719 Discovery reports service on node tha... Closed
Sprint:

 Description   

To reproduce, first register a running module (for example, users-module) with the Okapi discovery endpoint using

{"srvcId":"users-module", "instId":"users-module", "url":"http://non-okapi-host:port"}

and then try to enable the module for an existing tenant. This fails with the error "No running instances for module users-module. Can not invoke tenant init". The problem seems to be in DiscoveryManager.java: commenting out the check "if (!clusterManager.getNodes().contains(md.getNodeId()))" avoids the error.
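For anyone retracing the registration step, here is a minimal sketch of the discovery request in Java. It only builds the request and does not send it. The class and method names are made up for illustration, the Okapi base URL http://localhost:9130 is an assumption about a local setup, and the module URL is the placeholder from the report, not a real address.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class RegisterExternalModule {
    // Assumption: Okapi is reachable here; adjust to your own deployment.
    static final String OKAPI_URL = "http://localhost:9130";

    // Builds the discovery descriptor from the report. There is no nodeId,
    // because the module was started outside Okapi.
    static String descriptor(String srvcId, String instId, String url) {
        return String.format(
            "{\"srvcId\":\"%s\", \"instId\":\"%s\", \"url\":\"%s\"}",
            srvcId, instId, url);
    }

    // Builds (but does not send) the POST to the discovery endpoint.
    static HttpRequest registrationRequest(String body) {
        return HttpRequest.newBuilder()
            .uri(URI.create(OKAPI_URL + "/_/discovery/modules"))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }

    public static void main(String[] args) {
        // Placeholder URL copied from the report; substitute the real host and port.
        String body = descriptor("users-module", "users-module",
            "http://non-okapi-host:port");
        HttpRequest request = registrationRequest(body);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(body);
    }
}
```

Sending the request with java.net.http.HttpClient against a single-node Okapi and against one started in cluster mode should show the difference described in this issue.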



 Comments   
Comment by Heikki Levanto [ 07/Feb/17 ]

I see the problem.
It happens when a module has an external URL and also supports the /_/tenant interface: the logic that finds a running instance to send the POST request to the tenant interface gets confused.

I am on it, hope to fix it today.

Comment by Heikki Levanto [ 07/Feb/17 ]

Hmm, as far as I can see, our ModuleTest does the same kind of thing with sample-module3, around lines 815-876. I will try to reproduce this manually.

Comment by Heikki Levanto [ 07/Feb/17 ]

I could not reproduce this manually. See the attached 248.sh script for the steps I took. Can you see how this differs from the way you are doing things?

Comment by Heikki Levanto [ 07/Feb/17 ]

Aha, running it in clustering mode reproduced the problem.

Changed the end of the Okapi invocation to `cluster -cluster-host 127.0.0.1`. Also had to increase the sleep to 5 seconds, so the cluster has time to get going.

Now I have something to hunt!

Comment by Heikki Levanto [ 07/Feb/17 ]

The error happens already when POSTing to discovery: listing /_/discovery/modules returns an empty list in clustering mode. Digging deeper.

Comment by Heikki Levanto [ 08/Feb/17 ]

It was a simple logic error in the code that excludes module instances on nodes that are no longer running, just as Hongwei Ji reported. The check is now skipped for module instances that have an explicit URL.
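A rough sketch of the corrected filtering, for illustration only. This is not the actual DiscoveryManager code: the Deployment class, the usable method, and the rule that "explicit URL" means a URL with no node id are all assumptions made for this example.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class DiscoveryFilterSketch {
    // Hypothetical stand-in for a discovery entry; not Okapi's real class.
    static final class Deployment {
        final String srvcId;
        final String instId;
        final String nodeId; // null when the module was not deployed by Okapi
        final String url;

        Deployment(String srvcId, String instId, String nodeId, String url) {
            this.srvcId = srvcId;
            this.instId = instId;
            this.nodeId = nodeId;
            this.url = url;
        }
    }

    // Assumption: an instance registered with a URL and no node id is external.
    static boolean hasExplicitUrl(Deployment d) {
        return d.nodeId == null && d.url != null;
    }

    // Keeps external instances unconditionally; keeps Okapi-deployed instances
    // only if their node is still part of the cluster.
    static List<Deployment> usable(List<Deployment> all, Set<String> liveNodes) {
        return all.stream()
            .filter(d -> hasExplicitUrl(d)
                || (d.nodeId != null && liveNodes.contains(d.nodeId)))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Node ids and the 10.0.0.x addresses are invented sample data.
        List<Deployment> all = List.of(
            new Deployment("users-module", "users-module", null,
                "http://non-okapi-host:port"),
            new Deployment("mod-a", "mod-a-1", "node-1", "http://10.0.0.1:9131"),
            new Deployment("mod-a", "mod-a-2", "node-2", "http://10.0.0.2:9131"));
        // Only node-1 is alive, so mod-a-2 is dropped and users-module is kept.
        for (Deployment d : usable(all, Set.of("node-1"))) {
            System.out.println(d.instId);
        }
    }
}
```

Before the fix, the node check applied to every instance, so an entry without a node id could never pass it in cluster mode.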

Committed and merged into master.

Generated at Thu Feb 08 23:05:56 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.