[FOLIO-2520] reference environment build failures with Okapi 2.37.1 Created: 20/Mar/20  Updated: 03/Jun/20  Resolved: 27/Mar/20

Status: Closed
Project: FOLIO
Components: None
Affects versions: None
Fix versions: None

Type: Task Priority: P2
Reporter: Ian Hardy Assignee: Ian Hardy
Resolution: Done Votes: 0
Labels: devops-backlog
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Attachments: File okapi-2-truncated.log     Text File relevant-excerpt.txt    
Issue links:
Blocks
is blocked by OKAPI-819 Missing permissions POSTed to permiss... Closed
Relates
relates to FOLIO-2528 Build reference environment fails: mo... Closed
Sprint: DevOps: sprint 84, DevOps: sprint 85
Development Team: FOLIO DevOps

 Description   

Many of the reference environment builds failed. The failures do not all occur at the same point in the Ansible build, and a re-run of the same build does not necessarily fail on the same task.



 Comments   
Comment by Ian Hardy [ 20/Mar/20 ]

Since the failures are inconsistent, I'm thinking it probably doesn't have to do with the plays themselves or any recent FOLIO change. Typical example:

TASK [folio-ansible/roles/mod-inventory-mods : Load sample data from MODS ingest] ***
FATAL: command execution failed
hudson.AbortException: Ansible playbook execution failed
	at org.jenkinsci.plugins.ansible.AnsiblePlaybookBuilder.perform(AnsiblePlaybookBuilder.java:262)
	at org.jenkinsci.plugins.ansible.workflow.AnsiblePlaybookStep$AnsiblePlaybookExecution.run(AnsiblePlaybookStep.java:400)
	at org.jenkinsci.plugins.ansible.workflow.AnsiblePlaybookStep$AnsiblePlaybookExecution.run(AnsiblePlaybookStep.java:321)
	at org.jenkinsci.plugins.workflow.steps.AbstractSynchronousNonBlockingStepExecution$1$1.call(AbstractSynchronousNonBlockingStepExecution.java:47)
	at hudson.security.ACL.impersonate(ACL.java:359)
	at org.jenkinsci.plugins.workflow.steps.AbstractSynchronousNonBlockingStepExecution$1.run(AbstractSynchronousNonBlockingStepExecution.java:44)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
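
Because the failing task varies between runs, it can help to tally which task was running when each build aborted. The following is a minimal sketch, not part of the actual build tooling: it assumes the console text of each build has been saved locally, that task banners look like the `TASK [...] ***` line above, and that the abort is marked by a line starting with `FATAL: command execution failed`. The names `failing_task` and `tally_failures` are hypothetical, and real console output with timestamp prefixes would need the patterns adjusted.

```python
import re
from collections import Counter

# Matches Ansible task banners such as:
#   TASK [folio-ansible/roles/mod-inventory-mods : Load sample data from MODS ingest] ***
TASK_RE = re.compile(r"^TASK \[(?P<name>.+?)\] \*+\s*$")

# Assumption: the Jenkins abort is reported on a line starting with this text.
FAILURE_MARKER = "FATAL: command execution failed"


def failing_task(console_text, marker=FAILURE_MARKER):
    """Return the last TASK banner seen before the failure marker, or None."""
    last_task = None
    for line in console_text.splitlines():
        match = TASK_RE.match(line)
        if match:
            last_task = match.group("name")
        elif line.startswith(marker):
            return last_task
    return None  # no failure marker found in this console log


def tally_failures(console_texts):
    """Count how often each task was the last one running when a build failed."""
    tasks = (failing_task(text) for text in console_texts)
    return Counter(task for task in tasks if task is not None)
```

A tally spread across many unrelated tasks would be consistent with the guess that the plays themselves are not at fault; a tally concentrated on one task would point the other way.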
Comment by Ian Hardy [ 20/Mar/20 ]

From John: cleaned up the EC2 worker nodes where builds are executed, updated Jenkins and its plugins, and restarted the Jenkins process. Still seeing similar failures.

Comment by Ian Hardy [ 20/Mar/20 ]

Pinned Okapi to the previous version (2.37.0) and builds started passing. This is an empirical discovery, not a rational one, so I'll configure a branch that uses the latest version to collect Okapi logs for Adam Dickmeiss.

Comment by Craig McNally [ 21/Mar/20 ]

Isn't 2.37.0 the official Okapi version for Fameflower? Please share your findings, Adam Dickmeiss.

Never mind, I misread the above comment.

Comment by Ian Hardy [ 21/Mar/20 ]

I wasn't able to come up with anything conclusive this afternoon since the failures were not consistent. I can post Okapi logs from one of the builds with problems, which is perhaps better than nothing. [^okapi-2.log] relevant-excerpt.txt okapi-2-truncated.log

The shorter excerpt corresponds with the particular failure in this build: https://jenkins-aws.indexdata.com/job/Automation/job/folio-snapshot-test/132/console

The longer one provides more context.

Generated at Thu Feb 08 23:21:15 UTC 2024 using Jira 1001.0.0-SNAPSHOT#100246-sha1:7a5c50119eb0633d306e14180817ddef5e80c75d.