Attendees:
Off-Hours Guidelines
D4. Define guidelines/best practices around pausing/stopping environments when they're not in use - e.g. off-hours/weekends/etc.
...
- Yogesh Kumar added them to the wiki: Off-hour Environment Downscale, with edits from last week
- Maccabee Levine updated the required information for a new environment to reference it.
Today:
- Good?
- Yogesh Kumar still working on a few details with Kitfox
- Required info change ok.
- Should release environments also have off-hours? (From Kitfox sprint review: Peter Murray discussed bringing to PC and TC whether we bring the same weekend suspension to the hosted release environments. Peter will look at what savings that would be.)
- Do we have a way to determine how much those envs are used overnight and on weekends? They are more publicly known.
- It saves money, so why not? Multiple time zones make it harder.
- Maybe start with weekends only? Peter Murray will estimate cost savings. No actual env changes for that right now.
- Look at logs also, see who accesses them over the weekend.
- Usage may be higher closer to a flower release.
- Look at snapshot, snapshot-2, and the two release environments
- Get input from PC, SIGs, CC, at the same time as the rest of the input.
- Don't shut down on weekends any system used for demos on the public website/wiki. Consider changing that website language? Flag for CC question (www.folio.org).
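A rough sketch of the two checks discussed above (estimating weekend savings and seeing how much access happens on weekends). All names and numbers here are hypothetical placeholders, not actual FOLIO costs or log formats; it assumes a flat hourly cost and ISO-8601 timestamps already extracted from the access logs.

```python
from datetime import datetime

HOURS_PER_WEEK = 7 * 24


def weekend_suspension_savings(hourly_cost, suspended_hours=48, residual_fraction=0.0):
    """Return (weekly_cost, weekly_savings) for one environment.

    residual_fraction is the share of the hourly cost that is still billed
    while the environment is suspended (for example storage). Assumed value.
    """
    weekly_cost = hourly_cost * HOURS_PER_WEEK
    weekly_savings = hourly_cost * suspended_hours * (1.0 - residual_fraction)
    return weekly_cost, weekly_savings


def weekend_share(timestamps):
    """Fraction of ISO-8601 timestamps that fall on Saturday or Sunday.

    The weekday is taken from each timestamp as written, so the result
    depends on the time zone the logs were recorded in.
    """
    parsed = [datetime.fromisoformat(t) for t in timestamps]
    if not parsed:
        return 0.0
    weekend_hits = sum(1 for d in parsed if d.weekday() >= 5)
    return weekend_hits / len(parsed)


if __name__ == "__main__":
    cost, savings = weekend_suspension_savings(hourly_cost=10.0)
    print(f"weekly cost {cost:.2f}, weekend savings {savings:.2f}")
    print(weekend_share(["2024-03-01T10:00:00", "2024-03-02T10:00:00"]))
```

A full 48-hour weekend suspension caps the savings at 48/168 of the weekly cost (about 29%), less whatever is still billed while suspended.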
Budgets / Cost Anomaly Detection
...
Today:
- Review new section on budget page, "Review by AWS Cost Review Group". Accurate?
- Edited, now good.
- Review draft environment "price list" / "recipes" from Yogesh Kumar if ready.
- Look at next week.
- What work has to be done to the budgets, budget alerts, anomaly detection, rightsizing recommendations?
- to the budget alerts?
- to the anomaly detection?
- to the rightsizing recommendations?
- Two alerts are set up: RDS and OpenSearch. They arrive as emails and Slack notifications.
- Also an alert set up for a budget that is everything other than RDS, OpenSearch and Compute.
- Anomaly detection has been set up. Group agrees it works for now, improve with each iteration.
- Regular review?
- ACRG should annually review the Budgets, Budget Alerts, Cost Anomaly Detection. Look at Rightsizing Recommendations after each flower release.
- No need for team-specific alerts. Regular alerts would point us to the team responsible.
Reviewing Environments to Shut Down
...
- Mark Veksler to draft guidelines on who should have permissions for which operations in AWS, i.e. what each team will be allowed to do. Link from ACRG doc.
- Didn't get to this.
Today:
...
Today:
- Permissions on operations in AWS? Right now AWS is just Kitfox, but Jenkins jobs are available to dev teams.
- Kitfox prefers a self-service model.
- Yogesh Kumar will update the environment lifecycle document to indicate what dev teams can do.
AWS Environments that are not for dev teams. Do we apply similar processes? (If so how?)
- Do we need any other procedures for those four environments? (snapshot, snapshot-2, and the two release environments)
- Consensus: no.
Environment requests for the existing team environments when the new procedures are approved
- After we go live with the process, ask teams to submit the environment request for their existing environments. Starting with X release (defined during community review process).
Off-hours shutdown during weekday evenings
- Kitfox is looking at this. What would work best for each team? Ticket pending.
- We can at least look at the findings, and decide to do something or not. Harder because of the teams' geographical spread.
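To illustrate why geographical spread makes a weekday-evening shutdown harder, here is a minimal sketch that finds the UTC hours during which no team is inside its local working day. The team names, UTC offsets, and the 08:00 to 18:00 working window are assumptions for illustration only; real offsets would also shift with daylight saving time.

```python
def common_off_hours_utc(team_offsets, work_start=8, work_end=18):
    """Return the UTC hours (0-23) when every team is outside working hours.

    team_offsets maps a team name to its UTC offset in whole hours.
    A team counts as busy when its local hour is in [work_start, work_end).
    """
    off_hours = []
    for utc_hour in range(24):
        busy = any(
            work_start <= (utc_hour + offset) % 24 < work_end
            for offset in team_offsets.values()
        )
        if not busy:
            off_hours.append(utc_hour)
    return off_hours


if __name__ == "__main__":
    # Hypothetical teams spread across three regions.
    teams = {"team-a": -5, "team-b": 1, "team-c": 5}
    print(common_off_hours_utc(teams))
```

With these three hypothetical offsets the shared window is only four hours long (UTC 23:00 through 02:59), which is why a per-team schedule may work better than one schedule for all environments.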