- Organize conversation with hosting vendors to identify current offerings, concerns
- Organize conversation with implementers about reporting platforms and hosting plans (blocked by Recruit from new institutions not represented)
Summary of conversation on LDP1 Hosting in Sys Ops & Management SIG, October 2022
- LDP1 Deployment and Operations (Slides from Wayne Schneider at Index Data)
- Official notes from Sys Ops meeting
- Angela's notes from meeting
- FOLIO Sys Ops & Mgt SIG (21-Oct-2022 09:58 Eastern U.S. time) | FOLIO Meeting Recordings (openlibraryfoundation.org)
Best practices mentioned
...
Experiences and Strategies from Users/DevOps Providers
The following experiences were shared by people with varying amounts of experience hosting LDP1, either as a third-party hosting provider or within a library organization. These suggestions are not intended to supersede specific recommendations by the developers of the LDP1 software.
General tips
- Always follow the LDP1 Administrator Guide.
- The production database should be a standalone Postgres instance with dedicated storage. Only use it for FOLIO reporting.
- Use ldp_add_column.conf to make sure tables in LDP1 contain certain columns whether or not there are data for those columns. This helps make sure that automated derived table queries run correctly, even if the data are absent.
Periodic extraction of data
- LDP1, ldpmarc, and folio-analytics data extraction jobs need to be run on a schedule and in sequence. Possible scheduling services include cron, Jenkins, and Kubernetes CronJobs.
- For the order of extraction jobs, one might try LDP1, then ldpmarc, then folio-analytics derived tables.
- Running the jobs once a day tends to work well.
- Avoid running large queries against the reporting database while the data update is in progress, because the update uses a lot of system resources.
- Daily incremental update for ldpmarc is quite fast (e.g., 10 minutes for University of Chicago).
- For performance, run with network proximity between the FOLIO and LDP databases.
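The advice above to run the extraction jobs in sequence can be sketched as a small wrapper that a scheduler (cron, Jenkins, or a Kubernetes CronJob) calls once a day. This is a minimal sketch and not part of LDP1: `run_in_sequence` is a hypothetical helper, and the job names and commands passed to it are placeholders to be replaced with the actual LDP1, ldpmarc, and folio-analytics invocations from their own documentation.

```python
import subprocess
import sys


def run_in_sequence(jobs):
    """Run (name, argv) jobs in order and stop at the first failure.

    Returns the names of the jobs that completed successfully.
    """
    completed = []
    for name, argv in jobs:
        result = subprocess.run(argv)
        if result.returncode != 0:
            # Later jobs depend on earlier ones, so do not continue.
            print(f"{name} failed with exit code {result.returncode}",
                  file=sys.stderr)
            break
        completed.append(name)
    return completed


# Placeholder command (assumption): a no-op standing in for a real job.
NOOP = [sys.executable, "-c", "pass"]

# Hypothetical job list in the order suggested above.
EXAMPLE_JOBS = [
    ("ldp1-update", NOOP),
    ("ldpmarc", NOOP),
    ("folio-analytics-derived-tables", NOOP),
]
```

Calling `run_in_sequence(EXAMPLE_JOBS)` returns all three names when every step exits with code 0. Stopping at the first failure keeps ldpmarc and the derived tables from running against a partial LDP1 update.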
Backups / disaster recovery:
- There are different options for a PostgreSQL backup solution (for a production LDP).
- Some run on Amazon AWS, using RDS Postgres services for hosting, including disaster recovery (7 days of snapshots).
- Others have local snapshots. 7 days is likely enough, because LDP keeps history of the records internally.
Cloning
- Q: When you clone the FOLIO production environment, do you clone the LDP1 data over, or do you build the LDP1 data back from the FOLIO data that you have cloned?
- A (Texas A&M): We don't clone the data over, but we re-build it from scratch. Or, if I upgrade, I just upgrade the LDP. In that way it preserves the history.
- A (Index Data): If we refresh the staging environment for a tenant, we will re-build LDP from scratch.
Timing for upgrades / testing
- One model for LDP1 and ldpmarc: run two instances (staging and production). LDP1 and ldpmarc releases are both fairly independent of FOLIO releases; new versions come out on their own cadence. When there is a new version of LDP1 or ldpmarc, implement it in staging. If it's more than a point upgrade, might invite users to test. Follow that with an upgrade of production.
- An upgrade in LDP1 is really simple. You invoke the LDP server with an upgrade-database command. Then LDP knows what version it is talking to.
- For ldpmarc, no upgrade process is necessary. If needed, the new version will automatically perform a full update (rather than incremental update).
- One model for FOLIO Analytics: FOLIO Analytics is released with specific FOLIO flower releases, so upgrade those together. Include LDP1 in standard FOLIO testing. Apart from that, invite users to test only when folio-analytics tables have changed.
- Another model: have a special Jenkins job that creates what is called the LDP engine. This engine includes all 3 components. Use a Docker image for all the setups for the Flower Release. When a new version of LDP comes out, build a new Docker image with the latest versions. Then do a smoke test. If everything works fine, create recommendations for the upcoming Flower Release. Use these until the next flower release.
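The staging-then-production model for LDP1 and ldpmarc described above can be sketched as a small decision helper. This is only an illustration under stated assumptions: it assumes version strings of the form `major.minor.patch`, it interprets a "point upgrade" as a change in the last component only, and the function names and version numbers are hypothetical.

```python
def parse_version(version):
    """Split a 'major.minor.patch' string into a tuple of integers."""
    return tuple(int(part) for part in version.strip().lstrip("v").split("."))


def is_point_upgrade(current, new):
    """Return True when only the last (patch) component increases."""
    cur, nxt = parse_version(current), parse_version(new)
    return (
        len(cur) == len(nxt)
        and cur[:-1] == nxt[:-1]
        and nxt[-1] > cur[-1]
    )


def upgrade_plan(current, new):
    """List the steps for moving from `current` to `new`."""
    steps = ["upgrade staging"]
    if not is_point_upgrade(current, new):
        # Larger upgrades may change behavior that users rely on.
        steps.append("invite users to test on staging")
    steps.append("upgrade production")
    return steps
```

For example, `upgrade_plan("1.7.2", "1.7.3")` yields only the staging and production steps, while `upgrade_plan("1.7.2", "1.8.0")` adds the user-testing step in between.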
Logging
- Some use container log aggregation (CloudWatch, or Elastic), so the logs don't go away when the container goes away.
- Some use FluentD/Rancher for logs, which then get pushed to Splunk. Might need to log to standard error/out instead of to a file.
- Probably a good idea to send an alert when the jobs fail, but could also check them all manually each morning.
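The two logging suggestions above (write to standard output rather than a file, and alert on failure) can be sketched for a wrapper script as follows. This is a minimal sketch, not LDP1 functionality: `make_job_logger` and `report_job` are hypothetical helpers, and `send_alert` stands for whatever notification mechanism a site already uses (email, chat webhook, and so on).

```python
import logging
import sys


def make_job_logger(name, stream=sys.stdout):
    """Return a logger that writes to a stream rather than to a file."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    logger.propagate = False
    return logger


def report_job(logger, job_name, exit_code, send_alert):
    """Log the job outcome and call send_alert only on failure."""
    if exit_code == 0:
        logger.info("%s finished successfully", job_name)
        return True
    message = f"{job_name} failed with exit code {exit_code}"
    logger.error(message)
    send_alert(message)
    return False
```

Because the output goes to a stream, a container log aggregator can collect it after the container is gone. Passing the alert function in as a parameter keeps the sketch independent of any particular alerting service.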
Security
- This group did not feel there are security concerns with hosting LDP1.
- Set up your network securely. Set up access to the reporting database securely. Secure (or pseudonymize) personally identifiable information. Apart from that, there are no security concerns!
- Single sign-on for database access is not supported. You do have to use regular PostgreSQL security. But that is more an integration problem than a security problem.
- Additional strategies: Make sure the LDP IP address is only accessible from certain subnets. Make sure Kubernetes network is namespace isolated.
- Texas A&M: Set up different user accounts with read-only access, and use those accounts for connections via tools like CloudBeaver. Then embed those tools in a VM that uses standard university permissions systems (e.g., SSO via Shibboleth). The different read-only user accounts can be granted permissions to just certain types of data, and different staff can be granted permission to use just the VM that matches the permissions they should have.
- SSL is used for LDP database access unless otherwise configured.
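The read-only account strategy above relies on regular PostgreSQL roles and grants. As a hedged illustration, the helper below builds the SQL statements for one such account. The role, database, and schema names are hypothetical, the function names are not part of LDP1, and a site would still set the password separately and decide which schemas each account should see.

```python
def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def read_only_grants(role, database, schemas):
    """Build SQL statements giving `role` read-only access to `schemas`."""
    r = quote_ident(role)
    statements = [
        f"CREATE ROLE {r} LOGIN;",
        f"GRANT CONNECT ON DATABASE {quote_ident(database)} TO {r};",
    ]
    for schema in schemas:
        s = quote_ident(schema)
        # USAGE lets the role see the schema; SELECT lets it read tables.
        statements.append(f"GRANT USAGE ON SCHEMA {s} TO {r};")
        statements.append(f"GRANT SELECT ON ALL TABLES IN SCHEMA {s} TO {r};")
    return statements
```

For example, `read_only_grants("circ_reader", "ldp", ["public"])` returns four statements that an administrator could review and run in psql. Note that `GRANT SELECT ON ALL TABLES` applies only to tables that exist when it is run, so tables created later need the grant repeated.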
Concerns
- Data transfer time gets longer as data in FOLIO increases.
- You will need more resources in LDP1 as your production database grows.
- Silent failures. Some experience silent failures and do not feel there is good and detailed logging. You never know what has happened. (But there is good support by Nassib!)
- Documentation in LDP1 repository is mostly oriented to developers. It is not easy to find information if you are not familiar with the system.
- For ldpmarc, the incremental update is quick, but the full update can be pretty long. (This is less of an issue since ldpmarc 1.6, which includes significant performance improvements.)
- There is no automatic recovery. When a process fails, it needs to be re-run manually.