Best practices mentioned

  • general tips
    • Follow the LDP1 Administrator Guide.
    • The production database should be a standalone Postgres VM with dedicated storage. Only use it for FOLIO reporting.
    • Run all of the LDP processes as cron jobs on Kubernetes (have our own container for FOLIO analytics). (This may be TAMU-specific.)
    • Use ldp_add_column.conf to make sure tables in LDP1 contain certain columns, whether or not there are data for those columns. This helps automated derived-table queries run correctly even when the data are still absent (see the sketch after this list).
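
The exact syntax of ldp_add_column.conf is described in the LDP1 Administrator Guide. As a hedged companion check (not part of LDP itself; the table and column names are made-up examples), you can verify that a column your derived-table queries depend on actually exists in the reporting database:

```bash
# Hypothetical sanity check: prints 1 if the column exists, nothing otherwise.
psql -d ldp -tAc "SELECT 1 FROM information_schema.columns
                  WHERE table_name = 'loans' AND column_name = 'due_date';"
```
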
  • periodic extraction of data 
    • LDP1, ldpmarc, and folio-analytics data extraction jobs need to be run on a schedule and in sequence. Possible scheduling services include cron, Jenkins, and Kubernetes cronjobs.
    • For the order of extraction jobs, you might try LDP1, then ldpmarc, then the folio-analytics derived tables (see the crontab sketch after this list).
    • Once a day tends to be pretty good.
    • While an extraction is running, avoid querying the reporting database, because the data are being erased and re-created.
    • Daily incremental update for ldpmarc is quite fast (e.g., 10 minutes for the University of Chicago).
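
As a hedged sketch of such a schedule, assuming a single nightly cron job and three hypothetical wrapper scripts (the paths and script names are placeholders, not part of LDP):

```bash
# Hypothetical crontab entry: run the three extraction steps in sequence at 02:00.
# "&&" stops the chain if an earlier step fails. All paths are placeholders.
0 2 * * * /opt/ldp/run-ldp1-update.sh && /opt/ldp/run-ldpmarc.sh && /opt/ldp/run-derived-tables.sh >> /var/log/ldp/nightly.log 2>&1
```
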
  • performance
    • Run with network proximity between the FOLIO and LDP databases.
  • backups / disaster recovery:
    • Q: What do you do for a Postgres backup solution (for a production LDP), and on what cadence?
      • A: Some are running on Amazon AWS, using the RDS Postgres service for hosting, including disaster recovery (7 days of snapshots).
      • A: Some have local snapshots, up to 28 days. (7 days is likely enough, because LDP keeps a history of the records internally anyway; 28 days is a lot of data. See the backup sketch after this list.)
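
For locally hosted setups, a hedged sketch of a nightly dump with a 7-day retention window (the database name, backup path, and schedule are placeholders):

```bash
# Hypothetical nightly backup of the LDP reporting database, keeping 7 days.
pg_dump -Fc -d ldp -f /backups/ldp-$(date +%F).dump
find /backups -name 'ldp-*.dump' -mtime +7 -delete
```
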
  • cloning
    • Q: When you clone the FOLIO production environment, do you clone the LDP data over, or do you build the LDP data back from the FOLIO data that you have cloned?
      • A (Texas A&M): We don't clone the data over; we re-build it from scratch. Or, if I upgrade, I just upgrade LDP in place; that way it preserves the history.
      • A (Wayne): If we refresh the staging environment for a tenant, we re-build LDP from scratch.
  • upgrades / testing
    • Q: Error checking and logging. Do testing and upgrades go together?
    • Run two instances: staging and production.
    • LDP1 and ldpmarc releases are both fairly independent of FOLIO releases; new versions come out on their own cadence.
    • When there is a new version of LDP1 or ldpmarc, implement it in staging first.
    • If it's more than a point upgrade, you might invite users to test.
    • Follow that with an upgrade of production.
    • An upgrade in LDP1 is really simple: you invoke the LDP server with the upgrade-database command, and LDP then knows what version it is talking to (see the sketch after this list).
    • For ldpmarc, there is not really an upgrade process; review the release notes for instructions about whether you need to run a full update.
    • FOLIO Analytics is tied to FOLIO flower releases, so upgrade those together; include LDP1 in standard FOLIO testing. Apart from that, invite users to test only when folio-analytics tables have changed.
    • Do not upgrade LDP components until the next flower release of FOLIO, except when you experience problems; in that case, ask Nassib and then perhaps install a new version.
    • For the order of updates, you might try LDP1, then ldpmarc, then the derived tables.
    • Some have a special Jenkins job which builds a thing called the "LDP engine", which includes all three components. They use a Docker image for all their setups for the flower release. When a new version of LDP comes out, they build a new Docker image with the latest versions and run a smoke test. If everything works fine, they create recommendations for the upcoming flower release, and use these until the next one.
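
A minimal sketch of that upgrade step, assuming LDP1's upgrade-database subcommand and a data directory at /var/lib/ldp (the path is a placeholder; see the LDP1 Administrator Guide for the authoritative invocation):

```bash
# Run the one-time schema upgrade after installing the new LDP1 release.
# -D points at the LDP data directory (placeholder path).
ldp upgrade-database -D /var/lib/ldp
```
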
  • logging
    • Some use container log aggregation (CloudWatch or Elastic), so the logs don't go away when the container goes away.
    • Some use FluentD/Rancher for logs, which then get pushed to Splunk; you might need to log to standard error/out instead of to a file.
    • Probably a good idea to send an alert when the jobs fail, but you could also check them all manually each morning (see the sketch after this list).
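
As a hedged sketch of such an alert, assuming a cron wrapper script and a working local mail command (the address, paths, and script names are placeholders):

```bash
#!/bin/bash
# Hypothetical wrapper: run the nightly job and email an alert if it fails.
if ! /opt/ldp/run-nightly.sh >> /var/log/ldp/nightly.log 2>&1; then
  echo "LDP nightly job failed on $(hostname); see /var/log/ldp/nightly.log" \
    | mail -s "LDP job failure" ldp-admin@example.edu
fi
```
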
  • security
    • Set up your network securely. Set up access to the reporting database securely. Secure (or pseudonymize) personally identifiable information.
    • Apart from that, there are no security concerns!
    • Single sign-on for database access is not supported; you have to use regular PostgreSQL security. But that is more an integration problem than a security problem.
    • Make sure the LDP IP address is only accessible from certain subnets.
    • Make sure the Kubernetes network is namespace-isolated.
    • Set up users with read-only access, and use those accounts for connections via tools like CloudBeaver. Then embed those tools in a VM that uses standard university permission systems (e.g., SSO via Shibboleth). The different read-only user accounts can be granted permissions to just certain types of data, and different staff can be granted permission to use just the VM that matches the permissions they should have (see the sketch after this list).
    • "From a security standpoint, no, there isn't SSL all the way down to the schema, but we take care of that by controlling access. It is not exposed to the outside world. FOLIO has the same issue."
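
A hedged sketch of creating such a read-only account, assuming the reporting database is named ldp and the tables live in the public schema (all names and the password are placeholders):

```bash
# Hypothetical read-only reporting account; run as a Postgres superuser.
psql -d ldp <<'SQL'
CREATE USER report_reader WITH PASSWORD 'change-me';
GRANT CONNECT ON DATABASE ldp TO report_reader;
GRANT USAGE ON SCHEMA public TO report_reader;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO report_reader;
-- Also cover tables created by future updates:
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO report_reader;
SQL
```
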
  • concerns
    • Data transfer time gets longer as data in FOLIO increases.
    • You will need more resources in LDP if your production database grows.
    • Silent failures: there is no good, detailed logging, so you never know what has happened. (But there is good support from Nassib!)
    • Documentation in the LDP1 repository is mostly oriented to developers. It is not easy to find information if you are not familiar with the system.
    • For ldpmarc, incremental update is quick, but a full update can take a long time.
    • There is no automatic recovery; when a process fails, it needs to be re-run manually.