DR-000032 - Splitting Database Read/Write Traffics

Submitted Date

 

Approved Date

 

Status: ACCEPTED
Impact: MEDIUM

 

Overrides/Supersedes 

NA

RFC 

NA

Stakeholders

  • Performance Task Force (PTF)
  • #spring-force
  • #development slack channel

Contributors

Approvers

Background/Context

  • Currently 40+ RMB storage modules communicate with the same database endpoint for both reads and writes.
  • For scalability, cloud providers such as AWS make it possible to route read and write operations to different database nodes: read nodes are easy to attach, and the provider syncs data between the nodes.
  • FSE explored several proxy options for splitting read and write traffic automatically, such as pgBouncer and pgPool. None worked out.
  • RMB-348 was created 4 years ago, and in 2022 PTF took a stab at implementing it the way it was discussed in the story.
  • With Core-Platform's guidance (thanks to Julian Ladisch and Adam Dickmeiss), RMB-348 was completed in Morning Glory and released in Nolana (RMB v35.0.0).
  • Data Import (MODSOURCE-540)
  • Three workflows underwent rigorous performance testing: 
    • Check In, Check Out
    • Data Import (Create and Update MARC BIBs)

Assumptions

  • Similar performance improvements could be seen in other workflows under high CPU load. 

Constraints

  • The database technology handles syncing data between the read and write nodes.

Rationale

  • Scalability, performance, and cost savings

Decision

  • FOLIO has adopted the following approach to splitting database read/write traffic:
    • Storage modules create a read connection pool when two new environment variables are present: DB_HOST_READER and DB_PORT_READER
    • The solution is not specific to any particular database technology.
    • Configuring the DB cluster to attach DB read nodes or to sync data between the nodes is not in the scope of this work
    • This solution can be implemented at the framework level, and has been implemented in RMB v35.0.0
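
The decision above can be illustrated with a minimal sketch. It is not RMB's actual implementation: the class `ReadWriteEndpoints`, its methods, the writer variable names `DB_HOST` and `DB_PORT`, their defaults, and the fallback to the writer endpoint when the reader variables are absent are all assumptions made for illustration. Only `DB_HOST_READER` and `DB_PORT_READER` come from this decision record.

```java
import java.util.Map;
import java.util.Optional;

public class ReadWriteEndpoints {

    /** Host and port of one database endpoint (hypothetical helper type). */
    public record Endpoint(String host, int port) {}

    /** Writer endpoint. DB_HOST, DB_PORT, and the defaults are assumptions. */
    public static Endpoint writer(Map<String, String> env) {
        return new Endpoint(
                env.getOrDefault("DB_HOST", "localhost"),
                Integer.parseInt(env.getOrDefault("DB_PORT", "5432")));
    }

    /** Reader endpoint, present only when both reader variables are set. */
    public static Optional<Endpoint> reader(Map<String, String> env) {
        String host = env.get("DB_HOST_READER");
        String port = env.get("DB_PORT_READER");
        if (host == null || host.isBlank() || port == null || port.isBlank()) {
            return Optional.empty();
        }
        return Optional.of(new Endpoint(host, Integer.parseInt(port.trim())));
    }

    /** Read-only work goes to the reader if configured; everything else goes to the writer. */
    public static Endpoint endpointFor(Map<String, String> env, boolean readOnly) {
        return readOnly ? reader(env).orElseGet(() -> writer(env)) : writer(env);
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        System.out.println("writes -> " + endpointFor(env, false));
        System.out.println("reads  -> " + endpointFor(env, true));
    }
}
```

In this sketch, a deployment that does not set the reader variables behaves exactly as before, because all traffic resolves to the single writer endpoint.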

Implications

  • Workflows and module designs need to consider the potential for stale data
  • Increased design complexity
  • Increased operational complexity
  • More database connections are needed within each module instance

Please see Splitting Database Read/Write Traffics for more details. 
