DR-000038 - PostgreSQL Upgrade to 16

Submitted Date

 

Approved Date

 

StatusAPPROVED
ImpactLOW

 

Overrides/Supersedes 

RFC 

Provide links to all relevant RFC's

Stakeholders

@mention individual(s) who has vested interest in the DR. This helps us to identify who needs to be aware of the decision

Contributors

Julian Ladisch Florian Gleixner 

Approvers

List TC members voted to approve the change

Background/Context

Poppy officially supports PostgreSQL 12. Support of this version ends November 14, 2024.

Once a year the PostgreSQL team releases a new major version of PostgreSQL. Versions 13, 14, 15 and 16 are available. Some institutions have already migrated away from version 12 to 13 or 14 without any problems.

Assumptions

Sysops prefer to avoid a PostgreSQL migration if possible.

If a PostgreSQL migration is needed, sysops prefer the latest version PostgreSQL version.

Constraints

PostgreSQL 16 at Amazon RDS for PostgreSQL will be available by March 2024 (assuming the same timing like PostgreSQL 15 that has been available as Amazon RDS since Februrary 28, 2023).

Multi-tenant installations migrate tenants on different days, for example 4 tenants per weekend. Therefore they require the previous flower release to run under the new PostgreSQL version.

On November 20, 2023 the TC decided that it's to late to bump the officially supported PostgreSQL version for Quesnelia to 15 because the Quesnelia development planning has already been completed.

Rationale

"pg_upgrade supports upgrades from 9.2.X and later to the current major release of PostgreSQL, including snapshot and beta releases." There is no need to incrementally upgrade to 13, and then to 14, and then to 15, and then to 16.

Skipping versions 13, 14, and 15 reduces the sysops' and devops' workload and downtimes.

Migrating PostgreSQL from 12 to 13 for Quesnelia should be avoided when it only lasts four months until the next migration from 13 to 16 is required for Ramson.

There's a gap between the end of PostgreSQL 12 support on November 14, 2024 and end of Quesnelia support around December 2024. Sysops should migrate PostgreSQL from 12 to 16 for their Quesnalia installations by November 14, 2024.

It's likely that breaking changes in 13, 14, 15 or 16 don't affect FOLIO or are very easy to fix. This is the result of the analysis of all published breaking changes, see list below.

The 35 biggest FOLIO modules (out of 60 database accessing modules) have been tested against PostgreSQL 13, 14, 15 and 16 using the module tests (mvn verify). No regression was found. See list below.

Making both PostgreSQL 12 and 16 the officially supported PostgreSQL version for the Quesnelia FOLIO release means that developers can only use PostgreSQL features that run under both versions. Almost always PostgreSQL is backward compatible so that only PostgreSQL 12 is needed for the automated tests for Quesnelia (unit tests, snapshot enviroments, vagrant boxes, rancher environments).

To test PostgreSQL 16 compatibility a single run of the Karate tests and the module tests (DR-000037 - TESTCONTAINERS_POSTGRES_IMAGE) is sufficient. A good time might be the Quesnelia bug fest. Bug fest and reference environments for Quesnelia containue to use PostgreSQL 12, the PostgreSQL 16 compatibility tests run on separate installations (Rancher environment, GitHub Actions).

Institutions are free to use any of the not officially supported PostgreSQL versions 13, 14, and 15 for their Quesnelia installation at their own risk.

Summary for developers:

  • For Quesnelia keep PostgreSQL 12 for unit tests. Use only SQL features of PostgreSQL 12, don't use SQL features from the breaking changes list (see below) of PostgreSQL 13, 14, 15 and 16.
  • For Ramson use PostgreSQL 16 for unit tests.

Summary for devops:

  • For Quesnelia keep reference environments, rancher environments, bug fest environment, Vagrant boxes etc. on PostgreSQL 12.
  • For Ramson run reference environments, rancher environments, bug fest environment, Vagrant boxes etc. on PostgreSQL 16.

Summary for test manager:

  • For Quesnelia: Quesnelia bug fest runs on PostgreSQL 12. During the bug fest execute additional PostgreSQL 16 tests on separate installations:
  • For Ramson: No additional tests are needed.

Summary for sysops using only officially supported PostgreSQL versions:

  • Keep PostgreSQL 12 while migrating tenants to Quesnelia.
  • Before migrating PostgreSQL from 12 to 16 all tenant must have been migrated to Quesnelia.
  • PostgreSQL must be migrated from 12 to 16 before migrating a tenant to Ramson and by November 14, 2024, whatever comes first.
  • No module or module configuration change is needed for the PostgreSQL migration.

Summary for sysops using not officially supported PostgreSQL versions:

  • There are no known issues running Poppy and Quesnelia with PostgreSQL 12, 13, 14, 15, or 16.
  • Because it's likely that Ramson modules use PostgreSQL 16 SQL features it's likely that PostgreSQL must be migrated to 16 before migrating a tenant to Ramson.
  • Note that the final PostgreSQL 12 release will be on November 14, 2024, no security fixes will be released after this date.
  • No module or module configuration change is needed for the PostgreSQL migration.

Decision

PostgreSQL 12 and 16 are the officially supported PostgreSQL server versions for the Quesnelia FOLIO release; PostgreSQL 12 server support ends November 14, 2024. Only PostgreSQL 12 SQL features are allowed for the Quesnelia FOLIO release; they must also work under PostgreSQL 16.

PostgreSQL 16 is the officially supported PostgreSQL version for the Ramson FOLIO release.

Implications

  • Pros
    • See "Assumptions" and "Rationale" above
  • Cons
    • Developers are restricted to PostgreSQL 12 features for Quesnelia.

Breaking changes analysis

13

https://www.postgresql.org/docs/13/release-13.html#id-1.11.6.17.4

14

https://www.postgresql.org/docs/14/release-14.html#id-1.11.6.14.4

15

https://www.postgresql.org/docs/15/release-15.html#id-1.11.6.9.4

  • Remove PUBLIC creation permission on the public schema (Noah Misch)

    The new default is one of the secure schema usage patterns that Section 5.9.6 has recommended since the security release for CVE-2018-1058. The change applies to new database clusters and to newly-created databases in existing clusters. Upgrading a cluster or restoring a database dump will preserve public's existing permissions.

    For existing databases, especially those having multiple users, consider revoking CREATE permission on the public schema to adopt this new default. For new databases having no need to defend against insider threats, granting CREATE permission will yield the behavior of prior releases.


    FOLIO has removed the permission since Iris (R1 2021) release and therefore already uses this new default: https://github.com/folio-org/raml-module-builder/commit/dfae85496591ce080cdb0544a156e1dd9cc8e13e
  • Change the owner of the public schema to be the new pg_database_owner role (Noah Misch)

    This allows each database's owner to have ownership privileges on the public schema within their database. Previously it was owned by the bootstrap superuser, so that non-superuser database owners could not do anything with it.

    This change applies to new database clusters and to newly-created databases in existing clusters. Upgrading a cluster or restoring a database dump will preserve public's existing ownership specification.

    FOLIO uses public schema for extensions only and therefore is not affected by this change.

  • Remove long-deprecated exclusive backup mode (David Steele, Nathan Bossart)

    If the database server stops abruptly while in this mode, the server could fail to start. The non-exclusive backup mode is considered superior for all purposes. Functions pg_start_backup()/pg_stop_backup() have been renamed to pg_backup_start()/pg_backup_stop(), and the functions pg_backup_start_time() and pg_is_in_backup() have been removed.

    FOLIO doesn't use pg_basebackup in exclusive backup mode, and doesn't use the removed methods: https://github.com/search?q=org%3Afolio-org%20pg_start_backup&type=code , https://github.com/search?q=org%3Afolio-org%20pg_stop_backup&type=code , https://github.com/search?q=org%3Afolio-org%20pg_backup_start_time&type=code , https://github.com/search?q=org%3Afolio-org%20pg_is_in_backup&type=code

  • Increase hash_mem_multiplier default to 2.0 (Peter Geoghegan)

    This allows query hash operations to use more work_mem memory than other operations.

    FOLIO doesn't rely on the old default and doesn't use the variable: https://github.com/search?q=org%3Afolio-org%20hash_mem_multiplier&type=code

  • Remove server-side language plpython2u and generic Python language plpythonu (Andres Freund)

    Python 2.x is no longer supported. While the original intent of plpythonu was that it could eventually refer to plpython3u, changing it now seems more likely to cause problems than solve them, so it's just been removed.

    FOLIO doesn't use the removed languages: https://github.com/search?q=org%3Afolio-org%20plpython2u&type=code , https://github.com/search?q=org%3Afolio-org%20plpythonu&type=code

  • Generate an error if array_to_tsvector() is passed an empty-string array element (Jean-Christophe Arnu)

    This is prohibited because lexemes should never be empty. Users of previous Postgres releases should verify that no empty lexemes are stored because they can lead to dump/restore failures and inconsistent results.

    FOLIO doesn't use the method: https://github.com/search?q=org%3Afolio-org%20array_to_tsvector&type=code

  • Generate an error when chr() is supplied with a negative argument (Peter Eisentraut)

    FOLIO uses CHR(31) only that is not affected: https://github.com/search?q=org%3Afolio-org+%22chr%28%22&type=code

  • Prevent CREATE OR REPLACE VIEW from changing the collation of an output column (Tom Lane)

    FOLIO uses the default collation of the database and is not affected by the view collations.

  • Disallow zero-length Unicode identifiers, e.g., U&"" (Peter Eisentraut)

    Non-Unicode zero-length identifiers were already disallowed.

    FOLIO doesn't use this identifier: https://github.com/search?q=org%3Afolio-org+%22U%26%22&type=code

  • Prevent numeric literals from having non-numeric trailing characters (Peter Eisentraut)

    Previously, query text like 123abc would be interpreted as 123 followed by a separate token abc.

    (warning) FOLIO is affected. One example: https://github.com/folio-org/mod-finance-storage/pull/370/files

  • Adjust JSON numeric literal processing to match the SQL/JSON-standard (Peter Eisentraut)

    This accepts numeric formats like .1 and 1., and disallows trailing junk after numeric literals, like 1.type().

    ((question)) No easy search possible to investigate whether FOLIO is affected. However, it's verly unlikely that FOLIO relies on trailing junk after numeric literals in JSON.

  • When interval input provides a fractional value for a unit greater than months, round to the nearest month (Bruce Momjian)

    For example, convert 1.99 years to 2 years, not 1 year 11 months as before.

    (question) No easy search possible to investigate whether FOLIO uses fractional year intervals and is affected.

  • Improve consistency of interval parsing with trailing periods (Tom Lane)

    Numbers with trailing periods were rejected on some platforms.

    ((question)) No easy search possible to investigate whether FOLIO is affected. However, it's very unlikely that FOLIO runtime code depends on trailing periods being rejected.

  • Mark the interval output function as stable, not immutable, since it depends on IntervalStyle (Tom Lane)

    This will, for example, cause creation of indexes relying on the text output of interval values to fail.

    FOLIO doesn't use IntervalStyle: https://github.com/search?q=org%3Afolio-org%20IntervalStyle&type=code

  • Detect integer overflow in interval justification functions (Joe Koshakow)

    The affected functions are justify_interval(), justify_hours(), and justify_days().

    FOLIO doesn't use the functions: https://github.com/search?q=org%3Afolio-org+justify_interval&type=code , https://github.com/search?q=org%3Afolio-org+justify_hours&type=code , https://github.com/search?q=org%3Afolio-org+justify_day&type=code

  • Change the I/O format of type "char" for non-ASCII characters (Tom Lane)

    Bytes with the high bit set are now output as a backslash and three octal digits, to avoid encoding issues.

    This output is still a valid representation and very unlikely to cause any issues in FOLIO.

  • Remove the default ADMIN OPTION privilege a login role has on its own role membership (Robert Haas)

    Previously, a login role could add/remove members of its own role, even without ADMIN OPTION privilege.

    FOLIO uses ADMIN OPTION privileges where needed.

  • Allow logical replication to run as the owner of the subscription (Mark Dilger)

    Because row-level security policies are not checked, only superusers, roles with bypassrls, and table owners can replicate into tables with row-level security policies.

    Replication is a sysop task, FOLIO is not affected, and allowing more roles cannot break an installation.

  • Prevent UPDATE and DELETE logical replication operations on tables where the subscription owner does not have SELECT permission on the table (Jeff Davis)

    UPDATE and DELETE commands typically involve reading the table as well, so require the subscription owner to have table SELECT permission.

    Replication is a sysop task, FOLIO is not affected. Sysops can easily add the SELECT permission if needed.

  • When EXPLAIN references the session's temporary object schema, refer to it as pg_temp (Amul Sul)

    Previously the actual schema name was reported, leading to inconsistencies across sessions.

    FOLIO doesn't use EXPLAIN with the session's temporary object schema: https://github.com/search?q=org%3Afolio-org+explain&type=code

  • Fix pg_statio_all_tables to sum values for the rare case of TOAST tables with multiple indexes (Andrei Zubkov)

    Previously such cases would show one row for each index.

    FOLIO doesn't use pg_statio_all_tables: https://github.com/search?q=org%3Afolio-org%20pg_statio_all_tables&type=code

  • Disallow setting custom options that match the name of an installed extension, but are not one of the extension's declared variables (Florin Irion, Tom Lane)

    This change causes any such pre-existing variables to be deleted during extension load, and then prevents new ones from being created later in the session. The intent is to prevent confusion about whether a variable is associated with an extension or not.

    (question) No easy search is possible to find out whether FOLIO uses confusing database variables. However, this change will fail unit tests that use such variables.

  • Remove obsolete server variable stats_temp_directory (Andres Freund, Kyotaro Horiguchi)

    FOLIO doesn't use the variable: https://github.com/search?q=org%3Afolio-org+stats_temp_directory&type=code

  • Improve the algorithm used to compute random() (Fabien Coelho)

    This will cause random()'s results to differ from what was emitted by prior versions, even for the same seed value.

    FOLIO uses PostgreSQL's random() but all uses cases don't require reproducibility: https://github.com/search?q=org%3Afolio-org+random()&type=code

  • libpq's PQsendQuery() function is no longer supported in pipeline mode (Álvaro Herrera)

    Applications that are using that combination will need to be modified to use PQsendQueryParams() instead.

    FOLIO doesn't use PQsendQuery: https://github.com/search?q=org%3Afolio-org+PQsendQuery&type=code

  • On non-Windows platforms, consult the HOME environment variable to find the user's home directory (Anders Kaseorg)

    If HOME is empty or unset, fall back to the previous method of checking the <pwd.h> database. This change affects libpq (for example, while looking up ~/.pgpass) as well as various client application programs.

    FOLIO modules get the PostgreSQL configuration from environment variables. Only sysop tasks that use client applications might be affected.

  • Remove pg_dump's --no-synchronized-snapshots option (Tom Lane)

    All still-supported server versions support synchronized snapshots, so there's no longer a need for this option.

    FOLIO doesn't use the option: https://github.com/search?q=org%3Afolio-org+no-synchronized-snapshots&type=code

  • After an error is detected in psql's --single-transaction mode, change the final COMMIT command to ROLLBACK only if ON_ERROR_STOP is set (Michael Paquier)

    FOLIO doesn't use the option: https://github.com/search?q=org%3Afolio-org+single-transaction&type=code

  • Avoid unnecessary casting of constants in queries sent by postgres_fdw (Dian Fay)

    When column types are intentionally different between local and remote databases, such casts could cause errors.

    Configuring a remote database is a sysop task. FOLIO doesn't use this feature: https://github.com/search?q=org%3Afolio-org+postgres_fdw&type=code

  • Remove xml2's xml_is_well_formed() function (Tom Lane)

    This function has been implemented in the core backend since Postgres 9.1.

    FOLIO doesn't use the function: https://github.com/search?q=org%3Afolio-org+xml_is_well_formed&type=code

  • Allow custom scan providers to indicate if they support projections (Sven Klemm)

    The default is now that custom scan providers are assumed to not support projections; those that do will need to be updated for this release.

    FOLIO doesn't use this feature: https://github.com/search?q=org%3Afolio-org+set_rel_pathlist_hook_type&type=code

16

https://www.postgresql.org/docs/16/release-16.html#RELEASE-16-MIGRATION

Module tests

Module tests (mvn verify) using DR-000037 - TESTCONTAINERS_POSTGRES_IMAGE:

module with link to test resultspostgres:13-alpinepostgres:14-alpinepostgres:15-alpinepostgres:16-alpine
mod-audit13 OK14 OK15 OK16 OK
mod-bulk-operations13 OK14 OK15 OK16 OK
mod-calendar13 OK14 OK15 OK16 OK
mod-circulation-storage13 OK14 OK15 OK16 OK
mod-consortia13 OK14 OK15 OK16 OK
mod-data-export13 OK14 OK15 OK16 OK
mod-data-export-spring13 OK14 OK15 OK16 OK
mod-data-export-worker13 OK14 OK15 OK16 OK
mod-data-import13 OK14 OK15 OK16 OK
mod-dcb13 OK14 OK15 OK16 OK
mod-di-converter-storage13 OK14 OK15 OK16 OK
mod-entities-links
no regressions, 129 succeeding tests
13 OK14 OK15 OK16 OK
mod-erm-usage13 OK14 OK15 OK16 OK
mod-erm-usage-harvester13 OK14 OK15 OK16 OK
mod-eusage-reports13 OK14 OK15 OK16 OK
mod-feesfines13 OK14 OK15 OK16 OK
mod-finance-storage13 OK14 OK15 OK16 OK
mod-finc-config13 OK14 OK15 OK16 OK
mod-inn-reach13 OK14 OK15 OK16 OK
mod-inventory-storage13 OK14 OK15 OK16 OK
mod-invoice13 OK14 OK15 OK16 OK
mod-lists13 OK14 OK15 OK16 OK
mod-login13 OK14 OK15 OK16 OK
mod-oai-pmh13 OK14 OK15 OK16 OK
mod-orders-storage13 OK14 OK15 OK16 OK
mod-patron-blocks13 OK14 OK15 OK16 OK
mod-pubsub13 OK14 OK15 OK16 OK
mod-quick-marc13 OK14 OK15 OK16 OK
mod-remote-storage13 OK14 OK15 OK16 OK
mod-reservoir13 OK14 OK15 OK16 OK
mod-search13 OK14 OK15 OK16 OK
mod-source-record-manager13 OK14 OK15 OK16 OK
mod-source-record-storage
no regressions, 506 succeeding tests
13 OK14 OK15 OK16 OK
mod-tags13 OK14 OK15 OK16 OK
mod-users13 OK14 OK15 OK16 OK
okapi13 OK14 OK15 OK16 OK

The modules for this list have been selected by highest lines of code as reported by Sonar.

How can I see that the build actually uses version 13, 14,15 or 16? In the build report click on the "Run mvn --batch-mode verify" step, then go to the "search logs" slot and enter "postgres:".

Why has the issue with 14 in one module removed from the table? It turned out that it is a sporadic test failure unrelated to the PostgreSQL version, it happens in versions 12, 13, 14, 15 and 16.

What does "no regressions, 129 succeeding tests" mean? This means that the module has 129 succeeding tests and one or more failing tests with PostgreSQL 12 and this doesn't change when using PostgreSQL 13, 14, 15 or 16.