Derived Table Review - Fall 2024

The next planned major release of the Metadb software, version 1.4, will extract all the data fields from the data arrays into columns. Given this change, many of the derived tables the Reporting SIG currently maintains can be retired because they were designed to extract data from the data arrays. Some derived tables were created to achieve different goals, however, so they will be retained. The Reporting SIG is reviewing release 1.7.9 of derived tables for use with Metadb with the goal of reducing the number of tables overall. The focus will be on keeping tables that help report writers create queries that work well with the FOLIO software data model.

Reports that use derived tables can be rewritten to use CTEs (Common Table Expression), such as the CTE at the top of this user_permission_sets.sql query:

WITH permissions_users AS (
SELECT
jsonb_array_elements_text(jsonb->'permissions') AS user_permissions,
jsonb_extract_path_text(jsonb, 'userId')::uuid AS user_id
FROM
folio_permissions.permissions_users
)

See the Chapter on CTEs in the cookbook for more information:

https://github.com/folio-org/folio-analytics/wiki/Cookbook:-General#6-using-cte-to-prepare-data-to-use-them-in-queries

The functionality of derived tables could also be replaced with FUNCTIONS.

 

Comments

  • CTEs not performative

  • need to understand impact of number of CTEs on performance

  • Can you build an index on a CTE?

  • Need to know more about FUNCTIONS

  • Derived tables enable building indexes

  • Lack of indexes could hamper performance of CTE

  • Can build CTE as a table and run an index on it, but more complex

  • Differences between derived and source table data can be confusing

  • How do we optimize our queries with CTEs? Put them closer to the end of the query?

  • Would it be possible to index the marct table?

  • It appears that marct is partitioned

  • Would be helpful to have some examples of FUNCTIONS

  • Need to see impacts with PowerBI, Tableau, etc. direct connections to the data.

  • Derived tables are being used at many institutions

 

Creating FUNCTIONS

-a script to convert derived tables to functions could run once a day or more frequently

-advantages - you only need to create one script

-in your query, you use the function when you need live data and you use the derived table when you do not

 

circulation query - use the function - get live data - use parameters to make it more performant than running the whole derived table

inventory query - use derived table - get yesterday’s data

 

Options

-functions available on folio-analytics repo; use these functions to create your own derived tables; write your sql query to point to the functions; you can tailor your institution’s functions to filter the data more to make the queries run faster

 

Indexes

-would be helpful to have indexes as FUNCTIONS

-for instance, we could index ISBNs or identifiers

 

 

1.7.9 Derived Table for Metadb

Keep or Retire?

Document in Wiki?

Notes

1.7.9 Derived Table for Metadb

Keep or Retire?

Document in Wiki?

Notes

agreements_custom_property.sql

Freeze

Yes

custom properties will be extracted into columns

agreements_package_content_item.sql

 

 

 

agreements_subscription_agreement.sql

 

 

 

agreements_subscription_agreement_entitlement.sql

 

 

 

agreements_subscription_agreement_org_ext.sql

 

 

 

feesfines_accounts_actions.sql

 

 

 

finance_budgets.sql

 

 

 

finance_funds.sql

 

 

 

finance_invoice_transactions.sql

 

 

 

finance_transaction_invoices.sql

 

 

 

finance_transaction_purchase_order.sql

 

 

 

holdings_administrative_notes.sql

 

 

 

holdings_electronic_access.sql

 

 

 

holdings_ext.sql

 

 

 

holdings_notes.sql

 

 

 

holdings_statements.sql

 

 

 

holdings_statements_indexes.sql

 

 

 

holdings_statements_supplements.sql

 

 

 

holdings_statistical_codes.sql

 

 

 

holdings_tags.sql

 

 

 

instance_administrative_notes.sql

 

 

 

instance_alternative_titles.sql

 

 

 

instance_classifications.sql

 

 

 

instance_contributors.sql

 

 

 

instance_editions.sql

 

 

 

instance_electronic_access.sql

 

 

 

instance_ext.sql

 

 

 

instance_formats.sql

 

 

 

instance_identifiers.sql

 

 

 

instance_languages.sql

 

 

 

instance_nature_content.sql

 

 

 

instance_notes.sql

 

 

 

instance_physical_descriptions.sql

 

 

 

instance_publication.sql

 

 

 

instance_relationships_ext.sql

 

 

 

instance_series.sql

 

 

 

instance_statistical_codes.sql

 

 

 

instance_subjects.sql

 

 

 

instance_tags.sql

 

 

 

invoice_adjustments_ext.sql

 

 

 

invoice_adjustments_in_addition_to.sql

 

 

 

invoice_lines_adjustments.sql

 

 

 

invoice_lines_fund_distributions.sql

 

 

 

invoice_voucher_lines_fund_distributions.sql

 

 

 

item_administrative_notes.sql

 

 

 

item_circulation_notes.sql

 

 

 

item_effective_callno_components.sql

 

 

 

item_electronic_access.sql

 

 

 

item_ext.sql

 

 

 

item_notes.sql

 

 

 

item_statistical_codes.sql

 

 

 

item_tags.sql

 

 

 

items_holdings_instances.sql

 

 

 

licenses_custom_property.sql

 

 

 

licenses_license_ext.sql

 

 

 

loans_items.sql

 

 

 

loans_renewal_count.sql

 

 

 

loans_renewal_dates.sql

 

 

 

locations_libraries.sql

 

 

 

locations_service_points.sql

 

 

 

openaccess_checklists.sql

 

 

 

openaccess_publication_request.sql

 

 

 

openaccess_works.sql

 

 

 

po_acq_unit_ids.sql

 

 

 

po_instance.sql

 

 

 

po_line_fund_distribution_transactions.sql

 

 

 

po_line_vendor_reference_number.sql

 

 

 

po_lines_cost.sql

 

 

 

po_lines_details_subscription.sql

 

 

 

po_lines_er_mat_type.sql

 

 

 

po_lines_eresource.sql

 

 

 

po_lines_fund_distribution_transactions.sql

 

 

 

po_lines_locations.sql

 

 

 

po_lines_phys_mat_type.sql

 

 

 

po_lines_physical.sql

 

 

 

po_lines_tags.sql

 

 

 

po_lines_vendor_reference_numbers.sql

 

 

 

po_ongoing.sql

 

 

 

po_organization.sql

 

 

 

po_prod_ids.sql

 

 

 

requests_items.sql

 

 

 

runlist.txt

 

 

 

users_addresses.sql

 

 

 

users_departments_unpacked.sql

 

 

 

users_groups.sql

 

 

 

END