Overview
The main goal of this document is to outline the benefits of moving the Authority API from the mod-inventory-storage module to mod-entities-links.
...
- Data integrity: Centralized storage for authority data should remove possible inconsistencies between microservices.
- Maintainability: Implementation of features related to handling and managing authorities should not require the overhead of cross-checking between mod-inventory-storage and mod-entities-links.
- Performance: Removal of HTTP requests and Kafka interaction overhead should improve the performance of linking and import.
...
- Rename mod-entities-links to mod-authority-manager.
- Fully move Authority API, Authority Note Types API, and Authority Source Files API from mod-inventory-storage to mod-authority-manager. These APIs provide only CRUD operations and do not contain any business logic.
- Move authority reindex API.
- Adjust mod-authority-manager to use an internal database instead of interacting with mod-inventory-storage and mod-search.
- Disable the above APIs in mod-inventory-storage, remove their implementation there, and enable them in mod-authority-manager. The dependent UI and BE modules will not experience any differences.
- Create a migration script for existing authorities.
...
- Consume data-import authority events to increase the performance of the authority data-import flow. (8 SPs)
- Simplify authority stats generation.
- Move mapping rules.
Benefits
Moving the Authority API from the mod-inventory-storage module to the mod-entities-links module can bring several benefits, particularly in terms of reducing dependencies, minimizing interactions, and eliminating duplication of authorities. Here are some arguments to support this move:
...
Overall, moving the Authority API from the mod-inventory-storage module to the mod-entities-links module offers the advantages of reduced dependencies, minimized interactions, elimination of duplication, simplified development and maintenance, improved scalability, and enhanced data integrity. These benefits contribute to a more efficient, maintainable, and robust system architecture.
Open development questions:
- How to handle permissions?
- How to handle Poppy release migrations?
Questions for POs
Area | Question | Answer |
---|---|---|
Duplicate Identifier | Do we want to implement authority validations that prevent saving an authority record if a similar authority already exists in the system, based on either the identifier (naturalId or 001/010a) or the heading? | KG: Not for this initial implementation. There is some logic we need to support for LOC related to 010 always having 12 characters. Also, based on looking at some of the National Library of Poland authority records, we might have a situation where the 001/010a is the same as LOC. We need to do more authority file analysis before implementation. MM: We have spoken with NUKAT (Poland) and suggested that they move field 010 to 035 because of FOLIO rules. So far we have no feedback from them. If it were my system, I would prevent saving a record with the same content in fields 001/010a, but would not prevent saving a record with the same heading. KG: Marcin Mystkowski, I agree, but I think we need to do some analysis of the libraries that have already loaded authority records into FOLIO. Also, we will need to do this for LOC, as it is a requirement to prevent duplicates. So Pavlo Smahin - we want to do it, but I think we need more requirements analysis. Is it okay to implement this requirement as a phase 2 so we have time to define requirements? |
Multiple Headings and Types | Do we want to enforce a validation rule that restricts saving an authority record if it contains multiple headings of the same type (e.g., several personal names) or multiple types (e.g., a combination of personal name and geographic title)? | KG: Yes. We need a rule that the authority record can only have one 1XX. I thought this rule was already in place. NOTE - We will support more than the 1XX values we support today. I have received feedback that some customers have authority records whereby 1XX is not on the list outlined in MARC authority documentation: 100, 110, 111, 130, 147, 148, 150, 151, 155, 162, 180, 181, 182, and 185 (https://www.loc.gov/marc/authority/ad1xx3xx.html). MM: Yes |
Tracing Field Consistency | Do we want to implement a validation that ensures the "see from" and "see also from" tracing fields accurately reflect the heading? For example, if the heading is a personal name, should the tracing field be a meeting name? | KG: So the question is the following: should we apply a validation rule that if the heading is 100, the 4XX must be 400 and the 5XX must be 500? No. See examples https://lccn.loc.gov/no2007000953 https://lccn.loc.gov/no2019024399 MM: We did it in our legacy system; after a few months we had to give up on this validation. So no. Check https://lccn.loc.gov/n79043402 NOTE - When we allow for creating a local authority via UI or support creating local authority records via DI, then we should consider tenant-level MARC validation rules related to this question. |
Duplicate Headings | Do we want to prevent saving an authority record if a similar heading with the same heading type already exists in the system? For example, having 2 records that have "Apple Inc." in the 110 field. | KG: Pavlo Smahin - can you provide an example? Hey Pavlo Smahin - Allow the save. Eventually we will have a duplicate headings report that allows the cataloger to make the correction if necessary. |
Duplicate Tracing Fields | Do we want to address duplicates in search results by cleaning up duplicate tracing fields? For instance, if multiple 400 fields with the same values exist in a MARC record, should the search results remove the duplicates? | KG: No. The example seems very edge case. I cannot imagine this will happen very often. We can always support MM: If the record has the same 4XX with the same content, it looks like a mistake. If we remove duplicates, the librarian won't be able to see the mistake and easily correct the data. |
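
The rule agreed in the "Multiple Headings and Types" row (an authority record can only have one 1XX field) could be sketched roughly as follows. This is a minimal Python illustration, not code from mod-entities-links or mod-authority-manager: the `Field` type, the function names, and the error messages are hypothetical, and treating a record with no 1XX field as invalid is an assumption that the answers above do not state. Any three-digit tag starting with 1 is counted as a heading, since the answer notes that 1XX values beyond the documented list will be supported.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Field:
    """Hypothetical, simplified MARC field: a tag plus flattened content."""
    tag: str    # three-character MARC tag, e.g. "100"
    value: str  # field content, flattened for illustration only


def heading_tags(fields: List[Field]) -> List[str]:
    """Return the tags of all 1XX fields in the record, in record order."""
    return [
        f.tag
        for f in fields
        if len(f.tag) == 3 and f.tag.isdigit() and f.tag.startswith("1")
    ]


def validate_single_heading(fields: List[Field]) -> List[str]:
    """Return a list of error messages; an empty list means the record passes."""
    tags = heading_tags(fields)
    if not tags:
        # Assumption: a record without any 1XX heading is also rejected.
        return ["record has no 1XX heading field"]
    if len(tags) > 1:
        return ["record has multiple 1XX heading fields: " + ", ".join(tags)]
    return []


if __name__ == "__main__":
    record = [
        Field("100", "Personal name heading"),
        Field("151", "Geographic name heading"),
        Field("400", "See-from tracing"),
    ]
    for error in validate_single_heading(record):
        print(error)
```

Consistent with the "Tracing Field Consistency" answer, the sketch does not compare 4XX or 5XX tags against the heading type; it only counts 1XX fields.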