2023-07-20 Metadata Management Meeting notes
Date
Attendees
~ 20-25
Recordings
Recordings of meetings can be found in the Metadata_Management_SIG > Recordings folder on AWS from 2022 onwards: https://recordings.openlibraryfoundation.org/folio/metadata-management-sig/
Discussion items
Notetaker | Christie Thomas | |
Announcements | At the August 10 meeting, the Templates for Inventory Record discussion will be continued. Volunteers are needed to help with MM-adjacent documentation for the wiki. If you are able to contribute to maintaining the documentation for docs.folio.org, reach out to Laura E Daniels. The App Interaction SIG holds meetings to discuss enhancements with a FOLIO-wide impact. Check out the App Interaction wiki page or the App Interaction Slack channel for updates about meeting times and when these items are scheduled to be discussed. Entity Management Working Group: an update is coming to the MM SIG soon, date TBD. Entity Management work is wrapping up, and there is discussion about whether the group should be disbanded and oversight handed to the MM SIG. | |
PC update | There will not be a PC meeting this week due to multiple vacations. We will reconvene on Thurs, 7/27! | |
Searching in Inventory (mod-search) | Several questions around Inventory searching using mod-search have come up on Slack and in Jira tickets. We will go over the different questions and try to find consensus on how search should work in these specific areas. There is a Slack channel for this discussion: #test-msearch-inventory.
- MSEARCH-478: Inventory. Title search (all). Support Phrase search. Discussion of requirements. Consensus that they accurately represent the expectations for the behavior. More examples were provided.
- MSEARCH-549: Remove word stemming and fuzzy logic from title (all) searches (draft). Need to make sure that fuzzy logic and word stemming are removed from title (all) searches and reserved for keyword searches. Suggestion that fuzzy logic could also be enabled in advanced search at the discretion of the user. Fuzzy logic works well for full-text searching, but not with structured data. Alternatively, users could opt in to fuzzy logic or stemming via conventions like truncation in advanced or title (all) searches, with automatic stemming or fuzzy logic only in keyword searches. The ticket will go back to draft for more revision. It was also noted that this is an issue for other fields, not just title (all). Subject was given as an example of a field where an exact match is expected. Identifier is expected to have no fuzzy logic or stemming at all. Question / concern: how can keyword searching that includes identifiers be handled when identifier matching should be exact? Maybe it is okay for an identifier search in keyword to not be exact, as long as an explicit search of the identifier index is exact.
- MSEARCH-567: Remove fuzzy logic and word-stemming from phrase CQL queries in Inventory. See notes from the conversation above re: MSEARCH-549.
- MSEARCH-486: Keyword & identifier (all) search not returning results with the leading period. (Does anyone have specific examples of records where this is not working as expected?) This may no longer be an issue. There will be systematic testing, and the ticket will be closed if it is confirmed to no longer be a problem.
- MSEARCH-507: Inventory. Holdings and Item > Search by call number, eye readable should be case insensitive. Consensus is that case insensitivity should be the default for all searching.
- MSEARCH-512: Inventory. Search on contributor names results in irrelevant noise. (This is a known issue in discovery systems as well.) What is expected? Is it a phrase search? The main issue is that the contributor name is a single string, which complicates searching names with different orders of family name and personal name. Expectation that it is a "begins with" search. There are two different use cases: one with the way contributor search works now, and a second for an exact search. The default right now is a "contains any" search, but the default behavior can be changed. What about relevancy ranking to make sure that exact matches appear first? One opinion was that ranking by relevancy should be optional; another was that it would not help: if you know what you are looking for, you want an exact match or a "begins with" match, not a keyword search. Comment in favor of left-anchored searching for known titles. Example: if I search for "National Geographic", I want results that have "National Geographic" and not anything that has national or geographic. Comment in chat: "If I'm doing a tokenized search (not phrase), I would appreciate seeing titles that have term-A and term-B in the same contributor appear before titles where term-A is in one field and term-B is found in another continuator." Highly complex topic. We can return to it in a future meeting, along with a demo of what is coming in search for future releases. It was also suggested to ask how previous systems have handled the issue. | |
Better documentation of search behavior | Former user (Deleted) | There is search documentation on the Tips and Tricks page: Search - using Elasticsearch (or OpenSearch). More technical documentation is available in the mod-search README on GitHub: https://github.com/folio-org/mod-search#readme
Questions arose around:
- how to generate your own indexes
- 7/20 meeting follow-up question: is mod-search a module on top of Elasticsearch? This question will also be addressed after the meeting.
KG responses:
Thread: While trying to figure out how to search in FOLIO Inventory, we perused the documentation regarding ElasticSearch. @Rita Albrecht We're going to discuss multiple open tickets related to search today and I have added these documentation questions to the agenda as well: https://folio-org.atlassian.net/wiki/x/TD9H @Felix Thanks, we appreciate that and hope for a fruitful discussion!
September 2022: The Technical Council concluded that the license for Elasticsearch was unacceptable to the FOLIO community, and that FOLIO would only support OpenSearch from now on. See the OpenSearch FAQ for details: https://opensearch.org/faq
OpenSearch is a fork of open source Elasticsearch 7.10. As such, it provides backwards-compatible REST APIs for ingest, search, and management. The query syntax and responses are also the same. In addition, OpenSearch can use indices from Elasticsearch versions 6.0 up to 7.10. We also aim to support the existing Elasticsearch clients that work with Elasticsearch 7.10.
I am unaware of any functional differences between OpenSearch and Elasticsearch. Also, if a self-hosting organization is interested in implementing a different search capability, they are able to do that. They can change the search engine and rewrite mod-search to their taste, at which point they have the ability to hack the indexes any way they want. FOLIO's mod-search module creates and manages indexes.
Spitfire is the team that owns this module, and Christine and I are the POs for this team. The README file provides a good overview of the OpenSearch functionality we use in FOLIO: https://github.com/folio-org/mod-search/blob/master/README.md. If you feel there are areas that are unclear or need more details, please create a Jira ticket and assign it to the development team (Spitfire).
Notes from discussion:
- Khalilah Gambrell: For search, the team that manages mod-search is responsible for creating new indexes. Khalilah and Christine will work on the requirements and work with the team to decide on and implement changes. NB: a fuller response will be provided in writing after the meeting via Slack and the meeting notes.
- Is this all a topic for a future MM SIG meeting? The MM SIG can decide after the responses are posted.
- Suggestion: How does truncation work, and how can wildcards be used? Maybe keep a list of issues that need to be documented in GitHub, since Tips and Tricks is just for getting started. User documentation is still to come.
- Khalilah suggests creating a wiki page with all these questions to start with, then using it to decide whether to address some topics at MM meetings (like MARC+implementers). Felix will follow up. |
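Illustration of the matching modes discussed under Searching in Inventory: the minimal Python sketch below contrasts "contains any", exact phrase, and "begins with" matching, all case-insensitive and with no stemming or fuzzy logic. It is only an illustration of the expected behavior. It does not use mod-search or OpenSearch, and the function names and sample titles are hypothetical.

```python
import re

def tokenize(text):
    """Lowercase and split into alphanumeric terms, so matching is case-insensitive."""
    return re.findall(r"[a-z0-9]+", text.lower())

def contains_any(query, value):
    """True if at least one query term occurs anywhere in the value."""
    return bool(set(tokenize(query)) & set(tokenize(value)))

def contains_phrase(query, value):
    """True if the query terms occur adjacent and in order (no stemming, no fuzziness)."""
    q, v = tokenize(query), tokenize(value)
    if not q:
        return False
    return any(v[i:i + len(q)] == q for i in range(len(v) - len(q) + 1))

def begins_with(query, value):
    """True if the value starts with the query terms (left-anchored match)."""
    q, v = tokenize(query), tokenize(value)
    return bool(q) and v[:len(q)] == q

# Hypothetical sample titles, chosen only to show the differences.
TITLES = [
    "National Geographic",
    "The National Geographic magazine",
    "Geographic information systems",
    "National parks of Canada",
]

def search(mode, query):
    """Return the sample titles that match the query under the given mode."""
    return [title for title in TITLES if mode(query, title)]

if __name__ == "__main__":
    for mode in (contains_any, contains_phrase, begins_with):
        print(mode.__name__, search(mode, "national geographic"))
```

With these sample titles, "contains any" returns all four titles, the phrase match returns only the two titles containing "National Geographic", and "begins with" returns only the title that starts with it. This mirrors the concern in the notes that a "contains any" default produces noise for known-item searches.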