STORY: Punctuation normalization - Special treatment of trailing periods
Description
Environment
None
Potential Workaround
None
Attachments
10
- 14 Jan 2025, 09:57 AM (2 attachments)
- 14 Jan 2025, 09:48 AM (8 attachments)
relates to
Tetiana Kovalchuk January 14, 2025 at 10:16 AM
Tested on edev diku.
Build version: mod-linked-data-1.0.1-SNAPSHOT.6d47448
Test cases and evidence attached.
Done
Details
Assignee
Doug Loynes
Reporter
Doug Loynes
Labels
Priority
Story Points
1
Sprint
None
Development Team
Citation
Release
Sunflower (R1 2025)
TestRail: Cases
TestRail: Runs
Created January 3, 2025 at 6:13 PM
Updated last week
Resolved January 24, 2025 at 12:09 PM
This card extends the punctuation normalization rules covered in
https://folio-org.atlassian.net/browse/MODLD-619 and
https://folio-org.atlassian.net/browse/MODLD-634
and adds a rule affecting trailing periods.
Currently, the punctuation normalization rules state that, all else being equal, a trailing period should be stripped when it appears at the end of a subfield. However, there are exceptions to that rule, which this card covers.
When a punctuation normalization rule involves a period (.) and the preceding subfield has a trailing period, the value of that subfield must be inspected before the trailing period is stripped.
Specifically, the last three characters of the preceding subfield must be checked to see whether they match either of two patterns:
Pattern 1: <space><uppercase letter><period>, e.g. 100 $a Rees, Paul A.
Pattern 2: <period><uppercase letter><period>, e.g. 100 $a Kerrigan, Nancy W.K.
If either pattern is present, retain the period. Otherwise, strip the trailing period from the subfield.
Essentially, the pattern test checks whether the value of the subfield ends with initials (for example, in a person’s name). In that case the period is important for identification.
Other use cases, e.g. the trailing period after a date (2015.), do not fit either pattern, and the period should be removed from the value of the subfield.
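The three-character check described above could be sketched roughly as follows. This is only an illustration in Python, not the mod-linked-data implementation: the function names ends_with_initial and strip_trailing_period are hypothetical, the subfield value is assumed to be already trimmed of trailing whitespace, and "uppercase letter" is assumed to mean any character that Python's str.isupper() reports as uppercase.

```python
def ends_with_initial(value: str) -> bool:
    """Return True if the last three characters match Pattern 1 or Pattern 2.

    Pattern 1: <space><uppercase letter><period>
    Pattern 2: <period><uppercase letter><period>
    """
    if len(value) < 3:
        return False
    lead, letter, last = value[-3], value[-2], value[-1]
    return lead in (" ", ".") and letter.isupper() and last == "."


def strip_trailing_period(value: str) -> str:
    """Strip one trailing period unless the value ends with an initial.

    Hypothetical helper; assumes `value` has no trailing whitespace.
    """
    if not value.endswith("."):
        return value  # nothing to strip
    if ends_with_initial(value):
        return value  # period belongs to an initial, keep it
    return value[:-1]
```

With this sketch, the two name examples from the card keep their final period, while a date such as 2015. loses it. Values shorter than three characters are not addressed by the card; here they simply fall through to the default stripping rule.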