CLONE - Story: LCCN Normalization Back End | BF Thin Thread
Description
Environment
None
Potential Workaround
None
Attachments
9
clones
Doug Loynes February 5, 2025 at 12:23 PM
Cloned wrong card
Declined
Details
Assignee

Reporter

Labels
Priority
Story Points
1
Development Team
Citation
Created February 5, 2025 at 12:18 PM
Updated March 4, 2025 at 7:51 PM
Resolved February 5, 2025 at 12:23 PM
The Library of Congress Control Number (LCCN) is a key identifier used to uniquely identify book titles in the collection of the Library of Congress. LCCNs are also used to uniquely identify authority records, which catalogers rely on to create comprehensive and definitive bibliographic descriptions of works in the collection.
LC sets specific normalization rules that need to be incorporated into FOLIO to enforce validation of resource descriptions at the Work and Instance levels.
In Scope
LC has specific rules for LCCNs based on the year that the item was acquired. Items acquired between 1898 and 2000 follow one set of rules (referred to as "Structure A"), and items acquired since 2001 follow another set of rules (referred to as "Structure B"). Assume "Structure B" rules for this card.
LCCN Structure B
Name of Element | Number of characters | Character position in field | Character type
Alphabetic Prefix | 2 | 00-01 | Alpha or blank*
Year | 4 | 02-05 | Digit
Serial Number | 6 | 06-11 | Digit
*NOTE: The alphabetic prefix (character positions 00 and 01) will be present only for an LCCN applied to an authority record. When an LCCN is applied to a bibliographic resource, positions 00 and 01 are blank. For this card, assume LCCN identifiers for bibliographic resources only. LCCN identifiers for authority resources will be addressed later.
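The positional layout in the table can be sketched as a small parser. This is a minimal illustration, not FOLIO code: the names `StructureB`, `STRUCTURE_B`, and `split_structure_b` are hypothetical, and accepting both upper- and lower-case letters in the prefix is an assumption, since the card only says "Alpha or blank".

```python
import re
from typing import NamedTuple

# Positions 00-01: alpha or blank; 02-05: year digits; 06-11: serial digits.
# Allowing either letter case in the prefix is an assumption.
STRUCTURE_B = re.compile(
    r"(?P<prefix>[A-Za-z ]{2})(?P<year>[0-9]{4})(?P<serial>[0-9]{6})"
)


class StructureB(NamedTuple):
    prefix: str  # blank for bibliographic resources
    year: str
    serial: str


def split_structure_b(field: str) -> StructureB:
    """Split a 12-character Structure B value by character position."""
    m = STRUCTURE_B.fullmatch(field)
    if m is None:
        raise ValueError(f"not a Structure B value: {field!r}")
    return StructureB(m["prefix"], m["year"], m["serial"])
```

For a bibliographic LCCN the prefix positions are blank, so `split_structure_b("  2017000002")` yields a blank prefix, year `2017`, and serial number `000002`.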
LCCN Normalization
From the Library of Congress online documentation for the MARC standard:
https://www.loc.gov/marc/lccn-namespace.html#normalization
The value present in the LCCN field will be captured and retained as part of the resource description. Indexing of the LCCN field will adhere to the normalization rules laid out below:
1. BLANK(S)
Where the value in the LCCN field contains blanks, remove the blanks for indexing the field
" 2017000002" --> "2017000002"
"2018 666666" --> "2018666666"
"2019123456 " --> "2019123456"
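A minimal sketch of the blank rule, assuming "blank" means the ASCII space character only (tabs and other whitespace are not addressed by the card). The function name `remove_blanks` is hypothetical.

```python
def remove_blanks(value: str) -> str:
    """Rule 1: drop every space character, wherever it occurs in the value."""
    return value.replace(" ", "")
```

This covers leading, embedded, and trailing blanks alike, which matches the three examples above.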
2. FORWARD SLASH
Where the value in the LCCN field contains a forward slash ('/'), remove the slash and all characters to the right for indexing the field
"2012425165//r75" --> "2012425165"
"2022139101/AC/r932" --> "2022139101"
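The forward slash rule could be sketched as follows. The function name `strip_from_slash` is hypothetical; the sketch keeps only the text before the first slash, which is how the examples above read.

```python
def strip_from_slash(value: str) -> str:
    """Rule 2: keep only the text before the first forward slash."""
    # split with maxsplit=1 returns the whole value when no slash is present
    return value.split("/", 1)[0]
```

A value without a slash passes through unchanged. A slash in an unexpected position truncates the value early, so `strip_from_slash("20/18123456")` returns `"20"`, which later fails validation.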
3. HYPHEN
Where the fifth character in the LCCN field is a hyphen, remove the hyphen for indexing the field
"2022-890351" --> "2022890351"
"2001-000002" --> "2001000002"
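A minimal sketch of the hyphen rule, assuming a hyphen anywhere other than the fifth character is left in place so that validation can reject the value afterwards. The function name `remove_fifth_char_hyphen` is hypothetical.

```python
def remove_fifth_char_hyphen(value: str) -> str:
    """Rule 3: drop the hyphen only when it is the fifth character."""
    if len(value) >= 5 and value[4] == "-":
        return value[:4] + value[5:]
    return value  # hyphen absent or misplaced: leave the value untouched
```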
4. SERIAL NUMBER
The serial number component of the LCCN is demarcated by a hyphen (which should be present as the fifth character in the field value). Further, the serial number component of the LCCN must be six digits in length. Where the serial number component has fewer than six digits, remove the hyphen and left fill with zeros so that there are six digits in the serial number component.
"2011-89035" --> "2011089035"
"2020-2 " --> "2020000002"
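The serial number rule can be sketched like this, assuming blanks have already been removed from the value (the second example above carries a trailing blank). The function name `zero_fill_serial` is hypothetical.

```python
def zero_fill_serial(value: str) -> str:
    """Rule 4: remove the fifth-character hyphen and left-fill the serial number."""
    year, sep, serial = value.partition("-")
    if sep != "-" or len(year) != 4:
        return value  # no hyphen in the fifth position: nothing to fill
    # rjust pads short serial numbers and leaves longer ones as they are
    return year + serial.rjust(6, "0")
```

A serial number that is already six digits long is unchanged apart from losing the hyphen. One that is longer than six digits is not truncated, so the result has more than ten characters and can be rejected by validation.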
Out of Scope
This card focuses on validation of the LCCN component of a bibliographic description.
Validation of an entire description is outside scope.
An LCCN assigned to an authority record is outside scope (e.g. the LCCN has leading alpha characters)
Use Cases
1. Happy path
GIVEN a value in the LCCN field
AND applying the indexing normalization rules results in a value that conforms to the Structure B pattern
THEN the value of the LCCN indexed on the back end meets cataloging conditions
AND the value is added to the index
2. Error
GIVEN a value in the LCCN field
AND applying the indexing normalization rules results in a value that doesn't conform to the Structure B pattern (see examples below)
THEN the value of the LCCN doesn't meet cataloging conditions
AND the indexing routine "skips" over the LCCN field
AND each "skip over" for the LCCN field is recorded as an exception in the logs
examples:
* the input LCCN includes special characters (@, #, $) that aren't accounted for in Structure B
* the serial number component of the LCCN contains more than 6 digits
* the input LCCN contains more than one hyphen
* the input LCCN is more than 10 characters in length
* the input LCCN is fewer than 10 characters in length and has no hyphen to trigger backfilling with zeros (e.g. 202212345)
* the input LCCN contains a hyphen within the first four characters of the entry (e.g. 200-1234567)
* the input LCCN contains a hyphen after the fifth character of the entry (e.g. 20011-2345467; 200112-3456)
* the input LCCN is malformed after applying the forward slash rule (e.g. 20/18123456 normalizes to 20, not to 2018123456)
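Both use cases can be sketched end to end as below. This is an illustration under stated assumptions, not the FOLIO implementation: the function names, the logger name `lccn_indexing`, and the choice of a warning-level log entry for each skip are all hypothetical. The sketch also assumes the 10-character check applies to the value after normalization, since valid inputs such as "2022-890351" are longer than 10 characters before the rules are applied.

```python
import logging
import re
from typing import Optional

logger = logging.getLogger("lccn_indexing")  # hypothetical logger name

# Bibliographic Structure B after normalization: 4-digit year + 6-digit serial.
NORMALIZED_STRUCTURE_B = re.compile(r"[0-9]{10}")


def normalize_lccn(raw: str) -> str:
    """Apply the four normalization rules in the order the card lists them."""
    value = raw.replace(" ", "")    # 1. blanks
    value = value.split("/", 1)[0]  # 2. forward slash and everything after it
    if len(value) >= 5 and value[4] == "-":
        # 3 and 4. drop the fifth-character hyphen, left-fill the serial number
        value = value[:4] + value[5:].rjust(6, "0")
    return value


def index_value_for(raw: str) -> Optional[str]:
    """Return the value to index, or None when the LCCN field is skipped."""
    normalized = normalize_lccn(raw)
    if NORMALIZED_STRUCTURE_B.fullmatch(normalized) is None:
        # Error use case: record the skip as an exception in the logs.
        logger.warning(
            "Skipping LCCN %r: normalized value %r does not match Structure B",
            raw,
            normalized,
        )
        return None
    return normalized
```

Validating the normalized value against a single pattern handles every error example listed above without a separate check per case: a misplaced or repeated hyphen, a special character, or a wrong length all leave something other than exactly ten digits. An authority LCCN with an alphabetic prefix is also skipped by this sketch, consistent with it being out of scope for this card.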