Authority Management - current practice


The text below presents drafts and overviews of authority management practices at the institutions represented in this group.


Chicago


Cornell

Current Workflow for Importing/Indexing Authorities Into Blacklight index and Acting on Changes/Deletes in FOLIO

  1. Authority File is maintained locally by CUL-IT.
  2. CUL-IT loads new, updated and deleted headings weekly from Peter Ward (authority
    record vendor) into the authority file and generates a JSON report. Peter Ward files are
    split into Names (LCNAF) and Subjects (LCSH, CSH, LCGFT).
  3. JSON report is transformed into an Excel spreadsheet using OpenRefine. Spreadsheet
    includes:
    1. old (changed) heading
    2. new (updated) heading(s)
    3. variant headings from the authority record (4XX) that exist in our bibliographic
      records
    4. reference links to id.loc.gov authority records
    5. instance count (number of CUL bib records affected)
    6. Links to affected bib records in Blacklight
    7. Link to FOLIO UUIDs from SOLR index for affected records
    8. filterable fields to assist with maintenance (undifferentiated, diacritic change,
      etc.)
  4. Authorities Metadata Librarian uses the spreadsheet to perform maintenance:
    1. Changes—More than 10 records
      1. Confirm that all of them are likely to be the same author by clicking on
        the Blacklight link and scrolling through results. Results are especially
        likely to be correct if they involve closing a date.
      2. Batch edit by using the UUIDs from SOLR index link in spreadsheet to pull
        MARC records from data export. Load the records into MarcEdit, then
        batch change the heading. Once corrected, import the MARC file back
        into the system using data import.
    2. Changes—Less than 10 records
      1. Click on the Blacklight link and change the headings manually in FOLIO.
        (Blacklight links directly to the FOLIO record if the cataloger is logged in.)
    3. Deletes
      1. Retrieve deleted heading from id.loc.gov link
      2. Use Blacklight link to retrieve affected bib records
      3. Is the deleted heading undifferentiated? If so, match bib records to
        appropriate authority records and update headings in FOLIO. If not,
        retrieve replacing authority record by searching 010 $z and update
        heading in FOLIO
  5. Regardless of the number of results, if a heading is not a clear match with the updated
    heading in the report, extra research in OCLC is done to confirm the match and update as
    appropriate.
  6. Headings that need additional attention with the current workflow
    1. Partial matches that do not match the updated heading but are still incorrect
      and need maintenance
    2. Undifferentiated names
    3. Partial matches of established names (e.g. “Williams, Richard” is a partial
      match of “Williams, Richard, $d 1780-1844”, but is also an established
      heading of its own)
    4. Matches on initials or pseudonyms that appear in updated authority
      records’ 4XX (reference tracings)
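
The routing rules in step 4 can be sketched in a few lines of Python. This is only an illustration of the decision logic, not a script CUL runs: the ReportRow field names and the route labels are hypothetical, and treating exactly 10 records as a manual edit is an assumption, since the workflow only distinguishes "more than 10" from "less than 10".

```python
from dataclasses import dataclass

# From the workflow above: more than 10 affected records means a batch edit.
BATCH_THRESHOLD = 10


@dataclass
class ReportRow:
    """One spreadsheet row; field names are hypothetical, not the real columns."""
    old_heading: str
    new_heading: str
    instance_count: int            # number of CUL bib records affected
    deleted: bool = False          # True when the authority record was deleted
    undifferentiated: bool = False


def route(row: ReportRow) -> str:
    """Return a label for the maintenance path a row would follow."""
    if row.deleted:
        # Undifferentiated deletes need bib-by-bib matching; other deletes
        # are traced to the replacing record through 010 $z.
        return "delete-rematch" if row.undifferentiated else "delete-010z"
    if row.instance_count > BATCH_THRESHOLD:
        return "batch-edit"        # data export, MarcEdit, data import
    # ASSUMPTION: exactly 10 records is handled manually.
    return "manual-edit"           # edit headings directly in FOLIO


# Invented sample rows, for illustration only.
rows = [
    ReportRow("Doe, Jane, 1900-", "Doe, Jane, 1900-1980", 42),
    ReportRow("Roe, Richard", "Roe, Richard, 1850-1910", 3),
    ReportRow("Poe, Pat", "", 5, deleted=True),
]
for row in rows:
    print(row.old_heading, "->", route(row))
```

Whatever the route, step 5 still applies: a heading that is not a clear match gets extra research in OCLC before it is changed.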

Potential Changes to Workflow Based on FOLIO Development

  1. Morning Glory and Nolana releases will include:
    1. Storing & deleting authority files in FOLIO
    2. Loading and updating authority records via data import
    3. Searching, browsing, and filtering authority records in FOLIO
    4. Some browse functionality by subject and contributor for inventory records
    5. The ability to populate LCCN identifiers to inventory records in the $0 from
      authority records
    6. Linked and authority controlled fields in bib records
  2. Future developments in FOLIO may affect our authorities workflow in these ways:
    1. We may utilize LCCN identifiers in the $0 for batch changes and/or linked data
      projects
    2. The “Source of Truth” for CUL’s maintained authority file may change from being
      external to FOLIO to internal
    3. Authority validation will likely be included in a future FOLIO version, and it will
      probably work differently than Voyager authority validation (identifier-based
      rather than string-match-based)
    4. The ability to browse inventory records may eliminate some reliance on
      Blacklight, but not while our authority file is maintained outside of FOLIO
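
To make the identifier versus string match distinction concrete, here is a minimal sketch. It does not describe how FOLIO or Voyager actually implement validation; the in-memory authority file, the LCCN, and the headings are all invented for illustration.

```python
# Hypothetical authority file keyed by LCCN; the LCCN and headings are
# invented for illustration.
AUTHORITY_FILE = {"n00000001": "Doe, Jane, 1900-1980"}


def valid_by_string(heading: str) -> bool:
    """String matching: valid only if the text equals an authorized heading."""
    return heading in AUTHORITY_FILE.values()


def valid_by_identifier(lccn: str) -> bool:
    """Identifier matching: valid if the $0 value points at an authority record."""
    return lccn in AUTHORITY_FILE


# A bib heading written before the date was closed, carrying its LCCN in $0.
bib_heading, bib_lccn = "Doe, Jane, 1900-", "n00000001"

print(valid_by_string(bib_heading))     # False
print(valid_by_identifier(bib_lccn))    # True
print(AUTHORITY_FILE[bib_lccn])         # Doe, Jane, 1900-1980
```

With string matching, the out-of-date heading simply fails validation. With an identifier in the $0, the same bib field still resolves to its authority record, and the current form of the heading can be read from that record.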


Past Workflow for Importing Authorities Into Voyager and Acting on Changes/Deletes:

[This set of steps was taken from a description here: https://confluence.cornell.edu/x/AkGJFg]

  • Connect via FTP to Peter Ward's server. 
    • There are two directories: one for Name Authority files, unname.[yy.week no.].mrc, and one for Subject Authority files, unsub.[yy.week no.].mrc.
    • LStools has separate import scripts for both. 
  • The bulk import profile, AUTHMERG, adds new authority records and merges existing ones. It also outputs a file of authority records to be deleted. This file is sent in an email, which is printed out so that students can delete the name authority records.
  • Subject authority deletes are handled by the Authority coordinator or designated backup, an original cataloger.
  • Then connect to the Voyager production server via a secure shell client (currently MobaXterm) and, from the command line, run ghc11 and ghc12 (e.g. $ ./ghc11). These index the authority records so that flips can be performed within the Global Headings Queue in Voyager, using a combination of manual work and Gary Strawn’s tool. Local documentation for Strawn's tool: https://confluence.cornell.edu/download/attachments/378093826/GARY%20STRAWN.doc?api=v2

More detailed information about different aspects of these workflows can be found here: https://confluence.cornell.edu/x/l4PYEg


Duke

The following is an attempt to capture the authority service processes for the Duke libraries. Although we have several technical services teams across Duke University (DUL, Rubenstein, Ford Business Library, Medical Center Library, and Goodson Law Library), the authority service is administered centrally, in the DUL Technical Services Division. The functional owner for the authority service is Rich Murray; the technical owner is Jacquie Samples.

History
The Technical Services staff members who create original bibliographic descriptions use the Connexion client and control headings there rather than working in our ILS, Aleph. While the Aleph system does have an authority module, Duke’s authority control was outsourced to the company LTI by at least 2004, when Aleph was implemented.

In 2018, LTI’s service came under scrutiny at Duke due to changes in staffing and retirements. As we began to investigate other authority services, LTI coincidentally announced that it was going out of business in 2019. After a thorough analysis of alternative services, Duke decided to implement MARCIVE’s offering.


LD4P2 Related Project (Section updated on June 25, 2020)

As part of LD4P2, catalogers at Duke University Libraries embarked upon a Retrospective Authority Control pilot project using Unmatched Headings reports from MARCIVE, our authority service vendor. This project involved examining personal name headings that appear more than once in our catalog and that MARCIVE was unable to match automatically to a Name Authority Record, and addressing them by:

  1. creating a new NAR, or
  2. modifying an existing NAR because the form in our catalog justified a cross-reference that would make it match, or
  3. correcting an error in the heading in our catalog to make it match the NAR (i.e., the form in our catalog was too far from the AAP for an auto-match, but did not justify changing the NAR)

A group of 7 catalogers were asked to work at least 18 hours on this project over the course of two months (March & April). All NARs created were coded Preliminary in MARC 008/33 (Auth status) because catalogers did not have pieces in-hand. We were able to resolve 768 previously unmatched headings. During the project we created 424 new NARs and modified 11 NARs. Duke Libraries is currently evaluating the success of the pilot project and considering how we might integrate this type of work into our workflows.

Authority Control Vendor Services (ACVS) at Duke

As of January 2019

We decided to use MARCIVE’s services to embed $0 data from several thesauri and to update pre-RDA records to hybrid records by adding the 33X fields to records that lack them.

Process timeline

  1. A copy of our bibliographic database is saved at MARCIVE, called the basefile.
  2. Files of updates are sent to MARCIVE weekly (OAS file):
    1. Extracts are run Thursday nights; file should be sent to MARCIVE on Friday mornings.
    2. File is located on L:\Departments\Aleph\authority_control
    3. File name = auth_wkly_ext_yyyymmdd.mrc
    4. Open with MarcEdit to retrieve number of records (the count is emailed to MARCIVE along with notification of file posting).
    5. Change file name to DUKU[MMDD]BIB.mrc
    6. Open a second File Explorer window, and enter MARCIVE URL ftp://ftp.marcive.com
    7. Copy the OAS file into the input/authority folder.
    8. Send a notification email to production@marcive.com with the file name and the record count.
  3. Weekly OAS extracts have the following parameters:
    1. A Perl script calculates today’s date and last week’s date in order to capture all records in the DUK01 Z00 table that have a DUK01 Z106_UPDATE_DATE between those dates. 
      1. Format Z00_DATA (repeated for each tag until Z00_DATA_LEN is reached):

        Index From    Index To        Description
        0             3               Field Length
        4             6               Field Tag
        7             8               Spaces
        9             9               “L” constant for each Field
        10            Field Length    Field
        17            17              Encoding Level

    2. Exclude selected records: 
      1. Exclude if Tag is STA and Field is (SUPPRESSED, DELETED, CRASH, or CIRC-CREATED)
      2. Exclude if Tag is DEL
      3. Exclude if Tag is (245, 590 or 500) and Field is TEMPORARY RECORD
      4. Exclude if Tag is 904 AND Field is:
        • Serials Solutions supplemental brief
        • Serials Solutions original
        • Bowker Global Books in Print
        • Vendor-generated brief
        • Temporary record
  4. Weekly, Load ACVS-Processed Bibliographic Records
    1. For records processed by MARCIVE, the vendor sends an email notification with links to their FTP site. Copy the records into L:\Departments\AuthorityControlService\Data.
    2. Open bibliographic records file with MarcEdit and check the record structure by using Tools --> Validate MARC Records --> Validate Record Structure. Click "Save" when finished.
    3. In order to eliminate trailing spaces which may affect Aleph batch loading, recompile the records into a fresh MARC file using File --> Compile File into MARC.
    4. Copy the finalized MARC file to L:\Departments\Aleph\FilesToMove.
    5. In the Aleph Cataloging module, go to Services --> Load Catalog Records --> Record Loading Super Service (custom-17)
    6. In the Batch Log, check the Log File for the service routine and verify all records matched successfully.
    7. Go to Services --> Load Catalog Records --> Apply Authority Update (custom-26). For the "Input File Name," enter the name of the file without any extension; for example, "dukubib134.match" should be entered as dukubib134. For the "Period Beginning Date," enter the day after the extract script ran. Set the "Runtime" to a later hour in the evening such as "18" for 6:00 pm. The ACVS updates are now set to load after business hours and will exclude any records updated by staff in the time between the extract and the load routine.
    8. In addition to the updated bibliographic and authority records, the OAS process produces several reports:
      1. Changed Headings, which shows both the old and new forms of the terms (X00, X10, X11, 240/X30, 490, 800, 811, 830, 650, 651, split terms, and modified Genre/Form)
      2. Multiple Matches or High Probability Matches (these have not been updated automatically)
      3. Unrecognized or Invalid Terms (this report is also split by the categories list for the changed headings report)
      4. Undifferentiated Terms (this report shows terms that matched to an undifferentiated record during authorities processing)
      5. Pre-processing Changes (this report shows data modifications that occur prior to authorities processing, not including changes to the leader, 040, or 9XX fields. Types of changes include subfield code fixes, addition of 33X and 34X data, punctuation error fixes, etc.)
      6. Statistical summary of records processed, actions taken, and/or reports generated
  5. Quarterly, the comprehensive notification service (CNS) requires that a list of deleted record numbers be sent to MARCIVE
    1. Extract a list of record numbers deleted since the last CNS
    2. Load records as for OAS process
    3. The CNS process includes a report of new authority headings.
  6. Annually, update the basefile at MARCIVE by sending a new complete extract from Aleph.
    1. Exclude selected records: 
      • Exclude if Tag is STA and Field is (SUPPRESSED, DELETED, CRASH, or CIRC-CREATED)
      • Exclude if Tag is DEL
      • Exclude if Tag is (245, 590 or 500) and Field is TEMPORARY RECORD
      • Exclude if Tag is 904 AND Field is:
        • Serials Solutions supplemental brief
        • Serials Solutions original
        • Bowker Global Books in Print
        • Vendor-generated brief
        • Temporary record
    2. Load updated bibliographic records
    3. Load new and updated authority records (for terms used in Duke bibliographic records)
    4. In addition to updated bibliographic and authority records, the annual basefile replacement process produces several reports
      1. Changed Headings, which shows both the old and new forms of the terms (X00, X10, X11, 240/X30, 490, 800, 811, 830, 650, 651, split terms, and modified Genre/Form)
      2. Multiple Matches or High Probability Matches (these have not been updated automatically)
      3. Unrecognized or Invalid Terms (this report is also split by the categories list for the changed headings report)
      4. Undifferentiated Terms (this report shows terms that matched to an undifferentiated record during authorities processing)
      5. Pre-processing Changes (this report shows data modifications that occur prior to authorities processing, not including changes to the leader, 040, or 9XX fields. Types of changes include subfield code fixes, addition of 33X and 34X data, punctuation error fixes, etc.)
      6. Statistical summary of records processed, actions taken, and/or reports generated
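
The weekly extract parameters in step 3 (date window, Z00_DATA layout, exclusion rules) can be sketched in Python as follows. This is not the Perl script Duke runs; it is a minimal illustration under stated assumptions: "last week" is taken to mean seven days back, the 4-digit field length is assumed to count everything that follows it (tag, spaces, "L", and content), and field values are compared by case-insensitive substring. All function names are hypothetical.

```python
from datetime import date, timedelta

# Values copied from the exclusion rules in the list above.
STA_EXCLUDED = {"SUPPRESSED", "DELETED", "CRASH", "CIRC-CREATED"}
TEMP_TAGS = {"245", "590", "500"}
EXCLUDED_904 = {
    "Serials Solutions supplemental brief",
    "Serials Solutions original",
    "Bowker Global Books in Print",
    "Vendor-generated brief",
    "Temporary record",
}


def extract_window(today: date) -> tuple[date, date]:
    """Z106_UPDATE_DATE range. ASSUMPTION: 'last week' means 7 days back."""
    return today - timedelta(days=7), today


def parse_z00_data(data: str) -> list[tuple[str, str]]:
    """Split a Z00_DATA string into (tag, content) pairs."""
    fields = []
    pos = 0
    while pos + 10 <= len(data):
        length = int(data[pos:pos + 4])   # offsets 0-3: field length
        if length < 6:                    # too short to hold tag + spaces + "L"
            break
        tag = data[pos + 4:pos + 7]       # offsets 4-6: field tag
        # Offsets 7-8 are spaces and offset 9 is the constant "L".
        # ASSUMPTION: length counts every character after the 4-digit prefix.
        end = pos + 4 + length
        fields.append((tag, data[pos + 10:end]))
        pos = end
    return fields


def _mentions(content: str, terms: set[str]) -> bool:
    # ASSUMPTION: case-insensitive substring match, so subfield codes
    # around the text do not prevent a hit.
    lowered = content.lower()
    return any(term.lower() in lowered for term in terms)


def is_excluded(fields: list[tuple[str, str]]) -> bool:
    """Apply the 'Exclude selected records' rules to one parsed record."""
    for tag, content in fields:
        if tag == "DEL":
            return True
        if tag == "STA" and _mentions(content, STA_EXCLUDED):
            return True
        if tag in TEMP_TAGS and _mentions(content, {"TEMPORARY RECORD"}):
            return True
        if tag == "904" and _mentions(content, EXCLUDED_904):
            return True
    return False
```

Substring matching is deliberately loose and could also exclude a real title that happens to contain the words "temporary record"; an exact comparison on the relevant subfield would be stricter if the data allows it.
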

Lehigh

The following is a first attempt at understanding how headings are operating within Lehigh’s online library catalog, ASA <library.lehigh.edu>. Both qualitative and quantitative analysis would be the best way to discern whether the absence of authority control in our catalog for the past six years has been detrimental to discovery for our searchers.


History:

Lehigh University’s Technical Services staff performed authority control on its database until 2014, when the library migrated its integrated library system from SirsiDynix to OLE. While the SirsiDynix system did have an authority module, OLE did not upon implementation and does not to this day. In addition to using some aspects of SirsiDynix’s authority control module, Lehigh Libraries outsourced authority control to the company LTI.

Plans were made to continue with LTI; however, once the migration to OLE took place, other work took precedence and authority control was put aside. Around 2016 it became apparent that OLE was not going to have a future. Implementing an authority workflow based on OLE was therefore not desirable: we had already lost two years, and those workflows would have to change again when we migrated to a new system.

Admittedly, hopes for a more sophisticated authority control process in a new system also weighed into the decision to delay action in this area, and a new system continues to offer a hopeful alternative.

As stated above, a thorough study is desirable to measure the impact that the lack of authority control in the catalog has on our searchers. For the moment, here is some information I hope is helpful:


Ways we do try to assert control:

  • Authority control is performed on special collections print materials because they are cataloged individually as opposed to in batch.
  • While “format” is not a heading, nor what one usually has in mind when thinking of entities, I do want to note here that close attention has been paid to the way a record’s format appears in ASA. There is a lot of room for improvement, and we hope to drive format representation in VuFind from the 33X fields as much as possible. In the meantime, cataloging standards at Lehigh always demand that an item’s format be represented correctly in our catalog.
  • Whenever possible vendor records are not used. OCLC records are preferred over vendor records, in part, because headings coming from OCLC records are more likely to be under authority control. The only vendor records imported into our catalog are ones for electronic government documents from MARCIVE.


Some examples:

Using Google Analytics, I looked at searches performed in Lehigh Library’s catalog, ASA, in March of 2020 and found a few examples that give evidence of the lack of authority control in our catalog:


Author searches:

Shakespeare, William, 1564-1616

Heading in Catalog                 Number of records with this heading
Shakespeare, William, 1564-1616    1,113
Shakespeare, William               96

Heidegger, Martin, 1889-1976

Heading in Catalog              Number of records
Heidegger, Martin, 1889-1976    105
Heidegger, Martin               8

Savage, John, 1964- (Lehigh author)

Heading in Catalog     Number of records
Savage, John, 1964-    5

Lebovic, Nitzan, 1970- (Lehigh author)

Heading in Catalog        Number of records
Lebovic, Nitzan, 1970-    7
Lebovic, Nitzan,          1
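
A comparison like the one in the tables above could be scripted once heading counts are exported from the catalog. The sketch below is one possible approach, not something Lehigh currently runs: it groups headings that differ only by a trailing date or punctuation and reports names that appear in more than one form. The function names are hypothetical and the counts are the ones listed above.

```python
import re
from collections import defaultdict

# Counts taken from the tables above.
COUNTS = {
    "Shakespeare, William, 1564-1616": 1113,
    "Shakespeare, William": 96,
    "Heidegger, Martin, 1889-1976": 105,
    "Heidegger, Martin": 8,
    "Savage, John, 1964-": 5,
    "Lebovic, Nitzan, 1970-": 7,
    "Lebovic, Nitzan,": 1,
}


def name_key(heading: str) -> str:
    """Drop a trailing date element and punctuation, then lowercase."""
    without_dates = re.sub(r",?\s*\d{4}-(\d{4})?\s*$", "", heading.strip())
    return without_dates.rstrip(" ,.").lower()


def split_headings(counts: dict[str, int]) -> dict[str, dict[str, int]]:
    """Return only the names that occur in more than one heading form."""
    groups: dict[str, dict[str, int]] = defaultdict(dict)
    for heading, n in counts.items():
        groups[name_key(heading)][heading] = n
    return {key: forms for key, forms in groups.items() if len(forms) > 1}


for key, forms in split_headings(COUNTS).items():
    total = sum(forms.values())
    for heading, n in forms.items():
        print(f"{key}: {heading!r} {n} of {total} records")
```

Grouping on the name alone only produces candidates for review. Two different people can share an undated name, so a match here is not proof that the headings belong together.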


Next steps: Look at subject searches performed and see how authority control would change the user experience. Look at keyword searches and see how the lack of authority control affects results in facets.

Example Subject Search                  Number of results
greek letter societies                  103
spiritualism                            1,155
Federal aid to rural health services    22


University of Pennsylvania

When Penn used Voyager, we did batch corrections and updates. In Alma (our current system), heading maintenance is mostly done on Alma's end: Alma runs daily jobs to link and update headings in the catalog. We usually handle issues on a case-by-case basis as they are brought to our attention, doing the authority work in OCLC.

  • Alma has its own job to check and link Alma headings to the Alma authority file. There is an Authority Control Task List menu in Alma (under Resources).
    • Bib heading changes are not a part of our regular workflows.

Project: RDA update for bibs and authority headings.