Batch Importer (Bib/Acq) (UXPROD-47)

[UXPROD-1038] ISBN normalization: Preparation, and Initial Development
Created: 23/Aug/18  Updated: 16/Sep/20  Resolved: 22/Jan/19

Status: Closed
Project: UX Product
Components: None
Affects versions: None
Fix versions: Q4 2018
Parent: Batch Importer (Bib/Acq)

Type: New Feature
Priority: P3
Reporter: Charlotte Whitt
Assignee: Ann-Marie Breaux (Inactive)
Resolution: Done
Votes: 0
Labels: epam-folijet, inventory, marccat, marcimport, orders, split
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original estimate: Not Specified

Attachments: ISBN procedures_in_MARCcat.docx (Microsoft Word), Skärmavbild 2018-08-23 kl. 17.54.09.png (PNG file)
Issue links:
Defines
defines UXPROD-47 Batch Importer (Bib/Acq) Analysis Complete
Relates
relates to ISBNUTIL-1 Design and implement a reusable (shar... Closed
relates to ISBNUTIL-4 SPIKE: review the Java Apache Commons... Closed
relates to UXPROD-1394 ISBN normalization: Refinement Closed
relates to ISBNUTIL-2 Parse and validate ISBN numbers befor... Closed
relates to ISBNUTIL-3 SPIKE: review the PHP implementation ... Closed
Epic Link: Batch Importer (Bib/Acq)
Analysis Estimate: Medium < 5 days
Analysis Estimator: Ann-Marie Breaux (Inactive)
Front End Estimator: Ann-Marie Breaux (Inactive)
Back End Estimate: Medium < 5 days
Back End Estimator: Ann-Marie Breaux (Inactive)
Estimation Notes and Assumptions: Analysis estimate: need to learn about ISBNs and analyze the AtCult codebase
Frontend: none, since there is no front-end component for this
Backend: will be reviewed by Folijet; most work will be done in UXPROD-1394, so may be pretty small here.
Development Team: Folijet
Rank: BNCF (MVP Feb 2020): R1
Rank: Chalmers (Impl Aut 2019): R1
Rank: Chicago (MVP Sum 2020): R1
Rank: Cornell (Full Sum 2021): R1
Rank: 5Colleges (Full Jul 2021): R2
Rank: GBV (MVP Sum 2020): R1
Rank: Lehigh (MVP Summer 2020): R2
Rank: TAMU (MVP Jan 2021): R2

 Description   

This feature (for Q4) covers the initial investigation and research into ISBNs, a review of the AtCult ISBN codebase, and the setup of an initial shared library with 3 functions: validating ISBNs, converting ISBN-10 to ISBN-13, and converting ISBN-13 to ISBN-10. The shared, central library will be accessible to all FOLIO apps for integration into their modules. Refinement will follow in Q1 2019 under UXPROD-1394 (Closed).
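The three functions named above could look roughly like the following. This is a minimal Python sketch for illustration only: the function names are hypothetical, the actual shared library is not shown in this ticket, and the code assumes the standard ISBN check-digit rules (modulo 11 for ISBN-10, modulo 10 with alternating weights 1 and 3 for ISBN-13).

```python
import re


def _compact(isbn: str) -> str:
    """Remove hyphens and whitespace; upper-case a trailing 'x'."""
    return re.sub(r"[\s-]", "", isbn).upper()


def isbn10_check_char(first9: str) -> str:
    """Check character for the first 9 digits of an ISBN-10 (weights 10..2, mod 11)."""
    total = sum((10 - i) * int(d) for i, d in enumerate(first9))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def isbn13_check_digit(first12: str) -> str:
    """Check digit for the first 12 digits of an ISBN-13 (weights 1,3,1,3,..., mod 10)."""
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def is_valid_isbn(isbn: str) -> bool:
    """True if the value is a well-formed ISBN-10 or ISBN-13 with a correct check digit."""
    s = _compact(isbn)
    if re.fullmatch(r"[0-9]{9}[0-9X]", s):
        return isbn10_check_char(s[:9]) == s[9]
    if re.fullmatch(r"[0-9]{13}", s):
        return isbn13_check_digit(s[:12]) == s[12]
    return False


def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert a valid ISBN-10 to ISBN-13: prefix 978, then recompute the check digit."""
    s = _compact(isbn10)
    if len(s) != 10 or not is_valid_isbn(s):
        raise ValueError(f"not a valid ISBN-10: {isbn10!r}")
    body = "978" + s[:9]
    return body + isbn13_check_digit(body)


def isbn13_to_isbn10(isbn13: str) -> str:
    """Convert a valid 978-prefixed ISBN-13 to ISBN-10 by recomputing the check character."""
    s = _compact(isbn13)
    if len(s) != 13 or not is_valid_isbn(s):
        raise ValueError(f"not a valid ISBN-13: {isbn13!r}")
    if not s.startswith("978"):
        raise ValueError("only 978-prefixed ISBN-13s have an ISBN-10 form")
    body = s[3:12]
    return body + isbn10_check_char(body)
```

With the example numbers from this ticket, isbn10_to_isbn13("9351453375") returns "9789351453376", and isbn13_to_isbn10("9789351453376") returns "9351453375".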

ISBN stands for International Standard Book Number (DS / ISO 2108). It is defined for use with books, volumes of annual publications, micro cards and microcomputer programs. When a material contains multiple ISBNs, each is entered in its own field in the instance record, or in its own MARC tag 020 in MARCcat.

An ISBN consists of 10 or 13 digits (the final check character of a 10-digit ISBN may be X):

  • A 10-digit ISBN (ISBN-10) is divided into 4 groups: country code, publisher number, title number, control digit.
  • A 13-digit ISBN (ISBN-13) is divided into 5 groups: prefix, country code, publisher number, title number, control digit.
    (Since 1 January 2007, all new ISBNs have 13 digits.)
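The group structure listed above can be illustrated with a small Python sketch that splits a hyphenated ISBN into its named parts. The IsbnGroups type and split_groups function are hypothetical names used only for this example. The sketch relies on the hyphens already being present, because the group lengths vary and cannot be recovered from a compacted number without the registration range tables.

```python
from typing import NamedTuple, Optional


class IsbnGroups(NamedTuple):
    prefix: Optional[str]      # only present in a 13-digit ISBN
    country_code: str
    publisher_number: str
    title_number: str
    control_digit: str


def split_groups(hyphenated: str) -> IsbnGroups:
    """Split a hyphenated ISBN into the 4 (ISBN-10) or 5 (ISBN-13) groups."""
    parts = hyphenated.strip().split("-")
    if len(parts) not in (4, 5):
        raise ValueError(f"expected 4 or 5 hyphen-separated groups: {hyphenated!r}")
    expected_length = 10 if len(parts) == 4 else 13
    if len("".join(parts)) != expected_length:
        raise ValueError(f"wrong number of characters for an ISBN: {hyphenated!r}")
    if len(parts) == 4:
        # ISBN-10 has no prefix group.
        return IsbnGroups(None, *parts)
    return IsbnGroups(*parts)
```

For example, split_groups("978-1-86100-451-2") yields prefix "978", country code "1", publisher number "86100", title number "451" and control digit "2".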

Examples:

  • 9351453375
  • 87-85207-17-3
  • 9789351453376
  • 978-1-86100-451-2

ISBNs are used in most of the FOLIO apps (Order, Check out, Check in, Inventory, MARCcat, Batch loader, etc.), and they need to be normalized so that the apps can handle them consistently, e.g. by stripping diacritics, spaces, etc.

ISBN-10 is often entered with hyphens or standard spacing between the four or five parts of the number
ISBN-13 is typically entered compacted, with no hyphens or spaces.

If a material contains both a 10-digit and a 13-digit ISBN, each number is entered in its own field in Instance or MARCcat.

Documentation:



 Comments   
Comment by Cate Boerema (Inactive) [ 12/Sep/18 ]

Hi Charlotte Whitt. This feature is missing an epic.

Comment by Charlotte Whitt [ 12/Sep/18 ]

Hi Cate Boerema - not really sure what epic this should be assigned to. It's a general feature for several apps. I have browsed through the list of epics - https://folio-org.atlassian.net/issues/?jql=issuetype%20%3D%20Epic.
The ones which maybe match the best would be:

None of them are really good.

Comment by Ann-Marie Breaux (Inactive) [ 12/Sep/18 ]

Hi Charlotte Whitt and Cate Boerema We're going to need this for Batch Loader - I'm happy to take it into that Epic if you like. I can add a note that we need to build it as a stand-alone component, so that it can be used by other apps as needed. Then I would just need to know what other Epics to link it to. Thanks

Comment by Cate Boerema (Inactive) [ 13/Sep/18 ]

Definitely no LIBAPP epics! The only epics that "matter" are in UXPROD (LIBAPP hasn't been used for a year or more). Ann-Marie Breaux, it would be great if you could take this into the Batch Loader epic. It may still make sense to develop it as a stand-alone component, but it does need a home epic. I will assign this to Batch Loader.

Thanks!

Comment by Ann-Marie Breaux (Inactive) [ 19/Sep/18 ]

Hi Cate Boerema: Assigned team should be EPAM-Folijet, if possible, to keep them separated from the other EPAM teams. That's also the label we're using to keep their work delineated in FOLIO. Thank you!

Comment by Cate Boerema (Inactive) [ 20/Sep/18 ]

Thanks, Ann-Marie Breaux. When I have the new EPAM teams' names, I'll create a ticket to have them added (and this one changed) in JIRA.

Comment by Ann-Marie Breaux (Inactive) [ 20/Sep/18 ]

Sounds perfect - thank you!

Comment by Ann-Marie Breaux (Inactive) [ 01/Oct/18 ]

Taras Spashchenko I'm checking with AtCult to see if they already have this functionality in place, and we might be able to borrow it:
Hi Mirko and Christian, Does AtCult already have some programming that normalizes ISBNs? We want to be able to recognize the 10-digit and 13-digit ISBNs as the same number. Also to normalize the 020 $a to remove any hyphens or punctuation, remove any text in parentheses, etc. For example 020 $a 978-019-283739-0 (paperback) would become 9780192837390 for matching and searching purposes. If AtCult already has that capability, could we get more information about it? If not, then we'll build it as part of the data import dev work, and I think it will be useful to several other apps.
From Christian: Hello ann-marie, I think we already have the logic of normalization as you ask. But to be 100% sure, I do checks and let you know as soon as possible.
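The 020 $a cleanup requested in the message above can be sketched in Python as follows. This is only an illustration of the stated requirement (drop parenthesized text, hyphens and other punctuation), not AtCult's or Folijet's actual code; the function name normalize_020a is hypothetical, and the sketch does not check whether the result has a valid check digit.

```python
import re
from typing import Optional

# A run of digits with optional single hyphens or spaces between them,
# optionally ending in the ISBN-10 check character X.
_ISBN_RUN = re.compile(r"[0-9](?:[- ]?[0-9])*(?:-?[Xx])?")


def normalize_020a(value: str) -> Optional[str]:
    """Return the compacted ISBN found in an 020 $a value, or None if there is none."""
    # Drop parenthesized qualifiers such as "(paperback)".
    without_qualifiers = re.sub(r"\([^)]*\)", "", value)
    match = _ISBN_RUN.search(without_qualifiers)
    if match is None:
        return None
    # Remove the separators inside the matched run and upper-case a trailing x.
    return re.sub(r"[- ]", "", match.group(0)).upper()
```

With the value from the message, normalize_020a("978-019-283739-0 (paperback)") returns "9780192837390".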

Comment by Taras Spashchenko [ 01/Oct/18 ]

Hello Ann-Marie Breaux,

Thank you for your help.

Comment by Tiziana Possemato [ 01/Oct/18 ]

Yes Christian and Ann-Marie, I can confirm that we already use something in WeCat (and so in MARCcat) to normalize ISBNs. Tomorrow we'll check with our colleagues to give you more detailed info.

Comment by Ann-Marie Breaux (Inactive) [ 01/Oct/18 ]

Thanks, Tiziana Possemato That will be really helpful. We need to be able to have a clean ISBN (no punctuation, no trailing text) for matching. And we also need the system to be able to recognize equivalent ISBN-10 and -13 as the same number for matching purposes.
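One possible way to treat equivalent ISBN-10 and ISBN-13 values as the same number for matching is to compare a shared key rather than the raw strings. The Python sketch below is an assumption about how this could be done, not the approach chosen by the team. It relies on the fact that an ISBN-13 derived from an ISBN-10 is the prefix 978, the first nine digits of the ISBN-10, and a new check digit, so both forms share the same 12 leading digits once the prefix is added. The names match_key and same_isbn are hypothetical, and check digits are deliberately not verified here.

```python
import re
from typing import Optional


def match_key(isbn: str) -> Optional[str]:
    """Return a 12-digit key shared by an ISBN-10 and its ISBN-13 equivalent."""
    s = re.sub(r"[\s-]", "", isbn).upper()
    if re.fullmatch(r"[0-9]{9}[0-9X]", s):
        # ISBN-10: add the 978 prefix and ignore the check character.
        return "978" + s[:9]
    if re.fullmatch(r"[0-9]{13}", s):
        # ISBN-13: ignore the check digit.
        return s[:12]
    return None


def same_isbn(a: str, b: str) -> bool:
    """True if both values are ISBN-shaped and share the same match key."""
    key_a, key_b = match_key(a), match_key(b)
    return key_a is not None and key_a == key_b
```

For example, same_isbn("9351453375", "9789351453376") is True, while same_isbn("9351453375", "978-1-86100-451-2") is False. A production version would presumably validate the check digits before matching.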

Comment by Ann-Marie Breaux (Inactive) [ 10/Oct/18 ]

Added doc with details from AtCult, and getting the code from them. AtCult will discuss with Folijet at an upcoming meeting.

Comment by Ann-Marie Breaux (Inactive) [ 05/Dec/18 ]

Splitting this into 2 features. This feature (in Q4) will accomplish the research and planning for a shared ISBN library, using AtCult's ISBN handling code as a starting point. UXPROD-1394 Closed (Q1 2019) will accomplish the implementation of the work planned in this feature.

Comment by Ann-Marie Breaux (Inactive) [ 05/Dec/18 ]

No estimates on this existing feature, so I added just now. Taras Spashchenko Please check the Analysis and Backend estimate at the top of this feature. If you think they should be different, please adjust, add a note in the Estimation Notes, and change the Estimator fields to you. Thank you!
