For License CSV export use quote mark (ASCII 34) as escape character rather than backslash (ASCII 92)

Description

When exporting licenses as CSV any quote characters inside a field should be escaped with an additional quote character (ASCII 34) rather than the backslash (ASCII 92).

i.e. text such as:

vgl. 7.6.8 "Zugelassen ist jedoch eine Verwendung der "Ariel Interlibrary Loan Software" zum Versand

should be escaped as:

vgl. 7.6.8 ""Zugelassen ist jedoch eine Verwendung der ""Ariel Interlibrary Loan Software"" zum Versand

Rather than

vgl. 7.6.8 \"Zugelassen ist jedoch eine Verwendung der \"Ariel Interlibrary Loan Software\" zum Versand

This is to ensure better compatibility when opening the resulting CSV with Excel and Google Sheets.

Note that the opencsv library version currently used is com.opencsv:opencsv:4.6 which uses the backslash (ASCII 92) as DEFAULT_ESCAPE_CHARACTER while the latest version com.opencsv:opencsv:5.2 uses the quote (ASCII 34) DEFAULT_ESCAPE_CHARACTER

The correct output can be tested as follows:

  • Add new license

  • Add the following text to a license term "internal note" (note that this text includes a newline as well as quote characters

vgl. 7.6.8 "Zugelassen ist jedoch eine Verwendung der "Ariel Interlibrary Loan Software" zum Versand eines Teils der Inhalte zu einem Drucker/Fax einer Empfängerbibliothek, ein Versand an Emailadressen ist nicht gestattet. "
  • Find license in search and filter results and select it with checkbox

  • Choose Actions -> Export selected as CSV

  • Select all then click Save & Close

  • Check in CSV that the quote characters are escaped with an additional escape character

For a final check the CSV can be opened in Excel and/or uploaded to Google Drive and opened with Google Sheets and checked that the text appears in the appropriate column.

Checklist

hide

TestRail: Results

Activity

Show:

Peter Böhm October 29, 2020 at 4:41 PM

I changed the code so that DEFAULT_ESCAPE_CHARACTER is now taken from ICSVWriter instead of ICSVParser. I did so too for the other two remaining Values (quote character and separator), since they aren't different between ICSVParser and ICSVWriter.

The exported CSV file now shows the expected behaviour for quotation marks in LibreOffice Calc, Softmaker Office and Google Docs.

Owen Stephens October 28, 2020 at 5:02 PM

Ah - I see what you mean
https://github.com/folio-org/mod-licenses/blob/6c2c1a544ca447fba8310dfcc7a6991059684b16/service/grails-app/controllers/org/olf/licenses/export/ExportController.groovy#L40 is

.withQuoteChar(ICSVParser.DEFAULT_QUOTE_CHARACTER)

and the ICSVParser.DEFAULT_QUOTE_CHARACTER is still ASCII 92, whereas ICSVWriter.DEFAULT_QUOTE_CHARACTER is ASCII 34

Owen Stephens October 28, 2020 at 4:58 PM
Edited

This was the documentation I thought was saying that the DEFAULT_ESCAPE_CHARACTER was " in the latest opencsv release http://opencsv.sourceforge.net/apidocs/constant-values.html#com.opencsv.ICSVWriter.DEFAULT_ESCAPE_CHARACTER

Peter Böhm October 28, 2020 at 4:48 PM

According to this Documentation, the default escape character is now indeed the quotation mark (ascii 34).

However, I think in this issue the problem lies with 'nested quotations'. Quotations in our string data should be escaped with double quotation marks in csv export.

There seem to be different defaults for DEFAULT_ESCAPE_CHARACTER between ICSVWriter and ICSVParser. I'll try that tomorrow.

Ian Ibbotson (Use this one) October 28, 2020 at 4:01 PM

https://github.com/folio-org/mod-licenses/blob/master/service/grails-app/controllers/org/olf/licenses/export/ExportController.groovy

says

def index(ExportControlObject exportObj) {
log.debug("ExportController::index")

// Set the file disposition.
OutputStreamWriter osWriter

try {
response.setHeader "Content-disposition", "attachment; filename=export.csv"
osWriter = new OutputStreamWriter(new BufferedOutputStream(response.outputStream))
ICSVWriter csvWriter = new CSVWriterBuilder(osWriter)
.withSeparator(ICSVParser.DEFAULT_SEPARATOR)
.withQuoteChar(ICSVParser.DEFAULT_QUOTE_CHARACTER)
.withEscapeChar(ICSVParser.DEFAULT_ESCAPE_CHARACTER)
.withLineEnd(ICSVWriter.DEFAULT_LINE_END)
.build()

exportService.exportLicensesAsCsv(csvWriter, exportObj)
} finally {
// Always close the stream.
osWriter?.close()
}
}

.withEscapeChar(ICSVParser.DEFAULT_ESCAPE_CHARACTER)

DEFAULT_ESCAPE_CHARACTER is '\' is it not? so this needs changing?

My documentation for OpenCSV says that the default escape char is still '
'

Done

Details

Components

Assignee

Reporter

Priority

Sprint

Development Team

Bienenvolk

TestRail: Cases

Open TestRail: Cases

TestRail: Runs

Open TestRail: Runs
Created October 22, 2020 at 10:45 AM
Updated November 16, 2020 at 11:01 AM
Resolved November 16, 2020 at 11:01 AM
TestRail: Cases
TestRail: Runs