For License CSV export use quote mark (ASCII 34) as escape character rather than backslash (ASCII 92)
Description
Checklist
hideTestRail: Results
Activity
Peter Böhm October 29, 2020 at 4:41 PM
I changed the code so that DEFAULT_ESCAPE_CHARACTER is now taken from ICSVWriter instead of ICSVParser. I did so too for the other two remaining Values (quote character and separator), since they aren't different between ICSVParser and ICSVWriter.
The exported CSV file now shows the expected behaviour for quotation marks in LibreOffice Calc, Softmaker Office and Google Docs.
Owen Stephens October 28, 2020 at 5:02 PM
Ah - I see what you mean @Peter Böhm
https://github.com/folio-org/mod-licenses/blob/6c2c1a544ca447fba8310dfcc7a6991059684b16/service/grails-app/controllers/org/olf/licenses/export/ExportController.groovy#L40 is
.withQuoteChar(ICSVParser.DEFAULT_QUOTE_CHARACTER)
and the ICSVParser.DEFAULT_QUOTE_CHARACTER is still ASCII 92, whereas ICSVWriter.DEFAULT_QUOTE_CHARACTER is ASCII 34
Owen Stephens October 28, 2020 at 4:58 PMEdited
This was the documentation I thought was saying that the DEFAULT_ESCAPE_CHARACTER was " in the latest opencsv release http://opencsv.sourceforge.net/apidocs/constant-values.html#com.opencsv.ICSVWriter.DEFAULT_ESCAPE_CHARACTER
Peter Böhm October 28, 2020 at 4:48 PM
According to this Documentation, the default escape character is now indeed the quotation mark (ascii 34).
However, I think in this issue the problem lies with 'nested quotations'. Quotations in our string data should be escaped with double quotation marks in csv export.
There seem to be different defaults for DEFAULT_ESCAPE_CHARACTER between ICSVWriter and ICSVParser. I'll try that tomorrow.
Ian Ibbotson (Use this one) October 28, 2020 at 4:01 PM
says
def index(ExportControlObject exportObj) {
log.debug("ExportController::index")
// Set the file disposition.
OutputStreamWriter osWriter
try {
response.setHeader "Content-disposition", "attachment; filename=export.csv"
osWriter = new OutputStreamWriter(new BufferedOutputStream(response.outputStream))
ICSVWriter csvWriter = new CSVWriterBuilder(osWriter)
.withSeparator(ICSVParser.DEFAULT_SEPARATOR)
.withQuoteChar(ICSVParser.DEFAULT_QUOTE_CHARACTER)
.withEscapeChar(ICSVParser.DEFAULT_ESCAPE_CHARACTER)
.withLineEnd(ICSVWriter.DEFAULT_LINE_END)
.build()
exportService.exportLicensesAsCsv(csvWriter, exportObj)
} finally {
// Always close the stream.
osWriter?.close()
}
}
.withEscapeChar(ICSVParser.DEFAULT_ESCAPE_CHARACTER)
DEFAULT_ESCAPE_CHARACTER is '\' is it not? so this needs changing?
My documentation for OpenCSV says that the default escape char is still '
'
When exporting licenses as CSV any quote characters inside a field should be escaped with an additional quote character (ASCII 34) rather than the backslash (ASCII 92).
i.e. text such as:
vgl. 7.6.8 "Zugelassen ist jedoch eine Verwendung der "Ariel Interlibrary Loan Software" zum Versand
should be escaped as:
vgl. 7.6.8 ""Zugelassen ist jedoch eine Verwendung der ""Ariel Interlibrary Loan Software"" zum Versand
Rather than
vgl. 7.6.8 \"Zugelassen ist jedoch eine Verwendung der \"Ariel Interlibrary Loan Software\" zum Versand
This is to ensure better compatibility when opening the resulting CSV with Excel and Google Sheets.
Note that the opencsv library version currently used is com.opencsv:opencsv:4.6 which uses the backslash (ASCII 92) as DEFAULT_ESCAPE_CHARACTER while the latest version com.opencsv:opencsv:5.2 uses the quote (ASCII 34) DEFAULT_ESCAPE_CHARACTER
The correct output can be tested as follows:
Add new license
Add the following text to a license term "internal note" (note that this text includes a newline as well as quote characters
vgl. 7.6.8 "Zugelassen ist jedoch eine Verwendung der "Ariel Interlibrary Loan Software" zum Versand eines Teils der Inhalte zu einem Drucker/Fax einer Empfängerbibliothek, ein Versand an Emailadressen ist nicht gestattet. "
Find license in search and filter results and select it with checkbox
Choose Actions -> Export selected as CSV
Select all then click Save & Close
Check in CSV that the quote characters are escaped with an additional escape character
For a final check the CSV can be opened in Excel and/or uploaded to Google Drive and opened with Google Sheets and checked that the text appears in the appropriate column.