Add UTF-8 BOM to Exported CSV Files

Description

Problem

When opening a CSV file containing Arabic characters with Microsoft Excel, the Arabic text is not displayed correctly. Instead, strange or unreadable characters are shown.

Current Behavior

  • CSV files with Arabic content are exported without a UTF-8 Byte Order Mark (BOM).

  • Excel fails to recognize the correct character encoding.

  • Arabic characters appear as gibberish or placeholder characters.
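The symptom can be reproduced outside Excel. The sketch below is illustrative only: it assumes that a reader which finds no BOM falls back to a legacy single-byte code page. Windows-1252 is used here as an example; the code page Excel actually picks depends on the system locale, and the sample title is hypothetical.

```python
# Illustration only: decode UTF-8 bytes with the wrong code page.
text = "مرحبا"  # hypothetical Arabic instance title

utf8_bytes = text.encode("utf-8")

# Assumption: without a BOM the reader guesses a legacy code page.
# errors="replace" covers the few byte values cp1252 leaves undefined.
garbled = utf8_bytes.decode("cp1252", errors="replace")

# Decoding the very same bytes as UTF-8 restores the original text.
restored = utf8_bytes.decode("utf-8")

print(restored == text)   # the bytes themselves are intact
print(garbled == text)    # only the interpretation is wrong
```

The exported bytes are valid in both cases; the problem is purely that the reader is never told they are UTF-8.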

Proposed Solution

Add a UTF-8 BOM to the beginning of CSV files during the export process.
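As a rough illustration of the idea (not the code of the export modules themselves), the following Python sketch writes a CSV file that starts with a BOM. The function name `write_csv_with_bom` and the sample rows are hypothetical.

```python
import csv

def write_csv_with_bom(path, rows):
    """Write rows to a CSV file that starts with a UTF-8 BOM."""
    # The "utf-8-sig" codec emits EF BB BF once, before the first character.
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding="utf-8-sig", newline="") as handle:
        csv.writer(handle).writerows(rows)

if __name__ == "__main__":
    import os
    import tempfile

    target = os.path.join(tempfile.mkdtemp(), "export.csv")
    write_csv_with_bom(target, [["id", "title"], ["1", "مرحبا"]])
    with open(target, "rb") as handle:
        print(handle.read(3).hex())  # first three bytes of the file
```

Letting the codec add the BOM keeps it out of the data itself, so the first header cell is not polluted when the file is later read back with the same codec.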

Expected Outcome

  • Excel will correctly recognize the file encoding as UTF-8.

  • Arabic characters will display properly when the CSV is opened in Excel.

Technical Details

  • UTF-8 BOM: Byte sequence EF BB BF (hexadecimal) to be added at the start of the file.
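For exports that are assembled as raw bytes, the BOM can be prepended directly. This is a minimal sketch under that assumption; `with_bom` is a hypothetical helper, and it skips the prefix when the payload already starts with one so that repeated calls do not stack BOMs.

```python
import codecs

# EF BB BF is the UTF-8 encoding of U+FEFF.
UTF8_BOM = b"\xef\xbb\xbf"

def with_bom(payload: bytes) -> bytes:
    """Return payload prefixed with the UTF-8 BOM, without doubling it."""
    if payload.startswith(UTF8_BOM):
        return payload
    return UTF8_BOM + payload

if __name__ == "__main__":
    # The standard library exposes the same constant.
    print(codecs.BOM_UTF8 == UTF8_BOM)
    print(with_bom(b"id,title\r\n")[:3].hex())
```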

Steps to Reproduce

  1. Export a CSV file containing Arabic text from our system.

  2. Open the exported file in Microsoft Excel.

  3. Observe that Arabic characters are not displayed correctly.

Additional Notes

  • This issue specifically affects Microsoft Excel's handling of CSV files.

  • Other text editors or applications may not require the BOM to display Arabic correctly.

  • Consider testing with other versions of Excel and on different operating systems.
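One thing worth checking alongside Excel is any tool that parses the exported files programmatically: a reader that decodes as plain UTF-8 will see U+FEFF glued to the first header cell. The sketch below shows a tolerant reader in Python; `read_csv_rows` is a hypothetical name, and whether any such consumer exists for these exports is an assumption.

```python
import csv
import io

def read_csv_rows(data: bytes):
    """Parse CSV bytes, accepting input with or without a UTF-8 BOM."""
    # "utf-8-sig" strips a leading BOM if present and otherwise
    # behaves like plain UTF-8.
    text = data.decode("utf-8-sig")
    return list(csv.reader(io.StringIO(text, newline="")))

if __name__ == "__main__":
    with_bom = b"\xef\xbb\xbfid,title\r\n1,x\r\n"
    without_bom = b"id,title\r\n1,x\r\n"
    print(read_csv_rows(with_bom) == read_csv_rows(without_bom))
```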

Environment

None

Potential Workaround

None

Activity

Magda Zacharska, November 29, 2024 at 12:27 AM (edited)

Verified on snapshot-2 environment - works as expected:

Added instance with a title in Arabic:

Started bulk edit; the formatting is correct in the UI and in the downloaded .csv file opened in Excel:

"Are you sure?" form and downloaded preview:

Confirmation screen and the downloaded file:

Translation of the text copied from Excel:

Translation of the text copied from the Instance title:

Aliaksei Harbuz, November 22, 2024 at 12:39 PM (edited)

Verified changes for mod-bulk-operations on a local machine:

  1. Create a Windows PowerShell script that checks for the UTF-8 BOM. $file_path should hold the actual path of the file being checked:

  2. Start a bulk edit of any record:

  3. Download the matched records and check the file with the script from step 1:

  4. Choose a bulk edit rule and press Confirm changes:

  5. Download the preview of changes CSV file and check it with the script from step 1:

  6. Press Commit changes:

  7. Download the changed records file and check it with the script from step 1:

  8. Validate that the errors CSV file also has a UTF-8 BOM:
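The PowerShell script used for these checks is not reproduced in this ticket. An equivalent check can be sketched in Python as follows; this is an assumption about what such a check needs to do (compare the first three bytes with EF BB BF), not the script that was actually run, and `has_utf8_bom` is a hypothetical name.

```python
import sys

UTF8_BOM = b"\xef\xbb\xbf"

def has_utf8_bom(file_path) -> bool:
    """Return True if the file's first three bytes are EF BB BF."""
    with open(file_path, "rb") as handle:
        return handle.read(3) == UTF8_BOM

if __name__ == "__main__":
    # Pass one or more downloaded CSV paths on the command line.
    for name in sys.argv[1:]:
        status = "UTF-8 BOM present" if has_utf8_bom(name) else "no UTF-8 BOM"
        print(f"{name}: {status}")
```

Reading in binary mode matters here: opening the file as text could let the decoder consume or reinterpret the BOM before it is inspected.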

     

 

Verified changes for mod-data-export-worker on a local machine, using the Export Manager page:

  1. Start a bulk edit of instances:

  2. On the Export Manager page, download the CSV file and check that it starts with a UTF-8 BOM:

In the same way, verify that the UTF-8 BOM is present in the holdings, items, and users matched records CSV files downloaded from the Export Manager page:

holdings:

items:

users:

Done

Details

Development Team

Firebird

Release

Sunflower (R1 2025)

Created September 16, 2024 at 1:10 PM
Updated December 10, 2024 at 7:51 AM
Resolved November 29, 2024 at 12:28 AM