Merge pull request #3714 from mlibrary/HELIO-4772/tmm_task_tidy_encoding_string

HELIO-4772 - align input CSV encoding and comment with reality
sethaj authored Oct 29, 2024
2 parents 587aea9 + c16fc4a commit a68e1b6
Showing 1 changed file with 3 additions and 4 deletions.
7 changes: 3 additions & 4 deletions lib/tasks/tmm/tmm_csv_monograph_create_update.rake
@@ -34,10 +34,9 @@ namespace :heliotrope do
fail "CSV file may accidentally be a backup as '#{input_file}' contains 'bak'. Exiting." if input_file.include? 'bak'

puts "Parsing file: #{input_file}"
-# we need UTF-8 and TMM needs to export UTF-16LE for now because of this kind of thing: https://dba.stackexchange.com/a/250018
-# unfortunately we need to read this file into memory to force uniform line endings before parsing the CSV.
-# side note: although the
-file_content = File.read(input_file, encoding: 'bom|utf-16le')
+# Firebrand are finally good with UTF8. Note that it's crucial for Ruby to know there's a BOM or the first column is lost.
+# Unfortunately we need to read this file into memory to force uniform line endings before parsing the CSV.
+file_content = File.read(input_file, encoding: 'bom|utf-8')
# Use `gsub!` to avoid holding more memory (I guess).
file_content.gsub!(/\r\n?/, "\n")
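
For reviewers who want to see why the BOM note in the new comment matters, here is a minimal sketch, not the task's actual code. It assumes a made-up two-column CSV (`ISBN`, `Title`) and a hypothetical helper `read_normalized` that mirrors the read-then-`gsub!` step above. It reads the same file once with plain `utf-8` and once with `bom|utf-8`, then compares what the `csv` library sees as the first header.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical helper mirroring the task's read-then-normalize step.
def read_normalized(path, encoding)
  content = File.read(path, encoding: encoding)
  content.gsub!(/\r\n?/, "\n") # CRLF and lone CR both become LF
  content                      # gsub! returns nil when nothing changed, so return explicitly
end

# Made-up sample data: a UTF-8 BOM, then rows ending in CRLF, CR and LF.
sample = "\uFEFFISBN,Title\r\n9780000000001,First\r9780000000002,Second\n"

file = Tempfile.new(['tmm_sample', '.csv'])
file.binmode # write the bytes exactly as given
file.write(sample)
file.close

# Plain 'utf-8' keeps the BOM glued to the first header name.
with_bom_kept = CSV.parse(read_normalized(file.path, 'utf-8'), headers: true)

# 'bom|utf-8' tells Ruby to detect and strip the BOM while reading.
with_bom_stripped = CSV.parse(read_normalized(file.path, 'bom|utf-8'), headers: true)

puts with_bom_kept.headers.inspect
puts with_bom_kept['ISBN'].inspect     # lookup by the expected header name
puts with_bom_stripped.headers.inspect
puts with_bom_stripped['ISBN'].inspect

file.unlink
```

In the first parse the header is the BOM character followed by `ISBN`, so a lookup by `'ISBN'` finds nothing for any row. That is the "first column is lost" failure the comment warns about. The second parse yields the expected header and both rows, regardless of the mixed line endings.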
