I’m sharing the tools I use when I build data warehouses with ActiveWarehouse-ETL.
- CsvHelper : lets you
  - extract CSV column names in a stable fashion
  - return all the values of a specific column
- FixedWidthFileDestination : for when you’re outputting data to COBOL machines
- DimensionMigrationHelper : useful for ActiveRecord migrations
- French(Date/Time)DimensionBuilder : fine-grained date/time dimensions in French
- EnsureFieldsPresenceProcessor : raises an error as soon as a field is missing from a row
- EscapeCSVProcessor : replaces \" with "" in the hope that FasterCSV will be able to munge your CSV afterwards
- CleanUpTransform : like the AW-ETL DecodeTransform, but leaves the original value if no match is found
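
To make the behaviour of the last three items more concrete, here is a minimal plain-Ruby sketch. It does not use the classes from this project or the ActiveWarehouse-ETL API: the method names (`ensure_fields_presence`, `escape_csv_quotes`, `clean_up`) are hypothetical, and it assumes that "missing" means the key is absent from the row hash.

```ruby
# Hypothetical stand-ins for illustration only, not this project's classes.

# Raise as soon as a row lacks one of the required fields.
# Assumption: a field is "missing" when its key is absent from the row hash.
def ensure_fields_presence(row, required_fields)
  missing = required_fields.reject { |field| row.key?(field) }
  unless missing.empty?
    raise ArgumentError, "Row is missing field(s): #{missing.join(', ')}"
  end
  row
end

# Rewrite backslash-escaped quotes (\") as doubled quotes (""),
# which is the escaping style CSV parsers usually expect.
def escape_csv_quotes(line)
  line.gsub('\\"', '""')
end

# Decode a value through a lookup table, but keep the original value
# when the table has no entry for it.
def clean_up(value, mapping)
  mapping.fetch(value, value)
end

genders = { "M" => "Male", "F" => "Female" }
puts clean_up("M", genders)                      # known code is decoded
puts clean_up("unknown", genders)                # unknown value is left untouched
puts escape_csv_quotes('1,"say \\"hi\\"",2')     # quotes become doubled
ensure_fields_presence({ :id => 1, :name => "x" }, [:id, :name])
```

Keeping the original value on a failed lookup means one unexpected code in the source data passes through visibly instead of being replaced by a default.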
To run the specs:
rake spec
Note: you’ll need to install the activewarehouse-etl and fastercsv gems first (gem install activewarehouse-etl fastercsv)
- chunk-splitting bulk load processor to cope with MySQL timeouts on Windows
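
One way to picture the chunk-splitting idea is the sketch below. It is only an assumption about what such a processor would do: the rows are cut into fixed-size batches so that each bulk load stays short. The name `split_into_chunks` is hypothetical and nothing here talks to MySQL.

```ruby
# Hypothetical sketch: cut a list of rows into fixed-size chunks so that
# each chunk can be bulk loaded separately and finish before a timeout.
def split_into_chunks(rows, chunk_size)
  raise ArgumentError, "chunk_size must be positive" unless chunk_size > 0
  rows.each_slice(chunk_size).to_a
end

rows = (1..7).map { |i| "row #{i}" }
split_into_chunks(rows, 3).each_with_index do |chunk, index|
  # A real processor would run one bulk load per chunk here.
  puts "chunk #{index}: #{chunk.size} rows"
end
```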