This Cloud Function responds to a new file in a GCP Cloud Storage bucket by extracting the CSV data, transforming it, and loading it into a BigQuery table.
- Create a Cloud Storage bucket (in the example below, my bucket is `sp24_elliott_41200_weather_dev`)
- Create a new directory in Cloud Shell
- Initialize a new npm module and create the starter files:

  ```sh
  touch index.js .gitignore README.md
  ```

- Add `node_modules` to `.gitignore`
- Add code to log the basic data about the file on function execution, as in the sketch below
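
  A minimal first version of `index.js` might look like this. For a Node.js background function deployed with the `google.storage.object.finalize` trigger, the first argument carries the Cloud Storage object's metadata:

  ```js
  // index.js -- log basic metadata about the object that triggered the function.
  exports.readObservation = (file, context) => {
    console.log(`Event ID: ${context.eventId}`);
    console.log(`Bucket: ${file.bucket}`);
    console.log(`File: ${file.name}`);
    console.log(`Content type: ${file.contentType}`);
  };
  ```
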
- Deploy the function (the `--trigger-resource` value must be the bucket you created above):

  ```sh
  gcloud functions deploy weather_etl \
    --runtime nodejs18 \
    --trigger-event google.storage.object.finalize \
    --entry-point readObservation \
    --trigger-resource sp24_elliott_41200_weather_dev
  ```
This repository includes a CSV file that you can use to test your Cloud Function. To copy the file to the Cloud Storage bucket, run the following command, substituting your own bucket name:

```sh
gsutil cp 724380-93819_sample.csv gs://sp24_elliott_41200_weather_dev
```
- Install `@google-cloud/storage` and `csv-parser`
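
  Both packages come from npm:

  ```sh
  npm install @google-cloud/storage csv-parser
  ```
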
- Use `file.createReadStream()` to open and log the data in the file
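
  The event payload carries only the object's metadata, so the function first looks up a `File` handle through the Storage client and then streams its contents through the parser. A sketch:

  ```js
  const { Storage } = require('@google-cloud/storage');
  const csv = require('csv-parser');

  const storage = new Storage();

  exports.readObservation = (file, context) => {
    const rows = [];
    storage
      .bucket(file.bucket)
      .file(file.name)
      .createReadStream()            // stream the uploaded object
      .pipe(csv())                   // parse each line into a { header: value } object
      .on('data', (row) => rows.push(row))
      .on('end', () => console.log(rows));
  };
  ```
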
- Add a helper function that accepts the dictionary of CSV data and loops through it, logging each row of data as a separate log entry
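
  A sketch of that helper (the function name is illustrative):

  ```js
  // Log each parsed CSV row as its own log entry.
  function logRows(rows) {
    rows.forEach((row, i) => console.log(`Row ${i}: ${JSON.stringify(row)}`));
  }
  ```
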
- Add a helper function that converts the dictionary values to numeric
- Modify the helper function to transform values (fix the scales, set missing values to `null`, etc.), as in the sketch below
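
  A combined sketch of those two steps. The `TEMP` column name, the tenths-of-a-degree scale, and the `999.9` missing-value sentinel are assumptions about this dataset, not facts from it:

  ```js
  // Convert a parsed row's string values to numbers and normalize them.
  function transformRow(row) {
    const out = {};
    for (const [key, value] of Object.entries(row)) {
      const num = Number(value);
      out[key] = Number.isNaN(num) ? null : num;  // non-numeric -> null
    }
    // Assumed conventions: 999.9 marks a missing reading, and the
    // temperature is reported in tenths of a degree.
    if (out.TEMP === 999.9) {
      out.TEMP = null;
    } else if (out.TEMP != null) {
      out.TEMP = out.TEMP / 10;
    }
    return out;
  }
  ```
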
- Write an insert query to load the transformed data into a table
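
  A sketch using the `@google-cloud/bigquery` client (install it with `npm install @google-cloud/bigquery`) and named query parameters; the dataset, table, and column names here are placeholders:

  ```js
  const { BigQuery } = require('@google-cloud/bigquery');

  const bigquery = new BigQuery();

  // Insert one transformed observation row.
  // obs.DATE and obs.TEMP are assumed column names from the CSV.
  async function insertObservation(obs, stationId) {
    await bigquery.query({
      query: `
        INSERT INTO \`weather.observations\` (station_id, observed_at, temp)
        VALUES (@stationId, @observedAt, @temp)`,
      params: { stationId, observedAt: obs.DATE, temp: obs.TEMP },
      // Explicit types let BigQuery accept null parameter values.
      types: { stationId: 'STRING', observedAt: 'STRING', temp: 'FLOAT64' },
    });
  }
  ```
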
- Use the file name to identify the weather station providing the data
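
  Since the sample file is named `724380-93819_sample.csv`, the station identifier can be recovered from the object name. A minimal sketch:

  ```js
  // "724380-93819_sample.csv" -> "724380-93819"
  function stationFromFileName(name) {
    return name.split('_')[0];
  }
  ```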