Skip to content

Latest commit

 

History

History
23 lines (13 loc) · 713 Bytes

README.md

File metadata and controls

23 lines (13 loc) · 713 Bytes

About

This repository contains boilerplate for PyTorch's training loop.

Some features included in this boilerplater are:

1. Periodic Checkpoints

  • Automatically save checkpoints of model state, optimizer state and training scores every x epochs
  • Load checkpoints

2. Gradient Scaling

  • If the gradients are computed as float16, the value of gradients may underflow to 0.
  • Gradients are scaled up after backpropagation and scaled down before optimizer step to solve this

3. Automatic Tensor Casting

  • The outputs and loss of the model are lowered in precision for less memory usage.

4. Garbage Collection

  • Every batch, the cache is cleared and non-used objects are flushed from memory.