Recommended Reading
Undertale relies on a number of external libraries. If you are not already familiar with the following, it is worth reading through their documentation, and possibly completing their tutorials, before contributing.
- datatrove
The dataset building pipeline library from the folks at HuggingFace. We use this to codify all of our dataset building pipelines and parallelize them across compute infrastructure.
- PyTorch
The deep learning library. If you’re not already deeply familiar with PyTorch, the textbook Deep Learning with PyTorch is an excellent resource.
- PyTorch Lightning
All of our models are written in PyTorch and wrapped in Lightning modules. We largely let Lightning handle the complexities of multi-node, multi-GPU training, validation, and integration with TensorBoard for monitoring training.
- TensorBoard
The visualization tool we use for tracking training runs.
- Sphinx
A software documentation library. All of our documentation is written in reStructuredText and built with Sphinx. We also use autodoc with Google-style Python docstrings for automatically generated reference documentation.
- pyinstrument
A statistical profiler for Python. We occasionally use it for performance testing.
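For contributors who have not written Google-style docstrings before, the sketch below shows the general shape that autodoc can turn into reference documentation. The function `count_unique_tokens` is hypothetical and is not part of Undertale; it only illustrates the `Args`, `Returns`, and `Raises` sections. Sphinx typically parses these sections through the `sphinx.ext.napoleon` extension, and we are assuming here, without having checked the project configuration, that it is enabled.

```python
def count_unique_tokens(tokens: list[str], lowercase: bool = True) -> int:
    """Count the distinct tokens in a sequence.

    Hypothetical example used only to illustrate the docstring layout.

    Args:
        tokens: The tokens to count.
        lowercase: If True, fold case before comparing tokens.

    Returns:
        The number of distinct tokens.

    Raises:
        TypeError: If any element of ``tokens`` is not a string.
    """
    # Validate input up front so the error matches the documented contract.
    for token in tokens:
        if not isinstance(token, str):
            raise TypeError(f"expected str, got {type(token).__name__}")

    # Optionally fold case, then let a set discard the duplicates.
    normalized = (t.lower() if lowercase else t for t in tokens)
    return len(set(normalized))
```

The one-line summary, a blank line, and then the indented sections are the parts that matter for rendering; the body of the function is incidental.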