Recommended Reading
We rely on a number of external libraries and tools in Undertale - if you're not already familiar with the following, it's worth reading through their documentation and possibly completing their tutorials before contributing.
- dask
A parallel computing library that supports distributed execution on SLURM (among other schedulers). We use Dask for all of our data processing pipelines.
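As a minimal sketch of Dask's lazy execution model (the functions and values here are made up, not from our pipelines): tasks are composed into a graph, and nothing runs until you call `.compute()`.

```python
from dask import delayed

# Mark a function as lazy; calling it builds a graph node
# instead of executing immediately.
@delayed
def square(x):
    return x * x

# delayed(sum) wraps the reduction so it also becomes a graph node.
total = delayed(sum)([square(i) for i in range(5)])

# Only now does the graph execute, in parallel where possible.
result = total.compute()
print(result)  # 0 + 1 + 4 + 9 + 16 = 30
```

The same graph can run unchanged on a local thread pool or a distributed cluster; only the scheduler configuration differs.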
- Pandera
The DataFrame schema validation library. We use Pandera to validate datasets and enforce a common schema across processing stages.
- PyTorch
The deep learning library. If you're not already deeply familiar with PyTorch, the textbook Deep Learning with PyTorch is an excellent resource.
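The core loop PyTorch makes explicit is: forward pass, loss, backward pass, optimizer step. A toy example (the network and data here are illustrative):

```python
import torch
import torch.nn as nn

# A tiny feed-forward classifier.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(4, 8)           # batch of 4 examples, 8 features each
y = torch.tensor([0, 1, 0, 1])  # class labels

optimizer.zero_grad()           # clear stale gradients
loss = loss_fn(model(x), y)     # forward pass + loss
loss.backward()                 # autograd computes gradients
optimizer.step()                # SGD updates the parameters
```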
- PyTorch Lightning
All of our models are written in PyTorch and wrapped in Lightning modules. We largely let Lightning handle the complexities of multi-node, multi-GPU training, validation, and integration with tensorboard for monitoring training.
- Binary Ninja
Binary Ninja is our disassembler/decompiler of choice. It’s fast and has a robust Python API. Most of our dataset pipelines require an active Binary Ninja license.
- Tensorboard
The visualization tool we use for tracking training runs.
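Lightning logs to tensorboard for us, but it's worth knowing what the underlying event files look like. A sketch using PyTorch's built-in writer (the tag names and directory here are illustrative):

```python
import os
import tempfile
from torch.utils.tensorboard import SummaryWriter

# Write a few scalars to a temporary log directory.
logdir = tempfile.mkdtemp()
writer = SummaryWriter(log_dir=logdir)
for step in range(5):
    writer.add_scalar("loss/train", 1.0 / (step + 1), step)
writer.close()
```

Pointing `tensorboard --logdir <dir>` at the directory renders the logged curves in the browser.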
- Sphinx
The documentation generator. All of our documentation is written in reStructuredText and built with Sphinx. We also use autodoc with Google-style Python docstrings for automatically generated reference documentation.
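An example of the Google-style docstring format that autodoc consumes (the function itself is hypothetical): sections like `Args:` and `Returns:` are parsed into structured reference documentation.

```python
def normalize(values, scale=1.0):
    """Scale a list of numbers to the range [0, scale].

    Args:
        values: Numbers to normalize; must contain at least two
            distinct values.
        scale: Upper bound of the output range.

    Returns:
        A list of floats in [0, scale].
    """
    lo, hi = min(values), max(values)
    return [scale * (v - lo) / (hi - lo) for v in values]
```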
- pyinstrument
A statistical profiler for Python. We use this occasionally for performance testing.