27 April 2023
a lightweight local registry for storing and comparing PyTorch models
As part of VESSL’s mission to build modern infrastructure, tools, and workflows for machine learning, today we are releasing tvault. tvault is a local model registry designed to help academic researchers in ML iterate on their models faster without logging overhead. You can think of it as a lightweight “git diff for ML”: you can track and hash-version your experiments and get a bird’s-eye view of the changes between two experiments.
Over the last year, we’ve met hundreds of graduate students in AI/ML through VESSL for Academics↗ and campus-wide adoption of VESSL↗. While a small fraction of them are avid users of full-fledged experiment tracking tools like VESSL Experiments↗ and Weights & Biases, many of the academic researchers we meet simply want to get going with minimal setup and focus on what matters most — the models. That makes sense, considering that academic researchers tend to work more independently and focus more on the algorithms themselves than on fine-tuning models through many iterations.
With tvault, you can get started with the simplest form of experiment tracking just by pip install — no sign-up or additional configuration required — and manage and compare experiments within your terminal.
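Installation is a single command (assuming the PyPI package shares the project name):

```bash
# Install tvault from PyPI
pip install tvault
```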
Adding tvault.log() in your code’s training loop creates a local log file under the model_log folder with a unique hash for your model and an ID for each run. This log file is essentially a readout of your model that stores the model overview and key metrics like the accuracy, learning rate, and optimizer, which you can add using tags. Since tvault logs live under your project folder, you can also share your experiment progress as you git push your local codebase, and make them publicly available on GitHub once you are ready to publish your research.
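As a rough sketch, a training loop with tvault might look like the code below. The keyword arguments passed to tvault.log() here (tags and result) are assumptions for illustration rather than a confirmed signature, so check the tvault README for the exact API.

```python
import torch
import torch.nn as nn
import tvault  # assumes tvault is installed via pip

# Toy model and data so the example is self-contained.
model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()
inputs = torch.randn(64, 784)
labels = torch.randint(0, 10, (64,))

for epoch in range(3):
    optimizer.zero_grad()
    loss = criterion(model(inputs), labels)
    loss.backward()
    optimizer.step()

accuracy = (model(inputs).argmax(dim=1) == labels).float().mean().item() * 100

# Write a readout of the model under ./model_log with a unique hash and run ID.
# The keyword arguments (tags, result) are illustrative assumptions,
# not a confirmed tvault signature; see the tvault README for the exact API.
tvault.log(model=model, tags={"lr": 0.01, "optimizer": "SGD"}, result=accuracy)
```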
After you insert tvault.log() in your training loop along with the metrics you want to track as tags, you can look up and compare experiments with a simple CLI command.
You can look up your experiments by hash, result, or tags using --find_flag and specifying --condition. The example below shows all experiments with an accuracy between 50 and 100.
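A command along the following lines would list those runs. Only --find_flag and --condition come from the description above; the condition value and the bound arguments are illustrative assumptions, so refer to the tvault CLI help for the exact syntax.

```bash
# List experiments whose logged result falls between 50 and 100.
# Arguments other than --find_flag and --condition are assumed for illustration.
tvault --find_flag --condition result --min 50 --max 100
```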
tvault’s diff_flag option allows you to look up the difference between two models by specifying the model hash and index. tvault automatically detects and displays the changes in functions while removing git diffs that are not related to the model. This is useful when you have a baseline model that you want to iterate on with different hyperparameters and higher-level architectures without digging through your code line by line.
While MLOps is essential for running production machine learning at scale and speed, the current engineering and tooling landscape can be overwhelming, especially for graduate ML researchers. We believe that minimum viable tools like tvault, which give users instant value with zero overhead, will motivate them to explore MLOps further rather than simply overlook it. We are looking forward to releasing more open-source frameworks that help engineers and researchers in AI/ML get started with MLOps — Terraform patterns for setting up multi-cloud Kubernetes clusters↗, for example.
For those who are looking to adopt full-fledged, production-ready AI infrastructure and MLOps within days rather than months, sign up for a free account and start training at https://vessl.ai↗.