29 November 2023
Discover VESSL Run: a versatile ML tool that streamlines training and deployment across multiple infrastructures, using YAML for easy reproducibility and integration.
Machine learning (ML) is an ever-evolving landscape, with new models and techniques emerging on a daily basis. While this progress is exciting, it poses a challenge: How do you know if a specific ML model is suitable for your business needs? The ideal scenario would be to instantly reproduce these models, fine-tune them using your data, and build a demo with identical resource configurations to test their efficacy.
Complicating matters further is the ongoing GPU shortage↗. For those considering an on-premise cluster, the upfront cost, management expense, and limited GPU capabilities are deterrents. Multiple cloud providers are an option, but learning the intricacies of each platform is time-consuming.
That’s where VESSL Run comes into play. This tool offers a unified interface across different infrastructures — be it on-premise or cloud — making it easier to train and deploy ML models. Let’s delve into how VESSL Run solves these problems.
Whether you’re using AWS, Azure, GCP, or OCI, VESSL Run offers a way to interact seamlessly with all these cloud providers↗. Many organizations also maintain on-premise clusters for data security or compliance reasons. The answer to juggling these different infrastructures lies in a lightweight and powerful definition language: YAML.
YAML offers several advantages↗ that make it indispensable in the world of ML: it is human-readable, lightweight, and easy to share and version. Here is an example VESSL Run definition:
name: llama2_c_training
description: Inference Llama 2 in one file of pure C.
resources:
  cluster: aws-apne2
  preset: v1.cpu-4.mem-13
image: quay.io/vessl-ai/ngc-pytorch-kernel:23.07-py3-202308010607
run:
  - workdir: /root/examples/llama2_c/
    command: |
      wget https://karpathy.ai/llama2c/model.bin
      gcc -O3 -o run run.c -lm
      ./run model.bin
import:
  /root/examples/: git://github.com/vessl-ai/examples
This YAML file is a simple example of declarative ML: it configures a job that runs inference on Llama 2 in pure C, using Andrej Karpathy’s llama2.c (karpathy/llama2.c↗). Executed on the VESSL platform, the configuration is designed to be easily shared and versioned. For an in-depth explanation of every available YAML key-value pair, consult the complete YAML reference in VESSL’s documentation↗.
Explore the following real-world scenarios to understand VESSL Run:
VESSL’s extensive integrations include:
Best of all, these integration options are fully compatible with YAML definitions, enhancing both ease of use and consistency, as the sketch below shows.
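Because the target infrastructure is just a field in the run definition, moving a workload between providers is a one-line edit to the resources block. A minimal sketch (the GCP cluster name below is an illustrative placeholder, not an actual identifier):

resources:
  cluster: gcp-uswest1      # hypothetical connected GCP cluster; aws-apne2 in the example above
  preset: v1.cpu-4.mem-13   # same resource preset as the example above

Everything else in the YAML, including the image, commands, and imports, stays unchanged.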
We also prepared a few useful use cases to help you grasp the concept of volume operations; the sketch below illustrates the idea.
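Volume operations declare what gets mounted into the container before a run starts and what gets collected after it finishes. Here is a minimal sketch built on the import key from the example above; the vessl-dataset:// and vessl-artifact:// references and the export key are assumptions patterned after the import syntax, so consult the YAML reference for the exact spelling:

import:
  /root/examples/: git://github.com/vessl-ai/examples   # mount a Git repository, as above
  /data/: vessl-dataset://my-org/my-dataset             # hypothetical: mount a VESSL dataset
export:
  /output/: vessl-artifact://my-org/my-outputs          # hypothetical: persist outputs as an artifact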
VESSL Hub↗ hosts a curated collection of YAML files tailored for state-of-the-art open-source models such as llama2.c, nanoGPT, Stable Diffusion, MobileNeRF, LangChain, and Segment Anything. Getting started is as simple as copying and pasting these YAML files. Install the VESSL CLI↗ from PyPI and create a run from the YAML file of your choice:
pip install vessl
vessl run create -f config.yaml
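If this is your first time using the CLI on a machine, authenticate before creating runs; a minimal sketch, assuming the standard vessl configure login flow:

vessl configure   # one-time login; select a default organization and project

Once configured, vessl run create -f config.yaml schedules the run on the cluster declared under resources, so the same file reproduces the run on any connected infrastructure.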
VESSL Run isn’t just about integration and reproducibility; it offers a plethora of features that cover the entire ML pipeline.
And that’s just scratching the surface. With VESSL Run, you get a one-stop solution for managing your ML models across various infrastructures.