# TorchServe

TorchServe serves PyTorch models over REST and gRPC APIs. It provides a built-in web server that can serve one or multiple models, with configurable ports, hosts, and logging options.

Our platform simplifies deploying PyTorch models. You can upload your TorchServe model archive to **E2E Object Storage (EOS)**, and E2E will automatically create and manage the deployment. It handles container creation, model download, and web server launch, and the dashboard provides monitoring and scaling features.

---

## Key features of TorchServe

* Automated deployments from EOS
* Automatic restart on failures
* Managed TLS certificates
* Token-based authentication
* Manual or auto scaling
* Optional persistent disks (for faster restarts)
* REST (HTTP) and gRPC support
* Health checks (readiness and liveness)

---

## Quick start

This section walks through deploying and serving a TorchServe model.

### Step 1: Install dependencies

Our platform uses **MinIO CLI (mc)** to upload model archives to EOS. If you're using an Instance, skip this step because `mc` is pre-installed.

#### macOS installation

```bash
brew install minio/stable/mc
```

---

### Step 2: Create directories for model and config

```bash
mkdir -p mnist/model-store mnist/config
cd mnist/model-store
```

---

### Step 3: Download a trained model archive

```bash
wget https://objectstore.e2enetworks.net/iris/mnist/model-store/mnist.mar
```

---

### Step 4: Download the TorchServe config file

```bash
cd ../config
wget https://objectstore.e2enetworks.net/iris/mnist/config/config.properties
```

---

### Step 5: Create a model

1. Go to the [AI Platform](https://tir.e2enetworks.com).
2. Navigate to **Model Repository** → **Create Model**.
3. Name it `my-mnist` and select **New E2E Object Store Bucket**.
4. Copy the `mc alias` command shown in the dashboard and run it. It looks similar to the following:

   ```bash
   mc config host add my-mnist https://objectstore.e2enetworks.net
   ```

   Then confirm that the alias works by listing its contents:

   ```bash
   mc ls my-mnist/
   ```

---

### Step 6: Upload model and config to EOS

From the `mnist` directory, copy the `model-store` and `config` folders to the bucket:

```bash
cd ..
mc cp -r * my-mnist/
```

---

### Step 7: Create an inference service

1. Go to **Deployments** → **Create Deployment**.
2. Choose framework **TorchServe** and select the `my-mnist` model.
3. Use the **Sample API request** in the dashboard to test your service.

---

## Developer workflow

A typical TorchServe workflow:

1. Train and save the model (`.pt`).
2. Optionally, write a custom handler.
3. Create a model archive (`.mar`) using `torch-model-archiver`.
4. Create a `config.properties` file.
5. Run TorchServe (the platform automates this step).

For examples, visit the [TorchServe MNIST demo](https://github.com/pytorch/serve/tree/master/examples/image_classifier/mnist).

---

## Creating a model archive

Use the **Torch Model Archiver** utility to generate `.mar` files.

```bash
torch-model-archiver --model-name mnist --version 1.0 \
  --model-file mnist.py --serialized-file mnist.pt --handler image_classifier \
  --export-path model-store
```

### Key parameters

* **--model-name**: Name for your model (used in the API endpoint).
* **--version**: Model version (the archiver requires this argument).
* **--model-file**: Model definition file (e.g., `mnist.py`).
* **--serialized-file**: Model weights file (e.g., `mnist.pt`).
* **--handler**: Pre-built or custom handler. See [available handlers](https://github.com/pytorch/serve/tree/master/ts/torch_handler).
* **--export-path**: Directory where the `.mar` file is written.

---

## TorchServe config file

A sample `config.properties` file:

```properties
metrics_format=prometheus
number_of_netty_threads=4
job_queue_size=10
enable_envvars_config=true
install_py_dep_per_model=true
```

---

## Package and upload model updates

1. Create a model structure:

   ```bash
   mkdir -p my-model/config my-model/model-store
   cp config.properties my-model/config/
   # Copy the archive too (path assumes --export-path model-store as above)
   cp model-store/mnist.mar my-model/model-store/
   ```

2. Upload to EOS:

   ```bash
   mc cp -r my-model my-mnist/
   ```

---

## Connecting to the service endpoint

E2E secures TorchServe endpoints with authentication tokens. Set the `AUTH_TOKEN` environment variable to your token before running the commands below. The header is double-quoted so that the shell expands the variable.

### Check endpoint status

```bash
curl -v -H "Authorization: Bearer $AUTH_TOKEN" -X GET \
  https://infer.e2enetworks.net/project//endpoint//v1/models/mnist
```

### Send prediction requests

```bash
curl -v -H "Authorization: Bearer $AUTH_TOKEN" -X POST \
  https://infer.e2enetworks.net/project//endpoint//v1/models/mnist:predict \
  -d '{"instances": [{"data": "", "target": 0}]}'
```

---

### Batch prediction example

```python
import base64
import pathlib

from tir_inference import endpoint

# Collect the PNG images to send for prediction
data_dir = pathlib.Path("mnist-images-directory")
files = list(data_dir.glob("*.png"))

# Base64-encode the first image; add more entries to `instances` to send a batch
with open(files[0], "rb") as f:
    data = {"data": base64.b64encode(f.read()).decode("utf-8")}

response = endpoint.predict(instances=[data])
```

---

## Monitoring

The platform provides real-time logs and metrics for all TorchServe deployments.

### Logs

View logs under **Deployments → Logs**.

### Metrics

To enable Prometheus metrics, add the following line to `config.properties`:

```properties
metrics_format=prometheus
```

---

## Advanced use cases

### Custom containers

The platform supports extending the built-in containers. Learn more in [Custom Inference Containers](custom_inference).

### Large models and multi-GPU setups

E2E fully supports TorchServe’s multi-GPU and large model capabilities. Refer to [TorchServe Large Model Inference](https://pytorch.org/serve/large_model_inference.html).

---

## Examples

For additional examples, visit the [official TorchServe repository](https://github.com/pytorch/serve/tree/master/examples).

---
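As a further illustration of the request format used in "Send prediction requests", the following is a minimal Python sketch that builds the same kind of `instances` payload with only the standard library. The helper names `build_predict_payload` and `build_predict_request` are hypothetical and not part of any E2E SDK, and sending a `Content-Type: application/json` header is an assumption rather than a documented requirement. The prediction URL and token are passed in by the caller, since they depend on your project and endpoint.

```python
import base64
import json
import urllib.request


def build_predict_payload(images):
    """Return the JSON-serializable body for a :predict call.

    `images` is an iterable of raw image bytes. Each item is base64-encoded
    into a {"data": ...} instance, mirroring the batch prediction example.
    """
    return {
        "instances": [
            {"data": base64.b64encode(raw).decode("utf-8")} for raw in images
        ]
    }


def build_predict_request(predict_url, auth_token, images):
    """Build (but do not send) an authenticated POST request.

    Hypothetical helper: `predict_url` is the full ...:predict URL of your
    endpoint and `auth_token` is your API token.
    """
    body = json.dumps(build_predict_payload(images)).encode("utf-8")
    return urllib.request.Request(
        predict_url,
        data=body,
        headers={
            "Authorization": f"Bearer {auth_token}",
            # Assumption: the endpoint accepts a JSON content type.
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to `urllib.request.urlopen`. Keeping payload construction separate from sending makes the encoding step easy to check locally before calling a live endpoint.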