Features
JupyterLab Support
Browser-based notebook interface, pre-installed and ready for ML workflows.
Flexible Storage
Multiple storage tiers: workspace, datasets, SFS, and PFS for any workload.
Network & Security
Control instance access with VPC, SSH keys, security groups, and start scripts.
Save Image
Capture your full instance environment as a reusable, restorable snapshot.
Monitoring & Alerts
Monitor real-time resource utilization through interactive charts and configure automated threshold-based email notifications.
List of Different Prebuilt Images
Browse all E2E-provided ML framework images with CUDA and driver details.
1. JupyterLab Support
| Feature | Description |
|---|---|
| Prebuilt Images | JupyterLab is available with supported TIR prebuilt images. It is not available with Base OS images. |
| Custom / Private Images | JupyterLab is available only if the image supports it. To enable it, build the image using TIR's Image Builder utility, then select the "JupyterLab Support" checkbox while creating the instance. To create an instance, follow the steps in the Create Instance guide. |
2. Flexible Storage
TIR offers multiple storage tiers to meet varying performance, scalability, and collaboration requirements. Each tier serves a distinct purpose, and several tiers can be attached to a single instance based on your workload needs.
| Feature | Description |
|---|---|
| Workspace Storage | Every instance comes with a pre-attached disk as its primary workspace storage. Use workspace storage to persist files, models, and data across instance restarts. Free Tier: 30 GB included with every instance. Max Capacity: Up to 15,000 GB. Workspace size can be increased at any time through the Workspace Size tab but cannot be reduced once expanded. |
| Datasets | Use datasets to attach and manage external storage directly on your instance, keeping large data separate from your workspace. The Associated Datasets tab allows you to attach Disk or EOS datasets and view their mount status and storage details. Disk storage can only be attached to one instance at a time. |
| Shared File System (SFS) | Use SFS when multiple instances need access to the same data simultaneously, such as shared configurations or collaborative workloads. Concurrent Access: Multiple instances can read from and write to the same file system at the same time. Mount Management: Mount or unmount SFS paths (e.g., /my_sfs) directly from the Shared File System tab. |
| Parallel File System (PFS) | Use PFS for high-throughput workloads that require fast, simultaneous data access across multiple instances or processes. High Throughput: Distributes data across multiple storage devices to minimize bottlenecks. Distributed Workloads: Ideal for HPC environments requiring simultaneous, high-speed access to large datasets. Mount Management: Mount or unmount PFS paths (e.g., /my_pfs) directly from the Parallel File System tab. |
| Ephemeral Storage | Ephemeral storage is temporary instance storage available at /home/user, with a fixed limit of 50 GB per instance that cannot be increased. Data stored in ephemeral storage is permanently lost if the instance is terminated or restarted. Additionally, if the 50 GB limit is reached, the instance may automatically restart, leading to data loss. For production workloads, always attach persistent storage such as Datasets, SFS, or PFS, and store important data under their mounted paths instead of /home/user to ensure durability. |
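
As a rough illustration of the ephemeral storage warning above, the following Python sketch sums the size of files under a directory and flags when usage approaches the 50 GB limit. The 80% warning fraction, the binary interpretation of "GB", and the function names are assumptions made for illustration; TIR may account for usage differently.

```python
import os

# 50 GB limit taken from the table above; treating GB as 1024**3 bytes is an assumption.
EPHEMERAL_LIMIT_BYTES = 50 * 1024**3


def dir_size_bytes(path):
    """Sum the sizes of regular files under path, skipping symlinks and unreadable files."""
    total = 0
    for root, _dirs, files in os.walk(path, onerror=lambda err: None):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                continue  # file vanished or cannot be read
    return total


def near_limit(used_bytes, limit_bytes=EPHEMERAL_LIMIT_BYTES, warn_fraction=0.8):
    """Return True once usage reaches warn_fraction of the limit (0.8 is an arbitrary choice)."""
    return used_bytes >= warn_fraction * limit_bytes
```

In a notebook, `near_limit(dir_size_bytes("/home/user"))` gives a quick yes/no answer. Walking a large directory tree can be slow, so this is better suited to occasional checks than to tight loops.
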
3. Network and Security
TIR provides network and security controls to manage how your instance communicates and who can access it. Configure these settings through the Network and Security tab.
| Feature | Description |
|---|---|
| VPC (Virtual Private Cloud) | Use VPC to place your instance within a private network, enabling secure communication with other resources without exposure to the public internet. To attach: Reserve a VPC IP from Actions if you have not already done so, then select it from the dropdown and click Attach. To detach: Click Detach VPC IP and confirm. Each instance can only have one VPC IP attached at a time. |
| SSH Access | Use SSH access to securely connect to your instance via terminal for running commands, debugging, or managing files directly. Enable or disable SSH terminal access using the toggle switch. To access the instance via SSH, an SSH key must be attached. You can create or select an SSH key either during instance creation or afterward. Multiple SSH keys can be attached to an instance. |
| Reserved IP | Use a Reserved IP to ensure your instance maintains a consistent, static IP address across restarts, avoiding the need to update connection settings each time. Multiple Reserved IPs can be attached to an instance. |
| Security Groups | Use security groups to control which traffic is allowed in and out of your instance, acting as a virtual firewall for network-level protection. Inbound Rules: Control incoming traffic to the instance. Outbound Rules: Control outgoing traffic from the instance. Example: Attach the Default SSH Security Group to allow SSH connections on port 22. Multiple security groups can be created and attached to an instance. |
| Start Scripts | Use start scripts to automate environment setup on every boot, eliminating repetitive manual configuration after restarts. These scripts execute automatically when your instance starts, allowing you to install dependencies, apply configurations, and run initialization commands without manual intervention. Multiple start scripts can be created and attached to an instance. |
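
To illustrate the kind of setup a start script can automate, here is a minimal Python sketch that installs only the packages that are not yet importable, so repeated boots stay fast. It assumes the start script is able to invoke a Python interpreter with `pip` available; the function names and any package names passed in are hypothetical.

```python
import importlib.util
import subprocess
import sys


def missing_packages(module_names):
    """Return the top-level module names that cannot currently be imported."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


def pip_install_command(packages):
    """Build a pip command bound to the interpreter that is running this script."""
    return [sys.executable, "-m", "pip", "install", "--quiet", *packages]


def ensure_installed(module_names):
    """Install whatever is missing and return the list of names that were installed."""
    missing = missing_packages(module_names)
    if missing:
        subprocess.run(pip_install_command(missing), check=True)
    return missing
```

A start script could call `ensure_installed([...])` with the modules a project needs. Note that this sketch treats the import name and the pip package name as identical, which does not hold for every package (for example, `sklearn` is installed as `scikit-learn`).
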
4. Monitoring
Use the Monitoring tab to track your instance's resource consumption in real time and make informed decisions about scaling and performance.
| Feature | Description |
|---|---|
| Visual Gauges | At-a-glance health indicators for Memory Usage, Workspace Usage, and Ephemeral Storage Usage. |
| CPU Utilization | Monitor processor load as a percentage over time. |
| CPU Memory Utilization | Track active RAM usage in MB or GB. |
| Workspace Memory Utilization | Observe dedicated workspace storage performance over selectable time intervals. |
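
The charts above are provided by the Monitoring tab. For a quick cross-check from inside a running instance, a sketch like the following reads memory figures from `/proc/meminfo`. It assumes a Linux instance, and the result may not match the console exactly, for example if the instance runs in a container with its own memory limit.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value} (kB for most fields)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def memory_used_percent(meminfo):
    """Used memory as a percentage, derived from MemTotal and MemAvailable."""
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    return 100.0 * (total - available) / total


def read_memory_used_percent(path="/proc/meminfo"):
    """Read the given meminfo file and return the used-memory percentage."""
    with open(path) as handle:
        return memory_used_percent(parse_meminfo(handle.read()))
```

Parsing is kept separate from file access so the logic can be exercised with sample text on any machine.
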
5. Alerts
Alerts are threshold-based triggers that enable proactive resource management for your instances. When a monitored metric exceeds a defined limit, TIR automatically sends an email notification, allowing you to respond before performance or availability is affected.
For detailed configuration, see Alert Management.
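
The threshold logic described above can be pictured with a small Python sketch. This is not TIR's implementation; the rule structure, metric names, and threshold values are hypothetical and only illustrate the idea of notifying when a metric exceeds its limit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertRule:
    metric: str       # hypothetical metric name, e.g. "cpu_percent"
    threshold: float  # the rule fires when a sample is strictly above this value


def breached_rules(rules, samples):
    """Return the rules whose metric appears in samples and exceeds its threshold."""
    return [
        rule for rule in rules
        if rule.metric in samples and samples[rule.metric] > rule.threshold
    ]


rules = [AlertRule("cpu_percent", 90.0), AlertRule("memory_percent", 80.0)]
samples = {"cpu_percent": 95.5, "memory_percent": 62.0}
for rule in breached_rules(rules, samples):
    print(f"ALERT: {rule.metric} is above {rule.threshold}")
```

With the sample values shown, only the CPU rule is breached. A value exactly equal to the threshold does not trigger in this sketch, since the text describes alerts as firing when a metric exceeds the limit.
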
6. Save Image
Use Save Image to capture your instance's current environment as a reusable image, so you can restore, replicate, or share the exact same setup without reconfiguration.
To save an image of an instance, follow this guide.
| Feature | Description |
|---|---|
| What is Included | The following are preserved in the saved image: installed packages and dependencies; system-level configurations; custom scripts and application code; framework installations (PyTorch, TensorFlow, etc.); user-level configurations. |
| When to Use Save Image | Before updating or changing your instance plan; before making major modifications to the environment; when deploying identical environments across teams; when creating a backup or disaster recovery checkpoint. After saving, you can restore the image to revert changes or use it to launch new instances with the same environment. |
7. Instance Lifecycle
| State | Description |
|---|---|
| Waiting | Instance is being deployed or updated and is currently being configured on hardware. |
| Running | Instance is active and accessible via JupyterLab or SSH. Compute billing is active. |
| Stopped | Instance is not running on hardware. Workspace persists, but storage charges may apply. |
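
When automating around these states, a common pattern is to poll until the instance reports Running before connecting. The sketch below assumes a caller-supplied `get_state` function that returns one of the state names from the table; how the state is actually fetched is not covered here, and the timeout and interval defaults are arbitrary.

```python
import time


def wait_until_running(get_state, timeout_s=600.0, interval_s=10.0, sleep=time.sleep):
    """Poll get_state() until it returns "Running" or the timeout elapses.

    Returns True if the instance reached Running, False on timeout.
    Raises RuntimeError if the instance reports Stopped while waiting.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        state = get_state()  # hypothetical callable supplied by the caller
        if state == "Running":
            return True
        if state == "Stopped":
            raise RuntimeError("instance is Stopped; start it before waiting")
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

Treating Stopped as an error rather than continuing to wait is a design choice of this sketch: a stopped instance will not become accessible on its own, so failing early is usually more helpful than timing out.
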
Best Practices for ML Workloads
Always save checkpoints, datasets, and model weights under /home/jovyan. Anything outside may be lost on restart.
Use committed billing for regular heavy training runs: it offers better pricing and access to local NVMe on select GPU SKUs.
Keep large datasets in EOS buckets and mount them on demand rather than copying to workspace.
Run nvidia-smi in the first Jupyter cell of every session to confirm GPU visibility and VRAM.
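
One way to follow the last recommendation from Python is to call `nvidia-smi` with its CSV query options and parse the result. This sketch assumes `nvidia-smi` is on the `PATH` of the instance; the helper names are illustrative.

```python
import shutil
import subprocess


def parse_gpu_csv(text):
    """Parse 'name, memory.total' CSV lines into (name, memory) tuples."""
    gpus = []
    for line in text.strip().splitlines():
        name, _, memory = line.rpartition(",")
        if name:
            gpus.append((name.strip(), memory.strip()))
    return gpus


def list_gpus():
    """Return visible GPUs, or an empty list if nvidia-smi is missing or fails."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    return parse_gpu_csv(result.stdout) if result.returncode == 0 else []


print(list_gpus() or "No GPU visible")
```

Running this in the first cell gives a compact list of GPU names and total memory, and an explicit message when no GPU is visible, which is easier to spot than a missing table in the full `nvidia-smi` output.
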