Features
- Pipeline Versioning — Track iterative changes with multiple versions per pipeline.
- Runs & Experiments — Execute pipelines and organize runs into experiments.
- Scheduled Runs — Automate pipeline execution at specific times or intervals.
- Docker Image Execution — Run custom Docker images as pipeline workflows.
- Data Transfer (EOS/PFS) — Transfer data between EOS and PFS using Argo Workflows.
- Scalable Execution — Serverless, async execution with retry and unlimited re-runs.
Feature Overview
1. Argo & Kubeflow Support
TIR Pipelines support two industry-standard template formats:
- Argo Workflows — Define multi-step workflows using Argo's YAML specification. Supports DAG-based and step-based execution.
- Kubeflow Pipelines — Use Kubeflow's pipeline SDK to define ML workflows.
When you upload a YAML file, TIR automatically detects the pipeline type (Argo, Kubeflow, or generic). The YAML is validated on upload — containerSet templates are not supported.
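For reference, a minimal step-based Argo Workflow of the kind TIR accepts might look like the sketch below. The image and commands are placeholders; note that it uses ordinary container templates, not the unsupported containerSet:

```yaml
# Minimal step-based Argo Workflow (illustrative placeholders throughout).
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: hello-pipeline-
spec:
  entrypoint: main
  templates:
    - name: main
      steps:
        - - name: say-hello
            template: hello
    - name: hello
      container:            # a plain container template, not containerSet
        image: alpine:3.19  # placeholder image
        command: [echo]
        args: ["hello from TIR"]
```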
2. Pipeline Versioning
Each pipeline supports multiple versions, allowing you to track iterative changes to your workflow definition.
How versioning works:
- When you create a pipeline, the first uploaded YAML becomes the default version.
- You can upload additional versions under the same pipeline by clicking CREATE VERSION in the Pipeline Versions tab.
- Each version is an independent YAML upload with its own `version_id`.
- One version is marked as the default version for the pipeline.
- Runs can be created against any specific version.
Managing versions:
- View versions: Select a pipeline and go to the Pipeline Versions tab.
- Create a version: Click CREATE VERSION, upload a `.yaml` file, and click UPLOAD.
- Delete a version: Click the delete icon next to a version. This terminates all active runs for that version before deletion.
3. Pipeline Actions
From the pipeline listing page, the Actions menu provides:
- Create Version — Upload a new `.yaml` version for the selected pipeline.
- Delete — Permanently delete the pipeline and all associated versions and runs.
Deleting a pipeline soft-deletes all associated versions and runs. This action cannot be undone.
4. Runs & Experiments
A Run is a single execution of a pipeline version. Each run uses a selected resource plan (CPU or GPU) and executes the workflow defined in the YAML.
Creating a run:
- Navigate to Pipelines > Run, or click Create Run from a specific pipeline or version.
- Select or create an Experiment to organize the run.
- Choose the pipeline version and configure any run parameters.
- Select a resource plan and click FINISH.
Experiments are containers that group related pipelines and their run histories. Use them to organize runs by project phase, model type, or any logical grouping.
Run actions:
- Retry — Restart a failed run without losing completed work. Uses the `PUT /runs/{run_id}/?action=retry` endpoint.
- Terminate — Stop a running execution immediately. Uses the `PUT /runs/{run_id}/?action=terminate` endpoint.
- Delete — Remove a run. Terminates it first if still active, then releases allocated resources.
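Beyond the Retry action, an Argo-based pipeline can also declare retries in the workflow YAML itself via Argo's `retryStrategy` field. A sketch (limits, image, and command are illustrative):

```yaml
templates:
  - name: train
    retryStrategy:
      limit: "3"             # re-attempt a failing step up to 3 times
      retryPolicy: OnFailure # retry only on failure, not on errors
    container:
      image: alpine:3.19     # placeholder image
      command: [sh, -c]
      args: ["python train.py"]   # placeholder command
```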
Viewing run details:
- Click the run name to see execution details, workflow manifests, pod status, and progress.
- Run states include: pending, running, succeeded, failed, and terminated.
5. Scheduled Runs
Scheduled runs automate pipeline execution at specific times or recurring intervals.
Creating a scheduled run:
- During run creation, enable the Schedule Run toggle.
- Configure the schedule:
  - Cron expression — Define a recurring pattern (e.g., `0 0 * * *` for daily at midnight).
  - Start time / End time — Optional time boundaries for the schedule.
  - Max concurrency — Limit the number of simultaneous runs from this schedule.
- Select a resource plan and click CREATE.
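TIR configures schedules in the console, but the fields above map closely onto Argo's CronWorkflow spec; a rough sketch of the equivalent YAML (name and image are illustrative):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
  name: nightly-train           # placeholder name
spec:
  schedule: "0 0 * * *"         # recurring pattern: daily at midnight
  concurrencyPolicy: Forbid     # roughly analogous to max concurrency = 1
  workflowSpec:
    entrypoint: main
    templates:
      - name: main
        container:
          image: alpine:3.19    # placeholder image
          command: [echo]
          args: ["scheduled run"]
```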
Managing scheduled runs:
- Navigate to Pipelines > Scheduled Run to view all scheduled jobs.
- Enable/Disable — Toggle a schedule on or off without deleting it.
- Delete — Permanently remove a scheduled job. The schedule is disabled first, then deleted.
- View related runs — See all runs triggered by a specific scheduled job.
6. Docker Image Execution
Run custom Docker images as pipeline workflows by defining an Argo Workflow YAML with your container image, command, and arguments.
- Supports both public and private images (private images require `imagePullSecrets`).
- Customize the entrypoint and arguments for your container.
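A sketch of what such a workflow might look like — the registry, secret name, and command are hypothetical, and the Docker Run Guide documents the supported templates:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: custom-image-run-
spec:
  entrypoint: main
  imagePullSecrets:
    - name: my-registry-secret   # hypothetical secret for a private registry
  templates:
    - name: main
      container:
        image: registry.example.com/team/model-train:v1   # placeholder private image
        command: ["python", "train.py"]                   # custom entrypoint
        args: ["--epochs", "10"]                          # custom arguments
```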
For complete instructions, YAML templates, and ImagePullSecret setup, see the Docker Run Guide.
7. Data Transfer (EOS/PFS)
Transfer data between EOS object storage and the Parallel File System (PFS) using pre-built Argo Workflow templates.
- PFS to EOS — Upload files from your filesystem to an EOS bucket.
- EOS to PFS — Download data from an EOS bucket into your PFS filesystem.
Both workflows are available as downloadable YAML files that you can upload as pipelines.
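The exact templates are provided as downloads, but conceptually a PFS-to-EOS transfer is a container step that mounts the PFS volume and copies data to EOS's S3-compatible endpoint. A hypothetical sketch — bucket, claim name, and CLI image are placeholders, and credential/endpoint configuration is omitted:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: pfs-to-eos-
spec:
  entrypoint: transfer
  volumes:
    - name: pfs
      persistentVolumeClaim:
        claimName: my-pfs-claim        # placeholder PFS volume claim
  templates:
    - name: transfer
      container:
        image: amazon/aws-cli:2.15.0   # any S3-compatible CLI would do
        command: [aws]
        args: ["s3", "cp", "/mnt/pfs/data/", "s3://my-eos-bucket/data/", "--recursive"]
        volumeMounts:
          - name: pfs
            mountPath: /mnt/pfs        # PFS mounted into the step container
```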
For step-by-step instructions and YAML downloads, see the Data Transfer Guide.
8. Scalable & Reliable Execution
TIR Pipelines are built for production ML workloads:
| Capability | Description |
|---|---|
| Serverless execution | No infrastructure to manage — runs execute on demand. |
| Asynchronous processing | Pipelines run in the background; monitor via dashboard or API. |
| Best-in-class retry | Restart failed jobs without losing completed work. |
| Unlimited re-runs | Execute a pipeline version as many times as needed. |
| Stored results | Run artifacts and logs are stored in EOS buckets. |
| CPU & GPU plans | Choose the right resource plan for each step of your workflow. |
Best Practices
Pipeline Design
- Keep YAML definitions clean and simple — avoid introducing extra nodes or commands that may conflict.
- Use pipeline versioning to track iterative changes rather than overwriting existing pipelines.
- Use experiments to organize related runs by project phase or model type.
Resource Optimization
- Choose the right resource plan (CPU vs GPU) for your workload.
- Use CPU plans for data preprocessing steps; reserve GPU plans for training.
- Set appropriate `max_concurrency` on scheduled runs to avoid resource contention.
Reliability
- Leverage the retry mechanism to resume failed jobs without restarting from scratch.
- Store intermediate results in EOS buckets to avoid recomputation.
- Use scheduled runs for recurring batch jobs to automate execution.