Data Syncer
TIR's Data Syncer is a service that transfers data from an external data source into TIR Datasets. You define a source (where data lives), a destination (which dataset path receives it), and a connection that ties them together with a sync mode and schedule. Run syncs on demand or on a recurring schedule so training and inference read up-to-date data.
Data Syncer transfers only files up to 50 GB in size. If a file exceeds this limit, the transfer for that particular file is skipped.
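Because oversized files are skipped rather than failing the sync, it can help to check object sizes before the first run. The following is a minimal sketch, not part of Data Syncer: `split_by_size_limit` is a hypothetical helper, the input is any name-to-size listing you obtain from your own storage tooling, and the limit is assumed to mean 50 × 10^9 bytes (the documentation does not say whether decimal or binary gigabytes are meant).

```python
# Assumption: the 50 GB limit is interpreted as 50 * 10**9 bytes.
SIZE_LIMIT_BYTES = 50 * 10**9


def split_by_size_limit(sizes, limit=SIZE_LIMIT_BYTES):
    """Split a {name: size_in_bytes} listing into (transferable, skipped).

    A file is treated as skipped only when its size strictly exceeds the limit.
    Both result lists are sorted by name.
    """
    transferable = sorted(name for name, size in sizes.items() if size <= limit)
    skipped = sorted(name for name, size in sizes.items() if size > limit)
    return transferable, skipped


if __name__ == "__main__":
    listing = {"small.csv": 1_024, "huge.tar": 60 * 10**9}
    ok, too_big = split_by_size_limit(listing)
    print("will transfer:", ok)
    print("would be skipped:", too_big)
```

Files reported as skipped would need to be split or compressed below the limit before they can be synced.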
Quick start
Explore Data Syncer
- Sources: Connect external data stores
- Destinations: Target paths on EOS datasets
- Connections: Link sources to destinations
- Sync & schedule: Modes, jobs, and timing
Console workflow
- Open TIR and select a project.
- In the sidebar, open Data Syncer. You will see Sources, Destinations, and Connections.
- Create and save a source, a destination, and then a connection that links them. All three are required before a sync can run.
Sources
A source is a configured connector to an external system, such as an object store or Google Drive. TIR reads objects from that source when a connection runs.
Supported sources
| Source | Authentication | Incremental sync |
|---|---|---|
| AWS S3 | IAM user | Yes |
| Azure Blob | Connection string | Yes |
| Google Drive | OAuth | Yes |
| Google Cloud Storage | Service account | Yes |
| MinIO | Access & secret keys | Yes |
Creating a source
- In Data Syncer, open Sources and choose Create Source.
- Pick a connector type and enter the required credentials and settings for your environment.
Step-by-step options for each connector are in Configuring Sources.
After you save, the source can be reused by one or more connections.
Updating a source
Open the source from the list and use Edit to change credentials or settings. Updates apply to future sync jobs; running jobs are not rolled back automatically.
Destinations
A destination maps sync output to an EOS-backed TIR Dataset and a path inside that dataset. You can select an existing dataset or create one first, then point the destination at the folder prefix where synced objects should land.
Creating a destination
- In Data Syncer, open Destinations and choose Create Destination.
- Select the dataset and the destination path (prefix) for incoming files.
- Confirm to create the destination.
Choose your destination path within the Dataset carefully. Any conflicts with the existing dataset files could result in data loss.
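One way to reduce the risk of conflicts is to compare the paths that a sync would write against the paths already present in the dataset. The sketch below is illustrative only: `find_conflicts` is a hypothetical helper, both key lists are assumed to come from your own listings of the source and the dataset, and it assumes synced objects land at `<prefix>/<source key>`, which the documentation does not confirm.

```python
def find_conflicts(existing_keys, incoming_keys, prefix=""):
    """Return destination paths that already exist in the dataset.

    existing_keys: paths currently in the dataset (assumed listing).
    incoming_keys: object keys from the source (assumed listing).
    prefix: destination path inside the dataset; "" means the dataset root.
    """
    prefix = prefix.strip("/")
    existing = set(existing_keys)
    conflicts = []
    for key in incoming_keys:
        key = key.lstrip("/")
        # Assumption: the destination path is the prefix joined with the source key.
        target = f"{prefix}/{key}" if prefix else key
        if target in existing:
            conflicts.append(target)
    return sorted(conflicts)


if __name__ == "__main__":
    dataset = ["raw/train.csv", "raw/val.csv", "models/model.pt"]
    source = ["train.csv", "test.csv"]
    print(find_conflicts(dataset, source, prefix="raw"))
    print(find_conflicts(dataset, source, prefix="incoming"))
```

An empty result for a candidate prefix suggests it is safer to use than one that reports overlapping paths.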
Updating a destination
Use Edit on a destination to change the target path or dataset selection. Validate downstream jobs or pipelines that depend on the old path before switching.
Connections
A connection binds one source to one destination and controls how and when data is copied. This is where you set sync mode, schedule, and optional manual runs.
Sync modes
A sync mode governs how TIR reads from a source and writes to a destination. TIR supports two sync modes:
| Mode | Behavior |
|---|---|
| Full refresh | Reads the full source scope and writes to the destination, replacing content according to the connector semantics for that run. |
| Incremental | Copies objects added or changed since the last successful sync, when the source supports it. |
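The difference between the two modes can be pictured as a selection rule over the source objects. This is a conceptual sketch, not Data Syncer's implementation: `SourceObject` and `select_objects` are hypothetical names, change detection by modification timestamp is an assumption, and so is the choice to copy everything when no previous successful sync exists.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceObject:
    key: str
    modified_at: float  # seconds since the epoch (assumed change marker)


def select_objects(objects, mode, last_success_at=None):
    """Return the keys a sync run would copy under the given mode."""
    if mode not in ("full_refresh", "incremental"):
        raise ValueError(f"unknown sync mode: {mode!r}")
    # Full refresh reads the whole source scope. An incremental run with no
    # earlier successful sync is assumed to behave the same way.
    if mode == "full_refresh" or last_success_at is None:
        return [obj.key for obj in objects]
    # Incremental: only objects added or changed since the last successful sync.
    return [obj.key for obj in objects if obj.modified_at > last_success_at]


if __name__ == "__main__":
    objs = [SourceObject("a.csv", 100.0), SourceObject("b.csv", 200.0)]
    print(select_objects(objs, "full_refresh", last_success_at=150.0))
    print(select_objects(objs, "incremental", last_success_at=150.0))
```

In this model, incremental runs do less work on large sources that change slowly, while a full refresh is the simpler choice when the destination should mirror the whole source scope.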
Schedule modes
Schedule mode defines how often data from your source syncs to the destination. The options are:
| Mode | Behavior |
|---|---|
| Scheduled | Runs at a regular interval you configure in the UI. |
| Cron | Runs on a cron schedule. See Cron expressions for syntax and examples. |
| Manual | No automatic runs; you trigger syncs with Sync Now. |
You can still use Sync Now on any connection when you need an extra run, regardless of schedule mode.
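For the Cron schedule mode, it can be useful to sanity-check an expression before entering it. The sketch below assumes the common five-field format (minute, hour, day of month, month, day of week) with numbers, `*`, ranges, lists, and steps; the syntax Data Syncer actually accepts is described in Cron expressions and may differ, for example by allowing names such as MON or the value 7 for Sunday. `is_valid_cron` is a hypothetical helper, not a TIR API.

```python
# (field name, lowest value, highest value) for a standard five-field cron line.
FIELDS = [
    ("minute", 0, 59),
    ("hour", 0, 23),
    ("day of month", 1, 31),
    ("month", 1, 12),
    ("day of week", 0, 6),
]


def _is_int(text):
    return text.isascii() and text.isdigit()


def _valid_part(part, lo, hi):
    """Check one comma-separated item such as '5', '1-5', '*', or '*/15'."""
    base, slash, step = part.partition("/")
    if slash and not (_is_int(step) and int(step) > 0):
        return False
    if base == "*":
        return True
    start, dash, end = base.partition("-")
    if not _is_int(start) or (dash and not _is_int(end)):
        return False
    first = int(start)
    last = int(end) if dash else first
    return lo <= first <= last <= hi


def is_valid_cron(expr):
    """Return True if expr looks like a valid five-field cron expression."""
    fields = expr.split()
    if len(fields) != len(FIELDS):
        return False
    return all(
        all(_valid_part(part, lo, hi) for part in field.split(","))
        for field, (_name, lo, hi) in zip(fields, FIELDS)
    )


if __name__ == "__main__":
    for expr in ["0 2 * * *", "*/15 * * * *", "0 25 * * *"]:
        print(expr, "->", is_valid_cron(expr))
```

Under these assumptions, `0 2 * * *` would describe a daily run at 02:00 and `*/15 * * * *` a run every 15 minutes.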
Creating a connection
- In Data Syncer, open Connections and choose Create Connection.
- Select the source and destination to link.
- Choose Sync mode and Schedule mode, then create the connection.
After creation, use the Overview tab for connection details and the Jobs tab for history.
Enable or disable a connection
To pause sync jobs for a connection, disable it with the Toggle button in the Actions column. While the connection is disabled, no scheduled sync jobs start; a sync job that is already running is not affected.
Use the same Toggle button to enable the connection again. Any scheduled jobs then run as scheduled.
Updating a connection
Use Edit to change sync mode, schedule, or the linked source or destination. Review active jobs after changes so you understand what the next runs will do.
Jobs and lifecycle
Cancel running jobs
From the Jobs view for a connection, you can cancel an in-progress sync. Cancellation stops further work for that job. Files already written remain; nothing is rolled back automatically.
What’s next
- Configuring Sources: credentials and options per connector
- Cron expressions: schedule advanced syncs
- TIR Datasets: create and manage EOS datasets used as destinations