---
title: Pricing
---

# Pricing

Fine-tuning job costs depend on the **GPU plan** you select and the **duration of training**. Pricing is based on compute usage — you are charged for the time your training job runs on the selected GPU.

---

## How billing works

| Factor | Description |
|--------|-------------|
| **GPU type** | The GPU selected (H100, A100, etc.) determines the per-hour rate |
| **Training duration** | You are charged for the actual time the job runs on that GPU |
| **Billing start** | Begins when the job starts executing on the GPU |
| **Billing stop** | Ends when the job completes, fails, or is terminated |

Fine-tuning jobs are billed at a **per-hour rate** for the GPU compute resources used. There are no charges while a job is queued or pending.
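As a rough illustration of the billing start and stop rules above, the sketch below computes the billable duration from a job's timestamps. The function name and the timestamp values are hypothetical and are not part of any platform API. The point it shows is that time spent in the queue is excluded.

```python
from datetime import datetime, timezone


def billable_hours(started_at: datetime, ended_at: datetime) -> float:
    """Return the hours between execution start and job end.

    Hypothetical helper: queue time is never passed in, because billing
    only covers the period the job is executing on the GPU.
    """
    if ended_at < started_at:
        raise ValueError("ended_at must not be earlier than started_at")
    return (ended_at - started_at).total_seconds() / 3600


# Illustrative timestamps only (assumed values, not real job data).
queued_at = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)    # not billed
started_at = datetime(2024, 1, 1, 9, 30, tzinfo=timezone.utc)  # billing starts
ended_at = datetime(2024, 1, 1, 17, 30, tzinfo=timezone.utc)   # completes, fails, or is terminated

print(billable_hours(started_at, ended_at))  # 8.0 (the 30 minutes in the queue are excluded)
```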

---

## What affects cost

- **GPU type** — H100 and A100 have different per-hour rates
- **Training duration** — More epochs and larger datasets increase training time
- **Quantization** — Enabling 4-bit quantization lowers memory usage, which may allow the job to run on a smaller, lower-cost GPU

---

## Pricing examples

*Note: Values below are illustrative. Use the [E2E Calculator](https://calculator.e2enetworks.com/estimate-pricing) for current rates.*

### Example: Fine-tuning Llama 3 13B on H100

**Scenario:** Fine-tune Llama 3 13B for 2 epochs on a 50,000-row dataset. Estimated training time: ~8 hours.

| Factor | Value |
|--------|-------|
| GPU | H100 |
| Training duration | ~8 hours |

**Billing:** Cost = 8 hours × (price per H100 hour)
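The formula above can be sketched in a few lines of Python. The rate used here is a placeholder chosen only to make the arithmetic visible; it is not an actual H100 price, and the function name is hypothetical. Substitute the current rate from the calculator.

```python
def estimate_cost(hours: float, rate_per_hour: float) -> float:
    """Estimated job cost = training hours x per-hour GPU rate."""
    if hours < 0 or rate_per_hour < 0:
        raise ValueError("hours and rate_per_hour must be non-negative")
    return hours * rate_per_hour


# Placeholder value, NOT a real price: replace with the current H100 rate.
PLACEHOLDER_H100_RATE = 100.0

print(estimate_cost(8, PLACEHOLDER_H100_RATE))  # 800.0 in whatever currency the rate uses
```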

**Recommendation:** Use H100 for larger models where training speed matters; the higher per-hour rate is often offset by shorter job duration.
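To check whether a higher per-hour rate is actually offset for a given job, one option is to compute the break-even duration: the longest the faster GPU may run before it costs more than the slower one. The sketch below assumes you already have a duration estimate for the slower GPU. All rates and durations are placeholders, and the function names are hypothetical.

```python
def break_even_hours(rate_fast: float, rate_slow: float, hours_slow: float) -> float:
    """Longest run time on the faster GPU that still matches the slower GPU's total cost."""
    if rate_fast <= 0:
        raise ValueError("rate_fast must be positive")
    return rate_slow * hours_slow / rate_fast


def is_rate_offset(rate_fast: float, hours_fast: float,
                   rate_slow: float, hours_slow: float) -> bool:
    """True when the faster GPU's total cost is strictly lower."""
    return rate_fast * hours_fast < rate_slow * hours_slow


# Placeholder numbers only: the faster GPU costs 1.5x as much per hour.
print(break_even_hours(rate_fast=3.0, rate_slow=2.0, hours_slow=12))              # 8.0
print(is_rate_offset(rate_fast=3.0, hours_fast=6, rate_slow=2.0, hours_slow=12))  # True
```

With these placeholder values, the faster GPU is the cheaper choice only if it finishes the job in under 8 hours.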

---

For detailed pricing, visit the [E2E Calculator](https://calculator.e2enetworks.com/estimate-pricing).

---

## Frequently Asked Questions

#### Am I charged if my job fails?

Yes. You are charged for the compute time used up to the point of failure. Use **Retry** to restart a failed job without losing its configuration.

---

#### When does billing start?

Billing starts when the training job begins executing on the GPU — not when it enters the queue. There is no charge while the job is in a pending or queued state.

---

#### How can I reduce training costs?

- Enable **4-bit quantization** to reduce memory usage and potentially run on a smaller, lower-cost GPU.
- Start with **fewer epochs** to validate your dataset and configuration before running the full job.
- Use the **Clone** feature to reuse configurations and iterate without re-entering settings.
- **Terminate** a job early if training metrics indicate overfitting or an incorrect configuration.

---

#### Is there a minimum charge per job?

No. Billing is prorated, so you are charged only for the actual GPU compute time used, with no minimum duration.

---