Loss Functions That Align With Trading PnL
2026-03-19
The default loss for time series forecasting is MSE or MAE. Both measure how close the predicted value is to the actual value. Neither measures what a trader actually cares about: did the forecast lead to a profitable decision?
This disconnect is the single biggest source of wasted compute in market model training. You can halve your MAE and see zero improvement in trading PnL. Conversely, a model with higher MAE but better directional accuracy on large moves can dramatically outperform in simulation.
The Problem With Symmetric Losses
MSE and MAE treat all errors equally regardless of direction. A forecast that is 1% too high on a bar where price dropped 3% gets the same penalty as one that is 1% too low on a bar where price rose 3%. But for a long-biased trading system, these errors have completely different PnL consequences.
Symmetric losses also weight all bars equally. A 0.5% error on a bar where price moved 0.1% dominates the gradient, even though that bar contributed nothing to trading PnL.
Directional Accuracy as a Loss Component
We add a term that explicitly penalizes getting the direction wrong:
direction_loss = -mean(sign(predicted_return) * actual_return)
This is essentially the negative of a simple long/short strategy return. The sign function provides no usable gradient (it is zero almost everywhere and discontinuous at zero), so we use a smooth approximation: tanh with a temperature parameter. The temperature controls how much the model focuses on getting direction right versus magnitude right.
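A minimal NumPy sketch of this term; the temperature value below is illustrative, not a recommendation from the post:

```python
import numpy as np

def direction_loss(predicted, actual, temperature=0.05):
    """Smooth directional loss: the negative mean return of a long/short
    strategy that trades the sign of the forecast.

    tanh(predicted / temperature) approximates sign(predicted); a smaller
    temperature makes the approximation sharper.
    """
    soft_sign = np.tanh(predicted / temperature)
    return -np.mean(soft_sign * actual)
```

The loss is negative when predicted and actual returns tend to agree in sign, and positive when they disagree, so minimizing it pushes directional accuracy up.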
Return-Weighted MAE
Instead of uniform MAE, we weight each bar's error by the absolute return of that bar:
weighted_mae = mean(|actual_return| * |predicted - actual|)
Bars with large moves contribute more to the loss. Bars where nothing happened contribute almost nothing. This aligns gradient signal with the bars that matter for trading.
The effect in practice: the model stops wasting capacity on predicting the exact close of quiet bars and focuses its representation on the dynamics around significant moves.
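The formula above is a one-liner in NumPy; this sketch just makes the weighting explicit:

```python
import numpy as np

def return_weighted_mae(predicted, actual):
    # Each bar's absolute error is scaled by the size of the actual move,
    # so bars where nothing happened contribute almost nothing.
    return np.mean(np.abs(actual) * np.abs(predicted - actual))
```

The same 1% error costs far more on a bar that moved 3% than on a bar that moved 0.1%, which is exactly the gradient reallocation described above.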
Quantile Loss With Asymmetric Penalties
Standard quantile (pinball) loss penalizes over-predictions and under-predictions differently based on the quantile level. We extend this by making the asymmetry depend on the trading context:
- Upper quantile (90th): over-prediction is cheap (you miss an entry), under-prediction is expensive (you take a position that hits its stop).
- Lower quantile (10th): under-prediction is cheap, over-prediction is expensive.
This produces tighter risk bounds where it matters and looser bounds where false precision is harmless.
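One way to sketch this extension is to decompose pinball loss into its under- and over-prediction sides and scale the expensive side by a multiplier. The `expensive_mult` knob and its value are illustrative assumptions, not parameters from the post:

```python
import numpy as np

def trading_pinball(actual, predicted, q, expensive_mult=2.0):
    """Pinball loss with an extra penalty on the trading-expensive side.

    Standard pinball is q * under + (1 - q) * over, where `under` is the
    error when actual exceeds the prediction and `over` the reverse.
    For an upper quantile, under-prediction is the expensive side
    (a position that hits its stop); for a lower quantile, it is
    over-prediction. With expensive_mult=1.0 this reduces to the
    standard pinball loss.
    """
    err = actual - predicted
    under = np.maximum(err, 0.0)   # actual above the prediction
    over = np.maximum(-err, 0.0)   # actual below the prediction
    if q >= 0.5:
        return np.mean(q * under * expensive_mult + (1.0 - q) * over)
    return np.mean(q * under + (1.0 - q) * over * expensive_mult)
```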
Sortino-Inspired Training Loss
The Sortino ratio penalizes downside deviation but not upside deviation. We adapt this for training:
sortino_loss = -mean(predicted_return * actual_return) / downside_std(predicted_return * actual_return)
Here downside_std counts only the cases where the model's implied trade lost money. This teaches the model to avoid confident predictions that lead to losses, while being tolerant of predictions that lead to gains even if the magnitude estimate was off.
In practice this loss is noisy and needs to be combined with a standard regression loss for stability. We use a weighted sum: 70% return-weighted MAE + 30% Sortino-inspired component.
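A sketch of the combined objective, assuming the implied per-bar trade PnL is `predicted * actual` (i.e. sizing proportional to the forecast) and using the standard downside-deviation form for downside_std; both are assumptions where the post leaves details open:

```python
import numpy as np

def sortino_loss(predicted, actual, eps=1e-8):
    # Implied trade PnL when position size is proportional to the forecast.
    pnl = predicted * actual
    # Downside deviation: only losing trades contribute.
    downside = np.sqrt(np.mean(np.minimum(pnl, 0.0) ** 2))
    return -np.mean(pnl) / (downside + eps)

def combined_loss(predicted, actual):
    # 70/30 mix from the post: return-weighted MAE for stability,
    # Sortino-inspired component for PnL alignment.
    rw_mae = np.mean(np.abs(actual) * np.abs(predicted - actual))
    return 0.7 * rw_mae + 0.3 * sortino_loss(predicted, actual)
```

The regression term anchors the gradients; the ratio term alone is noisy because a small downside denominator can make it explode.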
Multi-Horizon Loss Aggregation
Our models predict multiple future bars simultaneously. We weight the loss by horizon importance:
- Steps 1-4 (next few hours): 40% weight — these drive entry decisions
- Steps 5-12 (next day): 35% weight — these drive sizing and exit targets
- Steps 13-24 (multi-day): 25% weight — these provide directional context
The weights come from analyzing which forecast horizons have the most impact on simulated PnL.
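The aggregation itself is a weighted average over horizon buckets. A minimal sketch, assuming hourly bars and a per-step loss already computed along the last axis:

```python
import numpy as np

# Horizon buckets and weights from the post (0-indexed slices over 24 steps).
HORIZON_WEIGHTS = [
    (slice(0, 4), 0.40),    # steps 1-4: entry decisions
    (slice(4, 12), 0.35),   # steps 5-12: sizing and exit targets
    (slice(12, 24), 0.25),  # steps 13-24: directional context
]

def multi_horizon_loss(per_step_loss):
    """per_step_loss: array of shape (..., 24), one loss per forecast step.
    Each bucket's mean loss is scaled by its importance weight."""
    total = 0.0
    for sl, weight in HORIZON_WEIGHTS:
        total += weight * np.mean(per_step_loss[..., sl])
    return total
```

Because the weights sum to 1, a uniform per-step loss passes through unchanged, which keeps the aggregated loss on the same scale as its components.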
Regularization Toward the Prior
Foundation models like Chronos2 have strong priors from pre-training. When fine-tuning on market data, we add a KL divergence term between the fine-tuned model's output distribution and the base model's output distribution. This prevents the fine-tuned model from drifting too far from the base model's general time series understanding.
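A sketch of the KL anchor term, assuming both models emit logits over the same discretized output vocabulary (as Chronos-style models do); the `weight` coefficient is an illustrative assumption, since the post does not give one:

```python
import numpy as np

def kl_to_prior(finetuned_logits, base_logits, weight=0.1):
    """KL(finetuned || base), averaged over positions, scaled by `weight`.

    Logits have shape (..., vocab_size). The base model's distribution is
    the frozen pre-training prior the fine-tuned model is anchored to.
    """
    def softmax(x):
        x = x - x.max(axis=-1, keepdims=True)  # numerical stability
        e = np.exp(x)
        return e / e.sum(axis=-1, keepdims=True)

    p = softmax(finetuned_logits)
    q = softmax(base_logits)
    kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)
    return weight * np.mean(kl)
```

The term is zero when the fine-tuned model matches the base model exactly and grows as its output distribution drifts away, so the weight sets how far fine-tuning is allowed to pull the model from its prior.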
What We Measure To Validate
Every loss function change goes through a validation checklist:
- Standard MAE/RMSE: must not degrade by more than 10% relative to baseline.
- Quantile calibration: the 10/50/90 quantiles must remain calibrated on held-out data.
- Directional accuracy on large moves: must improve or stay flat.
- Simulated PnL: must improve Sortino ratio and reduce max drawdown on the market simulator.
- Stability across assets: improvements must be consistent across at least 80% of the asset universe.
If any check fails, the loss function change is rejected regardless of how much it improved the primary target metric.
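The gate logic itself is simple; this sketch uses hypothetical metric field names (the actual evaluation harness is not described in the post), with RMSE handled the same way as MAE and omitted for brevity:

```python
def passes_validation(metrics, baseline):
    """All-or-nothing acceptance gate for a loss function change.

    `metrics` and `baseline` are dicts of scalar results from held-out
    evaluation; the keys here are illustrative.
    """
    checks = [
        metrics["mae"] <= 1.10 * baseline["mae"],              # <=10% degradation
        metrics["calibration_ok"],                             # 10/50/90 quantiles calibrated
        metrics["dir_acc_large"] >= baseline["dir_acc_large"], # improve or stay flat
        metrics["sortino"] > baseline["sortino"],              # must improve
        metrics["max_drawdown"] < baseline["max_drawdown"],    # must reduce
        metrics["asset_consistency"] >= 0.80,                  # >=80% of universe
    ]
    return all(checks)
```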
Summary
The gap between forecast accuracy and trading profitability is real and measurable. By incorporating directional accuracy, return weighting, asymmetric quantile penalties, and PnL-inspired components, you can train models that are directly useful for trading rather than just good at predicting quiet bars.
Plug: BitBank uses these custom loss functions in production. See the dashboard for live predictions and the tech blog for more implementation details.