We study speculative decoding for message-oriented
systems: a fast predictor proposes a decision and exits early when
confidence exceeds a threshold τ; otherwise, a slow, more accurate
predictor runs (with timeout ∆). We quantify accuracy/latency
tradeoffs, throughput gains, and accept/fallback rates, and show how
τ and ∆ shape the Pareto frontier.
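
The accept/fallback rule described above can be illustrated with a minimal Python sketch. It is not the implementation evaluated here: the names `speculative_decide`, `Decision`, `toy_fast`, and `toy_slow` are hypothetical, the thread pool is just one convenient way to enforce the timeout, and the behavior when the slow predictor exceeds ∆ (reusing the fast predictor's proposal) is an assumption, since the text above does not specify it.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from typing import Callable, NamedTuple, Tuple


class Decision(NamedTuple):
    label: str
    source: str  # "fast", "slow", or "fast_timeout"


def speculative_decide(
    message: str,
    fast: Callable[[str], Tuple[str, float]],  # returns (label, confidence)
    slow: Callable[[str], str],                # returns label
    tau: float,                                # confidence threshold
    delta_s: float,                            # slow-path timeout, in seconds
    pool: ThreadPoolExecutor,
) -> Decision:
    # Fast path: accept the proposal and exit early if confidence exceeds tau.
    label, confidence = fast(message)
    if confidence > tau:
        return Decision(label, "fast")

    # Fallback: run the slow predictor, waiting at most delta_s seconds.
    future = pool.submit(slow, message)
    try:
        return Decision(future.result(timeout=delta_s), "slow")
    except FutureTimeout:
        # ASSUMPTION: on timeout, reuse the fast proposal. The source does
        # not say what happens here; other policies are equally plausible.
        future.cancel()  # best effort; an already-running call is not interrupted
        return Decision(label, "fast_timeout")


# Hypothetical toy predictors, for illustration only.
def toy_fast(message: str) -> Tuple[str, float]:
    # Pretend short messages are "easy" and get high confidence.
    return ("ok", 0.95) if len(message) < 10 else ("ok", 0.40)


def toy_slow(message: str) -> str:
    time.sleep(0.05)  # simulate a slower, more accurate model
    return "review"
```

In this sketch, raising `tau` routes more messages to the slow predictor, and lowering `delta_s` caps how long any single fallback can take. Counting the `source` field over a stream of messages gives the accept and fallback rates.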