ML4LM — Speculative Decoding — from where we left off

Most blog posts on speculative decoding stop at the basics and skip the implementation details. I break down what is usually missing: batching, accept/reject checks, and fallbacks.

Read the full article on Medium