Accelerating LLM Inference with Prompt Caching for Open‑Source Models on Databricks

* Prompt caching reuses repeated prompt prefixes so LLMs run faster. It cuts latency and boosts throughput automatically. * Databricks now supports prompt caching for open-source models across batch, pay-per-token, and provisioned workloads. No setup is requ…