News
- 10/4/2024 Preprint update of Preble: Efficient Distributed Prompt Scheduling for LLM Serving, with a comparison against the latest SGLang
- 5/22/2024 Preprint release of Preble: Efficient Distributed Prompt Scheduling for LLM Serving
- 5/1/2024 🎉 InferCept was accepted to ICML 2024!
- 2/4/2024 arXiv release of InferCept: Efficient Intercept Support for Augmented LLM Inference
Efficient Augmented LLM Serving With InferCept