News
5/22/2024 Preprint release of Preble: Efficient Distributed Prompt Scheduling for LLM Serving
5/1/2024 🎉 InferCept was accepted to ICML 2024!
2/4/2024 arXiv release of InferCept: Efficient Intercept Support for Augmented LLM Inference