News
- 11/25/2024 Release of Cognify: The Automated Optimizer for Generative AI Workflows
- 10/4/2024 Preprint update of Preble: Efficient Distributed Prompt Scheduling for LLM Serving, adding a comparison with the latest SGLang
- 5/22/2024 Preprint release of Preble: Efficient Distributed Prompt Scheduling for LLM Serving
- 5/1/2024 🎉 InferCept was accepted to ICML 2024!
- 2/4/2024 arXiv release of InferCept: Efficient Intercept Support for Augmented LLM Inference