Upstage
AI Engineer - LLM Serving Platform
Seoul, South Korea
Posted 11 days ago
What you'd do
- Design, develop, and operate an LLM serving platform on large-scale GPU clusters.
- Implement routing and scheduling logic to handle large-scale traffic efficiently.
- Implement custom extensions for open-source inference runtimes such as vLLM and SGLang.
What they want
- 3+ years of AI model serving experience.
- Deep understanding of the latest LLM architectures and serving techniques.
- Production experience developing and operating real-time AI inference services.
Nice to have
- Multi-region global service operations experience.
- Contributions to open-source LLM frameworks (vLLM, SGLang, TensorRT-LLM, Transformers).
- Large-scale GPU cluster operations experience.