Member of Technical Staff – AI Inference platform, features

Zürich, full-time, permanent, experienced (2-5 years of experience)
Your mission

You will expand the capabilities of Lyceum's AI inference platform, the first EU-sovereign inference cloud. You'll own the features that customers interact with directly: model serving configurations, API surface, framework integrations, and developer experience. This means understanding what customers need, building it fast, and making sure it works reliably at scale.

Your focus

- Feature development: Design and ship new platform capabilities, from supporting new model architectures and serving frameworks to building out the API features that customers are asking for.

- Customer-facing engineering: Work closely with customers and the commercial team to understand real-world usage patterns, translate feature requests into technical designs, and iterate based on feedback.

- Developer experience: Improve the end-to-end experience of deploying and running inference on Lyceum, from initial setup through to monitoring and debugging in production.

Your KPIs

- Number of platform features shipped

- Time from customer request to feature availability

- Breadth of supported models, frameworks, and deployment configurations

- Customer feedback on platform usability and capability

Your profile

We consider candidates from diverse backgrounds who share a deep love for technical challenges and a desire to take on ownership beyond what's reasonably expected. You're someone who stays close to the rapidly evolving open-source AI ecosystem and gets energy from turning emerging tools into production-grade platform capabilities.

Requirements

- 3+ years of experience in software engineering, with a focus on backend or infrastructure systems

- Strong proficiency in Go and Python

- Hands-on experience with at least one ML inference serving framework (vLLM, TGI, etc.)

- Solid understanding of how large language models and other AI models are deployed and served in production

- Experience working with REST/gRPC APIs and designing developer-facing interfaces

Nice to have

- Familiarity with GPU scheduling, batching strategies, or inference optimisation (quantisation, speculative decoding, etc.)

- Experience with Kubernetes and container orchestration in a production setting

- Knowledge of AI model formats and conversion pipelines (GGUF, SafeTensors, ONNX)

- Background in developer tools, platform engineering, or API design

Why us?

- Outstanding team: Work with some of the best engineers in the world, coming from hedge funds, big tech, AI startups, and top universities

- Once-in-a-lifetime opportunity: Join an early-stage company in the fastest-growing market in the world

- Ownership: Shape how European AI companies access GPU compute

- European mission: Build sovereign, GDPR-compliant AI infrastructure for the next generation of deep-tech

Apply for this role

Fill out the form below and we'll get back to you as soon as possible.

Having trouble with the form? Send us your CV directly at careers@lyceum.technology.

Documents

Please upload your CV, recent certificates, and a brief cover letter (max 4 MB per file).
