Platform Engineer (AI Infrastructure)

We Love Alfa

London

£120,000 - £180,000 per annum

Posted 1 day ago

About the role

We are hiring a Platform Engineer to help build and evolve the software platform behind large-scale AI infrastructure. This is a hands-on engineering role for someone who can write strong Python, work deeply with Kubernetes, design and build platform applications, and operate close to bare metal infrastructure. You will help build the systems that make GPU compute easier to provision, operate, secure and scale across AI infrastructure environments.

This is not a generic DevOps role. We are not looking for someone who has only maintained pipelines, written Terraform or managed cloud services. We need someone who can build real platform software and understands the infrastructure it runs on.

What you will do

- Design and build platform applications, APIs and services
- Write production-grade Python for infrastructure and platform use cases
- Work with Kubernetes to build scalable platform capabilities
- Design and build Kubernetes operators and controllers across compute, storage and networking
- Build tooling that improves how bare metal and GPU infrastructure is provisioned, operated and monitored
- Translate operational pain points into scalable platform features
- Improve platform reliability, observability and performance
- Work across Linux, networking, storage and distributed systems
- Collaborate with product, security, infrastructure, networking and compute teams
- Help build the platform layer for AI infrastructure designed to operate at industrial scale

What we are looking for

- Strong Python engineering experience
- Strong hands-on Kubernetes experience
- Experience designing and building applications, APIs, services or internal platform tooling
- Bare metal infrastructure experience
- Strong Linux systems experience
- Good understanding of networking, storage and distributed systems
- Experience building production-grade systems with proper testing, CI/CD, code reviews and clean engineering standards
- A practical engineering mindset and the ability to solve real infrastructure problems through software

Preferred experience

- Experience building Kubernetes operators, CRDs or controllers
- Exposure to GPU infrastructure, HPC or high-performance compute
- Experience with Go or Rust
- Knowledge of confidential computing, including TEE, SEV, TDX or CoCo
- Experience with Ceph or distributed storage systems
- Familiarity with Prometheus, Grafana or OpenTelemetry
- Experience with BGP, RDMA or high-performance networking
- Exposure to NVIDIA GPU infrastructure or bare metal cloud environments

Why this role matters

AI infrastructure is constrained by the ability to deliver reliable compute at scale. This role sits in the platform layer that connects software engineering with real infrastructure. You will help build systems that run close to the metal, across Kubernetes, Linux, networking, storage and GPU compute. This is a role for someone who wants to build the infrastructure layer behind AI, not just operate tools around it.
