Data Engineer - ML Systems for Autonomous Driving
Oxfordshire, South East England; England
Posted 1 day ago
About the role
Founded in 2014, Oxa is a global leader in autonomous vehicle (AV) technology, dedicated to accelerating Industrial Mobile Autonomy (IMA).
We develop advanced physical AI and robotics technology, anchored around our configurable and explainable self-driving software, Oxa Driver; development toolchain, Oxa Foundry; and fleet management software, Oxa Hub. Our solutions automate repetitive industrial driving tasks, such as the towing and carrying of goods in locations like ports, airports and manufacturing facilities, or asset and perimeter monitoring in environments such as solar farms or industrial plants. We’re helping global businesses to address critical challenges like labour shortages and rising operational costs - driving efficiency, productivity, and safety.
Based in Oxford, and with offices in Canada, our engineering team is drawn from the world’s top physical AI specialists and led by originators of the field.
We are hiring a Data Engineer to help build the systems that prepare, curate, and scale training and evaluation data for machine learning in autonomous driving.
You will work across the full data lifecycle, from raw vehicle logs and simulation outputs to curated, labelled, and model-ready datasets. This includes handling multimodal sensor data, scaling labelling through both human and ML-based workflows, and enabling intelligent selection of high-value data from thousands of hours of real-world and simulated driving.
This role sits close to model performance and safety: the quality, structure, and selection of data directly influence how perception and planning systems behave in the real world.
Transform raw multimodal logs (camera, LiDAR, radar) into training-ready datasets
Support hand-labelled and auto-labelled data pipelines, including validation and quality control
Help build and scale autolabelling systems, where ML models generate annotations across large datasets
Support intelligent data curation and selection from thousands of hours of real-world and simulated driving
Generate and process simulated data for perception and planning, ensuring sufficient sim-to-real fidelity for synthetic data to be useful in training and evaluation
Manage multiple data representations, including sensor-native formats (images, point clouds), structured scene representations (objects, semantics, occupancy), and bird’s-eye view (BEV) representations for downstream models
Support dataset generation for perception models (for example detection, segmentation, and occupancy) and planning models (behavioural learning)
Design, build, and maintain scalable data pipelines from raw logs to training datasets
Implement quality control mechanisms for both human and ML-generated labels
Support ML-assisted data curation workflows to identify high-value or failure-prone scenarios
Build pipelines to generate, transform, and validate simulated datasets, helping identify and reduce sim-to-real mismatches to improve their usefulness for training and evaluation
Work closely with ML engineers to translate model requirements into data pipelines and datasets
Debug data issues across the stack, from sensor-level artefacts to dataset inconsistencies
Strong software engineering skills, with Python as a primary language
Strong SQL skills and experience working with analytical data warehouses
Experience building production-grade data pipelines or distributed data systems
Experience with cloud platforms (GCP, AWS, or similar)
Solid understanding of data modelling, transformation, and data quality considerations
Experience working with ML data pipelines or supporting ML systems
Familiarity with computer vision, robotics, or autonomous systems
Experience working with multimodal sensor data, such as images, LiDAR, or radar
Experience with spatial or geospatial data
Familiarity with Linux-based development environments
Experience with tools such as Docker, shell scripting, workflow orchestrators, and transformation frameworks
Competitive salary, benchmarked against the market and reviewed annually
Hybrid and/or flexible remote working arrangements
Core benefits of market-leading private healthcare, life assurance, critical illness cover, and income protection, alongside a company-paid health cash plan (including gym discounts)
A salary exchange pension plan
25 days’ annual leave plus bank holidays
A pet-friendly office environment
We hire and nurture those we can learn from, valuing diversity and the innovation that this drives.
Our team of experts in computer science, AI, robotics and machine learning is world-class, and together they’re solving the most exciting and important technological challenges of our times.
If you are bold, creative and hyper-skilled, come and create the future of autonomy with us at Oxa.