Physical AI Data Engineering Lead - Innovation & Data Office - Associate Director - EY GDS
- Buenos Aires
- Permanent
- Full-time
- Architect end-to-end AI-ready data pipelines that generate, curate, and govern high-quality datasets required to simulate physical AI scenarios at scale, addressing data quality, accessibility, and consistency challenges
- Lead the creation and maintenance of synthetic data generation systems to model diverse operational conditions and edge cases, ensuring robust, risk-aware training of robots, drones, and smart-edge devices
- Build and operate digital twin data models using NVIDIA Omniverse to capture real-world environment dynamics, enabling data-driven testing, optimization, and de-risking of physical AI deployments.
- Oversee the integration of simulation training datasets from NVIDIA Isaac frameworks, ensuring data fidelity and completeness for validating AI-driven robotics in 3D environments.
- Manage ingestion, transformation, and execution of compute-intensive AI workloads on NVIDIA AI Enterprise, ensuring data throughput, scalability, and security for training and inference.
- Apply Responsible Physical AI principles by embedding data governance controls for safety, ethics, compliance, and resilience across the full data lifecycle.
- Support client showcases with demos and proofs of concept, and contribute to thought leadership on Physical AI data.
- Evaluate when to use real-world data vs. synthetic data based on scenario variability, risk exposure, model generalization needs, and gaps in ground-truth observations.
- Decide appropriate data fidelity levels for digital twin simulations to optimize accuracy, simulation speed, and compute resource usage on accelerated NVIDIA infrastructure.
- Assess data completeness and reliability before approving robotic behaviors for pilot deployment, ensuring simulation-to-reality transfer readiness.
- Identify data-driven safety risks (e.g., biases, incomplete synthetic scenarios, insufficient operational coverage) and escalate adjustments aligned with Responsible Physical AI guardrails.
- Determine appropriate storage, movement, residency, and processing strategies (edge vs. cloud vs. on-prem accelerated compute) based on data sensitivity, latency, and compliance requirements.
- Expertise in synthetic data generation, including domain randomization, procedural simulation, and scenario augmentation, to support scalable physical AI training.
- Deep skill in building and managing digital twin datasets within NVIDIA Omniverse, including sensor modeling, telemetry ingestion, and environment representation.
- Strong command of robotics simulation data workflows using NVIDIA Isaac (dataset creation, sim logs, trajectory data, behavioral modeling) for pre-deployment validation.
- Proficiency in designing AI-ready data architectures that meet reliability, scalability, governance, and security standards required for enterprise Physical AI systems.
- Knowledge of safety, ethics, and compliance requirements related to Responsible Physical AI, with the ability to enforce data-driven guardrails and auditability.
- Familiarity with accelerated data processing and model training using NVIDIA AI Enterprise, including dataset distribution, GPU-optimized pipelines, and high-volume simulation data handling.
- Strong existing knowledge, plus the ability to deepen it and acquire new capabilities to keep pace with the continuously evolving technology landscape.
- Proven ability to manage high-performing teams and engage stakeholders across various cultures and time zones
- MS/PhD in Computer Science, Electronics Engineering, Robotics or related fields
- 7+ years of hands-on experience with data-driven robotics or simulation systems, including leadership of programs using Omniverse digital twins and Isaac-based robotics datasets.
- Demonstrated experience operationalizing synthetic data pipelines and AI-ready data governance frameworks in real-world automation or AI deployments.
- Track record delivering data-intensive AI workloads on NVIDIA AI Enterprise or similar accelerated computing environments.