Job Description
Why This Role Matters
Join us in building the next generation of data intelligence capabilities for the automotive industry. This role goes beyond traditional data engineering and offers the opportunity to shape the enterprise data foundation that powers analytical products, operational insights, and emerging AI-driven experiences across the platform.
You will play a key role in designing and evolving a multi-tenant, cloud-scale data platform that serves thousands of dealerships and business users while maintaining strong tenant isolation, governance, and data quality standards.
As part of our modernization journey, you will help drive the shift from batch-oriented data processing to near-real-time and real-time data ingestion architectures, enabling faster decision-making and unlocking new AI and machine learning use cases across the business.
If you are passionate about building large-scale data platforms, solving complex multi-tenant data challenges, and creating the data foundation that powers the future of analytics and AI, this role offers the opportunity to make a lasting impact.
Technical Expertise
Strong experience in designing and building scalable data platforms, data warehouses, and lakehouse architectures.
Deep expertise in data modeling, including dimensional modeling, Data Vault, and enterprise data architecture principles.
Advanced SQL skills with experience in query optimization, performance tuning, and large-scale data processing.
Hands-on experience with distributed data processing frameworks such as Apache Spark.
Strong understanding of modern lakehouse technologies such as Delta Lake, Apache Iceberg, or Apache Hudi.
Experience designing and implementing batch, streaming, and CDC-based data ingestion pipelines.
Proficiency in Python and/or Scala for data engineering applications.
Experience with workflow orchestration platforms such as Apache Airflow or similar technologies.
Desired Skills & Experience
7+ years of experience in data engineering.
Strong expertise in Python, SQL, and Apache Spark.
Experience building scalable batch and real-time ETL/ELT pipelines.
Hands-on experience with AWS services including EMR, S3, Glue, and Athena.
Experience with Kafka, Flink, or Kinesis for streaming data processing.
Strong knowledge of dimensional modeling, Data Vault, and data warehousing concepts.
Experience with Delta Lake, Apache Iceberg, or Apache Hudi.
Expertise in workflow orchestration using Airflow.
Experience implementing data quality frameworks and monitoring solutions.
Strong understanding of partitioning, schema evolution, and performance optimization.
Familiarity with CI/CD, Git, and Infrastructure as Code tools is a plus.