Job Description
- Design and build the data arbitration and decision engine to resolve conflicts across multiple data sources, determining which values to publish.
- Drive the standardization and automation of our ingestion pipelines across structured, unstructured, and internal sources.
- Conduct data profiling and analysis to identify quality gaps, inconsistencies, and opportunities for process improvement.
- Implement data lineage, observability, and monitoring frameworks to ensure transparency, traceability, and reliability.
- Collaborate with Engineering and Product to define and evolve platform requirements and technical architecture.
- Apply a data product mindset—balancing engineering efficiency with data quality, client needs, and long-term maintainability.
- Support the integration of AI/LLM-based tools as part of our larger data processing and enrichment strategy.
*Please note we use years of experience as a guide, but we certainly will consider applications from all candidates who are able to demonstrate the skills necessary for the role.
- 4+ years of experience in data engineering, data architecture, or data automation roles.
- Experience working with financial data, especially within reference or entity/company data domains.
- Strong proficiency in a programming language (e.g., Python, Java, Scala) and modern data tooling (e.g., Spark, Airflow, Kafka).
- Strong SQL skills for data transformation, validation, and reconciliation.
- Demonstrated experience working with large-scale datasets, ideally in domains such as reference or entity data.
- Experience with multi-source data arbitration, data normalization, and resolving conflicts across heterogeneous datasets.
- Deep understanding of data governance, quality frameworks, and metadata management.
- Strong analytical mindset and experience with data profiling and validation techniques.
- Proven ability to work independently and cross-functionally in a fast-evolving environment.
- Excellent communication skills and the ability to explain technical decisions to stakeholders with varying levels of technical knowledge.
- Experience building decision engines using rules-based logic and/or AI/ML or LLM-based models.
We’d Love to See:
- Familiarity with frameworks like DCAM or DAMA-DMBOK.
- Experience working in AWS and/or Azure for cloud-native data processing and storage.
- Proficiency with Git and CI/CD pipelines for reliable, production-grade deployments.
- Familiarity with cloud data services (e.g., S3, EMR, Glue, ADLS, Data Factory, Databricks).
- Experience implementing data observability tools (e.g., Monte Carlo, OpenLineage, or custom solutions).
We offer one of the most comprehensive and generous benefits plans available, along with a range of total rewards that may include merit increases, incentive compensation (exempt roles only), paid holidays, paid time off, medical, dental, vision, short- and long-term disability benefits, 401(k) + match, life insurance, and various wellness programs, among others. The Company does not provide benefits directly to contingent workers/contractors and interns.