Proofpoint

Senior Data Engineer

Cordoba, Argentina
Posted Yesterday
Full-time, remote

Job Description

About Us:

 

Proofpoint is a global leader in human- and agent-centric cybersecurity. We protect how people, data, and AI agents connect across email, cloud, and collaboration tools. Over 80 of the Fortune 100, 10,000 large enterprises, and millions of smaller organizations trust Proofpoint to stop threats, prevent data loss, and build resilience across their people and AI workflows. Our mission is simple: safeguard the digital world and empower people to work securely and confidently. Join us in our pursuit to defend data and protect people.

How We Work:

At Proofpoint, you’ll be part of a global team that breaks barriers to redefine cybersecurity, guided by our BRAVE core values:

Bold in how we dream and innovate

Responsive to feedback, challenges and opportunities

Accountable for results and best-in-class outcomes

Visionary in future-focused problem-solving

Exceptional in execution and impact

The Role 

 

We're seeking a Senior Data Engineer to build and maintain the ML/AI data infrastructure powering our email security platform. In this role, you'll design and optimize scalable data pipelines that enable threat detection and investigation while supporting both machine learning models and LLM-powered agents that provide context-aware security insights. 

  

You'll work on our Detection Intelligence Platform (DIP), building feature engineering frameworks and offline/online feature stores that serve as the foundation for ML model research and context engineering for AI agents. You'll collaborate with data scientists, ML engineers, and security researchers to build data models and context stores that power our detection systems and enable human security analysts to investigate threats effectively.

  

Key Responsibilities: 

 

  • Develop and maintain scalable data pipelines on AWS/Azure, using technologies such as Spark, Airflow, Athena, and Kubernetes, to process structured and unstructured email data at scale

  • Design and optimize Iceberg-based data lake tables and schemas for efficient storage, querying and versioning across petabyte-scale datasets distributed across data centers globally 

  • Build and manage feature engineering frameworks that support offline batch processing and online real-time feature serving for ML model training and inference 

  • Develop and maintain training data pipelines optimized for distributed ML model training, ensuring data lineage and reproducibility 

  • Collaborate with data scientists and security researchers to understand data requirements and translate them into robust, production-grade data solutions 

  • Monitor and optimize data pipeline performance, implementing observability and alerting to ensure data freshness and quality 

  • Mentor junior engineers and foster a culture of engineering excellence and knowledge sharing 

 

 

Required Experience: 

  

  • Several years of industry experience building and maintaining distributed data systems and high-scale data pipelines in a managed cloud environment (AWS / Azure / GCP) using big data processing engines such as Spark, Flink, Dask, Ray, Beam, Databricks Workflows, or similar

  • Deep proficiency in Python for developing production-grade data processing code 

  • Strong experience with Infrastructure-as-Code frameworks, particularly Terraform 

  • Solid understanding of and hands-on experience with open table formats for data lakes (Apache Iceberg, Hudi, Delta Lake) and data modeling best practices

  • Experience with AWS Athena, Glue, or similar data query and cataloging services 

  • Experience with Apache Airflow or similar workflow orchestration tools for batch and real-time pipeline management 

  • Demonstrated ability to design and implement scalable ETL/ELT pipelines handling complex data transformations 

  • Excellent communication skills and ability to collaborate effectively with technical and non-technical stakeholders 

 

Good to have: 

 

  • Experience with feature engineering frameworks and feature stores (e.g., Feast, Tecton, or custom solutions) 

  • Familiarity with Kubernetes for containerized data workloads and orchestration 

  • Background in building data infrastructure for machine learning and AI applications 

  • Experience with data quality frameworks and observability tools for data pipelines 

Why Proofpoint?

At Proofpoint, we believe that an exceptional career experience includes a comprehensive compensation and benefits package. Here are just a few reasons you’ll love working with us:

  • Competitive compensation

  • Comprehensive benefits

  • Career success on your terms

  • Flexible work environment

  • Annual wellness and community outreach days

  • Always-on recognition for your contributions

  • Global collaboration and networking opportunities

 

Our Culture:

Our culture is rooted in values that inspire belonging, empower purpose, and drive success, every day, for everyone.

We encourage applications from individuals of all backgrounds, experiences, and perspectives. If you need accommodation during the application or interview process, please reach out to [email protected].

 

How to Apply

Interested? Submit your application along with any supporting information. We can’t wait to hear from you!
