Job Description
Who we are
At Frontiers, our purpose is simple yet ambitious: to make science open. We believe open science empowers the global scientific community to accelerate discovery and develop the solutions needed for healthy lives on a healthy planet.
We are one of the world’s largest and most influential open-access research publishers. Every article we publish is peer-reviewed, quality-certified, and freely accessible to everyone, everywhere. To date, Frontiers research has been viewed over 4 billion times, demonstrating the real-world impact of science without barriers.
Joining Frontiers means being part of a global, mission-driven organization at the intersection of science, technology, and innovation — working alongside passionate colleagues who care deeply about advancing knowledge for the benefit of society.
To learn more about our impact and culture, please watch this video:
https://www.youtube.com/watch?v=jLJ7ZO3wOW4
About the role
- Own and evolve the company’s data infrastructure: event tracking, ingestion pipelines, and analytics tooling
- Manage data collection pipelines via Snowplow, Google Tag Manager (server-side and client-side), and Airflow-orchestrated workflows running on Cloud Composer
- Build and maintain BigQuery datasets: partitioning, clustering, cost optimization, scheduled queries, and access controls
- Write and maintain Python scripts for data ingestion, transformation, and reporting
- Manage Cloud Storage (GCS) for data staging, Snowplow enriched event storage, and pipeline artifacts
- Configure and maintain GCP IAM, service accounts, and access controls across services and external APIs
- Monitor and debug pipelines using Cloud Logging & Monitoring; set up alerting and ensure pipeline observability
- Work with relational and non-relational databases: schema design, query optimization, performance tuning
- Collaborate with frontend, QA, DevOps, and product teams
- Participate in code reviews and uphold code quality standards
- Contribute to technical improvements (testing, observability, performance, maintainability)
- Mentor mid-level engineers and contribute to architectural decisions
Must-have skills
- Python — hands-on experience writing production scripts and data pipelines
- SQL — strong practical experience: queries, indexes, performance tuning, and analytical workloads
- Google Cloud Platform — working experience across GCP services including:
  - BigQuery — partitioning, clustering, cost optimization, scheduled queries, access controls
  - Cloud Composer — managed Airflow: DAG deployment, environment configuration, monitoring via the GCP Console
  - Cloud Storage (GCS) — data staging, pipeline artifacts, enriched event storage
  - IAM & service accounts — secure access management across GCP services and external APIs
  - Cloud Logging & Monitoring — pipeline observability, alerting, and debugging
- Apache Airflow / Cloud Composer — building and maintaining DAGs and data workflows
- Snowplow — event tracking setup and data collection pipelines
- Google Tag Manager — server-side and client-side configuration and management
- Azure Databricks — data processing and analytics workloads
- Understanding of clean code principles (maintainable, testable pipelines and scripts)
- Comfortable with Git and code review workflows
Nice-to-have skills
- C# and .NET Core — backend/API development experience is a plus
- REST APIs — design, implementation, authentication/authorization basics
- SQL Server / T-SQL — experience with Microsoft SQL Server environments
- MongoDB
- Elasticsearch
- RabbitMQ or similar queuing system
- Docker / containers
- GraphQL (building or consuming APIs)
- Background job processing (e.g. Hangfire)
- Observability tools (e.g. New Relic or similar)
- PowerShell scripting
- Agile tooling (e.g. Jira)
- AI-assisted development tools (Cursor, etc.)
