Junior Data Engineer / 1

Warsaw, Masovian Voivodeship, Poland · Posted 2 months ago

Full-time · Hybrid · Entry Level

Job Description

We are looking for a Junior Data Engineer to join us and contribute to the development of modern, real‑time data processing capabilities. You will help transition existing data and ML workflows from batch processing to scalable streaming solutions. The role involves hands‑on engineering, close collaboration with Data Scientists, and operational responsibility for production data pipelines.

Technology Environment

Modern real‑time data streaming technologies used for ML model inference
Distributed data processing frameworks supporting scalable, low‑latency pipelines
Containerized workloads orchestrated in cloud‑native environments
Monitoring and observability tools for ensuring reliability and performance of data pipelines
Python‑based ecosystem supporting ML model integration and lifecycle management

Key Responsibilities

Transform batch inference workflows into streaming pipelines.
Define streaming semantics to replace batch windows, including micro‑batching, windowing, and state management.
Design Kafka topic structures, partitioning strategies, and consumer group patterns for prediction workloads.
Implement checkpointing, backpressure handling, and delivery‑guarantee strategies (at‑least‑once / exactly‑once).
Package and version ML model artifacts for streaming jobs, supporting safe rollouts and rollbacks.
Tune performance for throughput and latency, including batching strategies and resource allocation.
Deploy and operate streaming jobs with monitoring and alerting (lag, throughput, error rates).
Integrate streaming outputs into downstream ETL/BI systems.
Collaborate with Data Scientists on CI/CD for streaming models and monitor model performance/drift.

Team & Collaboration

You will work in a distributed delivery model closely aligned with the central AI/BI team in Germany.
Daily collaboration through MS Teams, Jira, and Confluence.
Agile methodologies (Scrum/Kanban) in cross‑functional squads.

Requirements

Practical experience with Kafka (producers/consumers, topic design, partitions, retention).
Experience with Spark Structured Streaming or similar streaming frameworks.
Familiarity with migrating batch inference to streaming architectures.
Experience running containerized workloads in Kubernetes.
Strong Python skills and understanding of common ML libraries.
English and Polish at B1 level or higher.

Nice to have:

Basic monitoring/logging experience (ELK, metrics) and performance tuning.
Experience with Kafka Streams.
Familiarity with feature stores or retraining orchestration.

This position offers a hybrid work model. Office locations: Warsaw, Poznań, Lublin.

The position includes participation in on‑call duty.

We hereby inform you that Inetum Polska sp. z o.o. has implemented an internal reporting (whistleblowing) procedure. The content of the procedure and the possibility to submit an internal report are available at:

https://inetum.whispli.com/speakup?locale=pl
