Junior Data Engineer
Job Description
We are looking for a Junior Data Engineer to join us and contribute to the development of modern, real‑time data processing capabilities. You will help transition existing data and ML workflows from batch processing to scalable streaming solutions. The role involves hands‑on engineering, close collaboration with Data Scientists, and operational responsibility for production data pipelines.
Technology Environment
- Modern real‑time data streaming technologies used for ML model inference
- Distributed data processing frameworks supporting scalable, low‑latency pipelines
- Containerized workloads orchestrated in cloud‑native environments
- Monitoring and observability tools for ensuring reliability and performance of data pipelines
- Python‑based ecosystem supporting ML model integration and lifecycle management
Key Responsibilities
- Transform batch inference workflows into streaming pipelines.
- Define streaming semantics to replace batch windows, including micro‑batching, windowing, and state management.
- Design Kafka topic structures, partitioning strategies, and consumer group patterns for prediction workloads.
- Implement checkpointing, backpressure handling, and delivery‑guarantee strategies (at‑least‑once / exactly‑once).
- Package and version ML model artifacts for streaming jobs, supporting safe rollouts and rollbacks.
- Tune performance for throughput and latency, including batching strategies and resource allocation.
- Deploy and operate streaming jobs with monitoring and alerting (lag, throughput, error rates).
- Integrate streaming outputs into downstream ETL/BI systems.
- Collaborate with Data Scientists on CI/CD for streaming models and monitor model performance/drift.
Team & Collaboration
- You will work in a distributed delivery model closely aligned with the central AI/BI team in Germany.
- Daily collaboration through MS Teams, Jira, and Confluence.
- Agile methodologies (Scrum/Kanban) in cross‑functional squads.
Requirements
- Practical experience with Kafka (producers/consumers, topic design, partitions, retention).
- Experience with Spark Structured Streaming or similar streaming frameworks.
- Familiarity with migrating batch inference to streaming architectures.
- Experience running containerized workloads in Kubernetes.
- Strong Python skills and understanding of common ML libraries.
- English and Polish at B1 level or higher.
Nice to have:
- Basic monitoring/logging experience (ELK, metrics) and performance tuning.
- Experience with Kafka Streams.
- Familiarity with feature stores or retraining orchestration.
This position offers a hybrid work model. Office locations: Warszawa, Poznań, Lublin.
The position includes participation in on‑call duty.
We hereby inform you that Inetum Polska sp. z o.o. has implemented an internal reporting (whistleblowing) procedure. The content of the procedure and the possibility to submit an internal report are available at: