At idealo, Generative AI (GenAI) is becoming a multiplier across every team. The AI Booster Team is our internal technical competence center: we pair with product teams, build reusable GenAI building blocks and share best practices company-wide.
As a Data Engineer in our AI Booster Team, you’ll be at the center of this transformation: designing data pipelines, integrations, and automations that power GenAI use cases at scale. You’ll combine classic data engineering skills with modern AI infrastructure, ensuring that product teams have instant, reliable access to “LLM-ready” data and the automation tools they need to move fast.
This position is available full-time or part-time.
Build robust pipelines: Ingest, transform, and unify data from APIs, databases, files, and streams into analytics- and LLM-ready formats.
Engineer integrations: Develop connectors and orchestrations (Airflow, n8n, Step Functions) that move data securely and efficiently between warehouses, APIs, CMSs, and GenAI services.
Operate modern data stores: Manage vector databases and feature stores to enable fast, reliable retrieval for RAG and ML use cases.
Ensure reliability & cost-efficiency: Implement data quality checks, lineage tracking, monitoring, and FinOps practices.
Enable self-service: Provide reusable workflows, templates, and automation components that empower product teams to build on top of your data infrastructure.
Coach & collaborate: Share best practices, write playbooks, and guide teams in building scalable data-driven and GenAI-enabled workflows.
3+ years in data engineering or MLOps, delivering production-grade data integrations.
Strong experience unifying heterogeneous data sources (SQL/NoSQL, APIs, streams).
Advanced Python & SQL skills; comfortable with Spark/Glue, Kafka/Kinesis, and schema evolution.
Hands-on experience with AWS services (S3, Glue, Redshift, Lambda, SageMaker, Bedrock) and infrastructure as code (CDK/Terraform).
Familiarity with vector stores and embedding pipelines for RAG.
Strong focus on observability, reliability, and cost control.
Excellent communication skills for enabling and coaching non-data specialists.
We’re keen to see evidence of exceptional achievement: perhaps you’ve scaled a personal project to thousands of users, published influential research, ranked highly in competitive arenas (e.g. sports, Kaggle, hackathons), or maintained widely used open-source libraries. Tell us what makes you stand out!
Don’t tick every single box? No worries! We hire people, not checklists, and we value motivation to grow.