This position has been removed from Arbeitnow and may no longer be open.

ilert GmbH | Site Reliability Engineer (f/m...
  • Cologne

  • Location: Hybrid – Cologne (Rheinauhafen) — 3 days in the office, 2 remote (Tue + Thu)
    Team: Engineering · Reports to CTO

    Keep the world awake — build reliability at scale

    ilert helps thousands of DevOps & IT teams detect, fix, and communicate incidents faster.

    Our platform is mission-critical: customers rely on us 24/7 to keep their always-on businesses running.

    As a Site Reliability Engineer at ilert, you’ll own the reliability, performance, and scalability of our core platform across AWS, Kubernetes, Kafka, and more.

    Tasks

    Build & operate a highly available platform

    • Run and evolve our AWS-based infrastructure
    • Operate and optimize self-managed Kafka and ClickHouse clusters and our observability stack
    • Ensure resilience, disaster recovery, and capacity planning across the stack

    Improve reliability & performance

    • Build and maintain SLOs, SLIs, error budgets, and observability dashboards
    • Debug production issues across layers (networking, Kubernetes, application, DB)
    • Improve performance of our ingestion pipeline

    Automation & tooling

    • Automate operations with Terraform, Helm, Kubernetes operators, and internal tooling
    • Build tooling for safer deploys, blue/green rollouts, and automated verification
    • Strengthen incident response workflows through deep collaboration with our AI SRE agent team

    Security & compliance

    • Implement best practices for workload isolation, secrets management, IAM, and auditability
    • Support our ISO 27001 posture by automating controls and hardening our infrastructure

    Cross-functional impact

    • Partner with Backend, AI, and Product teams to design reliable services
    • Participate in on-call rotation
    • Lead post-incident reviews and drive long-term reliability improvements

    Requirements

    • 3+ years of experience as an SRE, Platform Engineer, DevOps Engineer, or Infrastructure Engineer
    • Strong hands-on experience with AWS, Kubernetes, Linux internals, networking, performance tuning
    • Experience operating self-managed distributed systems, ideally Kafka or ClickHouse
    • Strong understanding of observability
    • Experience automating infrastructure with Terraform and CI/CD systems
    • Fluent English (our working language); German optional

    Benefits

    • 🚀 Product-centric - 100% focused on solving a mission-critical pain felt by every always-on business
    • 🏡 Hybrid freedom - 2 days remote by default; gorgeous Rheinauhafen roof terrace when you’re in town
    • 🕒 Focus > meetings - We time-box syncs, favour async docs and protect maker time
    • 🌴 28 days off - …plus public holidays
    • 🚲 Commute perks - subsidised public transport
