Company: Computer Futures
Location: Remote (Germany)
Closing Date: 29/06/2026
Hours: Full Time
Type: Permanent
Job Description
Site Reliability Engineer (m/f/d) - Full Remote (Germany)
Build & Operate a Next‑Gen Kubernetes Platform
We're looking for a Senior Site Reliability Engineer to join one of the most forward‑thinking engineering teams in Europe. The role is 100% remote within Germany and centers on building secure, high‑performance, Kubernetes‑native systems used in demanding real‑world environments.
What You'll Do
- Deploy, operate, and scale Kubernetes clusters across cloud, on‑prem, and air‑gapped environments
- Automate prototype environments for rapid innovation cycles
- Build GitOps workflows using Argo CD, Helm, and Kustomize
- Implement observability with Prometheus, Grafana, and OpenTelemetry
- Validate platform constraints early and prepare clean handoff packages (runbooks, charts, specs)
- Work closely with platform teams to turn prototypes into production‑ready systems
What You Bring
- 4+ years in SRE, Platform Engineering, or DevOps
- Deep hands‑on experience with Kubernetes (cloud + on‑prem)
- GitOps mindset and experience with Argo CD or Flux
- Strong IaC skills (Terraform, Ansible)
- Solid troubleshooting, incident response, and observability knowledge
- Ability to work independently in a highly technical environment
- Based in Germany (full remote)
Why This Role Is Special
- Work with a cutting‑edge tech stack
- High‑impact engineering in a security‑focused environment
- Small, senior engineering team with real ownership
- Remote‑first culture with occasional meetups
- Fast iteration cycles – no corporate slow‑down
Desired Skills and Experience
- Kubernetes
- On‑prem Kubernetes
- Hybrid Deployments
- GitOps
- Argo CD
- Flux
- Helm
- Kustomize
- Pulumi
- Terraform
- Infrastructure as Code (IaC)
- Ephemeral Environments
- Rapid Provisioning
- Prototype Environment Automation
- Environment Teardown
- Deployment Automation
- Observability
- Monitoring
- OpenTelemetry
- Prometheus
- Grafana
- Lightweight SLOs
- Error Budgets
- Incident Response
- Troubleshooting
- Demo Deployments
- Disaster Recovery
- Artifact Registries
- Restricted / Air‑Gapped Networks
- Platform Standards Alignment
- Runbooks
- Deployment Specs
- Handoff Packages
- Container Security