What challenges you can expect
· Working in a distributed DevOps team across multiple locations using agile practices
· Supporting the design and operation of core infrastructure on GCP for data science and Python-based ML/AI workloads
· Maintaining and improving Kubernetes environments with Helm and ArgoCD
· Developing and reviewing Terraform modules and cloud configurations
· Managing GitLab CI/CD pipelines across applications, infrastructure, and ML projects
· Implementing and optimizing autoscaling strategies for workloads and clusters
· Ensuring robust observability across monitoring, logging, and tracing using Grafana, Prometheus, Loki, and Tempo
· Troubleshooting issues across cloud, container, networking, and application layers
· Enhancing platform reliability, cost efficiency, and developer experience
· Educating developers and data scientists on best practices and relevant parts of the technology stack
Expected experience and skills
· Fluent English; German is a plus
· Minimum 4 years of professional experience in DevOps or related roles
· Bachelor’s degree in Computer Science, Information Technology, or a related field
· Strong expertise with at least one major cloud provider (GCP preferred; AWS/Azure acceptable)
· Extensive experience with Docker, Kubernetes, Helm charts, and container-based production systems
· Understanding of basic security concepts and the ability to help identify and mitigate common vulnerabilities
· Solid hands-on proficiency with Terraform and GitOps tools such as ArgoCD
· Strong experience building and maintaining CI/CD pipelines, ideally with GitLab CI
· Proficiency in Unix/Linux administration, shell scripting, and practical networking fundamentals
· Experience with monitoring and observability solutions (Grafana stack; ELK also relevant)
· Familiarity with autoscaling concepts (HPA/VPA, cluster autoscaler)
· Experience with relational and document databases, such as PostgreSQL and MongoDB, is a plus
· Experience with OpenTelemetry instrumentation is a plus