The role sits in the Software Engineering & User Experience (SWUX) organization and reports to the Team Lead, Site Reliability Engineering. As a seasoned practitioner with deep experience in managing software environments, you will work in a global team to ensure system reliability and performance. As an experienced technologist, you will optimize system performance and drive continuous improvement.
KEY RESPONSIBILITIES:
1. Gather and analyze metrics from operating systems as well as applications to assist in performance tuning and fault finding
2. Partner with development and operations teams to improve services through rigorous testing and release procedures; perform root cause analyses and implement solutions
3. Partner with architecture teams
4. Improve existing systems through automation and uplifts
5. Participate in system design consulting and platform management
6. Balance feature development speed and reliability with well-defined service-level objectives.
COMPETENCIES, SKILLS & EXPERIENCE:
1. Bachelor's or Master's degree in Computer Science or Software Engineering, or equivalent relevant experience
2. At least 3 years' experience in a Site Reliability Engineering / Platform Engineering / DevOps role or similar
3. Excellent troubleshooting skills and proven experience resolving production downtime with immediate and long-term solutions
4. A deep understanding of algorithms, data structures, complexity analysis and software design
5. Good analytical skills coupled with excellent communication skills; professional English is required, German is a bonus
6. Google Associate Cloud Engineer certification at a minimum; higher certifications are a bonus.
Technical Skills:
1. Experience with Kubernetes and Google Cloud Platform (GCP), or AliCloud if located in China, as both an administrator and a user
2. Previous software development experience in Golang, C++, or another modern programming language; Flutter experience is a bonus
3. Extensive knowledge of relational databases, file systems and Linux
4. Familiarity with monitoring tools (e.g. Datadog) and project tracking software (e.g. Jira)
5. Proficiency in building / maintaining CI and CD pipelines
6. Experience working with container orchestration platforms such as Kubernetes
7. Good understanding of systems automation and IT security.