Job Description:
Technical Requirements:
Cloud / Infrastructure:
-AWS: EC2 (including Spot/Fleet), EBS, S3, EFS/FSx, VPC, IAM, Auto Scaling, ELB/ALB/NLB, Route 53.
-Networking: subnetting, routing tables, NAT, gateways, security groups, NACLs.
-AWS IAM best practices, secrets management, audit/compliance awareness.
-Experience with HPC clusters or batch systems (Open OnDemand, Slurm, AWS Batch, ParallelCluster) or strong willingness to learn.
-Monitoring and logging: CloudWatch, CloudTrail, alerting and on-call rotations.
-Cloud cost analysis and optimization.
CI/CD / Source Control:
-CI/CD: GitHub Actions, Azure DevOps; artifact/version management.
-GitHub: repository governance, PR workflows, branching strategies, Actions integrations.
-Terraform: module design, remote state and locking, workspaces, policy as code, CI/CD integration.
-Configuration management (e.g., Ansible) experience preferred.
Programming / Scripting:
-Python: strong programming skills; familiarity with OOP and popular libraries and frameworks; automation tooling, APIs, robust error handling and testing.
-Shell scripting: Bash; Linux automation utilities.
OS / Platforms:
-Linux: RHEL/CentOS/Ubuntu administration, systemd, filesystems.
-Windows: system administration and configuration; PowerShell scripting.
-Okta: SSO via OIDC/SAML, SCIM provisioning, MFA and access policies.
-Experience with Jupyter Notebook, RStudio, etc.