Job Description
As a DevOps/Senior DevOps Engineer, you will play a critical role in ensuring the reliability, scalability, and performance of our company's digital infrastructure and services. You will work closely with cross-functional teams, including software engineers and system administrators, to design, build, and maintain highly available systems that can handle large-scale traffic and deliver exceptional user experiences.
Your primary focus will be on automating operational tasks, optimizing system performance, and monitoring system health.
Responsibilities:
• Automation: Develop and maintain automation tools and frameworks to streamline operational processes and reduce manual intervention. Automate repetitive tasks and build self-healing systems.
• System Reliability: Monitor and maintain the reliability and availability of the company's digital infrastructure, including servers, networks, databases, and applications.
• Incident Management: Respond to and resolve incidents in a timely manner, ensuring minimal downtime and impact on users. Conduct post-incident analysis and implement preventive measures so that the same failures do not recur.
• Performance Optimization: Identify system bottlenecks and performance issues, and work with development teams to optimize system performance and scalability.
• Continuous Monitoring: Implement monitoring solutions to track system health, performance, and availability. Proactively identify and resolve issues before they impact users.
• Capacity Planning: Collaborate with the infrastructure team to perform capacity planning and ensure that systems have sufficient resources to handle expected growth and traffic spikes.
• Deployment and Release Management: Develop and improve deployment and release processes to ensure smooth and error-free deployments. Implement canary releases and A/B testing strategies.
• Collaboration: Work closely with software engineering and DevOps teams to promote a culture of collaboration and shared responsibility for system reliability and performance.
• Documentation: Create and maintain comprehensive documentation for system configurations, procedures, and troubleshooting guides.
Job Requirements
• Bachelor's degree in Computer Science, Information Technology, or a related field (or equivalent work experience).
• A strong DevOps mindset and a good understanding of DevOps culture.
• Strong knowledge of Linux/Unix systems and networking concepts.
• Proficiency in at least one programming language (e.g., Python, Go, Java) and experience with scripting languages (e.g., Bash, PowerShell).
• Experience with cloud platforms (e.g., AWS, Azure, Google Cloud) and infrastructure-as-code tools (e.g., Terraform, CloudFormation).
• Familiarity with containerization and orchestration technologies (e.g., Docker, Kubernetes, Rancher).
• Familiarity with GitOps concepts and tools such as ArgoCD.
• Hands-on experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack).
• Understanding of agile development methodologies and DevOps principles.
• Strong problem-solving skills and the ability to analyze complex systems.
• Excellent communication and collaboration skills.