Job Description
Key Responsibilities
▪ Infrastructure & Environment Management
▪ Maintain and ensure the availability of development, staging, and production environments
▪ Manage access, runtime stability, and environment upgrades
▪ CI/CD Pipeline Management
▪ Design, build, and improve CI/CD pipelines using tools like Jenkins
▪ Automate build, testing, and deployment processes
▪ Troubleshoot and resolve pipeline issues
▪ Infrastructure as Code (IaC)
▪ Develop and maintain automation scripts using Ansible
▪ Standardize infrastructure configurations and automate environment provisioning
▪ Monitoring & Observability
▪ Configure and maintain monitoring, logging, and alerting systems (Grafana, Prometheus, Splunk)
▪ Enhance observability with proactive alerting and dashboards
▪ Incident & SLA Management
▪ Respond to alerts and incidents within agreed SLAs
▪ Triage and resolve infrastructure and pipeline issues
▪ Document incidents and implement preventative measures
▪ Security & Compliance
▪ Apply system hardening, vulnerability remediation, and patching
▪ Support audits and compliance checks
▪ Performance & Capacity
▪ Monitor system performance and resource usage
▪ Conduct tuning (e.g., JVM, database, message broker) and provide optimization recommendations
Job Requirements
• Fluent English (interviews are conducted in English)
• Experience in DevOps, Site Reliability Engineering (SRE), or Infrastructure Engineering roles: 2–4 years for Middle level; 5 years for Senior level
• Hands-on experience with CI/CD tools such as GitLab CI or Jenkins
• Proficiency in scripting languages, particularly Bash or Python
• Basic understanding of Java application troubleshooting
• Solid foundational knowledge of databases (Oracle, MSSQL, etc.)
• Experience with Infrastructure as Code (IaC) tools (Ansible, Chef, etc.)
• Proficient in version control systems (Git)
• Comfortable working with Linux-based operating systems, such as Ubuntu or Red Hat
• Experience with reverse proxy configurations
• Familiarity with observability and monitoring tools (Prometheus, Grafana, Splunk, etc.)
• Knowledge of messaging and application servers such as Kafka, RabbitMQ, and Tomcat (a plus)
• Experience with WebLogic Server installation, configuration, and maintenance (nice to have)