IT Infrastructure
Overview
We are seeking a highly skilled IT Infrastructure & DevOps Lead to oversee and evolve our global infrastructure footprint, spanning bare metal servers and a unified cloud environment.
This is a hands-on leadership role responsible for guiding a team of 3–5 DevOps engineers and system administrators, driving automation, reliability, and consolidation toward a streamlined, single-cloud architecture.
Key Responsibilities
● Lead and mentor a team of 3–5 DevOps engineers and system administrators, fostering a culture of ownership, performance, and continuous improvement.
● Oversee the design, deployment, and operation of infrastructure across bare metal and a single chosen cloud provider (to be consolidated from multi-cloud).
● Plan and execute the cloud consolidation strategy, ensuring efficiency, cost optimization, and operational simplicity.
● Maintain and improve Kubernetes, Docker, and Helm-based environments, ensuring scalability, availability, and security.
● Manage CI/CD pipelines, infrastructure as code, and configuration management (Terraform, Ansible, GitOps).
● Own monitoring, observability, and incident response processes for all production and staging systems.
● Define and enforce security, access control, and compliance standards across environments.
● Drive capacity planning, backup, and disaster recovery (DR) readiness.
● Evaluate and integrate new DevOps tools or services that enhance performance, reliability, or developer productivity.
● Collaborate with engineering and security teams to ensure infrastructure supports rapid and compliant product delivery.
Required Skills & Experience
● 8–10+ years in IT Infrastructure / DevOps roles, including 3+ years of leadership or team lead experience.
● Proven hands-on expertise with:
○ Bare metal server management (hardware provisioning, networking, virtualization, PXE).
○ Linux systems administration (Debian/Ubuntu/RHEL).
○ Container orchestration: Docker, Kubernetes (cluster lifecycle, scaling, upgrades).
○ Helm, Terraform, Ansible, and modern GitOps workflows.
○ Cloud infrastructure management on at least one major provider (AWS, GCP, or Azure).
● Strong grasp of networking fundamentals (VPCs, VPNs, load balancing, DNS, routing).
● Deep understanding of CI/CD, monitoring, and observability practices.
● Knowledge of IT policies; key experience working with DORA / EBA and meeting audit requirements.
● Practical experience with security, identity, and compliance controls in cloud or hybrid setups.
● Demonstrated ability to design and run high-availability and disaster recovery systems.
● Excellent problem-solving, optimization, and debugging skills.
Preferred / Nice to Have
● Experience in cloud cost optimization and migration/consolidation projects.
● Knowledge of service meshes (Istio, Linkerd) and observability stacks (Prometheus, Grafana, ELK/EFK).
● Scripting/programming experience in Python, Go, or Bash.
● Familiarity with SRE principles, SLIs/SLOs, and reliability engineering practices.
Soft Skills
● Strong leadership and mentoring skills with the ability to balance strategic vision and hands-on delivery.
● High ownership, initiative, and focus on operational excellence.
● Effective communicator across technical and business teams.
● Able to drive standardization, simplification, and continuous improvement across environments.