[Remote] Senior Site Reliability Engineer, Infrastructure
Note: This is a remote position open to candidates in the USA. Vultr is on a mission to make high-performance cloud infrastructure easy to use, affordable, and locally accessible for enterprises and AI innovators around the world. The company is seeking a highly skilled and experienced Senior Site Reliability Engineer to build and own the observability pipeline for its global datacenter infrastructure.
Responsibilities
- Design and build the observability pipeline for datacenter infrastructure including CDUs, PDUs, bare metal servers, and provisioning workflows, collecting telemetry via Redfish, IPMI, SNMP, and OpenTelemetry
- Own the full stack from data collection through to visualization and alerting in Grafana, Loki, and Mimir
- Build dashboards and alerting that are actionable and meaningful for stakeholder teams including Datacenter Ops, SysAdmin, Network, and Provisioning
- Establish standards and patterns for how datacenter infrastructure telemetry is collected, stored, and visualized across Vultr's global footprint
- Partner closely with stakeholder teams to understand their operational needs and translate them into observable, measurable signals
- Drive infrastructure-as-code practices across the observability pipeline to ensure consistency, repeatability, and maintainability
Skills
- 5+ years of experience in site reliability, platform, or infrastructure engineering in a production environment
- Hands-on experience building and operating observability pipelines including metrics, logs, and alerting using Grafana, Loki, Mimir, or equivalent tooling
- Working knowledge of datacenter hardware telemetry protocols including Redfish, IPMI, and/or SNMP
- Strong Linux fundamentals and operational experience in production infrastructure environments
- Demonstrated experience with infrastructure-as-code and configuration management tooling (Terraform, Ansible, Chef, or similar)
- Strong cross-functional communication skills and experience delivering tooling for operational stakeholder teams
Benefits
- 100% company-paid insurance premiums for employee medical, dental, and vision plans
- 401(k) plan that matches 100% up to 4%, with immediate vesting
- Professional Development Reimbursement of $2,500 each year
- 11 Holidays + Paid Time Off Accrual + Rollover Plan
- Increased PTO at 3-year and 10-year anniversaries + 1 month paid sabbatical every 5 years + Anniversary Bonus each year
- $500 stipend for remote office setup in first year + $400 each following year
- Internet reimbursement up to $75 per month
- Gym membership reimbursement up to $50 per month
- Company-paid Wellable subscription