[Remote] Cloud Operations Engineer

Remote, USA | Full-time | Posted 2026-06-21

Note: This is a remote position open to candidates in the USA. O'Reilly Media is dedicated to sharing the knowledge of innovators and helping professionals develop expertise. As a Cloud Operations Engineer, you will work on the systems and tooling that power the learning platform, focusing on infrastructure-as-code and Kubernetes maintenance while collaborating with product engineering teams.

Responsibilities

  • Maintain and update our Kubernetes cluster to ensure steady-state operations
  • Write or extend Terraform modules to provision and manage cloud infrastructure
  • Contribute features to the Python CLI tooling we use to manage infrastructure workflows
  • Design, build, and maintain cloud infrastructure using infrastructure-as-code (Terraform) on GCP
  • Manage and evolve our Kubernetes platform, including cluster operations, workload configuration, and service mesh (Istio)
  • Develop and improve internal tooling that abstracts cloud complexity and improves the developer experience
  • Collaborate with product engineering teams to understand service deployment needs and deliver infrastructure solutions
  • Monitor platform health using Datadog; proactively identify and resolve performance, availability, and security issues
  • Participate in on-call rotation and incident response; drive blameless post-mortems and eliminate recurring issues at their root cause
  • Define and track service-level indicators and objectives (SLIs/SLOs) for critical platform components
  • Implement and refine alerting, dashboards, and runbooks that reduce mean time to resolution
  • Embed security best practices into infrastructure workflows (DevSecOps) — not as an afterthought, but as a design principle
  • Help maintain cloud security posture, IAM hygiene, and policy guardrails across our cloud environment
  • Stay current with cloud security developments and proactively surface risks to the team
  • Execute and maintain our automated disaster recovery processes
  • Work closely with product engineering teams to understand their needs and remove infrastructure friction
  • Document systems, processes, and architectural decisions clearly so knowledge is shared, not siloed
  • Recommend improvements to tooling, architecture, and processes — and help drive them to completion
  • Keep current with the evolving cloud-native ecosystem and bring relevant knowledge back to the team

Skills

  • Bachelor's degree in Computer Science or a related field; equivalent education and/or experience may be considered in lieu of a degree
  • 5+ years of experience working in cloud infrastructure, platform engineering, or a related discipline
  • Hands-on experience with Kubernetes in production environments (cluster management, workloads, networking)
  • Proficiency with infrastructure-as-code tools, particularly Terraform
  • Experience with at least one major cloud provider (GCP, AWS, or Azure)
  • Solid scripting and automation skills in Python, Bash, or a comparable language
  • Experience with modern observability platforms (Datadog, Grafana, or similar)
  • Strong understanding of Linux systems administration
  • Working knowledge of CI/CD concepts and tools (GitHub Actions, ArgoCD, Jenkins, or similar)
  • Excellent communication skills — you write clearly, ask good questions, and explain complex systems accessibly
  • AI-Augmented Development: Ability to demonstrate the use of AI-enabled development tools (e.g., Claude Code, Cursor) to streamline coding, debugging, and infrastructure-as-code authoring
  • Experience with service mesh technologies such as Istio or Linkerd
  • Familiarity with GitOps workflows and tools (ArgoCD, Flux)
  • Experience with DevSecOps practices and tooling (Snyk, Trivy, OPA, or similar)
  • Working knowledge of SQL databases (PostgreSQL or MySQL)
  • Familiarity with FinOps practices and cloud cost optimization
  • Experience building or consuming internal developer platforms (IDPs)
  • Configuration management experience (Ansible, Chef, or similar)
  • Relevant certifications (CKA, CKAD, AWS/GCP Professional, or similar)

Company Overview

  • Inspiring the future for more than 45 years. We share the knowledge and teach the skills people need to change their world. O'Reilly Media was founded in 1978 and is headquartered in Sebastopol, California, USA, with a workforce of 201-500 employees. Its website is https://www.oreilly.com.