Site Reliability Engineer
Join the dynamic journey at Vynca, where we're passionate about transforming care for individuals with complex needs. We’re more than just a team; we're a close-knit community. Our shared commitment to caring for each other and those we serve is what sets us apart. Guided by our unwavering core values: Excellence, Compassion, Curiosity, and Integrity, we forge paths of success together. Join us in this transformative movement where you can contribute to making a profound difference every day. At Vynca, our mission is to provide comprehensive care for more quality days at home.
About the job
We're looking for a Site Reliability Engineer (E3) to help build and operate the infrastructure that powers Vynca's healthcare technology platform. In this role, you'll work at the intersection of software engineering, cloud infrastructure, and operations to ensure our systems are reliable, scalable, secure, and performant.
As a member of the Technology team, you'll design and manage cloud infrastructure in AWS, operate Kubernetes-based workloads, improve observability across our platform, and automate operational processes that enable engineering teams to move quickly and safely. You'll play a critical role in maintaining the health of our production environment while helping shape the future architecture of our systems.
This is a hands-on engineering role with significant ownership and impact. You'll partner closely with Software Engineers, Product teams, and Data teams to build resilient systems that support our mission of delivering comprehensive care for more quality days at home.
This position is remote and requires working East Coast business hours (EST).
What You’ll Do
Design, provision, and manage AWS infrastructure using Terraform as the source of truth.
Operate, maintain, and scale production workloads running on Kubernetes.
Package, deploy, and manage applications using Helm and infrastructure automation tools.
Build, operate, and improve distributed and event-driven systems, including event sourcing, partitioning, event ordering, replay, and failure recovery mechanisms.
Define, monitor, and maintain Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets to balance reliability and engineering velocity.
Develop automation for deployment, scaling, monitoring, incident response, and operational workflows to reduce manual effort and improve system resilience.
Own platform observability by implementing and maintaining metrics, logging, tracing, monitoring, and alerting solutions.
Lead incident response efforts, facilitate blameless postmortems, and drive long-term corrective actions that improve system reliability.
Partner with Product and Engineering teams on capacity planning, performance optimization, and resilient system design.
Implement and maintain security best practices to support HIPAA, SOC 2, and other compliance requirements.
Participate in an on-call rotation and provide operational support for production systems.
Your experience and qualifications
Experience: Three to five (3–5) years of experience in Site Reliability Engineering, DevOps Engineering, Platform Engineering, Cloud Infrastructure Engineering, or similar infrastructure-focused roles, preferably within healthcare, SaaS, or high-growth technology environments.
Education: Bachelor's degree in Computer Science, Information Systems, Software Engineering, or a related technical field; equivalent professional experience will also be considered.
Strong hands-on experience operating production workloads within AWS environments.
Proven experience managing infrastructure as code using Terraform, including module development, state management, and deployment automation.
Experience operating and supporting production Kubernetes environments.
Hands-on experience deploying and managing applications using Helm.
Experience working with distributed systems, event-driven architectures, or event-sourcing platforms, including concepts such as partitioning, event ordering, replay, and fault tolerance.
Experience establishing and managing observability practices including monitoring, logging, tracing, alerting, and incident response.
Strong understanding of Linux systems administration, networking, cloud architecture, and distributed systems fundamentals.
Experience designing, implementing, and maintaining CI/CD pipelines and deployment automation.
Strong problem-solving skills with the ability to troubleshoot complex infrastructure and application issues.
Excellent written and verbal communication skills with the ability to collaborate effectively across technical and non-technical teams.
High level of ownership, accountability, and initiative with a proactive approach to reliability and operational excellence.
Ability and willingness to participate in an on-call rotation supporting production systems.
Preferred Qualifications
Strong programming or scripting experience with Python, Go, or similar languages.
Experience with observability platforms such as Prometheus, Grafana, Datadog, CloudWatch, SigNoz, or OpenTelemetry.
Experience with GitOps tools such as ArgoCD or Flux.
Experience managing databases such as PostgreSQL, MySQL, Redshift, or ClickHouse.
Experience implementing secrets management solutions such as AWS Secrets Manager or HashiCorp Vault.
Experience supporting healthcare technology platforms or other highly regulated environments.
Familiarity with data infrastructure technologies including Snowflake, Redshift, and ETL/ELT pipelines.
Experience with database performance tuning and optimization.
At this time we are only considering applicants in the following states: Arizona, California, Colorado, Florida, Georgia, Illinois, Nevada, North Carolina, Oregon, Texas, and Washington.
Additional Information
The hiring process for this role may consist of applying, followed by a phone screen, online assessment(s), interview(s), an offer, and background/reference checks.
Background Screening: A background check, which may include a drug test or other health screenings depending on the role, will be required prior to employment.
Job Description Scope: This job description is not exhaustive and may include additional activities, duties, and responsibilities not listed herein.
Vaccination Requirement: Employees in patient, client, or customer-facing roles must be vaccinated against influenza. Requests for religious or medical accommodations will be considered but may not always be approved.
Employment Eligibility: Compliance with federal law requires identity and work eligibility verification using E-Verify upon hire.
Equal Opportunity Employer: At Vynca Inc., we embrace diversity and are committed to fostering an inclusive workplace. We value all applicants regardless of race, color, religion, age, national origin, ancestry, ethnicity, gender, gender identity, gender expression, sexual orientation, marital status, veteran status, disability, genetic information, citizenship status, or membership in any other protected group under federal, state, or local law.