Senior Software Engineer, AI Inference

Remote, USA Full-time Posted 2026-06-22

Company Overview

Deepgram is a foundational AI company on a mission to transform human-machine interaction using natural language. We give any developer access to the fastest, most powerful voice AI platform including access to models for speech-to-text, text-to-speech, and spoken language understanding with just an API call. From transcription to sentiment analysis to voice synthesis, Deepgram is the preferred partner for builders of voice AI applications.

Opportunity

We are seeking a backend engineer focused on AI inference to join the team powering Deepgram’s core speech inference APIs. You’ll implement and optimize inference code, experiment with cutting-edge technologies, and develop, maintain, and deploy the stack of services behind our blazing-fast, massive-throughput inference system. This role blends work on backend services and systems with domain specialty in neural networks and GPU programming. Our team owns the applications that serve api.deepgram.com and empowers... builders of innovative speech products by focusing on a world-class combination of reliability, efficiency, and latency.

What You’ll Do

• Implement inference for novel model architectures developed by Deepgram’s trailblazing research team
• Develop, test, and deploy application code for massive-scale production services
• Debug complex system issues that include networking, scheduling, and high-performance computing interactions
• Build tooling for internal analysis and benchmarking to identify opportunities for efficiency improvements
• Experiment with optimization techniques for ML workloads on NVIDIA GPUs and ship the key wins to prod

You’ll Love This Role If You

• Think of yourself as a generalist who also enjoys learning deeply in specific areas, going from debugging a customer issue one day to designing an algorithm the next
• Like sipping piña coladas and getting caught in the rain
• Enjoy taking ownership of features from early collaborations with researchers through testing in production
• Love getting nitty-gritty with profilers, hardware architectures, and inference algorithms
• Want to work within the context of a humble, collaborative team that collectively owns mission-critical production services

It’s Important to Us That You Have

• The ability to work collaboratively in a fast-paced environment and adapt to changing priorities
• Proven industry experience building and shipping production services
• Strong confidence in a lower-level language like C, C++, or Rust
• Experience slicing large projects or initiatives into smaller experiments or incremental improvements
• Expertise in an ML framework like Torch or TensorFlow
• Experience with GPU programming using tools like CUDA or libraries like cuDNN, cuBLAS, etc.

It Would Be Great If You Also Had

• Extensive professional experience with Rust and C++
• Experience optimizing ML workloads in production
• Familiarity with GPU hardware architecture and its impact on inference pipelines

Backed by prominent investors including Y Combinator, Madrona, Tiger Global, Wing VC and NVIDIA, Deepgram has raised over $85 million in total funding after closing our Series B funding round last year. If you're looking to work on cutting-edge technology and make a significant impact in the AI industry, we'd love to hear from you!

Deepgram is an equal opportunity employer. We want all voices and perspectives represented in our workforce. We are a curious bunch focused on collaboration and doing the right thing. We put our customers first, grow together and move quickly. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, gender identity or expression, age, marital status, veteran status, disability status, pregnancy, parental status, genetic information, political affiliation, or any other status protected by the laws or regulations in the locations where we operate. We are happy to provide accommodations for applicants who need them.
