We are seeking a skilled Research Engineer for model inference to join our team. In this role, you will optimize and deploy machine learning models for real-time applications. You will work closely with machine learning researchers and production engineers to integrate models seamlessly into production environments, maximizing efficiency and performance.

Your Impact

Model Optimization: Collaborate with machine learning researchers to understand model architectures and algorithms.
Implement optimization techniques to improve the efficiency and inference speed of machine learning models in production.
Deployment and Integration: Work closely with product engineers to integrate machine learning models into production systems in a scalable way.
Optimize models for real-time inference, ensuring low latency and high throughput.
Set up monitoring systems to track model performance in real time.
Ensure models can scale horizontally to handle increased load.
Implement strategies for resource-efficient inference, considering factors such as memory usage and CPU/GPU utilization.
Collaborate with cross-functional teams to understand requirements and constraints.
Provide technical expertise on inference-related matters during the model development lifecycle.
Document the deployment and optimization processes for machine learning models.

We're looking for someone with

Master's degree in Computer Science
Experience in PyTorch
Proficiency in Python
Experience in C++
Basic knowledge of CUDA
Strong understanding of machine learning models, algorithms, and deployment strategies
Experience with model optimization techniques and performance profiling
Familiarity with Docker and Kubernetes
Knowledge of AWS
Experience with monitoring tools
3+ years of professional software engineering experience

About Otter.ai

We are in the business of shaping the future of work. Our mission is to make conversations more valuable.

With over 1B meetings transcribed, Otter.ai is the world’s leading tool for meeting transcription, summarization, and collaboration. Using artificial intelligence, Otter generates real-time automated meeting notes, summaries, and other insights from in-person and virtual meetings - turning meetings into accessible, collaborative, and actionable data that can be shared across teams and organizations. The company is backed by early investors in Google, DeepMind, Zoom, and Tesla.

Otter.ai is an equal opportunity employer. We proudly celebrate diversity and are dedicated to inclusivity.

*Otter.ai does not accept unsolicited resumes from 3rd party recruitment agencies without a written agreement in place for permanent placements. Any resume or other candidate information submitted outside of established candidate submission guidelines (including through our website or via email to any Otter.ai employee) and without a written agreement otherwise will be deemed to be our sole property, and no fee will be paid should we hire the candidate.

Salary range

Salary Range: $175,000 to $250,000 USD per year.

This range represents the low and high ends of the estimated salary for this position. The actual base salary offered depends on several factors. Base salary is just one component of our comprehensive total rewards package.

