Senior Voice AI Engineer

Help us build our own speech recognition and synthesis models. Reduce our dependency on third-party voice providers and own the moat.

Type: Full-time
Location: Vancouver (in-office preferred), remote considered
Salary: CAD $130,000 to $180,000 + equity
Posted: Jun 28, 2026

The opportunity

1n1.ai is a voice AI receptionist for Canadian small business. Today we route every call through external providers (Google Gemini, OpenAI, Anthropic Claude, Telnyx, Groq) for speech-to-text, language reasoning, and text-to-speech. They work. They are also a long-term dependency, a per-call cost, and a ceiling on how far we can push latency down.

We are hiring a Senior Voice AI Engineer to change that. Your mission is to build, train, fine-tune, and ship 1n1.ai's own speech models so that, over time, more of the voice stack lives inside our own infrastructure. That includes streaming speech-to-text tuned for the languages and accents Canadian small businesses actually hear (Cantonese, Mandarin, Punjabi, Tagalog, French, plus dozens more), neural text-to-speech that sounds human enough to keep callers from hanging up, and a model router that picks the right engine for each call.

This is the technical-moat hire. If you have wanted to do real voice ML inside a focused team without committee meetings or quarterly OKR theater, this is the seat.

What you will work on

Streaming ASR (automatic speech recognition). Train or fine-tune Conformer / Whisper / wav2vec-style architectures for sub-second latency on noisy phone audio, multilingual code-switching, and Canadian accents.
Neural TTS. Ship voice synthesis that callers cannot distinguish from a human speaker in 5 to 10 second turns. VITS, XTTS, Tortoise, or whatever architecture wins your benchmarks.
Latency engineering. Quantize, distill, and prune models so they run on commodity GPUs at production cost. The whole conversation budget is under a second of perceived delay; every millisecond matters.
Voice activity detection (VAD) and turn-taking. Build models so the AI knows when to listen, when to speak, and when to wait through hesitation.
Dataset work. Curate the multilingual phone-call corpus we need to train on. Negotiate licenses, build the pipelines, manage the labeling.
The model router. Decide per-call which engine (ours, Gemini, Groq, etc.) handles which segment, based on language, audio quality, latency budget, and cost.
Production integration. Wire your models into bridge.1n1.ai, our Node.js WebSocket server that mediates every live call.

Who you are

5+ years building production ML systems, with at least 2 years on speech, audio, or real-time inference.
You have actually trained an end-to-end speech model (ASR or TTS) yourself, not just fine-tuned someone else's checkpoint with a tutorial.
Strong PyTorch (JAX a bonus). Comfortable in CUDA when needed.
Experience with streaming audio inference and the latency tricks that make it real (chunked attention, ring buffers, KV caches, speculative decoding).
You read recent papers and can tell which results actually work in production and which are benchmark theater.
Bonus: experience with multilingual ASR, voice cloning, low-resource language adaptation, or telephony-band audio (8 kHz narrowband).

What we offer

Salary: CAD $130,000 to $180,000 based on experience, plus meaningful equity in a launching company.
Compute: a real GPU budget. Not "ask for permission each time." Burn what you need to ship.
Autonomy: you own the voice stack end-to-end. No PM proxying decisions, no design-by-committee.
Comprehensive Canadian benefits: extended health (paramedical, prescriptions), dental, vision, life and long-term disability, all from day one for full-time hires.
Time off: generous paid vacation, plus all Canadian statutory holidays.
Hardware budget to set up the workstation you need (Linux or Mac, your call).

Location

Vancouver office (in-person strongly preferred at 123 Placeholder Street, Suite 100, Vancouver, BC V6B 0A0). Remote considered for outstanding candidates anywhere in the world. We provide relocation support for top hires who want to move to Vancouver.

How to apply

Email careers@1n1.ai with your CV, GitHub or research links, and one paragraph on a speech or audio system you actually shipped (what you built, what was hard, what you learned). Cover letters not required; specifics are.

We respond to every application within 7 days.

About 1n1.ai

1n1.ai is an AI voice receptionist for Canadian small business, built and operated by Techalyst Software Inc. in Vancouver. We answer the calls our customers cannot pick up, in 70+ languages, around the clock, on real Canadian phone numbers. Launched 2026.

We are focused and shipping fast. Read the founder's launch story or the PIPEDA compliance page for what we actually do.