Deep Learning Engineer (Speech Recognition / Wake Word Detection)
Sunnyvale, United States
42dot | Full-time

About Us

42dot is a mobility AI company committed to solving mobility challenges with software and AI. As the Global Software Center of Hyundai Motor Group, 42dot pioneers the future of mobility by advancing the development of software-defined vehicles.

We develop safety-first, user-centric software-defined vehicle technologies that keep performance current through continuous updates, much as smartphones do. By advancing software and AI technology, 42dot envisions a world where everything is connected and moves autonomously through a self-managing urban transportation operating system.

About the Role

The Audio Team develops voice technologies that enable users to communicate more conveniently with their vehicles. 42dot’s voice technology recognizes users’ speech, allowing them to interact with the vehicle using voice alone.

Responsibilities

  • Design STT (Speech-to-Text) acoustic/language models and build voice databases

  • Design WWD (Wake Word Detection) models and build voice databases

  • Develop batch/streaming speech applications for server and on-device (Linux, Android) environments

  • Develop and optimize hardware-efficient models

Qualifications

  • Minimum 2 years of relevant experience (Master’s degree holders without experience are also welcome to apply)

  • Strong understanding of basic concepts in speech signal processing

  • Research experience in machine learning/AI for STT or WWD

  • Experience developing on-device (edge) applications (Linux, Android)

  • Programming skills for proposing innovative technologies and solving problems (e.g., C/C++, Python, Shell)

  • Proficiency with open-source deep learning frameworks (e.g., PyTorch, TensorFlow, Keras, Caffe)

  • Familiarity with North American English, Spanish, and French (required)

  • Bilingual or multilingual ability is a plus

Preferred Qualifications

  • Understanding and experience using version control systems (e.g., Git)

  • Familiarity with unit testing, static analysis, test automation, and CI/CD

  • Experience developing and deploying STT technologies in commercial products or services

  • Research experience in STT based on SSL (Self-Supervised Learning) or AED (Attention-based Encoder-Decoder)

  • Experience in multilingual STT research

  • Research experience related to language models for speech recognition

  • Publications in top-tier journals or conferences in speech recognition, machine learning, or AI

  • Master’s or Ph.D. degree (completed or in progress) in a related field

Base Salary:

$120,000 to $300,000