About Blumind
Blumind is a deep-tech startup at the forefront of the AI revolution. We are building a new class of semiconductor: ultra-low-power analog AI processors (AMPL™) designed for the extreme edge. Our technology enables complex AI, from "always-on" keyword spotting to advanced vision and language models, to run on a fraction of the power of traditional digital chips. We are on a mission to make high-performance, on-device AI ubiquitous, from wearables and smart home devices to automotive and industrial IoT.
Role Overview: The Technical Challenge
We are seeking a seasoned and innovative Machine Learning Lead to own the technical strategy for optimizing and deploying next-generation neural networks on our unique analog hardware. Your primary challenge will be to bridge the gap between the world of large, complex models (hybrid SSMs, compressed Transformers, CNNs, RNNs) and the hard constraints of our ultra-low-power analog compute architecture.
You will lead this effort through a combination of architectural insight and hands-on validation and deployment, working alongside hardware architects and software developers to define the future of our ML model stack. This is a leadership role for a technical expert who is passionate about hardware-aware ML and eager to solve novel problems in model compression, quantization, and algorithm-hardware co-design. The work is exciting, fast-paced, and highly collaborative.
Key Responsibilities
- Technical Leadership & Strategy: Define and execute the roadmap for ML model support on Blumind's analog platform, with a special focus on advanced architectures like Transformers, CNNs, and RNNs.
- Hardware-Aware ML: Lead the research and implementation of cutting-edge optimization techniques (e.g., hardware-aware training, aggressive quantization, pruning, knowledge distillation, bitwise models, hybrid SSM/attention structures, and quantum compression techniques). The end goal is AI model and hardware co-optimization that delivers market-leading TOPS/watt, time-to-first-token (TTFT), and tokens/sec.
- Algorithm-Hardware Co-Design: Serve as the primary ML expert in discussions with the hardware team. Provide critical feedback to inform the design of future analog compute cores, ensuring they are optimized for the next wave of AI models.
- Team Leadership: Help recruit, lead, mentor, and grow a high-performing team of ML engineers. Foster a culture of innovation, rigorous testing, and cross-functional collaboration.
- Full Stack Development: Collaborate on and guide the requirements for our ML "translator" tools, which map models from standard frameworks (PyTorch, TensorFlow) onto our proprietary analog processor.
- Benchmarking & Analysis: Establish and own the pipelines for benchmarking model performance, accuracy, power consumption, and latency on silicon. Use this data to drive improvements across both hardware and software.
Required Qualifications
- Education: Master's or Ph.D. in Computer Science, Electrical Engineering, or a related field with a focus on Machine Learning.
- Experience: 6+ years in machine learning, with at least 2 years in a technical lead or senior/principal role.
- Deep Learning Expertise: Expert-level knowledge of deep learning architectures, particularly Transformers (e.g., BERT, ViT), SSMs (e.g., Mamba, Falcon), and CNNs, with a strong theoretical and practical understanding of their internal mechanics.
- Signal Processing: Familiarity with the fundamentals of signal processing and/or DSP.
- Model Optimization: Proven, hands-on experience with model compression techniques, especially quantization (QAT, PTQ).
- Core ML Skills: Strong proficiency in Python and ML frameworks like PyTorch or TensorFlow.
- Leadership: Demonstrated experience leading projects, hiring and mentoring junior engineers, and setting technical direction.
Preferred Qualifications
- Experience in the tinyML or edge AI ecosystem.
- A background in computer architecture, device physics, semiconductor design, or analog circuits.
- Experience in deploying models in production environments.
- Experience with C/C++ for embedded systems.
- A portfolio of published research in ML model compression, efficient AI, or hardware-aware ML.
Location
- Preferred location is hybrid in Toronto/Ottawa, or remote anywhere else in Canada. Candidates willing to relocate to Canada will also be considered.
We thank all applicants for their interest. Only candidates being considered for the role will be contacted.