SensiML uses AI voice generation for synthetic speech datasets

SensiML recently announced technology to help embedded device developers use AI voice generation and text-to-speech to quickly create synthetic speech datasets.

Such datasets should help developers build models for keyword spotting, voice command recognition, and speaker identification. The AI voice tech is being integrated into SensiML’s Data Studio software for IoT edge applications. SensiML is a subsidiary of QuickLogic.

Synthetic data drastically reduces the cost and time needed to create speech recognition AI models, SensiML CEO Chris Rogers told Fierce Electronics. Currently, developers must manually record phrases from large populations of diverse speakers, often across multiple languages, a heavy workload for most development teams.
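The workflow described above can be sketched in a few lines of Python. Everything below is illustrative: the command phrases, voice names, and the `synthesize()` stub are hypothetical placeholders, not SensiML's Data Studio feature or any real TTS vendor's API. The sketch simply shows how a small phrase list crossed with multiple synthetic voices and languages multiplies into a labeled dataset that would otherwise require recording many human speakers.

```python
import itertools
from pathlib import Path

# Hypothetical inputs -- stand-ins for a real project's keyword list and
# for the diverse synthetic speaker profiles a TTS service would offer.
COMMANDS = ["arm the system", "disarm the system", "open the garage"]
VOICES = ["voice_a", "voice_b", "voice_c"]
LANGUAGES = ["en", "es"]

def synthesize(text: str, voice: str, language: str) -> bytes:
    """Placeholder for a text-to-speech API call.
    A real implementation would return audio samples; this stub
    returns a dummy payload so the sketch is self-contained."""
    return f"{voice}:{language}:{text}".encode()

def build_dataset(out_dir: str = "synthetic_keywords") -> list[Path]:
    """Write one clip per (command, voice, language) combination,
    yielding a labeled training set from a handful of phrases."""
    root = Path(out_dir)
    root.mkdir(exist_ok=True)
    written = []
    for text, voice, lang in itertools.product(COMMANDS, VOICES, LANGUAGES):
        audio = synthesize(text, voice, lang)
        path = root / f"{text.replace(' ', '_')}__{voice}__{lang}.wav"
        path.write_bytes(audio)
        written.append(path)
    return written

files = build_dataset()
print(len(files))  # 3 commands x 3 voices x 2 languages = 18 clips
```

Even this toy example shows the leverage: three phrases become 18 labeled clips, and adding one more synthetic voice or language scales the set multiplicatively with no new recording sessions.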

SensiML built the feature on speech generation technology from ElevenLabs. The resulting recognition models run autonomously on the low-power microcontrollers used in edge IoT applications.

In a smart home security example, SensiML’s new AI voice generator feature would help a developer create extensive voice datasets so the system can accurately recognize a wide range of user commands.

Developers will also be able to custom build their own ML code for IoT devices that need to handle voice and sound recognition directly on-device.

“We asked how can we synthetically augment data sets to make projects easier…I wanted to create a command data set,” Rogers said. “We’re trying with this tool to put power in any developer’s hands to collect a small amount of data” and build out a model. “This is the start of what we imagine to be a lot of use cases. LLMs have established APIs and so we started there.”