Made in Vancouver, Canada by Picovoice
Cheetah is an on-device streaming speech-to-text engine. Cheetah is:
- Private: all voice processing runs locally.
- Accurate
- Compact and Computationally-Efficient
- Cross-Platform:
- Linux (x86_64), macOS (x86_64, arm64), and Windows (x86_64)
- Android and iOS
- Chrome, Safari, Firefox, and Edge
- Raspberry Pi (3, 4, 5)
AccessKey is your authentication and authorization token for deploying Picovoice SDKs, including Cheetah. Anyone using Picovoice needs a valid AccessKey, and you must keep it secret. Internet connectivity is required to validate your AccessKey with Picovoice license servers, even though the voice recognition itself runs 100% offline.
AccessKey also verifies that your usage is within the limits of your account. Everyone who signs up for Picovoice Console receives the Free Tier usage rights described here. If you wish to increase your limits, you can purchase a subscription plan.
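Because the AccessKey must stay secret, one common approach is to keep it out of source code and read it from the environment at startup. The following is a minimal sketch and not part of the Picovoice SDK; the variable name `PICOVOICE_ACCESS_KEY` and the helper `load_access_key` are hypothetical choices made for illustration.

```python
import os


def load_access_key(env_var: str = "PICOVOICE_ACCESS_KEY") -> str:
    """Return the AccessKey stored in the given environment variable.

    The variable name is an assumption made for this sketch; use whatever
    name fits your deployment.
    """
    access_key = os.environ.get(env_var, "").strip()
    if not access_key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Picovoice AccessKey."
        )
    return access_key
```

The returned string can then be passed wherever `${ACCESS_KEY}` appears in the examples in this document.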
- Cheetah Streaming Speech-to-Text currently supports English only.
- Support for additional languages is available for commercial customers on a case-by-case basis.
Install the demo package:
pip3 install pvcheetahdemo
cheetah_demo_mic --access_key ${ACCESS_KEY}
Replace ${ACCESS_KEY} with yours obtained from Picovoice Console.
If using SSH, clone the repository with:
git clone --recurse-submodules [email protected]:Picovoice/cheetah.git
If using HTTPS, clone the repository with:
git clone --recurse-submodules https://github.com/Picovoice/cheetah.git
Build the demo:
cmake -S demo/c/ -B demo/c/build && cmake --build demo/c/build
Run the demo:
./demo/c/build/cheetah_demo_mic -a ${ACCESS_KEY} -m ${MODEL_PATH} -l ${LIBRARY_PATH}
Replace ${ACCESS_KEY} with yours obtained from Picovoice Console, ${LIBRARY_PATH} with the path to the appropriate library under lib, and ${MODEL_PATH} with the path to the default model file (or your custom one).
To run the demo, go to demo/ios/CheetahDemo and run:
pod install
Replace let accessKey = "${YOUR_ACCESS_KEY_HERE}" in the file ViewModel.swift with your AccessKey.
Then, using Xcode, open the generated CheetahDemo.xcworkspace and run the application.
Using Android Studio, open demo/android/CheetahDemo as an Android project. Replace "${YOUR_ACCESS_KEY_HERE}" in the file MainActivity.java with your AccessKey, then run the application.
To run the Cheetah demo on Android or iOS with Flutter, you must have the Flutter SDK installed on your system. Once installed, you can run flutter doctor to determine any other missing requirements for your relevant platform. Once your environment has been set up, launch a simulator or connect an Android/iOS device.
Before launching the app, use the copy_assets.sh script to copy the Cheetah demo model file into the demo project. (NOTE: on Windows, Git Bash or another bash shell is required, or you will have to manually copy the model file into the project.)
Replace "${YOUR_ACCESS_KEY_HERE}" in the file main.dart with your AccessKey.
Run the following command from demo/flutter to build and deploy the demo to your device:
flutter run
The demo requires cgo, which on Windows may mean that you need to install a gcc compiler like MinGW to build it properly.
From demo/go, run the following command from the terminal to build and run the microphone demo:
go run micdemo/cheetah_mic_demo.go -access_key "${ACCESS_KEY}"
Replace ${ACCESS_KEY} with yours obtained from Picovoice Console.
For more information about Go demos go to demo/go.
To run the React Native Cheetah demo app you will first need to set up your React Native environment. For this, please refer to React Native's documentation. Once your environment has been set up, navigate to demo/react-native to run the following commands:
For Android:
yarn android-install # sets up environment
yarn android-run # builds and deploys to Android
For iOS:
yarn ios-install # sets up environment
yarn ios-run
Install the demo package:
yarn global add @picovoice/cheetah-node-demo
With a working microphone connected to your device, run the following in the terminal:
cheetah-mic-demo --access_key ${ACCESS_KEY}
For more information about Node.js demos go to demo/nodejs.
The Cheetah Java demo is a command-line application that lets you choose between running Cheetah on an audio file or on real-time microphone input.
To try the real-time demo, make sure there is a working microphone connected to your device. Then invoke the following commands from the terminal:
cd demo/java
./gradlew build
cd build/libs
java -jar cheetah-mic-demo.jar -a ${ACCESS_KEY}
For more information about Java demos go to demo/java.
The Cheetah .NET demo is a command-line application that lets you choose between running Cheetah on an audio file or on real-time microphone input.
Make sure there is a working microphone connected to your device. From demo/dotnet/CheetahDemo run the following in the terminal:
dotnet run -c MicDemo.Release -- --access_key ${ACCESS_KEY}
Replace ${ACCESS_KEY} with your Picovoice AccessKey.
For more information about .NET demos, go to demo/dotnet.
The Cheetah Rust demo is a command-line application that lets you choose between running Cheetah on an audio file or on real-time microphone input.
Make sure there is a working microphone connected to your device. From demo/rust/micdemo run the following in the terminal:
cargo run --release -- --access_key ${ACCESS_KEY}
Replace ${ACCESS_KEY} with your Picovoice AccessKey.
For more information about Rust demos, go to demo/rust.
From demo/web run the following in the terminal:
yarn
yarn start
(or)
npm install
npm run start
Open http://localhost:5000 in your browser to try the demo.
From demo/react run the following in the terminal:
yarn
yarn start
(or)
npm install
npm run start
Open http://localhost:3000 in your browser to try the demo.
Install the Python SDK:
pip3 install pvcheetah
Create an instance of the engine and transcribe audio in real-time:
import pvcheetah

handle = pvcheetah.create(access_key='${ACCESS_KEY}')

def get_next_audio_frame():
    pass

while True:
    partial_transcript, is_endpoint = handle.process(get_next_audio_frame())
    if is_endpoint:
        final_transcript = handle.flush()
Replace ${ACCESS_KEY} with yours obtained from Picovoice Console.
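The loop above leaves audio capture and transcript handling open. The sketch below shows one way to fill those in: it slices a buffer of 16-bit samples into frames, feeds each frame to `process`, and appends the flushed text whenever an endpoint is reported. `FakeEngine` is a hypothetical stand-in written only so the example runs without a microphone or an AccessKey; it mimics the `process` and `flush` calls shown above and assumes the engine exposes a `frame_length` attribute.

```python
from typing import Iterator, List, Sequence, Tuple


class FakeEngine:
    """Hypothetical stand-in that mimics the process/flush calls shown above."""

    frame_length = 4  # tiny value for illustration; a real engine reports its own

    def __init__(self) -> None:
        self._frames_seen = 0

    def process(self, frame: Sequence[int]) -> Tuple[str, bool]:
        # Pretend every frame yields one word and every third frame ends an utterance.
        self._frames_seen += 1
        return f"word{self._frames_seen} ", self._frames_seen % 3 == 0

    def flush(self) -> str:
        return "<end>"


def frames(pcm: Sequence[int], frame_length: int) -> Iterator[Sequence[int]]:
    """Yield consecutive full frames; a trailing partial frame is dropped."""
    for start in range(0, len(pcm) - frame_length + 1, frame_length):
        yield pcm[start:start + frame_length]


def transcribe(engine, pcm: Sequence[int]) -> List[str]:
    """Return one string per utterance, split at the endpoints the engine reports."""
    utterances: List[str] = []
    current = ""
    for frame in frames(pcm, engine.frame_length):
        partial, is_endpoint = engine.process(frame)
        current += partial
        if is_endpoint:
            current += engine.flush()
            utterances.append(current)
            current = ""
    if current:
        # Audio ran out before an endpoint: flush whatever is still buffered.
        utterances.append(current + engine.flush())
    return utterances
```

Calling `transcribe(FakeEngine(), [0] * 16)` walks four frames and returns two utterances, the second one produced by the final flush. Swapping `FakeEngine` for a real handle should only require that the object offers the same three members.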
Create an instance of the engine and transcribe audio in real-time:
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include "pv_cheetah.h"
pv_cheetah_t *handle = NULL;
const pv_status_t status = pv_cheetah_init("${ACCESS_KEY}", "${MODEL_PATH}", 0.f, false, &handle);
if (status != PV_STATUS_SUCCESS) {
// error handling logic
}
extern const int16_t *get_next_audio_frame(void);
while (true) {
char *partial_transcript = NULL;
bool is_endpoint = false;
const pv_status_t status = pv_cheetah_process(
handle,
get_next_audio_frame(),
&partial_transcript,
&is_endpoint);
if (status != PV_STATUS_SUCCESS) {
// error handling logic
}
// do something with transcript
free(partial_transcript);
if (is_endpoint) {
char *final_transcript = NULL;
const pv_status_t status = pv_cheetah_flush(handle, &final_transcript);
if (status != PV_STATUS_SUCCESS) {
// error handling logic
}
// do something with transcript
free(final_transcript);
}
}
Replace ${ACCESS_KEY} with yours obtained from Picovoice Console and ${MODEL_PATH} with the path to the default model file (or your custom one). Finally, when done, be sure to release the acquired resources using pv_cheetah_delete(handle).
The Cheetah iOS binding is available via CocoaPods. To import it into your iOS project, add the following line to your Podfile and run pod install:
pod 'Cheetah-iOS'
Create an instance of the engine and transcribe audio in real-time:
import Cheetah
let modelPath = Bundle(for: type(of: self)).path(
forResource: "${MODEL_FILE}", // Name of the model file for Cheetah
ofType: "pv")!
let cheetah = try Cheetah(accessKey: "${ACCESS_KEY}", modelPath: modelPath)
func getNextAudioFrame() -> [Int16] {
// .. get audioFrame
return audioFrame;
}
while true {
do {
let (partialTranscript, isEndpoint) = try cheetah.process(getNextAudioFrame())
if isEndpoint {
let finalTranscript = try cheetah.flush()
}
} catch let error as CheetahError {
// handle error
} catch { }
}
Replace ${ACCESS_KEY} with yours obtained from Picovoice Console and ${MODEL_FILE} with a custom trained model from Picovoice Console or the default model.
To include the package in your Android project, ensure you have included mavenCentral() in your top-level build.gradle file and then add the following to your app's build.gradle:
dependencies {
implementation 'ai.picovoice:cheetah-android:${LATEST_VERSION}'
}
Create an instance of the engine and transcribe audio in real-time:
import ai.picovoice.cheetah.*;
final String accessKey = "${ACCESS_KEY}"; // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)
final String modelPath = "${MODEL_FILE}";
short[] getNextAudioFrame() {
// .. get audioFrame
return audioFrame;
}
try {
Cheetah cheetah = new Cheetah.Builder().setAccessKey(accessKey).setModelPath(modelPath).build(appContext);
String transcript = "";
while (true) {
CheetahTranscript transcriptObj = cheetah.process(getNextAudioFrame());
transcript += transcriptObj.getTranscript();
if (transcriptObj.getIsEndpoint()) {
CheetahTranscript finalTranscriptObj = cheetah.flush();
transcript += finalTranscriptObj.getTranscript();
}
}
} catch (CheetahException ex) { }
Replace ${ACCESS_KEY} with yours obtained from Picovoice Console and ${MODEL_FILE} with the default or custom trained model from Picovoice Console.
Add the Cheetah Flutter plugin to your pubspec.yaml:
dependencies:
cheetah_flutter: ^<version>
Create an instance of the engine and transcribe audio in real-time:
import 'package:cheetah_flutter/cheetah.dart';
const accessKey = "{ACCESS_KEY}"; // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)
List<int> buffer = getAudioFrame();
try {
Cheetah _cheetah = await Cheetah.create(accessKey, '{CHEETAH_MODEL_PATH}');
String transcript = "";
while (true) {
CheetahTranscript partialResult = await _cheetah.process(getAudioFrame());
transcript += partialResult.transcript;
if (partialResult.isEndpoint) {
CheetahTranscript finalResult = await _cheetah.flush();
transcript += finalResult.transcript;
}
}
_cheetah.delete();
} on CheetahException catch (err) { }
Replace {ACCESS_KEY} with your AccessKey obtained from Picovoice Console and {CHEETAH_MODEL_PATH} with the path to a custom trained model from Picovoice Console or the default model.
Install the Go binding:
go get github.com/Picovoice/cheetah/binding/go
Create an instance of the engine and transcribe audio in real-time:
import . "github.com/Picovoice/cheetah/binding/go"
cheetah := NewCheetah("${ACCESS_KEY}")
err := cheetah.Init()
if err != nil {
// handle err init
}
defer cheetah.Delete()
func getNextFrameAudio() []int16{
// get audio frame
}
for {
partialTranscript, isEndpoint, err := cheetah.Process(getNextFrameAudio())
if isEndpoint {
finalTranscript, err := cheetah.Flush()
}
}
Replace ${ACCESS_KEY} with yours obtained from Picovoice Console. When done, be sure to explicitly release the resources using cheetah.Delete().
The Cheetah React Native binding is available via NPM. Add it via the following command:
yarn add @picovoice/cheetah-react-native
Create an instance of the engine and transcribe audio in real-time:
import {Cheetah, CheetahErrors} from '@picovoice/cheetah-react-native';
const getAudioFrame = () => {
// get audio frames
}
try {
const cheetah = await Cheetah.create("${ACCESS_KEY}", "${MODEL_FILE}")
while (1) {
const {transcript, isEndpoint} = await cheetah.process(getAudioFrame())
if (isEndpoint) {
const {transcript} = await cheetah.flush()
}
}
} catch (err: any) {
if (err instanceof CheetahErrors) {
// handle error
}
}
Replace ${ACCESS_KEY} with yours obtained from Picovoice Console and ${MODEL_FILE} with the default or custom trained model from Picovoice Console. When done, be sure to explicitly release the resources using cheetah.delete().
Install the Node.js SDK:
yarn add @picovoice/cheetah-node
Create instances of the Cheetah class:
const Cheetah = require("@picovoice/cheetah-node");
const accessKey = "${ACCESS_KEY}"; // Obtained from the Picovoice Console (https://console.picovoice.ai/)
const endpointDurationSec = 0.2;
const handle = new Cheetah(accessKey);
function getNextAudioFrame() {
// ...
return audioFrame;
}
while (true) {
const audioFrame = getNextAudioFrame();
const [partialTranscript, isEndpoint] = handle.process(audioFrame);
if (isEndpoint) {
const finalTranscript = handle.flush();
}
}
Replace ${ACCESS_KEY} with yours obtained from Picovoice Console.
When done, be sure to release resources using release():
handle.release();
Create an instance of the engine with the Cheetah Builder class and transcribe audio in real-time:
import ai.picovoice.cheetah.*;
final String accessKey = "..."; // AccessKey provided by Picovoice Console (https://console.picovoice.ai/)
short[] getNextAudioFrame() {
// .. get audioFrame
return audioFrame;
}
String transcript = "";
try {
Cheetah cheetah = new Cheetah.Builder().setAccessKey(accessKey).build();
while (true) {
CheetahTranscript transcriptObj = cheetah.process(getNextAudioFrame());
transcript += transcriptObj.getTranscript();
if (transcriptObj.getIsEndpoint()) {
CheetahTranscript finalTranscriptObj = cheetah.flush();
transcript += finalTranscriptObj.getTranscript();
}
}
cheetah.delete();
} catch (CheetahException ex) { }
Install the .NET SDK using NuGet or the dotnet CLI:
dotnet add package Cheetah
The SDK exposes a factory method to create instances of the engine as below:
using Pv;
const string accessKey = "${ACCESS_KEY}";
Cheetah handle = Cheetah.Create(accessKey);
Replace ${ACCESS_KEY} with yours obtained from Picovoice Console.
When initialized, the valid sample rate is given by handle.SampleRate. Expected frame length (number of audio samples in an input array) is handle.FrameLength. The engine accepts 16-bit linearly-encoded PCM and operates on single-channel audio.
short[] GetNextAudioFrame()
{
// .. get audioFrame
return audioFrame;
}
string transcript = "";
while(true)
{
CheetahTranscript transcriptObj = handle.Process(GetNextAudioFrame());
transcript += transcriptObj.Transcript;
if (transcriptObj.IsEndpoint) {
CheetahTranscript finalTranscriptObj = handle.Flush();
transcript += finalTranscriptObj.Transcript;
}
}
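Audio often arrives as floating-point or multi-channel samples rather than the 16-bit, single-channel PCM the engine expects. The following Python sketch shows one way to prepare such audio before splitting it into frames. The function names are hypothetical, and averaging the channels is just one simple downmix choice.

```python
from typing import List, Sequence


def downmix_to_mono(interleaved: Sequence[float], num_channels: int) -> List[float]:
    """Average each group of interleaved channel samples into one mono sample."""
    if num_channels < 1 or len(interleaved) % num_channels != 0:
        raise ValueError("sample count must be a multiple of num_channels")
    return [
        sum(interleaved[i:i + num_channels]) / num_channels
        for i in range(0, len(interleaved), num_channels)
    ]


def float_to_pcm16(samples: Sequence[float]) -> List[int]:
    """Map floats in [-1.0, 1.0] to signed 16-bit integers, clipping out-of-range values."""
    pcm = []
    for sample in samples:
        clipped = max(-1.0, min(1.0, sample))
        pcm.append(int(round(clipped * 32767)))
    return pcm
```

Resampling to the engine's sample rate is a separate step that this sketch does not cover.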
Cheetah will have its resources freed by the garbage collector, but to have resources freed immediately after use, wrap it in a using statement:
using(Cheetah handle = Cheetah.Create(accessKey))
{
// .. Cheetah usage here
}
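The scoped-release pattern above has a direct analogue in Python. As a sketch only: assuming an engine object exposes a `delete()` method, a small context manager can guarantee the call even when transcription raises. `DummyEngine` and `released_on_exit` are hypothetical names used for illustration, not part of the SDK.

```python
from contextlib import contextmanager


class DummyEngine:
    """Hypothetical stand-in for an engine handle that must be released."""

    def __init__(self) -> None:
        self.released = False

    def delete(self) -> None:
        self.released = True


@contextmanager
def released_on_exit(engine):
    """Yield the engine and always call its delete() method afterwards."""
    try:
        yield engine
    finally:
        engine.delete()
```

Inside a `with released_on_exit(engine) as handle:` block the handle is usable as normal, and `delete()` runs as soon as the block exits.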
First you will need Rust and Cargo installed on your system.
To add the cheetah library into your app, add pv_cheetah to your app's Cargo.toml manifest:
[dependencies]
pv_cheetah = "*"
Create an instance of the engine using CheetahBuilder and transcribe audio in real-time:
use cheetah::CheetahBuilder;
fn next_audio_frame() -> Vec<i16> {
// get audio frame
}
let access_key = "${ACCESS_KEY}"; // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)
let cheetah: Cheetah = CheetahBuilder::new().access_key(access_key).init().expect("Unable to create Cheetah");
if let Ok(cheetahTranscript) = cheetah.process(&next_audio_frame()) {
println!("{}", cheetahTranscript.transcript);
if cheetahTranscript.is_endpoint {
if let Ok(cheetahTranscript) = cheetah.flush() {
println!("{}", cheetahTranscript.transcript);
}
}
}
Replace ${ACCESS_KEY} with yours obtained from Picovoice Console.
Install the web SDK using yarn:
yarn add @picovoice/cheetah-web
or using npm:
npm install --save @picovoice/cheetah-web
Create an instance of the engine using CheetahWorker and transcribe audio in real-time:
import { CheetahWorker } from "@picovoice/cheetah-web";
import cheetahParams from "${PATH_TO_BASE64_CHEETAH_PARAMS}";
let transcript = "";
function transcriptCallback(cheetahTranscript: CheetahTranscript) {
transcript += cheetahTranscript.transcript;
if (cheetahTranscript.isEndpoint) {
transcript += "\n";
}
}
function getAudioData(): Int16Array {
// ... function to get audio data
return new Int16Array();
}
const cheetah = await CheetahWorker.create(
"${ACCESS_KEY}",
transcriptCallback,
{ base64: cheetahParams }
);
for (;;) {
cheetah.process(getAudioData());
// break on some condition
}
cheetah.flush(); // runs transcriptCallback on remaining data.
Replace ${ACCESS_KEY} with yours obtained from Picovoice Console. Finally, when done, release the resources using cheetah.release().
yarn add @picovoice/cheetah-react @picovoice/web-voice-processor
(or)
npm install @picovoice/cheetah-react @picovoice/web-voice-processor
import React, { useEffect } from "react";
import { useCheetah } from "@picovoice/cheetah-react";
function App(props) {
const {
result,
isLoaded,
isListening,
error,
init,
start,
stop,
release,
} = useCheetah();
const initEngine = async () => {
await init(
"${ACCESS_KEY}",
cheetahModel,
);
};
const toggleRecord = async () => {
if (isListening) {
await stop();
} else {
await start();
}
};
useEffect(() => {
if (result !== null) {
console.log(result.transcript);
console.log(result.isComplete);
}
}, [result])
}
- Improvements to error reporting
- Upgrades to authorization and authentication system
- Improved engine accuracy
- Various bug fixes and improvements
- Node min support bumped to Node 16
- Bumped iOS support to iOS 13+
- Patches to .NET support
- Added true-casing by default for transcription results
- Added option to enable automatic punctuation insertion
- Cheetah Web SDK release
- Initial release