Voice Commands
Learn how to use your voice as a trigger. Build custom commands to perform actions or capture information.
By default, all voice transcription happens on device using a local Whisper model. It should work across multiple languages, and you can improve quality by changing the underlying model or by setting commonly used words (see Transcription).
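If you want a feel for what local Whisper transcription involves, the sketch below uses the open-source `whisper` Python package. It is only an illustration; the app's built-in pipeline is not necessarily implemented this way, and the model name, audio file, and custom vocabulary are placeholders.

```python
# Minimal sketch of on-device Whisper transcription (illustrative only).
import whisper

# Smaller models ("tiny", "base") are faster; larger ones ("medium", "large")
# generally improve accuracy, especially for non-English speech.
model = whisper.load_model("base")

# An initial_prompt with commonly used words (names, jargon) can bias the
# decoder toward spelling them correctly.
result = model.transcribe(
    "command.wav",                            # hypothetical recording
    initial_prompt="Acme, Kubernetes, OKR",   # hypothetical custom vocabulary
)
print(result["text"])
```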
In the Advanced Settings, you can see the configuration for voice commands. Here you can configure the model and whether to send audio data directly to the AI model (this is only supported by OpenAI's gpt-4o-audio-preview and should be considered beta). For more information on these settings, see Transcription.
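For context, sending audio directly to the model roughly corresponds to an OpenAI chat completion call with an `input_audio` content part, as sketched below. This is only an illustration of the underlying API; the file name and prompt text are made up, and the app handles this step for you.

```python
# Hypothetical sketch of sending raw audio straight to gpt-4o-audio-preview
# instead of transcribing it locally first.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("command.wav", "rb") as f:  # hypothetical recording
    audio_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o-audio-preview",
    modalities=["text"],
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Carry out the spoken command in this audio."},
                {"type": "input_audio",
                 "input_audio": {"data": audio_b64, "format": "wav"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```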
Once you finish the Quick Start, voice commands should work right away. Try holding the hotkey and saying: "Write an email to my co-worker about the benefits of coffee" and see what happens.
By default, the "Process Audio" command is used to handle your command. Let's dive into this command and see how it works. Go to "Actions", and select "Process Audio".
Process Audio is an Ask AI action that uses the default AI provider and has a number of actions it can choose to perform. Each action usually reflects a specific kind of command (a rough sketch of this routing appears after the list below). By default it has these actions:
Ask ChatGPT / Claude / Perplexity
Draft Email
Open Browser
Open Application
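To give a rough idea of how a single Ask AI action can route a spoken command to one of several sub-actions, the hypothetical sketch below models two of the default actions as function-calling tools. The tool names, schemas, and model choice are illustrative assumptions, not the app's actual implementation.

```python
# Hypothetical sketch of routing a spoken command to a sub-action
# via function calling.
import json
from openai import OpenAI

client = OpenAI()

tools = [
    {"type": "function", "function": {
        "name": "draft_email",
        "description": "Draft an email from the user's spoken instructions.",
        "parameters": {"type": "object", "properties": {
            "subject": {"type": "string"},
            "body": {"type": "string"}},
            "required": ["subject", "body"]}}},
    {"type": "function", "function": {
        "name": "open_application",
        "description": "Open an application by name.",
        "parameters": {"type": "object", "properties": {
            "app_name": {"type": "string"}},
            "required": ["app_name"]}}},
]

spoken = "Write an email to my co-worker about the benefits of coffee"
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": spoken}],
    tools=tools,
)

message = response.choices[0].message
if message.tool_calls:  # the model chose a sub-action
    call = message.tool_calls[0]
    print(call.function.name, json.loads(call.function.arguments))
else:                   # the model answered in plain text instead
    print(message.content)
```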
You can add or customize these actions as you please. Below the actions, you see the prompt that controls the AI's main behavior; this too can be fully customized.
In the prompt, you can specify the behavior you are looking for. The {{ value }} placeholders in the prompt are replaced by variables before the prompt is sent to the AI. In this case, {{ originalInput }} is your spoken command, and your name and some time information are also inserted. For more information on templating, see the Templating page.
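As a rough illustration of how such placeholder substitution can work, here is a minimal sketch. {{ originalInput }} is the variable named above, while userName and currentTime are hypothetical names standing in for the name and time information the app inserts; the app's actual templating engine may support more syntax (see the Templating page).

```python
# Minimal sketch of {{ value }} placeholder substitution (illustrative only).
import re
from datetime import datetime

def render(template: str, variables: dict) -> str:
    # Replace each {{ name }} with the matching variable,
    # leaving unknown placeholders untouched.
    def sub(match: re.Match) -> str:
        key = match.group(1)
        return str(variables.get(key, match.group(0)))
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", sub, template)

prompt_template = (
    "You are a voice assistant for {{ userName }}. "
    "The current time is {{ currentTime }}. "
    "Carry out this spoken command: {{ originalInput }}"
)

print(render(prompt_template, {
    "userName": "Alex",                                   # hypothetical variable
    "currentTime": datetime.now().isoformat(timespec="minutes"),  # hypothetical variable
    "originalInput": "Write an email to my co-worker about the benefits of coffee",
}))
```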