# Voice Commands

{% hint style="info" %}
By default, all voice transcription happens on device using a local Whisper model. It should work across multiple languages. You can improve quality by changing the underlying model or by setting commonly used words (see [Transcription](/inbox-ai/transcription.md)).
{% endhint %}

In the [Advanced Settings](/inbox-ai/advanced-settings.md), you can see the configuration for voice commands. Here you can choose the model and whether to send audio data directly to the AI model (this is only supported by OpenAI's `gpt-4o-audio-preview` and should be considered beta). For more information on these settings, see [Transcription](/inbox-ai/transcription.md).

<figure><img src="/files/5seoSKqvyfpCdnbPCeHj" alt=""><figcaption><p>Audio settings</p></figcaption></figure>

Once you finish the [Quick Start](/inbox-ai/quick-start.md), voice commands should work right away. Try holding the hotkey and saying: "Write an email to my co-worker about the benefits of coffee" and see what happens.

By default, the "Process Audio" action handles your voice command. Let's dive into this action and see how it works. Go to "Actions" and select "Process Audio".

<figure><img src="/files/rqXxhQsc5c76QzMe9k5G" alt=""><figcaption><p>The process audio action</p></figcaption></figure>

Process Audio is an [Ask AI](/inbox-ai/actions/custom-actions/ask-ai.md) action that uses the default AI provider and can choose from a number of actions to perform. Each action usually corresponds to a specific command. By default, it has these actions:

* Ask ChatGPT / Claude / Perplexity
* Draft Email
* [Add Apple Reminder](/inbox-ai/actions/built-in-actions/system-actions/add-apple-reminder.md)
* Open Browser
* [Get Selected Text](/inbox-ai/actions/built-in-actions/system-actions/get-selected-text.md)
* [Paste at Cursor](/inbox-ai/actions/built-in-actions/system-actions/paste-at-cursor.md)
* [Send Notification](/inbox-ai/actions/built-in-actions/system-actions/send-notification.md)
* Open Application
* [Take Screenshot](/inbox-ai/actions/built-in-actions/system-actions/take-screenshot.md)
* [Do Nothing](/inbox-ai/actions/built-in-actions/do-nothing.md)

You can add or customize these actions as you please. Below the actions, you see the prompt, which controls the AI's main behavior; this too can be fully customized.
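One way to picture how an action gets chosen: the AI returns the name of one of the configured actions, and the app runs the matching handler. The sketch below is purely conceptual and is not Inbox AI's implementation. The handler functions, the `ACTIONS` registry, and `run_chosen_action` are hypothetical names invented for illustration; only the action names "Draft Email" and "Do Nothing" come from the list above.

```python
from typing import Callable, Dict

# Hypothetical handlers standing in for two of the default actions.
def draft_email(text: str) -> str:
    return f"Draft Email: {text}"

def do_nothing(text: str) -> str:
    return "Do Nothing"

# Hypothetical registry mapping action names to handlers.
ACTIONS: Dict[str, Callable[[str], str]] = {
    "Draft Email": draft_email,
    "Do Nothing": do_nothing,
}

def run_chosen_action(chosen: str, argument: str) -> str:
    """Run the action the model picked; fall back to Do Nothing for unknown names."""
    handler = ACTIONS.get(chosen, do_nothing)
    return handler(argument)

result = run_chosen_action("Draft Email", "benefits of coffee")
print(result)
```

In this picture, adding a custom action amounts to adding one more entry to the set the AI can choose from.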

In the prompt, you can specify the behavior you are looking for. The `{{ value }}` placeholders in the prompt are replaced by variables before the prompt is sent to the AI. In this case, `{{ originalInput }}` is your spoken command, and your name and some time information are also inserted. For more information on templating, see the [Templating](/inbox-ai/templating.md) page.
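To make the placeholder substitution concrete, here is a minimal sketch of how `{{ name }}` placeholders could be filled in. It is an illustration only, not how Inbox AI renders prompts. The `render` function, the example prompt, and the `userName` variable are assumptions; `originalInput` is the only variable name taken from this page.

```python
import re

# Matches placeholders such as {{ originalInput }}; whitespace inside the braces is optional.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")

def render(template: str, variables: dict) -> str:
    """Replace each {{ name }} with its value; leave unknown placeholders untouched."""
    def substitute(match):
        name = match.group(1)
        return variables.get(name, match.group(0))
    return PLACEHOLDER.sub(substitute, template)

# Hypothetical prompt text and userName variable, used only for this example.
prompt = "You are helping {{ userName }}. The spoken command was: {{ originalInput }}"
rendered = render(prompt, {
    "userName": "Alex",
    "originalInput": "Write an email to my co-worker about the benefits of coffee",
})
print(rendered)
```

The sketch leaves unknown placeholders as-is so that a typo in a variable name stays visible in the rendered prompt. That is a design choice of this example, and the actual behavior is described on the Templating page.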

