Speech to Text

The SpeechToText component allows administrators to configure speech-to-text functionality for their chatbot. This component is part of the chatbot configuration interface and enables the conversion of spoken language into written text.

Purpose

This component lets administrators set up and manage speech-to-text so that users can interact with the chatbot through voice input.

Features

Provider Selection

Choose one of the supported speech-to-text providers:
- OpenAI Whisper
- Assembly AI
- LocalAI STT

Select "None" to disable speech-to-text.

Provider-Specific Configuration

Each provider has its own set of configuration options, which may include:

- API credentials
- Language settings
- Model selection
- Advanced parameters (e.g., temperature, prompts)

How to Use

Accessing the Settings:
- Navigate to the chatflow configuration interface.
- Locate the "Speech to Text" section.
Selecting a Provider:
- Use the dropdown menu to select a speech-to-text provider.
- Options include "None" (to disable), "OpenAI Whisper", "Assembly AI", and "LocalAI STT".
Configuring Provider Settings:
- Once a provider is selected, its specific configuration options will appear.
- Fill in the required fields and any optional parameters as needed.
OpenAI Whisper Configuration:
- Connect OpenAI API credentials
- Optionally set language, prompt, and temperature
Assembly AI Configuration:
- Connect Assembly AI API credentials
LocalAI STT Configuration:
- Connect LocalAI API credentials
- Set the base URL for the local AI server
- Optionally configure language, model, prompt, and temperature
Saving Changes:
- After configuring the settings, click the "Save" button to apply the speech-to-text configuration.
- A success message will appear if the settings are saved successfully.
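
The per-provider options listed in the steps above can be sketched as simple data structures. This is only an illustration: the class names, field names, and placeholder values (credential IDs, base URL, model name) are assumptions made for the example, not the component's actual schema.

```python
from dataclasses import dataclass, asdict
from typing import Optional


# All names below are hypothetical; the real component's keys may differ.
@dataclass
class WhisperSettings:
    credential_id: str                    # required: OpenAI API credential
    language: Optional[str] = None        # optional
    prompt: Optional[str] = None          # optional
    temperature: Optional[float] = None   # optional


@dataclass
class AssemblyAISettings:
    credential_id: str                    # required: Assembly AI API credential


@dataclass
class LocalAISettings:
    credential_id: str                    # required: LocalAI API credential
    base_url: str                         # required: URL of the local AI server
    language: Optional[str] = None
    model: Optional[str] = None
    prompt: Optional[str] = None
    temperature: Optional[float] = None


def to_payload(settings) -> dict:
    """Keep only the fields that were actually configured."""
    return {key: value for key, value in asdict(settings).items() if value is not None}


whisper = to_payload(WhisperSettings(credential_id="cred-123", language="en"))
local = to_payload(
    LocalAISettings(
        credential_id="cred-456",
        base_url="http://localhost:8080/v1",  # placeholder address
        model="whisper-1",                    # placeholder model name
    )
)
```

Dropping unset optional fields keeps the saved configuration limited to what the administrator filled in, which matches the split between required fields and optional parameters described above.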

Important Notes

- Only one speech-to-text provider can be active at a time.
- Ensure that you have the necessary API credentials for the selected provider.
- Some providers may require additional setup or have usage limits. Refer to the provider's documentation for more information.
- The "Save" button will be disabled if a provider is selected but no credential is provided.
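
Two of the notes above (a single active provider, and "Save" being disabled without a credential) amount to small validation rules. A minimal sketch follows; the function names are hypothetical, and it assumes that selecting "None" (modeled here as Python's `None`) leaves "Save" enabled, which the notes do not state explicitly.

```python
from typing import Dict, Optional

PROVIDERS = ("OpenAI Whisper", "Assembly AI", "LocalAI STT")


def activate(provider: Optional[str]) -> Dict[str, bool]:
    """Return an on/off map in which at most one provider is active."""
    return {name: name == provider for name in PROVIDERS}


def can_save(provider: Optional[str], credential_id: Optional[str]) -> bool:
    """Mirror the Save-button rule: a selected provider needs a credential."""
    if provider is None:
        # "None" selected: speech-to-text is disabled (assumed to be saveable).
        return True
    if provider not in PROVIDERS:
        raise ValueError(f"Unknown provider: {provider}")
    return bool(credential_id)


status = activate("LocalAI STT")
save_enabled = can_save("Assembly AI", "")  # no credential, so Save stays disabled
```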

Technical Details

- The component uses Redux for state management and dispatching actions.
- Speech-to-text settings are stored in the speechToText field of the chatflow data as a JSON string.
- When saved, the configuration is updated via an API call to updateChatflow.
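
To illustrate what storing the settings "as a JSON string" means in practice, here is a small round-trip sketch. Only the speechToText field name comes from the text above; the chatflow record and the keys inside the settings object are invented for the example. The updateChatflow call itself is omitted because its signature is not described here.

```python
import json

# Hypothetical chatflow record; only the `speechToText` field name is from the docs.
chatflow = {"id": "flow-1", "name": "Support bot", "speechToText": None}

# Hypothetical settings object for the selected provider.
settings = {
    "provider": "OpenAI Whisper",
    "credentialId": "cred-123",
    "language": "en",
    "temperature": 0.2,
}

# The settings are serialized, so the field holds a string rather than a nested object.
chatflow["speechToText"] = json.dumps(settings)

# Anything reading the chatflow has to parse the string before using the settings.
restored = json.loads(chatflow["speechToText"])
```

Because the field holds a string, a consumer that forgets to parse it will see text rather than a settings object, which is worth keeping in mind when inspecting chatflow data directly.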

Error Handling

If an error occurs while saving the settings, an error message will be displayed with details about the failure.
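
The behavior described above, a success message on save and an error message with failure details otherwise, can be sketched as follows. The helper name, the message wording, and the simulated failure are all assumptions for illustration; the component's actual messages may differ.

```python
from typing import Callable, Tuple


def save_with_feedback(save: Callable[[], None]) -> Tuple[bool, str]:
    """Run a save operation and return (succeeded, message to display)."""
    try:
        save()
    except Exception as exc:
        # Include the failure details in the message shown to the administrator.
        return False, f"Failed to save Speech to Text configuration: {exc}"
    return True, "Speech to Text configuration saved"


def failing_save() -> None:
    # Simulated failure standing in for a rejected API call.
    raise RuntimeError("401 Unauthorized")


ok, message = save_with_feedback(failing_save)
```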

Security Implications

- Ensure that API credentials are kept secure and not exposed to unauthorized parties.
- Be aware of the data privacy implications of using cloud-based speech-to-text services, especially when handling sensitive information.
- For LocalAI STT, ensure that the local server is properly secured and accessible only to authorized systems.

Customization

The speech-to-text functionality can be further customized by adjusting provider-specific parameters such as language, prompts, and temperature settings. These allow you to fine-tune the accuracy and behavior of the speech recognition for your specific use case.
