tts

The tts skill converts text to speech audio using one of two backends: Kokoro (local) or Noiz (cloud). It also supports voice cloning and timeline-accurate synthesis for dubbing and video narration.


Text-to-Speech

Quick Install

$ npx skills add https://github.com/noizai/skills --skill tts

Overview

About This Skill

The tts skill converts text into natural-sounding speech audio using two backends: Kokoro for local, privacy-first processing, and Noiz for cloud-based generation with advanced features. It supports precise duration control and timeline-aligned rendering, which suits video dubbing, narration, and content creation workflows. Voice cloning from reference audio lets you generate speech in a specific voice. Setup requires either a local Kokoro installation or a Noiz API key for cloud processing. Emotion and speaking style can optionally be controlled through integration with the characteristic-voice skill.
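Since setup depends on either a local Kokoro installation or a Noiz API key, a wrapper script might pick the backend from the environment. The sketch below is only an illustration of that choice: the `NOIZ_API_KEY` variable name and the `choose_backend` helper are assumptions made for this example and are not taken from the skill itself.

```python
import os


def choose_backend(env=None):
    """Return 'noiz' when an API key is present, otherwise 'kokoro'.

    NOIZ_API_KEY is a hypothetical variable name used for illustration;
    check the skill's own documentation for the real configuration.
    """
    env = os.environ if env is None else env
    # An empty or missing key falls back to the local backend.
    return "noiz" if env.get("NOIZ_API_KEY") else "kokoro"


if __name__ == "__main__":
    print(choose_backend({}))
    print(choose_backend({"NOIZ_API_KEY": "example-key"}))
```

Defaulting to the local backend when no key is set keeps text on the machine unless cloud processing has been configured explicitly.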

How to Use

Install with `npx skills add https://github.com/noizai/skills --skill tts`.

Use Cases

Convert text, files, or URLs to audio with voice cloning.

Align speech to SRT subtitles for video narration.

Control voice selection, language, and emotion in segments.
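To make the SRT use case concrete, here is a minimal sketch of how subtitle cues could be turned into timed segments before synthesis. It is not the skill's own parser: the `Cue` class and `parse_srt` function are hypothetical names, and the code assumes well-formed SubRip input (index line, `start --> end` line, then text).

```python
import re
from dataclasses import dataclass


@dataclass
class Cue:
    index: int
    start: float  # seconds
    end: float    # seconds
    text: str


_TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def _to_seconds(stamp: str) -> float:
    """Convert an 'HH:MM:SS,mmm' timestamp to seconds."""
    m = _TIME.fullmatch(stamp.strip())
    if m is None:
        raise ValueError(f"bad SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(g) for g in m.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(srt: str) -> list[Cue]:
    """Parse SubRip text into cues; blocks with fewer than 3 lines are skipped."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", srt.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 3:
            continue
        start, end = lines[1].split("-->")
        cues.append(
            Cue(int(lines[0]), _to_seconds(start), _to_seconds(end), " ".join(lines[2:]))
        )
    return cues


SAMPLE = """1
00:00:00,000 --> 00:00:02,500
Welcome to the annual report.

2
00:00:02,500 --> 00:00:05,000
Today, we'll review our progress.
"""

if __name__ == "__main__":
    for cue in parse_srt(SAMPLE):
        print(f"{cue.start:.1f}s to {cue.end:.1f}s: {cue.text}")
```

Each cue's `end - start` gives the time available for the spoken line, which is the figure a duration-controlled backend would need to hit.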

Setup & Installation

Quick Install

Terminal

$ npx skills add https://github.com/noizai/skills --skill tts

Alternative Install (Git Clone)

git clone https://github.com/noizai/skills

Requirements

Claude Code or compatible AI agent

Quick Start Guide

Install the Skill

Copy the install command above and run it in your terminal.

Open Your AI Agent

Launch Claude Code, Cursor, or your preferred AI coding agent.

Try It Out

Use the prompt template or examples below to test the skill.

Customize

Adapt the skill to your specific use case and workflow.

Usage Examples

Prompt Template

Convert the following text to speech using the tts skill: '[TEXT_TO_CONVERT]'. Use the [BACKEND] backend ([Kokoro_LOCAL/Noiz_CLOUD]). If voice cloning is needed, specify the cloned voice with '[VOICE_NAME]' or 'default'. For timeline-accurate synthesis, include the timing markers in the format '[SYNC:TIMESTAMP]' where TIMESTAMP is in seconds.
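If you generate prompts programmatically, the template above can be filled from a list of timed lines. The following is a rough sketch under stated assumptions: `build_prompt` is a hypothetical helper, the backend labels are copied from the template's placeholder, and the wording of the resulting prompt is a shortened variant of the template rather than a format the skill requires.

```python
def build_prompt(segments, backend="Kokoro_LOCAL", voice="default"):
    """Fill the prompt template from (start_seconds, text) pairs.

    Segments are sorted by start time and each one is prefixed with a
    [SYNC:TIMESTAMP] marker, with TIMESTAMP in seconds.
    """
    if backend not in ("Kokoro_LOCAL", "Noiz_CLOUD"):
        raise ValueError(f"unknown backend: {backend!r}")
    marked = " ".join(
        f"[SYNC:{start:.1f}] {text}" for start, text in sorted(segments)
    )
    return (
        f"Convert the following text to speech using the tts skill: '{marked}'. "
        f"Use the {backend} backend. Use the voice '{voice}'."
    )


if __name__ == "__main__":
    print(build_prompt([(2.5, "Second line."), (0.0, "First line.")]))
```

Sorting by start time means the caller does not need to supply the lines in order.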

Example Output

### Audio Output

Here’s the synthesized speech for your text:

```
[SYNC:0.0] Welcome to the [COMPANY] annual report for [YEAR].
[SYNC:2.5] Today, we'll review our progress in the [INDUSTRY] sector.
[SYNC:5.0] Key highlights include a 12% increase in revenue and a 15% reduction in operational costs.
```

**Backend Used:** Noiz Cloud
**Voice:** Default (Female, Neutral Tone)
**Duration:** 8.2 seconds

*Download the audio file here: [LINK_TO_AUDIO_FILE]*

Would you like to adjust the voice, backend, or timing markers for further refinement?
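When adjusting timing markers, it helps to know how much time each line has before the next marker. The sketch below reads `[SYNC:TIMESTAMP]` lines like those in the example output and computes each segment's window. The `segment_windows` function is a hypothetical helper written for this illustration, and it assumes one marker per line with timestamps in ascending order.

```python
import re

SYNC_LINE = re.compile(r"^\[SYNC:(\d+(?:\.\d+)?)\]\s*(.+)$")


def segment_windows(script: str, total_duration: float):
    """Return (start, window_seconds, text) for each [SYNC:...] line."""
    starts = []
    for line in script.splitlines():
        m = SYNC_LINE.match(line.strip())
        if m:
            starts.append((float(m.group(1)), m.group(2)))
    windows = []
    for i, (start, text) in enumerate(starts):
        # A window ends where the next segment starts, or at the end of the audio.
        end = starts[i + 1][0] if i + 1 < len(starts) else total_duration
        windows.append((start, round(end - start, 3), text))
    return windows


EXAMPLE = """[SYNC:0.0] Welcome to the [COMPANY] annual report for [YEAR].
[SYNC:2.5] Today, we'll review our progress in the [INDUSTRY] sector.
[SYNC:5.0] Key highlights include a 12% increase in revenue and a 15% reduction in operational costs.
"""

if __name__ == "__main__":
    for start, window, text in segment_windows(EXAMPLE, total_duration=8.2):
        print(f"{start:>4.1f}s  window {window:.1f}s  {text[:40]}")
```

For the example output above, with a total duration of 8.2 seconds, the three windows come out to 2.5, 2.5, and 3.2 seconds. A line that cannot be spoken comfortably in its window is a candidate for a later marker or shorter text.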
