DEV Community

Smooth Code

🎙️ Building a Text-to-Speech (TTS) GUI with Python

Have you ever wanted to turn text into natural-sounding speech directly from your computer? With Python, it's easier than ever! By combining Microsoft Edge's neural voices (via the edge-tts library) and Python's built-in tkinter GUI framework, we can create a simple yet powerful Text-to-Speech (TTS) application.


This project lets you input text (or upload a file), select a voice, adjust the speaking speed, and save the output as an MP3 audio file.


✨ Features

  • 🎤 Multiple Voice Options

    Supports various neural voices such as US English, British English, Australian English, Canadian English, Spanish, and more.

  • ⚡ Customizable Speech Rate

    Adjust speed from -50% (slower) to +50% (faster) using a slider.

  • 📝 Flexible Text Input

    Enter text directly or upload a text file.

  • 💾 Export as MP3

    Save the generated speech to your preferred location.

  • 🖥️ Clean GUI

    Built with tkinter, offering a simple and user-friendly interface.
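
The speech-rate feature above depends on turning the slider position into the signed percentage string that the script passes as the rate (for example "+10%" or "-25%"). Here is a minimal sketch of that conversion. The helper name format_rate and its limit parameter are hypothetical, not part of edge-tts, and the clamping is an extra safeguard the original script does not include.

```python
def format_rate(percent: float, limit: int = 50) -> str:
    """Convert a slider value into a signed percentage string such as '+10%'."""
    # Round first, because a slider widget may report fractional positions.
    value = round(percent)
    # Clamp to the slider range used in this article (-50 to +50).
    value = max(-limit, min(limit, value))
    # The "+" format flag always emits a sign, so 0 becomes "+0%".
    return f"{value:+d}%"


print(format_rate(0))     # +0%
print(format_rate(25.4))  # +25%
print(format_rate(-80))   # -50%
```

Rounding before formatting keeps a decimal point out of the string, and the format flag replaces the explicit if/else on the sign.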


Dependencies

Ensure you have Python 3.7+ installed on your system. Then, install the edge-tts module:

pip install edge-tts

📦 Installation

  1. Clone the repository:
     git clone https://github.com/smoothcoode/edge-tts-gui
     cd edge-tts-gui
  2. Install dependencies:
     pip install edge-tts
  3. Run the application:
     python main.py

🗣️ Listing Available Voices

To see all available neural voices, run:

python -m edge_tts --list-voices

You'll find a variety of voices you can experiment with.
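
The voice list can also be fetched from Python, which is handy for filling the dropdown with one language only. The sketch below assumes that edge_tts.list_voices() is a coroutine returning a list of dictionaries with "ShortName" and "Locale" keys. The helper names filter_voices and fetch_english_voices are hypothetical, and the sample entries are hand-written for an offline demonstration.

```python
def filter_voices(voices, locale_prefix="en-"):
    """Return the sorted short names of voices whose locale starts with a prefix."""
    return sorted(
        v["ShortName"]
        for v in voices
        if v.get("Locale", "").startswith(locale_prefix)
    )


async def fetch_english_voices():
    """Download the voice list (needs network access) and keep English voices."""
    # Imported here so filter_voices() stays usable without the package installed.
    import edge_tts

    voices = await edge_tts.list_voices()
    return filter_voices(voices, "en-")


# Offline demonstration with sample entries shaped like the assumed result.
sample = [
    {"ShortName": "en-US-AriaNeural", "Locale": "en-US"},
    {"ShortName": "es-ES-ElviraNeural", "Locale": "es-ES"},
    {"ShortName": "en-US-AndrewNeural", "Locale": "en-US"},
]
print(filter_voices(sample))  # ['en-US-AndrewNeural', 'en-US-AriaNeural']
```

To use the live list, call asyncio.run(fetch_english_voices()) once at startup and pass the result as the values of the Combobox.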


🧑‍💻 Source Code

Here's the complete Python script that powers the application:

import asyncio
import tkinter as tk
from tkinter import ttk, messagebox, filedialog

import edge_tts


async def generate_audio(text, voice, rate, output_file):
    """Synthesize the text with edge-tts and save it as an MP3 file."""
    communicate = edge_tts.Communicate(text=text, voice=voice, rate=rate)
    await communicate.save(output_file)
    messagebox.showinfo("Success", "File saved successfully: " + output_file)


# Initialize main window
root = tk.Tk()
root.title("Text to Speech")
root.geometry("600x400")

# Voice selection
VOICES = [
    "en-US-AndrewNeural",
    "en-US-AriaNeural",
    "en-US-AshTurboMultilingualNeural",
    "en-US-AshleyNeural",
    "en-US-AvaMultilingualNeural",
    "en-US-AvaNeural",
]
ttk.Label(root, text="Select a Voice:").grid(column=0, row=0, padx=10, pady=10, sticky="w")
voice_var = tk.StringVar(value=VOICES[0])
voice_dropdown = ttk.Combobox(root, values=VOICES, textvariable=voice_var, state="readonly")
voice_dropdown.grid(row=0, column=1, padx=10, pady=10, sticky="ew")
root.columnconfigure(1, weight=3)

# Speed slider
ttk.Label(root, text="Select a speed Rate").grid(row=1, column=0, padx=10, pady=10, sticky="w")
speed_var = tk.IntVar(value=0)
speed_slider = ttk.Scale(root, from_=-50, to=50, orient="horizontal", variable=speed_var)
speed_slider.grid(row=1, column=1, padx=10, pady=10, sticky="ew")

# Text input
ttk.Label(root, text="Enter a text").grid(row=2, column=0, padx=10, pady=10, sticky="w")
text_box = tk.Text(root, wrap="word")
text_box.grid(row=2, column=1, padx=10, pady=10, sticky="nsew")
root.rowconfigure(2, weight=1)


# Upload file button
def on_upload():
    file_path = filedialog.askopenfilename(filetypes=[("Text File", "*.txt")])
    if file_path:
        # Read as UTF-8 so the result does not depend on the system default encoding
        with open(file_path, "r", encoding="utf-8") as f:
            content = f.read()
        text_box.delete("1.0", tk.END)
        text_box.insert("1.0", content)


upload_button = ttk.Button(root, text="Upload a text", command=on_upload)
upload_button.grid(row=3, column=1, padx=10, pady=10, sticky="e")


# Generate audio
def on_generate_audio():
    voice = voice_var.get()
    rate = speed_var.get()
    # edge-tts expects a signed percentage string such as "+10%" or "-10%"
    rate_str = f"+{rate}%" if rate >= 0 else f"{rate}%"
    text = text_box.get("1.0", tk.END).strip()
    if not text:
        messagebox.showwarning("Warning", "No text Provided")
        return
    output_file = filedialog.asksaveasfilename(defaultextension=".mp3", filetypes=[("MP3", "*.mp3")])
    if not output_file:
        return
    # Note: this blocks the GUI until the audio has been generated and saved
    asyncio.run(generate_audio(text=text, voice=voice, rate=rate_str, output_file=output_file))


generate_button = ttk.Button(root, text="Generate Audio", command=on_generate_audio)
generate_button.grid(row=4, column=1, padx=10, pady=10)

# Run the GUI
root.mainloop()
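
One limitation of the script is that asyncio.run() is called from a button callback, so the window stops responding until the audio has been saved. A possible way around this, sketched here rather than taken from the original project, is to run the coroutine on a worker thread and let the Tk main loop poll a queue for the outcome. The names run_in_background and poll_results are hypothetical, and fake_job stands in for the real synthesis call so the example runs without a network connection or a display.

```python
import asyncio
import queue
import threading


def run_in_background(coro_factory, results):
    """Run an async job on a worker thread and report the outcome via a queue."""

    def worker():
        try:
            asyncio.run(coro_factory())
            results.put(("ok", None))
        except Exception as exc:  # report failures instead of losing them
            results.put(("error", exc))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def poll_results(root, results, on_done, interval_ms=100):
    """Check the queue from the Tk main loop and reschedule until a result arrives."""
    try:
        status, error = results.get_nowait()
    except queue.Empty:
        root.after(interval_ms, poll_results, root, results, on_done, interval_ms)
    else:
        on_done(status, error)


async def fake_job():
    """Stand-in for the real synthesis coroutine."""
    await asyncio.sleep(0.01)


results = queue.Queue()
run_in_background(fake_job, results).join()
print(results.get_nowait())  # ('ok', None)
```

In the application, on_generate_audio would call run_in_background(lambda: generate_audio(...), results) followed by poll_results(root, results, on_done). The messagebox call would then move out of generate_audio and into on_done, so that all Tk calls stay on the main thread.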

🚀 Conclusion

This project is a great starting point for anyone exploring Text-to-Speech applications in Python. By leveraging edge-tts and tkinter, you can create a fully functional GUI tool that makes text come alive as natural-sounding speech.

Whether you want to narrate articles, build accessibility tools, or experiment with voice synthesis, this Python TTS GUI is a practical and fun project to try out.

