{"id":122,"date":"2025-01-22T11:56:39","date_gmt":"2025-01-22T11:56:39","guid":{"rendered":"https:\/\/yu-ki.org\/?p=122"},"modified":"2025-07-11T14:31:11","modified_gmt":"2025-07-11T14:31:11","slug":"text-zu-sprache","status":"publish","type":"post","link":"https:\/\/yu-ki.org\/en\/2025\/01\/22\/text-zu-sprache\/","title":{"rendered":"Text to speech"},"content":{"rendered":"<h2 class=\"wp-block-heading\"><strong>Introduction<\/strong><\/h2>\n\n\n\n<p>The conversion of written text into spoken language - often referred to as text-to-speech (TTS) - makes it possible to present content in a barrier-free, more accessible and more versatile way.<\/p>\n\n\n\n<p>Thanks to artificial intelligence (AI), computer-generated voices now sound more natural than ever before: they can adapt pitch, speech tempo, emotion and intonation to imitate human speakers with astonishing realism.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Basics<\/strong><\/h2>\n\n\n\n<p>Text-to-speech technologies convert written content into acoustic signals. 
Modern TTS systems are based on neural networks that analyze speech patterns and generate synthetic voices from them.<\/p>\n\n\n\n<p>In the past, synthetic voices often sounded mechanical or monotonous, but today's AI models offer expressive, dynamic voices that can even convey emotions.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Areas of application &amp; possible uses<\/strong><\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Accessibility:<\/strong>&nbsp;Read-aloud functions for people with visual impairments or reading difficulties.<\/li>\n\n\n\n<li><strong>Educational offerings:<\/strong>&nbsp;Audio versions of training documents or presentations.<\/li>\n\n\n\n<li><strong>Public relations:<\/strong>&nbsp;Creation of audio statements, podcasts or video recordings.<\/li>\n\n\n\n<li><strong>Telephone and announcement systems:<\/strong>&nbsp;Automated voice announcements in hotlines or info terminals.<\/li>\n\n\n\n<li><strong>Multimedia content:<\/strong>&nbsp;Animated videos, explanatory films or social media clips with a synthesized voice.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Step-by-step procedure<\/strong><\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Step 1: Define the goal and intended use<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Should the text sound informative, motivating or emotional?<\/li>\n\n\n\n<li>For which channel or medium is the audio file intended? (e.g. 
podcast, video, website)<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Step 2: Prepare text<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Optimize content linguistically (shorter sentences, clear formulations).<\/li>\n\n\n\n<li>Shorten or adapt passages that are not relevant.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Step 3: Formulate a request to the AI<\/h3>\n\n\n\n<p>A good text-to-speech prompt should contain the following elements:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Language and voice:<\/strong>&nbsp;In which language and with what kind of voice should the text be spoken?<\/li>\n\n\n\n<li><strong>Tone and mood:<\/strong>&nbsp;Should it sound friendly, neutral, motivating or serious?<\/li>\n\n\n\n<li><strong>Speech tempo and emphasis:<\/strong>&nbsp;If desired, specify whether the voice should speak slowly, quickly or with pauses.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Step 4: Check audio file<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Listen to pronunciation and intonation.<\/li>\n\n\n\n<li>If necessary, make adjustments to the text or settings.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Step 5: Save and integrate the audio file<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Export in the desired format (e.g. MP3, WAV).<\/li>\n\n\n\n<li>Integrate into websites, videos or presentations.<\/li>\n<\/ul>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">Example from practice<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">Scenario<\/h3>\n\n\n\n<p>A non-profit organization wants to record an audio invitation for a neighborhood party and make it available on the website.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Prompt for an AI<\/h3>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>\"Read this invitation text in German in a friendly, natural female voice. Keep a calm speaking pace and emphasize the community aspect. 
The text is intended for an audio invitation that will be embedded on our website.\"<\/p>\n<\/blockquote>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n\n\n\n<p>Text-to-speech with AI opens up a wide range of possibilities for making content audible and more lively. Whether for accessibility, public relations or education - with precisely formulated prompts and careful text preparation, professional audio content can be created quickly and easily.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">Further links<\/h2>\n\n\n\n<figure class=\"wp-block-table is-style-stripes text-align-left\"><table class=\"has-fixed-layout\"><tbody><tr><td><a href=\"https:\/\/voiceflow.com\" target=\"_blank\">Voiceflow Pro<\/a><\/td><td>Create your own voice assistants - for Alexa, Google Assistant or web interfaces.<\/td><\/tr><\/tbody><\/table><\/figure>","protected":false},"excerpt":{"rendered":"<p>Introduction The conversion of written text into spoken language - often referred to as text-to-speech (TTS) - makes it possible to present content in a barrier-free, more accessible and more versatile way. Thanks to artificial intelligence (AI), computer-generated voices now sound more natural than ever before: they can adjust pitch, pace of speech, emotion and emphasis to imitate human speakers in an astonishingly realistic way. 
Basics In text-to-speech technologies, written content is [...]<\/p>","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[25],"tags":[33],"class_list":["post-122","post","type-post","status-publish","format-standard","hentry","category-sprachbasierte-ki-anwendungen","tag-links"],"_links":{"self":[{"href":"https:\/\/yu-ki.org\/en\/wp-json\/wp\/v2\/posts\/122","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/yu-ki.org\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/yu-ki.org\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/yu-ki.org\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/yu-ki.org\/en\/wp-json\/wp\/v2\/comments?post=122"}],"version-history":[{"count":6,"href":"https:\/\/yu-ki.org\/en\/wp-json\/wp\/v2\/posts\/122\/revisions"}],"predecessor-version":[{"id":513,"href":"https:\/\/yu-ki.org\/en\/wp-json\/wp\/v2\/posts\/122\/revisions\/513"}],"wp:attachment":[{"href":"https:\/\/yu-ki.org\/en\/wp-json\/wp\/v2\/media?parent=122"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/yu-ki.org\/en\/wp-json\/wp\/v2\/categories?post=122"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/yu-ki.org\/en\/wp-json\/wp\/v2\/tags?post=122"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}