
Saturday, November 3, 2012

Text to Speech with correct intonation

Google has an unofficial text to speech API. It can be accessed through HTTP requests, but it is limited to strings of fewer than 100 characters. In this post we will see how to split a longer text so that the service still produces a correct voice intonation. The approach is straightforward: we split the text at punctuation marks into sentences of fewer than 100 characters. Let's see how:
def parseText(text):
    """ returns a list of sentences with less than 100 characters """
    toSay = []
    punct = [',', ':', ';', '.', '?', '!']  # punctuation
    words = text.split()  # split on any whitespace, so there are no empty words
    sentence = ''
    for w in words:
        if w[-1] in punct:  # encountered a punctuation mark
            if len(sentence) + len(w) + 1 < 100:  # is there enough space?
                sentence += ' ' + w  # add the word
                toSay.append(sentence.strip())  # save the sentence
            else:
                if sentence:  # nothing to save if no words are pending
                    toSay.append(sentence.strip())  # save the sentence
                toSay.append(w)  # save the word as a sentence
            sentence = ''  # start another sentence
        else:
            if len(sentence) + len(w) + 1 < 100:
                sentence += ' ' + w  # add the word
            else:
                if sentence:  # nothing to save if no words are pending
                    toSay.append(sentence.strip())  # save the sentence
                sentence = w  # start a new sentence
    if len(sentence) > 0:
        toSay.append(sentence.strip())
    return toSay
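Note that every word ending in a punctuation mark closes a chunk, so a short clause such as "pitch," becomes a request of its own. When all clauses are already under the limit, the expected chunks can be cross-checked with a single regular-expression split. The snippet below is only a sketch for that check: split_at_punctuation is a hypothetical helper name, and unlike parseText it does not break up clauses that are too long.

```python
import re


def split_at_punctuation(text):
    """Split text at whitespace that follows a punctuation mark.

    Hypothetical helper for cross-checking parseText; it does NOT
    enforce the 100-character limit on over-long clauses.
    """
    parts = re.split(r'(?<=[,:;.?!])\s+', text.strip())
    return [p for p in parts if p]  # drop empty pieces, if any


text = ('Think of color, pitch, loudness, heaviness, and hotness. '
        'Each is the topic of a branch of physics.')
chunks = split_at_punctuation(text)
for chunk in chunks:
    print(len(chunk), chunk)
```

For this sample text each clause is short, so the chunks printed here should match what parseText returns.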
Now we can obtain the speech with an HTTP request for each sentence:
import os
import urllib
import urllib2

text = 'Think of color, pitch, loudness, heaviness, and hotness. Each is the topic of a branch of physics.'

print text
toSay = parseText(text)

google_translate_url = 'http://translate.google.com/translate_tts'
opener = urllib2.build_opener()
# send a browser-like User-agent header with each request
opener.addheaders = [('User-agent', 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)')]

for i, sentence in enumerate(toSay):
    print i, len(sentence), sentence
    # percent-encode the whole sentence, not only the spaces
    response = opener.open(google_translate_url + '?q=' + urllib.quote(sentence) + '&tl=en')
    filename = str(i) + 'speech_google.mp3'
    ofp = open(filename, 'wb')
    ofp.write(response.read())  # save the audio returned by the service
    ofp.close()
    os.system('cvlc --play-and-exit -q ' + filename)  # play it and return
The API returns the speech in MP3 format. The code above saves the result of each query to a file and plays it with VLC through its cvlc command-line interface.
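The listing above targets Python 2 (urllib2 and the print statement). A rough Python 3 sketch of the same request step could look like the following. The function names build_tts_url and fetch_speech are hypothetical, and the sketch assumes the endpoint and the q and tl parameters behave as described in this post, which is not guaranteed for an unofficial service.

```python
import urllib.parse
import urllib.request

# Endpoint and User-agent taken from the post; the service is unofficial,
# so its behaviour is an assumption here.
TTS_URL = 'http://translate.google.com/translate_tts'
USER_AGENT = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)'


def build_tts_url(sentence, lang='en'):
    """Return the request URL for one chunk, percent-encoding the text."""
    return TTS_URL + '?q=' + urllib.parse.quote(sentence, safe='') + '&tl=' + lang


def fetch_speech(sentence, path, lang='en'):
    """Download the audio for one chunk and save it to path (needs network)."""
    request = urllib.request.Request(build_tts_url(sentence, lang),
                                     headers={'User-Agent': USER_AGENT})
    with urllib.request.urlopen(request) as response, open(path, 'wb') as ofp:
        ofp.write(response.read())


if __name__ == '__main__':
    # Only builds and prints the URL; call fetch_speech to actually download.
    print(build_tts_url('Each is the topic of a branch of physics.'))
```

Keeping the URL construction in its own function makes it possible to check the encoding without touching the network; fetch_speech can then be called once per chunk, for example fetch_speech(sentence, str(i) + 'speech_google.mp3').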