Google has an unofficial text-to-speech API. It can be accessed through HTTP requests, but it is limited to strings with fewer than 100 characters. In this post we will see how to split a text longer than 100 characters so that this service still produces a correct voice intonation.
The approach is straightforward: we split the text into sentences of fewer than 100 characters according to the punctuation. Let's see how:
def parseText(text):
    """ returns a list of sentences with less than 100 characters """
    toSay = []
    punct = [',',':',';','.','?','!'] # punctuation
    words = text.split() # split on any whitespace, skipping empty strings
    sentence = ''
    for w in words:
        if w[-1] in punct: # encountered a punctuation mark
            if (len(sentence)+len(w)+1 < 100): # is there enough space?
                sentence += ' '+w # add the word
                toSay.append(sentence.strip()) # save the sentence
            else:
                if sentence.strip(): # avoid saving an empty sentence
                    toSay.append(sentence.strip()) # save the sentence
                toSay.append(w.strip()) # save the word as a sentence
            sentence = '' # start another sentence
        else:
            if (len(sentence)+len(w)+1 < 100):
                sentence += ' '+w # add the word
            else:
                if sentence.strip(): # avoid saving an empty sentence
                    toSay.append(sentence.strip()) # save the sentence
                sentence = w # start a new sentence
    if len(sentence) > 0:
        toSay.append(sentence.strip())
    return toSay
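Because parseText closes a sentence at every punctuation mark, a text with many commas is split into many short fragments, and each fragment becomes a separate request. One possible refinement, sketched below, joins adjacent fragments again as long as the result stays under the limit. The helper name merge_chunks and its limit parameter are assumptions made for this illustration; they are not part of the original code.

```python
def merge_chunks(chunks, limit=100):
    """Greedily join adjacent chunks while the result stays under `limit` characters."""
    merged = []
    current = ''
    for chunk in chunks:
        if not chunk:
            continue  # skip empty fragments
        # candidate is what the current group would look like with this chunk added
        candidate = chunk if not current else current + ' ' + chunk
        if len(candidate) < limit:
            current = candidate  # still short enough, keep growing the group
        else:
            if current:
                merged.append(current)  # close the group built so far
            current = chunk  # start a new group (an over-long chunk is passed through unchanged)
    if current:
        merged.append(current)
    return merged
```

It could be applied as toSay = merge_chunks(parseText(text)) before the requests are made. Whether fewer, longer requests sound better than many short ones is something to check by ear.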
Now we can obtain the speech with an HTTP request for each sentence:
# Python 2 (uses urllib2 and the print statement)
import os
import urllib
import urllib2

text = 'Think of color, pitch, loudness, heaviness, and hotness. Each is the topic of a branch of physics.'
print text
toSay = parseText(text)
google_translate_url = 'http://translate.google.com/translate_tts'
opener = urllib2.build_opener()
# send a browser-like User-agent header with every request
opener.addheaders = [('User-agent', 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)')]
for i,sentence in enumerate(toSay):
    print i,len(sentence), sentence
    # percent-encode the sentence so that spaces and punctuation are safe in the URL
    response = opener.open(google_translate_url+'?q='+urllib.quote(sentence)+'&tl=en')
    filename = str(i)+'speech_google.mp3'
    ofp = open(filename,'wb')
    ofp.write(response.read()) # save the MP3 data
    ofp.close()
    os.system('cvlc --play-and-exit -q '+filename) # play the file with VLC
The API returns the speech in MP3 format. The code above saves the result of each query to a file and plays it with VLC.
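The request URL for each sentence can also be built by a small helper. The sketch below uses the standard library's urlencode so that punctuation such as ? or & inside a sentence does not break the query string. The names build_tts_url and TTS_URL are assumptions made for this illustration, and since the API is unofficial, it is also an assumption that the service accepts spaces encoded as +.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2

TTS_URL = 'http://translate.google.com/translate_tts'  # endpoint used in this post

def build_tts_url(sentence, lang='en'):
    """Return the request URL for one sentence, with the query string encoded."""
    # A list of pairs keeps the parameters in a fixed order: q first, then tl
    query = urlencode([('q', sentence), ('tl', lang)])
    return TTS_URL + '?' + query
```

For example, build_tts_url('Is it hot?') yields a URL whose query string is q=Is+it+hot%3F&tl=en.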