## Saturday, November 17, 2012

### First steps with networkx

One of my favorite topics is the study of structures and, inspired by the presentation by Jacqueline Kazil and Dana Bauer at PyCon US, I started using networkx to analyze some networks. This library provides a lot of facilities for creating, visualizing and mining structured data. So I decided to write this post, which shows the first steps to get started with it. We will see how to load a network from the GML format and how to prune it in order to visualize only the nodes with a high degree. The following examples use the coappearance network of characters in the novel Les Miserables, which is freely available online. In this network each node represents a character, and a connection between two characters represents their coappearance in the same chapter.
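For readers who do not have the data file at hand, here is a minimal sketch of what a GML round trip looks like on a toy graph. The three edges are made-up stand-ins rather than the real data set, and the sketch assumes a reasonably recent networkx where `generate_gml` and `parse_gml` are available:

```python
import networkx as nx

# A tiny stand-in graph (illustrative edges, not the real coappearance data)
G = nx.Graph()
G.add_edge("Valjean", "Javert")
G.add_edge("Valjean", "Cosette")
G.add_edge("Cosette", "Marius")

# Serialize the graph to GML lines, then parse those lines back into a graph
gml_lines = list(nx.generate_gml(G))
H = nx.parse_gml(gml_lines)

print(sorted(H.nodes()))
print(H.number_of_edges())
```

Reading from a file works the same way, except that the lines come from disk instead of from memory.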

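```
# read the graph (gml format)
import networkx as nx
from pylab import figure, hist, show

# NOTE: the file name is an assumption; point it at your local copy of the data
G = nx.read_gml('lesmiserables.gml')

# drawing the full network
figure(1)
nx.draw_spring(G,node_size=0,edge_color='b',alpha=.2,font_size=10,with_labels=True)
show()
```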
This should be the result:

It's easy to see that this plot is not really helpful. Most of the details of the network are still hidden, and it's impossible to tell which nodes are the most important. Let's plot a histogram of the number of connections per node:
```
# distribution of the degree
figure(2)
d = dict(nx.degree(G)) # node -> degree; dict() also works with newer networkx
hist(list(d.values()),bins=15)
show()
```
The result should be as follows:

Looking at this histogram, we can see that only a few characters have more than ten connections, so we decide to visualize only those:
```
def trim_nodes(G,d):
    """ returns a copy of G without
    the nodes with a degree less than or equal to d """
    Gt = G.copy()
    dn = dict(nx.degree(Gt)) # snapshot of the degrees before removing anything
    for n in list(Gt.nodes()): # iterate over a copy, since we remove nodes
        if dn[n] <= d:
            Gt.remove_node(n)
    return Gt

# drawing the network without
# nodes with a degree of 10 or less
Gt = trim_nodes(G,10)
figure(3)
nx.draw(Gt,node_size=0,node_color='w',edge_color='b',alpha=.2,with_labels=True)
show()
```
The resulting graph is the final result of the analysis. This time the plot lets us see which characters are the most relevant and how they are related to each other according to their coappearance across the chapters.
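The same pruning can also be written with `G.subgraph` instead of removing nodes one by one. The sketch below is an alternative formulation, not the approach used in `trim_nodes`, and it runs on a made-up toy graph with a hypothetical threshold of 1 so that the result is easy to check by hand:

```python
import networkx as nx

# Toy graph (illustrative only): 'a' has degree 3, 'e' has degree 1
G = nx.Graph()
G.add_edges_from([("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e")])

# Snapshot the degrees, then keep only nodes with degree above the threshold
degrees = dict(G.degree())
threshold = 1
kept = sorted(n for n, k in degrees.items() if k > threshold)

# subgraph() returns a view; copy() turns it into an independent graph
H = G.subgraph(kept).copy()

print(kept)
print(sorted(H.edges()))
```

As with `trim_nodes`, the degrees are computed once on the full graph, so a node is judged by its original number of connections and not by what is left after pruning.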

## Saturday, November 3, 2012

### Text to Speech with correct intonation

Google has an unofficial text-to-speech API. It can be accessed through HTTP requests, but it is limited to strings of fewer than 100 characters. In this post we will see how to split a longer text in order to obtain correct voice intonation with this service. The approach is straightforward: we split the text into sentences of fewer than 100 characters according to the punctuation. Let's see how:
```
def parseText(text):
    """ returns a list of sentences with less than 100 characters """
    toSay = []
    punct = [',',':',';','.','?','!'] # punctuation
    words = text.split(' ')
    sentence = ''
    for w in words:
        if w and w[-1] in punct: # encountered a punctuation mark
            if (len(sentence)+len(w)+1 < 100): # is there enough space?
                sentence += ' '+w # add the word
                toSay.append(sentence.strip()) # save the sentence
            else:
                toSay.append(sentence.strip()) # save the sentence
                toSay.append(w.strip()) # save the word as a sentence
            sentence = '' # start another sentence
        else:
            if (len(sentence)+len(w)+1 < 100):
                sentence += ' '+w # add the word
            else:
                toSay.append(sentence.strip()) # save the sentence
                sentence = w # start a new sentence
    if len(sentence) > 0:
        toSay.append(sentence.strip())
    return toSay
```
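As a variation on the same idea, the splitting can also be done with a regular expression, packing several clauses into one chunk as long as it stays under the limit. This is only a sketch of an alternative, not the function used in this post; `chunk_text` and its `limit` parameter are hypothetical names introduced for illustration:

```python
import re

def chunk_text(text, limit=100):
    """Greedily pack punctuation-delimited pieces into chunks shorter than limit."""
    # Split after each punctuation mark that is followed by whitespace,
    # keeping the mark attached to the piece before it
    pieces = re.split(r'(?<=[,:;.?!])\s+', text.strip())
    chunks, current = [], ''
    for piece in pieces:
        candidate = (current + ' ' + piece).strip()
        if len(candidate) < limit:
            current = candidate  # the piece still fits in the current chunk
        else:
            if current:
                chunks.append(current)
            # NOTE: a single piece of limit or more characters is kept whole here
            current = piece
    if current:
        chunks.append(current)
    return chunks

text = ('Think of color, pitch, loudness, heaviness, and hotness. '
        'Each is the topic of a branch of physics.')
for chunk in chunk_text(text, limit=60):
    print(len(chunk), chunk)
```

Packing clauses together means fewer requests, while every chunk still ends at a punctuation mark, which is where a pause in the voice sounds natural.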
Now we can obtain the speech with an HTTP request for each sentence:
```
text = 'Think of color, pitch, loudness, heaviness, and hotness. Each is the topic of a branch of physics.'

print(text)
toSay = parseText(text)