## Sunday, September 15, 2013

### Self Organizing Maps

Self-Organizing Maps (SOM), also known as Kohonen maps, are a type of artificial neural network able to convert complex, nonlinear statistical relationships between high-dimensional data items into simple geometric relationships on a low-dimensional display. In a SOM the neurons are organized in a two-dimensional lattice and each neuron is fully connected to all the source nodes in the input layer. The following illustration of a SOM is from Haykin (1999):

Each neuron n has an associated vector of weights wn. Training a SOM involves stepping through several training iterations until the items in your dataset are learnt by the SOM. For each pattern x one neuron n will "win" (which means that wn is the weight vector most similar to x) and this winning neuron will have its weights adjusted so that it will respond more strongly to the input the next time it sees it (which means that the distance between x and wn will be smaller). As different neurons win for different patterns, their ability to recognize those particular patterns will increase. The training algorithm can be summarized as follows:
1. Initialize the weights of each neuron.
2. Initialize t = 0
3. Randomly pick an input x from the dataset
4. Determine the winning neuron i as the neuron such that

$$i = \arg\min_n \lVert x - w_n \rVert$$

5. Adapt the weights of each neuron n according to the following rule

$$w_n(t+1) = w_n(t) + \eta(t)\, h(i)\, (x - w_n(t))$$

6. Increment t by 1
7. If t < tmax go to step 3

Here η(t) is called the learning rate and h(i) is called the neighborhood function, which has high values for i and the neurons close to i on the lattice (a Gaussian centered on i is a good example of a neighborhood function). As t increases, η decreases and h decreases its spread. This way, at each training step the weights of the neurons close to the winning one are adjusted to have a stronger response to the current pattern. After the training process, the locations of the neurons become ordered and a meaningful coordinate system for the input features is created on the lattice. So, if we consider the coordinates of the associated winning neuron for each pattern, the SOM forms a topographic map of the input patterns.
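
The steps above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the MiniSom code: the function name `train_som`, its parameters, the uniform random initialization and the decay schedule used for η and for the spread of h are all assumptions made for the example.

```python
import numpy as np

def train_som(data, rows, cols, t_max, eta0=0.5, sigma0=1.0, seed=0):
    """Train a rows x cols SOM on data (n_samples x dim) and return the weights."""
    rng = np.random.RandomState(seed)
    dim = data.shape[1]
    # step 1: random initial weights, one vector per neuron
    weights = rng.rand(rows, cols, dim)
    # lattice coordinates of every neuron, used by the neighborhood function
    gx, gy = np.meshgrid(np.arange(rows), np.arange(cols), indexing='ij')
    for t in range(t_max):  # steps 2, 6 and 7
        # assumed decay schedule: both eta and sigma shrink as t grows
        decay = 1.0 / (1.0 + t / (t_max / 2.0))
        eta, sigma = eta0 * decay, sigma0 * decay
        # step 3: pick a random pattern
        x = data[rng.randint(len(data))]
        # step 4: the winner is the neuron whose weights are closest to x
        dists = np.linalg.norm(weights - x, axis=2)
        i = np.unravel_index(np.argmin(dists), dists.shape)
        # Gaussian neighborhood centered on the winner
        h = np.exp(-((gx - i[0]) ** 2 + (gy - i[1]) ** 2) / (2.0 * sigma ** 2))
        # step 5: move every neuron towards x, scaled by eta and h
        weights += eta * h[:, :, np.newaxis] * (x - weights)
    return weights
```

Calling `train_som(data, 7, 7, 100)` on a matrix with one pattern per row would return a 7-by-7 grid of weight vectors. The winner gets the largest update because h equals 1 at its own position and falls off with lattice distance.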

MiniSom is a minimalistic, NumPy-based implementation of the SOM. I made it during the experiments for my thesis in order to have a fully hackable SOM algorithm, and lately I decided to release it on GitHub. The next part of this post will show how to train MiniSom on the Iris Dataset and how to visualize the result. The first step is to import and normalize the data:
```
from numpy import genfromtxt,array,linalg,zeros,apply_along_axis

# reading the iris dataset in the csv format
data = genfromtxt('iris.csv', delimiter=',',usecols=(0,1,2,3))
# normalization to unity of each pattern in the data
data = apply_along_axis(lambda x: x/linalg.norm(x),1,data)
```
The snippet above reads the dataset from a CSV file and creates a matrix where each row corresponds to a pattern. In this case, each pattern has 4 dimensions. (Note that only the first 4 columns of the file are used because the fifth column contains the labels.) The training process can be started as follows:
```
from minisom import MiniSom
### Initialization and training ###
som = MiniSom(7,7,4,sigma=1.0,learning_rate=0.5)
som.random_weights_init(data)
print("Training...")
som.train_random(data,100) # training with 100 iterations
```
Now we have a 7-by-7 SOM trained on our dataset. MiniSom uses a Gaussian as the neighborhood function and its initial spread is specified with the parameter sigma, while the parameter learning_rate specifies the initial learning rate. The training algorithm implemented decreases both parameters as training progresses. This allows rapid initial training of the neural network that is then "fine tuned" as training progresses.
To visualize the result of the training we can plot the average distance map of the weights on the map and the coordinates of the associated winning neuron for each pattern:
```
from pylab import plot,axis,show,pcolor,colorbar,bone
bone()
pcolor(som.distance_map().T) # distance map as background
colorbar()
# loading the labels from the fifth column
target = genfromtxt('iris.csv',
                    delimiter=',',usecols=(4),dtype=str)
t = zeros(len(target),dtype=int)
t[target == 'setosa'] = 0
t[target == 'versicolor'] = 1
t[target == 'virginica'] = 2
# use different colors and markers for each label
markers = ['o','s','D']
colors = ['r','g','b']
for cnt,xx in enumerate(data):
    w = som.winner(xx) # getting the winner
    # place a marker on the winning position for the sample xx
    plot(w[0]+.5,w[1]+.5,markers[t[cnt]],markerfacecolor='None',
         markeredgecolor=colors[t[cnt]],markersize=12,markeredgewidth=2)
axis([0,som.weights.shape[0],0,som.weights.shape[1]])
show() # show the figure
```
The result should be like the following:

For each pattern in the dataset the corresponding winning neuron has been marked. Each type of marker represents a class of the iris data (the classes are setosa, versicolor and virginica and they are represented with red, green and blue colors, respectively). The average distance map of the weights is used as background (the values are shown in the colorbar on the right). As expected from previous studies on this dataset, the patterns are grouped according to the class they belong to and a small fraction of Iris virginica is mixed with Iris versicolor.

For a more detailed explanation of the SOM algorithm you can look at its inventor's paper.

1. Hey, I've been using your MiniSom, it works great.
I want to ask, how can we save the trained model for future use, so we don't have to train again? Thanks in advance :)

1. Hi Guntur, MiniSom has the attribute weights, a NumPy matrix that can be saved with the function numpy.save. If you want to reuse a MiniSom, you can create an instance, train it and save the weights. When you want to reuse that network, you can create a new instance, load the weights from a file and overwrite the attribute weights.

2. thanks for the quick reply.. i will try that :)
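
As a rough sketch of the save-and-reload idea discussed in this thread, the snippet below uses a random array as a stand-in for the weights of a trained map. The file name and the stand-in array are made up for the example; only `numpy.save` and `numpy.load` are the actual functions involved.

```python
import os
import tempfile
import numpy as np

# stand-in for the weights of a trained 7x7 SOM on 4-dimensional data;
# with MiniSom this would be the array stored in the weights attribute
trained_weights = np.random.RandomState(0).rand(7, 7, 4)

# hypothetical file location, chosen only for this example
path = os.path.join(tempfile.mkdtemp(), 'som_weights.npy')
np.save(path, trained_weights)      # right after training

restored_weights = np.load(path)    # later, in a new session
# a new instance with the same map size and input length could then
# have its weights attribute overwritten with restored_weights
```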

2. This comment has been removed by the author.

3. This comment has been removed by the author.

4. How does the SOM react to categorical inputs, e.g. [1,2,3,..] are categorized people's favorite foods? Is there a way to work with this type of data? Does the training assume an ordinal number set or does it learn this on its own?

1. The algorithm assumes that the order of the values has a meaning. I suggest you try a binary encoder like this one: http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelBinarizer.html#sklearn.preprocessing.LabelBinarizer
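
To illustrate what a binary encoding does, here is a minimal NumPy sketch that produces one column per category, similar in spirit to the encoder linked above. The food codes are hypothetical.

```python
import numpy as np

# hypothetical categorical feature: favorite food coded as 1, 2, 3
codes = np.array([1, 3, 2, 1, 3])

# one binary column per category, so no ordering is implied
categories = np.unique(codes)
one_hot = (codes[:, np.newaxis] == categories).astype(float)
```

With this encoding every pair of distinct categories is at the same distance, so the SOM no longer treats food 3 as "further" from food 1 than food 2 is.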

5. Hello.
This looks like a promising library.
I noted you used a 2D map, 7 by 7, for output neurons. Is 3D map also possible? Could you please illustrate this if possible.
Thanks

1. Hi John, I'm sorry but this library supports only 2D maps at the moment and I don't plan to extend it.

2. Hello
Thanks for the information

6. This comment has been removed by the author.

1. Hi Walter,

I can't read your data type but I'm pretty sure that the problem is in the shape of matrix that you are using. Please check that it has no more than 2 dimensions.

7. Hi Walter,

Hi, can I install the package on my Ubuntu via terminal?

1. yes, you can. Check the documentation: https://github.com/JustGlowing/minisom

2. Thanks Just Glowing,
I have gone through your github account before.
I am unable to identify despite my effort.
Please guide me with the 'pip install' command for the installation in ubuntu.

4. If you want to use pip, you just need to run this command in your terminal:

pip install git+https://github.com/JustGlowing/minisom

8. Hello
From the above illustration on iris, suppose we have a flower with the following measurements: 20,8,16 and 7. How can one use the trained SOM to determine the class of this new flower/data?
Thanks

1. You can get the position on the map using the method winner and assign the class according to the class of the samples mapped in the same (or close) position.

2. Thanks
I will try it out
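
The approach suggested in this thread could be sketched as follows. The helper names (`winner`, `build_label_map`, `classify`) are made up for this example and work on a plain weights array rather than on a MiniSom object. Note that a new sample has to be normalized the same way as the training data before it is mapped.

```python
import numpy as np
from collections import Counter, defaultdict

def winner(weights, x):
    """Lattice coordinates of the neuron whose weights are closest to x."""
    dists = np.linalg.norm(weights - x, axis=2)
    return tuple(int(v) for v in np.unravel_index(np.argmin(dists), dists.shape))

def build_label_map(weights, data, labels):
    """Majority label of the training samples mapped to each position."""
    votes = defaultdict(Counter)
    for x, label in zip(data, labels):
        votes[winner(weights, x)][label] += 1
    return {pos: c.most_common(1)[0][0] for pos, c in votes.items()}

def classify(weights, label_map, x):
    """Label of the labelled position closest (on the lattice) to the winner of x."""
    w = np.array(winner(weights, x))
    nearest = min(label_map, key=lambda pos: np.sum((np.array(pos) - w) ** 2))
    return label_map[nearest]
```

If the winner of the new sample is a position that no training sample was mapped to, the sketch falls back to the closest labelled position on the lattice.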

9. hello
I am using your MiniSom for anomaly detection in high dimensional data but I have a problem with the choice of the input parameters.
So how can we choose these parameters using the dimension of the input data?

1. Hi, I covered parameters selection in a different post:

http://glowingpython.blogspot.co.uk/2014/04/parameters-selection-with-cross.html

10. Hi, I am using your MiniSom for anomaly detection on data stored in Elasticsearch, so how can I load my data in your code? And how do I replace your iris.csv with my data? Should I convert my data to a numpy matrix? If yes, how?

1. Hi pikita, if you want to parse a CSV I recommend using pandas. You'll find the function read_csv pretty useful.

2. hi, thank u for your answer but i didn't understand how to use pandas have u a link ?

3. Check this out: http://chrisalbon.com/python/pandas_dataframe_importing_csv.html

4. i have all my data in elasticsearch i don t know how to recover specially the code I want in your code, and then convert it to a numpy matrix ??

11. Hi, how can I measure the accuracy in this code, i.e. how many samples are classified correctly in each class?

1. Hi Naveen, no classification is done here.

12. Fantastic piece of code, thanks for sharing it. I've been playing with it to classify oceanography data (Pacific Ocean temperatures) and am wondering about reproducibility. While it picks up the obvious patterns (i.e. El Nino) it seems that each time I run the algorithm I get different maps: sometimes patterns show up in the maps and other times they are not present, and increasing the number of iterations doesn't seem to cause a convergence to consistent results. This is also true for your supplied iris and colors examples. Am I missing something, or do SOMs truly provide different results each time the algorithm is run?

1. Hi Eric, the result of the algorithm strongly depends on the initialization of the weights and they are randomly initialized. To always have the same results you can pass the argument random_seed to the constructor of MiniSom. If the seed is always the same, the weights will always be initialized the same way, hence you'll always have the same results.
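
The effect of a fixed seed can be illustrated without MiniSom. In this sketch, `init_weights` is a hypothetical helper that mimics a random initialization with a seeded NumPy generator.

```python
import numpy as np

def init_weights(rows, cols, dim, seed=None):
    """Random initial weights; a fixed seed makes them repeatable."""
    rng = np.random.RandomState(seed)
    return rng.rand(rows, cols, dim)

a = init_weights(7, 7, 4, seed=10)
b = init_weights(7, 7, 4, seed=10)   # same seed, same weights as a
c = init_weights(7, 7, 4, seed=11)   # different seed, different weights
```

The same reasoning applies to the random choice of the training samples: if it is driven by the same seeded generator, two runs see the patterns in the same order.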

13. Nice work. I have gone through your code, and I just wonder why 'som.random_weights_init(data)' is called, since the weights have already been initialized in the MiniSom __init__ function. Also, 'som.random_weights_init(data)' uses normalized samples of the data as weights; is that a correct way? And 'som.random_weights_init(data)' replaces the initial weights.

1. Hi, som.random_weights_init initializes the weights by picking samples from data. This is supposed to speed up the convergence.

self.weights[it.multi_index] = data[self.random_generator.randint(len(data))]
self.weights[it.multi_index] = self.weights[it.multi_index]/fast_norm(self.weights[it.multi_index])
In the function, we iterate over the activation_map; the first line picks samples from data and assigns them to the weights. That is to say, we sweep away the initialization done in MiniSom's __init__ function. Why so?

3. Because that initialization seemed more suitable in that case. The constructor initializes the weights using np.random.rand by default. It's totally fine to choose a different initialization if the default one doesn't suit you.

4. Thank you. That means both __init__ and random_weights_init are fine? And we can choose one of them or define a new one.

14. Hi, how can I plot the contour of the neighborhood for each cluster? Does the weights matrix correspond to the U-matrix?

1. Hi Mayra, you can compute the U-matrix with the method distance_map and you can plot it using pcolor.
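
For readers wondering how a U-matrix relates to the weights, here is one way it could be computed from a plain weights array. This is a sketch under assumptions (8-neighborhood on the lattice, values scaled by the maximum) and is not necessarily identical to what distance_map does internally; the name `u_matrix` is made up.

```python
import numpy as np

def u_matrix(weights):
    """For each neuron, summed distance to its lattice neighbors, scaled to [0, 1]."""
    rows, cols, _ = weights.shape
    um = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            # visit the 3x3 window around (r, c), clipped at the borders
            for rr in range(max(r - 1, 0), min(r + 2, rows)):
                for cc in range(max(c - 1, 0), min(c + 2, cols)):
                    um[r, c] += np.linalg.norm(weights[r, c] - weights[rr, cc])
    peak = um.max()
    # avoid dividing by zero when all the weights are identical
    return um / peak if peak > 0 else um
```

High values mark neurons whose weights differ a lot from those of their neighbors, which is why the borders between clusters show up as ridges when the matrix is plotted with pcolor.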

15. Hi, thanks for answering my question. I have a doubt. I was testing your code with the MNIST dataset, which is similar to the digits dataset from Python, but the difference is the size of the images. I trained the SOM with a sample of 225 random digits and the dimension of my grid is 15*15. When I plot the U-matrix with the method distance_map, each coordinate of my plot should have a digit, right? Why is the position of my winning neuron the same for different samples? There are empty positions in my plot without any digits. When training the SOM with a larger sample there are empty positions in my plot too. Should that happen? With other implementations, for example the kohonen package for R, this does not happen. Could you help me understand why this happens? Thanks

1. Hi again,

I'll try to answer your questions one by one.

- If your samples represent digits, you can associate a cell in the map with a digit. See this example https://github.com/JustGlowing/minisom/blob/master/examples/example_digits.py

- If two samples have the same winning neuron, it means that the two samples are similar.

- With the training and initialization methods implemented, it's normal that some areas don't have winning neurons for the samples used for training, especially between regions that activate for samples that are different.

16. Thank you so much for providing the code. I hope you can help me with my question. I thought that it is possible to apply SOM for outlier detection in an unsupervised manner (without labeled data). In the iris dataset the data has labels right:

t = zeros(len(target),dtype=int)
t[target == 'setosa'] = 0
t[target == 'versicolor'] = 1
t[target == 'virginica'] = 2

If I just have numerical data without any labels, how can I use your SOM approach?

Thank you very, very much for your help! I really appreciate it! :-)

1. Nevermind! Got it! Thanks a lot! :-)

2. But how can I count the amount of samples in one position?

3. Hi, look at the documentation of the method win_map, I'm sure that will answer your question.
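
As an illustration of counting the samples per position, the sketch below works directly on a weights array instead of going through win_map. The function name `hit_counts` is an assumption made for the example.

```python
import numpy as np

def hit_counts(weights, data):
    """Number of samples whose winning neuron is at each lattice position."""
    counts = np.zeros(weights.shape[:2], dtype=int)
    for x in data:
        # the winner is the neuron with the weights closest to x
        dists = np.linalg.norm(weights - x, axis=2)
        counts[np.unravel_index(np.argmin(dists), dists.shape)] += 1
    return counts
```

No labels are needed here: positions with a count of zero never win, and positions with very few hits may be worth a closer look when searching for outliers.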

17. Hi, that's a great library you have implemented. I would like to try and combine a self organising map and a multilayer perceptron. I have used your network to cluster character images. Is there any way to save the clustering so I can feed it as input to a multilayer perceptron?

1. Hi, you need to use the method winner on each sample in your dataset and save the result in a format accepted by your MLP implementation.

2. Hi, thank you for the fast response. I shall try this, thank you.
I have labels associated with each image in the dataset very similar to t in your example. I want to associate the winner with a label if possible. When you use a for loop and use the enumerate function on the data, is the code t[cnt] associating your t label with the winner?

3. Yes, with t[cnt] the label of the sample is considered to assign a color to the corresponding marker.

18. Hi,

thank you for this great library.

When we find the outliers on the map by looking at the highest distance (close to 1), how can we know to which observation in the original data it corresponds ? In other words, is there an inverse function of the winner() function to reverse the mapping from the input space to the output map ?

1. hi, winner() can't be inverted, but win_map() will solve your problem.
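
One way to go back from a map position to the observations is to record, for every winning position, the row indices of the samples mapped there. The sketch below does this on a plain weights array; `samples_by_position` is a hypothetical helper, not a MiniSom method.

```python
import numpy as np
from collections import defaultdict

def samples_by_position(weights, data):
    """Map each winning position to the indices of the samples it wins."""
    index_map = defaultdict(list)
    for idx, x in enumerate(data):
        dists = np.linalg.norm(weights - x, axis=2)
        pos = tuple(int(v) for v in np.unravel_index(np.argmin(dists), dists.shape))
        index_map[pos].append(idx)
    return dict(index_map)
```

Looking up the coordinates of a high-distance cell in the returned dictionary gives the rows of the original data that landed on it. A missing key means no sample was mapped to that cell.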

19. Hi, do you have any function to calculate the SOM-MQE? Thanks!

1. Hi, I'm not sure this is exactly what you need but it may help you: https://github.com/JustGlowing/minisom/blob/master/minisom.py#L167
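
For reference, a mean quantization error is commonly defined as the average distance between each sample and the weights of its winning neuron. The sketch below computes it on a plain weights array; whether this matches the linked function exactly is an assumption, and the name `quantization_error` is used here only for illustration.

```python
import numpy as np

def quantization_error(weights, data):
    """Mean distance between each sample and the weights of its winning neuron."""
    # one row per neuron, regardless of its position on the lattice
    flat = weights.reshape(-1, weights.shape[-1])
    # distance from every sample to every neuron, then keep the smallest per sample
    dists = np.linalg.norm(data[:, np.newaxis, :] - flat[np.newaxis, :, :], axis=2)
    return dists.min(axis=1).mean()
```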

20. Hi, is there a way to change the "lattice", "shape" and "neigh" of the map? Thanks!

1. Hi Jianshe, there's no way to do that without changing the code currently. The philosophy of the project is to be minimal.

2. Hi JustGlowing,

Thanks. The reason why I am asking this is that the clustering results and quantization errors are different from the results I got from MATLAB somtoolbox. After reviewing all the code inside, I think the possible reason may be the parameter settings at the training stage. After all, I think you did a brilliant job. Thanks!

21. Hi,

I am having trouble with the pylab code from your example example_digits.py, which I found here: https://github.com/JustGlowing/minisom/blob/master/examples/example_digits.py

I have digit images I want to plot, but the problem is that a window titled figure 1 comes up with an axis all by itself and a separate window titled figure 2 pops up with the clustered images but without an axis. I tried to combine the 2 figures by removing figure(2) from the code, and even though only one window titled figure 1 pops up there is no axis, just the clustered images alone. On top of that, not all of the clustered images show up. Do you know of anyone else who had this problem and fixed it, or if there is a way to combine the 2 figures, or a way for the axis and clustered images to show up using 1 figure?

Thanks