SPIN (Sorting Points Into Neighborhoods) Algorithm
ELTEM Neurogenomics Course
Biozentrum – Basel
December 1, 2006
Presented by Tal Shay and Assif Yitzhaky
Weizmann Institute of Science
Rehovot, Israel
D. Tsafrir, I. Tsafrir, L. Ein-Dor, O. Zuk, D.A. Notterman, and E. Domany.
Sorting points into neighborhoods (SPIN): data analysis and
visualization by ordering distance matrices. Bioinformatics 2005 21:
2301-2308
SPIN
• A new method for mining gene expression
data
• The philosophy behind SPIN – feel and
play with the data
What is sorting?
One-dimensional ordering of a set of objects
according to a particular trait, e.g. time or height.
Distance matrices
[Figure: the expression matrix (genes × samples) with the genes distance matrix and the samples distance matrix derived from it; color scale runs from low to high distance.]
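As a minimal sketch of how the two distance matrices relate to the expression matrix: the genes distance matrix compares rows, and the samples distance matrix compares columns. The slides do not name a distance measure, so Euclidean distance is an assumption here, and the expression values are made up for illustration.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def distance_matrix(vectors):
    """Symmetric matrix of pairwise distances between the given vectors."""
    n = len(vectors)
    return [[euclidean(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]

# Toy expression matrix (made-up numbers): rows are genes, columns are samples.
expression = [
    [1.0, 1.2, 5.0, 5.1],
    [0.9, 1.1, 4.8, 5.3],
    [3.0, 3.1, 0.5, 0.4],
]

# Genes distance matrix: compare rows of the expression matrix.
genes_dist = distance_matrix(expression)

# Samples distance matrix: compare columns, i.e. rows of the transpose.
samples = [list(col) for col in zip(*expression)]
samples_dist = distance_matrix(samples)
```

With 3 genes and 4 samples, `genes_dist` is 3 × 3 and `samples_dist` is 4 × 4; both are symmetric with zeros on the diagonal.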
Multiple clusters
[Figure: PCA projection of the points (axes PCA 1, PCA 2) and their distance matrix in two orderings: Unsorted and SPIN.]
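The contrast between an unsorted and a sorted distance matrix can be illustrated on a toy case. The sketch below uses made-up 1D points in two groups and, as a stand-in for SPIN itself, simply sorts them by value; the point is only that applying one ordering to both rows and columns moves small distances into blocks along the main diagonal.

```python
import random

def distance_matrix(points):
    """Pairwise absolute differences between 1D points."""
    return [[abs(p - q) for q in points] for p in points]

def reorder(dist, order):
    """Apply the same ordering to rows and columns of a distance matrix."""
    return [[dist[i][j] for j in order] for i in order]

# Two well-separated 1D groups (made-up values), presented in shuffled order.
points = [0.0, 0.1, 0.2, 0.3, 10.0, 10.1, 10.2, 10.3]
rng = random.Random(0)
shuffled = points[:]
rng.shuffle(shuffled)

unsorted_dist = distance_matrix(shuffled)

# For 1D data, an ordering that keeps neighbors together is the sort order.
order = sorted(range(len(shuffled)), key=lambda i: shuffled[i])
sorted_dist = reorder(unsorted_dist, order)
```

In `sorted_dist` the two groups appear as two low-distance blocks on the diagonal, while the off-diagonal blocks hold the large between-group distances.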
Q: How many objects?
Q: What are their shapes?
Q: What is the relative orientation?
Decode this matrix
[Figure: distance matrix; color scale runs from large to small distances.]
Q: How many objects?
Q: What are their shapes?
Q: What is the relative orientation?
SPIN Interface Layout
• Expression matrix
• Distance matrix
• PCA Analysis
• Cluster Scores & Diagnostics
• SPC buttons
• Dendrogram
• Transpose
• Sorting buttons
Side-To-Side on a circle:
[Figure: points on a circle (top) and the corresponding sorted distance matrices (bottom); panels: Side to side, Neighborhood.]
• Side to side: penalizes blue points far from the main diagonal.
• Neighborhood: penalizes red points near the main diagonal.
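The two penalties can be written as simple energies over an ordered distance matrix. The sketch below is an illustration only, not the SPIN software: the centered linear weight vector, the Gaussian weight width `sigma`, the stopping rule, and the toy circle data are all assumptions. `sts_sort` is a hypothetical helper that repeatedly scores each point by its weighted distance to the others and re-sorts by that score.

```python
import math
import random

def reorder(dist, order):
    """Apply one ordering to both rows and columns of a distance matrix."""
    return [[dist[i][j] for j in order] for i in order]

def sts_energy(dist):
    """Sum of D[i][j] * X[i] * X[j] with centered linear weights X.

    Lower values mean large distances sit in the corners, far from the diagonal.
    """
    n = len(dist)
    x = [i - (n - 1) / 2 for i in range(n)]
    return sum(dist[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def neighborhood_energy(dist, sigma=1.0):
    """Sum of D[i][j] * W[i][j] with Gaussian weights around the diagonal.

    Lower values mean small distances sit close to the main diagonal.
    """
    n = len(dist)
    return sum(
        dist[i][j] * math.exp(-((i - j) ** 2) / (2 * sigma ** 2))
        for i in range(n) for j in range(n)
    )

def sts_sort(dist, max_iter=100):
    """Iterative Side-to-Side style sorting; returns an ordering of the points."""
    n = len(dist)
    x = [i - (n - 1) / 2 for i in range(n)]
    order = list(range(n))
    for _ in range(max_iter):
        current = reorder(dist, order)
        # Score each point by its weighted distance to all other points.
        s = [sum(d * w for d, w in zip(row, x)) for row in current]
        # Points with the largest score move to the start of the ordering.
        ranking = sorted(range(n), key=lambda k: -s[k])
        if ranking == list(range(n)):
            break  # the ordering no longer changes
        order = [order[k] for k in ranking]
    return order

# Toy data (assumed): 12 points on a circle of radius 5, in shuffled order.
rng = random.Random(0)
angles = [2 * math.pi * k / 12 for k in range(12)]
rng.shuffle(angles)
points = [(5 * math.cos(a), 5 * math.sin(a)) for a in angles]
dist = [[math.hypot(p[0] - q[0], p[1] - q[1]) for q in points] for p in points]

order = sts_sort(dist)
sorted_dist = reorder(dist, order)
```

Comparing `sts_energy(dist)` with `sts_energy(sorted_dist)` shows whether the iteration lowered the Side-to-Side penalty; `neighborhood_energy` can be evaluated on the same matrices to see how the two criteria differ on circular data.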
‘Electronic Microdissection’
Two-Way Sorted Expression Matrix
[Figure: two-way sorted expression matrix; rows are genes (clusters labeled A, B, H, K, M, U, V), columns are samples grouped as Normal Liver, Liver Met., Primary Tumor, Polyp, Normal Colon.]
Important: 97% of genes that appear significantly over-expressed in Metastasis vs. Carcinoma are Liver-Specific and irrelevant to cancer!
Identify ‘clean’ metastasis samples
[Legend: Normal Liver, Normal Lung, Normal colon, Adenomas, Carcinomas, Liver metastasis, Lung metastasis.]
SPIN versus clustering
• SPIN rearranges points in a way that reflects
the shape of their arrangement.
• Clustering aims to assign points to
categories; points within a cluster are more
“similar” to each other than to points outside the
cluster.
• SPIN rearranges the points and leaves it to the
user to identify the clusters. Clustering does it
for you in a formal way.
Software availability
• SPIN is available upon request.
• Email assif.yitzhaky@weizmann.ac.il
Exercise Time
The EXCEL
Shift header line one
column to the left