IEEE TRANSACTIONS ON COMPUTERS, VOL. C-18, NO. 5, MAY 1969, p. 401

A Nonlinear Mapping for Data Structure Analysis

JOHN W. SAMMON, JR.

Manuscript received August 26, 1968; revised February 2, 1969.

Abstract: An algorithm for the analysis of multivariate data is presented along with some experimental results. The algorithm is based upon a point mapping of N L-dimensional vectors from the L-space to a lower-dimensional space such that the inherent data "structure" is approximately preserved.

Index Terms: Clustering, dimensionality reduction, mappings, multidimensional scaling, multivariate data analysis, nonparametric, pattern recognition, statistics.

INTRODUCTION

The purpose of this paper is to describe the nonlinear mapping algorithm (NLM) which has been found to be highly effective in the analysis of multivariate data. The analysis problem is to detect and identify "structure" which may be present in a list of N L-dimensional vectors. Here the word structure refers to geometric relationships among subsets of the data vectors in the L-space. Some examples of structure are hyperspherical and hyperellipsoidal clusters, and linear and certain nonlinear relationships among the vectors of some subset.

The algorithm is based upon a point mapping of the N L-dimensional vectors from the L-space to a lower-dimensional space such that the inherent structure of the data is approximately preserved under the mapping. The approximate structure preservation is maintained by fitting N points in the lower-dimensional space such that their interpoint distances approximate the corresponding interpoint distances in the L-space. We shall be primarily interested in mappings to 2- and 3-dimensional spaces since the resultant data configuration can easily be evaluated by human observation in three or fewer dimensions.

THE NONLINEAR MAPPING

Suppose that we have N vectors in an L-space designated $X_i$, $i = 1, \ldots, N$, and corresponding to these we define N vectors in a d-space (d = 2 or 3) designated $Y_i$, $i = 1, \ldots, N$. Let the distance (see footnote 1) between the vectors $X_i$ and $X_j$ in the L-space be defined by $d_{ij}^* = \mathrm{dist}[X_i, X_j]$ and the distance between the corresponding vectors $Y_i$ and $Y_j$ in the d-space be defined by $d_{ij} = \mathrm{dist}[Y_i, Y_j]$.

Let us now randomly (see footnote 2) choose an initial d-space configuration for the Y vectors and denote the configuration as follows:

$$Y_1 = \begin{bmatrix} y_{11} \\ \vdots \\ y_{1d} \end{bmatrix}, \quad Y_2 = \begin{bmatrix} y_{21} \\ \vdots \\ y_{2d} \end{bmatrix}, \quad \ldots, \quad Y_N = \begin{bmatrix} y_{N1} \\ \vdots \\ y_{Nd} \end{bmatrix}.$$

Next we compute all the d-space interpoint distances $d_{ij}$, which are then used to define an error E, which represents how well the present configuration of N points in the d-space fits the N points in the L-space, i.e.,

$$E = \frac{1}{\sum_{i<j} d_{ij}^*} \sum_{i<j}^{N} \frac{[d_{ij}^* - d_{ij}]^2}{d_{ij}^*}. \tag{1}$$

Note that the error is a function of the $d \times N$ variables $y_{pq}$, $p = 1, \ldots, N$, $q = 1, \ldots, d$. The next step in the NLM algorithm is to adjust the $y_{pq}$ variables, or equivalently change the d-space configuration, so as to decrease the error. We use a steepest descent procedure to search for a minimum of the error (see Appendix I for further details).

SOME COMPUTER RESULTS

We have exercised the nonlinear mapping algorithm on several data sets in order to test and evaluate the utility of the program in detecting and identifying structure in data. Some of the results obtained for several different artificially generated data sets (see footnote 3) are reported for the case where d = 2. We have also run the algorithm on many real data sets and have achieved highly satisfactory results; however, for demonstration purposes it is useful to work with artificially generated data in order that we can compare our results with the known data structure. The test data sets were as follows.

1) Straight Line Data: These data consisted of nine points distributed along a line in a 9-dimensional space. The data points were spaced evenly along the line with an interpoint Euclidean distance of √9 units. The initial 2-space configuration was chosen randomly.
2) Circular Data: The data consisted of nine points, eight of which were spaced evenly (45° apart) along a circle of radius 2.5 units in a 2-dimensional space. The ninth point was placed at the center of the circle. The initial 2-space configuration was chosen randomly.

3) Iris Data: This data set is fairly well-known since it was used by Fisher [4] in several statistical experiments. The data were originally obtained by making four measurements on Iris flowers. These measurements were then used to classify three different species of Iris flowers. Fifty sample vectors were obtained from each of the three species. Thus, the data set consists of 150 points distributed in a 4-dimensional space.

4) Gaussian Data Distributed at the Vertices of a Simplex: These data consist of 75 points distributed in a 4-dimensional space. There are five spherical Gaussian distributions which have their respective mean vectors located at the vertices of a 4-dimensional simplex (see footnote 4). The intervertex distance is √5/4 units and each covariance matrix is diagonal with a standard deviation along every coordinate of 0.2 unit. Fifteen points were generated from each of the five Gaussian distributions, making a total of 75 points.

5) Helix Data: This data set consisted of 30 points distributed along a 3-dimensional helix. The parametric equations for this helix are

$$X = \cos Z, \qquad Y = \sin Z, \qquad Z = \frac{\sqrt{2}}{2}\, t.$$

The points are distributed at one-unit intervals along the curve (i.e., $t = 0, 1, 2, \ldots, 29$).

6) Nonlinear Data: This data set consisted of 29 points distributed evenly along a 5-dimensional curve. The parametric equations for this curve are

$$X = \cos Z, \qquad Y = \sin Z, \qquad U = 0.5 \cos 2Z, \qquad V = 0.5 \sin 2Z, \qquad Z = \frac{\sqrt{2}}{2}\, t.$$

The points were distributed at one-unit intervals along the curve (i.e., $t = 0, 1, \ldots, 28$).

Figs. 1 through 9 display the results obtained using the nonlinear mapping algorithm. Convergence was essentially obtained for each case in twenty or fewer iterations using the gradient searching technique described in Appendix I. As was expected, the data structure inherent in both data sets 1 and 2 was faithfully reproduced under the mapping (see Figs. 1 and 2). This was expected since the data sets are 1- and 2-dimensional, respectively, and therefore a mapping to a 2-space can be accomplished with zero error. In both cases the final error was $10^{-16}$.

Observing the result of the Iris data mapping (Fig. 3), we can essentially detect the three species of Iris. The final error was $2 \times 10^{-3}$, which is considered quite small.

The results obtained on the simplex data (data set 4) were quite interesting. The result of the mapping clearly showed the presence of five clusters (see Fig. 4). However, when we compare this result with the projection of the same data onto the 2-space defined by the eigenvectors of the estimated data covariance matrix with the two largest eigenvalues, we can only detect four clusters. Two of the clusters overlap completely in the 2-space which fits the data in the least squares sense (see Fig. 5). The resulting NLM error was 0.05. The same experiment has been conducted using Gaussian data distributed at the vertices of higher-dimensional simplexes. Figs. 6 and 7 show the NLM and principal eigenvector plots, respectively, for a 19-dimensional Gaussian simplex distribution. These experiments indicate that for some data sets, the NLM is superior to eigenvector projections for data structure analysis.

The results shown in Figs. 8 and 9 clearly indicate the "string structure" in data sets 5 and 6, respectively. The mapping error for the helix data was $6 \times 10^{-4}$. The error for data set 6 was $1.6 \times 10^{-3}$.

The utility of any data analysis technique is somehow more convincing when applied to "real" data as opposed to artificially generated data, presuming, of course, that the analysis results are correct. For this reason, the application of the NLM algorithm to an experiment in document retrieval by content is reported here.

The experiment, conducted jointly by Rome Air Development Center (RADC) and the University of Colorado, involved the construction of a document classification space (referred to as the C-space) where every document in the library was represented as a 17-dimensional vector. The construction technique devised by Ossorio [8], [9] describes a mapping of 1125 preselected words and phrases into the C-space. Documents, or equivalently retrieval requests, were located in the space by computing the vector average of the corresponding key words or phrases which were contained in the document or the request. Retrieval was accomplished by rank-ordering the relevance of the library documents to a given request. The relevance measure was computed using the Euclidean metric between the document vectors and the request vector, the concept being that document vectors which are close in the C-space are related by content and therefore should be retrieved together.

Author note: The author was with Rome Air Development Center, Griffiss AFB, Rome, N. Y. He is now with Computer Symbolic, Inc., Rome, N. Y.

Footnote 1: Any distance measure could be used; however, if we have no a priori knowledge concerning the data, we would have no reason to prefer any metric over the Euclidean metric. Thus, this algorithm uses the Euclidean distance measure.

Footnote 2: For the purpose of this discussion it is convenient to think of the starting configuration as being selected randomly; however, in practice the initial configuration for the vectors is found by projecting the L-dimensional data orthogonally onto a d-space spanned by the d original coordinates with the largest variances.

Footnote 3: One exception is data set 3, which is a classical data set. This data set was not artificially generated.

Footnote 4: The vertices of a simplex are all equidistant from one another as well as from the origin.

[Fig. 1. Nonlinear mapping. Data: 9 pts. along a straight line. Original dimension = 9. Starting configuration: random. Mapping error = $10^{-16}$.]

[Fig. 2. Nonlinear mapping. Data: 9 pts. on a circle. Original dimension = 2. Starting configuration: random. Mapping error = $10^{-16}$.]

[Fig. 3. Nonlinear mapping. Data: Iris data. Original dimension = 4. Starting configuration: maximum variance coordinate plane. Mapping error = 0.002.]

[Fig. 4. Nonlinear mapping. Data: 75 pts. from 5 Gaussian distributions. Original dimension = 4. Starting configuration: maximum variance coordinate plane. Mapping error = 0.05.]

[Fig. 5. Eigenvector plot. Data: 75 pts. from 5 Gaussian distributions. Original dimension = 4. Projection on the two principal eigenvectors.]

[Fig. 6. Nonlinear mapping. Data: 100 pts. from 20 Gaussian spherical distributions. Original dimension = 19. Starting configuration: random.]

[Fig. 7. Eigenvector plot. Data: 100 pts. from 20 Gaussian spherical distributions. Original dimension = 19. Projection on the two principal eigenvectors.]
Briefly, the C-space construction proceeded as follows. First the subject content covered by the 188 documents in the experimental library was subjectively partitioned into 23 technical fields (see Appendix II for a listing of these fields). Several experts representing each field rated the relevance of each of the 1125 words or phrases to his field, using a scale from 0 to 8. The ratings by the experts within each field were then averaged to obtain a word-by-field relevance matrix designated X; the ijth element of X represents the relevance of word or phrase i to field j. (It is convenient to think of the 1125 words or phrases as being represented by vectors in a 23-dimensional space spanned by the 23 coordinate fields.) Next, a 23 × 23 field correlation matrix C was computed, where the ijth element represented the correlation between the ith and jth fields. C was then factored using the minimum residual method and rotated to a Varimax criterion. Seventeen orthogonal factors were then selected to define the 17-dimensional C-space.

All 1125 word and phrase vectors were mapped into the 17-dimensional C-space using a simple nonlinear formula which tended to emphasize large coordinate projections and minimize small coordinate projections. Finally, the 188 documents were located in the C-space by algebraically averaging the word or phrase vectors corresponding to the words or phrases which appeared in the documents (see footnote 5).

Footnote 5: The entire document was never searched for key words or phrases. Rather, for one half of the documents only the abstracts were used, and for the remainder several paragraphs from each document were used.

[Fig. 8. Nonlinear mapping. Data: 30 pts. along a helix. Original dimension = 3. Starting configuration: random. Mapping error = $6 \times 10^{-4}$.]

[Fig. 9. Nonlinear mapping of data set 6 (29 pts. along the 5-dimensional curve).]

In order to evaluate the C-space as a potential method for document indexing, several individuals were asked to generate English queries (see Appendix III for the pertinent queries used here) which were then keypunched, automatically scanned for key word or phrase content, and finally mapped into the C-space. Each requester was then asked to identify those documents of the entire 188 which he felt were most relevant to his query. The C-space was then evaluated by examining the rank ordering of the retrieved documents to compare them to the list of relevant documents specified by the requester. The results of this evaluation can be found in Ossorio [8].

The nonlinear mapping algorithm was used to evaluate the "structure" of the documents in the C-space. Specifically, we were interested in how the documents considered relevant to a particular request were clustered, and further, how these clusters were interrelated to each other and to the entire library. To accomplish this analysis, all 188 17-dimensional vectors were used as the input data to the NLM. The numerals 1 through 5 were used in the resulting 2-dimensional mapping to designate the documents labeled relevant to queries 1 through 5. In addition, the symbol D was used to designate the remaining library documents. It is important to note that the NLM algorithm did not utilize the numeric query designations in computing the mapping. Only at the time of plotting the final 2-space configuration of the 188 points were the numeric and symbolic designators used to distinguish the data. The error in achieving the NLM shown in Fig. 10 was 0.062, which was considered to be acceptable for adequate 2-space representation.

[Fig. 10. Nonlinear mapping: photograph of CRT display. Data: 1 = eight Request 1 vectors; 2 = seven Request 2 vectors; 3 = sixteen Request 3 vectors; 4 = thirteen Request 4 vectors; 5 = seven Request 5 vectors. Starting configuration: maximum variance coordinate plane. Mapping error = 0.062.]

The following facts were obtained upon investigation of the NLM result.

1) The documents considered relevant to a given request were clustered, lending evidence to support the hypothesis that related documents have C-space vectors which are close.

2) There does not appear to be any natural C-space structure relating subsets of documents. Instead, the documents tend to be uniformly distributed throughout the space.

3) Clusters 2 and 3 tend to overlap, yet they are well-separated from clusters 4 and 5. This can easily be accounted for since requests 2 and 3 are both concerned with the common subject of statistical data analysis, whereas 4 and 5 involve completely different subjects. In general, the intercluster relationships seem consistent with their respective subject relationships.

In summary, we have found the NLM algorithm to be of considerable value in aiding us in our understanding of the C-space as well as other document spaces. Presently we are planning to incorporate a similar mapping technique in an on-line document retrieval system in order to improve the retrieval via geometric means.

The experimental system will operate as follows. The on-line user would examine the 30 highest-ranked documents by retrieving and reading their abstracts. He would then indicate those he considered relevant. Next, a scatter diagram similar to Fig. 10 would be presented upon the CRT display where each of the 30 documents would be indicated by an I or an R, depending upon its relevance. In addition, the original query vector will be displayed as a Q. After examining the relative positions of the documents in the mapping, the user would select (using a light pen) one or more relevant documents to be used to generate a new query vector(s). The concept is that the query vector can be moved to highly relevant regions of the document space by interacting at a display console with a geometric representation of the space.

RELATIONSHIP OF NLM TO OTHER STRUCTURE ANALYSIS ALGORITHMS

A mapping algorithm which bears a relationship to the NLM algorithm is one developed by Shepard [11] and later improved by Kruskal [5], [6]. Briefly, the Shepard-Kruskal algorithm seeks to find a configuration of points in a t-space such that the resultant interpoint distances preserve a monotonic relationship to a given set of interelement similarities (or dissimilarities). Specifically, they wish to analyze a set of interelement similarities (or dissimilarities) given by $S_{ij}$, $i = 1, \ldots, N$, $j = 1, \ldots, N$. Suppose these similarities are ordered in increasing magnitude, such that

$$S_{p_1 q_1} < S_{p_2 q_2} < \cdots$$

The Kruskal-Shepard algorithm seeks to find a set of N t-dimensional vectors $y_i$, $i = 1, \ldots, N$, such that the order of the interpoint distances $d_{ij} = \mathrm{dist}[y_i, y_j]$ deviates as little as possible from the monotonic ordering of the corresponding similarities. Although the mathematical formulations are similar, the underlying mapping criteria are quite different.

Ball [1] has compiled an excellent survey of clustering and clumping algorithms which are useful in solving the "structure analysis" problem. However, it has been our experience in using clustering techniques that these algorithms suffer to some extent from the following four deficiencies.

1) When using a particular algorithm, the resulting cluster configuration is highly dependent upon a set of control parameters which must be fixed by the user. Some examples of such parameters are:
a) the similarity measure;
b) various similarity thresholds;
c) number of iterations required;
d) thresholds which control the increase or reduction of the number of clusters;
e) the minimum number of vectors required to define a cluster.
When choosing the control parameters for complex data, the user must either have a good deal of a priori information regarding the "structure" of his data, or he must apply the algorithm many times for different values of the control parameters. This second alternative is, at best, tedious.

2) Most of the existing clustering algorithms are particularly sensitive to hyperspherical structure and are inefficient in detecting more complex relationships in the data.

3) Perhaps the most serious deficiency involving present-day clustering algorithms is that there do not exist really good ways for evaluating a resultant cluster configuration.

4) When two clusters are close, the vectors between tend to form a bridge and cause spurious mergers [7].

We feel that the nonlinear mapping is a highly promising structure analysis algorithm since it suffers little from the listed clustering deficiencies. Consider the following facts concerning the algorithm.

1) The routine does not depend upon any control parameters that would require a priori knowledge about the data. Specifically, the user need only set the number of iterations and the convergence constant (MF in Appendix I).

2) It is highly efficient in identifying hyperspherical, hyperellipsoidal, and other complex data structures.

3) The resulting mapping (scatter diagram) is easily evaluated by the researcher, thereby taking advantage of the human ability to detect and identify data structure.

4) The problem concerning extraneous data and spurious mergers is not present since humans easily eliminate troublesome data points by making global evaluations (machines have difficulty performing this function).

5) The algorithm is simple and efficient.

LIMITATIONS AND EXTENSIONS

There are, of course, limitations to every algorithm and the nonlinear mapping is no exception. There exist two limitations which we are presently investigating. The first has to do with the reliability of the scatter diagram in displaying extremely complex high-dimensional structure. It is conceivable that the minimum mapping error is too large ($E \gg 0.1$) and the 2-dimensional scatter plot fails to portray the true structure. However, we feel that for data structures composed of superpositions of hyperspherical and hyperellipsoidal clusters, the nonlinear mapping algorithm will, in general, display adequate representations of the true data "structure."

The second limitation of the nonlinear mapping algorithm is related to the number of vectors that it can handle. Since we must compute and store the interdistance matrix, which consists of $N(N-1)/2$ elements, we are limited at present to $N < 250$ vectors (see footnote 6). In those cases where $N > 250$, we suggest using a data compression technique to reduce the data set to fewer than 250 vectors. Specifically, we propose to use the Isodata [2] clustering algorithm to perform data compression. This is actually a natural function of clustering since we replace several vectors with a typical representative vector (i.e., the cluster center). Our previous objections to present-day clustering algorithms do not apply here since we are only concerned with fitting the data with 250 cluster centers. We are specifically not using the clustering algorithm to detect structure.

Footnote 6: The nonlinear map is programmed in FORTRAN IV and runs on a GE-635 computer equipped with 128 K of core. The computation time can be estimated by

$$T \approx (1.1 \times 10^{-5}) \cdot I \cdot \binom{N}{2} \ \text{minutes},$$

where I = number of iterations and N = number of vectors.

We have used the NLM to analyze multivariate data from two or more classes for the purpose of determining how well the classes can be discriminated from one another. In these cases, it is recommended that the dimensionality be reduced to the smallest number of variables which still preserve discrimination (see footnote 7). In many problems certain measurements provide little discriminatory information; yet if these measurements are included, the NLM will attempt to "fit" interpoint distances along these "noisy" directions as well as along discriminating directions. In truly high-dimensional problems, the resulting mapping may show considerable overlap between classes and still a high degree of discrimination may be possible. This phenomenon occurred when analyzing a 4-class, 24-dimensional data set. The resulting NLM (the final error was 0.5, which was considered high) showed considerable overlap among the data from three of the classes; yet, using a piecewise linear discrimination technique (based upon the use of Fisher's linear discriminant between all pairs of classes), 94 percent correct classification was achieved. In this case, the NLM did not give an incorrect result since the classes greatly overlapped in approximately 20 dimensions and mildly overlapped in the remaining space. The NLM weighted all coordinates equally in an attempt to fit the interpoint distances, and therefore the resulting mapping indicated the predominant overlap which actually existed.

Footnote 7: A number of techniques may be used for this purpose. We often use the following:
a) Discriminant measure

$$M(X) = \sum_{i<j} \frac{(\mu_i - \mu_j)^2}{\sigma_{X_i}^2 + \sigma_{X_j}^2}$$

b) Interpoint measure

$$M(X) = \frac{1}{\sigma_X^2} \sum_{i<j} \frac{1}{N_i N_j} \sum_{p=1}^{N_i} \sum_{q=1}^{N_j} \left( X_p^{(i)} - X_q^{(j)} \right)^2$$

where
$\mu_i$ = mean of class i along X
$\sigma_{X_i}^2$ = variance of class i along X
$\sigma_X^2$ = variance of all data along X
$X_p^{(i)}$ = the pth sample from the ith class along X
$N_i$ = number of samples from the ith class.
c) Multilinear discriminant defined in Wilks [14].

The NLM algorithm described here is one of many algorithms which are being programmed and incorporated into a large on-line graphics-oriented computer system, entitled the On-Line Pattern Analysis and Recognition System (OLPARS) [10] (see footnote 8). Once the NLM algorithm is incorporated into the OLPARS system, the on-line user will be able to designate a data set, and from the graphics console execute the NLM. The user shall specify a mapping to a 2-space or a 3-space. For d = 2, the resultant scatter diagram will be displayed upon the CRT; for d = 3, a perspective scatter plot will be displayed. If the 3-space option is selected, the user will be able to dynamically analyze the resultant perspective scatter diagram by selecting various rotations of the 3-space. When the user selects d = 2, he will be given the capability to designate subsets of data (via piecewise linear boundaries drawn on the CRT) representing a collection of points in the scatter diagram which exhibit structure, and thereby partition the initial data list into structured subsets.

Footnote 8: For other examples of interactive pattern analysis systems, see Ball and Hall [3], Stanley et al. [12], and Walters [13].

APPENDIX I

Let $E(m)$ be defined as the mapping error after the mth iteration, i.e.,

$$E(m) = \frac{1}{c} \sum_{i<j}^{N} \frac{[d_{ij}^* - d_{ij}(m)]^2}{d_{ij}^*}$$

where

$$c = \sum_{i<j}^{N} d_{ij}^*$$

and

$$d_{ij}(m) = \sqrt{\sum_{k=1}^{d} [y_{ik}(m) - y_{jk}(m)]^2}.$$

The new d-space configuration at time m + 1 is given by

$$y_{pq}(m+1) = y_{pq}(m) - (\mathrm{MF}) \cdot \Delta_{pq}(m)$$

where

$$\Delta_{pq}(m) = \frac{\partial E(m)}{\partial y_{pq}(m)} \bigg/ \left| \frac{\partial^2 E(m)}{\partial y_{pq}(m)^2} \right|$$

and MF is the "magic factor" which was determined empirically to be MF ≈ 0.3 or 0.4. The partial derivatives are given by

$$\frac{\partial E}{\partial y_{pq}} = -\frac{2}{c} \sum_{\substack{j=1 \\ j \neq p}}^{N} \left[ \frac{d_{pj}^* - d_{pj}}{d_{pj}\, d_{pj}^*} \right] (y_{pq} - y_{jq})$$

and

$$\frac{\partial^2 E}{\partial y_{pq}^2} = -\frac{2}{c} \sum_{\substack{j=1 \\ j \neq p}}^{N} \frac{1}{d_{pj}^*\, d_{pj}} \left[ (d_{pj}^* - d_{pj}) - \frac{(y_{pq} - y_{jq})^2}{d_{pj}} \left( 1 + \frac{d_{pj}^* - d_{pj}}{d_{pj}} \right) \right].$$

In our program we take precautions to prevent any two points in the d-space from becoming identical. This prevents the partials from "blowing up."

APPENDIX II
CLASSIFICATION SPACE FIELDS

1) Adaptive Systems
2) Analog Computers
3) Applied Mathematics
4) Automata Theory
5) Computer Components and Circuits
6) Computer Memories
7) Computer Software
8) Display Consoles
9) Human Factors
10) Information Retrieval
11) Information Theory
12) Input-Output Equipment
13) Language Translation
14) Linear Algebra
15) Multivariate Statistical Analysis
16) Nonnumeric Data Processing
17) Numerical Analysis
18) Pattern Recognition
19) Probability and Statistics
20) Programming Languages
21) Stochastic Processes
22) System Design and Evaluation
23) Time-Sharing Systems.

APPENDIX III
REQUESTS

Request 1: What is known about the statistical distributions of words or concepts in English text? What impact does this knowledge or lack of knowledge have on the effectiveness of standard statistical methods to information retrieval problems? Are nonparametric methods more applicable?

Request 2: I am interested in techniques for data analysis. In particular, I wish information on "cluster-seeking" techniques as opposed to those of factorial analysis and discriminant analysis. "Cluster-seeking" techniques may be classified as follows: probabilistic techniques, signal detection, clustering techniques, clumping techniques, eigenvalue-type techniques, and minimal mode-seeking techniques.

Request 3: I would like any information concerning Bayesian statistics. In particular, I would like to know if one can define or devise multiple-decision procedures from the Bayes approach. Also, how sensitive are Bayes procedures to the prior distribution? Finally, I would like a comparison of the Bayes approach to other classical decision theoretic approaches.

Request 4: What is the structure and characteristics of paging techniques?

Request 5: Are there survey documents (information) available which discuss or detail the relative practicality of memories; for example, capacity versus utilization, density, weight, environmental features, failure rates, economics, etc.?

ACKNOWLEDGMENT

The author expresses his appreciation to D. Elefante for his efforts in developing an efficient FORTRAN IV program for the Nonlinear Mapping Algorithm.

REFERENCES

[1] G. H. Ball, "A comparison of some cluster-seeking techniques," Rome Air Development Center, Rome, N. Y., Tech. Rept. RADC-TR-66-514, November 1966.
[2] G. H. Ball and D. Hall, "Isodata," portion of Stanford Research Institute (SRI) Final Report to RADC Contract AF30(602)-4196, September 1967.
[3] G. H. Ball and D. Hall, "Promenade: an improved interactive graphics man/machine system for pattern recognition," Stanford Research Institute, Menlo Park, Calif., Project 6737, October 1968.
[4] R. A. Fisher, "The use of multiple measurements in taxonomic problems," Ann. Eugenics, vol. 7, pp. 178-188, 1936.
[5] J. B. Kruskal, "Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis," Psychometrika, pp. 1-27, March 1964.
[6] J. B. Kruskal, "Nonmetric multidimensional scaling: a numerical method," Psychometrika, vol. 29, pp. 115-129, June 1964.
[7] G. Nagy, "State of the art in pattern recognition," Proc. IEEE, vol. 56, pp. 836-861, May 1968.
[8] P. G. Ossorio, "Classification space analysis," RADC-TDR-64-287, October 1964.
[9] P. G. Ossorio, "Attribute space development and evaluation," RADC-TDR-67-640, January 1968.
[10] J. W. Sammon, "On-line pattern analysis and recognition system (OLPARS)," RADC-TR-68-263, August 1968.
[11] R. N. Shepard, "The analysis of proximities: multidimensional scaling with an unknown distance function," Psychometrika, vol. 27, pp. 125-139, 219-246, 1962.
[12] G. L. Stanley, G. G. Lendaris, and W. C. Nienow, "Pattern recognition program," AC Electronics Defense Research Labs., Santa Barbara, Calif., TR-567-16, November 1967.
[13] C. M. Walters, "On line computer based aids for the investigation of sensor data compression, transmission and delay problems," 1966 Proc. Natl. Telemetry Conf., Boston, Mass.
[14] S. S. Wilks, Mathematical Statistics. New York: J. Wiley, 1962.
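The iteration of Appendix I can be sketched in plain Python as follows. This is a minimal illustration, not the author's FORTRAN IV program. Several details are assumptions: the function names, the choice to update all coordinates at once from the previous configuration, and the `tiny` guard (the paper says only that precautions are taken against coincident points, not what they are). MF = 0.3 and the maximum-variance starting configuration follow the text. The demonstration data is the 5-dimensional curve of data set 6, assuming $Z = (\sqrt{2}/2)\,t$.

```python
import math


def pair_sum(n, term):
    """Sum term(i, j) over all pairs i < j."""
    return sum(term(i, j) for i in range(n) for j in range(i + 1, n))


def mapping_error(d_star, Y):
    """E = (1/c) * sum_{i<j} (d*_ij - d_ij)^2 / d*_ij, with c = sum_{i<j} d*_ij."""
    n = len(Y)
    c = pair_sum(n, lambda i, j: d_star[i][j])

    def term(i, j):
        delta = d_star[i][j] - math.dist(Y[i], Y[j])
        return delta * delta / d_star[i][j]

    return pair_sum(n, term) / c


def max_variance_start(X, d=2):
    """Starting configuration: keep the d original coordinates with largest variance."""
    n, dim = len(X), len(X[0])

    def variance(k):
        mean = sum(x[k] for x in X) / n
        return sum((x[k] - mean) ** 2 for x in X) / n

    keep = sorted(sorted(range(dim), key=variance, reverse=True)[:d])
    return keep, [[x[k] for k in keep] for x in X]


def nlm_step(d_star, Y, mf=0.3, tiny=1e-12):
    """One iteration: y_pq <- y_pq - MF * (dE/dy_pq) / |d2E/dy_pq^2|."""
    n, d = len(Y), len(Y[0])
    c = pair_sum(n, lambda i, j: d_star[i][j])
    new_Y = [list(y) for y in Y]
    for p in range(n):
        for q in range(d):
            first = second = 0.0
            for j in range(n):
                if j == p:
                    continue
                ds = d_star[p][j]
                dpj = max(math.dist(Y[p], Y[j]), tiny)  # assumed guard against coincident points
                diff = Y[p][q] - Y[j][q]
                first += (ds - dpj) / (ds * dpj) * diff
                second += ((ds - dpj) - diff * diff / dpj * (1 + (ds - dpj) / dpj)) / (ds * dpj)
            first *= -2.0 / c
            second *= -2.0 / c
            new_Y[p][q] = Y[p][q] - mf * first / max(abs(second), tiny)
    return new_Y


# Data set 6: 29 points along the 5-dimensional curve, t = 0, 1, ..., 28.
X = []
for t in range(29):
    z = math.sqrt(2) / 2 * t
    X.append([math.cos(z), math.sin(z), 0.5 * math.cos(2 * z), 0.5 * math.sin(2 * z), z])

d_star = [[math.dist(a, b) for b in X] for a in X]
kept, Y = max_variance_start(X, d=2)
print("coordinates kept for the start:", kept)
print("error at start:", mapping_error(d_star, Y))
for _ in range(20):
    Y = nlm_step(d_star, Y)
print("error after 20 iterations:", mapping_error(d_star, Y))
```

The sketch stores the full N × N distance table for brevity; the $N(N-1)/2$ storage figure quoted in the text corresponds to keeping only the pairs with i < j.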