Latent Semantic Indexing Information

The latent semantic indexing (LSI) information retrieval
model builds on prior research in information
retrieval. LSI uses the singular value decomposition,
or SVD, to reduce the dimensions of the term-document
space and attempts to solve the problems that plague
automatic information retrieval systems.
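
As a rough illustration of the SVD step described above,
the following Python sketch factors a small term-document
matrix with NumPy and keeps only the k largest singular
values. The matrix, the choice of k = 2, and the function
name truncated_svd are assumptions made up for this
example, not part of any particular LSI implementation.

```python
import numpy as np

def truncated_svd(A, k):
    """Return the k largest singular triplets of A as (U_k, s_k, Vt_k)."""
    # full_matrices=False gives the compact SVD. NumPy returns the singular
    # values sorted in descending order, so the first k are the largest.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

# Hypothetical toy term-document matrix: rows are terms, columns are
# documents, entries are raw term counts.
A = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 2, 1],
    [0, 0, 1, 2],
], dtype=float)

k = 2  # number of dimensions to keep (an arbitrary choice for this toy case)
U_k, s_k, Vt_k = truncated_svd(A, k)

# Rank-k approximation of the original matrix.
A_k = U_k @ np.diag(s_k) @ Vt_k

# Frobenius norm of what the truncation discarded.
error = np.linalg.norm(A - A_k)

print(A_k.round(2))
print("approximation error:", error)
```

The error printed at the end equals the square root of
the sum of the squared discarded singular values, which
is the smallest error any rank-k matrix can achieve in
the Frobenius norm (the Eckart-Young theorem).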

LSI represents terms and documents in a rich,
high-dimensional space. This allows the underlying
("latent") semantic relationships between terms and
documents to be exploited during retrieval.

The latent semantic indexing model views the terms in
a document as unreliable indicators of the information
within the document. The variability of word choice
obscures the semantic structure of the documents
involved.

When the term-document space is reduced, the
underlying semantic relationships are then revealed.
Much of the noise is eliminated when the space is
reduced.
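
The effect of the reduction can be seen on a small,
hand-built example. The sketch below is only an
illustration under stated assumptions: the vocabulary,
the five documents, and k = 2 are all invented for this
purpose. Documents d0 and d1 share no words, so their
word-based cosine similarity is zero, but both are
linked to d2 through "car" and "automobile".

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical vocabulary (rows) and documents (columns):
#   d0 = "car engine"        d1 = "automobile motor"
#   d2 = "car automobile"    d3 = "flower petal"
#   d4 = "flower petal garden"
terms = ["car", "automobile", "engine", "motor", "flower", "petal", "garden"]
A = np.array([
    [1, 0, 1, 0, 0],  # car
    [0, 1, 1, 0, 0],  # automobile
    [1, 0, 0, 0, 0],  # engine
    [0, 1, 0, 0, 0],  # motor
    [0, 0, 0, 1, 1],  # flower
    [0, 0, 0, 1, 1],  # petal
    [0, 0, 0, 0, 1],  # garden
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
# Document coordinates in the reduced space: one row of k values per document.
docs_k = (np.diag(s[:k]) @ Vt[:k, :]).T

raw_sim = cosine(A[:, 0], A[:, 1])            # d0 vs d1 on raw word counts
lsi_sim = cosine(docs_k[0], docs_k[1])        # d0 vs d1 in the reduced space
lsi_cross = cosine(docs_k[0], docs_k[3])      # d0 vs d3 in the reduced space

print("d0 vs d1, word-based:", raw_sim)
print("d0 vs d1, reduced:   ", lsi_sim)
print("d0 vs d3, reduced:   ", lsi_cross)
```

In this toy collection the two retained dimensions
line up with the two topics, so d0 and d1 end up
pointing in the same direction while d0 and d3 remain
unrelated. Real collections are far less tidy, and the
number of dimensions to keep has to be chosen
empirically.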

Latent Semantic Indexing differs from other attempts
at using reduced-space models for information
retrieval. LSI represents documents in a
high-dimensional space.

Both terms and documents are represented in the same
space and no attempt is made to interpret the meaning
of each dimension. Because of the computational
demands of vector space methods, earlier attempts
focused on relatively small document collections.

LSI is able to represent and manipulate larger data
sets, which makes it viable for real-world
applications.

Compared to other information retrieval techniques,
LSI performs quite well. Latent Semantic Indexing
provides thirty percent more related documents than
the standard word-based retrieval system.

LSI is also fully automatic and very easy to use. It
requires no complex expressions or confusing syntax.
Terms and documents are represented in the same space,
and relevance feedback can be integrated with the LSI
model.
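
Because a query is just a list of words, it can be
placed in the reduced space the same way a document
is. The sketch below shows one common convention,
assumed here for illustration: the query's term-count
vector is projected onto the k retained left singular
vectors and compared to the documents by cosine
similarity. The vocabulary, documents, and the names
rank_documents, term_index, and doc_names are
hypothetical.

```python
import numpy as np

# Hypothetical vocabulary (rows) and documents (columns):
#   d0 = "car engine"        d1 = "automobile motor"
#   d2 = "car automobile"    d3 = "flower petal"
#   d4 = "flower petal garden"
terms = ["car", "automobile", "engine", "motor", "flower", "petal", "garden"]
doc_names = ["d0", "d1", "d2", "d3", "d4"]
A = np.array([
    [1, 0, 1, 0, 0],  # car
    [0, 1, 1, 0, 0],  # automobile
    [1, 0, 0, 0, 0],  # engine
    [0, 1, 0, 0, 0],  # motor
    [0, 0, 0, 1, 1],  # flower
    [0, 0, 0, 1, 1],  # petal
    [0, 0, 0, 0, 1],  # garden
], dtype=float)
term_index = {t: i for i, t in enumerate(terms)}

def rank_documents(query_words, k=2):
    """Rank documents against a plain list of query words in a k-dim space."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k = U[:, :k]
    # Document coordinates; equal to U_k.T @ A, one row per document.
    docs_k = (np.diag(s[:k]) @ Vt[:k, :]).T

    # Build the query's term-count vector; words outside the vocabulary
    # are simply ignored.
    q = np.zeros(len(terms))
    for word in query_words:
        if word in term_index:
            q[term_index[word]] += 1.0

    # Project the query into the same k-dimensional space as the documents.
    q_k = U_k.T @ q
    q_norm = np.linalg.norm(q_k)
    if q_norm == 0:
        return []  # no known words, nothing to rank

    scores = docs_k @ q_k / (np.linalg.norm(docs_k, axis=1) * q_norm)
    order = np.argsort(-scores)  # highest similarity first
    return [(doc_names[i], float(scores[i])) for i in order]

for name, score in rank_documents(["car"]):
    print(name, round(score, 3))
```

In this toy setup the one-word query "car" scores d1
("automobile motor") as highly as the documents that
actually contain the word, while the documents about
flowers score near zero. No query operators are
involved; the input is only a bag of words.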