A solution to Plato's problem: The Latent Semantic Analysis theory of the acquisition, induction and representation of knowledge
Thomas K. Landauer and Susan T. Dumais
Abstract
How do people know as much as they do with as little information as they
get? The problem takes many forms; learning vocabulary from text is an
especially dramatic and convenient case for research. A new general
theory of acquired similarity and knowledge representation, Latent
Semantic Analysis (LSA), is presented and used to successfully simulate
such learning and several other psycholinguistic phenomena. By inducing
global knowledge indirectly from local co-occurrence data in a large body
of representative text, LSA acquired knowledge about the full vocabulary
of English at a rate comparable to that of schoolchildren. LSA uses no prior
linguistic or perceptual similarity knowledge; it is based solely on a
general mathematical learning method that achieves powerful inductive
effects by extracting the right number of dimensions (e.g., 300) to
represent objects and contexts. Relations to other theories, phenomena,
and problems are sketched.
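
The dimension-extraction idea can be illustrated with a minimal sketch, assuming the reduction is done by a truncated singular value decomposition of a word-by-context count matrix. The toy corpus, the choice of k = 2, and the helper names (`docs`, `lsa_similarity`, `raw_similarity`) are hypothetical and chosen only for illustration; the sketch also omits any weighting of the raw counts and is nowhere near the scale of a representative text corpus.

```python
import numpy as np

# Hypothetical toy corpus: each string is one local context.
# "doctor" and "physician" never appear in the same context.
docs = [
    "doctor hospital patient",
    "physician hospital patient",
    "doctor nurse hospital",
    "physician nurse patient",
    "guitar music band",
    "drum music band",
    "guitar drum music",
]

vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}

# Word-by-context matrix of raw co-occurrence counts.
counts = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        counts[index[w], j] += 1

# Truncated SVD: keep only the k largest singular values.
k = 2  # illustrative; far smaller than the roughly 300 dimensions cited above
U, s, Vt = np.linalg.svd(counts, full_matrices=False)
word_vecs = U[:, :k] * s[:k]  # one k-dimensional vector per word


def cosine(a, b):
    """Cosine of the angle between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def raw_similarity(w1, w2):
    """Similarity computed from the unreduced count rows."""
    return cosine(counts[index[w1]], counts[index[w2]])


def lsa_similarity(w1, w2):
    """Similarity computed in the reduced k-dimensional space."""
    return cosine(word_vecs[index[w1]], word_vecs[index[w2]])


print("raw  doctor~physician:", raw_similarity("doctor", "physician"))
print("LSA  doctor~physician:", lsa_similarity("doctor", "physician"))
print("LSA  doctor~guitar:   ", lsa_similarity("doctor", "guitar"))
```

In this toy setting the two words that never co-occur have zero similarity in the raw counts but become highly similar after reduction, because both appear in contexts shared with "hospital", "patient", and "nurse". That is a small-scale analogue of inducing global knowledge indirectly from local co-occurrence data.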