Information Retrieval Glossary
Source: Modern Information Retrieval Glossary (http://people.ischool.berkeley.edu/~hearst/irbook/glossary.html)
The terms are listed in alphabetical order.
A
- Ad hoc retrieval
standard retrieval task in which the user specifies an information need through a query, which initiates a search (executed by the information system) for documents likely to be relevant to the user.
- All-pairs or spatial-join query
a query that requests all pairs of objects that lie within a specified distance of each other.
- Amdahl's law
Using N processors, the maximal speedup S_N obtainable for a given problem is related to f, the fraction of the problem that must be computed sequentially. The relationship is given by: S_N <= 1/(f + (1-f)/N) <= 1/f.
- ASCII
Standard binary codes to represent occidental characters in one byte.
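The bound in the Amdahl's law entry above can be checked numerically. The following is a minimal sketch, not part of the original glossary; the function names amdahl_speedup and amdahl_limit are hypothetical and chosen only for illustration.

```python
def amdahl_speedup(f: float, n: int) -> float:
    """Upper bound on the speedup with n processors when a fraction f
    of the work must run sequentially: 1 / (f + (1 - f) / n)."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must lie in [0, 1]")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 / (f + (1.0 - f) / n)


def amdahl_limit(f: float) -> float:
    """Limit of the bound as n grows without bound: 1 / f."""
    return float("inf") if f == 0 else 1.0 / f


if __name__ == "__main__":
    # With 10% sequential work, adding processors approaches but never exceeds 1/f.
    for n in (1, 10, 100, 1000):
        print(n, amdahl_speedup(0.1, n), amdahl_limit(0.1))
```

With f = 0.1, for example, the bound stays below 10 no matter how many processors are used, which is the 1/f ceiling stated in the entry.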
B
- Belief network
a probabilistic model of document retrieval based on interpreting documents, user queries, and index terms as nodes of a Bayesian network. This model is distinct from the inference network model.
- Bit-parallelism
a speed-up technique based on exploiting the fact that the processor performs some operations in parallel over all the bits of the computer word.
- Block addressing
a technique used to reduce the size of the lists of occurrences by pointing to text blocks instead of exact positions.
- Boolean model
a classic model of document retrieval based on classic set theory.
- Browsing
interactive task in which the user is more interested in exploring the document collection than in retrieving documents which satisfy a specific information need.
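As an illustration of the bit-parallelism entry above, one well-known technique of this kind is the Shift-And algorithm for exact string matching. The sketch below is an assumption about how the idea might be demonstrated and is not taken from the glossary; the function name shift_and_search is hypothetical. Python integers are unbounded, so the sketch does not enforce the restriction to one machine word that a real implementation would face.

```python
from typing import Dict, List


def shift_and_search(pattern: str, text: str) -> List[int]:
    """Return the start positions of all occurrences of pattern in text.

    Bit i of `state` is set when pattern[:i + 1] matches the text ending
    at the current position. One shift and one AND update every bit
    (every pattern prefix) at once.
    """
    m = len(pattern)
    if m == 0:
        return []

    # masks[ch] has bit i set wherever pattern[i] == ch.
    masks: Dict[str, int] = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)

    accept = 1 << (m - 1)  # bit that signals a full match
    state = 0
    matches: List[int] = []
    for pos, ch in enumerate(text):
        # Extend every partial match by one character and start a new one.
        state = ((state << 1) | 1) & masks.get(ch, 0)
        if state & accept:
            matches.append(pos - m + 1)
    return matches


if __name__ == "__main__":
    print(shift_and_search("aba", "ababa"))
```

Each text character costs a constant number of word operations, regardless of how many pattern prefixes are being tracked, as long as the pattern fits in one word.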
C
- CACM collection
a reference collection composed of all the 3204 articles published in the Communications of the ACM from 1958 to 1979.
- CISI collection
a reference collection composed of 1460 documents selected from a previous collection assembled at ISI.
- Clustering
the grouping of documents which satisfy a set of common properties. The aim is to assemble together documents which are related among themselves. Clustering can be used, for instance, to expand a user query with new and related index terms.
- Coding
the substitution of text symbols by numeric codes with the aim of encrypting or compressing text.
- Collection
a group of items, often documents. In (digital) libraries this designates all the works included, usually selected based on a collection management plan.
- Compression of text
the study of techniques for representing text in fewer bytes or bits.
- Content-based query
a query that exploits the content of the data.
- Conversion
changing from one form to another, as in converting from analog to digital (also called "digitization"), or from paper to online (as in "retrospective conversion" of a card catalog to an online catalog, or of old books to scanned images).
- Cystic Fibrosis collection
a reference collection composed of 1239 documents indexed with the term cystic fibrosis in the National Library of Medicine's MEDLINE database.
D
- DLITE
a retrieval system which splits functionality in two parts: control of the search process and display of the results.
- DLI
Digital Libraries Initiative, a program of the US National Science Foundation, for research and development related to digital libraries, which began with $24M of funding split across 6 universities for 1994-98, and which will continue from 1998 onward with roughly double that amount of support.
- DTD
Document Type Definition: SGML definition for a markup language.
- Data cartridge
data structure and associated methods to represent and query a particular multimedia data type.
- Data mining
- Data retrieval
- Database industry
- Database producers
- Database vendors
- Digital library
- Digital object
- Digital preservation
- Directory
- Distributed computing
- Distributed information retrieval
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
S
T
U
V
W
X
Y
Z