dc.description.abstract |
A procedure for the computer generation of a thesaurus from a set of descriptors, manually assigned to the documents in a library, is described. The procedure recognises only a quasi-associative relationship among descriptors. The specific advantages of the thesaurus are pointed out: it is open-ended, its keywords are derived from actual documents and are therefore the most helpful in retrieving those documents, and it serves as an aid in information search in the collection. Computational procedures for generating the thesaurus are given; they include keyword statistics, matrix inversion, calculation of a similarity matrix using the Tanimoto coefficient, automatic cluster analysis using the minimal-tree procedure, and compilation of groups and main groups of descriptors. An algorithm for the graphic display of the main groups of descriptors has been formulated. The main disadvantage of the procedure is that only a limited number of keywords can be processed within a reasonable computer CPU time. It is pointed out that the procedure can be applied to a library of 10,000 to 20,000 documents with a keyword base of 1,000 words using about 3 hours of computing time. |
en_US |