  • Click Logs as a Source of Data for Evaluating the Quality of Hypertext

    Alexander, David M. (2011)

    Master's thesis
    University of Otago

    The automatic generation of useful hyperlinks has been a topic of interest in information retrieval for several years, but no existing method for evaluating the quality of generated hyperlinks has yet been shown to be optimal. I investigate the existing methods for evaluating hypertext quality (automatic assessment and manual assessment) and propose a new method based on the idea that good hypertext helps its users to form accurate representations of the subject matter, an idea supported by existing theories of coherence from outside the domain of hypertext. The new method examines the patterns of behaviour recorded in a click log and uses them to decide which hyperlinks are the most useful. It is formulated as a set of assessment metrics that judge the utility of individual hyperlinks; these judgements are in turn used to calculate Mean Average Precision (MAP) and Normalised Discounted Cumulative Gain (nDCG) scores for the result sets of hyperlink generation algorithms submitted to the INEX evaluation forum in past years. These scores are used to rank the algorithms, and the rankings are compared with those produced by manual assessment. My assessment metrics produce substantially different rankings from manual assessment. The result sets of the top-performing algorithms under each of the two assessment methods are combined to produce a new result set that performs well under both. This leads to the conclusion that a future algorithm combining the features of those two algorithms would produce better results in general. Additionally, suggestions are made regarding the use of the new assessment method to reduce the error rate of existing hyperlink-generation algorithms, and thus to facilitate the use of such algorithms in mainstream applications.
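
For readers unfamiliar with the two scores named in the abstract, the following is a minimal sketch of how MAP and nDCG are commonly computed from per-link judgements. It is not the thesis's own implementation: the abstract does not define the click-log metrics, so the function names, the binary relevance lists, and the graded gains used here are all illustrative assumptions. The sketch also normalises nDCG against the ideal ordering of the returned links only, whereas a full evaluation would normally use all judged links for the topic.

```python
import math

# Hypothetical helper names; the judgements passed in are assumed to come
# from some assessment of each generated link (manual or click-log based).

def average_precision(relevance, num_relevant=None):
    """Average precision for one ranked list of 0/1 judgements.

    num_relevant is the total number of relevant links for the topic;
    if omitted, the number of relevant links in the list is used.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    denominator = num_relevant if num_relevant is not None else hits
    return precision_sum / denominator if denominator else 0.0

def mean_average_precision(runs):
    """Mean of the average precision over several ranked lists (topics)."""
    return sum(average_precision(r) for r in runs) / len(runs) if runs else 0.0

def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))

def ndcg(gains):
    """DCG divided by the DCG of the same gains sorted best-first."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0

# Made-up example data: two topics with binary judgements, one graded list.
topics = [[1, 0, 1, 0], [0, 1]]
print(mean_average_precision(topics))
print(ndcg([1, 3, 0]))
```

Ranking several algorithms then amounts to computing one of these scores for each algorithm's result set and sorting by it, which is how rankings under different assessment methods can be compared.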

