19 results for Paynter, Gordon W.

  • Automating iterative tasks with programming by demonstration

    Paynter, Gordon W. (2000)

    Doctoral thesis
    University of Waikato

    Programming by demonstration is an end-user programming technique that allows people to create programs by showing the computer examples of what they want to do. Users do not need specialised programming skills. Instead, they instruct the computer by demonstrating examples, much as they might show another person how to do the task. Programming by demonstration empowers users to create programs that perform tedious and time-consuming computer chores. However, it is not in widespread use, and is instead confined to research applications that end users never see. This makes it difficult to evaluate programming by demonstration tools and techniques. This thesis claims that domain-independent programming by demonstration can be made available in existing applications and used by end users to automate iterative tasks. It is supported by Familiar, a domain-independent, AppleScript-based programming-by-demonstration tool embodying standard machine learning algorithms. Familiar is designed for end users, so it works in the existing applications that they regularly use. The assertion that programming by demonstration can be made available in existing applications is validated by identifying the relevant platform requirements and a range of platforms that meet them. A detailed scrutiny of AppleScript highlights problems with the architecture and with many implementations, and yields a set of guidelines for designing applications that support programming by demonstration. An evaluation shows that end users are capable of using programming by demonstration to automate iterative tasks. However, the subjects tended to prefer other tools, choosing Familiar only when the alternatives were unsuitable or unavailable. Familiar's inferencing is evaluated on an extensive set of examples, highlighting the tasks it can perform and the functionality it requires.

  • Human evaluation of Kea, an automatic keyphrasing system

    Jones, Steve; Paynter, Gordon W. (2001-02-01)

    Working or discussion paper
    University of Waikato

    This paper describes an evaluation of the Kea automatic keyphrase extraction algorithm. Tools that automatically identify keyphrases are desirable because document keyphrases have numerous applications in digital library systems, but are costly and time consuming to manually assign. Keyphrase extraction algorithms are usually evaluated by comparison to author-specified keywords, but this methodology has several well-known shortcomings. The results presented in this paper are based on subjective evaluations of the quality and appropriateness of keyphrases by human assessors, and make a number of contributions. First, they validate previous evaluations of Kea that rely on author keywords. Second, they show Kea's performance is comparable to that of similar systems that have been evaluated by human assessors. Finally, they justify the use of author keyphrases as a performance metric by showing that authors generally choose good keywords.

  • KEA: Practical automatic keyphrase extraction

    Witten, Ian H.; Paynter, Gordon W.; Frank, Eibe; Gutwin, Carl; Nevill-Manning, Craig G. (2000-03)

    Working or discussion paper
    University of Waikato

    Keyphrases provide semantic metadata that summarize and characterize documents. This paper describes Kea, an algorithm for automatically extracting keyphrases from text. Kea identifies candidate keyphrases using lexical methods, calculates feature values for each candidate, and uses a machine learning algorithm to predict which candidates are good keyphrases. The machine learning scheme first builds a prediction model using training documents with known keyphrases, and then uses the model to find keyphrases in new documents. We use a large test corpus to evaluate Kea’s effectiveness in terms of how many author-assigned keyphrases are correctly identified. The system is simple, robust, and publicly available.

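The extract-then-classify pipeline the Kea abstract describes can be sketched in miniature. This is a toy reconstruction, not Kea's implementation: the two feature values shown (TF×IDF and relative position of first occurrence) match Kea's published design, but the tokeniser, stopword list, and candidate filter here are simplified stand-ins, and the trained prediction model is omitted.

```python
import math
import re

def candidates(text, max_len=3):
    """Candidate phrases: word n-grams of length 1..max_len that neither
    start nor end with a stopword (a crude stand-in for Kea's lexical
    filtering). Maps each phrase to the relative position of its first
    occurrence in the document."""
    stop = {"the", "a", "an", "of", "in", "and", "to", "is", "for", "was"}
    words = re.findall(r"[a-z]+", text.lower())
    seen = {}
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            if gram[0] in stop or gram[-1] in stop:
                continue
            seen.setdefault(" ".join(gram), i / max(len(words), 1))
    return seen

def features(text, corpus):
    """Kea's two classic feature values per candidate: TF x IDF and
    first-occurrence position. A trained model (naive Bayes in the
    published system) would turn these into a keyphrase probability."""
    feats = {}
    n_docs = len(corpus)
    for phrase, pos in candidates(text).items():
        tf = text.lower().count(phrase)          # crude substring count
        df = sum(phrase in d.lower() for d in corpus)
        idf = math.log((n_docs + 1) / (df + 1))
        feats[phrase] = (tf * idf, pos)
    return feats
```

A phrase that recurs in the document but is rare in the corpus, and appears early, scores highly on both features.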
  • Automating iterative tasks with programming by demonstration: a user evaluation

    Paynter, Gordon W.; Witten, Ian H. (1999-05)

    Working or discussion paper
    University of Waikato

    Computer users often face iterative tasks that cannot be automated using the tools and aggregation techniques provided by their application program: they end up performing the iteration by hand, repeating user interface actions over and over again. We have implemented an agent, called Familiar, that can be taught to perform iterative tasks using programming by demonstration (PBD). Unlike other PBD systems, it is domain independent and works with unmodified, widely used applications in a popular operating system. In a formal evaluation, we found that users quickly learned to use the agent to automate iterative tasks. Generally, the participants preferred to use multiple selection where possible, but could and did use PBD in situations involving iteration over many commands, or when other techniques were unavailable.

  • Browsing in digital libraries: a phrase-based approach

    Nevill-Manning, Craig G.; Witten, Ian H.; Paynter, Gordon W. (1997-01)

    Working or discussion paper
    University of Waikato

    A key question for digital libraries is this: how should one go about becoming familiar with a digital collection, as opposed to a physical one? Digital collections generally present an extremely opaque appearance: a screen, typically a Web page, with no indication of what, or how much, lies beyond, whether it is a carefully selected collection or a morass of worthless ephemera, whether it holds half a dozen documents or many millions. At least physical collections occupy physical space, present a physical appearance, and exhibit tangible physical organization. When standing on the threshold of a large library one gains a sense of presence and permanence that reflects the care taken in building and maintaining the collection inside. No one could confuse it with a dung-heap! Yet in the digital world the difference is not so palpable.

  • Notes: an experiment in CSCW

    Apperley, Mark; Gianoutsos, Simon; Grundy, John C.; Paynter, Gordon W.; Reeves, Steve; Venable, John R. (1996-04)

    Working or discussion paper
    University of Waikato

    Computer Supported Co-operative Work (CSCW) systems are complex, yet no computer-based tools of any sophistication exist to support their development. Since several people often need to work together on the same project simultaneously, the computer system often proves to be a bottleneck. CSCW tools are a means of allowing several users to work towards their goal. Systems development is essentially a team process, yet support for CSCW on these systems is in its infancy.

  • A user evaluation of hierarchical phrase browsing

    Edgar, Katrina D.; Nichols, David M.; Paynter, Gordon W.; Thomson, Kirsten; Witten, Ian H. (2003)

    Conference item
    University of Waikato

    Phrase browsing interfaces based on hierarchies of phrases extracted automatically from document collections offer a useful compromise between automatic full-text searching and manually-created subject indexes. The literature contains descriptions of such systems that many find compelling and persuasive. However, evaluation studies have either been anecdotal, or focused on objective measures of the quality of automatically-extracted index terms, or restricted to questions of computational efficiency and feasibility. This paper reports on an empirical, controlled user study that compares hierarchical phrase browsing with full-text searching over a range of information seeking tasks. Users found the results located via phrase browsing to be relevant and useful but preferred keyword searching for certain types of queries. Users' experiences were marred by interface details, including inconsistencies between the phrase browser and the surrounding digital library interface.

  • Applying machine learning to programming by demonstration

    Paynter, Gordon W.; Witten, Ian H.; Koblitz, Neil; Powell, Matthew (2004)

    Journal article
    University of Waikato

    ‘Familiar’ is a tool that helps end-users automate iterative tasks in their applications by showing examples of what they want to do. It observes the user’s actions, predicts what they will do next, and then offers to complete their task. Familiar learns in two ways. First, it creates a model, based on data gathered from training tasks, that selects the best prediction from among several candidates. Experiments show that decision trees outperform heuristic methods, and can be further improved by incrementally updating the classifier at task time. Second, it uses decision stumps inferred from analogous examples in the event trace to predict the parameters of conditional rules. Because data is sparse—for most users balk at giving more than a few training examples—permutation tests are used to calculate the statistical significance of each stump, successfully eliminating bias towards attributes with many different values.

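The permutation-test idea in the final sentence above can be shown concretely. A hedged sketch, not Familiar's code: shuffle the labels many times and ask how often a decision stump fits the shuffled data as well as the real data. An identifier-like attribute with many distinct values fits any labelling perfectly, so it earns a large p-value and is rejected, which is exactly the bias the abstract says the permutation test eliminates.

```python
import random
from collections import Counter, defaultdict

def stump_accuracy(values, labels):
    """Accuracy of the best one-attribute rule ('decision stump'):
    predict the majority label for each observed attribute value."""
    by_value = defaultdict(Counter)
    for v, y in zip(values, labels):
        by_value[v][y] += 1
    correct = sum(c.most_common(1)[0][1] for c in by_value.values())
    return correct / len(labels)

def permutation_p(values, labels, trials=500, seed=0):
    """Fraction of label shuffles on which the stump scores at least
    as well as on the true labels. A small p means the attribute is
    genuinely predictive; a many-valued attribute fits every shuffled
    labelling too, so its p stays close to 1."""
    rng = random.Random(seed)
    observed = stump_accuracy(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if stump_accuracy(values, shuffled) >= observed:
            hits += 1
    return hits / trials
```

With sparse training data (a handful of demonstrated examples) this kind of significance filter is what keeps a stump from latching onto spurious, many-valued attributes.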
  • A combined phrase and thesaurus browser for large document collections

    Paynter, Gordon W.; Witten, Ian H. (2001)

    Conference item
    University of Waikato

    A browsing interface to a document collection can be constructed automatically by identifying the phrases that recur in the full text of the documents and structuring them into a hierarchy based on lexical inclusion. This provides a good way of allowing readers to browse comfortably through the phrases (all phrases) in a large document collection. A subject-oriented thesaurus provides a different kind of hierarchical structure, based on deep knowledge of the subject area. If all documents, or parts of documents, are tagged with thesaurus terms, this provides a very convenient way of browsing through a collection. Unfortunately, manual classification is expensive and infeasible for many practical document collections. This paper describes a browsing scheme that gives the best of both worlds by providing a phrase-oriented browser and a thesaurus browser within the same interface. Users can switch smoothly between the phrases in the collection, which give access to the actual documents, and the thesaurus entries, which suggest new relationships and new terms to seek.

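The "hierarchy based on lexical inclusion" mentioned above can be sketched as follows. This is a minimal illustration under an assumed definition, not the paper's algorithm: each phrase's parent is taken to be the longest other phrase that occurs inside it as a contiguous word sequence, and phrases with no such parent become roots.

```python
def hierarchy(phrases):
    """Map each phrase to its parent under lexical inclusion, or to
    None if it is a root of the browsing hierarchy."""
    def contains(long, short):
        # True if `short` appears in `long` as a contiguous word run.
        lw, sw = long.split(), short.split()
        return any(lw[i:i + len(sw)] == sw for i in range(len(lw) - len(sw) + 1))
    tree = {}
    for p in phrases:
        parents = [q for q in phrases if q != p and contains(p, q)]
        # Prefer the longest containing phrase as the immediate parent.
        tree[p] = max(parents, key=lambda q: len(q.split()), default=None)
    return tree
```

Browsing then means walking from a root phrase down through its successively longer expansions, each of which gives access to the documents containing it.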
  • Metadata tools for institutional repositories

    Nichols, David M.; Paynter, Gordon W.; Chan, Chu-Hsiang; Bainbridge, David; McKay, Dana; Twidale, Michael B.; Blandford, Ann (2008-08)

    Working or discussion paper
    University of Waikato

    Current institutional repository software provides few tools to help metadata librarians understand and analyse their collections. In this paper we compare and contrast metadata analysis tools that were developed simultaneously, but independently, at two New Zealand institutions during a period of national investment in research repositories: the Metadata Analysis Tool (MAT) at The University of Waikato, and the Kiwi Research Information Service (KRIS) at the National Library of New Zealand. The tools have many similarities: they are convenient, online, on-demand services that harvest metadata using OAI-PMH, they were developed in response to feedback from repository administrators, and they both help pinpoint specific metadata errors as well as generating summary statistics. They also have significant differences: one is a dedicated tool while the other is part of a wider access tool; one gives a holistic view of the metadata while the other looks for specific problems; one seeks patterns in the data values while the other checks that those values conform to metadata standards. Both tools work in a complementary manner to existing web-based administration tools. We have observed that discovery and correction of metadata errors can be quickly achieved by switching web browser views from the analysis tool to the repository interface, and back. We summarise the findings from both tools’ deployment into a checklist of requirements for metadata analysis tools.

  • Predicting Library of Congress Classifications from Library of Congress Subject Headings

    Frank, Eibe; Paynter, Gordon W. (2003-01)

    Working or discussion paper
    University of Waikato

    This paper addresses the problem of automatically assigning a Library of Congress Classification (LCC) to a work given its set of Library of Congress Subject Headings (LCSH). LCCs are organized in a tree: the root node of this hierarchy comprises all possible topics, and leaf nodes correspond to the most specialized topic areas defined. We describe a procedure that, given a resource identified by its LCSH, automatically places that resource in the LCC hierarchy. The procedure uses machine learning techniques and training data from a large library catalog to learn a classification model mapping from sets of LCSH to nodes in the LCC tree. We present empirical results for our technique showing its accuracy on an independent collection of 50,000 LCSH/LCC pairs.

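One naive way to realise a mapping from LCSH sets to classifications (emphatically not the paper's learned model, which places resources in the LCC tree) is per-heading voting over a training catalogue; the sketch below only shows the shape of the task.

```python
from collections import Counter, defaultdict

def train(records):
    """records: (lcsh_set, lcc) pairs. Count how often each subject
    heading co-occurs with each classification in the catalogue."""
    votes = defaultdict(Counter)
    for headings, lcc in records:
        for h in headings:
            votes[h][lcc] += 1
    return votes

def predict(votes, headings):
    """Sum the per-heading votes and return the top classification,
    or None when no heading has been seen in training."""
    tally = Counter()
    for h in headings:
        tally.update(votes.get(h, Counter()))
    return tally.most_common(1)[0][0] if tally else None
```

A real system must additionally respect the tree structure, e.g. by backing off to an ancestor node when the evidence at a leaf is thin.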
  • Domain-independent programming by demonstration in existing applications

    Paynter, Gordon W.; Witten, Ian H. (2001-02)

    Book item
    University of Waikato

    This paper describes Familiar, a domain-independent programming by demonstration system for automating iterative tasks in existing, unmodified applications on a popular commercial platform. Familiar is domain-independent in an immediate and practical sense: it requires no domain knowledge from the developer and works immediately with new applications as soon as they are installed. Based on the AppleScript language, the system demonstrates that commercial operating systems are mature enough to support practical, domain-independent programming by demonstration – but only just, for the work exposes many deficiencies.

  • Interactive document summarisation

    Jones, Steve; Lundy, Stephen; Paynter, Gordon W. (2001-02-01)

    Working or discussion paper
    University of Waikato

    This paper describes the Interactive Document Summariser (IDS), a dynamic document summarisation system, which can help users of digital libraries to access on-line documents more effectively. IDS provides dynamic control over summary characteristics, such as length and topic focus, so that changes made by the user are instantly reflected in an on-screen summary. A range of 'summary-in-context' views support seamless transitions between summaries and their source documents. IDS creates summaries by extracting keyphrases from a document with the Kea system, scoring sentences according to the keyphrases that they contain, and then extracting the highest scoring sentences. We report an evaluation of IDS summaries, in which human assessors identified suitable summary sentences in source documents, against which IDS summaries were judged. We found that IDS summaries were better than baseline summaries, and identify the characteristics of Kea keyphrases that lead to the best summaries.

  • Importing documents and metadata into digital libraries: requirements analysis and an extensible architecture

    Witten, Ian H.; Bainbridge, David; Paynter, Gordon W.; Boddie, Stefan J. (2002)

    Conference item
    University of Waikato

    Flexible digital library systems need to be able to accept, or “import,” documents and metadata in a variety of forms, and associate metadata with the appropriate documents. This paper analyzes the requirements of the import process for general digital libraries. The requirements include (a) format conversion for source documents, (b) the ability to incorporate existing conversion utilities, (c) provision for metadata to be specified in the document files themselves and/or in separate metadata files, (d) format conversion for metadata files, (e) provision for metadata to be computed from the document content, and (f) flexible ways of associating metadata with documents or sets of documents. We argue that these requirements are so open-ended that they are best met by an extensible architecture that facilitates the addition of new document formats and metadata facilities to existing digital library systems. An implementation of this architecture is briefly described.

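The extensible import architecture argued for above is commonly realised as a plugin pipeline: each format handler declares what it accepts, and adding a format means adding a plugin rather than changing existing code. A hypothetical sketch; the class and method names are illustrative, not the paper's API.

```python
class Plugin:
    """A plugin declares which files it can process and how to turn
    them into a (text, metadata) pair."""
    def accepts(self, filename):
        raise NotImplementedError
    def process(self, filename, payload):
        raise NotImplementedError

class TextPlugin(Plugin):
    """Trivial handler for plain-text files; derives one metadata
    field from the filename as requirement (e) suggests metadata can
    be computed from content or context."""
    def accepts(self, filename):
        return filename.endswith(".txt")
    def process(self, filename, payload):
        return payload, {"Title": filename.rsplit(".", 1)[0]}

def import_file(plugins, filename, payload):
    """Dispatch to the first plugin that accepts the file."""
    for plugin in plugins:
        if plugin.accepts(filename):
            return plugin.process(filename, payload)
    raise ValueError("no plugin accepts " + filename)
```

Wrapping an existing conversion utility (requirement (b)) is then just another `process` implementation that shells out to the converter.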
  • Experiences in deploying metadata analysis tools for institutional repositories

    Nichols, David M.; Paynter, Gordon W.; Chan, Chu-Hsiang; Bainbridge, David; McKay, Dana; Twidale, Michael B.; Blandford, Ann (2009-04)

    Journal article
    University of Waikato

    Current institutional repository software provides few tools to help metadata librarians understand and analyze their collections. In this article, we compare and contrast metadata analysis tools that were developed simultaneously, but independently, at two New Zealand institutions during a period of national investment in research repositories: the Metadata Analysis Tool (MAT) at The University of Waikato, and the Kiwi Research Information Service (KRIS) at the National Library of New Zealand. The tools have many similarities: they are convenient, online, on-demand services that harvest metadata using OAI-PMH; they were developed in response to feedback from repository administrators; and they both help pinpoint specific metadata errors as well as generating summary statistics. They also have significant differences: one is a dedicated tool whereas the other is part of a wider access tool; one gives a holistic view of the metadata whereas the other looks for specific problems; one seeks patterns in the data values whereas the other checks that those values conform to metadata standards. Both tools work in a complementary manner to existing Web-based administration tools. We have observed that discovery and correction of metadata errors can be quickly achieved by switching Web browser views from the analysis tool to the repository interface, and back. We summarize the findings from both tools' deployment into a checklist of requirements for metadata analysis tools.

  • An Evaluation of Document Keyphrase Sets

    Jones, Steve; Paynter, Gordon W. (2003)

    Conference item
    University of Waikato

    Keywords and keyphrases have many useful roles as document surrogates and descriptors, but the manual production of keyphrase metadata for large digital library collections is at best expensive and time-consuming, and at worst logistically impossible. Algorithms for keyphrase extraction like Kea and Extractor produce a set of phrases that are associated with a document. Though these sets are often utilized as a group, keyphrase extraction is usually evaluated by measuring the quality of individual keyphrases. This paper reports an assessment that asks human assessors to rate entire sets of keyphrases produced by Kea, Extractor and document authors. The results provide further evidence that human assessors rate all three sources highly (with some caveats), but show that the relationship between the quality of the phrases in a set and the set as a whole is not always simple. Choosing the best individual phrases will not necessarily produce the best set; combinations of lesser phrases may result in better overall quality.

  • Scalable browsing for large collections: a case study

    Paynter, Gordon W.; Witten, Ian H.; Cunningham, Sally Jo; Buchanan, George (2000)

    Conference item
    University of Waikato

    Phrase browsing techniques use phrases extracted automatically from a large information collection as a basis for browsing and accessing it. This paper describes a case study that uses an automatically constructed phrase hierarchy to facilitate browsing of an ordinary large Web site. Phrases are extracted from the full text using a novel combination of rudimentary syntactic processing and sequential grammar induction techniques. The interface is simple, robust and easy to use. To convey a feeling for the quality of the phrases that are generated automatically, a thesaurus used by the organization responsible for the Web site is studied and its degree of overlap with the phrases in the hierarchy is analyzed. Our ultimate goal is to amalgamate hierarchical phrase browsing and hierarchical thesaurus browsing: the latter provides an authoritative domain vocabulary and the former augments coverage in areas the thesaurus does not reach.

  • Improving browsing in digital libraries with keyphrase indexes

    Gutwin, Carl; Paynter, Gordon W.; Witten, Ian H.; Nevill-Manning, Craig G.; Frank, Eibe (1999)

    Journal article
    University of Waikato

    Browsing accounts for much of people's interaction with digital libraries, but it is poorly supported by standard search engines. Conventional systems often operate at the wrong level, indexing words when people think in terms of topics, and returning documents when people want a broader view. As a result, users cannot easily determine what is in a collection, how well a particular topic is covered, or what kinds of queries will provide useful results. We have built a new kind of search engine, Keyphind, that is explicitly designed to support browsing. Automatically extracted keyphrases form the basic unit of both indexing and presentation, allowing users to interact with the collection at the level of topics and subjects rather than words and documents. The keyphrase index also provides a simple mechanism for clustering documents, refining queries, and previewing results. We compared Keyphind to a traditional query engine in a small usability study. Users reported that certain kinds of browsing tasks were much easier with the new interface, indicating that a keyphrase index would be a useful supplement to existing search tools.

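The core idea above, keyphrases as the unit of both indexing and presentation, can be shown with a toy inverted index. A sketch of the data structure only (the names are illustrative, not Keyphind's): browsing is listing the phrases, and refining a query is intersecting their posting sets.

```python
from collections import defaultdict

def build_index(doc_phrases):
    """Inverted index from keyphrase to the set of documents it
    describes. doc_phrases maps document id -> set of keyphrases."""
    index = defaultdict(set)
    for doc_id, phrases in doc_phrases.items():
        for p in phrases:
            index[p].add(doc_id)
    return index

def refine(index, *phrases):
    """Documents matching every chosen phrase; each added phrase
    narrows the result set, which is how query refinement works."""
    sets = [index.get(p, set()) for p in phrases]
    return set.intersection(*sets) if sets else set()
```

Clustering and result previews fall out of the same structure: each posting set is a ready-made topical cluster, and its size previews how well the collection covers that topic.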
  • Domain-specific keyphrase extraction

    Frank, Eibe; Paynter, Gordon W.; Witten, Ian H.; Gutwin, Carl; Nevill-Manning, Craig G. (1999)

    Conference item
    University of Waikato

    Keyphrases are an important means of document summarization, clustering, and topic search. Only a small minority of documents have author-assigned keyphrases, and manually assigning keyphrases to existing documents is very laborious. Therefore it is highly desirable to automate the keyphrase extraction process. This paper shows that a simple procedure for keyphrase extraction based on the naive Bayes learning scheme performs comparably to the state of the art. It goes on to explain how this procedure's performance can be boosted by automatically tailoring the extraction process to the particular document collection at hand. Results on a large collection of technical reports in computer science show that the quality of the extracted keyphrases improves significantly when domain-specific information is exploited.
