1,758 results for Working or discussion paper

  • Implementing an event-driven service-oriented architecture in TIP

    Rinck, Michael; Hinze, Annika (2010-06-17)

    Working or discussion paper
    University of Waikato

    Many mobile devices have a density of services, many of which are context or location-aware. To function, many of these services have to collaborate with other services, which may be located in many different places and networks. There is often more than one service suitable for the task at hand. To decide which service to use, quality of service measurements like the accuracy or reliability of a service need to be known. Users do not want third parties to have statistics on how and where they used services. Therefore the collaboration needs to be anonymous. This project implements a model of event-based context-aware service collaboration on a publish/subscribe basis. We compare different implementation designs, with a focus on the anonymity and quality of service of the services.

  • Getting research students started: a tale of two courses

    Witten, Ian H.; Bell, Timothy C. (1992)

    Working or discussion paper
    University of Waikato

    As graduate programs in Computer Science grow and mature and undergraduate populations stabilize, an increasing proportion of our resources is being devoted to the training of researchers in the field. Many inefficiencies are evident in our graduate programs. These include undesirably long average times to thesis completion, students' poor work habits and general lack of professionalism, and the unnecessary duplication of having supervisors introduce their students individually to the basics of research. Solving these problems requires specifically targeted education to get students started in their graduate research and introduce them to the skills and tools needed to complete it efficiently and effectively. We have used two different approaches in our respective departments. One is a (half-) credit course on research skills; the other a one-week intensive non-credit "survival course" at the beginning of the year. The advantage of the former is the opportunity to cover material in depth and for students to practice their skills; the latter is much less demanding on students and is easier to fit into an existing graduate program.

  • Random model trees: an effective and scalable regression method

    Pfahringer, Bernhard (2010-06)

    Working or discussion paper
    University of Waikato

    We present and investigate ensembles of randomized model trees as a novel regression method. Such ensembles combine the scalability of tree-based methods with predictive performance rivaling the state of the art in numeric prediction. An extensive empirical investigation shows that Random Model Trees produce predictive performance which is competitive with state-of-the-art methods like Gaussian Processes Regression or Additive Groves of Regression Trees. The training and optimization of Random Model Trees scale better than Gaussian Processes Regression to larger datasets, and enjoy a constant advantage over Additive Groves of the order of one to two orders of magnitude.

  • Batch-Incremental Learning for Mining Data Streams

    Holmes, Geoffrey; Kirkby, Richard Brendon; Bainbridge, David (2004)

    Working or discussion paper
    University of Waikato

    The data stream model for data mining places harsh restrictions on a learning algorithm. First, a model must be induced incrementally. Second, processing time for instances must keep up with their speed of arrival. Third, a model may only use a constant amount of memory, and must be ready for prediction at any point in time. We attempt to overcome these restrictions by presenting a data stream classification algorithm where the data is split into a stream of disjoint batches. Single batches of data can be processed one after the other by any standard non-incremental learning algorithm. Our approach uses ensembles of decision trees. These tree ensembles are iteratively merged into a single interpretable model of constant maximal size. Using benchmark datasets the algorithm is evaluated for accuracy against state-of-the-art algorithms that make use of the entire dataset.

  • Metadata tools for institutional repositories

    Nichols, David M.; Paynter, Gordon W.; Chan, Chu-Hsiang; Bainbridge, David; McKay, Dana; Twidale, Michael B.; Blandford, Ann (2008-08)

    Working or discussion paper
    University of Waikato

    Current institutional repository software provides few tools to help metadata librarians understand and analyse their collections. In this paper we compare and contrast metadata analysis tools that were developed simultaneously, but independently, at two New Zealand institutions during a period of national investment in research repositories: the Metadata Analysis Tool (MAT) at The University of Waikato, and the Kiwi Research Information Service (KRIS) at the National Library of New Zealand. The tools have many similarities: they are convenient, online, on-demand services that harvest metadata using OAI-PMH, they were developed in response to feedback from repository administrators, and they both help pinpoint specific metadata errors as well as generating summary statistics. They also have significant differences: one is a dedicated tool while the other is part of a wider access tool; one gives a holistic view of the metadata while the other looks for specific problems; one seeks patterns in the data values while the other checks that those values conform to metadata standards. Both tools work in a complementary manner to existing web-based administration tools. We have observed that discovery and correction of metadata errors can be quickly achieved by switching web browser views from the analysis tool to the repository interface, and back. We summarise the findings from both tools’ deployment into a checklist of requirements for metadata analysis tools.

  • Mining meaning from Wikipedia

    Medelyan, Olena; Legg, Catherine; Milne, David N.; Witten, Ian H. (2008-09)

    Working or discussion paper
    University of Waikato

    Wikipedia is a goldmine of information, not just for its many readers, but also for the growing community of researchers who recognize it as a resource of exceptional scale and utility. It represents a vast investment of manual effort and judgment: a huge, constantly evolving tapestry of concepts and relations that is being applied to a host of tasks. This article provides a comprehensive description of this work. It focuses on research that extracts and makes use of the concepts, relations, facts and descriptions found in Wikipedia, and organizes the work into four broad categories: applying Wikipedia to natural language processing; using it to facilitate information retrieval; using it for information extraction; and using it as a resource for ontology building. The article addresses how Wikipedia is being used as is, how it is being improved and adapted, and how it is being combined with other structures to create entirely new resources. We identify the research groups and individuals involved, and how their work has developed in the last few years. We provide a comprehensive list of the open-source software they have produced. We also discuss the implications of this work for the long-awaited semantic web.

  • The syntax and semantics of μ-Charts

    Reeve, Greg; Reeves, Steve (2004-02)

    Working or discussion paper
    University of Waikato

    μ-Charts is a language for specifying the behaviour of reactive systems. The language is a simplified variant of the well-known language Statecharts that was introduced by Harel. Development of the μ-Charts language is ongoing research undertaken under the auspices of the Formal Methods Laboratory of the Computer Science Department, University of Waikato. This paper gives a comprehensive treatment of the syntax and semantics of μ-Charts.

  • Applying propositional learning algorithms to multi-instance data

    Frank, Eibe; Xu, Xin (2003-06)

    Working or discussion paper
    University of Waikato

    Multi-instance learning is commonly tackled using special-purpose algorithms. Development of these algorithms has started because early experiments with standard propositional learners have failed to produce satisfactory results on multi-instance data—more specifically, the Musk data. In this paper we present evidence that this is not necessarily the case. We introduce a simple wrapper for applying standard propositional learners to multi-instance problems and present empirical results for the Musk data that are competitive with genuine multi-instance algorithms. The key features of our new wrapper technique are: (1) it discards the standard multi-instance assumption that there is some inherent difference between positive and negative bags, and (2) it introduces weights to treat instances from different bags differently. We show that these two modifications are essential for producing good results on the Musk benchmark datasets.

  • From sit-forward to lean-back: Using a mobile device to vary interactive pace

    Jones, Mark; Jain, Preeti; Buchanan, George; Marsden, Gary (2003-03)

    Working or discussion paper
    University of Waikato

    Although online, handheld, mobile computers offer new possibilities in searching and retrieving information on the go, the fast-paced, "sit-forward" style of interaction may not be appropriate for all user search needs. In this paper, we explore how a handheld computer can be used to enable interactive search experiences that vary in pace from fast and immediate through to reflective and delayed. We describe a system that asynchronously combines an offline handheld computer and an online desktop Personal Computer, and discuss some results of an initial user evaluation.

  • A logic boosting approach to inducing multiclass alternating decision trees

    Holmes, Geoffrey; Pfahringer, Bernhard; Kirkby, Richard Brendon; Frank, Eibe; Hall, Mark A. (2002-03)

    Working or discussion paper
    University of Waikato

    The alternating decision tree (ADTree) is a successful classification technique that combines decision trees with the predictive accuracy of boosting into a set of interpretable classification rules. The original formulation of the tree induction algorithm restricted attention to binary classification problems. This paper empirically evaluates several methods for extending the algorithm to the multiclass case by splitting the problem into several two-class problems, and then adapts the LogitBoost procedure to induce alternating decision trees directly. Experimental results confirm that this procedure is comparable in accuracy with methods that are based on the original ADTree formulation, while inducing much smaller trees.

  • Research laboratory survey

    Thomson, Kirsten (2002-11)

    Working or discussion paper
    University of Waikato

    This report presents the results of a survey conducted by the University of Waikato Usability Laboratory of the research laboratories at the Department of Computer Science, The University of Waikato, Hamilton, New Zealand. The study was conducted on behalf of the Department of Computer Science. The goals of the research were to: inform the development of future laboratories; inform the process of any re-development of current laboratories; and provide information about the use and acceptance of the laboratories.

  • A development environment for predictive modelling in foods

    Holmes, Geoffrey; Hall, Mark A. (2000-07)

    Working or discussion paper
    University of Waikato

    WEKA (Waikato Environment for Knowledge Analysis) is a comprehensive suite of Java class libraries that implement many state-of-the-art machine learning/data mining algorithms. Non-programmers interact with the software via a user interface component called the Knowledge Explorer. Applications constructed from the WEKA class libraries can be run on any computer with a web browsing capability, allowing users to apply machine learning techniques to their own data regardless of computer platform. This paper describes the user interface component of the WEKA system in reference to previous applications in the predictive modeling of foods.

  • A compression-based algorithm for Chinese word segmentation

    Teahan, W.J.; Wen, Yingying; McNab, Rodger J.; Witten, Ian H. (1999-09)

    Working or discussion paper
    University of Waikato

    The Chinese language is written without using spaces or other word delimiters. Although a text may be thought of as a corresponding sequence of words, there is considerable ambiguity in the placement of boundaries. Interpreting a text as a sequence of words is beneficial for some information retrieval and storage tasks: for example, full-text search, word-based compression, and keyphrase extraction. We describe a scheme that infers appropriate positions for word boundaries using an adaptive language model that is standard in text compression. It is trained on a corpus of pre-segmented text, and when applied to new text, interpolates word boundaries so as to maximize the compression obtained. This simple and general method performs well with respect to specialized schemes for Chinese language segmentation.

  • The Niupepa Collection: Opening the blinds on a window to the past

    Keegan, Te Taka Adrian Gregory; Cunningham, Sally Jo; Apperley, Mark (1999-12)

    Working or discussion paper
    University of Waikato

    This paper describes the building of a digital library collection of historic newspapers. The newspapers (Niupepa in Maori), which were published in New Zealand during the period 1842 to 1933, form a unique historical record of the Maori language, and of events from an historical perspective. Images of these newspapers have been converted to digital form, electronic text extracted from these, and the collection is now being made available over the Internet as a part of the New Zealand Digital Library (NZDL) project at the University of Waikato.

  • Melody based tune retrieval over the World Wide Web

    Bainbridge, David; McNab, Rodger J.; Smith, Lloyd A. (1998-11)

    Working or discussion paper
    University of Waikato

    In this paper we describe the steps taken to develop a Web-based version of an existing stand-alone, single-user digital library application for melodic searching of a collection of music. For each of the three key components (input, searching, and output) we assess the suitability of various Web-based strategies that deal with the now distributed software architecture, and explain the decisions we made. The resulting melody indexing service, known as MELDEX, has been in operation for one year, and the feedback we have received has been favorable.

  • A graphical notation for the design of information visualisations

    Humphrey, Matthew C. (1997-02)

    Working or discussion paper
    University of Waikato

    Visualisations are coherent, graphical expressions of complex information that enhance people’s ability to communicate and reason about that information. Yet despite the importance of visualisations in helping people to understand and solve a wide variety of problems, there is a dearth of formal tools and methods for discussing, describing and designing them. Although simple visualisations, such as bar charts and scatterplots, are easily produced by modern interactive software, novel visualisations of multivariate, multirelational data must be expressed in a programming language. The Relational Visualisation Notation is a new, graphical language for designing such highly expressive visualisations that does not use programming constructs. Instead, the notation is based on relational algebra, which is widely used in database query languages, and it is supported by a suite of direct manipulation tools. This article presents the notation and examines the designs of some interesting visualisations.

  • Applications of machine learning in information retrieval

    Cunningham, Sally Jo; Littin, James; Witten, Ian H. (1997-02)

    Working or discussion paper
    University of Waikato

    Information retrieval systems provide access to collections of thousands, or millions, of documents, from which, by providing an appropriate description, users can recover any one. Typically, users iteratively refine the descriptions they provide to satisfy their needs, and retrieval systems can utilize user feedback on selected documents to indicate the accuracy of the description at any stage. The style of description required from the user, and the way it is employed to search the document database, are consequences of the indexing method used for the collection. The index may take different forms, from storing keywords with links to individual documents, to clustering documents under related topics.

  • A sight-singing tutor

    Smith, Lloyd A.; McNab, Rodger J. (1997-03)

    Working or discussion paper
    University of Waikato

    This paper describes a computer program designed to aid its users in learning to sight-sing. Sight-singing, the ability to sing music from a score without prior study, is an important skill for musicians and holds a central place in most university music curricula. Its importance to vocalists is obvious; it is also an important skill for instrumentalists and conductors because it develops the aural imagination necessary to judge how the music should sound when played (Benward and Carr 1991). Furthermore, it is an important skill for amateur musicians, who can save a great deal of rehearsal time through an ability to sing music at sight.

  • Extracting text from PostScript

    Nevill-Manning, Craig G.; Reed, Todd; Witten, Ian H. (1997-04)

    Working or discussion paper
    University of Waikato

    We show how to extract plain text from PostScript files. A textual scan is inadequate because PostScript interpreters can generate characters on the page that do not appear in the source file. Furthermore, word and line breaks are implicit in the graphical rendition, and must be inferred from the positioning of word fragments. We present a robust technique for extracting text and recognizing words and paragraphs. The method uses a standard PostScript interpreter but redefines several PostScript operators, and simple heuristics are employed to locate word and line breaks. The scheme has been used to create a full-text index, and plain-text versions, of 40,000 technical reports (34 Gbyte of PostScript). Other text-extraction systems are reviewed: none offer the same combination of robustness and simplicity.

  • Internationalising a spreadsheet for Pacific Basin languages

    Barbour, Robert H.; Yeo, Alvin (1997-07)

    Working or discussion paper
    University of Waikato

    As people trade and engage in commerce, an economically dominant culture tends to migrate language into other recently contacted cultures. Information technology (IT) can accelerate enculturation and promote the expansion of western hegemony in IT. Equally, IT can present a culturally appropriate interface to the user that promotes the preservation of culture and language with very little additional effort. In this paper a spreadsheet is internationalised to accept languages from the Latin-1 character set such as English, Maori and Bahasa Melayu (Malaysia’s national language). A technique that allows a non-programmer to add a new language to the spreadsheet is described. The technique could also be used to internationalise other software at the point of design by following the steps we outline.
