5,588 results for Conference item

  • Classifier chains for multi-label classification

    Read, Jesse; Pfahringer, Bernhard; Holmes, Geoffrey; Frank, Eibe (2009)

    Conference item
    University of Waikato

    The widely known binary relevance method for multi-label classification, which considers each label as an independent binary problem, has been sidelined in the literature due to the perceived inadequacy of its label-independence assumption. Instead, most current methods invest considerable complexity to model interdependencies between labels. This paper shows that binary relevance-based methods have much to offer, especially in terms of scalability to large datasets. We exemplify this with a novel chaining method that can model label correlations while maintaining acceptable computational complexity. Empirical evaluation over a broad range of multi-label datasets with a variety of evaluation metrics demonstrates the competitiveness of our chaining method against related and state-of-the-art methods, both in terms of predictive performance and time complexity.
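
    The chaining idea in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the usual reading of a classifier chain, in which the binary model for label j receives the input features plus the values of labels 0..j-1. The `CentroidClassifier` base learner and all names here are hypothetical stand-ins chosen to keep the example dependency-free.

```python
from typing import List, Sequence


class CentroidClassifier:
    """Tiny binary base learner (an assumption of this sketch): nearest class centroid."""

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[int]) -> "CentroidClassifier":
        self.centroids = {}
        for label in (0, 1):
            rows = [x for x, t in zip(X, y) if t == label]
            if rows:  # a class with no examples gets no centroid
                self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x: Sequence[float]) -> int:
        def sq_dist(c: Sequence[float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(x, c))

        return min(self.centroids, key=lambda lab: sq_dist(self.centroids[lab]))


class ClassifierChain:
    """One binary model per label; model j also sees labels 0..j-1 as extra features."""

    def __init__(self, n_labels: int, base=CentroidClassifier):
        self.models = [base() for _ in range(n_labels)]

    def fit(self, X, Y) -> "ClassifierChain":
        for j, model in enumerate(self.models):
            # Training uses the true values of the preceding labels.
            Xj = [list(x) + list(y[:j]) for x, y in zip(X, Y)]
            model.fit(Xj, [y[j] for y in Y])
        return self

    def predict_one(self, x) -> List[int]:
        preds: List[int] = []
        for model in self.models:
            # Prediction feeds the chain's own earlier predictions forward.
            preds.append(model.predict_one(list(x) + preds))
        return preds


X = [[0.0], [0.1], [1.0], [1.1]]
Y = [[0, 0], [0, 0], [1, 1], [1, 1]]
chain = ClassifierChain(n_labels=2).fit(X, Y)
```

    Training cost stays close to that of binary relevance (one binary model per label), which is consistent with the scalability argument made in the abstract.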

  • Multiple range imaging camera operation with minimal performance impact

    Whyte, Refael Z.; Payne, Andrew D.; Dorrington, Adrian A.; Cree, Michael J. (2010)

    Conference item
    University of Waikato

    Time-of-flight range imaging cameras operate by illuminating a scene with amplitude modulated light and measuring the phase shift of the modulation envelope between the emitted and reflected light. Object distance can then be calculated from this phase measurement. This approach does not work in multiple camera environments, as the measured phase is corrupted by the illumination from other cameras. To minimize inaccuracies in multiple camera environments, replacing the traditional cyclic modulation with pseudo-noise amplitude modulation has been previously demonstrated. However, this technique effectively reduced the modulation frequency, therefore decreasing the distance measurement precision (which has a proportional relationship with the modulation frequency). A new modulation scheme, using maximum length pseudo-random sequences binary phase encoded onto the existing cyclic amplitude modulation, is presented. The effective modulation frequency therefore remains unchanged, providing range measurements with high precision. The effectiveness of the new modulation scheme was verified using a custom time-of-flight camera based on the PMD19-K2 range imaging sensor. The new pseudo-noise modulation has no significant performance decrease in a single camera environment. In a two-camera environment, the precision is only reduced by the increased photon shot noise from the second illumination source.
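
    The phase-to-distance step mentioned in this abstract can be illustrated with the standard relation for amplitude-modulated continuous-wave ranging, d = c * phi / (4 * pi * f). The sketch below is illustrative only: the function names and the 30 MHz modulation frequency are assumptions, not values taken from the paper.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def distance_from_phase(phase_rad: float, mod_freq_hz: float) -> float:
    """Distance for a measured phase shift; the light travels out and back, hence 4*pi."""
    return SPEED_OF_LIGHT * phase_rad / (4.0 * math.pi * mod_freq_hz)


def unambiguous_range(mod_freq_hz: float) -> float:
    """Largest distance before the phase wraps past 2*pi."""
    return SPEED_OF_LIGHT / (2.0 * mod_freq_hz)


# Hypothetical example: a phase shift of pi at an assumed 30 MHz modulation.
example_distance = distance_from_phase(math.pi, 30e6)
```

    For a fixed phase error, the distance error shrinks as the modulation frequency rises, which is why a scheme that lowers the effective modulation frequency costs precision.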

  • On the role of pre and post-processing in environmental data mining

    Gibert, Karina; Izquierdo, Joaquin; Holmes, Geoffrey; Athanasiadis, Ioannis; Comas, Joaquim; Sanchez-Marre, Miquel (2008)

    Conference item
    University of Waikato

    The quality of discovered knowledge depends heavily on data quality. Unfortunately, real data tend to contain noise, uncertainty, errors, redundancies or even irrelevant information. The more complex the reality to be analyzed, the higher the risk of getting low quality data. Knowledge Discovery from Databases (KDD) offers a global framework to prepare data in the right form to perform correct analyses. On the other hand, the quality of decisions taken upon KDD results depends not only on the quality of the results themselves, but also on the capacity of the system to communicate those results in an understandable form. Environmental systems are particularly complex, and environmental users particularly require clarity in their results. This paper provides some details about how this can be achieved. The role of pre- and post-processing in the whole process of Knowledge Discovery in environmental systems is discussed.

  • Batch-incremental versus instance-incremental learning in dynamic and evolving data

    Read, Jesse; Bifet, Albert; Pfahringer, Bernhard; Holmes, Geoffrey (2012)

    Conference item
    University of Waikato

    Many real world problems involve the challenging context of data streams, where classifiers must be incremental: able to learn from a theoretically infinite stream of examples using limited time and memory, while being able to predict at any point. Two approaches dominate the literature: batch-incremental methods that gather examples in batches to train models; and instance-incremental methods that learn from each example as it arrives. Typically, papers in the literature choose one of these approaches, but provide insufficient evidence or references to justify their choice. We provide a first in-depth analysis comparing both approaches, including how they adapt to concept drift, and an extensive empirical study to compare several different versions of each approach. Our results reveal the respective advantages and disadvantages of the methods, which we discuss in detail.

  • An algorithm for compositional nonblocking verification of extended finite-state machines

    Mohajerani, Sahar; Malik, Robi; Fabian, Martin (2014)

    Conference item
    University of Waikato

    This paper describes an approach for compositional nonblocking verification of discrete event systems modelled as extended finite-state machines (EFSM). Previous results about finite-state machines in lock-step synchronisation are generalised and applied to EFSMs communicating via shared variables. This gives rise to an EFSM-based conflict check algorithm that composes EFSMs gradually and partially unfolds variables as needed. At each step, components are simplified using conflict-equivalence preserving abstraction. The algorithm has been implemented in the discrete event systems tool Supremica. The paper presents experimental results for the verification of two scalable manufacturing system models, and shows that the EFSM-based algorithm verifies some large models faster than previously used methods.

  • Realistic books: A bizarre homage to an obsolete medium?

    Chu, Yi-Chun; Bainbridge, David; Jones, Matt; Witten, Ian H. (2004)

    Conference item
    University of Waikato

    For many readers, handling a physical book is an enjoyably exquisite part of the information seeking process. Many physical characteristics of a book (its size, heft, the patina of use on its pages and so on) communicate ambient qualities of the document it represents. In contrast, the experience of accessing and exploring digital library documents is often dull. The emphasis is utilitarian; technophile rather than bibliophile. We have extended the page-turning algorithm we reported at last year's JCDL into a scalable, systematic approach that allows users to view and interact with realistic visualizations of any text-based document in a Greenstone collection. Here, we further motivate the approach, illustrate the system in use, discuss the system architecture and present a user evaluation. Our work leads us to believe that, far from being a whimsical gimmick, physical book models can usefully complement conventional document viewers and increase the perceived value of a digital library system.

  • Clustering documents with active learning using Wikipedia

    Huang, Anna; Witten, Ian H.; Frank, Eibe; Milne, David N. (2009)

    Conference item
    University of Waikato

    Wikipedia has been applied as a background knowledge base to various text mining problems, but very few attempts have been made to utilize it for document clustering. In this paper we propose to exploit the semantic knowledge in Wikipedia for clustering, enabling the automatic grouping of documents with similar themes. Although clustering is intrinsically unsupervised, recent research has shown that incorporating supervision improves clustering performance, even when limited supervision is provided. The approach presented in this paper applies supervision using active learning. We first utilize Wikipedia to create a concept-based representation of a text document, with each concept associated to a Wikipedia article. We then exploit the semantic relatedness between Wikipedia concepts to find pair-wise instance-level constraints for supervised clustering, guiding clustering towards the direction indicated by the constraints. We test our approach on three standard text document datasets. Empirical results show that our basic document representation strategy yields comparable performance to previous attempts; and adding constraints improves clustering performance further by up to 20%.

  • Multi-label classification using ensembles of pruned sets

    Read, Jesse; Pfahringer, Bernhard; Holmes, Geoffrey (2008)

    Conference item
    University of Waikato

    This paper presents a Pruned Sets method (PS) for multi-label classification. It is centred on the concept of treating sets of labels as single labels. This allows the classification process to inherently take into account correlations between labels. By pruning these sets, PS focuses only on the most important correlations, which reduces complexity and improves accuracy. By combining pruned sets in an ensemble scheme (EPS), new label sets can be formed to adapt to irregular or complex data. The results from experimental evaluation on a variety of multi-label datasets show that [E]PS can achieve better performance and train much faster than other multi-label methods.

  • Pitfalls in benchmarking data stream classification and how to avoid them

    Bifet, Albert; Read, Jesse; Žliobaitė, Indrė; Pfahringer, Bernhard; Holmes, Geoffrey (2013)

    Conference item
    University of Waikato

    Data stream classification plays an important role in modern data analysis, where data arrives in a stream and needs to be mined in real time. In the data stream setting the underlying distribution from which this data comes may be changing and evolving, and so classifiers that can update themselves during operation are becoming the state-of-the-art. In this paper we show that data streams may have an important temporal component, which currently is not considered in the evaluation and benchmarking of data stream classifiers. We demonstrate how a naive classifier that considers only the temporal component outperforms many current state-of-the-art classifiers on real data streams that have temporal dependence, i.e. where the data is autocorrelated. We propose to evaluate data stream classifiers taking into account temporal dependence, and introduce a new evaluation measure, which provides a more accurate gauge of data stream classifier performance. In response to the temporal dependence issue we propose a generic wrapper for data stream classifiers, which incorporates the temporal component into the attribute space.
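
    One plausible reading of the "naive classifier" in this abstract is a no-change predictor that always outputs the previous label. The sketch below assumes that reading; the function names and the toy stream are hypothetical, and the majority-class figure is computed in hindsight purely for comparison.

```python
from collections import Counter
from typing import Hashable, Sequence


def no_change_accuracy(labels: Sequence[Hashable]) -> float:
    """Accuracy of predicting each label as equal to the one seen just before it."""
    if len(labels) < 2:
        return 0.0
    hits = sum(1 for prev, cur in zip(labels, labels[1:]) if prev == cur)
    return hits / (len(labels) - 1)


def majority_accuracy(labels: Sequence[Hashable]) -> float:
    """Accuracy of always predicting the most frequent label (computed in hindsight)."""
    return Counter(labels).most_common(1)[0][1] / len(labels)


# Toy autocorrelated stream: labels arrive in runs rather than independently.
stream = [0] * 5 + [1] * 5 + [0] * 5 + [1] * 5
persistence = no_change_accuracy(stream)  # 16 of 19 transitions repeat the label
baseline = majority_accuracy(stream)      # both classes are equally frequent
```

    On this toy stream the no-change predictor is right on 16 of 19 transitions while the majority baseline reaches only 0.5, which illustrates why an evaluation that ignores temporal dependence can flatter a classifier.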

  • One-Class Classification by Combining Density and Class Probability Estimation

    Hempstalk, Kathryn; Frank, Eibe; Witten, Ian H. (2008)

    Conference item
    University of Waikato

    One-class classification has important applications such as outlier and novelty detection. It is commonly tackled using density estimation techniques or by adapting a standard classification algorithm to the problem of carving out a decision boundary that describes the location of the target data. In this paper we investigate a simple method for one-class classification that combines the application of a density estimator, used to form a reference distribution, with the induction of a standard model for class probability estimation. In this method, the reference distribution is used to generate artificial data that is employed to form a second, artificial class. In conjunction with the target class, this artificial class is the basis for a standard two-class learning problem. We explain how the density function of the reference distribution can be combined with the class probability estimates obtained in this way to form an adjusted estimate of the density function of the target class. Using UCI datasets, and data from a typist recognition problem, we show that the combined model, consisting of both a density estimator and a class probability estimator, can improve on using either component technique alone when used for one-class classification. We also compare the method to one-class classification using support vector machines.
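
    The combination step described in this abstract can be illustrated with Bayes' rule. If a two-class model gives P(target | x) for target data versus artificial data drawn from a reference density, then the posterior odds equal the prior odds times the density ratio, so the target density can be recovered from the reference density. The sketch below follows that derivation; the function and parameter names are hypothetical, and it is not claimed to be the paper's exact formulation.

```python
def adjusted_target_density(
    p_target_given_x: float,
    reference_density: float,
    prior_target: float = 0.5,
) -> float:
    """Estimate the target-class density at x.

    From Bayes' rule:
        P(T|x) / (1 - P(T|x)) = [P(T) / (1 - P(T))] * f_target(x) / f_reference(x)
    which rearranges to
        f_target(x) = posterior_odds / prior_odds * f_reference(x)
    """
    if not 0.0 < prior_target < 1.0:
        raise ValueError("prior_target must lie strictly between 0 and 1")
    if not 0.0 <= p_target_given_x < 1.0:
        raise ValueError("p_target_given_x must lie in [0, 1)")
    prior_odds = prior_target / (1.0 - prior_target)
    posterior_odds = p_target_given_x / (1.0 - p_target_given_x)
    return posterior_odds / prior_odds * reference_density


# With equal priors and an uninformative classifier (0.5), the estimate
# falls back to the reference density itself.
same_as_reference = adjusted_target_density(0.5, 0.2)
# A confident classifier (0.8) scales the reference density up by the odds, 4.
scaled_up = adjusted_target_density(0.8, 0.1)
```

    A point would then be accepted as belonging to the target class when the adjusted density exceeds a chosen threshold; how that threshold is set is outside this sketch.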

  • New Options for Hoeffding Trees

    Pfahringer, Bernhard; Holmes, Geoffrey; Kirkby, Richard Brendon (2007)

    Conference item
    University of Waikato

    Hoeffding trees are state-of-the-art for processing high-speed data streams. Their ingenuity stems from updating sufficient statistics, only addressing growth when decisions can be made that are guaranteed to be almost identical to those that would be made by conventional batch learning methods. Despite this guarantee, decisions are still subject to limited lookahead and stability issues. In this paper we explore Hoeffding Option Trees, a regular Hoeffding tree containing additional option nodes that allow several tests to be applied, leading to multiple Hoeffding trees as separate paths. We show how to control tree growth in order to generate a mixture of paths, and empirically determine a reasonable number of paths. We then empirically evaluate a spectrum of Hoeffding tree variations: single trees, option trees and bagged trees. Finally, we investigate pruning. We show that on some datasets a pruned option tree can be smaller and more accurate than a single tree.
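
    The guarantee mentioned in this abstract rests on the Hoeffding bound: after n observations of a quantity with range R, the observed mean is within epsilon = sqrt(R^2 * ln(1/delta) / (2n)) of the true mean with probability at least 1 - delta. The sketch below shows how such a bound is commonly used to decide when a leaf may split; the function names and the example numbers are assumptions, and the option-node logic of the paper is not modelled.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Epsilon such that the observed mean is within epsilon of the true mean
    with probability at least 1 - delta after n observations."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def can_split(best_gain: float, second_gain: float,
              value_range: float, delta: float, n: int) -> bool:
    """Split only when the best test beats the runner-up by more than epsilon."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


# Hypothetical numbers: range 1.0, delta 1e-7, 1000 examples seen at the leaf.
epsilon = hoeffding_bound(1.0, 1e-7, 1000)
```

    When two candidate tests stay within epsilon of each other, a single tree has to wait or pick one; the abstract's option nodes are a way of keeping several such tests instead.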

  • A low frequency supercapacitor circulation technique to improve the efficiency of linear regulators based on LDO ICs

    Kularatna, Nihal; Fernando, Jayathu; Kankanamge, Kosala; Zhang, Xu (2011)

    Conference item
    University of Waikato

    Linear regulators have output specifications far superior to switch-mode techniques, except for the overall efficiency. This efficiency limitation can be overcome by applying a very low frequency supercapacitor circulation technique at the input side of a low dropout regulator IC. The technique was proven in 12 V to 5 V versions and can be easily applied to other power supplies such as the 5 V to 3.3 V or 5 V to 1.5 V versions required by various processors. The paper outlines the concepts and experimental results related to this technique. With commercial LDO chips available with output current ratings up to 10 A, and thin profile supercapacitors available with DC voltage ratings from 2.3 V to 5.5 V, the technique assists in developing medium current linear regulators that could compete with present-day switch-mode power supplies in efficiency and compactness while maintaining the superior output specifications of a linear regulator.

  • Challenges in cross-cultural/multilingual music information seeking

    Lee, Jin Ha; Downie, J. Stephen; Cunningham, Sally Jo (2005-09-01)

    Conference item
    University of Waikato

    Understanding and meeting the needs of a broad range of music users across different cultures and languages are central to designing a global music digital library. This exploratory study examines cross-cultural/multilingual music information seeking behaviors and reveals some important characteristics of these behaviors by analyzing 107 authentic music information queries from the Korean knowledge search portal Naver (knowledge) iN and 150 queries from the Google Answers website. We conclude that new sets of access points must be developed to accommodate music queries that cross cultural or language boundaries.

  • Greenstone digital library software: current research

    Bainbridge, David; Witten, Ian H. (2004)

    Conference item
    University of Waikato

    The Greenstone digital library software (www.greenstone.org) provides a flexible way of organizing information and publishing it on the Internet or removable media such as CD-ROM. Its aim is to empower users, particularly in universities, libraries and other public service institutions, to build their own digital libraries. It is open-source software, issued under the terms of the GNU General Public License. It is produced by the New Zealand Digital Library Project at the University of Waikato, and developed and distributed in cooperation with UNESCO and the Human Info NGO.

  • New ensemble methods for evolving data streams

    Bifet, Albert; Holmes, Geoffrey; Pfahringer, Bernhard; Kirkby, Richard Brendon; Gavaldà, Ricard (2009)

    Conference item
    University of Waikato

    Advanced analysis of data streams is quickly becoming a key area of data mining research as the number of applications demanding such processing increases. Online mining when such data streams evolve over time, that is when concepts drift or change completely, is becoming one of the core issues. When tackling non-stationary concepts, ensembles of classifiers have several advantages over single classifier methods: they are easy to scale and parallelize, they can adapt to change quickly by pruning under-performing parts of the ensemble, and they therefore usually also generate more accurate concept descriptions. This paper proposes a new experimental data stream framework for studying concept drift, and two new variants of Bagging: ADWIN Bagging and Adaptive-Size Hoeffding Tree (ASHT) Bagging. Using the new experimental framework, an evaluation study on synthetic and real-world datasets comprising up to ten million examples shows that the new ensemble methods perform very well compared to several known methods.

  • Fast perceptron decision tree learning from evolving data streams

    Bifet, Albert; Holmes, Geoffrey; Pfahringer, Bernhard; Frank, Eibe (2010)

    Conference item
    University of Waikato

    Mining of data streams must balance three evaluation dimensions: accuracy, time and memory. Excellent accuracy on data streams has been obtained with Naive Bayes Hoeffding Trees—Hoeffding Trees with naive Bayes models at the leaf nodes—albeit with increased runtime compared to standard Hoeffding Trees. In this paper, we show that runtime can be reduced by replacing naive Bayes with perceptron classifiers, while maintaining highly competitive accuracy. We also show that accuracy can be increased even further by combining majority vote, naive Bayes, and perceptrons. We evaluate four perceptron-based learning strategies and compare them against appropriate baselines: simple perceptrons, Perceptron Hoeffding Trees, hybrid Naive Bayes Perceptron Trees, and bagged versions thereof. We implement a perceptron that uses the sigmoid activation function instead of the threshold activation function and optimizes the squared error, with one perceptron per class value. We test our methods by performing an evaluation study on synthetic and real-world datasets comprising up to ten million examples.
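
    The perceptron described in this abstract (sigmoid activation, squared-error objective, one perceptron per class value) can be sketched as an online learner. This is an illustrative sketch only: the class name, the learning rate, the zero initialisation and the toy stream are assumptions, and the Hoeffding tree that would host such perceptrons at its leaves is not modelled.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


class SigmoidPerceptrons:
    """One sigmoid perceptron per class, trained online on squared error."""

    def __init__(self, n_features: int, n_classes: int, learning_rate: float = 0.5):
        # The last weight of each perceptron is the bias term.
        self.weights = [[0.0] * (n_features + 1) for _ in range(n_classes)]
        self.learning_rate = learning_rate

    def _outputs(self, x: Sequence[float]) -> List[float]:
        xb = list(x) + [1.0]
        return [sigmoid(sum(w * v for w, v in zip(ws, xb))) for ws in self.weights]

    def learn_one(self, x: Sequence[float], label: int) -> None:
        xb = list(x) + [1.0]
        for k, (ws, out) in enumerate(zip(self.weights, self._outputs(x))):
            target = 1.0 if k == label else 0.0
            # Gradient step on 0.5 * (target - out)^2 through the sigmoid.
            delta = (target - out) * out * (1.0 - out)
            for i, v in enumerate(xb):
                ws[i] += self.learning_rate * delta * v

    def predict_one(self, x: Sequence[float]) -> int:
        outs = self._outputs(x)
        return max(range(len(outs)), key=outs.__getitem__)


# Toy two-class stream, processed one example at a time.
model = SigmoidPerceptrons(n_features=1, n_classes=2)
for _ in range(500):
    model.learn_one([0.0], 0)
    model.learn_one([1.0], 1)
```

    Each update costs time linear in the number of features times the number of classes, and the learner keeps no per-attribute statistics, which may be part of why the abstract reports lower runtime than naive Bayes leaves.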

  • Elia Kazan as Melomane

    Camp, Gregory (2016-04-20)

    Conference item
    The University of Auckland Library

    Guest lecture on film director Elia Kazan's use of music.

  • Conservation of pre-European wet organic archaeological materials in Aotearoa New Zealand

    Johns, Dilys; Hodgins, G; Gilberg, M; Rageth, J; O'Connor, S (2010-02)

    Conference item
    The University of Auckland Library

    Five international scholars were invited to discuss scientific techniques used during art and textile investigations. Presentations highlight the advantages and limitations of scientific analysis when attempting to solve provenance, dating, authenticity and conservation questions. My presentation illustrated the above issues with Maori/New Zealand case studies, stressing the importance of context, traditional knowledge and archaeology when linking together science and culture.

  • Pasifika Rising: a cultural strand in contemporary New Zealand art

    Mane-Wheoki, Jonathan (2006)

    Conference item
    The University of Auckland Library

  • Collaborative painting machine

    Ingram, Simon (2016-10-10)

    Conference item
    The University of Auckland Library
