104 results for Pfahringer, Bernhard

  • Random Relational Rules

    Pfahringer, Bernhard; Anderson, Grant (2006)

    Conference item
    University of Waikato

    Exhaustive search in relational learning is generally infeasible, so some form of heuristic search is usually employed, as in FOIL [1]. So-called stochastic discrimination, on the other hand, provides a framework for combining arbitrary numbers of weak classifiers (in this case randomly generated relational rules) in a way where accuracy improves with additional rules, even after maximal accuracy on the training data has been reached [2]. The weak classifiers must have a slightly higher probability of covering instances of their target class than of other classes. As the rules are also independent and identically distributed, the Central Limit Theorem applies, and as the number of weak classifiers/rules grows, coverages for different classes resemble well-separated normal distributions. Stochastic discrimination is closely related to other ensemble methods like bagging, boosting, or random forests, all of which have been tried in relational learning [3, 4, 5].
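
    As a rough illustration of the stochastic discrimination idea, the sketch below (Python, with entirely hypothetical rule and data representations; not the authors' implementation) averages the coverage of many random weak rules. By the argument above, per-class scores concentrate around separated means as the rule count grows.

        import random

        # A "rule" here is a random conjunction of attribute tests over
        # numeric instances (an illustrative stand-in for relational rules).
        def make_random_rule(n_attrs, n_tests=2):
            tests = [(random.randrange(n_attrs), random.random()) for _ in range(n_tests)]
            return lambda x: all(x[a] > t for a, t in tests)

        def is_enriched(rule, pos, neg):
            # keep only weak rules that cover the target class slightly
            # more often than the other classes
            return sum(map(rule, pos)) / len(pos) > sum(map(rule, neg)) / len(neg)

        def sd_score(rules, x):
            # fraction of rules covering x; for i.i.d. enriched rules the
            # Central Limit Theorem separates the classes' score distributions
            return sum(r(x) for r in rules) / len(rules)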

  • Mining data streams using option trees

    Holmes, Geoffrey; Pfahringer, Bernhard; Kirkby, Richard Brendon (2003-09)

    Working or discussion paper
    University of Waikato

    The data stream model for data mining places harsh restrictions on a learning algorithm. A model must be induced following the briefest interrogation of the data, must use only available memory and must update itself over time within these constraints. Additionally, the model must be able to be used for data mining at any point in time. This paper describes a data stream classification algorithm using an ensemble of option trees. The ensemble of trees is induced by boosting and iteratively combined into a single interpretable model. The algorithm is evaluated using benchmark datasets for accuracy against state-of-the-art algorithms that make use of the entire dataset.

  • Locally weighted naive Bayes

    Frank, Eibe; Hall, Mark A.; Pfahringer, Bernhard (2003-04)

    Working or discussion paper
    University of Waikato

    Despite its simplicity, the naive Bayes classifier has surprised machine learning researchers by exhibiting good performance on a variety of learning problems. Encouraged by these results, researchers have looked to overcome naive Bayes' primary weakness—attribute independence—and improve the performance of the algorithm. This paper presents a locally weighted version of naive Bayes that relaxes the independence assumption by learning local models at prediction time. Experimental results show that locally weighted naive Bayes rarely degrades accuracy compared to standard naive Bayes and, in many cases, improves accuracy dramatically. The main advantage of this method compared to other techniques for enhancing naive Bayes is its conceptual and computational simplicity.
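
    A minimal sketch of the local-model idea, assuming numeric attributes with Gaussian class-conditional densities and a linear distance-weighting kernel (the paper's exact kernel and distance function may differ):

        import numpy as np

        def lwnb_predict(X, y, x_test, k=50):
            # weight the k nearest training instances by distance, then fit
            # a weighted naive Bayes model just for this one prediction
            d = np.linalg.norm(X - x_test, axis=1)
            idx = np.argsort(d)[:k]
            w = np.maximum(0.0, 1.0 - d[idx] / (d[idx].max() + 1e-12))  # linear kernel
            log_post = {}
            for c in np.unique(y[idx]):
                wc = w[y[idx] == c]
                if wc.sum() == 0.0:
                    continue
                Xc = X[idx][y[idx] == c]
                mu = np.average(Xc, axis=0, weights=wc)
                var = np.average((Xc - mu) ** 2, axis=0, weights=wc) + 1e-9
                loglik = -0.5 * np.sum(np.log(2 * np.pi * var) + (x_test - mu) ** 2 / var)
                log_post[c] = np.log(wc.sum() / w.sum()) + loglik
            return max(log_post, key=log_post.get)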

  • Cache Hierarchy Inspired Compression: a Novel Architecture for Data Streams

    Holmes, Geoffrey; Pfahringer, Bernhard; Kirkby, Richard Brendon (2006)

    Journal article
    University of Waikato

    We present an architecture for data streams based on structures typically found in web cache hierarchies. The main idea is to build a meta-level analyser from a number of levels constructed over time from a data stream. We present the general architecture for such a system and an application to classification. This architecture is an instance of the general wrapper idea, allowing us to reuse standard batch learning algorithms in an inherently incremental learning environment. By artificially generating data sources we demonstrate that a hierarchy containing a mixture of models is able to adapt over time to the source of the data. In these experiments the hierarchies use an elementary performance-based replacement policy and unweighted voting for making classification decisions.
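
    A toy sketch of the two mechanisms named above, an elementary performance-based replacement policy and unweighted voting, with all class and method names invented for illustration:

        from collections import Counter

        class Level:
            # holds at most `capacity` models; the worst performer is evicted
            def __init__(self, capacity):
                self.capacity, self.models = capacity, []  # list of (model, score)

            def add(self, model, score):
                self.models.append((model, score))
                if len(self.models) > self.capacity:
                    self.models.remove(min(self.models, key=lambda ms: ms[1]))

        def classify(levels, x):
            # unweighted vote over every model in every level of the hierarchy
            votes = [m.predict(x) for level in levels for m, _ in level.models]
            return Counter(votes).most_common(1)[0][0]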

  • Bagging ensemble selection for regression

    Sun, Quan; Pfahringer, Bernhard (2012)

    Conference item
    University of Waikato

    Bagging ensemble selection (BES) is a relatively new ensemble learning strategy. The strategy can be seen as an ensemble of the ensemble selection from libraries of models (ES) strategy. Previous experimental results on binary classification problems have shown that, using random trees as base classifiers, BES-OOB (the most successful variant of BES) is competitive with (and in many cases superior to) other ensemble learning strategies, for instance the original ES algorithm, stacking with linear regression, random forests or boosting. Motivated by the promising results in classification, this paper examines the predictive performance of the BES-OOB strategy for regression problems. Our results show that BES-OOB outperforms Stochastic Gradient Boosting and Bagging when using regression trees as the base learners. Our results also suggest that the advantage of using a diverse model library becomes clear when the model library is relatively large. We also present encouraging results indicating that the non-negative least squares algorithm is a viable approach for pruning an ensemble of ensembles.
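
    A hedged sketch of the core selection step, greedy forward selection with replacement driven by out-of-bag predictions (the function name and the squared-error criterion are assumptions for illustration):

        import numpy as np

        def greedy_oob_selection(oob_preds, y_true, n_rounds=25):
            # oob_preds: (n_models, n_instances) out-of-bag predictions of
            # every model in the library; repeatedly pick the model (with
            # replacement) that most reduces the running average's error
            chosen, total = [], np.zeros(oob_preds.shape[1])
            for _ in range(n_rounds):
                errs = [np.mean(((total + p) / (len(chosen) + 1) - y_true) ** 2)
                        for p in oob_preds]
                best = int(np.argmin(errs))
                chosen.append(best)
                total += oob_preds[best]
            return chosen  # final prediction: average of the chosen models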

  • Having a Blast: Meta-Learning and Heterogeneous Ensembles for Data Streams

    van Rijn, Jan N.; Holmes, Geoffrey; Pfahringer, Bernhard; Vanschoren, Joaquin (2015-01-01)

    Conference item
    University of Waikato

    Ensembles of classifiers are among the best performing classifiers available in many data mining applications. However, most ensembles developed specifically for the dynamic data stream setting rely on only one type of base-level classifier, most often Hoeffding Trees. In this paper, we study the use of heterogeneous ensembles, composed of fundamentally different model types. Heterogeneous ensembles have proven successful in the classical batch data setting; however, they do not easily transfer to the data stream setting. We therefore introduce the Online Performance Estimation framework, which can be used in data stream ensembles to weight the votes of (heterogeneous) ensemble members differently across the stream. Experiments over a wide range of data streams show performance that is competitive with state-of-the-art ensemble techniques, including Online Bagging and Leveraging Bagging. All experimental results from this work are easily reproducible and publicly available on OpenML for further analysis.
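
    One plausible reading of "weighting votes differently across the stream" is a per-member accuracy estimate over a sliding window; the sketch below (invented class name, deliberately simplified scheme) illustrates that mechanism:

        from collections import deque

        class WindowedWeight:
            # tracks one ensemble member's recent accuracy over a window
            def __init__(self, window=1000):
                self.hits = deque(maxlen=window)

            def record(self, correct: bool):
                self.hits.append(1 if correct else 0)

            def weight(self) -> float:
                return sum(self.hits) / len(self.hits) if self.hits else 0.0

        # At prediction time each member votes with weight proportional to
        # its estimated recent performance; once the true label arrives,
        # record() is called on every member to update the estimates.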

  • On dynamic feature weighting for feature drifting data streams

    Barddal, Jean Paul; Gomes, Heitor Murilo; Enembreck, Fabricio; Pfahringer, Bernhard; Bifet, Albert (2016)

    Conference item
    University of Waikato

    The ubiquity of data streams has been encouraging the development of new incremental and adaptive learning algorithms. Data stream learners must be fast and memory-bounded, but mainly they must be tailored to adapt to possible changes in the data distribution, a phenomenon named concept drift. Recently, several works have shown the impact of a so far nearly neglected type of drift: feature drifts. Feature drifts occur whenever a subset of features becomes, or ceases to be, relevant to the learning task. In this paper we (i) provide insights into how the relevance of features can be tracked as a stream progresses using the information-theoretic Symmetrical Uncertainty measure; and (ii) show how it can be used to boost two learning schemes: Naive Bayes and k-Nearest Neighbor. Furthermore, we investigate the usage of these two new dynamically weighted learners as prediction models in the leaves of the Hoeffding Adaptive Tree classifier. Results show improvements in accuracy (an average of 10.69% for k-Nearest Neighbor, 6.23% for Naive Bayes and 4.42% for Hoeffding Adaptive Trees) on both synthetic and real-world datasets, at the expense of a bounded increase in both memory consumption and processing time.
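
    Symmetrical Uncertainty itself is standard: SU(X, Y) = 2·I(X; Y) / (H(X) + H(Y)). A batch Python version over discrete sequences follows (the paper necessarily tracks the quantity incrementally over the stream):

        import math
        from collections import Counter

        def entropy(xs):
            n = len(xs)
            return -sum(c / n * math.log2(c / n) for c in Counter(xs).values())

        def symmetrical_uncertainty(x, y):
            hx, hy = entropy(x), entropy(y)
            hxy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
            mi = hx + hy - hxy              # mutual information I(X; Y)
            return 2 * mi / (hx + hy) if hx + hy else 0.0

        # e.g. symmetrical_uncertainty([0, 0, 1, 1], [0, 0, 1, 1]) == 1.0
        # (the feature is fully relevant to the target)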

  • Bound analysis for Whiley programs

    Weng, Min-Hsien; Utting, Mark; Pfahringer, Bernhard (2015)

    Conference item
    University of Waikato

    The Whiley compiler can generate naive C code, but the code is inefficient because it uses infinite integers and dynamic array sizes. Our project goal is to build a compiler that can translate Whiley programs into efficient OpenCL code with fixed-size integer types and fixed-size arrays, for parallel execution on GPUs. This paper presents an abstract interpretation-based bound inference approach, along with symbolic analysis, for Whiley programs. The source Whiley program is first analyzed by our symbolic analyzer to find matching patterns and make any necessary program transformations. The bound analyzer then analyzes the transformed program so that it can use primitive integer types rather than a third-party infinite integer type (e.g. the GMP arbitrary-precision library). The bound analysis results provide conservative estimates of the ranges of integer variables and array sizes, so that efficient code can be generated and integer overflows avoided. The bound analyzer combines the bound consistency technique with a widening operator so that program constraints are solved, and a fixed point reached, quickly. Several example programs are used to illustrate the bound analyzer algorithm and the program transformation.
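
    The widening operator mentioned above is a standard abstract-interpretation device; the toy interval-domain sketch below (not the Whiley analyzer's code) shows why it forces fast convergence:

        # Widening jumps any still-moving bound to infinity so the
        # fixed-point iteration terminates after few steps.
        NEG_INF, POS_INF = float('-inf'), float('inf')

        def widen(old, new):
            lo = old[0] if new[0] >= old[0] else NEG_INF  # lower bound unstable? widen
            hi = old[1] if new[1] <= old[1] else POS_INF  # upper bound unstable? widen
            return (lo, hi)

        # e.g. analysing `i = 0; while i < n: i = i + 1` with n in [0, 100]:
        # the iterates [0, 0], [0, 1], ... widen to [0, +inf), which a later
        # narrowing pass against the loop guard can refine to [0, 100].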

  • Case study on bagging stable classifiers for data streams

    van Rijn, Jan N.; Holmes, Geoffrey; Pfahringer, Bernhard; Vanschoren, Joaquin (2015)

    Conference item
    University of Waikato

    Ensembles of classifiers are among the strongest classifiers in most data mining applications. Bagging ensembles exploit the instability of base classifiers by training them on different bootstrap replicates. It has been shown that bagging unstable classifiers, such as decision trees, generally yields good results, whereas bagging stable classifiers, such as k-NN, makes little difference. However, recent work suggests that this observation applies to the classical batch data mining setting rather than the data stream setting. We present an empirical study that supports this observation.
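
    For context, the standard stream adaptation of bagging replaces bootstrap sampling with per-instance Poisson(1) weights (Oza and Russell's online bagging; shown here as background, not as this paper's method):

        import numpy as np

        rng = np.random.default_rng()

        def online_bagging_update(members, x, y):
            # each member sees the arriving instance k times, k ~ Poisson(1),
            # mimicking how often it would appear in a bootstrap replicate
            for m in members:
                for _ in range(rng.poisson(1.0)):
                    m.partial_fit([x], [y])  # any incremental base learner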

  • Learning Distance Metrics for Multi-Label Classification

    Gouk, Henry; Pfahringer, Bernhard; Cree, Michael J. (2016)

    Conference item
    University of Waikato

    Distance metric learning is a well studied problem in the field of machine learning, where it is typically used to improve the accuracy of instance based learning techniques. In this paper we propose a distance metric learning algorithm that is specialised for multi-label classification tasks, rather than the multiclass setting considered by most work in this area. The method trains an embedder that can transform instances into a feature space where squared Euclidean distance provides an estimate of the Jaccard distance between the corresponding label vectors. In addition to a linear Mahalanobis style metric, we also present a nonlinear extension that provides a substantial boost in performance. We show that this technique significantly improves upon current approaches for instance based multi-label classification, and also enables interesting data visualisations.
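
    A sketch of the training signal described above, in an assumed least-squares form (the paper's actual objective and optimiser may differ): learn a linear map L so that squared Euclidean distance in the embedding approximates the Jaccard distance between label vectors.

        import numpy as np

        def jaccard_distance(y1, y2):
            # distance between two binary label vectors
            inter = np.sum(np.logical_and(y1, y2))
            union = np.sum(np.logical_or(y1, y2))
            return 1.0 - inter / union if union else 0.0

        def pair_loss(L, x1, x2, y1, y2):
            # penalise mismatch between embedded distance and label distance
            d_embed = np.sum((L @ x1 - L @ x2) ** 2)  # squared Euclidean
            return (d_embed - jaccard_distance(y1, y2)) ** 2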

  • Annotate-Sample-Average (ASA): A New Distant Supervision Approach for Twitter Sentiment Analysis

    Bravo-Marquez, Felipe; Frank, Eibe; Pfahringer, Bernhard (2016-01-01)

    Conference item
    University of Waikato

    The classification of tweets into polarity classes is a popular task in sentiment analysis. State-of-the-art solutions to this problem are based on supervised machine learning models trained from manually annotated examples. A drawback of these approaches is the high cost involved in data annotation. Two freely available resources that can be exploited to solve the problem are: 1) large amounts of unlabelled tweets obtained from the Twitter API and 2) prior lexical knowledge in the form of opinion lexicons. In this paper, we propose Annotate-Sample-Average (ASA), a distant supervision method that uses these two resources to generate synthetic training data for Twitter polarity classification. Positive and negative training instances are generated by sampling and averaging unlabelled tweets containing words with the corresponding polarity. Polarity of words is determined from a given polarity lexicon. Our experimental results show that the training data generated by ASA (after tuning its parameters) produces a classifier that performs significantly better than a classifier trained from tweets annotated with emoticons and a classifier trained, without any sampling and averaging, from tweets annotated according to the polarity of their words.
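
    A sketch of the generation step as described, with illustrative data structures: `tweets` is a list of (word_list, feature_vector) pairs and `lexicon` maps words to 'pos'/'neg'.

        import random
        import numpy as np

        def asa_instance(tweets, lexicon, polarity, sample_size=5):
            # keep unlabelled tweets containing at least one word of the
            # target polarity, then sample a few and average their vectors
            pool = [vec for words, vec in tweets
                    if any(lexicon.get(w) == polarity for w in words)]
            sample = random.sample(pool, sample_size)
            return np.mean(sample, axis=0)  # one synthetic `polarity` instance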

  • Experiment Databases: Creating a New Platform for Meta-Learning Research

    Vanschoren, Joaquin; Blockeel, Hendrik; Pfahringer, Bernhard; Holmes, Geoffrey (2008)

    Conference item
    University of Waikato

    Many studies in machine learning try to investigate what makes an algorithm succeed or fail on certain datasets. However, the field is still evolving relatively quickly, and new algorithms, preprocessing methods, learning tasks and evaluation procedures continue to emerge in the literature. Thus, it is impossible for a single study to cover this expanding space of learning approaches. In this paper, we propose a community-based approach for the analysis of learning algorithms, driven by sharing meta-data from previous experiments in a uniform way. We illustrate how organizing this information in a central database can create a practical public platform for any kind of exploitation of meta-knowledge, allowing effective reuse of previous experimentation and targeted analysis of the collected results.

  • A logistic boosting approach to inducing multiclass alternating decision trees

    Holmes, Geoffrey; Pfahringer, Bernhard; Kirkby, Richard Brendon; Frank, Eibe; Hall, Mark A. (2002-03)

    Working or discussion paper
    University of Waikato

    The alternating decision tree (ADTree) is a successful classification technique that combines decision trees with the predictive accuracy of boosting into a set of interpretable classification rules. The original formulation of the tree induction algorithm restricted attention to binary classification problems. This paper empirically evaluates several methods for extending the algorithm to the multiclass case by splitting the problem into several two-class problems, and also adapts the multiclass LogitBoost procedure to induce alternating decision trees directly. Experimental results confirm that the latter procedure is comparable in accuracy with methods that are based on the original ADTree formulation, while inducing much smaller trees.
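
    The splitting route mentioned above is the standard one-vs-rest reduction; a minimal sketch (where base_learner is any fit/predict_proba classifier factory, and the paper's preferred alternative adapts LogitBoost directly):

        import numpy as np

        def one_vs_rest_fit(base_learner, X, y):
            # one binary model per class: class c versus the rest
            return {c: base_learner().fit(X, (y == c).astype(int))
                    for c in np.unique(y)}

        def one_vs_rest_predict(models, x):
            # predict the class whose binary model is most confident
            return max(models, key=lambda c: models[c].predict_proba([x])[0][1])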

  • Data Quality in Predictive Toxicology: Identification of Chemical Structures and Calculation of Chemical Descriptors

    Helma, Christoph; Kramer, Stefan; Pfahringer, Bernhard; Gottmann, Eva (2000)

    Journal article
    University of Waikato

    Every technique for toxicity prediction and for the detection of structure–activity relationships relies on the accurate estimation and representation of chemical and toxicologic properties. In this paper we discuss the potential sources of errors associated with the identification of compounds, the representation of their structures, and the calculation of chemical descriptors. It is based on a case study where machine learning techniques were applied to data from noncongeneric compounds and a complex toxicologic end point (carcinogenicity). We propose methods applicable to the routine quality control of large chemical datasets, but our main intention is to raise awareness about this topic and to open a discussion about quality assurance in predictive toxicology. The accuracy and reproducibility of toxicity data will be reported in another paper.

  • A novel two stage scheme utilizing the test set for model selection in text classification

    Pfahringer, Bernhard; Reutemann, Peter; Mayo, Michael (2005)

    Conference item
    University of Waikato

    Text classification is a natural application domain for semi-supervised learning, as labeling documents is expensive while unlabeled documents are usually available in abundance. We describe a simple novel two-stage scheme based on dagging which allows the test set to be utilized in model selection. The dagging ensemble can also be used by itself instead of the original classifier. We evaluate the performance of a meta classifier choosing between various base learners and their respective dagging ensembles. The selection process performs robustly, especially when only small percentages of labels are available for training.
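
    Dagging itself trains one base classifier per disjoint fold of the training data and lets them vote, in contrast to bagging's overlapping bootstrap samples. A minimal sketch, assuming a fit/predict classifier factory:

        import numpy as np
        from collections import Counter

        def dagging(base_learner, X, y, n_folds=10):
            # disjoint folds, one model per fold, majority vote to predict
            folds = np.array_split(np.random.permutation(len(y)), n_folds)
            models = [base_learner().fit(X[f], y[f]) for f in folds]
            def predict(x):
                votes = [m.predict([x])[0] for m in models]
                return Counter(votes).most_common(1)[0][0]
            return predict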

  • Random model trees: an effective and scalable regression method

    Pfahringer, Bernhard (2010-06)

    Working or discussion paper
    University of Waikato

    We present and investigate ensembles of randomized model trees as a novel regression method. Such ensembles combine the scalability of tree-based methods with predictive performance rivaling the state of the art in numeric prediction. An extensive empirical investigation shows that Random Model Trees produce predictive performance which is competitive with state-of-the-art methods like Gaussian Processes Regression or Additive Groves of Regression Trees. The training and optimization of Random Model Trees scales better than Gaussian Processes Regression to larger datasets, and enjoys a constant advantage of one to two orders of magnitude over Additive Groves.

  • Learning from the past with experiment databases

    Vanschoren, Joaquin; Pfahringer, Bernhard; Holmes, Geoffrey (2008-06-24)

    Working or discussion paper
    University of Waikato

    Thousands of Machine Learning research papers contain experimental comparisons that usually have been conducted with a single focus of interest, and detailed results are usually lost after publication. Once past experiments are collected in experiment databases they allow for additional and possibly much broader investigation. In this paper, we show how to use such a repository to answer various interesting research questions about learning algorithms and to verify a number of recent studies. Alongside performing elaborate comparisons and rankings of algorithms, we also investigate the effects of algorithm parameters and data properties, and study the learning curves and bias-variance profiles of algorithms to gain deeper insights into their behavior.

  • Maximum Common Subgraph based locally weighted regression

    Seeland, Madeleine; Buchwald, Fabian; Kramer, Stefan; Pfahringer, Bernhard (2012)

    Conference item
    University of Waikato

    This paper investigates a simple, yet effective method for regression on graphs, in particular for applications in chemoinformatics and for quantitative structure-activity relationships (QSARs). The method combines Locally Weighted Learning (LWL) with Maximum Common Subgraph (MCS) based graph distances. More specifically, we investigate a variant of locally weighted regression on graphs (structures) that uses the maximum common subgraph for determining and weighting the neighborhood of a graph, and feature vectors for the actual regression model. We show that this combination, LWL-MCS, outperforms other methods that use the local neighborhood of graphs for regression. The performance of this method on graphs suggests it might be useful for other types of structured data as well.
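
    A rough sketch of that combination: neighbourhoods and weights come from a graph distance, while the regression itself runs on feature vectors. Here mcs_distance is an unimplemented stand-in for an MCS-based distance, and the weighting kernel and ridge term are assumptions:

        import numpy as np

        def lwl_predict(graphs, X, y, g_test, x_test, mcs_distance, k=10):
            d = np.array([mcs_distance(g, g_test) for g in graphs])
            idx = np.argsort(d)[:k]              # MCS-nearest graphs
            w = 1.0 / (1.0 + d[idx])             # closer graphs weigh more
            W = np.diag(w)
            Xk = np.c_[np.ones(k), X[idx]]       # intercept + feature vectors
            A = Xk.T @ W @ Xk + 1e-6 * np.eye(Xk.shape[1])  # ridge for stability
            beta = np.linalg.solve(A, Xk.T @ W @ y[idx])
            return np.r_[1.0, x_test] @ beta     # local weighted least squares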

  • Static techniques for reducing memory usage in the C implementation of Whiley programs

    Weng, Min-Hsien; Pfahringer, Bernhard; Utting, Mark (2017)

    Conference item
    University of Waikato

    Languages that use call-by-value semantics, such as Whiley, can make program verification easier. But efficient implementation becomes harder, due to the overhead of copying and garbage collection. This paper describes how a mixture of static analysis and runtime monitoring can be used to eliminate unnecessary copying and deallocate memory without garbage collection. We show that this allows Whiley programs to be translated into efficient C implementations.

  • Probability calibration trees

    Leathart, Tim; Frank, Eibe; Holmes, Geoffrey; Pfahringer, Bernhard (2017)

    Conference item
    University of Waikato

    Obtaining accurate and well calibrated probability estimates from classifiers is useful in many applications, for example, when minimising the expected cost of classifications. Existing methods of calibrating probability estimates are applied globally, ignoring the potential for improvements by applying a more fine-grained model. We propose probability calibration trees, a modification of logistic model trees that identifies regions of the input space in which different probability calibration models are learned to improve performance. We compare probability calibration trees to two widely used calibration methods—isotonic regression and Platt scaling—and show that our method results in lower root mean squared error on average than both methods, for estimates produced by a variety of base learners.
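
    Of the two global baselines named above, Platt scaling is simply a one-dimensional logistic regression from raw classifier scores to calibrated probabilities; a minimal sketch (one plausible implementation, here via scikit-learn). Calibration trees instead learn separate calibration models in different regions of the input space.

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        def platt_scale(scores, labels):
            # fit a sigmoid on held-out (score, label) pairs, then return a
            # function mapping new scores to calibrated probabilities
            lr = LogisticRegression()
            lr.fit(np.asarray(scores).reshape(-1, 1), labels)
            return lambda s: lr.predict_proba(np.asarray(s).reshape(-1, 1))[:, 1]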
