78 results for Pfahringer, Bernhard, Conference item

Random Relational Rules
Pfahringer, Bernhard; Anderson, Grant (2006)
Conference item
University of Waikato
Exhaustive search in relational learning is generally infeasible; therefore some form of heuristic search is usually employed, as in FOIL [1]. On the other hand, so-called stochastic discrimination provides a framework for combining arbitrary numbers of weak classifiers (in this case randomly generated relational rules) in a way where accuracy improves with additional rules, even after maximal accuracy on the training data has been reached [2]. The weak classifiers must have a slightly higher probability of covering instances of their target class than of other classes. As the rules are also independent and identically distributed, the Central Limit Theorem applies, and as the number of weak classifiers/rules grows, coverages for different classes resemble well-separated normal distributions. Stochastic discrimination is closely related to other ensemble methods like bagging, boosting, or random forests, all of which have been tried in relational learning [3, 4, 5].
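The combination scheme the abstract describes can be sketched in a toy propositional form. All names here are illustrative, the "rules" are random numeric threshold tests rather than relational rules, and the decision threshold is simply the midpoint of the two classes' mean coverages:

```python
import random

def random_rule(n_features):
    """A weak 'rule': a random threshold test on a random feature."""
    f = random.randrange(n_features)
    t = random.uniform(0.0, 1.0)
    s = random.choice((-1, 1))
    return lambda x: s * (x[f] - t) > 0

def coverage(rule, points):
    return sum(rule(x) for x in points) / len(points)

def train(pos, neg, n_rules=300):
    """Keep only rules that cover the positive class more often than the
    negative class (the 'enrichment' condition); as the number of such
    i.i.d. rules grows, average coverage separates the classes."""
    rules = []
    while len(rules) < n_rules:
        r = random_rule(len(pos[0]))
        if coverage(r, pos) > coverage(r, neg):
            rules.append(r)
    score = lambda x: sum(r(x) for r in rules) / len(rules)
    mid = (sum(map(score, pos)) / len(pos) +
           sum(map(score, neg)) / len(neg)) / 2
    return lambda x: int(score(x) > mid)
```

Real stochastic discrimination additionally normalises each rule's contribution; this sketch keeps only the enrichment filter and the averaging.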
Bagging ensemble selection for regression
Sun, Quan; Pfahringer, Bernhard (2012)
Conference item
University of Waikato
Bagging ensemble selection (BES) is a relatively new ensemble learning strategy. The strategy can be seen as an ensemble of the ensemble selection from libraries of models (ES) strategy. Previous experimental results on binary classification problems have shown that, using random trees as base classifiers, BES-OOB (the most successful variant of BES) is competitive with (and in many cases superior to) other ensemble learning strategies, for instance the original ES algorithm, stacking with linear regression, random forests, or boosting. Motivated by the promising results in classification, this paper examines the predictive performance of the BES-OOB strategy for regression problems. Our results show that the BES-OOB strategy outperforms Stochastic Gradient Boosting and Bagging when using regression trees as the base learners. Our results also suggest that the advantage of using a diverse model library becomes clear when the model library size is relatively large. We also present encouraging results indicating that the non-negative least squares algorithm is a viable approach for pruning an ensemble of ensembles.
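The ES building block that BES bags can be sketched as greedy forward selection from a model library, minimising RMSE on a hillclimb set. This is a minimal stand-in (BES additionally repeats this on bootstrap replicates and averages the resulting ensembles); the model library here is just a dict of precomputed prediction vectors:

```python
import math

def rmse(pred, y):
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y))

def ensemble_selection(library, y, max_size=10):
    """Greedy forward selection with replacement: repeatedly add the
    library model that most reduces RMSE of the averaged prediction.
    `library` maps model name -> predictions on a hillclimb set."""
    chosen, sums = [], [0.0] * len(y)
    for _ in range(max_size):
        best = None
        for name, preds in library.items():
            avg = [(s + p) / (len(chosen) + 1) for s, p in zip(sums, preds)]
            err = rmse(avg, y)
            if best is None or err < best[0]:
                best = (err, name, preds)
        # stop once no candidate improves the current ensemble
        if chosen and best[0] >= rmse([s / len(chosen) for s in sums], y):
            break
        chosen.append(best[1])
        sums = [s + p for s, p in zip(sums, best[2])]
    return chosen, [s / len(chosen) for s in sums]
```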
A novel two-stage scheme utilizing the test set for model selection in text classification
Pfahringer, Bernhard; Reutemann, Peter; Mayo, Michael (2005)
Conference item
University of Waikato
Text classification is a natural application domain for semi-supervised learning, as labeling documents is expensive, but on the other hand an abundance of unlabeled documents is usually available. We describe a novel, simple two-stage scheme based on dagging which allows for utilizing the test set in model selection. The dagging ensemble can also be used by itself instead of the original classifier. We evaluate the performance of a meta-classifier choosing between various base learners and their respective dagging ensembles. The selection process seems to perform robustly, especially for small percentages of available labels for training.
MOA: Massive Online Analysis, a framework for stream classification and clustering.
Bifet, Albert; Holmes, Geoffrey; Pfahringer, Bernhard; Kranen, Philipp; Kremer, Hardy; Jansen, Timm; Seidl, Thomas (2010)
Conference item
University of Waikato
Massive Online Analysis (MOA) is a software environment for implementing algorithms and running experiments for online learning from evolving data streams. MOA is designed to deal with the challenging problem of scaling up the implementation of state-of-the-art algorithms to real-world dataset sizes. It contains a collection of offline and online algorithms for both classification and clustering, as well as tools for evaluation. In particular, for classification it implements boosting, bagging, and Hoeffding Trees, all with and without Naive Bayes classifiers at the leaves. For clustering, it implements StreamKM++, CluStream, ClusTree, DenStream, D-Stream and CobWeb. Researchers benefit from MOA by getting insights into the workings and problems of different approaches; practitioners can easily apply and compare several algorithms on real-world datasets and settings. MOA supports bidirectional interaction with WEKA, the Waikato Environment for Knowledge Analysis, and is released under the GNU GPL license.
Mining Arbitrarily Large Datasets Using Heuristic k-Nearest Neighbour Search
Wu, Xing; Holmes, Geoffrey; Pfahringer, Bernhard (2008)
Conference item
University of Waikato
Nearest Neighbour Search (NNS) is one of the top ten data mining algorithms. It is simple and effective, but has a time complexity that is the product of the number of instances and the number of dimensions. When the number of dimensions is greater than two, there are no known solutions that can guarantee a sublinear retrieval time. This paper describes and evaluates two ways to make NNS efficient for datasets that are arbitrarily large in the number of instances and dimensions. The methods are best described as heuristic, as they are neither exact nor approximate. Both stem from recent developments in the field of data stream classification. The first uses Hoeffding Trees, an extension of decision trees to streams, and the second is a direct stream extension of NNS. The methods are evaluated in terms of their accuracy and the time taken to find the neighbours. Results show that the methods are competitive with NNS in terms of accuracy but significantly faster.
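The brute-force baseline whose cost the paper attacks is straightforward to state; this sketch shows the O(instances × dimensions) search per query that makes exact NNS infeasible on arbitrarily large data (the paper's heuristic stream-based methods are not shown here):

```python
import math
from collections import Counter

def knn_predict(train, x, k=3):
    """Brute-force k-nearest-neighbour classification: every query scans
    all n training instances in all d dimensions, i.e. O(n*d) per query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```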
Tie-breaking in Hoeffding trees
Holmes, Geoffrey; Kirkby, Richard; Pfahringer, Bernhard (2005)
Conference item
University of Waikato
A thorough examination of the performance of Hoeffding trees, the state of the art in classification for data streams, on a range of datasets reveals that tie-breaking, an essential but supposedly rare procedure, is employed much more often than expected. Testing with a lightweight method for handling continuous attributes, we find that the excessive invocation of tie-breaking causes performance to degrade significantly on complex and noisy data. Investigating ways to reduce the number of tie breaks, we propose an adaptive method that overcomes the problem while not significantly affecting performance on simpler datasets.
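For context, the standard Hoeffding tree split decision (from Domingos and Hulten's VFDT) and the tie-break it relies on can be sketched as follows; the tau parameter is the tie-break threshold whose excessive triggering the paper studies:

```python
import math

def hoeffding_bound(value_range, delta, n):
    """epsilon = sqrt(R^2 * ln(1/delta) / (2n)): with probability 1-delta,
    the observed mean of a variable with range R is within epsilon of the
    true mean after n independent observations."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))

def should_split(best_gain, second_gain, value_range, delta, n, tau=0.05):
    """Split when the best attribute is provably better than the runner-up,
    or tie-break when the bound is so tight the difference no longer matters."""
    eps = hoeffding_bound(value_range, delta, n)
    if best_gain - second_gain > eps:
        return True          # best attribute is clearly superior
    return eps < tau         # tie break: gains are effectively equal
```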
A discriminative approach to structured biological data
Mutter, Stefan; Pfahringer, Bernhard (2007)
Conference item
University of Waikato
This paper introduces the first author's PhD project, which has just moved beyond its initial stage. Biological sequence data is, on the one hand, highly structured; on the other hand, there are large amounts of unlabelled data. Thus we combine probabilistic graphical models and semi-supervised learning: the former to handle structured data, the latter to deal with unlabelled data. We apply our models to genotype-phenotype modelling problems. In particular, we predict the set of Single Nucleotide Polymorphisms which underlie a specific phenotypical trait.
Experiments in Predicting Biodegradability
Blockeel, Hendrik; Džeroski, Sašo; Kompare, Boris; Kramer, Stefan; Pfahringer, Bernhard; Van Laer, Wim (2004)
Conference item
University of Waikato
This paper is concerned with the use of AI techniques in ecology. More specifically, we present a novel application of inductive logic programming (ILP) in the area of quantitative structure-activity relationships (QSARs). The activity we want to predict is the biodegradability of chemical compounds in water. In particular, the target variable is the half-life for aerobic aqueous biodegradation. Structural descriptions of chemicals in terms of atoms and bonds are derived from the chemicals' SMILES encodings. The definition of substructures is used as background knowledge. Predicting biodegradability is essentially a regression problem, but we also consider a discretized version of the target variable. We thus employ a number of relational classification and regression methods on the relational representation and compare these to propositional methods applied to different propositionalizations of the problem. We also experiment with a prediction technique that consists of merging upper and lower bound predictions into one prediction. Some conclusions are drawn concerning the applicability of machine learning systems and the merging technique in this domain and the evaluation of hypotheses.
Pitfalls in benchmarking data stream classification and how to avoid them
Bifet, Albert; Read, Jesse; Žliobaitė, Indrė; Pfahringer, Bernhard; Holmes, Geoffrey (2013)
Conference item
University of Waikato
Data stream classification plays an important role in modern data analysis, where data arrives in a stream and needs to be mined in real time. In the data stream setting the underlying distribution from which this data comes may be changing and evolving, so classifiers that can update themselves during operation are becoming the state of the art. In this paper we show that data streams may have an important temporal component, which currently is not considered in the evaluation and benchmarking of data stream classifiers. We demonstrate how a naive classifier considering only the temporal component outperforms many current state-of-the-art classifiers on real data streams that have temporal dependence, i.e., data that is autocorrelated. We propose to evaluate data stream classifiers taking temporal dependence into account, and introduce a new evaluation measure which provides a more accurate gauge of data stream classifier performance. In response to the temporal dependence issue we propose a generic wrapper for data stream classifiers, which incorporates the temporal component into the attribute space.
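The naive baseline in question is the "no-change" classifier, which always predicts the previous label; on autocorrelated streams its accuracy can be very high. A minimal sketch of that baseline and of an accuracy measure normalised against it (the general shape of the paper's Kappa-Temporal idea; the paper's exact formulation should be checked against the original):

```python
def no_change_accuracy(labels):
    """Accuracy of the naive classifier that always predicts the
    previous label in the stream (persistence baseline)."""
    hits = sum(a == b for a, b in zip(labels, labels[1:]))
    return hits / (len(labels) - 1)

def kappa_temporal(classifier_acc, labels):
    """Normalise a classifier's accuracy against the no-change baseline;
    a value <= 0 means it does no better than simple persistence."""
    p_e = no_change_accuracy(labels)
    return (classifier_acc - p_e) / (1.0 - p_e)
```

On a stream with long runs of identical labels the baseline accuracy is high, so a classifier must beat persistence, not just the majority class, to score well.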
Probability calibration trees
Leathart, Tim; Frank, Eibe; Holmes, Geoffrey; Pfahringer, Bernhard (2017)
Conference item
University of Waikato
Obtaining accurate and well-calibrated probability estimates from classifiers is useful in many applications, for example when minimising the expected cost of classifications. Existing methods of calibrating probability estimates are applied globally, ignoring the potential for improvements from applying a more fine-grained model. We propose probability calibration trees, a modification of logistic model trees that identifies regions of the input space in which different probability calibration models are learned to improve performance. We compare probability calibration trees to two widely used calibration methods, isotonic regression and Platt scaling, and show that our method results in lower root mean squared error on average than both methods, for estimates produced by a variety of base learners.
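One of the two global baselines mentioned, Platt scaling, fits a sigmoid mapping classifier scores to probabilities. This sketch uses plain gradient descent on log loss rather than Platt's original Newton-style fit, so it is an illustrative stand-in, not the reference algorithm:

```python
import math

def platt_scale(scores, labels, lr=0.1, steps=2000):
    """Fit p(y=1|s) = 1 / (1 + exp(A*s + B)) by gradient descent on
    log loss; labels are 0/1, scores are raw classifier outputs."""
    A, B = 0.0, 0.0
    for _ in range(steps):
        gA = gB = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(A * s + B))
            gA += (y - p) * s    # d(log loss)/dA for this example
            gB += (y - p)
        A -= lr * gA / len(scores)
        B -= lr * gB / len(scores)
    return lambda s: 1.0 / (1.0 + math.exp(A * s + B))
```

Probability calibration trees replace this single global mapping with different calibration models in different regions of the input space.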
Positive, Negative, or Neutral: Learning an Expanded Opinion Lexicon from Emoticon-annotated Tweets
Bravo-Marquez, Felipe; Frank, Eibe; Pfahringer, Bernhard (2015)
Conference item
University of Waikato
We present a supervised framework for expanding an opinion lexicon for tweets. The lexicon contains part-of-speech (POS) disambiguated entries with a three-dimensional probability distribution for positive, negative, and neutral polarities. To obtain this distribution using machine learning, we propose word-level attributes based on POS tags and information calculated from streams of emoticon-annotated tweets. Our experimental results show that our method outperforms the three-dimensional word-level polarity classification performance obtained by semantic orientation, a state-of-the-art measure for establishing word-level sentiment.
Case study on bagging stable classifiers for data streams
van Rijn, Jan N.; Holmes, Geoffrey; Pfahringer, Bernhard; Vanschoren, Joaquin (2015)
Conference item
University of Waikato
Ensembles of classifiers are among the strongest classifiers in most data mining applications. Bagging ensembles exploit the instability of base classifiers by training them on different bootstrap replicates. It has been shown that bagging unstable classifiers, such as decision trees, generally yields good results, whereas bagging stable classifiers, such as k-NN, makes little difference. However, recent work suggests that this finding applies to the classical batch data mining setting rather than the data stream setting. We present an empirical study that supports this observation.
Organizing the World’s Machine Learning Information
Vanschoren, Joaquin; Blockeel, Hendrik; Pfahringer, Bernhard; Holmes, Geoffrey (2009)
Conference item
University of Waikato
All around the globe, thousands of learning experiments are being executed on a daily basis, only to be discarded after interpretation. Yet the information contained in these experiments might have uses beyond their original intent and, if properly stored, could be of great use to future research. In this paper, we hope to stimulate the development of such learning experiment repositories by providing a bird's-eye view of how they can be created and used in practice, bringing together existing approaches and new ideas. We draw parallels with how experiments are being curated in other sciences, and then discuss how both the empirical and theoretical details of learning experiments can be expressed, organized and made universally accessible. Finally, we discuss a range of possible services such a resource can offer, either used directly or integrated into data mining tools.
Semi-random model tree ensembles: An effective and scalable regression method
Pfahringer, Bernhard (2011)
Conference item
University of Waikato
We present and investigate ensembles of semi-random model trees as a novel regression method. Such ensembles combine the scalability of tree-based methods with predictive performance rivalling the state of the art in numeric prediction. An empirical investigation shows that semi-random model trees produce predictive performance which is competitive with state-of-the-art methods like Gaussian process regression or Additive Groves of regression trees. The training and optimization of semi-random model trees scales better than Gaussian process regression to larger datasets, and enjoys an advantage over Additive Groves of between one and two orders of magnitude.
Prediction of ordinal classes using regression trees
Kramer, Stefan; Widmer, Gerhard; Pfahringer, Bernhard; de Groeve, Michael (2000)
Conference item
University of Waikato
This paper is devoted to the problem of learning to predict ordinal (i.e., ordered discrete) classes using classification and regression trees. We start with S-CART, a tree induction algorithm, and study various ways of transforming it into a learner for ordinal classification tasks. These algorithm variants are compared on a number of benchmark data sets to verify the relative strengths and weaknesses of the strategies and to study the trade-off between optimal categorical classification accuracy (hit rate) and minimum distance-based error. Preliminary results indicate that this is a promising avenue towards algorithms that combine aspects of classification and regression.
A Toolbox for Learning from Relational Data with Propositional and Multi-instance Learners
Reutemann, Peter; Pfahringer, Bernhard; Frank, Eibe (2005)
Conference item
University of Waikato
Most databases employ the relational model for data storage. To use this data in a propositional learner, a propositionalization step has to take place. Similarly, the data has to be transformed to be amenable to a multi-instance learner. The Proper Toolbox contains an extended version of RELAGGS, the Multi-Instance Learning Kit MILK, and can also combine the multi-instance data with aggregated data from RELAGGS. RELAGGS was extended to handle arbitrarily nested relations and to work with both primary keys and indices. For MILK, the relational model is flattened into a single table and this data is fed into a multi-instance learner. REMILK finally combines the aggregated data produced by RELAGGS and the multi-instance data, flattened for MILK, into a single table that is once again the input for a multi-instance learner. Several well-known datasets are used for experiments which highlight the strengths and weaknesses of the different approaches.
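The aggregation step that RELAGGS performs can be illustrated on a single one-to-many relation: each main-table row absorbs summary statistics of its matching detail rows. This is a hypothetical minimal sketch (`relaggs_like` and its field names are illustrative; the real system handles nested relations, keys, and many more aggregates):

```python
from statistics import mean

def relaggs_like(main_rows, detail_rows, key, num_field):
    """Flatten a one-to-many relation RELAGGS-style: for each main-table
    row, aggregate the matching detail rows into count/min/max/mean
    columns, yielding a single propositional table."""
    out = []
    for row in main_rows:
        matches = [d[num_field] for d in detail_rows if d[key] == row[key]]
        agg = dict(row)
        agg.update({
            "count": len(matches),
            "min": min(matches, default=None),
            "max": max(matches, default=None),
            "mean": mean(matches) if matches else None,
        })
        out.append(agg)
    return out
```

The multi-instance route taken by MILK keeps the matching detail rows as a bag per main row instead of collapsing them into aggregates.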
Stress testing Hoeffding trees
Holmes, Geoffrey; Kirkby, Richard Brendon; Pfahringer, Bernhard (2005)
Conference item
University of Waikato
Hoeffding trees are state of the art in classification for data streams. They perform prediction by choosing the majority class at each leaf. Their predictive accuracy can be increased by adding Naive Bayes models at the leaves of the trees. By stress-testing these two prediction methods using noise, more complex concepts, and an order of magnitude more instances than in previous studies, we discover situations where the Naive Bayes method outperforms the standard Hoeffding tree initially but is eventually overtaken. The reason for this crossover is determined, and a hybrid adaptive method is proposed that generally outperforms the two original prediction methods for both simple and complex concepts as well as under noise.
Text categorisation using document profiling
Sauban, Maximilien; Pfahringer, Bernhard (2003)
Conference item
University of Waikato
This paper presents an extension of prior work by Michael D. Lee on psychologically plausible text categorisation. Our approach utilises Lee's model as a preprocessing filter to generate a dense representation for a given text document (a document profile) and passes that on to an arbitrary standard propositional learning algorithm. Similarly to standard feature selection for text classification, the dimensionality of instances is drastically reduced this way, which in turn greatly lowers the computational load for the subsequent learning algorithm. The filter itself is very fast as well, as it is basically just an interesting variant of Naive Bayes. We present different variations of the filter and conduct an evaluation against the Reuters-21578 collection that shows performance comparable to previously published results on that collection, but at a lower computational cost.
Wrapping boosters against noise
Pfahringer, Bernhard; Holmes, Geoffrey; Schmidberger, Gabi (2001)
Conference item
University of Waikato
Wrappers have recently been used to obtain parameter optimizations for learning algorithms. In this paper we investigate the use of a wrapper for estimating the correct number of boosting ensembles in the presence of class noise. Contrary to the naive approach, which would be quadratic in the number of boosting iterations, the incremental algorithm described here is linear. Additionally, directly using the k-sized ensembles generated during the k-fold cross-validation search for prediction usually results in further improvements in classification performance. This improvement can be attributed to the reduction of variance due to averaging k ensembles instead of using only one ensemble. Consequently, cross-validation in the way we use it here, termed wrapping, can be viewed as yet another ensemble learner, similar in spirit to bagging but also somewhat related to stacking.
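The linear-vs-quadratic point rests on a simple observation: one boosting run of K iterations yields every prefix ensemble of size 1..K for free, so each candidate size can be scored in a single pass instead of retraining from scratch. A hypothetical sketch (names are illustrative; votes are the weighted per-iteration contributions, labels are +/-1):

```python
def best_prefix_size(stage_votes, y):
    """stage_votes[i][j] is the weighted vote of boosting iteration i on
    instance j. Running prefix sums score every ensemble size in one
    linear pass over the iterations."""
    n = len(y)
    totals = [0.0] * n
    best = (0.0, 0)
    for size, votes in enumerate(stage_votes, start=1):
        totals = [t + v for t, v in zip(totals, votes)]
        acc = sum((t > 0) == (c > 0) for t, c in zip(totals, y)) / n
        if acc > best[0]:
            best = (acc, size)
    return best[1]
```

In the wrapper, this evaluation runs on each held-out cross-validation fold, and the k fold ensembles themselves are averaged for prediction.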
Clustering for classification
Evans, Reuben James Emmanuel; Pfahringer, Bernhard; Holmes, Geoffrey (2011)
Conference item
University of Waikato
Advances in technology have provided industry with an array of devices for collecting data. The frequency and scale of data collection mean that there are now many large datasets being generated. To find patterns in these datasets, it would be useful to be able to apply modern methods of classification such as support vector machines. Unfortunately these methods are computationally expensive, in fact quadratic in the number of data points, so cannot be applied directly. This paper proposes a framework whereby a variety of clustering methods can be used to summarise datasets, that is, to reduce them to a smaller but still representative dataset, so that advanced methods can be applied. It compares the results of using this framework against using random selection on a large number of classification problems. Results show that clustering prior to classification is beneficial when employing a sophisticated classifier; however, when the classifier is simple, the benefits over random selection do not justify the added cost of clustering. The results also show that for each dataset it is important to choose a clustering method carefully.
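The cluster-then-classify idea can be reduced to its most degenerate case for illustration: summarise each class by a single centroid and classify by nearest centroid. This is only a sketch of the framework's shape (the paper compares a variety of real clustering methods producing many representatives per class, then trains a sophisticated classifier on the summary):

```python
import math
from collections import defaultdict

def summarise_by_centroid(X, y):
    """Reduce each class to one centroid: the extreme case of replacing
    a large dataset with a small, representative summary."""
    groups = defaultdict(list)
    for x, c in zip(X, y):
        groups[c].append(x)
    return {c: [sum(col) / len(pts) for col in zip(*pts)]
            for c, pts in groups.items()}

def classify(centroids, x):
    """Predict the class whose summary point is nearest."""
    return min(centroids, key=lambda c: math.dist(centroids[c], x))
```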