78 results for Pfahringer, Bernhard, Conference item

Random Relational Rules
Pfahringer, Bernhard; Anderson, Grant (2006)
Conference item
University of Waikato
Exhaustive search in relational learning is generally infeasible, so some form of heuristic search is usually employed, as in FOIL [1]. On the other hand, so-called stochastic discrimination provides a framework for combining arbitrary numbers of weak classifiers (in this case randomly generated relational rules) in a way where accuracy improves with additional rules, even after maximal accuracy on the training data has been reached [2]. The weak classifiers must have a slightly higher probability of covering instances of their target class than of other classes. As the rules are also independent and identically distributed, the Central Limit Theorem applies: as the number of weak classifiers/rules grows, the coverages for different classes resemble well-separated normal distributions. Stochastic discrimination is closely related to other ensemble methods like bagging, boosting, or random forests, all of which have been tried in relational learning [3, 4, 5].
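The combination scheme described above can be illustrated on propositional data (a simplified stand-in for relational rules; the axis-threshold rule form and all names below are hypothetical): random rules are kept only if they cover the target class at a higher rate than the other class, and an instance is scored by its average coverage over all kept rules.

```python
import random

def make_rules(X, y, n_rules, rng):
    """Generate random axis-threshold rules, keeping only those whose
    coverage of class-1 training points exceeds that of class-0 points
    (the 'slightly enriched' weak-classifier condition)."""
    rules, n_feat = [], len(X[0])
    while len(rules) < n_rules:
        f, t, s = rng.randrange(n_feat), rng.random(), rng.choice([1, -1])
        covers = lambda x, f=f, t=t, s=s: s * (x[f] - t) > 0
        p1 = sum(covers(x) for x, c in zip(X, y) if c == 1) / y.count(1)
        p0 = sum(covers(x) for x, c in zip(X, y) if c == 0) / y.count(0)
        if p1 > p0:
            rules.append(covers)
    return rules

def score(x, rules):
    """Mean coverage; as the number of i.i.d. rules grows this concentrates
    around class-specific means (the Central Limit Theorem argument)."""
    return sum(r(x) for r in rules) / len(rules)

rng = random.Random(1)
# toy data: class 1 tends to have larger values of feature 0
X = ([[rng.uniform(0.6, 1.0), rng.random()] for _ in range(30)] +
     [[rng.uniform(0.0, 0.4), rng.random()] for _ in range(30)])
y = [1] * 30 + [0] * 30
rules = make_rules(X, y, 200, rng)
```

A class-1-like instance then scores higher than a class-0-like one, e.g. `score([0.9, 0.5], rules) > score([0.1, 0.5], rules)`.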
Bagging ensemble selection for regression
Sun, Quan; Pfahringer, Bernhard (2012)
Conference item
Bagging ensemble selection (BES) is a relatively new ensemble learning strategy. The strategy can be seen as an ensemble of the ensemble selection from libraries of models (ES) strategy. Previous experimental results on binary classification problems have shown that, using random trees as base classifiers, BES-OOB (the most successful variant of BES) is competitive with (and in many cases superior to) other ensemble learning strategies, for instance the original ES algorithm, stacking with linear regression, random forests or boosting. Motivated by the promising results in classification, this paper examines the predictive performance of the BES-OOB strategy for regression problems. Our results show that the BES-OOB strategy outperforms Stochastic Gradient Boosting and Bagging when using regression trees as the base learners. Our results also suggest that the advantage of using a diverse model library becomes clear when the model library size is relatively large. We also present encouraging results indicating that the non-negative least squares algorithm is a viable approach for pruning an ensemble of ensembles.
Semi-random model tree ensembles: An effective and scalable regression method
Pfahringer, Bernhard (2011)
Conference item
We present and investigate ensembles of semi-random model trees as a novel regression method. Such ensembles combine the scalability of tree-based methods with predictive performance rivalling the state of the art in numeric prediction. An empirical investigation shows that Semi-Random Model Trees produce predictive performance which is competitive with state-of-the-art methods like Gaussian Processes Regression or Additive Groves of Regression Trees. The training and optimization of Random Model Trees scales better than Gaussian Processes Regression to larger datasets, and enjoys a constant advantage of one to two orders of magnitude over Additive Groves.
Multi-label classification using Boolean matrix decomposition
Wicker, Jörg; Pfahringer, Bernhard; Kramer, Stefan (2012)
Conference item
This paper introduces a new multi-label classifier based on Boolean matrix decomposition. Boolean matrix decomposition is used to extract, from the full label matrix, latent labels representing useful Boolean combinations of the original labels. Base-level models predict the latent labels, which are subsequently transformed into the actual labels by Boolean matrix multiplication with the second matrix from the decomposition. The new method is tested on six publicly available datasets with varying numbers of labels. The experimental evaluation shows that the new method works particularly well on datasets with a large number of labels and strong dependencies among them.
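The reconstruction step is plain Boolean matrix multiplication; a minimal sketch, with a hypothetical two-latent-label decomposition of a three-label problem (the matrices below are made up for illustration):

```python
def boolean_matmul(A, B):
    """Boolean product: (A o B)[i][j] = OR over k of (A[i][k] AND B[k][j])."""
    return [[int(any(row[k] and B[k][j] for k in range(len(B))))
             for j in range(len(B[0]))] for row in A]

# hypothetical second matrix of the decomposition:
B = [[1, 1, 0],   # latent label 0 implies actual labels 0 and 1
     [0, 1, 1]]   # latent label 1 implies actual labels 1 and 2

# latent labels as predicted by the base-level models, one row per instance:
latent = [[1, 0], [0, 1], [1, 1]]
labels = boolean_matmul(latent, B)   # back to the original label space
```

Here `labels` comes out as `[[1, 1, 0], [0, 1, 1], [1, 1, 1]]`: each instance receives the union of the actual labels implied by its predicted latent labels.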
Prediction of ordinal classes using regression trees
Kramer, Stefan; Widmer, Gerhard; Pfahringer, Bernhard; de Groeve, Michael (2000)
Conference item
This paper is devoted to the problem of learning to predict ordinal (i.e., ordered discrete) classes using classification and regression trees. We start with S-CART, a tree induction algorithm, and study various ways of transforming it into a learner for ordinal classification tasks. These algorithm variants are compared on a number of benchmark data sets to verify the relative strengths and weaknesses of the strategies and to study the trade-off between optimal categorical classification accuracy (hit rate) and minimum distance-based error. Preliminary results indicate that this is a promising avenue towards algorithms that combine aspects of classification and regression.
Exploiting propositionalization based on random relational rules for semi-supervised learning
Pfahringer, Bernhard; Anderson, Grant (2008)
Conference item
In this paper we investigate an approach to semi-supervised learning based on randomized propositionalization, which allows for applying standard propositional classification algorithms like support vector machines to multi-relational data. Randomization based on random relational rules can work both with and without a class attribute and can therefore be applied simultaneously to both the labeled and the unlabeled portion of the data present in semi-supervised learning. An empirical investigation compares semi-supervised propositionalization to standard propositionalization using just the labeled data portion, as well as to a variant that also uses just the labeled data portion but includes the label information in an attempt to improve the resulting propositionalization. Preliminary experimental results indicate that propositionalization generated on the full dataset, i.e. the semi-supervised approach, tends to outperform the other two more standard approaches.
A Toolbox for Learning from Relational Data with Propositional and Multi-instance Learners
Reutemann, Peter; Pfahringer, Bernhard; Frank, Eibe (2005)
Conference item
Most databases employ the relational model for data storage. To use this data in a propositional learner, a propositionalization step has to take place. Similarly, the data has to be transformed to be amenable to a multi-instance learner. The Proper Toolbox contains an extended version of RELAGGS, the Multi-Instance Learning Kit MILK, and can also combine the multi-instance data with aggregated data from RELAGGS. RELAGGS was extended to handle arbitrarily nested relations and to work with both primary keys and indices. For MILK the relational model is flattened into a single table and this data is fed into a multi-instance learner. REMILK finally combines the aggregated data produced by RELAGGS and the multi-instance data, flattened for MILK, into a single table that is once again the input for a multi-instance learner. Several well-known datasets are used for experiments which highlight the strengths and weaknesses of the different approaches.
Multinomial naive Bayes for text categorization revisited
Kibriya, Ashraf Masood; Frank, Eibe; Pfahringer, Bernhard; Holmes, Geoffrey (2005)
Conference item
This paper presents empirical results for several versions of the multinomial naive Bayes classifier on four text categorization problems, and a way of improving it using locally weighted learning. More specifically, it compares standard multinomial naive Bayes to the recently proposed transformed weight-normalized complement naive Bayes classifier (TWCNB) [1], and shows that some of the modifications included in TWCNB may not be necessary to achieve optimum performance on some datasets. However, it does show that TF-IDF conversion and document length normalization are important. It also shows that support vector machines can, in fact, sometimes very significantly outperform both methods. Finally, it shows how the performance of multinomial naive Bayes can be improved using locally weighted learning. However, the overall conclusion of our paper is that support vector machines are still the method of choice if the aim is to maximize accuracy.
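For reference, a minimal multinomial naive Bayes over raw term counts with Laplace smoothing; this is a generic textbook sketch, not the exact variants compared in the paper, and the toy documents are made up:

```python
import math
from collections import Counter

def train_mnb(docs, labels, alpha=1.0):
    """Fit log-priors and Laplace-smoothed log-likelihoods per class."""
    vocab = {w for d in docs for w in d}
    prior, cond = {}, {}
    for c in set(labels):
        prior[c] = math.log(labels.count(c) / len(labels))
        counts = Counter(w for d, l in zip(docs, labels) if l == c for w in d)
        total = sum(counts.values()) + alpha * len(vocab)
        cond[c] = {w: math.log((counts[w] + alpha) / total) for w in vocab}
    return prior, cond

def predict(doc, prior, cond):
    """Pick the class maximising log-prior plus summed log-likelihoods."""
    return max(prior, key=lambda c: prior[c] +
               sum(cond[c].get(w, 0.0) for w in doc))

docs = [["ball", "goal", "team"], ["vote", "party"],
        ["goal", "match"], ["party", "election", "vote"]]
labels = ["sport", "politics", "sport", "politics"]
prior, cond = train_mnb(docs, labels)
```

With this toy corpus, `predict(["goal", "ball"], prior, cond)` returns `"sport"`. TF-IDF conversion and document-length normalization, which the paper finds important, would replace the raw counts fed to `Counter` above.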
Stress testing Hoeffding trees
Holmes, Geoffrey; Kirkby, Richard Brendon; Pfahringer, Bernhard (2005)
Conference item
Hoeffding trees are state-of-the-art in classification for data streams. They perform prediction by choosing the majority class at each leaf. Their predictive accuracy can be increased by adding naive Bayes models at the leaves of the trees. By stress-testing these two prediction methods using noise, more complex concepts, and an order of magnitude more instances than in previous studies, we discover situations where the naive Bayes method outperforms the standard Hoeffding tree initially but is eventually overtaken. The reason for this crossover is determined and a hybrid adaptive method is proposed that generally outperforms the two original prediction methods for both simple and complex concepts as well as under noise.
Clustering large datasets using Cobweb and K-means in tandem
Li, Mi; Holmes, Geoffrey; Pfahringer, Bernhard (2005)
Conference item
This paper presents a single-scan algorithm for clustering large datasets based on a two-phase process which combines two well-known clustering methods. The Cobweb algorithm is modified to produce a balanced tree with subclusters at the leaves, and then K-means is applied to the resulting subclusters. The resulting method, Scalable Cobweb, is then compared to a single-pass K-means algorithm and standard K-means. The evaluation looks at error as measured by the sum of squared error and vulnerability to the order in which data points are processed.
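The two-phase idea can be sketched as follows. Here a simple single-scan "leader" pass stands in for the modified Cobweb tree of phase one (the paper builds a balanced Cobweb tree; this substitute only mimics the coarse-subcluster output), and plain K-means with deterministic farthest-point initialisation is applied to the resulting subcluster centroids:

```python
import math
import random

def leader_cluster(points, radius):
    """Phase 1 (stand-in): single scan assigning each point to the first
    leader within `radius`, then returning subcluster centroids."""
    leaders, members = [], []
    for p in points:
        for i, l in enumerate(leaders):
            if math.dist(p, l) <= radius:
                members[i].append(p)
                break
        else:
            leaders.append(p)
            members.append([p])
    return [[sum(c) / len(m) for c in zip(*m)] for m in members]

def kmeans(points, k, iters=20):
    """Phase 2: plain K-means on the (few) subcluster centroids."""
    centers = [points[0]]
    while len(centers) < k:   # farthest-point initialisation
        centers.append(max(points,
            key=lambda p: min(math.dist(p, c) for c in centers)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[min(range(k),
                       key=lambda i: math.dist(p, centers[i]))].append(p)
        centers = [[sum(c) / len(g) for c in zip(*g)] if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers

rng = random.Random(2)
data = ([[rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for _ in range(50)] +
        [[rng.gauss(10, 0.5), rng.gauss(10, 0.5)] for _ in range(50)])
subclusters = leader_cluster(data, radius=2.0)
centers = sorted(kmeans(subclusters, k=2))
```

The expensive step (K-means) only ever sees the small set of subcluster centroids, not the raw data, which is what makes the tandem scalable.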
Text categorisation using document profiling
Sauban, Maximilien; Pfahringer, Bernhard (2003)
Conference item
This paper presents an extension of prior work by Michael D. Lee on psychologically plausible text categorisation. Our approach utilises Lee's model as a preprocessing filter to generate a dense representation for a given text document (a document profile) and passes that on to an arbitrary standard propositional learning algorithm. Similarly to standard feature selection for text classification, the dimensionality of instances is drastically reduced this way, which in turn greatly lowers the computational load for the subsequent learning algorithm. The filter itself is very fast as well, as it basically is just an interesting variant of naive Bayes. We present different variations of the filter and conduct an evaluation against the Reuters-21578 collection that shows performance comparable to previously published results on that collection, but at a lower computational cost.
Wrapping boosters against noise
Pfahringer, Bernhard; Holmes, Geoffrey; Schmidberger, Gabi (2001)
Conference item
Wrappers have recently been used to obtain parameter optimizations for learning algorithms. In this paper we investigate the use of a wrapper for estimating the correct number of boosting ensembles in the presence of class noise. Contrary to the naive approach that would be quadratic in the number of boosting iterations, the incremental algorithm described is linear. Additionally, directly using the k-sized ensembles generated during the k-fold cross-validation search for prediction usually results in further improvements in classification performance. This improvement can be attributed to the reduction of variance due to averaging k ensembles instead of using only one ensemble. Consequently, cross-validation in the way we use it here, termed wrapping, can be viewed as yet another ensemble learner similar in spirit to bagging but also somewhat related to stacking.
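The linear rather than quadratic cost comes from growing each fold's ensemble once and then scoring every prefix length on the held-out folds; a minimal sketch of that idea, with made-up "ensemble members" as plain 0/1 functions (not the paper's boosting setup):

```python
def best_size(fold_ensembles, fold_val, T):
    """Score every prefix length t = 1..T of the pre-grown fold ensembles
    on their validation folds; return the t with the lowest error.
    Ensembles are grown once, so the search is linear in T."""
    errors = []
    for t in range(1, T + 1):
        wrong = total = 0
        for members, val in zip(fold_ensembles, fold_val):
            for x, y in val:
                vote = sum(m(x) for m in members[:t]) / t
                wrong += int((vote >= 0.5) != y)
                total += 1
        errors.append(wrong / total)
    return 1 + errors.index(min(errors))

def wrap_predict(fold_ensembles, t, x):
    """'Wrapping': average the k fold ensembles (each cut to size t)
    instead of retraining a single ensemble on all the data."""
    votes = [sum(m(x) for m in members[:t]) / t for members in fold_ensembles]
    return int(sum(votes) / len(votes) >= 0.5)

good = lambda x: x        # toy members: 'good' copies the 0/1 input,
bad = lambda x: 1 - x     # 'bad' (a noisy member) inverts it
folds = [[bad, good, good], [bad, good, good]]
vals = [[(1, 1), (0, 0)], [(1, 1), (0, 0)]]
t = best_size(folds, vals, 3)
```

Here `best_size` picks t = 3, the first size at which the good members outvote the noisy one, and `wrap_predict` then averages both fold ensembles at that size.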
Clustering for classification
Evans, Reuben James Emmanuel; Pfahringer, Bernhard; Holmes, Geoffrey (2011)
Conference item
Advances in technology have provided industry with an array of devices for collecting data. The frequency and scale of data collection mean that there are now many large datasets being generated. To find patterns in these datasets it would be useful to be able to apply modern methods of classification such as support vector machines. Unfortunately these methods are computationally expensive, in fact quadratic in the number of data points, so they cannot be applied directly. This paper proposes a framework whereby a variety of clustering methods can be used to summarise datasets, that is, reduce them to a smaller but still representative dataset, so that advanced methods can be applied. It compares the results of using this framework against using random selection on a large number of classification problems. Results show that clustering prior to classification is beneficial when employing a sophisticated classifier; however, when the classifier is simple, the benefits over random selection do not justify the added cost of clustering. The results also show that for each dataset it is important to choose a clustering method carefully.
Full model selection in the space of data mining operators
Sun, Quan; Pfahringer, Bernhard; Mayo, Michael (2012)
Conference item
We propose a framework and a novel algorithm for the full model selection (FMS) problem. The proposed algorithm, which combines genetic algorithms (GA) and particle swarm optimization (PSO), is named GPS (for GA-PSO-FMS): a GA is used for searching the optimal structure of a data mining solution, and PSO is used for searching the optimal parameter set for a particular structure instance. Given a classification or regression problem, GPS outputs an FMS solution as a directed acyclic graph consisting of diverse data mining operators that are applicable to the problem, including data cleansing, data sampling, feature transformation/selection and algorithm operators. The solution can also be represented graphically in a human-readable form. Experimental results demonstrate the benefit of the algorithm.
New Options for Hoeffding Trees
Pfahringer, Bernhard; Holmes, Geoffrey; Kirkby, Richard Brendon (2007)
Conference item
Hoeffding trees are state-of-the-art for processing high-speed data streams. Their ingenuity stems from updating sufficient statistics, and only addressing growth when decisions can be made that are guaranteed to be almost identical to those that would be made by conventional batch learning methods. Despite this guarantee, decisions are still subject to limited lookahead and stability issues. In this paper we explore Hoeffding Option Trees, a regular Hoeffding tree containing additional option nodes that allow several tests to be applied, leading to multiple Hoeffding trees as separate paths. We show how to control tree growth in order to generate a mixture of paths, and empirically determine a reasonable number of paths. We then empirically evaluate a spectrum of Hoeffding tree variations: single trees, option trees and bagged trees. Finally, we investigate pruning. We show that on some datasets a pruned option tree can be smaller and more accurate than a single tree.
Experiment Databases: Creating a New Platform for Meta-Learning Research
Vanschoren, Joaquin; Blockeel, Hendrik; Pfahringer, Bernhard; Holmes, Geoffrey (2008)
Conference item
Many studies in machine learning try to investigate what makes an algorithm succeed or fail on certain datasets. However, the field is still evolving relatively quickly, and new algorithms, preprocessing methods, learning tasks and evaluation procedures continue to emerge in the literature. Thus, it is impossible for a single study to cover this expanding space of learning approaches. In this paper, we propose a community-based approach for the analysis of learning algorithms, driven by sharing metadata from previous experiments in a uniform way. We illustrate how organizing this information in a central database can create a practical public platform for any kind of exploitation of meta-knowledge, allowing effective reuse of previous experimentation and targeted analysis of the collected results.
Optimizing the induction of alternating decision trees
Pfahringer, Bernhard; Holmes, Geoffrey; Kirkby, Richard Brendon (2001)
Conference item
The alternating decision tree brings comprehensibility to the performance-enhancing capabilities of boosting. A single interpretable tree is induced wherein knowledge is distributed across the nodes and multiple paths are traversed to form predictions. The complexity of the algorithm is quadratic in the number of boosting iterations and this makes it unsuitable for larger knowledge discovery in databases tasks. In this paper we explore various heuristic methods for reducing this complexity while maintaining the performance characteristics of the original algorithm. In experiments using standard, artificial and knowledge discovery datasets we show that a range of heuristic methods with log-linear complexity are capable of achieving similar performance to the original method. Of these methods, the random walk heuristic is seen to outperform all others as the number of boosting iterations increases. The average-case complexity of this method is linear.
Organizing the World’s Machine Learning Information
Vanschoren, Joaquin; Blockeel, Hendrik; Pfahringer, Bernhard; Holmes, Geoffrey (2009)
Conference item
All around the globe, thousands of learning experiments are being executed on a daily basis, only to be discarded after interpretation. Yet, the information contained in these experiments might have uses beyond their original intent and, if properly stored, could be of great use to future research. In this paper, we hope to stimulate the development of such learning experiment repositories by providing a bird's-eye view of how they can be created and used in practice, bringing together existing approaches and new ideas. We draw parallels between how experiments are being curated in other sciences, and consecutively discuss how both the empirical and theoretical details of learning experiments can be expressed, organized and made universally accessible. Finally, we discuss a range of possible services such a resource can offer, either used directly or integrated into data mining tools.
Classifier chains for multi-label classification
Read, Jesse; Pfahringer, Bernhard; Holmes, Geoffrey; Frank, Eibe (2009)
Conference item
The widely known binary relevance method for multi-label classification, which considers each label as an independent binary problem, has been sidelined in the literature due to the perceived inadequacy of its label-independence assumption. Instead, most current methods invest considerable complexity to model interdependencies between labels. This paper shows that binary-relevance-based methods have much to offer, especially in terms of scalability to large datasets. We exemplify this with a novel chaining method that can model label correlations while maintaining acceptable computational complexity. Empirical evaluation over a broad range of multi-label datasets with a variety of evaluation metrics demonstrates the competitiveness of our chaining method against related and state-of-the-art methods, both in terms of predictive performance and time complexity.
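The chaining idea in miniature, with 1-nearest-neighbour as a stand-in for an arbitrary binary base learner (the toy data and learner are illustrative, not from the paper):

```python
import math

class OneNN:
    """Trivial binary base learner: predict the label of the nearest
    training instance."""
    def fit(self, X, y):
        self.X, self.y = X, y
        return self
    def predict(self, x):
        return self.y[min(range(len(self.X)),
                          key=lambda i: math.dist(x, self.X[i]))]

class ClassifierChain:
    """One binary model per label; model j is trained on the original
    features plus labels 0..j-1, so it can exploit label correlations
    at roughly binary-relevance cost."""
    def fit(self, X, Y):
        self.models, aug = [], [list(x) for x in X]
        for j in range(len(Y[0])):
            self.models.append(OneNN().fit([list(a) for a in aug],
                                           [row[j] for row in Y]))
            for a, row in zip(aug, Y):
                a.append(row[j])        # training uses the true labels
        return self
    def predict(self, x):
        a, out = list(x), []
        for m in self.models:
            out.append(m.predict(a))
            a.append(out[-1])           # prediction chains its own outputs
        return out

# two perfectly correlated labels: label 1 always equals label 0
cc = ClassifierChain().fit([[0], [0], [1], [1]],
                           [[0, 0], [0, 0], [1, 1], [1, 1]])
```

Because the second model sees the first model's output as an extra feature, the perfect correlation between the labels is available to it at prediction time, something plain binary relevance cannot exploit.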
Handling numeric attributes in Hoeffding trees
Pfahringer, Bernhard; Holmes, Geoffrey; Kirkby, Richard Brendon (2008)
Conference item
For conventional machine learning classification algorithms, handling numeric attributes is relatively straightforward. Unsupervised and supervised solutions exist that either segment the data into predefined bins or sort the data and search for the best split points. Unfortunately, none of these solutions carry over particularly well to a data stream environment. Solutions for data streams have been proposed by several authors, but as yet none have been compared empirically. In this paper we investigate a range of methods for multi-class tree-based classification where the handling of numeric attributes takes place as the tree is constructed. To this end, we extend an existing approximation approach based on simple Gaussian approximation. We then compare this method with four approaches from the literature, arriving at eight final algorithm configurations for testing. The solutions cover a range of options from perfectly accurate and memory-intensive to highly approximate. All methods are tested using the Hoeffding tree classification algorithm. Surprisingly, the experimental comparison shows that the most approximate methods produce the most accurate trees by allowing for faster tree growth.
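The "simple Gaussian approximation" can be illustrated with a constant-memory per-class summary (Welford's online mean/variance), from which the fraction of a class falling below any candidate split point is estimated via the normal CDF; this is a generic sketch of the idea, not the paper's exact extension:

```python
import math

class GaussianStat:
    """Constant-memory summary of one numeric attribute for one class."""
    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0
    def add(self, x):                      # Welford's online update
        self.n += 1
        d = x - self.mean
        self.mean += d / self.n
        self.m2 += d * (x - self.mean)
    def var(self):
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0
    def frac_below(self, split):
        """Estimated fraction of this class below `split` (normal CDF)."""
        sd = math.sqrt(self.var()) or 1e-9
        return 0.5 * (1.0 + math.erf((split - self.mean) / (sd * math.sqrt(2.0))))

# stream two classes past their summaries, then evaluate a candidate split:
a, b = GaussianStat(), GaussianStat()
for x in [0.8, 1.0, 1.2, 0.9, 1.1]:
    a.add(x)
for x in [4.8, 5.0, 5.2, 4.9, 5.1]:
    b.add(x)
```

A split at 3.0 can then be scored from the summaries alone: `a.frac_below(3.0)` is close to 1 and `b.frac_below(3.0)` close to 0, so the split separates the classes without the tree ever storing the observed values.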