1,737 results for Working or discussion paper

  • Flexible refinement

    Reeves, Steve; Streader, David (2007-05-07)

    Working or discussion paper
    University of Waikato

    To help make refinement more usable in practice we introduce a general, flexible model of refinement. This is defined in terms of what contexts an entity can appear in, and what observations can be made of it in those contexts. Our general model is expressed in terms of an operational semantics, and by exploiting the well-known isomorphism between state-based relational semantics and event-based labelled transition semantics we were able to take particular models from both the state- and event-based literature, reflect on them and gradually evolve our general model. We are also able to view our general model both as a testing semantics and as a logical theory with refinement as implication. Our general model can be used as a bridge between different special models, and using this bridge we compare the definition of determinism found in different special models. We do this because the reduction of nondeterminism underpins many definitions of refinement found in a variety of special models. To our surprise we find that the definition of determinism commonly used in the process algebra literature is at odds with determinism as defined in other special models. In order to rectify this situation we return to the intuitions expressed by Milner in CCS, and by formalising these intuitions we are able to define determinism in process algebra in such a way that it is no longer at odds with the definitions we have taken from other special models. Using our abstract definition of determinism we are able to construct a new model, interactive branching programs, that is an implementable subset of process algebra. Later in the chapter we show explicitly how five special models, taken from the literature, are instances of our general model. This is done simply by fixing the sets of contexts and observations involved. Next we define vertical refinement on our general model.
    Vertical refinement can be seen as a generalisation of what, in the literature, has been called action refinement or non-atomic refinement. Alternatively, by viewing a layer as a logical theory, vertical refinement is a theory morphism, formalised as a Galois connection. By constructing a vertical refinement between broadcast processes and interactive branching programs we can see how interactive branching programs can be implemented on a platform providing broadcast communication. But we have been unable to extend this theory morphism to implement all of process algebra using broadcast communication. Upon investigation we show that the problem arises from the same examples that caused difficulties with the definition of determinism in process algebra. Finally we illustrate the usefulness of our flexible general model by formally developing a single entity that contains events that use handshake communication and events that use broadcast communication.

    View record details
  • Computational sense: the role of technology in the education of digital librarians

    Twidale, Michael B.; Nichols, David M. (2006-10-01)

    Working or discussion paper
    University of Waikato

    The rapid progress of digital library technology from research to implementation has created a force for change in the curricula of library schools. The education of future librarians has always had to adapt to new technologies but the pace, complexity and implications of digital libraries pose considerable challenges. In this article we explore how we might successfully blend elements of computer science and library science to produce effective educational experiences for the digital librarians of tomorrow. We first outline the background to current digital librarian education and then propose the concept of computational sense as an appropriate meeting point for these two disciplines.

    View record details
  • Design and analysis of an efficient distributed event notification service

    Bittner, Sven; Hinze, Annika (2004-01-01)

    Working or discussion paper
    University of Waikato

    Event Notification Services (ENS) use the publish/subscribe paradigm to continuously inform subscribers about events they are interested in. Subscribers define their interest in so-called profiles. The event information is provided by event publishers, filtered by the service against the profiles, and then sent to the subscribers. In real-time systems such as facility management, an efficient filter component is one of the most important design goals. In this paper, we present our analysis and evaluation of efficient distributed filtering algorithms. Firstly, we propose a classification and first-cut analysis of distributed filtering algorithms. Secondly, based on the classification, we describe our analysis of selected algorithms. Thirdly, we describe our ENS prototype DAS, which includes three filtering algorithms. This prototype is tested with respect to efficiency, network traffic and memory consumption. Finally, we discuss the results of our practical analysis in detail.
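    The filtering step described above can be sketched as follows, assuming a simple attribute/value event model; the profile and event names are illustrative, not taken from the DAS prototype:

```python
# A minimal sketch of profile-based filtering, assuming a simple
# attribute/value event model. The profile and event names are
# illustrative, not taken from the DAS prototype.

def matches(profile, event):
    """A profile maps attributes to predicates; an event maps attributes
    to values. The profile matches if every predicate holds on the event."""
    return all(attr in event and pred(event[attr])
               for attr, pred in profile.items())

def filter_event(event, subscriptions):
    """Return the identifiers of subscribers whose profiles match the event."""
    return [sub_id for sub_id, profile in subscriptions.items()
            if matches(profile, event)]

# Example profiles for a facility-management setting (hypothetical).
subscriptions = {
    "alice": {"building": lambda v: v == "A", "temp": lambda v: v > 30},
    "bob":   {"building": lambda v: v == "B"},
}
event = {"building": "A", "temp": 35}
```

    A distributed service applies this same matching at each broker, forwarding events only toward brokers with matching profiles.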

    View record details
  • Extracting corpus specific knowledge bases from Wikipedia

    Milne, David N.; Witten, Ian H.; Nichols, David M. (2007-06-01)

    Working or discussion paper
    University of Waikato

    Thesauri are useful knowledge structures for assisting information retrieval. Yet their production is labor-intensive, and few domains have comprehensive thesauri that cover domain-specific concepts and contemporary usage. One approach, which has been attempted without much success for decades, is to seek statistical natural language processing algorithms that work on free text. Instead, we propose to replace costly professional indexers with thousands of dedicated amateur volunteers--namely, those that are producing Wikipedia. This vast, open encyclopedia represents a rich tapestry of topics and semantics and a huge investment of human effort and judgment. We show how this can be directly exploited to provide WikiSauri: manually-defined yet inexpensive thesaurus structures that are specifically tailored to expose the topics, terminology and semantics of individual document collections. We also offer concrete evidence of the effectiveness of WikiSauri for assisting information retrieval.

    View record details
  • Analyzing library collections with starfield visualizations

    Sánchez, J. Alfredo; Twidale, Michael B.; Nichols, David M.; Silva, Nabani N. (2004-01-01)

    Working or discussion paper
    University of Waikato

    This paper presents a qualitative and formative study of the uses of a starfield-based visualization interface for analysis of library collections. The evaluation process has produced feedback that suggests ways to significantly improve starfield interfaces and the interaction process to improve their learnability and usability. The study also gave us clear indication of additional potential uses of starfield visualizations that can be exploited by further functionality and interface development. We report on resulting implications for the design and use of starfield visualizations that will impact their graphical interface features, their use for managing data quality and their potential for various forms of visual data mining. Although the current implementation and analysis focuses on the collection of a physical library, the most important contributions of our work will be in digital libraries, in which volume, complexity and dynamism of collections are increasing dramatically and tools are needed for visualization and analysis.

    View record details
  • Arbitrary boolean advertisements: the final step in supporting the boolean publish/subscribe model

    Bittner, Sven; Hinze, Annika (2006-06-01)

    Working or discussion paper
    University of Waikato

    Publish/subscribe systems allow for an efficient filtering of incoming information. This filtering is based on the specifications of subscriber interests, which are registered with the system as subscriptions. Publishers conversely specify advertisements, describing the messages they will send later on. What has been missing so far is support for arbitrary Boolean advertisements in publish/subscribe systems. Introducing the opportunity to specify these richer Boolean advertisements increases the accuracy with which publishers can describe their future messages, compared to the currently supported conjunctive advertisements. Thus, the number of subscriptions forwarded in the network is reduced. Additionally, the system can decide more time-efficiently whether a subscription needs to be forwarded, and can store and index advertisements more space-efficiently. In this paper, we introduce a publish/subscribe system that supports arbitrary Boolean advertisements and, symmetrically, arbitrary Boolean subscriptions. We show the advantages of supporting arbitrary Boolean advertisements and present an algorithm to calculate the practically required overlapping relationship among subscriptions and advertisements. Additionally, we develop the first optimization approach for arbitrary Boolean advertisements, advertisement pruning. Advertisement pruning is tailored to optimizing advertisements, in strong contrast to current optimizations for conjunctive advertisements, which mainly apply subscription-based optimization ideas and thus inherit the same disadvantages. In the second part of this paper, our evaluation of practical experiments, we analyze the efficiency properties of our approach to determining the overlapping relationship. We also compare conjunctive solutions for the overlapping problem to our calculation algorithm to show its benefits. Finally, we present a detailed evaluation of the optimization potential of advertisement pruning. This includes an analysis of the effects of additionally optimizing subscriptions on the advertisement pruning optimization.
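    The overlapping relationship can be illustrated with a brute-force sketch: a subscription and an advertisement overlap if some event satisfies both Boolean predicates. Real systems decide this with indexing rather than enumeration; the predicates and domains below are hypothetical:

```python
from itertools import product

# A brute-force sketch of the overlapping relationship: a subscription and an
# advertisement overlap if some event satisfies both Boolean predicates.
# Real systems decide this with indexing rather than enumeration; the
# predicates and domains below are hypothetical.

def overlaps(sub, adv, domains):
    """domains maps each attribute to its finite set of possible values."""
    attrs = list(domains)
    for values in product(*(domains[a] for a in attrs)):
        event = dict(zip(attrs, values))
        if sub(event) and adv(event):
            return True
    return False

# Arbitrary Boolean advertisement: (type = stock AND price > 50) OR type = news.
adv = lambda e: (e["type"] == "stock" and e["price"] > 50) or e["type"] == "news"
# A subscription using negation, beyond purely conjunctive forms.
sub = lambda e: e["type"] == "stock" and not (e["price"] < 80)
domains = {"type": ["stock", "news"], "price": [10, 60, 90]}
```

    Only subscriptions that overlap some advertisement need to be forwarded toward that publisher.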

    View record details
  • Computer graphics techniques for modeling page turning

    Liesaputra, Veronica; Witten, Ian H. (2007-10-24)

    Working or discussion paper
    University of Waikato

    Turning the page is a mechanical part of the cognitive act of reading that we do literally unthinkingly. Interest in realistic book models for digital libraries and other online documents is growing. Yet actually producing a computer graphics implementation for modeling page turning is a challenging undertaking. There are many possible foundations: two-dimensional models that use reflection and rotation; geometrical models using cylinders or cones; mass-spring models that simulate the mechanical properties of paper at varying degrees of fidelity; finite-element models that directly compute the actual forces within a piece of paper. Even the simplest methods are not trivial, and the more sophisticated ones involve detailed physical and mathematical models. The variety, intricacy and complexity of possible ways of simulating this fundamental act of reading are virtually unknown. This paper surveys computer graphics models for page turning. It combines a tutorial introduction that covers the range of possibilities and complexities with a mathematical synopsis of each model in sufficient detail to serve as a basis for implementation. Illustrations generated by our implementations of each model are included. The techniques presented include geometric methods (both two- and three-dimensional), mass-spring models with varying degrees of accuracy and complexity, and finite-element models. We include a detailed comparison of experimentally-determined computation time and subjective visual fidelity for all methods discussed. The simpler techniques support convincing real-time implementations on ordinary workstations.
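    The simplest of the two-dimensional models mentioned above can be sketched as a reflection: the turning part of the page is the mirror image of the flat page across a fold line through the spine. This is an illustrative sketch, not the survey's implementation:

```python
import math

# The turning part of the page is modelled as the mirror image of the flat
# page across a fold line through the spine (taken as the origin). This is an
# illustrative sketch of the simplest two-dimensional approach, not the
# survey's implementation.

def reflect(point, angle):
    """Reflect a 2-D point across a line through the origin at `angle` radians."""
    x, y = point
    c, s = math.cos(2 * angle), math.sin(2 * angle)
    return (c * x + s * y, s * x - c * y)

# As the fold line sweeps from 0 to pi/2, the free corner of a unit page
# travels from (1, 0) over the spine to (-1, 0).
corner = (1.0, 0.0)
```

    Animating the fold angle over time produces the flat, paper-stays-rigid turn that the more sophisticated cone, mass-spring and finite-element models refine.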

    View record details
  • One dimensional non-uniform rational B-splines for animation control

    Mahoui, Abdelaziz (2000-03)

    Working or discussion paper
    University of Waikato

    Most 3D animation packages use graphical representations called motion graphs to represent the variation in time of the motion parameters. Many use two-dimensional B-splines as animation curves because of their power to represent free-form curves. In this project, we investigate the possibility of using one-dimensional Non-Uniform Rational B-Spline (NURBS) curves for the interactive construction of animation control curves. One-dimensional NURBS curves offer the potential of solving some problems encountered in motion graphs when two-dimensional B-splines are used. The study focuses on the properties of the one-dimensional NURBS mathematical model. It also investigates the algorithms and shape modification tools devised for two-dimensional curves and their porting to the one-dimensional NURBS model. It also looks at the issues related to the user interface used to interactively modify the shape of the curves.
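    A one-dimensional NURBS curve can be evaluated with the standard Cox-de Boor recursion; the sketch below uses scalar control values (one value per control point, as for a motion parameter over time) and is illustrative rather than taken from the project:

```python
# An illustrative evaluator for a one-dimensional NURBS curve: scalar control
# values (e.g. a motion parameter over time), one weight per control point.
# Uses the standard Cox-de Boor recursion; not taken from the project itself.

def basis(i, p, u, knots):
    """Cox-de Boor recursion for the i-th B-spline basis function of degree p.
    Uses the half-open convention, so u must lie inside the knot range."""
    if p == 0:
        return 1.0 if knots[i] <= u < knots[i + 1] else 0.0
    left = right = 0.0
    if knots[i + p] != knots[i]:
        left = (u - knots[i]) / (knots[i + p] - knots[i]) * basis(i, p - 1, u, knots)
    if knots[i + p + 1] != knots[i + 1]:
        right = ((knots[i + p + 1] - u) / (knots[i + p + 1] - knots[i + 1])
                 * basis(i + 1, p - 1, u, knots))
    return left + right

def nurbs_value(u, degree, knots, controls, weights):
    """Weighted rational combination of the basis functions."""
    num = den = 0.0
    for i, (c, w) in enumerate(zip(controls, weights)):
        b = basis(i, degree, u, knots) * w
        num += b * c
        den += b
    return num / den
```

    With all weights equal, the curve reduces to an ordinary B-spline; raising one weight pulls the curve toward that control value, which is the extra shape control the rational model provides.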

    View record details
  • Benchmarking attribute selection techniques for data mining

    Hall, Mark A.; Holmes, Geoffrey (2000-07)

    Working or discussion paper
    University of Waikato

    Data engineering is generally considered to be a central issue in the development of data mining applications. The success of many learning schemes, in their attempts to construct models of data, hinges on the reliable identification of a small set of highly predictive attributes. The inclusion of irrelevant, redundant and noisy attributes in the model building phase can result in poor predictive performance and increased computation. Attribute selection generally involves a combination of search and attribute utility estimation, plus evaluation with respect to specific learning schemes. This leads to a large number of possible permutations and has led to a situation where very few benchmark studies have been conducted. This paper presents a benchmark comparison of several attribute selection methods. All the methods produce an attribute ranking, a useful device for isolating the individual merit of an attribute. Attribute selection is achieved by cross-validating the rankings with respect to a learning scheme to find the best attributes. Results are reported for a selection of standard data sets and two learning schemes: C4.5 and naive Bayes.
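    The ranking step described above can be sketched with information gain as the attribute utility measure (one of several measures such benchmarks compare); the cross-validated choice of how many top-ranked attributes to keep is omitted, and the data below is a toy example:

```python
import math
from collections import Counter

# A sketch of the ranking step with information gain as the utility measure
# (one of several measures such benchmarks compare). The cross-validated
# selection of the best cut-off point is omitted. Toy data only.

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, attr_index):
    """Class entropy minus expected entropy after splitting on the attribute."""
    by_value = {}
    for row, y in zip(rows, labels):
        by_value.setdefault(row[attr_index], []).append(y)
    remainder = sum(len(ys) / len(labels) * entropy(ys)
                    for ys in by_value.values())
    return entropy(labels) - remainder

def rank_attributes(rows, labels):
    """Attribute indices ordered from most to least informative."""
    scores = [info_gain(rows, labels, i) for i in range(len(rows[0]))]
    return sorted(range(len(scores)), key=lambda i: -scores[i])
```

    Cross-validation then evaluates a learner on the top 1, 2, ... ranked attributes and keeps the prefix that performs best.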

    View record details
  • Feature selection for discrete and numeric class machine learning

    Hall, Mark A. (1999-04)

    Working or discussion paper
    University of Waikato

    Algorithms for feature selection fall into two broad categories: wrappers use the learning algorithm itself to evaluate the usefulness of features, while filters evaluate features according to heuristics based on general characteristics of the data. For application to large databases, filters have proven to be more practical than wrappers because they are much faster. However, most existing filter algorithms only work with discrete classification problems. This paper describes a fast, correlation-based filter algorithm that can be applied to continuous and discrete problems. Experiments using the new method as a preprocessing step for naïve Bayes, instance-based learning, decision trees, locally weighted regression, and model trees show it to be an effective feature selector - it reduces the dimensionality of the data by more than sixty percent in most cases without negatively affecting accuracy. Also, decision and model trees built from the pre-processed data are often significantly smaller.
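    The heuristic behind such correlation-based filters scores a feature subset highly when its features correlate with the class but not with each other. A common formulation of this merit is sketched below, with the correlation measures (e.g. symmetrical uncertainty) assumed to be computed elsewhere:

```python
import math

# The merit heuristic behind correlation-based filtering: a feature subset
# scores highly when its features correlate with the class but not with each
# other. The correlation measures (e.g. symmetrical uncertainty) are assumed
# to be computed elsewhere; this sketch takes them as inputs.

def cfs_merit(feature_class_corrs, feature_feature_corrs):
    """feature_class_corrs: |corr(f_i, class)| for each feature in the subset.
    feature_feature_corrs: |corr(f_i, f_j)| for each pair i < j."""
    k = len(feature_class_corrs)
    r_cf = sum(feature_class_corrs) / k
    r_ff = (sum(feature_feature_corrs) / len(feature_feature_corrs)
            if feature_feature_corrs else 0.0)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)
```

    Adding a feature that is redundant with those already chosen raises the inter-feature term in the denominator and lowers the merit, which is what discourages redundancy.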

    View record details
  • A survey of software requirements specification practices in the New Zealand software industry

    Groves, Lindsay; Nickson, Ray; Reeve, Greg; Reeves, Steve; Utting, Mark (1999-06)

    Working or discussion paper
    University of Waikato

    We report on the software development techniques used in the New Zealand software industry, paying particular attention to requirements gathering. We surveyed a selection of software companies with a general questionnaire and then conducted in-depth interviews with four companies. Our results show a wide variety in the kinds of companies undertaking software development, employing a wide range of software development techniques. Although our data are not sufficiently detailed to draw statistically significant conclusions, it appears that larger software development groups typically have more well-defined software development processes, spend proportionally more time on requirements gathering, and follow more rigorous testing regimes.

    View record details
  • Reduced-error pruning with significance tests

    Frank, Eibe; Witten, Ian H. (1999-06)

    Working or discussion paper
    University of Waikato

    When building classification models, it is common practice to prune them to counter spurious effects of the training data: this often improves performance and reduces model size. “Reduced-error pruning” is a fast pruning procedure for decision trees that is known to produce small and accurate trees. Apart from the data from which the tree is grown, it uses an independent “pruning” set, and pruning decisions are based on the model’s error rate on this fresh data. Recently it has been observed that reduced-error pruning overfits the pruning data, producing unnecessarily large decision trees. This paper investigates whether standard statistical significance tests can be used to counter this phenomenon. The problem of overfitting to the pruning set highlights the need for significance testing. We investigate two classes of test, “parametric” and “non-parametric.” The standard chi-squared statistic can be used both in a parametric test and as the basis for a non-parametric permutation test. In both cases it is necessary to select the significance level at which pruning is applied. We show empirically that both versions of the chi-squared test perform equally well if their significance levels are adjusted appropriately. Using a collection of standard datasets, we show that significance testing improves on standard reduced-error pruning if the significance level is tailored to the particular dataset at hand using cross-validation, yielding consistently smaller trees that perform at least as well and sometimes better.
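    The parametric test can be sketched as follows: compare each child's class counts against the counts expected from the parent's class distribution, and keep the split only if the chi-squared statistic exceeds the critical value for the chosen significance level. This is a standard formulation, not necessarily the paper's exact procedure:

```python
# A standard formulation of the parametric test (not necessarily the paper's
# exact procedure): compare each child's class counts against the counts
# expected from the parent's class distribution, and keep the split only if
# the chi-squared statistic exceeds the critical value for the chosen
# significance level.

def chi_squared(parent_counts, children_counts):
    """parent_counts: dict class -> count at the node being pruned.
    children_counts: one such dict per child of the candidate split."""
    n = sum(parent_counts.values())
    stat = 0.0
    for child in children_counts:
        m = sum(child.values())
        for cls, count in parent_counts.items():
            expected = count * m / n
            if expected > 0:
                observed = child.get(cls, 0)
                stat += (observed - expected) ** 2 / expected
    return stat

def keep_split(parent_counts, children_counts, critical_value=3.84):
    """3.84 is the 5% chi-squared critical value for one degree of freedom."""
    return chi_squared(parent_counts, children_counts) >= critical_value
```

    A split that merely mirrors the parent distribution yields a statistic near zero and is pruned; tuning the critical value per dataset is the cross-validated adjustment the paper studies.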

    View record details
  • High precision traffic measurement by the WAND research group

    Cleary, John G.; Graham, Ian; McGregor, Anthony James; Pearson, Murray W.; Siedins, Ilze; Curtis, James; Donnelly, Stephen; Martens, Jed; Martin, Stele (1999-12)

    Working or discussion paper
    University of Waikato

    Over recent years the size and capacity of the Internet has continued its exponential growth driven by new applications and improving network technology. These changes are particularly significant in the New Zealand context where the high costs of trans-Pacific traffic have mandated that traffic be charged for by volume. This has also led to a significant focus within the New Zealand Internet community on issues of caching and of careful planning for capacity. Approximately three years ago the WAND research group began a program to measure ATM traffic. We were sharply constrained by cost and decided to start by reprogramming some ATM NIC cards. This paper is largely based on our experience as we have broadened this work to include IP-based non-ATM networks and the construction of our own hardware. We have learned a number of lessons in this work, rediscovering along the way some of the hard discipline that all observation scientists must submit to.

    View record details
  • An entropy gain measure of numeric prediction performance

    Trigg, Leonard E. (1998-05)

    Working or discussion paper
    University of Waikato

    Categorical classifier performance is typically evaluated with respect to error rate, expressed as a percentage of test instances that were not correctly classified. When a classifier produces multiple classifications for a test instance, the prediction is counted as incorrect (even if the correct class was one of the predictions). Although commonly used in the literature, error rate is a coarse measure of classifier performance, as it is based only on a single prediction offered for a test instance. Since many classifiers can produce a class distribution as a prediction, we should use this to provide a better measure of how much information the classifier is extracting from the domain. Numeric classifiers are a relatively new development in machine learning, and as such there is no single performance measure that has become standard. Typically these machine learning schemes predict a single real number for each test instance, and the error between the predicted and actual value is used to calculate a myriad of performance measures such as correlation coefficient, root mean squared error, mean absolute error, relative absolute error, and root relative squared error. With so many performance measures it is difficult to establish an overall performance evaluation. The next section describes a performance measure for machine learning schemes that attempts to overcome the problems with current measures. In addition, the same evaluation measure is used for categorical and numeric classifiers.
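    The idea behind such a measure can be sketched as the information, in bits, that a probabilistic prediction gains over the prior class distribution, rather than a 0/1 error count. This is the standard formulation and may differ in detail from the paper's measure:

```python
import math

# The standard formulation of an entropy-based score (which may differ in
# detail from the paper's measure): the information, in bits, that a
# probabilistic prediction gains over the prior class distribution for one
# test instance.

def info_gained(predicted_dist, prior_dist, actual_class):
    """log2 P_model(actual) - log2 P_prior(actual); positive when the model
    assigns more probability to the true class than the prior does."""
    return (math.log2(predicted_dist[actual_class])
            - math.log2(prior_dist[actual_class]))
```

    Unlike error rate, this rewards a classifier for a confident correct distribution and penalises confident mistakes, summing naturally over a test set.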

    View record details
  • Naive Bayes for regression

    Frank, Eibe; Trigg, Leonard E.; Holmes, Geoffrey; Witten, Ian H. (1998-10)

    Working or discussion paper
    University of Waikato

    Despite its simplicity, the naïve Bayes learning scheme performs well on most classification tasks, and is often significantly more accurate than more sophisticated methods. Although the probability estimates that it produces can be inaccurate, it often assigns maximum probability to the correct class. This suggests that its good performance might be restricted to situations where the output is categorical. It is therefore interesting to see how it performs in domains where the predicted value is numeric, because in this case, predictions are more sensitive to inaccurate probability estimates. This paper shows how to apply the naïve Bayes methodology to numeric prediction (i.e. regression) tasks, and compares it to linear regression, instance-based learning, and a method that produces “model trees” - decision trees with linear regression functions at the leaves. Although we exhibit an artificial dataset for which naïve Bayes is the method of choice, on real-world datasets it is almost uniformly worse than model trees. The comparison with linear regression depends on the error measure: for one measure naïve Bayes performs similarly, for another it is worse. Compared to instance-based learning, it performs similarly with respect to both measures. These results indicate that the simplistic statistical assumption that naïve Bayes makes is indeed more restrictive for regression than for classification.

    View record details
  • Proceedings of the second computing women congress: Student Papers

    Hinze, Annika; Jung, Doris; Cunningham, Sally Jo (2006-02-11)

    Working or discussion paper
    University of Waikato

    The CWC 2006 Proceedings contains the following student papers:
    • Kathryn Hempstalk: Hiding Behind Corners: Using Edges in Images for Better Steganography
    • Supawan Prompramote, Kathy Blashki: Playing to Learn: Enhancing Educational Opportunities using Games Technology
    • Judy Bowen: Celebrity Death Match: Formal Methods vs. User-Centred Design
    • Liz Bryce: BECOMING INDIGENOUS: an impossible necessity
    • Tatiana King: Privacy Issues in Health Care and Security of Statistical Databases
    • Nilufar Baghaei: A Collaborative Constraint-based Intelligent System for Learning Object-Oriented Analysis and Design using UML
    • Sonja van Kerkhof: Alternatives to stereotypes: some thoughts on issues and an outline of one game

    View record details
  • The referendum incentive compatibility hypothesis: Some new results using information messages

    Stefani, Gianluca; Scarpa, Riccardo (2007-06)

    Working or discussion paper
    University of Waikato

    We report results from a laboratory experiment that allows us to test the incentive compatibility hypothesis of hypothetical referenda used in CV studies through the public or private provision of information messages. One of the main methodological issues about hypothetical markets is whether people behave differently when bidding for a public good through casting a ballot vote than when they are privately purchasing an equivalent good. This study addresses the core of this issue by using a good that can be traded both as private and public: information messages. This allows the elimination of confounding effects associated with the specific good employed. In our case information dispels some of the uncertainty about a potential gain from a gamble, so the approximate value of the message can be inferred once the individual measure of risk aversion is known. Decision tasks are then framed in a systematic manner according to the hypothetical vs real nature of the decision and the public vs private nature of the message. A sample of 536 university students across three countries (Italy, the UK and NZ) participated in this lab experiment. The chosen countries reflect diversity in exposure to the practice of advisory (NZ) and abrogative (Italy) referenda, with the UK not having any exposure to either. Under private provision the results show that the fraction of participants unwilling to buy information is slightly higher in the real treatment than in the hypothetical one. Under public provision, instead, there is no statistical difference between real and hypothetical settings, confirming in part the findings of previous researchers. A verbal protocol analysis of the thought processes during choice highlights that public provision of information systematically triggers concerns and motivations different from those arising under the private provision setting. These findings suggest that the incentive compatibility of public referenda is likely to rely more on affective and psychological factors than on the strategic behaviour assumptions theorised by economists.

    View record details
  • Consumer trust and willingness to pay for certified animal-friendly products

    Nocella, Giuseppe; Hubbard, Lionel; Scarpa, Riccardo (2007-05)

    Working or discussion paper
    University of Waikato

    Increasing animal welfare standards requires changes along the supply chain which involve several stakeholders: scientists, farmers and people involved in transportation and slaughtering. The majority of researchers agree that compliance with these standards increases costs along the livestock value chain, especially for monitoring and certifying animal-friendly products. Knowledge of consumer willingness to pay (WTP) in such a decision context is paramount to understanding the magnitude of market incentives necessary to compensate all involved stakeholders. The market outcome of certification programs is dependent on consumer trust. Particularly, there is a need to understand to what extent consumers believe that stakeholders operating in the animal-friendly supply chain will respect certification standards. We examine these issues using a contingent valuation survey administered in five economically dominant EU countries. The implied WTP estimates are found to be sensitive to robust measures of consumer trust for certified animal-friendly products. Significant differences across countries are discussed.

    View record details
  • Harnessing the private sector for rural development, poverty alleviation and HIV/AIDS prevention

    Lim, Steven; Cameron, Michael Patrick; Taweekul, Krailert; Askwith, John (2007-01)

    Working or discussion paper
    University of Waikato

    In resource-constrained developing countries, mobilizing resources from outside sources may assist in overcoming many development challenges. This paper examines the Thai Business Initiative in Rural Development (TBIRD), an NGO-sponsored program that brings together the comparative advantages and self-interest of rural villages, private sector firms and a facilitating NGO, to improve social and community health outcomes in rural areas. We analyze key issues in the program with data from Northeast Thailand. We find that the TBIRD program appears to improve the income earning and other prospects of the TBIRD factory workers. Further, TBIRD factory employment exhibits a pro-poor bias. A key impact is to provide jobs for people who might otherwise be at increased risk of HIV infection through poverty-induced decisions to migrate to urban centres and participate in the commercial sex industry. This program adds another important tool for development planners in the fight against HIV/AIDS.

    View record details
  • Intra-industry trade and trade intensities: Evidence from New Zealand

    Bano, Sayeeda (2002-10)

    Working or discussion paper
    University of Waikato

    This study analyses the development of intra-industry and inter-industry trade between New Zealand, Australia, and selected Asia-Pacific nations during the period 1990 to 2000. The study adopts two main approaches to examine these developments. First, an historical analysis of New Zealand trading patterns is presented. For this purpose, intra-industry trade development is examined. The Grubel-Lloyd and Aquino indices are used to calculate the intensity of intra-industry trade at the 3-digit SITC level to determine the relative importance of intra-industry trade as opposed to inter-industry trade. IIT has been estimated across industries and for selected trading partners. A time series approach is used to estimate any trend in the ratio of intra-industry trade to total trade in relation to Australia. Secondly, the paper examines the strength of trade relations between New Zealand and the other countries. For this purpose the intensity-of-trade index has been estimated for bilateral trade flows between these nations. These analyses consider how trade has changed in this period of trade liberalisation. The results show that intra-industry trade has increased between New Zealand and Australia. The results also suggest that bilateral trade flows between New Zealand, Australia and other countries have become more intense, indicating that trading relations are strengthening. In some cases bilateral trade flows have decreased. The results also suggest that the removal of trade barriers through bilateral and multilateral negotiations has positive impacts on intra-industry trade and the intensity of trade of these economies.

    View record details