by Ehsan Jahangiri; Erdem Yoruk; Rene Vidal; Laurent Younes; Donald Geman
Despite enormous progress in object detection and classification, the problem of incorporating expected contextual relationships among object instances into modern recognition systems remains a key challenge. In this work we propose Information Pursuit, a Bayesian framework for scene parsing that combines prior models for the geometry of the scene and the spatial arrangement of object instances with a data model for the output of high-level image classifiers trained to answer specific...
Topics: Machine Learning, Statistics, Artificial Intelligence, Computing Research Repository, Computer...
Source: http://arxiv.org/abs/1701.02343
The recently proposed "generalized min-max" (GMM) kernel can be efficiently linearized, with direct applications in large-scale statistical learning and fast near-neighbor search. The linearized GMM kernel has been extensively compared with the linearized radial basis function (RBF) kernel. On a large number of classification tasks, the tuning-free GMM kernel performs (surprisingly) well compared to the best-tuned RBF kernel. Nevertheless, one would naturally expect that the GMM kernel...
Topics: Learning, Machine Learning, Statistics, Computing Research Repository
Source: http://arxiv.org/abs/1701.02046
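As a concrete reference point, the plain (non-linearized) min-max kernel underlying the abstract above can be sketched in a few lines; the split into positive and negative parts follows the "generalized" construction for data that may contain negative values. This is an illustrative sketch, not the paper's linearization scheme.

```python
import numpy as np

def gmm_kernel(x, y):
    # Generalized min-max: split each coordinate into its positive and
    # negative parts so the min/max ratio is defined for signed data.
    u = np.concatenate([np.maximum(x, 0), np.maximum(-x, 0)])
    v = np.concatenate([np.maximum(y, 0), np.maximum(-y, 0)])
    return np.minimum(u, v).sum() / np.maximum(u, v).sum()
```

For identical inputs the kernel equals 1, and it decreases toward 0 as the vectors diverge.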
We consider a firm that sells a large number of products to its customers in an online fashion. Each product is described by a high-dimensional feature vector, and the market value of a product is assumed to be linear in the values of its features. Parameters of the valuation model are unknown and can change over time. The firm sequentially observes a product's features and can use the historical sales data (binary sale/no-sale feedback) to set the price of the current product, with the objective...
Topics: Learning, Machine Learning, Statistics, Computer Science and Game Theory, Computing Research...
Source: http://arxiv.org/abs/1701.03537

by M. A. Abd Elgawad; A. M. Elsawah; Hong Qin; Ting Yan
In many biological, agricultural, and military-activity problems, and in some quality-control problems, it is almost impossible to have a fixed sample size, because some observations are always lost for various reasons. Therefore, the sample size itself is frequently considered to be a random variable (rv). The class of limit distribution functions (df's) of the random bivariate extreme generalized order statistics (GOS) from independent and identically distributed rv's is fully characterized. When...
Topics: Statistics Theory, Statistics, Mathematics
Source: http://arxiv.org/abs/1701.04682

by Yifei Yan; Athanasios Kottas
We propose a new family of error distributions for model-based quantile regression, which is constructed through a structured mixture of normal distributions. The construction enables fixing specific percentiles of the distribution while, at the same time, allowing for varying mode, skewness and tail behavior. It thus overcomes the severe limitation of the asymmetric Laplace distribution, the most commonly used error model for parametric quantile regression, for which the skewness of the...
Topics: Statistics, Methodology
Source: http://arxiv.org/abs/1701.05666
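For context on the asymmetric Laplace error model mentioned above: maximizing an asymmetric-Laplace likelihood is equivalent, up to constants, to minimizing the check (pinball) loss, whose minimizer is the requested quantile. A minimal sketch:

```python
import numpy as np

def pinball_loss(y, q, tau):
    # Check (pinball) loss: minimized over q by the tau-th sample quantile;
    # equivalent, up to constants, to an asymmetric-Laplace log-likelihood.
    u = y - q
    return np.mean(np.maximum(tau * u, (tau - 1) * u))
```

At tau = 0.5 this is half the mean absolute error, whose minimizer is the median.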

by Sylvie Huet; Marie-Luce Taupin
We propose to estimate a metamodel and the sensitivity indices of a complex model m in the Gaussian regression framework. Our approach combines methods for sensitivity analysis of complex models with statistical tools for sparse nonparametric estimation in the multivariate Gaussian regression model. It rests on the construction of a metamodel for approximating the Hoeffding-Sobol decomposition of m. This metamodel belongs to a reproducing kernel Hilbert space constructed as a direct sum of Hilbert...
Topics: Statistics Theory, Statistics, Mathematics
Source: http://arxiv.org/abs/1701.04671

by Amirhossein Javaheri; Hadi Zayyani; Farokh Marvasti
This paper investigates the problem of recovering missing samples using methods based on sparse representation, adapted especially for image signals. Instead of the $l_2$-norm or Mean Square Error (MSE), a new perceptual quality measure is used as the similarity criterion between the original and the reconstructed images. The proposed metric, called the Convex SIMilarity (CSIM) index, is a modified version of the Structural SIMilarity (SSIM) index which, unlike its predecessor, is convex and unimodal. We...
Topics: Learning, Machine Learning, Statistics, Computing Research Repository
Source: http://arxiv.org/abs/1701.07422
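For orientation, the structural-similarity family the abstract builds on has the following single-window (global) form; the constants c1 and c2 are the usual small stabilizers. The paper's CSIM index modifies this structure to obtain a convex, unimodal criterion, which this sketch does not attempt.

```python
import numpy as np

def global_ssim(x, y, c1=1e-4, c2=9e-4):
    # Single-window SSIM between two signals scaled to [0, 1]:
    # a luminance term times a combined contrast/structure term.
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cxy = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cxy + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Identical signals score exactly 1; anticorrelated signals score lower, which is what makes SSIM a similarity rather than a distance.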

by Noam Shazeer; Azalia Mirhoseini; Krzysztof Maziarz; Andy Davis; Quoc Le; Geoffrey Hinton; Jeff Dean
The capacity of a neural network to absorb information is limited by its number of parameters. Conditional computation, where parts of the network are active on a per-example basis, has been proposed in theory as a way of dramatically increasing model capacity without a proportional increase in computation. In practice, however, there are significant algorithmic and performance challenges. In this work, we address these challenges and finally realize the promise of conditional computation,...
Topics: Learning, Computing Research Repository, Machine Learning, Computation and Language, Neural and...
Source: http://arxiv.org/abs/1701.06538
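The conditional-computation idea above is realized in the paper via a sparsely gated mixture-of-experts layer. A stripped-down, noise-free sketch of top-k gating (hypothetical shapes, no load-balancing terms):

```python
import numpy as np

def topk_gate(h, Wg, k=2):
    # Route an input h to k experts: keep the k largest gating logits,
    # softmax over just those, leave every other expert's gate at zero.
    logits = h @ Wg
    top = np.argsort(logits)[-k:]
    gate = np.zeros_like(logits)
    e = np.exp(logits[top] - logits[top].max())
    gate[top] = e / e.sum()
    return gate
```

Only experts with nonzero gates are evaluated, so compute grows with k rather than with the total number of experts.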

by Francisco Macedo; M. Rosário Oliveira; António Pacheco; Rui Valadas
Feature selection problems arise in a variety of applications, such as microarray analysis, clinical prediction, text categorization, image classification and face recognition, multi-label learning, and classification of internet traffic. Among the various classes of methods, forward feature selection methods based on mutual information have become very popular and are widely used in practice. However, comparative evaluations of these methods have been limited by being based on specific...
Topics: Learning, Machine Learning, Statistics, Computing Research Repository
Source: http://arxiv.org/abs/1701.07761
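The simplest member of the family surveyed above ranks features by their marginal mutual information with the label (often called MIM); richer criteria such as mRMR add redundancy penalties. A self-contained sketch for discrete features:

```python
import numpy as np

def mutual_info(x, y):
    # Empirical mutual information between two discrete sequences.
    mi = 0.0
    for a in np.unique(x):
        for b in np.unique(y):
            pxy = np.mean((x == a) & (y == b))
            if pxy > 0:
                mi += pxy * np.log(pxy / (np.mean(x == a) * np.mean(y == b)))
    return mi

def forward_select(features, y, k):
    # MIM: rank features by I(X_j; y) and keep the top k indices.
    scores = [mutual_info(f, y) for f in features]
    return list(np.argsort(scores)[::-1][:k])
```

A feature identical to the label maximizes the score, while a feature independent of the label scores (near) zero.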

by Qiuyi Zhang; Rina Panigrahy; Sushant Sachdeva
We study the efficacy of learning neural networks with neural networks by the (stochastic) gradient descent method. While gradient descent enjoys empirical success in a variety of applications, there is a lack of theoretical guarantees that explain the practical utility of deep learning. We focus on two-layer neural networks with a linear activation on the output node. We show that under some mild assumptions and certain classes of activation functions, gradient descent does learn the...
Topics: Learning, Physics, Data Analysis, Statistics and Probability, Data Structures and Algorithms,...
Source: http://arxiv.org/abs/1702.00458
Stochastic Gradient Descent (SGD) is widely used in machine learning problems to efficiently perform empirical risk minimization, yet, in practice, SGD is known to stall before reaching the actual minimizer of the empirical risk. SGD stalling has often been attributed to its sensitivity to the conditioning of the problem; however, as we demonstrate, SGD will stall even when applied to a simple linear regression problem with unity condition number for standard learning rates. Thus, in this work,...
Topics: Learning, Optimization and Control, Computing Research Repository, Machine Learning, Computation,...
Source: http://arxiv.org/abs/1702.00317
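The stalling phenomenon described above is easy to reproduce: with a constant learning rate, plain SGD on least squares converges only to a noise ball around the minimizer rather than to the minimizer itself. A minimal sketch with hypothetical synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 1000, 5
X = rng.normal(size=(n, d))
beta = np.arange(1.0, d + 1)                 # hypothetical true coefficients
y = X @ beta + 0.1 * rng.normal(size=n)

w = np.zeros(d)
lr = 0.01                                    # constant "standard" learning rate
for _ in range(20000):
    i = rng.integers(n)
    g = (X[i] @ w - y[i]) * X[i]             # single-sample gradient of squared error
    w -= lr * g
# w ends up near beta but keeps hovering in a noise ball around it; the
# residual radius shrinks with the learning rate, not with more steps.
```

Averaging iterates or decaying the learning rate are the classical remedies for this stall.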

by Jose E. Figueroa-Lopez; K. Lee
High-frequency-based estimation methods for a semiparametric pure-jump subordinated Brownian motion exposed to a small additive microstructure noise are developed, building on the two-scale realized variations approach originally developed by Zhang et al. (2005) for the estimation of the integrated variance of a continuous Itô process. The proposed estimators are shown to be robust against the noise and, surprisingly, to attain better rates of convergence than their precursors, method of...
Topics: Quantitative Finance, Statistics Theory, Statistical Finance, Statistics, Mathematics
Source: http://arxiv.org/abs/1702.01164
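For reference, the two-scale realized variance of Zhang et al. (2005) that the abstract builds on averages subsampled realized variances and bias-corrects with the full-grid realized variance. A sketch under simplifying assumptions (uniform subsampling, no edge adjustments):

```python
import numpy as np

def tsrv(prices, K=5):
    # Two-scale realized variance: the fast (full-grid) scale is dominated
    # by microstructure noise, the K slow subgrids carry signal plus less
    # noise; the weighted difference cancels the noise bias.
    n = len(prices) - 1
    rv_all = np.sum(np.diff(prices) ** 2)
    rv_sub = np.mean([np.sum(np.diff(prices[k::K]) ** 2) for k in range(K)])
    nbar = (n - K + 1) / K
    return rv_sub - (nbar / n) * rv_all
```

On pure microstructure noise the estimator is close to zero (the true integrated variance), whereas the naive realized variance grows with the sample size.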

by Joaquin Miguez; Ines P. Mariño; Manuel A. Vazquez
The Bayesian estimation of the unknown parameters of state-space (dynamical) systems has received considerable attention over the past decade, with a handful of powerful algorithms being introduced. In this paper we tackle the theoretical analysis of the recently proposed {\it nonlinear} population Monte Carlo (NPMC) algorithm. This is an iterative importance sampling scheme whose key features, compared to conventional importance samplers, are (i) the approximate computation of the importance weights...
Topics: Computation, Statistics
Source: http://arxiv.org/abs/1702.03146

by Giuseppe Cerati; Peter Elmer; Slava Krutelyov; Steven Lantz; Matthieu Lefebvre; Kevin McDermott; Daniel Riley; Matevž Tadel; Peter Wittich; Frank Würthwein; Avi Yagil
Limits on power dissipation have pushed CPUs to grow in parallel processing capabilities rather than clock rate, leading to the rise of "many-core" or GPU-like processors. In order to achieve the best performance, applications must be able to take full advantage of vector units across multiple cores, or some analogous arrangement on an accelerator card. Such parallel performance is becoming a critical requirement for methods to reconstruct the tracks of charged particles at the Large...
Topics: Physics, High Energy Physics  Experiment, Data Analysis, Statistics and Probability,...
Source: http://arxiv.org/abs/1702.06359

by Silvia Chiappa; Sébastien Racaniere; Daan Wierstra; Shakir Mohamed
Models that can simulate how environments change in response to actions can be used by agents to plan and act efficiently. We improve on previous environment simulators from high-dimensional pixel observations by introducing recurrent neural networks that are able to make temporally and spatially coherent predictions for hundreds of time-steps into the future. We present an in-depth analysis of the factors affecting performance, providing the most extensive attempt to advance the understanding...
Topics: Learning, Machine Learning, Statistics, Artificial Intelligence, Computing Research Repository
Source: http://arxiv.org/abs/1704.02254

by Soraia Pereira; Feridun Turkman; Luis Correia
This study aims to analyze the methodologies that can be used to estimate the total number of unemployed, as well as the unemployment rates, for the 28 regions of Portugal designated as NUTS III regions, using model-based approaches as compared to the direct estimation methods currently employed by INE (the National Statistical Institute of Portugal). Model-based methods, often known as small area estimation methods (Rao, 2003), "borrow strength" from neighbouring regions and, in doing so, aim...
Topics: Statistics, Applications
Source: http://arxiv.org/abs/1704.05767

by Will Wei Sun; Guang Cheng; Yufeng Liu
Stability is an important aspect of a classification procedure because unstable predictions can potentially reduce users' trust in a classification system and also harm the reproducibility of scientific conclusions. The major goal of our work is to introduce a novel concept of classification instability, i.e., decision boundary instability (DBI), and incorporate it with the generalization error (GE) as a standard for selecting the most accurate and stable classifier. Specifically, we implement...
Topics: Machine Learning, Statistics
Source: http://arxiv.org/abs/1701.05672

by Chaofei Yang; Qing Wu; Hai Li; Yiran Chen
Poisoning attacks are identified as a severe security threat to machine learning algorithms. In many applications, for example, deep neural network (DNN) models collect public data as the inputs to perform retraining, where the input data can be poisoned. Although poisoning attacks against support vector machines (SVMs) have been extensively studied before, there is still very limited knowledge about how such attacks can be implemented on neural networks (NNs), especially DNNs. In this work, we first...
Topics: Learning, Cryptography and Security, Machine Learning, Statistics, Computing Research Repository
Source: http://arxiv.org/abs/1703.01340

by Yuncheng Li; Jianchao Yang; Yale Song; Liangliang Cao; Jiebo Luo; Li-Jia Li
The ability to learn from noisy labels is very useful in many visual recognition tasks, as a vast amount of data with noisy labels is relatively easy to obtain. Traditionally, noisy labels have been treated as statistical outliers, and approaches such as importance re-weighting and bootstrapping have been proposed to alleviate the problem. According to our observation, real-world noisy labels exhibit multi-mode characteristics, as the true labels do, rather than behaving like independent...
Topics: Learning, Machine Learning, Statistics, Computing Research Repository, Computer Vision and Pattern...
Source: http://arxiv.org/abs/1703.02391
We consider the sparse highdimensional linear regression model $Y=Xb+\epsilon$ where $b$ is a sparse vector. For the Bayesian approach to this problem, many authors have considered the behavior of the posterior distribution when, in truth, $Y=X\beta+\epsilon$ for some given $\beta$. There have been numerous results about the rate at which the posterior distribution concentrates around $\beta$, but few results about the shape of that posterior distribution. We propose a prior distribution for...
Topics: Statistics Theory, Statistics, Mathematics
Source: http://arxiv.org/abs/1704.02646

by Max Yi Ren; Clayton Scott
We consider the problem of identifying the most profitable product design from a finite set of candidates under unknown consumer preference. A standard approach to this problem follows a two-step strategy: first, estimate the preference of the consumer population, represented as a point in part-worth space, using an adaptive discrete-choice questionnaire; second, integrate the estimated part-worth vector with engineering feasibility and cost models to determine the optimal design. In this work,...
Topics: Information Retrieval, Machine Learning, Statistics, Computing Research Repository
Source: http://arxiv.org/abs/1701.01231

by Marwin H. S. Segler; Thierry Kogej; Christian Tyrchan; Mark P. Waller
In de novo drug design, computational strategies are used to generate novel molecules with good affinity to the desired biological target. In this work, we show that recurrent neural networks can be trained as generative models for molecular structures, similar to statistical language models in natural language processing. We demonstrate that the properties of the generated molecules correlate very well with the properties of the molecules used to train the model. In order to enrich libraries...
Topics: Physics, Learning, Computing Research Repository, Chemical Physics, Machine Learning, Neural and...
Source: http://arxiv.org/abs/1701.01329

by Thea Bjørnland; Anja Bye; Einar Ryeng; Ulrik Wisløff; Mette Langaas
Extreme phenotype sampling is a selective genotyping design for genetic association studies where only individuals with extreme values of a continuous trait are genotyped for a set of genetic variants. Under financial or other limitations, this design is assumed to improve the power to detect associations between genetic variants and the trait, compared to randomly selecting the same number of individuals for genotyping. Here we present extensions of likelihood models that can be used for...
Topics: Statistics, Applications
Source: http://arxiv.org/abs/1701.01286

by Gregory S. Ledva; Laura Balzano; Johanna L. Mathieu
Though distribution system operators have been adding more sensors to their networks, they still often lack an accurate real-time picture of the behavior of distributed energy resources such as demand-responsive electric loads and residential solar generation. Such information could improve system reliability, economic efficiency, and environmental impact. Rather than installing additional, costly sensing and communication infrastructure to obtain additional real-time information, it may be...
Topics: Machine Learning, Statistics, Optimization and Control, Mathematics
Source: http://arxiv.org/abs/1701.04389

by Qian Lin; Xinran Li; Dongming Huang; Jun S. Liu
The central subspace of a pair of random variables $(y,x) \in \mathbb{R}^{p+1}$ is the minimal subspace $\mathcal{S}$ such that $y \perp\!\!\!\perp x \mid P_{\mathcal{S}}x$. In this paper, we consider the minimax rate of estimating the central subspace of the multiple index models $y=f(\beta_{1}^{\tau}x,\beta_{2}^{\tau}x,\ldots,\beta_{d}^{\tau}x,\epsilon)$ with at most $s$ active predictors, where $x \sim N(0,I_{p})$. We first introduce a large class of models depending on the smallest...
Topics: Statistics Theory, Statistics, Mathematics
Source: http://arxiv.org/abs/1701.06009

by Stephen A. Collins-Elliott
Methods of measuring differentiation in archaeological assemblages have long been based on attribute-level analyses of assemblages. This paper considers a method of comparing assemblages as probability distributions via the Hellinger distance, as calculated through a Dirichlet-categorical model of inference using Monte Carlo methods of approximation. This method has application within practice-theory traditions of archaeology, an approach which seeks to measure and associate different factors...
Topics: Statistics, Applications
Source: http://arxiv.org/abs/1701.06720
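The Hellinger distance between two discrete (categorical) distributions, the comparison used above, together with Dirichlet-categorical posterior draws, can be sketched as follows; the count vectors are hypothetical.

```python
import numpy as np

def hellinger(p, q):
    # Hellinger distance between probability vectors: 0 = identical, 1 = disjoint.
    p, q = np.asarray(p, float), np.asarray(q, float)
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))

# Posterior uncertainty via the Dirichlet-categorical model: with a flat
# Dirichlet(1,...,1) prior, the posterior over category probabilities is
# Dirichlet(counts + 1), so the distance becomes a distribution of draws.
rng = np.random.default_rng(1)
counts_a = np.array([30, 10, 5])   # hypothetical assemblage counts
counts_b = np.array([12, 20, 9])
draws = [hellinger(rng.dirichlet(counts_a + 1), rng.dirichlet(counts_b + 1))
         for _ in range(1000)]
```

Summarizing `draws` (mean, credible interval) quantifies how different the two assemblages are, with uncertainty from the finite counts.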

by Alexander R. Luedtke; Peter B. Gilbert
Suppose one has data from one or more completed vaccine efficacy trials and wishes to estimate the efficacy in a new setting. Often logistical or ethical considerations make running another efficacy trial impossible. Fortunately, if there is a biomarker that is the primary modifier of efficacy, then the biomarker-conditional efficacy may be identical in the completed trials and the new setting, or at least informative enough to meaningfully bound this quantity. Given a sample of this biomarker...
Topics: Statistics, Methodology
Source: http://arxiv.org/abs/1701.06739

by Lingfei Wang; Tom Michoel
P-values are being computed for increasingly complicated statistics, but evaluations of their quality are lacking. Meanwhile, accurate p-values enable significance comparison across batches of hypothesis tests and, consequently, unified false discovery rate (FDR) control. This article discusses two related questions in this setting. First, we propose statistical tests to evaluate the quality of p-values and the cross-batch comparability of any other statistic. Second, we propose a lasso-based variable...
Topics: Molecular Networks, Quantitative Methods, Quantitative Biology, Applications, Statistics,...
Source: http://arxiv.org/abs/1701.07011

by Gery Geenens; Thomas Cuddihy
In international football (soccer), two-legged knockout ties, with each team playing at home in one leg and the final outcome decided on aggregate, are common. Many players, managers and followers seem to believe in the `second-leg home advantage', i.e. that it is beneficial to play at home in the second leg. A more complex effect than the usual and well-established home advantage, it is harder to identify, and previous statistical studies have not proved conclusive about its existence. Yet,...
Topics: Statistics, Applications, Methodology
Source: http://arxiv.org/abs/1701.07555

by Vincent Audigier; Ian R. White; Shahab Jolani; Thomas P. A. Debray; Matteo Quartagno; James Carpenter; Stef van Buuren; Matthieu Resche-Rigon
We present and compare multiple imputation methods for multilevel continuous and binary data where variables are systematically and sporadically missing. We particularly focus on three recent approaches: the joint modelling approach of Quartagno and Carpenter (2016a) and the fully conditional specification approaches of Jolani et al. (2015) and Resche-Rigon and White (2016). The methods are compared from a theoretical point of view and through an extensive simulation study motivated by a real...
Topics: Statistics, Methodology
Source: http://arxiv.org/abs/1702.00971

by Anne M. Presanis; David Ohlssen; Kai Cui; Magdalena Rosinska; Daniela De Angelis
Evidence synthesis models that combine multiple datasets of varying design, to estimate quantities that cannot be directly observed, require the formulation of complex probabilistic models that can be expressed as graphical models. An assessment of whether the different datasets synthesised contribute information that is consistent with each other (and in a Bayesian context, with the prior distribution) is a crucial component of the model criticism process. However, a systematic assessment of...
Topics: Statistics, Methodology
Source: http://arxiv.org/abs/1702.07304

by Muhammad Farooq; Ingo Steinwart
Conditional expectiles are becoming an increasingly important tool in finance as well as in other areas of application. We analyse a support vector machine type approach for estimating conditional expectiles and establish learning rates that are minimax optimal modulo a logarithmic factor if Gaussian RBF kernels are used and the desired expectile is smooth in a Besov sense. As a special case, our learning rates improve the best known rates for kernel-based least squares regression in this...
Topics: Learning, Machine Learning, Statistics, Computing Research Repository
Source: http://arxiv.org/abs/1702.07552

by Swayambhoo Jain; Akshay Soni; Nikolay Laptev; Yashar Mehdad
For many internet businesses, presenting a given list of items in an order that maximizes a certain metric of interest (e.g., click-through rate, average engagement time, etc.) is crucial. We approach the aforementioned task from a learning-to-rank perspective which reveals a new problem setup. In the traditional learning-to-rank literature, it is implicitly assumed that during training data generation one has access to the \emph{best or desired} order for the given list of items. In this work,...
Topics: Learning, Machine Learning, Statistics, Computing Research Repository
Source: http://arxiv.org/abs/1702.07798

by Tomer Galanti; Lior Wolf
When learning a mapping from an input space to an output space, the assumption that the sample distribution of the training data is the same as that of the test data is often violated. Unsupervised domain shift methods adapt the learned function in order to correct for this shift. Previous work has focused on utilizing unlabeled samples from the target distribution. We consider the complementary problem in which the unlabeled samples are given post mapping, i.e., we are given the outputs of the...
Topics: Learning, Machine Learning, Statistics, Computing Research Repository
Source: http://arxiv.org/abs/1703.01606

by Constantin Grigo; Phaedon-Stelios Koutsourelakis
We discuss a Bayesian formulation of coarse-graining (CG) of PDEs where the coefficients (e.g. material parameters) exhibit random, fine-scale variability. The direct solution of such problems requires grids that are fine enough to resolve this fine-scale variability, which unavoidably requires the repeated solution of very large systems of algebraic equations. We establish a physically inspired, data-driven coarse-grained model which learns a low-dimensional set of microstructural features...
Topics: Machine Learning, Statistics
Source: http://arxiv.org/abs/1703.01962

by Daniele Ramazzotti; Marco S. Nobile; Paolo Cazzaniga; Giancarlo Mauri; Marco Antoniotti
The emergence and development of cancer is a consequence of the accumulation over time of genomic mutations involving a specific set of genes, which provides the cancer clones with a functional selective advantage. In this work, we model the order of accumulation of such mutations during the progression, which eventually leads to the disease, by means of probabilistic graphical models, i.e., Bayesian Networks (BNs). We investigate how to perform the task of learning the structure of such BNs,...
Topics: Learning, Machine Learning, Statistics, Computing Research Repository
Source: http://arxiv.org/abs/1703.03038

by Ashish Bora; Ajil Jalal; Eric Price; Alexandros G. Dimakis
The goal of compressed sensing is to estimate a vector from an underdetermined system of noisy linear measurements, by making use of prior knowledge on the structure of vectors in the relevant domain. For almost all results in this literature, the structure is represented by sparsity in a well-chosen basis. We show how to achieve guarantees similar to standard compressed sensing but without employing sparsity at all. Instead, we suppose that vectors lie near the range of a generative model $G:...
Topics: Learning, Computing Research Repository, Machine Learning, Information Theory, Statistics,...
Source: http://arxiv.org/abs/1703.03208
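The recovery principle above, searching the latent space of a generative model G for a point whose image matches the measurements, can be sketched with a fixed random linear map standing in for a trained generator (the paper uses neural networks; all data below is hypothetical and for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 100, 25, 5                       # ambient dim, measurements, latent dim
G = rng.normal(size=(n, k))                # stand-in "generator": latent z -> signal
A = rng.normal(size=(m, n)) / np.sqrt(m)   # underdetermined measurement matrix
z_true = rng.normal(size=k)
y = A @ (G @ z_true)                       # noiseless linear measurements

z = np.zeros(k)
for _ in range(6000):                      # gradient descent on ||A G z - y||^2
    z -= 1e-3 * (G.T @ (A.T @ (A @ (G @ z) - y)))
x_hat = G @ z                              # recovered signal, despite m << n
```

With m = 25 measurements of an n = 100 dimensional signal, recovery is possible because the unknown lies in the k = 5 dimensional range of G.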

by Fani Tsapeli; Peter Tino; Mirco Musolesi
The abundance of data produced daily from a large variety of sources has boosted the need for novel approaches to causal inference analysis from observational data. Observational data often contain noisy or missing entries. Moreover, causal inference studies may require unobserved high-level information which needs to be inferred from other observed attributes. In such cases, inaccuracies of the applied inference methods will result in noisy outputs. In this study, we propose a novel approach for...
Topics: Computation, Statistics, Machine Learning, Methodology
Source: http://arxiv.org/abs/1703.04334

by Mark McLeod; Michael A. Osborne; Stephen J. Roberts
We propose a novel Bayesian Optimization approach for black-box functions with an environmental variable whose value determines the trade-off between evaluation cost and the fidelity of the evaluations. Further, we use a novel approach to sampling support points, allowing faster construction of the acquisition function. This allows us to achieve optimization with lower overheads than previous approaches and is implemented for a more general class of problems. We show this approach to be effective...
Topics: Machine Learning, Statistics
Source: http://arxiv.org/abs/1703.04335

by Satu Helske; Jouni Helske
Sequence analysis is being more and more widely used for the analysis of social sequences and other multivariate categorical time series data. However, it is often complex to describe, visualize, and compare large sequence data, especially when there are multiple parallel sequences per subject. Hidden (latent) Markov models (HMMs) are able to detect underlying latent structures and they can be used in various longitudinal settings: to account for measurement error, to detect unobservable...
Topics: Computation, Statistics, Applications
Source: http://arxiv.org/abs/1704.00543

by Haotian Pang; Tuo Zhao; Robert Vanderbei; Han Liu
High-dimensional sparse learning poses a great computational challenge for large-scale data analysis. In this paper, we are interested in a broad class of sparse learning approaches formulated as linear programs parametrized by a {\em regularization factor}, and solve them by the parametric simplex method (PSM). Our parametric simplex method offers significant advantages over other competing methods: (1) PSM naturally obtains the complete solution path for all values of the regularization...
Topics: Learning, Optimization and Control, Computing Research Repository, Machine Learning, Statistics,...
Source: http://arxiv.org/abs/1704.01079

by François Roueff; Rainer Von Sachs
Locally stationary Hawkes processes have been introduced in order to generalise classical Hawkes processes away from stationarity by allowing for a time-varying second-order structure. This class of self-exciting point processes has recently attracted a lot of interest in applications in the life sciences (seismology, genomics, neuroscience, ...), but also in the modelling of high-frequency financial data. In this contribution we provide a fully developed nonparametric estimation theory of both...
Topics: Statistics Theory, Statistics, Mathematics
Source: http://arxiv.org/abs/1704.01437

by Iván Díaz; Mark J. van der Laan
Missing outcome data is one of the principal threats to the validity of treatment effect estimates from randomized trials. The outcome distributions of participants with missing and observed data are often different, which increases the risk of bias. Causal inference methods may aid in reducing the bias and improving efficiency by incorporating baseline variables into the analysis. In particular, doubly robust estimators incorporate estimates of two nuisance parameters: the outcome regression...
Topics: Statistics, Methodology
Source: http://arxiv.org/abs/1704.01538
In statistics education, the concept of population is widely felt to be hard to grasp, as a result of vague explanations in textbooks. Some textbook authors have therefore chosen not to mention it. This paper offers a new explanation by proposing a new theoretical framework of population and sampling, which aims to achieve high mathematical sensibleness. In the explanation, the term population is given a clear definition, and the relationship between simple random sampling and iid random variables is...
Topics: Other Statistics, Statistics
Source: http://arxiv.org/abs/1704.01732

by Isabel Schlangen; Emmanuel D. Delande; Jeremie Houssineau; Daniel E. Clark
The Probability Hypothesis Density (PHD) and Cardinalized PHD (CPHD) filters are popular solutions to the multi-target tracking problem due to their low complexity and ability to estimate the number and states of targets in cluttered environments. The PHD filter propagates the first-order moment (i.e. mean) of the number of targets, while the CPHD propagates the cardinality distribution of the number of targets, albeit at a greater computational cost. Introducing the Panjer point process, this...
Topics: Statistics, Methodology
Source: http://arxiv.org/abs/1704.02084

by Maria Schuld; Francesco Petruccione
Quantum machine learning witnesses an increasing number of quantum algorithms for data-driven decision making, a problem with potential applications ranging from automated image recognition to medical diagnosis. Many of those algorithms are implementations of quantum classifiers, or models for the classification of data inputs with a quantum computer. Following the success of collective decision making with ensembles in classical machine learning, this paper introduces the concept of quantum...
Topics: Learning, Statistics Theory, Computing Research Repository, Statistics, Quantum Physics, Mathematics
Source: http://arxiv.org/abs/1704.02146
4
4.0



by
Hamza Dhaker; Papa Ngom; Malick Mbodj
texts
eye 4
favorite 0
comment 0
This article is devoted to the study of overlap measures for the densities of two exponential populations. Various overlapping coefficients are considered, namely Matusita's measure $\rho$, Morisita's measure $\lambda$, and Weitzman's measure $\Delta$. A new overlap measure $\Lambda$ based on the Kullback-Leibler measure is proposed. The invariance property and a method of statistical inference for these coefficients are also presented. Taylor series approximations are used to construct confidence intervals for the...
Topics: Statistics, Methodology
Source: http://arxiv.org/abs/1704.02671
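As a concrete illustration of the three classical coefficients named in this abstract, the sketch below evaluates Matusita's $\rho = \int \sqrt{fg}$, Morisita's $\lambda = 2\int fg / \int (f^2+g^2)$, and Weitzman's $\Delta = \int \min(f,g)$ for two exponential densities by numerical quadrature. This is not the paper's code; the rates, grid, and function names are illustrative assumptions.

```python
import numpy as np

def _integrate(y, x):
    # Trapezoidal rule on a uniform grid.
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def overlap_measures(rate1, rate2, grid_max=60.0, n=60001):
    # Densities f, g of two exponential populations on [0, grid_max].
    x = np.linspace(0.0, grid_max, n)
    f = rate1 * np.exp(-rate1 * x)
    g = rate2 * np.exp(-rate2 * x)
    rho = _integrate(np.sqrt(f * g), x)                          # Matusita
    lam = 2 * _integrate(f * g, x) / _integrate(f**2 + g**2, x)  # Morisita
    delta = _integrate(np.minimum(f, g), x)                      # Weitzman
    return rho, lam, delta

# Identical populations overlap completely: all three coefficients equal 1.
print(overlap_measures(1.0, 1.0))
```

Weitzman's $\Delta$ has the most direct reading: it is the area under the pointwise minimum of the two densities, so it lies in $[0,1]$ and equals 1 only when the densities coincide.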
4
4.0



by
Esmaeil Bashkar; Hamzeh Torabi; Ali Dolati; Felix Belzunce
texts
eye 4
favorite 0
comment 0
In this paper, we use a new partial order, called the f-majorization order. The new order includes as special cases the majorization, the reciprocal majorization, and the p-larger orders. We provide a comprehensive account of the mathematical properties of the f-majorization order and give applications of this order in the context of stochastic comparison for extreme order statistics of independent samples following the Fréchet distribution and scale model. We discuss stochastic comparisons of...
Topics: Statistics Theory, Statistics, Mathematics
Source: http://arxiv.org/abs/1704.03656
4
4.0



by
Gianluca Mastrantonio; Giovanna Jona Lasinio; Alan E. Gelfand
texts
eye 4
favorite 0
comment 0
Circular data arise in many areas of application. Recently, there has been interest in looking at circular data collected separately over time and over space. Here, we extend some of this work to the spatio-temporal setting, introducing space-time dependence. We accommodate covariates, implement full kriging and forecasting, and also allow for a nugget which can be time dependent. We work within a Bayesian framework, introducing suitable latent variables to facilitate Markov chain Monte Carlo...
Topics: Statistics, Methodology
Source: http://arxiv.org/abs/1704.05029
3
3.0



by
M. Dolores RuizMedina; J. ÁlvarezLiébana
texts
eye 3
favorite 0
comment 0
A special class of standard Gaussian Autoregressive Hilbertian processes of order one (Gaussian ARH(1) processes), with bounded linear autocorrelation operator, which does not satisfy the usual Hilbert-Schmidt assumption, is considered. To compensate for the slow decay of the diagonal coefficients of the autocorrelation operator, a faster decay velocity of the eigenvalues of the trace autocovariance operator of the innovation process is assumed. As usual, the eigenvectors of the autocovariance...
Topics: Other Statistics, Statistics Theory, Statistics, Applications, Mathematics
Source: http://arxiv.org/abs/1704.05630
3
3.0



by
Tao Wu; David Gleich
texts
eye 3
favorite 0
comment 0
Users form information trails as they browse the web, check in with a geolocation, rate items, or consume media. A common problem is to predict what a user might do next for the purposes of guidance, recommendation, or prefetching. First-order and higher-order Markov chains have been widely used to study such sequences of data. First-order Markov chains are easy to estimate, but lack accuracy when history matters. Higher-order Markov chains, in contrast, have too many parameters and...
Topics: Learning, Machine Learning, Statistics, Computing Research Repository, Social and Information...
Source: http://arxiv.org/abs/1704.05982
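For concreteness, the first-order model this abstract contrasts against can be estimated by simple transition counting. This is a generic maximum-likelihood sketch under assumed names (the paper's contribution concerns richer higher-order structure, not this baseline):

```python
from collections import defaultdict

def fit_first_order_chain(trails):
    """Maximum-likelihood transition probabilities from observed trails.

    trails: iterable of sequences of hashable states (e.g. page IDs).
    Returns a dict mapping state -> {next_state: probability}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for trail in trails:
        for cur, nxt in zip(trail, trail[1:]):
            counts[cur][nxt] += 1
    return {
        s: {t: c / sum(nxts.values()) for t, c in nxts.items()}
        for s, nxts in counts.items()
    }

P = fit_first_order_chain([["a", "b", "c"], ["a", "b", "a"]])
# From "a" we always observed "b"; from "b", "c" and "a" each half the time.
```

The weakness the abstract points to is visible here: the estimate for "b" ignores whether the trail arrived there from "a" or elsewhere, which is exactly the history a higher-order model would capture at the cost of many more parameters.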
3
3.0



by
Shirshendu Chatterjee; Ofer Zeitouni
texts
eye 3
favorite 0
comment 0
We consider the "searching for a trail in a maze" composite hypothesis testing problem, in which one attempts to detect an anomalous directed path in a 2D lattice box of side n based on observations on the nodes of the box. Under the signal hypothesis, one observes independent Gaussian variables of unit variance at all nodes, with mean zero off the anomalous path and mean $\mu_n$ on it. Under the null hypothesis, one observes i.i.d. standard Gaussians on all nodes. Arias-Castro et al....
Topics: Statistics Theory, Statistics, Mathematics
Source: http://arxiv.org/abs/1704.05991
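A natural scan statistic for this detection problem is the maximum total observation along a candidate path, computable by dynamic programming. The sketch below uses one plausible path class (left-to-right, row index changing by at most one per step); this path family is an assumption for illustration, not necessarily the one analyzed by Arias-Castro et al.:

```python
import numpy as np

def max_path_sum(Z):
    """Maximum total observation over directed left-to-right paths in an
    n x n array, where each step moves one column right and changes the
    row by at most one. Large values favor the signal hypothesis."""
    n = Z.shape[0]
    best = Z[:, 0].copy()           # best path sum ending at each row, col 0
    for j in range(1, n):
        prev = best
        best = np.empty(n)
        for i in range(n):
            lo, hi = max(i - 1, 0), min(i + 1, n - 1)
            best[i] = Z[i, j] + prev[lo:hi + 1].max()
    return best.max()

Z = np.zeros((5, 5))
Z[2] = 1.0                          # planted horizontal trail of elevated means
# The statistic picks up the full planted trail: value 5.0.
print(max_path_sum(Z))
```

Under the null, every entry of `Z` is standard Gaussian noise and the statistic concentrates around its noise-only level; the testing question is how large $\mu_n$ must be for the two levels to separate.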
5
5.0



by
Shonosuke Sugasawa; Genya Kobayashi; Yuki Kawakubo
texts
eye 5
favorite 0
comment 0
This article proposes a mixture modeling approach to estimating cluster-wise conditional distributions in clustered (grouped) data. We adapt the mixture-of-experts model to the latent distributions, and propose a model in which each cluster-wise density is represented as a mixture of latent experts with cluster-wise mixing proportions following a Dirichlet distribution. The model parameters are estimated by maximizing the marginal likelihood function using a newly developed Monte Carlo...
Topics: Statistics, Methodology
Source: http://arxiv.org/abs/1704.05993
4
4.0



by
Siwar Jendoubi; Arnaud Martin; Ludovic Liétard; Boutheina Ben Yaghlane; Hend Ben Hadji
texts
eye 4
favorite 0
comment 0
Social message classification is a research domain that has attracted the attention of many researchers in recent years. Indeed, a social message differs from ordinary text because it has some special characteristics, such as its shortness. The development of new approaches for processing social messages is therefore essential to make their classification more efficient. In this paper, we are mainly interested in the classification of social messages based on their spreading on...
Topics: Machine Learning, Statistics, Artificial Intelligence, Computing Research Repository, Social and...
Source: http://arxiv.org/abs/1701.07756
3
3.0



by
Marie Turčičová; Jan Mandel; Kryštof Eben
texts
eye 3
favorite 0
comment 0
The asymptotic variance of a maximum likelihood estimate is proved to decrease by restricting the maximization to a subspace that is known to contain the true parameter. Covariance matrices of many random fields are known to be diagonal or approximately diagonal in a suitable basis. Such sample covariance matrices were improved by omitting off-diagonal terms. Maximum likelihood estimation on subspaces of diagonal matrices allows a systematic fitting of diagonal covariance models including...
Topics: Statistics Theory, Statistics, Mathematics
Source: http://arxiv.org/abs/1701.08185
4
4.0



by
Dimosthenis Tsagkrasoulis; Giovanni Montana
texts
eye 4
favorite 0
comment 0
An increasing array of biomedical and computer vision applications requires the predictive modeling of complex data, for example images and shapes. The main challenge when predicting such objects lies in the fact that they do not comply with the assumptions of Euclidean geometry. Rather, they occupy nonlinear spaces, a.k.a. manifolds, where it is difficult to define concepts such as coordinates, vectors and expected values. In this work, we construct a nonparametric predictive methodology for...
Topics: Machine Learning, Statistics, Methodology
Source: http://arxiv.org/abs/1701.08381
6
6.0



by
Ronald L. Rivest
texts
eye 6
favorite 0
comment 0
We propose a simple risk-limiting audit for elections, ClipAudit. To determine whether candidate A (the reported winner) actually beat candidate B in a plurality election, ClipAudit draws ballots at random, without replacement, until either all cast ballots have been drawn, or until \[ a - b \ge \beta \sqrt{a+b} \] where $a$ is the number of ballots in the sample for the reported winner A, $b$ is the number of ballots in the sample for opponent B, and $\beta$ is a constant determined...
Topics: Cryptography and Security, Statistics, Applications, Computing Research Repository
Source: http://arxiv.org/abs/1701.08312
4
4.0



by
BaNgu Vo; Quang N. Tran; Dinh Phung; BaTuong Vo
texts
eye 4
favorite 0
comment 0
Point patterns are sets or multisets of unordered elements that can be found in numerous data sources. However, in data analysis tasks such as classification and novelty detection, appropriate statistical models for point pattern data have not received much attention. This paper proposes the modelling of point pattern data via random finite sets (RFS). In particular, we propose appropriate likelihood functions, and a maximum likelihood estimator for learning a tractable family of RFS models....
Topics: Learning, Machine Learning, Statistics, Computing Research Repository
Source: http://arxiv.org/abs/1701.08473
3
3.0
texts
eye 3
favorite 0
comment 0
We study cascading failures in a system comprising interdependent networks/systems, in which nodes rely on other nodes both in the same system and in other systems to perform their function. The (inter)dependence among nodes is modeled using a dependence graph, where the degree vector of a node determines the number of other nodes it can potentially cause to fail in each system through the aforementioned dependency. In particular, we examine the impact of the variability and dependence properties...
Topics: Physics, Physics and Society, Computing Research Repository, Social and Information Networks,...
Source: http://arxiv.org/abs/1702.00298
5
5.0



by
Shantanu Jain; Martha White; Predrag Radivojac
texts
eye 5
favorite 0
comment 0
A common approach in positive-unlabeled learning is to train a classification model between labeled and unlabeled data. This strategy is in fact known to give an optimal classifier under mild conditions; however, it results in biased empirical estimates of the classifier performance. In this work, we show that the typically used performance measures, such as the receiver operating characteristic curve or the precision-recall curve obtained on such data, can be corrected with the knowledge of...
Topics: Learning, Machine Learning, Statistics, Computing Research Repository
Source: http://arxiv.org/abs/1702.00518
4
4.0



by
Javier Hidalgo; Jungyoon Lee; Myung Hwan Seo
texts
eye 4
favorite 0
comment 0
This paper is concerned with inference in regression models with either a kink or a jump at an unknown threshold, particularly when we do not know whether the kink or jump is the true specification. One of our main results shows that the statistical properties of the estimator of the threshold parameter are substantially different under the two settings, with a slower rate of convergence under the kink design, and, more surprisingly, slower than if the correct kink specification were employed in...
Topics: Statistics Theory, Statistics, Methodology, Mathematics
Source: http://arxiv.org/abs/1702.00836
3
3.0



by
Lane T. McIntosh; Niru Maheswaranathan; Aran Nayebi; Surya Ganguli; Stephen A. Baccus
texts
eye 3
favorite 0
comment 0
A central challenge in neuroscience is to understand neural computations and circuit mechanisms that underlie the encoding of ethologically relevant, natural stimuli. In multilayered neural circuits, nonlinear processes such as synaptic transmission and spiking dynamics present a significant obstacle to the creation of accurate computational models of responses to natural stimuli. Here we demonstrate that deep convolutional neural networks (CNNs) capture retinal responses to natural scenes...
Topics: Machine Learning, Quantitative Biology, Neurons and Cognition, Statistics
Source: http://arxiv.org/abs/1702.01825
4
4.0



by
Jose Diogo Barbosa; Marcelo J. Moreira
texts
eye 4
favorite 0
comment 0
Lancaster (2002) proposes an estimator for the dynamic panel data model with homoskedastic errors and zero initial conditions. In this paper, we show this estimator is invariant to orthogonal transformations, but is inefficient because it ignores additional information available in the data. The zero initial condition is trivially satisfied by subtracting initial observations from the data. We show that differencing out the data further erodes efficiency compared to drawing inference...
Topics: Statistics, Methodology
Source: http://arxiv.org/abs/1702.02231
4
4.0



by
Quang N. Tran; BaNgu Vo; Dinh Phung; BaTuong Vo
texts
eye 4
favorite 0
comment 0
Clustering is one of the most common unsupervised learning tasks in machine learning and data mining. Clustering algorithms have been used in a plethora of applications across several scientific fields. However, there has been limited research on the clustering of point patterns (sets or multisets of unordered elements) that are found in numerous applications and data sources. In this paper, we propose two approaches for clustering point patterns. The first is a nonparametric method based...
Topics: Learning, Machine Learning, Statistics, Computing Research Repository
Source: http://arxiv.org/abs/1702.02262
4
4.0



by
Yunyun Li; Debajyoti Debnath; Pulak K. Ghosh; Fabio Marchesoni
texts
eye 4
favorite 0
comment 0
We investigate both analytically and by numerical simulation the relaxation of an overdamped Brownian particle in a 1D multi-well potential. We show that the mean relaxation time from an injection point inside the well down to its bottom is dominated by statistically rare trajectories that sample the potential profile outside the well. As a consequence, the hopping time between two degenerate wells can also depend on the detailed multi-well structure of the entire potential. The nonlocal nature...
Topics: Physics, Statistical Mechanics, Condensed Matter, Data Analysis, Statistics and Probability,...
Source: http://arxiv.org/abs/1702.02296
4
4.0



by
Christoph Hirnschall; Adish Singla; Sebastian Tschiatschek; Andreas Krause
texts
eye 4
favorite 0
comment 0
We study an online multi-task learning setting, in which instances of related tasks arrive sequentially, and are handled by task-specific online learners. We consider an algorithmic framework to model the relationship of these tasks via a set of convex constraints. To exploit this relationship, we design a novel algorithm, COOL, for coordinating the individual online learners: our key idea is to coordinate their parameters via weighted projections onto a convex set. By adjusting the rate...
Topics: Learning, Machine Learning, Statistics, Computing Research Repository
Source: http://arxiv.org/abs/1702.02849
6
6.0



by
Claire Brécheteau
texts
eye 6
favorite 0
comment 0
In this paper, we introduce the notion of the DTM-signature, a measure on R+ that can be associated to any metric-measure space. This signature is based on the distance to a measure (DTM) introduced by Chazal, Cohen-Steiner and Mérigot. It leads to a pseudometric between metric-measure spaces, upper-bounded by the Gromov-Wasserstein distance. Under some geometric assumptions, we derive lower bounds for this pseudometric. Given two N-samples, we also build an asymptotic statistical test based...
Topics: Statistics Theory, Computing Research Repository, Computational Geometry, Probability, Statistics,...
Source: http://arxiv.org/abs/1702.02838
4
4.0



by
Dan Crisan; Jeremie Houssineau; Ajay Jasra
texts
eye 4
favorite 0
comment 0
We introduce a new class of Monte Carlo based approximations of expectations of random variables whose laws are not available directly, but only through certain discretisations. Sampling from the discretised versions of these laws can typically introduce a bias. In this paper, we show how to remove that bias by introducing a new version of multi-index Monte Carlo (MIMC) that has the added advantage of reducing the computational effort, relative to i.i.d. sampling from the most...
Topics: Computation, Statistics
Source: http://arxiv.org/abs/1702.03057
3
3.0



by
Stéphanie van der Pas; Botond Szabó; Aad van der Vaart
texts
eye 3
favorite 0
comment 0
We investigate the frequentist properties of Bayesian procedures for estimation based on the horseshoe prior in the sparse multivariate normal means model. Previous theoretical results assumed that the sparsity level, that is, the number of signals, was known. We drop this assumption and characterize the behavior of the maximum marginal likelihood estimator (MMLE) of a key parameter of the horseshoe prior. We prove that the MMLE is an effective estimator of the sparsity level, in the sense that...
Topics: Statistics Theory, Statistics, Mathematics
Source: http://arxiv.org/abs/1702.03698
3
3.0



by
Piotr Szymański; Tomasz Kajdanowicz
texts
eye 3
favorite 0
comment 0
We study the performance of data-driven, a priori, and random approaches to label space partitioning for multi-label classification with a Gaussian Naive Bayes classifier. Experiments were performed on 12 benchmark data sets and evaluated on 5 established measures of classification quality: micro- and macro-averaged F1 score, Subset Accuracy, and Hamming loss. Data-driven methods are significantly better than an average run of the random baseline. In the case of F1 scores and Subset Accuracy, data...
Topics: Learning, Machine Learning, Statistics, Computing Research Repository
Source: http://arxiv.org/abs/1702.04013
3
3.0
texts
eye 3
favorite 0
comment 0
This paper presents a stochastic logic time delay reservoir design. The reservoir is analyzed using a number of metrics, such as kernel quality, generalization rank, performance on simple benchmarks, and is also compared to a deterministic design. A novel reseeding method is introduced to reduce the adverse effects of stochastic noise, which may also be implemented in other stochastic logic reservoir computing designs, such as echo state networks. Benchmark results indicate that the proposed...
Topics: Machine Learning, Statistics, Emerging Technologies, Computing Research Repository
Source: http://arxiv.org/abs/1702.04265
5
5.0



by
Iván Díaz; Oleksandr Savenkov; Karla Ballman
texts
eye 5
favorite 0
comment 0
We consider estimation of an optimal individualized treatment rule (ITR) from observational and randomized studies when data for a high-dimensional baseline variable are available. Our optimality criterion is with respect to delaying time to occurrence of an event of interest (e.g., death or relapse of cancer). We leverage semiparametric efficiency theory to construct estimators with desirable properties such as double robustness. We propose two estimators of the optimal ITR, which arise from...
Topics: Statistics, Methodology
Source: http://arxiv.org/abs/1702.04682
3
3.0



by
Pixu Shi; Hongzhe Li
texts
eye 3
favorite 0
comment 0
In human microbiome studies, sequencing reads data are often summarized as counts of bacterial taxa at various taxonomic levels specified by a taxonomic tree. This paper considers the problem of analyzing two repeated measurements of microbiome data from the same subjects. Such data are often collected to assess the change of microbial composition after certain treatment, or the difference in microbial compositions across body sites. Existing models for such count data are limited in modeling...
Topics: Statistics, Applications
Source: http://arxiv.org/abs/1702.04808
4
4.0



by
Akshay Soni; Yashar Mehdad
texts
eye 4
favorite 0
comment 0
The multi-label learning problem with a large number of labels, features, and datapoints has generated tremendous interest recently. A recurring theme of these problems is that only a few labels are active in any given datapoint, as compared to the total number of labels. However, only a small number of existing works take direct advantage of this inherent extreme sparsity in the label space. By virtue of the Restricted Isometry Property (RIP), satisfied by many random ensembles, we propose a...
Topics: Information Retrieval, Learning, Machine Learning, Statistics, Computing Research Repository
Source: http://arxiv.org/abs/1702.05181
3
3.0
texts
eye 3
favorite 0
comment 0
We prove a new concentration inequality for the excess risk of an M-estimator in least-squares regression with random design and heteroscedastic noise. This kind of result is a central tool in modern model selection theory, as well as in recent achievements concerning the behavior of regularized estimators such as LASSO, group LASSO and SLOPE.
Topics: Statistics Theory, Machine Learning, Statistics, Mathematics
Source: http://arxiv.org/abs/1702.05063
4
4.0



by
Dmitry I. Ignatov; Bruce W. Watson
texts
eye 4
favorite 0
comment 0
Being an unsupervised machine learning and data mining technique, biclustering and its multimodal extensions are becoming popular tools for analysing object-attribute data in different domains. Unlike conventional clustering techniques, biclustering searches for homogeneous groups of objects while keeping their common description, e.g., in the binary setting, their shared attributes. In bioinformatics, biclustering is used to find genes which are active in a subset of situations, thus...
Topics: Machine Learning, Statistics, Artificial Intelligence, Computing Research Repository, Discrete...
Source: http://arxiv.org/abs/1702.05376
4
4.0



by
Songbai Yan; Chicheng Zhang
texts
eye 4
favorite 0
comment 0
It has been a long-standing problem to efficiently learn a linear separator using as few labels as possible. In this work, we propose an efficient perceptron-based algorithm for actively learning homogeneous linear separators under the uniform distribution. Under bounded noise, where each label is flipped with probability at most $\eta$, our algorithm achieves near-optimal $\tilde{O}\left(\frac{d}{(1-2\eta)^2}\log\frac{1}{\epsilon}\right)$ label complexity in time...
Topics: Learning, Machine Learning, Statistics, Computing Research Repository
Source: http://arxiv.org/abs/1702.05581
4
4.0



by
Nicolas Flammarion; Francis Bach
texts
eye 4
favorite 0
comment 0
We consider the minimization of composite objective functions composed of the expectation of quadratic functions and an arbitrary convex function. We study the stochastic dual averaging algorithm with a constant step-size, showing that it leads to a convergence rate of O(1/n) without strong convexity assumptions. This thus extends earlier results on least-squares regression with the Euclidean geometry to (a) all convex regularizers and constraints, and (b) all geometries represented by a...
Topics: Optimization and Control, Machine Learning, Statistics, Mathematics
Source: http://arxiv.org/abs/1702.06429
3
3.0



by
Fangzheng Xie; Mingyuan Zhou; Yanxun Xu
texts
eye 3
favorite 0
comment 0
Tumors are heterogeneous: a tumor sample usually consists of a set of subclones with distinct transcriptional profiles and potentially different degrees of aggressiveness and responses to drugs. Understanding tumor heterogeneity is therefore critical to precise cancer prognosis and treatment. In this paper, we introduce BayCount, a Bayesian decomposition method to infer tumor heterogeneity with highly overdispersed RNA sequencing count data. Using negative binomial factor analysis, BayCount...
Topics: Statistics, Applications
Source: http://arxiv.org/abs/1702.07981
4
4.0



by
Emma Pierson; Sam CorbettDavies; Sharad Goel
texts
eye 4
favorite 0
comment 0
Threshold tests have recently been proposed as a robust method for detecting bias in lending, hiring, and policing decisions. For example, in the case of credit extensions, these tests aim to estimate the bar for granting loans to white and minority applicants, with a higher inferred threshold for minorities indicative of discrimination. This technique, however, requires fitting a Bayesian latent variable model for which inference is often computationally challenging. Here we develop a method...
Topics: Learning, Machine Learning, Statistics, Computing Research Repository
Source: http://arxiv.org/abs/1702.08536
5
5.0



by
Dustin Tran; Rajesh Ranganath; David M. Blei
texts
eye 5
favorite 0
comment 0
Implicit probabilistic models are a flexible class for modeling data. They define a process to simulate observations, and unlike traditional models, they do not require a tractable likelihood function. In this paper, we develop two families of models: hierarchical implicit models and deep implicit models. They combine the idea of implicit densities with hierarchical Bayesian modeling and deep neural networks. The use of implicit models with Bayesian analysis has been limited by our ability to...
Topics: Learning, Computing Research Repository, Machine Learning, Computation, Statistics, Methodology
Source: http://arxiv.org/abs/1702.08896
4
4.0



by
Lu Zhang; Yongkai Wu; Xintao Wu
texts
eye 4
favorite 0
comment 0
Discrimination-aware classification is receiving increasing attention in the data mining and machine learning fields. Data preprocessing methods for constructing a discrimination-free classifier remove discrimination from the training data and learn the classifier from the cleaned data. However, there is no theoretical guarantee for the performance of these methods. In this paper, we fill this theoretical gap by mathematically bounding the probability that the discrimination in...
Topics: Learning, Machine Learning, Statistics, Computing Research Repository
Source: http://arxiv.org/abs/1703.00060
5
5.0



by
Junier B. Oliva; Barnabas Poczos; Jeff Schneider
texts
eye 5
favorite 0
comment 0
Sophisticated gated recurrent neural network architectures like LSTMs and GRUs have been shown to be highly effective in a myriad of applications. We develop an ungated unit, the statistical recurrent unit (SRU), that is able to learn long-term dependencies in data by keeping only moving averages of statistics. The SRU's architecture is simple, ungated, and contains a comparable number of parameters to LSTMs; yet, SRUs compare favorably with more sophisticated LSTM and GRU alternatives, often...
Topics: Learning, Machine Learning, Statistics, Artificial Intelligence, Computing Research Repository
Source: http://arxiv.org/abs/1703.00381
3
3.0



by
Tuan Anh Le; Atilim Gunes Baydin; Robert Zinkov; Frank Wood
texts
eye 3
favorite 0
comment 0
We draw a formal connection between using synthetic training data to optimize neural network parameters and approximate, Bayesian, model-based reasoning. In particular, training a neural network using synthetic data can be viewed as learning a proposal distribution generator for approximate inference in the synthetic-data generative model. We demonstrate this connection in a recognition task where we develop a novel Captcha-breaking architecture and train it using synthetic data, demonstrating...
Topics: Learning, Machine Learning, Statistics, Computing Research Repository, Computer Vision and Pattern...
Source: http://arxiv.org/abs/1703.00868
4
4.0



by
Carl Jidling; Niklas Wahlström; Adrian Wills; Thomas B. Schön
texts
eye 4
favorite 0
comment 0
We consider a modification of the covariance function in Gaussian processes to correctly account for known linear constraints. By modelling the target function as a transformation of an underlying function, the constraints are explicitly incorporated in the model such that they are guaranteed to be fulfilled by any sample drawn or prediction made. We also propose a constructive procedure for designing the transformation operator and illustrate the result on both simulated and realdata examples.
Topics: Machine Learning, Statistics
Source: http://arxiv.org/abs/1703.00787
3
3.0



by
Mohammad Reza Bonyadi; Quang M. Tieng; David C. Reutens
texts
eye 3
favorite 0
comment 0
In this paper we introduce a new classification algorithm called Optimization of Distributions Differences (ODD). The algorithm aims to find a transformation from the feature space to a new space where the instances in the same class are as close as possible to one another, while the gravity centers of these classes are as far as possible from one another. This aim is formulated as a multi-objective optimization problem that is solved by a hybrid of an evolutionary strategy and the quasi-Newton...
Topics: Learning, Machine Learning, Statistics, Computing Research Repository
Source: http://arxiv.org/abs/1703.00989
4
4.0



by
Bo Ning; Peter Bloomfield
texts
eye 4
favorite 0
comment 0
Dependent generalized extreme value (dGEV) models have attracted much attention due to the dependency structure that often appears in real datasets. To construct a dGEV model, a natural approach is to assume that some parameters in the model are timevarying. A previous study has shown that a dependent Gumbel process can be naturally incorporated into a GEV model. The model is a nonlinear state space model with a hidden state that follows a Markov process, with its innovation following a Gumbel...
Topics: Statistics, Methodology
Source: http://arxiv.org/abs/1703.00968
3
3.0



by
Veronika Cheplygina; Lauge Sørensen; David M. J. Tax; Marleen de Bruijne; Marco Loog
texts
eye 3
favorite 0
comment 0
We address the problem of \emph{instance label stability} in multiple instance learning (MIL) classifiers. These classifiers are trained only on globally annotated images (bags), but often can provide fine-grained annotations for image pixels or patches (instances). This is interesting for computer-aided diagnosis (CAD) and other medical image analysis tasks for which only a coarse labeling is provided. Unfortunately, the instance labels may be unstable. This means that a slight change in...
Topics: Machine Learning, Statistics, Computing Research Repository, Computer Vision and Pattern Recognition
Source: http://arxiv.org/abs/1703.04986
4
4.0



by
Olga Klopp; Nicolas Verzelen
texts
eye 4
favorite 0
comment 0
Consider the twin problems of estimating the connection probability matrix of an inhomogeneous random graph and the graphon of a W-random graph. We establish the minimax estimation rates with respect to the cut metric for classes of block-constant matrices and step-function graphons. Surprisingly, our results imply that, from the minimax point of view, the raw data, that is, the adjacency matrix of the observed graph, is already optimal and more involved procedures cannot improve the...
Topics: Statistics Theory, Statistics, Mathematics
Source: http://arxiv.org/abs/1703.05101
10
10.0



by
Ramin M. Hasani; Victoria Beneder; Magdalena Fuchs; David Lung; Radu Grosu
texts
eye 10
favorite 0
comment 0
We introduce SIM-CE, an advanced, user-friendly modeling and simulation environment in Simulink for performing multi-scale behavioral analysis of the nervous system of Caenorhabditis elegans (C. elegans). SIM-CE contains an implementation of the mathematical models of C. elegans's neurons and synapses, in Simulink, which can be easily extended and particularized by the user. The Simulink model is able to capture both complex dynamics of ion channels and additional biophysical detail such as...
Topics: Neurons and Cognition, Quantitative Methods, Computing Research Repository, Machine Learning,...
Source: http://arxiv.org/abs/1703.06270
5
5.0



by
Julien Flamant; Nicolas Le Bihan; Pierre Chainais
texts
eye 5
favorite 0
comment 0
A novel approach towards the spectral analysis of stationary random bivariate signals is proposed. Using the Quaternion Fourier Transform, we introduce a quaternion-valued spectral representation of random bivariate signals seen as complex-valued sequences. This makes possible the definition of a scalar quaternion-valued spectral density for bivariate signals. This spectral density can be meaningfully interpreted in terms of frequency-dependent polarization attributes. A natural decomposition...
Topics: Statistics, Methodology
Source: http://arxiv.org/abs/1703.06417
3
3.0



by
Sophie A. Murray; Suzy Bingham; Michael Sharpe; David R. Jackson
texts
eye 3
favorite 0
comment 0
The Met Office Space Weather Operations Centre produces 24/7/365 space weather guidance, alerts, and forecasts for a wide range of government and commercial end users across the United Kingdom. Solar flare forecasts are one of its products, issued multiple times a day in two forms: forecasts for each active region on the solar disk over the next 24 hours, and full-disk forecasts for the next four days. Here the forecasting process is described in detail, as well as first verification...
Topics: Physics, Solar and Stellar Astrophysics, Data Analysis, Statistics and Probability, Astrophysics,...
Source: http://arxiv.org/abs/1703.06754
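Verification of probabilistic flare forecasts is commonly done with metrics such as the Brier score and the Brier skill score. A small self-contained sketch follows; the probabilities and outcomes are made up for illustration, and these are not necessarily the exact metrics the paper reports:

```python
def brier_score(probs, outcomes):
    """Mean squared difference between the forecast probability and the
    observed 0/1 outcome; 0 is perfect, lower is better."""
    assert len(probs) == len(outcomes)
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

def brier_skill_score(probs, outcomes):
    """Skill relative to always forecasting the climatological base rate;
    positive values mean the forecast beats climatology."""
    base = sum(outcomes) / len(outcomes)
    ref = brier_score([base] * len(outcomes), outcomes)
    return 1.0 - brier_score(probs, outcomes) / ref

# Illustrative (made-up) daily flare probabilities and observed outcomes.
probs = [0.1, 0.4, 0.7, 0.2, 0.9, 0.05]
outcomes = [0, 0, 1, 0, 1, 0]
bs = brier_score(probs, outcomes)
bss = brier_skill_score(probs, outcomes)
```

Region-by-region 24-hour forecasts and four-day full-disk forecasts can each be scored this way against their own event definitions.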
by
Nick Pawlowski; Miguel Jaques; Ben Glocker
In this work we perform outlier detection using ensembles of neural networks obtained by variational approximation of the posterior in a Bayesian neural network setting. The variational parameters are obtained by sampling from the true posterior via gradient descent. We show that our outlier detection results are comparable to those obtained using other efficient ensembling methods.
Topics: Learning, Machine Learning, Statistics, Computing Research Repository
Source: http://arxiv.org/abs/1703.06749
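The core idea, disagreement among posterior-sampled ensemble members as an outlier score, can be sketched in a toy form. Everything below is illustrative: the variationally sampled neural networks are replaced by linear predictors with perturbed weights, which is not the paper's architecture:

```python
import random
import statistics

def ensemble_outlier_score(x, members):
    """Variance of the ensemble's predictions at input x. Inputs far
    from the training data tend to produce larger disagreement, so a
    high score flags a potential outlier."""
    preds = [f(x) for f in members]
    return statistics.pvariance(preds)

# Toy stand-in for networks sampled from an approximate posterior:
# noisy linear fits of y = 2x, as if trained on x in [0, 1].
rng = random.Random(0)
members = [(lambda a: (lambda x: a * x))(2.0 + rng.gauss(0, 0.3))
           for _ in range(50)]

in_dist = ensemble_outlier_score(0.5, members)   # inside training range
outlier = ensemble_outlier_score(10.0, members)  # far outside it
```

Thresholding the score (e.g., at a quantile of training-set scores) turns it into a detector.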
by
Ian Osband; Daniel Russo; Zheng Wen; Benjamin Van Roy
We study the use of randomized value functions to guide deep exploration in reinforcement learning. This offers an elegant means for synthesizing statistically and computationally efficient exploration with common practical approaches to value function learning. We present several reinforcement learning algorithms that leverage randomized value functions and demonstrate their efficacy through computational studies. We also prove a regret bound that establishes statistical efficiency with a...
Topics: Learning, Machine Learning, Statistics, Artificial Intelligence, Computing Research Repository
Source: http://arxiv.org/abs/1703.07608
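The flavor of exploration via randomized value functions can be shown in a tabular toy: before acting, sample a perturbed Q-table (noise shrinking with visit counts) and act greedily on the sample, instead of epsilon-greedy dithering. This RLSVI-like sketch, including the chain environment and all constants, is an illustrative assumption, not one of the paper's algorithms:

```python
import math
import random

def randomized_q_exploration(episodes=200, n_states=5, gamma=0.9, seed=0):
    """Chain environment: actions 0/1 move left/right, reward 1 only at
    the rightmost state. Exploration comes from acting greedily on a
    noise-perturbed Q sample rather than from epsilon-greedy dithering."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]
    N = [[1, 1] for _ in range(n_states)]        # visit counts
    total = 0.0
    for _ in range(episodes):
        s = 0
        for _ in range(2 * n_states):
            # sample a randomized value estimate, act greedily on it
            qs = [Q[s][a] + rng.gauss(0, 1.0 / math.sqrt(N[s][a]))
                  for a in (0, 1)]
            a = 0 if qs[0] > qs[1] else 1
            s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
            r = 1.0 if s2 == n_states - 1 else 0.0
            N[s][a] += 1
            alpha = 1.0 / N[s][a]                # decaying learning rate
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            total += r
            s = s2
    return Q, total

Q, total = randomized_q_exploration()
```

As counts grow the perturbations shrink, so behavior converges toward the greedy policy, which is the "deep exploration" mechanism in miniature.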
by
Mario Lucic; Matthew Faulkner; Andreas Krause; Dan Feldman
How can we train a statistical mixture model on a massive data set? In this paper, we show how to construct coresets for mixtures of Gaussians and natural generalizations. A coreset is a weighted subset of the data, which guarantees that models fitting the coreset also provide a good fit for the original data set. We show that, perhaps surprisingly, Gaussian mixtures admit coresets of size polynomial in dimension and the number of mixture components, while being independent of the data set...
Topics: Machine Learning, Statistics
Source: http://arxiv.org/abs/1703.08110
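The flavor of the construction can be sketched with a lightweight, sensitivity-style importance sampler in one dimension. This is a generic simplification to show why a weighted subset can stand in for the full data, not the paper's coreset construction for Gaussian mixtures:

```python
import random

def lightweight_coreset(points, m, seed=0):
    """Sample m points with probability q_i = 1/(2n) + d_i^2/(2*sum d^2),
    where d_i is the distance to the data mean, and weight each sampled
    point by 1/(m*q_i) so weighted sums are unbiased estimates of the
    corresponding full-data sums."""
    rng = random.Random(seed)
    n = len(points)
    mu = sum(points) / n
    d2 = [(x - mu) ** 2 for x in points]
    total = sum(d2) or 1.0
    q = [0.5 / n + 0.5 * d / total for d in d2]   # probabilities sum to 1
    idx = rng.choices(range(n), weights=q, k=m)
    return [(points[i], 1.0 / (m * q[i])) for i in idx]

rng = random.Random(1)
data = [rng.gauss(0, 1) for _ in range(1000)]
core = lightweight_coreset(data, m=100)
# The weighted coreset mean approximates the full-data mean.
approx = sum(w * x for x, w in core) / sum(w for _, w in core)
```

Fitting a mixture model to the weighted subset then stands in for fitting it to all n points, which is the point of the coreset guarantee.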
Lai and Robbins (1985) and Lai (1987) provided efficient parametric solutions to the multi-armed bandit problem, showing that arm allocation via upper confidence bounds (UCB) achieves minimum regret. These bounds are constructed from the Kullback-Leibler information of the reward distributions, estimated from within a specified parametric family. In recent years there has been renewed interest in the multi-armed bandit problem due to new applications in machine learning algorithms and data...
Topics: Statistics Theory, Statistics, Mathematics
Source: http://arxiv.org/abs/1703.08285
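The UCB mechanics can be sketched with the distribution-free UCB1 index. Note this is simpler than the allocation rule discussed above, which plugs Kullback-Leibler information from a parametric family into the bound; UCB1 uses a generic sub-Gaussian confidence radius instead:

```python
import math
import random

def ucb1(pull, n_arms, horizon, seed=0):
    """UCB1: play each arm once, then always pull the arm maximizing
    empirical mean + sqrt(2 ln t / n_i), trading off exploitation
    (the mean) against exploration (the shrinking confidence radius)."""
    rng = random.Random(seed)
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1                      # initialization: one pull per arm
        else:
            arm = max(range(n_arms),
                      key=lambda i: sums[i] / counts[i]
                      + math.sqrt(2.0 * math.log(t) / counts[i]))
        r = pull(arm, rng)
        counts[arm] += 1
        sums[arm] += r
    return counts

# Bernoulli arms with means 0.2 and 0.8: pulls concentrate on arm 1,
# and the suboptimal arm is pulled only O(log T) times.
means = [0.2, 0.8]
counts = ucb1(lambda i, rng: 1.0 if rng.random() < means[i] else 0.0,
              n_arms=2, horizon=2000)
```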
by
Kaspar Märtens; Michalis K Titsias; Christopher Yau
Bayesian inference for complex models is challenging due to the need to explore high-dimensional, multimodal spaces, and standard Monte Carlo samplers can have difficulty exploring the posterior effectively. We introduce a general-purpose rejection-free ensemble Markov chain Monte Carlo (MCMC) technique to improve on existing poorly mixing samplers. This is achieved by combining parallel tempering with an auxiliary variable move to exchange information between the chains. We demonstrate...
Topics: Computation, Statistics, Machine Learning, Methodology
Source: http://arxiv.org/abs/1703.08520
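A minimal random-walk Metropolis version of parallel tempering on a bimodal target is sketched below. The paper's rejection-free auxiliary-variable exchange is replaced here by the standard adjacent-temperature swap with the usual acceptance ratio, so this shows only the baseline mechanism being improved upon:

```python
import math
import random

def parallel_tempering(logp, temps, steps, step_size=1.0, seed=0):
    """One Metropolis chain per temperature; every few steps, propose
    swapping the states of an adjacent pair with acceptance probability
    min(1, exp((1/T_i - 1/T_{i+1}) * (logp(x_{i+1}) - logp(x_i)))).
    The cold chain (temps[0] == 1) mixes across modes via the hot ones."""
    rng = random.Random(seed)
    x = [0.0] * len(temps)
    cold_samples = []
    for t in range(steps):
        for i, T in enumerate(temps):
            prop = x[i] + rng.gauss(0, step_size * math.sqrt(T))
            if math.log(rng.random() + 1e-300) < (logp(prop) - logp(x[i])) / T:
                x[i] = prop
        if t % 5 == 0:                       # attempt an adjacent swap
            i = rng.randrange(len(temps) - 1)
            a = (1 / temps[i] - 1 / temps[i + 1]) * (logp(x[i + 1]) - logp(x[i]))
            if math.log(rng.random() + 1e-300) < a:
                x[i], x[i + 1] = x[i + 1], x[i]
        cold_samples.append(x[0])
    return cold_samples

# Well-separated bimodal target (log-sum-exp for numerical stability):
# a single cold chain would get stuck in one mode.
def logp(z):
    a, b = -(z + 4) ** 2, -(z - 4) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

samples = parallel_tempering(logp, temps=[1.0, 4.0, 16.0], steps=4000)
```

With the swaps enabled, the cold chain's samples visit both modes near ±4.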
This paper shows how a time series of measurements of an evolving system can be processed to create an inner time series that is unaffected by any instantaneous invertible, possibly nonlinear, transformation of the measurements. An inner time series contains information that does not depend on the nature of the sensors the observer chose in order to monitor the system. Instead, it encodes information that is intrinsic to the evolution of the observed system. Because of its sensor-independence, an...
Topics: Statistics Theory, Computing Research Repository, Statistics, Sound, Methodology, Mathematics
Source: http://arxiv.org/abs/1703.08596
by
Song Mei; Theodor Misiakiewicz; Andrea Montanari; Roberto I. Oliveira
A number of statistical estimation problems can be addressed by semidefinite programs (SDP). While SDPs are solvable in polynomial time using interior point methods, in practice generic SDP solvers do not scale well to high-dimensional problems. In order to cope with this problem, Burer and Monteiro proposed a non-convex rank-constrained formulation, which has good performance in practice but is still poorly understood theoretically. In this paper we study the rank-constrained version of SDPs...
Topics: Optimization and Control, Machine Learning, Statistics, Mathematics
Source: http://arxiv.org/abs/1703.08729
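The Burer-Monteiro idea can be illustrated on the MaxCut SDP relaxation: instead of optimizing over a PSD matrix X, write X = V Vᵀ for a thin factor V with unit rows and run projected gradient descent on V directly. This is a generic sketch of the heuristic on a toy graph, not the paper's setting or its guarantees:

```python
import math
import random

def burer_monteiro_maxcut(edges, n, k=3, iters=500, lr=0.1, seed=0):
    """Minimize sum over edges of <v_i, v_j> subject to ||v_i|| = 1,
    i.e. the rank-k Burer-Monteiro surrogate of the MaxCut SDP, by
    gradient descent with row renormalization (projection onto the
    unit-sphere constraint)."""
    rng = random.Random(seed)

    def normalize(v):
        s = math.sqrt(sum(c * c for c in v)) or 1.0
        return [c / s for c in v]

    V = [normalize([rng.gauss(0, 1) for _ in range(k)]) for _ in range(n)]
    for _ in range(iters):
        grad = [[0.0] * k for _ in range(n)]
        for i, j in edges:                   # d/dv_i sum <v_i, v_j> = sum v_j
            for c in range(k):
                grad[i][c] += V[j][c]
                grad[j][c] += V[i][c]
        V = [normalize([vc - lr * gc for vc, gc in zip(v, g)])
             for v, g in zip(V, grad)]
    return V

# 4-cycle: the relaxation's optimum is rank one with alternating signs,
# cutting all 4 edges, i.e. objective sum <v_i, v_j> = -4.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
V = burer_monteiro_maxcut(edges, n=4)
obj = sum(sum(a * b for a, b in zip(V[i], V[j])) for i, j in edges)
```

The memory cost is n*k instead of n*n, which is why the factorized form scales where generic interior-point solvers do not.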
by
Saber Salehkaleybar; S. Jamaloddin Golestani
In distributed function computation, each node has an initial value and the goal is to compute a function of these values in a distributed manner. In this paper, we propose a novel token-based approach to compute a wide class of target functions, which we refer to as the "Token-based function Computation with Memory" (TCM) algorithm. In this approach, node values are attached to tokens and travel across the network. Each pair of travelling tokens coalesces when they meet, forming a...
Topics: Machine Learning, Statistics, Distributed, Parallel, and Cluster Computing, Computing Research...
Source: http://arxiv.org/abs/1703.08831
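The token mechanics can be illustrated for one commutative, associative target function (max): tokens random-walk and merge when they meet, so the surviving token's value is the function of all initial values regardless of the order of coalescences. This asynchronous ring simulation is a loose sketch of the idea, not the TCM algorithm itself:

```python
import random

def token_max(values, steps=10000, seed=0):
    """Each node of a ring starts with a token carrying its value. One
    token at a time takes a random-walk step; tokens meeting at a node
    coalesce into one token carrying the max of their values, so the
    unique surviving token holds max(values)."""
    rng = random.Random(seed)
    n = len(values)
    tokens = [(i, float(v)) for i, v in enumerate(values)]   # (position, value)
    for _ in range(steps):
        if len(tokens) == 1:
            break
        i = rng.randrange(len(tokens))
        p, v = tokens[i]
        tokens[i] = ((p + rng.choice((-1, 1))) % n, v)
        merged = {}                          # coalesce tokens sharing a node
        for p, v in tokens:
            merged[p] = max(merged.get(p, v), v)
        tokens = list(merged.items())
    return tokens

final = token_max([3, 1, 4, 1, 5, 9, 2, 6])
```

Because max is insensitive to merge order, any meeting schedule yields the same answer; functions without that structure are where the "memory" attached to tokens becomes necessary.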