Current statistics seminars

McGill Statistics Seminar Series 2017-2018

 

Fall Term 2017

Date Event Speaker(s) Title Time Location

September 8, 2017

McGill Statistics Seminar
Simon Gravel
Genomics like it's 1960: Inferring human history

15:30-16:30

BURN 1205
Abstract: A central goal of population genetics is the inference of the biological, evolutionary and demographic forces that shaped human diversity. Large-scale sequencing experiments provide fantastic opportunities to learn about human history and biology if we can overcome computational and statistical challenges. I will discuss how simple mid-century statistical approaches, such as the jackknife and Kolmogorov equations, can be combined in unexpected ways to solve partial differential equations, optimize genomic study design, and learn about the spread of modern humans since our common African origins.

 

Speaker: Simon Gravel is an Assistant Professor at the Department of Human Genetics, McGill University.

September 15, 2017

McGill Statistics Seminar
Farzan Rohani
Our quest for robust time series forecasting at scale

15:30-16:30

BURN 1205
Abstract: The demand for time series forecasting at Google has grown rapidly along with the company since its founding. Initially, the various business and engineering needs led to a multitude of forecasting approaches, most reliant on direct analyst support. The volume and variety of the approaches, and in some cases their inconsistency, called out for an attempt to unify, automate, and extend forecasting methods, and to distribute the results via tools that could be deployed reliably across the company. That is, for an attempt to develop methods and tools that would facilitate accurate large-scale time series forecasting at Google. We were part of a team of data scientists in Search Infrastructure at Google that took on the task of developing robust and automatic large-scale time series forecasting for our organization. In this talk, we recount how we approached the task, describing initial stakeholder needs, the business and engineering contexts in which the challenge arose, and theoretical and pragmatic choices we made to implement our solution. We describe our general forecasting framework, offer details on various tractable subproblems into which we decomposed our overall forecasting task, and provide an example of our forecasting routine applied to publicly available Turkish Electricity data.

 

Speaker: Farzan Rohani is a Senior Staff Data Scientist at Google.
September 22, 2017
McGill Statistics Seminar
Kai Zhang

BET on independence

14:00-15:00

BRONF179, Purvis Hall
Abstract: We study the problem of nonparametric dependence detection. Many existing methods suffer severe power loss due to non-uniform consistency, which we illustrate with a paradox. To avoid such power loss, we approach the nonparametric test of independence through the new framework of binary expansion statistics (BEStat) and binary expansion testing (BET), which examine dependence through a filtration induced by marginal binary expansions. Through a novel decomposition of the likelihood of contingency tables whose sizes are powers of 2, we show that the interactions of binary variables in the filtration are complete sufficient statistics for dependence. These interactions are also pairwise independent under the null. By utilizing these interactions, the BET avoids the problem of non-uniform consistency and improves upon a wide class of commonly used methods (a) by achieving the optimal rate in sample complexity and (b) by providing clear interpretations of global and local relationships upon rejection of independence. The binary expansion approach also connects the test statistics with the current computing system to allow efficient bitwise implementation. We illustrate the BET by a study of the distribution of stars in the night sky and by an exploratory data analysis of the TCGA breast cancer data.

 

Speaker: Kai Zhang is an Assistant Professor in the Department of Statistics and Operations Research, University of North Carolina, Chapel Hill, NC.

September 29, 2017

CRM Colloque de statistique
Alex McNeil and Barbara Jasiulis-Goldyn


McNeil: Spectral backtests of forecast distributions with application to risk management

Jasiulis-Goldyn: Asymptotic properties and renewal theory for Kendall random walks

 

14:30-16:30

BURN 1205
Abstract: McNeil: In this talk we study a class of backtests for forecast distributions in which the test statistic is a spectral transformation that weights exceedance events by a function of the modelled probability level. The choice of the kernel function makes explicit the user’s priorities for model performance. The class of spectral backtests includes tests of unconditional coverage and tests of conditional coverage. We show how the class embeds a wide variety of backtests in the existing literature, and propose novel variants as well. We assess the size and power of the backtests in realistic sample sizes, and in particular demonstrate the tradeoff between power and specificity in validating quantile forecasts.

 

Jasiulis-Goldyn: We consider extremal Markovian sequences connected with the Kendall convolution, called Kendall random walks. The best tool for solving problems connected with the Kendall generalized convolution is the Williamson transform, which is also the generator of an Archimedean copula. We prove that the one-dimensional distributions of Kendall random walks are regularly varying. The Central Limit Theorem in the Kendall convolution algebra will be shown using the Williamson transform. We note that the stable distributions obtained belong to the maximal domain of attraction of the Fréchet distribution. We prove convergence of finite-dimensional distributions for continuous-time stochastic processes constructed from Kendall random walks. We also construct renewal processes for extremal Markovian sequences of the Kendall type and present significant connections with the limit distribution of Kendall random walks.

Speakers: Alexander McNeil is Professor of Actuarial Science at the University of York, England. Barbara Jasiulis-Goldyn is an Assistant Professor in the Mathematical Institute, University of Wrocław, Poland.

October 6, 2017

McGill Statistics Seminar
 
Cancelled

15:30-16:30

BURN 1205
Abstract:

 

Speaker:  

October 13, 2017

McGill Statistics Seminar
Anne-Laure Fougères
Quantifying spatial flood risks: A comparative study of max-stable models

15:30-16:30

BURN 1205
Abstract:

In various applications, evaluating spatial risks (such as floods, heatwaves or storms) is a key problem. The aim of this talk is to make use of extreme value theory and max-stable processes to provide quantitative answers to this issue. A review of the literature will be provided, as well as a wide comparative study based on a simulation design mimicking daily rainfall in France. This is joint work with Cécile Mercadier (Université Claude-Bernard Lyon 1 (UCBL)) and Quentin Sebille (UCBL).

Speaker: Anne-Laure Fougères is Professor of Statistics at the Institut Camille-Jordan, Université Claude-Bernard, Lyon, France.
October 20, 2017
McGill Statistics Seminar
Qiang Sun

Statistical optimization and nonasymptotic robustness

15:30-16:30

BURN 1205
Abstract:

Statistical optimization has generated considerable interest recently. It refers to settings in which hidden, local convexity can be discovered in most nonconvex problems, making polynomial-time algorithms possible. It relies on a careful analysis of the geometry near global optima. In this talk, I will explore this issue by focusing on sparse regression problems in high dimensions. A computational framework named iterative local adaptive majorize-minimization (I-LAMM) will be proposed to simultaneously control algorithmic complexity and statistical error. I-LAMM effectively turns the nonconvex penalized regression problem into a series of convex programs by utilizing the local strong convexity of the problem when the solution set is restricted to an L_1 cone. Computationally, we establish a phase transition phenomenon: the algorithm enjoys a linear rate of convergence after a sub-linear burn-in. Statistically, it provides solutions with optimal statistical errors. Extensions to robust regression will be discussed.

Speaker: Qiang Sun is an Assistant Professor in the Department of Statistical Sciences at the University of Toronto.

October 27, 2017

McGill Statistics Seminar
Gabriela V. Cohen Freue
Penalized robust regression estimation with applications to proteomics

15:30-16:30

BURN 1205
Abstract:

In many current applications, scientists can easily measure a very large number of variables (for example, hundreds of protein levels), some of which are expected to be useful to explain or predict a specific response variable of interest. These potential explanatory variables are most likely to contain redundant or irrelevant information, and in many cases, their quality and reliability may be suspect. We developed two penalized robust regression estimators that can be used to identify a useful subset of explanatory variables to predict the response, while protecting the resulting estimator against possible aberrant observations in the data set. Using an elastic net penalty, the proposed estimator can be used to select variables, even in cases with more variables than observations or when many of the candidate explanatory variables are correlated. In this talk, I will present the new estimator and an algorithm to compute it. I will also illustrate its performance in a simulation study and a real data set. This is joint work with Professor Matias Salibian-Barrera, my PhD student David Kepplinger, and my postdoctoral fellow Ezequiel Smucler.

Speaker: Gabriela V. Cohen Freue completed her PhD in Mathematical Statistics at the University of Maryland at College Park and postdoctoral studies in Biostatistics through her participation in the Biomarkers in Transplantation (BiT) initiative, hosted by the University of British Columbia in Vancouver. She then joined the PROOF Centre of Excellence, where she led the statistical analysis of proteomics data. She is now an Assistant Professor in the Department of Statistics at the University of British Columbia and a Canada Research Chair-II in Statistical Genomics. Her research interests are in robust estimation and regularization of linear models with applications to Statistical Genomics and Proteomics.

November 3, 2017

McGill Statistics Seminar
Daniel Simpson
How to do statistics

15:30-16:30

BURN 1205
Abstract:

In this talk, I will outline how to do (Bayesian) statistics. I will focus particularly on the things that need to be done before you see data, including prior specification and checking that your inference algorithm actually works.

Speaker: Daniel Simpson is an Assistant Professor in the Department of Statistical Sciences, University of Toronto.

November 10, 2017

McGill Statistics Seminar
Daniel Roy 
PAC-Bayesian Generalization Bounds for Deep Neural Networks

15:30-16:30

BURN 1205
Abstract:

One of the defining properties of deep learning is that models are chosen to have many more parameters than available training data. In light of this capacity for overfitting, it is remarkable that simple algorithms like SGD reliably return solutions with low test error. One roadblock to explaining these phenomena in terms of implicit regularization, structural properties of the solution, and/or easiness of the data is that many learning bounds are quantitatively vacuous when applied to networks learned by SGD in this "deep learning" regime. Logically, in order to explain generalization, we need nonvacuous bounds. We return to an idea by Langford and Caruana (2001), who used PAC-Bayes bounds to compute nonvacuous numerical bounds on generalization error for stochastic two-layer two-hidden-unit neural networks via a sensitivity analysis. By optimizing the PAC-Bayes bound directly, we are able to extend their approach and obtain nonvacuous generalization bounds for deep stochastic neural network classifiers with millions of parameters trained on only tens of thousands of examples. We connect our findings to recent and old work on flat minima and MDL-based explanations of generalization. Time permitting, I will discuss recent work on computing even tighter generalization bounds associated with a learning algorithm introduced by Chaudhari et al. (2017), called Entropy-SGD. We show that Entropy-SGD indirectly optimizes a PAC-Bayes bound, but does so by optimizing the "prior" term, violating the hypothesis that the prior be independent of the data. We show how to fix this defect using differential privacy. The result is a new PAC-Bayes bound for data-dependent priors, which we show, up to some approximations, delivers even tighter generalization bounds. Joint work with Gintare Karolina Dziugaite, based on https://arxiv.org/abs/1703.11008

Speaker: Daniel Roy is an Assistant Professor in the Department of Statistical Sciences at the University of Toronto.

November 17, 2017

McGill Statistics Seminar
Toby Hocking
A log-linear time algorithm for constrained changepoint detection

15:30-16:30

BURN 1205
Abstract:

Changepoint detection is a central problem in time series and genomic data. For some applications, it is natural to impose constraints on the directions of changes. One example is ChIP-seq data, for which adding an up-down constraint improves peak detection accuracy, but makes the optimization problem more complicated. In this talk I will explain how a recently proposed functional pruning algorithm can be generalized to solve such constrained changepoint detection problems. Our proposed log-linear time algorithm achieves state-of-the-art peak detection accuracy in a benchmark of several genomic data sets, and is orders of magnitude faster than our previous quadratic time algorithm. Our implementation is available as the PeakSegPDPA function in the PeakSegOptimal R package, https://cran.r-project.org/package=PeakSegOptimal

Speaker: Toby Hocking is a Postdoctoral Fellow in the Department of Human Genetics, McGill University.

November 24, 2017

CRM Colloque de statistique
David R. Bellhouse
150 years (and more) of data analysis in Canada

15:30-16:30

LEA 232
Abstract: As Canada celebrates its 150th anniversary, it may be good to reflect on the past and future of data analysis and statistics in this country. In this talk, I will review the Victorian Statistics Movement and its effect in Canada, data analysis by a Montréal physician in the 1850s, a controversy over data analysis in the 1850s and 60s centred in Montréal, John A. Macdonald’s use of statistics, the Canadian insurance industry and the use of statistics, the beginning of mathematical statistics in Canada, the Fisherian revolution, the influence of Fisher, Neyman and Pearson, the computer revolution, and the emergence of data science.
Speaker: David R. Bellhouse is Professor Emeritus of Statistics and Actuarial Science at Western University, London, Ontario. He is a leading world figure in statistical history.
December 1, 2017
McGill Statistics Seminar
Lei Sun

Fisher’s method revisited: set-based genetic association and interaction studies

15:30-16:30

BURN 1205
Abstract:

Fisher’s method, also known as Fisher’s combined probability test, is commonly used in meta-analyses to combine p-values from the same test applied to K independent samples to evaluate a common null hypothesis. Here we propose to use it to combine p-values from different tests applied to the same sample in two settings: when jointly analyzing multiple genetic variants in set-based genetic association studies, or when jointly capturing main and interaction effects when one of the interacting variables is missing. In the first setting, we show that many existing methods (e.g. the so-called burden test and SKAT) can be classified into a class of linear statistics and another class of quadratic statistics, where each class is powerful only in part of the high-dimensional parameter space. In the second setting, we show that the class of scale-tests for heteroscedasticity can be utilized to indirectly identify unspecified interaction effects, complementing the class of location-tests designed for detecting main effects only. In both settings, we show that the two classes of tests are asymptotically independent of each other under the global null hypothesis. Thus, we can evaluate the significance of the resulting Fisher’s test statistic using the chi-squared distribution with four degrees of freedom; this is a desirable feature for analyzing big data. In addition to analytical results, we provide empirical evidence to show that the new class of joint tests is not only robust but can also have better power than the individual tests. This is based on joint work with former graduate students Andriy Derkach (Derkach et al. 2013, Genetic Epidemiology; Derkach et al. 2014, Statistical Science) and David Soave (Soave et al. 2015, The American Journal of Human Genetics; Soave and Sun 2017, Biometrics).

Speaker: Lei Sun is a Professor in the Department of Statistical Sciences and the Division of Biostatistics, Dalla Lana School of Public Health, at the University of Toronto.
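
Illustration (not part of the talk): the combination step described in the abstract above can be sketched in a few lines of Python. This is a minimal sketch assuming two p-values that are independent under the null, which the abstract states holds only asymptotically for the two test classes; the function name fisher_combine_two is hypothetical. With two p-values the statistic -2(ln p1 + ln p2) is compared with a chi-squared distribution with four degrees of freedom, whose upper tail has the closed form exp(-x/2)(1 + x/2), so no statistics library is needed.

```python
import math
from typing import Tuple


def fisher_combine_two(p1: float, p2: float) -> Tuple[float, float]:
    """Combine two p-values with Fisher's method.

    Assumes the two p-values are independent under the null hypothesis.
    Returns (test statistic, combined p-value).
    """
    if not (0.0 < p1 <= 1.0 and 0.0 < p2 <= 1.0):
        raise ValueError("p-values must lie in (0, 1]")

    # Fisher's statistic: -2 * sum of log p-values.
    stat = -2.0 * (math.log(p1) + math.log(p2))

    # Upper-tail probability of a chi-squared variable with 4 degrees
    # of freedom, in closed form: exp(-x/2) * (1 + x/2).
    combined = math.exp(-stat / 2.0) * (1.0 + stat / 2.0)
    return stat, combined


if __name__ == "__main__":
    # Hypothetical inputs, e.g. one p-value from a linear-type test
    # and one from a quadratic-type test on the same sample.
    stat, p = fisher_combine_two(0.05, 0.05)
    print(f"statistic = {stat:.4f}, combined p-value = {p:.4f}")
```

The closed-form tail is specific to four degrees of freedom; combining K p-values would require the chi-squared distribution with 2K degrees of freedom instead.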

 

Winter Term 2018

Date Event Speaker(s) Title Time Location
January 13, 2018
McGill Statistics Seminar
 

 

15:30-16:30

BURN 1205
Abstract:

 

Speaker:  

January 20, 2018

McGill Statistics Seminar
 
 

15:30-16:30

BURN 1205
Abstract:

 

Speaker:  

January 27, 2018

CRM-SSC Prize 2017 Colloque
 
 

15:30-16:30

ROOM 6254

Pavillon André-Aisenstadt 2920, UdeM

Abstract:

 

Speaker:  

February 3, 2018

McGill Statistics Seminar
 
 

15:30-16:30

BURN 1205
Abstract:

 

Speaker:  
February 10, 2018
McGill Statistics Seminar
 

 

15:30-16:30

BURN 1205
Abstract:

 

Speaker:  

February 17, 2018

McGill Statistics Seminar
 
 

15:30-16:30

BURN 1205
Abstract:

 

Speaker:  

February 24, 2018

McGill Statistics Seminar
 
 

15:30-16:30

BURN 1205
Abstract:

 

Speaker:  

March 10, 2018

McGill Statistics Seminar
 
 

15:30-16:30

BURN 1205
Abstract:

 

Speaker:  
March 17, 2018
McGill Statistics Seminar
 

 

15:30-16:30

BURN 1205
Abstract:

 

Speaker:  

March 24, 2018

 
 
 

15:30-16:30

BURN 1205
Abstract:

 

Speaker:  
March 31, 2018
McGill Statistics Seminar
 

 

15:30-16:30

BURN 1205
Abstract:

 

Speaker:  
April 6, 2018
McGill Statistics Seminar
 

 

 

 
Abstract:

 

Speaker:  

 

Website design: Prof. Johanna G. Nešlehová

Last edited on Fri, 11/24/2017 - 15:54