Technical Reports










Inference for Levy Driven Stochastic Volatility Models Via Adaptive Sequential Monte Carlo (pdf)  (May 2008)

by Ajay Jasra(1), David A. Stephens(2), Arnaud Doucet(3), Theodoros Tsagaris(4)

(1) Institute of Statistical Mathematics, Tokyo
(2) Department of Mathematics and Statistics, McGill University, Montreal
(3) Departments of Statistics and Computer Science, University of British Columbia, Vancouver
(4) Mathematics Institute, McGill University

Abstract: In this paper we investigate simulation and inference for a class of Levy driven stochastic volatility (SV) models. The model comprises a Heston type model with an independent, additive, variance-Gamma process in the returns equation. The infinite activity nature of the driving gamma process can capture the observed behaviour of many financial time series, and a discretized version, fit in a Bayesian manner, has been found to be very useful for modelling equity data. We investigate two aspects associated with this model, and one associated with SV models in general. Firstly, we demonstrate that it is possible to draw exact inference, in the sense of no time-discretization error, from the Bayesian SV model, by using simple results in stochastic calculus and an auxiliary variable representation of the posterior. Secondly, we investigate the effectiveness of the model in capturing the leverage effect in high-frequency (S&P 500) returns data. Finally, to facilitate the first two points, we introduce a fully automated sequential Monte Carlo (SMC) algorithm, which substantially improves over the standard Markov chain Monte Carlo (MCMC) methods in the literature.
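As a rough illustration of what "adaptive" can mean in an SMC setting, the sketch below shows one common ingredient: monitoring the effective sample size (ESS) of the particle weights and resampling only when it drops below a threshold. This is a generic textbook device, not the algorithm of the paper; the function names and the 0.5 threshold are assumptions made for the example.

```python
import math
import random


def normalized_weights(log_weights):
    """Convert unnormalized log-weights to weights summing to 1 (log-sum-exp trick)."""
    m = max(log_weights)
    w = [math.exp(lw - m) for lw in log_weights]
    total = sum(w)
    return [x / total for x in w]


def effective_sample_size(weights):
    """ESS = 1 / sum(w_i^2) for normalized weights; lies between 1 and N."""
    return 1.0 / sum(w * w for w in weights)


def systematic_resample(weights, rng):
    """Return particle indices chosen by systematic resampling."""
    n = len(weights)
    u0 = rng.random() / n  # single uniform draw shared by all strata
    indices, cumulative, i = [], weights[0], 0
    for k in range(n):
        u = u0 + k / n
        while u > cumulative and i < n - 1:
            i += 1
            cumulative += weights[i]
        indices.append(i)
    return indices


def maybe_resample(log_weights, rng, threshold=0.5):
    """Resample only if ESS < threshold * N (the threshold is an assumed choice)."""
    w = normalized_weights(log_weights)
    n = len(w)
    if effective_sample_size(w) < threshold * n:
        return systematic_resample(w, rng), True
    return list(range(n)), False
```

With equal weights no resampling is triggered; when one particle carries almost all of the weight, the ESS is close to 1 and every resampled index points at that particle.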


Complexity in Systems Level Biology and Genetics: Statistical Perspectives (pdf) (April 2008)
To appear as a chapter in Encyclopedia of Complexity and Systems Science, Springer, 2008.

by David A. Stephens

This chapter identifies the challenges posed to biologists, geneticists and other scientists by advances in technology that have made the observation and study of biological systems increasingly possible. High-throughput platforms have made routine the collection of vast amounts of structural and functional data, have provided insights into the working cell, and have helped to explain the role of genetics in common diseases. Associated with the improvements in technology is the need for statistical procedures that extract the biological information from the available data in a coherent fashion and, perhaps more importantly, can quantify the certainty with which conclusions can be made. This chapter outlines a biological hierarchy of structures, functions and interactions that can now be observed, and details the statistical procedures that are necessary for analyzing the resulting data. The chapter has four main sections. The first section details the historical connection between statistics and the analysis of biological and genetic data, and summarizes fundamental concepts in biology and genetics. The second section outlines specific mathematical and statistical methods that are useful in the modelling of data arising in bioinformatics. In sections three and four, two particular issues are discussed in detail: functional genomics via microarray analysis, and metabolomics. Section five identifies some future directions for biological research in which statisticians will play a vital role.



Estimation of Dose-Response Functions for Longitudinal Data (COBRA bepress link) (November 2007)

by Erica E. M. Moodie(1) and David A. Stephens(2)

(1) Department of Epidemiology, Biostatistics and Occupational Health, McGill University
(2) Department of Mathematics and Statistics, McGill University

Abstract: In a longitudinal study of dose-response, the presence of confounding or non-compliance compromises the estimation of the true effect of a treatment. Standard regression methods cannot remove the bias introduced by patient-selected treatment level; that is, they do not permit the estimation of the causal effect of dose. Using an approach based on the Generalized Propensity Score (GPS), a generalization of the classical, binary treatment propensity score, it is possible to construct a balancing score that provides a more meaningful estimation procedure for the true (unconfounded) effect of dose. Previously, the GPS has been applied only in a single interval setting. In this paper, we extend the GPS methodology to the longitudinal setting. The methodology is applied to simulated data and two real data sets: first, we study the Riesby depression data; second, we present an analysis of a recent study, the Monitored Occlusion Treatment of Amblyopia Study (MOTAS), which investigated the dose-response relationship in an ophthalmological setting between occlusion and improvement in visual acuity. MOTAS was revolutionary as it was the first study to accurately measure the occlusion dose received by the child.
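To make the idea of a generalized propensity score concrete, here is a minimal single-interval sketch (not the longitudinal extension developed in the paper). It assumes, purely for illustration, that the dose given one covariate is normal with a linear mean, fits that model by least squares, and evaluates the fitted density at a given dose. The function names and the one-covariate normal model are assumptions of the example.

```python
import math


def fit_dose_model(covariate, dose):
    """Fit dose = intercept + slope * covariate + N(0, sigma2) noise by least squares."""
    n = len(dose)
    mean_x = sum(covariate) / n
    mean_t = sum(dose) / n
    sxx = sum((x - mean_x) ** 2 for x in covariate)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t in zip(covariate, dose))
    slope = sxt / sxx
    intercept = mean_t - slope * mean_x
    residuals = [t - intercept - slope * x for x, t in zip(covariate, dose)]
    sigma2 = sum(r * r for r in residuals) / n  # maximum likelihood variance estimate
    return intercept, slope, sigma2


def generalized_propensity_score(t, x, intercept, slope, sigma2):
    """Fitted conditional density of dose t given covariate x under the normal model."""
    mean = intercept + slope * x
    return math.exp(-((t - mean) ** 2) / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)


# Toy data (made up for the example): one covariate and a continuous dose.
covariate = [0.0, 1.0, 2.0, 3.0]
dose = [1.1, 2.9, 5.2, 6.8]
intercept, slope, sigma2 = fit_dose_model(covariate, dose)
scores = [generalized_propensity_score(t, x, intercept, slope, sigma2)
          for x, t in zip(covariate, dose)]
```

In a GPS analysis, the estimated scores would then enter a model for the response alongside the dose, so that comparisons across dose levels are made among subjects with similar scores.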



Non-Regular Likelihood Inference for Seasonally Persistent Processes (Arxiv link) (September 2007)

by Emma J. McCoy(1), Sofia C. Olhede(2), David A. Stephens(3)

(1) Department of Mathematics, Imperial College London
(2) Department of Statistical Science, University College London
(3) Department of Mathematics and Statistics, McGill University, Montreal, Canada

Summary: The estimation of parameters in the frequency spectrum of a seasonally persistent stationary stochastic process is addressed. For seasonal persistence associated with a pole in the spectrum located away from frequency zero, a new Whittle-type likelihood is developed that explicitly acknowledges the location of the pole. This Whittle likelihood is a large sample approximation to the distribution of the periodogram over a chosen grid of frequencies, and constitutes an approximation to the time-domain likelihood of the data, via the linear transformation of an inverse discrete Fourier transform combined with a demodulation. The new likelihood is straightforward to compute and, as will be demonstrated, has good, yet non-standard, properties. The asymptotic behaviour of the proposed likelihood estimators is studied; in particular, N-consistency of the estimator of the spectral pole location is established. Large finite sample and asymptotic distributions of the score and observed Fisher information are given, and the corresponding distributions of the maximum likelihood estimators are deduced. A study of the small sample properties of the likelihood approximation is provided, and its superior performance to previously suggested methods is shown, as well as agreement with the developed distributional approximations.
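For orientation, the sketch below implements the standard Whittle likelihood on the usual Fourier grid, which is the starting point that the paper modifies; it does not include the demodulation or the pole-aware construction described above. The Gegenbauer-type spectral density is included only as a familiar example of a spectrum with a pole away from frequency zero, and all function names are assumptions of the example.

```python
import cmath
import math


def periodogram(x):
    """Periodogram |sum_t x_t exp(-i*lam*t)|^2 / (2*pi*N) at lam_j = 2*pi*j/N, j = 1..(N-1)//2."""
    n = len(x)
    freqs, values = [], []
    for j in range(1, (n - 1) // 2 + 1):
        lam = 2.0 * math.pi * j / n
        dft = sum(xt * cmath.exp(-1j * lam * t) for t, xt in enumerate(x))
        freqs.append(lam)
        values.append(abs(dft) ** 2 / (2.0 * math.pi * n))
    return freqs, values


def whittle_nll(spectral_density, freqs, pgram):
    """Standard Whittle negative log-likelihood: sum of log f(lam_j) + I(lam_j) / f(lam_j)."""
    return sum(math.log(spectral_density(lam)) + i / spectral_density(lam)
               for lam, i in zip(freqs, pgram))


def white_noise_density(lam, sigma2):
    """Flat spectrum of white noise with variance sigma2."""
    return sigma2 / (2.0 * math.pi)


def gegenbauer_density(lam, sigma2, d, pole):
    """Gegenbauer-type spectrum with a pole at frequency `pole` when d > 0.

    The density is unbounded at lam == pole, so in this sketch the pole must
    not coincide with a grid frequency.
    """
    return sigma2 / (2.0 * math.pi) * abs(2.0 * (math.cos(lam) - math.cos(pole))) ** (-2.0 * d)


# Toy series (made up); for white noise the Whittle estimate of sigma2 has a closed form.
x = [1.0, -2.0, 0.5, 3.0, -1.0, 0.25, 2.0, -0.5]
freqs, pgram = periodogram(x)
sigma2_hat = 2.0 * math.pi * sum(pgram) / len(pgram)
```

Because the standard form evaluates the spectral density on a fixed grid, a pole lying on or near a grid frequency is awkward to handle, which is the kind of difficulty that a likelihood acknowledging the pole location is meant to address.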



Statistical Techniques in Metabolic Profiling (pdf) (December 2006)
To appear in Handbook of Statistical Genetics: 3rd Edition, John Wiley, 2007.

by Maria De Iorio(1), Timothy M. D. Ebbels(2), David A. Stephens(3)

(1) Department of Epidemiology and Public Health, Imperial College London
(2) Biological Chemistry, Division of Surgery, Oncology, Reproductive Biology and Anaesthetics, Imperial College London
(3) Department of Mathematics and Statistics, McGill University, Montreal, Canada

Summary: A summary of processing methods for metabonomic data, focusing on pre-processing rather than more advanced modelling.





Bayesian Inference for a Complex Ecological System: Viability and Reproduction under Finite Resource Constraints with Incomplete Observations. (pdf) (December 2006, Revised July 2008)

This paper is now accepted for publication in JRSSC.


by C. Jessica E. Metcalf(1), David A. Stephens(2), Mark Rees(3), Svata M. Louda(4), and Kathleen H. Keeler(4)

(1) Nicholas School for the Environment, Duke University, Durham, North Carolina, USA
(2) Department of Mathematics and Statistics, McGill University, Montreal, Canada
(3) Department of Animal and Plant Sciences, University of Sheffield, Sheffield, UK
(4) School of Biological Sciences, University of Nebraska, Lincoln, Nebraska, USA


Summary. We address the problem of MCMC analysis of a complex ecological system using a Bayesian inferential approach. We describe a complete likelihood framework for the life history of the wavyleaf thistle, including missing stages and density dependence. We indicate how, in order to make inference on life history transitions involving both missing information and density dependence, the stochastic models underlying each component can be combined to obtain expressions that can be directly sampled. Our approach allows otherwise unobtainable inference on interactions between demographic rates. This innovation and the principles described could be extended to other species featuring such missing stage information, with potential for improving inference relating to a range of ecological or evolutionary questions.


Quantification of the Dose-Response Relationship for a Continuous Treatment in the Presence of Confounding or Informative Non-Compliance (pdf) (Web Supplement) (September 2006)

by Erica E. M. Moodie(1) and David A. Stephens(2)


(1) Department of Epidemiology, Biostatistics and Occupational Health, McGill University
(2) Department of Mathematics and Statistics, McGill University

Summary. In a longitudinal study of dose-response, the presence of confounding or non-compliance compromises the estimation of the true effect of a treatment. Flexibility in modelling confounding is essential in order to capture the treatment effect when the causal model is not fully understood, so that the observed treatment effect is not due to the imposition of a rigid model for the relationship between response, treatment and other covariates. A semiparametric additive linear mixed (SPALM) model (Ruppert et al., 2003) provides a tractable and flexible approach to modelling the influence of potentially confounding variables. However, this approach does not on its own remove the bias introduced by patient-selected treatment level; that is, it does not permit the estimation of the causal effect of dose. Using an approach based on the Generalized Propensity Score (GPS) (Hirano and Imbens, 2004), a generalization of the classical, binary treatment propensity score, it is possible to construct instrumental variables that provide a more meaningful (and less biased) estimation procedure for the true effect of dose. In this paper, we present Bayesian versions of the SPALM model and of the GPS where the propensity score relies on a novel formulation of the treatment density. Bayesian methods are readily implementable and allow cohesive propagation of uncertainty in the models. The methodology is applied to the Monitored Occlusion Treatment of Amblyopia Study (MOTAS), which investigated the dose-response relationship between occlusion and improvement in visual acuity. This analysis quantifies the beneficial effect of occlusion for the first time.

This paper is now available in a different format.








Contact Details:
Professor David A. Stephens

Room 1225, Burnside Hall
Department of Mathematics and Statistics
McGill University


Phone: 514-398-2005
Fax: 514-398-3899
E-mail: David Stephens