Matias Quiroz joined the School of Mathematical and Physical Sciences at UTS in May 2019. He was previously a Postdoctoral Fellow at the University of New South Wales. He holds a Ph.D. in Statistics, awarded by Stockholm University in 2015.
Matias is interested in computational statistics, with a particular emphasis on Bayesian methods. He works on Markov chain Monte Carlo methods for large datasets and, more recently, variational approximations for complex high-dimensional models.
Visit www.matiasquiroz.com for up-to-date information regarding publications and working papers.
Matias is an Associate Investigator in the ARC Centre of Excellence for Mathematical and Statistical Frontiers. He is also a Council Member of the Statistical Society of Australia's NSW branch, where he is the Young Statistician representative.
Can supervise: YES
Matias Quiroz is a committed Bayesian, which is reflected in the research he pursues. Matias works on developing methodology for Bayesian model estimation along two lines of research. The first is fast Markov chain Monte Carlo methods for models whose likelihoods are costly to evaluate, for example in large-data settings or when the model itself has an intractable component. The second is variational inference algorithms for complex statistical models.
Matias is passionate about teaching statistics and has taught courses at all academic levels.
Adams, MP, Koh, EJY, Vilas, MP, Collier, CJ, Lambert, VM, Sisson, SA, Quiroz, M, McDonald-Madden, E, McKenzie, LJ & O'Brien, KR 2020, 'Predicting seagrass decline due to cumulative stressors', Environmental Modelling and Software, vol. 130.
© 2020 Elsevier Ltd Seagrass ecosystems are increasingly subjected to multiple interacting stressors, making the consequent trajectories difficult to predict. Here, we present a new process-based model of seagrass decline in response to cumulative light and temperature stress. The model is calibrated to laboratory datasets for Great Barrier Reef seagrasses using Bayesian inference. Our model, which is fit to both physiological and morphological data, supports the hypothesis that physiological carbon loss rate controls the shoot density decline rate of seagrasses. The model predicts the time to complete shoot loss, and a new, generalisable, cumulative stress index that indicates the potential seagrass shoot density decline based on the time period of cumulative stress. All model predictions include uncertainty estimates based on uncertainty in the model fit to the data. The calibrated model is packaged into a computer program that can forecast the potential declines of seagrasses due to cumulative light and temperature stress.
Xu, M, Quiroz, M, Kohn, R & Sisson, SA 2020, 'Variance reduction properties of the reparameterization trick', AISTATS 2019 - 22nd International Conference on Artificial Intelligence and Statistics.
© Copyright 2019 by the author(s). The reparameterization trick is widely used in variational inference as it yields more accurate estimates of the gradient of the variational objective than alternative approaches such as the score function method. Although there is overwhelming empirical evidence in the literature showing its success, there is relatively little research exploring why the reparameterization trick is so effective. We explore this under the idealized assumptions that the variational approximation is a mean-field Gaussian density and that the log of the joint density of the model parameters and the data is a quadratic function that depends on the variational mean. From this, we show that the marginal variances of the reparameterization gradient estimator are smaller than those of the score function gradient estimator. We apply the result of our idealized analysis to real-world examples.
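The variance gap between the two estimators is easy to see numerically in a toy setting (simpler than the paper's quadratic-log-density analysis): for the objective E[z^2] with z ~ N(mu, sigma^2), both the score-function and reparameterization estimators are unbiased for d/dmu E[z^2] = 2*mu, but their variances differ by an order of magnitude. A minimal NumPy sketch, with all values illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 1.5, 1.0, 100_000   # illustrative variational mean/scale, sample size

# Objective: E[f(z)] with f(z) = z^2 and z ~ N(mu, sigma^2); d/dmu E[z^2] = 2*mu.
eps = rng.standard_normal(n)
z = mu + sigma * eps

# Score-function (REINFORCE) estimator: f(z) * d/dmu log N(z | mu, sigma^2).
score_grads = z**2 * (z - mu) / sigma**2

# Reparameterization estimator: d/dmu f(mu + sigma*eps) = 2*(mu + sigma*eps).
reparam_grads = 2 * z

print(score_grads.mean(), reparam_grads.mean())   # both ≈ 2*mu = 3.0
print(score_grads.var(), reparam_grads.var())     # reparameterization variance is far smaller
```

Both sample means agree with the true gradient, while the score-function estimator's variance is roughly an order of magnitude larger here, consistent with the qualitative conclusion of the paper.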
Dang, KD, Quiroz, M, Kohn, R, Tran, MN & Villani, M 2019, 'Hamiltonian Monte Carlo with energy conserving subsampling', Journal of Machine Learning Research, vol. 20, pp. 1-31.
© 2019 Khue-Dung Dang, Matias Quiroz, Robert Kohn, Minh-Ngoc Tran, Mattias Villani. Hamiltonian Monte Carlo (HMC) samples efficiently from high-dimensional posterior distributions with proposed parameter draws obtained by iterating on a discretized version of the Hamiltonian dynamics. The iterations make HMC computationally costly, especially in problems with large data sets, since it is necessary to compute posterior densities and their derivatives with respect to the parameters. Naively computing the Hamiltonian dynamics on a subset of the data causes HMC to lose its key ability to generate distant parameter proposals with high acceptance probability. The key insight in our article is that efficient subsampling HMC for the parameters is possible if both the dynamics and the acceptance probability are computed from the same data subsample in each complete HMC iteration. We show that this is possible to do in a principled way in an HMC-within-Gibbs framework where the subsample is updated using a pseudo-marginal MH step and the parameters are then updated using an HMC step, based on the current subsample. We show that our subsampling methods are fast and compare favorably to two popular sampling algorithms that use gradient estimates from data subsampling. We also explore the current limitations of subsampling HMC algorithms by varying the quality of the variance reducing control variates used in the estimators of the posterior density and its gradients.
Quiroz, M, Kohn, R, Villani, M & Tran, M-N 2019, 'Speeding Up MCMC by Efficient Data Subsampling', Journal of the American Statistical Association, vol. 114, no. 526, pp. 831-843.
Quiroz, M, Tran, M-N, Villani, M & Kohn, R 2018, 'Speeding up MCMC by Delayed Acceptance and Data Subsampling', Journal of Computational and Graphical Statistics, vol. 27, no. 1, pp. 12-22.
© 2018, Indian Statistical Institute. The rapid development of computing power and efficient Markov Chain Monte Carlo (MCMC) simulation algorithms have revolutionized Bayesian statistics, making it a highly practical inference method in applied work. However, MCMC algorithms tend to be computationally demanding, and are particularly slow for large datasets. Data subsampling has recently been suggested as a way to make MCMC methods scalable to massive datasets, utilizing efficient sampling schemes and estimators from the survey sampling literature. These developments tend to be unknown to many survey statisticians, who traditionally work with non-Bayesian methods and rarely use MCMC. Our article explains the idea of data subsampling in MCMC by reviewing one strand of work, Subsampling MCMC, a so-called pseudo-marginal MCMC approach to speeding up MCMC through data subsampling. The review is written for a survey statistician without previous knowledge of MCMC methods, since our aim is to motivate survey sampling experts to contribute to the growing Subsampling MCMC literature.
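The pseudo-marginal subsampling idea reviewed above can be sketched in a few lines. The toy below is illustrative, not the article's exact estimator (which also corrects for the bias incurred when exponentiating a noisy log-likelihood estimate): it runs random-walk Metropolis-Hastings on the rate of Poisson data, estimating the full-data log-likelihood from a small subsample with first-order Taylor control variates as the variance-reducing device; all tuning values are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 10_000, 100                       # full data size, subsample size (illustrative)
y = rng.poisson(3.0, size=n)             # synthetic Poisson data

# Control variates: first-order Taylor expansion of each log-density term
# l_i(lam) = y_i*log(lam) - lam around a fixed central value lam0 (here the
# sample mean, standing in for a pilot estimate of the posterior mode).
lam0 = y.mean()
const = np.sum(y * np.log(lam0) - lam0)  # sum of l_i(lam0), dropping log(y_i!) constants
grad = np.sum(y / lam0 - 1.0)            # sum of dl_i/dlam at lam0

def loglik_hat(lam):
    """Difference estimator: sum of control variates + scaled subsample correction."""
    idx = rng.choice(n, size=m, replace=False)
    q = y[idx] * np.log(lam0) - lam0 + (y[idx] / lam0 - 1.0) * (lam - lam0)
    l = y[idx] * np.log(lam) - lam
    return const + grad * (lam - lam0) + (n / m) * np.sum(l - q)

# Pseudo-marginal random-walk MH on lam (flat prior): the likelihood estimate
# for the current state is stored and reused, never recomputed.
lam, ll = lam0, loglik_hat(lam0)
draws = []
for _ in range(5000):
    prop = lam + 0.03 * rng.standard_normal()
    if prop > 0:
        ll_prop = loglik_hat(prop)
        if np.log(rng.uniform()) < ll_prop - ll:
            lam, ll = prop, ll_prop
    draws.append(lam)

print(np.mean(draws[1000:]))  # posterior mean, close to y.mean() ≈ 3.0
```

The control variates are what make the 1%-sized subsample workable: without them, the variance of the scaled subsample sum would be far too large for the chain to mix.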
Gunawan, D, Dang, K-D, Quiroz, M, Kohn, R & Tran, M-N, 'Subsampling Sequential Monte Carlo for Static Bayesian Models'.
We show how to speed up Sequential Monte Carlo (SMC) for Bayesian inference in large data problems by data subsampling. SMC sequentially updates a cloud of particles through a sequence of distributions, beginning with a distribution that is easy to sample from, such as the prior, and ending with the posterior distribution. Each update of the particle cloud consists of three steps: reweighting, resampling, and moving. In the move step, each particle is moved using a Markov kernel; this is typically the most computationally expensive part, particularly when the dataset is large. It is crucial to have an efficient move step to ensure particle diversity. Our article makes two important contributions. First, in order to speed up the SMC computation, we use an approximately unbiased and efficient annealed likelihood estimator based on data subsampling. The subsampling approach is more memory efficient than the corresponding full data SMC, which is an advantage for parallel computation. Second, we use a Metropolis within Gibbs kernel with two conditional updates. A Hamiltonian Monte Carlo update makes distant moves for the model parameters, and a block pseudo-marginal proposal is used for the particles corresponding to the auxiliary variables for the data subsampling. We demonstrate both the usefulness and limitations of the methodology for estimating four generalized linear models and a generalized additive model with large datasets.
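The reweight/resample/move cycle described in the abstract above can be sketched for a toy conjugate model (a Gaussian mean with known variance). This is a plain annealed SMC sampler without the paper's subsampling or block pseudo-marginal machinery, and all tuning values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
y = rng.normal(2.0, 1.0, size=500)       # data with unknown mean, known unit variance
N = 1000                                 # number of particles (illustrative)

def loglik(theta):
    """Vectorised log-likelihood of each particle, up to an additive constant."""
    return -0.5 * np.sum((y[None, :] - theta[:, None]) ** 2, axis=1)

theta = rng.normal(0.0, 10.0, size=N)    # particles drawn from the N(0, 10^2) prior
temps = np.linspace(0.0, 1.0, 21)        # annealing schedule from prior to posterior
for g0, g1 in zip(temps[:-1], temps[1:]):
    # Reweight: incremental weight is the likelihood to the power (g1 - g0).
    logw = (g1 - g0) * loglik(theta)
    w = np.exp(logw - logw.max())
    w /= w.sum()
    # Resample particles in proportion to their weights.
    theta = theta[rng.choice(N, size=N, p=w)]
    # Move: one random-walk MH step per particle targeting the tempered posterior
    # prior(theta) * likelihood(theta)^g1, with a step scaled to its spread.
    step = 1.0 / np.sqrt(1.0 + g1 * y.size)
    prop = theta + step * rng.standard_normal(N)
    log_acc = g1 * (loglik(prop) - loglik(theta)) + (theta**2 - prop**2) / 200.0
    accept = np.log(rng.uniform(size=N)) < log_acc
    theta = np.where(accept, prop, theta)

print(theta.mean())  # close to the posterior mean, here approximately y.mean()
```

In the paper the move step is the expensive part; replacing the exact `loglik` call with a subsampled estimator is precisely where the speedup described above comes from.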
Quiroz, M, Tran, M-N, Villani, M, Kohn, R & Dang, K-D, 'The block-Poisson estimator for optimally tuned exact subsampling MCMC'.
Speeding up Markov Chain Monte Carlo (MCMC) for datasets with many observations by data subsampling has recently received considerable attention. A pseudo-marginal MCMC method is proposed that estimates the likelihood by data subsampling using a block-Poisson estimator. The estimator is a product of Poisson estimators, allowing us to update a single block of subsample indicators in each MCMC iteration so that a desired correlation is achieved between the logs of successive likelihood estimates. This is important since pseudo-marginal MCMC with positively correlated likelihood estimates can use substantially smaller subsamples without adversely affecting the sampling efficiency. The block-Poisson estimator is unbiased but not necessarily positive, so the algorithm runs the MCMC on the absolute value of the likelihood estimator and uses an importance sampling correction to obtain consistent estimates of the posterior mean of any function of the parameters. Our article derives guidelines to select the optimal tuning parameters for our method and shows that it compares very favourably to regular MCMC without subsampling, and to two other recently proposed exact subsampling approaches.
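The Poisson-estimator building block of the abstract above can be illustrated numerically: with K ~ Poisson(lam) and independent unbiased subsampling estimates psi_hat of psi, the quantity exp(a + lam) * prod_{j<=K} (psi_hat_j - a)/lam is unbiased for exp(psi) but can be negative, which is what forces the absolute-value-plus-importance-sampling correction described above. A sketch with illustrative parameters (in practice the shift a would come from a bound or pilot estimate, since psi is unknown):

```python
import numpy as np

rng = np.random.default_rng(4)
d = rng.normal(0.002, 0.01, size=5000)   # stand-in per-observation log-likelihood terms
psi = d.sum()                            # target: estimate exp(psi) unbiasedly

# Illustrative tuning; a is set relative to psi only for the demo.
lam, a, m = 5.0, psi - 2.0, 500          # Poisson rate, shift, subsample size

def psi_hat():
    """Unbiased subsampling estimate of psi (scaled subsample sum)."""
    idx = rng.choice(d.size, size=m, replace=True)
    return (d.size / m) * d[idx].sum()

def poisson_estimator():
    """Unbiased, but possibly negative, estimator of exp(psi)."""
    K = rng.poisson(lam)
    est = np.exp(a + lam)
    for _ in range(K):                   # product of K independent factors
        est *= (psi_hat() - a) / lam
    return est

ests = np.array([poisson_estimator() for _ in range(20_000)])
print(ests.mean(), np.exp(psi))          # the two should be close
print((ests < 0).mean())                 # a nontrivial fraction of draws is negative
```

The block-Poisson estimator of the paper groups such factors into blocks and refreshes only one block's subsample indicators per MCMC iteration, which is what induces the desired correlation between successive log-likelihood estimates.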
Salomone, R, Quiroz, M, Kohn, R, Villani, M & Tran, M-N, 'Spectral Subsampling MCMC for Stationary Time Series'.
Bayesian inference using Markov Chain Monte Carlo (MCMC) on large datasets has developed rapidly in recent years. However, the underlying methods are generally limited to relatively simple settings where the data have specific forms of independence. We propose a novel technique for speeding up MCMC for time series data by efficient data subsampling in the frequency domain. For several challenging time series models, we demonstrate a speedup of up to two orders of magnitude while incurring negligible bias compared to MCMC on the full dataset. We also propose alternative control variates for variance reduction based on data grouping and coreset constructions.
Tran, M-N, Kohn, R, Quiroz, M & Villani, M, 'The Block Pseudo-Marginal Sampler'.
The pseudo-marginal (PM) approach is increasingly used for Bayesian inference in statistical models, where the likelihood is intractable but can be estimated unbiasedly. Examples include random effect models, state-space models and data subsampling in big-data settings. Deligiannidis et al. (2016) show how the PM approach can be made much more efficient by correlating the underlying Monte Carlo (MC) random numbers used to form the estimate of the likelihood at the current and proposed values of the unknown parameters. Their approach greatly speeds up the standard PM algorithm, as it requires a much smaller number of samples or particles to form the optimal likelihood estimate. Our paper presents an alternative implementation of the correlated PM approach, called the block PM, which divides the underlying random numbers into blocks so that the likelihood estimates for the proposed and current values of the parameters only differ by the random numbers in one block. We show that this implementation of the correlated PM can be much more efficient for some specific problems than the implementation in Deligiannidis et al. (2016); for example when the likelihood is estimated by subsampling or the likelihood is a product of terms each of which is given by an integral which can be estimated unbiasedly by randomised quasi-Monte Carlo. Our article provides methodology and guidelines for efficiently implementing the block PM. A second advantage of the block PM is that it provides a more direct way to control the correlation between the logarithms of the estimates of the likelihood at the current and proposed values of the parameters than the implementation in Deligiannidis et al. (2016). We obtain methods and guidelines for selecting the optimal number of samples based on idealized but realistic assumptions.
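The effect of blocking on the correlation of log-likelihood estimates can be checked directly: if only one of G blocks of the underlying random numbers is refreshed, successive estimates share the remaining G-1 blocks and their logs are correlated at roughly 1 - 1/G. A toy sketch where the estimator and all sizes are illustrative (the MC estimator below targets a quantity whose true value is 1):

```python
import numpy as np

rng = np.random.default_rng(5)
N, G = 1000, 100                 # MC samples forming one estimate, number of blocks
theta = 1.0

def loglik_hat(u):
    """Log of an unbiased MC estimator: E[exp(theta*u - theta^2/2)] = 1 for u ~ N(0,1)."""
    return np.log(np.mean(np.exp(theta * u - theta**2 / 2)))

reps = 2000
full, block = np.empty((reps, 2)), np.empty((reps, 2))
for r in range(reps):
    u = rng.standard_normal(N)
    # Independent refresh: all random numbers redrawn at the proposed value.
    v = rng.standard_normal(N)
    full[r] = loglik_hat(u), loglik_hat(v)
    # Block refresh: only one of the G blocks is redrawn.
    w = u.copy()
    j = rng.integers(G)
    w[j * (N // G):(j + 1) * (N // G)] = rng.standard_normal(N // G)
    block[r] = loglik_hat(u), loglik_hat(w)

print(np.corrcoef(full.T)[0, 1])    # near 0
print(np.corrcoef(block.T)[0, 1])   # near 1 - 1/G = 0.99
```

High correlation between successive log-likelihood estimates is what lets the block PM sampler tolerate much noisier (hence cheaper) estimates without degrading mixing.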
Quiroz, M, Nott, DJ & Kohn, R, 'Gaussian variational approximation for high-dimensional state space models'.
Our article considers a Gaussian variational approximation of the posterior density in a high-dimensional state space model. The variational parameters to be optimized are the mean vector and the covariance matrix of the approximation. The number of parameters in the covariance matrix grows as the square of the number of model parameters, so it is necessary to find simple yet effective parameterizations of the covariance structure when the number of model parameters is large. We approximate the joint posterior distribution over the high-dimensional state vectors by a dynamic factor model, having Markovian time dependence and a factor covariance structure for the states. This gives a reduced description of the dependence structure for the states, as well as a temporal conditional independence structure similar to that in the true posterior. The usefulness of the approach is illustrated for prediction in two high-dimensional applications that are challenging for Markov chain Monte Carlo sampling. The first is a spatio-temporal model for the spread of the Eurasian Collared-Dove across North America; the second is a Wishart-based multivariate stochastic volatility model for financial returns.