The Department of Statistics and Data Sciences is pleased to announce the line-up for the Fall 2014 SDS Seminar Series. Now in its fifth year, the series gives participants the opportunity to hear from leading scholars and experts working in a range of applied areas, including business, biology, text mining, computer vision, economics, and public health.

The series is envisioned as a vital contribution to the intellectual, cultural, and scholarly environment at The University of Texas at Austin for students, faculty, and the wider community. Each talk is free of charge and open to the public. For more information, contact Sasha Schellenberg.

September 12, 2014 – Ryan P. Adams
(Harvard University, Department of Computer Science)
“Accelerating Exact MCMC with Subsets of Data”
CBA 4.328, 2:00 to 3:00 PM

September 19, 2014 – TBA
(-)
“TBA” 
CBA 4.328, 2:00 to 3:00 PM

September 26, 2014 – Tong Zhang
(Rutgers University, Department of Statistics)
“TBA” 
CBA 4.328, 2:00 to 3:00 PM

October 24, 2014 – TBA
(-)
“TBA” 
CBA 4.328, 2:00 to 3:00 PM

November 14, 2014 – Ming Yuan
(University of Wisconsin-Madison, Department of Statistics)
“TBA”
CBA 4.328, 2:00 to 3:00 PM

November 21, 2014 – Mladen Kolar
(University of Chicago, Booth School of Business)
“TBA”
CBA 4.328, 2:00 to 3:00 PM

Ryan P. Adams (Harvard University, Department of Computer Science)

Title: "Accelerating Exact MCMC with Subsets of Data"

Abstract: One of the challenges of building statistical models for large data sets is balancing the correctness of inference procedures against computational realities.  In the context of Bayesian procedures, the pain of such computations has been particularly acute, as it has appeared that algorithms such as Markov chain Monte Carlo necessarily need to touch all of the data at each iteration in order to arrive at a correct answer.  Several recent proposals have been made to use subsets (or "minibatches") of data to perform MCMC in ways analogous to stochastic gradient descent.  Unfortunately, these proposals have only provided approximations, although in some cases it has been possible to bound the error of the resulting stationary distribution.

In this talk I will discuss two new, complementary algorithms for using subsets of data to perform faster MCMC.  In both cases, these procedures yield stationary distributions that are exactly the desired target posterior distribution.  The first of these, "Firefly Monte Carlo", is an auxiliary variable method that uses randomized subsets of data to achieve valid transition operators, with connections to recent developments in pseudo-marginal MCMC.  The second approach I will discuss, parallel predictive prefetching, uses subsets of data to parallelize Markov chain Monte Carlo across multiple cores, while still leaving the target distribution intact.  Both methods have yielded significant gains in wall-clock performance when sampling from posterior distributions with millions of data points.
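For readers unfamiliar with the cost the abstract refers to, the sketch below is a minimal random-walk Metropolis sampler for a toy Gaussian-mean model, written in Python with NumPy. It is not the speaker's method, and the model, prior, step size, and all names are illustrative assumptions; the point is simply that every evaluation of the unnormalized posterior sums the log-likelihood over all N observations, so each iteration makes at least one full pass over the data.

import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: N observations from a unit-variance Gaussian with unknown mean.
N, true_mu = 100_000, 2.5
x = rng.normal(true_mu, 1.0, size=N)

def log_posterior(mu):
    """Unnormalized log posterior: N(0, 10^2) prior on mu plus the full-data likelihood."""
    log_prior = -0.5 * (mu / 10.0) ** 2
    log_lik = -0.5 * np.sum((x - mu) ** 2)  # touches all N data points on every call
    return log_prior + log_lik

def random_walk_metropolis(n_iters=2_000, step=0.01):
    mu = float(x.mean())               # start near the mode to keep the demo short
    lp = log_posterior(mu)
    samples = []
    for _ in range(n_iters):
        prop = mu + step * rng.normal()
        lp_prop = log_posterior(prop)  # another full pass over the data
        if np.log(rng.uniform()) < lp_prop - lp:
            mu, lp = prop, lp_prop
        samples.append(mu)
    return np.array(samples)

samples = random_walk_metropolis()
print("posterior mean estimate:", samples.mean())

Firefly Monte Carlo and parallel predictive prefetching, as described in the abstract, avoid or parallelize exactly this full-data evaluation while keeping the target posterior exact.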


 

Tong Zhang (Rutgers University, Department of Statistics)

Title: "TBA"

Abstract: TBA

Ming Yuan (University of Wisconsin-Madison, Department of Statistics)

Title: "TBA"

Abstract: TBA

Mladen Kolar (University of Chicago, Booth School of Business)

Title: "TBA"

Abstract: TBA