NGSSC: Computational Methods in Statistics with Applications

This course is organized by the NGSSC Graduate School.

It consists of two parts, both of which will be given at the Department of Information Technology, Uppsala University.

The course work also includes a self-study part, which takes place during Week 35 (August 29 - September 2), 2011.

Review material for Week 35's self study:

To follow the course material, participants should review the following basic concepts and definitions from Statistics and Linear Algebra.

Course materials (to be updated)

Detailed time schedule

Week | Date | Topic(s) | Time | Location | Lecturer
36 | Sep 05 | Introduction. General description of the course | 9:15-9:30 | 1412 | MN
   |        | Computational Statistics - the statistician's point of view | 9:30-11:00 | 1412 | DvR
   |        | Computational Statistics - the numerical analyst's point of view | 11:15-12:00 | 1412 | LE
   |        | The statistical language R and MATLAB - short introduction | 13:15-15:30 | 1412 | MN
   |        | The statistical language R and MATLAB - computer lab exercises | 15:30-17:00 | 1412 | MN

   | Sep 06 | Discussion of the experience from the last computer lab | 9:15-9:30 | 1412 | MN
   |        | Regression analysis, statistical concepts | 9:30-10:30 | 1412 | DvR
   |        | Least Squares and QR factorization | 10:45-12:00 | 1412 | LE
   |        | Multiple regression. Normal equations vs QR factorization. Polynomial regression | 13:00-17:00 | 1412 | LE

   | Sep 07 | Discussion of the experience from the last computer lab | 9:00-9:30 | 1412 | LE
   |        | Regression analysis (cont). Rank deficiency. Singular value decomposition (SVD). Pseudo-inverses. Shrinkage methods. (Cross validation) | 9:30-12:00 | 1412 | LE
   |        | Numerical rank deficiency, collinearity. Application in pattern recognition: classification of handwritten digits (regression) | 13:15-17:00 | 1412 | LE

   | Sep 08 | Discussion of the experience from the last computer lab | 9:15-9:30 | 1412 | LE
   |        | Floating point computations - short introduction, variance example. Graphs and their usage in Statistical applications (page-rank), regression trees and classification trees. Concepts of numerical stability | 9:30-12:00 | 1412 | LE
   |        | Floating point arithmetic - examples of loss of accuracy. Page ranking, the Google matrix | 13:15-17:00 | 1412 | LE

   | Sep 09 | Discussion of the experience from the last computer lab | 9:15-9:30 | 1412 | MN
   |        | Regression problems leading to sparse matrices | 9:30-12:00 | 1412 | DvR
   |        | Sparse matrices - storage formats. Solving least squares problems with sparse matrices: direct and iterative methods. Computing the SVD | 10:45-12:00 | 1412 | MN
   |        | Handling sparse matrices in R and MATLAB. Regression. Some parallelization issues, related to data structure | 13:15-17:00 | 1412 | MN

37 | Sep 12-Oct 16 | Finalize and send all lab reports from the first course week. | | |

38 | Sep 19 | Partial Least Squares | 9:15-12:00 | 1412 | LE
   |        | Partial Least Squares | 13:15-17:00 | 1412 | LE

   | Sep 20 | Discussion of the experience from the last computer lab | 9:15-9:30 | 1412 | MN
   |        | Principal Component Analysis and regression | 9:30-10:30 | 1412 | DvR
   |        | Krylov subspace methods. Eigenvalue computations (large scale, sparse) | 10:45-12:00 | 1412 | MN
   |        | Testing some classical computational methods | 13:15-17:00 | 1412 | MN

   | Sep 21 | Discussion of the experience from the last computer lab | 9:15-9:30 | 1412 | MN
   |        | Random number generators | 9:30-10:15 | 1412 | MN
   |        | Markov chain Monte Carlo methods (MCMC) | 10:30-12:00 | 1412 | DvR
   |        | MCMC | 13:15-17:00 | 1412 | DvR

   | Sep 22 | Discussion of the experience from the last computer lab | 9:15-9:30 | 1412 | DvR, MN
   |        | Parallel computing. Parallel Statistical computing | 9:30-12:00 | 1412 | MN
   |        | Parallel programming in R and MATLAB, examples | 13:15-17:00 | 1412 | MN

   | Sep 23 | Discussion of the experience from the last computer lab | 9:15-9:30 | 1412 | MN
   |        | Summary of the course material and sketch of new problems, methodologies, methods to be considered further | 9:30-12:00 | 1412 | LE, DvR, MN

Recommended books:

  1. Peter Dalgaard, Introductory Statistics with R, Springer, 2002.
  2. W. John Braun and Duncan J. Murdoch, A First Course in Statistical Programming with R, Cambridge University Press, 2007.
  3. Geof H. Givens and Jennifer A. Hoeting, Computational Statistics, Wiley, 2005.
  4. Wendy L. Martinez and Angel R. Martinez, Computational Statistics Handbook with MATLAB, Chapman & Hall/CRC, 2002.
  5. Lars Eldén, Matrix Methods in Data Mining and Pattern Recognition, SIAM, Philadelphia, PA, USA, 2007.

Organizational issues:
Instructions on how to find us in Uppsala are to be found here.
Suggested hotel for booking rooms in Uppsala: Hotel Uppsala.

For NGSSC students only:
Your home department is expected to provide advance payment for travel and housing. After the course has been completed, the costs will be reimbursed by NGSSC through a lump-sum grant of SEK 12 000.


Last changed on September 1, 2011.
Mail to: Maya dot Neytcheva "at" it dot uu dot se