This wiki page is about part II of SciNet's Scientific Computing course.
Information on part I can be found on the page Scientific Software Development Course
Information on part III can be found on the page High Performance Scientific Computing

Syllabus

The second part of SciNet's for-credit Scientific Computing course (Phys 2109 modular course credit / Ast 3100 mini-course credit) will start in January 2012, annd will be held in SciNet's conference room on Fridays, Jan 13, 20, 27 and Feb 3, 9:30-11:30am.

Whereas the first part of the course focused on the basics of best practice for writing maintainable, modular scientific programs, this second part will focus on common techniques and algorithms, such as floating point computations, validation+verification, visualization, ODEs, monte Carlo, linear algebra and fast fourier transforms. Available libraries that already implement these will be introduced and used (LAPACK, FFTW, ...).

Students that did not take part I, can take part II provided that they have solid C or C++ skills, and are familiar with the basics of modular, version-controled software development in a linux-like environment, debugging and profiling, and installing libraries. The slides and recordings of part I are available on the page Scientific Software Development Course for students who need a refresher.

At the end of minicourse II, "Numerical Tools for Physical Scientists", students able to do basic, modular, version-controled software development in a linux-like environment, will leave with a basic understanding of numerical aspects of linear algebra, fast fourier transforms, random numbers and ordinary differential equation solvers, and will be able apply these in their own code using external libraries.

The course will require 4-6 hours each week spent on reading and homework.

Required Software

Each lecture will have a hands-on component; students are strongly encouraged to bring laptops. (Contact us if this will be a problem). Windows, Mac, or Linux laptops are all fine, but some software will have to be installed *before* the first lecture:

On windows laptops only, Cygwin (http://www.cygwin.com/) will have to be installed ; ensure that development tools (gcc/g++/gfortran, gdb), git, and the X environment (Xorg) is installed.

On Mac laptops, ensure that the development tools (Xcode) is installed.

On Linux, ensure that packages for the gcc compiler suite (gcc/g++/gfortran), gdb, and git are installed

In addition, the following numerical libraries that will be used in the course should be installed:

BLAS: If available you can use vendor-specific implementations (such as mkl on the GPC cluster). Good free implementations include gotoblas and ATLAS.
LAPACK: Here too, you can use vendor-specific implementations if you can, or use the free implementations at http://www.netlib.org/lapack (this builds upon BLAS).
GNU Scientific Library: http://www.gnu.org/s/gsl (get version 1.15)
FFTW: http://www.fftw.org (get version 3)

It is okay to install these through a package manager if that works on your platform.

Course outline

The classes will cover the material as follows; homework will be due by email at Thursday noon on the day before the following class.

Lecture 5: Modelling, floating point, validation + verification, visualization

Lecture 6: Classic ODE solvers, pseudo random numbers, Monte Carlo.

Lecture 7: Linear algebra - Blas, Lapack, ...

Lecture 8: Fast Fourier Transform

Evaluation will be based entirely on the four home works, with equal weighting given to each.

Location and Dates (CHANGED!)

The location has changed because of the number of students that signed up. This required to shift the time slot by 30 minutes.

Fridays 10:00am - 12:00am

Jan 13, 20, 27 and Feb 3.

Bahen Centre for Information Technology

40 St. George Street

Room 1230

Office Hours

The instructors will have office hours on Monday and Wednesday afternoons, 3pm-4pm, starting the week of the first class.

Location: SciNet offices at 256 McCaul, 2nd Floor.

Mon, Jan 9, 3pm-4pm
Wed, Jan 11, 3pm-4pm
Mon, Jan 16, 3pm-4pm
Wed, Jan 18, 3pm-4pm
Mon, Jan 23, 3pm-4pm
Wed, Jan 25, 3pm-4pm
Mon, Jan 30, 3pm-4pm
Wed, Feb 1, 3pm-4pm
Mon, Feb 6, 3pm-4pm
Wed, Feb 8, 3pm-4pm

Materials from Lectures

Homeworks

Homework following lecture 1

Assignment 1

Consider the sequence of numbers: 1 followed by 10⁸ values of 10^-8
Should sum to 2
Write code which sums up those values in order. What answer does it get?
Add to program routine which sums up values in reverse order. Does it get correct answer?
How would you get correct answer?
Submit code, Makefile, text file with answers.

Assignment 2

Implement an linear congruential generator with a = 106, c = 1283, m = 6075 that generates random numbers from 0..1
Using that and MT: generate 10,000 pairs (dx, dy) with dx, dy each in -0.1 .. +0.1. Generate histograms of dx and dy (say 200 bins). Does it look okay? What would you expect variation to be?
For 10,000 points: take random walks from (x,y)=(0,0) until exceed radius of 2, then stop. Plot histogram of final angles for the two psuedo random number generators. What do you see?
Submit makefile, code, plots, version control log.

Both assignments due at noon on Thursday Jan 19th. Email the files to rzon@scinet.utoronto.ca and ljdursi@scinet.utoronto.ca.

Answer Set for Lecture 1's Homework

A sample set of solutions for HW1

Homework following lecture 2

Assignment 1

Compute numerically (using the GSL):

∫₀³ f(x) dx

(that is the integral of f(x) from x=0 to x=3)

with

f(x) = ln(x) sin(x) e^-x

using three different methods:

Extended Simpsons' rule
Gauss-Legendre quadrature
Monte Carlo sampling

Hint: what is f(0)?

Compare the convergence of these methods by increasing number of function evaluations.

Submit makefile, code, plots, version control log.

Assignment 2

Using an adaptive 4th order Runge-Kutta approach, with a relative accuracy of 1e-4, compute the solution for t = [0,100] of the following set of coupled ODEs (Lorenz oscillator)

dx/dt = σ(y - x)

dy/dt = (ρ-z)x-y

dz/dt = xy - βz

with σ=10; β=8/3; ρ = 28, and with initial conditions

x(0) = 10

y(0) = 20

z(0) = 30

Hint: study the GSL documentation.

Submit makefile, code, plots, version control log.

Homework following lecture 3

Assignment 1: Ax = b

We’ve talked about a matrix arising from the explicit finite difference heat diffusion equation; a related form (the implicit equations) provides the temperature at a much later time from that of an earlier time,

$LaTeX: \begin{pmatrix} 1 \\ -1 & 2 &-1 & \\ <pre> & -1& 2 &-1 \\ & & & \cdots & \\ & & & -1& 2 & -1 \\ & & & & & 1 \\ </pre> \end{pmatrix} x^{n+1} = x^n$

For a 1d grid of size 100 (eg, 100x100 matrix A), using lapack, find the long term evolution of an initial condition where x = 1 at the first zone, and zero everywhere else (hot plate “turns on” in a cold domain).

Plot and explain the results.

Hand in code, makefile, logs, plot, vc log, text file explaining.

You want to use the driver routines for linear solvers . Do solution in double precision (D__SV). Which one should you choose?

In C, you can use the LAPACKE interface to link to the LAPACK libraries

Assignment 2: Linear Least Squares

For 100 points (x = 0, 1, 2..) generate 100 random y values in [0,1]

Fit to a cubic, eg:

$LaTeX: \begin{pmatrix} x_0^3 & x_0^2 & x_0 & 1 \\ x_1^3 & x_1^2 & x_1 & 1 \\ \dots \\ x_n^3 & x_n^2 & x_n & 1 \\ \end{pmatrix} \begin{pmatrix} a\\ b\\ c\\ d\\ \end{pmatrix} <pre>= </pre> \begin{pmatrix} y_0\\ y_1\\ \dots\\ y_n\\ \end{pmatrix}$

Use DGELS

Print fit, plot fit + data

Hand in Makefile, code, plots, VC log.

Assignment 3: Eigenvalues

The time-explicit formulation of the 1d heat diffusion equation has a term that looks like this (ignoring boundary conditions)

$LaTeX: \frac{D \Delta t}{\Delta x^2} \begin{pmatrix} -2 & 1 \\ 1 & -2 & 1 & \\ <pre> & 1& -2 & 1 \\ & & & \cdots & \\ & & & 1 & -2 & 1 \\ & & & & 1 & -2 \\ </pre> \end{pmatrix} x^n$

Ignoring the constants, what are the eigenvalues for this problem - what might we expect to get amplified/damped by this operator? (use 100 points; D__EV)

Plot the eigenmode with the largest and smallest absolute eigenvalues, and explain them.

This system is physically extremely stable; we expect all the eigenvalues of the real physical problem to have absolute values less than one. Numerical stability of this method requires the eigenvalues to have less than one; otherwise, the system blows up purely due to numerical problems. Use the largest absolute eigenvalue to put a constraint on the timestep, Δt given Δx, D. This is a stability constraint on the numerical method.

Hand in Makefile, code, plots, VC log, text file.

Homework following lecture 4

Assignment 1

Trigonometric interpolation uses a n point Fourier series to find values at intermediate points. It is one way of downscaling data, and was a motivation for Gauss, to be applied to planetary motion.

The way it works is:

You fourier-transform your data
You add frequecies above the Nyquist frequency (in absolute values), but set all the amplitudes of the new frequencies to zero.
Note that the frequencies are stored such that eg. f_n-1 is a low frequency -1.
The resulting 2n array can be back transformed, and now gives an interpolated signal.

For this assignment, write an application that will read in an image as a binary file into a 2d double precision array, and creates an image twice the size in all directions using trigonometric interpolation. Use a real-to-real version of the fftw.

Assignment 2

Write an application which reads an image and performs a low pass filter on the image, i.e., any fourier components with magnitudes k larger than n/4 are to be set to zero, after which the fourier inverse is taken and the image is written out to disk again.

Input image

Use this image of Gauss.

Image format:

Use the following simple PPM format:

First line (ascii): 'P3\n'
Second line, in ascii, 'width height\n'
Third line, in ascii, 'maxcolorvalue\n' (this is typically just 255)
Following that, in binary, are byte-triplets with the red, green and blue values of each pixel.
Note: in C, the 'unsigned char' data type matches the concept of a byte best (for most machines anyway).

In fact, between the first and second line, one can have comment lines that start with '#'.

Links

BLAS

LAPACK

http://www.netlib.org/lapack

GSL

GNU Scientific Library: http://www.gnu.org/s/gsl

FFT

FFTW: http://www.fftw.org

Git

Git cheat sheet from Git Tower: http://www.git-tower.com/files/cheatsheet/Git_Cheat_Sheet_grey.pdf

Python

Enthought python distribution: http://www.enthought.com/products/edudownload.php
Intro to python from software carpentry: http://software-carpentry.org/4_0/python
Tutorial on matplotlib: http://conference.scipy.org/scipy2011/tutorials.php#jonathan

Other Resources

What Every Computer Scientist Should Know About Floating-Point Arithmetic - the classic (and extremely comprehensive) overview of the basics of floating point math. The first few pages, in particular, are very useful.
Random Numbers In Scientific Computing: An Introduction by Katzgraber. A very lucid discussion of pseudo random number generators for science.

Numerical Tools for Physical Scientists (course)

Contents

Syllabus

Required Software

Course outline

Location and Dates (CHANGED!)

Office Hours

Materials from Lectures

Homeworks

Homework following lecture 1

Homework following lecture 2

Homework following lecture 3

Homework following lecture 4

Links

BLAS

LAPACK

GSL

FFT

Git

Python

Other Resources

Navigation menu

Search