Exercises for Tue Feb 6
1. On Jan 30 I went through some basic properties of the Dirichlet process, including existence, large support, the posterior, the marginal distribution of a new sample point, etc. We also discussed the war-and-peace dataset, cf. Nils's FocuStat Blog Post. Use my "com2a" to look more into this, with Dirichlet process priors for the two cumulatives for the battle deaths before and after Vietnam.
2. Next week, still in pre-Olympian modus, we discuss aspects of Ch 2, including density estimation and clustering models.
3. Soon I'll TeX up some Nils Exercises and Lecture Notes (but not this week).
4. Information regarding curriculum and exam project: next week.
5. Exercises for next week are as follows.
(i) Do some more analyses for the war data, including inventing and examining your own interest parameter \delta(F_L, F_R), and with a reasonable setup for the two priors.
(ii) Let P be a Dir(a P_0) on [0,1], with P_0 the uniform, and let \theta = \int x d P(x) be the ensuing random mean. Find the mean and variance of \theta. Then try a couple of ways for simulating \theta values from its distribution. Among your methods should be the "stick-breaking" one from the book's Ch 2 (the Sethuraman representation).
(iii) Suppose again that P is from a Dir(a P_0) with P_0 the uniform on [0,1], and take a = 1 for concreteness. Simulate a sample X_1, ..., X_{100} from the random P, using formulae from the last lecture or from the book's Ch 2. How many ties are there, or, equivalently, what is D_{100}, the number of distinct values?
(iv) Now the idea is to reuse the construction in (iii), but to view the simulated values as *parameters in a bigger model*, rather than as data themselves. So let Y_i be N(\theta_i,1), for i = 1, ..., 100, where the \theta_i stem from a P drawn from the Dir(a P_0), with a = 1 and P_0 the standard normal. Simulate such a set of Y_i, and see if you can spot clusters.
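As a possible starting point for the simulations in (ii) to (iv), here is a minimal Python sketch; it is not the course's "com2a" script, and all function names in it are hypothetical. It assumes a truncated stick-breaking scheme for (ii), where the tolerance `tol` and the cap `max_atoms` are illustrative choices, and the Polya urn (Blackwell-MacQueen) scheme for (iii) and (iv), which may differ from the formulae used in the lecture. The seed and the number of replications are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2024)  # arbitrary seed, for reproducibility only


def uniform01(rng):
    """One draw from P_0 = uniform on [0, 1]."""
    return rng.random()


def std_normal(rng):
    """One draw from P_0 = standard normal."""
    return rng.standard_normal()


def stick_breaking_mean(a, rng, base_sampler, tol=1e-10, max_atoms=100_000):
    """One draw of theta = int x dP(x) for P ~ Dir(a P_0).

    Uses stick-breaking, truncated once the unassigned stick length
    drops below tol (an assumption of this sketch, not an exact method).
    """
    remaining = 1.0  # stick length not yet assigned to any atom
    theta = 0.0
    for _ in range(max_atoms):
        if remaining <= tol:
            break
        v = rng.beta(1.0, a)          # fraction broken off the remaining stick
        theta += remaining * v * base_sampler(rng)
        remaining *= 1.0 - v
    return theta


def polya_urn_sample(n, a, rng, base_sampler):
    """Draw X_1, ..., X_n from a random P ~ Dir(a P_0), with P integrated out."""
    x = np.empty(n)
    for i in range(n):
        # With probability a / (a + i) take a fresh value from P_0,
        # otherwise copy one of the i previous values chosen uniformly.
        if rng.random() < a / (a + i):
            x[i] = base_sampler(rng)
        else:
            x[i] = x[rng.integers(i)]
    return x


a = 1.0

# (ii) simulated values of the random mean, P_0 uniform on [0, 1]
thetas = np.array([stick_breaking_mean(a, rng, uniform01) for _ in range(2000)])
print("simulated mean and variance of theta:", thetas.mean(), thetas.var(ddof=1))

# (iii) a sample of size 100 and its number of distinct values
x = polya_urn_sample(100, a, rng, uniform01)
d100 = len(np.unique(x))
print("D_100 =", d100)

# (iv) the theta_i as parameters, with P_0 standard normal and Y_i ~ N(theta_i, 1)
theta_i = polya_urn_sample(100, a, rng, std_normal)
y = theta_i + rng.standard_normal(100)
print("number of distinct theta_i:", len(np.unique(theta_i)))
```

The simulated moments in (ii) can be compared with the ones you derive by hand, and a histogram or sorted plot of `y` is one way to look for clusters in (iv). Passing the base distribution as a function keeps the same urn code usable for both the uniform and the normal P_0.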