<!--
.. title: Hassett, chapter 10: Multivariate distributions
.. date:
.. description: notes on Hassett and Stewart, chapter 10
.. type: text
.. has_math: true
-->

{{% hassett_navigation %}}

[TOC]

# 10.1 Joint distributions for discrete random variables

> Consider an investor who owns two things $x$ and $y$.  In a year their
> values have these probabilities:
>
> | $y$ | $x=90$ | $x=100$ | $x=110$ |
> |----:|-------:|--------:|--------:|
> |   0 |   0.05 |    0.27 |    0.18 |
> |  10 |   0.15 |    0.33 |    0.02 |
>
> Joint probability is $p(x,y) = P(X=x,Y=y)$.

An example where the probability distribution is a formula:

> Suppose $X$ is accidents per day in town $A$, and $Y$ is accidents
> per day in town $B$.  Perhaps
>
> $$\begin{aligned}
p(x,y)
    &= \frac{e^{-2}}{x!y!}
        & \text{for } x,y \in \mathbb N
\end{aligned}$$
>
> The probability there's one accident in $A$ and two in $B$ is
>
>$$\begin{aligned}
p(1,2)
    &= \frac{e^{-2}}{1!2!} \approx 0.068
\end{aligned}$$

Note that this is the product of two independent Poisson distributions
with mean $\lambda=1$.
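
A quick numerical check of that claim, as a minimal sketch. The helper names `poisson_pmf` and `joint_pmf` are mine, not the book's.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of exactly k events when the mean is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def joint_pmf(x: int, y: int) -> float:
    """Joint pmf from the example: e^{-2} / (x! y!)."""
    return math.exp(-2) / (math.factorial(x) * math.factorial(y))


# p(1, 2) from the formula, and from the product of two Poisson(1) pmfs.
p_joint = joint_pmf(1, 2)
p_product = poisson_pmf(1, 1.0) * poisson_pmf(2, 1.0)

print(round(p_joint, 3))                  # 0.068
print(math.isclose(p_joint, p_product))   # True
```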

## Marginal distributions

From the joint distributions we can find the individual ones.
Starting from the table above,

$$\begin{aligned}
P(X=90)
     &= P(X=90, Y=0) + P(X=90, Y=10)
\\\\ &= 0.05 + 0.15 = 0.20
\end{aligned}$$

We can add sums if we like to find the "marginal distributions":

| $y$    | $x=90$ | $x=100$ | $x=110$ | $p(y)$ |
|:------:|:------:|:-------:|:-------:|:------:|
| 0      | 0.05   | 0.27    | 0.18    | 0.50   |
| 10     | 0.15   | 0.33    | 0.02    | 0.50   |
| $p(x)$ | 0.20   | 0.60    | 0.20    |        |

$$\begin{aligned}
p_X(x)  &= \sum_y p(x,y) \\\\
p_Y(y)  &= \sum_x p(x,y) \\\\
\end{aligned}$$
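
The same sums can be sketched in Python. Storing the table as a dictionary keyed by `(x, y)` is my own choice of representation, and the function names are hypothetical.

```python
# Joint probabilities p(x, y) from the table, keyed by (x, y).
joint = {
    (90, 0): 0.05, (100, 0): 0.27, (110, 0): 0.18,
    (90, 10): 0.15, (100, 10): 0.33, (110, 10): 0.02,
}


def marginal_x(joint):
    """p_X(x): sum the joint probabilities over y."""
    out = {}
    for (x, _y), p in joint.items():
        out[x] = out.get(x, 0.0) + p
    return out


def marginal_y(joint):
    """p_Y(y): sum the joint probabilities over x."""
    out = {}
    for (_x, y), p in joint.items():
        out[y] = out.get(y, 0.0) + p
    return out


p_x = marginal_x(joint)
p_y = marginal_y(joint)

# Round for display only; the raw sums carry floating-point noise.
print({x: round(p, 2) for x, p in p_x.items()})  # {90: 0.2, 100: 0.6, 110: 0.2}
print({y: round(p, 2) for y, p in p_y.items()})  # {0: 0.5, 10: 0.5}
```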

# 10.2 (exclude) Joint distributions for continuous random variables

For single variables, continuous distributions obey

* Non-negative, $f(x) \geq 0$ for all $x$
* Normalized, $\int_{-\infty}^\infty f(x) \mathrm dx = 1 $
* $f=\frac{\mathrm dP}{\mathrm dx}$, or $P(a<X<b) = \int_a^b f(x)\mathrm dx$

Similarly,

* Non-negative, $f(x,y) \geq 0$ for all $x,y$
* Normalized, $\int_{-\infty}^\infty \int_{-\infty}^\infty f(x,y)
  \  \mathrm dx\ \mathrm dy = 1 $
* $f=\frac{\mathrm d^2P}{\mathrm dx \ \mathrm dy}$, or
  $$\begin{aligned}
  P(a<X<b, c<Y<d) &=  \int_a^b\mathrm dx \int_c^d\mathrm dy\ f(x,y)
  \end{aligned}$$

Marginal distributions are

$$\begin{aligned}
f_X(x) &= \int_{-\infty}^\infty \mathrm dy\ f(x,y) \\\\
f_Y(y) &= \int_{-\infty}^\infty \mathrm dx\ f(x,y) \\\\
\end{aligned}$$

# 10.3 Conditional distributions
<!-- (exclude 10.3.2, 10.3.3 continuous) -->

Discrete conditionals.

> Let's look at our table again:
>
> | $y$    | $x=90$ | $x=100$ | $x=110$ | $p(y)$ |
> |:------:|:------:|:-------:|:-------:|:------:|
> | 0      | 0.05   | 0.27    | 0.18    | 0.50   |
> | 10     | 0.15   | 0.33    | 0.02    | 0.50   |
> | $p(x)$ | 0.20   | 0.60    | 0.20    |        |
>
> If $Y=0$, we can find
> $$\begin{aligned}
>     P(X | Y=0) &= \frac{P(X, Y=0)}{P(Y=0)}
>     \\\\    &= \frac{P(\text{top row})}{0.50}
> \end{aligned}$$
>
> expanding to
>
> | $y$ | $x=90$ | $x=100$ | $x=110$ | $p(y)$ |
> |:---:|:------:|:-------:|:-------:|:------:|
> | 0   | 0.10   | 0.54    | 0.36    | 1.00   |
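
Conditioning on a row is just "pick the row, then renormalize". A minimal sketch, with the table stored as a dictionary keyed by `(x, y)` (my representation) and a hypothetical helper name:

```python
# Joint probabilities p(x, y) from the table, keyed by (x, y).
joint = {
    (90, 0): 0.05, (100, 0): 0.27, (110, 0): 0.18,
    (90, 10): 0.15, (100, 10): 0.33, (110, 10): 0.02,
}


def conditional_x_given_y(joint, y):
    """p(x | y) = p(x, y) / p_Y(y) for one fixed value of y."""
    p_y = sum(p for (_x, yy), p in joint.items() if yy == y)
    return {x: p / p_y for (x, yy), p in joint.items() if yy == y}


cond = conditional_x_given_y(joint, 0)
print({x: round(p, 2) for x, p in cond.items()})  # {90: 0.1, 100: 0.54, 110: 0.36}
```

The conditional probabilities sum to one, as any distribution must.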

Likewise for continuous probability distributions:

> Let $X$ be sick leave hours last year and $Y$ be sick leave hours this
> year, with $$\begin{aligned}
f(x,y) &= 2 - 1.2x - 0.8y \\\\
f_X(x) &= 1.6 - 1.2x \\\\
f_Y(y) &= 1.4 - 0.8y \\\\
\end{aligned}$$
> over $x,y$ within $[0,1]$.
>
> So $$\begin{aligned}
f(x|y) &= \frac{f(x,y)}{f_Y(y)}
    &&= \frac{2 - 1.2x - 0.8y}{1.4 - 0.8y}
\\\\
f(y|x) &= \frac{f(x,y)}{f_X(x)}
    &&= \frac{2 - 1.2x - 0.8y}{1.6-1.2x}
\end{aligned}$$

Now, to find probabilities, do I have to do integrals with polynomials
in the denominator?  Not necessarily: once $x$ is fixed, the
denominator $f_X(x)$ is just a number.

$$\begin{aligned}
P(Y<0.40 | X=0.10)
    &= \int_0^{0.40} \mathrm dy
        \left(
            \frac{2-1.2\cdot0.10 - 0.8y}{1.6 - 1.2\cdot0.10}
        \right)
\\\\&= \int_0^{0.40} \mathrm dy
        \left(
            \frac{1.88 - 0.8y}{1.48}
        \right)
\\\\&= \frac{1.88\cdot0.40 - 0.4\cdot0.40^2}{1.48}
\\\\&= \frac{0.688}{1.48} \approx 0.465
\end{aligned}$$
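
As a sanity check, here is a minimal numerical sketch of this conditional probability. The midpoint-rule integrator and the function names are my own. The integrand is linear in $y$, so the midpoint rule should be exact up to rounding.

```python
def f(x, y):
    """Joint density from the example, valid for 0 <= x, y <= 1."""
    return 2 - 1.2 * x - 0.8 * y


def f_X(x):
    """Marginal density of X from the example."""
    return 1.6 - 1.2 * x


def integrate(g, a, b, n=1000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def prob_y_below(y_max, x):
    """P(Y < y_max | X = x): integrate f(x, y) / f_X(x) over 0 <= y <= y_max."""
    return integrate(lambda y: f(x, y) / f_X(x), 0.0, y_max)


p = prob_y_below(0.40, 0.10)
print(round(p, 3))  # 0.465
```

Integrating the conditional density over the whole range, `prob_y_below(1.0, 0.10)`, should give 1.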

I'm pretty sure that you *don't* have to put the integral outside of
the fraction: Bayes' rule is about probabilities, not about
probability densities.  Suppose that $x$ and $y$ have different
dimensions so that

* $f=\frac{\mathrm dP}{\mathrm dx\ \mathrm dy}$ has dimension $[xy]^{-1}$
* $f_X = \int\mathrm dy\ \frac{\mathrm dP}{\mathrm dx\ \mathrm dy}$
  has dimension $[x]^{-1}$, as you'd expect, and similar with $f_Y$.

What is the dimension of the conditional probability?

$$\begin{aligned}
f(x|y)
    &= \frac{f(x,y)}{f_Y(y)}
\\\\
[f(x|y)]
    &= \frac{ [xy]^{-1}}{ [y]^{-1}} = [x]^{-1}
\end{aligned}$$

It needs to be $[x]^{-1}$, because we want to integrate only over $x$ again.

Let's look at the probability-based definition.

$$\begin{aligned}
P(a<X<b | c<Y<d )
    &= \frac{ P(a<X<b ,  c<Y<d ) } {P(c<Y<d)}
\\\\&= \frac{
        \left(\int_a^b \mathrm dx \int_c^d \mathrm dy\right) \ f(x,y)
    }{
        \left(\int_{-\infty}^{\infty} \mathrm dx \int_c^d \mathrm dy\right) \ f(x,y)
    }
\\\\&= \frac{
        \left(\int_a^b \mathrm dx' \int_c^d \mathrm dy'\right) \ f(x',y')
    }{
        \left(\int_{-\infty}^{\infty} \mathrm dx' \int_c^d \mathrm dy'\right) \ f(x',y')
    }
\end{aligned}$$

I guess I'm leaning towards the cumulative distribution,

$$\begin{aligned}
f(x|c<Y<d)
    &= \frac{\mathrm d}{\mathrm dx} P(X<x|c<Y<d)
\\\\&= \frac{\mathrm d}{\mathrm dx} \frac{
        \left(\int_{-\infty}^x \mathrm dx' \int_c^d \mathrm dy'\right) \ f(x',y')
    }{
        \left(\int_{-\infty}^{\infty} \mathrm dx' \int_c^d \mathrm dy'\right) \ f(x',y')
    }
\\\\&= \frac{
        \int_c^d \mathrm dy' \ f(x,y')
    }{
        \int_c^d \mathrm dy' \ f_Y(y')
    }
\end{aligned}$$

So there are integrals in both the numerator and the denominator.

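In the limit $d-c\to0$, each integral over $y$ of a well-behaved
density approaches $(d-c)$ times its integrand at $y=c$.  The factors
of $(d-c)$ cancel, and the ratio tends to $f(x,c)/f_Y(c)$, which is the
density-based definition of $f(x|y)$ at $y=c$.
There's probably some fancy way to write this in terms of delta
functions.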

Anyway.  I was supposed to skip this part.

## Conditional expected value

We have

$$\begin{aligned}
E(Y|X=x) = \sum_y y\cdot p(y|x) \text{, etc.} \\\\
\end{aligned}$$
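
Applied to the investor table from 10.1, a minimal sketch looks like this. The dictionary representation and the helper name are mine.

```python
# Joint probabilities p(x, y) from the investor table, keyed by (x, y).
joint = {
    (90, 0): 0.05, (100, 0): 0.27, (110, 0): 0.18,
    (90, 10): 0.15, (100, 10): 0.33, (110, 10): 0.02,
}


def conditional_expectation_y(joint, x):
    """E(Y | X = x) = sum over y of y * p(x, y) / p_X(x)."""
    p_x = sum(p for (xx, _y), p in joint.items() if xx == x)
    weighted = sum(y * p for (xx, y), p in joint.items() if xx == x)
    return weighted / p_x


for x in (90, 100, 110):
    print(x, round(conditional_expectation_y(joint, x), 2))
```

This gives 7.5, 5.5, and 1.0 for $x=90$, $100$, and $110$. The expected value of $Y$ drops as $X$ rises, which already hints that the two aren't independent.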

# 10.4 Independence for random variables

They're independent if

$$\begin{aligned}
    p(x,y) &= p_X(x)\cdot p_Y(y)
\end{aligned}$$

in which case

$$\begin{aligned}
    p(x|y) &= p_X(x)    & p(y|x) &= p_Y(y)
\end{aligned}$$

which follows directly from the definition of conditional probability,
$p(x|y) = p(x,y)/p_Y(y)$.
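
For a finite table, the product condition can be checked cell by cell. A minimal sketch, using the investor table from 10.1. The helper name `is_independent` and the tolerance are my own choices.

```python
import math

# Joint probabilities p(x, y) from the investor table, keyed by (x, y).
joint = {
    (90, 0): 0.05, (100, 0): 0.27, (110, 0): 0.18,
    (90, 10): 0.15, (100, 10): 0.33, (110, 10): 0.02,
}


def is_independent(joint, tol=1e-9):
    """True if p(x, y) == p_X(x) * p_Y(y) for every cell, within tol."""
    p_x, p_y = {}, {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p
    return all(
        math.isclose(joint.get((x, y), 0.0), p_x[x] * p_y[y], abs_tol=tol)
        for x in p_x
        for y in p_y
    )


# The investor table fails: p(90, 0) = 0.05, but p_X(90) * p_Y(0) = 0.10.
print(is_independent(joint))  # False

# A table built as a product of the same marginals passes.
product = {
    (x, y): px * py
    for x, px in {90: 0.2, 100: 0.6, 110: 0.2}.items()
    for y, py in {0: 0.5, 10: 0.5}.items()
}
print(is_independent(product))  # True
```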

<!-- (exclude 10.4.2) -->

Likewise independence means

$$\begin{aligned}
    f(x|y) &= f_X(x) \text{, and vice-versa.}
\end{aligned}$$

# 10.5 The multinomial distribution

The number of partitions of $n$ objects into $k$ groups of size
$n_1,n_2,\cdots,n_k$ is given by

$$\begin{aligned}
{n \choose n_1, n_2, \cdots, n_k}
    &= \frac{ n! }{n_1!\, n_2! \cdots n_k!}
\end{aligned}$$

Then the probability of a particular partition size is

$$\begin{aligned}
P(X_1 = n_1, X_2 = n_2, \cdots, X_k = n_k )
    &=
        {n \choose n_1, n_2, \cdots, n_k}
        p_1^{n_1} p_2^{n_2} \cdots p_k^{n_k}
\end{aligned}$$

The probabilities $p_i$ that a given item falls in the $i$-th group
must satisfy $\sum_i p_i = 1$.
The binomial distribution is the special case $k=2$, where there's one
group that we care about and the other group is those we don't.
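
A minimal sketch of the formula in Python, with a check of the binomial special case. The function names are mine, and the numbers in the check ($n=5$, two successes, $p=0.3$) are an arbitrary illustration.

```python
import math


def multinomial_coefficient(counts):
    """n! / (n_1! n_2! ... n_k!), where n is the sum of the counts."""
    coeff = math.factorial(sum(counts))
    for n_i in counts:
        coeff //= math.factorial(n_i)  # exact: every partial quotient is an integer
    return coeff


def multinomial_pmf(counts, probs):
    """Probability of seeing exactly these group counts in sum(counts) trials."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    if not math.isclose(sum(probs), 1.0):
        raise ValueError("probabilities must sum to 1")
    p = float(multinomial_coefficient(counts))
    for n_i, p_i in zip(counts, probs):
        p *= p_i**n_i
    return p


# Binomial special case: k = 2 groups, "success" and "everything else".
n, successes, p = 5, 2, 0.3
binomial = math.comb(n, successes) * p**successes * (1 - p) ** (n - successes)
multinomial = multinomial_pmf([successes, n - successes], [p, 1 - p])
print(math.isclose(binomial, multinomial))  # True
```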

# Problems
