
Independence

The conditional probability of $A$ given $B$ is defined as

$\displaystyle P(A \vert B) := {P(A \cap B) \over P(B)}$

provided both $ A$ and $ B$ are events and $ P(B) \not= 0$.

One can say that $ A$ and $ B$ are independent if $ P(A \vert B) = P(A) $ and $ P(B \vert A) = P(B) $. Thus $ A$ and $ B$ are independent if and only if

$\displaystyle P(A \cap B) = P(A) P(B)$ (1.7)

Clearly, this last condition makes sense even if either $P(A)$ or $P(B)$ vanishes, hence it can be (and usually is) taken as the definition of independence of $A$ and $B$.
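As a simple illustration (an added example, not part of the original notes), consider one roll of a fair die and let $A$ = ``the outcome is even'' and $B$ = ``the outcome is at most 4''. Then $P(A) = 1/2$, $P(B) = 2/3$ and $P(A \cap B) = P(\{2, 4\}) = 1/3 = P(A)\, P(B)$, so $A$ and $B$ are independent in the sense of (1.7), even though they are neither disjoint nor nested.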

Two random variables $X$ and $Y$ are independent if the events $(X \le x)$ and $(Y \le y)$ are independent in the sense of (1.7) for each choice of $x, y \in \mathbb{R}$, i.e. if

$\displaystyle P(X \le x, Y \le y) = P(X \le x) P(Y \le y)$ (1.8)

In other words $ X$ and $ Y$ are independent if and only if

$\displaystyle F_{X,Y}(x,y) = F_X(x) F_Y(y).$ (1.9)

A random vector $X \in \mathbb{R}^n$ has independent components if all its marginals $F_{X_{i_1}, \ldots, X_{i_k}}$ have the multiplicative property

$\displaystyle F_{X_{i_1}, \ldots, X_{i_k}} = F_{X_{i_1}} \cdots F_{X_{i_k}},$

for each ordered $k$-tuple $(i_1, \ldots, i_k)$ with $\{i_1, \ldots, i_k\} \subset \{1, \ldots, n\}$ and $k \leq n$. N.B. It does not suffice to require that $F_{X_1, \ldots, X_n}$ have the multiplicative property for $k = n$ only.

If $ X$ and $ Y$ have a joint density $ f_{X Y} $, then both $ X$ and $ Y$ have a density ($ f_X$ and $ f_Y$, respectively), namely the marginals

$\displaystyle f_X(x) = \int_{-\infty}^{+\infty} f_{XY}(x, y)\, dy, \qquad f_Y(y) = \int_{-\infty}^{+\infty} f_{XY}(x, y)\, dx.$

If the vectors $X$ and $Y$ are independent and have a joint density $f_{XY}$, then

$\displaystyle f_{X Y}(x, y) = f_X(x) f_Y(y)$ (1.10)

Conversely, if the joint density factors into the product of the marginal densities, then (1.9) holds, i.e. $X$ and $Y$ are independent. Thus (1.10) is a necessary and sufficient condition for independence.
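As a quick numerical illustration of (1.10) (an added sketch, not part of the original notes), one can check the factorization for two independent standard Gaussian variables, whose joint density is the bivariate normal density with identity covariance; NumPy and SciPy are assumed to be available.

import numpy as np
from scipy.stats import norm, multivariate_normal

# Two independent N(0,1) variables: joint density = bivariate normal, identity covariance
x, y = 0.7, -1.2
joint = multivariate_normal(mean=[0.0, 0.0], cov=np.eye(2)).pdf([x, y])
product = norm.pdf(x) * norm.pdf(y)   # f_X(x) * f_Y(y)
print(joint, product)                 # the two values coincide up to rounding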

Let $ X$ and $ Y$ be independent and let $ \phi$ and $ \psi$ be ``regular'' functions. Then:

$\displaystyle E[\phi(X) \psi(Y)] = \int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} \phi(x) \psi(y)\, dF_{XY}(x, y) = \int_{-\infty}^{+\infty} \phi(x)\, dF_X(x) \int_{-\infty}^{+\infty} \psi(y)\, dF_Y(y),$

i.e. under independence of $ X$ and $ Y$,

$\displaystyle E[\phi(X) \psi(Y)]= E[\phi(X)] E[\psi(Y)]$ (1.11)
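Property (1.11) is easy to check by Monte Carlo simulation (an added sketch, not from the original notes; the particular distributions and functions below are arbitrary choices).

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=1_000_000)     # X ~ N(0,1)
Y = rng.uniform(size=1_000_000)    # Y ~ U(0,1), generated independently of X
phi, psi = np.cos, np.square       # two "regular" functions
print(np.mean(phi(X) * psi(Y)))              # estimate of E[phi(X) psi(Y)]
print(np.mean(phi(X)) * np.mean(psi(Y)))     # estimate of E[phi(X)] E[psi(Y)]

For large samples the two estimates agree to within Monte Carlo error.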

In particular, if $ X, Y $ are independent, then

$\displaystyle \mbox{Cov}[X, Y] = \left( \begin{array}{cc} \mbox{Var}[X] & 0 \\ 0 & \mbox{Var}[Y] \end{array} \right)$

In fact, by (1.11)

$\displaystyle E[(X - E[X]) (Y - E[Y])] = E[(X - E[X])]\, E[(Y - E[Y])] = 0.$

Moreover

$\displaystyle \mbox{Var}[a X + b Y] = a^2\, \mbox{Var}[X] + 2 a b\, E[(X - E[X])(Y - E[Y])] + b^2\, \mbox{Var}[Y],$

i.e.

$\displaystyle \mbox{Var}[a X + b Y] = a^2\, \mbox{Var}[X] + b^2\, \mbox{Var}[Y]$

if $ X$ and $ Y$ are independent.
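The same identity can be checked numerically (again an illustrative sketch; the coefficients and distributions below are arbitrary choices).

import numpy as np

rng = np.random.default_rng(1)
a, b = 2.0, -3.0
X = rng.normal(loc=1.0, scale=2.0, size=1_000_000)   # Var[X] = 4
Y = rng.exponential(scale=0.5, size=1_000_000)       # Var[Y] = 0.25, independent of X
print(np.var(a * X + b * Y))                         # empirical Var[aX + bY]
print(a**2 * np.var(X) + b**2 * np.var(Y))           # a^2 Var[X] + b^2 Var[Y]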

The above results can be generalized for random vectors. If an $ n$-dimensional random vector $ X$ has independent components, then

$\displaystyle \mbox{Cov}[X] = \mbox{diag}(\mbox{Var}[X_1], \ldots, \mbox{Var}[X_n])$

and

$\displaystyle \mbox{Var}[c^T X] = c_1^2\, \mbox{Var}[X_1] + \cdots + c_n^2\, \mbox{Var}[X_n].$
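A small numerical check of the vector case (an added sketch; the weights and component variances are arbitrary choices):

import numpy as np

rng = np.random.default_rng(2)
c = np.array([1.0, -2.0, 0.5])
# three independent components with standard deviations 1, 3 and 0.2
X = rng.normal(scale=[1.0, 3.0, 0.2], size=(1_000_000, 3))
print(np.cov(X, rowvar=False))                   # approximately diag(1, 9, 0.04)
print(np.var(X @ c), c**2 @ np.var(X, axis=0))   # Var[c^T X] vs sum of c_i^2 Var[X_i]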

Let us apply the above results to a particular situation: sampling a given random variable, as when a measurement is repeated a certain number of times.

Suppose $X$ is a given random variable. A sample of length $n$ from $X$ is a sequence of independent random variables $X_1, \ldots, X_n$, each having the same distribution as $X$. The components of the sample are said to be independent and identically distributed (``iid'' for short). The sample mean

$\displaystyle M_n := { X_1 + \ldots + X_n \over n }$ (1.12)

is computed in order to estimate the ``value'' of $X$.

If

$\displaystyle E[X] = m, \qquad \mbox{Var}[X] = \sigma^2,$

then

$\displaystyle E[M_n] = m, \qquad \mbox{Var}[M_n] = {\sigma^2 \over n}$ (1.13)

and the advantage of forming the sample average becomes apparent: while the mean is unaltered, the variance decreases as the number $n$ of observations increases. Thus, forming the arithmetic mean of a sample of measurements yields a more precise estimate. The situation is depicted in figure 1.8.
Figure 1.8: Density of the sample mean when the number of observations $n$ increases.
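The variance reduction in (1.13) is easy to reproduce by simulation (an added sketch; the Gaussian measurement model and the constants below are arbitrary choices).

import numpy as np

rng = np.random.default_rng(3)
m, sigma = 5.0, 2.0
for n in (10, 100, 1000):
    # 20000 independent samples of length n, one sample mean per row
    samples = rng.normal(loc=m, scale=sigma, size=(20_000, n))
    M_n = samples.mean(axis=1)
    print(n, M_n.mean(), M_n.var(), sigma**2 / n)   # Var[M_n] tracks sigma^2 / n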

One feels tempted to assert that

$\displaystyle M_n \rightarrow m \quad \mbox{as} \quad n \rightarrow \infty.$

In fact it is true that

$\displaystyle P \left( \lim_{n \rightarrow \infty} M_n = m \right) = 1$ (1.14)

if $ X_1, X_2, \ldots$ are iid random variables with mean $ m$. This result is known as the Strong Law of Large Numbers.
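The almost sure convergence in (1.14) can be visualized by following the running mean of a single simulated iid sequence (an illustrative sketch; the exponential distribution below, with mean $m = 1.5$, is an arbitrary choice).

import numpy as np

rng = np.random.default_rng(4)
m = 1.5
X = rng.exponential(scale=m, size=100_000)             # iid sequence with mean m
running_mean = np.cumsum(X) / np.arange(1, X.size + 1)
print(running_mean[[99, 9_999, 99_999]])               # M_100, M_10000, M_100000 drift toward 1.5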

On the other hand, we know that

$\displaystyle E[M_n] = m, \qquad \mbox{Var}[M_n] = \sigma^2 / n,$

but we know nothing about the distribution of $ M_n$. It is true that

$\displaystyle E[Z_n] = 0, \qquad \mbox{Var}[Z_n] = 1,$

where

$\displaystyle Z_n := {M_n - m \over \sigma / \sqrt{n} }$

The Central Limit Theorem states that if $ X_1, \ldots, X_n$ are iid with mean $ m$ and variance $ \sigma^2$, then

$\displaystyle \lim_{n \rightarrow \infty} P(Z_n \le x) = {1 \over \sqrt{2 \pi}} \int_{-\infty}^{x} e^{-\xi^2 / 2}\, d\xi$

uniformly in $x \in \mathbb{R}$.

Thus, the sample mean is given by

$\displaystyle M_n = m + {\sigma \over \sqrt{n}}\, Z_n,$

where the normalized errors $Z_n$ are ``asymptotically Gaussian''.
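A simulation makes the ``asymptotically Gaussian'' statement concrete (an added sketch; the uniform distribution and the sample size below are arbitrary choices): the empirical distribution of $Z_n$ is compared with the standard normal CDF at a few points.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)
n, reps = 50, 100_000
samples = rng.uniform(size=(reps, n))       # X_i ~ U(0,1): m = 1/2, sigma^2 = 1/12
M_n = samples.mean(axis=1)
Z_n = (M_n - 0.5) / (np.sqrt(1.0 / 12.0) / np.sqrt(n))
for x in (-1.0, 0.0, 1.5):
    print(x, np.mean(Z_n <= x), norm.cdf(x))   # empirical P(Z_n <= x) vs Phi(x)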