# Dragon Notes


# Randomness & Probability


## Discrete Random Variables

Probability Mass Function

- Bernoulli: $$p_X[k]=\begin{cases} 1-p, & k=0 \\ p, & k=1 \end{cases}$$
- Binomial: $$p_X[k]=\binom{M}{k}p^k (1-p)^{M-k},\quad k=0,1,...,M$$
- Geometric: $$p_X[k]=(1-p)^{k-1}p,\quad k=1,2,...$$
- Poisson: $$p_X[k]=e^{-\lambda}\frac{\lambda^k}{k!}, \quad k=0,1,...$$

Expected Value

For a discrete rv $$X$$, its expected value/mean/average is given by
$$E[X]=\sum_{i}x_i\,p_X[x_i]$$
May act as the best prediction of the outcome of a random experiment for a single trial.

Important rv $$\small{}E[X]$$'s

- Bernoulli: $$E[X]=p$$
- Binomial: $$E[X]=Mp$$
- Geometric: $$E[X]=1/p$$
- Poisson: $$E[X]=\lambda$$

Function expected value
For an rv $$X$$, the expected value of some function $$Y=g(X)$$ is
$$E[g(X)]=\sum_{i}g(x_i)p_X[x_i]$$
[Ex1] Given $$p_X[k]=1/5,\ 0\leq k\leq 4$$, and $$Y=\sin[(\pi /2)X]$$, determine $$E[Y]$$.

[Sol] Directly substituting,
$$E[Y]=E[g(X)]=\frac{1}{5}\sum_{k=0}^{4}\sin\left(\frac{\pi}{2}k\right)=\frac{1}{5}[0+1+0-1+0]=0$$
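The worked example can be reproduced directly from the formula $$E[g(X)]=\sum_i g(x_i)p_X[x_i]$$:

```python
from math import sin, pi

p_X = {k: 1/5 for k in range(5)}  # uniform PMF on 0..4

# E[g(X)] = sum of g(x) * p_X[x]; sin values at 0..4 are 0, 1, 0, -1, 0
E_Y = sum(sin(pi/2 * k) * pk for k, pk in p_X.items())
assert abs(E_Y) < 1e-12  # matches the hand computation: E[Y] = 0
```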

Variance

Given a discrete rv $$X$$, its variance (the average squared deviation from the mean) is defined as
$$\mathrm{var}(X)=E[(X-E[X])^2]$$
From a PMF, the variance is
$$\begin{align} \mathrm{var}(X) &= \sum_{i}(x_i-E[X])^2p_X[x_i] \\ &= E[X^2]-E^2[X] \end{align}$$

[P0] Constant modification:
$$\boxed{\begin{align} \mathrm{var}(c) &= 0 \\ \mathrm{var}(X+c) &= \mathrm{var}(X) \\ \mathrm{var}(cX) &= c^2\,\mathrm{var}(X) \end{align}}$$
[P1] Nonlinear operator (in general):
$$\boxed{\mathrm{var}(X+Y) \neq \mathrm{var}(X)+\mathrm{var}(Y)}$$
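The constant-modification properties can be sanity-checked numerically on a small PMF (the PMF and constant below are arbitrary illustrations):

```python
def var(pmf):
    """Variance from a PMF given as {value: probability}."""
    mean = sum(x * p for x, p in pmf.items())
    return sum((x - mean)**2 * p for x, p in pmf.items())

p_X = {0: 0.2, 1: 0.5, 2: 0.3}  # arbitrary PMF
c = 4.0

assert abs(var({c: 1.0})) < 1e-12                                       # var(c) = 0
assert abs(var({x + c: p for x, p in p_X.items()}) - var(p_X)) < 1e-12  # var(X+c) = var(X)
assert abs(var({c * x: p for x, p in p_X.items()}) - c**2 * var(p_X)) < 1e-12  # var(cX) = c^2 var(X)
```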

Transformations

Given a discrete rv $$X$$, a transformed rv $$Y=g(X)$$ can be obtained from
$$p_Y[y_i]=\sum_{\{j:\,g(x_j)=y_i\}}p_X[x_j]$$
It's the sum of probabilities for all values of $$X=x_j$$ that are mapped into $$Y=y_i$$.
RHS reads: "Sum $$p_X[x_j]$$ for all $$j$$ such that $$g(x_j)=y_i$$"

[Ex1]$$\hspace{1px}$$ Many-to-one transformation

Let the transformation be $$Y=g(X)=X^2$$, defined on the sample space $$S_X=\{-1,0,1\}$$ so that $$S_Y=\{0,1\}$$. Clearly, $$g(x_j)=x_j^2=0$$ only for $$x_j=0$$. Hence,
$$p_Y[0]=p_X[0]$$

However, $$g(x_j)=x_j^2=1$$ for $$x_j=-1$$ and $$x_j=1$$. Thus, we have
$$\begin{align} p_Y[1] &= \sum_{\{j:\,x_j^2=1\}}p_X[x_j] \\ &= p_X[-1]+p_X[1] \end{align}$$
Thus, $$p_Y[y_i]$$ was determined by summing the probabilities of all the $$x_j$$'s that map into $$y_i$$ via the transformation $$y=g(x)$$.

[Ex2] Given a PMF $$p_X[k]=1/5,\ 0\leq k \leq 4,$$ and rv $$Y=\sin[(\pi /2)X]$$, determine $$p_Y[k]$$.

[Sol] Determine which values of $$X$$ yield a given $$Y$$, and sum probabilities in that mapping:
$$\begin{align} p_Y[0] &= p_X[0]+p_X[2]+p_X[4]=3/5 \\ p_Y[1] &= p_X[1]=1/5 \\ p_Y[-1] &= p_X[3]=1/5 \end{align}$$
$$\Rightarrow p_Y[k]=\left\{ \frac{1}{5},\frac{3}{5},\frac{1}{5} \right\},\quad k=-1,0,1$$
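The grouping step of [Ex2] can be sketched generically: accumulate $$p_X[x_j]$$ into the bucket for each $$y_i=g(x_j)$$:

```python
from math import sin, pi
from collections import defaultdict

def transform_pmf(p_X, g):
    """p_Y[y] = sum of p_X[x] over all x with g(x) = y."""
    p_Y = defaultdict(float)
    for x, p in p_X.items():
        p_Y[round(g(x))] += p  # round() merges floating-point sin values onto -1, 0, 1
    return dict(p_Y)

p_X = {k: 1/5 for k in range(5)}
p_Y = transform_pmf(p_X, lambda x: sin(pi/2 * x))

assert abs(p_Y[0] - 3/5) < 1e-12   # k = 0, 2, 4 all map to y = 0
assert abs(p_Y[1] - 1/5) < 1e-12
assert abs(p_Y[-1] - 1/5) < 1e-12
```

The `round()` call is only needed because `sin` returns values like `1.22e-16` instead of an exact 0; with an exact-valued transformation the bucket keys can be used directly.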

Conditional Events

The probability of $$Y$$ given that $$X$$ has occurred is given by the conditional PMF, defined as
$$p_{Y|X}[y_j|x_i]=\frac{p_{X,Y}[x_i,y_j]}{p_X[x_i]},\quad j=1,2,...$$
This gives the probability of the event $$Y=y_j$$ for $$j=1,2,...$$ once we have observed that $$X=x_i$$. Since $$X=x_i$$ has occurred, the only joint events with a nonzero probability are $$\{(x,y):x=x_i,\ y=y_1, y_2, ...\}$$. We thus divide the joint probability $$p_{X,Y}[x_i,y_j]=P[X=x_i,Y=y_j]$$ by the probability of the reduced sample space, which is $$p_X[x_i]=P[X=x_i,Y=y_1]+P[X=x_i,Y=y_2]+...=\sum_{j=1}^{\infty}p_{X,Y}[x_i,y_j]$$.
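A minimal sketch of this definition on a small joint PMF (the joint probabilities below are arbitrary illustration values):

```python
# Joint PMF p_{X,Y}[x, y] stored as {(x, y): probability} -- arbitrary values
p_XY = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}

def marginal_X(p_XY, x):
    """p_X[x]: sum the joint PMF over all y (the reduced sample space)."""
    return sum(p for (xi, _), p in p_XY.items() if xi == x)

def conditional_Y_given_X(p_XY, x):
    """p_{Y|X}[y|x] = p_{X,Y}[x, y] / p_X[x]."""
    px = marginal_X(p_XY, x)
    return {y: p / px for (xi, y), p in p_XY.items() if xi == x}

cond = conditional_Y_given_X(p_XY, 0)       # {0: 0.25, 1: 0.75}
assert abs(sum(cond.values()) - 1) < 1e-12  # a conditional PMF sums to 1
```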

## Continuous Random Variables

Probability Density Function

The probability density function is defined as the probability per unit interval of a continuous sample space, and is given by
$$P[a\leq X\leq b]=\int_a^b p_X(x)\,dx$$

[P0] Is non-negative: $$\boxed{p_X(x)\geq 0, \quad -\infty < x < \infty}$$
[P1] Integrates to 1: $$\boxed{\int_{-\infty}^{\infty}p_X(x)\,dx=1}$$

Gaussian Distribution

The widely-applied "bell curve" is given by
$$p_X(x)=\frac{1}{\sqrt{2\pi \sigma^2}}\exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right],$$
$$-\infty < x < \infty,\ \ \sigma^2 >0,\ \ -\infty < \mu < \infty$$
The outcome probability of a standard Gaussian rv ($$\mu=0,\ \sigma^2=1$$) over an interval is described by
$$P(a\leq X \leq b)=\frac{1}{\sqrt{2\pi}}\int_a^b \exp(-x^2/2)\,dx$$
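This integral has no elementary antiderivative, but it can be expressed through the error function; a sketch using `math.erf`:

```python
from math import erf, sqrt

def std_normal_prob(a, b):
    """P(a <= X <= b) for X ~ N(0, 1), via the CDF Phi(x) = (1 + erf(x/sqrt(2))) / 2."""
    Phi = lambda x: 0.5 * (1 + erf(x / sqrt(2)))
    return Phi(b) - Phi(a)

# About 68% of the probability mass lies within one standard deviation of the mean
assert abs(std_normal_prob(-1, 1) - 0.6827) < 1e-3
```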

The joint bivariate standard Gaussian distribution is given by
$$p_{X,Y}(x,y)=\frac{1}{2\pi\sqrt{1-\rho^2}}\exp\left[-\frac{x^2-2\rho xy+y^2}{2(1-\rho^2)}\right],\ \ \begin{matrix} -\infty < x < \infty \\ -\infty < y < \infty \end{matrix}$$
- where $$\rho$$ is the correlation coefficient. Its realization with a non-zero mean and non-unity variance & correlation is shown below.
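The bivariate standard Gaussian density can be evaluated term by term from the formula above ($$\rho$$ below is an arbitrary illustration value):

```python
from math import exp, pi, sqrt

def bivariate_std_gaussian(x, y, rho):
    """Joint PDF of the standard bivariate Gaussian with correlation rho."""
    norm = 1 / (2 * pi * sqrt(1 - rho**2))
    return norm * exp(-(x**2 - 2*rho*x*y + y**2) / (2 * (1 - rho**2)))

rho = 0.5  # arbitrary illustration value

# At the origin the exponential is 1, so the density equals the normalizing constant
assert abs(bivariate_std_gaussian(0, 0, rho) - 1/(2*pi*sqrt(1 - rho**2))) < 1e-12
# Symmetric in (x, y), since the quadratic form x^2 - 2*rho*x*y + y^2 is symmetric
assert bivariate_std_gaussian(1.0, -0.5, rho) == bivariate_std_gaussian(-0.5, 1.0, rho)
```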

## Basics

Independence

Two events are statistically independent if and only if
$$\Pr (A,B)=\Pr (A) \Pr (B)$$
Three events $$A$$, $$B$$, and $$C$$ are mutually independent if and only if each pair is independent and the triple factors:
$$\begin{align} \Pr (A,B) &= \Pr (A) \Pr (B) \\ \Pr (A,C) &= \Pr (A) \Pr (C) \\ \Pr (B,C) &= \Pr (B) \Pr (C) \\ \Pr (A,B,C) &= \Pr (A) \Pr (B) \Pr (C) \end{align}$$
More generally, $$n$$ events are mutually independent if and only if this factorization holds for every subcollection of them; in particular,
$$\Pr (A_1, A_2, ..., A_n) = \Pr (A_1) \Pr (A_2) \cdots \Pr (A_n)$$

Two rv's $$X$$ & $$Y$$ are independent if and only if their joint PDF/CDF factors as
$$\begin{align} & f_{X,Y}(x,y)=f_X(x)f_Y(y) \\ & F_{X,Y}(x,y)=F_X(x)F_Y(y) \end{align}$$

[P0] For two independent events $$A$$ & $$B$$, and random variables $$X$$ & $$Y$$,

$$\begin{align} \Pr (A|B) &= \Pr (A) \\ f_{X|Y}(x|y) &= f_X(x) \\ f_{X,Y}(x,y) &= f_X(x)f_Y(y) \end{align}$$
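For discrete rv's, independence can be checked by testing the factorization $$p_{X,Y}[x,y]=p_X[x]\,p_Y[y]$$ at every point of the joint support (the marginals below are arbitrary illustration values, and the joint PMF is built to be independent):

```python
from itertools import product

p_X = {0: 0.4, 1: 0.6}  # arbitrary marginal PMFs
p_Y = {0: 0.3, 1: 0.7}

# Construct an independent joint PMF as the product of the marginals
p_XY = {(x, y): p_X[x] * p_Y[y] for x, y in product(p_X, p_Y)}

# Independence <=> the joint factors into the marginals at every (x, y)
independent = all(abs(p_XY[(x, y)] - p_X[x] * p_Y[y]) < 1e-12
                  for x, y in product(p_X, p_Y))
assert independent
```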