
Dragon Notes


\( \newcommand{bvec}[1]{\overrightarrow{\boldsymbol{#1}}} \newcommand{bnvec}[1]{\overrightarrow{\boldsymbol{\mathrm{#1}}}} \newcommand{uvec}[1]{\widehat{\boldsymbol{#1}}} \newcommand{vec}[1]{\overrightarrow{#1}} \newcommand{\parallelsum}{\mathbin{\|}} \) \( \newcommand{s}[1]{\small{#1}} \newcommand{t}[1]{\text{#1}} \newcommand{tb}[1]{\textbf{#1}} \newcommand{ns}[1]{\normalsize{#1}} \newcommand{ss}[1]{\scriptsize{#1}} \newcommand{vpl}[]{\vphantom{\large{\int^{\int}}}} \newcommand{vplup}[]{\vphantom{A^{A^{A^A}}}} \newcommand{vplLup}[]{\vphantom{A^{A^{A^{A{^A{^A}}}}}}} \newcommand{vpLup}[]{\vphantom{A^{A^{A^{A^{A^{A^{A^A}}}}}}}} \newcommand{up}[]{\vplup} \newcommand{Up}[]{\vplLup} \newcommand{Uup}[]{\vpLup} \newcommand{vpL}[]{\vphantom{\Large{\int^{\int}}}} \newcommand{lrg}[1]{\class{lrg}{#1}} \newcommand{sml}[1]{\class{sml}{#1}} \newcommand{qq}[2]{{#1}_{\t{#2}}} \newcommand{ts}[2]{\t{#1}_{\t{#2}}} \) \( \newcommand{ds}[]{\displaystyle} \newcommand{dsup}[]{\displaystyle\vplup} \newcommand{u}[1]{\underline{#1}} \newcommand{tu}[1]{\underline{\text{#1}}} \newcommand{tbu}[1]{\underline{\bf{\text{#1}}}} \newcommand{bxred}[1]{\class{bxred}{#1}} \newcommand{Bxred}[1]{\class{bxred2}{#1}} \newcommand{lrpar}[1]{\left({#1}\right)} \newcommand{lrbra}[1]{\left[{#1}\right]} \newcommand{lrabs}[1]{\left|{#1}\right|} \newcommand{bnlr}[2]{\bn{#1}\left(\bn{#2}\right)} \newcommand{nblr}[2]{\bn{#1}(\bn{#2})} \newcommand{real}[1]{\Ree\{{#1}\}} \newcommand{Real}[1]{\Ree\left\{{#1}\right\}} \newcommand{abss}[1]{\|{#1}\|} \newcommand{umin}[1]{\underset{{#1}}{\t{min}}} \newcommand{umax}[1]{\underset{{#1}}{\t{max}}} \newcommand{und}[2]{\underset{{#1}}{{#2}}} \) \( \newcommand{bn}[1]{\boldsymbol{\mathrm{#1}}} \newcommand{bns}[2]{\bn{#1}_{\t{#2}}} \newcommand{b}[1]{\boldsymbol{#1}} \newcommand{bb}[1]{[\bn{#1}]} \) \( \newcommand{abs}[1]{\left|{#1}\right|} \newcommand{ra}[]{\rightarrow} \newcommand{Ra}[]{\Rightarrow} \newcommand{Lra}[]{\Leftrightarrow} \newcommand{rai}[]{\rightarrow\infty} \newcommand{ub}[2]{\underbrace{{#1}}_{#2}} \newcommand{ob}[2]{\overbrace{{#1}}^{#2}} \newcommand{lfrac}[2]{\large{\frac{#1}{#2}}\normalsize{}} \newcommand{sfrac}[2]{\small{\frac{#1}{#2}}\normalsize{}} \newcommand{Cos}[1]{\cos{\left({#1}\right)}} \newcommand{Sin}[1]{\sin{\left({#1}\right)}} \newcommand{Frac}[2]{\left({\frac{#1}{#2}}\right)} \newcommand{LFrac}[2]{\large{{\left({\frac{#1}{#2}}\right)}}\normalsize{}} \newcommand{Sinf}[2]{\sin{\left(\frac{#1}{#2}\right)}} \newcommand{Cosf}[2]{\cos{\left(\frac{#1}{#2}\right)}} \newcommand{atan}[1]{\tan^{-1}({#1})} \newcommand{Atan}[1]{\tan^{-1}\left({#1}\right)} \newcommand{intlim}[2]{\int\limits_{#1}^{#2}} \newcommand{lmt}[2]{\lim_{{#1}\rightarrow{#2}}} \newcommand{ilim}[1]{\lim_{{#1}\rightarrow\infty}} \newcommand{zlim}[1]{\lim_{{#1}\rightarrow 0}} \newcommand{Pr}[]{\t{Pr}} \newcommand{prop}[]{\propto} \newcommand{ln}[1]{\t{ln}({#1})} \newcommand{Ln}[1]{\t{ln}\left({#1}\right)} \newcommand{min}[2]{\t{min}({#1},{#2})} \newcommand{Min}[2]{\t{min}\left({#1},{#2}\right)} \newcommand{max}[2]{\t{max}({#1},{#2})} \newcommand{Max}[2]{\t{max}\left({#1},{#2}\right)} \newcommand{pfrac}[2]{\frac{\partial{#1}}{\partial{#2}}} \newcommand{pd}[]{\partial} \newcommand{zisum}[1]{\sum_{{#1}=0}^{\infty}} \newcommand{iisum}[1]{\sum_{{#1}=-\infty}^{\infty}} \newcommand{var}[1]{\t{var}({#1})} \newcommand{exp}[1]{\t{exp}\left({#1}\right)} \newcommand{mtx}[2]{\left[\begin{matrix}{#1}\\{#2}\end{matrix}\right]} \newcommand{nmtx}[2]{\begin{matrix}{#1}\\{#2}\end{matrix}} 
\newcommand{nmttx}[3]{\begin{matrix}\begin{align} {#1}& \\ {#2}& \\ {#3}& \\ \end{align}\end{matrix}} \newcommand{amttx}[3]{\begin{matrix} {#1} \\ {#2} \\ {#3} \\ \end{matrix}} \newcommand{nmtttx}[4]{\begin{matrix}{#1}\\{#2}\\{#3}\\{#4}\end{matrix}} \newcommand{mtxx}[4]{\left[\begin{matrix}\begin{align}&{#1}&\hspace{-20px}{#2}\\&{#3}&\hspace{-20px}{#4}\end{align}\end{matrix}\right]} \newcommand{mtxxx}[9]{\begin{matrix}\begin{align} &{#1}&\hspace{-20px}{#2}&&\hspace{-20px}{#3}\\ &{#4}&\hspace{-20px}{#5}&&\hspace{-20px}{#6}\\ &{#7}&\hspace{-20px}{#8}&&\hspace{-20px}{#9} \end{align}\end{matrix}} \newcommand{amtxxx}[9]{ \amttx{#1}{#4}{#7}\hspace{10px} \amttx{#2}{#5}{#8}\hspace{10px} \amttx{#3}{#6}{#9}} \) \( \newcommand{ph}[1]{\phantom{#1}} \newcommand{vph}[1]{\vphantom{#1}} \newcommand{mtxxxx}[8]{\begin{matrix}\begin{align} & {#1}&\hspace{-17px}{#2} &&\hspace{-20px}{#3} &&\hspace{-20px}{#4} \\ & {#5}&\hspace{-17px}{#6} &&\hspace{-20px}{#7} &&\hspace{-20px}{#8} \\ \mtxxxxCont} \newcommand{\mtxxxxCont}[8]{ & {#1}&\hspace{-17px}{#2} &&\hspace{-20px}{#3} &&\hspace{-20px}{#4}\\ & {#5}&\hspace{-17px}{#6} &&\hspace{-20px}{#7} &&\hspace{-20px}{#8} \end{align}\end{matrix}} \newcommand{mtXxxx}[4]{\begin{matrix}{#1}\\{#2}\\{#3}\\{#4}\end{matrix}} \newcommand{cov}[1]{\t{cov}({#1})} \newcommand{Cov}[1]{\t{cov}\left({#1}\right)} \newcommand{var}[1]{\t{var}({#1})} \newcommand{Var}[1]{\t{var}\left({#1}\right)} \newcommand{pnint}[]{\int_{-\infty}^{\infty}} \newcommand{floor}[1]{\left\lfloor {#1} \right\rfloor} \) \( \newcommand{adeg}[1]{\angle{({#1}^{\t{o}})}} \newcommand{Ree}[]{\mathcal{Re}} \newcommand{Im}[]{\mathcal{Im}} \newcommand{deg}[1]{{#1}^{\t{o}}} \newcommand{adegg}[1]{\angle{{#1}^{\t{o}}}} \newcommand{ang}[1]{\angle{\left({#1}\right)}} \newcommand{bkt}[1]{\langle{#1}\rangle} \) \( \newcommand{\hs}[1]{\hspace{#1}} \)

  UNDER CONSTRUCTION

Deep Learning:
Clustering



K-means Clustering

Separating data into groups of nearby datapoints via self-adjusting cluster centroids:

Inputs:
\(K =\) # of clusters
\(\{x^{(1)}, x^{(2)}, ..., x^{(m)}\} =\) training set, \(x\in \mathbb{R}^n\)
    Algorithm:\(\up\)
  • - Randomly initialize \(K\) cluster centroids \(\mu_1, \mu_2, ...,\mu_K \in \mathbb{R}^n\)
  • - Repeat {
        for \(i = 1\hspace{2px}\t{:}\hspace{2px}m\)
            \(c^{(i)} \t{ := }\) index (\(1\) to \(K\)) of cluster centroid closest to \(x^{(i)}\)
        for \(k = 1\hspace{2px}\t{:}\hspace{2px}K\)
            \(\mu_k \t{ := }\) mean of points assigned to cluster \(k\)
    } (see the sketch after the notes below)

  • - Unsupervised learning algorithm; no data labels \(y\)
  • - Useful in discovering hidden patterns, grouping & organizing data
  • - Applied in market segmentation, social network & astronomical data analysis, organizing computing clusters, improving learning algorithms, etc.
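
A minimal NumPy sketch of the algorithm above; the function name, iteration count, seeded RNG, and initializing centroids to randomly chosen training examples are illustrative assumptions, not part of the notes:

    import numpy as np

    def kmeans(X, K, n_iters=100, seed=0):
        """Minimal K-means: X is the (m, n) training set, K the # of clusters."""
        rng = np.random.default_rng(seed)
        m, n = X.shape
        # Random initialization: set centroids to K distinct training examples
        mu = X[rng.choice(m, size=K, replace=False)].astype(float)
        for _ in range(n_iters):
            # Cluster assignment step: c[i] = index of centroid closest to x^(i)
            dists = np.linalg.norm(X[:, None, :] - mu[None, :, :], axis=2)  # (m, K)
            c = dists.argmin(axis=1)
            # Move centroid step: mu_k = mean of points assigned to cluster k
            for k in range(K):
                if np.any(c == k):  # guard against empty clusters
                    mu[k] = X[c == k].mean(axis=0)
        return c, mu

Usage: c, mu = kmeans(X, K=3). In practice the loop is run from several random initializations and the lowest-cost result is kept (see the cost function below).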

Cost Function (K-means)

The optimization objective is to minimize the cost function over \((\bn{c},\bn{\mu})\):

\(\ds J(\bn{c},\bn{\mu}) = \sfrac{1}{m}\sum_{i=1}^m \abss{x^{(i)}-\mu_{c^{(i)}}}^2,\)
\(\ds \begin{align} &\bn{c} = c^{(1)},...,c^{(m)}\\ &\bn{\mu} = \mu_1,...,\mu_K \end{align}\)
  • - \(c^{(i)}=\) index of cluster to which example \(x^{(i)}\) is currently assigned
  • - \(\mu_k =\) cluster centroid \(k\) (\(\mu_k \in \mathbb{R}^n\))
  • - \(\mu_{c^{(i)}} =\) cluster centroid of cluster to which example \(x^{(i)}\) has been assigned
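
A matching one-function sketch of the cost (distortion), assuming the kmeans outputs from the sketch above:

    import numpy as np

    def kmeans_cost(X, c, mu):
        """J(c, mu) = (1/m) * sum_i ||x^(i) - mu_{c^(i)}||^2"""
        return float(np.mean(np.sum((X - mu[c]) ** 2, axis=1)))

Neither K-means step can increase \(J\): the assignment step minimizes it over \(\bn{c}\) with \(\bn{\mu}\) fixed, and the move-centroid step minimizes it over \(\bn{\mu}\) with \(\bn{c}\) fixed, so \(J\) is commonly tracked to detect convergence and to compare random initializations.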

Principal Component Analysis

Reducing dimensions by mapping data onto a set of principal components:

    Algorithm:\(\up\) Reduce data from \(n\)- to \(k\)-dimensions:
  • - \(\ds\s{\Sigma}=\sfrac{1}{m}\sum_{i=1}^m(x^{(i)})(x^{(i)})^T\)
  • - \(\t{[U, S, V]} = \t{svd(Sigma)},\ \ \bn{U}=\lrbra{\bn{u}^{(1)}\ \bn{u}^{(2)}\ \cdots\ \bn{u}^{(n)}} \in \mathbb{R}^{n\times n}\)
  • - \(\bn{U}_{\t{reduce}} = \lrbra{\bn{u}^{(1)}\ \bn{u}^{(2)}\ \cdots\ \bn{u}^{(k)}} \in \mathbb{R}^{n\times k}\)
  • - \(\bn{z}^{(i)}=\bn{U}_{\t{reduce }}^T \bn{x}^{(i)},\ \ \bn{z}^{(i)}\in \mathbb{R}^k \ \ \ (=\lrbra{k\times n}*\lrbra{n\times1})\)
  • \(\up\u{\smash{\t{Choosing }k}}\t{:}\) choose smallest \(k\) such that
  • \(\ds\frac{\frac{1}{m}\sum_{i=1}^m\abss{\bn{x}^{(i)}-\bn{x}^{(i)}_{\t{approx}}}^2}{\frac{1}{m}\sum_{i=1}^m\abss{\bn{x}^{(i)}}^2}\leq 0.01 \quad\Leftrightarrow \quad \frac{\sum_{i=1}^k S_{ii}}{\sum_{i=1}^n S_{ii}}\geq 0.99\hspace{20px}\)
  • \(\ra\) 99% of variance is retained (can be 95%, 90%, etc.)
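
A NumPy sketch of the above, assuming \(X\) is the \(m\times n\) data matrix, already mean-normalized (and feature-scaled as needed); the function name and var_retained parameter are illustrative:

    import numpy as np

    def pca(X, var_retained=0.99):
        m, n = X.shape
        Sigma = (X.T @ X) / m            # (1/m) * sum_i x^(i) (x^(i))^T, (n, n)
        U, S, _ = np.linalg.svd(Sigma)   # S holds the S_ii, in descending order
        # Choosing k: smallest k with sum_{i<=k} S_ii / sum_i S_ii >= var_retained
        k = min(int(np.searchsorted(np.cumsum(S) / S.sum(), var_retained)) + 1, n)
        U_reduce = U[:, :k]              # (n, k)
        Z = X @ U_reduce                 # row i is z^(i) = U_reduce^T x^(i)
        return Z, U_reduce, k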


[Figure: Reconstruction from compression]
[Figure: Principal Component Analysis]
  • - Is a method of dimensionality reduction
  • - Works by collapsing correlated components onto principal (uncorrelated, independent) components
  • - \(\t{svd}\) returns \(\t{S}\) diagonal entries in order of descending 'importance', such that the last (bottom) component bears the least influence
  • - Reconstruction from compression, shown left, maps data from compressed representation back to original space
  • - \(\bn{U}^T=\bn{U}^{-1}\), since \(\bn{U}\) is real and unitary (i.e., orthogonal)
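
Continuing the pca sketch above (Z, U_reduce, X as defined there), reconstruction and the retained-variance check take one line each:

    X_approx = Z @ U_reduce.T   # x_approx^(i) = U_reduce z^(i), back in R^n
    retained = 1 - np.sum((X - X_approx) ** 2) / np.sum(X ** 2)  # >= 0.99 by choice of k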





By OverLordGoldDragon