# Dragon Notes

# Deep Learning

Neural Networks

ANN Base Model

An artificial neural network (ANN) is a set of artificial neurons processing inputs via intermediate layers to yield an output.

- An artificial neuron (AN) is a node in an ANN that modifies its input data via a mathematical function, then passes the result to other ANs
- Layers include a bias unit, $$a_0 = 1$$, analogous to the intercept term in logistic regression; it may be omitted from diagrams (img. right omits neuron links)
- In deep learning, only the hidden + output layers are counted, & the input layer is defined as $$l=0$$ (the equations in these notes instead index the input layer as layer 1)
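
As a minimal sketch of a single AN with a bias unit: the example assumes the logistic sigmoid as the neuron's mathematical function, and the names `sigmoid`, `neuron`, and the weight values are hypothetical, chosen only for illustration.

```python
import math

def sigmoid(z):
    """Logistic function; assumed here as the neuron's function g."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(weights, inputs):
    """One artificial neuron: weighted sum over [bias unit] + inputs, then g.

    weights[0] multiplies the bias unit a_0 = 1 (the intercept term).
    """
    activations = [1.0] + list(inputs)  # prepend bias unit a_0 = 1
    z = sum(w * a for w, a in zip(weights, activations))
    return sigmoid(z)

# Illustrative weights: z = 0*1 + 1*2 + (-1)*2 = 0, so the output is g(0) = 0.5
out = neuron([0.0, 1.0, -1.0], [2.0, 2.0])
print(out)
```

The bias weight shifts the point where the neuron switches between low and high output, in the same way the intercept does in logistic regression.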

ANN Base Model (II)

An ANN model may use the logistic (sigmoid) function $$g$$ from logistic regression as the activation of each AN. For a network with 3 inputs, 3 hidden units, and 1 output:

$$\ds \begin{align}
a_1^{(2)} &= g(\Theta_{10}^{(1)}x_0 + \Theta_{11}^{(1)}x_1 + \Theta_{12}^{(1)}x_2 + \Theta_{13}^{(1)}x_3) \\
a_2^{(2)} &= g(\Theta_{20}^{(1)}x_0 + \Theta_{21}^{(1)}x_1 + \Theta_{22}^{(1)}x_2 + \Theta_{23}^{(1)}x_3) \\
a_3^{(2)} &= g(\Theta_{30}^{(1)}x_0 + \Theta_{31}^{(1)}x_1 + \Theta_{32}^{(1)}x_2 + \Theta_{33}^{(1)}x_3) \\
h_{\Theta}(x) = a_1^{(3)} &= g(\Theta_{10}^{(2)}a_0^{(2)} + \Theta_{11}^{(2)}a_1^{(2)} + \Theta_{12}^{(2)}a_2^{(2)} + \Theta_{13}^{(2)}a_3^{(2)})
\end{align}$$
- The bias unit $$a_0^{(j)} = 1$$ is added to layer $$j$$ after computing its activations via $$\bb{1}$$
- In the last step, between layer $$\s{}j$$ and $$\s{}(j+1)$$, the last theta mtx. $$\bn{\Theta}^{(j)}$$ has only one row (for a single-output network), which multiplies the column vector $$\bn{a}^{(j)}$$ to yield a scalar
- The output is passed to layer $$\s{}(j+1)$$ as $$h_{\bn{\Theta}}(\bn{x}) = \bn{a}^{(j+1)} = g(\bn{z}^{(j+1)})$$
- Vectorization: see $$\bb{1}$$, for an ANN w/ $$j$$ layers & $$n$$ inputs, where $$k=$$ node index
- $$\t{dim}(\bn{\Theta}^{(j)}) = s_{j+1}\s{\times}\ns{}(s_j+1)$$, where $$s_j=$$ number of units in layer $$j$$ (excl. bias); the $$+1$$ accounts for the bias unit in layer $$j$$
- $$\bn{\Theta}^{(j)}$$ rows $$\ra$$ layer $$\s{}(j+1)$$'s activ. units, columns $$\ra$$ prev. layer's units w/ bias (i.e. layer $$\s{}j$$)
$$\ds \begin{align} a_1^{(j)} &= g(z_1^{(j)}) \\ a_2^{(j)} &= g(z_2^{(j)}) \\ & \vdots \\ a_n^{(j)} &= g(z_n^{(j)}) \end{align} \quad \quad \bn{z}^{(j)} = \bn{\Theta}^{(j-1)}\bn{a}^{(j-1)}\ \bb{1}$$
$$\ds z_k^{(j)} = \Theta_{k,0}^{(j-1)}a_0^{(j-1)} + \Theta_{k,1}^{(j-1)}a_1^{(j-1)} + ... + \Theta_{k,n}^{(j-1)}a_n^{(j-1)}$$, where $$\bn{a}^{(1)} = \bn{x}$$ for the first hidden layer
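
The vectorized step $$\bn{z}^{(j)} = \bn{\Theta}^{(j-1)}\bn{a}^{(j-1)}$$ followed by $$g$$ can be sketched in plain Python as below. This is an illustrative sketch, not a reference implementation: the function name `forward`, the list-of-lists matrix layout, and the weight values are assumptions, and $$g$$ is taken to be the logistic sigmoid.

```python
import math

def sigmoid(z):
    """Logistic function g (assumed activation)."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(thetas, x):
    """Forward pass through the layers.

    thetas: list of weight matrices (lists of rows); the matrix mapping
    layer j to layer j+1 has s_{j+1} rows and s_j + 1 columns.
    x: list of input features (without the bias unit).
    """
    a = list(x)
    for theta in thetas:
        a = [1.0] + a  # add bias unit a_0 = 1 to the current layer
        for row in theta:
            # each row needs s_j + 1 weights, one per unit incl. bias
            assert len(row) == len(a), "dimension mismatch"
        z = [sum(w * v for w, v in zip(row, a)) for row in theta]  # z = Theta a
        a = [sigmoid(zk) for zk in z]                              # a = g(z)
    return a

# Hypothetical weights for a 3-input, 3-hidden-unit, 1-output network
theta1 = [[0.0, 0.0, 0.0, 0.0] for _ in range(3)]  # 3 x 4: every hidden unit outputs g(0) = 0.5
theta2 = [[-1.5, 1.0, 1.0, 1.0]]                   # 1 x 4: z = -1.5 + 3 * 0.5 = 0

h = forward([theta1, theta2], [1.0, 2.0, 3.0])
print(h)  # single-element list, since the last matrix has one row
```

Because the last matrix has a single row, the final product is a scalar, which matches the one-row remark in the notes above.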