import tensorflow as tf

# a placeholder just tells the system the type of the variable which will
a = tf.placeholder('int32')
b = tf.placeholder('int32')

# the mul is the multiplication
y = tf.multiply(a,b)

# start the session
sess = tf.Session()

# feed the placeholders in the model and have it run
sess.run(y,feed_dict={a:2,b:5})

10

The Tensor Data Structure¶

It can be identified by 3 parameters: RANK, SHAPE, TYPE

RANK: the number of dimensions of the tensor
- n-dimensional array or list
- As examples: Rank 0 (scalar), Rank 1 (a 1 by n vector), Rank 2 (n by m matrix), Rank 3 (n by m by q matrix)
SHAPE: the number of rows and columns the tensor has
TYPE: the data type
- Examples: tf.int8, tf.string, tf.bool,...

import numpy as np
tensor_1d = np.array([1.3, 1, 4.0, 23.99])
print (tensor_1d)

[  1.3    1.     4.    23.99]

print (tensor_1d[0])
print (tensor_1d[2])

1.3
4.0

# the indexing is the same as python
tensor_1d.ndim # number of dimensions, this is like the rank

1

tensor_1d.shape

(4,)

tensor_1d.dtype

dtype('float64')

import numpy as np
my_numbers = np.array([1,2,3,4,5])
my_numbers.dtype

dtype('int32')

# convert the array to TF tensor
import tensorflow as tf

tf_tensor = tf.convert_to_tensor(tensor_1d,dtype=tf.float64)

print (tf_tensor)

Tensor("Const:0", shape=(4,), dtype=float64)

# running the session we can then visualize the tensor and it's elements
with tf.Session() as sess:
    print (sess.run(tf_tensor))
    print (sess.run(tf_tensor[0]))
    print (sess.run(tf_tensor[2]))

[  1.3    1.     4.    23.99]
1.3
4.0

Tensor Handling¶

# let's build 2 integer arrays
matrix1 = np.array([(2,2,2),(2,2,2),(2,2,2)],dtype='int32')
matrix2 = np.array([(1,1,1),(1,1,1),(1,1,1)],dtype='int32')

# visualize them:
print ('matrix1 = ')
print (matrix1)

print ('matrix2 = ')
print (matrix2)

matrix1 =
[[2 2 2]
 [2 2 2]
 [2 2 2]]
matrix2 =
[[1 1 1]
 [1 1 1]
 [1 1 1]]

# defining matrix operations which will not be run until the run is called

matrix_product = tf.matmul(matrix1,matrix2) # matrix multiply
print (matrix_product)

matrix_sum = tf.add(matrix1,matrix2)
print (matrix_sum)

Tensor("MatMul:0", shape=(3, 3), dtype=int32)
Tensor("Add:0", shape=(3, 3), dtype=int32)

# new matrix to be used to compute a matrix determinant
matrix3 = np.array([(2,7,2),(1,4,2),(9,0,2)],dtype='float32')
print ('matrix3 = ')
print (matrix3)

matrix_det = tf.matrix_determinant(matrix3)
matrix_det

matrix3 =
[[ 2.  7.  2.]
 [ 1.  4.  2.]
 [ 9.  0.  2.]]

<tf.Tensor 'MatrixDeterminant:0' shape=() dtype=float32>

with tf.Session() as sess:
    result1 = sess.run(matrix_product)
    result2 = sess.run(matrix_sum)
    result3 = sess.run(matrix_det)

# print the results
print ('matrix1 * matrix2 = ')
print (result1)

print ('matrix1 + matrix2 = ')
print (result2)

print ('matrix3 determinant = ')
print (result3)

matrix1 * matrix2 =
[[6 6 6]
 [6 6 6]
 [6 6 6]]
matrix1 + matrix2 =
[[3 3 3]
 [3 3 3]
 [3 3 3]]
matrix3 determinant =
56.0

Tensor Board¶

A visualization tool
Aims at analyzing data flow graph
Helps understanding machine learning tools
Can be somehow confusing

Neural Networks and Deep Learning¶

Biologically inspired programming paradigm which enables a computer to learn from observational data
DEEP LEARNING is a powerful set of techniques that is based on the way human brain processes imformation and learns, responding to external stimuli
- It consists of a machine learning model at several levels of representations in which the deeper levels take as input the outputs of the previous levels, transforming them and always abstracting more
Neural networds and deep learning currently provide the best solutions to many problems in image recognition, speech recognition and NLP

Artificial Neural Networks (ANN)¶

Information processing system whose operating mechanism is inspired by biological circuits
Generalizations of mathematical models of human cognition of neaural biology
- Information is processed at many nodes called neurons
- Signals are transfurred from one neauron to another via a link
- Each connection link has an associated weight
- Each neuron applies an activation function to the net input they are receiving
Nodes - elements in a layer
Weights - strength of connections between nodes of layers
Layers
- Input - The input nodes
- Hidden - Every Interior layer in the network
- Output - The result
Activation functions
- Input 1 strength of connection 1 + input 2 strength of connection 2 ----- apply function ------> output
- Identity function: f(x) = x
- Step function with a theshold T:
  - f(x) = 1 if x is bigger or equal to T
  - f(x) = 0 if x is less than T
- Logistic of sigmoid fucntion between 0 and 1
- Hyperbolic tanget (-1 to 1 values)
- Relu()

The first run throughout the Net is called feed forward and it will give a bad result usually. Then it takes the output and

Types of Neural Networks¶

Perceptron
Feed Forward
Radial Basis Network
Recurrent Neaural Network
Convolution Neural Network

Where are Neural Networks Being Used?¶

Pattern recognition
Signal Processing
Medicine
Speech recognition
Business

Strenghts and Weaknesses¶

Strenghts:
- Relatively simple learning algorithm (SGD and backprop)
- Can almost learn any function
- Scales well to large datasets
- Can significantly out-perform other models when the right conditions are met
Weaknesses:
- Hard to interpret the model
- NNs are a black box
- Do not perform as well on small data sets

Perceptron¶

Single Layer Perceptron
Multi Layer Perceptron
Multi Layer Perceptron Classification
Multi Layer Perceptron function approximation

Basic Steps of Training:¶

The weights are initialized with random values at the beginning of the training
For each element of the training error is calculated, that is the difference between the desired output and the actual output. This error is used to adjust
The process is repeated, resubmitting to the networ, in a random error, all the examples of the training set until the error made on the entire training set is less than a certain threshold, or until the maximum number of iterations is reached

MNIST with Multi Layer Perceptron¶

Images are 28x28 or 784 input nodes
Predict number solely on the image data in a form of array

import tensorflow as tf
import matplotlib.pyplot as plt
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data/',one_hot=True)

Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz

Data Format¶

The data is stored in a vector format, although the original data was a 2d matrix with values representing how much pigment was at a certain location

type(mnist) # to find out the type of mnist

tensorflow.contrib.learn.python.learn.datasets.base.Datasets

mnist.train.images.shape #to find the number of rows and cols we have in

(55000, 784)

sample = mnist.train.images[0].reshape(28,28)

%matplotlib inline
plt.imshow(sample,cmap='Greys')

<matplotlib.image.AxesImage at 0x18be56393c8>

Parameters¶

We need to define (w) parameters from the stochastic gradient descent

- Learning Rate - how quickly to adjust the cost function, or how quickly the network forgets about older information
- Training Epochs - how many training cycles to go through (example, going through the 55K images above is an epoch)
- Batch Size - size of "batches" of training data (most of the Time all data will not fit into memory so it needs to be split in batches)

# parameters - they needs to be set up for

learning_rate = 0.001
training_epochs = 35
batch_size = 100

Network Parameters¶

Neaural Network Dependent
Input data your data looked like
What kind of net you would want to build
Learning Rate:
- Small learning rate leads to slow and very lengthy learning
- Large learning rate may:
  - Lead to a very long training time
  - Output saturation
Weights:
- Make sure to avoid larger weight at they can lead to a saturation of the output of the first layer
- They should be small weights (-0.5 and 0.5)

# Due to the fact that we are working with images and images are stored in 
# Choose 256 but you can choose whatever you want

n_hidden_1 = 256 # first layer number of features
n_hidden_2 = 256 # second layer number of features
n_input = 784 # MNIST data input (image shape 28*28)
n_classes = 10 # MNIST total classes (0-9 digits)
n_samples = mnist.train.num_examples # the number of images in the dataset (55000)

Build Multilayer Mode¶

We first receive the input data array (784 pixels) and then send it to the first hidden layers
The data will begin to have a weight (w) attached to it between layers (this is initially a random value) and then sent to a node to undergo an activiation function (along with a bias (b))
Then it will continue on to the next hidden layer and so on until the final output layer
In our case we will do just 2 hidden layers, the more you use the slower the calculations but the higher the chance of getting better results

Build Multilayer Mode - Cost (Lower Error)¶

Apply an optimization function to minimize the cost; this is done by adjusting weight values accordingly across the network
We will use the Adam Optimizer
Adjust the optimizer by changing the learning rate parameter
The lower the rate the higher the possibility for accurate training results
Use the RELU activation function

Cost function at output (Adam Optimizer)¶

Creating Multilayer Perceptron¶

We will crete our model, we'll start with 2 hidden layers, whcih use the RELU activation function

def multilayer_perceptron(x, weights, biases):
    '''
    x: place Holder for Data Input
    weights: Dictionary of weights
    biases: Dictionary of biases
    '''

    layer_1 = tf.add(tf.matmul(x,weights['h1']),biases['b1'])
    layer_1 = tf.nn.relu(layer_1)

    layer_2 = tf.add(tf.matmul(layer_1,weights['h2']),biases['b2'])
    layer_2 = tf.nn.relu(layer_2)

    out_layer = tf.matmul(layer_2,weights['out']+biases['out'])

    return out_layer

Weights and Biases¶

In order for the tf model to work we need to create 2 dicts containing our weight and bias objects for the model tf.variable
A variable maintain state in the graph across calls to run()

# In this case we are initializaing the weights with random ones
weights = {
    'h1': tf.Variable(tf.random_normal([n_input,n_hidden_1])),
    'h2': tf.Variable(tf.random_normal([n_hidden_1,n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_hidden_2,n_classes]))
}

# Same for the biases
biases = {
    'b1':tf.Variable(tf.random_normal([n_hidden_1])),
    'b2':tf.Variable(tf.random_normal([n_hidden_2])),
    'out':tf.Variable(tf.random_normal([n_classes]))
}

Construct Model & Cost Optimization Functions¶

# We are defining the input and output as placeholders
x = tf.placeholder('float',[None,n_input])
y = tf.placeholder('float',[None,n_classes])

# initialize the pred variable and pass into it the function we've defined
pred = multilayer_perceptron(x,weights,biases)
print (pred)

Tensor("MatMul_3:0", shape=(?, 10), dtype=float32)

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred,labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

init = tf.global_variables_initializer()

Illustrative Purpose:¶

# We are grabbing the first batch
Xsamp,ysamp = mnist.train.next_batch(1)

plt.imshow(Xsamp.reshape(28,28),cmap='Greys')

# Remember indexing starts at zero!
print (ysamp)

[[ 0.  0.  0.  0.  0.  0.  1.  0.  0.  0.]]

Running the Session, we will start the interactive session¶

sess = tf.InteractiveSession()

sess.run(init)

for epoch in range(training_epochs):

    avg_cost = 0.0

    total_batch = int(n_samples/batch_size)

    for i in range(total_batch):
        batch_x, batch_y = mnist.train.next_batch(batch_size)

        _, c = sess.run([optimizer,cost],feed_dict={x:batch_x,y:batch_y})

        avg_cost += c/total_batch

    print ('Epoch: {} cost= {:.4f}'.format(epoch+1,avg_cost))

print ('Model has completed {} Epochs of Training'.format(training_epochs))

Epoch: 1 cost= 1621.5909
Epoch: 2 cost= 21.9404
Epoch: 3 cost= 5.1439
Epoch: 4 cost= 3.0614
Epoch: 5 cost= 2.5197
Epoch: 6 cost= 2.2830
Epoch: 7 cost= 2.1415
Epoch: 8 cost= 2.0404
Epoch: 9 cost= 1.9527
Epoch: 10 cost= 1.8770
Epoch: 11 cost= 1.8121
Epoch: 12 cost= 1.7474
Epoch: 13 cost= 1.6893
Epoch: 14 cost= 1.6277
Epoch: 15 cost= 1.5745
Epoch: 16 cost= 1.5317
Epoch: 17 cost= 1.4611
Epoch: 18 cost= 1.3815
Epoch: 19 cost= 1.3201
Epoch: 20 cost= 1.2615
Epoch: 21 cost= 1.1663
Epoch: 22 cost= 1.0351
Epoch: 23 cost= 0.9568
Epoch: 24 cost= 0.8641
Epoch: 25 cost= 0.7890
Epoch: 26 cost= 0.7407
Epoch: 27 cost= 0.6983
Epoch: 28 cost= 0.6677
Epoch: 29 cost= 0.6415
Epoch: 30 cost= 0.6119
Epoch: 31 cost= 0.5628
Epoch: 32 cost= 0.5350
Epoch: 33 cost= 0.4947
Epoch: 34 cost= 0.4713
Epoch: 35 cost= 0.4503
Model has completed 35 Epochs of Training

Model Evaluations¶

TF comes with some built infunctions to help evaluate our model, including tf.equal and tf.cast with tf.reduce_me
Predictions == y_test. In our case, since we know the format of the labels is a 1 in an array of zeros, we can compare argmax() location of that 1
Remember that y here is still

correct_predictions = tf.equal(tf.argmax(pred,1),tf.argmax(y,1))

print (correct_predictions[0])

Tensor("strided_slice_2:0", shape=(), dtype=bool)

correct_predictions = tf.cast(correct_predictions,'float')
print (correct_predictions[0])

Tensor("strided_slice_3:0", shape=(), dtype=float32)

Now we use the tf.reduce to get the prediction accuracy¶

accuracy = tf.reduce_mean(correct_predictions)

type(accuracy)

tensorflow.python.framework.ops.Tensor

mnist.test.labels

array([[ 0.,  0.,  0., ...,  1.,  0.,  0.],
       [ 0.,  0.,  1., ...,  0.,  0.,  0.],
       [ 0.,  1.,  0., ...,  0.,  0.,  0.],
       ...,
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.]])

mnist.test.images

array([[ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       ...,
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.]], dtype=float32)

print ('Accuracy: ',accuracy.eval({x:mnist.test.images,y:mnist.test.labels}))

Accuracy:  0.8491

Deep Learning with Neural Networks - Class Notes