Neural networks in keras, a first look

Artifical intelligence (AI) has featured on the news several times in the past few years. AI , machine learning and deep learning feature in countless articles often outside technological oriented publications. You will hear often of self driving cars, chatbots, robots replacing humans in factories and many other stories, with AI as the vilan or the hero. At the heart of this technological revolution is neural networks (NN) which we briefly introduce in this article. The focus here is to introduce NNs with python using the keras API.

The problem we will solve here is to classify grayscale images of handwritten digits (28x28 pixels) into their 10 categories which ranges from 0-9. We a will use a classic dataset in the machine learning community, namely MNIST. MNIST has 60k training images and another 10k images for testing. Let’s import the MNIST dataset.

from tensorflow.keras.datasets import mnist
(train_images, train_labels),(test_images, test_labels) = mnist.load_data()

train_images and train_labels represent the data that the model will learn from. We will then use test_images and test_labels to test the model. The data object is a numpy array with each image of dimension 28x28 mapped to one and only one label in the range [0,9]. The training data looks like this:

train_images.shape

(60000, 28, 28)

train_labels

array([5, 0, 4, ..., 5, 6, 8], dtype=uint8)

The test data:

test_images.shape

(10000, 28, 28)

test_labels

array([7, 2, 1, ..., 4, 5, 6], dtype=uint8)

Let’s now create an architecture for the network.

This will be the workflow:

Create an architecture for the network, and create a network object
Compile the network object
Prepare the training and test datasets
Fit the network object to data