Digit recognition

From EdWiki

Problem Statement

Aim: To recognize a single digit saved as an image on a PC and display the result.

Demo

The image on the PC is sent to the TIVA microcontroller.
The image being recognized is displayed on the NOKIA LCD display.
The predicted digit is displayed on the 16x2 LCD display.
[Figure: Digitrecognitiondemo.gif (demo video)]

Data transfer

The image is stored on the PC as an 8-bit, 28x28 grayscale image and is sent to the TIVA microcontroller by UART file transfer. A transfer starts with the character 's' written to the UART, followed by the 784 pixel values in column-major order. The TIVA receives the bytes in its UART interrupt service routine. Once the transfer is complete, the image is displayed on the NOKIA LCD, and the prediction algorithm starts as soon as the image has been displayed. When prediction is complete, the predicted digit is sent to the 16x2 LCD display. To read back the image that is currently loaded, send the character 'r' to the UART; the microcontroller responds with the 784 pixel values.
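
The receive side of this protocol can be sketched as a small state machine that the UART interrupt service routine feeds one byte at a time. This is a minimal sketch, not the project's code: the names ImageRx, RxEvent, image_rx_byte and image_pixel are hypothetical, and it assumes that every byte arriving inside an 's' transfer is pixel data, even if its value happens to equal 's' or 'r'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IMG_SIDE   28
#define IMG_PIXELS (IMG_SIDE * IMG_SIDE)   /* 784 bytes per image */

/* Hypothetical receiver state; names are illustrative, not from the project. */
typedef struct {
    uint8_t pixels[IMG_PIXELS];  /* image in the column-major order it arrived in */
    size_t  received;            /* pixels stored so far in the current transfer */
    int     receiving;           /* non-zero while inside an 's' transfer */
} ImageRx;

typedef enum { RX_IDLE, RX_IN_PROGRESS, RX_IMAGE_READY, RX_READ_REQUEST } RxEvent;

/* Feed one received byte; intended to be called from the UART receive ISR. */
static RxEvent image_rx_byte(ImageRx *rx, uint8_t byte)
{
    assert(rx != NULL);
    if (rx->receiving) {
        /* Assumption: inside a transfer every byte is pixel data. */
        rx->pixels[rx->received++] = byte;
        if (rx->received == IMG_PIXELS) {
            rx->receiving = 0;
            return RX_IMAGE_READY;   /* caller can now display and predict */
        }
        return RX_IN_PROGRESS;
    }
    if (byte == 's') {               /* start of a new image */
        rx->receiving = 1;
        rx->received = 0;
        return RX_IN_PROGRESS;
    }
    if (byte == 'r') {
        return RX_READ_REQUEST;      /* caller sends rx->pixels back over UART */
    }
    return RX_IDLE;                  /* anything else is ignored */
}

/* Column-major lookup: pixel at (row, col) of the 28x28 image. */
static uint8_t image_pixel(const ImageRx *rx, size_t row, size_t col)
{
    assert(rx != NULL && row < IMG_SIDE && col < IMG_SIDE);
    return rx->pixels[col * IMG_SIDE + row];
}
```

Keeping the state machine free of hardware access means the ISR only has to read the UART data register and act on the returned event, for example by starting the display and prediction step on RX_IMAGE_READY.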

Learning Objectives

  • Interfacing NOKIA LCD display
  • Neural Network training and prediction
  • UART file transfer
  • Serial port access using Octave

Hardware connections

[Figure: Hardware connections]

16x2 LCD Interfacing

The 16x2 LCD interface consists of 8 data lines and 3 control lines: Enable, Register Select and Read/Write. Read/Write is always set to write. Register Select is driven low to address the command register and high to address the data register. The command or data byte is latched on the falling (high-to-low) edge of the Enable pin.
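
The write sequence described above can be sketched as follows. This is only an illustration under stated assumptions: the Lcd16x2 structure is a hypothetical stand-in for the GPIO pins (on the TIVA each field would be a GPIO write), the function names are invented for this example, and the delays that real hardware needs are only marked by comments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pin model; on the TIVA each field would be a GPIO write. */
typedef struct {
    int     rs;             /* Register Select: 0 = command register, 1 = data register */
    int     rw;             /* Read/Write: always 0 (write) in this design */
    int     en;             /* Enable strobe */
    uint8_t data;           /* the 8 data lines */
    /* Simulated controller side, for illustration only. */
    uint8_t last_latched;   /* last byte taken from the data lines */
    int     last_was_data;  /* Register Select level at that moment */
    size_t  latch_count;    /* number of completed writes */
} Lcd16x2;

/* Pulse Enable high, then low, so the controller latches the bus. */
static void lcd_pulse_enable(Lcd16x2 *lcd)
{
    lcd->en = 1;
    /* Real hardware needs a short delay here to meet the Enable pulse width. */
    lcd->en = 0;
    /* Simulated controller: record what was on the bus during the pulse. */
    lcd->last_latched = lcd->data;
    lcd->last_was_data = lcd->rs;
    lcd->latch_count++;
}

/* Write one byte to either the command register or the data register. */
static void lcd_write(Lcd16x2 *lcd, int is_data, uint8_t value)
{
    assert(lcd != NULL);
    lcd->rw = 0;                  /* always write */
    lcd->rs = is_data ? 1 : 0;    /* select the target register */
    lcd->data = value;            /* place the byte on the data lines */
    lcd_pulse_enable(lcd);
    /* Real hardware needs a further delay while the controller executes. */
}

static void lcd_command(Lcd16x2 *lcd, uint8_t cmd) { lcd_write(lcd, 0, cmd); }
static void lcd_putc(Lcd16x2 *lcd, char c)         { lcd_write(lcd, 1, (uint8_t)c); }
```

With this model, showing a predicted digit would amount to a call such as lcd_putc(&lcd, '7') after the display has been initialized with the command sequence its controller requires.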

NOKIA LCD Interfacing

The NOKIA 6100 LCD is driven over SPI, implemented by bit banging. Each command or data transfer consists of 9 bits, sent MSB first on the MOSI pin. The MSB is a command/data flag: logic low indicates a command and logic high indicates data. The remaining 8 bits carry the actual command or data byte. Each bit is latched on the rising (low-to-high) edge of the clock. Chip select must be driven low before each transaction.
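
A bit-banged 9-bit transfer of this kind might look like the sketch below. The NokiaSpi structure is a hypothetical pin model (each field would be a GPIO pin on the TIVA) with a simulated receiver added so the frame can be inspected; the function names are invented for this example. The leading command/data flag is passed in as flag_bit so that the caller decides its level, and MOSI is held stable across the whole clock pulse.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pin model; on the TIVA each field would be a GPIO pin write. */
typedef struct {
    int      cs;         /* chip select, active low */
    int      clk;        /* serial clock */
    int      mosi;       /* serial data out */
    /* Simulated receiver, for illustration only. */
    uint16_t shift;      /* bits captured in the current transaction */
    unsigned bit_count;  /* number of bits captured */
} NokiaSpi;

/* One full clock pulse; MOSI stays stable across both clock edges. */
static void spi_clock_pulse(NokiaSpi *s)
{
    s->clk = 1;
    if (s->cs == 0) {    /* the receiver only listens while selected */
        s->shift = (uint16_t)((s->shift << 1) | (s->mosi & 1));
        s->bit_count++;
    }
    s->clk = 0;
}

/* Send one 9-bit frame: the command/data flag first, then 8 payload bits, MSB first. */
static void nokia_send9(NokiaSpi *s, int flag_bit, uint8_t payload)
{
    assert(s != NULL);
    uint16_t frame = (uint16_t)(((flag_bit & 1) << 8) | payload);
    s->shift = 0;
    s->bit_count = 0;
    s->clk = 0;
    s->cs = 0;                           /* select the LCD before the transaction */
    for (int bit = 8; bit >= 0; bit--) {
        s->mosi = (frame >> bit) & 1;    /* present the next bit */
        spi_clock_pulse(s);
    }
    s->cs = 1;                           /* deselect when the frame is complete */
}
```

Thin wrappers for "send command" and "send data" would then call nokia_send9 with the flag level that the LCD controller expects for each case.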

Training and prediction

A neural network is used to recognize the digit. Training is done on the PC using Octave. The sigmoid neuron is the basic building block of the network. A neuron takes several inputs (x1, x2, ...) and produces a single output. Each input can take any value between 0 and 1. The neuron has a weight for each input (w1, w2, ...) and an overall bias, b. Its output is σ(w·x + b), where σ is the sigmoid function, σ(z) = 1/(1 + e^(-z)).

The neural network used for digit recognition has 3 layers: an input layer of 784 (28x28) neurons, a hidden layer of 20 neurons and an output layer of 10 neurons.

[Figure: Neural network for digit recognition]

The neural network is trained on 60000 images of handwritten digits from the MNIST database (http://yann.lecun.com/exdb/mnist/). The goal of training is to minimize the cost function, which is the average norm of the error between the predicted value and the target value. The cost function is minimized using the stochastic gradient descent algorithm: after every image prediction, the weights are updated using back propagation. Gradient descent relies on the fact that the gradient of a multidimensional function points in the direction of the function's maximum increase, with its magnitude being the maximum rate of change, so stepping against the gradient reduces the cost. Each iteration reduces the error provided the step taken is small enough. At the same time, the step should be large enough for training to converge quickly. There are 784*20 weights for the hidden layer and 20*10 weights for the output layer.
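
The role of the step size can be made concrete with a first-order expansion. As a sketch, write C for the cost, w for the vector of all weights and η > 0 for the step size; these symbols are notation chosen for this example and are not taken from the project code.

```latex
\begin{aligned}
w' &= w - \eta \, \nabla C(w) \\
C(w') &\approx C(w) + \nabla C(w) \cdot (w' - w)
       = C(w) - \eta \, \lVert \nabla C(w) \rVert^{2} \le C(w)
\end{aligned}
```

The approximation only holds while η is small enough for the first-order term to dominate, which is why too large a step can increase the cost while a very small step slows convergence.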

Code used for Training: https://github.com/gitofsachin/digitrecognition/blob/master/sourcecode/PCTraining/training.o
Output of training: Weights for hidden layer and output layer in separate files (hiddenweights1 & outputweights1)

The weights are obtained with an error of 0.06 after 60000 iterations of back propagation. These weights are ported to the TIVA microcontroller and used to compute the output of the neural network. The microcontroller predicts the digit from the 784 pixel values received over UART and the stored weights. The output vector is found using the following equations:
Output of the hidden layer (the input to the output layer): [hiddenOutput]_{20x1} = sigmoid([hiddenWeights]_{20x784} * [inputVector]_{784x1})
Output of the output layer: [outputVector]_{10x1} = sigmoid([outputWeights]_{10x20} * [hiddenOutput]_{20x1})

The index of the output neuron with the highest output value is the predicted digit.

Code used for Prediction in TIVA: https://github.com/gitofsachin/digitrecognition/tree/master/sourcecode/TIVA

The training and prediction algorithms for digit recognition are explained at http://neuralnetworksanddeeplearning.com/

Components Used

  • TIVA microcontroller
  • NOKIA LCD Display
  • 16x2 LCD Display

Software used

  • Eclipse IDE
  • TivaWare Peripheral Driver Library
  • gcc
  • Octave

References

Source code

All the source code can be found at https://github.com/gitofsachin/digitrecognition

Future scope

  • Training implementation on TIVA
  • Image read from SD card
  • Alphabets and multiple character recognition

Team Members

  • Sachin S
  • Sajna Remi Clere