Gesture Detector
Team name: CodeBrewers
Team members
- Rutuja Ajay Kolte - kolterutuja1@gmail.com
- Mahek Ajay Salia - saliamahek13@gmail.com
- Reshmika Sreenath Nambiar - reshmikasnambiar@gmail.com
- Prerna Jagesia - pkjagesia@gmail.com
Mentors
- Anuj Raghani
- Bhavya Sheth
- Owais Hetavkar
- Vedant Paranjape
Description
The goal of our project was to train a machine learning model capable of classifying images of different hand gestures (such as fist, palm, etc.) and to use it for gesture detection and recognition.
We have used the Hand Gesture Recognition Database from Kaggle.
Creating the Model
- First, we load the images from proj.zip.
- The images are resized and converted to grayscale, then stored in the array X, with their labels stored in the array Y (a preprocessing sketch is shown below).
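A minimal sketch of this preprocessing step, assuming the archive extracts to one folder of images per gesture class (the paths and the label-from-folder-name convention are assumptions, not necessarily the exact code used):

```python
import os
import zipfile
import cv2
import numpy as np

# Extract the dataset (assumed to be uploaded as proj.zip)
with zipfile.ZipFile('proj.zip', 'r') as archive:
    archive.extractall('data')

# Assumption: each image sits in a folder named after its gesture class
label_names = sorted({d for _, dirs, _ in os.walk('data') for d in dirs})
label_index = {name: i for i, name in enumerate(label_names)}

X, Y = [], []
for root, _, files in os.walk('data'):
    for name in files:
        if not name.lower().endswith('.png'):
            continue
        img = cv2.imread(os.path.join(root, name), cv2.IMREAD_GRAYSCALE)  # load in grayscale
        img = cv2.resize(img, (320, 120))  # reduce size; cv2.resize takes (width, height)
        X.append(img)
        Y.append(label_index[os.path.basename(root)])  # label taken from the folder name

X = np.array(X).reshape(-1, 120, 320, 1)  # single grayscale channel, as the CNN expects
Y = np.array(Y)
```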
- The model is constructed using TensorFlow and Keras.
```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential()
model.add(Conv2D(32, (5, 5), activation='relu', input_shape=(120, 320, 1)))  # 120x320 grayscale input
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(10, activation='softmax'))  # 10 gesture classes
```
- The model is then configured and trained.
```python
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(X, Y, epochs=5, batch_size=64, verbose=2, validation_data=(X, Y))
```
- The model is then saved as an HDF5 file (see the snippet below).
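For reference, saving and reloading the trained model might look like this (a minimal sketch; the filename matches the one loaded in the next section):

```python
from tensorflow.keras.models import load_model

model.save('GestureRecognition.h5')          # write the trained model to an HDF5 file
model = load_model('GestureRecognition.h5')  # restore it later for inference
```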
Using the Model
- The model saved in GestureRecognition.h5 is loaded.
- An image is captured from the webcam.
- It is resized and converted to grayscale.
- The running average method is used for background subtraction.
```python
averageValue1 = np.float32(img)

while True:
    try:
        filename = take_photo()
        img = cv2.imread('/content/photo.jpg')  # Reads the picture taken from the webcam
        cv2.accumulateWeighted(img, averageValue1, 0.02)  # Updates the running average
        resultingFrames1 = cv2.convertScaleAbs(averageValue1)  # Converts the matrix elements to absolute values and the result to 8-bit
        cv2_imshow(resultingFrames1)  # Background obtained using the running average method
        m = cv2.subtract(img, resultingFrames1)  # Foreground obtained by subtracting the background from the original image
    except Exception as err:  # Raised if the user has no webcam or does not grant the page permission to access it
        print(str(err))
```
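Before prediction, the foreground image m has to match the model's 120x320 single-channel input. A minimal sketch of the resizing and grayscale conversion mentioned above (an assumption about where this step happens, not necessarily the exact code used):

```python
# Sketch: convert the foreground to grayscale and resize it to the model's input shape
m = cv2.cvtColor(m, cv2.COLOR_BGR2GRAY)       # collapse BGR to a single channel
m = cv2.resize(m, (320, 120))                 # cv2.resize takes (width, height)
m = m.reshape(120, 320, 1).astype('float32')  # add the channel dimension expected by the CNN
```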
- The model then predicts the gesture and prints it.
gesture = ("down", "palm", "l", "fist", "fist_moved", "thumb", "index", "ok", "palm_moved", "c") prediction = model.predict(np.expand_dims(m, axis = 0)) # Makes predictions ans = np.argmax(prediction[0]) print(prediction[0][ans]) # Prints probability of prediction print(gesture[ans]) # Prints predicted gesture
Links
- GitHub repo: https://github.com/Rutuja-Kolte/CodeBrewers
- Drive link: Drive link here
Technology stack
Tools and technologies that we learnt and used in the project.
- Python
- OpenCV and CNNs (convolutional neural networks)
- Jupyter Notebook
- Machine learning
Project Setup
Method 1
- Clone the CodeBrewers repository
git clone https://github.com/Rutuja-Kolte/CodeBrewers
- Open Google Drive and create a folder named CodeBrewers.
- Upload all files from the cloned CodeBrewers repository on your PC to this folder.
- Also upload the dataset from Kaggle to the same folder and name it proj.zip.
Method 2
- Clone the CodeBrewers repository
git clone https://github.com/Rutuja-Kolte/CodeBrewers
- Go to the Drive link, copy the folder, and save it to your own Google Drive.
Usage
To Create the Model (skip this if you want to use the pre-trained model)
Method 1
- Right click on CodeBrewers.ipynb file in Google Drive.
- Click on Open with > Google Colab.
- Run the code.
Method 2
- Open CodeBrewers.ipynb from the CodeBrewers repository in Google Colab.
- Run the code.
To Use the Model
Method 1
- Right click on GestureDetector.ipynb file in Google Drive.
- Click on Open with > Google Colab.
- Run the code.
Method 2
- Open GestureDetector.ipynb from the CodeBrewers repository in Google Colab.
- Run the code.
Applications
- A touchless user interface is an emerging technology related to gesture control. One type of touchless interface uses the Bluetooth connectivity of a smartphone to activate a company’s visitor management system, which avoids having to touch an interface during the COVID-19 pandemic.
- Hand gesture recognition has great value in sign language recognition and sign language interpretation for people with hearing or speech impairments.
- In crane operation, gesture control can replace remote controls, making it easier to pick up and set down loads at difficult locations.
Future scope
The project could be linked to a media player such as VLC, with gestures used to control playback, for example increasing or decreasing the volume or fast-forwarding and rewinding the video (a rough mapping sketch is shown below). Gestures could also be used to control the mouse pointer instead of a physical mouse.
Currently, the model cannot recognise when no gesture is present; this functionality could be added as well.
So far the project uses only static gestures. It could be extended to include dynamic gestures (swiping the fist to the right or left, moving a finger up and down, etc.).
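As a rough sketch of the media-player idea, each predicted gesture label could be mapped to a player action; the action names and the dispatch function below are purely hypothetical placeholders:

```python
# Hypothetical mapping from predicted gesture to a media-player action
actions = {
    "thumb": "volume_up",
    "down": "volume_down",
    "index": "fast_forward",
    "l": "rewind",
    "fist": "pause",
}

def dispatch(gesture_name):
    action = actions.get(gesture_name)
    if action is None:
        return  # unmapped gesture (or no gesture detected): do nothing
    print(f"Sending '{action}' to the media player")  # placeholder for a real player API call

dispatch(gesture[ans])  # 'gesture' and 'ans' come from the prediction step above
```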
Screenshots
- Down Gesture (down)
- Palm Gesture (palm)
- L-Shape Gesture (l)
- Fist Gesture (fist)
- Turned Fist Gesture (fist_moved)
- Fist with Thumb Sticking Out Gesture (thumb)
- Index Finger Up Gesture (index)
- OK Gesture (ok)
- Turned Palm Gesture (palm_moved)
- C-Shape Gesture (c)