# Driver Alert System

## About Project

The Driver Alert System is a Python program that alerts a drowsy driver when their eyes are closed. It uses OpenCV and machine learning to determine whether the driver's eyes are open, and the live classification is streamed in a Qt GUI.

*(Demo video: demo-driver-alert-sys.webm)*

## Building it

### 1. Detect presence and location of eyes

The first step in this project is using OpenCV to detect the face and, more specifically, the eyes, so the program can determine whether the driver's eyes are closed. Haar cascades are a commonly used object-detection method in which a cascade function is trained on many positive and negative images. You can train your own classifier for specific objects, but in our case OpenCV already ships with pre-trained classifiers for the face and eyes. To use these classifiers, simply load them:

```python
import cv2

# cv2.data.haarcascades is the path OpenCV's pre-trained cascade XML files are installed under
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
eye_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_lefteye_2splits.xml')
```

Once the classifiers are loaded, we can use the `detectMultiScale` function to detect the face and eyes; it returns the position of each detection as a rectangle (`Rect(x, y, w, h)`). With these points, we can then create an ROI (region of interest, or bounding box) around the eyes.

```python
# Haar cascades operate on single-channel images, so convert to grayscale first
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
faces = face_cascade.detectMultiScale(gray, 1.3, 5)

roi_eyes = []
for (x, y, w, h) in faces:
    roi_gray = gray[y:y+h, x:x+w]     # face region in grayscale (used for detection)
    roi_color = image[y:y+h, x:x+w]   # same region in color (useful for display)
    eyes = eye_cascade.detectMultiScale(roi_gray)
    dsize = (80, 80)                  # fixed size the eye classifier expects
    for (ex, ey, ew, eh) in eyes:
        eye_img = cv2.resize(roi_gray[ey:ey+eh, ex:ex+ew], dsize,
                             interpolation=cv2.INTER_AREA)
        roi_eyes.append(eye_img)
```
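
For context, the `image` above comes from a capture source. A minimal sketch of feeding the snippet a single webcam frame (illustrative only, not the project's exact driver code):

```python
import cv2

cap = cv2.VideoCapture(0)   # default webcam
ret, frame = cap.read()     # grab one BGR frame
cap.release()
if ret:
    image = frame           # the detection snippet above starts from this `image`
```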

### 2. Determine if the eyes retrieved from step 1 are closed

This part uses TensorFlow and Keras to determine whether the eye images retrieved in the previous step are open or closed. To begin, a dataset of open and closed eyes must be collected. For this project, 30,000 images of open and closed eyes were collected from here; 20,000 of these images were used for training and 10,000 for testing.
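
A minimal sketch of what loading and preparing such a dataset might look like, assuming the crops are sorted into `closed/` and `open/` directories (the paths and the `load_eye_dataset` helper are hypothetical, not the project's actual code):

```python
import os
import cv2
import numpy as np

def load_eye_dataset(root):
    """Hypothetical loader: reads grayscale eye crops and labels them
    0 = closed, 1 = open, matching the directory they came from."""
    images, labels = [], []
    for label, folder in enumerate(['closed', 'open']):
        for name in os.listdir(os.path.join(root, folder)):
            img = cv2.imread(os.path.join(root, folder, name), cv2.IMREAD_GRAYSCALE)
            if img is None:
                continue  # skip non-image files
            img = cv2.resize(img, (80, 80), interpolation=cv2.INTER_AREA)
            images.append(img)
            labels.append(label)
    # Scale pixel values to [0, 1] before feeding the network
    return np.array(images, dtype='float32') / 255.0, np.array(labels)

train_images, train_labels = load_eye_dataset('data/train')
test_images, test_labels = load_eye_dataset('data/test')
```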

After some pre-processing of the images, the layers must be set up. Layers are the basic building blocks of neural networks in Keras. A layer consists of a tensor-in, tensor-out computation function and some state, and most layers have parameters that are learned during training. For this project, 3 layers were used: 1 flatten layer and 2 dense layers. The flatten layer takes a 2D array of pixels and unstacks the rows into a 1D array; it is used purely for data reformatting. Each dense layer returns an array whose length is the number of nodes passed in as a parameter (128 and 2 below).

```python
import tensorflow as tf
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Flatten(input_shape=(80, 80)),  # matches the eye ROI size from step 1
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dense(2)                        # one output per class: closed, open
])
```

The next step is to compile the model. This part is configured with a few pieces:

- **Loss function** - measures how accurate the model is during training. We want to minimize this function; it "steers" the model in the right direction.
- **Optimizer** - how the model is updated based on the data it sees and the loss function.
- **Metrics** - monitor the training and testing steps.

```python
model.compile(optimizer='adam',  # Adam: a common adaptive gradient-descent optimizer
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
```

Then comes the training of the model. Fitting is done in one line, and the overall workflow has four steps:

1. Feed the training data (`train_images` and `train_labels`) into the model.
2. The model learns to associate images and labels.
3. Ask the model to make predictions about a test set (`test_images`).
4. Check that the predictions match `test_labels`.

```python
model.fit(train_images, train_labels, epochs=10)
```

With every epoch, the loss should decrease and the accuracy should increase.

Moving on to testing, the accuracy is evaluated by running the model on the test set:

```python
test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)
```

And finally, the model is ready to be used. To use the model, input the image to be classified; the output is an array of prediction scores, one per class, where each score reflects the model's confidence that the image belongs to the class at that index. Because the model was compiled with `from_logits=True`, its raw outputs are logits, and a softmax is applied to turn them into probabilities. For instance, if the softmaxed output is `[0.9, 0.1]`, where index 0 is closed eye and index 1 is open, then there is a 90 percent chance that the input image is a closed eye.
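
Following the standard Keras pattern, the softmax can be attached as an extra layer at inference time. A sketch, where `eye` stands in for one 80x80 grayscale crop from step 1 (the variable name is illustrative):

```python
import numpy as np
import tensorflow as tf

# Wrap the trained model with a softmax so its outputs are probabilities
probability_model = tf.keras.Sequential([model, tf.keras.layers.Softmax()])

# `eye` is a uint8 crop from step 1; scale to [0, 1] and add a batch dimension
scores = probability_model.predict(np.expand_dims(eye / 255.0, axis=0))[0]
closed_prob, open_prob = scores          # index 0 = closed, index 1 = open
print('closed' if closed_prob > open_prob else 'open')
```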

### 3. Put them together in a GUI

This part is done using PyQt5. Two main topics covered in this step are multithreading and signals & slots.

In order to display a live stream of the webcam, a multithreaded program is required; otherwise the GUI would keep freezing, unable to keep up with all the work a single thread must do. With GUI programming we typically have a main thread (the GUI thread) and a worker thread. The main thread is responsible for drawing the GUI, creating the widgets, layouts, and everything else the interface needs, while the worker thread takes care of the remaining processing work and blocking calls. In our project, a MainWindow class and a Thread class act as the main loop and the worker loop, respectively. The Thread class takes the stream of frames from the webcam, isolates the eyes (step 1), and uses the model to determine whether the eyes are closed (step 2). It also handles everything in between, such as converting the frames to a PyQt-compatible format.

Once these roles are established, we need a way for these objects to communicate with each other. This is where signals and slots come in: when a signal is emitted, the connected slot, which is a function, gets called. Parameters can be included in the emit call, and they are passed through to the slot function. In our case, this mechanism is used for two purposes: 1) the worker thread emits a signal carrying the webcam frame as a parameter so the frames can be displayed on the GUI in the main thread, and 2) the worker thread emits a signal whenever it detects that the eyes have closed or opened, so the state of the eyes can be shown on the GUI.
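
A condensed sketch of this arrangement in PyQt5 (the class and signal names here are illustrative, not necessarily the project's own):

```python
import sys
import cv2
import numpy as np
from PyQt5.QtCore import QThread, pyqtSignal
from PyQt5.QtGui import QImage, QPixmap
from PyQt5.QtWidgets import QApplication, QLabel, QMainWindow

class VideoThread(QThread):
    # Signals carry the latest frame and the eye state to the GUI thread
    frame_ready = pyqtSignal(np.ndarray)
    eyes_closed = pyqtSignal(bool)

    def run(self):
        cap = cv2.VideoCapture(0)
        while True:
            ret, frame = cap.read()
            if not ret:
                break
            # ... steps 1 and 2 would run here to set `closed` ...
            closed = False  # placeholder for the classifier's verdict
            self.frame_ready.emit(frame)
            self.eyes_closed.emit(closed)

class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.label = QLabel(self)
        self.setCentralWidget(self.label)
        self.thread = VideoThread()
        self.thread.frame_ready.connect(self.update_frame)  # slot runs in GUI thread
        self.thread.start()

    def update_frame(self, frame):
        # Convert the BGR OpenCV frame to a Qt-compatible image
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        h, w, ch = rgb.shape
        qimg = QImage(rgb.data, w, h, ch * w, QImage.Format_RGB888)
        self.label.setPixmap(QPixmap.fromImage(qimg))

app = QApplication(sys.argv)
win = MainWindow()
win.show()
sys.exit(app.exec_())
```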