One of my favorite parts of tutoring is demonstrating how to build actual solutions to problems using computer vision. In this post, I'll demonstrate how to implement a bubble sheet test scanner and grader using strictly computer vision and image processing techniques, along with the OpenCV library.

Bubble sheet scanner & test grader

To build a bubble sheet scanner and grader using Python and OpenCV, our implementation will need to satisfy the following 7 steps:

  • Step #1: Detect the exam in an image.
  • Step #2: Apply a perspective transform to extract the top-down, birds-eye-view of the exam.
  • Step #3: Extract the set of bubbles (i.e., the possible answer choices) from the perspective transformed exam.
  • Step #4: Sort the questions/bubbles into rows.
  • Step #5: Determine the marked (i.e., “bubbled in”) answer for each row.
  • Step #6: Lookup the correct answer in our answer key to determine if the user was correct in their choice.
  • Step #7: Repeat for all questions in the exam.

OMR

Optical Mark Recognition, or OMR for short, is the process of automatically analyzing human-marked documents and interpreting their results.

With the basic understanding of OMR, let’s build a computer vision system using Python and OpenCV that can read and grade bubble sheet tests.

Below is an example filled-in bubble sheet exam that I have put together for this project:
image.png

Document Scanner

Get started

  1. #!/usr/bin/env python
  2. # encoding: utf-8
  3. # import the necessary packages
  4. from imutils.perspective import four_point_transform
  5. from imutils import contours
  6. import numpy as np
  7. import argparse
  8. import imutils
  9. import cv2
  10. # construct the argument parse and parse the arguments
  11. ap = argparse.ArgumentParser()
  12. ap.add_argument("-i", "--image", required=False, default='omr_test_01.png', help="path to the input image")
  13. args = vars(ap.parse_args())
  14. # define the answer key which maps the question number
  15. # to the correct answer
  16. ANSWER_KEY = {0: 1, 1: 4, 2: 0, 3: 3, 4: 1}
  • Lines 4-9 import our required Python packages.
  • Lines 11-13 parse our command line arguments. We only need a single switch here, --image , which is the path to the input bubble sheet test image that we are going to grade for correctness.
  • Line 16 then defines our ANSWER_KEY .

As the name of the variable suggests, the ANSWER_KEY provides integer mappings of the question numbers to the index of the correct bubble.

In this case, a key of 0 indicates the first question, while a value of 1 signifies “B” as the correct answer (since “B” is at index 1 in the string “ABCDE”). As a second example, consider a key of 1 that maps to a value of 4: this would indicate that the answer to the second question is “E”.
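To make the mapping concrete, converting a bubble index back to its letter is just string indexing. Here is a quick illustrative snippet (a hypothetical helper, not part of the grader itself):

```python
# hypothetical helper: translate ANSWER_KEY indices to letters
LETTERS = "ABCDE"
ANSWER_KEY = {0: 1, 1: 4, 2: 0, 3: 3, 4: 1}

for question, idx in sorted(ANSWER_KEY.items()):
    # e.g. question 0 -> "B", question 1 -> "E"
    print("Question #{}: {}".format(question + 1, LETTERS[idx]))
```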

Preprocess input image

  1. # load the image, convert it to grayscale, blur it slightly, then find edges
  2. image = cv2.imread(args["image"])
  3. gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
  4. blurred = cv2.GaussianBlur(gray, (5, 5), 0)
  5. edged = cv2.Canny(blurred, 75, 200)
  • Line 2 loads our image from disk.
  • Line 3 converts it to grayscale.
  • Line 4 blurs it to reduce high-frequency noise.
  • Line 5 applies the Canny edge detector to find the edges/outlines of the exam.

Below I have included a screenshot of our exam after applying edge detection:
test_grader_edged.png
Notice how the edges of the document are clearly defined, with all four vertices of the exam being present in the image.
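The fixed Canny thresholds of 75 and 200 work well for this exam, but if they don't generalize to your own scans, one common trick is to derive both thresholds from the image median. This helper is my own sketch, not part of the tutorial code:

```python
import numpy as np
import cv2

def auto_canny(image, sigma=0.33):
    # derive lower/upper Canny thresholds from the median pixel intensity
    v = np.median(image)
    lower = int(max(0, (1.0 - sigma) * v))
    upper = int(min(255, (1.0 + sigma) * v))
    return cv2.Canny(image, lower, upper)

# drop-in replacement for the fixed-threshold call:
# edged = auto_canny(blurred)
```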

Contour Sorting

Obtaining this silhouette of the document is extremely important in our next step as we will use it as a marker to apply a perspective transform to the exam, obtaining a top-down, birds-eye-view of the document.

  1. # find contours in the edge map, then initialize the contour that corresponds to the document
  2. cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
  3. cnts = imutils.grab_contours(cnts)
  4. docCnt = None
  5. # ensure that at least one contour was found
  6. if len(cnts) > 0:
  7.     # sort the contours according to their size in
  8.     # descending order
  9.     cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
  10.     # loop over the sorted contours
  11.     for c in cnts:
  12.         # approximate the contour
  13.         peri = cv2.arcLength(c, True)
  14.         approx = cv2.approxPolyDP(c, 0.02 * peri, True)
  15.         # if our approximated contour has four points,
  16.         # then we can assume we have found the paper
  17.         if len(approx) == 4:
  18.             docCnt = approx
  19.             break
  • Line 6 makes sure at least one contour was found.
  • Line 9 sorts our contours by their area (from largest to smallest). This implies that larger contours will be placed at the front of the list, while smaller contours will appear farther back.
  • Line 11 loops over each of our (sorted) contours. For each of them, we approximate the contour, which in essence means we simplify the number of points in the contour, making it a “more basic” geometric shape.
  • Line 17 checks whether our approximated contour has four points, and if it does, we assume that we have found the exam.

Since we sorted the contours by area and stopped at the first one with four vertices, docCnt now holds the outline of the exam itself.

Below I have included an example image that demonstrates the docCnt variable being drawn on the original image:
test_grader_outline.png
Sure enough, this area corresponds to the outline of the exam.
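The visualization itself takes only a few lines (the same drawing code appears in the full source at the end of this post):

```python
# draw the detected exam outline on a copy of the original image
if docCnt is not None:
    outline = image.copy()
    cv2.drawContours(outline, [docCnt], -1, (0, 255, 0), 2)
    cv2.imshow("Outline", outline)
    cv2.waitKey(0)
```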

Perspective Transforms

Now that we have used contours to find the outline of the exam, we can apply a perspective transform to obtain a top-down, birds-eye-view of the document.

  1. # apply a four point perspective transform to both the original image and grayscale image to obtain a top-down birds eye view of the paper
  2. paper = four_point_transform(image, docCnt.reshape(4, 2))
  3. warped = four_point_transform(gray, docCnt.reshape(4, 2))
The four_point_transform function:

  • Orders the (x, y)-coordinates of our contour in a specific, reproducible manner.
  • Applies a perspective transform to the region.

This function takes the “skewed” exam and transforms it, returning a top-down view of the document (original on the left, grayscale on the right):
test_grader_top_down_original.pngtest_grader_top_down_gray.png
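Both steps are handled for us by imutils. For intuition, the coordinate-ordering step can be sketched as follows; this is a minimal sketch of the idea, not imutils' exact implementation:

```python
import numpy as np

def order_points(pts):
    # order a (4, 2) array as top-left, top-right, bottom-right, bottom-left
    rect = np.zeros((4, 2), dtype="float32")
    s = pts.sum(axis=1)
    rect[0] = pts[np.argmin(s)]  # top-left has the smallest x + y
    rect[2] = pts[np.argmax(s)]  # bottom-right has the largest x + y
    d = np.diff(pts, axis=1)
    rect[1] = pts[np.argmin(d)]  # top-right has the smallest y - x
    rect[3] = pts[np.argmax(d)]  # bottom-left has the largest y - x
    return rect
```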

Binarization

Next, we threshold the image, segmenting the foreground from the background:

  1. # apply Otsu's thresholding method to binarize the warped piece of paper
  2. thresh = cv2.threshold(warped, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]

After applying Otsu’s thresholding method, our exam is now a binary image:
test_grader_top_down_thresh.png
Notice how the background of the image is black, while the foreground is white.
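Note that when cv2.THRESH_OTSU is set, the threshold value we pass in (0) is ignored: Otsu's method computes the optimal value from the image histogram. If you're curious, you can inspect the value it chose, since cv2.threshold returns it as the first element of the tuple:

```python
# cv2.threshold returns (computed threshold, binarized image)
(T, thresh) = cv2.threshold(warped, 0, 255,
    cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
print("[INFO] Otsu's threshold: {:.2f}".format(T))
```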

Contour Extraction

The above binarization allows us to once again apply contour extraction techniques, this time to find each of the bubbles in the exam.

  1. # find contours in the thresholded image,
  2. # then initialize the list of contours that correspond to questions
  3. cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
  4. cnts = imutils.grab_contours(cnts)
  5. questionCnts = []
  6. # loop over the contours
  7. for c in cnts:
  8.     # compute the bounding box of the contour,
  9.     # then use the bounding box to derive the aspect ratio
  10.     (x, y, w, h) = cv2.boundingRect(c)
  11.     ar = w / float(h)
  12.     # in order to label the contour as a question, the
  13.     # region should be sufficiently wide, sufficiently tall,
  14.     # and have an aspect ratio approximately equal to 1
  15.     if w >= 20 and h >= 20 and ar >= 0.9 and ar <= 1.1:
  16.         questionCnts.append(c)
  • Lines 3-5 find contours on our thresh binary image, then initialize questionCnts , a list of contours that correspond to the questions/bubbles on the exam.
  • Line 7 loops over each of the individual contours to determine which regions of the image are bubbles.
  • Line 10 computes the bounding box for each of these contours.
  • Line 11 computes the aspect ratio, or more simply, the ratio of the width to the height.

In order for a contour area to be considered a bubble, the region should:

  1. Be sufficiently wide and tall (in this case, at least 20 pixels in both dimensions).
  2. Have an aspect ratio that is approximately equal to 1.

As long as these checks hold, we can update our questionCnts list and mark the region as a bubble. Below I have included a screenshot that has drawn the output of questionCnts on our image:
test_grader_top_down_questions.png
Notice how only the question regions of the exam are highlighted and nothing else.
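The overlay above is produced by drawing each entry of questionCnts on the transformed exam (the same visualization code appears in the full source below):

```python
# visualize the detected bubbles on the top-down view of the exam
questions = paper.copy()
for cnt in questionCnts:
    cv2.drawContours(questions, [cnt], -1, (0, 0, 255), 3)
cv2.imshow("questions", questions)
cv2.waitKey(0)
```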

Grading

We can now move on to the “grading” portion of our OMR system.

Each Row

  1. # sort the question contours top-to-bottom,
  2. # then initialize the total number of correct answers
  3. questionCnts = contours.sort_contours(questionCnts, method="top-to-bottom")[0]
  4. correct = 0
  5. # each question has 5 possible answers,
  6. # so loop over the questions in batches of 5
  7. for (q, i) in enumerate(np.arange(0, len(questionCnts), 5)):
  8.     # sort the contours for the current question from left to right,
  9.     # then initialize the index of the bubbled answer
  10.     cnts = contours.sort_contours(questionCnts[i:i + 5])[0]
  11.     bubbled = None
  • Line 7 starts looping over our questions. Since each question has 5 possible answers, we apply NumPy array slicing to grab the current batch of five contours.
  • Line 10 sorts that batch of contours from left to right, and Line 11 initializes bubbled , which will track the index of the marked answer.

test_grader_top_down_ques_rows.png
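The color-coded rows above are drawn by giving each batch of five sorted contours its own color (this mirrors the ques_rows code in the full source below):

```python
# color each row of five bubbles differently to verify the sorting
ques_rows = paper.copy()
ques_colors = [(0, 0, 255), (0, 255, 142), (150, 255, 0),
    (0, 255, 251), (255, 55, 148)]
for (q, i) in enumerate(np.arange(0, len(questionCnts), 5)):
    row = contours.sort_contours(questionCnts[i:i + 5])[0]
    cv2.drawContours(ques_rows, row, -1, ques_colors[i // 5], 3)
cv2.imshow("rows", ques_rows)
cv2.waitKey(0)
```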

Filled in Bubbles

To determine which bubble is filled in, we use our thresh image and count the number of non-zero pixels (i.e., foreground pixels) in each bubble region.

  1. # loop over the sorted contours
  2. for (j, c) in enumerate(cnts):
  3.     # construct a mask that reveals only the current "bubble" for the question
  4.     mask = np.zeros(thresh.shape, dtype="uint8")
  5.     cv2.drawContours(mask, [c], -1, 255, -1)
  6.     # apply the mask to the thresholded image,
  7.     # then count the number of non-zero pixels in the bubble area
  8.     mask = cv2.bitwise_and(thresh, thresh, mask=mask)
  9.     total = cv2.countNonZero(mask)
  10.     # if the current total has a larger number of total non-zero pixels,
  11.     # then we are examining the currently bubbled-in answer
  12.     if bubbled is None or total > bubbled[0]:
  13.         bubbled = (total, j)
  • Line 2 handles looping over each of the sorted bubbles in the row.
  • Lines 4-5 construct a mask that reveals only the current bubble.
  • Lines 8-9 then count the number of non-zero pixels in the masked region.
  • Lines 12-13: the more non-zero pixels we count, the more foreground pixels there are, and therefore the bubble with the maximum non-zero count is the one the test taker has bubbled in.

omr_mask.gif
Clearly, the bubble associated with “B” has the most thresholded pixels, and is therefore the bubble that the user has marked on their exam.

Look up the correct answer

This next code block handles looking up the correct answer in the ANSWER_KEY, updating any relevant bookkeeping variables, and finally drawing the outline of the correct answer on our image.

  1. # initialize the contour color and the index of the *correct* answer
  2. color = (0, 0, 255)
  3. k = ANSWER_KEY[q]
  4. # check to see if the bubbled answer is correct
  5. if k == bubbled[1]:
  6.     color = (0, 255, 0)
  7.     correct += 1
  8. # draw the outline of the correct answer on the test
  9. cv2.drawContours(paper, [cnts[k]], -1, color, 3)

Whether the test taker was correct or incorrect determines which color is drawn on the exam. If the test taker is correct, we highlight their answer in green. However, if the test taker made a mistake and marked an incorrect answer, we let them know by highlighting the correct answer in red:
test_grader_top_down_exam.png

Give the grade

Our last code block handles scoring the exam and displaying the results to our screen.

  1. # grab the test taker's score
  2. score = (correct / 5.0) * 100
  3. print("[INFO] score: {:.2f}%".format(score))
  4. cv2.putText(paper, "{:.2f}%".format(score), (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 0, 255), 2)
  5. cv2.imshow("Grader", paper)
  6. cv2.waitKey(0)

Below you can see the output of our fully graded example image:
test_grader_top_down_grade.png

More test results

omr_test_02.jpgomr_test_02_grade.png

omr_test_03.jpgomr_test_03_grade.png

omr_test_04.jpgomr_test_04_grade.png
omr_test_05.jpgomr_test_05_grade.png

Shortcomings

Let's discuss some of the shortcomings of the current bubble sheet scanner system and how we can improve it in future iterations.

Why not circle detection?

Tuning the parameters to Hough circles on an image-to-image basis can be a real pain.

The real reason is user error.

How many times, whether purposely or not, have you filled in outside the lines on your bubble sheet? I'm no expert, but I'd have to guess that at least 1 in every 20 marks a test taker fills in is “slightly” outside the lines.

The cv2.findContours function doesn’t care if the bubble is “round”, “perfectly round”, or “oh my god, what the hell is that?”.
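For comparison, here is roughly what the circle-detection route would look like. Every parameter value below is an illustrative guess that would need retuning on a per-image basis, and a mark that bleeds outside the circle can still break detection entirely:

```python
# a sketch of the Hough-circles alternative; dp, minDist, param1, param2,
# and the radius bounds are all guesses that rarely transfer between scans
circles = cv2.HoughCircles(warped, cv2.HOUGH_GRADIENT, dp=1.2, minDist=20,
    param1=50, param2=30, minRadius=10, maxRadius=30)

if circles is not None:
    # circles has shape (1, N, 3): one (x, y, radius) triple per detection
    circles = np.round(circles[0]).astype("int")
```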

What’s More

While I was able to get the barebones of a working bubble sheet test scanner implemented, there are certainly a few areas that need improvement. The most obvious area for improvement is the logic to handle non-filled in bubbles.

Since we determine if a particular bubble is “filled in” simply by counting the number of thresholded pixels in each bubble of a row and taking the bubble with the largest count, this can lead to two problems:

  1. What happens if a user does not bubble in an answer for a particular question?
  2. What if the user is nefarious and marks multiple bubbles as “correct” in the same row?

Not bubbled in a row

If a test taker chooses not to bubble in an answer for a particular row, we can enforce a minimum threshold on the cv2.countNonZero total before accepting a bubble as filled in, as in the sketch below.
image.png
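Here is a minimal sketch of that guard, assuming the bubbled variable from the grading loop above; MIN_FILLED is a hypothetical pixel count you would tune to your bubble size and scan resolution:

```python
# hypothetical guard: if even the strongest bubble is mostly empty,
# treat the row as unanswered rather than guessing
MIN_FILLED = 450  # tune to your bubble size / scan resolution

if bubbled is None or bubbled[0] < MIN_FILLED:
    bubbled = None  # no valid answer for this row
```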

Multi-bubbled in a row

Apply the same thresholding and counting step, this time keeping track of whether multiple bubbles have a total that exceeds some pre-defined value. If so, we invalidate the question and mark it as incorrect, as in the sketch below.
image.png
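A minimal sketch of that check, reusing the masking logic from the grading loop; MIN_FILLED is the same hypothetical fill threshold as above:

```python
# recompute the per-bubble totals for the current row
totals = []
for c in cnts:
    mask = np.zeros(thresh.shape, dtype="uint8")
    cv2.drawContours(mask, [c], -1, 255, -1)
    totals.append(cv2.countNonZero(cv2.bitwise_and(thresh, thresh, mask=mask)))

# more (or fewer) than one sufficiently-filled bubble invalidates the row
filled = [j for (j, t) in enumerate(totals) if t >= MIN_FILLED]
if len(filled) != 1:
    bubbled = None  # mark the question as incorrect
```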

Source Code

The complete source code can be found below:

```python
#!/usr/bin/env python
# encoding: utf-8
# import the necessary packages
from imutils.perspective import four_point_transform
from imutils import contours
import numpy as np
import argparse
import imutils
import cv2

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=False, default="omr_test_01.png",
    help="path to the input image")
args = vars(ap.parse_args())

# define the answer key which maps the question number to the correct answer
ANSWER_KEY = {0: 1, 1: 4, 2: 0, 3: 3, 4: 1}

# load the image, convert it to grayscale, blur it slightly, then find edges
image = cv2.imread(args["image"])
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edged = cv2.Canny(blurred, 75, 200)
cv2.imshow("edged", edged)
cv2.waitKey(0)

# find contours in the edge map, then initialize the contour
# that corresponds to the document
cnts = cv2.findContours(edged.copy(), cv2.RETR_EXTERNAL,
    cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
docCnt = None

# ensure that at least one contour was found
if len(cnts) > 0:
    # sort the contours according to their size in descending order
    cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
    # loop over the sorted contours
    for c in cnts:
        # approximate the contour
        peri = cv2.arcLength(c, True)
        approx = cv2.approxPolyDP(c, 0.02 * peri, True)
        # if our approximated contour has four points,
        # then we can assume we have found the paper
        if len(approx) == 4:
            docCnt = approx
            break

# visualize the detected exam outline
if docCnt is not None:
    outline = image.copy()
    cv2.drawContours(outline, [docCnt], -1, (0, 255, 0), 2)
    cv2.imshow("Outline", outline)
    cv2.waitKey(0)

# apply a four point perspective transform to both the original image and
# grayscale image to obtain a top-down birds eye view of the paper
paper = four_point_transform(image, docCnt.reshape(4, 2))
warped = four_point_transform(gray, docCnt.reshape(4, 2))
cv2.imshow("Original", imutils.resize(paper, height=500))
cv2.waitKey(0)
cv2.imshow("Scanned", imutils.resize(warped, height=500))
cv2.waitKey(0)

# apply Otsu's thresholding method to binarize the warped piece of paper
thresh = cv2.threshold(warped, 0, 255,
    cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
cv2.imshow("thresh", thresh)
cv2.waitKey(0)

# find contours in the thresholded image, then initialize
# the list of contours that correspond to questions
cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,
    cv2.CHAIN_APPROX_SIMPLE)
cnts = imutils.grab_contours(cnts)
questionCnts = []

# loop over the contours
for c in cnts:
    # compute the bounding box of the contour, then use the
    # bounding box to derive the aspect ratio
    (x, y, w, h) = cv2.boundingRect(c)
    ar = w / float(h)
    # in order to label the contour as a question, the region should be
    # sufficiently wide, sufficiently tall, and have an aspect ratio
    # approximately equal to 1
    if w >= 20 and h >= 20 and ar >= 0.9 and ar <= 1.1:
        questionCnts.append(c)

# visualize the detected bubbles
questions = paper.copy()
for cnt in questionCnts:
    cv2.drawContours(questions, [cnt], -1, (0, 0, 255), 3)
cv2.imshow("questions", questions)
cv2.waitKey(0)

ques_rows = paper.copy()
ques_colors = [(0, 0, 255), (0, 255, 142), (150, 255, 0),
    (0, 255, 251), (255, 55, 148)]

# sort the question contours top-to-bottom, then initialize
# the total number of correct answers
questionCnts = contours.sort_contours(questionCnts,
    method="top-to-bottom")[0]
correct = 0

# each question has 5 possible answers, so loop over the
# questions in batches of 5
for (q, i) in enumerate(np.arange(0, len(questionCnts), 5)):
    # sort the contours for the current question from left to right,
    # then initialize the index of the bubbled answer
    cnts = contours.sort_contours(questionCnts[i:i + 5])[0]
    bubbled = None
    cv2.drawContours(ques_rows, cnts, -1, ques_colors[i // 5], 3)
    # loop over the sorted contours
    for (j, c) in enumerate(cnts):
        # construct a mask that reveals only the current "bubble"
        # for the question
        mask = np.zeros(thresh.shape, dtype="uint8")
        cv2.drawContours(mask, [c], -1, 255, -1)
        # apply the mask to the thresholded image, then count the
        # number of non-zero pixels in the bubble area
        mask = cv2.bitwise_and(thresh, thresh, mask=mask)
        total = cv2.countNonZero(mask)
        # if the current total has a larger number of non-zero pixels,
        # then we are examining the currently bubbled-in answer
        if bubbled is None or total > bubbled[0]:
            bubbled = (total, j)
    # initialize the contour color and the index of the *correct* answer
    color = (0, 0, 255)
    k = ANSWER_KEY[q]
    # check to see if the bubbled answer is correct
    if k == bubbled[1]:
        color = (0, 255, 0)
        correct += 1
    # draw the outline of the correct answer on the test
    cv2.drawContours(paper, [cnts[k]], -1, color, 3)

cv2.imshow("rows", ques_rows)
cv2.waitKey(0)

# grab the test taker's score
score = (correct / 5.0) * 100
print("[INFO] score: {:.2f}%".format(score))
cv2.imshow("Original", image)
cv2.imshow("Exam", paper)
cv2.waitKey(0)
cv2.putText(paper, "{:.2f}%".format(score), (10, 30),
    cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 0, 255), 2)
cv2.imshow("Grader", paper)
cv2.waitKey(0)
cv2.destroyAllWindows()
```