Implementation of Template Matching on Detection of Stop Line Violations



INTRODUCTION
A road marking is a sign located above the road surface that consists of a line or a symbol and provides visual guidance to road users [1]. Road markings must be made as clear as possible so that road users can understand them and traffic remains safe.
One example of a road marking is the stop line behind the zebra crossing. A zebra crossing is a road marking that serves as a crossing between pavements across the roadway [2]. The stop line behind the zebra crossing, in turn, is a simple line that tells drivers where they must stop [3]. It helps pedestrians cross the road safely when the red light is on. However, road users often ignore road markings and commit violations, notably by not stopping behind this line [4].
According to WHO's Global Status Report on Road Safety 2015, Indonesia ranks fifth in the world for the highest number of road accidents [5]. Updated WHO data from 2018 on road traffic accident deaths show that Indonesia reached 41,862 fatalities in road traffic accidents [6]. The analysis of road traffic accident data in Indonesia from 2013 to 2019 published by Statista Research Department reveals that the average number of fatalities exceeds 28,000 [7]. Based on road fatality statistics, it is estimated that road fatalities in Indonesia will reach 40,000 in 2020 and up to 65,000 per year by 2035 [8]. Although not all road traffic accidents are caused by stop line violations, [9] describes how the high motorcycle population in Indonesia can have a negative impact because of the accumulation of traffic violations by motorcyclists, particularly at crossroads. One particular violation is not stopping behind the stop line at a zebra crossing, and therefore blocking or disturbing pedestrians crossing the road. This form of violation can be reduced by posting a police officer at every zebra crossing. However, this requires a large number of officers and is therefore impractical.
From the problem definition above, it is clear that a solution is needed to minimize stop line violations and ensure traffic safety. The main contribution of this paper is a system that detects stop line violations automatically using a template matching algorithm. Template matching is used to detect whether the stop line is visible in the image. The system then uses this result to indicate whether or not a violation has occurred.
Applications of template matching for image-based pattern recognition can be widely found in the literature. One example is a door lock security system using Raspberry Pi proposed in [10]. This study presented a system based on image processing to replace RFID, which offers less flexibility. The user simply stands in front of the Raspberry Pi camera, and the captured image is matched against an image in the database using the template matching method. The experimental result shows a success rate of 96%. Another example is presented in [11], in which the template matching method is used for Optical Character Recognition (OCR) and obtains different accuracy rates for different fonts, namely 100% for Calibri, 100% for Verdana, 86.66% for Arial, 80% for Lucida Fax, and lower accuracy for Cambria and Times New Roman.
Previous research on automatic detection of road marking violations can also be found in the literature. In [12], the Boost-CNN method and an automatic hard mining method are used for the stop line detector. The Boost-CNN method is a combination of an AdaBoost classifier and a CNN. It achieves an accuracy of 91.5%. In [13], a stop line violation detection system is built using a Haar Cascade Classifier to identify the vehicles that commit the violation. It can characterize vehicles well and obtains an accuracy of 91.5% at 720p resolution. Another study built a prototype system that uses an infrared sensor module to determine whether a vehicle has halted before the stop line [14]. The sensor signal is then sent to a Raspberry Pi, which captures the license plate number of vehicles that commit violations. The number in the image is identified and converted to audio, which is voiced by Google Voice as a warning. This system produces an accuracy of 80%. However, to the best of the authors' knowledge, no system based on template matching has been reported.
In this paper, the authors propose the use of a template matching method to detect this type of traffic violation. Compared to the previously mentioned methods, template matching offers a much simpler mathematical model and computation. This method also relies only on the captured camera image and does not require any additional hardware or sensors. In this paper, the system is implemented using MATLAB R2017a.
The paper is organized as follows. The METHOD section presents data acquisition, the method, and analysis. The results of the experiments are discussed in the RESULTS AND DISCUSSION section. Finally, the CONCLUSIONS section provides some conclusions and pointers for future work.

METHOD
Data Acquisition
In this research, the authors used closed-circuit television (CCTV) video footage from Simpang Empat Bareng Arah Solo, Klaten as input data. The video is accessed online and live at [15]. In this video, the red light stays on for about 1 minute. The video has a frame rate of about 30 fps and a frame size of 1366×768. The video was taken over several days at three distinct time ranges: morning, afternoon, and evening. All the data were taken when the red light was on. The morning dataset is collected at about 10:00 to 12:00 WIB, the afternoon dataset is collected at about 14:00 to 15

Template Matching
The method used in this paper is template matching, one of the most frequently used techniques in digital image processing, which measures the similarity of all pixels (or features) between the template and a candidate window in the target image [16]. A template is defined as representative local features selected from images [17]. Template matching algorithms work by comparing input image patterns with template image patterns in datasets. The template is moved through every position in the input image to identify whether patterns similar to the template are present in the input image.
The authors take the stop line template behind the zebra crossing from Figure 1(h), Figure 2.
The algorithm of stop line detection using the template matching method is as follows. The first step is pre-processing of the input image. Pre-processing is used to improve computation efficiency and processing speed as well as to remove noise from the image. This process consists of:
a. Input image resizing
The input image is an RGB image with a size of 1366×768, which is cropped and resized to a size of 591×69 so that it focuses on the zebra crossing stop line. This step aims to improve computation efficiency and speed.
b. Input image conversion to grayscale
In the proposed system, color information is not needed, and therefore the processed image is converted from color to grayscale. A grayscale image is an image that only has shades of gray [18]. The grayscale intensity is stored as an 8-bit integer, giving 256 possible shades from black to white.
c. Padding
Padding is the process of adding pixels on the left, right, top, and bottom sides of the image data. It is often required when part of the filter extends beyond the input image [19]. Here the authors use zero padding, i.e., the value of the added pixels is 0. Padding ensures that the template area overlaid on the input image does not fall outside the image. The size of padding that the authors use is 10×10. The authors choose a small padding size because the camera may move slightly due to wind.
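The pre-processing steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' MATLAB code: the function names are hypothetical, the grayscale weights are the common ITU-R BT.601 luma weights (the paper does not state which weights it uses), the "10×10" padding is read as 10 pixels on each side, and the resizing step is omitted so that only the crop is shown.

```python
def crop(img, top, left, height, width):
    """Cut a height x width region of interest out of an image (list of rows)."""
    return [row[left:left + width] for row in img[top:top + height]]


def to_grayscale(rgb):
    """Convert rows of (r, g, b) tuples to 8-bit grayscale intensities."""
    # BT.601 luma weights: an assumption, the paper does not specify them.
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
            for row in rgb]


def zero_pad(img, pad):
    """Add `pad` rows/columns of zeros on every side of a grayscale image."""
    width = len(img[0]) + 2 * pad
    padded = [[0] * width for _ in range(pad)]
    padded += [[0] * pad + list(row) + [0] * pad for row in img]
    padded += [[0] * width for _ in range(pad)]
    return padded


def preprocess(rgb, top, left, height, width, pad=10):
    """Crop to the stop line region, convert to grayscale, then zero-pad."""
    # pad=10 assumes "10x10" means 10 pixels per side (hypothetical reading).
    return zero_pad(to_grayscale(crop(rgb, top, left, height, width)), pad)
```

Cropping before the grayscale conversion means only the pixels around the stop line are converted, which is in line with the stated goal of reducing computation.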
The next stage is classification using a template matching algorithm, which calculates the similarity value between the template image and the input image. This value is calculated using the normalized correlation formula given in [20]. The formula produces a value r in the range -1 ≤ r ≤ +1. This value is compared to a threshold value T. If r ≥ T, then the template image resembles the input image at that position. However, if r < T, then that particular area of the input image does not resemble the template image.
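For reference, one common form of normalized correlation that yields values between -1 and +1 is the zero-mean correlation coefficient sketched below. Whether [20] uses exactly this form is an assumption. The symbols are chosen here for illustration: A is the template image, B is the candidate window of the input image with the same size, \bar{A} and \bar{B} are their mean intensities, and r is the resulting similarity value.

```latex
r = \frac{\sum_{m}\sum_{n}\left(A_{mn}-\bar{A}\right)\left(B_{mn}-\bar{B}\right)}
         {\sqrt{\left(\sum_{m}\sum_{n}\left(A_{mn}-\bar{A}\right)^{2}\right)
                \left(\sum_{m}\sum_{n}\left(B_{mn}-\bar{B}\right)^{2}\right)}}
```

With this form, r = +1 when the window is identical to the template up to a positive change in brightness and contrast, and r is close to 0 when the two are unrelated.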
The threshold value T is determined empirically, as shown in Figure 7. The authors chose two values for T, 0.8 and 0.9, because these values produce better accuracy. If the similarity value is at least T, then the template is detected in the input image, so no violation occurs. Otherwise, if the similarity value is below T, then the template is not detected in the input image. This may be because the stop line is obstructed by a vehicle that has not stopped properly behind it, which is an indication of a violation.
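The matching and decision rule described above can be sketched as follows. This is a hedged illustration in Python rather than the authors' MATLAB implementation: the function names are hypothetical, the similarity measure is assumed to be the zero-mean normalized correlation, and a window with no intensity variation is treated as a non-match because its correlation is undefined.

```python
import math


def normalized_correlation(a, b):
    """Zero-mean normalized correlation of two equal-sized grayscale patches."""
    xs = [v for row in a for v in row]
    ys = [v for row in b for v in row]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mean_x) ** 2 for x in xs)
                    * sum((y - mean_y) ** 2 for y in ys))
    # A flat patch has zero variance; score it as 0.0 (an assumption).
    return num / den if den > 0 else 0.0


def best_match(image, template):
    """Slide the template over every position and return the highest score."""
    th, tw = len(template), len(template[0])
    best = -1.0
    for top in range(len(image) - th + 1):
        for left in range(len(image[0]) - tw + 1):
            window = [row[left:left + tw] for row in image[top:top + th]]
            best = max(best, normalized_correlation(template, window))
    return best


def classify(image, template, threshold=0.8):
    """Stop line found (score >= threshold) means no violation."""
    if best_match(image, template) >= threshold:
        return "no violation"
    return "violation"
```

For example, calling `classify(image, template, threshold=0.9)` applies the stricter of the two thresholds considered in the paper. A stricter threshold makes the system more likely to report a violation when lighting changes reduce the similarity score.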

Analysis
The output of the system, for both threshold values 0.8 and 0.9, is a classification that assigns each input to one of two classes, namely the "violation" class and the "no violation" class. There are four possible outcomes, namely true positive (TP), false negative (FN), true negative (TN), and false positive (FP). A True Positive (TP) occurs when an input image belonging to the positive class is classified into the positive class by the system. If such an image is classified into the negative class by the system, we have a False Negative (FN). When the input image belongs to the negative class and is classified by the system into the negative class, we have a True Negative (TN). When such an image is instead classified into the positive class, we have a False Positive (FP) [21]. These relationships are summarized in the confusion matrix shown in Figure 5. In this paper, an input image is said to belong to the positive class when it contains a stop line violation. Conversely, an image is said to belong to the negative class when it does not contain a stop line violation.
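The four outcomes above translate directly into the accuracy, precision, and recall figures reported in the results. The sketch below uses the standard definitions of these metrics, which are assumed to match those used by the authors. The function names and the example labels are hypothetical and are not data from the paper.

```python
def confusion_counts(actual, predicted, positive="violation"):
    """Count TP, FN, TN, FP, with "violation" as the positive class."""
    tp = fn = tn = fp = 0
    for truth, guess in zip(actual, predicted):
        if truth == positive:
            if guess == positive:
                tp += 1
            else:
                fn += 1
        else:
            if guess == positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, tn, fp


def metrics(tp, fn, tn, fp):
    """Return (accuracy, precision, recall) using the standard definitions."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Made-up labels for illustration only.
actual = ["violation", "violation", "no violation", "no violation", "no violation"]
predicted = ["violation", "no violation", "no violation", "violation", "no violation"]
print(metrics(*confusion_counts(actual, predicted)))
```

Under these definitions, a shadow that hides the stop line and triggers a false alarm increases FP, which lowers precision and accuracy but leaves recall unchanged.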

RESULTS AND DISCUSSION
The number of classified images for each class, as well as the accuracy, precision, and recall percentages, are shown in Table 1 and Table 2 for the threshold value of 0.8, and in Table 3 and Table 4 for the threshold value of 0.9. Table 2 and Table 4 show that the average accuracy of the system for threshold values of 0.8 and 0.9 is 83% and 79%, respectively. The authors chose the threshold value of 0.8 because its average accuracy is higher than that of the threshold value of 0.9. However, it should be noted that while the overall average accuracy drops, increasing the threshold to 0.9 raises the accuracy rate for the afternoon dataset.
In the morning dataset, many images are classified as FP (shown in Table 3) because they were taken at a time when tree shadows (an effect of sunlight) cover the observed stop line. Such an image is classified as a violation even though the stop line is covered only by a shadow.
In the evening dataset, some images show no violation but the program detects a violation. This is because vehicle headlamps change the apparent color of the asphalt, so the input image is no longer similar to the template image.
From the results above, the template matching method has the potential to be used for detecting stop line violations behind the zebra crossing.

CONCLUSIONS
This research proposes a system to detect stop line violations using the template matching method. The system classifies whether or not a vehicle has halted over the stop line, and thus committed a violation, by comparing the input image to the stop line template image.
The system achieves an accuracy of about 100% and 78% for the morning and afternoon datasets, respectively, and 70% for the evening dataset. The average accuracy is therefore 83%. However, this research is limited to a single location, requires manually changing the template for each time period, and depends on good lighting conditions.
Future work will improve the system, for example by automatically extracting and switching to the best template image for each condition, location, and weather, so that a better level of accuracy is obtained.

AUTHOR(S) BIOGRAPHY
Dwira Kurnia Larasati
She is an undergraduate student of Electronic Engineering from Universitas Kristen Satya Wacana in her third year. The author's interest is in the field of computer vision and machine learning.

Iwan Setyawan
Iwan Setyawan is currently a lecturer at the Department of Electronic Engineering of Universitas Kristen Satya Wacana, Salatiga. His research interests include digital image and video processing, digital watermarking, and pattern recognition.