Classification of Alcohol Type Using Gas Sensor and K-Nearest Neighbor

Ethanol, isopropyl alcohol and methanol belong to the same alcohol group. Methanol is commonly used as an industrial solvent and is not meant for consumption. Many traditional sellers mix alcoholic beverages into a drink commonly called “oplosan”, which is very dangerous to humans if it contains methanol. To address this problem, a device is needed that measures the alcohol content of a liquid and classifies the alcohol type. The gas sensor-based alcohol classification system presented here combines hardware and software. It measures the ethanol and methanol in each alcoholic drink using an MQ3 gas sensor, with a WeMos board serving as the data acquisition device and microcontroller. A computer processes the data acquired from the gas sensor and applies the K-Nearest Neighbor (K-NN) method to obtain the prediction results. Testing of the K-NN system consisted of testing the effect of the K value and testing its accuracy. Testing the effect of the K value gave an accuracy of 100% at K=1, K=3, K=5 and K=10, and 55% at K=20.


INTRODUCTION
Alcoholic beverages are a problem in Indonesia, often because of illegal producers who make alcoholic beverages with an alcohol content of more than 55%. Presidential Regulation No. 74 of 2013 defines alcoholic beverages as beverages that contain ethyl alcohol or ethanol, processed from agricultural materials containing carbohydrates by fermentation and distillation, or by fermentation without distillation [1].
Ethanol, methanol and isopropyl alcohol belong to the same alcohol group. Alcohol compounds are named according to the length of the carbon chain that makes up the compound. Ethanol, also called ethyl alcohol, has the chemical formula C2H5OH, while methanol, or methyl alcohol, has the chemical formula CH3OH. Methanol is commonly used as an industrial solvent and is not meant for consumption [2]. Methanol and ethanol have the same physical appearance: both are clear and colorless. It is therefore very important to check the label on the packaging before use, as shown in Table 1; knowing the composition of the ingredients from the label is the easiest way to tell the two apart. A problem arises when the alcoholic liquid carries no information label. Many traditional sellers mix alcoholic beverages into a drink commonly called "oplosan", and this mixed liquor is very dangerous to humans if it contains methanol. Up to 2018, the number of people who died from "oplosan" liquors reached 18,000 [3].
To address this problem, a device is needed that measures the alcohol content of a liquid and classifies its alcohol type. Such a tool is expected to help the public ensure, with close to accurate results, that the beverages they drink do not contain methanol, which is harmful to the human body.
Similar studies have been carried out by several researchers, including classification of a single gas, acetone or methanol, using the TGS 2602 sensor [4] and classification of tea aroma using an Arduino Uno-based gas sensor [5]. Another study examined the most appropriate preparation method for measuring alcohol content and the accuracy of gas chromatography in determining the levels of ethanol and methanol in "oplosan" liquors [6].
In this study, the MQ3 alcohol gas sensor was used. Besides exploring a different sensor type, this sensor was chosen because it is cheaper than other available alcohol sensors while offering almost the same sensitivity. However, the MQ3 gas sensor consumes considerably more power than other sensors, around 750 mW [7].
To find out the compounds contained in each type of liquid, the test data are processed using the K-Nearest Neighbor (K-NN) method. K-NN is a method that classifies an object based on the training data closest to that object [8], [9], [10]. The training data are projected into a multidimensional space in which each dimension represents a feature of the data. Using the K-NN method, the accuracy with which each compound in the alcohol type is identified can be determined.
In this research, the gas sensor readings are sent over the internet network, and the K-NN method is then applied to them in MS Excel. The gas sensor used is the MQ3, which classifies the type of alcohol gas, namely ethanol or methanol, dispersed in the room. The accuracy of each sensor reading of the gas distributed in a closed room is tested.

METHOD
The gas sensor-based alcohol classification system in this paper combines hardware and software. As the block diagram shows, the system measures ethanol and methanol using the MQ3 gas sensor, with a WeMos board serving as the data acquisition device, the microcontroller and the Wi-Fi network connection. The data are sent via Wi-Fi and received by the computer, where they are analyzed using the K-NN method, as shown in Figure 1.

Gas Sensor
The MQ3 sensor is a gas sensor suitable for detecting alcohol levels directly. It has high sensitivity and a fast response time. The MQ3 sensing element consists of an SnO2 layer with low conductivity in clean air [11]. The sensor resistance changes when ethanol gas is present at the sensing element. If the ethanol concentration is high, the sensor resistance decreases, so the output voltage increases.
Under normal conditions, namely at room temperature, the surface of the metal oxide (SnO2) crystals interacts with oxygen molecules in the air. Oxygen atoms are adsorbed and bind free electrons on the surface of the metal oxide [12]. Inside the gas sensor, an electric current flows through the grain boundaries of the SnO2 crystals. At these grain boundaries, the adsorbed oxygen forms a potential barrier that prevents charge carriers from moving freely. In the presence of a reducing gas, an oxidation reaction occurs at the surface. The surface density of the negatively charged oxygen decreases, which lowers the height of the barrier at the grain boundaries. As the barrier height falls, the sensor resistance also decreases.
As shown in Figure 2, the minimum circuit for the MQ3 sensor is very simple. The circuit consists of one variable resistor and the H pin, which is connected to a voltage of 5 V.
Figure 2 The MQ3 Sensor Driver Circuit [13]
The alcohol gas sensor selected for this system is the MQ3, manufactured by Hanwei Electronic Co. Ltd. The MQ3 sensor has high sensitivity to alcohol gas and low sensitivity to benzene gas [14]. Table 2 shows the sensitivity level of the MQ3 to some detectable gases. The alcohols tested in the gas sensor evaluation were ethanol and methanol, with the MQ3 serving as the alcohol gas detector that distinguishes between the two.
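The relation between the ADC reading, the output voltage and the sensor resistance described in this section can be sketched as a simple voltage divider. This is an illustration only: the 10-bit ADC range, the assumption that full scale corresponds to the 5 V supply, the 10 kΩ load resistance and the function names are all hypothetical and are not taken from the paper's circuit.

```python
# Sketch under stated assumptions; none of these constants come from the paper
# except the 5 V circuit voltage.
ADC_MAX = 1023          # assumed 10-bit ADC full-scale count
V_SUPPLY = 5.0          # circuit voltage given in the text (5 V)
R_LOAD_OHM = 10_000.0   # hypothetical setting of the variable load resistor


def adc_to_voltage(adc_count, v_ref=V_SUPPLY, adc_max=ADC_MAX):
    """Convert a raw ADC count to the voltage across the load resistor.

    Assumes the ADC full scale maps to v_ref, which may not hold on a real board.
    """
    if not 0 <= adc_count <= adc_max:
        raise ValueError("ADC count out of range")
    return adc_count / adc_max * v_ref


def sensor_resistance(v_out, v_supply=V_SUPPLY, r_load=R_LOAD_OHM):
    """Voltage-divider relation: Rs = RL * (Vc - Vout) / Vout."""
    if v_out <= 0:
        raise ValueError("output voltage must be positive")
    return r_load * (v_supply - v_out) / v_out


if __name__ == "__main__":
    # ADC counts in the range the paper reports for ethanol at 10 cm.
    for count in (380, 500):
        v = adc_to_voltage(count)
        print(count, round(v, 3), round(sensor_resistance(v)))
```

Under these assumptions, a higher ADC count corresponds to a higher output voltage and a lower sensor resistance, which matches the behavior described for rising ethanol concentration.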

Data Acquisition
Data acquisition is one of the stages of the Knowledge Discovery in Database (KDD) process; here it records the gas sensor data while the samples are tested [15]. With the acquired data, we can classify, predict, estimate and extract other useful information from large data sets.

Figure 3 The Design Circuit of Alcohol Classification
The hardware in this research starts with the gas sensor circuit block, which consists of a commercially available MQ3 gas sensor, as shown in Figure 3. Measurement of the volatile alcohol content in a holding container was simulated by placing liquid alcohol in a closed space and measuring the vapor with the MQ3 sensor. The output of the MQ3 sensor is a voltage that results from converting the detected alcohol content. This voltage was forwarded to the WeMos A0 analog input pin, and the alcohol content reading was displayed on a 16x2 LCD screen via pins D1 and D2. Measurements were made by pouring alcohol into the reservoir. The MQ3 sensor was mounted through a hole drilled in the reservoir cap to identify the compounds contained in each type of liquid alcohol.
From the sensor, the voltage and the alcohol content reading were forwarded to the WeMos A0 analog input pin, where the conversion formula was applied. The unit was controlled by the ESP8266 IoT embedded system, and the information was then sent to the database server over the Wi-Fi internet network.

K-Nearest Neighbor
The computer was used to process the data acquired from the gas sensor using the K-Nearest Neighbor (K-NN) method. K-NN is a supervised algorithm in which a new query instance is classified according to the majority category among its K nearest neighbors. The purpose of the algorithm is to classify new objects based on their attributes and the training samples. The K-NN algorithm is very simple: it uses the shortest distances from the query instance to the training samples to determine the K nearest neighbors [16]. The training samples are projected onto a multidimensional space in which each dimension represents a feature of the data. This space is divided into regions based on the classification of the training samples. A point in this space is assigned class c if c is the most frequent class among the K nearest neighbors of that point [17]. How near or far a neighbor is, is usually measured by the Euclidean distance:

$$d(a, b) = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2 + \cdots + (a_n - b_n)^2} \qquad (1)$$

Where:
a = sample data / training data
b = testing data
i = data variable
n = data dimension

The pattern of each gas type is identified through the K-NN classification.
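The classification rule above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation, which the paper states was carried out in MS Excel. The function names and the sample values are hypothetical; the two features stand in for normalized sensor readings, with class 1 = ethanol and class 0 = methanol.

```python
import math
from collections import Counter


def euclidean_distance(a, b):
    """Equation (1): square root of the summed squared feature differences."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same dimension")
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def knn_predict(training, query, k):
    """Return the majority label among the k training samples nearest to query.

    training is a list of (features, label) pairs. Ties between labels are not
    handled specially, so an odd k is advisable for two classes.
    """
    if not 1 <= k <= len(training):
        raise ValueError("k must be between 1 and the number of training samples")
    ranked = sorted(training, key=lambda sample: euclidean_distance(sample[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical normalized readings; class 1 = ethanol, class 0 = methanol.
TRAINING = [
    ((0.90, 0.80), 1), ((0.80, 0.70), 1), ((0.85, 0.90), 1),
    ((0.10, 0.05), 0), ((0.20, 0.10), 0), ((0.00, 0.05), 0),
]

if __name__ == "__main__":
    print(knn_predict(TRAINING, (0.75, 0.80), k=3))
```

Sorting all training samples by distance is adequate for a data set of this size; larger data sets would normally use a spatial index instead.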

RESULTS AND DISCUSSIONS
The MQ3 gas sensor was used to detect the vapor of each type of alcohol, methanol and ethanol, directly, in the driver circuit with one variable resistor. The output of the MQ3 sensor was an analog voltage proportional to the concentration of alcohol vapor received. This voltage was read through the ADC, which passes the reading to the microcontroller.

Figure 4 The Prototype of Alcohol Classification
The MQ3 gas sensor output was a voltage that results from converting the detected alcohol content. The voltage was forwarded to the WeMos A0 analog input pin, and the alcohol content reading was displayed on a 16x2 LCD screen via pins D1 and D2. The measurements were made by pouring alcohol into a round container with a diameter of 15 cm and a height of 10 cm, as shown in Figure 4. The MQ3 sensor was mounted through a hole drilled in the lid of the alcohol reservoir to read the gas given off by the ethanol or methanol being tested.
The first stage was to prepare the training data in advance. Training data are data or information collected previously, for which the class or label is already known. In this study, the training data were the gas sensor readings for ethanol and methanol.
Figure 5 The Response of MQ3 Gas Sensor to Ethanol at Different Distances
Figure 5 shows the measurement of ethanol placed in a holding container, with the gas sensor at distances of 10 cm and 20 cm.
The further the device was from the ethanol source, the lower the concentration of ethanol vapor detected. Figure 5 shows this: on the MQ3 gas sensor, ethanol read at a distance of 10 cm from the source produced ADC values of 380-500, while at 20 cm it produced 330-440.
Figure 6 The Response of MQ3 Gas Sensor to Methanol at Different Distances
Figure 6 shows the measurement of methanol placed in a holding container, with the gas sensor at distances of 10 cm and 20 cm. In Figures 5 and 6, the x-axis is the measurement time and the y-axis is the ADC value of the gas sensor voltage measurement.
The second stage was to create a table for the normalization calculation. Each attribute was normalized, and the results were checked to lie in the interval from 0 to 1.
The third stage was to calculate the Euclidean distance. Euclidean distance measures the proximity of two objects as the straight-line distance between them. This measure is suitable for data whose attribute values are numeric, in particular continuous attributes.
Table 2 shows the normalization calculation, where class 1 = ethanol and class 0 = methanol.
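The normalization equation itself is not reproduced in this text. Since the normalized values are required to lie between 0 and 1, one common choice that satisfies this is min-max scaling, sketched below. The choice of min-max scaling, the function name and the sample ADC values are assumptions for illustration, not the paper's documented formula or data.

```python
def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1.

    Assumed normalization; the paper only states that results lie in [0, 1].
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to rescale, so map everything to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


if __name__ == "__main__":
    # Hypothetical ADC readings spanning the ranges reported for ethanol.
    adc_readings = [330, 380, 440, 500]
    print(min_max_normalize(adc_readings))
```

Normalizing each attribute separately keeps an attribute with a large numeric range from dominating the Euclidean distance.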
The Euclidean distance is the most common distance measure used in the K-NN algorithm. The more similar two data points are, the smaller the distance between them. A new data point is therefore matched well when its distance to the training data is minimal, which indicates high similarity. Table 3 shows the Euclidean distance calculation, where class 1 = ethanol and class 0 = methanol.
The sorted Euclidean distances were truncated to the K smallest values, and the majority class among them was taken as the prediction result. The testing data used the values closest to ethanol in the gas sensor readings. The K nearest distances were then identified and paired with their corresponding classes.
The K values used are 1, 3, 5 and 10. The sorting result for the K value is shown in Table 3.
Figure 7 The Testing Accuracy
Figure 7 shows the result of testing the effect of the K value: the accuracy was 100% at K=1, K=3, K=5 and K=10, and decreased to 55% at K=20. The test results show that the value of K strongly influences the classification results and the resulting accuracy. The average accuracy tends to decrease as the value of K increases [18] [19].
Based on Table 3 and the accuracy calculation for the tested data, five of the results obtained in the field matched the K-NN predictions. Thus, the accuracy of the gas sensor reading is 74%.
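The accuracy test can be sketched as the percentage of test samples whose predicted class matches the known class, repeated for several values of K. The data below are hypothetical and deliberately unbalanced to illustrate how a large K can lower accuracy; they do not reproduce the paper's measurements or its reported figures. The function names are also assumptions, and `math.dist` requires Python 3.8 or later.

```python
import math
from collections import Counter


def knn_predict(training, query, k):
    """Majority label among the k training samples nearest to query."""
    ranked = sorted(training, key=lambda sample: math.dist(sample[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy_percent(training, testing, k):
    """Share of test samples classified correctly, as a percentage."""
    correct = sum(1 for features, label in testing
                  if knn_predict(training, features, k) == label)
    return 100.0 * correct / len(testing)


# Hypothetical normalized readings; class 1 = ethanol, class 0 = methanol.
TRAINING = [
    ((0.90, 0.80), 1), ((0.80, 0.90), 1), ((1.00, 1.00), 1),
    ((0.00, 0.05), 0), ((0.10, 0.00), 0), ((0.20, 0.10), 0),
    ((0.05, 0.15), 0), ((0.15, 0.20), 0),
]
TESTING = [
    ((0.85, 0.85), 1), ((0.95, 0.90), 1),
    ((0.10, 0.10), 0), ((0.00, 0.00), 0),
]

if __name__ == "__main__":
    for k in (1, 3, 7):
        print(k, accuracy_percent(TRAINING, TESTING, k))
```

In this toy data set there are only three ethanol training samples, so once K reaches 7 the methanol samples outvote them for every query and the ethanol test samples are misclassified. Whether the same mechanism explains the drop reported at K=20 is not established by the text.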
Based on the comparison data above, the application result gives a lower bound on the predicted result. Under real conditions, a larger number of results may be obtained, so that the approach can deliver greater benefit. This is another advantage of using the K-NN algorithm with Euclidean distance in the gas sensor-based alcohol type classification application studied here.
From the results of the study, it was concluded that the K-NN method can classify methanol and ethanol detected in a closed room very well. Other studies have succeeded in classifying alcohol gas using the SVM method [20]. Both K-NN and SVM are machine learning classification algorithms.

CONCLUSIONS
With a commercially available gas sensor, selectivity increased when the response of the MQ3 gas sensor was combined with the K-Nearest Neighbor (K-NN) calculation, producing a classification of two types of alcohol, namely ethanol and methanol.
The volatile organic compounds detected by the gas sensor were classified according to the training data closest (most similar) to the new data in the K-NN training and testing sets.
Testing of this system consisted of testing the effect of the K value and testing its accuracy. Testing the effect of the K value gave an accuracy of 100% at K=1, K=3, K=5 and K=10, and 55% at K=20.