Proyectos Poliemprende
92 O A DOLFO CAMPO L ÓPEZ G USTAVO 93 ORTIZ LIMÓN NANCY
Data mining is the process of extracting new information and patterns from the large volumes of data collected in a database. Data mining searches through the large volumes of data and analyzes them to find patterns and trends. This newly found patterns and trends are used for making future decisions. The process of mining valuable information from the data collected using machines are adopted by many business, governments and scientists. IoT has made it much easier to
collect the data. But IoT is collecting data in huge volumes with high velocity. Data mining uses artificial intelligence and machine learning to find the hidden gems of knowledge quicker, which can then be implemented into action, resulting in better outcomes. (SAS 2020a.)
Organizations are using data mining for various purposes. They use data mining algorithms to detect any fraudulent credit card activities, calculate credit risks, manage resources and increase response time. Marketing departments are using data mining to expand their sales. Companies are collecting customers data and their buying patterns to suggest new products. Companies look at the trends to bring certain products on sale. With the help of data mining, organizations are able to reach the right customers for their marketing campaigns. Data mining is aiding organizations to improve their customer service, attract new customers and retain existing customers. Data mining is also applicable in healthcare and in science. In healthcare data mining is used for identifying adverse side effects of drugs during clinical trial. (SAS 2020a, 1– 3).
As shown in Figure 5, there are seven phases in data mining one must follow to discover new information: Data Integration, Data Selection, Data Cleaning, Data Transformation, Data Mining, Presentation and Deployment. In data integration data are collected from different sources. From the collection, only the useful data are selected. The selected data are then checked for errors, redundancy and for missing values. The raw data are then transformed into data that are acceptable for data mining. After that, data mining algorithms and techniques are applied to the data. The obtained information and patterns are visualized using graphs which makes them easier for users to study and interact for finding meaningful patterns. In the last phase, the discovered patterns are used when making decisions. (Sumiran 2018.)
Figure 5. Phases of data mining
Data mining uses various techniques and algorithms to identify a model or
patterns from the available data. They are: Classification, Clustering, Regression, Artificial Intelligence, Neural Networks, Association Rules, Decision Trees and Genetic Algorithm. Different algorithms are suitable for different applications. In order to detect fraudulent credit applications, the algorithms are familiarized with both valid records and fraudulent records. Based on these, the algorithms
determine whether the application is valid or fraud. For this type of setups Classification Techniques are applied. Regression Technique predicts the outcomes based on the known target values. This technique is used for
predicting stock prices, sales and product failure rates. (Porkizhi 2017.) Another important data mining technique is Neural Network. Neural Network can adopt to the changes in the input without having to change the criteria of the output. Neural network is able to extract complex patterns that are too difficult for
humans and other computer techniques. They are used for training computers to pronounce English texts and recognize handwritten characters. Neural network is suitable for predicting and forecasting needs and are already deployed in many industries. (Ramageri 2010.)
Data mining has a high importance in IoT. The data collected from IoT are very valuable. But due to the velocity and volume of data it generates, humans by themselves are not able to take advantage of these data. Data mining is assisting many industries and businesses to grow. They are able to make advantageous
decisions while still maintaining good customer service. Any industry that generates data can take advantage of data mining. (Porkizhi 2017.)
5 SECURITY IN IOT
The idea behind IoT technology is to create a network where devices can seamlessly connect with other devices to positively influence people’s lives. IoT has already built its root in the society. Its application ranges from smart homes, to smart cities, smart healthcare and smart industries. The purpose of deploying IoT in these fields is to make people’s life easier and comfortable while saving costs and energy. IoT collects huge amounts of data from these fields. The collected data can consist of personal and private information about users and companies. Without proper security hackers can exploit this information with the intention of harming an individual or a company. For instance, in smart homes, if the wireless security cameras have poor security features, attackers can use that camera to spy on people’s lives. In healthcare systems smart devices are used to monitor the patients’ health. These devices collect critical information of the patient and help track their health condition. Hackers can hack into these devices to give false reading which can danger the patient’s life. IoT needs to provide strong security features in of its devices and the advancement of IoT will depend on its ability to gain users’ trust by providing strong security and privacy features. (Ali et al. 2016.)
Security is a challenging topic in IoT. This is because IoT is heterogenous in nature. IoT consists of different technologies that are connected together to form an IoT network. IoT lacks a uniform standard and governance, which is why its security and privacy issue is a major concern. A simple IoT architecture consists of three layers: Sensing layer, Network layer and Application layer, as shown in Figure 6. (Hassija et al. 2019.)
Figure 6. A simple IoT architecture
Each of these layers consists of different technologies and each of these technologies has their own security issues. From the security point of view connecting different technologies creates a wider area for network related attacks. If one of these devices is infected with malware it can become the
starting point of infection and can be used to infiltrate the network, causing further damage. (Tanaka et al. 2016.)