Detection and Removal of Outlier and Redundant Data in Wireless Sensor Networks Using Classification Techniques (Case Study: Kashan Meteorological Data)

Article type: Research paper

Authors

1 Assistant Professor, Department of Computer and Information Technology, Kashan Branch, Islamic Azad University, Kashan, Iran

2 M.Sc., Department of Computer and Information Technology, Kashan Branch, Islamic Azad University, Kashan, Iran

Abstract

Wireless sensor networks are a new generation of networks that consist of a large number of sensor nodes scattered in the environment, each of which pursues a specific goal autonomously and in cooperation with other nodes. Since data transmission is one of the most energy-consuming operations in wireless sensor networks, reducing the amount of data transmitted in these networks lowers energy consumption and therefore extends network lifetime. The methods most often used to reduce data transmission in wireless sensor networks for the purpose of saving energy include data aggregation, data filtering, and data prediction. In addition, suitable routing protocols can play an effective role in conserving the energy consumed by nodes. This research attempts to reduce energy consumption across the whole network by identifying outliers with several classification techniques and by using hierarchical routing protocols. The experiments were carried out on sensor data from the Kashan automatic weather station. The results show that, by applying a support vector machine with a polynomial kernel of degree 3 and a regularization coefficient of 10 in the proposed sensor network, about 92% of the data can be identified correctly within 2 seconds; filtering out outlier and redundant data and sending only new, correct data thus helps save energy and consequently extend network lifetime.


Article title [English]

Detection and removal of outliers and redundant data in wireless sensor networks by classification techniques (Case Study: Weather data of Kashan)

Authors [English]

  • Mehdi Esmaeli 1
  • Azimeh Sharif 2
Abstract [English]

Abstract
Wireless sensor networks are a new generation of networks that consist of a large number of sensor nodes distributed in the environment; each node pursues a specific purpose independently and in cooperation with other nodes. Because data transmission is one of the most energy-consuming operations in wireless sensor networks, reducing the amount of transmitted data reduces energy consumption and thereby increases network lifetime. The methods frequently used to reduce data transmission in WSNs for power saving include data aggregation, data filtering, and data prediction. In addition, suitable routing protocols play an important role in conserving the energy consumed by nodes.
This research attempts to reduce energy consumption across the whole network by detecting outliers with several classification techniques and by using hierarchical routing protocols. The experiments were performed on sensor data from the automatic weather station of Kashan. The results show that, by applying a support vector machine with a polynomial kernel of degree 3 and a regularization (adjustment) coefficient of 10 in the proposed sensor network, about 92% of the data is detected accurately in 2 seconds; filtering outliers and redundant data and sending only new and correct data then helps save energy and increase network lifetime.
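For reference, the classifier configuration reported in the abstract can be written in standard SVM notation. This is only a sketch: it assumes that the coefficient of 10 is the soft-margin penalty parameter C, and the kernel scale \gamma and offset r are not stated in the abstract, so they are left symbolic.

```latex
% Polynomial kernel of degree 3 (\gamma and r are not given in the source)
K(\mathbf{x}, \mathbf{z}) = \left( \gamma\, \mathbf{x}^{\top} \mathbf{z} + r \right)^{3}

% Soft-margin SVM primal problem, with C = 10 as the assumed meaning of the coefficient
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \;
  \tfrac{1}{2} \lVert \mathbf{w} \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\quad \text{subject to} \quad
  y_i \left( \mathbf{w}^{\top} \phi(\mathbf{x}_i) + b \right) \ge 1 - \xi_i, \quad
  \xi_i \ge 0, \quad C = 10
```

Here \phi is the feature map induced by K, and y_i \in \{-1, +1\} labels a reading as normal or outlier. A larger C penalizes misclassified training readings more heavily.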

Introduction
Wireless sensor networks consist of a large number of sensor nodes distributed randomly in the environment. The locations of these nodes are therefore not predetermined, which allows them to be deployed in dangerous or inaccessible places. Every sensor node operates independently and can also cooperate with other sensor nodes to achieve a common purpose. Energy and its optimal use are among the most important challenges these networks face, because wireless sensor nodes are usually battery powered and the network's lifetime depends on battery lifetime. Different methods have been proposed for saving energy in wireless sensor networks, the most important of which are reducing data transmission, using routing protocols, and clustering. The suitable method for dealing with the energy problem depends on the type of network and its application.
Data transmission has been identified as one of the most energy-consuming operations in wireless sensor networks [1], and battery lifetime decreases considerably as a sensor node transmits data. One way to reduce data transmission is to detect outliers in order to prevent their transmission through the network.
In wireless sensor networks, data that deviate from normal patterns are detected as outliers; they are important because they indicate a significant abnormality. After outliers are detected, depending on the system's decision, they can be removed or saved in sensor memory for further evaluation, and their transmission through the network is prevented altogether, which decreases energy consumption in the whole network. On the other hand, if sensor nodes are close together, an event that occurs in a region will be sensed by several nodes, and they will send the same message. Also, in many monitoring applications of wireless sensor networks, such as meteorology, sensors sample the environment sequentially. Because environmental data such as temperature change slowly, consecutive readings of one node may be similar and lead to redundant data being sent, which, in addition to accumulating in the network, wastes a great deal of energy. Preventing the creation of redundant data by filtering can therefore decrease energy consumption in the whole network.
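The node-side filtering described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `filter_readings`, the `tolerance` threshold, the sample temperatures, and the simple range check standing in for the trained classifier are all assumptions made for illustration.

```python
from typing import Callable, Iterable, List, Optional


def filter_readings(
    readings: Iterable[float],
    is_outlier: Callable[[float], bool],
    tolerance: float,
) -> List[float]:
    """Return only the readings a node would actually transmit."""
    sent: List[float] = []
    last_sent: Optional[float] = None
    for value in readings:
        if is_outlier(value):
            continue  # outlier: drop or keep locally, but do not transmit
        if last_sent is not None and abs(value - last_sent) <= tolerance:
            continue  # redundant: too close to the last transmitted value
        sent.append(value)
        last_sent = value
    return sent


# Hypothetical temperature trace in degrees Celsius; 85.0 stands in for a faulty reading.
temperatures = [21.0, 21.1, 21.05, 85.0, 21.6, 21.7, 22.4]

# Placeholder range check used here instead of a trained classifier (assumption).
transmitted = filter_readings(
    temperatures,
    is_outlier=lambda t: not (-20.0 <= t <= 60.0),
    tolerance=0.25,
)
print(transmitted)  # [21.0, 21.6, 22.4]
```

In this trace only three of the seven readings are transmitted. Comparing against the last transmitted value rather than the previous reading keeps a slow drift from being suppressed indefinitely.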
The purpose of this research is to use common classification techniques and to obtain optimal values for their parameters in order to build a model that detects outliers with high speed and accuracy. The research attempts to reduce energy consumption in the whole network by detecting the outliers and redundant data sensed by the sensors and preventing their transmission in a wireless sensor network that uses backbone formation for multi-hop transmission.
The second part reviews the history of the research. The third part examines the program and the necessary algorithms. The fourth part introduces the proposed structure of the wireless sensor network. The fifth and sixth parts present the evaluation and results and the future work, respectively.

Review of research history
Recently, most real-world wireless sensor networks have been used to collect valuable raw data. In analyzing these data, the main step is recognizing and removing abnormal data (outliers, noise, ...). An outline of outlier detection methods is given in [2].
The support vector machine (SVM) is a machine learning technique. Separating normal data from abnormal data with the SVM technique was introduced in [3].
Outlier detection techniques based on the one-class support vector machine use spatial and temporal correlations between sensor data to identify outliers, but for large-scale training samples this technique incurs more space and time overhead for processing and optimization.
In [4], a Bayesian technique was defined for detecting local outliers in sensor data streams. According to [5], a combination of techniques can be used for outlier detection. For example, a combination of the k-nearest neighbor and SVM methods was proposed for outlier detection, in which the k-nearest neighbor method reduces the size of the training sample. This method can shorten training and optimization time; however, for large-scale data, k-nearest neighbor has considerable time and space costs.
In [6], a combination of SVM and the Fisher discriminant ratio (FDR) was proposed for redundant data detection. First, distributed clustering is applied to the sensor data; then FDR is used to draw boundaries that discriminate between the clustered data and the scattered data in the clusters. The clustered data are regarded as redundant and are removed. The number of data samples for training the SVM is thus greatly reduced; however, the classification accuracy of this method is slightly lower than that of traditional incremental SVM training methods.
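As background for the FDR step, the following Python sketch computes the commonly used two-group, single-feature form of the Fisher discriminant ratio, (mean_a - mean_b)^2 / (var_a + var_b). How [6] applies the ratio to clustered and scattered data is not detailed here, so the function name `fisher_discriminant_ratio` and the sample values are illustrative assumptions only.

```python
import math
from statistics import mean, pvariance
from typing import Sequence


def fisher_discriminant_ratio(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-group FDR for one feature: squared mean gap over summed variances."""
    spread = pvariance(a) + pvariance(b)
    if spread == 0.0:
        raise ValueError("both groups have zero variance; FDR is undefined")
    return (mean(a) - mean(b)) ** 2 / spread


# Hypothetical feature values for two groups of sensor readings.
group_a = [1.0, 2.0, 3.0]
group_b = [7.0, 8.0, 9.0]

fdr = fisher_discriminant_ratio(group_a, group_b)
# Means are 2 and 8, each population variance is 2/3, so FDR is about 27.
print(math.isclose(fdr, 27.0))
```

A large ratio means the two groups are well separated relative to their internal spread; a ratio near zero means they overlap heavily.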
In [7], a technique combining k-nearest neighbor and clustering was proposed to compute events of interest for outlier detection. In [8], three techniques for outlier detection in wireless sensor networks were studied: a machine learning technique, a principal component analysis-based methodology, and a univariate statistics-based approach.

Keywords [English]

  • outlier detection
  • Wireless Sensor Network
  • Classification Techniques
  • Data mining
  • reduction of energy consumption