
MACHINE LEARNING METHODS FOR THE

DETECTION OF RWIS SENSOR MALFUNCTIONS

A THESIS

SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL

OF THE UNIVERSITY OF MINNESOTA

ADITYA POLUMETLA

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS

FOR THE DEGREE OF MASTER OF SCIENCE

JULY 2006

UNIVERSITY OF MINNESOTA

This is to certify that I have examined this copy of Master's thesis by

Aditya Polumetla
and have found it is complete and satisfactory in all respects,

and that any and all revisions required by the final

examining committee have been made.
___________Dr. Richard Maclin__________

Name of Faculty Advisor(s)

_____________________________________

Signature of Faculty Advisor(s)

_____________________________________

Date

GRADUATE SCHOOL

Abstract
Mn/DOT uses meteorological information obtained from Road Weather Information System (RWIS) sensors for the maintenance of roads and to ensure safe driving conditions. It is important that these sensors report accurate data in order to make accurate forecasts, as these forecasts are used extensively by Mn/DOT. Real time detection of sensor malfunctions can reduce the expense incurred in performing routine maintenance checks and re-calibrations of the sensors, while guaranteeing accurate data.
In this work we predict RWIS sensor values using weather information from nearby RWIS sensors and other sensors from the AWOS network. Significant and/or systemic deviations from the predicted values are used to identify malfunctions. Based on historical data collected from the sensor and its nearby locations, we construct statistical models that can be used to predict current values. We use machine learning (ML) methods to build these models. We employ three types of ML models: classification algorithms, regression algorithms and Hidden Markov Models (HMMs). We use classification algorithms such as J48 decision trees, Naive Bayes and Bayesian Networks, regression algorithms such as Linear Regression, Least Median Square (LMS), M5P regression trees, MultiLayer Perceptron, RBF Networks and Conjunctive Rule, and HMMs to predict the variables temperature, precipitation type and visibility.
We selected a representative sample of the RWIS sites in Minnesota. We employed different representations of the data to try to improve the model efficiency. To use temperature in classification algorithms and HMMs, we developed a method to discretize temperature. The Viterbi algorithm used in HMMs was modified to obtain the symbol observed along the most probable path.
From the results, we observed that LMS and M5P are highly accurate in predicting temperature and visibility. Predicting precipitation works well with J48 decision trees and Bayesian Belief Networks. HMMs can predict temperature class values accurately but fail for precipitation type. Our experiments suggest these methods can be efficiently used to detect malfunctions of the sensors that report these variables.

Acknowledgements
I would like to take this opportunity to acknowledge all those who helped me during this thesis work. I would like to thank my advisor Dr. Richard Maclin for introducing me to the world of machine learning, his valuable suggestions and guidance during the course of this thesis work, and his patience in reviewing my thesis.
I would like to thank my committee members Dr. Donald Crouch and Dr. Robert McFarland for evaluating my thesis and for their suggestions.
I would like to thank the Northland Advanced Transportation Systems Research Laboratories for providing the funding for this research and its members Dr. Richard Maclin, Dr. Donald Crouch and Dr. Carolyn Crouch for their ideas and feedback. I would especially like to thank my team members at various times during the project including Saiyam Kohli, Ajit Datar, Jeff Sharkey and Jason Novek for their help in the research work.
I would also like to thank all the faculty and staff at the Computer Science Department at the University of Minnesota Duluth for their assistance during my master's course-work. I would also like to thank my fellow graduate students, who offered great help during my study and stay in Duluth.
Finally, I would like to thank my parents and grandparents for their endless support and for encouraging me to do my best.

Contents

Abstract
Acknowledgements
1 Introduction
1.1 Building Models for Sensor Data using Machine Learning Methods
1.2 Thesis Statement
1.3 Thesis Outline
2 Background
2.1 Description of Sensors
2.1.1 The Road Weather Information System
2.1.1.1 Data from RWIS Sensors
2.1.2 The Automated Weather Observing System
2.1.2.1 Data from AWOS Sensors
2.2 Machine Learning
2.2.1 Classification Algorithms
2.2.1.1 The J48 Decision Tree Algorithm
2.2.1.2 Naive Bayes
2.2.1.3 Bayes Nets
2.2.2 Regression Algorithms
2.2.2.1 Linear Regression
2.2.2.2 LeastMedSquare
2.2.2.3 M5P
2.2.2.4 MultiLayer Perceptron
2.2.2.5 RBF Network
2.2.2.6 The Conjunctive Rule Algorithm
2.3 Predicting Time Sequence Data – Hidden Markov Models
2.3.1 The Forward Algorithm and the Backward Algorithm
2.3.2 The Baum-Welch Algorithm
2.3.3 The Viterbi Algorithm
3 Machine Learning Methods for Weather Data Modeling
3.1 Choosing RWIS - AWOS Sites
3.2 Features Used
3.2.1 Transformation of the Features
3.2.2 Discretization of the Features
3.3 Feature Vectors
3.4 Feature Symbols for HMMs
3.5 Methods Used for Weather Data Modeling
3.5.1 Cross-Validation
3.5.2 General Classification Approach
3.5.3 General Regression Approach
3.5.4 General HMM Approach
4 Experiments and Results
4.1 Using ML Algorithms to Predict Weather Variables
4.1.1 Predicting Temperature
4.1.1.1 Experiment 1: Temperature using regression methods
4.1.1.2 Experiment 2: Temperature using regression methods, with precipitation type included as inputs
4.1.1.3 Experiment 3: Temperature class using classification methods
4.1.2 Predicting Precipitation Type
4.1.2.1 Experiment 4: Precipitation type using classification methods
4.1.3 Predicting Visibility
4.1.3.1 Experiment 5: Visibility using regression methods
4.2 Using HMMs to Predict Weather Variables
4.2.1 Predicting Temperature
4.2.1.1 Experiment 6: Comparison of two methods for training HMM
4.2.1.2 Experiment 7: Temperature class using HMMs
4.2.1.3 Experiment 8: Site independent prediction of temperature class using HMMs
4.2.2 Predicting Precipitation Type
4.2.2.1 Experiment 9: Precipitation type using HMMs
5 Related Work
5.1 Using RWIS sensors
5.2 Weather Data Modeling and Forecasting using Machine Learning Algorithms
5.3 Time Series Prediction using HMM
6 Future Work
7 Conclusions
Bibliography
Appendix A: RWIS and AWOS Site Locations
Appendix B: Using WEKA
Appendix C: Detailed Results

List of Figures

1.1	Predicting temperature value at RWIS site 67 for the time t using weather data from nearby sensors
2.1	RWIS sites in Minnesota
2.2	AWOS sites in Minnesota
2.3	Using data from nearby sites to predict temperature for the location C
2.4	A Decision Tree to predict the current temperature based on temperature readings taken from a set of nearby sites
2.5	A Bayesian network to predict temperature temp_Ct at a site
2.6	An M5 model tree for predicting temperature at a site
2.7	A multilayer perceptron with two hidden layers to predict temperature at a site
2.8	A sigmoid unit, where xi are the inputs, wi the weights associated with the inputs, and sigmoid the resulting output from the unit
2.9	An RBF Network with n hidden units to predict temperature at a site
2.10	A Hidden Markov Model. A new state is visited when a transition occurs at a certain duration of time and each state emits a symbol when reached
2.11	An HMM with states 1, 2, 3 and 4 that emit a symbol when reached
3.1	Predicting temperature value at a site using weather data from nearby sensors
3.2	Grouping of RWIS and AWOS sites into three sets. This map also shows the locations of the selected RWIS and AWOS sites across Minnesota
4.1	The mean of absolute error and standard deviation obtained from predicting temperature across all 13 RWIS sites using regression algorithms
4.2	The comparison of mean of absolute error and standard deviation obtained from predicting temperature using Experiment 1 (Section 4.1.1.1) and Experiment 2 (Section 4.1.1.2)
4.3	The distance between actual and predicted temperature class obtained from J48 and Naive Bayes algorithms
4.4	The classification error and standard deviation obtained from predicting precipitation type across all 13 RWIS sites using classification algorithms
4.5	The percentage of instances with precipitation present and with no precipitation present predicted correctly and incorrectly using classification algorithms
4.6	The mean of absolute errors obtained from predicting visibility across the RWIS sites that report visibility using various algorithms
4.7	The percentage of instances with each distance to actual value when the HMM is trained using the two different methods
4.8	Percentage of instances having a certain distance from the actual class value when predicting temperature class using HMMs
4.9	Percentage of instances having a certain distance from the actual class value when predicting temperature class by applying ten 10-fold cross-validation on HMMs and using the extended dataset focusing on predicting class value for an RWIS group
4.10	The percentage of instances with precipitation present and with no precipitation present predicted correctly using classification algorithms
C.1	Mean Absolute Errors for different RWIS sites obtained from predicting temperature using regression algorithms
C.2	Mean Absolute Errors for different RWIS sites obtained from predicting temperature using regression algorithms, with precipitation type information added to the feature vector
C.3	Classification errors for different RWIS sites obtained from predicting precipitation using classification algorithms
C.4	Mean Absolute Errors for different RWIS sites obtained from predicting visibility using regression algorithms
C.5	Percentage of instances with a certain distance between actual and predicted class values, obtained by using HMM to predict temperature class

List of Tables

2.1	Codes used for reporting precipitation type and precipitation intensity by RWIS Sensors
2.2	The Forward Algorithm
2.3	The Backward Algorithm
2.4	The Baum-Welch Algorithm
2.5	The Viterbi Algorithm
3.1	Grouping of the selected 13 RWIS sites into three sets, along with their respective AWOS sites
3.2	The Modified Viterbi Algorithm
A.1	Latitude and longitude coordinates for RWIS sites in Minnesota
A.2	Latitude and longitude coordinates for AWOS sites in Minnesota
B.1	Format of an arff file
C.1	Results obtained from using regression algorithms to predict temperature at an RWIS site (Experiment 1)
C.2	Results obtained from using regression algorithms to predict temperature at an RWIS site (Experiment 2)
C.3	Results obtained from using classification algorithms to predict precipitation type at an RWIS site (Experiment 4)
C.4	Percentage of instances predicted correctly using classification algorithms (Experiment 4)
C.5	Results obtained from using regression algorithms to predict visibility at an RWIS site
C.6	Percentage of instances with a certain distance between actual and predicted temperature class values, obtained by using HMM to predict temperature class (Experiment 7)
C.7	Percentage of instances with a certain distance between actual and predicted temperature class values, obtained using extended dataset focusing on predicting value for an RWIS set rather than for an RWIS site (Experiment 8)
C.8	Results obtained from predicting precipitation type using HMM (Experiment 9)

Chapter 1
Introduction

Many state transportation departments, such as the Minnesota Department of Transportation (Mn/DOT), have Road Weather Information System (RWIS) units located along major roadways. Each RWIS unit employs a set of sensors that collect weather and pavement condition information. The information from these sensors is used to determine the current conditions on and near the roads, to conduct roadway-related maintenance, and to ensure safe driving conditions. The current weather information and forecasts based on this data are used for the allocation of resources, the timing of operations such as snow removal, the selection of the right amount of materials such as salt for ice removal from roads, and the efficient mobilization of maintenance personnel and equipment. These decisions help in running the organization in an efficient and cost-effective manner.

During the winter months, road conditions may become hazardous if the proper amount of salt is not applied at the right time. Information about weather and road conditions is obtained from the data reported by the RWIS sensors, which makes it important to keep these sensors in working order and to ensure that their readings are accurate. The maintenance of RWIS units is expensive and is done manually. Routine re-calibrations and maintenance checks are needed to ensure the proper working of these sensors. It would be beneficial to have an automated system that could detect malfunctions in real time and alert the maintenance personnel. We propose to use Machine Learning (ML) methods to build the models on which such a system can be based.

Figure 1.1: Predicting temperature value at RWIS site 67 for the time t using weather data from nearby sensors.
To detect a malfunctioning sensor using ML methods, we propose to observe the sensor's output over a period of time to determine any significant and/or systemic variations from the actual conditions present that might indicate a sensor malfunction. Figure 1.1 shows an overview of our proposed approach. This method works by predicting a sensor value at a site, for example site 67 in Figure 1.1, which is then compared with the actual sensor reading to detect malfunctions. To predict the temperature value for site 67 at a given time t, we can use the current and a couple of previous hours of weather data, such as temperature and visibility, obtained from a set of sites that are located close to site 67. Figure 1.1 shows the nearby sites whose data correlate with the weather conditions at site 67. Using ML methods, we attempt to build a model for each site that predicts a value at that site. To build such a model, ML methods require historical weather data obtained from the site and its nearby sites to learn the weather patterns. The model for site 67 uses the temperature and visibility information from the nearby sites to predict the temperature value that will be seen at site 67 at future times.

1.1 Building Models of Sensor Data using Machine Learning
Machine Learning (ML) methods build models based on previous observations which can then be used to predict new data. The model built is the result of a learning process that uses the previous observations to extract useful information about how the system generates its data. ML methods take a set of data corresponding to the process (in this case the weather at a sensor) and construct a model of that process, in a variety of ways, that can be used to predict it. The resulting model can be applied to future data to attempt to predict sensor values. The resulting predictions can then be compared with the reported sensor values; where there are significant deviations, the sensors involved can be flagged as possibly malfunctioning.
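The comparison step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in this thesis: the function name flag_malfunctions, the deviation threshold of 2.0 and the minimum run length of 3 consecutive readings are all hypothetical choices, made here to approximate the idea of a deviation that is both significant and sustained.

```python
def flag_malfunctions(predicted, reported, threshold=2.0, min_run=3):
    """Return the indices of readings that belong to a run of at least
    min_run consecutive readings whose absolute deviation from the
    predicted value exceeds threshold (both parameters are assumptions)."""
    flagged = []
    run = []  # indices of the current run of large deviations
    for i, (p, r) in enumerate(zip(predicted, reported)):
        if abs(p - r) > threshold:
            run.append(i)
        else:
            if len(run) >= min_run:
                flagged.extend(run)
            run = []
    # A run may still be open when the data ends.
    if len(run) >= min_run:
        flagged.extend(run)
    return flagged


# Made-up temperature values in degrees Celsius.
predicted = [1.0, 1.2, 1.1, 0.9, 0.8, 0.7, 0.9, 1.0]
reported = [1.1, 1.0, 6.0, 6.2, 6.1, 5.9, 1.0, 1.1]
flags = flag_malfunctions(predicted, reported)
```

Requiring a run of deviations rather than a single one is one simple way to avoid flagging an isolated spike; other criteria are equally plausible.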
We propose to use a variety of ML methods such as classification methods (e.g., J48 Decision Trees, Naive Bayes and Bayesian Networks), regression methods (e.g., Linear Regression, Least Median Squares) and Hidden Markov Models to try to predict this data. In all cases we are attempting to identify cases where sensors appear to have failed or are malfunctioning.
Classification algorithms label a given observation into one of a set of possible distinct categories. For example, we could take information about the recent temperature and humidity at a site and nearby sites as well as precipitation at the nearby sites and attempt to predict whether it is raining at that site. Raining or not raining would be the labels for this problem. Models built might be in the form of decision trees, lists of rules, neural networks, etc.
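As a rough illustration of classification, the sketch below trains a small Naive Bayes classifier to label a site as raining or not raining from the precipitation reported at two nearby sites. The training data is made up, and the function names, feature layout and use of add-one smoothing are assumptions for this example; it is not the WEKA implementation used in this thesis.

```python
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count how often each feature value occurs with each label."""
    label_counts = Counter(labels)
    # feature_counts[label][feature_index][value] = number of occurrences
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    values = defaultdict(set)  # distinct values seen for each feature
    for row, label in zip(rows, labels):
        for i, v in enumerate(row):
            feature_counts[label][i][v] += 1
            values[i].add(v)
    return label_counts, feature_counts, values


def predict(model, row):
    """Return the label with the highest (unnormalized) posterior score."""
    label_counts, feature_counts, values = model
    total = sum(label_counts.values())
    best_label, best_score = None, -1.0
    for label, count in label_counts.items():
        score = count / total  # prior probability of the label
        for i, v in enumerate(row):
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score *= (feature_counts[label][i][v] + 1) / (count + len(values[i]))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical observations: precipitation at two nearby sites.
rows = [("rain", "rain"), ("rain", "none"), ("none", "none"),
        ("none", "none"), ("rain", "rain"), ("none", "rain")]
labels = ["raining", "raining", "not_raining",
          "not_raining", "raining", "not_raining"]
model = train_naive_bayes(rows, labels)
label_wet = predict(model, ("rain", "rain"))
label_dry = predict(model, ("none", "none"))
```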
Regression algorithms develop a model for a system by finding equations that predict a continuous-valued result from the measured input values of a given observation. For example, we might use information about the temperature and humidity at other sites to try to predict the temperature value at a site.
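A minimal regression sketch, assuming a single input for simplicity: the temperature at one nearby site is used to predict the temperature at the target site with an ordinary least-squares line. The data values and the helper name fit_line are hypothetical; the regression methods used in this thesis take many inputs from several sites.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up temperatures in degrees Celsius; the target site is taken to be
# exactly one degree warmer than the nearby site.
nearby = [-5.0, -2.0, 0.0, 3.0, 6.0]
target = [-4.0, -1.0, 1.0, 4.0, 7.0]
slope, intercept = fit_line(nearby, target)
predicted = slope * 2.0 + intercept  # prediction when the nearby site reads 2.0
```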
HMMs predict values that are produced by a system in the form of a sequence generated over time. For example, the values of raining or not raining at a site over the course of a 24-hour period can be viewed as a time sequence. HMMs use the information from the value observed one step previously to predict the present value. HMMs capture time sequence information in the form of a graph in which states represent information about the world (possibly including information that cannot be observed), transitions link the states, and the symbols we observe (in this case, raining or not raining) are emitted as states are reached. Determining the probabilities of emitting symbols and the probabilities of transitions between states is the basis of HMM learning.
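To make the roles of states, transitions and emitted symbols concrete, the sketch below runs the standard Viterbi algorithm on a toy two-state HMM. The state names and all probabilities are hypothetical values chosen for illustration; they are not learned from RWIS data, and this is the textbook algorithm rather than the modified version developed in this thesis.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable sequence of hidden states."""
    # best[t][s]: probability of the best path that ends in state s at time t
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, prob = max(
                ((p, best[t - 1][p] * trans_p[p][s]) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = prob * emit_p[s][observations[t]]
            back[t][s] = prev
    # Trace the best path backwards from the most probable final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical model: two hidden states emitting the observed symbols.
states = ["wet", "dry"]
start_p = {"wet": 0.5, "dry": 0.5}
trans_p = {"wet": {"wet": 0.8, "dry": 0.2},
           "dry": {"wet": 0.2, "dry": 0.8}}
emit_p = {"wet": {"rain": 0.9, "no_rain": 0.1},
          "dry": {"rain": 0.1, "no_rain": 0.9}}
path = viterbi(["rain", "rain", "no_rain", "no_rain"],
               states, start_p, trans_p, emit_p)
```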
1.2 Thesis Statement
In this thesis we propose to build models that can predict weather conditions at a given RWIS location using the current information from that location and surrounding locations (see Figure 1.1). We use ML algorithms including classification, regression and HMM methods to build the models to predict weather conditions at a selected sample of RWIS sites in the state of Minnesota. We use these predicted values and compare them with the values reported by the RWIS sensors to identify possible malfunctions. Of all the weather conditions reported by RWIS units we focus on predicting temperature, precipitation type and visibility. We hypothesize that the models we build can accurately detect deviations in the expected sensor readings that will allow us to identify sensor malfunctions.
1.3 Thesis Outline
This thesis is organized as follows. Chapter 2 presents the background for this thesis with a detailed description of the weather data collection systems present and the sensors they use. It describes the basic concepts of machine learning and presents various ML algorithms and HMMs used. Chapter 3 discusses the use of ML methods in detecting RWIS sensor malfunctions. It presents the process of selecting the sites used, feature representations used by ML algorithms and HMMs, and the general methodology used by the ML methods in predicting sensor values. Chapter 4 presents the experiments done to predict temperature, precipitation type and visibility using ML algorithms, and the results obtained from these experiments. Chapter 5 discusses research related to this work. Chapter 6 discusses extensions that can be made to the proposed system. Chapter 7 presents conclusions and summarizes the work done in this thesis.

Chapter 2
Background

This chapter presents the background for this thesis. In the first section we introduce the two weather data collection systems, the Road Weather Information System (RWIS) and the Automated Weather Observing System (AWOS). We describe these sensors and the data format they use for recording values. In the second section we discuss the basic concepts of machine learning and its use in data mining problems. We then present various machine learning algorithms used in this thesis.

2.1 Description of Sensors
Two major automated systems are present in the United States that collect, process and distribute meteorological data, namely the Road Weather Information System (RWIS) [Aurora; Boselly et al., 1993] and the Automated Weather Observing System (AWOS) [FAA, 2006]. RWIS units are used by state and local roadway maintenance organizations for diverse roadway related operations. RWIS units have sensors that collect weather and pavement condition information. These units are generally installed on highways. The weather information gathered is used for understanding the current conditions on roadways at a specific location.
AWOS units are operated and controlled by the Federal Aviation Administration (FAA). AWOS systems are installed at airports across the country. The meteorological sensors located on AWOS units measure and distribute weather information at the airports that is used by pilots and airport administrators. This information is used to keep the runways in proper condition, to determine flight plans, and for the landing and takeoff of airplanes. At present, a newer version of AWOS called the Automated Surface Observing System (ASOS) is also being used. A detailed description of these sensor systems, RWIS and AWOS, is given in the following sections.
2.1.1 The Road Weather Information System
The Road Weather Information System (RWIS) [Aurora; Boselly et al., 1993] is used for collection, transmission, model generation, and distribution of weather and road condition information. An RWIS is a collection of various systems that work together. The systems that form an RWIS are meteorological sensors, data communication equipment, weather profiles, site specific models, forecast and prediction algorithms, data processing systems and display systems to interface with humans.
The component of RWIS that collects weather data is called the Environmental Sensor Station (ESS) [Manfredi et al., 2005]. ESS units are placed at strategic locations in the road network, usually on state highways, in a grid-wise manner. A typical ESS consists of a tower, two road sensors embedded in and below the road for measurement of pavement conditions, an array of weather sensors located on the tower for meteorological observations, a Remote Processing Unit (RPU) for data collection and storage, and remote communication hardware that connects the ESS to a central server located at a maintenance center. The road sensors measure road surface temperature, surface condition (dry, wet or snow), water-film level and freezing temperature based on residual salt content on the road. The weather sensors measure air temperature, dew point, precipitation (type, intensity and rate), amount of precipitation accumulation, wind speed and direction, air pressure, visibility, relative humidity and solar radiation.
The RPUs are able to collect the raw data sent by the various sensors on the ESS and store it. They are not designed to process the data that is collected and thus the data is transmitted to a central server present at the maintenance center using the remote communication hardware. The communication between the RPU and the central server is done using radio signals.
The central server located at the highway maintenance centers has processing capabilities and performs data-related tasks such as communication with the RPUs, and collection, archiving and distribution of data. A set of site-specific weather models and data processing algorithms are loaded onto the central server. The central server uses these models and its data processing capabilities to come up with forecasts. The central server has a number of displays for human interaction with the RWIS system.
State and local organizations dealing with highway maintenance, such as the Minnesota Department of Transportation (Mn/DOT), use RWIS for maintenance of roads and to ensure safe driving conditions in all seasons. The data collected and forecasts made by RWIS are used for the allocation of resources, timing of operations such as repairs, selection of the right amount of materials such as salt for ice removal, and mobilizing personnel and equipment. All of these decisions help in running the organization in a cost-effective manner. The current weather and road conditions are relayed to motorists to help them in planning their trips and estimating travel times. In addition to the uses mentioned above, the data is also shared with various government, commercial and university related sources.
Mn/DOT maintains 93 RWIS sites spread all across the state of Minnesota. Figure 2.1 shows the locations of 79 of the 93 RWIS sites present in Minnesota. The remaining 14 sites, added recently, are not included in our original map. Each RWIS site has a specific number, assigned by Mn/DOT, associated with it. The latitude and longitude of each RWIS site shown in Figure 2.1 are given in Table A.1 in the appendix.

Figure 2.1: RWIS sites in Minnesota
2.1.1.1 Data from RWIS sensors
RWIS sensors report observed values every 10 minutes, resulting in 6 records per hour. Greenwich Mean Time (GMT) is used for recording the values. The meteorological conditions reported by RWIS sensors are air temperature, surface temperature, dew point, visibility, precipitation, the amount of precipitation accumulation, wind speed, wind direction and air pressure. Following is a short description of these variables along with the format they follow.
(i) Air Temperature: Air temperature is recorded in Celsius in increments of one hundredth of a degree, with values ranging from -9999 to 9999 and a value of 32767 indicating an error. For example, a temperature of 10.5 degrees Celsius is reported as 1050.
(ii) Surface Temperature: Surface temperature is the temperature measured near the road surface. It is recorded in the same format as air temperature.
(iii) Dew Point: Dew point is defined as the temperature at which dew forms. It is recorded in the same format as air temperature.
(iv) Visibility: Visibility is the maximum distance to which it is possible to see without any aid. The visibility reported is the horizontal visibility, recorded in tenths of a meter with values ranging from 00000 to 99999. A value of -1 indicates an error. For example, a visibility of 800.2 meters is reported as 8002.
(v) Precipitation: Precipitation is the amount of water in any form that falls to earth. Precipitation is reported using three different variables, precipitation type, precipitation intensity and precipitation rate. A coded approach is used for reporting precipitation type and intensity. The codes used and the information they convey are given in Table 2.1.
The precipitation type gives the form of water that reaches the earth's surface. Precipitation type with a code of 0 indicates no precipitation, a code of 1 indicates the presence of some amount of precipitation but the sensor fails to detect the form of the water, a code of 2 represents rain, and the codes 3, 41 and 42 respectively represent snowfall with an increase in intensity.
The precipitation intensity is used to indicate how strong the precipitation is when present. When no precipitation is present, the intensity is given by code 0. Codes 1 through 4 indicate increasing intensity of precipitation. Codes 5 and 6 are used, respectively, to indicate an intensity that cannot be classified into any of the previous codes and cases in which the sensor is unable to measure the intensity. A value of 255 for precipitation intensity indicates either an error or the absence of this type of sensor.

Table 2.1: Codes used for reporting precipitation type and precipitation intensity by RWIS Sensors.

Code	Precipitation Type	Code	Precipitation Intensity
0	No precipitation	0	None
1	Precipitation detected, not identified	1	Light
2	Rain	2	Slight
3	Snow	3	Moderate
41	Moderate Snow	4	Heavy
42	Heavy Snow	5	Other
		6	Unknown

Precipitation rate is measured in millimeters per hour with values ranging from 000 to 999 except for a value of -1 that indicates either an error or absence of this type of sensor.
(vi) Precipitation Accumulation: Precipitation accumulation is used to report the amount of water falling in millimeters. Values reported range from 00000 to 99999, with a value of -1 indicating an error. Precipitation accumulation is reported for the last 1, 3, 6, 12 and 24 hours.
(vii) Wind Speed: Wind speed is recorded in tenths of meters per second with values ranging from 0000 to 9999 and a value of -1 indicating an error. For example, a wind speed of 20.5 meters/second is reported as 205.
(viii) Wind Direction: Wind direction is reported as an angle with values ranging from 0 to 360 degrees. A value of -1 indicates an error.
(ix) Air Pressure: Air pressure is defined as the force exerted by the atmosphere at a given point. The pressure reported is the pressure when reduced to sea level. It is measured in tenths of a millibar and the values reported range from 00000 to 99999. A value of -1 indicates an error. For example, air pressure of 1234.0 millibars is reported as 12340.
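The scaled-integer formats described above can be illustrated with a minimal decoding sketch. The function and constant names are hypothetical and chosen only for illustration; the scale factors and error codes are those given in the descriptions of air temperature, visibility and air pressure.

```python
from typing import Optional

# Error codes as described in the text (names are illustrative only).
TEMPERATURE_ERROR = 32767  # air, surface and dew point temperature
GENERIC_ERROR = -1         # visibility and air pressure


def decode_temperature(raw: int) -> Optional[float]:
    """Convert hundredths of a degree Celsius to degrees Celsius.

    Returns None when the sensor reported the error code.
    """
    if raw == TEMPERATURE_ERROR:
        return None
    return raw / 100.0


def decode_tenths(raw: int) -> Optional[float]:
    """Convert a value stored in tenths of a unit (visibility in meters,
    air pressure in millibars) to the unit itself.

    Returns None when the sensor reported the error code.
    """
    if raw == GENERIC_ERROR:
        return None
    return raw / 10.0


if __name__ == "__main__":
    print(decode_temperature(1050))   # temperature example from the text
    print(decode_tenths(8002))        # visibility example from the text
    print(decode_tenths(12340))       # air pressure example from the text
    print(decode_temperature(32767))  # error code
```

Mapping error codes to None keeps erroneous readings from being mistaken for valid measurements when the values are later used as inputs to a model.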
2.1.2 The Automated Weather Observing System
The Automated Weather Observing System (AWOS) is a collection of systems, including meteorological sensors, a data collection system, a centralized server and displays for human interaction, that are used to observe and report weather conditions at airports. These reports help pilots maneuver their aircraft and help airport authorities keep airports and runways operating properly.
AWOS sensors [AllWeatherInc] are placed at strategic locations such as runways at the airport. These sensors record hourly weather conditions and are capable of reporting air temperature, visibility, dew point, wind speed and direction, precipitation type and amount, humidity, air pressure and cloud cover. The sensors are connected to a powerful computer that analyzes the data gathered and broadcasts weather reports. The information collected from the AWOS sensors is used by pilots, air traffic control and maintenance personnel for safe operation of runways and for determining flight routes. The AWOS system provides updates to pilots approaching an airport using a non-directional beacon. The meteorological information gathered is also used by the National Weather Service for forecasting and other weather services. According to the FAA, there are over 600 AWOS sites located across the US. The sites are named in accordance with the code assigned to the airport at which they are located. For example, the airport in the city of Duluth, Minnesota, has the code DLH, and its AWOS sensor is named KDLH.

Figure 2.2: AWOS Sites in Minnesota
Figure 2.2 shows the location of AWOS sites across the state of Minnesota and Table A.2 gives the latitude and longitude of these sites.
2.1.2.1 Data from AWOS sensors
AWOS sensors report values on an hourly basis. The local time is used in reporting. For example, the sites in Minnesota use Central Time (CT). AWOS sensors report air temperature, dew point, visibility, weather conditions in a coded manner, air pressure and wind information. Following is a description of these variables.
(i) Air Temperature: The air temperature value is reported as integer values in Fahrenheit ranging from -140 F to 140 F.
(ii) Dew Point: The dew point temperature is reported in the same format as air temperature.
(iii) Visibility: The visibility is reported as a set of values ranging from 3/16 mile to 10 miles. These values can be converted into floating point numbers for simplicity.
(iv) Weather Code: AWOS uses a coded approach to represent the current weather condition. 80 different possible codes are listed to indicate various conditions. Detailed information about these codes is given in the document titled “Data Documentation of Data Set 3283: ASOS Surface Airways Hourly Observations” published by the National Climatic Data Center [NCDC, 2005].
(v) Air Pressure: The air pressure is reported in tenth of a millibar increments. For example, an air pressure of 123.4 millibars is reported as 1234.

(vi) Wind Speed and Wind Direction: AWOS encodes wind speed and direction into a single variable. Wind speed is measured in knots ranging from 0 knots to 999 knots. Wind direction is measured in degrees ranging from 0 to 360 in increments of 10. The single variable for wind has five digits, with the first two representing direction in tens of degrees and the last three representing speed. For example, the value for a wind speed of 90 knots and direction of 80 degrees will be 08090.
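Because AWOS and RWIS report in different units, AWOS values generally need to be converted before they can be compared with RWIS values. The following is a minimal sketch of two such conversions. The function names are hypothetical, and the handling of mixed numbers such as "1 1/2" is an assumption, since the text only states that visibility ranges from 3/16 mile to 10 miles.

```python
from fractions import Fraction

METERS_PER_MILE = 1609.344  # international statute mile


def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert an AWOS temperature in Fahrenheit to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def visibility_miles(text: str) -> float:
    """Convert a visibility string such as '3/16', '10' or '1 1/2'
    (mixed numbers are an assumed format) to a floating point
    number of miles."""
    return float(sum(Fraction(part) for part in text.split()))


def visibility_meters(text: str) -> float:
    """Convert an AWOS visibility string to meters, the unit used by RWIS."""
    return visibility_miles(text) * METERS_PER_MILE


if __name__ == "__main__":
    print(fahrenheit_to_celsius(50))
    print(visibility_miles("3/16"))
    print(visibility_meters("10"))
```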
In this work we propose to predict malfunctions of an RWIS sensor using other nearby RWIS and AWOS sensors. A sensor is said to be malfunctioning when the values reported by it deviate from the actual conditions present. We will be applying various machine learning algorithms and Hidden Markov Models in order to find a model that can detect significant variations in the readings obtained from a sensor. The following section presents a brief description of the fields of machine learning and data mining along with a description of the machine learning algorithms used in this thesis.
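The idea of detecting a malfunction from the deviation between reported and predicted values can be sketched as follows. This is only an illustration under stated assumptions: the function name, the use of a single fixed threshold and the threshold value in the usage example are hypothetical and are not taken from the methods evaluated in this thesis.

```python
from typing import List, Optional, Sequence


def flag_deviations(
    reported: Sequence[Optional[float]],
    predicted: Sequence[float],
    threshold: float,
) -> List[bool]:
    """Flag each reading whose absolute difference from the predicted
    value exceeds the threshold. A missing reading (None) is also flagged.
    """
    flags = []
    for actual, expected in zip(reported, predicted):
        flags.append(actual is None or abs(actual - expected) > threshold)
    return flags


if __name__ == "__main__":
    reported = [10.0, 10.5, 25.0]   # values reported by the sensor
    predicted = [10.2, 10.4, 11.0]  # values predicted from nearby sensors
    # The threshold of 2.0 degrees is an arbitrary illustrative choice.
    print(flag_deviations(reported, predicted, threshold=2.0))
```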
2.2 Machine Learning
Learning can be defined in general as a process of gaining knowledge through experience. We humans start the process of learning new things from the day we are born. This learning process continues throughout our life where we try to gather more knowledge and try to improve what we have already learned through experience and from information gathered from our surroundings.
Artificial Intelligence (AI) is a field of computer science whose objective is to build a system that exhibits intelligent behavior in the tasks it performs. A system can be said to be intelligent when it has learned to perform a task related to the process it has been assigned to without any human interference and with high accuracy. Machine Learning (ML) is a sub-field of AI whose concern is the development, understanding and evaluation of algorithms and techniques to allow a computer to learn. ML intertwines with other disciplines such as statistics, human psychology and brain modeling. Human psychology and neural models obtained from brain modeling help in understanding the workings of the human brain, and especially its learning process, which can be used in the formulation of ML algorithms. Since many ML algorithms use analysis of data for building models, statistics plays a major role in this field.
A process or task that a computer is assigned to deal with can be termed the knowledge or task domain (or just the domain). The information that is generated by or obtained from the domain constitutes its knowledge base. The knowledge base can be represented in various ways using Boolean, numerical, and discrete values, relational literals and their combinations. The knowledge base is generally represented in the form of input-output pairs, where the information represented by the input is given by the domain and the result generated by the domain is the output. The information from the knowledge base can be used to depict the data generation process (i.e., output classification for a given input) of the domain. Knowledge of the data generation process does not define the internals of the working of the domain, but can be used to classify new inputs accordingly.
As the knowledge base grows in size or gets complex, inferring new relations about the data generation process (the domain) becomes difficult for humans. ML algorithms try to learn from the domain and the knowledge base to build computational models that represent the domain in an accurate and efficient way. The model built captures the data generation process of the domain, and by use of this model the algorithm is able to match previously unobserved examples from the domain.
The models built can take on different forms based on the ML algorithm used. Some of the model forms are decision lists, inference networks, concept hierarchies, state transition networks and search-control rules. The concepts and working of various ML algorithms are different but their common goal is to learn from the domain they represent.
ML algorithms need a dataset, which constitutes the knowledge base, to build a model of the domain. The dataset is a collection of instances from the domain. Each instance consists of a set of attributes which describe the properties of that example from the domain. An attribute takes on a range of values based on its attribute type, which can be discrete or continuous. Discrete (or nominal) attributes take on distinct values (e.g., car = Honda, weather = sunny) whereas continuous (or numeric) attributes take on numeric values (e.g., distance = 10.4 meters, temperature = 20ºF).
Each instance consists of a set of input attributes and an output attribute. The input attributes are the information given to the learning algorithm and the output attribute contains the feedback of the activity on that information. The value of the output attribute is assumed to depend on the values of the input attributes. The attribute along with the value assigned to it define a feature, which makes an instance a feature vector. The model built by an algorithm can be seen as a function that maps the input attributes in the instance to a value of the output attribute.
Huge amounts of data may look random when observed with the naked eye, but on a closer examination, we may find patterns and relations in it. We also get an insight into the mechanism that generates the data. Witten & Frank [2005] define data mining as a process of discovering patterns in data. It is also referred to as the process of extracting relationships from the given data. In general data mining differs from machine learning in that the issue of the efficiency of learning a model is considered along with the effectiveness of the learning. In data mining problems, we can look at the data generation process as the domain and the data generated by the domain as the knowledge base. Thus, ML algorithms can be used to learn a model that describes the data generation process based on the dataset given to it. The data given to the algorithm for building the model is called the training data, as the computer is being trained to learn from this data, and the model built is the result of the learning process. This model can now be used to predict or classify previously unseen examples. New examples used to evaluate the model are called a test set. The accuracy of a model can be estimated from the difference between the predicted and actual value of the target attribute in the test set.
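The use of training data to build a model and of a test set to estimate its accuracy can be illustrated with a small sketch. A one-variable least squares line is used here purely as a stand-in for the learning algorithm, and the mean absolute error is one possible measure of the difference between predicted and actual values. The function names and data values are hypothetical.

```python
from typing import Callable, List, Tuple

Example = Tuple[float, float]  # (input attribute, target attribute)


def fit_line(training_data: List[Example]) -> Callable[[float], float]:
    """Learn a model y = slope * x + intercept by ordinary least squares."""
    n = len(training_data)
    mean_x = sum(x for x, _ in training_data) / n
    mean_y = sum(y for _, y in training_data) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in training_data)
    denominator = sum((x - mean_x) ** 2 for x, _ in training_data)
    slope = numerator / denominator
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def mean_absolute_error(model: Callable[[float], float],
                        test_set: List[Example]) -> float:
    """Average absolute difference between predicted and actual targets."""
    errors = [abs(model(x) - y) for x, y in test_set]
    return sum(errors) / len(errors)


if __name__ == "__main__":
    training_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
    test_set = [(4.0, 8.5), (5.0, 9.5)]  # previously unseen examples
    model = fit_line(training_data)
    print(mean_absolute_error(model, test_set))
```

Keeping the test set separate from the training data means the error is measured on examples the model did not see while it was being built.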
Predicting weather conditions can also be considered as an example of data mining. Using the weather data collected from a location for a certain period of time, we obtain a model to predict variables such as temperature at a given time based on the input to the model. As weather conditions tend to follow patterns and are not totally random, we can use current meteorological readings along with those taken a few hours earlier at a location and also readings taken from nearby locations to predict a condition such as the temperature at that location. Thus, the data instances that will be used to build the model may contain present and previous hour's readings from a set of nearby locations as input attributes. The variable that is to be predicted at one of these locations for the present hour is the target attribute. The type and number of conditions that are included in an instance depend on the variable we are trying to predict and on the properties of the ML algorithm used.
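The construction of such data instances can be sketched as follows, assuming that hourly readings from each location are already aligned in time and stored in lists of equal length. The function name, the dictionary layout and the default of two previous hours are hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

Instance = Tuple[List[float], float]  # (input attributes, target attribute)


def build_instances(
    readings: Dict[str, List[float]],
    target_site: str,
    previous_hours: int = 2,
) -> List[Instance]:
    """Build instances from time-aligned hourly readings.

    The inputs are the present and previous readings at every site,
    except the present reading at the target site, which is the output.
    """
    sites = sorted(readings)
    n_hours = len(readings[target_site])
    instances = []
    for t in range(previous_hours, n_hours):
        inputs = []
        for site in sites:
            # Oldest reading first, present hour (lag 0) last.
            for lag in range(previous_hours, -1, -1):
                if site == target_site and lag == 0:
                    continue  # this value is the target, not an input
                inputs.append(readings[site][t - lag])
        instances.append((inputs, readings[target_site][t]))
    return instances


if __name__ == "__main__":
    readings = {
        "A": [1.0, 2.0, 3.0, 4.0],
        "B": [5.0, 6.0, 7.0, 8.0],
        "C": [9.0, 10.0, 11.0, 12.0],
    }
    for inputs, target in build_instances(readings, target_site="C"):
        print(inputs, "->", target)
```

With three locations and two previous hours, each instance has eight input attributes and one target attribute.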
WEKA [Witten & Frank, 2005], for Waikato Environment for Knowledge Analysis, is a collection of various ML algorithms, implemented in Java, that can be used for data mining problems. Apart from applying ML algorithms on datasets and analyzing the results generated, WEKA also provides options for pre-processing and visualization of the dataset. It can be extended by the user to implement new algorithms.
Suppose that we want to predict the present temperature at a site C (see Figure 2.3). To do this we use eight input attributes: the temperatures from the previous two hours at C, together with the temperatures from the previous two hours and the present hour at two nearby locations A and B. The output attribute is the present hour temperature at C. Let temp_
