Some Interval Time Series Models for Temperature Data in India

Received: 06/May/2018, Revised: 19/May/2018, Accepted: 13/Jun/2018, Online: 30/Jun/2018
Abstract: Temperature in India fluctuates with seasonal climate changes. In this paper, we use temperature data from 2000 to 2016. Maximum and minimum temperature values are taken season-wise, i.e., Jan-Mar (Spring), Apr-June (Summer), July-Sep (Autumn) and Oct-Dec (Winter), throughout India. The work is organized in two parts: the first part contains data from 2000 to 2010 as the test group, and the second part contains data from 2011 to 2016 as the main group. We apply nine models to these seasonal maximum and minimum temperature data. Seven of the nine are ARIMA models, the eighth is an adaptive smoothing model and the last is a non-linear model. Two measures of accuracy are used: Mean Absolute Error (MAE) and Root Mean Square Error (RMSE). The nine models are empirically tested on the season-wise maximum and minimum temperature data of India.


I. INTRODUCTION
Temperature in India fluctuates with the seasons because the Himalayas lie on one side and the Bay of Bengal, the Arabian Sea and the Indian Ocean lie on the other sides. Because of this, the climate of India differs from that of other countries with regard to temperature. Over the last two decades, temperature variation has increased due to global warming, i.e., the emission of greenhouse gases such as carbon dioxide, methane and nitrous oxide. Contributing activities include burning fossil fuels such as coal, oil and natural gas for energy, cutting down and burning forests, using dangerous pesticides in agriculture, deforestation and farming. Data analysis has been carried out season-wise for India using interval time series temperature data from 2000 to 2016. Maximum and minimum temperature values are taken season-wise, i.e., Jan-Mar, Apr-June, July-Sep and Oct-Dec, throughout India. Three interval time series models are applied to these data and compared using the R² criterion to find the best model for predicting the temperature values.
The paragraph above introduces the paper. The rest of the paper is organized as follows. Section II reviews the literature contributed by different authors; the papers used are listed in the bibliography. Section III presents the nine models, starting with the ARIMA models. Section IV reports the results: maximum and minimum temperature values are taken season-wise, i.e., Jan-Mar, Apr-June, July-Sep and Oct-Dec, throughout India; the work is divided into two parts, with data from 2000 to 2010 as the test group and data from 2011 to 2016 as the main group; and the models are compared using MAE and RMSE for interval data. Section V contains a brief summary and conclusions.

II. RELATED WORK
The prime indicator of global warming is global mean temperature. Time series of global temperature show a well-known rise since the early 20th century, most notably since the late 1970s. The impacts are: shrinking mountain glaciers, accelerating ice loss from the ice sheets in Greenland and Antarctica, shrinking Arctic sea ice extent, sea level rise, and a number of well-documented biospheric changes such as earlier bud burst and blossoming time in spring. Much of the variability during that time span can be related to three known causes of short-term temperature variations: El Niño/Southern Oscillation, volcanic eruptions, and solar variations including the solar cycle. This complicates both comparison and trend analysis of the temperature records.
© 2018, IJSRMSS All Rights Reserved
Since independent measures of these variations are available, their influence can to a large extent be removed, leading to adjusted, less noisy global temperature data sets. The authors of [1] therefore remove the influence of these factors from the temperature data sets, not only to isolate the longer-term changes, but also to identify whether different data sets show meaningful differences in their response to these factors. The influence of exogenous factors is approximated by multiple regression of temperature against ENSO, volcanic influence, total solar irradiance (TSI) and a linear time trend that approximates the global warming that occurred during the 32 years subject to analysis [1].
Global surface temperatures continue to rise. In most surface temperature data sets, the years 2014, 2015 and 2016 set new global records since the start of regular measurements. Never before have three record years occurred in a row. Global-mean surface temperature (GMST) is the most important indicator of global climate change, because (i) it is directly related to the planetary energy balance and increases quasi-linearly with cumulative greenhouse gas emissions, and (ii) it is directly related to most climate impacts and risks. In that work, the authors deal with the former only, i.e., with the analysis of possible trend changes in the observational data [2].
Nowadays, forecasting agricultural commodity futures prices is essential, because population growth is high in countries such as India and China. In recent years, agricultural commodity futures markets in populous countries have witnessed massive growth, with increasing product variety and deepening liquidity pools. The agricultural commodity futures markets in India are playing an increasingly important role in serving the global financial market and the national economy [3].
Forecasting agricultural commodity futures prices is considered a challenging task because the prices are highly volatile, complex and dynamic, and it is thus of great interest to finance researchers, market practitioners and policy makers. An extensive investigation reveals that it is not difficult to find studies on futures prices, including stock index futures, gold futures and metal futures. An important point to note from past studies, however, is their preoccupation with point forecasting rather than interval forecasting. Interval forecasting of futures prices has the advantage of taking variability and/or uncertainty into account, so as to reduce the amount of random variation relative to that found in classic single-valued futures price time series. Interval analysis and forecasting has attracted particular attention in various fields, particularly in financial markets and energy markets. The interval-valued time series (ITS) forecasting method is a potential tool and can lead to a reduction in risk when making power system planning and operational decisions.
A variety of interval-valued time series forecasting methods have been developed. They include traditional statistical techniques such as interval exponential smoothing methods, the Vector Autoregressive (VAR) model and the Vector Error Correction Model (VECM). Traditional statistical techniques can provide good predictions only when the ITS under study is linear and stationary, which is not always the case: an ITS may be nonlinear and nonstationary because of its intrinsic complexity and volatility. To overcome the limitations of traditional statistical techniques, machine learning techniques have recently attracted much attention. Interval multilayer perceptrons and Multi-output Support Vector Regression (MSVR) are useful for their nonlinear modeling capability for real-world ITS.
In the real world, an ITS may show a linear or a non-linear pattern, and usually contains both. This forecasting difficulty can be partially solved by combining linear and non-linear models, since it is very difficult in practice to construct a single model that is best in all situations. One example is a hybrid Auto Regressive Integrated Moving Average (ARIMA) and Artificial Neural Network (ANN) model for time series forecasting, which takes advantage of linear and non-linear modeling respectively.
Overconfidence is one of the most prevalent judgment biases. Several studies show that overconfidence can lead to suboptimal decisions by investors, managers or politicians. It has been the object of an active field of research over the last two decades. One question is whether overconfidence is indeed a bias or rather an ecological and statistical illusion, and therefore only an apparent anomaly that seems to exist but is not real. There are different types of overconfidence: (i) overestimation of one's actual performance; (ii) overplacement of one's performance relative to others, called the better-than-average effect; and (iii) excessive precision in one's beliefs, called miscalibration.
It is important to distinguish between these main manifestations of overconfidence, in particular between (relative) performance-based and miscalibration-based measures, because empirical studies typically find them to be hardly related. The degree of miscalibration can be measured in various ways, such as binary choice questions or interval estimates. For different reasons, the analysis is restricted to miscalibration in interval estimates: it is most closely related to the facet of overconfidence that is modeled in economics, finance and management, and interval estimates are less studied than two-choice questions. The newly designed method of measuring "true overconfidence" is naturally related to interval estimates [4].

The widely used Generalized Additive Models (GAM) method is a flexible and effective technique for conducting nonlinear regression analysis in time series studies of the health effects of air pollution. GAM is applied when the estimated regression coefficients are small and there are confounding factors that are modeled using at least two nonparametric smooth functions [5].

III. METHODOLOGY
For the first-order autoregressive model, the forecasting equation is
$$\hat{Y}_t = \mu + \phi_1 Y_{t-1}$$
This is also called the AR(1) model. The observation $Y_t$ depends on $Y_{t-1}$, and the value of the autoregressive coefficient $\phi_1$ is restricted to lie between -1 and +1.
If $\phi_1$ is positive and less than 1 in magnitude (it must be less than 1 in magnitude if Y is stationary), the model describes mean-reverting behavior in which next period's value should be predicted to be $\phi_1$ times as far away from the mean as this period's value. If $\phi_1$ is negative, it predicts mean-reverting behavior with alternation of signs, i.e., it also predicts that Y will be below the mean next period if it is above the mean this period.
If the series Y is not stationary, the simplest possible model for it is a random walk model, which can be considered as a limiting case of an AR(1) model in which the autoregressive coefficient is equal to 1, i.e., a series with infinitely slow mean reversion. The prediction equation for this model can be written as
$$\hat{Y}_t = \mu + Y_{t-1}$$
where the constant term $\mu$ is the average period-to-period change in Y. This model could be fitted as a regression on a constant only, with the first difference of Y as the dependent variable. Since it includes a non-seasonal difference and a constant term, it is classified as an "ARIMA(0,1,0) model with constant". The random-walk-without-drift model would be an ARIMA(0,1,0) model without constant.
If the errors of a random walk model are autocorrelated, perhaps the problem can be fixed by adding one lag of the dependent variable to the prediction equation, i.e., by regressing the first difference of Y on itself lagged by one period. This would yield the following prediction equation:
$$\hat{Y}_t - Y_{t-1} = \mu + \phi_1 (Y_{t-1} - Y_{t-2})$$
This can be rearranged to
$$\hat{Y}_t = \mu + Y_{t-1} + \phi_1 (Y_{t-1} - Y_{t-2})$$
This is a first-order autoregressive model with one order of non-seasonal differencing and a constant term, i.e., an ARIMA(1,1,0) model.
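The three one-step-ahead prediction rules described above (AR(1), random walk with drift, and ARIMA(1,1,0)) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the parameter values and the two temperature readings are hypothetical and are not taken from the paper's data or fitted models.

```python
def ar1_forecast(y_prev, mu, phi1):
    """ARIMA(1,0,0): a constant plus phi1 times the previous value."""
    return mu + phi1 * y_prev


def random_walk_forecast(y_prev, mu=0.0):
    """ARIMA(0,1,0): previous value plus the average period-to-period change mu."""
    return mu + y_prev


def arima110_forecast(y_prev, y_prev2, mu, phi1):
    """ARIMA(1,1,0): random walk with drift plus phi1 times the last change."""
    return mu + y_prev + phi1 * (y_prev - y_prev2)


# Hypothetical last two seasonal maximum temperatures (degrees Celsius).
last_two = [30.0, 32.0]

ar1 = ar1_forecast(last_two[-1], mu=16.0, phi1=0.5)
rw = random_walk_forecast(last_two[-1], mu=0.5)
a110 = arima110_forecast(last_two[-1], last_two[-2], mu=0.5, phi1=0.5)
print(ar1, rw, a110)
```

Setting `phi1=1.0` in `ar1_forecast` gives the same value as `random_walk_forecast` with the same `mu`, which is the limiting case mentioned in the text.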

(c) ARIMA(0,1,1) without constant = Simple Exponential Smoothing:
Another strategy for correcting autocorrelated errors in a random walk model is suggested by the simple exponential smoothing model. For some nonstationary series, such as those with noisy fluctuations around a slowly varying mean, the random walk model does not perform as well as a moving average of past values. In other words, rather than taking the most recent observation as the forecast of the next observation, it is better to use an average of the last few observations in order to filter out the noise and more accurately estimate the local mean. The simple exponential smoothing (SES) model uses an exponentially weighted moving average of past values to achieve this effect. The prediction equation for the SES model can be written in a number of mathematically equivalent forms, one of which is the "error correction" form, in which the previous forecast is adjusted in the direction of the error it made:
$$\hat{Y}_t = \hat{Y}_{t-1} + \alpha e_{t-1}$$
Because $e_{t-1} = Y_{t-1} - \hat{Y}_{t-1}$ by definition, this can be rewritten as
$$\hat{Y}_t = Y_{t-1} - (1 - \alpha) e_{t-1} = Y_{t-1} - \theta_1 e_{t-1}$$
which is an ARIMA(0,1,1) without constant forecasting equation with $\theta_1 = 1 - \alpha$. This means that we can fit a simple exponential smoothing model by specifying it as an ARIMA(0,1,1) model without constant, and the estimated MA(1) coefficient corresponds to $1 - \alpha$ in the SES formula.
Implementing the SES model as an ARIMA model gives some flexibility. First, the estimated MA(1) coefficient is allowed to be negative: this corresponds to a smoothing factor larger than 1 in an SES model, which is usually not allowed by the SES model-fitting procedure. Second, we have the option of including a constant term in the ARIMA model in order to estimate an average non-zero trend. The ARIMA(0,1,1) model with constant has the prediction equation
$$\hat{Y}_t = \mu + Y_{t-1} - \theta_1 e_{t-1}$$
The one-period-ahead forecasts from this model are qualitatively similar to those of the SES model, except that the trajectory of the long-term forecasts is typically a sloping line (whose slope is equal to $\mu$) rather than a horizontal line.
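The equivalence between the error-correction form of SES and the ARIMA(0,1,1) without constant form can be checked numerically. The sketch below assumes that the first forecast is initialised at the first observation (the paper does not state an initialisation), and the series values and function names are hypothetical.

```python
def ses_error_correction(y, alpha):
    """SES: the previous forecast adjusted by alpha times its error."""
    forecasts = [y[0]]  # assumption: first forecast equals first observation
    for t in range(1, len(y) + 1):
        error = y[t - 1] - forecasts[t - 1]
        forecasts.append(forecasts[t - 1] + alpha * error)
    return forecasts


def arima011_no_constant(y, theta1):
    """ARIMA(0,1,1) without constant: last value minus theta1 times last error."""
    forecasts = [y[0]]  # same initialisation assumption as above
    for t in range(1, len(y) + 1):
        error = y[t - 1] - forecasts[t - 1]
        forecasts.append(y[t - 1] - theta1 * error)
    return forecasts


# Hypothetical seasonal temperatures (degrees Celsius).
series = [30.0, 32.0, 31.0, 33.0]
alpha = 0.5

ses = ses_error_correction(series, alpha)
arima = arima011_no_constant(series, theta1=1 - alpha)
print(ses)
print(arima)
```

Both lists hold one forecast per observed period plus one out-of-sample forecast, and they agree because theta1 is set to 1 - alpha.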

(e) ARIMA(0,2,1) or (0,2,2) without constant = Linear Exponential Smoothing:
Linear exponential smoothing models are ARIMA models which use two non-seasonal differences in conjunction with MA terms. The second difference of a series Y is not simply the difference between Y and itself lagged by two periods, but rather it is the first difference of the first difference, i.e., the change-in-the-change of Y at period t. Thus, the second difference of Y at period t is equal to
$$(Y_t - Y_{t-1}) - (Y_{t-1} - Y_{t-2}) = Y_t - 2Y_{t-1} + Y_{t-2}$$
A second difference of a discrete function is analogous to a second derivative of a continuous function: it measures the "acceleration" or "curvature" in the function at a given point in time.
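A small numerical check may help to separate the second difference from the lag-two difference. The sketch below uses a hypothetical quadratic series (the values of t squared), for which the second difference is constant, in the same way that the second derivative of a quadratic function is constant.

```python
def difference(series):
    """First difference: change between consecutive values."""
    return [b - a for a, b in zip(series, series[1:])]


def second_difference(series):
    """Second difference: first difference of the first difference."""
    return difference(difference(series))


# Hypothetical series with constant curvature: y_t = t^2 for t = 1..5.
y = [1.0, 4.0, 9.0, 16.0, 25.0]

d2 = second_difference(y)
# The closed form Y_t - 2*Y_{t-1} + Y_{t-2} gives the same values.
closed_form = [y[t] - 2 * y[t - 1] + y[t - 2] for t in range(2, len(y))]
# The lag-two difference Y_t - Y_{t-2} is a different quantity.
lag_two = [y[t] - y[t - 2] for t in range(2, len(y))]

print(d2, closed_form, lag_two)
```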
The ARIMA(0,2,2) model without constant predicts that the second difference of the series equals a linear function of the last two forecast errors:
$$\hat{Y}_t - 2Y_{t-1} + Y_{t-2} = -\theta_1 e_{t-1} - \theta_2 e_{t-2}$$
where $\theta_1$ and $\theta_2$ are the MA(1) and MA(2) coefficients. This is a general linear exponential smoothing model, essentially the same as Holt's model, and Brown's model is a special case. It uses exponentially weighted moving averages to estimate both a local level and a local trend in the series. The long-term forecasts from this model converge to a straight line whose slope depends on the average trend observed towards the end of the series.
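The ARIMA(0,2,2) without constant prediction rule can be sketched as a short recursion, written here in the rearranged form Yhat_t = 2*Y_{t-1} - Y_{t-2} - theta1*e_{t-1} - theta2*e_{t-2}. The sketch assumes that the forecast errors before the first forecast are zero, and the coefficient values and series are hypothetical rather than estimates from the paper.

```python
def arima022_forecasts(y, theta1, theta2):
    """One-step-ahead forecasts for periods 2..len(y) (0-based indexing)."""
    errors = [0.0, 0.0]  # assumption: errors before the first forecast are zero
    forecasts = []
    for t in range(2, len(y) + 1):
        yhat = 2 * y[t - 1] - y[t - 2] - theta1 * errors[-1] - theta2 * errors[-2]
        forecasts.append(yhat)
        if t < len(y):
            # An observed value is available, so the forecast error can be stored.
            errors.append(y[t] - yhat)
    return forecasts


# A hypothetical straight-line series is extrapolated exactly (errors stay zero).
linear = arima022_forecasts([10.0, 12.0, 14.0, 16.0], theta1=0.5, theta2=0.25)

# A hypothetical curved series produces non-zero errors that feed later forecasts.
curved = arima022_forecasts([10.0, 12.0, 15.0, 20.0], theta1=0.5, theta2=0.25)
print(linear, curved)
```

The last element of each list is the out-of-sample forecast for the period after the end of the series.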
The ARIMA(1,1,2) model without constant is illustrated in the accompanying slides on ARIMA models. It extrapolates the local trend at the end of the series but flattens it out at longer forecast horizons to introduce a note of conservatism, a practice that has empirical support. It is generally advisable to stick to models in which at least one of p and q is no larger than 1.
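The flattening of the trend at longer horizons can be illustrated with a small sketch. It assumes that the model meant here is the ARIMA(1,1,2) without constant listed in Section IV, with prediction equation Yhat_t = Y_{t-1} + phi1*(Y_{t-1} - Y_{t-2}) - theta1*e_{t-1} - theta2*e_{t-2}, and it further assumes that all past forecast errors are zero, so that only the autoregressive part drives the forecast path. The values of phi1 and of the last two observations are hypothetical.

```python
def damped_trend_path(y_prev2, y_prev, phi1, horizon):
    """Multi-step forecasts in which each step's change is phi1 times the previous change."""
    path = []
    last2, last = y_prev2, y_prev
    for _ in range(horizon):
        nxt = last + phi1 * (last - last2)  # error terms assumed to be zero
        path.append(nxt)
        last2, last = last, nxt
    return path


# Hypothetical last two observations 30.0 and 32.0 (a local trend of +2 per period).
path = damped_trend_path(30.0, 32.0, phi1=0.5, horizon=4)
steps = [b - a for a, b in zip([32.0] + path, path)]
print(path, steps)
```

With phi1 between 0 and 1 the step sizes shrink geometrically, so the forecast path levels off instead of continuing along a straight line.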

The single exponential smoothing forecasting model requires the specification of an α value, and it has been shown that the mean absolute percentage error (MAPE) and mean square error (MSE) measures depend on this choice. Adaptive Response Rate Single Exponential Smoothing (ARRSES) may have an advantage over Single Exponential Smoothing (SES) in that it allows the value of α to be modified in a controlled manner as changes in the pattern of the data occur. This characteristic seems attractive when hundreds or thousands of items require forecasting.
The basic equation for forecasting with the ARRSES method is similar to the SES equation, except that α is replaced by $\alpha_t$:
$$\hat{Y}_{t+1} = \alpha_t Y_t + (1 - \alpha_t) \hat{Y}_t$$
$$\alpha_{t+1} = \left| \frac{A_t}{M_t} \right|$$
$$A_t = \beta e_t + (1 - \beta) A_{t-1}$$
$$M_t = \beta |e_t| + (1 - \beta) M_{t-1}$$
$$e_t = Y_t - \hat{Y}_t$$
Here β is a parameter between 0 and 1 and | | denotes absolute values.
In these equations, $A_t$ denotes a smoothed estimate of the forecast error and is calculated as a weighted average of $A_{t-1}$ and the last forecasting error $e_t$. $M_t$ denotes a smoothed estimate of the absolute forecast error, calculated as a weighted average of $M_{t-1}$ and the last absolute forecasting error $|e_t|$. $A_t$ and $M_t$ are themselves single exponential smoothing estimates.
The second equation indicates that the value of α to be used for forecasting period (t+2) is defined as the absolute value of the ratio of $A_t$ and $M_t$. Instead of $\alpha_{t+1}$, we could have used $\alpha_t$ in that equation. We prefer $\alpha_{t+1}$ because ARRSES is often too responsive to changes; by using $\alpha_{t+1}$ we introduce a small lag of one period, which allows the system to "settle" a little and forecast in a more conservative manner.
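The ARRSES updating scheme can be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: the forecast is updated as alpha_t*Y_t + (1 - alpha_t)*Yhat_t, the smoothed error A_t and smoothed absolute error M_t use weight beta on the newest error, and the next smoothing constant is |A_t / M_t|. The starting values (first forecast equal to the first observation, A and M equal to zero, an initial alpha of 0.2) and the data are hypothetical choices.

```python
def arrses_forecasts(y, beta=0.2, alpha_init=0.2):
    """Return one-step-ahead forecasts and the alpha used at each step."""
    forecasts = [y[0]]      # assumption: first forecast equals first observation
    alphas = [alpha_init]   # assumption: starting value for alpha
    smoothed_error = 0.0    # A_t
    smoothed_abs_error = 0.0  # M_t
    for t in range(len(y)):
        alpha = alphas[-1]
        error = y[t] - forecasts[t]
        # Forecast for the next period uses the current alpha.
        forecasts.append(alpha * y[t] + (1 - alpha) * forecasts[t])
        # Update the smoothed error terms with the newest error.
        smoothed_error = beta * error + (1 - beta) * smoothed_error
        smoothed_abs_error = beta * abs(error) + (1 - beta) * smoothed_abs_error
        # Alpha for the following step; keep the start value while M_t is zero.
        if smoothed_abs_error > 0:
            alphas.append(abs(smoothed_error / smoothed_abs_error))
        else:
            alphas.append(alpha_init)
    return forecasts, alphas


# Hypothetical seasonal temperatures (degrees Celsius).
series = [30.0, 32.0, 31.0, 35.0, 34.0]
forecasts, alphas = arrses_forecasts(series)
print(forecasts)
print(alphas)
```

Because the absolute value of a weighted average of errors cannot exceed the same weighted average of absolute errors, the adapted alpha stays between 0 and 1.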

(h) Non Linear Model
A non linear model fitted to the data is

IV. RESULTS AND DISCUSSIONS
For the temperature data of India, all nine models are fitted to the minimum and maximum temperature data for the years 2000 to 2010, as follows.
[Table residue; surviving row labels: ARIMA(0,2,1) or (0,2,2) without constant, ARIMA(1,1,2) without constant, Non linear model. Table values are not recoverable.]
Table 3 shows positive and negative signs. It compares the MAE values of the maximum temperature data with those of the minimum temperature data, and likewise the RMSE values of the maximum temperature data with those of the minimum temperature data. A positive (+) sign indicates an increasing trend and a negative (-) sign indicates a decreasing trend.
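The paper does not spell out how MAE and RMSE are computed for interval data. One possible reading, sketched below purely as an assumption, treats each observation as a (minimum, maximum) pair and averages the errors over both bounds. The function names and the temperature values are hypothetical and are not results from the paper.

```python
import math


def interval_mae(actual, predicted):
    """Mean absolute error averaged over the lower and upper bounds."""
    total = 0.0
    for (low, high), (low_hat, high_hat) in zip(actual, predicted):
        total += abs(low - low_hat) + abs(high - high_hat)
    return total / (2 * len(actual))


def interval_rmse(actual, predicted):
    """Root mean square error averaged over the lower and upper bounds."""
    total = 0.0
    for (low, high), (low_hat, high_hat) in zip(actual, predicted):
        total += (low - low_hat) ** 2 + (high - high_hat) ** 2
    return math.sqrt(total / (2 * len(actual)))


# Hypothetical (minimum, maximum) seasonal temperatures and forecasts.
actual = [(20.0, 30.0), (24.0, 36.0)]
predicted = [(21.0, 29.0), (24.0, 33.0)]

mae = interval_mae(actual, predicted)
rmse = interval_rmse(actual, predicted)
print(mae, rmse)
```

RMSE penalises the single large miss on the second upper bound more heavily than MAE does, which is why the two measures can rank models differently.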
The test group is from 2011 to 2016. The equations for the maximum and minimum temperature data are fitted for the nine models as given in the table.
[Table residue; surviving row label: ARIMA(0,2,1) or (0,2,2) without constant.]
The best model for the minimum temperature data for the seasons, from the above table, is ARIMA(0,2,1) or (0,2,2) without constant, and the equation for this model is