Browsing by Subject "Time series"

  • Lahdensuo, Sofia (2022)
    The Finnish Customs collects and maintains the statistics of Finnish intra-EU trade with the Intrastat system. Companies with significant intra-EU trade are obligated to submit monthly Intrastat declarations, and the statistics of Finnish intra-EU trade are compiled from the information collected with the declarations. If a company does not submit its declaration in time, an estimation method is needed for the missing values. In this thesis we propose an automatic multivariate time series forecasting process for estimating the missing Intrastat import and export values. The forecasting is done separately for each company with missing values. For forecasting we use two-dimensional time series models, where one component is the import or export value of the company to be forecasted and the other component is the import or export value of the company's industrial group. To complement the time series forecasting we use forecast combining. Combined forecasts, for example the averages of the obtained forecasts, have been found to perform well in terms of forecast accuracy compared to the forecasts created by individual methods. In the forecasting process we use two multivariate time series models, the Vector Autoregressive (VAR) model and a specific VAR model called the Vector Error Correction (VEC) model. The choice of the model is based on the stationarity properties of the time series to be modelled. An alternative to the VEC model is the so-called augmented VAR model, which is an over-fitted VAR model. We use the VEC model and the augmented VAR model together by taking the average of their forecasts as the forecast for the missing value. When the usual VAR model is used, only the forecast created by that single model is used. The forecasting process is designed to be automatic and as fast as possible; therefore the estimation of a time series model for a single company is kept as simple as possible. Thus, only statistical tests which can be applied automatically are used in the model building. We compare the forecast accuracy of the forecasts created with the automatic forecasting process to the forecast accuracy of forecasts created with two simple forecasting methods. For the time series deemed non-stationary, the naïve forecast performs well in terms of forecast accuracy compared to the time series model based forecasts. On the other hand, for the time series deemed stationary, the average over the past 12 months performs well as a forecast in terms of forecast accuracy compared to the time series model based forecasts. We also consider forecast combinations created by calculating the average of the time series model based forecasts and the simple forecasts. In line with the literature, the forecast combinations perform better overall in terms of forecast accuracy than the forecasts based on the individual models.
  • Widgrén, Joona (2017)
    The internet is a popular channel for finding information. The search queries entered into a search engine contain a huge amount of data, but can it be used in economic forecasting? This thesis investigates whether Google searches reflect changes in the Finnish housing market. The focus of this thesis is on forecasting housing prices and home sales. Google search data is collected from Google Trends. Google Trends provides data describing the popularity of search queries. Google Trends data is updated every day, and thus its publishing frequency is much higher than that of the official housing market data. The difference in publishing frequency can help to predict changes in housing markets before the official data is released. To evaluate the usefulness of Google data, a simple model is extended with the Google search index. The forecasting ability of the simple model and the model with Google searches are then compared. Both models are used to forecast the current values of housing market indicators as well as near-future values. Furthermore, the Granger causality test is employed to investigate whether Google searches are useful in forecasting housing market variables. The robustness of the results is studied using the fixed effects model. Also, housing price changes are forecasted as a robustness check. The results suggest that Google searches are useful in forecasting the Finnish housing market. Adding Google searches to a simple housing price forecasting model improves the accuracy of the contemporaneous forecast by 7.5 percent on average. Google searches improve the contemporaneous home sales forecast by 15.9 percent on average. Also, the Granger causality test suggests that Google searches are useful in forecasting home sales. The findings are not as clear for Granger causality between Google searches and housing prices. The Granger causality test results suggest that Google searches could be useful in forecasting the current housing prices but not future values. The results also suggest that Google searches improve the near-future forecasts of both indicators.
  • Salmirinne, Simo (2020)
    Time series are essential in various domains and applications. Especially in retail business, forecasting demand is a crucial task for making appropriate business decisions. In this thesis we focus on a problem that can be characterized as a sub-problem in the field of demand forecasting: we attempt to form clusters of products that reflect the products’ annual seasonality patterns. We believe that these clusters would aid us in building more accurate forecast models. The seasonality patterns are identified from weekly sales time series, which in many cases are very sparse and noisy. In order to successfully separate the seasonality patterns from all the other factors contributing to a product’s sales, we build a pipeline to preprocess the data accordingly. This pipeline consists of first aggregating the sales of individual products over several stores to strengthen the sales signal, followed by solving a regularized weighted least squares objective to smooth the aggregates. Finally, the seasonality patterns are extracted using the STL decomposition procedure. These seasonality patterns are then used as input for the k-means algorithm and several hierarchical agglomerative clustering algorithms. We evaluate the clusters using two distinct approaches. In the first approach we manually label a subset of the data. These labeled subsets are then compared against the clusters provided by the clustering algorithms. In the second approach we form a simple forecast model that fits the clusters’ seasonality patterns back to the observed sales time series of individual products. In this approach we also build a secondary validation forecast model with the same objective, but instead of using the clusters provided by the algorithms, we use predetermined product categories as the clusters. These product categories should naturally provide a valid baseline for groups of products with similar seasonality, as they reflect how similar products are organized in close proximity in physical stores. Our results indicate that we were able to find clear seasonal structure in the clusters. In particular, the k-means algorithm and the hierarchical agglomerative clustering algorithms with complete linkage and Ward’s method were able to form reasonable clusters, whereas the hierarchical agglomerative clustering algorithm with single linkage proved unsuitable for our data.
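
The two simple benchmarks and the equal-weight forecast combination mentioned in the first abstract above (Lahdensuo, 2022) can be illustrated with a minimal Python sketch. The function names and the monthly values are hypothetical and chosen only for illustration; the thesis itself also combines forecasts from VAR and VEC models, which are not reproduced here.

```python
from statistics import mean


def naive_forecast(series):
    """Naive forecast: repeat the last observed value."""
    return series[-1]


def trailing_mean_forecast(series, window=12):
    """Forecast with the average of the most recent `window` observations."""
    return mean(series[-window:])


def combine_forecasts(forecasts):
    """Equal-weight forecast combination: the plain average of the inputs."""
    return mean(forecasts)


# Hypothetical monthly trade values for one company (illustrative numbers only).
monthly_values = [100, 104, 98, 110, 107, 103, 99, 105, 111, 108, 102, 113]

naive = naive_forecast(monthly_values)
trailing = trailing_mean_forecast(monthly_values)
combined = combine_forecasts([naive, trailing])

print(naive, trailing, combined)
```

A forecast from a fitted time series model could be appended to the list passed to `combine_forecasts` in the same way, which is the kind of averaging the abstract describes.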
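
The clustering step of the last abstract above (Salmirinne, 2020) can be sketched in the same spirit. This is a plain k-means (Lloyd's algorithm) on already extracted seasonal profiles, not the thesis pipeline: the aggregation, smoothing and STL steps are assumed to have been done, the normalization is an assumption made here so that profiles are compared by shape, and the four-point profiles are made-up numbers.

```python
def normalize(profile):
    """Center a seasonal profile and scale it to unit maximum absolute value."""
    center = sum(profile) / len(profile)
    centered = [x - center for x in profile]
    scale = max(abs(x) for x in centered) or 1.0  # avoid dividing by zero
    return [x / scale for x in centered]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, initial_centroids, iterations=20):
    """Lloyd's algorithm with caller-supplied starting centroids."""
    centroids = [list(c) for c in initial_centroids]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: each profile goes to its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(p, centroids[k]))
            for p in profiles
        ]
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(profiles, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical seasonal profiles: two peak mid-period, two peak at the ends.
raw_profiles = [[1, 5, 9, 2], [2, 9, 18, 3], [9, 3, 1, 8], [20, 5, 2, 15]]
profiles = [normalize(p) for p in raw_profiles]

# Start from the first and third profiles (an arbitrary, deterministic choice).
labels, centroids = kmeans(profiles, [profiles[0], profiles[2]])
print(labels)
```

Products with the same seasonal shape but different sales volumes end up in the same cluster here only because of the normalization; without it, volume differences would dominate the distances.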