High spatial and temporal detail in timely prediction of tourism demand

What happens in forecasting problems when high frequency and high spatial detail data encounter significant publication delays? In this paper, we consider a monthly dynamic panel data model, augmented by Google Trends search query volume data, for tourism demand forecasting at high spatial detail, in which one of the main aspects is represented by a publication delay ranging from 8 to 15 months. Some findings in the tourism literature already specify forecasting/nowcasting applications considering a realistic time delay but not for more than 3 months.

Most of these studies, however, focus primarily on macro tourism areas (i.e., nations or metropolitan areas), without considering data realization and real-time vintages. Camacho and Pacce (2017) are the only exception in the tourism literature, with their consideration of publication delays, data releases, and updates. The authors focus on forecasting the demand for Spain at a national level, dealing with a 3-month publication delay of official data. However, enhancing the territorial detail of the information causes a clear increase in this delay.
To overcome these shortcomings in forecasting tourism demand, this study incorporates the following novelties.

| Fine territorial detail
The first aspect to be considered and enhanced in this paper is the prediction of data at fine territorial detail (i.e., municipalities). In the literature, analysis of a specific small area is rare (Goffi & Cucculelli, 2019), and the finest spatial detail considered by scholars is metropolitan cities or cities characterized by high tourism flows (Gunter & Onder, 2015, 2016; Emili, Figini, & Guizzardi, 2019). Small towns, where the tourism industry is at the core of the local economy, may benefit from an accurate analysis of the dynamics of tourism flows to a larger extent than metropolitan areas. This issue is highly relevant to both (a) seasonal destinations, where the predictability of tourism arrivals is fundamental to destination management; and (b) areas characterized by different attractions (i.e., clusters of municipalities) responding to alternative segments of the tourism market.

| Panel data model
A second important aspect of this paper is the methodological approach. Seasonality, multiproduct destinations, and spatial proximity require coordinated marketing and management actions to deal with and control for potential tourism pull factors in a timely manner. To account for these issues, we suggest using an econometric panel approach, where municipalities within an area represent the unit of analysis. Song and Li (2008) highlight panel data analysis of tourism demand as an approach embodying "some advantages over the time series econometric models. It incorporates much richer information from both time series and cross sectional data. This approach also reduces the problem of multicollinearity and provides more degrees of freedom in the model estimation".

| Nowcasting approach with real (long) publication delay and GT data
Motivated by the temporal mismatch between the timely predictions needed for hospitality-firm management and tourism-destination urban policies (Jackman & Naitram, 2015) and the long delays in the publication of official statistics, the third contribution of this paper is a set of forecasting/nowcasting schemes with horizons ranging from 8 to 15 months. This is in line with the real availability of fine territorial tourism data. Introducing ex ante demand variable information, on the basis of selected keywords of a Google search-volume index, we estimate timely measures of production activities at the destination level with the inclusion of GT data. In addition, the panel nowcasting model allows us to account for the possible spatial relationships among the tourism destinations in the area, thereby improving the quality of the predictions.
In Italy, foreign visitors bring in €37.2 billion (7.5% of total Italian exports), supporting 5.5% of direct and 12.6% of total employment. The heterogeneous geographical wealth of Italy, along with a rich offering of tourism attractions, leads to tourism earnings playing an important role in both the public and private budgets of local areas. The province of Rimini can be considered one of the most representative examples in Italy of local territories strongly connected to tourism earnings.
The province of Rimini is an interesting setting in which to evaluate the proposed panel specification for several reasons. First, it is a top Italian tourism destination, ranking second for tourism GDP per capita, where 30% of the total product in the whole area is provided by the tourism sector (see endnote 1). Second, it comprises different typologies of municipalities, from the cities located on the Adriatic Sea coast (from north to south, Bellaria, Rimini, Riccione, Misano, and Cattolica) to the inland towns, rich in historical attractions, boasting Roman relics and Mediaeval castles. Furthermore, the Rimini area is a mass and multiproduct destination offering solutions to different demand segments including leisure, meeting and conference, sport, and culture.
The remainder of this paper is organized as follows. Section 2 reviews the forecasting and nowcasting literature, focusing especially on the use of search query volumes to enhance accuracy and on the lack of interest in publication delays and data vintages. In Section 3, we specify the panel data forecasting model and present the case study. Section 4 reports the results of both forecasting and nowcasting applications. Finally, in Section 5, we analyse the main findings and the related managerial implications.

| Tourism forecasting and big data
In general, the tourism forecasting literature is composed of two strands (Song & Li, 2008): econometric models (Uysal & O'Leary, 1986; Lim & McAleer, 2001; Rossellò-Nadal, 2001; Song et al., 2009) and pure time-series models (Witt & Witt, 1991; Hyndman et al., 2002; Gil-Alana, 2005; Athanasopoulos et al., 2011). In this literature, a first possible cause of failure in tourism modelling and forecasting is the high spatial substitutability among neighbouring cities or destinations with comparable tourism attractions. A second source of failure in forecasting time-series tourism data may be found in the high unpredictability of tourism choices, which are contingent on war, terrorism, and geopolitical instability in addition to atmospheric conditions, environmental crises, fad swings, and so forth. To address these possible causes of failure, we combine (a) the use of Big Data generated by Internet traffic, in order to assess changes in tourism flows arising from those factors in a timely manner, and (b) the inclusion of these data in the context of panel data models, accounting for the volatility due to spillovers from substitutable tourism destinations and for innovations resulting from changes in tourist preferences.
As clearly pointed out by Song and Liu (2017), in the framework of tourism demand analysis, the decision process and implicit characteristics of the demand can be easily controlled for and studied through the digital tracks left by travellers on the web.The Internet has already assumed a key role in fashions and consumer trends (Horrigan, 2008); holiday preferences revealed by individual searches on an Internet browser provide a new set of empirical evidence, which could furnish real-time information about the main indicators of tourism activities and improve scientific analysis of tourism-demand dynamics.
Since the release of the GT index (previously also known as Google Insights for Search), numerous applications have been developed to evaluate the information behind the volume of online searches. For instance, Ettredge, Gerdes, and Karuga (2005) focused on the analysis of the unemployment rate. Askitas and Zimmermann (2009) and D'Amuri and Marcucci (2009) developed short-term labour-market forecasting models including Google data. Ginsberg et al. (2009) used Google data to detect flu epidemics. Smith (2012) applied GT indices in modelling the volatility dynamics of financial markets. In the tourism framework, Choi and Varian (2009, 2012) first explored the possibility of adding the Google search index to a simple autoregressive model of tourism arrivals in order to improve the forecasting performance of tourism demand. Recently, Bangwayo-Skeete and Skeete (2015) and Hirashima et al. (2017) implemented Autoregressive-Mixed Data Sampling (AR-MIDAS) regressions to forecast and nowcast tourism demand. Yang, Pan, Evans, and Lv (2015) evaluated the power of GT data and similar search engine volumes in predicting arrivals in China, showing the importance of reliable web source information. Rivera (2016) defined a dynamic linear model to evaluate the ability to forecast hotel stays in Puerto Rico, considering GT data as the realization of an unobservable process. Dergiades, Mavragani, and Pan (2018) investigated the predictive ability of their Google index, corrected for language bias and platform bias, to analyse causality relationships between the search data and monthly arrivals in Cyprus.
In the present work, we combine the effect of GT data on the accuracy of tourism demand forecasting in neighbouring small areas with panel data techniques. This class of models has been found relevant to many consumer goods, such as liquor (Baltagi & Griffin, 1995) and cigarettes (Baltagi & Griffin, 2001), but has had very few applications in tourism-demand analysis (Gallego, Rodríguez-Serrano, & Casanueva, 2019; Garín-Muñoz, 2004; Narajan et al., 2010; Sequeira & Maçãs-Nunes, 2008), especially for tourism quantities sampled at frequencies higher than quarterly, and rarely with a forecasting purpose. A general detailed review of the literature on panel forecasting models is provided by Baltagi (2007), presenting evidence about the superiority of panel techniques with respect to the time-series approach. Baltagi et al. (2003) found unstable variability of the single regional estimates and worse out-of-sample forecasts. Hoogstrate, Palm, and Pfan (2000) compared the forecast accuracy of pooling techniques and single country forecasts, specifically referring to the cases of N fixed and T large. They argued that, when there are similarities among estimated parameters, pooling may improve forecast accuracy. Similar findings have been reported by Gavin and Theodorou (2005) and Fok, Dijk, and Franses (2005), who find that the panel model forecasts of disaggregated series can be more accurate than forecasts of aggregate measures (e.g., total output or unemployment of 48 states).

| Nowcasting literature
The representativeness and predictive capabilities of GT data can overcome an important problem with the official data: in the real world, most of the quantities and relative measures closely related to tourism and this economic segment are published with a delay that increases when the territorial detail is enhanced. The result is a serious lack of information. In many cases, an evaluation of the forecasting accuracy can only be completed 1 year after the data are published. Fifteen years ago, nowcasting was introduced as a solution to this problem: a new class of econometric methodologies to forecast the timely measure of economic variables. This methodological approach, which takes its name from the combination of the terms now and forecasting (Banbura, Giannone, & Reichlin, 2011), can help mitigate the problems from publication delays in official statistics when measuring tourism activities.
The first contributions to the nowcasting literature were developed in the macroeconomic context (Giannone et al., 2008;Kuzin et al., 2011;Aastveit et al., 2014)

| Model specification
The dynamic panel data model for the tourism demand of the six neighbouring territories is defined by Equation (1), where $i = 1, \dots, N$, with $N = 6$, indicates the statistical unit (destination), and $t = 1, \dots, T$ is the time at which we evaluate the dependent variable $y_{i,t}$ (logarithm of the arrivals). $\eta_i$ is the individual specific effect, and $\varepsilon_{i,t}$ is the remainder error; $D_{i,t}$ is a matrix of seasonal dummies (11 dummies, the last variable omitted to avoid multicollinearity problems) introduced to detect deterministic seasonal patterns and to provide an interpretation of the seasonal fluctuations and their effects on the dependent variable (Shen, Li, & Song, 2008). This choice is supported by the Canova and Hansen (1995) test results for seasonal stability of the series, reported in the last column of Table 1. Specifically, the tests do not reject a stable seasonal pattern in the series.
In the methodological literature on dynamic panel analysis, the main problem consists in the dependence between the individual effects and the lagged dependent variable $y_{i,t-1}$. In order to overcome the resulting estimation problems, alternative transformations have been proposed in the literature with the aim of evaluating panel dynamic relationships and wiping out the individual effects (Anderson & Hsiao, 1982; Arellano & Bond, 1991; Arellano & Bover, 1995; Blundell & Bond, 1998). In this paper, we consider the difference generalized method of moments (GMM) estimation approach proposed by Arellano (1988) and Arellano and Bond (1991). In addition to having good asymptotic properties (see endnote 2), difference GMM, unlike IV estimators, permits both the use of internal lagged variables as instruments and increasing efficiency. Furthermore, unlike system GMM, difference GMM is not affected by the likelihood of strong proliferation of instruments and hence by overidentification problems (Gallego, Rodríguez-Serrano, & Casanueva, 2019).
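The idea that first differencing wipes out the individual effects, and that lagged levels can then serve as internal instruments, can be illustrated with a small simulation. The sketch below is not the Arellano-Bond difference GMM estimator used in the paper: it swaps in the simpler Anderson-Hsiao instrumental-variable estimator on a simulated AR(1) panel. The function names, the simulated data, and all parameter values are hypothetical.

```python
import random

def simulate_panel(n_units, n_periods, rho, seed=0, burn_in=50):
    """Simulate y[i][t] = rho * y[i][t-1] + eta_i + eps[i][t] for each unit i."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        eta = rng.gauss(0.0, 0.5)          # individual specific effect
        y = eta / (1.0 - rho)              # start at the unit's long-run mean
        series = []
        for t in range(burn_in + n_periods):
            y = rho * y + eta + rng.gauss(0.0, 1.0)
            if t >= burn_in:
                series.append(y)
        panel.append(series)
    return panel

def anderson_hsiao(panel):
    """IV estimate of rho from first differences, with y[t-2] as instrument."""
    num = 0.0
    den = 0.0
    for y in panel:
        for t in range(2, len(y)):
            dy_t = y[t] - y[t - 1]         # differencing removes eta_i
            dy_lag = y[t - 1] - y[t - 2]   # endogenous differenced regressor
            z = y[t - 2]                   # instrument: predates both shocks in d(eps_t)
            num += z * dy_t
            den += z * dy_lag
    return num / den

# Many simulated units (far more than the six destinations) so that the
# estimate is stable; the true autoregressive coefficient is 0.5.
panel = simulate_panel(n_units=5000, n_periods=10, rho=0.5)
rho_hat = anderson_hsiao(panel)
print(round(rho_hat, 3))
```

Difference GMM builds on the same moment conditions but uses all available lags as instruments and weights them efficiently, which is the efficiency gain over simple IV mentioned in the text.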
To assess the predictive power of GT data, we augment the panel data model by introducing the variables $GT'_{i,t-12}$ of Google searches at lag 12 (after log transformation) to exploit the information from the search engine queries for the specified keywords. GT variables are introduced to account for holiday destination preferences as well as for possible structural changes caused by general economic conditions, word-of-mouth phenomena, exchange-rate changes, and specific regional situations (e.g., terrorism, wars, and political turbulence).
Choosing lag 12 for both the lagged dependent variable and the explanatory variable allows us to account for one of the main characteristics of the tourism demand for Rimini: loyalty. In 2014, the European Union-funded SEE project for territorial cooperation InTourAct (Province of Rimini, the University of Bologna, and the Piepoli Institute) found that loyal tourists account for 60% of the total, with nearly 90% intending to return the following year.
Starting from the baseline model of Equation (1), the augmented forecasting panel model is given by Equation (2). The last column (CH) of Table 1 shows the results of the Canova and Hansen (1995) stability test at lag 12. For each destination, we do not reject the null of stable seasonality at the 5% significance level.
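As a rough illustration of how the regressors described above could be assembled for one destination, the following sketch builds the log-arrivals, their lags, the lag-12 log GT index, and the 11 monthly dummies. It rests on assumptions: the inclusion of lag 1 alongside lag 12 is inferred from the Wald hypotheses on $\rho_1$ and $\rho_{12}$, the function name `build_rows` is hypothetical, and the input values are made up.

```python
import math

def build_rows(arrivals, gt_index, first_month=1):
    """Return one regression row per month t >= 12 for a single destination.

    arrivals, gt_index: equally long lists of positive monthly values.
    first_month: calendar month (1-12) of the first observation.
    """
    y = [math.log(a) for a in arrivals]
    g = [math.log(v) for v in gt_index]
    rows = []
    for t in range(12, len(y)):
        month = (first_month - 1 + t) % 12 + 1
        # 11 seasonal dummies; month 12 is the omitted reference category
        dummies = [1 if month == m else 0 for m in range(1, 12)]
        rows.append({
            "y": y[t],
            "y_lag1": y[t - 1],
            "y_lag12": y[t - 12],
            "gt_lag12": g[t - 12],
            "dummies": dummies,
        })
    return rows

# Two years of made-up monthly values (illustrative only, not the paper's data)
arrivals = [100 + 50 * (m % 12) for m in range(24)]
gt_index = [10 + 5 * (m % 12) for m in range(24)]
rows = build_rows(arrivals, gt_index)
print(len(rows), rows[0]["dummies"])
```

The first 12 months are lost to the lag structure, so two years of data leave 12 usable rows per destination. A GT value of zero would need separate handling before taking logs.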
The search query volumes are obtained from the website https://trends.google.com/trends/. Since 2004, Google has provided a search query index that reflects the magnitude of World Wide Web searches for all possible keywords. The Google algorithm provides tracked and categorized searches for a specific query, which includes the searches made around the keywords. This aims to identify and deal with the searches performed using all possible meanings of a specific term. From this perspective, GT allows the queries to be refined in two ways: selecting specific categories (all categories, travels, finance, food and drink, etc.) and suggesting the meaning of the inquirer's query.
Hence, for this study, the category of interest is the first important aspect to be defined when downloading the GT.
Even though the literature shows different procedures have been adopted to choose the most informative combination of keywords and categories (for a survey, see Li et al. (2017)), in our study, we refine the searches in the category "Hotel & Accommodation": first, in Rimini, the tourism sector of second homes can be considered marginal with respect to hotel accommodations; and second, the Emilia Romagna region quantifies the arrivals using hotel accommodations.
The search query variables are obtained for the keywords "rimini," "riccione," "cattolica," "misano," "bellaria," and a sixth query built as a composite of the names of several villages in the inland areas. This last choice compensates for the unavailability of a unique measure of search volume for this territory, due to the lack of a specific name that identifies one GT series for the whole destination.
We suggest building this search query index with the names of the five main inland municipalities (in terms of arrivals). The final query for the inland area of the province is then given by "san leo" + "santarcangelo di romagna" + "pennabilli" + "verucchio" + "mondaino".
A third aspect that must be considered is the time interval of the downloaded data. Every week, the (absolute) number of searches is obtained from a different sample of IP addresses and is then normalized by the total searches of the week. This adjustment causes not only an update of the maximum of the series (when a new maximum of searches is identified or when the chosen time interval excludes the observed maximum) but also some small differences between the same series downloaded in two different weeks. As a result, every time we update the training set, we cannot simply add a new observation to the GT series; we have to download all of the data again. Both dependent and independent variables are plotted in Figure 2, with solid and dotted lines, respectively. The seasonal pattern of the arrivals is also observed for the GT series, with peaks emerging mostly in July and troughs in winter. In contrast to the other series, the search volumes for the inland areas show a less smooth seasonal dynamic.
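The need to download the whole GT series again at every update can be seen in a toy version of the rescaling. The sketch below assumes only that the published index is expressed relative to the maximum of the selected window (set to 100); it ignores the weekly re-sampling and the normalization by total searches, and the raw counts are hypothetical since absolute counts are not published.

```python
def rescale_to_peak(raw_counts):
    """Rescale a window of search volumes so that its maximum equals 100."""
    peak = max(raw_counts)
    return [100.0 * c / peak for c in raw_counts]

# Hypothetical raw volumes (absolute counts are not published)
window = [10, 20, 40]
extended = window + [80]          # one more month, which sets a new maximum

print(rescale_to_peak(window))    # [25.0, 50.0, 100.0]
print(rescale_to_peak(extended))  # [12.5, 25.0, 50.0, 100.0]
```

Once the new month becomes the maximum, every earlier index value changes, so appending a single observation to a previously downloaded series would mix two different scales.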

| RESULTS
In Table 2, […] cases and $H_{0,a}: \rho_1 = \rho_{12} = \gamma = 0$ for the augmented ones. In the last row of Table 3, we summarize the results of the Arellano-Bond test for second- to 12th-order autocorrelation in the first-differenced errors (Arellano & Bond, 1991). Specifically, we report the number of times that the null hypothesis of no autocorrelation is rejected at the 1% significance level.
Findings suggest that the autocorrelation coefficients and the estimated impacts of the GT variables are always positive and significant, as also confirmed by the Wald tests. The signs of the dummy variables remain constant over the two samples: we observe negative signs for the autumn and winter dummies ($\delta_1$, $\delta_2$, $\delta_9$, $\delta_{10}$, and $\delta_{11}$) and positive impacts for the spring and summer dummies ($\delta_3$, $\delta_4$, $\delta_5$, $\delta_6$, $\delta_7$, and $\delta_8$). The 0 values in the last rows indicate the absence of serial correlation and hence support the validity of the moment conditions used by the Arellano-Bond estimator in the four applications.
The forecasting and nowcasting abilities of the two models have also been evaluated against further rival specifications. In particular, we consider the univariate time series models defined by Equations (3) and (4) for $i = 1, \dots, 6$, where $\Delta_{12}(\cdot)_{i,t} = (\cdot)_{i,t} - (\cdot)_{i,t-12}$ is the 12th difference of both dependent and independent variables. The third and fourth rival specifications are the naïve forecasting model, identified below by N and defined by $\hat{y}_{i,T+1} = y_{i,T}$, and Holt-Winters' triple exponential smoothing (following Gunter & Onder, 2015) from the class of exponential smoothing methods.
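Two of the building blocks named above, the naïve forecast and the 12th (seasonal) difference, are simple enough to sketch directly. The helper names and the log-arrival values below are made up for illustration; the univariate models of Equations (3) and (4) and the Holt-Winters method are not reproduced here.

```python
def naive_forecast(y):
    """Naive (N) forecast: the next value equals the last observed value."""
    return y[-1]

def seasonal_difference(x, lag=12):
    """Return x[t] - x[t - lag] for every t where both terms exist."""
    return [x[t] - x[t - lag] for t in range(lag, len(x))]

# Made-up log-arrivals: a fixed seasonal pattern plus a level shift of 0.1 in year 2
year1 = [1.0, 1.2, 1.5, 2.0, 2.6, 3.1, 3.5, 3.6, 2.8, 2.0, 1.4, 1.1]
year2 = [v + 0.1 for v in year1]
y = year1 + year2

print(naive_forecast(y))
print([round(d, 2) for d in seasonal_difference(y)])
```

In this toy series the seasonal difference removes the repeating monthly pattern and leaves only the year-on-year change of 0.1, which is why 12th differences are a natural transformation for strongly seasonal arrivals.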
The forecast accuracy measures employed in this work are the mean absolute error (MAE) and the root mean square error (RMSE); as pointed out by Gunter and Onder (2015), these quantities, when calculated on the logarithmic series, approximate the related percentage indices (mean absolute percent error and root mean square percent error) of the untransformed variables. We also consider a third measure of accuracy, built as a ratio of absolute errors. Specifically, the relative quantities compare the absolute errors of the benchmark models with the absolute error of the augmented model, as defined by Equation (5), where $y_{T+h}$ is the variable observed h steps ahead of the end of the estimation period T, and $\hat{y}_{T+h,b}$ and $\hat{y}_{T+h,a}$ are the forecasts for the benchmark and the augmented models, respectively. Through these quantities, we obtain a direct measure of comparison because values equal to 1 represent equal errors for the two models, $R_{T+h} > 1$ indicates a better performance of the benchmark model, and $R_{T+h} < 1$ confirms the goodness of the augmented forecasting model.
Moreover, the lack of information caused by the publication delay for official data aggravates the difficulty of obtaining reliable out-of-sample results. Thus we further implement a nowcasting application.
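The three accuracy measures can be sketched in a few lines. The orientation of the ratio (augmented error over benchmark error) is an assumption, chosen so that values below 1 favour the augmented model as in the discussion of the results; the function names and the numbers are illustrative only.

```python
import math

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def error_ratio(actual, pred_augmented, pred_benchmark):
    """Absolute error of the augmented model divided by that of the benchmark.

    Orientation is an assumption: values below 1 favour the augmented model.
    Undefined when the benchmark error is exactly zero.
    """
    return abs(actual - pred_augmented) / abs(actual - pred_benchmark)

# Made-up log-arrivals and forecasts
y_true = [10.0, 11.0, 12.0]
y_bench = [10.4, 10.6, 12.4]
y_aug = [10.2, 10.8, 12.2]

print(mae(y_true, y_bench), mae(y_true, y_aug))
print(rmse(y_true, y_bench), rmse(y_true, y_aug))
print(error_ratio(y_true[0], y_aug[0], y_bench[0]))
```

Because the series are in logarithms, an absolute error of 0.2 corresponds roughly to a 20% error on the untransformed arrivals.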
We collect the GT variables raising alternative samples of search-

| Forecasting.2
Given the seasonal nature of arrivals in the province, we propose further considerations.In particular, we implement a recursive procedure to forecast h = 1, …, 12 horizons, starting from the estimation sample January 2009-December 2014, and we focus on the predicted values for summer (June, July, and August), differentiating by the horizon at which these months appear as forecasts.Even if the number of training sets, and hence forecasts, is small, several results and implications can be derived.
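The bookkeeping behind this recursive scheme, namely which horizon a given summer month corresponds to for each training sample, can be sketched as follows. It is assumed here that the estimation sample is extended by one month at each step; the helper names are hypothetical.

```python
def add_months(year, month, k):
    """Shift a (year, month) pair forward by k months."""
    index = year * 12 + (month - 1) + k
    return index // 12, index % 12 + 1

def summer_targets(train_end, horizons=range(1, 13), summer=(6, 7, 8)):
    """List (horizon, target year, target month) for summer months only."""
    hits = []
    for h in horizons:
        year, month = add_months(train_end[0], train_end[1], h)
        if month in summer:
            hits.append((h, year, month))
    return hits

# Training sample ending in December 2014, as in the first estimation sample
print(summer_targets((2014, 12)))
# Assumed next step of the recursion: extend the sample by one month
print(summer_targets(add_months(2014, 12, 1)))
```

With the sample ending in December 2014, June to August 2015 appear at horizons 6 to 8; after one more month of data, the same months appear at horizons 5 to 7.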
For each month, Table 4 reports the ratio $R_{T+h}$ of Equation (5) for the baseline and the augmented models and the mean of the 12 horizons' absolute errors for all rival specifications; that is, MAE_b is the mean of the absolute errors with the benchmark panel specification; MAE_a refers to the absolute errors of the augmented models; MAE_RW is calculated for the forecasts obtained through the specification in Equation (3); MAE_AR+GT for Equation (4); and MAE_N and MAE_ETS for the naïve and exponential smoothing models, respectively.
If we compare MAEs in Table 3 and MAEs in Table 4, we see that both the baseline and the augmented model provide forecasts that are more accurate in summer than in the other periods of the year.
Referring to the $R_{T+h}$ quantities, most of the ratios in June and July are lower than 1, confirming the forecasting ability of the augmented model.
Focusing on each destination, the main advantages including GT

| Nowcasting.1
In the nowcasting application, we focus on the predictions of arrivals in May, June, July, August, and September 2016, only when these months appear as h = 8, 10, 12, and 15, for the best four rival models identified in Table 4: the benchmark panel specification, the augmented model, the random walk specification, and the autoregressive (AR) model augmented by GT. In Table 5, we show the MAE and the RMSE indices of the 5-month forecasts evaluated at each horizon.
For h = 8, we confirm an enhancement of the results for all the destinations after introducing the GT variables. In the cases of h = 10 and h = 12, only Misano exhibits better results with the baseline model. For the largest horizon, h = 15, the performance is reversed: all MAEs of the benchmark are lower than those of the augmented specification, except for Misano and Bellaria.
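To make the timing of the nowcasting scheme concrete, the sketch below computes, for a target month and a horizon h, the last month of official data available and the month of the lag-12 GT value. It assumes that h counts the months between the last official observation and the target, and that GT data are available in near real time; the helper names are hypothetical.

```python
def months_back(year, month, k):
    """Shift a (year, month) pair backward by k months."""
    index = year * 12 + (month - 1) - k
    return index // 12, index % 12 + 1

def information_set(target, horizon, gt_lag=12):
    """Latest official month and the GT month used for a given target month."""
    last_official = months_back(target[0], target[1], horizon)
    gt_month = months_back(target[0], target[1], gt_lag)
    return last_official, gt_month

# Target month July 2016 at the four horizons considered in the text
for h in (8, 10, 12, 15):
    print(h, information_set((2016, 7), h))
```

For July 2016 at h = 15, for example, the last official observation under this reading is April 2015, whereas the GT regressor refers to July 2015.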

| Nowcasting.2
We show results for summer in Table 6, through $R_{T+h}$ ratios for the best four rival models identified in […] situation is represented by h = 15. If we consider the univariate time-series models, the predictive power of the augmented panel specification is not so clear. However, we find that the enhancement in accuracy is obtained for destinations representative of similar products and with the weakest tourist flows among the six territories: Cattolica, Misano, and Bellaria. As for the forecasting results, the summer evaluation in Table 6 shows errors that are quite different from the mean obtained in Table 5, confirming the ability of the model, and therefore of the GT data, to capture the summer dynamic. Even if results differ across municipalities and forecast horizons (Witt & Witt, 1995), when focusing on weak summer tourism flows and adopting a more realistic forecasting scheme, the augmented panel data model proposed in this paper provides more accurate forecasts than the baseline counterpart and four rival univariate specifications, which underscores the importance of jointly modelling weak and strong tourism demand for neighbouring destinations at monthly frequency.
Despite the advantages identified with this approach, limitations and further developments can be illustrated.
, with the aim of anticipating variables characterized by low-frequency (e.g., quarterly or yearly data) release. Only recently have the aspects of real-time adjustment and data realization entered the tourism framework. This literature provides a set of applications to evaluate nowcast predictions, where time-series models have been augmented by search-volume indices. However, there are a limited number of contributions that address the specific problems related to delay in publication and data vintages. In particular, Camacho and Pacce (2017) considered a dynamic factor model for a real-time database of vintages, finding that the models augmented by Google indices outperform the benchmark models. The authors, in collaboration with Google, built a collection of search-volume index series, tracing the evolution of different tourism queries in real time. They evaluated their nowcasting model referring to short horizons, ranging from 1 to 3 months. In this way, the authors did not address the problems caused by longer publication delays. The research strategy proposed below fills this gap, allowing for different aspects of real-time analysis for local public and private agents: (a) high spatial and frequency detail, (b) a related long delay in publication of official data, and (c) possible substitutive and complementary effects caused by multiproducts for spatially close destinations.
3.2 | The data
The geographical position and layout (see Figure 1) are defining features of the area around Rimini. The largest municipalities are along the Adriatic Sea coast, and the inland areas are rich in historical attractions, including Roman relics and Mediaeval castles. Rimini offers a wide set of attractions and tourism products over an area of only 863.6 km². As a result, tourism plays a significant role in the economic balance of the province, accounting for 30% of the total product of the whole area, and the province has the second largest tourism GDP per capita among Italian provinces. In 2016, the international tourism demand in Rimini provided €579 million, with monthly tourism arrivals (total, as the sum of international and Italian) ranging from 55,610 in February to 764,443 in August. Data on monthly tourism arrivals are obtained from the Emilia Romagna region website (https://www.regione.emilia-romagna.it) for the interval January 2009-December 2016. Specifically, we use arrivals as a proxy for the Rimini Province tourism demand. The highest spatial detail for which data are available is defined by six destinations across the territory: Riccione, Rimini, Cattolica, Misano, Bellaria, and the inland areas of the province. In contrast to the first five seaside locations, the inland area has more than one municipality, each offering different magnitudes of attractions but with similar characteristics of tourism supply. As seen in Figure 2, tourism arrivals (solid lines) at each destination show strong summer flows and a weaker flow in the winter.

Findings on Rimini and Riccione are very similar, especially in the coefficient of variation and the Doornik-Hansen normality test. Moreover, we note that the coastal destinations of Cattolica, Misano, and Bellaria are defined by the weakest tourist flows and show close coefficient of variation and Doornik-Hansen normality values. The inland area presents different results from the other territories, also with respect to the time at which the maximum appears (August 2016).

FIGURE 1 Province of Rimini
FIGURE 2 Arrivals and GT data for the six destinations, from January 2009 to December 2016

we show baseline and augmented model estimates and tests on linear restrictions and serial correlation for two specific training sets: the first from January 2009 to December 2014 and the last from January 2009 to December 2015 (from left to right). In particular, we refer to the Wald test for $H_{0,b}: \rho_1 = \rho_{12}$ for baseline models

schemes: the first one general, to evaluate the accuracy of the whole yearly performance of each destination; and the second, the prediction of summer months when they appear as 1, 2, …, 12 steps ahead. For the nowcasting scheme, we first focus on the predictive capability of the model, measuring MAEs and RMSEs for the period May-September 2016; we then investigate the nowcasting performance of the rivals in the summer months of June, July, and August when they appear at horizons h = 8, 10, 12, and 15.

From a practitioner point of view, it is extremely important for public and private agents operating at different spatial detail to understand, model, and accurately predict the number of tourists coming to the destination. The aims are to prevent and cope with low-level demand conditions (e.g., in winter months and local inefficiency of smaller destinations, the latter often characterized by a low level of attractiveness) while planning and programming both an adequate number of available accommodations and employees. Accurately predicting the municipality-specific demand provides the opportunity to track the performance of competing destinations (as in the case of h = 15 for Cattolica, Misano, and Bellaria), the opportunity for minor destinations (inland areas and Misano) to monitor the role of events in neighbouring municipalities in the off-peak as well as peak season, and the opportunity to detect the impact of shocks at different microeconomic levels on tourist arrivals in real time through contractions in search volumes. These issues also suggest that access to tourism destinations could be improved by means of greater visibility and a more accurate and proper marketing strategy on the Internet, especially for small isolated municipalities with small magnitude attractions (e.g., inland areas of the province). All these practical aspects require timely predictions of tourist flows that deal with data availability and publication delays of official data. Despite their importance, both availability and delay are usually disregarded in the tourism literature, often leading researchers to highlight the accuracy of inappropriate and almost unrealistic forecasting schemes.

5 | CONCLUSIONS AND IMPLICATIONS

Short- and long-term predictions provide answers to different questions of economic agents. Frechtling (2001) summarized the relationship between tourism managerial requirements and the temporal intervals to adopt efficient and effective economic, managerial, and marketing actions. For local destinations, the gap in official data, which affects both short- and long-term predictions, becomes dramatic, strongly undermining the efficiency of the decision-making processes. However, small towns where the tourism industry is at the core of the local economy may benefit from an accurate analysis of the dynamics of the tourism flows to a larger extent than the metropolitan areas. The aim of this paper is to respond to this shortcoming by combining relevant concepts introduced (separately) in the tourism framework over the past decade. First, the use of a panel specification to model and forecast tourism demand of both complementary and substitutive neighbourhoods. Second, the use of GT indices to enhance the predictive capability of forecasting and nowcasting models. This strand of statistical and econometric literature has been recently introduced to define models able to predict the present, the very near future, and the very recent past. Researchers have focused either on building specific instruments (for a review, see Banbura, Giannone, & Reichlin (2011)) able to deal with mixed frequency data or on augmenting forecasting models with ex ante measures of the variables of interest. Following the latter direction, the tourism literature focuses on using advanced techniques to deal with models augmented by online search data, usually avoiding important aspects of forecasting and nowcasting approaches. The third and fourth relevant concepts are found in two important and ignored problems of local economic agents: the availability of official data and the spatial detail of the quantities involved in the analysis.
First, the combination of temporal availability and the model specification: due to the size of the data set together with the number of parameters in the panel specification, we avoid increasing the number of training sets, so as to preserve the general advantages of panel-data techniques. Further developments should consider increasing the temporal dimension and/or the number of destinations. Second, given the geographic proximity of the analysed municipalities, the analysis would be improved by enlarging the data set from a spatial point of view (see endnote 3), specifically by increasing the number of neighbouring territories and providing a spatiotemporal panel data investigation.

ORCID
Silvia Emili https://orcid.org/0000-0002-7086-0650

ENDNOTES

Table 3
Mean absolute error and root mean square error indexes calculated on the forecasting periods January-December 2015 and January-December 2016

Comparison of the summer forecasting performances of the benchmark and the augmented model through $R_{T+h}$, and mean absolute error values for the six rival models

from 0.1134 to 0.2875 for the augmented model; benchmark RMSEs vary between 0.1686 and 0.4509 and 0.1407 and 0.4382 for the augmented specification), the augmented model generally produces smaller errors than the baselines (in bold). For the forecasting period January-December 2016 in particular (fourth and fifth columns), we observe five augmented model MAEs and all the RMSEs, lower than

variables can be noted for the inland area and Rimini, especially in July and August. A different result is obtained for Cattolica (for which only 12 $R_{T+h}$ ratios over 36 are lower than 1) and for Misano, the only bathing destination close to Cattolica in the province. The worst ratios are shown for Riccione when August represents horizons h = 11 and h = 12. However, if we observe the terms of these relative measures, we show that the $R_{T+h}$ are given by