Introduction

In fast-paced restaurant environments, efficient inventory planning is essential to reduce waste, avoid shortages, and maintain consistent customer satisfaction. One of the key challenges lies in accurately forecasting ingredient-level demand, particularly for fresh and perishable items such as lettuce.

This project focuses on forecasting short-term daily demand for lettuce at four locations of a fast-food restaurant chain, using internal transactional and recipe data. By merging point-of-sale (POS) data with hierarchical recipe and ingredient information, we compute historical lettuce usage per location and develop time series forecasting models to predict future demand over a 14-day horizon.

The objective is to provide accurate, store-specific forecasts that can be used by managers and supply chain teams to optimize purchasing and reduce spoilage. This involves building a robust data pipeline, applying classical time series models (ETS and ARIMA), and selecting the best-performing approach for each location. The analysis draws on real-world data from restaurants operating in Berkeley and New York.

Data Import

To prepare the dataset for analysis, we loaded multiple data sources containing key information about menu items, recipes, ingredients, transactions, and store details. The datasets were read from CSV and Excel files using read_csv() and read_excel(), ensuring they are structured correctly for further processing.

Data Wrangling & Feature Engineering

# All Ingredient IDs containing 'Lettuce'
lettuce_ingredients <- ingredients %>%
  filter(grepl("Lettuce", IngredientName, ignore.case = TRUE)) 

lettuce_ingredients %>%
  knitr::kable(format = "html") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"), 
                full_width = FALSE, font_size = 12)
IngredientName IngredientShortDescription IngredientId PortionUOMTypeId
Lettuce Lettuce 27 15
Lettuce - Metric Lettuce - Metric 291 13

To accurately track lettuce demand, we first identified all ingredient entries related to lettuce from the ingredients dataset. Using a case-insensitive string-matching approach (grepl()), we filtered for ingredient names containing “Lettuce,” ensuring we captured all relevant entries. The dataset contains two distinct lettuce entries: IngredientId 27 (“Lettuce”), measured in ounces (PortionUOMTypeId = 15), and IngredientId 291 (“Lettuce - Metric”), measured in grams (PortionUOMTypeId = 13).

# Conversion factor (1 ounce = 28.35 grams)
gram_to_ounce <- 1 / 28.35

# Recipes with Lettuce
lettuce_recipes <- recipe_ingredient_assignments %>%
  filter(IngredientId %in% c(27, 291)) %>%
  left_join(ingredients %>% select(IngredientId, PortionUOMTypeId), by = "IngredientId") %>%
  mutate(Quantity_Ounces = ifelse(PortionUOMTypeId == 13, Quantity * gram_to_ounce, Quantity)) %>%
  select(RecipeId, IngredientId, Quantity_Ounces)

# Sub-Recipes with Lettuce
lettuce_sub_recipes <- sub_recipe_ingr_assignments %>%
  filter(IngredientId %in% c(27, 291)) %>%
  left_join(ingredients %>% select(IngredientId, PortionUOMTypeId), by = "IngredientId") %>%
  mutate(Quantity_Ounces = ifelse(PortionUOMTypeId == 13, Quantity * gram_to_ounce, Quantity)) %>%
  select(SubRecipeId, IngredientId, Quantity_Ounces)
RecipeId IngredientId Quantity_Ounces
687 27 0
688 27 0
689 27 0
690 27 0
959 27 12
960 27 24
1044 27 10
1220 27 5
1778 27 5
1815 27 1
SubRecipeId IngredientId Quantity_Ounces
5 27 4.000000
10 27 2.000000
41 291 3.527337
42 291 1.516755
43 291 1.516755
49 27 2.000000
50 27 0.000000
51 27 2.000000
58 27 4.000000
73 27 2.000000

To standardize lettuce measurements across all recipes, we extracted and converted ingredient quantities into ounces. Since lettuce is recorded in both grams (PortionUOMTypeId = 13) and ounces (PortionUOMTypeId = 15), we applied a conversion factor (1 ounce = 28.35 grams) to ensure consistency.

We first identified recipes that contain lettuce (IngredientId = 27, 291) from recipe_ingredient_assignments and sub_recipe_ingr_assignments, with a total of 56 and 10 recipes, respectively. These datasets were joined with the ingredients table to retrieve their unit of measurement, allowing us to convert all quantities to ounces.

lettuce_recipes_from_subs <- recipe_sub_recipe_assignments %>%
  inner_join(lettuce_sub_recipes, by = "SubRecipeId") %>%
  mutate(Quantity_Ounces = Quantity_Ounces * Factor) %>%  
  select(RecipeId, IngredientId, Quantity_Ounces)

datatable(lettuce_recipes_from_subs, options = list(pageLength = 10, scrollX = TRUE))

Next, we linked sub-recipes to their respective main recipes using the recipe_sub_recipe_assignments dataset. This step is essential because some recipes do not directly list lettuce as an ingredient but instead include sub-recipes that contain lettuce.

By performing an inner_join() between recipe_sub_recipe_assignments and lettuce_sub_recipes, we identified which main recipes inherit lettuce from their sub-recipes. The Quantity of Lettuce (in Ounces) was then adjusted by multiplying the amount of lettuce in the sub-recipe (Quantity_Ounces) by the Factor, which represents how much of the sub-recipe is used in the main recipe.

# Direct and Sub-Recipe Lettuce Usage
all_lettuce_recipes <- bind_rows(lettuce_recipes, lettuce_recipes_from_subs)

datatable(all_lettuce_recipes, options = list(pageLength = 10, scrollX = TRUE))

We combined direct lettuce usage from recipes with lettuce usage inherited from sub-recipes. This was achieved by binding the two datasets—lettuce_recipes, which includes recipes that directly contain lettuce, and lettuce_recipes_from_subs, which accounts for lettuce inherited through sub-recipes. The result is a unified dataset, all_lettuce_recipes, mapping each RecipeId to its total lettuce quantity in ounces.

The dataset confirms that some recipes use zero ounces of lettuce, indicating that lettuce may have been removed from their formulations. Meanwhile, other recipes show varying levels of lettuce usage, with some containing significant amounts (e.g., 12, 24 ounces, etc.). This step ensures that all lettuce contributions—whether direct or indirect—are accounted for, providing a complete basis for calculating lettuce demand per store.

# Menu Items Containing These Recipes
lettuce_menu_items <- menu_items %>%
  inner_join(all_lettuce_recipes, by = "RecipeId") 

datatable(lettuce_menu_items, options = list(pageLength = 10, scrollX = TRUE))

To connect lettuce consumption to specific menu items, we linked all recipes containing lettuce to their corresponding menu items using the menu_items dataset. This was done by performing an inner_join() between all_lettuce_recipes and menu_items based on RecipeId. The resulting dataset also shows that in some cases, menu items have variations in lettuce quantities.

lettuce_sales <- menuitem %>%
  inner_join(lettuce_menu_items, by = c("PLU" = "PLU", "Id" = "MenuItemId")) %>%
  mutate(Lettuce_Used = Quantity * Quantity_Ounces) %>%
  group_by(StoreNumber, date) %>%
  summarise(Daily_Lettuce = sum(Lettuce_Used, na.rm = TRUE), .groups = "drop") %>%
  mutate(date = as.Date(paste0("20", substr(date, 1, 8)), format = "%Y-%m-%d")) %>% 
  complete(date = seq(min(date, na.rm = TRUE), max(date, na.rm = TRUE), by = "day"), 
           StoreNumber, 
           fill = list(Daily_Lettuce = NA)) %>%
  pivot_wider(names_from = StoreNumber, values_from = Daily_Lettuce, values_fill = 0)
date 4904 12631 20974 46673
2015-03-05 NA 232 NA 152
2015-03-06 NA 227 4 100
2015-03-07 NA 181 NA 54
2015-03-08 NA 221 NA 199
2015-03-09 NA 229 NA 166
2015-03-10 NA 237 0 143
2015-03-11 NA 235 NA 162
2015-03-12 NA 244 NA 116
2015-03-13 176 218 29 136
2015-03-14 182 205 NA 68
date 4904 12631 20974 46673
2015-06-06 230 245 161 86
2015-06-07 416 282 200 244
2015-06-08 318 230 224 173
2015-06-09 352 294 210 127
2015-06-10 298 235 256 124
2015-06-11 256 376 386 151
2015-06-12 223 366 170 134
2015-06-13 216 229 206 96
2015-06-14 350 234 167 138
2015-06-15 338 232 266 244

To quantify lettuce demand at the store level, we linked menu item sales data (menuitem) with lettuce usage per menu item (lettuce_menu_items). The join was performed using both PLU and MenuItemId to ensure accurate mapping of menu items to their respective recipes.

Next, we calculated lettuce consumption per transaction by multiplying the quantity sold by the lettuce usage per item (Quantity_Ounces). We then aggregated the data by store and date, summing up the total lettuce used (Daily_Lettuce).

To ensure that all dates are represented, we used complete(), filling in missing dates while maintaining store-level granularity. Finally, the dataset was reshaped using pivot_wider(), structuring the output so that each store is represented as a column, each row corresponds to a date, and the values reflect daily lettuce demand per store. This final transformation ensures a structured format for time-series forecasting.

Modeling Approach

Holt-Winter’s Model Application

The Holt-Winters Model will be applied to forecast lettuce demand over the next two weeks. This method extends exponential smoothing by incorporating adjustments for trend and seasonality, allowing for more accurate short-term forecasting. It assigns exponentially decreasing weights to past observations, giving more importance to recent data points.

By leveraging this approach, the model can effectively capture underlying patterns in the lettuce demand data, making it well-suited for forecasting daily sales trends. However, before applying Holt-Winters’ model, we need to convert lettuce demand into time-series objects.

start_date <- min(lettuce_sales$date)  
end_date <- max(lettuce_sales$date)   

start_week <- as.numeric(format(start_date, "%W")) + 1  

store_ids <- names(lettuce_sales)[-1]

lettuce_ts_list <- lapply(store_ids, function(store) {
  ts(lettuce_sales[[store]], 
     start = start_week,  
     frequency = 7)  
})

names(lettuce_ts_list) <- store_ids

First, the earliest and latest dates from the dataset are extracted to determine the start and end points of the time series.

Next, the daily lettuce demand for each store is transformed into a time-series object. The ts() function is used, where:

  • start = start_week, which ensures that the time series begins at the correct week of the year.
  • frequency = 7, indicating that the data follows a weekly pattern, effectively capturing recurring trends within each week.
lettuce_ts_4904 <- lettuce_ts_list[["4904"]]
lettuce_ts_12631 <- lettuce_ts_list[["12631"]]
lettuce_ts_20974 <- lettuce_ts_list[["20974"]]
lettuce_ts_46673 <- lettuce_ts_list[["46673"]]

Once all store data is converted into time-series format, store identifiers are assigned to each respective time-series object, allowing for easy reference. This step ensures that each store’s lettuce demand is correctly structured for forecasting and further time-series analysis.

The plot above visualizes the daily lettuce demand for four different restaurants from early March to mid-June 2015. Each store is represented by a distinct color, allowing for a direct comparison of lettuce usage trends across locations. The x-axis represents the timeline, while the y-axis indicates the demand in ounces.

There are noticeable fluctuations in lettuce demand across all stores. Some restaurants experience more pronounced variations, while others show just a bit more stable patterns. This suggests that different locations may face varying levels of customer traffic or operational influences affecting ingredient usage. However, one common pattern across all restaurants is that whenever there is a significant drop in demand, it relatively occurs simultaneously across all locations.

The previous visualization displayed the lettuce demand for all four restaurants in a single plot, allowing for an overall comparison of demand trends. However, to better understand the unique patterns of each restaurant, this faceted plot presents the time series data separately for each store. By visualizing them individually, we can identify store-specific variations that may have been harder to distinguish in the combined graph.

This breakdown reveals distinct demand patterns for each location. Store 20974 exhibits relatively unpredictable fluctuations with an unclear trend over time. In contrast, Store 12631 initially shows more stable fluctuations but transitions to more varied demand patterns as time progresses. Notably, it appears to be the only store with a slight upward trend, whereas the others maintain a relatively stable trajectory. Both Store 46673 and Store 4904 display high variability, suggesting irregular consumption patterns. However, Store 4904 has a lower average demand compared to the others.

Despite these differences, all stores share a common trend of periodic spikes and dips, suggesting the presence of weekly seasonality or other recurring demand cycles. These insights will guide the next steps in our analysis as we move forward with time series decomposition and the application of the Holt-Winters forecasting model.

Restaurant New York 12631

Let’s start with one of the restaurants in New York.

We present the STL decomposition (Seasonal-Trend decomposition using LOESS) of weekly lettuce demand for Restaurant 12631 that reveals key patterns in its consumption.

The trend component shows a gradual increase, followed by stabilization with minor fluctuations. The seasonal component displays a strong and consistent weekly cycle, reinforcing the influence of operational patterns on lettuce usage. However, the large bar for the seasonality factor suggests that its contribution to the overall variation in lettuce demand is relatively small. The remainder component exhibits minor fluctuations with occasional irregular spikes. Given these characteristics, we can expect an ETS(A, A, N) model—where A represents an additive error component, A denotes an additive trend, and N indicates no seasonal component—to best describe the data. However, we will determine the optimal model selection in the next section.

Model Selection

split_index <- length(lettuce_ts_12631) - 14

train_12631 <- window(lettuce_ts_12631, end = time(lettuce_ts_12631)[split_index])
test_12631 <- window(lettuce_ts_12631, start = time(lettuce_ts_12631)[split_index + 1])

The split index is determined by subtracting 14 days (the forecast horizon) from the total number of observations, ensuring that the last two weeks of data are reserved for testing. This choice is made because we need to forecast demand for the next 14 days, and we aim to achieve the highest possible accuracy by utilizing all available historical data. As a result, the training set consists of all observations up to the split index, while the test set includes the final 14 days. This approach ensures that the model is trained on a comprehensive dataset while allowing for an unbiased evaluation of its forecasting performance.

ets_model_12631 <- ets(train_12631, model = "ZZZ", lambda = "auto", biasadj = TRUE)
ets_model_12631_aan <- ets(train_12631, model = "AAN", lambda = "auto", biasadj = TRUE)

A Holt-Winters model will now be applied using R’s ets() function, which requires three parameters specifying the type of error, trend, and seasonality. We fit two ETS (Error-Trend-Seasonality) models on the training dataset for restaurant 12631 to evaluate and compare different forecasting approaches.

The first model, ets_model_12631, is specified with "ZZZ", which allows the function to automatically determine the optimal ETS configuration based on the data. Additionally, a Box-Cox transformation (lambda = "auto") is applied to stabilize variance, and bias adjustment (biasadj = TRUE) is enabled to improve forecast accuracy.

The second model, ets_model_12631_aan, is explicitly set to ETS(A, A, N) as this configuration was anticipated from the STL decomposition, which suggested a significant trend component but no strong seasonal pattern. In this configuration, the model assumes an additive error (A), an additive trend (A), and no seasonality (N).

By fitting both models, we can compare their performance and assess whether the automatically selected ETS model or the specified AAN model provides better forecasting accuracy for daily lettuce demand.

ets_model_12631$components[1:3]
## [1] "A" "N" "A"
ets_forecast_12631 <- forecast(ets_model_12631, h = length(test_12631))
ets_forecast_12631_aan <- forecast(ets_model_12631_aan, h = length(test_12631))

ets_test_acc_12631 <- accuracy(ets_forecast_12631, test_12631)
ets_test_acc_12631_aan <- accuracy(ets_forecast_12631_aan, test_12631)
ME RMSE MAE MPE MAPE MASE ACF1 Theil’s U
Training set 3.986680 35.66635 27.53202 -0.0548485 10.39996 0.7445996 0.0469063 NA
Test set -9.920354 40.22329 33.63813 -5.5924766 12.59408 0.9097382 0.0735180 0.6530292
ME RMSE MAE MPE MAPE MASE ACF1 Theil’s U
Training set -7.596097 43.90831 36.78783 -5.4023003 14.66089 0.9949216 0.1245670 NA
Test set 7.895533 49.47119 35.91927 0.3598508 12.24663 0.9714315 0.1680072 0.7865137

The forecast accuracy metrics compare two ETS models—the automatically selected ETS(A, N, A) and the manually specified ETS(A, A, N)—for restaurant 12631. Various accuracy measures were used to assess their performance, including Mean Error (ME), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Mean Percentage Error (MPE), Mean Absolute Percentage Error (MAPE), Mean Absolute Scaled Error (MASE), Autocorrelation at Lag 1 (ACF1), and Theil’s U Statistic. Each of these metrics provides insight into the model’s accuracy and predictive reliability. The ETS(A, N, A) model, which includes additive error, no trend, and additive seasonality, outperforms ETS(A, A, N) in key metrics.

ETS(A, N, A) has a lower RMSE (40.22 vs. 49.47) and MAE (33.64 vs. 35.91) on the test set, indicating better predictive accuracy. Additionally, Theil’s U statistic (0.6530 vs. 0.7865) further confirms the superiority of ETS(A, N, A) over ETS(A, A, N).

Given these results, ETS(A, N, A) is the better model, highlighting that capturing seasonality is more beneficial than modeling a trend for forecasting daily lettuce demand. However, to ensure the model’s reliability, we must now examine the behavior of its residuals.

checkresiduals(ets_model_12631)

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(A,N,A)
## Q* = 8.3081, df = 14, p-value = 0.8727
## 
## Model df: 0.   Total lags used: 14

The residual analysis suggests that the ETS(A,N,A) model performs well, with no significant autocorrelation or systematic patterns remaining in the errors.

The Ljung-Box test yields a high p-value (0.8727), indicating that we fail to reject the null hypothesis. This suggests that the residuals are uncorrelated and resemble white noise, meaning the model has effectively captured the data’s structure.

Examining the time series plot of residuals, we see that they fluctuate randomly around zero, with no visible trends or systematic patterns. This randomness reinforces that the model is not missing key patterns in the data.

The ACF plot further supports this, as most autocorrelations remain within the confidence bounds, confirming the absence of strong dependencies in the residuals. Additionally, the histogram of residuals follows an approximately normal distribution, suggesting that the error terms align well with normality assumptions.

Overall, these diagnostics confirm that the ETS(A,N,A) model is well-specified, with residuals exhibiting desirable properties. This makes it a reliable choice for forecasting daily lettuce demand.

Demand Forecasting

We perform the final demand forecasting for Restaurant 12631 using the ETS(A,N,A) model, as it was previously identified as the most suitable based on forecasting accuracy comparisons.

First, the ETS model is fitted to the full dataset (lettuce_ts_12631) to ensure it learns from all available data before generating forecasts. Then, a 14-day forecast is computed using the forecast() function.

ets_model_12631_final <- ets(lettuce_ts_12631, model = "ANA", lambda = "auto", biasadj = TRUE)
ets_forecast_12631_final <- forecast(ets_model_12631_final, h = 14)

forecast_dates <- seq(from = as.Date("2025-06-16"), by = "day", length.out = 14)
ets_forecast_12631_table <- data.frame(
  Date = forecast_dates,
  Forecast = ets_forecast_12631_final$mean
)
Date Forecast
2025-06-16 266.7777
2025-06-17 279.3228
2025-06-18 307.8626
2025-06-19 305.8711
2025-06-20 234.1088
2025-06-21 267.8582
2025-06-22 239.9318
2025-06-23 267.8663
2025-06-24 280.5532
2025-06-25 309.4576
2025-06-26 307.4355
2025-06-27 234.8684
2025-06-28 268.9501
2025-06-29 240.7416

This plot presents the ETS (A,N,A) Model Daily Demand Forecast for Restaurant 12631, displaying historical sales data along with a 14-day forecast. The black line represents actual sales trends leading up to the forecast period, while the blue shaded areas indicate prediction intervals.

The forecasted demand follows the observed trend, maintaining fluctuations similar to historical data. The blue shaded regions illustrate the 80% and 95% confidence intervals, showing increasing uncertainty as the forecast extends further into the future. The widening bands suggest that variability in demand is expected, which aligns with the historical volatility observed in sales.

Overall, this visualization helps assess the reliability of the forecast, providing insights into potential fluctuations and uncertainty in lettuce demand for Restaurant 12631 over the next two weeks.

Restaurant New York 20974

# Extracting longest ts 
valid_indices_20974 <- which(!is.na(lettuce_ts_20974))

breaks_20974 <- c(0, which(diff(valid_indices_20974) > 1), length(valid_indices_20974))

longest_seq_start_20974 <- breaks_20974[which.max(diff(breaks_20974))] + 1
longest_seq_end_20974 <- breaks_20974[which.max(diff(breaks_20974)) + 1]

start_index_20974 <- valid_indices_20974[longest_seq_start_20974]
end_index_20974 <- valid_indices_20974[longest_seq_end_20974]

lettuce_longest_ts_20974 <- window(lettuce_ts_20974, start = c(10, start_index_20974), end = c(10, end_index_20974))

For restaurant 20974, extracting the longest continuous time series is crucial to ensuring a reliable STL decomposition, given the presence of missing values in the data, as shown above.

The STL decomposition of lettuce demand for Restaurant 12631 suggests that an ETS(A,N,A) model might be the most appropriate configuration, which incorporates additive error, no trend, and additive seasonality. The trend component exhibits fluctuations over time, indicating periods of increasing and decreasing demand. However, it has the largest bar on the right-hand side, signifying that its contribution to the variation in lettuce demand is very small—almost nonexistent. This suggests that a nonexistent or negligible trend (N) should be used in the ETS model.

The seasonal component, on the other hand, shows a clear and strong weekly pattern, reinforcing the idea that lettuce demand follows a cyclical behavior, likely influenced by weekly restaurant operations. The remainder component captures the unexplained variations, which exhibit relatively small fluctuations, indicating that most of the variability in the data is effectively accounted for.

Model Selection

Now, we follow the same steps as before.

split_index <- length(lettuce_ts_20974) - 14

train_20974 <- window(lettuce_ts_20974, end = time(lettuce_ts_20974)[split_index])
test_20974 <- window(lettuce_ts_20974, start = time(lettuce_ts_20974)[split_index + 1])
ets_model_20974 <- ets(train_20974, model = "ZZZ")
ets_model_20974$components[1:3]
## [1] "A" "N" "A"
ets_forecast_20974 <- forecast(ets_model_20974, h = length(test_20974))

ets_test_acc_20974 <- accuracy(ets_forecast_20974, test_20974)
ME RMSE MAE MPE MAPE MASE ACF1 Theil’s U
Training set -0.9560356 40.69057 33.66340 -13.46909 26.18756 0.7180669 0.0861895 NA
Test set 27.2057558 55.99135 33.87778 9.46673 13.38155 0.7226398 -0.2949359 0.8497462

The automated selection process identified ETS(A,N,A) as the best-fitting model, aligning with our expectations.

checkresiduals(ets_model_20974)

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(A,N,A)
## Q* = 13.73, df = 14, p-value = 0.47
## 
## Model df: 0.   Total lags used: 14

The Ljung-Box test (p-value = 0.47) suggests no significant autocorrelation in the residuals. The ACF plot confirms this, with most lags within the confidence bounds. The residual histogram appears approximately normal, and the time series plot shows random fluctuations, indicating that the ETS(A,N,A) model is well-specified with no major issues.

Demand Forecasting

After evaluating multiple ETS model configurations, we selected the ETS(A, N, A) model for Restaurant 20974. This decision was based on the model’s ability to effectively capture the data’s underlying trend and error structure while omitting an unnecessary seasonal component.

We apply this final model to the full dataset and generate a 14-day forecast. This ensures that the prediction incorporates all available historical data, improving forecast accuracy.

ets_model_20974_final <- ets(lettuce_ts_20974, model = "ANA")
ets_forecast_20974_final <- forecast(ets_model_20974_final, h = 14)

forecast_dates <- seq(from = as.Date("2025-06-16"), by = "day", length.out = 14)
ets_forecast_20974_table <- data.frame(
  Date = forecast_dates,
  Forecast = ets_forecast_20974_final$mean  
)
Date Forecast
2025-06-16 245.3543
2025-06-17 259.3771
2025-06-18 245.3595
2025-06-19 207.5218
2025-06-20 163.0179
2025-06-21 224.4164
2025-06-22 244.1267
2025-06-23 245.3543
2025-06-24 259.3771
2025-06-25 245.3595
2025-06-26 207.5218
2025-06-27 163.0179
2025-06-28 224.4164
2025-06-29 244.1267

Restaurant California 46673

The STL decomposition for store 46673 suggests that the trend component has minimal influence on fluctuations in lettuce demand, as indicated by its relatively large bar. In contrast, the seasonal component appears to be a key driver of variation, given its smaller bar. Additionally, the consistent variance in both the seasonal and remainder components implies that an additive structure is more appropriate than a multiplicative one. Based on these insights, we will begin by fitting an ETS(A, N, A) model, which assumes additive error and trend with no seasonality. This model will later be evaluated against an automatically selected ETS configuration.

Model Selection

split_index <- length(lettuce_ts_46673) - 14

train_46673 <- window(lettuce_ts_46673, end = time(lettuce_ts_46673)[split_index])
test_46673 <- window(lettuce_ts_46673, start = time(lettuce_ts_46673)[split_index + 1])
ets_model_46673 <- ets(train_46673, model = "ZZZ", lambda = "auto", biasadj = TRUE)
ets_model_46673$components[1:3]
## [1] "A" "N" "A"
ets_model_46673_ana <- ets(train_46673, model = "ANA")

The automated model selection process validates that the ETS(A, N, A) configuration is the most suitable choice.

ets_forecast_46673 <- forecast(ets_model_46673, h = length(test_46673))
ets_forecast_46673_ana <- forecast(ets_model_46673_ana, h = length(test_46673))

ets_test_acc_46673 <- accuracy(ets_forecast_46673, test_46673)
ets_test_acc_46673_ana <- accuracy(ets_forecast_46673_ana, test_46673)
ME RMSE MAE MPE MAPE MASE ACF1 Theil’s U
Training set -1.197518 23.23844 18.40454 -3.814912 14.15808 0.6919636 0.0857551 NA
Test set 12.932897 38.32624 27.62740 6.297726 17.57719 1.0387192 0.0466446 0.6071629
ME RMSE MAE MPE MAPE MASE ACF1 Theil’s U
Training set -1.014638 23.09898 18.47706 -3.708049 14.26531 0.6946899 0.0879864 NA
Test set 12.025567 38.53718 28.55525 5.920676 18.50333 1.0736040 0.0779578 0.608142
checkresiduals(ets_model_46673)

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(A,N,A)
## Q* = 10.221, df = 14, p-value = 0.7458
## 
## Model df: 0.   Total lags used: 14

Again, the Ljung-Box test (p-value = 0.7458) indicates no significant autocorrelation, suggesting the residuals behave like white noise. The ACF plot supports this, with most lags within the confidence bounds. The residual histogram appears approximately normal, and the time series plot shows no clear patterns, confirming that the ETS(A,N,A) model is well-specified and suitable for forecasting.

Demand Forecasting

We now generate a 14-day forecast using our selected model.

ets_model_46673_final <- ets(lettuce_ts_46673, model = "ANA", lambda = "auto", biasadj = TRUE)
ets_forecast_46673_final <- forecast(ets_model_46673_final, h = 14)

forecast_dates <- seq(from = as.Date("2025-06-16"), by = "day", length.out = 14)
ets_forecast_46673_table <- data.frame(
  Date = forecast_dates,
  Forecast = ets_forecast_46673_final$mean  
)
Date Forecast
2025-06-16 161.88903
2025-06-17 175.51673
2025-06-18 163.78747
2025-06-19 102.47294
2025-06-20 77.11859
2025-06-21 170.33491
2025-06-22 178.01567
2025-06-23 161.96875
2025-06-24 175.60049
2025-06-25 163.86776
2025-06-26 102.53316
2025-06-27 77.16918
2025-06-28 170.41714
2025-06-29 178.10014

Restaurant California 4904

Since the trend component has the largest bar, it is likely to be set to None. Additionally, there is no indication of exponentially increasing variation in either the seasonal or error components. As a result, additive parameters will be applied instead of multiplicative ones.

Model Selection

split_index <- length(lettuce_ts_4904) - 14

train_4904 <- window(lettuce_ts_4904, end = time(lettuce_ts_4904)[split_index])
test_4904 <- window(lettuce_ts_4904, start = time(lettuce_ts_4904)[split_index + 1])
ets_model_4904 <- ets(train_4904, model = "ZZZ")
ets_model_4904$components[1:3]
## [1] "A" "N" "A"
ets_forecast_4904 <- forecast(ets_model_4904, h = length(test_4904))

ets_test_acc_4904 <- accuracy(ets_forecast_4904, test_4904)
ME RMSE MAE MPE MAPE MASE ACF1 Theil’s U
Training set -0.9194867 43.04688 32.02198 -2.351651 10.999318 0.6799502 -0.0667622 NA
Test set -9.7799936 37.75467 28.39140 -3.682129 9.266691 0.6028589 0.1306179 0.4259769
checkresiduals(ets_model_4904)

## 
##  Ljung-Box test
## 
## data:  Residuals from ETS(A,N,A)
## Q* = 16.065, df = 14, p-value = 0.3094
## 
## Model df: 0.   Total lags used: 14

As before, the residual diagnostics indicate no issues. The plots and Ljung-Box test confirm that the residuals exhibit no significant autocorrelation, appear normally distributed, and show no clear patterns, suggesting that the model is well-specified.

Demand Forecasting

For the final demand forecasting of Restaurant 4904, we selected the ETS(A, N, A) model, as it was proven to be the best fit. This model effectively captures the additive trend and error components while assuming no seasonality. Using this final model, we generate a 14-day forecast, providing reliable predictions for lettuce demand.

ets_model_4904_final <- ets(lettuce_ts_4904, model = "ANA")
ets_forecast_4904_final <- forecast(ets_model_4904_final, h = 14)

forecast_dates <- seq(from = as.Date("2025-06-16"), by = "day", length.out = 14)
ets_forecast_4904_table <- data.frame(
  Date = forecast_dates,
  Forecast = ets_forecast_4904_final$mean  
)
Date Forecast
2025-06-16 359.2945
2025-06-17 347.8354
2025-06-18 308.8660
2025-06-19 212.2648
2025-06-20 212.9511
2025-06-21 336.5967
2025-06-22 336.8785
2025-06-23 359.2945
2025-06-24 347.8354
2025-06-25 308.8660
2025-06-26 212.2648
2025-06-27 212.9511
2025-06-28 336.5967
2025-06-29 336.8785

Holt-Winter’s Final Forecasts

Date New_York.12631 New_York.20974 California.46673 California.4904
2025-06-16 266.7777 245.3543 161.88903 359.2945
2025-06-17 279.3228 259.3771 175.51673 347.8354
2025-06-18 307.8626 245.3595 163.78747 308.8660
2025-06-19 305.8711 207.5218 102.47294 212.2648
2025-06-20 234.1088 163.0179 77.11859 212.9511
2025-06-21 267.8582 224.4164 170.33491 336.5967
2025-06-22 239.9318 244.1267 178.01567 336.8785
2025-06-23 267.8663 245.3543 161.96875 359.2945
2025-06-24 280.5532 259.3771 175.60049 347.8354
2025-06-25 309.4576 245.3595 163.86776 308.8660
2025-06-26 307.4355 207.5218 102.53316 212.2648
2025-06-27 234.8684 163.0179 77.16918 212.9511
2025-06-28 268.9501 224.4164 170.41714 336.5967
2025-06-29 240.7416 244.1267 178.10014 336.8785

Finally, we compile the ETS forecasts for each restaurant into a structured table. The data.frame() function is used to create ets_forecast_table, where each column represents the 14-day forecasted lettuce demand for a specific restaurant, and the Date column ensures clear time alignment.

After evaluating multiple ETS configurations, all restaurants were ultimately modeled using ETS(A, N, A), as it provided the best balance between accuracy and interpretability. This consistency across locations enhances comparability and ensures a standardized approach to forecasting lettuce demand.

ARIMA Model Application

In the following sections, we apply an alternative forecasting method, the Autoregressive Integrated Moving Average (ARIMA), to each restaurant’s lettuce demand data. ARIMA is a widely used time series forecasting model that captures different components of temporal data. It consists of three main parameters:

  • Autoregressive (AR) Component (p): Represents the dependency of a value on its past values.
  • Integrated (I) Component (d): Accounts for differencing to make the time series stationary.
  • Moving Average (MA) Component (q): Captures the relationship between a value and past forecast errors.

ARIMA is highly flexible, capable of modeling various time series patterns, including trends and seasonality when extended to Seasonal ARIMA (SARIMA).

Restaurant New York 12631

Same as before, we will start with Restaurant New York 12631 for the ARIMA application. The first step is to check for stationarity. To do this, we will visualize the time series data to identify any trends or patterns that indicate non-stationarity. A clear upward or downward trend in the plot suggests that the data is non-stationary and may require differencing.

Stationarity Check

Based on the time series plot, we can observe a subtle upward trend in lettuce demand for Restaurant 12631, suggesting that the data might be non-stationary. To confirm this, we applied statistical differencing tests.

ndiffs(train_12631)
## [1] 1
nsdiffs(train_12631)
## [1] 0

The ndiffs(train_12631) function returned 1, indicating that one order of differencing is required to achieve stationarity. This confirms our initial observation that a slight trend exists, but a single differencing step should be sufficient to remove it.

Additionally, nsdiffs(train_12631) returned 0, meaning that no seasonal differencing is necessary. This suggests that the data does not exhibit strong seasonal patterns that require additional transformations.

Thus, before applying the ARIMA model, we will proceed with first-order differencing (d = 1) to stabilize the mean and ensure stationarity.

train_12631_diff <- diff(train_12631, differences = 1)

ndiffs(train_12631_diff)
## [1] 0

After applying first-order differencing, we assess whether further differencing is necessary. The results indicate that ndiffs() returns 0, confirming that no additional non-seasonal differencing is required, as the trend component has been effectively removed.

To further validate that the time series is now stationary, we conduct formal stationarity tests, including the Augmented Dickey-Fuller (ADF) Test, Phillips-Perron (PP) Test, and the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) Test. These tests provide statistical confirmation by examining unit roots, stationarity assumptions, and structural properties of the time series, ensuring that we can confidently proceed with ARIMA modeling.

adf_test_12631 <- adf.test(train_12631_diff)
pp_test_12631 <- pp.test(train_12631_diff)
kpss_test_12631 <- kpss.test(train_12631_diff)

stationarity_results_12631 <- data.frame(
  Test = c("Augmented Dickey-Fuller", "Phillips-Perron", "KPSS"),
  P_Value = round(c(adf_test_12631$p.value, pp_test_12631$p.value, kpss_test_12631$p.value), 4)
)
Stationarity Test P-Value
Augmented Dickey-Fuller 0.01
Phillips-Perron 0.01
KPSS 0.10

The stationarity test results provide key insights into whether the differenced time series is now stationary, which is a crucial assumption for ARIMA modeling.

The Augmented Dickey-Fuller (ADF) test and the Phillips-Perron (PP) test are designed to detect the presence of a unit root, which indicates non-stationarity. Both tests returned p-values of 0.01, which are well below the common significance level of 0.05. This means we can reject the null hypothesis of these tests, which states that the time series has a unit root. In other words, these results provide strong evidence that the differenced series is now stationary.

On the other hand, the KPSS test works differently. It assumes that the time series is stationary under the null hypothesis, and a small p-value would indicate evidence against stationarity. In this case, the KPSS test returned a p-value of 0.10, which is greater than 0.05. This means we fail to reject the null hypothesis, supporting the claim that the series is stationary.

Since both ADF and PP tests confirm stationarity by rejecting non-stationarity, and the KPSS test does not contradict this conclusion, we can confidently state that no further differencing is required. This allows us to move forward with fitting the ARIMA model on the stabilized series.

Model Selection

Now that we have confirmed stationarity, the next step in the ARIMA modeling process is to determine the appropriate values for the autoregressive (p) and moving average (q) components. To do this, we need to examine the Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) plots.

acf_plot_12631 <- ggAcf(train_12631_diff, lag.max = 40) + ggtitle("")
pacf_plot_12631 <- ggPacf(train_12631_diff, lag.max = 40) + ggtitle("")

grid.arrange(acf_plot_12631, pacf_plot_12631, ncol = 1, nrow = 2, 
             top = "ACF & PACF Analysis for Restaurant 12631")

The ACF and PACF plots for Restaurant 12631 provide valuable insights into the autocorrelation structure of the differenced time series. The ACF plot reveals notable spikes at lags 1, 7, 14, 21, and 28, which could suggest a weekly seasonality pattern, although it is not overwhelmingly strong. The gradual decay of autocorrelations beyond these lags indicates that past values still have a lingering influence on the present. Such behavior is often associated with an autoregressive (AR) process.

In contrast, the PACF plot exhibits a sharp cutoff after lag 1, meaning that after the first lag, partial correlations drop significantly and remain within the confidence bands. This pattern is a classic characteristic of an AR(1) process, where the current value is primarily influenced by the immediately preceding value.

Since the stationarity tests confirmed that the series is stationary after first-order differencing, and there is no evident strong seasonal pattern requiring seasonal differencing, we can proceed with fitting an ARIMA(p,1,q) model, where:

  • p (AR order) is likely 1, based on the PACF cutoff.
  • q (MA order) will be determined based on model selection criteria and further analysis.
arima_models_12631 <- capture.output(auto.arima(train_12631, trace = TRUE, d = 1))

model_lines_12631 <- arima_models_12631[str_detect(arima_models_12631, "ARIMA\\(")]

arima_results_12631 <- str_extract(model_lines_12631, "ARIMA\\([0-9,]+\\)(\\([0-9,]+\\)\\[[0-9]+\\])?|ARIMA\\([0-9,]+\\)")

aic_values_12631 <- str_extract(model_lines_12631, "[0-9]+\\.[0-9]+$") %>% as.numeric()

if (length(arima_results_12631) == length(aic_values_12631)) {
  arima_table_12631 <- data.frame(
    Model = arima_results_12631,
    AIC = aic_values_12631
  )

  best_arima_models_12631 <- arima_table_12631 %>%
    arrange(AIC) %>%
    head(3)

  best_arima_models_12631$Drift <- ifelse(str_detect(model_lines_12631[match(best_arima_models_12631$Model, arima_results_12631)], "with drift"), "Yes", "No")

  best_arima_models_12631 %>%
    kable(format = "html", col.names = c("Model", "AIC", "Drift")) %>%
    kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
} else {
  print("Error: Extracted model names and AIC values do not match in count.")
}
Model AIC Drift
ARIMA(0,1,1)(2,0,0)[7] 912.8614 No
ARIMA(1,1,1)(2,0,0)[7] 914.4597 No
ARIMA(0,1,2)(2,0,0)[7] 914.5621 No

The process of selecting the best ARIMA model for Restaurant 12631 involves an automated model selection approach using auto.arima(). The trace output provides multiple ARIMA configurations along with their AIC (Akaike Information Criterion) values. The lower the AIC, the better the model fits the data while maintaining simplicity.

In our approach, we will select the top three models and compare their accuracy scores to determine the best-performing one. Based on the extracted results, the three best models, chosen for their lowest AIC values, are:

  1. ARIMA(0,1,1)(2,0,0)[7] with an AIC of 912.8614 (No drift).
  2. ARIMA(1,1,1)(2,0,0)[7] with an AIC of 914.4597 (No drift).
  3. ARIMA(0,1,2)(2,0,0)[7] with an AIC of 914.5621 (No drift).

These models incorporate differencing (1) to ensure stationarity and include seasonal components (weekly seasonality with order [7]). The absence of “drift” indicates that no deterministic trend was required in these models. However, despite the table indicating no drift, we also test models with drift to explore whether they yield better performance for each store—essentially experimenting with different configurations to identify the best fit.

Now, we can begin by evaluating how well they fit and predict the data by examining and comparing the errors in the training and test sets.

arima_model_12631_1 <- Arima(train_12631, order = c(0,1,1), seasonal = list(order = c(2,0,0), period = 7), include.drift = FALSE)  
arima_model_12631_2 <- Arima(train_12631, order = c(1,1,1), seasonal = list(order = c(2,0,0), period = 7), include.drift = FALSE)  
arima_model_12631_3 <- Arima(train_12631, order = c(0,1,2), seasonal = list(order = c(2,0,0), period = 7), include.drift = FALSE)  
arima_forecast_12631_1 <- forecast(arima_model_12631_1, h = length(test_12631))
arima_forecast_12631_2 <- forecast(arima_model_12631_2, h = length(test_12631))
arima_forecast_12631_3 <- forecast(arima_model_12631_3, h = length(test_12631))
Model ME RMSE MAE MPE MAPE MASE ACF1 Theil’s U
Training set ARIMA(0,1,1)(2,0,0)[7] 4.103944 40.24048 31.06843 -0.5328407 11.90586 0.8402411 0.0576681 NA
Test set ARIMA(0,1,1)(2,0,0)[7] -12.488880 41.26388 37.45403 -6.8059927 14.08000 1.0129389 0.0094552 0.6520634
Training set1 ARIMA(1,1,1)(2,0,0)[7] 4.366770 40.13025 30.81424 -0.4249123 11.82139 0.8333666 -0.0295813 NA
Test set1 ARIMA(1,1,1)(2,0,0)[7] -11.151520 41.18892 36.99544 -6.3085458 13.84815 1.0005363 0.0068890 0.6535119
Training set2 ARIMA(0,1,2)(2,0,0)[7] 4.317327 40.14598 30.86092 -0.4448752 11.83567 0.8346291 -0.0158123 NA
Test set2 ARIMA(0,1,2)(2,0,0)[7] -11.480692 41.22657 37.12523 -6.4329773 13.91183 1.0040464 0.0065059 0.6536673

From the table, we can observe the error metrics for the three selected ARIMA models applied to Restaurant 12631, evaluated on both the training set and test set.

Among the three models, ARIMA(1,1,1)(2,0,0)[7] exhibits the lowest MAPE (13.85%) and RMSE (41.19) on the test set, suggesting it provides the most accurate predictions. Additionally, it has a lower ACF1 (0.0069) than ARIMA(0,1,1)(2,0,0)[7], meaning the residuals are less autocorrelated, which is a desirable property for a well-performing model.

Although ARIMA(0,1,1)(2,0,0)[7] shows slightly better RMSE in the training phase, it has a higher test set MAPE (14.08%) and a slightly larger ACF1 (0.0095), indicating that ARIMA(1,1,1)(2,0,0)[7] generalizes slightly better to unseen data.

Given its lower test set MAPE, RMSE and MAE, as well as improved residual behavior, ARIMA(1,1,1)(2,0,0)[7] is selected as the best-performing model for forecasting lettuce demand at Restaurant 12631. The (1,1,1) structure means it includes one autoregressive term (AR), one differencing step (I) to ensure stationarity, and one moving average term (MA) to correct forecasting errors. The seasonal component (2,0,0)[7] captures weekly patterns, with two seasonal autoregressive terms accounting for periodic fluctuations. This suggests that the model effectively captures both short-term and seasonal trends, making it the most reliable choice for this case.

Now, let’s proceed with residual diagnostics to verify that the residuals have a mean of zero and exhibit constant variance.

checkresiduals(arima_forecast_12631_2)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(1,1,1)(2,0,0)[7]
## Q* = 7.5634, df = 10, p-value = 0.6714
## 
## Model df: 4.   Total lags used: 14

The top-left plot, displaying residuals over time, reveals no discernible pattern, indicating that the residuals resemble white noise and appear to be centered around zero. The bottom-left ACF plot confirms this, as all autocorrelations fall within the blue significance bounds, indicating no significant structure remaining in the residuals. Lastly, the histogram with the overlaid density plot on the right suggests that the residuals follow an approximately normal distribution, further supporting the validity of the model.

Additionally, the Ljung-Box test results further confirm the model’s adequacy. The test produces a p-value of 0.6714, which is well above the common significance level of 0.05. This indicates that we fail to reject the null hypothesis, meaning that the residuals exhibit no significant autocorrelation. With these diagnostics, we can conclude that the ARIMA(1,1,1)(2,0,0)[7] model sufficiently captures the underlying structure of the time series data and is a valid choice for forecasting.

Demand Forecasting

Thus, we can now refit the model using the complete dataset and generate a two-week lettuce demand forecast, as illustrated below.

arima_12631_final <- Arima(lettuce_ts_12631, order = c(1,1,1), seasonal = list(order = c(2,0,0), period = 7), include.drift = FALSE)  
arima_forecast_12631_final <- forecast(arima_12631_final, h = 14)
Date Forecast
2025-06-16 267.8324
2025-06-17 252.9936
2025-06-18 293.8467
2025-06-19 276.2834
2025-06-20 253.1750
2025-06-21 266.4303
2025-06-22 248.8072
2025-06-23 277.1748
2025-06-24 254.6642
2025-06-25 309.5395
2025-06-26 302.7129
2025-06-27 252.7133
2025-06-28 257.0218
2025-06-29 252.8329

To review again, the black line represents historical sales data, while the forecasted values are depicted by a central dark blue line. The surrounding shaded regions indicate confidence intervals, with the darker blue reflecting the 80% confidence interval and the lighter blue representing the 95% confidence interval. As time progresses, these intervals widen, suggesting increasing uncertainty in the predictions. Overall, the forecast effectively captures the variability in demand while maintaining a reasonable trend, making it a reliable projection for short-term planning.

Restaurant New York 20974

Stationarity Check

We apply the same methodology to other restaurants. However, for Restaurant 20974, we first extract the longest continuous time series from the training set, as it contains some missing values at the beginning.

# Extracting longest ts 
train_valid_indices_20974 <- which(!is.na(train_20974))

train_breaks_20974 <- c(0, which(diff(train_valid_indices_20974) > 1), length(train_valid_indices_20974))

train_longest_seq_start_20974 <- train_breaks_20974[which.max(diff(train_breaks_20974))] + 1
train_longest_seq_end_20974 <- train_breaks_20974[which.max(diff(train_breaks_20974)) + 1]

train_start_index_20974 <- valid_indices_20974[train_longest_seq_start_20974]
train_end_index_20974 <- valid_indices_20974[train_longest_seq_end_20974]

train_longest_ts_20974 <- window(train_20974, start = c(10, train_start_index_20974), end = c(10, train_end_index_20974))

autoplot(train_longest_ts_20974) +
  ggtitle("Lettuce Demand Time Series - Restaurant 20974") +
  xlab("Day") + 
  ylab("Lettuce Demand (Ounces)") 

From the plot above, we can observe that there is no clear upward or downward trend in the time series for Restaurant 20974.

ndiffs(train_longest_ts_20974)
## [1] 0
nsdiffs(train_longest_ts_20974)
## [1] 0

The results from the ndiffs and nsdiffs tests indicate that no differencing is required for the time series of Restaurant 20974. A value of 0 for ndiffs suggests that the data is already stationary and does not require first-order differencing to remove trends. Similarly, nsdiffs being 0 confirms that there is no significant seasonal pattern that requires seasonal differencing. This implies that the time series can likely be modeled directly without transformation, simplifying the ARIMA or ETS model selection process. However, it is still necessary to verify this by conducting additional stationarity tests to ensure the data meets the necessary assumptions for time series modeling.

adf_test_20974 <- adf.test(train_longest_ts_20974)
pp_test_20974 <- pp.test(train_longest_ts_20974)
kpss_test_20974 <- kpss.test(train_longest_ts_20974)

stationarity_results_20974 <- data.frame(
  Test = c("Augmented Dickey-Fuller", "Phillips-Perron", "KPSS"),
  P_Value = round(c(adf_test_20974$p.value, pp_test_20974$p.value, kpss_test_20974$p.value), 4)
)
Stationarity Test P-Value
Augmented Dickey-Fuller 0.0418
Phillips-Perron 0.0100
KPSS 0.1000

The Augmented Dickey-Fuller (ADF) test returns a p-value of 0.0418, which is below the conventional significance level of 0.05. This indicates that we can reject the null hypothesis of a unit root, suggesting that the time series is likely stationary.

Similarly, the Phillips-Perron (PP) test gives a p-value of 0.0100, reinforcing the conclusion that the series does not have a unit root and is stationary.

On the other hand, the KPSS test yields a p-value of 0.1000. Since the KPSS test has a null hypothesis that the series is stationary, a higher p-value suggests that we fail to reject the null, further confirming that the series is stationary.

Overall, all three tests indicate that the time series does not require differencing, aligning with the previous results from ndiffs and nsdiffs. This confirms that we can proceed with model selection without additional transformations.

Model Selection

acf_plot_20974 <- ggAcf(train_longest_ts_20974, lag.max = 40) + ggtitle("")
pacf_plot_20974 <- ggPacf(train_longest_ts_20974, lag.max = 40) + ggtitle("")

grid.arrange(acf_plot_20974, pacf_plot_20974, ncol = 1, nrow = 2, 
             top = "ACF & PACF Analysis for Restaurant 20974")

The ACF and PACF plots for Restaurant 20974 provide insights into the autocorrelation structure of the time series, which gives us an idea about model selection for ARIMA.

The ACF plot displays a significant spike at lag 1, with additional peaks at lag 7, indicating a possible weekly seasonal pattern in the data. Beyond these initial lags, the autocorrelations gradually decay, suggesting the presence of a moving average (MA) component.

The PACF plot shows a sharp decline after lag 1, followed by another notable spike at lag 7. This pattern suggests an AR(1) or AR(2) process, as the partial autocorrelation function typically exhibits a sudden cutoff when an autoregressive component is present.

Given that the stationarity tests confirmed no need for differencing (ndiffs = 0, nsdiffs = 0), the data appears to be stationary, and an ARMA(p, q) model is likely more appropriate rather than a full ARIMA model.

To determine the most suitable model, we will use auto.arima again, which will identify the optimal values for p, q, and seasonal components based on AIC/BIC criteria.

arima_models_20974 <- capture.output(auto.arima(train_longest_ts_20974, trace = TRUE))

model_lines_20974 <- arima_models_20974[str_detect(arima_models_20974, "ARIMA\\(")]

arima_results_20974 <- str_extract(model_lines_20974, "ARIMA\\([0-9,]+\\)(\\([0-9,]+\\)\\[[0-9]+\\])?|ARIMA\\([0-9,]+\\)")

aic_values_20974 <- str_extract(model_lines_20974, "[0-9]+\\.[0-9]+$") %>% as.numeric()

if (length(arima_results_20974) == length(aic_values_20974)) {
  arima_table_20974 <- data.frame(
    Model = arima_results_20974,
    AIC = aic_values_20974
  )

  best_arima_models_20974 <- arima_table_20974 %>%
    arrange(AIC) %>%
    head(3)

  best_arima_models_20974$Drift <- ifelse(str_detect(model_lines_20974[match(best_arima_models_20974$Model, arima_results_20974)], "with drift"), "Yes", "No")

  best_arima_models_20974 %>%
    kable(format = "html", col.names = c("Model", "AIC", "Drift")) %>%
    kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
} else {
  print("Error: Extracted model names and AIC values do not match in count.")
}
Model AIC Drift
ARIMA(1,0,0)(1,0,0)[7] 791.0354 No
ARIMA(0,0,1)(1,0,0)[7] 791.8657 No
ARIMA(1,0,0)(2,0,0)[7] 792.4024 No

The auto.arima function has selected ARIMA(1,0,0)(1,0,0)[7] as the best model for Restaurant 20974 based on AIC. This model suggests that the time series follows an autoregressive process with one lag (AR(1)) and includes a seasonal component with a one-lag seasonal autoregressive term. The absence of differencing (d=0) confirms that the series is already stationary, which aligns with our stationarity test results. The other top models, ARIMA(0,0,1)(1,0,0)[7] and ARIMA(1,0,0)(2,0,0)[7], have slightly higher AIC values, indicating that they are marginally less optimal. Next, we evaluate their accuracy.

arima_model_20974_1 <- Arima(train_longest_ts_20974, order = c(1,0,0), seasonal = list(order = c(1,0,0), period = 7), include.drift = FALSE, include.mean = TRUE)  
arima_model_20974_2 <- Arima(train_longest_ts_20974, order = c(0,0,1), seasonal = list(order = c(1,0,0), period = 7), include.drift = FALSE, include.mean = TRUE)  
arima_model_20974_3 <- Arima(train_longest_ts_20974, order = c(1,0,0), seasonal = list(order = c(2,0,0), period = 7), include.drift = FALSE, include.mean = TRUE)  
arima_forecast_20974_1 <- forecast(arima_model_20974_1, h = length(test_20974))
arima_forecast_20974_2 <- forecast(arima_model_20974_2, h = length(test_20974))
arima_forecast_20974_3 <- forecast(arima_model_20974_3, h = length(test_20974))

In this case, the models with drift provide a better fit.

Model ME RMSE MAE MPE MAPE MASE ACF1 Theil’s U
Training set ARIMA(1,0,0)(1,0,0)[7] 1.254147 47.53160 37.97326 -17.7643380 32.56668 0.8099996 -0.0190123 NA
Test set ARIMA(1,0,0)(1,0,0)[7] 10.036807 52.43042 31.68473 0.3087592 12.87230 0.6758601 -0.1286452 0.7479320
Training set1 ARIMA(0,0,1)(1,0,0)[7] 1.284272 47.79543 38.13811 -17.8595686 32.71715 0.8135159 0.0254672 NA
Test set1 ARIMA(0,0,1)(1,0,0)[7] 9.619421 52.26425 31.32802 0.1404880 12.72733 0.6682514 -0.1336069 0.7452222
Training set2 ARIMA(1,0,0)(2,0,0)[7] 1.543987 47.15622 38.18619 -16.8409836 31.86687 0.8145415 -0.0198009 NA
Test set2 ARIMA(1,0,0)(2,0,0)[7] 13.421265 51.51128 31.16361 2.1982457 12.40326 0.6647444 -0.1998934 0.7343252

Based on the evaluation metrics, ARIMA(1,0,0)(2,0,0)[7] is selected as the best model for forecasting. It demonstrates the lowest RMSE, MAE, and MAPE on the test set, indicating better accuracy in predicting lettuce demand. Additionally, its lower ACF1 value suggests minimal autocorrelation in residuals, which confirms the model’s adequacy. Overall, this model provides the most reliable forecast while effectively capturing the data’s patterns.

Now, we move on to residual diagnostics to ensure that the residuals have a zero mean and maintain constant variance.

checkresiduals(arima_forecast_20974_3)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(1,0,0)(2,0,0)[7] with non-zero mean
## Q* = 7.9363, df = 11, p-value = 0.719
## 
## Model df: 3.   Total lags used: 14

The auto.arima output indicates that the residuals deviate slightly from a mean of zero. However, this does not inherently suggest a flawed model. The reduced training set, resulting from the exclusion of initial observations, may have limited the model’s capacity to capture the full underlying pattern.Despite this, the ACF of residuals shows no significant spikes, suggesting no remaining autocorrelation. The histogram with a fitted normal curve indicates an approximately normal distribution. Furthermore, the Ljung-Box test produces a p-value of 0.719, which is well above 0.05, confirming that the residuals are not significantly autocorrelated. These results indicate that the model is well-specified and appropriate for forecasting, allowing us to proceed with generating predictions.

Demand Forecasting

With the selected model, we can now use it to forecast lettuce demand for the next two weeks, as shown below.

arima_20974_final <- Arima(lettuce_longest_ts_20974, order = c(1,0,0), seasonal = list(order = c(2,0,0), period = 7), include.drift = FALSE, include.mean = TRUE) 
arima_forecast_20974_final <- forecast(arima_20974_final, h = 14)
Date Forecast
2025-06-16 225.0381
2025-06-17 236.1516
2025-06-18 274.9644
2025-06-19 197.9999
2025-06-20 207.1209
2025-06-21 198.7878
2025-06-22 234.9984
2025-06-23 219.5166
2025-06-24 228.8890
2025-06-25 257.8649
2025-06-26 205.5278
2025-06-27 213.0050
2025-06-28 205.4249
2025-06-29 229.7271

Restaurant California 46673

Stationarity Check

ndiffs(train_46673)
## [1] 0
nsdiffs(train_46673)
## [1] 1

For Restaurant 46673, the time series plot reveals a fluctuating demand pattern with no evident long-term trend. The ndiffs test result of 0 suggests that the series is already stationary and does not require differencing to remove trends. However, the nsdiffs test result of 1 indicates the presence of seasonality, suggesting that one seasonal differencing step is needed. Given that the seasonal cycle follows a weekly trend, we apply a differencing lag of 7. This process involves subtracting the observation from seven days prior from the current value, effectively removing seasonality and stabilizing fluctuations over time.

train_46673_diff <- diff(train_46673, lag = 7, differences = 1)

autoplot(train_46673_diff) +
  ggtitle("First-Order Differenced Time Series - Restaurant 46673") +
  ylab("Differenced Lettuce Demand") +
  xlab("Time")

We conduct stationarity checks again.

ndiffs(train_46673_diff)
## [1] 0
nsdiffs(train_46673_diff)
## [1] 0
adf_test_46673 <- adf.test(train_46673_diff)
pp_test_46673 <- pp.test(train_46673_diff)
kpss_test_46673 <- kpss.test(train_46673_diff)

stationarity_results_46673 <- data.frame(
  Test = c("Augmented Dickey-Fuller", "Phillips-Perron", "KPSS"),
  P_Value = round(c(adf_test_46673$p.value, pp_test_46673$p.value, kpss_test_46673$p.value), 4)
)
Stationarity Test P-Value
Augmented Dickey-Fuller 0.0375
Phillips-Perron 0.0100
KPSS 0.1000

The stationarity tests for Restaurant 46673 indicate that the series is likely stationary. Both the Augmented Dickey-Fuller (0.0375) and Phillips-Perron (0.0100) tests have p-values below 0.05, rejecting the null hypothesis of non-stationarity. Meanwhile, the KPSS test (0.1000), with a p-value above 0.05, fails to reject the null hypothesis of stationarity. These results confirm that the data does not require further differencing, apart from the seasonal adjustment already identified.

Model Selection

acf_plot_46673 <- ggAcf(train_46673_diff, lag.max = 40) + ggtitle("")
pacf_plot_46673 <- ggPacf(train_46673_diff, lag.max = 40) + ggtitle("")

grid.arrange(acf_plot_46673, pacf_plot_46673, ncol = 1, nrow = 2, 
             top = "ACF & PACF Analysis for Restaurant 46673")

Both the ACF and PACF plots for Restaurant 46673 suggest a strong seasonal component at lag 7. The significant negative spike at lag 7 in both plots indicates a seasonal autoregressive (SAR) or seasonal moving average (SMA) term. The remaining lags show no strong patterns, suggesting a simple seasonal ARIMA model with a seasonal order of (0,1,1) or (1,1,0) might be appropriate.

arima_models_46673 <- capture.output(auto.arima(train_46673, trace = TRUE, D = 1, seasonal = TRUE))

model_lines_46673 <- arima_models_46673[str_detect(arima_models_46673, "ARIMA\\(")]

arima_results_46673 <- str_extract(model_lines_46673, "ARIMA\\([0-9,]+\\)(\\([0-9,]+\\)\\[[0-9]+\\])?|ARIMA\\([0-9,]+\\)")

aic_values_46673 <- str_extract(model_lines_46673, "[0-9]+\\.[0-9]+$") %>% as.numeric()

if (length(arima_results_46673) == length(aic_values_46673)) {
  arima_table_46673 <- data.frame(
    Model = arima_results_46673,
    AIC = aic_values_46673
  )

  best_arima_models_46673 <- arima_table_46673 %>%
    arrange(AIC) %>%
    head(3)

  best_arima_models_46673$Drift <- ifelse(str_detect(model_lines_46673[match(best_arima_models_46673$Model, arima_results_46673)], "with drift"), "Yes", "No")

  best_arima_models_46673 %>%
    kable(format = "html", col.names = c("Model", "AIC", "Drift")) %>%
    kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
} else {
  print("Error: Extracted model names and AIC values do not match in count.")
}
Model AIC Drift
ARIMA(0,0,1)(0,1,1)[7] 775.7980 Yes
ARIMA(1,0,0)(0,1,1)[7] 775.9378 Yes
ARIMA(0,0,0)(0,1,1)[7] 776.4867 Yes

Based on the auto.arima function, the best-selected model for Restaurant 46673 is ARIMA(0,0,1)(0,1,1)[7] with an AIC of 775.7980. This model includes a moving average (MA(1)) component along with a seasonal MA(1) term, effectively capturing short-term dependencies and weekly seasonality.

The second-best model, ARIMA(1,0,0)(0,1,1)[7], has a slightly higher AIC (775.9378), indicating a similar fit but incorporating an autoregressive (AR(1)) term instead of a moving average component. The third model, ARIMA(0,0,0)(0,1,1)[7], has the highest AIC (776.4867), making it the least optimal choice among the three.

Given the minimal differences in AIC values, we will now proceed to test all the models on our test set and evaluate their forecasting accuracies.

arima_model_46673_1 <- Arima(train_46673, order = c(0,0,1), seasonal = list(order = c(0,1,1), period = 7), include.drift = FALSE)  
arima_model_46673_2 <- Arima(train_46673, order = c(1,0,0), seasonal = list(order = c(0,1,1), period = 7), include.drift = FALSE)  
arima_model_46673_3 <- Arima(train_46673, order = c(0,0,0), seasonal = list(order = c(0,1,1), period = 7), include.drift = FALSE)  
arima_forecast_46673_1 <- forecast(arima_model_46673_1, h = length(test_46673))
arima_forecast_46673_2 <- forecast(arima_model_46673_2, h = length(test_46673))
arima_forecast_46673_3 <- forecast(arima_model_46673_3, h = length(test_46673))
evaluate_arima_model_46673 <- function(forecast, actual) {
  acc <- accuracy(forecast, actual)
  data.frame(
    Dataset = c("Training set", "Test set"),
    ME = acc[, "ME"],
    RMSE = acc[, "RMSE"],
    MAE = acc[, "MAE"],
    MPE = acc[, "MPE"],
    MAPE = acc[, "MAPE"],
    MASE = acc[, "MASE"],
    ACF1 = acc[, "ACF1"],
    Theil_U = acc[, "Theil's U"]
  )
}

arima_error_46673_1 <- evaluate_arima_model_46673(arima_forecast_46673_1, test_46673)
arima_error_46673_2 <- evaluate_arima_model_46673(arima_forecast_46673_2, test_46673)
arima_error_46673_3 <- evaluate_arima_model_46673(arima_forecast_46673_3, test_46673)

arima_error_table_46673 <- rbind(
  cbind(Model = "ARIMA(0,0,1)(0,1,1)[7]", arima_error_46673_1),
  cbind(Model = "ARIMA(1,0,0)(0,1,1)[7]", arima_error_46673_2),
  cbind(Model = "ARIMA(0,0,0)(0,1,1)[7]", arima_error_46673_3)
)

arima_error_table_46673 %>%
  select(-Dataset) %>%  
  kable(format = "html", col.names = c("Model", "ME", "RMSE", "MAE", "MPE", "MAPE", "MASE", "ACF1", "Theil's U")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
Model ME RMSE MAE MPE MAPE MASE ACF1 Theil’s U
Training set ARIMA(0,0,1)(0,1,1)[7] 0.0028287 24.33113 17.78837 -2.705200 13.37245 0.6687969 0.0035688 NA
Test set ARIMA(0,0,1)(0,1,1)[7] 7.4932548 41.08078 30.72942 2.313730 19.62275 1.1553471 0.1711167 0.6433336
Training set1 ARIMA(1,0,0)(0,1,1)[7] 0.0228809 24.34700 17.82533 -2.726029 13.36990 0.6701867 0.0132016 NA
Test set1 ARIMA(1,0,0)(0,1,1)[7] 7.3655972 41.09264 30.77366 2.230649 19.64924 1.1570106 0.1722622 0.6432981
Training set2 ARIMA(0,0,0)(0,1,1)[7] -0.0632055 24.89233 18.27833 -2.935965 13.78152 0.6872184 0.1811863 NA
Test set2 ARIMA(0,0,0)(0,1,1)[7] 8.6745680 41.49454 30.70063 3.132855 19.59288 1.1542648 0.1803610 0.6557006

All the models exhibit similar accuracy on the test set. However, the ARIMA(0,0,1)(0,1,1)[7] model demonstrates slightly better performance for Restaurant 46673, as it has the lowest RMSE (41.08). Therefore, we will proceed with this model for forecasting lettuce demand, after verifying its residuals.

checkresiduals(arima_forecast_46673_1)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,0,1)(0,1,1)[7]
## Q* = 14.87, df = 12, p-value = 0.2486
## 
## Model df: 2.   Total lags used: 14

The residual diagnostics indicate that the ARIMA(1,0,0)(0,1,1)[7] model is satisfcatory enough. The ACF plot shows no significant autocorrelation, and the histogram suggests an approximately normal distribution. Additionally, the Ljung-Box test yields a p-value of 0.2428, which is well above 0.05, confirming that the residuals exhibit no significant autocorrelation. Given these results, this model is appropriate for forecasting lettuce demand.

Demand Forecasting

Based on the final selected model, we generate the lettuce demand forecast for the next two weeks, as shown below.

arima_46673_final <- Arima(lettuce_ts_46673, order = c(0,0,1), seasonal = list(order = c(0,1,1), period = 7), include.drift = FALSE)  
arima_forecast_46673_final <- forecast(arima_46673_final, h = 14)
Date Forecast
2025-06-16 175.90960
2025-06-17 173.09271
2025-06-18 164.46665
2025-06-19 100.73332
2025-06-20 76.66666
2025-06-21 168.66665
2025-06-22 176.53332
2025-06-23 161.06062
2025-06-24 173.09271
2025-06-25 164.46665
2025-06-26 100.73332
2025-06-27 76.66666
2025-06-28 168.66665
2025-06-29 176.53332

Restaurant California 4904

Stationarity Check

The time series plot for Restaurant 4904 shows significant fluctuations in lettuce demand, with no clear upward or downward trend.

ndiffs(train_4904)
## [1] 0
nsdiffs(train_4904)
## [1] 1

The differencing tests indicate that no first-order differencing is needed (ndiffs = 0), confirming that the time series is already stationary in terms of trend. However, one seasonal differencing (nsdiffs = 1) is required, suggesting the presence of a weekly seasonal pattern that needs to be accounted for in the model. This implies that we should apply seasonal differencing with a lag of 7 days before fitting an ARIMA model.

train_4904_diff <- diff(train_4904, lag = 7, differences = 1)

autoplot(train_4904_diff) +
  ggtitle("First-Order Differenced Time Series - Restaurant 4904") +
  ylab("Differenced Lettuce Demand") +
  xlab("Time")

Let’s check if we need further differencing.

ndiffs(train_12631_diff)
## [1] 0
nsdiffs(train_12631_diff)
## [1] 0
adf_test_4904 <- adf.test(train_4904_diff)
pp_test_4904 <- pp.test(train_4904_diff)
kpss_test_4904 <- kpss.test(train_4904_diff)

stationarity_results_4904 <- data.frame(
  Test = c("Augmented Dickey-Fuller", "Phillips-Perron", "KPSS"),
  P_Value = round(c(adf_test_4904$p.value, pp_test_4904$p.value, kpss_test_4904$p.value), 4)
)
Stationarity Test P-Value
Augmented Dickey-Fuller 0.9349
Phillips-Perron 0.0100
KPSS 0.1000

The stationarity tests provide mixed results. The Augmented Dickey-Fuller (ADF) test yields a high p-value (0.9349), suggesting the presence of a unit root, indicating non-stationarity. The Phillips-Perron (PP) test, however, has a very low p-value (0.0100), indicating stationarity. The KPSS test has a p-value of 0.1000, which is above the 0.05 threshold, and thus it fails to reject the null hypothesis, meaning there is no strong evidence against stationarity. Given these conflicting results, seasonal differencing (nsdiffs = 1) remains necessary to address potential seasonal patterns before finalizing the model.

Model Selection

acf_plot_4904 <- ggAcf(train_4904_diff, lag.max = 40) + ggtitle("")
pacf_plot_4904 <- ggPacf(train_4904_diff, lag.max = 40) + ggtitle("")

grid.arrange(acf_plot_4904, pacf_plot_4904, ncol = 1, nrow = 2, 
             top = "ACF & PACF Analysis for Restaurant 4904")

The ACF and PACF plots for Restaurant 4904 show a significant drop at lag 7, indicating a strong seasonal component with a weekly pattern. The presence of significant spikes at early lags suggests the potential need for an AR(1) or MA(1) component. Given the seasonal differencing requirement (D=1), a seasonal ARIMA model incorporating seasonal MA or AR terms will likely be suitable for capturing the demand pattern.

arima_models_4904 <- capture.output(auto.arima(train_4904, trace = TRUE, D = 1, seasonal = TRUE, stepwise = FALSE, approximation = FALSE))

model_lines_4904 <- arima_models_4904[str_detect(arima_models_4904, "ARIMA\\(")]

arima_results_4904 <- str_extract(model_lines_4904, "ARIMA\\([0-9,]+\\)(\\([0-9,]+\\)\\[[0-9]+\\])?|ARIMA\\([0-9,]+\\)")

aic_values_4904 <- str_extract(model_lines_4904, "[0-9]+\\.[0-9]+$") %>% as.numeric()

if (length(arima_results_4904) == length(aic_values_4904)) {
  arima_table_4904 <- data.frame(
    Model = arima_results_4904,
    AIC = aic_values_4904
  )

  best_arima_models_4904 <- arima_table_4904 %>%
    arrange(AIC) %>%
    head(3)

  best_arima_models_4904$Drift <- ifelse(str_detect(model_lines_4904[match(best_arima_models_4904$Model, arima_results_4904)], "with drift"), "Yes", "No")

  best_arima_models_4904 %>%
    kable(format = "html", col.names = c("Model", "AIC", "Drift")) %>%
    kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
} else {
  print("Error: Extracted model names and AIC values do not match in count.")
}
Model AIC Drift
ARIMA(0,0,2)(0,1,1)[7] 801.0120 No
ARIMA(0,0,4)(0,1,1)[7] 801.4305 No
ARIMA(0,0,3)(0,1,1)[7] 802.8626 No
arima_model_4904_1 <- Arima(train_4904, order = c(0,0,2), seasonal = list(order = c(0,1,1), period = 7), include.drift = TRUE)  
arima_model_4904_2 <- Arima(train_4904, order = c(0,0,4), seasonal = list(order = c(0,1,1), period = 7), include.drift = TRUE)  
arima_model_4904_3 <- Arima(train_4904, order = c(0,0,3), seasonal = list(order = c(0,1,1), period = 7), include.drift = TRUE)  
arima_forecast_4904_1 <- forecast(arima_model_4904_1, h = length(test_4904))
arima_forecast_4904_2 <- forecast(arima_model_4904_2, h = length(test_4904))
arima_forecast_4904_3 <- forecast(arima_model_4904_3, h = length(test_4904))
# Function to evaluate model accuracy
evaluate_arima_model_4904 <- function(forecast, actual) {
  acc <- accuracy(forecast, actual)
  data.frame(
    Dataset = c("Training set", "Test set"),
    ME = acc[, "ME"],
    RMSE = acc[, "RMSE"],
    MAE = acc[, "MAE"],
    MPE = acc[, "MPE"],
    MAPE = acc[, "MAPE"],
    MASE = acc[, "MASE"],
    ACF1 = acc[, "ACF1"],
    Theil_U = acc[, "Theil's U"]
  )
}

arima_error_4904_1 <- evaluate_arima_model_4904(arima_forecast_4904_1, test_4904)
arima_error_4904_2 <- evaluate_arima_model_4904(arima_forecast_4904_2, test_4904)
arima_error_4904_3 <- evaluate_arima_model_4904(arima_forecast_4904_3, test_4904)

arima_error_table_4904 <- rbind(
  cbind(Model = "ARIMA(0,0,2)(0,1,1)[7]", arima_error_4904_1),
  cbind(Model = "ARIMA(0,0,4)(0,1,1)[7]", arima_error_4904_2),
  cbind(Model = "ARIMA(0,0,3)(0,1,1)[7]", arima_error_4904_3)
)

arima_error_table_4904 %>%
  select(-Dataset) %>%  
  kable(format = "html", col.names = c("Model", "ME", "RMSE", "MAE", "MPE", "MAPE", "MASE", "ACF1", "Theil's U")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
Model ME RMSE MAE MPE MAPE MASE ACF1 Theil’s U
Training set ARIMA(0,0,2)(0,1,1)[7] 5.028166 42.93714 31.17312 0.1578130 10.366398 0.6619256 0.0052211 NA
Test set ARIMA(0,0,2)(0,1,1)[7] -8.066405 37.29566 28.75235 -2.9862269 9.274836 0.6105235 0.1947400 0.4263084
Training set1 ARIMA(0,0,4)(0,1,1)[7] 4.224050 42.19133 30.38149 0.1549914 10.067473 0.6451163 0.0160889 NA
Test set1 ARIMA(0,0,4)(0,1,1)[7] -13.024847 38.29577 29.89511 -4.6106604 9.587740 0.6347885 0.2227416 0.4268244
Training set2 ARIMA(0,0,3)(0,1,1)[7] 4.814607 42.86451 31.21566 0.0886746 10.364240 0.6628290 -0.0014062 NA
Test set2 ARIMA(0,0,3)(0,1,1)[7] -8.981517 37.81098 29.16125 -3.2909025 9.397333 0.6192058 0.1896541 0.4284052

In this case, we included stepwise = FALSE, approximation = FALSE compared to the other restaurants, where these parameters did not change the results. This adjustment allowed for a more exhaustive search of potential ARIMA models, ensuring that we explored a broader set of possibilities, given the less clear stationarity results.

Additionally, we tested both variations—models with and without drift—to determine whether including drift improved model performance. Based on the results, we found that incorporating drift led to better-fitting models.

Among the three selected models, the ARIMA(0,0,2)(0,1,1)[7] model emerged as the best-performing option, as it achieved the lowest RMSE and MAE, indicating superior accuracy in predicting lettuce demand. Therefore, we proceed with this model for further analysis.

checkresiduals(arima_forecast_4904_1)

## 
##  Ljung-Box test
## 
## data:  Residuals from ARIMA(0,0,2)(0,1,1)[7] with drift
## Q* = 15.884, df = 11, p-value = 0.1455
## 
## Model df: 3.   Total lags used: 14

The residual diagnostics for the ARIMA(0,0,2)(0,1,1)[7] with drift model indicate that the residuals fluctuate randomly around zero. However, there may be some minor patterns suggesting slight deviations from zero.

The ACF plot shows that most autocorrelation values fall within the confidence bounds, implying that the residuals are largely uncorrelated. However, a few lags slightly exceed the bounds, which could indicate minor model misspecification. The histogram of residuals appears approximately normal, though a few extreme values suggest the presence of potential outliers.

The Ljung-Box test result, with a p-value of 0.1455, suggests that we fail to reject the null hypothesis, indicating no significant autocorrelation in the residuals. Overall, the model performs reasonably well, though minor autocorrelation and outliers may still need to be addressed for further improvement

Demand Forecasting

Using the final selected model, we produce the lettuce demand forecast for the upcoming two weeks, as illustrated below.

arima_4904_final <- Arima(lettuce_ts_4904, order = c(0,0,2), seasonal = c(0,1,1), include.drift = TRUE)
arima_forecast_4904_final <- forecast(arima_4904_final, h = 14)
Date Forecast
2025-06-16 357.4335
2025-06-17 341.3545
2025-06-18 302.5804
2025-06-19 215.8155
2025-06-20 216.5998
2025-06-21 340.6342
2025-06-22 335.2030
2025-06-23 353.9381
2025-06-24 338.2286
2025-06-25 299.0394
2025-06-26 212.2745
2025-06-27 213.0588
2025-06-28 337.0932
2025-06-29 331.6620

ARIMA Final Forecasts

Date New_York.12631 New_York.20974 California.46673 California.4904
2025-06-16 267.8324 225.0381 175.90960 357.4335
2025-06-17 252.9936 236.1516 173.09271 341.3545
2025-06-18 293.8467 274.9644 164.46665 302.5804
2025-06-19 276.2834 197.9999 100.73332 215.8155
2025-06-20 253.1750 207.1209 76.66666 216.5998
2025-06-21 266.4303 198.7878 168.66665 340.6342
2025-06-22 248.8072 234.9984 176.53332 335.2030
2025-06-23 277.1748 219.5166 161.06062 353.9381
2025-06-24 254.6642 228.8890 173.09271 338.2286
2025-06-25 309.5395 257.8649 164.46665 299.0394
2025-06-26 302.7129 205.5278 100.73332 212.2745
2025-06-27 252.7133 213.0050 76.66666 213.0588
2025-06-28 257.0218 205.4249 168.66665 337.0932
2025-06-29 252.8329 229.7271 176.53332 331.6620

Overall, we provided demand forecasts for lettuce across different restaurant locations. The model selection was guided by a combination of Akaike Information Criterion (AIC) values and performance metrics such as Mean Absolute Percentage Error (MAPE), Root Mean Squared Error (RMSE), and residual diagnostics. These measures ensured that the final models strike a balance between accuracy and complexity.

For each restaurant, the final ARIMA configurations were determined as follows:

  • New York (12631): ARIMA(1,1,1)(2,0,0)[7]
  • New York (20974): ARIMA(1,0,0)(2,0,0)[7]
  • California (46673): ARIMA(0,0,1)(0,1,1)[7]
  • California (4904): ARIMA(0,0,2)(0,1,1)[7]

These models effectively capture short-term dependencies and weekly seasonality, crucial for understanding fluctuations in demand. The selection process involved evaluating multiple ARIMA candidates for each location and comparing their AIC values. However, the final choice was also influenced by how well the models performed on test data, ensuring that they generalize effectively beyond the training period.

Performance comparison between Holt-Winters and ARIMA

In this final section, we will evaluate and compare the accuracy and forecasting performance of both methods for each store, ultimately selecting the best-performing model for the final forecast.

Restaurant New York 12631

For this restaurant, the selected Holt-Winters model was ETS(A,N,A), while the chosen ARIMA model was ARIMA(1,1,1)(2,0,0)[7]. To determine which model performs better, we first overlay the fitted values onto the original time series, providing a visual comparison to assess their fit and overall alignment with the observed data.

The plot above presents a visual comparison of the fitted values from the Holt-Winters (ETS A,N,A) model and the ARIMA (1,1,1)(2,0,0)[7] model against the actual time series data. The black line represents the observed lettuce demand, while the blue dashed line and red dashed line correspond to the fitted values from the Holt-Winters and ARIMA models, respectively.

Both models capture the overall trend and seasonality of the data, but notable differences can be observed. The Holt-Winters model appears to track fluctuations more closely, adapting quickly to demand spikes. In contrast, the ARIMA model exhibits a smoother fit, indicating it may be less sensitive to sudden variations in demand.

While neither model perfectly matches all fluctuations, this visual assessment suggests that Holt-Winters might be better at capturing volatility, whereas ARIMA provides a more stable and generalized fit. Further evaluation using error metrics would help determine the model with superior predictive accuracy.

Model ME RMSE MAE MPE MAPE MASE ACF1 Theil’s U
Training set ETS(A,N,A) 3.986680 35.66635 27.53202 -0.0548485 10.39996 0.7445996 0.0469063 NA
Test set ETS(A,N,A) -9.920354 40.22329 33.63813 -5.5924766 12.59408 0.9097382 0.0735180 0.6530292
Training set1 ARIMA(1,1,1)(2,0,0)[7] 4.366770 40.13025 30.81424 -0.4249123 11.82139 0.8333666 -0.0295813 NA
Test set1 ARIMA(1,1,1)(2,0,0)[7] -11.151520 41.18892 36.99544 -6.3085458 13.84815 1.0005363 0.0068890 0.6535119

The comparison between the Holt-Winters ETS(A,N,A) model and the ARIMA(1,1,1)(2,0,0)[7] model shows that ETS slightly outperforms ARIMA in terms of forecasting accuracy. ETS has a lower Mean Error (ME) and Mean Absolute Error (MAE) in both the training and test sets, indicating smaller absolute deviations from actual demand. Additionally, its Root Mean Squared Error (RMSE) is slightly lower, suggesting better overall fit.

In terms of percentage errors, ETS achieves a lower Mean Absolute Percentage Error (MAPE) (12.59% test) compared to ARIMA (13.85% test), meaning ETS provides slightly more reliable relative predictions. The Mean Absolute Scaled Error (MASE) also favors ETS, confirming its accuracy over a naïve benchmark.

Overall, ETS(A,N,A) appears to be the more effective model for forecasting lettuce demand for Restaurant 12631.

Restaurant New York 20974

Both models capture the overall trend and seasonality well, except for the period around the 15th week. Similar to the previous case, the ETS model responds more dynamically to fluctuations in demand, particularly during sudden drops and peaks. In contrast, the ARIMA model provides a smoother fit, suggesting a lower sensitivity to short-term variations.

The beginning of the time series contains missing values, confirming that the ets() function effectively handles them. While both models closely align with the observed demand, further evaluation using error metrics is needed to determine which model offers better predictive accuracy for this specific location.

Model ME RMSE MAE MPE MAPE MASE ACF1 Theil’s U
Training set ETS(A,N,A) -0.9560356 40.69057 33.66340 -13.469089 26.18756 0.7180669 0.0861895 NA
Test set ETS(A,N,A) 27.2057558 55.99135 33.87778 9.466729 13.38155 0.7226398 -0.2949359 0.8497462
Training set1 ARIMA(1,0,0)(2,0,0)[7] 1.5439866 47.15622 38.18619 -16.840984 31.86687 0.8145415 -0.0198009 NA
Test set1 ARIMA(1,0,0)(2,0,0)[7] 13.4212651 51.51128 31.16361 2.198246 12.40326 0.6647444 -0.1998934 0.7343252

ARIMA exhibits higher error measures on the training set but achieves lower RMSE, MAE, and MAPE on the test set, indicating a better overall fit for forecasting. Additionally, its lower MASE suggests that its predictions deviate less from actual values compared to a naïve benchmark. The Mean Percentage Error (MPE) further indicates that ETS introduces more bias in its forecasts.

Given these results, ARIMA(1,0,0)(2,0,0)[7] appears to be the superior model for this location, as it provides more accurate and stable forecasts.

Restaurant California 46673

Once again, this case resembles the previous two: the ETS model responds more dynamically to sudden fluctuations, particularly during sharp increases and decreases, while the ARIMA model provides a smoother fit, indicating lower sensitivity to short-term volatility. Given these differences, we will once again evaluate their accuracies to determine which model performs better.

Model ME RMSE MAE MPE MAPE MASE ACF1 Theil’s U
Training set ETS(A,N,A) -1.1975184 23.23844 18.40454 -3.814912 14.15808 0.6919636 0.0857551 NA
Test set ETS(A,N,A) 12.9328967 38.32624 27.62740 6.297726 17.57719 1.0387192 0.0466446 0.6071629
Training set1 ARIMA(0,0,1)(0,1,1)[7] 0.0028287 24.33113 17.78837 -2.705200 13.37245 0.6687969 0.0035688 NA
Test set1 ARIMA(0,0,1)(0,1,1)[7] 7.4932548 41.08078 30.72942 2.313730 19.62275 1.1553471 0.1711167 0.6433336

ETS has lower errors in most cases, especially on the test set, which is our primary focus. This indicates that it provides a better overall fit and more accurate predictions. Based on these results, ETS(A,N,A) is the superior model for this store, offering more accurate and stable forecasts with consistently lower errors.

Restaurant California 4904

Unlike the other restaurants, this case stands out as ARIMA appears less smooth and follows fluctuations more closely. As a result, it becomes more challenging to determine the better-fitting model based solely on visualization. A deeper evaluation using accuracy metrics is necessary to make a more informed comparison.

Model ME RMSE MAE MPE MAPE MASE ACF1 Theil’s U
Training set ETS(A,N,A) -0.9194867 43.04688 32.02198 -2.351651 10.999318 0.6799502 -0.0667622 NA
Test set ETS(A,N,A) -9.7799936 37.75467 28.39140 -3.682129 9.266691 0.6028589 0.1306179 0.4259769
Training set1 ARIMA(0,0,2)(0,1,1)[7] 5.0281659 42.93714 31.17312 0.157813 10.366398 0.6619256 0.0052211 NA
Test set1 ARIMA(0,0,2)(0,1,1)[7] -8.0664052 37.29566 28.75235 -2.986227 9.274836 0.6105235 0.1947400 0.4263084

Both ETS and ARIMA perform very similarly, with only slight differences in accuracy metrics. Since the test set does not provide a clear advantage for either model, we base our decision on the training set performance, where ARIMA performs just slightly better. Given this, we choose ARIMA(0,0,2)(0,1,1)[7] for this store, as it offers a comparable fit with slightly better performance in training.

Final Lettuce Demand Forecasts Across All Restaurants

After a thorough evaluation of forecasting models for lettuce demand across multiple stores, we selected the most suitable model for each location based on predictive accuracy and stability. The final choices reflect a balance between error metrics, model responsiveness to fluctuations, and overall performance on training and test sets.

Both ETS and ARIMA effectively capture demand patterns, ensuring reliable forecasting tailored to each store’s data characteristics. This approach ensures that each location benefits from the most reliable forecasting method, allowing for improved demand planning and inventory management.

Below, we present a consolidated table of the final lettuce sales forecasts for each store.

Date New_York.12631 New_York.20974 California.46673 California.4904
2025-06-16 266.7777 225.0381 161.88903 357.4335
2025-06-17 279.3228 236.1516 175.51673 341.3545
2025-06-18 307.8626 274.9644 163.78747 302.5804
2025-06-19 305.8711 197.9999 102.47294 215.8155
2025-06-20 234.1088 207.1209 77.11859 216.5998
2025-06-21 267.8582 198.7878 170.33491 340.6342
2025-06-22 239.9318 234.9984 178.01567 335.2030
2025-06-23 267.8663 219.5166 161.96875 353.9381
2025-06-24 280.5532 228.8890 175.60049 338.2286
2025-06-25 309.4576 257.8649 163.86776 299.0394
2025-06-26 307.4355 205.5278 102.53316 212.2745
2025-06-27 234.8684 213.0050 77.16918 213.0588
2025-06-28 268.9501 205.4249 170.41714 337.0932
2025-06-29 240.7416 229.7271 178.10014 331.6620

Conclusion

By constructing a tailored data pipeline and applying both ETS and ARIMA models, daily lettuce demand was forecasted for four restaurant locations with location-specific accuracy. The analysis revealed notable patterns: demand was generally higher in New York locations compared to Berkeley, with clear weekly seasonality captured effectively by ETS models in some cases. ARIMA performed better in stores with irregular or noisier demand. These findings underscore the importance of customizing forecasting methods based on local demand behavior.

From a business perspective, the project provides a scalable, data-driven approach to ingredient forecasting. Accurate 14-day projections allow restaurant managers to optimize purchasing, reduce spoilage, and improve inventory efficiency. The methodology can be extended to other fresh ingredients, supporting broader supply chain improvements across the organization.