2025-04-01 Update. From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article gives a brief introduction to using Prophet, an open-source forecasting tool, in R.
Prophet is an open-source tool released by Facebook for forecasting time series data at scale. It is available in both R and Python.
I. Basic Introduction
The following example uses the time series of daily visits to Peyton Manning's Wikipedia page (December 10, 2007 to January 20, 2016), retrieved with the wikipediatrend package in R. Peyton Manning is a former NFL quarterback. The dataset makes a good example because it shows multiple seasonalities, a changing growth rate, and special dates that can be modeled explicitly (such as Manning's playoff and Super Bowl appearances).
In R, the prophet function fits the model and returns a model object, which you can then pass to predict and plot.
Use the prophet_plot_components function to break the forecast down into trend, weekly seasonality, and yearly seasonality.
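As a minimal sketch of this workflow, the R code below fits a model, forecasts one year ahead, and draws both plots. The data frame is synthetic placeholder data (an assumption made so that the example runs on its own); with real data, df would hold the page-view series in the columns ds and y.

```r
library(prophet)

# Synthetic placeholder data (an assumption, not the article's dataset):
# three years of daily values with a trend, a yearly cycle, and noise.
set.seed(42)
ds <- seq(as.Date("2013-01-01"), by = "day", length.out = 1096)
i <- seq_along(ds)
df <- data.frame(
  ds = ds,
  y = 8 + 0.0005 * i + 0.3 * sin(2 * pi * i / 365.25) + rnorm(length(i), sd = 0.1)
)

m <- prophet(df)                                   # fit the model
future <- make_future_dataframe(m, periods = 365)  # history plus 365 future days
forecast <- predict(m, future)                     # adds yhat, yhat_lower, yhat_upper

tail(forecast[c("ds", "yhat", "yhat_lower", "yhat_upper")])
plot(m, forecast)                     # observations and forecast
prophet_plot_components(m, forecast)  # trend and seasonality panels
```

Prophet expects exactly these column names: ds for the date and y for the value being forecast.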
Note: if the Windows system language is set to Chinese, the weekly effect is not displayed correctly in the components plot. Run Sys.setlocale("LC_ALL", "English") in R to switch the locale to English.
II. Forecasting Growth
By default, Prophet uses a linear model for the trend. When forecasting growth, there is usually a maximum that can be reached, such as total market size or total population. This limit is called the carrying capacity, and the forecast should saturate as it approaches that value.
Prophet can instead forecast with a logistic growth trend model and a specified carrying capacity. The example for this section is the log number of visits to the Wikipedia page for R.
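A sketch of a logistic-growth forecast might look like the following. The series is synthetic and the capacity of 9.5 is a hypothetical value chosen only to sit above that series; in practice the capacity would come from knowledge of the market or population being modeled.

```r
library(prophet)

# Synthetic placeholder series (an assumption, not the article's data).
set.seed(42)
ds <- seq(as.Date("2013-01-01"), by = "day", length.out = 1096)
i <- seq_along(ds)
df <- data.frame(ds = ds, y = 8 + 0.0005 * i + rnorm(length(i), sd = 0.1))

# Logistic growth requires a cap column holding the carrying capacity.
# The value 9.5 is hypothetical.
df$cap <- 9.5
m <- prophet(df, growth = "logistic")

# The future data frame needs the capacity as well.
future <- make_future_dataframe(m, periods = 1826)
future$cap <- 9.5
forecast <- predict(m, future)
plot(m, forecast)
```

The cap column does not have to be constant; it can vary from row to row if the capacity is expected to change over time.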
III. Trend Changepoints
By default, Prophet automatically detects changepoints, the points where the trend changes abruptly, and lets the trend adapt accordingly.
The following describes how to gain finer control over this process.
1. Adjusting trend flexibility
If the trend changes are overfit (too flexible) or underfit (not flexible enough), use the changepoint.prior.scale argument to adjust the strength of the sparse prior on the changepoints. Its default value is 0.05.
Increasing this value makes the trend more flexible.
Decreasing this value makes the trend less flexible.
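To illustrate both directions, the sketch below fits one model with a larger value and one with a smaller value than the default of 0.05. The specific values 0.5 and 0.001 and the synthetic data are assumptions for illustration only.

```r
library(prophet)

# Synthetic placeholder series (an assumption, not the article's data).
set.seed(42)
ds <- seq(as.Date("2013-01-01"), by = "day", length.out = 1096)
i <- seq_along(ds)
df <- data.frame(ds = ds, y = 8 + 0.0005 * i + rnorm(length(i), sd = 0.1))

# Larger prior scale: the trend is allowed to bend more.
m_flexible <- prophet(df, changepoint.prior.scale = 0.5)
# Smaller prior scale: the trend is held closer to a straight line.
m_rigid <- prophet(df, changepoint.prior.scale = 0.001)

# Both models share the same history, so one future frame serves both.
future <- make_future_dataframe(m_flexible, periods = 365)
plot(m_flexible, predict(m_flexible, future))
plot(m_rigid, predict(m_rigid, future))
```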
IV. Holiday Effects
1. Modeling holidays
To model holidays or other special events, create a data frame with two columns, holiday and ds (the datestamp), and one row for each occurrence of each holiday.
You can also add lower_window and upper_window columns to extend a holiday to the interval [lower_window, upper_window] of days around the date. For example, to include Christmas Eve along with Christmas, set lower_window = -1 and upper_window = 0; to include Black Friday along with Thanksgiving, set lower_window = 0 and upper_window = 1.
For this example, we create a data frame containing the dates of all of Peyton Manning's playoff appearances.
The Super Bowl dates are recorded both as playoff games and as Super Bowl games, so the Super Bowl effect is added on top of the playoff effect.
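The holiday table described above could be built roughly as follows, using base R only. The dates and window values follow the example in Prophet's documentation and should be treated as illustrative.

```r
# Playoff appearances; the Super Bowl dates are included here too.
playoffs <- data.frame(
  holiday = "playoff",
  ds = as.Date(c("2008-01-13", "2009-01-03", "2010-01-16",
                 "2010-01-24", "2010-02-07", "2011-01-08",
                 "2013-01-12", "2014-01-12", "2014-01-19",
                 "2014-02-02", "2015-01-11", "2016-01-17",
                 "2016-01-24", "2016-02-07")),
  lower_window = 0,  # the effect starts on the game day
  upper_window = 1,  # and extends to the following day
  stringsAsFactors = FALSE
)

# Super Bowl appearances, listed again under their own holiday name.
superbowls <- data.frame(
  holiday = "superbowl",
  ds = as.Date(c("2010-02-07", "2014-02-02", "2016-02-07")),
  lower_window = 0,
  upper_window = 1,
  stringsAsFactors = FALSE
)

# One table with a row per holiday occurrence.
holidays <- rbind(playoffs, superbowls)
```

Future occurrences (here 2016-02-07, which falls after the end of the history) belong in the table as well, so that the effect can be applied in the forecast period.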
Once this data frame is created, pass it to prophet through the holidays argument so that holiday effects are included in the forecast.
The holiday effects then appear as columns in the forecast data frame.
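A sketch of fitting with holidays and reading the effect back from the forecast is shown below. To keep it short, only the Super Bowl rows are used, and the series is synthetic with an artificial spike on those dates; both choices are assumptions made for illustration.

```r
library(prophet)

superbowl_dates <- as.Date(c("2010-02-07", "2014-02-02", "2016-02-07"))

# Synthetic placeholder series over the article's date range, with an
# artificial spike of +1 on the Super Bowl dates that fall in the history.
set.seed(42)
ds <- seq(as.Date("2007-12-10"), as.Date("2016-01-20"), by = "day")
y <- 8 + rnorm(length(ds), sd = 0.1)
y[ds %in% superbowl_dates] <- y[ds %in% superbowl_dates] + 1
df <- data.frame(ds = ds, y = y)

holidays <- data.frame(
  holiday = "superbowl",
  ds = superbowl_dates,
  lower_window = 0,
  upper_window = 1
)

m <- prophet(df, holidays = holidays)
forecast <- predict(m, make_future_dataframe(m, periods = 365))

# The forecast has one column per holiday name; keep rows where it is active.
effects <- forecast[forecast$superbowl != 0, c("ds", "superbowl")]
effects
prophet_plot_components(m, forecast)
```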
Holiday effects also show up in the components plot: there is a spike around the playoff dates and a more pronounced spike on the Super Bowl dates.
2. Prior scale for holidays and seasonality
If the holiday effects turn out to be overfit, you can smooth them by lowering their prior scale with the holidays.prior.scale argument, which is 10 by default.
With a smaller prior scale, the magnitude of the holiday effects shrinks, especially for the Super Bowl, which has the fewest observations. Similarly, the seasonality.prior.scale argument adjusts how closely the model fits seasonality.
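One way to see the damping is to fit the same data with the default prior scale and with a much smaller one, then compare the size of the estimated holiday effect. The value 0.05, the single holiday, and the synthetic data with an artificial spike are all assumptions for illustration.

```r
library(prophet)

superbowl_dates <- as.Date(c("2010-02-07", "2014-02-02", "2016-02-07"))

# Synthetic placeholder series with an artificial spike on two event dates.
set.seed(42)
ds <- seq(as.Date("2007-12-10"), as.Date("2016-01-20"), by = "day")
y <- 8 + rnorm(length(ds), sd = 0.1)
y[ds %in% superbowl_dates] <- y[ds %in% superbowl_dates] + 1
df <- data.frame(ds = ds, y = y)

holidays <- data.frame(holiday = "superbowl", ds = superbowl_dates,
                       lower_window = 0, upper_window = 1)

m_default <- prophet(df, holidays = holidays)  # holidays.prior.scale = 10
m_damped <- prophet(df, holidays = holidays, holidays.prior.scale = 0.05)

# Largest absolute holiday effect estimated over the history.
max_effect <- function(m) max(abs(predict(m, df)$superbowl))
c(default = max_effect(m_default), damped = max_effect(m_damped))
```

The seasonality.prior.scale argument is passed to prophet in the same way.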
V. Prediction interval
In forecasting, uncertainty arises primarily from three components: uncertainty in trends, uncertainty in estimates of seasonal effects, and noise effects in observations.
1. Uncertainty in Trends
The largest source of uncertainty in a forecast is uncertainty about future trend changes. Prophet assumes that the future will see trend changes similar to those in the history. Specifically, it assumes that the average frequency and magnitude of future trend changes match those observed in the historical data, projects the trend forward on that basis, and computes the prediction intervals from the result.
One property of this way of measuring uncertainty is that allowing more flexibility in the rate of change (by increasing changepoint.prior.scale) increases forecast uncertainty. If the model fits more rate changes in the history, it expects more of them in the future, which makes the width of the prediction interval a useful indicator of overfitting.
The width of the prediction interval (80% by default) is controlled by the interval.width argument.
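For instance, a 95% interval could be requested as in the sketch below. The value 0.95 and the synthetic data are assumptions for illustration.

```r
library(prophet)

# Synthetic placeholder series (an assumption, not the article's data).
set.seed(42)
ds <- seq(as.Date("2013-01-01"), by = "day", length.out = 1096)
i <- seq_along(ds)
df <- data.frame(ds = ds, y = 8 + 0.0005 * i + rnorm(length(i), sd = 0.1))

# Ask for 95% prediction intervals instead of the default 80%.
m <- prophet(df, interval.width = 0.95)
future <- make_future_dataframe(m, periods = 365)
forecast <- predict(m, future)

# yhat_lower and yhat_upper now bound the 95% interval.
tail(forecast[c("ds", "yhat_lower", "yhat", "yhat_upper")])
```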
2. Uncertainty in Seasonal Effects
By default, Prophet returns only the uncertainty in the trend and in the observation noise. To get uncertainty in the seasonal effects, you must use full Bayesian sampling, which is enabled by setting the mcmc.samples argument (0 by default) to a positive value.
Doing so replaces maximum a posteriori (MAP) estimation with Markov chain Monte Carlo (MCMC) sampling, and increases the computation time from roughly 10 seconds to roughly 10 minutes. With full sampling, the uncertainty of the seasonal effects is visible in the components plot.
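A sketch of enabling full sampling is shown below. The sample count of 300 and the synthetic data are assumptions for illustration, and the fit may take several minutes.

```r
library(prophet)

# Synthetic placeholder series (an assumption, not the article's data).
set.seed(42)
ds <- seq(as.Date("2013-01-01"), by = "day", length.out = 1096)
i <- seq_along(ds)
df <- data.frame(
  ds = ds,
  y = 8 + 0.3 * sin(2 * pi * i / 365.25) + rnorm(length(i), sd = 0.1)
)

# A positive mcmc.samples switches the fit from MAP estimation to MCMC.
m <- prophet(df, mcmc.samples = 300)
future <- make_future_dataframe(m, periods = 365)
forecast <- predict(m, future)

# Seasonal components now carry their own lower and upper bounds.
head(forecast[c("ds", "weekly", "weekly_lower", "weekly_upper")])
prophet_plot_components(m, forecast)
```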
VI. Outliers
Here we again forecast the log number of visits to the Wikipedia page for R used earlier, but this time the series contains gaps and bad data.
The trend forecast looks reasonable, but the estimated prediction intervals are far too wide.
The best way to handle outliers is to remove them, since Prophet can handle missing data. If a row in the history has a missing value (NA) but its date is still present in the future data frame, Prophet still returns a prediction for that row.
In this example the outliers affect the uncertainty estimate but not the main forecast yhat. That is not always the case. Next, consider the same dataset with some additional outliers.
A group of outliers in June 2015 distorts the seasonality estimate, so their effect carries over into all future predictions. Again, the best solution is to remove these outliers.
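Removing outliers amounts to setting their y values to NA before fitting. The sketch below does this for June 2015 on a synthetic series into which artificial outliers were injected; the data and the size of the outliers are assumptions for illustration.

```r
library(prophet)

# Synthetic placeholder series with artificial outliers in June 2015.
set.seed(42)
ds <- seq(as.Date("2013-01-01"), by = "day", length.out = 1096)
i <- seq_along(ds)
y <- 8 + 0.3 * sin(2 * pi * i / 365.25) + rnorm(length(i), sd = 0.1)
bad <- ds > as.Date("2015-06-01") & ds < as.Date("2015-06-30")
y[bad] <- y[bad] + 5
df <- data.frame(ds = ds, y = y)

# Set the outlying values to NA; the dates themselves stay in the data frame.
outliers <- as.Date(df$ds) > as.Date("2015-06-01") &
  as.Date(df$ds) < as.Date("2015-06-30")
df$y[outliers] <- NA

m <- prophet(df)
future <- make_future_dataframe(m, periods = 365)
forecast <- predict(m, future)  # yhat is still produced for the NA dates
plot(m, forecast)
```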
VII. Non-daily data
Prophet does not require daily data, but trying to forecast daily values or fit seasonal effects from non-daily data often gives strange results. The following example uses U.S. retail sales data to forecast the next 10 years.
The forecast looks erratic precisely because this dataset is monthly. When the yearly seasonality is fit, data exist only for the first day of each month, so the seasonal component for the remaining days cannot be identified and is overfit. When fitting Prophet to monthly data, make only monthly forecasts, which you can do by passing the freq argument to make_future_dataframe.
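A monthly forecast along these lines might be set up as follows. The monthly series is synthetic (an assumption standing in for the retail sales data), and 120 periods corresponds to the 10-year horizon mentioned above.

```r
library(prophet)

# Synthetic monthly placeholder series (an assumption, not the retail data).
set.seed(42)
ds <- seq(as.Date("1992-01-01"), by = "month", length.out = 288)
i <- seq_along(ds)
df <- data.frame(
  ds = ds,
  y = 100 + 0.5 * i + 10 * sin(2 * pi * i / 12) + rnorm(length(i), sd = 2)
)

m <- prophet(df)

# freq = "month" yields one future row per month rather than per day.
future <- make_future_dataframe(m, periods = 120, freq = "month")
forecast <- predict(m, future)
plot(m, forecast)
```

Without the freq argument, make_future_dataframe defaults to daily steps, which is what produces the erratic forecast described above.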
That concludes this brief introduction to using the open-source forecasting tool Prophet in R.
© 2024 shulou.com SLNews company. All rights reserved.