Welcome to +74 new followers over the last 30 days and +17 Substack authors who recommend my newsletter.
For newcomers, I focus on unique investment ideas supported by novel research.
Discover more in our subscriber archive of over 150 newsletters.
Many of my friends wonder where I find investment ideas.
One of my preferred sources for investment ideas, as detailed in my post 'My go-to resource for investment ideas,' is the Good Judgment Open website (GJ Open).
GJ Open is a crowd forecasting platform where users can refine their forecasting skills, stay informed about global events, and interact with fellow forecasters.
Participants can predict the likelihood of future events and evaluate their accuracy against the crowd's performance.
This week, I ranked in the top 10 (89th percentile) on the Forecasting Leaderboard by projecting the annualized rate of US housing starts for February 2024. (Lower Brier scores are better.)
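For readers new to Brier scores, here is a minimal sketch of the classic multi-category formula: the sum of squared differences between the forecast probabilities and what actually happened. The three bins and the probabilities are hypothetical, and GJ Open's exact scoring for ordered bins may differ from this textbook version.

```python
def brier_score(forecast_probs, outcome_index):
    """Multi-category Brier score: 0.0 is perfect, 2.0 is the worst possible."""
    if abs(sum(forecast_probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(
        (p - (1.0 if i == outcome_index else 0.0)) ** 2
        for i, p in enumerate(forecast_probs)
    )

# Hypothetical three-bin question where the middle bin (index 1) occurred.
confident = [0.1, 0.7, 0.2]   # most weight on the bin that happened
hedged = [1 / 3, 1 / 3, 1 / 3]  # equal weight on every bin

print(round(brier_score(confident, outcome_index=1), 4))  # 0.14
print(round(brier_score(hedged, outcome_index=1), 4))     # 0.6667
```

The confident forecast scores lower (better) because more probability sat on the bin that actually occurred.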
In December, I used ChatGPT to create an AI-assisted autoregressive integrated moving average (ARIMA) forecast.
ARIMA projection: 1,452 units
95% Confidence Interval: [1184, 1719]
Actual: 1,521 units
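As a quick sanity check on those three numbers, the sketch below computes the forecast miss and confirms the actual value landed inside the 95% interval. It only uses the figures quoted above; the variable names are my own.

```python
# Figures as reported in this post.
forecast = 1452
lower, upper = 1184, 1719
actual = 1521

error = actual - forecast            # how far the point forecast missed
pct_error = error / actual * 100     # miss as a share of the actual value
within_interval = lower <= actual <= upper

print(error)                # 69
print(round(pct_error, 1))  # 4.5
print(within_interval)      # True
```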
Many believe ChatGPT is limited to text tasks, but I'll show you step by step how I used AI to forecast this question, along with the prompts I used to make my projection.
Identify Data Used to Determine Forecast
Clean Data and Add Column Headings
Upload data to ChatGPT | Initiate first prompt
Seasonality and Trends
Model Specification
Specificity Tests
Forecast Results
If you have the paid version of ChatGPT, you can access my prompts directly via the link HERE.
Question: What will be the annualized rate of US housing starts for February 2024?
Step 1: Identify Data Used to Determine Forecast
Every GJ Open question has benchmark data to reference.
Luckily, the data for this question was pulled from the FRED website.
The New Privately-Owned Housing Units Started series has a historical dataset that goes all the way back to the 1950s, so I knew I would have a large sample size to work with.
Step 2: Clean Data and Add Column Headings
I downloaded the data from the FRED website as a CSV file [New Privately-Owned Housing Units Started: Total Units (HOUST)] and cleaned it by removing blank rows and adding column headings.
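If you would rather do the cleaning in code than by hand, a minimal pandas sketch might look like the one below. The rows are placeholders, not real HOUST observations, the `DATE,HOUST` header is an assumption about the download format, and I am assuming Monthly Change means the simple month-over-month difference.

```python
import io
import pandas as pd

# Placeholder rows standing in for the FRED download (illustrative values only).
raw_csv = """DATE,HOUST
2000-01-01,1000

2000-02-01,1050
2000-03-01,
2000-04-01,1020
"""

def load_housing_starts(source):
    """Read the CSV, drop blank/empty rows, and add readable column headings."""
    df = pd.read_csv(source, skip_blank_lines=True)
    df.columns = ["Observation Date", "Housing Starts"]
    df["Observation Date"] = pd.to_datetime(df["Observation Date"])
    df["Housing Starts"] = pd.to_numeric(df["Housing Starts"], errors="coerce")
    # Remove rows with no usable value, then rebuild a clean index.
    df = df.dropna(subset=["Housing Starts"]).reset_index(drop=True)
    # Assumption: Monthly Change is the difference from the previous row.
    df["Monthly Change"] = df["Housing Starts"].diff()
    return df

df = load_housing_starts(io.StringIO(raw_csv))
print(df)
```

To use it on a real download, pass the file path instead of the `io.StringIO` wrapper.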
Step 3: Upload data to ChatGPT | Initiate first prompt
I then uploaded my csv file to ChatGPT and initiated my first prompt:
<You are my Data Analyst. Observation Date, Housing Starts, Monthly Change. Can you set the parameters of the distribution? Is it normal?>
This prompt confirms the column names and asks whether the data are normally distributed. Knowing whether we are working with a normal distribution is crucial for drawing accurate statistical inferences from the data.
The prompt's flexibility allows ChatGPT to test non-normal distributions, speeding up the process of obtaining relevant information.
ChatGPT had to work through the time series for the monthly change data, so technically its March forecast period would be our February target. This was a minor issue. The prompt produced a clean bell-shaped histogram along with summary statistics and normality test results:
The analysis of the distribution of the Housing Starts and Monthly Change data yielded the following results:
Housing Starts
Skewness: 0.36, indicating a slight skew to the right.
Kurtosis: -0.53, suggesting a distribution that is less peaked than a normal distribution.
Normality Test P-Value: 0.094, which is not small enough to reject the hypothesis of normality at a typical alpha level (e.g., 0.05).
Monthly Change
Skewness: -0.16, showing a slight skew to the left.
Kurtosis: 0.29, indicating a slightly more peaked distribution than normal.
Normality Test P-Value: 0.49, which is much higher than 0.05, suggesting that we cannot reject the hypothesis of normality.
From these results, both Housing Starts and Monthly Change distributions do not show strong deviations from normality. The skewness and kurtosis values are relatively close to those of a normal distribution (skewness = 0, kurtosis = 0). Moreover, the normality test does not provide enough evidence to reject the hypothesis that these distributions are normal. However, it's important to note that these tests are only indicative and should be used in conjunction with other data analysis techniques for more accurate conclusions.
As you can see, this first prompt gave us a lot to work with.
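The same three statistics can be reproduced with SciPy. This is a sketch under assumptions: ChatGPT's output does not say which normality test it ran, so I use `scipy.stats.normaltest` (D'Agostino-Pearson) as a stand-in, and the sample is synthetic random data rather than the housing series.

```python
import numpy as np
from scipy import stats

def describe_distribution(values):
    """Return skewness, excess kurtosis, and a normality-test p-value."""
    values = np.asarray(values, dtype=float)
    _, p_value = stats.normaltest(values)  # assumed test; see note above
    return {
        "skewness": float(stats.skew(values)),
        # Fisher definition: a normal distribution has excess kurtosis 0.
        "excess_kurtosis": float(stats.kurtosis(values)),
        "normality_p_value": float(p_value),
    }

# Synthetic stand-in data so the example runs without the CSV.
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=1.0, size=500)

summary = describe_distribution(sample)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

A p-value above your chosen alpha (0.05 in the output above) means the test does not reject normality. It does not prove the data are normal.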
Step 4: Seasonality and Trends
My second prompt aimed to identify seasonality and trends given the 3-month gap between my forecast period (February) and the available data (November).
This lag presents a challenge compared to forecasting the next period directly, as we would be forecasting 3 months ahead.
To assess seasonality or trends, I prompted:
<Review the data and give me an instance where there were 3 positive months in a row? >
I emphasized 'review the data' because ChatGPT tends to be lazy and stop looking for specific iterations in large datasets once it finds the first instance.
As anticipated, ChatGPT identified several instances of three consecutive positive months, but they were all limited to the same year, 1994.
So I had to issue two more prompts to be comprehensive and ensure that I was covering the series all the way back to the 1950s:
<how many instances did this occur across the entire dataset?>
<and how many instances total?>
Now I had specific figures on directionality across the entire dataset.
I implicitly had a bullish bias given our positive macro view in December. Because runs of positive months were common, I felt confident there were enough occurrences that I could not rule out 3 consecutive positive months during positive economic times.
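Counting these runs is easy to verify yourself. Here is a minimal sketch that counts every window of three consecutive positive monthly changes. Overlapping windows are counted separately, which is my assumption about how to tally them, and the list of changes is made up.

```python
def count_positive_streaks(changes, length=3):
    """Count windows of `length` consecutive positive changes."""
    count = 0
    run = 0  # length of the current run of positive months
    for change in changes:
        run = run + 1 if change > 0 else 0
        if run >= length:
            count += 1  # a window of `length` positives ends at this month
    return count

# Made-up monthly changes for illustration.
changes = [5, 10, 3, 8, -2, 4, 6, 1, -7]
print(count_positive_streaks(changes))  # 3
```

Here the first four months contain two overlapping windows and months six through eight contain one more, for a total of three.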
Step 5: Model Specification
After narrowing down the directionality for the forecast period, I prompted ChatGPT with:
<if you were going to forecast the housing starts out to March 2023? so 4 months from today? how would you set that up, based on the current distribution?>
ChatGPT listed several models and then recommended the following: “Given the data and requirements, a statistical time series model like ARIMA or SARIMA might be the most straightforward choice.”
Then I prompted:
<Could you apply the ARIMA model?>
ChatGPT returned Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) plots to show the potential parameters for the ARIMA model.
Step 6: Specificity Tests
ChatGPT then conducted Ljung-Box and Jarque-Bera tests on the model residuals.
“The Ljung-Box test (Prob(Q)) suggests that the residuals are independently distributed (p = 0.76), which is a good sign. The Jarque-Bera test indicates that the residuals are not normally distributed (Prob(JB) close to 0), which might be a concern.”
Step 7: Forecast Results
Given this model, we can now proceed to forecast the housing starts for the next four months, up to March 2023. However, it's important to note that the accuracy of these forecasts can be influenced by various factors not captured in the model, such as economic changes, policy decisions, etc. Let's generate the forecast.
The forecast for housing starts for the next four months, up to March 2023, using the ARIMA(1,1,1) model is as follows:
First Month Forecast: 1491.84 units
Confidence Interval: [1285.57, 1698.10] units
Second Month Forecast: 1465.84 units
Confidence Interval: [1226.29, 1705.40] units
Third Month Forecast: 1455.93 units
Confidence Interval: [1200.16, 1711.70] units
Fourth Month Forecast (February 2024): 1452.15 units
Confidence Interval: [1184.94, 1719.35] units
These values were the predicted housing starts, with confidence intervals indicating the range within which the actual values are likely to fall with 95% confidence.
And that's it. We had our forecast to enter in the GJ Open project.
The best part?
This entire process to forecast was completed in less time than it took to write this post.
We plan to use ChatGPT for future forecasting questions.
Final Word
Thank you for reading and I am grateful and humbled to be able to learn, grow, and invest alongside you at Tuttle Ventures.
Vision, courage, and patience lead to successful investing.
Don’t forget to follow Tuttle Ventures on Twitter, LinkedIn, or Instagram.
Check out the website or some other work here.
Best,
Darin Tuttle, CFA
This is not investment advice. Do your own due diligence. Past performance is no guarantee of future results. Investing in the stock market involves gains and losses and may not be suitable for all investors. Information presented herein is subject to change without notice and should not be considered as a solicitation to buy or sell any security. Investors are encouraged to perform due diligence, consider their risk tolerance, investment goals, and consult with financial advisors before making investment decisions. I make no representation, warranty or undertaking, express or implied, as to the accuracy, reliability, completeness, or reasonableness of the information contained in this report. Any assumptions, opinions and estimates expressed in this report constitute my judgment as of the date thereof and are subject to change without notice. Any projections contained in the report are based on a number of assumptions as to market conditions. There is no guarantee that projected outcomes will be achieved.
Neither the publisher nor any of its affiliates accepts any liability whatsoever for any direct or consequential loss howsoever arising, directly or indirectly, from any use of the information contained herein.
Unless there is a signed Investment Management or Financial Planning Agreement by both parties, Tuttle Ventures is not acting as your financial advisor or in any fiduciary capacity.