The web content provides a comprehensive guide on developing profitable quantitative trading models, emphasizing the importance of theory, backtesting, and performance metrics.
Abstract
The article "How to Build Profitable Trading Strategies" serves as a detailed guide for creating quantitative trading models. It underscores the significance of a strong theoretical foundation for trading models, suggesting that profitable strategies can be developed by forecasting information that directly impacts the traded asset. The author discusses the Efficient Market Hypothesis and its implications on trading strategy viability, advocating for the creation of unique datasets to gain an edge. The process of backtesting is highlighted as crucial for ensuring model profitability, with an emphasis on avoiding overfitting through train-test splits and using performance metrics like Profit and Loss, Drawdown, Sharpe Ratio, and System Quality Number to evaluate model effectiveness. The article also provides practical Python code examples for model implementation and backtesting using libraries like backtrader.
Opinions
The author believes in the practical application of economic theories to develop trading models, as demonstrated by the example of capitalizing on work efficiency during Super Bowl week.
The Efficient Market Hypothesis is acknowledged but not strictly adhered to, as the author suggests that profitable trading strategies can be created by productively forecasting information.
Overfitting is a concern in model development, and the author emphasizes the importance of using train-test splits to prevent this issue.
The author values custom datasets and unique signals, such as satellite images of Walmart parking lots, as a means to gain an informational advantage in trading.
A trading model's lifecycle is recognized, with the understanding that even highly profitable models may have a limited useful period.
The author advocates for a rigorous backtesting process that closely simulates real-world trading conditions, including the type of orders executed.
Performance metrics are crucial for evaluating trading models, with the Sharpe Ratio and System Quality Number being particularly important for assessing risk-adjusted returns and overall system quality.
Anyone can Build an AI Stock Trading Bot for Free, but not everyone can build a profitable stock trading bot. In my previous article I discussed in detail the mathematics and development of AI models specifically for stock trading bots. The terms trading model, trading strategy, algorithmic trading model, and others are synonymous in the world of quantitative trading; they all refer to a system that makes decisions regarding an investment security. Trading models fall into categories such as HFT, AI, and machine learning. In this article I would like to discuss the design process of good trading models and the tools we can use to check that our models are profitable before deployment. Just like in my previous articles, we will be using Python and its accompanying libraries to develop a pipeline to design, test, deploy, and monitor these AI trading models. Having a pipeline allows for continuous development of new trading models. Your pipeline is data driven, so evaluate the quality of the data you feed it often. There are a number of idiosyncrasies to bear in mind when designing and backtesting a trading model, and we will cover them in this article.
Theory
The most important part of your trading model is theory. What is your stock trading bot going to take advantage of that will reward you with a profit? This can come from anything with a rational economic explanation. I worked with a professor who developed a trading model to capitalize on work efficiency during the week of the Super Bowl. He developed a hypothesis, researched and backtested it on collected data, and deployed it in a live system. There are, however, limitations to your economic reasoning, given by one of the more controversial theories in finance: the Efficient Market Hypothesis.
The Efficient Market Hypothesis
Weak Form — Historic market price is already priced in
Semi-Strong Form — All public information is already priced in
Strong Form — All public and private information is already priced in
Developing a Trading Strategy
The different levels of the Efficient Market Hypothesis determine whether or not certain types of trading strategies are viable. If you believe in the weak form, you discard the ability to use technical analysis to trade, because you accept that historic market prices are already priced in. If you believe in the semi-strong form, you accept that all public information (most likely financial statements) has been priced in. If you believe in the strong form, you are saying all information, public and private, is already priced in. Which form of the Efficient Market Hypothesis you believe will alter how you develop trading strategies and which ones you develop.
I tend to disregard this hypothesis, so how do I develop profitable trading strategies? I do it by productively forecasting information that directly affects the asset that I am trading. It is in the same vein as using satellite images to determine the number of cars in a Walmart parking lot and creating your own dataset correlating the number of cars and the company’s performance. With this custom dataset, you are trading on information that nobody else has, allowing you to use your custom feature as a signal.
It is absolutely critical to keep in mind that there is a lifecycle to trading models. I have deployed trading models that returned over 20% in a week but incurred a 2% loss every week after that. Don’t only backtest for profitability, but also for drawdown (covered in the performance metrics section), to determine the maximum loss you are willing to incur before terminating your system.
For the purpose of this article, consider the following trading model, built on the idea that profitable trades are executed when an anomaly is identified in a financial dataset.
Model
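As one illustration of the anomaly idea, here is a minimal sketch that flags a data point when it sits far from the recent rolling mean. The rolling z-score approach, the `RollingZScoreDetector` name, and the `window` and `threshold` values are all assumptions made for this example; it is a stand-in, not necessarily the model used in this article. Each new value is scored and then added to the history, which mirrors the online, unsupervised flavor described in this section.

```python
from collections import deque
from statistics import mean, stdev


class RollingZScoreDetector:
    """Flags a value as anomalous when it sits far from the recent mean.

    Hypothetical helper for illustration; `window` and `threshold`
    are assumed values, not tuned recommendations.
    """

    def __init__(self, window=20, threshold=3.0):
        self.window = window
        self.threshold = threshold
        self.history = deque(maxlen=window)  # keeps only the latest `window` values

    def update(self, value):
        """Score `value` against past data, then add it to the history."""
        is_anomaly = False
        if len(self.history) == self.window:
            mu = mean(self.history)
            sigma = stdev(self.history)
            # A zero spread makes the z-score undefined, so flag nothing.
            if sigma > 0:
                is_anomaly = abs(value - mu) / sigma > self.threshold
        self.history.append(value)
        return is_anomaly


detector = RollingZScoreDetector(window=5, threshold=3.0)
prices = [100.0, 101.0, 99.0, 100.5, 100.0, 100.2, 120.0, 100.1]
flags = [detector.update(p) for p in prices]  # only the jump to 120.0 is flagged
```

Because the score only ever uses values seen before the current one, the detector does not peek at future data, which matters once it is wired into a backtest.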
Now that we have built a trading model, we can implement it as a trading strategy using backtrader’s Strategy class. Though it looks complex, I have done my best to comment sections of the code to explain what is going on. The goal is to make this an online unsupervised learning model that continually adds data to the dataset for anomaly detection.
Strategy Implementation
With this, let's discuss the logistics of backtesting in Python.
Backtesting
Backtesting is the key component of any profitable trading operation. The stars must align in your backtests so that there is almost no difference between executing trades in your testing environment and with a brokerage house. The interval at which you are trading, the data you are feeding your model, and the type of order you are executing all play a major role in the backtest itself. Most importantly, how are you evaluating the quality of your trading model in the backtest? What metrics are you going to use to confirm that changes to your trading model are productive? How are you going to improve your model by changing its parameters without overfitting it to historical data? All of these are critical questions in the process of developing a profitable trading system, and my goal is to offer a solution to each.
Train Test Splits
Something I take from deep learning is creating training and testing splits for your backtests. Just as you don’t want to overfit a neural network to seen data (which leads to poor performance on new, unseen data), you don’t want to overfit your trading model to the backtest, or it will perform poorly on new, unseen live data. Despite the name, the goal of a train test split here is to separate the data into three sets: training, validation, and testing. The purpose of the training set is to optimize the parameters in your trading strategy to yield the highest performance metric values. The purpose of the validation set is to see how the optimized parameters perform on unseen data, which helps ensure that we are not overfitting the model to the training set while optimizing its parameters. After optimizing on the training set and backtesting against the validation set to verify the model is still productive, it's time to see how the model performs on the testing set. Consider the following function I created, which is essentially the backtesting engine that helps us incorporate our custom trading models.
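To make the three-way split concrete, here is a minimal sketch of the split step on its own; it is not the backtesting engine itself. The function name `chronological_split` and the 60/20/20 proportions are assumptions for illustration. The one design choice worth noting is that price data is split in time order rather than shuffled, so the validation and test sets always come after the data used for tuning.

```python
def chronological_split(rows, train_frac=0.6, val_frac=0.2):
    """Split time-ordered rows into train, validation, and test segments.

    No shuffling: each later segment contains only data that comes after
    the earlier one, so parameters are never tuned on future prices.
    The default fractions are assumed values for illustration.
    """
    if not 0 < train_frac < 1 or not 0 < val_frac < 1 or train_frac + val_frac >= 1:
        raise ValueError("fractions must be positive and leave room for a test set")
    n = len(rows)
    train_end = int(n * train_frac)
    val_end = train_end + int(n * val_frac)
    return rows[:train_end], rows[train_end:val_end], rows[val_end:]


bars = list(range(100))  # stand-in for 100 time-ordered price bars
train, val, test = chronological_split(bars)
```

In practice each segment would be passed to a separate backtest run: tune parameters on `train`, compare candidates on `val`, and touch `test` only once at the end.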
Final Implementation
Performance Metrics
The discussion of performance metrics is quite succinct due to their quantitative nature. I want to offer a few different performance metrics for evaluating your trading model.
Profit and Loss (P/L)
I debated whether I should even include this as a performance metric, since it’s obviously the first thing to check after a backtest; nevertheless, here we are. The profit and loss of a system should be scrutinized. It’s very easy to bias a backtest by optimizing position size over the life of the backtest. Take the backtest’s initial P/L with a grain of salt and get ready for further system analysis.
Drawdown
This is how I determine whether or not I should retire a system. Drawdown is quoted as the percentage decline from a peak in account value to the following trough. For example, consider a system that has an initial capital investment of $100,000. After executing trades in a backtest, it loses $10,000, falling to $90,000, before climbing back to the initial $100,000. The system has a drawdown of 10%, and if that was the largest drawdown in the backtest it is the maximum drawdown. To decide whether to retire a system, I use the average drawdown over the life of the backtest as the threshold. If a deployed system meets or exceeds that average drawdown, I terminate the system.
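The arithmetic above can be sketched in a few lines of Python. The function names are hypothetical, and the definition of "average drawdown" used here (the mean depth over the steps where equity is below its running peak) is an assumption, since several definitions are in use. The equity curve reuses the $100,000 example, with a couple of intermediate values added for illustration.

```python
def drawdowns(equity):
    """Return the drawdown at each step as a fraction of the running peak."""
    peak = equity[0]
    series = []
    for value in equity:
        peak = max(peak, value)
        series.append((peak - value) / peak)
    return series


def max_drawdown(equity):
    """Largest peak-to-trough decline over the whole curve."""
    return max(drawdowns(equity))


def average_drawdown(equity):
    """Mean depth over the steps spent below the peak (assumed definition)."""
    below_peak = [d for d in drawdowns(equity) if d > 0]
    return sum(below_peak) / len(below_peak) if below_peak else 0.0


def should_retire(live_equity, backtest_average_drawdown):
    """Retire once the live system's current drawdown reaches the backtest average."""
    return drawdowns(live_equity)[-1] >= backtest_average_drawdown


# Falls from $100,000 to $90,000, then recovers to $100,000.
backtest_equity = [100_000, 95_000, 90_000, 96_000, 100_000]
worst = max_drawdown(backtest_equity)        # 0.10, i.e. 10%
typical = average_drawdown(backtest_equity)  # mean of 5%, 10%, and 4%
```

A live equity curve sitting 8% below its peak would then trigger `should_retire`, while one sitting 2% below would not.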
Sharpe Ratio
Named after William Sharpe, a contributor to any finance academic’s favorite model, CAPM, the Sharpe Ratio measures excess return per unit of risk. The ratio takes the system return less the risk-free rate for the period and divides the difference by the standard deviation of the system's returns. In some cases, high-frequency traders are capable of reaching double-digit Sharpe Ratios. In general, any system with a backtested Sharpe Ratio over 1 is worth exploring.
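As a rough sketch of that calculation, the function below computes the ratio from a list of per-period returns. The name `sharpe_ratio`, the sample returns, the zero default for the risk-free rate, and the annualization by the square root of 252 trading days are assumptions for illustration; with a different bar interval the `periods_per_year` value would need to change.

```python
from math import sqrt
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from per-period returns.

    `risk_free_rate` is expressed per period; 252 assumes daily returns
    on trading days. Uses the sample standard deviation.
    """
    excess = [r - risk_free_rate for r in returns]
    spread = stdev(excess)
    if spread == 0:
        raise ValueError("returns have zero variance; Sharpe ratio is undefined")
    return mean(excess) / spread * sqrt(periods_per_year)


daily_returns = [0.01, -0.005, 0.007, 0.002, -0.003, 0.004]  # made-up sample
ratio = sharpe_ratio(daily_returns)
```

A handful of returns like this gives a very noisy estimate, so the figure is only meaningful when computed over the full life of a backtest.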
System Quality Number
Van Tharp’s System Quality Number gives us insight into the risk-to-reward ratio while taking into account the number of trades. It relates the average trade result to the variability of those results and scales by the number of trades executed over the life of the backtest, so a system is scored on its consistency as well as the size of its returns.
Score: 1.6–1.9 Below average, but trade-able
Score: 2.0–2.4 Average
Score: 2.5–2.9 Good
Score: 3.0–5.0 Excellent
Score: 5.1–6.9 Superb
Score: 7.0+ Keep this up, and you may have the Holy Grail.
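For reference, the score is commonly computed as the square root of the number of trades, multiplied by the mean trade result and divided by the standard deviation of the trade results. A minimal sketch under that formula follows; the function name and the sample trade results are made up for illustration, and the results are assumed to be expressed in R-multiples (profit or loss divided by the initial risk on the trade).

```python
from math import sqrt
from statistics import mean, stdev


def system_quality_number(trade_results):
    """SQN = sqrt(number of trades) * mean(result) / stdev(result).

    `trade_results` holds one profit-or-loss figure per closed trade,
    assumed here to be in R-multiples.
    """
    if len(trade_results) < 2:
        raise ValueError("need at least two trades to estimate the spread")
    spread = stdev(trade_results)
    if spread == 0:
        raise ValueError("identical trade results; SQN is undefined")
    return sqrt(len(trade_results)) * mean(trade_results) / spread


r_multiples = [1.5, -1.0, 2.0, -0.5, 1.0, -1.0, 2.5, 0.5, -1.0]  # made-up sample
sqn = system_quality_number(r_multiples)
```

Because of the square-root term, a system that keeps the same average result and variability across more trades earns a higher score, which is why the trade count matters when reading the table above.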
Conclusion
The purpose of this article was to lay the groundwork for developing profitable trading bots. We discussed the Efficient Market Hypothesis, its various forms, and their consequences. Then we talked about quantitative trading, specifically trading model development. Which form of the Efficient Market Hypothesis you believe determines the type of trading models you can productively develop.