How to Build Profitable Trading Strategies

A Quantitative Trading Model Development Guide

Quantitative Trading

Anyone can Build an AI Stock Trading Bot for Free, but not everyone can build a profitable one. In my previous article I discussed in detail the mathematics and development of AI models built specifically for stock trading bots. The terms trading model, trading strategy, algorithmic trading model, and others are used synonymously in quantitative trading; they all refer to a system that makes decisions about an investment security. Trading models fall into categories that include high-frequency trading (HFT), AI, and machine learning.

In this article I would like to discuss the design process behind good trading models and the tools we can use to check that our models are profitable before deployment. As in my previous articles, we will use Python and its accompanying libraries to develop a pipeline to design, test, deploy, and monitor these AI trading models. Having a pipeline allows for continuous development of new trading models. Your pipeline is data driven, so evaluate the quality of the data you feed it often. There are also a number of idiosyncrasies to bear in mind when designing and backtesting a trading model, which we will cover in this article.

Theory

The most important part of your trading model is theory. What is your stock trading bot going to take advantage of that will reward you with a profit? The edge can come from anything with a rational economic explanation. I worked with a professor who developed a trading model to capitalize on work efficiency during the week of the Super Bowl. He developed a hypothesis, researched and backtested it on collected data, and deployed it in a live system. There are, however, limits to your economic reasoning, set by one of the more controversial theories in finance: the Efficient Market Hypothesis.

The Efficient Market Hypothesis

Weak Form — Historic market price is already priced in
Semi-Strong Form — All public information is already priced in
Strong Form — All public and private information is already priced in

Developing a Trading Strategy

The different levels of the Efficient Market Hypothesis determine whether or not certain types of trading strategies are viable. If you believe in the weak form, you discard technical analysis as a way to trade, because you accept that historical market prices are already priced in. If you believe in the semi-strong form, you accept that all public information (most likely financial statements) has been priced in. If you believe in the strong form, you are saying that all information, public and private, is already priced in. The form you believe in determines which trading strategies you can develop and how you develop them.

I tend to disregard this hypothesis, so how do I develop profitable trading strategies? I do it by productively forecasting information that directly affects the asset I am trading. It is in the same vein as using satellite images to count the cars in a Walmart parking lot and building your own dataset that correlates the number of cars with the company’s performance. With this custom dataset you are trading on information that nobody else has, which lets you use your custom feature as a signal.

It is absolutely critical to keep in mind that trading models have a lifecycle. I have deployed trading models that returned over 20% in a week but incurred a 2% loss every week after that. Don’t backtest only for profitability; also backtest for drawdown (covered in the performance metrics section) to determine the maximum loss you are willing to incur before terminating your system. For the purpose of this article, consider the following trading model, built on the idea that profitable trades are executed when an anomaly is identified in a financial dataset.
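As a minimal sketch of the anomaly idea, the snippet below flags observations that sit unusually far from their recent history. A rolling z-score is used here purely as an illustrative stand-in; it is an assumption, not necessarily the detector behind this article, and the names `rolling_zscores`, `find_anomalies`, `window`, and `threshold` are hypothetical.

```python
from statistics import mean, stdev


def rolling_zscores(values, window=20):
    """Z-score of each value against the `window` values that precede it.

    Assumption: "anomaly" means "far from the recent mean, measured in
    standard deviations". Entries without enough history are None.
    """
    scores = []
    for i, value in enumerate(values):
        history = values[max(0, i - window):i]
        if len(history) < window:
            scores.append(None)  # not enough history yet
            continue
        mu, sigma = mean(history), stdev(history)
        scores.append((value - mu) / sigma if sigma > 0 else 0.0)
    return scores


def find_anomalies(values, window=20, threshold=3.0):
    """Indices whose absolute z-score exceeds `threshold`."""
    return [
        i
        for i, z in enumerate(rolling_zscores(values, window))
        if z is not None and abs(z) > threshold
    ]
```

The input can be any numeric feature (prices, returns, or a custom dataset such as the parking-lot counts mentioned above). Only past values enter each window, so the score for a given bar never uses information from the future.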

Model

Now that we have built a trading model, we can implement it as a trading strategy using backtrader’s Strategy class. Though it looks complex, I have done my best to comment sections of the code to explain what is going on. The goal is to make this an online unsupervised learning model that continually adds data to the dataset used for anomaly detection.
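The online part can be sketched independently of any framework. The class below is a hypothetical helper (the name `OnlineAnomalyDetector` and its parameters are assumptions): it scores each new observation against everything seen so far and then folds that observation into its running statistics, using Welford's algorithm so that no history needs to be stored.

```python
import math


class OnlineAnomalyDetector:
    """Hypothetical online detector: score first, then learn from the point."""

    def __init__(self, threshold=3.0, warmup=30):
        self.threshold = threshold  # z-score magnitude that counts as anomalous
        self.warmup = warmup        # observations required before flagging
        self.n = 0                  # number of observations seen
        self.mean = 0.0             # running mean
        self.m2 = 0.0               # running sum of squared deviations

    def update(self, x):
        """Return True if x is anomalous, then add x to the dataset."""
        is_anomaly = False
        if self.n >= self.warmup:
            std = math.sqrt(self.m2 / (self.n - 1))
            is_anomaly = std > 0 and abs(x - self.mean) / std > self.threshold

        # Welford's update of the running mean and squared deviations.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return is_anomaly
```

Inside a backtrader `Strategy`, one way to use this would be to create the detector in `__init__` and call `update` from `next()` with a feature derived from the current bar (for example, the return computed from `self.data.close[0]` and `self.data.close[-1]`), then place an order with `self.buy()` or exit with `self.close()` when it returns True. Which direction to trade on an anomaly is a design decision that this sketch leaves open.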

Strategy Implementation

With this, let's discuss the logistics of backtesting in Python.

Backtesting

Backtesting is the key component of any profitable trading operation. The stars must align in your backtests so that there is almost no difference between executing trades in your testing environment and executing them with a brokerage house. The interval at which you trade, the data you feed your model, and the type of order you execute all play a major role in the backtest itself. Most importantly, how are you evaluating the quality of your trading model in the backtest? What metrics will you use to confirm that a change to your trading model is actually an improvement? How will you make sure that tuning model parameters improves the model without overfitting it to historical data? All of these are critical questions in the process of developing a profitable trading system, and my goal is to offer a solution to each.

Train Test Splits

Something I take from deep learning is creating training and testing splits for your backtests. Just as you don’t want to overfit your neural network to seen data (which leads to poor performance on new, unseen data), we don’t want to overfit our trading model to the backtest, or it will perform poorly on new, unseen live data. The goal of a train-test split is to separate the data into three sets: training, validation, and testing. The purpose of the training set is to optimize the parameters of your trading strategy to yield the highest performance metric values. The purpose of the validation set is to see how the optimized parameters perform on unseen data, which helps ensure that we are not overfitting our model to the training set while optimizing model parameters. After optimizing on the training set and backtesting against the validation set to verify that the model is still productive, it’s time to see how the model performs on the testing set. Consider the following function I created, which is essentially the backtesting engine that lets us incorporate our custom trading models.
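The split-and-optimize workflow described above can be sketched as follows. This is a hedged outline rather than a full backtesting engine: `chronological_split`, `select_parameters`, the 60/20/20 proportions, and the `score` callback (any function that runs a backtest on a slice of data with given parameters and returns a metric where higher is better) are all assumptions made for illustration.

```python
def chronological_split(bars, train=0.6, validation=0.2):
    """Split time-ordered data into train, validation, and test segments.

    The data is never shuffled: the test segment is always the most recent,
    so the model is evaluated on data that comes after what it was tuned on.
    """
    n = len(bars)
    train_end = round(n * train)
    validation_end = train_end + round(n * validation)
    return bars[:train_end], bars[train_end:validation_end], bars[validation_end:]


def select_parameters(bars, candidates, score):
    """Optimize on the training set, then report the other two sets.

    `score(data, params)` is a hypothetical callback that backtests the
    strategy on `data` and returns a performance metric.
    """
    train, validation, test = chronological_split(bars)
    best = max(candidates, key=lambda params: score(train, params))
    return {
        "params": best,
        "train": score(train, best),
        "validation": score(validation, best),
        "test": score(test, best),
    }
```

A large gap between the training score and the validation score is the warning sign for overfitting. The test score should be looked at only once, after the parameters are final.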

Final Implementation

Performance Metrics

The discussion of performance metrics is quite succinct due to their quantitative nature. I want to offer a few different performance metrics for evaluating your trading model.

Profit and Loss (P/L)

I debated whether I should even include this as a performance metric, since it’s obviously the first thing to check after a backtest. Nevertheless, here we are. The profit and loss of a system should be scrutinized. It’s very easy to bias a backtest by optimizing position size over the life of the backtest. Take the backtest’s initial P/L with a grain of salt and get ready for further system analysis.

Drawdown

This is how I determine whether or not I should retire a system. Drawdown is quoted as the percentage decline from a peak to the following trough. For example, consider a system that has an initial capital investment of $100,000. After executing trades in a backtest, it realizes a loss of $10,000 before climbing back to the initial $100,000. The system has a drawdown of 10%, and if that was the largest drawdown, it would be considered the maximum drawdown. To decide whether or not to retire a system, I use the average drawdown over the life of the backtest. If a deployed system meets or exceeds that average drawdown, I terminate the system.
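The drawdown arithmetic from the example above can be sketched in a few lines. The function names are hypothetical, and `average_drawdown` takes one possible reading of "average drawdown" (the mean of the per-bar drawdown series); other definitions, such as the mean depth of each separate drawdown episode, are equally plausible.

```python
from statistics import mean


def drawdown_series(equity):
    """Percentage decline from the running peak at every point of the curve."""
    peak = equity[0]
    series = []
    for value in equity:
        peak = max(peak, value)
        series.append((peak - value) * 100.0 / peak)
    return series


def max_drawdown(equity):
    """Largest peak-to-trough decline, in percent."""
    return max(drawdown_series(equity))


def average_drawdown(equity):
    """Assumption: average drawdown is the mean of the per-bar series."""
    return mean(drawdown_series(equity))


def should_retire(backtest_average, live_equity):
    """True when the live system's current drawdown meets or exceeds the
    average drawdown measured in the backtest."""
    return drawdown_series(live_equity)[-1] >= backtest_average


# The example from the text: $100,000 falls to $90,000 and then recovers.
equity = [100_000, 95_000, 90_000, 100_000]
print(max_drawdown(equity))
```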

Sharpe Ratio

Named after William Sharpe, a contributor to any finance academic’s favorite model, CAPM, the Sharpe ratio measures excess return per unit of risk. The ratio takes the system return less the risk-free rate for the period and divides the difference by the standard deviation of the system’s returns. In some cases, high-frequency traders are capable of reaching double-digit Sharpe ratios. In general, any system with a backtested Sharpe ratio over 1 is worth exploring.
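As a rough sketch of that calculation, the function below works on a list of per-period returns. The annualization by the square root of `periods_per_year` (252 trading days by default) and the conversion of an annual `risk_free_rate` to a per-period rate by simple division are common conventions assumed here, not something this article prescribes.

```python
import math
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from per-period returns.

    `risk_free_rate` is an annual rate; it is spread evenly over the
    periods (an assumption made for simplicity).
    """
    per_period_rf = risk_free_rate / periods_per_year
    excess = [r - per_period_rf for r in returns]
    sigma = stdev(excess)
    if sigma == 0:
        raise ValueError("returns have zero variance; Sharpe ratio is undefined")
    return mean(excess) / sigma * math.sqrt(periods_per_year)
```

Because the result scales with the square root of the number of periods per year, ratios computed on different bar intervals are only comparable after annualizing them the same way.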

System Quality Number

Van Tharp’s System Quality Number (SQN) gives us insight into the risk-to-reward ratio while taking the number of trades into account. This gives a quantitative measure of risk exposure in terms of how many trades were executed over the life of the backtest.

Score: 1.6–1.9 Below average, but trade-able

Score: 2.0–2.4 Average

Score: 2.5–2.9 Good

Score: 3.0–5.0 Excellent

Score: 5.1–6.9 Superb

Score: 7.0+ Keep this up, and you may have the Holy Grail.
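The number itself is usually computed as the square root of the trade count multiplied by the mean trade result and divided by the standard deviation of the trade results. A minimal sketch, assuming `trade_results` is a list of per-trade outcomes (profit and loss, or R-multiples) and using a hypothetical function name:

```python
import math
from statistics import mean, stdev


def system_quality_number(trade_results):
    """SQN = sqrt(N) * mean(trade results) / stdev(trade results)."""
    n = len(trade_results)
    if n < 2:
        raise ValueError("at least two trades are needed to compute the SQN")
    sigma = stdev(trade_results)
    if sigma == 0:
        raise ValueError("trade results have zero variance; SQN is undefined")
    return math.sqrt(n) * mean(trade_results) / sigma
```

Since the score grows with the square root of the trade count, a backtest with only a handful of trades produces an unreliable number, which is worth remembering when comparing systems against the scale above.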

Conclusion

The purpose of this article was to lay the groundwork for developing profitable trading bots. We discussed the theory of the Efficient Market Hypothesis, its various forms, and the consequences of each form. Then we talked about quantitative trading, specifically trading model development. Which form of the Efficient Market Hypothesis you believe in determines the type of trading models you can productively develop.