The article discusses the results of backtesting a trading strategy based purely on financial news sentiment using Python.
Abstract
The article explores the impact of financial news on stock price movements and the potential of trading stocks based solely on news sentiment. The author uses Python to backtest a trading strategy that determines trade positions based on the sentiment of the previous day's financial news. The author retrieves news and price data for a selected stock ticker using the eod library and filters out irrelevant news. The sentiment scores for each article's headline are extracted using TextBlob, and trade positions are determined based on the given sentiment score and threshold. The backtest results are compared to Buy & Hold and random choice strategies, and the results are visualized for AMD and TSLA stocks. The sentiment strategy performs better than random choices but does not beat the Buy & Hold strategy.
Opinions
Trading stocks based solely on financial news sentiment may not be a reliable trading strategy.
The sentiment strategy performs better than random choices but does not consistently beat the Buy & Hold strategy.
Combining the sentiment strategy with Machine Learning Time Series forecasting may improve performance.
The author suggests that more testing with different scenarios may be required to fully understand the potential of this strategy.
The author provides code snippets and visualizations to support their findings.
The author acknowledges that using news sentiment to trade stocks is not a new strategy but wanted to see its performance in action.
The author suggests that the outcome of the backtest was better than expected but not enough to officially use news sentiment as a primary trading strategy.
What Happens When You Trade Stocks Based on News?
The outcome of backtesting a trading strategy based purely on financial news
We all know that one of the most impactful catalysts for stock price movement is financial news. A lot of negative press can drive a stock price through the floor. Conversely, positive financial news can skyrocket a former penny stock to the moon. Now, what would happen if we were to trade stocks purely on the sentiment from the press? Is it possible to ride the positive news wave to large financial gains? Or will it already be too late?
The effect of news headlines on stock price movement has been examined before, but I wanted to observe what would happen if you were to actually implement a financial news trading strategy. To do so, I used Python to code a backtesting process that determines trade positions based on the sentiment of the previous day’s financial news.
I’ve attempted something similar to this before but with Tweets instead of headlines:
To start, I imported many different libraries necessary for this backtest to run:
# Libraries
from eod import EodHistoricalData
import pandas as pd
from datetime import datetime, timedelta
from tqdm import tqdm
import nltk
from textblob import TextBlob
import numpy as np
import random
import plotly.express as px
# Importing and assigning the api key
with open("../../eodHistoricalData-API.txt", "r") as f:
    api_key = f.read()
# EOD Historical Data client
client = EodHistoricalData(api_key)
Getting News and Price Data
As you may have noticed above, I used a Python library called eod. This library allows me to retrieve not only price data but financial news data as well. If you want to follow along, you’ll need to use your own API key from EOD HD. It’s free to sign up and you’ll have an API key in no time. Disclosure: I earn a small commission from any purchases made through the link above.
To retrieve the news and price data from a select stock ticker, I created a function to streamline the process:
Time Period Retrieval
Within this function, I retrieve the financial news for any available stock ticker in 10-day windows. This spreads the retrieval of financial news over the whole time period. Had I requested, say, 1 year in a single call instead of 10 days at a time, I would have risked retrieving only the most recent news rather than news from a year ago. This was the problem I initially had with the function.
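The windowing idea can be sketched on its own, without any API calls. The helper name `date_windows` and its parameters are hypothetical and are not taken from the original function; each resulting pair would then serve as the start and end date of one news request.

```python
from datetime import date, timedelta

def date_windows(end, total_days, window_days=10):
    """Split the `total_days` leading up to `end` into consecutive windows.

    Hypothetical helper for illustration only; the article's own
    retrieval function is not shown here.
    """
    windows = []
    start = end - timedelta(days=total_days)
    while start < end:
        # The final window is clipped so it never runs past `end`
        stop = min(start + timedelta(days=window_days), end)
        windows.append((start, stop))
        start = stop
    return windows

windows = date_windows(date(2022, 1, 31), total_days=30)
for start, stop in windows:
    print(start.isoformat(), "->", stop.isoformat())
```

Adjacent windows share a boundary date in this sketch, so a caller would likely need to drop duplicate articles after combining the results.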
Filtering Out Irrelevant News
After retrieving the bulk of the financial news, I had to filter out any irrelevant news. Many times, some of the news content retrieved may just have a passing reference to the selected stock, which is unnecessary for this experiment. In order to make sure that the news article is mainly about our given stock, I filtered out any news headlines that don’t mention our stock at all. The headline would need at least one mention of the stock symbol in order to pass my custom filter.
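One way such a headline filter might look is sketched below on plain dictionaries. The whole-word, case-insensitive match is an assumption on my part (the text only requires at least one mention of the symbol), the helper name `mentions_symbol` is hypothetical, and the sample headlines are made up for illustration.

```python
import re

def mentions_symbol(title, symbol):
    """True if `symbol` appears in `title` as a whole word, ignoring case.

    Assumption: whole-word matching, so "TSLAQ" does not count as "TSLA".
    """
    pattern = r"\b" + re.escape(symbol) + r"\b"
    return re.search(pattern, title, flags=re.IGNORECASE) is not None

# Made-up headlines, for illustration only
articles = [
    {"date": "2022-01-03", "title": "TSLA deliveries beat expectations"},
    {"date": "2022-01-03", "title": "Chip stocks rally as demand grows"},
    {"date": "2022-01-04", "title": "Why tsla shares slipped today"},
]

# Keep only the articles whose headline mentions the symbol
relevant = [a for a in articles if mentions_symbol(a["title"], "TSLA")]
print([a["title"] for a in relevant])
```

A filter like this misses headlines that use only the company name, which is a trade-off of matching on the ticker symbol alone.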
Price Data
With the relevant news data prepped and ready to go, I finally retrieved the price history for the selected stock from the same period of time as the financial news. I then separated the news and price data into separate Pandas DataFrames for formatting later on:
# Retrieving TSLA news and price data from the past 400 days
news, prices = getNewsAndPrices("TSLA", 400)
Getting News Sentiment
With the price data and news headlines available, I can now move to the next step of extracting the sentiment from those headlines. There are many tools for analyzing the sentiment of these headlines, such as NLTK’s VADER or Flair. I went with a library called TextBlob.
# Getting sentiment values for the news headlines/titles
news['sentiment'] = news['title'].apply(
    lambda x: TextBlob(x.lower()).sentiment[0]
)
# Grouping together dates and aggregating sentiment scores from the same day
df = news.groupby('date')[['sentiment']].mean()
After extracting the sentiment score for each article’s headline, I aggregated the scores for articles released on the same day by taking the average of all the scores for that day.
Calculating Trade Positions
The next step after grabbing the sentiment scores would be getting the trade positions based on the headline sentiment:
def sentimentPositions(val, thresh=0.1):
    """
    Returns position as 1, -1, or 0 for Buy, Sell,
    and Do Nothing respectively based on the given
    sentiment value and threshold.
    """
    if val > thresh:
        return 1
    elif val < -thresh:
        return -1
    else:
        return 0

# Applying the position function
df['sentiment_positions'] = df['sentiment'].apply(
    lambda x: sentimentPositions(x, thresh=0)
)
With this function, I can determine the trade position from a given sentiment score and threshold. Since sentiment scores range from -1 to 1, the position follows from comparing the score against the threshold. For example, with a threshold of 0.1, the function returns a Buy position only if the sentiment score is greater than 0.1 and a Sell/Short position only if it is less than -0.1; anything in between is Do Nothing. Note that the call above passes thresh=0, so any positive daily average becomes a Buy and any negative one becomes a Sell/Short.
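To make the effect of the threshold concrete, the short sketch below restates the same position logic and applies it to a handful of made-up scores at two thresholds. The scores are illustrative only and do not come from the backtest.

```python
def sentimentPositions(val, thresh=0.1):
    """Return 1 (Buy), -1 (Sell) or 0 (Do Nothing) for a sentiment score."""
    if val > thresh:
        return 1
    elif val < -thresh:
        return -1
    return 0

# Made-up daily sentiment averages, for illustration only
scores = [0.25, 0.05, -0.05, -0.30, 0.0]

results = {}
for thresh in (0.0, 0.1):
    results[thresh] = [sentimentPositions(s, thresh) for s in scores]
    print(thresh, results[thresh])
# prints:
# 0.0 [1, 1, -1, -1, 0]
# 0.1 [1, 0, 0, -1, 0]
```

A larger threshold keeps the strategy out of the market on days with weak sentiment, at the cost of trading less often.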
Combining News Sentiment and Price History
Now that I have the trade positions based on sentiment scores and price history, I can combine both DataFrames in order to start the backtest:
# Merging price history and sentiment positions
df = df.merge(
    prices,
    right_index=True,
    left_index=True,
    how='outer'
)
# Filling in empty values with their most recent value for positions
# (.ffill() replaces the deprecated fillna(method='ffill'))
df['sentiment_positions'] = df['sentiment_positions'].ffill()
When the positions have been merged with daily price history, NaN values will be present when there are no news headlines for that day. In this case, I used the most recent sentiment position to determine the position for that day, which is why I fill in the NaNs with the forward fill method.
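The merge-then-forward-fill step can be seen on a tiny example. The dates, prices, and the `close` column name below are made up for illustration; the real price DataFrame may use different column names.

```python
import pandas as pd

# Positions exist only on days that had news (made-up values)
positions = pd.DataFrame(
    {"sentiment_positions": [1, -1]},
    index=pd.to_datetime(["2022-01-03", "2022-01-06"]),
)

# Prices exist for every trading day; "close" is an assumed column name
prices = pd.DataFrame(
    {"close": [100.0, 101.0, 99.0, 102.0]},
    index=pd.to_datetime(
        ["2022-01-03", "2022-01-04", "2022-01-05", "2022-01-06"]
    ),
)

# The outer merge keeps every date from both frames
merged = positions.merge(
    prices, left_index=True, right_index=True, how="outer"
)

# Days without news inherit the most recent known position
merged["sentiment_positions"] = merged["sentiment_positions"].ffill()
print(merged)
```

Here January 4 and 5 have no news, so both carry forward the Buy position from January 3 until the new Sell position arrives on January 6.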
Performing the Backtest
To begin this backtest, I first needed to establish a few other trade positions to act as baseline performance measurements. I added Buy & Hold and random choice strategies to compare against my news-headline-based strategy. The Buy & Hold strategy buys the stock right at the beginning of the backtest and never exits its position. The random choice strategy randomly chooses between Buy, Sell, or Do Nothing every day. In the end, we shall see which strategy performs best.
# Positions shifted ahead by one to compensate for lookahead bias
position_df = df[['sentiment_positions']].shift(1)
# Buy and hold strategy
position_df['buy&hold'] = 1
# Random strategy
position_df['random_positions'] = random.choices(
    [1,0,-1], k=len(position_df)
)
# Dropping the last NaNs
position_df = position_df.dropna()
Now I can perform the backtest on the different strategies laid out above.
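# Performing the backtest
# (log_returns is assumed to hold the stock's daily log returns;
# its definition is not shown in the snippets here)
returns = position_df.multiply(
    log_returns,
    axis=0
)
# Exponentiating the cumulative log returns to get daily portfolio balance
performance = returns.cumsum().apply(
    np.exp
).dropna().ffill()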
The backtest has been performed, and if you want to look at the raw numbers you can inspect the DataFrame saved in the performance variable. In the next section, however, you can see the performance results visualized.
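Since the definition of log_returns is not shown, here is a minimal sketch of how the calculation could work end to end. It assumes log returns are computed from daily closing prices as log(close_t / close_{t-1}); the prices and the `short` column are made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up closing prices: +10%, -10%, +10%
close = pd.Series([100.0, 110.0, 99.0, 108.9])

# Assumed definition: log of today's close over yesterday's close
log_returns = np.log(close / close.shift(1))

# Two constant strategies: always long and always short
positions = pd.DataFrame({
    "buy&hold": [1, 1, 1, 1],
    "short": [-1, -1, -1, -1],
})

# Each day's strategy return is position * log return
returns = positions.multiply(log_returns, axis=0)

# Log returns add up over time, so exp(cumulative sum) gives the
# growth of one unit of starting balance
performance = returns.cumsum().apply(np.exp).dropna()
print(performance)
```

The Buy & Hold column ends at 1.089, which matches 108.9 / 100. The short column ends at the reciprocal of that value, a property of multiplying log returns by -1 that only approximates the profit and loss of an actual short position.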
Backtest Visualized
Visual Performance of Backtesting AMD
The first stock I attempted the backtest on was AMD. As you can see from the graph above, the overall performance was better than random choices but still not enough to beat a Buy & Hold strategy. Throughout the backtest, there were instances where the news sentiment strategy beat B&H (Sept. to Nov.) but also instances where it performed worse (first 6–8 months).
To see if these results were common with other stocks, I also performed the backtest again on TSLA:
Visual Performance of Backtesting TSLA
With TSLA, the performance between B&H and news sentiment initially was identical. Afterwards, the performances diverged. The news strategy also had an instance of outperforming B&H but in the end it did not beat out B&H. However, throughout all this, the sentiment strategy was still miles better than random choices.
Closing
While using news sentiment to trade stocks is not a new strategy, I still wanted to see its performance in action. The outcome was a bit better than expected but not enough to officially use news sentiment as a primary trading strategy. More testing with different scenarios may be required to know the full extent of this strategy.
However, I believe that if you were to combine this strategy with some Machine Learning Time Series forecasting, you might have something that can beat the simple Buy & Hold strategy.