Sugath Mudali

Summary

The article presents a method for ranking stocks based on fundamental analysis using Python APIs, which involves scoring stocks according to financial ratios and their deviation from industry averages.

Abstract

The article outlines a systematic approach to stock ranking by leveraging fundamental analysis through Python APIs, specifically the yfinance library. It introduces a scoring system that evaluates stocks within the same industry based on key financial ratios such as Earnings Per Share (EPS), Price to Earnings (PE), and Return on Equity (ROE). The method categorizes ratios into two groups: those where lower values are preferred and those where higher values are better. It calculates scores by comparing each stock's ratios against the mean and standard deviation of its industry peers. The article also provides a Python codebase, available on GitHub, which includes data collection from Yahoo Finance, web scraping for stock symbols from the finviz screener, and the application of the scoring system. The final output is a DataFrame that ranks stocks according to their total score, aiding investors in making informed decisions.

Opinions

  • The author emphasizes the importance of comparing stocks within the same industry to ensure the relevance of the financial ratios used in the analysis.
  • The scoring system is designed to be somewhat arbitrary, with the suggestion that it can be refined or weighted according to investor preferences.
  • The article acknowledges that the scoring and subsequent stock rankings are subject to change based on current stock prices and market conditions.
  • The author provides a disclaimer that the information is for educational purposes only and not intended as investment advice.
  • The use of human-readable formats for large numbers and the application of styles to the DataFrame indicate a focus on user-friendliness and accessibility of the data presentation.
  • The author encourages reader engagement and feedback, indicating a collaborative approach to improving the methodology presented.

Fundamental Analysis for Ranking Stocks with Python API

This article will rank stocks based on their fundamentals and stock details. The approach builds on the previous article "Fundamental Stock Analysis Using Python APIs" by applying a scoring method to the ratios based on the stock group's mean and standard deviation.

Disclaimer: The information provided here is for informational purposes only and is not intended to be personal financial, investment, or other advice.

The principal ratios employed in the article are:

  1. EPS (Earnings Per Share) — portion of a company’s profit that is assigned to each share of its stock
  2. PE (Price to Earnings) — relationship between the stock price of a company and its per-share earnings. It helps investors determine if a stock is undervalued or overvalued relative to others in the same sector.
  3. PEG (Price/Earnings to Growth) — calculated by dividing a stock’s P/E by its projected earnings growth rate. In general, a PEG lower than 1 is a good sign, and a PEG higher than 2 indicates that a stock may be overpriced.
  4. PB (Price to Book) — A ratio of 1 indicates the company’s shares are trading in line with its book value. A P/B higher than 1 suggests the company is trading at a premium to book value, and lower than 1 indicates a stock that may be undervalued relative to the company’s assets.
  5. ROE (Return on Equity) — provides a way for investors to evaluate how effectively a company is using its equity to generate profits. A higher ROE indicates a more efficient use of shareholder equity, which can lead to increased demand for shares and a higher stock price, as well as an increase in the company’s profits in the future.
  6. ROCE (Return on Capital Employed) — measures a company’s profitability in terms of all of its capital.
  7. FCFY (Free Cash Flow Yield) — a financial solvency ratio that compares the free cash flow per share a company is expected to earn against its market value per share. A lower ratio indicates a less attractive investment opportunity.
  8. D2E (Debt to Equity) — compares a company’s total liabilities with its shareholder equity.
  9. CR (Current Ratio) — measures a company’s ability to pay off its current liabilities (payable within one year) with its current assets, such as cash, accounts receivable, and inventories. The higher the ratio, the better the company’s liquidity position.
  10. QR (Quick Ratio) — measures a company’s capacity to pay its current liabilities without needing to sell its inventory or obtain additional financing.
  11. Asset TR (Asset Turnover Ratio) — measures the efficiency of a company’s assets in generating revenue or sales.
  12. DY (Dividend Yield) — the amount a company pays in dividends each year relative to its share price. It is an estimate of the dividend-only return of a stock investment.
  13. Beta — a measure of a stock’s volatility in relation to the overall market. A stock that swings more than the market over time has a beta above 1.0. If a stock moves less than the market, the stock’s beta is less than 1.0.
  14. 52w Range — a visualization to indicate which stocks are near their 52-week low and which are near their 52-week high. For example, 90% indicates that the current price is very close to its 52-week high.
  15. Score — sum of ratio scores for each stock

Data Access

We will utilize the yfinance API to collect data from Yahoo Finance. The info component of a ticker, which is one of many components (e.g., Income Statement, Cash Flow, etc.), will supply values for most ratios.

We will web scrape the finviz screener to obtain a list of relevant stock symbols. It is vital that the study be conducted on a similar group of stocks, ideally in the same industry.

Code is available as a Jupyter notebook on GitHub.

Python Libraries

The required Python libraries are:

Import Libraries

# Read stocks
import yfinance as yf

# For DataFrame
import pandas as pd
import numpy as np

# For parsing finviz
import requests
from bs4 import BeautifulSoup

# to calculate std and mean
import statistics

Ratio Categories

Ratios are categorized into two groups: Category 1 includes ratios where a lower value is preferred, and Category 2 includes ratios where a higher value is preferred. Price to Earnings (P/E) is an example of a Category 1 ratio: a lower P/E may suggest that a company is currently inexpensive or that it is performing well relative to its historical patterns. Current Ratio (CR) is an example of a Category 2 ratio, where a higher value is desirable. Since we are more interested in less volatile companies, Beta is placed in Category 1; however, it can be moved to Category 2 if you are more interested in highly volatile companies.

# Scores for Category 1 ratios - lower the better
CAT1_RATIOS = ['D2E', 'PEG', 'PE fwd', 'PB', 'Beta']

# Scores for Category 2 ratios - higher the better
CAT2_RATIOS = ['ROCE', 'ROE', 'FCFY', 'CR', 'QR', 'Asset TR', 'EPS fwd']
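
The two lists can also be turned into a single lookup, which makes it easy to confirm that no ratio sits in both categories and to move Beta when volatility is desired. The sketch below is only an illustration: `build_category_lookup` and `move_ratio` are hypothetical helpers that do not appear in the original notebook.

```python
# Ratio names copied from the article's two category lists.
CAT1_RATIOS = ['D2E', 'PEG', 'PE fwd', 'PB', 'Beta']
CAT2_RATIOS = ['ROCE', 'ROE', 'FCFY', 'CR', 'QR', 'Asset TR', 'EPS fwd']


def build_category_lookup(cat1, cat2):
    """Map each ratio name to 1 (lower is better) or 2 (higher is better).

    Hypothetical helper, not part of the original notebook.
    """
    overlap = set(cat1) & set(cat2)
    if overlap:
        raise ValueError(f'Ratios listed in both categories: {sorted(overlap)}')
    lookup = {name: 1 for name in cat1}
    lookup.update({name: 2 for name in cat2})
    return lookup


def move_ratio(cat_from, cat_to, name):
    """Return new lists with `name` moved from one category to the other."""
    if name not in cat_from:
        raise ValueError(f'{name} is not in the source category')
    return [r for r in cat_from if r != name], list(cat_to) + [name]


lookup = build_category_lookup(CAT1_RATIOS, CAT2_RATIOS)
print(lookup['Beta'])   # 1: lower beta is preferred by default

# An investor who prefers volatile stocks could treat a higher beta as better.
cat1_volatile, cat2_volatile = move_ratio(CAT1_RATIOS, CAT2_RATIOS, 'Beta')
print(build_category_lookup(cat1_volatile, cat2_volatile)['Beta'])   # 2
```

Returning new lists instead of mutating the originals keeps the default categories available for comparison.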

Stock Symbols

As mentioned previously, we will use the finviz screener to obtain a list of relevant stock symbols. The code below, for instance, utilizes a filter that is set for companies in the “Utilities” sector’s “Regulated Gas” industry that have a market capitalization of more than $2 billion. The request parameter “f” receives values for the filter, as seen below.

def get_symbols():
    req = requests.get('https://finviz.com/screener.ashx',
        params={
            'v': '111',
            'f': 'cap_midover,ind_utilitiesregulatedgas',
            'o': 'company',
        },
        headers={
            'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
        })
    # Creating BeautifulSoup object
    soup = BeautifulSoup(req.text, 'html.parser')
    
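    # Table we are interested in
    table = soup.find('table', class_='styled-table-new is-rounded is-tabular-nums w-full screener_table')
    # Fail with a clear message if the page layout (and its CSS classes) has changed
    if table is None:
        raise RuntimeError('Screener table not found; the finviz page layout may have changed')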
    # Array to collect symbols
    symbols = []
    for i, row in enumerate(table.find_all('tr')):
        # Skip the header row
        if i != 0:
            # Loop through the row
            for j, td in enumerate(row.find_all('td')):
                # Symbol is in the second column
                if j == 1:
                    symbols.append(td.text.strip())
                    break
    return symbols
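
To screen a different industry, only the comma-separated value of the “f” parameter needs to change. The sketch below builds the request parameters from a list of filter codes; `build_screener_params` is a hypothetical helper, and the parameter names and filter codes are simply the ones used in get_symbols() above. Codes for other industries would have to be looked up on the finviz screener itself.

```python
def build_screener_params(filters, order='company', view='111'):
    """Build the query parameters for the finviz screener request.

    Hypothetical helper: the parameter names 'v', 'f', and 'o' and the
    filter codes are taken from the article's get_symbols() function.
    """
    # Drop empty entries and surrounding whitespace before joining.
    cleaned = [f.strip() for f in filters if f and f.strip()]
    if not cleaned:
        raise ValueError('At least one filter code is required')
    return {'v': view, 'f': ','.join(cleaned), 'o': order}


params = build_screener_params(['cap_midover', 'ind_utilitiesregulatedgas'])
print(params['f'])   # cap_midover,ind_utilitiesregulatedgas
```

The resulting dictionary could be passed as the `params` argument of `requests.get` in place of the hard-coded one.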

Get Stock symbols

Call the above utility method to populate the symbols variable.

symbols = get_symbols()

Without using the get_symbols function, you can also initialize the symbols variable if you already know a set of stock symbols, as shown below:

symbols = ['ATO', 'NI']

Note that all of the symbols should be in the same industry if you are setting them manually; there is a simple check performed to ensure this when calling the Yahoo Finance API.

Calculate Ratios

We start by defining a utility that uses the info attribute of a yfinance ticker to compute and populate ratios.

def populate_with_info(data, stock_info):
    # print(stock_info)
    data['Symbol'].append(stock_info['symbol'])
    data['Name'].append(stock_info['longName'])
    # Convert numbers to a human readable format
    data['Market Cap'].append(human_format(stock_info['marketCap']))
    data['Price'].append(stock_info['currentPrice'])

    # Could be that some indicators are not available; use NaN if this is the case

    # Valuation ratios
    
    if 'priceToBook' in stock_info:
        data['PB'].append(stock_info['priceToBook'])
    else:
        data['PB'].append(np.nan)
    
    if 'forwardEps' in stock_info:
        data['EPS fwd'].append(stock_info['forwardEps'])
    else:
        data['EPS fwd'].append(np.nan)
        
    if 'forwardPE' in stock_info:
        data['PE fwd'].append(stock_info['forwardPE'])
    else:
        data['PE fwd'].append(np.nan)
        
    if 'pegRatio' in stock_info:
        data['PEG'].append(stock_info['pegRatio'])
    else:
        data['PEG'].append(np.nan)
        
    # Solvency financial ratios

    if 'debtToEquity' in stock_info:
        data['D2E'].append(stock_info['debtToEquity'])
    else:
        data['D2E'].append(np.nan)

    # Profitability Ratios
    
    if 'returnOnEquity' in stock_info:
        data['ROE'].append(stock_info['returnOnEquity'])
    else:
        data['ROE'].append(np.nan)
    
    if ('freeCashflow' in stock_info) and ('marketCap' in stock_info):
        fcfy = (stock_info['freeCashflow']/stock_info['marketCap']) * 100
        data['FCFY'].append(round(fcfy, 2))
    else:
        data['FCFY'].append(np.nan)

    # Liquidity ratios

    if 'currentRatio' in stock_info:
        data['CR'].append(stock_info['currentRatio'])
    else:
        data['CR'].append(np.nan)

    if 'quickRatio' in stock_info:
        data['QR'].append(stock_info['quickRatio'])
    else:
        # Append to QR (not CR) so that all columns keep the same length
        data['QR'].append(np.nan)

    # Other info (non ratios)
    
    if 'dividendYield' in stock_info:
        data['DY'].append(stock_info['dividendYield']*100)
    else:
        data['DY'].append(0.0)

    if 'beta' in stock_info:
        data['Beta'].append(stock_info['beta'])
    else:
        data['Beta'].append(np.nan)

    if 'fiftyTwoWeekLow' in stock_info:
        data['52w Low'].append(stock_info['fiftyTwoWeekLow'])
    else:
        data['52w Low'].append(np.nan)
        
    if 'fiftyTwoWeekHigh' in stock_info:    
        data['52w High'].append(stock_info['fiftyTwoWeekHigh'])
    else:
        data['52w High'].append(np.nan)
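
The repeated if/else blocks above all follow one pattern: look up a key and fall back to NaN. A table-driven variant of that pattern might look like the sketch below. `INFO_KEYS` and `extract_ratios` are hypothetical names, the sample dictionary contains made-up values standing in for `ticker.info`, and treating a `None` value as missing is an assumption that goes slightly beyond the original checks.

```python
import math

# Output column -> key in the yfinance info dictionary (names from the article).
INFO_KEYS = {
    'PB': 'priceToBook',
    'EPS fwd': 'forwardEps',
    'PE fwd': 'forwardPE',
    'PEG': 'pegRatio',
    'D2E': 'debtToEquity',
    'ROE': 'returnOnEquity',
    'CR': 'currentRatio',
    'QR': 'quickRatio',
    'Beta': 'beta',
}


def extract_ratios(stock_info, key_map=INFO_KEYS):
    """Return {column: value}, using NaN for any key that is missing or None.

    Hypothetical helper illustrating the fallback; it is not the article's code.
    """
    row = {}
    for column, key in key_map.items():
        value = stock_info.get(key)
        row[column] = math.nan if value is None else value
    return row


# Made-up sample values standing in for ticker.info.
sample_info = {'priceToBook': 1.8, 'forwardPE': 17.2, 'beta': 0.65}
row = extract_ratios(sample_info)
print(row['PB'], row['PEG'])   # 1.8 nan
```

Keeping the column-to-key mapping in one place also makes it harder to append a value to the wrong column.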

If a ratio cannot be found, it will be added to the dictionary as NaN. Stocks with any ratio set to NaN will be removed later. In addition, this method uses a utility method to convert a number into a human-readable format, such as 5B (billions), 5M (millions), etc.

def human_format(num):
    num = float('{:.3g}'.format(num))
    magnitude = 0
    while abs(num) >= 1000:
        magnitude += 1
        num /= 1000.0
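    # Strip trailing zeros first and then the dot, so that 100.000000 becomes '100' rather than '1'
    return '{}{}'.format('{:f}'.format(num).rstrip('0').rstrip('.'), ['', 'K', 'M', 'B', 'T'][magnitude])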

Lastly, an additional set of functions calculates ratios from the balance sheet and income statement.

def roce(ticker):
    income_stm = ticker.income_stmt
    ebit = income_stm.loc['EBIT'].iloc[0]
    bs = ticker.balance_sheet
    return ebit/(bs.loc['Total Assets'].iloc[0]-bs.loc['Current Liabilities'].iloc[0])

def asset_turnover_ratio(ticker):
    df_bs = ticker.balance_sheet
    y0, y1 = df_bs.loc['Total Assets'].iloc[0], df_bs.loc['Total Assets'].iloc[1]
    avg_asset = (y0 + y1)/2
    tot_rvn_y0 = ticker.income_stmt.loc['Total Revenue'].iloc[0]/avg_asset
    return tot_rvn_y0

def inventory_turnover_ratio(ticker):
    df_bs = ticker.balance_sheet
    y0, y1 = df_bs.loc['Inventory'].iloc[0], df_bs.loc['Inventory'].iloc[1]
    avg_inventory = (y0 + y1)/2
    return ticker.income_stmt.loc['Cost Of Revenue'].iloc[0]/avg_inventory
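
To make the arithmetic behind these functions concrete, here is a small worked sketch using plain numbers instead of yfinance statements. The function names and the figures (in millions) are made up for illustration; the formulas mirror roce() and asset_turnover_ratio() above.

```python
def roce_from_values(ebit, total_assets, current_liabilities):
    """ROCE = EBIT / capital employed, where capital employed is
    total assets minus current liabilities (as in roce() above)."""
    capital_employed = total_assets - current_liabilities
    if capital_employed == 0:
        raise ValueError('Capital employed is zero; ROCE is undefined')
    return ebit / capital_employed


def asset_turnover_from_values(revenue, assets_latest, assets_previous):
    """Asset turnover = revenue / average total assets over two periods."""
    avg_assets = (assets_latest + assets_previous) / 2
    if avg_assets == 0:
        raise ValueError('Average assets is zero; turnover is undefined')
    return revenue / avg_assets


# Made-up figures, in millions.
print(roce_from_values(ebit=120, total_assets=1500, current_liabilities=300))
# 120 / (1500 - 300) = 0.1

print(asset_turnover_from_values(revenue=420, assets_latest=1500, assets_previous=1300))
# 420 / ((1500 + 1300) / 2) = 0.3
```

Both results are plain fractions here, which matches what the functions above return; multiply by 100 if a percentage is wanted.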

Collect Ratios

Let’s add all the ratios to a dictionary for each stock symbol.

# Dictionary to collect data to create a DF later
data = {
    'Symbol': [],
    'Name': [],
    'Market Cap': [],
    'EPS fwd': [],
    'PE fwd': [],
    'PEG': [],
    'PB': [],
    'ROE' : [],
    'ROCE' : [],
    'FCFY' : [],
    'D2E' : [],
    'CR' : [],
    'QR' : [],
    'Asset TR': [],
    'DY' : [],
    'Beta': [],
    'Price': [],
    '52w Low': [],
    '52w High': []
    }
industry = ''

for symbol in symbols:
    ticker = yf.Ticker(symbol)
    if not industry:
        industry = ticker.info['industry']
    else:
        industry_current = ticker.info['industry'] 
        if industry_current != industry:
            print(f'Encountered a different industry {industry_current}, previous {industry}. Quitting')
            break        
    populate_with_info(data, ticker.info)
    data['ROCE'].append(roce(ticker))
    data['Asset TR'].append(asset_turnover_ratio(ticker))

As mentioned previously, this method includes a simple check to see if the industry of the current stock differs from that of the preceding stock.
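
The same check can be expressed as a small standalone function that is testable without network access. This is a sketch under assumptions: `first_industry_mismatch` is a hypothetical helper, and the industry strings and the symbol XYZ in the sample are made up to stand in for values read from `ticker.info['industry']`.

```python
def first_industry_mismatch(industries):
    """Return (symbol, industry) for the first stock whose industry differs
    from that of the first entry, or None if all entries match.

    `industries` maps symbol -> industry string. Hypothetical helper.
    """
    expected = None
    for symbol, industry in industries.items():
        if expected is None:
            # The first stock defines the industry for the whole group.
            expected = industry
        elif industry != expected:
            return symbol, industry
    return None


# Made-up sample data; XYZ is not a real symbol.
sample = {
    'ATO': 'Utilities - Regulated Gas',
    'NI': 'Utilities - Regulated Gas',
    'XYZ': 'Software - Application',
}
print(first_industry_mismatch(sample))   # ('XYZ', 'Software - Application')
```

Running such a check before collecting ratios would avoid ending up with a partially filled dictionary when a mismatch is found midway.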

Create DataFrame

# Create a DF using the dictionary
df = pd.DataFrame(data)

# Save any stocks with NaN values
df_exceptions = df[df.isna().any(axis=1)]

# Remove any stocks with NaN values
df=df.dropna()

# Reset index after dropping rows with NaN values
df.reset_index(drop=True, inplace=True)

# Add 52 week price range
df['52w Range'] = ((df['Price'] - df['52w Low'])/(df['52w High'] - df['52w Low']))*100

df_exceptions

Any stock that has a ratio set to NaN will be eliminated and saved under an exception DataFrame. Lastly, we will include the 52-week price range.

Stocks with exceptions were removed from the analysis

And for non-exception stocks, the result is:

Stocks for the analysis
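
The 52-week range column expresses where the current price sits between the 52-week low (0%) and the 52-week high (100%). A minimal sketch of the same formula for a single stock follows; `range_position` is a hypothetical name, the prices are made up, and returning NaN when the high equals the low is an assumption added to avoid a division by zero.

```python
import math


def range_position(price, low, high):
    """Position of `price` within the [low, high] range, as a percentage."""
    if high == low:
        # Assumption: with no spread between low and high, the position is undefined.
        return math.nan
    return (price - low) / (high - low) * 100


# Made-up prices: halfway between a low of 50 and a high of 100.
print(range_position(75.0, 50.0, 100.0))   # 50.0
```

A price of 95 with the same low and high would land at about 90%, which is the "very close to its 52-week high" case described in the list of ratios.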

Score

The next step is to apply a score to the raw data.

def score(values, value, cat) -> int:
    '''
    Calculate the score using standard deviation and mean based on the category. For a ratio such as PE, where a lower
    value is preferred, the score is calculated as follows:
    1. Score of 1 is returned if given PE is in between -1 std and mean
    2. Score of 2 is returned if given PE is in between -2 std and -1 std
    3. Score of 3 is returned if given PE is outside -2 std
    4. Score of -1 is returned if given PE is in between mean and +1 std
    5. Score of -2 is returned if given PE is in between +1 std and +2 std
    6. Score of -3 is returned if given PE is outside +2 std

    For a ratio such as ROE, where a higher value is preferred, the score is calculated as follows:
    1. Score of 1 is returned if given ROE is in between mean and +1 std
    2. Score of 2 is returned if given ROE is in between +1 std and +2 std
    3. Score of 3 is returned if given ROE is outside +2 std
    4. Score of -1 is returned if given ROE is in between -1 std and mean
    5. Score of -2 is returned if given ROE is in between -2 std and -1 std
    6. Score of -3 is returned if given ROE is outside -2 std

    Parameters
    ----------
    values : List of the values
    value: The value to compare whether it's within mean, 1 std, -1 std, 2 std or -2 std
    cat: Category type, valid value is 1 or 2.
        
    Returns
    -------
    score: the score for given 'value'
    '''
    
    std = statistics.stdev(values)
    mean = statistics.mean(values)

    if cat == 1:
        if (mean + (-1 * std)) < value <= mean:
            return 1
        elif (mean + (-2 * std)) < value <= (mean + (-1 * std)):
            return 2
        elif value <= (mean + (-2 * std)):
            return 3
        elif mean < value <= (mean + (1 * std)):
            return -1
        elif (mean + (1 * std)) < value <= (mean + (2 * std)):
            return -2
        else:
            return -3
    else:
        if mean <= value < (mean + (1 * std)):
            return 1
        elif (mean + (1 * std)) <= value < (mean + (2 * std)):
            return 2
        elif value >= (mean + (2 * std)):
            return 3
        elif (mean + (-1 * std)) <= value < mean:
            return -1
        elif (mean + (-2 * std)) <= value < (mean + (-1 * std)):
            return -2
        else:
            return -3

To summarize, the following values are returned for ratios in Category 1: 1 if the ratio is between (mean - 1 * std) and the mean; 2 if the ratio is between (mean - 2 * std) and (mean - 1 * std); 3 if the ratio is at or below (mean - 2 * std); the corresponding negative values are returned on the other side of the mean. Category 2 follows the same procedure in the opposite direction. These scores are somewhat arbitrary, with outliers receiving either the maximum or the minimum score. If you want to reduce the bias introduced by outliers, you can adjust their scores, for instance by returning 0 if a ratio is less than (mean - 2 * std).
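
The banding can also be written in terms of a z-score, which makes the outlier adjustment mentioned above a one-line change. The sketch below is an alternative formulation, not the notebook's code: `band_score` and its `outlier_score` parameter are hypothetical, the sample values are made up, results may differ from score() by floating-point rounding exactly at a band edge, and returning 0 when the group has no spread is an added assumption.

```python
import math
import statistics


def band_score(values, value, lower_is_better, outlier_score=None):
    """Score `value` from -3 to 3 by its distance from the group mean in
    units of the sample standard deviation.

    If `outlier_score` is given, it replaces the +3/-3 outlier scores.
    """
    std = statistics.stdev(values)
    if std == 0:
        return 0   # Assumption: identical values carry no ranking information.
    z = (value - statistics.mean(values)) / std
    if lower_is_better:
        z = -z     # Flip so that a larger z is always the better side.
    # [0, 1) -> 1, [1, 2) -> 2, >= 2 -> 3; [-1, 0) -> -1, [-2, -1) -> -2, < -2 -> -3
    raw = math.floor(z) + 1 if z >= 0 else math.floor(z)
    capped = max(-3, min(3, raw))
    if outlier_score is not None and abs(capped) == 3:
        return outlier_score
    return capped


# Made-up group: mean 14, sample std of about 3.16.
pe_values = [10, 12, 14, 16, 18]
print(band_score(pe_values, 10, lower_is_better=True))    # 2
print(band_score(pe_values, 10, lower_is_better=False))   # -2

# Made-up group with one extreme value.
skewed = [1, 1, 1, 1, 1, 1, 1, 1, 1, 20]
print(band_score(skewed, 20, lower_is_better=False))                    # 3
print(band_score(skewed, 20, lower_is_better=False, outlier_score=0))   # 0
```

Passing `outlier_score=0` neutralizes extreme values in both directions, which is slightly broader than the one-sided example given in the text.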

Apply scoring and add a column that totals the points for each ratio given in both categories.

df_score = df.copy()

for col in CAT1_RATIOS:
    for index, value in df[col].items():
        # print(f'{col} - {index} - {value}')
        df_score.loc[index, col] = score(df[col], value, 1)

for col in CAT2_RATIOS:
    for index, value in df[col].items():
        # print(f'{col} - {index} - {value}')
        df_score.loc[index, col] = score(df[col], value, 2)

# Add ranking scores to get the total score
df_score['Score'] = df_score[CAT1_RATIOS+CAT2_RATIOS].sum(axis=1)
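
The total above gives every ratio the same weight. If some ratios matter more to you, a weighted sum is one possible refinement. The sketch below works on a plain dictionary of per-ratio scores for a single stock; `weighted_total`, the weights, and the sample scores are all made up for illustration.

```python
def weighted_total(scores, weights, default_weight=1.0):
    """Sum per-ratio scores, multiplying each by its weight.

    Ratios without an entry in `weights` use `default_weight`.
    Hypothetical helper, not part of the original notebook.
    """
    return sum(score * weights.get(name, default_weight)
               for name, score in scores.items())


# Made-up scores for one stock, with ROE counted twice as heavily.
scores = {'PE fwd': 2, 'ROE': 1, 'Beta': -1}
weights = {'ROE': 2.0}
print(weighted_total(scores, weights))   # 3.0
```

With an empty `weights` dictionary the result equals the unweighted total, so the equal-weight ranking remains the default behaviour.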

Add some styles to the DataFrame:

def make_pretty(styler):
    # Column formatting
    styler.format({'EPS fwd': '{:.0f}', 'PE fwd': '{:.0f}', 'PEG': '{:.0f}', 'FCFY': '{:.0f}', 'PB' : '{:.0f}', 'ROE' : '{:.0f}',
                   'ROCE': '{:.0f}', 'D2E': '{:.0f}', 'CR': '{:.0f}', 'QR': '{:.0f}', 'Asset TR': '{:.0f}', 'DY': '{:.2f}%',
                   'Beta': '{:.0f}', '52w Low': '${:.2f}', 'Price': '${:.2f}', '52w High': '${:.2f}', '52w Range': '{:.2f}%', 'Score' : '{:.0f}'
                  })

    # Set the bar visualization
    styler.bar(subset = ['52w Range'], align = "mid", color = ["salmon", "cornflowerblue"])

    # Grid
    styler.set_properties(**{'border': '0.1px solid black'})

    # Set background gradients
    for ratio in CAT1_RATIOS:
        styler.background_gradient(subset=[ratio], cmap='RdYlGn', gmap=-df[ratio])
    for ratio in CAT2_RATIOS:
        styler.background_gradient(subset=[ratio], cmap='RdYlGn')
    styler.background_gradient(subset=['Score'], cmap='PiYG')
    
    # Hide index
    styler.hide(axis='index')

    # Left text alignment for some columns
    styler.set_properties(subset=['Symbol', 'Name'], **{'text-align': 'left'})
    styler.set_properties(subset=CAT1_RATIOS + CAT2_RATIOS + ['Market Cap', 'Score'], **{'text-align': 'center'})

    return styler

Finally, add the style to the DataFrame:

# Add table caption and styles to DF
df_score.style.pipe(make_pretty).set_caption(f'Stock Screener {industry}').set_table_styles(
    [{'selector': 'th.col_heading', 'props': 'text-align: center'},
     {'selector': 'caption', 'props': [('text-align', 'center'),
                                       ('font-size', '11pt'), ('font-weight', 'bold')]}])

Final result with Score column

Among the utilities sector stocks in the “Regulated Gas” industry with a market capitalization of over $2 billion, ATO and SR scored the highest. However, the price of ATO is currently closer to its 52w high, while the price of SR is closer to its 52w low.

Note that several ratios depend on the current stock price; thus, the results could change depending on when you run the notebook.

Conclusion

This article explains a fundamental analysis-based stock ranking method that assigns a score to each ratio based on the mean and standard deviation of the stock group.

For the purpose of analysis, it is imperative that a group of related stocks be selected, as most ratios are only meaningful within a group of similar stocks.

Even though the rating system is somewhat arbitrary, I think this post establishes a foundation for looking into other scoring systems, such as weighted scores.

I hope you found the information interesting, and I value your feedback.
