Sunday, 3 November 2019

Python finance 5: fetch and save data from Wikipedia with BeautifulSoup

Wikipedia page: List of S&P 500 companies
The goal is to fetch all company symbols from the table on that page.

Ctrl + U to view the page source
Ctrl + F to find the table with class = 'wikitable sortable'
skip the first (header) row
fetch the first column (the symbol) from each remaining row in tbody
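
The steps above can be tried without hitting the network. This is a minimal sketch, assuming a tiny made-up table that imitates the Wikipedia markup; SAMPLE_HTML and extract_symbols are hypothetical names used only for illustration, and the built-in 'html.parser' is used here instead of 'lxml' so that no extra parser is needed.

```python
import bs4 as bs

# Hypothetical miniature of the Wikipedia table, made up for illustration.
# The cell text ends with a newline, as table cells often do in real page source.
SAMPLE_HTML = """
<table class="wikitable sortable">
  <tr><th>Symbol</th><th>Security</th></tr>
  <tr><td>MMM
</td><td>3M Company</td></tr>
  <tr><td>ABT
</td><td>Abbott Laboratories</td></tr>
</table>
"""

def extract_symbols(html):
    """Return the first-column text of every non-header row."""
    soup = bs.BeautifulSoup(html, 'html.parser')
    table = soup.find('table', {'class': 'wikitable sortable'})
    symbols = []
    # [1:] skips the header row, which has <th> cells and no <td> cells
    for row in table.find_all('tr')[1:]:
        cells = row.find_all('td')
        if cells:
            # strip() removes the newline around the cell text
            symbols.append(cells[0].text.strip())
    return symbols

print(extract_symbols(SAMPLE_HTML))
```

Checking that a row actually has td cells before indexing avoids an IndexError if the table ever contains another header-only row.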


The symbols are saved in a pickle file.

import bs4 as bs
import pickle
import requests

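def save_sp500_tickers():
    resp = requests.get('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
    # fail early on an HTTP error instead of parsing an error page
    resp.raise_for_status()
    soup = bs.BeautifulSoup(resp.text, 'lxml')
    table = soup.find('table', {'class': 'wikitable sortable'})
    tickers = []
    # skip the table header row
    for row in table.find_all('tr')[1:]:
        # the first column of each body row holds the symbol;
        # strip() removes surrounding whitespace such as a trailing newline
        ticker = row.find_all('td')[0].text.strip()
        tickers.append(ticker)

    # save the list so later scripts can reload it without scraping again
    with open('sp500tickers.pickle', 'wb') as f:
        pickle.dump(tickers, f)

    print(tickers)
    return tickers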

save_sp500_tickers()
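
Once the pickle file exists, the list can be read back without scraping again. This is a minimal sketch of that round trip; load_tickers is a hypothetical helper, and the three sample symbols and the temporary directory are assumptions so the example runs on its own without touching the real sp500tickers.pickle.

```python
import os
import pickle
import tempfile

def load_tickers(path):
    """Read a pickled list of ticker symbols back from disk."""
    with open(path, 'rb') as f:
        return pickle.load(f)

# Hypothetical sample data standing in for the scraped list.
sample = ['MMM', 'ABT', 'ABBV']

# Write to a temporary directory so the real pickle file is left alone.
path = os.path.join(tempfile.mkdtemp(), 'sp500tickers.pickle')
with open(path, 'wb') as f:
    pickle.dump(sample, f)

loaded = load_tickers(path)
print(loaded)
```

In a real script, load_tickers('sp500tickers.pickle') would return the list written by save_sp500_tickers(). Only unpickle files you created yourself, since pickle can execute code when loading untrusted data.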

reference:
https://www.youtube.com/watch?v=C--57BP79EM&list=PLQVvvaa0QuDcOdF96TBtRtuQksErCEBYZ&index=5
