I'm trying to build a data frame filled with various Yahoo Finance key statistics. I'm using a recent package called "yahoofinancials", which is a great tool with many useful functions for getting different types of financial data from Yahoo Finance.
My data frame will have columns for num_shares_outstanding, interest_expense, operating_income, total_operating_expense, total_revenue, cost_of_revenue, and many more. The first two columns will be Date and Symbol, and I will be going through thousands of tickers. The functions I will be using are:
data.get_operating_income()
data.get_total_operating_expense()
data.get_total_revenue()
data.get_cost_of_revenue()
data.get_income_before_tax()
data.get_income_tax_expense()
data.get_gross_profit()
data.get_net_income_from_continuing_ops()
data.get_research_and_development()
data.get_market_cap()
data.get_dividend_yield()
data.get_dividend_rate()
...
There may be many more than shown above. I'm having trouble writing concise code to build this big data frame. My first question: is there a container in Python that can hold the functions I want to use, so that I can loop over them and apply each one to its column on each row, like a kind of high-dimensional vectorization? That would replace plugging in every function by hand, which is what I'm doing right now:
First I make an empty df, and then:

    # yf is assumed to be an alias for the YahooFinancials class
    from yahoofinancials import YahooFinancials as yf

    def Obtain_Yahho_Financials(i, df):
        try:
            data = yf(i)
            df.loc[df.Symbol == i, 'num_shares_outstanding'] = data.get_num_shares_outstanding(price_type='current')
            df.loc[df.Symbol == i, 'interest_expense'] = data.get_interest_expense()
            df.loc[df.Symbol == i, 'operating_income'] = data.get_operating_income()
            df.loc[df.Symbol == i, 'total_operating_expense'] = data.get_total_operating_expense()
            df.loc[df.Symbol == i, 'total_revenue'] = data.get_total_revenue()
            df.loc[df.Symbol == i, 'cost_of_revenue'] = data.get_cost_of_revenue()
            df.loc[df.Symbol == i, 'income_before_tax'] = data.get_income_before_tax()
            df.loc[df.Symbol == i, 'income_tax_expense'] = data.get_income_tax_expense()
            df.loc[df.Symbol == i, 'gross_profit'] = data.get_gross_profit()
            # ........... (and so on for the remaining columns)
        except Exception:
            pass  # or write NaN to the cell
        return df
As you will notice, the reason I'm using try/except is that the functions sometimes raise an error for certain tickers. That's my second issue: I know this approach is problematic, because with many calls inside one try block, execution jumps out at the first function that raises an error and the remaining functions are never executed. How can I wrap every function call in its own try/except without actually writing 50 separate try/except blocks?
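To make the idea concrete, here is a rough sketch of what I'm picturing: a dict that maps each column name to a method name plus its keyword arguments, and one loop that calls each method inside its own try/except. The names METRICS, collect_metrics and FakeData are hypothetical, made up for this illustration; FakeData is only a stand-in so the snippet runs without hitting Yahoo, and I'm assuming the real object exposes the getters under the names listed above.

```python
import math

# Hypothetical mapping: column name -> (method name, keyword arguments).
METRICS = {
    "num_shares_outstanding": ("get_num_shares_outstanding", {"price_type": "current"}),
    "interest_expense": ("get_interest_expense", {}),
    "operating_income": ("get_operating_income", {}),
}


def collect_metrics(data, metrics=METRICS):
    """Call each getter on `data`; a failure only affects its own column."""
    row = {}
    for column, (method_name, kwargs) in metrics.items():
        try:
            # Look the method up by name and call it with its arguments.
            row[column] = getattr(data, method_name)(**kwargs)
        except Exception:
            # This getter failed for this ticker: record NaN and keep going.
            row[column] = math.nan
    return row


class FakeData:
    """Stand-in for the real data object (assumption, for offline testing)."""

    def get_num_shares_outstanding(self, price_type="current"):
        return 1000

    def get_interest_expense(self):
        raise KeyError("statistic not available for this ticker")

    def get_operating_income(self):
        return 250


row = collect_metrics(FakeData())
print(row)
```

With something like this, I guess each ticker would give one dict, and the dicts could be gathered in a list and turned into a data frame in one go, but I'm not sure whether this is the idiomatic way to do it.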
I'd appreciate any ideas on building this big data frame concisely!