examples/loadHistoricalPriceData.ipynb
This notebook demonstrates some of the ways to approach loading historical price data using the OpenBB Platform. The action is in the Equity module; but first, we need to initialize the notebook with the import statements block.
from datetime import datetime, timedelta
import pandas as pd
from openbb import obb
Historical market prices typically come in the form of OHLC+V - open, high, low, close, volume. There may be additional fields returned by a provider, but those are the expected columns. Granularity and amount of historical data will vary by provider and subscription status. Visit their websites to understand what your entitlements are.
['alpha_vantage', 'cboe', 'fmp', 'intrinio', 'polygon', 'tiingo', 'yfinance']
Common parameters have been standardized across all souces, start_date, end_date, interval.
The default interval will be 1d.
The depth of historical data and choices for granularity will vary by provider and subscription status. Refer to the website and documentation of each source understand your specific entitlements.
For demonstration purposes, we will use the openbb-yfinance data extension.
df_daily = obb.equity.price.historical(symbol="spy", provider="yfinance")
df_daily.to_df().head(1)
To load the entire history available from a source, pick a starting date well beyond what it might be. For example, 1900-01-01
df_daily = obb.equity.price.historical(
symbol="spy", start_date="1990-01-01", provider="yfinance"
).to_df()
df_daily.head(1)
The intervals are entered according to this pattern:
1m = One Minute1h = One Hour1d = One Day1W = One Week1M = One MonthThe date for monthly value is the first or last, depending on the provider. This can be easily resampled from daily data.
df_monthly = obb.equity.price.historical(
"spy", start_date="1990-01-01", interval="1M", provider="yfinance"
).to_df()
df_monthly.tail(2)
yfinance returns the monthly data for the first day of each month. Let's resample it to take from the last, using the daily information captured in the previous cells.
df_daily.index = pd.to_datetime(df_daily.index)
(
df_daily[["open", "high", "low", "close", "volume"]]
.resample("ME")
.agg(
{"open": "first", "high": "max", "low": "min", "close": "last", "volume": "sum"}
)
)
The block below packs an object with most intervals.
class HistoricalPrices:
def __init__(self, symbol, start_date, end_date, provider, **kwargs) -> None:
self.one: pd.DataFrame = (
obb.equity.price.historical(
symbol=symbol,
start_date=start_date,
end_date=end_date,
interval="1m",
provider=provider,
**kwargs
)
.to_df()
.convert_dtypes()
)
self.five: pd.DataFrame = (
obb.equity.price.historical(
symbol=symbol,
start_date=start_date,
end_date=end_date,
interval="5m",
provider=provider,
**kwargs
)
.to_df()
.convert_dtypes()
)
self.fifteen: pd.DataFrame = (
obb.equity.price.historical(
symbol=symbol,
start_date=start_date,
end_date=end_date,
interval="15m",
provider=provider,
**kwargs
)
.to_df()
.convert_dtypes()
)
self.thirty: pd.DataFrame = (
obb.equity.price.historical(
symbol=symbol,
start_date=start_date,
end_date=end_date,
interval="30m",
provider=provider,
**kwargs
)
.to_df()
.convert_dtypes()
)
self.sixty: pd.DataFrame = (
obb.equity.price.historical(
symbol=symbol,
start_date=start_date,
end_date=end_date,
interval="60m",
provider=provider,
**kwargs
)
.to_df()
.convert_dtypes()
)
self.daily: pd.DataFrame = (
obb.equity.price.historical(
symbol=symbol,
start_date=start_date,
end_date=end_date,
interval="1d",
provider=provider,
**kwargs
)
.to_df()
.convert_dtypes()
)
self.weekly: pd.DataFrame = (
obb.equity.price.historical(
symbol=symbol,
start_date=start_date,
end_date=end_date,
interval="1W",
provider=provider,
**kwargs
)
.to_df()
.convert_dtypes()
)
self.monthly: pd.DataFrame = (
obb.equity.price.historical(
symbol=symbol,
start_date=start_date,
end_date=end_date,
interval="1M",
provider=provider,
**kwargs
)
.to_df()
.convert_dtypes()
)
def load_historical(
symbol: str = "", start_date=None, end_date=None, provider=None, **kwargs
) -> HistoricalPrices:
if symbol == "":
display("Please enter a ticker symbol")
if provider is None:
provider = "yfinance"
prices = HistoricalPrices(symbol, start_date, end_date, provider, **kwargs)
return prices
prices = load_historical("spy")
display(prices.__dict__.keys())
display(prices.weekly.tail(2))
display(prices.one.head(2))
To demonstrate the difference between sources, let's compare values for daily volume from several sources.
# Collect the data
yahoo = obb.equity.price.historical("spy", provider="yfinance").to_df()
alphavantage = obb.equity.price.historical("spy", provider="alpha_vantage").to_df()
intrinio = obb.equity.price.historical("spy", provider="intrinio").to_df()
fmp = obb.equity.price.historical("spy", provider="fmp").to_df()
polygon = obb.equity.price.historical("spy", provider="polygon").to_df()
# Make a new DataFrame with just the volume columns
compare = pd.DataFrame()
compare["AV Volume"] = alphavantage["volume"].tail(10)
compare["FMP Volume"] = fmp["volume"].tail(10)
compare["Intrinio Volume"] = intrinio["volume"].tail(10)
compare["Yahoo Volume"] = yahoo["volume"].tail(10)
compare["Polygon Volume"] = polygon["volume"].tail(10)
compare.dropna(how="any")
Other types of assets and ticker symbols can be loaded from obb.equity.price.historical(), below are some examples but not an exhaustive list.
Some sources use - as the distinction between a share class, e.g., BRK-A and BRK-B. Other formats include:
BRK.ABRK/Aobb.equity.price.historical("brk.b", provider="polygon")
obb.equity.price.historical("brk-b", provider="fmp")
While some providers handle the different formats on their end, others do not. This is something to consider when no results are returned from one source. Some may even use a combination, or accept multiple variations. Sometimes there is no real logic behind the additional characters, GOOGL vs. GOOG. These are known unknown variables of ticker symbology, what's good for one source may return errors from another.
With providers supporting market data from multiple jurisdictions, the most common method for requesting data outside of US-listings is to append a suffix to the ticker symbol (e.g., RELIANCE.NS for Indian equities). Formats may be unique to a provider, so it is best to review the source's documentation for an overview of their specific conventions. This page on Yahoo describes how they format symbols, which many others follow to some degree.
Sources will have their own treatment of these symbols, some examples are:
FX symbols face the same dilemna as share classes, there are several variations of the same symbol.
EURUSD=XC:EURUSDEURUSDThe symbol prefixes are handled internally when obb.currency.price.historical() is used to enter a pair with no extra characters.
Similar, but different to FX tickers.
BTC-USDX:BTCUSDBTCUSDThe symbol prefixes are handled internally when obb.crypto.price.historical() is used to enter a pair with no extra characters and placing the fiat currency second.
Historical prices for active contracts, and the continuation chart, can be fetched via yfinance.
CL=FCLZ24.NYMCLH24.NYMIndividual contracts will require knowing which of the CME venues the future is listed on. ["NYM", "NYB", "CME", "CBT"].
Individual options contracts are also loadable from openbb.equity.price.historical().
SPY241220P00400000O:SPY241220P00400000These examples represent only a few methods for fetching historical price data. Explore the contents of each module to find more!
obb.equity.price.historical("SPY251219P00400000", provider="yfinance").to_df()
obb.equity.price.historical("SPX", provider="cboe").to_df()
obb.equity.price.historical("^SPX", provider="fmp").to_df()
obb.equity.price.historical("CLZ25.NYM", provider="yfinance").to_df()
obb.equity.price.historical("CL=F", provider="fmp").to_df()
obb.equity.price.historical("usdjpy=x", provider="yfinance").to_df()
obb.currency.price.historical("usdjpy", provider="yfinance").to_df()