Create your own local OHLCV datasets using Python and the Binance API
Learn how to pull historical data from the Binance API with no limit on the size of the time range. Keep your datasets up to date by loading only the most recent missing candles.
In my workflow, I chose to use local datasets of historical market data instead of pulling data from the Binance exchange each time I need it. This saves the time and computing power otherwise spent on repeated downloads, and it makes working offline simpler and more robust. All my coding projects are located in a single folder on my local machine. Each project that requires cryptocurrency data reads from one shared folder containing the datasets for a selection of cryptocurrency markets.
I wrote a small Python script whose purpose is to load the entire market history available on Binance for a given pair. As a result, the oldest candlesticks stored in these CSV files are as old as Binance itself, which was created in 2017. Moreover, whenever I want to update these datasets with the most recent candles, the same single command run in my terminal looks up the latest candle stored in a given dataset and pulls data from the Binance API from that candle onward.
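To make the update idea concrete, here is a minimal sketch of how such an incremental update could work. It is not the script described in this article: the names `read_rows`, `update_dataset` and `fetch_since`, as well as the column layout in `HEADER`, are assumptions chosen for illustration. The sketch assumes the first CSV column holds the candle open time in milliseconds, and that `fetch_since(start_ms)` is any callable returning candles whose open time is at or after `start_ms`. The last stored candle is fetched again because it may still have been open when it was saved.

```python
import csv
import os

# Assumed column layout for the local CSV files (hypothetical).
HEADER = ["open_time", "open", "high", "low", "close", "volume"]


def read_rows(csv_path):
    """Return the stored candles (header skipped), or [] if there is no file."""
    if not os.path.exists(csv_path):
        return []
    with open(csv_path, newline="") as f:
        return list(csv.reader(f))[1:]


def update_dataset(csv_path, fetch_since):
    """Refresh the CSV from its last stored candle onward.

    Returns the number of candles received from `fetch_since`.
    """
    rows = read_rows(csv_path)
    # Resume at the last stored candle, or from the beginning for a new file.
    start_ms = int(rows[-1][0]) if rows else 0
    new_rows = [r[:len(HEADER)] for r in fetch_since(start_ms)]
    if not new_rows:
        return 0  # nothing received: leave the file untouched
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(HEADER)
        writer.writerows(rows[:-1])  # drop the last candle, it is re-fetched
        writer.writerows(new_rows)
    return len(new_rows)
```

Because the download function is passed in as an argument, the same update logic can be tried offline with a fake data source before pointing it at the real API.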
Step 1: create an exchange.py file that will interact with the Binance API
This file will contain the minimum set of functions needed to send a request to the Binance API, read the response, and post-process the raw data into a more readable DataFrame. The added value of this script lies in the most_recent_market_data() function. The klines() function can only retrieve a limited number of candles in a single request. The most_recent_market_data() function therefore performs multiple requests to cover the full time range of data available on the exchange and to guarantee the time continuity of the resulting dataset.
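The request-chaining idea can be sketched as follows. This is an illustration under stated assumptions, not the content of exchange.py: the function names `fetch_klines` and `fetch_all_klines` are hypothetical, and the standard library is used instead of any particular HTTP package. It relies on the public spot endpoint `/api/v3/klines`, which accepts `symbol`, `interval`, `startTime` and `limit` parameters and returns each candle as a list whose first element is the open time in milliseconds.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.binance.com/api/v3/klines"  # public spot endpoint


def fetch_klines(symbol, interval, start_ms, limit=1000):
    """One request: at most `limit` candles with open time >= start_ms."""
    query = urllib.parse.urlencode(
        {"symbol": symbol, "interval": interval,
         "startTime": start_ms, "limit": limit}
    )
    with urllib.request.urlopen(f"{BASE_URL}?{query}", timeout=10) as resp:
        return json.load(resp)


def fetch_all_klines(fetch_page, start_ms=0, limit=1000):
    """Chain requests until the source returns a page shorter than `limit`.

    `fetch_page(start_ms, limit)` is any callable returning a list of
    candles ordered by open time (hypothetical interface).
    """
    rows = []
    while True:
        page = fetch_page(start_ms, limit)
        if not page:
            break
        rows.extend(page)
        if len(page) < limit:
            break  # a short page means the most recent candle was reached
        # Resume just after the last open time: no gap and no duplicate.
        start_ms = page[-1][0] + 1
    return rows
```

A call for a real pair could then look like `fetch_all_klines(lambda s, n: fetch_klines("BTCUSDT", "1h", s, n))`. A production version would probably also pause between requests to stay within the exchange's rate limits, which this sketch does not do.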






