Data Collector

scripts/data_collector/README.md


Introduction

Scripts for data collection

Custom Data Collection

For a concrete implementation, see the Yahoo collector: https://github.com/microsoft/qlib/tree/main/scripts/data_collector/yahoo

  1. Create a directory for the dataset's code in the current directory
  2. Add collector.py
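    • Add a collector class:
      python
      import sys
      from pathlib import Path

      # Put the scripts directory (two levels up) on sys.path so data_collector.base can be imported
      CUR_DIR = Path(__file__).resolve().parent
      sys.path.append(str(CUR_DIR.parent.parent))
      from data_collector.base import BaseCollector, BaseNormalize, BaseRun

      class UserCollector(BaseCollector):
          ...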
      
    • Add a normalize class:
      python
      class UserNormalize(BaseNormalize):
          ...
      
    • Add a CLI class:
      python
      class Run(BaseRun):
          ...
      
  3. Add README.md
  4. Add requirements.txt
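
The steps above can be sketched as a small scaffolding helper. This is only an illustration: `scaffold_dataset` and `COLLECTOR_TEMPLATE` are hypothetical names, not part of Qlib, and the generated README.md and requirements.txt contents are placeholders.

```python
from pathlib import Path

# Hypothetical template that mirrors the class skeletons from step 2.
COLLECTOR_TEMPLATE = '''\
import sys
from pathlib import Path

CUR_DIR = Path(__file__).resolve().parent
sys.path.append(str(CUR_DIR.parent.parent))
from data_collector.base import BaseCollector, BaseNormalize, BaseRun


class UserCollector(BaseCollector):
    ...


class UserNormalize(BaseNormalize):
    ...


class Run(BaseRun):
    ...
'''


def scaffold_dataset(base_dir, name):
    """Create <base_dir>/<name> with collector.py, README.md and requirements.txt.

    Existing files are left untouched. Returns the dataset directory.
    """
    dataset_dir = Path(base_dir) / name
    dataset_dir.mkdir(parents=True, exist_ok=True)
    files = {
        "collector.py": COLLECTOR_TEMPLATE,
        "README.md": f"# {name} data collector\n",  # placeholder content
        "requirements.txt": "",  # fill in the collector's dependencies
    }
    for filename, content in files.items():
        path = dataset_dir / filename
        if not path.exists():
            path.write_text(content, encoding="utf-8")
    return dataset_dir
```

Calling `scaffold_dataset(".", "my_dataset")` from the data collector directory would produce the layout described above. The class bodies still have to be written by hand; the Yahoo collector linked earlier shows what they can look like.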

Description of dataset

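|             | Basic data                                                           |
|-------------|----------------------------------------------------------------------|
| Features    | Price/Volume: $close, $open, $low, $high, $volume, $change, $factor |
| Calendar    | <freq>.txt: day.txt, 1min.txt                                        |
| Instruments | <market>.txt: all.txt (required); csi300.txt, csi500.txt, sp500.txt  |
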
  • Features: all values must be numeric.
    • If the prices are not adjusted, set factor=1.
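
The two rules for Features (numeric values, and factor=1 for unadjusted prices) can be sketched as a row-level check. This is a minimal sketch, not Qlib code: `clean_feature_row` is a hypothetical helper, and it assumes that raw rows are dictionaries keyed by the field names without the `$` prefix and that every listed field is required.

```python
# Field names from the Price/Volume list above, without the "$" prefix
# (an assumption about how the raw rows are keyed).
PRICE_VOLUME_FIELDS = ("close", "open", "low", "high", "volume", "change", "factor")


def clean_feature_row(row, adjusted=True):
    """Return a dict of the price/volume fields converted to float.

    If the prices are not adjusted, factor is set to 1.0 regardless of the input.
    Raises ValueError if a field is missing or not numeric.
    """
    cleaned = {}
    for field in PRICE_VOLUME_FIELDS:
        if field == "factor" and not adjusted:
            cleaned[field] = 1.0
            continue
        if field not in row:
            raise ValueError(f"missing field: {field}")
        try:
            cleaned[field] = float(row[field])
        except (TypeError, ValueError):
            raise ValueError(f"field {field} is not numeric: {row[field]!r}") from None
    return cleaned
```

A collector's normalize step could run a check like this on each record before writing it out, so that non-numeric values fail early instead of surfacing later.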

Data-dependent component

Each component requires the following data to run correctly:

| Component      | Required data                                 |
|----------------|-----------------------------------------------|
| Data retrieval | Features, Calendar, Instruments               |
| Backtest       | Features[Price/Volume], Calendar, Instruments |
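
The requirements above can also be written down as a lookup, which makes it easy to report what a custom dataset still lacks. This is a sketch only: `REQUIRED_DATA` and `missing_data` are hypothetical names, and the labels are copied from the table rather than taken from any Qlib API.

```python
# Requirements copied from the table above.
REQUIRED_DATA = {
    "Data retrieval": {"Features", "Calendar", "Instruments"},
    "Backtest": {"Features[Price/Volume]", "Calendar", "Instruments"},
}


def missing_data(component, available):
    """Return the sorted list of data kinds the component needs but lacks."""
    return sorted(REQUIRED_DATA[component] - set(available))
```

For example, `missing_data("Backtest", ["Calendar"])` reports that the price/volume features and the instruments are still missing.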