embedchain/docs/api-reference/app/add.mdx
add() method is used to load the data sources from different data sources to a RAG pipeline. You can find the signature below:
from embedchain import App
app = App()
app.add("https://www.forbes.com/profile/elon-musk")
# Inserting batches in chromadb: 100%|███████████████| 1/1 [00:00<00:00, 1.19it/s]
# Successfully saved https://www.forbes.com/profile/elon-musk (DataType.WEB_PAGE). New chunks count: 4
from embedchain import App
app = App()
app.add("https://python.langchain.com/sitemap.xml", data_type="sitemap")
# Loading pages: 100%|█████████████| 1108/1108 [00:47<00:00, 23.17it/s]
# Inserting batches in chromadb: 100%|█████████| 111/111 [04:41<00:00, 2.54s/it]
# Successfully saved https://python.langchain.com/sitemap.xml (DataType.SITEMAP). New chunks count: 11024
You can find complete list of supported data sources here.