Remote Page/File Loader


```bash
pip install llama-index-readers-remote
```

This loader extracts the text from a remote page or file given only its URL. If the URL points to a file, the loader downloads it to a temporary location and parses it using SimpleDirectoryReader. It is an all-in-one tool for (almost) any URL.

As a result, almost any page or type of file is supported. For instance, if a .txt URL such as a Project Gutenberg book is passed in, the text is parsed as is. If a hosted .mp3 URL is passed in, the file is downloaded and parsed using AudioTranscriber.
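The download step described above can be pictured with a minimal, standard-library-only sketch. It is not the loader's actual implementation: `download_and_read` is a hypothetical helper, and it simply returns the raw bytes where the real loader would hand the temporary file to SimpleDirectoryReader.

```python
import tempfile
from pathlib import Path
from typing import Tuple
from urllib.parse import urlparse
from urllib.request import urlopen


def download_and_read(url: str) -> Tuple[str, bytes]:
    """Fetch `url` into a temporary directory and return (suffix, raw bytes).

    Hypothetical helper for illustration only; not part of llama-index.
    """
    # Keep the URL's file extension so a parser can later be chosen by file type.
    suffix = Path(urlparse(url).path).suffix
    with tempfile.TemporaryDirectory() as temp_dir:
        temp_path = Path(temp_dir) / f"temp{suffix}"
        with urlopen(url) as response:
            temp_path.write_bytes(response.read())
        # A real loader would parse the file here, while it still exists.
        data = temp_path.read_bytes()
    # The temporary directory and the downloaded copy are deleted at this point.
    return suffix, data
```

Preserving the suffix matters because the parser for a downloaded file is typically selected by its extension, which is why a .txt URL and an .mp3 URL can end up on different parsing paths.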

Usage

To use this loader, pass the URL of a remote page or file to load_data. Optionally, you may specify a file_extractor for the SimpleDirectoryReader to use in place of the default one.
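Conceptually, a file_extractor is a mapping from file extension to the parser that should handle that type. The sketch below illustrates that idea with plain functions. All names in it (`Extractor`, `DEFAULT_EXTRACTORS`, `read_plain_text`, `extract`) are hypothetical, and it does not use or reproduce SimpleDirectoryReader.

```python
from pathlib import Path
from typing import Callable, Dict, Optional

# Hypothetical stand-in for a real parser: turns a file path into text.
Extractor = Callable[[Path], str]


def read_plain_text(path: Path) -> str:
    """Return the file's contents decoded as UTF-8."""
    return path.read_text(encoding="utf-8")


# Assumed defaults for this sketch only; real defaults cover many more types.
DEFAULT_EXTRACTORS: Dict[str, Extractor] = {
    ".txt": read_plain_text,
    ".md": read_plain_text,
}


def extract(path: Path, file_extractor: Optional[Dict[str, Extractor]] = None) -> str:
    """Parse `path` with the extractor registered for its extension."""
    # User-supplied entries take precedence over the defaults for the same extension.
    extractors = {**DEFAULT_EXTRACTORS, **(file_extractor or {})}
    suffix = path.suffix.lower()
    if suffix not in extractors:
        raise ValueError(f"No extractor registered for {suffix!r}")
    return extractors[suffix](path)
```

Passing `file_extractor={".txt": my_parser}` in this sketch replaces only the .txt handler and leaves the other defaults in place, which is the usual reason to supply a custom mapping.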

```python
from llama_index.readers.remote import RemoteReader

loader = RemoteReader()
documents = loader.load_data(
    url="https://en.wikipedia.org/wiki/File:Example.jpg"
)
```

This loader is designed to load data into LlamaIndex.