docs/source/package_reference/loading_methods.mdx
Methods for listing and loading datasets:
[[autodoc]] datasets.load_dataset
[[autodoc]] datasets.load_from_disk
[[autodoc]] datasets.load_dataset_builder
[[autodoc]] datasets.get_dataset_config_names
[[autodoc]] datasets.get_dataset_infos
[[autodoc]] datasets.get_dataset_split_names
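
As a small offline illustration of `load_from_disk` from the list above, the sketch below saves a tiny in-memory dataset and reloads it. The helper name `save_and_reload`, the toy columns, and the target directory are hypothetical; `Dataset.from_dict` and `Dataset.save_to_disk` belong to the library's `Dataset` class, which is not documented on this page.

```python
import tempfile
from pathlib import Path


def save_and_reload(target_dir):
    """Save a toy dataset to `target_dir`, then read it back with load_from_disk."""
    # Imported inside the function so this module can be imported
    # even where the `datasets` library is not installed.
    from datasets import Dataset, load_from_disk

    # Hypothetical toy data, used only for illustration.
    ds = Dataset.from_dict({"text": ["hello", "world"], "label": [0, 1]})
    ds.save_to_disk(str(target_dir))         # writes Arrow data plus metadata files
    return load_from_disk(str(target_dir))   # reloads the saved dataset as-is
```

Calling `save_and_reload(Path(tempfile.mkdtemp()) / "toy")` should return a `Dataset` with the same two columns and two rows that were saved.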
Configurations used to load data files. They are used when loading local files or a dataset repository:

- local files: `load_dataset("parquet", data_dir="path/to/data/dir")`
- dataset repository: `load_dataset("allenai/c4")`

You can pass arguments to `load_dataset` to configure data loading.
For example, you can specify the `sep` parameter to define the [`~datasets.packaged_modules.csv.CsvConfig`] that is used to load the data:
load_dataset("csv", data_dir="path/to/data/dir", sep="\t")
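
To make the `sep` example concrete, here is a minimal sketch that writes a small tab-separated file and loads it with the `csv` builder. The helper names `write_sample_tsv` and `load_tsv`, the file name, and the sample rows are hypothetical. It passes `data_files` instead of `data_dir` so that exactly one file is read.

```python
import csv
import tempfile
from pathlib import Path


def write_sample_tsv(directory):
    """Write a tiny tab-separated file (hypothetical sample data) and return its path."""
    path = Path(directory) / "sample.tsv"
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(["text", "label"])   # header row becomes the column names
        writer.writerow(["good movie", "1"])
        writer.writerow(["bad movie", "0"])
    return path


def load_tsv(path):
    """Load a tab-separated file with the packaged csv builder."""
    # Imported inside the function so the file-writing helper above
    # works even where the `datasets` library is not installed.
    from datasets import load_dataset

    # `sep` is not an argument of load_dataset itself; it is forwarded
    # to the builder configuration (CsvConfig) used to parse the file.
    return load_dataset("csv", data_files=str(path), sep="\t")
```

For instance, `load_tsv(write_sample_tsv(tempfile.mkdtemp()))` should return a `DatasetDict` whose `train` split has the columns `text` and `label`. Without `sep="\t"`, each line would be parsed with the default comma separator and end up in a single column.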
[[autodoc]] datasets.packaged_modules.text.TextConfig
[[autodoc]] datasets.packaged_modules.text.Text
[[autodoc]] datasets.packaged_modules.csv.CsvConfig
[[autodoc]] datasets.packaged_modules.csv.Csv
[[autodoc]] datasets.packaged_modules.json.JsonConfig
[[autodoc]] datasets.packaged_modules.json.Json
[[autodoc]] datasets.packaged_modules.xml.XmlConfig
[[autodoc]] datasets.packaged_modules.xml.Xml
[[autodoc]] datasets.packaged_modules.parquet.ParquetConfig
[[autodoc]] datasets.packaged_modules.parquet.Parquet
[[autodoc]] datasets.packaged_modules.arrow.ArrowConfig
[[autodoc]] datasets.packaged_modules.arrow.Arrow
[[autodoc]] datasets.packaged_modules.sql.SqlConfig
[[autodoc]] datasets.packaged_modules.sql.Sql
[[autodoc]] datasets.packaged_modules.imagefolder.ImageFolderConfig
[[autodoc]] datasets.packaged_modules.imagefolder.ImageFolder
[[autodoc]] datasets.packaged_modules.audiofolder.AudioFolderConfig
[[autodoc]] datasets.packaged_modules.audiofolder.AudioFolder
[[autodoc]] datasets.packaged_modules.videofolder.VideoFolderConfig
[[autodoc]] datasets.packaged_modules.videofolder.VideoFolder
[[autodoc]] datasets.packaged_modules.hdf5.HDF5Config
[[autodoc]] datasets.packaged_modules.hdf5.HDF5
[[autodoc]] datasets.packaged_modules.pdffolder.PdfFolderConfig
[[autodoc]] datasets.packaged_modules.pdffolder.PdfFolder
[[autodoc]] datasets.packaged_modules.niftifolder.NiftiFolderConfig
[[autodoc]] datasets.packaged_modules.niftifolder.NiftiFolder
[[autodoc]] datasets.packaged_modules.webdataset.WebDataset