In FastGPT, files are stored using MongoDB's GridFS, while the actual data is stored in PostgreSQL. Each row in PG has a file_id column that references the corresponding file. For backward compatibility, and to support manually entered and annotated data, file_id can also take a few special values.
Note: file_id is only written at data insertion time and cannot be modified afterward.
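When a file is uploaded, it is stored and assigned a file_id. The file is marked as unused at this point. The data derived from the file carries that file_id. When the data is submitted, the file is marked as used, and the data is pushed to the mongo training collection to await processing.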
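
The file lifecycle described in this section (a file starts out unused, then becomes used once its data is queued for processing) can be sketched roughly as follows. This is a minimal in-memory illustration, not FastGPT's actual code: the `Map` and array stand in for GridFS and the mongo training collection, and every name (`uploadFile`, `pushData`, `StoredFile`, `TrainingItem`) is hypothetical.

```typescript
// In-memory sketch only. The Map and array below stand in for GridFS and the
// mongo training collection; all names are hypothetical, not FastGPT's API.

type FileStatus = "unused" | "used";

interface StoredFile {
  id: string;
  name: string;
  status: FileStatus;
}

interface TrainingItem {
  // Written once when the item is created; readonly mirrors the rule that
  // file_id cannot be modified after insertion.
  readonly file_id: string;
  text: string;
}

const files = new Map<string, StoredFile>(); // stand-in for GridFS
const trainingQueue: TrainingItem[] = [];    // stand-in for the training collection

let nextId = 1;

// Store a file. It starts out "unused" because no data references it yet.
function uploadFile(name: string): string {
  const id = `file-${nextId++}`;
  files.set(id, { id, name, status: "unused" });
  return id;
}

// Mark the file "used" and queue its chunks, each tagged with the file_id.
// Returns the number of items now waiting in the queue.
function pushData(fileId: string, chunks: string[]): number {
  const file = files.get(fileId);
  if (!file) {
    throw new Error(`unknown file_id: ${fileId}`);
  }
  file.status = "used";
  for (const text of chunks) {
    trainingQueue.push({ file_id: fileId, text });
  }
  return trainingQueue.length;
}
```

In this sketch a file that is uploaded but never passed to `pushData` stays unused, which is one plausible way such a flag could be used to find orphaned uploads later. That cleanup step is an assumption and is not described in the text above.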