# TOML selector

Mistral.rs supports loading models from a `.toml` file, and the fields are the same as for the CLI. Please find some example TOML selectors in the `toml-selectors/` directory.

There are a few cases which add functionality that cannot be found in the CLI:
## Speculative decoding

Under `[speculative]`:
- `gamma` parameter

Under `[speculative.draft_model]`:
- the draft model selector (the only requirement is that the target and draft models have the same tokenizer)

For example:

```toml
[model]
model_id = "mistralai/Mistral-7B-Instruct-v0.1"
arch = "mistral"

[speculative]
gamma = 32

[speculative.draft_model]
tok_model_id = "mistralai/Mistral-7B-Instruct-v0.1"
quantized_model_id = "TheBloke/Mistral-7B-Instruct-v0.1-GGUF"
quantized_filename = "mistral-7b-instruct-v0.1.Q2_K.gguf"
```

```bash
mistralrs from-config -f toml-selectors/speculative-gguf.toml
```
## AnyMoE

Under `[anymoe]`, all fields are required unless specified:
- `dataset_json`
- `prefix` and `mlp`: check the tensor names at https://huggingface.co/&lt;MODEL ID&gt;/tree/main?show_file_info=model.safetensors.index.json; for example, `model.layers.27.mlp.down_proj.weight` means that the prefix is `model.layers` and the mlp is `mlp`
- `model_ids`
- `layers`

Under `[anymoe.config]`:
- `hidden_size`: found in https://huggingface.co/&lt;BASE MODEL ID&gt;/blob/main/config.json
- `expert_type`

(For LoRA experts) Under `[anymoe.config.expert_type.lora_adapter]`:
- `rank`
- `alpha`
- `target_modules`
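To illustrate how `prefix` and `mlp` relate to a tensor name from the safetensors index, a small sketch (the splitting logic here is illustrative, not mistral.rs internals):

```python
# A tensor name from model.safetensors.index.json:
name = "model.layers.27.mlp.down_proj.weight"

parts = name.split(".")
# The layer index is the first purely numeric component; the prefix is
# everything before it, and the mlp is the component right after it.
idx = next(i for i, p in enumerate(parts) if p.isdigit())
prefix = ".".join(parts[:idx])  # "model.layers"
mlp = parts[idx + 1]            # "mlp"

assert (prefix, mlp) == ("model.layers", "mlp")
```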
```bash
mistralrs from-config -f toml-selectors/anymoe.toml
```
For example, with fine-tuned experts:

```toml
[model]
model_id = "mistralai/Mistral-7B-Instruct-v0.1"
arch = "mistral"

[anymoe]
dataset_json = "test.csv"
prefix = "model.layers"
mlp = "mlp"
model_ids = ["HuggingFaceH4/zephyr-7b-beta"]
layers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

[anymoe.config]
hidden_size = 4096
expert_type = "fine_tuned"
```
And with LoRA adapter experts:

```toml
[model]
model_id = "HuggingFaceH4/zephyr-7b-beta"
arch = "mistral"

[anymoe]
dataset_json = "test.csv"
prefix = "model.layers"
mlp = "mlp"
model_ids = ["EricB/example_adapter"]
layers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

[anymoe.config]
hidden_size = 4096

[anymoe.config.expert_type.lora_adapter]
rank = 16
alpha = 16
target_modules = ["gate_proj"]
```