README_0.10.0.md
llamafile 0.10.0 has been a work in progress for a while. Now that we are merging its code with main, we want to leave this document here to explain both the reasons behind it and the process we followed.
Everything started with the goal of replicating a cosmopolitan llama.cpp build from scratch, so we could get the best of both worlds: on the one hand, the characteristic features of llamafiles, namely portability across different systems and architectures and the ability to bundle model weights within llamafile executables; on the other hand, the features and the model support made available by the most recent versions of llama.cpp.
We realise that what makes a llamafile is not just an APE executable, so before merging this code with main we wanted to bring more of its features back into the new build. We believe there's still work to do, but now that the main features are in place we can let you play with a more modern llamafile and ask you directly what you'd most like to see in its future versions.
Older builds (and llamafiles built on them) will still be available; check out our releases and our Example Llamafiles page.
Here are the features we brought into our development branch before merging with main. Most of them were carried over from previous versions of llamafile, and all credit goes to their original authors <3. Some (including a new build for easier sync with upstream llama.cpp, mtmd API support, integration tests, skill docs, and an HTTP chat client for combined mode) are new.
- 20260317
- 20260219: `--image` support to CLI
- 20251215
- 20251209: `--server` parameter
- 20251208
- 20251124
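For reference, here is a minimal sketch of how the two CLI parameters named in the list above might be invoked, assuming they follow the usual llama.cpp-style option syntax. The file names and prompt are placeholders, and the exact flags and defaults in your build may differ, so check `--help` on the binary you are running.

```sh
# Hypothetical invocations for illustration only.

# Multimodal prompt: pass an image alongside a text prompt
# (assumes a vision-capable model is bundled in or passed with -m).
./llava.llamafile --image photo.jpg -p "Describe this photo."

# Combined mode: start the built-in HTTP server instead of the CLI chat.
./model.llamafile --server
```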