core/http/views/p2p.html
{{template "views/partials/inprogress" .}} {{ if eq .P2PToken "" }}
Enable peer-to-peer distribution to scale your AI workloads across multiple devices. Share instances, shard models, and pool computational resources across your network.
Load balance across multiple instances
Split large models across workers
Pool resources from multiple devices
1
Start LocalAI with P2P enabled
local-ai run --p2p
This will automatically generate a network token for you.
2
Or use an existing token
export TOKEN="your-token-here"
local-ai run --p2p
If you already have a token from another instance, you can reuse it.
3
Access the P2P dashboard
Once enabled, refresh this page to see your network token and start connecting nodes.
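Putting the steps above together, a minimal quickstart might look like the following sketch (it assumes the `local-ai` binary is on your PATH; the token value is a placeholder, not a real token):

```shell
# Option A: start LocalAI with P2P enabled and let it
# generate a network token for you automatically
local-ai run --p2p

# Option B: reuse a token from another instance instead
# (run this instead of Option A, not in addition to it)
export TOKEN="your-token-here"   # placeholder - use your real token
local-ai run --p2p
```

After startup, refresh this page to see the active token and connect more nodes.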
{{ else }}
Scale your AI workloads across multiple devices with peer-to-peer distribution
LocalAI leverages cutting-edge peer-to-peer technologies to distribute AI workloads intelligently across your network
Share complete LocalAI instances across your network for load balancing and redundancy. Perfect for scaling across multiple devices.
Split large model weights across multiple workers. Currently supported with llama.cpp backends for efficient memory usage.
Pool computational resources from multiple devices, including your friends' machines, to handle larger workloads collaboratively.
Faster
Parallel processing
Scalable
Add more nodes
Resilient
Fault tolerant
Efficient
Resource optimization
{{.P2PToken}}
The network token can be used to share this instance, join a federation, or join a worker network. Below you will find examples of how to start a new instance or a worker with this token.
Instance sharing
/
nodes
Load balanced instances
Model sharding
/
workers
Distributed computation
Connection token
Ready to connect
Instance load balancing and sharing
Active Nodes
/
Start LocalAI in federated mode to share your instance, or launch a federated server to distribute requests intelligently across multiple nodes in your network.
No nodes available
Start some workers to see them here
export TOKEN="{{.P2PToken}}"
local-ai run --federated --p2p
Note: If you don't already have a token, omit it and use the generated one shown on this page.
export TOKEN="{{.P2PToken}}"
local-ai federated
Note: A token is required when starting the federated server.
For all available options, please refer to the documentation.
docker run -ti --net host -e TOKEN="{{.P2PToken}}" --name local-ai -p 8080:8080 localai/localai:latest-cpu run --federated --p2p
docker run -ti --net host -e TOKEN="{{.P2PToken}}" --name local-ai -p 9090:8080 localai/localai:latest-cpu federated
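As a sketch, the two commands above can be combined into a small federated setup across two machines. The host layout, detached mode (`-d`), and the `/v1/models` check are illustrative assumptions; the port assumes the federated server's default listen address:

```shell
# On machine A: share an instance in federated mode
docker run -d --net host -e TOKEN="{{.P2PToken}}" --name local-ai \
  localai/localai:latest-cpu run --federated --p2p

# On machine B: run the federated server that balances
# incoming requests across the shared instances
docker run -d --net host -e TOKEN="{{.P2PToken}}" --name local-ai-federated \
  localai/localai:latest-cpu federated

# Then point any OpenAI-compatible client at the federated
# server (port is an assumption - adjust to your setup)
curl http://localhost:8080/v1/models
```

Note that with `--net host` the container shares the host's network stack, so any `-p` port mappings are effectively ignored.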
For all available options and to see which image to use, please refer to the Container images documentation and CLI parameters documentation.
Distributed model computation (llama.cpp)
Active Workers
/
Deploy llama.cpp workers to split model weights across multiple devices. This enables processing larger models by distributing computational load and memory requirements.
No workers available
Start some workers to see them here
export TOKEN="{{.P2PToken}}"
local-ai worker p2p-llama-cpp-rpc
For all available options, please refer to the documentation.
docker run -ti --net host -e TOKEN="{{.P2PToken}}" --name local-ai -p 8080:8080 localai/localai:latest-cpu worker p2p-llama-cpp-rpc
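Sketching a distributed llama.cpp deployment with the commands above (the number of workers and the machine layout are illustrative; only the commands shown on this page are used):

```shell
# On each worker machine: join the network with the shared token
export TOKEN="{{.P2PToken}}"
local-ai worker p2p-llama-cpp-rpc

# On the main machine: start LocalAI with P2P enabled so the
# llama.cpp backend can shard model weights across the workers
export TOKEN="{{.P2PToken}}"
local-ai run --p2p
```

Once the workers register, they appear in the Active Workers list above and larger models can be loaded than any single machine's memory would allow.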
For all available options and to see which image to use, please refer to the Container images documentation and CLI parameters documentation.
{{ end }}
{{template "views/partials/footer" .}}