BentoCloud is an Inference Management Platform and Compute Orchestration Engine built on top of BentoML's open-source serving framework. It provides a complete stack for building fast and scalable AI systems with any model, on any cloud.
Why developers love BentoCloud:
Here is the workflow of deploying your AI service to BentoCloud:
.. image:: ../../_static/img/get-started/cloud-deployment/bentocloud-deployment-workflow.png
   :align: center
   :alt: BentoCloud deployment workflow
Visit the `BentoML website <https://www.bentoml.com/>`_ to sign up.
Install BentoML.
.. code-block:: bash

   pip install bentoml
Log in to BentoCloud with the ``bentoml cloud login`` command. Follow the on-screen instructions to :ref:`create a new API token <creating-an-api-token>`.
.. code-block:: bash

   $ bentoml cloud login
   ? How would you like to authenticate BentoML CLI? [Use arrows to move]
     Create a new API token with a web browser
     Paste an existing API token
Clone the :doc:`hello-world` example.
.. code-block:: bash

   git clone https://github.com/bentoml/quickstart.git
   cd quickstart
Deploy it to BentoCloud from the project directory. Optionally, use the ``-n`` flag to set a name.
.. code-block:: bash

   bentoml deploy -n my-first-bento
.. note::

   By default, this command packages all files under the directory from which it is executed. To exclude specific files or directories, define them in a ``.bentoignore`` file.
Sample output:
.. code-block:: bash

   🍱 Built bento summarization:ngfnciv5g6nxonry
   Successfully pushed Bento "summarization:ngfnciv5g6nxonry"
   ✅ Created deployment "my-first-bento" in cluster "google-cloud-us-central-1"
   💻 View Dashboard: https://demo.cloud.bentoml.com/deployments/my-first-bento
The first Deployment might take a minute or two. Wait until it's fully ready:
.. code-block:: bash

   ✅ Deployment "my-first-bento" is ready: https://demo.cloud.bentoml.com/deployments/my-first-bento
On the BentoCloud console, navigate to the Deployments page, and click your Deployment. Once it's up and running, you can interact with it using the Form section on the Playground tab.
.. image:: ../_static/img/get-started/cloud-deployment/first-bento-on-bentocloud.png
   :alt: A summarization model running on BentoCloud
Retrieve the Deployment URL via the CLI. Replace ``my-first-bento`` if you used another name.
.. code-block:: bash

   bentoml deployment get my-first-bento -o json | jq ."endpoint_urls"
.. note::

   Ensure ``jq`` is installed for processing JSON output.
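If ``jq`` is not available, the same lookup can be sketched in Python. This is a minimal sketch, not part of BentoML: the helper names ``extract_endpoint_urls`` and ``get_endpoint_urls`` are hypothetical, and it assumes the JSON output is a top-level object whose ``endpoint_urls`` key (the key selected by the ``jq`` filter above) holds a list of URL strings.

.. code-block:: python

   import json
   import subprocess
   from typing import List


   def extract_endpoint_urls(deployment_json: str) -> List[str]:
       """Return the ``endpoint_urls`` list from the CLI's JSON output."""
       data = json.loads(deployment_json)
       # Assumption: ``endpoint_urls`` is a top-level key, as the jq filter
       # implies. A missing or null key yields an empty list.
       return list(data.get("endpoint_urls") or [])


   def get_endpoint_urls(name: str) -> List[str]:
       """Run ``bentoml deployment get <name> -o json`` and parse its output."""
       completed = subprocess.run(
           ["bentoml", "deployment", "get", name, "-o", "json"],
           check=True,           # raise CalledProcessError if the CLI fails
           capture_output=True,  # collect stdout instead of printing it
           text=True,            # decode stdout to str
       )
       return extract_endpoint_urls(completed.stdout)

Calling ``get_endpoint_urls("my-first-bento")`` requires the BentoML CLI to be installed and logged in; ``extract_endpoint_urls`` can be checked on its own against saved JSON output.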
Create :doc:`a BentoML client </build-with-bentoml/clients>` to call the exposed endpoint. Replace the example URL with your Deployment's URL:
.. code-block:: python

   import bentoml

   # Replace this URL with the endpoint URL of your own Deployment.
   client = bentoml.SyncHTTPClient("https://my-first-bento-e3c1c7db.mt-guc1.bentoml.ai")

   # Call the service's summarize API; the result is the summary text.
   result: str = client.summarize(
       text="Breaking News: In an astonishing turn of events, the small town of Willow Creek has been taken by storm as local resident Jerry Thompson's cat, Whiskers, performed what witnesses are calling a 'miraculous and gravity-defying leap.' Eyewitnesses report that Whiskers, an otherwise unremarkable tabby cat, jumped a record-breaking 20 feet into the air to catch a fly. The event, which took place in Thompson's backyard, is now being investigated by scientists for potential breaches in the laws of physics. Local authorities are considering a town festival to celebrate what is being hailed as 'The Leap of the Century.'",
   )
   print(result)
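For environments where the BentoML client is not installed, the call above can be approximated with a plain HTTP request. The following is a rough sketch under stated assumptions, not the documented client: it assumes the ``summarize`` API is served at ``POST <deployment-url>/summarize``, accepts a JSON body with a ``text`` field, and needs no authentication header. The helper name ``build_summarize_request`` is hypothetical.

.. code-block:: python

   import json
   import urllib.request


   def build_summarize_request(base_url: str, text: str) -> urllib.request.Request:
       """Build (but do not send) a POST request for the summarize API."""
       # Assumption: the API route is named after the service method.
       url = base_url.rstrip("/") + "/summarize"
       # Assumption: the input is passed as a JSON object keyed by parameter name.
       body = json.dumps({"text": text}).encode("utf-8")
       return urllib.request.Request(
           url,
           data=body,
           headers={"Content-Type": "application/json"},
           method="POST",
       )


   # Usage (performs a network call, so it is left commented out):
   # request = build_summarize_request("https://<your-deployment-url>", "Some long text")
   # with urllib.request.urlopen(request) as response:
   #     print(response.read().decode("utf-8"))

If the Deployment is protected, the request would also need credentials, which this sketch does not cover.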
To apply changes to your code, modify it locally and update the Deployment on BentoCloud by running:
.. code-block:: bash

   bentoml deployment update my-first-bento --bento ./project/directory
For more information, see :doc:`/scale-with-bentocloud/deployment/manage-deployments`.
The replica count defaults to 1. You can update the minimum and maximum replicas allowed for scaling:
.. code-block:: bash

   bentoml deployment update my-first-bento --scaling-min 0 --scaling-max 3
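When scaling is driven from a script, it can help to assemble the command in one place and reject obviously inconsistent values before calling the CLI. The sketch below is illustrative only: ``build_scaling_command`` is a hypothetical helper, and its checks (a non-negative minimum, and a maximum no smaller than the minimum) are assumptions made for this example, not rules taken from BentoCloud. It uses only the ``--scaling-min`` and ``--scaling-max`` flags shown above.

.. code-block:: python

   import subprocess
   from typing import List


   def build_scaling_command(name: str, scaling_min: int, scaling_max: int) -> List[str]:
       """Return the argument list for ``bentoml deployment update`` with scaling flags."""
       # Assumed sanity checks; BentoCloud may apply its own validation.
       if scaling_min < 0:
           raise ValueError("scaling_min must be non-negative")
       if scaling_max < scaling_min:
           raise ValueError("scaling_max must be greater than or equal to scaling_min")
       return [
           "bentoml", "deployment", "update", name,
           "--scaling-min", str(scaling_min),
           "--scaling-max", str(scaling_max),
       ]


   # Usage (requires the BentoML CLI and a logged-in session, so it is commented out):
   # subprocess.run(build_scaling_command("my-first-bento", 0, 3), check=True)

Passing the arguments as a list avoids shell quoting issues if the Deployment name contains unusual characters.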
To terminate this Deployment, click Stop in the top right corner of its details page or simply run:
.. code-block:: bash

   bentoml deployment terminate my-first-bento
If you are a first-time user of BentoCloud, we recommend you read the following documents to get started:
- Deploy :doc:`example projects </examples/overview>` to BentoCloud
- :doc:`/scale-with-bentocloud/deployment/manage-deployments`
- :doc:`/scale-with-bentocloud/deployment/create-deployments`
- :doc:`/scale-with-bentocloud/manage-api-tokens`