=======
Scaling
=======

Read how-to guides to scale inference on BentoCloud.

.. grid:: 1 2 2 2
    :gutter: 3
    :margin: 0
    :padding: 3 4 0 0

    .. grid-item-card:: Autoscaling
        :link: autoscaling
        :link-type: doc

        Configure concurrency and autoscaling to achieve optimal resource utilization and cost-efficiency for your AI workloads.

    .. grid-item-card:: Gateways
        :link: gateways
        :link-type: doc

        Scale inference workloads across multiple regions and cloud providers with a single endpoint.

.. toctree::
    :maxdepth: 1
    :titlesonly:
    :hidden:

    autoscaling
    gateways