docs/tutorials/llm-serve.md
:start-after: <!-- start llm-streaming-intro -->
:end-before: <!-- end llm-streaming-intro -->
:start-after: <!-- start llm-streaming-schemas -->
:end-before: <!-- end llm-streaming-schemas -->
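
The schema snippet above defines the request and response Documents for the streaming service. As a minimal sketch of what such DocArray schemas can look like (the names `PromptDocument` and `ModelOutputDocument` and their fields are illustrative, not necessarily those of the included code):

```python
from docarray import BaseDoc


class PromptDocument(BaseDoc):
    """Request schema: the prompt to complete and a generation budget."""

    prompt: str
    max_tokens: int


class ModelOutputDocument(BaseDoc):
    """Response schema: one generated token plus the text decoded so far."""

    token_id: int
    generated_text: str
```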
:class: note
Thanks to DocArray's flexible schema definitions, your service is not limited to plain text fields. For instance, you can use
Tensor types to efficiently stream token logits back to the client and implement complex token sampling strategies on
the client side.
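
As an illustration of that idea, the sketch below streams the raw logits of each decoding step in a Tensor field and lets the client apply its own temperature sampling; the schema name, field names, and sampling helper are assumptions, not part of the included example:

```python
import numpy as np
from docarray import BaseDoc
from docarray.typing import NdArray


class LogitsOutputDocument(BaseDoc):
    """Illustrative response schema that streams the raw logits of one step."""

    logits: NdArray  # shape: (vocab_size,)


def sample_token(doc: LogitsOutputDocument, temperature: float = 0.8) -> int:
    """Client-side sampling: turn streamed logits into a token id."""
    scaled = doc.logits / temperature
    # Numerically stable softmax over the vocabulary.
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return int(np.random.choice(len(probs), p=probs))
```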
:start-after: <!-- start llm-streaming-init -->
:end-before: <!-- end llm-streaming-init -->
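
The initialization snippet loads the model once at startup so every request can reuse it. A rough sketch, assuming a GPT-2 checkpoint from `transformers` stands in for the LLM and `TokenStreamingExecutor` as an illustrative class name:

```python
from jina import Executor
from transformers import GPT2LMHeadModel, GPT2Tokenizer


class TokenStreamingExecutor(Executor):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # Load the tokenizer and model once so every request reuses them.
        self.tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
        self.model = GPT2LMHeadModel.from_pretrained('gpt2')
```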
:start-after: <!-- start llm-streaming-endpoint -->
:end-before: <!-- end llm-streaming-endpoint -->
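
The endpoint snippet is where streaming happens: instead of returning a single response, the handler is an async generator that yields one response Document per generated token. The sketch below continues the Executor above and reuses the illustrative schemas; the `/stream` path and the one-token-at-a-time generation loop are assumptions:

```python
import torch
from jina import Executor, requests


class TokenStreamingExecutor(Executor):
    # __init__ as in the previous sketch: loads self.tokenizer and self.model.

    @requests(on='/stream')
    async def task(self, doc: PromptDocument, **kwargs) -> ModelOutputDocument:
        # Encode the prompt once, then generate one token per iteration
        # and yield it immediately so the client receives a stream.
        encoded = self.tokenizer(doc.prompt, return_tensors='pt')
        input_len = encoded['input_ids'].shape[1]
        for _ in range(doc.max_tokens):
            output = self.model.generate(**encoded, max_new_tokens=1)
            if output[0][-1] == self.tokenizer.eos_token_id:
                break
            yield ModelOutputDocument(
                token_id=int(output[0][-1]),
                generated_text=self.tokenizer.decode(
                    output[0][input_len:], skip_special_tokens=True
                ),
            )
            # Feed the extended sequence back in for the next step.
            encoded = {
                'input_ids': output,
                'attention_mask': torch.ones(1, len(output[0])),
            }
```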
:start-after: <!-- start llm-streaming-serve -->
:end-before: <!-- end llm-streaming-serve -->
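
Finally, the Executor is served and queried. A sketch assuming a recent Jina version that exposes `Deployment` and the streaming client method `Client.stream_doc`; the port and the gRPC protocol are arbitrary choices here:

```python
import asyncio

from jina import Client, Deployment


async def query() -> None:
    client = Client(port=12345, protocol='grpc', asyncio=True)
    # stream_doc yields one ModelOutputDocument per generated token.
    async for doc in client.stream_doc(
        on='/stream',
        inputs=PromptDocument(prompt='What is the capital of France?', max_tokens=10),
        return_type=ModelOutputDocument,
    ):
        print(doc.generated_text)


with Deployment(uses=TokenStreamingExecutor, port=12345, protocol='grpc') as dep:
    asyncio.run(query())
```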