`x-pack/platform/plugins/shared/inference/scripts/evaluation/README.md`
This tool is intended for teams working on anything related to inference. It simplifies scripting and evaluating various scenarios with the Large Language Model (LLM) integration.
Run the tool using:

```
$ node x-pack/platform/plugins/shared/inference/scripts/evaluation/index.js
```
This will evaluate all existing scenarios and write the evaluation results to the terminal.
By default, the tool looks for a Kibana instance running locally at http://localhost:5601, the default address for running Kibana in development mode. It also attempts to read the Kibana config file for the Elasticsearch address and credentials. To override these settings, use `--kibana` and `--es`. Only basic authentication is supported, e.g. `--kibana http://username:password@localhost:5601`. To target a specific space, use `--spaceId`.
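For example, to point the tool at a specific Kibana and Elasticsearch instance in a specific space (the credentials, addresses, and space ID below are placeholders, not defaults):

```shell
node x-pack/platform/plugins/shared/inference/scripts/evaluation/index.js \
  --kibana http://elastic:changeme@localhost:5601 \
  --es http://elastic:changeme@localhost:9200 \
  --spaceId my-space
```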
Use `--connectorId` to specify the generative AI connector to use. If none is given, the tool prompts you to select one of the available connectors. If only a single supported connector is found, it is used without prompting.
Use `--evaluateWith` to specify the gen AI connector used to evaluate the output of the task. By default, the same connector is used for both running and evaluating.
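For example, to run the scenarios with one connector and score the output with another (the connector IDs below are hypothetical; use the IDs of connectors configured in your Kibana instance):

```shell
node x-pack/platform/plugins/shared/inference/scripts/evaluation/index.js \
  --connectorId my-gpt4-connector \
  --evaluateWith my-claude-connector
```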