MLflow Proto To GraphQL Autogeneration

dev/proto_to_graphql/README.md

What is this

The system in dev/proto_to_graphql parses proto rpc definitions and generates a GraphQL schema from them. The goal is to quickly generate base GraphQL schema and resolver code so that we can take advantage of GraphQL's data-joining capabilities.

The autogenerated schema and resolver are in the following file: mlflow/server/graphql/autogenerated_graphql_schema.py

The autogenerated schema and resolvers are referenced, and can be extended, in this file: mlflow/server/graphql/graphql_schema_extensions.py

You can run python ./dev/proto_to_graphql/code_generator.py or ./dev/generate-protos.sh to trigger the codegen process.

FAQs

How to onboard a new rpc to GraphQL

  • In your proto rpc definition, add option (graphql) = {}; and re-run ./dev/generate-protos.sh. The changes should appear in the generated schema.
  • In mlflow/server/handlers.py, identify the handler function for your rpc, for example _get_run. Make sure there is a corresponding get_run_impl function that takes a request_message and returns a response message of the generated service_pb proto type. If no such function exists, extract it from the handler.
  • Test manually against a localhost server, and add a unit test in tests/tracking/test_rest_tracking.py.

How to customize a generated query/mutation to join multiple rpc endpoints

The proto to GraphQL autogeneration supports only a one-to-one mapping from proto rpc to GraphQL operation. However, the power of GraphQL lies in joining multiple rpc endpoints into one query, so we often want to customize or extend the autogenerated operations to join these endpoints.

For example, we would like to query data about Experiment, ModelVersions and Run in one query by extending the MlflowRun object.

query testQuery {
    mlflowGetRun(input: {runId: "my-id"}) {
        run {
            experiment {
                name
            }
            modelVersions {
                name
            }
        }
    }
}
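
For manual testing, a query like the one above can be sent to a locally running server over HTTP. The sketch below only builds the request with the standard library. The /graphql path and the http://localhost:5000 base URL are assumptions about your local setup, and build_graphql_request is a hypothetical helper, not part of MLflow.

```python
import json
import urllib.request

QUERY = """
query testQuery {
    mlflowGetRun(input: {runId: "my-id"}) {
        run {
            experiment {
                name
            }
            modelVersions {
                name
            }
        }
    }
}
"""


def build_graphql_request(base_url: str, query: str) -> urllib.request.Request:
    """Build a POST request carrying the query as a JSON body."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/graphql",  # assumed endpoint path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_graphql_request("http://localhost:5000", QUERY)
# To actually send it (requires a running server):
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```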

To achieve joins, follow the steps below:

  • Make sure the rpcs you would like to join are already onboarded to GraphQL by following the "How to onboard a new rpc to GraphQL" section.
  • Identify the class you would like to extend in autogenerated_graphql_schema.py, then create a new class in graphql_schema_extensions.py that inherits from it. Add the new fields and their resolver functions to the new class.
  • Run python ./dev/proto_to_graphql/code_generator.py or ./dev/generate-protos.sh. The autogenerated schema should now reference the extension class you just created.
  • Add a test case in tests/tracking/test_rest_tracking.py.
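
The extension step can be pictured with the sketch below. It uses plain Python classes so that it runs on its own, whereas the real classes are the GraphQL types in autogenerated_graphql_schema.py. Everything here is a hypothetical stand-in: the fields of MlflowRun, the helper functions get_experiment_impl and search_model_versions_impl, and the in-memory data.

```python
from dataclasses import dataclass

# Hypothetical in-memory data standing in for the backing rpc endpoints.
_EXPERIMENTS = {"0": {"name": "Default"}}
_MODEL_VERSIONS = {"my-id": [{"name": "my-model"}]}


def get_experiment_impl(experiment_id: str) -> dict:
    """Stand-in for the impl function of an already onboarded experiment rpc."""
    return _EXPERIMENTS[experiment_id]


def search_model_versions_impl(run_id: str) -> list:
    """Stand-in for the impl function of an already onboarded model version rpc."""
    return _MODEL_VERSIONS.get(run_id, [])


@dataclass
class MlflowRun:
    """Stand-in for a class in autogenerated_graphql_schema.py."""

    run_id: str
    experiment_id: str


class MlflowRunExtension(MlflowRun):
    """Extension class: inherits the generated type and adds joined fields."""

    def resolve_experiment(self) -> dict:
        # Join: use a field of the run to call a second endpoint.
        return get_experiment_impl(self.experiment_id)

    def resolve_model_versions(self) -> list:
        # Join: look up the model versions associated with this run.
        return search_model_versions_impl(self.run_id)
```

The point of the pattern is that each added field has its own resolver, and each resolver reuses an impl function from an rpc that is already onboarded, so one query can pull data from several endpoints.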

How to generate TypeScript types for a GraphQL operation

To generate TypeScript types, first make sure the generated schema is up to date by running python ./dev/proto_to_graphql/code_generator.py

Then write your new query or mutation in the mlflow/server/js/src folder and run the following commands:

  • cd mlflow/server/js
  • yarn graphql-codegen

You should be able to see the generated types in mlflow/server/js/src/graphql/__generated__/