# MLflow Proto To GraphQL Autogeneration

## What is this

The system in `dev/proto_to_graphql` parses proto rpc definitions and generates a GraphQL schema from them. Its goal is to quickly generate the base GraphQL schema and resolver code so that we can easily take advantage of GraphQL's data-joining functionality.

The autogenerated schema and resolvers are in the following file: `mlflow/server/graphql/autogenerated_graphql_schema.py`

The autogenerated schema and resolvers are referenced, and can be extended, in this file: `mlflow/server/graphql/graphql_schema_extensions.py`

You can run `python ./dev/proto_to_graphql/code_generator.py` or `./dev/generate-protos.sh` to trigger the codegen process.

## FAQs

### How to onboard a new rpc to GraphQL

- In your proto rpc definition, add `option (graphql) = {};` and re-run `./dev/generate-protos.sh`. You should see the changes in the generated schema. [Example](https://github.com/mlflow/mlflow/pull/11215/files#diff-8ab2ad3109b67a713e147edf557d4da88853563398ce354cc895bb5930950dc5R175).
- In `mlflow/server/handlers.py`, identify the handler function for your rpc, for example `_get_run`, and make sure there is a corresponding `get_run_impl` function that takes a `request_message` and returns a response message of the generated service_pb proto type. If no such function exists, you can extract it from the handler as in this [example](https://github.com/mlflow/mlflow/pull/11215/files#diff-5c10a4e2ca47745f06fa9e7201087acfc102849756cb8d85e774a5ac468cb037R1779-R1795).
- Test manually with a localhost server, and add a unit test in `tests/tracking/test_rest_tracking.py`. [Example](https://github.com/mlflow/mlflow/pull/11215/files#diff-2ec8756f67a20ecbaeec2d2c5e7bf33310a50c015fc3aa487e27100fc4c2f9a7R1771-R1802).

### How to customize a generated query/mutation to join multiple rpc endpoints

The proto to GraphQL autogeneration only supports a one-to-one mapping from a proto rpc to a GraphQL operation. However, the power of GraphQL lies in joining multiple rpc endpoints into one query, so we often want to customize or extend the autogenerated operations to join these endpoints.

For example, we would like to query data about `Experiment`, `ModelVersions` and `Run` in one query by extending the `MlflowRun` object.

```
query testQuery {
  mlflowGetRun(input: {runId: "my-id"}) {
    run {
      experiment {
        name
      }
      modelVersions {
        name
      }
    }
  }
}
```

To achieve joins, follow the steps below:

- Make sure the rpcs you would like to join are already onboarded to GraphQL by following the `How to onboard a new rpc to GraphQL` section.
- Identify the class you would like to extend in `autogenerated_graphql_schema.py`, create a new class that inherits from it, and put the new class in `graphql_schema_extensions.py`. Add the new fields and their resolver functions there. [Example](https://github.com/mlflow/mlflow/pull/11173/files#diff-9e4f7bdf4d7f9d362338bed9ce6607a51b8f520ee605e2fd4c9bda5e43cb617cR21-R31)
- Run `python ./dev/proto_to_graphql/code_generator.py` or `./dev/generate-protos.sh`. You should see the autogenerated schema updated to reference the extension class you just created.
- Add a test case in `tests/tracking/test_rest_tracking.py`. [Example](https://github.com/mlflow/mlflow/pull/11173/files#diff-2ec8756f67a20ecbaeec2d2c5e7bf33310a50c015fc3aa487e27100fc4c2f9a7R1771-R1795)

### How to generate TypeScript types for a GraphQL operation

To generate TypeScript types, first make sure the generated schema is up to date by running `python ./dev/proto_to_graphql/code_generator.py`.

Then write your new query or mutation in the `mlflow/server/js/src` folder, and run the following commands:

- `cd mlflow/server/js`
- `yarn graphql-codegen`

You should be able to see the generated types in `mlflow/server/js/src/graphql/__generated__/`.
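
If you run these steps often, the order above (regenerate the schema, then generate the types) could be captured in a small helper script. The sketch below is hypothetical and not part of the repository: the commands and directories are the ones named in this README, while `STEPS` and `run_codegen` are illustrative names. It assumes it is run from the repository root and defaults to a dry run that only lists the commands.

```python
import subprocess

# (command, working directory) pairs, in the order described above.
# Paths are relative to the repository root (an assumption of this sketch).
STEPS = [
    (["python", "./dev/proto_to_graphql/code_generator.py"], "."),
    (["yarn", "graphql-codegen"], "mlflow/server/js"),
]


def run_codegen(dry_run=True):
    """List the steps in order and, unless dry_run is set, execute them.

    With dry_run=False, check=True makes a failing step raise
    CalledProcessError, so type generation never runs against a
    schema that failed to regenerate.
    """
    planned = []
    for command, cwd in STEPS:
        planned.append(f"(cd {cwd} && {' '.join(command)})")
        if not dry_run:
            subprocess.run(command, cwd=cwd, check=True)
    return planned


if __name__ == "__main__":
    # Dry run by default: print what would be executed.
    for line in run_codegen(dry_run=True):
        print(line)
```

Calling `run_codegen(dry_run=False)` would execute both commands for real; the dry-run default is only a precaution for this illustration.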