GraphQL

If you want to try out this notebook with a live Python kernel, use mybinder:

vaex-graphql is a plugin package that exposes a DataFrame via a GraphQL interface. This allows easy sharing of data or aggregations/statistics or machine learning models to frontends or other programs with a standard query languages.

(Install with $ pip install vaex-graphql, no conda-forge support yet)

python

import vaex
df = vaex.datasets.titanic()
df

python

result = df.graphql.execute("""
    {
        df {
            min {
                age
                fare
            }
            mean {
                age
                fare
            }
            max {
                age
                fare
            }
            groupby {
                sex {
                   count
                   mean {
                       age
                   }
                }
            }
        }
    }
    """)
result.data

Pandas support

After importing vaex.graphql, vaex also installs a pandas accessor, so it is also accessible for Pandas DataFrames.

python

df_pandas = df.to_pandas_df()

python

df_pandas.graphql.execute("""
    {
        df(where: {age: {_gt: 20}}) {
            row(offset: 3, limit: 2) {
                name
                survived
            }
        }
    }
    """
).data

Server

The easiest way to learn to use the GraphQL language/vaex interface is to launch a server, and play with the GraphiQL graphical interface, its autocomplete, and the schema explorer.

We try to stay close to the Hasura API: https://docs.hasura.io/1.0/graphql/manual/api-reference/graphql-api/query.html

A server can be started from the command line:

$ python -m vaex.graphql myfile.hdf5

Or from within Python using df.graphql.serve

GraphiQL

See https://github.com/mariobuikhuizen/ipygraphql for a graphical widget, or a mybinder to try out a live example.