Examples

Welcome to our examples! If you want to get your hands dirty, check out the Getting Started Notebook.

🧑🏼‍🏫 Basic examples

The table below lists use cases that will help you get started and show what whylogs can do to make your data and ML pipelines more reliable and sustainable.

| Example | Description |
| --- | --- |
| Visualizing Profiles | Compare profiles to detect distribution shifts, visualize histograms and bar charts and explore your data. |
| Logging Data | See the different ways you can log your data with whylogs. |
| Inspecting Profiles | A deeper dive on the metrics generated by whylogs. |
| Schema Configuration for Tracking Metrics | Configure tracking metrics according to data type or column features. |
| Constraints Suite | A collection of simple out-of-the-box constraints for the most common use-cases. |
| Merging Profiles | Merge your profiles logged across different computing instances, time periods or data segments. |
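
To give a feel for why profiles logged in different places can be merged, here is a minimal sketch in plain Python. It does not use the whylogs API: `ToyColumnSummary` and `summarize` are hypothetical names invented for this illustration, and the summary tracks only a count, sum, minimum and maximum. The point is that such statistics combine without access to the raw data, so merging two summaries gives the same result as summarizing all the data at once.

```python
import math
from dataclasses import dataclass


@dataclass
class ToyColumnSummary:
    """Hypothetical stand-in for a per-column profile (not a whylogs class)."""

    count: int = 0
    total: float = 0.0
    minimum: float = math.inf
    maximum: float = -math.inf

    def track(self, value: float) -> None:
        # Update the running statistics with a single observed value.
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    def merge(self, other: "ToyColumnSummary") -> "ToyColumnSummary":
        # Every field combines without needing the raw values.
        return ToyColumnSummary(
            count=self.count + other.count,
            total=self.total + other.total,
            minimum=min(self.minimum, other.minimum),
            maximum=max(self.maximum, other.maximum),
        )

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else math.nan


def summarize(values) -> ToyColumnSummary:
    summary = ToyColumnSummary()
    for value in values:
        summary.track(value)
    return summary


# Two batches summarized separately, e.g. on different machines or days.
batch_a = summarize([1.0, 2.0, 3.0])
batch_b = summarize([10.0, 20.0])
merged = batch_a.merge(batch_b)

# Summarizing everything in one pass gives the same summary.
whole = summarize([1.0, 2.0, 3.0, 10.0, 20.0])
print(merged == whole)
```

For the real API, refer to the Merging Profiles example listed in the table.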

🌉 Whylogs Integrations

Welcome! In this section you will find examples of how to integrate whylogs with different tools and platforms.

Data Pipelines

| Integration | Description |
| --- | --- |
| Apache Spark | Profile data in an Apache Spark environment |
| BigQuery | Profile data queried from a Google BigQuery table |
| Dask | Profile data in parallel with Dask |
| Databricks | Learn how to configure and run whylogs on a Databricks cluster |
| Fugue | Use Fugue to unify parallel whylogs profiling tasks |
| Kafka | Learn how to consume and profile streaming data from an existing Kafka topic |
| Ray | Profile Big Data in parallel with the Ray integration |

Storage

| Integration | Description |
| --- | --- |
| S3 | See how to write your whylogs profiles to AWS S3 object storage |
| GCS | See how to write your whylogs profiles to Google Cloud Storage |

Model lifecycle and deployment

| Integration | Description |
| --- | --- |
| Apache Airflow | Use Airflow Operators to create drift reports and run constraint validations on your data |
| BentoML | Learn how to monitor ML models managed and served with BentoML |
| FastAPI | Learn how to monitor ML models served with FastAPI |
| Feast | Learn how to log features from your Feature Store with Feast and whylogs |
| Flask | See how you can create a Flask app with this whylogs + WhyLabs integration |
| Flyte | Learn how to use whylogs' DatasetProfileView type natively on your Flyte workflows |
| GitHub Actions | Monitor your ML datasets as part of your GitOps CI/CD pipeline |
| MLflow | Log your whylogs profiles to an MLflow experiment |
| ZenML | Combine different MLOps tools together with ZenML and whylogs! |

WhyLabs

You can monitor your profiles continuously with the WhyLabs Observability Platform, and have a single view of your different projects, data and ML models. To learn more about how you can combine whylogs with WhyLabs and send over different profiles, refer to the following integration examples:

| Integration | Description |
| --- | --- |
| Writing profiles | Send profiles to your WhyLabs Dashboard |
| Reference Profile | Send profiles as Reference (Static) Profiles to WhyLabs |
| Regression Metrics | Monitor Regression Model Performance Metrics with whylogs and WhyLabs |
| Classification Metrics | Monitor Classification Model Performance Metrics with whylogs and WhyLabs |
| Ranking Metrics | Monitor Ranking Model Performance Metrics with whylogs and WhyLabs (experimental) |
| Writing Feature Weights | Send Feature Weights / Feature Importance information to your WhyLabs Dashboard |

Others

| Integration | Description |
| --- | --- |
| whylogs Container | A low code solution to profile your data with a Docker container deployed to your environment |
| Java | Profile data with whylogs in Java |

🧑🏼‍🔬 Advanced examples

Here you will find more advanced use-cases for whylogs, and you will learn how to make the most out of your created profiles. Pick any example in the table below to get started.

| Example | Description |
| --- | --- |
| Streaming Data with Log Rotation | Generate profiles automatically at fixed intervals with rolling loggers |
| Condition Count Metrics | Create simple counter metrics with user-defined conditions |
| Condition Validators | Real-time Data Validation with Condition Validators. |
| Data Constraints | Set constraints to your data to ensure its quality. |
| Custom Metrics | Create your own metrics and metric components |
| String Tracking | Track unicode ranges and character length distribution metrics for your textual features. |
| Image Logging | Log image properties and EXIF tags into profiles and send them to WhyLabs |
| Segments | Segment your data to improve visibility to the sub-group level |
| Metric Constraints with Condition Count Metrics | Build Metric Constraints on top of Condition Count Metrics |
| Drift Algorithm Configuration | Choose different drift algorithms and internal parameters for drift detection |
| Converting profiles from v0 to v1 | Convert whylogs v0 profiles to v1 profiles |
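
As a rough illustration of the idea behind counter metrics with user-defined conditions, and of building a constraint on top of them, the following sketch uses only the Python standard library. It is not the whylogs API: `count_conditions`, the condition names and the sample values are all hypothetical and exist only for this example.

```python
import re
from collections import Counter
from typing import Callable, Dict, Iterable


def count_conditions(
    values: Iterable[str],
    conditions: Dict[str, Callable[[str], bool]],
) -> Counter:
    """Count how many values satisfy each named, user-defined condition."""
    counts = Counter({"total": 0, **{name: 0 for name in conditions}})
    for value in values:
        counts["total"] += 1
        for name, predicate in conditions.items():
            if predicate(value):
                counts[name] += 1
    return counts


# Hypothetical conditions for a column of email-like strings.
conditions = {
    "looks_like_email": lambda s: re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", s) is not None,
    "is_empty": lambda s: s == "",
}

emails = ["a@example.com", "not-an-email", "", "b@example.org"]
counts = count_conditions(emails, conditions)

# A simple constraint on top of the counters: no empty values allowed.
no_empty_values = counts["is_empty"] == 0

print(dict(counts))
print(no_empty_values)
```

Only the counters are kept, not the values themselves, which is what makes this kind of metric cheap to store alongside a profile. The Condition Count Metrics and Metric Constraints examples in the table show how whylogs itself handles this.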

🧪 Experimental

Here you will find examples of features that are still at an experimental stage. Expect changes to the API and the functionality of these features.

| Example | Description |
| --- | --- |
| Performance Estimation - Estimating Accuracy for Binary Classification Problems | Estimate accuracy for unlabeled target datasets for binary classification problems |
| Extracting and Monitoring Audio Samples | Extract features from audio samples for the purpose of monitoring for drift/quality |
| NLP Summarization | Monitor a document summarization task with whylogs |
| Embeddings Distance Logging | Profile embedding values by comparing them to reference data points |
| Condition Validator UDFs | Easily create condition validators based on user-defined functions |

📓 Benchmarks

Here you will find experiments to benchmark different aspects of the whylogs package, such as computational performance and different statistical algorithms.

| Example | Description |
| --- | --- |
| Understanding Kolmogorov-Smirnov (KS) Tests for Data Drift on Profiled Data | Experiments comparing whylogs' Kolmogorov-Smirnov implementation on profiled data with the traditional implementation on complete data |
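
For readers unfamiliar with the test, the sketch below computes the two-sample KS statistic on complete data, i.e. the largest gap between the two empirical cumulative distribution functions. It is a plain-Python illustration of the traditional approach only; the function name `ks_statistic` is an assumption made for this example, and it does not reproduce how whylogs computes the statistic from profiles.

```python
from bisect import bisect_right
from typing import Sequence


def ks_statistic(sample_a: Sequence[float], sample_b: Sequence[float]) -> float:
    """Two-sample Kolmogorov-Smirnov statistic computed on the full samples."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a = sorted(sample_a)
    b = sorted(sample_b)
    largest_gap = 0.0
    # The empirical CDFs only change at observed values, so checking
    # every observed value is enough to find the maximum difference.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


reference = [1.0, 2.0, 3.0, 4.0]
shifted = [3.0, 4.0, 5.0, 6.0]

print(ks_statistic(reference, reference))
print(ks_statistic(reference, shifted))
```

A statistic of 0 means the two empirical distributions coincide, while a value close to 1 means they barely overlap.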

🏫 Tutorials

Here you will find tutorials that can span two or more concepts discussed in the previous sections. These tutorials are meant to be more in-depth, and possibly domain-specific, explanations of those concepts.

| Example | Description |
| --- | --- |
| Data Validation for Spark Dataframes with whylogs | Profile a Spark Dataframe and Perform Data Validation with Condition Count Metrics and Metric Constraints |
| Monitoring Embeddings for Text Data | Monitor Embeddings, Tokens and Performance of your text classifier application |
| Data Validation at Scale - Detecting and Responding to Data Misbehavior | Log, validate, and debug failed conditions with Metric Constraints, Condition Count Metrics and Condition Validators |

Get in touch

If you want to get more involved with whylogs and interact with other practitioners, make sure to join our community Slack.