docs/docs/self-hosting/deploy-to-cloud/aws.mdx
MLflow core components include:
This guide walks you through deploying the MLflow server to Amazon ECS (Elastic Container Service), the backend store to Amazon RDS (PostgreSQL), and the artifact store to Amazon S3. The guide also covers IAM roles, VPC subnets, security groups, and ALB (Application Load Balancer) settings. Once deployment is complete, you can access the MLflow web UI through an AWS application URL like https://ml-<unique-service-id>.ecs.<aws-region>.on.aws, and your MLflow client code can connect to the MLflow server by setting the tracking URI to this URL.
The overall deployment architecture is as follows:
The deployment architecture has several advantages:
High Availability
Security by design
Operational Simplicity
Note that you can also use managed MLflow on Amazon SageMaker instead of the self-hosting approach in this guide. Managed MLflow on Amazon SageMaker provides a fully managed, integrated service for machine learning tracking and experiment management.
Create an S3 bucket with a name like "mlflow-artifacts-123", and block all public access as follows:
MLflow will store large artifacts such as model weights, figures, and videos in the specified S3 bucket.
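If you prefer to script this step instead of using the console, the following is a minimal sketch using boto3. The helper name `create_private_bucket` and the region in the usage comment are illustrative assumptions, not part of this guide; the S3 client is passed in so the function can be tried against a stub before it touches a real account.

```python
def create_private_bucket(s3_client, bucket_name, region):
    """Create an S3 bucket and block all public access to it.

    `s3_client` is expected to behave like boto3.client("s3").
    """
    params = {"Bucket": bucket_name}
    # us-east-1 is the default location and must not be sent as a constraint.
    if region != "us-east-1":
        params["CreateBucketConfiguration"] = {"LocationConstraint": region}
    s3_client.create_bucket(**params)

    # Turn on all four public access block settings for the bucket.
    s3_client.put_public_access_block(
        Bucket=bucket_name,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    )


# Example usage (requires boto3 and AWS credentials; region is an assumption):
#   import boto3
#   client = boto3.client("s3", region_name="us-west-2")
#   create_private_bucket(client, "mlflow-artifacts-123", "us-west-2")
```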
Create an RDS instance with the PostgreSQL engine, set credentials management to "Self managed", and set the master username and password. Once creation completes, you can view the RDS instance endpoint information as follows:
If you need to set up a standby database instance for high availability, go to the "Availability & durability -> Multi-AZ deployment" panel and select "Create a standby instance".
You can then construct the database URI in the form postgresql://<username>:<password>@<endpoint>:5432/<database-name>. The MLflow server uses this URI to connect to the backend store.
MLflow stores metadata for experiments, runs, logged models, registered models, traces, jobs, and gateway endpoints in the database.
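Passwords that contain characters such as "@", ":" or "/" can break the database URI format described above unless they are percent-encoded. The sketch below shows one way to assemble the URI safely; the helper name `build_database_uri`, the credentials, and the endpoint are placeholders chosen for illustration.

```python
from urllib.parse import quote


def build_database_uri(username, password, endpoint, database, port=5432):
    """Return a PostgreSQL URI with percent-encoded credentials."""
    # safe="" forces encoding of every reserved character, including "/".
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"postgresql://{user}:{pwd}@{endpoint}:{port}/{database}"


# Placeholder values; substitute your own RDS endpoint and credentials.
uri = build_database_uri(
    "mlflow",
    "p@ss:word/1",
    "mydb.example.us-east-1.rds.amazonaws.com",
    "mlflow",
)
print(uri)
```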
The S3 bucket created in the first step blocks all public access, so the MLflow server needs explicit permission to read and write artifacts. Create an IAM role "mlflow-task-role" with an inline policy as follows:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "MLflowArtifacts",
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:DeleteObject",
        "s3:ListBucket",
        "s3:AbortMultipartUpload"
      ],
      "Resource": [
        "arn:aws:s3:::<the name of s3 bucket for mlflow artifacts store>",
        "arn:aws:s3:::<the name of s3 bucket for mlflow artifacts store>/*"
      ]
    }
  ]
}
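To avoid hand-editing the bucket name in two places, the policy document above can be generated from a single value. This is a small sketch using only the standard library; `build_artifact_policy` is a hypothetical helper name, and the bucket name is the example used earlier in this guide.

```python
import json


def build_artifact_policy(bucket_name):
    """Return the inline policy document for the MLflow task role."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "MLflowArtifacts",
                "Effect": "Allow",
                "Action": [
                    "s3:PutObject",
                    "s3:GetObject",
                    "s3:DeleteObject",
                    "s3:ListBucket",
                    "s3:AbortMultipartUpload",
                ],
                # The bucket ARN covers listing; the "/*" ARN covers objects.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
            }
        ],
    }


policy = build_artifact_policy("mlflow-artifacts-123")
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the inline policy editor for the role.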
Create an ECS (Elastic Container Service) service to host the MLflow web server and the UI. Choose "Express mode" to configure most settings, including the VPC and ALB, automatically. You still need to fill in the following settings correctly:
ghcr.io/mlflow/mlflow:v3.10.0-full. You can find available MLflow images on this page.
bash,-c,mlflow server --backend-store-uri {database-URL} --artifacts-destination s3://{s3-bucket-name}/ --host 0.0.0.0 --port 5000 --disable-security-middleware. Note that the security middleware is disabled because the MLflow service is protected behind the Amazon ALB security group and VPC subnet, and ECS Express mode generates the default ALB and VPC settings automatically.
After creation, you can view the ECS express service overview page as follows:
The "Application URL" on the overview page is the MLflow service URL, which you can access from the public network.
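Because the server command from the ECS settings above is passed through bash -c, a database URI containing shell-special characters (for example "$" or "&") needs quoting. The following sketch assembles the three command parts with `shlex.join`; the helper name `build_ecs_command` and the sample values are assumptions for illustration, and the flags are the ones listed in this guide.

```python
import shlex


def build_ecs_command(database_uri, bucket_name, port=5000):
    """Return the container command as [shell, flag, command line]."""
    server_args = [
        "mlflow", "server",
        "--backend-store-uri", database_uri,
        "--artifacts-destination", f"s3://{bucket_name}/",
        "--host", "0.0.0.0",
        "--port", str(port),
        "--disable-security-middleware",
    ]
    # shlex.join quotes each argument so bash sees it as a single word.
    return ["bash", "-c", shlex.join(server_args)]


command = build_ecs_command(
    "postgresql://mlflow:s3cr$t@db.example.com:5432/mlflow",  # placeholder URI
    "mlflow-artifacts-123",
)
# The console field takes the parts separated by commas.
print(",".join(command))
```

Note that a URI containing a comma would still be ambiguous in a comma-separated field, so percent-encoding such characters in the password is the safer choice.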
MLflow supports basic authentication and authentication with the OIDC plugin. Both kinds of authentication settings require:
pip install command into the "Command" setting.
Use the MLflow demo CLI to validate the deployment. Run the command from your own laptop as follows:
mlflow demo --tracking-uri <ECS-application-URL>
Then open the application URL in your browser, view the experiment named "MLflow Demo", and explore GenAI features such as traces, evaluation runs, and prompt management.
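For a quick scripted check before opening the browser, the sketch below probes the server over HTTP using only the standard library. It assumes the MLflow server answers on a `/health` path with status 200, which you should confirm for your MLflow version; the function names and the sample URL are illustrative.

```python
import urllib.request


def health_url(application_url):
    """Build the health-check URL (the /health path is an assumption)."""
    return application_url.rstrip("/") + "/health"


def is_server_healthy(application_url, timeout=10.0):
    """Return True if the server responds with HTTP 200, else False."""
    try:
        with urllib.request.urlopen(health_url(application_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection failures, timeouts, and HTTP error responses.
        return False


# Placeholder application URL; replace with the one from the ECS overview page.
print(health_url("https://ml-example.ecs.us-east-1.on.aws/"))
```

Calling `is_server_healthy(<ECS-application-URL>)` from your laptop returns False if the load balancer or security group blocks the request, which helps separate networking problems from MLflow problems.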