Deploying MWAA Using AWS CDK
Learn how to use a Python AWS CDK application to configure and deploy your Apache Airflow environments using MWAA in a repeatable and consistent way.
Introduction
In this quick how-to guide, I will show you how you can use a Python AWS CDK application to automate the deployment and configuration of your Apache Airflow environments using Managed Workflows for Apache Airflow (MWAA) on AWS.
What you will need:
- an AWS account with the right level of privileges
- a development environment with the AWS CDK configured and running (at the time of writing, you should be using AWS CDK v2)
- access to an AWS region where Managed Workflows for Apache Airflow is supported
- all code used in this how-to guide is provided in this GitHub repository
Some things to watch out for:
- If you are deploying this into an account that already has VPCs, the deployment may fail if it would exceed the VPC quota for your AWS account (by default this is 5 per region, but it is a soft limit and you can request an increase).
- Make sure that the Amazon S3 bucket you define for your MWAA environment does not already exist before running the CDK app.
Getting Started
Make sure we are running the correct version of the AWS CDKv2 tool (at least v2.2) and then check out the git repo.
cdk --version
> 2.28.1 (build d035432)
git clone https://github.com/094459/blogpost-cdk-mwaa.git
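If you want to script that version check (for example in a CI job), a minimal sketch might look like the following. It assumes the output of cdk --version has the shape shown above; the helper names parse_cdk_version and meets_minimum are made up for this illustration.

```python
import re


def parse_cdk_version(output: str) -> tuple:
    """Extract (major, minor, patch) from text such as '2.28.1 (build d035432)'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"no version number found in {output!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(output: str, minimum=(2, 2, 0)) -> bool:
    """Return True if the reported version is at least `minimum`."""
    return parse_cdk_version(output) >= minimum


if __name__ == "__main__":
    # In practice the string would come from running `cdk --version`
    print(meets_minimum("2.28.1 (build d035432)"))
```

Comparing tuples of integers avoids the trap of comparing version strings, where "2.10" would sort before "2.2".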
After checking out the repository you will have the following files on your local developer environment.
├── app.py
├── cdk.json
├── dags
│ ├── sample-cdk-dag-od.py
│ └── sample-cdk-dag.py
├── mwaa_cdk
│ ├── mwaa_cdk_backend.py
│ └── mwaa_cdk_env.py
└── requirements.txt
The first thing we need to do is update our Python dependencies which are documented in the requirements.txt file.
Note! If you are currently in the process of moving between AWS CDK v1 and v2, then you should check out this blog post to help you prepare, as the steps that follow may otherwise fail.
pip install -r requirements.txt
Exploring the CDK Stack
Our AWS CDK application consists of a number of files. The entry point to our application is the app.py file, where we define the structure and resources we are going to build. We then have two CDK stacks that deploy and configure AWS resources. Finally, we have resources that we deploy to our target Apache Airflow environment.
If we take a look at the app.py file, we can explore our CDK application in more detail. We are creating two stacks, one called mwaa_cdk_backend and the other called mwaa_cdk_env.
The mwaa_cdk_backend stack will be used to set up the VPC network that the MWAA environment is going to use. The mwaa_cdk_env stack is the one that will configure your MWAA environment.
In order to do both though, first, we set up some configuration parameters so that we can maximise the re-use of this CDK application.
import aws_cdk as cdk

from mwaa_cdk.mwaa_cdk_backend import MwaaCdkStackBackend
from mwaa_cdk.mwaa_cdk_env import MwaaCdkStackEnv

# Target AWS account and region for both stacks
env_EU = cdk.Environment(region="{your-aws-region}", account="{your-aws-ac}")
# Settings shared by both stacks: the DAGs bucket name and the MWAA environment name
mwaa_props = {'dagss3location': '{your-unique-s3-bucket}', 'mwaa_env': '{name-of-your-mwaa-env}'}

app = cdk.App()

# Networking stack (VPC and subnets)
mwaa_hybrid_backend = MwaaCdkStackBackend(
    scope=app,
    id="mwaa-hybrid-backend",
    env=env_EU,
    mwaa_props=mwaa_props
)

# MWAA environment stack, which re-uses the VPC created above
mwaa_hybrid_env = MwaaCdkStackEnv(
    scope=app,
    id="mwaa-hybrid-environment",
    vpc=mwaa_hybrid_backend.vpc,
    env=env_EU,
    mwaa_props=mwaa_props
)

app.synth()
We define configuration parameters in the env_EU and mwaa_props lines. This will allow you to re-use this stack to create multiple different environments. You can also add or change the variables in mwaa_props if you want to make other configuration options changeable via a configuration property (for example, logging verbosity or perhaps the version of Apache Airflow).
After changing the values in the app.py file and saving, we are ready to deploy.
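As a rough illustration of extending mwaa_props, the sketch below merges optional overrides into a set of defaults and sanity-checks the bucket name before any stack is created. The extra keys (airflow_version, log_level) and the helper name build_mwaa_props are assumptions for this example and are not part of the repository. The bucket check covers only the basic S3 naming rules (3 to 63 characters; lowercase letters, digits, dots and hyphens; starting and ending with a letter or digit).

```python
import re

# Hypothetical defaults; only 'dagss3location' and 'mwaa_env' exist in the original app.py
DEFAULTS = {"airflow_version": "2.0.2", "log_level": "INFO"}
REQUIRED = ("dagss3location", "mwaa_env")
BUCKET_NAME = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


def build_mwaa_props(**overrides) -> dict:
    """Merge overrides into the defaults and validate the required settings."""
    props = {**DEFAULTS, **overrides}
    missing = [key for key in REQUIRED if not props.get(key)]
    if missing:
        raise ValueError(f"missing required settings: {', '.join(missing)}")
    if not BUCKET_NAME.match(props["dagss3location"]):
        raise ValueError(f"invalid S3 bucket name: {props['dagss3location']!r}")
    return props


if __name__ == "__main__":
    # Placeholder values; substitute your own bucket and environment names
    print(build_mwaa_props(dagss3location="my-mwaa-dags-demo", mwaa_env="mwaa-hybrid-demo"))
```

Failing early on a bad bucket name is cheaper than finding out part-way through a CloudFormation deployment.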
mwaa_cdk_backend
There is nothing particularly interesting about this stack other than that it creates the underlying network infrastructure that MWAA needs. There is nothing you need to change, but if you do want to experiment, then I would suggest two things: a) read and follow the networking guidance on the MWAA documentation site, which details what needs to be set up, and b) if you are trying to lock down the networking, try deploying just the backend stack and then manually creating an MWAA environment to see whether it works or fails.
from aws_cdk import (
    aws_iam as iam,
    aws_ec2 as ec2,
    Stack,
    CfnOutput
)
from constructs import Construct

class MwaaCdkStackBackend(Stack):

    def __init__(self, scope: Construct, id: str, mwaa_props, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)

        # Create VPC network
        self.vpc = ec2.Vpc(
            self,
            id="MWAA-Hybrid-ApacheAirflow-VPC",
            cidr="10.192.0.0/16",
            max_azs=2,
            nat_gateways=1,
            subnet_configuration=[
                ec2.SubnetConfiguration(
                    name="public", cidr_mask=24,
                    reserved=False, subnet_type=ec2.SubnetType.PUBLIC),
                ec2.SubnetConfiguration(
                    name="private", cidr_mask=24,
                    reserved=False, subnet_type=ec2.SubnetType.PRIVATE_WITH_NAT)
            ],
            enable_dns_hostnames=True,
            enable_dns_support=True
        )

        CfnOutput(
            self,
            id="VPCId",
            value=self.vpc.vpc_id,
            description="VPC ID",
            export_name=f"{self.region}:{self.account}:{self.stack_name}:vpc-id"
        )
We can see that once this stack has deployed, it will output the VPC details via the console as well as via the AWS CloudFormation Output tab.
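To get a feel for the address space this VPC definition works with, the standard-library sketch below splits the 10.192.0.0/16 range into /24 blocks. With two availability zones and two subnet groups, the stack needs four of them. Which block CDK actually assigns to which subnet is decided by CDK, so treat the blocks printed here as illustrative only.

```python
import ipaddress

vpc_cidr = ipaddress.ip_network("10.192.0.0/16")

# Every /24 block that fits inside the VPC range
all_blocks = list(vpc_cidr.subnets(new_prefix=24))

# 2 availability zones x 2 subnet groups (public, private) = 4 subnets
needed = 2 * 2
candidate_blocks = all_blocks[:needed]

print(f"/24 blocks available: {len(all_blocks)}")
for block in candidate_blocks:
    # AWS reserves five addresses in every subnet
    print(block, "addresses left after AWS reservations:", block.num_addresses - 5)
```

The stack uses only a small fraction of the /16, which leaves plenty of room if you later add more subnet groups.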
mwaa_cdk_env
The MWAA environment stack is a little more interesting and I will break it down. The first part of the stack configures the Amazon S3 buckets that MWAA will use.
from aws_cdk import (
    aws_iam as iam,
    aws_ec2 as ec2,
    aws_s3 as s3,
    aws_s3_deployment as s3deploy,
    aws_mwaa as mwaa,
    aws_kms as kms,
    Stack,
    CfnOutput,
    Tags
)
from constructs import Construct

class MwaaCdkStackEnv(Stack):

    def __init__(self, scope: Construct, id: str, vpc, mwaa_props, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)

        key_suffix = 'Key'

        # Create MWAA S3 Bucket and upload local dags
        s3_tags = {
            'env': f"{mwaa_props['mwaa_env']}",
            'service': 'MWAA Apache AirFlow'
        }

        dags_bucket = s3.Bucket(
            self,
            "mwaa-dags",
            bucket_name=f"{mwaa_props['dagss3location'].lower()}",
            versioned=True,
            block_public_access=s3.BlockPublicAccess.BLOCK_ALL
        )

        for tag in s3_tags:
            Tags.of(dags_bucket).add(tag, s3_tags[tag])

        s3deploy.BucketDeployment(self, "DeployDAG",
            sources=[s3deploy.Source.asset("./dags")],
            destination_bucket=dags_bucket,
            destination_key_prefix="dags",
            prune=False,
            retain_on_delete=False
        )

        dags_bucket_arn = dags_bucket.bucket_arn
As well as creating the bucket, this code takes all the files it finds in the local dags folder (in this example, and in the GitHub repo, that is two DAGs, sample-cdk-dag-od.py and sample-cdk-dag.py) and uploads them as part of the deployment process. You can tweak this to your own requirements, or comment it out or remove it if you do not need it.
Next up we have the code that creates the MWAA execution policy and the associated role that will be used by the MWAA worker nodes. This is taken from the MWAA documentation, but you can adjust it as needed for your own environment. You might need to do this if you are integrating with other AWS services: the role has been set up with no access to other services by default, so any permissions your DAGs need will have to be added.
        mwaa_policy_document = iam.PolicyDocument(
            statements=[
                iam.PolicyStatement(
                    actions=["airflow:PublishMetrics"],
                    effect=iam.Effect.ALLOW,
                    resources=[f"arn:aws:airflow:{self.region}:{self.account}:environment/{mwaa_props['mwaa_env']}"],
                ),
                iam.PolicyStatement(
                    actions=[
                        "s3:ListAllMyBuckets"
                    ],
                    effect=iam.Effect.DENY,
                    resources=[
                        f"{dags_bucket_arn}/*",
                        f"{dags_bucket_arn}"
                    ],
                ),
                iam.PolicyStatement(
                    actions=[
                        "s3:*"
                    ],
                    effect=iam.Effect.ALLOW,
                    resources=[
                        f"{dags_bucket_arn}/*",
                        f"{dags_bucket_arn}"
                    ],
                ),
                iam.PolicyStatement(
                    actions=[
                        "logs:CreateLogStream",
                        "logs:CreateLogGroup",
                        "logs:PutLogEvents",
                        "logs:GetLogEvents",
                        "logs:GetLogRecord",
                        "logs:GetLogGroupFields",
                        "logs:GetQueryResults",
                        "logs:DescribeLogGroups"
                    ],
                    effect=iam.Effect.ALLOW,
                    resources=[f"arn:aws:logs:{self.region}:{self.account}:log-group:airflow-{mwaa_props['mwaa_env']}-*"],
                ),
                iam.PolicyStatement(
                    actions=[
                        "logs:DescribeLogGroups"
                    ],
                    effect=iam.Effect.ALLOW,
                    resources=["*"],
                ),
                iam.PolicyStatement(
                    actions=[
                        "sqs:ChangeMessageVisibility",
                        "sqs:DeleteMessage",
                        "sqs:GetQueueAttributes",
                        "sqs:GetQueueUrl",
                        "sqs:ReceiveMessage",
                        "sqs:SendMessage"
                    ],
                    effect=iam.Effect.ALLOW,
                    resources=[f"arn:aws:sqs:{self.region}:*:airflow-celery-*"],
                ),
                iam.PolicyStatement(
                    actions=[
                        "ecs:RunTask",
                        "ecs:DescribeTasks",
                        "ecs:RegisterTaskDefinition",
                        "ecs:DescribeTaskDefinition",
                        "ecs:ListTasks"
                    ],
                    effect=iam.Effect.ALLOW,
                    resources=[
                        "*"
                    ],
                ),
                iam.PolicyStatement(
                    actions=[
                        "iam:PassRole"
                    ],
                    effect=iam.Effect.ALLOW,
                    resources=["*"],
                    conditions={"StringLike": {"iam:PassedToService": "ecs-tasks.amazonaws.com"}},
                ),
                iam.PolicyStatement(
                    actions=[
                        "kms:Decrypt",
                        "kms:DescribeKey",
                        "kms:GenerateDataKey*",
                        "kms:Encrypt",
                        "kms:PutKeyPolicy"
                    ],
                    effect=iam.Effect.ALLOW,
                    resources=["*"],
                    conditions={
                        "StringEquals": {
                            "kms:ViaService": [
                                f"sqs.{self.region}.amazonaws.com",
                                f"s3.{self.region}.amazonaws.com",
                            ]
                        }
                    },
                ),
            ]
        )

        mwaa_service_role = iam.Role(
            self,
            "mwaa-service-role",
            assumed_by=iam.CompositePrincipal(
                iam.ServicePrincipal("airflow.amazonaws.com"),
                iam.ServicePrincipal("airflow-env.amazonaws.com"),
                iam.ServicePrincipal("ecs-tasks.amazonaws.com"),
            ),
            inline_policies={"CDKmwaaPolicyDocument": mwaa_policy_document},
            path="/service-role/"
        )
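Several of the statements in this policy use "*" as the resource. If you extend the policy for other AWS services, it can help to keep an eye on how many such statements you end up with. The sketch below works on a policy expressed as a plain JSON-style dictionary (the shape IAM itself uses), not on the CDK objects. The function name, the sample policy and the placeholder account number 111122223333 are made up for this illustration.

```python
def find_wildcard_statements(policy: dict) -> list:
    """Return the Allow statements whose Resource is (or includes) "*"."""
    flagged = []
    for statement in policy.get("Statement", []):
        resources = statement.get("Resource", [])
        if isinstance(resources, str):
            resources = [resources]
        if statement.get("Effect") == "Allow" and "*" in resources:
            flagged.append(statement)
    return flagged


# Sample policy mirroring two of the statements used in the stack
sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["logs:DescribeLogGroups"],
            "Resource": "*",
        },
        {
            "Effect": "Allow",
            "Action": ["airflow:PublishMetrics"],
            "Resource": "arn:aws:airflow:eu-central-1:111122223333:environment/mwaa-hybrid-demo",
        },
    ],
}

for statement in find_wildcard_statements(sample_policy):
    print("wildcard resource:", statement["Action"])
```

A wildcard is not automatically a problem (some actions only support "*"), so this is a prompt for review rather than a pass/fail check.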
The next part configures the security group and subnets needed by MWAA.
        security_group = ec2.SecurityGroup(
            self,
            id="mwaa-sg",
            vpc=vpc,
            security_group_name="mwaa-sg"
        )

        security_group_id = security_group.security_group_id

        security_group.connections.allow_internally(ec2.Port.all_traffic(), "MWAA")

        subnets = [subnet.subnet_id for subnet in vpc.private_subnets]
        network_configuration = mwaa.CfnEnvironment.NetworkConfigurationProperty(
            security_group_ids=[security_group_id],
            subnet_ids=subnets,
        )
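MWAA expects two private subnets in different availability zones, which is what max_azs=2 in the backend stack gives us. If you change the VPC layout, a small guard along the following lines can catch a mismatch before CloudFormation does. This is a sketch only: check_mwaa_subnets is a made-up helper, and it works on plain (subnet id, availability zone) pairs rather than on CDK subnet objects.

```python
def check_mwaa_subnets(subnets: list) -> None:
    """Raise ValueError unless there are exactly two subnets in two different AZs.

    Each entry is a (subnet_id, availability_zone) pair.
    """
    if len(subnets) != 2:
        raise ValueError(f"MWAA expects 2 private subnets, got {len(subnets)}")
    zones = {zone for _, zone in subnets}
    if len(zones) != 2:
        raise ValueError("the two subnets must be in different availability zones")


if __name__ == "__main__":
    # Illustrative values; the availability zone names are assumptions
    check_mwaa_subnets([
        ("subnet-01e48db64381efc7f", "eu-central-1a"),
        ("subnet-0321530b8154f9bd2", "eu-central-1b"),
    ])
    print("subnet layout looks fine")
```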
The final part is the most interesting from the MWAA perspective, which is setting up and then configuring the environment. I have commented some of the environment settings out, so feel free to adjust for your own needs.
The first thing we do is create a configuration for the MWAA logging. In this particular configuration, I have enabled everything with INFO level logging so feel free to enable/disable or change the logging level as you need.
        logging_configuration = mwaa.CfnEnvironment.LoggingConfigurationProperty(
            dag_processing_logs=mwaa.CfnEnvironment.ModuleLoggingConfigurationProperty(
                enabled=True,
                log_level="INFO"
            ),
            task_logs=mwaa.CfnEnvironment.ModuleLoggingConfigurationProperty(
                enabled=True,
                log_level="INFO"
            ),
            worker_logs=mwaa.CfnEnvironment.ModuleLoggingConfigurationProperty(
                enabled=True,
                log_level="INFO"
            ),
            scheduler_logs=mwaa.CfnEnvironment.ModuleLoggingConfigurationProperty(
                enabled=True,
                log_level="INFO"
            ),
            webserver_logs=mwaa.CfnEnvironment.ModuleLoggingConfigurationProperty(
                enabled=True,
                log_level="INFO"
            )
        )
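The five logging blocks are identical apart from their names, so if you parameterise the log level you may prefer to generate them. The sketch below builds the equivalent structure as a plain dictionary, using the CloudFormation property names for the MWAA environment's logging configuration. It is an illustration rather than code from the repository, and build_logging_config is a made-up helper name.

```python
LOG_MODULES = ("DagProcessingLogs", "SchedulerLogs", "TaskLogs", "WebserverLogs", "WorkerLogs")
LOG_LEVELS = ("CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG")


def build_logging_config(default_level: str = "INFO", overrides: dict = None) -> dict:
    """Return a LoggingConfiguration-shaped dict; a level of None disables a module."""
    levels = {module: default_level for module in LOG_MODULES}
    levels.update(overrides or {})
    config = {}
    for module, level in levels.items():
        if module not in LOG_MODULES:
            raise ValueError(f"unknown log module: {module}")
        if level is None:
            config[module] = {"Enabled": False}
        elif level in LOG_LEVELS:
            config[module] = {"Enabled": True, "LogLevel": level}
        else:
            raise ValueError(f"unknown log level: {level}")
    return config


if __name__ == "__main__":
    # INFO everywhere, more detail for workers, web server logging switched off
    print(build_logging_config("INFO", {"WorkerLogs": "DEBUG", "WebserverLogs": None}))
```

Keeping the levels in data rather than in five near-identical blocks makes it easier to drive them from mwaa_props.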
Next up we define some MWAA Apache Airflow configuration parameters. If you use custom properties, then this is where you will add them. Also, if you want to use tags for your MWAA environment, you can adjust them here.
        options = {
            'core.load_default_connections': False,
            'core.load_examples': False,
            'webserver.dag_default_view': 'tree',
            'webserver.dag_orientation': 'TB'
        }

        tags = {
            'env': f"{mwaa_props['mwaa_env']}",
            'service': 'MWAA Apache AirFlow'
        }
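The keys in options follow a section.key pattern that mirrors the sections of airflow.cfg. Apache Airflow itself also accepts the same settings as environment variables named AIRFLOW__{SECTION}__{KEY}, which can be handy if you want to reproduce a setting in a local Airflow installation. The helper below is purely illustrative (option_to_env_var is a made-up name) and is not something the CDK stack needs.

```python
def option_to_env_var(option: str) -> str:
    """Translate 'section.key' into Airflow's AIRFLOW__SECTION__KEY variable name."""
    section, _, key = option.partition(".")
    if not section or not key:
        raise ValueError(f"expected 'section.key', got {option!r}")
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


if __name__ == "__main__":
    for name in ("core.load_examples", "webserver.dag_orientation"):
        print(name, "->", option_to_env_var(name))
```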
Next, we need to create some additional IAM policies and permissions as well as an AWS KMS encryption key to keep everything encrypted. This part is optional if you decide to not configure KMS encryption when configuring your MWAA environment, but I have included the info here.
        kms_mwaa_policy_document = iam.PolicyDocument(
            statements=[
                iam.PolicyStatement(
                    actions=[
                        "kms:Create*",
                        "kms:Describe*",
                        "kms:Enable*",
                        "kms:List*",
                        "kms:Put*",
                        "kms:Decrypt*",
                        "kms:Update*",
                        "kms:Revoke*",
                        "kms:Disable*",
                        "kms:Get*",
                        "kms:Delete*",
                        "kms:ScheduleKeyDeletion",
                        "kms:GenerateDataKey*",
                        "kms:CancelKeyDeletion"
                    ],
                    principals=[
                        iam.AccountRootPrincipal(),
                        # Optional:
                        # iam.ArnPrincipal(f"arn:aws:sts::{self.account}:assumed-role/AWSReservedSSO_rest_of_SSO_account"),
                    ],
                    resources=["*"]),
                iam.PolicyStatement(
                    actions=[
                        "kms:Decrypt*",
                        "kms:Describe*",
                        "kms:GenerateDataKey*",
                        "kms:Encrypt*",
                        "kms:ReEncrypt*",
                        "kms:PutKeyPolicy"
                    ],
                    effect=iam.Effect.ALLOW,
                    resources=["*"],
                    principals=[iam.ServicePrincipal("logs.amazonaws.com", region=f"{self.region}")],
                    conditions={"ArnLike": {"kms:EncryptionContext:aws:logs:arn": f"arn:aws:logs:{self.region}:{self.account}:*"}},
                ),
            ]
        )

        key = kms.Key(
            self,
            f"{mwaa_props['mwaa_env']}{key_suffix}",
            enable_key_rotation=True,
            policy=kms_mwaa_policy_document
        )

        key.add_alias(f"alias/{mwaa_props['mwaa_env']}{key_suffix}")
Now we come to actually creating the environment, using the stuff we have created or set up above. The following represents all the typical configuration options for the core Apache Airflow options within MWAA. You can change them to suit your own environment or parameterise them as mentioned above.
        managed_airflow = mwaa.CfnEnvironment(
            scope=self,
            id='airflow-test-environment',
            name=f"{mwaa_props['mwaa_env']}",
            airflow_configuration_options={'core.default_timezone': 'utc'},
            airflow_version='2.0.2',
            dag_s3_path="dags",
            environment_class='mw1.small',
            execution_role_arn=mwaa_service_role.role_arn,
            kms_key=key.key_arn,
            logging_configuration=logging_configuration,
            max_workers=5,
            network_configuration=network_configuration,
            #plugins_s3_object_version=None,
            #plugins_s3_path=None,
            #requirements_s3_object_version=None,
            #requirements_s3_path=None,
            source_bucket_arn=dags_bucket_arn,
            webserver_access_mode='PUBLIC_ONLY',
            #weekly_maintenance_window_start=None
        )

        managed_airflow.add_override('Properties.AirflowConfigurationOptions', options)
        managed_airflow.add_override('Properties.Tags', tags)

        CfnOutput(
            self,
            id="MWAASecurityGroup",
            value=security_group_id,
            description="Security Group name used by MWAA"
        )
This stack also outputs the MWAA security group, but you could export other information as well.
Deploying Your CDK Application
Now that we have reviewed the app and modified it so that it contains your details (your AWS account, unique S3 bucket, and so on), you can run the app and deploy the CDK stacks. To do this we use the "cdk deploy" command.
First of all, from the project directory, make sure everything is working ok. To do this we can use the "cdk ls" command. It should return the following (which are the ids assigned to the stacks in app.py) if it is working ok.
cdk ls
>mwaa-hybrid-backend
>mwaa-hybrid-environment
We can now deploy them, either all together or one at a time. This CDK application needs the mwaa-hybrid-backend stack deployed first as it contains the VPC networking that will be used in the mwaa-hybrid-environment stack, so we can deploy that by:
cdk deploy mwaa-hybrid-backend
And if it is working ok, it should look similar to the following:
✨ Synthesis time: 7.09s
mwaa-hybrid-backend: deploying...
[0%] start: Publishing 2695cb7a9f601cf94a4151c65c9069787d9ec312084346f2f4359e3f55ff2310:704533066374-eu-central-1
[100%] success: Published 2695cb7a9f601cf94a4151c65c9069787d9ec312084346f2f4359e3f55ff2310:704533066374-eu-central-1
mwaa-hybrid-backend: creating CloudFormation changeset...
✅ mwaa-hybrid-backend
✨ Deployment time: 172.13s
Outputs:
mwaa-hybrid-backend.ExportsOutputRefMWAAHybridApacheAirflowVPC677B092EF6F2F587 = vpc-0bbdeee3652ef21ff
mwaa-hybrid-backend.ExportsOutputRefMWAAHybridApacheAirflowVPCprivateSubnet1Subnet2A6995DF7F8D3134 = subnet-01e48db64381efc7f
mwaa-hybrid-backend.ExportsOutputRefMWAAHybridApacheAirflowVPCprivateSubnet2SubnetA28659530C36370A = subnet-0321530b8154f9bd2
mwaa-hybrid-backend.VPCId = vpc-0bbdeee3652ef21ff
Stack ARN:
arn:aws:cloudformation:eu-central-1:704533066374:stack/mwaa-hybrid-backend/b05897d0-f087-11ec-b5f3-02db3f47a5ca
✨ Total time: 179.22s
You can then track/view what has been deployed by checking the CloudFormation stack via the AWS Console.
We can now deploy the MWAA environment, which we can do simply by typing:
cdk deploy mwaa-hybrid-environment
This time, it will pop up details about some of the security-related information, in this case the IAM policies and security groups that I mentioned earlier. Answer "Y" to deploy these changes. This will kick off the deployment which you can track by going to the CloudFormation console.
This will take approx 20-25 minutes, so a good time to grab a cup of tea and read some of my other blog posts perhaps :-) If it has been successful, you will see the following output (again, your details will change but it should look similar to this):
Including dependency stacks: mwaa-hybrid-backend
[Warning at /mwaa-hybrid-environment/mwaa-sg] Ignoring Egress rule since 'allowAllOutbound' is set to true; To add customize rules, set allowAllOutbound=false on the SecurityGroup
✨ Synthesis time: 12.37s
mwaa-hybrid-backend
mwaa-hybrid-backend: deploying...
[0%] start: Publishing 2695cb7a9f601cf94a4151c65c9069787d9ec312084346f2f4359e3f55ff2310:704533066374-eu-central-1
[100%] success: Published 2695cb7a9f601cf94a4151c65c9069787d9ec312084346f2f4359e3f55ff2310:704533066374-eu-central-1
✅ mwaa-hybrid-backend (no changes)
✨ Deployment time: 1.97s
Outputs:
mwaa-hybrid-backend.ExportsOutputRefMWAAHybridApacheAirflowVPC677B092EF6F2F587 = vpc-0bbdeee3652ef21ff
mwaa-hybrid-backend.ExportsOutputRefMWAAHybridApacheAirflowVPCprivateSubnet1Subnet2A6995DF7F8D3134 = subnet-01e48db64381efc7f
mwaa-hybrid-backend.ExportsOutputRefMWAAHybridApacheAirflowVPCprivateSubnet2SubnetA28659530C36370A = subnet-0321530b8154f9bd2
mwaa-hybrid-backend.VPCId = vpc-0bbdeee3652ef21ff
Stack ARN:
arn:aws:cloudformation:eu-central-1:704533066374:stack/mwaa-hybrid-backend/b05897d0-f087-11ec-b5f3-02db3f47a5ca
✨ Total time: 14.35s
mwaa-hybrid-environment
This deployment will make potentially sensitive changes according to your current security approval level (--require-approval broadening).
Please confirm you intend to make the following modifications:
IAM Statement Changes
┌───┬──────────────────────────────┬────────┬──────────────────────────────┬──────────────────────────────┬─────────────────────────────────┐
│ │ Resource │ Effect │ Action │ Principal │ Condition │
├───┼──────────────────────────────┼────────┼──────────────────────────────┼──────────────────────────────┼─────────────────────────────────┤
│ + │ ${Custom::CDKBucketDeploymen │ Allow │ sts:AssumeRole │ Service:lambda.amazonaws.com │ │
│ │ t8693BB64968944B69AAFB0CC9EB │ │ │ │ │
│ │ 8756C/ServiceRole.Arn} │ │ │ │ │
├───┼──────────────────────────────┼────────┼──────────────────────────────┼──────────────────────────────┼─────────────────────────────────┤
│ + │ ${mwaa-dags.Arn} │ Deny │ s3:ListAllMyBuckets │ AWS:${mwaa-service-role} │ │
│ │ ${mwaa-dags.Arn}/* │ │ │ │ │
│ + │ ${mwaa-dags.Arn} │ Allow │ s3:* │ AWS:${mwaa-service-role} │ │
│ │ ${mwaa-dags.Arn}/* │ │ │ │ │
│ + │ ${mwaa-dags.Arn} │ Allow │ s3:Abort* │ AWS:${Custom::CDKBucketDeplo │ │
│ │ ${mwaa-dags.Arn}/* │ │ s3:DeleteObject* │ yment8693BB64968944B69AAFB0C │ │
│ │ │ │ s3:GetBucket* │ C9EB8756C/ServiceRole} │ │
│ │ │ │ s3:GetObject* │ │ │
│ │ │ │ s3:List* │ │ │
│ │ │ │ s3:PutObject │ │ │
├───┼──────────────────────────────┼────────┼──────────────────────────────┼──────────────────────────────┼─────────────────────────────────┤
│ + │ ${mwaa-hybrid-demoKey.Arn} │ Allow │ kms:CancelKeyDeletion │ AWS:arn:${AWS::Partition}:ia │ │
│ │ │ │ kms:Create* │ m::704533066374:root │ │
│ │ │ │ kms:Decrypt* │ │ │
│ │ │ │ kms:Delete* │ │ │
│ │ │ │ kms:Describe* │ │ │
│ │ │ │ kms:Disable* │ │ │
│ │ │ │ kms:Enable* │ │ │
│ │ │ │ kms:GenerateDataKey* │ │ │
│ │ │ │ kms:Get* │ │ │
│ │ │ │ kms:List* │ │ │
│ │ │ │ kms:Put* │ │ │
│ │ │ │ kms:Revoke* │ │ │
│ │ │ │ kms:ScheduleKeyDeletion │ │ │
│ │ │ │ kms:Update* │ │ │
│ + │ ${mwaa-hybrid-demoKey.Arn} │ Allow │ kms:Decrypt* │ Service:logs.eu-central-1.am │ "ArnLike": { │
│ │ │ │ kms:Describe* │ azonaws.com │ "kms:EncryptionContext:aws:lo │
│ │ │ │ kms:Encrypt* │ │ gs:arn": "arn:aws:logs:eu-centr │
│ │ │ │ kms:GenerateDataKey* │ │ al-1:704533066374:*" │
│ │ │ │ kms:PutKeyPolicy │ │ } │
│ │ │ │ kms:ReEncrypt* │ │ │
├───┼──────────────────────────────┼────────┼──────────────────────────────┼──────────────────────────────┼─────────────────────────────────┤
│ + │ ${mwaa-service-role.Arn} │ Allow │ sts:AssumeRole │ Service:airflow-env.amazonaw │ │
│ │ │ │ │ s.com │ │
│ │ │ │ │ Service:airflow.amazonaws.co │ │
│ │ │ │ │ m │ │
│ │ │ │ │ Service:ecs-tasks.amazonaws. │ │
│ │ │ │ │ com │ │
├───┼──────────────────────────────┼────────┼──────────────────────────────┼──────────────────────────────┼─────────────────────────────────┤
│ + │ * │ Allow │ logs:DescribeLogGroups │ AWS:${mwaa-service-role} │ │
│ + │ * │ Allow │ ecs:DescribeTaskDefinition │ AWS:${mwaa-service-role} │ │
│ │ │ │ ecs:DescribeTasks │ │ │
│ │ │ │ ecs:ListTasks │ │ │
│ │ │ │ ecs:RegisterTaskDefinition │ │ │
│ │ │ │ ecs:RunTask │ │ │
│ + │ * │ Allow │ iam:PassRole │ AWS:${mwaa-service-role} │ "StringLike": { │
│ │ │ │ │ │ "iam:PassedToService": "ecs-t │
│ │ │ │ │ │ asks.amazonaws.com" │
│ │ │ │ │ │ } │
│ + │ * │ Allow │ kms:Decrypt │ AWS:${mwaa-service-role} │ "StringEquals": { │
│ │ │ │ kms:DescribeKey │ │ "kms:ViaService": [ │
│ │ │ │ kms:Encrypt │ │ "sqs.eu-central-1.amazonaws │
│ │ │ │ kms:GenerateDataKey* │ │ .com", │
│ │ │ │ kms:PutKeyPolicy │ │ "s3.eu-central-1.amazonaws. │
│ │ │ │ │ │ com" │
│ │ │ │ │ │ ] │
│ │ │ │ │ │ } │
├───┼──────────────────────────────┼────────┼──────────────────────────────┼──────────────────────────────┼─────────────────────────────────┤
│ + │ arn:${AWS::Partition}:s3:::c │ Allow │ s3:GetBucket* │ AWS:${Custom::CDKBucketDeplo │ │
│ │ dk-hnb659fds-assets-70453306 │ │ s3:GetObject* │ yment8693BB64968944B69AAFB0C │ │
│ │ 6374-eu-central-1 │ │ s3:List* │ C9EB8756C/ServiceRole} │ │
│ │ arn:${AWS::Partition}:s3:::c │ │ │ │ │
│ │ dk-hnb659fds-assets-70453306 │ │ │ │ │
│ │ 6374-eu-central-1/* │ │ │ │ │
├───┼──────────────────────────────┼────────┼──────────────────────────────┼──────────────────────────────┼─────────────────────────────────┤
│ + │ arn:aws:airflow:eu-central-1 │ Allow │ airflow:PublishMetrics │ AWS:${mwaa-service-role} │ │
│ │ :704533066374:environment/mw │ │ │ │ │
│ │ aa-hybrid-demo │ │ │ │ │
├───┼──────────────────────────────┼────────┼──────────────────────────────┼──────────────────────────────┼─────────────────────────────────┤
│ + │ arn:aws:logs:eu-central-1:70 │ Allow │ logs:CreateLogGroup │ AWS:${mwaa-service-role} │ │
│ │ 4533066374:log-group:airflow │ │ logs:CreateLogStream │ │ │
│ │ -mwaa-hybrid-demo-* │ │ logs:DescribeLogGroups │ │ │
│ │ │ │ logs:GetLogEvents │ │ │
│ │ │ │ logs:GetLogGroupFields │ │ │
│ │ │ │ logs:GetLogRecord │ │ │
│ │ │ │ logs:GetQueryResults │ │ │
│ │ │ │ logs:PutLogEvents │ │ │
├───┼──────────────────────────────┼────────┼──────────────────────────────┼──────────────────────────────┼─────────────────────────────────┤
│ + │ arn:aws:sqs:eu-central-1:*:a │ Allow │ sqs:ChangeMessageVisibility │ AWS:${mwaa-service-role} │ │
│ │ irflow-celery-* │ │ sqs:DeleteMessage │ │ │
│ │ │ │ sqs:GetQueueAttributes │ │ │
│ │ │ │ sqs:GetQueueUrl │ │ │
│ │ │ │ sqs:ReceiveMessage │ │ │
│ │ │ │ sqs:SendMessage │ │ │
└───┴──────────────────────────────┴────────┴──────────────────────────────┴──────────────────────────────┴─────────────────────────────────┘
IAM Policy Changes
┌───┬───────────────────────────────────────────────────────────────────┬───────────────────────────────────────────────────────────────────┐
│ │ Resource │ Managed Policy ARN │
├───┼───────────────────────────────────────────────────────────────────┼───────────────────────────────────────────────────────────────────┤
│ + │ ${Custom::CDKBucketDeployment8693BB64968944B69AAFB0CC9EB8756C/Ser │ arn:${AWS::Partition}:iam::aws:policy/service-role/AWSLambdaBasic │
│ │ viceRole} │ ExecutionRole │
└───┴───────────────────────────────────────────────────────────────────┴───────────────────────────────────────────────────────────────────┘
Security Group Changes
┌───┬────────────────────┬─────┬────────────┬────────────────────┐
│ │ Group │ Dir │ Protocol │ Peer │
├───┼────────────────────┼─────┼────────────┼────────────────────┤
│ + │ ${mwaa-sg.GroupId} │ In │ Everything │ ${mwaa-sg.GroupId} │
│ + │ ${mwaa-sg.GroupId} │ Out │ Everything │ Everyone (IPv4) │
└───┴────────────────────┴─────┴────────────┴────────────────────┘
(NOTE: There may be security-related changes not in this list. See https://github.com/aws/aws-cdk/issues/1299)
Do you wish to deploy these changes (y/n)? y
mwaa-hybrid-environment: deploying...
[0%] start: Publishing e9882ab123687399f934da0d45effe675ecc8ce13b40cb946f3e1d6141fe8d68:704533066374-eu-central-1
[0%] start: Publishing 983c442a2fe823a8b4ebb18d241a5150ae15103dacbf3f038c7c6343e565aa4c:704533066374-eu-central-1
[0%] start: Publishing 91ab667f7c88c3b87cf958b7ef4158ef85fb9ba8bd198e5e0e901bb7f904d560:704533066374-eu-central-1
[0%] start: Publishing f2a926ee3d8ca4bd02b0cf073eb2bbb682e94c021925bf971a9730045ef4fb02:704533066374-eu-central-1
[25%] success: Published 983c442a2fe823a8b4ebb18d241a5150ae15103dacbf3f038c7c6343e565aa4c:704533066374-eu-central-1
[50%] success: Published 91ab667f7c88c3b87cf958b7ef4158ef85fb9ba8bd198e5e0e901bb7f904d560:704533066374-eu-central-1
[75%] success: Published f2a926ee3d8ca4bd02b0cf073eb2bbb682e94c021925bf971a9730045ef4fb02:704533066374-eu-central-1
[100%] success: Published e9882ab123687399f934da0d45effe675ecc8ce13b40cb946f3e1d6141fe8d68:704533066374-eu-central-1
mwaa-hybrid-environment: creating CloudFormation changeset...
✅ mwaa-hybrid-environment
✨ Deployment time: 1412.35s
Outputs:
mwaa-hybrid-environment.MWAASecurityGroup = sg-0ea83e01caded2bb3
Stack ARN:
arn:aws:cloudformation:eu-central-1:704533066374:stack/mwaa-hybrid-environment/450337a0-f088-11ec-a169-06ba63bfdfb2
✨ Total time: 1424.72s
Testing the Environment
If we take a look at the Amazon S3 bucket we can see we have our MWAA bucket and dags folder created, as well as our local DAGs uploaded.
If we go to the MWAA console, we can see our environment.
We can now grab the URL for this environment, either by getting it from the console or by using the AWS CLI. Just substitute the name of the MWAA environment and the AWS region, and it should give you the URL you can use in your browser (although you will have to append /home to it).
Note that I am using jq here. If you do not have it in your environment, you can run the command without it; you just need to find the entry in the output where it says "WebserverUrl".
aws mwaa get-environment --name {name of the environment created} --region={region} | jq -r '.Environment | .WebserverUrl'
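If you would rather not depend on jq, the same extraction can be done with a few lines of Python. The sketch below parses the JSON printed by the command above and builds the full browser URL, including the /home suffix. It assumes the WebserverUrl value is a bare hostname; the sample output is trimmed and made up, and webserver_home_url is a hypothetical helper name.

```python
import json


def webserver_home_url(get_environment_output: str) -> str:
    """Build the Airflow UI URL from the JSON printed by `aws mwaa get-environment`."""
    environment = json.loads(get_environment_output)["Environment"]
    host = environment["WebserverUrl"]
    # Add the scheme only if it is missing
    if not host.startswith("https://"):
        host = "https://" + host
    return host.rstrip("/") + "/home"


# Trimmed, made-up sample of the CLI output; a real response has many more fields
sample_output = json.dumps({
    "Environment": {
        "Name": "mwaa-hybrid-demo",
        "WebserverUrl": "example-id.c2.eu-central-1.airflow.amazonaws.com",
    }
})

print(webserver_home_url(sample_output))
```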
And as we can see, the two sample DAGs that were in the local folder are now available for us in the MWAA environment.
Removing/Cleaning up our MWAA environment
In order to remove everything we have deployed, all we need to do is:
cdk destroy mwaa-hybrid-environment
It will take 20-30 minutes to clean up the MWAA environment. One thing that it will not do, however, is remove the Amazon S3 bucket we set up, so you will need to manually delete that via the console (or use the AWS CLI, which would be my approach). Once you have removed that S3 bucket, clean up the backend stack:
cdk destroy mwaa-hybrid-backend
This should be much quicker to clean up. Once finished, you should be done.
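For the manual S3 clean-up step mentioned above, bear in mind that the bucket was created with versioned=True, so every object version has to be removed before the bucket itself can be deleted. A minimal sketch using boto3 might look like this. It assumes boto3 is installed and your credentials are configured, the function names are made up for this example, and the deletion is irreversible, so double-check the bucket name before running anything like it.

```python
def empty_and_delete_bucket(bucket) -> None:
    """Delete every object version (and delete marker), then the bucket itself.

    `bucket` is expected to behave like a boto3 S3 Bucket resource.
    """
    bucket.object_versions.delete()
    bucket.delete()


def main(bucket_name: str) -> None:
    # boto3 is imported here so the helper above stays importable without it
    import boto3

    empty_and_delete_bucket(boto3.resource("s3").Bucket(bucket_name))


# Example call (not executed here); substitute the bucket name from app.py:
# main("{your-unique-s3-bucket}")
```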
What's Next?
That's all folks, I hope this has been helpful. Please let me know if you find this how-to guide useful, and if you run into any issues, please log an issue in GitHub and I will take a look.