Diagram As Code: Crafting AWS Architecture Diagrams Using Python
Discover the art of diagramming AWS infrastructures using Python’s ‘diagrams’ library.
Introduction
Visual representations of system architectures are crucial for understanding and communicating the design and operation of cloud-based services. For those who work with Amazon Web Services (AWS), the ability to programmatically create these representations can streamline documentation and ensure that diagrams stay up-to-date with the underlying infrastructure. Enter diagrams, a Python library that does just that: it allows you to create, update, and manage your AWS architecture diagrams using code.
Although the diagrams library is versatile enough to support various cloud providers, our focus in this article is on AWS (because I love it :-D). AWS's extensive range of services and widespread adoption make it a prime candidate for such visualizations. By using code to diagram these services, we can achieve a level of detail and customization that traditional drawing tools struggle to match. Let's dive in.
What is the diagrams Library?
The diagrams library is a Python package that enables developers to build cloud system diagrams using just Python code. It leverages the Graphviz graph-drawing software to turn descriptions of architectures into detailed visualizations. With the diagrams library, you can create high-quality diagrams that are easily modifiable, making it a perfect tool for generating architecture diagrams as part of your documentation or development process.
If you want to learn more about this package and contribute, please check its main repository.
Getting Started with the diagrams Library
Setting Up Your Python Environment
Before you embark on the journey of transforming your AWS infrastructure into a diagrammatic form, you’ll need to prepare your workspace. The foundation of this setup is Python. Ensure that you have Python 3.6 or later installed on your system, as it’s a prerequisite for the diagrams library to function correctly.
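You can quickly confirm which interpreter is on your path from a terminal (on Windows the command is typically python --version rather than python3 --version):
python3 --version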
Installing Graphviz
With Python ready to go, the next step is to install Graphviz, the open-source graph visualization software. Graphviz is crucial because it provides the underlying graph layout and rendering capabilities that the diagrams library depends on. Installation methods may vary depending on your operating system, but for most Linux distributions, Graphviz can be installed using the package manager. For example, on Debian-based systems like Ubuntu, you can install Graphviz by running:
sudo apt-get install graphviz
Windows and macOS users can download installers from the Graphviz website or use package managers such as Homebrew for macOS or Chocolatey for Windows.
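If you go the package-manager route, the commands are typically the following (package names assume the default Homebrew and Chocolatey repositories):
brew install graphviz
choco install graphviz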
Installing the diagrams Library
Once Graphviz is in place, the final step in the setup process is to install the diagrams library itself. This is easily done through Python’s package manager, pip. Open your command line interface and execute the following command:
pip install diagrams
This command downloads and installs the latest version of the diagrams library along with its dependencies, setting the stage for you to begin crafting your diagrams.
Verifying the Installation
To confirm that everything is installed correctly, try importing the diagrams library in a Python shell or script. If no errors occur, you're ready to proceed.
import diagrams
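Importing the package only proves that the Python side is installed. To check that Graphviz is reachable as well, you can render a throwaway one-node diagram; the smoke_test filename below is just an example:
from diagrams import Diagram
from diagrams.aws.compute import EC2

# If Graphviz is set up correctly, this writes smoke_test.png
# to the current directory without raising an error.
with Diagram("Smoke Test", show=False, filename="smoke_test"):
    EC2("Hello")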
Now that you have all the necessary tools installed, you’re all set to start converting your AWS architecture into a visual masterpiece. The ease and flexibility provided by the diagrams library make it an invaluable addition to any cloud developer's toolkit.
Exploring Essential AWS Services in the diagrams Library
Understanding the vast array of services offered by AWS is fundamental to creating a comprehensive architecture diagram. The diagrams library categorizes these services to mirror AWS's own structure, making it easier for architects and developers familiar with AWS to find their way around. Let's introduce some of the most commonly used AWS services that you can represent using the diagrams library.
Compute Services
- EC2 (Elastic Compute Cloud): The backbone of AWS, EC2 instances are the workhorses that power a large portion of cloud applications. They provide scalable computing capacity, making it possible to run applications in the cloud.
- Lambda: This is AWS’s event-driven, serverless computing platform. It runs your code in response to events and automatically manages the compute resources.
- Elastic Beanstalk: An orchestration service that allows you to deploy and scale web applications and services developed with Java, .NET, PHP, Node.js, and more, running on familiar servers like Apache and IIS.
Storage Services
- S3 (Simple Storage Service): S3 provides object storage with a simple web service interface to store and retrieve any amount of data from anywhere on the web.
- EBS (Elastic Block Store): Offers persistent block storage volumes for use with EC2 instances, providing high-availability storage.
- Glacier: An extremely low-cost storage service that provides secure and durable storage for data archiving and backup.
Database Services
- RDS (Relational Database Service): Simplifies setting up, operating, and scaling a relational database in the cloud. It provides cost-efficient and resizable capacity while managing time-consuming database administration tasks.
- DynamoDB: A NoSQL database service that provides fast and predictable performance with seamless scalability.
- Redshift: A fast, fully managed data warehouse that makes it simple and cost-effective to analyze all your data using standard SQL and existing Business Intelligence (BI) tools.
Networking Services
- VPC (Virtual Private Cloud): Offers a logically isolated section of the AWS cloud where you can launch AWS resources in a virtual network that you define.
- Route 53: A scalable and highly available Domain Name System (DNS) web service.
- API Gateway: Makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale.
Security Services
- IAM (Identity and Access Management): Enables you to manage access to AWS services and resources securely.
- KMS (Key Management Service): Makes it easy for you to create and manage the encryption keys used to encrypt your data.
Application Integration Services
- SNS (Simple Notification Service): A managed service that provides message delivery from publishers to subscribers (also known as producers and consumers).
- SQS (Simple Queue Service): Offers a secure, durable, and available hosted queue that lets you integrate and decouple distributed software systems and components.
Management and Governance Services
- CloudWatch: Provides data and actionable insights to monitor applications, understand and respond to system-wide performance changes, optimize resource utilization, and get a unified view of operational health.
- CloudFormation: Gives developers and systems administrators an easy way to create and manage a collection of related AWS resources, provisioning and updating them in an orderly and predictable fashion.
Developer Tools
- CodeCommit: A source control service that hosts secure Git-based repositories, making it easy for teams to collaborate on code in a secure and highly scalable ecosystem.
- CodeBuild: A fully managed build service that compiles source code, runs tests, and produces software packages that are ready to deploy.
Machine Learning Services
- Sagemaker: Provides every developer and data scientist with the ability to build, train, and deploy machine learning models quickly.
These services are just the tip of the iceberg, but they’re integral to most AWS architectures. Each service in the diagrams library is represented as a class that you can instantiate and use in your diagrams. They encapsulate the visual and functional aspects of their AWS counterparts, allowing you to create diagrams that are both informative and visually accurate.
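As a quick orientation, the categories above map onto the library's AWS submodules roughly as shown below. These class names match the ones used in the examples later in this article; anything beyond them is worth cross-checking against the diagrams documentation for your installed version.
# Each AWS service is a node class grouped by category.
from diagrams.aws.compute import EC2, Lambda, ElasticBeanstalk
from diagrams.aws.storage import S3
from diagrams.aws.database import RDS, Dynamodb, Redshift
from diagrams.aws.network import VPC, Route53, APIGateway
from diagrams.aws.security import IAM, KMS
from diagrams.aws.integration import SNS, SQS
from diagrams.aws.management import Cloudwatch, Cloudformation
from diagrams.aws.ml import Sagemaker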
In the following sections, we’ll harness these services and more as we step through a series of practical examples to illustrate common AWS architectural patterns. These examples will serve as blueprints that you can expand on to create detailed representations of your own cloud environments.
Example 1: Simple Web Application Architecture on AWS
In this first example, we’ll create a diagram for a simple web application running on AWS. The architecture includes a web server running on an EC2 instance, static content hosted on S3, and a database hosted on RDS.
Here’s how you might write the code for this diagram in a Jupyter Notebook:
from diagrams import Diagram
from diagrams.aws.compute import EC2
from diagrams.aws.database import RDS
from diagrams.aws.storage import S3
from diagrams.aws.network import ELB, Route53
from IPython.display import Image

with Diagram("Simple Web Application Architecture", show=False, filename="simple_web_app_architecture"):
    dns = Route53("DNS")
    lb = ELB("Load Balancer")
    web_server = EC2("Web Server")
    static_content = S3("Static Content")
    database = RDS("Database")

    dns >> lb >> web_server
    web_server >> static_content
    web_server >> database

Image(filename="simple_web_app_architecture.png")
This script will generate a diagram named “Simple Web Application Architecture” with the following components:
- Route 53 serves as the DNS service.
- Load Balancer (ELB) to distribute incoming traffic.
- EC2 instance hosting the web server.
- S3 bucket for static content.
- RDS instance for the relational database.
After running the code, a PNG image named simple_web_app_architecture.png will be created and displayed within the Jupyter Notebook.
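A quick aside on the connection syntax used in these snippets: the diagrams DSL overloads a few operators to draw edges, which is worth knowing before the later examples (Example 3 uses the undirected form for the RDS replica link). A minimal illustration:
from diagrams import Diagram
from diagrams.aws.compute import EC2

with Diagram("Edge Operators", show=False, filename="edge_operators"):
    a, b, c = EC2("a"), EC2("b"), EC2("c")
    a >> b   # directed edge from a to b
    b << c   # directed edge from c to b
    a - c    # undirected edge between a and c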
Example 2: Serverless Architecture Using AWS Lambda and API Gateway
In this second example, we’ll illustrate a serverless architecture utilizing AWS Lambda and API Gateway, backed by S3 for storage and DynamoDB for database operations. This setup is typical for applications that require scalability and flexibility without managing servers.
Here’s the Jupyter Notebook code for this diagram:
from diagrams import Diagram
from diagrams.aws.compute import Lambda
from diagrams.aws.database import Dynamodb
from diagrams.aws.storage import S3
from diagrams.aws.network import APIGateway
from diagrams.aws.management import Cloudwatch
from IPython.display import Image

with Diagram("Serverless Architecture", show=False, filename="serverless_architecture"):
    api = APIGateway("API Gateway")
    lambda_function = Lambda("Lambda Function")
    db = Dynamodb("DynamoDB")
    storage = S3("S3 Bucket")
    logs = Cloudwatch("CloudWatch Logs")

    api >> lambda_function >> db
    lambda_function >> storage
    lambda_function >> logs

Image(filename="serverless_architecture.png")
This script generates a diagram with the following components:
- API Gateway as the entry point that receives client requests.
- Lambda Function handling the application logic in response to those requests.
- DynamoDB as the NoSQL database backing the function.
- S3 Bucket for object storage.
- CloudWatch Logs for logging and monitoring the Lambda executions.
After running the code, a PNG image will be created and displayed within the Jupyter Notebook.
Example 3: Highly Available Web Application with Auto Scaling and Load Balancing
This example demonstrates a more complex architecture for a high-availability web application. It includes an Elastic Load Balancer (ELB) to distribute incoming traffic, Auto Scaling Groups (ASG) to handle changes in load by automatically adjusting the number of EC2 instances, and a Multi-AZ RDS setup for high availability of the database.
Here’s the Jupyter Notebook code for this diagram:
from diagrams import Diagram
from diagrams.aws.compute import EC2, AutoScaling
from diagrams.aws.database import RDS
from diagrams.aws.network import ELB
from diagrams.aws.storage import S3
from diagrams.aws.management import Cloudwatch
from IPython.display import Image

with Diagram("Highly Available Web App", show=False, filename="high_avail_web_app"):
    user = EC2("User")
    lb = ELB("Load Balancer")
    asg = AutoScaling("Auto Scaling Group")
    servers = [EC2("Web Server 1"),
               EC2("Web Server 2"),
               EC2("Web Server 3")]
    db_primary = RDS("Primary DB")
    db_secondary = RDS("Secondary DB (Replica)")
    bucket = S3("S3 Bucket for Logs")
    cw = Cloudwatch("CloudWatch Monitoring")

    user >> lb >> asg >> servers
    asg >> db_primary
    db_primary - db_secondary

    for server in servers:
        server >> bucket
        server >> cw

Image(filename="high_avail_web_app.png")
This script generates a diagram with the following components:
- Elastic Load Balancer (ELB) for traffic distribution.
- Auto Scaling Group (ASG) to automatically adjust the number of EC2 instances.
- Multiple EC2 instances representing the web servers.
- Multi-AZ RDS setup with a primary and a secondary (replica) database for high availability.
- S3 Bucket for storing logs and other data.
- CloudWatch for monitoring the health and performance of the application.
After running the code, a PNG image will be created and displayed within the Jupyter Notebook.
Example 4: Microservices Architecture with ECS and API Gateway
In this example, we’ll illustrate a microservices architecture using Amazon Elastic Container Service (ECS) to manage containers, combined with API Gateway for service exposure. This kind of architecture is often used for scalable, modular applications.
Here’s the Jupyter Notebook code for this diagram:
from diagrams import Diagram
from diagrams.aws.compute import ECS, Fargate
from diagrams.aws.database import Dynamodb
# APIGateway is provided by diagrams.aws.network (as in Example 2)
from diagrams.aws.network import APIGateway, Route53
from diagrams.aws.security import IAM
from diagrams.aws.storage import S3
from diagrams.aws.management import Cloudwatch
from IPython.display import Image

with Diagram("Microservices Architecture", show=False, filename="microservices_architecture"):
    route53 = Route53("DNS")
    api_gw = APIGateway("API Gateway")
    ecs_cluster = ECS("ECS Cluster")
    services = [ECS("Service 1"),
                ECS("Service 2"),
                ECS("Service 3")]
    fargate = Fargate("Fargate")
    db = Dynamodb("DynamoDB")
    s3_bucket = S3("S3 Bucket")
    iam_role = IAM("IAM Role")
    cloudwatch = Cloudwatch("CloudWatch")

    route53 >> api_gw >> ecs_cluster
    ecs_cluster >> fargate

    for service in services:
        fargate >> service >> db
        service >> s3_bucket
        service >> cloudwatch

    ecs_cluster >> iam_role

Image(filename="microservices_architecture.png")
This script creates a diagram featuring:
- Route 53 for DNS management.
- API Gateway as the entry point for microservices.
- ECS Cluster hosting the microservice containers.
- Fargate for serverless container execution.
- Multiple ECS Services representing different microservices.
- DynamoDB for database needs.
- S3 Bucket for storage requirements.
- IAM Role for managing permissions.
- CloudWatch for monitoring and logging.
After running the code, a PNG image will be created and displayed within the Jupyter Notebook.
Example 5: Data Processing Architecture with Amazon SageMaker, Lambda, and S3
In this example, we’ll construct a data processing architecture that utilizes Amazon SageMaker for machine learning, AWS Lambda for data processing and orchestration, and Amazon S3 for data storage. This setup is commonly used in scenarios where machine learning models are employed to process and analyze large datasets.
Here’s the Jupyter Notebook code for this diagram:
from diagrams import Diagram
from diagrams.aws.compute import Lambda
from diagrams.aws.storage import S3
from diagrams.aws.ml import Sagemaker
from diagrams.aws.database import Dynamodb
from diagrams.aws.integration import SQS
from diagrams.aws.management import Cloudwatch
from IPython.display import Image

with Diagram("Data Processing with SageMaker", show=False, filename="data_processing_sagemaker"):
    data_source = S3("Data Source")
    processing_lambda = Lambda("Data Processing Lambda")
    sagemaker_model = Sagemaker("SageMaker Model")
    data_storage = S3("Processed Data Storage")
    db = Dynamodb("DynamoDB")
    queue = SQS("Processing Queue")
    cloudwatch = Cloudwatch("Monitoring")

    data_source >> processing_lambda >> queue >> sagemaker_model
    sagemaker_model >> data_storage
    sagemaker_model >> db
    sagemaker_model >> cloudwatch

Image(filename="data_processing_sagemaker.png")
In this architecture:
- Amazon S3 (Data Source) holds the raw data that needs processing.
- AWS Lambda (Data Processing Lambda) acts as a serverless function to pre-process or orchestrate the data.
- Amazon SQS (Processing Queue) queues the data for processing, ensuring that the SageMaker model isn’t overwhelmed.
- Amazon SageMaker (SageMaker Model) performs machine learning tasks on the data.
- Amazon S3 (Processed Data Storage) stores the processed data.
- Amazon DynamoDB might be used to store metadata or results of the processing.
- Amazon CloudWatch monitors the entire process, logging activities and performance metrics.
This diagram demonstrates a typical flow for processing data using AWS services, where data is ingested, processed, and analyzed, and the results are stored and monitored. After running the code, a PNG image will be created and displayed within the Jupyter Notebook.
Example 6: LLM-based ChatBot
In our final example, we construct a streamlined architecture for an interactive web application using a fully managed suite of AWS services. This architecture is designed to be scalable, secure, and resilient, leveraging the power of AWS to eliminate the need for direct infrastructure management for LLM-based chatbot deployment.
from diagrams import Diagram
from diagrams.aws.compute import Lambda
from diagrams.aws.database import Dynamodb
from diagrams.aws.integration import SNS
from diagrams.aws.network import APIGateway, CloudFront
from diagrams.aws.security import WAF, Cognito
from diagrams.aws.storage import S3
from diagrams.aws.ml import SagemakerModel
from diagrams.onprem.client import User
from IPython.display import Image

with Diagram("LLM-based ChatBot", show=False, filename="LLM-based-ChatBot"):
    user = User("Users")
    cognito = Cognito("Cognito\nUser Pool")
    waf = WAF("AWS WAF")
    cloudfront = CloudFront("CloudFront")
    s3_bucket = S3("S3 Bucket\nReact application")
    api_gateway_rest = APIGateway("API Gateway\nREST")
    api_gateway_ws = APIGateway("API Gateway\nWebSocket")
    lambda_fastapi = Lambda("Lambda\n(FastAPI)")
    lambda_publisher = Lambda("Lambda\n(Publisher)")
    lambda_1 = Lambda("Lambda")
    sns = SNS("Amazon SNS")
    bedrock = SagemakerModel("Amazon Bedrock\nClaude 2")
    dynamodb = Dynamodb("DynamoDB\nConversation Table")

    # Static frontend delivery and authentication
    user >> waf >> cloudfront >> s3_bucket
    user >> cognito

    # Synchronous REST path
    user >> api_gateway_rest >> lambda_fastapi >> bedrock
    lambda_fastapi >> dynamodb

    # Asynchronous WebSocket path
    user >> api_gateway_ws >> lambda_publisher >> sns >> lambda_1 >> api_gateway_ws
    lambda_1 >> dynamodb

# Display the image in the Jupyter notebook
Image(filename="LLM-based-ChatBot.png")
Let’s walk through the components and their interactions:
- Users: They are the clients who interact with the web application, initiating requests and receiving responses.
- Amazon Cognito: Handles user authentication, providing secure access to the web application.
- AWS WAF: Stands in front of CloudFront as a firewall, applying IP address filtering and other security rules to protect the application against common web exploits.
- Amazon CloudFront: Serves as a content delivery network (CDN) that hosts and delivers the static assets of the React application stored in the S3 bucket.
- S3 Bucket: Stores the React application’s static files, which are then delivered globally through CloudFront.
- API Gateway REST: Acts as the entry point for the serverless backend, receiving HTTP requests from the client which are then processed by a Lambda function.
- Lambda (FastAPI): A serverless compute service that runs the backend code (using FastAPI framework) without provisioning or managing servers. It’s responsible for handling business logic, interacting with Amazon Bedrock for machine learning operations, and storing conversation data in DynamoDB.
- Amazon Bedrock (represented by the SageMaker Model node in the diagram): A fully managed service that provides API access to foundation models such as Claude 2; here it generates the chatbot’s responses as part of the backend operations.
- DynamoDB: A NoSQL database service used for storing conversation history and other relevant data.
- API Gateway WebSocket: Manages real-time, two-way communication between clients and the server.
- Lambda (Publisher): This function responds to WebSocket connections, publishing messages to an Amazon SNS topic.
- Amazon SNS: A publish/subscribe service that decouples microservices, distributed systems, and serverless applications. It takes messages from the Lambda publisher and forwards them as needed.
- Lambda: Subscribes to the Amazon SNS topic, processing the messages and performing necessary actions, such as updating the API Gateway WebSocket connection or updating DynamoDB.
Overall, the architecture is designed to be scalable and secure, using AWS managed services to handle different aspects of the application, from user authentication with Amazon Cognito to content delivery with Amazon CloudFront and real-time data processing with AWS Lambda and Amazon SNS.
After running the code, a PNG image will be created and displayed within the Jupyter Notebook.
For more information about this case, you can check this AWS sample repo:
Thank you for reading my post, and I hope it was useful for you. If you enjoyed the article and would like to show your support, please consider taking the following actions:
- 📚 If you found value in my articles and would like to support my work, consider buying me a book: Buy me a book
- 👏 Show your support by giving the article a clap, enhancing its visibility.
- 📖 Stay updated with my latest pieces by following me.
- 🔔 Don’t miss out on my new posts. Subscribe to the newsletter.
- 🛎 For more regular updates, connect with me on LinkedIn.
Conclusion
Throughout this guide, we’ve seen the power and versatility of using the diagrams Python library to create detailed and accurate representations of AWS architectures. The ability to define architecture as code is a game-changer, offering adaptability, precision, and ease of maintenance that traditional diagramming tools can't match.
While our focus has been on AWS due to its prominence in the cloud industry, it’s important to note that the diagrams library's capabilities are not limited to AWS. Its support extends to other major cloud providers, such as Google Cloud Platform and Microsoft Azure. This universality makes it an invaluable tool in the toolkit of any cloud professional, regardless of the specific platform they use.
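As a small taste of that cross-provider support, the same workflow carries over to the GCP and Azure modules. The class names in this sketch (GCE for Google Compute Engine, VM for an Azure virtual machine) reflect my reading of the library's provider modules, so double-check them against the docs for your installed version:
from diagrams import Diagram
from diagrams.aws.compute import EC2
from diagrams.gcp.compute import GCE
from diagrams.azure.compute import VM

# One node per provider, just to show that the DSL is identical across clouds.
with Diagram("Multi-Cloud Sketch", show=False, filename="multi_cloud_sketch"):
    EC2("AWS EC2") >> GCE("GCP Compute Engine") >> VM("Azure VM")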
In conclusion, whether you are documenting AWS architectures or working with other cloud providers, adopting a code-based approach to diagramming can significantly enhance the way you visualize and communicate your cloud infrastructure. This method brings clarity and cohesion to cloud architecture documentation, making it an essential practice for teams and individuals working in the ever-evolving cloud environment.