How to Conduct an AWS Well-Architected Framework Review

Written by Josh Ray | Aug 12, 2024 1:00:00 PM

Organizations across all sectors recognize the tremendous potential benefits of developing and running applications in the cloud. But many teams need assistance with building optimal architectures to support those applications.

Amazon Web Services (AWS) created the AWS Well-Architected Framework to help organizations build and operate secure, high-performing, resilient, and efficient cloud environments for their applications. The framework has six pillars: operational excellence, security, reliability, performance efficiency, cost optimization, and sustainability. Using the design principles and best practices for each pillar, cloud architects can implement a consistent approach for developing and then evaluating their architectures on AWS.

Purpose and Goals of Conducting a Well-Architected Framework Review

The AWS Well-Architected Framework Review (sometimes abbreviated as WAFR) is the process of evaluating an existing or newly designed architecture according to the principles and best practices in the framework. By answering questions for each framework pillar and using the AWS Well-Architected Tool, your team can determine whether best practices are sufficiently implemented in your architecture and then identify areas for improvement. The ultimate goal of the review is to improve the architecture so it is closely aligned with business objectives.

Operational Excellence  

The operational excellence pillar of the AWS Well-Architected Framework focuses on supporting development, running applications effectively, and continuously improving processes to maximize their business value. The review of this pillar can help assess your processes, determine whether you are effectively measuring performance, and investigate how you are managing events. 

Assessing processes and procedures

Your processes for developing and running your application should be not only efficient but also flexible. You need the agility to make changes quickly so you can respond to quality issues, shifting priorities, or other feedback you might receive. Employing automation as part of your processes can help you make small changes quickly and efficiently.

Measuring performance

AWS stresses the importance of building observability into your application so you can continuously monitor its state, measure performance, and make data-driven decisions about changes. You should establish performance baselines, monitor for deviations, and make sure you receive alerts when something changes. Automation of operational health monitoring and observability can help you capture all the metrics you need without overextending your team.
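
As a rough illustration of the baseline-and-alert idea above, here is a minimal Python sketch. It assumes a baseline can be summarized as a mean and standard deviation of past measurements and that "deviation" means a value more than three standard deviations from the mean. The sample response times, the threshold, and the function names are all hypothetical; a real setup would more likely rely on a monitoring service than on hand-rolled code.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarize historical measurements as a mean and sample standard deviation."""
    return {"mean": mean(samples), "stdev": stdev(samples)}


def deviates(baseline, value, threshold=3.0):
    """Return True when value is more than `threshold` standard deviations from the mean."""
    if baseline["stdev"] == 0:
        # With no historical variation, any different value counts as a deviation.
        return value != baseline["mean"]
    z_score = abs(value - baseline["mean"]) / baseline["stdev"]
    return z_score > threshold


# Hypothetical response times in milliseconds (illustrative data only).
history_ms = [120, 118, 125, 122, 119, 121, 123, 120]
baseline = build_baseline(history_ms)

for latest in (124, 310):
    if deviates(baseline, latest):
        print(f"ALERT: {latest} ms is outside the baseline")
```

The threshold is the main tuning knob here: a lower value alerts sooner but produces more noise.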

Managing events and incidents

Your team should have processes in place to handle events, incidents, and other problems. For example, you should have a plan for communicating the status of an event to stakeholders and for mitigating any damage. The team should also know who will lead the response. These processes should be documented in a central location and remain flexible enough to change as your needs evolve.

Security

How well are you protecting information and systems? In reviewing the security pillar of the framework, you should assess your practices for detecting threats, managing user permissions, protecting data, and more. 

Detecting threats and vulnerabilities

Employing multiple types of detection capabilities is essential not only for preventing attacks but also for maintaining quality and sustaining compliance with regulations. As part of your threat detection efforts, AWS recommends inventorying assets, conducting internal audits, examining system controls, and establishing automated alerting. AWS offers several services that support threat detection and simplify log management, such as Amazon GuardDuty for threat detection and AWS CloudTrail for recording account activity.

Evaluating identity and access

Identity and access management controls should play a key role in a multi-layered, defense-in-depth strategy. AWS recommends establishing role-based policies, which can help prevent unauthorized access and restrict authorized access to only necessary systems. With AWS Identity and Access Management (IAM), you can require strong passwords and implement multi-factor authentication (MFA) to add another layer of control.
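
To make the idea of restricting access to only necessary systems more concrete, the sketch below scans a policy document for statements that allow everything. It assumes the standard IAM JSON policy layout (a "Statement" list whose entries have "Effect", "Action", and "Resource"); the bucket name and the `overly_broad_statements` helper are hypothetical, and the check only looks for a bare "*" rather than every possible over-permission.

```python
def as_list(value):
    """IAM allows a single string or a list; normalize to a list."""
    return [value] if isinstance(value, str) else list(value)


def overly_broad_statements(policy):
    """Return Allow statements that use a bare "*" for Action or Resource."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]
    findings = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = as_list(stmt.get("Action", []))
        resources = as_list(stmt.get("Resource", []))
        if "*" in actions or "*" in resources:
            findings.append(stmt)
    return findings


# Hypothetical policy: the first statement is scoped, the second is not.
example_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-reports-bucket/*",
        },
        {"Effect": "Allow", "Action": "*", "Resource": "*"},
    ],
}

for stmt in overly_broad_statements(example_policy):
    print("Review this statement:", stmt)
```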

Protecting data and infrastructure

To protect critical data, AWS recommends using encryption and regularly rotating encryption keys. AWS also offers storage options designed to provide high levels of data durability and resilience. 

To protect infrastructure, AWS recommends employing multiple layers of defense—including many of the same capabilities you might use for on-premises environments. For example, you should enforce boundary protection; monitor all points of ingress and egress; and implement logging, monitoring, and alerting. 

Reliability

For the reliability pillar of the framework, the AWS Well-Architected Framework Review evaluates whether you have implemented best practices for maintaining high availability and quickly recovering from failures. 

Managing failures and disasters 

Component failures happen—but they don’t have to bring down your entire application. To achieve a level of fault tolerance, AWS recommends replacing large resources with multiple smaller ones so that a single failure does not cause significant downtime. You might also implement fault-isolated boundaries, which limit the impact of a failure within a workload to a certain number of components. Components beyond the boundary are unaffected and can continue operating.
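
The arithmetic behind replacing one large resource with several smaller ones can be sketched in a few lines of Python. This example assumes equally sized nodes that share load evenly, which is a simplification; the function name is hypothetical.

```python
def capacity_after_failures(node_count, failed=1):
    """Fraction of total capacity left when `failed` equal-sized nodes are down."""
    if node_count <= 0 or not 0 <= failed <= node_count:
        raise ValueError("need node_count > 0 and 0 <= failed <= node_count")
    return (node_count - failed) / node_count


for nodes in (1, 2, 4, 8):
    remaining = capacity_after_failures(nodes)
    print(f"{nodes} node(s): {remaining:.0%} of capacity remains after one failure")
```

Under these assumptions, a single failure takes out the whole workload when there is one node but only a fraction of capacity when the same load is spread across several.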

Testing reliability through chaos engineering

How do you know your application will be reliable and resilient when in production? AWS recommends testing through a form of “chaos engineering”—the practice of experimenting on a system to build confidence in its ability to withstand turbulent conditions in production. Your teams can simulate real-world disruptions in a controlled way (with no impact to customers) so you can learn from faults. This approach can highlight deficiencies that should be addressed before they impact availability.
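
A controlled experiment of this kind can be approximated locally with a small fault-injection wrapper. The Python sketch below is illustrative only: `FaultInjector`, `fetch_price`, and the 30 percent failure rate are hypothetical, and a production chaos experiment would normally use dedicated tooling and run against real infrastructure with safeguards in place.

```python
import random


class FaultInjector:
    """Wraps a callable and makes a fraction of calls fail on purpose."""

    def __init__(self, func, failure_rate, seed=None):
        if not 0.0 <= failure_rate <= 1.0:
            raise ValueError("failure_rate must be between 0 and 1")
        self.func = func
        self.failure_rate = failure_rate
        self.rng = random.Random(seed)  # seeded so experiments are repeatable

    def __call__(self, *args, **kwargs):
        if self.rng.random() < self.failure_rate:
            raise ConnectionError("injected fault")
        return self.func(*args, **kwargs)


def fetch_price(item):
    """Hypothetical dependency call."""
    return {"item": item, "price": 10}


def fetch_price_with_fallback(dependency, item):
    """Caller under test: degrade to a placeholder instead of failing outright."""
    try:
        return dependency(item)
    except ConnectionError:
        return {"item": item, "price": None}


flaky = FaultInjector(fetch_price, failure_rate=0.3, seed=42)
results = [fetch_price_with_fallback(flaky, "widget") for _ in range(100)]
degraded = sum(1 for r in results if r["price"] is None)
print(f"{degraded} of 100 calls hit an injected fault; none raised to the caller")
```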

Planning for high availability 

One of the best ways to build reliability and availability into your application is to use a microservices architecture. With microservices, you can establish distinct availability requirements for different services. For example, you could ensure that the most mission-critical elements of a customer-facing application are always available while investing fewer resources on less critical elements—even if those services are unavailable, customers can still have a good experience.
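
One way to reason about distinct availability requirements is to translate each target into the downtime it permits per year. The sketch below does that in Python; the service names and availability targets are hypothetical examples, not AWS recommendations.

```python
HOURS_PER_YEAR = 365 * 24  # ignores leap years for simplicity


def allowed_downtime_hours(availability):
    """Hours per year a service may be down while still meeting its target."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    return (1.0 - availability) * HOURS_PER_YEAR


# Hypothetical targets: a critical service and a less critical one.
targets = {"checkout": 0.9999, "recommendations": 0.99}

for service, target in targets.items():
    hours = allowed_downtime_hours(target)
    print(f"{service}: {target:.2%} target allows about {hours:.1f} hours of downtime per year")
```

Seeing the gap between the two budgets can help justify spending more on the critical path and less elsewhere.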

Planning for high availability should continue even after you launch your application. You should continuously refine your design as you experience events and failures in production.

Performance Efficiency  

Your review of the framework’s performance efficiency pillar should center on your architecture’s ability to maximize application performance while ensuring the application runs as efficiently as possible.

Selecting optimal architecture resources

There is no single optimal architecture for all applications. Each application will likely need its own architecture and combination of cloud services to deliver the highest level of performance. Fortunately, AWS offers a wide range of highly configurable services so you can customize your architecture to best support your application.

Tuning performance

Sustaining outstanding performance efficiency requires continuous monitoring and frequent tuning. AWS recommends first establishing key performance indicators—including both technical and business metrics. You should review those metrics regularly and identify any potential issues that require adjustments.

Right-sizing compute and storage

Continuous monitoring and tuning should also help you identify opportunities for right-sizing compute and storage resources. While you might find that you need to scale up some resources to achieve better performance, you could also pinpoint resources that could be decommissioned without affecting performance. AWS encourages organizations to capitalize on the cloud’s flexibility to avoid under- or over-utilizing resources. 
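
A simple right-sizing pass over monitoring data might look like the following Python sketch. The 20 and 80 percent CPU thresholds, the resource names, and the utilization figures are hypothetical; real decisions would also weigh memory, I/O, and peak load rather than average CPU alone.

```python
def classify(avg_cpu_percent, low=20.0, high=80.0):
    """Suggest an action from average CPU utilization (thresholds are assumptions)."""
    if avg_cpu_percent < low:
        return "downsize or decommission"
    if avg_cpu_percent > high:
        return "scale up"
    return "keep"


# Hypothetical average CPU utilization per resource over a review period.
utilization = {"web-1": 12.0, "web-2": 55.0, "batch-1": 93.0}

recommendations = {name: classify(cpu) for name, cpu in utilization.items()}
for name, action in recommendations.items():
    print(f"{name}: {action}")
```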

Cost Optimization

The cost optimization pillar examines ways to control costs when building and running applications in the cloud. The review of this pillar evaluates how effectively you choose cost-effective resources, analyze expenditures, and continuously optimize spending.

Choosing cost-effective resources

AWS offers numerous infrastructure options, giving you the flexibility to select the most cost-effective choices for your application. That selection process should include a thorough analysis of all application components: Each component might have multiple options for cloud resources. You should also consider how changes in usage might alter the cost arithmetic for certain services, since some services are more or less cost effective at different usage levels. 
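
To show how usage can change the cost arithmetic, the Python sketch below compares a pay-per-request option with a fixed monthly option and finds the break-even point. The prices are made-up placeholders, not actual AWS pricing, and the function names are hypothetical.

```python
def per_request_monthly_cost(requests, price_per_million):
    """Monthly cost of a usage-priced option."""
    return requests / 1_000_000 * price_per_million


def break_even_requests(fixed_monthly_cost, price_per_million):
    """Monthly request volume at which both options cost the same."""
    return fixed_monthly_cost / price_per_million * 1_000_000


def cheaper_option(requests, fixed_monthly_cost, price_per_million):
    """Name the cheaper option at a given monthly volume."""
    usage_cost = per_request_monthly_cost(requests, price_per_million)
    return "per-request" if usage_cost < fixed_monthly_cost else "fixed"


# Placeholder prices for illustration only.
FIXED_MONTHLY = 60.0
PRICE_PER_MILLION = 2.0

print("Break-even volume:", break_even_requests(FIXED_MONTHLY, PRICE_PER_MILLION))
for volume in (10_000_000, 50_000_000):
    print(volume, "requests ->", cheaper_option(volume, FIXED_MONTHLY, PRICE_PER_MILLION))
```

Below the break-even volume the usage-priced option wins; above it the fixed option does, which is why the same component can warrant a different choice as traffic grows.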

Analyzing usage and expenditures

To avoid overspending, your organization should continuously monitor usage and analyze expenditures. AWS provides tools to help, such as AWS Cost Explorer and AWS Budgets. In addition, outside organizations, including managed service providers, can assist you in evaluating your current costs and finding ways to save.

Optimizing over time

As your requirements and usage patterns change, you should continue to optimize your application architecture for costs over time. Your team should review new AWS services regularly and evaluate them against your changing business and technical requirements. Are your architectural decisions still the most cost effective? You might periodically find new, more cost-effective ways to build or run your applications.

Sustainability  

The review of the sustainability pillar assesses how well you have incorporated best practices for reducing your environmental impact in the cloud.

Evaluating resource efficiency

To pinpoint areas for improvement, first evaluate the efficiency of your current resources. Examine all the compute, storage, and networking resources used to support your application. Calculate emissions per unit of work. Use that data to establish key performance indicators and estimate the effect of proposed changes.
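
The "emissions per unit of work" calculation is a simple ratio, sketched below in Python. The emissions totals and transaction counts are invented for illustration, and choosing transactions as the unit of work is an assumption; your own unit might be API calls, reports generated, or patients served.

```python
def emissions_per_unit(total_emissions_kg, units_of_work):
    """Kilograms of CO2-equivalent per unit of work."""
    if units_of_work <= 0:
        raise ValueError("units_of_work must be positive")
    return total_emissions_kg / units_of_work


def relative_change(before, after):
    """Fractional change from before to after (negative means a reduction)."""
    return (after - before) / before


# Invented figures: monthly emissions and transactions, before and after a proposed change.
current = emissions_per_unit(total_emissions_kg=500.0, units_of_work=2_000_000)
proposed = emissions_per_unit(total_emissions_kg=350.0, units_of_work=2_000_000)

print(f"Current:  {current:.6f} kg CO2e per transaction")
print(f"Proposed: {proposed:.6f} kg CO2e per transaction")
print(f"Estimated change: {relative_change(current, proposed):.0%}")
```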

Minimizing environmental impact

As you look for opportunities to reduce your impact, AWS recommends focusing first on potential “hot spots,” such as large deployments and frequently used resources. Your goal is to find ways to improve resource utilization and reduce the total resources required for your application—all without jeopardizing your application’s ability to address business goals. 

Some improvements could be simple: For example, you could reduce your power consumption by maximizing utilization of one host instead of using two underutilized hosts. You should also decommission any idle resources—including servers and storage—so you can reduce the total energy consumed by your application.

AWS offers several additional best practices for improving sustainability. For example, you might consider running your application close to an Amazon renewable energy project or in a region with a low carbon intensity. You could also optimize areas of your code that consume the most time or resources. And you might consider using managed services, which are shared among multiple organizations, to help reduce the total amount of cloud resources used to support everyone’s applications.

The Review Process 

How do you conduct the Well-Architected Framework Review? And how does it differ from an audit?

Follow a consistent, blame-free approach

As AWS notes, the framework review process should be conducted in a “consistent manner, with a blame-free approach.” While healthcare organizations might be familiar with stressful, time-consuming audit processes for regulatory compliance, AWS recommends treating an architectural framework review as a conversation rather than an audit.

That conversation should fit within the overall process of learning AWS best practices, measuring your architecture against best practices, spotting risks, and creating a plan to address those risks. The review process should help identify opportunities for improvement as well as issues that must be addressed. At the end of the review, you should have a set of recommended actions that can ultimately help you improve the experiences of application users.
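
Turning review findings into a set of recommended actions can be as simple as filtering and ordering by risk. The Python sketch below assumes each finding carries a pillar, a question, and a risk label of HIGH, MEDIUM, or NONE; the `Finding` class and the sample findings are hypothetical and not the output format of any AWS tool.

```python
from dataclasses import dataclass

# Lower number means higher priority.
RISK_ORDER = {"HIGH": 0, "MEDIUM": 1}


@dataclass
class Finding:
    pillar: str
    question: str
    risk: str  # "HIGH", "MEDIUM", or "NONE"


def improvement_plan(findings):
    """Keep actionable findings and list high-risk items before medium-risk ones."""
    actionable = [f for f in findings if f.risk in RISK_ORDER]
    # sorted() is stable, so findings with equal risk keep their original order.
    return sorted(actionable, key=lambda f: RISK_ORDER[f.risk])


# Hypothetical findings from a review conversation.
findings = [
    Finding("Reliability", "How do you back up data?", "MEDIUM"),
    Finding("Security", "How do you manage identities?", "HIGH"),
    Finding("Cost Optimization", "How do you monitor usage?", "NONE"),
    Finding("Operational Excellence", "How do you respond to incidents?", "HIGH"),
]

for item in improvement_plan(findings):
    print(f"[{item.risk}] {item.pillar}: {item.question}")
```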

Dive deep into architecture decisions

Even if the review process takes only hours or days, instead of weeks or months, you should still dive deep into architectural decisions. If you begin the review during your development process, you will have opportunities to make changes that can have a dramatic, positive impact on the operational excellence, security, reliability, performance efficiency, cost effectiveness, and sustainability of your application.

AWS recommends having the team members who built the architecture review it according to the framework’s design principles and best practices. These reviews should be conducted at key milestones in the product lifecycle, from the pre-launch design to post-launch phases, as your team continues to make architectural changes.

Use the AWS Well-Architected Tool

The AWS Well-Architected Tool is a cloud service that can help your organization maintain a consistent process for evaluating your architecture against AWS best practices. Available through the AWS Management Console, the tool can help you document architectural decisions, assist with launch governance, identify potential high-risk issues, and guide you through architecture improvements. The tool can be used throughout your architecture design, implementation, and review processes.

Work with external partners

Many organizations will benefit from working with an AWS Well-Architected Partner to navigate the Well-Architected Framework and to conduct a Well-Architected Framework Review. The partner can help streamline the review process, which is especially helpful if you have a complex architecture or your team lacks expertise with the Well-Architected Framework. Beyond evaluating your architecture, the partner can help you optimize that architecture for building secure, efficient, and reliable applications on AWS.

Start Planning for Your Review  

The AWS Well-Architected Framework provides a comprehensive collection of design principles and best practices for building and operating secure, high-performing, resilient, and efficient cloud environments for your applications. The AWS Well-Architected Framework Review process is instrumental in assessing how well you have adopted those principles and practices. It provides a consistent, methodical process for identifying potential risks and finding areas for improvement.

Get Your AWS Well-Architected Review for Healthcare Now

Ready to start planning your AWS Well-Architected Framework Review? As an AWS Well-Architected Partner, Cloudticity can help. Sign up now to see if you’re eligible for a free Well-Architected Review today!