AWS Well-Architected Framework: Six Key Pillars | Cloudticity

Written by Josh Ray | Aug 27, 2024 4:18:37 PM

Healthcare organizations are increasingly tapping into public cloud services to build innovative applications and services. But architecting optimal cloud environments can be a daunting task—especially for teams that are used to on-premises infrastructure.

Amazon Web Services (AWS) created the AWS Well-Architected Framework to help cloud architects build and operate secure, high-performing, resilient, and efficient cloud environments for their applications. The framework is built on six pillars: operational excellence, security, reliability, performance efficiency, cost optimization, and sustainability. By drawing on the design principles and best practices provided for each pillar, teams can develop a consistent approach to evaluating and optimizing their architectures on AWS.

1. Operational Excellence

The operational excellence pillar provides guidance about designing, deploying, running, and monitoring systems, as well as improving processes.

Implementing best practices for design and deployment

The quest for operational excellence starts with a clear understanding of business goals. Your team members should know how those goals connect to the application, and how each of them can support these business goals through their work.

As you design the application, AWS recommends using a scalable, loosely coupled approach so you can update components regularly. At the same time, the application should generate all the metrics you need to monitor its internal state at all times.

You should implement processes that enable you to make changes rapidly so you can respond to quality issues, shifting priorities, or other feedback. Employing automation as part of that process can help you make small changes quickly and efficiently.

Streamlining operations through automation and monitoring

Continuous application monitoring is essential for streamlining operations and improving performance. You should establish performance baselines, monitor for deviations, and make sure you receive alerts when something changes. Automation of operational health monitoring and observability can help you capture all the metrics you need without overextending your team.
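
To make the baseline-and-deviation idea concrete, here is a minimal Python sketch. It assumes a simple statistical baseline (mean and standard deviation of recent samples) and a hypothetical three-sigma alert rule. The sample latencies and the function names are illustrative only and are not part of any AWS service.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarize historical samples as (mean, sample standard deviation)."""
    return mean(samples), stdev(samples)


def is_deviation(value, baseline, max_sigma=3.0):
    """Return True when value is more than max_sigma deviations from the mean."""
    avg, sd = baseline
    if sd == 0:
        # A perfectly flat history: any change at all counts as a deviation.
        return value != avg
    return abs(value - avg) / sd > max_sigma


# Hypothetical response times in milliseconds from a quiet period.
history_ms = [120, 118, 125, 122, 119, 121, 123, 120]
baseline = build_baseline(history_ms)

# Only readings far outside the baseline should raise an alert.
alerts = [v for v in [121, 124, 190] if is_deviation(v, baseline)]
```

In practice a managed monitoring service would usually collect the metrics and evaluate the alarm for you. The sketch only shows the reasoning behind a baseline-driven alert.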

Continuous improvement and learning from operational events

Sustaining operational excellence requires continuous learning, sharing, and improvement. In addition to making incremental improvements, you should analyze every incident that has affected customers. Each one is an opportunity to pinpoint, and then address, root causes and contributing factors. Whatever changes you make, communicate them to all affected stakeholders and teams.

2. Security

The focus of the security pillar is protecting information and systems. AWS provides best practices for maintaining the confidentiality and integrity of data, managing user permissions, and implementing controls to detect security events.

Protecting data, apps, and systems from threats

A defense-in-depth or multi-layered security approach is key for safeguarding your data, apps, and infrastructure in the cloud. To protect data, AWS recommends encrypting data and regularly rotating keys. You can also capitalize on AWS storage designed to provide high levels of data durability and resilience.

Enhancing application security begins during development. You should incorporate security into design, build, and test phases. Doing so can help you improve quality, accelerate time to market, and avoid costly, release-stopping problems late in development.

Infrastructure protection might include stateful and stateless packet inspection. AWS also suggests using Amazon Virtual Private Cloud (Amazon VPC) to create a private, secured, and scalable environment.

Implementing robust identity and access management

Identity and access management capabilities are critical for protecting apps and data. AWS recommends establishing role-based policies, which can help prevent unauthorized access and restrict authorized users to only the systems they need. With AWS Identity and Access Management (IAM), you can also require strong passwords and implement multi-factor authentication (MFA), adding another layer of access control.
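
As a rough illustration of restricting access to only what is necessary, the Python sketch below builds an IAM policy document that allows read-only access to objects in a single S3 bucket, and only when the caller has authenticated with MFA. The bucket name, the statement ID, and the helper function are hypothetical. The policy grammar (`Version`, `Statement`, `Effect`, `Action`, `Resource`, `Condition`) and the `aws:MultiFactorAuthPresent` condition key follow the standard IAM JSON policy format.

```python
import json


def read_only_bucket_policy(bucket_name):
    """Build a least-privilege policy: read objects in one bucket, MFA required."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadObjectsWithMfa",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": ["s3:GetObject"],  # a single read action, no writes
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                "Condition": {
                    "Bool": {"aws:MultiFactorAuthPresent": "true"}
                },
            }
        ],
    }


# "example-clinical-reports" is a made-up bucket name for illustration.
policy = read_only_bucket_policy("example-clinical-reports")
policy_json = json.dumps(policy, indent=2)
```

The resulting JSON could be attached to a role rather than to individual users, which keeps permissions tied to a job function instead of a person.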

Adhering to compliance requirements

For healthcare organizations especially, security and compliance efforts must be closely aligned. AWS can help organizations maintain HIPAA compliance and achieve HITRUST certification. More than 160 AWS services are HITRUST certified, and any organization using those services can inherit controls from AWS, applying them to their own HITRUST assessment.

As part of the Well-Architected Framework documentation, AWS highlights how it can help organizations adhere to data residency and data localization rules. For example, AWS will not move data from one region to another unless an organization chooses a feature or service that provides that functionality.

3. Reliability

For the reliability pillar, AWS provides guidance for achieving high availability and for quickly recovering from failures.

Ensuring fault tolerance and high availability of applications

How can you ensure that your application is available whenever users need it? More specifically, how can you be sure they can keep using your application even if something small goes wrong behind the scenes?

Delivering high availability does not mean guaranteeing that everything will operate perfectly all the time. It means people should be able to use your application consistently and reliably, even if some components of your architecture temporarily fail.

AWS offers design principles and best practices to help you achieve high availability and fault tolerance. For example, AWS recommends replacing any large resource with multiple small ones so that a single failure does not bring down the entire application. You might also implement fault-isolated boundaries. These boundaries limit the impact of a failure within a workload to a certain number of components. Components beyond the boundary are unaffected and can continue to operate.
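
A small Python sketch shows why several small resources limit the impact of a single failure. It assumes equally sized nodes that share load evenly, which is a simplification. The function name is illustrative.

```python
def capacity_after_failures(node_count, failed=1):
    """Fraction of total capacity still serving after `failed` nodes go down.

    Assumes every node contributes an equal share of capacity.
    """
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    healthy = max(node_count - failed, 0)
    return healthy / node_count


# One large resource: a single failure removes all capacity.
single_large = capacity_after_failures(1)

# Four or ten smaller resources: the same single failure removes only a slice.
four_small = capacity_after_failures(4)
ten_small = capacity_after_failures(10)
```

With one node, a failure leaves no capacity. With four nodes, three quarters of capacity remains, and with ten nodes, nine tenths remains.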

Implementing backup and disaster recovery strategies

Regularly backing up data, and testing that you can restore from those backups, is important for maintaining reliability and availability. If you encounter an error or face a cyberattack, you need to know that you can quickly restore data from a backed-up copy.

Similarly, developing and testing a robust disaster recovery strategy is vital. You need to know that you can restore data and use redundant systems to resume operations quickly in the event of a problem. In addition to setting recovery-time and recovery-point objectives (RTOs and RPOs), you’ll need a precise plan for achieving those objectives, which might require you to shift where your data is located or where app components are running.
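
The relationship between a backup schedule and these objectives can be checked with simple arithmetic. The Python sketch below assumes that worst-case data loss equals one full backup interval and that restore time has been measured in a test. The specific durations are hypothetical.

```python
from datetime import timedelta


def meets_rpo(backup_interval, rpo):
    """Worst-case data loss is one full interval, so it must not exceed the RPO."""
    return backup_interval <= rpo


def meets_rto(measured_restore_time, rto):
    """The restore time observed in a recovery test must fit within the RTO."""
    return measured_restore_time <= rto


# Hypothetical objectives: lose at most 1 hour of data, recover within 4 hours.
rpo = timedelta(hours=1)
rto = timedelta(hours=4)

nightly_ok = meets_rpo(timedelta(hours=24), rpo)          # nightly backups
frequent_ok = meets_rpo(timedelta(minutes=15), rpo)       # 15-minute snapshots
restore_ok = meets_rto(timedelta(hours=2, minutes=30), rto)
```

Here nightly backups would miss a one-hour RPO, while 15-minute snapshots would satisfy it. That gap is the kind of finding that might lead you to change where data is stored or how often it is copied.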

Monitoring and testing for resilience

Continuous monitoring is critical for strengthening resilience. Once you have policies for automated recovery in place, you can monitor workloads against key performance indicators (KPIs) that reflect the business criteria for the workload’s performance. When a KPI crosses the threshold you set, automated recovery can begin.
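
One way to picture threshold-driven recovery is the Python sketch below. It assumes a hypothetical rule, not prescribed by AWS, that recovery starts only after several consecutive readings exceed the threshold, which helps avoid reacting to a single noisy data point. The KPI values are made up.

```python
def should_trigger_recovery(readings, threshold, consecutive=3):
    """Return True when the latest `consecutive` readings all exceed threshold."""
    if consecutive < 1:
        raise ValueError("consecutive must be at least 1")
    if len(readings) < consecutive:
        return False  # not enough data to decide yet
    return all(r > threshold for r in readings[-consecutive:])


# Hypothetical KPI: percentage of failed requests per minute.
error_rate_percent = [0.4, 0.6, 5.2, 6.1, 7.8]
trigger = should_trigger_recovery(error_rate_percent, threshold=5.0)
```

The last three readings are all above 5.0, so this example would start recovery. A single spike followed by a normal reading would not.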

To make sure your recovery strategy will work when you need it, you need to test it. You should test how your workload might fail in different scenarios and then validate that your recovery procedure is effective.

4. Performance Efficiency

The performance efficiency pillar centers on guidelines for not only delivering high-performing applications but also ensuring that they run efficiently, making the most of cloud resources while controlling costs.

Optimizing resource utilization

To deliver the best performance while controlling costs, you need to find ways to optimize resource utilization from the start. For example, AWS recommends employing serverless architectures, which remove the need for you to run and maintain servers yourself. Serverless architectures can also lower transactional costs because the underlying managed services operate at cloud scale.

Selecting the right services and resources for workloads

Each application will likely need a unique set of cloud services. And as AWS notes, many applications need multiple solutions and features to deliver the best performance. Fortunately, AWS provides a wide range of highly configurable services so you can customize the resources to best support each application and optimize performance efficiency.

Monitoring and tuning performance

Sustaining optimal performance requires continuous monitoring and frequent tuning. AWS recommends establishing key performance indicators—including both technical and business metrics. Review those metrics regularly. Employing visualization capabilities can make it easier to spot potential issues that require tuning. At the same time, you should right-size resources based on actual usage patterns to avoid over- and under-provisioning.

5. Cost Optimization

Just as performance efficiency seeks to run applications efficiently, the cost optimization pillar examines all the ways to optimize costs when building and running applications in the cloud.

Leveraging cost-effective resources and AWS pricing models

AWS offers a number of ways to control costs. For example, because there are numerous infrastructure options, you can select the most cost-effective choices for your applications. You can also select from multiple pricing models: On-Demand Instances, which use a pay-as-you-go consumption model; Reserved Instances, which offer discounted rates in exchange for a fixed-term commitment; or Spot Instances, which use spare cloud capacity when it’s available.
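
The trade-off between paying for consumption and committing to a fixed term can be sketched with simple arithmetic. The hourly rates below are invented for illustration and are not actual AWS prices. The sketch also assumes that a commitment is paid for every hour of the month, whether or not the instance is used.

```python
HOURS_PER_MONTH = 730  # common approximation: 8,760 hours per year / 12


def on_demand_cost(rate, hours_used):
    """Consumption model: pay only for the hours actually used."""
    return rate * hours_used


def reserved_cost(rate):
    """Commitment model (as assumed here): pay for every hour in the month."""
    return rate * HOURS_PER_MONTH


def cheaper_option(hours_used, on_demand_rate, reserved_rate):
    """Name the cheaper model for a given number of hours used per month."""
    od = on_demand_cost(on_demand_rate, hours_used)
    rs = reserved_cost(reserved_rate)
    return "on_demand" if od < rs else "reserved"


# Hypothetical rates in dollars per hour.
ON_DEMAND_RATE = 0.10
RESERVED_RATE = 0.06

part_time = cheaper_option(200, ON_DEMAND_RATE, RESERVED_RATE)
always_on = cheaper_option(730, ON_DEMAND_RATE, RESERVED_RATE)

# Usage level at which both models cost the same.
break_even_hours = RESERVED_RATE * HOURS_PER_MONTH / ON_DEMAND_RATE
```

Under these assumed rates, a workload running 200 hours a month is cheaper on demand, an always-on workload is cheaper with a commitment, and the break-even point is about 438 hours a month.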

As AWS notes, you might also find that using managed services can help reduce costs. You can eliminate the expenses of training staff and managing systems yourself.

Implementing cost-monitoring and optimization strategies

The cloud’s ease of use and simple scalability mean that cloud costs can escalate rapidly if they are not closely monitored. When organizations move to the cloud, many need to start tracking the spending of distinct teams, ensuring that there are no duplicative services across the organization. They also need to monitor the status of projects to make sure resources are decommissioned when they are no longer being used.

AWS provides tools to help you monitor usage and costs. In addition, outside organizations, including managed service providers, can help you evaluate your current costs and find ways to save.

Your teams should also review new AWS services regularly and evaluate them against your changing business and technical requirements. You might periodically find new, more cost-effective ways to build or run your applications.

Right-sizing resources based on demand and usage patterns

Your demand and usage patterns are likely to change over time. You might need more resources to handle periodic spikes in usage or fewer resources when development projects are completed. Capitalize on the tremendous flexibility of the cloud and be sure to right-size your resources as your needs change. With a pay-as-you-go pricing plan, you could reduce costs considerably by re-evaluating resource needs frequently.
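
A right-sizing review can start from something as simple as the Python sketch below. It assumes CPU utilization samples are available as percentages and uses hypothetical thresholds of 20% and 80% on the observed peak. Real reviews would normally consider memory, network, and storage as well.

```python
def rightsizing_advice(cpu_percent_samples, low=20.0, high=80.0):
    """Suggest an action from the peak CPU utilization in the samples."""
    if not cpu_percent_samples:
        raise ValueError("at least one utilization sample is required")
    peak = max(cpu_percent_samples)
    if peak < low:
        return "downsize"  # even the busiest moment leaves most capacity idle
    if peak > high:
        return "upsize"    # peaks are close to saturating the resource
    return "keep"


# Hypothetical utilization samples for three workloads.
idle_dev_server = rightsizing_advice([3.0, 5.5, 8.0, 4.2])
busy_api_server = rightsizing_advice([55.0, 72.0, 91.5, 88.0])
steady_batch_job = rightsizing_advice([40.0, 45.0, 52.0])
```

Running a check like this on a schedule is one way to turn occasional right-sizing into a routine habit.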

6. Sustainability

The sustainability pillar highlights ways to reduce your environmental impact as you develop and run applications in the cloud.

Maintaining efficient resource utilization

The most direct way to reduce the environmental impact of your applications is through efficient resource utilization. For example, you can reduce your power consumption by maximizing utilization of one host instead of using two underutilized hosts.

If at all possible, your applications should be architected for efficient resource utilization from the beginning. Once your applications are up and running, decommissioning idle resources—including servers and storage—can reduce the total energy consumed by your applications.

Establishing goals and implementing sustainable practices

Setting sustainability goals early in your development process can help you build efficiency into your applications. To achieve goals for reduced energy consumption, for example, you might focus on reducing the compute and storage resources needed per transaction. You should also anticipate growth and design your applications so you can scale as efficiently as possible.
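
To track a goal such as reducing compute per transaction, you might normalize resource use by the work done. The Python sketch below uses a hypothetical metric, vCPU-hours per thousand transactions, with made-up quarterly figures.

```python
def vcpu_hours_per_thousand_transactions(vcpu_hours, transactions):
    """Normalize compute consumption by the amount of work performed."""
    if transactions <= 0:
        raise ValueError("transactions must be positive")
    return vcpu_hours * 1000 / transactions


# Hypothetical figures: traffic grew, but total compute went down.
last_quarter = vcpu_hours_per_thousand_transactions(500, 2_000_000)
this_quarter = vcpu_hours_per_thousand_transactions(450, 3_000_000)

improved = this_quarter < last_quarter
```

A per-transaction metric keeps the goal meaningful as usage grows, because total consumption alone can rise even while the application becomes more efficient.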

AWS recommends exploring a full range of sustainable practices for running your applications. For example, you might determine that using managed services, which are shared among organizations, can help reduce the amount of total cloud resources used to support everyone’s applications.

Monitoring and optimizing for energy efficiency

Whatever your goals, continuous monitoring will be crucial. By monitoring utilization of cloud resources, you can identify underutilized systems and consolidate infrastructure. Monitoring application activity can help you discover opportunities for removing components, optimizing code, or refactoring applications to enhance efficiency. You should also keep tabs on new releases of AWS instance types so you can take advantage of new energy-saving offerings.

Design Principles

In addition to providing specific principles for each pillar, AWS offers design principles for the Well-Architected Framework more generally. The high-level design principles are focused on optimizing scalability, resilience, and efficiency.

Capitalize on cloud flexibility: Using cloud resources, you can scale up or down, automatically, as your needs change. AWS encourages organizations to take advantage of this flexibility so they can stop guessing capacity needs.
Test systems at production scale: Because you have virtually unlimited resources at your disposal, you can—and should—create production-scale test environments. Running tests at scale will help you optimize performance and reliability in the long run.
Automate with architectural experimentation in mind: By helping to increase operational efficiency, automation can let you experiment with your architecture more quickly and easily.
Consider evolutionary architectures: With cloud services, your architecture can evolve along with your business goals and technical needs. You do not need to stay locked into early architectural decisions, as you might have been in traditional on-premises environments.
Drive architectures using data: As you consider making architectural changes, use data on your application performance, efficiency, and availability to support your decisions.
Improve through “game” days: AWS recommends testing your architecture and processes by scheduling “game” days to simulate events in production. This approach can give you a clearer sense of what is working and what isn’t.

Real-World Examples

Organizations across industries have used the AWS Well-Architected Framework to accelerate their migration to the cloud and optimize cloud architectures. For example, the Japanese digital advertising company CyberAgent used the framework to learn about AWS and identify potential business risks. The framework’s best practices enabled the company to implement architectural changes to address those risks.

NEC Corporation, the multinational IT and electronics company, conducted several AWS Well-Architected Framework Reviews—the process for evaluating architectures in relation to the framework. In one case, teams conducted the review early in the design process to find and resolve potential issues. As a result, the company was able to speed delivery of services to its clients.

Getting Started with a Well-Architected Framework for Healthcare

AWS provides a wealth of resources for getting started with the Well-Architected Framework. In addition to a comprehensive document that describes design principles and best practices for each pillar, AWS offers hands-on labs and a free tool that evaluates workloads, identifies risks, and records improvements.

AWS also offers several “lenses”—white papers on more focused topics. One is centered on how to build and manage healthcare workloads on AWS. That paper covers the same high-level best practices but with a healthcare emphasis. It also explores architectural characteristics for electronic health records, healthcare interoperability, medical imaging, healthcare analytics, and machine learning for healthcare.

Many organizations will benefit from working with a member of the AWS Well-Architected Partner Program to navigate the Well-Architected Framework. A partner can do some of the heavy lifting involved with reviewing existing or proposed architectures against the framework. And the partner can help optimize that architecture for building secure, efficient, and reliable applications on AWS.

Ready For a Well-Architected Review for Healthcare?

You might be curious how your cloud environment stacks up against Well-Architected benchmarks. If you’re in healthcare, Cloudticity can conduct a Well-Architected Review at no cost to you. Sign up now to see if you’re eligible and get your Well-Architected Review today!
