
Best practices for AWS cloud architecture design: cloud architecture design principles (PDF download attached)

2025-02-24 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/02 Report--

Translator's preface

AWS has a broad user base and a complex product line. The AWS white paper "Architecting for the Cloud: AWS Best Practices" introduces best practices for cloud architecture in common scenarios, and is a useful reference not only for AWS users but for cloud users in general. An engineer at New Titanium Cloud Services translated this white paper for cloud users.

A mind map compiled by the translator

This manual is divided into two parts:

Part I: the differences between traditional and cloud environments (covered in the previous article)

Part II: cloud architecture design principles

Design principles

AWS offers a number of design patterns and architectural options that can be applied to a wide variety of use cases. Some key design principles of AWS include scalability, disposable resources, automation, loose coupling, services rather than servers, and flexible data storage options.

4.1 Scalability

Systems that are expected to grow over time need to be built on a scalable architecture. Such an architecture can support growth in users, traffic, or data size without any drop in performance. Resources should scale in a roughly linear manner: adding a resource should yield at least a proportional increase in the ability to serve additional load. Growth should introduce economies of scale, and cost should follow the same dimension that generates business value from the system. Although cloud computing provides almost unlimited on-demand capacity, your design needs to be able to take advantage of those resources seamlessly.

There are usually two ways to extend IT architecture: scale-up and scale-out.

4.1.1 Scaling up

Scaling up is achieved by increasing the size of an individual resource, such as upgrading to a server with larger hard drives or a faster CPU. With Amazon EC2, you can stop an instance and resize it to an instance type with more RAM, CPU, I/O, or networking capacity. This kind of scaling eventually reaches a limit, and it is not always a cost-effective or highly available approach. However, it is easy to implement and is sufficient for many use cases, especially in the short term.

4.1.2 Scaling out

Scale out by increasing the number of resources, such as adding more hard drives to the storage array, or adding more servers to support applications. This is a good way to build Internet-scale applications that take advantage of cloud elasticity. Not all architectures are designed to allocate their workloads to multiple resources, so let's look at some possible scenarios.

1) Stateless applications

When a user or service interacts with an application, it usually performs a series of interactions that form a session. Session data is the data that must persist between a user's requests while they use the application. A stateless application is one that needs no knowledge of previous interactions and stores no session information: given the same input, it provides the same response to any end user. A stateless application can scale out because any available computing resource, such as an EC2 instance or an AWS Lambda function, can serve any request. With no session data to store, you can add more computing resources as needed, and when that capacity is no longer required, individual resources can be safely terminated once their running tasks have drained. These resources do not need to be aware of their peers; all that is required is a way to distribute the workload among them.
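A minimal sketch of the statelessness property described above: the handler computes its response purely from the request payload and keeps nothing on the instance, so any copy of the service can answer any request. All names and payload fields here are illustrative.

```python
def handle_request(payload: dict) -> dict:
    """Compute a response purely from the request payload; no session state."""
    items = payload.get("items", [])
    return {"count": len(items), "total": sum(items)}

# Two independent "instances" of the service are fully interchangeable,
# because neither holds any state between requests:
instance_a = handle_request
instance_b = handle_request

request = {"items": [3, 4, 5]}
response = instance_a(request)
assert response == instance_b(request)  # same input, same output, any instance
```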

2) Distributing load across multiple nodes

To distribute the workload to multiple nodes in the environment, you can choose a push model or a pull model.

With the push model, you can use Elastic Load Balancing (ELB) to distribute the workload. ELB routes incoming application requests across multiple EC2 instances. The Network Load Balancer operates at layer 4 of the Open Systems Interconnection (OSI) model and can handle millions of requests per second. With container-based services, you can also use an Application Load Balancer, which operates at layer 7 of the OSI model and supports content-based routing of application traffic. Alternatively, you can implement DNS round robin using Amazon Route 53. In this case, DNS responses return an IP address from a list of valid hosts in a round-robin fashion. Although easy to implement, this approach does not always adapt well to the elasticity of cloud computing: even if you set low time-to-live (TTL) values for your DNS records, caching DNS resolvers are outside the control of Amazon Route 53 and may not always honor your settings.
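A toy illustration of the DNS round-robin idea above: each lookup returns the next address from the list of healthy hosts in circular order. Real Route 53 behavior also involves TTLs and resolver caching, as the text notes; the addresses here are made up.

```python
import itertools

hosts = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical host IPs
rotation = itertools.cycle(hosts)

def resolve() -> str:
    """Return the next host address in round-robin order."""
    return next(rotation)

# Five consecutive lookups wrap around the host list:
answers = [resolve() for _ in range(5)]
```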

Instead of a load-balancing solution, you can implement a pull model for asynchronous, event-driven workloads. In a pull model, tasks to be performed or data to be processed are stored as messages in a queue using Amazon Simple Queue Service (Amazon SQS), or in a streaming data solution such as Amazon Kinesis; multiple computing resources can then pull and consume these messages and process them in a distributed fashion.
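A minimal pull-model sketch using only the standard library: producers put messages on a queue (a stand-in for Amazon SQS) and workers pull and process them at their own pace. The job structure is illustrative.

```python
import queue

work = queue.Queue()            # stands in for an SQS queue
for i in range(6):
    work.put({"job_id": i})     # producer enqueues jobs

def worker(q: queue.Queue) -> list:
    """Pull messages until the queue is empty; more workers could share it."""
    done = []
    while not q.empty():
        msg = q.get()
        done.append(msg["job_id"])
        q.task_done()
    return done

processed = worker(work)        # a second worker could pull from the same queue
```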

3) Stateless components

In practice, most applications maintain some kind of state information. For example, a web application needs to track whether a user is signed in so that personalized content can be presented based on previous actions. Automated multi-step processes also need to track previous activity to decide what the next step should be. You can still make parts of these architectures stateless by not storing anything that needs to persist across more than one request in the local file system.

For example, web applications can use HTTP cookies to store session information (such as shopping cart items) in the web client's cache. The browser passes this information back to the server with each subsequent request so that the application does not need to store it. However, this approach has two drawbacks. First, the contents of HTTP cookies can be tampered with on the client side, so you should always treat them as untrusted data that must be validated. Second, HTTP cookies are transmitted with every request, so you should keep their size to a minimum to avoid unnecessary latency.

Consider storing only a unique session identifier in the HTTP cookie and keeping more detailed user session information on the server side. Most programming platforms provide a native session-management mechanism that works this way. However, by default, user session information is often stored on the local file system, resulting in a stateful architecture. A common solution to this problem is to store the information in a database. Amazon DynamoDB is a good choice because of its scalability, high availability, and durability. For many platforms, there are open-source drop-in replacement libraries that let you store native sessions in Amazon DynamoDB.
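A sketch of the pattern just described: only an opaque session ID goes into the cookie, while the detailed session state lives server-side. The in-memory dict here is a stand-in for a shared store such as Amazon DynamoDB; field names are illustrative.

```python
import secrets

session_store: dict[str, dict] = {}   # stand-in for DynamoDB

def create_session(user: str) -> str:
    """Create server-side session state; return the opaque cookie value."""
    sid = secrets.token_hex(16)       # 32 hex chars; only this goes client-side
    session_store[sid] = {"user": user, "cart": []}
    return sid

def add_to_cart(sid: str, item: str) -> None:
    session_store[sid]["cart"].append(item)   # detail stays on the server

sid = create_session("alice")
add_to_cart(sid, "book")
```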

Other scenarios require storing larger files, such as user uploads and intermediate results of batch processes. By placing these files in a shared storage layer such as Amazon Simple Storage Service (Amazon S3) or Amazon Elastic File System (Amazon EFS), you can avoid introducing stateful components.

Finally, complex multi-step workflows are another example where the current state of each execution must be tracked. You can use AWS Step Functions to centrally store execution history and make these workloads stateless.

4) Stateful components

Inevitably, some layers of your architecture will not be turned into stateless components. Databases, by definition, are stateful (for more information, see the Database section below). In addition, many legacy applications were designed to run on a single server and rely on local compute resources. Other use cases may require client devices to maintain a long-lived connection to a specific server. For example, real-time multiplayer games must present a consistent view of the game world to multiple players with very low latency, which is much simpler to achieve in a non-distributed implementation where all participants connect to the same server.

You can still scale these components horizontally by distributing the load across multiple nodes with session affinity: all transactions of a session are bound to a specific compute resource. This model does have some limitations. Existing sessions do not directly benefit from newly launched compute nodes. More importantly, session affinity cannot be guaranteed: when a node terminates or becomes unavailable, users bound to it are disconnected and lose any session-specific data that is not stored in a shared resource such as Amazon S3, Amazon EFS, or a database.

5) Implementing session affinity

For HTTP and HTTPS traffic, you can use the sticky sessions feature of the Application Load Balancer to bind a user's session to a specific instance. With this feature, the Application Load Balancer tries to route the same user to the same server for the duration of the session. Another option, if you control the code that runs on the client, is client-side load balancing. This adds extra complexity but is useful when a load balancer does not meet your requirements: for example, you may be using a protocol that ELB does not support, or you may need full control over how users are assigned to servers (in a gaming scenario, you may need to make sure matched participants connect to the same server). In this model, the client needs a way to discover a valid server endpoint to connect to directly. You can use DNS for this, or you can build a simple discovery API that provides the information to software running on the client. In the absence of a load balancer, you also need to implement health checks on the client side. You should design the client logic so that when a server is detected to be unavailable, the device reconnects to another server with little disruption to the application.
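A hedged sketch of client-side load balancing with session affinity: a user is deterministically mapped to one server by hashing, and when that server fails a health check the client re-picks among the remaining healthy servers. The endpoints and user ID are made up.

```python
import hashlib

servers = ["game-1.example", "game-2.example", "game-3.example"]  # hypothetical

def pick_server(user_id: str, healthy: list[str]) -> str:
    """Deterministically bind a user to one of the currently healthy servers."""
    h = int(hashlib.sha256(user_id.encode()).hexdigest(), 16)
    return healthy[h % len(healthy)]

first = pick_server("player-42", servers)
# If that server becomes unavailable, reconnect to another healthy one:
fallback = pick_server("player-42", [s for s in servers if s != first])
```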

6) Distributed processing

Use cases that involve processing more data than a single compute resource can handle in a timely manner require a distributed processing approach. By dividing a task and its data into many small fragments of work, you can execute them in parallel across a set of computing resources.

7) Implementing distributed processing

Offline batch jobs can be scaled horizontally using distributed data-processing engines such as AWS Batch, AWS Glue, and Apache Hadoop. On AWS, you can use Amazon EMR to run Hadoop workloads on a fleet of EC2 instances without the operational complexity. For real-time processing of streaming data, Amazon Kinesis partitions data into multiple shards, which can then be consumed by multiple Amazon EC2 or AWS Lambda resources for scalability.
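The divide-and-combine idea described above can be sketched with a simple word count: the input is split into small chunks, each chunk is processed independently (here serially, but in practice across many workers or shards), and the partial results are merged. The data is illustrative.

```python
from collections import Counter

def chunks(seq, size):
    """Split a sequence into fragments of at most `size` items."""
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

words = ["a", "b", "a", "c", "b", "a", "d", "c"]

# Each chunk could be handled by a separate worker in parallel:
partials = [Counter(chunk) for chunk in chunks(words, 3)]

# Combine step: merge the partial counts into the final result.
total = sum(partials, Counter())
```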

For more information about these types of workloads, see the "Big Data Analytics Options on AWS" and "Core Tenets of IoT" white papers.

4.2 Disposable resources instead of fixed servers

In a traditional infrastructure environment, you have to work with fixed resources because of the upfront cost and lead time of introducing new hardware. This encourages practices such as manually logging in to servers to configure software or fix problems, hard-coding IP addresses, and running tests or processing jobs sequentially.

When designing for AWS, you can take advantage of the dynamic configuration features of cloud computing. Servers and other components can be treated as temporary resources. You can start as many instances as you need and use them only when needed.

Another problem with long-running servers is configuration drift. Changes and software patches applied over time can produce untested, heterogeneous configurations across different environments. You can solve this problem with the immutable infrastructure pattern: a server, once launched, is never updated. Instead, when a problem occurs or an update is needed, the problem server is replaced with a new server that has the latest configuration. This keeps resources in a consistent (and tested) state and makes rollbacks easier. This approach is easier to support with a stateless architecture.

4.2.1 Instantiating compute resources

Whether you are deploying a new environment for testing or increasing the capacity of an existing system to cope with additional load, you do not want to set up new resources, along with their configuration and code, manually. It is important that this be an automated, repeatable process that avoids long delivery cycles and human error. There are several ways to achieve this.

1) Bootstrapping

When you launch an AWS resource such as an EC2 instance or an Amazon Relational Database Service (Amazon RDS) database instance, it starts with a default configuration. You can then run bootstrapping actions: scripts that install software or copy data to bring the resource to a particular state. Configuration details that vary between environments, such as production and test, can be parameterized so that the same scripts can be reused without modification.

You can set up new EC2 instances with user-data scripts and cloud-init directives. You can use simple scripts or configuration-management tools such as Chef or Puppet. In addition, with custom scripts and the AWS APIs, or with AWS CloudFormation support for custom resources, you can write provisioning logic that applies to almost any AWS resource.
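As a concrete illustration, a cloud-init user-data fragment of the kind mentioned above might look like the following. This is a hypothetical sketch: the package, file path, and environment value are placeholders, and the `ENVIRONMENT` variable shows how per-environment details can be parameterized.

```yaml
#cloud-config
# Hypothetical bootstrapping fragment: bring a default instance to a
# desired state on first boot (package install, config file, service start).
packages:
  - nginx
write_files:
  - path: /etc/myapp/env          # placeholder path
    content: |
      ENVIRONMENT=test            # parameterized per environment
runcmd:
  - systemctl enable --now nginx
```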

2) Golden image

Some AWS resource types, such as EC2 instances, Amazon RDS database instances, and Amazon Elastic Block Store (Amazon EBS) volumes, can be launched from a golden image: a snapshot of a particular state of that resource. Compared with bootstrapping, a golden image reduces startup time and removes dependencies on configuration services or third-party repositories. This matters in auto-scaled environments, where you want to be able to launch additional resources quickly and reliably in response to changes in demand.

You can customize an EC2 instance and then save its configuration by creating an Amazon Machine Image (AMI). You can launch as many instances from the AMI as you need, and they will all include those customizations. Every time you want to change your configuration, you must create a new golden image, so you need a versioning convention to manage your golden images. We recommend that you use a script to bootstrap the EC2 instances you use to create your AMIs; this gives you a flexible way to test and modify these images.

Alternatively, if you have an existing on-premises virtualized environment, you can use VM Import/Export from AWS to convert a variety of virtualization formats into AMIs. You can also find and use prepackaged shared AMIs provided by AWS or by third parties.

Although golden images are most often used when launching new EC2 instances, they can also be applied to resources such as Amazon RDS database instances or Amazon EBS volumes. For example, when launching a new test environment, you might want to pre-populate its database by instantiating it from a specific Amazon RDS snapshot, rather than importing data from a lengthy SQL script.

3) Containers

Another option popular with developers is Docker, an open-source technology that allows you to build and deploy distributed applications inside software containers. Docker lets you package software into a Docker image, a standardized unit of software that contains everything the software needs to run: code, runtime, system tools, system libraries, and so on. AWS Elastic Beanstalk, Amazon Elastic Container Service (Amazon ECS), and AWS Fargate let you deploy and manage multiple containers across a cluster of EC2 instances. You can build golden Docker images and manage them with the Amazon Elastic Container Registry (Amazon ECR).

Another container environment is Kubernetes, together with Amazon Elastic Container Service for Kubernetes (Amazon EKS). With Kubernetes and Amazon EKS, you can easily deploy, manage, and scale containerized applications.

4) Hybrid

You can also use a combination of the two approaches: some parts of the configuration are captured in a golden image, while others are configured dynamically through bootstrapping actions.

Items that do not change often or that introduce external dependencies are typically part of your golden image. A good candidate is your web server software, which would otherwise have to be downloaded from a third-party repository every time an instance launches.

Items that change frequently or differ between environments can be set up dynamically through bootstrapping actions. For example, if you deploy new versions of your application frequently, creating a new AMI for every application version may be impractical. You also do not want to hard-code the database hostname into your AMI, because it differs between test and production environments. User data or tags let you use a more generic AMI that is modified at launch time. For example, if you run web servers for several small businesses, they can all use the same AMI and retrieve their content from an S3 bucket location specified in user data at launch.

AWS Elastic Beanstalk follows this hybrid model. It provides preconfigured runtime environments, each launched from its own AMI, but lets you run bootstrapping actions through ebextensions configuration files and configure environment variables to parameterize differences between environments.

4.2.2 Infrastructure as code

The principles we have discussed need not be limited to individual resources. Because AWS assets are programmable, you can apply techniques, practices, and tools from software development to make your whole infrastructure reusable, maintainable, extensible, and testable.

AWS CloudFormation templates give you an easy way to create and manage a collection of related AWS resources, and to provision and update them in an orderly and predictable way. You can keep your CloudFormation templates with your application in your version-control repository, reuse architectures, and reliably clone production environments for testing.
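For illustration, a minimal CloudFormation template in the spirit described above might declare a single parameterized EC2 instance. This is a hypothetical sketch: the parameter name, instance type, and AMI ID are placeholders, not values from the source.

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Description: Minimal illustrative template - one parameterized EC2 instance.
Parameters:
  InstanceTypeParam:
    Type: String
    Default: t3.micro           # placeholder default
Resources:
  WebServer:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: !Ref InstanceTypeParam
      ImageId: ami-0123456789abcdef0   # placeholder AMI ID
```

Keeping such a template in version control alongside the application makes the environment reproducible: the same file can stand up a test copy of production.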

4.3 Automation

In a traditional IT infrastructure, you often have to react to events manually. When deploying on AWS, there is ample opportunity for automation.

To improve system stability and organizational efficiency, consider introducing one or more of these kinds of automation into your application architecture to ensure greater resilience, scalability, and performance.

4.3.1 Serverless management and deployment

When you adopt a serverless pattern, the operational focus is on the automation of the deployment pipeline, while AWS manages the underlying services, scale, and availability. AWS CodePipeline, AWS CodeBuild, and AWS CodeDeploy support the automation of these deployment processes.

4.3.2 Infrastructure management and deployment

AWS Elastic Beanstalk: you can use this service to deploy and scale web applications and services developed in Java, .NET, PHP, Node.js, Python, Ruby, Go, and Docker on familiar servers such as Apache, Nginx, Passenger, and IIS. Developers simply upload their application code, and the service automatically handles the details of resource provisioning, load balancing, auto scaling, and monitoring.

Amazon EC2 auto recovery: you can create an Amazon CloudWatch alarm that monitors an EC2 instance and automatically recovers it if it becomes impaired. However, this feature applies only to certain instance configurations. For an up-to-date description of these prerequisites, see the Amazon EC2 documentation.

AWS Systems Manager: you can automatically collect software inventory, apply operating-system patches, create system images to configure Windows and Linux operating systems, and execute arbitrary commands. These capabilities simplify the operating model and help ensure an optimally configured environment.

Auto Scaling: you can maintain application availability according to conditions you define, and automatically scale Amazon EC2, Amazon DynamoDB, Amazon ECS, and Amazon Elastic Container Service for Kubernetes (Amazon EKS) capacity. You can use Auto Scaling to help ensure that you are running the desired number of healthy EC2 instances across multiple Availability Zones. Auto Scaling can also automatically increase the number of EC2 instances during demand spikes to maintain performance, and reduce capacity during quieter periods to optimize cost.

4.3.3 Alerts and events

Amazon CloudWatch alarms: you can create a CloudWatch alarm that sends an Amazon Simple Notification Service (Amazon SNS) message when a particular metric exceeds a specified threshold for a specified number of periods. These Amazon SNS messages can automatically trigger subscribed Lambda functions, enqueue notification messages in Amazon SQS queues, or perform POST requests to HTTP or HTTPS endpoints.
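A sketch of a subscribed Lambda function of the kind mentioned above: it receives an SNS event and extracts the CloudWatch alarm's name and new state from the message body. The record shape follows the standard SNS-to-Lambda event format; the alarm values are illustrative.

```python
import json

def handler(event, context=None):
    """Extract (alarm name, new state) pairs from an SNS-delivered event."""
    results = []
    for record in event["Records"]:
        # SNS delivers the alarm notification as a JSON string in Message.
        message = json.loads(record["Sns"]["Message"])
        results.append((message["AlarmName"], message["NewStateValue"]))
    return results

# Simulated invocation with a hand-built sample event:
sample_event = {
    "Records": [
        {"Sns": {"Message": json.dumps(
            {"AlarmName": "HighCPU", "NewStateValue": "ALARM"})}}
    ]
}
alarms = handler(sample_event)
```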

Amazon CloudWatch Events: delivers a near-real-time stream of system events that describe changes in AWS resources. Using simple rules, you can route each type of event to one or more targets, such as Lambda functions, Kinesis streams, and SNS topics.

AWS Lambda scheduled events: you can create a Lambda function and configure AWS Lambda to execute it on a regular schedule.

AWS WAF security automations: AWS WAF is a web application firewall that lets you create custom, application-specific rules to block common attack patterns that can affect application availability, compromise security, or consume excessive resources.

4.4 loose coupling

4.4.1 Well-defined interfaces

One way to reduce interdependencies in a system is to allow components to interact with each other only through specific, technology-agnostic interfaces, such as RESTful APIs. As long as those interfaces maintain backward compatibility, deployments of different components are decoupled. This granular design pattern is commonly referred to as a microservices architecture.

Amazon API Gateway is a fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale. It handles all the tasks involved in accepting and processing up to hundreds of thousands of concurrent API calls, including traffic management, authorization and access control, monitoring, and API version management.

4.4.2 Service discovery

Applications deployed as a set of smaller services depend on the ability of those services to interact with each other. Because each service can run across multiple compute resources, there needs to be a way for each service to be addressed. For example, in a traditional infrastructure, if your front-end web service needed to connect to a back-end web service, you could hard-code the IP address of the compute resource running that service. Although this approach can still work in cloud computing, if these services are meant to be loosely coupled, they should be consumable without prior knowledge of their network topology details. Besides hiding complexity, this lets infrastructure details change at any time. If you want to take advantage of the elasticity of cloud computing and launch or terminate resources at any point in time, loose coupling is a crucial factor. To achieve that, you need some way of implementing service discovery.

Implementing service discovery

For Amazon EC2-hosted services, a simple way to achieve service discovery is through Elastic Load Balancing (ELB). Because each load balancer has its own hostname, you can consume a service through a stable endpoint. This can be combined with DNS and private Amazon Route 53 zones, so that the endpoint of a particular load balancer can be abstracted and modified at any time.

Because service discovery is the glue between components, it must be highly available and reliable. If load balancers are not used, service discovery should also cater for things like health checking. Amazon Route 53 supports auto naming to make it easier to provision instances for microservices: auto naming lets you automatically create DNS records based on a configuration you define. Other example implementations include custom solutions using a combination of tags, a highly available database, custom scripts that call the AWS APIs, or open-source tools such as Netflix Eureka, Airbnb Synapse, or HashiCorp Consul.

4.4.3 Asynchronous Integration

Asynchronous integration is another form of loose coupling between services. This model is suitable for any interaction that does not need an immediate response and where an acknowledgment that the request has been registered is sufficient. It involves one component that generates events and another that consumes them. The two components do not integrate through direct point-to-point interaction, but through an intermediate durable storage layer, such as an SQS queue, a streaming data platform such as Amazon Kinesis, cascading Lambda events, AWS Step Functions, or Amazon Simple Workflow Service.

Figure 1: tight coupling and loose coupling

This approach decouples the two components and introduces additional resiliency. For example, if a process reading messages from the queue fails, messages can still be added to the queue and processed once the system recovers. It also lets you protect a less scalable back-end service from front-end spikes and find the right trade-off between cost and processing lag. For example, you can decide that you do not need to scale your database to accommodate an occasional peak of write queries, as long as you eventually process those queries asynchronously with some delay. Finally, you can improve the end-user experience by moving slow operations out of the interactive request path.
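The resilience property just described can be sketched in a few lines: while the consumer is offline, messages simply accumulate in the intermediate buffer (standing in for an SQS queue or Kinesis stream) and are drained once the consumer recovers, so nothing is lost. Message names are illustrative.

```python
from collections import deque

buffer = deque()                 # stand-in for a durable queue

def produce(n: int) -> None:
    """Front end keeps writing even while the back end is down."""
    for i in range(n):
        buffer.append(f"msg-{i}")

produce(4)                       # consumer is offline during these writes
backlog = len(buffer)            # messages wait safely in the queue

# Consumer comes back online and drains the backlog in order:
recovered = [buffer.popleft() for _ in range(len(buffer))]
```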

Examples of asynchronous integration include:

- A front-end application inserts jobs into a queue system such as Amazon SQS. A back-end system retrieves those jobs and processes them at its own pace.

- An API generates events and pushes them into a Kinesis stream. A back-end application processes these events in batches to create aggregated time-series data stored in a database.

- Lambda functions can consume events from a variety of AWS sources, such as Amazon DynamoDB update streams and Amazon S3 event notifications. You do not have to worry about implementing a queue or other asynchronous integration method, because Lambda handles this for you.

4.4.4 Best practices for distributed systems

Another way to increase loose coupling is to build applications that handle component failure gracefully. You can find ways to reduce the impact on your end users and increase your ability to make progress on offline procedures, even while some components fail.

Dealing with failure gracefully in practice

Failed requests can be retried with exponential backoff and jitter strategies, or stored in a queue for later processing. For front-end interfaces, it is possible to provide alternative or cached content instead of failing completely when, for example, your database server becomes unavailable. Amazon Route 53 DNS failover also lets you monitor your website and automatically route visitors to a backup site if the primary site becomes unavailable. You can host the backup site as a static website on Amazon S3 or as a separate dynamic environment.

4.4.5 Services, not servers

The development, management, and operation of applications, especially at scale, require a wide variety of underlying technology components. In a traditional IT infrastructure, companies would have to build and operate all of those components themselves.

AWS offers a broad set of compute, storage, database, analytics, application, and deployment services that help organizations move faster and lower IT costs.

Architectures that do not leverage that breadth (for example, if they use only Amazon EC2) may not be making the most of cloud computing and may be missing an opportunity to increase developer productivity and operational efficiency.

4.4.5.1 Managed services

AWS managed services provide building blocks that developers can use to power their applications. These include databases, machine learning, analytics, queuing, search, email, notifications, and more. For example, with Amazon SQS you can offload the administrative burden of operating and scaling a highly available messaging cluster, while paying a low price only for what you use. Amazon SQS is also inherently scalable and reliable. The same applies to Amazon S3, which lets you store as much data as you need and access it when you need it, without having to think about capacity, hard-disk configuration, replication, and other related issues.

Other examples of managed services that support your application include:

- Amazon CloudFront for content delivery

- ELB for load balancing

- Amazon DynamoDB for NoSQL databases

- Amazon CloudSearch for search workloads

- Amazon Elastic Transcoder for video transcoding

- Amazon Simple Email Service (Amazon SES) for sending and receiving email

4.4.5.2 Serverless architectures

Serverless computing architecture can reduce the operational complexity of running applications. These architectures can reduce costs because you do not have to manage or pay for underutilized servers, nor do you need to configure redundant infrastructure for high availability.

For example, you can upload your code to AWS Lambda, and the service runs the code on your behalf using AWS infrastructure. With AWS Lambda, you are charged for every 100 ms your code executes and for the number of times your code is triggered. By using Amazon API Gateway, you can develop virtually infinitely scalable synchronous APIs powered by AWS Lambda. Combined with Amazon S3 for serving static content assets, this pattern can deliver a complete web application.
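A minimal handler in the Lambda proxy-integration style implied above: API Gateway passes the request as an event dict, and the function returns a statusCode/body response. The query parameter and payload are illustrative, and the local call below simulates what API Gateway would invoke.

```python
import json

def lambda_handler(event, context=None):
    """Return an API Gateway proxy-style response built from the request."""
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {"statusCode": 200, "body": json.dumps({"hello": name})}

# Simulated API Gateway invocation:
response = lambda_handler({"queryStringParameters": {"name": "cloud"}})
```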

For mobile and web applications, you can use Amazon Cognito so that you do not have to manage a back-end solution to handle user authentication, network state, storage, and synchronization. Amazon Cognito generates unique identifiers for your users.

Amazon Cognito provides temporary AWS credentials for your users, allowing mobile applications running on the device to interact directly with AWS services protected by AWS identity and access Management (IAM).

For Internet of Things applications, organizations have traditionally had to provision, operate, scale, and maintain their own servers as device gateways to handle communication between connected devices and their services. AWS IoT provides a fully managed device gateway that scales automatically with your usage, without any operational overhead.

The serverless computing architecture also makes it possible to run responsive services on edge computing.

4.5 Database

In a traditional IT infrastructure, organizations are often limited to the database and storage technologies they can use. There can be constraints based on licensing costs and on the ability to support a variety of database engines. On AWS, these constraints are removed by managed database services that offer enterprise performance at open-source cost. As a result, it is not uncommon for applications to run on top of a polyglot data layer, choosing the right technology for each workload.

4.5.1 Choose the right database technology for each workload

The following questions can help you decide which solutions to include in your architecture:

Is this a read-heavy, write-heavy, or balanced workload? How many reads and writes per second do you need? How will those values change if the number of users grows?

How much data do you need to store, and for how long? How fast will it grow? Is there an upper limit in the foreseeable future? What is the size of each object (average, minimum, maximum)? How will these objects be accessed?

What are the requirements for data durability? Is this data store going to be your "source of truth"?

What are your latency requirements? How many concurrent users do you need to support?

What is your data model, and how will you query the data? Are your queries relational in nature (for example, JOINs across multiple tables)? Could you denormalize your schema to create flatter data structures that are easier to scale?

What kind of functionality do you need? Do you need strong integrity controls, or are you looking for more flexibility (for example, schemaless data stores)? Do you need sophisticated reporting or search capabilities? Are your developers more familiar with relational databases than NoSQL?

What are the license costs of the associated database technologies? Do these costs factor in application development investment, storage, and usage costs over time? Does the licensing model support the projected growth? Could you use cloud-native database engines such as Amazon Aurora to get the simplicity and cost-effectiveness of open-source databases?

4.5.2 Relational database

Relational databases (also known as RDBMS or SQL databases) normalize data into well-defined tabular structures known as tables, which consist of rows and columns. They provide a powerful query language, flexible indexing capabilities, strong integrity controls, and the ability to combine data from multiple tables in a fast and efficient manner. Amazon RDS makes it easy to set up, operate, and scale a relational database in the cloud, with support for many familiar database engines.

4.5.2.1 scalability

Relational databases can scale vertically by upgrading to a larger Amazon RDS DB instance or by adding more or faster storage. In addition, consider using Amazon Aurora, a database engine designed to deliver much higher throughput than standard MySQL running on the same hardware. For read-heavy applications, you can also scale horizontally beyond the capacity constraints of a single DB instance by creating one or more read replicas.

Read replicas are separate database instances that are replicated asynchronously. As a result, they are subject to replication lag and might miss some of the latest transactions. Application designers need to consider which queries tolerate slightly stale data. Those queries can be executed on a read replica, while the rest should run on the primary node. Read replicas cannot accept any write queries.

Relational database workloads that need to scale their write capacity beyond the constraints of a single DB instance require a different approach, called data partitioning or sharding. With this model, data is split across multiple database schemas, each running on its own autonomous primary DB instance. Although Amazon RDS removes the operational overhead of running those instances, sharding introduces some complexity into the application: the application's data access layer must be modified to understand how the data is split so that queries are directed to the correct instance. In addition, any schema changes must be performed across multiple database schemas, so it is worth investing some effort in automating this process.
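The shard-routing logic in the data access layer can be sketched as follows. This is a minimal illustration, and the endpoint names are hypothetical; a stable hash of the shard key keeps routing consistent across all application servers as long as the shard count does not change:

```python
# Minimal sketch of application-side shard routing: each customer is mapped
# to one of several autonomous primary DB instances. Endpoint names below
# are hypothetical placeholders.
import hashlib

SHARDS = ["db-primary-1", "db-primary-2", "db-primary-3"]

def shard_for(customer_id: str) -> str:
    # Use a stable hash (not Python's per-process randomized hash()) so that
    # every app server routes the same customer to the same shard.
    digest = hashlib.sha256(customer_id.encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# The data access layer would then direct the query to shard_for(customer_id).
target = shard_for("customer-42")
```

One design consequence worth noting: because the mapping depends on `len(SHARDS)`, adding a shard remaps most keys, which is why schemes such as consistent hashing are often preferred when shard counts change.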

4.5.2.2 High availability

For any production relational database, we recommend using the Amazon RDS Multi-AZ deployment feature, which creates a synchronously replicated standby instance in a different Availability Zone. If the primary node fails, Amazon RDS automatically fails over to the standby without manual administrative intervention. During a failover, the primary node is briefly inaccessible. Resilient applications can be designed to handle this by offering reduced functionality, such as read-only mode served by read replicas. Amazon Aurora offers multi-master capability, which lets you scale reads and writes across Availability Zones, and it also supports cross-region replication.

4.5.2.3 Anti-patterns

If your application primarily indexes and queries data with no need for joins or complex transactions, consider a NoSQL database instead, especially if you expect write throughput beyond the constraints of a single instance. If you have large binary files (audio, video, and images), it is more efficient to store the actual files in Amazon S3 and keep only the files' metadata in the database.

4.5.3 NoSQL database

NoSQL databases trade some of the query and transaction capabilities of relational databases for a more flexible data model that scales horizontally in a seamless way. NoSQL databases use a variety of data models, including graphs, key-value pairs, and JSON documents, and are widely recognized for their ease of development, scalable performance, high availability, and resilience. Amazon DynamoDB is a fast and flexible NoSQL database service for applications of any scale that need consistent, single-digit-millisecond latency. It is a fully managed cloud database and supports both document and key-value store models.

4.5.3.1 scalability

NoSQL database engines typically perform data partitioning and replication to scale reads and writes horizontally. They do so transparently, without the need for data partitioning logic in the application's data access layer. Amazon DynamoDB in particular manages table partitioning for you automatically, adding new partitions as your table grows in size or as read and write provisioned capacity changes. Amazon DynamoDB Accelerator (DAX) is a managed, highly available in-memory cache for DynamoDB that can deliver significant further performance improvements.

4.5.3.2 High availability

Amazon DynamoDB synchronously replicates data across three facilities in an AWS Region, providing fault tolerance in the event of a server failure or Availability Zone disruption. Amazon DynamoDB also supports global tables, a fully managed multi-region, multi-master database that delivers fast, local read and write performance for massively scaled global applications. Global tables are replicated across your selected AWS Regions.

4.5.3.3 Anti-patterns

If your schema cannot be denormalized and your application requires joins or complex transactions, consider a relational database instead. If you have large binary files (audio, video, and images), consider storing the files in Amazon S3 and keeping the files' metadata in the database.

4.5.4 data Warehouse

A data warehouse is a specialized type of relational database, optimized for the analysis and reporting of large amounts of data. It can be used to combine transactional data from disparate sources (such as user behavior in a web application, data from your finance and billing systems, or customer relationship management, or CRM, data), making it available for analysis and decision-making.

Traditionally, setting up, running, and extending data warehouses has been complex and expensive. On AWS, you can take advantage of Amazon Redshift, a managed data warehouse service that costs only 1/10 of the cost of a traditional solution.

4.5.4.1 scalability

Amazon Redshift achieves efficient storage and optimum query performance through a combination of massively parallel processing (MPP), columnar data storage, and targeted data compression encoding schemes. It is particularly well suited to analytic and reporting workloads against very large data sets. The Amazon Redshift MPP architecture lets you increase performance by increasing the number of nodes in your data warehouse cluster. Amazon Redshift Spectrum enables Amazon Redshift SQL queries to run directly against data in Amazon S3, extending the analytic capabilities of Amazon Redshift beyond data stored on local disks in the data warehouse to unstructured data, without the need to load or transform the data.

4.5.4.2 High availability

Amazon Redshift has multiple features that enhance the reliability of your data warehouse cluster. We recommend deploying production workloads in multi-node clusters so that data written to a node is automatically replicated to other nodes within the cluster. Data is also continuously backed up to Amazon S3. Amazon Redshift continuously monitors the health of the cluster, automatically re-replicates data from failed drives, and replaces nodes as necessary.

4.5.4.3 Anti-patterns

Because Amazon Redshift is a SQL-based relational database management system (RDBMS), it is compatible with other RDBMS applications and business intelligence tools. Although Amazon Redshift provides the functionality of a typical RDBMS, including online transaction processing (OLTP) functions, it is not designed for those workloads. If you expect a high-concurrency workload that generally involves reading and writing all of the columns of a small number of records at a time, consider Amazon RDS or Amazon DynamoDB instead.

4.5.5 search

Search is often confused with querying. A query is a formal database operation, addressed in formal terms to a specific data set; search enables you to query data sets that are not precisely structured. For this reason, applications that require sophisticated search capabilities typically outgrow the abilities of relational or NoSQL databases. A search service can be used to index and search both structured and free-text formats, and can support functionality that is not available in other databases, such as customizable result ranking, faceted filtering, synonyms, and stemming.

On AWS, you can choose between Amazon CloudSearch and Amazon Elasticsearch Service (Amazon ES). Amazon CloudSearch is a managed service that requires little configuration and scales automatically. Amazon ES offers an open-source API and gives you more control over the configuration details. Amazon ES has also evolved into much more than a search solution: it is often used as an analytics engine for use cases such as log analytics, real-time application monitoring, and clickstream analytics.

4.5.5.1 scalability

Both Amazon CloudSearch and Amazon ES use data partitioning and replication to scale out. The difference is that with Amazon CloudSearch, you don't have to worry about how many partitions and replicas you need, because the service automatically handles them.

4.5.5.2 High availability

Both Amazon CloudSearch and Amazon ES include the ability to store data redundantly across Availability Zones.

4.5.7 Graph databases

A graph database uses graph structures for queries. A graph is defined as consisting of edges (relationships) that directly relate to nodes (data entities) in the store. These relationships enable data in the store to be linked together directly, which allows complex hierarchies that would span many tables in a relational system to be retrieved quickly. For this reason, graph databases are purpose-built to store and navigate relationships, and are commonly used in use cases such as social networking, recommendation engines, and fraud detection, where you need to create relationships between data and quickly query those relationships.

Amazon Neptune is a fully managed graph database service.

4.5.6.1 scalability

Amazon Neptune is a purpose-built, high-performance graph database engine optimized for processing graph queries.

4.5.6.2 High availability

Amazon Neptune is highly available, with read replicas, point-in-time recovery, continuous backup to Amazon S3, and replication across Availability Zones. Neptune is secure, with support for encryption at rest and in transit.

4.5.7 manage the increasing amount of data

Traditional data storage and analytics tools can no longer provide the agility and flexibility required to deliver relevant business insights. That is why many organizations are shifting to a data lake architecture. A data lake is an architectural approach that lets you store massive amounts of data in a central location so that it is readily available to be categorized, processed, analyzed, and consumed by diverse groups within your organization. Because data can be stored as is, there is no need to convert it to a predefined schema, and you no longer need to know in advance what questions you will want to ask of your data. This enables you to select the right technology to meet your specific analytical requirements.

4.5.8 eliminating a single point of failure

Production systems typically come with defined or implicit uptime objectives. A system is highly available when it can withstand the failure of an individual component or multiple components, such as hard disks, servers, and network links. To help you build highly available systems, think about ways to automate recovery and reduce disruption at every layer of your architecture.

4.5.8.1 introduce redundancy

You can eliminate a single point of failure by introducing redundancy, which means you can provide multiple resources for the same task. Redundancy can be achieved in standby or active mode.

In standby redundancy, when a resource fails, functionality is recovered on a secondary resource through a failover process. The failover typically requires some time to complete, and during that period the resource remains unavailable. The standby resource can either be launched automatically only when needed (to reduce cost), or it can already be running idle (to accelerate failover and minimize disruption). Standby redundancy is often used for stateful components such as relational databases.

In active redundancy, requests are distributed to multiple redundant compute resources. When one of them fails, the rest can simply absorb a larger share of the workload. Compared to standby redundancy, active redundancy can achieve better utilization and affects a smaller population when there is a failure.

4.5.8.2 Detecting failures

You should build as much automation as possible in both detecting and reacting to failure. You can use services such as ELB and Amazon Route 53 to configure health checks and mask failure by routing traffic to healthy endpoints. In addition, you can use Auto Scaling, or services such as Amazon EC2 auto-recovery or AWS Elastic Beanstalk, to automatically replace unhealthy nodes. It won't be possible to predict every possible failure scenario on day one, so make sure you collect enough logs and metrics to understand normal system behavior. Once you understand that, you will be able to set up alarms for manual intervention or automated response.

A well-designed health check

Configuring the correct health check for the application can help determine whether you can respond to various failure situations correctly and in a timely manner. Specifying incorrect health checks may actually reduce the availability of the application.

In a typical three-tier application, you configure health checks on ELB. Design your health checks with the objective of reliably assessing the health of the back-end nodes. A simple TCP health check would not detect the scenario where the instance itself is healthy but the web server process has crashed. Instead, you should assess whether the web server can return an HTTP 200 response for some simple request.

At this layer, it might not be a good idea to configure a deep health check, which is a test that depends on other layers of your application being successful, because this can result in false positives. For example, if your health check also assesses whether the instance can connect to a back-end database, you risk marking all of your web servers as unhealthy when that database node becomes briefly unavailable. A layered approach is often the best. A deep health check might be appropriate to implement at the Amazon Route 53 level: by running a more holistic check that determines whether the environment is actually able to provide the required functionality, you can configure Amazon Route 53 to fail over to a static version of your website until your database is up and running again.
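The shallow-versus-deep distinction above can be illustrated with a small sketch. This is a simplified simulation, not a real load balancer: the "fleet" and status codes are hypothetical, and the point is that during a brief database outage a shallow check keeps healthy web nodes in service while a deep check at the load-balancer layer would drain the entire fleet:

```python
# Simulated health-check evaluation for a fleet of (name, http_status) pairs.
# A shallow check only verifies the web tier (HTTP 200); a deep check also
# requires a back-end dependency (here, a stubbed database flag).

def healthy_targets(instances, db_reachable, deep=False):
    """Return the instance names that pass the configured health check."""
    if deep:
        check = lambda status: status == 200 and db_reachable
    else:
        check = lambda status: status == 200
    return [name for name, status in instances if check(status)]

fleet = [("web-1", 200), ("web-2", 200), ("web-3", 500)]

# During a brief database outage (db_reachable=False):
shallow_result = healthy_targets(fleet, db_reachable=False)             # keeps web-1, web-2
deep_result = healthy_targets(fleet, db_reachable=False, deep=True)     # drains everything
```

This is the false-positive failure mode described above: the deep check reports every web server unhealthy even though the web tier itself is fine.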

4.5.8.3 reliable data storage

Your application and your users will create and maintain a variety of data. It is crucial that your architecture protects both data availability and integrity. Data replication is the technique that introduces redundant copies of data. Not only can it help horizontally scale read capacity, it also increases data durability and availability. Replication can take place in a few different modes.

Synchronous replication only acknowledges a transaction after it has been durably stored in both the primary location and its replicas. It is the ideal way to protect the integrity of data in the event of a failure of the primary node. Synchronous replication can also scale read capacity for queries that require the most up-to-date data (strong consistency). The drawback of synchronous replication is that the primary node is coupled to the replicas: a transaction can't be acknowledged before all replicas have performed the write. This can compromise performance and availability, especially in topologies that run across unreliable or high-latency network connections. For the same reason, it is not recommended to maintain many synchronous replicas.

Regardless of the durability of your solution, replication is no replacement for backups. Synchronous replication redundantly stores all updates to your data, even those that are the result of software bugs or human error. However, particularly for objects stored in Amazon S3, you can use versioning to preserve, retrieve, and restore any version of any object; with versioning, you can recover from both unintended user actions and application failures.

Asynchronous replication decouples the primary node from its replicas at the expense of introducing replication lag. This means that changes on the primary node are not immediately reflected in its replicas. Asynchronous replicas are used to horizontally scale the system's read capacity for queries that can tolerate that replication lag. It can also be used to increase data durability when some loss of recent transactions can be tolerated during a failover. For example, you can maintain an asynchronous replica of a database in a separate AWS Region as a disaster recovery solution.

Quorum-based replication combines synchronous and asynchronous replication to overcome the challenges of large-scale distributed database systems.

Replication to multiple nodes can be managed by defining a minimum number of nodes that must participate in a successful write operation. A detailed discussion of distributed data stores is beyond the scope of this document; for more information about distributed data stores and the core set of principles behind an ultra-scalable and highly reliable database system, see the Amazon Dynamo white paper.

It is important to understand where each technology you are using fits within these data storage models. Their behavior during various failover or backup/restore scenarios should align with your recovery point objective (RPO) and recovery time objective (RTO): you must ascertain how much data you are willing to lose and how quickly you need to resume operations. For example, the Redis engine for Amazon ElastiCache supports replication with automatic failover, but the Redis engine's replication is asynchronous, so during a failover it is likely that some recent transactions will be lost. By contrast, Amazon RDS with its Multi-AZ feature is designed to provide synchronous replication to keep the data on the standby node in sync with the primary.

4.5.8.4 automated multi-data center resiliency

Business-critical applications also need protection against disruption scenarios that affect more than a single disk, server, or rack. In a traditional infrastructure, you typically need a disaster recovery plan to allow failover to a distant second data center in the event of a major disruption in the primary one. Because of the long distance between the two data centers, latency makes it impractical to maintain synchronous cross-data-center copies of the data. As a result, a failover will most likely lead to data loss or to a very costly data recovery process. This makes failover a risky and not always sufficiently tested procedure. Nevertheless, this model provides excellent protection against low-probability but high-impact risks, such as natural disasters that affect the whole infrastructure for a long time.

Much more likely are shorter data center disruptions. For short disruptions that are not expected to last long, the choice to perform a failover is a difficult one and is generally avoided. On AWS, a simpler and more efficient protection against this type of failure is available. Each AWS Region contains multiple distinct locations, or Availability Zones. Each Availability Zone is engineered to be independent from failures in the other Availability Zones. An Availability Zone is a data center, and in some cases an Availability Zone consists of multiple data centers. Availability Zones within a Region provide inexpensive, low-latency network connectivity to the other zones in the same Region. This allows you to replicate your data across data centers in a synchronous manner, so that failover can be automated and transparent to your users.

Active redundancy can also be implemented. For example, a fleet of application servers can be distributed across multiple Availability Zones and attached to ELB. When the EC2 instances of a particular Availability Zone fail their health checks, ELB stops sending traffic to those nodes. In addition, AWS Auto Scaling ensures that the correct number of EC2 instances is available to run your application, launching and terminating instances on demand as defined by your scaling policies. If your application must not suffer even short-term performance degradation because of an Availability Zone failure, your architecture should be statically stable, meaning it does not require a change in the behavior of the workload to tolerate failure. In this scenario, your architecture should provision excess capacity to withstand the loss of one Availability Zone.

Many of the higher-level services on AWS are inherently designed according to the multiple-Availability-Zone principle. For example, Amazon RDS provides high availability and automatic failover support for DB instances using Multi-AZ deployments, while with Amazon S3 and Amazon DynamoDB your data is redundantly stored across multiple facilities.

4.5.8.5 Fault isolation and traditional horizontal scaling

While the active redundancy pattern is great for balancing traffic and handling instance or Availability Zone disruptions, it is not sufficient if there is something harmful about the requests themselves. For example, there could be scenarios where every instance is affected: if a particular request happens to trigger a bug that causes the system to fail over, the caller may trigger a cascading failure by repeatedly retrying the same request against all instances.

Shuffle sharding

You can improve traditional horizontal scaling with a fault-isolation technique called sharding. Similar to the technique traditionally used with data storage systems, instead of spreading traffic from all customers across every node, you can group the instances into shards. For example, if you have eight instances for your service, you might create four shards of two instances each (two instances per shard for redundancy) and distribute each customer to a specific shard.

In this way, you can reduce the impact on customers in direct proportion to the number of shards you have. However, some customers will still be affected, so the key is to make the client fault tolerant. If the client can try every endpoint in a set of sharded resources until one succeeds, you get a dramatic improvement. This technique is called shuffle sharding.
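The example above can be sketched in code. This is a simplified illustration with hypothetical instance ids: eight instances grouped into four fixed shards of two, alongside a shuffle-sharding variant that assigns each customer its own stable, per-customer pair of instances:

```python
# Sketch of simple sharding vs. shuffle sharding over eight instances.
# Instance ids are hypothetical placeholders.
import hashlib
import random

INSTANCES = [f"i-{n}" for n in range(8)]

def simple_shard(customer: str, shard_size: int = 2):
    # Fixed shards: [i-0, i-1], [i-2, i-3], [i-4, i-5], [i-6, i-7].
    shards = [INSTANCES[i:i + shard_size]
              for i in range(0, len(INSTANCES), shard_size)]
    idx = int(hashlib.sha256(customer.encode()).hexdigest(), 16) % len(shards)
    return shards[idx]

def shuffle_shard(customer: str, shard_size: int = 2):
    # Seed the RNG per customer so the assignment is stable across calls;
    # different customers get mostly non-overlapping instance pairs.
    rng = random.Random(customer)
    return rng.sample(INSTANCES, shard_size)
```

With fixed shards, a poison request from one customer can take down a whole shard shared by many customers; with shuffle sharding, the set of customers sharing both instances of any given pair is far smaller, which shrinks the blast radius further.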

4.6 optimize cost

When you move your existing architecture to the cloud, you can reduce capital expenditure and save costs because of AWS's economies of scale. By iterating and using more AWS capabilities, you can have the opportunity to create a cost-optimized cloud architecture.

4.6.1 Right sizing

AWS offers a broad range of resource types and configurations for many use cases. For example, services such as Amazon EC2, Amazon RDS, Amazon Redshift, and Amazon ES offer many instance types. In some cases, you should select the instance type that best fits your workload's requirements. In other cases, using fewer instances of a larger instance type might result in lower total cost or better performance. You should benchmark your application environment and select the right instance type depending on how your workload utilizes CPU, RAM, network, storage size, and I/O.

Similarly, you can reduce cost by selecting the right storage solution for your needs. For example, Amazon S3 offers a variety of storage classes, including Standard, Reduced Redundancy, and Standard-Infrequent Access. Other services, such as Amazon EC2, Amazon RDS, and Amazon ES, support different EBS volume types (magnetic, General Purpose SSD, Provisioned IOPS SSD) that you should evaluate.

You can continue to reduce cost over time through continuous monitoring and tagging. Just like application development, cost optimization is an iterative process. Because your application and its usage will evolve over time, and because AWS iterates frequently and regularly releases new options, it is important to continuously evaluate your solution.

AWS provides tools to help you identify those cost-saving opportunities and keep your resources right-sized. To make the outcomes of those tools easy to interpret, you should define and implement a tagging policy for your AWS resources. You can make tagging a part of your build process and automate it with AWS management tools such as AWS Elastic Beanstalk and AWS OpsWorks. You can also use the managed rules provided by AWS Config to assess whether specific tags are applied to your resources.

4.6.2 make full use of elasticity

Another way you can save money with AWS is by taking advantage of the platform's elasticity. Plan to implement Auto Scaling for as many Amazon EC2 workloads as possible, so that you scale out when needed and scale in, automatically reducing your spend, when that capacity is no longer required. In addition, you can automatically turn off non-production workloads when they are not in use. Finally, consider which compute workloads you could implement on AWS Lambda so that you never pay for idle or redundant resources.

Whenever possible, replace Amazon EC2 workloads with AWS managed services that either don't require you to make any capacity decisions (such as ELB, Amazon CloudFront, Amazon SQS, Amazon Kinesis Firehose, AWS Lambda, Amazon SES, Amazon CloudSearch, and Amazon EFS) or that enable you to easily modify capacity as and when needed (such as Amazon DynamoDB, Amazon RDS, and Amazon ES).

4.6.3 Take advantage of the variety of purchasing options

Amazon EC2 On-Demand Instance pricing gives you maximum flexibility with no long-term commitments. Two other EC2 purchasing options that can help you reduce spend are Reserved Instances and Spot Instances.

4.6.3.1 reserved instances

Amazon EC2 Reserved Instances allow you to reserve Amazon EC2 computing capacity in exchange for a significantly discounted hourly rate compared with On-Demand Instance pricing. This is ideal for applications with predictable minimum capacity requirements. You can take advantage of tools such as AWS Trusted Advisor or Amazon EC2 usage reports to identify the compute resources that you use most of the time and that you should consider reserving. Depending on your Reserved Instance purchases, the discounts are reflected in the monthly bill. There is no technical difference between an On-Demand EC2 instance and a Reserved Instance; the difference lies in the way you pay for instances that you reserve.

Other services also have reserved capacity options (for example, Amazon Redshift, Amazon RDS, Amazon DynamoDB, and Amazon CloudFront).

Tip: You should not commit to Reserved Instance purchases before you have sufficiently benchmarked your application in production. After you have purchased reserved capacity, you can use the Reserved Instance utilization reports to make sure you are still making the most of your reserved capacity.
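The reservation trade-off above comes down to simple break-even arithmetic: a reservation pays off only once the instance has run enough hours for the discounted rate plus any upfront fee to undercut On-Demand. The prices below are hypothetical placeholders, not actual AWS rates:

```python
# Illustrative break-even calculation for a Reserved Instance purchase.
# All prices are hypothetical placeholders for illustration only.

def break_even_hours(on_demand_rate, reserved_rate, upfront_fee):
    # Hours of usage at which the reservation starts saving money:
    # the upfront fee divided by the hourly saving.
    return upfront_fee / (on_demand_rate - reserved_rate)

# Sample prices: $0.10/hr On-Demand, $0.04/hr reserved, $300 upfront.
hours = break_even_hours(on_demand_rate=0.10, reserved_rate=0.04, upfront_fee=300)
# With these sample numbers the reservation wins after roughly 5,000 hours,
# i.e. about seven months of continuous use.
```

This is why the tip above matters: a reservation for a workload that only runs a few hours a day may never reach its break-even point before the term ends.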

4.6.3.2 Spot Instances

For less steady workloads, consider using Spot Instances. Amazon EC2 Spot Instances allow you to use spare Amazon EC2 computing capacity. Because Spot Instances are typically available at a discount compared to On-Demand pricing, you can significantly reduce the cost of running your applications.

Spot Instances enable you to request unused EC2 instances, which can lower your Amazon EC2 costs significantly. The hourly price for a Spot Instance (of each instance type in each Availability Zone) is set by Amazon EC2 and adjusted gradually based on the long-term supply of, and demand for, Spot Instances. Your Spot Instance runs whenever capacity is available and the maximum price per hour for your request exceeds the Spot price.

As a result, Spot Instances are great for workloads that have the flexibility to tolerate interruptions. However, you can also use Spot Instances when you require more predictable availability: for example, you can combine Reserved, On-Demand, and Spot Instances to mix a predictable minimum capacity with opportunistic access to additional compute resources, depending on the Spot market price. This is a very cost-effective way to improve throughput or application performance.

4.7 Caching

Caching is a technique that stores previously calculated data for future use. It is used to improve application performance and increase the cost efficiency of an implementation. Caching can be applied at multiple layers of an IT architecture.

4.7.1 Application data caching

Applications can be designed so that they store and retrieve information from fast, managed, in-memory caches. Cached information might include the results of I/O-intensive database queries or the outcomes of computationally intensive processing. When a result set is not found in the cache, the application can calculate it, or retrieve it from a database or from an expensive, slowly changing third-party source, and then store it in the cache for subsequent requests. When a result set is found in the cache, however, the application can use it directly, which improves end-user latency and reduces the load on back-end systems. Your application can control how long each cached item remains valid. In some cases, even a few seconds of caching for very popular objects can result in a dramatic decrease in database load.
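The lookup flow described above is commonly called the cache-aside pattern. A minimal sketch follows; the in-process dictionary and stubbed "database" stand in for a managed cache and a real back end, and the key names are hypothetical:

```python
# Minimal cache-aside sketch: check the cache first, fall back to a
# (simulated) database on a miss, and store the result with a TTL so each
# cached item controls how long it remains valid.
import time

class CacheAside:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}      # key -> (value, expiry timestamp)
        self.db_reads = 0    # counts expensive back-end hits

    def _query_database(self, key):
        self.db_reads += 1   # stand-in for an expensive query
        return f"row-for-{key}"

    def get(self, key):
        entry = self.store.get(key)
        if entry is not None and entry[1] > time.monotonic():
            return entry[0]                       # cache hit
        value = self._query_database(key)         # cache miss
        self.store[key] = (value, time.monotonic() + self.ttl)
        return value

cache = CacheAside(ttl_seconds=5)
cache.get("user:1")   # miss: hits the back end
cache.get("user:1")   # hit: served from memory, no back-end load
```

Even this short TTL illustrates the point made above: during the five seconds the entry is valid, every additional request for a popular key is absorbed by the cache instead of the database.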

Amazon ElastiCache is a Web service that makes it easy to deploy, operate, and scale memory caches in the cloud. It supports two open source memory caching engines: Memcached and Redis.

Amazon DynamoDB Accelerator (DAX) is a fully managed, highly available in-memory cache for DynamoDB that delivers performance improvements from milliseconds to microseconds, even at high throughput. DAX adds in-memory acceleration to your DynamoDB tables without requiring you to manage cache invalidation, data population, or cluster management.

4.7.2 Edge caching

Copies of static content (images, CSS files, or streaming of pre-recorded video) and dynamic content (responsive HTML, live video) can be cached at edge locations with Amazon CloudFront, a CDN with multiple points of presence around the world. Edge caching allows content to be served by infrastructure that is closer to viewers, which lowers latency and gives you the high, sustained data transfer rates necessary to deliver large popular objects to end users at scale.

Requests for your content are intelligently routed to Amazon S3 or to your origin servers. If the origin is running on AWS, requests are transferred over optimized network paths for a more reliable and consistent experience. You can use Amazon CloudFront to deliver your entire website, including non-cacheable content. In this case, the benefit is that Amazon CloudFront reuses existing connections between the Amazon CloudFront edge and your origin server, which reduces connection setup latency for each origin request. Other connection optimizations are also applied to avoid internet bottlenecks and make the most of the bandwidth available between the edge location and the viewer. This means that Amazon CloudFront can speed up the delivery of your dynamic content and provide your viewers with a consistent, reliable, yet personalized experience when navigating your web application. Amazon CloudFront also applies the same performance benefits to upload requests as it does to requests for downloading dynamic content.

4.8 Security

Most of the security tools and techniques that you may already be familiar with from a traditional IT infrastructure can also be used in the cloud. At the same time, AWS allows you to improve your security in a variety of ways. AWS is a platform that allows you to formalize the design of security controls in the platform itself. It simplifies system use for administrators and your IT department, and makes your environment much easier to audit in a continuous manner.

4.8.1 Use AWS features for defense in depth

AWS provides many features that can help you build architectures with a defense-in-depth approach. Starting at the network level, you can build a VPC topology that isolates parts of your infrastructure through the use of subnets, security groups, and routing controls. Services such as AWS WAF, a web application firewall, can help protect your web applications from SQL injection and other vulnerabilities in your application code. For access control, you can use IAM to define a granular set of policies and assign them to users, groups, and AWS resources. Finally, the AWS Cloud offers many options for protecting your data, whether it is in transit or at rest.
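As an illustration of a granular IAM policy, the sketch below builds a least-privilege policy document as a plain Python dict: read-only access to a single S3 prefix. The bucket name and prefix are hypothetical placeholders; no AWS call is made here.

```python
import json

# Least-privilege IAM policy: allow reading objects under one S3 prefix
# and listing only that prefix. Bucket and prefix names are hypothetical.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadReportsPrefixOnly",
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::example-bucket/reports/*",
        },
        {
            "Sid": "ListBucketForPrefix",
            "Effect": "Allow",
            "Action": ["s3:ListBucket"],
            "Resource": "arn:aws:s3:::example-bucket",
            "Condition": {"StringLike": {"s3:prefix": ["reports/*"]}},
        },
    ],
}

print(json.dumps(policy, indent=2))
```

Such a document can then be attached to a user, group, or role, so that each principal receives only the access it needs.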

4.8.2 Share security responsibility with AWS

AWS operates under a shared security responsibility model: AWS is responsible for the security of the underlying cloud infrastructure, and you are responsible for securing the workloads you deploy on AWS. By using AWS managed services, you can reduce the scope of your responsibility and focus on your core competencies. For example, when you use services such as Amazon RDS and Amazon ElastiCache, security patches are applied automatically for your configuration settings. This not only reduces your team's operational overhead, it can also reduce your exposure to vulnerabilities.

4.8.3 Reduce privileged access

When your servers become programmable resources, you gain many security advantages. Because a server's functionality can be changed at any time, you can eliminate the need for guest operating system access to the production environment. If an instance experiences a problem, you can automatically or manually terminate and replace it. However, before replacing an instance, you should collect and centrally store log data that can help you recreate the problem in your development environment and deploy the fix through your continuous deployment process. This approach ensures that log data assists with troubleshooting and raises awareness of security events, which is particularly important in an elastic computing environment where servers are temporary. You can use Amazon CloudWatch Logs to collect this information. Where you don't have direct access, you can implement services such as AWS Systems Manager to gain a unified view and automate actions on groups of resources. You can integrate these requests with your ticketing system so that access requests are tracked and dynamically handled only after approval.

Another common security risk is the use of stored long-term credentials or service accounts. In a traditional environment, service accounts are often assigned long-term credentials that are stored in configuration files. On AWS, you can instead use IAM roles to grant permissions to applications running on EC2 instances through the use of short-term credentials that are automatically distributed and rotated. For mobile applications, you can use Amazon Cognito to allow client devices to access AWS resources via temporary tokens with fine-grained permissions.
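The short-term-credential model rests on a role trust policy that allows the EC2 service to assume a role on an instance's behalf. A minimal sketch, built as a plain Python dict (the service principal and action are the standard ones; no AWS call is made here):

```python
import json

# Trust policy allowing EC2 instances to assume a role. Instances with this
# role then receive automatically rotated short-term credentials, instead of
# long-term keys stored in configuration files.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "ec2.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# With boto3, this document would be passed as the AssumeRolePolicyDocument
# argument of iam.create_role(...) when creating the role.
print(json.dumps(trust_policy))
```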

As an AWS Management Console user, you can similarly provide federated access through temporary tokens instead of creating IAM users in your AWS account. Then, when an employee leaves your organization and is removed from the organization's identity directory, that employee automatically loses access to your AWS account.

4.8.4 Security as code

Traditional security frameworks, regulations, and organizational policies define security requirements related to things such as firewall rules, network access controls, internal/external subnets, and operating system hardening. You can implement these in an AWS environment as well, but you now have the opportunity to capture them all in a template that defines a golden environment. AWS CloudFormation takes this template and deploys resources in accordance with your security policy. You can reuse security best practices across multiple projects as part of your continuous integration pipeline. You can perform security testing as part of your release cycle and automatically discover application gaps and drift from your security policy.
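As an illustration, a golden-environment rule such as "only HTTPS, and only from the load balancer subnet" can be captured once in an AWS CloudFormation template and reused across projects. A minimal sketch (the logical names and the CIDR range are hypothetical, and `AppVpc` is assumed to be defined elsewhere in the template):

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Resources:
  WebSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Allow HTTPS only, from the load balancer subnet
      VpcId: !Ref AppVpc          # AppVpc assumed to be defined elsewhere
      SecurityGroupIngress:
        - IpProtocol: tcp
          FromPort: 443
          ToPort: 443
          CidrIp: 10.0.1.0/24     # hypothetical load balancer subnet
```

Because the rule lives in version control alongside application code, a change to it goes through the same review and deployment pipeline as any other change.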

Additionally, for greater control and security, AWS CloudFormation templates can be imported as products into AWS Service Catalog. This allows you to centrally manage resources to support consistent governance, security, and compliance requirements, while enabling your users to quickly deploy only the approved IT services they need. You can apply IAM permissions to control who can view and modify products, and you can define constraints to restrict the ways that specific AWS resources can be deployed for a product.

4.8.5 Real-time auditing

Testing and auditing your environment are key to moving fast while staying safe. Traditional approaches that involve periodic (often manual or sample-based) checks are not sufficient, especially in agile environments where change is constant. On AWS, you can implement continuous monitoring and automation of controls to minimize exposure to security risks. Services such as AWS Config, Amazon Inspector, and AWS Trusted Advisor continually monitor for compliance or vulnerabilities, giving you a clear overview of which IT resources are in compliance and which are not. With AWS Config rules you will also know whether a resource was out of compliance even for a brief period of time, making both point-in-time and period-in-time audits very effective.
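The evaluation logic behind a custom AWS Config rule can be a small pure function. The sketch below shows only the compliance check itself: it flags security groups that allow SSH (port 22) from anywhere. The event parsing and the `put_evaluations` API call that a real Lambda-backed rule requires are omitted, and the configuration-item shape is simplified for illustration.

```python
# Sketch of a custom AWS Config rule check: flag security groups that allow
# SSH (port 22) from 0.0.0.0/0. `config_item` is a simplified stand-in for
# the configuration item AWS Config delivers to a rule's Lambda function.
def evaluate_compliance(config_item):
    for permission in config_item.get("ipPermissions", []):
        open_to_world = any(
            r.get("cidrIp") == "0.0.0.0/0"
            for r in permission.get("ipRanges", [])
        )
        if permission.get("fromPort") == 22 and open_to_world:
            return "NON_COMPLIANT"
    return "COMPLIANT"

bad_sg = {"ipPermissions": [{"fromPort": 22, "toPort": 22,
                             "ipRanges": [{"cidrIp": "0.0.0.0/0"}]}]}
good_sg = {"ipPermissions": [{"fromPort": 443, "toPort": 443,
                              "ipRanges": [{"cidrIp": "0.0.0.0/0"}]}]}

print(evaluate_compliance(bad_sg))   # NON_COMPLIANT
print(evaluate_compliance(good_sg))  # COMPLIANT
```

Because the check is a pure function, it can be unit-tested on its own, independent of the AWS Config delivery machinery.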

You can implement extensive logging for your applications (using Amazon CloudWatch Logs) and for the actual AWS API calls by enabling AWS CloudTrail. AWS CloudTrail is a web service that records API calls to supported AWS services in your AWS account and delivers a log file to your S3 bucket. Log data can then be stored in an immutable manner and automatically processed to either send notifications or take actions on your behalf, protecting your organization from non-compliance. You can use AWS Lambda, Amazon EMR, Amazon ES, Amazon Athena, or third-party tools from AWS Marketplace to scan log data to detect events such as unused permissions, overuse of privileged accounts, usage of keys, anomalous logins, policy violations, and system abuse.
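Automated processing of CloudTrail log data can be as simple as filtering records for risky patterns. A sketch that scans records for console sign-ins made without MFA (the `eventName` and `additionalEventData.MFAUsed` fields follow the CloudTrail record format; the sample records themselves are fabricated for illustration):

```python
# Scan CloudTrail records for console sign-ins that did not use MFA and
# return the ARNs of the identities involved.
def logins_without_mfa(records):
    findings = []
    for record in records:
        if (record.get("eventName") == "ConsoleLogin"
                and record.get("additionalEventData", {}).get("MFAUsed") == "No"):
            findings.append(record.get("userIdentity", {}).get("arn", "unknown"))
    return findings

# Fabricated sample records for illustration.
sample = [
    {"eventName": "ConsoleLogin",
     "additionalEventData": {"MFAUsed": "No"},
     "userIdentity": {"arn": "arn:aws:iam::111122223333:user/alice"}},
    {"eventName": "ConsoleLogin",
     "additionalEventData": {"MFAUsed": "Yes"},
     "userIdentity": {"arn": "arn:aws:iam::111122223333:user/bob"}},
]

print(logins_without_mfa(sample))  # only the non-MFA login is flagged
```

In practice such a filter might run inside an AWS Lambda function triggered as CloudTrail delivers log files to S3, with findings forwarded to a notification topic or a ticketing system.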

Conclusion

When designing your cloud architecture on AWS, it is important to consider the key principles and design patterns available in AWS, including how to select the right database for your application and how to architect applications that can scale horizontally and with high availability. Cloud computing architecture is a broad and constantly evolving topic.

Description:

This white paper was translated by Fu Yubin, an operations and maintenance engineer at New Titanium Cloud Service. New Titanium Cloud Service has a number of AWS-certified engineers, has rich experience in using and maintaining AWS, and has provided AWS cloud support for many users.

Original text link:

https://d1.awsstatic.com/whitepapers/AWS_Cloud_Best_Practices.pdf
