AWS Solutions Architect Professional Exam Notes and Prep Guide

I set my sights on the SA Pro cert a while ago, but for multiple reasons I couldn't find the time to sit down and study until early this year. On April 6th I finally sat the exam and passed with a score of 859. Here's my account of how I prepared for it, what the exam felt like, and a ton of notes I took about small technical details that can make a difference in a question.

My Study Materials and Strategy

While I had some experience as a freelance architect and AWS Authorized Instructor, the past year saw me working a lot with code and GCP, and barely even touching AWS, so I knew I needed a full course that would help me remember the basics (in case I had forgotten anything) and also level up on the advanced stuff. I chose Adrian Cantrill's AWS Certified Solutions Architect - Professional course for that, and it was excellent, though quite long.

It took me over a month and a half to get through Adrian's course, but between his excellent lessons and demos I felt in a pretty good place afterwards. However, I knew something must be lacking, from my memory if not from the course, so I signed in to AWS Skill Builder and found the Exam Readiness: AWS Certified Solutions Architect – Professional course. It says 4 hours, but I think you should take at least 6: the course doesn't give you any new knowledge, but it helps you reflect on what you're missing and identify your weaknesses, and that's what's going to drive your next steps.

Identifying and Addressing Weaknesses

My weaknesses weren't concentrated in a single area; those I had identified earlier and covered by re-watching Adrian's lessons as many times as necessary (I think I watched the Direct Connect ones 4 or 5 times). Instead of not knowing one service or one kind of solution, my weaknesses were all over the place: not in the general aspects, but in the smallest details that mattered.

Some of the not so small details:

  • If you're connecting Direct Connect to a VPC without a VPN, should you use a public or private VIF? What about when using a site-to-site VPN? Answer: private when going to the VPC directly, public when using a VPN, because Site-to-Site VPN is a public service (i.e. not in a VPC, same as S3 for example).

  • Is Kinesis Firehose able to stream data in real time? Answer: No, it has a minimum 60-second latency and is considered near-real time, NOT real time.

Some of the much smaller ones:

  • In ALB, can you associate multiple SSL certificates with the same listener? If so, how will the listener choose the correct certificate? Answer: Yes, and the listener automatically chooses the correct cert using SNI.

  • Is data ordered in a Kinesis Data Stream? Answer: Yes within a shard, but not across shards.

  • In SQS with a retention period of 7 days, if a message is moved to the DLQ 5 days after being enqueued, when will it be deleted? Answer: In 2 days, because the retention period checks the enqueue timestamp, which is unchanged when a message is moved to the DLQ.




Focusing on Practice Exams

So I knew I was lacking, but I didn't even know the questions that I should seek answers to. I tried going to the FAQs, but let me tell you, those are SUPER LONG and full of A TON of info that's probably not relevant to the exam (though at the professional level you should assume everything is relevant). After about half an hour of just reading the FAQs and getting terribly bored, I went online to search for practice exams, so I could make my own mistakes and learn that way. I found the AWS Certified Solutions Architect Professional Practice Exams 2022 on TutorialsDojo and purchased them.

On a brief note, TutorialsDojo's practice exams are excellent, even if they're not perfect. Most answers are correct and the explanations are really good. I did find one or three that were ridiculous or outright technically impossible. Still, one or three among 375 (4 practice exams + 1 final exam) is very good. Just keep in mind that, when in doubt, you should look up the documentation and try to find the correct answer by yourself.

Benefits of Practice Exams

At this point, doing practice exams is by far the best thing that you can do, in my opinion. Making your own mistakes (TutorialsDojo does tell you which questions you got right or wrong, what the correct answer is, and why) really helps you to recall those small details that make a difference. Plus, you can do half of an exam, or just 10 questions, whenever you have the time. I do recommend doing at least one or two full, timed exams, but you don't have to do either a 3-hour study session or nothing at all; if all you have is 30 minutes, it's better to answer 5 or 10 questions than not doing anything. Also, write everything down, so you can go over your notes later.

Another huge thing about practice exams is that you get to practice timing yourself. You get 180 minutes for 75 questions, which is 2 minutes and 24 seconds per question. If that doesn't sound like much, it's because it isn't. Most questions are very long, much longer than in the SA Associate exam, and the correct answer often depends on a word or two. You'll find yourself scanning through answers 4 or 5 lines long that seem exactly the same, until you find the difference: a private subnet vs. a public subnet, for example. Other times, what seems to be the best answer actually has a detail that means it won't work. For example, one answer might describe setting up the application in a private subnet and adding an interface VPC endpoint to access DynamoDB, while the other will talk about putting the application in a public subnet and using a gateway VPC endpoint to access DynamoDB. If you're not careful, you might miss the fact that DynamoDB doesn't support interface VPC endpoints, only gateway VPC endpoints.

Timing Strategy

For the SA Pro exam, I recommend spending the first 90 minutes reading through all 75 questions, and answering only the ones that you're 100% sure of. Flag the others for review, and take a quick break if needed. Then, go back to the flagged questions and spend the next 60 minutes trying to figure them out. Finally, spend the last 30 minutes going over all the questions again, reviewing your answers and ensuring you haven't missed any small details. This approach worked well for me and helped me manage my time effectively.

Final Thoughts

The AWS Solutions Architect Professional exam is challenging, but with the right study materials and practice exams, you can succeed. Adrian Cantrill's course, AWS SkillBuilder's Exam Readiness, and TutorialsDojo's practice exams were invaluable in my preparation. The key is to identify your weaknesses, focus on the small technical details, and practice your timing. Remember to always consult the documentation when in doubt, and take the time to learn from your mistakes. Best of luck in your certification journey!

Exam notes

The following are the notes I took on the very fine details for each service. They don't cover everything, just what I thought would be difficult and important to remember.

EBS

GP2

  • 1 IOPS = 1 IO (16 KB) in 1 second.

  • Max IO credits = 5.4 million. The credit bucket starts full and refills at the baseline performance rate: 3 IO credits per second per GB of volume size, with a minimum of 100.

  • Bursts up to 3000 IOPS, or the baseline rate if that's higher

  • Volumes above 1000 GB have baseline performance higher than 3000 IOPS and don't use credits.
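
The GP2 numbers above turn into quick back-of-the-envelope math in exam questions. Here's a minimal sketch (plain Python, no AWS calls) of how baseline IOPS and burst duration fall out of the 3-IOPS-per-GB rule:

```python
# GP2 back-of-the-envelope math, using the numbers from the notes above.
MAX_CREDITS = 5_400_000  # credit bucket starts full
BURST_IOPS = 3_000

def gp2_baseline_iops(size_gb: int) -> int:
    # Baseline = 3 IOPS per GB, with a minimum of 100
    return max(100, 3 * size_gb)

def burst_minutes(size_gb: int) -> float:
    # While bursting at 3,000 IOPS, the bucket drains at (burst - refill) per second
    baseline = gp2_baseline_iops(size_gb)
    if baseline >= BURST_IOPS:  # volumes of 1,000 GB or more never need credits
        return float("inf")
    return MAX_CREDITS / (BURST_IOPS - baseline) / 60

print(gp2_baseline_iops(100))     # 300 IOPS baseline for a 100 GB volume
print(round(burst_minutes(100)))  # ~33 minutes of full burst from a full bucket
```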

GP3

  • 3000 IOPS & 125 MiB/s standard (regardless of size)

  • Goes up to 16000 IOPS or 1000 MiB/s

  • Performance doesn't scale with size; you need to scale it separately. It's still around 20% cheaper than GP2

Provisioned IOPS

  • Consistent low latency & jitter

  • 64000 IOPS, 1000 MB/s (256000 IOPS & 4000 MB/s for Block Express)

  • 4 GB to 16 TB (64 TB for Block Express)

  • IO1: 50 IOPS/GB max. IO2: 500 IOPS/GB max.

  • IOPS can be adjusted independently of size

  • Real limitations for maximum performance between EBS and EC2:

  • Per instance performance: IO1: 260000 IOPS & 7500 MB/s, IO2: 160000 IOPS & 4750 MB/s, IO2 Block Express: 260000 IOPS & 7500 MB/s

  • Limitations on the EC2 instance type and size

  • Use cases: Small volumes with really high performance, extreme performance, latency-sensitive workloads

HDD

  • st1: cheaper than SSD, really bad at random access. Max 500 IOPS, but 1 MB per IO. Max 500 MB/s. 40 MB/s/TB base, 250 MB/s/TB burst. Size 125 GB to 16 TB. Use case: sequential access, big data, data warehouses, log processing.

  • sc1: even cheaper, but cold, designed for infrequent workloads. Max 250 IOPS but 1 MB per IO. Max 250 MB/s. 12 MB/s/TB base, 80 MB/s/TB burst. Size 125 GB to 16 TB.

Instance Store volumes

  • Block storage devices (like EBS) but local to the instance. Physically connected to one EC2 host. Instances on that host can access them.

  • Included in instance price (for instance types that have it), use it or waste it

  • Attached at launch

  • Ephemeral storage. If the instance moves between hosts, data in instance volumes is lost.

  • Size depends on type and size of instance

  • EC2 instance type D3 = 4.6 GB/s throughput

  • EC2 instance type I3 = 16 GB/s sequential throughput

  • How to choose between EBS and Instance Store:

  • Persistence, resilience, backups or isolation from instance lifecycle: choose EBS

  • Cost for EBS: ST1 or SC1 (both are hard disks)

  • Throughput or streaming: ST1

  • Boot volume: NOT ST1 or SC1

  • Up to 16000 IOPS: GP2/3

  • Up to 64000 IOPS: IO2

  • Up to 256000 IOPS: IO2 Block Express

  • Up to 260000 IOPS: RAID0 + EBS (IO1/2-BE/GP2/3) (this is the max performance of an EC2 instance)

  • More than 260000 IOPS: Instance Store (but it's not persistent)

  • EBS volumes support encryption, but it's NOT enabled by default.

EC2

Placement groups

  • Cluster: Same rack, highest network performance, one AZ, supported instance types only, for fast speeds and low latency
  • Spread: always different racks, 7 instances per AZ, for critical instances
  • Partition: Max 7 partitions, each can have more than 1 instance, great for topology-aware apps like HDFS, HBase and Cassandra
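
To anchor the three strategies, creating a placement group is a single boto3 call. A minimal sketch (the group name is made up):

```python
import boto3

ec2 = boto3.client("ec2")

# Partition placement group: up to 7 partitions per AZ, and each
# partition can hold multiple instances (good for HDFS/Cassandra).
ec2.create_placement_group(
    GroupName="analytics-pg",  # hypothetical name
    Strategy="partition",      # or "cluster" / "spread"
    PartitionCount=7,
)
```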

ELB:

  • GWLB:

  • L3 LB for ingress/egress security scans

  • To pass traffic through scalable 3rd party appliances, using GENEVE protocol.

  • Uses GWLB Endpoint, which can be added to a RT as a next hop.

  • Packets are unaltered.

ALB:

  • Can have multiple SSL certificates associated with a secure listener and will automatically choose the optimal certificate using SNI.
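
For reference, attaching an extra certificate to an existing listener is one API call; the SNI-based selection then happens automatically. A minimal boto3 sketch (both ARNs are placeholders):

```python
import boto3

elbv2 = boto3.client("elbv2")

# Add a second certificate to a secure listener; the ALB then picks
# the right one per connection using the SNI hostname from the client.
elbv2.add_listener_certificates(
    ListenerArn="arn:aws:elasticloadbalancing:...:listener/app/my-alb/...",  # placeholder
    Certificates=[{"CertificateArn": "arn:aws:acm:...:certificate/..."}],    # placeholder
)
```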

DynamoDB:

Local Secondary Indexes (LSI)

  • Can only be created when creating the table

  • Use the same PK but a different SK

  • Aside from keys, can project none, some or all attributes

  • Share capacity with the table

  • Are sparse: only items with values in PK and SK are projected

  • Support strongly consistent reads (unlike GSIs).

Global Secondary Indexes (GSI)

  • Can be created at any time

  • Different PK and SK

  • Own RCU and WCU allocations

  • Aside from keys, can project none, some or all attributes

  • Are sparse: only items with values in PK and SK are projected

  • Are always eventually consistent, replication between base table and GSI is async.

  • On LSIs you can query attributes that aren't projected (DynamoDB fetches them from the base table, which is expensive); GSIs can only return projected attributes.

DynamoDB Streams

  • A Kinesis-like stream with a 24-h rolling window of time-ordered item changes in a table

  • Enabled on a per-table basis

  • Records INSERTS, UPDATES and DELETES

  • Different view types: KEYS_ONLY, NEW_IMAGE, OLD_IMAGE and NEW_AND_OLD_IMAGES.
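
Enabling a stream on an existing table is a single UpdateTable call. A minimal boto3 sketch, assuming a hypothetical table named orders:

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Enable a stream that records both the old and new item images.
dynamodb.update_table(
    TableName="orders",  # hypothetical table
    StreamSpecification={
        "StreamEnabled": True,
        "StreamViewType": "NEW_AND_OLD_IMAGES",
    },
)
```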

Athena

  • Serverless interactive querying service

  • No base cost: you only pay for the data scanned by each query

  • Schema-on-read table-like translation

  • Original data never changed, remains on S3

  • Schema translates data to relational-like when reading

  • Can also query AWS logs, web server logs or Glue Data Catalogs

  • Can use Athena Federated Query (a Lambda-based connector) to query data sources other than S3.
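
Because Athena is schema-on-read, running a query is just submitting SQL and pointing the results at a bucket. A hedged sketch (database, table and bucket names are all made up):

```python
import boto3

athena = boto3.client("athena")

# Submit a query; Athena reads the data in place on S3 and writes
# the results to the output location. Billed per data scanned.
response = athena.start_query_execution(
    QueryString="SELECT status, COUNT(*) FROM access_logs GROUP BY status",
    QueryExecutionContext={"Database": "web_logs"},                     # hypothetical
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},  # hypothetical
)
print(response["QueryExecutionId"])
```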

Kinesis

Data stream

  • Sub-1-second

  • Custom processing per record

  • choice of stream processing framework

  • Multi-shard

  • 1 shard = 1 MB/s ingestion and 2 MB/s consumption

  • Order is guaranteed within the shard, but not across shards

  • 24h (up to 7d for more $$$) rolling window

  • Multiple consumers
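
The 1 MB/s in and 2 MB/s out per-shard limits above are exactly what capacity questions test. A tiny sketch of the arithmetic:

```python
import math

# Per-shard limits from the notes above.
INGEST_MB_S = 1.0
CONSUME_MB_S = 2.0

def shards_needed(ingest_mb_s: float, consume_mb_s: float) -> int:
    # Size for whichever side (ingestion or consumption) needs more shards.
    return max(math.ceil(ingest_mb_s / INGEST_MB_S),
               math.ceil(consume_mb_s / CONSUME_MB_S))

print(shards_needed(5, 8))  # 5 shards: ingestion dominates (5 vs 4)
```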

Firehose

  • Connects to a data stream or ingests from multiple sources

  • Zero admin (automatically scalable, serverless and resilient)

  • 60+ seconds latency

  • Delivers data to existing analytics tools: HTTP endpoints (e.g. Splunk), ElasticSearch/OpenSearch, S3, and Redshift (through an intermediate S3 bucket)

  • Order is guaranteed

  • Supports transformation of data on the fly

  • Billed by data streamed.

Difference between SQS and Kinesis data streams

  • SQS has 1 producer group and 1 consumer group, and once a message is consumed it's deleted

  • It's typically used to decouple async communication

  • Kinesis is designed for huge-scale ingestion with multiple consumers within the rolling window

  • It's designed for data ingestion, analytics, monitoring, app clicks, and streaming

Kinesis Data Analytics

  • real-time processing of data using SQL

  • Ingests from Data streams or Firehose or S3, processes it and sends to Data streams, Lambda or Firehose

  • It fits between 2 streams and allows you to use SQL to modify the data

Elastic MapReduce (EMR)

  • Managed implementation of Hadoop, Spark, HBase, Presto, Flink, Hive and Pig.

  • Huge-scale parallel processing

  • Two phases: Map and Reduce. Map: Data is separated into 'splits', each assigned to a mapper. Perform customized operations at scale. Reduce: Recombine data into results.

  • Can create clusters for long-term usage or ad-hoc (transient) usage.

  • Runs in one AZ in a VPC (NOT HA) using EC2 for compute

  • Auto scales and can use spot, instance fleet, reserved and on-demand.

  • Loads data from S3 and outputs to S3.

  • Uses Hadoop File System (HDFS)

  • Data stored across multiple data nodes and replicated between nodes for fault tolerance.

Node types

  • Master (at least 1): Manages the cluster and health, distributes workloads and controls access to HDFS and SSH access to the cluster. Don't run in spot.

  • Core (0 or more): Are the data nodes for HDFS, run task trackers and can run map and reduce tasks. HDFS runs in instance store. Don't run in spot.

  • Task nodes (0 or more): Only run tasks, don't run HDFS or task trackers. Ideal for spot instances.

EMRFS

  • Is a file system for EMR

  • backed by S3 (regionally resilient)

  • persists past the lifetime of the cluster and is resilient to core node failure

  • It is slower than HDFS (S3 vs Instance Storage)

Redshift

  • Petabyte-scale data warehouse

  • OLAP (column-based, not OLTP: row/transaction)

  • Designed to aggregate data from OLTP DBs

  • NOT designed for real-time ingestion, but for batch ingestion.

  • Provisioned (server-based)

  • Single-AZ (not HA).

  • Automatic snapshots to S3 every 8h or 5GB with 1d (default) to 35d retention, plus manual snapshots, make the data resilient to AZ failure. Can be configured to be copied to another region.

  • DMS can migrate into Redshift and Firehose can stream into redshift.

  • Redshift Spectrum: Directly query data in S3. Federated query: Directly query data in other DBs.

  • For ad-hoc querying use Athena.

  • Can copy encrypted snapshots to another region by configuring a snapshot copy grant for the master key in the other region.

Node types in Redshift

  • Leader node: Query input, planning and aggregation. Applications interact with the leader node using ODBC or JDBC.

  • Compute node: performing queries of data. They have slices with the data, replicated to 1 additional node.

AWS Batch

  • Lets you worry about defining batch jobs, handles the compute.

  • Job: script, executable or docker container. The thing to run.

  • Job definition: Metadata for a job, including permissions, resource config, mount points, etc.

  • Job queue: Jobs are added to queues, where they wait for compute capacity. Capacity comes from 1+ compute environments.

  • Compute environment: managed or unmanaged compute, configurable with instance type/size, vCPU amount and max spot price; unmanaged environments use an existing ECS cluster.

  • Managed compute environment: Batch manages capacity; you pick on-demand or spot, instance size/type, max spot price. Runs in a VPC, and can run in a private VPC if you provide the gateways.

  • Unmanaged compute environment: You create everything and manage everything outside of Batch (with ECS).

  • Jobs can come from Lambda API calls, Step Functions integration or API call, target of EventBridge (e.g. from S3).

  • When completed, can store data and metadata in S3 and DynamoDB, can continue execution of Step Functions, or post to Batch Event Stream.

Difference between AWS Batch and AWS Lambda

  • Lambda has a 15-min execution limit, a 10 GB disk space limit (as of 2022/03/24; this probably hasn't reached the exam yet, the previous limit was 512 MB) and limited runtimes

  • Batch uses Docker (so any runtime) and has no such resource limits.

ElastiCache

  • Redis: advanced data structures, persistent, multi-az, read replicas, can scale up but not out (and can't scale down), backups and restores. Highly available (multi-az)

  • Memcached: simple K/V, non-persistent, can scale up and out (multiple nodes), multi-thread, no backup/restore. NOT highly available

EFS and FSx

FSx for Windows

  • ENIs injected into VPCs

  • Native Windows FS

  • needs to be connected with Directory Service or self-managed AD

  • Single or Multi-AZ

  • on-demand and scheduled backups

  • accessible using VPC, VPN, peering, direct connect

  • Encryption at rest (KMS) and in transit

  • Keywords: VSS, SMB, DFS

FSx for Lustre

  • ENIs injected into VPCs

  • HPC for Linux (POSIX)

  • Used for ML, big data or financial

  • 100s GB/s

  • deployment types: Scratch (short term, no replication) and Persistent (longer term, HA in one AZ, self-healing)

  • Available over VPN or direct connect

  • Data is lazy loaded from S3 and can sync back to S3

  • < 1 ms latency.

EFS

  • NFSv4 FS for Linux

  • Mount targets in VPC

  • General purpose and Max I/O modes

  • Bursting and Provisioned throughput modes (separate from size)

  • Standard and IA storage classes.

  • Note on FSx for Windows: you can't change the deployment type (Single-AZ or Multi-AZ) after the file system has been created. To migrate to Multi-AZ, create a new file system and use DataSync to replicate the data.

QuickSight

  • BA/BI tool for visualizations and ad-hoc analysis.

  • Supports discovery and integration with AWS or external data sources

  • Used for dashboards or visualization.

SQS

Visibility timeout

  • Default is 30s

  • can be between 0s and 12h

  • Set on queue or per message.

Extended client library

  • For messages over the SQS max size (256 KB)

  • Allows larger payloads (up to 2 GB) stored in S3

  • SendMessage uploads to S3 automatically and stores the link in the message

  • ReceiveMessage loads payload from S3 automatically

  • DeleteMessage also deletes payload in S3

  • Exam often mentions Java.
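
The exam usually name-drops the Java library, but the pattern itself is easy to picture. The sketch below is a hand-rolled Python illustration of what the library automates (bucket and queue names are placeholders), not the library's actual API:

```python
import json
import uuid
import boto3

s3 = boto3.client("s3")
sqs = boto3.client("sqs")

def send_large_message(queue_url: str, bucket: str, payload: bytes) -> None:
    # What the Extended Client Library does under the hood:
    # store the oversized payload in S3 and enqueue a pointer to it.
    key = f"sqs-payloads/{uuid.uuid4()}"
    s3.put_object(Bucket=bucket, Key=key, Body=payload)
    sqs.send_message(
        QueueUrl=queue_url,
        MessageBody=json.dumps({"s3Bucket": bucket, "s3Key": key}),
    )
```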

Delay queues

  • Postpone delivery of message (only in Standard queues)

  • Set DelaySeconds and messages will be added immediately to the queue but will only be visible after the delay

  • Min (default) is 0s, max is 15m.
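
Per-message delay is just the DelaySeconds parameter on SendMessage. A minimal boto3 sketch (the queue URL is a placeholder):

```python
import boto3

sqs = boto3.client("sqs")

# The message is enqueued immediately but stays invisible for 10 minutes.
sqs.send_message(
    QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/my-queue",  # placeholder
    MessageBody="process-later",
    DelaySeconds=600,  # 0s (default) up to 15m
)
```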

Dead-letter queues

  • Every time a message is received (or visibility timeout expires) in a queue, ReceiveCount is increased

  • When ReceiveCount > maxReceiveCount a message is moved to the dead-letter queue

  • Enqueue timestamp is unchanged (so Retention period is time at queue + time at DL queue).
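
The DLQ wiring lives in the source queue's RedrivePolicy attribute. A sketch with placeholder URL and ARN:

```python
import json
import boto3

sqs = boto3.client("sqs")

# After 5 failed receives, messages move to the DLQ. Their original
# enqueue timestamp is kept, so set the DLQ retention LONGER than
# the source queue's, or redriven messages may expire almost immediately.
sqs.set_queue_attributes(
    QueueUrl="https://sqs.us-east-1.amazonaws.com/123456789012/my-queue",  # placeholder
    Attributes={
        "RedrivePolicy": json.dumps({
            "deadLetterTargetArn": "arn:aws:sqs:us-east-1:123456789012:my-dlq",  # placeholder
            "maxReceiveCount": "5",
        })
    },
)
```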

FIFO queue

  • 3000 messages per second limit (with batching; 300 messages per second without).

Amazon MQ

  • Managed message broker based on the open-source Apache ActiveMQ

  • JMS API with protocols such as AMQP, MQTT, OpenWire and STOMP

  • Provides queues and topics.

  • Runs in VPC with single instance or HA pair (active/standby)

  • Comparison with SNS and SQS: SNS and SQS use AWS APIs, public, highly scalable, AWS integrated. Amazon MQ is based on ActiveMQ and uses protocols JMS, AMQP, MQTT, OpenWire and STOMP. Look for protocols in the exam. Also, SNS and SQS for new apps, Amazon MQ for migrations with little to no app change.

Lambda

  • FaaS, short-running (default 3s, max 15m)

  • Function = piece of code + wrapping and config

  • It uses a runtime (Python, Ruby, Java, Go and C#), and it's loaded and run in a runtime environment

  • The environment has a direct memory (128MB to 10240 MB), indirect CPU and instance storage (default 512 MB, max 10 GB) allocation.

  • Docker is an anti-pattern for Lambda; Lambda container images are a different thing and are supported.

  • Used for serverless apps, file processing (S3 events), DB triggers (DynamoDB), serverless cron (EventBridge), realtime stream data processing (Kinesis).

  • By default Lambda runs in the public space and can't access VPC services

  • It can also run inside a VPC (needs EC2 Network permissions)

  • Technically Lambda runs in a separate (shared) VPC, creates an ENI in your VPC per function (NOT per invocation) and uses an NLB, with 90s for initial setup and no additional invocation delay.

  • Lambda uses an Execution role (IAM role) which grants permissions.

  • Also a resource policy can control what can invoke the lambda.

  • Lambda logs to CW Logs, posts metrics to CW and can use X-Ray. Needs permissions for this, in the Execution role.

  • Context includes runtime + variables created before handler + /tmp. Context can be reused, but we can't control that, must assume new context.

  • Cold start: Provision HW, install environment, download code, run code before handler. Can pre-warm using Provisioned Concurrency. You are NOT billed for cold-start time (not even for code before the handler).

  • Execution process: Init (cold start) (if necessary), Invoke (runs the function Handler), Shutdown (terminate environment).

  • Lambda + ALB: ALB synchronously invokes Lambda (automatically translates HTTP(s) request to Lambda event).

  • Multi-value headers (when using ALB + Lambda): Groups query string values by key, e.g. http://a.io?&search=a&search=b is passed as multiValueQueryStringParameters: {"search": ["a","b"]}. Without multi-value headers, only the last value is sent, e.g. "queryStringParameters": {"search":"b"}

Lambda layers

  • Share and reuse code by externalising libraries, which are shared between functions

  • Also allows new, unsupported runtimes such as Rust

  • Deployment zip only contains specific code (is smaller)

  • Can use AWS layers or write your own.

Lambda versions

  • A function has immutable versions

  • Each has its own (qualified) ARN; the unqualified ARN points to $Latest

  • Each includes code + config (including env vars)

  • $Latest points at the latest version, and aliases like Dev, Stage, Prod can be created and updated

  • A version is created when a Lambda is published, but it can be deployed without being published

  • An alias can also use weighted routing, sending a % of traffic to one version and the rest to another (see the sketch below).
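
Weighted routing is configured on the alias itself. A minimal boto3 sketch, assuming a hypothetical order-processor function with versions 5 and 6 already published:

```python
import boto3

lam = boto3.client("lambda")

# "prod" alias: 90% of invocations go to version 5, 10% to version 6.
lam.create_alias(
    FunctionName="order-processor",  # hypothetical
    Name="prod",
    FunctionVersion="5",
    RoutingConfig={"AdditionalVersionWeights": {"6": 0.1}},
)
```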

Lambda invocation types

  • Sync: CLI/API invokes and waits for response. Same is used through API Gateway. Client handles errors or retries.

  • Async: Typical when AWS services invoke Lambdas, such as S3. Lambda handles retries (configurable 0-2 times). Must be idempotent!! Events can be sent to a DLQ and to destinations (SQS, SNS, Lambda and EventBridge).

  • Event Source Mapping: Kinesis data streams sends batches of events to Lambdas using Event Source Mapping. Lambda needs permissions to access the source (which are used on its behalf by Event Source Mapping). Can use DLQ for failed events.

Lambda container images

  • Include Lambda Runtime API (to run) and Runtime Interface Emulator (to local test) in the container image

  • Image is built and pushed to ECR, then operates as normal.

API GW

  • Highly available

  • scalable

  • handles auth (directly with Cognito or with a Lambda authorizer), throttling, caching, CORS, transformations, OpenAPI spec, direct integration.

  • Can connect to services in AWS or on-prem

  • Supports HTTP, REST and WebSocket

Endpoint types

  • Edge-optimized: Routed to the nearest CloudFront POP

  • Regional: Clients in the same region

  • Private: Endpoint accessible only within a VPC via interface endpoint

  • APIs are deployed to stages, each stage has one deployment. Stages can be environments (dev, prod) or version (v1, v2). Each stage has its own config. They are NOT immutable, can be changed and rolled back. Stages can be enabled for canary deployments.

  • 2 phases: Request: Authorize, validate and transform. Response: transform, prepare and return. Request is called method request and is converted to integration request, which is passed to the backend. Response is called integration response, which is converted to method response and is returned to the client.

Types of integration

  • Mock: For testing, no backend

  • HTTP: Set translation for method->integration request and integration->method response in the API GW

  • HTTP Proxy: Pass through request unmodified, return to the client unmodified (backend needs to use supported format)

  • AWS: Exposes AWS service actions

  • AWS_PROXY(Lambda): Low admin overhead Lambda endpoint.

  • Mapping templates: Used for AWS and HTTP (non-PROXY) integrations. Modify or rename parameters, body or headers of the request/response. Uses Velocity Template Language (VTL). Can transform a REST request to a SOAP API.

  • API GW has a timeout of 29s (can’t be increased)

Errors

  • 4XX: Client error, invalid request on client side

  • 5XX: Server error, valid request, backend issue

  • 400: Bad request, generic

  • 403: Access denied, authorizer denies or WAF filtered

  • 429: API GW throttled the request

  • 502: Bad GW, bad output returned by backend

  • 503: Service unavailable, backend probably offline

  • 504: Integration failure/timeout, 29s limit achieved.

  • Cache: TTL 0s to 3600s (default 300s), 500 MB to 237 GB, can be encrypted. Defined per stage. Request only goes to backend if cache miss.

  • Payload limit: 10 MB

CloudFront

  • Private behaviors: A behavior can be made private if it uses a Trusted Signer (key created by root user). It will require a signed URL (access to 1 object) or signed cookie (access to the whole origin).

  • Origin Access Identity: Set an identity to the CloudFront behavior and only allow that identity in the origin (e.g. S3 bucket).

Storage Gateway

File Gateway

  • Access S3/Glacier through NFS and SMB protocols

  • Only the most recent data is stored (cached) on prem

  • NOT low-latency, because it needs to fetch data from S3 to on-prem.

Tape Gateway

  • Access S3/Glacier through an iSCSI VTL (virtual tape library)

  • Mainly used for archiving

  • Backed by Glacier, so can't consume in real time.

Stored-Volume Gateway

  • iSCSI-mounted volume stored on-prem and async backed to S3 as EBS snapshots

  • 16 TB per volume, max 32 volumes per gateway = max 512 TB

  • Is low latency, since all data is stored on prem, S3 is just used as backup of EBS snapshots of the volume.

Cached-Volume Gateway

  • iSCSI-mounted volume stored in S3 and cached on-prem

  • 32 TB per volume, max 32 volumes per gateway = 1024 TB

  • Data is stored on S3, NOT on-prem

  • On-prem only has a cache of the data, so low latency will only work for the cached data, not all data.

Migrations

6R

  • Retain: Stays on prem, no migration for now, revisit in the future

  • Re-host: Lift and shift with no changes

  • Refactor: Architect brand new, cloud native app. Lots of work

  • Re-platform: Lift and shift with some tinkering

  • Replace: Buy a native solution (not build one)

  • Retire: Solution is no longer needed, it's not replaced with something else

Migration process:

Migration Plan

  • Discovery: making sure we know what's really happening, identify dependencies, check all the corners and ask the questions

  • Assessment and profiling, data requirements and classification, prioritization, business logic and infrastructure dependencies

  • Design: Detailed migration plan, effort estimation, security and risk assessment.

  • Tools: AWS Application Discovery Service, AWS Database Migration Service.

Migration Build

  • Transform: Network topology, migrate, deploy, validate

  • Transition: Pilot testing, transition to support, release management, cutover and decommission.

Migration Run

  • Operate: Staff training, monitoring, incident management, provisioning

  • Optimize: Monitoring-driven optimization, continuous integration and continuous deployment, well-architected framework.

S3

S3 Object Lock (requires versioning)

  • Legal Hold: Turn on or off. Object versions can't be deleted or modified while turned on, can be turned off.

  • Retention Compliance: Set a duration, object versions can't be deleted or modified for the duration. Can't be disabled, not even by root.

  • Retention Governance: Set a duration, object versions can't be deleted or modified for the duration. Special permissions allow changing the policy.
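
A retention mode and retain-until date can be set per object version at upload time. A hedged boto3 sketch (the bucket is hypothetical and must already have Object Lock, and therefore versioning, enabled):

```python
from datetime import datetime, timezone
import boto3

s3 = boto3.client("s3")

# Governance mode: special permissions (s3:BypassGovernanceRetention)
# can still shorten or remove the lock; compliance mode cannot be undone.
s3.put_object(
    Bucket="audit-records",  # hypothetical bucket with Object Lock enabled
    Key="2022/q1/report.pdf",
    Body=b"...",
    ObjectLockMode="GOVERNANCE",
    ObjectLockRetainUntilDate=datetime(2023, 1, 1, tzinfo=timezone.utc),
)
```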

Amazon Macie

  • Data security and privacy service. Identifies data that should be private.

  • Select S3 buckets, create a discovery job, set managed or custom data identifiers, post policy findings and sensitive data findings to EventBridge or Security Hub.

Interface and Gateway endpoints

  • Interface endpoints: ENI with private IP for traffic to services with PrivateLink

  • Gateway endpoints: Target for a route in the RT, only used for S3 or DynamoDB (see the sketch after this list)

  • NACL: Limit of 20 rules, can be increased to 40
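
A gateway endpoint is created against route tables rather than subnets. A minimal boto3 sketch for DynamoDB (the VPC and route table IDs are placeholders):

```python
import boto3

ec2 = boto3.client("ec2")

# Gateway endpoint: added as a route-table target, no ENI involved.
# Remember: S3 and DynamoDB only (DynamoDB had no interface endpoint
# option at the time of these notes).
ec2.create_vpc_endpoint(
    VpcId="vpc-0123456789abcdef0",            # placeholder
    ServiceName="com.amazonaws.us-east-1.dynamodb",
    VpcEndpointType="Gateway",
    RouteTableIds=["rtb-0123456789abcdef0"],  # placeholder
)
```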

Amazon DirectConnect (DX)

  • Public VIF: Used for AWS public services (including VPN)

  • Private VIF: Used for resources inside a VPC

Schema Conversion Tool (SCT)

  • You need to configure the data extraction agent first on your on-premises server.

Database Migration System (DMS)

  • can directly migrate the data to Amazon Redshift.

RDS

  • cross-region read replicas

  • multi-master: all master nodes need to be in the same region, and can't enable cross-region read replicas.

  • Max size: 64 TB

Step Functions

  • Does not directly support Mechanical Turk, in that case use SWF.

CloudSearch

  • Provides search capabilities, for example for documents stored in S3.

AWS Config

  • Can aggregate data from multiple AWS accounts using an Aggregator

  • Can only perform actions in the same AWS account.

Most important thing to remember

  • You can do it!!!

Master AWS with Real Solutions and Best Practices.
Join over 2500 devs, tech leads, and experts learning real AWS solutions with the Simple AWS newsletter.

  • Analyze real-world scenarios

  • Learn the why behind every solution

  • Get best practices to scale and secure them

Subscribe now and you'll get the AWS Made Simple and Fun ebook for free (valued at $10). Limited offer, don't wait!

If you'd like to know more about me, you can find me on LinkedIn or at www.guilleojeda.com
