This is a digital copy of my physical notes taken while studying for AWS Certifications.
This is the follow-up exam to SAA-C03.
- IAM
- Compute
- AppSync
- Route53 Health-checks
- Hybrid DNS & Resolver Rules
- AWS Global Accelerator
- AWS Outposts
- AWS Wavelength (5G)
- AWS Local Zones
- EBS Snapshots
- EBS Encryption
- EFS Access Points
- S3 Replication Time Control (RTC)
- FSX for Lustre Lazy Data Loading
- AWS DataSync
- RDS for Oracle
- High Volume Queue Processing
- Kinesis
- AWS Batch
- Amazon Elastic Map Reduce (EMR)
- Redshift
- Athena
- Cloudwatch Synthetic Canary
- Elastic Beanstalk
- Service Catalog
- AWS Compute Optimiser NEW!
- AWS Snow Family NEW!
- AWS Schema Conversion Tool (SCT) NEW!
- Snowball + DMS NEW!
- Disaster Recovery NEW!
- VPCs NEW!
- VPC Peering NEW!
- Transit Gateway NEW!
- VPC Endpoints NEW!
- VPC Endpoint Policies NEW!
- AWS Private Link NEW!
- Site to Site VPN NEW!
- Client VPN NEW!
- Direct Connect Virtual Interfaces NEW!
- Direct Connect Encryption NEW!
- Direct Connect Link Aggregation Groups (LAG) NEW!
- Direct Connect Site link NEW!
- Kinesis Video Streams NEW!
- Amazon Workspaces NEW!
- AWS Application Discovery Service NEW!
IAM
- Allowing with `NotAction` instead of using `Deny` avoids issues with wildcard actions.
- i.e. Allow with `NotAction: "iam:*"` instead of `Deny: "iam:*"`
- We can later `Allow: "iam:ListUsers"` without issue, whereas an explicit Deny would always override that Allow.
- When assuming a role you “give up” your original permissions
- When using a resource-based policy the principal (user/service) doesn’t need to give up any permissions.
- e.g. User A scans Dynamo in account A & then dumps to S3 in account B.
- If using a role, User A would need to give up permissions in A to access B.
- S3 and Dynamo have resource-based policies, so User A doesn’t need to give up permissions.
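The `NotAction` point above can be checked with a toy evaluator. This is a minimal sketch that models only two IAM rules (an explicit Deny beats any Allow; otherwise at least one Allow must match). The `is_allowed` helper and both sample policies are illustrative assumptions, not an AWS API, and resources and conditions are ignored.

```python
from fnmatch import fnmatchcase


def _matches(statement, action):
    """Return True if the statement's Action/NotAction element covers the action."""
    if "NotAction" in statement:
        patterns = statement["NotAction"]
        return not any(fnmatchcase(action.lower(), p.lower()) for p in patterns)
    return any(fnmatchcase(action.lower(), p.lower()) for p in statement["Action"])


def is_allowed(statements, action):
    """Toy model: explicit Deny wins, otherwise any matching Allow grants access."""
    matched = [s["Effect"] for s in statements if _matches(s, action)]
    if "Deny" in matched:
        return False
    return "Allow" in matched


# Variant 1: Deny iam:* and try to re-allow one IAM action later.
deny_based = [
    {"Effect": "Allow", "Action": ["*"]},
    {"Effect": "Deny", "Action": ["iam:*"]},
    {"Effect": "Allow", "Action": ["iam:ListUsers"]},
]

# Variant 2: Allow everything except iam:*, then allow one IAM action.
not_action_based = [
    {"Effect": "Allow", "NotAction": ["iam:*"]},
    {"Effect": "Allow", "Action": ["iam:ListUsers"]},
]

print(is_allowed(deny_based, "iam:ListUsers"))         # False
print(is_allowed(not_action_based, "iam:ListUsers"))   # True
print(is_allowed(not_action_based, "iam:CreateUser"))  # False
print(is_allowed(not_action_based, "s3:GetObject"))    # True
```

In the first variant the later Allow is useless because the Deny always wins; in the second, nothing was ever denied, so the extra Allow takes effect.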
IAM Access Analyzer
- Defines a “Zone of Trust” and will alert of any access outside the defined zone
- Can parse cloudtrail logs to generate an IAM policy which only contains the necessary permissions.
Security Token Service
- Define IAM roles & the principals who can access them
- STS allows pulling temporary credentials to assume roles
- They can last 15 minutes - 12 hours
- 3rd parties must provide an “ExternalID” to prevent spoofing. This value is unique to each customer and is shared off-platform (AWS itself does not treat it as a secret)
(Diagrams: cross-account role assumption without an ExternalID, where spoofing is possible, and with an ExternalID.)
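As a rough illustration of the ExternalID check, the sketch below builds a role trust policy with an `sts:ExternalId` condition and evaluates it with a toy helper. The account ID, the ExternalID values and the `assume_role_permitted` function are made up for this example; the real evaluation happens inside STS.

```python
import json


def build_trust_policy(third_party_account_id, external_id):
    """Build a role trust policy that only allows sts:AssumeRole with a matching ExternalId."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{third_party_account_id}:root"},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


def assume_role_permitted(trust_policy, caller_account_id, supplied_external_id):
    """Toy check of the trust policy (hypothetical helper, not an AWS call)."""
    for stmt in trust_policy["Statement"]:
        # The account ID is the fifth colon-separated field of the principal ARN.
        principal_ok = stmt["Principal"]["AWS"].split(":")[4] == caller_account_id
        expected = stmt["Condition"]["StringEquals"]["sts:ExternalId"]
        if stmt["Effect"] == "Allow" and principal_ok and supplied_external_id == expected:
            return True
    return False


# Placeholder values, made up for this example.
policy = build_trust_policy("111122223333", "customer-a-7f3c")
print(json.dumps(policy, indent=2))
print(assume_role_permitted(policy, "111122223333", "customer-a-7f3c"))  # True
print(assume_role_permitted(policy, "111122223333", "customer-b-0000"))  # False
```

The second call models the spoofing case: the third party is the right account but is acting for a different customer, so the ExternalID does not match and the role cannot be assumed.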
Identity Federation
- Allow users outside of AWS to access AWS resources
- Use SAML 2.0 to authenticate users
- Open Standard
- No need to create IAM users for each employee if already in SAML provider
- This is considered the old way to handle SSO
- Use “AWS IAM Identity Center” (formerly AWS Single Sign-On) for a more modern approach
Custom Identity Broker
- Legacy and should only be used if you cannot use anything else
- Identity broker has admin on AWS & will request keys once it’s verified authentication
Web Identity Federation - Without Cognito
- Not recommended without Cognito
- Log in with any IdP compatible with OpenID Connect (OIDC) & exchange the token with STS to be given credentials
Web Identity Federation - With Cognito
- Same as above, except the OIDC token is used to claim a Cognito token, which is then used with STS to get credentials
AWS Managed Microsoft Active Directory
- Ability to connect your on-prem Active Directory to AWS Managed Microsoft AD
- MUST establish a DX (Direct Connect) or VPN connection
- Replication is not supported; must synchronise instead
- Can be used to authenticate to AWS resources
- If on-prem “trusts” cloud it can forward requests for users to AWS if it is missing them
- Only way to replicate is with an EC2 running a self-managed replication of on-prem
Organisation Units
- OU’s can be nested
- `OrganizationAccountAccessRole` is created in each member account on creation; this role is assumed from the management (root) account for any administrative duties
- For billing purposes orgs treat all accounts as if they were the same account
- All accounts receive reserved instance benefits, this can be disabled by root
Service Control Policies (SCPs)
- Define allowed/disallowed actions for accounts
- Applied at OU or Account level
- Does not apply to management account
- SCP is applied to all the users and roles in the account including root users
- Must have explicit allows (by default all actions are denied)
- Denies are inherited on sub-accounts / OUs
- Exam will mention using `aws:TagKeys` as a condition to restrict access
- i.e. Only allow user access to infrastructure with a specific tag
- Also possible to Deny based on `aws:RequestedRegion` to restrict regions
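A region restriction of this kind could look roughly like the SCP below. This is a sketch: the allowed regions are example values, and `region_is_denied` is a hypothetical helper that only mimics the `StringNotEquals` condition. Real-world SCPs of this type usually also exempt global services, which is left out here.

```python
import json

ALLOWED_REGIONS = ["eu-west-1", "eu-west-2"]  # assumption: example regions

region_scp = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyOutsideAllowedRegions",
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            "Condition": {
                "StringNotEquals": {"aws:RequestedRegion": ALLOWED_REGIONS}
            },
        }
    ],
}


def region_is_denied(scp, requested_region):
    """Toy evaluation: the Deny applies when the region is not in the allowed list."""
    for stmt in scp["Statement"]:
        allowed = stmt["Condition"]["StringNotEquals"]["aws:RequestedRegion"]
        if stmt["Effect"] == "Deny" and requested_region not in allowed:
            return True
    return False


print(json.dumps(region_scp, indent=2))
print(region_is_denied(region_scp, "us-east-1"))  # True
print(region_is_denied(region_scp, "eu-west-1"))  # False
```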
Organisation tag policies
- Enforce tag standards across all accounts
- Ensures consistency across all accounts
Organisation AI opt-out policies
- By creating an opt-out policy you can opt your data out of AWS AI training
- Data used for training is data sent to AWS AI services
Organisation backup policies
- Enforce AWS backup configuration on a specific OU/Account
- Immutable so they cannot be edited by child accounts
AWS IAM Identity Center
- Allows for fine grain permissions & assignments
- Create permission sets which can be assumed by users via AWS SSO
- Can be used to manage permissions for AWS SSO users
- Can be driven based on users attributes stored in Identity Centre
- i.e. “DatabaseAdmin” is given to users with the role “Developer” and “Senior”
AWS Control Tower
- Account Factory allows automated account creation & deployment from the root organisation account
- Guardrails can ensure compliance of rules/tags/etc & can trigger SNS/Lambda/etc on non-compliance
Compute
- EC2 Instance Types:
- R - High RAM
- C - Compute Optimised
- M - Balanced (Think Medium)
- I - High I/O
- G - GPU Accelerated
- T1/T2 - Burstable
Placement Groups
- Refer to EC2 Placement Groups
- Stopped instances can be moved between Placement Groups
Host Affinity
Host affinity is configured at the instance level. It establishes a launch relationship between an instance and a Dedicated Host.
When affinity is set to Host, an instance launched onto a specific host always restarts on the same host if stopped. This applies to both targeted and untargeted launches.
When affinity is set to Default, and you stop and restart the instance, it can be restarted on any available host. However, it tries to launch back onto the last Dedicated Host on which it ran (on a best-effort basis).
AWS Resource Access Manager
- Allows sharing of subnets, Route53, Transit Gateway, etc. between accounts
- Security groups are not shared
- Exam mostly will focus on sharing subnets across accounts
- i.e. Sub OU can place their EC2s into the shared VPC allowing private IP communication
- Managed Prefix Lists can be shared across accounts
- These are sets of CIDR blocks
- i.e. Security Group rules can reference a prefix list
- “Allow 22 from PrefixA”
SSL SNI
- “Newer Protocol” which allows multiple SSL certificates on a single IP
- Initial handshake indicates the requested hostname & the server will return the matching certificate (or the default if not found)
- Supported on ALB & NLB, CloudFront
Cloud HSM
- Amazon managed hardware security module
- Tamper-proof hardware security module, FIPS 140-2 Level 3 compliant
- Good for SSE-C encryption on S3
SSL Offloading
- Uses HSM to do SSL compute & save CPU Cycles
- Compatible with NGINX, Apache, etc.
EC2 Instance Connect
- Pushes a one-time SSH public key to the instance via the AWS API (`SendSSHPublicKey` API call); the key stays valid for 60 seconds, allowing you to connect
- No need to manage keys on the instance
Instance Recovery
- Cloudwatch alarm can monitor Instance Status (VM) and System Status (Hardware)
- This can trigger instance recovery
- Same Private/Public IP, Elastic IP, Metadata & Placement group ensured
High Performance Computing (HPC)
- Enhanced networking can be enabled (SR-IOV) for low latency & high throughput
- Elastic Network Adapter (ENA) offers up to 100Gbps
- Intel 82599 VF offers up to 10Gbps (considered legacy)
- Elastic Fabric Adapter
- Bypasses OS for low latency, instant packet delivery
- Linux only
- Placement groups can be used to ensure instances are close together
ECS Networking
- Standard Docker network modes are available in ECS; however, `awsvpc` is also available.
- `awsvpc` creates an ENI for each task; it is the mode used by Fargate
- `bridge` is the default mode
- `host` uses the host network
- `none` disables networking
EKS Data Volumes
- Only EFS works with Fargate
- EKS can use EBS, EFS, FSx for Lustre & FSx for NetApp ONTAP
ECS Anywhere
- ECS & SSM Agents can run on-prem
- This allows you to target on-prem infrastructure from the ECS console
EKS Anywhere
- Amazon managed kubernetes distribution
- Can run on-prem
- Can run on AWS Outposts
- Reduces third party tools & services required for on-prem kubernetes
- AWS connection is optional
NLB Zonal DNS Names
- Resolving the regional DNS name returns the IPs of the NLB nodes in every enabled AZ
- mynlb.elb.us-east-1.amazonaws.com
- NLB also has zonal DNS names for each AZ, which return only the node IP in that AZ
- us-east-1a.mynlb.elb.us-east-1.amazonaws.com
- Reduces latency and can minimise data transfer cost
- You do not pay for transfers in the same AZ
NLB Flow hash routing
- Uses a 5-tuple hash to determine which target to send traffic to
- Source IP, Source Port, Destination IP, Destination Port, Protocol
- TCP/UDP connections are routed to a single target for the life of their connection
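The idea of flow hash routing can be sketched as follows. The actual hash function used by NLB is not public, so SHA-256 here is purely a stand-in, and `pick_target` and the IP addresses are illustrative assumptions.

```python
import hashlib


def pick_target(src_ip, src_port, dst_ip, dst_port, protocol, targets):
    """Map a 5-tuple to one target; the same tuple always gives the same target."""
    flow = f"{src_ip}|{src_port}|{dst_ip}|{dst_port}|{protocol}".encode()
    digest = hashlib.sha256(flow).digest()
    index = int.from_bytes(digest[:8], "big") % len(targets)
    return targets[index]


targets = ["10.0.1.10", "10.0.1.11", "10.0.2.10"]
first = pick_target("203.0.113.7", 50123, "198.51.100.5", 443, "tcp", targets)
again = pick_target("203.0.113.7", 50123, "198.51.100.5", 443, "tcp", targets)
print(first == again)  # True
```

Because every packet of a connection carries the same 5-tuple, every packet lands on the same target; a new connection from a different source port may hash to a different one.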
AppSync
- Realtime data via GraphQL
- Can pull data from DynamoDB, Aurora, OpenSearch, Lambda, HTTP, etc.
Route53 Health-checks
- Similar to standard health checks but can be used to route traffic based on health
- i.e. Route traffic to a backup region if the primary is unhealthy
- Can read the first 5120 bytes of a response to determine health
- Cannot check private instances health
- Instead, monitor a Cloudwatch alarm which monitors the instance health
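The 5120-byte limit matters when a health page is large: a marker string placed too far into the body is never seen. A minimal sketch of that behaviour, assuming a simple substring search (the `body_is_healthy` helper is hypothetical):

```python
SEARCH_WINDOW_BYTES = 5120


def body_is_healthy(response_body: bytes, expected: bytes) -> bool:
    """Healthy only if the expected string appears within the first 5120 bytes."""
    return expected in response_body[:SEARCH_WINDOW_BYTES]


print(body_is_healthy(b"status: OK", b"OK"))              # True
print(body_is_healthy(b" " * 6000 + b"status: OK", b"OK"))  # False
```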
Hybrid DNS & Resolver Rules
- Requires a VPN or Direct Connect to work
- Inbound resolver endpoint in your VPC allows on-prem DNS resolvers to resolve AWS resources
- Outbound resolver endpoint (with Resolver rules) allows your VPC to resolve on-prem resources
AWS Global Accelerator
- Clients can hit one of 2 anycast IPs for your application and are routed to the closest edge location
- Supports client IP preservation except for NLB & Elastic IP (EIP) endpoints.
AWS Outposts
- Hybrid cloud solution where companies can run cloud & on-prem servers. Outposts allows for AWS services to run on-prem in a dedicated server rack.
AWS Wavelength (5G)
- Gives ultra low latency to apps by running them “at edge” in the telco datacenter
- Data never leaves the telco network unless configured to connect to AWS
AWS Local Zones
- Place compute/storage/db closer to end users for latency-sensitive applications
- E.g. Northern Virginia has a local zone in Boston, Chicago, Miami, etc.
- Possible to extend your VPC across AZs and Local Zones
EBS Snapshots
- Incremental backups
- Only backs up changed blocks
- Backups are high IO - so even though you can do them while mounted and being used you probably shouldn’t
- Snapshots are stored in S3 under the hood however they are not visible in standard S3.
Data Lifecycle Manager
- Automate snapshots of EBS / EBS Backed AMIs
- Can handle cross-account backups
- Should be used over AWS Backup when you need to manage creation/retention/deletion of snapshots in more detail
EBS Encryption
- Disabled by default
- Must be enabled in the account on a per-region basis
EFS Access Points
- Restrict access to a directory within the filesystem
- Possible to enforce specific POSIX GID & UID
S3 Replication Time Control (RTC)
- Replicate most objects in seconds
- 99.99% replication within 15 minutes
- Cloudwatch alarm if not met
- Good for compliance if near realtime backups required
FSX for Lustre Lazy Data Loading
- Any processing job on Lustre with S3 as an input data source can be started without Lustre cloning the entire dataset; instead, the data is only downloaded when requested.
AWS DataSync
- Move large datasets inter-service / inter-cloud / on-prem
- Scheduled either Hourly / Daily / Weekly
- POSIX permissions & metadata retained
- If missing network bandwidth - should use Snowcone/Snowball
RDS for Oracle
- Oracle RMAN can be used for restore to non-RDS systems
- RDS for Oracle does not support RAC
- If RAC is required use Oracle on EC2
High Volume Queue Processing
- Exam may mention processing “tens of thousands” of requests per second & maintaining order. SQS FIFO cannot handle this speed/volume
- Instead, use Kinesis Data Streams
- These can be ordered with partition keys
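Ordering by partition key works because Kinesis hashes each key with MD5 and sends the record to the shard whose hash range contains the result. The sketch below assumes the 128-bit hash space is split evenly across shards (the layout of a freshly created stream; resharding can make ranges uneven). `shard_for_key` and the key names are illustrative.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def shard_for_key(partition_key, shard_count):
    """Return the index of the shard whose hash range contains the key's MD5 hash."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    range_size = HASH_SPACE // shard_count
    # Clamp so the leftover values at the top of the space go to the last shard.
    return min(hash_value // range_size, shard_count - 1)


# Every record for the same key lands on the same shard, so per-key order is kept.
print(shard_for_key("device-42", 4) == shard_for_key("device-42", 4))  # True
```

Ordering is only guaranteed within a shard, so records that must stay in sequence need to share a partition key.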
Kinesis
- Streams: Low latency stream ingest at scale
- Analytics: Real-time analytics on streams using SQL
- Firehose: Load streams into S3, Redshift, ElasticSearch, etc.
- Important to remember the difference between Streams & Firehose
Kinesis Streams
- Shards need to be pre-provisioned
- Multiple consumers can read from the same shard
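Since shards are pre-provisioned, it helps to estimate how many are needed. The sketch below assumes the published per-shard write limits of 1,000 records per second and 1 MB per second; those figures are not stated in these notes and should be checked against current quotas. `shards_needed` is a hypothetical helper.

```python
import math

# Assumption: published per-shard write limits (not stated in these notes).
SHARD_WRITE_RECORDS_PER_SEC = 1000
SHARD_WRITE_BYTES_PER_SEC = 1_000_000  # 1 MB/s, treated here as 10**6 bytes


def shards_needed(records_per_sec, avg_record_bytes):
    """Estimate shard count from whichever limit (record count or bytes) binds first."""
    by_count = math.ceil(records_per_sec / SHARD_WRITE_RECORDS_PER_SEC)
    by_bytes = math.ceil(records_per_sec * avg_record_bytes / SHARD_WRITE_BYTES_PER_SEC)
    return max(by_count, by_bytes, 1)


print(shards_needed(30_000, 500))    # 30
print(shards_needed(30_000, 2_000))  # 60
```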
KCL Consumer
- Checkpointed coordinated reads
- Java Framework for creating custom consumers
Kinesis Firehose
- Can take data from producers, streams & cloudwatch logs, optionally modify it with lambda & then batch it into S3, Redshift or OpenSearch.
- Offers 3rd party targets like Splunk, DataDog, etc.
- Data is flushed to the destination when the buffer size or buffer time is hit
- Firehose is not real-time; it’s near real-time
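The "buffer size or buffer time" rule is why Firehose is only near real-time: a record waits until one of the two thresholds is hit. A minimal sketch of that flush logic, with made-up thresholds and a hypothetical `BufferedDelivery` class (this is not the Firehose API):

```python
import time


class BufferedDelivery:
    """Toy buffer that flushes when a size OR age threshold is reached."""

    def __init__(self, max_bytes, max_seconds, deliver, clock=time.monotonic):
        self.max_bytes = max_bytes
        self.max_seconds = max_seconds
        self.deliver = deliver  # callback receiving a list of records
        self.clock = clock
        self.records = []
        self.size = 0
        self.opened_at = None

    def put(self, record: bytes):
        if not self.records:
            self.opened_at = self.clock()  # the age timer starts with the first record
        self.records.append(record)
        self.size += len(record)
        self.flush_if_due()

    def flush_if_due(self):
        """Flush if either threshold is hit; call periodically for the time check."""
        if not self.records:
            return
        too_big = self.size >= self.max_bytes
        too_old = self.clock() - self.opened_at >= self.max_seconds
        if too_big or too_old:
            self.deliver(self.records)
            self.records, self.size, self.opened_at = [], 0, None


batches = []
buf = BufferedDelivery(max_bytes=10, max_seconds=60, deliver=batches.append)
buf.put(b"12345")
buf.put(b"67890")  # 10 bytes buffered, so the size threshold triggers a flush
print(batches)     # [[b'12345', b'67890']]
```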
Kinesis Analytics
- Use SQL to transform data coming from Kinesis services
Comparing data streaming services
Comparison Charts
| | Kinesis Data Streams | SQS | SQS FIFO | SNS | DynamoDB | S3 |
|---|---|---|---|---|---|---|
| Data | Immutable | Immutable | Immutable | Immutable | Mutable | Mutable |
| Retention | 1-365 days, export to S3 using KDF | 1-14 days | 1-14 days | No retention | Infinite or can implement TTL | Infinite, can set up lifecycle |
| Ordering | Per shard | No ordering | Per group-id | No ordering | No ordering | No ordering |
| Scalability | Provision shards | Soft limit | 300 msg/s or 3000 if batched | Soft limit | WCU & RCU / On-demand | Infinite / 3500 PUT 5500 GET per prefix |
| Readers | EC2, Lambda, KDF, KDA, KCL (checkpoint) | EC2, Lambda | EC2, Lambda | HTTP, Lambda, Email, SMS… | DynamoDB Streams | SDK, S3 Events |
| Latency | KDS (200 ms) KDF (1 min) | Low (10-100 ms) | Low (10-100 ms) | Low (10-100 ms) | Low (10-100 ms) | Low (10-100 ms) |
AWS Batch
- Runs batch jobs with Docker
- Fargate or Dynamic EC2/Spot instances available
- Fully serverless since all infrastructure is managed by AWS
- Only pay for what resources are used
Amazon Elastic Map Reduce (EMR)
- Create Hadoop clusters for big data
- Mostly used for on-prem to cloud migrations
- EBS used for storage - S3 with EMRFS is used for long term storage
Redshift
- Previous notes
- Redshift is provisioned, so it’s only worth it for sustained usage; use Athena instead for ad-hoc/sporadic loads
Athena
- Previous notes
- Recommended data type is Apache Parquet since it’s columnar & allows for less scanning
- AWS Glue ETL job can convert data to Parquet
- Federated queries allow queries across many data sources both in AWS & on-prem
- Requires a Data Source Connector on Lambda to run queries
Cloudwatch Synthetic Canary
- Can test application endpoints to detect app-level issues
- Can use puppeteer to interact with the app instance directly
Elastic Beanstalk
- Great to “Re-platform” on-prem app to cloud easily
- Supports many languages & frameworks
- Can use Docker if not supported
Service Catalog
- Admins can create pre-approved CFN templates
- Users with the correct IAM roles can then launch these templates
- “pre-approved” way of standing up infrastructure without needing to know CFN/Company best-practices
AWS Compute Optimiser
- Uses ML to detect over-provisioned and/or under-provisioned EC2 & ASG instances as well as EBS Volumes & Lambda Functions
AWS Snow Family
- Rule of Thumb: If it takes more than a week to upload - you should use a snowball device
- Snowcone: 8TB HDD / 14TB SSD
- Snowball: 80TB
- Snowmobile: < 100PB
- Can manage your snow devices in OpsHub
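The one-week rule of thumb can be turned into a quick calculation. This is a back-of-the-envelope sketch: the 80% link utilisation is an assumed figure, decimal units are used (1 TB = 10^12 bytes), and both helper names are made up.

```python
def upload_days(dataset_tb, link_mbps, utilisation=0.8):
    """Estimate days needed to upload dataset_tb terabytes over a link_mbps link."""
    bits = dataset_tb * 1e12 * 8
    seconds = bits / (link_mbps * 1e6 * utilisation)
    return seconds / 86_400


def should_use_snowball(dataset_tb, link_mbps):
    """Apply the rule of thumb: more than a week of upload time means ship a device."""
    return upload_days(dataset_tb, link_mbps) > 7


print(round(upload_days(100, 1000), 1))  # 11.6
print(should_use_snowball(100, 1000))    # True
print(should_use_snowball(10, 1000))     # False
```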
AWS Schema Conversion Tool (SCT)
- Convert DB Schemas during migrations with DMS
- Only required if changing DB engines
Snowball + DMS
- Can combine Snowball + SCT to extract DB data from on-prem & ship to AWS. DMS can then restore from S3 to DB.
Disaster Recovery
Elastic Disaster Recovery
- Actively clones your on-premises infrastructure to AWS
- Continuous block level replication
- Possible to automatically fail-over and scale in minutes
VPCs
VPC Peering
- Connect two VPCs privately using the AWS network
- These connections are not transitive.
- If A -> B & B -> C, A cannot talk to C
- Instead, A must peer with C directly
- Possible to peer across accounts
- Possible to peer A with B & C even if there is CIDR overlap on B & C
- Route tables will pick the longest prefix (Most specific) match
- No edge to edge routing
- i.e. If VPC A has a VPN to on-prem or an Internet gateway, VPC B cannot use this VPN to reach on-prem or the internet even if peered
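Longest-prefix matching is what makes peering with two overlapping VPCs workable. The sketch below mimics that route selection using the standard `ipaddress` module; the CIDRs and the `pcx-to-b` / `pcx-to-c` peering names are hypothetical.

```python
import ipaddress


def pick_route(destination_ip, routes):
    """Return the target of the most specific (longest-prefix) matching route."""
    address = ipaddress.ip_address(destination_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes.items()
        if address in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    return max(matches, key=lambda item: item[0].prefixlen)[1]


# VPC A's route table; B and C both use 10.0.0.0/16 (hypothetical peering IDs).
routes = {
    "10.0.0.0/16": "pcx-to-b",
    "10.0.0.77/32": "pcx-to-c",
}
print(pick_route("10.0.0.77", routes))  # pcx-to-c
print(pick_route("10.0.0.12", routes))  # pcx-to-b
```

Only the single host with the /32 route is reachable in C; everything else in the shared range goes to B.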
Transit Gateway
- Allows for transitive peering between VPCs in a “Hub and Spoke” configuration
- VPCs can be in different accounts
- Possible to peer other Transit gateways
- This is the only service that supports IP Multicast
VPC Endpoints
- Allows VPCs to access the AWS network & resources without needing to connect to the wider internet
- VPC Endpoint Gateway: S3 & DynamoDB only
- VPC Endpoint Interface: All AWS services except DynamoDB
- Lives inside the subnet as an ENI
VPC Endpoint Policies
- JSON Policy to control access to services
- Similar syntax to IAM but does not override or replace existing IAM or service policies
AWS Private Link
- Most secure way to expose a service to 1000s of other VPCs
- Does not require VPC peering, internet gateway, NAT gateway, VPN or Direct Connect
- Requires a Network Load Balancer inside your VPC & an ENI inside the customer VPC.
- With PrivateLink the customer now has a connection to your Network Load Balancer
- This is safer than peering (Whole VPC accessible) or opening VPC & instance to the public net
- Possible to On-prem -> DX -> Private Link -> VPC Endpoint -> S3
Site to Site VPN
- Access VPC via internet over the VPN
- Requires on-prem VPN configured with an AWS customer gateway pointed to it and connected to a Virtual Private Gateway in your VPC
- Possible to connect with Global Accelerator for faster VPN connections
- Can use BGP to configure dynamic routing or can manually configure route tables with static routing
- Cannot do On-prem -> VPN/DX -> Nat -> IGW -> Internet
- Nat instances can allow this behaviour
Client VPN
- Leverage OpenVPN to allow personal connections to your VPC
Direct Connect Virtual Interfaces
- Public VIF: Connect to public services like S3, DynamoDB, etc.
- Private VIF: Connect to your VPC
Direct Connect Encryption
- Not encrypted by default
- VPN over Direct Connect Private VIF is encrypted
Direct Connect Link Aggregation Groups (LAG)
- Get increased speed and fail-over by summing up existing DX connections into a single logical one
- Can aggregate up to 4 connections and can add to this over time
Direct Connect Site link
- Connect two on-prem data centres, each with its own Direct Connect connection, to each other via AWS
Kinesis Video Streams
- One video stream per producer (Security Camera, Body-cam, etc.)
- CANNOT output to S3 directly
- Consumed by either EC2 or Amazon Rekognition, which can then use Kinesis Data Streams
Amazon Workspaces
- Managed Desktop as a Service
- Integrates with Windows AD
AWS Application Discovery Service
- Detect on-prem infrastructure & applications and connect with AWS Migration Hub to provide an easier transition to the cloud
Agentless discovery can be performed by deploying the Application Discovery Service Agentless Collector (Agentless Collector) (OVA file) through your VMware vCenter. After Agentless Collector is configured, it identifies virtual machines (VMs) and hosts associated with vCenter. Agentless Collector collects the following static configuration data: Server hostnames, IP addresses, MAC addresses, disk resource allocations, database engine versions, and database schemas. Additionally, it collects the utilization data for each VM and database providing the average and peak utilization for metrics such as CPU, RAM, and Disk I/O.
Agent-based discovery can be performed by deploying the AWS Application Discovery Agent on each of your VMs and physical servers. The agent installer is available for Windows and Linux operating systems. It collects static configuration data, detailed time-series system-performance information, inbound and outbound network connections, and processes that are running.