Study Notes On AWS
IAM
- SAML(Security Assertion Markup Language 2.0)
Enables users to access AWS resources with their corporate credentials. Ex: Active Directory
- It is recommended to have 1 IAM role per application
EC2
- EBS - Elastic block store, virtual drive
- ELB - Elastic Load Balancer, distributing load across instances
- ASG - Scaling the services
subnet - which AZ
tags - Name (going to show in UI)
EC2 Instance Connect - SSH from the browser (Amazon Linux 2 AMI only)
pem file permission - 0644 is "too open" (SSH rejects the key); fix with:
chmod 0400 *.pem
Load Balancers
- HTTPS (SSL termination)
- Enforce stickiness with cookies
Three different kinds of LB offerings:
- Classic Load Balancer
- Application Load Balancer - HTTP/HTTPS (L7)
- Network Load Balancer - TCP (L4)
To get the client's actual IP behind the load balancer, read the X-Forwarded-For header (contains the original client IP)
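A minimal sketch of reading X-Forwarded-For behind an ALB, using only the Python standard library (the port and response body are arbitrary choices, not anything AWS-specific):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Behind an ALB the TCP peer is the load balancer, so the client IP
        # arrives in the X-Forwarded-For header (left-most entry).
        forwarded = self.headers.get("X-Forwarded-For", "")
        client_ip = forwarded.split(",")[0].strip() if forwarded else self.client_address[0]
        self.send_response(200)
        self.end_headers()
        self.wfile.write(f"client ip: {client_ip}\n".encode())

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), Handler).serve_forever()
```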
ASG
- Min Size
- Actual Size / Desired Capacity
- Max Size
Send custom metrics via the CloudWatch PutMetricData API (sketch after this list)
ASG scaling is triggered by CloudWatch alarms
ASG uses Launch Configuration
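A minimal sketch of publishing a custom metric with boto3 (the namespace and metric name are made-up placeholders); a CloudWatch alarm on this metric can then drive an ASG scaling policy:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Publish one data point; "MyApp" and "QueueDepth" are placeholder names.
cloudwatch.put_metric_data(
    Namespace="MyApp",
    MetricData=[
        {
            "MetricName": "QueueDepth",
            "Value": 42.0,
            "Unit": "Count",
        }
    ],
)
```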
EBS Volume
Network drive (not a physical drive), locked to a single AZ, supports EBS encryption
In flight and at rest encryption
Snapshots are also encrypted
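A small sketch of creating an encrypted volume with boto3 (the AZ and size are placeholder values); snapshots taken from an encrypted volume are encrypted as well:

```python
import boto3

ec2 = boto3.client("ec2")

# Placeholder AZ and size; Encrypted=True uses the default KMS key for EBS
# unless a KmsKeyId is supplied.
volume = ec2.create_volume(
    AvailabilityZone="us-east-1a",
    Size=10,               # GiB
    VolumeType="gp3",
    Encrypted=True,
)
print(volume["VolumeId"])
```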
Route 53
- A - hostname to IPv4
- AAAA - hostname to IPv6
- CNAME - hostname to another hostname
- ALIAS - hostname to an AWS resource
RDS
- Encryption at rest - KMS - AES-256
- Encryption in flight - SSL
- postgres - rds.force_ssl = 1
- MySQL - REQUIRE SSL
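A hedged sketch of enforcing SSL on a PostgreSQL RDS instance by setting rds.force_ssl in a custom parameter group (the parameter group name is a placeholder already attached to the instance):

```python
import boto3

rds = boto3.client("rds")

# "my-postgres-params" is a placeholder custom DB parameter group.
rds.modify_db_parameter_group(
    DBParameterGroupName="my-postgres-params",
    Parameters=[
        {
            "ParameterName": "rds.force_ssl",
            "ParameterValue": "1",
            "ApplyMethod": "immediate",
        }
    ],
)
```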
ElastiCache
Redis/Memcached
- make application stateless by storing state in a common cache
- Write scaling - Sharding
- Read scaling - Read replicas
- multi AZ with failover
- cache hit / cache miss
- cache invalidation strategy
Redis
- in memory Key-value
- super low latency (sub-millisecond)
- cache can survive reboots (persistence)
- great to host -
i) user sessions
ii) Leaderboard
iii) distributed states
iv) relieve pressure on db
v) pub/sub messaging
Memcached
- in memory object store
- cache doesn’t survive reboots
Elasticache Patterns
- Lazy loading
  Pros:
  - only requested data is cached
  Cons:
  - cache miss penalty (read cache, read DB, then write to cache)
  - stale data - to avoid this, combine with a Write-Through and TTL strategy
- Write through
  Pros:
  - cache is never stale
  - write penalty instead of a read penalty: users are more tolerant of latency when writing data
  Cons:
  - cache churn: waste of resources, most data written is never read
(sketch of both patterns below)
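A minimal sketch of both patterns with the redis-py client (the cache endpoint, key format, and the get_user_from_db helper are made-up placeholders):

```python
import json
import redis

# Placeholder endpoint for an ElastiCache Redis cluster.
cache = redis.Redis(host="my-cache.example.amazonaws.com", port=6379)
TTL_SECONDS = 300


def get_user_from_db(user_id):
    # Placeholder for a real database query.
    return {"id": user_id, "name": "example"}


def get_user_lazy(user_id):
    """Lazy loading: read the cache first, fall back to the DB on a miss."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)          # cache hit
    user = get_user_from_db(user_id)       # cache miss penalty
    cache.setex(key, TTL_SECONDS, json.dumps(user))
    return user


def update_user_write_through(user):
    """Write through: update the DB and the cache in the same code path."""
    # save_user_to_db(user)  # placeholder DB write
    cache.setex(f"user:{user['id']}", TTL_SECONDS, json.dumps(user))
```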
S3 Versioning
best practice is to enable versioning
- protect against unintended delete
- easy rollback to prev version
S3 Encryption
- SSE-S3 - managed by AWS S3
- SSE-KMS - managed by AWS KMS
- SSE-C - you want to manage your own encryption key (BYO encryption key 😂)
- Client Side encryption
SSE-S3
AWS S3 handles and manages the keys; objects are encrypted server side
AES-256 encryption type
Must set header:
"x-amz-server-side-encryption": "AES256"
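A minimal boto3 sketch (the bucket and key names are placeholders); the SDK sets the x-amz-server-side-encryption header for you:

```python
import boto3

s3 = boto3.client("s3")

# Placeholder bucket/key; ServerSideEncryption="AES256" turns on SSE-S3.
s3.put_object(
    Bucket="my-example-bucket",
    Key="notes.txt",
    Body=b"hello",
    ServerSideEncryption="AES256",
)
```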
SSE-KMS
managed by AWS KMS
Pro:
- user control + audit trail (key rotation)
Server-side encryption header:
"x-amz-server-side-encryption": "aws:kms"
Key - KMS Customer Master Key (CMK)
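The same call with SSE-KMS instead (the bucket, key, and KMS key alias are placeholders):

```python
import boto3

s3 = boto3.client("s3")

s3.put_object(
    Bucket="my-example-bucket",
    Key="notes.txt",
    Body=b"hello",
    ServerSideEncryption="aws:kms",
    SSEKMSKeyId="alias/my-app-key",  # placeholder CMK alias/ID; omit to use the default aws/s3 key
)
```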
SSE-C
- encryption keys are fully managed by the customer outside of AWS
- S3 does not store the encryption key
- must use HTTPS
- encryption key must be provided in HTTP header
- Key - Client side data key
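A sketch of an SSE-C upload and download with boto3 (the 32-byte key is generated locally just for the example; boto3 base64-encodes it into the SSE-C headers, and the call goes over HTTPS by default):

```python
import os
import boto3

s3 = boto3.client("s3")  # boto3 uses HTTPS endpoints by default

# You own this key; S3 never stores it, only uses it to encrypt/decrypt.
customer_key = os.urandom(32)

s3.put_object(
    Bucket="my-example-bucket",       # placeholder
    Key="secret.txt",
    Body=b"hello",
    SSECustomerAlgorithm="AES256",
    SSECustomerKey=customer_key,
)

# The same key must be supplied again to read the object back.
obj = s3.get_object(
    Bucket="my-example-bucket",
    Key="secret.txt",
    SSECustomerAlgorithm="AES256",
    SSECustomerKey=customer_key,
)
print(obj["Body"].read())
```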
Client Side encryption
- Client library such as S3 Encryption Client
- Client must encrypt the data themselves before sending to S3
- Client must decrypt data themselves when retrieving from S3
- Customer manages key + encryption cycle
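The official S3 Encryption Client exists for some SDKs; as a stand-in, here is a hedged sketch of the same idea using the third-party cryptography package (Fernet) to encrypt before upload and decrypt after download (bucket and key names are placeholders):

```python
import boto3
from cryptography.fernet import Fernet

s3 = boto3.client("s3")
key = Fernet.generate_key()   # the customer manages this key and its lifecycle
fernet = Fernet(key)

# Encrypt locally, then upload ciphertext only.
ciphertext = fernet.encrypt(b"sensitive data")
s3.put_object(Bucket="my-example-bucket", Key="data.enc", Body=ciphertext)

# Download ciphertext, then decrypt locally.
obj = s3.get_object(Bucket="my-example-bucket", Key="data.enc")
plaintext = fernet.decrypt(obj["Body"].read())
```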
Encryption in Transit(SSL/TLS) - In flight
- HTTPS is mandatory when using SSE-C
- SSL/TLS
S3 Security
- IAM Policies
- Bucket Policies - JSON-based policy (minimal sketch at the end of this S3 Security list)
- Resources
- Action
- Effect: Allow/Deny
- Principal: user/account to apply the policy to
- policy can be used to
- grant public access to the bucket
- force objects to be encrypted at upload
- grant access to another account (cross account)
- Networking - VPC endpoints
- logging and audit
- S3 access logs in other S3 bucket (best practice is to not put on the same one)
- API calls can be logged via cloudtrail
- User security
- MFA Delete - extra protection for versioned buckets
- pre-signed URL: limited-time access (e.g. premium video) - sketch below
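A sketch of both ideas with boto3 (the bucket name, object key, and expiry are placeholders): a bucket policy that denies uploads missing the SSE header, and a pre-signed URL for limited-time access:

```python
import json
import boto3

s3 = boto3.client("s3")
bucket = "my-example-bucket"  # placeholder

# Deny any PutObject that does not request server-side encryption.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyUnencryptedUploads",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": f"arn:aws:s3:::{bucket}/*",
            "Condition": {"Null": {"s3:x-amz-server-side-encryption": "true"}},
        }
    ],
}
s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))

# Pre-signed URL: anyone holding the URL can GET the object for 1 hour.
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": bucket, "Key": "premium/video.mp4"},  # placeholder key
    ExpiresIn=3600,
)
print(url)
```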
S3 CORS
- Cross-Origin Resource Sharing
S3 must respond with Access-Control-Allow-Origin: <allowed domain>
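A sketch of attaching a CORS rule with boto3 (the bucket and origin are placeholders); S3 then answers matching cross-origin requests with the Access-Control-Allow-Origin header:

```python
import boto3

s3 = boto3.client("s3")

s3.put_bucket_cors(
    Bucket="my-example-bucket",            # placeholder
    CORSConfiguration={
        "CORSRules": [
            {
                "AllowedOrigins": ["https://www.example.com"],  # placeholder origin
                "AllowedMethods": ["GET"],
                "AllowedHeaders": ["*"],
                "MaxAgeSeconds": 3000,
            }
        ]
    },
)
```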
S3 Consistency Model
- Read after write
PUT 200 > GET 200
- Eventually consistent
- GET 404 > PUT 200 > GET 404 (results may be cached)
- PUT 200 v1 > PUT 200 v2 > GET 200 v1 (might be older object)
- DELETE 200 > GET 200 (might still return the object for a short while after the delete)
S3 Performance
For each prefix
- 3500 TPS PUT
- 5500 TPS GET
For faster upload of objects(>=100MB) use multipart upload
S3 Transfer Acceleration (uses edge locations)
SSE-KMS encryption is limited by your KMS request quotas (roughly hundreds to a few thousand uploads/downloads per second, region dependent)
S3 Select and Glacier Select - filter data server side with SQL (sketch below)
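A sketch of both performance features with boto3 (the bucket, file name, and CSV layout are placeholders): multipart upload via TransferConfig for large files, and S3 Select to pull back only matching rows:

```python
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")
bucket = "my-example-bucket"   # placeholder

# Multipart upload kicks in automatically above the threshold (here 100 MB).
config = TransferConfig(multipart_threshold=100 * 1024 * 1024)
s3.upload_file("big-file.bin", bucket, "big-file.bin", Config=config)

# S3 Select: run SQL against a CSV object and stream back only matching rows.
resp = s3.select_object_content(
    Bucket=bucket,
    Key="data.csv",                                  # placeholder CSV object
    Expression="SELECT * FROM s3object s WHERE s._1 = 'value'",
    ExpressionType="SQL",
    InputSerialization={"CSV": {"FileHeaderInfo": "NONE"}},
    OutputSerialization={"CSV": {}},
)
for event in resp["Payload"]:
    if "Records" in event:
        print(event["Records"]["Payload"].decode())
```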
AWS CLI
aws sts decode-authorization-message --encoded-message "message"
EC2 instance metadata
- http://169.254.169.254/latest/meta-data
- Metadata = data about EC2 instance
- Userdata = launch script of EC2 instance
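A sketch of querying the instance metadata service from on the instance, using only the standard library (IMDSv1-style requests shown; IMDSv2 additionally requires a session token header):

```python
from urllib.request import urlopen

BASE = "http://169.254.169.254/latest/meta-data/"

# Only works from inside an EC2 instance; the link-local address is not
# reachable from your laptop.
instance_id = urlopen(BASE + "instance-id", timeout=2).read().decode()
az = urlopen(BASE + "placement/availability-zone", timeout=2).read().decode()
print(instance_id, az)
```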
AWS SDK
if a region is not configured, the SDK sends requests to us-east-1 by default
Exponential Backoff
- wait time doubles on each retry: 1s -> 2s -> 4s -> 8s -> 16s
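A minimal sketch of exponential backoff around a throttled call (call_aws_api is a made-up placeholder for any SDK call that can raise a throttling error); note that boto3 also has built-in retry modes:

```python
import time
import random


def call_aws_api():
    # Placeholder for any SDK call that may be throttled.
    raise Exception("ThrottlingException")


def with_backoff(max_retries=5):
    for attempt in range(max_retries):
        try:
            return call_aws_api()
        except Exception:
            # Double the wait each attempt (1s, 2s, 4s, ...) plus jitter.
            time.sleep(2 ** attempt + random.random())
    raise RuntimeError("still throttled after retries")
```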
Elastic Beanstalk (EB)
Three architecture models:
- Single instance deployment: dev
- LB + ASG: production web apps
- ASG only: non-web apps in prod
ElasticBeanstalk consists of 3 components:
- Application
- Application Version
- Environment name (dev, test, prod)
Elastic Beanstalk deployment modes
- All at once - fastest deployment, but downtime
- Rolling - updates a few instances at a time (a batch), moves to the next batch once the current one is healthy; existing instances are taken down, so capacity temporarily drops
- Rolling with additional batches - spins up new instances instead of taking down existing ones, so the app always runs at full capacity
- Immutable - spins up new instances in a new ASG and swaps all instances once everything is healthy; quick rollback, longest deployment time
ElasticBeanstalk Extensions
- .ebextensions directory
- .config extension(Ex. logging.config)
A few more EB notes
- package dependencies with the source code to improve deployment performance
- HTTPS: configure a secure listener via .ebextensions (e.g. an alb.config file) to upload the SSL cert; the security group (sg) must allow port 443
- Lifecycle policy - keeps at most 1000 application versions
- Lifecycle policy can retain the source bundle in S3 when old versions are deleted
- periodic tasks are defined in cron.yaml (worker environments)
- decouple RDS from EB with connection string
- enabling deletion protection on the RDS instance stops it from being terminated when the EB environment is deleted