Cloud & DevOps

Atlas

Project Atlas is a production-grade infrastructure blueprint for deploying a secure Kubernetes cluster and a standalone observability stack on AWS. It is built on a 'Portless, Serverless-Storage, and Automated-Discovery' philosophy.


Technical Stack

Terraform, AWS EC2, AWS VPC, AWS IAM, AWS S3, AWS SSM, Kubernetes, Ansible, Docker, Flannel CNI, Prometheus, Grafana, Loki, Tempo, Linux, WSL

Executive Summary

Atlas automates the provisioning of a 3-node Kubernetes cluster (1 Controller, 2 Workers) and a dedicated Monitoring node on AWS. Architectural decisions prioritize a zero-trust perimeter, infrastructure reproducibility via IaC, and observability that keeps working independently of the Kubernetes cluster itself.

Security Architecture

Built on a Zero-Public-Port model — no SSH (Port 22) is ever opened on any node.

All instance access is managed exclusively via AWS SSM Session Manager
Identity-Based Access using IAM roles — no long-term AWS keys stored on hosts
Private subnets only — no EC2 instances carry public endpoints
IAM roles scoped per workload (controller, worker, monitoring node)
All administrative access is fully auditable via AWS CloudTrail
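The Zero-Public-Port rule lends itself to an automated check. The following is a minimal sketch, not part of the Atlas repository: it assumes security group data in the shape returned by the EC2 DescribeSecurityGroups API (GroupId, IpPermissions, IpRanges), and the function name open_ingress_violations is hypothetical.

```python
from typing import Any


def open_ingress_violations(
    security_groups: list[dict[str, Any]], forbidden_port: int = 22
) -> list[str]:
    """Return findings for ingress rules that break a zero-public-port model.

    Assumption: each group dict follows the EC2 DescribeSecurityGroups shape.
    """
    findings: list[str] = []
    for group in security_groups:
        group_id = group.get("GroupId", "<unknown>")
        for rule in group.get("IpPermissions", []):
            protocol = rule.get("IpProtocol")
            # IpProtocol "-1" means every protocol and every port.
            if protocol == "-1":
                covers_port = True
            elif protocol == "tcp":
                low = rule.get("FromPort", 0)
                high = rule.get("ToPort", 65535)
                covers_port = low <= forbidden_port <= high
            else:
                covers_port = False

            # A rule is public if any range admits the whole internet.
            public = any(
                r.get("CidrIp") == "0.0.0.0/0" for r in rule.get("IpRanges", [])
            ) or any(
                r.get("CidrIpv6") == "::/0" for r in rule.get("Ipv6Ranges", [])
            )

            if covers_port:
                findings.append(f"{group_id}: ingress covers port {forbidden_port}")
            if public:
                findings.append(f"{group_id}: ingress open to the public internet")
    return findings
```

An empty result means no inspected rule exposes port 22 or accepts traffic from the public internet. A check like this could run after each Terraform apply, though how Atlas itself verifies the model is not stated here.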

Infrastructure as Code

All cloud resources are provisioned and managed via Terraform, ensuring reproducibility and controlled change management across environments.

AWS VPC with private subnets, route tables, NACLs, and security groups
IAM roles and instance profiles scoped per node type
S3 buckets for serverless log and trace persistence
SSM Parameter Store auto-populated by Terraform — consumed dynamically by Ansible for zero hardcoded IPs
EC2 instances: 1 Controller, 2 Workers, 1 Monitoring node
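The hand-off from Terraform to Ansible through SSM Parameter Store might look roughly like the sketch below. It is an illustration under assumptions, not the project's actual code: the parameter layout /atlas/nodes/<group>/<host> and the function names are hypothetical, and fetch_parameters needs boto3 plus AWS credentials. The output follows Ansible's JSON format for dynamic inventory scripts.

```python
def fetch_parameters(path: str) -> dict[str, str]:
    """Read every parameter under `path` from SSM Parameter Store.

    Requires boto3 and AWS credentials; imported lazily so that
    build_inventory below can be used and tested without them.
    """
    import boto3

    ssm = boto3.client("ssm")
    paginator = ssm.get_paginator("get_parameters_by_path")
    params: dict[str, str] = {}
    for page in paginator.paginate(Path=path, Recursive=True):
        for parameter in page["Parameters"]:
            params[parameter["Name"]] = parameter["Value"]
    return params


def build_inventory(params: dict[str, str], prefix: str = "/atlas/nodes/") -> dict:
    """Turn {'<prefix><group>/<host>': '<ip>'} into Ansible inventory JSON.

    Assumption: the naming scheme is hypothetical; adapt it to whatever
    names Terraform actually writes.
    """
    inventory: dict = {"_meta": {"hostvars": {}}}
    for name, ip in sorted(params.items()):
        if not name.startswith(prefix):
            continue  # ignore parameters outside the node namespace
        parts = name[len(prefix):].split("/")
        if len(parts) != 2:
            continue  # expect exactly <group>/<host>
        group, host = parts
        inventory.setdefault(group, {"hosts": []})["hosts"].append(host)
        inventory["_meta"]["hostvars"][host] = {"ansible_host": ip}
    return inventory
```

A dynamic inventory script would print json.dumps(build_inventory(fetch_parameters("/atlas/nodes/"))) when Ansible calls it with --list, so no IP address ever needs to be written into a playbook or inventory file.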

Kubernetes Orchestration

Ansible playbooks automate the full Kubernetes cluster lifecycle, from OS hardening to cluster bootstrapping, in five sequential phases driven by a single Master Orchestrator playbook.

Automated Kubernetes v1.29 installation across all nodes
Production-grade OS hardening: swap management and kernel network settings (the net.bridge.bridge-nf-call-iptables sysctl)
Flannel CNI for internal cluster pod networking
site.yaml Master Orchestrator executes all 5 phases in sequence
Dynamic inventory powered by SSM Parameter Store — no hardcoded configuration
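The OS hardening step can be verified after the fact. The sketch below is a hedged example, not the project's playbook logic: it parses `sysctl -a` style output and the contents of /proc/swaps. It assumes that "swap management" means swap is disabled (the default kubeadm expectation), and it adds net.ipv4.ip_forward as a commonly required companion setting that is not mentioned above. All function names are hypothetical.

```python
# Assumption: these are the values a kubeadm-style node is expected to have.
REQUIRED = {
    "net.bridge.bridge-nf-call-iptables": "1",
    "net.ipv4.ip_forward": "1",
}


def parse_sysctl(text: str) -> dict[str, str]:
    """Parse 'key = value' lines as printed by `sysctl -a`."""
    settings: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blanks and comments
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def unmet_requirements(
    actual: dict[str, str], required: dict[str, str] = REQUIRED
) -> dict[str, str]:
    """Return the required settings that are missing or have another value."""
    return {key: want for key, want in required.items() if actual.get(key) != want}


def swap_is_off(proc_swaps_text: str) -> bool:
    """/proc/swaps holds one header line plus one line per active swap area."""
    lines = [line for line in proc_swaps_text.splitlines() if line.strip()]
    return len(lines) <= 1
```

On a node, the inputs would come from running `sysctl -a` and reading /proc/swaps. An empty dictionary from unmet_requirements together with True from swap_is_off suggests the node meets these prerequisites.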

Observability Stack

A standalone monitoring node runs the LGTM+P stack, engineered to survive independently of the Kubernetes cluster. Logs and traces are persisted to S3, so retention is not bounded by the node's local disk.

Loki: log aggregation and querying
Grafana: dashboarding and unified visualization
Tempo: distributed trace collection and storage
Prometheus: metrics scraping and alerting
S3-backed persistence for logs and traces; S3 is designed for 99.999999999% (11 nines) object durability
Mock log/trace generators deployed via Ansible to validate the full end-to-end telemetry pipeline
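A mock log generator for validating the pipeline could be as small as the sketch below. This is an assumption about how such a generator might work, not the one Atlas deploys: it builds a request body for Loki's HTTP push endpoint (/loki/api/v1/push) and posts it with the standard library. The URL, the label set, and the function names are hypothetical placeholders.

```python
import json
import time
import urllib.request
from typing import Optional


def loki_payload(
    lines: list[str], labels: dict[str, str], start_ns: Optional[int] = None
) -> dict:
    """Build a Loki push body: one stream, one [timestamp, line] pair per entry.

    Timestamps are Unix epoch nanoseconds encoded as strings.
    """
    if start_ns is None:
        start_ns = time.time_ns()
    # Offset each line by 1 ns so entries keep their order within the stream.
    values = [[str(start_ns + offset), line] for offset, line in enumerate(lines)]
    return {"streams": [{"stream": dict(labels), "values": values}]}


def push(payload: dict, url: str = "http://localhost:3100/loki/api/v1/push") -> int:
    """POST the payload to Loki and return the HTTP status code.

    Assumption: Loki listens on its default port 3100 on the local host.
    """
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

Calling push(loki_payload(["mock line"], {"job": "atlas-mock"})) on the monitoring node and then querying the same label in Grafana would exercise the log path end to end. Traces would need a separate generator for Tempo.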

Access & Operations

Private, encrypted access to the monitoring dashboard is provided through a custom SSM tunnel utility — no bastion host required.

atlas-console.sh: SSM Port Forwarding tunnel exposing Grafana at localhost:3000
No jump server or bastion host — SSM fully replaces traditional SSH access patterns
Operational access logs captured in AWS CloudTrail for full auditability
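The tunnel that atlas-console.sh opens can be approximated as follows. The script's real contents are not shown here, so this is a sketch under assumptions: it wraps the AWS CLI `aws ssm start-session` command with the AWS-StartPortForwardingSession document, and it requires the AWS CLI and the Session Manager plugin on the local machine. The function names and the instance ID in the usage note are placeholders.

```python
import json
import subprocess


def tunnel_command(
    instance_id: str, remote_port: int = 3000, local_port: int = 3000
) -> list[str]:
    """Build the AWS CLI call that forwards local_port to remote_port on the node."""
    # The SSM document expects each parameter as a list of strings.
    parameters = {
        "portNumber": [str(remote_port)],
        "localPortNumber": [str(local_port)],
    }
    return [
        "aws", "ssm", "start-session",
        "--target", instance_id,
        "--document-name", "AWS-StartPortForwardingSession",
        "--parameters", json.dumps(parameters),
    ]


def open_tunnel(instance_id: str) -> int:
    """Run the tunnel in the foreground and return the CLI's exit code."""
    return subprocess.call(tunnel_command(instance_id))
```

With the monitoring node's instance ID, for example open_tunnel("i-0123456789abcdef0"), Grafana becomes reachable at localhost:3000 for as long as the session stays open, with no inbound port exposed on the node.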