Atlas
Project Atlas is a production-grade infrastructure blueprint for deploying a secure Kubernetes cluster and a standalone observability stack on AWS. It is built on a 'Portless, Serverless-Storage, and Automated-Discovery' philosophy: no publicly opened ports, S3-backed persistence, and node discovery through SSM Parameter Store.

Executive Summary
Atlas automates the provisioning of a 3-node Kubernetes cluster (1 Controller, 2 Workers) and a dedicated Monitoring node on AWS. Architectural decisions prioritize a zero-trust perimeter, infrastructure reproducibility via IaC, and survivable observability independent of the K8s cluster itself.
Security Architecture
The cluster follows a Zero-Public-Port model: SSH (port 22) is never opened on any node.
- All instance access managed exclusively via AWS SSM Session Manager
- Identity-Based Access using IAM roles — no long-term AWS keys stored on hosts
- Private subnets only — no EC2 instances carry public endpoints
- IAM roles scoped per workload (controller, worker, monitoring node)
- All administrative access is fully auditable via AWS CloudTrail
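
One way to check the Zero-Public-Port claim is to scan security group rules for port 22 exposed to the whole internet. The sketch below is an illustration, not part of Atlas: it assumes the input has the shape of the `SecurityGroups` list returned by `aws ec2 describe-security-groups`, and the function name and sample group IDs are hypothetical.

```python
WORLD_CIDRS = {"0.0.0.0/0", "::/0"}


def find_public_ingress(security_groups, port=22):
    """Return (group_id, cidr) pairs where `port` is reachable from anywhere."""
    findings = []
    for group in security_groups:
        for perm in group.get("IpPermissions", []):
            protocol = perm.get("IpProtocol")
            if protocol == "-1":
                # "-1" means all protocols and all ports.
                covers_port = True
            else:
                low, high = perm.get("FromPort"), perm.get("ToPort")
                covers_port = (
                    protocol == "tcp"
                    and low is not None
                    and high is not None
                    and low <= port <= high
                )
            if not covers_port:
                continue
            cidrs = [r.get("CidrIp") for r in perm.get("IpRanges", [])]
            cidrs += [r.get("CidrIpv6") for r in perm.get("Ipv6Ranges", [])]
            for cidr in cidrs:
                if cidr in WORLD_CIDRS:
                    findings.append((group["GroupId"], cidr))
    return findings


# Hypothetical sample data: one offending group and one internal-only group.
SAMPLE_GROUPS = [
    {
        "GroupId": "sg-open",
        "IpPermissions": [
            {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
             "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}
        ],
    },
    {
        "GroupId": "sg-internal",
        "IpPermissions": [
            {"IpProtocol": "tcp", "FromPort": 6443, "ToPort": 6443,
             "IpRanges": [{"CidrIp": "10.0.0.0/16"}]}
        ],
    },
]

if __name__ == "__main__":
    for group_id, cidr in find_public_ingress(SAMPLE_GROUPS):
        print(f"{group_id} exposes port 22 to {cidr}")
```

An empty result for every group in the VPC is consistent with the model described above; it does not replace reviewing NACLs and subnet routing.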

Infrastructure as Code
All cloud resources are provisioned and managed via Terraform, ensuring reproducibility and controlled change management across environments.
- AWS VPC with private subnets, route tables, NACLs, and security groups
- IAM roles and instance profiles scoped per node type
- S3 buckets for serverless log and trace persistence
- SSM Parameter Store entries written by Terraform and read at run time by Ansible, so no IP addresses are hardcoded
- EC2 instances: 1 Controller, 2 Workers, 1 Monitoring node
Kubernetes Orchestration
Ansible playbooks automate the full Kubernetes cluster lifecycle, from OS hardening to cluster bootstrapping. A single Master Orchestrator playbook runs all five phases in sequence.
- Automated Kubernetes v1.29 installation across all nodes
- Production-grade OS hardening: swap management and kernel network settings (net.bridge.bridge-nf-call-iptables)
- Flannel CNI for internal cluster pod networking
- site.yaml Master Orchestrator executes all 5 phases in sequence
- Dynamic inventory powered by SSM Parameter Store — no hardcoded configuration
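
The dynamic inventory idea can be sketched as a small transform from Parameter Store entries to Ansible's inventory JSON. This is a minimal illustration under stated assumptions, not the project's actual inventory code: the `/atlas/nodes/<group>/<host>` naming scheme, the sample addresses, and the function name are all hypothetical, and the input is assumed to have the shape of the `Parameters` list returned by `aws ssm get-parameters-by-path`.

```python
import json


def build_inventory(parameters, prefix="/atlas/nodes/"):
    """Convert SSM parameters into Ansible dynamic-inventory JSON structure."""
    inventory = {"_meta": {"hostvars": {}}}
    for param in parameters:
        name = param["Name"]
        if not name.startswith(prefix):
            continue
        parts = name[len(prefix):].split("/")
        if len(parts) != 2:
            # Skip entries that do not follow the assumed <group>/<host> layout.
            continue
        group, host = parts
        inventory.setdefault(group, {"hosts": []})["hosts"].append(host)
        inventory["_meta"]["hostvars"][host] = {"ansible_host": param["Value"]}
    return inventory


# Hypothetical response body; real names and values come from Terraform.
SAMPLE_RESPONSE = {
    "Parameters": [
        {"Name": "/atlas/nodes/controller/controller-1", "Value": "10.0.1.10"},
        {"Name": "/atlas/nodes/workers/worker-1", "Value": "10.0.1.11"},
        {"Name": "/atlas/nodes/workers/worker-2", "Value": "10.0.1.12"},
        {"Name": "/atlas/nodes/monitoring/monitor-1", "Value": "10.0.1.20"},
    ]
}

if __name__ == "__main__":
    print(json.dumps(build_inventory(SAMPLE_RESPONSE["Parameters"]), indent=2))
```

Because the addresses are looked up at run time, rebuilding the instances with Terraform changes the stored values without requiring any edit to the playbooks.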
Observability Stack
A standalone monitoring node runs the LGTM+P stack (Loki, Grafana, Tempo, Prometheus), engineered to survive independently of the Kubernetes cluster. Logs and traces are persisted to S3, so retention is not limited by the node's local disk.
- Loki: log aggregation and querying
- Grafana: dashboarding and unified visualization
- Tempo: distributed trace collection and storage
- Prometheus: metrics scraping and alerting
- S3-backed persistence for logs and traces; S3 is designed for 99.999999999% (11 nines) object durability
- Mock log/trace generators deployed via Ansible to validate the full end-to-end telemetry pipeline
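
A mock generator of the kind mentioned above can be as simple as emitting JSON log lines that carry a trace ID, so a log entry in Loki can be correlated with a trace in Tempo. The sketch below is an assumption about what such a generator might look like, not the one Atlas deploys; the field names, service name, and message text are hypothetical.

```python
import json
import random
from datetime import datetime, timezone

LEVELS = ("info", "warn", "error")


def make_log_event(rng, service="mock-service"):
    """Build one synthetic log event with W3C-sized trace and span IDs."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "service": service,
        "level": rng.choice(LEVELS),
        # 128-bit trace ID and 64-bit span ID, rendered as lowercase hex.
        "trace_id": f"{rng.getrandbits(128):032x}",
        "span_id": f"{rng.getrandbits(64):016x}",
        "message": "synthetic event for pipeline validation",
    }


def generate_lines(count, seed=0):
    """Return `count` JSON log lines; a fixed seed makes the IDs repeatable."""
    rng = random.Random(seed)
    return [json.dumps(make_log_event(rng)) for _ in range(count)]


if __name__ == "__main__":
    for line in generate_lines(3):
        print(line)
```

Shipping these lines to Loki and the matching spans to Tempo is left out here, since the source does not specify which agent or protocol the pipeline uses.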

Access & Operations
Private, encrypted access to the monitoring dashboard is provided through a custom SSM tunnel utility — no bastion host required.
- atlas-console.sh: SSM Port Forwarding tunnel exposing Grafana at localhost:3000
- No jump server or bastion host — SSM fully replaces traditional SSH access patterns
- Operational access logs captured in AWS CloudTrail for full auditability
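
The tunnel that atlas-console.sh opens can be approximated with the AWS CLI's `start-session` command and the `AWS-StartPortForwardingSession` document. The sketch below only builds the command; it is not the script's actual contents, and the function name and instance ID are hypothetical placeholders.

```python
import json
import shlex


def port_forward_command(instance_id, remote_port=3000, local_port=3000):
    """Build an AWS CLI command that forwards local_port to remote_port over SSM."""
    parameters = {
        "portNumber": [str(remote_port)],
        "localPortNumber": [str(local_port)],
    }
    return [
        "aws", "ssm", "start-session",
        "--target", instance_id,
        "--document-name", "AWS-StartPortForwardingSession",
        "--parameters", json.dumps(parameters),
    ]


if __name__ == "__main__":
    # Placeholder instance ID; the real one would identify the monitoring node.
    command = port_forward_command("i-0123456789abcdef0")
    print(shlex.join(command))
```

Running the printed command requires the AWS CLI and the Session Manager plugin on the operator's machine; while the session is open, Grafana is reachable at localhost:3000.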