
# Amazon EKS deployment guide for Maia runners

export const m_runner = "Maia runner";

export const maia = "Maia";

This document helps you understand AWS-specific architecture decisions, deployment considerations, and readiness requirements for running {m_runner}s on Amazon Elastic Kubernetes Service (EKS). EKS provides a managed Kubernetes control plane for running Matillion {m_runner}s in your AWS infrastructure. This deployment model combines AWS-native security features (IAM Roles for Service Accounts) with Kubernetes operational flexibility.

For complete Terraform modules, Helm charts, and step-by-step implementation instructions, see the [AWS {m_runner} directory](https://github.com/matillion-public/deployment-library/tree/main/agent/aws) in the Matillion deployment library on GitHub.

You should read the general [Kubernetes deployment guide](/docs/guides/kubernetes-deployment-guide) before reading this document.

### What you get with EKS deployment

* Managed Kubernetes control plane. AWS handles the Kubernetes API server, etcd, and control plane upgrades.
* IAM Roles for Service Accounts (IRSA). Credential-free authentication to AWS services.
* Flexible worker nodes. EC2 instances, Fargate, or hybrid deployments.
* AWS integration. Native support for CloudWatch, VPC networking, and AWS Load Balancers.
* Horizontal Pod Autoscaler. Scale {m_runner} pods based on metrics.
* Cluster Autoscaler. Automatically adjust EC2 worker node capacity.

### When to choose EKS

Choose EKS for {m_runner} deployment when:

* You have existing AWS infrastructure and expertise.
* You need IRSA for secure, credential-free access to AWS services (S3, Secrets Manager).
* You require integration with AWS monitoring and security tools (CloudWatch, GuardDuty).
* You want Kubernetes operational flexibility with AWS managed services.
* You plan to deploy {m_runner}s across multiple availability zones for high availability.

***

## Prerequisites and readiness

### AWS account requirements

Required AWS services:

* Amazon EKS enabled in your target region.
* Sufficient EC2 service quotas for worker nodes.
* VPC with public and/or private subnet configuration.

Your AWS identity (user or role) needs permissions to:

* Create and manage EKS clusters and node groups.
* Create and manage EC2 instances, Auto Scaling groups, and Launch Templates.
* Create IAM roles, policies, and IRSA configurations.
* Manage VPC resources (subnets, route tables, security groups, NAT gateways).
* Access AWS Secrets Manager (for storing OAuth credentials).
* Create S3 buckets (for {m_runner} staging data).
* Configure CloudWatch Logs and metrics.

We recommend you use an administrative role for initial deployment, then scope down to least-privilege for ongoing operations.

### Maia account setup

Before deploying infrastructure, create a {m_runner} in {maia}.

You need to obtain the following information about the {m_runner} you created:

* **Account ID:** Your Matillion organization identifier.
* **Runner ID:** Unique identifier for this {m_runner} (auto-generated).
* **OAuth Client ID and Secret:** {m_runner} authentication credentials.
* **Region:** `us1` (United States), `eu1` (Europe), or `au1` (Australia/Asia-Pacific).

These credentials are required for the Helm deployment in [Phase 4](#phase-4-maia-runner-deployment-helm). Store them securely.

For details, read [create a {m_runner}](/docs/guides/create-a-runner#prerequisites).

### Required tools

Ensure these tools are installed and configured on your deployment workstation:

* [Terraform 1.0+](https://www.terraform.io/downloads.html) for infrastructure provisioning.
* [AWS CLI](https://aws.amazon.com/cli/) configured with credentials (`aws configure` or environment variables).
* [kubectl](https://kubernetes.io/docs/tasks/tools/) for Kubernetes cluster management.
* [Helm 3.x](https://helm.sh/docs/intro/install/) for application deployment.

Verify prerequisites:

```bash theme={null}
# Verify AWS CLI authentication
aws sts get-caller-identity

# Verify tool versions
terraform --version
kubectl version --client
helm version
```

***

## Architecture decision points

Before deploying, make these key architectural decisions.

### 1. VPC strategy

| Option               | When to use                                      | What gets created                                                                 |
| -------------------- | ------------------------------------------------ | --------------------------------------------------------------------------------- |
| **Create new VPC**   | Isolated {m_runner} deployment, no existing VPC. | New VPC with public and private subnets across 3 AZs, NAT gateways, route tables. |
| **Use existing VPC** | Integrate with existing AWS infrastructure.      | EKS cluster in existing VPC, may create new subnets if needed.                    |

Set `use_existing_vpc` to `true` or `false` in your Terraform variables.

### 2. Subnet strategy

If you use an existing VPC, decide whether to use its existing subnets or create new ones.

| Option                 | Requirements                                                            | Considerations                                           |
| ---------------------- | ----------------------------------------------------------------------- | -------------------------------------------------------- |
| **Existing subnets**   | Private subnets with NAT gateway or NAT instance for outbound internet. | Must have available IP addresses for EKS nodes and pods. |
| **Create new subnets** | Room in existing VPC CIDR for new subnet ranges.                        | Terraform creates new subnets within existing VPC.       |

<Note>
  If using EKS Fargate, you **must** provide private subnets with NAT gateway access. Fargate requires private subnets and will fail with public subnets.
</Note>
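
With the default Amazon VPC CNI plugin, each pod takes an IP address from its subnet, so it is worth checking subnet capacity up front. The sketch below is a rough estimate using Python's standard `ipaddress` module. It accounts for the five addresses AWS reserves in every subnet but not for addresses already in use, and the CIDR ranges are placeholders.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # AWS reserves the first four addresses and the last one


def usable_ips(cidr: str) -> int:
    """Return how many addresses AWS leaves assignable in a subnet of this size."""
    network = ipaddress.ip_network(cidr)
    return max(network.num_addresses - AWS_RESERVED_PER_SUBNET, 0)


# Placeholder subnets, one per availability zone
for cidr in ["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24"]:
    print(cidr, usable_ips(cidr))
```

Compare the result with the number of nodes and pods you expect to run in each subnet, leaving headroom for rolling upgrades.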

### 3. Public vs private cluster

| Setting             | API server access                                         | Use case                                                     |
| ------------------- | --------------------------------------------------------- | ------------------------------------------------------------ |
| **Public cluster**  | API server publicly accessible from authorized IP ranges. | Development, testing, faster initial setup.                  |
| **Private cluster** | API server accessible only from within VPC.               | Production, enhanced security, requires bastion host or VPN. |

Set `is_private_cluster` to `true` or `false` in your Terraform variables.

For private clusters, ensure:

* Deployment workstation has VPN or bastion access to VPC.
* CI/CD runners can access cluster API server.
* Authorized IP ranges include your access points.

### 4. Authentication strategy

IAM Roles for Service Accounts (IRSA), recommended:

* {m_runner} pods assume IAM role without storing credentials.
* Automatic credential rotation by AWS STS.
* Least-privilege access to AWS services (S3, Secrets Manager).
* Terraform module configures IRSA automatically.

Static OAuth credentials:

* OAuth credentials stored in Kubernetes Secrets.
* Use only if IRSA cannot be implemented (not recommended for EKS).

Recommendation: Always use IRSA for EKS deployments. The deployment library Terraform module creates the required IAM roles and policies automatically.

### 5. Worker node strategy

Node instance sizing:

| Instance type | vCPU | Memory | Use case                              |
| ------------- | ---- | ------ | ------------------------------------- |
| `t3.medium`   | 2    | 4 GB   | Development, testing, low workload.   |
| `t3.large`    | 2    | 8 GB   | Small production workloads.           |
| `m5.large`    | 2    | 8 GB   | Baseline production.                  |
| `m5.xlarge`   | 4    | 16 GB  | Medium production workloads.          |
| `m5.2xlarge`  | 8    | 32 GB  | High-throughput production workloads. |

Considerations:

* **Transformation-heavy workloads:** SQL generation tasks, low {m_runner} CPU usage → Smaller instances sufficient.
* **Data ingestion/scripting workloads:** High data transfer, processing on {m_runner} → Larger instances needed.
* **Pod density:** Larger instances allow more {m_runner} pods per node, reducing operational overhead.

Configure instance type in Terraform node group settings.
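
To reason about pod density, you can estimate how many {m_runner} pods fit on one node. The following is a minimal sketch: the instance sizes come from the table above, while the pod resource requests and the per-node system reserve are illustrative assumptions, not values taken from the Helm chart.

```python
import math

# vCPU and memory (GiB) per instance type, from the sizing table
INSTANCE_TYPES = {
    "t3.medium": (2, 4),
    "t3.large": (2, 8),
    "m5.large": (2, 8),
    "m5.xlarge": (4, 16),
    "m5.2xlarge": (8, 32),
}


def pods_per_node(instance_type, pod_cpu, pod_mem_gib,
                  reserved_cpu=0.5, reserved_mem_gib=1.0):
    """Estimate how many pods fit on one node after a system reserve.

    pod_cpu and pod_mem_gib are the pod's resource requests. The reserve
    defaults are assumptions standing in for kubelet and daemonset overhead.
    """
    vcpu, mem_gib = INSTANCE_TYPES[instance_type]
    by_cpu = math.floor((vcpu - reserved_cpu) / pod_cpu)
    by_mem = math.floor((mem_gib - reserved_mem_gib) / pod_mem_gib)
    return max(min(by_cpu, by_mem), 0)


# Hypothetical pod requests: 1 vCPU and 2 GiB
print(pods_per_node("m5.xlarge", pod_cpu=1, pod_mem_gib=2))
```

Substitute the requests you actually configure in your Helm values. Whichever of CPU or memory runs out first sets the limit.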

### 6. Scaling strategy

Static replica count:

* Fixed number of {m_runner} pods (e.g., 2, 5, 10).
* Predictable capacity and costs.
* Suitable for steady-state workloads.

Horizontal Pod Autoscaler (HPA):

* Automatically scales {m_runner} pods based on workload metrics.
* Configure min/max replicas (e.g., min: 2, max: 10).
* Responds to workload spikes dynamically.

Cluster Autoscaler:

* Automatically adds/removes EC2 worker nodes based on pod scheduling needs.
* Works in tandem with HPA.
* Optimizes infrastructure costs.

Recommendation: Start with static replicas, then add HPA once you understand your workload patterns.

***

## Container images

{m_runner} images are available in the Amazon ECR Public registry:

Image repository: `public.ecr.aws/matillion/etl-agent`.

Available tags:

* `:stable` - Slower release cycle, maximum stability, recommended for production.
* `:current` - Faster release cycle, earlier access to new features.

Both tags are production-ready. Choose `:stable` for stability-first deployments, or `:current` for early access to features.

No authentication is required. ECR Public images can be pulled without AWS credentials.

***

## Deployment journey

### Expected timeline

* **Phase 1 — {m_runner} registration:** 10 minutes (Matillion console).
* **Phase 2 — Infrastructure provisioning:** 15-20 minutes (Terraform: VPC, EKS cluster, IRSA).
* **Phase 3 — Configure kubectl access:** 2 minutes (AWS CLI + kubectl).
* **Phase 4 — {m_runner} deployment:** 5-10 minutes (Helm chart).
* **Phase 5 — Validation:** 15-30 minutes (Pre-deployment checks + testing).

**Total:** Roughly 50-75 minutes for a first-time deployment.

### Phase 1: Maia runner registration (Matillion console)

Refer to [Prerequisites](#prerequisites-and-readiness), above, for details of {m_runner} creation.

What you'll have at the end:

* Account ID
* {m_runner} ID
* OAuth Client ID and Secret
* Region (us1, eu1, or au1)

Store these securely. You'll need them for Helm deployment in [Phase 4](#phase-4-maia-runner-deployment-helm).

### Phase 2: Infrastructure provisioning (Terraform)

The Terraform module creates:

1. **Amazon EKS cluster:**

   * Managed Kubernetes control plane (API server, etcd, controller manager).
   * EKS-managed upgrades and patching.
   * CloudWatch logging for control plane components.

2. **Worker node groups:**

   * EC2 Auto Scaling group with configurable instance types.
   * Launch template with Amazon EKS-optimized AMI.
   * Kubernetes node labels and taints (if configured).

3. **IAM roles and IRSA:**

   * EKS cluster IAM role (for cluster operations).
   * Node group IAM role (for EC2 instances).
   * IRSA-enabled service account role for {m_runner} pods.
   * IAM policies for S3, Secrets Manager, CloudWatch.

4. **VPC and networking (if creating new):**

   * VPC with public and private subnets across 3 availability zones.
   * NAT gateways for outbound internet from private subnets.
   * Route tables and internet gateway.
   * Security groups for control plane and node group communication.

5. **Security groups:**

   * Control plane security group (API server access).
   * Node security group (inter-node and pod communication).
   * Rules for HTTPS outbound to Matillion control plane.

In `terraform.tfvars` you will need to make these configuration changes:

* **region:** Your AWS region (e.g., `us-east-1`, `us-west-2`).
* **name:** Cluster name prefix (e.g., `matillion-agent`).
* **use\_existing\_vpc:** `true` or `false`.
* **cidr\_block:** VPC CIDR if creating new (e.g., `172.5.0.0/16`).
* **is\_private\_cluster:** `true` or `false`.
* **authorized\_ip\_ranges:** List of CIDRs allowed to access API server.
* **tags:** Resource tags for cost allocation and organization.
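
Terraform also reads variables from a `terraform.tfvars.json` file, which is convenient when the values are generated by a script. The sketch below renders the variables listed above as JSON. Every value is a placeholder, and the value types (for example, `tags` as a map) are assumptions to check against the module's variable definitions.

```python
import json

# Variable names come from the list above; every value is a placeholder.
tfvars = {
    "region": "us-east-1",
    "name": "matillion-agent",
    "use_existing_vpc": False,
    "cidr_block": "10.0.0.0/16",
    "is_private_cluster": False,
    "authorized_ip_ranges": ["203.0.113.0/24"],
    "tags": {"Environment": "dev", "Owner": "data-platform"},
}


def render_tfvars(variables: dict) -> str:
    """Serialize variables as JSON, the format Terraform expects in *.tfvars.json."""
    return json.dumps(variables, indent=2)


print(render_tfvars(tfvars))
```

Save the output as `terraform.tfvars.json` next to the module, or keep using a hand-written `terraform.tfvars` if you prefer HCL syntax.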

After `terraform apply` completes, retrieve the Terraform outputs using:

```bash theme={null}
terraform output cluster_name
terraform output service_account_role_arn
```

The **service\_account\_role\_arn** is required for Helm deployment in [Phase 4](#phase-4-maia-runner-deployment-helm).

**Where to implement:** [EKS Terraform module](https://github.com/matillion-public/deployment-library/tree/main/agent/aws/eks).

### Phase 3: Configure kubectl access

You must configure kubectl to authenticate to your EKS cluster using the AWS CLI.

The `aws eks update-kubeconfig` command retrieves cluster endpoint and certificate authority data, then configures your local `kubeconfig` file with AWS IAM authentication.

The command is:

```bash theme={null}
aws eks update-kubeconfig --region <region> --name <cluster-name>
```

Use the `<region>` from your Terraform variables and the `<cluster-name>` reported by `terraform output cluster_name`.

Verification:

```bash theme={null}
kubectl get nodes
kubectl get namespaces
```

You should see EKS worker nodes and default Kubernetes namespaces.

### Phase 4: Maia runner deployment (Helm)

The Helm chart deploys:

1. **{m_runner} pods:**

   * Deployment with configurable replica count (default: 2).
   * Each pod runs the Matillion {m_runner} binary.
   * Resource requests and limits for CPU and memory.

2. **ServiceAccount:**

   * Kubernetes ServiceAccount annotated with IAM role ARN (from [Phase 2](#phase-2-infrastructure-provisioning-terraform)).
   * Enables IRSA for credential-free AWS access.

3. **ConfigMaps:**

   * {m_runner} configuration (account ID, {m_runner} ID, region).
   * Environment-specific settings.

4. **Secrets:**

   * OAuth Client ID and Secret for Matillion control plane authentication.

5. **Service:**

   * Kubernetes Service exposing Prometheus metrics endpoint (port 8080).
   * Annotated for Prometheus service discovery.

You will provide the following configuration values:

| Value                                   | Source                                                                             | Example                                   |
| --------------------------------------- | ---------------------------------------------------------------------------------- | ----------------------------------------- |
| `cloudProvider`                         | Static                                                                             | `"aws"`                                   |
| `config.oauthClientId`                  | [Phase 1](#phase-1-maia-runner-registration-matillion-console) (Matillion console) | `"abc123..."`                             |
| `config.oauthClientSecret`              | [Phase 1](#phase-1-maia-runner-registration-matillion-console) (Matillion console) | `"secret456..."`                          |
| `serviceAccount.roleArn`                | [Phase 2](#phase-2-infrastructure-provisioning-terraform) (`terraform output`)     | `"arn:aws:iam::123456789012:role/..."`    |
| `dpcAgent.dpcAgent.env.accountId`       | [Phase 1](#phase-1-maia-runner-registration-matillion-console) (Matillion console) | `"12345"`                                 |
| `dpcAgent.dpcAgent.env.agentId`         | [Phase 1](#phase-1-maia-runner-registration-matillion-console) (Matillion console) | `"agent-prod-01"`                         |
| `dpcAgent.dpcAgent.env.matillionRegion` | [Phase 1](#phase-1-maia-runner-registration-matillion-console) (Matillion console) | `"us1"`, `"eu1"`, or `"au1"`              |
| `dpcAgent.replicas`                     | Your decision                                                                      | `2` (baseline) to `10+` (high throughput) |
| `dpcAgent.dpcAgent.image.repository`    | Static                                                                             | `"public.ecr.aws/matillion/etl-agent"`    |
| `dpcAgent.dpcAgent.image.tag`           | Your decision                                                                      | `"stable"` or `"current"`                 |
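
The dotted names in the table map to nested keys in a Helm values file. As an illustration, the sketch below expands them into a nested structure and prints it as JSON. The key names are from the table, every value is a placeholder, and the helper `set_path` is a hypothetical name introduced for this example.

```python
import json


def set_path(tree: dict, dotted_key: str, value) -> None:
    """Place value in a nested dict at the position named by a dotted key."""
    *parents, leaf = dotted_key.split(".")
    node = tree
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


# Keys are from the table above; every value is a placeholder.
settings = {
    "cloudProvider": "aws",
    "config.oauthClientId": "<oauth-client-id>",
    "config.oauthClientSecret": "<oauth-client-secret>",
    "serviceAccount.roleArn": "<service-account-role-arn>",
    "dpcAgent.dpcAgent.env.accountId": "<account-id>",
    "dpcAgent.dpcAgent.env.agentId": "<runner-id>",
    "dpcAgent.dpcAgent.env.matillionRegion": "us1",
    "dpcAgent.replicas": 2,
    "dpcAgent.dpcAgent.image.repository": "public.ecr.aws/matillion/etl-agent",
    "dpcAgent.dpcAgent.image.tag": "stable",
}

values = {}
for key, value in settings.items():
    set_path(values, key, value)

print(json.dumps(values, indent=2))
```

Because JSON is valid YAML, the printed output can be saved to a file and passed to Helm with `-f`. Keep any file that contains the OAuth secret out of version control.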

Where to implement:

* [Helm chart documentation](https://github.com/matillion-public/deployment-library/tree/main/agent/helm).
* [values.yaml reference](https://github.com/matillion-public/deployment-library/blob/main/agent/helm/agent/values.yaml).

### Phase 5: Validation and testing

Run the automated pre-deployment validation script to verify the {m_runner} pod environment:

```bash theme={null}
# From deployment library root
./agent/helm/checks/run-check.sh --namespace matillion --release matillion-agent
```

What gets checked:

* Python 3 and Java runtime available.
* Filesystem permissions correct.
* Environment variables set (ACCOUNT\_ID, AGENT\_ID, etc.).
* cgroup CPU and memory limits applied.
* Network connectivity to Matillion control plane.
* Security agents that might interfere (CrowdStrike, Prisma Cloud).

Manual verification:

1. **Matillion Console:** Navigate to **Manage runners**. Verify {m_runner} status shows "Connected".
2. **Test pipeline:** Create a simple pipeline (for example, a "Hello World" transformation) and run it.
3. **Prometheus metrics:** Verify metrics available at `http://<pod-ip>:8080/actuator/prometheus`.

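{m_runner} application logs are available in CloudWatch Logs if Container Insights log collection is enabled:

* Log group: `/aws/containerinsights/<cluster-name>/application`.
* Pod logs: Filterable by pod name.

EKS control plane logs, if enabled, are written to a separate log group, `/aws/eks/<cluster-name>/cluster`.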

***

## Maia runner architecture on EKS

### IAM Roles for Service Accounts (IRSA)

How IRSA works:

1. **Kubernetes ServiceAccount** is annotated with IAM role ARN.
2. **EKS OIDC Provider** allows Kubernetes to issue tokens trusted by AWS IAM.
3. **{m_runner} pod** assumes IAM role using projected service account token.
4. **AWS STS** exchanges token for temporary AWS credentials (valid 1 hour, auto-refreshed).
5. **{m_runner}** accesses AWS services (S3, Secrets Manager) without storing credentials.

Security benefits:

* No long-lived AWS credentials in cluster.
* Automatic credential rotation (every hour).
* Least-privilege access (IAM role scoped to specific S3 buckets, secrets).
* Pod-level isolation (each pod has its own token).

What the Terraform module creates for IRSA:

* IAM OIDC provider for EKS cluster.
* IAM role for {m_runner} service account.
* IAM policy allowing access to S3, Secrets Manager, CloudWatch.
* Trust relationship allowing Kubernetes service account to assume role.
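
To make the trust relationship concrete, the sketch below builds the general shape of an IRSA trust policy. It is illustrative only and not the Terraform module's exact output. The account ID, OIDC issuer, and service account name are placeholders, and the namespace `matillion` is an assumption.

```python
import json


def irsa_trust_policy(account_id, oidc_issuer, namespace, service_account):
    """Build a trust policy that lets one Kubernetes service account assume a role.

    oidc_issuer is the cluster's OIDC issuer URL without the https:// prefix.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_issuer}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Only this namespace and service account may assume the role
                        f"{oidc_issuer}:sub": f"system:serviceaccount:{namespace}:{service_account}",
                        f"{oidc_issuer}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }


# All arguments are placeholders
policy = irsa_trust_policy(
    account_id="123456789012",
    oidc_issuer="oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE0123456789",
    namespace="matillion",
    service_account="matillion-agent",
)
print(json.dumps(policy, indent=2))
```

The `sub` condition is what restricts the role to a single service account. Without it, any service account in the cluster could request credentials for the role.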

### Task capacity and throughput

**Per-pod capacity:** Each {m_runner} pod can execute up to 20 concurrent tasks.

**Throughput calculation:** Maximum concurrent tasks = (Number of {m_runner} pods) × 20.

Examples:

* 2 pods (default) = 40 concurrent tasks.
* 5 pods = 100 concurrent tasks.
* 10 pods = 200 concurrent tasks.
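
The same arithmetic can be run in reverse to size a deployment from an expected peak. This is a small sketch: the limit of 20 tasks per pod and the default of 2 pods come from this guide, while the peak figure is a made-up input.

```python
import math

TASKS_PER_POD = 20  # per-pod concurrency limit stated above


def max_concurrent_tasks(pods: int) -> int:
    """Total tasks that can run at once across all pods."""
    return pods * TASKS_PER_POD


def pods_needed(peak_tasks: int, minimum: int = 2) -> int:
    """Smallest pod count that covers peak_tasks without queuing."""
    return max(math.ceil(peak_tasks / TASKS_PER_POD), minimum)


print(max_concurrent_tasks(5))  # 100
print(pods_needed(130))         # 7
```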

**Scaling guidance:**

* For transformation workloads: Tasks generate SQL executed by data warehouse. {m_runner} CPU/memory usage is low. Fewer pods needed.
* For data ingestion workloads: Tasks transfer and process data on {m_runner}. {m_runner} CPU/memory usage is high. More pods needed.

**Queuing behavior:** When all pods are at capacity (20 tasks each), new tasks queue in Matillion's agent gateway until capacity becomes available.

***

## Monitoring and observability

### Native Prometheus metrics

{m_runner} pods expose Prometheus-compatible metrics at:

* **Endpoint:** `http://<pod-ip>:8080/actuator/prometheus`.
* **Service:** Automatically created by Helm chart with Prometheus annotations.

Key metrics:

* `app_version_info`: {m_runner} version and build metadata.
* `app_agent_status`: {m_runner} status (1 = running, 0 = stopped).
* `app_active_task_count`: Current number of executing tasks.

The Helm chart includes annotations for automatic Prometheus service discovery:

```yaml theme={null}
prometheus.io/scrape: "true"
prometheus.io/port: "8080"
prometheus.io/path: "/actuator/prometheus"
```

If Prometheus is deployed in your cluster, it will automatically discover and scrape these metrics.
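
For a quick health check without a Prometheus server, you can read the endpoint's text output directly. The parser below is a minimal sketch that assumes simple `name value` lines without timestamps and discards labels. The sample text is hypothetical, and real output will contain many more metrics.

```python
def parse_metrics(text: str) -> dict:
    """Parse 'name{labels} value' lines, skipping comments and blank lines.

    Labels are discarded, which is enough for a quick health check.
    """
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name_part, _, value = line.rpartition(" ")
        name = name_part.split("{", 1)[0]
        metrics[name] = float(value)
    return metrics


# Hypothetical scrape output; real label sets and values will differ.
sample = """\
# TYPE app_agent_status gauge
app_agent_status 1.0
# TYPE app_active_task_count gauge
app_active_task_count 7.0
app_version_info{version="0.0.0"} 1.0
"""

metrics = parse_metrics(sample)
print("running:", metrics["app_agent_status"] == 1.0)
print("active tasks:", metrics["app_active_task_count"])
```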

### AWS CloudWatch integration

Enable CloudWatch Container Insights for comprehensive EKS monitoring:

* Cluster-level metrics (CPU, memory, network).
* Pod-level metrics (resource usage per {m_runner} pod).
* Node-level metrics (EC2 worker node health).

{m_runner} application logs are streamed to CloudWatch Logs:

* Centralized log aggregation.
* Query with CloudWatch Logs Insights.
* Set up alarms on error patterns.

Recommended CloudWatch alarms:

* {m_runner} pod restarts > threshold.
* {m_runner} pods in CrashLoopBackOff state.
* Task execution failures (requires custom metric from {m_runner} logs).
* Worker node CPU/memory > 80%.

***

## Security best practices

### Network security

VPC configuration:

* Deploy {m_runner} pods in **private subnets** for enhanced security.
* Use NAT Gateway for outbound internet access (required for Matillion control plane).
* Restrict network security groups to minimum required ingress/egress.

Network connectivity requirements:

Outbound:

* **HTTPS (443)** to Matillion control plane (region-specific endpoints).
* **HTTPS/JDBC** to your relevant data warehouse endpoints.
* **HTTPS (443)** to AWS APIs (S3, Secrets Manager, STS for IRSA).
* **HTTP (80)** to Snowflake endpoints, which Snowflake uses for OCSP certificate checks.
* **HTTPS (443)** to any other endpoints your pipelines require.

Inbound:

* None. The {m_runner} initiates all connections, so no inbound traffic is required.
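
A simple TCP probe can confirm the outbound rules before you deploy. The sketch below uses only Python's standard `socket` module. The two targets are public AWS endpoints used as examples; add your Matillion control plane and data warehouse hosts, which are not listed here.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report(targets):
    """Print one reachability line per (host, port) pair."""
    for host, port in targets:
        status = "ok" if can_connect(host, port) else "unreachable"
        print(f"{host}:{port} {status}")


# Example targets only; extend with the endpoints your deployment needs.
TARGETS = [
    ("sts.amazonaws.com", 443),
    ("public.ecr.aws", 443),
]
```

Call `report(TARGETS)` from a host or pod inside the private subnets. A successful TCP connection shows that routing and security groups allow the traffic, but it does not validate TLS or authentication.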

Private cluster considerations:

* API server accessible only from VPC (or authorized VPN/bastion).
* Requires VPN or AWS Systems Manager Session Manager for kubectl access.
* CI/CD pipelines need VPC connectivity or VPN access.

### Pod security standards

The Helm chart implements Kubernetes pod security standards.

Security context configuration:

* Run as non-root user (UID 65534).
* Read-only root filesystem.
* No privilege escalation.
* Drop all Linux capabilities.
* Seccomp profile: RuntimeDefault.

Example from Helm chart:

```yaml theme={null}
# Pod-level settings, applied to every container in the pod
securityContext:
  runAsNonRoot: true
  runAsUser: 65534
  fsGroup: 65534
  seccompProfile:
    type: RuntimeDefault

# Container-level settings
containers:
  - securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
```

### Secrets management

OAuth credentials storage options:

* AWS Secrets Manager (recommended):
  * Store OAuth credentials in AWS Secrets Manager.
  * Use External Secrets Operator to sync to Kubernetes Secrets.
  * Automatic rotation support.
  * Centralized secret management across environments.
* Kubernetes Secrets (default):
  * Credentials provided via Helm values.
  * Stored as base64-encoded Kubernetes Secret.
  * Not encrypted at rest by default (enable EKS envelope encryption).

Recommendation: For production, use AWS Secrets Manager with External Secrets Operator for centralized, auditable secret management.
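
The reason base64-encoded Secrets need extra protection is that base64 is reversible without a key. A short demonstration, using a placeholder value rather than a real credential:

```python
import base64

secret_value = "example-client-secret"  # placeholder, not a real credential

# This is the form in which a Kubernetes Secret stores the value
encoded = base64.b64encode(secret_value.encode()).decode()

# Anyone who can read the Secret object can reverse the encoding
decoded = base64.b64decode(encoded).decode()

print(encoded)
print(decoded == secret_value)
```

Anyone with read access to the Secret can recover the original value, so restrict that access with Kubernetes RBAC in addition to encrypting Secrets at rest.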

### EKS envelope encryption

Enable envelope encryption for Kubernetes Secrets at rest:

* EKS integrates with AWS KMS.
* Secrets encrypted with customer-managed KMS key.
* Decryption on-demand when pods access secrets.

Configure in Terraform EKS module settings.

***

## Scaling considerations

### When to scale

Indicators to add more {m_runner} pods:

* Task queue depth consistently > 0 (check Matillion console or metrics).
* Pipeline execution time increases due to task queuing.
* More concurrent pipelines being executed.
* Workload characteristics change (more data ingestion vs transformation).

Indicators to keep current capacity:

* Task queue depth consistently = 0.
* Pipeline execution times stable.
* Workload primarily transformation (SQL generation).

### Horizontal Pod Autoscaler (HPA)

How it works:

* Kubernetes HPA monitors pod metrics.
* Automatically scales Deployment replicas within configured min/max range.
* Evaluates every 15 seconds (default), scales up/down based on thresholds.

Example HPA configuration:

* Min replicas: 2 (baseline availability).
* Max replicas: 10 (cost control).

Configure via Helm values or separate HPA manifest.

Read the [HPA documentation](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/) for details.
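
The core HPA scaling rule is `desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric)`, clamped to the min/max range. The sketch below applies that rule with the example bounds from this section. It ignores the tolerance band and stabilization window that the real controller also applies, and the CPU figures are hypothetical.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=2, max_replicas=10):
    """Apply the HPA scaling rule, then clamp to the configured bounds."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(raw, max_replicas))


# Hypothetical: 4 pods averaging 90% CPU against a 60% target
print(desired_replicas(4, current_metric=90, target_metric=60))
```

Because the result is clamped, a sustained spike can never push the Deployment past the maximum, which is what makes the max replica setting an effective cost control.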

### Cluster autoscaler

How it works:

* Monitors pods in Pending state (unable to schedule due to insufficient node capacity).
* Automatically adds EC2 worker nodes to Auto Scaling group.
* Removes underutilized nodes after 10 minutes of low usage.

Works with HPA:

1. HPA scales {m_runner} pods based on metrics.
2. If pods can't schedule (no node capacity), cluster autoscaler adds nodes.
3. {m_runner} pods schedule on new nodes.
4. When load decreases, HPA scales down pods, cluster autoscaler removes empty nodes.

Read the [cluster autoscaler setup documentation](https://docs.aws.amazon.com/eks/latest/userguide/autoscaling.html) for more details.

### Vertical scaling

Adjust CPU and memory limits per pod via Helm values:

* Useful when individual tasks require more resources than current pod limits.
* Requires pod restart to apply new resource limits.
* Consider workload characteristics (transformation vs ingestion).

***

## Cost optimization

### Cost optimization strategies

* **Right-size worker nodes:** Match instance type to workload (transformation-heavy = smaller, ingestion-heavy = larger).
* **Use Cluster Autoscaler:** Automatically remove unused nodes during low-usage periods.
* **Consider Savings Plans or Reserved Instances:** For predictable baseline capacity.
* **Monitor data transfer:** Ensure data warehouses in same region to avoid cross-region charges.
* **Evaluate Fargate:** For variable workloads, Fargate pricing may be more cost-effective (pay per pod vs per EC2 instance).

***

## Additional resources

### Implementation and deployment

For complete Terraform modules, Helm charts, and step-by-step implementation, see the following in the Matillion Deployment Library on GitHub:

* [AWS {m_runner} directory](https://github.com/matillion-public/deployment-library/tree/main/agent/aws).
* [EKS Terraform module](https://github.com/matillion-public/deployment-library/tree/main/agent/aws/eks).
* [Helm charts](https://github.com/matillion-public/deployment-library/tree/main/agent/helm).

You can find the Matillion Deployment Library at [github.com/matillion-public/deployment-library](https://github.com/matillion-public/deployment-library).

### General Kubernetes guide

You should read the general [Kubernetes deployment guide](/docs/guides/kubernetes-deployment-guide) for platform-agnostic concepts and architecture.

### Matillion documentation

* For deployment models, read [{m_runner} overview](/docs/guides/runner-overview).
* For {m_runner} registration, read [Create a {m_runner}](/docs/guides/create-a-runner).
* For capacity planning, read [Scaling best practices](/docs/guides/scaling-best-practices).

### AWS documentation

* For EKS concepts and operations, read [Amazon EKS user guide](https://docs.aws.amazon.com/eks/latest/userguide/).
* For IRSA setup, read [IAM Roles for Service Accounts](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html).
* For automatic node scaling, read [Cluster Autoscaler](https://docs.aws.amazon.com/eks/latest/userguide/autoscaling.html).