AWS Elastic Container Service (ECS)

Telemetry data can be collected from AWS ECS (both EC2 and Fargate) using the OpenTelemetry (OTel) Collector.

The OTel Collector needs to be deployed as a daemon service, so that one collector task runs on each container instance.

  1. Create an IAM role with the following policy and name it ECSOTELDaemonRole.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "VisualEditor0",
          "Effect": "Allow",
          "Action": ["ec2:DescribeInstances", "ecs:DescribeContainerInstances"],
          "Resource": "*"
        }
      ]
    }
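  2. Attach the following IAM permission to the task execution role used by the ECS task (the role set as executionRoleArn in step 4). ECS uses this role to resolve the secret, so it must be allowed to fetch the OTel Collector configuration from SSM Parameter Store.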

    [
      {
        "Effect": "Allow",
        "Action": ["ssm:GetParameter", "ssm:GetParameters"],
        "Resource": "arn:aws:ssm:<aws-region>:<aws-account-id>:parameter/otel-collector-config"
      }
    ]
  3. Create an SSM parameter for the OTel Collector configuration. In AWS Systems Manager (SSM) Parameter Store, create a parameter named otel-collector-config. Choose type String and data type text, then paste the following configuration into the value field, replacing <cubeapm_endpoint> with the address of your CubeAPM server.

    otel-collector-config.yaml
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318

      hostmetrics:
        collection_interval: 60s
        scrapers:
          cpu:
          disk:
          load:
          filesystem:
          memory:
          network:
        root_path: /hostfs

      awsecscontainermetrics:
        collection_interval: 60s

    processors:
      batch: {}

      resourcedetection:
        detectors:
          - ecs
          - system
        system:
          hostname_sources: ["os"]
        timeout: 2s

    exporters:
      debug:
        verbosity: detailed
        sampling_initial: 5
        sampling_thereafter: 1

      otlphttp/metrics:
        metrics_endpoint: http://<cubeapm_endpoint>:3130/api/metrics/v1/save/otlp
        retry_on_failure:
          enabled: false

      otlphttp/logs:
        logs_endpoint: http://<cubeapm_endpoint>:3130/api/logs/insert/opentelemetry/v1/logs
        headers:
          Cube-Stream-Fields: severity, host.name

      otlp/traces:
        # gRPC endpoints take host:port only, without a URL path
        endpoint: <cubeapm_endpoint>:4317
        tls:
          insecure: true

    service:
      pipelines:
        traces:
          exporters:
            # - debug
            - otlp/traces
          processors:
            - batch
            - resourcedetection
          receivers:
            - otlp

        metrics:
          exporters:
            # - debug
            - otlphttp/metrics
          processors:
            - batch
            - resourcedetection
          receivers:
            - otlp
            - hostmetrics
            # - awsecscontainermetrics

        logs:
          exporters:
            # - debug
            - otlphttp/logs
          processors:
            - batch
            - resourcedetection
          receivers:
            - otlp
  4. Create a task definition for the OTel Collector. Use the following configuration, replacing the values in < > to match your setup.

    otel-collector-daemonset.json
    {
      "family": "otel-collector-daemon",
      "containerDefinitions": [
        {
          "name": "otel-collector",
          "image": "otel/opentelemetry-collector-contrib:0.145.0",
          "cpu": 0,
          "portMappings": [
            {
              "containerPort": 4317,
              "hostPort": 4317,
              "protocol": "tcp"
            },
            {
              "containerPort": 4318,
              "hostPort": 4318,
              "protocol": "tcp"
            }
          ],
          "essential": true,
          "command": [
            "--config=env:OTEL_CONFIG"
          ],
          "environment": [],
          "mountPoints": [
            {
              "sourceVolume": "host-root",
              "containerPath": "/hostfs",
              "readOnly": true
            }
          ],
          "volumesFrom": [],
          "secrets": [
            {
              "name": "OTEL_CONFIG",
              "valueFrom": "otel-collector-config"
            }
          ],
          "logConfiguration": {
            "logDriver": "awslogs",
            "options": {
              "awslogs-group": "<logs-group>",
              "awslogs-region": "<aws-region>",
              "awslogs-stream-prefix": "daemon"
            }
          },
          "systemControls": []
        }
      ],
      "taskRoleArn": "ECSOTELDaemonRole",
      "executionRoleArn": "<ecsTaskExecutionRole>",
      "networkMode": "host",
      "volumes": [
        {
          "name": "host-root",
          "host": {
            "sourcePath": "/"
          }
        }
      ],
      "placementConstraints": [],
      "requiresCompatibilities": [
        "EC2"
      ],
      "cpu": "512",
      "memory": "1024",
      "runtimePlatform": {
        "cpuArchitecture": "<X86_64 or ARM64>",
        "operatingSystemFamily": "LINUX"
      }
    }
  5. Create a service from the above task definition and make it a daemon service by setting Scheduling strategy to Daemon under Deployment configuration.

  6. Update your applications' configuration to send logs, metrics and traces to the OTel Collector daemon service. Applications can reach the OTel Collector at the fixed IP 172.17.0.1 (the Docker bridge gateway IP) on port 4317 (OTLP gRPC) or 4318 (OTLP HTTP). This IP always points to the OTel Collector running on the same EC2 host as the application container.
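
As a rough illustration of the last step, the sketch below builds the standard OpenTelemetry SDK environment variables (OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_EXPORTER_OTLP_PROTOCOL, OTEL_SERVICE_NAME) that point an application at the collector on 172.17.0.1. The helper name `otlp_env` and the service name `checkout` are hypothetical, and the ports are the OTLP receiver ports from the collector configuration above. Whether an application honors these variables depends on the OTel SDK or agent it uses, so treat this as a starting point rather than a definitive setup.

```python
# Sketch: environment variables for an app container that should export
# OTLP data to the OTel Collector daemon on the same EC2 host.

COLLECTOR_HOST = "172.17.0.1"  # Docker bridge gateway IP, as described above

# OTLP receiver ports taken from the collector configuration
OTLP_PORTS = {"grpc": 4317, "http/protobuf": 4318}


def otlp_env(protocol: str = "grpc", service_name: str = "my-service") -> dict:
    """Return OTel SDK environment variables for the local collector.

    `otlp_env` is a hypothetical helper, not part of any OTel library.
    """
    if protocol not in OTLP_PORTS:
        raise ValueError(f"unsupported OTLP protocol: {protocol!r}")
    return {
        # The http:// scheme means no TLS between the app and the collector
        "OTEL_EXPORTER_OTLP_ENDPOINT": f"http://{COLLECTOR_HOST}:{OTLP_PORTS[protocol]}",
        "OTEL_EXPORTER_OTLP_PROTOCOL": protocol,
        "OTEL_SERVICE_NAME": service_name,
    }


if __name__ == "__main__":
    # Print KEY=VALUE pairs, e.g. to copy into the "environment" section
    # of the application's own task definition.
    for key, value in otlp_env("http/protobuf", "checkout").items():
        print(f"{key}={value}")
```

The resulting pairs can be added to the `environment` list of the application's container definition. The 172.17.0.1 address assumes the application container runs on Docker's default bridge network.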

Troubleshooting

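  1. Verify that the OTel Collector is running. The daemon service should show one running task on each container instance in the cluster.

  2. Check the OTel Collector logs in the CloudWatch log group set as awslogs-group in the task definition (stream prefix daemon). To see the telemetry the collector receives, uncomment the debug exporter in the relevant pipeline of the collector configuration.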
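
The two checks above can be partly scripted. The sketch below is a minimal example, not an official tool: `port_open` tests whether the collector's OTLP ports accept TCP connections, and `undefined_components` looks for a common startup error, a pipeline that references a receiver, processor, or exporter with no matching definition. Both function names are hypothetical. The config check expects the collector configuration already parsed into a Python dict; parsing YAML needs a third-party parser such as PyYAML, which is an assumption and not shown here.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False


def undefined_components(config: dict) -> list:
    """List pipeline references that lack a top-level definition.

    `config` is the collector configuration as a dict (hypothetical input;
    load it with the YAML parser of your choice).
    """
    missing = []
    pipelines = (config.get("service") or {}).get("pipelines") or {}
    for pipeline_name, pipeline in pipelines.items():
        for section in ("receivers", "processors", "exporters"):
            defined = config.get(section) or {}
            for component in pipeline.get(section) or []:
                if component not in defined:
                    missing.append(f"{pipeline_name}: {section}/{component}")
    return missing


if __name__ == "__main__":
    # Run from a container or host that should be able to reach the collector
    for port in (4317, 4318):
        state = "open" if port_open("172.17.0.1", port) else "closed"
        print(f"172.17.0.1:{port} is {state}")
```

An open port only shows that something is listening on it. The collector logs remain the authoritative place to confirm that data is being received and exported.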