ECS lets you scale each service independently by adding or removing “tasks”. Tasks run and communicate independently, and are hosted on a group of EC2 instances. Scaling a service’s tasks in and out, as well as scaling out the EC2 hosts, is straightforward with the help of CloudWatch metrics and alarms. Scaling in the EC2 hosts, however, requires a more complex decision.

ECS is a three-layer architecture:

Cluster: a set of applications running on a shared group of resources

Service: a logical instance (e.g. “application”)

Task: a physical instance of a service

Scaling tasks in and out is relatively simple: it is driven by a couple of metric thresholds, such as CPU load percentage, which determine whether new tasks should be launched or existing ones removed.
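The threshold rule for tasks can be sketched in a few lines. This is only an illustration: the function name, the 75%/25% thresholds, and the task limits are assumptions made for the example, not values taken from ECS.

```python
def desired_task_count(current: int, cpu_percent: float,
                       scale_out_above: float = 75.0,
                       scale_in_below: float = 25.0,
                       min_tasks: int = 1, max_tasks: int = 10) -> int:
    """Return the task count suggested by a single CPU threshold rule.

    All thresholds and limits are hypothetical defaults for illustration.
    """
    if cpu_percent > scale_out_above:
        # Load is high: add one task, but never exceed the upper limit.
        return min(current + 1, max_tasks)
    if cpu_percent < scale_in_below:
        # Load is low: remove one task, but never go below the lower limit.
        return max(current - 1, min_tasks)
    # Load is inside the band: leave the service as it is.
    return current
```

Keeping a gap between the two thresholds avoids adding and removing tasks repeatedly when the load hovers around a single value.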

Scaling ECS hosts out is much the same: for example, when CPU usage crosses a certain level, or when the cluster doesn’t have enough memory to hold the current load, new EC2 instances are launched.
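Scaling out hosts works because either condition alone is a good enough reason to add capacity. A minimal sketch of that "any metric triggers" rule follows; the parameter names and the 80% threshold are assumptions made for the example.

```python
def should_add_host(cpu_percent: float,
                    memory_registered_mib: int,
                    memory_reserved_mib: int,
                    pending_task_memory_mib: int,
                    cpu_threshold: float = 80.0) -> bool:
    """Decide whether the cluster needs another EC2 host.

    Hypothetical inputs: cluster-wide CPU usage, total and reserved
    memory, and the memory needed by tasks waiting to be placed.
    """
    free_memory_mib = memory_registered_mib - memory_reserved_mib
    cpu_too_high = cpu_percent > cpu_threshold
    memory_too_low = pending_task_memory_mib > free_memory_mib
    # Either condition on its own justifies scaling out.
    return cpu_too_high or memory_too_low
```

Adding a host when only one metric is stressed costs a little extra at worst, which is why a simple OR is safe in this direction.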

Removing resources is about cost reduction: these are resources that are being paid for, so removing them once they are no longer required improves utilization, which translates directly into lower costs.

However, scaling in hosts is more challenging. Applying the usual method of reacting to a single metric crossing a threshold might remove resources that are still required according to another metric; for example, removing hosts because CPU usage is low while the level of memory reservation cannot “afford” to lose them.

Since AWS doesn’t provide multiple-metric scaling, and since such scaling requires a few more calculations before deciding to remove resources, an external tool is required for the task.
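The kind of calculation such a tool has to make can be sketched as follows. This is a simplified illustration, not the actual tool: the ClusterState type, its field names, the 80% ceiling, and the assumption that all hosts are the same size are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    """Hypothetical snapshot of a cluster with identically sized hosts."""
    hosts: int
    cpu_per_host: int   # CPU units registered by each host
    mem_per_host: int   # memory in MiB registered by each host
    cpu_reserved: int   # CPU units reserved by running tasks
    mem_reserved: int   # memory in MiB reserved by running tasks


def can_remove_host(state: ClusterState,
                    max_reservation: float = 0.8,
                    min_hosts: int = 1) -> bool:
    """Allow scale-in only if every metric stays under the ceiling afterwards."""
    remaining = state.hosts - 1
    if remaining < max(min_hosts, 1):
        return False  # never drop below the minimum (and avoid dividing by zero)
    cpu_after = state.cpu_reserved / (remaining * state.cpu_per_host)
    mem_after = state.mem_reserved / (remaining * state.mem_per_host)
    # Unlike scale-out, ALL metrics must agree before a host is removed.
    return cpu_after <= max_reservation and mem_after <= max_reservation


# CPU reservation is low, but memory cannot "afford" to lose a host.
busy_memory = ClusterState(hosts=4, cpu_per_host=1024, mem_per_host=4096,
                           cpu_reserved=1024, mem_reserved=12000)
# Both metrics are low enough to give up one host.
idle = ClusterState(hosts=4, cpu_per_host=1024, mem_per_host=4096,
                    cpu_reserved=1024, mem_reserved=6000)
```

Projecting the reservation onto the remaining hosts, rather than checking the current levels, is what keeps a low CPU reading from removing a host that memory still needs.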