Prometheus and Grafana on EC2 with Terraform and Ansible
In this article, I'll show you how to deploy an EC2 virtual machine and set up Prometheus and Grafana servers on it. We'll achieve this using Infrastructure as Code with Terraform and automate the configuration management with Ansible.
If you're curious about Infrastructure as Code and its benefits, I've also written another article on that topic.
You can access the full code for this article in this GitHub repository.
Networking
Our first step will be to set up a Virtual Private Cloud (VPC) and a Subnet. To ensure the subnet is public, we'll create an Internet Gateway and the necessary Route Table to facilitate traffic routing to the Internet:
# network.tf
resource "aws_vpc" "monitoring" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "monitoring_a" {
  vpc_id     = aws_vpc.monitoring.id
  cidr_block = cidrsubnet(aws_vpc.monitoring.cidr_block, 8, 0)
}

resource "aws_internet_gateway" "monitoring" {
  vpc_id = aws_vpc.monitoring.id
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.monitoring.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.monitoring.id
  }
}
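If you're wondering what CIDR block the subnet ends up with, the cidrsubnet function extends the VPC's /16 prefix by 8 bits and takes network number 0. You can check this yourself with terraform console:

```
$ terraform console
> cidrsubnet("10.0.0.0/16", 8, 0)
"10.0.0.0/24"
```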
resource "aws_route_table_association" "monitoring-a" {
  subnet_id      = aws_subnet.monitoring_a.id
  route_table_id = aws_route_table.public.id
}
Also, we'll need to open ports to allow access to Prometheus and Grafana. The default port for Prometheus is 9090, while Grafana uses port 3000. We'll open port 22 to enable SSH access to the instance, too.
To achieve this, we'll create a security group with the rules needed to permit incoming traffic on these ports. We also need to declare the outbound rules explicitly: although AWS security groups allow all outbound traffic by default, Terraform's aws_security_group resource removes that default rule, so any egress we want must appear in the configuration:
# network.tf
resource "aws_security_group" "monitoring" {
  name   = "monitoring-server"
  vpc_id = aws_vpc.monitoring.id

  ingress {
    description      = "Allow SSH"
    from_port        = 22
    to_port          = 22
    protocol         = "tcp"
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }

  ingress {
    description      = "Allow traffic to Prometheus server (port 9090 by default)"
    from_port        = 9090
    to_port          = 9090
    protocol         = "tcp"
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }

  ingress {
    description      = "Allow traffic to Grafana (port 3000 by default)"
    from_port        = 3000
    to_port          = 3000
    protocol         = "tcp"
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }

  egress {
    description      = "Allow all outbound traffic"
    from_port        = 0
    to_port          = 0
    protocol         = "-1"
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }
}
Now that we have all the networking elements in place, we can move forward with the creation of our virtual machine.
EC2 instance
To establish an SSH connection to the instance, we need to create a key pair. This can be done using the following code, which allows us to import an existing public key from our local machine:
resource "aws_key_pair" "monitoring" {
  key_name   = "monitoring_server_keypair"
  public_key = file("~/.ssh/public_key.pub")
}
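If you don't already have a key pair on your machine, you can generate one with ssh-keygen. A minimal sketch (the file name monitoring_key is an arbitrary choice; adjust the path in the file() call above to match whatever name you pick):

```shell
# Create ~/.ssh if it doesn't exist, then generate a 4096-bit RSA
# key pair with no passphrase (-N "") into:
#   ~/.ssh/monitoring_key      (private key)
#   ~/.ssh/monitoring_key.pub  (public key, the one Terraform imports)
mkdir -p ~/.ssh
ssh-keygen -t rsa -b 4096 -N "" -f ~/.ssh/monitoring_key
```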
Now, we will create the instance and link it to the elements we previously created: subnet, security group, and key.
resource "aws_instance" "monitoring" {
  instance_type               = var.instance_type
  ami                         = var.ami_id
  subnet_id                   = aws_subnet.monitoring_a.id
  vpc_security_group_ids      = [aws_security_group.monitoring.id]
  key_name                    = aws_key_pair.monitoring.key_name
  associate_public_ip_address = true
}
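The instance block references variables that need to be declared somewhere in the configuration. A possible variables.tf sketch (the default instance type is just an example; the AMI ID has no default because it depends on your region):

```hcl
# variables.tf (a sketch; names match the references used in this article)
variable "instance_type" {
  description = "EC2 instance type for the monitoring server"
  type        = string
  default     = "t3.micro"
}

variable "ami_id" {
  description = "AMI ID for the instance (e.g. an Amazon Linux image in your region)"
  type        = string
}

variable "instance_ssh_priv_key" {
  description = "Path to the private key Ansible will use to reach the instance"
  type        = string
}
```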
Run Prometheus and Grafana
With our EC2 instance in place, our goal is to run Prometheus and Grafana on it. Instead of manually accessing the machine via SSH to execute these tasks, we will streamline the process by creating a provisioner block that calls our Ansible playbook. This playbook will be responsible for setting up and running Prometheus and Grafana binaries efficiently.
First, we will create our Ansible playbook. This is the fragment where Grafana is installed and started:
- rpm_key:
    key: https://rpm.grafana.com/gpg.key
    state: present
  become: true
- yum:
    name: "https://dl.grafana.com/oss/release/grafana-{{ grafana_version }}.x86_64.rpm"
    state: present
  become: true
- systemd:
    name: grafana-server
    state: started
  become: true
For simplicity, I won't show the full playbook here, but you can find the complete code in the GitHub repository.
We want the playbook to be executed when our EC2 instance boots up. We can achieve this using a Terraform provisioner, adding the code below to our aws_instance.monitoring Terraform resource:
provisioner "local-exec" {
  command = "ANSIBLE_HOST_KEY_CHECKING=false ansible-playbook -u ec2-user -i '${self.public_ip},' --private-key ${var.instance_ssh_priv_key} ../ansible/playbook.yml"
}
- ec2-user is the default user for Amazon Linux machines.
- self.public_ip refers to the EC2 instance's public IP address.
- --private-key ${var.instance_ssh_priv_key}: this tells Ansible which private key to use when connecting to the instance. Note that we introduce a variable for the private key's location.
Now we are ready to create our infrastructure with Terraform (terraform plan and terraform apply). Once the EC2 instance has been created, the Ansible playbook will run automatically, bringing the Prometheus and Grafana servers online.