Consul - Service Discovery
Introduction
What is Consul
Consul is a distributed, highly available system whose components provide several key features: service discovery, health checking, a key/value store, and multi-datacenter support.
Let's go through the multi-node Consul architecture to understand the Consul workflow.
The architecture diagram (not reproduced here) shows two datacenters.
Let's consider a single datacenter, Datacenter-1. The Consul cluster within a datacenter consists of Consul agents (one agent deployed per node).
Agent: An agent is a daemon that runs on every member of the Consul cluster. It is started by running consul agent. The agent can run in either client or server mode.
Client: A client is an agent that forwards all RPCs to a server. The client is relatively stateless. The only background activity a client performs is taking part in the LAN gossip pool. This has a minimal resource overhead and consumes only a small amount of network bandwidth.
Server: A server is an agent with an expanded set of responsibilities including participating in the Raft quorum, maintaining cluster state, responding to RPC queries, exchanging WAN gossip with other datacenters, and forwarding queries to leaders or remote datacenters.
The Consul servers are where data is stored and replicated. The servers themselves elect a leader. While Consul can function with one server, 3 to 5 is recommended to avoid failure scenarios leading to data loss.
Datacenter: While the definition of a datacenter seems obvious, there are subtle details that must be considered. For example, in EC2, are multiple availability zones considered to comprise a single datacenter? We define a datacenter to be a networking environment that is private, low latency, and high bandwidth. This excludes communication that would traverse the public internet, but for our purposes multiple availability zones within a single EC2 region would be considered part of a single datacenter.
Consensus: Implies agreement upon the elected leader as well as consistency of a replicated state machine, i.e. ordering of transactions.
Gossip: Consul builds on Serf, which provides a full gossip protocol: membership, failure detection, and event broadcast. It is enough to know that gossip involves random node-to-node communication, primarily over UDP.
LAN Gossip: Refers to the LAN gossip pool which contains nodes (clients and servers) that are all located on the same local area network or datacenter.
WAN Gossip: Refers to the WAN gossip pool which contains only servers. These servers are primarily located in different datacenters and typically communicate over the internet or wide area network.
RPC: Remote Procedure Call. This is a request / response mechanism allowing a client to make a request of a server.
Components of your infrastructure that need to discover other services or nodes can query any of the Consul servers or any of the Consul clients. The clients forward queries to the servers automatically. Running an agent is not required for discovering other services or getting/setting key/value data.
Each datacenter runs a cluster of Consul servers. When a cross-datacenter service discovery or configuration request is made, the local Consul servers forward the request to the remote datacenter server and return the result.
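As a sketch of what such a cross-datacenter lookup looks like through the catalog HTTP API: the endpoint is the same, and only the optional "dc" query parameter changes. Here "dc-two" is a hypothetical remote datacenter, and the live curl calls (which need a running agent on localhost:8500) are shown commented out.

```shell
# The catalog endpoint serves both local and remote lookups; only the
# "dc" query parameter differs. "dc-two" is a hypothetical second datacenter.
local_url="http://localhost:8500/v1/catalog/service/web-service"
remote_url="${local_url}?dc=dc-two"
# curl "$local_url"    # answered by the local datacenter's servers
# curl "$remote_url"   # forwarded to the dc-two servers and relayed back
echo "$remote_url"
```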
Getting Started
Consul Cluster
Let's create a Consul cluster with four nodes - three Consul servers and one Consul client:
consul-server-1 : 192.168.0.152
consul-server-2 : 192.168.0.153
consul-server-3 : 192.168.0.157
consul-client-1 : 192.168.0.154
Install Consul
Use the following steps to install Consul on all four nodes.
Create the installation directory : sudo mkdir /usr/local/bin/consul-v0.5.1
Go to the directory : cd /usr/local/bin/consul-v0.5.1
Download Consul : sudo wget https://dl.bintray.com/mitchellh/consul/0.5.1_linux_amd64.zip
Install unzip : sudo apt-get update ; sudo apt-get install unzip
Unzip Consul : sudo unzip 0.5.1_linux_amd64.zip
Add consul to PATH : export PATH=$PATH:/usr/local/bin/consul-v0.5.1
Add PATH to bashrc : echo 'export PATH=$PATH:/usr/local/bin/consul-v0.5.1' | sudo tee -a /etc/bash.bashrc
Install autossh : sudo apt-get install autossh
Set up autossh to keep the remote NAT SSH connection from breaking : autossh -M 20000 -f -N -C -R 1234:localhost:22 192.168.0.152
# consul
usage: consul [--version] [--help] <command> [<args>]
Available commands are:
agent Runs a Consul agent
configtest Validate config file
event Fire a new event
exec Executes a command on Consul nodes
force-leave Forces a member of the cluster to enter the "left" state
info Provides debugging information for operators
join Tell Consul agent to join cluster
keygen Generates a new encryption key
keyring Manages gossip layer encryption keys
leave Gracefully leaves the Consul cluster and shuts down
lock Execute a command holding a lock
maint Controls node or service maintenance mode
members Lists the members of a Consul cluster
monitor Stream logs from a Consul agent
reload Triggers the agent to reload configuration files
version Prints the Consul version
watch Watch for changes in Consul
If you get an error that consul could not be found, your PATH environment variable was not set up properly. Please go back and ensure that your PATH variable contains the directory where Consul was installed.
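A quick way to check the PATH setup is to ask the shell where it resolves consul from:

```shell
# If "consul: command not found", check whether the installation
# directory actually made it onto PATH.
consul_path=$(command -v consul || echo "not found")
echo "consul resolves to: $consul_path"
echo "PATH is: $PATH"
```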
Define the Service over Consul Server Nodes
A service can be registered either by providing a service definition or by making the appropriate calls to the HTTP API. Here we will use a service definition.
First, create a directory for Consul configuration.
Second, create the service definition configuration files. Let's pretend we have services named "apache-service" and "web-service" running on ports 80 and 81. Additionally, we'll give each a tag we can use as an additional way to query it:
Execute over 3 consul servers - consul-server-1, consul-server-2 and consul-server-3.
$ sudo mkdir /etc/consul.d
$ sudo echo '{"service": {"name": "apache-service", "tags": ["rails"], "port": 80}}' | sudo tee -a /etc/consul.d/apache-service.json
$ sudo echo '{"service": {"name": "web-service", "tags": ["rails"], "port": 81}}' | sudo tee -a /etc/consul.d/web-service.json
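A malformed definition will keep the agent from loading the file, so it can help to sanity-check the JSON first. A minimal sketch: python3 -m json.tool exits non-zero on invalid JSON, and consul configtest (listed in the command overview above) validates a whole config directory against a live installation.

```shell
# Validate one of the service definitions before Consul reads it.
definition='{"service": {"name": "apache-service", "tags": ["rails"], "port": 80}}'
echo "$definition" | python3 -m json.tool > /dev/null && echo "valid JSON"
# To validate everything under the config directory with Consul itself:
# consul configtest -config-dir /etc/consul.d
```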
Define the Health Checks over Consul Server Nodes
Similar to a service, a check can be registered either by providing a check definition or by making the appropriate calls to the HTTP API.
Execute over 3 consul servers - consul-server-1, consul-server-2 and consul-server-3.
$ sudo echo '{"check": {"name": "ping-check", "script": "ping -c1 google.com >/dev/null", "interval": "30s"}}' | sudo tee -a /etc/consul.d/ping-check.json
$ sudo echo '{"service": {"name": "apache-check", "tags": ["rails"], "port": 80, "check": {"script": "curl localhost:80 >/dev/null 2>&1", "interval": "10s"}}}' | sudo tee -a /etc/consul.d/apache-check.json
$ sudo echo '{"service": {"name": "web-check", "tags": ["rails"], "port": 81, "check": {"script": "curl localhost:81 >/dev/null 2>&1", "interval": "10s"}}}' | sudo tee -a /etc/consul.d/web-check.json
Start the Consul Nodes
Start the three Consul server nodes.
Create /tmp/consul on all three Consul servers:
$ sudo mkdir /tmp/consul
$ consul agent -server -bootstrap-expect=3 -data-dir /tmp/consul -config-dir /etc/consul.d -dc dc-one -node=consul-server-1 --bind=192.168.0.152
$ consul agent -server -bootstrap-expect=3 -data-dir /tmp/consul -config-dir /etc/consul.d -dc dc-one -node=consul-server-2 --bind=192.168.0.153
$ consul agent -server -bootstrap-expect=3 -data-dir /tmp/consul -config-dir /etc/consul.d -dc dc-one -node=consul-server-3 --bind=192.168.0.157
Each server starts with logs like the following:
user@consul-server-1:~$ consul agent -server -bootstrap-expect=3 -data-dir /tmp/consul -config-dir /etc/consul.d -dc dc-one -node=consul-server-1 --bind=192.168.0.152
==> WARNING: Expect Mode enabled, expecting 3 servers
==> WARNING: It is highly recommended to set GOMAXPROCS higher than 1
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Consul agent running!
Node name: 'consul-server-1'
Datacenter: 'dc-one'
Server: true (bootstrap: false)
Client Addr: 127.0.0.1 (HTTP: 8500, HTTPS: -1, DNS: 8600, RPC: 8400)
Cluster Addr: 192.168.0.152 (LAN: 8301, WAN: 8302)
Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
Atlas: <disabled>
==> Log data will now stream in as it occurs:
2015/08/15 05:41:12 [INFO] raft: Node at 192.168.0.152:8300 [Follower] entering Follower state
2015/08/15 05:41:12 [INFO] serf: EventMemberJoin: consul-server-1 192.168.0.152
2015/08/15 05:41:12 [INFO] serf: EventMemberJoin: consul-server-1.dc-one 192.168.0.152
2015/08/15 05:41:12 [INFO] consul: adding server consul-server-1 (Addr: 192.168.0.152:8300) (DC: dc-one)
2015/08/15 05:41:12 [INFO] consul: adding server consul-server-1.dc-one (Addr: 192.168.0.152:8300) (DC: dc-one)
2015/08/15 05:41:12 [ERR] agent: failed to sync remote state: No cluster leader
2015/08/15 05:41:13 [WARN] raft: EnableSingleNode disabled, and no known peers. Aborting election.
2015/08/15 05:41:19 [WARN] agent: Check 'service:web-check' is now critical
==> Newer Consul version available: 0.5.2
user@consul-server-2:~$ consul agent -server -bootstrap-expect=3 -data-dir /tmp/consul -config-dir /etc/consul.d -dc dc-one -node=consul-server-2 --bind=192.168.0.153
==> WARNING: Expect Mode enabled, expecting 3 servers
==> WARNING: It is highly recommended to set GOMAXPROCS higher than 1
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Consul agent running!
Node name: 'consul-server-2'
Datacenter: 'dc-one'
Server: true (bootstrap: false)
Client Addr: 127.0.0.1 (HTTP: 8500, HTTPS: -1, DNS: 8600, RPC: 8400)
Cluster Addr: 192.168.0.153 (LAN: 8301, WAN: 8302)
Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
Atlas: <disabled>
==> Log data will now stream in as it occurs:
2015/08/15 05:42:20 [INFO] serf: EventMemberJoin: consul-server-2 192.168.0.153
2015/08/15 05:42:20 [INFO] serf: EventMemberJoin: consul-server-2.dc-one 192.168.0.153
2015/08/15 05:42:20 [INFO] raft: Node at 192.168.0.153:8300 [Follower] entering Follower state
2015/08/15 05:42:20 [INFO] consul: adding server consul-server-2 (Addr: 192.168.0.153:8300) (DC: dc-one)
2015/08/15 05:42:20 [INFO] consul: adding server consul-server-2.dc-one (Addr: 192.168.0.153:8300) (DC: dc-one)
2015/08/15 05:42:20 [ERR] agent: failed to sync remote state: No cluster leader
2015/08/15 05:42:21 [WARN] agent: Check 'service:web-check' is now critical
2015/08/15 05:42:22 [WARN] raft: EnableSingleNode disabled, and no known peers. Aborting election.
user@consul-server-3:~$ consul agent -server -bootstrap-expect=3 -data-dir /tmp/consul -config-dir /etc/consul.d -dc dc-one -node=consul-server-3 --bind=192.168.0.157
==> WARNING: Expect Mode enabled, expecting 3 servers
==> WARNING: It is highly recommended to set GOMAXPROCS higher than 1
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Consul agent running!
Node name: 'consul-server-3'
Datacenter: 'dc-one'
Server: true (bootstrap: false)
Client Addr: 127.0.0.1 (HTTP: 8500, HTTPS: -1, DNS: 8600, RPC: 8400)
Cluster Addr: 192.168.0.157 (LAN: 8301, WAN: 8302)
Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
Atlas: <disabled>
==> Log data will now stream in as it occurs:
2015/08/15 05:46:02 [INFO] raft: Node at 192.168.0.157:8300 [Follower] entering Follower state
2015/08/15 05:46:02 [INFO] serf: EventMemberJoin: consul-server-3 192.168.0.157
2015/08/15 05:46:02 [INFO] serf: EventMemberJoin: consul-server-3.dc-one 192.168.0.157
2015/08/15 05:46:02 [INFO] consul: adding server consul-server-3 (Addr: 192.168.0.157:8300) (DC: dc-one)
2015/08/15 05:46:02 [INFO] consul: adding server consul-server-3.dc-one (Addr: 192.168.0.157:8300) (DC: dc-one)
2015/08/15 05:46:02 [ERR] agent: failed to sync remote state: No cluster leader
2015/08/15 05:46:04 [WARN] raft: EnableSingleNode disabled, and no known peers. Aborting election.
2015/08/15 05:46:08 [WARN] agent: Check 'service:web-check' is now critical
Now join consul-server-2 to the cluster:
user@consul-server-1:/$ consul join 192.168.0.153
Successfully joined cluster by contacting 1 nodes.
Logs from consul-server-1
2015/08/15 05:46:47 [INFO] agent.rpc: Accepted client: 127.0.0.1:36523
2015/08/15 05:46:47 [INFO] agent: (LAN) joining: [192.168.0.153]
2015/08/15 05:46:47 [INFO] serf: EventMemberJoin: consul-server-2 192.168.0.153
2015/08/15 05:46:47 [INFO] agent: (LAN) joined: 1 Err: <nil>
2015/08/15 05:46:47 [INFO] consul: adding server consul-server-2 (Addr: 192.168.0.153:8300) (DC: dc-one)
Logs from consul-server-2
2015/08/15 05:46:56 [INFO] serf: EventMemberJoin: consul-server-1 192.168.0.152
2015/08/15 05:46:56 [INFO] consul: adding server consul-server-1 (Addr: 192.168.0.152:8300) (DC: dc-one)
user@consul-server-1:/$ consul join 192.168.0.157
Logs from consul-server-1
2015/08/15 05:47:51 [INFO] agent.rpc: Accepted client: 127.0.0.1:36541
2015/08/15 05:47:51 [INFO] agent: (LAN) joining: [192.168.0.157]
2015/08/15 05:47:51 [INFO] serf: EventMemberJoin: consul-server-3 192.168.0.157
2015/08/15 05:47:51 [INFO] agent: (LAN) joined: 1 Err: <nil>
2015/08/15 05:47:51 [INFO] consul: adding server consul-server-3 (Addr: 192.168.0.157:8300) (DC: dc-one)
2015/08/15 05:47:51 [INFO] consul: Attempting bootstrap with nodes: [192.168.0.157:8300 192.168.0.152:8300 192.168.0.153:8300]
2015/08/15 05:47:52 [WARN] raft: Heartbeat timeout reached, starting election
2015/08/15 05:47:52 [INFO] raft: Node at 192.168.0.152:8300 [Candidate] entering Candidate state
2015/08/15 05:47:52 [INFO] raft: Election won. Tally: 2
2015/08/15 05:47:52 [INFO] raft: Node at 192.168.0.152:8300 [Leader] entering Leader state
2015/08/15 05:47:52 [INFO] consul: cluster leadership acquired
2015/08/15 05:47:52 [INFO] consul: New leader elected: consul-server-1
2015/08/15 05:47:52 [INFO] raft: pipelining replication to peer 192.168.0.153:8300
2015/08/15 05:47:52 [INFO] consul: member 'consul-server-1' joined, marking health alive
2015/08/15 05:47:52 [INFO] raft: pipelining replication to peer 192.168.0.157:8300
2015/08/15 05:47:52 [INFO] consul: member 'consul-server-2' joined, marking health alive
2015/08/15 05:47:52 [INFO] consul: member 'consul-server-3' joined, marking health alive
2015/08/15 05:47:53 [INFO] agent: Synced service 'web-service'
2015/08/15 05:47:53 [INFO] agent: Synced service 'consul'
2015/08/15 05:47:53 [INFO] agent: Synced service 'apache-check'
2015/08/15 05:47:53 [INFO] agent: Synced service 'apache-service'
2015/08/15 05:47:53 [INFO] agent: Synced service 'web-check'
2015/08/15 05:47:53 [INFO] agent: Synced check 'ping-check'
2015/08/15 05:47:59 [WARN] agent: Check 'service:web-check' is now critical
Logs from consul-server-2
2015/08/15 05:48:01 [INFO] serf: EventMemberJoin: consul-server-3 192.168.0.157
2015/08/15 05:48:01 [INFO] consul: adding server consul-server-3 (Addr: 192.168.0.157:8300) (DC: dc-one)
2015/08/15 05:48:01 [INFO] consul: Attempting bootstrap with nodes: [192.168.0.153:8300 192.168.0.152:8300 192.168.0.157:8300]
2015/08/15 05:48:02 [WARN] agent: Check 'service:web-check' is now critical
2015/08/15 05:48:02 [INFO] consul: New leader elected: consul-server-1
2015/08/15 05:48:04 [INFO] agent: Synced service 'apache-service'
2015/08/15 05:48:04 [INFO] agent: Synced service 'web-check'
2015/08/15 05:48:04 [INFO] agent: Synced service 'web-service'
2015/08/15 05:48:04 [INFO] agent: Synced service 'consul'
2015/08/15 05:48:04 [INFO] agent: Synced service 'apache-check'
2015/08/15 05:48:04 [INFO] agent: Synced check 'ping-check'
2015/08/15 05:48:12 [WARN] agent: Check 'service:web-check' is now critical
Logs from consul-server-3
2015/08/15 05:47:51 [INFO] serf: EventMemberJoin: consul-server-2 192.168.0.153
2015/08/15 05:47:51 [INFO] serf: EventMemberJoin: consul-server-1 192.168.0.152
2015/08/15 05:47:51 [INFO] consul: adding server consul-server-2 (Addr: 192.168.0.153:8300) (DC: dc-one)
2015/08/15 05:47:51 [INFO] consul: Attempting bootstrap with nodes: [192.168.0.152:8300 192.168.0.157:8300 192.168.0.153:8300]
2015/08/15 05:47:51 [INFO] consul: adding server consul-server-1 (Addr: 192.168.0.152:8300) (DC: dc-one)
2015/08/15 05:47:52 [INFO] consul: New leader elected: consul-server-1
2015/08/15 05:47:54 [INFO] agent: Synced service 'consul'
2015/08/15 05:47:54 [INFO] agent: Synced service 'apache-check'
2015/08/15 05:47:54 [INFO] agent: Synced service 'apache-service'
2015/08/15 05:47:55 [INFO] agent: Synced service 'web-check'
2015/08/15 05:47:55 [INFO] agent: Synced service 'web-service'
2015/08/15 05:47:55 [INFO] agent: Synced check 'ping-check'
2015/08/15 05:47:58 [WARN] agent: Check 'service:web-check' is now critical
Check the Consul Info
# This gives the state of the node (server or client), the leader, services, health checks, and the number of peers
user@consul-server-1:~$ consul info | grep leader
leader = true
user@consul-server-1:~$
user@consul-server-1:~$
user@consul-server-1:~$ consul info
WARNING: It is highly recommended to set GOMAXPROCS higher than 1
agent:
check_monitors = 3
check_ttls = 0
checks = 3
services = 5
build:
prerelease =
revision = dc6795a5
version = 0.5.1
consul:
bootstrap = false
known_datacenters = 1
leader = true
server = true
raft:
applied_index = 23
commit_index = 23
fsm_pending = 0
last_contact = never
last_log_index = 23
last_log_term = 1
last_snapshot_index = 0
last_snapshot_term = 0
num_peers = 2
state = Leader
term = 1
runtime:
arch = amd64
cpu_count = 1
goroutines = 82
max_procs = 1
os = linux
version = go1.4.2
serf_lan:
encrypted = false
event_queue = 0
event_time = 2
failed = 0
intent_queue = 0
left = 0
member_time = 3
members = 3
query_queue = 0
query_time = 1
serf_wan:
encrypted = false
event_queue = 0
event_time = 1
failed = 0
intent_queue = 0
left = 0
member_time = 1
members = 1
query_queue = 0
query_time = 1
Start the consul client node
The Consul client is started without the -server, -bootstrap-expect, and -config-dir options.
Create /tmp/consul on the Consul client:
$ sudo mkdir /tmp/consul
consul agent -data-dir /tmp/consul -dc dc-one -node=consul-client-1 --bind=192.168.0.154
user@consul-client-1:~$ consul agent -data-dir /tmp/consul -dc dc-one -node=consul-client-1 --bind=192.168.0.154
==> WARNING: It is highly recommended to set GOMAXPROCS higher than 1
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Consul agent running!
Node name: 'consul-client-1'
Datacenter: 'dc-one'
Server: false (bootstrap: false)
Client Addr: 127.0.0.1 (HTTP: 8500, HTTPS: -1, DNS: 8600, RPC: 8400)
Cluster Addr: 192.168.0.154 (LAN: 8301, WAN: 8302)
Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
Atlas: <disabled>
==> Log data will now stream in as it occurs:
2015/08/15 05:50:16 [INFO] serf: EventMemberJoin: consul-client-1 192.168.0.154
2015/08/15 05:50:17 [ERR] agent: failed to sync remote state: No known Consul servers
Now Join the Client to the Cluster
user@consul-server-1:~$ consul join 192.168.0.154
Successfully joined cluster by contacting 1 nodes.
Logs from consul-server-1
2015/08/15 05:51:00 [INFO] agent.rpc: Accepted client: 127.0.0.1:36590
2015/08/15 05:51:00 [INFO] agent: (LAN) joining: [192.168.0.154]
2015/08/15 05:51:00 [INFO] serf: EventMemberJoin: consul-client-1 192.168.0.154
2015/08/15 05:51:00 [INFO] agent: (LAN) joined: 1 Err: <nil>
2015/08/15 05:51:00 [INFO] consul: member 'consul-client-1' joined, marking health alive
Logs from consul-server-2
2015/08/15 05:51:10 [INFO] serf: EventMemberJoin: consul-client-1 192.168.0.154
Logs from consul-server-3
2015/08/15 05:50:59 [INFO] serf: EventMemberJoin: consul-client-1 192.168.0.154
Logs from consul-client-1
2015/08/15 05:51:03 [INFO] serf: EventMemberJoin: consul-server-1 192.168.0.152
2015/08/15 05:51:03 [INFO] serf: EventMemberJoin: consul-server-2 192.168.0.153
2015/08/15 05:51:03 [INFO] serf: EventMemberJoin: consul-server-3 192.168.0.157
2015/08/15 05:51:03 [INFO] consul: adding server consul-server-1 (Addr: 192.168.0.152:8300) (DC: dc-one)
2015/08/15 05:51:03 [INFO] consul: adding server consul-server-2 (Addr: 192.168.0.153:8300) (DC: dc-one)
2015/08/15 05:51:03 [INFO] consul: adding server consul-server-3 (Addr: 192.168.0.157:8300) (DC: dc-one)
2015/08/15 05:51:03 [INFO] consul: New leader elected: consul-server-1
Check the version and consul member list
user@consul-client-1:~$ consul --version
Consul v0.5.1-1-gdc6795a
Consul Protocol: 2 (Understands back to: 1)
user@consul-client-1:~$ consul members
Node Address Status Type Build Protocol
consul-server-1 192.168.0.152:8301 alive server 0.5.1 2
consul-server-2 192.168.0.153:8301 alive server 0.5.1 2
consul-server-3 192.168.0.157:8301 alive server 0.5.1 2
consul-client-1 192.168.0.154:8301 alive client 0.5.1 2
Leaving a Cluster
To leave the cluster, you can either gracefully quit an agent (using Ctrl-C) or force-kill one of the agents. Gracefully leaving allows the node to transition into the left state; otherwise, other nodes will detect it as having failed.
$ consul leave
Consul Features
Logging
On execution, the Consul agent generates a variety of logs.
By default the logs go to stdout, and the default logging level is "info".
Logging Level
The level of logging to show after the Consul agent has started. This defaults to "info".
The available log levels are "trace", "debug", "info", "warn", and "err".
Consul Monitor
Note that you can always connect to an agent via consul monitor and use any log level. Also, the log level can be changed during a config reload.
The logs are written to stdout, and there is no option to specify a file name.
The logs can, however, be redirected to a file.
# Consul monitor to capture logs (err) from consul-agent
user@consul-client-1:~$ consul monitor -log-level=err
2015/09/06 06:14:27 [ERR] agent: failed to sync remote state: No known Consul servers
# By default log-level is info
user@consul-client-1:~$ consul monitor
2015/09/06 06:14:27 [INFO] serf: EventMemberJoin: consul-client-1 192.168.0.154
2015/09/06 06:14:27 [ERR] agent: failed to sync remote state: No known Consul servers
2015/09/06 06:14:34 [INFO] agent.rpc: Accepted client: 127.0.0.1:34992
# Consul monitor logs can be redirected to a file
user@consul-client-1:~$ consul monitor -log-level=err > log-consul
user@consul-client-1:~$ cat log-consul
2015/09/06 06:14:27 [ERR] agent: failed to sync remote state: No known Consul servers
Syslog Logging
Logging to syslog from the Consul agent can be achieved in two ways.
Syslog logging on consul start
The Consul agent can enable logging to syslog at start time.
The '-syslog' flag enables logging to syslog. This is only supported on Linux and OSX. It will result in an error if provided on Windows.
The level of logs going to syslog is as specified by the -log-level flag.
The logs, as specified by -log-level or the default log level, continue to go to stdout.
# Start the consul agent with -syslog flag
user@consul-client-1:~$ consul agent -data-dir /tmp/consul -dc dc-one -node=consul-client-1 --bind=192.168.0.154 -syslog
==> WARNING: It is highly recommended to set GOMAXPROCS higher than 1
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Consul agent running!
Node name: 'consul-client-1'
Datacenter: 'dc-one'
Server: false (bootstrap: false)
Client Addr: 127.0.0.1 (HTTP: 8500, HTTPS: -1, DNS: 8600, RPC: 8400)
Cluster Addr: 192.168.0.154 (LAN: 8301, WAN: 8302)
Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
Atlas: <disabled>
==> Log data will now stream in as it occurs:
2015/09/06 06:11:20 [INFO] serf: EventMemberJoin: consul-client-1 192.168.0.154
2015/09/06 06:11:20 [ERR] agent: failed to sync remote state: No known Consul servers
# Syslog: note that both info and err logs go to syslog, as the default log-level is 'info' when starting the consul agent
user@consul-client-1:~$ tail -f /var/log/syslog
Sep 6 06:11:20 consul-client-1 consul[20028]: serf: EventMemberJoin: consul-client-1 192.168.0.154
Sep 6 06:11:20 consul-client-1 consul[20028]: agent: failed to sync remote state: No known Consul servers
# Start consul with log-level 'err'
user@consul-client-1:~$ consul agent -data-dir /tmp/consul -dc dc-one -node=consul-client-1 --bind=192.168.0.154 -syslog -log-level=err
# Syslog captures only log-level 'err' logs
user@consul-client-1:~$ tail -f /var/log/syslog
Sep 6 06:18:08 consul-client-1 consul[20085]: agent: failed to sync remote state: No known Consul servers
Syslog logging using consul monitor
Syslog can also be populated using consul monitor, by redirecting the monitor output to /var/log/syslog.
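As a sketch of both variants (the logger(1) alternative is my suggestion, not from the Consul docs; both need a live agent, so they are shown commented out):

```shell
# Option 1 (as described above): redirect monitor output into syslog.
# consul monitor -log-level=err >> /var/log/syslog
# Option 2: pipe through logger(1), which tags and timestamps each line
# ("consul-monitor" is an arbitrary tag).
monitor_cmd='consul monitor -log-level=err | logger -t consul-monitor'
# sh -c "$monitor_cmd" &
echo "$monitor_cmd"
```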
Service Discovery
Clients of Consul can provide a service, such as api or mysql, and other clients can use Consul to discover providers of a given service. Using either DNS or HTTP, applications can easily find the services they depend upon.
DNS API
Note that we registered the services as "apache-service" and "web-service", so the queries below for web.service.consul return NXDOMAIN; the registered name web-service.service.consul is what would resolve.
user@consul-client-1:~$ dig @127.0.0.1 -p 8600 web.service.consul
; <<>> DiG 9.9.5-3-Ubuntu <<>> @127.0.0.1 -p 8600 web.service.consul
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 54417
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available
;; QUESTION SECTION:
;web.service.consul. IN A
;; Query time: 15 msec
;; SERVER: 127.0.0.1#8600(127.0.0.1)
;; WHEN: Sat Aug 15 05:55:57 EDT 2015
;; MSG SIZE rcvd: 36
user@consul-client-1:~$
user@consul-client-1:~$ dig @127.0.0.1 -p 8600 web.service.consul SRV
; <<>> DiG 9.9.5-3-Ubuntu <<>> @127.0.0.1 -p 8600 web.service.consul SRV
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 15136
;; flags: qr aa rd; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available
;; QUESTION SECTION:
;web.service.consul. IN SRV
;; Query time: 2 msec
;; SERVER: 127.0.0.1#8600(127.0.0.1)
;; WHEN: Sat Aug 15 05:56:05 EDT 2015
;; MSG SIZE rcvd: 36
HTTP API
user@consul-client-1:~$ curl http://localhost:8500/v1/catalog/service/web-service
[{"Node":"consul-server-1","Address":"192.168.0.152","ServiceID":"web-service","ServiceName":"web-service","ServiceTags":["rails"],"ServiceAddress":"","ServicePort":81},{"Node":"consul-server-2","Address":"192.168.0.153","ServiceID":"web-service","ServiceName":"web-service","ServiceTags":["rails"],"ServiceAddress":"","ServicePort":81},{"Node":"consul-server-3","Address":"192.168.0.157","ServiceID":"web-service","ServiceName":"web-service","ServiceTags":["rails"],"ServiceAddress":"","ServicePort":81}]
user@consul-client-1:~$ curl http://localhost:8500/v1/catalog/service/apache-service
[{"Node":"consul-server-1","Address":"192.168.0.152","ServiceID":"apache-service","ServiceName":"apache-service","ServiceTags":["rails"],"ServiceAddress":"","ServicePort":80},{"Node":"consul-server-2","Address":"192.168.0.153","ServiceID":"apache-service","ServiceName":"apache-service","ServiceTags":["rails"],"ServiceAddress":"","ServicePort":80},{"Node":"consul-server-3","Address":"192.168.0.157","ServiceID":"apache-service","ServiceName":"apache-service","ServiceTags":["rails"],"ServiceAddress":"","ServicePort":80}]
Updating Services
Service definitions can be updated by changing configuration files and sending a SIGHUP to the agent. This lets you update services without any downtime or unavailability to service queries. Alternatively, the HTTP API can be used to add, remove, and modify services dynamically.
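A sketch of dynamic registration through the agent HTTP API (the /v1/agent/service/register endpoint). The service name "db-service" and its port are hypothetical, and the live calls, which need an agent on localhost:8500, are shown commented out:

```shell
# Build and sanity-check a registration payload for a new service
# ("db-service" is a hypothetical example service).
payload='{"Name": "db-service", "Tags": ["rails"], "Port": 5432}'
echo "$payload" | python3 -m json.tool > /dev/null && echo "payload ok"
# Register it dynamically with the local agent:
# curl -X PUT -d "$payload" http://localhost:8500/v1/agent/service/register
# And deregister it again by service ID:
# curl http://localhost:8500/v1/agent/service/deregister/db-service
```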
Querying Nodes
Just like querying services, Consul has an API for querying the nodes themselves. You can do this via the DNS or HTTP API.
For the DNS API, the structure of the names is NAME.node.consul or NAME.node.DATACENTER.consul. If the datacenter is omitted, Consul will only search the local datacenter.
For example, from "consul-client-1", we can query for the address of the node "consul-server-1":
user@consul-client-1:~$ dig @127.0.0.1 -p 8600 consul-server-1.node.consul
; <<>> DiG 9.9.5-3-Ubuntu <<>> @127.0.0.1 -p 8600 consul-server-1.node.consul
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 20448
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; WARNING: recursion requested but not available
;; QUESTION SECTION:
;consul-server-1.node.consul. IN A
;; ANSWER SECTION:
consul-server-1.node.consul. 0 IN A 192.168.0.152
;; Query time: 3 msec
;; SERVER: 127.0.0.1#8600(127.0.0.1)
;; WHEN: Sat Aug 15 06:04:42 EDT 2015
;; MSG SIZE rcvd: 88
Health Checking
Consul clients can provide any number of health checks, either associated with a given service ("is the webserver returning 200 OK"), or with the local node ("is memory utilization below 90%"). This information can be used by an operator to monitor cluster health, and it is used by the service discovery components to route traffic away from unhealthy hosts.
This shows that the health check for web-check is critical on all servers:
user@consul-client-1:~$ curl http://localhost:8500/v1/health/state/critical
[{"Node":"consul-server-1","CheckID":"service:web-check","Name":"Service 'web-check' check","Status":"critical","Notes":"","Output":"","ServiceID":"web-check","ServiceName":"web-check"},{"Node":"consul-server-2","CheckID":"service:web-check","Name":"Service 'web-check' check","Status":"critical","Notes":"","Output":"","ServiceID":"web-check","ServiceName":"web-check"},{"Node":"consul-server-3","CheckID":"service:web-check","Name":"Service 'web-check' check","Status":"critical","Notes":"","Output":"","ServiceID":"web-check","ServiceName":"web-check"}]
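For routing traffic away from unhealthy hosts, the health endpoint also supports a passing-only lookup per service; a sketch (the live curl, which needs an agent on localhost:8500, is commented out):

```shell
# Only instances whose checks are passing are returned with "?passing";
# since web-check is critical everywhere, this list would be empty here.
url="http://localhost:8500/v1/health/service/web-service?passing"
# curl "$url"
echo "$url"
```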
Key/Value Store
Applications can make use of Consul's hierarchical key/value store for any number of purposes, including dynamic configuration, feature flagging, coordination, leader election, and more. The simple HTTP API makes it easy to use.
Initially, the key/value store is empty:
user@consul-client-1:~$ curl -v http://localhost:8500/v1/kv/?recurse
* Hostname was NOT found in DNS cache
* Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8500 (#0)
> GET /v1/kv/?recurse HTTP/1.1
> User-Agent: curl/7.35.0
> Host: localhost:8500
> Accept: */*
>
< HTTP/1.1 404 Not Found
< X-Consul-Index: 1
< X-Consul-Knownleader: true
< X-Consul-Lastcontact: 0
< Date: Sat, 15 Aug 2015 10:11:20 GMT
< Content-Length: 0
< Content-Type: text/plain; charset=utf-8
<
* Connection #0 to host localhost left intact
Insert Key-Value and list Key-Value
user@consul-client-1:~$ curl -X PUT -d 'test' http://localhost:8500/v1/kv/web/key1
true
user@consul-client-1:~$
user@consul-client-1:~$ curl -X PUT -d 'test' http://localhost:8500/v1/kv/web/key2?flags=42
true
user@consul-client-1:~$
user@consul-client-1:~$ curl -X PUT -d 'test' http://localhost:8500/v1/kv/web/sub/key3
true
user@consul-client-1:~$
user@consul-client-1:~$
user@consul-client-1:~$ curl http://localhost:8500/v1/kv/?recurse
[{"CreateIndex":50,"ModifyIndex":50,"LockIndex":0,"Key":"web/key1","Flags":0,"Value":"dGVzdA=="},{"CreateIndex":51,"ModifyIndex":51,"LockIndex":0,"Key":"web/key2","Flags":42,"Value":"dGVzdA=="},{"CreateIndex":53,"ModifyIndex":53,"LockIndex":0,"Key":"web/sub/key3","Flags":0,"Value":"dGVzdA=="}]
user@consul-client-1:~$
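Note that the Value fields come back base64-encoded; a quick sketch of decoding one:

```shell
# "dGVzdA==" is the base64 encoding of the value "test" stored above.
decoded=$(echo 'dGVzdA==' | base64 -d)
echo "$decoded"   # prints: test
```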
List a Key entry. Delete a Key-Value
user@consul-client-1:~$ curl http://localhost:8500/v1/kv/web/key1
[{"CreateIndex":50,"ModifyIndex":50,"LockIndex":0,"Key":"web/key1","Flags":0,"Value":"dGVzdA=="}]
user@consul-client-1:~$
user@consul-client-1:~$
user@consul-client-1:~$ curl -X DELETE http://localhost:8500/v1/kv/web/sub?recurse
true
user@consul-client-1:~$
user@consul-client-1:~$
user@consul-client-1:~$ curl http://localhost:8500/v1/kv/web?recurse
[{"CreateIndex":50,"ModifyIndex":50,"LockIndex":0,"Key":"web/key1","Flags":0,"Value":"dGVzdA=="},{"CreateIndex":51,"ModifyIndex":51,"LockIndex":0,"Key":"web/key2","Flags":42,"Value":"dGVzdA=="}]
user@consul-client-1:~$
Modify value using ModifyIndex
user@consul-client-1:~$ curl http://localhost:8500/v1/kv/web?recurse
[{"CreateIndex":50,"ModifyIndex":50,"LockIndex":0,"Key":"web/key1","Flags":0,"Value":"dGVzdA=="},{"CreateIndex":51,"ModifyIndex":51,"LockIndex":0,"Key":"web/key2","Flags":42,"Value":"dGVzdA=="}]
user@consul-client-1:~$
user@consul-client-1:~$ curl -X PUT -d 'newval' http://localhost:8500/v1/kv/web/key1?cas=50
true
user@consul-client-1:~$
user@consul-client-1:~$ curl -X PUT -d 'newval' http://localhost:8500/v1/kv/web/key1?cas=50
false
user@consul-client-1:~$
user@consul-client-1:~$
user@consul-client-1:~$ curl http://localhost:8500/v1/kv/web?recurse
[{"CreateIndex":50,"ModifyIndex":62,"LockIndex":0,"Key":"web/key1","Flags":0,"Value":"bmV3dmFs"},{"CreateIndex":51,"ModifyIndex":51,"LockIndex":0,"Key":"web/key2","Flags":42,"Value":"dGVzdA=="}]
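The `?cas=<ModifyIndex>` behaviour above can be modelled locally. This is a simplified in-memory sketch of the check-and-set semantics, not Consul's implementation: a PUT with `cas` succeeds only when the supplied index matches the key's current ModifyIndex, and every successful write bumps a global index.

```python
class KVStore:
    """Toy model of Consul KV check-and-set (CAS) semantics."""

    def __init__(self):
        self.index = 0   # global write index
        self.data = {}   # key -> (value, modify_index)

    def put(self, key, value, cas=None):
        current = self.data.get(key)
        # CAS: only write if the caller's index matches the key's ModifyIndex
        if cas is not None and current is not None and current[1] != cas:
            return False
        self.index += 1
        self.data[key] = (value, self.index)
        return True

kv = KVStore()
kv.put("web/key1", "test")                   # ModifyIndex becomes 1
print(kv.put("web/key1", "newval", cas=1))   # True: index matches
print(kv.put("web/key1", "newval", cas=1))   # False: index has moved on
```

This mirrors the transcript above: the first PUT with `cas=50` returns true, the second returns false because the key's ModifyIndex has already advanced.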
We can also use the ModifyIndex to wait for a key's value to change. For example, suppose we want to block until key2's ModifyIndex is greater than 83.
user@consul-client-1:~$ curl http://localhost:8500/v1/kv/?recurse
[{"CreateIndex":50,"ModifyIndex":74,"LockIndex":0,"Key":"web/key1","Flags":0,"Value":"bmV3dmFs"},{"CreateIndex":51,"ModifyIndex":82,"LockIndex":0,"Key":"web/key2","Flags":0,"Value":"bmV3dmFs"}]
#This call blocks until ModifyIndex is greater than 83
user@consul-client-1:~$ curl "http://localhost:8500/v1/kv/web/key2?index=83"
[{"CreateIndex":51,"ModifyIndex":84,"LockIndex":0,"Key":"web/key2","Flags":0,"Value":"bmV3dmFs"}]
#In another window, assign a new value to increment ModifyIndex
#As ModifyIndex goes above 83, the blocked call above returns
user@consul-server-3:~$ curl -X PUT -d 'newval' http://localhost:8500/v1/kv/web/key2?cas=82
true
user@consul-server-3:~$ curl http://localhost:8500/v1/kv/?recurse
[{"CreateIndex":50,"ModifyIndex":74,"LockIndex":0,"Key":"web/key1","Flags":0,"Value":"bmV3dmFs"},{"CreateIndex":51,"ModifyIndex":84,"LockIndex":0,"Key":"web/key2","Flags":0,"Value":"bmV3dmFs"}]
#The wait can also be time bounded, i.e. return the latest value after 5s if ModifyIndex does not go above 83 within that duration
user@consul-client-1:~$ curl "http://localhost:8500/v1/kv/web/key2?index=83&wait=5s"
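Blocking reads like the ones above can be driven programmatically. A hedged sketch (the `blocking_kv_url` helper is my own, not part of any Consul library) that builds the blocking-query URL; a real client would GET this URL, read the `X-Consul-Index` response header, and loop with the new index:

```python
from urllib.parse import urlencode

def blocking_kv_url(base, key, index, wait=None):
    """Build a Consul KV blocking-query URL: blocks until ModifyIndex > index."""
    params = {"index": index}
    if wait is not None:
        params["wait"] = wait   # e.g. "5s" bounds how long the call blocks
    return "%s/v1/kv/%s?%s" % (base, key, urlencode(params))

print(blocking_kv_url("http://localhost:8500", "web/key2", 83, "5s"))
# http://localhost:8500/v1/kv/web/key2?index=83&wait=5s
```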
Multi Datacenter Replication
Consul supports multiple datacenters out of the box. This means users of Consul do not have to worry about building additional layers of abstraction to grow to multiple regions.
The consul-replicate tool is evaluated below to replicate Key-Value pairs across datacenters, from dc-one to dc-two.
dc-one (physical datacenter-1)
consul-server-2 holds Key-Value pairs that can be accessed using the client IP 192.168.0.155 as the agent address.
consul-server-2 : dc:dc-one consul IP:192.168.0.153 client-ip:192.168.0.155
http://192.168.0.155:8500/v1/kv/?recurse=
[{"CreateIndex":1287,"ModifyIndex":1287,"LockIndex":0,"Key":"foo","Flags":0,"Value":"YmFy"},{"CreateIndex":1553,"ModifyIndex":1567,"LockIndex":0,"Key":"service/consul-replicate/statuses/4b6a8b56271d06bed31bfa838eb2235e","Flags":0,"Value":"ewogICJMYXN0UmVwbGljYXRlZCI6IDE1NTIsCiAgIlNvdXJjZSI6ICJ3ZWIiLAogICJEZXN0aW5hdGlvbiI6ICJ3ZWIiCn0="},{"CreateIndex":50,"ModifyIndex":1551,"LockIndex":0,"Key":"web/key1","Flags":0,"Value":"bmV3dmFs"},{"CreateIndex":51,"ModifyIndex":1552,"LockIndex":0,"Key":"web/key2","Flags":0,"Value":"bmV3dmFs"}]
dc-two (physical datacenter-2)
consul-server-4: dc: dc-two consul IP : 192.168.0.158 client-ip:localhost
#Invoke consul-replicate to replicate Key-Value pairs from dc-one to dc-two
@consul-server-4:~$ consul-replicate -consul=192.168.0.155:8500
<This call is blocking>
@consul-server-4:~$ curl http://localhost:8500/v1/kv/?recurse
#<No keys get replicated>. No replication happened; this is an issue
@consul-server-4:~$
Watcher
The consul watch command watches for changes in a given data view from Consul. If a child process is specified, it is invoked with the latest results on each change. Otherwise, the latest values are dumped to stdout and the watch terminates.
The examples below demonstrate watches over Key-Value entries and services, and the invocation of a user-defined handler on state change.
Options:
-http-addr=127.0.0.1:8500 HTTP address of the Consul agent.
-datacenter="" Datacenter to query. Defaults to that of agent.
-token="" ACL token to use. Defaults to that of agent.
Watch Specification:
-key=val Specifies the key to watch. Only for 'key' type.
-name=val Specifies an event name to watch. Only for 'event' type.
-passingonly=[true|false] Specifies if only hosts passing all checks are displayed. Optional for 'service' type. Defaults false.
-prefix=val Specifies the key prefix to watch. Only for 'keyprefix' type.
-service=val Specifies the service to watch. Required for 'service' type, optional for 'checks' type.
-state=val Specifies the states to watch. Optional for 'checks' type.
-tag=val Specifies the service tag to filter on. Optional for 'service' type.
-type=val Specifies the watch type. One of key, keyprefix, services, nodes, service, checks, or event.
Key-Watcher
This example illustrates the use of consul watcher to invoke the handler on key value change.
#Define the key change handler to log the key update to a file
user@consul-client-1:~$ cat my-key-handler.sh
cat >> /home/user/watch-key.log
#Check the current key details
user@consul-client-1:~$ curl http://localhost:8500/v1/kv/?recurse
[{"CreateIndex":50,"ModifyIndex":74,"LockIndex":0,"Key":"web/key1","Flags":0,"Value":"bmV3dmFs"},{"CreateIndex":51,"ModifyIndex":141,"LockIndex":0,"Key":"web/key2","Flags":0,"Value":"bmV3dmFs"}]
#Start the Key watch on web/key2 with a handler
user@consul-client-1:~$ consul watch -type=key -key=web/key2 ./my-key-handler.sh
#Watch starts with invoking handler with current key details (JSON) and logs to the log file
user@consul-client-1:~$ cat watch-key.log
{"Key":"web/key2","CreateIndex":51,"ModifyIndex":141,"LockIndex":0,"Flags":0,"Value":"bmV3dmFs","Session":""}
#Change the key value for web/key2 using ModifyIndex=141
user@consul-client-1:~$ curl -X PUT -d 'newval' http://localhost:8500/v1/kv/web/key2?cas=141
true
user@consul-client-1:~$ curl http://localhost:8500/v1/kv/?recurse
[{"CreateIndex":50,"ModifyIndex":74,"LockIndex":0,"Key":"web/key1","Flags":0,"Value":"bmV3dmFs"},{"CreateIndex":51,"ModifyIndex":145,"LockIndex":0,"Key":"web/key2","Flags":0,"Value":"bmV3dmFs"}]
user@consul-client-1:~$
#Watch proceeds with invoking handler with modified key details (JSON) and logs to the log file
user@consul-client-1:~$ cat watch-key.log
{"Key":"web/key2","CreateIndex":51,"ModifyIndex":141,"LockIndex":0,"Flags":0,"Value":"bmV3dmFs","Session":""}
{"Key":"web/key2","CreateIndex":51,"ModifyIndex":145,"LockIndex":0,"Flags":0,"Value":"bmV3dmFs","Session":""}
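The shell handler above simply appends the raw JSON; a handler can also parse the payload. A sketch of such a handler (the one-line log format is my own invention): `consul watch` pipes the latest key data as JSON to the handler's stdin, so a script would call `format_update(sys.stdin.read())`.

```python
import json

def format_update(payload):
    """Turn a key-watch JSON payload (as passed to the handler) into a log line."""
    data = json.loads(payload)
    return "%s @ ModifyIndex %d" % (data["Key"], data["ModifyIndex"])

# Sample payload, matching what consul watch pipes to the handler's stdin
sample = ('{"Key":"web/key2","CreateIndex":51,"ModifyIndex":145,'
          '"LockIndex":0,"Flags":0,"Value":"bmV3dmFs","Session":""}')
print(format_update(sample))  # web/key2 @ ModifyIndex 145
```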
Health-watcher
This example illustrates the use of consul watcher to invoke the handler on service health state change.
#Define the service health change handler to log the update to a file
user@consul-client-1:~$ cat my-service-handler.sh
cat >> /home/user/watch-service.log
#Apache service is running on consul-server-1
user@consul-server-1:~$ sudo /etc/init.d/apache2 status
* apache2 is running
#Start the Service health watch on apache-check with a handler
user@consul-client-1:~$ consul watch -type=checks -service apache-check ./my-service-handler.sh
#The watch starts by invoking the handler with the current service status details (JSON) and logs to the log file
user@consul-client-1:~$ tail -f watch-service.log
[{"Node":"consul-server-2","CheckID":"service:apache-check","Name":"Service 'apache-check' check","Status":"passing","Notes":"","Output":"","ServiceID":"apache-check","ServiceName":"apache-check"},{"Node":"consul-server-3","CheckID":"service:apache-check","Name":"Service 'apache-check' check","Status":"passing","Notes":"","Output":"","ServiceID":"apache-check","ServiceName":"apache-check"},{"Node":"consul-server-1","CheckID":"service:apache-check","Name":"Service 'apache-check' check","Status":"passing","Notes":"","Output":"","ServiceID":"apache-check","ServiceName":"apache-check"}]
user@consul-server-1:~$ sudo /etc/init.d/apache2 status
* apache2 is running
#Apache service is stopped on consul-server-1
user@consul-server-1:~$ sudo /etc/init.d/apache2 stop
* Stopping web server apache2 *
user@consul-server-1:~$ sudo /etc/init.d/apache2 status
* apache2 is not running
user@consul-client-1:~$ tail -f watch-service.log
[{"Node":"consul-server-2","CheckID":"service:apache-check","Name":"Service 'apache-check' check","Status":"passing","Notes":"","Output":"","ServiceID":"apache-check","ServiceName":"apache-check"},{"Node":"consul-server-3","CheckID":"service:apache-check","Name":"Service 'apache-check' check","Status":"passing","Notes":"","Output":"","ServiceID":"apache-check","ServiceName":"apache-check"},{"Node":"consul-server-1","CheckID":"service:apache-check","Name":"Service 'apache-check' check","Status":"passing","Notes":"","Output":"","ServiceID":"apache-check","ServiceName":"apache-check"}]
#Watch proceeds with invoking handler with modified service status details (JSON) and logs to the log file
[{"Node":"consul-server-2","CheckID":"service:apache-check","Name":"Service 'apache-check' check","Status":"passing","Notes":"","Output":"","ServiceID":"apache-check","ServiceName":"apache-check"},{"Node":"consul-server-3","CheckID":"service:apache-check","Name":"Service 'apache-check' check","Status":"passing","Notes":"","Output":"","ServiceID":"apache-check","ServiceName":"apache-check"},{"Node":"consul-server-1","CheckID":"service:apache-check","Name":"Service 'apache-check' check","Status":"critical","Notes":"","Output":"","ServiceID":"apache-check","ServiceName":"apache-check"}]
Upgrade
For upgrades, the Consul team strives to ensure backwards compatibility. To support this, nodes gossip their protocol version and build. This enables clients and servers to intelligently enable new features when available, or to gracefully fall back to a backwards-compatible mode of operation otherwise.
Every subsequent release of Consul remains backwards compatible with at least one prior version. Concretely: version 0.5 can speak to 0.4 (and vice versa), but may not be able to speak to 0.1.
Note: when speaking an earlier protocol, new features may not be available.
user@consul-client-1:~$ consul --version
Consul v0.5.1-1-gdc6795a
Consul Protocol: 2 (Understands back to: 1)
#The Consul version is v0.5.1
#The current Consul protocol is 2, and it understands protocol versions back to 1
Upgrade the servers first, then the clients.
The deployment is a 4-node cluster with 3 Consul servers and 1 Consul client.
All 4 nodes currently run Consul v0.5.1; this exercise upgrades one server, consul-server-2, to v0.5.2.
The steps to upgrade a Consul server from v0.5.1 to v0.5.2 are as follows:
Gracefully leave the cluster and shut down the agent
Install the new version of Consul
Update PATH (and bashrc) to point at the new version
Start the Consul server; it will run the new version
Join the server back to the cluster
Verify the server has rejoined the cluster
Repeat these steps for each node in the cluster
#Verify the consul-server-2 using the consul v0.5.1
user@consul-server-2:~$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/local/bin/consul-v0.5.1
user@consul-server-2:~$ cat /etc/bash.bashrc | grep consul
export PATH=$PATH:/usr/local/bin/consul-v0.5.1
#Verify the consul-server-2 using the consul v0.5.1
user@consul-server-2:~$ consul members
Node Address Status Type Build Protocol
consul-client-1 192.168.0.154:8301 alive client 0.5.1 2
consul-server-2 192.168.0.153:8301 alive server 0.5.1 2
consul-server-1 192.168.0.152:8301 alive server 0.5.1 2
consul-server-3 192.168.0.157:8301 alive server 0.5.1 2
#Install the consul v0.5.2 and verify the location
user@consul-server-2:~$ ls /usr/local/bin/
consul-v0.5.1 consul-v0.5.2
#Gracefully leave the consul cluster and shutdown consul-server-2
user@consul-server-2:~$ consul leave
Graceful leave complete
#Logs at consul-server-1 when consul-server-2 leaves gracefully
consul-server-1
2015/08/15 11:11:04 [INFO] consul: removing server consul-server-2 (Addr: 192.168.0.153:8300) (DC: dc-one)
2015/08/15 11:11:04 [INFO] raft: Removed peer 192.168.0.153:8300, stopping replication (Index: 378)
2015/08/15 11:11:04 [INFO] consul: removed server 'consul-server-2' as peer
2015/08/15 11:11:04 [INFO] consul: member 'consul-server-2' left, deregistering
2015/08/15 11:11:04 [INFO] raft: aborting pipeline replication to peer 192.168.0.153:8300
#Logs at consul-server-2 when consul-server-2 leaves gracefully
consul-server-2
2015/08/15 11:10:47 [INFO] agent.rpc: Accepted client: 127.0.0.1:49107
2015/08/15 11:10:54 [WARN] agent: Check 'service:web-check' is now critical
2015/08/15 11:11:04 [WARN] agent: Check 'service:web-check' is now critical
2015/08/15 11:11:14 [WARN] agent: Check 'service:web-check' is now critical
2015/08/15 11:11:15 [INFO] agent.rpc: Accepted client: 127.0.0.1:49115
2015/08/15 11:11:15 [INFO] agent.rpc: Graceful leave triggered
2015/08/15 11:11:15 [INFO] consul: server starting leave
2015/08/15 11:11:15 [INFO] serf: EventMemberLeave: consul-server-2.dc-one 192.168.0.153
2015/08/15 11:11:15 [INFO] consul: removing server consul-server-2.dc-one (Addr: 192.168.0.153:8300) (DC: dc-one)
2015/08/15 11:11:15 [INFO] serf: EventMemberLeave: consul-server-2 192.168.0.153
2015/08/15 11:11:15 [INFO] consul: removing server consul-server-2 (Addr: 192.168.0.153:8300) (DC: dc-one)
2015/08/15 11:11:15 [INFO] raft: Removed ourself, transitioning to follower
2015/08/15 11:11:15 [ERR] raft-net: Failed to flush response: write tcp 192.168.0.152:49796: broken pipe
2015/08/15 11:11:15 [INFO] agent: requesting shutdown
2015/08/15 11:11:15 [INFO] consul: shutting down server
2015/08/15 11:11:15 [INFO] agent: shutdown complete
#Verify at consul-server-1 the consul member list and consul-server-2 has left
user@consul-server-1:~$ consul members
Node Address Status Type Build Protocol
consul-server-1 192.168.0.152:8301 alive server 0.5.1 2
consul-server-2 192.168.0.153:8301 left server 0.5.1 2
consul-server-3 192.168.0.157:8301 alive server 0.5.1 2
consul-client-1 192.168.0.154:8301 alive client 0.5.1 2
#At consul-server-2, update PATH and bashrc to the new Consul version v0.5.2
user@consul-server-2:~$ export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/local/bin/consul-v0.5.2
user@consul-server-2:~$ sudo vi /etc/bash.bashrc
[sudo] password for user:
user@consul-server-2:~$ cat /etc/bash.bashrc | grep consul
export PATH=$PATH:/usr/local/bin/consul-v0.5.2
user@consul-server-2:~$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/local/bin/consul-v0.5.2
#Start the consul-server-2, it should start with new consul version v0.5.2
user@consul-server-2:~$ consul agent -server -bootstrap-expect=3 -data-dir /tmp/consul -config-dir /etc/consul.d -dc dc-one -node=consul-server-2 --bind=192.168.0.153
==> WARNING: Expect Mode enabled, expecting 3 servers
==> WARNING: It is highly recommended to set GOMAXPROCS higher than 1
==> Starting raft data migration...
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Consul agent running!
Node name: 'consul-server-2'
Datacenter: 'dc-one'
Server: true (bootstrap: false)
Client Addr: 127.0.0.1 (HTTP: 8500, HTTPS: -1, DNS: 8600, RPC: 8400)
Cluster Addr: 192.168.0.153 (LAN: 8301, WAN: 8302)
Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
Atlas: <disabled>
==> Log data will now stream in as it occurs:
2015/08/15 11:15:13 [INFO] serf: EventMemberJoin: consul-server-2 192.168.0.153
2015/08/15 11:15:13 [INFO] serf: EventMemberJoin: consul-server-2.dc-one 192.168.0.153
2015/08/15 11:15:13 [INFO] raft: Node at 192.168.0.153:8300 [Follower] entering Follower state
2015/08/15 11:15:14 [INFO] consul: adding server consul-server-2 (Addr: 192.168.0.153:8300) (DC: dc-one)
2015/08/15 11:15:14 [INFO] consul: adding server consul-server-2.dc-one (Addr: 192.168.0.153:8300) (DC: dc-one)
2015/08/15 11:15:14 [ERR] agent: failed to sync remote state: No cluster leader
2015/08/15 11:15:15 [WARN] raft: EnableSingleNode disabled, and no known peers. Aborting election.
2015/08/15 11:15:16 [WARN] agent: Check 'service:web-check' is now critical
#Join the consul-server-2 to the consul cluster
user@consul-server-1:~$ consul join 192.168.0.153
Successfully joined cluster by contacting 1 nodes.
#Verify the consul member list and validate consul-server-2 version is v0.5.2
user@consul-server-1:~$ consul members
Node Address Status Type Build Protocol
consul-server-1 192.168.0.152:8301 alive server 0.5.1 2
consul-server-2 192.168.0.153:8301 alive server 0.5.2 2
consul-server-3 192.168.0.157:8301 alive server 0.5.1 2
consul-client-1 192.168.0.154:8301 alive client 0.5.1 2
#Logs at consul-server-1 when consul-server-2 joins the consul cluster
consul-server-1
2015/08/15 11:15:43 [INFO] agent.rpc: Accepted client: 127.0.0.1:41181
2015/08/15 11:15:43 [INFO] agent: (LAN) joining: [192.168.0.153]
2015/08/15 11:15:43 [INFO] serf: EventMemberJoin: consul-server-2 192.168.0.153
2015/08/15 11:15:43 [INFO] agent: (LAN) joined: 1 Err: <nil>
2015/08/15 11:15:43 [INFO] consul: adding server consul-server-2 (Addr: 192.168.0.153:8300) (DC: dc-one)
2015/08/15 11:15:43 [INFO] raft: Added peer 192.168.0.153:8300, starting replication
2015/08/15 11:15:43 [ERR] raft: Failed to AppendEntries to 192.168.0.153:8300: EOF
2015/08/15 11:15:43 [INFO] consul: member 'consul-server-2' joined, marking health alive
2015/08/15 11:15:43 [WARN] raft: AppendEntries to 192.168.0.153:8300 rejected, sending older logs (next: 379)
2015/08/15 11:15:43 [INFO] raft: pipelining replication to peer 192.168.0.153:8300
#Logs at consul-server-2 when consul-server-2 joins the consul cluster
consul-server-2
2015/08/15 11:15:54 [INFO] serf: EventMemberJoin: consul-server-1 192.168.0.152
2015/08/15 11:15:54 [INFO] serf: EventMemberJoin: consul-server-3 192.168.0.157
2015/08/15 11:15:54 [INFO] serf: EventMemberJoin: consul-client-1 192.168.0.154
2015/08/15 11:15:54 [INFO] consul: adding server consul-server-1 (Addr: 192.168.0.152:8300) (DC: dc-one)
2015/08/15 11:15:54 [INFO] consul: adding server consul-server-3 (Addr: 192.168.0.157:8300) (DC: dc-one)
2015/08/15 11:15:54 [INFO] consul: New leader elected: consul-server-1
2015/08/15 11:15:54 [WARN] raft: Failed to get previous log: 383 log not found (last: 378)
2015/08/15 11:15:54 [INFO] raft: Removed ourself, transitioning to follower
2015/08/15 11:15:55 [INFO] agent: Synced service 'consul'
2015/08/15 11:15:55 [INFO] agent: Synced service 'apache-check'
2015/08/15 11:15:55 [INFO] agent: Synced service 'apache-service'
2015/08/15 11:15:55 [INFO] agent: Synced service 'web-check'
2015/08/15 11:15:55 [INFO] agent: Synced service 'web-service'
2015/08/15 11:15:55 [INFO] agent: Synced check 'ping-check'
2015/08/15 11:15:56 [WARN] agent: Check 'service:web-check' is now critical
Consul Template
Consul Template is a Consul Tool that provides a convenient way to populate values from Consul into the filesystem using the consul-template daemon.
The daemon consul-template queries a Consul instance and updates any number of specified templates on the filesystem. As an added bonus, consul-template can optionally run arbitrary commands when the update process completes.
Consul Template can query service entries, keys, and key values in Consul. The powerful abstraction and template query language make Consul Template well suited to creating dynamic configurations.
HAProxy Backends
HAProxy is a very common, high-performance load balancer. A typical HAProxy configuration file looks like:
backend frontend
balance roundrobin
server web1 web1.yourdomain.com:80 check
server web2 web2.yourdomain.com:80 check
However, adding and removing nodes from HAProxy is a painful and often scary experience. Consul Template takes the fear out of HAProxy:
backend frontend
balance roundrobin{{range service "app.frontend"}}
server {{.ID}} {{.Address}}:{{.Port}}{{end}}
You may notice the check attribute has been removed. Since our health checks are defined in Consul, and Consul Template only returns healthy nodes from a service query, we can save HAProxy the work of checking the nodes' health and leave that logic to Consul.
With the optional command argument, Consul Template can automatically trigger a reload of HAProxy when the template is updated. As nodes are dynamically added and removed from Consul, your load balancer will be immediately informed of the change, making it easier than ever to minimize downtime.
Application Configurations
The intuitive yet powerful key-value store allows application developers and operators to store global configuration information in Consul. Consul Template will dynamically update when a change to one of those values occurs. With today's configuration management practices, it is common to have a configuration template for an application which alters some tunable values for the application:
MaxWorkers 5
JobsPerSecond 11
Even using a dynamic configuration management system, the application's configuration will remain unchanged until the next run. With Consul Template, any change in the key-value store is immediately propagated to all templates listening to that value. Now, your configuration management software can write a Consul Template:
MaxWorkers {{key "web/max-workers"}}
JobsPerSecond {{key "web/jobs-per-second"}}
This template is now connected with Consul and will receive instant, dynamic updates as soon as a change is pushed to Consul. You no longer need to wait an hour for the next iteration of your CM system to run.
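The effect of those `{{key ...}}` directives can be illustrated with a toy renderer. This is not Consul Template's engine, just a regex sketch that substitutes the values a KV lookup would return:

```python
import re

def render(template, kv):
    """Replace {{key "path"}} directives with values from a KV mapping."""
    return re.sub(r'\{\{key "([^"]+)"\}\}', lambda m: kv[m.group(1)], template)

# Hypothetical KV contents backing the template above
kv = {"web/max-workers": "5", "web/jobs-per-second": "11"}
tmpl = 'MaxWorkers {{key "web/max-workers"}}\nJobsPerSecond {{key "web/jobs-per-second"}}'
print(render(tmpl, kv))
# MaxWorkers 5
# JobsPerSecond 11
```

In the real system, Consul Template re-renders the file the moment either key changes in Consul.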
Features
Consul Template is jam-packed full of great features, but we cannot possibly list them all! Here are just a few of the many features you will find in Consul Template:
Quiescence - Consul Template ships with built-in quiescence and can intelligently wait for changes from a Consul instance. This critical feature prevents frequent updates to a template while a system is fluctuating.
Dry Mode - Unsure of the current state of your infrastructure? Afraid a template change might break a subsystem? Fear no more because Consul Template comes complete with -dry mode. In dry mode, Consul Template will render the result to STDOUT, so an operator can inspect the output and decide if the template change is safe.
CLI and Config - Do you prefer to specify everything on the command line? Or are you using a configuration management tool and prefer configurations written to disk? Whatever your preference, Consul Template has you covered! With built-in support for HCL, Consul Template accepts a configuration file, command line flags, or a mix of both! In this way, you can continue to use your existing configuration management tools in tandem with Consul Template.
Verbose Debugging - Even when you think it is perfect, sometimes systems fail. For those critical times, Consul Template has a detailed logging system that can be used to track down almost any issue.
Conclusion
Consul Template has changed the way we think about service discovery in our infrastructure.
Install Consul Template
# Download Consul Template
user@consul-client-2:~$ cd /usr/local/bin/
user@consul-client-2:/usr/local/bin$ sudo wget https://github.com/hashicorp/consul-template/releases/download/v0.10.0/consul-template_0.10.0_linux_amd64.tar.gz
--2015-08-22 13:07:09-- https://github.com/hashicorp/consul-template/releases/download/v0.10.0/consul-template_0.10.0_linux_amd64.tar.gz
Resolving github.com (github.com)... 192.30.252.129
Connecting to github.com (github.com)|192.30.252.129|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://s3.amazonaws.com/github-cloud/releases/24898177/9bfc68f4-0eab-11e5-869a-600fbcec3532.gz?response-content-disposition=attachment%3B%20filename%3Dconsul-template_0.10.0_linux_amd64.tar.gz&response-content-type=application/octet-stream&AWSAccessKeyId=AKIAISTNZFOVBIJMK3TQ&Expires=1440266840&Signature=dHQtg5WPJZ6x7rBcjKxb%2FFS3Hf0%3D [following]
--2015-08-22 13:07:10-- https://s3.amazonaws.com/github-cloud/releases/24898177/9bfc68f4-0eab-11e5-869a-600fbcec3532.gz?response-content-disposition=attachment%3B%20filename%3Dconsul-template_0.10.0_linux_amd64.tar.gz&response-content-type=application/octet-stream&AWSAccessKeyId=AKIAISTNZFOVBIJMK3TQ&Expires=1440266840&Signature=dHQtg5WPJZ6x7rBcjKxb%2FFS3Hf0%3D
Resolving s3.amazonaws.com (s3.amazonaws.com)... 54.231.65.40
Connecting to s3.amazonaws.com (s3.amazonaws.com)|54.231.65.40|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2559734 (2.4M) [application/octet-stream]
Saving to: ‘consul-template_0.10.0_linux_amd64.tar.gz’
100%[=========================================================================================================>] 2,559,734 462KB/s in 5.4s
2015-08-22 13:07:18 (462 KB/s) - ‘consul-template_0.10.0_linux_amd64.tar.gz’ saved [2559734/2559734]
# Untar Consul Template
user@consul-client-2:/usr/local/bin$ sudo tar -xzvf consul-template_0.10.0_linux_amd64.tar.gz
consul-template_0.10.0_linux_amd64/
consul-template_0.10.0_linux_amd64/consul-template
# Copy Consul Template binary to /usr/local/bin
user@consul-client-2:/usr/local/bin$ sudo cp consul-template_0.10.0_linux_amd64/consul-template .
user@consul-client-2:/usr/local/bin$ cd -
/home/user
#Validate Consul Template is installed and accessible
user@consul-client-2:~$ consul-template --help
Usage: consul-template [options]
Watches a series of templates on the file system, writing new changes when
Consul is updated. It runs until an interrupt is received unless the -once
flag is specified.
Options:
-auth=<user[:pass]> Set the basic authentication username (and password)
-consul=<address> Sets the address of the Consul instance
-max-stale=<duration> Set the maximum staleness and allow stale queries to
Consul which will distribute work among all servers
instead of just the leader
-ssl Use SSL when connecting to Consul
-ssl-verify Verify certificates when connecting via SSL
-ssl-cert SSL client certificate to send to server
-ssl-ca-cert Validate server certificate against this CA
certificate file list
-token=<token> Sets the Consul API token
-syslog Send the output to syslog instead of standard error
and standard out. The syslog facility defaults to
LOCAL0 and can be changed using a configuration file
-syslog-facility=<f> Set the facility where syslog should log. If this
attribute is supplied, the -syslog flag must also be
supplied.
-template=<template> Adds a new template to watch on disk in the format
'templatePath:outputPath(:command)'
-wait=<duration> Sets the 'minimum(:maximum)' amount of time to wait
before writing a template (and triggering a command)
-retry=<duration> The amount of time to wait if Consul returns an
error when communicating with the API
-config=<path> Sets the path to a configuration file on disk
-pid-file=<path> Path on disk to write the PID of the process
-log-level=<level> Set the logging level - valid values are "debug",
"info", "warn" (default), and "err"
-dry Dump generated templates to stdout
-once Do not run the process as a daemon
-v, -version Print the version of this daemon
Dry run of Consul Template
# Create a consul template to fetch server list from consul running apache-service
user@consul-client-2:~$ cat consul.ctmpl
{{range service "apache-service"}}\nserver {{.Address}}:{{.Port}}{{end}}
# Fetch the server list running apache-service
# -dry is for dry run and -once is to run it once
user@consul-client-2:~$ consul-template -consul localhost:8500 -template consul.ctmpl:consul.result -dry -once
2015/08/22 10:08:47 [DEBUG] (logging) setting up logging
2015/08/22 10:08:47 [DEBUG] (logging) config:
{
"name": "consul-template",
"level": "WARN",
"syslog": false,
"syslog_facility": "LOCAL0"
}
> consul.result
\nserver 192.168.0.152:80\nserver 192.168.0.153:80\nserver 192.168.0.157:80\nserver 192.168.0.154:80
Flapping Service Detection
Consider a case where consul-template is watching a specific service and one instance of that service is flapping.
This section covers how to handle a flapping server and how to configure consul-template to defer rendering when it detects one.
This behaviour is provided by the consul-template -wait option.
-wait : The minimum(:maximum) time to wait before rendering a new template to disk and triggering a command, separated by a colon (:). If the optional maximum value is omitted, it is assumed to be 4x the required minimum value. There is no default value.
consul-template -wait 10s
Tells consul-template to wait until no new data has been returned from Consul for 10 seconds. Similarly:
consul-template -wait 10s:100s
Tells consul-template to wait until no new data has been returned from Consul for 10 seconds, but to force rendering after 100 seconds if flapping is detected.
E.g. the dry run above returned 4 servers (152, 153, 157, 154) for apache-service.
# Configure consul-template to wait for minimum 20s and maximum 100s for flapping server detection
user@haproxy:~$ consul-template -consul localhost:8500 -template /home/user/nginx.template:/home/user/nginx.conf:/home/user/nginx.sh -wait 20s:100s
This means that if at least one server is flapping and consul-template detects it, template rendering is deferred for up to 100s.
Use-case-1: No flapping; a single service register/de-register happens; wait 20s:100s
Detection time: 20s
Use-case-2: Service register/de-register flapping at, say, a 5s interval; wait 20s:100s
Detection time: 100s
Use-case-3: Service register/de-register flapping at a 5s interval; wait 20s:100s
Another service instance (say 192.168.0.155) registers without flapping at 10s. Its update is also deferred because of the flapping service.
Detection time: 100s
Use-case-4: Service register/de-register flapping at a 5s interval; wait 20s:100s
The flapping stops between 20s and 100s
Detection time: less than 100s
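The quiescence behaviour in these use cases can be modelled as a min/max timer. A simplified sketch (not consul-template's actual implementation): each observed change restarts the `min_wait` timer, while `max_wait` bounds the total deferral measured from the first change.

```python
def detection_time(change_times, min_wait, max_wait):
    """Return when a render fires, given times (in seconds) at which changes arrive.

    Each change restarts the min_wait quiescence timer; max_wait bounds
    the total deferral, measured from the first change.
    """
    first = change_times[0]
    deadline = first + max_wait
    fire = first + min_wait
    for t in change_times[1:]:
        fire = t + min_wait   # change observed: restart the min timer
    return min(fire, deadline)

# Use-case-1: a single change, wait 20s:100s -> renders after 20s
print(detection_time([0], 20, 100))                      # 20
# Use-case-2: flapping every 5s for the whole window -> forced at 100s
print(detection_time(list(range(0, 100, 5)), 20, 100))   # 100
# Use-case-4: flapping stops at 40s -> renders at 60s, before the 100s bound
print(detection_time([0, 5, 10, 40], 20, 100))           # 60
```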
Consul Tools
Automation
Automation of Consul using Python is evaluated with the python-consul library.
Install python-consul
user@consul-client-1:/usr/lib/python2.7$ sudo pip install python-consul
Downloading/unpacking python-consul
Downloading python-consul-0.4.4.tar.gz
Running setup.py (path:/tmp/pip_build_root/python-consul/setup.py) egg_info for package python-consul
Requirement already satisfied (use --upgrade to upgrade): requests in ./dist-packages (from python-consul)
Requirement already satisfied (use --upgrade to upgrade): six>=1.4 in ./dist-packages (from python-consul)
Installing collected packages: python-consul
Running setup.py install for python-consul
Successfully installed python-consul
Cleaning up...
Automate Consul APIs using the Python library
user@consul-client-1:/usr/lib/python2.7$ python
Python 2.7.6 (default, Mar 22 2014, 22:59:56)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import consul
>>> c = consul.Consul()
>>> c.kv.put("foo", "bar")
True
# Verify the Consul KV store now contains the key "foo"
user@consul-client-1:~$ curl http://localhost:8500/v1/kv/?recurse
[{"CreateIndex":1287,"ModifyIndex":1287,"LockIndex":0,"Key":"foo","Flags":0,"Value":"YmFy"},{"CreateIndex":50,"ModifyIndex":74,"LockIndex":0,"Key":"web/key1","Flags":0,"Value":"bmV3dmFs"},{"CreateIndex":51,"ModifyIndex":145,"LockIndex":0,"Key":"web/key2","Flags":0,"Value":"bmV3dmFs"}]
# Read back the key "foo" via the API; its value is "bar"
>>> index, data = c.kv.get('foo')
>>> data['Value']
'bar'
# Read the key "web/key2", whose value "newval" was inserted earlier from the CLI
>>> index, data = c.kv.get('web/key2')
>>> data['Value']
'newval'
HashiCorp Tools
These Consul tools are created and managed by the dedicated engineers at HashiCorp:
Envconsul - Read and set environmental variables for processes from Consul.
Consul Replicate - Consul cross-DC KV replication daemon.
Consul Template - Generic template rendering and notifications with Consul
Consul Migrate - Data migration tool to handle Consul upgrades to 0.5.1+
Consul SDK
These Consul SDKs are created and managed by the amazing members of the Consul community:
api - Official Go client for the Consul HTTP API
consulate - Python client for the Consul HTTP API
python-consul - Python client for the Consul HTTP API
consul-php-sdk - PHP client for the Consul HTTP API
scala-consul - Scala client for the Consul HTTP API
consul-client - Java client for the Consul HTTP API
consul-api - Java client for the Consul HTTP API
discovery - Erlang/OTP client for the Consul HTTP API
consul-client - Ruby client for the Consul HTTP API
diplomat - Ruby library to query Consul's KV-store and services directory
node-consul - Node.js client for the Consul HTTP API
Consul.NET - C# client for the Consul HTTP API
Community Tools
These Consul tools are created and managed by the amazing members of the Consul community:
confd - Manage local application configuration files using templates and data from etcd or Consul
crypt - Store and retrieve encrypted configuration parameters from etcd or Consul
docker-consul - Dockerized Consul Agent
git2consul - Mirror the contents of a Git repository into Consul KVs
helios-consul - Service registrar plugin for Helios
registrator - Service registry bridge for Docker