Warning: This guide is based on an old version of Docker. The instructions below may already have been removed or changed in the Docker codebase.
In my previous post I talked about running virtual machines on a local KVM with libvirt. This was not a standalone task, but rather the preparation for this blog post: running containers on multiple hosts attached to the same network.
I was asked in the comments on one of my earlier posts whether it would be possible to run a cluster on multiple hosts. I found a write-up on the topic by Franck Besnard. I’ve decided to set up a similar environment on my own to see whether and how it works.
I made a few changes to Franck’s setup:
I’m not using the script, to minimize the dependencies.
I wanted to make launching the containers as simple as possible, so I dropped the ovswork.sh script Franck crafted and I only use the docker run command.
I’m not creating a virtual ethernet device for each container; instead I’m attaching all containers to a bridge.
I’m not using VLANs (yet).
I used two VMs (host1 and host2), each with Fedora 20 as the operating system. You can follow my previous post to create them. Later you can use the virsh start host1 command to run them.
On both hosts I’ve installed Docker:
yum -y install docker-io
The Docker configuration requires some changes.
By default Docker chooses a (more or less) random network to run the containers. After this it creates a bridge and assigns an address to it. This is not really what we want because we need to have static address assignment, so we need to prepare our own bridge and disable the one managed by Docker.
Copy the /usr/lib/systemd/system/docker.service file to /etc/systemd/system/docker.service and add the following content to disable the default docker0 bridge creation on Docker startup.
    .include /usr/lib/systemd/system/docker.service

    [Service]
    ExecStart=
    ExecStart=/usr/bin/docker -d -b=none
You can start Docker with systemctl start docker.
Note: Every time you modify a systemd service file, do not forget to run systemctl daemon-reload to apply your changes.
This is the interesting part :)
To make networking easy I used the Open vSwitch software. I’m very new to it, but its flexibility and ease of use are just impressive. I haven’t done any performance testing, though.
You can install Open vSwitch on Fedora by running this command:
yum -y install openvswitch
The script below prepares the networking for you. You can execute it on both hosts after adjusting the REMOTE_IP and BRIDGE_ADDRESS variables. The BRIDGE_NAME can be the same on both hosts.
    # The 'other' host
    REMOTE_IP=192.168.122.189
    # Name of the bridge
    BRIDGE_NAME=docker0
    # Bridge address
    BRIDGE_ADDRESS=172.16.42.2/24

    # Deactivate the docker0 bridge
    ip link set $BRIDGE_NAME down
    # Remove the docker0 bridge
    brctl delbr $BRIDGE_NAME
    # Delete the Open vSwitch bridge
    ovs-vsctl del-br br0
    # Add the docker0 bridge
    brctl addbr $BRIDGE_NAME
    # Set up the IP for the docker0 bridge
    ip a add $BRIDGE_ADDRESS dev $BRIDGE_NAME
    # Activate the bridge
    ip link set $BRIDGE_NAME up
    # Add the br0 Open vSwitch bridge
    ovs-vsctl add-br br0
    # Create the tunnel to the other host and attach it to the
    # br0 bridge
    ovs-vsctl add-port br0 gre0 -- set interface gre0 type=gre options:remote_ip=$REMOTE_IP
    # Add the br0 bridge to docker0 bridge
    brctl addif $BRIDGE_NAME br0

    # Some useful commands to confirm the settings:
    # ip a s
    # ip r s
    # ovs-vsctl show
    # brctl show
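Since only two values differ between the hosts, one option is to look them up by host name instead of editing the script by hand. This is only a sketch: host_settings is a made-up helper, and the pairing of addresses to hosts is a guess based on the ping example and the ovs-vsctl show output in this post, so adjust it to your environment.

```shell
#!/bin/sh
# Hypothetical helper: print the per-host variables as shell assignments.
# Assumption: host1 is 192.168.122.43 with bridge 172.16.42.1, and
# host2 is 192.168.122.189 with bridge 172.16.42.2.
host_settings() {
    case "$1" in
        host1)
            # host1 tunnels to host2
            printf 'REMOTE_IP=%s\nBRIDGE_ADDRESS=%s\n' 192.168.122.189 172.16.42.1/24
            ;;
        host2)
            # host2 tunnels to host1
            printf 'REMOTE_IP=%s\nBRIDGE_ADDRESS=%s\n' 192.168.122.43 172.16.42.2/24
            ;;
        *)
            echo "unknown host: $1" >&2
            return 1
            ;;
    esac
}

# Usage on each host, before the ip/brctl/ovs-vsctl commands:
#   eval "$(host_settings "$(hostname -s)")"
```

The rest of the script can stay exactly as it is; only the two assignments at the top are replaced by the eval line.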
After executing these commands on both hosts you should be able to ping the docker0 bridge addresses from both hosts.
Here is an example from host2 (IP 192.168.122.189):
    $ ping 172.16.42.1
    PING 172.16.42.1 (172.16.42.1) 56(84) bytes of data.
    64 bytes from 172.16.42.1: icmp_seq=1 ttl=64 time=2.16 ms
    64 bytes from 172.16.42.1: icmp_seq=2 ttl=64 time=0.628 ms
    ^C
    --- 172.16.42.1 ping statistics ---
    2 packets transmitted, 2 received, 0% packet loss, time 1001ms
    rtt min/avg/max/mdev = 0.628/1.396/2.165/0.769 ms
The script above has some useful comments that help to understand what it’s doing, but here’s a high-level view of the networking part.
Every container run with Docker is attached to the docker0 bridge. This is a regular Linux bridge you can create on every Linux system, without the need for Open vSwitch.
The docker0 bridge is attached to another bridge: br0. This time it’s an Open vSwitch bridge. This means that all traffic between containers passes through br0 too. You can think of it as two switches connected to each other.
Additionally, we need to connect the networks in which the containers are running on both hosts. A GRE tunnel is used for this purpose. This tunnel is attached to the br0 Open vSwitch bridge and, as a result, to docker0 too.
While creating this environment I found a problem.
Docker assumes that it’s managing the network where the containers are run. It does not expect any other hosts to be running on the network besides the ones it starts. This works well in a typical environment (and definitely makes the code easier). But if we’re going to spread across multiple hosts, this can cause some headaches.
The way Docker assigns IP addresses to the containers is very simple: it tries to assign the first unused address. It sounds valid, right? But it depends on how you define “unused”. When Docker starts a container, the assigned IP is added to a list of used IPs maintained by the Docker daemon. An unused IP in Docker’s case means that the IP wasn’t found in that list.
This can be problematic, though. If you run something manually on that network and assign an IP to it, Docker will not be able to detect it, and it may blindly assign the same IP again, causing a conflict.
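To make the problem concrete, here is a small sketch of a list-based allocator. It is not Docker’s actual code: first_unused is a made-up function, and starting at .2 (with .1 left for the bridge) is a simplifying assumption.

```shell
#!/bin/sh
# first_unused PREFIX [USED_IP...]
# Print the first PREFIX.N (N = 2..254) that is not in the list of used
# addresses. The list is the ONLY source of truth: nothing on the wire
# is ever checked.
first_unused() {
    prefix="$1"
    shift
    n=2
    while [ "$n" -le 254 ]; do
        candidate="$prefix.$n"
        taken=no
        for used in "$@"; do
            if [ "$used" = "$candidate" ]; then
                taken=yes
            fi
        done
        if [ "$taken" = no ]; then
            echo "$candidate"
            return 0
        fi
        n=$((n + 1))
    done
    return 1
}

# Example:
#   first_unused 172.16.42 172.16.42.2    # prints 172.16.42.3
```

Two daemons on two hosts each keep their own list, so with empty lists both calls to first_unused 172.16.42 return 172.16.42.2, which is exactly the conflict described above.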
Over the weekend I was thinking about some solutions, and I ended up with two:
Obvious one: change the Docker code to find out if the address is really free.
Manually assign IPs to the containers when running them.
Both have pros and cons. There may be other solutions too. Feel free to drop a comment if you find one.
The first idea involves patching Docker. We need to make it aware of the hosts running on the network. From the beginning I was focused on using ARP (the Address Resolution Protocol).
I was trying to use the host ARP cache table for the interface bound to Docker (by default it’s docker0), but I found that:
Containers do not advertise themselves on startup, and
Even if we advertise manually (for example with a gratuitous ARP), the ARP table is not reliable enough, since entries will be removed after some time if there is no communication between the two hosts.
Note: Fedora drops these broadcast ARP messages by default. You can change this by running echo 1 > /proc/sys/net/ipv4/conf/<device>/arp_accept. See the kernel networking sysctl documentation (search for arp_accept).
But the good news is that we can still find out whether the selected IP is in use with the arping utility, and this is what I used.
I prepared a patch for Docker 0.7.6 which adds an additional check that the IP we’re trying to use is actually free.
In my testing I found that using arping is pretty reliable: the hosts were discovered properly and it didn’t take too long to find a free IP.
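The same kind of check can be tried from the shell. This is a sketch of the idea, not the patch itself. It assumes the iputils version of arping, where -D (duplicate address detection mode) exits with 0 when no reply is received; ip_is_free and next_free_ip are made-up names, and arping typically needs root privileges.

```shell
#!/bin/sh
# The arping binary to use; can be overridden through the environment.
ARPING="${ARPING:-arping}"

# ip_is_free IFACE IP
# Succeeds when nobody on the network answers for IP.
# Assumption: iputils arping, where -D exits 0 if no reply was received.
ip_is_free() {
    "$ARPING" -D -q -c 2 -I "$1" "$2"
}

# next_free_ip IFACE PREFIX
# Print the first PREFIX.N (N = 2..254) for which ip_is_free succeeds.
next_free_ip() {
    n=2
    while [ "$n" -le 254 ]; do
        if ip_is_free "$1" "$2.$n"; then
            echo "$2.$n"
            return 0
        fi
        n=$((n + 1))
    done
    return 1
}

# Usage (as root):
#   next_free_ip docker0 172.16.42
```

Probing one address at a time is slow when many addresses are taken, because every probe that gets no answer has to wait for its timeout.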
I built an RPM with this patch for Fedora 20; you can download it if you want to give it a try.
After installing the patched Docker you should be able to run containers just like you’re used to:
docker run -i -t centos:latest /bin/bash
Sometimes patching Docker is not an option.
This is where assigning IP addresses manually makes sense. Since Docker does not expose the ability to assign a selected IP directly through the docker run command, we need to do this in two steps:
Disable the automatic network configuration in Docker by specifying -n=false,
Configure networking through the LXC configuration using -lxc-conf.
This is how it could be done:
    docker run \
      -n=false \
      -lxc-conf="lxc.network.type = veth" \
      -lxc-conf="lxc.network.ipv4 = 172.16.42.20/24" \
      -lxc-conf="lxc.network.ipv4.gateway = 172.16.42.1" \
      -lxc-conf="lxc.network.link = docker0" \
      -lxc-conf="lxc.network.name = eth0" \
      -lxc-conf="lxc.network.flags = up" \
      -i -t centos:latest /bin/bash
This will run a CentOS container with networking set up as follows:
Create a virtual ethernet interface
Attach this interface to the docker0 bridge
Expose it in the container as eth0
Assign the 172.16.42.20 IP to the interface
Set up the default gateway as 172.16.42.1
If you want to run multiple containers on one host, the only thing you’ll change is the IP address; everything else can be left as-is.
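Because only the IP address changes between containers, the long command can be wrapped in a small function. This is a sketch: run_with_ip and the DOCKER, GATEWAY and BRIDGE variables are made-up names, the /24 prefix is taken from the example above, and the -n and -lxc-conf flags are the ones from the old Docker version this post is based on.

```shell
#!/bin/sh
# Overridable defaults; DOCKER=echo prints the command instead of running it.
DOCKER="${DOCKER:-docker}"
GATEWAY="${GATEWAY:-172.16.42.1}"
BRIDGE="${BRIDGE:-docker0}"

# run_with_ip IP IMAGE [COMMAND...]
# Hypothetical wrapper: start a container with a manually chosen IP,
# using the same LXC network settings as the command above.
run_with_ip() {
    ip="$1"
    shift
    "$DOCKER" run \
        -n=false \
        -lxc-conf="lxc.network.type = veth" \
        -lxc-conf="lxc.network.ipv4 = $ip/24" \
        -lxc-conf="lxc.network.ipv4.gateway = $GATEWAY" \
        -lxc-conf="lxc.network.link = $BRIDGE" \
        -lxc-conf="lxc.network.name = eth0" \
        -lxc-conf="lxc.network.flags = up" \
        -i -t "$@"
}

# Usage:
#   run_with_ip 172.16.42.21 centos:latest /bin/bash
```

You still have to keep track of which addresses are taken on both hosts yourself; the wrapper only saves typing.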
If you followed the tutorial (no matter which option you chose), you should be able to run containers on both hosts. Containers should be attached to the same network and be able to ping each other. Additionally, no IP address conflicts should happen.
Win!
If you encounter problems, check the configuration.
Make sure the brctl show command outputs similar content:

    bridge name     bridge id               STP enabled     interfaces
    docker0         8000.7a7c5f332842       no              br0
Make sure the ovs-vsctl show command outputs similar content:

    73f7bcaa-7141-4b20-8fa8-3a0c1ec34f39
        Bridge "br0"
            Port "br0"
                Interface "br0"
                    type: internal
            Port "gre0"
                Interface "gre0"
                    type: gre
                    options: {remote_ip="192.168.122.43"}
        ovs_version: "2.0.0"
Make sure you can ping host1 from host2 and vice-versa.
Make sure you can ping the docker0 interface running on host1 from host2 and vice-versa.
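The first of these checks can be scripted. The sketch below reads brctl show output on standard input and tells you whether a given interface is attached to a given bridge. bridge_has_port is a made-up name, and the parsing assumes the column layout shown in the sample output above.

```shell
#!/bin/sh
# bridge_has_port BRIDGE PORT
# Reads `brctl show` output on stdin; succeeds if PORT is listed under BRIDGE.
# Assumption: one header line, then lines of the form
#   <bridge> <bridge id> <STP enabled> [<interface>]
# optionally followed by lines holding one extra interface each.
bridge_has_port() {
    awk -v br="$1" -v port="$2" '
        NR == 1 { next }                      # skip the header line
        NF >= 3 { current = $1; iface = $4 }  # line that starts a bridge
        NF == 1 { iface = $1 }                # extra interface of that bridge
        current == br && iface == port { found = 1 }
        END { exit found ? 0 : 1 }
    '
}

# Usage:
#   brctl show | bridge_has_port docker0 br0 && echo "br0 is attached to docker0"
```

Feeding the output through a pipe also makes it easy to try the function on saved output from another host.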
It’s possible to run Docker containers on different hosts that share the same network.
It’s even pretty simple. But, like always, it could be better: Docker should make it possible without any workarounds.
One idea would be to implement the ARP requests directly in Go and drop the use of arping.
The other idea is to expose the network settings for the containers to the docker run call. I’m thinking here about the -i (IP with network prefix) and -g (gateway) options forwarded to dockerinit when launching a container.
Whoah, you’re still reading this? Not bad.
Thanks!
Reposted from: http://pmcia.baihongyu.com/