博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
connecting docker containers on multiple hosts with open vswitch GRE
阅读量:6240 次
发布时间:2019-06-22

本文共 10213 字,大约阅读时间需要 34 分钟。

转一篇
通过GRE联通多台主机上运行的containers. 
可以用于解决没有VLAN或者网络不归自己管理的环境, 用gre穿透.

GRE概念

OpenVswitch相关

通过GRE联通多台主机上运行的containers.

Connecting Docker containers on multiple hosts

Warning
This guide is based on an old version of Docker. The instructions you find below may be already removed or changed in the Docker codebase.

In my  I talked about running the  on a local KVM with libvirt. This was not a standalone task, but rather the preparation for this blog post: running  containers on multiple hosts attached to the same network.

I was asked in the comments on my  if it would be possible to run a cluster on multiple hosts. I found a  written by Franck Besnard. I’ve decided to set up a similar environment on my own to see how/if it works.

I made a few changes to Franck’s set up:

  1. I’m not using the  script to minimize the dependencies.

  2. I wanted to make the launching of the containers as simple as possible, so I dropped the use of theovswork.sh script Franck crafted and I only use the docker run command.

  3. I’m not creating a virtual ethernet device for each container?—?instead I’m attaching all containers to a bridge.

  4. I’m not using VLAN’s (yet)

Set up

I used two VM’s (host1 and host2) each with Fedora 20 as the operating system. You can use   to create them. Later you can use the virsh start host1 command to run them.

Docker

On both hosts I’ve installed Docker:

yum -y install docker-io

The Docker configuration requires some changes.

By default Docker chooses a (more or less) random network to run the containers. After this it creates a bridge and assigns an address to it. This is not really what we want because we need to have static address assignment, so we need to prepare our own bridge and disable the one managed by Docker.

Copy the /usr/lib/systemd/system/docker.service file to /etc/systemd/system/docker.service and add following content to disable the default docker0 bridge creation on Docker startup.

.include /usr/lib/systemd/system/docker.service[Service]ExecStart=ExecStart=/usr/bin/docker -d -b=none

You can start Docker with systemctl start docker.

Note
Every time you modify a systemd service file do not forget to run systemctl daemon-reload to apply your changes.

Networking

This is the interesting part :)

Open vSwitch

To make networking easy I used the  software. I’m very new to it, but its flexibility and ease of use is just impressive. I haven’t done any performance testing, though. .

You can install Open vSwitch on Fedora by running this command:

yum -y install openvswitch

Network configuration

The script below prepares the networking for you. You can execute it on both hosts by adjusting theREMOTE_IP and BRIDGE_ADDRESS variables. The BRIDGE_NAME can be the same on both hosts.

# The 'other' hostREMOTE_IP=192.168.122.189# Name of the bridgeBRIDGE_NAME=docker0# Bridge addressBRIDGE_ADDRESS=172.16.42.2/24# Deactivate the docker0 bridgeip link set $BRIDGE_NAME down# Remove the docker0 bridgebrctl delbr $BRIDGE_NAME# Delete the Open vSwitch bridgeovs-vsctl del-br br0# Add the docker0 bridgebrctl addbr $BRIDGE_NAME# Set up the IP for the docker0 bridgeip a add $BRIDGE_ADDRESS dev $BRIDGE_NAME# Activate the bridgeip link set $BRIDGE_NAME up# Add the br0 Open vSwitch bridgeovs-vsctl add-br br0# Create the tunnel to the other host and attach it to the# br0 bridgeovs-vsctl add-port br0 gre0 -- set interface gre0 type=gre options:remote_ip=$REMOTE_IP# Add the br0 bridge to docker0 bridgebrctl addif $BRIDGE_NAME br0# Some useful commands to confirm the settings:# ip a s# ip r s# ovs-vsctl show# brctl show

After executing these commands on both hosts you should be able to ping the docker0 bridge addresses from both hosts.

Here is an example from host2 (ip 192.168.122.189):

$ ping 172.16.42.1PING 172.16.42.1 (172.16.42.1) 56(84) bytes of data.64 bytes from 172.16.42.1: icmp_seq=1 ttl=64 time=2.16 ms64 bytes from 172.16.42.1: icmp_seq=2 ttl=64 time=0.628 ms^C--- 172.16.42.1 ping statistics ---2 packets transmitted, 2 received, 0% packet loss, time 1001msrtt min/avg/max/mdev = 0.628/1.396/2.165/0.769 ms

Networking explained

The above script has some useful comments that help to understand what it’s doing, but here’s a high level view on the networking part.

  1. Every container run with Docker is attached to docker0 bridge. This is a  you can create on every Linux system, without the need for Open vSwitch.

  2. The docker0 bridge is attached to another bridge: br0. This time it’s an Open vSwitch bridge. This means that all traffic between containers is routed through br0 too. You can think about two switches connected to each other.

  3. Additionally we need to connect together the networks from both hosts in which the containers are running. A  is used for this purpose. This tunnel is attached to the br0 Open vSwitch bridge and as a result to docker0 too.

The issue: IP assignment

While creating this environment I found a problem.

Docker assumes that it’s managing the network where the containers are run. It does not expect any other hosts to be run on the network besides the ones it starts. This works well in a typical environment (and definitely makes the code easier). But if we’re going to spread across multiple hosts?—?this can cause some headaches.

Docker address assignement method

The way Docker assignes IP addresses to the containers is very simple: it tries to assign the first unusedaddress. It sounds valid, right? But it depends how do you define not used. When Docker starts a container?—?the assigned IP is added to a list of used IPs maintained by the Docker daemon. Not used IP in Docker’s case means that the IP wasn’t found in that list.

This can be problematic, though. If you run something manually on that network and you assign an IP to it?—?Docker will not be able to detect it and instead it can happen that Docker assigns this IP blindly again causing a conflict.

Solution

Over the weekend I was thinking about some solutions, and I ended up with two:

  1. Obvious one: change the Docker code to find out if the address is really free.

  2. Manually assign IP’s to the containers when running them.

Both have pros and cons. There may be other solutions too. Feel free to drop a comment if you find one.

Option 1: Modifying Docker

The first idea involves patching Docker. We need to make it aware of the hosts running on the network. From the beginning I was focused on using the .

I was trying to use the host ARP cache table for the interface bound to Docker (by default it’s docker0), but I found that:

  1. Containers do not advertise themselves on startup, and

  2. Even if we advertise manually (using )?—?the ARP table is not reliable enough since entries will be removed after some time if there is no communication between these two hosts.

Note
Fedora does drop the broadcast ARP messages by default. You can change this by setting:echo 1 > /proc/sys/net/ipv4/conf/<device>/arp_accept.  (search for arp_accept).

But the good news is that we still can find if the selected IP is used by using the arping utility and this is what I used.

I prepared a  for Docker 0.7.6 which adds an additional check if the IP we’re trying to use is actually free.

In my testing I found that using arping is pretty reliable?—?the hosts were discovered properly and it didn’t take too long to find a free IP.

I built an RPM with this patch for Fedora 20, you can , if you want to give it a try.

After installing the patched Docker you should be able to run containers just like you’re used to:

docker run -i -t centos:latest /bin/bash

Option 2: Manual address assignment

Sometimes patching Docker is not an option.

This is where assigning IP addresses manually makes sense. Since Docker does not expose the ability to assign a selected IP directly to the docker run command?—?we need to do this in two steps:

  1. Disable the automatic network configuration in Docker by specifying -n=false,

  2. Configure networking using the LXC configuration using -lxc-conf

Example

This is how it could be done:

docker run \-n=false \-lxc-conf="lxc.network.type = veth" \-lxc-conf="lxc.network.ipv4 = 172.16.42.20/24" \-lxc-conf="lxc.network.ipv4.gateway = 172.16.42.1" \-lxc-conf="lxc.network.link = docker0" \-lxc-conf="lxc.network.name = eth0" \-lxc-conf="lxc.network.flags = up" \-i -t centos:latest /bin/bash

This will run a CentOS container with networking set up as follows:

  • Create a virtual ethernet interface

  • Attach this interface to the docker0 bridge

  • Expose it in the container as eth0

  • Assign the 172.16.42.20 IP to the interface

  • Set up the default gateway as 172.16.42.1

If you want to run multiple containers on one host, the only thing you’ll change is the IP address?—?everything else can be left as-is.

Expected result

If you followed the tutorial (no matter which option you choose)?—?you should be able to run containers on both hosts. Containers should be attached to the same network and be able to ping each other. Additionaly no IP address conflicts should happen.

Win!

Troubleshooting

If you encounter some problems?—?you need to check the configuration.

  • Make sure the brctl show command outputs similar content:

bridge name bridge id   STP enabled interfacesdocker0   8000.7a7c5f332842 no    br0
  • Make sure the ovs-vsctl show command outputs similar content:

73f7bcaa-7141-4b20-8fa8-3a0c1ec34f39    Bridge "br0"        Port "br0"            Interface "br0"                type: internal        Port "gre0"            Interface "gre0"                type: gre                options: {remote_ip="192.168.122.43"}    ovs_version: "2.0.0"
  • Make sure you can ping host1 from host2 and vice-versa.

  • Make sure you can ping the docker0 interface running on host1 from host2 and vice-versa.

Conclusion

It’s possible to run Docker containers on different hosts that share the same network.

It’s even pretty simple. But like always?—?it could be better: Docker should make it possible without any workarounds.

One idea would be to implement the ARP requests directly in Go and drop the use of arping.

The other idea is to expose the network settings for the containers to the docker run call. I’m thinking here about the -i (IP with network prefix) and -g (gateway) options forwarded to dockerinit when launching a container.

Whoah, you’re still reading this? Not bad.

Thanks!


[参考]
1. 
2. 

转载地址:http://pmcia.baihongyu.com/

你可能感兴趣的文章
Oracle 11G 数据库迁移【expdp/impdp】
查看>>
17.EXTJs 中icon 与iconCls的区别及用法!
查看>>
3.mybatis实战教程(mybatis in action)之三:实现数据的增删改查
查看>>
Caused by: Unable to load bean: type: class:com.opensymphony.xwork2.ObjectFactory - bean - jar
查看>>
让你拥有超能力:程序员应该掌握的统计学公式
查看>>
互联网组织的未来:剖析 GitHub 员工的任性之源
查看>>
Java 开源博客 Solo 1.4.0 发布 - 简化
查看>>
Oracle巡检
查看>>
【转载】胜者树
查看>>
查看mysql数据库存放的路径|Linux下查看MySQL的安装路径
查看>>
selenium+testNG+Ant
查看>>
1024程序员节,你屯书了吗?(内含福利)
查看>>
移动端JS 触摸事件基础
查看>>
Flex拖动原来如此简单
查看>>
温故而知新:什么是wcf
查看>>
centos语言设置
查看>>
php安装
查看>>
Fragment在getUserVisibleHint时进行加载数据的问题记录
查看>>
使用线程池模拟处理耗时任务,通过websocket提高用户体验
查看>>
Java 内部类种类及使用解析
查看>>