Detailed explanation of how Docker uses Linux iptables and interfaces to manage the container network


I've been using docker for some time now and, like most people, I'm impressed by its power and ease of use. Simplicity and convenience are at the core of docker: its powerful functionality is abstracted behind very simple commands. While using and learning docker, I wanted to understand what docker does behind the scenes, especially on the networking side (the part I'm most interested in).

I've found a lot of documentation about creating and manipulating container networks, but much less about how docker actually makes those networks work. Docker makes extensive use of linux iptables and bridge interfaces, and this article is a summary of how they are used to create container networks, with most of the information coming from discussions on github, presentations, and my own testing. At the end of the article, I will give links to materials that I found very useful.

I wrote this article against docker 1.12.3. It is not meant to be a comprehensive description of docker networking, nor an introduction to it; I just hope it broadens your view, and I very much welcome feedback and criticism about any mistakes or omissions.

Overview of Docker Network

Docker's networking is built on the Container Network Model (CNM), which allows anyone to write their own network driver. This means different network types can be used by containers running on the docker engine, and a container can be connected to several networks at the same time. In addition to the various third-party network drivers available, docker comes with four built-in network drivers:

Bridge: this is the default network driver for containers. Connectivity is provided through a bridge interface on the docker host. Containers on the same bridge network get their own subnet and can communicate with each other (by default).

Host: this driver gives the container access to the docker host's own network namespace (the container sees and uses the same interfaces as the docker host).

Macvlan: this driver gives containers direct access to an interface or sub-interface (vlan) on the host. It also allows trunk links.

Overlay: this driver allows you to build a network on multiple hosts running docker (usually a docker cluster). Containers also have their own subnets and network addresses and can communicate directly with each other, even if they are running on different physical hosts.
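To make these drivers more concrete, here is a hedged sketch of how each one is typically selected on the command line. The network names (my-macvlan, my-overlay), the nginx image, and the parent interface eth0 are only illustrative and will differ on your system; macvlan in particular needs a real parent interface, and overlay requires swarm mode.

$ docker run -d --network bridge nginx            # default bridge driver
$ docker run -d --network host nginx              # share the host's network namespace
$ docker network create -d macvlan --subnet 192.168.1.0/24 -o parent=eth0 my-macvlan
$ docker network create -d overlay my-overlay     # requires a swarm cluster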

Bridge and Overlay are probably the most commonly used network drivers, which I will focus on in this and the next article.

Docker Bridge network

The bridge network is the default network for containers running on a docker host. Docker creates a default network named "bridge" when it is first installed. We can see it by listing all docker networks with docker network ls:

$ docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
3e8110efa04a        bridge              bridge              local
bb3cd79b9236        docker_gwbridge     bridge              local
22849c4d1c3a        host                host                local
3kuba8yq3c27        ingress             overlay             swarm
ecbd1c6c193a        none                null                local

To check its properties, run docker network inspect bridge

$ docker network inspect bridge
[
    {
        "Name": "bridge",
        "Id": "3e8110efa04a1eb0923d863af719abf5eac871dbac4ae74f133894b8df4b9f5f",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.18.0.0/16",
                    "Gateway": "172.18.0.1"
                }
            ]
        },
        "Internal": false,
        "Containers": {},
        "Options": {
            "com.docker.network.bridge.default_bridge": "true",
            "com.docker.network.bridge.enable_icc": "true",
            "com.docker.network.bridge.enable_ip_masquerade": "true",
            "com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
            "com.docker.network.bridge.name": "docker0",
            "com.docker.network.driver.mtu": "1500"
        },
        "Labels": {}
    }
]
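If you only need a single field from that JSON, docker network inspect also accepts a Go template via -f. As a small example (the template below assumes the IPAM layout shown above), this should print just the subnet of the default bridge network:

$ docker network inspect -f '{{range .IPAM.Config}}{{.Subnet}}{{end}}' bridge
172.18.0.0/16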

You can also use the docker network create command with the --driver bridge option to create your own network, for example:

$ docker network create --driver bridge --subnet 192.168.100.0/24 --ip-range 192.168.100.0/24 my-bridge-network

This creates another bridge network named "my-bridge-network" with the subnet 192.168.100.0/24.
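To actually place a container on this new network, you can pass --network when starting it, or connect an already-running container afterwards. The container names below are only examples:

$ docker run -ti --network my-bridge-network --name web ubuntu:14.04 /bin/bash
$ docker network connect my-bridge-network some-existing-container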

Linux bridge interface

Each bridge network created by docker is represented by a bridge interface on the docker host. The default bridge network "bridge" usually has an interface docker0 associated with it, and each subsequent bridge network created with the docker network create command will have a new interface associated with it.

$ ifconfig docker0
docker0   Link encap:Ethernet  HWaddr 02:42:44:88:bd:75
          inet addr:172.18.0.1  Bcast:0.0.0.0  Mask:255.255.0.0
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

To find the linux interface associated with a docker network you created, use ifconfig to list all interfaces and then look for the one carrying the subnet you specified. For example, to find the bridge interface for the my-bridge-network we created earlier:

$ ifconfig | grep 192.168.100. -B 1
br-e6bc7d6b75f3 Link encap:Ethernet  HWaddr 02:42:bc:f1:91:09
          inet addr:192.168.100.1  Bcast:0.0.0.0  Mask:255.255.255.0

Linux bridging interfaces are similar to switches in that they connect different interfaces to the same subnet and forward traffic based on MAC addresses. As we will see below, each container connected to the bridge network will create its own virtual interface on the docker host, and the docker engine connects all containers in the same network to the same bridge interface, which will allow them to communicate with each other. You can use brctl to get more details about the status of the bridge.

$ brctl show docker0
bridge name     bridge id               STP enabled     interfaces
docker0         8000.02424488bd75       no

Once we have containers running and connected to this network, each container's interface will be listed under the interfaces column. And running a traffic capture on the bridge interface will let us see the communication between containers on the same subnet.
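For example, a minimal capture on the default bridge might look like the command below (assuming tcpdump is installed on the docker host); the traffic you actually see depends on what your containers are doing:

$ sudo tcpdump -ni docker0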

Linux Virtual Network Interface (veth)

The Container Network Model (CNM) gives each container its own network namespace. Running ifconfig from inside a container shows the network interfaces as the container sees them:

$ docker run -ti ubuntu:14.04 /bin/bash
root@6622112b507c:/# ifconfig
eth0      Link encap:Ethernet  HWaddr 02:42:ac:12:00:02
          inet addr:172.18.0.2  Bcast:0.0.0.0  Mask:255.255.0.0
          inet6 addr: fe80::42:acff:fe12:2/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:9 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:766 (766.0 B)  TX bytes:508 (508.0 B)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

However, the eth0 seen above exists only inside that container. On the docker host, docker creates a corresponding peer virtual interface (the other half of a veth pair) that acts as the container's link to the outside world. These virtual interfaces are connected to the bridge interfaces discussed above, which is what allows different containers on the same subnet to talk to each other.
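You can list the host-side halves of these veth pairs with iproute2. This is just a quick sanity check (not something docker requires); the interface names will differ on your system:

$ ip -o link show type veth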

We can observe this by starting two containers connected to the default bridge network and then looking at the interface configuration on the docker host.

Before any containers are started, the docker0 bridge interface has no interfaces attached to it.

Then I started 2 containers from the ubuntu:14.04 image

$ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
a754719db594        ubuntu:14.04        "/bin/bash"         5 seconds ago       Up 4 seconds                            zen_kalam
976041ec420f        ubuntu:14.04        "/bin/bash"         7 seconds ago       Up 5 seconds                            stupefied_easley

You can immediately see that there are now two interfaces connected to the docker0 bridge interface (one for each container)

$ sudo brctl show docker0
bridge name     bridge id               STP enabled     interfaces
docker0         8000.02424488bd75       no              veth3177159
                                                        vethd8e05dd

Ping google.com from one of the containers, then capture traffic on that container's virtual interface from the docker host; the container's traffic shows up there:

$ docker exec a754719db594 ping google.com
PING google.com (216.58.217.110) 56(84) bytes of data.
64 bytes from iad23s42-in-f110.1e100.net (216.58.217.110): icmp_seq=1 ttl=48 time=0.849 ms
64 bytes from iad23s42-in-f110.1e100.net (216.58.217.110): icmp_seq=2 ttl=48 time=0.965 ms

ubuntu@swarm02:~$ sudo tcpdump -i veth3177159 icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on veth3177159, link-type EN10MB (Ethernet), capture size 262144 bytes
20:47:12.170815 IP 172.18.0.3 > iad23s42-in-f14.1e100.net: ICMP echo request, id 14, seq 55, length 64
20:47:12.171654 IP iad23s42-in-f14.1e100.net > 172.18.0.3: ICMP echo reply, id 14, seq 55, length 64
20:47:13.171545 IP 172.18.0.3 > iad23s42-in-f14.1e100.net: ICMP echo request, id 14, seq 56, length 64
20:47:13.171694 IP iad23s42-in-f14.1e100.net > 172.18.0.3: ICMP echo reply, id 14, seq 56, length 64

Similarly, we can ping from one container to another.

First, we need to get the IP address of the container, which can be done by running ifconfig in the container or by using the docker inspect command to check the container:

$ docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' a754719db594
172.18.0.3

Then we ping from one container to another.

$ docker exec 976041ec420f ping 172.18.0.3
PING 172.18.0.3 (172.18.0.3) 56(84) bytes of data.
64 bytes from 172.18.0.3: icmp_seq=1 ttl=64 time=0.070 ms
64 bytes from 172.18.0.3: icmp_seq=2 ttl=64 time=0.053 ms

To see this traffic from the docker host, we can capture it on either container's virtual interface, or we can capture it on the bridge interface (docker0 in this example), which shows all inter-container communication on that subnet:

$ sudo tcpdump -ni docker0 host 172.18.0.2 and host 172.18.0.3
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on docker0, link-type EN10MB (Ethernet), capture size 262144 bytes
20:55:37.990831 IP 172.18.0.2 > 172.18.0.3: ICMP echo request, id 14, seq 200, length 64
20:55:37.990865 IP 172.18.0.3 > 172.18.0.2: ICMP echo reply, id 14, seq 200, length 64
20:55:38.991828 IP 172.18.0.2 > 172.18.0.3: ICMP echo request, id 14, seq 201, length 64
20:55:38.991857 IP 172.18.0.3 > 172.18.0.2: ICMP echo reply, id 14, seq 201, length 64

Locating the veth interface of a container

There is no direct way to find out which veth interface on the docker host is linked to the interface inside a given container, but several methods are discussed in various docker forums and github issues. In my opinion the simplest is the following (slightly modified from one of those solutions); it depends on ethtool being accessible inside the container.

For example, there are three containers running on my system

$ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
ccbf97c72bf5        ubuntu:14.04        "/bin/bash"         3 seconds ago       Up 3 seconds                            admiring_torvalds
77d9f02d61f2        ubuntu:14.04        "/bin/bash"         4 seconds ago       Up 4 seconds                            goofy_borg
19743c0ddf24        ubuntu:14.04        "/bin/sh"           8 minutes ago       Up 8 minutes                            high_engelbart

First, I run the following command to get the peer_ifindex number

$ docker exec 77d9f02d61f2 sudo ethtool -S eth0
NIC statistics:
     peer_ifindex: 16

Then on the docker host, find the interface name through peer_ifindex

$ sudo ip link | grep 16
16: veth7bd3604@if15: mtu 1500 qdisc noqueue master docker0 state UP mode DEFAULT group default

So, in this case, the interface name is veth7bd3604.
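If you do this often, the two steps can be combined into a small shell snippet. This is a hypothetical helper (it assumes, as above, that sudo and ethtool work inside the container), not something docker ships:

CONTAINER=77d9f02d61f2
# Read the peer_ifindex from inside the container...
IDX=$(docker exec "$CONTAINER" sudo ethtool -S eth0 | awk '/peer_ifindex/ {print $2}')
# ...then look up the matching interface name on the docker host.
ip -o link | awk -F': ' -v idx="$IDX" '$1 == idx {print $2}' | cut -d@ -f1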

Iptables

Docker uses linux iptables to control communication with the interfaces and networks it creates. Linux iptables consists of several tables, but we are mainly concerned with two of them: filter and nat. The filter table holds the security rules that allow or deny traffic to IP addresses, networks, or interfaces, while the nat table contains rules for translating IP addresses or ports. Docker uses nat to allow containers on bridge networks to communicate with destinations outside the docker host (otherwise routes pointing to the container networks would have to be added in the docker host's network).

Iptables:filter

Each table in iptables consists of chains, which correspond to different conditions or stages of packet processing on the docker host. By default, the filter table has three chains: the INPUT chain for packets arriving at and destined for the host itself, the OUTPUT chain for packets originating from the host and bound for external destinations, and the FORWARD chain for packets entering the host but destined for another host. Each chain consists of rules that specify an action to take on a packet (for example, accept or reject it) together with the conditions a packet must match. Rules are processed in order until a match is found; otherwise the chain's default policy applies. Custom chains can also be defined in a table.
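As an illustration of how these chains fit together, the commands below create a custom chain and make FORWARD jump to it, which is the same mechanism docker uses for its own chains shown further down. MY-CHAIN is a made-up name, and this is purely a sketch, not something docker needs you to do:

$ sudo iptables -t filter -N MY-CHAIN              # define a new custom chain
$ sudo iptables -t filter -I FORWARD -j MY-CHAIN   # make FORWARD jump to it first
$ sudo iptables -t filter -A MY-CHAIN -j RETURN    # return to the calling chain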

To view the currently configured rules and the default policy of each chain in the filter table, run iptables -t filter -L (or simply iptables -L; when no table is specified, the filter table is used by default):

$ sudo iptables -t filter -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:domain
ACCEPT     udp  --  anywhere             anywhere             udp dpt:domain
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:bootps
ACCEPT     udp  --  anywhere             anywhere             udp dpt:bootps

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
DOCKER-ISOLATION  all  --  anywhere      anywhere
DOCKER     all  --  anywhere             anywhere
ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED
ACCEPT     all  --  anywhere             anywhere
ACCEPT     all  --  anywhere             anywhere
DOCKER     all  --  anywhere             anywhere
ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED
ACCEPT     all  --  anywhere             anywhere
ACCEPT     all  --  anywhere             anywhere
DOCKER     all  --  anywhere             anywhere
ACCEPT     all  --  anywhere             anywhere             ctstate RELATED,ESTABLISHED
ACCEPT     all  --  anywhere             anywhere
ACCEPT     all  --  anywhere             anywhere
ACCEPT     all  --  anywhere             anywhere
DROP       all  --  anywhere             anywhere

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

Chain DOCKER (3 references)
target     prot opt source               destination

Chain DOCKER-ISOLATION (1 references)
target     prot opt source               destination
DROP       all  --  anywhere             anywhere
DROP       all  --  anywhere             anywhere
DROP       all  --  anywhere             anywhere
RETURN     all  --  anywhere             anywhere

The different chains are shown above, along with the default policy of each chain (custom chains have no default policy). We can also see that docker has added two custom chains, DOCKER and DOCKER-ISOLATION, and has inserted rules in the FORWARD chain whose targets are these two new chains.

The DOCKER-ISOLATION chain

DOCKER-ISOLATION contains rules that restrict access between the different container networks. To see more details, add the -v option when running iptables:

$ sudo iptables -t filter -L -v
...
Chain DOCKER-ISOLATION (1 references)
 pkts bytes target     prot opt in               out              source               destination
    0     0 DROP       all  --  br-e6bc7d6b75f3  docker0          anywhere             anywhere
    0     0 DROP       all  --  docker0          br-e6bc7d6b75f3  anywhere             anywhere
    0     0 DROP       all  --  docker_gwbridge  docker0          anywhere             anywhere
    0     0 DROP       all  --  docker0          docker_gwbridge  anywhere             anywhere
    0     0 DROP       all  --  docker_gwbridge  br-e6bc7d6b75f3  anywhere             anywhere
    0     0 DROP       all  --  br-e6bc7d6b75f3  docker_gwbridge  anywhere             anywhere
36991 3107K RETURN     all  --  any              any              anywhere             anywhere

Above you can see DROP rules that block traffic between any pair of bridge interfaces created by docker, ensuring that the different container networks cannot communicate with each other.
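A quick way to confirm this isolation is to start a container on a different bridge network and try to ping one of the containers on the default bridge. In the hedged example below, 172.18.0.2 is one of the containers from earlier; the ping should time out because the packets are dropped by the DOCKER-ISOLATION rules:

$ docker run -ti --network my-bridge-network ubuntu:14.04 /bin/bash
root@<new-container>:/# ping -c 2 172.18.0.2     # expected to fail between bridges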

Icc=false

One of the options that can be passed to the docker network create command is com.docker.network.bridge.enable_icc, which controls inter-container communication. Setting this option to false prevents containers on the same network from communicating with each other. This is implemented by adding a DROP rule to the FORWARD chain that matches packets coming from the bridge interface associated with the network and destined for that same interface.

For example, let's create a new network with the following command

$ docker network create --driver bridge --subnet 192.168.200.0/24 --ip-range 192.168.200.0/24 -o "com.docker.network.bridge.enable_icc"="false" no-icc-network

$ ifconfig | grep 192.168.200 -B 1
br-8e3f0d353353 Link encap:Ethernet  HWaddr 02:42:c4:6b:f1:40
          inet addr:192.168.200.1  Bcast:0.0.0.0  Mask:255.255.255.0

$ sudo iptables -t filter -S FORWARD
-P FORWARD ACCEPT
-A FORWARD -j DOCKER-ISOLATION
-A FORWARD -o br-8e3f0d353353 -j DOCKER
-A FORWARD -o br-8e3f0d353353 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i br-8e3f0d353353 ! -o br-8e3f0d353353 -j ACCEPT
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A FORWARD -o br-e6bc7d6b75f3 -j DOCKER
-A FORWARD -o br-e6bc7d6b75f3 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i br-e6bc7d6b75f3 ! -o br-e6bc7d6b75f3 -j ACCEPT
-A FORWARD -i br-e6bc7d6b75f3 -o br-e6bc7d6b75f3 -j ACCEPT
-A FORWARD -o docker_gwbridge -j DOCKER
-A FORWARD -o docker_gwbridge -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i docker_gwbridge ! -o docker_gwbridge -j ACCEPT
-A FORWARD -o lxcbr0 -j ACCEPT
-A FORWARD -i lxcbr0 -j ACCEPT
-A FORWARD -i docker_gwbridge -o docker_gwbridge -j DROP
-A FORWARD -i br-8e3f0d353353 -o br-8e3f0d353353 -j DROP

Note the last rule: traffic entering and leaving on br-8e3f0d353353 (the bridge for no-icc-network) is dropped, whereas the corresponding rule for docker0 is an ACCEPT.

Iptables:nat

NAT allows the host to change the IP address or port of a packet. Here it is used to masquerade the source IP address of packets coming from docker networks (for example, hosts in the 172.18.0.0/16 subnet) and destined for hosts outside the docker host, hiding them behind the docker host's own IP address. This feature is controlled by the com.docker.network.bridge.enable_ip_masquerade option of the docker network create command (it defaults to true if not specified).

You can see the effect of this option in the nat table of iptables:

$ sudo iptables -t nat -L
Chain PREROUTING (policy ACCEPT)
target     prot opt source               destination
DOCKER     all  --  anywhere             anywhere             ADDRTYPE match dst-type LOCAL

Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
DOCKER     all  --  anywhere            !127.0.0.0/8          ADDRTYPE match dst-type LOCAL

Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
MASQUERADE  all  --  172.18.0.0/16       anywhere
MASQUERADE  all  --  192.168.100.0/24    anywhere
MASQUERADE  all  --  172.19.0.0/16       anywhere
MASQUERADE  all  --  10.0.3.0/24        !10.0.3.0/24

Chain DOCKER (2 references)
target     prot opt source               destination
RETURN     all  --  anywhere             anywhere
RETURN     all  --  anywhere             anywhere
RETURN     all  --  anywhere             anywhere

In the POSTROUTING chain, you can see that a MASQUERADE rule has been added for each docker-created network, applied to traffic going to any host outside that network.
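For reference, the rule docker adds for the default bridge network is roughly equivalent to the following (illustrative only; docker manages this rule itself, so there is no need to add it by hand):

$ sudo iptables -t nat -A POSTROUTING -s 172.18.0.0/16 ! -o docker0 -j MASQUERADE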

Summary

A bridge network has a corresponding linux bridge interface on the docker host that acts as a layer 2 switch, connecting the different containers on the same subnet.

Each network interface in the container has a corresponding virtual interface created on the Docker host while the container is running.

Capturing traffic on the bridge interface from the docker host is equivalent to configuring a SPAN port on a switch: all inter-container traffic on that network can be seen there.

Capturing traffic on a container's virtual interface (veth*) from the docker host shows all the traffic the container sends and receives on that subnet.

Linux iptables rules in the filter table are used to prevent different networks (and sometimes specific hosts within a network) from communicating with each other. These rules are usually added to the DOCKER-ISOLATION chain.

Containers communicate with the outside world through the bridge interface, with their IP hidden behind the docker host's IP address. This is achieved by adding masquerade rules to the nat table in iptables.

Concluding remarks

That covers how Docker uses Linux iptables and interfaces to manage the container network. I hope it has been helpful. Interested readers may also want to look at related topics such as Docker's security mechanisms, kernel security, and container network security. Thank you for your support!
