OpenBSD Routing Tables and Routing Domains

Posted on Mar 16, 2022 | By Andrei Buzoianu | 15 minutes read

Traditionally speaking, the OpenBSD kernel routing system has a single table for routes. This means it only allows non-conflicting IP address assignments and all network interfaces on the system are connected to a single routing table.

Therefore, all interfaces on an OpenBSD server belong to rdomain 0 by default. Assuming that IP forwarding is enabled and pf(4) allows it, traffic will flow freely between all interfaces. Support for virtual routing and firewalling first appeared in OpenBSD 4.6 with the addition of routing domains. The functionality is also supported by userland tools such as dhclient(8) and dhcpd(8) and by routing protocol daemons like ospfd(8) and bgpd(8).

About rdomain and rtable

Every rdomain has a completely separate address space in the kernel. Consequently, an IP address can be assigned in more than one rdomain, but it cannot be assigned more than once per rdomain. Interfaces in different routing domains are separated and cannot pass traffic directly to each other, so network traffic stays within its own routing domain. To move traffic from one rdomain to another, pf(4) is used.

When an interface is assigned to a non-existent rdomain, that rdomain is created automatically. At the same time, one rtable with the same ID and a lo(4) interface whose unit number matches the routing domain ID are created and assigned to the new domain.

Regarding limits, the highest ID that can be used for an rdomain is 255, so at most 256 routing domains (IDs 0 through 255) can exist. The carp(4) pseudo-device, which implements and controls the CARP protocol, must be in the same rdomain as the interface it attaches to (carpdev).

rtables contain routes for outbound network packets. One rdomain can contain more than one rtable. Multiple routing tables are commonly used for policy-based routing, and the same limit applies as for rdomains (i.e. the highest ID that can be used for an rtable is 255).
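
As a small illustration of the ID limit, the following sketch checks whether a value is usable as an rdomain or rtable ID before it is handed to ifconfig(8) or route(8). The helper name valid_rtable_id is made up for this example and is not part of OpenBSD.

```shell
# Hypothetical helper (not an OpenBSD tool): succeed only if $1 is a
# decimal integer between 0 and 255, the range allowed for rdomain
# and rtable IDs.
valid_rtable_id() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    [ "$1" -le 255 ]
}

# Try a few candidate IDs.
for id in 0 2 255 256 lo1; do
    if valid_rtable_id "$id"; then
        echo "$id: usable"
    else
        echo "$id: not a valid ID"
    fi
done
```

Only 0, 2 and 255 pass the check; 256 is above the limit and lo1 is an interface name rather than an ID.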

In his BSDCan 2015 paper, Peter Hessler gives a more concise description by comparison:

rtable

  • alternate routing table, usable with the same interfaces
  • ip addresses cannot overlap
  • multiple rtables can belong to a single rdomain
  • can be used for Policy Based Routing

rdomain

  • completely independent routing table instance
  • assign 10.0.0.1/16 a dozen times
  • interfaces can be assigned to only one rdomain at a time
  • how we ’know’ which one incoming packets should use
  • rdomains always contain at least one rtable

Use Cases

Using rdomains is similar to using VRFs in Cisco IOS. The VRF-like mechanism provided by the rdomain/rtable functionality can be useful to solve routing problems for atypical network setups or to isolate traffic for multiple tenants without using more advanced methods such as MPLS.

For example, given two servers that have exactly the same IP address and gateway, rdomains can be used to accommodate traffic from both without altering the IP or routing configuration. This covers the more atypical case.

When applying network segmentation with virtual local area networks (VLANs), traffic from multiple customers can be aggregated to a Provider Edge (PE). By having a different routing domain per customer, the prefixes traversing the OpenBSD router will never mix with each other. This stands as an example for the second case.

Setup

The following depicts a network topology designed to house two customers. Resources on both vlan6 (172.16.6.0/30, with 192.168.1.0/24 routed behind it) and vlan60 (172.16.60.0/24) belong to rdomain 1 and include devices that should be accessible only to Customer 1. Likewise, vlan7 (172.16.7.0/30) and vlan70 (172.16.70.0/24) belong to rdomain 2 and Customer 2.

Figure: OpenBSD rdomains case study

This non-persistent example configuration uses em0 as the INTERNET connected interface.

$ doas ifconfig em0 up
$ doas ifconfig em0 10.10.10.100/23
$ doas route -n add default 10.10.11.254
$ ifconfig em0
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
	lladdr 00:50:56:8e:6a:db
	description: INTERNET
	index 1 priority 0 llprio 3
	groups: egress
	media: Ethernet autoselect (1000baseT full-duplex,master)
	status: active
	inet 10.10.10.100 netmask 0xfffffe00 broadcast 10.10.11.255
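
The relationship between the /23 prefix, the hexadecimal netmask printed by ifconfig(8), and the default gateway can be checked with a little shell arithmetic. This is only a sketch; the function names ip_to_int, prefix_to_mask and same_subnet are invented for this example.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# The body runs in a subshell so the IFS change does not leak out.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Print the netmask for a prefix length in the hex form ifconfig(8) uses.
prefix_to_mask() {
    printf '0x%08x\n' $(( (0xffffffff << (32 - $1)) & 0xffffffff ))
}

# Succeed if two addresses fall in the same network for a prefix length.
same_subnet() {  # addr1 addr2 prefixlen
    mask=$(( (0xffffffff << (32 - $3)) & 0xffffffff ))
    [ $(( $(ip_to_int "$1") & $mask )) -eq $(( $(ip_to_int "$2") & $mask )) ]
}

prefix_to_mask 23    # prints 0xfffffe00
if same_subnet 10.10.10.100 10.10.11.254 23; then
    echo "gateway 10.10.11.254 is on-link for em0"
fi
```

The same arithmetic gives 0xfffffffc for a /30 and 0xffffff00 for a /24, which is a quick way to cross-check the prefix typed on the command line against the netmask that ifconfig(8) reports.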

Now let’s add the customer-related network configuration. The routing domain of an interface is set using the rdomain parameter to ifconfig(8). By default, route(8) uses the current routing table; use -T to select an alternate routing table to modify or query. To run a command so that the process and its children use the routing table given with -T (and the routing domain it belongs to), use the exec subcommand:

$ doas route [-T rtable] exec [command ...]
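
To make the synopsis concrete, here is a minimal sketch of a wrapper around route exec. The function name in_rtable and the DRY_RUN variable are assumptions made for this example; with DRY_RUN=1 the wrapper only prints the command line it would run, which is convenient on a machine that has no additional routing tables yet.

```shell
# Hypothetical wrapper: run a command inside a routing table via route(8).
# With DRY_RUN=1 it only prints the command line instead of executing it.
in_rtable() {  # rtable command [args...]
    rtable=$1
    shift
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "route -T$rtable exec $*"
    else
        route -T"$rtable" exec "$@"
    fi
}

DRY_RUN=1
in_rtable 1 ping -c 3 172.16.6.2
in_rtable 1 id -R
```

Run for real on the router, the second command should report routing table 1, since id -R displays the routing table of the current process.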

Customer 1:

$ doas ifconfig vlan6 rdomain 1
$ doas ifconfig vlan6 172.16.6.1/30
$ doas ifconfig vlan60 rdomain 1
$ doas ifconfig vlan60 172.16.60.1/24
$ doas ifconfig lo1 rdomain 1
$ doas ifconfig lo1 inet 127.0.0.1/8
$ doas route -T1 -qn add -inet 127 127.0.0.1 -reject
$ doas route -T1 -qn add default 127.0.0.1 -blackhole
$ doas route -T1 -qn add -inet 192.168.1.0/24 172.16.6.2
$ ifconfig lo1
lo1: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> rdomain 1 mtu 32768
	index 7 priority 0 llprio 3
	groups: lo
	inet6 ::1 prefixlen 128
	inet6 fe80::1%lo1 prefixlen 64 scopeid 0x7
	inet 127.0.0.1 netmask 0xff000000
$ ifconfig vlan6
vlan6: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
	lladdr 0c:c4:7a:58:f4:c1
	description: C1
	index 18 priority 0 llprio 3
	encap: vnetid 6 parent ix0 txprio packet rxprio outer
	groups: vlan egress
	media: Ethernet autoselect (10GbaseSR full-duplex,rxpause,txpause)
	status: active
	inet 172.16.6.1 netmask 0xfffffffc broadcast 172.16.6.3
$ ifconfig vlan60
vlan60: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
  lladdr 0c:c4:7a:58:f4:c1
  description: C1-ICONNECT
  index 18 priority 0 llprio 3
  encap: vnetid 60 parent ix0 txprio packet rxprio outer
  groups: vlan egress
  media: Ethernet autoselect (10GbaseSR full-duplex,rxpause,txpause)
  status: active
  inet 172.16.60.1 netmask 0xffffff00 broadcast 172.16.60.255

Customer 2:

$ doas ifconfig vlan7 rdomain 2
$ doas ifconfig vlan7 172.16.7.1/30
$ doas ifconfig vlan70 rdomain 2
$ doas ifconfig vlan70 172.16.70.1/24
$ doas ifconfig lo2 rdomain 2
$ doas ifconfig lo2 inet 127.0.0.1/8
$ doas route -T2 -qn add -inet 127 127.0.0.1 -reject
$ doas route -T2 -qn add default 127.0.0.1 -blackhole
$ ifconfig lo2
lo2: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> rdomain 2 mtu 32768
	index 8 priority 0 llprio 3
	groups: lo
	inet6 ::1 prefixlen 128
	inet6 fe80::1%lo2 prefixlen 64 scopeid 0x8
	inet 127.0.0.1 netmask 0xff000000
$ ifconfig vlan7
vlan7: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> rdomain 2 mtu 1500
  lladdr 0c:c4:7a:58:e4:c2
  description: C2
  index 18 priority 0 llprio 3
  encap: vnetid 6 parent ix1 txprio packet rxprio outer
  groups: vlan egress
  media: Ethernet autoselect (10GbaseSR full-duplex,rxpause,txpause)
	status: active
	inet 172.16.7.1 netmask 0xfffffffc broadcast 172.16.7.3
$ ifconfig vlan70
vlan70: flags=8943<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> rdomain 2 mtu 1500
  lladdr 0c:c4:7a:58:e4:c2
  description: C2-ICONNECT
  index 18 priority 0 llprio 3
  encap: vnetid 70 parent ix1 txprio packet rxprio outer
  groups: vlan egress
  media: Ethernet autoselect (10GbaseSR full-duplex,rxpause,txpause)
  status: active
  inet 172.16.70.1 netmask 0xffffff00 broadcast 172.16.70.255

To show the routing table content, use the route(8) command with the -T flag:

$ doas route -T1 -n show
Routing tables

Internet:
Destination        Gateway            Flags   Refs      Use   Mtu  Prio Iface
127.0.0.1          127.0.0.1          UHl        0        0 32768     1 lo1  
172.16.6/30        172.16.6.1         UCn        1        0     -     4 vlan6  
172.16.6.1         0c:c4:7a:58:f4:c1  UHLl       0    12408     -     1 vlan6  
172.16.6.2         00:50:56:8e:c5:aa  UHLch      3      731     -     3 vlan6  
172.16.6.255       172.16.6.1         UHb        0        0     -     1 vlan6  
192.168.1/24       172.16.6.2         UGS        0        0     -     8 vlan6  

Internet6:
Destination                        Gateway                        Flags   Refs      Use   Mtu  Prio Iface
::1                                ::1                            UHl        0        0 32768     1 lo1  
fe80::1%lo1                        fe80::1%lo1                    UHl        0        0 32768     1 lo1  
ff01::%lo1/32                      fe80::1%lo1                    Um         0        1 32768     4 lo1  
ff02::%lo1/32                      fe80::1%lo1                    Um         0        1 32768     4 lo1  

Configuration Persistence

To make the above configuration persistent, we need to edit interface-specific configuration files in /etc according to hostname.if(5). The hostname.* files contain the required information regarding system interfaces. The configuration information is expressed in a line-by-line packed format which makes the most common cases simpler.

Customer 1:

$ cat /etc/hostname.vlan6
rdomain 1
inet 172.16.6.1 255.255.255.252 NONE
!route -T1 -qn add -inet 127 127.0.0.1 -reject
!route -T1 -qn add default 127.0.0.1 -blackhole
!route -T1 -qn add -inet 192.168.1.0/24 172.16.6.2
$ cat /etc/hostname.vlan60
rdomain 1
inet 172.16.60.1 255.255.255.0 NONE

Customer 2:

$ cat /etc/hostname.vlan7
rdomain 2
inet 172.16.7.1 255.255.255.252 NONE
!route -T2 -qn add -inet 127 127.0.0.1 -reject
!route -T2 -qn add default 127.0.0.1 -blackhole
$ cat /etc/hostname.vlan70
rdomain 2
inet 172.16.70.1 255.255.255.0 NONE
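
Because these files follow a fixed pattern, they can be generated rather than typed by hand. The sketch below prints a hostname.if(5) body for an interface in a routing domain; gen_hostname_if is a hypothetical helper, and its output should be reviewed before being written to /etc.

```shell
# Hypothetical generator: print a hostname.if(5) body for an interface that
# lives in a routing domain. The rdomain line comes first because assigning
# an rdomain removes any address already configured on the interface.
gen_hostname_if() {  # rdomain address netmask [route-arguments...]
    rdomain=$1 addr=$2 mask=$3
    shift 3
    printf 'rdomain %s\n' "$rdomain"
    printf 'inet %s %s NONE\n' "$addr" "$mask"
    # Every remaining argument becomes one route(8) command in that table.
    for r in "$@"; do
        printf '!route -T%s -qn add %s\n' "$rdomain" "$r"
    done
}

gen_hostname_if 2 172.16.7.1 255.255.255.252 \
    '-inet 127 127.0.0.1 -reject' \
    'default 127.0.0.1 -blackhole'
```

Deriving the -T argument from the rdomain number keeps the routes in the same table as the interface, which is easy to get wrong when copying a file from one customer to the next.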

Turn the Server into a Router

By using the sysctl(8) utility, an OpenBSD server can turn into a router.

Excerpt from the sysctl(2) man page:

ip.forwarding (net.inet.ip.forwarding)
    If set to 0, IP forwarding is disabled. The IP stack also requires
    the destination IP address of incoming packets to match the IP address
    of the network interface the packet is bound to. If set to 1,
    IP forwarding is enabled for the host, indicating the host is acting
    as a router. If set to 2, IP forwarding is restricted to traffic that
    has been IPsec encapsulated or decapsulated by the host. Enabling packet
    forwarding (values either 1 or 2) relaxes the requirements on incoming
    packets, so that its destination address must match just any IP address
    bound to the host. The default value is 0.

To turn on IP forwarding, one would use the following command:

$ doas sysctl net.inet.ip.forwarding=1

To make the change permanent, append the setting to /etc/sysctl.conf:

$ echo 'net.inet.ip.forwarding=1' | doas tee -a /etc/sysctl.conf
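
Appending with tee -a adds a duplicate line every time the command is repeated. A minimal sketch of an idempotent alternative follows; ensure_line is a hypothetical helper, and the example works on a scratch file rather than the real /etc/sysctl.conf, which would have to be edited as root.

```shell
# Hypothetical helper: append a line to a file only if an identical line
# is not already present (grep -x matches whole lines, -F literal text).
ensure_line() {  # file line
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

conf=$(mktemp)    # scratch file standing in for /etc/sysctl.conf
ensure_line "$conf" 'net.inet.ip.forwarding=1'
ensure_line "$conf" 'net.inet.ip.forwarding=1'    # second call changes nothing
cat "$conf"
rm -f "$conf"
```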

Packet filtering

As mentioned before, network traffic inside one rdomain stays within the current routing domain. pf(4) is used to move traffic from one rdomain to a different rdomain.

The following rules are needed in /etc/pf.conf to allow 172.16.7.2 on rdomain 2 to access the internet:

#       $OpenBSD: pf.conf,v 1.55 2017/12/03 20:40:04 sthen Exp $
#
# See pf.conf(5) and /etc/examples/pf.conf


# silently drop rejected packets
# set block-policy return
set block-policy drop

# packet and byte statistics
set loginterface egress

table <rdomains> const { 172.16.6.0/30, 192.168.1.0/24, 172.16.60.0/24, 172.16.7.0/30, 172.16.70.0/24 }

# don't care about ipv6 yet
block return out quick inet6 all
block in quick inet6 all

# default deny all
block log all

# loopback
pass in quick on lo0
pass out quick on lo0

pass in quick on lo1
pass out quick on lo1

pass in quick on lo2
pass out quick on lo2

# allow SSH for management purposes
pass in on em0 proto tcp to port 22 keep state

pass in on em0 to <rdomains> keep state
pass in on vlan6 from { 172.16.6.0/30, 192.168.1.0/24 }
pass in on vlan60 from 172.16.60.0/24
pass in on vlan7 from 172.16.7.0/30
pass in on vlan70 from 172.16.70.0/24

# allow host traffic
pass out on rdomain 0 from (em0)
# pass out from 172.16.7.2 to internet via em0
pass out on em0 from 172.16.7.2 received-on vlan7 keep state

# allow traffic out on vlan6 and vlan60 (rdomain 1)
pass out on vlan6 to { 172.16.6.0/30, 192.168.1.0/24 } keep state
pass out on vlan60 to 172.16.60.0/24 keep state
# allow traffic out on vlan7 and vlan70 (rdomain 2)
pass out on vlan7 to 172.16.7.0/30 keep state
pass out on vlan70 to 172.16.70.0/24 keep state

# move traffic from rdomain 0 to rdomain 2
match in on rdomain 0 to { 172.16.7.0/30, 172.16.70.0/24 } rtable 2
# move traffic from rdomain 2 to rdomain 0
match in on rdomain 2 from { 172.16.7.0/30, 172.16.70.0/24 } to !<rdomains> rtable 0
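
The ruleset above moves traffic only between rdomain 0 and rdomain 2. If another tenant needed the same treatment, the pair of match rules could be produced from a template. The sketch below is one way to do that; gen_rdomain_rules is a hypothetical helper, and whether Customer 1 should be allowed out of its routing domain at all is a policy decision this article does not make.

```shell
# Hypothetical generator: print the two pf.conf(5) match rules that move
# traffic between rdomain 0 and a tenant rdomain, following the pattern
# used for rdomain 2 in the ruleset above.
gen_rdomain_rules() {  # rdomain-id "comma-separated prefixes"
    printf 'match in on rdomain 0 to { %s } rtable %s\n' "$2" "$1"
    printf 'match in on rdomain %s from { %s } to !<rdomains> rtable 0\n' "$1" "$2"
}

gen_rdomain_rules 1 '172.16.6.0/30, 192.168.1.0/24, 172.16.60.0/24'
```

Matching pass rules would still be required for the traffic to flow. After editing, the file can be parsed without being loaded using pfctl -nf /etc/pf.conf.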

On 172.16.7.2:

$ ifconfig em2|grep inet|cut -d ' ' -f 2
172.16.7.2
$ mtr -r -c 1 10.10.11.254
Start: 2022-03-16T14:16:04+0200
HOST: obsdtest2                   Loss%   Snt   Last   Avg  Best  Wrst StDev
  1.|-- 10.10.10.100               0.0%     1    0.4   0.4   0.4   0.4   0.0
  2.|-- 10.10.11.254               0.0%     1    0.8   0.8   0.8   0.8   0.0

Caveats and Pitfalls

  • each tenant can have its own routing table, and several interfaces can belong to the same routing domain
  • default routes must be added in every routing domain, since the route lookup happens before pf(4)
  • a CARP interface must be in the same rdomain as the interface it attaches to (i.e. carpdev)
  • when an interface is assigned to an rdomain, its existing IP configuration is removed, so set the rdomain before configuring addresses
  • some tools and daemons are rdomain-aware, in which case they can be invoked directly by passing rdomain IDs
  • however, not all tools can handle rdomains themselves; use rcctl(8) to work around this by starting several instances in different rdomains

rdomain Awareness

sshd

To specify the local addresses sshd(8) should listen on, the following forms may be used:

ListenAddress hostname|address [rdomain domain]
ListenAddress hostname:port [rdomain domain]
ListenAddress IPv4_address:port [rdomain domain]
ListenAddress [hostname|address]:port [rdomain domain]

Multiple ListenAddress options are permitted. The default is to listen on all local addresses on the current default routing domain. The optional rdomain qualifier requests sshd(8) listen in an explicit routing domain.

ntpd

The basic configuration in ntpd.conf allows listen on to be specified multiple times. ntpd(8) will listen on each given address. The optional rtable keyword specifies which routing table to listen on. By default ntpd(8) will listen using the current routing table. For example:

listen on 127.0.0.1
listen on ::1
listen on 127.0.0.1 rtable 1

Other Tools

  • netstat(1): -T will select an alternate routing table to query. The default is to use the current routing table.
  • route(8): -T will select an alternate routing table to query. The default is to use the current routing table.
  • arp(8)/ndp(8): -V will select the routing domain.
  • ping(8): -V will set the routing table to be used for outgoing packets.
  • traceroute(8): -V will set the routing table to be used.
  • nc(1): -V will set the routing table to be used.
  • ps(1): rtable is amongst the available keywords.
    $ doas ps -auxwo rtable|grep 'root.*bgpd'
    USER       PID %CPU %MEM   VSZ   RSS TT  STAT   STARTED       TIME COMMAND          RTABLE
    root      8178  0.0  0.1  1464  1852 ??  I      28Feb22    0:00.07 /usr/sbin/bgpd        0
    root     76083  0.0  0.1  1464  1908 ??  I      Mon11PM    0:00.01 bgpd -f /etc/bgp      1
    root     46912  0.0  0.1   628  1504 p0  S+p     4:26PM    0:00.01 grep root.*bgpd       0
    
  • telnet(1): -V will set the routing table to be used.
  • id(1): -R will display the routing table of the current process.
  • rcctl(8): rtable can be set as in the following example:
    $ doas rcctl enable dhcpd50
    $ doas rcctl set dhcpd50 rtable 50
  • ifconfig(8): ifconfig rdomain id will attach the interface to the routing domain with the specified rdomainid while ifconfig -rdomain will remove the interface from the routing domain and return it to routing domain 0.

Tools which are not rdomain aware

Assuming we already have a BGP session in rdomain 0:

$ route -T0 exec bgpctl sh sum
Neighbor                   AS    MsgRcvd    MsgSent  OutQ Up/Down  State/PrfRcvd
TEST-IPv4               65000       4949       4949     0 1d17h13m      1

To start another bgpd(8) daemon in rdomain 1, use the following commands, which clone the rc.d(8) script by symlinking it. rcctl(8) then treats the two instances separately, each with its own flags:

$ doas ln -s /etc/rc.d/bgpd /etc/rc.d/bgpd_rd1
$ doas rcctl enable bgpd_rd1
$ doas rcctl set bgpd_rd1 rtable 1
$ doas rcctl set bgpd_rd1 flags "-f /etc/bgpd_rd1.conf"
$ doas vi /etc/bgpd_rd1.conf
$ doas /etc/rc.d/bgpd_rd1 start
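
The same steps apply to any daemon that is not rdomain-aware, so they can be written once as a template. The sketch below only prints the commands for review and executes nothing; clone_daemon_cmds, the _rd<N> naming scheme, and the rdomain 2 configuration path are assumptions carried over from the example above.

```shell
# Hypothetical helper: print (not run) the commands that clone an rc.d
# script for a second daemon instance bound to a given routing table.
clone_daemon_cmds() {  # daemon rdomain flags
    name="${1}_rd${2}"
    printf 'ln -s /etc/rc.d/%s /etc/rc.d/%s\n' "$1" "$name"
    printf 'rcctl enable %s\n' "$name"
    printf 'rcctl set %s rtable %s\n' "$name" "$2"
    printf 'rcctl set %s flags "%s"\n' "$name" "$3"
    printf 'rcctl start %s\n' "$name"
}

clone_daemon_cmds bgpd 2 '-f /etc/bgpd_rd2.conf'
```

The printed commands need root privileges, so they would be run through doas once the configuration file for the new instance exists.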

bgpctl(8) can use -s socket to communicate with bgpd(8) over a socket other than its default, which is /var/run/bgpd.sock.0 when bgpctl(8) runs in rdomain 0. The bgpd(8) control socket itself defaults to /var/run/bgpd.sock.<rdomain>, where <rdomain> is the routing domain in which bgpd(8) was started.

To administer bgpd(8) in a different routing domain, either run bgpctl(8) in that routing domain with route -T <rdomain> exec or specify the matching socket with -s:

$ route -T1 exec bgpctl sh sum
Neighbor                   AS    MsgRcvd    MsgSent  OutQ Up/Down  State/PrfRcvd
TEST1-IPv4              65001          4          4     0 00:00:09      1
$ bgpctl -s /var/run/bgpd.sock.1 sh sum
Neighbor                   AS    MsgRcvd    MsgSent  OutQ Up/Down  State/PrfRcvd
TEST1-IPv4              65001          4          4     0 00:00:11      1

Conclusion

OpenBSD’s rdomains are a smart and easy way to isolate traffic for multiple tenants within the same machine. All traffic crossing interfaces in the same rdomain is forwarded automatically according to that domain’s own routing table. Traffic between routing domains is managed by pf(4), so packets can leave a routing domain only where a rule explicitly moves them, which makes it possible to achieve safe isolation between multiple tenants.

References