Routing problem with floating IP addresses

Asked by Max Schilling

I've got a three-machine setup running Essex. Host A runs Glance, Keystone, and all nova-* services except nova-compute. Hosts B and C run only nova-compute. For the most part everything seems to work: I'm able to create instances, and I can ping them from the network controller using both the private and the floating IP addresses.

I'm using a FlatDHCP configuration, but the enterprise network in which the hosts run makes heavy use of VLANs. For the OpenStack setup I was granted three VLANs: 232, 233, and 235. VLAN 235 is the DMZ VLAN. All VLANs were created with gateways on the first IP address of their address space. I originally wanted to use VLAN 232 for the VMs, VLAN 233 for the host IPs, and VLAN 235 for floating IP addresses that are accessible from the outside. I ran into problems using the address space of VLAN 232 for the private IP addresses, so I changed that to local IP addresses.

Created VMs can access the internet as long as they don't have a floating IP address. As soon as I assign a floating IP address to a VM, it can no longer access the internet.

We checked on the network controller that the floating IP addresses were assigned:
   br235: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP
    link/ether 00:46:e9:25:b2:38 brd ff:ff:ff:ff:ff:ff
    inet 192.168.135.1/24 brd 192.168.135.255 scope global br235
    inet 138.246.18.131/32 scope global br235
    inet 138.246.18.132/32 scope global br235
    inet6 fe80::236:b9ff:fe25:b448/64 scope link
       valid_lft forever preferred_lft forever

But we're not able to ping these addresses from another server on the network. The routes on the network controller look like this:

    Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
    default         10.144.233.1    0.0.0.0         UG    100    0        0 br233
    10.144.233.0    *               255.255.255.0   U     0      0        0 br233
    192.168.122.0   *               255.255.255.0   U     0      0        0 virbr0
    192.168.235.0   *               255.255.255.0   U     0      0        0 br235
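
One way to read this table is as a longest-prefix match. The sketch below uses Python's standard `ipaddress` module on the routes copied from the table; the helper name `pick_route` and the test destination 8.8.8.8 are illustrative assumptions, not part of the setup. It suggests that traffic to any outside host leaves through br233 and its gateway, whatever source address the packet carries.

```python
import ipaddress

# Routes copied from the table above: (destination network, gateway, interface).
ROUTES = [
    ("0.0.0.0/0", "10.144.233.1", "br233"),
    ("10.144.233.0/24", None, "br233"),
    ("192.168.122.0/24", None, "virbr0"),
    ("192.168.235.0/24", None, "br235"),
]


def pick_route(destination):
    """Return the (network, gateway, interface) entry with the longest matching prefix."""
    addr = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(net), gw, dev)
        for net, gw, dev in ROUTES
        if addr in ipaddress.ip_network(net)
    ]
    # The default route always matches, so `matches` is never empty here.
    return max(matches, key=lambda m: m[0].prefixlen)


net, gw, dev = pick_route("8.8.8.8")  # hypothetical outside host
print(net, gw, dev)
```

This is only a reading of the table, not a diagnosis: it shows which interface the kernel would choose, not whether the upstream router accepts the floating addresses on that path.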

Does anyone know how the network controller usually propagates the floating IP addresses to the connected switches/routers? Or how I can check that it's doing so correctly?

Question information

Language: English
Status: Solved
For: OpenStack Compute (nova)
Solved by: Max Schilling
James Kyle (jkyle) said :
#1

Floating IPs are routed to VMs on the host via iptables forwarding rules.

Each VM receives its own iptables chain. For example, if you had a VM with the ID instance-00000b, the network controller would create an iptables chain named something like nova-instance-11. That chain contains a rule that forwards the floating IP to the appropriate VM interface/host (along with other rules).

To get a dump of your currently generated rules, execute

    iptables -S

You can optionally filter the output by the chain for the VM you're interested in. If you're in a vishy HA config, this is done on the node that is hosting the VM.

In general, when debugging this (assuming the iptables rules are in place and look good, and IP forwarding is enabled on the host), you'll want to trace the packet through the various networking layers and determine at which point it's being dropped.
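
As a starting point for that kind of check, here is a minimal Python sketch that picks the rules mentioning one floating IP out of an `iptables -S` style dump and reads the kernel's forwarding switch. The function names and the sample dump are made up for illustration and are not output from the poster's system. Note that `iptables -S` without `-t` lists the filter table, so NAT rules may need a dump from `iptables -t nat -S`.

```python
def rules_mentioning(dump, ip):
    """Return the lines of an `iptables -S` style dump that reference `ip`."""
    hits = []
    for line in dump.splitlines():
        tokens = line.split()
        # Match the bare address or the address with a prefix length (e.g. /32).
        if any(tok == ip or tok.startswith(ip + "/") for tok in tokens):
            hits.append(line.strip())
    return hits


def ip_forwarding_enabled(path="/proc/sys/net/ipv4/ip_forward"):
    """True if the Linux kernel is set to forward IPv4 packets."""
    with open(path) as f:
        return f.read().strip() == "1"


# Hypothetical sample dump; chain names and private addresses are assumptions.
SAMPLE = """\
-P PREROUTING ACCEPT
-A nova-network-PREROUTING -d 138.246.18.131/32 -j DNAT --to-destination 192.168.235.3
-A nova-network-float-snat -s 192.168.235.3/32 -j SNAT --to-source 138.246.18.131
-A nova-network-PREROUTING -d 138.246.18.132/32 -j DNAT --to-destination 192.168.235.4
"""

for rule in rules_mentioning(SAMPLE, "138.246.18.131"):
    print(rule)
```

On a real host the dump would come from running the iptables command as root, and `ip_forwarding_enabled()` would be called on the node that does the NAT.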

Max Schilling (mx-chilly) said :
#2

I got it to work, but I still don't understand what the cause of the error was exactly.

My private VM network now uses br232 (VLAN 232). I deleted br235 and changed public_interface to eth0.235, because I was unsure what I was supposed to use there. Then I assigned eth0.235 an IP address (there is no DHCP in that VLAN) and restarted the servers. The cloud controller now uses the default gateway of eth0.235 instead of that of br233 (the management VLAN). With those changes it worked.

chuanyu (x77126) said :
#3

I have a similar problem, but I run OpenStack in a multi_host environment.
First, my compute workers have had no public IP addresses from the start; they use NAT. (I think this is the point?!)
When a floating IP is bound to the public interface of a compute worker, its netmask is 255.255.255.255,
and nova-network also sends an arping to the neighbors, but it seems to have no effect.

I noticed that my routing was not correct, so I tried to add a default gateway:
    root@compute5:~# ip r add default via 140.113.207.254 dev br100
Unfortunately, because the floating IP bound to br100 is "140.113.207.161/32",
the gateway (207.254) and the floating IP are not in the same network, so the route cannot be added.

If I manually change the address to 140.113.207.161/24 and set the default gateway to 207.254 again,
outgoing traffic from the VM works, and the VM can now be pinged from outside.
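
The prefix-length effect described above can be checked with Python's standard `ipaddress` module. This is a small sketch using the addresses from this post; the helper name `gateway_on_link` is made up for illustration.

```python
import ipaddress


def gateway_on_link(address_with_prefix, gateway):
    """True if `gateway` falls inside the network implied by the interface address."""
    iface = ipaddress.ip_interface(address_with_prefix)
    return ipaddress.ip_address(gateway) in iface.network


# With a /32 the interface's network contains only the floating IP itself.
print(gateway_on_link("140.113.207.161/32", "140.113.207.254"))  # False
# With a /24 the gateway is inside 140.113.207.0/24.
print(gateway_on_link("140.113.207.161/24", "140.113.207.254"))  # True
```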

So I don't know whether a nova-network node should always have its own public IP or not...

chuanyu (x77126) said :
#4

I wrote source-based routing patches that add two options so the patch can be fitted to your environment.
These patches work fine on Ubuntu cloud (natty and precise),
and my OpenStack version is Essex as packaged in Ubuntu 12.04.

I use the multi_host option, but I think all nova-network nodes need it.
I am new to OpenStack, so I hope somebody can give me some advice.

Thanks.
<<< nova.conf >>>
# prefix length nova-network uses when binding a floating ip
float_cidr=24
# default gateway for your floating ips, if your hosts use NAT to begin with
float_gw_ip=140.113.207.254
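
To make the role of the two options concrete, here is a hedged Python sketch of the `ip` commands that the patched bind step in patch2 appears to run, in order. It only builds the argument lists and executes nothing. The function name `bind_commands` and the fixed IP 10.0.0.3 are illustrative assumptions; table 200 and the option names come from the patch.

```python
def bind_commands(floating_ip, fixed_ip, device, float_cidr, float_gw_ip,
                  table="200"):
    """List the `ip` argument vectors for binding one floating IP, in order."""
    return [
        # Bind the floating IP with the configured prefix length instead of /32.
        ["ip", "addr", "add", "%s/%d" % (floating_ip, float_cidr), "dev", device],
        # Source-based rule: packets from the VM's fixed IP use the extra table.
        ["ip", "rule", "add", "from", fixed_ip, "table", table],
        # Default route for that table via the floating-IP gateway.
        ["ip", "route", "add", "default", "via", float_gw_ip,
         "dev", device, "table", table],
        ["ip", "route", "flush", "cache"],
    ]


# 10.0.0.3 is a hypothetical fixed IP; the other values appear in this thread.
for argv in bind_commands("140.113.207.161", "10.0.0.3", "br100",
                          float_cidr=24, float_gw_ip="140.113.207.254"):
    print(" ".join(argv))
```

Keeping the command construction separate from execution makes it easy to review what would be run before touching a live network node.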

<<< patch1 : >>>
--- nova/network/l3.py.orig 2012-06-07 18:27:00.730605718 +0800
+++ nova/network/l3.py 2012-06-07 18:27:00.718606710 +0800
@@ -102,11 +102,11 @@
         linux_net.unplug(network_ref)

     def add_floating_ip(self, floating_ip, fixed_ip, l3_interface_id):
-        linux_net.bind_floating_ip(floating_ip, l3_interface_id)
+        linux_net.bind_floating_ip(floating_ip, fixed_ip, l3_interface_id)
         linux_net.ensure_floating_forward(floating_ip, fixed_ip)

     def remove_floating_ip(self, floating_ip, fixed_ip, l3_interface_id):
-        linux_net.unbind_floating_ip(floating_ip, l3_interface_id)
+        linux_net.unbind_floating_ip(floating_ip, fixed_ip, l3_interface_id)
         linux_net.remove_floating_forward(floating_ip, fixed_ip)

     def add_vpn(self, public_ip, port, private_ip):

<<< patch2 : >>>

--- nova/network/linux_net.py.orig 2012-06-07 18:27:05.630201035 +0800
+++ nova/network/linux_net.py 2012-06-07 18:27:05.626201365 +0800
@@ -79,6 +79,12 @@
                 default=False,
                 help='Use single default gateway. Only first nic of vm will '
                      'get default gateway from dhcp server'),
+    cfg.IntOpt('float_cidr',
+               default=32,
+               help='The cidr is used when bind floating ip to Public IF'),
+    cfg.StrOpt('float_gw_ip',
+               default='$my_ip',
+               help='Gateway ip of floating ip'),
     ]

 FLAGS = flags.FLAGS
@@ -457,23 +463,39 @@
     iptables_manager.apply()

-def bind_floating_ip(floating_ip, device):
-    """Bind ip to public interface."""
-    _execute('ip', 'addr', 'add', str(floating_ip) + '/32',
+def bind_floating_ip(floating_ip, fixed_ip, device):
+    """Bind ip to public interface,
+    and add source based routing to routing table '200'.
+    """
+    _execute('ip', 'addr', 'add', str(floating_ip) + '/' + str(FLAGS.float_cidr),
              'dev', device,
              run_as_root=True, check_exit_code=[0, 2, 254])
+    _execute('ip', 'rule', 'add', 'from', str(fixed_ip), 'table', '200',
+             run_as_root=True, check_exit_code=[0, 2, 254])
+    _execute('ip', 'route', 'add', 'default', 'via', str(FLAGS.float_gw_ip),
+             'dev', device, 'table', '200',
+             run_as_root=True, check_exit_code=[0, 2, 254])
+    _execute('ip', 'route', 'flush', 'cache',
+             run_as_root=True, check_exit_code=[0, 2, 254])
     if FLAGS.send_arp_for_ha:
         _execute('arping', '-U', floating_ip,
                  '-A', '-I', device,
                  '-c', 1, run_as_root=True, check_exit_code=False)

-def unbind_floating_ip(floating_ip, device):
-    """Unbind a public ip from public interface."""
-    _execute('ip', 'addr', 'del', str(floating_ip) + '/32',
+def unbind_floating_ip(floating_ip, fixed_ip, device):
+    """Unbind a public ip from public interface,
+    and del related routing rule and table.
+    """
+    _execute('ip', 'addr', 'del', str(floating_ip) + '/' + str(FLAGS.float_cidr),
              'dev', device,
              run_as_root=True, check_exit_code=[0, 2, 254])
-
+    _execute('ip', 'route', 'del', 'default', 'via', str(FLAGS.float_gw_ip),
+             'dev', device, 'table', '200',
+             run_as_root=True, check_exit_code=[0, 2, 254])
+    _execute('ip', 'rule', 'del', 'from', str(fixed_ip), 'table', '200',
+             run_as_root=True, check_exit_code=[0, 2, 254])
+

 def ensure_metadata_ip():
     """Sets up local metadata ip."""