route-table: Make route-table module thread-safe.
Since the introduction of xcache, the netdev struct can be freed by
the revalidator threads. This makes the following race possible:
1. Suppose there is a GRE tunnel and datapath flows that go through
the tunnel. The user then deletes the tunnel port.
2. The main thread closes all of its references to the corresponding
netdev struct.
3. If the ukeys for the tunnel datapath flows hold the last reference
to the old port's netdev, the revalidator will then close it
and remove the netlink notifier struct (if the netdev is the last
vport netdev).
4. However, if the main thread executes netdev_vport_run() and
sees the netlink notifier struct before the revalidator frees it,
it will try to poll for updates from the notifier socket. By the
time it polls the socket fd, the fd has already been closed, so
poll keeps failing with "Bad file descriptor".
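The locking discipline that closes this kind of window can be sketched
as follows. This is a minimal illustration under stated assumptions, not
the code from this patch: struct notifier, notifier_ref(),
notifier_unref() and notifier_run() are hypothetical names, and the
actual poll of the socket is left out. The point is only that the thread
that frees the notifier and the thread that uses it take the same mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the netlink notifier; not the real OVS struct. */
struct notifier {
    int fd;                         /* Socket fd polled by the main thread. */
};

static pthread_mutex_t notifier_mutex = PTHREAD_MUTEX_INITIALIZER;
static struct notifier *notifier;   /* Guarded by notifier_mutex. */
static int n_users;                 /* Guarded by notifier_mutex. */

/* Takes a reference, creating the notifier on the first one (any thread). */
void
notifier_ref(int fd)
{
    pthread_mutex_lock(&notifier_mutex);
    if (!n_users++) {
        notifier = malloc(sizeof *notifier);
        if (!notifier) {
            abort();
        }
        notifier->fd = fd;
    }
    pthread_mutex_unlock(&notifier_mutex);
}

/* Drops a reference, freeing the notifier on the last one.  May be called
 * from a revalidator thread. */
void
notifier_unref(void)
{
    pthread_mutex_lock(&notifier_mutex);
    assert(n_users > 0);
    if (!--n_users) {
        free(notifier);
        notifier = NULL;
    }
    pthread_mutex_unlock(&notifier_mutex);
}

/* Called from the main loop.  Returns true if a notifier existed and could
 * have been polled.  Because the mutex is held, the notifier cannot be
 * freed between the check and the use of notifier->fd. */
bool
notifier_run(void)
{
    bool usable = false;

    pthread_mutex_lock(&notifier_mutex);
    if (notifier) {
        /* Real code would poll notifier->fd here. */
        usable = true;
    }
    pthread_mutex_unlock(&notifier_mutex);
    return usable;
}
```

With this arrangement, step 4 can no longer observe a notifier that step 3
is in the middle of tearing down: either the main thread sees it and uses
it while holding the mutex, or it sees NULL and skips the poll.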
The following script can be used to reproduce the race:
- Assume a VM-to-VM setup, configured as follows:
ovs-vsctl add-br br-int -- add-port br-int p3
ovs-vsctl add-port br-int vif1 -- set int vif1 type=internal
ifconfig vif1 11.0.0.1 up; ifconfig eth3 3.3.3.1 up
- While keeping a ping running from 11.0.0.1 to 11.0.0.2, run this loop:
for i in `seq 1000000`; do
    sleep 5; ovs-vsctl del-port p3;
    ovs-vsctl add-port br-int p3 -- set int p3 type=gre \
        options:remote_ip=3.3.3.2 options:key=1;
done
- After a while, the race should be triggered, and the main thread
should run at 100% CPU.
This race has already been fixed on master by commit 3c27dbe6
(route-table: Make route-table module thread-safe.). This commit
backports it to branch-2.3.
VMware-BZ: #1287360
Signed-off-by: Ryan Wilson <wryan@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
Signed-off-by: Alex Wang <alexw@nicira.com>