net/mlx4_core: Fix deadlock when switching between polling and event fw commands
author Jack Morgenstein <jackm@dev.mellanox.co.il>
Tue, 20 Sep 2016 11:39:42 +0000 (14:39 +0300)
committer David S. Miller <davem@davemloft.net>
Thu, 22 Sep 2016 01:52:43 +0000 (21:52 -0400)
commit a7e1f04905e5b2b90251974dddde781301b6be37
tree 5008a3d346e4d8b8e5aff19b8b3ab2667e1f95e7
parent 30353bfc43a1602c020f31d95cf27182ffd23824

When switching from polling-based fw commands to event-based fw
commands, there is a race condition which could cause a fw command
in another task to hang: that task will keep waiting for the polling
semaphore, but may never be able to acquire it. This is due to
mlx4_cmd_use_events, which "down"s the semaphore back to 0.

During driver initialization, this is not a problem, since no other
tasks which invoke FW commands are active.

However, there is a problem if the driver switches to polling mode
and then back to event mode during normal operation.

The "test_interrupts" feature does exactly that.
Running "ethtool -t <eth device> offline" causes the PF driver to
temporarily switch to polling mode, and then back to event mode.
(Note that for VF drivers, such switching is not performed).

Fix this by adding a read-write semaphore for protection when
switching between modes.

Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/ethernet/mellanox/mlx4/cmd.c
drivers/net/ethernet/mellanox/mlx4/mlx4.h