Commit graph

262 commits

Author SHA1 Message Date
Thomas Gleixner
dace145374 [PATCH] irq-flags: misc drivers: Use the new IRQF_ constants
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-02 13:58:50 -07:00
Jörn Engel
6ab3d5624e Remove obsolete #include <linux/config.h>
Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-30 19:25:36 +02:00
Greg Kroah-Hartman
e29419fffc [PATCH] 64bit resource: fix up printks for resources in misc drivers
This is needed if we wish to change the size of the resource structures.

Based on an original patch from Vivek Goyal <vgoyal@in.ibm.com>

Cc: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-27 09:23:59 -07:00
Roland Dreier
c93b6fbaa9 IB/mthca: Make all device methods truly reentrant
Documentation/infiniband/core_locking.txt says:

  All of the methods in struct ib_device exported by a low-level
  driver must be fully reentrant.  The low-level driver is required to
  perform all synchronization necessary to maintain consistency, even
  if multiple function calls using the same object are run
  simultaneously.

However, mthca's modify_qp, modify_srq and resize_cq methods are
currently not reentrant.  Add a mutex to the QP, SRQ and CQ structures
so that these calls can be properly serialized.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17 20:37:41 -07:00
Roland Dreier
c9c5d9feef IB/mthca: Fix memory leak on modify_qp error paths
Some error paths after the mthca_alloc_mailbox() call in mthca_modify_qp()
just do a "return -EINVAL" without freeing the mailbox.  Convert these
returns to "goto out" to avoid leaking the mailbox storage.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17 20:37:41 -07:00
Or Gerlitz
d4cb0784fd IB/mthca: Fill in max_map_per_fmr device attribute
Report the true max_map_per_fmr value from mthca_query_device(),
taking into account the change in FMR remapping introduced by the
Sinai performance optimization.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17 20:37:37 -07:00
Leonid Arsh
12bbb2b7be IB/mthca: Add client reregister event generation
Change the mthca snoop of MADs that set PortInfo to check if the SM
has set the client reregister bit, and if it has, generate a client
reregister event.  If the bit is not set, just generate a LID change
event as usual.

Signed-off-by: Leonid Arsh <leonida@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17 20:37:36 -07:00
Roland Dreier
e9cd59418f IB/mthca: Convert FW commands to use wait_for_completion_timeout()
The kernel has had wait_for_completion_timeout() for a long time now.
mthca should use it to handle FW commands timing out, instead of
implementing the same thing in a much more complicated way by using
wait_for_completion() along with a timer that does complete().

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17 20:37:30 -07:00
Michael S. Tsirkin
a26026c122 IB/mthca: Remove dead code
Kill some dead code in mthca_eq.c

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17 20:37:29 -07:00
Michael S. Tsirkin
4e56ea794e IB/mthca: memfree completion with error FW bug workaround
Memfree firmware is in rare cases reporting WQE index == base - 1 in
receive completion with error, instead of (rq size - 1); base is 0 in
mthca.  Here is a patch to avoid kernel crash and report a correct WR
id in this case.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17 20:37:20 -07:00
Michael S. Tsirkin
13aa6ecb47 IB/mthca: restore missing PCI registers after reset
mthca does not restore the following PCI-X/PCI Express registers after reset:
  PCI-X device: PCI-X command register
  PCI-X bridge: upstream and downstream split transaction registers
  PCI Express : PCI Express device control and link control registers

This causes instability and/or bad performance on systems where one of
these registers is set to a non-default value by BIOS.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-06-17 20:37:00 -07:00
Michael S. Tsirkin
ab28b171ea IB/mthca: Fix posting lists of 256 receive requests to SRQ for Tavor
If we post a list of length exactly a multiple of 256, nreq in
doorbell gets set to 256 which is wrong: it should be encoded by 0.
This is because we only zero it out on the next WR, which may not be
there.  The solution is to ring the doorbell after posting a WQE, not
before posting the next one.

This is the same bug that we just fixed for QPs with non-shared RQ.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-24 13:43:37 -07:00
Michael S. Tsirkin
23f3bc0f2c IB/mthca: Fix posting lists of 256 receive requests for Tavor
If we post a list of length 256 exactly, nreq in doorbell gets set to
256 which is wrong: it should be encoded by 0.  This is because we
only zero it out on the next WR, which may not be there.  The solution
is to ring the doorbell after posting a WQE, not before posting the
next one.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-18 11:37:03 -07:00
Roland Dreier
1db76c14d2 IB/mthca: Make fw_cmd_doorbell default to 0
Setting fw_cmd_doorbell allows FW command to be queued using posted
writes instead of requiring polling on a "go" bit, so it should be a
performance boost.  However, the option causes problems with at least
some device/firmware combinations, so set the default to 0 until we
understand what's going on better.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-17 07:48:07 -07:00
Michael S. Tsirkin
ce477ae4f8 IB/mthca: FMR ioremap fix
Addresses for ioremap must be calculated off of pci_resource_start;
we can't directly use the bus address as seen by the HCA.  Fix the
code that remaps device memory for FMR access.

Based on patch by Klaus Smolin.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-10 15:16:57 -07:00
Roland Dreier
a3285aa4ee IB/mthca: Fix race in reference counting
Fix races in in destroying various objects.  If a destroy routine
waits for an object to become free by doing

	wait_event(&obj->wait, !atomic_read(&obj->refcount));
	/* now clean up and destroy the object */

and another place drops a reference to the object by doing

	if (atomic_dec_and_test(&obj->refcount))
		wake_up(&obj->wait);

then this is susceptible to a race where the wait_event() and final
freeing of the object occur between the atomic_dec_and_test() and the
wake_up().  And this is a use-after-free, since wake_up() will be
called on part of the already-freed object.

Fix this in mthca by replacing the atomic_t refcounts with plain old
integers protected by a spinlock.  This makes it possible to do the
decrement of the reference count and the wake_up() so that it appears
as a single atomic operation to the code waiting on the wait queue.

While touching this code, also simplify mthca_cq_clean(): the CQ being
cleaned cannot go away, because it still has a QP attached to it.  So
there's no reason to be paranoid and look up the CQ by number; it's
perfectly safe to use the pointer that the callers already have.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-09 10:50:29 -07:00
Roland Dreier
254abfd33a IB/mthca: Fix offset in query_gid method
GuidInfo records have 8 byte GUIDs in them, so an index should be
multiplied by 8 to get an offset.  mthca_query_gid() was incorrectly
multiplying by 16.

Noticed by Leonid Keller <leonid@mellanox.co.il>.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-05-01 10:40:23 -07:00
Adrian Bunk
415dcd95b2 IB/mthca: make a function static
This patch makes the needlessly global mthca_update_rate() static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-04-19 11:40:12 -07:00
Jack Morgenstein
59fef3b1e9 IB/mthca: Fix max_srq_sge returned by ib_query_device for Tavor devices
The driver allocates SRQ WQEs size with a power of 2 size both for
Tavor and for memfree. For Tavor, however, the hardware only requires
the WQE size to be a multiple of 16, not a power of 2, and the max
number of scatter-gather allowed is reported accordingly by the
firmware (and this is the value currently returned by
ib_query_device() and ibv_query_device()).

If the max number of scatter/gather entries reported by the FW is used
when creating an SRQ, the creation will fail for Tavor, since the
required WQE size will be increased to the next power of 2, which
turns out to be larger than the device permitted max WQE size (which
is not a power of 2).

This patch reduces the reported SRQ max wqe size so that it can be used
successfully in creating an SRQ on Tavor HCAs.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-04-12 11:42:30 -07:00
Michael S. Tsirkin
abf45dbb5b IB/mthca: Disable tuning PCI read burst size
The PCI spec recommends against drivers playing with a device's PCI
read burst size, and says that systems software should configure it.
And we actually have users that report that changing it from the
default set by BIOS hurts performance and/or stability for them.  On
the other hand, the Mellanox Programmer's Reference Manual recommends
turning it up all the way to the maximum value.  Some tests conducted
here in the lab do not show performance improvement from this tuning,
but this might be just me.

As a work-around, make this tuning an option, off by default (safe
value), with an eye towards removing it completely one day if no one
complains.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-04-10 09:43:58 -07:00
Jack Morgenstein
bf6a9e31cf IB: simplify static rate encoding
Push translation of static rate to HCA format into low-level drivers,
where it belongs.  For static rate encoding, use encoding of rate
field from IB standard PathRecord, with addition of value 0, for
backwards compatibility with current usage.  The changes are:

 - Add enum ib_rate to midlayer includes.
 - Get rid of static rate translation in IPoIB; just use static rate
   directly from Path and MulticastGroup records.
 - Update mthca driver to translate absolute static rate into the
   format used by hardware.  This also fixes mthca's static rate
   handling for HCAs that are capable of 4X DDR.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-04-10 09:43:47 -07:00
Roland Dreier
227c939b00 IB/mthca: Always build debugging code unless CONFIG_EMBEDDED=y
Change the mthca debugging trace output code so that it can enabled
and disabled at runtime with the debug_level module parameter in
sysfs.  Also, don't allow CONFIG_INFINIBAND_MTHCA_DEBUG to be disabled
unless CONFIG_EMBEDDED is selected.  We want users (and especially
distros) to have this turned on unless they really need to save space,
because by the time we want debugging output, it's usually too late to
rebuild a kernel.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-04-02 14:39:20 -07:00
Roland Dreier
e1f7868c80 IB/mthca: Fix section mismatch problems
Quite a few cleanup functions in mthca were marked as __devexit.
However, they could also be called from error paths during
initialization, so they cannot be marked that way.  Just delete all of
the incorrect annotations.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-29 09:36:46 -08:00
Jack Morgenstein
a07bacca7b IB/mthca: Fix check of size in SRQ creation
The previous patch for Tavor broke MemFree logic.

The driver should perform limit check only for Tavor.  For MemFree,
the check is incorrect, since ds (WQE stride) is always a power-of-2
(although the max_desc_size may not be).

In Tavor, however, WQE stride and desc_size are the same, and are not
necessarily power-of-2.  The check was really for the WQE stride (and
it Tavor, we use max_desc_size for the stride).

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-29 09:36:46 -08:00
Roland Dreier
192daa18dd IB/mthca: Fix modify QP error path
If the call to mthca_MODIFY_QP() failed, then mthca_modify_qp() would
still do some things it shouldn't, such as store away attributes for
special QPs.  Fix this, and simplify the code, by simply jumping to
the exit path if mthca_MODIFY_QP() fails.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-24 15:47:30 -08:00
Roland Dreier
b0b3a8e193 IB/mthca: Fix indentation
Fix some whitespace damage (indenting with spaces) that snuck in.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-24 15:47:29 -08:00
Jack Morgenstein
b3f64967fa IB/mthca: Fix uninitialized variable in mthca_alloc_qp()
mthca_alloc_sqp() by mthca_set_qp_size() need to set qp->transport
before calling mthca_set_qp_size(), since the value is used there.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-24 15:47:29 -08:00
Jack Morgenstein
d4301e2c66 IB/mthca: Check SRQ limit in modify SRQ operation
When setting the shared receive queue (SRQ) watermark in a modify SRQ
operation, make sure that the supplied value is not larger than the
full size of the SRQ.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-24 15:47:28 -08:00
Jack Morgenstein
ded9ad721d IB/mthca: Check that SRQ WQE size does not exceed device's max value
Guarantee the calculated work queue entry size does not exceed the max
allowable WQE size when creating an SRQ.  This is a problem with Arbel
in Tavor-compatibility mode because the current WQE size computation
method rounds up to next power of 2.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-24 15:47:27 -08:00
Dotan Barak
0ef61db837 IB/mthca: Check that sgid_index and path_mtu are valid in modify_qp
Add a check that the modify QP parameters sgid_index and path_mtu are
valid, since they might come from userspace.

Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-24 15:47:27 -08:00
Eli Cohen
fd02e8038e IB/mthca: Query SRQ srq_limit fixes
Fix endianness handling of srq_limit: it is big-endian in the context
structure, so we need to swab it before returning it.

Also add support for srq_limit query for Tavor (non-MemFree) HCAs.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:26 -08:00
Dotan Barak
e10e271bfd IB/mthca: Correct reported SRQ size in MemFree case.
MemFree devices need to reserve one shared receive queue (SRQ) work
request for internal use, so the capacity returned from the create_srq
and query_srq methods should be srq->max - 1.

Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:26 -08:00
Roland Dreier
6b63e3015a IB/mthca: Coverity fix to mthca_init_eq_table()
Fix bug found by coverity: the loop body never executed, because it
was doing for (i = 0; i < MTHCA_EQ_CMD; ++i), but MTHCA_EQ_CMD is 0.
The correct loop bound is MTHCA_NUM_EQ, to loop over all EQs.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:25 -08:00
Roland Dreier
6226bb5701 IB/mthca: Update firmware versions
Update known firmware versions in driver's table to the latest releases.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:22 -08:00
Eli Cohen
651eaac928 IB/mthca: Optimize large messages on Sinai HCAs
Sinai (one-port PCI Express) HCAs get improved throughput for messages
bigger than 80 KB in DDR mode if memory keys are formatted in a
specific way.  The enhancement only works if the memory key table is
smaller than 2^24 entries.  For larger tables, the enhancement is off
and a warning is printed (to avoid silent performance loss).

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Michael Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:22 -08:00
Ishai Rabinovitz
8d3ef29d6b IB/mthca: Use an enum for HCA page size
Use a named enum for the HCA's internal page size, rather than having
magic values of 4096 and shifts by 12 all over the code.  Also, fix
one minor bug in EQ handling: only one HCA page is mapped to the HCA
during initialization, but a full kernel page is unmapped during
cleanup.  This might cause problems when PAGE_SIZE != 4096.

Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:19 -08:00
Dotan Barak
67e7377661 IB/mthca: Check alternate P_Key index when setting alternate path
Check that the alternate P_Key index is in range when setting the
alternate path for a QP.  Also make a cosmetic touch up to the debug
message printed when the main P_Key index is out of range.

Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:18 -08:00
Dotan Barak
7667abd152 IB/mthca: Add support for send work request fence flag
Add support for IB_SEND_FENCE flag in post_send methods.

Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:18 -08:00
Jack Morgenstein
1d89b1ae6c IB/mthca: Implement query_ah method
Implement query_ah (except for AVs which are in HCA memory).  This is
needed to implement RMPP duplicate session detection on sending side
(extraction of DGID/DLID and GRH flag from address handle).

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:17 -08:00
Eli Cohen
14abdffcc0 IB/mthca: Write FW commands through doorbell page
This patch is checks whether the HCA supports posting FW commands
through a doorbell page (user access region 0, or "UAR0").  If this is
supported, the driver maps UAR0 and uses it for FW commands. This can
be controlled by the value of a writable module parameter
fw_cmd_doorbell.  When the parameter is 0, the commands are posted
through HCR using the old method; otherwise if HCA is capable commands
go through UAR0.

This use of UAR0 to post commands eliminates the need for polling the
"go" bit prior to posting a new command. Since reading from a PCI
device is much more expensive then issuing a posted write, it is
expected that issuing FW commands this way will provide better CPU
utilization.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:17 -08:00
Dotan Barak
abb6e9ba17 IB/mthca: Return actual capacity from create_srq
Have mthca's create_srq method return the actual capacity of the SRQ
that gets created.  Also update comments in <rdma/ib_verbs.h> to
clarify that this is what is expected from ib_create_srq().

Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:16 -08:00
Roland Dreier
00df1b2c8b IB/mthca: Bump driver version and release date
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:15 -08:00
Eli Cohen
8ebe5077e3 IB/mthca: Support for query QP and SRQ
Implement the query_qp and query_srq methods in mthca.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:15 -08:00
Roland Dreier
d844183d9c IB/mthca: Convert to use ib_modify_qp_is_ok()
Use ib_modify_qp_is_ok() in mthca, and delete the big table of
attributes for queue pair state transitions.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:13 -08:00
Roland Dreier
3fa1fa3e80 IB/mthca: Generate SQ drained events when requested
Add low-level driver support to ib_mthca so that consumers can request
a "send queue drained" event be generated when a transiton to the SQD
state completes.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:12 -08:00
Or Gerlitz
d36f34aadf IB: Enable FMR pool user to set page size
This patch allows the consumer to set the page size of "pages" mapped
by the pool FMRs, which is a feature already existing in the base
verbs API.  On the cosmetic side it changes ib_fmr_attr.page_size field
to be named page_shift.

Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:10 -08:00
Roland Dreier
6dfc3901b0 IB/mthca: Add modify_device method to set node description
Add a modify_device method to mthca, which implements setting the node
description.  This makes the writable "node_desc" sysfs attribute work
for Mellanox HCAs.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:10 -08:00
Roland Dreier
2fa5e2ebbe IB/mthca: Whitespace cleanups
Remove trailing whitespace and fix indentation that with spaces
instead of tabs.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:09 -08:00
Roland Dreier
4885bf64bc IB/mthca: Add device-specific support for resizing CQs
Add low-level driver support for resizing CQs (both kernel and
userspace) to mthca.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:08 -08:00
Roland Dreier
399d792129 IB/mthca: Get rid of might_sleep() annotations
The might_sleep() annotations in mthca are silly -- they all occur
shortly before calls that will end up in core functions like kmalloc()
that will print the same warning in an unsafe context anyway.  In
fact, beyond cluttering the source, we're actually bloating text with
CONFIG_DEBUG_SPINLOCK_SLEEP and/or CONFIG_PREEMPT_VOLUNTARY set.

With both options set, getting rid of the might_sleep()s saves a lot:
add/remove: 0/0 grow/shrink: 0/7 up/down: 0/-171 (-171)
function                                     old     new   delta
mthca_pd_alloc                               132     109     -23
mthca_init_cq                                969     946     -23
mthca_mr_alloc                               592     568     -24
mthca_pd_free                                 67      42     -25
mthca_free_mr                                219     194     -25
mthca_free_cq                                570     545     -25
mthca_fmr_alloc                              742     716     -26

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:07 -08:00
Roland Dreier
d9b98b0f11 IB/mthca: Make functions that never fail return void
The function mthca_free_err_wqe() can never fail, so get rid of its
return value.  That means handle_error_cqe() doesn't have to check
what mthca_free_err_wqe() returns, which means it can't fail either
and doesn't have to return anything either.  All this results in
simpler source code and a slight object code improvement:

add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-10 (-10)
function                                     old     new   delta
mthca_free_err_wqe                            83      81      -2
mthca_poll_cq                               1758    1750      -8

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-03-20 10:08:07 -08:00
Roland Dreier
7d2babc487 IB/mthca: bump driver version and release date
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-02-13 12:19:44 -08:00
Roland Dreier
f295c79b67 IB/mthca: Don't print debugging info until we have all values
When debugging is enabled, the mthca_QUERY_DEV_LIM() firmware command
function prints out some of the device limits that it queries.
However the debugging prints happen before all of the fields are
extracted from the firmware response, so some of the values that get
printed are uninitialized junk.  Move the prints to the end of the
function to fix this.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-02-10 18:02:44 -08:00
Roland Dreier
fd9cfdd11b IB/mthca: Semaphore to mutex conversions
Convert semaphores to mutexes in mthca.  Leave firmware command
interface poll_sem and event_sem as semaphores.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-30 16:45:11 -08:00
Michael S. Tsirkin
e3aa31c517 IB/mthca: Don't cancel commands on a signal
We have run into the following problem: if a task receives a signal
while in the process of e.g. destroying a resource (which could be
because the relevant file was closed) mthca could bail out from trying
to take a command interface semaphore without performing the
appropriate command to tell hardware that the resource is being
destroyed.

As a result we see messages like
 ib_mthca 0000:04:00.0: HW2SW_CQ failed (-4)

In this case, hardware could access the resource after the memory has
been freed, possibly causing memory corruption.

A simple solution is to replace down_interruptible() by down() in
command interface activation.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
[ It's also not safe to bail out on multicast table operations, since
  they may be invoked on the cleanup path too.  So use down() for
  mcg_table.sem too. ]
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-30 16:22:29 -08:00
Michael S. Tsirkin
cbd2981a97 IB/mthca: Relax UAR size check
There are some cards around that have UAR (user access region) size
different from 8 MB.  Relax our sanity check to make sure that the PCI
BAR is big enough to access the UAR size reported by the device
firmware instead.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-30 15:20:35 -08:00
Michael S. Tsirkin
f9e61929e5 IB/mthca: Use correct GID in MADs sent on port 2
mthca_create_ah() includes the port number in the GID index. The reverse
needs to be done in mthca_read_ah().

Noted by Hal Rosenstock.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-21 14:02:59 -08:00
Michael S. Tsirkin
9eacee2ac6 IB/mthca: Initialize grh_present before using it
build_mlx_header() was using sqp->ud_header.grh_present before it was
initialized by mthca_read_ah().  Furthermore, header->grh_present is
set by ib_ud_header_init, so there's no need to set it again in
mthca_read_ah().

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-12 15:55:41 -08:00
Michael S. Tsirkin
c063a06835 IB/mthca: Cosmetic: use the ALIGN macro
Use the ALIGN macro to simplify some rounding code.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-12 15:43:58 -08:00
Jack Morgenstein
17e2e81951 IB/mthca: Fix memory leaks in error handling
Fix memory leaks in mthca_create_qp() and mthca_create_srq()
error handling.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-12 15:35:15 -08:00
Ishai Rabinovitz
59f174faff IB/mthca: Fix memory leak of multicast group structures
Convert "/ (1 << lg)" to ">> lg" for a slight code size reduction.

add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-24 (-24)
function                                     old     new   delta
mthca_map_cmd                                613     589     -24

Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-12 15:24:51 -08:00
Sean Hefty
cf311cd49a IB: Add node_guid to struct ib_device
Add a node_guid field to struct ib_device.  It is the responsibility
of the low-level driver to initialize this field before registering a
device with the midlayer.  Convert everyone to looking at this field
instead of calling ib_query_device() when all they want is the node
GUID, and remove the node_guid field from struct ib_device_attr.

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-10 07:39:34 -08:00
Roland Dreier
87635b71b5 IB/mthca: Factor common MAD initialization code
Factor out common code for initializing MAD packets, which is shared
by many query routines in mthca_provider.c, into init_query_mad().

add/remove: 1/0 grow/shrink: 0/4 up/down: 16/-44 (-28)
function                                     old     new   delta
init_query_mad                                 -      16     +16
mthca_query_port                             521     518      -3
mthca_query_pkey                             301     294      -7
mthca_query_device                           648     641      -7
mthca_query_gid                              453     426     -27

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-09 15:29:53 -08:00
Roland Dreier
105e50a5e8 IB/mthca: kzalloc conversions
Convert kmalloc()/memset(,0,) pairs to kzalloc().

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-09 15:21:21 -08:00
Michael S. Tsirkin
92898522e3 IB/mthca: prevent event queue overrun
I am seeing EQ overruns in SDP stress tests: if the CQ completion
handler arms a CQ, this could generate more EQEs, so that EQ will
never get empty and consumer index will never get updated.

This is similiar to what we have with command interface:
		/*
		 * cmd_event() may add more commands.
		 * The card will think the queue has overflowed if
		 * we don't tell it we've been processing events.
		 */
However, for completion events, we *don't* want to update the consumer
index on each event. So, perform EQ doorbell coalescing: allocate EQs
with some spare EQEs, and update once we run out of them.

The value 0x80 was selected to avoid any performance impact.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-09 14:04:40 -08:00
Michael S. Tsirkin
6627fa662e IB/mthca: fix page shift calculation in mthca_reg_phys_mr()
For all pages except possibly the last one, the byte beyond the buffer
end must be page aligned.  Therefore, when computing the page shift to
use, OR the end addresses of the buffers as well as the start
addresses into the mask we check.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-09 13:50:57 -08:00
Linus Torvalds
5367f2d67c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband 2006-01-08 20:18:44 -08:00
Tim Schmielau
de25968cc8 [PATCH] fix more missing includes
Include fixes for 2.6.14-git11.  Should allow to remove sched.h from
module.h on i386, x86_64, arm, ia64, ppc, ppc64, and s390.  Probably more
to come since I haven't yet checked the other archs.

Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08 20:13:45 -08:00
Dotan Barak
4de144bf72 IB/mthca: Add support for automatic path migration (APM)
Add code to modify QP operation to handle setting alternate paths for
connected QPs.

Signed-off-by: Dotan Barak <dotanb@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-06 13:23:58 -08:00
Michael S. Tsirkin
0f8e8f9607 IB/mthca: Fill in vendor_err field in completion with error
Fill vendor_err field in completion with error.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-06 13:13:32 -08:00
Jack Morgenstein
5ceb74557c IB/mthca: multiple fixes for multicast group handling
Multicast group management fixes:
. Fix leak of mailbox memory in error handling on multicast group operations.
. Free AMGM indices at detach and in attach error handling.
. Fix amount to shift for aligning next_gid_index in mailbox: it
  starts at bit 6, not bit 5.
. Allocate AMGM index after end of MGM table, in the range num_mgms to
  multicast table size - 1. Add some BUG_ON checks to catch cases
  where the index falls in the MGM hash area.
. Initialize the list of QPs in a newly-allocated group from AMGM to 0
  This is necessary since when a group is moved from AMGM to MGM (in the
  case where the MGM entry has been emptied of QPs), the AMGM entry is
  not reset to 0 (and we don't want an extra command to do that).

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-06 13:11:07 -08:00
Jack Morgenstein
0d3b525fff IB/mthca: fix for RTR-to-RTS transition in modify QP
PKEY_INDEX is not a legal parameter in the RTR->RTS transition.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-06 13:03:43 -08:00
Jack Morgenstein
0364ffc3e8 IB/mthca: fix for SQEr-to-RTS transition in modify QP
Fixes to SQEr->RTS transition in modify_qp:
1. The flag IB_QP_ACCESS_FLAGS is optional for UC qps
2. The SQEr state is not supported for RC qps

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-06 13:01:27 -08:00
Jack Morgenstein
5b3bc7a681 IB/mthca: max_inline_data handling tweaks
Fix a case where copying max_inline_data from a successful create_qp
capabilities output to create_qp input could cause EINVAL error:

mthca_set_qp_size must check max_inline_data directly against
max_desc_sz; checking qp->sq.max_gs is wrong since max_inline_data
depends on the qp type and does not involve max_sg.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-06 12:57:30 -08:00
Michael S. Tsirkin
466200562c IB/mthca: create_eq with size not a power of 2
Fix mthca_create_eq for when the EQ size is not a power of 2.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-05 16:17:38 -08:00
Jack Morgenstein
38d1e79347 IB/mthca: check port validity in modify_qp
Modify_qp should check that the physical port number provided
is a legal value.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-05 16:13:46 -08:00
Jack Morgenstein
aa2f936779 IB/mthca: check return value in mthca_dev_lim call
Check error return on call to mthca_dev_lim for Tavor
(as is done for memfree).

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-05 16:12:01 -08:00
Jack Morgenstein
1d7d2f6f47 IB/mthca: fix WQE size calculation in create-srq
Thinko: 64 bytes is the minimum SRQ WQE size (not the maximum).

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-01-04 14:42:39 -08:00
Jack Morgenstein
c4342d8a4d IB/mthca: Fix corner cases in max_rd_atomic value handling in modify QP
sae and sre bits should only be set when setting sra_max.  Further, in
the old code, if the caller specifies max_rd_atomic = 0, the sre and
sae bits are still set, with the result that the QP ends up with
max_rd_atomic = 1 in effect.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-12-15 19:59:01 -08:00
Jack Morgenstein
d1646f86a2 IB/mthca: Fix IB_QP_ACCESS_FLAGS handling.
This patch corrects some corner cases in managing the RAE/RRE bits in
the mthca qp context.  These bits need to be zero if the user requests
max_dest_rd_atomic of zero.  The bits need to be restored to the value
implied by the qp access flags attribute in a previous (or the
current) modify-qp command if the dest_rd_atomic variable is changed
to non-zero.

In the current implementation, the following scenario will not work:
RESET-to-INIT 	set QP access flags to all disabled (zeroes)
INIT-to-RTR     set max_dest_rd_atomic=10, AND
		set qp_access_flags = IB_ACCESS_REMOTE_READ | IB_ACCESS_REMOTE_ATOMIC

The current code will incorrectly take the access-flags value set in
the RESET-to-INIT transition.

We can simplify, and correct, this IB_QP_ACCESS_FLAGS handling: it is
always safe to set qp access flags in the firmware command if either
of IB_QP_MAX_DEST_RD_ATOMIC or IB_QP_ACCESS_FLAGS is set, so let's
just set it to the correct value, always.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-12-15 14:36:24 -08:00
Jack Morgenstein
576d2e4e40 IB/mthca: Fix SRQ cleanup during QP destroy
When cleaning up a CQ for a QP attached to SRQ, need to free an SRQ
WQE only if the CQE is a receive completion.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-12-15 14:20:23 -08:00
Michael S. Tsirkin
6c7d2a75b5 IB/mthca: Fix thinko in mthca_table_find()
break only escapes from the innermost loop, and we want to escape both
loops and return an answer.  Noticed by Ishai Rabinovitch.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-12-15 13:55:50 -08:00
Jack Morgenstein
44b5b03033 IB/mthca: don't change driver's copy of attributes if modify QP fails
Only change the driver's copy of the QP attributes in modify QP after
checking the modify QP command completed successfully.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-12-09 16:40:14 -08:00
Jack Morgenstein
6aa2e4e806 IB/mthca: correct log2 calculation
Fix thinko in rd_atomic calculation: ffs(x) - 1 does not find the next
power of 2 -- it should be fls(x - 1).

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-12-09 16:38:04 -08:00
Jack Morgenstein
94361cf74a IB/mthca: check RDMA limits
Add limit checking on rd_atomic and dest_rd_atomic attributes:
especially for max_dest_rd_atomic, a value that is larger than HCA
capability can cause RDB overflow and corruption of another QP.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-12-09 16:32:21 -08:00
Jack Morgenstein
52d0df153c IB/mthca: fix memory user DB table leak
Free the memory allocated in mthca_init_user_db_tab() when releasing
the db_tab in mthca_cleanup_user_db_tab().

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-12-09 13:48:50 -08:00
Jack Morgenstein
a3c8ab4fe8 IB/mthca: fix QP size limits for mem-free HCAs
Unlike tavor, the max work queue size is an exact power of 2 for arbel
mode, despite what the documentation (of the QUERY_DEV_LIM firmware
command) says.  Without this patch, on Arbel, we can start with a QP
of a valid size and get above the reported limit after rounding to the
next power of two.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-11-30 09:55:22 -08:00
Michael S. Tsirkin
e0ae9ecf46 IB/mthca: fix posting of send lists of length >= 255 on mem-free HCAs
On mem-free HCAs, when posting a long list of send requests, a
doorbell must be rung every 255 requests.  Add code to handle this.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-11-29 11:33:46 -08:00
Michael S. Tsirkin
187a25863f IB/mthca: reset QP's last pointers when transitioning to reset state
last pointer is not updated when QP is modified to reset state.  This
causes data corruption if WQEs are already posted on the queue.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-11-28 11:19:43 -08:00
Linus Torvalds
9b152d53b7 Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband 2005-11-18 14:33:03 -08:00
Michael S. Tsirkin
48fd0d1fdd IB/mthca: Safer max_send_sge/max_recv_sge calculation
Calculation of QP capabilities still isn't exactly right in mthca:
max_send_sge/max_recv_sge fields returned in create_qp can exceed the
handware supported limits.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-11-18 14:11:17 -08:00
Roland Dreier
cbc5b2bb9e [IB] mthca: don't disable RDMA writes if no responder resources
Responder resources are only required to handle RDMA reads and atomic
operations, not RDMA writes.  So the driver should allow RDMA writes
even if responder resources are set to 0.  This is especially
important for the UC transport -- with the old code, it was impossible
to enable RDMA writes for UC QPs.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-11-15 00:24:23 -08:00
Greg Kroah-Hartman
249bb070f5 [PATCH] PCI: removed unneeded .owner field from struct pci_driver
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2005-11-10 16:09:17 -08:00
Linus Torvalds
78b9c0f91c Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband 2005-11-10 13:27:06 -08:00
Michael S. Tsirkin
ae57e24a40 [IB] mthca: fix posting long lists of receive work requests
In Tavor mode, when posting a long list of receive work requests, a
doorbell must be rung every 256 requests.  Add code to do this when
required.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-11-10 10:22:51 -08:00
Roland Dreier
64044bcf75 [IB] mthca: fix wraparound handling in mthca_cq_clean()
Handle case where prod_index has wrapped around and become less than
cq->cons_index by checking that their difference as a signed int is
positive rather than comparing directly.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-11-10 10:22:51 -08:00
Michael S. Tsirkin
62abb8416f [IB] mthca: fix posting of atomic operations
The size of work requests for atomic operations was computed
incorrectly in mthca: all sizeofs need to be divided by 16.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-11-10 10:22:50 -08:00
Jack Morgenstein
77369ed31d [IB] uverbs: have kernel return QP capabilities
Move the computation of QP capabilities (max scatter/gather entries,
max inline data, etc) into the kernel, and have the uverbs module
return the values as part of the create QP response.  This keeps
precise knowledge of device limits in the low-level kernel driver.

This requires an ABI bump, so while we're making changes, get rid of
the max_sge parameter for the modify SRQ command -- it's not used and
shouldn't be there.

Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il>
Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-11-10 10:22:50 -08:00
Roland Dreier
0b4ff2c0e6 [IB] mthca: fix typo in catastrophic error polling
Fix a typo in the rearming of the catastrophic error polling timer: we
should rearm the timer as long as the stop flag is _not_ set.

Signed-off-by: Roland Dreier <rolandd@cisco.com>
2005-11-10 10:22:50 -08:00
Tim Schmielau
8c65b4a604 [PATCH] fix remaining missing includes
Fix more include file problems that surfaced since I submitted the previous
fix-missing-includes.patch.  This should now allow not to include sched.h
from module.h, which is done by a followup patch.

Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07 07:53:41 -08:00