android_kernel_google_msm

mirror of https://github.com/followmsi/android_kernel_google_msm.git synced 2024-11-06 23:17:41 +00:00

Author	SHA1	Message	Date
Alex Elder	70b06043cd	libceph: just set SOCK_CLOSED when state changes (cherry picked from commit `d65c9e0b9e`) When a TCP_CLOSE or TCP_CLOSE_WAIT event occurs, the SOCK_CLOSED connection flag bit is set, and if it had not been previously set queue_con() is called to ensure con_work() will get a chance to handle the changed state. con_work() atomically checks--and if set, clears--the SOCK_CLOSED bit if it was set. This means that even if the bit were set repeatedly, the related processing in con_work() only gets called once per transition of the bit from 0 to 1. What's important then is that we ensure con_work() gets called at least once when a socket close event occurs, not that it gets called exactly once. The work queue mechanism already takes care of queueing work only if it is not already queued, so there's no need for us to call queue_con() conditionally. So this patch just makes it so the SOCK_CLOSED flag gets set unconditionally in ceph_sock_state_change(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:27 -08:00
Alex Elder	54942c5326	libceph: don't change socket state on sock event (cherry picked from commit `188048bce3`) Currently the socket state change event handler records an error message on a connection to distinguish a close while connecting from a close while a connection was already established. Changing connection information during handling of a socket event is not very clean, so instead move this assignment inside con_work(), where it can be done during normal connection-level processing (and under protection of the connection mutex as well). Move the handling of a socket closed event up to the top of the processing loop in con_work(); there's no point in handling backoff etc. if we have a newly-closed socket to take care of. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:27 -08:00
Alex Elder	eb8c5642db	libceph: SOCK_CLOSED is a flag, not a state (cherry picked from commit `a8d00e3cde`) The following commit changed it so SOCK_CLOSED bit was stored in a connection's new "flags" field rather than its "state" field. libceph: start separating connection flags from state commit `928443cd` That bit is used in con_close_socket() to protect against setting an error message more than once in the socket event handler function. Unfortunately, the field being operated on in that function was not updated to be "flags" as it should have been. This fixes that error. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:27 -08:00
Alex Elder	f8920642ec	libceph: don't use bio_iter as a flag (cherry picked from commit `abdaa6a849`) Recently a bug was fixed in which the bio_iter field in a ceph message was not being properly re-initialized when a message got re-transmitted: commit `43643528cc` Author: Yan, Zheng <zheng.z.yan@intel.com> rbd: Clear ceph_msg->bio_iter for retransmitted message We are now only initializing the bio_iter field when we are about to start to write message data (in prepare_write_message_data()), rather than every time we are attempting to write any portion of the message data (in write_partial_msg_pages()). This means we no longer need to use the msg->bio_iter field as a flag. So just don't do that any more. Trust prepare_write_message_data() to ensure msg->bio_iter is properly initialized, every time we are about to begin writing (or re-writing) a message's bio data. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:26 -08:00
Alex Elder	67e5007aca	libceph: move init of bio_iter (cherry picked from commit `572c588eda`) If a message has a non-null bio pointer, its bio_iter field is initialized in write_partial_msg_pages() if this has not been done already. This is really a one-time setup operation for sending a message's (bio) data, so move that initialization code into prepare_write_message_data() which serves that purpose. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:26 -08:00
Alex Elder	ec53635e8f	libceph: move init_bio_*() functions up (cherry picked from commit `df6ad1f973`) Move init_bio_iter() and iter_bio_next() up in their source file so the'll be defined before they're needed. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:26 -08:00
Alex Elder	3b17b0bb2b	libceph: don't mark footer complete before it is (cherry picked from commit `fd154f3c75`) This is a nit, but prepare_write_message() sets the FOOTER_COMPLETE flag before the CRC for the data portion (recorded in the footer) has been completely computed. Hold off setting the complete flag until we've decided it's ready to send. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:26 -08:00
Alex Elder	3c968ed12f	libceph: encapsulate advancing msg page (cherry picked from commit `84ca8fc87f`) In write_partial_msg_pages(), once all the data from a page has been sent we advance to the next one. Put the code that takes care of this into its own function. While modifying write_partial_msg_pages(), make its local variable "in_trail" be Boolean, and use the local variable "msg" (which is just the connection's current out_msg pointer) consistently. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:26 -08:00
Alex Elder	4ecff48cef	libceph: encapsulate out message data setup (cherry picked from commit `739c905baa`) Move the code that prepares to write the data portion of a message into its own function. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:25 -08:00
Sage Weil	9021a42c79	libceph: drop ceph_con_get/put helpers and nref member (cherry picked from commit `d59315ca8c`) These are no longer used. Every ceph_connection instance is embedded in another structure, and refcounts manipulated via the get/put ops. Signed-off-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:25 -08:00
Sage Weil	1c623b046a	libceph: use con get/put methods (cherry picked from commit `36eb71aa57`) The ceph_con_get/put() helpers manipulate the embedded con ref count, which isn't used now that ceph_connections are embedded in other structures. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:25 -08:00
Dan Carpenter	8124d55a2d	libceph: fix NULL dereference in reset_connection() (cherry picked from commit `26ce171915`) We dereference "con->in_msg" on the line after it was set to NULL. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Alex Elder <elder@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:25 -08:00
Sage Weil	fccbf066b3	libceph: transition socket state prior to actual connect (cherry picked from commit `89a86be0ce`) Once we call ->connect(), we are racing against the actual connection, and a subsequent transition from CONNECTING -> CONNECTED. Set the state to CONNECTING before that, under the protection of the mutex, to avoid the race. This was introduced in `928443cd96`, with the original socket state code. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:25 -08:00
Xi Wang	f4d29a959a	libceph: fix overflow in osdmap_apply_incremental() (cherry picked from commit `a550604950`) On 32-bit systems, a large `pglen' would overflow `pglensizeof(u32)' and bypass the check ceph_decode_need(p, end, pglensizeof(u32), bad). It would also overflow the subsequent kmalloc() size, leading to out-of-bounds write. Signed-off-by: Xi Wang <xi.wang@gmail.com> Reviewed-by: Alex Elder <elder@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:24 -08:00
Xi Wang	6b71f61c32	libceph: fix overflow in osdmap_decode() (cherry picked from commit `e91a9b639a`) On 32-bit systems, a large `n' would overflow `n * sizeof(u32)' and bypass the check ceph_decode_need(p, end, n * sizeof(u32), bad). It would also overflow the subsequent kmalloc() size, leading to out-of-bounds write. Signed-off-by: Xi Wang <xi.wang@gmail.com> Reviewed-by: Alex Elder <elder@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:24 -08:00
Xi Wang	c66a9c7c10	libceph: fix overflow in __decode_pool_names() (cherry picked from commit `ad3b904c07`) `len' is read from network and thus needs validation. Otherwise a large `len' would cause out-of-bounds access via the memcpy() call. In addition, len = 0xffffffff would overflow the kmalloc() size, leading to out-of-bounds write. This patch adds a check of `len' via ceph_decode_need(). Also use kstrndup rather than kmalloc/memcpy. [elder@inktank.com: added -ENOMEM return for null kstrndup() result] Signed-off-by: Xi Wang <xi.wang@gmail.com> Reviewed-by: Alex Elder <elder@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:24 -08:00
Alex Elder	ce4516fbb4	libceph: make ceph_con_revoke_message() a msg op (cherry picked from commit `8921d114f5`) ceph_con_revoke_message() is passed both a message and a ceph connection. A ceph_msg allocated for incoming messages on a connection always has a pointer to that connection, so there's no need to provide the connection when revoking such a message. Note that the existing logic does not preclude the message supplied being a null/bogus message pointer. The only user of this interface is the OSD client, and the only value an osd client passes is a request's r_reply field. That is always non-null (except briefly in an error path in ceph_osdc_alloc_request(), and that drops the only reference so the request won't ever have a reply to revoke). So we can safely assume the passed-in message is non-null, but add a BUG_ON() to make it very obvious we are imposing this restriction. Rename the function ceph_msg_revoke_incoming() to reflect that it is really an operation on an incoming message. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:24 -08:00
Alex Elder	ae048538ab	libceph: make ceph_con_revoke() a msg operation (cherry picked from commit `6740a845b2`) ceph_con_revoke() is passed both a message and a ceph connection. Now that any message associated with a connection holds a pointer to that connection, there's no need to provide the connection when revoking a message. This has the added benefit of precluding the possibility of the providing the wrong connection pointer. If the message's connection pointer is null, it is not being tracked by any connection, so revoking it is a no-op. This is supported as a convenience for upper layers, so they can revoke a message that is not actually "in flight." Rename the function ceph_msg_revoke() to reflect that it is really an operation on a message, not a connection. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:24 -08:00
Alex Elder	bfd357201d	libceph: have messages take a connection reference (cherry picked from commit `92ce034b5a`) There are essentially two types of ceph messages: incoming and outgoing. Outgoing messages are always allocated via ceph_msg_new(), and at the time of their allocation they are not associated with any particular connection. Incoming messages are always allocated via ceph_con_in_msg_alloc(), and they are initially associated with the connection from which incoming data will be placed into the message. When an outgoing message gets sent, it becomes associated with a connection and remains that way until the message is successfully sent. The association of an incoming message goes away at the point it is sent to an upper layer via a con->ops->dispatch method. This patch implements reference counting for all ceph messages, such that every message holds a reference (and a pointer) to a connection if and only if it is associated with that connection (as described above). For background, here is an explanation of the ceph message lifecycle, emphasizing when an association exists between a message and a connection. Outgoing Messages An outgoing message is "owned" by its allocator, from the time it is allocated in ceph_msg_new() up to the point it gets queued for sending in ceph_con_send(). Prior to that point the message's msg->con pointer is null; at the point it is queued for sending its message pointer is assigned to refer to the connection. At that time the message is inserted into a connection's out_queue list. When a message on the out_queue list has been sent to the socket layer to be put on the wire, it is transferred out of that list and into the connection's out_sent list. At that point it is still owned by the connection, and will remain so until an acknowledgement is received from the recipient that indicates the message was successfully transferred. When such an acknowledgement is received (in process_ack()), the message is removed from its list (in ceph_msg_remove()), at which point it is no longer associated with the connection. So basically, any time a message is on one of a connection's lists, it is associated with that connection. Reference counting outgoing messages can thus be done at the points a message is added to the out_queue (in ceph_con_send()) and the point it is removed from either its two lists (in ceph_msg_remove())--at which point its connection pointer becomes null. Incoming Messages When an incoming message on a connection is getting read (in read_partial_message()) and there is no message in con->in_msg, a new one is allocated using ceph_con_in_msg_alloc(). At that point the message is associated with the connection. Once that message has been completely and successfully read, it is passed to upper layer code using the connection's con->ops->dispatch method. At that point the association between the message and the connection no longer exists. Reference counting of connections for incoming messages can be done by taking a reference to the connection when the message gets allocated, and releasing that reference when it gets handed off using the dispatch method. We should never fail to get a connection reference for a message--the since the caller should already hold one. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:23 -08:00
Alex Elder	e84e066e5c	libceph: have messages point to their connection (cherry picked from commit `38941f8031`) When a ceph message is queued for sending it is placed on a list of pending messages (ceph_connection->out_queue). When they are actually sent over the wire, they are moved from that list to another (ceph_connection->out_sent). When acknowledgement for the message is received, it is removed from the sent messages list. During that entire time the message is "in the possession" of a single ceph connection. Keep track of that connection in the message. This will be used in the next patch (and is a helpful bit of information for debugging anyway). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:23 -08:00
Alex Elder	35067a2068	libceph: tweak ceph_alloc_msg() (cherry picked from commit `1c20f2d267`) The function ceph_alloc_msg() is only used to allocate a message that will be assigned to a connection's in_msg pointer. Rename the function so this implied usage is more clear. In addition, make that assignment inside the function (again, since that's precisely what it's intended to be used for). This allows us to return what is now provided via the passed-in address of a "skip" variable. The return type is now Boolean to be explicit that there are only two possible outcomes. Make sure the result of an ->alloc_msg method call always sets the value of *skip properly. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:23 -08:00
Alex Elder	6880138c03	libceph: fully initialize connection in con_init() (cherry picked from commit `1bfd89f4e6`) Move the initialization of a ceph connection's private pointer, operations vector pointer, and peer name information into ceph_con_init(). Rearrange the arguments so the connection pointer is first. Hide the byte-swapping of the peer entity number inside ceph_con_init() Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:23 -08:00
Alex Elder	9403ae33bf	libceph: init monitor connection when opening (cherry picked from commit `20581c1faf`) Hold off initializing a monitor client's connection until just before it gets opened for use. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:23 -08:00
Sage Weil	a2b87615e2	libceph: drop connection refcounting for mon_client (cherry picked from commit `ec87ef4309`) All references to the embedded ceph_connection come from the msgr workqueue, which is drained prior to mon_client destruction. That means we can ignore con refcounting entirely. Signed-off-by: Sage Weil <sage@newdream.net> Reviewed-by: Alex Elder <elder@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:22 -08:00
Alex Elder	31a84d8343	libceph: embed ceph connection structure in mon_client (cherry picked from commit `67130934fb`) A monitor client has a pointer to a ceph connection structure in it. This is the only one of the three ceph client types that do it this way; the OSD and MDS clients embed the connection into their main structures. There is always exactly one ceph connection for a monitor client, so there is no need to allocate it separate from the monitor client structure. So switch the ceph_mon_client structure to embed its ceph_connection structure. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:22 -08:00
Alex Elder	51588ed26f	libceph: set CLOSED state bit in con_init (cherry picked from commit `a5988c490e`) Once a connection is fully initialized, it is really in a CLOSED state, so make that explicit by setting the bit in its state field. It is possible for a connection in NEGOTIATING state to get a failure, leading to ceph_fault() and ultimately ceph_con_close(). Clear that bits if it is set in that case, to reflect that the connection truly is closed and is no longer participating in a connect sequence. Issue a warning if ceph_con_open() is called on a connection that is not in CLOSED state. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:22 -08:00
Alex Elder	d39319ee9b	libceph: provide osd number when creating osd (cherry picked from commit `e10006f807`) Pass the osd number to the create_osd() routine, and move the initialization of fields that depend on it therein. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:22 -08:00
Alex Elder	0bcd157774	libceph: start tracking connection socket state (cherry picked from commit `ce2c8903e7`) Start explicitly keeping track of the state of a ceph connection's socket, separate from the state of the connection itself. Create placeholder functions to encapsulate the state transitions. -------- \| NEW* \| transient initial state -------- \| con_sock_state_init() v ---------- \| CLOSED \| initialized, but no socket (and no ---------- TCP connection) ^ \ \| \ con_sock_state_connecting() \| ---------------------- \| \ + con_sock_state_closed() \ \|\ \ \| \ \ \| ----------- \ \| \| CLOSING \| socket event; \ \| ----------- await close \ \| ^ \| \| \| \| \| + con_sock_state_closing() \| \| / \ \| \| / --------------- \| \| / \ v \| / -------------- \| / -----------------\| CONNECTING \| socket created, TCP \| \| / -------------- connect initiated \| \| \| con_sock_state_connected() \| \| v ------------- \| CONNECTED \| TCP connection established ------------- Make the socket state an atomic variable, reinforcing that it's a distinct transtion with no possible "intermediate/both" states. This is almost certainly overkill at this point, though the transitions into CONNECTED and CLOSING state do get called via socket callback (the rest of the transitions occur with the connection mutex held). We can back out the atomicity later. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil<sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:22 -08:00
Alex Elder	bc327474a0	libceph: start separating connection flags from state (cherry picked from commit `928443cd96`) A ceph_connection holds a mixture of connection state (as in "state machine" state) and connection flags in a single "state" field. To make the distinction more clear, define a new "flags" field and use it rather than the "state" field to hold Boolean flag values. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil<sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:21 -08:00
Alex Elder	d910c114b6	libceph: embed ceph messenger structure in ceph_client (cherry picked from commit `15d9882c33`) A ceph client has a pointer to a ceph messenger structure in it. There is always exactly one ceph messenger for a ceph client, so there is no need to allocate it separate from the ceph client structure. Switch the ceph_client structure to embed its ceph_messenger structure. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Yehuda Sadeh <yehuda@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:21 -08:00
Alex Elder	4874ba9c07	libceph: rename kvec_reset and kvec_add functions (cherry picked from commit `e22004235a`) The functions ceph_con_out_kvec_reset() and ceph_con_out_kvec_add() are entirely private functions, so drop the "ceph_" prefix in their name to make them slightly more wieldy. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Yehuda Sadeh <yehuda@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:21 -08:00
Alex Elder	f5e79a4430	libceph: rename socket callbacks (cherry picked from commit `327800bdc2`) Change the names of the three socket callback functions to make it more obvious they're specifically associated with a connection's socket (not the ceph connection that uses it). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Yehuda Sadeh <yehuda@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:21 -08:00
Alex Elder	809c58f1bd	libceph: kill bad_proto ceph connection op (cherry picked from commit `6384bb8b8e`) No code sets a bad_proto method in its ceph connection operations vector, so just get rid of it. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Yehuda Sadeh <yehuda@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:21 -08:00
Alex Elder	ac7a426817	libceph: eliminate connection state "DEAD" (cherry picked from commit `e5e372da9a`) The ceph connection state "DEAD" is never set and is therefore not needed. Eliminate it. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Yehuda Sadeh <yehuda@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:20 -08:00
Yan, Zheng	e7fda85c9d	ceph: check PG_Private flag before accessing page->private (cherry picked from commit `28c0254ede`) I got lots of NULL pointer dereference Oops when compiling kernel on ceph. The bug is because the kernel page migration routine replaces some pages in the page cache with new pages, these new pages' private can be non-zero. Signed-off-by: Zheng Yan <zheng.z.yan@intel.com> Signed-off-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:20 -08:00
Yan, Zheng	9923ad77a6	rbd: Fix ceph_snap_context size calculation (cherry picked from commit `f9f9a19044`) ceph_snap_context->snaps is an u64 array Signed-off-by: Zheng Yan <zheng.z.yan@intel.com> Reviewed-by: Alex Elder <elder@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:10 -08:00
Josh Durgin	6448acf6e6	rbd: store snapshot id instead of index (cherry picked from commit `77dfe99fe3`) When a device was open at a snapshot, and snapshots were deleted or added, data from the wrong snapshot could be read. Instead of assuming the snap context is constant, store the actual snap id when the device is initialized, and rely on the OSDs to signal an error if we try reading from a snapshot that was deleted. Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Reviewed-by: Alex Elder <elder@dreamhost.com> Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:10 -08:00
Josh Durgin	4573951472	rbd: protect read of snapshot sequence number (cherry picked from commit `403f24d3d5`) This is updated whenever a snapshot is added or deleted, and the snapc pointer is changed with every refresh of the header. Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com> Reviewed-by: Alex Elder <elder@dreamhost.com> Reviewed-by: Yehuda Sadeh <yehuda@hq.newdream.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:10 -08:00
Alex Elder	095cb2142d	rbd: don't hold spinlock during messenger flush (cherry picked from commit `cd9d9f5df6`) A recent change made changes to the rbd_client_list be protected by a spinlock. Unfortunately in rbd_put_client(), the lock is taken before possibly dropping the last reference to an rbd_client, and on the last reference that eventually calls flush_workqueue() which can sleep. The problem was flagged by a debug spinlock warning: BUG: spinlock wrong CPU on CPU#3, rbd/27814 The solution is to move the spinlock acquisition and release inside rbd_client_release(), which is the spot where it's really needed for protecting the removal of the rbd_client from the client list. Signed-off-by: Alex Elder <elder@dreamhost.com> Reviewed-by: Sage Weil <sage@newdream.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:10 -08:00
Sage Weil	49da293c7d	libceph: fix messenger retry (cherry picked from commit `5bdca4e076`) In ancient times, the messenger could both initiate and accept connections. An artifact if that was data structures to store/process an incoming ceph_msg_connect request and send an outgoing ceph_msg_connect_reply. Sadly, the negotiation code was referencing those structures and ignoring important information (like the peer's connect_seq) from the correct ones. Among other things, this fixes tight reconnect loops where the server sends RETRY_SESSION and we (the client) retries with the same connect_seq as last time. This bug pretty easily triggered by injecting socket failures on the MDS and running some fs workload like workunits/direct_io/test_sync_io. Signed-off-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:10 -08:00
Sage Weil	21cbad59b0	libceph: flush msgr queue during mon_client shutdown (cherry picked from commit `f3dea7edd3`) (cherry picked from commit `642c0dbde3`) We need to flush the msgr workqueue during mon_client shutdown to ensure that any work affecting our embedded ceph_connection is finished so that we can be safely destroyed. Previously, we were flushing the work queue after osd_client shutdown and before mon_client shutdown to ensure that any osd connection refs to authorizers are flushed. Remove the redundant flush, and document in the comment that the mon_client flush is needed to cover that case as well. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:09 -08:00
Yan, Zheng	576e428b24	rbd: Clear ceph_msg->bio_iter for retransmitted message (cherry picked from commit `43643528cc`) (cherry picked from commit `b132cf4c73`) The bug can cause NULL pointer dereference in write_partial_msg_pages Signed-off-by: Zheng Yan <zheng.z.yan@intel.com> Reviewed-by: Alex Elder <elder@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:09 -08:00
Sage Weil	acecca4878	libceph: use con get/put ops from osd_client (cherry picked from commit `0d47766f14`) There were a few direct calls to ceph_con_{get,put}() instead of the con ops from osd_client.c. This is a bug since those ops aren't defined to be ceph_con_get/put. This breaks refcounting on the ceph_osd structs that contain the ceph_connections, and could lead to all manner of strangeness. The purpose of the ->get and ->put methods in a ceph connection are to allow the connection to indicate it has a reference to something external to the messaging system, not to indicate something external has a reference to the connection. [elder@inktank.com: added that last sentence] Signed-off-by: Sage Weil <sage@newdream.net> Reviewed-by: Alex Elder <elder@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit `88ed6ea0b2`)	2012-11-26 11:38:09 -08:00
Alex Elder	40971fcf15	libceph: osd_client: don't drop reply reference too early (cherry picked from commit `ab8cb34a4b`) In ceph_osdc_release_request(), a reference to the r_reply message is dropped. But just after that, that same message is revoked if it was in use to receive an incoming reply. Reorder these so we are sure we hold a reference until we're actually done with the message. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit `680584fab0`)	2012-11-26 11:38:09 -08:00
Sage Weil	1c201dffa3	libceph: fix pg_temp updates (cherry picked from commit `6bd9adbdf9`) Usually, we are adding pg_temp entries or removing them. Occasionally they update. In that case, osdmap_apply_incremental() was failing because the rbtree entry already exists. Fix by removing the existing entry before inserting a new one. Fixes http://tracker.newdream.net/issues/2446 Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:09 -08:00
Sage Weil	15ba38ebce	libceph: avoid unregistering osd request when not registered (cherry picked from commit `35f9f8a09e`) There is a race between two __unregister_request() callers: the reply path and the ceph_osdc_wait_request(). If we get a reply and the timeout expires at roughly the same time, both callers will try to unregister the request, and the second one will do bad things. Simply check if the request is still already unregistered; if so, return immediately and do nothing. Fixes http://tracker.newdream.net/issues/2420 Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Alex Elder <elder@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:08 -08:00
Alex Elder	1d3df0e266	ceph: add auth buf in prepare_write_connect() (cherry picked from commit `3da54776e2`) Move the addition of the authorizer buffer to a connection's out_kvec out of get_connect_authorizer() and into its caller. This way, the caller--prepare_write_connect()--can avoid adding the connect header to out_kvec before it has been fully initialized. Prior to this patch, it was possible for a connect header to be sent over the wire before the authorizer protocol or buffer length fields were initialized. An authorizer buffer associated with that header could also be queued to send only after the connection header that describes it was on the wire. Fixes http://tracker.newdream.net/issues/2424 Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:08 -08:00
Alex Elder	3f13447d2c	ceph: rename prepare_connect_authorizer() (cherry picked from commit `dac1e716c6`) Change the name of prepare_connect_authorizer(). The next patch is going to make this function no longer add anything to the connection's out_kvec, so it will no longer fit the pattern of the rest of the prepare_connect_*() functions. In addition, pass the address of a variable that will hold the authorization protocol to use. Move the assignment of that to the connection's out_connect structure into prepare_write_connect(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:08 -08:00
Alex Elder	8d19055c84	ceph: return pointer from prepare_connect_authorizer() (cherry picked from commit `729796be91`) Change prepare_connect_authorizer() so it returns a pointer (or pointer-coded error). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:08 -08:00
Alex Elder	ed35fbcd3c	ceph: use info returned by get_authorizer (cherry picked from commit `8f43fb5389`) Rather than passing a bunch of arguments to be filled in with the content of the ceph_auth_handshake buffer now returned by the get_authorizer method, just use the returned information in the caller, and drop the unnecessary arguments. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-11-26 11:38:08 -08:00

1 2 3 4 5 ...

301929 commits