* git://git.samba.org/sfrench/cifs-2.6:
cifs: Fix oops in session setup code for null user mounts
[CIFS] Update cifs Kconfig title to match removal of experimental dependency
cifs: fix printk format warnings
cifs: check offset in decode_ntlmssp_challenge()
cifs: NULL dereference on allocation failure
put_io_context() performed a complex trylock dancing to avoid
deferring ioc release to workqueue. It was also broken on UP because
trylock was always assumed to succeed which resulted in unbalanced
preemption count.
While there are ways to fix the UP breakage, even the most
pathological microbench (forced ioc allocation and tight fork/exit
loop) fails to show any appreciable performance benefit of the
optimization. Strip it out. If there turns out to be workloads which
are affected by this change, simpler optimization from the discussion
thread can be applied later.
Signed-off-by: Tejun Heo <tj@kernel.org>
LKML-Reference: <1328514611.21268.66.camel@sli10-conroe>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The function ext4_claim_inode() is only called by one function,
ext4_new_inode(), and by folding the functionality into
ext4_new_inode(), we can remove almost 50 lines of code, and put all
of the logic of allocating a new inode into a single place.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Network namespace is taken from request transport and passed as a part of
cb_process_state structure.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch makes nfs_clients_lock allocated per network namespace. All items it
protects are already network namespace aware.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch makes ID's infrastructure network namespace aware. This was done
mainly because of nfs_client_lock, which is desired to be per network
namespace, but protects NFS clients ID's.
NOTE: NFS client's net pointer have to be set prior to ID initialization,
proper assignment was moved.
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch splits global list of NFS servers into per-net-ns array of lists.
This looks more strict and clearer.
BTW, this patch also makes "/proc/fs/nfsfs/volumes" content depends on /proc
mount owner pid namespace. See below for details.
NOTE: few words about how was /proc/fs/nfsfs/ entries content show per network
namespace done. This is a little bit tricky and not the best is could be. But
it's cheap (proper fix for /proc conteinerization is a hard nut to crack).
The idea is simple: take proper network namespace from pid namespace
child reaper nsproxy of /proc/ mount creator.
This actually means, that if there are 2 containers with different net
namespace sharing pid namespace, then read of /proc/fs/nfsfs/ entries will
always return content, taken from net namespace of pid namespace creator task
(and thus second namespace set wil be unvisible).
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch splits global list of NFS clients into per-net-ns array of lists.
This looks more strict and clearer.
BTW, this patch also makes "/proc/fs/nfsfs/servers" entry content depends on
/proc mount owner pid namespace. See below for details.
NOTE: few words about how was /proc/fs/nfsfs/ entries content show per network
namespace done. This is a little bit tricky and not the best is could be. But
it's cheap (proper fix for /proc conteinerization is a hard nut to crack).
The idea is simple: take proper network namespace from pid namespace
child reaper nsproxy of /proc/ mount creator.
This actually means, that if there are 2 containers with different net
namespace sharing pid namespace, then read of /proc/fs/nfsfs/ entries will
always return content, taken from net namespace of pid namespace creator task
(and thus second namespace set wil be unvisible).
Signed-off-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch removes the CONFIG_NFS_USE_NEW_IDMAPPER compile option.
First, the idmapper will attempt to map the id using /sbin/request-key
and nfsidmap. If this fails (if /etc/request-key.conf is not configured
properly) then the idmapper will call the legacy code to perform the
mapping. I left a comment stating where the legacy code begins to make
it easier for somebody to remove in the future.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
data_server_cache entries should only be treated as the same if the address
list hasn't changed.
A new entry will be made when an MDS changes an address list (as seen by
GETDEVINFO). The old entry will be freed once all references are gone.
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch addresses printks that have some context to show that they are
from fs/nfs/, but for the sake of consistency now start with NFS:
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Messages like "Got error -10052 from the server on DESTROY_SESSION. Session
has been destroyed regardless" can be confusing to users who aren't very
familiar with NFS.
NOTE: This patch ignores any printks() that start by printing __func__ - that
will be in a separate patch.
Signed-off-by: Weston Andros Adamson <dros@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
I noticed that test_stateid() was always using the same stateid for open
and lock recovery. After poking around a bit, I discovered that it was
always testing with a delegation stateid (even if there was no
delegation present). I figured this wasn't correct, so now delegation
and open stateids are tested during open_expired() and lock stateids are
tested during lock_expired().
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This takes the guesswork out of what stateid to use. The caller is
expected to figure this out and pass in the correct one.
Signed-off-by: Bryan Schumaker <bjschuma@netapp.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Setting the task name is done within setup_new_exec() by accessing
bprm->filename. However this happens after flush_old_exec().
This may result in a use after free bug, flush_old_exec() may
"complete" vfork_done, which will wake up the parent which in turn
may free the passed in filename.
To fix this add a new tcomm field in struct linux_binprm which
contains the now early generated task name until it is used.
Fixes this bug on s390:
Unable to handle kernel pointer dereference at virtual kernel address 0000000039768000
Process kworker/u:3 (pid: 245, task: 000000003a3dc840, ksp: 0000000039453818)
Krnl PSW : 0704000180000000 0000000000282e94 (setup_new_exec+0xa0/0x374)
Call Trace:
([<0000000000282e2c>] setup_new_exec+0x38/0x374)
[<00000000002dd12e>] load_elf_binary+0x402/0x1bf4
[<0000000000280a42>] search_binary_handler+0x38e/0x5bc
[<0000000000282b6c>] do_execve_common+0x410/0x514
[<0000000000282cb6>] do_execve+0x46/0x58
[<00000000005bce58>] kernel_execve+0x28/0x70
[<000000000014ba2e>] ____call_usermodehelper+0x102/0x140
[<00000000005bc8da>] kernel_thread_starter+0x6/0xc
[<00000000005bc8d4>] kernel_thread_starter+0x0/0xc
Last Breaking-Event-Address:
[<00000000002830f0>] setup_new_exec+0x2fc/0x374
Kernel panic - not syncing: Fatal exception: panic_on_oops
Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Fix a regression in 16-bit Atmel NAND flash which was introduced in 3.1
- Fix breakage with MTD suspend caused by the API rework
- Fix a problem with resetting the MX28 BCH module
- A couple of other trivial fixes
* tag 'for-linus-3.3-20120204' of git://git.infradead.org/~dwmw2/mtd-3.3:
Revert "mtd: atmel_nand: optimize read/write buffer functions"
mtd: fix MTD suspend
jffs2: do not initialize variable unnecessarily
mtd: gpmi-nand bugfix: reset the BCH module when it is not MX23
mtd: nand: fix typo in comment
Commit bf118a342f (NFSv4: include bitmap
in nfsv4 get acl data) introduces the 'acl_scratch' page for the case
where we may need to decode multi-page data. However it fails to take
into account the fact that the variable may be NULL (for the case where
we're not doing multi-page decode), and it also attaches it to the
encoding xdr_stream rather than the decoding one.
The immediate result is an Oops in nfs4_xdr_enc_getacl due to the
call to page_address() with a NULL page pointer.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Andy Adamson <andros@netapp.com>
Cc: stable@vger.kernel.org
The rpc buffers will be allocated out of low memory, so we should really
only be taking that into account.
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Move calculation of the default into a helper function.
Get rid of an unused variable "err" while we're there.
Thanks to Mi Jinlong for catching an arithmetic error in a previous
version.
Cc: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
We check for zero length strings in the caller now, so these aren't
needed.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Define new macro XFS_ALL_QUOTA_ACTIVE and simply some usage
of quota macros.
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Change xfs_sb_from_disk() interface to take a mount pointer
instead of a superblock pointer.
This is to print mount point specific error messages in future
fixes.
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Define a new function xfs_inode_dquot() that takes a inode pointer
and a disk quota type and returns the quota pointer for the specified
quota type.
This simplifies the xfs_qm_dqget() error path significantly.
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
Create a new function xfs_this_quota_on() that takes a xfs_mount
data structure and a disk quota type and returns true if the specified
type of quota is ON in the xfs_mount data structure.
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
rbd: fix safety of rbd_put_client()
rbd: fix a memory leak in rbd_get_client()
ceph: create a new session lock to avoid lock inversion
ceph: fix length validation in parse_reply_info()
ceph: initialize client debugfs outside of monc->mutex
ceph: change "ceph.layout" xattr to be "ceph.file.layout"
Removing the macro, as this is no more needed in the code.
Tried to find the reference when it was last used - but the usage
for this seemed to have been dropped long time ago.
Signed-off-by: Amit Sahrawat <amit.sahrawat83@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
This fixes the race in process_vm_core found by Oleg (see
http://article.gmane.org/gmane.linux.kernel/1235667/
for details).
This has been updated since I last sent it as the creation of the new
mm_access() function did almost exactly the same thing as parts of the
previous version of this patch did.
In order to use mm_access() even when /proc isn't enabled, we move it to
kernel/fork.c where other related process mm access functions already
are.
Signed-off-by: Chris Yeoh <yeohc@au1.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Lockdep was reporting a possible circular lock dependency in
dentry_lease_is_valid(). That function needs to sample the
session's s_cap_gen and and s_cap_ttl fields coherently, but needs
to do so while holding a dentry lock. The s_cap_lock field was
being used to protect the two fields, but that can't be taken while
holding a lock on a dentry within the session.
In most cases, the s_cap_gen and s_cap_ttl fields only get operated
on separately. But in three cases they need to be updated together.
Implement a new lock to protect the spots updating both fields
atomically is required.
Signed-off-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Sage Weil <sage@newdream.net>
"len" is read from network and thus needs validation. Otherwise, given
a bogus "len" value, p+len could be an out-of-bounds pointer, which is
used in further parsing.
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Sage Weil <sage@newdream.net>
The virtual extended attribute named "ceph.layout" is meaningful
only for regular files. Change its name to be "ceph.file.layout" to
more directly reflect that in the ceph xattr namespace. Preserve
the old "ceph.layout" name for the time being (until we decide it's
safe to get rid of it entirely).
Add a missing initializer for "readonly" in the terminating entry.
Signed-off-by: Alex Elder <elder@dreamhost.com>
Reviewed-by: Sage Weil <sage@newdream.net>
This was done to resolve a merge and build problem with the
drivers/acpi/processor_driver.c file.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
There are no functional changes. Just code motion to make it
clear that we don't follow a link between sysctl roots unless the
directory entry actually is a link.
Suggested-by: Lucian Adrian Grijincu <lucian.grijincu@gmail.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Document get_subdir and that find_subdir alwasy takes a reference.
Suggested-by: Lucian Adrian Grijincu <lucian.grijincu@gmail.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
When insert_header fails ensure we return the proper error value
from get_subdir. In practice nothing cares, but there is no
need to be sloppy.
Reported-by: Lucian Adrian Grijincu <lucian.grijincu@gmail.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Once /proc/pid/mem is opened, the memory can't be released until
mem_release() even if its owner exits.
Change mem_open() to do atomic_inc(mm_count) + mmput(), this only
pins mm_struct. Change mem_rw() to do atomic_inc_not_zero(mm_count)
before access_remote_vm(), this verifies that this mm is still alive.
I am not sure what should mem_rw() return if atomic_inc_not_zero()
fails. With this patch it returns zero to match the "mm == NULL" case,
may be it should return -EINVAL like it did before e268337d.
Perhaps it makes sense to add the additional fatal_signal_pending()
check into the main loop, to ensure we do not hold this memory if
the target task was oom-killed.
Cc: stable@kernel.org
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
No functional changes, cleanup and preparation.
mem_read() and mem_write() are very similar. Move this code into the
new common helper, mem_rw(), which takes the additional "int write"
argument.
Cc: stable@kernel.org
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch fixes merge conflict resolution breakage introduced by merge
d3712b9dfc ("Merge tag 'for-linus' of git://github.com/prasad-joshi/logfs_upstream").
The commit changed 'mtd_can_have_bb()' function and made it always
return zero, which is incorrect. Instead, we need it to return whether
the underlying flash device can have bad eraseblocks or not. UBI needs
this information because it affects how it handles the underlying flash.
E.g., if the underlying flash is NOR, it cannot have bad blocks and any
write or erase error is fatal, and all we can do is to switch to R/O
mode. We do not need to reserve a pool of good eraseblocks for bad
eraseblocks handling, and so on.
This patch also removes 'mtd_can_have_bb()' invocations from Logfs to
ensure correct Logfs behavior.
I've tested that with this patch UBI works on top of NOR and NAND
flashes emulated by mtdram and nandsim correspondingly.
This patch is based on patch from Linus Torvalds.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Acked-by: Jörn Engel <joern@logfs.org>
Acked-by: Prasad Joshi <prasadjoshi.linux@gmail.com>
Acked-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
bdi_prune_sb() resets sb->s_bdi to default_backing_dev_info when the
tearing down the original bdi. Fix trace_writeback_single_inode to
use sb->s_bdi=default_backing_dev_info rather than bdi->dev=NULL for a
teared down bdi.
Cc: <stable@kernel.org>
Reported-by: Rabin Vincent <rabin@rab.in>
Tested-by: Rabin Vincent <rabin@rab.in>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>