cascardo/linux.git
8 years agomm: update min_free_kbytes from khugepaged after core initialization
Jason Baron [Thu, 5 May 2016 23:22:12 +0000 (16:22 -0700)]
mm: update min_free_kbytes from khugepaged after core initialization

Khugepaged attempts to raise min_free_kbytes if its set too low.
However, on boot khugepaged sets min_free_kbytes first from
subsys_initcall(), and then the mm 'core' over-rides min_free_kbytes
after from init_per_zone_wmark_min(), via a module_init() call.

Khugepaged used to use a late_initcall() to set min_free_kbytes (such
that it occurred after the core initialization), however this was
removed when the initialization of min_free_kbytes was integrated into
the starting of the khugepaged thread.

The fix here is simply to invoke the core initialization using a
core_initcall() instead of module_init(), such that the previous
initialization ordering is restored.  I didn't restore the
late_initcall() since start_stop_khugepaged() already sets
min_free_kbytes via set_recommended_min_free_kbytes().

This was noticed when we had a number of page allocation failures when
moving a workload to a kernel with this new initialization ordering.  On
an 8GB system this restores min_free_kbytes back to 67584 from 11365
when CONFIG_TRANSPARENT_HUGEPAGE=y is set and either
CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y or
CONFIG_TRANSPARENT_HUGEPAGE_MADVISE=y.

Fixes: 79553da293d3 ("thp: cleanup khugepaged startup")
Signed-off-by: Jason Baron <jbaron@akamai.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agohuge pagecache: mmap_sem is unlocked when truncation splits pmd
Hugh Dickins [Thu, 5 May 2016 23:22:09 +0000 (16:22 -0700)]
huge pagecache: mmap_sem is unlocked when truncation splits pmd

zap_pmd_range()'s CONFIG_DEBUG_VM !rwsem_is_locked(&mmap_sem) BUG() will
be invalid with huge pagecache, in whatever way it is implemented:
truncation of a hugely-mapped file to an unhugely-aligned size would
easily hit it.

(Although anon THP could in principle apply khugepaged to private file
mappings, which are not excluded by the MADV_HUGEPAGE restrictions, in
practice there's a vm_ops check which excludes them, so it never hits
this BUG() - there's no interface to "truncate" an anonymous mapping.)

We could complicate the test, to check i_mmap_rwsem also when there's a
vm_file; but my inclination was to make zap_pmd_range() more readable by
simply deleting this check.  A search has shown no report of the issue
in the years since commit e0897d75f0b2 ("mm, thp: print useful
information when mmap_sem is unlocked in zap_pmd_range") expanded it
from VM_BUG_ON() - though I cannot point to what commit I would say then
fixed the issue.

But there are a couple of other patches now floating around, neither yet
in the tree: let's agree to retain the check as a VM_BUG_ON_VMA(), as
Matthew Wilcox has done; but subject to a vma_is_anonymous() check, as
Kirill Shutemov has done.  And let's get this in, without waiting for
any particular huge pagecache implementation to reach the tree.

Matthew said "We can reproduce this BUG() in the current Linus tree with
DAX PMDs".

Signed-off-by: Hugh Dickins <hughd@google.com>
Tested-by: Matthew Wilcox <willy@linux.intel.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Yang Shi <yang.shi@linaro.org>
Cc: Ning Qu <quning@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Andres Lagar-Cavilla <andreslc@google.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agorapidio/mport_cdev: fix uapi type definitions
Alexandre Bounine [Thu, 5 May 2016 23:22:06 +0000 (16:22 -0700)]
rapidio/mport_cdev: fix uapi type definitions

Fix problems in uapi definitions reported by Gabriel Laskar: (see
https://lkml.org/lkml/2016/4/5/205 for details)

 - move public header file rio_mport_cdev.h to include/uapi/linux directory
 - change types in data structures passed as IOCTL parameters
 - improve parameter checking in some IOCTL service routines

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Reported-by: Gabriel Laskar <gabriel@lse.epita.fr>
Tested-by: Barry Wood <barry.wood@idt.com>
Cc: Gabriel Laskar <gabriel@lse.epita.fr>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Cc: Barry Wood <barry.wood@idt.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm: memcontrol: let v2 cgroups follow changes in system swappiness
Johannes Weiner [Thu, 5 May 2016 23:22:03 +0000 (16:22 -0700)]
mm: memcontrol: let v2 cgroups follow changes in system swappiness

Cgroup2 currently doesn't have a per-cgroup swappiness setting.  We
might want to add one later - that's a different discussion - but until
we do, the cgroups should always follow the system setting.  Otherwise
it will be unchangeably set to whatever the ancestor inherited from the
system setting at the time of cgroup creation.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: <stable@vger.kernel.org> [4.5]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm: thp: correct split_huge_pages file permission
Yang Shi [Thu, 5 May 2016 23:22:00 +0000 (16:22 -0700)]
mm: thp: correct split_huge_pages file permission

split_huge_pages doesn't support get method at all, so the read
permission sounds confusing, change the permission to write only.

And, add "\n" to the output of set method to make it more readable.

Signed-off-by: Yang Shi <yang.shi@linaro.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoMerge tag 'asm-generic-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd...
Linus Torvalds [Thu, 5 May 2016 22:40:38 +0000 (15:40 -0700)]
Merge tag 'asm-generic-4.6' of git://git./linux/kernel/git/arnd/asm-generic

Pull asm-generic syscall fix from Arnd Bergmann:
 "My last pull request for asm-generic had just one patch that added two
  new system calls to asm/unistd.h, but unfortunately it turned out to
  be wrong, pointing arch/tile compat mode at the native handlers rather
  than the compat ones.

  This was spotted by Yury Norov, who is working on ILP32 mode for
  arch/arm64, which would have the same problem when merged.  This fixes
  the table to use the correct compat syscalls, like the other 64-bit
  architectures do.

  I'll try to find the time to come up with a solution that prevents
  this problem from happening again, by allowing all future system calls
  to just get added in a single file for use by all architectures"

* tag 'asm-generic-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  asm-generic: use compat version for preadv2 and pwritev2

8 years agoMerge tag 'iio-fixes-for-4.6d' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23...
Greg Kroah-Hartman [Thu, 5 May 2016 22:38:07 +0000 (15:38 -0700)]
Merge tag 'iio-fixes-for-4.6d' of git://git./linux/kernel/git/jic23/iio into staging-linus

Jonathan writes:

Fourth set of IIO fixes for the 4.6 cycle.

This last minute set is concerned with a regression in the mpu6050 driver.
The regression causes a null pointer dereference on any ACPI device
that has one of these present such as the ASUS T100TA Baytrail/T.

The issue was known but thought (i.e. missunderstood by me)
to only be a possible with no reports, so was routed via the normal merge
window.  Turns out this was wrong (thanks to Alan for reporting the crash).

The pull is just for the null dereference fix and a followup fix
that also stops the reported name of the device being NULL.

* mpu6050
  - Fix a 'possible' NULL dereference introduced as part of splitting the
  driver to allow both i2c and spi to be supported.  The issue affects ACPI
  systems with this device.
  - Fix a follow up issue where the name and chip id both get set to null if
  the device driver instance is instantiated from ACPI tables.

8 years agoMerge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm...
Linus Torvalds [Thu, 5 May 2016 22:31:35 +0000 (15:31 -0700)]
Merge tag 'fixes-for-linus' of git://git./linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Arnd Bergmann:
 "Here are a couple last-minute fixes for ARM SoCs.  Most of them are
  for the OMAP platforms, the rest are all for different platforms.

  OMAP:
     All dts fixes, mostly affecting voltages and pinctrl for various
     device drivers:

      - Regulator minimum voltage fixes for omap5
      - ISP syscon register offset fix for omap3
      - Fix regulator initial modes for n900
      - Fix omap5 pinctrl wkup instance size

  Allwinner:
     Remove incorrect constraints from a dcdc1 regulator

  Alltera SoCFPGA:
     Fix compilation in thumb2 mode

  Samsung exynos:
     Fix a potential oops in the pm-domain error handling

  Davinci:
     Avoid a link error if NVMEM is disabled

  Renesas:
     Do not mark an external uart clock as disabled, to allow probing
     the uarts"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: davinci: only use NVMEM when available
  ARM: SoCFPGA: Fix secondary CPU startup in thumb2 kernel
  ARM: dts: omap5: fix range of permitted wakeup pinmux registers
  ARM: dts: omap3-n900: Specify peripherals LDO regulators initial mode
  ARM: dts: omap3: Fix ISP syscon register offset
  ARM: dts: omap5-cm-t54: fix ldo1_reg and ldo4_reg ranges
  ARM: dts: omap5-board-common: fix ldo1_reg and ldo4_reg ranges
  arm64: dts: r8a7795: Don't disable referenced optional scif clock
  ARM: EXYNOS: Properly skip unitialized parent clock in power domain on
  ARM: dts: sun8i-q8-common: Do not set constraints on dc1sw regulator

8 years agomaintainers: update rmk's email address(es)
Russell King [Thu, 5 May 2016 16:12:54 +0000 (17:12 +0100)]
maintainers: update rmk's email address(es)

Update my email and web addresses in the kernel maintainers file.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agowriteback: Fix performance regression in wb_over_bg_thresh()
Howard Cochran [Thu, 10 Mar 2016 06:12:39 +0000 (01:12 -0500)]
writeback: Fix performance regression in wb_over_bg_thresh()

Commit 947e9762a8dd ("writeback: update wb_over_bg_thresh() to use
wb_domain aware operations") unintentionally changed this function's
meaning from "are there more dirty pages than the background writeback
threshold" to "are there more dirty pages than the writeback threshold".
The background writeback threshold is typically half of the writeback
threshold, so this had the effect of raising the number of dirty pages
required to cause a writeback worker to perform background writeout.

This can cause a very severe performance regression when a BDI uses
BDI_CAP_STRICTLIMIT because balance_dirty_pages() and the writeback worker
can now disagree on whether writeback should be initiated.

For example, in a system having 1GB of RAM, a single spinning disk, and a
"pass-through" FUSE filesystem mounted over the disk, application code
mmapped a 128MB file on the disk and was randomly dirtying pages in that
mapping.

Because FUSE uses strictlimit and has a default max_ratio of only 1%, in
balance_dirty_pages, thresh is ~200, bg_thresh is ~100, and the
dirty_freerun_ceiling is the average of those, ~150. So, it pauses the
dirtying processes when we have 151 dirty pages and wakes up a background
writeback worker. But the worker tests the wrong threshold (200 instead of
100), so it does not initiate writeback and just returns.

Thus, balance_dirty_pages keeps looping, sleeping and then waking up the
worker who will do nothing. It remains stuck in this state until the few
dirty pages that we have finally expire and we write them back for that
reason. Then the whole process repeats, resulting in near-zero throughput
through the FUSE BDI.

The fix is to call the parameterized variant of wb_calc_thresh, so that the
worker will do writeback if the bg_thresh is exceeded which was the
behavior before the referenced commit.

Fixes: 947e9762a8dd ("writeback: update wb_over_bg_thresh() to use wb_domain aware operations")
Signed-off-by: Howard Cochran <hcochran@kernelspring.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: <stable@vger.kernel.org> # v4.2+
Tested-by Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
8 years agoARM: 8573/1: domain: move {set,get}_domain under config guard
Vladimir Murzin [Wed, 4 May 2016 09:39:02 +0000 (10:39 +0100)]
ARM: 8573/1: domain: move {set,get}_domain under config guard

Recursive undefined instrcution falut is seen with R-class taking an
exception. The reson for that is __show_regs() tries to get domain
information, but domains is not available on !MMU cores, like R/M
class.

Fix it by puting {set,get}_domain functions under CONFIG_CPU_CP15_MMU
guard and providing stubs for the case where domains is not supported.

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
8 years agoARM: 8572/1: nommu: change memory reserve for the vectors
Jean-Philippe Brucker [Wed, 4 May 2016 09:38:12 +0000 (10:38 +0100)]
ARM: 8572/1: nommu: change memory reserve for the vectors

Commit 19accfd3 (ARM: move vector stubs) moved the vector stubs in an
additional page above the base vector one. This change wasn't taken into
account by the nommu memreserve.
This patch ensures that the kernel won't overwrite any vector stub on
nommu.

[changed the MPU side too]

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
8 years agoARM: 8571/1: nommu: fix PMSAv7 setup
Jean-Philippe Brucker [Wed, 4 May 2016 09:37:22 +0000 (10:37 +0100)]
ARM: 8571/1: nommu: fix PMSAv7 setup

Commit 1c2f87c (ARM: 8025/1: Get rid of meminfo) broke the support for
MPU on ARMv7-R. This patch adapts the code inside CONFIG_ARM_MPU to use
memblocks appropriately.

MPU initialisation only uses the first memory region, and removes all
subsequent ones. Because looping over all regions that need removal is
inefficient, and memblock_remove already handles memory ranges, we can
flatten the 'for_each_memblock' part.

Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
8 years agoIB/iser: Fix max_sectors calculation
Christoph Hellwig [Mon, 18 Apr 2016 21:06:28 +0000 (17:06 -0400)]
IB/iser: Fix max_sectors calculation

iSER currently has a couple places that set max_sectors in either the host
template or SCSI host, and all of them get it wrong.

This patch instead uses a single assignment that (hopefully) gets it right:
the max_sectors value must be derived from the number of segments in the
FR or FMR structure, but actually be one lower than the page size multiplied
by the number of sectors, as it has to handle the case of non-aligned I/O.

Without this I get trivial to reproduce hangs when running xfstests
(on XFS) over iSER to Linux targets.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
8 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm...
Linus Torvalds [Thu, 5 May 2016 15:41:57 +0000 (08:41 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/ebiederm/user-namespace

Pull userns fix from Eric Biederman:
 "This contains just a single fix for a nasty oops"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  propogate_mnt: Handle the first propogated copy being a slave

8 years agoMerge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Linus Torvalds [Thu, 5 May 2016 15:26:54 +0000 (08:26 -0700)]
Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost

Pull virtio/qemu fixes from Michael Tsirkin:
 "A couple of fixes for virtio and for the new QEMU fw cfg driver"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio: Silence uninitialized variable warning
  firmware: qemu_fw_cfg.c: potential unintialized variable

8 years agopropogate_mnt: Handle the first propogated copy being a slave
Eric W. Biederman [Thu, 5 May 2016 14:29:29 +0000 (09:29 -0500)]
propogate_mnt: Handle the first propogated copy being a slave

When the first propgated copy was a slave the following oops would result:
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
> IP: [<ffffffff811fba4e>] propagate_one+0xbe/0x1c0
> PGD bacd4067 PUD bac66067 PMD 0
> Oops: 0000 [#1] SMP
> Modules linked in:
> CPU: 1 PID: 824 Comm: mount Not tainted 4.6.0-rc5userns+ #1523
> Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
> task: ffff8800bb0a8000 ti: ffff8800bac3c000 task.ti: ffff8800bac3c000
> RIP: 0010:[<ffffffff811fba4e>]  [<ffffffff811fba4e>] propagate_one+0xbe/0x1c0
> RSP: 0018:ffff8800bac3fd38  EFLAGS: 00010283
> RAX: 0000000000000000 RBX: ffff8800bb77ec00 RCX: 0000000000000010
> RDX: 0000000000000000 RSI: ffff8800bb58c000 RDI: ffff8800bb58c480
> RBP: ffff8800bac3fd48 R08: 0000000000000001 R09: 0000000000000000
> R10: 0000000000001ca1 R11: 0000000000001c9d R12: 0000000000000000
> R13: ffff8800ba713800 R14: ffff8800bac3fda0 R15: ffff8800bb77ec00
> FS:  00007f3c0cd9b7e0(0000) GS:ffff8800bfb00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000010 CR3: 00000000bb79d000 CR4: 00000000000006e0
> Stack:
>  ffff8800bb77ec00 0000000000000000 ffff8800bac3fd88 ffffffff811fbf85
>  ffff8800bac3fd98 ffff8800bb77f080 ffff8800ba713800 ffff8800bb262b40
>  0000000000000000 0000000000000000 ffff8800bac3fdd8 ffffffff811f1da0
> Call Trace:
>  [<ffffffff811fbf85>] propagate_mnt+0x105/0x140
>  [<ffffffff811f1da0>] attach_recursive_mnt+0x120/0x1e0
>  [<ffffffff811f1ec3>] graft_tree+0x63/0x70
>  [<ffffffff811f1f6b>] do_add_mount+0x9b/0x100
>  [<ffffffff811f2c1a>] do_mount+0x2aa/0xdf0
>  [<ffffffff8117efbe>] ? strndup_user+0x4e/0x70
>  [<ffffffff811f3a45>] SyS_mount+0x75/0xc0
>  [<ffffffff8100242b>] do_syscall_64+0x4b/0xa0
>  [<ffffffff81988f3c>] entry_SYSCALL64_slow_path+0x25/0x25
> Code: 00 00 75 ec 48 89 0d 02 22 22 01 8b 89 10 01 00 00 48 89 05 fd 21 22 01 39 8e 10 01 00 00 0f 84 e0 00 00 00 48 8b 80 d8 00 00 00 <48> 8b 50 10 48 89 05 df 21 22 01 48 89 15 d0 21 22 01 8b 53 30
> RIP  [<ffffffff811fba4e>] propagate_one+0xbe/0x1c0
>  RSP <ffff8800bac3fd38>
> CR2: 0000000000000010
> ---[ end trace 2725ecd95164f217 ]---

This oops happens with the namespace_sem held and can be triggered by
non-root users.  An all around not pleasant experience.

To avoid this scenario when finding the appropriate source mount to
copy stop the walk up the mnt_master chain when the first source mount
is encountered.

Further rewrite the walk up the last_source mnt_master chain so that
it is clear what is going on.

The reason why the first source mount is special is that it it's
mnt_parent is not a mount in the dest_mnt propagation tree, and as
such termination conditions based up on the dest_mnt mount propgation
tree do not make sense.

To avoid other kinds of confusion last_dest is not changed when
computing last_source.  last_dest is only used once in propagate_one
and that is above the point of the code being modified, so changing
the global variable is meaningless and confusing.

Cc: stable@vger.kernel.org
fixes: f2ebb3a921c1ca1e2ddd9242e95a1989a50c4c68 ("smarter propagate_mnt()")
Reported-by: Tycho Andersen <tycho.andersen@canonical.com>
Reviewed-by: Seth Forshee <seth.forshee@canonical.com>
Tested-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
8 years agox86/sysfb_efi: Fix valid BAR address range check
Wang YanQing [Thu, 5 May 2016 13:14:21 +0000 (14:14 +0100)]
x86/sysfb_efi: Fix valid BAR address range check

The code for checking whether a BAR address range is valid will break
out of the loop when a start address of 0x0 is encountered.

This behaviour is wrong since by breaking out of the loop we may miss
the BAR that describes the EFI frame buffer in a later iteration.

Because of this bug I can't use video=efifb: boot parameter to get
efifb on my new ThinkPad E550 for my old linux system hard disk with
3.10 kernel. In 3.10, efifb is the only choice due to DRM/I915 not
supporting the GPU.

This patch also add a trivial optimization to break out after we find
the frame buffer address range without testing later BARs.

Signed-off-by: Wang YanQing <udknight@gmail.com>
[ Rewrote changelog. ]
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Reviewed-by: Peter Jones <pjones@redhat.com>
Cc: <stable@vger.kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: David Herrmann <dh.herrmann@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1462454061-21561-2-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoARC: support HIGHMEM even without PAE40
Vineet Gupta [Mon, 18 Apr 2016 05:19:56 +0000 (10:49 +0530)]
ARC: support HIGHMEM even without PAE40

Initial HIGHMEM support on ARC was introduced for PAE40 where the low
memory (0x8000_0000 based) and high memory (0x1_0000_0000) were
physically contiguous. So CONFIG_FLATMEM sufficed (despite a peipheral
hole in the middle, which wasted a bit of struct page memory, but things
worked).

However w/o PAE, highmem was not possible and we could only reach
~1.75GB of DDR. Now there is a use case to access ~4GB of DDR w/o PAE40
The idea is to have low memory at canonical 0x8000_0000 and highmem
at 0 so enire 4GB address space is available for physical addressing
This needs additional platform/interconnect mapping to convert
the non contiguous physical addresses into linear bus adresses.

From Linux point of view, non contiguous divide means FLATMEM no
longer works and DISCONTIGMEM is needed to track the pfns in the 2
regions.

This scheme would also work for PAE40, only better in that we don't
waste struct page memory for the peripheral hole.

The DT description will be something like

    memory {
        ...
        reg = <0x80000000 0x200000000   /* 512MB: lowmem */
               0x00000000 0x10000000>;  /* 256MB: highmem */
   }

Signed-off-by: Noam Camus <noamc@ezchip.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
8 years agoARC: Fix PAE40 boot failures due to PTE truncation
Vineet Gupta [Thu, 5 May 2016 09:23:48 +0000 (14:53 +0530)]
ARC: Fix PAE40 boot failures due to PTE truncation

So a benign looking cleanup which macro'ized PAGE_SHIFT shifts turned
out to be bad (since it was done non-sensically across the board).

It caused boot failures with PAE40 as forced cast to (unsigned long)
from newly introduced virt_to_pfn() was causing truncatiion of the
(long long) pte/paddr values.

It is OK to use this in accessors dealing with kernel virtual address,
pointers etc, but not for PTE values themelves.

Fixes: cJ2ff5cf2735c ("ARC: mm: Use virt_to_pfn() for addr >> PAGE_SHIFT pattern)
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
8 years agoARC: Add missing io barriers to io{read,write}{16,32}be()
Vineet Gupta [Thu, 5 May 2016 08:02:34 +0000 (13:32 +0530)]
ARC: Add missing io barriers to io{read,write}{16,32}be()

While reviewing a different change to asm-generic/io.h Arnd spotted that
ARC ioread32 and ioread32be both of which come from asm-generic versions
are not symmetrical in terms of calling the io barriers.

generic ioread32   -> ARC readl()                  [ has barriers]
generic ioread32be -> __be32_to_cpu(__raw_readl()) [ lacks barriers]

While generic ioread32be is being remediated to call readl(), that involves
a swab32(), causing double swaps on ioread32be() on Big Endian systems.

So provide our versions of big endian IO accessors to ensure io barrier
calls while also keeping them optimal

Suggested-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: stable@vger.kernel.org [4.2+]
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
8 years ago[media] media-device: fix builds when USB or PCI is compiled as module
Mauro Carvalho Chehab [Thu, 5 May 2016 11:01:34 +0000 (08:01 -0300)]
[media] media-device: fix builds when USB or PCI is compiled as module

Just checking ifdef CONFIG_USB is not enough, if the USB is compiled
as module. The same applies to PCI.

Tested with the following .config alternatives:

CONFIG_USB=m
CONFIG_MEDIA_CONTROLLER=y
CONFIG_MEDIA_SUPPORT=m
CONFIG_VIDEO_AU0828=m

CONFIG_USB=m
CONFIG_MEDIA_CONTROLLER=y
CONFIG_MEDIA_SUPPORT=y
CONFIG_VIDEO_AU0828=m

CONFIG_USB=y
CONFIG_MEDIA_CONTROLLER=y
CONFIG_MEDIA_SUPPORT=y
CONFIG_VIDEO_AU0828=m

CONFIG_USB=y
CONFIG_MEDIA_CONTROLLER=y
CONFIG_MEDIA_SUPPORT=y
CONFIG_VIDEO_AU0828=y

Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
8 years agoperf/x86/amd/iommu: Do not register a task ctx for uncore like PMUs
Peter Zijlstra [Sat, 23 Apr 2016 22:42:55 +0000 (00:42 +0200)]
perf/x86/amd/iommu: Do not register a task ctx for uncore like PMUs

The new sanity check introduced by:

  26657848502b ("perf/core: Verify we have a single perf_hw_context PMU")

... triggered on the AMD IOMMU driver.

IOMMUs are not per logical CPU, they cannot have per-task counters. Fix it.

Reported-by: Borislav Petkov <bp@alien8.de>
Tested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: jroedel@suse.de
Cc: suravee.suthikulpanit@amd.com
Link: http://lkml.kernel.org/r/20160423224255.GB3430@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agox86/platform/UV: Bring back the call to map_low_mmrs in uv_system_init
Alex Thorlton [Wed, 4 May 2016 22:39:52 +0000 (17:39 -0500)]
x86/platform/UV: Bring back the call to map_low_mmrs in uv_system_init

A while back the following commit:

  d394f2d9d8e1 ("x86/platform/UV: Remove EFI memmap quirk for UV2+")

changed uv_system_init() to only call map_low_mmrs() on older UV1 hardware,
which requires EFI_OLD_MEMMAP to be set in order to boot.

The recent changes to the EFI memory mapping code in:

  d2f7cbe7b26a ("x86/efi: Runtime services virtual mapping")

exposed some issues with the fact that we were relying on the EFI memory
mapping mechanisms to map in our MMRs for us, after commit d394f2d9d8e1.

Rather than revert the entire commit and go back to forcing
EFI_OLD_MEMMAP on all UVs, we're going to add the call to map_low_mmrs()
back into uv_system_init(), and then fix up our EFI runtime calls to use
the appropriate page table.

For now, UV2+ will still need efi=old_map to boot, but there will be
other changes soon that should eliminate the need for this.

Signed-off-by: Alex Thorlton <athorlton@sgi.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Adam Buchbinder <adam.buchbinder@gmail.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Russ Anderson <rja@sgi.com>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Link: http://lkml.kernel.org/r/1462401592-120735-1-git-send-email-athorlton@sgi.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agosched/fair: Fix comment in calculate_imbalance()
Dietmar Eggemann [Fri, 29 Apr 2016 19:32:39 +0000 (20:32 +0100)]
sched/fair: Fix comment in calculate_imbalance()

The comment in calculate_imbalance() was introduced in commit:

 2dd73a4f09be ("[PATCH] sched: implement smpnice")

which described the logic as it was then, but a later commit:

  b18855500fc4 ("sched/balancing: Fix 'local->avg_load > sds->avg_load' case in calculate_imbalance()")

.. complicated this logic some more so that the comment does not match anymore.

Update the comment to match the code.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1461958364-675-3-git-send-email-dietmar.eggemann@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agosched/fair: Remove stale power aware scheduling comments
Dietmar Eggemann [Fri, 29 Apr 2016 19:32:38 +0000 (20:32 +0100)]
sched/fair: Remove stale power aware scheduling comments

Commit 8e7fbcbc22c1 ("sched: Remove stale power aware scheduling remnants
and dysfunctional knobs") deleted the power aware scheduling support.

This patch gets rid of the remaining power aware scheduling related
comments in the code as well.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1461958364-675-2-git-send-email-dietmar.eggemann@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agosched/fair: Update rq clock before updating nohz CPU load
Matt Fleming [Tue, 3 May 2016 19:46:54 +0000 (20:46 +0100)]
sched/fair: Update rq clock before updating nohz CPU load

If we're accessing rq_clock() (e.g. in sched_avg_update()) we should
update the rq clock before calling cpu_load_update(), otherwise any
time calculations will be stale.

All other paths currently call update_rq_clock().

Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1462304814-11715-1-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agosched/debug: Print out idle balance values even on !CONFIG_SCHEDSTATS kernels
Wanpeng Li [Tue, 3 May 2016 04:38:25 +0000 (12:38 +0800)]
sched/debug: Print out idle balance values even on !CONFIG_SCHEDSTATS kernels

The max_idle_balance_cost and avg_idle values which are tracked and ar used to
capture short idle incidents, are not associated with schedstats, however the
information of these two values isn't printed out on !CONFIG_SCHEDSTATS kernels.

Fix this by moving the value printout out of the CONFIG_SCHEDSTATS section.

Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1462250305-4523-1-git-send-email-wanpeng.li@hotmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agosched/fair: Optimize sum computation with a lookup table
Yuyang Du [Mon, 2 May 2016 21:54:27 +0000 (05:54 +0800)]
sched/fair: Optimize sum computation with a lookup table

__compute_runnable_contrib() uses a loop to compute sum, whereas a
table lookup can do it faster in a constant amount of time.

The program to generate the constants is located at:

  Documentation/scheduler/sched-avg.txt

Signed-off-by: Yuyang Du <yuyang.du@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Morten Rasmussen <morten.rasmussen@arm.com>
Acked-by: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bsegall@google.com
Cc: dietmar.eggemann@arm.com
Cc: juri.lelli@arm.com
Cc: pjt@google.com
Link: http://lkml.kernel.org/r/1462226078-31904-2-git-send-email-yuyang.du@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agosched/fair: Add detailed description to the sched load avg metrics
Yuyang Du [Tue, 5 Apr 2016 04:12:28 +0000 (12:12 +0800)]
sched/fair: Add detailed description to the sched load avg metrics

These sched metrics have become complex enough, so describe them
in detail at their definition.

Signed-off-by: Yuyang Du <yuyang.du@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
[ Fixed the text to improve its spelling and typography. ]
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bsegall@google.com
Cc: dietmar.eggemann@arm.com
Cc: lizefan@huawei.com
Cc: morten.rasmussen@arm.com
Cc: pjt@google.com
Cc: umgwanakikbuti@gmail.com
Cc: vincent.guittot@linaro.org
Link: http://lkml.kernel.org/r/1459829551-21625-4-git-send-email-yuyang.du@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agosched/fair: Rename SCHED_LOAD_SHIFT to NICE_0_LOAD_SHIFT and remove SCHED_LOAD_SCALE
Yuyang Du [Tue, 5 Apr 2016 04:12:27 +0000 (12:12 +0800)]
sched/fair: Rename SCHED_LOAD_SHIFT to NICE_0_LOAD_SHIFT and remove SCHED_LOAD_SCALE

After cleaning up the sched metrics, there are two definitions that are
ambiguous and confusing: SCHED_LOAD_SHIFT and SCHED_LOAD_SHIFT.

Resolve this:

 - Rename SCHED_LOAD_SHIFT to NICE_0_LOAD_SHIFT, which better reflects what
   it is.

 - Replace SCHED_LOAD_SCALE use with SCHED_CAPACITY_SCALE and remove SCHED_LOAD_SCALE.

Suggested-by: Ben Segall <bsegall@google.com>
Signed-off-by: Yuyang Du <yuyang.du@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dietmar.eggemann@arm.com
Cc: lizefan@huawei.com
Cc: morten.rasmussen@arm.com
Cc: pjt@google.com
Cc: umgwanakikbuti@gmail.com
Cc: vincent.guittot@linaro.org
Link: http://lkml.kernel.org/r/1459829551-21625-3-git-send-email-yuyang.du@intel.com
[ Rewrote the changelog and fixed the build on 32-bit kernels. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf/x86: Add model numbers for Kabylake CPUs
Andi Kleen [Sat, 30 Apr 2016 00:55:48 +0000 (17:55 -0700)]
perf/x86: Add model numbers for Kabylake CPUs

Everything the same as Skylake, just new model numbers.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1461977748-17616-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agosched/fair: Generalize the load/util averages resolution definition
Yuyang Du [Tue, 5 Apr 2016 04:12:26 +0000 (12:12 +0800)]
sched/fair: Generalize the load/util averages resolution definition

Integer metric needs fixed point arithmetic. In sched/fair, a few
metrics, e.g., weight, load, load_avg, util_avg, freq, and capacity,
may have different fixed point ranges, which makes their update and
usage error-prone.

In order to avoid the errors relating to the fixed point range, we
definie a basic fixed point range, and then formalize all metrics to
base on the basic range.

The basic range is 1024 or (1 << 10). Further, one can recursively
apply the basic range to have larger range.

Pointed out by Ben Segall, weight (visible to user, e.g., NICE-0 has
1024) and load (e.g., NICE_0_LOAD) have independent ranges, but they
must be well calibrated.

Signed-off-by: Yuyang Du <yuyang.du@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: bsegall@google.com
Cc: dietmar.eggemann@arm.com
Cc: lizefan@huawei.com
Cc: morten.rasmussen@arm.com
Cc: pjt@google.com
Cc: umgwanakikbuti@gmail.com
Cc: vincent.guittot@linaro.org
Link: http://lkml.kernel.org/r/1459829551-21625-2-git-send-email-yuyang.du@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agosched/core: Enable increased load resolution on 64-bit kernels
Peter Zijlstra [Thu, 28 Apr 2016 10:49:38 +0000 (12:49 +0200)]
sched/core: Enable increased load resolution on 64-bit kernels

Mike ran into the low load resolution limitation on his big machine.

So reenable these bits; nobody could ever reproduce/analyze the
reported power usage claim and Google has been running with this for
years as well.

Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Tested-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agolocking/lockdep, sched/core: Implement a better lock pinning scheme
Peter Zijlstra [Sat, 1 Aug 2015 17:25:08 +0000 (19:25 +0200)]
locking/lockdep, sched/core: Implement a better lock pinning scheme

The problem with the existing lock pinning is that each pin is of
value 1; this mean you can simply unpin if you know its pinned,
without having any extra information.

This scheme generates a random (16 bit) cookie for each pin and
requires this same cookie to unpin. This means you have to keep the
cookie in context.

No objsize difference for !LOCKDEP kernels.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agosched/core: Introduce 'struct rq_flags'
Peter Zijlstra [Fri, 31 Jul 2015 19:28:18 +0000 (21:28 +0200)]
sched/core: Introduce 'struct rq_flags'

In order to be able to pass around more than just the IRQ flags in the
future, add a rq_flags structure.

No difference in code generation for the x86_64-defconfig build I
tested.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agosched/core: Move task_rq_lock() out of line
Peter Zijlstra [Thu, 28 Apr 2016 14:16:33 +0000 (16:16 +0200)]
sched/core: Move task_rq_lock() out of line

Its a rather large function, inline doesn't seems to make much sense:

 $ size defconfig-build/kernel/sched/core.o{.orig,}
    text    data     bss     dec     hex filename
   56533   21037    2320   79890   13812 defconfig-build/kernel/sched/core.o.orig
   55733   21037    2320   79090   134f2 defconfig-build/kernel/sched/core.o

The 'perf bench sched messaging' micro-benchmark shows a visible improvement
of 4-5%:

  $ for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor ; do echo performance > $i ; done
  $ perf stat --null --repeat 25 -- perf bench sched messaging -g 40 -l 5000

  pre:
       4.582798193 seconds time elapsed          ( +-  1.41% )
       4.733374877 seconds time elapsed          ( +-  2.10% )
       4.560955136 seconds time elapsed          ( +-  1.43% )
       4.631062303 seconds time elapsed          ( +-  1.40% )

  post:
       4.364765213 seconds time elapsed          ( +-  0.91% )
       4.454442734 seconds time elapsed          ( +-  1.18% )
       4.448893817 seconds time elapsed          ( +-  1.41% )
       4.424346872 seconds time elapsed          ( +-  0.97% )

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoMerge branch 'sched/urgent' into sched/core, to pick up fixes before applying new...
Ingo Molnar [Thu, 5 May 2016 07:01:49 +0000 (09:01 +0200)]
Merge branch 'sched/urgent' into sched/core, to pick up fixes before applying new changes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agocrypto: rsa - select crypto mgr dependency
Tadeusz Struk [Wed, 4 May 2016 13:38:46 +0000 (06:38 -0700)]
crypto: rsa - select crypto mgr dependency

The pkcs1pad template needs CRYPTO_MANAGER so it needs
to be explicitly selected by CRYPTO_RSA.

Reported-by: Jamie Heilman <jamie@audible.transient.net>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
8 years agocrypto: hash - Fix page length clamping in hash walk
Herbert Xu [Wed, 4 May 2016 09:52:56 +0000 (17:52 +0800)]
crypto: hash - Fix page length clamping in hash walk

The crypto hash walk code is broken when supplied with an offset
greater than or equal to PAGE_SIZE.  This patch fixes it by adjusting
walk->pg and walk->offset when this happens.

Cc: <stable@vger.kernel.org>
Reported-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
8 years agoMerge tag 'drm-intel-fixes-2016-05-02' of git://anongit.freedesktop.org/drm-intel...
Dave Airlie [Thu, 5 May 2016 02:12:09 +0000 (12:12 +1000)]
Merge tag 'drm-intel-fixes-2016-05-02' of git://anongit.freedesktop.org/drm-intel into drm-fixes

i915 fixes for 4.6. A bit more than I'd like at this stage, but
OTOH they're all stable material.

* tag 'drm-intel-fixes-2016-05-02' of git://anongit.freedesktop.org/drm-intel:
  drm/i915: Make RPS EI/thresholds multiple of 25 on SNB-BDW
  drm/i915: Fake HDMI live status
  drm/i915: Fix eDP low vswing for Broadwell
  drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume
  drm/i915: Fix system resume if PCI device remained enabled
  drm/i915: Avoid stalling on pending flips for legacy cursor updates

8 years agoMerge branch 'drm-fixes-4.6' of git://people.freedesktop.org/~agd5f/linux into drm...
Dave Airlie [Thu, 5 May 2016 00:37:25 +0000 (10:37 +1000)]
Merge branch 'drm-fixes-4.6' of git://people.freedesktop.org/~agd5f/linux into drm-fixes

two fixes for hw lockups and one for a double free

* 'drm-fixes-4.6' of git://people.freedesktop.org/~agd5f/linux:
  drm/amdgpu: make sure vertical front porch is at least 1
  drm/radeon: make sure vertical front porch is at least 1
  drm/amdgpu: set metadata pointer to NULL after freeing.

8 years agogpu: ipu-v3: Fix imx-ipuv3-crtc module autoloading
Philipp Zabel [Wed, 27 Apr 2016 08:17:51 +0000 (10:17 +0200)]
gpu: ipu-v3: Fix imx-ipuv3-crtc module autoloading

If of_node is set before calling platform_device_add, the driver core
will try to use of: modalias matching, which fails because the device
tree nodes don't have a compatible property set. This patch fixes
imx-ipuv3-crtc module autoloading by setting the of_node property only
after the platform modalias is set.

Fixes: 304e6be652e2 ("gpu: ipu-v3: Assign of_node of child platform devices to corresponding ports")
Reported-by: Dennis Gilmore <dennis@ausil.us>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Tested-By: Dennis Gilmore <dennis@ausil.us>
Cc: stable@vger.kernel.org # 4.4+
Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years agoPM / OPP: Remove useless check
Viresh Kumar [Wed, 4 May 2016 13:19:55 +0000 (18:49 +0530)]
PM / OPP: Remove useless check

Regulators are optional for devices using OPPs and the OPP core
shouldn't be printing any errors for such missing regulators.

It was fine before the commit 0c717d0f9cb4, but that failed to update
this part of the code to remove an 'always true' check and an extra
unwanted print message.

Fix that now.

Fixes: 0c717d0f9cb4 (PM / OPP: Initialize regulator pointer to an error value)
Reported-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
8 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Linus Torvalds [Wed, 4 May 2016 23:07:50 +0000 (16:07 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input

Pull input fixes from Dmitry Torokhov.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: atmel_mxt_ts - use mxt_acquire_irq in mxt_soft_reset
  Input: zforce_ts - fix dual touch recognition
  Input: twl6040-vibra - fix atomic schedule panic

8 years agoasm-generic: use compat version for preadv2 and pwritev2
Yury Norov [Mon, 2 May 2016 16:12:47 +0000 (19:12 +0300)]
asm-generic: use compat version for preadv2 and pwritev2

Compat architectures that does not use generic unistd (mips, s390),
declare compat version in their syscall tables for preadv2 and
pwritev2. Generic unistd syscall table should do it as well.

[arnd: this initially slipped through the review and an
 incorrect patch got merged. arch/tile/ is the only architecture
 that could be affected for their 32-bit compat mode, every
 other architecture we support today is fine.]

Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
8 years agoMerge branch 'bnxt_en-fixes'
David S. Miller [Wed, 4 May 2016 21:11:39 +0000 (17:11 -0400)]
Merge branch 'bnxt_en-fixes'

Michael Chan says:

====================
bnxt_en: 2 bug fixes.

Fix crash on ppc64 due to missing memory barrier and restore multicast
after reset.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agobnxt_en: Setup multicast properly after resetting device.
Michael Chan [Wed, 4 May 2016 20:56:44 +0000 (16:56 -0400)]
bnxt_en: Setup multicast properly after resetting device.

The multicast/all-multicast internal flags are not properly restored
after device reset.  This could lead to unreliable multicast operations
after an ethtool configuration change for example.

Call bnxt_mc_list_updated() and setup the vnic->mask in bnxt_init_chip()
to fix the issue.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agobnxt_en: Need memory barrier when processing the completion ring.
Michael Chan [Wed, 4 May 2016 20:56:43 +0000 (16:56 -0400)]
bnxt_en: Need memory barrier when processing the completion ring.

The code determines if the next ring entry is valid before proceeding
further to read the rest of the entry.  The CPU can re-order and read
the rest of the entry first, possibly reading a stale entry, if DMA
of a new entry happens right after reading it.  This issue can be
readily seen on a ppc64 system, causing it to crash.

Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoACPICA: Dispatcher: Update thread ID for recursive method calls
Prarit Bhargava [Wed, 4 May 2016 05:48:56 +0000 (13:48 +0800)]
ACPICA: Dispatcher: Update thread ID for recursive method calls

ACPICA commit 7a3bd2d962f221809f25ddb826c9e551b916eb25

Set the mutex owner thread ID.
Original patch from: Prarit Bhargava <prarit@redhat.com>

Link: https://bugzilla.kernel.org/show_bug.cgi?id=115121
Link: https://github.com/acpica/acpica/commit/7a3bd2d9
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Tested-by: Andy Lutomirski <luto@kernel.org> # On a Dell XPS 13 9350
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
8 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
David S. Miller [Wed, 4 May 2016 20:35:31 +0000 (16:35 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/klassert/ipsec

Steffen Klassert says:

====================
pull request (net): ipsec 2016-05-04

1) The flowcache can hit an OOM condition if too
   many entries are in the gc_list. Fix this by
   counting the entries in the gc_list and refuse
   new allocations if the value is too high.

2) The inner headers are invalid after a xfrm transformation,
   so reset the skb encapsulation field to ensure nobody tries
   access the inner headers. Otherwise tunnel devices stacked
   on top of xfrm may build the outer headers based on wrong
   informations.

3) Add pmtu handling to vti, we need it to report
   pmtu informations for local generated packets.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: fix infoleak in rtnetlink
Kangjie Lu [Tue, 3 May 2016 20:46:24 +0000 (16:46 -0400)]
net: fix infoleak in rtnetlink

The stack object “map” has a total size of 32 bytes. Its last 4
bytes are padding generated by compiler. These padding bytes are
not initialized and sent out via “nla_put”.

Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: fix infoleak in llc
Kangjie Lu [Tue, 3 May 2016 20:35:05 +0000 (16:35 -0400)]
net: fix infoleak in llc

The stack object “info” has a total size of 12 bytes. Its last byte
is padding which is not initialized and leaked via “put_cmsg”.

Signed-off-by: Kangjie Lu <kjlu@gatech.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris...
Linus Torvalds [Wed, 4 May 2016 18:14:00 +0000 (11:14 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jmorris/linux-security

Pull IMA fix from James Morris.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  ima: fix the string representation of the LSM/IMA hook enumeration ordering

8 years agonet: fec: only clear a queue's work bit if the queue was emptied
Uwe Kleine-König [Tue, 3 May 2016 14:38:53 +0000 (16:38 +0200)]
net: fec: only clear a queue's work bit if the queue was emptied

In the receive path a queue's work bit was cleared unconditionally even
if fec_enet_rx_queue only read out a part of the available packets from
the hardware. This resulted in not reading any packets in the next napi
turn and so packets were delayed or lost.

The obvious fix is to only clear a queue's bit when the queue was
emptied.

Fixes: 4d494cdc92b3 ("net: fec: change data structure to support multiqueue")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Fugang Duan <fugang.duan@nxp.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agodrivers: net: xgene: Fix error handling
Matthias Brugger [Tue, 3 May 2016 14:05:07 +0000 (16:05 +0200)]
drivers: net: xgene: Fix error handling

When probe bails out with an error, we try to unregister the
netdev before we have even registered it. Fix the goto statements
for that.

Signed-off-by: Matthias Brugger <mbrugger@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge tag 'for-linus-4.6-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 4 May 2016 18:00:05 +0000 (11:00 -0700)]
Merge tag 'for-linus-4.6-rc6-tag' of git://git./linux/kernel/git/xen/tip

Pull xen regression fixes from David Vrabel:

 - Fix two regressions causing crashes in 32-bit PV guests

 - Fix a regression in the evtchn driver

* tag 'for-linus-4.6-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/evtchn: fix ring resize when binding new events
  xen/balloon: Fix crash when ballooning on x86 32 bit PAE
  xen: Fix page <-> pfn conversion on 32 bit systems

8 years agoxen/evtchn: fix ring resize when binding new events
Jan Beulich [Wed, 4 May 2016 13:02:36 +0000 (07:02 -0600)]
xen/evtchn: fix ring resize when binding new events

The copying of ring data was wrong for two cases: For a full ring
nothing got copied at all (as in that case the canonicalized producer
and consumer indexes are identical). And in case one or both of the
canonicalized (after the resize) indexes would point into the second
half of the buffer, the copied data ended up in the wrong (free) part
of the new buffer. In both cases uninitialized data would get passed
back to the caller.

Fix this by simply copying the old ring contents twice: Once to the
low half of the new buffer, and a second time to the high half.

This addresses the inability to boot a HVM guest with 64 or more
vCPUs.  This regression was caused by 8620015499101090 (xen/evtchn:
dynamically grow pending event channel ring).

Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: <stable@vger.kernel.org> # 4.4+
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
8 years agointel_pstate: Fix intel_pstate_get()
Rafael J. Wysocki [Wed, 4 May 2016 12:01:10 +0000 (14:01 +0200)]
intel_pstate: Fix intel_pstate_get()

After commit 8fa520af5081 "intel_pstate: Remove freq calculation from
intel_pstate_calc_busy()" intel_pstate_get() calls get_avg_frequency()
to compute the average frequency, which is problematic for two reasons.

First, intel_pstate_get() may be invoked before the driver reads the
CPU feedback registers for the first time and if that happens,
get_avg_frequency() will attempt to divide by zero.

Second, the get_avg_frequency() call in intel_pstate_get() is racy
with respect to intel_pstate_sample() and it may end up returning
completely meaningless values for this reason.

Moreover, after commit 7349ec0470b6 "intel_pstate: Move
intel_pstate_calc_busy() into get_target_pstate_use_performance()"
sample.core_pct_busy is never computed on Atom, but it is used in
intel_pstate_adjust_busy_pstate() in that case too.

To address those problems notice that if sample.core_pct_busy
was used in the average frequency computation carried out by
get_avg_frequency(), both the divide by zero problem and the
race with respect to intel_pstate_sample() would be avoided.

Accordingly, move the invocation of intel_pstate_calc_busy() from
get_target_pstate_use_performance() to intel_pstate_update_util(),
which also will take care of the uninitialized sample.core_pct_busy
on Atom, and modify get_avg_frequency() to use sample.core_pct_busy
as per the above.

Reported-by: kernel test robot <ying.huang@linux.intel.com>
Link: http://marc.info/?l=linux-kernel&m=146226437623173&w=4
Fixes: 8fa520af5081 "intel_pstate: Remove freq calculation from intel_pstate_calc_busy()"
Fixes: 7349ec0470b6 "intel_pstate: Move intel_pstate_calc_busy() into get_target_pstate_use_performance()"
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
8 years agoima: fix the string representation of the LSM/IMA hook enumeration ordering
Mimi Zohar [Tue, 19 Apr 2016 21:42:43 +0000 (17:42 -0400)]
ima: fix the string representation of the LSM/IMA hook enumeration ordering

This patch fixes the string representation of the LSM/IMA hook enumeration
ordering used for displaying the IMA policy.

Fixes: d9ddf077bb85 ("ima: support for kexec image and initramfs")
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Tested-by: Eric Richter <erichte@linux.vnet.ibm.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
8 years agoiio: imu: mpu6050: Fix name/chip_id when using ACPI
Daniel Baluta [Thu, 17 Mar 2016 16:32:44 +0000 (18:32 +0200)]
iio: imu: mpu6050: Fix name/chip_id when using ACPI

When using ACPI, id is NULL and the current code automatically
defaults name to NULL and chip id to 0. We should instead use
the data provided in the ACPI device table.

Fixes: c816d9e7a57b ("iio: imu: mpu6050: fix possible NULL dereferences")
Signed-off-by: Daniel Baluta <daniel.baluta@intel.com>
Reviewed-By: Matt Ranostay <matt.ranostay@intel.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
8 years agoiio: imu: mpu6050: fix possible NULL dereferences
Matt Ranostay [Thu, 3 Mar 2016 03:18:12 +0000 (19:18 -0800)]
iio: imu: mpu6050: fix possible NULL dereferences

Fix possible null dereferencing of i2c and spi driver data.

Signed-off-by: Matt Ranostay <matt.ranostay@intel.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
8 years agox86/efi-bgrt: Switch all pr_err() to pr_notice() for invalid BGRT
Josh Boyer [Tue, 3 May 2016 19:29:41 +0000 (20:29 +0100)]
x86/efi-bgrt: Switch all pr_err() to pr_notice() for invalid BGRT

The promise of pretty boot splashes from firmware via BGRT was at
best only that; a promise.  The kernel diligently checks to make
sure the BGRT data firmware gives it is valid, and dutifully warns
the user when it isn't.  However, it does so via the pr_err log
level which seems unnecessary.  The user cannot do anything about
this and there really isn't an error on the part of Linux to
correct.

This lowers the log level by using pr_notice instead.  Users will
no longer have their boot process uglified by the kernel reminding
us that firmware can and often is broken when the 'quiet' kernel
parameter is specified.  Ironic, considering BGRT is supposed to
make boot pretty to begin with.

Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Môshe van der Sterre <me@moshe.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1462303781-8686-4-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoMAINTAINERS: Remove asterisk from EFI directory names
Matt Fleming [Tue, 3 May 2016 19:29:39 +0000 (20:29 +0100)]
MAINTAINERS: Remove asterisk from EFI directory names

Mark reported that having asterisks on the end of directory names
confuses get_maintainer.pl when it encounters subdirectories, and that
my name does not appear when run on drivers/firmware/efi/libstub.

Reported-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: <stable@vger.kernel.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1462303781-8686-2-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoMerge tag 'trace-fixes-v4.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Wed, 4 May 2016 01:02:38 +0000 (18:02 -0700)]
Merge tag 'trace-fixes-v4.6-rc6' of git://git./linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
 "Chunyu Hu noticed that if one writes into the trigger files within the
  ftrace subsystem of events that it can cause an oops.  This file is
  only writable by root, but still is a bug that needs to be fixed"

* tag 'trace-fixes-v4.6-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Don't display trigger file for events that can't be enabled

8 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Tue, 3 May 2016 22:07:50 +0000 (15:07 -0700)]
Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:
 "Some straggler bug fixes:

   1) Batman-adv DAT must consider VLAN IDs when choosing candidate
      nodes, from Antonio Quartulli.

   2) Fix botched reference counting of vlan objects and neigh nodes in
      batman-adv, from Sven Eckelmann.

   3) netem can crash when it sees GSO packets, the fix is to segment
      then upon ->enqueue.  Fix from Neil Horman with help from Eric
      Dumazet.

   4) Fix VXLAN dependencies in mlx5 driver Kconfig, from Matthew
      Finlay.

   5) Handle VXLAN ops outside of rcu lock, via a workqueue, in mlx5,
      since it can sleep.  Fix also from Matthew Finlay.

   6) Check mdiobus_scan() return values properly in pxa168_eth and macb
      drivers.  From Sergei Shtylyov.

   7) If the netdevice doesn't support checksumming, disable
      segmentation.  From Alexandery Duyck.

   8) Fix races between RDS tcp accept and sending, from Sowmini
      Varadhan.

   9) In macb driver, probe MDIO bus before we register the netdev,
      otherwise we can try to open the device before it is really ready
      for that.  Fix from Florian Fainelli.

  10) Netlink attribute size for ILA "tunnels" not calculated properly,
      fix from Nicolas Dichtel"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  ipv6/ila: fix nlsize calculation for lwtunnel
  net: macb: Probe MDIO bus before registering netdev
  RDS: TCP: Synchronize accept() and connect() paths on t_conn_lock.
  RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock
  vxlan: Add checksum check to the features check function
  net: Disable segmentation if checksumming is not supported
  net: mvneta: Remove superfluous SMP function call
  macb: fix mdiobus_scan() error check
  pxa168_eth: fix mdiobus_scan() error check
  net/mlx5e: Use workqueue for vxlan ops
  net/mlx5e: Implement a mlx5e workqueue
  net/mlx5: Kconfig: Fix MLX5_EN/VXLAN build issue
  net/mlx5: Unmap only the relevant IO memory mapping
  netem: Segment GSO packets on enqueue
  batman-adv: Fix reference counting of hardif_neigh_node object for neigh_node
  batman-adv: Fix reference counting of vlan object for tt_local_entry
  batman-adv: B.A.T.M.A.N V - make sure iface is reactivated upon NETDEV_UP event
  batman-adv: fix DAT candidate selection (must use vid)

8 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi...
Linus Torvalds [Tue, 3 May 2016 21:23:58 +0000 (14:23 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mszeredi/fuse

Pull fuse fixes from Miklos Szeredi:
 "Fix a regression and update the MAINTAINERS entry for fuse"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: update mailing list in MAINTAINERS
  fuse: Fix return value from fuse_get_user_pages()

8 years agoipv6/ila: fix nlsize calculation for lwtunnel
Nicolas Dichtel [Tue, 3 May 2016 07:58:27 +0000 (09:58 +0200)]
ipv6/ila: fix nlsize calculation for lwtunnel

The handler 'ila_fill_encap_info' adds one attribute: ILA_ATTR_LOCATOR.

Fixes: 65d7ab8de582 ("net: Identifier Locator Addressing module")
CC: Tom Herbert <tom@herbertland.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: macb: Probe MDIO bus before registering netdev
Florian Fainelli [Tue, 3 May 2016 01:38:45 +0000 (18:38 -0700)]
net: macb: Probe MDIO bus before registering netdev

The current sequence makes us register for a network device prior to
registering and probing the MDIO bus which could lead to some unwanted
consequences, like a thread of execution calling into ndo_open before
register_netdev() returns, while the MDIO bus is not ready yet.

Rework the sequence to register for the MDIO bus, and therefore attach
to a PHY prior to calling register_netdev(), which implies reworking the
error path a bit.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge branch 'rds-fixes'
David S. Miller [Tue, 3 May 2016 20:03:45 +0000 (16:03 -0400)]
Merge branch 'rds-fixes'

Sowmini Varadhan says:

====================
RDS: TCP: sychronization during connection startup

This patch series ensures that the passive (accept) side of the
TCP connection used for RDS-TCP is correctly synchronized with
any concurrent active (connect) attempts for a given pair of peers.

Patch 1 in the series makes sure that the t_sock in struct
rds_tcp_connection is only reset after any threads in rds_tcp_xmit
have completed (otherwise a null-ptr deref may be encountered).
Patch 2 synchronizes rds_tcp_accept_one() with the rds_tcp*connect()
path.

v2: review comments from Santosh Shilimkar, other spelling corrections
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoRDS: TCP: Synchronize accept() and connect() paths on t_conn_lock.
Sowmini Varadhan [Mon, 2 May 2016 18:24:52 +0000 (11:24 -0700)]
RDS: TCP: Synchronize accept() and connect() paths on t_conn_lock.

An arbitration scheme for duelling SYNs is implemented as part of
commit 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an
outgoing socket in rds_tcp_accept_one()") which ensures that both nodes
involved will arrive at the same arbitration decision. However, this
needs to be synchronized with an outgoing SYN to be generated by
rds_tcp_conn_connect(). This commit achieves the synchronization
through the t_conn_lock mutex in struct rds_tcp_connection.

The rds_conn_state is checked in rds_tcp_conn_connect() after acquiring
the t_conn_lock mutex.  A SYN is sent out only if the RDS connection is
not already UP (an UP would indicate that rds_tcp_accept_one() has
completed 3WH, so no SYN needs to be generated).

Similarly, the rds_conn_state is checked in rds_tcp_accept_one() after
acquiring the t_conn_lock mutex. The only acceptable states (to
allow continuation of the arbitration logic) are UP (i.e., outgoing SYN
was SYN-ACKed by peer after it sent us the SYN) or CONNECTING (we sent
outgoing SYN before we saw incoming SYN).

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoRDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock
Sowmini Varadhan [Mon, 2 May 2016 18:24:51 +0000 (11:24 -0700)]
RDS:TCP: Synchronize rds_tcp_accept_one with rds_send_xmit when resetting t_sock

There is a race condition between rds_send_xmit -> rds_tcp_xmit
and the code that deals with resolution of duelling syns added
by commit 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an
outgoing socket in rds_tcp_accept_one()").

Specifically, we may end up derefencing a null pointer in rds_send_xmit
if we have the interleaving sequence:
           rds_tcp_accept_one                  rds_send_xmit

                                             conn is RDS_CONN_UP, so
      invoke rds_tcp_xmit

                                             tc = conn->c_transport_data
        rds_tcp_restore_callbacks
            /* reset t_sock */
      null ptr deref from tc->t_sock

The race condition can be avoided without adding the overhead of
additional locking in the xmit path: have rds_tcp_accept_one wait
for rds_tcp_xmit threads to complete before resetting callbacks.
The synchronization can be done in the same manner as rds_conn_shutdown().
First set the rds_conn_state to something other than RDS_CONN_UP
(so that new threads cannot get into rds_tcp_xmit()), then wait for
RDS_IN_XMIT to be cleared in the conn->c_flags indicating that any
threads in rds_tcp_xmit are done.

Fixes: 241b271952eb ("RDS-TCP: Reset tcp callbacks if re-using an
outgoing socket in rds_tcp_accept_one()")
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge branch 'tunnel-csum-and-sg-offloads'
David S. Miller [Tue, 3 May 2016 20:00:55 +0000 (16:00 -0400)]
Merge branch 'tunnel-csum-and-sg-offloads'

Alexander Duyck says:

====================
Fixes for tunnel checksum and segmentation offloads

This patch series is a subset of patches I had submitted for net-next.  I
plan to drop these two patches from the v3 of "Fix Tunnel features and
enable GSO partial for several drivers" and I am instead submitting them
for net since these are truly fixes and likely will need to be backported
to stable branches.

This series addresses 2 specific issues.  The first is that we could
request TSO on a v4 inner header while not supporting checksum offload of
the outer IPv6 header.  The second is that we could request an IPv6 inner
checksum offload without validating that we could actually support an inner
IPv6 checksum offload.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agovxlan: Add checksum check to the features check function
Alexander Duyck [Mon, 2 May 2016 16:25:16 +0000 (09:25 -0700)]
vxlan: Add checksum check to the features check function

We need to perform an additional check on the inner headers to determine if
we can offload the checksum for them.  Previously this check didn't occur
so we would generate an invalid frame in the case of an IPv6 header
encapsulated inside of an IPv4 tunnel.  To fix this I added a secondary
check to vxlan_features_check so that we can verify that we can offload the
inner checksum.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: Disable segmentation if checksumming is not supported
Alexander Duyck [Mon, 2 May 2016 16:25:10 +0000 (09:25 -0700)]
net: Disable segmentation if checksumming is not supported

In the case of the mlx4 and mlx5 driver they do not support IPv6 checksum
offload for tunnels.  With this being the case we should disable GSO in
addition to the checksum offload features when we find that a device cannot
perform a checksum on a given packet type.

Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet: mvneta: Remove superfluous SMP function call
Anna-Maria Gleixner [Mon, 2 May 2016 09:02:51 +0000 (11:02 +0200)]
net: mvneta: Remove superfluous SMP function call

Since commit 3b9d6da67e11 ("cpu/hotplug: Fix rollback during error-out
in __cpu_disable()") it is ensured that callbacks of CPU_ONLINE and
CPU_DOWN_PREPARE are processed on the hotplugged CPU. Due to this SMP
function calls are no longer required.

Replace smp_call_function_single() with a direct call to
mvneta_percpu_enable() or mvneta_percpu_disable(). The functions do
not require to be called with interrupts disabled, therefore the
smp_call_function_single() calling convention is not preserved.

Cc: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agomacb: fix mdiobus_scan() error check
Sergei Shtylyov [Sat, 30 Apr 2016 22:47:36 +0000 (01:47 +0300)]
macb: fix mdiobus_scan() error check

Now mdiobus_scan() returns ERR_PTR(-ENODEV) instead of NULL if the PHY
device ID was read as all ones. As this was not  an error before, this
value  should be filtered out now in this driver.

Fixes: b74766a0a0fe ("phylib: don't return NULL from get_phy_device()")
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agopxa168_eth: fix mdiobus_scan() error check
Sergei Shtylyov [Sat, 30 Apr 2016 20:35:11 +0000 (23:35 +0300)]
pxa168_eth: fix mdiobus_scan() error check

Since mdiobus_scan() returns either an error code or NULL on error, the
driver should check  for both,  not only for NULL, otherwise a crash is
imminent...

Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agodrm/amdgpu: make sure vertical front porch is at least 1
Alex Deucher [Mon, 2 May 2016 22:54:39 +0000 (18:54 -0400)]
drm/amdgpu: make sure vertical front porch is at least 1

hw doesn't like a 0 value.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
8 years agodrm/radeon: make sure vertical front porch is at least 1
Alex Deucher [Mon, 2 May 2016 22:53:27 +0000 (18:53 -0400)]
drm/radeon: make sure vertical front porch is at least 1

hw doesn't like a 0 value.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
8 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
Linus Torvalds [Tue, 3 May 2016 18:06:01 +0000 (11:06 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jikos/hid

Pull HID fixes from Jiri Kosina:
 "Fixes for the HID subsystem:

   - regression fix for Wacom driver; commit introduced in 4.6-rc1
     mistakenly removed line that should be kept.  Fix by Ping Cheng

   - two device-specific quirks, by Ping Cheng and Nazar Mokrynskyi"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid:
  HID: wacom: add missed stylus_in_proximity line back
  HID: Fix boot delay for Creative SB Omni Surround 5.1 with quirk
  HID: wacom: Add support for DTK-1651

8 years agoMerge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 3 May 2016 17:58:29 +0000 (10:58 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux

Pull clk fix from Stephen Boyd:
 "One small bug fix for the imx6qp CAN clk definition that was causing
  failures and division by zeros in the kernel on those devices"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: imx6q: fix typo in CAN clock definition

8 years agoMerge branch 'mlx5-fixes'
David S. Miller [Tue, 3 May 2016 17:37:27 +0000 (13:37 -0400)]
Merge branch 'mlx5-fixes'

Saeed Mahameed says:

====================
Mellanox 100G mlx5 fixes for 4.6-rc

This small series provides some bug fixes for mlx5 driver.

A small bug fix for iounmap of a null pointer, which dumps a warning on some archs.

One patch to fix the VXLAN/MLX5_EN dependency issue reported by Arnd.

Two patches to fix the scheduling while atomic issue for ndo_add/del_vxlan_port
NDOs.  The first will add an internal mlx5e workqueue and the second will
delegate vxlan ports add/del requests to that workqueue.

Note: ('net/mlx5: Kconfig: Fix MLX5_EN/VXLAN build issue') is only needed for net
and not net-next as the issue was globally fixed for all device drivers by:
b7aade15485a ('vxlan: break dependency with netdev drivers') in net-next.

Applied on top: f27337e16f2d ('ip_tunnel: fix preempt warning in ip tunnel creation/updating')
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet/mlx5e: Use workqueue for vxlan ops
Matthew Finlay [Sun, 1 May 2016 19:59:57 +0000 (22:59 +0300)]
net/mlx5e: Use workqueue for vxlan ops

The vxlan add/delete port NDOs are called under rcu lock.
The current mlx5e implementation can potentially block in these
calls, which is not allowed.  Move to using the mlx5e workqueue
to handle these NDOs.

Fixes: b3f63c3d5e2c ('net/mlx5e: Add netdev support for VXLAN tunneling')
Signed-off-by: Matthew Finlay <matt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet/mlx5e: Implement a mlx5e workqueue
Matthew Finlay [Sun, 1 May 2016 19:59:56 +0000 (22:59 +0300)]
net/mlx5e: Implement a mlx5e workqueue

Implement a mlx5e workqueue to handle all mlx5e specific tasks.  Move
all tasks currently using the system workqueue to the new workqueue.
This is in preparation for vxlan using the mlx5e workqueue in order to
schedule port add/remove operations.

Signed-off-by: Matthew Finlay <matt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet/mlx5: Kconfig: Fix MLX5_EN/VXLAN build issue
Matthew Finlay [Sun, 1 May 2016 19:59:55 +0000 (22:59 +0300)]
net/mlx5: Kconfig: Fix MLX5_EN/VXLAN build issue

When MLX5_EN=y MLX5_CORE=y and VXLAN=m there is a linker error for
vxlan_get_rx_port() due to the fact that VXLAN is a module. Change Kconfig
to select VXLAN when MLX5_CORE=y. When MLX5_CORE=m there is no dependency
on the value of VXLAN.

Fixes: b3f63c3d5e2c ('net/mlx5e: Add netdev support for VXLAN tunneling')
Signed-off-by: Matthew Finlay <matt@mellanox.com>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agonet/mlx5: Unmap only the relevant IO memory mapping
Gal Pressman [Sun, 1 May 2016 19:59:54 +0000 (22:59 +0300)]
net/mlx5: Unmap only the relevant IO memory mapping

When freeing UAR the driver tries to unmap uar->map and uar->bf_map
which are mutually exclusive thus always unmapping a NULL pointer.
Make sure we only call iounmap() once, for the actual mapping.

Fixes: 0ba422410bbf ('net/mlx5: Fix global UAR mapping')
Signed-off-by: Gal Pressman <galp@mellanox.com>
Reported-by: Doron Tsur <doront@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agotracing: Don't display trigger file for events that can't be enabled
Chunyu Hu [Tue, 3 May 2016 11:34:34 +0000 (19:34 +0800)]
tracing: Don't display trigger file for events that can't be enabled

Currently register functions for events will be called
through the 'reg' field of event class directly without
any check when seting up triggers.

Triggers for events that don't support register through
debug fs (events under events/ftrace are for trace-cmd to
read event format, and most of them don't have a register
function except events/ftrace/functionx) can't be enabled
at all, and an oops will be hit when setting up trigger
for those events, so just not creating them is an easy way
to avoid the oops.

Link: http://lkml.kernel.org/r/1462275274-3911-1-git-send-email-chuhu@redhat.com
Cc: stable@vger.kernel.org # 3.14+
Fixes: 85f2b08268c01 ("tracing: Add basic event trigger framework")
Signed-off-by: Chunyu Hu <chuhu@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
8 years agodrm/amdgpu: set metadata pointer to NULL after freeing.
Dave Airlie [Tue, 3 May 2016 02:44:29 +0000 (12:44 +1000)]
drm/amdgpu: set metadata pointer to NULL after freeing.

Without this there was a double free of the metadata,
which ended up freeing the fd table for me here, and taking
out the machine more often than not.

I reproduced with X.org + modesetting DDX + latest llvm/mesa,
also required using dri3.

Cc: stable@vger.kernel.org
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
8 years agoHID: wacom: add missed stylus_in_proximity line back
Ping Cheng [Tue, 3 May 2016 04:17:34 +0000 (21:17 -0700)]
HID: wacom: add missed stylus_in_proximity line back

Commit 7e12978 ("HID: wacom: break out wacom_intuos_get_tool_type") by accident
removed stylus_in_proximity flag for Intuos series while shuffling the code
around.

Fix that by reintroducing that flag setting in wacom_intuos_inout(), where
it originally was.

Fixes: 7e12978 ("HID: wacom: break out wacom_intuos_get_tool_type")
Signed-off-by: Ping Cheng <pingc@wacom.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
8 years agofuse: update mailing list in MAINTAINERS
Miklos Szeredi [Tue, 3 May 2016 09:19:33 +0000 (11:19 +0200)]
fuse: update mailing list in MAINTAINERS

The fuse mailing list seems not to be open anymore.  The discussion on
fuse-devel@... is mostly userspace related anyway.

Reported-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
8 years agocrypto: qat - fix adf_ctl_drv.c:undefined reference to adf_init_pf_wq
Tadeusz Struk [Fri, 29 Apr 2016 17:43:40 +0000 (10:43 -0700)]
crypto: qat - fix adf_ctl_drv.c:undefined reference to adf_init_pf_wq

Fix undefined reference issue reported by kbuild test robot.

Cc: <stable@vger.kernel.org>
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
8 years agonetem: Segment GSO packets on enqueue
Neil Horman [Mon, 2 May 2016 16:20:15 +0000 (12:20 -0400)]
netem: Segment GSO packets on enqueue

This was recently reported to me, and reproduced on the latest net kernel,
when attempting to run netperf from a host that had a netem qdisc attached
to the egress interface:

[  788.073771] ---------------------[ cut here ]---------------------------
[  788.096716] WARNING: at net/core/dev.c:2253 skb_warn_bad_offload+0xcd/0xda()
[  788.129521] bnx2: caps=(0x00000001801949b3, 0x0000000000000000) len=2962
data_len=0 gso_size=1448 gso_type=1 ip_summed=3
[  788.182150] Modules linked in: sch_netem kvm_amd kvm crc32_pclmul ipmi_ssif
ghash_clmulni_intel sp5100_tco amd64_edac_mod aesni_intel lrw gf128mul
glue_helper ablk_helper edac_mce_amd cryptd pcspkr sg edac_core hpilo ipmi_si
i2c_piix4 k10temp fam15h_power hpwdt ipmi_msghandler shpchp acpi_power_meter
pcc_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c
sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt
i2c_algo_bit drm_kms_helper ahci ata_generic pata_acpi ttm libahci
crct10dif_pclmul pata_atiixp tg3 libata crct10dif_common drm crc32c_intel ptp
serio_raw bnx2 r8169 hpsa pps_core i2c_core mii dm_mirror dm_region_hash dm_log
dm_mod
[  788.465294] CPU: 16 PID: 0 Comm: swapper/16 Tainted: G        W
------------   3.10.0-327.el7.x86_64 #1
[  788.511521] Hardware name: HP ProLiant DL385p Gen8, BIOS A28 12/17/2012
[  788.542260]  ffff880437c036b8 f7afc56532a53db9 ffff880437c03670
ffffffff816351f1
[  788.576332]  ffff880437c036a8 ffffffff8107b200 ffff880633e74200
ffff880231674000
[  788.611943]  0000000000000001 0000000000000003 0000000000000000
ffff880437c03710
[  788.647241] Call Trace:
[  788.658817]  <IRQ>  [<ffffffff816351f1>] dump_stack+0x19/0x1b
[  788.686193]  [<ffffffff8107b200>] warn_slowpath_common+0x70/0xb0
[  788.713803]  [<ffffffff8107b29c>] warn_slowpath_fmt+0x5c/0x80
[  788.741314]  [<ffffffff812f92f3>] ? ___ratelimit+0x93/0x100
[  788.767018]  [<ffffffff81637f49>] skb_warn_bad_offload+0xcd/0xda
[  788.796117]  [<ffffffff8152950c>] skb_checksum_help+0x17c/0x190
[  788.823392]  [<ffffffffa01463a1>] netem_enqueue+0x741/0x7c0 [sch_netem]
[  788.854487]  [<ffffffff8152cb58>] dev_queue_xmit+0x2a8/0x570
[  788.880870]  [<ffffffff8156ae1d>] ip_finish_output+0x53d/0x7d0
...

The problem occurs because netem is not prepared to handle GSO packets (as it
uses skb_checksum_help in its enqueue path, which cannot manipulate these
frames).

The solution I think is to simply segment the skb in a simmilar fashion to the
way we do in __dev_queue_xmit (via validate_xmit_skb), with some minor changes.
When we decide to corrupt an skb, if the frame is GSO, we segment it, corrupt
the first segment, and enqueue the remaining ones.

tested successfully by myself on the latest net kernel, to which this applies

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Jamal Hadi Salim <jhs@mojatatu.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: netem@lists.linux-foundation.org
CC: eric.dumazet@gmail.com
CC: stephen@networkplumber.org
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMerge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge
David S. Miller [Tue, 3 May 2016 04:17:38 +0000 (00:17 -0400)]
Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge

Antonio Quartulli says:

====================
In this small batch of patches you have:
- a fix for our Distributed ARP Table that makes sure that the input
  provided to the hash function during a query is the same as the one
  provided during an insert (so to prevent false negatives), by Antonio
  Quartulli
- a fix for our new protocol implementation B.A.T.M.A.N. V that ensures
  that a hard interface is properly re-activated when it is brought down
  and then up again, by Antonio Quartulli
- two fixes respectively to the reference counting of the tt_local_entry
  and neigh_node objects, by Sven Eckelmann. Such bug is rather severe
  as it would prevent the netdev objects references by batman-adv from
  being released after shutdown.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
8 years agoMinimal fix-up of bad hashing behavior of hash_64()
Linus Torvalds [Mon, 2 May 2016 19:46:42 +0000 (12:46 -0700)]
Minimal fix-up of bad hashing behavior of hash_64()

This is a fairly minimal fixup to the horribly bad behavior of hash_64()
with certain input patterns.

In particular, because the multiplicative value used for the 64-bit hash
was intentionally bit-sparse (so that the multiply could be done with
shifts and adds on architectures without hardware multipliers), some
bits did not get spread out very much.  In particular, certain fairly
common bit ranges in the input (roughly bits 12-20: commonly with the
most information in them when you hash things like byte offsets in files
or memory that have block factors that mean that the low bits are often
zero) would not necessarily show up much in the result.

There's a bigger patch-series brewing to fix up things more completely,
but this is the fairly minimal fix for the 64-bit hashing problem.  It
simply picks a much better constant multiplier, spreading the bits out a
lot better.

NOTE! For 32-bit architectures, the bad old hash_64() remains the same
for now, since 64-bit multiplies are expensive.  The bigger hashing
cleanup will replace the 32-bit case with something better.

The new constants were picked by George Spelvin who wrote that bigger
cleanup series.  I just picked out the constants and part of the comment
from that series.

Cc: stable@vger.kernel.org
Cc: George Spelvin <linux@horizon.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoMerge tag 'md/4.6-rc6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md
Linus Torvalds [Mon, 2 May 2016 19:22:51 +0000 (12:22 -0700)]
Merge tag 'md/4.6-rc6-fix' of git://git./linux/kernel/git/shli/md

Pull MD fixes from Shaohua Li:
 "This update includes several trival fixes.  The only important one is
  to fix MD bio merge, which has big performance impact"

* tag 'md/4.6-rc6-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
  raid5: delete unnecessary warnning
  MD: make bio mergeable
  md/raid0: remove empty line printk from dump_zones
  md/raid0: fix uninitialized variable bug

8 years agoPCI: Do not treat EPROBE_DEFER as device attach failure
Lukas Wunner [Mon, 2 May 2016 18:48:31 +0000 (13:48 -0500)]
PCI: Do not treat EPROBE_DEFER as device attach failure

Linux 4.5 introduced a behavioral change in device probing during the
suspend process with commit 013c074f8642 ("PM / sleep: prohibit devices
probing during suspend/hibernation"): It defers device probing during the
entire suspend process, starting from the prepare phase and ending with the
complete phase.  A rule existed before that "we rely on subsystems not to
do any probing once a device is suspended" but it is enforced only now
(Alan Stern, https://lkml.org/lkml/2015/9/15/908).

This resulted in a WARN splat if a PCI device (e.g., Thunderbolt) is
plugged in while the system is asleep: Upon waking up, pciehp_resume()
discovers new devices in the resume phase and immediately tries to bind
them to a driver.  Since probing is now deferred, device_attach() returns
-EPROBE_DEFER, which provoked a WARN in pci_bus_add_device().

Linux 4.6-rc1 aggravates the situation with commit ab1a187bba5c ("PCI:
Check device_attach() return value always"): If device_attach() returns a
negative value, pci_bus_add_device() now removes the sysfs and procfs
entries for the device and pci_bus_add_devices() subsequently locks up with
a BUG.  Even with the BUG fixed we're still in trouble because the device
remains on the deferred probing list even though its sysfs and procfs
entries are gone and its children won't be added.

Fix by not interpreting -EPROBE_DEFER as failure.  The device will be
probed eventually (through device_unblock_probing() in dpm_complete()) and
there is proper locking in place to avoid races (e.g., if devices are
unplugged again und thus deleted from the system before deferred probing
happens, I have tested this).  Also, those functions which dereference
dev->driver (e.g. pci_pm_*()) do contain proper NULL pointer checks.  So it
seems safe to ignore -EPROBE_DEFER.

Fixes: ab1a187bba5c ("PCI: Check device_attach() return value always")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Alan Stern <stern@rowland.harvard.edu>
8 years agoPCI: Fix BUG on device attach failure
Lukas Wunner [Mon, 2 May 2016 18:48:25 +0000 (13:48 -0500)]
PCI: Fix BUG on device attach failure

Previously when pci_bus_add_device() called device_attach() and it returned
a negative value, we emitted a WARN but carried on.

Commit ab1a187bba5c ("PCI: Check device_attach() return value always"),
introduced in Linux 4.6-rc1, changed this to unwind all steps preceding
device_attach() and to not set dev->is_added = 1.

The latter leads to a BUG if pci_bus_add_device() was called from
pci_bus_add_devices().  Fix by not recursing to a child bus if
device_attach() failed for the bridge leading to it.

This can be triggered by plugging in a PCI device (e.g. Thunderbolt) while
the system is asleep.  The system locks up when woken because
device_attach() returns -EPROBE_DEFER.

Fixes: ab1a187bba5c ("PCI: Check device_attach() return value always")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
8 years agoMerge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Linus Torvalds [Mon, 2 May 2016 16:59:57 +0000 (09:59 -0700)]
Merge branch 'for_linus' of git://git./linux/kernel/git/jack/linux-fs

Pull UDF fix from Jan Kara:
 "A fix of a regression in UDF that got introduced in 4.6-rc1 by one of
  the charset encoding fixes"

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: Fix conversion of 'dstring' fields to UTF8

8 years agoMerge tag 'gpio-v4.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux...
Linus Torvalds [Mon, 2 May 2016 16:54:22 +0000 (09:54 -0700)]
Merge tag 'gpio-v4.6-4' of git://git./linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:
 "Here are some late but important fixes for the v4.6 kernel series.
  ACPI and RCAR, so two driver fixes (PM related) and a self-evident
  string lookup fix for ACPI GPIOs:

   - A serious ACPI fix targeted for stable: lookup strings were being
     free'd.

   - Revert two patches from the RCAR driver"

* tag 'gpio-v4.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpiolib-acpi: Duplicate con_id string when adding it to the crs lookup list
  Revert "gpio: rcar: Fine-grained Runtime PM support"
  Revert "gpio: rcar: Add Runtime PM handling for interrupts"