cascardo/linux.git
7 years agocpufreq: schedutil: map raw required frequency to driver frequency
Steve Muckle [Wed, 13 Jul 2016 20:25:26 +0000 (13:25 -0700)]
cpufreq: schedutil: map raw required frequency to driver frequency

The slow-path frequency transition path is relatively expensive as it
requires waking up a thread to do work. Should support be added for
remote CPU cpufreq updates, that would also be expensive since it
requires an IPI. These activities should be avoided if they are not necessary.

To that end, calculate the actual driver-supported frequency required by
the new utilization value in schedutil by using the recently added
cpufreq_driver_resolve_freq API. If it is the same as the previously
requested driver frequency then there is no need to continue with the
update assuming the cpu frequency limits have not changed. This will
have additional benefits should the semantics of the rate limit be
changed to apply solely to frequency transitions rather than to
frequency calculations in schedutil.

The last raw required frequency is cached. This allows the driver
frequency lookup to be skipped in the event that the new raw required
frequency matches the last one, assuming a frequency update has not been
forced due to limits changing (indicated by a next_freq value of
UINT_MAX, see sugov_should_update_freq).
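
The caching described above could look roughly like the following sketch. It is not the schedutil code: the struct, the sketch_* names and the rounding rule in sketch_resolve() are assumptions made for illustration, with sketch_resolve() standing in for cpufreq_driver_resolve_freq().

```c
#include <limits.h>

/* Hypothetical, simplified stand-in for the governor's per-policy state. */
struct sketch_policy {
	unsigned int cached_raw_freq;	/* last raw (unresolved) frequency */
	unsigned int next_freq;		/* last driver frequency requested;
					 * UINT_MAX forces a fresh update */
	unsigned int lookups;		/* driver lookups done, for illustration */
};

/* Stand-in for the driver lookup: round up to the next 100 MHz step (kHz). */
static unsigned int sketch_resolve(struct sketch_policy *p, unsigned int raw_khz)
{
	p->lookups++;
	return (raw_khz + 99999) / 100000 * 100000;
}

/* Skip the driver lookup when the raw frequency matches the cached one. */
static unsigned int sketch_get_next_freq(struct sketch_policy *p,
					 unsigned int raw_khz)
{
	if (raw_khz == p->cached_raw_freq && p->next_freq != UINT_MAX)
		return p->next_freq;

	p->cached_raw_freq = raw_khz;
	return sketch_resolve(p, raw_khz);
}

/* Returns 1 if a frequency transition is needed, 0 if it can be skipped. */
static int sketch_update(struct sketch_policy *p, unsigned int raw_khz)
{
	unsigned int next = sketch_get_next_freq(p, raw_khz);

	if (next == p->next_freq)
		return 0;	/* same driver frequency: nothing to do */

	p->next_freq = next;
	return 1;
}
```

In this sketch a repeated raw frequency costs neither a lookup nor a transition, and a different raw frequency that resolves to the same driver frequency costs a lookup but no transition.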

Signed-off-by: Steve Muckle <smuckle@linaro.org>
Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agoGFS2: Fix gfs2_replay_incr_blk for multiple journal sizes
Bob Peterson [Thu, 21 Jul 2016 18:02:44 +0000 (13:02 -0500)]
GFS2: Fix gfs2_replay_incr_blk for multiple journal sizes

Before this patch, if you used gfs2_jadd to add new journals of a
size smaller than the existing journals, replaying those new journals
would withdraw. That's because function gfs2_replay_incr_blk was
using the number of journal blocks (jd_block) from the superblock's
journal pointer. In other words, "My journal's max size" rather than
"the journal we're replaying's size." This patch changes the function
to use the size of the pertinent journal rather than always using the
journal we happen to be using.
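
A minimal sketch of the idea, assuming a simplified journal descriptor (struct sketch_journal and its field are hypothetical, not the GFS2 types): the wrap-around point comes from the journal being replayed, which is passed in explicitly.

```c
/* Hypothetical stand-in for a journal descriptor. */
struct sketch_journal {
	unsigned int blocks;	/* size of this journal in blocks */
};

/*
 * Advance *blk by one block, wrapping to 0 at the end of the journal
 * that is being replayed (jd), not the journal this node happens to use.
 */
static void sketch_replay_incr_blk(const struct sketch_journal *jd,
				   unsigned int *blk)
{
	if (++*blk == jd->blocks)
		*blk = 0;
}
```

With a 4-block journal, a position of 3 wraps to 0 even if the local journal is larger.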

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
7 years agox86/fpu: Do not BUG_ON() in early FPU code
Dave Hansen [Wed, 20 Jul 2016 19:45:51 +0000 (12:45 -0700)]
x86/fpu: Do not BUG_ON() in early FPU code

I don't think it is really possible to have a system where CPUID
enumerates support for XSAVE but that it does not have FP/SSE
(they are "legacy" features and always present).

But, I did manage to hit this case in qemu when I enabled its
somewhat shaky XSAVE support.  The bummer is that the FPU is set
up before we parse the command-line or have *any* console support
including earlyprintk.  That turned what should have been an easy
thing to debug into a bit more of an odyssey.

So a BUG() here is worthless.  All it does is guarantee that
if/when we hit this case we have an empty console.  So, remove
the BUG() and try to limp along by disabling XSAVE and trying to
continue.  Add a comment on why we are doing this, and also add
a common "out_disable" path for leaving fpu__init_system_xstate().

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160720194551.63BB2B58@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
7 years agocpufreq: add cpufreq_driver_resolve_freq()
Steve Muckle [Wed, 13 Jul 2016 20:25:25 +0000 (13:25 -0700)]
cpufreq: add cpufreq_driver_resolve_freq()

Cpufreq governors may need to know what a particular target frequency
maps to in the driver without necessarily wanting to set the frequency.
Support this operation via a new cpufreq API,
cpufreq_driver_resolve_freq(). This API returns the lowest driver
frequency equal to or greater than the target frequency
(CPUFREQ_RELATION_L), subject to any policy (min/max) or driver
limitations. The mapping is also cached in the policy so that a
subsequent fast_switch operation can avoid repeating the same lookup.

The API will call a new cpufreq driver callback, resolve_freq(), if it
has been registered by the driver. Otherwise the frequency is resolved
via cpufreq_frequency_table_target(). Rather than require ->target()
style drivers to provide a resolve_freq() callback it is left to the
caller to ensure that the driver implements this callback if necessary
to use cpufreq_driver_resolve_freq().
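
As an illustration of the "lowest frequency at or above the target" rule, here is a sketch over a plain array of frequencies. The function name, its parameters and the fallback to the highest allowed entry are assumptions for this example; it is not the cpufreq_frequency_table_target() implementation.

```c
#include <stddef.h>

/*
 * Return the lowest table frequency that is >= target and within
 * [min, max]. If no entry is high enough, fall back to the highest
 * allowed entry (an assumption of this sketch). Returns 0 if no table
 * entry lies within the limits.
 */
static unsigned int sketch_resolve_freq(const unsigned int *table, size_t n,
					unsigned int min, unsigned int max,
					unsigned int target)
{
	unsigned int best = 0;		/* lowest allowed entry >= target */
	unsigned int highest = 0;	/* highest allowed entry seen */
	size_t i;

	/* Apply the policy limits to the target first. */
	if (target < min)
		target = min;
	if (target > max)
		target = max;

	for (i = 0; i < n; i++) {
		unsigned int f = table[i];

		if (f < min || f > max)
			continue;
		if (f > highest)
			highest = f;
		if (f >= target && (best == 0 || f < best))
			best = f;
	}

	return best ? best : highest;
}
```

For a table of 800000, 1200000, 1600000 and 2000000 kHz with limits of 800000 to 1600000, a target of 1000000 resolves to 1200000 and a target of 1800000 is clamped and resolves to 1600000.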

Suggested-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Steve Muckle <smuckle@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agoperf tools: Add AVX-512 instructions to the new instructions test
Adrian Hunter [Wed, 20 Jul 2016 08:30:37 +0000 (11:30 +0300)]
perf tools: Add AVX-512 instructions to the new instructions test

Previous patches added support for Intel's AVX-512 instructions to the
kernel and perf tools instruction decoders.

AVX-512 instructions are documented in Intel Architecture Instruction
Set Extensions Programming Reference (February 2016).

Add a representative set of instructions to perf's "new instructions"
test. e.g.

perf test "new instructions"

Or to view a particular instruction:

perf test -v "new instructions" 2>&1 | grep vbroadcasti64x4

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: X86 ML <x86@kernel.org>
Link: http://lkml.kernel.org/r/1469003437-32706-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
7 years agoperf tools: Add AVX-512 support to the instruction decoder used by Intel PT
Adrian Hunter [Wed, 20 Jul 2016 08:30:36 +0000 (11:30 +0300)]
perf tools: Add AVX-512 support to the instruction decoder used by Intel PT

Add support for Intel's AVX-512 instructions to perf tools instruction
decoder used by Intel PT.  The kernel's instruction decoder was updated in
a previous patch.

AVX-512 instructions are documented in Intel Architecture Instruction Set
Extensions Programming Reference (February 2016).

AVX-512 instructions are identified by an EVEX prefix which, for the purpose
of instruction decoding, can be treated as though it were a 4-byte VEX
prefix.

Existing instructions which can now accept an EVEX prefix need not be
further annotated in the op code map (x86-opcode-map.txt). In the case of
new instructions, the op code map is updated accordingly.

Also add associated Mask Instructions that are used to manipulate mask
registers used in AVX-512 instructions.

A representative set of instructions is added to the perf tools new
instructions test in a subsequent patch.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: X86 ML <x86@kernel.org>
Link: http://lkml.kernel.org/r/1469003437-32706-4-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
7 years agox86/insn: Add AVX-512 support to the instruction decoder
Adrian Hunter [Wed, 20 Jul 2016 08:30:35 +0000 (11:30 +0300)]
x86/insn: Add AVX-512 support to the instruction decoder

Add support for Intel's AVX-512 instructions to the instruction decoder.

AVX-512 instructions are documented in Intel Architecture Instruction
Set Extensions Programming Reference (February 2016).

AVX-512 instructions are identified by an EVEX prefix which, for the
purpose of instruction decoding, can be treated as though it were a
4-byte VEX prefix.
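
To illustrate treating EVEX as a longer VEX prefix, the sketch below classifies a prefix by its first byte. The function name is hypothetical, and the sketch assumes 64-bit mode, where the bytes 0xC5, 0xC4 and 0x62 are unambiguous; it does not reproduce the kernel decoder's handling of 32-bit mode.

```c
#include <stdint.h>

/*
 * Return the total length in bytes of a VEX/EVEX prefix that starts
 * with 'first', or 0 if the byte does not start such a prefix.
 */
static int sketch_vex_prefix_len(uint8_t first)
{
	switch (first) {
	case 0xC5:
		return 2;	/* 2-byte VEX */
	case 0xC4:
		return 3;	/* 3-byte VEX */
	case 0x62:
		return 4;	/* EVEX: escape byte plus three payload bytes */
	default:
		return 0;
	}
}
```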

Existing instructions which can now accept an EVEX prefix need not be
further annotated in the op code map (x86-opcode-map.txt). In the case
of new instructions, the op code map is updated accordingly.

Also add associated Mask Instructions that are used to manipulate mask
registers used in AVX-512 instructions.

The 'perf tools' instruction decoder is updated in a subsequent patch.
And a representative set of instructions is added to the perf tools new
instructions test in a subsequent patch.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: X86 ML <x86@kernel.org>
Link: http://lkml.kernel.org/r/1469003437-32706-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
7 years agocpufreq: intel_pstate: Check cpuid for MSR_HWP_INTERRUPT
Srinivas Pandruvada [Tue, 19 Jul 2016 23:52:01 +0000 (16:52 -0700)]
cpufreq: intel_pstate: Check cpuid for MSR_HWP_INTERRUPT

The MSR MSR_HWP_INTERRUPT is valid only when CPUID.06H:EAX[8] = 1, so
check for feature before accessing this MSR.
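
The condition can be expressed as a simple bit test on the EAX value returned by CPUID leaf 06H. This is a sketch with hypothetical names, not the driver's code; how the EAX value is obtained is left out.

```c
#include <stdint.h>

/* Bit 8 of CPUID.06H:EAX, as stated in the commit message. */
#define SKETCH_CPUID6_EAX_BIT8	(UINT32_C(1) << 8)

/* Return 1 if MSR_HWP_INTERRUPT may be accessed, 0 otherwise. */
static int sketch_hwp_interrupt_msr_valid(uint32_t cpuid6_eax)
{
	return (cpuid6_eax & SKETCH_CPUID6_EAX_BIT8) != 0;
}
```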

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agointel_pstate: Update cpu_frequency tracepoint every time
Rafael J. Wysocki [Tue, 19 Jul 2016 13:10:37 +0000 (15:10 +0200)]
intel_pstate: Update cpu_frequency tracepoint every time

Currently, intel_pstate only updates the cpu_frequency tracepoint
if the new P-state to set is different from the current one, but
that causes powertop to report 100% idle on a 100% loaded system
sometimes.

Prevent that from happening by updating the cpu_frequency tracepoint
every time intel_pstate_update_pstate() is called.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
7 years agocpufreq: intel_pstate: clean remnant struct element
Carsten Emde [Mon, 18 Jul 2016 23:19:15 +0000 (01:19 +0200)]
cpufreq: intel_pstate: clean remnant struct element

When I was working with the Intel P state driver I came across a
remnant struct element that is no longer needed after the function
intel_pstate_calc_freq() was retired.

Signed-off-by: Carsten Emde <C.Emde@osadl.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
7 years agox86/boot: Reorganize and clean up the BIOS area reservation code
Ingo Molnar [Thu, 21 Jul 2016 07:53:52 +0000 (09:53 +0200)]
x86/boot: Reorganize and clean up the BIOS area reservation code

So the reserve_ebda_region() code has accumulated a number of
problems over the years that make it really difficult to read
and understand:

- The calculation of 'lowmem' and 'ebda_addr' is an unnecessarily
  interleaved mess of first lowmem, then ebda_addr, then lowmem tweaks...

- 'lowmem' here means 'super low mem' - i.e. 16-bit addressable memory. In other
  parts of the x86 code 'lowmem' means 32-bit addressable memory... This makes it
  super confusing to read.

- It does not help at all that we have various memory range markers, half of which
  are 'start of range', half of which are 'end of range' - but this crucial
  property is not obvious in the naming at all ... gave me a headache trying to
  understand all this.

- Also, the 'ebda_addr' name sucks: it highlights that it's an address (which is
  obvious, all values here are addresses!), while it does not highlight that it's
  the _start_ of the EBDA region ...

- 'BIOS_LOWMEM_KILOBYTES' says a lot of things, except that this is the only value
  that is a pointer to a value, not a memory range address!

- The function name itself is a misnomer: it says 'reserve_ebda_region()' while
  its main purpose is to reserve all the firmware ROM typically between 640K and
  1MB, while the 'EBDA' part is only a small part of that ...

- Likewise, the paravirt quirk flag name 'ebda_search' is misleading as well: this
  too should be about whether to reserve firmware areas in the paravirt case.

- In fact thinking about this as 'end of RAM' is confusing: what this function
  *really* wants to reserve is firmware data and code areas! Once the thinking is
  inverted from a mixed 'ram' and 'reserved firmware area' notion to a pure
  'reserved area' notion everything becomes a lot clearer.

To improve all this rewrite the whole code (without changing the logic):

- Firstly invert the naming from 'lowmem end' to 'BIOS reserved area start'
  and propagate this concept through all the variable names and constants.

BIOS_RAM_SIZE_KB_PTR // was: BIOS_LOWMEM_KILOBYTES

BIOS_START_MIN // was: INSANE_CUTOFF

ebda_start // was: ebda_addr
bios_start // was: lowmem

BIOS_START_MAX // was: LOWMEM_CAP

- Then clean up the name of the function itself by renaming it
  to reserve_bios_regions() and renaming the ::ebda_search paravirt
  flag to ::reserve_bios_regions.

- Fix up all the comments (fix typos), harmonize and simplify their
  formulation and remove comments that become unnecessary due to
  the much better naming all around.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
7 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Herbert Xu [Thu, 21 Jul 2016 04:26:55 +0000 (12:26 +0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Merge the crypto tree to resolve conflict in qat Makefile.

7 years agocrypto: qat - make qat_asym_algs.o depend on asn1 headers
Jan Stancek [Thu, 30 Jun 2016 10:23:51 +0000 (12:23 +0200)]
crypto: qat - make qat_asym_algs.o depend on asn1 headers

Parallel build can sporadically fail because asn1 headers may
not be built yet by the time qat_asym_algs.o is compiled:
  drivers/crypto/qat/qat_common/qat_asym_algs.c:55:32: fatal error: qat_rsapubkey-asn1.h: No such file or directory
   #include "qat_rsapubkey-asn1.h"

Cc: stable@vger.kernel.org
Signed-off-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agodm: allow bio-based table to be upgraded to bio-based with DAX support
Toshi Kani [Tue, 28 Jun 2016 19:37:15 +0000 (13:37 -0600)]
dm: allow bio-based table to be upgraded to bio-based with DAX support

Allow table type DM_TYPE_BIO_BASED to extend with DM_TYPE_DAX_BIO_BASED
since DM_TYPE_DAX_BIO_BASED supports bio-based requests.

This is needed to allow a snapshot of an LV with DAX support to be
removed.  One of the intermediate table reloads that lvm2 does switches
from DM_TYPE_BIO_BASED to DM_TYPE_DAX_BIO_BASED.  No known reason to
disallow this so...

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
7 years agodm snap: add fake origin_direct_access
Toshi Kani [Tue, 28 Jun 2016 19:37:16 +0000 (13:37 -0600)]
dm snap: add fake origin_direct_access

dax-capable mapped-device is marked as DM_TYPE_DAX_BIO_BASED,
which supports both dax and bio-based operations.  dm-snap
needs to work with dax-capable device when bio-based operation
is used.

Add fake origin_direct_access() to origin device so that its
origin device is also marked as DM_TYPE_DAX_BIO_BASED for
dax-capable device.  This allows the target's DM table to be extended.
dm-snap works normally when bio-based operation is used.

dm-snap does not support dax operation, and mount with dax
option to a target device or snapshot device fails.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Mike Snitzer <snitzer@redhat.com>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
7 years agodm stripe: add DAX support
Toshi Kani [Fri, 24 Jun 2016 18:23:30 +0000 (12:23 -0600)]
dm stripe: add DAX support

Change dm-stripe to implement direct_access function,
stripe_direct_access(), which maps bdev and sector and
calls direct_access function of its physical target device.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
7 years agodm error: add DAX support
Mike Snitzer [Fri, 24 Jun 2016 21:09:35 +0000 (17:09 -0400)]
dm error: add DAX support

Allow the error target to replace an existing DAX-enabled target.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
7 years agodm linear: add DAX support
Toshi Kani [Wed, 22 Jun 2016 23:54:54 +0000 (17:54 -0600)]
dm linear: add DAX support

Change dm-linear to implement direct_access function,
linear_direct_access(), which maps sector and calls direct_access
function of its physical target device.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
7 years agodm: add infrastructure for DAX support
Toshi Kani [Wed, 22 Jun 2016 23:54:53 +0000 (17:54 -0600)]
dm: add infrastructure for DAX support

Change mapped device to implement direct_access function,
dm_blk_direct_access(), which calls a target direct_access function.
'struct target_type' is extended to have target direct_access interface.
This function limits direct accessible size to the dm_target's limit
with max_io_len().

Add dm_table_supports_dax() to iterate all targets and associated block
devices to check for DAX support.  To add DAX support to a DM target the
target must only implement the direct_access function.
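
A rough sketch of the "every target must provide direct_access" rule follows. The struct, the function names and the treatment of an empty table are assumptions for illustration; the check of the associated block devices mentioned above is not modelled.

```c
#include <stddef.h>

/* Hypothetical stand-in for a target type with an optional DAX hook. */
struct sketch_target {
	long (*direct_access)(unsigned long sector);
};

/* Example hook used only for illustration: maps a sector to itself. */
static long sketch_dummy_direct_access(unsigned long sector)
{
	return (long)sector;
}

/*
 * Return 1 only if every target in the table implements direct_access.
 * An empty table is treated as not supporting DAX (an assumption).
 */
static int sketch_table_supports_dax(const struct sketch_target *targets,
				     size_t n)
{
	size_t i;

	if (n == 0)
		return 0;

	for (i = 0; i < n; i++)
		if (!targets[i].direct_access)
			return 0;

	return 1;
}
```

A single target without the hook is enough to make the whole table non-DAX in this sketch.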

Add a new dm type, DM_TYPE_DAX_BIO_BASED, which indicates that mapped
device supports DAX and is bio based.  This new type is used to assure
that all target devices have DAX support and remain that way after
QUEUE_FLAG_DAX is set in mapped device.

At initial table load, QUEUE_FLAG_DAX is set to mapped device when setting
DM_TYPE_DAX_BIO_BASED to the type.  Any subsequent table load to the
mapped device must have the same type, or else it fails per the check in
table_load().

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
7 years agoMerge remote-tracking branch 'jens/for-4.8/core' into dm-4.8
Mike Snitzer [Thu, 21 Jul 2016 03:48:25 +0000 (23:48 -0400)]
Merge remote-tracking branch 'jens/for-4.8/core' into dm-4.8

DM's DAX support depends on block core's newly added QUEUE_FLAG_DAX.

7 years agoblock: Fix front merge check
Damien Le Moal [Thu, 21 Jul 2016 03:40:47 +0000 (21:40 -0600)]
block: Fix front merge check

For a front merge, the maximum number of sectors of the
request must be checked against the front merge BIO sector,
not the current sector of the request.

Signed-off-by: Damien Le Moal <damien.lemoal@hgst.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agoblock: do not merge requests without consulting with io scheduler
Tahsin Erdogan [Thu, 7 Jul 2016 18:48:22 +0000 (11:48 -0700)]
block: do not merge requests without consulting with io scheduler

Before merging a bio into an existing request, io scheduler is called to
get its approval first. However, the requests that come from a plug
flush may get merged by block layer without consulting with io
scheduler.

In case of CFQ, this can cause fairness problems. For instance, if a
request gets merged into a low weight cgroup's request, high weight cgroup
now will depend on low weight cgroup to get scheduled. If high weight cgroup
needs that io request to complete before submitting more requests, then it
will also lose its timeslice.

Following script demonstrates the problem. Group g1 has a low weight, g2
and g3 have equal high weights but g2's requests are adjacent to g1's
requests so they are subject to merging. Due to these merges, g2 gets
poor disk time allocation.

cat > cfq-merge-repro.sh << "EOF"
#!/bin/bash
set -e

IO_ROOT=/mnt-cgroup/io

mkdir -p $IO_ROOT

if ! mount | grep -qw $IO_ROOT; then
  mount -t cgroup none -oblkio $IO_ROOT
fi

cd $IO_ROOT

for i in g1 g2 g3; do
  if [ -d $i ]; then
    rmdir $i
  fi
done

mkdir g1 && echo 10 > g1/blkio.weight
mkdir g2 && echo 495 > g2/blkio.weight
mkdir g3 && echo 495 > g3/blkio.weight

RUNTIME=10

(echo $BASHPID > g1/cgroup.procs &&
 fio --readonly --name name1 --filename /dev/sdb \
     --rw read --size 64k --bs 64k --time_based \
     --runtime=$RUNTIME --offset=0k &> /dev/null)&

(echo $BASHPID > g2/cgroup.procs &&
 fio --readonly --name name1 --filename /dev/sdb \
     --rw read --size 64k --bs 64k --time_based \
     --runtime=$RUNTIME --offset=64k &> /dev/null)&

(echo $BASHPID > g3/cgroup.procs &&
 fio --readonly --name name1 --filename /dev/sdb \
     --rw read --size 64k --bs 64k --time_based \
     --runtime=$RUNTIME --offset=256k &> /dev/null)&

sleep $((RUNTIME+1))

for i in g1 g2 g3; do
  echo ---- $i ----
  cat $i/blkio.time
done

EOF
# ./cfq-merge-repro.sh
---- g1 ----
8:16 162
---- g2 ----
8:16 165
---- g3 ----
8:16 686

After applying the patch:

# ./cfq-merge-repro.sh
---- g1 ----
8:16 90
---- g2 ----
8:16 445
---- g3 ----
8:16 471

Signed-off-by: Tahsin Erdogan <tahsin@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agonvme/pci: Provide SR-IOV support
Keith Busch [Mon, 20 Jun 2016 15:41:06 +0000 (09:41 -0600)]
nvme/pci: Provide SR-IOV support

This registers an sr-iov callback for nvme.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agoblock: Fix spelling in a source code comment
Bart Van Assche [Tue, 19 Jul 2016 15:18:06 +0000 (08:18 -0700)]
block: Fix spelling in a source code comment

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agonvme: initialize variable before logical OR'ing it
Jay Freyensee [Thu, 21 Jul 2016 03:26:16 +0000 (21:26 -0600)]
nvme: initialize variable before logical OR'ing it

It is typically not good coding or secure coding practice
to logical OR a variable without an initialization value first.
Here on this line:

integrity.flags |= BLK_INTEGRITY_DEVICE_CAPABLE;

BLK_INTEGRITY_DEVICE_CAPABLE is being OR'ed to a member variable
never set to an initial value. This patch fixes that.
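
The pattern can be sketched as follows: clear the structure first, then OR in the flag. The struct, its fields and the flag value are hypothetical stand-ins, not the block layer definitions.

```c
#include <string.h>

#define SKETCH_DEVICE_CAPABLE	(1u << 2)	/* hypothetical flag value */

/* Hypothetical stand-in for an integrity profile kept on the stack. */
struct sketch_integrity {
	unsigned int flags;
	unsigned int tuple_size;
};

/* Give every member a known value before OR'ing in the capability flag. */
static void sketch_init_integrity(struct sketch_integrity *bi)
{
	memset(bi, 0, sizeof(*bi));
	bi->flags |= SKETCH_DEVICE_CAPABLE;
}
```

Without the memset(), flags would keep whatever bits happened to be in memory, and the OR would preserve them.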

Signed-off-by: Jay Freyensee <james.p.freyensee@intel.com>
Reviewed-by: Ming Lin <ming.l@samsung.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agoblock: expose QUEUE_FLAG_DAX in sysfs
Yigal Korman [Thu, 23 Jun 2016 21:05:51 +0000 (17:05 -0400)]
block: expose QUEUE_FLAG_DAX in sysfs

Provides the ability to identify DAX enabled devices in userspace.

Signed-off-by: Yigal Korman <yigal@plexistor.com>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agoblock: add QUEUE_FLAG_DAX for devices to advertise their DAX support
Toshi Kani [Thu, 23 Jun 2016 21:05:50 +0000 (17:05 -0400)]
block: add QUEUE_FLAG_DAX for devices to advertise their DAX support

Currently, presence of direct_access() in block_device_operations
indicates support of DAX on its block device.  Because
block_device_operations is instantiated with 'const', this DAX
capability may not be enabled conditionally.

In preparation for supporting DAX to device-mapper devices, add
QUEUE_FLAG_DAX to request_queue flags to advertise their DAX
support.  This will allow the DAX capability to be set based on how
the mapped device is composed.

Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: <linux-s390@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agoInput: tsc200x - report proper input_dev name
Michael Welling [Wed, 20 Jul 2016 17:02:07 +0000 (10:02 -0700)]
Input: tsc200x - report proper input_dev name

Passes input_id struct to the common probe function for the tsc200x drivers
instead of just the bustype.

This allows for the use of the product variable to set the input_dev->name
variable according to the type of touchscreen used. Note that when we
introduced support for TSC2004 we started calling everything TSC200X, so
let's keep this quirk.

Signed-off-by: Michael Welling <mwelling@ieee.org>
Cc: stable@vger.kernel.org
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Pali Rohár <pali.rohar@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agotty/vt/keyboard: fix OOB access in do_compute_shiftstate()
Dmitry Torokhov [Mon, 27 Jun 2016 21:12:34 +0000 (14:12 -0700)]
tty/vt/keyboard: fix OOB access in do_compute_shiftstate()

The size of individual keymap in drivers/tty/vt/keyboard.c is NR_KEYS,
which is currently 256, whereas number of keys/buttons in input device (and
therefore in key_down) is much larger - KEY_CNT - 768, and that can cause
out-of-bound access when we do

sym = U(key_maps[0][k]);

with large 'k'.

To fix it we should not attempt iterating beyond smaller of NR_KEYS and
KEY_CNT.

Also while at it let's switch to for_each_set_bit() instead of open-coding
it.
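
The bound can be sketched as below: the loop over the key_down bitmap stops at the smaller of the two sizes, so the keymap is never indexed past its end. The sketch_* names and helpers are hypothetical, the sizes mirror the values quoted above, and a plain loop is used in place of the kernel's for_each_set_bit().

```c
#include <limits.h>
#include <stddef.h>

#define SKETCH_NR_KEYS	256	/* entries in one keymap */
#define SKETCH_KEY_CNT	768	/* bits in the key_down bitmap */
#define SKETCH_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_KEY_LONGS \
	((SKETCH_KEY_CNT + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

static void sketch_set_bit(unsigned long *map, size_t bit)
{
	map[bit / SKETCH_BITS_PER_LONG] |= 1UL << (bit % SKETCH_BITS_PER_LONG);
}

static int sketch_test_bit(const unsigned long *map, size_t bit)
{
	return (map[bit / SKETCH_BITS_PER_LONG] >>
		(bit % SKETCH_BITS_PER_LONG)) & 1UL;
}

/* Count pressed keys that have a non-zero keymap entry. */
static unsigned int sketch_count_mapped_keys(const unsigned long *key_down,
					     const unsigned short *keymap)
{
	/* Never iterate beyond the smaller of the two sizes. */
	size_t limit = SKETCH_NR_KEYS < SKETCH_KEY_CNT ?
		       SKETCH_NR_KEYS : SKETCH_KEY_CNT;
	unsigned int count = 0;
	size_t k;

	for (k = 0; k < limit; k++)
		if (sketch_test_bit(key_down, k) && keymap[k] != 0)
			count++;

	return count;
}
```

A pressed key with code 700 is simply ignored here, because it lies beyond the 256-entry keymap.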

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoblock: unexport various bio mapping helpers
Christoph Hellwig [Tue, 19 Jul 2016 09:31:54 +0000 (11:31 +0200)]
block: unexport various bio mapping helpers

They are unused and potential new users really should use the
blk_rq_map* versions.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agoscsi/osd: open code blk_make_request
Christoph Hellwig [Tue, 19 Jul 2016 09:31:53 +0000 (11:31 +0200)]
scsi/osd: open code blk_make_request

I wish the OSD code could simply use blk_rq_map_* helpers like
everyone else, but the complex nature of deciding if we have
DATA IN and/or DATA OUT buffers might make this impossible
(at least for a mere human like me).

But using blk_rq_append_bio at least allows sharing the setup code
between requests with or without data buffers, and given that this
is the last user of blk_make_request it allows getting rid of that
somewhat awkward interface.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Boaz Harrosh <ooo@electrozaur.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agotarget: stop using blk_make_request
Christoph Hellwig [Tue, 19 Jul 2016 09:31:52 +0000 (11:31 +0200)]
target: stop using blk_make_request

Using blk_rq_append_bio allows to append the bios to the request
directly instead of having to build up a list first, and also
allows to have a single code path for requests with or without
data attached to them.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agoblock: simplify and export blk_rq_append_bio
Christoph Hellwig [Tue, 19 Jul 2016 09:31:51 +0000 (11:31 +0200)]
block: simplify and export blk_rq_append_bio

The target SCSI passthrough backend is much better served with the low-level
blk_rq_append_bio construct than the helpers built on top of it, so export it.

Also use the opportunity to remove the pointless request_queue argument and
make the code flow a little more readable.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agoblock: ensure bios return from blk_get_request are properly initialized
Christoph Hellwig [Tue, 19 Jul 2016 09:31:50 +0000 (11:31 +0200)]
block: ensure bios return from blk_get_request are properly initialized

blk_get_request is used for BLOCK_PC and similar passthrough requests.
Currently we always need to call blk_rq_set_block_pc or an open coded
version of it to allow appending bios using the request mapping helpers
later on, which is a somewhat awkward API.  Instead move the
initialization part of blk_rq_set_block_pc into blk_get_request, so that
we always have a safe to use request.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agovirtio_blk: use blk_rq_map_kern
Christoph Hellwig [Tue, 19 Jul 2016 09:31:49 +0000 (11:31 +0200)]
virtio_blk: use blk_rq_map_kern

Similar to how SCSI and NVMe prepare passthrough requests.  This avoids
poking into request internals too much.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agomemstick: don't allow REQ_TYPE_BLOCK_PC requests
Christoph Hellwig [Tue, 19 Jul 2016 09:31:48 +0000 (11:31 +0200)]
memstick: don't allow REQ_TYPE_BLOCK_PC requests

There is no code to issue or handle REQ_TYPE_BLOCK_PC request in the
memstick drivers, so remove the bogus conditional.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agoblock: shrink bio size again
Christoph Hellwig [Tue, 19 Jul 2016 09:28:43 +0000 (11:28 +0200)]
block: shrink bio size again

The recent ops split grew the bio by adding the new ioprio field.
Shrink it again by using a 16-bit field for the bi_flags value and
filling the holes near the beginning of the structure.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agoblock: simplify and cleanup bvec pool handling
Christoph Hellwig [Tue, 19 Jul 2016 09:28:42 +0000 (11:28 +0200)]
block: simplify and cleanup bvec pool handling

Instead of a flag and an index just make sure an index of 0 means
no need to free the bvec array.  Also move the constants related
to the bvec pools together and use a consistent naming scheme for
them.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agoblock: get rid of bio_rw and READA
Christoph Hellwig [Tue, 19 Jul 2016 09:28:41 +0000 (11:28 +0200)]
block: get rid of bio_rw and READA

These two are confusing leftovers of the old world order, combining
values of the REQ_OP_ and REQ_ namespaces.  For callers that don't
special case we mostly just replace bi_rw with bio_data_dir or
op_is_write, except for the few cases where a switch over the REQ_OP_
values makes more sense.  Any check for READA is replaced with an
explicit check for REQ_RAHEAD.  Also remove the READA alias for
REQ_RAHEAD.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agoblock: don't ignore -EOPNOTSUPP blkdev_issue_write_same
Christoph Hellwig [Tue, 19 Jul 2016 09:23:34 +0000 (11:23 +0200)]
block: don't ignore -EOPNOTSUPP blkdev_issue_write_same

WRITE SAME is a data integrity operation and we can't simply ignore
errors.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agoblock: introduce BLKDEV_DISCARD_ZERO to fix zeroout
Christoph Hellwig [Tue, 19 Jul 2016 09:23:33 +0000 (11:23 +0200)]
block: introduce BLKDEV_DISCARD_ZERO to fix zeroout

Currently blkdev_issue_zeroout cascades down from discards (if the driver
guarantees that discards zero data), to WRITE SAME and then to a loop
writing zeroes.  Unfortunately we ignore run-time EOPNOTSUPP errors in the
block layer blkdev_issue_discard helper to work around DM volumes that
may have mixed discard support underneath.

This patch introduces a new BLKDEV_DISCARD_ZERO flag to
blkdev_issue_discard that indicates we are called for a zeroing operation.
This allows us both to ignore the EOPNOTSUPP hack and to consolidate
the discard_zeroes_data check into the function.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
7 years agoRevert "doc/sphinx: Enable keep_warnings"
Jonathan Corbet [Wed, 20 Jul 2016 22:56:21 +0000 (16:56 -0600)]
Revert "doc/sphinx: Enable keep_warnings"

This reverts commit 47d6d752b9e20dbe8a2acd22e887be81a6f39de9.

Commit f42ddca7bebc (doc-rst: kernel-doc directive, fix state machine
reporter) from Markus Heiser provides a better fix, so this configuration
change is no longer needed.

7 years agodoc-rst: kernel-doc directive, fix state machine reporter
Markus Heiser [Wed, 20 Jul 2016 10:38:58 +0000 (12:38 +0200)]
doc-rst: kernel-doc directive, fix state machine reporter

Add a reporter replacement that assigns the correct source name and line
number to a system message, as recorded in a ViewList.

[1] http://mid.gmane.org/CAKMK7uFMQ2wOp99t-8v06Om78mi9OvRZWuQsFJD55QA20BB3iw@mail.gmail.com

Signed-off-by: Markus Heiser <markus.heiser@darmarIT.de>
Tested-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
7 years agodocs: deprecate kernel-doc-nano-HOWTO.txt
Jonathan Corbet [Wed, 20 Jul 2016 22:43:41 +0000 (16:43 -0600)]
docs: deprecate kernel-doc-nano-HOWTO.txt

Now that the new Sphinx world order is taking over, the information in
kernel-doc-nano-HOWTO.txt is outmoded.  I hate to remove it altogether,
since it's one of those files that people expect to find.  But we can add a
warning and fix all the other pointers to it.

Reminded-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
7 years agonet/mlx5e: Fix del vxlan port command buffer memset
Saeed Mahameed [Wed, 20 Jul 2016 21:39:53 +0000 (00:39 +0300)]
net/mlx5e: Fix del vxlan port command buffer memset

memset the command buffers rather than the pointers to them.

Fixes: b3f63c3d5e2c ("net/mlx5e: Add netdev support for VXLAN tunneling")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
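The commit message above does not show the faulty lines, so the following is only a generic sketch of the class of mistake it names: clearing a pointer (or a pointer-sized region) instead of the buffer it refers to. `demo_cmd` and `demo_cmd_clear` are hypothetical names, not mlx5 identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_CMD_WORDS 16   /* hypothetical command size in 32-bit words */

struct demo_cmd { unsigned int words[DEMO_CMD_WORDS]; };

/* Clears the pointed-to command buffer. */
static void demo_cmd_clear(struct demo_cmd *cmd)
{
	assert(cmd != NULL);
	/*
	 * Wrong variants would be memset(&cmd, 0, sizeof(cmd)), which zeroes
	 * only the local pointer, or memset(cmd, 0, sizeof(cmd)), which zeroes
	 * only the first pointer-sized bytes of the buffer.
	 */
	memset(cmd, 0, sizeof(*cmd));   /* length is that of the object */
}
```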
7 years agodm thin: fix a race condition between discarding and provisioning a block
Joe Thornber [Fri, 1 Jul 2016 13:00:02 +0000 (14:00 +0100)]
dm thin: fix a race condition between discarding and provisioning a block

The discard passdown was being issued after the block was unmapped,
which meant the block could be reprovisioned whilst the passdown discard
was still in flight.

We can only identify unshared blocks (those it is safe to pass a discard
down to) once they're unmapped and their ref count hits zero.  Block ref
counts are now used to guard against concurrent allocation of these
blocks that are being discarded.  So now we unmap the block, issue
passdown discards, and then immediately increment ref counts for regions
that have been discarded via passdown (this is safe because
allocation occurs within the same thread).  We then decrement ref counts
once the passdown discard IO is complete -- signaling these blocks may
now be allocated.

This fixes the potential for corruption that was reported here:
https://www.redhat.com/archives/dm-devel/2016-June/msg00311.html

Reported-by: Dennis Yang <dennisyang@qnap.com>
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
7 years agodm btree: fix a bug in dm_btree_find_next_single()
Joe Thornber [Fri, 1 Jul 2016 10:09:13 +0000 (11:09 +0100)]
dm btree: fix a bug in dm_btree_find_next_single()

dm_btree_find_next_single() can short-circuit the search for a block
with a return of -ENODATA if all entries are higher than the search key
passed to lower_bound().

This hasn't been a problem because of the way the btree has been used by
DM thinp.  But it must be fixed now in preparation for fixing the race
in DM thinp's handling of simultaneous block discard vs allocation.
Otherwise, once that fix is in place, some of the blocks in a discard
would not be unmapped as expected.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
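The short-circuit described in the dm btree commit above can be illustrated on a flat sorted array instead of a btree node. This is a sketch under assumptions: `demo_lower_bound` is assumed to return the index of the largest key not above the search key, or -1 when every key is higher, and both function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Index of the largest key <= key, or -1 if every key is higher. */
static int demo_lower_bound(const uint64_t *keys, int nr, uint64_t key)
{
	int lo = -1, hi = nr;

	while (hi - lo > 1) {
		int mid = lo + (hi - lo) / 2;

		if (keys[mid] <= key)
			lo = mid;
		else
			hi = mid;
	}
	return lo;
}

/* Find the first key >= key; returns 0 and sets *rkey, or -ENODATA. */
static int demo_find_next(const uint64_t *keys, int nr, uint64_t key,
			  uint64_t *rkey)
{
	int i;

	assert(nr >= 0);
	i = demo_lower_bound(keys, nr, key);
	if (i < 0)
		i = 0;   /* all keys are higher: start at the first entry */
	else if (keys[i] < key)
		i++;     /* keys[i] is below the search key: take its successor */
	if (i >= nr)
		return -ENODATA;
	*rkey = keys[i];
	return 0;
}
```

Treating a -1 from the lower bound as "no data" would wrongly fail whenever the search key is below the first entry, which is the case the commit message describes.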
7 years agolibata-scsi: better style in ata_msense_*()
Tom Yan [Tue, 19 Jul 2016 20:39:28 +0000 (04:39 +0800)]
libata-scsi: better style in ata_msense_*()

`changeable` is the "version" of mode page requested by the user.
It will be less confusing/misleading if we do not check it
"together" with the setting bits of the drive.

Not to mention that we currently have ata_mselect_*() implemented
in a way that each of them will serve exclusively a particular bit
on each page. The old style will hence make the condition look even
more unnecessarily arcane if the ata_msense_*() is reflecting more
than one bit.

Signed-off-by: Tom Yan <tom.ty89@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
7 years agoAHCI: Clear GHC.IS to prevent unexpectly asserting INTx
Pang Raymond [Wed, 20 Jul 2016 12:13:46 +0000 (12:13 +0000)]
AHCI: Clear GHC.IS to prevent unexpectly asserting INTx

Due to PCI subsystem behaviour, unloading AHCI driver will disable
MSI and enable INTx. When HBA supports MSIx or Multiple MSI, Driver's
irq handler doesn't clear GHC.IS register. It works well when reading or
writing data and GHC.IS is always non-zero. But when unloading driver
(or any other operation which disables MSIx and enables INTx), the PCI
subsystem uses config write(Rx04.bit10) to enable INTx. Because
GHC.IS is non-zero, HBA will falsely assume some port needs interrupt
service. Then it asserts INTx. To make things worse, when AHCI controller
shares the same interrupt pin with other PCI device, that PCI device's ISR
will be called and nobody de-asserts previous INTx.
This patch clears GHC.IS in ahci_port_stop() even when using MSIx or
MMSI to prevent this case. It ensures GHC.IS is zero before PCI subsystem
enables INTx.

tj: Minor updates to the comment.

Signed-off-by: Raymond Pang <raymond_rule@hotmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
7 years agocrypto: vmx - Fix aes_p8_xts_decrypt build failure
Herbert Xu [Wed, 20 Jul 2016 14:32:50 +0000 (22:32 +0800)]
crypto: vmx - Fix aes_p8_xts_decrypt build failure

We use _GLOBAL so there is no need to do the manual alignment;
in fact, it causes a build failure.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: vmx - Ignore generated files
Paulo Flabiano Smorigo [Tue, 19 Jul 2016 13:36:26 +0000 (10:36 -0300)]
crypto: vmx - Ignore generated files

Ignore assembly files generated by the perl script.

Signed-off-by: Paulo Flabiano Smorigo <pfsmorigo@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agohwmon: (ftsteutates) Remove unused including <linux/version.h>
Wei Yongjun [Wed, 20 Jul 2016 12:06:16 +0000 (12:06 +0000)]
hwmon: (ftsteutates) Remove unused including <linux/version.h>

Remove the unused include of <linux/version.h>.

Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
7 years agohwmon: (adt7411) set bit 3 in CFG1 register
Michael Walle [Tue, 19 Jul 2016 14:43:26 +0000 (16:43 +0200)]
hwmon: (adt7411) set bit 3 in CFG1 register

According to the datasheet you should only write 1 to this bit. If it is
not set, at least AIN3 will return bad values on newer silicon revisions.

Fixes: d84ca5b345c2 ("hwmon: Add driver for ADT7411 voltage and temperature sensor")
Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
7 years agohwmon: Add driver for FTS BMC chip "Teutates"
Thilo Cestonaro [Mon, 18 Jul 2016 11:51:29 +0000 (13:51 +0200)]
hwmon: Add driver for FTS BMC chip "Teutates"

This driver implements hardware monitoring and watchdog support
for the FTS BMC Chip "Teutates".

Signed-off-by: Thilo Cestonaro <thilo@cestona.ro>
[groeck: Updated subject and description; fixed dependencies]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
7 years agox86/insn: perf tools: Fix vcvtph2ps instruction decoding
Adrian Hunter [Wed, 20 Jul 2016 08:30:34 +0000 (11:30 +0300)]
x86/insn: perf tools: Fix vcvtph2ps instruction decoding

vcvtph2ps does not have an immediate operand, so remove the erroneous
'Ib' from its opcode map entry. Add vcvtph2ps to the perf tools new
instructions test to verify it.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: X86 ML <x86@kernel.org>
Link: http://lkml.kernel.org/r/1469003437-32706-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
7 years agopacket: fix second argument of sock_tx_timestamp()
Yoshihiro Shimoda [Tue, 19 Jul 2016 05:40:51 +0000 (14:40 +0900)]
packet: fix second argument of sock_tx_timestamp()

This patch fixes an issue where a syscall (e.g. the sendto syscall) cannot
work correctly. Since the sendto syscall doesn't have a msg_control buffer,
the sock_tx_timestamp() in packet_snd() cannot work correctly because
sockc.tsflags is set to 0.
So, this patch sets sockc.tsflags to sk->sk_tsflags by default.

Fixes: c14ac9451c34 ("sock: enable timestamping using control messages")
Reported-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
Reported-by: Keita Kobayashi <keita.kobayashi.ym@renesas.com>
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoInput: synaptics-rmi4 - fix maximum size check for F12 control register 8
Andrew Duggan [Wed, 20 Jul 2016 00:53:59 +0000 (17:53 -0700)]
Input: synaptics-rmi4 - fix maximum size check for F12 control register 8

According to the RMI4 spec the maximum size of F12 control register 8 is
15 bytes. The current code incorrectly reports an error if control 8 is
greater than 14, making sensors with a 15-byte control register 8
unusable.

Signed-off-by: Andrew Duggan <aduggan@synaptics.com>
Reported-by: Chris Healy <cphealy@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agonet: switchdev: change ageing_time type to clock_t
Vivien Didelot [Mon, 18 Jul 2016 19:02:06 +0000 (15:02 -0400)]
net: switchdev: change ageing_time type to clock_t

The switchdev value for the SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME
attribute is a clock_t and requires the use of helpers such as
clock_t_to_jiffies() to convert to milliseconds.

Change ageing_time type from u32 to clock_t to make it explicit.

Fixes: f55ac58ae64c ("switchdev: add bridge ageing_time attribute")
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoUpdate maintainer for EHEA driver.
Douglas Miller [Mon, 18 Jul 2016 17:28:45 +0000 (12:28 -0500)]
Update maintainer for EHEA driver.

Since Thadeu left IBM, EHEA has gone mostly unmaintained, since his email
address doesn't work anymore.  I'm stepping up to help maintain this
driver upstream.

I'm adding Thadeu's personal e-mail address in Cc, hoping that we can
get his ack.

CC: Thadeu Lima de Souza Cascardo <cascardo@cascardo.eti.br>
Signed-off-by: Douglas Miller <dougmill@linux.vnet.ibm.com>
Acked-by: Thadeu Lima de Souza Cascardo <cascardo@cascardo.eti.br>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'mlx4-fixes'
David S. Miller [Tue, 19 Jul 2016 23:44:12 +0000 (16:44 -0700)]
Merge branch 'mlx4-fixes'

Tariq Toukan says:

====================
Safe flow for mlx4_en configuration change

This patchset improves the mlx4_en driver resiliency, especially on
systems with low memory.  Upon a configuration change that requires
the allocation of new resources, we first try to allocate, prior to
destroying the current ones.  Once it is successfully done,
we release the old resources and attach the new ones.  Otherwise, we
stay with a functioning interface having the same old configuration.

This improvement became of greater significance after removing the use
of vmap.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet/mlx4_en: Add resilience in low memory systems
Eugenia Emantayev [Mon, 18 Jul 2016 15:35:12 +0000 (18:35 +0300)]
net/mlx4_en: Add resilience in low memory systems

This patch fixes the loss of the Ethernet port on low-memory systems,
when the driver frees its resources and fails to allocate new resources.
The issue could happen while changing the number of channels, the ring
size or the timestamp configuration.
This fix is necessary because of the removal of vmap use from the code.
When vmap was in use the driver could allocate non-contiguous memory
and make it contiguous with vmap. Now it could fail to allocate
a large chunk of contiguous memory and lose the port.
With this patch, the code tries to allocate new resources and only upon
success frees the old resources.

Fixes: 73898db04301 ('net/mlx4: Avoid wrong virtual mappings')
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
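The allocate-first, release-afterwards flow described in the mlx4_en commits above can be sketched as follows. This is a userspace illustration, not driver code: `demo_port`, `demo_rings` and the `DEMO_MAX_RINGS` limit (used here to simulate an allocation failure) are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_MAX_RINGS 1024   /* hypothetical limit, used to simulate failure */

struct demo_rings { size_t nr; int *bufs; };   /* hypothetical resource set */
struct demo_port  { struct demo_rings *rings; };

static struct demo_rings *demo_rings_alloc(size_t nr)
{
	struct demo_rings *r;

	if (nr == 0 || nr > DEMO_MAX_RINGS)
		return NULL;
	r = malloc(sizeof(*r));
	if (!r)
		return NULL;
	r->bufs = calloc(nr, sizeof(*r->bufs));
	if (!r->bufs) {
		free(r);
		return NULL;
	}
	r->nr = nr;
	return r;
}

static void demo_rings_free(struct demo_rings *r)
{
	if (r) {
		free(r->bufs);
		free(r);
	}
}

/* Returns 0 on success; on failure the port keeps its old, working rings. */
static int demo_port_reconfigure(struct demo_port *port, size_t nr)
{
	struct demo_rings *fresh;

	assert(port != NULL);
	fresh = demo_rings_alloc(nr);   /* allocate before touching the old set */
	if (!fresh)
		return -1;
	demo_rings_free(port->rings);   /* release the old set only on success */
	port->rings = fresh;
	return 0;
}
```

The cost of this ordering is that old and new resources briefly coexist, which is the price paid for never ending up with no resources at all.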
7 years agonet/mlx4_en: Move filters cleanup to a proper location
Eugenia Emantayev [Mon, 18 Jul 2016 15:35:11 +0000 (18:35 +0300)]
net/mlx4_en: Move filters cleanup to a proper location

Filters cleanup should be done once before destroying net device,
since filters list is contained in the private data.

Fixes: 1eb8c695bda9 ('net/mlx4_en: Add accelerated RFS support')
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agodoc/sphinx: Enable keep_warnings
Daniel Vetter [Tue, 19 Jul 2016 11:42:54 +0000 (13:42 +0200)]
doc/sphinx: Enable keep_warnings

Unfortunately warnings generated after parsing in sphinx can end up
with entirely bogus files and line numbers as sources. Strangely for
outright errors this is not a problem. Trying to convert warnings to
errors also doesn't fix it.

The only way to get useful output out of sphinx to be able to root
cause the error seems to be enabling keep_warnings, which inserts
a System Message into the actual output. Not pretty at all, but I
don't really want to fix up core rst/sphinx code, and this gets the job
done meanwhile.

Cc: Markus Heiser <markus.heiser@darmarit.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
7 years agonfit: make DIMM DSMs optional
Dan Williams [Tue, 19 Jul 2016 19:32:39 +0000 (12:32 -0700)]
nfit: make DIMM DSMs optional

Commit 4995734e973a "acpi, nfit: fix acpi_check_dsm() vs zero functions
implemented" attempted to fix a QEMU regression by supporting its usage
of a zero-mask as a valid response to a DSM-family probe request.
However, this behavior breaks HP platforms that return a zero-mask by
default causing the probe to misidentify the DSM-family.

Instead, the QEMU regression can be fixed by simply not requiring the DSM
family to be identified.

This effectively reverts commit 4995734e973a, and removes the DSM
requirement from the init path.

Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Cc: Linda Knippers <linda.knippers@hpe.com>
Fixes: 4995734e973a ("acpi, nfit: fix acpi_check_dsm() vs zero functions implemented")
Reported-by: Jerry Hoemann <jerry.hoemann@hpe.com>
Tested-by: Jerry Hoemann <jerry.hoemann@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
7 years agoata: sata_dwc_460ex: remove redundant dev_err call
Wei Yongjun [Tue, 19 Jul 2016 11:27:53 +0000 (11:27 +0000)]
ata: sata_dwc_460ex: remove redundant dev_err call

There is an error message within devm_ioremap_resource
already, so remove the dev_err call to avoid redundant
error message.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Tejun Heo <tj@kernel.org>
7 years agocgroup: remove duplicated include from cgroup.c
Wei Yongjun [Tue, 19 Jul 2016 12:02:39 +0000 (12:02 +0000)]
cgroup: remove duplicated include from cgroup.c

Remove duplicated include.

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Tejun Heo <tj@kernel.org>
7 years agodm raid: fix random optimal_io_size for raid0
Heinz Mauelshagen [Tue, 19 Jul 2016 11:16:24 +0000 (13:16 +0200)]
dm raid: fix random optimal_io_size for raid0

raid_io_hints() was retrieving the number of data stripes used for the
calculation of io_opt from struct r5conf, which is not defined for raid0
mappings.

Base the calculation on the in-core raid_set structure instead.

Also, adjust to use to_bytes() for the sector -> bytes conversion
throughout.

Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
7 years agodm raid: address checkpatch.pl complaints
Heinz Mauelshagen [Tue, 19 Jul 2016 12:03:51 +0000 (14:03 +0200)]
dm raid: address checkpatch.pl complaints

Use 'unsigned int' where appropriate.
Return negative errors.
Correct an indentation.

Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
7 years agotick/nohz: Optimize nohz idle enter
Gaurav Jindal [Thu, 14 Jul 2016 12:04:20 +0000 (12:04 +0000)]
tick/nohz: Optimize nohz idle enter

tick_nohz_start_idle is called before checking whether the idle tick can be
stopped. If the tick cannot be stopped, calling tick_nohz_start_idle() is
pointless and just wasting CPU cycles.

Only invoke tick_nohz_start_idle() when can_stop_idle_tick() returns true. A
short one minute observation of the effect on ARM64 shows a reduction of calls
by 1.5% thus optimizing the idle entry sequence.

[tglx: Massaged changelog ]

Co-developed-by: Sanjeev Yadav <sanjeev.yadav@spreadtrum.com>
Signed-off-by: Gaurav Jindal <gaurav.jindal@spreadtrum.com>
Link: http://lkml.kernel.org/r/20160714120416.GB21099@gaurav.jindal@spreadtrum.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
7 years agogenirq: Fix missing irq allocation affinity hint
Vincent Stehle [Mon, 18 Jul 2016 20:56:26 +0000 (22:56 +0200)]
genirq: Fix missing irq allocation affinity hint

The new affinity hint argument of __irq_domain_alloc_irqs() is missing in
irq_reserve_ipi(). Add it.

This fixes the following compilation error:

  kernel/irq/ipi.c: In function ‘irq_reserve_ipi’:
  kernel/irq/ipi.c:85:9: error: too few arguments to function ‘__irq_domain_alloc_irqs’
    virq = __irq_domain_alloc_irqs(domain, virq, nr_irqs, NUMA_NO_NODE,
           ^
Fixes: 06ee6d571f0e ("genirq: Add affinity hint to irq allocation")
Signed-off-by: Vincent Stehlé <vincent.stehle@laposte.net>
Cc: linux-pci@vger.kernel.org
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
7 years agoclockevents: Make clockevents_subsys static
Ben Dooks [Fri, 17 Jun 2016 15:56:14 +0000 (16:56 +0100)]
clockevents: Make clockevents_subsys static

The clockevents_subsys struct is used for sysfs support and
is not declared or used outside the file it is defined in.
Fix the following warning by making it static:

kernel/time/clockevents.c:648:17: warning: symbol 'clockevents_subsys' was not declared. Should it be static?

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Cc: linux-kernel@lists.codethink.co.uk
Link: http://lkml.kernel.org/r/1466178974-7105-1-git-send-email-ben.dooks@codethink.co.uk
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
7 years agoMerge tag 'topic/kbl-4.7-fixes-2016-07-18' of git://anongit.freedesktop.org/drm-intel...
Dave Airlie [Tue, 19 Jul 2016 08:00:15 +0000 (18:00 +1000)]
Merge tag 'topic/kbl-4.7-fixes-2016-07-18' of git://anongit.freedesktop.org/drm-intel into drm-fixes

As promised here's the pile of kbl cherry-picks assembled by Mika&Rodrigo.
It's a bit much, but all well-contained to kbl code and been tested for a
while in drm-intel-next. Still separate in case too much, but in that case
I think we'd need to disable kbl by default again (which would be annoying
too) in 4.7.

* tag 'topic/kbl-4.7-fixes-2016-07-18' of git://anongit.freedesktop.org/drm-intel: (28 commits)
  drm/i915/kbl: Introduce the first official DMC for Kabylake.
  drm/i915: Introduce Kabypoint PCH for Kabylake H/DT.
  drm/i915/gen9: implement WaConextSwitchWithConcurrentTLBInvalidate
  drm/i915/gen9: Add WaFbcHighMemBwCorruptionAvoidance
  drm/i195/fbc: Add WaFbcNukeOnHostModify
  drm/i915/gen9: Add WaFbcWakeMemOn
  drm/i915/gen9: Add WaFbcTurnOffFbcWatermark
  drm/i915/kbl: Add WaClearSlmSpaceAtContextSwitch
  drm/i915/gen9: Add WaEnableChickenDCPR
  drm/i915/kbl: Add WaDisableSbeCacheDispatchPortSharing
  drm/i915/kbl: Add WaDisableGafsUnitClkGating
  drm/i915/kbl: Add WaForGAMHang
  drm/i915: Add WaInsertDummyPushConstP for bxt and kbl
  drm/i915/kbl: Add WaDisableDynamicCreditSharing
  drm/i915/kbl: Add WaDisableGamClockGating
  drm/i915/gen9: Enable must set chicken bits in config0 reg
  drm/i915/kbl: Add WaDisableLSQCROPERFforOCL
  drm/i915/kbl: Add WaDisableSDEUnitClockGating
  drm/i915/kbl: Add WaDisableFenceDestinationToSLM for A0
  drm/i915/kbl: Add WaEnableGapsTsvCreditFix
  ...

7 years agocrypto: vmx - Adding support for XTS
Leonidas S. Barbosa [Mon, 18 Jul 2016 15:26:26 +0000 (12:26 -0300)]
crypto: vmx - Adding support for XTS

This patch adds XTS support using VMX-crypto driver.

Signed-off-by: Leonidas S. Barbosa <leosilva@linux.vnet.ibm.com>
Signed-off-by: Paulo Flabiano Smorigo <pfsmorigo@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: vmx - Adding asm subroutines for XTS
Paulo Flabiano Smorigo [Mon, 18 Jul 2016 15:26:25 +0000 (12:26 -0300)]
crypto: vmx - Adding asm subroutines for XTS

This patch adds XTS subroutines using VMX-crypto driver.

It gives a boost of 20 times using XTS.

This code has been adopted from OpenSSL project in collaboration
with the original author (Andy Polyakov <appro@openssl.org>).

Signed-off-by: Leonidas S. Barbosa <leosilva@linux.vnet.ibm.com>
Signed-off-by: Paulo Flabiano Smorigo <pfsmorigo@linux.vnet.ibm.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: skcipher - Add comment for skcipher_alg->base
Herbert Xu [Mon, 18 Jul 2016 16:59:30 +0000 (00:59 +0800)]
crypto: skcipher - Add comment for skcipher_alg->base

This patch adds a missing comment for the base parameter in struct
skcipher_alg.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: testmgr - Print akcipher algorithm name
Herbert Xu [Mon, 18 Jul 2016 10:20:10 +0000 (18:20 +0800)]
crypto: testmgr - Print akcipher algorithm name

When an akcipher test fails, we don't know which algorithm failed
because the name is not printed.  This patch fixes this.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: marvell - Fix wrong flag used for GFP in mv_cesa_dma_add_iv_op
Romain Perier [Mon, 18 Jul 2016 09:32:24 +0000 (11:32 +0200)]
crypto: marvell - Fix wrong flag used for GFP in mv_cesa_dma_add_iv_op

Use the parameter 'gfp_flags' instead of 'flag' as second argument of
dma_pool_alloc(). The parameter 'flag' is for the TDMA descriptor; its
content makes no sense to the allocator.

Fixes: bac8e805a30d ("crypto: marvell - Copy IV vectors by DMA...")
Signed-off-by: Romain Perier <romain.perier@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agom68k/defconfig: Update defconfigs for v4.7-rc2
Geert Uytterhoeven [Mon, 6 Jun 2016 07:43:00 +0000 (09:43 +0200)]
m68k/defconfig: Update defconfigs for v4.7-rc2

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
7 years agoMerge tag 'perf-core-for-mingo-20160718' of git://git.kernel.org/pub/scm/linux/kernel...
Ingo Molnar [Tue, 19 Jul 2016 06:44:38 +0000 (08:44 +0200)]
Merge tag 'perf-core-for-mingo-20160718' of git://git./linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

 - Properly report when a function wildcard produces no matches in 'perf probe'
   (Masami Hiramatsu)

 - Balance opening and reading events in 'perf stat', which could cause
   it to get stuck trying to close invalid file descriptors (Mark Rutland)

Infrastructure changes:

 - Copy more headers from the kernel, this time for headers that
   were just including the contents of its kernel counterparts, should
   help resolving the problems with linux-next, where some uapi related
   patches seem to be breaking tools/object/ build. (Arnaldo Carvalho de Melo)

   Some more combing will be done, but at least it is possible to build
   perf out of tree, via a detached tarball (make help | grep perf),
   without including kernel files in its MANIFEST (Arnaldo Carvalho de Melo)

 - Fix smatch found errors that were not causing problems, but are
   mistakes nonetheless (Dan Carpenter)

 - Fix string vs. byte array resolving in the python script code (Jiri Olsa)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
7 years agoMerge tag 'drm-intel-fixes-2016-07-18' of git://anongit.freedesktop.org/drm-intel...
Dave Airlie [Tue, 19 Jul 2016 06:09:20 +0000 (16:09 +1000)]
Merge tag 'drm-intel-fixes-2016-07-18' of git://anongit.freedesktop.org/drm-intel into drm-fixes

Two more regression fixes for 4.7.

* tag 'drm-intel-fixes-2016-07-18' of git://anongit.freedesktop.org/drm-intel:
  drm/i915: add missing condition for committing planes on crtc
  drm/i915: Treat eDP as always connected, again

7 years agosctp: load transport header after sk_filter
Willem de Bruijn [Sat, 16 Jul 2016 21:33:15 +0000 (17:33 -0400)]
sctp: load transport header after sk_filter

Do not cache pointers into the skb linear segment across sk_filter.
The function call can trigger pskb_expand_head.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
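The rule in the sctp commit above (do not keep a pointer into a buffer across a call that may reallocate it) can be sketched in userspace C. `demo_filter` is a hypothetical stand-in that always moves the data to a new allocation; it is not `sk_filter`, and the real call only reallocates in some cases.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct demo_buf { unsigned char *data; size_t len; };

/* Hypothetical stand-in for a call that may move the buffer contents. */
static int demo_filter(struct demo_buf *b)
{
	size_t new_len = b->len * 2;
	unsigned char *p = malloc(new_len);

	if (!p)
		return -1;
	memcpy(p, b->data, b->len);
	memset(p + b->len, 0, new_len - b->len);
	free(b->data);   /* any pointer computed from the old data is now stale */
	b->data = p;
	b->len = new_len;
	return 0;
}

/* Returns the header byte at hdr_off, or -1 if the filter step fails. */
static int demo_read_header(struct demo_buf *b, size_t hdr_off)
{
	const unsigned char *hdr;

	assert(hdr_off < b->len);
	if (demo_filter(b))
		return -1;
	hdr = b->data + hdr_off;   /* compute the pointer only after the call */
	return *hdr;
}
```

Carrying the offset across the call and rebuilding the pointer afterwards is safe because offsets stay valid when the data moves, while addresses do not.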
7 years agonet/sched/sch_htb: clamp xstats tokens to fit into 32-bit int
Konstantin Khlebnikov [Sat, 16 Jul 2016 14:08:56 +0000 (17:08 +0300)]
net/sched/sch_htb: clamp xstats tokens to fit into 32-bit int

In kernel HTB keeps tokens in signed 64-bit in nanoseconds. In netlink
protocol these values are converted into psched ticks (64ns for now) and
truncated to 32-bit. In struct tc_htb_xstats fields "tokens" and "ctokens"
are declared as unsigned 32-bit but they could be negative thus tool 'tc'
prints them as signed. Big values lose higher bits and/or become negative.

This patch clamps tokens in xstat into range from INT_MIN to INT_MAX.
In this way it's easier to understand what's going on here.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
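The clamping described in the sch_htb commit above can be sketched like this. The 64 ns tick comes from the commit message; the function name is hypothetical, and plain division is used for the conversion, which is an assumption and may round negative values differently from the kernel's own helper.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define DEMO_NS_PER_TICK 64   /* "64ns for now", per the commit message */

/* Convert signed 64-bit nanoseconds to ticks, clamped to the range of int. */
static int demo_tokens_to_xstats(int64_t tokens_ns)
{
	int64_t ticks = tokens_ns / DEMO_NS_PER_TICK;

	if (ticks > INT_MAX)
		return INT_MAX;
	if (ticks < INT_MIN)
		return INT_MIN;
	return (int)ticks;
}
```

Without the clamp, a plain cast to 32 bits would drop the high bits, so a large positive backlog could show up as a small or negative number.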
7 years agocrypto: nx - off by one bug in nx_of_update_msc()
Dan Carpenter [Fri, 15 Jul 2016 11:09:13 +0000 (14:09 +0300)]
crypto: nx - off by one bug in nx_of_update_msc()

The props->ap[] array is defined like this:

struct alg_props ap[NX_MAX_FC][NX_MAX_MODE][3];

So we can see that if msc->fc and msc->mode are == to NX_MAX_FC or
NX_MAX_MODE then we're off by one.

Fixes: ae0222b7289d ('powerpc/crypto: nx driver code supporting nx encryption')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agocrypto: rsa-pkcs1pad - fix rsa-pkcs1pad request struct
Tadeusz Struk [Fri, 15 Jul 2016 03:39:18 +0000 (20:39 -0700)]
crypto: rsa-pkcs1pad - fix rsa-pkcs1pad request struct

To allow for child request context the struct akcipher_request child_req
needs to be at the end of the structure.

Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
7 years agoata: define ATA_PROT_* in terms of ATA_PROT_FLAG_*
Christoph Hellwig [Sat, 16 Jul 2016 13:16:43 +0000 (22:16 +0900)]
ata: define ATA_PROT_* in terms of ATA_PROT_FLAG_*

This avoids the need to always translate between the two in ata_prot_flags
and generally cleans up the taskfile protocol usage.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
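The idea of defining protocol values directly from flag bits, so that no translation table is needed, can be sketched as below. The `DEMO_*` names and the particular bit assignments are assumptions made for illustration and are not claimed to match the values in the libata headers.

```c
#include <assert.h>
#include <stdbool.h>

enum {
	/* Hypothetical flag bits. */
	DEMO_FLAG_PIO   = 1 << 0,
	DEMO_FLAG_DMA   = 1 << 1,
	DEMO_FLAG_NCQ   = 1 << 2,
	DEMO_FLAG_ATAPI = 1 << 3,

	/* Each protocol value is just the OR of the flags that describe it. */
	DEMO_PROT_NODATA    = 0,
	DEMO_PROT_PIO       = DEMO_FLAG_PIO,
	DEMO_PROT_DMA       = DEMO_FLAG_DMA,
	DEMO_PROT_NCQ       = DEMO_FLAG_DMA | DEMO_FLAG_NCQ,
	DEMO_PROT_ATAPI_PIO = DEMO_FLAG_ATAPI | DEMO_FLAG_PIO,
	DEMO_PROT_ATAPI_DMA = DEMO_FLAG_ATAPI | DEMO_FLAG_DMA,
};

/* The predicates test bits of the protocol value itself. */
static bool demo_is_dma(unsigned int prot)
{
	return (prot & DEMO_FLAG_DMA) != 0;
}

static bool demo_is_atapi(unsigned int prot)
{
	return (prot & DEMO_FLAG_ATAPI) != 0;
}

/* A command carries data if it uses either PIO or DMA. */
static bool demo_is_data(unsigned int prot)
{
	return (prot & (DEMO_FLAG_PIO | DEMO_FLAG_DMA)) != 0;
}
```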
7 years agolibata: remove ATA_PROT_FLAG_DATA
Christoph Hellwig [Sat, 16 Jul 2016 13:16:42 +0000 (22:16 +0900)]
libata: remove ATA_PROT_FLAG_DATA

Instead we can simply check for PIO or DMA in ata_is_data.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
7 years agolibata: remove ata_is_nodata
Christoph Hellwig [Sat, 16 Jul 2016 13:16:41 +0000 (22:16 +0900)]
libata: remove ata_is_nodata

The only caller can just check for !ata_is_data instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Tejun Heo <tj@kernel.org>
7 years agoclk: at91: fix clk_programmable_set_parent()
Boris Brezillon [Mon, 18 Jul 2016 07:49:12 +0000 (09:49 +0200)]
clk: at91: fix clk_programmable_set_parent()

Since commit 1bdf02326b71e ("clk: at91: make use of syscon/regmap
internally"), clk_programmable_set_parent() is always selecting the
first parent (AKA slow_clk), no matter what's passed in the 'index'
parameter.

Fix that by initializing the pckr variable to the index value.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reported-by: Hans Verkuil <hans.verkuil@cisco.com>
Fixes: 1bdf02326b71e ("clk: at91: make use of syscon/regmap internally")
Cc: <stable@vger.kernel.org>
Signed-off-by: Michael Turquette <mturquette@baylibre.com>
Link: lkml.kernel.org/r/1468828152-18389-1-git-send-email-boris.brezillon@free-electrons.com

7 years agoperf tests: Add is_printable_array test
Jiri Olsa [Sat, 16 Jul 2016 16:11:20 +0000 (18:11 +0200)]
perf tests: Add is_printable_array test

Add automated test for is_printable_array function.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1468685480-18951-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
7 years agoperf tools: Make is_printable_array global
Jiri Olsa [Sat, 16 Jul 2016 16:11:19 +0000 (18:11 +0200)]
perf tools: Make is_printable_array global

It's used from 2 objects in perf, so it's better to keep just one copy.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Pirko <jiri@mellanox.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1468685480-18951-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
7 years agoperf script python: Fix string vs byte array resolving
Jiri Olsa [Sat, 16 Jul 2016 16:11:18 +0000 (18:11 +0200)]
perf script python: Fix string vs byte array resolving

Jirka reported that python code returns all arrays as strings.  This
makes it impossible to get all items of a byte array tracepoint field
that contains a 0x00 value item.

Fix this by scanning the full length of the array and returning it as a
PyByteArray object if a non-printable byte is found.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reported-and-Tested-by: Jiri Pirko <jiri@mellanox.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/1468685480-18951-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
7 years agoperf probe: Warn unmatched function filter correctly
Masami Hiramatsu [Mon, 18 Jul 2016 16:12:41 +0000 (01:12 +0900)]
perf probe: Warn unmatched function filter correctly

Warn unmatched function filter correctly instead of warning
"symbol-loading error", since that can be a filter issue.

From the technical point of view, this adds a filter check in map__load
and, if there is a filter, it returns -2 (filter-out) instead of -1
(error); perf-probe checks for it and changes the message.

E.g. without this fix:

  # perf probe -F rt_sp*
  no symbols found in [kernel.kallsyms], maybe install a debug package?
  Failed to load symbols in kernel

With this fix:

  # perf probe -F rt_sp*
  no symbols passed the given filter.
  Failed to find symbols matched to "rt_sp*"
    Error: Failed to show functions.
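
The split between the two return codes can be sketched as below. This is
an illustration under stated assumptions, not the code of map__load: the
names load_symbols_sketch, load_failure_message, LOAD_ERROR and
LOAD_FILTERED_OUT are hypothetical, and the filter is reduced to a plain
prefix match.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical return codes mirroring the -1 / -2 split described above. */
#define LOAD_ERROR        (-1)
#define LOAD_FILTERED_OUT (-2)

/*
 * Sketch only: count the symbols that pass an optional prefix filter.
 * Returns LOAD_ERROR when there is nothing to load, LOAD_FILTERED_OUT
 * when symbols exist but the filter rejected all of them.
 */
static int load_symbols_sketch(const char *const *syms, size_t nr_syms,
			       const char *prefix)
{
	size_t i;
	int kept = 0;

	if (syms == NULL || nr_syms == 0)
		return LOAD_ERROR;

	for (i = 0; i < nr_syms; i++) {
		if (prefix == NULL ||
		    strncmp(syms[i], prefix, strlen(prefix)) == 0)
			kept++;
	}

	/* Symbols were present, so an empty result is a filter issue. */
	return kept ? kept : LOAD_FILTERED_OUT;
}

/* Pick the message from the failure kind instead of always blaming loading. */
static const char *load_failure_message(int ret)
{
	switch (ret) {
	case LOAD_FILTERED_OUT:
		return "no symbols passed the given filter.";
	case LOAD_ERROR:
		return "no symbols found, maybe install a debug package?";
	default:
		return NULL;
	}
}
```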

Reported-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/146885835596.16106.2293540792775552481.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
7 years agoperf cpu_map: Add more helpers
Mark Rutland [Fri, 15 Jul 2016 10:08:11 +0000 (11:08 +0100)]
perf cpu_map: Add more helpers

In some cases it's necessary to figure out the map-local index of a given
Linux logical CPU ID. Add a new helper, cpu_map__idx, to acquire this.
As the logic is largely the same as the existing cpu_map__has, this is
rewritten in terms of the new helper.

At the same time, add the inverse operation, cpu_map__cpu, which yields
the logical CPU id for a map-local index. While this can be performed
manually, wrapping this in a helper can make code more legible.
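
A minimal sketch of how the three helpers relate, assuming a simplified
map type. The struct cpu_map_sketch, the _sketch function names and the
bounds check in the inverse lookup are assumptions for illustration and
are not taken from the perf sources.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for perf's cpu map. */
struct cpu_map_sketch {
	int nr;          /* number of entries in map[] */
	const int *map;  /* map[idx] is a Linux logical CPU ID */
};

/* Map-local index of a logical CPU ID, or -1 if the CPU is not in the map. */
static int cpu_map_sketch__idx(const struct cpu_map_sketch *cpus, int cpu)
{
	int i;

	for (i = 0; i < cpus->nr; i++) {
		if (cpus->map[i] == cpu)
			return i;
	}
	return -1;
}

/* Membership test expressed in terms of the index helper. */
static bool cpu_map_sketch__has(const struct cpu_map_sketch *cpus, int cpu)
{
	return cpu_map_sketch__idx(cpus, cpu) != -1;
}

/* Inverse operation: logical CPU ID for a map-local index (-1 if out of range). */
static int cpu_map_sketch__cpu(const struct cpu_map_sketch *cpus, int idx)
{
	if (idx < 0 || idx >= cpus->nr)
		return -1;
	return cpus->map[idx];
}
```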

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1468577293-19667-3-git-send-email-mark.rutland@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
7 years agoperf stat: Balance opening and reading events
Mark Rutland [Fri, 15 Jul 2016 10:08:10 +0000 (11:08 +0100)]
perf stat: Balance opening and reading events

In create_perf_stat_counter, when a target CPU has not been provided, we
call __perf_evsel__open with empty_cpu_map, and open a single FD per
thread. However, in read_counter we assume that we opened events for the
product of threads and CPUs described in the evsel's cpu_map.

Thus, if an evsel has a cpu_map with more than one entry, we will
attempt to access FDs that we didn't open. This could result in a number
of problems (e.g. blocking while reading from STDIN if the fd memory
happened to be initialised to zero).

This is problematic for systems where a logical CPU PMU covers some
arbitrary subset of CPUs. The cpu_map of any evsel for that PMU will be
initialised based on the cpumask exposed through sysfs, even if the user
requests per-thread events.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1468577293-19667-2-git-send-email-mark.rutland@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
7 years agolibata: LITE-ON CX1-JB256-HP needs lower max_sectors
Tejun Heo [Mon, 18 Jul 2016 22:40:00 +0000 (18:40 -0400)]
libata: LITE-ON CX1-JB256-HP needs lower max_sectors

Since 34b48db66e08 ("block: remove artifical max_hw_sectors cap"),
max_sectors is no longer limited to BLK_DEF_MAX_SECTORS and LITE-ON
CX1-JB256-HP keeps timing out with higher max_sectors.  Revert it to
the previous value.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: dgerasimov@gmail.com
Link: https://bugzilla.kernel.org/show_bug.cgi?id=121671
Cc: stable@vger.kernel.org # v3.19+
Fixes: 34b48db66e08 ("block: remove artifical max_hw_sectors cap")
Signed-off-by: Tejun Heo <tj@kernel.org>
7 years agoata: make lba_{28,48}_ok() use ATA_MAX_SECTORS{,_LBA48}
Tom Yan [Thu, 14 Jul 2016 21:09:02 +0000 (05:09 +0800)]
ata: make lba_{28,48}_ok() use ATA_MAX_SECTORS{,_LBA48}

We set ATA_MAX_SECTORS_LBA48 to 65535 to avoid a corner case in some
drives, where commands with "count" set to 0000h (which represents
65536) do not work as expected. lba_48_ok(), which checks the number
of blocks when libata packs commands, should therefore use the same
limit. In fact, there is no reason for the two functions not to use
the macros anyway.
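
A sketch of the two checks written against the macros. The value 65535
comes from the text above; the value 256 for ATA_MAX_SECTORS, the
end-of-range conditions and the _sketch function names are assumptions
for illustration, not a copy of the libata code.

```c
#include <stdbool.h>
#include <stdint.h>

#define ATA_MAX_SECTORS        256    /* assumed value for 28-bit commands */
#define ATA_MAX_SECTORS_LBA48  65535  /* avoids "count" == 0000h (65536) */

/* Sketch: request must end inside 28-bit addressing and respect the cap. */
static bool lba_28_ok_sketch(uint64_t block, uint32_t n_block)
{
	return (block + n_block) < ((UINT64_C(1) << 28) - 1) &&
	       n_block <= ATA_MAX_SECTORS;
}

/* Sketch: request must end inside 48-bit addressing and respect the cap. */
static bool lba_48_ok_sketch(uint64_t block, uint32_t n_block)
{
	return (block + n_block - 1) < (UINT64_C(1) << 48) &&
	       n_block <= ATA_MAX_SECTORS_LBA48;
}
```

With the shared limit, a 65536-block request is rejected up front rather
than being encoded as a count of 0000h.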

Signed-off-by: Tom Yan <tom.ty89@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
7 years agotools: Copy linux/{hash,poison}.h and check for drift
Arnaldo Carvalho de Melo [Mon, 18 Jul 2016 21:39:36 +0000 (18:39 -0300)]
tools: Copy linux/{hash,poison}.h and check for drift

We were also using these directly from the kernel sources. They are the
last two such cases, so fix them.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-7o14xvacqcjc5llc7gvjjyl8@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
7 years agoperf tools: Remove include/linux/list.h from perf's MANIFEST
Arnaldo Carvalho de Melo [Mon, 18 Jul 2016 21:35:11 +0000 (18:35 -0300)]
perf tools: Remove include/linux/list.h from perf's MANIFEST

It hasn't been used since we made tools/ self-sufficient wrt list.h.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: d1b39d41ebec ("tools: Make list.h self-sufficient")
Link: http://lkml.kernel.org/n/tip-w20ueqlf22kh7ctjqo0zjpig@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
7 years agotools: Copy the bitops files accessed from the kernel and check for drift
Arnaldo Carvalho de Melo [Mon, 18 Jul 2016 21:13:22 +0000 (18:13 -0300)]
tools: Copy the bitops files accessed from the kernel and check for drift

Copy some more kernel files accessed from tools/ and check for drift.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-omz8xdyvvxgjiuqzwj6ecm6j@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
7 years agoBtrfs: fix comparison in __btrfs_map_block()
Vincent Stehlé [Fri, 15 Jul 2016 15:03:21 +0000 (17:03 +0200)]
Btrfs: fix comparison in __btrfs_map_block()

Add missing comparison to op in expression, which was forgotten when doing
the REQ_OP transition.
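
The class of bug can be illustrated as follows. This is a generic sketch:
the enum req_op_sketch values and both function names are hypothetical,
and the expression is not the one from __btrfs_map_block.

```c
#include <stdbool.h>

/* Hypothetical operation codes; only the non-zero value matters here. */
enum req_op_sketch {
	OP_READ_SKETCH = 0,
	OP_WRITE_SKETCH = 1,
	OP_DISCARD_SKETCH = 3,
};

/*
 * Buggy form: the second operand lacks "op ==", so the non-zero constant
 * is evaluated on its own and the whole expression is always true.
 */
static bool is_write_or_discard_buggy(enum req_op_sketch op)
{
	return op == OP_WRITE_SKETCH || OP_DISCARD_SKETCH;
}

/* Fixed form: every operand is compared against op. */
static bool is_write_or_discard_fixed(enum req_op_sketch op)
{
	return op == OP_WRITE_SKETCH || op == OP_DISCARD_SKETCH;
}
```

A read request shows the difference: the buggy form accepts it, the fixed
form does not.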

Fixes: b3d3fa519905 ("btrfs: update __btrfs_map_block for REQ_OP transition")
Signed-off-by: Vincent Stehlé <vincent.stehle@intel.com>
Reviewed-by: Mike Christie <mchristi@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>