Commit Graph

72353 Commits

Author SHA1 Message Date
Michael Neuling
bc6dc752f3 powerpc/pseries: Fix software invalidate TCE
The following added support for powernv but broke pseries/BML:
 1f1616e powerpc/powernv: Add TCE SW invalidation support

TCE_PCI_SW_INVAL was split into FREE and CREATE flags but the tests in
the pseries code were not updated to reflect this.

Signed-off-by: Michael Neuling <mikey@neuling.org>
cc: stable@kernel.org [v3.3+]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-06-29 14:35:37 +10:00
Anton Blanchard
0b17ba7258 powerpc: check_and_cede_processor() never cedes
Commit f948501b36 ("Make hard_irq_disable() actually hard-disable
interrupts") caused check_and_cede_processor to stop working.
->irq_happened will never be zero right after a hard_irq_disable
so the compiler removes the call to cede_processor completely.

The bug was introduced back in the lazy interrupt handling rework
of 3.4 but was hidden until recently because hard_irq_disable did
nothing.

This issue will eventually appear in 3.4 stable since the
hard_irq_disable fix is marked stable, so mark this one for stable
too.

Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: stable@vger.kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-06-29 14:35:37 +10:00
Steven Rostedt
2d773aa481 powerpc/ftrace: Do not trace restore_interrupts()
As I was adding code that affects all archs, I started testing function
tracer against PPC64 and found that it currently locks up with 3.4
kernel. I figured it was due to tracing a function that shouldn't be, so
I went through the following process to bisect to find the culprit:

 cat /debug/tracing/available_filter_functions > t
 num=$(( $(wc -l < t) / 2 ))
 sed -ne "1,${num}p" t > t1
 let num=num+1
 sed -ne "${num},\$p" t > t2
 cat t1 > /debug/tracing/set_ftrace_filter
 echo function > /debug/tracing/current_tracer
 <failed? bisect t1, if not bisect t2>

It finally came down to this function: restore_interrupts()
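
The halving procedure described above can be sketched in Python as a generic bisection over a list of candidate names. This is only an illustration: the `locks_up` predicate is hypothetical and stands in for writing the candidates to set_ftrace_filter, enabling the function tracer and checking whether the machine hangs. The function names in the demo list are placeholders, and the sketch assumes exactly one culprit.

```python
from typing import Callable, List, Sequence


def bisect_culprit(candidates: Sequence[str],
                   locks_up: Callable[[List[str]], bool]) -> str:
    """Return the one entry that makes locks_up() true (assumes exactly one)."""
    remaining = list(candidates)
    while len(remaining) > 1:
        half = len(remaining) // 2
        first, second = remaining[:half], remaining[half:]
        # Trace only the first half; if the problem reproduces, the
        # culprit is in that half, otherwise it is in the second half.
        remaining = first if locks_up(first) else second
    return remaining[0]


# Placeholder function list; on a real system this would come from
# available_filter_functions.
functions = ["schedule", "do_IRQ", "restore_interrupts",
             "timer_interrupt", "sys_read"]

# Hypothetical predicate: pretend the system hangs whenever
# restore_interrupts is among the traced functions.
culprit = bisect_culprit(functions, lambda subset: "restore_interrupts" in subset)
print(culprit)
```

Each round halves the candidate list, so a list of N functions needs about log2(N) test boots.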

I'm not sure why this locks up the system. It just seems to prevent
scheduling from occurring. Interrupts seem to still work, as I can ping
the box. But all user processes freeze.

When restore_interrupts() is not traced, function tracing works fine.

Cc: stable@kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-06-29 14:35:36 +10:00
Li Zhong
2cb387ae75 powerpc: Fix Section mismatch warnings in prom_init.c
This patch tries to fix a couple of section mismatch warnings like
the following one:

WARNING: arch/powerpc/kernel/built-in.o(.text+0x2923c): Section mismatch
in reference from the function .prom_query_opal() to the
function .init.text:.call_prom()
The function .prom_query_opal() references
the function __init .call_prom().
This is often because .prom_query_opal lacks a __init
annotation or the annotation of .call_prom is wrong.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-06-29 14:35:36 +10:00
Tiejun Chen
c58ce2b1e3 ppc64: fix missing to check all bits of _TIF_USER_WORK_MASK in preempt
In entry_64.S version of ret_from_except_lite, you'll notice that
in the !preempt case, after we've checked MSR_PR we test for any
TIF flag in _TIF_USER_WORK_MASK to decide whether to go to do_work
or not. However, in the preempt case, we do a convoluted trick to
test SIGPENDING only if PR was set and always test NEED_RESCHED ...
but we forget to test any other bit of _TIF_USER_WORK_MASK !!! So
that means that with preempt, we completely fail to test for things
like single step, syscall tracing, etc...

This should be fixed by taking the following path:

 - Test PR. If not set, go to resume_kernel, else continue.

 - On the resume_kernel path, do the original do_work.

 - Otherwise, always test _TIF_USER_WORK_MASK to decide whether to do
the original user_work, else restore directly.
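
The decision order listed above can be modelled in a few lines of Python. This is only a sketch of the control flow, not the entry_64.S assembly. The flag bit values are hypothetical placeholders rather than the kernel's definitions, and `exception_exit_path` is a name invented for this illustration.

```python
# Hypothetical bit values, chosen only for this illustration.
TIF_SIGPENDING = 1 << 0
TIF_NEED_RESCHED = 1 << 1
TIF_SINGLESTEP = 1 << 2
TIF_USER_WORK_MASK = TIF_SIGPENDING | TIF_NEED_RESCHED | TIF_SINGLESTEP


def exception_exit_path(msr_pr: bool, tif_flags: int) -> str:
    """Pick the exit path in the order the commit message describes."""
    if not msr_pr:
        # PR not set: we are returning to kernel mode.
        return "resume_kernel"
    if tif_flags & TIF_USER_WORK_MASK:
        # Returning to user mode with any work bit pending,
        # not just SIGPENDING or NEED_RESCHED.
        return "user_work"
    return "restore"


print(exception_exit_path(True, TIF_SINGLESTEP))
```

The point of the model is that every bit in the mask is tested on the user-mode path, so a pending single step is no longer missed.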

Signed-off-by: Tiejun Chen <tiejun.chen@windriver.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-06-29 14:35:35 +10:00
Michael Neuling
82b2521d25 powerpc: Fix uninitialised error in numa.c
chroma_defconfig currently gives me this with gcc 4.6:
  arch/powerpc/mm/numa.c:638:13: error: 'dm' may be used uninitialized in this function [-Werror=uninitialized]

It's a bogus warning/error since of_get_drconf_memory() only writes it
anyway.

Signed-off-by: Michael Neuling <mikey@neuling.org>
cc: <stable@kernel.org> [v3.3+]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-06-29 14:35:35 +10:00
Michael Ellerman
7784655acc powerpc: Fix BPF_JIT code to link with multiple TOCs
If the kernel is big enough (eg. allyesconfig), the linker may need to
switch TOCs when calling from the BPF JIT code out to the external
helpers (skb_copy_bits() & bpf_internal_load_pointer_neg_helper()).

In order to do that we need to leave space after the bl for the linker
to insert a reload of our TOC pointer.

Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Acked-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2012-06-29 14:35:34 +10:00
Paul Bolle
f1cefa855f ARM: mxs/tx28: fix odd include
arch/arm/mach-mxs/module-tx28.c includes "../devices-mx28.h". That's a
bit odd, because that header can be found in the same directory. This
only works because arch/arm/mach-mxs/include should be in the header
search path for this file. Nevertheless, this file can simply include
"devices-mx28.h" (just as the four other files including that header do).

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
2012-06-29 08:57:20 +08:00
David S. Miller
b26d344c6b Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts:
	drivers/net/caif/caif_hsi.c
	drivers/net/usb/qmi_wwan.c

The qmi_wwan merge was trivial.

The caif_hsi.c, on the other hand, was not.  It's a conflict between
1c385f1fdf ("caif-hsi: Replace platform
device with ops structure.") in the net-next tree and commit
39abbaef19 ("caif-hsi: Postpone init of
HSI until open()") in the net tree.

I did my best with that one and will ask Sjur to check it out.

Signed-off-by: David S. Miller <davem@davemloft.net>
2012-06-28 17:37:00 -07:00
Namhyung Kim
b102f1d0f1 tracing/kvm: Use __print_hex() for kvm_emulate_insn tracepoint
The kvm_emulate_insn tracepoint used __print_insn()
for printing its instructions. However it makes the
format of the event hard to parse as it reveals TP
internals.

Fortunately, the kernel provides __print_hex for almost
the same purpose, so we can use it instead of open coding
it. User space can be changed to parse it later.

That means raw kernel tracing will not be affected
by this change:

 # cd /sys/kernel/debug/tracing/
 # cat events/kvm/kvm_emulate_insn/format
 name: kvm_emulate_insn
 ID: 29
 format:
	...
 print fmt: "%x:%llx:%s (%s)%s", REC->csbase, REC->rip, __print_hex(REC->insn, REC->len), \
 __print_symbolic(REC->flags, { 0, "real" }, { (1 << 0) | (1 << 1), "vm16" }, \
 { (1 << 0), "prot16" }, { (1 << 0) | (1 << 2), "prot32" }, { (1 << 0) | (1 << 3), "prot64" }), \
 REC->failed ? " failed" : ""

 # echo 1 > events/kvm/kvm_emulate_insn/enable
 # cat trace
 # tracer: nop
 #
 # entries-in-buffer/entries-written: 2183/2183   #P:12
 #
 #                              _-----=> irqs-off
 #                             / _----=> need-resched
 #                            | / _---=> hardirq/softirq
 #                            || / _--=> preempt-depth
 #                            ||| /     delay
 #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
 #              | |       |   ||||       |         |
         qemu-kvm-1782  [002] ...1   140.931636: kvm_emulate_insn: 0:c102fa25:89 10 (prot32)
         qemu-kvm-1781  [004] ...1   140.931637: kvm_emulate_insn: 0:c102fa25:89 10 (prot32)
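
As a rough model of the output shown above, the following Python sketch formats instruction bytes as space-separated two-digit hex and assembles a line in the shape of the print fmt. It assumes __print_hex emits each byte as two hex digits separated by spaces, which matches the "89 10" in the trace. The helper names `print_hex` and `format_emulate_insn` are invented for this illustration.

```python
def print_hex(buf: bytes, length: int) -> str:
    """Model of __print_hex: the first `length` bytes as 'xx xx ...'."""
    return " ".join(f"{byte:02x}" for byte in buf[:length])


def format_emulate_insn(csbase: int, rip: int, insn: bytes, length: int,
                        mode: str, failed: bool) -> str:
    """Mirror the "%x:%llx:%s (%s)%s" layout of the tracepoint."""
    suffix = " failed" if failed else ""
    return f"{csbase:x}:{rip:x}:{print_hex(insn, length)} ({mode}){suffix}"


# The record holds a fixed-size buffer; only the first `length` bytes
# are part of the instruction, the rest here is padding.
insn_buffer = bytes([0x89, 0x10]) + bytes(13)
line = format_emulate_insn(0, 0xC102FA25, insn_buffer, 2, "prot32", False)
print(line)
```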

Link: http://lkml.kernel.org/n/tip-wfw6y3b9ugtey8snaow9nmg5@git.kernel.org
Link: http://lkml.kernel.org/r/1340757701-10711-2-git-send-email-namhyung@kernel.org

Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: kvm@vger.kernel.org
Acked-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-06-28 13:52:15 -04:00
Alessandro Rubini
158e8bfe80 ARM: 7432/1: use the new linux/sizes.h
Signed-off-by: Alessandro Rubini <rubini@gnudd.com>
Acked-by: Giancarlo Asnaghi <giancarlo.asnaghi@st.com>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Cc: Alan Cox <alan@linux.intel.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2012-06-28 17:14:35 +01:00
Guennadi Liakhovetski
d4c191dfb9 sh: ecovec: switch MMC power control to regulators
Power on the CN11 and CN12 SD/MMC slots on ecovec is controlled by GPIOs,
which makes it possible to use the fixed voltage regulator driver to switch
card power on and off.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-06-28 23:16:56 +09:00
Tony Lindgren
472fd54015 Merge branch 'cleanup-hwmod' into cleanup
Conflicts:
	arch/arm/mach-omap2/dsp.c
2012-06-28 05:47:01 -07:00
Tony Lindgren
5f6129675b Merge branches 'cleanup-udc' and 'cleanup-dma' into cleanup 2012-06-28 05:46:41 -07:00
Paul Bolle
6ac7d11527 treewide: Put a space between #include and FILE
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2012-06-28 11:44:36 +02:00
Paul Bolle
459dac82b5 Remove useless wrappers of asm-generic/rmap.h
xtensa has a header (in its include/asm directory) that is a thin
wrapper around asm-generic/rmap.h. This wrapper is useless, since that
header doesn't exist. It is also unused (no file includes asm/rmap.h).

openrisc generates a similar header at build time (using a generic-y
entry in include/asm/Kbuild). This generated header is useless and
unused too.

Remove this header and this generic-y entry.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2012-06-28 11:29:26 +02:00
Paul Bolle
00cd7dc702 Remove useless wrappers of asm-generic/ipc.h
mn10300 has a header (in its include/asm directory) that is a thin
wrapper around asm-generic/ipc.h. This wrapper is useless, since that
header doesn't exist. It is also unused (no file includes asm/ipc.h).

hexagon and tile generate similar headers at build time (using a
generic-y entry in include/asm/Kbuild). These generated headers are
useless and unused too.

Remove this header and these generic-y entries.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: Richard Kuo <rkuo@codeaurora.org>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: David Howells <dhowells@redhat.com> [MN10300]
Signed-off-by: Michal Marek <mmarek@suse.cz>
2012-06-28 11:28:59 +02:00
Paul Bolle
da870585b3 Remove useless wrappers of asm-generic/cpumask.h
frv and xtensa both have a header (in their include/asm directories)
that are thin wrappers around asm-generic/cpumask.h. These wrappers are
useless, since that header doesn't exist. They are also unused (all
files including asm/cpumask.h are x86 specific).

hexagon and openrisc generate similar headers at build time (using a
generic-y entry in include/asm/Kbuild). These generated headers are
useless and unused too.

Remove these headers and generic-y entries.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: Richard Kuo <rkuo@codeaurora.org>
Acked-by: David Howells <dhowells@redhat.com> [FRV]
Signed-off-by: Michal Marek <mmarek@suse.cz>
2012-06-28 11:19:19 +02:00
Guennadi Liakhovetski
67ef578699 sh: add fixed voltage regulators to se7724
On se7724 provide a 3.3V supply for its SD/MMC-card interfaces.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-06-28 18:01:03 +09:00
Guennadi Liakhovetski
2fcfe22ae7 sh: add fixed voltage regulators to sdk7786
On sdk7786 provide a dummy regulator for the smsc911x driver.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-06-28 18:01:01 +09:00
Guennadi Liakhovetski
1190ef9229 sh: add fixed voltage regulators to rsk
On rsk devices provide a dummy regulator for the smsc911x driver.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-06-28 18:01:00 +09:00
Guennadi Liakhovetski
48aff643d9 sh: add fixed voltage regulators to migor
On migor provide a 3.3V supply for its SD/MMC-card interfaces.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-06-28 18:00:58 +09:00
Guennadi Liakhovetski
da5f2ddce8 sh: add fixed voltage regulators to kfr2r09
On kfr2r09 provide a 3.3V supply for its SD/MMC-card interfaces.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-06-28 18:00:56 +09:00
Guennadi Liakhovetski
9c158b15e7 sh: add fixed voltage regulators to ap325rxa
On ap325rxa provide a 3.3V supply for its SD/MMC-card interfaces and a
dummy regulator for the smsc911x driver.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-06-28 18:00:54 +09:00
Guennadi Liakhovetski
2db73c9bad sh: add fixed voltage regulators to sh7757lcr
On sh7757lcr provide a 3.3V supply for its SD/MMC-card interfaces.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-06-28 18:00:53 +09:00
Guennadi Liakhovetski
9d5be6acd8 sh: add fixed voltage regulators to sh2007
On sh2007 provide a dummy regulator for the smsc911x driver for the two
SMSC 9118 devices.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-06-28 18:00:52 +09:00
Guennadi Liakhovetski
73c2a9dc0e sh: add fixed voltage regulators to polaris
On polaris provide a dummy regulator for the smsc911x driver.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-06-28 18:00:50 +09:00
Guennadi Liakhovetski
2bd5d08656 sh: add fixed voltage regulators to magicpanelr2
On magicpanelr2 provide a dummy regulator for the smsc911x driver.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-06-28 18:00:48 +09:00
Guennadi Liakhovetski
1f8eca1293 sh: add fixed voltage regulators to apsh4ad0a
On apsh4ad0a provide a dummy regulator for the smsc911x driver.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-06-28 18:00:47 +09:00
Guennadi Liakhovetski
920f550093 sh: add fixed voltage regulators to apsh4a3a
On apsh4a3a provide a dummy regulator for the smsc911x driver.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-06-28 18:00:46 +09:00
Paul Mundt
e9bcd470d8 Merge branches 'sh/urgent' and 'sh/trivial' into sh-fixes-for-linus 2012-06-28 16:46:13 +09:00
Nobuhiro Iwamatsu
ad3337cb38 sh: Convert sh_clk_mstp32_register to sh_clk_mstp_register
sh_clk_mstp32_register is deprecated. This converts its users to sh_clk_mstp_register.

Signed-off-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-06-28 16:45:34 +09:00
Guennadi Liakhovetski
bc404e9128 sh: kfr2r09: fix compile breakage
Fix compile breakage caused by

commit aa82f9fcd0
Author: Paul Mundt <lethal@linux-sh.org>

    sh: kfr2r09 evt2irq migration.

Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2012-06-28 16:35:40 +09:00
Paul Bolle
0fb37842e4 ARM: OMAP: remove unused cpu detection macros
Now that OMAP730 and OMAP850 support is mostly unified, there's no
need for separate cpu detection macros for these architectures. At
least, currently there isn't, because both macros are unused.
cpu_is_7xx() seems to cover all possible uses.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
[tony@atomide.com: updated to also remove related IS_OMAP_TYPE]
Signed-off-by: Tony Lindgren <tony@atomide.com>
2012-06-28 00:10:30 -07:00
Paul Bolle
d2ba779afc ARM: OMAP: fix typos related to OMAP330
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2012-06-28 00:10:24 -07:00
Tony Lindgren
0810048713 Merge branch 'for_3.6/cleanup/twl-irq' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-omap-pm into fixes-non-critical 2012-06-28 00:09:26 -07:00
Tony Lindgren
4348be7a58 Merge branch 'for_3.6/cleanup/pm' of git://git.kernel.org/pub/scm/linux/kernel/git/khilman/linux-omap-pm into fixes-non-critical 2012-06-28 00:08:58 -07:00
Paul Bolle
59757c750f ARM: OMAP7XX: Remove omap730.h and omap850.h
No file includes omap730.h or omap850.h. That's not surprising, as
commit e6684f7132 ("OMAP7XX: Create
omap7xx.h") created a header that was "intended to replace omap730.h and
omap850.h", while all "values defined [in omap7xx.h] are identical to
those in both the old files". So it seems it was just an oversight to
keep both the old files after commit
7a8f48f8c6 ("OMAP7XX: omap_uwire.c:
Convert to omap7xx.h") converted the last file still including one of
those two old files.

Convert the last reference to omap730.h to a reference to omap7xx.h too.

Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2012-06-28 00:08:10 -07:00
Venkatraman S
b56f2cb71a ARM: OMAP2+: fix naming collision of variable nr_irqs
Using nr_irqs as a local variable name triggers this sparse warning:
./arch/arm/mach-omap2/irq.c:265:6: warning: symbol 'nr_irqs' shadows an earlier one
./linux/include/linux/irqnr.h:26:12: originally declared here

Signed-off-by: Venkatraman S <svenkatr@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2012-06-28 00:08:04 -07:00
Javier Martinez Canillas
d3ada72ee3 ARM: OMAP: omap2plus_defconfig: Enable EXT4 support
On OMAP boards that include an SD card slot, an EXT4 partition could
be used to store the root file system. So, the kernel should have
built-in support for EXT4 to be able to mount the VFS root on boot.

Signed-off-by: Javier Martinez Canillas <javier@dowhile0.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2012-06-28 00:07:56 -07:00
Tony Lindgren
53e6a100f7 Merge branch 'fixes-omap4-dsp' into fixes-non-critical 2012-06-28 00:07:30 -07:00
Arnd Bergmann
00a3669838 ARM: OMAP depends on MMU
There is no way to build OMAP kernels without an MMU
at this point because of dependencies on MMU-only functions.

As long as nobody is interested in fixing this, let's just disable
this platform for nommu kernels.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
2012-06-28 00:06:09 -07:00
Paul Walmsley
a77e1c4d09 Merge branches 'am35xx_hwmod_data_fixes_a_3.6', 'am35xx_emac_mdio_devel_3.6' and 'am35xx_prcm_data_devel_3.6' into am35xx_devel_3.6 2012-06-28 00:13:19 -06:00
Mark A. Greer
ff7ad7e492 arm: omap3: am35x: Set proper powerdomain states
The am35x family of SoCs only support the PWRSTS_ON
state so create a new set of powerdomain structures
that ensure that only the ON state is entered.

Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
2012-06-28 00:12:35 -06:00
Mark A. Greer
16e5e2c471 ARM: OMAP AM35x: clockdomain data: Fix clockdomain dependencies
The am35x family of SoCs do not have an IVA so
a parallel set of clockdomain dependencies are
required that are similar to OMAP3 but without
any IVA dependencies.

Signed-off-by: Mark A. Greer <mgreer@animalcreek.com>
Signed-off-by: Paul Walmsley <paul@pwsan.com>
2012-06-28 00:07:32 -06:00
Alex Shi
effee4b9b3 x86/tlb: do flush_tlb_kernel_range by 'invlpg'
This patch implements flush_tlb_kernel_range with 'invlpg'. The performance
cost and gain were analyzed in the previous patch
(x86/flush_tlb: try flush_tlb_single one by one in flush_tlb_range).

In the testing: http://lkml.org/lkml/2012/6/21/10

The cost is mostly covered by the long kernel path, but the gain is still
quite clear: memory access in a user application can speed up by 30+% when
the kernel executes this function.

Signed-off-by: Alex Shi <alex.shi@intel.com>
Link: http://lkml.kernel.org/r/1340845344-27557-10-git-send-email-alex.shi@intel.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2012-06-27 19:29:14 -07:00
Alex Shi
52aec3308d x86/tlb: replace INVALIDATE_TLB_VECTOR by CALL_FUNCTION_VECTOR
There are currently 32 INVALIDATE_TLB_VECTOR entries in the kernel. That is
quite a large number of vectors in the IDT, yet it is still not enough, since
modern x86 servers have more CPUs than that. This still causes heavy lock
contention in TLB flushing.

This patch uses the generic SMP call function instead. That frees 32
vectors in the IDT and resolves the lock contention in TLB
flushing on large systems.

On an NHM EX machine (4P * 8cores * HT = 64 CPUs), hackbench pthread
shows a 3% performance increase.

Signed-off-by: Alex Shi <alex.shi@intel.com>
Link: http://lkml.kernel.org/r/1340845344-27557-9-git-send-email-alex.shi@intel.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2012-06-27 19:29:13 -07:00
Alex Shi
611ae8e3f5 x86/tlb: enable tlb flush range support for x86
Not every tlb_flush really needs to evacuate all
TLB entries. In munmap, for example, a few 'invlpg' are better for overall
process performance, since they leave most of the TLB entries in place for
later accesses.

This patch also rewrites flush_tlb_range for 2 purposes:
1, split it out to get the flush_tlb_mm_range function.
2, clean up to reduce line breaking, thanks to Borislav's input.

My micro benchmark 'mummap' http://lkml.org/lkml/2012/5/17/59
shows that random memory access on another CPU gets a 0~50% speedup
on a 2P * 4cores * HT NHM EP while doing 'munmap'.

Thanks to Yongjie for testing this patch:
-------------
I used Linux 3.4-RC6 w/ and w/o his patches as Xen dom0 and guest
kernel.
After running two benchmarks in Xen HVM guest, I found his patches
brought about 1%~3% performance gain in 'kernel build' and 'netperf'
testing, though the performance gain was not very stable in 'kernel
build' testing.

Some detailed testing results are below.

Testing Environment:
	Hardware: Romley-EP platform
	Xen version: latest upstream
	Linux kernel: 3.4-RC6
	Guest vCPU number: 8
	NIC: Intel 82599 (10GB bandwidth)

In 'kernel build' testing in guest:
	Command line  |  performance gain
	make -j 4     |    3.81%
	make -j 8     |    0.37%
	make -j 16    |    -0.52%

In 'netperf' testing, we tested TCP_STREAM with the default socket size of
16384 bytes as the large packet and 64 bytes as the small packet.
I used several clients to add networking pressure, and the 'netperf' server
automatically generated several threads to respond to them.
I used both large-size and small-size packets in the testing.
	Packet size  |  Thread number | performance gain
	16384 bytes  |      4       |   0.02%
	16384 bytes  |      8       |   2.21%
	16384 bytes  |      16      |   2.04%
	64 bytes     |      4       |   1.07%
	64 bytes     |      8       |   3.31%
	64 bytes     |      16      |   0.71%

Signed-off-by: Alex Shi <alex.shi@intel.com>
Link: http://lkml.kernel.org/r/1340845344-27557-8-git-send-email-alex.shi@intel.com
Tested-by: Ren, Yongjie <yongjie.ren@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2012-06-27 19:29:11 -07:00
Alex Shi
3df3212f97 x86/tlb: add tlb_flushall_shift knob into debugfs
kernel will replace cr3 rewrite with invlpg when
  tlb_flush_entries <= active_tlb_entries / 2^tlb_flushall_factor
if tlb_flushall_factor is -1, kernel won't do this replacement.
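
The threshold rule above can be written out as a short Python sketch. It is only a model of the inequality as stated in this message, not the kernel code. The function name `use_invlpg` is invented for the illustration, and treating every negative shift (not just -1) as "disabled" is an assumption.

```python
def use_invlpg(flush_entries: int, active_tlb_entries: int, shift: int) -> bool:
    """True when per-page invlpg is preferred over a full flush (cr3 rewrite).

    Models: flush_entries <= active_tlb_entries / 2^shift,
    with shift == -1 meaning "never replace the full flush".
    """
    if shift < 0:
        # Assumption: any negative value disables the replacement.
        return False
    # Dividing by 2^shift is a right shift for non-negative integers.
    return flush_entries <= (active_tlb_entries >> shift)


# With 512 active entries and a shift of 5 the cut-off is 16 entries.
print(use_invlpg(16, 512, 5))
print(use_invlpg(17, 512, 5))
print(use_invlpg(1, 512, -1))
```

A larger shift lowers the cut-off, so fewer flush requests qualify for the per-page path.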

Users can modify its value according to the specific CPU/applications.

Thanks to Borislav for providing the help message of
CONFIG_DEBUG_TLBFLUSH.

Signed-off-by: Alex Shi <alex.shi@intel.com>
Link: http://lkml.kernel.org/r/1340845344-27557-6-git-send-email-alex.shi@intel.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2012-06-27 19:29:10 -07:00
Alex Shi
c4211f42d3 x86/tlb: add tlb_flushall_shift for specific CPU
Testing shows that different CPU types (micro architectures and NUMA modes)
have different balance points between a full TLB flush and multiple invlpg.
There are also cases where the TLB flush change does not help at all.

This patch provides an interface that lets x86 vendor developers
set a different shift for each CPU type.

For example, on the machines I have at hand, the balance point is 16 entries
on Romley-EP, 8 entries on Bloomfield NHM-EP, and 256 on
IVB mobile CPU. On model 15 core2 Xeon, however, using invlpg does not
help.

For untested machines, do a conservative optimization, the same as for NHM CPUs.

Signed-off-by: Alex Shi <alex.shi@intel.com>
Link: http://lkml.kernel.org/r/1340845344-27557-5-git-send-email-alex.shi@intel.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2012-06-27 19:29:10 -07:00