mbox series

[RFC,INTERNAL,v3,0/4] Introduce NMI aware serial drivers

Message ID 1594966857-5215-1-git-send-email-sumit.garg@linaro.org
Headers show
Series Introduce NMI aware serial drivers | expand

Message

Sumit Garg July 17, 2020, 6:20 a.m. UTC
Motivation:
-----------

The major driver for this work has been the prior efforts around the
NMI/FIQ debugger which is pretty well described here [1].

Also, there is an existing kgdb NMI serial driver which provides partial
implementatation in upstream to have separate ttyNMI0 port but that
remained in silos with the serial core/drivers which made it a bit odd
to enable using serial device interrupt and hence remained unused. It
seems to be clearly intended to avoid almost all custom NMI changes to
the UART driver.

But now with the advent of pseudo NMIs on arm64 it became quite generic
to request serial device interrupt as an NMI rather than IRQ. And having
NMI driven serial RX will allow us to do a magic sysrq in NMI context
rather than in normal IRQ context and hence drop into kernel debugger in
NMI context.

[1] https://www.linaro.org/blog/debugging-arm-kernels-using-nmifiq/

Use-cases:
----------

The major use-case is to enhance kernel debugging capabilities to debug
scenarios such as:
- Primary CPU is stuck in deadlock with interrupts disabled and doesn't
  honour serial device interrupt. So having sysrq running in NMI context
  is helpful for debugging.
- Always enabled NMI based magic sysrq irrespective of whether the serial
  TTY port is active or not.
- Leverage kgdb capabilities in NMI context.

Approach:
---------

The overall idea is to intercept serial RX characters in NMI context, if
those are specific to magic sysrq then allow corresponding handler to run
in NMI context. Otherwise defer all NMI unsafe RX and TX operations onto
IRQ work queue in order to run those in normal interrupt context.

This approach is demonstrated using amba-pl011 driver.

Goal of this RFC:
-----------------

My main reason for sharing this as an RFC is to help decide whether or
not to continue with this approach. The next step for me would to port
the work to a system with an 8250 UART.

Usage:
------

This RFC has been developed on top of 5.8-rc3 and if anyone is interested
to give this a try on QEMU, just enable following config options
additional to arm64 defconfig:

CONFIG_KGDB=y
CONFIG_KGDB_KDB=y
CONFIG_ARM64_PSEUDO_NMI=y

Qemu command line to test:

$ qemu-system-aarch64 -nographic -machine virt,gic-version=3 -cpu cortex-a57 \
  -smp 2 -kernel arch/arm64/boot/Image -append 'console=ttyAMA0,38400 \
  keep_bootcon root=/dev/vda2 irqchip.gicv3_pseudo_nmi=1 kgdboc=ttyAMA0' \
  -initrd rootfs-arm64.cpio.gz

NMI entry into kgdb via sysrq:
- Ctrl a + b + g

I do look forward to your comments and feedback.

Changes since RFC v2:
- Added/updated cover-letter and commit descriptions.
- Keep serial core NMI feature under CONFIG_CONSOLE_POLL.
- Removed misc. redundant checks.
- Used container_of() instead of a backlink.

Changes since RFC v1:
- Switch TX to be NMI driven rather than poolling mode.
- Move common NMI framework bits to serial core.
- Make uart_handle_sysrq_char() and uart_handle_break() NMI safe.
- Correct poll_init() initialization sequence.
- Misc. comments.
- Regaring NMI alloc API, I thought about moving it to serial core but
  then after looking at it more closely there seems to be multiple
  serial driver specific bits like NMI handler, name, device ID which
  refrained me from moving it to serial core. But if you still think
  otherwise then I could just move it with aforementioned params as
  arguments.

Sumit Garg (4):
  tty/sysrq: Make sysrq handler NMI aware
  serial: core: Add framework to allow NMI aware serial drivers
  serial: amba-pl011: Re-order APIs definition
  serial: amba-pl011: Enable NMI aware polling mode

 drivers/tty/serial/amba-pl011.c  | 236 ++++++++++++++++++++++++++++-----------
 drivers/tty/serial/serial_core.c | 120 +++++++++++++++++++-
 drivers/tty/sysrq.c              |  33 +++++-
 include/linux/serial_core.h      |  67 +++++++++++
 include/linux/sysrq.h            |   1 +
 kernel/debug/debug_core.c        |   1 +
 6 files changed, 390 insertions(+), 68 deletions(-)

-- 
2.7.4

Comments

Daniel Thompson July 20, 2020, 2:28 p.m. UTC | #1
On Fri, Jul 17, 2020 at 11:50:53AM +0530, Sumit Garg wrote:
> Motivation:

> -----------

> 

> The major driver for this work has been the prior efforts around the

> NMI/FIQ debugger which is pretty well described here [1].


Why should the reviewer care (or read a large blog post from five years
ago before reading further)? You could cite this in "for more details"
at the end but kicking off with a link to a blog is pretty hostile to
reviewers.


> Also, there is an existing kgdb NMI serial driver which provides partial

> implementatation in upstream to have separate ttyNMI0 port but that

> remained in silos with the serial core/drivers which made it a bit odd

> to enable using serial device interrupt and hence remained unused. It

> seems to be clearly intended to avoid almost all custom NMI changes to

> the UART driver.


It occurs to me that there should be a patch 5/5 that removed kgdb_nmi.c
and all the related code! Actually showing how much code we can remove
might be persuasive.
 

> But now with the advent of pseudo NMIs on arm64 it became quite generic

> to request serial device interrupt as an NMI rather than IRQ. And having

> NMI driven serial RX will allow us to do a magic sysrq in NMI context

> rather than in normal IRQ context and hence drop into kernel debugger in

> NMI context.


Go back from here and read how much text we have to read before we find
out what the patchset actually does.

*Start* with the goal of the patch set: make it possible for UARTs to
trigger magic sysrq from an NMI.

Then show why that is useful. This is probably a mixture of what is
currently in use cases *and* the fact that we can kill off kgdb_nmi.c 
whilst getting more features.


> 

> [1] https://www.linaro.org/blog/debugging-arm-kernels-using-nmifiq/

> 

> Use-cases:

> ----------

> 

> The major use-case is to enhance kernel debugging capabilities to debug

> scenarios such as:

> - Primary CPU is stuck in deadlock with interrupts disabled and doesn't

>   honour serial device interrupt. So having sysrq running in NMI context

>   is helpful for debugging.

> - Always enabled NMI based magic sysrq irrespective of whether the serial

>   TTY port is active or not.

> - Leverage kgdb capabilities in NMI context.


Isn't "Leverage kgdb capabilities" simply a side effect of handling sysrq
from NMI.


Daniel


PS Everything from this point down is looking good. However it is
important to concisely explain what the patch set is for at the beginning.
Consider it to be explaining why someone should bother to continue reading!

PPS I think once you tweak the wording a bit more then I think this set
    is good to go.


> 

> Approach:

> ---------

> 

> The overall idea is to intercept serial RX characters in NMI context, if

> those are specific to magic sysrq then allow corresponding handler to run

> in NMI context. Otherwise defer all NMI unsafe RX and TX operations onto

> IRQ work queue in order to run those in normal interrupt context.

> 

> This approach is demonstrated using amba-pl011 driver.

> 

> Goal of this RFC:

> -----------------

> 

> My main reason for sharing this as an RFC is to help decide whether or

> not to continue with this approach. The next step for me would to port

> the work to a system with an 8250 UART.

> 

> Usage:

> ------

> 

> This RFC has been developed on top of 5.8-rc3 and if anyone is interested

> to give this a try on QEMU, just enable following config options

> additional to arm64 defconfig:

> 

> CONFIG_KGDB=y

> CONFIG_KGDB_KDB=y

> CONFIG_ARM64_PSEUDO_NMI=y

> 

> Qemu command line to test:

> 

> $ qemu-system-aarch64 -nographic -machine virt,gic-version=3 -cpu cortex-a57 \

>   -smp 2 -kernel arch/arm64/boot/Image -append 'console=ttyAMA0,38400 \

>   keep_bootcon root=/dev/vda2 irqchip.gicv3_pseudo_nmi=1 kgdboc=ttyAMA0' \

>   -initrd rootfs-arm64.cpio.gz

> 

> NMI entry into kgdb via sysrq:

> - Ctrl a + b + g

> 

> I do look forward to your comments and feedback.

> 

> Changes since RFC v2:

> - Added/updated cover-letter and commit descriptions.

> - Keep serial core NMI feature under CONFIG_CONSOLE_POLL.

> - Removed misc. redundant checks.

> - Used container_of() instead of a backlink.

> 

> Changes since RFC v1:

> - Switch TX to be NMI driven rather than poolling mode.

> - Move common NMI framework bits to serial core.

> - Make uart_handle_sysrq_char() and uart_handle_break() NMI safe.

> - Correct poll_init() initialization sequence.

> - Misc. comments.

> - Regaring NMI alloc API, I thought about moving it to serial core but

>   then after looking at it more closely there seems to be multiple

>   serial driver specific bits like NMI handler, name, device ID which

>   refrained me from moving it to serial core. But if you still think

>   otherwise then I could just move it with aforementioned params as

>   arguments.

> 

> Sumit Garg (4):

>   tty/sysrq: Make sysrq handler NMI aware

>   serial: core: Add framework to allow NMI aware serial drivers

>   serial: amba-pl011: Re-order APIs definition

>   serial: amba-pl011: Enable NMI aware polling mode

> 

>  drivers/tty/serial/amba-pl011.c  | 236 ++++++++++++++++++++++++++++-----------

>  drivers/tty/serial/serial_core.c | 120 +++++++++++++++++++-

>  drivers/tty/sysrq.c              |  33 +++++-

>  include/linux/serial_core.h      |  67 +++++++++++

>  include/linux/sysrq.h            |   1 +

>  kernel/debug/debug_core.c        |   1 +

>  6 files changed, 390 insertions(+), 68 deletions(-)

> 

> -- 

> 2.7.4
Sumit Garg July 21, 2020, 6:57 a.m. UTC | #2
On Mon, 20 Jul 2020 at 19:58, Daniel Thompson
<daniel.thompson@linaro.org> wrote:
>

> On Fri, Jul 17, 2020 at 11:50:53AM +0530, Sumit Garg wrote:

> > Motivation:

> > -----------

> >

> > The major driver for this work has been the prior efforts around the

> > NMI/FIQ debugger which is pretty well described here [1].

>

> Why should the reviewer care (or read a large blog post from five years

> ago before reading further)? You could cite this in "for more details"

> at the end but kicking off with a link to a blog is pretty hostile to

> reviewers.

>


Okay, will keep it as a reference at the end.

>

> > Also, there is an existing kgdb NMI serial driver which provides partial

> > implementatation in upstream to have separate ttyNMI0 port but that

> > remained in silos with the serial core/drivers which made it a bit odd

> > to enable using serial device interrupt and hence remained unused. It

> > seems to be clearly intended to avoid almost all custom NMI changes to

> > the UART driver.

>

> It occurs to me that there should be a patch 5/5 that removed kgdb_nmi.c

> and all the related code! Actually showing how much code we can remove

> might be persuasive.

>


Okay, will add a patch for that too.

>

> > But now with the advent of pseudo NMIs on arm64 it became quite generic

> > to request serial device interrupt as an NMI rather than IRQ. And having

> > NMI driven serial RX will allow us to do a magic sysrq in NMI context

> > rather than in normal IRQ context and hence drop into kernel debugger in

> > NMI context.

>

> Go back from here and read how much text we have to read before we find

> out what the patchset actually does.

>

> *Start* with the goal of the patch set: make it possible for UARTs to

> trigger magic sysrq from an NMI.

>

> Then show why that is useful. This is probably a mixture of what is

> currently in use cases *and* the fact that we can kill off kgdb_nmi.c

> whilst getting more features.

>


Okay, will update the description accordingly.

>

> >

> > [1] https://www.linaro.org/blog/debugging-arm-kernels-using-nmifiq/

> >

> > Use-cases:

> > ----------

> >

> > The major use-case is to enhance kernel debugging capabilities to debug

> > scenarios such as:

> > - Primary CPU is stuck in deadlock with interrupts disabled and doesn't

> >   honour serial device interrupt. So having sysrq running in NMI context

> >   is helpful for debugging.

> > - Always enabled NMI based magic sysrq irrespective of whether the serial

> >   TTY port is active or not.

> > - Leverage kgdb capabilities in NMI context.

>

> Isn't "Leverage kgdb capabilities" simply a side effect of handling sysrq

> from NMI.

>


Yeah this is somewhat redundant, will get rid of it.

>

> Daniel

>

>

> PS Everything from this point down is looking good. However it is

> important to concisely explain what the patch set is for at the beginning.

> Consider it to be explaining why someone should bother to continue reading!

>


Sure.

> PPS I think once you tweak the wording a bit more then I think this set

>     is good to go.

>


Thanks, will post the next version as RFC in upstream after
incorporating your comments.

-Sumit

>

> >

> > Approach:

> > ---------

> >

> > The overall idea is to intercept serial RX characters in NMI context, if

> > those are specific to magic sysrq then allow corresponding handler to run

> > in NMI context. Otherwise defer all NMI unsafe RX and TX operations onto

> > IRQ work queue in order to run those in normal interrupt context.

> >

> > This approach is demonstrated using amba-pl011 driver.

> >

> > Goal of this RFC:

> > -----------------

> >

> > My main reason for sharing this as an RFC is to help decide whether or

> > not to continue with this approach. The next step for me would to port

> > the work to a system with an 8250 UART.

> >

> > Usage:

> > ------

> >

> > This RFC has been developed on top of 5.8-rc3 and if anyone is interested

> > to give this a try on QEMU, just enable following config options

> > additional to arm64 defconfig:

> >

> > CONFIG_KGDB=y

> > CONFIG_KGDB_KDB=y

> > CONFIG_ARM64_PSEUDO_NMI=y

> >

> > Qemu command line to test:

> >

> > $ qemu-system-aarch64 -nographic -machine virt,gic-version=3 -cpu cortex-a57 \

> >   -smp 2 -kernel arch/arm64/boot/Image -append 'console=ttyAMA0,38400 \

> >   keep_bootcon root=/dev/vda2 irqchip.gicv3_pseudo_nmi=1 kgdboc=ttyAMA0' \

> >   -initrd rootfs-arm64.cpio.gz

> >

> > NMI entry into kgdb via sysrq:

> > - Ctrl a + b + g

> >

> > I do look forward to your comments and feedback.

> >

> > Changes since RFC v2:

> > - Added/updated cover-letter and commit descriptions.

> > - Keep serial core NMI feature under CONFIG_CONSOLE_POLL.

> > - Removed misc. redundant checks.

> > - Used container_of() instead of a backlink.

> >

> > Changes since RFC v1:

> > - Switch TX to be NMI driven rather than poolling mode.

> > - Move common NMI framework bits to serial core.

> > - Make uart_handle_sysrq_char() and uart_handle_break() NMI safe.

> > - Correct poll_init() initialization sequence.

> > - Misc. comments.

> > - Regaring NMI alloc API, I thought about moving it to serial core but

> >   then after looking at it more closely there seems to be multiple

> >   serial driver specific bits like NMI handler, name, device ID which

> >   refrained me from moving it to serial core. But if you still think

> >   otherwise then I could just move it with aforementioned params as

> >   arguments.

> >

> > Sumit Garg (4):

> >   tty/sysrq: Make sysrq handler NMI aware

> >   serial: core: Add framework to allow NMI aware serial drivers

> >   serial: amba-pl011: Re-order APIs definition

> >   serial: amba-pl011: Enable NMI aware polling mode

> >

> >  drivers/tty/serial/amba-pl011.c  | 236 ++++++++++++++++++++++++++++-----------

> >  drivers/tty/serial/serial_core.c | 120 +++++++++++++++++++-

> >  drivers/tty/sysrq.c              |  33 +++++-

> >  include/linux/serial_core.h      |  67 +++++++++++

> >  include/linux/sysrq.h            |   1 +

> >  kernel/debug/debug_core.c        |   1 +

> >  6 files changed, 390 insertions(+), 68 deletions(-)

> >

> > --

> > 2.7.4