[v2] serial: core: only stop transmit when HW fifo is empty

Message ID 20240303150807.68117-1-jonas.gorski@gmail.com
State New

Commit Message

Jonas Gorski March 3, 2024, 3:08 p.m. UTC
If the circular buffer is empty, it just means we fit all characters to
send into the HW fifo, but not that the hardware finished transmitting
them.

So if we immediately call stop_tx() after that, this may abort any
pending characters in the HW fifo, and cause dropped characters on the
console.

Fix this by only stopping tx when the tx HW fifo is actually empty.

Fixes: 8275b48b2780 ("tty: serial: introduce transmit helpers")
Cc: stable@vger.kernel.org
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
---
(this is v2 of the bcm63xx-uart fix attempt)

v1 -> v2
* replace workaround with fix for core issue
* add Cc: for stable

I'm somewhat confident this is the core issue causing the broken output
with bcm63xx-uart, and there is no actual need for the UART_TX_NOSTOP.

I wouldn't be surprised if this also fixes mxs-uart for which
UART_TX_NOSTOP was introduced.

If it does, there is no need for the flag anymore.
 include/linux/serial_core.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
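
To illustrate the failure mode described in the commit message, here is a minimal userspace sketch in C. It is not kernel code: toy_uart, toy_tx_irq(), toy_send() and the check_tx_empty switch are hypothetical names. The model assumes hardware where stop_tx() discards whatever is still in the TX fifo, and where a TX interrupt fires each time the fifo drains. Real UARTs may differ in both respects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; every name here is hypothetical, not a kernel API. */
struct toy_uart {
	size_t circ_pending;	/* chars still in the software circular buffer */
	size_t fifo_level;	/* chars currently sitting in the HW TX fifo */
	size_t fifo_depth;	/* capacity of the HW TX fifo */
	size_t sent;		/* chars that actually went out on the wire */
	size_t dropped;		/* chars discarded by toy_stop_tx() */
	bool tx_enabled;
};

static bool toy_tx_empty(const struct toy_uart *u)
{
	return u->fifo_level == 0;
}

/* Assumption: on this toy hardware, stopping TX aborts the fifo contents. */
static void toy_stop_tx(struct toy_uart *u)
{
	u->dropped += u->fifo_level;
	u->fifo_level = 0;
	u->tx_enabled = false;
}

/* One TX interrupt: refill the fifo, then decide whether to stop. */
static void toy_tx_irq(struct toy_uart *u, bool check_tx_empty)
{
	while (u->circ_pending > 0 && u->fifo_level < u->fifo_depth) {
		u->circ_pending--;
		u->fifo_level++;
	}

	/* Old rule: stop as soon as the circular buffer is empty.
	 * New rule: additionally require the HW fifo to be empty. */
	if (u->circ_pending == 0 && (!check_tx_empty || toy_tx_empty(u)))
		toy_stop_tx(u);
}

/* Send n chars through the model; returns how many were dropped. */
static size_t toy_send(size_t n, size_t fifo_depth, bool check_tx_empty)
{
	struct toy_uart u = {
		.circ_pending = n,
		.fifo_depth = fifo_depth,
		.tx_enabled = true,
	};

	assert(fifo_depth > 0);

	while (u.tx_enabled) {
		toy_tx_irq(&u, check_tx_empty);

		/* The hardware shifts out the fifo between interrupts. */
		while (u.tx_enabled && u.fifo_level > 0) {
			u.fifo_level--;
			u.sent++;
		}
	}

	/* Every char is either sent or dropped. */
	assert(u.sent + u.dropped == n);
	return u.dropped;
}
```

Under these assumptions, toy_send(20, 16, false) reports 4 dropped characters (the tail that was still in the fifo when stop_tx ran), while toy_send(20, 16, true) reports 0.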

Comments

Doug Brown May 17, 2024, 4:22 a.m. UTC | #1
Hello,

On 3/3/2024 7:08 AM, Jonas Gorski wrote:
> If the circular buffer is empty, it just means we fit all characters to
> send into the HW fifo, but not that the hardware finished transmitting
> them.
> 
> So if we immediately call stop_tx() after that, this may abort any
> pending characters in the HW fifo, and cause dropped characters on the
> console.
> 
> Fix this by only stopping tx when the tx HW fifo is actually empty.
> 
> Fixes: 8275b48b2780 ("tty: serial: introduce transmit helpers")
> Cc: stable@vger.kernel.org
> Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
> ---
> (this is v2 of the bcm63xx-uart fix attempt)
> 
> v1 -> v2
> * replace workaround with fix for core issue
> * add Cc: for stable
> 
> I'm somewhat confident this is the core issue causing the broken output
> with bcm63xx-uart, and there is no actual need for the UART_TX_NOSTOP.
> 
> I wouldn't be surprised if this also fixes mxs-uart for which
> UART_TX_NOSTOP was introduced.
> 
> If it does, there is no need for the flag anymore.
>   include/linux/serial_core.h | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h
> index 55b1f3ba48ac..bb0f2d4ac62f 100644
> --- a/include/linux/serial_core.h
> +++ b/include/linux/serial_core.h
> @@ -786,7 +786,8 @@ enum UART_TX_FLAGS {
>   	if (pending < WAKEUP_CHARS) {					      \
>   		uart_write_wakeup(__port);				      \
>   									      \
> -		if (!((flags) & UART_TX_NOSTOP) && pending == 0)	      \
> +		if (!((flags) & UART_TX_NOSTOP) && pending == 0 &&	      \
> +		    __port->ops->tx_empty(__port))			      \
>   			__port->ops->stop_tx(__port);			      \
>   	}								      \
>   									      \

I just upgraded to kernel 6.9 and discovered through a git bisect that
this patch (7bfb915a597a301abb892f620fe5c283a9fdbd77) causes a problem
with the legacy pxa.c serial driver (CONFIG_SERIAL_PXA_NON8250). I'm
using it with a PXA168-based ARM device for a serial console as well as
getty. With this patch applied, transmissions get hung up before they
finish. The data isn't lost, because the next time a transmit occurs,
the delayed data finally goes out -- but something seems to be causing
it to get stuck right at the end of many, but not all, transmissions.
For example, if I type "ps" and hit enter, nothing shows up until I hit
enter again, which finally kickstarts the whole TX process and then I
get all of the queued ps output.

I'm really confused about this symptom because it seems at face value
like this patch would only ever improve the situation by preventing
stop_tx() from being called too early. There's something about the pxa
driver that is happier when stop_tx() is called with an empty buffer
even if the UART is reporting that it's not empty yet. I tested some
other random systems in qemu and couldn't reproduce this issue, so the
problem may very well be limited just to this driver/hardware...

I realize this driver is old and deprecated (I'm likely one of the few
users left of it) so I'm hesitant to call it a regression. Maybe it's
really a bug in this driver that the new patch exposes? I even thought,
"heck, I should probably be using the newer 8250_pxa driver instead",
but that one is even worse -- it drops TX characters like crazy,
regardless of whether this patch is applied. I want to look into that
problem eventually.

I'm hoping there is some kind of simple fix that can be made to the pxa
driver to work around it with this new behavior. Can anyone think of a
reason that this driver would not like this change? It seems
counterintuitive to me -- the patch makes perfect sense.

Thanks,
Doug

Doug Brown May 19, 2024, 5:51 a.m. UTC | #2
Hi again,

On 5/16/2024 9:22 PM, Doug Brown wrote:

> I'm hoping there is some kind of simple fix that can be made to the pxa
> driver to work around it with this new behavior. Can anyone think of a
> reason that this driver would not like this change? It seems
> counterintuitive to me -- the patch makes perfect sense.

After further experimentation, I've come to the conclusion that this is
a bug in the pxa uart driver, and this patch simply exposed the bug.
I'll submit a patch to fix the issue in the pxa driver.

If anyone's interested in the details: basically, the pxa driver in its
current state doesn't work correctly if it receives a TX interrupt when
the circular buffer is empty. It handles it, but then gets stuck waiting
for the next TX IRQ that will never happen because no characters were
transmitted. Previously, stop_tx() was being called before the
transmitter was empty, which prevented that situation from happening:
toggling the TX interrupt enable flag off (with stop_tx()) and back on
(with the next start_tx()) causes a new TX interrupt to fire and
kickstarts the transmit process again.

The 8250 driver, for example, isn't affected by this problem because it
effectively does stop_tx() on its own if it detects an empty circular
buffer in the TX interrupt handler. Adding similar logic to the pxa
driver fixes it.
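
The stuck state described above can be reproduced in a small userspace model. This is a sketch under assumptions, not the pxa driver: model_port, model_start_tx(), model_handle_tx_irq() and the stop_when_empty switch are hypothetical. The model assumes a TX interrupt is raised only when the interrupt enable bit goes from off to on, or after characters have actually been transmitted. It also leaves out whatever later event gets the real hardware going again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; every name here is hypothetical, not a kernel API. */
struct model_port {
	size_t circ_pending;	/* chars waiting in the circular buffer */
	size_t sent;		/* chars handed to the hardware and sent */
	bool tx_irq_enabled;	/* TX interrupt enable bit */
	bool irq_pending;	/* a TX interrupt is waiting to be handled */
};

/* Assumption: only the off -> on edge of the enable bit raises an IRQ. */
static void model_start_tx(struct model_port *p)
{
	if (!p->tx_irq_enabled) {
		p->tx_irq_enabled = true;
		p->irq_pending = true;
	}
}

static void model_stop_tx(struct model_port *p)
{
	p->tx_irq_enabled = false;
}

static void model_handle_tx_irq(struct model_port *p, bool stop_when_empty)
{
	p->irq_pending = false;

	if (p->circ_pending == 0) {
		/* Nothing was written, so no further TX IRQ will follow. */
		if (stop_when_empty)
			model_stop_tx(p);
		return;
	}

	/* Simplification: everything fits in the fifo and is transmitted,
	 * which raises the next TX interrupt. */
	p->sent += p->circ_pending;
	p->circ_pending = 0;
	p->irq_pending = true;
}

/* Queue n chars twice, as two separate writes; returns total chars sent. */
static size_t model_two_writes(size_t n, bool stop_when_empty)
{
	struct model_port p = { 0 };

	assert(n > 0);

	for (int i = 0; i < 2; i++) {
		p.circ_pending += n;
		model_start_tx(&p);
		while (p.irq_pending && p.tx_irq_enabled)
			model_handle_tx_irq(&p, stop_when_empty);
	}
	return p.sent;
}
```

With stop_when_empty false, model_two_writes(5, false) returns 5: the second write never gets a TX interrupt because the enable bit was left on. With it true, which mirrors what the 8250 driver is described as doing, both writes go out and the result is 10.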

Doug

Patch

diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h
index 55b1f3ba48ac..bb0f2d4ac62f 100644
--- a/include/linux/serial_core.h
+++ b/include/linux/serial_core.h
@@ -786,7 +786,8 @@  enum UART_TX_FLAGS {
 	if (pending < WAKEUP_CHARS) {					      \
 		uart_write_wakeup(__port);				      \
 									      \
-		if (!((flags) & UART_TX_NOSTOP) && pending == 0)	      \
+		if (!((flags) & UART_TX_NOSTOP) && pending == 0 &&	      \
+		    __port->ops->tx_empty(__port))			      \
 			__port->ops->stop_tx(__port);			      \
 	}								      \
 									      \