diff mbox series

usb: typec: ucsi: Clear EVENT_PENDING bit if ucsi_send_command fails

Message ID 1694423055-8440-1-git-send-email-quic_prashk@quicinc.com
State New
Headers show
Series usb: typec: ucsi: Clear EVENT_PENDING bit if ucsi_send_command fails | expand

Commit Message

Prashanth K Sept. 11, 2023, 9:04 a.m. UTC
Currently if ucsi_send_command() fails, then we bail out without
clearing EVENT_PENDING flag. So when the next connector change
event comes, ucsi_connector_change() won't queue the con->work,
because of which none of the new events will be processed.

Fix this by clearing EVENT_PENDING flag if ucsi_send_command()
fails.

Cc: <stable@vger.kernel.org> # 5.16
Fixes: 512df95b9432 ("usb: typec: ucsi: Better fix for missing unplug events issue")
Signed-off-by: Prashanth K <quic_prashk@quicinc.com>
---
 drivers/usb/typec/ucsi/ucsi.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Heikki Krogerus Sept. 11, 2023, 12:49 p.m. UTC | #1
On Mon, Sep 11, 2023 at 02:34:15PM +0530, Prashanth K wrote:
> Currently if ucsi_send_command() fails, then we bail out without
> clearing EVENT_PENDING flag. So when the next connector change
> event comes, ucsi_connector_change() won't queue the con->work,
> because of which none of the new events will be processed.
> 
> Fix this by clearing EVENT_PENDING flag if ucsi_send_command()
> fails.
> 
> Cc: <stable@vger.kernel.org> # 5.16
> Fixes: 512df95b9432 ("usb: typec: ucsi: Better fix for missing unplug events issue")
> Signed-off-by: Prashanth K <quic_prashk@quicinc.com>
> ---
>  drivers/usb/typec/ucsi/ucsi.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
> index c6dfe3d..509c67c 100644
> --- a/drivers/usb/typec/ucsi/ucsi.c
> +++ b/drivers/usb/typec/ucsi/ucsi.c
> @@ -884,6 +884,7 @@ static void ucsi_handle_connector_change(struct work_struct *work)
>  	if (ret < 0) {
>  		dev_err(ucsi->dev, "%s: GET_CONNECTOR_STATUS failed (%d)\n",
>  			__func__, ret);
> +		clear_bit(EVENT_PENDING, &con->ucsi->flags);
>  		goto out_unlock;
>  	}

I think it would be better to just move that label (out_unlock) above
the point where clear_bit() is already called instead of separately
calling it like that. That way the Connector Change Event will
also get acknowledged.

If this really can happen, then I think it would be good to also
schedule a task for ucsi_check_connection():

        if (ret < 0) {
                dev_err(ucsi->dev, "%s: GET_CONNECTOR_STATUS failed (%d)\n",
                        __func__, ret);
+               ucsi_partner_task(con, ucsi_check_connection, 1, HZ);
                goto out_unlock;
        }

thanks,
Heikki Krogerus Sept. 15, 2023, 12:32 p.m. UTC | #2
On Tue, Sep 12, 2023 at 04:37:47PM +0530, Prashanth K wrote:
> 
> 
> On 11-09-23 06:19 pm, Heikki Krogerus wrote:
> > On Mon, Sep 11, 2023 at 02:34:15PM +0530, Prashanth K wrote:
> > > Currently if ucsi_send_command() fails, then we bail out without
> > > clearing EVENT_PENDING flag. So when the next connector change
> > > event comes, ucsi_connector_change() won't queue the con->work,
> > > because of which none of the new events will be processed.
> > > 
> > > Fix this by clearing EVENT_PENDING flag if ucsi_send_command()
> > > fails.
> > > 
> > > Cc: <stable@vger.kernel.org> # 5.16
> > > Fixes: 512df95b9432 ("usb: typec: ucsi: Better fix for missing unplug events issue")
> > > Signed-off-by: Prashanth K <quic_prashk@quicinc.com>
> > > ---
> > >   drivers/usb/typec/ucsi/ucsi.c | 1 +
> > >   1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
> > > index c6dfe3d..509c67c 100644
> > > --- a/drivers/usb/typec/ucsi/ucsi.c
> > > +++ b/drivers/usb/typec/ucsi/ucsi.c
> > > @@ -884,6 +884,7 @@ static void ucsi_handle_connector_change(struct work_struct *work)
> > >   	if (ret < 0) {
> > >   		dev_err(ucsi->dev, "%s: GET_CONNECTOR_STATUS failed (%d)\n",
> > >   			__func__, ret);
> > > +		clear_bit(EVENT_PENDING, &con->ucsi->flags);
> > >   		goto out_unlock;
> > >   	}
> > 
> > I think it would be better to just move that label (out_unlock) above
> > the point where clear_bit() is already called instead of separately
> > calling it like that. That way the Connector Change Event will
> > also get acknowledged.
> Do we really need to ACK in this case since we didn't process the current
> connector change event

You won't get the next event before the first one was ACK'd, right?

> > 
> > If this really can happen, then I think it would be good to also
> > schedule a task for ucsi_check_connection():
> > 
> >          if (ret < 0) {
> >                  dev_err(ucsi->dev, "%s: GET_CONNECTOR_STATUS failed (%d)\n",
> >                          __func__, ret);
> > +               ucsi_partner_task(con, ucsi_check_connection, 1, HZ);
> >                  goto out_unlock;
> >          }
> > 
> > thanks,
> > 
> Retrying is a good idea, but ucsi_check_connection() doesn't have the full
> functionality compared to handle_connector_change. I guess
> ucsi_check_connection() will send a set_role, but won't handle the
> connector_change scenarios happening due to PR/DR swap, which will lead to
> deadlocks (due to wait_for_completion). This is just an example. So its
> better to bail out and process the next events, because the failure here is
> from the glink layer.

Fair enough.
Prashanth K Sept. 15, 2023, 1:40 p.m. UTC | #3
On 15-09-23 06:02 pm, Heikki Krogerus wrote:
> On Tue, Sep 12, 2023 at 04:37:47PM +0530, Prashanth K wrote:
>>
>>
>> On 11-09-23 06:19 pm, Heikki Krogerus wrote:
>>> On Mon, Sep 11, 2023 at 02:34:15PM +0530, Prashanth K wrote:
>>>> Currently if ucsi_send_command() fails, then we bail out without
>>>> clearing EVENT_PENDING flag. So when the next connector change
>>>> event comes, ucsi_connector_change() won't queue the con->work,
>>>> because of which none of the new events will be processed.
>>>>
>>>> Fix this by clearing EVENT_PENDING flag if ucsi_send_command()
>>>> fails.
>>>>
>>>> Cc: <stable@vger.kernel.org> # 5.16
>>>> Fixes: 512df95b9432 ("usb: typec: ucsi: Better fix for missing unplug events issue")
>>>> Signed-off-by: Prashanth K <quic_prashk@quicinc.com>
>>>> ---
>>>>    drivers/usb/typec/ucsi/ucsi.c | 1 +
>>>>    1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
>>>> index c6dfe3d..509c67c 100644
>>>> --- a/drivers/usb/typec/ucsi/ucsi.c
>>>> +++ b/drivers/usb/typec/ucsi/ucsi.c
>>>> @@ -884,6 +884,7 @@ static void ucsi_handle_connector_change(struct work_struct *work)
>>>>    	if (ret < 0) {
>>>>    		dev_err(ucsi->dev, "%s: GET_CONNECTOR_STATUS failed (%d)\n",
>>>>    			__func__, ret);
>>>> +		clear_bit(EVENT_PENDING, &con->ucsi->flags);
>>>>    		goto out_unlock;
>>>>    	}
>>>
>>> I think it would be better to just move that label (out_unlock) above
>>> the point where clear_bit() is already called instead of separately
>>> calling it like that. That way the Connector Change Event will
>>> also get acknowledged.
>> Do we really need to ACK in this case since we didn't process the current
>> connector change event
> 
> You won't get the next event before the first one was ACK'd, right?
> 

The spec says that we need to ACK if we received AND processed a CCI

"4.5.4 Acknowledge Command Completion and/or Change Indication (R)
This command is used to acknowledge to the PPM that the OPM received and
processed a Command Completion and/or a Connector Change Indication."

And here in this case, we have received, but not processed the event.
So I'm not really sure what to do here in this case. If we don't send an 
ACK, then would the PPM think that OPM is not responding and reset it?
OR would it resend the previous event again since we didn't ACK?

Regards,
Prashanth K
Heikki Krogerus Sept. 15, 2023, 1:57 p.m. UTC | #4
Hi Prashanth,

On Fri, Sep 15, 2023 at 07:10:25PM +0530, Prashanth K wrote:
> On 15-09-23 06:02 pm, Heikki Krogerus wrote:
> > On Tue, Sep 12, 2023 at 04:37:47PM +0530, Prashanth K wrote:
> > > 
> > > 
> > > On 11-09-23 06:19 pm, Heikki Krogerus wrote:
> > > > On Mon, Sep 11, 2023 at 02:34:15PM +0530, Prashanth K wrote:
> > > > > Currently if ucsi_send_command() fails, then we bail out without
> > > > > clearing EVENT_PENDING flag. So when the next connector change
> > > > > event comes, ucsi_connector_change() won't queue the con->work,
> > > > > because of which none of the new events will be processed.
> > > > > 
> > > > > Fix this by clearing EVENT_PENDING flag if ucsi_send_command()
> > > > > fails.
> > > > > 
> > > > > Cc: <stable@vger.kernel.org> # 5.16
> > > > > Fixes: 512df95b9432 ("usb: typec: ucsi: Better fix for missing unplug events issue")
> > > > > Signed-off-by: Prashanth K <quic_prashk@quicinc.com>
> > > > > ---
> > > > >    drivers/usb/typec/ucsi/ucsi.c | 1 +
> > > > >    1 file changed, 1 insertion(+)
> > > > > 
> > > > > diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
> > > > > index c6dfe3d..509c67c 100644
> > > > > --- a/drivers/usb/typec/ucsi/ucsi.c
> > > > > +++ b/drivers/usb/typec/ucsi/ucsi.c
> > > > > @@ -884,6 +884,7 @@ static void ucsi_handle_connector_change(struct work_struct *work)
> > > > >    	if (ret < 0) {
> > > > >    		dev_err(ucsi->dev, "%s: GET_CONNECTOR_STATUS failed (%d)\n",
> > > > >    			__func__, ret);
> > > > > +		clear_bit(EVENT_PENDING, &con->ucsi->flags);
> > > > >    		goto out_unlock;
> > > > >    	}
> > > > 
> > > > I think it would be better to just move that label (out_unlock) above
> > > > the point where clear_bit() is already called instead of separately
> > > > calling it like that. That way the Connector Change Event will
> > > > also get acknowledged.
> > > Do we really need to ACK in this case since we didn't process the current
> > > connector change event
> > 
> > You won't get the next event before the first one was ACK'd, right?
> > 
> 
> The spec says that we need to ACK if we received AND processed a CCI
> 
> "4.5.4 Acknowledge Command Completion and/or Change Indication (R)
> This command is used to acknowledge to the PPM that the OPM received and
> processed a Command Completion and/or a Connector Change Indication."
> 
> And here in this case, we have received, but not processed the event.
> So I'm not really sure what to do here in this case. If we don't send an
> ACK, then would the PPM think that OPM is not responding and reset it?
> OR would it resend the previous event again since we didn't ACK?

Every PPM behaves differently.

Did you actually see that happening - GET_CONNECTOR_STATUS failed? Can
you reproduce it?

thanks,
Heikki Krogerus Sept. 18, 2023, 2:25 p.m. UTC | #5
On Sat, Sep 16, 2023 at 01:58:30PM +0530, Prashanth K wrote:
> 
> 
> On 15-09-23 07:27 pm, Heikki Krogerus wrote:
> > Hi Prashanth,
> > 
> > On Fri, Sep 15, 2023 at 07:10:25PM +0530, Prashanth K wrote:
> > > On 15-09-23 06:02 pm, Heikki Krogerus wrote:
> > > > On Tue, Sep 12, 2023 at 04:37:47PM +0530, Prashanth K wrote:
> > > > > 
> > > > > 
> > > > > On 11-09-23 06:19 pm, Heikki Krogerus wrote:
> > > > > > On Mon, Sep 11, 2023 at 02:34:15PM +0530, Prashanth K wrote:
> > > > > > > Currently if ucsi_send_command() fails, then we bail out without
> > > > > > > clearing EVENT_PENDING flag. So when the next connector change
> > > > > > > event comes, ucsi_connector_change() won't queue the con->work,
> > > > > > > because of which none of the new events will be processed.
> > > > > > > 
> > > > > > > Fix this by clearing EVENT_PENDING flag if ucsi_send_command()
> > > > > > > fails.
> > > > > > > 
> > > > > > > Cc: <stable@vger.kernel.org> # 5.16
> > > > > > > Fixes: 512df95b9432 ("usb: typec: ucsi: Better fix for missing unplug events issue")
> > > > > > > Signed-off-by: Prashanth K <quic_prashk@quicinc.com>
> > > > > > > ---
> > > > > > >     drivers/usb/typec/ucsi/ucsi.c | 1 +
> > > > > > >     1 file changed, 1 insertion(+)
> > > > > > > 
> > > > > > > diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
> > > > > > > index c6dfe3d..509c67c 100644
> > > > > > > --- a/drivers/usb/typec/ucsi/ucsi.c
> > > > > > > +++ b/drivers/usb/typec/ucsi/ucsi.c
> > > > > > > @@ -884,6 +884,7 @@ static void ucsi_handle_connector_change(struct work_struct *work)
> > > > > > >     	if (ret < 0) {
> > > > > > >     		dev_err(ucsi->dev, "%s: GET_CONNECTOR_STATUS failed (%d)\n",
> > > > > > >     			__func__, ret);
> > > > > > > +		clear_bit(EVENT_PENDING, &con->ucsi->flags);
> > > > > > >     		goto out_unlock;
> > > > > > >     	}
> > > > > > 
> > > > > > I think it would be better to just move that label (out_unlock) above
> > > > > > the point where clear_bit() is already called instead of separately
> > > > > > calling it like that. That way the Connector Change Event will
> > > > > > also get acknowledged.
> > > > > Do we really need to ACK in this case since we didn't process the current
> > > > > connector change event
> > > > 
> > > > You won't get the next event before the first one was ACK'd, right?
> > > > 
> > > 
> > > The spec says that we need to ACK if we received AND processed a CCI
> > > 
> > > "4.5.4 Acknowledge Command Completion and/or Change Indication (R)
> > > This command is used to acknowledge to the PPM that the OPM received and
> > > processed a Command Completion and/or a Connector Change Indication."
> > > 
> > > And here in this case, we have received, but not processed the event.
> > > So I'm not really sure what to do here in this case. If we don't send an
> > > ACK, then would the PPM think that OPM is not responding and reset it?
> > > OR would it resend the previous event again since we didn't ACK?
> > 
> > Every PPM behaves differently.
> > 
> > Did you actually see that happening - GET_CONNECTOR_STATUS failed? Can
> > you reproduce it?
> > 
> 
> Yea we actually hit the issue once where GET_CONNECTOR_STATUS failed and
> subsequent events didn't get queued since EVENT_PENDING wasn't cleared. Its
> not easily reproducible (<1%) though.
> 
> [4948:kworker/0:3]UCSI: ucsi_qti_glink_write: timed out
> [4948:kworker/0:3]ucsi_glink soc:qcom,pmic_glink:qcom,ucsi:
> ucsi_handle_connector_change: GET_CONNECTOR_STATUS failed (-110)

Okay. It would be really interesting to know why is it failing.
But let's just go with this for now.

thanks,
Heikki Krogerus Sept. 18, 2023, 2:26 p.m. UTC | #6
On Mon, Sep 11, 2023 at 02:34:15PM +0530, Prashanth K wrote:
> Currently if ucsi_send_command() fails, then we bail out without
> clearing EVENT_PENDING flag. So when the next connector change
> event comes, ucsi_connector_change() won't queue the con->work,
> because of which none of the new events will be processed.
> 
> Fix this by clearing EVENT_PENDING flag if ucsi_send_command()
> fails.
> 
> Cc: <stable@vger.kernel.org> # 5.16
> Fixes: 512df95b9432 ("usb: typec: ucsi: Better fix for missing unplug events issue")
> Signed-off-by: Prashanth K <quic_prashk@quicinc.com>

Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>

> ---
>  drivers/usb/typec/ucsi/ucsi.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
> index c6dfe3d..509c67c 100644
> --- a/drivers/usb/typec/ucsi/ucsi.c
> +++ b/drivers/usb/typec/ucsi/ucsi.c
> @@ -884,6 +884,7 @@ static void ucsi_handle_connector_change(struct work_struct *work)
>  	if (ret < 0) {
>  		dev_err(ucsi->dev, "%s: GET_CONNECTOR_STATUS failed (%d)\n",
>  			__func__, ret);
> +		clear_bit(EVENT_PENDING, &con->ucsi->flags);
>  		goto out_unlock;
>  	}
>  

thanks,
Prashanth K Sept. 18, 2023, 3:18 p.m. UTC | #7
On 18-09-23 07:55 pm, Heikki Krogerus wrote:
> On Sat, Sep 16, 2023 at 01:58:30PM +0530, Prashanth K wrote:
>>
>>
>> On 15-09-23 07:27 pm, Heikki Krogerus wrote:
>>> Hi Prashanth,
>>>
>>> On Fri, Sep 15, 2023 at 07:10:25PM +0530, Prashanth K wrote:
>>>> On 15-09-23 06:02 pm, Heikki Krogerus wrote:
>>>>> On Tue, Sep 12, 2023 at 04:37:47PM +0530, Prashanth K wrote:
>>>>>>
>>>>>>
>>>>>> On 11-09-23 06:19 pm, Heikki Krogerus wrote:
>>>>>>> On Mon, Sep 11, 2023 at 02:34:15PM +0530, Prashanth K wrote:
>>>>>>>> Currently if ucsi_send_command() fails, then we bail out without
>>>>>>>> clearing EVENT_PENDING flag. So when the next connector change
>>>>>>>> event comes, ucsi_connector_change() won't queue the con->work,
>>>>>>>> because of which none of the new events will be processed.
>>>>>>>>
>>>>>>>> Fix this by clearing EVENT_PENDING flag if ucsi_send_command()
>>>>>>>> fails.
>>>>>>>>
>>>>>>>> Cc: <stable@vger.kernel.org> # 5.16
>>>>>>>> Fixes: 512df95b9432 ("usb: typec: ucsi: Better fix for missing unplug events issue")
>>>>>>>> Signed-off-by: Prashanth K <quic_prashk@quicinc.com>
>>>>>>>> ---
>>>>>>>>      drivers/usb/typec/ucsi/ucsi.c | 1 +
>>>>>>>>      1 file changed, 1 insertion(+)
>>>>>>>>
>>>>>>>> diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
>>>>>>>> index c6dfe3d..509c67c 100644
>>>>>>>> --- a/drivers/usb/typec/ucsi/ucsi.c
>>>>>>>> +++ b/drivers/usb/typec/ucsi/ucsi.c
>>>>>>>> @@ -884,6 +884,7 @@ static void ucsi_handle_connector_change(struct work_struct *work)
>>>>>>>>      	if (ret < 0) {
>>>>>>>>      		dev_err(ucsi->dev, "%s: GET_CONNECTOR_STATUS failed (%d)\n",
>>>>>>>>      			__func__, ret);
>>>>>>>> +		clear_bit(EVENT_PENDING, &con->ucsi->flags);
>>>>>>>>      		goto out_unlock;
>>>>>>>>      	}
>>>>>>>
>>>>>>> I think it would be better to just move that label (out_unlock) above
>>>>>>> the point where clear_bit() is already called instead of separately
>>>>>>> calling it like that. That way the Connector Change Event will
>>>>>>> also get acknowledged.
>>>>>> Do we really need to ACK in this case since we didn't process the current
>>>>>> connector change event
>>>>>
>>>>> You won't get the next event before the first one was ACK'd, right?
>>>>>
>>>>
>>>> The spec says that we need to ACK if we received AND processed a CCI
>>>>
>>>> "4.5.4 Acknowledge Command Completion and/or Change Indication (R)
>>>> This command is used to acknowledge to the PPM that the OPM received and
>>>> processed a Command Completion and/or a Connector Change Indication."
>>>>
>>>> And here in this case, we have received, but not processed the event.
>>>> So I'm not really sure what to do here in this case. If we don't send an
>>>> ACK, then would the PPM think that OPM is not responding and reset it?
>>>> OR would it resend the previous event again since we didn't ACK?
>>>
>>> Every PPM behaves differently.
>>>
>>> Did you actually see that happening - GET_CONNECTOR_STATUS failed? Can
>>> you reproduce it?
>>>
>>
>> Yea we actually hit the issue once where GET_CONNECTOR_STATUS failed and
>> subsequent events didn't get queued since EVENT_PENDING wasn't cleared. Its
>> not easily reproducible (<1%) though.
>>
>> [4948:kworker/0:3]UCSI: ucsi_qti_glink_write: timed out
>> [4948:kworker/0:3]ucsi_glink soc:qcom,pmic_glink:qcom,ucsi:
>> ucsi_handle_connector_change: GET_CONNECTOR_STATUS failed (-110)
> 
> Okay. It would be really interesting to know why is it failing.
> But let's just go with this for now.
> 
> thanks,
> 

Agreed, I'm not really sure why its failing, because its in happening 
the lower layers. Anyways thanks for the comments and review!

Regards,
diff mbox series

Patch

diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
index c6dfe3d..509c67c 100644
--- a/drivers/usb/typec/ucsi/ucsi.c
+++ b/drivers/usb/typec/ucsi/ucsi.c
@@ -884,6 +884,7 @@  static void ucsi_handle_connector_change(struct work_struct *work)
 	if (ret < 0) {
 		dev_err(ucsi->dev, "%s: GET_CONNECTOR_STATUS failed (%d)\n",
 			__func__, ret);
+		clear_bit(EVENT_PENDING, &con->ucsi->flags);
 		goto out_unlock;
 	}