[PATCH 1/2] Silicon/Broadcom/BcmGenetDxe: Delay for linkup in transmit


Jeremy Linton
 

Under normal circumstances GenetSimpleNetworkTransmit won't be
called unless the rest of the network stack detects the link is
up. So, during normal operation when the adapter is initialized
the link naturally transitions to link up, and then is ready for
activity later in the boot sequence. If that hasn't happened by
the time PXE runs then it will itself wait for the link.

OTOH, the normal distro PXE sequence involves PXE loading shim
which in turn loads grub, which tries to read machine specific
configs, modules, and grub.cfg in order to prepare the boot menu.
Then, once a grub selection is picked, it might try to load the
kernel+initrd.

In this sequence the network stack is shutdown and restarted
multiple times. Grub though, starts up, notices its been network
booted, reads saved network parameters and immediately tries to
transmit data assuming the link is still up.

When that happens grub will print "couldn't send network packet"
and if that lasts long enough it fails to load grub.cfg and the
user gets dropped to the grub prompt because no one in the path
bothers to assure the link state has transitioned back up.

For reference: https://github.com/pftf/RPi4/issues/113

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
.../Drivers/Net/BcmGenetDxe/SimpleNetwork.c | 24 ++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c b/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c
index 1bda18f157..29c76d8495 100644
--- a/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c
+++ b/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c
@@ -13,6 +13,7 @@
#include <Library/DebugLib.h>
#include <Library/DmaLib.h>
#include <Library/NetLib.h>
+#include <Library/UefiBootServicesTableLib.h>
#include <Protocol/SimpleNetwork.h>

#include "BcmGenetDxe.h"
@@ -590,9 +591,28 @@ GenetSimpleNetworkTransmit (

if (!Genet->SnpMode.MediaPresent) {
//
- // Don't bother transmitting if there's no link.
+ // We should only really get here if the link was up
+ // and is now down due to a stop/shutdown sequence, and
+ // the app (grub) doesn't bother to check link state
+ // because it was up a moment before.
+ // Lets wait a bit for the link to resume, rather than
+ // failing to send. In the case of grub it works either way
+ // but we can't be sure that is universally true, and
+ // hanging for a couple seconds is nicer than a screen of
+ // grub send failure messages.
//
- return EFI_NOT_READY;
+ int retries = 1000;
+ DEBUG ((DEBUG_INFO, "%a: Waiting 10s for link\n", __FUNCTION__));
+ do {
+ gBS->Stall (10000);
+ Status = GenericPhyUpdateConfig (&Genet->Phy);
+ } while (EFI_ERROR (Status) && retries--);
+ if (EFI_ERROR (Status)) {
+ DEBUG ((DEBUG_ERROR, "%a: no link\n", __FUNCTION__));
+ return EFI_NOT_READY;
+ } else {
+ Genet->SnpMode.MediaPresent = TRUE;
+ }
}

if (HeaderSize != 0) {
--
2.13.7


Jeremy Linton
 

+Jared McNeill for review.

Thanks,

On 4/15/21 2:22 PM, Jeremy Linton wrote:
Under normal circumstances GenetSimpleNetworkTransmit won't be
called unless the rest of the network stack detects the link is
up. So, during normal operation when the adapter is initialized
the link naturally transitions to link up, and then is ready for
activity later in the boot sequence. If that hasn't happened by
the time PXE runs then it will itself wait for the link.
OTOH, the normal distro PXE sequence involves PXE loading shim
which in turn loads grub, which tries to read machine specific
configs, modules, and grub.cfg in order to prepare the boot menu.
Then, once a grub selection is picked, it might try to load the
kernel+initrd.
In this sequence the network stack is shutdown and restarted
multiple times. Grub though, starts up, notices its been network
booted, reads saved network parameters and immediately tries to
transmit data assuming the link is still up.
When that happens grub will print "couldn't send network packet"
and if that lasts long enough it fails to load grub.cfg and the
user gets dropped to the grub prompt because no one in the path
bothers to assure the link state has transitioned back up.
For reference: https://github.com/pftf/RPi4/issues/113
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
.../Drivers/Net/BcmGenetDxe/SimpleNetwork.c | 24 ++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-)
diff --git a/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c b/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c
index 1bda18f157..29c76d8495 100644
--- a/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c
+++ b/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c
@@ -13,6 +13,7 @@
#include <Library/DebugLib.h>
#include <Library/DmaLib.h>
#include <Library/NetLib.h>
+#include <Library/UefiBootServicesTableLib.h>
#include <Protocol/SimpleNetwork.h>
#include "BcmGenetDxe.h"
@@ -590,9 +591,28 @@ GenetSimpleNetworkTransmit (
if (!Genet->SnpMode.MediaPresent) {
//
- // Don't bother transmitting if there's no link.
+ // We should only really get here if the link was up
+ // and is now down due to a stop/shutdown sequence, and
+ // the app (grub) doesn't bother to check link state
+ // because it was up a moment before.
+ // Lets wait a bit for the link to resume, rather than
+ // failing to send. In the case of grub it works either way
+ // but we can't be sure that is universally true, and
+ // hanging for a couple seconds is nicer than a screen of
+ // grub send failure messages.
//
- return EFI_NOT_READY;
+ int retries = 1000;
+ DEBUG ((DEBUG_INFO, "%a: Waiting 10s for link\n", __FUNCTION__));
+ do {
+ gBS->Stall (10000);
+ Status = GenericPhyUpdateConfig (&Genet->Phy);
+ } while (EFI_ERROR (Status) && retries--);
+ if (EFI_ERROR (Status)) {
+ DEBUG ((DEBUG_ERROR, "%a: no link\n", __FUNCTION__));
+ return EFI_NOT_READY;
+ } else {
+ Genet->SnpMode.MediaPresent = TRUE;
+ }
}
if (HeaderSize != 0) {


Andrei Warkentin
 

Looks fine to me

Reviewed-by: Andrei Warkentin <awarkentin@...>


From: Jeremy Linton <jeremy.linton@...>
Sent: Thursday, April 15, 2021 2:22 PM
To: devel@edk2.groups.io <devel@edk2.groups.io>
Cc: ard.biesheuvel@... <ard.biesheuvel@...>; leif@... <leif@...>; pete@... <pete@...>; samer.el-haj-mahmoud@... <samer.el-haj-mahmoud@...>; Andrei Warkentin <awarkentin@...>; Jeremy Linton <jeremy.linton@...>
Subject: [PATCH 1/2] Silicon/Broadcom/BcmGenetDxe: Delay for linkup in transmit
 
Under normal circumstances GenetSimpleNetworkTransmit won't be
called unless the rest of the network stack detects the link is
up. So, during normal operation when the adapter is initialized
the link naturally transitions to link up, and then is ready for
activity later in the boot sequence. If that hasn't happened by
the time PXE runs then it will itself wait for the link.

OTOH, the normal distro PXE sequence involves PXE loading shim
which in turn loads grub, which tries to read machine specific
configs, modules, and grub.cfg in order to prepare the boot menu.
Then, once a grub selection is picked, it might try to load the
kernel+initrd.

In this sequence the network stack is shutdown and restarted
multiple times. Grub though, starts up, notices its been network
booted, reads saved network parameters and immediately tries to
transmit data assuming the link is still up.

When that happens grub will print "couldn't send network packet"
and if that lasts long enough it fails to load grub.cfg and the
user gets dropped to the grub prompt because no one in the path
bothers to assure the link state has transitioned back up.

For reference: https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fpftf%2FRPi4%2Fissues%2F113&amp;data=04%7C01%7Cawarkentin%40vmware.com%7Cadf7e9a53f9b4c6cff9608d90043c562%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C0%7C0%7C637541113365821315%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&amp;sdata=Gdg67x%2Fv75%2B5r9bWhz8SvB%2B4VBTgZ6sEZ9SeYor%2B02E%3D&amp;reserved=0

Signed-off-by: Jeremy Linton <jeremy.linton@...>
---
 .../Drivers/Net/BcmGenetDxe/SimpleNetwork.c        | 24 ++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c b/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c
index 1bda18f157..29c76d8495 100644
--- a/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c
+++ b/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c
@@ -13,6 +13,7 @@
 #include <Library/DebugLib.h>
 #include <Library/DmaLib.h>
 #include <Library/NetLib.h>
+#include <Library/UefiBootServicesTableLib.h>
 #include <Protocol/SimpleNetwork.h>
 
 #include "BcmGenetDxe.h"
@@ -590,9 +591,28 @@ GenetSimpleNetworkTransmit (
 
   if (!Genet->SnpMode.MediaPresent) {
     //
-    // Don't bother transmitting if there's no link.
+    // We should only really get here if the link was up
+    // and is now down due to a stop/shutdown sequence, and
+    // the app (grub) doesn't bother to check link state
+    // because it was up a moment before.
+    // Lets wait a bit for the link to resume, rather than
+    // failing to send. In the case of grub it works either way
+    // but we can't be sure that is universally true, and
+    // hanging for a couple seconds is nicer than a screen of
+    // grub send failure messages.
     //
-    return EFI_NOT_READY;
+    int retries = 1000;
+    DEBUG ((DEBUG_INFO, "%a: Waiting 10s for link\n", __FUNCTION__));
+    do {
+      gBS->Stall (10000);
+      Status = GenericPhyUpdateConfig (&Genet->Phy);
+    } while (EFI_ERROR (Status) && retries--);
+    if (EFI_ERROR (Status)) {
+      DEBUG ((DEBUG_ERROR, "%a: no link\n", __FUNCTION__));
+      return EFI_NOT_READY;
+    } else {
+      Genet->SnpMode.MediaPresent = TRUE;
+    }
   }
 
   if (HeaderSize != 0) {
--
2.13.7


Samer El-Haj-Mahmoud
 

+Ard's new e-mail address

-----Original Message-----
From: Jeremy Linton <jeremy.linton@arm.com>
Sent: Thursday, April 15, 2021 3:22 PM
To: devel@edk2.groups.io
Cc: Ard Biesheuvel <Ard.Biesheuvel@arm.com>; leif@nuviainc.com;
pete@akeo.ie; Samer El-Haj-Mahmoud <Samer.El-Haj-
Mahmoud@arm.com>; Andrei Warkentin (awarkentin@vmware.com)
<awarkentin@vmware.com>; Jeremy Linton <Jeremy.Linton@arm.com>
Subject: [PATCH 1/2] Silicon/Broadcom/BcmGenetDxe: Delay for linkup in
transmit

Under normal circumstances GenetSimpleNetworkTransmit won't be called
unless the rest of the network stack detects the link is up. So, during normal
operation when the adapter is initialized the link naturally transitions to link
up, and then is ready for activity later in the boot sequence. If that hasn't
happened by the time PXE runs then it will itself wait for the link.

OTOH, the normal distro PXE sequence involves PXE loading shim which in
turn loads grub, which tries to read machine specific configs, modules, and
grub.cfg in order to prepare the boot menu.
Then, once a grub selection is picked, it might try to load the
kernel+initrd.

In this sequence the network stack is shutdown and restarted multiple times.
Grub though, starts up, notices its been network booted, reads saved
network parameters and immediately tries to transmit data assuming the link
is still up.

When that happens grub will print "couldn't send network packet"
and if that lasts long enough it fails to load grub.cfg and the user gets
dropped to the grub prompt because no one in the path bothers to assure
the link state has transitioned back up.

For reference: https://github.com/pftf/RPi4/issues/113

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
.../Drivers/Net/BcmGenetDxe/SimpleNetwork.c | 24
++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c
b/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c
index 1bda18f157..29c76d8495 100644
--- a/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c
+++ b/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c
@@ -13,6 +13,7 @@
#include <Library/DebugLib.h>
#include <Library/DmaLib.h>
#include <Library/NetLib.h>
+#include <Library/UefiBootServicesTableLib.h>
#include <Protocol/SimpleNetwork.h>

#include "BcmGenetDxe.h"
@@ -590,9 +591,28 @@ GenetSimpleNetworkTransmit (

if (!Genet->SnpMode.MediaPresent) {
//
- // Don't bother transmitting if there's no link.
+ // We should only really get here if the link was up
+ // and is now down due to a stop/shutdown sequence, and
+ // the app (grub) doesn't bother to check link state
+ // because it was up a moment before.
+ // Lets wait a bit for the link to resume, rather than
+ // failing to send. In the case of grub it works either way
+ // but we can't be sure that is universally true, and
+ // hanging for a couple seconds is nicer than a screen of
+ // grub send failure messages.
//
- return EFI_NOT_READY;
+ int retries = 1000;
+ DEBUG ((DEBUG_INFO, "%a: Waiting 10s for link\n", __FUNCTION__));
+ do {
+ gBS->Stall (10000);
+ Status = GenericPhyUpdateConfig (&Genet->Phy);
+ } while (EFI_ERROR (Status) && retries--);
+ if (EFI_ERROR (Status)) {
+ DEBUG ((DEBUG_ERROR, "%a: no link\n", __FUNCTION__));
+ return EFI_NOT_READY;
+ } else {
+ Genet->SnpMode.MediaPresent = TRUE;
+ }
}

if (HeaderSize != 0) {
--
2.13.7
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.


Samer El-Haj-Mahmoud
 

+Ard's new e-mail address

Jared, please confirm this is a "Reviewed-By"

-----Original Message-----
From: Jared McNeill <jmcneill@invisible.ca>
Sent: Friday, April 30, 2021 6:30 AM
To: Jeremy Linton <Jeremy.Linton@arm.com>
Cc: devel@edk2.groups.io; Ard Biesheuvel <Ard.Biesheuvel@arm.com>;
leif@nuviainc.com; Pete Batard <pete@akeo.ie>; Samer El-Haj-Mahmoud
<Samer.El-Haj-Mahmoud@arm.com>; Andrei Warkentin
(awarkentin@vmware.com) <awarkentin@vmware.com>
Subject: Re: [PATCH 1/2] Silicon/Broadcom/BcmGenetDxe: Delay for linkup in
transmit

LGTM.

Take care,
Jared


On Apr 29, 2021, at 5:19 PM, Jeremy Linton <jeremy.linton@arm.com>
wrote:

+Jared McNeill for review.

Thanks,

On 4/15/21 2:22 PM, Jeremy Linton wrote:
Under normal circumstances GenetSimpleNetworkTransmit won't be
called
unless the rest of the network stack detects the link is up. So,
during normal operation when the adapter is initialized the link
naturally transitions to link up, and then is ready for activity
later in the boot sequence. If that hasn't happened by the time PXE
runs then it will itself wait for the link.
OTOH, the normal distro PXE sequence involves PXE loading shim which
in turn loads grub, which tries to read machine specific configs,
modules, and grub.cfg in order to prepare the boot menu.
Then, once a grub selection is picked, it might try to load the
kernel+initrd.
In this sequence the network stack is shutdown and restarted multiple
times. Grub though, starts up, notices its been network booted, reads
saved network parameters and immediately tries to transmit data
assuming the link is still up.
When that happens grub will print "couldn't send network packet"
and if that lasts long enough it fails to load grub.cfg and the user
gets dropped to the grub prompt because no one in the path bothers to
assure the link state has transitioned back up.
For reference: https://github.com/pftf/RPi4/issues/113
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
.../Drivers/Net/BcmGenetDxe/SimpleNetwork.c | 24
++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-) diff --git
a/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c
b/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c
index 1bda18f157..29c76d8495 100644
--- a/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c
+++ b/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c
@@ -13,6 +13,7 @@
#include <Library/DebugLib.h>
#include <Library/DmaLib.h>
#include <Library/NetLib.h>
+#include <Library/UefiBootServicesTableLib.h>
#include <Protocol/SimpleNetwork.h>
#include "BcmGenetDxe.h"
@@ -590,9 +591,28 @@ GenetSimpleNetworkTransmit (
if (!Genet->SnpMode.MediaPresent) {
//
- // Don't bother transmitting if there's no link.
+ // We should only really get here if the link was up
+ // and is now down due to a stop/shutdown sequence, and
+ // the app (grub) doesn't bother to check link state
+ // because it was up a moment before.
+ // Lets wait a bit for the link to resume, rather than
+ // failing to send. In the case of grub it works either way
+ // but we can't be sure that is universally true, and
+ // hanging for a couple seconds is nicer than a screen of
+ // grub send failure messages.
//
- return EFI_NOT_READY;
+ int retries = 1000;
+ DEBUG ((DEBUG_INFO, "%a: Waiting 10s for link\n", __FUNCTION__));
+ do {
+ gBS->Stall (10000);
+ Status = GenericPhyUpdateConfig (&Genet->Phy);
+ } while (EFI_ERROR (Status) && retries--);
+ if (EFI_ERROR (Status)) {
+ DEBUG ((DEBUG_ERROR, "%a: no link\n", __FUNCTION__));
+ return EFI_NOT_READY;
+ } else {
+ Genet->SnpMode.MediaPresent = TRUE;
+ }
}
if (HeaderSize != 0) {
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.


Ard Biesheuvel
 

On Thu, 15 Apr 2021 at 21:22, Jeremy Linton <jeremy.linton@arm.com> wrote:

Under normal circumstances GenetSimpleNetworkTransmit won't be
called unless the rest of the network stack detects the link is
up. So, during normal operation when the adapter is initialized
the link naturally transitions to link up, and then is ready for
activity later in the boot sequence. If that hasn't happened by
the time PXE runs then it will itself wait for the link.

OTOH, the normal distro PXE sequence involves PXE loading shim
which in turn loads grub, which tries to read machine specific
configs, modules, and grub.cfg in order to prepare the boot menu.
Then, once a grub selection is picked, it might try to load the
kernel+initrd.

In this sequence the network stack is shutdown and restarted
multiple times. Grub though, starts up, notices its been network
booted, reads saved network parameters and immediately tries to
transmit data assuming the link is still up.

When that happens grub will print "couldn't send network packet"
and if that lasts long enough it fails to load grub.cfg and the
user gets dropped to the grub prompt because no one in the path
bothers to assure the link state has transitioned back up.
This is an excellent problem description. Could we dedicate a couple
of paragraphs to the solution as well? Especially given that 'wait for
random # of seconds' seems to be the taken approach here, which seems
a bit sloppy to me, given that we should be able to detect whether the
link is up, no?



For reference: https://github.com/pftf/RPi4/issues/113

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
.../Drivers/Net/BcmGenetDxe/SimpleNetwork.c | 24 ++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c b/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c
index 1bda18f157..29c76d8495 100644
--- a/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c
+++ b/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c
@@ -13,6 +13,7 @@
#include <Library/DebugLib.h>
#include <Library/DmaLib.h>
#include <Library/NetLib.h>
+#include <Library/UefiBootServicesTableLib.h>
#include <Protocol/SimpleNetwork.h>

#include "BcmGenetDxe.h"
@@ -590,9 +591,28 @@ GenetSimpleNetworkTransmit (

if (!Genet->SnpMode.MediaPresent) {
//
- // Don't bother transmitting if there's no link.
+ // We should only really get here if the link was up
+ // and is now down due to a stop/shutdown sequence, and
+ // the app (grub) doesn't bother to check link state
+ // because it was up a moment before.
+ // Lets wait a bit for the link to resume, rather than
+ // failing to send. In the case of grub it works either way
+ // but we can't be sure that is universally true, and
+ // hanging for a couple seconds is nicer than a screen of
+ // grub send failure messages.
//
- return EFI_NOT_READY;
+ int retries = 1000;
+ DEBUG ((DEBUG_INFO, "%a: Waiting 10s for link\n", __FUNCTION__));
+ do {
+ gBS->Stall (10000);
+ Status = GenericPhyUpdateConfig (&Genet->Phy);
+ } while (EFI_ERROR (Status) && retries--);
+ if (EFI_ERROR (Status)) {
+ DEBUG ((DEBUG_ERROR, "%a: no link\n", __FUNCTION__));
+ return EFI_NOT_READY;
+ } else {
+ Genet->SnpMode.MediaPresent = TRUE;
+ }
}

if (HeaderSize != 0) {
--
2.13.7






Jeremy Linton
 

Hi,

On 5/10/21 11:56 AM, Ard Biesheuvel wrote:
On Thu, 15 Apr 2021 at 21:22, Jeremy Linton <jeremy.linton@arm.com> wrote:

Under normal circumstances GenetSimpleNetworkTransmit won't be
called unless the rest of the network stack detects the link is
up. So, during normal operation when the adapter is initialized
the link naturally transitions to link up, and then is ready for
activity later in the boot sequence. If that hasn't happened by
the time PXE runs then it will itself wait for the link.

OTOH, the normal distro PXE sequence involves PXE loading shim
which in turn loads grub, which tries to read machine specific
configs, modules, and grub.cfg in order to prepare the boot menu.
Then, once a grub selection is picked, it might try to load the
kernel+initrd.

In this sequence the network stack is shutdown and restarted
multiple times. Grub though, starts up, notices its been network
booted, reads saved network parameters and immediately tries to
transmit data assuming the link is still up.

When that happens grub will print "couldn't send network packet"
and if that lasts long enough it fails to load grub.cfg and the
user gets dropped to the grub prompt because no one in the path
bothers to assure the link state has transitioned back up.
This is an excellent problem description. Could we dedicate a couple
of paragraphs to the solution as well? Especially given that 'wait for
random # of seconds' seems to be the taken approach here, which seems
a bit sloppy to me, given that we should be able to detect whether the
link is up, no?
Isn't that what GenericPhyUpdateConfig is doing? Checking for linkup, and updating the phy state? So once the link goes up, it return EFI_SUCCESS and the loop terminates.

I was actually a bit concerned that the 10s wasn't enough since AFAIK a large part of the delay is the remote side.




For reference: https://github.com/pftf/RPi4/issues/113

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
.../Drivers/Net/BcmGenetDxe/SimpleNetwork.c | 24 ++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c b/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c
index 1bda18f157..29c76d8495 100644
--- a/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c
+++ b/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c
@@ -13,6 +13,7 @@
#include <Library/DebugLib.h>
#include <Library/DmaLib.h>
#include <Library/NetLib.h>
+#include <Library/UefiBootServicesTableLib.h>
#include <Protocol/SimpleNetwork.h>

#include "BcmGenetDxe.h"
@@ -590,9 +591,28 @@ GenetSimpleNetworkTransmit (

if (!Genet->SnpMode.MediaPresent) {
//
- // Don't bother transmitting if there's no link.
+ // We should only really get here if the link was up
+ // and is now down due to a stop/shutdown sequence, and
+ // the app (grub) doesn't bother to check link state
+ // because it was up a moment before.
+ // Lets wait a bit for the link to resume, rather than
+ // failing to send. In the case of grub it works either way
+ // but we can't be sure that is universally true, and
+ // hanging for a couple seconds is nicer than a screen of
+ // grub send failure messages.
//
- return EFI_NOT_READY;
+ int retries = 1000;
+ DEBUG ((DEBUG_INFO, "%a: Waiting 10s for link\n", __FUNCTION__));
+ do {
+ gBS->Stall (10000);
+ Status = GenericPhyUpdateConfig (&Genet->Phy);
+ } while (EFI_ERROR (Status) && retries--);
+ if (EFI_ERROR (Status)) {
+ DEBUG ((DEBUG_ERROR, "%a: no link\n", __FUNCTION__));
+ return EFI_NOT_READY;
+ } else {
+ Genet->SnpMode.MediaPresent = TRUE;
+ }
}

if (HeaderSize != 0) {
--
2.13.7





Ard Biesheuvel
 

On Mon, 10 May 2021 at 19:26, Jeremy Linton <jeremy.linton@arm.com> wrote:

Hi,

On 5/10/21 11:56 AM, Ard Biesheuvel wrote:
On Thu, 15 Apr 2021 at 21:22, Jeremy Linton <jeremy.linton@arm.com> wrote:

Under normal circumstances GenetSimpleNetworkTransmit won't be
called unless the rest of the network stack detects the link is
up. So, during normal operation when the adapter is initialized
the link naturally transitions to link up, and then is ready for
activity later in the boot sequence. If that hasn't happened by
the time PXE runs then it will itself wait for the link.

OTOH, the normal distro PXE sequence involves PXE loading shim
which in turn loads grub, which tries to read machine specific
configs, modules, and grub.cfg in order to prepare the boot menu.
Then, once a grub selection is picked, it might try to load the
kernel+initrd.

In this sequence the network stack is shutdown and restarted
multiple times. Grub though, starts up, notices its been network
booted, reads saved network parameters and immediately tries to
transmit data assuming the link is still up.

When that happens grub will print "couldn't send network packet"
and if that lasts long enough it fails to load grub.cfg and the
user gets dropped to the grub prompt because no one in the path
bothers to assure the link state has transitioned back up.
This is an excellent problem description. Could we dedicate a couple
of paragraphs to the solution as well? Especially given that 'wait for
random # of seconds' seems to be the taken approach here, which seems
a bit sloppy to me, given that we should be able to detect whether the
link is up, no?
Isn't that what GenericPhyUpdateConfig is doing? Checking for linkup,
and updating the phy state? So once the link goes up, it return
EFI_SUCCESS and the loop terminates.
OK. Could you dedicate some prose to this logic in the commit log please?

I was actually a bit concerned that the 10s wasn't enough since AFAIK a
large part of the delay is the remote side.
The remote side of what? The Ethernet cable?






For reference: https://github.com/pftf/RPi4/issues/113

Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
---
.../Drivers/Net/BcmGenetDxe/SimpleNetwork.c | 24 ++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c b/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c
index 1bda18f157..29c76d8495 100644
--- a/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c
+++ b/Silicon/Broadcom/Drivers/Net/BcmGenetDxe/SimpleNetwork.c
@@ -13,6 +13,7 @@
#include <Library/DebugLib.h>
#include <Library/DmaLib.h>
#include <Library/NetLib.h>
+#include <Library/UefiBootServicesTableLib.h>
#include <Protocol/SimpleNetwork.h>

#include "BcmGenetDxe.h"
@@ -590,9 +591,28 @@ GenetSimpleNetworkTransmit (

if (!Genet->SnpMode.MediaPresent) {
//
- // Don't bother transmitting if there's no link.
+ // We should only really get here if the link was up
+ // and is now down due to a stop/shutdown sequence, and
+ // the app (grub) doesn't bother to check link state
+ // because it was up a moment before.
+ // Lets wait a bit for the link to resume, rather than
+ // failing to send. In the case of grub it works either way
+ // but we can't be sure that is universally true, and
+ // hanging for a couple seconds is nicer than a screen of
+ // grub send failure messages.
//
- return EFI_NOT_READY;
+ int retries = 1000;
+ DEBUG ((DEBUG_INFO, "%a: Waiting 10s for link\n", __FUNCTION__));
+ do {
+ gBS->Stall (10000);
+ Status = GenericPhyUpdateConfig (&Genet->Phy);
+ } while (EFI_ERROR (Status) && retries--);
+ if (EFI_ERROR (Status)) {
+ DEBUG ((DEBUG_ERROR, "%a: no link\n", __FUNCTION__));
+ return EFI_NOT_READY;
+ } else {
+ Genet->SnpMode.MediaPresent = TRUE;
+ }
}

if (HeaderSize != 0) {
--
2.13.7