HTTP boot failed on timeout


Laszlo Ersek
 

On 02/24/20 11:01, doron.bleiberg@... wrote:
Hi Laszlo,

At first I thought some of the problems I was facing were related to the virtio-net-pci network device type, so I switched to e1000.
Eventually I found that the issues were not related to virtio-net-pci, and I reverted to that configuration, which works well.
So, bottom line: there is no need for the E3522X2.EFI driver or for e1000 (at least in my case).
OK, thank you for explaining!

Note for completeness: -global virtio-net-pci.romfile="" is required for the VM to operate.

It was a pleasure discussing this with you on this thread. I by no means take your questions, answers and guidance for granted; they helped a lot. Really appreciated.
Thank you -- I didn't try to imply I was "displeased" or whatever; I was
just genuinely curious why you liked e1000 more than virtio-net-pci. :)

Thanks!
Laszlo


doron.bleiberg@...
 

Hi Laszlo,

At first I thought some of the problems I was facing were related to the virtio-net-pci network device type, so I switched to e1000.
Eventually I found that the issues were not related to virtio-net-pci, and I reverted to that configuration, which works well.
So, bottom line: there is no need for the E3522X2.EFI driver or for e1000 (at least in my case).

Note for completeness: -global virtio-net-pci.romfile="" is required for the VM to operate.

It was a pleasure discussing this with you on this thread. I by no means take your questions, answers and guidance for granted; they helped a lot. Really appreciated.

Doron


Laszlo Ersek
 

On 02/22/20 20:00, doron.bleiberg@... wrote:
Hi Community,

An update regarding my case - case solved.
I was able to boot my system using HTTP boot. What worked for me eventually is:
- Copy E3522X2.EFI from Intel.
- Compile OVMF with e1000 support: build -p OvmfPkg/OvmfPkgX64.dsc -D E1000_ENABLE
- Run QEMU with the flag -global e1000.romfile=""
Thanks!

May I ask why you switched from virtio-net-pci to e1000? Are you
HTTP-booting a Windows guest perhaps?

Thanks
Laszlo


doron.bleiberg@...
 

Hi Community,

An update regarding my case - case solved.
I was able to boot my system using HTTP boot. What worked for me eventually is:
- Copy E3522X2.EFI from Intel.
- Compile OVMF with e1000 support: build -p OvmfPkg/OvmfPkgX64.dsc -D E1000_ENABLE
- Run QEMU with the flag -global e1000.romfile=""

Thank you all for the support.

Doron


Laszlo Ersek
 

On 02/17/20 11:52, doron.bleiberg@... wrote:
Hi Laszlo,

Thank you for the quick and detailed response. Some answers to your questions:
- Since I'm running the VMs as part of a GNS3 project, I don't have full control over the VM startup command. I will try to run the VM outside the GNS3 context to make sure I have a cleaner and more controlled environment.
- I'm using a qcow2 drive file as the HDD onto which the net installation should run. The drive is empty, waiting for the installation once the boot completes.


I followed your suggestion to add a 'romfile=""' property. I did it this way because I can't edit the device properties (GNS3...):
-global virtio-net-pci.romfile=""

Can you explain what setting this property means and why it is related to my problem?
It is not necessarily related to your problem; it *could* be related.

The romfile property tells QEMU what PCI expansion ROM to load into the
device's PCI option ROM BAR. By default the ROM in question is built
from the iPXE project. This means that the Simple Network Protocol
driver that talks to the virtio-net-pci device directly in OVMF comes
from the iPXE project.

However, OVMF has a built-in driver (VirtioNetDxe) for the same device;
it just has lower priority. So if you want VirtioNetDxe to take control
of the device, you need to prevent the loading of the expansion ROM
described above. That's what romfile="" does.

And this could be relevant to your problem because the SNP driver lies
at the bottom of the edk2 network stack (in OVMF anyway). If there is a
problem in your iPXE SNP driver, then that could affect the dependent
TCP connection, and hang your HTTP boot. Switching in the VirtioNetDxe
driver might show a difference here.

Note: I'm not blindly blaming the iPXE driver; I'm just saying it
*could* be related. Seeing how your QEMU binary is ancient (4+ years
old), there could be old iPXE issues affecting your iPXE expansion ROM
too, that have been fixed since, up-stream.
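
As a minimal sketch of the romfile setting discussed above, the following helper prints the extra QEMU argument that leaves the option ROM BAR of a given NIC model empty. The function name romfile_off is made up for illustration; only the -global <model>.romfile="" syntax comes from this thread. The helper prints text for pasting into a command line and does not start QEMU.

```shell
#!/bin/sh
# Sketch only: romfile_off is a hypothetical helper name, not part of QEMU.
# It prints the "-global" argument that sets an empty romfile for one NIC
# model, so that QEMU does not load the (iPXE-built) expansion ROM for it.
romfile_off() {
  # $1: QEMU device model, for example virtio-net-pci or e1000
  printf '%s %s.romfile=""\n' '-global' "$1"
}

romfile_off virtio-net-pci
```

Using -global rather than a per-device property applies the setting to every device of that model, which is convenient when the -device options themselves are generated by another tool.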

Also, if using -bios is not recommended, what are the alternatives?
You should use the split build of OVMF (OVMF_CODE.fd and OVMF_VARS.fd).
OVMF_CODE.fd is the firmware executable. OVMF_VARS.fd is a *template*
file for creating actual variable store files from, when defining a new
domain. Once you define a new domain, the varstore file that was
originally copied from OVMF_VARS.fd should be considered basically
another "data disk" for the domain. The varstore file is where
persistent (non-volatile) UEFI variables are stored, for the domain.

With "-bios", you get a varstore emulation that's not spec-conformant.
It suffers from various obscure problems. Don't use it.

The related (traditional) QEMU cmdline options are shown below. There is
a more recent, more modern, format for the same, but that format
requires a newer QEMU release. (For details, check out
<http://mid.mail-archive.com/146b553d-cbe7-ac87-9423-bd07602e3e01@redhat.com>.
But honestly, the best idea is to just use libvirt.)

So, the original options are:

-drive if=pflash,format=raw,unit=0,file=OVMF_CODE.fd,readonly=on \
-drive if=pflash,format=raw,unit=1,file=guest-vars.fd \

where "guest-vars.fd" is specific to the guest in question, and was
originally copied from OVMF_VARS.fd.
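
A rough sketch of that workflow, under stated assumptions: make_varstore and pflash_args are hypothetical helper names, and the demonstration uses stand-in files in a scratch directory rather than real firmware images. The first helper copies the template only when the guest has no varstore yet; the second prints the two pflash options shown above.

```shell
#!/bin/sh
# Sketch only: make_varstore and pflash_args are hypothetical helper names.

# Create the per-guest variable store from the template, once.
# An existing varstore is left alone, because it holds the guest's
# persistent UEFI variables.
make_varstore() {
  template="$1"   # for example the distro's OVMF_VARS.fd
  varstore="$2"   # for example guest-vars.fd, one file per guest
  [ -e "$varstore" ] || cp -- "$template" "$varstore"
}

# Print the two pflash drives: unit 0 is the read-only firmware executable,
# unit 1 is the writable per-guest varstore.
pflash_args() {
  printf '%s\n' \
    "-drive if=pflash,format=raw,unit=0,file=$1,readonly=on" \
    "-drive if=pflash,format=raw,unit=1,file=$2"
}

# Demonstration with stand-in files in a scratch directory (not real firmware).
demo_dir=$(mktemp -d)
printf 'template' > "$demo_dir/OVMF_VARS.fd"
make_varstore "$demo_dir/OVMF_VARS.fd" "$demo_dir/guest-vars.fd"
pflash_args OVMF_CODE.fd "$demo_dir/guest-vars.fd"
```

The "copy only if missing" check is the important part: overwriting the varstore on every start would discard the guest's saved boot options.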

If your distro doesn't package OVMF_CODE.fd and OVMF_VARS.fd separately,
then it's too old.

Is there also a way to skip the 'Start PXE over IPv4' step in the boot process?
No, there's not. While you can influence the UEFI boot order from the
QEMU command line, for example with:

-device virtio-net-pci,[other options],bootindex=0

the QEMU <-> firmware interface that exposes this to the firmware --
specifically, the "bootorder" fw_cfg file -- is not expressive enough to
tell apart PXE boot on a given NIC from HTTPv4 boot on the same NIC. You
can specify a particular NIC, but given that NIC, you'll have to stick
with the edk2-default boot order for a NIC.

If you can go into the firmware setup TUI once, and manually reorder the
PXE vs. HTTPv4 boot options, then OVMF will generally stick with that
order for you (until / unless you instruct OVMF to drop netboot from the
boot order altogether). But that requires you to interact with the guest
firmware.


This suggestion worked well, and I was able to download the file completely. Thank you!
Oh wow. :) So, it *is* related to the iPXE SNP driver.

For reference, can you search your installed package set for packages
that have "ipxe" in the name? Can you list their names and versions?

Basically now I expect that you are using your distro's very outdated
iPXE package, whose issues have long been fixed up-stream.

The only problem now is that I get a kernel panic in the next step. Who said life is simple... ;-)

I'll continue to debug the kernel panic.

I've attached qemu.log just in case.
Thanks,
Laszlo


doron.bleiberg@...
 

Hi Laszlo,

Thank you for the quick and detailed response. Some answers to your questions:
- Since I'm running the VMs as part of a GNS3 project, I don't have full control over the VM startup command. I will try to run the VM outside the GNS3 context to make sure I have a cleaner and more controlled environment.
- I'm using a qcow2 drive file as the HDD onto which the net installation should run. The drive is empty, waiting for the installation once the boot completes.


I followed your suggestion to add a 'romfile=""' property. I did it this way because I can't edit the device properties (GNS3...):
-global virtio-net-pci.romfile=""

Can you explain what setting this property means and why it is related to my problem?
Also, if using -bios is not recommended, what are the alternatives?
Is there also a way to skip the 'Start PXE over IPv4' step in the boot process?

This suggestion worked well, and I was able to download the file completely. Thank you!
The only problem now is that I get a kernel panic in the next step. Who said life is simple... ;-)

I'll continue to debug the kernel panic.

I've attached qemu.log just in case.

Doron


Laszlo Ersek
 

On 02/17/20 10:45, doron.bleiberg@... wrote:
Some inputs:
Qemu version: QEMU emulator version 2.5.0 (Debian
1:2.5+dfsg-5ubuntu10.34), Copyright (c) 2003-2008 Fabrice Bellard
Qemu CMD: /usr/bin/qemu-system-x86_64 \
-name HTTP-BOOT-VM-1 \
-m 8192M \
-smp cpus=1 \
-enable-kvm \
-machine smm=off \
-boot order=c \
-bios /opt/gns3/images/QEMU/OVMF-RELEASE.fd \
Ugh this makes my eyes bleed. :/ Please never use the "-bios" option
with OVMF.

Anyway that's not our current topic here.

-drive file=/opt/gns3/projects/1a83274a-c57f-4337-8a0d-1e68a9312e9a/project-files/qemu/d8f37f0b-2b63-455f-b536-b309b9020e36/hda_disk.qcow2,if=ide,index=0,media=disk \
-uuid d8f37f0b-2b63-455f-b536-b309b9020e36 \
-vnc 0.0.0.0:3 \
-monitor tcp:127.0.0.1:41898,server,nowait \
-net none \
Seems inconsistent with your intent to HTTP Boot... At least superfluous
with the below, I'd think

-device virtio-net-pci,mac=0c:2e:9a:0e:36:00,netdev=gns3-0 \
Seems OK; so you are using virtio-net-pci.

-netdev socket,id=gns3-0,udp=127.0.0.1:10017,localaddr=127.0.0.1:10016 \
Unfortunately, I'm entirely unused to "-netdev socket", especially with
"udp=...". I only use "-netdev tap" (via libvirt).

-nographic \
-debugcon file:debug.log \
-global isa-debugcon.iobase=0x402

The OVMF log is empty, all the logs appear in attached qemu.log file.
Hmmm... Ah you mention "OVMF-RELEASE.fd" above. So that must be a
RELEASE build of OVMF, which indeed does not produce debug messages. Can
you try with DEBUG please?
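
A small sketch of the capture-and-check step, assuming nothing beyond the flags already quoted in this thread: debugcon_args and log_status are hypothetical helper names. The first prints the debug console flags for a chosen log path; the second reports whether the captured file has any content, which is the quickest hint as to whether a RELEASE or a DEBUG firmware binary was booted.

```shell
#!/bin/sh
# Sketch only: debugcon_args and log_status are hypothetical helper names.

# Print the QEMU flags that send the firmware debug output to a file.
debugcon_args() {
  printf '%s\n' "-debugcon file:$1 -global isa-debugcon.iobase=0x402"
}

# Report whether the captured log has any content.
log_status() {
  if [ -s "$1" ]; then
    echo "non-empty"
  else
    echo "empty or missing (expected with a RELEASE build)"
  fi
}

debugcon_args debug.log
log_status debug.log
```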


I did some debugging myself and found that the offending line is
here:
File: NetworkPkg/HttpBootDxe/HttpBootSupport.c#L1012
The error is handled here:
NetworkPkg/HttpBootDxe/HttpBootSupport.c#L1018
Yes, I checked those lines after reading your earlier message.

I asked for your command line and the OVMF debug log because I wanted to
see if you were using the iPXE SNP driver for virtio-net (you most
likely are, from the info thus far). It could be interesting to try the
built-in virtio-net driver, for one data point. (This would be a
front-end check.)

Second, I have no idea what's happening on the udp socket netdev
back-end. It could explain the virtual network glitches you are seeing.
I'm not sure.

Laszlo


doron.bleiberg@...
 

I should also mention that I enabled HTTP boot myself and compiled the project this way:

git clone https://github.com/tianocore/edk2.git edk2-stable201911
cd edk2-stable201911
git submodule update --init
make -C BaseTools
. edksetup.sh BaseTools

vi Conf/target.txt
ACTIVE_PLATFORM = OvmfPkg/OvmfPkgX64.dsc
TARGET_ARCH = X64
TOOL_CHAIN_TAG = GCC48
TARGET = RELEASE

vi OvmfPkg/OvmfPkgX64.dsc
DEFINE NETWORK_HTTP_BOOT_ENABLE = TRUE
BUILD_TARGETS = RELEASE

export EDK_TOOLS_PATH=$PWD/BaseTools
build -p OvmfPkg/OvmfPkgX64.dsc

And I'm taking the file:
Build/OvmfX64/RELEASE_GCC48/FV/OVMF.fd
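
For comparison, the same settings can probably be passed on the build command line instead of editing Conf/target.txt and the DSC file. The sketch below only prints the command and does not run it; ovmf_build_cmd is a hypothetical helper name, and it assumes the usual -a, -t, -b and -D options of the edk2 build tool and that a -D macro given on the command line takes effect for NETWORK_HTTP_BOOT_ENABLE. With -b DEBUG, the output would be expected under a DEBUG_GCC48 directory rather than RELEASE_GCC48.

```shell
#!/bin/sh
# Sketch only: ovmf_build_cmd is a hypothetical helper. It prints, but does
# not run, an edk2 "build" invocation. The printed command is meant to be run
# in the edk2 tree after ". edksetup.sh BaseTools".
ovmf_build_cmd() {
  target="$1"      # DEBUG or RELEASE
  toolchain="$2"   # for example GCC48, as in Conf/target.txt above
  printf 'build -p OvmfPkg/OvmfPkgX64.dsc -a X64 -t %s -b %s -D NETWORK_HTTP_BOOT_ENABLE=TRUE\n' \
    "$toolchain" "$target"
}

ovmf_build_cmd DEBUG GCC48
```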

Doron


doron.bleiberg@...
 

Some inputs:
Qemu version: QEMU emulator version 2.5.0 (Debian 1:2.5+dfsg-5ubuntu10.34), Copyright (c) 2003-2008 Fabrice Bellard
Qemu CMD: /usr/bin/qemu-system-x86_64 \
-name HTTP-BOOT-VM-1 \
-m 8192M \
-smp cpus=1 \
-enable-kvm \
-machine smm=off \
-boot order=c \
-bios /opt/gns3/images/QEMU/OVMF-RELEASE.fd \
-drive file=/opt/gns3/projects/1a83274a-c57f-4337-8a0d-1e68a9312e9a/project-files/qemu/d8f37f0b-2b63-455f-b536-b309b9020e36/hda_disk.qcow2,if=ide,index=0,media=disk \
-uuid d8f37f0b-2b63-455f-b536-b309b9020e36 \
-vnc 0.0.0.0:3 \
-monitor tcp:127.0.0.1:41898,server,nowait \
-net none \
-device virtio-net-pci,mac=0c:2e:9a:0e:36:00,netdev=gns3-0 \
-netdev socket,id=gns3-0,udp=127.0.0.1:10017,localaddr=127.0.0.1:10016 \
-nographic \
-debugcon file:debug.log \
-global isa-debugcon.iobase=0x402

The OVMF log is empty, all the logs appear in attached qemu.log file.

I did some debugging myself and found that the offending line is here:
File: NetworkPkg/HttpBootDxe/HttpBootSupport.c#L1012
The error is handled here: NetworkPkg/HttpBootDxe/HttpBootSupport.c#L1018

Though I'm using a large file, I didn't observe a problem with buffer or RAM size.

I tried adding a "Connection: Keep-Alive" header to the request, with no change in the result.

The boot download always terminates at the same spot.

10x,
Doron


Laszlo Ersek
 

On 02/15/20 22:24, doron.bleiberg@... wrote:
Some additional inputs:
I'm running the boot from a QEMU VM. In the QEMU logs I see the following error during HTTP boot:
Error: Server response timeout.
BdsDxe: failed to load Boot0005 "UEFI HTTPv4 (MAC:0C2E9A0E3600)" from PciRoot(0x0)/Pci(0x3,0x0)/MAC(0C2E9A0E3600,0x1)/IPv4(0.0.0.0,0x0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0)/Uri(): Not Found

I've also raised httpd server logging to trace6 but could not find anything pointing to a problem on this side. I'm only able to observe an error after the UEFI terminated the session, which is expected.
* Please specify
- your QEMU version,
- your full QEMU cmdline.

* Please capture the OVMF log by appending the following QEMU flags:

-debugcon file:debug.log -global isa-debugcon.iobase=0x402

and attach the log.

Thanks
Laszlo


doron.bleiberg@...
 

Some additional inputs:
I'm running the boot from a QEMU VM. In the QEMU logs I see the following error during HTTP boot:
Error: Server response timeout.
BdsDxe: failed to load Boot0005 "UEFI HTTPv4 (MAC:0C2E9A0E3600)" from PciRoot(0x0)/Pci(0x3,0x0)/MAC(0C2E9A0E3600,0x1)/IPv4(0.0.0.0,0x0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0)/Uri(): Not Found

I've also raised httpd server logging to trace6 but could not find anything pointing to a problem on this side. I'm only able to observe an error after the UEFI terminated the session, which is expected.

Doron


doron.bleiberg@...
 

Hi Rebecca,

I'm running Apache httpd/2.4.6 (CentOS)
File size is: ~420MB

I did try to run Wireshark, however I was not able to see anything special. The traffic just stopped flowing at some point.

Doron


Rebecca Cran
 

What web server are you using? How large is the file you're trying to download?

You might be able to get more information about which side is at fault by running a traffic analyzer application such as Wireshark.


--
Rebecca Cran

On 2020-02-15 13:29, doron.bleiberg@... wrote:
Hi Community,

I'm trying to use HTTP boot. The process starts, and the download progresses until ~40% of it is complete. However, every time, the download terminates at the same spot.
I've tried to debug both the UEFI side and the HTTP server side, and could not reach a conclusion about which side the problem is on.
I did find that the timeout happens at:
File - NetworkPkg/HttpBootDxe/HttpBootSupport.c
Line: 1012
Code:
//
// Poll the network until receive finish.
//
while (!HttpIo->IsRxDone && ((HttpIo->TimeoutEvent == NULL) || EFI_ERROR (gBS->CheckEvent (HttpIo->TimeoutEvent)))) {
  Http->Poll (Http);
}

The while loop terminates due to timeout and the below code takes action:
if (!HttpIo->IsRxDone) {
  //
  // Timeout occurs, cancel the response token.
  //
  Http->Cancel (Http, &HttpIo->RspToken);
  Status = EFI_TIMEOUT;

  return Status;
}

Is there any timeout I can set to avoid it? Is the problem on UEFI or http server side?

Doron


doron.bleiberg@...
 

Hi Community,

I'm trying to use HTTP boot. The process starts, and the download progresses until ~40% of it is complete. However, every time, the download terminates at the same spot.
I've tried to debug both the UEFI side and the HTTP server side, and could not reach a conclusion about which side the problem is on.
I did find that the timeout happens at:
File - NetworkPkg/HttpBootDxe/HttpBootSupport.c
Line: 1012
Code:
//
// Poll the network until receive finish.
//
while (!HttpIo->IsRxDone && ((HttpIo->TimeoutEvent == NULL) || EFI_ERROR (gBS->CheckEvent (HttpIo->TimeoutEvent)))) {
  Http->Poll (Http);
}

The while loop terminates due to timeout and the below code takes action:
if (!HttpIo->IsRxDone) {
  //
  // Timeout occurs, cancel the response token.
  //
  Http->Cancel (Http, &HttpIo->RspToken);
  Status = EFI_TIMEOUT;

  return Status;
}

Is there any timeout I can set to avoid it? Is the problem on UEFI or http server side?

Doron