Load Option passing. Either bugs or my confusion.


valerij zaporogeci
 

Hello. Do I understand correctly, that LoadOptions field of the LOADED_IMAGE_PROTOCOL instance of an OS loader image, points to the Load Option descriptor, the format of which is described in the Boot Manager section of the spec? If yes, then I faced a behavior, that looks like bugs. It's on OVMF x64 (qemu) (and arm64 too, but only from the UEFI shell, since "ramfb" device shows broken view of what Boot Manager draws, that makes it unusable).
1) when I start my OSL from the UEFI shell, the LoadOptions field is not NULL, but the "load option" referenced, has only Description field valid, Attributes field doesn't seem valid, it contains 0x00730066. Most importantly, - FilePathListLength field is set to a nonzero value (48), but, - there is no valid FilePathList[] Device Path array after the Decsription field. Scanning memory there, reveals all zeros. Isn't it a bug?
2) when I create a Load Option in the Boot Manager, pointing to the same OSL and don't add "optional data" during the creation, the LoadOptions field is NULL. Why? How my OSL is supposed to get FilePathList[] then? For distinction between preinstallation run and postinstallation one, the OSL relies on the presence or absence of the Load Option or if there is one, - on the presence or absence FilePathList[1] Device Path, which points to the OS boot volume for postinstallation or is absent otherwise. But I cannot find FilePathList at all.
3) If I create a Load Option as in 2), but with "optional data", LoadOptions is not NULL, those "optional data" turn to be a "Description" field of the Load Option descriptor reported, and still, there is no FilePathList.
In all the above cases, where LoadOptions is not NULL, LoadOptionsSize equals exactly sizeof(Attributes) + sizeof(FilePathListLength) + sizeof(Description) + 2. the latter is NULL CHAR16. there is no FilePathList. Still FilePathListLength field reports some values.

Is it normal or it's bugs? Plus, there is yet one wierdness: DeviceHandle field (of the same LOADED_IMAGE_PROTOCOL instance for the OSL), which is a device handle (hard drive in this case - Type 4, Subtype - 1), doesn't have EFI_DEVICE_PATH_TO_TEXT_PROTOCOL instance on itself. but shouldn't it have it?

The main question is why there is no FilePathList[] part of the Load Option passed to the OSL in any scenario of its launch? It should be there at least for the case of the Load Option, created via the Boot Manager menu, right?


valerij zaporogeci
 

anyone, please?

The more I try to figure it out, the more it reveals that it's a bug. After that unsuccess with the LoadOptions in the image's Loaded Image Protocol instance, I went and enumerated BootXXXX variables, dumping their Attributes, FilePathListLength, Description and FilePathList[] itself. And this way, it worked - the load option has all those descriptor fields valid and the FilePathList is there and valid. So now, in comparison, when I take a Load Option through the Loaded Image Protocol LoadOptions field, I get:
1) wrong Attributes
2) invalid FilePathListLength (since FilePath is not present at all)
3) Description field has not the human readable name of the option (say "My OS") as is with when you get the option through GetValue() but what Boot Mahager suggests to enter as "optional data". btw, when these data are omitted during the Load Option creation, then the LoadOptions field in the Loaded Image Protocol is NULL at all, see the first post.
4) FilePathList[] is absent
again, when I get the same option by enumerating all the created BootXXXX variables with the GetValue(), all the fields are valid.

So, please tell me, is the LoadOptions pointer of the Loaded Image Protocol instance of the OSL image supposed to point to the Load Option descriptor in the format described in the Boot Manager section? If yes, then this behavior described is clearly a bug and it's not only important for me to figure this out, but for the edk2 as well.


valerij zaporogeci
 

Now looking closer to the supposed to be Attributes and FilePathListLength fileds of the Load Option descriptor, I noticed that it's just first 3 characters of the string. the string that with the Boot Manager case, you create as an "optional data" and with the UEFI shell, it's image file path string (that's why it started with a colon ":", because "fs0" was omitted because of thinking it's a descriptor and thus first 3 16 bit words got omitted (UINT32 + UINT16)).

So, the LoadOptions pointer of the Loaded Image Protocol doesn't point to Load Options. Is it normal?

And then the question arises, - if this is "normal", then how does the loader obtain its real Load Option? Yeah, the device from which it has been loaded is reported by another field, the file path part to the loader itself as well, but how about FilePathList[1] element, - "an OSV specific" and needed. for pointing to the OS boot volume e.g.?


valerij zaporogeci
 

excuse me my impatience, but is this thread visible at all? I can't believe noone in here could answer my simple questions about such a core functionality. maybe it's because I was too verbose, let's try terse.

1. what this pointer (OS Loader ImageHandle)->(LOADED_IMAGE_PROTOCOL)->LoadOptions points to?

if it points to the Load Option Descriptor, then on OVMF it doesn't, since it points to only OptionalData[].
if it was meant to point to OptionalData[], then:

1. how does an OS Loader get _its_ Load Option FilePathList[]? ALL elements.
and
2. should FilePathList[] be a multiinstance Device Path (ending with type 7F, subtype 1, except the last), or every element should end with {7F, FF}?


Laszlo Ersek
 

On 04/14/20 02:44, valerij zaporogeci wrote:
excuse me my impatience, but is this thread visible at all? I can't believe noone in here could answer my simple questions about such a core functionality. maybe it's because I was too verbose, let's try terse.
You posted

[edk2-devel] Bugs when starting an UEFI application from the shell. OVMF, UEFI shell.

on April 9th (in my time zone: a few minutes before April 10th), and started this thread (on edk2-discuss) on April 11th.

In many countries, April 10th and 13th are public holidays. And the 11th and 12th between them fall on a weekend. Put more simply, these days qualify as "Easter long weekend" in many countries.

I plan to look at your question later.

Laszlo


Laszlo Ersek
 

On 04/14/20 02:44, valerij zaporogeci wrote:

1. what this pointer (OS Loader
ImageHandle)->(LOADED_IMAGE_PROTOCOL)->LoadOptions points to?
According to the UEFI spec <https://uefi.org/specifications>, section
"9.1 EFI Loaded Image Protocol":

LoadOptionsSize The size in bytes of LoadOptions.
LoadOptions A pointer to the image's binary load options.

[...]

Each loaded image has an image handle that supports
EFI_LOADED_IMAGE_PROTOCOL. When an image is started, it is passed the
image handle for itself. The image can use the handle to obtain its
relevant image data stored in the EFI_LOADED_IMAGE_PROTOCOL structure,
such as its load options.

And from section "7.4 Image Services", near the LoadImage() boot
service:

Once the image is loaded, firmware creates and returns an EFI_HANDLE
that identifies the image and supports EFI_LOADED_IMAGE_PROTOCOL and
the EFI_LOADED_IMAGE_DEVICE_PATH_PROTOCOL. The caller may fill in the
image's "load options" data, or add additional protocol support to the
handle.

So an agent calls the LoadImage() boot service to load the image. Then
it looks up the EFI_LOADED_IMAGE_PROTOCOL instance on the handle that
was output by LoadImage(). The agent populates LoadOptionsSize and
LoadOptions in said EFI_LOADED_IMAGE_PROTOCOL instance, and then calls
StartImage(). The started image can then consume LoadOptionsSize and
LoadOptions.

I think you mistook these fields for describing an EFI_LOAD_OPTION
structure (from section "3.1.3 Load Options"). EFI_LOAD_OPTION is a
different thing -- the EFI_LOADED_IMAGE_PROTOCOL.LoadOptions field
points to a binary blob whose interpretation is specific to the image
being started. Its internals are not regulated by the spec.

More below:


if it points to the Load Option Descriptor, then on OVMF it doesn't,
since it points to only OptionalData[].
if it was meant to point to OptionalData[], then:

1. how does an OS Loader get _its_ Load Option FilePathList[]? ALL
elements.
and
Again, EFI_LOADED_IMAGE_PROTOCOL.LoadOptions does not point to an
EFI_LOAD_OPTION structure.

Still, your question #1 makes sense; indeed an OS boot loader (or
another UEFI app) is "entitled" to see the UEFI device path from which
it was loaded from. The passage I quoted above already mentions
EFI_LOADED_IMAGE_DEVICE_PATH_PROTOCOL. It is specified in detail in
section "9.2 EFI Loaded Image Device Path Protocol". Excerpt:

The Loaded Image Device Path Protocol must be installed onto the image
handle of a PE/COFF image loaded through the EFI Boot Service
LoadImage(). A copy of the device path specified by the DevicePath
parameter to the EFI Boot Service LoadImage() is made before it is
installed onto the image handle. It is legal to call LoadImage() for a
buffer in memory with a NULL DevicePath parameter. In this case, the
Loaded Image Device Path Protocol is installed with a NULL interface
pointer.

Summary:

- Passing arguments to a UEFI image: locate EFI_LOADED_IMAGE_PROTOCOL on
the image handle, and populate the LoadOptions / LoadOptionsSize
members. The "arguments" are a binary blob, only defined by the
receiving application.

- Finding where a UEFI image was loaded from: locate
EFI_LOADED_IMAGE_DEVICE_PATH_PROTOCOL on the image handle.

- Regarding "EFI_LOAD_OPTION.OptionalData": in case you boot a Boot####
option, it is "EFI_LOAD_OPTION.OptionalData" that will be passed to the
image in "EFI_LOADED_IMAGE_PROTOCOL.LoadOptions".

More below:

2. should FilePathList[] be a multiinstance Device Path (ending with
type 7F, subtype 1, except the last), or every element should end with
{7F, FF}?
I cannot answer this. The spec writes, about "FilePathList":

A packed array of UEFI device paths. The first element of the array is
a device path that describes the device and location of the Image for
this load option. The FilePathList[0] is specific to the device type.
Other device paths may optionally exist in the FilePathList, but their
usage is OSV specific. Each element in the array is variable length,
and ends at the device path end structure. Because the size of
Description is arbitrary, this data structure is not guaranteed to be
aligned on a natural boundary. This data structure may have to be
copied to an aligned natural boundary before it is used.

I'm not sure if the "non-first" elements in this array are supposed to
be 2nd and later devpath instances in a single multi-instance device
path, or if they are entirely separate (themselves single or
multi-instance) device paths.

Either way, I don't think those "elements" are passed to the loaded
image in any direct way. You could perhaps get back at the
EFI_LOAD_OPTION once the loaded image is running with the help of the
"BootCurrent" standard UEFI variable.

Thanks
Laszlo


valerij zaporogeci
 

Thank you, Laszlo! Again, sorry for the impatience, I started to suspect, my posts aren't visible.

Yes, I mistook the LoadOptions field of the Loaded Image Protocol for the one, describing image's EFI_LOAD_OPTION descriptor. because, well, "load options" is a term in the spec and I thought, the field named as such is about that entity. Later on, when I looked at the LoadImage() service description, I got it's not about Load Option Descriptor.

Either way, I don't think those "elements" are passed to the loaded
image in any direct way. You could perhaps get back at the
EFI_LOAD_OPTION once the loaded image is running with the help of the
"BootCurrent" standard UEFI variable.
This is exactly what I need! The way of knowing the current, chosen load option and ability of getting its data. Thanks for your explanation.

Yet one thing, I know you are occupied with OVMF, right? There is a Boot Manager UI display problem, I don't know if this is qemu problem, but with the ARM (both 32 and 64 bits, the qemu version is 4.2.0, the OVMF is fresh), and using "ramfb" device, the Boot Manager has troubles with drawing - it's interfase looks entirely broken, like this (I'll try to attach the screenshot). UEFI shell doesn't have this problem. switching to "serial" (which is -serial vc) doesn't produce it too. Only when ramfb is chosen, the Boot Manager UI gets smeared. But it takes input and presumable works properly, except displaying things. qemu writes these messages in the command prompt:
ramfb_fw_cfg_write: 640x480 @ 0x4bd00000
ramfb_fw_cfg_write: resolution locked, change rejected
ramfb_fw_cfg_write: 800x600 @ 0x4bd00000
ramfb_fw_cfg_write: resolution locked, change rejected

Thank you. :)


Laszlo Ersek
 

(CC Gerd, Qiming, Marcel, qemu-devel for ramfb:)

On 04/14/20 23:20, valerij zaporogeci wrote:

[snip]

There is a Boot Manager UI display problem, I don't know if this is
qemu problem, but with the ARM (both 32 and 64 bits, the qemu version
is 4.2.0, the OVMF is fresh), and using "ramfb" device, the Boot
Manager has troubles with drawing - it's interfase looks entirely
broken, like this (I'll try to attach the screenshot). UEFI shell
doesn't have this problem. switching to "serial" (which is -serial vc)
doesn't produce it too. Only when ramfb is chosen, the Boot Manager UI
gets smeared. But it takes input and presumable works properly, except
displaying things. qemu writes these messages in the command prompt:
ramfb_fw_cfg_write: 640x480 @ 0x4bd00000
ramfb_fw_cfg_write: resolution locked, change rejected
ramfb_fw_cfg_write: 800x600 @ 0x4bd00000
ramfb_fw_cfg_write: resolution locked, change rejected
Gerd contributed the OVMF QemuRamfbDxe driver in edk2 commit
1d25ff51af5c ("OvmfPkg: add QemuRamfbDxe", 2018-06-14). Note the date:
June 2018.

The then-latest (released) QEMU version was v2.12.0, and v2.12.1 /
v3.0.0 were in the making.

At that time, the resolution change definitely worked -- note my
"Tested-by" on the edk2 commit message.


Running "git blame" on the QEMU source, I now find commit a9e0cb67b7f4
("hw/display/ramfb: lock guest resolution after it's set", 2019-05-24).

Again, note the date: May 2019 (and this commit was released with QEMU
v4.1.0).

So I would say that the symptom you see is a QEMU v4.1.0 regression. The
QemuRamfbGraphicsOutputSetMode() function in the OVMF ramfb driver
certainly needs the QemuFwCfgWriteBytes() call to work, for changing the
resolution.


Now, I'm not familiar with the reasons behind QEMU commit a9e0cb67b7f4.
It says it intends to "prevent[] a crash when the guest writes garbage
to the configuration space (e.g. when rebooting)".

But I don't understand why locking the resolution was necessary for
preventing "a crash":

(1) Registering a device reset handler in QEMU seems sufficient, so that
QEMU forget about the currently shared RAMFB area at platform reset.

(2) The crash in question is apparently not a *QEMU* crash -- which
might otherwise justify a heavy-handed approach. Instead, it is a
*guest* crash. See the references below:

(2a) http://mid.mail-archive.com/CABSdmrmU7FK90Bupq_ySowcc9Uk=8nQxNLHgzvDsNYdp_QLogA@mail.gmail.com
https://lists.gnu.org/archive/html/qemu-devel/2019-05/msg02299.html

(2b) https://github.com/intel/gvt-linux/issues/23#issuecomment-483651476

Therefore, I don't think that locking the resolution was justified!

Importantly:

- The QemuRamfbDxe driver allocates the framebuffer in *reserved*
memory, therefore any well-behaving OS will *never* touch the
framebuffer.

- The QemuRamfbDxe driver allocates the framebuffer memory only once,
namely for such a resolution that needs the largest amount of
framebuffer memory. Therefore, framebuffer re-allocations in the guest
driver -- and thereby guest RAM *re-mapping* in QEMU -- are *not*
necessary, upon resolution change.

The ramfb device reset handler in QEMU is justified (for unmapping /
forgetting the previously shared RAMFB area).

The resolution locking is *NOT* justified, and it breaks the OVMF
driver. I suggest backing out the resolution locking from QEMU.

Reference (2a) above indicates 'It could be a misguided attempt to
"resize ramfb" by the guest Intel driver'. If that is the case, then
please fix the Intel guest driver, without regressing the QEMU device
model.

I'm sad that the QEMU device model change was not regression-tested
against the *upstream* OVMF driver (which, by then, had been upstream
for almost a year).

Laszlo


Laszlo Ersek
 

On 04/16/20 06:38, Hou Qiming wrote:
Very good point, I did neglect ramfb resolution changes... But there is one
important thing: it *can* cause a QEMU crash, a potentially exploitable
one, not always a guest crash. That's what motivated my heavy-handed
approach since allowing resolution change would have necessitated a good
deal of security checks. It has crashed my host *kernel* quite a few times.

The point is, while the QemuRamfbDxe driver may behave properly, nothing
prevents the guest from writing garbage or *malicious* values to the ramfb
config space. Then the values are sent to the display component without any
sanity check. For some GUI frontends, this means allocating an OpenGL
texture with guest-supplied dimensions and uploading guest memory content
to it, which means that guest memory content goes straight into a *kernel
driver*, *completely unchecked*. Some integer overflow and a lenient GPU
driver later, and the guest escapes straight to kernel.

The proper way to enable ramfb resolution change again is adding sanity
checks for ramfb resolution / pointer / etc. on the QEMU side. We have to
make sure it doesn't exceed what the host GPU driver supports. Maybe clamp
both width and height to between 1 and 2048? We also need to validate that
OpenGL texture dimension update succeeds. Note that OpenGL is not obliged
to validate anything and everything has to be checked on the QEMU side.
I agree that QEMU should sanity check the resolution requested by the
guest. I also agree that "arbitrary" limits are acceptable, for
preventing integer overflows and -- hopefully -- memory allocation
failures too.

But I don't see the host kernel / OpenGL / physical GPU angle, at least
not directly. That angle seems to be specific to your particular use
case (particular choice of display backend).

For example, if you nest QEMU/TCG in QEMU/TCG, with no KVM and no device
assignment in the picture anywhere, and OVMF drives ramfb in L2, and the
display *backend* (such as GTK or SDL GUI window) for the QEMU process
running in L1 sits on top of a virtual device (such as bochs-display)
provided by QEMU running in L0, then the ramfb stuff (including the
resolution changes and the range checks) should work just the same,
between L2 and L1.

I kinda feel like ramfb has been hijacked for providing a boot time
display crutch for kvmgt. (I might not be using the correct terminology
here; sorry about that). That's *not* what ramfb was originally intended
for, as far as I recall. Compare:

- 59926de9987c ("Merge remote-tracking branch
'remotes/kraxel/tags/vga-20180618-pull-request' into staging", 2018-06-19)

- dddb37495b84 ("Merge remote-tracking branch
'remotes/awilliam/tags/vfio-updates-20181015.0' into staging", 2018-10-15)

IIRC, Gerd originally invented ramfb for giving AARCH64 Windows the
linear framebuffer that the latter so badly wants, in particular so that
the framebuffer exist in guest RAM (not in guest MMIO), in order to
avoid the annoying S1/S2 caching behavior of AARCH64/KVM when the guest
maps an area as MMIO that is mapped as RAM on the host [1]. See:

- https://bugzilla.tianocore.org/show_bug.cgi?id=785#c4
- https://bugzilla.tianocore.org/show_bug.cgi?id=785#c7
- https://bugzilla.tianocore.org/show_bug.cgi?id=785#c8

and the further references given in those bugzilla comments.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1679680#c0

Component reuse is obviously *hugely* important, and it would be silly
for me to argue against reusing ramfb wherever it applies. Just please
don't break the original use case.

Should I file a bug report in LaunchPad, or is this thread enough for
tracking the QEMU regression?

Thanks
Laszlo


Qiming


On Wed, Apr 15, 2020 at 11:05 PM Laszlo Ersek <lersek@...> wrote:

(CC Gerd, Qiming, Marcel, qemu-devel for ramfb:)

On 04/14/20 23:20, valerij zaporogeci wrote:

[snip]

There is a Boot Manager UI display problem, I don't know if this is
qemu problem, but with the ARM (both 32 and 64 bits, the qemu version
is 4.2.0, the OVMF is fresh), and using "ramfb" device, the Boot
Manager has troubles with drawing - it's interfase looks entirely
broken, like this (I'll try to attach the screenshot). UEFI shell
doesn't have this problem. switching to "serial" (which is -serial vc)
doesn't produce it too. Only when ramfb is chosen, the Boot Manager UI
gets smeared. But it takes input and presumable works properly, except
displaying things. qemu writes these messages in the command prompt:
ramfb_fw_cfg_write: 640x480 @ 0x4bd00000
ramfb_fw_cfg_write: resolution locked, change rejected
ramfb_fw_cfg_write: 800x600 @ 0x4bd00000
ramfb_fw_cfg_write: resolution locked, change rejected
Gerd contributed the OVMF QemuRamfbDxe driver in edk2 commit
1d25ff51af5c ("OvmfPkg: add QemuRamfbDxe", 2018-06-14). Note the date:
June 2018.

The then-latest (released) QEMU version was v2.12.0, and v2.12.1 /
v3.0.0 were in the making.

At that time, the resolution change definitely worked -- note my
"Tested-by" on the edk2 commit message.


Running "git blame" on the QEMU source, I now find commit a9e0cb67b7f4
("hw/display/ramfb: lock guest resolution after it's set", 2019-05-24).

Again, note the date: May 2019 (and this commit was released with QEMU
v4.1.0).

So I would say that the symptom you see is a QEMU v4.1.0 regression. The
QemuRamfbGraphicsOutputSetMode() function in the OVMF ramfb driver
certainly needs the QemuFwCfgWriteBytes() call to work, for changing the
resolution.


Now, I'm not familiar with the reasons behind QEMU commit a9e0cb67b7f4.
It says it intends to "prevent[] a crash when the guest writes garbage
to the configuration space (e.g. when rebooting)".

But I don't understand why locking the resolution was necessary for
preventing "a crash":

(1) Registering a device reset handler in QEMU seems sufficient, so that
QEMU forget about the currently shared RAMFB area at platform reset.

(2) The crash in question is apparently not a *QEMU* crash -- which
might otherwise justify a heavy-handed approach. Instead, it is a
*guest* crash. See the references below:

(2a)
http://mid.mail-archive.com/CABSdmrmU7FK90Bupq_ySowcc9Uk=8nQxNLHgzvDsNYdp_QLogA@mail.gmail.com
https://lists.gnu.org/archive/html/qemu-devel/2019-05/msg02299.html

(2b) https://github.com/intel/gvt-linux/issues/23#issuecomment-483651476

Therefore, I don't think that locking the resolution was justified!

Importantly:

- The QemuRamfbDxe driver allocates the framebuffer in *reserved*
memory, therefore any well-behaving OS will *never* touch the
framebuffer.

- The QemuRamfbDxe driver allocates the framebuffer memory only once,
namely for such a resolution that needs the largest amount of
framebuffer memory. Therefore, framebuffer re-allocations in the guest
driver -- and thereby guest RAM *re-mapping* in QEMU -- are *not*
necessary, upon resolution change.

The ramfb device reset handler in QEMU is justified (for unmapping /
forgetting the previously shared RAMFB area).

The resolution locking is *NOT* justified, and it breaks the OVMF
driver. I suggest backing out the resolution locking from QEMU.

Reference (2a) above indicates 'It could be a misguided attempt to
"resize ramfb" by the guest Intel driver'. If that is the case, then
please fix the Intel guest driver, without regressing the QEMU device
model.

I'm sad that the QEMU device model change was not regression-tested
against the *upstream* OVMF driver (which, by then, had been upstream
for almost a year).

Laszlo


Laszlo Ersek
 

On 04/17/20 05:22, Hou Qiming wrote:
I'm glad we can reach a consensus that ramfb needs sanity checks. And well,
I'm probably at fault with the hijacking.

Your QEMU/TCG in QEMU/TCG example also made me realize a deeper problem,
though: your setting still can't escape the host display / physical GPU
issue. The middle display layers be bochs or whatever, but as long as the
framebuffer content and resolution values are propagated, and the end
result is displayed at all on the host, the host GPU attack surface remains
exposed to the L2 guest, and checks are needed. Everything shown on the
screen involves the display driver - GPU stack, GTK or SDL or tty, you
can't avoid that. ramfb-kvmgt just happened to be the shortest pipeline
where every stage neglected the checks, which exposed this problem.
Good point.

Blaming
this on ramfb is unfair since in your scenario the checks are better done
in the display subsystems.

TL;DR You made me realize right now, it's a very real risk that an AARCH64
Windows guest could exploit a x64 host's display driver by specifying a
crafted framebuffer with overflowing resolution. I don't want to break it,
but I'd prefer a broken state over an insecure state.

I'm not quite sure what this thread is. But I think with the scope this
discussion is going, maybe it's more of a bug than a regression.
All display devices (frontends) emulated by QEMU have to set bounds for
the permissible resolutions, for the guest. And those limits must never
break the capabilities of the display backends. So this is not a new
problem. How is it handled with other frontends? Like bochs-display, for
example -- I assume bochs-display too is purely virtual, i.e. the
resolution is fully controller (between bounds) by the guest. How is the
guest currently prevented from setting a bochs-display resolution that
"breaks SDL" (whatever that means)?

I'm inclined to agree that we're just seeing two sides of the same bug
-- the first state was too lax, and the current state is too strict.

Thanks
Laszlo


Laszlo Ersek
 

On 04/20/20 16:13, Gerd Hoffmann wrote:
Hi,

So I would say that the symptom you see is a QEMU v4.1.0 regression.
The QemuRamfbGraphicsOutputSetMode() function in the OVMF ramfb
driver certainly needs the QemuFwCfgWriteBytes() call to work, for
changing the resolution.
Oh? QemuRamfbGraphicsOutputSetMode() can be called multiple times?
How does that happen?
QemuRamfbGraphicsOutputSetMode() is the "SetMode" member function of the
EFI_GRAPHICS_OUTPUT_PROTOCOL instance that QemuRamfbDxe produces.

This is a standard protocol; UEFI drivers and applications are free to
locate it and to use it.

(1) When you launch OVMF, you get the splash screen in a particular
resolution. This resolution:
- is configured by OvmfPkg/PlatformDxe,
- is inherited by an OS boot loader,
- is reconfigurable with OvmfPkg/PlatformDxe, for the next boot, via the
Setup TUI,
- defaults to 800x600 (taking effect when no particular choice is
configured).

(2) UiApp -- the Setup TUI itself -- uses its own resolution. Under
OVMF, this resolution is fixed 640x480. When UiApp is entered,
ultimately a call is made to QemuRamfbGraphicsOutputSetMode() -- i.e., a
GOP.SetMode() member function -- for setting this 640x480 resolution.

Using the following command:

qemu-system-x86_64 \
-nodefaults \
-boot menu=on,splash-time=5000 \
-enable-kvm \
-device ramfb \
-drive if=pflash,readonly,format=raw,file=$PREFIX/share/qemu/edk2-x86_64-code.fd \
-drive if=pflash,snapshot,format=raw,file=$PREFIX/share/qemu/edk2-i386-vars.fd \
-debugcon file:ovmf.log \
-global isa-debugcon.iobase=0x402

when you first see the progress bar, the graphical resolution (1) is
800x600. Accordingly, QEMU prints to stderr:

ramfb_fw_cfg_write: 800x600 @ 0x6702000
Once you hit ESC to interrupt the progress bar and to enter the Setup
TUI, UiApp switches to resolution (2), 640x480. QEMU prints:

ramfb_fw_cfg_write: 640x480 @ 0x6702000
ramfb_fw_cfg_write: resolution locked, change rejected
And you get garbage in the Setup window.

Thanks,
Laszlo


Laszlo Ersek
 

On 04/22/20 09:42, Hou Qiming wrote:
A little off topic thing: isn't the default resolution supposed to be
1024x768?
No.

This is the Microsoft regulation which all my physical devices
seem to follow:

https://docs.microsoft.com/en-us/windows-hardware/test/hlk/testref/6afc8979-df62-4d86-8f6a-99f05bbdc7f3
Key term being "Microsoft regulation".

The UEFI spec requires discrete ("plug-in") graphics devices to support
at least either 800x600x32 or 640x480x32.

And the edk2 (not just OVMF) default for the console resolution is
800x600. (See PcdVideoHorizontalResolution and
PcdVideoVerticalResolution in "MdeModulePkg/MdeModulePkg.dec".)

And when the user provides an EDID one should parse it and set the default
resolution to match it. But that's a less important feature.
It's more complex than you might think, and (to me personally) it seems
to require more time than its importance justifies.

https://bugzilla.redhat.com/show_bug.cgi?id=1749250

Laszlo


Hou Qiming <hqm03ster@...>
 


> And when the user provides an EDID one should parse it and set the default
> resolution to match it. But that's a less important feature.

It's more complex than you might think, and (to me personally) it seems
to require more time than its importance justifies.

https://bugzilla.redhat.com/show_bug.cgi?id=1749250


Read the thread. Actually, I wrote some EDID parsing code a while ago, but that's before QEMU supporting EDID so I had to do it outside QEMU and pass my parsing result to ramfb as the now-removed starting_width / starting_height. In the context QEMU, the EDID actually reflects the user preference since the whole structure is usually made up from the user-specified resolution. And I think most guest OSes initialize first-time-seen monitors to their EDID resolution, which should have motivated QEMU to provide an EDID for a virtual monitor.

But at this point it's kind of awkward to do the EDID / resolution handling (that I need) in the ramfb driver as the kvmgt EDID has to be read out from the i915 MMIO just like a physical GPU. Guess now my use case is better covered with a fully functional i915 framebuffer driver for OVMF. If I had the time...


Hou Qiming <hqm03ster@...>
 

A little off topic thing: isn't the default resolution supposed to be 1024x768? This is the Microsoft regulation which all my physical devices seem to follow:


And when the user provides an EDID one should parse it and set the default resolution to match it. But that's a less important feature.


On Tue, Apr 21, 2020 at 9:03 PM Laszlo Ersek <lersek@...> wrote:
On 04/20/20 16:13, Gerd Hoffmann wrote:
>   Hi,
>
>> So I would say that the symptom you see is a QEMU v4.1.0 regression.
>> The QemuRamfbGraphicsOutputSetMode() function in the OVMF ramfb
>> driver certainly needs the QemuFwCfgWriteBytes() call to work, for
>> changing the resolution.
>
> Oh?  QemuRamfbGraphicsOutputSetMode() can be called multiple times?
> How does that happen?

QemuRamfbGraphicsOutputSetMode() is the "SetMode" member function of the
EFI_GRAPHICS_OUTPUT_PROTOCOL instance that QemuRamfbDxe produces.

This is a standard protocol; UEFI drivers and applications are free to
locate it and to use it.

(1) When you launch OVMF, you get the splash screen in a particular
resolution. This resolution:
- is configured by OvmfPkg/PlatformDxe,
- is inherited by an OS boot loader,
- is reconfigurable with OvmfPkg/PlatformDxe, for the next boot, via the
  Setup TUI,
- defaults to 800x600 (taking effect when no particular choice is
  configured).

(2) UiApp -- the Setup TUI itself -- uses its own resolution. Under
OVMF, this resolution is fixed 640x480. When UiApp is entered,
ultimately a call is made to QemuRamfbGraphicsOutputSetMode() -- i.e., a
GOP.SetMode() member function -- for setting this 640x480 resolution.

Using the following command:

  qemu-system-x86_64 \
    -nodefaults \
    -boot menu=on,splash-time=5000 \
    -enable-kvm \
    -device ramfb \
    -drive if=pflash,readonly,format=raw,file=$PREFIX/share/qemu/edk2-x86_64-code.fd \
    -drive if=pflash,snapshot,format=raw,file=$PREFIX/share/qemu/edk2-i386-vars.fd \
    -debugcon file:ovmf.log \
    -global isa-debugcon.iobase=0x402

when you first see the progress bar, the graphical resolution (1) is
800x600. Accordingly, QEMU prints to stderr:

> ramfb_fw_cfg_write: 800x600 @ 0x6702000

Once you hit ESC to interrupt the progress bar and to enter the Setup
TUI, UiApp switches to resolution (2), 640x480. QEMU prints:

> ramfb_fw_cfg_write: 640x480 @ 0x6702000
> ramfb_fw_cfg_write: resolution locked, change rejected

And you get garbage in the Setup window.

Thanks,
Laszlo