
Re: Reply: [edk2-discuss] lost stderr in console

wenyi,xie
 

Hi, liming,

Yes, my command uses 2>&1 to redirect stderr to stdout, and after updating to version 202011, the stderr messages are no longer shown on the screen but only written to xxx.log.
With Laszlo's help, and after checking the commit log, I found that the problem is related to this commit:
https://github.com/tianocore/edk2/commit/01b6090b75922bc72604c334bd3dc331490af3bb

Proc = MakeSubProc(Command, stdout=PIPE, stderr=STDOUT, env=os.environ, cwd=WorkingDir, bufsize=-1, shell=True)

Since this commit, stdout and stderr are combined into a single stream. So my command works in version 201903, but in version 202011, where this commit has been merged, it no longer works as intended.
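
To illustrate the effect of stderr=STDOUT, here is a minimal sketch using plain subprocess.Popen (an assumption: it stands in for the BaseTools MakeSubProc wrapper, which is not reproduced here). The child command and the run_child helper are made up for the illustration.

```python
import subprocess
import sys

# Hypothetical child: one line on stdout, then one line on stderr.
CHILD = "import sys; print('out'); sys.stdout.flush(); print('err', file=sys.stderr)"

def run_child(merge):
    """Run the child and return (stdout lines, stderr lines) as seen by the parent."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        # STDOUT makes the child's fd 2 refer to the same pipe as its fd 1.
        stderr=subprocess.STDOUT if merge else subprocess.PIPE,
    )
    out, err = proc.communicate()
    return out.decode().split(), (err or b"").decode().split()

merged = run_child(merge=True)     # everything arrives on the stdout pipe
separate = run_child(merge=False)  # the two streams stay distinguishable
```

Once the streams are merged inside the build tool, an outer 2>&1 >>file redirection has nothing left to separate, which matches the behaviour described above.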

-----Original Message-----
From: gaoliming [mailto:gaoliming@byosoft.com.cn]
Sent: March 25, 2021 9:27
To: discuss@edk2.groups.io; xiewenyi (A) <xiewenyi2@huawei.com>
Subject: Reply: [edk2-discuss] lost stderr in console

Wenyi:
Do you use the build command to build your platform, and find that messages from stderr are not shown on the screen?

Thanks
Liming

-----Original Message-----
From: discuss@edk2.groups.io <discuss@edk2.groups.io> On Behalf Of wenyi,xie via groups.io
Sent: March 17, 2021 11:23
To: discuss@edk2.groups.io
Subject: [edk2-discuss] lost stderr in console

Hi everyone,

I used to set MAKEFLAGS as shown below, so that only messages from stderr
are shown on the console while compiling, while messages from both stderr
and stdout are saved in xxx.log.
COMMAND MAKEFLAGS= set -eo pipefail && sh xxx.sh 2>&1 >>
${LOG_FILE_DIR}/xxx.log | tee -a ${LOG_FILE_DIR}/xxx.log

But after updating edk2 from 201903 to 202011, messages from stderr
are no longer shown on the console. Messages from both stderr and
stdout are still saved in xxx.log.
Did anything change in edk2 that may affect this?




Re: Reply: [edk2-discuss] lost stderr in console

wenyi,xie
 

Hi, Laszlo,

Thank you very much for such a detailed explanation. I understand the root cause of the problem now.

-----Original Message-----
From: Laszlo Ersek [mailto:lersek@redhat.com]
Sent: March 18, 2021 4:31
To: discuss@edk2.groups.io; xiewenyi (A) <xiewenyi2@huawei.com>
Subject: Re: [edk2-discuss] lost stderr in console

On 03/17/21 04:23, wenyi,xie via groups.io wrote:
Hi everyone,

I used to set MAKEFLAGS as shown below, so that only messages from stderr are shown on the console while compiling, while messages from both stderr and stdout are saved in xxx.log.
COMMAND MAKEFLAGS= set -eo pipefail && sh xxx.sh 2>&1 >>
${LOG_FILE_DIR}/xxx.log | tee -a ${LOG_FILE_DIR}/xxx.log

Redirections are processed in the following order:
- pipe first (redirects stdout)
- then left to right, as seen on the command line

So what we have for "xxx.sh" is:

step#1: file descriptor 1 refers to the (nameless) pipe's file description

step#2: file descriptor 2 refers to the same (nameless) pipe's file description (i.e., to the file description that file descriptor 1
*currently* refers to)

step#3: file descriptor 1 now refers to a file description that refers to the inode ("file") originally looked up by the name "xxx.log". At the file description level, the O_APPEND status flag is set.

So "tee" will only see stderr from "xxx.sh". Furthermore, the stdout of "xxx.sh" will only go to the log file ("xxx.log").
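
The three steps can be replayed in Python as a rough sketch. It uses stand-in descriptors (out_fd and err_fd, hypothetical names) instead of the real fds 1 and 2, and a temporary file instead of xxx.log, so it only models the order of the redirections, not the shell itself.

```python
import os
import tempfile

def simulate_redirections():
    """Replay steps 1-3 and return (bytes seen on the pipe, bytes in the log)."""
    pipe_r, pipe_w = os.pipe()
    log_fd, log_path = tempfile.mkstemp()
    try:
        out_fd = os.dup(pipe_w)   # step 1: "stdout" refers to the pipe
        err_fd = os.dup(out_fd)   # step 2: 2>&1 copies what "stdout" refers to *now*
        os.dup2(log_fd, out_fd)   # step 3: >>file re-points "stdout" only
        os.write(out_fd, b"normal\n")
        os.write(err_fd, b"diagnostic\n")
        for fd in (out_fd, err_fd, pipe_w):
            os.close(fd)          # close every write end so the reader sees EOF
        seen_by_tee = os.read(pipe_r, 4096)
        os.close(pipe_r)
        os.lseek(log_fd, 0, os.SEEK_SET)
        in_log = os.read(log_fd, 4096)
        return seen_by_tee, in_log
    finally:
        os.close(log_fd)
        os.remove(log_path)

seen_by_tee, in_log = simulate_redirections()
```

The reader on the pipe (the role "tee" plays) receives only the diagnostic line, and the normal line lands only in the file.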


Let's consider "tee" then: "tee" opens the inode (the "file") looked up by the name "xxx.log" separately from when the shell opens "xxx.log", for the "xxx.sh" redirection. This means that, in the kernel, a separate file description exists, for the "xxx.log" inode. This file description also has the O_APPEND status flag set, but it doesn't matter -- the file description that "xxx.sh" writes through, and the file description that "tee" writes through, are independent. The "file offset" property is at the file description level. Therefore "tee" and "xxx.sh" do not share the file offset (and O_APPEND is useless, in both file descriptions), and they will mutually overwrite parts of each other's output.

In other words, your command line is buggy.


In general:

file descriptor --> file description --> file (inode)

When you open() the same file by name, you get this:

file descriptor --> file description \
--> file (inode)
file descriptor --> file description /

Whereas, if you use fork() or dup(), this is what you get:

file descriptor \
--> file description --> file (inode)
file descriptor /

O_APPEND and the file offset both exist in the *file description* object. So in the first case, you get no coordination from the kernel, and in the second case, you do.
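
The difference between the two diagrams can be observed with a small sketch. The file is opened without O_APPEND on purpose, so that the per-description file offset is directly visible; append mode is deliberately left out of this illustration. The helper name write_pair is made up.

```python
import os
import tempfile

def write_pair(share_description):
    """Write b"AAAA" then b"BB" through two descriptors; return the file contents."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        fd_a = os.open(path, os.O_WRONLY)
        if share_description:
            fd_b = os.dup(fd_a)                 # same file description, shared offset
        else:
            fd_b = os.open(path, os.O_WRONLY)   # new file description, own offset (0)
        os.write(fd_a, b"AAAA")
        os.write(fd_b, b"BB")
        os.close(fd_a)
        os.close(fd_b)
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.remove(path)

independent = write_pair(share_description=False)
shared = write_pair(share_description=True)
```

With two independent descriptions the second write starts at offset 0 and clobbers the first; with dup() the second write continues where the first one stopped.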

Note that even in the second case, that is, when both file descriptors refer to the same file description, it is not guaranteed that
*concurrent* writes will not be *interleaved*. No data will be *overwritten*, for sure, but the granularity of "atomic" writes is not an easy question. If the file description refers to a pipe, then there are some guarantees from POSIX, as long as the writes are "small enough"
(PIPE_BUF):
<https://pubs.opengroup.org/onlinepubs/9699919799/functions/write.html>.


So... what you want to do is actually difficult. There are two approaches, but both are broken.

Approach (1): fuse stdout and stderr into a single stream, capture the common stream in the log file, and print the "diagnostic" lines to the
console:

sh xxx.sh 2>&1 | tee -- xxx.log | grep -- 'what exactly?'

This is broken because you cannot identify the diagnostic-only lines (the original stderr) by content.

Approach (2): duplicate the original stderr into two streams. The first instance will go to the console. The second instance, together with the script's stdout, will be written to the log file.

# fd 1: original stdout
# fd 2: original stderr
# make fd 3 point to the original stdout as well

exec 3>&1

(
  # fd 1: pipe leading to the log file
  # fd 2: original stderr
  # fd 3: original stdout
  # make fd 4 too point to the pipe leading to the log file

  exec 4>&1

  (
    # fd 1: pipe carrying the main script's stderr
    # fd 2: original stderr
    # fd 3: original stdout
    # fd 4: pipe leading to the log file

    # main script's stderr goes to the duplicator pipe
    # main script's stdout goes to the log file

    xxx.sh 2>&1 >&4
  ) | (

    # fd 0: pipe carrying the main script's stderr
    # fd 1: pipe leading to the log file
    # fd 2: original stderr
    # fd 3: original stdout
    # fd 4: pipe leading to the log file

    # duplicate the main script's stderr to the original stdout (3)
    # and to the log file (1)

    tee /dev/fd/3
  )
) | cat > xxx.log


In short form (no comments and no useless subshells):

exec 3>&1
(
  exec 4>&1
  xxx.sh 2>&1 >&4 | tee /dev/fd/3
) | cat > xxx.log

This will not lose messages. It will also not interleave write() syscalls that do not exceed PIPE_BUF (usually 4KB) individually -- this is the justification for the outermost pipe, and "cat".

However, it is still broken: dependent on process scheduling between "xxx.sh" and "tee", it's now possible that stdout and stderr lines from "xxx.sh" will be reordered, relative to each other. Let's say line#1 is a diagnostic message, while line#2 is a normal message. "xxx.sh" writes them in line#1, line#2 order. With the above, line#1 goes from "xxx.sh"
to "tee" to "cat", while line#2 goes from "xxx.sh" to "cat". If "tee" is "slow", then "cat" could see the messages in line#2, line#1 order.
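
For comparison, the same "duplicate stderr" idea can be sketched in Python with two reader threads. This is not the shell construct above but a swapped-in equivalent, and run_and_log is a hypothetical helper. It has the same limitation: the relative order of stdout and stderr lines in the log depends on scheduling.

```python
import subprocess
import sys
import threading

def run_and_log(cmd, log, console):
    """Send the child's stdout to log only, and its stderr to both log and console."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True
    )
    lock = threading.Lock()  # keeps each line intact across the two pumps

    def pump(stream, sinks):
        for line in stream:
            with lock:
                for sink in sinks:
                    sink.write(line)

    pumps = [
        threading.Thread(target=pump, args=(proc.stdout, [log])),
        threading.Thread(target=pump, args=(proc.stderr, [log, console])),
    ]
    for thread in pumps:
        thread.start()
    for thread in pumps:
        thread.join()
    return proc.wait()
```

A caller would pass an open log file and sys.stdout as the two sinks, for example run_and_log(["sh", "xxx.sh"], log_file, sys.stdout).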

Thanks
Laszlo


Reply: [edk2-discuss] lost stderr in console

gaoliming
 

Wenyi:
Do you use the build command to build your platform, and find that messages from stderr are not shown on the screen?

Thanks
Liming

-----Original Message-----
From: discuss@edk2.groups.io <discuss@edk2.groups.io> On Behalf Of wenyi,xie
via groups.io
Sent: March 17, 2021 11:23
To: discuss@edk2.groups.io
Subject: [edk2-discuss] lost stderr in console

Hi everyone,

I used to set MAKEFLAGS as shown below, so that only messages from stderr are
shown on the console while compiling, while messages from both stderr and
stdout are saved in xxx.log.
COMMAND MAKEFLAGS= set -eo pipefail && sh xxx.sh 2>&1 >>
${LOG_FILE_DIR}/xxx.log | tee -a ${LOG_FILE_DIR}/xxx.log

But after updating edk2 from 201903 to 202011, messages from stderr are
no longer shown on the console. Messages from both stderr and stdout are
still saved in xxx.log.
Did anything change in edk2 that may affect this?




Re: GccBase.lds

Andrew Fish
 

Jose,

Caveat Emptor as Xcode clang is my daily driver, so I don’t really use GCC.

As Ard mentioned the ELF image is converted to a PE/COFF executable for EFI, and the symbols are stripped so the build ID is not going to help you. In PE/COFF there is something called a Debug Directory Entry[1]. It can point to a .debug section, or info in any section. The Debug Directory Entry starts with a 4 byte structure that defines its layout [1].

The 2nd issue you will have is the PE/COFF images are relocatable executables so they are not loaded at their linked address. Thus to load symbols you need to know the slide (delta of load address to link address) which you can derive from the load address of the image, and a path to the external symbols file. PE/COFF has the property that all the PE/COFF headers are loaded into memory when a PE/COFF image is loaded, this will be important later.
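
As a worked example of the slide arithmetic, here is a tiny sketch. The addresses are made up for illustration and the function names are hypothetical.

```python
def slide(load_address, link_address):
    """Delta between where the image was loaded and where it was linked."""
    return load_address - link_address

def runtime_address(symbol_link_address, load_address, link_address):
    """Where a symbol ends up in memory once the image has been relocated."""
    return symbol_link_address + slide(load_address, link_address)

# Hypothetical image: linked at 0, loaded at 0x7E5A1000, symbol linked at 0x6089.
entry = runtime_address(0x6089, load_address=0x7E5A1000, link_address=0x0)
```

A debugger needs the same delta applied to every symbol in the external symbols file.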

When you get past early PEI[2] and code is no longer XIP (eXecute In Place) you need to know the load address of the code to load symbols. So you end up with these options:
1) Load symbols for PC (This works for XIP too).
2) Load symbols for frame. Really just 1) for each frame PC.
3) Load symbols based on some table that has all the load info. This is the example from the EFI Spec [3].

To implement 1) you start with the PC and walk backwards in memory looking for the PE/COFF (or DOS) header. That header is the load address of the module. From the headers you can find the debug directory entry. The Debug Directory entry can then be decoded and that gives you (maybe) a unique identifier and the file name of the symbols the debugger needs to load. At that point you can tell the debugger this debug info matches the module loaded at this address. If your debugger knows PE/COFF you just have to point at the DOS or PE/COFF header.
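
A minimal sketch of the backwards walk in 1), operating on a captured memory buffer rather than a live target. The alignment step is an assumption (0x1000 here; XIP images in flash may be aligned much more finely), and the function and parameter names are hypothetical. It only checks for a DOS header whose e_lfanew field points at a "PE\0\0" signature.

```python
import struct

def find_image_base(memory, memory_base, pc, align=0x1000):
    """Search backwards from pc for a DOS header that leads to a PE signature."""
    addr = pc & ~(align - 1)
    while addr >= memory_base:
        off = addr - memory_base
        # "MZ" marks a DOS header; e_lfanew lives at offset 0x3C of that header.
        if memory[off:off + 2] == b"MZ" and off + 0x40 <= len(memory):
            (e_lfanew,) = struct.unpack_from("<I", memory, off + 0x3C)
            pe_off = off + e_lfanew
            if memory[pe_off:pe_off + 4] == b"PE\0\0":
                return addr  # load address of the module containing pc
        addr -= align
    return None
```

The returned address is what a PE/COFF-aware debugger would be pointed at; from there the headers lead to the debug directory.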

Since I use lldb, and the macOS flavor of lldb does not parse PE/COFF, we had to write lldb Python to do all of the above and some more. So the above is how to do everything yourself, or the theory of operation.

There is also the primitive approach: look at the serial logs and load symbols by hand using debugger commands.

In terms of `git grep` in edk2, look for EFI_IMAGE_DEBUG_DIRECTORY_ENTRY. There is a library function called PeCoffLoaderGetPdbPointer() that returns a pointer to the PE/COFF Debug Directory Entry; it has pdb in its name because that is the name of the Windows symbols file that the Debug Directory Entry generally points to for VC++. For Xcode/clang the Debug Directory Entry has the UUID of the Mach-O executable (that got converted into a PE/COFF) and lldb can look up the symbols that way; other toolchains need the path to the debug file to load. So for example the OVMF exception handler [4] will dump out info about the faulting PC.
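
As a rough sketch of what decoding such an entry involves, the function below takes the raw CodeView payload that a debug directory entry points to and extracts the file name. The header sizes (16 bytes for NB10, 24 for RSDS, 20 for MTOC) are an assumption based on my reading of the structures in PeImage.h and should be checked against that header; decode_codeview is a hypothetical name, not an edk2 API.

```python
# Assumed layout: 4-byte signature, fixed-size header, NUL-terminated file name.
HEADER_SIZE = {b"NB10": 16, b"RSDS": 24, b"MTOC": 20}

def decode_codeview(payload):
    """Return (signature, identifier bytes or None, symbol file name)."""
    signature = bytes(payload[:4])
    if signature not in HEADER_SIZE:
        raise ValueError("unknown CodeView signature: %r" % signature)
    # RSDS carries a GUID and MTOC a Mach-O UUID right after the signature;
    # NB10 has no such identifier, only the path.
    identifier = None if signature == b"NB10" else bytes(payload[4:20])
    start = HEADER_SIZE[signature]
    end = payload.index(b"\0", start)
    name = bytes(payload[start:end]).decode("utf-8", "replace")
    return signature.decode("ascii"), identifier, name
```

For an NB10 entry the result is just a path on the build host; for MTOC the 16 identifier bytes are what a UUID-based symbol lookup would use.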

Sorry for the very generic answer, but I thought it might be helpful for other folks too. You might want to ask a question about your target architecture, toolchain, and debugger on the mailing list to find out which of the above is available for your toolchain with edk2 for free. If you are interested in adding more features, the above is an outline of how it works, and you can always ask more detailed questions on the mailing list.

[1] https://github.com/tianocore/edk2/blob/master/MdePkg/Include/IndustryStandard/PeImage.h#L609

[2] The build kicks out an <FvName>.Fv.map file that contains the info you need to load symbols.
For example: Build/OvmfX64/DEBUG_XCODE5/FV/PEIFV.Fv.map
EFI_FV_TOTAL_SIZE = 0xe0000
EFI_FV_TAKEN_SIZE = 0x43728
EFI_FV_SPACE_SIZE = 0x9c8d8

PeiCore (Fixed Flash Address, BaseAddress=0x0000820120, EntryPoint=0x000082b012)
(GUID=52C05B14-0B98-496C-BC3B-04B50211D680 .textbaseaddress=0x0000820360 .databaseaddress=0x000082e420)



PcdPeim (Fixed Flash Address, BaseAddress=0x000082ed00, EntryPoint=0x0000831a62)
(GUID=9B3ADA4F-AE56-4C24-8DEA-F03B7558AE50 .textbaseaddress=0x000082ef40 .databaseaddress=0x0000834ac0)
...

[3] https://github.com/tianocore/edk2/blob/master/MdePkg/Include/Guid/DebugImageInfoTable.h

[4] UefiCpuPkg/Library/CpuExceptionHandlerLib/CpuExceptionCommon.c:130: PdbPointer = PeCoffLoaderGetPdbPointer ((VOID *) Pe32Data);

Thanks,

Andrew Fish

On Mar 24, 2021, at 8:25 AM, Ard Biesheuvel <ardb@kernel.org> wrote:

On Wed, 24 Mar 2021 at 16:18, Laszlo Ersek <lersek@redhat.com <mailto:lersek@redhat.com>> wrote:

(+Ard)

On 03/24/21 02:49, Hernandez Miramontes, Jose Miguel wrote:
Hi

Is anyone familiar with this file? (Edk2\BaseTools\Scripts\GccBase.lds)
I would like to understand more what it does and why it is needed.

When looking at the .efi output of genfw, it seems the .build-id section generated by gcc is discarded.
git-blame is your friend. It leads to commit 7fd5d619806d ("BaseTools
GCC: drop GNU notes section from EFI image", 2016-08-02).
The build-id is used by Linux distros when they ship debug symbols
with production shared libraries. The build-id permits the loader to
look up the correct file, and confirm that the versions match.

In EDK2, the ELF binary is only an intermediate artifact, and so I
fail to see why we would need build IDs here. If the ELF binary was
built with debug symbols, they will be in the binary itself, not in a
side loaded symbol file.



Re: GccBase.lds

Ard Biesheuvel
 

On Wed, 24 Mar 2021 at 16:56, Hernandez Miramontes, Jose Miguel
<jose.miguel.hernandez.miramontes@intel.com> wrote:

Yes, but if the section were not removed, the debugger would be able to find the debug symbols, like it does with PDB files.
We could leave the symbols in the binaries, but that would consume more space. Space in the SPI is usually a constraint.

I am not sure how your build environment works exactly, but when using
GCC to build for ARM, we already have this feature: the CodeView NB10
entry in the debug table in the PE/COFF metadata is populated with the
path to the .dll on the host from which the EFI executable was
generated. The debugger extracts this data to locate the file on the
build machine, and loads the symbols.

I am just wondering if it is possible to fix this so that the section is there in the final efi.
If I were to try it, where would I start?

I wouldn't be in favor of hacking this up: the ELF to PE/COFF
conversion in EDK2 is already highly dubious, and it would be much
better to stick with the PE/COFF semantics for carrying such metadata,
like the CodeView NB10 entry does.


Re: GccBase.lds

Ard Biesheuvel
 

On Wed, 24 Mar 2021 at 16:18, Laszlo Ersek <lersek@redhat.com> wrote:

(+Ard)

On 03/24/21 02:49, Hernandez Miramontes, Jose Miguel wrote:
Hi

Is anyone familiar with this file? (Edk2\BaseTools\Scripts\GccBase.lds)
I would like to understand more what it does and why it is needed.

When looking at the .efi output of genfw, it seems the .build-id section generated by gcc is discarded.
git-blame is your friend. It leads to commit 7fd5d619806d ("BaseTools
GCC: drop GNU notes section from EFI image", 2016-08-02).
The build-id is used by Linux distros when they ship debug symbols
with production shared libraries. The build-id permits the loader to
look up the correct file, and confirm that the versions match.

In EDK2, the ELF binary is only an intermediate artifact, and so I
fail to see why we would need build IDs here. If the ELF binary was
built with debug symbols, they will be in the binary itself, not in a
side loaded symbol file.


Re: GccBase.lds

Laszlo Ersek
 

(+Ard)

On 03/24/21 02:49, Hernandez Miramontes, Jose Miguel wrote:
Hi

Is anyone familiar with this file? (Edk2\BaseTools\Scripts\GccBase.lds)
I would like to understand more what it does and why it is needed.

When looking at the .efi output of genfw, it seems the .build-id section generated by gcc is discarded.
git-blame is your friend. It leads to commit 7fd5d619806d ("BaseTools
GCC: drop GNU notes section from EFI image", 2016-08-02).

Thanks
Laszlo

What would it take to preserve it?

When reading this, https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html,
I understood that this section is needed so GDB can find the symbols when they are extracted from the binary.


$ objdump -fh PeiCore.dll

PeiCore.dll:     file format elf32-i386
architecture: i386, flags 0x00000013:
HAS_RELOC, EXEC_P, HAS_SYMS
start address 0x00006089

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         000060bc  00000220  00000220  00000080  2**5
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  1 .data         00000348  000062e0  000062e0  00006140  2**5
                  CONTENTS, ALLOC, LOAD, RELOC, DATA
  2 .build-id     00000024  00006628  00006628  00006488  2**2
                  CONTENTS, READONLY

$ objdump -fh PeiCore.efi

PeiCore.efi:     file format pei-i386
architecture: i386, flags 0x0000010b:
HAS_RELOC, EXEC_P, HAS_DEBUG, D_PAGED
start address 0x00006089

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         000060c0  00000220  00000220  00000220  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .data         00000400  000062e0  000062e0  000062e0  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  2 .reloc        000001a0  000066e0  000066e0  000066e0  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA

Jose Miguel Hernandez Miramontes
BIOS Engineer
jose.miguel.hernandez.miramontes@intel.com
+1 (512) 362-1230
Intel Corporation







GccBase.lds

Hernandez Miramontes, Jose Miguel <jose.miguel.hernandez.miramontes@...>
 

Hi

Is anyone familiar with this file? (Edk2\BaseTools\Scripts\GccBase.lds)
I would like to understand more what it does and why it is needed.

When looking at the .efi output of genfw, it seems the .build-id section generated by gcc is discarded.

What would it take to preserve it?

When reading this, https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html,
I understood that this section is needed so GDB can find the symbols when they are extracted from the binary.

$ objdump -fh PeiCore.dll

PeiCore.dll:     file format elf32-i386
architecture: i386, flags 0x00000013:
HAS_RELOC, EXEC_P, HAS_SYMS
start address 0x00006089

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         000060bc  00000220  00000220  00000080  2**5
                  CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
  1 .data         00000348  000062e0  000062e0  00006140  2**5
                  CONTENTS, ALLOC, LOAD, RELOC, DATA
  2 .build-id     00000024  00006628  00006628  00006488  2**2
                  CONTENTS, READONLY

$ objdump -fh PeiCore.efi

PeiCore.efi:     file format pei-i386
architecture: i386, flags 0x0000010b:
HAS_RELOC, EXEC_P, HAS_DEBUG, D_PAGED
start address 0x00006089

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         000060c0  00000220  00000220  00000220  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .data         00000400  000062e0  000062e0  000062e0  2**2
                  CONTENTS, ALLOC, LOAD, DATA
  2 .reloc        000001a0  000066e0  000066e0  000066e0  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA

Jose Miguel Hernandez Miramontes
BIOS Engineer
jose.miguel.hernandez.miramontes@...
+1 (512) 362-1230
Intel Corporation

 


Re: Google Summer of Code 2021 interested student

Ayush Dwivedi <21cencturyayush@...>
 

I have one more question. Actually I am using Arch Linux 64 bit so finding dependencies took some time and in the documentation it says "Arch Linux is not officially supported or tested by the edk2 project at this time". Is it going to be a problem because of that?


Re: Google Summer of Code 2021 interested student

Ayush Dwivedi <21cencturyayush@...>
 

Thank you Nate.
I installed edkrepo using the instructions you gave. I ran edkrepo clone min Intel-MinPlatform and it set up my workspace. Since I had already set up the EDK2 build environment and installed the dependencies (such as gcc-5), I jumped to the Board Builds section and finally ran python build_bios.py -p TigerlakeURvp. I chose the Intel board TigerlakeURvp, which was in the list of MinPlatform supported boards. When the build completed, it produced TIGERLAKEURVP.fd in the Build directory. I have attached one screenshot as well. What should I do next?


Re: Google Summer of Code 2021 interested student

Nate DeSimone
 

Hi Ayush,



Great to meet you and welcome to the TianoCore project! Glad to hear you are interested! Sorry it has taken me a while to get back to you. I completely agree that the codebase is rather large and navigating it can be a challenge if you don’t know where to start. Let’s take a couple of steps back and talk about the architecture of UEFI system firmware in general. At a high level, there are three pieces of code needed to create a complete UEFI firmware solution: Core, Silicon, and Platform.



[UEFI Firmware Building Blocks]



While the core code (aka EDK II) has been open source for many years, historically the platform and silicon code have not been. EDK II leaves a lot of functionality to platform code, and the implementation of that is largely left as an exercise for the user. Historically only a QEMU example was given, in the form of OvmfPkg. The lack of open source platform code in particular has led to a large amount of inconsistency between implementations and has been a hindrance to the development of open source UEFI firmware in general.



MinPlatform intends to bring more consistency to the platform code by decomposing it into smaller more reusable pieces. Under the MinPlatform Architecture, platform code is further sub-divided into MinPlatform, Advanced Features, and Board. Under the new design, both Advanced Features and MinPlatform are reusable. Only the board code remains specific to the motherboard being used. The board code provides the glue that connects the generic advanced features with the chipset being used and any motherboard specific details required.



[image002.png]



How the silicon code is provided varies depending on the silicon vendor. Typically, silicon code is a mix of open source and closed source. We have open source silicon code from several vendors in https://github.com/tianocore/edk2-platforms/tree/master/Silicon. The closed source silicon code is provided using differing mechanisms that vary by vendor. For example, Intel provides the silicon code using FSP binaries (Firmware Support Package). AMD provides the equivalent using AGESA. ARM SoCs typically use ARM Trusted Firmware (TF-A) to provide silicon code.



To help you get an environment set up, you can try using the EdkRepo tool. I have pre-built binaries available here: https://github.com/nate-desimone/edk2-staging/releases/tag/EdkRepo-preview-2020-10-09



Installation instructions are available here: https://github.com/tianocore/edk2-edkrepo



After you have edkrepo installed, you can set up a MinPlatform development environment by running: edkrepo clone min Intel-MinPlatform



From there, you can try building one of the existing MinPlatform boards using https://github.com/tianocore/edk2-platforms/blob/master/Platform/Intel/build_bios.py Detailed build instructions are available here: https://github.com/tianocore/edk2-platforms/tree/master/Platform/Intel#windows-build-instructions



In general, the build system starts with a single DSC file which can (and often does) include several other DSC, INF, FDF files. This provides a full description of all the UEFI modules to build and any firmware volumes that need to be constructed. Every OpenBoardPkg has a *OpenBoardPkg.dsc file that provides this initial starting point for the build.
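
To get a first overview of what a DSC file pulls in, a rough sketch like the following can list its !include targets and the INF files named as components. It uses a deliberately simplified reading of the DSC syntax (no macro expansion, no conditional directives, no FDF handling), and the sample text and the ExampleOpenBoardPkg/ExamplePkg paths in it are made up.

```python
import re

INCLUDE_RE = re.compile(r"^\s*!include\s+(\S+)", re.IGNORECASE)
# A component line starts with an INF path; library class lines
# ("Name|Path.inf") and comment lines ("# ...") are intentionally skipped.
COMPONENT_RE = re.compile(r"^\s*([^\s#|]+\.inf)\b", re.IGNORECASE)

def summarize_dsc(text):
    """Return (included files, component INF paths) found in DSC text."""
    includes, components = [], []
    for line in text.splitlines():
        match = INCLUDE_RE.match(line)
        if match:
            includes.append(match.group(1))
            continue
        match = COMPONENT_RE.match(line)
        if match:
            components.append(match.group(1))
    return includes, components

SAMPLE = """\
[Defines]
  PLATFORM_NAME = ExampleBoard
  FLASH_DEFINITION = ExampleOpenBoardPkg/ExampleBoard.fdf
!include ExampleOpenBoardPkg/Include/Common.dsc

[LibraryClasses]
  BaseLib|MdePkg/Library/BaseLib/BaseLib.inf

[Components]
  MdeModulePkg/Core/Pei/PeiMain.inf
  # ExamplePkg/Disabled.inf
  ExamplePkg/Driver/Driver.inf {
  }
"""

includes, components = summarize_dsc(SAMPLE)
```

Running the same function over each included file in turn gives a crude picture of the whole tree that starts at the *OpenBoardPkg.dsc file.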



As far as skills needed, the most important one is a strong understanding and comfort with the C programming language.



Finally, given that there seems to be a large amount of interest in the QemuOpenBoardPkg task, I would recommend that anyone interested in that task write up another proposal for an alternate task that is also interesting. It seems likely that we will get more than one application for the QemuOpenBoardPkg task and we can only pick one.



Hope this helps and welcome to the project!



With Best Regards,

Nate

-----Original Message-----
From: discuss@edk2.groups.io <discuss@edk2.groups.io> On Behalf Of
Ayush Dwivedi
Sent: Wednesday, March 17, 2021 1:10 PM
To: discuss@edk2.groups.io
Subject: [edk2-discuss] Google Summer of Code 2021 interested student
Hello everyone,
My name is Ayush Dwivedi. I am currently studying Computer Science and
Engineering and am in my 3rd year of B.Tech program. I am interested in
operating systems, firmwares and game programming. I have written some
personal projects in C and C++ programming languages and have
experimented with x86_64 assembly. Recently I have been using QEMU with
the OVMF UEFI firmware which I had built from source using edk2. I have
tried to write my own EFI applications(it was an attempt to understand how
UEFI differs from BIOS). I wish to be a part of the TianoCore community and
want to contribute. I would like to know what skills and knowledge is needed
for the task "MinPlatform Qemu Support". It is explained that we are needed
to port MinPlatform to QEMU so as of now I have started looking into the
OvmfPkg(since it already runs on QEMU) and Platform/Intel/MinPlatformPkg
but the source tree for edk2 and edk2-platform is huge so I am confused on
how and where I should start. I am looking forward to guidance from the
community.
Thank you for your precious time.
Regards,
Ayush Dwivedi


Re: [edk2-devel] [edk2-discuss] Google Summer of Code Interested Student

Andrew Fish
 

On Mar 22, 2021, at 11:31 AM, Nate DeSimone <nathaniel.l.desimone@intel.com> wrote:

Hi Andrew,

I tested VT_UTF8 on the macOS Terminal software and I can confirm that VT_UTF8 renders nicely. See the attached screenshot.

Nate,

Yes it is close to VT_UTF8, but not quite the same. I was looking back through the driver and I’d forgotten how similar it really is.

Symbols used in table below
===========================
ESC = 0x1B
CSI = 0x9B
DEL = 0x7f
^ = CTRL

+=========+======+===========+==========+==========+=============+
|         | EFI  | EFI 1.10  |          |          |             |
|         | Scan |           | VT100+   |          |             |
| KEY     | Code | PC ANSI   | VTUTF8   | VT100    | xterm-color |
+=========+======+===========+==========+==========+=============+
| NULL    | 0x00 |           |          |          |             |
| UP      | 0x01 | ESC [ A   | ESC [ A  | ESC [ A  | ESC [ A     |
| DOWN    | 0x02 | ESC [ B   | ESC [ B  | ESC [ B  | ESC [ B     |
| RIGHT   | 0x03 | ESC [ C   | ESC [ C  | ESC [ C  | ESC [ C     |
| LEFT    | 0x04 | ESC [ D   | ESC [ D  | ESC [ D  | ESC [ D     |
| HOME    | 0x05 | ESC [ H   | ESC h    | ESC [ H  | ESC [ H     |
|         |      |           |          |          | ^A          |
| END     | 0x06 | ESC [ F   | ESC k    | ESC [ K  | ESC [ F     |
| INSERT  | 0x07 | ESC [ @   | ESC +    | ESC [ @  | ^E          |
|         |      | ESC [ L   |          | ESC [ L  |             |
| DELETE  | 0x08 | ESC [ X   | ESC -    | ESC [ P  | 0x7f        |
| PG UP   | 0x09 | ESC [ I   | ESC ?    | ESC [ V  | ESC [ 5 ~   |
|         |      |           |          | ESC [ ?  | ^P          |
| PG DOWN | 0x0A | ESC [ G   | ESC /    | ESC [ U  | ESC [ 6 ~   |
|         |      |           |          | ESC [ /  | ^N          |
| F1      | 0x0B | ESC [ M   | ESC 1    | ESC O P  | ESC O P     |
| F2      | 0x0C | ESC [ N   | ESC 2    | ESC O Q  | ESC O Q     |
| F3      | 0x0D | ESC [ O   | ESC 3    | ESC O w  | ESC O R     |
| F4      | 0x0E | ESC [ P   | ESC 4    | ESC O x  | ESC O S     |
| F5      | 0x0F | ESC [ Q   | ESC 5    | ESC O t  | ESC [ 1 5 ~ |
| F6      | 0x10 | ESC [ R   | ESC 6    | ESC O u  | ESC [ 1 7 ~ |
| F7      | 0x11 | ESC [ S   | ESC 7    | ESC O q  | ESC [ 1 8 ~ |
| F8      | 0x12 | ESC [ T   | ESC 8    | ESC O r  | ESC [ 1 9 ~ |
| F9      | 0x13 | ESC [ U   | ESC 9    | ESC O p  | ESC [ 2 0 ~ |
| F10     | 0x14 | ESC [ V   | ESC 0    | ESC O M  | ESC [ 2 1 ~ |
| Escape  | 0x17 | ESC       | ESC      | ESC      | ESC         |
+=========+======+===========+==========+==========+=============+
| F11     | 0x15 |           |          |          | ESC [ 2 3 ~ |
| F12     | 0x16 |           |          |          | ESC [ 2 4 ~ |
+=========+======+===========+==========+==========+=============+
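
As an illustration of how a driver could consume the xterm-color column, here is a small lookup sketch covering a few rows of the table. The dictionary and function names are hypothetical and this is not the TerminalDxe implementation.

```python
ESC = b"\x1b"

# EFI scan codes for a subset of the keys in the table.
SCAN_CODE = {
    "UP": 0x01, "DOWN": 0x02, "RIGHT": 0x03, "LEFT": 0x04,
    "HOME": 0x05, "END": 0x06, "PG UP": 0x09, "PG DOWN": 0x0A,
    "F1": 0x0B, "F5": 0x0F,
}

# xterm-color escape sequences for the same keys (spaces in the table
# are only there for readability and are not part of the sequence).
XTERM_COLOR = {
    ESC + b"[A": "UP", ESC + b"[B": "DOWN", ESC + b"[C": "RIGHT", ESC + b"[D": "LEFT",
    ESC + b"[H": "HOME", ESC + b"[F": "END",
    ESC + b"[5~": "PG UP", ESC + b"[6~": "PG DOWN",
    ESC + b"OP": "F1", ESC + b"[15~": "F5",
}

def decode_key(sequence):
    """Map a complete escape sequence to (key name, EFI scan code), or None."""
    key = XTERM_COLOR.get(sequence)
    return None if key is None else (key, SCAN_CODE[key])
```

A real terminal driver additionally has to assemble the sequence byte by byte and time out on a bare ESC, which this sketch leaves out.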


I’m wondering if we could pick the default terminal type based on the toolchain?

Thanks,

Andrew Fish

Thanks,
Nate

On 3/17/21, 9:02 AM, "Andrew Fish" <afish@apple.com <mailto:afish@apple.com>> wrote:

If we are mentioning terminal types the default terminal type on a Mac is xterm-256color. So that is going to be the default when people run OVMF on a Mac. So it would be nice if we can add that. I can help out with anything xterm-256color related.

Thanks,

Andrew Fish

On Mar 16, 2021, at 8:23 AM, Laszlo Ersek <lersek@redhat.com> wrote:

Hi Nate,

(adding Leif and Ard)

On 03/13/21 03:52, Desimone, Nathaniel L wrote:
I've created a new wiki page for this task with all the information I
have gathered thus far. I've done some more experimentation and found
that there are several newer terminal emulators that don't support
DEC Special Graphics so I've reduced the number of modes where DEC
Special Graphics should be preferred. Laszlo, if you could take a
look at the terminal type matrix I created that would be very
helpful.

https://github.com/tianocore/tianocore.github.io/wiki/Tasks-Terminal-driver-improvements
(

My background:

I settled on plain (non-UTF-8) xterm around 1998, and have been using it
ever since. Whenever something was off, I always tried to hammer the
application into conformance with my particular xterm setup, rather than
the other way around. I also have some quirky terminal settings -- for
me, "backspace" generates ^H / keycode 22 (stty sets erase to ^H),
"delete" generates keycode 119, and there's no "rubout". I still don't
use UTF-8 (I use latin2).

)

* Regarding ArmVirtPkg, I stick with the default TTY_TERMINAL=FALSE
setting (which means VT-100). Using that setting, I see the following
kind of "ASCII approximation" for box drawing:

/------------------------------------------------------------------------------\
| Boot Manager |
\------------------------------------------------------------------------------/

I'm really happy with this, as I don't care much for nice-looking
boxes; instead I prefer portability.

(NB: this seems to disagree with your "Current Behavior (Which is
wrong)" line for VT100, as it suggests CP437. That's not what I'm
seeing with VT100.)

TTY_TERMINAL=TRUE would mainly affect backspace / delete I think -- as
far as I recall, that's why I asked Roy not to make TTY_TERMINAL=TRUE
the default, in 2015:

http://mid.mail-archive.com/555458DB.3090602@redhat.com
http://mid.mail-archive.com/CAFECyb_E+bGZt5xv7QhRqyD0jX=AzoEMw7VW_tjZr+E=sQf8ww@mail.gmail.com

(I'd like to CC Roy, but I can't tell if he's now working for Linaro,
Cavium, HPE, Marvell, or another company.)

* Regarding OvmfPkg, currently PC_ANSI is hard-coded, and for me it
looks like this:

ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄż
ł Boot Manager ł
ŔÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄŮ

Obviously I'd much prefer if I got the simple ASCII approximation here
as well.

* Whether VT100 and/or PC_ANSI and/or TTY_TERM are *officially* supposed
to use DEC Special Graphics, I can't tell.

I know what my preferences are:

- the current BackSpace and Delete mappings (which work fine for me
with both VT100 and PC_ANSI, but *not* with TTY_TERM),

- and the most primitive ASCII mapping (no special graphics, no UTF-8
sequences, etc). I really like a super dumb terminal, where taking
simple "ASCII screenshots" (and pasting them into plaintext emails!)
is *trivial*.

... Looking at your "Expected Behavior" table, there is only one line
left with "poor man's ASCII" -- namely, TTY_TERM. Unfortunately,
TTY_TERM breaks my BackSpace / Delete settings :(

* In summary, I'd prefer if (a) VT100 stayed as-is (using "poor man's
ASCII", as seen in ArmVirtPkg), and (b) if OVMF used *that* VT100,
rather than PC_ANSI, by default.

Thanks!
Laszlo











<macos_terminal_vt_utf8.png>


Re: Windows guest fails to boot into recovery mode due to commit 5267926

Andrew Fish
 

On Mar 22, 2021, at 8:16 AM, Laszlo Ersek <lersek@redhat.com> wrote:

On 03/19/21 20:43, Annie Li wrote:
Hello Laszlo,

I was thinking of CR2 because it holds the address that caused this exception, and we are supposed to find the information about the page in which that address is located, no?
CR2 is the address whose access faulted; I mentioned CR3 because Andrew
wrote "points the 1st level of the page tables". I understood that
expression as the point where you'd start walking the page tables
manually -- and that "root pointer" is in CR3.
Sorry for the typo. Yes I meant use CR3 to walk the page tables to find the CR2 address. I was not sure if QEMU had a tool for that?

I know for our internal lldb scripts I wrote a debugger command to walk the page tables and decode things. I was not sure if QEMU had an easy way to do that.
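
As a rough illustration of what such a walk starts from, here is a minimal sketch that splits a faulting address into its table indices. It assumes x86-64 4-level paging with 4 KiB pages; the function names are hypothetical, and the CR2/CR3 values are the ones from the register dump quoted in this thread.

```python
# Assumes x86-64 4-level paging with 4 KiB pages (an assumption; large pages
# end the walk one or two levels earlier).
LEVELS = (("PML4", 39), ("PDPT", 30), ("PD", 21), ("PT", 12))

def table_indices(vaddr: int) -> dict:
    """Split a virtual address into four 9-bit table indices and a page offset."""
    out = {name: (vaddr >> shift) & 0x1FF for name, shift in LEVELS}
    out["offset"] = vaddr & 0xFFF
    return out

def entry_address(table_base: int, index: int) -> int:
    """Physical address of an 8-byte entry in a page-aligned table."""
    return (table_base & ~0xFFF) + index * 8

if __name__ == "__main__":
    cr2, cr3 = 0x3F2E2010, 0x3F401000  # from the register dump in this thread
    idx = table_indices(cr2)
    print({name: hex(value) for name, value in idx.items()})
    print("PML4 entry at", hex(entry_address(cr3, idx["PML4"])))
```

Each entry read at one level supplies the table base for the next level, so only the first entry address can be computed without reading guest memory.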

Thanks,

Andrew Fish

Thanks
Laszlo


Thanks
Annie

-----Original Message-----
From: Laszlo Ersek [mailto:lersek@redhat.com]
Sent: Friday, March 19, 2021 2:11 PM
To: Andrew Fish <afish@apple.com>; Annie Li <annie.li@oracle.com>
Cc: Yao, Jiewen <jiewen.yao@intel.com>; discuss@edk2.groups.io; Wang, Jian J <jian.j.wang@intel.com>; Aaron Young <aaron.young@oracle.com>
Subject: Re: [edk2-discuss] Windows guest fails to boot into recovery mode due to commit 5267926

On 03/19/21 18:41, Andrew Fish wrote:
> Annie,
>
> CR2 points the 1st level of the page tables. Those entries point to other page tables, so you kind of have to walk it by hand.

I think that's a typo: it should be CR3.

But, I agree -- you can use QEMU monitor commands to read RAM words, and walk the page tables manually. (I figured I could help with this, but I couldn't reproduce the issue locally. I used the manual Recovery entry -- click "Reboot" with Shift held down. For me the Windows VM just entered recovery fine.)
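
A minimal sketch of one step of such a manual walk, assuming the QEMU human monitor's `xp` command is used to read one 8-byte entry of guest physical memory. The helper names and the sample entry value are hypothetical; the bit positions follow the x86-64 paging-structure entry format.

```python
# Flag bits of an x86-64 paging-structure entry.
PRESENT, WRITABLE, USER, PAGE_SIZE, NX = 1 << 0, 1 << 1, 1 << 2, 1 << 7, 1 << 63
ADDR_MASK = 0x000F_FFFF_FFFF_F000  # bits 51:12: next-level table or 4 KiB frame

def xp_command(table_base: int, index: int) -> str:
    """Monitor command that reads one 8-byte entry from guest physical memory."""
    return f"xp /1gx {(table_base & ~0xFFF) + index * 8:#x}"

def decode_entry(entry: int) -> dict:
    """Decode the fields most relevant to a page-fault investigation."""
    return {
        "present": bool(entry & PRESENT),
        "writable": bool(entry & WRITABLE),
        "user": bool(entry & USER),
        # Only meaningful at the PDPT and PD levels (1 GiB / 2 MiB pages).
        "large_page": bool(entry & PAGE_SIZE),
        "nx": bool(entry & NX),
        # Only meaningful when the entry does not map a large page.
        "next": entry & ADDR_MASK,
    }

if __name__ == "__main__":
    print(xp_command(0x3F401000, 0))            # first step, starting from CR3
    print(decode_entry(0x800000003F2E2003))     # hypothetical entry value
```

The walk stops at the first entry whose "present" bit is clear or whose "large_page" bit is set; otherwise "next" is the base of the table to read at the following level.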

Thanks,
Laszlo


> https://wiki.osdev.org/Paging

Thanks,

Andrew Fish

On Mar 19, 2021, at 9:56 AM, Annie Li <annie.li@oracle.com> wrote:

Hi Jiewen,

In the DumpCpuContext function in ArchExceptionHandler.c, the exception details are taken from either "SystemContextX64->ExceptionData" or "SystemContextX64.xxx". I am wondering how I can dump the page info there. Is there any related info that can be retrieved from CR2? Can you enlighten me a little bit?

Thanks
Annie

-----Original Message-----
From: Yao, Jiewen [mailto:jiewen.yao@intel.com]
Sent: Thursday, March 18, 2021 8:37 PM
To: discuss@edk2.groups.io; Annie Li <annie.li@oracle.com>; Laszlo
Ersek <lersek@redhat.com>
Cc: Wang, Jian J <jian.j.wang@intel.com>; Andrew Fish
<afish@apple.com>; Aaron Young <aaron.young@oracle.com>; Yao, Jiewen
<jiewen.yao@intel.com>
Subject: RE: [edk2-discuss] Windows guest fails to boot into recovery
mode due to commit 5267926

Hi Annie,
I added some of my thoughts in the Bugzilla:
https://bugzilla.tianocore.org/show_bug.cgi?id=3269

If you can dump paging structure info for further analysis, we can help to check.


-----Original Message-----
From: discuss@edk2.groups.io <discuss@edk2.groups.io> On Behalf Of
annie li
Sent: Friday, March 19, 2021 3:27 AM
To: Laszlo Ersek <lersek@redhat.com>; discuss@edk2.groups.io
Cc: Wang, Jian J <jian.j.wang@intel.com>; Andrew Fish
<afish@apple.com>; Aaron Young <aaron.young@oracle.com>
Subject: Re: [edk2-discuss] Windows guest fails to boot into
recovery mode due to commit 5267926

Hello Laszlo,

In my previous email, the exception was reproduced with a pretty old
code base, from which I started bisecting the commits. This time I
reproduced the issue with the 'stable/202011' branch of upstream. All
the logs I am collecting are from this code base (75ab038).
Since the logs are pretty big overall, I'll attach all the data you
requested to this bug (https://bugzilla.tianocore.org/show_bug.cgi?id=3269).

I dumped the registers with qmp-regdump, and the result (regdump) is
uploaded to this bug. If this log doesn't suffice, can you please suggest the way you prefer?
The objdump output is uploaded as well, along with the details of the
page fault exception; please check the details there.

Thanks
Annie

-----Original Message-----
From: Laszlo Ersek [mailto:lersek@redhat.com]
Sent: Thursday, March 18, 2021 9:23 AM
To: Annie Li <annie.li@oracle.com>; discuss@edk2.groups.io
Cc: jian.j.wang@intel.com; Andrew Fish <afish@apple.com>
Subject: Re: Windows guest fails to boot into recovery mode due to
commit
5267926

On 03/18/21 02:48, Annie Li wrote:
Hello,

I ran into a Windows boot failure (a page fault exception), and
narrowed it down to the following patch:
MdeModulePkg/DxeIpl: support more NX related PCDs
https://github.com/tianocore/edk2/commit/5267926134d17e86672b84fd57b438f05ffa68e1

This issue always happens after QMP is terminated by <ctrl-C>
twice; see the following steps.

1. Boot Windows VM up, and <ctrl-C> to exit the QMP

2. Repeat 1

3. Boot Windows VM, and this page fault issue happens. (Note: Windows
should boot into recovery mode in this round, and this is due to the
previous two consecutive boot failures, see
https://docs.microsoft.com/en-us/windows-hardware/manufacture/desktop/windows-recovery-environment--windows-re--technical-reference#entry-points-into-winre )

During the above three Windows boots, the values of the following
variables are always the same:
PcdSetNxForStack 0
PcdDxeNxMemoryProtectionPolicy 0
PcdImageProtectionPolicy 2

However, the Windows guest fails to boot into recovery mode in the
3rd round due to the patch above (5267926). I modified the return value to
"PcdGetBool (PcdSetNxForStack)" in function "IsEnableNonExecNeeded"
in MdeModulePkg/Core/DxeIplPeim/X64/VirtualMemory.c, and the page fault
issue is gone with this change. The patch (5267926) is for fixing bug
https://bugzilla.tianocore.org/show_bug.cgi?id=1116 , where the comments show
that PcdImageProtectionPolicy also needs to enable NXE. But this does
cause the page fault exception in this scenario. Any suggestions?
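
To make the difference between the two variants concrete, here is a small Python model of the decision described above. It is a sketch, not the edk2 C code: the function names are hypothetical, the "after" form is an assumption based on the patch title and on the three PCDs listed in this message, and any CPU capability check the real function may perform is left out.

```python
def nx_needed_local_change(set_nx_for_stack: bool) -> bool:
    """Models the local modification: only PcdSetNxForStack decides."""
    return set_nx_for_stack

def nx_needed_after_patch(set_nx_for_stack: bool,
                          dxe_nx_policy: int,
                          image_policy: int) -> bool:
    """Assumed shape after the patch: any NX-related PCD requests NXE."""
    return set_nx_for_stack or dxe_nx_policy != 0 or image_policy != 0

if __name__ == "__main__":
    # PCD values reported in this message.
    stack, dxe_policy, image_policy = False, 0, 2
    print("local change:", nx_needed_local_change(stack))
    print("after patch: ", nx_needed_after_patch(stack, dxe_policy, image_policy))
```

With the reported values, the two variants disagree only because PcdImageProtectionPolicy is non-zero.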

The page fault exception is pasted here,


!!!! X64 Exception Type - 0E(#PF - Page-Fault) CPU Apic ID - 00000000 !!!!
ExceptionData - 0000000000000009 I:0 R:1 U:0 W:0 P:1 PK:0 SS:0 SGX:0
RIP - 000000003E0A7C75, CS - 0000000000000038, RFLAGS - 0000000000010202
RAX - 8000000000000003, RCX - 0000000000000001, RDX - 0000000001040001
RBX - 0000000000000001, RSP - 00000000001A6AA0, RBP - 0000000001040001
RSI - 000000003F2E2010, RDI - 0000000000000001
R8 - 0000000000000000, R9 - 000000003E0AEC90, R10 - 0000FFFFFFFFF000
R11 - 00000000001A6E90, R12 - 0000000000000000, R13 - 000000003E0AEC90
R14 - 00000000001A6B28, R15 - 00000000001AB000
DS - 0000000000000030, ES - 0000000000000030, FS - 0000000000000030
GS - 0000000000000030, SS - 0000000000000030
CR0 - 0000000080010033, CR2 - 000000003F2E2010, CR3 - 000000003F401000
CR4 - 0000000000040668, CR8 - 0000000000000000
DR0 - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
DR3 - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
GDTR - 000000003F1EE698 0000000000000047, LDTR - 0000000000000000
IDTR - 000000003ECCA018 0000000000000FFF, TR - 0000000000000000
FXSAVE_STATE - 00000000001A6700
!!!! Find image based on IP(0x3E0A7C75) /builddir/build/BUILD/edk2-1.4.3/Build/OvmfX64/DEBUG_GCC48/X64/MdeModulePkg/Universal/Console/TerminalDxe/TerminalDxe/DEBUG/TerminalDxe.dll (ImageBase=000000003E0A5000, EntryPoint=000000003E0A86E8) !!!!
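
The flag letters on the ExceptionData line can be reproduced from the raw error code. A minimal sketch, assuming the standard x86 page-fault error code layout; the function name is hypothetical, and the labels are the ones printed in the dump.

```python
# Bit positions of the x86 page-fault error code, labeled as in the dump.
PF_BITS = (
    ("P", 0),     # 1: fault on a present page, 0: page not present
    ("W", 1),     # 1: write access
    ("U", 2),     # 1: user-mode access
    ("R", 3),     # 1: reserved bit set in a paging-structure entry
    ("I", 4),     # 1: instruction fetch
    ("PK", 5),    # 1: protection-key violation
    ("SS", 6),    # 1: shadow-stack access
    ("SGX", 15),  # 1: SGX-related fault
)

def decode_pf_error(code: int) -> dict:
    """Return each flag of a page-fault error code as 0 or 1."""
    return {name: (code >> bit) & 1 for name, bit in PF_BITS}

if __name__ == "__main__":
    flags = decode_pf_error(0x9)  # ExceptionData from the dump above
    print(" ".join(f"{name}:{value}" for name, value in flags.items()))
```

For 0x9 only P and R are set, which matches the dump: a supervisor-mode read that found a reserved bit set in some paging-structure entry. Which entry and which bit is not something the dump itself shows.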

In addition to what Andrew said, I suggest the following:

(1) Please rebuild OVMF *locally*, using the same edk2 tree, and the
same toolchain, and the same "build" flags.

(2) Reproduce the issue, capture the register dump.

(3) Run the following command:

objdump -f -S Build/OvmfX64/DEBUG_GCC48/X64/MdeModulePkg/Universal/Console/TerminalDxe/TerminalDxe/DEBUG/TerminalDxe.debug


The point of this exercise is to reproduce the issue with an OVMF
build for which you have a matching "TerminalDxe.debug" file.
Once you do that, you can run "objdump" on the ".debug" file, and
get a disassembly of the TerminalDxe driver, interleaved with the C
language source code.

Then, we can do two things:

- we can verify whether (EntryPoint - ImageBase), from the register
dump, matches the (relative) "start address" that "objdump -f"
reports,

- we can take the crash offset (RIP - ImageBase), from the register
dump, and use that offset into the "objdump -S" disassembly, to
narrow down what the terminal driver may have been doing to trigger the crash.
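
The two subtractions above can be sketched as follows, using the values from the register dump in this thread. The function name is hypothetical, and whether the resulting offsets line up directly with the addresses printed by objdump depends on how the ".debug" file was linked, which is exactly what the first check is meant to confirm.

```python
def image_offsets(image_base: int, entry_point: int, rip: int) -> dict:
    """Offsets of the entry point and of the faulting RIP relative to ImageBase."""
    if rip < image_base or entry_point < image_base:
        raise ValueError("address lies below ImageBase")
    return {
        "entry_offset": entry_point - image_base,
        "crash_offset": rip - image_base,
    }

if __name__ == "__main__":
    offsets = image_offsets(
        image_base=0x3E0A5000,   # ImageBase from the dump
        entry_point=0x3E0A86E8,  # EntryPoint from the dump
        rip=0x3E0A7C75,          # RIP from the dump
    )
    print({name: hex(value) for name, value in offsets.items()})
```

The entry offset is the value to compare with the "start address" from "objdump -f"; the crash offset is the one to look up in the "objdump -S" listing.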

It's not necessarily the terminal driver's fault that we encounter a
crash, but knowing what TerminalDxe was up to might shed light on
the actual reason. It's of course also possible that TerminalDxe *is* at fault. We'll see.

If possible, please post:
- your precise edk2 version (if you have local patches, it would be
best to reproduce with an upstream-only tree),
- your full firmware log (feel free to compress it),
- the register dump from serial,
- the objdump (disassembly) output (feel free to compress it).

Thanks,
Laszlo





Re: [edk2-devel] [edk2-discuss] Google Summer of Code Interested Student

Nate DeSimone
 

Hi Andrew,

I tested VT_UTF8 in the macOS Terminal application and I can confirm that VT_UTF8 renders nicely. See the attached screenshot.

Thanks,
Nate

On 3/17/21, 9:02 AM, "Andrew Fish" <afish@apple.com> wrote:

Since we are discussing terminal types: the default terminal type on a Mac is xterm-256color, so that is what people will get when they run OVMF on a Mac. It would be nice if we could add that. I can help out with anything xterm-256color related.

Thanks,

Andrew Fish

On Mar 16, 2021, at 8:23 AM, Laszlo Ersek <lersek@redhat.com> wrote:
>
> Hi Nate,
>
> (adding Leif and Ard)
>
> On 03/13/21 03:52, Desimone, Nathaniel L wrote:
>> I've created a new wiki page for this task with all the information I
>> have gathered thus far. I've done some more experimentation and found
>> that there are several newer terminal emulators that don't support
>> DEC Special Graphics so I've reduced the number of modes where DEC
>> Special Graphics should be preferred. Laszlo, if you could take a
>> look at the terminal type matrix I created that would be very
>> helpful.
>>
>> https://github.com/tianocore/tianocore.github.io/wiki/Tasks-Terminal-driver-improvements
>
> (
>
> My background:
>
> I settled on plain (non-UTF-8) xterm around 1998, and have been using it
> ever since. Whenever something was off, I always tried to hammer the
> application into conformance with my particular xterm setup, rather than
> the other way around. I also have some quirky terminal settings -- for
> me, "backspace" generates ^H / keycode 22 (stty sets erase to ^H),
> "delete" generates keycode 119, and there's no "rubout". I still don't
> use UTF-8 (I use latin2).
>
> )
>
> * Regarding ArmVirtPkg, I stick with the default TTY_TERMINAL=FALSE
> setting (which means VT-100). Using that setting, I see the following
> kind of "ASCII approximation" for box drawing:
>
> /------------------------------------------------------------------------------\
> | Boot Manager |
> \------------------------------------------------------------------------------/
>
> I'm really happy with this, as I don't care much for nice-looking
> boxes; instead I prefer portability.
>
> (NB: this seems to disagree with your "Current Behavior (Which is
> wrong)" line for VT100, as it suggests CP437. That's not what I'm
> seeing with VT100.)
>
> TTY_TERMINAL=TRUE would mainly affect backspace / delete I think -- as
> far as I recall, that's why I asked Roy not to make TTY_TERMINAL=TRUE
> the default, in 2015:
>
> http://mid.mail-archive.com/555458DB.3090602@redhat.com
> http://mid.mail-archive.com/CAFECyb_E+bGZt5xv7QhRqyD0jX=AzoEMw7VW_tjZr+E=sQf8ww@mail.gmail.com
>
> (I'd like to CC Roy, but I can't tell if he's now working for Linaro,
> Cavium, HPE, Marvell, or another company.)
>
> * Regarding OvmfPkg, currently PC_ANSI is hard-coded, and for me it
> looks like this:
>
> ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄż
> ł Boot Manager ł
> ŔÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄŮ
>
> Obviously I'd much prefer if I got the simple ASCII approximation here
> as well.
>
> * Whether VT100 and/or PC_ANSI and/or TTY_TERM are *officially* supposed
> to use DEC Special Graphics, I can't tell.
>
> I know what my preferences are:
>
> - the current BackSpace and Delete mappings (which work fine for me
> with both VT100 and PC_ANSI, but *not* with TTY_TERM),
>
> - and the most primitive ASCII mapping (no special graphics, no UTF-8
> sequences, etc). I really like a super dumb terminal, where taking
> simple "ASCII screenshots" (and pasting them into plaintext emails!)
> is *trivial*.
>
> ... Looking at your "Expected Behavior" table, there is only one line
> left with "poor man's ASCII" -- namely, TTY_TERM. Unfortunately,
> TTY_TERM breaks my BackSpace / Delete settings :(
>
> * In summary, I'd prefer if (a) VT100 stayed as-is (using "poor man's
> ASCII", as seen in ArmVirtPkg), and (b) if OVMF used *that* VT100,
> rather than PC_ANSI, by default.
>
> Thanks!
> Laszlo
>
>
>
>
>
>


Re: Windows guest fails to boot into recovery mode due to commit 5267926

annie li
 

Ah, thanks for the clarification.

Thanks
Annie






Re: Windows guest fails to boot into recovery mode due to commit 5267926

Laszlo Ersek
 

On 03/19/21 20:43, Annie Li wrote:
> Hello Laszlo,
>
> I was thinking of CR2 because it holds the address that caused this exception, and we are supposed to find the information about the page that contains that address, no?

CR2 is the address whose access faulted; I mentioned CR3 because Andrew
wrote "points the 1st level of the page tables". I understood that
expression as the point where you'd start walking the page tables
manually -- and that "root pointer" is in CR3.

Thanks
Laszlo


Thanks
Annie

-----Original Message-----
From: Laszlo Ersek [mailto:lersek@redhat.com]
Sent: Friday, March 19, 2021 2:11 PM
To: Andrew Fish <afish@apple.com>; Annie Li <annie.li@oracle.com>
Cc: Yao, Jiewen <jiewen.yao@intel.com>; discuss@edk2.groups.io; Wang, Jian J <jian.j.wang@intel.com>; Aaron Young <aaron.young@oracle.com>
Subject: Re: [edk2-discuss] Windows guest fails to boot into recovery mode due to commit 5267926

On 03/19/21 18:41, Andrew Fish wrote:
Annie,

CR2 points the 1st level of the page tables. Those entries point to other page tables, so you kind of have to walk it by hand.
I think that's a typo: it should be CR3.

But, I agree -- you can use QEMU monitor commands to read RAM words, and walk the page tables manually. (I figured I could help with this, but I couldn't reproduce the issue locally. I used the manual Recovery entry
-- click "Reboot" with Shift held down. For me the Windows VM just entered recovery fine.)

Thanks,
Laszlo


https://urldefense.com/v3/__https://wiki.osdev.org/Paging__;!!GqivPVa7
Brio!Ns258pdi0mlsnSN0oODh9wsYUe4PNDNF6avU8uN1wSySF8ktrFBAJma0qi5hWw$
<https://urldefense.com/v3/__https://wiki.osdev.org/Paging__;!!GqivPVa
7Brio!Ns258pdi0mlsnSN0oODh9wsYUe4PNDNF6avU8uN1wSySF8ktrFBAJma0qi5hWw$

Thanks,

Andrew Fish

On Mar 19, 2021, at 9:56 AM, Annie Li <annie.li@oracle.com> wrote:

Hi Jiewen,

In DumpCpuContext function in ArchExceptionHandler.c, the exception details are gotten from either "SystemContextX64->ExceptionData" or "SystemContextX64.xxx". I am wondering how I can dump the page info there? Are there some related info that can be retrieved from CR2? can you enlighten me a little bit?

Thanks
Annie

-----Original Message-----
From: Yao, Jiewen [mailto:jiewen.yao@intel.com]
Sent: Thursday, March 18, 2021 8:37 PM
To: discuss@edk2.groups.io; Annie Li <annie.li@oracle.com>; Laszlo
Ersek <lersek@redhat.com>
Cc: Wang, Jian J <jian.j.wang@intel.com>; Andrew Fish
<afish@apple.com>; Aaron Young <aaron.young@oracle.com>; Yao, Jiewen
<jiewen.yao@intel.com>
Subject: RE: [edk2-discuss] Windows guest fails to boot into recovery
mode due to commit 5267926

Hi Arie
I added some of my thought in the Bugzilla. -
https://urldefense.com/v3/__https://bugzilla.tianocore.org/show_bug.c
gi?id=3269__;!!GqivPVa7Brio!JMob8PcNWJxj_RZSIWy7iwhqFFhIYSwtnR_8i0X6V
-UzBkycx-iObkffqGNBrw$

If you can dump paging structure info for further analysis, we can help to check.


-----Original Message-----
From: discuss@edk2.groups.io <discuss@edk2.groups.io> On Behalf Of
annie li
Sent: Friday, March 19, 2021 3:27 AM
To: Laszlo Ersek <lersek@redhat.com>; discuss@edk2.groups.io
Cc: Wang, Jian J <jian.j.wang@intel.com>; Andrew Fish
<afish@apple.com>; Aaron Young <aaron.young@oracle.com>
Subject: Re: [edk2-discuss] Windows guest fails to boot into
recovery mode due to commit 5267926

Hello Laszlo,

In my previous email, the exception is reproduced with pretty old
code base from where I started bisecting the comments. This time I
reproduce this issue with the code of branch 'stable/202011' of
upstream. All the log I am collecting is from this code base(75ab038).
Since the overall size of all log is pretty big, I'll attach all the
data you required in to this bug(https://urldefense.com/v3/__https://bugzilla.tianocore.org/show_bug.cgi?id=3269__;!!GqivPVa7Brio!JMob8PcNWJxj_RZSIWy7iwhqFFhIYSwtnR_8i0X6V-UzBkycx-iObkffqGNBrw$ ).

I dump the register by qmp-regdump, and the result(regdump) is
uploaded into this bug. If this log doesn't suffice, can you please suggest the way you prefer?
The objdump is uploaded, as well as the details of page fault
exception, please check the details there.

Thanks
Annie

-----Original Message-----
From: Laszlo Ersek [mailto:lersek@redhat.com]
Sent: Thursday, March 18, 2021 9:23 AM
To: Annie Li <annie.li@oracle.com>; discuss@edk2.groups.io
Cc: jian.j.wang@intel.com; Andrew Fish <afish@apple.com>
Subject: Re: Windows guest fails to boot into recovery mode due to
commit
5267926

On 03/18/21 02:48, Annie Li wrote:
Hello,

I ran into a windows booting failure issue(a page fault exception),
and narrow down it to the following patch,
MdeModulePkg/DxeIpl: support more NX related PCDs
https://urldefense.com/v3/__https://github.com/tianocore/edk2/commi
t
/5
267926134d17e86672b84fd57b438f05ffa68e1__;!!GqivPVa7Brio!PuMaBhjIGEd
V
v
lQi7PKC_FQeyIy-sjSaIZXk_W_MusXNUlQBxGqsJBRONWwKWw$

This issue always happens after QMP is terminated by <ctrl-C>
twice, see
following steps.

1. Boot Windows VM up, and <ctrl-C> to exit the QMP

2. Repeat 1

3. Boot Windows VM, and this page fault issue happens. (Note: Windows
should boot into recovery mode in this round, and this is due to the
previous two consecutive boot failure, see
https://urldefense.com/v3/__https://docs.microsoft.com/en-us/windows
-
hardware/manufacture/desktop/windows-recovery-environment--windows-r
e-
-
technical-reference*entry-points-into-
winre__;Iw!!GqivPVa7Brio!PuMaBhjIGEdVvlQi7PKC_FQeyIy-
sjSaIZXk_W_MusXNUlQBxGqsJBSkXMCNZA$ )

During above 3 windows booting procedures, the value of following
variables are always the same, PcdSetNxForStack 0
PcdDxeNxMemoryProtectionPolicy 0 PcdImageProtectionPolicy 2

However, Windows guest fails to boot up into recovery mode in the
3rd round
due to the patch above(5267926). I modified the return value to
"(PcdGetBool (PcdSetNxForStack)" in function "IsEnableNonExecNeeded"
in MdeModulePkg/Core/DxeIplPeim/X64/VirtualMemory.c, this page fault
issue is gone with this change. The patch(5267926) is for fixing bug
https://urldefense.com/v3/__https://bugzilla.tianocore.org/show_bug.
cg
i?id=1
116__;!!GqivPVa7Brio!PuMaBhjIGEdVvlQi7PKC_FQeyIy-
sjSaIZXk_W_MusXNUlQBxGqsJBTLSxdsog$ , where the comments show
PcdImageProtectionPolicy needs also to enable NXE. But this does
cause the page fault exception in this scenario, any suggestion?

The page fault exception is pasted here,


!!!! X64 Exception Type - 0E(#PF - Page-Fault) CPU Apic ID - 00000000 !!!!
ExceptionData - 0000000000000009 I:0 R:1 U:0 W:0 P:1 PK:0 SS:0
SGX:0 RIP - 000000003E0A7C75, CS - 0000000000000038, RFLAGS -
0000000000010202 RAX - 8000000000000003, RCX - 0000000000000001,
RDX
- 0000000001040001 RBX - 0000000000000001, RSP - 00000000001A6AA0,
RBP - 0000000001040001 RSI - 000000003F2E2010, RDI - 0000000000000001
R8 - 0000000000000000, R9 - 000000003E0AEC90, R10 - 0000FFFFFFFFF000
R11 - 00000000001A6E90, R12 - 0000000000000000, R13 -
000000003E0AEC90
R14 - 00000000001A6B28, R15 - 00000000001AB000
DS - 0000000000000030, ES - 0000000000000030, FS - 0000000000000030
GS - 0000000000000030, SS - 0000000000000030
CR0 - 0000000080010033, CR2 - 000000003F2E2010, CR3 -
000000003F401000
CR4 - 0000000000040668, CR8 - 0000000000000000
DR0 - 0000000000000000, DR1 - 0000000000000000, DR2 -
0000000000000000
DR3 - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 -
0000000000000400 GDTR - 000000003F1EE698 0000000000000047, LDTR -
0000000000000000
IDTR - 000000003ECCA018 0000000000000FFF, TR - 0000000000000000
FXSAVE_STATE - 00000000001A6700
!!!! Find image based on IP(0x3E0A7C75) /builddir/build/BUILD/edk2-
1.4.3/Build/OvmfX64/DEBUG_GCC48/X64/MdeModulePkg/Universal/Console/T
erminalDxe/TerminalDxe/DEBUG/TerminalDxe.dll
(ImageBase=000000003E0A5000, EntryPoint=000000003E0A86E8) !!!!

In addition to what Andrew said, I suggest the following:

(1) Please rebuild OVMF *locally*, using the same edk2 tree, and the
same toolchain, and the same "build" flags.

(2) Reproduce the issue, capture the register dump.

(3) Run the following command:

objdump -f -S
Build/OvmfX64/DEBUG_GCC48/X64/MdeModulePkg/Universal/Console/Termin
alDxe/TerminalDxe/DEBUG/TerminalDxe.debug


The point of this exercise is to reproduce the issue with such an
OVMF build for which you have a matching "TerminalDxe.debug" file.
Once you do that, you can run "objdump" on the ".debug" file, and
get a disassembly of the TerminalDxe driver, inter-leaved with the C language source code.

Then, we can do two things:

- we can verify whether (EntryPoint - ImageBase), from the register
dump, matches the (relative) "start address" that "objdump -f"
reports,

- we can take the crash offset (RIP - ImageBase), from the register
dump, and use that offset into the "objdump -S" disassembly, to
narrow down what the terminal driver may have been doing to trigger the crash.
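
The two subtractions described above can be sketched in a few lines of Python. The values are copied from the register dump quoted in this thread; the variable names are only illustrative. The sketch assumes that the addresses printed by "objdump" for the ".debug" file are relative to the image base; if "objdump -f" reports a nonzero load base, that base would have to be added to the offsets.

```python
# Values copied from the register dump quoted in this thread.
RIP = 0x3E0A7C75          # faulting instruction pointer
IMAGE_BASE = 0x3E0A5000   # ImageBase of TerminalDxe
ENTRY_POINT = 0x3E0A86E8  # EntryPoint of TerminalDxe

# Offset of the entry point inside the image; compare this with the
# "start address" line printed by "objdump -f".
entry_offset = ENTRY_POINT - IMAGE_BASE

# Offset of the crash inside the image; look this up in the
# "objdump -S" disassembly.
crash_offset = RIP - IMAGE_BASE

print(f"entry point offset: {entry_offset:#x}")
print(f"crash offset:       {crash_offset:#x}")
```

If the entry point offset does not match what "objdump -f" reports, the ".debug" file probably does not belong to the firmware binary that crashed, and the crash offset should not be trusted.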

It's not necessarily the terminal driver's fault that we encounter a
crash, but knowing what TerminalDxe was up to might shed light on
the actual reason. It's of course also possible that TerminalDxe *is* at fault. We'll see.

If possible, please post:
- your precise edk2 version (if you have local patches, it would be
best to reproduce with an upstream-only tree),
- your full firmware log (feel free to compress it),
- the register dump from serial,
- the objdump (disassembly) output (feel free to compress it).

Thanks,
Laszlo





Re: Windows guest fails to boot into recovery mode due to commit 5267926

annie li
 

Hello

I added logging to the code to dump the page info, and also ran the QMP command "xp" to walk through the page tables manually. The result shows that the corresponding page table entry doesn't exist. I updated the details in the bug; please take a look.

Thanks
Annie

-----Original Message-----
From: Laszlo Ersek [mailto:lersek@redhat.com]
Sent: Friday, March 19, 2021 2:11 PM
To: Andrew Fish <afish@apple.com>; Annie Li <annie.li@oracle.com>
Cc: Yao, Jiewen <jiewen.yao@intel.com>; discuss@edk2.groups.io; Wang, Jian J <jian.j.wang@intel.com>; Aaron Young <aaron.young@oracle.com>
Subject: Re: [edk2-discuss] Windows guest fails to boot into recovery mode due to commit 5267926

On 03/19/21 18:41, Andrew Fish wrote:
Annie,

CR2 points to the 1st level of the page tables. Those entries point to other page tables, so you kind of have to walk it by hand.
I think that's a typo: it should be CR3.

But, I agree -- you can use QEMU monitor commands to read RAM words, and walk the page tables manually. (I figured I could help with this, but I couldn't reproduce the issue locally. I used the manual Recovery entry -- click "Reboot" with Shift held down. For me the Windows VM just entered recovery fine.)
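
As a rough aid for such a manual walk, the following Python sketch splits the faulting address (CR2) into its page-table indices and prints the monitor command for reading the first-level entry. It assumes 4-level paging with 8-byte entries, which matches the CR4 value in the dump (the LA57 bit is clear); the helper names are hypothetical, and the entry values themselves still have to be read from guest RAM with "xp".

```python
CR3 = 0x3F401000         # page-table root, from the register dump
FAULT_ADDR = 0x3F2E2010  # CR2, the address whose access faulted

ADDR_MASK = 0x000FFFFFFFFFF000  # physical-address bits 12..51 of an entry


def walk_indices(va):
    """Split a virtual address into its 4-level paging indices (9 bits each)."""
    return {
        "PML4": (va >> 39) & 0x1FF,
        "PDPT": (va >> 30) & 0x1FF,
        "PD": (va >> 21) & 0x1FF,
        "PT": (va >> 12) & 0x1FF,
    }


def entry_address(table_base, index):
    """Physical address of the 8-byte entry `index` in the table at `table_base`."""
    return (table_base & ADDR_MASK) + index * 8


def next_table_base(entry):
    """Base of the next-level table (or of the page) referenced by an entry."""
    return entry & ADDR_MASK


def is_large_page(entry):
    """Bit 7 (PS) in a PDPT or PD entry means the walk ends at that level."""
    return bool(entry & (1 << 7))


indices = walk_indices(FAULT_ADDR)
pml4e_addr = entry_address(CR3, indices["PML4"])

print({name: hex(value) for name, value in indices.items()})
# First monitor command of the walk; feed each value that "xp" returns into
# next_table_base() and entry_address() to get the address for the next level.
print(f"xp /1gx {pml4e_addr:#x}")
```

At each level it is worth checking bit 0 (present) and bit 63 (XD) of the value that "xp" returns before descending further.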

Thanks,
Laszlo


https://urldefense.com/v3/__https://wiki.osdev.org/Paging__;!!GqivPVa7Brio!Ns258pdi0mlsnSN0oODh9wsYUe4PNDNF6avU8uN1wSySF8ktrFBAJma0qi5hWw$

Thanks,

Andrew Fish

On Mar 19, 2021, at 9:56 AM, Annie Li <annie.li@oracle.com> wrote:

Hi Jiewen,

In the DumpCpuContext function in ArchExceptionHandler.c, the exception details are taken from either "SystemContextX64->ExceptionData" or "SystemContextX64.xxx". I am wondering how I can dump the page info there. Is there any related info that can be retrieved from CR2? Can you enlighten me a little bit?

Thanks
Annie

-----Original Message-----
From: Yao, Jiewen [mailto:jiewen.yao@intel.com]
Sent: Thursday, March 18, 2021 8:37 PM
To: discuss@edk2.groups.io; Annie Li <annie.li@oracle.com>; Laszlo
Ersek <lersek@redhat.com>
Cc: Wang, Jian J <jian.j.wang@intel.com>; Andrew Fish
<afish@apple.com>; Aaron Young <aaron.young@oracle.com>; Yao, Jiewen
<jiewen.yao@intel.com>
Subject: RE: [edk2-discuss] Windows guest fails to boot into recovery
mode due to commit 5267926

Hi Annie
I added some of my thoughts in the Bugzilla -
https://urldefense.com/v3/__https://bugzilla.tianocore.org/show_bug.cgi?id=3269__;!!GqivPVa7Brio!JMob8PcNWJxj_RZSIWy7iwhqFFhIYSwtnR_8i0X6V-UzBkycx-iObkffqGNBrw$

If you can dump the paging structure info for further analysis, we can help to check.


-----Original Message-----
From: discuss@edk2.groups.io <discuss@edk2.groups.io> On Behalf Of
annie li
Sent: Friday, March 19, 2021 3:27 AM
To: Laszlo Ersek <lersek@redhat.com>; discuss@edk2.groups.io
Cc: Wang, Jian J <jian.j.wang@intel.com>; Andrew Fish
<afish@apple.com>; Aaron Young <aaron.young@oracle.com>
Subject: Re: [edk2-discuss] Windows guest fails to boot into
recovery mode due to commit 5267926

Hello Laszlo,

In my previous email, the exception was reproduced with a pretty old
code base, from which I started bisecting the commits. This time I
reproduced the issue with the code of the upstream 'stable/202011'
branch. All the logs I am collecting are from this code base (75ab038).
Since the overall size of the logs is pretty big, I'll attach all the
data you requested to this bug (https://urldefense.com/v3/__https://bugzilla.tianocore.org/show_bug.cgi?id=3269__;!!GqivPVa7Brio!JMob8PcNWJxj_RZSIWy7iwhqFFhIYSwtnR_8i0X6V-UzBkycx-iObkffqGNBrw$ ).

I dumped the registers with qmp-regdump, and the result (regdump) is
uploaded to this bug. If this log doesn't suffice, can you please suggest the way you prefer?
The objdump output is uploaded as well, along with the details of the
page fault exception; please check the details there.

Thanks
Annie

-----Original Message-----
From: Laszlo Ersek [mailto:lersek@redhat.com]
Sent: Thursday, March 18, 2021 9:23 AM
To: Annie Li <annie.li@oracle.com>; discuss@edk2.groups.io
Cc: jian.j.wang@intel.com; Andrew Fish <afish@apple.com>
Subject: Re: Windows guest fails to boot into recovery mode due to
commit
5267926

On 03/18/21 02:48, Annie Li wrote:
Hello,

I ran into a Windows boot failure (a page fault exception), and
narrowed it down to the following patch:
MdeModulePkg/DxeIpl: support more NX related PCDs
https://urldefense.com/v3/__https://github.com/tianocore/edk2/commit/5267926134d17e86672b84fd57b438f05ffa68e1__;!!GqivPVa7Brio!PuMaBhjIGEdVvlQi7PKC_FQeyIy-sjSaIZXk_W_MusXNUlQBxGqsJBRONWwKWw$

This issue always happens after QMP is terminated by <ctrl-C> twice;
see the following steps.

1. Boot Windows VM up, and <ctrl-C> to exit the QMP

2. Repeat 1

3. Boot Windows VM, and this page fault issue happens. (Note: Windows
should boot into recovery mode in this round, and this is due to the
previous two consecutive boot failures, see
https://urldefense.com/v3/__https://docs.microsoft.com/en-us/windows-hardware/manufacture/desktop/windows-recovery-environment--windows-re--technical-reference*entry-points-into-winre__;Iw!!GqivPVa7Brio!PuMaBhjIGEdVvlQi7PKC_FQeyIy-sjSaIZXk_W_MusXNUlQBxGqsJBSkXMCNZA$ )

During the above 3 Windows boots, the values of the following
variables are always the same:
PcdSetNxForStack 0
PcdDxeNxMemoryProtectionPolicy 0
PcdImageProtectionPolicy 2

However, the Windows guest fails to boot into recovery mode in the
3rd round due to the patch above (5267926). I modified the return value
to "PcdGetBool (PcdSetNxForStack)" in function "IsEnableNonExecNeeded"
in MdeModulePkg/Core/DxeIplPeim/X64/VirtualMemory.c, and the page fault
is gone with this change. The patch (5267926) fixes bug
https://urldefense.com/v3/__https://bugzilla.tianocore.org/show_bug.cgi?id=1116__;!!GqivPVa7Brio!PuMaBhjIGEdVvlQi7PKC_FQeyIy-sjSaIZXk_W_MusXNUlQBxGqsJBTLSxdsog$ , where the comments show that
PcdImageProtectionPolicy also needs NXE to be enabled. But this does
cause the page fault exception in this scenario. Any suggestions?
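
To make the effect of the two return values concrete, here is a small Python model of the decision. It is only a sketch and not the edk2 source: it assumes that, after commit 5267926, "IsEnableNonExecNeeded" returns true when any of the three PCDs listed above is nonzero (which is what the patch title and the bug 1116 comments suggest), and it leaves out any CPU capability check the real function may perform. The function names are hypothetical.

```python
def nxe_needed_local_change(set_nx_for_stack, dxe_nx_policy, image_policy):
    """Model of the local modification: only PcdSetNxForStack is consulted."""
    return bool(set_nx_for_stack)


def nxe_needed_after_patch(set_nx_for_stack, dxe_nx_policy, image_policy):
    """Assumed model of commit 5267926: any NX-related PCD enables NXE."""
    return bool(set_nx_for_stack) or dxe_nx_policy != 0 or image_policy != 0


# PCD values reported for all three boots:
# PcdSetNxForStack, PcdDxeNxMemoryProtectionPolicy, PcdImageProtectionPolicy
pcds = (0, 0, 2)

print("NXE with local change:", nxe_needed_local_change(*pcds))
print("NXE after 5267926:   ", nxe_needed_after_patch(*pcds))
```

Under this model, PcdImageProtectionPolicy = 2 alone is enough to flip the result, which would be consistent with the behavior changing even though PcdSetNxForStack stays 0.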

The page fault exception is pasted here,


!!!! X64 Exception Type - 0E(#PF - Page-Fault) CPU Apic ID - 00000000 !!!!
ExceptionData - 0000000000000009 I:0 R:1 U:0 W:0 P:1 PK:0 SS:0 SGX:0
RIP - 000000003E0A7C75, CS - 0000000000000038, RFLAGS - 0000000000010202
RAX - 8000000000000003, RCX - 0000000000000001, RDX - 0000000001040001
RBX - 0000000000000001, RSP - 00000000001A6AA0, RBP - 0000000001040001
RSI - 000000003F2E2010, RDI - 0000000000000001
R8 - 0000000000000000, R9 - 000000003E0AEC90, R10 - 0000FFFFFFFFF000
R11 - 00000000001A6E90, R12 - 0000000000000000, R13 - 000000003E0AEC90
R14 - 00000000001A6B28, R15 - 00000000001AB000
DS - 0000000000000030, ES - 0000000000000030, FS - 0000000000000030
GS - 0000000000000030, SS - 0000000000000030
CR0 - 0000000080010033, CR2 - 000000003F2E2010, CR3 - 000000003F401000
CR4 - 0000000000040668, CR8 - 0000000000000000
DR0 - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
DR3 - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
GDTR - 000000003F1EE698 0000000000000047, LDTR - 0000000000000000
IDTR - 000000003ECCA018 0000000000000FFF, TR - 0000000000000000
FXSAVE_STATE - 00000000001A6700
!!!! Find image based on IP(0x3E0A7C75) /builddir/build/BUILD/edk2-1.4.3/Build/OvmfX64/DEBUG_GCC48/X64/MdeModulePkg/Universal/Console/TerminalDxe/TerminalDxe/DEBUG/TerminalDxe.dll (ImageBase=000000003E0A5000, EntryPoint=000000003E0A86E8) !!!!
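
For readers not used to this dump format, the flag letters after ExceptionData are the individual bits of the x86 page-fault error code. The Python sketch below decodes the value 0x9 from the dump using the architectural bit positions; the helper name is hypothetical, and the one-letter labels simply follow the dump's own output.

```python
# Architectural bit positions of the x86 page-fault error code,
# labelled with the letters used in the dump above.
PF_ERROR_BITS = [
    (0, "P"),     # 1 = protection violation, 0 = page not present
    (1, "W"),     # 1 = caused by a write access
    (2, "U"),     # 1 = caused in user mode
    (3, "R"),     # 1 = reserved bit set in a paging-structure entry
    (4, "I"),     # 1 = caused by an instruction fetch
    (5, "PK"),    # 1 = protection-key violation
    (6, "SS"),    # 1 = shadow-stack access
    (15, "SGX"),  # 1 = SGX-related violation
]


def decode_pf_error(code):
    """Return a dict mapping each flag label to 0 or 1 for the given error code."""
    return {name: (code >> bit) & 1 for bit, name in PF_ERROR_BITS}


flags = decode_pf_error(0x0000000000000009)
print(" ".join(f"{name}:{value}" for name, value in flags.items()))
```

For 0x9 only P and R are set, so the CPU reported a reserved-bit violation in a paging-structure entry during a supervisor-mode read. One thing worth checking in that case is bit 63 (XD) of the entries along the walk, since that bit is reserved whenever IA32_EFER.NXE is clear.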

In addition to what Andrew said, I suggest the following:

(1) Please rebuild OVMF *locally*, using the same edk2 tree, and the
same toolchain, and the same "build" flags.

(2) Reproduce the issue, capture the register dump.

(3) Run the following command:

objdump -f -S Build/OvmfX64/DEBUG_GCC48/X64/MdeModulePkg/Universal/Console/TerminalDxe/TerminalDxe/DEBUG/TerminalDxe.debug


The point of this exercise is to reproduce the issue with such an
OVMF build for which you have a matching "TerminalDxe.debug" file.
Once you do that, you can run "objdump" on the ".debug" file, and
get a disassembly of the TerminalDxe driver, inter-leaved with the C language source code.

Then, we can do two things:

- we can verify whether (EntryPoint - ImageBase), from the register
dump, matches the (relative) "start address" that "objdump -f"
reports,

- we can take the crash offset (RIP - ImageBase), from the register
dump, and use that offset into the "objdump -S" disassembly, to
narrow down what the terminal driver may have been doing to trigger the crash.

It's not necessarily the terminal driver's fault that we encounter a
crash, but knowing what TerminalDxe was up to might shed light on
the actual reason. It's of course also possible that TerminalDxe *is* at fault. We'll see.

If possible, please post:
- your precise edk2 version (if you have local patches, it would be
best to reproduce with an upstream-only tree),
- your full firmware log (feel free to compress it),
- the register dump from serial,
- the objdump (disassembly) output (feel free to compress it).

Thanks,
Laszlo





[Announce] A UEFI NTFS driver, with read/write support

Pete Batard
 

Hi,

I have just completed a fork of the GPLv2 ntfs-3g project [1], that enables compiling it as a UEFI driver.

The project, as well as binary releases (for X64, IA32, AARCH64 and ARM), can be found at: https://github.com/pbatard/ntfs-3g

The driver can be compiled, using EDK2, on either Windows or Linux (though I have only tested MSVC and gcc). You can also build (and test) it from Visual Studio 2019, since it also supports gnu-efi.

The produced driver should be compliant with sections 13.4 and 13.5 of version 2.8 of the UEFI specs. But of course, I wouldn't mind some more testing from folks who might be interested in such a driver.

It needs to be pointed out, however, that whereas the read-only version of the driver should be fairly safe to use, this is brand new code. As such, you may want to treat the read/write version of the driver as an executable that has the potential to destroy all data on an NTFS volume. In other words, you may want to refrain from using the current driver on NTFS drives where you have data that you can't afford to lose.

I should also point out that I contacted the ntfs-3g maintainers, to see if they have any interest in integrating this proposal into their mainline, as I believe it would make a lot of sense to have it there (which is the reason why you'll find that I didn't follow the usual EDK2 package/file naming conventions, so that it would look more suitable for ntfs-3g).

If you have any questions, or think you found an issue, feel free to e-mail or use the GitHub issue tracker [2].

Regards,

/Pete

[1] https://sourceforge.net/projects/ntfs-3g/
[2] https://github.com/pbatard/ntfs-3g/issues


Re: Windows guest fails to boot into recovery mode due to commit 5267926

annie li
 

Never mind, please ignore my previous email.
The QMP command 'xp' can dump memory; I'll try to reproduce this issue and dump the memory.

Thanks
Annie

-----Original Message-----
From: discuss@edk2.groups.io [mailto:discuss@edk2.groups.io] On Behalf Of annie li
Sent: Friday, March 19, 2021 3:44 PM
To: Laszlo Ersek <lersek@redhat.com>; Andrew Fish <afish@apple.com>
Cc: Yao, Jiewen <jiewen.yao@intel.com>; discuss@edk2.groups.io; Wang, Jian J <jian.j.wang@intel.com>; Aaron Young <aaron.young@oracle.com>
Subject: Re: [edk2-discuss] Windows guest fails to boot into recovery mode due to commit 5267926

Hello Laszlo,

I was thinking of CR2 because it holds the address that caused this exception, and we are supposed to find the information for the page that contains this address. No?

Thanks
Annie






Re: Google Summer of Code 2021 interested student

Laszlo Ersek
 

CC Nate

On 03/17/21 21:16, Ayush Dwivedi wrote:
Hello everyone,
My name is Ayush Dwivedi. I am currently studying Computer Science and
Engineering and am in the 3rd year of my B.Tech program. I am interested in
operating systems, firmware and game programming. I have written some
personal projects in the C and C++ programming languages and have experimented
with x86_64 assembly. Recently I have been using QEMU with the OVMF UEFI
firmware, which I built from source using edk2. I have tried to write my
own EFI applications (it was an attempt to understand how UEFI differs from
BIOS). I wish to be a part of the TianoCore community and want to
contribute. I would like to know what skills and knowledge are needed for
the task "MinPlatform Qemu Support". It is explained that we need to
port MinPlatform to QEMU, so as of now I have started looking into
OvmfPkg (since it already runs on QEMU) and Platform/Intel/MinPlatformPkg,
but the source trees for edk2 and edk2-platform are huge, so I am confused about
how and where I should start. I am looking forward to guidance from the
community.

Thank you for your precious time.

Regards,
Ayush Dwivedi



