Request for help understanding MemoryOverwriteRequestControl
Martyn Welch <martyn.welch@...>
Hi,
We have a number of MinnowBoards (both Turbot and Max variants) here that are used for various Linux development purposes, including having a number that are used in our LAVA farm which among other things runs Linux KernelCI jobs.

The firmware on these devices is currently "MNW2MAX1.X64.0101.R01.1908071815" as downloaded from the Intel site:

https://software.intel.com/content/www/us/en/develop/articles/minnowboard-maxturbot-uefi-firmware.html

We are seeing the following message during boot on *some* of the boards, but not others:

    Clear memory in MRC per MOR request Start, Please wait for some minutes...

We have "CONFIG_RESET_ATTACK_MITIGATION" set in the Linux kernel configuration, which I understand will cause the "MemoryOverwriteRequest" bit to be set during boot and hence trigger this behaviour (unless explicitly cleared before the board is reset).

Some of our boards seem to only be exposing the related "MemoryOverwriteRequestControlLock" EFI variable:

    Shell> dmpstore -guid e20939be-32d4-41be-a150-897f85d49829
    dmpstore: No matching variables found. Guid E20939BE-32D4-41BE-A150-897F85D49829
    Shell> dmpstore -guid bb983ccf-151d-40e1-a07b-4a17be168292
    Variable NV+RT+BS 'BB983CCF-151D-40E1-A07B-4A17BE168292:MemoryOverwriteRequestControlLock' DataSize = 0x01
      00000000: 00                                               *.*

Thus this behaviour isn't triggered. Others expose both "MemoryOverwriteRequestControl" and "MemoryOverwriteRequestControlLock":

    Shell> dmpstore -guid e20939be-32d4-41be-a150-897f85d49829
    Variable NV+RT+BS 'E20939BE-32D4-41BE-A150-897F85D49829:MemoryOverwriteRequestControl' DataSize = 0x01
      00000000: 01                                               *.*
    Shell> dmpstore -guid bb983ccf-151d-40e1-a07b-4a17be168292
    Variable NV+RT+BS 'BB983CCF-151D-40E1-A07B-4A17BE168292:MemoryOverwriteRequestControlLock' DataSize = 0x01
      00000000: 00                                               *.*

My understanding is that we should be seeing both these EFI variables being exposed.
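For comparison, the same two variables can also be inspected from a running Linux system instead of the UEFI shell. The following is a minimal sketch, assuming efivarfs is mounted at /sys/firmware/efi/efivars and using the efivarfs layout in which each file holds a 4-byte little-endian attribute word followed by the variable data. The helper names (parse_efivar, describe_attrs, dump) are made up for illustration and are not part of any existing tool.

```python
import struct
from pathlib import Path

# Attribute bits as defined by the UEFI specification.
EFI_VARIABLE_NON_VOLATILE = 0x1
EFI_VARIABLE_BOOTSERVICE_ACCESS = 0x2
EFI_VARIABLE_RUNTIME_ACCESS = 0x4

# Assumption: efivarfs is mounted at its usual location.
EFIVARS = Path("/sys/firmware/efi/efivars")
MOR_NAME = "MemoryOverwriteRequestControl-e20939be-32d4-41be-a150-897f85d49829"
LOCK_NAME = "MemoryOverwriteRequestControlLock-bb983ccf-151d-40e1-a07b-4a17be168292"


def parse_efivar(raw: bytes):
    """Split the contents of an efivarfs file into (attributes, data)."""
    if len(raw) < 4:
        raise ValueError("efivarfs file is shorter than the 4-byte attribute word")
    (attrs,) = struct.unpack_from("<I", raw)
    return attrs, raw[4:]


def describe_attrs(attrs: int) -> str:
    """Render attributes in the same NV+RT+BS style that dmpstore prints."""
    names = [
        (EFI_VARIABLE_NON_VOLATILE, "NV"),
        (EFI_VARIABLE_RUNTIME_ACCESS, "RT"),
        (EFI_VARIABLE_BOOTSERVICE_ACCESS, "BS"),
    ]
    return "+".join(label for bit, label in names if attrs & bit)


def dump(name: str, root: Path = EFIVARS) -> str:
    """Return a one-line summary of a variable, or note that it is absent."""
    try:
        raw = (root / name).read_bytes()
    except FileNotFoundError:
        return f"{name}: not present"
    except OSError as exc:
        return f"{name}: unreadable ({exc.strerror})"
    attrs, data = parse_efivar(raw)
    return f"{name}: {describe_attrs(attrs)} DataSize=0x{len(data):02x} data={data.hex()}"


if __name__ == "__main__":
    for variable in (MOR_NAME, LOCK_NAME):
        print(dump(variable))
```

On a board in the first state described above, the first line would be expected to report the variable as not present, while the lock variable would show a single data byte.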
I'm rather unfamiliar with the EDK codebase and have not been able to work out how I would end up with "MemoryOverwriteRequestControlLock" and not "MemoryOverwriteRequestControl". I've tried using `J7` to reset the NVRAM on a board just exposing "MemoryOverwriteRequestControlLock", following the process described here to see if it would have an effect and it hasn't:

https://uchan.hateblo.jp/entry/2018/01/09/075230

One of the boards in our LAVA instance was initially only exposing the lock variable, but then seemingly randomly started to expose the other variable and perform the erase at boot. I've not been able to determine what triggered this change in behaviour.

Any help/pointers would be much appreciated.

Thanks in advance,

Martyn

Laszlo Ersek
+Jiewen, comments below
On 09/24/20 11:36, Martyn Welch wrote:
> Hi,

Side comment: Wow! I very much suspected that the memory overwrite is performed on physical platforms via proprietary DIMM or board access, not via plain RAM writes. The above "MRC" reference confirms it.

Yes, see commit ccc829ba3624 ("efi/libstub: Enable reset attack mitigation", 2017-08-26). (Side comment regarding historical Fedora kernels: <https://bugzilla.redhat.com/show_bug.cgi?id=1498159>.) The config knob's documentation was later extended in commit a5c03c31af22 ("x86/efi: Clarify that reset attack mitigation needs appropriate userspace", 2018-01-19).

MemoryOverwriteRequestControlLock is a Microsoft- (not TCG-) originated hardening:

https://docs.microsoft.com/en-us/windows-hardware/drivers/bringup/device-guard-requirements

It's been a while since I last thought about it, but basically it's a way to prevent the attacker from even attempting to clear the MOR bit in the original MemoryOverwriteRequestControl variable, before they'd force a platform reset.

The situation where you see the MOR Control Lock variable but not the MOR Control variable was a bug in edk2. It has been fixed under

https://bugzilla.tianocore.org/show_bug.cgi?id=727

(We first encountered this issue in <https://bugzilla.redhat.com/show_bug.cgi?id=1496170>.)

> My understanding is that we should be seeing both these EFI variables [...]

More precisely, the valid cases are:

- none of them present (= system doesn't support the Platform Reset Attack Mitigation from the TCG)
- MOR Control is present, but MOR Control Lock is not (= the TCG spec is supported, but the Microsoft-defined hardening is not)
- both MOR Control and MOR Control Lock are present (= both specs are supported)

Yes, with the bug present, the firmware would re-create MOR Control Lock. See TianoCore#727 (link above).
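The valid cases listed above, plus the out-of-spec one caused by the bug, can be written down as a small decision function. This is only an illustrative sketch: the function name classify_mor_support and the returned strings are invented here, and how the two presence flags are obtained (UEFI shell, efivarfs, or otherwise) is left to the caller.

```python
def classify_mor_support(mor_present: bool, lock_present: bool) -> str:
    """Map the presence of the two variables onto the cases discussed above."""
    if not mor_present and not lock_present:
        # Neither variable: no Platform Reset Attack Mitigation support.
        return "unsupported"
    if mor_present and not lock_present:
        # TCG MOR Control only, without the lock hardening.
        return "mor-only"
    if mor_present and lock_present:
        # Both MOR Control and MOR Control Lock.
        return "mor-and-lock"
    # Lock without MOR Control: out of spec, the symptom of TianoCore#727.
    return "out-of-spec-lock-only"


if __name__ == "__main__":
    for mor in (False, True):
        for lock in (False, True):
            print(f"MOR={mor!s:5} Lock={lock!s:5} -> {classify_mor_support(mor, lock)}")
```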
> One of the boards in our LAVA instance was initially only exposing the [...]

This seems vaguely consistent with the OS kernel being buggy too (that is, <https://bugzilla.redhat.com/show_bug.cgi?id=1498159>), *or else* with your Linux userspace not clearing the MOR bit in the MOR Control variable, as a part of the controlled OS shutdown.

> Any help/pointers would be much appreciated.

I would suggest:

(1) Upgrade the platform firmware to a version that contains the edk2 commit range fixing TianoCore#727 (namely 35ac962b5473..fda8f631edbb). This prevents the out-of-spec situation where only MOR Control Lock exists. (While that situation is not your acute problem now, it's best to get it solved too.)

(2) Make sure your kernel does not *create* MOR Control under any circumstances, only modifies it if it exists.

(3) Either remove CONFIG_RESET_ATTACK_MITIGATION from your kernel config, or verify that your userspace clears the MOR bit in MOR Control before a controlled OS shutdown. (To be honest, I don't know what Linux distributions satisfy the userspace requirement, as CONFIG_RESET_ATTACK_MITIGATION is not enabled in RHEL8 for example, as far as I can see.)

Thanks
Laszlo

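As a rough illustration of the userspace half of suggestion (3), the sketch below clears bit 0 of the MOR Control value through an efivarfs-style file. It assumes the efivarfs layout (4-byte attribute word followed by the data) and that bit 0 of the data byte is the MOR bit, as in the TCG specification. The function names are hypothetical, and this is not how any particular distribution does it. On a real system the write may additionally be refused, for example if the file is marked immutable or the lock variable has been set, so treat this as a starting point only.

```python
from pathlib import Path

MOR_BIT = 0x01  # bit 0 of MemoryOverwriteRequestControl requests the memory clear


def with_mor_cleared(raw: bytes) -> bytes:
    """Return the efivarfs payload with the MOR bit cleared in the first data byte."""
    if len(raw) < 5:
        raise ValueError("expected a 4-byte attribute word plus at least one data byte")
    attrs = raw[:4]
    value = raw[4] & ~MOR_BIT & 0xFF
    return attrs + bytes([value]) + raw[5:]


def clear_mor_bit(path: Path) -> bool:
    """Clear the MOR bit in place; return True only if a write was needed."""
    raw = path.read_bytes()
    updated = with_mor_cleared(raw)
    if updated == raw:
        return False
    # Attributes and data are written back together as a single payload.
    path.write_bytes(updated)
    return True
```

A shutdown hook would call clear_mor_bit() on the MemoryOverwriteRequestControl file only if that file already exists, in line with suggestion (2) about never creating the variable.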
Laszlo Ersek
On 09/29/20 15:57, Yao, Jiewen wrote:
> Thanks Laszlo to explain that.

Ah, thanks. :) I'll have to refresh my local PDF then!

Laszlo

Yao, Jiewen
Thanks Laszlo to explain that.
One minor comment: TCG adopted Secure MOR in the latest MOR 1.1 spec in 2019:

https://trustedcomputinggroup.org/resource/pc-client-work-group-platform-reset-attack-mitigation-specification/

So it is a TCG standard now.

Thank you
Yao Jiewen

Martyn Welch <martyn.welch@...>
On Tue, 2020-09-29 at 15:06 +0200, Laszlo Ersek wrote:
> +Jiewen, comments below [...]

Yeah, I don't think this platform has any fancy way of doing this.

>> We have "CONFIG_RESET_ATTACK_MITIGATION" set in the Linux kernel [...]
> Yes, see commit ccc829ba3624 ("efi/libstub: Enable reset attack [...]

Ah! Awesome, this was the bit I was missing!

>> My understanding is that we should be seeing both these EFI [...]
> More precisely, the valid cases are: [...]

I suspect this is exactly what we are seeing.

>> One of the boards in our LAVA instance was initially only exposing [...]
> This seems vaguely consistent with the OS kernel being buggy too [...]

Since this board is in our LAVA farm and used by a few different things, I guess there's a reasonable chance that one of these ran a kernel of a sufficient vintage to have triggered the buggy behaviour.

>> Any help/pointers would be much appreciated.
> I would suggest: [...]

Based on the behaviour I'm seeing, I believe the latest offering on the Intel site, released in Aug 2019, doesn't include these changes. It also reports as `UEFI v2.60 (EDK II, 0x00010000)`. I'm unsure exactly how the versioning in EDK2 is managed, but the `EFI_SYSTEM_TABLE_REVISION` that I think is used here was changed to `2.70` in Dec 2017, so I guess that minnowboard firmware is based on an old tree. Now that I know this is probably a bug, I can live with it to be honest.

> (2) Make sure your kernel does not *create* MOR Control under any [...]

The one I really care about is currently a 5.7.x stable release, so should be good there.

> (3) Either remove CONFIG_RESET_ATTACK_MITIGATION from your kernel [...]

Yes, this is looking like it may be the best way to handle this.

Thanks for your help,

Martyn

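As a side note on the version comparison above: a UEFI system table revision is a 32-bit value with the major version in the upper 16 bits and the minor version in the lower 16 bits, so 2.60 and 2.70 are directly comparable as integers. The short sketch below shows that encoding; the function names are made up for illustration and do not correspond to edk2 identifiers.

```python
def encode_revision(major: int, minor: int) -> int:
    """Pack a UEFI revision the way the system table header stores it."""
    return (major << 16) | minor


def decode_revision(revision: int):
    """Unpack a 32-bit revision into (major, minor)."""
    return revision >> 16, revision & 0xFFFF


def format_revision(revision: int) -> str:
    """Render a revision in the 'UEFI v2.60' style shown by the shell banner."""
    major, minor = decode_revision(revision)
    return f"UEFI v{major}.{minor:02d}"


if __name__ == "__main__":
    reported = encode_revision(2, 60)  # what the MinnowBoard firmware reports
    newer = encode_revision(2, 70)     # the value mentioned for Dec 2017
    print(format_revision(reported), hex(reported))
    print(format_revision(newer), hex(newer))
    print("reported firmware predates 2.70:", reported < newer)
```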