[edk2-platforms][PATCH v3 0/5] Platform/Sgi: Add platform support for firmware first error handling


Omkar Anand Kulkarni
 

Changes since v2:
- Rebased to the latest upstream code.

This patch series introduces platform support for RAS using firmware-first
error handling. Firmware-first error handling on ARM Neoverse reference design
platforms is achieved using the HEST[1] and SDEI[2] ACPI tables.

To do so, the platform error handler DXE driver is introduced. This driver is
an integral part of the firmware-first error handling framework in EDK2. Since
SDEI is the notification mechanism used to communicate platform errors to
OSPM, the driver builds and installs the SDEI ACPI table. It also installs the
HEST table, which is already created as part of the firmware-first framework
in EDK2.

This series also provides a reference implementation that leverages the
firmware-first framework by implementing a platform MM driver for the Dynamic
Memory Controller (DMC)[3], which has the RAS feature enabled. This driver
mainly handles the following:
- Implements the HEST Error Source Descriptor protocol introduced as part of
  the firmware-first framework in EDK2. Publishes the 1-bit ECC DRAM error
  sources as GHESv2[4] type error source descriptors.
- For any 1-bit CE that occurs on DRAM, it reads the DMC error record
  registers and populates an error status block (CPER)[5] of the Memory Error
  type[6].
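
As a rough illustration of the second point above, the sketch below shows how
a corrected DRAM error could be packaged into a generic error status block
carrying one CPER memory error section. It is not the code from this series.
The struct layouts are trimmed to the fields used here, and DmcErrorRecord,
BuildCeStatusBlock and CeStatusBlock are hypothetical names chosen for the
example. The severity value, block status bits, entry revision and section
GUID are assumed from a reading of ACPI 6.3 and UEFI 2.8.

```c
#include <stdint.h>
#include <string.h>

/* ACPI generic error severity value for a corrected error (assumed: 2). */
#define ERR_SEVERITY_CORRECTED       2u
/* Block status: bit 1 = correctable error valid, bits 4..13 = entry count. */
#define BLOCK_STATUS_CE_VALID        (1u << 1)
#define BLOCK_STATUS_ENTRY_COUNT(n)  (((uint32_t)(n) & 0x3FFu) << 4)
/* Memory error section validation bit 1 = physical address valid. */
#define MEM_ERR_PHYS_ADDR_VALID      (1ull << 1)

/* Hypothetical snapshot of a DMC error record; not the real register map. */
typedef struct {
  uint64_t ErrAddr;   /* physical address that saw the 1-bit error */
} DmcErrorRecord;

#pragma pack(push, 1)
typedef struct {
  uint32_t BlockStatus;
  uint32_t RawDataOffset;
  uint32_t RawDataLength;
  uint32_t DataLength;
  uint32_t ErrorSeverity;
} GenericErrorStatus;

typedef struct {
  uint8_t  SectionType[16];
  uint32_t ErrorSeverity;
  uint16_t Revision;
  uint8_t  ValidationBits;
  uint8_t  Flags;
  uint32_t ErrorDataLength;
  uint8_t  FruId[16];
  uint8_t  FruText[20];
  uint64_t Timestamp;
} GenericErrorDataEntry;

/* Leading fields only; the full memory error section is longer. */
typedef struct {
  uint64_t ValidationBits;
  uint64_t ErrorStatus;
  uint64_t PhysicalAddress;
} MemoryErrorSectionHead;

typedef struct {
  GenericErrorStatus     Status;
  GenericErrorDataEntry  Entry;
  MemoryErrorSectionHead Memory;
} CeStatusBlock;
#pragma pack(pop)

/* Fill Block with one corrected memory error taken from Rec. */
void BuildCeStatusBlock(CeStatusBlock *Block, const DmcErrorRecord *Rec)
{
  /* Platform memory error section GUID, stored in little-endian byte order. */
  static const uint8_t MemSectionGuid[16] = {
    0x14, 0x11, 0xBC, 0xA5, 0x64, 0x6F, 0xDE, 0x4E,
    0xB8, 0x63, 0x3E, 0x83, 0xED, 0x7C, 0x83, 0xB1
  };

  memset(Block, 0, sizeof(*Block));

  /* Status header: one correctable entry, severity "corrected". */
  Block->Status.BlockStatus   = BLOCK_STATUS_CE_VALID | BLOCK_STATUS_ENTRY_COUNT(1);
  Block->Status.DataLength    = (uint32_t)(sizeof(Block->Entry) + sizeof(Block->Memory));
  Block->Status.ErrorSeverity = ERR_SEVERITY_CORRECTED;

  /* Data entry header describing the section that follows it. */
  memcpy(Block->Entry.SectionType, MemSectionGuid, sizeof(MemSectionGuid));
  Block->Entry.ErrorSeverity   = ERR_SEVERITY_CORRECTED;
  Block->Entry.Revision        = 0x300;
  Block->Entry.ErrorDataLength = (uint32_t)sizeof(Block->Memory);

  /* Section payload: only the physical address is reported as valid. */
  Block->Memory.ValidationBits  = MEM_ERR_PHYS_ADDR_VALID;
  Block->Memory.PhysicalAddress = Rec->ErrAddr;
}
```

In a firmware-first flow, a block like this would be written to the error
status address published in the GHESv2 descriptor before OSPM is notified.
The sketch leaves out that step and the read acknowledgement handshake.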

References:
[1] : ACPI 6.3, Table 18-382, Hardware Error Source Table
[2] : SDEI Platform Design Document, revision b, 10 Appendix C, ACPI table
      definitions for SDEI
[3] : DMC620 Dynamic Memory Controller, revision r1p0
[4] : ACPI Reference Specification 6.3, Table 18-393 GHESv2 Structure
[5] : UEFI Reference Specification 2.8, Appendix N - Common Platform Error
      Record
[6] : UEFI Reference Specification 2.8, Section N.2.5 Memory Error Section

This patch series is dependent on the edk2 patch series
https://edk2.groups.io/g/devel/message/79741

Link to github branch with the patches in this series -
https://github.com/omkkul01/edk2-platforms/tree/ras_firware_first_edk2-platforms_v3

Omkar Anand Kulkarni (5):
Platform/ARM: Add DMC-620 ECC error handling driver
Platform/Sgi: dmc-620 firmware-first error handling
Platform/Sgi: define memory region for GHES error status block
Platform/Sgi: Define values for ACPI table header
Platform/Sgi: Add platform error handling driver

 Platform/ARM/Drivers/Dmc620Mm/Dmc620Mm.dec   |  30 ++
 Platform/ARM/SgiPkg/SgiPlatform.dec          |   1 +
 Platform/ARM/SgiPkg/SgiPlatform.dsc.inc      |  38 ++
 Platform/ARM/SgiPkg/SgiPlatformMm.dsc.inc    |  30 ++
 Platform/ARM/SgiPkg/PlatformStandaloneMm.fdf |   6 +
 Platform/ARM/SgiPkg/SgiPlatform.fdf          |  13 +
 Platform/ARM/Drivers/Dmc620Mm/Dmc620Mm.inf   |  61 +++
 .../PlatformErrorHandlerDxe.inf              |  51 +++
 .../Library/PlatformLib/PlatformLib.inf      |   6 +
 Platform/ARM/Drivers/Dmc620Mm/Dmc620Mm.h     | 174 +++++++++
 Platform/ARM/Drivers/Dmc620Mm/Dmc620Mm.c     | 362 ++++++++++++++++++
 .../Dmc620Mm/Dmc620MmErrorSourceInfo.c       | 194 ++++++++++
 .../PlatformErrorHandlerDxe.c                | 171 +++++++++
 .../Library/PlatformLib/PlatformLibMem.c     |  13 +-
 14 files changed, 1148 insertions(+), 2 deletions(-)
 create mode 100644 Platform/ARM/Drivers/Dmc620Mm/Dmc620Mm.dec
 create mode 100644 Platform/ARM/Drivers/Dmc620Mm/Dmc620Mm.inf
 create mode 100644 Platform/ARM/SgiPkg/Drivers/PlatformErrorHandlerDxe/PlatformErrorHandlerDxe.inf
 create mode 100644 Platform/ARM/Drivers/Dmc620Mm/Dmc620Mm.h
 create mode 100644 Platform/ARM/Drivers/Dmc620Mm/Dmc620Mm.c
 create mode 100644 Platform/ARM/Drivers/Dmc620Mm/Dmc620MmErrorSourceInfo.c
 create mode 100644 Platform/ARM/SgiPkg/Drivers/PlatformErrorHandlerDxe/PlatformErrorHandlerDxe.c

--
2.17.1


Omkar Anand Kulkarni
 

Gentle reminder to review this patch series.

- Omkar

On August 24, 2021 11:30 AM, Omkar Kulkarni wrote:
[...]

-=-=-=-=-=-=
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#79749): https://edk2.groups.io/g/devel/message/79749
Mute This Topic: https://groups.io/mt/85104844/4857533
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub
[omkar.kulkarni@arm.com]
-=-=-=-=-=-=
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.


Ard Biesheuvel
 

On Fri, 3 Sept 2021 at 15:39, Omkar Kulkarni <Omkar.Kulkarni@arm.com> wrote:
>
> Gentle reminder, to review this patch series.
>

Sami, would you mind taking this one?





Sami Mujawar
 

Hi Ard,

I will review the series this week.

Regards,

Sami Mujawar

On 22/09/2021, 12:47, "Ard Biesheuvel" <ardb@kernel.org> wrote:

On Fri, 3 Sept 2021 at 15:39, Omkar Kulkarni <Omkar.Kulkarni@arm.com> wrote:
>
> Gentle reminder, to review this patch series.
>

Sami, would you mind taking this one?

