Date   

Re: [Qemu-devel] [edk2-rfc] [edk2-devel] CPU hotplug using SMM with QEMU+OVMF

Igor Mammedov <imammedo@...>
 

On Mon, 2 Sep 2019 21:09:58 +0200
Laszlo Ersek <lersek@redhat.com> wrote:

On 09/02/19 10:45, Igor Mammedov wrote:
On Fri, 30 Aug 2019 20:46:14 +0200
Laszlo Ersek <lersek@redhat.com> wrote:

On 08/30/19 16:48, Igor Mammedov wrote:

(01) On boot firmware maps and initializes SMI handler at default SMBASE (30000)
(using dedicated SMRAM at 30000 would allow us to avoid save/restore
steps and make SMM handler pointer not vulnerable to DMA attacks)

(02) QEMU hotplugs a new CPU in reset-ed state and sends SCI

(03) on receiving SCI, host CPU calls GPE cpu hotplug handler
which writes to IO port 0xB2 (broadcast SMI)

(04) firmware waits for all existing CPUs rendezvous in SMM mode,
new CPU(s) have SMI pending but does nothing yet

(05) host CPU wakes up one new CPU (INIT-INIT-SIPI)
SIPI vector points to RO flash HLT loop.
(how host CPU will know which new CPUs to relocate?
possibly reuse QEMU CPU hotplug MMIO interface???)

(06) new CPU does relocation.
(in case of attacker sends SIPI to several new CPUs,
open question how to detect collision of several CPUs at the same default SMBASE)

(07) once new CPU relocated host CPU completes initialization, returns
from IO port write and executes the rest of GPE handler, telling OS
to online new CPU.
In step (03), it is the OS that handles the SCI; it transfers control to
ACPI. The AML can write to IO port 0xB2 only because the OS allows it.

If the OS decides to omit that step, and sends an INIT-SIPI-SIPI
directly to the new CPU, can it steal the CPU?
It sure can but this way it won't get access to privileged SMRAM
so OS can't subvert firmware.
The next time SMI broadcast is sent the CPU will use SMI handler at
default 30000 SMBASE. It's up to us to define behavior here (for example
relocation handler can put such CPU in shutdown state).

It's in the best interest of OS to cooperate and execute AML
provided by firmware, if it does not follow proper cpu hotplug flow
we can't guarantee that stolen CPU will work.
This sounds convincing enough, for the hotplugged CPU; thanks.

So now my concern is with step (01). While preparing for the initial
relocation (of cold-plugged CPUs), the code assumes the memory at the
default SMBASE (0x30000) is normal RAM.

Is it not a problem that the area is written initially while running in
normal 32-bit or 64-bit mode, but then executed (in response to the
first, synchronous, SMI) as SMRAM?
currently there is no SMRAM at 0x30000, so all access falls through
into RAM address space and we are about to change that.

but firmware doesn't have to use it as RAM, it can check if QEMU
supports SMRAM at 0x30000 and if supported map it to configure
and then lock it down.


Basically I'm confused by the alias.

TSEG (and presumably, A/B seg) work like this:
- when open, looks like RAM to normal mode and SMM
- when closed, looks like black-hole to normal mode, and like RAM to SMM

The generic edk2 code knows this, and manages the SMRAM areas accordingly.

The area at 0x30000 is different:
- looks like RAM to both normal mode and SMM

If we set up the alias at 0x30000 into A/B seg,
- will that *permanently* hide the normal RAM at 0x30000?
- will 0x30000 start behaving like A/B seg?

Basically my concern is that the universal code in edk2 might or might
not keep A/B seg open while initially populating the area at the default
SMBASE. Specifically, I can imagine two issues:

- if the alias into A/B seg is inactive during the initial population,
then the initial writes go to RAM, but the execution (the first SMBASE
relocation) will occur from A/B seg through the alias

- alternatively, if the alias is always active, but A/B seg is closed
during initial population (which happens in normal mode), then the
initial writes go to the black hole, and execution will occur from a
"blank" A/B seg.

Am I seeing things? (Sorry, I keep feeling dumber and dumber in this
thread.)
I don't really know how firmware uses A/B segments and I'm afraid that
cannibalizing one for configuring 0x30000 might break something.

Since we are inventing something out of q35 spec anyway, How about
leaving A/B/TSEG to be and using fwcfg to configure when/where
SMRAM(0x30000+128K) should be mapped into RAM address space.

I see a couple of options:
1: use identity mapping where SMRAM(0x30000+128K) maps into the same
range in RAM address space when firmware writes into fwcfg
file and unmaps/locks on the second write (until HW reset)
2: let firmware choose where to map SMRAM(0x30000+128K) in RAM address
space, logic is essentially the same as above only firmware
picks and writes into fwcfg an address where SMRAM(0x30000+128K)
should be mapped.


Anyway, I guess we could try and see if OVMF still boots with the alias...

Thanks
Laszlo


Re: [edk2-devel] [RFC] EDK II Continuous Integration Phase 1

Laszlo Ersek
 

On 09/03/19 19:09, Sean Brogan wrote:
Laszlo/Mike,

The idea that the maintainer must create the PR is fighting the
optimized github PR flow. Github and PRs process is optimized for
letting everyone contribute from "their" fork while still protecting
and putting process in place for the "upstream".

Why not use github to assign maintainers to each package or filepath
and then let contributors submit their own PR to edk2 after the
mailing list has approved. Then the PR has a policy that someone that
is the maintainer of all files changed must approve before the PR can
be completed (or you could even set it so that maintainer must
complete PR). This would have the benefit of less monotonous work
for the maintainers and on rejected PRs the contributor could easily
update their branch and resubmit their PR.
I'll let Mike respond.

Thanks
Laszlo


Re: [Qemu-devel] [edk2-rfc] [edk2-devel] CPU hotplug using SMM with QEMU+OVMF

Laszlo Ersek
 

On 09/03/19 16:53, Igor Mammedov wrote:
On Mon, 2 Sep 2019 21:09:58 +0200
Laszlo Ersek <lersek@redhat.com> wrote:

On 09/02/19 10:45, Igor Mammedov wrote:
On Fri, 30 Aug 2019 20:46:14 +0200
Laszlo Ersek <lersek@redhat.com> wrote:

On 08/30/19 16:48, Igor Mammedov wrote:

(01) On boot firmware maps and initializes SMI handler at default SMBASE (30000)
(using dedicated SMRAM at 30000 would allow us to avoid save/restore
steps and make SMM handler pointer not vulnerable to DMA attacks)

(02) QEMU hotplugs a new CPU in reset-ed state and sends SCI

(03) on receiving SCI, host CPU calls GPE cpu hotplug handler
which writes to IO port 0xB2 (broadcast SMI)

(04) firmware waits for all existing CPUs rendezvous in SMM mode,
new CPU(s) have SMI pending but does nothing yet

(05) host CPU wakes up one new CPU (INIT-INIT-SIPI)
SIPI vector points to RO flash HLT loop.
(how host CPU will know which new CPUs to relocate?
possibly reuse QEMU CPU hotplug MMIO interface???)

(06) new CPU does relocation.
(in case of attacker sends SIPI to several new CPUs,
open question how to detect collision of several CPUs at the same default SMBASE)

(07) once new CPU relocated host CPU completes initialization, returns
from IO port write and executes the rest of GPE handler, telling OS
to online new CPU.
In step (03), it is the OS that handles the SCI; it transfers control to
ACPI. The AML can write to IO port 0xB2 only because the OS allows it.

If the OS decides to omit that step, and sends an INIT-SIPI-SIPI
directly to the new CPU, can it steal the CPU?
It sure can but this way it won't get access to privileged SMRAM
so OS can't subvert firmware.
The next time SMI broadcast is sent the CPU will use SMI handler at
default 30000 SMBASE. It's up to us to define behavior here (for example
relocation handler can put such CPU in shutdown state).

It's in the best interest of OS to cooperate and execute AML
provided by firmware, if it does not follow proper cpu hotplug flow
we can't guarantee that stolen CPU will work.
This sounds convincing enough, for the hotplugged CPU; thanks.

So now my concern is with step (01). While preparing for the initial
relocation (of cold-plugged CPUs), the code assumes the memory at the
default SMBASE (0x30000) is normal RAM.

Is it not a problem that the area is written initially while running in
normal 32-bit or 64-bit mode, but then executed (in response to the
first, synchronous, SMI) as SMRAM?
currently there is no SMRAM at 0x30000, so all access falls through
into RAM address space and we are about to change that.

but firmware doesn't have to use it as RAM, it can check if QEMU
supports SMRAM at 0x30000 and if supported map it to configure
and then lock it down.
I'm sure you are *technically* right, but you seem to be assuming that I
can modify or rearrange anything I want in edk2. :)

If we can solve the above in OVMF platform code, that's great. If not
(e.g. UefiCpuPkg code needs to be updated), then things will get tricky.
If we can introduce another platform hook for this, that would help. I
can't say before I try.




Basically I'm confused by the alias.

TSEG (and presumably, A/B seg) work like this:
- when open, looks like RAM to normal mode and SMM
- when closed, looks like black-hole to normal mode, and like RAM to SMM

The generic edk2 code knows this, and manages the SMRAM areas accordingly.

The area at 0x30000 is different:
- looks like RAM to both normal mode and SMM

If we set up the alias at 0x30000 into A/B seg,
- will that *permanently* hide the normal RAM at 0x30000?
- will 0x30000 start behaving like A/B seg?

Basically my concern is that the universal code in edk2 might or might
not keep A/B seg open while initially populating the area at the default
SMBASE. Specifically, I can imagine two issues:

- if the alias into A/B seg is inactive during the initial population,
then the initial writes go to RAM, but the execution (the first SMBASE
relocation) will occur from A/B seg through the alias

- alternatively, if the alias is always active, but A/B seg is closed
during initial population (which happens in normal mode), then the
initial writes go to the black hole, and execution will occur from a
"blank" A/B seg.

Am I seeing things? (Sorry, I keep feeling dumber and dumber in this
thread.)
I don't really know how firmware uses A/B segments and I'm afraid that
cannibalizing one for configuring 0x30000 might break something.

Since we are inventing something out of q35 spec anyway, How about
leaving A/B/TSEG to be and using fwcfg to configure when/where
SMRAM(0x30000+128K) should be mapped into RAM address space.

I see a couple of options:
1: use identity mapping where SMRAM(0x30000+128K) maps into the same
range in RAM address space when firmware writes into fwcfg
file and unmaps/locks on the second write (until HW reset)
2: let firmware choose where to map SMRAM(0x30000+128K) in RAM address
space, logic is essentially the same as above only firmware
picks and writes into fwcfg an address where SMRAM(0x30000+128K)
should be mapped.
Option#1 would be similar to how TSEG works now, correct? IOW normal RAM
(from the QEMU perspective) is exposed as "SMRAM" to the guest, hidden
with a "black hole" overlay (outside of SMM) if SMRAM is closed.

If that's correct, then #1 looks more attractive to me than #2.

Thanks
Laszlo


Re: [edk2-devel] [RFC] EDK II Continuous Integration Phase 1

Sean
 

Laszlo/Mike,

The idea that the maintainer must create the PR is fighting the optimized github PR flow. Github and PRs process is optimized for letting everyone contribute from "their" fork while still protecting and putting process in place for the "upstream".

Why not use github to assign maintainers to each package or filepath and then let contributors submit their own PR to edk2 after the mailing list has approved. Then the PR has a policy that someone that is the maintainer of all files changed must approve before the PR can be completed (or you could even set it so that maintainer must complete PR). This would have the benefit of less monotonous work for the maintainers and on rejected PRs the contributor could easily update their branch and resubmit their PR.

Thanks
Sean

-----Original Message-----
From: rfc@edk2.groups.io <rfc@edk2.groups.io> On Behalf Of Laszlo Ersek via Groups.Io
Sent: Tuesday, September 3, 2019 9:55 AM
To: Ni, Ray <ray.ni@intel.com>; devel@edk2.groups.io; rfc@edk2.groups.io; Gao, Liming <liming.gao@intel.com>; Kinney, Michael D <michael.d.kinney@intel.com>
Subject: Re: [edk2-rfc] [edk2-devel] [RFC] EDK II Continuous Integration Phase 1

On 09/03/19 18:41, Ni, Ray wrote:
-----Original Message-----
From: devel@edk2.groups.io <devel@edk2.groups.io> On Behalf Of Laszlo
Ersek
Sent: Tuesday, September 3, 2019 6:20 AM
To: Ni, Ray <ray.ni@intel.com>; rfc@edk2.groups.io;
devel@edk2.groups.io; Gao, Liming <liming.gao@intel.com>; Kinney,
Michael D <michael.d.kinney@intel.com>
Subject: Re: [edk2-rfc] [edk2-devel] [RFC] EDK II Continuous
Integration Phase 1

On 09/03/19 05:39, Ni, Ray wrote:
Can someone draw a flow chart of who does what to help clarify?
I don't see this as a huge change, relative to the current process.

Before, it's always been up to the subsys maintainer to apply &
rebase the patches locally, pick up the feedback tags, and run at
least *some* build tests before pushing.

After, the final git push is not to "origin" but to a personal branch
on github.com, and then a github PR is needed. If that fails, notify
the submitter. That's all, as far as I can see.

It sounds like maintainers are going to be the very important bridges between CI system and the patch owners (developers).
Important it is I agree but boring it is as well I have to say.

Oh, it *is* absolutely boring.

Are maintainers still needed to be picked up/chosen/promoted from developers?
Would you trust a person to apply, build-test, and push your
UefiCpuPkg patches, if that person had *zero* UefiCpuPkg development experience?
(Or even zero edk2 development experience?)
Can we change the process a bit?
1. maintainers created pull requests on behave of the patch owners 2.
patch owners can be notified automatically if pull requests fail
I believe this can work if:
- the submitter has a github account
- the maintainer knows the submitter's account name
- the maintainer manually subscribes the submitter to the PR
(I have never tried this myself)

3. patch owners update the pull requests
(I am not familiar to pull requests. I assume the rights of
modifying pull requests and creating pull requests are separated. Are
they?)
PRs should not be updated. If there is a CI failure, the PR should be rejected. If that happens automatically, that's great. If not, then someone must close the PR manually. If the patch series submitter can do that, that's great (I have no idea if that works!) If only the PR owner (= maintainer) can close the PR, then they should close it.

Either way, the patch author will have to submit a v2 to the list. When that is sufficiently reviewed, the same process starts (new PR opened by maintainer).

So, maintainers only need to initiate the pull requests.
I hope this can actually work. We need to test this out first.

It assumes when pull requests are initiated, everyone at least agrees the justifications of the changes are valid and the ways the changes are done.
I agree fully about this. If the CI check completes, then the changes are pushed to master automatically! (Only if the push is a fast-forward
-- otherwise the PR is rejected again. GitHub will not be allowed to merge or rebase withoout supervision.)

Thanks
Laszlo


Re: [edk2-devel] [RFC] EDK II Continuous Integration Phase 1

Laszlo Ersek
 

On 09/03/19 18:41, Ni, Ray wrote:
-----Original Message-----
From: devel@edk2.groups.io <devel@edk2.groups.io> On Behalf Of Laszlo Ersek
Sent: Tuesday, September 3, 2019 6:20 AM
To: Ni, Ray <ray.ni@intel.com>; rfc@edk2.groups.io; devel@edk2.groups.io; Gao, Liming <liming.gao@intel.com>; Kinney,
Michael D <michael.d.kinney@intel.com>
Subject: Re: [edk2-rfc] [edk2-devel] [RFC] EDK II Continuous Integration Phase 1

On 09/03/19 05:39, Ni, Ray wrote:
Can someone draw a flow chart of who does what to help clarify?
I don't see this as a huge change, relative to the current process.

Before, it's always been up to the subsys maintainer to apply & rebase
the patches locally, pick up the feedback tags, and run at least *some*
build tests before pushing.

After, the final git push is not to "origin" but to a personal branch on
github.com, and then a github PR is needed. If that fails, notify the
submitter. That's all, as far as I can see.

It sounds like maintainers are going to be the very important bridges between CI system and the patch owners (developers).
Important it is I agree but boring it is as well I have to say.

Oh, it *is* absolutely boring.

Are maintainers still needed to be picked up/chosen/promoted from developers?
Would you trust a person to apply, build-test, and push your UefiCpuPkg
patches, if that person had *zero* UefiCpuPkg development experience?
(Or even zero edk2 development experience?)
Can we change the process a bit?
1. maintainers created pull requests on behave of the patch owners
2. patch owners can be notified automatically if pull requests fail
I believe this can work if:
- the submitter has a github account
- the maintainer knows the submitter's account name
- the maintainer manually subscribes the submitter to the PR
(I have never tried this myself)

3. patch owners update the pull requests
(I am not familiar to pull requests. I assume the rights of modifying pull requests and creating pull requests are separated. Are they?)
PRs should not be updated. If there is a CI failure, the PR should be
rejected. If that happens automatically, that's great. If not, then
someone must close the PR manually. If the patch series submitter can do
that, that's great (I have no idea if that works!) If only the PR owner
(= maintainer) can close the PR, then they should close it.

Either way, the patch author will have to submit a v2 to the list. When
that is sufficiently reviewed, the same process starts (new PR opened by
maintainer).

So, maintainers only need to initiate the pull requests.
I hope this can actually work. We need to test this out first.

It assumes when pull requests are initiated, everyone at least agrees the justifications of the changes are valid and the ways the changes are done.
I agree fully about this. If the CI check completes, then the changes
are pushed to master automatically! (Only if the push is a fast-forward
-- otherwise the PR is rejected again. GitHub will not be allowed to
merge or rebase withoout supervision.)

Thanks
Laszlo


Re: [edk2-devel] [RFC] EDK II Continuous Integration Phase 1

Ni, Ray
 

-----Original Message-----
From: devel@edk2.groups.io <devel@edk2.groups.io> On Behalf Of Laszlo Ersek
Sent: Tuesday, September 3, 2019 6:20 AM
To: Ni, Ray <ray.ni@intel.com>; rfc@edk2.groups.io; devel@edk2.groups.io; Gao, Liming <liming.gao@intel.com>; Kinney,
Michael D <michael.d.kinney@intel.com>
Subject: Re: [edk2-rfc] [edk2-devel] [RFC] EDK II Continuous Integration Phase 1

On 09/03/19 05:39, Ni, Ray wrote:
Can someone draw a flow chart of who does what to help clarify?
I don't see this as a huge change, relative to the current process.

Before, it's always been up to the subsys maintainer to apply & rebase
the patches locally, pick up the feedback tags, and run at least *some*
build tests before pushing.

After, the final git push is not to "origin" but to a personal branch on
github.com, and then a github PR is needed. If that fails, notify the
submitter. That's all, as far as I can see.

It sounds like maintainers are going to be the very important bridges between CI system and the patch owners (developers).
Important it is I agree but boring it is as well I have to say.

Oh, it *is* absolutely boring.

Are maintainers still needed to be picked up/chosen/promoted from developers?
Would you trust a person to apply, build-test, and push your UefiCpuPkg
patches, if that person had *zero* UefiCpuPkg development experience?
(Or even zero edk2 development experience?)
Can we change the process a bit?
1. maintainers created pull requests on behave of the patch owners
2. patch owners can be notified automatically if pull requests fail
3. patch owners update the pull requests
(I am not familiar to pull requests. I assume the rights of modifying pull requests and creating pull requests are separated. Are they?)

So, maintainers only need to initiate the pull requests. It assumes when pull requests are initiated, everyone at least agrees the justifications of the changes are valid and the ways the changes are done.

Thanks,
Ray


Re: [edk2-devel] [RFC] EDK II Continuous Integration Phase 1

Laszlo Ersek
 

On 09/03/19 05:39, Ni, Ray wrote:
Can someone draw a flow chart of who does what to help clarify?
I don't see this as a huge change, relative to the current process.

Before, it's always been up to the subsys maintainer to apply & rebase
the patches locally, pick up the feedback tags, and run at least *some*
build tests before pushing.

After, the final git push is not to "origin" but to a personal branch on
github.com, and then a github PR is needed. If that fails, notify the
submitter. That's all, as far as I can see.

It sounds like maintainers are going to be the very important bridges between CI system and the patch owners (developers). Important it is I agree but boring it is as well I have to say.
Oh, it *is* absolutely boring.

Are maintainers still needed to be picked up/chosen/promoted from developers?
Would you trust a person to apply, build-test, and push your UefiCpuPkg
patches, if that person had *zero* UefiCpuPkg development experience?
(Or even zero edk2 development experience?)

Maintainers don't have to be intimately familiar with the subsystem that
they are pushing a given set of patches for, but every bit of knowledge
helps.

So the answer is "technically no, but you'll regret it".

Maintainership is *always* a chore. It always takes away resources from
development activities, for the maintainer.

In most projects, people alternate in a given maintainer role -- they
keep replacing each other. The variance is mostly in how frequent this
alternation is.

In some projects, the role belongs to a group, and people cover the
maintainer responsibilities in very quick succession. In other projects,
people replace each other in maintainer roles when the current
maintainer gets tired of the work, and steps down. Then someone
volunteers to assume the role.

Either way, the new subsys maintainer is almost always a seasoned
developer in the subsystem.

Thanks,
Laszlo


-----Original Message-----
From: rfc@edk2.groups.io <rfc@edk2.groups.io> On Behalf Of Laszlo Ersek
Sent: Friday, August 30, 2019 5:58 AM
To: devel@edk2.groups.io; Gao, Liming <liming.gao@intel.com>; rfc@edk2.groups.io; Kinney, Michael D
<michael.d.kinney@intel.com>
Subject: Re: [edk2-rfc] [edk2-devel] [RFC] EDK II Continuous Integration Phase 1

On 08/30/19 10:43, Liming Gao wrote:
Mike:
I add my comments.

-----Original Message-----
From: rfc@edk2.groups.io [mailto:rfc@edk2.groups.io] On Behalf Of Michael
D Kinney
Sent: Friday, August 30, 2019 4:23 AM
To: devel@edk2.groups.io; rfc@edk2.groups.io
Subject: [edk2-rfc] [RFC] EDK II Continuous Integration Phase 1

Hello,

This is a proposal for a first step towards continuous
integration for all TianoCore repositories to help
improve to quality of commits and automate testing and
release processes for all EDK II packages and platforms.

This is based on work from a number of EDK II community
members that have provide valuable input and evaluations.

* Rebecca Cran <rebecca@bsdio.com> Jenkins evaluation
* Laszlo Ersek <lersek@redhat.com> GitLab evaluation
* Philippe Mathieu-Daudé <philmd@redhat.com> GitLab evaluation
* Sean Brogan <sean.brogan@microsoft.com> Azure Pipelines and HBFA
* Bret Barkelew <Bret.Barkelew@microsoft.com> Azure Pipelines and HBFA
* Jiewen Yao <jiewen.yao@intel.com> HBFA

The following link is a link to an EDK II WIKI page that
contains a summary of the work to date. Please provide
feedback in the EDK II mailing lists. The WIKI pages will
be updated with input from the entire EDK II community.

https://github.com/tianocore/tianocore.github.io/wiki/EDK-II-Continuous-
Integration

Proposal
========
Phase 1 of adding continuous integration is limited to the
edk2 repository. Additional repositories will be added later.

The following changes are proposed:
* Remove EDK II Maintainers write access to edk2 repository.
Only EDK II Administrators will continue to have write
access, and that should only be used to handle extraordinary
events.
* EDK II Maintainers use a GitHub Pull Request instead of push
to commit a patch series to the edk2 repository. There are
no other changes to the development and review process. The
patch series is prepared in an EDK II maintainer branch with
all commit message requirements met on each patch in the series.
Will the maintainer manually provide pull request after the patch passes the review?
Yes. The maintainer will prepare a local branch that is rebased to
master, and has all the mailing list feedback tags (Reviewed-by, etc)
applied. The maintainer also does all the local testing that they
usually do, just before the final "git push origin".

Then, the final "git push origin" action is replaced by:
(1) git push personal-repo topic-branch-pr
(2) manual creation of a GitHub.com Pull Request, for the topic branch
just pushed.

That is, the final "git push origin" action is replaced with the pushing
of a personal (ready-made) topic branch, and a GitHub.com Pull Request
against that branch. The verification and the final merging will be
performed by github.

Can the script scan the mail list and auto trig pull request once the patch gets
Reviewed-by or Acked-by from Package maintainers?
No, it can't. (And, at this stage, it should not.) The coordination
between submitters, maintainers, reviewers, and occasionally, stewards,
for determining when a patch series is ready to go, based on review
discussion, remains the same.

* The edk2 repository only accepts Pull Requests from members
of the EDK II Maintainers team. Pull Requests from anyone else
are rejected.
* Run pre-commit checks using Azure Pipelines
The maintainer manually trig pre-commit check or auto trig pre-commit?
After the maintainer pushes the ready-made branch to their personal
repo, and submits a PR against that, the PR will set off the checks. If
the checks pass, the topic branch is merged.

By default, pre-commit should be auto trigged based on pull request.

* If all pre-commit checks pass, then the patch series is auto
committed. The result of this commit must match the contents
and commit history that would have occurred using the previous
push operation.
* If any pre-commit checks fail, then notify the submitter.
Will Pre-commit check fail send the mail to the patch submitter?
The patch submitter need update the patch and go through review process again.
My understanding is that, if the testing in GitHub.com fails, the PR
will be rejected and closed. This will generate a notification email for
the maintainer that submitted the PR. The maintainer can then return to
the email thread, and report the CI failure, possibly with a link to the
failed / rejected PR.

Then, indeed, the submitter must rework the series and post a new
version to the list.

It's also possible (and polite) if the maintainer posts the PR link in
the mailing list thread right after submitting the PR. Then the
submitter can monitor the PR too. (Subscribe to it for notifications.)
As I understand it.

Furthermore,


A typical reason for a failure would be a merge conflict with
another pull request that was just processed.
* Limit pre-commit checks execution time to 10 minutes.
* Provide on-demand builds to EDK II Maintainers that to allow
EDK II Maintainers to submit a branch through for the same
set of pre-commit checks without submitting a pull request.
a maintainer could use this on-demand build & testing facility in the
course of review, to report findings early.

Thanks
Laszlo


## Pre-Commit Checks in Phase 1
* Run and pass PatchCheck.py with no errors

=====================================================

The following are some additional pre-commit check ideas
that could be quickly added once the initial version using
PatchCheck.py is fully functional. Please provide feedback
on the ones you like and additional ones you think may
improve the quality of the commits to the edk2 repository.

## Proposed Pre-Commit Checks in Phase 2
* Verify Reviewed-by and Acked-by tags are present with
correct maintainer email addresses
* Verify no non-ASCII characters in modified files
* Verify no binary files in set of modified files
Now, Edk2 has few binary files, like logo.bmp.
I see one BZ to request logo bmp update.
(BZ https://bugzilla.tianocore.org/show_bug.cgi?id=2050)
So, we need the exception way for it.

* Verify package dependency rules in modified files

## Proposed Pre-Commit Checks in Phase 3
* Run ECC on modified files
* Verify modified modules/libs build
* Run available host based tests (HBFA) against modified
modules/libs
I request boot test on Emulator and Ovmf in the daily and weekly scope.
Daily can cover boot to Shell.
Weekly can cover more boot functionality.


Re: [edk2-devel] [RFC] EDK II Continuous Integration Phase 1

Ni, Ray
 

Can someone draw a flow chart of who does what to help clarify?

It sounds like maintainers are going to be the very important bridges between CI system and the patch owners (developers). Important it is I agree but boring it is as well I have to say.

Are maintainers still needed to be picked up/chosen/promoted from developers?

-----Original Message-----
From: rfc@edk2.groups.io <rfc@edk2.groups.io> On Behalf Of Laszlo Ersek
Sent: Friday, August 30, 2019 5:58 AM
To: devel@edk2.groups.io; Gao, Liming <liming.gao@intel.com>; rfc@edk2.groups.io; Kinney, Michael D
<michael.d.kinney@intel.com>
Subject: Re: [edk2-rfc] [edk2-devel] [RFC] EDK II Continuous Integration Phase 1

On 08/30/19 10:43, Liming Gao wrote:
Mike:
I add my comments.

-----Original Message-----
From: rfc@edk2.groups.io [mailto:rfc@edk2.groups.io] On Behalf Of Michael
D Kinney
Sent: Friday, August 30, 2019 4:23 AM
To: devel@edk2.groups.io; rfc@edk2.groups.io
Subject: [edk2-rfc] [RFC] EDK II Continuous Integration Phase 1

Hello,

This is a proposal for a first step towards continuous
integration for all TianoCore repositories to help
improve to quality of commits and automate testing and
release processes for all EDK II packages and platforms.

This is based on work from a number of EDK II community
members that have provide valuable input and evaluations.

* Rebecca Cran <rebecca@bsdio.com> Jenkins evaluation
* Laszlo Ersek <lersek@redhat.com> GitLab evaluation
* Philippe Mathieu-Daudé <philmd@redhat.com> GitLab evaluation
* Sean Brogan <sean.brogan@microsoft.com> Azure Pipelines and HBFA
* Bret Barkelew <Bret.Barkelew@microsoft.com> Azure Pipelines and HBFA
* Jiewen Yao <jiewen.yao@intel.com> HBFA

The following link is a link to an EDK II WIKI page that
contains a summary of the work to date. Please provide
feedback in the EDK II mailing lists. The WIKI pages will
be updated with input from the entire EDK II community.

https://github.com/tianocore/tianocore.github.io/wiki/EDK-II-Continuous-
Integration

Proposal
========
Phase 1 of adding continuous integration is limited to the
edk2 repository. Additional repositories will be added later.

The following changes are proposed:
* Remove EDK II Maintainers write access to edk2 repository.
Only EDK II Administrators will continue to have write
access, and that should only be used to handle extraordinary
events.
* EDK II Maintainers use a GitHub Pull Request instead of push
to commit a patch series to the edk2 repository. There are
no other changes to the development and review process. The
patch series is prepared in an EDK II maintainer branch with
all commit message requirements met on each patch in the series.
Will the maintainer manually provide pull request after the patch passes the review?
Yes. The maintainer will prepare a local branch that is rebased to
master, and has all the mailing list feedback tags (Reviewed-by, etc)
applied. The maintainer also does all the local testing that they
usually do, just before the final "git push origin".

Then, the final "git push origin" action is replaced by:
(1) git push personal-repo topic-branch-pr
(2) manual creation of a GitHub.com Pull Request, for the topic branch
just pushed.

That is, the final "git push origin" action is replaced with the pushing
of a personal (ready-made) topic branch, and a GitHub.com Pull Request
against that branch. The verification and the final merging will be
performed by github.

Can the script scan the mail list and auto trig pull request once the patch gets
Reviewed-by or Acked-by from Package maintainers?
No, it can't. (And, at this stage, it should not.) The coordination
between submitters, maintainers, reviewers, and occasionally, stewards,
for determining when a patch series is ready to go, based on review
discussion, remains the same.

* The edk2 repository only accepts Pull Requests from members
of the EDK II Maintainers team. Pull Requests from anyone else
are rejected.
* Run pre-commit checks using Azure Pipelines
The maintainer manually trig pre-commit check or auto trig pre-commit?
After the maintainer pushes the ready-made branch to their personal
repo, and submits a PR against that, the PR will set off the checks. If
the checks pass, the topic branch is merged.

By default, pre-commit should be auto trigged based on pull request.

* If all pre-commit checks pass, then the patch series is auto
committed. The result of this commit must match the contents
and commit history that would have occurred using the previous
push operation.
* If any pre-commit checks fail, then notify the submitter.
Will Pre-commit check fail send the mail to the patch submitter?
The patch submitter need update the patch and go through review process again.
My understanding is that, if the testing in GitHub.com fails, the PR
will be rejected and closed. This will generate a notification email for
the maintainer that submitted the PR. The maintainer can then return to
the email thread, and report the CI failure, possibly with a link to the
failed / rejected PR.

Then, indeed, the submitter must rework the series and post a new
version to the list.

It's also possible (and polite) if the maintainer posts the PR link in
the mailing list thread right after submitting the PR. Then the
submitter can monitor the PR too. (Subscribe to it for notifications.)
As I understand it.

Furthermore,


A typical reason for a failure would be a merge conflict with
another pull request that was just processed.
* Limit pre-commit checks execution time to 10 minutes.
* Provide on-demand builds to EDK II Maintainers that to allow
EDK II Maintainers to submit a branch through for the same
set of pre-commit checks without submitting a pull request.
a maintainer could use this on-demand build & testing facility in the
course of review, to report findings early.

Thanks
Laszlo


## Pre-Commit Checks in Phase 1
* Run and pass PatchCheck.py with no errors

=====================================================

The following are some additional pre-commit check ideas
that could be quickly added once the initial version using
PatchCheck.py is fully functional. Please provide feedback
on the ones you like and additional ones you think may
improve the quality of the commits to the edk2 repository.

## Proposed Pre-Commit Checks in Phase 2
* Verify Reviewed-by and Acked-by tags are present with
correct maintainer email addresses
* Verify no non-ASCII characters in modified files
* Verify no binary files in set of modified files
Now, Edk2 has few binary files, like logo.bmp.
I see one BZ to request logo bmp update.
(BZ https://bugzilla.tianocore.org/show_bug.cgi?id=2050)
So, we need the exception way for it.

* Verify package dependency rules in modified files

## Proposed Pre-Commit Checks in Phase 3
* Run ECC on modified files
* Verify modified modules/libs build
* Run available host based tests (HBFA) against modified
modules/libs
I request boot test on Emulator and Ovmf in the daily and weekly scope.
Daily can cover boot to Shell.
Weekly can cover more boot functionality.


Re: [edk2-devel] CPU hotplug using SMM with QEMU+OVMF

Laszlo Ersek
 

On 09/02/19 10:45, Igor Mammedov wrote:
On Fri, 30 Aug 2019 20:46:14 +0200
Laszlo Ersek <lersek@redhat.com> wrote:

On 08/30/19 16:48, Igor Mammedov wrote:

(01) On boot firmware maps and initializes SMI handler at default SMBASE (30000)
(using dedicated SMRAM at 30000 would allow us to avoid save/restore
steps and make SMM handler pointer not vulnerable to DMA attacks)

(02) QEMU hotplugs a new CPU in reset-ed state and sends SCI

(03) on receiving SCI, host CPU calls GPE cpu hotplug handler
which writes to IO port 0xB2 (broadcast SMI)

(04) firmware waits for all existing CPUs rendezvous in SMM mode,
new CPU(s) have SMI pending but does nothing yet

(05) host CPU wakes up one new CPU (INIT-INIT-SIPI)
SIPI vector points to RO flash HLT loop.
(how host CPU will know which new CPUs to relocate?
possibly reuse QEMU CPU hotplug MMIO interface???)

(06) new CPU does relocation.
(in case of attacker sends SIPI to several new CPUs,
open question how to detect collision of several CPUs at the same default SMBASE)

(07) once new CPU relocated host CPU completes initialization, returns
from IO port write and executes the rest of GPE handler, telling OS
to online new CPU.
In step (03), it is the OS that handles the SCI; it transfers control to
ACPI. The AML can write to IO port 0xB2 only because the OS allows it.

If the OS decides to omit that step, and sends an INIT-SIPI-SIPI
directly to the new CPU, can it steal the CPU?
It sure can but this way it won't get access to privileged SMRAM
so OS can't subvert firmware.
The next time SMI broadcast is sent the CPU will use SMI handler at
default 30000 SMBASE. It's up to us to define behavior here (for example
relocation handler can put such CPU in shutdown state).

It's in the best interest of OS to cooperate and execute AML
provided by firmware, if it does not follow proper cpu hotplug flow
we can't guarantee that stolen CPU will work.
This sounds convincing enough, for the hotplugged CPU; thanks.

So now my concern is with step (01). While preparing for the initial
relocation (of cold-plugged CPUs), the code assumes the memory at the
default SMBASE (0x30000) is normal RAM.

Is it not a problem that the area is written initially while running in
normal 32-bit or 64-bit mode, but then executed (in response to the
first, synchronous, SMI) as SMRAM?

Basically I'm confused by the alias.

TSEG (and presumably, A/B seg) work like this:
- when open, looks like RAM to normal mode and SMM
- when closed, looks like black-hole to normal mode, and like RAM to SMM

The generic edk2 code knows this, and manages the SMRAM areas accordingly.

The area at 0x30000 is different:
- looks like RAM to both normal mode and SMM

If we set up the alias at 0x30000 into A/B seg,
- will that *permanently* hide the normal RAM at 0x30000?
- will 0x30000 start behaving like A/B seg?

Basically my concern is that the universal code in edk2 might or might
not keep A/B seg open while initially populating the area at the default
SMBASE. Specifically, I can imagine two issues:

- if the alias into A/B seg is inactive during the initial population,
then the initial writes go to RAM, but the execution (the first SMBASE
relocation) will occur from A/B seg through the alias

- alternatively, if the alias is always active, but A/B seg is closed
during initial population (which happens in normal mode), then the
initial writes go to the black hole, and execution will occur from a
"blank" A/B seg.

Am I seeing things? (Sorry, I keep feeling dumber and dumber in this
thread.)

Anyway, I guess we could try and see if OVMF still boots with the alias...

Thanks
Laszlo


Re: [edk2-devel] CPU hotplug using SMM with QEMU+OVMF

Igor Mammedov <imammedo@...>
 

On Fri, 30 Aug 2019 20:46:14 +0200
Laszlo Ersek <lersek@redhat.com> wrote:

On 08/30/19 16:48, Igor Mammedov wrote:

(01) On boot firmware maps and initializes SMI handler at default SMBASE (30000)
(using dedicated SMRAM at 30000 would allow us to avoid save/restore
steps and make SMM handler pointer not vulnerable to DMA attacks)

(02) QEMU hotplugs a new CPU in reset-ed state and sends SCI

(03) on receiving SCI, host CPU calls GPE cpu hotplug handler
which writes to IO port 0xB2 (broadcast SMI)

(04) firmware waits for all existing CPUs rendezvous in SMM mode,
new CPU(s) have SMI pending but does nothing yet

(05) host CPU wakes up one new CPU (INIT-INIT-SIPI)
SIPI vector points to RO flash HLT loop.
(how host CPU will know which new CPUs to relocate?
possibly reuse QEMU CPU hotplug MMIO interface???)

(06) new CPU does relocation.
(in case of attacker sends SIPI to several new CPUs,
open question how to detect collision of several CPUs at the same default SMBASE)

(07) once new CPU relocated host CPU completes initialization, returns
from IO port write and executes the rest of GPE handler, telling OS
to online new CPU.
In step (03), it is the OS that handles the SCI; it transfers control to
ACPI. The AML can write to IO port 0xB2 only because the OS allows it.

If the OS decides to omit that step, and sends an INIT-SIPI-SIPI
directly to the new CPU, can it steal the CPU?
It sure can but this way it won't get access to privileged SMRAM
so OS can't subvert firmware.
The next time SMI broadcast is sent the CPU will use SMI handler at
default 30000 SMBASE. It's up to us to define behavior here (for example
relocation handler can put such CPU in shutdown state).

It's in the best interest of OS to cooperate and execute AML
provided by firmware, if it does not follow proper cpu hotplug flow
we can't guarantee that stolen CPU will work.

Thanks!
Laszlo


Re: [RFC] EDK II Continuous Integration Phase 1

rebecca@...
 

On 2019-08-29 14:22, Michael D Kinney wrote:
This is based on work from a number of EDK II community
members that have provide valuable input and evaluations.

* Rebecca Cran <rebecca@bsdio.com> Jenkins evaluation
* Laszlo Ersek <lersek@redhat.com> GitLab evaluation
* Philippe Mathieu-Daudé <philmd@redhat.com> GitLab evaluation
* Sean Brogan <sean.brogan@microsoft.com> Azure Pipelines and HBFA
* Bret Barkelew <Bret.Barkelew@microsoft.com> Azure Pipelines and HBFA
* Jiewen Yao <jiewen.yao@intel.com> HBFA

The following link is a link to an EDK II WIKI page that
contains a summary of the work to date. Please provide
feedback in the EDK II mailing lists. The WIKI pages will
be updated with input from the entire EDK II community.

https://github.com/tianocore/tianocore.github.io/wiki/EDK-II-Continuous-Integration


Sorry, I was thinking the evaluation I was doing was having the entire
CI run in Jenkins, not only the boot testing. I don't have the array of
hardware you'd want for boot testing, just mostly large x86 systems. 

So I think I'll drop out of the evaluation.


--

Rebecca Cran


Re: [edk2-devel] CPU hotplug using SMM with QEMU+OVMF

Laszlo Ersek
 

On 08/30/19 16:48, Igor Mammedov wrote:

(01) On boot firmware maps and initializes SMI handler at default SMBASE (30000)
(using dedicated SMRAM at 30000 would allow us to avoid save/restore
steps and make SMM handler pointer not vulnerable to DMA attacks)

(02) QEMU hotplugs a new CPU in reset-ed state and sends SCI

(03) on receiving SCI, host CPU calls GPE cpu hotplug handler
which writes to IO port 0xB2 (broadcast SMI)

(04) firmware waits for all existing CPUs rendezvous in SMM mode,
new CPU(s) have SMI pending but does nothing yet

(05) host CPU wakes up one new CPU (INIT-INIT-SIPI)
SIPI vector points to RO flash HLT loop.
(how host CPU will know which new CPUs to relocate?
possibly reuse QEMU CPU hotplug MMIO interface???)

(06) new CPU does relocation.
(in case of attacker sends SIPI to several new CPUs,
open question how to detect collision of several CPUs at the same default SMBASE)

(07) once new CPU relocated host CPU completes initialization, returns
from IO port write and executes the rest of GPE handler, telling OS
to online new CPU.
In step (03), it is the OS that handles the SCI; it transfers control to
ACPI. The AML can write to IO port 0xB2 only because the OS allows it.

If the OS decides to omit that step, and sends an INIT-SIPI-SIPI
directly to the new CPU, can it steal the CPU?

Thanks!
Laszlo


Re: [edk2-devel] CPU hotplug using SMM with QEMU+OVMF

Igor Mammedov <imammedo@...>
 

On Thu, 29 Aug 2019 19:01:35 +0200
Laszlo Ersek <lersek@redhat.com> wrote:

On 08/27/19 20:31, Igor Mammedov wrote:
On Sat, 24 Aug 2019 01:48:09 +0000
"Yao, Jiewen" <jiewen.yao@intel.com> wrote:
(05) Host CPU: (OS) Port 0xB2 write, all CPUs enter SMM (NOTE: New CPU
will not enter CPU because SMI is disabled)
I think only CPU that does the write will enter SMM
That used to be the case (and it is still the default QEMU behavior, if
broadcast SMI is not negotiated). However, OVMF does negotiate broadcast
SMI whenever QEMU offers the feature. Broadcast SMI is important for the
stability of the edk2 SMM infrastructure on QEMU/KVM, we've found.

https://bugzilla.redhat.com/show_bug.cgi?id=1412313
https://bugzilla.redhat.com/show_bug.cgi?id=1412327

and we might not need to pull in all already initialized CPUs into SMM.
That, on the other hand, could be a valid idea. But then the CPU should
use a different method for raising a synchronous SMI for itself (not a
write to IO port 0xB2). Is a "directed SMI for self" possible?
theoretically depending on argument in 0xb3, it should be possible to
rise directed SMI even if broadcast ones are negotiated.

[...]
I've tried to read through the procedure with your suggested changes,
but I'm failing at composing a coherent mental image, in this email
response format.

If you have the time, can you write up the suggested list of steps in a
"flat" format? (I believe you are suggesting to eliminate some steps
completely.)
if I'd sum it up:

(01) On boot firmware maps and initializes SMI handler at default SMBASE (30000)
(using dedicated SMRAM at 30000 would allow us to avoid save/restore
steps and make SMM handler pointer not vulnerable to DMA attacks)

(02) QEMU hotplugs a new CPU in reset-ed state and sends SCI

(03) on receiving SCI, host CPU calls GPE cpu hotplug handler
which writes to IO port 0xB2 (broadcast SMI)

(04) firmware waits for all existing CPUs rendezvous in SMM mode,
new CPU(s) have SMI pending but does nothing yet

(05) host CPU wakes up one new CPU (INIT-INIT-SIPI)
SIPI vector points to RO flash HLT loop.
(how host CPU will know which new CPUs to relocate?
possibly reuse QEMU CPU hotplug MMIO interface???)

(06) new CPU does relocation.
(in case of attacker sends SIPI to several new CPUs,
open question how to detect collision of several CPUs at the same default SMBASE)

(07) once new CPU relocated host CPU completes initialization, returns
from IO port write and executes the rest of GPE handler, telling OS
to online new CPU.


... jumping to another point:

2) Let trusted software (SMM and init code) guarantee SMREBASE one by one (include any code runs before SMREBASE)
that would mean pulling all present CPUs into SMM mode so no attack
code could be executing before doing hotplug. With a lot of present CPUs
it could be quite expensive and unlike physical hardware, guest's CPUs
could be preempted arbitrarily long causing long delays.
I agree with your analysis, but I slightly disagree about the impact:

- CPU hotplug is not a frequent administrative action, so the CPU load
should be temporary (it should be a spike). I don't worry that it would
trip up OS kernel code. (SMI handling is known to take long on physical
platforms oo.) In practice, all "normal" SMIs are broadcast already (for
example when calling the runtime UEFI variable services from the OS kernel).

- The fact that QEMU/KVM introduces some jitter into the execution of
multi-core code (including SMM code) has proved useful in the past, for
catching edk2 regressions.

Again, this is not a strong disagreement from my side. I'm open to
better ways for synching CPUs during muti-CPU-hotplug.

(Digression:

I expect someone could be curious why (a) I find it acceptable (even
beneficial) that "some jitter" injected by the QEMU/KVM scheduling
exposes multi-core regressions in edk2, but at the same time (b) I found
it really important to add broadcast SMI to QEMU and OVMF. After all,
both "jitter" and "unicast SMIs" are QEMU/KVM platform specifics, so why
the different treatment?

The reason is that the "jitter" does not interfere with normal
operation, and it has been good for catching *regressions*. IOW, there
is a working edk2 state, someone posts a patch, works on physical
hardware, but breaks on QEMU/KVM --> then we can still reject or rework
or revert the patch. And we're back to a working state again (in the
best case, with a fixed feature patch).

With the unicast SMIs however, it was impossible to enable the SMM stack
reliably in the first place. There was no functional state to return to.
I don't really get the last statement, but the I know nothing about OVMF.
I don't insist on unicast SMI being used, it's just some ideas about what
we could do. It could be done later, broadcast SMI (might be not the best)
is sufficient to implement CPU hotplug.

Digression ends.)

lets first see if if we can ignore race
Makes me uncomfortable, but if this is the consensus, I'll go along.
same here, as mentioned in another reply as it's only possible in
attack case (multiple SMIs + multiple SIPI) so it could be fine to just
explode in case it happens (point is fw in not leaking anything from SMRAM
and OS did something illegeal).

and if it's not then
we probably end up with implementing some form of #1
OK.

Thanks!
Laszlo


Re: [Qemu-devel] [edk2-rfc] [edk2-devel] CPU hotplug using SMM with QEMU+OVMF

Igor Mammedov <imammedo@...>
 

On Thu, 29 Aug 2019 18:25:17 +0200
Laszlo Ersek <lersek@redhat.com> wrote:

On 08/28/19 14:01, Igor Mammedov wrote:
On Tue, 27 Aug 2019 22:11:15 +0200
Laszlo Ersek <lersek@redhat.com> wrote:

On 08/27/19 18:23, Igor Mammedov wrote:
On Mon, 26 Aug 2019 17:30:43 +0200
Laszlo Ersek <lersek@redhat.com> wrote:

On 08/23/19 17:25, Kinney, Michael D wrote:
Hi Jiewen,

If a hot add CPU needs to run any code before the
first SMI, I would recommend is only executes code
from a write protected FLASH range without a stack
and then wait for the first SMI.
"without a stack" looks very risky to me. Even if we manage to implement
the guest code initially, we'll be trapped without a stack, should we
ever need to add more complex stuff there.
Do we need anything complex in relocation handler, though?
From what I'd imagine, minimum handler should
1: get address of TSEG, possibly read it from chipset
The TSEG base calculation is not trivial in this environment. The 32-bit
RAM size needs to be read from the CMOS (IO port accesses). Then the
extended TSEG size (if any) needs to be detected from PCI config space
(IO port accesses). Both CMOS and PCI config space requires IO port
writes too (not just reads). Even if there are enough registers for the
calculations, can we rely on these unprotected IO ports?

Also, can we switch to 32-bit mode without a stack? I assume it would be
necessary to switch to 32-bit mode for 32-bit arithmetic.
from SDM vol 3:
"
34.5.1 Initial SMM Execution Environment
After saving the current context of the processor, the processor initializes its core registers to the values shown in Table 34-4. Upon entering SMM, the PE and PG flags in control register CR0 are cleared, which places the processor in an environment similar to real-address mode. The differences between the SMM execution environment and the real-address mode execution environment are as follows:
• The addressable address space ranges from 0 to FFFFFFFFH (4 GBytes).
• The normal 64-KByte segment limit for real-address mode is increased to 4 GBytes.
• The default operand and address sizes are set to 16 bits, which restricts the addressable SMRAM address space to the 1-MByte real-address mode limit for native real-address-mode code. However, operand-size and address-size override prefixes can be used to access the address space beyond
^^^^^^^^
the 1-MByte.
"
That helps. Thanks for the quote!

Getting the initial APIC ID needs some CPUID instructions IIUC, which
clobber EAX through EDX, if I understand correctly. Given the register
pressure, CPUID might have to be one of the first instructions to call.
we could map at 30000 not 64K required for save area but 128K and use
2nd half as secure RAM for stack and intermediate data.

Firmware could put there pre-calculated pointer to TSEG after it's configured and locked down,
this way relocation handler won't have to figure out TSEG address on its own.
Sounds like a great idea.

2: calculate its new SMBASE offset based on its APIC ID
3: save new SMBASE

For this OVMF use case, is any CPU init required
before the first SMI?
I expressed a preference for that too: "I wish we could simply wake the
new CPU [...] with an SMI".

http://mid.mail-archive.com/398b3327-0820-95af-a34d-1a4a1d50cf35@redhat.com


From Paolo's list of steps are steps (8a) and (8b)
really required?
07b - implies 08b
I agree about that implication, yes. *If* we send an INIT/SIPI/SIPI to
the new CPU, then the new CPU needs a HLT loop, I think.
It also could execute INIT reset, which leaves initialized SMM untouched
but otherwise CPU would be inactive.


8b could be trivial hlt loop and we most likely could skip 08a and signaling host CPU steps
but we need INIT/SIPI/SIPI sequence to wake up AP so it could handle pending SMI
before handling SIPI (so behavior would follow SDM).


See again my message linked above -- just after the quoted sentence, I
wrote, "IOW, if we could excise steps 07b, 08a, 08b".

But, I obviously defer to Paolo and Igor on that.

(I do believe we have a dilemma here. In QEMU, we probably prefer to
emulate physical hardware as faithfully as possible. However, we do not
have Cache-As-RAM (nor do we intend to, IIUC). Does that justify other
divergences from physical hardware too, such as waking just by virtue of
an SMI?)
So far we should be able to implement it per spec (at least SDM one),
but we would still need to invent chipset hardware
i.e. like adding to Q35 non exiting SMRAM and means to map/unmap it
to non-SMM address space.
(and I hope we could avoid adding "parked CPU" thingy)
I think we'll need a separate QEMU tree for this. I'm quite in the dark
-- I can't tell if I'll be able to do something in OVMF without actually
trying it. And for that, we'll need some proposed QEMU code that is
testable, but not upstream yet. (As I might realize that I'm unable to
make it work in OVMF.)
Let me prepare a QEMU branch with something usable for you.

To avoid inventing mgmt API for configuring SMRAM at 30000,
I'm suggesting to steal/alias top or bottom 128K of TSEG window to 30000.
This way OVMF would be able to set SMI relocation handler modifying
TSEG and pass TSEG base/other data to it as well.
Would it work for you or should we try more elaborate approach?
I believe this this change may not be cross-compatible between QEMU and
OVMF. OVMF platform code would have to hide the stolen part of the TSEG
from core edk2 SMM code.

If old OVMF were booted on new QEMU, I believe things could break -- the
SMM core would be at liberty to use any part of the TSEG (advertized by
OVMF platform code to the full extent), and the SMM core would continue
expecting 0x30000 to be normal (and distinct) RAM. If QEMU suddenly
aliased both ranges to the same contents (in System Management Mode), I
think that would confuse the SMM core.

We already negotiate (or at least, detect) two features in this area;
"extended TSEG" and "broadcast SMI". I believe we need a CPU hotplug
controller anyway -- is that still the case? If it is, we could use
registers on that device, for managing the alias.
Ok, let me check if we could cannibalize q35 pci-host for the task or
it would be easier to extend MMIO cpu-hotplug interface.
I'll probably come back with questions about how OVMF uses pci-host later.

If the default SMBASE area is corrupted due to concurrent access, could
that lead to invalid relocated SMBASE values? Possibly pointing into
normal RAM?
in case of broadcast SMI (btw does OVMF use broadcast SMIs?) several CPUs could end up
Broadcast SMI is very important for OVMF.

The Platform Init spec basically defines an abstract interface for
runtime UEFI drivers for submitting an "SMM request". Part of that is
raising an SMI (also abstracted).

*How* an SMI is raised is platform-dependent, and edk2 provides two
implementations for synching APs in SMM (broadcast ("traditional") and
relaxed).

In our testing on QEMU/KVM, the broadcast/traditional sync mode worked
very robustly (with QEMU actually broadcasting the SMI in response to IO
port 0xB2 writes), but the relaxed synch mode was unstable / brittle (in
particular during S3 resume). Therefore broadcast SMI is negotiated by
OVMF whenever it is available -- it makes a big difference in stability.

Now, whether broadcast SMI needs to be part of CPU hotplug specifically,
that's a different question. The CPU hotplug logic may not necessarily
have to go through the same (standardized) interfaces that runtime UEFI
drivers do.
considering above we are pretty much stuck with broadcast SMI mode
for standard UEFI interfaces. So for starters we can use it for CPU
hotplug as well (I think it's not possible to trigger directed SMI
from GPE handler and no nice way to implement it comes to my mind so far)

Broadcast SMI by itself does not present any problems to normal hotplug
flow. Even if there are several hotplugged CPUs, SMI# will be only pending
on all of them and host CPU can serialize them by waking one CPU at a time
by sending INIT-INIT-SIPI. Once one CPU is relocated, host CPU may wake up
the next one the same way ...


with the same SMBASE within SMRAM
1: default one: in case the 2nd CPU enters SMM after the 1st CPU saved new SMBASE but before it's called RSM
2: duplicated SMBASE: where the 2nd CPU saves its new SMBASE before the 1st calls RSM

while the 2nd could be counteracted with using locks, I don't see how 1st one could be avoided.
May be host CPU can send 2nd SMI so just relocated CPU could send an ACK from relocated SMBASE/with new SMI handler?
I don't have any better idea. We could protect the default SMBASE with a
semaphore (spinlock?) in SMRAM, but that would have to be released with
the owning CPU executing code at the new SMBASE. Basically, what you
say, just "ACK" meaning "release the spinlock".
Lets try it, if it won't work out we will invent 'parking' feature in QEMU.
Considering that an attack scenario, it is probably fine to even avoid
attempts to recover from collision if it happens and just do machine wide
reset once detected that CPU is using not its own SMBASE.
This way locking might be not needed.

In case of 'parking' I see 2 possible ways:
1: on CPU hotplug inhibit other CPUs hotplug in QEMU (device_add cpu) and
wait until firmware permits it. (relatively simple to implement without
affecting CPU/KVM code)
downside is that's not nice to upper layers as they will start getting
transient errors while previously hotplugged CPU is being relocated.

2: implement parked 'CPU' feature, which as I think in practice means
ignore or queue SIPI and process it only when allowed (which is out
of spec behavior). That probably would require changes not only to
QEMU but to KVM as well.

Thanks,
Laszlo


Re: [edk2-devel] [RFC] EDK II Continuous Integration Phase 1

Laszlo Ersek
 

Hi Sean,

On 08/30/19 04:21, Sean via Groups.Io wrote:
would like to see these two efforts merged together.
Currently if the full set of tests are run we take about 20 minutes.
This is because compiling MdeModulePkg for debug, release, and host
based tests take a while. Most other packages are in the 10 minute
range. We do have easy ways to disable or limit certain tests as well
as expand the matrix to leverage more cloud resources (more parallel
builds).
### Module Inclusion Test - DscCompleteCheck
### Code Compilation Test - CompilerPlugin
### Host-Based UnitTests - HostUnitTestCompilerPlugin and
HostUnitTestDscCompleteCheck
### GUID Uniqueness Test - GuidCheck
### Cross-Package Dependency Test - DependencyCheck
### Library Declaration Test - LibraryClassCheck
### Invalid Character Test - CharEncodingCheck
These tests sound awesome!

## Next Steps

* Receive community feedback on RFC.
* Determine where this phase makes sense given existing RFCs from
other TianoCore contributors.
* Optimize testing beharior.
* Only run a subset of tests on PRs or individual commits.
* Run full testing either once per day or once every several
commits.

I'd like to keep the per-PR tests down to 10 minutes.

On the other hand, it would be great if all of these tests could be
performed daily or weekly.

Are these tests easy to integrate with the infrastructure described by
Mike?

Thanks,
Laszlo


Re: [edk2-devel] [RFC] EDK II Continuous Integration Phase 1

Laszlo Ersek
 

On 08/30/19 10:43, Liming Gao wrote:
Mike:
I add my comments.

-----Original Message-----
From: rfc@edk2.groups.io [mailto:rfc@edk2.groups.io] On Behalf Of Michael
D Kinney
Sent: Friday, August 30, 2019 4:23 AM
To: devel@edk2.groups.io; rfc@edk2.groups.io
Subject: [edk2-rfc] [RFC] EDK II Continuous Integration Phase 1

Hello,

This is a proposal for a first step towards continuous
integration for all TianoCore repositories to help
improve to quality of commits and automate testing and
release processes for all EDK II packages and platforms.

This is based on work from a number of EDK II community
members that have provide valuable input and evaluations.

* Rebecca Cran <rebecca@bsdio.com> Jenkins evaluation
* Laszlo Ersek <lersek@redhat.com> GitLab evaluation
* Philippe Mathieu-Daudé <philmd@redhat.com> GitLab evaluation
* Sean Brogan <sean.brogan@microsoft.com> Azure Pipelines and HBFA
* Bret Barkelew <Bret.Barkelew@microsoft.com> Azure Pipelines and HBFA
* Jiewen Yao <jiewen.yao@intel.com> HBFA

The following link is a link to an EDK II WIKI page that
contains a summary of the work to date. Please provide
feedback in the EDK II mailing lists. The WIKI pages will
be updated with input from the entire EDK II community.

https://github.com/tianocore/tianocore.github.io/wiki/EDK-II-Continuous-
Integration

Proposal
========
Phase 1 of adding continuous integration is limited to the
edk2 repository. Additional repositories will be added later.

The following changes are proposed:
* Remove EDK II Maintainers write access to edk2 repository.
Only EDK II Administrators will continue to have write
access, and that should only be used to handle extraordinary
events.
* EDK II Maintainers use a GitHub Pull Request instead of push
to commit a patch series to the edk2 repository. There are
no other changes to the development and review process. The
patch series is prepared in an EDK II maintainer branch with
all commit message requirements met on each patch in the series.
Will the maintainer manually provide pull request after the patch passes the review?
Yes. The maintainer will prepare a local branch that is rebased to
master, and has all the mailing list feedback tags (Reviewed-by, etc)
applied. The maintainer also does all the local testing that they
usually do, just before the final "git push origin".

Then, the final "git push origin" action is replaced by:
(1) git push personal-repo topic-branch-pr
(2) manual creation of a GitHub.com Pull Request, for the topic branch
just pushed.

That is, the final "git push origin" action is replaced with the pushing
of a personal (ready-made) topic branch, and a GitHub.com Pull Request
against that branch. The verification and the final merging will be
performed by github.

Can the script scan the mail list and auto trig pull request once the patch gets
Reviewed-by or Acked-by from Package maintainers?
No, it can't. (And, at this stage, it should not.) The coordination
between submitters, maintainers, reviewers, and occasionally, stewards,
for determining when a patch series is ready to go, based on review
discussion, remains the same.

* The edk2 repository only accepts Pull Requests from members
of the EDK II Maintainers team. Pull Requests from anyone else
are rejected.
* Run pre-commit checks using Azure Pipelines
The maintainer manually trig pre-commit check or auto trig pre-commit?
After the maintainer pushes the ready-made branch to their personal
repo, and submits a PR against that, the PR will set off the checks. If
the checks pass, the topic branch is merged.

By default, pre-commit should be auto trigged based on pull request.

* If all pre-commit checks pass, then the patch series is auto
committed. The result of this commit must match the contents
and commit history that would have occurred using the previous
push operation.
* If any pre-commit checks fail, then notify the submitter.
Will Pre-commit check fail send the mail to the patch submitter?
The patch submitter need update the patch and go through review process again.
My understanding is that, if the testing in GitHub.com fails, the PR
will be rejected and closed. This will generate a notification email for
the maintainer that submitted the PR. The maintainer can then return to
the email thread, and report the CI failure, possibly with a link to the
failed / rejected PR.

Then, indeed, the submitter must rework the series and post a new
version to the list.

It's also possible (and polite) if the maintainer posts the PR link in
the mailing list thread right after submitting the PR. Then the
submitter can monitor the PR too. (Subscribe to it for notifications.)
As I understand it.

Furthermore,


A typical reason for a failure would be a merge conflict with
another pull request that was just processed.
* Limit pre-commit checks execution time to 10 minutes.
* Provide on-demand builds to EDK II Maintainers that to allow
EDK II Maintainers to submit a branch through for the same
set of pre-commit checks without submitting a pull request.
a maintainer could use this on-demand build & testing facility in the
course of review, to report findings early.

Thanks
Laszlo


## Pre-Commit Checks in Phase 1
* Run and pass PatchCheck.py with no errors

=====================================================

The following are some additional pre-commit check ideas
that could be quickly added once the initial version using
PatchCheck.py is fully functional. Please provide feedback
on the ones you like and additional ones you think may
improve the quality of the commits to the edk2 repository.

## Proposed Pre-Commit Checks in Phase 2
* Verify Reviewed-by and Acked-by tags are present with
correct maintainer email addresses
* Verify no non-ASCII characters in modified files
* Verify no binary files in set of modified files
Now, Edk2 has few binary files, like logo.bmp.
I see one BZ to request logo bmp update.
(BZ https://bugzilla.tianocore.org/show_bug.cgi?id=2050)
So, we need the exception way for it.

* Verify package dependency rules in modified files

## Proposed Pre-Commit Checks in Phase 3
* Run ECC on modified files
* Verify modified modules/libs build
* Run available host based tests (HBFA) against modified
modules/libs
I request boot test on Emulator and Ovmf in the daily and weekly scope.
Daily can cover boot to Shell.
Weekly can cover more boot functionality.


Re: [RFC] EDK II Continuous Integration Phase 1

Liming Gao
 

Mike:
I add my comments.

-----Original Message-----
From: rfc@edk2.groups.io [mailto:rfc@edk2.groups.io] On Behalf Of Michael
D Kinney
Sent: Friday, August 30, 2019 4:23 AM
To: devel@edk2.groups.io; rfc@edk2.groups.io
Subject: [edk2-rfc] [RFC] EDK II Continuous Integration Phase 1

Hello,

This is a proposal for a first step towards continuous
integration for all TianoCore repositories to help
improve to quality of commits and automate testing and
release processes for all EDK II packages and platforms.

This is based on work from a number of EDK II community
members that have provide valuable input and evaluations.

* Rebecca Cran <rebecca@bsdio.com> Jenkins evaluation
* Laszlo Ersek <lersek@redhat.com> GitLab evaluation
* Philippe Mathieu-Daudé <philmd@redhat.com> GitLab evaluation
* Sean Brogan <sean.brogan@microsoft.com> Azure Pipelines and HBFA
* Bret Barkelew <Bret.Barkelew@microsoft.com> Azure Pipelines and HBFA
* Jiewen Yao <jiewen.yao@intel.com> HBFA

The following link is a link to an EDK II WIKI page that
contains a summary of the work to date. Please provide
feedback in the EDK II mailing lists. The WIKI pages will
be updated with input from the entire EDK II community.

https://github.com/tianocore/tianocore.github.io/wiki/EDK-II-Continuous-
Integration

Proposal
========
Phase 1 of adding continuous integration is limited to the
edk2 repository. Additional repositories will be added later.

The following changes are proposed:
* Remove EDK II Maintainers write access to edk2 repository.
Only EDK II Administrators will continue to have write
access, and that should only be used to handle extraordinary
events.
* EDK II Maintainers use a GitHub Pull Request instead of push
to commit a patch series to the edk2 repository. There are
no other changes to the development and review process. The
patch series is prepared in an EDK II maintainer branch with
all commit message requirements met on each patch in the series.
Will the maintainer manually provide pull request after the patch passes the review?
Can the script scan the mail list and auto trig pull request once the patch gets
Reviewed-by or Acked-by from Package maintainers?

* The edk2 repository only accepts Pull Requests from members
of the EDK II Maintainers team. Pull Requests from anyone else
are rejected.
* Run pre-commit checks using Azure Pipelines
The maintainer manually trig pre-commit check or auto trig pre-commit?
By default, pre-commit should be auto trigged based on pull request.

* If all pre-commit checks pass, then the patch series is auto
committed. The result of this commit must match the contents
and commit history that would have occurred using the previous
push operation.
* If any pre-commit checks fail, then notify the submitter.
Will Pre-commit check fail send the mail to the patch submitter?
The patch submitter need update the patch and go through review process again.

A typical reason for a failure would be a merge conflict with
another pull request that was just processed.
* Limit pre-commit checks execution time to 10 minutes.
* Provide on-demand builds to EDK II Maintainers that to allow
EDK II Maintainers to submit a branch through for the same
set of pre-commit checks without submitting a pull request.

## Pre-Commit Checks in Phase 1
* Run and pass PatchCheck.py with no errors

=====================================================

The following are some additional pre-commit check ideas
that could be quickly added once the initial version using
PatchCheck.py is fully functional. Please provide feedback
on the ones you like and additional ones you think may
improve the quality of the commits to the edk2 repository.

## Proposed Pre-Commit Checks in Phase 2
* Verify Reviewed-by and Acked-by tags are present with
correct maintainer email addresses
* Verify no non-ASCII characters in modified files
* Verify no binary files in set of modified files
Now, Edk2 has few binary files, like logo.bmp.
I see one BZ to request logo bmp update.
(BZ https://bugzilla.tianocore.org/show_bug.cgi?id=2050)
So, we need the exception way for it.

* Verify package dependency rules in modified files

## Proposed Pre-Commit Checks in Phase 3
* Run ECC on modified files
* Verify modified modules/libs build
* Run available host based tests (HBFA) against modified
modules/libs
I request boot test on Emulator and Ovmf in the daily and weekly scope.
Daily can cover boot to Shell.
Weekly can cover more boot functionality.

Thanks
Liming
Best regards,

Mike







Re: [RFC] EDK II Continuous Integration Phase 1

Sean
 

Mike, as you mentioned we have been working towards enabling a practical and extensible CI for Edk2 using Azure dev ops and the recently added edk2-pytool infrastructure. We have been using similar CI for Project Mu for the last few years.

Our approach is a little different in that we focus on validating the whole code base rather than just the incoming patch. We do this because we have found unexpected consequences of patches and overall we want all code to be compliant not just new additions. We have found the time to test the whole tree is not much longer than only the parts impacted by a code change (except maybe when running the entire compile test on every package). This obviously comes with an initial tax of needing to get the codebase into compliant form. Anyway we have prepared an RFC in addition to yours and would like to see these two efforts merged together.

We are still working on making a few optimizations. Currently if the full set of tests are run we take about 20 minutes. This is because compiling MdeModulePkg for debug, release, and host based tests take a while. Most other packages are in the 10 minute range. We do have easy ways to disable or limit certain tests as well as expand the matrix to leverage more cloud resources (more parallel builds).


Content is best viewed online with links to helpful content but is also attached below:
https://github.com/spbrogan/edk2-staging/blob/edk2-stuart-ci-latest/Readme-CI-RFC.md

# CI and PR Gates

## Background

Historically, while the TianoCore maintainers and stewards have done a fantastic job of keeping contribution policies consistent and contributions clean and well-documented, there have been few processes that ran to verify the sanity, cleanliness, and efficacy of the codebase, and even fewer that publicly published their results for the community at large. This has caused inconsistancies and issues within the codebase from time to time.

Adding continuous integration (and potentially PR gates) to the checkin process ensures that simple errors like these are caught and can be fixed on a regular basis.

## Strategy

While a number of CI solutions exist, this proposal will focus on the usage of Azure Dev Ops and Build Pipelines. For demonstration, a sample [TianoCore repo](https://github.com/spbrogan/edk2-staging.git) (branch edk2-stuart-ci-latest) and [Dev Ops Pipeline](https://dev.azure.com/tianocore/edk2-ci-play/_build?definitionId=12) have been set up.

Furthermore, this proposal will leverage the TianoCore python tools PIP modules: [library](https://pypi.org/project/edk2-pytool-library/) and [extensions](https://pypi.org/project/edk2-pytool-extensions/) (with repos located [here](https://github.com/tianocore/edk2-pytool-library) and [here](https://github.com/tianocore/edk2-pytool-extensions)).

The primary execution flows can be found in the `azure-pipelines-pr-gate.yml` and `azure-pipelines-pr-gate-linux.yml` files. These YAML files are consumed by the Azure Dev Ops Build Pipeline and dictate what server resources should be used, how they should be configured, and what processes should be run on them. An overview of this schema can be found [here](https://docs.microsoft.com/en-us/azure/devops/pipelines/yaml-schema?view=azure-devops&tabs=schema).

Inspection of these files reveals the EDKII Tools commands that make up the primary processes for the CI build: 'stuart_setup', 'stuart_update', and 'stuart_ci_build'. These commands come from the EDKII Tools PIP modules and are configured as described below. More documentation on the stuart tools can be found [here](https://github.com/tianocore/edk2-pytool-extensions/blob/master/docs/using.md) and [here](https://github.com/tianocore/edk2-pytool-extensions/blob/master/docs/features/feature_invocables.md).

## Configuration

Configuration of the CI process consists of (in order of precedence):
* command-line arguments passed in via the Pipeline YAML
* a per-package configuration file (e.g. `<package-name>.mu.yaml`) that is detected by the CI system in EDKII Tools.
* a global configuration Python module (e.g. `CISetting.py`) passed in via the command-line

The global configuration file is described in [this readme](https://github.com/tianocore/edk2-pytool-extensions/blob/master/docs/usability/using_settings_manager.md) from the EDKII Tools documentation. This configuration is written as a Python module so that decisions can be made dynamically based on command line parameters and codebase state.

The per-package configuration file can override most settings in the global configuration file, but is not dynamic. This file can be used to skip or customize tests that may be incompatible with a specific package. By default, the global configuration will try to run all tests on all packages.

## CI Test Types

All CI tests are instances of EDKII Tools plugins. Documentation on the plugin system can be found [here](https://github.com/tianocore/edk2-pytool-extensions/blob/master/docs/usability/using_plugin_manager.md) and [here](https://github.com/tianocore/edk2-pytool-extensions/blob/master/docs/features/feature_plugin_manager.md). Upon invocation, each plugin will be passed the path to the current package under test and a dictionary containing its targeted configuration, as assembled from the command line, per-package configuration, and global configuration.

Note: CI plugins are considered unique from build plugins and helper plugins, even though some CI plugins may execute steps of a build.

In the example, these plugins live alongside the code under test (in the `BaseTools` directory), but may be moved to the 'edk2-test' repo if that location makes more sense for the community.

### Module Inclusion Test - DscCompleteCheck

This test scans all available modules (via INF files) and compares them to the package-level DSC file for the package each module is contained within. The test considers it an error if any module does not appear in the `Components` section of at least one package-level DSC (indicating that it would not be built if the package were built).

### Code Compilation Test - CompilerPlugin

Once the Module Inclusion Test has verified that all modules would be built if all package-level DSCs were built, the Code Compilation Test simply runs through and builds every package-level DSC on every toolchain and for every architecture that is supported. Any module that fails to build is considered an error.

### Host-Based UnitTests - HostUnitTestCompilerPlugin and HostUnitTestDscCompleteCheck

The [Testing RFC doc](Readme-Testing-RFC.md) has much more detail on this, but the basic idea is that host-based unit tests can be compiled against individual modules and libraries and run on the build agent (in this case, the Dev Ops build server). The successful and failing test case results are collected and included in the final build report.

### GUID Uniqueness Test - GuidCheck

This test works on the collection of all packages rather than an individual package. It looks at all FILE_GUIDs and GUIDs declared in DEC files and ensures that they are unique for the codebase. This prevents, for example, accidental duplication of GUIDs when using an existing INF as a template for a new module.

### Cross-Package Dependency Test - DependencyCheck

This test compares the list of all packages used in INFs files for a given package against a list of "allowed dependencies" in plugin configuration for that package. Any module that depends on a disallowed package will cause a test failure.

### Library Declaration Test - LibraryClassCheck

This test looks at all library header files found in a package's `Include/Library` directory and ensures that all files have a matching LibraryClass declaration in the DEC file for the package. Any missing declarations will cause a failure.

### Invalid Character Test - CharEncodingCheck

This test scans all files in a package to make sure that there are no invalid Unicode characters that may cause build errors in some character sets/localizations.

## Next Steps

* Receive community feedback on RFC.
* Determine where this phase makes sense given existing RFCs from other TianoCore contributors.
* Optimize testing beharior.
* Only run a subset of tests on PRs or individual commits.
* Run full testing either once per day or once every several commits.
* Add more tests/capabilities.
* Continue to improve results formatting.
* Continue to improve CI documentation.
* Much of this documentation effort is pending community feedback on which parts are needed and what phases are priorities.

Thanks

-----Original Message-----
From: rfc@edk2.groups.io <rfc@edk2.groups.io> On Behalf Of Michael D Kinney via Groups.Io
Sent: Thursday, August 29, 2019 1:23 PM
To: devel@edk2.groups.io; rfc@edk2.groups.io
Subject: [edk2-rfc] [RFC] EDK II Continuous Integration Phase 1

Hello,

This is a proposal for a first step towards continuous integration for all TianoCore repositories to help improve to quality of commits and automate testing and release processes for all EDK II packages and platforms.

This is based on work from a number of EDK II community members that have provide valuable input and evaluations.

* Rebecca Cran <mailto:rebecca@bsdio.com> Jenkins evaluation
* Laszlo Ersek <mailto:lersek@redhat.com> GitLab evaluation
* Philippe Mathieu-Daudé <mailto:philmd@redhat.com> GitLab evaluation
* Sean Brogan <mailto:sean.brogan@microsoft.com> Azure Pipelines and HBFA
* Bret Barkelew <mailto:Bret.Barkelew@microsoft.com> Azure Pipelines and HBFA
* Jiewen Yao <mailto:jiewen.yao@intel.com> HBFA

The following link is a link to an EDK II WIKI page that contains a summary of the work to date. Please provide feedback in the EDK II mailing lists. The WIKI pages will be updated with input from the entire EDK II community.

https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Ftianocore%2Ftianocore.github.io%2Fwiki%2FEDK-II-Continuous-Integration&;data=02%7C01%7Csean.brogan%40microsoft.com%7C6f67f169a6c746b4288608d72cbea7b6%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637027069686644659&amp;sdata=GR9wN6gP3mJlCTopAaQ2rlzhby1nuF%2BwDVsfFIQAZjA%3D&amp;reserved=0

Proposal
========
Phase 1 of adding continuous integration is limited to the
edk2 repository. Additional repositories will be added later.

The following changes are proposed:
* Remove EDK II Maintainers write access to edk2 repository.
Only EDK II Administrators will continue to have write
access, and that should only be used to handle extraordinary
events.
* EDK II Maintainers use a GitHub Pull Request instead of push
to commit a patch series to the edk2 repository. There are
no other changes to the development and review process. The
patch series is prepared in an EDK II maintainer branch with
all commit message requirements met on each patch in the series.
* The edk2 repository only accepts Pull Requests from members
of the EDK II Maintainers team. Pull Requests from anyone else
are rejected.
* Run pre-commit checks using Azure Pipelines
* If all pre-commit checks pass, then the patch series is auto
committed. The result of this commit must match the contents
and commit history that would have occurred using the previous
push operation.
* If any pre-commit checks fail, then notify the submitter.
A typical reason for a failure would be a merge conflict with
another pull request that was just processed.
* Limit pre-commit checks execution time to 10 minutes.
* Provide on-demand builds to EDK II Maintainers that to allow
EDK II Maintainers to submit a branch through for the same
set of pre-commit checks without submitting a pull request.

## Pre-Commit Checks in Phase 1
* Run and pass PatchCheck.py with no errors

=====================================================

The following are some additional pre-commit check ideas that could be quickly added once the initial version using PatchCheck.py is fully functional. Please provide feedback on the ones you like and additional ones you think may improve the quality of the commits to the edk2 repository.

## Proposed Pre-Commit Checks in Phase 2
* Verify Reviewed-by and Acked-by tags are present with
correct maintainer email addresses
* Verify no non-ASCII characters in modified files
* Verify no binary files in set of modified files
* Verify package dependency rules in modified files

## Proposed Pre-Commit Checks in Phase 3
* Run ECC on modified files
* Verify modified modules/libs build
* Run available host based tests (HBFA) against modified
modules/libs

Best regards,

Mike


Re: [edk2-devel] [RFC] EDK II Continuous Integration Phase 1

Michael Zimmermann <sigmaepsilon92@...>
 

Hi Michael,

would it make sense to run SCT (using UnixHost and/or qemu) to verify
the high level logic or do you think that would be too much to do for
each PR?

Also, do we want to run all these checks for each commit in the PR to
verify that they pass at each point in the git history?

Thanks
Michael

On Thu, Aug 29, 2019 at 10:22 PM Michael D Kinney
<michael.d.kinney@intel.com> wrote:

Hello,

This is a proposal for a first step towards continuous
integration for all TianoCore repositories to help
improve to quality of commits and automate testing and
release processes for all EDK II packages and platforms.

This is based on work from a number of EDK II community
members that have provide valuable input and evaluations.

* Rebecca Cran <rebecca@bsdio.com> Jenkins evaluation
* Laszlo Ersek <lersek@redhat.com> GitLab evaluation
* Philippe Mathieu-Daudé <philmd@redhat.com> GitLab evaluation
* Sean Brogan <sean.brogan@microsoft.com> Azure Pipelines and HBFA
* Bret Barkelew <Bret.Barkelew@microsoft.com> Azure Pipelines and HBFA
* Jiewen Yao <jiewen.yao@intel.com> HBFA

The following link is a link to an EDK II WIKI page that
contains a summary of the work to date. Please provide
feedback in the EDK II mailing lists. The WIKI pages will
be updated with input from the entire EDK II community.

https://github.com/tianocore/tianocore.github.io/wiki/EDK-II-Continuous-Integration

Proposal
========
Phase 1 of adding continuous integration is limited to the
edk2 repository. Additional repositories will be added later.

The following changes are proposed:
* Remove EDK II Maintainers write access to edk2 repository.
Only EDK II Administrators will continue to have write
access, and that should only be used to handle extraordinary
events.
* EDK II Maintainers use a GitHub Pull Request instead of push
to commit a patch series to the edk2 repository. There are
no other changes to the development and review process. The
patch series is prepared in an EDK II maintainer branch with
all commit message requirements met on each patch in the series.
* The edk2 repository only accepts Pull Requests from members
of the EDK II Maintainers team. Pull Requests from anyone else
are rejected.
* Run pre-commit checks using Azure Pipelines
* If all pre-commit checks pass, then the patch series is auto
committed. The result of this commit must match the contents
and commit history that would have occurred using the previous
push operation.
* If any pre-commit checks fail, then notify the submitter.
A typical reason for a failure would be a merge conflict with
another pull request that was just processed.
* Limit pre-commit checks execution time to 10 minutes.
* Provide on-demand builds to EDK II Maintainers that to allow
EDK II Maintainers to submit a branch through for the same
set of pre-commit checks without submitting a pull request.

## Pre-Commit Checks in Phase 1
* Run and pass PatchCheck.py with no errors

=====================================================

The following are some additional pre-commit check ideas
that could be quickly added once the initial version using
PatchCheck.py is fully functional. Please provide feedback
on the ones you like and additional ones you think may
improve the quality of the commits to the edk2 repository.

## Proposed Pre-Commit Checks in Phase 2
* Verify Reviewed-by and Acked-by tags are present with
correct maintainer email addresses
* Verify no non-ASCII characters in modified files
* Verify no binary files in set of modified files
* Verify package dependency rules in modified files

## Proposed Pre-Commit Checks in Phase 3
* Run ECC on modified files
* Verify modified modules/libs build
* Run available host based tests (HBFA) against modified
modules/libs

Best regards,

Mike








Re: [edk2-devel] [RFC] EDK II Continuous Integration Phase 1

Michael D Kinney
 

Hi Michael,

SCTs are in scope. It is only deciding when they get run
and how much pre-commit execution time developers are
willing to wait for a pass/fail result.

The on-demand testing feature is one place we can add extra
testing to help improve the quality of commits. The
Daily and Weekly scoped tests are another option.

Mike

-----Original Message-----
From: Michael Zimmermann <sigmaepsilon92@gmail.com>
Sent: Thursday, August 29, 2019 1:40 PM
To: devel@edk2.groups.io; Kinney, Michael D
<michael.d.kinney@intel.com>
Cc: rfc@edk2.groups.io
Subject: Re: [edk2-devel] [RFC] EDK II Continuous
Integration Phase 1

Hi Michael,

would it make sense to run SCT (using UnixHost and/or
qemu) to verify the high level logic or do you think
that would be too much to do for each PR?

Also, do we want to run all these checks for each
commit in the PR to verify that they pass at each point
in the git history?

Thanks
Michael

On Thu, Aug 29, 2019 at 10:22 PM Michael D Kinney
<michael.d.kinney@intel.com> wrote:

Hello,

This is a proposal for a first step towards
continuous integration for
all TianoCore repositories to help improve to quality
of commits and
automate testing and release processes for all EDK II
packages and
platforms.

This is based on work from a number of EDK II
community members that
have provide valuable input and evaluations.

* Rebecca Cran <rebecca@bsdio.com> Jenkins evaluation
* Laszlo Ersek <lersek@redhat.com> GitLab evaluation
* Philippe Mathieu-Daudé <philmd@redhat.com> GitLab
evaluation
* Sean Brogan <sean.brogan@microsoft.com> Azure
Pipelines and HBFA
* Bret Barkelew <Bret.Barkelew@microsoft.com> Azure
Pipelines and HBFA
* Jiewen Yao <jiewen.yao@intel.com> HBFA

The following link is a link to an EDK II WIKI page
that contains a
summary of the work to date. Please provide feedback
in the EDK II
mailing lists. The WIKI pages will be updated with
input from the
entire EDK II community.


https://github.com/tianocore/tianocore.github.io/wiki/E
DK-II-Continuou
s-Integration

Proposal
========
Phase 1 of adding continuous integration is limited
to the
edk2 repository. Additional repositories will be
added later.

The following changes are proposed:
* Remove EDK II Maintainers write access to edk2
repository.
Only EDK II Administrators will continue to have
write
access, and that should only be used to handle
extraordinary
events.
* EDK II Maintainers use a GitHub Pull Request
instead of push
to commit a patch series to the edk2 repository.
There are
no other changes to the development and review
process. The
patch series is prepared in an EDK II maintainer
branch with
all commit message requirements met on each patch
in the series.
* The edk2 repository only accepts Pull Requests from
members
of the EDK II Maintainers team. Pull Requests from
anyone else
are rejected.
* Run pre-commit checks using Azure Pipelines
* If all pre-commit checks pass, then the patch
series is auto
committed. The result of this commit must match
the contents
and commit history that would have occurred using
the previous
push operation.
* If any pre-commit checks fail, then notify the
submitter.
A typical reason for a failure would be a merge
conflict with
another pull request that was just processed.
* Limit pre-commit checks execution time to 10
minutes.
* Provide on-demand builds to EDK II Maintainers that
to allow
EDK II Maintainers to submit a branch through for
the same
set of pre-commit checks without submitting a pull
request.

## Pre-Commit Checks in Phase 1
* Run and pass PatchCheck.py with no errors

=====================================================

The following are some additional pre-commit check
ideas that could be
quickly added once the initial version using
PatchCheck.py is fully
functional. Please provide feedback on the ones you
like and
additional ones you think may improve the quality of
the commits to
the edk2 repository.

## Proposed Pre-Commit Checks in Phase 2
* Verify Reviewed-by and Acked-by tags are present
with
correct maintainer email addresses
* Verify no non-ASCII characters in modified files
* Verify no binary files in set of modified files
* Verify package dependency rules in modified files

## Proposed Pre-Commit Checks in Phase 3
* Run ECC on modified files
* Verify modified modules/libs build
* Run available host based tests (HBFA) against
modified
modules/libs

Best regards,

Mike







641 - 660 of 753