Re: UEFI accessibility mandate


Ethin Probst
 

The patent for MP3 expired a year or so ago, if memory serves. (I
don't know exactly when it expired, but it definitely has, so IP
shouldn't be a concern, I don't think.) An OS setup utility
would be nice if it weren't for the fact that it would prevent you
from modifying things like Secure Boot. Why can't we just either (1)
embed the audio files in NVRAM or (2) embed a speech synthesizer
directly into the UEFI system (the synthesizer would not need to be
standardized, just easily embeddable) and then send PCM samples
directly to the HDA device using that synthesizer? eSpeak is one such
synthesizer; I don't know how we'd go about modifying it, but given
enough time we could probably make it fit in the UEFI environment.

On 9/24/19, Andrew Fish <afish@...> wrote:


On Sep 24, 2019, at 5:06 AM, Rafael Machado
<rafaelrodrigues.machado@...> wrote:

Hi Everyone

Answering Ethin:
If other platforms are PCI-based (i.e. allow us to scan the PCI bus
and figure out where in MMIO space the HDA controller is mapped),
then it should theoretically work. I don't know for sure, though; I'm
not very knowledgeable about other CPU architectures.
Answer: Neither do I. Let's wait for some ARM expert to give an opinion :)
Rafael,

I'm not exactly the ARM expert, but what we have seen is that as long as an EFI
PCI driver follows the EFI rules for DMA, it should work just fine on all CPU
architectures. The historical problem is that if you don't follow the rules,
DMA still works on x86, since memory coherency is maintained by hardware there.
Since coherency must be maintained by software on ARM, you have to do it right.


Answering Andrew
Did you do much research into CODECs? Like which one(s) should be
supported? I assume the CODEC implies the audio file formats that can
be decoded? Also how large are the audio files?
Answer: During my research I studied some codec specs, and the way
codecs work is really similar. Normally they need to receive some
packets to initialize the nodes (node is the name of each internal
component of the codec), by receiving some commands (named verbs) that are
defined in the HDA spec. For example, the verb "Power State" (0xF05) is
used to set the power state of each node on the codec, to really enable
the chip or put it to sleep.
The verbs can be found in this document, on page 216:
https://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/high-definition-audio-specification.pdf
Not all verbs are standardized; for the verb "Amplifier Gain Mute"
(0xB--), for example, each vendor can decide the command. In the case of
the ASUS system I used, the codec is a Conexant CX20752
(https://www.datasheets360.com/pdf/7550682360275196829), and in the
datasheet (page 33) we can see that the "Amplifier Gain Mute" verbs are
0xB00 and 0xB20 (right and left channel in the case of a stereo stream).
These vendor-defined verbs create problems for designing a generic
solution.
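To make the verb discussion concrete, here is a small sketch of how a codec command word is packed per the HDA spec layout (codec address in bits 31:28, node ID in bits 27:20, a 12-bit verb in bits 19:8, an 8-bit payload in bits 7:0). The helper name is mine, not from any spec or from Rafael's code:

```c
#include <stdint.h>

/* Pack a short-form codec command word as laid out in the HDA spec:
 * codec address (CAd) in bits 31:28, node ID (NID) in bits 27:20,
 * 12-bit verb in bits 19:8, 8-bit payload in bits 7:0.
 * Helper name is hypothetical. */
static uint32_t hda_make_command(uint8_t cad, uint8_t nid,
                                 uint16_t verb, uint8_t payload)
{
    return ((uint32_t)(cad & 0xF) << 28) |
           ((uint32_t)nid << 20) |
           ((uint32_t)(verb & 0xFFF) << 8) |
           payload;
}
```

For example, hda_make_command(0, 1, 0xF05, 0) builds the "Power State" verb for node 1 of codec 0; the vendor-defined 0xB00/0xB20 gain-mute verbs from the CX20752 datasheet would be packed the same way.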
About the second part of the question: the codecs are independent, so they
are only responsible for processing signals passed to them in a stream
using a DMA buffer, or commands that change their behavior (increase /
decrease volume, for example). So for the codec to correctly process the
buffers managed by the audio controller, both need to be configured with
the same stream format, something like 2-channel 8-bit depth for example.
In the second video I sent previously
(https://www.youtube.com/watch?v=6ED2BSc89-Y&feature=youtu.be), the
problem was that the controller was configured to manage a stream with 2
channels and 8-bit depth, but the codec was processing the data as
2-channel 16-bit depth.
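The "same stream format on both sides" requirement can be sketched in code: the controller's stream descriptor and the codec's converter share one 16-bit format word (channels minus one in bits 3:0, a bits-per-sample code in bits 6:4, sample-rate fields above that). A minimal sketch, assuming 48 kHz and a helper name of my own:

```c
#include <stdint.h>

/* Bits-per-sample codes from the HDA format word. */
enum { HDA_BITS_8 = 0, HDA_BITS_16 = 1, HDA_BITS_24 = 3, HDA_BITS_32 = 4 };

/* Build a format word for a 48 kHz stream (base-rate bit clear,
 * multiplier and divisor fields zero). Hypothetical helper: the same
 * value must reach both the controller's stream descriptor and the
 * codec converter, or you get the kind of 8-bit vs 16-bit mismatch
 * heard in the second video. */
static uint16_t hda_stream_format_48k(uint8_t channels, uint8_t bits_code)
{
    return (uint16_t)(((bits_code & 0x7) << 4) | ((channels - 1) & 0xF));
}
```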
About the audio file sizes: in my work I didn't use any compressed format,
because I was running out of time to finish my project, so I was
processing raw audio data, which is much bigger than other formats like
MP3. In case we decide to use audio files at this first stage, maybe we
will need to port some audio format library to UEFI as well.
Thanks, I tracked down a friend of mine who knows audio, so I think I grok it
better now. It seems HDA is a hardware standard, and most of the work in an
HDA driver is doing the setup to decode the data.

As far as a file format, we might be able to use PCM, or as we know it, a WAV
file. It might also be possible to use MP3, but there have been IP issues in
the past with encoding MP3 and I'm not sure how that will work out. The only
issue with WAV files is size, but we may be able to compress the WAV files
with a decompressor that already lives in the ROM. Saving file space is
probably as simple as wrapping the file with a header that describes the
file type and compression scheme.
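The wrapper-header idea might look something like the following; the struct layout and every field name here are purely illustrative, not an existing format:

```c
#include <stdint.h>

/* Hypothetical container header prepended to each stored audio blob,
 * so firmware knows how to unpack it before feeding PCM to the codec.
 * All fields are made up for illustration. */
typedef struct {
    uint32_t magic;        /* identifies the wrapper format */
    uint8_t  file_type;    /* e.g. 0 = raw PCM/WAV, 1 = MP3 */
    uint8_t  packing;      /* e.g. 0 = none, 1 = the ROM's decompressor */
    uint16_t reserved;
    uint32_t payload_size; /* stored payload size in bytes */
} audio_blob_header;
```

A firmware consumer would read the header, dispatch to the matching decoder or decompressor, and hand the resulting PCM to the HDA stream.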

Is the CODEC protocol more of a plug-in for the Intel HDA? By that I
mean it only works on that hardware, or does it work generically on top
of any EFI_AUDIO_OUTPUT_PROTOCOL?
Answer: My understanding is that the CODEC protocol will need the audio
controller to be already configured, because communication with the
codec is done using addresses related to the audio controller. So access
to the audio codec is managed by the audio controller. We need to write
the commands to the address that was previously allocated and set as the
CORB/RIRB (Command Output Ring Buffer / Response Input Ring Buffer) in
the HDA controller registers (page 36 of the HDA spec).
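As a rough software model of that CORB handshake (an array stands in for the DMA buffer; a real driver does MMIO writes to the controller registers instead, and reads the codec's answer back from the RIRB):

```c
#include <stdint.h>

#define CORB_ENTRIES 256 /* maximum CORB size in the HDA spec */

/* Simplified model of the command path: the driver places a verb one
 * slot past CORBWP, then advances CORBWP so the controller's DMA
 * engine fetches it and forwards it to the codec. Illustrative only. */
typedef struct {
    uint32_t corb[CORB_ENTRIES]; /* DMA-visible memory on real hardware */
    uint8_t  corbwp;             /* mirrors the CORBWP register */
} corb_model;

static void corb_submit(corb_model *c, uint32_t verb)
{
    uint8_t next = (uint8_t)(c->corbwp + 1); /* wraps at 256 entries */
    c->corb[next] = verb;
    c->corbwp = next; /* real driver: MMIO write to the CORBWP register */
}
```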

I was starting to think about how to store the audio and deal with
dynamic configuration. I guess one crazy idea could be to have the OS
create the audio files using text-to-speech, and maybe store them on
the EFI System Partition. For example, when an OS installs, it is going
to write the EFI Boot Variable that contains a description of the boot
option. Maybe we could convert that to audio and add an NVRAM variable
that points to the given audio file?
Answer: This is one option. One question: how would the OS know the BIOS
options and menus so it could create these files?
The EFI Boot options are NVRAM based, so we could use NVRAM for that. For
the HII (Setup) it may be possible to pass the data up to the OS and then
have it encoded. We might be able to make it visible to EFI by writing to
the EFI System Partition on disk, with an NVRAM variable for any extra
information we needed. I seem to remember we talked about passing data up to
the OS at one point so we could have an OS-based setup utility; I don't
remember the details. Mike Rothman wrote the HII part of the spec, so I can
hit him up for ideas when we try to figure this out.

Thanks,

Andrew Fish

Thanks and Regards
Rafael


On Mon, Sep 23, 2019 at 12:11 PM, Andrew Fish <afish@...> wrote:
Rafael,

Did you do much research into CODECs? Like which one(s) should be
supported? I assume the CODEC implies the audio file formats that can be
decoded? Also how large are the audio files?

Is the CODEC protocol more of a plug-in for the Intel HDA? By that I mean
it only works on that hardware, or does it work generically on top of any
EFI_AUDIO_OUTPUT_PROTOCOL?

I was starting to think about how to store to audio and deal with dynamic
configuration. I guess one crazy idea could be to have the OS create the
audio files using text to speech, and maybe store them on the EFI System
Partition. For example when an OS installs it is going to write the EFI
Boot Variable that contains a Description of the boot option. Maybe we
could convert that to audio and add a nvram variable that points to the
given audio file?

Thanks,

Andrew Fish

On Sep 23, 2019, at 6:20 AM, Rafael Machado
<rafaelrodrigues.machado@...
<mailto:rafaelrodrigues.machado@...>> wrote:

Hi everyone.
So, based on everything that was mentioned here:

The idea is to propose the creation of two protocols:

- EFI_AUDIO_OUTPUT_PROTOCOL: This protocol should be responsible for
initializing the audio controller, which in the case of my MSc work means
initialization of the ring buffers that are used as containers for the
audio streams that need to be processed.

- EFI_AUDIO_CODEC_PROTOCOL: This protocol should be responsible for
initializing the codec (each codec may need different init commands), and
is also used to control things like mute, volume level and that kind of
thing. (You can see the volume control actions in the last video I
mentioned in the previous e-mail.)
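A minimal C sketch of what these two interfaces might look like, to anchor the discussion. The member names, fields and signatures below are my guesses for discussion purposes only, not from any spec or existing code (EFI_STATUS is modeled as a plain integer to keep the sketch self-contained):

```c
#include <stdint.h>

/* Stand-in for the real EFI_STATUS type in this self-contained sketch. */
typedef uint64_t EFI_STATUS;

typedef struct _EFI_AUDIO_OUTPUT_PROTOCOL EFI_AUDIO_OUTPUT_PROTOCOL;
typedef struct _EFI_AUDIO_CODEC_PROTOCOL  EFI_AUDIO_CODEC_PROTOCOL;

/* Controller side: owns CORB/RIRB and the stream DMA buffers. */
struct _EFI_AUDIO_OUTPUT_PROTOCOL {
    EFI_STATUS (*Init)(EFI_AUDIO_OUTPUT_PROTOCOL *This);
    EFI_STATUS (*Write)(EFI_AUDIO_OUTPUT_PROTOCOL *This,
                        const void *Samples, uint64_t Size);
};

/* Codec side: vendor-specific init verbs plus runtime controls. */
struct _EFI_AUDIO_CODEC_PROTOCOL {
    EFI_STATUS (*Init)(EFI_AUDIO_CODEC_PROTOCOL *This);
    EFI_STATUS (*SetVolume)(EFI_AUDIO_CODEC_PROTOCOL *This, uint8_t Percent);
    EFI_STATUS (*SetMute)(EFI_AUDIO_CODEC_PROTOCOL *This, int Mute);
};
```

Splitting the interfaces this way matches the proposal: the output protocol hides the controller-specific DMA setup, while codec drivers with vendor-specific verb tables implement the codec protocol on top of it.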

Does this approach work on non-x86 platforms? (I don't have knowledge of
ARM platforms, so feedback from the community will be well received.)

Hope to hear some voices :)

Thanks and Regards
Rafael

On Sat, Sep 21, 2019 at 09:36 AM, Rafael Machado
<rafaelrodrigues.machado@...> wrote:
Hi Everyone
Sorry for the delay in responding; too many things happening at the same
time.
I will try to answer e-mails to this thread every Saturday or Sunday
morning at least.
About Andrew's and Laszlo's comments and questions

Please let us know what you find out. I probably don't have the time
to help implement this feature, but I am happy to help work on the
architecture and design for UEFI accessibility on the edk2 mailing
lists, and I can also represent whatever we come up with at the
UEFI Spec Work Group.
During my MSc I had to study the audio and BIOS architectures in depth. The
idea was to eliminate the first barrier to the creation of a screen
reader for the pre-OS environment: the lack of an open
implementation of audio control and actions in UEFI. To do that I studied
the Intel High Definition Audio spec and a lot of UEFI specs to better
understand how to do it.
The initial target was to do all this development on OVMF, but as far as
I could tell, the sound card is not available to OVMF. As Laszlo mentioned
in this e-mail, there are some projects that may help with this, but at the
time I was working on my MSc I didn't find them, so I did everything on a
real system (an ASUS notebook).
It took me 2 years of work, because I didn't know a lot of things, and
working on an MSc degree while having a 40-hour/week job and being
a father and a husband is not an easy task, but it got to what I was
expecting.
The evolution of the project was this:
1 - First tests using some registers named "Immediate Registers", which
I later noticed are not mandatory. This is a simple C major scale:
https://www.youtube.com/watch?v=I-mgzcOnRCg&feature=youtu.be
2 - Some months later I started to work with the ring buffers and DMA
memory access. For those who have good ears, it's possible to hear
some music behind the noise.
https://www.youtube.com/watch?v=6ED2BSc89-Y&feature=youtu.be
3 - Later, when I was almost giving up, I noticed that the problem was
that one of my write operations was causing a mismatch between
the audio controller and the audio codec. The controller was sending
packets with 16-bit depth, but the codec was processing them as 8-bit
depth.
https://www.youtube.com/watch?v=2De9dI9WbwM&feature=youtu.be

So the conclusion is that doing this in UEFI is much easier than doing it
at the OS level.
The reference code, which is just a proof-of-concept and has several
things to be improved, can be found here:
https://github.com/RafaelRMachado/Msc_UefiHda_PreOs_Accessibility

Currently it is just a UEFI application, but we can convert it to UEFI
drivers after some discussion. Everything is released under BSD, so
companies can use it without IP problems.
Just to give some more information about the need for this kind of
solution: there are a lot of blind people who work with hardware support,
so formatting disks, configuring RAID and booting dual-boot systems is
always a challenge to them. Even setting the BIOS clock. How to do that
without the system's feedback?

It would be hard to have a UEFI mandate for accessibility, given there
is no guideline on how a User Interface (UI) works. If accessibility
requires some form of hardware abstraction, like audio, then we could
likely get that into the UEFI Spec. What might be possible is an
EDK2 reference implementation of accessibility. Maybe we could use the
reference implementation to write a UEFI white paper on design for
accessibility? If there is an open source implementation, and an
official design guide, this would make it much easier for advocacy
groups to lobby for this feature.
I agree this is the way. Writing a white paper as an official EDK2 paper
has been one of my targets since the beginning of my MSc almost 5 years ago.

I've got some experience with accessibility as the macOS EFI OS Loader
has a UI for the Full Disk Encryption password. If you press the power
button quickly 3 times at the disk encryption password prompt,
accessibility is enabled and Voice Over gets turned on. You then get
localized voice prompts when you move between UI elements. Since this
is the OS loader, all the resources are stored on the disk. You
quickly run into a range of challenges since audio is hard,
abstracting audio is hard (what codec does firmware have to support),
audio files are not small and firmware is size constrained, the need
to localize the audio responses causes even more size issues, and the
boot options are usually written by an OS installer so how would
firmware know what to call them?
The solution to this would be porting some voice synthesizer, so
no audio files would need to be stored. There are some open-source
implementations that are not GPL.
This was described in my MSc as future work that can continue what I
have started.

I listed a lot of reasons it is hard, but as Kennedy stated in his "We
choose to go to the Moon!" speech, sometimes we choose to do things
"not because they are easy, but because they are hard; because that
goal will serve to organize and measure the best of our energies and
skills, because that challenge is one that we are willing to accept".
If we have a design, that means we can break the problem up into
smaller parts, and maybe we can find people that have expertise in
a given part to build a chunk at a time. If we could implement the
prototype in OVMF that would show how it works, but run on everyone's
machines, so that would be really helpful for demos and design
review.
I totally agree. Amazing words that I hadn't heard before. Thanks!
As far as I can understand, and with Leif's help, some possible future
steps could be (not in this specific order):
- 1) Convert proof-of-concept HDA driver to UEFI driver model with
proper PCI discovery.
- 2) Design a prototype EFI_AUDIO_OUTPUT_PROTOCOL, rework driver to
produce this and application to discover and consume it.
- 3) Implement a USB Audio Class driver also producing
EFI_AUDIO_OUTPUT_PROTOCOL and ensure test application remains
functional.
- 4) Separate controller and codec code by creating an
EFI_AUDIO_CODEC_PROTOCOL, implement this in HDA driver, and separate out
the codec support into individual drivers.
- 5) Prototype audio output additions to HII. (using pre-recorded
audio files)
- 6) Port some voice synthesizer to UEFI. (eliminating the need
for audio files)

Beyond this, there are other things we should look at adding, like
- EFI_AUDIO_INPUT_PROTOCOL.
- Audio input additions to HII.

It's a lot of work, but I accept the challenge.
It may take a long time, but it is possible.

I am still trying to find some time to finish translating my
thesis to English.
I wrote everything in Portuguese because there was no material about
UEFI for the Brazilian audience, and another target I have is to show
companies that we have people who can work on this kind of project in
Brazil, bringing this kind of development to South America. (Yes, I have
complicated targets, but I like the challenge :) )

Thanks and Regards
Rafael R. Machado

On Thu, Sep 19, 2019 at 2:45 PM, Laszlo Ersek <lersek@...> wrote:
On 09/18/19 19:57, Andrew Fish wrote:
Rafael,

Please let us know what you find out. I probably don't have the time
to help implement this feature, but I am happy to help work on the
architecture and design for UEFI accessibility on the edk2 mailing
lists, and I can also represent whatever we come up with at the UEFI
Spec Work Group.

It would be hard to have a UEFI mandate for accessibility, given
there is no guideline on how a User Interface (UI) works. If
accessibility requires some form of hardware abstraction, like audio,
then we could likely get that into the UEFI Spec. What might be
possible is an EDK2 reference implementation of accessibility. Maybe
we could use the reference implementation to write a UEFI white paper
on design for accessibility? If there is an open source
implementation, and an official design guide, this would make it much
easier for advocacy groups to lobby for this feature.

I've got some experience with accessibility as the macOS EFI OS
Loader has a UI for the Full Disk Encryption password. If you press
the power button quickly 3 times at the disk encryption password
prompt accessibility is enabled and Voice Over gets turned on. You
then get localized voice prompts when you move between UI elements.
Since this is the OS loader, all the resources are stored on the disk.
You quickly run into a range of challenges since audio is hard,
abstracting audio is hard (what codec does firmware have to support),
audio files are not small and firmware is size constrained, the need
to localize the audio responses causes even more size issues, and the
boot options are usually written by an OS installer so how would
firmware know what to call them?

I listed a lot of reasons it is hard, but as Kennedy stated in his "We
choose to go to the Moon!" speech, sometimes we choose to do things
"not because they are easy, but because they are hard; because that
goal will serve to organize and measure the best of our energies and
skills, because that challenge is one that we are willing to accept".
If we have a design, that means we can break the problem up into
smaller parts, and maybe we can find people that have expertise in
a given part to build a chunk at a time. If we could implement the
prototype in OVMF that would show how it works, but run on everyone's
machines, so that would be really helpful for demos and design
review.
Somewhat related, in April there was a thread on virtio-dev that
suggests there is interest in a virtio-audio device model:

https://lists.oasis-open.org/archives/virtio-dev/201904/msg00049.html

It looks like the ACRN project already implements a (non-standard, as of
now) virtio-audio device:

https://lists.oasis-open.org/archives/virtio-dev/201907/msg00061.html

(This is all I can mention right now.)

Thanks
Laszlo
--
Signed,
Ethin D. Probst
