Re: UEFI accessibility mandate


Rafael Machado <rafaelrodrigues.machado@...>

Hi Ethin

Unfortunately I haven't had a chance to continue this work in the last few months.
A code sample can be found here:

https://github.com/RafaelRMachado/Msc_UefiHda_PreOs_Accessibility

On YouTube there are some videos of my MSc progress:

https://youtu.be/2De9dI9WbwM

And my work can be found here (pt):

http://repositorio.ufscar.br/handle/ufscar/10931

And here (en):
https://ieeexplore.ieee.org/abstract/document/8116573/

I hope to have a chance to keep this work moving in the coming months.

Rafael


On Mon, Jun 29, 2020 at 16:09, Ethin Probst <harlydavidsen@...> wrote:

Hello all,

Raising this from the (seeming) silence it's been in since the 30th of Oct. 2019. What is the status on this? Also, is there a bit of source code you guys can share with me for initializing the HDA controller via MMIO (assume the address is already known) that I can look at? I have two reasons for asking: (1) I'm writing a little hobby operating system and would love to get audio in it (but I still haven't really figured out HDA :)); and (2) I'd like to see about creating a UEFI application in Rust to see if I can embed a speech synthesizer I know of out there and see if it can talk. If I can get that working, I'll create a GitHub repo for it and we can see about turning it into an actual UEFI driver. Intercepting text on the screen will be a challenge that EDK2 will most likely need to undertake, as I have never contributed to EDK2 and find its source code layout quite confusing.

On 10/30/19, Rafael Machado <rafaelrodrigues.machado@...> wrote:
Hi Ethin

I think you joined the community after the code was sent to this discussion.
Attached is the latest code we received from Andrew. I'm not sure whether this is really the latest version, but it's a good starting point for your tests.

Thanks and Regards
Rafael R. Machado

On Tue, Oct 29, 2019 at 20:14, Ethin Probst <harlydavidsen@...> wrote:

This is incredible progress! I'd love to help where I can when you
release the src.

On 10/29/19, Andrew Fish <afish@...> wrote:
Ethin,

I've made progress using the EmulatorPkg. I've ported my test C command-line application into the EmulatorPkg Host (the OS C command-line application part of the EmulatorPkg), so the host now produces APIs that support text-to-voice PCM files and playing PCM (also called wave) files. There is an EmulatorPkg driver to produce text to PCM, and a driver to play PCM files. I also wrote a test application. At this point the test application in the emulator can use the two APIs to produce text to voice.

Next steps for me are to write up a Bugzilla, clean up the code, and make a github branch to share the work.

Next steps for the community... From the EmulatorPkg it should be possible to implement the UI work to add voice. It should also be possible to implement a real EFI voice-to-PCM driver, and to use the test application to test a real EFI audio driver.

Thanks,

Andrew Fish

On Oct 29, 2019, at 1:17 PM, Ethin Probst <harlydavidsen@...> wrote:

Has there been any progress on this? I wish I could help, but I have
no experience in developing for EDK2...

On 10/1/19, Rafael Machado <rafaelrodrigues.machado@...> wrote:
Incredible!
I will find some time to try it on Linux in the following weeks.
Thanks for the effort Andrew!

Rafael R. Machado

On Mon, Sep 30, 2019 at 22:18, Andrew Fish <afish@...> wrote:



On Sep 30, 2019, at 11:50 AM, Rafael Machado <rafaelrodrigues.machado@...> wrote:

Hi Andrew

As you have mentioned: "I might find some time to try to get the EmulatorPkg mocking working."
Answer: This would be amazing! Thanks for trying to help!


Rafael,

I made a little progress this weekend watching American football on TV, and I had some time on an airplane today to add some comments.

I wrote a simple C command-line program that does almost all the primitive audio functions we need to add audio/text-to-speech to the EmulatorPkg. Since ANSI C does not support audio, I had to system() out to macOS command-line tools to get some of the tasks complete. The only feature I'm missing is playing the PCM buffer asynchronously, and I may be able to add that next weekend. I guessed at the Linux commands; it would be good if someone could debug the program on a Linux system and make sure I picked the right command-line tools. I think for Windows there is going to be more coding involved; there is at least a C API to play audio, but I only saw a C++ API for text to speech. The good news is folks can use the simple C program to get the primitives working on Linux and Windows systems, and it should be easy to port that work to the future EmulatorPkg work.

What I've got working on macOS is:
1) Mute/Unmute
2) GetVolume/SetVolume
3) Text to PCM buffer (wave file format)
   a) Supports words-per-minute rate
   b) Supports localizing the voice based on the RFC 4646 language code used by EFI (the program speaks in en, zh, and fr as a test)
4) PCM buffer to audio

This is how I was building on my Mac: clang -g -Wall -Werror audio.c
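
To give a flavor of the approach, here is a trimmed-down sketch (not the actual audio.c; `say` and `afplay` are the macOS tools mentioned elsewhere in this thread, while the osascript volume call and the Linux espeak/aplay fallbacks are untested guesses):

/* Sketch: ANSI C has no audio support, so system() out to host tools. */
#include <stdio.h>
#include <stdlib.h>

/* Render Text to a wave file at Rate words per minute. */
static int TextToPcmFile (const char *Text, int Rate, const char *WavPath)
{
  char Cmd[1024];
#ifdef __APPLE__
  /* 'say -r' sets words per minute; the output format flags may need tweaking. */
  snprintf (Cmd, sizeof Cmd, "say -r %d -o '%s' --data-format=LEI16@22050 \"%s\"",
            Rate, WavPath, Text);
#else
  /* Untested guess at a Linux equivalent. */
  snprintf (Cmd, sizeof Cmd, "espeak -s %d -w '%s' \"%s\"", Rate, WavPath, Text);
#endif
  return system (Cmd);
}

/* Play a previously rendered PCM/wave file (synchronously). */
static int PlayPcmFile (const char *WavPath)
{
  char Cmd[1024];
#ifdef __APPLE__
  snprintf (Cmd, sizeof Cmd, "afplay '%s'", WavPath);
#else
  snprintf (Cmd, sizeof Cmd, "aplay '%s'", WavPath);  /* untested guess */
#endif
  return system (Cmd);
}

/* Set output volume, 0-100 (macOS only in this sketch). */
static int SetVolume (int Percent)
{
#ifdef __APPLE__
  char Cmd[128];
  snprintf (Cmd, sizeof Cmd, "osascript -e 'set volume output volume %d'", Percent);
  return system (Cmd);
#else
  (void)Percent;
  return -1;  /* amixer/pactl would go here on Linux */
#endif
}

int main (void)
{
  SetVolume (50);
  TextToPcmFile ("Hello from the pre-OS world", 180, "/tmp/tts.wav");
  return PlayPcmFile ("/tmp/tts.wav");
}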

Short term, if anyone has some free time, could they look at testing/debugging audio.c on Linux and send feedback? Also, maybe someone could start playing around with getting audio.c ported to Windows.

After I figure out getting async audio output working in audio.c, I think the next steps are working on the protocol/variable definitions. With the protocol/variable definitions we should be able to add the capability to the EmulatorPkg. At that point it should be possible for anyone to help work on the accessibility UI. Obviously, to get this working on a real platform we will need a pure EFI text-to-speech driver (we could debug that in the EmulatorPkg). We would also need Intel HDA audio drivers, audio drivers for OVMF, and drivers for the other audio hardware that is on people's platforms.

Thanks,

Andrew Fish

The plan is also to try to get some students to help with this at the next Google Summer of Code.
I'm not sure how it works, but since the idea is good and it is a real need, I believe we have a chance of getting some more hands working on it.

Thanks and Regards
Rafael R. Machado



On Thu, Sep 26, 2019 at 20:44, Andrew Fish <afish@...> wrote:



On Sep 26, 2019, at 5:15 AM, Rafael Machado <rafaelrodrigues.machado@...> wrote:

Hi everyone.

About Ethin's question:

"As for the boot manager, yes, that would be covered by the speech synthesizer protocol. Perhaps we could extend this to the setup utility too (as well as all other things displayed on the screen)?"

Answer: This is the target from my point of view. Since we have the advantages of drivers at UEFI, there is no reason not to have accessibility at pre-OS anymore. In the future we could also think about adding other kinds of accessibility, like a magnifying glass (which would be a multi-layer version of the GOP driver) to help low-vision people.

Just one thing for everyone to think about. The "Convention on the Rights of Persons with Disabilities" says that a person has the right to access information. So if someone buys a notebook, all information accessible to "common" people must be available to all kinds of people, in a way they can choose to use this information or not.

"(e) Recognizing that disability is an evolving concept and that disability results from the interaction between persons with impairments and attitudinal and environmental barriers that hinders their full and effective participation in society on an equal basis with others, ..."

Details can be found at the United Nations web site:
https://www.un.org/disabilities/documents/convention/convention_accessible_pdf.pdf


About Andrew's comments and questions:

"What member functions do you think EFI_AUDIO_OUTPUT_PROTOCOL should contain?"

Answer: From my perspective, this protocol would need at least the following functions (names may change after more discussion; see the sketch after the list):

- EFI_AUDIO_OUTPUT_PROTOCOL.LoadStream()
This function is intended to copy some data to the DMA buffers that were allocated previously by the audio controller driver. The initial idea I had, in the case of my MSc, was to add an HdaDriver.efi that is responsible for all the HDA controller initialization (allocating DMA buffers, configuring the PCIe bus if needed). I'm not sure this initialization should be added to the EFI_AUDIO_OUTPUT_PROTOCOL, because it could create problems when using other controllers or concepts, as I believe ARM platforms do. My understanding is that the platforms should have a driver to initialize the audio controller/codec, so the EFI_AUDIO_OUTPUT_PROTOCOL only worries about loading and playing information.

- EFI_AUDIO_OUTPUT_PROTOCOL.PlayStream()
This function is responsible for playing the stream. This is done by setting one specific bit in the controller's register space (pointed to by the BAR[0] register in the PCIe config space). For this project I don't think we need to worry about implementing a sound server to enable multiple streams at a time; the actions done by this function are much simpler: just play a stream.

- EFI_AUDIO_OUTPUT_PROTOCOL.StopStream()
This function is responsible for stopping stream playback, the opposite of the previous one. Maybe we could use a single function that receives a parameter to start or stop playback.

- EFI_AUDIO_OUTPUT_PROTOCOL.SetVolume()
This function is responsible for setting the volume to be used on all nodes of the codec (so it depends on an EFI_AUDIO_CODEC_PROTOCOL that knows the commands a given codec needs to receive to process streams).

- EFI_AUDIO_OUTPUT_PROTOCOL.SetStreamSpeed()
This function is responsible for setting the speed at which a given stream should be played. Just to clarify: visually impaired people have an amazing capacity for understanding sound streams at incredibly fast speeds, so the possibility of setting the stream processing speed is what makes them as productive as a person with typical vision.

- EFI_AUDIO_OUTPUT_PROTOCOL.UnloadStream()
This function is responsible for cleaning the buffers and leaving them ready to receive other audio streams.
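
In EDK2 header terms, the list above could look roughly like this (a sketch only, to anchor the discussion; every signature here is a guess and would change after design review):

/* Sketch only; assumes the usual <Uefi.h> types. */
typedef struct _EFI_AUDIO_OUTPUT_PROTOCOL EFI_AUDIO_OUTPUT_PROTOCOL;

typedef
EFI_STATUS
(EFIAPI *EFI_AUDIO_OUTPUT_LOAD_STREAM)(
  IN EFI_AUDIO_OUTPUT_PROTOCOL  *This,
  IN VOID                       *PcmData,     // copied into the DMA buffers
  IN UINTN                      PcmDataSize
  );

typedef
EFI_STATUS
(EFIAPI *EFI_AUDIO_OUTPUT_STREAM_CONTROL)(
  IN EFI_AUDIO_OUTPUT_PROTOCOL  *This         // Play/Stop/Unload take no extra args
  );

typedef
EFI_STATUS
(EFIAPI *EFI_AUDIO_OUTPUT_SET_VOLUME)(
  IN EFI_AUDIO_OUTPUT_PROTOCOL  *This,
  IN UINT8                      Volume        // e.g. 0-100; semantics to be discussed
  );

typedef
EFI_STATUS
(EFIAPI *EFI_AUDIO_OUTPUT_SET_STREAM_SPEED)(
  IN EFI_AUDIO_OUTPUT_PROTOCOL  *This,
  IN UINTN                      WordsPerMinute
  );

struct _EFI_AUDIO_OUTPUT_PROTOCOL {
  EFI_AUDIO_OUTPUT_LOAD_STREAM       LoadStream;
  EFI_AUDIO_OUTPUT_STREAM_CONTROL    PlayStream;
  EFI_AUDIO_OUTPUT_STREAM_CONTROL    StopStream;   // could fold into PlayStream
  EFI_AUDIO_OUTPUT_SET_VOLUME        SetVolume;
  EFI_AUDIO_OUTPUT_SET_STREAM_SPEED  SetStreamSpeed;
  EFI_AUDIO_OUTPUT_STREAM_CONTROL    UnloadStream;
};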

Still about Andrew's comments:

"This gave me the idea of adding EFI_AUDIO_OUTPUT_PROTOCOL and EFI_TEXT_TO_SPEECH_PROTOCOL drivers to the EmulatorPkg that thunk down to the App to do the work. With this we could prototype the UI part of accessibility."

Answer: This would be great, because we would split the development: one part focusing on the HII and navigation (done normally using the tab key, so each element on the screen would need a tab sequence number and an accessibility tag with what the speech software should say), and another focusing on the drivers/synthesizer/audio files, depending on our future decisions.

So my understanding is that the list of things is increasing. It is actually something like this (not ordered by priority):
- EFI_AUDIO_OUTPUT_PROTOCOL: protocol used to process the streams
- EFI_AUDIO_CODEC_PROTOCOL: protocol used to communicate with the codec
- An audio configuration driver (HdaAudio.efi, for example). Should this create some other protocol, like an EFI_AUDIO_CONTROLLER_PROTOCOL, that is responsible for configuring the controller?


Rafael,

Generally how it works for a PCI device is that it gets its Start() function called to bind to the device. So the EFI_AUDIO_OUTPUT_PROTOCOL and EFI_AUDIO_CODEC_PROTOCOL protocols would likely get installed on the PCI IO handle of the HDA PCI hardware. Some basic init will happen during Start(), and the driver could defer some config until one of the member functions is called the 1st time.

So, for example, on a normal boot the audio driver may not even get started, which means there may have to be an NVRAM variable to start the driver. We might also need an NVRAM variable for the text-to-speech driver if it is stored on disk.
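
A minimal sketch of that Start() flow (standard driver binding boilerplate; the audio protocol GUID and instance are placeholders, not real EDK2 symbols):

EFI_STATUS
EFIAPI
HdaAudioDriverBindingStart (
  IN EFI_DRIVER_BINDING_PROTOCOL  *This,
  IN EFI_HANDLE                   ControllerHandle,
  IN EFI_DEVICE_PATH_PROTOCOL     *RemainingDevicePath OPTIONAL
  )
{
  EFI_STATUS           Status;
  EFI_PCI_IO_PROTOCOL  *PciIo;

  //
  // Standard binding step: open PCI IO on the HDA controller's handle.
  //
  Status = gBS->OpenProtocol (
                  ControllerHandle,
                  &gEfiPciIoProtocolGuid,
                  (VOID **)&PciIo,
                  This->DriverBindingHandle,
                  ControllerHandle,
                  EFI_OPEN_PROTOCOL_BY_DRIVER
                  );
  if (EFI_ERROR (Status)) {
    return Status;
  }

  //
  // Basic controller init goes here; heavier config can be deferred until
  // a member function is called the 1st time.
  //

  //
  // Install the audio protocol(s) on the same handle as the PCI IO.
  // gEfiAudioOutputProtocolGuid and mAudioOutput are placeholders for the
  // protocol sketched earlier in the thread.
  //
  Status = gBS->InstallMultipleProtocolInterfaces (
                  &ControllerHandle,
                  &gEfiAudioOutputProtocolGuid, &mAudioOutput,
                  NULL
                  );
  if (EFI_ERROR (Status)) {
    gBS->CloseProtocol (
           ControllerHandle,
           &gEfiPciIoProtocolGuid,
           This->DriverBindingHandle,
           ControllerHandle
           );
  }

  return Status;
}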

- EFI_TEXT_TO_SPEECH_PROTOCOL: protocol responsible for converting text to speech (using the EFI_AUDIO_OUTPUT_PROTOCOL)
- HII changes to add tab navigation capabilities and accessibility tags to each component on the screen

Are we aligned with the understanding?


I think so, and if we can mock an implementation in the EmulatorPkg, that would let the HII UI work happen independently of the text-to-speech and audio driver work.

I'm on some airplanes in the next weeks (I'm at the airport right now), and my wife is heading to a conference after I get back from my trips. Given my dogs are not very talkative, I might find some time to try to get the EmulatorPkg mocking working.

Thanks,

Andrew Fish

I believe soon we will be prepared to talk about this at a design meeting.

Thanks and Regards
Rafael R. Machado



On Wed, Sep 25, 2019 at 22:08, Andrew Fish <afish@...> wrote:

Rafael,

What member functions do you think EFI_AUDIO_OUTPUT_PROTOCOL should contain?

I'm thinking if we had an EFI_TEXT_TO_SPEECH_PROTOCOL, that driver could produce a PCM/wave buffer, and it could then use EFI_AUDIO_OUTPUT_PROTOCOL to play the sound.

I poked around my Mac at lunch and I can generate text to speech from the command line into a wave file via the `say` command-line tool. I can play the wave file from the command line via `afplay`. This gave me the idea of adding EFI_AUDIO_OUTPUT_PROTOCOL and EFI_TEXT_TO_SPEECH_PROTOCOL drivers to the EmulatorPkg that thunk down to the App to do the work. With this we could prototype the UI part of accessibility. We could also debug the EFI_TEXT_TO_SPEECH_PROTOCOL in this environment.
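
The thunk protocol could be tiny, something like this (a sketch; the names and parameters are guesses):

typedef struct _EFI_TEXT_TO_SPEECH_PROTOCOL EFI_TEXT_TO_SPEECH_PROTOCOL;

//
// Render Text into a PCM/wave buffer the caller can hand to
// EFI_AUDIO_OUTPUT_PROTOCOL; the caller frees the buffer. In the
// EmulatorPkg the implementation would thunk down to the host's
// `say` tool (or its Linux/Windows equivalent).
//
typedef
EFI_STATUS
(EFIAPI *EFI_TEXT_TO_SPEECH_SYNTHESIZE)(
  IN  EFI_TEXT_TO_SPEECH_PROTOCOL  *This,
  IN  CHAR16                       *Text,
  IN  CHAR8                        *LanguageCode,   // RFC 4646: "en", "zh", "fr"
  OUT VOID                         **PcmBuffer,
  OUT UINTN                        *PcmBufferSize
  );

struct _EFI_TEXT_TO_SPEECH_PROTOCOL {
  EFI_TEXT_TO_SPEECH_SYNTHESIZE  Synthesize;
};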

Thanks,

Andrew Fish

On Sep 21, 2019, at 5:36 AM, Rafael Machado <rafaelrodrigues.machado@...> wrote:

Hi everyone,

Sorry for the delay on the response; too many things happening at the same time. I will try to answer e-mails to this thread at least every Saturday or Sunday morning.

About Andrew's and Laszlo's comments and questions:

"Please let us know what you find out. I probably don't have the time to help implement this feature, but I'm happy to help work on the architecture and design for UEFI accessibility on the edk2 mailing lists, and I can also represent whatever we come up with at the UEFI Spec Work Group."
During my MSc I had to study the audio and BIOS architectures a lot. The idea was to eliminate the first barrier to the creation of a screen reader for the pre-OS environment, which was the lack of an open implementation of audio control and actions at UEFI. To do that, I studied the Intel High Definition Audio spec and a lot of UEFI specs to better understand how to do it.
The initial target was to do all this development on OVMF, but as far as I could tell, the sound card is not available to OVMF. As Laszlo mentioned in this e-mail, there are some projects that may help with this, but at the time I was working on my MSc I didn't find them, so I did everything on a real system (an ASUS notebook).
It took me 2 years of work, because I didn't know a lot of things, and working on an MSc degree while holding a 40-hour/week job and being a father and a husband is not an easy task, but it got to what I was expecting.
The evolution of the project was this:
1 - First tests using some registers named "Immediate Registers", which I later noticed are not mandatory. This is a simple C major scale:
https://www.youtube.com/watch?v=I-mgzcOnRCg&feature=youtu.be
2 - Some months later I started to work with the ring buffers and DMA memory access. For the ones that have good ears, it's possible to hear some music behind the noise:
https://www.youtube.com/watch?v=6ED2BSc89-Y&feature=youtu.be
3 - Later, when I was almost giving up, I noticed that the problem was that one of my write operations was causing a misunderstanding between the audio controller and the audio codec. The controller was sending packets with 16-bit depth, but the codec was processing them as 8-bit depth:
https://www.youtube.com/watch?v=2De9dI9WbwM&feature=youtu.be
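
Just to make the MMIO part concrete: the very first step of the controller initialization is bringing it out of reset via the CRST bit in GCTL. A sketch (register offsets are from the Intel HDA spec; the busy-waits need timeouts, and the spec requires a short delay, at least 521 us, before trusting STATESTS):

#include <stdint.h>

#define HDA_GCTL       0x08  /* Global Control register */
#define HDA_GCTL_CRST  0x01  /* Controller Reset# (0 = held in reset) */
#define HDA_STATESTS   0x0E  /* one bit set per codec that requested a state change */

static uint32_t Read32 (volatile uint8_t *Bar0, uint32_t Off)
{
  return *(volatile uint32_t *)(Bar0 + Off);
}

static void Write32 (volatile uint8_t *Bar0, uint32_t Off, uint32_t Val)
{
  *(volatile uint32_t *)(Bar0 + Off) = Val;
}

/* Reset the controller; returns the codec-presence bitmask. */
uint16_t HdaControllerReset (volatile uint8_t *Bar0)
{
  /* Assert reset (CRST = 0) and wait for the bit to read back 0. */
  Write32 (Bar0, HDA_GCTL, Read32 (Bar0, HDA_GCTL) & ~HDA_GCTL_CRST);
  while ((Read32 (Bar0, HDA_GCTL) & HDA_GCTL_CRST) != 0) {
  }

  /* De-assert reset (CRST = 1) and wait for the controller to come up. */
  Write32 (Bar0, HDA_GCTL, Read32 (Bar0, HDA_GCTL) | HDA_GCTL_CRST);
  while ((Read32 (Bar0, HDA_GCTL) & HDA_GCTL_CRST) == 0) {
  }

  /* After reset, the attached codecs flag their presence here. */
  return *(volatile uint16_t *)(Bar0 + HDA_STATESTS);
}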

So the conclusion is that doing this at UEFI is much easier than doing it at the OS level.
The reference code, which is just a proof of concept and has several things to be improved, can be found here:
https://github.com/RafaelRMachado/Msc_UefiHda_PreOs_Accessibility

Currently it is just a UEFI application, but we can convert it to UEFI drivers after some discussion. Everything is released as BSD, so companies can use it without IP problems.
Just to give some more information about the need for this kind of solution: there are a lot of blind people who work with hardware support, so formatting disks, configuring RAID, and booting dual-boot systems is always a challenge for them. Even setting the BIOS clock. How to do that without the system's feedback?

"It would be hard to have a UEFI mandate for accessibility, given there is no guideline on how a User Interface (UI) works. If accessibility requires some form of hardware abstraction, like audio, then we could likely get that into the UEFI Spec. What might be possible is an EDK2 reference implementation of accessibility. Maybe we could use the reference implementation to write a UEFI white paper on design for accessibility? If there is an open source implementation, and an official design guide, this would make it much easier for advocacy groups to lobby for this feature."

I agree this is the way. Writing a white paper as an official EDK2 paper has been one of my targets since the beginning of my MSc, almost 5 years ago.

"I've got some experience with accessibility, as the macOS EFI OS Loader has a UI for the Full Disk Encryption password. If you press the power button quickly 3 times at the disk encryption password prompt, accessibility is enabled and Voice Over gets turned on. You then get localized voice prompts when you move between UI elements. Since this is the OS loader, all the resources are stored on the disk. You quickly run into a range of challenges: audio is hard, abstracting audio is hard (what codec does firmware have to support?), audio files are not small and firmware is size constrained, the need to localize the audio responses causes even more size issues, and the boot options are usually written by an OS installer, so how would firmware know what to call them?"

The solution to this would be porting a voice synthesizer, so no audio files would need to be stored. There are some open-source implementations that are not GPL.
This was described in my MSc as future work that can continue what I have started.

"I listed a lot of reasons it is hard, but as Kennedy stated in his 'We choose to go to the Moon!' speech, sometimes we choose to do things 'not because they are easy, but because they are hard; because that goal will serve to organize and measure the best of our energies and skills, because that challenge is one that we are willing to accept'. If we have a design, that means we can break the problem up into smaller parts, and maybe we can find people that have expertise in that part to build a chunk at a time. If we could implement the prototype in OVMF, that would show how it works but run on everyone's machines, so that would be really helpful for demos and design review."

I totally agree. Amazing words that I hadn't heard before. Thanks!
As far as I could understand, and with Leif's help, some possible future steps could be (not in this specific order):
1) Convert the proof-of-concept HDA driver to the UEFI driver model with proper PCI discovery.
2) Design a prototype EFI_AUDIO_OUTPUT_PROTOCOL, rework the driver to produce it, and rework the application to discover and consume it.
3) Implement a USB Audio Class driver also producing EFI_AUDIO_OUTPUT_PROTOCOL, and ensure the test application remains functional.
4) Separate controller and codec code by creating an EFI_AUDIO_CODEC_PROTOCOL, implement it in the HDA driver, and separate out the codec support into individual drivers.
5) Prototype audio output additions to HII (using pre-recorded audio files).
6) Port a voice synthesizer to UEFI (eliminating the need for audio files).

Beyond this, there are other things we should look at adding, like:
- EFI_AUDIO_INPUT_PROTOCOL
- Audio input additions to HII

It's a lot of work, but I accept the challenge.
It may take a long time, but it is possible.

I am still trying to find some time to finish the translation of my thesis into English.
I wrote everything in Portuguese because there was no material about UEFI for the Brazilian audience, and another target I have is to show companies that we have people who can work on this kind of project in Brazil, bringing this kind of development to South America. (Yes, I have complicated targets, but I like the challenge :) )

Thanks and Regards
Rafael R. Machado

On Thu, Sep 19, 2019 at 14:45, Laszlo Ersek <lersek@...> wrote:

On 09/18/19 19:57, Andrew Fish wrote:
Rafael,

Please let us know what you find out. I probably don't have the time to help implement this feature, but I'm happy to help work on the architecture and design for UEFI accessibility on the edk2 mailing lists, and I can also represent whatever we come up with at the UEFI Spec Work Group.

It would be hard to have a UEFI mandate for accessibility, given there is no guideline on how a User Interface (UI) works. If accessibility requires some form of hardware abstraction, like audio, then we could likely get that into the UEFI Spec. What might be possible is an EDK2 reference implementation of accessibility. Maybe we could use the reference implementation to write a UEFI white paper on design for accessibility? If there is an open source implementation, and an official design guide, this would make it much easier for advocacy groups to lobby for this feature.

I've got some experience with accessibility, as the macOS EFI OS Loader has a UI for the Full Disk Encryption password. If you press the power button quickly 3 times at the disk encryption password prompt, accessibility is enabled and Voice Over gets turned on. You then get localized voice prompts when you move between UI elements. Since this is the OS loader, all the resources are stored on the disk. You quickly run into a range of challenges: audio is hard, abstracting audio is hard (what codec does firmware have to support?), audio files are not small and firmware is size constrained, the need to localize the audio responses causes even more size issues, and the boot options are usually written by an OS installer, so how would firmware know what to call them?

I listed a lot of reasons it is hard, but as Kennedy stated in his "We choose to go to the Moon!" speech, sometimes we choose to do things "not because they are easy, but because they are hard; because that goal will serve to organize and measure the best of our energies and skills, because that challenge is one that we are willing to accept". If we have a design, that means we can break the problem up into smaller parts, and maybe we can find people that have expertise in that part to build a chunk at a time. If we could implement the prototype in OVMF, that would show how it works but run on everyone's machines, so that would be really helpful for demos and design review.
Somewhat related: in April there was a thread on virtio-dev that suggests there is interest in a virtio-audio device model:

https://lists.oasis-open.org/archives/virtio-dev/201904/msg00049.html

It looks like the ACRN project already implements a (non-standard, as of now) virtio-audio device:

https://lists.oasis-open.org/archives/virtio-dev/201907/msg00061.html

(This is all I can mention right now.)

Thanks
Laszlo


--
Signed,
Ethin D. Probst

