Need clarification about UEFI Strings


 

Hello everyone, I am trying to write an implementation of UEFI
strings in Rust and would like clarification on a few points.

Are UEFI strings UTF-16 encoded? I have looked at some earlier Rust
implementations, and it seems UEFI does not support all of UTF-16
but only UCS-2
(https://en.wikipedia.org/wiki/Universal_Coded_Character_Set), which is
a subset of UTF-16.

There is also something called WTF-8
(https://en.wikipedia.org/wiki/UTF-8#WTF-8), which Rust uses to
represent OsStrings on Windows, which is supposed to use UTF-16 (?).

Anyway, if someone can point me to the resources/specifications for
UEFI strings, it would be a great help.

Ayush Singh


Pedro Falcato
 

Hi Ayush,

In the latest UEFI spec (2.9), section 2.3.1 specifies that CHAR8 strings/characters are (usually) ASCII, and CHAR16 strings/characters are (usually) UCS-2 (*not* UTF-16).
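
For a Rust implementation, that could look roughly like the sketch below. It assumes CHAR16 strings are NUL-terminated and treats any character outside the Basic Multilingual Plane as an error; `encode_ucs2` and `Ucs2Error` are made-up names for illustration, not taken from any existing crate.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum Ucs2Error {
    /// The character lies outside the Basic Multilingual Plane,
    /// so UTF-16 would need a surrogate pair to encode it.
    OutOfRange(char),
    /// An interior NUL would end the string early.
    InteriorNul,
}

/// Encode `s` as a NUL-terminated buffer of CHAR16 (u16) units.
fn encode_ucs2(s: &str) -> Result<Vec<u16>, Ucs2Error> {
    let mut out = Vec::with_capacity(s.len() + 1);
    for c in s.chars() {
        let cp = c as u32;
        if cp == 0 {
            return Err(Ucs2Error::InteriorNul);
        }
        if cp > 0xFFFF {
            return Err(Ucs2Error::OutOfRange(c));
        }
        // A Rust `char` is never a surrogate, so every value that
        // reaches this point fits in one 16-bit unit.
        out.push(cp as u16);
    }
    out.push(0); // terminator (assumed convention)
    Ok(out)
}
```

With this, `encode_ucs2("Hi")` gives the units for `H`, `i` and a trailing 0, while a string containing an emoji is rejected instead of being silently split into two units.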

--
Pedro Falcato


 

Thanks, Pedro,

However, according to the spec, it is also possible to construct ASCII
strings. So when would ASCII strings be used instead of the usual UCS-2
strings?

Ayush Singh



Pedro Falcato
 

I'd say that it depends. But 98% of the strings you'll find in UEFI (including APIs) are UCS-2 CHAR16 strings.
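
If CHAR8 strings do turn up, they are easy to bridge, since every ASCII code point has the same numeric value in UCS-2. A minimal sketch, assuming a NUL-terminated input buffer (the helper name `char8_to_char16` is invented for this example):

```rust
/// Widen a NUL-terminated CHAR8 (ASCII) buffer into CHAR16 units.
/// Returns `None` if no terminator is found or if a byte before the
/// terminator is outside the 7-bit ASCII range.
fn char8_to_char16(bytes: &[u8]) -> Option<Vec<u16>> {
    // Everything before the first NUL is the string body.
    let len = bytes.iter().position(|&b| b == 0)?;
    let text = &bytes[..len];
    if !text.is_ascii() {
        return None;
    }
    // ASCII values are unchanged when widened to 16 bits.
    let mut out: Vec<u16> = text.iter().map(|&b| u16::from(b)).collect();
    out.push(0); // re-add the terminator
    Some(out)
}
```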


--
Pedro Falcato


 

OK, thanks for all the help.

Ayush Singh



 

Just for clarification: UCS-2 rather than UTF-16 means there are no
surrogate pairs, right?
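
To make the question concrete, here is a sketch of the check I have in mind. It assumes that the only difference that matters is the surrogate range 0xD800..=0xDFFF; the name `is_ucs2` is just my own.

```rust
/// Returns true if `units` contains no UTF-16 surrogate code units
/// (0xD800..=0xDFFF), so each unit stands for exactly one code point
/// in the Basic Multilingual Plane.
fn is_ucs2(units: &[u16]) -> bool {
    units.iter().all(|u| !(0xD800..=0xDFFF).contains(u))
}
```

For example, the output of `"héllo".encode_utf16()` would pass this check, while an emoji encodes to two surrogate units and would fail it.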

Ayush Singh
