Re: Windows 2019 VM fails to boot from vhost-scsi with UEFI mode

annie li

On 5/28/2020 5:51 PM, Laszlo Ersek wrote:
On 05/28/20 00:04, annie li wrote:
On 5/27/2020 2:00 PM, Laszlo Ersek wrote:
(4) Annie: can you try launching QEMU with the following flag:

    -global vhost-scsi-pci.max_sectors=2048

If that works, then I *guess* the kernel-side vhost device model
could interrogate the virtio-scsi config space for "max_sectors", and
use the value seen there in place of PREALLOC_SGLS /
PREALLOC_PROT_SGLS.
I am a little confused here.
TCM_VHOST_PREALLOC_SGLS (2048) and TCM_VHOST_PREALLOC_PROT_SGLS (512)
are hard coded in vhost/scsi.c, and are passed as the limit at the two
call sites:

    sgl_count = vhost_scsi_calc_sgls(prot_iter, prot_bytes,
                                     TCM_VHOST_PREALLOC_PROT_SGLS);
    sgl_count = vhost_scsi_calc_sgls(data_iter, data_bytes,
                                     TCM_VHOST_PREALLOC_SGLS);

In vhost_scsi_calc_sgls(), an error is printed out if sgl_count is more
than max_sgls:

    sgl_count = iov_iter_npages(iter, 0xffff);
    if (sgl_count > max_sgls) {
        pr_err("%s: requested sgl_count: %d exceeds pre-allocated"
               " max_sgls: %d\n", __func__, sgl_count, max_sgls);
        return -EINVAL;
    }


Looks like vhost-scsi doesn't interrogate the virtio-scsi config space
for "max_sectors". The guest virtio-scsi driver may read that field,
though.

So the following flag reduces the transfer size to 8M on the QEMU side:
"-global vhost-scsi-pci.max_sectors=2048"
Due to this setting, even though the max ByteCount of the Read(10)
command in ScsiDiskDxe/ScsiDisk.c is 0xFFFF*0x200 = 0x1FFFE00 (32M),
the EFI_BAD_BUFFER_SIZE logic in ScsiDiskDxe/ScsiDisk.c retries and
adjusts the request size to <= 8M?

The transfer size that ultimately reaches the device is the minimum of
three quantities:

(a) the transfer size requested by the caller (i.e., the UEFI application),

(b) the limit set by the READ(10) / READ(16) decision (i.e., MaxBlock),

(c) the transfer size limit enforced / reported by the device (the
virtio-scsi "max_sectors" config field, in 512-byte units).

Whichever is the smallest from the three, determines the transfer size
that the device ultimately sees in the request.

And then *that* transfer size must satisfy PREALLOC_SGLS and/or
PREALLOC_PROT_SGLS (2048 4K pages: 0x80_0000 bytes).

In your original use case, (a) is 0x93_F400 bytes, (b) is 0x1FF_FE00
bytes, and (c) is 0x1FF_FE00 too. Therefore the minimum is 0x93_F400, so
that is what reaches the device. And because 0x93_F400 exceeds
0x80_0000, the request fails.

When you set "-global vhost-scsi-pci.max_sectors=2048", that lowers (c)
to 0x10_0000. (a) and (b) remain unchanged. Therefore the new minimum
(which finally reaches the device) is 0x10_0000. This does not exceed
0x80_0000, so the request succeeds.
Much clearer now, thank you!

... In my prior email, I think I missed a detail: while the unit for
QEMU's "vhost-scsi-pci.max_sectors" property is a "sector" (512 bytes),
the unit for PREALLOC_SGLS and PREALLOC_PROT_SGLS in the kernel device
model seems to be a *page*, rather than a sector. (I don't think I've
ever checked iov_iter_npages() before.)

Therefore the QEMU flag that I recommended previously was too strict.
Can you try this instead, please?:

-global vhost-scsi-pci.max_sectors=16384
It works but run into another failure. I put details in another email.

This should set (c) to 0x80_0000 bytes. And so the minimum of {(a), (b),
(c)} will be 0x80_0000 bytes -- exactly what PREALLOC_SGLS can
accommodate.

Although Win2019 boots from vhost-scsi with the above flag, I assume we
still need to enlarge the value of VHOST_SCSI_PREALLOC_SGLS in
vhost-scsi for a final fix, instead of setting max_sectors through QEMU
options?
There are multiple ways (alternatives) to fix the issue.

- use larger constants for PREALLOC_SGLS and PREALLOC_PROT_SGLS in the
kernel;

- or replace the PREALLOC_SGLS and PREALLOC_PROT_SGLS constants in the
kernel altogether, with such logic that dynamically calculates them from
the "max_sectors" virtio-scsi config header field;

- or change the QEMU default for "vhost-scsi-pci.max_sectors", from
0xFFFF to 16384.
I prefer fixing it on the kernel side; details are in another email
too. :-)


Any of these should work.

