import qemu-kvm-9.0.0-10.el9_5

i9-beta changed/i9-beta/qemu-kvm-9.0.0-10.el9_5
Arkady L. Shane 2 months ago
parent c3d7983744
commit 9c30296e51
Signed by untrusted user: tigro
GPG Key ID: 1EC08A25C9DB2503

.gitignore

@@ -1 +1 @@
SOURCES/qemu-8.0.0.tar.xz
SOURCES/qemu-9.0.0.tar.xz

@@ -1 +1 @@
17d54a85aa5d7f5dcfc619aa34049f9a91ceed0d SOURCES/qemu-8.0.0.tar.xz
6699bb03d6da21159b89668bca01c6c958b95d07 SOURCES/qemu-9.0.0.tar.xz

@@ -1,4 +1,4 @@
From 84039bfc860878f3c3421de4a1836ac5d6300ed7 Mon Sep 17 00:00:00 2001
From ea7dff3dbf979d7d8a85a16cf5187235143e1048 Mon Sep 17 00:00:00 2001
From: Miroslav Rezanina <mrezanin@redhat.com>
Date: Wed, 26 May 2021 10:56:02 +0200
Subject: Initial redhat build
@@ -13,7 +13,7 @@ several issues are fixed in QEMU tree:
We disable make check due to issues with some of the tests.
This rebase is based on qemu-kvm-7.2.0-14.el9
This rebase is based on qemu-kvm-8.2.0-11.el9
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
--
@@ -50,32 +50,45 @@ Rebase changes (7.0.0):
- Change permissions on installing tests/Makefile.include
- Remove ssh block driver
Rebase changes (7.1.0 rc0):
Rebase changes (7.1.0):
- --disable-vnc-png renamed to --disable-png (upstream)
- removed --disable-vhost-vsock and --disable-vhost-scsi
- capstone submodule removed
- Temporary include capstone build
Rebase changes (7.2.0 rc0):
Rebase changes (7.2.0):
- Switch --enable-slirp=system to --enable-slirp
Rebase changes (7.2.0 rc2):
- Added new configure options (blkio and sndio, both disabled)
Rebase changes (7.2.0):
- Fix SRPM name generation to work on Fedora 37
- Switch back to system meson
Rebase changes (8.0.0-rc1):
Rebase changes (8.0.0):
- use enable-dtrace-backends instead of enable-dtrace-backend
- Removed qemu virtiofsd bits
Rebase changes (8.0.0-rc2):
- test/check-block.sh removed (upstream)
Rebase changes (8.0.0-rc3):
- Add new --disable-* options for configure
Rebase changes (8.1.0):
- qmp-spec.txt installed by make
- Removed --meson configure option
- Add --disable-pypi
- Removed --with-git and -with-gitsubmodules
- Renamed --disable-pypi to --disable-downloads
- Minor updates in README.tests
Rebase changes (8.2.0):
- Removed --disable-hax (upstream)
- Added --disable-plugins configure option
- Fixing frh.py strings
Rebase notes (9.0.0):
- Fixed qemu-kvm binary location change
- Remove hppa-firmware64.img
- Package stp files for utilities
- Download subprojects on local build
Merged patches (6.0.0):
- 605758c902 Limit build on Power to qemu-img and qemu-ga only
@@ -168,24 +181,38 @@ Merged patches (7.0.0):
- d46d2710b2 spec: Obsolete old usb redir subpackage
- 6f52a50b68 spec: Obsolete ssh driver
Merged patches (7.2.0 rc4):
Merged patches (7.2.0):
- 8c6834feb6 Remove opengl display device subpackages (C9S MR 124)
- 0ecc97f29e spec: Add requires for packages with additional virtio-gpu variants (C9S MR 124)
Merged patches (8.0.0-rc1):
Merged patches (8.0.0):
- 7754f6ba78 Minor packaging fixes
- 401af56187 spec: Disable VDUSE
Merged patches (8.1.0):
- 0c2306676f Enable Linux io_uring
- b7fa6426d5 Enable libblkio block drivers
- 19f6d7a6f4 Fix virtio-blk-vhost-vdpa typo in spec file
- f356cae88f spec: Build DBUS display
- 77b763efd5 Provide elf2dmp binary in qemu-tools
Merged patches (8.2.0):
- cd9efa221d Enable qemu-kvm-device-usb-redirec for aarch64
Merged patches (9.0.0 rc0):
- 25de053dbf spec: Enable zstd
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
---
.distro/Makefile | 100 +
.distro/Makefile.common | 41 +
.distro/Makefile | 101 +
.distro/Makefile.common | 42 +
.distro/README.tests | 39 +
.distro/modules-load.conf | 4 +
.distro/qemu-guest-agent.service | 1 -
.distro/qemu-kvm.spec.template | 4528 +++++++++++++++++++++++
.distro/qemu-kvm.spec.template | 5170 +++++++++++++++++++++++
.distro/rpminspect.yaml | 6 +-
.distro/scripts/extract_build_cmd.py | 12 +
.distro/scripts/frh.py | 4 +-
.distro/scripts/process-patches.sh | 4 +
.gitignore | 1 +
README.systemtap | 43 +
@@ -193,7 +220,7 @@ Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
scripts/systemtap/conf.d/qemu_kvm.conf | 4 +
scripts/systemtap/script.d/qemu_kvm.stp | 1 +
ui/vnc-auth-sasl.c | 2 +-
15 files changed, 4784 insertions(+), 4 deletions(-)
16 files changed, 5430 insertions(+), 6 deletions(-)
create mode 100644 .distro/Makefile
create mode 100644 .distro/Makefile.common
create mode 100644 .distro/README.tests
@@ -296,5 +323,5 @@ index 47fdae5b21..2a950caa2a 100644
if (saslErr != SASL_OK) {
error_setg(errp, "Failed to initialize SASL auth: %s",
--
2.39.1
2.39.3

@@ -1,4 +1,4 @@
From 63829772dbc2075fc014a9d52e3968735d228018 Mon Sep 17 00:00:00 2001
From 780c39975b059deaee106775b6e3a240155acea3 Mon Sep 17 00:00:00 2001
From: Miroslav Rezanina <mrezanin@redhat.com>
Date: Wed, 7 Dec 2022 03:05:48 -0500
Subject: Enable/disable devices for RHEL
@@ -22,21 +22,37 @@ Rebase notes (7.0.0):
- Renamed CONFIG_ARM_GIC_TCG to CONFIG_ARM_GICV3_TCG
- Removed upstream devices
Rebase notes (7.1.0 rc0):
Rebase notes (7.1.0):
- Added CONFIG_VHOST_VSOCK and CONFIG_VHOST_USER_VSOCK configs
- Added CONFIG_CXL and CONFIG_CXL_MEM_DEVICE for aarch64 and x86_64
Rebase notes (7.1.0 rc3):
- Added CONFIG_VHOST_USER_FS option (all archs)
Rebase notes (7.2.0 rc20):
Rebase notes (7.2.0):
- Removed disabling a15mpcore.c as no longer needed
Rebase notes (8.0.0-rc1):
Rebase notes (8.0.0):
- Rename CONFIG_ACPI_X86_ICH to CONFIG_ACPI_ICH9
- Include qemu/error-report.h in hw/display/cirrus_vga.c
- Change virtiofsd dependency version
Rebase notes (8.1.0):
- Added CONFIG_PCIE_PCI_BRIDGE for x86_64
- Disabling tcg cpus for aarch64
- Disable CONFIG_ARM_V7M and remove related hack
- Moved aarch64 tcg cpu disabling from arm machine type commit
Rebase notes (8.2.0):
- Disabled new a710 arm64 tcg cpu
- No longer needed hack for removal of i2c-echo
- Disable new neoverse-v2
- Removed CONFIG_OPENGL from x86_64 config file
Rebase notes (9.0.0 rc0):
- Split CONFIG_IDE_QDEV to CONFIG_IDE_DEV and CONFIG_IDE_BUS (upstream change)
Rebase notes (9.0.0 rc1):
- Do not compile armv7 cpu types
Merged patches (6.1.0):
- c51bf45304 Remove SPICE and QXL from x86_64-rh-devices.mak
- 02fc745601 aarch64-rh-devices: add CONFIG_PVPANIC_PCI
@@ -53,32 +69,52 @@ Merged patches (7.0.0):
- fd7c45a5a8 redhat: Enable virtio-mem as tech-preview on x86-64
- c9e68ea451 Enable SGX -- RH Only
Merged patches (7.1.0 rc0):
Merged patches (7.1.0):
- 38b89dc245 pc: Move s3/s4 suspend disabling to compat (only hw/acpi/ich9.c chunk)
- 8f663466c6 configs/devices/aarch64-softmmu: Enable CONFIG_VIRTIO_MEM
- 1bf372717a Enable virtio-iommu-pci on aarch64
- ae3f269458 Enable virtio-iommu-pci on x86_64
Merged patches (8.1.0):
- 8173d2eaba Disable unwanted new devices
Merged patches (8.2.0):
- b29f66431f Enable igb on x86_64
Merged patches (9.0.0 rc0):
- 3889ede5d9 Compile IOMMUFD on x86_64
- 0beb18451f Compile IOMMUFD on s390x
- 2b4b13f70d Compile IOMMUFD object on aarch64
---
.distro/qemu-kvm.spec.template | 18 +--
.../aarch64-softmmu/aarch64-rh-devices.mak | 41 +++++++
.../aarch64-softmmu/aarch64-rh-devices.mak | 42 +++++++
.../ppc64-softmmu/ppc64-rh-devices.mak | 37 ++++++
configs/devices/rh-virtio.mak | 10 ++
.../s390x-softmmu/s390x-rh-devices.mak | 18 +++
.../x86_64-softmmu/x86_64-rh-devices.mak | 109 ++++++++++++++++++
hw/arm/meson.build | 2 +-
.../s390x-softmmu/s390x-rh-devices.mak | 19 +++
.../x86_64-softmmu/x86_64-rh-devices.mak | 112 ++++++++++++++++++
hw/arm/virt.c | 2 +
hw/block/fdc.c | 10 ++
hw/cpu/meson.build | 3 +-
hw/display/cirrus_vga.c | 7 +-
hw/cxl/meson.build | 3 +-
hw/display/cirrus_vga.c | 4 +
hw/ide/piix.c | 5 +-
hw/input/pckbd.c | 2 +
hw/net/e1000.c | 2 +
hw/ppc/spapr_cpu_core.c | 2 +
hw/usb/meson.build | 2 +-
target/arm/cpu_tcg.c | 10 ++
hw/virtio/meson.build | 6 +-
target/arm/arm-qmp-cmds.c | 2 +
target/arm/cpu.c | 4 +
target/arm/cpu.h | 3 +
target/arm/cpu64.c | 12 +-
target/arm/tcg/cpu32.c | 2 +
target/arm/tcg/cpu64.c | 8 ++
target/arm/tcg/meson.build | 4 +-
target/ppc/cpu-models.c | 9 ++
target/s390x/cpu_models_sysemu.c | 3 +
target/s390x/kvm/kvm.c | 8 ++
19 files changed, 285 insertions(+), 13 deletions(-)
tests/qtest/arm-cpu-features.c | 4 +
28 files changed, 321 insertions(+), 17 deletions(-)
create mode 100644 configs/devices/aarch64-softmmu/aarch64-rh-devices.mak
create mode 100644 configs/devices/ppc64-softmmu/ppc64-rh-devices.mak
create mode 100644 configs/devices/rh-virtio.mak
@@ -87,22 +123,22 @@ Merged patches (7.1.0 rc0):
diff --git a/configs/devices/aarch64-softmmu/aarch64-rh-devices.mak b/configs/devices/aarch64-softmmu/aarch64-rh-devices.mak
new file mode 100644
index 0000000000..720ec0cb57
index 0000000000..b0191d3c69
--- /dev/null
+++ b/configs/devices/aarch64-softmmu/aarch64-rh-devices.mak
@@ -0,0 +1,41 @@
@@ -0,0 +1,42 @@
+include ../rh-virtio.mak
+
+CONFIG_ARM_GIC_KVM=y
+CONFIG_ARM_GICV3_TCG=y
+CONFIG_ARM_GIC=y
+CONFIG_ARM_SMMUV3=y
+CONFIG_ARM_V7M=y
+CONFIG_ARM_VIRT=y
+CONFIG_CXL=y
+CONFIG_CXL_MEM_DEVICE=y
+CONFIG_EDID=y
+CONFIG_PCIE_PORT=y
+CONFIG_PCIE_PCI_BRIDGE=y
+CONFIG_PCI_DEVICES=y
+CONFIG_PCI_TESTDEV=y
+CONFIG_PFLASH_CFI01=y
@@ -132,6 +168,7 @@ index 0000000000..720ec0cb57
+CONFIG_VHOST_VSOCK=y
+CONFIG_VHOST_USER_VSOCK=y
+CONFIG_VHOST_USER_FS=y
+CONFIG_IOMMUFD=y
diff --git a/configs/devices/ppc64-softmmu/ppc64-rh-devices.mak b/configs/devices/ppc64-softmmu/ppc64-rh-devices.mak
new file mode 100644
index 0000000000..dbb7d30829
@@ -193,10 +230,10 @@ index 0000000000..94ede1b5f6
+CONFIG_VIRTIO_SERIAL=y
diff --git a/configs/devices/s390x-softmmu/s390x-rh-devices.mak b/configs/devices/s390x-softmmu/s390x-rh-devices.mak
new file mode 100644
index 0000000000..69a799adbd
index 0000000000..24cf6dbd03
--- /dev/null
+++ b/configs/devices/s390x-softmmu/s390x-rh-devices.mak
@@ -0,0 +1,18 @@
@@ -0,0 +1,19 @@
+include ../rh-virtio.mak
+
+CONFIG_PCI=y
@@ -215,12 +252,13 @@ index 0000000000..69a799adbd
+CONFIG_VHOST_VSOCK=y
+CONFIG_VHOST_USER_VSOCK=y
+CONFIG_VHOST_USER_FS=y
+CONFIG_IOMMUFD=y
diff --git a/configs/devices/x86_64-softmmu/x86_64-rh-devices.mak b/configs/devices/x86_64-softmmu/x86_64-rh-devices.mak
new file mode 100644
index 0000000000..668b2d0e18
index 0000000000..d60ff1bcfc
--- /dev/null
+++ b/configs/devices/x86_64-softmmu/x86_64-rh-devices.mak
@@ -0,0 +1,109 @@
@@ -0,0 +1,112 @@
+include ../rh-virtio.mak
+
+CONFIG_ACPI=y
@@ -258,7 +296,9 @@ index 0000000000..668b2d0e18
+CONFIG_IDE_CORE=y
+CONFIG_IDE_PCI=y
+CONFIG_IDE_PIIX=y
+CONFIG_IDE_QDEV=y
+CONFIG_IDE_DEV=y
+CONFIG_IDE_BUS=y
+CONFIG_IGB_PCI_EXPRESS=y
+CONFIG_IOAPIC=y
+CONFIG_IOH3420=y
+CONFIG_ISA_BUS=y
@@ -268,7 +308,6 @@ index 0000000000..668b2d0e18
+CONFIG_MC146818RTC=y
+CONFIG_MEM_DEVICE=y
+CONFIG_NVDIMM=y
+CONFIG_OPENGL=y
+CONFIG_PAM=y
+CONFIG_PC=y
+CONFIG_PCI=y
@@ -282,6 +321,7 @@ index 0000000000..668b2d0e18
+CONFIG_PCSPK=y
+CONFIG_PC_ACPI=y
+CONFIG_PC_PCI=y
+CONFIG_PCIE_PCI_BRIDGE=y
+CONFIG_PFLASH_CFI01=y
+CONFIG_PVPANIC_ISA=y
+CONFIG_PXB=y
@@ -330,21 +370,29 @@ index 0000000000..668b2d0e18
+CONFIG_VHOST_VSOCK=y
+CONFIG_VHOST_USER_VSOCK=y
+CONFIG_VHOST_USER_FS=y
diff --git a/hw/arm/meson.build b/hw/arm/meson.build
index b545ba0e4f..a41a16cba7 100644
--- a/hw/arm/meson.build
+++ b/hw/arm/meson.build
@@ -29,7 +29,7 @@ arm_ss.add(when: 'CONFIG_VEXPRESS', if_true: files('vexpress.c'))
arm_ss.add(when: 'CONFIG_ZYNQ', if_true: files('xilinx_zynq.c'))
arm_ss.add(when: 'CONFIG_SABRELITE', if_true: files('sabrelite.c'))
-arm_ss.add(when: 'CONFIG_ARM_V7M', if_true: files('armv7m.c'))
+#arm_ss.add(when: 'CONFIG_ARM_V7M', if_true: files('armv7m.c'))
arm_ss.add(when: 'CONFIG_EXYNOS4', if_true: files('exynos4210.c'))
arm_ss.add(when: 'CONFIG_PXA2XX', if_true: files('pxa2xx.c', 'pxa2xx_gpio.c', 'pxa2xx_pic.c'))
arm_ss.add(when: 'CONFIG_DIGIC', if_true: files('digic.c'))
+CONFIG_IOMMUFD=y
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index a9a913aead..6c6d155002 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2954,6 +2954,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
MachineClass *mc = MACHINE_CLASS(oc);
HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
static const char * const valid_cpu_types[] = {
+#if 0 /* Disabled for Red Hat Enterprise Linux */
#ifdef CONFIG_TCG
ARM_CPU_TYPE_NAME("cortex-a7"),
ARM_CPU_TYPE_NAME("cortex-a15"),
@@ -2971,6 +2972,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
#endif /* CONFIG_TCG */
#ifdef TARGET_AARCH64
ARM_CPU_TYPE_NAME("cortex-a53"),
+#endif /* disabled for RHEL */
ARM_CPU_TYPE_NAME("cortex-a57"),
#if defined(CONFIG_KVM) || defined(CONFIG_HVF)
ARM_CPU_TYPE_NAME("host"),
diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index d7cc4d3ec1..12d0a60905 100644
index 6dd94e98bc..a05757fc9a 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -49,6 +49,8 @@
@@ -372,18 +420,32 @@ index d7cc4d3ec1..12d0a60905 100644
error_setg(errp, "Cannot choose a fallback FDrive type of 'auto'");
return;
diff --git a/hw/cpu/meson.build b/hw/cpu/meson.build
index e37490074f..4431e3731c 100644
index 38cdcfbe57..e588ecfd42 100644
--- a/hw/cpu/meson.build
+++ b/hw/cpu/meson.build
@@ -1,4 +1,5 @@
-softmmu_ss.add(files('core.c', 'cluster.c'))
+#softmmu_ss.add(files('core.c', 'cluster.c'))
+softmmu_ss.add(files('core.c'))
softmmu_ss.add(when: 'CONFIG_ARM11MPCORE', if_true: files('arm11mpcore.c'))
softmmu_ss.add(when: 'CONFIG_REALVIEW', if_true: files('realview_mpcore.c'))
-system_ss.add(files('core.c', 'cluster.c'))
+#system_ss.add(files('core.c', 'cluster.c'))
+system_ss.add(files('core.c'))
system_ss.add(when: 'CONFIG_ARM11MPCORE', if_true: files('arm11mpcore.c'))
system_ss.add(when: 'CONFIG_REALVIEW', if_true: files('realview_mpcore.c'))
diff --git a/hw/cxl/meson.build b/hw/cxl/meson.build
index 3e375f61a9..613adb3ebb 100644
--- a/hw/cxl/meson.build
+++ b/hw/cxl/meson.build
@@ -6,7 +6,8 @@ system_ss.add(when: 'CONFIG_CXL',
'cxl-host.c',
'cxl-cdat.c',
'cxl-events.c',
- 'switch-mailbox-cci.c',
+# Disabled for 8.2.0 rebase for RHEL 9.4.0
+# 'switch-mailbox-cci.c',
),
if_false: files(
'cxl-host-stubs.c',
diff --git a/hw/display/cirrus_vga.c b/hw/display/cirrus_vga.c
index b80f98b6c4..cbde6a8f15 100644
index 150883a971..497365bd80 100644
--- a/hw/display/cirrus_vga.c
+++ b/hw/display/cirrus_vga.c
@@ -36,6 +36,7 @@
@@ -394,31 +456,21 @@ index b80f98b6c4..cbde6a8f15 100644
#include "sysemu/reset.h"
#include "qapi/error.h"
#include "trace.h"
@@ -47,6 +48,7 @@
#include "qom/object.h"
#include "ui/console.h"
+
/*
* TODO:
* - destination write mask support not complete (bits 5..7)
@@ -2946,7 +2948,10 @@ static void pci_cirrus_vga_realize(PCIDevice *dev, Error **errp)
@@ -2946,6 +2947,9 @@ static void pci_cirrus_vga_realize(PCIDevice *dev, Error **errp)
PCIDeviceClass *pc = PCI_DEVICE_GET_CLASS(dev);
int16_t device_id = pc->device_id;
- /*
+ warn_report("'cirrus-vga' is deprecated, "
+ "please use a different VGA card instead");
+ warn_report("'cirrus-vga' is deprecated, "
+ "please use a different VGA card instead");
+
+ /*
/*
* Follow real hardware, cirrus card emulated has 4 MB video memory.
* Also accept 8 MB/16 MB for backward compatibility.
*/
diff --git a/hw/ide/piix.c b/hw/ide/piix.c
index 41d60921e3..a4af45b4e8 100644
index 80efc633d3..9cb82b8eea 100644
--- a/hw/ide/piix.c
+++ b/hw/ide/piix.c
@@ -193,7 +193,8 @@ static void piix3_ide_class_init(ObjectClass *klass, void *data)
@@ -191,7 +191,8 @@ static void piix3_ide_class_init(ObjectClass *klass, void *data)
k->device_id = PCI_DEVICE_ID_INTEL_82371SB_1;
k->class_id = PCI_CLASS_STORAGE_IDE;
set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
@@ -428,7 +480,7 @@ index 41d60921e3..a4af45b4e8 100644
}
static const TypeInfo piix3_ide_info = {
@@ -216,6 +217,8 @@ static void piix4_ide_class_init(ObjectClass *klass, void *data)
@@ -215,6 +216,8 @@ static void piix4_ide_class_init(ObjectClass *klass, void *data)
k->class_id = PCI_CLASS_STORAGE_IDE;
set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
dc->hotpluggable = false;
@@ -438,10 +490,10 @@ index 41d60921e3..a4af45b4e8 100644
static const TypeInfo piix4_ide_info = {
diff --git a/hw/input/pckbd.c b/hw/input/pckbd.c
index b92b63bedc..3b6235dde6 100644
index 74f10b640f..2e85ecf476 100644
--- a/hw/input/pckbd.c
+++ b/hw/input/pckbd.c
@@ -957,6 +957,8 @@ static void i8042_class_initfn(ObjectClass *klass, void *data)
@@ -952,6 +952,8 @@ static void i8042_class_initfn(ObjectClass *klass, void *data)
dc->vmsd = &vmstate_kbd_isa;
adevc->build_dev_aml = i8042_build_aml;
set_bit(DEVICE_CATEGORY_INPUT, dc->categories);
@@ -451,10 +503,10 @@ index b92b63bedc..3b6235dde6 100644
static const TypeInfo i8042_info = {
diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 23d660619f..b75c9aa799 100644
index 43f3a4a701..267f182883 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -1805,6 +1805,7 @@ static const E1000Info e1000_devices[] = {
@@ -1746,6 +1746,7 @@ static const E1000Info e1000_devices[] = {
.revision = 0x03,
.phy_id2 = E1000_PHY_ID2_8254xx_DEFAULT,
},
@@ -462,7 +514,7 @@ index 23d660619f..b75c9aa799 100644
{
.name = "e1000-82544gc",
.device_id = E1000_DEV_ID_82544GC_COPPER,
@@ -1817,6 +1818,7 @@ static const E1000Info e1000_devices[] = {
@@ -1758,6 +1759,7 @@ static const E1000Info e1000_devices[] = {
.revision = 0x03,
.phy_id2 = E1000_PHY_ID2_8254xx_DEFAULT,
},
@@ -471,10 +523,10 @@ index 23d660619f..b75c9aa799 100644
static void e1000_register_types(void)
diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 8a4861f45a..fcb5dfe792 100644
index e7c9edd033..3b0a47a28c 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -379,10 +379,12 @@ static const TypeInfo spapr_cpu_core_type_infos[] = {
@@ -389,10 +389,12 @@ static const TypeInfo spapr_cpu_core_type_infos[] = {
.instance_size = sizeof(SpaprCpuCore),
.class_size = sizeof(SpaprCpuCoreClass),
},
@@ -482,16 +534,16 @@ index 8a4861f45a..fcb5dfe792 100644
DEFINE_SPAPR_CPU_CORE_TYPE("970_v2.2"),
DEFINE_SPAPR_CPU_CORE_TYPE("970mp_v1.0"),
DEFINE_SPAPR_CPU_CORE_TYPE("970mp_v1.1"),
DEFINE_SPAPR_CPU_CORE_TYPE("power5+_v2.1"),
DEFINE_SPAPR_CPU_CORE_TYPE("power5p_v2.1"),
+#endif
DEFINE_SPAPR_CPU_CORE_TYPE("power7_v2.3"),
DEFINE_SPAPR_CPU_CORE_TYPE("power7+_v2.1"),
DEFINE_SPAPR_CPU_CORE_TYPE("power7p_v2.1"),
DEFINE_SPAPR_CPU_CORE_TYPE("power8_v2.0"),
diff --git a/hw/usb/meson.build b/hw/usb/meson.build
index 599dc24f0d..905a994c3a 100644
index aac3bb35f2..5411ff35df 100644
--- a/hw/usb/meson.build
+++ b/hw/usb/meson.build
@@ -52,7 +52,7 @@ softmmu_ss.add(when: 'CONFIG_USB_SMARTCARD', if_true: files('dev-smartcard-reade
@@ -55,7 +55,7 @@ system_ss.add(when: 'CONFIG_USB_SMARTCARD', if_true: files('dev-smartcard-reader
if cacard.found()
usbsmartcard_ss = ss.source_set()
usbsmartcard_ss.add(when: 'CONFIG_USB_SMARTCARD',
@@ -500,86 +552,226 @@ index 599dc24f0d..905a994c3a 100644
hw_usb_modules += {'smartcard': usbsmartcard_ss}
endif
diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
index df0c45e523..c154a4dcf2 100644
--- a/target/arm/cpu_tcg.c
+++ b/target/arm/cpu_tcg.c
@@ -155,6 +155,7 @@ void define_cortex_a72_a57_a53_cp_reginfo(ARMCPU *cpu)
/* CPU models. These are not needed for the AArch64 linux-user build. */
#if !defined(CONFIG_USER_ONLY) || !defined(TARGET_AARCH64)
diff --git a/hw/virtio/meson.build b/hw/virtio/meson.build
index d7f18c96e6..aaabbb8b0b 100644
--- a/hw/virtio/meson.build
+++ b/hw/virtio/meson.build
@@ -20,7 +20,8 @@ if have_vhost
system_virtio_ss.add(files('vhost-user-base.c'))
# MMIO Stubs
- system_virtio_ss.add(files('vhost-user-device.c'))
+# Disabled for 8.2.0 rebase for RHEL 9.4.0
+# system_virtio_ss.add(files('vhost-user-device.c'))
system_virtio_ss.add(when: 'CONFIG_VHOST_USER_GPIO', if_true: files('vhost-user-gpio.c'))
system_virtio_ss.add(when: 'CONFIG_VHOST_USER_I2C', if_true: files('vhost-user-i2c.c'))
system_virtio_ss.add(when: 'CONFIG_VHOST_USER_RNG', if_true: files('vhost-user-rng.c'))
@@ -28,7 +29,8 @@ if have_vhost
system_virtio_ss.add(when: 'CONFIG_VHOST_USER_INPUT', if_true: files('vhost-user-input.c'))
# PCI Stubs
- system_virtio_ss.add(when: 'CONFIG_VIRTIO_PCI', if_true: files('vhost-user-device-pci.c'))
+# Disabled for 8.2.0 rebase for RHEL 9.4.0
+# system_virtio_ss.add(when: 'CONFIG_VIRTIO_PCI', if_true: files('vhost-user-device-pci.c'))
system_virtio_ss.add(when: ['CONFIG_VIRTIO_PCI', 'CONFIG_VHOST_USER_GPIO'],
if_true: files('vhost-user-gpio-pci.c'))
system_virtio_ss.add(when: ['CONFIG_VIRTIO_PCI', 'CONFIG_VHOST_USER_I2C'],
diff --git a/target/arm/arm-qmp-cmds.c b/target/arm/arm-qmp-cmds.c
index 3cc8cc738b..6f21fea1f5 100644
--- a/target/arm/arm-qmp-cmds.c
+++ b/target/arm/arm-qmp-cmds.c
@@ -223,6 +223,7 @@ CpuModelExpansionInfo *qmp_query_cpu_model_expansion(CpuModelExpansionType type,
static void arm_cpu_add_definition(gpointer data, gpointer user_data)
{
ObjectClass *oc = data;
+ CPUClass *cc = CPU_CLASS(oc);
CpuDefinitionInfoList **cpu_list = user_data;
CpuDefinitionInfo *info;
const char *typename;
@@ -231,6 +232,7 @@ static void arm_cpu_add_definition(gpointer data, gpointer user_data)
info = g_malloc0(sizeof(*info));
info->name = cpu_model_from_type(typename);
info->q_typename = g_strdup(typename);
+ info->deprecated = !!cc->deprecation_note;
QAPI_LIST_PREPEND(*cpu_list, info);
}
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index ab8d007a86..e5dce20f19 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -2546,6 +2546,10 @@ static void cpu_register_class_init(ObjectClass *oc, void *data)
acc->info = data;
cc->gdb_core_xml_file = "arm-core.xml";
+
+ if (acc->info->deprecation_note) {
+ cc->deprecation_note = acc->info->deprecation_note;
+ }
}
void arm_cpu_register(const ARMCPUInfo *info)
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index bc0c84873f..e9472c8bb8 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -37,6 +37,8 @@
#define KVM_HAVE_MCE_INJECTION 1
#endif
+#define RHEL_CPU_DEPRECATION "use 'host' / 'max'"
+
#define EXCP_UDEF 1 /* undefined instruction */
#define EXCP_SWI 2 /* software interrupt */
#define EXCP_PREFETCH_ABORT 3
@@ -1092,6 +1094,7 @@ typedef struct ARMCPUInfo {
const char *name;
void (*initfn)(Object *obj);
void (*class_init)(ObjectClass *oc, void *data);
+ const char *deprecation_note;
} ARMCPUInfo;
/**
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 985b1efe16..46a4e80171 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -648,6 +648,7 @@ static void aarch64_a57_initfn(Object *obj)
define_cortex_a72_a57_a53_cp_reginfo(cpu);
}
+#if 0 /* Disabled for Red Hat Enterprise Linux */
#if !defined(CONFIG_USER_ONLY) && defined(CONFIG_TCG)
static bool arm_v7m_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
static void aarch64_a53_initfn(Object *obj)
{
@@ -508,6 +509,7 @@ static void cortex_a9_initfn(Object *obj)
cpu->isar.reset_pmcr_el0 = 0x41093000;
define_arm_cp_regs(cpu, cortexa9_cp_reginfo);
ARMCPU *cpu = ARM_CPU(obj);
@@ -704,6 +705,7 @@ static void aarch64_a53_initfn(Object *obj)
cpu->gic_pribits = 5;
define_cortex_a72_a57_a53_cp_reginfo(cpu);
}
+#endif /* disabled for RHEL */
+#endif
#ifndef CONFIG_USER_ONLY
static uint64_t a15_l2ctlr_read(CPUARMState *env, const ARMCPRegInfo *ri)
@@ -532,6 +534,7 @@ static const ARMCPRegInfo cortexa15_cp_reginfo[] = {
.access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
};
static void aarch64_host_initfn(Object *obj)
{
@@ -742,8 +744,11 @@ static void aarch64_max_initfn(Object *obj)
}
static const ARMCPUInfo aarch64_cpus[] = {
- { .name = "cortex-a57", .initfn = aarch64_a57_initfn },
+ { .name = "cortex-a57", .initfn = aarch64_a57_initfn,
+ .deprecation_note = RHEL_CPU_DEPRECATION },
+#if 0 /* Disabled for Red Hat Enterprise Linux */
static void cortex_a7_initfn(Object *obj)
{ .name = "cortex-a53", .initfn = aarch64_a53_initfn },
+#endif /* disabled for RHEL */
{ .name = "max", .initfn = aarch64_max_initfn },
#if defined(CONFIG_KVM) || defined(CONFIG_HVF)
{ .name = "host", .initfn = aarch64_host_initfn },
@@ -814,8 +819,13 @@ static void aarch64_cpu_instance_init(Object *obj)
static void cpu_register_class_init(ObjectClass *oc, void *data)
{
ARMCPU *cpu = ARM_CPU(obj);
@@ -580,6 +583,7 @@ static void cortex_a7_initfn(Object *obj)
cpu->isar.reset_pmcr_el0 = 0x41072000;
define_arm_cp_regs(cpu, cortexa15_cp_reginfo); /* Same as A15 */
ARMCPUClass *acc = ARM_CPU_CLASS(oc);
+ CPUClass *cc = CPU_CLASS(oc);
acc->info = data;
+
+ if (acc->info->deprecation_note) {
+ cc->deprecation_note = acc->info->deprecation_note;
+ }
}
void aarch64_cpu_register(const ARMCPUInfo *info)
diff --git a/target/arm/tcg/cpu32.c b/target/arm/tcg/cpu32.c
index de8f2be941..8896295ae3 100644
--- a/target/arm/tcg/cpu32.c
+++ b/target/arm/tcg/cpu32.c
@@ -92,6 +92,7 @@ void aa32_max_features(ARMCPU *cpu)
cpu->isar.id_dfr1 = t;
}
+#if 0 /* Disabled for Red Hat Enterprise Linux */
/* CPU models. These are not needed for the AArch64 linux-user build. */
#if !defined(CONFIG_USER_ONLY) || !defined(TARGET_AARCH64)
@@ -1037,3 +1038,4 @@ static void arm_tcg_cpu_register_types(void)
type_init(arm_tcg_cpu_register_types)
#endif /* !CONFIG_USER_ONLY || !TARGET_AARCH64 */
+#endif /* disabled for RHEL */
diff --git a/target/arm/tcg/cpu64.c b/target/arm/tcg/cpu64.c
index 9f7a9f3d2c..7ec6851c9c 100644
--- a/target/arm/tcg/cpu64.c
+++ b/target/arm/tcg/cpu64.c
@@ -29,6 +29,7 @@
#include "cpu-features.h"
#include "cpregs.h"
static void cortex_a15_initfn(Object *obj)
+#if 0 /* Disabled for Red Hat Enterprise Linux */
static uint64_t make_ccsidr64(unsigned assoc, unsigned linesize,
unsigned cachesize)
{
@@ -628,6 +632,7 @@ static void cortex_a15_initfn(Object *obj)
define_arm_cp_regs(cpu, cortexa15_cp_reginfo);
@@ -134,6 +135,7 @@ static void aarch64_a35_initfn(Object *obj)
/* These values are the same with A53/A57/A72. */
define_cortex_a72_a57_a53_cp_reginfo(cpu);
}
+#endif
static void cpu_max_get_sve_max_vq(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
@@ -223,6 +225,7 @@ static void cpu_max_get_l0gptsz(Object *obj, Visitor *v, const char *name,
static Property arm_cpu_lpa2_property =
DEFINE_PROP_BOOL("lpa2", ARMCPU, prop_lpa2, true);
+#if 0 /* Disabled for Red Hat Enterprise Linux */
static void cortex_m0_initfn(Object *obj)
static void aarch64_a55_initfn(Object *obj)
{
ARMCPU *cpu = ARM_CPU(obj);
@@ -1110,6 +1115,7 @@ static void arm_v7m_class_init(ObjectClass *oc, void *data)
cc->gdb_core_xml_file = "arm-m-profile.xml";
@@ -1065,6 +1068,7 @@ static void aarch64_neoverse_n2_initfn(Object *obj)
aarch64_add_pauth_properties(obj);
aarch64_add_sve_properties(obj);
}
+#endif /* disabled for RHEL */
+#endif
#ifndef TARGET_AARCH64
/*
@@ -1177,6 +1183,7 @@ static void arm_max_initfn(Object *obj)
#endif /* !TARGET_AARCH64 */
* -cpu max: a CPU with as many features enabled as our emulation supports.
@@ -1271,6 +1275,7 @@ void aarch64_max_tcg_initfn(Object *obj)
qdev_property_add_static(DEVICE(obj), &arm_cpu_lpa2_property);
}
static const ARMCPUInfo arm_tcg_cpus[] = {
+#if 0 /* Disabled for Red Hat Enterprise Linux */
{ .name = "arm926", .initfn = arm926_initfn },
{ .name = "arm946", .initfn = arm946_initfn },
{ .name = "arm1026", .initfn = arm1026_initfn },
@@ -1192,7 +1199,9 @@ static const ARMCPUInfo arm_tcg_cpus[] = {
{ .name = "cortex-a7", .initfn = cortex_a7_initfn },
{ .name = "cortex-a8", .initfn = cortex_a8_initfn },
{ .name = "cortex-a9", .initfn = cortex_a9_initfn },
+#endif /* disabled for RHEL */
{ .name = "cortex-a15", .initfn = cortex_a15_initfn },
static const ARMCPUInfo aarch64_cpus[] = {
{ .name = "cortex-a35", .initfn = aarch64_a35_initfn },
{ .name = "cortex-a55", .initfn = aarch64_a55_initfn },
@@ -1282,14 +1287,17 @@ static const ARMCPUInfo aarch64_cpus[] = {
{ .name = "neoverse-v1", .initfn = aarch64_neoverse_v1_initfn },
{ .name = "neoverse-n2", .initfn = aarch64_neoverse_n2_initfn },
};
+#endif
static void aarch64_cpu_register_types(void)
{
+#if 0 /* Disabled for Red Hat Enterprise Linux */
{ .name = "cortex-m0", .initfn = cortex_m0_initfn,
.class_init = arm_v7m_class_init },
{ .name = "cortex-m3", .initfn = cortex_m3_initfn,
@@ -1224,6 +1233,7 @@ static const ARMCPUInfo arm_tcg_cpus[] = {
{ .name = "pxa270-b1", .initfn = pxa270b1_initfn },
{ .name = "pxa270-c0", .initfn = pxa270c0_initfn },
{ .name = "pxa270-c5", .initfn = pxa270c5_initfn },
+#endif /* disabled for RHEL */
#ifndef TARGET_AARCH64
{ .name = "max", .initfn = arm_max_initfn },
#endif
size_t i;
for (i = 0; i < ARRAY_SIZE(aarch64_cpus); ++i) {
aarch64_cpu_register(&aarch64_cpus[i]);
}
+#endif
}
type_init(aarch64_cpu_register_types)
diff --git a/target/arm/tcg/meson.build b/target/arm/tcg/meson.build
index 3b1a9f0fc5..6898b4de6f 100644
--- a/target/arm/tcg/meson.build
+++ b/target/arm/tcg/meson.build
@@ -56,5 +56,5 @@ arm_system_ss.add(files(
'psci.c',
))
-arm_system_ss.add(when: 'CONFIG_ARM_V7M', if_true: files('cpu-v7m.c'))
-arm_user_ss.add(when: 'TARGET_AARCH64', if_false: files('cpu-v7m.c'))
+#arm_system_ss.add(when: 'CONFIG_ARM_V7M', if_true: files('cpu-v7m.c'))
+#arm_user_ss.add(when: 'TARGET_AARCH64', if_false: files('cpu-v7m.c'))
diff --git a/target/ppc/cpu-models.c b/target/ppc/cpu-models.c
index 912b037c63..cd3ff700ac 100644
index f2301b43f7..f77ebfcc81 100644
--- a/target/ppc/cpu-models.c
+++ b/target/ppc/cpu-models.c
@@ -66,6 +66,7 @@
@@ -603,13 +795,13 @@ index 912b037c63..cd3ff700ac 100644
POWERPC_DEF("970fx_v1.0", CPU_POWERPC_970FX_v10, 970,
@@ -718,6 +721,7 @@
"PowerPC 970MP v1.1")
POWERPC_DEF("power5+_v2.1", CPU_POWERPC_POWER5P_v21, POWER5P,
POWERPC_DEF("power5p_v2.1", CPU_POWERPC_POWER5P_v21, POWER5P,
"POWER5+ v2.1")
+#endif
POWERPC_DEF("power7_v2.3", CPU_POWERPC_POWER7_v23, POWER7,
"POWER7 v2.3")
POWERPC_DEF("power7+_v2.1", CPU_POWERPC_POWER7P_v21, POWER7,
@@ -896,12 +900,15 @@ PowerPCCPUAlias ppc_cpu_aliases[] = {
POWERPC_DEF("power7p_v2.1", CPU_POWERPC_POWER7P_v21, POWER7,
@@ -894,13 +898,16 @@ PowerPCCPUAlias ppc_cpu_aliases[] = {
{ "7447a", "7447a_v1.2" },
{ "7457a", "7457a_v1.2" },
{ "apollo7pm", "7457a_v1.0" },
@@ -619,12 +811,13 @@ index 912b037c63..cd3ff700ac 100644
{ "970", "970_v2.2" },
{ "970fx", "970fx_v3.1" },
{ "970mp", "970mp_v1.1" },
{ "power5+", "power5+_v2.1" },
{ "power5+", "power5p_v2.1" },
{ "power5+_v2.1", "power5p_v2.1" },
{ "power5gs", "power5+_v2.1" },
+#endif
{ "power7", "power7_v2.3" },
{ "power7+", "power7+_v2.1" },
{ "power8e", "power8e_v2.1" },
{ "power7+", "power7p_v2.1" },
{ "power7+_v2.1", "power7p_v2.1" },
@@ -911,12 +918,14 @@ PowerPCCPUAlias ppc_cpu_aliases[] = {
{ "power10", "power10_v2.0" },
#endif
@@ -641,10 +834,10 @@ index 912b037c63..cd3ff700ac 100644
{ NULL, NULL }
};
diff --git a/target/s390x/cpu_models_sysemu.c b/target/s390x/cpu_models_sysemu.c
index 63981bf36b..87a4480c05 100644
index 2d99218069..0728bfcc20 100644
--- a/target/s390x/cpu_models_sysemu.c
+++ b/target/s390x/cpu_models_sysemu.c
@@ -35,6 +35,9 @@ static void check_unavailable_features(const S390CPUModel *max_model,
@@ -34,6 +34,9 @@ static void check_unavailable_features(const S390CPUModel *max_model,
(max_model->def->gen == model->def->gen &&
max_model->def->ec_ga < model->def->ec_ga)) {
list_add_feat("type", unavailable);
@@ -655,10 +848,10 @@ index 63981bf36b..87a4480c05 100644
/* detect missing features if any to properly report them */
diff --git a/target/s390x/kvm/kvm.c b/target/s390x/kvm/kvm.c
index 3ac7ec9acf..97da1a6424 100644
index 4ce809c5d4..55fb4855b1 100644
--- a/target/s390x/kvm/kvm.c
+++ b/target/s390x/kvm/kvm.c
@@ -2529,6 +2529,14 @@ void kvm_s390_apply_cpu_model(const S390CPUModel *model, Error **errp)
@@ -2565,6 +2565,14 @@ void kvm_s390_apply_cpu_model(const S390CPUModel *model, Error **errp)
error_setg(errp, "KVM doesn't support CPU models");
return;
}
@@ -673,6 +866,37 @@ index 3ac7ec9acf..97da1a6424 100644
prop.cpuid = s390_cpuid_from_cpu_model(model);
prop.ibc = s390_ibc_from_cpu_model(model);
/* configure cpu features indicated via STFL(e) */
diff --git a/tests/qtest/arm-cpu-features.c b/tests/qtest/arm-cpu-features.c
index 9d6e6190d5..f822526acb 100644
--- a/tests/qtest/arm-cpu-features.c
+++ b/tests/qtest/arm-cpu-features.c
@@ -452,8 +452,10 @@ static void test_query_cpu_model_expansion(const void *data)
assert_error(qts, "host", "The CPU type 'host' requires KVM", NULL);
/* Test expected feature presence/absence for some cpu types */
+#if 0 /* Disabled for Red Hat Enterprise Linux */
assert_has_feature_enabled(qts, "cortex-a15", "pmu");
assert_has_not_feature(qts, "cortex-a15", "aarch64");
+#endif /* disabled for RHEL */
/* Enabling and disabling pmu should always work. */
assert_has_feature_enabled(qts, "max", "pmu");
@@ -470,6 +472,7 @@ static void test_query_cpu_model_expansion(const void *data)
assert_has_feature_enabled(qts, "cortex-a57", "pmu");
assert_has_feature_enabled(qts, "cortex-a57", "aarch64");
+#if 0 /* Disabled for Red Hat Enterprise Linux */
assert_has_feature_enabled(qts, "a64fx", "pmu");
assert_has_feature_enabled(qts, "a64fx", "aarch64");
/*
@@ -482,6 +485,7 @@ static void test_query_cpu_model_expansion(const void *data)
"{ 'sve384': true }");
assert_error(qts, "a64fx", "cannot enable sve640",
"{ 'sve640': true }");
+#endif /* disabled for RHEL */
sve_tests_default(qts, "max");
pauth_tests_default(qts, "max");
--
2.39.1
2.39.3

@ -1,4 +1,4 @@
From c13f8e21b32aa06b08847e88080f2fdea5084a9b Mon Sep 17 00:00:00 2001
From 8e6a30073f9c1a5d6294b2d16556522453e227e7 Mon Sep 17 00:00:00 2001
From: Miroslav Rezanina <mrezanin@redhat.com>
Date: Fri, 11 Jan 2019 09:54:45 +0100
Subject: Machine type related general changes
@ -19,10 +19,19 @@ Rebase notes (7.0.0):
- Remove downstream changes leftovers in hw/rtc/mc146818rtc.c
- Remove unnecessary change in hw/usb/hcd-uhci.c
Rebase notes (7.1.0 rc0):
Rebase notes (7.1.0):
- Moved adding rhel_old_machine_deprecation variable from s390x to general machine types commit
- Moved adding hw_compat_rhel_8_6 struct from x86_64 to general machine types commit
Rebase notes (8.1.0):
- Do not modify unused vga-isa.c
Rebase notes (9.0.0 rc0):
- Updated smbios handling
Rebase notes (9.0.0 rc4):
- Moving downstream compat changes
Merged patches (6.1.0):
- f2fb42a3c6 redhat: add missing entries in hw_compat_rhel_8_4
- 1949ec258e hw/arm/virt: Disable PL011 clock migration through hw_compat_rhel_8_3
@ -40,67 +49,76 @@ Merged patches (7.0.0):
- ef5afcc86d Fix virtio-net-pci* "vectors" compat
- 168f0d56e3 compat: Update hw_compat_rhel_8_5 with 6.2.0 RC2 changes
Merged patches (7.1.0 rc0):
Merged patches (7.1.0):
- 38b89dc245 pc: Move s3/s4 suspend disabling to compat (only hw/acpi/piix4.c chunk)
- 1d6439527a WRB: Introduce RHEL 9.0.0 hw compat structure (only hw/core/machine.c and include/hw/boards.h chunk)
Merged patches (7.2.0 rc0):
Merged patches (7.2.0):
- 0be2889fa2 Introduce upstream 7.0 compat changes (only applicable parts)
Merged patches (8.0.0-rc1):
Merged patches (8.0.0):
- 21ed34787b Addd 7.2 compat bits for RHEL 9.1 machine type
- e5c8d5d603 virtio-rng-pci: fix migration compat for vectors
- 5a5fa77059 virtio-rng-pci: fix transitional migration compat for vectors
Merged patches (8.1.0):
- bd5d81d286 Add RHEL 9.2.0 compat structure (general part)
- 1165e24c6b hw/pci: Disable PCI_ERR_UNCOR_MASK reg for machine type <= pc-q35-rhel9.2.0
Merged patches (8.2.0):
- 4ee284aca9 Add machine types compat bits. (partial)
Merged patches (9.0.0 rc0):
- 4b8fe42abc virtio-mem: default-enable "dynamic-memslots"
---
hw/acpi/piix4.c | 2 +-
hw/arm/virt.c | 2 +-
hw/core/machine.c | 229 +++++++++++++++++++++++++++++++++++
hw/display/vga-isa.c | 2 +-
hw/i386/pc_piix.c | 2 +
hw/i386/pc_q35.c | 2 +
hw/core/machine.c | 269 +++++++++++++++++++++++++++++++++++
hw/i386/fw_cfg.c | 3 +-
hw/net/rtl8139.c | 4 +-
hw/smbios/smbios.c | 46 ++++++-
hw/smbios/smbios.c | 46 +++++-
hw/timer/i8254_common.c | 2 +-
hw/usb/hcd-xhci-pci.c | 59 ++++++---
hw/usb/hcd-xhci-pci.c | 59 ++++++--
hw/usb/hcd-xhci-pci.h | 1 +
include/hw/boards.h | 31 +++++
include/hw/firmware/smbios.h | 5 +-
hw/virtio/virtio-mem.c | 3 +-
include/hw/boards.h | 40 ++++++
include/hw/firmware/smbios.h | 4 +-
include/hw/i386/pc.h | 3 +
14 files changed, 367 insertions(+), 23 deletions(-)
13 files changed, 414 insertions(+), 24 deletions(-)
diff --git a/hw/acpi/piix4.c b/hw/acpi/piix4.c
index 63d2113b86..a24b9aac92 100644
index debe1adb84..e8ddcd716e 100644
--- a/hw/acpi/piix4.c
+++ b/hw/acpi/piix4.c
@@ -247,7 +247,7 @@ static bool vmstate_test_migrate_acpi_index(void *opaque, int version_id)
@@ -245,7 +245,7 @@ static bool vmstate_test_migrate_acpi_index(void *opaque, int version_id)
static const VMStateDescription vmstate_acpi = {
.name = "piix4_pm",
.version_id = 3,
- .minimum_version_id = 3,
+ .minimum_version_id = 2,
.post_load = vmstate_acpi_post_load,
.fields = (VMStateField[]) {
.fields = (const VMStateField[]) {
VMSTATE_PCI_DEVICE(parent_obj, PIIX4PMState),
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index ac626b3bef..4a6e89c7bc 100644
index 6c6d155002..36e9b4b4e9 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1629,7 +1629,7 @@ static void virt_build_smbios(VirtMachineState *vms)
@@ -1651,7 +1651,7 @@ static void virt_build_smbios(VirtMachineState *vms)
smbios_set_defaults("QEMU", product,
vmc->smbios_old_sys_ver ? "1.0" : mc->name, false,
- true, SMBIOS_ENTRY_POINT_TYPE_64);
+ true, NULL, NULL, SMBIOS_ENTRY_POINT_TYPE_64);
vmc->smbios_old_sys_ver ? "1.0" : mc->name,
- true);
+ true, NULL, NULL);
/* build the array of physical mem area from base_memmap */
mem_array.address = vms->memmap[VIRT_MEM].base;
diff --git a/hw/core/machine.c b/hw/core/machine.c
index cd13b8b0a3..5aa567fad3 100644
index 37ede0e7d4..695cb89a46 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -46,6 +46,235 @@ GlobalProperty hw_compat_7_2[] = {
@@ -296,6 +296,275 @@ GlobalProperty hw_compat_2_1[] = {
};
const size_t hw_compat_7_2_len = G_N_ELEMENTS(hw_compat_7_2);
const size_t hw_compat_2_1_len = G_N_ELEMENTS(hw_compat_2_1);
+/*
+ * RHEL only: machine types for previous major releases are deprecated
@ -108,6 +126,46 @@ index cd13b8b0a3..5aa567fad3 100644
+const char *rhel_old_machine_deprecation =
+ "machine types for previous major releases are deprecated";
+
+GlobalProperty hw_compat_rhel_9_4[] = {
+ /* hw_compat_rhel_9_4 from hw_compat_8_0 */
+ { TYPE_VIRTIO_NET, "host_uso", "off"},
+ /* hw_compat_rhel_9_4 from hw_compat_8_0 */
+ { TYPE_VIRTIO_NET, "guest_uso4", "off"},
+ /* hw_compat_rhel_9_4 from hw_compat_8_0 */
+ { TYPE_VIRTIO_NET, "guest_uso6", "off"},
+ /* hw_compat_rhel_9_4 from hw_compat_8_1 */
+ { TYPE_PCI_BRIDGE, "x-pci-express-writeable-slt-bug", "true" },
+ /* hw_compat_rhel_9_4 from hw_compat_8_1 */
+ { "ramfb", "x-migrate", "off" },
+ /* hw_compat_rhel_9_4 from hw_compat_8_1 */
+ { "vfio-pci-nohotplug", "x-ramfb-migrate", "off" },
+ /* hw_compat_rhel_9_4 from hw_compat_8_1 */
+ { "igb", "x-pcie-flr-init", "off" },
+ /* hw_compat_rhel_9_4 jira RHEL-24045 */
+ { "virtio-mem", "dynamic-memslots", "off" },
+};
+const size_t hw_compat_rhel_9_4_len = G_N_ELEMENTS(hw_compat_rhel_9_4);
+
+GlobalProperty hw_compat_rhel_9_3[] = {
+ /* hw_compat_rhel_9_3 from hw_compat_8_0 */
+ { "migration", "multifd-flush-after-each-section", "on"},
+ /* hw_compat_rhel_9_3 from hw_compat_8_0 */
+ { TYPE_PCI_DEVICE, "x-pcie-ari-nextfn-1", "on" },
+};
+const size_t hw_compat_rhel_9_3_len = G_N_ELEMENTS(hw_compat_rhel_9_3);
+
+GlobalProperty hw_compat_rhel_9_2[] = {
+ /* hw_compat_rhel_9_2 from hw_compat_7_2 */
+ { "e1000e", "migrate-timadj", "off" },
+ /* hw_compat_rhel_9_2 from hw_compat_7_2 */
+ { "virtio-mem", "x-early-migration", "false" },
+ /* hw_compat_rhel_9_2 from hw_compat_7_2 */
+ { "migration", "x-preempt-pre-7-2", "true" },
+ /* hw_compat_rhel_9_2 from hw_compat_7_2 */
+ { TYPE_PCI_DEVICE, "x-pcie-err-unc-mask", "off" },
+};
+const size_t hw_compat_rhel_9_2_len = G_N_ELEMENTS(hw_compat_rhel_9_2);
+
+/*
+ * Mostly the same as hw_compat_7_0
+ */
@ -331,53 +389,28 @@ index cd13b8b0a3..5aa567fad3 100644
+};
+const size_t hw_compat_rhel_7_6_len = G_N_ELEMENTS(hw_compat_rhel_7_6);
+
GlobalProperty hw_compat_7_1[] = {
{ "virtio-device", "queue_reset", "false" },
{ "virtio-rng-pci", "vectors", "0" },
diff --git a/hw/display/vga-isa.c b/hw/display/vga-isa.c
index 2a5437d803..0db2c2b2a1 100644
--- a/hw/display/vga-isa.c
+++ b/hw/display/vga-isa.c
@@ -89,7 +89,7 @@ static void vga_isa_realizefn(DeviceState *dev, Error **errp)
}
static Property vga_isa_properties[] = {
- DEFINE_PROP_UINT32("vgamem_mb", ISAVGAState, state.vram_size_mb, 8),
+ DEFINE_PROP_UINT32("vgamem_mb", ISAVGAState, state.vram_size_mb, 16),
DEFINE_PROP_END_OF_LIST(),
};
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 30eedd62a3..14a794081e 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -201,6 +201,8 @@ static void pc_init1(MachineState *machine,
smbios_set_defaults("QEMU", "Standard PC (i440FX + PIIX, 1996)",
mc->name, pcmc->smbios_legacy_mode,
pcmc->smbios_uuid_encoded,
+ pcmc->smbios_stream_product,
+ pcmc->smbios_stream_version,
pcms->smbios_entry_point_type);
}
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 797ba347fd..dc0ba5f9e7 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -202,6 +202,8 @@ static void pc_q35_init(MachineState *machine)
smbios_set_defaults("QEMU", "Standard PC (Q35 + ICH9, 2009)",
mc->name, pcmc->smbios_legacy_mode,
pcmc->smbios_uuid_encoded,
+ pcmc->smbios_stream_product,
+ pcmc->smbios_stream_version,
pcms->smbios_entry_point_type);
MachineState *current_machine;
static char *machine_get_kernel(Object *obj, Error **errp)
diff --git a/hw/i386/fw_cfg.c b/hw/i386/fw_cfg.c
index d802d2787f..c7aa39a13e 100644
--- a/hw/i386/fw_cfg.c
+++ b/hw/i386/fw_cfg.c
@@ -64,7 +64,8 @@ void fw_cfg_build_smbios(PCMachineState *pcms, FWCfgState *fw_cfg,
if (pcmc->smbios_defaults) {
/* These values are guest ABI, do not change */
smbios_set_defaults("QEMU", mc->desc, mc->name,
- pcmc->smbios_uuid_encoded);
+ pcmc->smbios_uuid_encoded,
+ pcmc->smbios_stream_product, pcmc->smbios_stream_version);
}
/* tell smbios about cpuid version and features */
diff --git a/hw/net/rtl8139.c b/hw/net/rtl8139.c
index 5a5aaf868d..3d473d5869 100644
index 897c86ec41..2d0db43f49 100644
--- a/hw/net/rtl8139.c
+++ b/hw/net/rtl8139.c
@@ -3178,7 +3178,7 @@ static int rtl8139_pre_save(void *opaque)
@@ -3169,7 +3169,7 @@ static int rtl8139_pre_save(void *opaque)
static const VMStateDescription vmstate_rtl8139 = {
.name = "rtl8139",
@ -386,7 +419,7 @@ index 5a5aaf868d..3d473d5869 100644
.minimum_version_id = 3,
.post_load = rtl8139_post_load,
.pre_save = rtl8139_pre_save,
@@ -3259,7 +3259,9 @@ static const VMStateDescription vmstate_rtl8139 = {
@@ -3250,7 +3250,9 @@ static const VMStateDescription vmstate_rtl8139 = {
VMSTATE_UINT32(tally_counters.TxMCol, RTL8139State),
VMSTATE_UINT64(tally_counters.RxOkPhy, RTL8139State),
VMSTATE_UINT64(tally_counters.RxOkBrd, RTL8139State),
@ -397,20 +430,21 @@ index 5a5aaf868d..3d473d5869 100644
VMSTATE_UINT16(tally_counters.TxUndrn, RTL8139State),
diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c
index d2007e70fb..319eae9e9d 100644
index eed5787b15..68608a3403 100644
--- a/hw/smbios/smbios.c
+++ b/hw/smbios/smbios.c
@@ -58,6 +58,9 @@ static bool smbios_legacy = true;
static bool smbios_uuid_encoded = true;
/* end: legacy structures & constants for <= 2.0 machines */
@@ -39,6 +39,10 @@ size_t usr_blobs_len;
static unsigned usr_table_max;
static unsigned usr_table_cnt;
+/* Set to true for modern Windows 10 HardwareID-6 compat */
+static bool smbios_type2_required;
+
+
uint8_t *smbios_tables;
size_t smbios_tables_len;
@@ -670,7 +673,7 @@ static void smbios_build_type_1_table(void)
unsigned smbios_table_max;
@@ -629,7 +633,7 @@ static void smbios_build_type_1_table(void)
static void smbios_build_type_2_table(void)
{
@ -419,21 +453,17 @@ index d2007e70fb..319eae9e9d 100644
SMBIOS_TABLE_SET_STR(2, manufacturer_str, type2.manufacturer);
SMBIOS_TABLE_SET_STR(2, product_str, type2.product);
@@ -980,7 +983,10 @@ void smbios_set_cpuid(uint32_t version, uint32_t features)
@@ -1018,16 +1022,52 @@ void smbios_set_default_processor_family(uint16_t processor_family)
void smbios_set_defaults(const char *manufacturer, const char *product,
const char *version, bool legacy_mode,
- bool uuid_encoded, SmbiosEntryPointType ep_type)
const char *version,
- bool uuid_encoded)
+ bool uuid_encoded,
+ const char *stream_product,
+ const char *stream_version,
+ SmbiosEntryPointType ep_type)
+ const char *stream_version)
{
smbios_have_defaults = true;
smbios_legacy = legacy_mode;
@@ -1001,11 +1007,45 @@ void smbios_set_defaults(const char *manufacturer, const char *product,
g_free(smbios_entries);
}
smbios_uuid_encoded = uuid_encoded;
+ /*
+ * If @stream_product & @stream_version are non-NULL, then
@ -460,12 +490,12 @@ index d2007e70fb..319eae9e9d 100644
+ *
+ * We get 'System Manufacturer' and 'Baseboard Manufacturer'
+ */
SMBIOS_SET_DEFAULT(type1.manufacturer, manufacturer);
SMBIOS_SET_DEFAULT(type1.product, product);
SMBIOS_SET_DEFAULT(type1.version, version);
+ SMBIOS_SET_DEFAULT(type1.family, "Red Hat Enterprise Linux");
SMBIOS_SET_DEFAULT(smbios_type1.manufacturer, manufacturer);
SMBIOS_SET_DEFAULT(smbios_type1.product, product);
SMBIOS_SET_DEFAULT(smbios_type1.version, version);
+ SMBIOS_SET_DEFAULT(smbios_type1.family, "Red Hat Enterprise Linux");
+ if (stream_version != NULL) {
+ SMBIOS_SET_DEFAULT(type1.sku, stream_version);
+ SMBIOS_SET_DEFAULT(smbios_type1.sku, stream_version);
+ }
SMBIOS_SET_DEFAULT(type2.manufacturer, manufacturer);
- SMBIOS_SET_DEFAULT(type2.product, product);
@ -479,20 +509,20 @@ index d2007e70fb..319eae9e9d 100644
SMBIOS_SET_DEFAULT(type3.manufacturer, manufacturer);
SMBIOS_SET_DEFAULT(type3.version, version);
diff --git a/hw/timer/i8254_common.c b/hw/timer/i8254_common.c
index 050875b497..32935da46c 100644
index 28fdabc321..bad13ec224 100644
--- a/hw/timer/i8254_common.c
+++ b/hw/timer/i8254_common.c
@@ -231,7 +231,7 @@ static const VMStateDescription vmstate_pit_common = {
@@ -229,7 +229,7 @@ static const VMStateDescription vmstate_pit_common = {
.pre_save = pit_dispatch_pre_save,
.post_load = pit_dispatch_post_load,
.fields = (VMStateField[]) {
.fields = (const VMStateField[]) {
- VMSTATE_UINT32_V(channels[0].irq_disabled, PITCommonState, 3),
+ VMSTATE_UINT32(channels[0].irq_disabled, PITCommonState), /* qemu-kvm's v2 had 'flags' here */
VMSTATE_STRUCT_ARRAY(channels, PITCommonState, 3, 2,
vmstate_pit_channel, PITChannelState),
VMSTATE_INT64(channels[0].next_transition_time,
diff --git a/hw/usb/hcd-xhci-pci.c b/hw/usb/hcd-xhci-pci.c
index 643d4643e4..529bad9366 100644
index 4423983308..43b4b71fdf 100644
--- a/hw/usb/hcd-xhci-pci.c
+++ b/hw/usb/hcd-xhci-pci.c
@@ -104,6 +104,33 @@ static int xhci_pci_vmstate_post_load(void *opaque, int version_id)
@ -602,14 +632,38 @@ index 08f70ce97c..1be7527c1b 100644
} XHCIPciState;
#endif
diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index ffd119ebac..0e2be2219c 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -1694,8 +1694,9 @@ static Property virtio_mem_properties[] = {
#endif
DEFINE_PROP_BOOL(VIRTIO_MEM_EARLY_MIGRATION_PROP, VirtIOMEM,
early_migration, true),
+ /* RHEL: default-enable "dynamic-memslots" (jira RHEL-24045) */
DEFINE_PROP_BOOL(VIRTIO_MEM_DYNAMIC_MEMSLOTS_PROP, VirtIOMEM,
- dynamic_memslots, false),
+ dynamic_memslots, true),
DEFINE_PROP_END_OF_LIST(),
};
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 6fbbfd56c8..c5a965d27f 100644
index 8b8f6d5c00..0466f9d0f3 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -459,4 +459,35 @@ extern const size_t hw_compat_2_2_len;
@@ -512,4 +512,44 @@ extern const size_t hw_compat_2_2_len;
extern GlobalProperty hw_compat_2_1[];
extern const size_t hw_compat_2_1_len;
+extern GlobalProperty hw_compat_rhel_9_4[];
+extern const size_t hw_compat_rhel_9_4_len;
+
+extern GlobalProperty hw_compat_rhel_9_3[];
+extern const size_t hw_compat_rhel_9_3_len;
+
+extern GlobalProperty hw_compat_rhel_9_2[];
+extern const size_t hw_compat_rhel_9_2_len;
+
+extern GlobalProperty hw_compat_rhel_9_1[];
+extern const size_t hw_compat_rhel_9_1_len;
+
@ -643,29 +697,28 @@ index 6fbbfd56c8..c5a965d27f 100644
+extern const char *rhel_old_machine_deprecation;
#endif
diff --git a/include/hw/firmware/smbios.h b/include/hw/firmware/smbios.h
index 7f3259a630..d24b3ccd32 100644
index 8d3fb2fb3b..d9d6d7a169 100644
--- a/include/hw/firmware/smbios.h
+++ b/include/hw/firmware/smbios.h
@@ -294,7 +294,10 @@ void smbios_entry_add(QemuOpts *opts, Error **errp);
@@ -332,7 +332,9 @@ void smbios_entry_add(QemuOpts *opts, Error **errp);
void smbios_set_cpuid(uint32_t version, uint32_t features);
void smbios_set_defaults(const char *manufacturer, const char *product,
const char *version, bool legacy_mode,
- bool uuid_encoded, SmbiosEntryPointType ep_type);
const char *version,
- bool uuid_encoded);
+ bool uuid_encoded,
+ const char *stream_product,
+ const char *stream_version,
+ SmbiosEntryPointType ep_type);
uint8_t *smbios_get_table_legacy(MachineState *ms, size_t *length);
+ const char *stream_version);
void smbios_set_default_processor_family(uint16_t processor_family);
uint8_t *smbios_get_table_legacy(size_t *length, Error **errp);
void smbios_get_tables(MachineState *ms,
const struct smbios_phys_mem_area *mem_array,
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 8206d5405a..908a275736 100644
index 27a68071d7..ebd8f973f2 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -111,6 +111,9 @@ struct PCMachineClass {
bool smbios_defaults;
@@ -112,6 +112,9 @@ struct PCMachineClass {
bool smbios_legacy_mode;
bool smbios_uuid_encoded;
SmbiosEntryPointType default_smbios_ep_type;
+ /* New fields needed for Windows HardwareID-6 matching */
+ const char *smbios_stream_product;
+ const char *smbios_stream_version;
@ -673,5 +726,5 @@ index 8206d5405a..908a275736 100644
/* RAM / address space compat: */
bool gigabyte_align;
--
2.39.1
2.39.3

@ -1,4 +1,4 @@
From ec6468b65a3af0e2b84575c9f965f61916d0d8ea Mon Sep 17 00:00:00 2001
From cf398296f3fcee185a00f23de5deae57c97d648e Mon Sep 17 00:00:00 2001
From: Miroslav Rezanina <mrezanin@redhat.com>
Date: Fri, 19 Oct 2018 12:53:31 +0200
Subject: Add aarch64 machine types
@ -17,18 +17,22 @@ Rebase notes (7.0.0):
- Added dtb-kaslr-seed option
- Set no_tcg_lpa2 to true
Rebase notes (7.1.0 rc0):
Rebase notes (7.1.0):
- replace dtb_kaslr_seed by dtb_randomness
Rebase notes (7.1.0 rc3):
- Updated dtb_randomness comment
Rebase notes (7.2.0 rc0):
Rebase notes (7.2.0):
- Disabled cortex-a35
Rebase notes (8.0.0-rc1):
Rebase notes (8.0.0):
- Moved changed code from target/arm/helper.c to target/arm/arm-qmp-cmds.c
Rebase notes (8.1.0):
- Added setting default_nic
Rebase notes (9.0.0 rc0):
- call arm_virt_compat_set on rhel type class_init
Merged patches (6.2.0):
- 9a3d4fde0e hw/arm/virt: Remove 9.0 machine type
- f7d04d6695 hw: arm: virt: Add hw_compat_rhel_8_5 to 8.5 machine type
@ -46,52 +50,84 @@ Merged patches (7.0.0):
- f79b31bdef hw/arm/virt: Remove the dtb-kaslr-seed machine option
- b6fca85f4a hw/arm/virt: Fix missing initialization in instance/class_init()
Merged patches (7.1.0 rc0):
Merged patches (7.1.0):
- ac97dd4f9f RHEL-only: AArch64: Drop unsupported CPU types
- e9c0a70664 target/arm: deprecate named CPU models
Merged patches (7.2.0 rc0):
Merged patches (7.2.0):
- 0be2889fa2 Introduce upstream 7.0 compat changes (only applicable parts)
Merged patches (8.0.0-rc1):
Merged patches (8.0.0):
- c1a21266d8 redhat: aarch64: add rhel9.2.0 virt machine type
- d97cd7c513 redhat: fix virt-rhel9.2.0 compat props
Merged patches (8.1.0):
- bd5d81d286 Add RHEL 9.2.0 compat structure (arm part)
- c07f666086 hw/arm/virt: Validate cluster and NUMA node boundary for RHEL machines
Merged patches (8.2.0):
- 4ee284aca9 Add machine types compat bits. (partial)
Merged patches (9.0.0 rc0):
- 117068376a hw/arm/virt: Fix compats
- 8bcccfabc4 hw/arm/virt: Add properties to disable high memory regions
- 0005a8b93a hw/arm/virt: deprecate virt-rhel9.{0,2}.0 machine types
---
hw/arm/virt.c | 251 ++++++++++++++++++++++++++++++++-
include/hw/arm/virt.h | 8 ++
target/arm/arm-qmp-cmds.c | 2 +
target/arm/cpu-qom.h | 1 +
target/arm/cpu.c | 5 +
target/arm/cpu.h | 2 +
target/arm/cpu64.c | 16 ++-
target/arm/cpu_tcg.c | 12 +-
tests/qtest/arm-cpu-features.c | 6 +
9 files changed, 289 insertions(+), 14 deletions(-)
hw/arm/virt.c | 299 +++++++++++++++++++++++++++++++++++++++++-
include/hw/arm/virt.h | 8 ++
2 files changed, 306 insertions(+), 1 deletion(-)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 4a6e89c7bc..1ae1654be5 100644
index 36e9b4b4e9..22bc345137 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -81,6 +81,7 @@
#include "hw/char/pl011.h"
#include "qemu/guest-random.h"
@@ -101,6 +101,7 @@ static void arm_virt_compat_set(MachineClass *mc)
arm_virt_compat_len);
}
+#if 0 /* Disabled for Red Hat Enterprise Linux */
#define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
void *data) \
@@ -107,7 +108,48 @@
@@ -128,7 +129,63 @@ static void arm_virt_compat_set(MachineClass *mc)
DEFINE_VIRT_MACHINE_LATEST(major, minor, true)
#define DEFINE_VIRT_MACHINE(major, minor) \
DEFINE_VIRT_MACHINE_LATEST(major, minor, false)
-
+#endif /* disabled for RHEL */
+
+/*
+ * This variable is for changes to properties that are RHEL specific,
+ * different to the current upstream and to be applied to the latest
+ * machine type. They may be overridden by older machine compats.
+ *
+ * virtio-net-pci variant romfiles are not needed because edk2 does
+ * fully support the pxe boot. Besides virtio romfiles are not shipped
+ * on rhel/aarch64.
+ */
+GlobalProperty arm_rhel_compat[] = {
+ {"virtio-net-pci", "romfile", "" },
+ {"virtio-net-pci-transitional", "romfile", "" },
+ {"virtio-net-pci-non-transitional", "romfile", "" },
+};
+const size_t arm_rhel_compat_len = G_N_ELEMENTS(arm_rhel_compat);
+/*
+ * This cannot be called from the rhel_virt_class_init() because
+ * TYPE_RHEL_MACHINE is abstract and mc->compat_props g_ptr_array_new()
+ * only is called on virt-rhelm.n.s non abstract class init.
+ */
+static void arm_rhel_compat_set(MachineClass *mc)
+{
+ compat_props_add(mc->compat_props, arm_rhel_compat,
+ arm_rhel_compat_len);
+}
+
+#define DEFINE_RHEL_MACHINE_LATEST(m, n, s, latest) \
+ static void rhel##m##n##s##_virt_class_init(ObjectClass *oc, \
+ void *data) \
+ { \
+ MachineClass *mc = MACHINE_CLASS(oc); \
+ arm_rhel_compat_set(mc); \
+ rhel##m##n##s##_virt_options(mc); \
+ mc->desc = "RHEL " # m "." # n "." # s " ARM Virtual Machine"; \
+ if (latest) { \
@ -114,44 +150,10 @@ index 4a6e89c7bc..1ae1654be5 100644
+ DEFINE_RHEL_MACHINE_LATEST(major, minor, subminor, true)
+#define DEFINE_RHEL_MACHINE(major, minor, subminor) \
+ DEFINE_RHEL_MACHINE_LATEST(major, minor, subminor, false)
+
+/* This variable is for changes to properties that are RHEL specific,
+ * different to the current upstream and to be applied to the latest
+ * machine type.
+ */
+GlobalProperty arm_rhel_compat[] = {
+ {
+ .driver = "virtio-net-pci",
+ .property = "romfile",
+ .value = "",
+ },
+};
+const size_t arm_rhel_compat_len = G_N_ELEMENTS(arm_rhel_compat);
/* Number of external interrupt lines to configure the GIC with */
#define NUM_IRQS 256
@@ -204,16 +246,20 @@ static const int a15irqmap[] = {
};
static const char *valid_cpus[] = {
+#if 0 /* Disabled for Red Hat Enterprise Linux */
ARM_CPU_TYPE_NAME("cortex-a7"),
ARM_CPU_TYPE_NAME("cortex-a15"),
ARM_CPU_TYPE_NAME("cortex-a35"),
ARM_CPU_TYPE_NAME("cortex-a53"),
ARM_CPU_TYPE_NAME("cortex-a55"),
+#endif /* disabled for RHEL */
ARM_CPU_TYPE_NAME("cortex-a57"),
+#if 0 /* Disabled for Red Hat Enterprise Linux */
ARM_CPU_TYPE_NAME("cortex-a72"),
ARM_CPU_TYPE_NAME("cortex-a76"),
ARM_CPU_TYPE_NAME("a64fx"),
ARM_CPU_TYPE_NAME("neoverse-n1"),
+#endif /* disabled for RHEL */
ARM_CPU_TYPE_NAME("host"),
ARM_CPU_TYPE_NAME("max"),
};
@@ -2339,6 +2385,7 @@ static void machvirt_init(MachineState *machine)
@@ -2355,6 +2412,7 @@ static void machvirt_init(MachineState *machine)
qemu_add_machine_init_done_notifier(&vms->machine_done);
}
@ -159,7 +161,7 @@ index 4a6e89c7bc..1ae1654be5 100644
static bool virt_get_secure(Object *obj, Error **errp)
{
VirtMachineState *vms = VIRT_MACHINE(obj);
@@ -2366,6 +2413,7 @@ static void virt_set_virt(Object *obj, bool value, Error **errp)
@@ -2382,6 +2440,7 @@ static void virt_set_virt(Object *obj, bool value, Error **errp)
vms->virt = value;
}
@ -167,25 +169,31 @@ index 4a6e89c7bc..1ae1654be5 100644
static bool virt_get_highmem(Object *obj, Error **errp)
{
@@ -2380,7 +2428,7 @@ static void virt_set_highmem(Object *obj, bool value, Error **errp)
@@ -2397,6 +2456,7 @@ static void virt_set_highmem(Object *obj, bool value, Error **errp)
vms->highmem = value;
}
-
+#if 0 /* Disabled for Red Hat Enterprise Linux */
static bool virt_get_compact_highmem(Object *obj, Error **errp)
{
VirtMachineState *vms = VIRT_MACHINE(obj);
@@ -2436,7 +2484,7 @@ static void virt_set_highmem_mmio(Object *obj, bool value, Error **errp)
@@ -2410,6 +2470,7 @@ static void virt_set_compact_highmem(Object *obj, bool value, Error **errp)
vms->highmem_mmio = value;
vms->highmem_compact = value;
}
-
+#endif /* disabled for RHEL */
static bool virt_get_highmem_redists(Object *obj, Error **errp)
{
@@ -2453,7 +2514,6 @@ static void virt_set_highmem_mmio(Object *obj, bool value, Error **errp)
vms->highmem_mmio = value;
}
-
static bool virt_get_its(Object *obj, Error **errp)
{
@@ -2452,6 +2500,7 @@ static void virt_set_its(Object *obj, bool value, Error **errp)
VirtMachineState *vms = VIRT_MACHINE(obj);
@@ -2468,6 +2528,7 @@ static void virt_set_its(Object *obj, bool value, Error **errp)
vms->its = value;
}
@ -193,7 +201,7 @@ index 4a6e89c7bc..1ae1654be5 100644
static bool virt_get_dtb_randomness(Object *obj, Error **errp)
{
VirtMachineState *vms = VIRT_MACHINE(obj);
@@ -2465,6 +2514,7 @@ static void virt_set_dtb_randomness(Object *obj, bool value, Error **errp)
@@ -2481,6 +2542,7 @@ static void virt_set_dtb_randomness(Object *obj, bool value, Error **errp)
vms->dtb_randomness = value;
}
@ -201,7 +209,7 @@ index 4a6e89c7bc..1ae1654be5 100644
static char *virt_get_oem_id(Object *obj, Error **errp)
{
@@ -2548,6 +2598,7 @@ static void virt_set_ras(Object *obj, bool value, Error **errp)
@@ -2564,6 +2626,7 @@ static void virt_set_ras(Object *obj, bool value, Error **errp)
vms->ras = value;
}
@ -209,7 +217,7 @@ index 4a6e89c7bc..1ae1654be5 100644
static bool virt_get_mte(Object *obj, Error **errp)
{
VirtMachineState *vms = VIRT_MACHINE(obj);
@@ -2561,6 +2612,7 @@ static void virt_set_mte(Object *obj, bool value, Error **errp)
@@ -2577,6 +2640,7 @@ static void virt_set_mte(Object *obj, bool value, Error **errp)
vms->mte = value;
}
@ -217,7 +225,7 @@ index 4a6e89c7bc..1ae1654be5 100644
static char *virt_get_gic_version(Object *obj, Error **errp)
{
@@ -2988,6 +3040,7 @@ static int virt_kvm_type(MachineState *ms, const char *type_str)
@@ -2949,6 +3013,7 @@ static int virt_kvm_type(MachineState *ms, const char *type_str)
return fixed_ipa ? 0 : requested_pa_size;
}
@ -225,7 +233,7 @@ index 4a6e89c7bc..1ae1654be5 100644
static void virt_machine_class_init(ObjectClass *oc, void *data)
{
MachineClass *mc = MACHINE_CLASS(oc);
@@ -3441,3 +3494,195 @@ static void virt_machine_2_6_options(MachineClass *mc)
@@ -3463,3 +3528,235 @@ static void virt_machine_2_6_options(MachineClass *mc)
vmc->no_pmu = true;
}
DEFINE_VIRT_MACHINE(2, 6)
@ -235,6 +243,7 @@ index 4a6e89c7bc..1ae1654be5 100644
+{
+ MachineClass *mc = MACHINE_CLASS(oc);
+ HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
+ arm_virt_compat_set(mc);
+
+ mc->family = "virt-rhel-Z";
+ mc->init = machvirt_init;
@ -263,7 +272,10 @@ index 4a6e89c7bc..1ae1654be5 100644
+ mc->smp_props.clusters_supported = true;
+ mc->auto_enable_numa_with_memhp = true;
+ mc->auto_enable_numa_with_memdev = true;
+ /* platform instead of architectural choice */
+ mc->cpu_cluster_has_numa_boundary = true;
+ mc->default_ram_id = "mach-virt.ram";
+ mc->default_nic = "virtio-net-pci";
+
+ object_class_property_add(oc, "acpi", "OnOffAuto",
+ virt_get_acpi, virt_set_acpi,
@ -277,6 +289,28 @@ index 4a6e89c7bc..1ae1654be5 100644
+ "Set on/off to enable/disable using "
+ "physical address space above 32 bits");
+
+ object_class_property_add_bool(oc, "highmem-redists",
+ virt_get_highmem_redists,
+ virt_set_highmem_redists);
+ object_class_property_set_description(oc, "highmem-redists",
+ "Set on/off to enable/disable high "
+ "memory region for GICv3 or GICv4 "
+ "redistributor");
+
+ object_class_property_add_bool(oc, "highmem-ecam",
+ virt_get_highmem_ecam,
+ virt_set_highmem_ecam);
+ object_class_property_set_description(oc, "highmem-ecam",
+ "Set on/off to enable/disable high "
+ "memory region for PCI ECAM");
+
+ object_class_property_add_bool(oc, "highmem-mmio",
+ virt_get_highmem_mmio,
+ virt_set_highmem_mmio);
+ object_class_property_set_description(oc, "highmem-mmio",
+ "Set on/off to enable/disable high "
+ "memory region for PCI MMIO");
+
+ object_class_property_add_str(oc, "gic-version", virt_get_gic_version,
+ virt_set_gic_version);
+ object_class_property_set_description(oc, "gic-version",
@ -401,11 +435,24 @@ index 4a6e89c7bc..1ae1654be5 100644
+}
+type_init(rhel_machine_init);
+
+static void rhel940_virt_options(MachineClass *mc)
+{
+}
+DEFINE_RHEL_MACHINE_AS_LATEST(9, 4, 0)
+
+static void rhel920_virt_options(MachineClass *mc)
+{
+ compat_props_add(mc->compat_props, arm_rhel_compat, arm_rhel_compat_len);
+ rhel940_virt_options(mc);
+
+ compat_props_add(mc->compat_props, hw_compat_rhel_9_4, hw_compat_rhel_9_4_len);
+ compat_props_add(mc->compat_props, hw_compat_rhel_9_3, hw_compat_rhel_9_3_len);
+ compat_props_add(mc->compat_props, hw_compat_rhel_9_2, hw_compat_rhel_9_2_len);
+
+ /* RHEL 9.4 is the first supported release */
+ mc->deprecation_reason =
+ "machine types for versions prior to 9.4 are deprecated";
+}
+DEFINE_RHEL_MACHINE_AS_LATEST(9, 2, 0)
+DEFINE_RHEL_MACHINE(9, 2, 0)
+
+static void rhel900_virt_options(MachineClass *mc)
+{
@ -414,6 +461,7 @@ index 4a6e89c7bc..1ae1654be5 100644
+ rhel920_virt_options(mc);
+
+ compat_props_add(mc->compat_props, hw_compat_rhel_9_1, hw_compat_rhel_9_1_len);
+ compat_props_add(mc->compat_props, hw_compat_rhel_9_0, hw_compat_rhel_9_0_len);
+
+ /* Disable FEAT_LPA2 since old kernels (<= v5.12) don't boot with that feature */
+ vmc->no_tcg_lpa2 = true;
@ -422,10 +470,10 @@ index 4a6e89c7bc..1ae1654be5 100644
+}
+DEFINE_RHEL_MACHINE(9, 0, 0)
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index e1ddbea96b..81c2363a40 100644
index bb486d36b1..237fc77bda 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -187,9 +187,17 @@ struct VirtMachineState {
@@ -179,9 +179,17 @@ struct VirtMachineState {
#define VIRT_ECAM_ID(high) (high ? VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
@ -443,270 +491,6 @@ index e1ddbea96b..81c2363a40 100644
void virt_acpi_setup(VirtMachineState *vms);
bool virt_is_acpi_enabled(VirtMachineState *vms);
diff --git a/target/arm/arm-qmp-cmds.c b/target/arm/arm-qmp-cmds.c
index c8fa524002..3aa089abf3 100644
--- a/target/arm/arm-qmp-cmds.c
+++ b/target/arm/arm-qmp-cmds.c
@@ -231,6 +231,7 @@ CpuModelExpansionInfo *qmp_query_cpu_model_expansion(CpuModelExpansionType type,
static void arm_cpu_add_definition(gpointer data, gpointer user_data)
{
ObjectClass *oc = data;
+ CPUClass *cc = CPU_CLASS(oc);
CpuDefinitionInfoList **cpu_list = user_data;
CpuDefinitionInfo *info;
const char *typename;
@@ -240,6 +241,7 @@ static void arm_cpu_add_definition(gpointer data, gpointer user_data)
info->name = g_strndup(typename,
strlen(typename) - strlen("-" TYPE_ARM_CPU));
info->q_typename = g_strdup(typename);
+ info->deprecated = !!cc->deprecation_note;
QAPI_LIST_PREPEND(*cpu_list, info);
}
diff --git a/target/arm/cpu-qom.h b/target/arm/cpu-qom.h
index 514c22ced9..f789173451 100644
--- a/target/arm/cpu-qom.h
+++ b/target/arm/cpu-qom.h
@@ -35,6 +35,7 @@ typedef struct ARMCPUInfo {
const char *name;
void (*initfn)(Object *obj);
void (*class_init)(ObjectClass *oc, void *data);
+ const char *deprecation_note;
} ARMCPUInfo;
void arm_cpu_register(const ARMCPUInfo *info);
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 5182ed0c91..6740a8b940 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -2290,8 +2290,13 @@ static void arm_cpu_instance_init(Object *obj)
static void cpu_register_class_init(ObjectClass *oc, void *data)
{
ARMCPUClass *acc = ARM_CPU_CLASS(oc);
+ CPUClass *cc = CPU_CLASS(oc);
acc->info = data;
+
+ if (acc->info->deprecation_note) {
+ cc->deprecation_note = acc->info->deprecation_note;
+ }
}
void arm_cpu_register(const ARMCPUInfo *info)
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index c097cae988..829d4a2328 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -34,6 +34,8 @@
#define KVM_HAVE_MCE_INJECTION 1
#endif
+#define RHEL_CPU_DEPRECATION "use 'host' / 'max'"
+
#define EXCP_UDEF 1 /* undefined instruction */
#define EXCP_SWI 2 /* software interrupt */
#define EXCP_PREFETCH_ABORT 3
diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 0fb07cc7b6..47459627fb 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -31,6 +31,7 @@
#include "hw/qdev-properties.h"
#include "internals.h"
+#if 0 /* Disabled for Red Hat Enterprise Linux */
static void aarch64_a35_initfn(Object *obj)
{
ARMCPU *cpu = ARM_CPU(obj);
@@ -110,6 +111,7 @@ static void aarch64_a35_initfn(Object *obj)
/* These values are the same with A53/A57/A72. */
define_cortex_a72_a57_a53_cp_reginfo(cpu);
}
+#endif /* disabled for RHEL */
void arm_cpu_sve_finalize(ARMCPU *cpu, Error **errp)
{
@@ -730,6 +732,7 @@ static void aarch64_a57_initfn(Object *obj)
define_cortex_a72_a57_a53_cp_reginfo(cpu);
}
+#if 0 /* Disabled for Red Hat Enterprise Linux */
static void aarch64_a53_initfn(Object *obj)
{
ARMCPU *cpu = ARM_CPU(obj);
@@ -1164,6 +1167,7 @@ static void aarch64_neoverse_n1_initfn(Object *obj)
define_neoverse_n1_cp_reginfo(cpu);
}
+#endif /* disabled for RHEL */
static void aarch64_host_initfn(Object *obj)
{
@@ -1373,14 +1377,19 @@ static void aarch64_max_initfn(Object *obj)
}
static const ARMCPUInfo aarch64_cpus[] = {
+#if 0 /* Disabled for Red Hat Enterprise Linux */
{ .name = "cortex-a35", .initfn = aarch64_a35_initfn },
- { .name = "cortex-a57", .initfn = aarch64_a57_initfn },
+#endif /* disabled for RHEL */
+ { .name = "cortex-a57", .initfn = aarch64_a57_initfn,
+ .deprecation_note = RHEL_CPU_DEPRECATION },
+#if 0 /* Disabled for Red Hat Enterprise Linux */
{ .name = "cortex-a53", .initfn = aarch64_a53_initfn },
{ .name = "cortex-a55", .initfn = aarch64_a55_initfn },
{ .name = "cortex-a72", .initfn = aarch64_a72_initfn },
{ .name = "cortex-a76", .initfn = aarch64_a76_initfn },
{ .name = "a64fx", .initfn = aarch64_a64fx_initfn },
{ .name = "neoverse-n1", .initfn = aarch64_neoverse_n1_initfn },
+#endif /* disabled for RHEL */
{ .name = "max", .initfn = aarch64_max_initfn },
#if defined(CONFIG_KVM) || defined(CONFIG_HVF)
{ .name = "host", .initfn = aarch64_host_initfn },
@@ -1452,8 +1461,13 @@ static void aarch64_cpu_instance_init(Object *obj)
static void cpu_register_class_init(ObjectClass *oc, void *data)
{
ARMCPUClass *acc = ARM_CPU_CLASS(oc);
+ CPUClass *cc = CPU_CLASS(oc);
acc->info = data;
+
+ if (acc->info->deprecation_note) {
+ cc->deprecation_note = acc->info->deprecation_note;
+ }
}
void aarch64_cpu_register(const ARMCPUInfo *info)
diff --git a/target/arm/cpu_tcg.c b/target/arm/cpu_tcg.c
index c154a4dcf2..f29425b656 100644
--- a/target/arm/cpu_tcg.c
+++ b/target/arm/cpu_tcg.c
@@ -152,10 +152,10 @@ void define_cortex_a72_a57_a53_cp_reginfo(ARMCPU *cpu)
}
#endif /* !CONFIG_USER_ONLY */
+#if 0 /* Disabled for Red Hat Enterprise Linux */
/* CPU models. These are not needed for the AArch64 linux-user build. */
#if !defined(CONFIG_USER_ONLY) || !defined(TARGET_AARCH64)
-#if 0 /* Disabled for Red Hat Enterprise Linux */
#if !defined(CONFIG_USER_ONLY) && defined(CONFIG_TCG)
static bool arm_v7m_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
{
@@ -509,7 +509,6 @@ static void cortex_a9_initfn(Object *obj)
cpu->isar.reset_pmcr_el0 = 0x41093000;
define_arm_cp_regs(cpu, cortexa9_cp_reginfo);
}
-#endif /* disabled for RHEL */
#ifndef CONFIG_USER_ONLY
static uint64_t a15_l2ctlr_read(CPUARMState *env, const ARMCPRegInfo *ri)
@@ -534,7 +533,6 @@ static const ARMCPRegInfo cortexa15_cp_reginfo[] = {
.access = PL1_RW, .type = ARM_CP_CONST, .resetvalue = 0 },
};
-#if 0 /* Disabled for Red Hat Enterprise Linux */
static void cortex_a7_initfn(Object *obj)
{
ARMCPU *cpu = ARM_CPU(obj);
@@ -583,7 +581,6 @@ static void cortex_a7_initfn(Object *obj)
cpu->isar.reset_pmcr_el0 = 0x41072000;
define_arm_cp_regs(cpu, cortexa15_cp_reginfo); /* Same as A15 */
}
-#endif /* disabled for RHEL */
static void cortex_a15_initfn(Object *obj)
{
@@ -632,7 +629,6 @@ static void cortex_a15_initfn(Object *obj)
define_arm_cp_regs(cpu, cortexa15_cp_reginfo);
}
-#if 0 /* Disabled for Red Hat Enterprise Linux */
static void cortex_m0_initfn(Object *obj)
{
ARMCPU *cpu = ARM_CPU(obj);
@@ -1115,7 +1111,6 @@ static void arm_v7m_class_init(ObjectClass *oc, void *data)
cc->gdb_core_xml_file = "arm-m-profile.xml";
}
-#endif /* disabled for RHEL */
#ifndef TARGET_AARCH64
/*
@@ -1183,7 +1178,6 @@ static void arm_max_initfn(Object *obj)
#endif /* !TARGET_AARCH64 */
static const ARMCPUInfo arm_tcg_cpus[] = {
-#if 0 /* Disabled for Red Hat Enterprise Linux */
{ .name = "arm926", .initfn = arm926_initfn },
{ .name = "arm946", .initfn = arm946_initfn },
{ .name = "arm1026", .initfn = arm1026_initfn },
@@ -1199,9 +1193,7 @@ static const ARMCPUInfo arm_tcg_cpus[] = {
{ .name = "cortex-a7", .initfn = cortex_a7_initfn },
{ .name = "cortex-a8", .initfn = cortex_a8_initfn },
{ .name = "cortex-a9", .initfn = cortex_a9_initfn },
-#endif /* disabled for RHEL */
{ .name = "cortex-a15", .initfn = cortex_a15_initfn },
-#if 0 /* Disabled for Red Hat Enterprise Linux */
{ .name = "cortex-m0", .initfn = cortex_m0_initfn,
.class_init = arm_v7m_class_init },
{ .name = "cortex-m3", .initfn = cortex_m3_initfn,
@@ -1233,7 +1225,6 @@ static const ARMCPUInfo arm_tcg_cpus[] = {
{ .name = "pxa270-b1", .initfn = pxa270b1_initfn },
{ .name = "pxa270-c0", .initfn = pxa270c0_initfn },
{ .name = "pxa270-c5", .initfn = pxa270c5_initfn },
-#endif /* disabled for RHEL */
#ifndef TARGET_AARCH64
{ .name = "max", .initfn = arm_max_initfn },
#endif
@@ -1261,3 +1252,4 @@ static void arm_tcg_cpu_register_types(void)
type_init(arm_tcg_cpu_register_types)
#endif /* !CONFIG_USER_ONLY || !TARGET_AARCH64 */
+#endif /* disabled for RHEL */
diff --git a/tests/qtest/arm-cpu-features.c b/tests/qtest/arm-cpu-features.c
index 1cb08138ad..834497dfec 100644
--- a/tests/qtest/arm-cpu-features.c
+++ b/tests/qtest/arm-cpu-features.c
@@ -441,8 +441,10 @@ static void test_query_cpu_model_expansion(const void *data)
assert_error(qts, "host", "The CPU type 'host' requires KVM", NULL);
/* Test expected feature presence/absence for some cpu types */
+#if 0 /* Disabled for Red Hat Enterprise Linux */
assert_has_feature_enabled(qts, "cortex-a15", "pmu");
assert_has_not_feature(qts, "cortex-a15", "aarch64");
+#endif /* disabled for RHEL */
/* Enabling and disabling pmu should always work. */
assert_has_feature_enabled(qts, "max", "pmu");
@@ -459,6 +461,7 @@ static void test_query_cpu_model_expansion(const void *data)
assert_has_feature_enabled(qts, "cortex-a57", "pmu");
assert_has_feature_enabled(qts, "cortex-a57", "aarch64");
+#if 0 /* Disabled for Red Hat Enterprise Linux */
assert_has_feature_enabled(qts, "a64fx", "pmu");
assert_has_feature_enabled(qts, "a64fx", "aarch64");
/*
@@ -471,6 +474,7 @@ static void test_query_cpu_model_expansion(const void *data)
"{ 'sve384': true }");
assert_error(qts, "a64fx", "cannot enable sve640",
"{ 'sve640': true }");
+#endif /* disabled for RHEL */
sve_tests_default(qts, "max");
pauth_tests_default(qts, "max");
@@ -506,9 +510,11 @@ static void test_query_cpu_model_expansion_kvm(const void *data)
QDict *resp;
char *error;
+#if 0 /* Disabled for Red Hat Enterprise Linux */
assert_error(qts, "cortex-a15",
"We cannot guarantee the CPU type 'cortex-a15' works "
"with KVM on this host", NULL);
+#endif /* disabled for RHEL */
assert_has_feature_enabled(qts, "host", "aarch64");
--
2.39.1
2.39.3

@ -1,4 +1,4 @@
From 401d0ebf1ee959fd944df6b5b4ae9c51c36d1244 Mon Sep 17 00:00:00 2001
From fb905dbe5b51ed899062ef99a2dd7f238d3e3384 Mon Sep 17 00:00:00 2001
From: Miroslav Rezanina <mrezanin@redhat.com>
Date: Fri, 19 Oct 2018 13:27:13 +0200
Subject: Add ppc64 machine types
@ -20,7 +20,7 @@ Merged patches (6.1.0):
- af69d1ca6e Remove RHEL 7.4.0 machine types (only ppc64 changes)
- 8f7a74ab78 Remove RHEL 7.5.0 machine types (only ppc64 changes)
Merged patches (7.1.0 rc0):
Merged patches (7.1.0):
- baa6790171 target/ppc/cpu-models: Fix ppc_cpu_aliases list for RHEL
---
hw/ppc/spapr.c | 243 ++++++++++++++++++++++++++++++++++++++++
@ -34,20 +34,20 @@ Merged patches (7.1.0 rc0):
8 files changed, 314 insertions(+), 1 deletion(-)
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 4921198b9d..e24b3e22e3 100644
index e9bc97fee0..a258d81846 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1634,6 +1634,9 @@ static void spapr_machine_reset(MachineState *machine, ShutdownCause reason)
@@ -1718,6 +1718,9 @@ static void spapr_machine_reset(MachineState *machine, ShutdownCause reason)
pef_kvm_reset(machine->cgs, &error_fatal);
spapr_caps_apply(spapr);
spapr_nested_reset(spapr);
+ if (spapr->svm_allowed) {
+ kvmppc_svm_allow(&error_fatal);
+ }
first_ppc_cpu = POWERPC_CPU(first_cpu);
if (kvm_enabled() && kvmppc_has_cap_mmu_radix() &&
@@ -3348,6 +3351,20 @@ static void spapr_set_host_serial(Object *obj, const char *value, Error **errp)
@@ -3421,6 +3424,20 @@ static void spapr_set_host_serial(Object *obj, const char *value, Error **errp)
spapr->host_serial = g_strdup(value);
}
@ -68,7 +68,7 @@ index 4921198b9d..e24b3e22e3 100644
static void spapr_instance_init(Object *obj)
{
SpaprMachineState *spapr = SPAPR_MACHINE(obj);
@@ -3426,6 +3443,12 @@ static void spapr_instance_init(Object *obj)
@@ -3499,6 +3516,12 @@ static void spapr_instance_init(Object *obj)
spapr_get_host_serial, spapr_set_host_serial);
object_property_set_description(obj, "host-serial",
"Host serial number to advertise in guest device tree");
@ -81,7 +81,7 @@ index 4921198b9d..e24b3e22e3 100644
}
static void spapr_machine_finalizefn(Object *obj)
@@ -4683,6 +4706,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
@@ -4754,6 +4777,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
vmc->client_architecture_support = spapr_vof_client_architecture_support;
vmc->quiesce = spapr_vof_quiesce;
vmc->setprop = spapr_vof_setprop;
@ -89,15 +89,15 @@ index 4921198b9d..e24b3e22e3 100644
}
static const TypeInfo spapr_machine_info = {
@@ -4734,6 +4758,7 @@ static void spapr_machine_latest_class_options(MachineClass *mc)
@@ -4805,6 +4829,7 @@ static void spapr_machine_latest_class_options(MachineClass *mc)
} \
type_init(spapr_machine_register_##suffix)
+#if 0 /* Disabled for Red Hat Enterprise Linux */
/*
* pseries-8.0
* pseries-9.0
*/
@@ -4894,6 +4919,7 @@ static void spapr_machine_4_1_class_options(MachineClass *mc)
@@ -4998,6 +5023,7 @@ static void spapr_machine_4_1_class_options(MachineClass *mc)
}
DEFINE_SPAPR_MACHINE(4_1, "4.1", false);
@ -105,8 +105,8 @@ index 4921198b9d..e24b3e22e3 100644
/*
* pseries-4.0
@@ -4913,6 +4939,8 @@ static bool phb_placement_4_0(SpaprMachineState *spapr, uint32_t index,
*nv2atsd = 0;
@@ -5013,6 +5039,8 @@ static bool phb_placement_4_0(SpaprMachineState *spapr, uint32_t index,
}
return true;
}
+
@ -114,7 +114,7 @@ index 4921198b9d..e24b3e22e3 100644
static void spapr_machine_4_0_class_options(MachineClass *mc)
{
SpaprMachineClass *smc = SPAPR_MACHINE_CLASS(mc);
@@ -5240,6 +5268,221 @@ static void spapr_machine_2_1_class_options(MachineClass *mc)
@@ -5338,6 +5366,221 @@ static void spapr_machine_2_1_class_options(MachineClass *mc)
compat_props_add(mc->compat_props, hw_compat_2_1, hw_compat_2_1_len);
}
DEFINE_SPAPR_MACHINE(2_1, "2.1", false);
@ -337,7 +337,7 @@ index 4921198b9d..e24b3e22e3 100644
static void spapr_machine_register_types(void)
{
diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index fcb5dfe792..ab8fb5bf62 100644
index 3b0a47a28c..375e0c8e45 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -25,6 +25,7 @@
@ -348,7 +348,7 @@ index fcb5dfe792..ab8fb5bf62 100644
static void spapr_reset_vcpu(PowerPCCPU *cpu)
{
@@ -259,6 +260,7 @@ static bool spapr_realize_vcpu(PowerPCCPU *cpu, SpaprMachineState *spapr,
@@ -264,6 +265,7 @@ static bool spapr_realize_vcpu(PowerPCCPU *cpu, SpaprMachineState *spapr,
{
CPUPPCState *env = &cpu->env;
CPUState *cs = CPU(cpu);
@ -356,7 +356,7 @@ index fcb5dfe792..ab8fb5bf62 100644
if (!qdev_realize(DEVICE(cpu), NULL, errp)) {
return false;
@@ -270,6 +272,17 @@ static bool spapr_realize_vcpu(PowerPCCPU *cpu, SpaprMachineState *spapr,
@@ -280,6 +282,17 @@ static bool spapr_realize_vcpu(PowerPCCPU *cpu, SpaprMachineState *spapr,
/* Set time-base frequency to 512 MHz. vhyp must be set first. */
cpu_ppc_tb_init(env, SPAPR_TIMEBASE_FREQ);
@ -375,10 +375,10 @@ index fcb5dfe792..ab8fb5bf62 100644
qdev_unrealize(DEVICE(cpu));
return false;
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 5c8aabd444..04489d5808 100644
index 4aaf23d28f..3233c54d11 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -155,6 +155,7 @@ struct SpaprMachineClass {
@@ -157,6 +157,7 @@ struct SpaprMachineClass {
bool pre_5_2_numa_associativity;
bool pre_6_2_numa_affinity;
@ -386,7 +386,7 @@ index 5c8aabd444..04489d5808 100644
bool (*phb_placement)(SpaprMachineState *spapr, uint32_t index,
uint64_t *buid, hwaddr *pio,
hwaddr *mmio32, hwaddr *mmio64,
@@ -257,6 +258,9 @@ struct SpaprMachineState {
@@ -259,6 +260,9 @@ struct SpaprMachineState {
/* Set by -boot */
char *boot_device;
@ -397,7 +397,7 @@ index 5c8aabd444..04489d5808 100644
char *kvm_type;
char *host_model;
diff --git a/target/ppc/compat.c b/target/ppc/compat.c
index 7949a24f5a..f207a9ba01 100644
index ebef2cccec..ff2c00c60e 100644
--- a/target/ppc/compat.c
+++ b/target/ppc/compat.c
@@ -114,8 +114,19 @@ static const CompatInfo *compat_by_pvr(uint32_t pvr)
@ -422,10 +422,10 @@ index 7949a24f5a..f207a9ba01 100644
const CompatInfo *compat = compat_by_pvr(compat_pvr);
const CompatInfo *min = compat_by_pvr(min_compat_pvr);
diff --git a/target/ppc/cpu-models.c b/target/ppc/cpu-models.c
index cd3ff700ac..1cb49c8087 100644
index f77ebfcc81..18e9422006 100644
--- a/target/ppc/cpu-models.c
+++ b/target/ppc/cpu-models.c
@@ -746,6 +746,7 @@
@@ -744,6 +744,7 @@
/* PowerPC CPU aliases */
PowerPCCPUAlias ppc_cpu_aliases[] = {
@ -434,10 +434,10 @@ index cd3ff700ac..1cb49c8087 100644
{ "405cr", "405crc" },
{ "405gp", "405gpd" },
diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 557d736dab..6646ec1c27 100644
index 67e6b2effd..11187aeb93 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1482,6 +1482,7 @@ static inline int cpu_mmu_index(CPUPPCState *env, bool ifetch)
@@ -1655,6 +1655,7 @@ static inline int ppc_env_mmu_index(CPUPPCState *env, bool ifetch)
/* Compatibility modes */
#if defined(TARGET_PPC64)
@ -446,18 +446,18 @@ index 557d736dab..6646ec1c27 100644
uint32_t min_compat_pvr, uint32_t max_compat_pvr);
bool ppc_type_check_compat(const char *cputype, uint32_t compat_pvr,
diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index 78f6fc50cd..68d06c3f8f 100644
index 8231feb2d4..59f640cf7b 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -88,6 +88,7 @@ static int cap_ppc_nested_kvm_hv;
static int cap_large_decr;
@@ -89,6 +89,7 @@ static int cap_large_decr;
static int cap_fwnmi;
static int cap_rpt_invalidate;
static int cap_ail_mode_3;
+static int cap_ppc_secure_guest;
static uint32_t debug_inst_opcode;
@@ -135,6 +136,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
@@ -141,6 +142,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
cap_resize_hpt = kvm_vm_check_extension(s, KVM_CAP_SPAPR_RESIZE_HPT);
kvmppc_get_cpu_characteristics(s);
cap_ppc_nested_kvm_hv = kvm_vm_check_extension(s, KVM_CAP_PPC_NESTED_HV);
@ -465,8 +465,8 @@ index 78f6fc50cd..68d06c3f8f 100644
cap_large_decr = kvmppc_get_dec_bits();
cap_fwnmi = kvm_vm_check_extension(s, KVM_CAP_PPC_FWNMI);
/*
@@ -2569,6 +2571,16 @@ int kvmppc_has_cap_rpt_invalidate(void)
return cap_rpt_invalidate;
@@ -2564,6 +2566,16 @@ bool kvmppc_supports_ail_3(void)
return cap_ail_mode_3;
}
+bool kvmppc_has_cap_secure_guest(void)
@ -482,7 +482,7 @@ index 78f6fc50cd..68d06c3f8f 100644
PowerPCCPUClass *kvm_ppc_get_host_cpu_class(void)
{
uint32_t host_pvr = mfpvr();
@@ -2969,3 +2981,18 @@ bool kvm_arch_cpu_check_are_resettable(void)
@@ -2964,3 +2976,18 @@ bool kvm_arch_cpu_check_are_resettable(void)
void kvm_arch_accel_class_init(ObjectClass *oc)
{
}
@ -502,27 +502,27 @@ index 78f6fc50cd..68d06c3f8f 100644
+ }
+}
diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
index 5fd9753953..b5ebfe2be0 100644
index 1975fb5ee6..d1017f98be 100644
--- a/target/ppc/kvm_ppc.h
+++ b/target/ppc/kvm_ppc.h
@@ -43,6 +43,7 @@ int kvmppc_booke_watchdog_enable(PowerPCCPU *cpu);
@@ -46,6 +46,7 @@ int kvmppc_booke_watchdog_enable(PowerPCCPU *cpu);
target_ulong kvmppc_configure_v3_mmu(PowerPCCPU *cpu,
bool radix, bool gtse,
uint64_t proc_tbl);
+void kvmppc_svm_allow(Error **errp);
#ifndef CONFIG_USER_ONLY
bool kvmppc_spapr_use_multitce(void);
int kvmppc_spapr_enable_inkernel_multitce(void);
@@ -77,6 +78,8 @@ int kvmppc_get_cap_large_decr(void);
int kvmppc_enable_cap_large_decr(PowerPCCPU *cpu, int enable);
void *kvmppc_create_spapr_tce(uint32_t liobn, uint32_t page_shift,
@@ -79,6 +80,8 @@ int kvmppc_enable_cap_large_decr(PowerPCCPU *cpu, int enable);
int kvmppc_has_cap_rpt_invalidate(void);
bool kvmppc_supports_ail_3(void);
int kvmppc_enable_hwrng(void);
+bool kvmppc_has_cap_secure_guest(void);
+int kvmppc_enable_cap_secure_guest(void);
int kvmppc_put_books_sregs(PowerPCCPU *cpu);
PowerPCCPUClass *kvm_ppc_get_host_cpu_class(void);
void kvmppc_check_papr_resize_hpt(Error **errp);
@@ -396,6 +399,16 @@ static inline int kvmppc_has_cap_rpt_invalidate(void)
@@ -427,6 +430,16 @@ static inline bool kvmppc_supports_ail_3(void)
return false;
}
@ -540,5 +540,5 @@ index 5fd9753953..b5ebfe2be0 100644
{
return -1;
--
2.39.1
2.39.3

@ -1,4 +1,4 @@
From 3c7647197729fcd76e219070c6f359bb3667d04d Mon Sep 17 00:00:00 2001
From 04178c77cfe188b4eed9c08a0bf66842e61fe5dc Mon Sep 17 00:00:00 2001
From: Miroslav Rezanina <mrezanin@redhat.com>
Date: Fri, 19 Oct 2018 13:47:32 +0200
Subject: Add s390x machine types
@ -8,7 +8,7 @@ Adding changes to add RHEL machine types for s390x architecture.
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
--
Rebase changes (7.1.0 rc0):
Rebase changes (7.1.0):
- Moved adding rhel_old_machine_deprecation variable to general machine types commit
Merged patches (6.1.0):
@ -23,52 +23,74 @@ Merged patches (7.0.0):
- 4b0efa7e21 redhat: Add rhel8.6.0 and rhel9.0.0 machine types for s390x
- dcc64971bf RHEL: mark old machine types as deprecated (partially)
Merged patches (7.1.0 rc0):
Merged patches (7.1.0):
- 1d6439527a WRB: Introduce RHEL 9.0.0 hw compat structure (only hw/s390x/s390-virtio-ccw.c chunk)
- c8ad21ca31 redhat: Update s390x machine type compatibility for rebase to QEMU 7.0.0
- 5bcf8d874c target/s390x: deprecate CPUs older than z14
Merged patches (7.2.0 rc0):
Merged patches (7.2.0):
- 0be2889fa2 Introduce upstream 7.0 compat changes (only applicable parts)
Merged patches (8.0.0-rc1):
Merged patches (8.0.0):
- 27c188c6a4 redhat: Update s390x machine type compatibility for QEMU 7.2.0 update
- a932b8d429 redhat: Add new rhel-9.2.0 s390x machine type
- ac88104bad s390x/s390-virtio-ccw: Activate zPCI features on s390-ccw-virtio-rhel8.6.0
Merged patches (8.1.0):
- bd5d81d286 Add RHEL 9.2.0 compat structure (s390x part)
Merged patches (8.2.0):
- 4ee284aca9 Add machine types compat bits. (partial)
---
hw/s390x/s390-virtio-ccw.c | 143 +++++++++++++++++++++++++++++++
hw/s390x/s390-virtio-ccw.c | 159 +++++++++++++++++++++++++++++++
target/s390x/cpu_models.c | 11 +++
target/s390x/cpu_models.h | 2 +
target/s390x/cpu_models_sysemu.c | 2 +
4 files changed, 158 insertions(+)
4 files changed, 174 insertions(+)
diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 503f212a31..dcd3b966b0 100644
index b1dcb3857f..ff753a29e0 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -826,6 +826,7 @@ bool css_migration_enabled(void)
@@ -859,6 +859,7 @@ bool css_migration_enabled(void)
} \
type_init(ccw_machine_register_##suffix)
+#if 0 /* Disabled for Red Hat Enterprise Linux */
static void ccw_machine_8_0_instance_options(MachineState *machine)
static void ccw_machine_9_0_instance_options(MachineState *machine)
{
}
@@ -1201,6 +1202,148 @@ static void ccw_machine_2_4_class_options(MachineClass *mc)
@@ -1272,6 +1273,164 @@ static void ccw_machine_2_4_class_options(MachineClass *mc)
compat_props_add(mc->compat_props, compat, G_N_ELEMENTS(compat));
}
DEFINE_CCW_MACHINE(2_4, "2.4", false);
+#endif
+
+
+static void ccw_machine_rhel940_instance_options(MachineState *machine)
+{
+}
+
+static void ccw_machine_rhel940_class_options(MachineClass *mc)
+{
+}
+DEFINE_CCW_MACHINE(rhel940, "rhel9.4.0", true);
+
+static void ccw_machine_rhel920_instance_options(MachineState *machine)
+{
+ ccw_machine_rhel940_instance_options(machine);
+}
+
+static void ccw_machine_rhel920_class_options(MachineClass *mc)
+{
+ ccw_machine_rhel940_class_options(mc);
+ compat_props_add(mc->compat_props, hw_compat_rhel_9_4, hw_compat_rhel_9_4_len);
+ compat_props_add(mc->compat_props, hw_compat_rhel_9_3, hw_compat_rhel_9_3_len);
+ compat_props_add(mc->compat_props, hw_compat_rhel_9_2, hw_compat_rhel_9_2_len);
+ mc->smp_props.drawers_supported = false; /* from ccw_machine_8_1 */
+ mc->smp_props.books_supported = false; /* from ccw_machine_8_1 */
+}
+DEFINE_CCW_MACHINE(rhel920, "rhel9.2.0", true);
+DEFINE_CCW_MACHINE(rhel920, "rhel9.2.0", false);
+
+static void ccw_machine_rhel900_instance_options(MachineState *machine)
+{
@ -204,7 +226,7 @@ index 503f212a31..dcd3b966b0 100644
static void ccw_machine_register_types(void)
{
diff --git a/target/s390x/cpu_models.c b/target/s390x/cpu_models.c
index 457b5cb10c..ff6b9463cb 100644
index 8ed3bb6a27..370b3b3065 100644
--- a/target/s390x/cpu_models.c
+++ b/target/s390x/cpu_models.c
@@ -46,6 +46,9 @@
@ -217,7 +239,7 @@ index 457b5cb10c..ff6b9463cb 100644
static S390CPUDef s390_cpu_defs[] = {
CPUDEF_INIT(0x2064, 7, 1, 38, 0x00000000U, "z900", "IBM zSeries 900 GA1"),
CPUDEF_INIT(0x2064, 7, 2, 38, 0x00000000U, "z900.2", "IBM zSeries 900 GA2"),
@@ -857,22 +860,30 @@ static void s390_host_cpu_model_class_init(ObjectClass *oc, void *data)
@@ -866,22 +869,30 @@ static void s390_host_cpu_model_class_init(ObjectClass *oc, void *data)
static void s390_base_cpu_model_class_init(ObjectClass *oc, void *data)
{
S390CPUClass *xcc = S390_CPU_CLASS(oc);
@ -249,23 +271,23 @@ index 457b5cb10c..ff6b9463cb 100644
static void s390_qemu_cpu_model_class_init(ObjectClass *oc, void *data)
diff --git a/target/s390x/cpu_models.h b/target/s390x/cpu_models.h
index fb1adc8b21..d76745afa9 100644
index d7b8912989..1a806a97c4 100644
--- a/target/s390x/cpu_models.h
+++ b/target/s390x/cpu_models.h
@@ -38,6 +38,8 @@ struct S390CPUDef {
@@ -38,6 +38,8 @@ typedef struct S390CPUDef {
S390FeatBitmap full_feat;
/* used to init full_feat from generated data */
S390FeatInit full_init;
+ /* if deprecated, provides a suggestion */
+ const char *deprecation_note;
};
} S390CPUDef;
/* CPU model based on a CPU definition */
diff --git a/target/s390x/cpu_models_sysemu.c b/target/s390x/cpu_models_sysemu.c
index 87a4480c05..28c1b0486c 100644
index 0728bfcc20..ca2e5d91e2 100644
--- a/target/s390x/cpu_models_sysemu.c
+++ b/target/s390x/cpu_models_sysemu.c
@@ -60,6 +60,7 @@ static void create_cpu_model_list(ObjectClass *klass, void *opaque)
@@ -59,6 +59,7 @@ static void create_cpu_model_list(ObjectClass *klass, void *opaque)
CpuDefinitionInfo *info;
char *name = g_strdup(object_class_get_name(klass));
S390CPUClass *scc = S390_CPU_CLASS(klass);
@ -273,7 +295,7 @@ index 87a4480c05..28c1b0486c 100644
/* strip off the -s390x-cpu */
g_strrstr(name, "-" TYPE_S390_CPU)[0] = 0;
@@ -69,6 +70,7 @@ static void create_cpu_model_list(ObjectClass *klass, void *opaque)
@@ -68,6 +69,7 @@ static void create_cpu_model_list(ObjectClass *klass, void *opaque)
info->migration_safe = scc->is_migration_safe;
info->q_static = scc->is_static;
info->q_typename = g_strdup(object_class_get_name(klass));
@ -282,5 +304,5 @@ index 87a4480c05..28c1b0486c 100644
if (cpu_list_data->model) {
Object *obj;
--
2.39.1
2.39.3

@ -1,4 +1,4 @@
From 510291040cb280e1f68b793a84ec0f7d1c88aafa Mon Sep 17 00:00:00 2001
From 3c88acb005806ad2386ab6c94a8831151f624738 Mon Sep 17 00:00:00 2001
From: Miroslav Rezanina <mrezanin@redhat.com>
Date: Fri, 19 Oct 2018 13:10:31 +0200
Subject: Add x86_64 machine types
@ -13,9 +13,12 @@ Rebase notes (6.1.0):
Rebase notes (7.0.0):
- Reset alias for all machine-types except latest one
Rebase notes (8.0.0-rc1):
Rebase notes (8.0.0):
- remove legacy_no_rng_seed usage (removed upstream)
Rebase notes (8.1.0):
- default_nic_model to default_nic
Merged patches (6.1.0):
- 59c284ad3b x86: Add x86 rhel8.5 machine types
- a8868b42fe redhat: x86: Enable 'kvm-asyncpf-int' by default
@ -35,35 +38,61 @@ Merged patches (7.0.0):
- dcc64971bf RHEL: mark old machine types as deprecated (partially)
- 6b396f182b RHEL: disable "seqpacket" for "vhost-vsock-device" in rhel8.6.0
Merged patches (7.1.0 rc0):
Merged patches (7.1.0):
- 38b89dc245 pc: Move s3/s4 suspend disabling to compat (only hw/i386/pc.c chunk)
- 1d6439527a WRB: Introduce RHEL 9.0.0 hw compat structure (x86_64 specific changes)
- 35b5c8554f target/i386: deprecate CPUs older than x86_64-v2 ABI
Merged patches (7.2.0 rc0):
Merged patches (7.2.0):
- 0be2889fa2 Introduce upstream 7.0 compat changes (only applicable parts)
Merged patches (8.0.0-rc1):
Merged patches (8.0.0):
- f33ca8aed4 x86: rhel 9.2.0 machine type
Merged patches (8.1.0):
- bd5d81d286 Add RHEL 9.2.0 compat structure (x86_64 part)
- c6eaf73add redhat: hw/i386/pc: Update x86 machine type compatibility for QEMU 8.0.0 update
- 6cbf496e5e hw/acpi: Mark acpi blobs as resizable on RHEL pc machines version 7.6 and above
Merged patches (8.2.0):
- 4ee284aca9 Add machine types compat bits. (partial)
- 719e2ac147 Fix x86 machine type compatibility for qemu-kvm 8.1.0
Merged patches (9.0.0 rc0):
- 9149e2bc8f x86: rhel 9.2.0 machine type compat fix
---
hw/i386/pc.c | 147 +++++++++++++++++++++-
hw/i386/pc_piix.c | 86 ++++++++++++-
hw/i386/pc_q35.c | 252 ++++++++++++++++++++++++++++++++++++-
hw/i386/fw_cfg.c | 2 +-
hw/i386/pc.c | 159 ++++++++++++++++++++-
hw/i386/pc_piix.c | 109 ++++++++++++++
hw/i386/pc_q35.c | 285 +++++++++++++++++++++++++++++++++++++
include/hw/boards.h | 2 +
include/hw/i386/pc.h | 27 ++++
target/i386/cpu.c | 21 ++++
include/hw/i386/pc.h | 33 +++++
target/i386/cpu.c | 21 +++
target/i386/kvm/kvm-cpu.c | 1 +
target/i386/kvm/kvm.c | 4 +
tests/qtest/pvpanic-test.c | 5 +-
9 files changed, 538 insertions(+), 7 deletions(-)
10 files changed, 617 insertions(+), 4 deletions(-)
diff --git a/hw/i386/fw_cfg.c b/hw/i386/fw_cfg.c
index c7aa39a13e..283c3f4c16 100644
--- a/hw/i386/fw_cfg.c
+++ b/hw/i386/fw_cfg.c
@@ -63,7 +63,7 @@ void fw_cfg_build_smbios(PCMachineState *pcms, FWCfgState *fw_cfg,
if (pcmc->smbios_defaults) {
/* These values are guest ABI, do not change */
- smbios_set_defaults("QEMU", mc->desc, mc->name,
+ smbios_set_defaults("Red Hat", "KVM", mc->desc,
pcmc->smbios_uuid_encoded,
pcmc->smbios_stream_product, pcmc->smbios_stream_version);
}
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 1489abf010..8abb1f872e 100644
index 5c21b0c4db..4a154c1a9a 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -407,6 +407,149 @@ GlobalProperty pc_compat_1_4[] = {
@@ -326,6 +326,161 @@ GlobalProperty pc_compat_2_0[] = {
};
const size_t pc_compat_1_4_len = G_N_ELEMENTS(pc_compat_1_4);
const size_t pc_compat_2_0_len = G_N_ELEMENTS(pc_compat_2_0);
+/* This macro is for changes to properties that are RHEL specific,
+ * different to the current upstream and to be applied to the latest
@ -87,6 +116,18 @@ index 1489abf010..8abb1f872e 100644
+};
+const size_t pc_rhel_compat_len = G_N_ELEMENTS(pc_rhel_compat);
+
+GlobalProperty pc_rhel_9_3_compat[] = {
+ /* pc_rhel_9_3_compat from pc_compat_8_0 */
+ { "virtio-mem", "unplugged-inaccessible", "auto" },
+};
+const size_t pc_rhel_9_3_compat_len = G_N_ELEMENTS(pc_rhel_9_3_compat);
+
+GlobalProperty pc_rhel_9_2_compat[] = {
+ /* pc_rhel_9_2_compat from pc_compat_7_2 */
+ { "ICH9-LPC", "noreboot", "true" },
+};
+const size_t pc_rhel_9_2_compat_len = G_N_ELEMENTS(pc_rhel_9_2_compat);
+
+GlobalProperty pc_rhel_9_0_compat[] = {
+ /* pc_rhel_9_0_compat from pc_compat_6_2 */
+ { "virtio-mem", "unplugged-inaccessible", "off" },
@ -211,15 +252,15 @@ index 1489abf010..8abb1f872e 100644
GSIState *pc_gsi_create(qemu_irq **irqs, bool pci_enabled)
{
GSIState *s;
@@ -1944,6 +2087,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
pcmc->pvh_enabled = true;
pcmc->kvmclock_create_always = true;
@@ -1813,6 +1968,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
pcmc->resizable_acpi_blob = true;
x86mc->apic_xrupt_override = true;
assert(!mc->get_hotplug_handler);
+ mc->async_pf_vmexit_disable = false;
mc->get_hotplug_handler = pc_get_hotplug_handler;
mc->hotplug_allowed = pc_hotplug_allowed;
mc->cpu_index_to_instance_props = x86_cpu_index_to_props;
@@ -1954,7 +2098,8 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
@@ -1823,7 +1979,8 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
mc->has_hotpluggable_cpus = true;
mc->default_boot_order = "cad";
mc->block_default_type = IF_IDE;
@ -230,10 +271,10 @@ index 1489abf010..8abb1f872e 100644
mc->wakeup = pc_machine_wakeup;
hc->pre_plug = pc_machine_device_pre_plug_cb;
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 14a794081e..3e330fd36f 100644
index 18ba076609..a647262d63 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -54,6 +54,7 @@
@@ -52,6 +52,7 @@
#include "qapi/error.h"
#include "qemu/error-report.h"
#include "sysemu/xen.h"
@ -241,18 +282,7 @@ index 14a794081e..3e330fd36f 100644
#ifdef CONFIG_XEN
#include <xen/hvm/hvm_info_table.h>
#include "hw/xen/xen_pt.h"
@@ -198,8 +199,8 @@ static void pc_init1(MachineState *machine,
if (pcmc->smbios_defaults) {
MachineClass *mc = MACHINE_GET_CLASS(machine);
/* These values are guest ABI, do not change */
- smbios_set_defaults("QEMU", "Standard PC (i440FX + PIIX, 1996)",
- mc->name, pcmc->smbios_legacy_mode,
+ smbios_set_defaults("Red Hat", "KVM",
+ mc->desc, pcmc->smbios_legacy_mode,
pcmc->smbios_uuid_encoded,
pcmc->smbios_stream_product,
pcmc->smbios_stream_version,
@@ -351,6 +352,7 @@ static void pc_init1(MachineState *machine,
@@ -422,6 +423,7 @@ static void pc_set_south_bridge(Object *obj, int value, Error **errp)
* hw_compat_*, pc_compat_*, or * pc_*_machine_options().
*/
@ -260,7 +290,7 @@ index 14a794081e..3e330fd36f 100644
static void pc_compat_2_3_fn(MachineState *machine)
{
X86MachineState *x86ms = X86_MACHINE(machine);
@@ -899,3 +901,83 @@ static void xenfv_3_1_machine_options(MachineClass *m)
@@ -951,3 +953,110 @@ static void xenfv_3_1_machine_options(MachineClass *m)
DEFINE_PC_MACHINE(xenfv, "xenfv-3.1", pc_xen_hvm_init,
xenfv_3_1_machine_options);
#endif
@ -274,8 +304,9 @@ index 14a794081e..3e330fd36f 100644
+ PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
+ m->family = "pc_piix_Y";
+ m->default_machine_opts = "firmware=bios-256k.bin,hpet=off";
+ pcmc->default_nic_model = "e1000";
+ pcmc->pci_root_uid = 0;
+ pcmc->resizable_acpi_blob = true;
+ m->default_nic = "e1000";
+ m->default_display = "std";
+ m->no_parallel = 1;
+ m->numa_mem_supported = true;
@ -289,13 +320,13 @@ index 14a794081e..3e330fd36f 100644
+
+static void pc_init_rhel760(MachineState *machine)
+{
+ pc_init1(machine, TYPE_I440FX_PCI_HOST_BRIDGE, \
+ TYPE_I440FX_PCI_DEVICE);
+ pc_init1(machine, TYPE_I440FX_PCI_DEVICE);
+}
+
+static void pc_machine_rhel760_options(MachineClass *m)
+{
+ PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
+ ObjectClass *oc = OBJECT_CLASS(m);
+ pc_machine_rhel7_options(m);
+ m->desc = "RHEL 7.6.0 PC (i440FX + PIIX, 1996)";
+ m->async_pf_vmexit_disable = true;
@ -309,7 +340,33 @@ index 14a794081e..3e330fd36f 100644
+ pcmc->kvmclock_create_always = false;
+ /* From pc_i440fx_5_1_machine_options() */
+ pcmc->pci_root_uid = 1;
+ /* From pc_i440fx_7_0_machine_options() */
+ pcmc->enforce_amd_1tb_hole = false;
+ /* From pc_i440fx_8_0_machine_options() */
+ pcmc->default_smbios_ep_type = SMBIOS_ENTRY_POINT_TYPE_32;
+ /* From pc_i440fx_8_1_machine_options() */
+ pcmc->broken_32bit_mem_addr_check = true;
+ /* Introduced in QEMU 8.2 */
+ pcmc->default_south_bridge = TYPE_PIIX3_DEVICE;
+
+ object_class_property_add_enum(oc, "x-south-bridge", "PCSouthBridgeOption",
+ &PCSouthBridgeOption_lookup,
+ pc_get_south_bridge,
+ pc_set_south_bridge);
+ object_class_property_set_description(oc, "x-south-bridge",
+ "Use a different south bridge than PIIX3");
+
+
+ compat_props_add(m->compat_props, hw_compat_rhel_9_4,
+ hw_compat_rhel_9_4_len);
+ compat_props_add(m->compat_props, hw_compat_rhel_9_3,
+ hw_compat_rhel_9_3_len);
+ compat_props_add(m->compat_props, pc_rhel_9_3_compat,
+ pc_rhel_9_3_compat_len);
+ compat_props_add(m->compat_props, hw_compat_rhel_9_2,
+ hw_compat_rhel_9_2_len);
+ compat_props_add(m->compat_props, pc_rhel_9_2_compat,
+ pc_rhel_9_2_compat_len);
+ compat_props_add(m->compat_props, hw_compat_rhel_9_1,
+ hw_compat_rhel_9_1_len);
+ compat_props_add(m->compat_props, hw_compat_rhel_9_0,
@ -345,21 +402,10 @@ index 14a794081e..3e330fd36f 100644
+DEFINE_PC_MACHINE(rhel760, "pc-i440fx-rhel7.6.0", pc_init_rhel760,
+ pc_machine_rhel760_options);
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index dc0ba5f9e7..98601bb76f 100644
index c7bc8a2041..e872dc7e46 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -199,8 +199,8 @@ static void pc_q35_init(MachineState *machine)
if (pcmc->smbios_defaults) {
/* These values are guest ABI, do not change */
- smbios_set_defaults("QEMU", "Standard PC (Q35 + ICH9, 2009)",
- mc->name, pcmc->smbios_legacy_mode,
+ smbios_set_defaults("Red Hat", "KVM",
+ mc->desc, pcmc->smbios_legacy_mode,
pcmc->smbios_uuid_encoded,
pcmc->smbios_stream_product,
pcmc->smbios_stream_version,
@@ -354,6 +354,7 @@ static void pc_q35_init(MachineState *machine)
@@ -341,6 +341,7 @@ static void pc_q35_init(MachineState *machine)
DEFINE_PC_MACHINE(suffix, name, pc_init_##suffix, optionfn)
@ -367,7 +413,7 @@ index dc0ba5f9e7..98601bb76f 100644
static void pc_q35_machine_options(MachineClass *m)
{
PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
@@ -663,3 +664,250 @@ static void pc_q35_2_4_machine_options(MachineClass *m)
@@ -693,3 +694,287 @@ static void pc_q35_2_4_machine_options(MachineClass *m)
DEFINE_Q35_MACHINE(v2_4, "pc-q35-2.4", NULL,
pc_q35_2_4_machine_options);
@ -379,8 +425,8 @@ index dc0ba5f9e7..98601bb76f 100644
+static void pc_q35_machine_rhel_options(MachineClass *m)
+{
+ PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
+ pcmc->default_nic_model = "e1000e";
+ pcmc->pci_root_uid = 0;
+ m->default_nic = "e1000e";
+ m->family = "pc_q35_Z";
+ m->units_per_default_bus = 1;
+ m->default_machine_opts = "firmware=bios-256k.bin,hpet=off";
@ -394,8 +440,28 @@ index dc0ba5f9e7..98601bb76f 100644
+ m->alias = "q35";
+ m->max_cpus = 710;
+ compat_props_add(m->compat_props, pc_rhel_compat, pc_rhel_compat_len);
+ compat_props_add(m->compat_props,
+ pc_q35_compat_defaults, pc_q35_compat_defaults_len);
+}
+
+static void pc_q35_init_rhel940(MachineState *machine)
+{
+ pc_q35_init(machine);
+}
+
+static void pc_q35_machine_rhel940_options(MachineClass *m)
+{
+ PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
+ pc_q35_machine_rhel_options(m);
+ m->desc = "RHEL-9.4.0 PC (Q35 + ICH9, 2009)";
+ pcmc->smbios_stream_product = "RHEL";
+ pcmc->smbios_stream_version = "9.4.0";
+}
+
+DEFINE_PC_MACHINE(q35_rhel940, "pc-q35-rhel9.4.0", pc_q35_init_rhel940,
+ pc_q35_machine_rhel940_options);
+
+
+static void pc_q35_init_rhel920(MachineState *machine)
+{
+ pc_q35_init(machine);
@ -404,10 +470,27 @@ index dc0ba5f9e7..98601bb76f 100644
+static void pc_q35_machine_rhel920_options(MachineClass *m)
+{
+ PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
+ pc_q35_machine_rhel_options(m);
+ pc_q35_machine_rhel940_options(m);
+ m->desc = "RHEL-9.2.0 PC (Q35 + ICH9, 2009)";
+ m->alias = NULL;
+ pcmc->smbios_stream_product = "RHEL";
+ pcmc->smbios_stream_version = "9.2.0";
+
+ /* From pc_q35_8_0_machine_options() */
+ pcmc->default_smbios_ep_type = SMBIOS_ENTRY_POINT_TYPE_32;
+ /* From pc_q35_8_1_machine_options() */
+ pcmc->broken_32bit_mem_addr_check = true;
+
+ compat_props_add(m->compat_props, hw_compat_rhel_9_4,
+ hw_compat_rhel_9_4_len);
+ compat_props_add(m->compat_props, hw_compat_rhel_9_3,
+ hw_compat_rhel_9_3_len);
+ compat_props_add(m->compat_props, pc_rhel_9_3_compat,
+ pc_rhel_9_3_compat_len);
+ compat_props_add(m->compat_props, hw_compat_rhel_9_2,
+ hw_compat_rhel_9_2_len);
+ compat_props_add(m->compat_props, pc_rhel_9_2_compat,
+ pc_rhel_9_2_compat_len);
+}
+
+DEFINE_PC_MACHINE(q35_rhel920, "pc-q35-rhel9.2.0", pc_q35_init_rhel920,
@ -619,10 +702,10 @@ index dc0ba5f9e7..98601bb76f 100644
+DEFINE_PC_MACHINE(q35_rhel760, "pc-q35-rhel7.6.0", pc_q35_init_rhel760,
+ pc_q35_machine_rhel760_options);
diff --git a/include/hw/boards.h b/include/hw/boards.h
index c5a965d27f..5e7446ee40 100644
index 0466f9d0f3..46b8725c41 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -268,6 +268,8 @@ struct MachineClass {
@@ -283,6 +283,8 @@ struct MachineClass {
strList *allowed_dynamic_sysbus_devices;
bool auto_enable_numa_with_memhp;
bool auto_enable_numa_with_memdev;
@ -632,16 +715,22 @@ index c5a965d27f..5e7446ee40 100644
bool smbus_no_migration_support;
bool nvdimm_supported;
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 908a275736..4376f64a47 100644
index ebd8f973f2..a984c951ad 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -293,6 +293,33 @@ extern const size_t pc_compat_1_4_len;
int pc_machine_kvm_type(MachineState *machine, const char *vm_type);
@@ -291,6 +291,39 @@ extern const size_t pc_compat_2_1_len;
extern GlobalProperty pc_compat_2_0[];
extern const size_t pc_compat_2_0_len;
+extern GlobalProperty pc_rhel_compat[];
+extern const size_t pc_rhel_compat_len;
+
+extern GlobalProperty pc_rhel_9_3_compat[];
+extern const size_t pc_rhel_9_3_compat_len;
+
+extern GlobalProperty pc_rhel_9_2_compat[];
+extern const size_t pc_rhel_9_2_compat_len;
+
+extern GlobalProperty pc_rhel_9_0_compat[];
+extern const size_t pc_rhel_9_0_compat_len;
+
@ -670,10 +759,10 @@ index 908a275736..4376f64a47 100644
static void pc_machine_##suffix##_class_init(ObjectClass *oc, void *data) \
{ \
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 6576287e5b..0ef2bf1b93 100644
index 33760a2ee1..be7b0663cd 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1834,9 +1834,13 @@ static const CPUCaches epyc_milan_cache_info = {
@@ -2190,9 +2190,13 @@ static const CPUCaches epyc_genoa_cache_info = {
* PT in VMX operation
*/
@ -687,7 +776,7 @@ index 6576287e5b..0ef2bf1b93 100644
.level = 0xd,
.vendor = CPUID_VENDOR_AMD,
.family = 15,
@@ -1857,6 +1861,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
@@ -2213,6 +2217,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "phenom",
@ -695,7 +784,7 @@ index 6576287e5b..0ef2bf1b93 100644
.level = 5,
.vendor = CPUID_VENDOR_AMD,
.family = 16,
@@ -1889,6 +1894,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
@@ -2245,6 +2250,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "core2duo",
@ -703,7 +792,7 @@ index 6576287e5b..0ef2bf1b93 100644
.level = 10,
.vendor = CPUID_VENDOR_INTEL,
.family = 6,
@@ -1931,6 +1937,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
@@ -2287,6 +2293,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "kvm64",
@ -711,7 +800,7 @@ index 6576287e5b..0ef2bf1b93 100644
.level = 0xd,
.vendor = CPUID_VENDOR_INTEL,
.family = 15,
@@ -1972,6 +1979,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
@@ -2328,6 +2335,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "qemu32",
@ -719,7 +808,7 @@ index 6576287e5b..0ef2bf1b93 100644
.level = 4,
.vendor = CPUID_VENDOR_INTEL,
.family = 6,
@@ -1986,6 +1994,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
@@ -2342,6 +2350,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "kvm32",
@ -727,7 +816,7 @@ index 6576287e5b..0ef2bf1b93 100644
.level = 5,
.vendor = CPUID_VENDOR_INTEL,
.family = 15,
@@ -2016,6 +2025,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
@@ -2372,6 +2381,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "coreduo",
@ -735,7 +824,7 @@ index 6576287e5b..0ef2bf1b93 100644
.level = 10,
.vendor = CPUID_VENDOR_INTEL,
.family = 6,
@@ -2049,6 +2059,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
@@ -2405,6 +2415,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "486",
@ -743,7 +832,7 @@ index 6576287e5b..0ef2bf1b93 100644
.level = 1,
.vendor = CPUID_VENDOR_INTEL,
.family = 4,
@@ -2061,6 +2072,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
@@ -2417,6 +2428,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "pentium",
@ -751,7 +840,7 @@ index 6576287e5b..0ef2bf1b93 100644
.level = 1,
.vendor = CPUID_VENDOR_INTEL,
.family = 5,
@@ -2073,6 +2085,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
@@ -2429,6 +2441,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "pentium2",
@ -759,7 +848,7 @@ index 6576287e5b..0ef2bf1b93 100644
.level = 2,
.vendor = CPUID_VENDOR_INTEL,
.family = 6,
@@ -2085,6 +2098,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
@@ -2441,6 +2454,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "pentium3",
@ -767,7 +856,7 @@ index 6576287e5b..0ef2bf1b93 100644
.level = 3,
.vendor = CPUID_VENDOR_INTEL,
.family = 6,
@@ -2097,6 +2111,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
@@ -2453,6 +2467,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "athlon",
@ -775,7 +864,7 @@ index 6576287e5b..0ef2bf1b93 100644
.level = 2,
.vendor = CPUID_VENDOR_AMD,
.family = 6,
@@ -2112,6 +2127,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
@@ -2468,6 +2483,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "n270",
@ -783,7 +872,7 @@ index 6576287e5b..0ef2bf1b93 100644
.level = 10,
.vendor = CPUID_VENDOR_INTEL,
.family = 6,
@@ -2137,6 +2153,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
@@ -2493,6 +2509,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "Conroe",
@ -791,7 +880,7 @@ index 6576287e5b..0ef2bf1b93 100644
.level = 10,
.vendor = CPUID_VENDOR_INTEL,
.family = 6,
@@ -2177,6 +2194,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
@@ -2533,6 +2550,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "Penryn",
@ -799,7 +888,7 @@ index 6576287e5b..0ef2bf1b93 100644
.level = 10,
.vendor = CPUID_VENDOR_INTEL,
.family = 6,
@@ -3893,6 +3911,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
@@ -4394,6 +4412,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "Opteron_G1",
@ -807,7 +896,7 @@ index 6576287e5b..0ef2bf1b93 100644
.level = 5,
.vendor = CPUID_VENDOR_AMD,
.family = 15,
@@ -3913,6 +3932,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
@@ -4414,6 +4433,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "Opteron_G2",
@ -815,7 +904,7 @@ index 6576287e5b..0ef2bf1b93 100644
.level = 5,
.vendor = CPUID_VENDOR_AMD,
.family = 15,
@@ -3935,6 +3955,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
@@ -4436,6 +4456,7 @@ static const X86CPUDefinition builtin_x86_defs[] = {
},
{
.name = "Opteron_G3",
@ -824,10 +913,10 @@ index 6576287e5b..0ef2bf1b93 100644
.vendor = CPUID_VENDOR_AMD,
.family = 16,
diff --git a/target/i386/kvm/kvm-cpu.c b/target/i386/kvm/kvm-cpu.c
index 7237378a7d..7b8a3d5af0 100644
index 9c791b7b05..b91af5051f 100644
--- a/target/i386/kvm/kvm-cpu.c
+++ b/target/i386/kvm/kvm-cpu.c
@@ -137,6 +137,7 @@ static PropValue kvm_default_props[] = {
@@ -138,6 +138,7 @@ static PropValue kvm_default_props[] = {
{ "acpi", "off" },
{ "monitor", "off" },
{ "svm", "off" },
@ -836,10 +925,10 @@ index 7237378a7d..7b8a3d5af0 100644
};
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index de531842f6..8d82304609 100644
index e68cbe9293..739f33db47 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -3822,6 +3822,7 @@ static int kvm_get_msrs(X86CPU *cpu)
@@ -3715,6 +3715,7 @@ static int kvm_get_msrs(X86CPU *cpu)
struct kvm_msr_entry *msrs = cpu->kvm_msr_buf->entries;
int ret, i;
uint64_t mtrr_top_bits;
@ -847,7 +936,7 @@ index de531842f6..8d82304609 100644
kvm_msr_buf_reset(cpu);
@@ -4177,6 +4178,9 @@ static int kvm_get_msrs(X86CPU *cpu)
@@ -4069,6 +4070,9 @@ static int kvm_get_msrs(X86CPU *cpu)
break;
case MSR_KVM_ASYNC_PF_EN:
env->async_pf_en_msr = msrs[i].data;
@ -881,5 +970,5 @@ index 78f1cf8186..ac954c9b06 100644
val = qtest_inb(qts, 0x505);
g_assert_cmpuint(val, ==, 3);
--
2.39.1
2.39.3

@ -1,4 +1,4 @@
From 738db8353055eb6fd902513949c6659af8b401d0 Mon Sep 17 00:00:00 2001
From 5768cf6811842e5c59da3b752f60659a9d6b5ba1 Mon Sep 17 00:00:00 2001
From: Miroslav Rezanina <mrezanin@redhat.com>
Date: Wed, 2 Sep 2020 09:39:41 +0200
Subject: Enable make check
@ -24,43 +24,49 @@ Rebase changes (7.0.0):
- Remove unnecessary changes in iotest 051
- Remove changes in bios-tables-test.c and prom-env-test.c qtests
Rebase changes (7.1.0 rc0):
Rebase changes (7.1.0):
- Disable bcm2835-dma-test (added upstream)
Rebase changes (8.0.0-rc1):
Rebase changes (8.0.0):
- Removed chunks for disabling bios-table-test (protected upstream)
Rebase change (8.0.0-rc2):
- Disable new qemu-iotests execution
- Revert change in tco qtest (blocking test run)
Rebase changes (8.1.0):
- Do not disable device-plug-test for s390x
Rebase changes (8.2.0 rc1):
- Remove unneeded hack in qtest/usb-hcd-xhci-test.c
Merged patches (6.1.0):
- 2f129df7d3 redhat: Enable the 'test-block-iothread' test again
Merged patches (7.1.0 rc0):
Merged patches (7.1.0):
- 64d736640e RHEL-only: tests/avocado: Switch aarch64 tests from a53 to a57
Merged patches (8.1.0):
- f468163234 iotests: Use alternative CPU type that is not deprecated in RHEL
---
.distro/qemu-kvm.spec.template | 4 ++--
tests/avocado/replay_kernel.py | 2 +-
tests/avocado/reverse_debugging.py | 2 +-
tests/avocado/tcg_plugins.py | 6 ++---
tests/qemu-iotests/meson.build | 34 ++++++++++++++---------------
tests/qemu-iotests/testenv.py | 3 +++
tests/qtest/fuzz-e1000e-test.c | 2 +-
tests/qtest/fuzz-virtio-scsi-test.c | 2 +-
tests/qtest/intel-hda-test.c | 2 +-
tests/qtest/libqos/meson.build | 2 +-
tests/qtest/lpc-ich9-test.c | 2 +-
tests/qtest/meson.build | 2 --
tests/qtest/tco-test.c | 2 +-
tests/qtest/usb-hcd-xhci-test.c | 4 ++++
tests/qtest/meson.build | 1 -
tests/qtest/virtio-net-failover.c | 1 +
14 files changed, 35 insertions(+), 32 deletions(-)
13 files changed, 33 insertions(+), 30 deletions(-)
diff --git a/tests/avocado/replay_kernel.py b/tests/avocado/replay_kernel.py
index f13456e1ec..2fee270a42 100644
index 10d99403a4..c3422ea1e4 100644
--- a/tests/avocado/replay_kernel.py
+++ b/tests/avocado/replay_kernel.py
@@ -147,7 +147,7 @@ def test_aarch64_virt(self):
@@ -166,7 +166,7 @@ def test_aarch64_virt(self):
"""
:avocado: tags=arch:aarch64
:avocado: tags=machine:virt
@ -70,10 +76,10 @@ index f13456e1ec..2fee270a42 100644
kernel_url = ('https://archives.fedoraproject.org/pub/archive/fedora'
'/linux/releases/29/Everything/aarch64/os/images/pxeboot'
diff --git a/tests/avocado/reverse_debugging.py b/tests/avocado/reverse_debugging.py
index 680c314cfc..71eccb8fb6 100644
index 92855a02a5..87822074b6 100644
--- a/tests/avocado/reverse_debugging.py
+++ b/tests/avocado/reverse_debugging.py
@@ -206,7 +206,7 @@ def test_aarch64_virt(self):
@@ -230,7 +230,7 @@ def test_aarch64_virt(self):
"""
:avocado: tags=arch:aarch64
:avocado: tags=machine:virt
@ -83,10 +89,10 @@ index 680c314cfc..71eccb8fb6 100644
kernel_url = ('https://archives.fedoraproject.org/pub/archive/fedora'
'/linux/releases/29/Everything/aarch64/os/images/pxeboot'
diff --git a/tests/avocado/tcg_plugins.py b/tests/avocado/tcg_plugins.py
index 642d2e49e3..93b3afd823 100644
index 15fd87b2c1..f0d9d89c93 100644
--- a/tests/avocado/tcg_plugins.py
+++ b/tests/avocado/tcg_plugins.py
@@ -68,7 +68,7 @@ def test_aarch64_virt_insn(self):
@@ -66,7 +66,7 @@ def test_aarch64_virt_insn(self):
:avocado: tags=accel:tcg
:avocado: tags=arch:aarch64
:avocado: tags=machine:virt
@ -95,7 +101,7 @@ index 642d2e49e3..93b3afd823 100644
"""
kernel_path = self._grab_aarch64_kernel()
kernel_command_line = (self.KERNEL_COMMON_COMMAND_LINE +
@@ -94,7 +94,7 @@ def test_aarch64_virt_insn_icount(self):
@@ -96,7 +96,7 @@ def test_aarch64_virt_insn_icount(self):
:avocado: tags=accel:tcg
:avocado: tags=arch:aarch64
:avocado: tags=machine:virt
@ -104,7 +110,7 @@ index 642d2e49e3..93b3afd823 100644
"""
kernel_path = self._grab_aarch64_kernel()
kernel_command_line = (self.KERNEL_COMMON_COMMAND_LINE +
@@ -120,7 +120,7 @@ def test_aarch64_virt_mem_icount(self):
@@ -126,7 +126,7 @@ def test_aarch64_virt_mem_icount(self):
:avocado: tags=accel:tcg
:avocado: tags=arch:aarch64
:avocado: tags=machine:virt
@ -114,7 +120,7 @@ index 642d2e49e3..93b3afd823 100644
kernel_path = self._grab_aarch64_kernel()
kernel_command_line = (self.KERNEL_COMMON_COMMAND_LINE +
diff --git a/tests/qemu-iotests/meson.build b/tests/qemu-iotests/meson.build
index 9735071a29..32002335f4 100644
index fad340ad59..3c0d5241f6 100644
--- a/tests/qemu-iotests/meson.build
+++ b/tests/qemu-iotests/meson.build
@@ -51,21 +51,21 @@ foreach format, speed: qemu_iotests_formats
@ -156,6 +162,20 @@ index 9735071a29..32002335f4 100644
+# suite: suites)
+# endforeach
endforeach
diff --git a/tests/qemu-iotests/testenv.py b/tests/qemu-iotests/testenv.py
index 588f30a4f1..3929a3634f 100644
--- a/tests/qemu-iotests/testenv.py
+++ b/tests/qemu-iotests/testenv.py
@@ -244,6 +244,9 @@ def __init__(self, source_dir: str, build_dir: str,
if self.qemu_prog.endswith(f'qemu-system-{suffix}'):
self.qemu_options += f' -machine {machine}'
+ if self.qemu_prog.endswith('qemu-system-x86_64'):
+ self.qemu_options += ' -cpu Nehalem'
+
# QEMU_DEFAULT_MACHINE
self.qemu_default_machine = get_default_machine(self.qemu_prog)
diff --git a/tests/qtest/fuzz-e1000e-test.c b/tests/qtest/fuzz-e1000e-test.c
index 5052883fb6..b5286f4b12 100644
--- a/tests/qtest/fuzz-e1000e-test.c
@ -183,20 +203,20 @@ index e37b48b2cc..88647da054 100644
qtest_outl(s, 0xcf8, 0x80001811);
diff --git a/tests/qtest/intel-hda-test.c b/tests/qtest/intel-hda-test.c
index d4a8db6fd6..1a796ec15a 100644
index 663bb6c485..2efc43e3f7 100644
--- a/tests/qtest/intel-hda-test.c
+++ b/tests/qtest/intel-hda-test.c
@@ -38,7 +38,7 @@ static void test_issue542_ich6(void)
@@ -42,7 +42,7 @@ static void test_issue542_ich6(void)
{
QTestState *s;
- s = qtest_init("-nographic -nodefaults -M pc-q35-6.2 "
+ s = qtest_init("-nographic -nodefaults -M pc-q35-rhel9.0.0 "
AUDIODEV
"-device intel-hda,id=" HDA_ID CODEC_DEVICES);
qtest_outl(s, 0xcf8, 0x80000804);
diff --git a/tests/qtest/libqos/meson.build b/tests/qtest/libqos/meson.build
index cc209a8de5..42a7c529c9 100644
index 3aed6efcb8..119613237e 100644
--- a/tests/qtest/libqos/meson.build
+++ b/tests/qtest/libqos/meson.build
@@ -44,7 +44,7 @@ libqos_srcs = files(
@ -206,8 +226,8 @@ index cc209a8de5..42a7c529c9 100644
- 'virtio-iommu.c',
+# 'virtio-iommu.c',
'virtio-gpio.c',
'virtio-scmi.c',
'generic-pcihost.c',
diff --git a/tests/qtest/lpc-ich9-test.c b/tests/qtest/lpc-ich9-test.c
index 8ac95b89f7..cd2102555c 100644
--- a/tests/qtest/lpc-ich9-test.c
@ -222,10 +242,10 @@ index 8ac95b89f7..cd2102555c 100644
qtest_outl(s, 0xcf8, 0x8000f840); /* PMBASE */
diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
index 85ea4e8d99..893afc8eeb 100644
index 36c5c13a7b..a2887d6057 100644
--- a/tests/qtest/meson.build
+++ b/tests/qtest/meson.build
@@ -94,7 +94,6 @@ qtests_i386 = \
@@ -101,7 +101,6 @@ qtests_i386 = \
'drive_del-test',
'tco-test',
'cpu-plug-test',
@ -233,62 +253,11 @@ index 85ea4e8d99..893afc8eeb 100644
'vmgenid-test',
'migration-test',
'test-x86-cpuid-compat',
@@ -223,7 +222,6 @@ qtests_s390x = \
(config_host.has_key('CONFIG_POSIX') ? ['test-filter-redirector'] : []) + \
['boot-serial-test',
'drive_del-test',
- 'device-plug-test',
'virtio-ccw-test',
'cpu-plug-test',
'migration-test']
diff --git a/tests/qtest/tco-test.c b/tests/qtest/tco-test.c
index 0547d41173..3756ce82d8 100644
--- a/tests/qtest/tco-test.c
+++ b/tests/qtest/tco-test.c
@@ -60,7 +60,7 @@ static void test_init(TestData *d)
QTestState *qs;
qs = qtest_initf("-machine q35 %s %s",
- d->noreboot ? "-global ICH9-LPC.noreboot=true" : "",
+ d->noreboot ? "" : "-global ICH9-LPC.noreboot=false",
!d->args ? "" : d->args);
qtest_irq_intercept_in(qs, "ioapic");
diff --git a/tests/qtest/usb-hcd-xhci-test.c b/tests/qtest/usb-hcd-xhci-test.c
index 10ef9d2a91..3855873050 100644
--- a/tests/qtest/usb-hcd-xhci-test.c
+++ b/tests/qtest/usb-hcd-xhci-test.c
@@ -21,6 +21,7 @@ static void test_xhci_hotplug(void)
usb_test_hotplug(global_qtest, "xhci", "1", NULL);
}
+#if 0 /* Disabled for Red Hat Enterprise Linux */
static void test_usb_uas_hotplug(void)
{
QTestState *qts = global_qtest;
@@ -36,6 +37,7 @@ static void test_usb_uas_hotplug(void)
qtest_qmp_device_del(qts, "scsihd");
qtest_qmp_device_del(qts, "uas");
}
+#endif
static void test_usb_ccid_hotplug(void)
{
@@ -56,7 +58,9 @@ int main(int argc, char **argv)
qtest_add_func("/xhci/pci/init", test_xhci_init);
qtest_add_func("/xhci/pci/hotplug", test_xhci_hotplug);
+#if 0 /* Disabled for Red Hat Enterprise Linux */
qtest_add_func("/xhci/pci/hotplug/usb-uas", test_usb_uas_hotplug);
+#endif
qtest_add_func("/xhci/pci/hotplug/usb-ccid", test_usb_ccid_hotplug);
qtest_start("-device nec-usb-xhci,id=xhci"
diff --git a/tests/qtest/virtio-net-failover.c b/tests/qtest/virtio-net-failover.c
index 4a809590bf..1bf3fa641c 100644
index 73dfabc272..a9dd304781 100644
--- a/tests/qtest/virtio-net-failover.c
+++ b/tests/qtest/virtio-net-failover.c
@@ -25,6 +25,7 @@
@@ -26,6 +26,7 @@
#define PCI_SEL_BASE 0x0010
#define BASE_MACHINE "-M q35 -nodefaults " \
@ -297,5 +266,5 @@ index 4a809590bf..1bf3fa641c 100644
"-device pcie-root-port,id=root1,addr=0x2,bus=pcie.0,chassis=2 "
--
2.39.1
2.39.3
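
Note on the testenv.py hunk in the patch above: it appends a fixed CPU model for x86_64 so that the iotests do not run on a CPU type that is deprecated in RHEL. A minimal standalone sketch of that option-building logic follows. It is not the code from testenv.py: the function name build_qemu_options, the base option string and the two-entry MACHINE_MAP are assumptions made for illustration, and only the two endswith() checks mirror the hunk.

```python
# Hypothetical stand-in for the suffix-to-machine table; the real table
# in testenv.py is not shown in the hunk.
MACHINE_MAP = (
    ("aarch64", "virt"),
    ("arm", "virt"),
)


def build_qemu_options(qemu_prog: str, base: str = "-nodefaults") -> str:
    """Return the option string for a given QEMU binary path."""
    options = base
    # Pick a machine type based on the binary name suffix.
    for suffix, machine in MACHINE_MAP:
        if qemu_prog.endswith(f"qemu-system-{suffix}"):
            options += f" -machine {machine}"
    # Downstream addition shown in the hunk: force a CPU model on x86_64.
    if qemu_prog.endswith("qemu-system-x86_64"):
        options += " -cpu Nehalem"
    return options


if __name__ == "__main__":
    print(build_qemu_options("/usr/bin/qemu-system-x86_64"))
```

Binaries that match neither check keep the base options unchanged.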

@ -1,4 +1,4 @@
From 34cb4f7ddd762ec46ed1a6a4261aebde39360ca4 Mon Sep 17 00:00:00 2001
From e06a905d726fc20ea6bd95dff1bd0ffe97ebb202 Mon Sep 17 00:00:00 2001
From: Bandan Das <bsd@redhat.com>
Date: Tue, 3 Dec 2013 20:05:13 +0100
Subject: vfio: cap number of devices that can be assigned
@ -26,16 +26,19 @@ Count of slots increased to 509 later so we could increase limit
to 64 as some usecases require more than 32 devices.
Signed-off-by: Bandan Das <bsd@redhat.com>
Rebase changes (8.2.0):
- Update to upstream changes
---
hw/vfio/pci.c | 29 ++++++++++++++++++++++++++++-
hw/vfio/pci.c | 31 ++++++++++++++++++++++++++++++-
hw/vfio/pci.h | 1 +
2 files changed, 29 insertions(+), 1 deletion(-)
2 files changed, 31 insertions(+), 1 deletion(-)
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index ec9a854361..a779053be3 100644
index 64780d1b79..57ac63c10c 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -48,6 +48,9 @@
@@ -50,6 +50,9 @@
/* Protected by BQL */
static KVMRouteChange vfio_route_change;
@ -45,13 +48,19 @@ index ec9a854361..a779053be3 100644
static void vfio_disable_interrupts(VFIOPCIDevice *vdev);
static void vfio_mmap_set_enabled(VFIOPCIDevice *vdev, bool enabled);
static void vfio_msi_disable_common(VFIOPCIDevice *vdev);
@@ -2854,9 +2857,30 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
ssize_t len;
struct stat st;
int groupid;
@@ -2946,13 +2949,36 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
ERRP_GUARD();
VFIOPCIDevice *vdev = VFIO_PCI(pdev);
VFIODevice *vbasedev = &vdev->vbasedev;
+ VFIODevice *vbasedev_iter;
+ VFIOGroup *group;
char *tmp, *subsys;
Error *err = NULL;
- int i, ret;
+ int ret, i = 0;
bool is_mdev;
char uuid[UUID_STR_LEN];
char *name;
+ if (device_limit && device_limit != vdev->assigned_device_limit) {
+ error_setg(errp, "Assigned device limit has been redefined. "
@ -74,10 +83,10 @@ index ec9a854361..a779053be3 100644
+ return;
+ }
+
if (!vbasedev->sysfsdev) {
if (vbasedev->fd < 0 && !vbasedev->sysfsdev) {
if (!(~vdev->host.domain || ~vdev->host.bus ||
~vdev->host.slot || ~vdev->host.function)) {
@@ -3294,6 +3318,9 @@ static Property vfio_pci_dev_properties[] = {
@@ -3370,6 +3396,9 @@ static Property vfio_pci_dev_properties[] = {
DEFINE_PROP_BOOL("x-no-kvm-msix", VFIOPCIDevice, no_kvm_msix, false),
DEFINE_PROP_BOOL("x-no-geforce-quirks", VFIOPCIDevice,
no_geforce_quirks, false),
@ -88,10 +97,10 @@ index ec9a854361..a779053be3 100644
false),
DEFINE_PROP_BOOL("x-no-vfio-ioeventfd", VFIOPCIDevice, no_vfio_ioeventfd,
diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index 177abcc8fb..45235d38ba 100644
index 6e64a2654e..b7de39c010 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -140,6 +140,7 @@ struct VFIOPCIDevice {
@@ -142,6 +142,7 @@ struct VFIOPCIDevice {
EventNotifier err_notifier;
EventNotifier req_notifier;
int (*resetfn)(struct VFIOPCIDevice *);
@ -100,5 +109,5 @@ index 177abcc8fb..45235d38ba 100644
uint32_t device_id;
uint32_t sub_vendor_id;
--
2.39.1
2.39.3
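
Note on the vfio patch above: most of the added check is elided in this view, so the following is only a rough model of the behaviour the commit message describes (a cap on assigned devices, with an error if the cap is redefined). The class VfioLimitModel, the exception type, the default of 64 taken from the commit message, and the wording of the second error are all assumptions. Only the "redefined" condition mirrors the visible C lines.

```python
class DeviceLimitError(Exception):
    """Raised when a device cannot be attached (hypothetical type)."""


class VfioLimitModel:
    def __init__(self):
        self.device_limit = 0      # 0 means "no limit recorded yet"
        self.attached = []

    def realize(self, name, assigned_device_limit=64):
        # Mirrors the visible check: a limit that was already recorded
        # must not change between devices.
        if self.device_limit and self.device_limit != assigned_device_limit:
            raise DeviceLimitError("Assigned device limit has been redefined.")
        self.device_limit = assigned_device_limit
        # Assumed behaviour: refuse the device once the cap is reached.
        if len(self.attached) >= assigned_device_limit:
            raise DeviceLimitError("device limit reached")
        self.attached.append(name)


if __name__ == "__main__":
    model = VfioLimitModel()
    model.realize("0000:01:00.0", assigned_device_limit=2)
    print(model.attached)
```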

@ -1,4 +1,4 @@
From 8964a3e8835992442902d35b011a708787366d82 Mon Sep 17 00:00:00 2001
From b467dc6a24ef41fa574260429807711f6802a54d Mon Sep 17 00:00:00 2001
From: Eduardo Habkost <ehabkost@redhat.com>
Date: Wed, 4 Dec 2013 18:53:17 +0100
Subject: Add support statement to -help output
@ -17,14 +17,14 @@ as unsupported by Red Hat, and advising users to use libvirt instead.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
---
softmmu/vl.c | 9 +++++++++
system/vl.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/softmmu/vl.c b/softmmu/vl.c
index ea20b23e4c..ad4173138d 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -834,9 +834,17 @@ static void version(void)
diff --git a/system/vl.c b/system/vl.c
index c644222982..03c3b0aa94 100644
--- a/system/vl.c
+++ b/system/vl.c
@@ -869,9 +869,17 @@ static void version(void)
QEMU_COPYRIGHT "\n");
}
@ -42,7 +42,7 @@ index ea20b23e4c..ad4173138d 100644
printf("usage: %s [options] [disk_image]\n\n"
"'disk_image' is a raw hard disk image for IDE hard disk 0\n\n",
g_get_prgname());
@@ -862,6 +870,7 @@ static void help(int exitcode)
@@ -897,6 +905,7 @@ static void help(int exitcode)
"\n"
QEMU_HELP_BOTTOM "\n");
@ -51,5 +51,5 @@ index ea20b23e4c..ad4173138d 100644
}
--
2.39.1
2.39.3

@ -1,4 +1,4 @@
From 0b72d348fa0714de641ee242e5cee97df006e8fd Mon Sep 17 00:00:00 2001
From 20cc3a6d9bce3e40d165f865b5e398c300cae7bf Mon Sep 17 00:00:00 2001
From: Miroslav Rezanina <mrezanin@redhat.com>
Date: Wed, 8 Jul 2020 08:35:50 +0200
Subject: Use qemu-kvm in documentation instead of qemu-system-<arch>
@ -36,10 +36,10 @@ index 52d6454b93..d74dbdeca9 100644
.. |I2C| replace:: I\ :sup:`2`\ C
.. |I2S| replace:: I\ :sup:`2`\ S
diff --git a/qemu-options.hx b/qemu-options.hx
index 59bdf67a2c..52b49f1f6a 100644
index 8ce85d4559..4fc27ee2e2 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3296,11 +3296,11 @@ SRST
@@ -3493,11 +3493,11 @@ SRST
::
@ -57,5 +57,5 @@ index 59bdf67a2c..52b49f1f6a 100644
``-netdev vhost-vdpa[,vhostdev=/path/to/dev][,vhostfd=h]``
Establish a vhost-vdpa netdev.
--
2.39.1
2.39.3

@ -1,4 +1,4 @@
From bd6bcebfd783fa49e283d035d378fb5240423d84 Mon Sep 17 00:00:00 2001
From 2f9fdd21ecf2810d0d83a8125ce0cc1e75dbb13a Mon Sep 17 00:00:00 2001
From: Kevin Wolf <kwolf@redhat.com>
Date: Fri, 20 Aug 2021 18:25:12 +0200
Subject: qcow2: Deprecation warning when opening v2 images rw
@ -44,10 +44,10 @@ Rebase notes (6.1.0):
2 files changed, 7 insertions(+)
diff --git a/block/qcow2.c b/block/qcow2.c
index 30fd53fa64..22084730f9 100644
index 956128b409..0e8b2f7518 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1337,6 +1337,12 @@ qcow2_do_open(BlockDriverState *bs, QDict *options, int flags,
@@ -1358,6 +1358,12 @@ qcow2_do_open(BlockDriverState *bs, QDict *options, int flags,
ret = -ENOTSUP;
goto fail;
}
@ -61,7 +61,7 @@ index 30fd53fa64..22084730f9 100644
s->qcow_version = header.version;
diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter
index 6b32c7fbfa..6ddda2ee64 100644
index 2846c83808..83472953a2 100644
--- a/tests/qemu-iotests/common.filter
+++ b/tests/qemu-iotests/common.filter
@@ -83,6 +83,7 @@ _filter_qemu()
@ -73,5 +73,5 @@ index 6b32c7fbfa..6ddda2ee64 100644
}
--
2.39.1
2.39.3

@ -0,0 +1,121 @@
From 59470e8ab849f22b407f55292e540e16a8cad01a Mon Sep 17 00:00:00 2001
From: Miroslav Rezanina <mrezanin@redhat.com>
Date: Wed, 20 Mar 2024 05:34:32 -0400
Subject: Add upstream compatibility bits
Add a new compat structure for changes introduced during the rebase to QEMU 9.0.0.
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
---
Rebase notes (9.0.0 rc2):
- Add aw-bits setting for aarch compat record (overwritten for 9.4 and older)
---
hw/arm/virt.c | 3 +++
hw/core/machine.c | 10 ++++++++++
hw/i386/pc_piix.c | 3 ++-
hw/i386/pc_q35.c | 3 +++
hw/s390x/s390-virtio-ccw.c | 1 +
include/hw/boards.h | 3 +++
6 files changed, 22 insertions(+), 1 deletion(-)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 22bc345137..f1af9495c6 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -144,6 +144,8 @@ GlobalProperty arm_rhel_compat[] = {
{"virtio-net-pci", "romfile", "" },
{"virtio-net-pci-transitional", "romfile", "" },
{"virtio-net-pci-non-transitional", "romfile", "" },
+ /* arm_rhel_compat from arm_virt_compat, added for 9.0.0 rebase */
+ { TYPE_VIRTIO_IOMMU_PCI, "aw-bits", "48" },
};
const size_t arm_rhel_compat_len = G_N_ELEMENTS(arm_rhel_compat);
@@ -3728,6 +3730,7 @@ type_init(rhel_machine_init);
static void rhel940_virt_options(MachineClass *mc)
{
+ compat_props_add(mc->compat_props, hw_compat_rhel_9_5, hw_compat_rhel_9_5_len);
}
DEFINE_RHEL_MACHINE_AS_LATEST(9, 4, 0)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 695cb89a46..0f256d9633 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -302,6 +302,16 @@ const size_t hw_compat_2_1_len = G_N_ELEMENTS(hw_compat_2_1);
const char *rhel_old_machine_deprecation =
"machine types for previous major releases are deprecated";
+GlobalProperty hw_compat_rhel_9_5[] = {
+ /* hw_compat_rhel_9_5 from hw_compat_8_2 */
+ { "migration", "zero-page-detection", "legacy"},
+ /* hw_compat_rhel_9_5 from hw_compat_8_2 */
+ { TYPE_VIRTIO_IOMMU_PCI, "granule", "4k" },
+ /* hw_compat_rhel_9_5 from hw_compat_8_2 */
+ { TYPE_VIRTIO_IOMMU_PCI, "aw-bits", "64" },
+};
+const size_t hw_compat_rhel_9_5_len = G_N_ELEMENTS(hw_compat_rhel_9_5);
+
GlobalProperty hw_compat_rhel_9_4[] = {
/* hw_compat_rhel_9_4 from hw_compat_8_0 */
{ TYPE_VIRTIO_NET, "host_uso", "off"},
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index a647262d63..6b260682eb 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -1015,7 +1015,8 @@ static void pc_machine_rhel760_options(MachineClass *m)
object_class_property_set_description(oc, "x-south-bridge",
"Use a different south bridge than PIIX3");
-
+ compat_props_add(m->compat_props, hw_compat_rhel_9_5,
+ hw_compat_rhel_9_5_len);
compat_props_add(m->compat_props, hw_compat_rhel_9_4,
hw_compat_rhel_9_4_len);
compat_props_add(m->compat_props, hw_compat_rhel_9_3,
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index e872dc7e46..2b54944c0f 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -733,6 +733,9 @@ static void pc_q35_machine_rhel940_options(MachineClass *m)
m->desc = "RHEL-9.4.0 PC (Q35 + ICH9, 2009)";
pcmc->smbios_stream_product = "RHEL";
pcmc->smbios_stream_version = "9.4.0";
+
+ compat_props_add(m->compat_props, hw_compat_rhel_9_5,
+ hw_compat_rhel_9_5_len);
}
DEFINE_PC_MACHINE(q35_rhel940, "pc-q35-rhel9.4.0", pc_q35_init_rhel940,
diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index ff753a29e0..9ad54682c6 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -1282,6 +1282,7 @@ static void ccw_machine_rhel940_instance_options(MachineState *machine)
static void ccw_machine_rhel940_class_options(MachineClass *mc)
{
+ compat_props_add(mc->compat_props, hw_compat_rhel_9_5, hw_compat_rhel_9_5_len);
}
DEFINE_CCW_MACHINE(rhel940, "rhel9.4.0", true);
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 46b8725c41..cca62f906b 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -514,6 +514,9 @@ extern const size_t hw_compat_2_2_len;
extern GlobalProperty hw_compat_2_1[];
extern const size_t hw_compat_2_1_len;
+extern GlobalProperty hw_compat_rhel_9_5[];
+extern const size_t hw_compat_rhel_9_5_len;
+
extern GlobalProperty hw_compat_rhel_9_4[];
extern const size_t hw_compat_rhel_9_4_len;
--
2.39.3
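
The layering that this patch sets up can be pictured with a small stand-alone model. This is only a sketch under stated assumptions: `CompatList`, `compat_list_add` and `compat_list_lookup` are hypothetical stand-ins, not QEMU's `compat_props_add()` machinery, and the last-match-wins lookup is a choice made for the sketch. The three table entries are copied from the hunk above, with TYPE_VIRTIO_IOMMU_PCI written out as the string "virtio-iommu-pci" (assumed).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for QEMU's GlobalProperty triple. */
typedef struct {
    const char *driver;
    const char *property;
    const char *value;
} GlobalProperty;

#define N_ELEMENTS(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Entries copied from the hw_compat_rhel_9_5 hunk; the device type
 * string is an assumption standing in for TYPE_VIRTIO_IOMMU_PCI. */
static const GlobalProperty hw_compat_rhel_9_5[] = {
    { "migration", "zero-page-detection", "legacy" },
    { "virtio-iommu-pci", "granule", "4k" },
    { "virtio-iommu-pci", "aw-bits", "64" },
};

/* Hypothetical fixed-size list that a machine type accumulates. */
#define MAX_COMPAT_PROPS 32
typedef struct {
    const GlobalProperty *items[MAX_COMPAT_PROPS];
    size_t len;
} CompatList;

/* Append every entry of one compat table to the machine's list. */
static void compat_list_add(CompatList *list,
                            const GlobalProperty props[], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(list->len < MAX_COMPAT_PROPS);
        list->items[list->len++] = &props[i];
    }
}

/* Return the value of the last matching entry, or NULL if none matches. */
static const char *compat_list_lookup(const CompatList *list,
                                      const char *driver,
                                      const char *property)
{
    for (size_t i = list->len; i > 0; i--) {
        const GlobalProperty *p = list->items[i - 1];
        if (strcmp(p->driver, driver) == 0 &&
            strcmp(p->property, property) == 0) {
            return p->value;
        }
    }
    return NULL;
}
```

A machine type for an older release would call `compat_list_add()` once per compat table, newest table first, which mirrors the order of the `compat_props_add()` calls in the pc_piix.c hunk.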

@ -1,53 +0,0 @@
From 78a42cf27aa519bb71214443ab570b40e156fa9c Mon Sep 17 00:00:00 2001
From: Kfir Manor <kfir@daynix.com>
Date: Sun, 22 Jan 2023 17:33:07 +0200
Subject: qga/linux: add usb support to guest-get-fsinfo
RH-Author: Kostiantyn Kostiuk <kkostiuk@redhat.com>
RH-MergeRequest: 140: qga/linux: add usb support to guest-get-fsinfo
RH-Bugzilla: 2149191
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: yvugenfi <None>
RH-Commit: [1/1] bae929a2d0d0ad20e7308ede69c26499fc2119c7 (kostyanf14/redhat_centos-stream_src_qemu-kvm)
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2149191
Upstream patch: https://patchew.org/QEMU/20230122153307.1050593-1-kfir@daynix.com/
Signed-off-by: Kfir Manor <kfir@daynix.com>
Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
Patch-name: kvm-qga-linux-add-usb-support-to-guest-get-fsinfo.patch
Patch-id: 72
Patch-present-in-specfile: True
---
qga/commands-posix.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/qga/commands-posix.c b/qga/commands-posix.c
index 079689d79a..97754930c1 100644
--- a/qga/commands-posix.c
+++ b/qga/commands-posix.c
@@ -879,7 +879,9 @@ static bool build_guest_fsinfo_for_pci_dev(char const *syspath,
g_str_equal(driver, "sym53c8xx") ||
g_str_equal(driver, "virtio-pci") ||
g_str_equal(driver, "ahci") ||
- g_str_equal(driver, "nvme"))) {
+ g_str_equal(driver, "nvme") ||
+ g_str_equal(driver, "xhci_hcd") ||
+ g_str_equal(driver, "ehci-pci"))) {
break;
}
@@ -976,6 +978,8 @@ static bool build_guest_fsinfo_for_pci_dev(char const *syspath,
}
} else if (strcmp(driver, "nvme") == 0) {
disk->bus_type = GUEST_DISK_BUS_TYPE_NVME;
+ } else if (strcmp(driver, "ehci-pci") == 0 || strcmp(driver, "xhci_hcd") == 0) {
+ disk->bus_type = GUEST_DISK_BUS_TYPE_USB;
} else {
g_debug("unknown driver '%s' (sysfs path '%s')", driver, syspath);
goto cleanup;
--
2.39.1
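
The driver-name check added by this patch can be read as a small classification function. The sketch below is a stand-alone illustration, not the guest agent code: `SketchBusType` and `sketch_bus_type_for_driver` are hypothetical names, and only the three driver strings shown in the hunk are handled.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical reduced bus-type enum; the real guest agent has more values. */
typedef enum {
    SKETCH_BUS_UNKNOWN,
    SKETCH_BUS_NVME,
    SKETCH_BUS_USB,
} SketchBusType;

/* Map a sysfs driver name to a bus type, following the branches in the hunk:
 * "nvme" is NVMe, both USB host controller drivers map to USB. */
static SketchBusType sketch_bus_type_for_driver(const char *driver)
{
    assert(driver != NULL);
    if (strcmp(driver, "nvme") == 0) {
        return SKETCH_BUS_NVME;
    }
    if (strcmp(driver, "ehci-pci") == 0 || strcmp(driver, "xhci_hcd") == 0) {
        return SKETCH_BUS_USB;
    }
    return SKETCH_BUS_UNKNOWN;
}
```

Note that the two USB driver names use different separators (`ehci-pci` with a hyphen, `xhci_hcd` with an underscore), exactly as they appear in the patch.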

@ -1,110 +0,0 @@
From bd5d81d2865c239ffea0fecf32476732149ad05c Mon Sep 17 00:00:00 2001
From: Miroslav Rezanina <mrezanin@redhat.com>
Date: Wed, 15 Feb 2023 02:03:17 -0500
Subject: Add RHEL 9.2.0 compat structure
Adding compatibility bits necessary to keep 9.2.0 machine
types same after rebase to 8.0.
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
Rebase notes (8.0.0 rc4):
- Added migration.x-preempt-pre-7-2 compat
---
hw/arm/virt.c | 1 +
hw/core/machine.c | 10 ++++++++++
hw/i386/pc_piix.c | 2 ++
hw/i386/pc_q35.c | 3 +++
hw/s390x/s390-virtio-ccw.c | 1 +
include/hw/boards.h | 3 +++
6 files changed, 20 insertions(+)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 1ae1654be5..9be53e9355 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -3669,6 +3669,7 @@ type_init(rhel_machine_init);
static void rhel920_virt_options(MachineClass *mc)
{
compat_props_add(mc->compat_props, arm_rhel_compat, arm_rhel_compat_len);
+ compat_props_add(mc->compat_props, hw_compat_rhel_9_2, hw_compat_rhel_9_2_len);
}
DEFINE_RHEL_MACHINE_AS_LATEST(9, 2, 0)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 5aa567fad3..0e0120b7f2 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -52,6 +52,16 @@ const size_t hw_compat_7_2_len = G_N_ELEMENTS(hw_compat_7_2);
const char *rhel_old_machine_deprecation =
"machine types for previous major releases are deprecated";
+GlobalProperty hw_compat_rhel_9_2[] = {
+ /* hw_compat_rhel_9_2 from hw_compat_7_2 */
+ { "e1000e", "migrate-timadj", "off" },
+ /* hw_compat_rhel_9_2 from hw_compat_7_2 */
+ { "virtio-mem", "x-early-migration", "false" },
+ /* hw_compat_rhel_9_2 from hw_compat_7_2 */
+ { "migration", "x-preempt-pre-7-2", "true" },
+};
+const size_t hw_compat_rhel_9_2_len = G_N_ELEMENTS(hw_compat_rhel_9_2);
+
/*
* Mostly the same as hw_compat_7_0
*/
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 3e330fd36f..90fb6e2e03 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -947,6 +947,8 @@ static void pc_machine_rhel760_options(MachineClass *m)
/* From pc_i440fx_5_1_machine_options() */
pcmc->pci_root_uid = 1;
pcmc->enforce_amd_1tb_hole = false;
+ compat_props_add(m->compat_props, hw_compat_rhel_9_2,
+ hw_compat_rhel_9_2_len);
compat_props_add(m->compat_props, hw_compat_rhel_9_1,
hw_compat_rhel_9_1_len);
compat_props_add(m->compat_props, hw_compat_rhel_9_0,
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 98601bb76f..8945b69175 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -701,6 +701,9 @@ static void pc_q35_machine_rhel920_options(MachineClass *m)
m->desc = "RHEL-9.2.0 PC (Q35 + ICH9, 2009)";
pcmc->smbios_stream_product = "RHEL";
pcmc->smbios_stream_version = "9.2.0";
+
+ compat_props_add(m->compat_props, hw_compat_rhel_9_2,
+ hw_compat_rhel_9_2_len);
}
DEFINE_PC_MACHINE(q35_rhel920, "pc-q35-rhel9.2.0", pc_q35_init_rhel920,
diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index dcd3b966b0..6a0b93c63d 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -1211,6 +1211,7 @@ static void ccw_machine_rhel920_instance_options(MachineState *machine)
static void ccw_machine_rhel920_class_options(MachineClass *mc)
{
+ compat_props_add(mc->compat_props, hw_compat_rhel_9_2, hw_compat_rhel_9_2_len);
}
DEFINE_CCW_MACHINE(rhel920, "rhel9.2.0", true);
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 5e7446ee40..5f08bd7550 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -461,6 +461,9 @@ extern const size_t hw_compat_2_2_len;
extern GlobalProperty hw_compat_2_1[];
extern const size_t hw_compat_2_1_len;
+extern GlobalProperty hw_compat_rhel_9_2[];
+extern const size_t hw_compat_rhel_9_2_len;
+
extern GlobalProperty hw_compat_rhel_9_1[];
extern const size_t hw_compat_rhel_9_1_len;
--
2.39.1

@ -0,0 +1,30 @@
From ba574acacf679850e337ec2d5e7836b8277cf393 Mon Sep 17 00:00:00 2001
From: Sebastian Ott <sebott@redhat.com>
Date: Thu, 18 Apr 2024 15:04:28 +0200
Subject: x86: rhel 9.4.0 machine type compat fix
Fix up the compatibility for 9.4.0. Ensure that pc-q35-rhel9.4.0
still uses SMBIOS 3.X by default.
Signed-off-by: Sebastian Ott <sebott@redhat.com>
---
hw/i386/pc_q35.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 2b54944c0f..2f11f9af7d 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -734,6 +734,9 @@ static void pc_q35_machine_rhel940_options(MachineClass *m)
pcmc->smbios_stream_product = "RHEL";
pcmc->smbios_stream_version = "9.4.0";
+ /* From pc_q35_8_2_machine_options() - use SMBIOS 3.X by default */
+ pcmc->default_smbios_ep_type = SMBIOS_ENTRY_POINT_TYPE_64;
+
compat_props_add(m->compat_props, hw_compat_rhel_9_5,
hw_compat_rhel_9_5_len);
}
--
2.39.3

@ -1,76 +0,0 @@
From c6eaf73adda2e87fe91c9a3836f45dd58a553e06 Mon Sep 17 00:00:00 2001
From: Thomas Huth <thuth@redhat.com>
Date: Mon, 27 Mar 2023 15:14:03 +0200
Subject: redhat: hw/i386/pc: Update x86 machine type compatibility for QEMU
8.0.0 update
Add pc_rhel_9_2_compat based on upstream pc_compat_7_2.
Signed-off-by: Thomas Huth <thuth@redhat.com>
---
hw/i386/pc.c | 6 ++++++
hw/i386/pc_piix.c | 2 ++
hw/i386/pc_q35.c | 2 ++
include/hw/i386/pc.h | 3 +++
4 files changed, 13 insertions(+)
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 8abb1f872e..f216922cee 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -429,6 +429,12 @@ GlobalProperty pc_rhel_compat[] = {
};
const size_t pc_rhel_compat_len = G_N_ELEMENTS(pc_rhel_compat);
+GlobalProperty pc_rhel_9_2_compat[] = {
+ /* pc_rhel_9_2_compat from pc_compat_7_2 */
+ { "ICH9-LPC", "noreboot", "true" },
+};
+const size_t pc_rhel_9_2_compat_len = G_N_ELEMENTS(pc_rhel_9_2_compat);
+
GlobalProperty pc_rhel_9_0_compat[] = {
/* pc_rhel_9_0_compat from pc_compat_6_2 */
{ "virtio-mem", "unplugged-inaccessible", "off" },
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 90fb6e2e03..fc704d783f 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -949,6 +949,8 @@ static void pc_machine_rhel760_options(MachineClass *m)
pcmc->enforce_amd_1tb_hole = false;
compat_props_add(m->compat_props, hw_compat_rhel_9_2,
hw_compat_rhel_9_2_len);
+ compat_props_add(m->compat_props, pc_rhel_9_2_compat,
+ pc_rhel_9_2_compat_len);
compat_props_add(m->compat_props, hw_compat_rhel_9_1,
hw_compat_rhel_9_1_len);
compat_props_add(m->compat_props, hw_compat_rhel_9_0,
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 8945b69175..e97655616a 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -704,6 +704,8 @@ static void pc_q35_machine_rhel920_options(MachineClass *m)
compat_props_add(m->compat_props, hw_compat_rhel_9_2,
hw_compat_rhel_9_2_len);
+ compat_props_add(m->compat_props, pc_rhel_9_2_compat,
+ pc_rhel_9_2_compat_len);
}
DEFINE_PC_MACHINE(q35_rhel920, "pc-q35-rhel9.2.0", pc_q35_init_rhel920,
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 4376f64a47..d218ad1628 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -296,6 +296,9 @@ int pc_machine_kvm_type(MachineState *machine, const char *vm_type);
extern GlobalProperty pc_rhel_compat[];
extern const size_t pc_rhel_compat_len;
+extern GlobalProperty pc_rhel_9_2_compat[];
+extern const size_t pc_rhel_9_2_compat_len;
+
extern GlobalProperty pc_rhel_9_0_compat[];
extern const size_t pc_rhel_9_0_compat_len;
--
2.39.1

@ -1,83 +0,0 @@
From 8173d2eabaf77312d36b00c618f6770948b80593 Mon Sep 17 00:00:00 2001
From: Miroslav Rezanina <mrezanin@redhat.com>
Date: Mon, 17 Apr 2023 01:24:18 -0400
Subject: Disable unwanted new devices
QEMU 8.0 adds two new devices that we do not want to support and that
can't be disabled using a configure switch.
1) ide-cf - virtual CompactFlash card
2) i2c-echo - testing echo device
Use manual disabling of the device by changing code (1) and meson configs (2).
Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
---
hw/ide/qdev.c | 9 +++++++++
hw/misc/meson.build | 3 ++-
2 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c
index 1b3b4da01d..454bfa5783 100644
--- a/hw/ide/qdev.c
+++ b/hw/ide/qdev.c
@@ -283,10 +283,13 @@ static void ide_cd_realize(IDEDevice *dev, Error **errp)
ide_dev_initfn(dev, IDE_CD, errp);
}
+/* Disabled for Red Hat Enterprise Linux */
+#if 0
static void ide_cf_realize(IDEDevice *dev, Error **errp)
{
ide_dev_initfn(dev, IDE_CFATA, errp);
}
+#endif
#define DEFINE_IDE_DEV_PROPERTIES() \
DEFINE_BLOCK_PROPERTIES(IDEDrive, dev.conf), \
@@ -346,6 +349,8 @@ static const TypeInfo ide_cd_info = {
.class_init = ide_cd_class_init,
};
+/* Disabled for Red Hat Enterprise Linux */
+#if 0
static Property ide_cf_properties[] = {
DEFINE_IDE_DEV_PROPERTIES(),
DEFINE_BLOCK_CHS_PROPERTIES(IDEDrive, dev.conf),
@@ -371,6 +376,7 @@ static const TypeInfo ide_cf_info = {
.instance_size = sizeof(IDEDrive),
.class_init = ide_cf_class_init,
};
+#endif
static void ide_device_class_init(ObjectClass *klass, void *data)
{
@@ -396,7 +402,10 @@ static void ide_register_types(void)
type_register_static(&ide_bus_info);
type_register_static(&ide_hd_info);
type_register_static(&ide_cd_info);
+/* Disabled for Red Hat Enterprise Linux */
+#if 0
type_register_static(&ide_cf_info);
+#endif
type_register_static(&ide_device_type_info);
}
diff --git a/hw/misc/meson.build b/hw/misc/meson.build
index a40245ad44..9cc5a61ed7 100644
--- a/hw/misc/meson.build
+++ b/hw/misc/meson.build
@@ -128,7 +128,8 @@ softmmu_ss.add(when: 'CONFIG_NRF51_SOC', if_true: files('nrf51_rng.c'))
softmmu_ss.add(when: 'CONFIG_GRLIB', if_true: files('grlib_ahb_apb_pnp.c'))
-softmmu_ss.add(when: 'CONFIG_I2C', if_true: files('i2c-echo.c'))
+# Disabled for Red Hat Enterprise Linux
+# softmmu_ss.add(when: 'CONFIG_I2C', if_true: files('i2c-echo.c'))
specific_ss.add(when: 'CONFIG_AVR_POWER', if_true: files('avr_power.c'))
--
2.39.1

@ -28,7 +28,7 @@ avocado_qemu tests:
The avocado_qemu tests can be executed by running the following avocado command:
avocado run -p qemu_bin=/usr/libexec/qemu-kvm /usr/lib64/qemu-kvm/tests/acceptance/
Avocado needs to be installed separately using either pip or from source as
Avocado is not being packaged for RHEL-8.
Avocado is not being packaged for RHEL.
qemu-iotests:
symlinks to corresponding binaries need to be created for QEMU_PROG,
@ -36,4 +36,4 @@ QEMU_IO_PROG, QEMU_IMG_PROG, and QEMU_NBD_PROG before the iotests can be
executed.
The primary purpose of this package is to make these tests available to be
executed as gating tests for the virt module in the RHEL-8 OSCI environment.
executed as gating tests for the qemu-kvm in the RHEL OSCI environment.

@ -0,0 +1,139 @@
From 93ea86ac8849ad9ca365b1646313dde9a34ba59c Mon Sep 17 00:00:00 2001
From: Xiaoyao Li <xiaoyao.li@intel.com>
Date: Wed, 20 Mar 2024 03:39:03 -0500
Subject: [PATCH 031/100] HostMem: Add mechanism to opt in kvm guest memfd via
MachineState
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [31/91] 43ce32aef954479cdb736301d1adcb919602c321 (bonzini/rhel-qemu-kvm)
Add a new member "guest_memfd" to memory backends. When it's set
to true, it enables RAM_GUEST_MEMFD in ram_flags, thus private kvm
guest_memfd will be allocated during RAMBlock allocation.
The memory backend's @guest_memfd is wired to the @require_guest_memfd
field of MachineState. This avoids looking up the machine in physmem.c.
MachineState::require_guest_memfd is supposed to be set by any VM
that requires KVM guest memfd as private memory, e.g., a TDX VM.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <20240320083945.991426-8-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 37662d85b0b7dded0ebdf6747bef6c3bb7ed6a0c)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
backends/hostmem-file.c | 1 +
backends/hostmem-memfd.c | 1 +
backends/hostmem-ram.c | 1 +
backends/hostmem.c | 1 +
hw/core/machine.c | 5 +++++
include/hw/boards.h | 2 ++
include/sysemu/hostmem.h | 1 +
7 files changed, 12 insertions(+)
diff --git a/backends/hostmem-file.c b/backends/hostmem-file.c
index ac3e433cbd..3c69db7946 100644
--- a/backends/hostmem-file.c
+++ b/backends/hostmem-file.c
@@ -85,6 +85,7 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
ram_flags |= fb->readonly ? RAM_READONLY_FD : 0;
ram_flags |= fb->rom == ON_OFF_AUTO_ON ? RAM_READONLY : 0;
ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
+ ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD : 0;
ram_flags |= fb->is_pmem ? RAM_PMEM : 0;
ram_flags |= RAM_NAMED_FILE;
return memory_region_init_ram_from_file(&backend->mr, OBJECT(backend), name,
diff --git a/backends/hostmem-memfd.c b/backends/hostmem-memfd.c
index 3923ea9364..745ead0034 100644
--- a/backends/hostmem-memfd.c
+++ b/backends/hostmem-memfd.c
@@ -55,6 +55,7 @@ memfd_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
name = host_memory_backend_get_name(backend);
ram_flags = backend->share ? RAM_SHARED : 0;
ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
+ ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD : 0;
return memory_region_init_ram_from_fd(&backend->mr, OBJECT(backend), name,
backend->size, ram_flags, fd, 0, errp);
}
diff --git a/backends/hostmem-ram.c b/backends/hostmem-ram.c
index d121249f0f..f7d81af783 100644
--- a/backends/hostmem-ram.c
+++ b/backends/hostmem-ram.c
@@ -30,6 +30,7 @@ ram_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
name = host_memory_backend_get_name(backend);
ram_flags = backend->share ? RAM_SHARED : 0;
ram_flags |= backend->reserve ? 0 : RAM_NORESERVE;
+ ram_flags |= backend->guest_memfd ? RAM_GUEST_MEMFD : 0;
return memory_region_init_ram_flags_nomigrate(&backend->mr, OBJECT(backend),
name, backend->size,
ram_flags, errp);
diff --git a/backends/hostmem.c b/backends/hostmem.c
index 81a72ce40b..eb9682b4a8 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -277,6 +277,7 @@ static void host_memory_backend_init(Object *obj)
/* TODO: convert access to globals to compat properties */
backend->merge = machine_mem_merge(machine);
backend->dump = machine_dump_guest_core(machine);
+ backend->guest_memfd = machine_require_guest_memfd(machine);
backend->reserve = true;
backend->prealloc_threads = machine->smp.cpus;
}
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 92609aae27..07b994e136 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -1480,6 +1480,11 @@ bool machine_mem_merge(MachineState *machine)
return machine->mem_merge;
}
+bool machine_require_guest_memfd(MachineState *machine)
+{
+ return machine->require_guest_memfd;
+}
+
static char *cpu_slot_to_string(const CPUArchId *cpu)
{
GString *s = g_string_new(NULL);
diff --git a/include/hw/boards.h b/include/hw/boards.h
index cca62f906b..815a1c4b26 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -36,6 +36,7 @@ bool machine_usb(MachineState *machine);
int machine_phandle_start(MachineState *machine);
bool machine_dump_guest_core(MachineState *machine);
bool machine_mem_merge(MachineState *machine);
+bool machine_require_guest_memfd(MachineState *machine);
HotpluggableCPUList *machine_query_hotpluggable_cpus(MachineState *machine);
void machine_set_cpu_numa_node(MachineState *machine,
const CpuInstanceProperties *props,
@@ -372,6 +373,7 @@ struct MachineState {
char *dt_compatible;
bool dump_guest_core;
bool mem_merge;
+ bool require_guest_memfd;
bool usb;
bool usb_disabled;
char *firmware;
diff --git a/include/sysemu/hostmem.h b/include/sysemu/hostmem.h
index 0e411aaa29..04b884bf42 100644
--- a/include/sysemu/hostmem.h
+++ b/include/sysemu/hostmem.h
@@ -74,6 +74,7 @@ struct HostMemoryBackend {
uint64_t size;
bool merge, dump, use_canonical_path;
bool prealloc, is_mapped, share, reserve;
+ bool guest_memfd;
uint32_t prealloc_threads;
ThreadContext *prealloc_context;
DECLARE_BITMAP(host_nodes, MAX_NODES + 1);
--
2.39.3

@ -0,0 +1,203 @@
From c46ac3db0a4db60e667edeabc9ed451c6e8e0ccf Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini@redhat.com>
Date: Mon, 18 Mar 2024 14:41:33 -0400
Subject: [PATCH 020/100] KVM: remove kvm_arch_cpu_check_are_resettable
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [20/91] d7745bd1a0ed1b215847f150f4a1bb2e912beabc (bonzini/rhel-qemu-kvm)
Board reset requires writing a fresh CPU state. As far as KVM is
concerned, the only thing that blocks reset is that CPU state is
encrypted; therefore, kvm_cpus_are_resettable() can simply check
if that is the case.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit a99c0c66ebe7d8db3af6f16689ade9375247e43e)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
accel/kvm/kvm-accel-ops.c | 2 +-
accel/kvm/kvm-all.c | 5 -----
include/sysemu/kvm.h | 10 ----------
target/arm/kvm.c | 5 -----
target/i386/kvm/kvm.c | 5 -----
target/loongarch/kvm/kvm.c | 5 -----
target/mips/kvm.c | 5 -----
target/ppc/kvm.c | 5 -----
target/riscv/kvm/kvm-cpu.c | 5 -----
target/s390x/kvm/kvm.c | 5 -----
10 files changed, 1 insertion(+), 51 deletions(-)
diff --git a/accel/kvm/kvm-accel-ops.c b/accel/kvm/kvm-accel-ops.c
index b3c946dc4b..74e3c5785b 100644
--- a/accel/kvm/kvm-accel-ops.c
+++ b/accel/kvm/kvm-accel-ops.c
@@ -82,7 +82,7 @@ static bool kvm_vcpu_thread_is_idle(CPUState *cpu)
static bool kvm_cpus_are_resettable(void)
{
- return !kvm_enabled() || kvm_cpu_check_are_resettable();
+ return !kvm_enabled() || !kvm_state->guest_state_protected;
}
#ifdef KVM_CAP_SET_GUEST_DEBUG
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index ec0f6df7c5..b51e09a583 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2696,11 +2696,6 @@ void kvm_flush_coalesced_mmio_buffer(void)
s->coalesced_flush_in_progress = false;
}
-bool kvm_cpu_check_are_resettable(void)
-{
- return kvm_arch_cpu_check_are_resettable();
-}
-
static void do_kvm_cpu_synchronize_state(CPUState *cpu, run_on_cpu_data arg)
{
if (!cpu->vcpu_dirty && !kvm_state->guest_state_protected) {
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index 302e8f6f1e..54f4d83a37 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -525,16 +525,6 @@ int kvm_get_one_reg(CPUState *cs, uint64_t id, void *target);
/* Notify resamplefd for EOI of specific interrupts. */
void kvm_resample_fd_notify(int gsi);
-/**
- * kvm_cpu_check_are_resettable - return whether CPUs can be reset
- *
- * Returns: true: CPUs are resettable
- * false: CPUs are not resettable
- */
-bool kvm_cpu_check_are_resettable(void);
-
-bool kvm_arch_cpu_check_are_resettable(void);
-
bool kvm_dirty_ring_enabled(void);
uint32_t kvm_dirty_ring_size(void);
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index ab85d628a8..21ebbf3b8f 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -1598,11 +1598,6 @@ int kvm_arch_msi_data_to_gsi(uint32_t data)
return (data - 32) & 0xffff;
}
-bool kvm_arch_cpu_check_are_resettable(void)
-{
- return true;
-}
-
static void kvm_arch_get_eager_split_size(Object *obj, Visitor *v,
const char *name, void *opaque,
Error **errp)
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index e271652620..a12207a8ee 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -5623,11 +5623,6 @@ bool kvm_has_waitpkg(void)
return has_msr_umwait;
}
-bool kvm_arch_cpu_check_are_resettable(void)
-{
- return !sev_es_enabled();
-}
-
#define ARCH_REQ_XCOMP_GUEST_PERM 0x1025
void kvm_request_xsave_components(X86CPU *cpu, uint64_t mask)
diff --git a/target/loongarch/kvm/kvm.c b/target/loongarch/kvm/kvm.c
index d630cc39cb..8224d94333 100644
--- a/target/loongarch/kvm/kvm.c
+++ b/target/loongarch/kvm/kvm.c
@@ -733,11 +733,6 @@ bool kvm_arch_stop_on_emulation_error(CPUState *cs)
return true;
}
-bool kvm_arch_cpu_check_are_resettable(void)
-{
- return true;
-}
-
int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
{
int ret = 0;
diff --git a/target/mips/kvm.c b/target/mips/kvm.c
index 6c52e59f55..a631ab544f 100644
--- a/target/mips/kvm.c
+++ b/target/mips/kvm.c
@@ -1273,11 +1273,6 @@ int kvm_arch_get_default_type(MachineState *machine)
return -1;
}
-bool kvm_arch_cpu_check_are_resettable(void)
-{
- return true;
-}
-
void kvm_arch_accel_class_init(ObjectClass *oc)
{
}
diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index 59f640cf7b..9d9d9f0d79 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -2968,11 +2968,6 @@ void kvmppc_set_reg_tb_offset(PowerPCCPU *cpu, int64_t tb_offset)
}
}
-bool kvm_arch_cpu_check_are_resettable(void)
-{
- return true;
-}
-
void kvm_arch_accel_class_init(ObjectClass *oc)
{
}
diff --git a/target/riscv/kvm/kvm-cpu.c b/target/riscv/kvm/kvm-cpu.c
index 6a6c6cae80..49d2f3ad58 100644
--- a/target/riscv/kvm/kvm-cpu.c
+++ b/target/riscv/kvm/kvm-cpu.c
@@ -1475,11 +1475,6 @@ void kvm_riscv_set_irq(RISCVCPU *cpu, int irq, int level)
}
}
-bool kvm_arch_cpu_check_are_resettable(void)
-{
- return true;
-}
-
static int aia_mode;
static const char *kvm_aia_mode_str(uint64_t mode)
diff --git a/target/s390x/kvm/kvm.c b/target/s390x/kvm/kvm.c
index 55fb4855b1..4db59658e1 100644
--- a/target/s390x/kvm/kvm.c
+++ b/target/s390x/kvm/kvm.c
@@ -2630,11 +2630,6 @@ void kvm_s390_stop_interrupt(S390CPU *cpu)
kvm_s390_vcpu_interrupt(cpu, &irq);
}
-bool kvm_arch_cpu_check_are_resettable(void)
-{
- return true;
-}
-
int kvm_s390_get_zpci_op(void)
{
return cap_zpci_op;
--
2.39.3

@ -0,0 +1,127 @@
From 50399796da938c4ea7c69058fde84695bce9d794 Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini@redhat.com>
Date: Mon, 18 Mar 2024 14:41:10 -0400
Subject: [PATCH 019/100] KVM: track whether guest state is encrypted
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [19/91] 685b9c54d43d0043d15c33d13afc3a420cbe139b (bonzini/rhel-qemu-kvm)
So far, KVM has allowed KVM_GET/SET_* ioctls to execute even if the
guest state is encrypted, in which case they do nothing. For the new
API using VM types, instead, the ioctls will fail which is a safer and
more robust approach.
The new API will be the only one available for SEV-SNP and TDX, but it
is also usable for SEV and SEV-ES. In preparation for that, require
architecture-specific KVM code to communicate the point at which guest
state is protected (which must be after kvm_cpu_synchronize_post_init(),
though that might change in the future in order to support migration).
From that point, skip reading registers so that cpu->vcpu_dirty is
never true: if it ever becomes true, kvm_arch_put_registers() will
fail miserably.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 5c3131c392f84c660033d511ec39872d8beb4b1e)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
accel/kvm/kvm-all.c | 17 ++++++++++++++---
include/sysemu/kvm.h | 2 ++
include/sysemu/kvm_int.h | 1 +
target/i386/sev.c | 1 +
4 files changed, 18 insertions(+), 3 deletions(-)
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 931f74256e..ec0f6df7c5 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -2703,7 +2703,7 @@ bool kvm_cpu_check_are_resettable(void)
static void do_kvm_cpu_synchronize_state(CPUState *cpu, run_on_cpu_data arg)
{
- if (!cpu->vcpu_dirty) {
+ if (!cpu->vcpu_dirty && !kvm_state->guest_state_protected) {
int ret = kvm_arch_get_registers(cpu);
if (ret) {
error_report("Failed to get registers: %s", strerror(-ret));
@@ -2717,7 +2717,7 @@ static void do_kvm_cpu_synchronize_state(CPUState *cpu, run_on_cpu_data arg)
void kvm_cpu_synchronize_state(CPUState *cpu)
{
- if (!cpu->vcpu_dirty) {
+ if (!cpu->vcpu_dirty && !kvm_state->guest_state_protected) {
run_on_cpu(cpu, do_kvm_cpu_synchronize_state, RUN_ON_CPU_NULL);
}
}
@@ -2752,7 +2752,13 @@ static void do_kvm_cpu_synchronize_post_init(CPUState *cpu, run_on_cpu_data arg)
void kvm_cpu_synchronize_post_init(CPUState *cpu)
{
- run_on_cpu(cpu, do_kvm_cpu_synchronize_post_init, RUN_ON_CPU_NULL);
+ if (!kvm_state->guest_state_protected) {
+ /*
+ * This runs before the machine_init_done notifiers, and is the last
+ * opportunity to synchronize the state of confidential guests.
+ */
+ run_on_cpu(cpu, do_kvm_cpu_synchronize_post_init, RUN_ON_CPU_NULL);
+ }
}
static void do_kvm_cpu_synchronize_pre_loadvm(CPUState *cpu, run_on_cpu_data arg)
@@ -4099,3 +4105,8 @@ void query_stats_schemas_cb(StatsSchemaList **result, Error **errp)
query_stats_schema_vcpu(first_cpu, &stats_args);
}
}
+
+void kvm_mark_guest_state_protected(void)
+{
+ kvm_state->guest_state_protected = true;
+}
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index fad9a7e8ff..302e8f6f1e 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -539,6 +539,8 @@ bool kvm_dirty_ring_enabled(void);
uint32_t kvm_dirty_ring_size(void);
+void kvm_mark_guest_state_protected(void);
+
/**
* kvm_hwpoisoned_mem - indicate if there is any hwpoisoned page
* reported for the VM.
diff --git a/include/sysemu/kvm_int.h b/include/sysemu/kvm_int.h
index 882e37e12c..3496be7997 100644
--- a/include/sysemu/kvm_int.h
+++ b/include/sysemu/kvm_int.h
@@ -87,6 +87,7 @@ struct KVMState
bool kernel_irqchip_required;
OnOffAuto kernel_irqchip_split;
bool sync_mmu;
+ bool guest_state_protected;
uint64_t manual_dirty_log_protect;
/* The man page (and posix) say ioctl numbers are signed int, but
* they're not. Linux, glibc and *BSD all treat ioctl numbers as
diff --git a/target/i386/sev.c b/target/i386/sev.c
index b8f79d34d1..c49a8fd55e 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -755,6 +755,7 @@ sev_launch_get_measure(Notifier *notifier, void *unused)
if (ret) {
exit(1);
}
+ kvm_mark_guest_state_protected();
}
/* query the measurement blob length */
--
2.39.3

@ -0,0 +1,329 @@
From f4b01d645926faab2cab86fadb7398c26d6b8285 Mon Sep 17 00:00:00 2001
From: Xiaoyao Li <xiaoyao.li@intel.com>
Date: Wed, 20 Mar 2024 03:39:02 -0500
Subject: [PATCH 028/100] RAMBlock: Add support of KVM private guest memfd
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [28/91] 95fdf196afcb67113834c20fa354ee1397411bfd (bonzini/rhel-qemu-kvm)
Add KVM guest_memfd support to RAMBlock so both normal hva based memory
and kvm guest memfd based private memory can be associated in one RAMBlock.
Introduce new flag RAM_GUEST_MEMFD. When it's set, it calls KVM ioctl to
create private guest_memfd during RAMBlock setup.
Allocating a new RAM_GUEST_MEMFD flag to instruct the setup of guest memfd
is more flexible and extensible than simply relying on the VM type because
in the future we may have the case that not all the memory of a VM needs
guest memfd. As a benefit, it also avoids getting MachineState in the
memory subsystem.
Note, RAM_GUEST_MEMFD is supposed to be set for memory backends of
confidential guests, such as TDX VM. How and when to set it for memory
backends will be implemented in the following patches.
Introduce memory_region_has_guest_memfd() to query if the MemoryRegion has
KVM guest_memfd allocated.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-ID: <20240320083945.991426-7-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 15f7a80c49cb3637f62fa37fa4a17da913bd91ff)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
accel/kvm/kvm-all.c | 28 ++++++++++++++++++++++++++++
accel/stubs/kvm-stub.c | 5 +++++
include/exec/memory.h | 20 +++++++++++++++++---
include/exec/ram_addr.h | 2 +-
include/exec/ramblock.h | 1 +
include/sysemu/kvm.h | 2 ++
system/memory.c | 5 +++++
system/physmem.c | 34 +++++++++++++++++++++++++++++++---
8 files changed, 90 insertions(+), 7 deletions(-)
diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 272e945f52..a7b9a127dd 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -92,6 +92,7 @@ static bool kvm_has_guest_debug;
static int kvm_sstep_flags;
static bool kvm_immediate_exit;
static uint64_t kvm_supported_memory_attributes;
+static bool kvm_guest_memfd_supported;
static hwaddr kvm_max_slot_size = ~0;
static const KVMCapabilityInfo kvm_required_capabilites[] = {
@@ -2419,6 +2420,11 @@ static int kvm_init(MachineState *ms)
}
kvm_supported_memory_attributes = kvm_check_extension(s, KVM_CAP_MEMORY_ATTRIBUTES);
+ kvm_guest_memfd_supported =
+ kvm_check_extension(s, KVM_CAP_GUEST_MEMFD) &&
+ kvm_check_extension(s, KVM_CAP_USER_MEMORY2) &&
+ (kvm_supported_memory_attributes & KVM_MEMORY_ATTRIBUTE_PRIVATE);
+
kvm_immediate_exit = kvm_check_extension(s, KVM_CAP_IMMEDIATE_EXIT);
s->nr_slots = kvm_check_extension(s, KVM_CAP_NR_MEMSLOTS);
@@ -4138,3 +4144,25 @@ void kvm_mark_guest_state_protected(void)
{
kvm_state->guest_state_protected = true;
}
+
+int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp)
+{
+ int fd;
+ struct kvm_create_guest_memfd guest_memfd = {
+ .size = size,
+ .flags = flags,
+ };
+
+ if (!kvm_guest_memfd_supported) {
+ error_setg(errp, "KVM does not support guest_memfd");
+ return -1;
+ }
+
+ fd = kvm_vm_ioctl(kvm_state, KVM_CREATE_GUEST_MEMFD, &guest_memfd);
+ if (fd < 0) {
+ error_setg_errno(errp, errno, "Error creating KVM guest_memfd");
+ return -1;
+ }
+
+ return fd;
+}
diff --git a/accel/stubs/kvm-stub.c b/accel/stubs/kvm-stub.c
index ca38172884..8e0eb22e61 100644
--- a/accel/stubs/kvm-stub.c
+++ b/accel/stubs/kvm-stub.c
@@ -129,3 +129,8 @@ bool kvm_hwpoisoned_mem(void)
{
return false;
}
+
+int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp)
+{
+ return -ENOSYS;
+}
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 8626a355b3..679a847685 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -243,6 +243,9 @@ typedef struct IOMMUTLBEvent {
/* RAM FD is opened read-only */
#define RAM_READONLY_FD (1 << 11)
+/* RAM can be private that has kvm guest memfd backend */
+#define RAM_GUEST_MEMFD (1 << 12)
+
static inline void iommu_notifier_init(IOMMUNotifier *n, IOMMUNotify fn,
IOMMUNotifierFlag flags,
hwaddr start, hwaddr end,
@@ -1307,7 +1310,8 @@ bool memory_region_init_ram_nomigrate(MemoryRegion *mr,
* @name: Region name, becomes part of RAMBlock name used in migration stream
* must be unique within any device
* @size: size of the region.
- * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_NORESERVE.
+ * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_NORESERVE,
+ * RAM_GUEST_MEMFD.
* @errp: pointer to Error*, to store an error if it happens.
*
* Note that this function does not do anything to cause the data in the
@@ -1369,7 +1373,7 @@ bool memory_region_init_resizeable_ram(MemoryRegion *mr,
* (getpagesize()) will be used.
* @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
* RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
- * RAM_READONLY_FD
+ * RAM_READONLY_FD, RAM_GUEST_MEMFD
* @path: the path in which to allocate the RAM.
* @offset: offset within the file referenced by path
* @errp: pointer to Error*, to store an error if it happens.
@@ -1399,7 +1403,7 @@ bool memory_region_init_ram_from_file(MemoryRegion *mr,
* @size: size of the region.
* @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
* RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
- * RAM_READONLY_FD
+ * RAM_READONLY_FD, RAM_GUEST_MEMFD
* @fd: the fd to mmap.
* @offset: offset within the file referenced by fd
* @errp: pointer to Error*, to store an error if it happens.
@@ -1722,6 +1726,16 @@ static inline bool memory_region_is_romd(MemoryRegion *mr)
*/
bool memory_region_is_protected(MemoryRegion *mr);
+/**
+ * memory_region_has_guest_memfd: check whether a memory region has guest_memfd
+ * associated
+ *
+ * Returns %true if a memory region's ram_block has valid guest_memfd assigned.
+ *
+ * @mr: the memory region being queried
+ */
+bool memory_region_has_guest_memfd(MemoryRegion *mr);
+
/**
* memory_region_get_iommu: check whether a memory region is an iommu
*
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index de45ba7bc9..07c8f86375 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -110,7 +110,7 @@ long qemu_maxrampagesize(void);
* @mr: the memory region where the ram block is
* @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
* RAM_NORESERVE, RAM_PROTECTED, RAM_NAMED_FILE, RAM_READONLY,
- * RAM_READONLY_FD
+ * RAM_READONLY_FD, RAM_GUEST_MEMFD
* @mem_path or @fd: specify the backing file or device
* @offset: Offset into target file
* @errp: pointer to Error*, to store an error if it happens
diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
index 848915ea5b..459c8917de 100644
--- a/include/exec/ramblock.h
+++ b/include/exec/ramblock.h
@@ -41,6 +41,7 @@ struct RAMBlock {
QLIST_HEAD(, RAMBlockNotifier) ramblock_notifiers;
int fd;
uint64_t fd_offset;
+ int guest_memfd;
size_t page_size;
/* dirty bitmap used during migration */
unsigned long *bmap;
diff --git a/include/sysemu/kvm.h b/include/sysemu/kvm.h
index f114ff6986..9e4ab7ae89 100644
--- a/include/sysemu/kvm.h
+++ b/include/sysemu/kvm.h
@@ -537,6 +537,8 @@ void kvm_mark_guest_state_protected(void);
*/
bool kvm_hwpoisoned_mem(void);
+int kvm_create_guest_memfd(uint64_t size, uint64_t flags, Error **errp);
+
int kvm_set_memory_attributes_private(hwaddr start, uint64_t size);
int kvm_set_memory_attributes_shared(hwaddr start, uint64_t size);
diff --git a/system/memory.c b/system/memory.c
index a229a79988..c756950c0c 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -1850,6 +1850,11 @@ bool memory_region_is_protected(MemoryRegion *mr)
return mr->ram && (mr->ram_block->flags & RAM_PROTECTED);
}
+bool memory_region_has_guest_memfd(MemoryRegion *mr)
+{
+ return mr->ram_block && mr->ram_block->guest_memfd >= 0;
+}
+
uint8_t memory_region_get_dirty_log_mask(MemoryRegion *mr)
{
uint8_t mask = mr->dirty_log_mask;
diff --git a/system/physmem.c b/system/physmem.c
index a4fe3d2bf8..f5dfa20e57 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -1808,6 +1808,7 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
const bool shared = qemu_ram_is_shared(new_block);
RAMBlock *block;
RAMBlock *last_block = NULL;
+ bool free_on_error = false;
ram_addr_t old_ram_size, new_ram_size;
Error *err = NULL;
@@ -1837,6 +1838,19 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
return;
}
memory_try_enable_merging(new_block->host, new_block->max_length);
+ free_on_error = true;
+ }
+ }
+
+ if (new_block->flags & RAM_GUEST_MEMFD) {
+ assert(kvm_enabled());
+ assert(new_block->guest_memfd < 0);
+
+ new_block->guest_memfd = kvm_create_guest_memfd(new_block->max_length,
+ 0, errp);
+ if (new_block->guest_memfd < 0) {
+ qemu_mutex_unlock_ramlist();
+ goto out_free;
}
}
@@ -1888,6 +1902,13 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
ram_block_notify_add(new_block->host, new_block->used_length,
new_block->max_length);
}
+ return;
+
+out_free:
+ if (free_on_error) {
+ qemu_anon_ram_free(new_block->host, new_block->max_length);
+ new_block->host = NULL;
+ }
}
#ifdef CONFIG_POSIX
@@ -1902,7 +1923,7 @@ RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
/* Just support these ram flags by now. */
assert((ram_flags & ~(RAM_SHARED | RAM_PMEM | RAM_NORESERVE |
RAM_PROTECTED | RAM_NAMED_FILE | RAM_READONLY |
- RAM_READONLY_FD)) == 0);
+ RAM_READONLY_FD | RAM_GUEST_MEMFD)) == 0);
if (xen_enabled()) {
error_setg(errp, "-mem-path not supported with Xen");
@@ -1939,6 +1960,7 @@ RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, MemoryRegion *mr,
new_block->used_length = size;
new_block->max_length = size;
new_block->flags = ram_flags;
+ new_block->guest_memfd = -1;
new_block->host = file_ram_alloc(new_block, size, fd, !file_size, offset,
errp);
if (!new_block->host) {
@@ -2018,7 +2040,7 @@ RAMBlock *qemu_ram_alloc_internal(ram_addr_t size, ram_addr_t max_size,
int align;
assert((ram_flags & ~(RAM_SHARED | RAM_RESIZEABLE | RAM_PREALLOC |
- RAM_NORESERVE)) == 0);
+ RAM_NORESERVE | RAM_GUEST_MEMFD)) == 0);
assert(!host ^ (ram_flags & RAM_PREALLOC));
align = qemu_real_host_page_size();
@@ -2033,6 +2055,7 @@ RAMBlock *qemu_ram_alloc_internal(ram_addr_t size, ram_addr_t max_size,
new_block->max_length = max_size;
assert(max_size >= size);
new_block->fd = -1;
+ new_block->guest_memfd = -1;
new_block->page_size = qemu_real_host_page_size();
new_block->host = host;
new_block->flags = ram_flags;
@@ -2055,7 +2078,7 @@ RAMBlock *qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
RAMBlock *qemu_ram_alloc(ram_addr_t size, uint32_t ram_flags,
MemoryRegion *mr, Error **errp)
{
- assert((ram_flags & ~(RAM_SHARED | RAM_NORESERVE)) == 0);
+ assert((ram_flags & ~(RAM_SHARED | RAM_NORESERVE | RAM_GUEST_MEMFD)) == 0);
return qemu_ram_alloc_internal(size, size, NULL, NULL, ram_flags, mr, errp);
}
@@ -2083,6 +2106,11 @@ static void reclaim_ramblock(RAMBlock *block)
} else {
qemu_anon_ram_free(block->host, block->max_length);
}
+
+ if (block->guest_memfd >= 0) {
+ close(block->guest_memfd);
+ }
+
g_free(block);
}
--
2.39.3
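
Note on the patch above: both allocation paths set guest_memfd to -1 before ram_block_add() runs, and memory_region_has_guest_memfd() treats any value >= 0 as "assigned". The sentinel matters because 0 is a valid file descriptor, so a zero-initialized block would otherwise look as if it owned one. The following is a minimal sketch of that convention only; the Toy* types and toy_* functions are hypothetical stand-ins, not QEMU's definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for RAMBlock; only the field discussed here. */
typedef struct ToyRamBlock {
    int guest_memfd;            /* -1 means "no guest_memfd assigned" */
} ToyRamBlock;

/* Hypothetical stand-in for MemoryRegion. */
typedef struct ToyMemoryRegion {
    ToyRamBlock *ram_block;     /* NULL when the region has no RAM block */
} ToyMemoryRegion;

/* Mirror of the "new_block->guest_memfd = -1" initialization in the patch. */
static void toy_ram_block_init(ToyRamBlock *block)
{
    block->guest_memfd = -1;
}

/* Same predicate shape as memory_region_has_guest_memfd() in the patch. */
static bool toy_region_has_guest_memfd(const ToyMemoryRegion *mr)
{
    return mr->ram_block && mr->ram_block->guest_memfd >= 0;
}
```

With this shape, a region without a RAM block and a region whose block never received a descriptor both report false, and only a block holding a real descriptor reports true.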

@ -0,0 +1,82 @@
From bd289293604d6f33e9fb89196f0b19117ce81f89 Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini@redhat.com>
Date: Wed, 20 Mar 2024 17:45:29 +0100
Subject: [PATCH 032/100] RAMBlock: make guest_memfd require uncoordinated
discard
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [32/91] 0c005849026c334737b88cbd20a0ac237dfca37e (bonzini/rhel-qemu-kvm)
Some subsystems like VFIO might disable ram block discard, but guest_memfd
uses discard operations to implement conversions between private and
shared memory. Because of this, sequences like the following can result
in stale IOMMU mappings:
1. allocate shared page
2. convert page shared->private
3. discard shared page
4. convert page private->shared
5. allocate shared page
6. issue DMA operations against that shared page
This is not a use-after-free, because after step 3 VFIO is still pinning
the page. However, DMA operations in step 6 will hit the old mapping
that was allocated in step 1.
Address this by taking ram_block_discard_is_enabled() into account when
deciding whether or not to discard pages.
Since kvm_convert_memory()/guest_memfd doesn't implement a
RamDiscardManager handler to convey and replay discard operations,
this is a case of uncoordinated discard, which is blocked/released
by ram_block_discard_require(). Interestingly, this function had
no use so far.
Alternative approaches would be to block discard of shared pages, but
this would cause guests to consume twice the memory if they use VFIO;
or to implement a RamDiscardManager and only block uncoordinated
discard, i.e. use ram_block_coordinated_discard_require().
[Commit message mostly by Michael Roth <michael.roth@amd.com>]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 852f0048f3ea9f14de18eb279a99fccb6d250e8f)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
system/physmem.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/system/physmem.c b/system/physmem.c
index f5dfa20e57..5ebcf5be11 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -1846,6 +1846,13 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
assert(kvm_enabled());
assert(new_block->guest_memfd < 0);
+ if (ram_block_discard_require(true) < 0) {
+ error_setg_errno(errp, errno,
+ "cannot set up private guest memory: discard currently blocked");
+ error_append_hint(errp, "Are you using assigned devices?\n");
+ goto out_free;
+ }
+
new_block->guest_memfd = kvm_create_guest_memfd(new_block->max_length,
0, errp);
if (new_block->guest_memfd < 0) {
@@ -2109,6 +2116,7 @@ static void reclaim_ramblock(RAMBlock *block)
if (block->guest_memfd >= 0) {
close(block->guest_memfd);
+ ram_block_discard_require(false);
}
g_free(block);
--
2.39.3
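
Note on the patch above: the commit message describes discard as something one party can require (guest_memfd) while another can disable it (for example VFIO), with the two requests being mutually exclusive. The sketch below is a toy model of that mutual exclusion using two counters. It is an assumption about the general shape of the mechanism, not a copy of QEMU's implementation, which is not shown in this patch and also distinguishes coordinated from uncoordinated discard. All toy_* names are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy counters; hypothetical, not QEMU's variables. */
static int toy_disabled_cnt;    /* users that forbid discard */
static int toy_required_cnt;    /* users that rely on discard */

/* Forbid (state=true) or re-allow (state=false) discard. */
static int toy_discard_disable(bool state)
{
    if (!state) {
        toy_disabled_cnt--;
        return 0;
    }
    if (toy_required_cnt > 0) {
        return -EBUSY;          /* someone already depends on discard */
    }
    toy_disabled_cnt++;
    return 0;
}

/* Declare (state=true) or drop (state=false) a dependency on discard. */
static int toy_discard_require(bool state)
{
    if (!state) {
        toy_required_cnt--;
        return 0;
    }
    if (toy_disabled_cnt > 0) {
        return -EBUSY;          /* someone already forbade discard */
    }
    toy_required_cnt++;
    return 0;
}
```

In this model, whichever side registers first wins and the other side gets -EBUSY until the first one releases. That matches the patch pairing ram_block_discard_require(true) in ram_block_add() with ram_block_discard_require(false) in reclaim_ramblock().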

@ -0,0 +1,67 @@
From d4e6f7105b00ba2536d5d733b7c03116f28ce116 Mon Sep 17 00:00:00 2001
From: Stefan Hajnoczi <stefanha@redhat.com>
Date: Mon, 6 May 2024 15:06:21 -0400
Subject: [PATCH 2/5] Revert "monitor: use aio_co_reschedule_self()"
RH-Author: Kevin Wolf <kwolf@redhat.com>
RH-MergeRequest: 248: Revert "monitor: use aio_co_reschedule_self()"
RH-Jira: RHEL-34618 RHEL-38697
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [1/2] b6a2ebd4a69dbcd2bd56c61e7c747f8f8f42337e (kmwolf/centos-qemu-kvm)
Commit 1f25c172f837 ("monitor: use aio_co_reschedule_self()") was a code
cleanup that uses aio_co_reschedule_self() instead of open coding
coroutine rescheduling.
Bug RHEL-34618 was reported and Kevin Wolf <kwolf@redhat.com> identified
the root cause. I missed that aio_co_reschedule_self() ->
qemu_get_current_aio_context() only knows about
qemu_aio_context/IOThread AioContexts and not about iohandler_ctx. It
does not function correctly when going back from the iohandler_ctx to
qemu_aio_context.
Go back to open coding the AioContext transitions to avoid this bug.
This reverts commit 1f25c172f83704e350c0829438d832384084a74d.
Cc: qemu-stable@nongnu.org
Buglink: https://issues.redhat.com/browse/RHEL-34618
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240506190622.56095-2-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 719c6819ed9a9838520fa732f9861918dc693bda)
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
qapi/qmp-dispatch.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/qapi/qmp-dispatch.c b/qapi/qmp-dispatch.c
index f3488afeef..176b549473 100644
--- a/qapi/qmp-dispatch.c
+++ b/qapi/qmp-dispatch.c
@@ -212,7 +212,8 @@ QDict *coroutine_mixed_fn qmp_dispatch(const QmpCommandList *cmds, QObject *requ
* executing the command handler so that it can make progress if it
* involves an AIO_WAIT_WHILE().
*/
- aio_co_reschedule_self(qemu_get_aio_context());
+ aio_co_schedule(qemu_get_aio_context(), qemu_coroutine_self());
+ qemu_coroutine_yield();
}
monitor_set_cur(qemu_coroutine_self(), cur_mon);
@@ -226,7 +227,9 @@ QDict *coroutine_mixed_fn qmp_dispatch(const QmpCommandList *cmds, QObject *requ
* Move back to iohandler_ctx so that nested event loops for
* qemu_aio_context don't start new monitor commands.
*/
- aio_co_reschedule_self(iohandler_get_aio_context());
+ aio_co_schedule(iohandler_get_aio_context(),
+ qemu_coroutine_self());
+ qemu_coroutine_yield();
}
} else {
/*
--
2.39.3

@ -0,0 +1,38 @@
From bcbc897cb19b3a6523de611f48f6bac6cea16c97 Mon Sep 17 00:00:00 2001
From: Sebastian Ott <sebott@redhat.com>
Date: Thu, 2 May 2024 13:17:03 +0200
Subject: [PATCH 2/2] Revert "x86: rhel 9.4.0 machine type compat fix"
RH-Author: Sebastian Ott <sebott@redhat.com>
RH-MergeRequest: 237: Revert "x86: rhel 9.4.0 machine type compat fix"
RH-Jira: RHEL-30362
RH-Acked-by: Ani Sinha <anisinha@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [1/1] 858ec153e65e96c39ca4db17ed93fd58c77dc2eb (seott1/cos-qemu-kvm)
This reverts commit c46e44f0f4e861fe412ce679b0b0204881c1c2f5.
pc-q35-rhel9.4.0 and newer should stay with SMBIOS_ENTRY_POINT_TYPE_AUTO.
Signed-off-by: Sebastian Ott <sebott@redhat.com>
---
hw/i386/pc_q35.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 2f11f9af7d..2b54944c0f 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -734,9 +734,6 @@ static void pc_q35_machine_rhel940_options(MachineClass *m)
pcmc->smbios_stream_product = "RHEL";
pcmc->smbios_stream_version = "9.4.0";
- /* From pc_q35_8_2_machine_options() - use SMBIOS 3.X by default */
- pcmc->default_smbios_ep_type = SMBIOS_ENTRY_POINT_TYPE_64;
-
compat_props_add(m->compat_props, hw_compat_rhel_9_5,
hw_compat_rhel_9_5_len);
}
--
2.39.3

@ -1,84 +0,0 @@
From 61256a82ce78f40222455becb8850b5f5ebb5d72 Mon Sep 17 00:00:00 2001
From: Igor Mammedov <imammedo@redhat.com>
Date: Tue, 18 Apr 2023 11:04:49 +0200
Subject: [PATCH 1/3] acpi: pcihp: allow repeating hot-unplug requests
RH-Author: Igor Mammedov <imammedo@redhat.com>
RH-MergeRequest: 159: acpi: pcihp: allow repeating hot-unplug requests
RH-Bugzilla: 2087047
RH-Acked-by: Ani Sinha <None>
RH-Acked-by: Julia Suvorova <None>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: MST <mst@redhat.com>
RH-Commit: [1/1] 9c597232466b27d91f127ee6004322d6ba69755f (imammedo/qemu-kvm-c-9-s-imam)
With Q35 using ACPI PCI hotplug by default, a user's request to unplug
a device is ignored when it is issued before the guest OS has booted.
Any additional attempt to request device hot-unplug afterwards
results in the following error:
"Device XYZ is already in the process of unplug"
Arguably, this can be considered a regression introduced by [2],
before which it was possible to issue an unplug request multiple
times.
Accept new unplug requests after a timeout (1ms). This brings ACPI PCI
hotplug on par with native PCIe unplug behavior [1] and allows the user
to repeat unplug requests at proper times.
Set the expire timeout to an arbitrary 1 msec so the user won't be able to
flood the guest with SCI interrupts by calling device_del in a tight loop.
PS:
The ACPI spec doesn't mandate what OSPM can do with GPEx.status
bits set before it's booted, so the behavior is implementation-dependent.
Status bits may be retained (I tested with one Windows version)
or cleared (Linux since 2.6 kernel times) during the guest's ACPI
subsystem initialization.
Clearing status bits (though not wrong per se) hides the unplug
event from the guest, and it's up to the user to repeat device_del later
when the guest is able to handle unplug requests.
1) 18416c62e3 ("pcie: expire pending delete")
2)
Fixes: cce8944cc9ef ("qdev-monitor: Forbid repeated device_del")
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
CC: mst@redhat.com
CC: anisinha@redhat.com
CC: jusual@redhat.com
CC: kraxel@redhat.com
Message-Id: <20230418090449.2155757-1-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
(cherry picked from commit 0f689cf5ada4d5df5ab95c7f7aa9fc221afa855d)
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
hw/acpi/pcihp.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/hw/acpi/pcihp.c b/hw/acpi/pcihp.c
index dcfb779a7a..cdd6f775a1 100644
--- a/hw/acpi/pcihp.c
+++ b/hw/acpi/pcihp.c
@@ -357,6 +357,16 @@ void acpi_pcihp_device_unplug_request_cb(HotplugHandler *hotplug_dev,
* acpi_pcihp_eject_slot() when the operation is completed.
*/
pdev->qdev.pending_deleted_event = true;
+ /* if unplug was requested before OSPM is initialized,
+ * linux kernel will clear GPE0.sts[] bits during boot, which effectively
+ * hides unplug event. And than followup qmp_device_del() calls remain
+ * blocked by above flag permanently.
+ * Unblock qmp_device_del() by setting expire limit, so user can
+ * repeat unplug request later when OSPM has been booted.
+ */
+ pdev->qdev.pending_deleted_expires_ms =
+ qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL); /* 1 msec */
+
s->acpi_pcihp_pci_status[bsel].down |= (1U << slot);
acpi_send_event(DEVICE(hotplug_dev), ACPI_PCI_HOTPLUG_STATUS);
}
--
2.39.1
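
Note on the patch above: the hunk only sets pending_deleted_expires_ms. The check that consumes it lives in the device_del path and is not part of this patch. The sketch below shows one plausible form of such a check, assuming that an expiry of 0 means "never expires" and that a request stops being blocked once the stored deadline is no longer in the future. The ToyDevice type and toy_unplug_blocked() are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two DeviceState fields used by the patch. */
typedef struct ToyDevice {
    bool pending_deleted_event;
    int64_t pending_deleted_expires_ms;   /* assumed: 0 means "never expires" */
} ToyDevice;

/* Returns true when a repeated unplug request should be rejected. */
static bool toy_unplug_blocked(const ToyDevice *dev, int64_t now_ms)
{
    return dev->pending_deleted_event &&
           (dev->pending_deleted_expires_ms == 0 ||
            dev->pending_deleted_expires_ms > now_ms);
}
```

Under these assumptions, a device with a pending unplug and no expiry stays blocked indefinitely, which is the behavior the commit message describes as the problem.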

@ -0,0 +1,64 @@
From 0e3934e89ad1dda21681f64ff38da69b07d1b531 Mon Sep 17 00:00:00 2001
From: Stefan Hajnoczi <stefanha@redhat.com>
Date: Mon, 6 May 2024 15:06:22 -0400
Subject: [PATCH 3/5] aio: warn about iohandler_ctx special casing
RH-Author: Kevin Wolf <kwolf@redhat.com>
RH-MergeRequest: 248: Revert "monitor: use aio_co_reschedule_self()"
RH-Jira: RHEL-34618 RHEL-38697
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [2/2] cc316d70b2c187ee0412d6560ca1a03e381a69c1 (kmwolf/centos-qemu-kvm)
The main loop has two AioContexts: qemu_aio_context and iohandler_ctx.
The main loop runs them both, but nested aio_poll() calls on
qemu_aio_context exclude iohandler_ctx.
Which one should qemu_get_current_aio_context() return when called from
the main loop? Document that it's always qemu_aio_context.
This has subtle effects on functions that use
qemu_get_current_aio_context(). For example, aio_co_reschedule_self()
does not work when moving from iohandler_ctx to qemu_aio_context because
qemu_get_current_aio_context() does not differentiate these two
AioContexts.
Document this in order to reduce the chance of future bugs.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240506190622.56095-3-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit e669e800fc9ef8806af5c5578249ab758a4f8a5a)
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
include/block/aio.h | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/include/block/aio.h b/include/block/aio.h
index 8378553eb9..4ee81936ed 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -629,6 +629,9 @@ void aio_co_schedule(AioContext *ctx, Coroutine *co);
*
* Move the currently running coroutine to new_ctx. If the coroutine is already
* running in new_ctx, do nothing.
+ *
+ * Note that this function cannot reschedule from iohandler_ctx to
+ * qemu_aio_context.
*/
void coroutine_fn aio_co_reschedule_self(AioContext *new_ctx);
@@ -661,6 +664,9 @@ void aio_co_enter(AioContext *ctx, Coroutine *co);
* If called from an IOThread this will be the IOThread's AioContext. If
* called from the main thread or with the "big QEMU lock" taken it
* will be the main loop AioContext.
+ *
+ * Note that the return value is never the main loop's iohandler_ctx and the
+ * return value is the main loop AioContext instead.
*/
AioContext *qemu_get_current_aio_context(void);
--
2.39.3

@ -1,55 +0,0 @@
From 5beea8b889a38aa59259679d7f1ba050f09eb0f0 Mon Sep 17 00:00:00 2001
From: Jon Maloy <jmaloy@redhat.com>
Date: Tue, 9 May 2023 10:29:03 -0400
Subject: [PATCH 12/21] apic: disable reentrancy detection for apic-msi
RH-Author: Jon Maloy <jmaloy@redhat.com>
RH-MergeRequest: 165: memory: prevent dma-reentracy issues
RH-Jira: RHEL-516
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [8/13] 329f3b1c02fc42d85c821dd14c70e6b885cf849a (jmaloy/jmaloy-qemu-kvm-2)
Jira: https://issues.redhat.com/browse/RHEL-516
Upstream: Merged
CVE: CVE-2023-2680
commit 50795ee051a342c681a9b45671c552fbd6274db8
Author: Alexander Bulekov <alxndr@bu.edu>
Date: Thu Apr 27 17:10:13 2023 -0400
apic: disable reentrancy detection for apic-msi
As the code is designed for re-entrant calls to apic-msi, mark apic-msi
as reentrancy-safe.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <20230427211013.2994127-9-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
---
hw/intc/apic.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/hw/intc/apic.c b/hw/intc/apic.c
index 20b5a94073..ac3d47d231 100644
--- a/hw/intc/apic.c
+++ b/hw/intc/apic.c
@@ -885,6 +885,13 @@ static void apic_realize(DeviceState *dev, Error **errp)
memory_region_init_io(&s->io_memory, OBJECT(s), &apic_io_ops, s, "apic-msi",
APIC_SPACE_SIZE);
+ /*
+ * apic-msi's apic_mem_write can call into ioapic_eoi_broadcast, which can
+ * write back to apic-msi. As such mark the apic-msi region re-entrancy
+ * safe.
+ */
+ s->io_memory.disable_reentrancy_guard = true;
+
s->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, apic_timer, s);
local_apics[s->id] = s;
--
2.39.3

@ -1,231 +0,0 @@
From f6db359f543723e2eb840653d35004af357ea5ac Mon Sep 17 00:00:00 2001
From: Jon Maloy <jmaloy@redhat.com>
Date: Tue, 9 May 2023 10:29:03 -0400
Subject: [PATCH 06/21] async: Add an optional reentrancy guard to the BH API
RH-Author: Jon Maloy <jmaloy@redhat.com>
RH-MergeRequest: 165: memory: prevent dma-reentracy issues
RH-Jira: RHEL-516
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [2/13] 009a9a68c1c25b9ad0cd9bc0d73b3e07bee2a19d (jmaloy/jmaloy-qemu-kvm-2)
Jira: https://issues.redhat.com/browse/RHEL-516
Upstream: Merged
CVE: CVE-2023-2680
commit 9c86c97f12c060bf7484dd931f38634e166a81f0
Author: Alexander Bulekov <alxndr@bu.edu>
Date: Thu Apr 27 17:10:07 2023 -0400
async: Add an optional reentrancy guard to the BH API
Devices can pass their MemoryReentrancyGuard (from their DeviceState),
when creating new BHes. Then, the async API will toggle the guard
before/after calling the BH call-back. This prevents bh->mmio reentrancy
issues.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <20230427211013.2994127-3-alxndr@bu.edu>
[thuth: Fix "line over 90 characters" checkpatch.pl error]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
---
docs/devel/multiple-iothreads.txt | 7 +++++++
include/block/aio.h | 18 ++++++++++++++++--
include/qemu/main-loop.h | 7 +++++--
tests/unit/ptimer-test-stubs.c | 3 ++-
util/async.c | 18 +++++++++++++++++-
util/main-loop.c | 6 ++++--
util/trace-events | 1 +
7 files changed, 52 insertions(+), 8 deletions(-)
diff --git a/docs/devel/multiple-iothreads.txt b/docs/devel/multiple-iothreads.txt
index 343120f2ef..a3e949f6b3 100644
--- a/docs/devel/multiple-iothreads.txt
+++ b/docs/devel/multiple-iothreads.txt
@@ -61,6 +61,7 @@ There are several old APIs that use the main loop AioContext:
* LEGACY qemu_aio_set_event_notifier() - monitor an event notifier
* LEGACY timer_new_ms() - create a timer
* LEGACY qemu_bh_new() - create a BH
+ * LEGACY qemu_bh_new_guarded() - create a BH with a device re-entrancy guard
* LEGACY qemu_aio_wait() - run an event loop iteration
Since they implicitly work on the main loop they cannot be used in code that
@@ -72,8 +73,14 @@ Instead, use the AioContext functions directly (see include/block/aio.h):
* aio_set_event_notifier() - monitor an event notifier
* aio_timer_new() - create a timer
* aio_bh_new() - create a BH
+ * aio_bh_new_guarded() - create a BH with a device re-entrancy guard
* aio_poll() - run an event loop iteration
+The qemu_bh_new_guarded/aio_bh_new_guarded APIs accept a "MemReentrancyGuard"
+argument, which is used to check for and prevent re-entrancy problems. For
+BHs associated with devices, the reentrancy-guard is contained in the
+corresponding DeviceState and named "mem_reentrancy_guard".
+
The AioContext can be obtained from the IOThread using
iothread_get_aio_context() or for the main loop using qemu_get_aio_context().
Code that takes an AioContext argument works both in IOThreads or the main
diff --git a/include/block/aio.h b/include/block/aio.h
index 543717f294..db6f23c619 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -23,6 +23,8 @@
#include "qemu/thread.h"
#include "qemu/timer.h"
#include "block/graph-lock.h"
+#include "hw/qdev-core.h"
+
typedef struct BlockAIOCB BlockAIOCB;
typedef void BlockCompletionFunc(void *opaque, int ret);
@@ -331,9 +333,11 @@ void aio_bh_schedule_oneshot_full(AioContext *ctx, QEMUBHFunc *cb, void *opaque,
* is opaque and must be allocated prior to its use.
*
* @name: A human-readable identifier for debugging purposes.
+ * @reentrancy_guard: A guard set when entering a cb to prevent
+ * device-reentrancy issues
*/
QEMUBH *aio_bh_new_full(AioContext *ctx, QEMUBHFunc *cb, void *opaque,
- const char *name);
+ const char *name, MemReentrancyGuard *reentrancy_guard);
/**
* aio_bh_new: Allocate a new bottom half structure
@@ -342,7 +346,17 @@ QEMUBH *aio_bh_new_full(AioContext *ctx, QEMUBHFunc *cb, void *opaque,
* string.
*/
#define aio_bh_new(ctx, cb, opaque) \
- aio_bh_new_full((ctx), (cb), (opaque), (stringify(cb)))
+ aio_bh_new_full((ctx), (cb), (opaque), (stringify(cb)), NULL)
+
+/**
+ * aio_bh_new_guarded: Allocate a new bottom half structure with a
+ * reentrancy_guard
+ *
+ * A convenience wrapper for aio_bh_new_full() that uses the cb as the name
+ * string.
+ */
+#define aio_bh_new_guarded(ctx, cb, opaque, guard) \
+ aio_bh_new_full((ctx), (cb), (opaque), (stringify(cb)), guard)
/**
* aio_notify: Force processing of pending events.
diff --git a/include/qemu/main-loop.h b/include/qemu/main-loop.h
index b3e54e00bc..68e70e61aa 100644
--- a/include/qemu/main-loop.h
+++ b/include/qemu/main-loop.h
@@ -387,9 +387,12 @@ void qemu_cond_timedwait_iothread(QemuCond *cond, int ms);
/* internal interfaces */
+#define qemu_bh_new_guarded(cb, opaque, guard) \
+ qemu_bh_new_full((cb), (opaque), (stringify(cb)), guard)
#define qemu_bh_new(cb, opaque) \
- qemu_bh_new_full((cb), (opaque), (stringify(cb)))
-QEMUBH *qemu_bh_new_full(QEMUBHFunc *cb, void *opaque, const char *name);
+ qemu_bh_new_full((cb), (opaque), (stringify(cb)), NULL)
+QEMUBH *qemu_bh_new_full(QEMUBHFunc *cb, void *opaque, const char *name,
+ MemReentrancyGuard *reentrancy_guard);
void qemu_bh_schedule_idle(QEMUBH *bh);
enum {
diff --git a/tests/unit/ptimer-test-stubs.c b/tests/unit/ptimer-test-stubs.c
index f2bfcede93..8c9407c560 100644
--- a/tests/unit/ptimer-test-stubs.c
+++ b/tests/unit/ptimer-test-stubs.c
@@ -107,7 +107,8 @@ int64_t qemu_clock_deadline_ns_all(QEMUClockType type, int attr_mask)
return deadline;
}
-QEMUBH *qemu_bh_new_full(QEMUBHFunc *cb, void *opaque, const char *name)
+QEMUBH *qemu_bh_new_full(QEMUBHFunc *cb, void *opaque, const char *name,
+ MemReentrancyGuard *reentrancy_guard)
{
QEMUBH *bh = g_new(QEMUBH, 1);
diff --git a/util/async.c b/util/async.c
index 21016a1ac7..a9b528c370 100644
--- a/util/async.c
+++ b/util/async.c
@@ -65,6 +65,7 @@ struct QEMUBH {
void *opaque;
QSLIST_ENTRY(QEMUBH) next;
unsigned flags;
+ MemReentrancyGuard *reentrancy_guard;
};
/* Called concurrently from any thread */
@@ -137,7 +138,7 @@ void aio_bh_schedule_oneshot_full(AioContext *ctx, QEMUBHFunc *cb,
}
QEMUBH *aio_bh_new_full(AioContext *ctx, QEMUBHFunc *cb, void *opaque,
- const char *name)
+ const char *name, MemReentrancyGuard *reentrancy_guard)
{
QEMUBH *bh;
bh = g_new(QEMUBH, 1);
@@ -146,13 +147,28 @@ QEMUBH *aio_bh_new_full(AioContext *ctx, QEMUBHFunc *cb, void *opaque,
.cb = cb,
.opaque = opaque,
.name = name,
+ .reentrancy_guard = reentrancy_guard,
};
return bh;
}
void aio_bh_call(QEMUBH *bh)
{
+ bool last_engaged_in_io = false;
+
+ if (bh->reentrancy_guard) {
+ last_engaged_in_io = bh->reentrancy_guard->engaged_in_io;
+ if (bh->reentrancy_guard->engaged_in_io) {
+ trace_reentrant_aio(bh->ctx, bh->name);
+ }
+ bh->reentrancy_guard->engaged_in_io = true;
+ }
+
bh->cb(bh->opaque);
+
+ if (bh->reentrancy_guard) {
+ bh->reentrancy_guard->engaged_in_io = last_engaged_in_io;
+ }
}
/* Multiple occurrences of aio_bh_poll cannot be called concurrently. */
diff --git a/util/main-loop.c b/util/main-loop.c
index e180c85145..7022f02ef8 100644
--- a/util/main-loop.c
+++ b/util/main-loop.c
@@ -605,9 +605,11 @@ void main_loop_wait(int nonblocking)
/* Functions to operate on the main QEMU AioContext. */
-QEMUBH *qemu_bh_new_full(QEMUBHFunc *cb, void *opaque, const char *name)
+QEMUBH *qemu_bh_new_full(QEMUBHFunc *cb, void *opaque, const char *name,
+ MemReentrancyGuard *reentrancy_guard)
{
- return aio_bh_new_full(qemu_aio_context, cb, opaque, name);
+ return aio_bh_new_full(qemu_aio_context, cb, opaque, name,
+ reentrancy_guard);
}
/*
diff --git a/util/trace-events b/util/trace-events
index 16f78d8fe5..3f7e766683 100644
--- a/util/trace-events
+++ b/util/trace-events
@@ -11,6 +11,7 @@ poll_remove(void *ctx, void *node, int fd) "ctx %p node %p fd %d"
# async.c
aio_co_schedule(void *ctx, void *co) "ctx %p co %p"
aio_co_schedule_bh_cb(void *ctx, void *co) "ctx %p co %p"
+reentrant_aio(void *ctx, const char *name) "ctx %p name %s"
# thread-pool.c
thread_pool_submit(void *pool, void *req, void *opaque) "pool %p req %p opaque %p"
--
2.39.3

@ -1,70 +0,0 @@
From 137e84f68da06666ebf7f391766cc6209ce1c39c Mon Sep 17 00:00:00 2001
From: Jon Maloy <jmaloy@redhat.com>
Date: Tue, 9 May 2023 10:29:03 -0400
Subject: [PATCH 13/21] async: avoid use-after-free on re-entrancy guard
RH-Author: Jon Maloy <jmaloy@redhat.com>
RH-MergeRequest: 165: memory: prevent dma-reentracy issues
RH-Jira: RHEL-516
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [9/13] d4b957108aaacf4a597122aaeeaa8e56985f1fca (jmaloy/jmaloy-qemu-kvm-2)
Jira: https://issues.redhat.com/browse/RHEL-516
Upstream: Merged
CVE: CVE-2023-2680
commit 7915bd06f25e1803778081161bf6fa10c42dc7cd
Author: Alexander Bulekov <alxndr@bu.edu>
Date: Mon May 1 10:19:56 2023 -0400
async: avoid use-after-free on re-entrancy guard
A BH callback can free the BH, causing a use-after-free in aio_bh_call.
Fix that by keeping a local copy of the re-entrancy guard pointer.
Buglink: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=58513
Fixes: 9c86c97f12 ("async: Add an optional reentrancy guard to the BH API")
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20230501141956.3444868-1-alxndr@bu.edu>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
---
util/async.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
diff --git a/util/async.c b/util/async.c
index a9b528c370..cd1a1815f9 100644
--- a/util/async.c
+++ b/util/async.c
@@ -156,18 +156,20 @@ void aio_bh_call(QEMUBH *bh)
{
bool last_engaged_in_io = false;
- if (bh->reentrancy_guard) {
- last_engaged_in_io = bh->reentrancy_guard->engaged_in_io;
- if (bh->reentrancy_guard->engaged_in_io) {
+ /* Make a copy of the guard-pointer as cb may free the bh */
+ MemReentrancyGuard *reentrancy_guard = bh->reentrancy_guard;
+ if (reentrancy_guard) {
+ last_engaged_in_io = reentrancy_guard->engaged_in_io;
+ if (reentrancy_guard->engaged_in_io) {
trace_reentrant_aio(bh->ctx, bh->name);
}
- bh->reentrancy_guard->engaged_in_io = true;
+ reentrancy_guard->engaged_in_io = true;
}
bh->cb(bh->opaque);
- if (bh->reentrancy_guard) {
- bh->reentrancy_guard->engaged_in_io = last_engaged_in_io;
+ if (reentrancy_guard) {
+ reentrancy_guard->engaged_in_io = last_engaged_in_io;
}
}
--
2.39.3
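
Note on the patch above: the fix copies the guard pointer into a local variable before the callback runs, so nothing is read through `bh` after the callback may have freed it. A minimal sketch of that pattern follows. The `Guard` and `BottomHalf` types and the `run_self_freeing_bh()` helper are hypothetical stand-ins, not QEMU's definitions, and the callback here receives the bottom half itself rather than an opaque pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for MemReentrancyGuard and QEMUBH. */
typedef struct Guard {
    bool engaged_in_io;
} Guard;

typedef struct BottomHalf BottomHalf;
typedef void BottomHalfFunc(BottomHalf *bh);

struct BottomHalf {
    BottomHalfFunc *cb;
    Guard *guard;
};

/* A callback that frees its own bottom half, as a real BH callback may. */
static void self_freeing_cb(BottomHalf *bh)
{
    free(bh);
}

/* Same shape as the fixed aio_bh_call(): read bh->guard once, up front. */
static void bh_call(BottomHalf *bh)
{
    bool last_engaged_in_io = false;
    Guard *guard = bh->guard;   /* local copy, since cb may free bh */

    if (guard) {
        last_engaged_in_io = guard->engaged_in_io;
        guard->engaged_in_io = true;
    }

    bh->cb(bh);                 /* bh must not be dereferenced after this */

    if (guard) {
        guard->engaged_in_io = last_engaged_in_io;
    }
}

/* Hypothetical helper: run one self-freeing BH and report the guard state. */
static bool run_self_freeing_bh(Guard *guard)
{
    BottomHalf *bh = malloc(sizeof(*bh));

    assert(bh != NULL);
    bh->cb = self_freeing_cb;
    bh->guard = guard;
    bh_call(bh);
    return guard->engaged_in_io;
}
```

The guard outlives the bottom half in this sketch, which is the assumption the local copy relies on: the pointer stays valid even though the structure it was read from is gone.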

@ -1,57 +0,0 @@
From 40866640d15e6a8c9f6af7e437edc1ec1e17ba34 Mon Sep 17 00:00:00 2001
From: Jon Maloy <jmaloy@redhat.com>
Date: Tue, 9 May 2023 10:29:03 -0400
Subject: [PATCH 10/21] bcm2835_property: disable reentrancy detection for
iomem
RH-Author: Jon Maloy <jmaloy@redhat.com>
RH-MergeRequest: 165: memory: prevent dma-reentracy issues
RH-Jira: RHEL-516
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [6/13] 128ebc85e228674af66553af82fba70eb87960e6 (jmaloy/jmaloy-qemu-kvm-2)
Jira: https://issues.redhat.com/browse/RHEL-516
Upstream: Merged
CVE: CVE-2023-2680
commit 985c4a4e547afb9573b6bd6843d20eb2c3d1d1cd
Author: Alexander Bulekov <alxndr@bu.edu>
Date: Thu Apr 27 17:10:11 2023 -0400
bcm2835_property: disable reentrancy detection for iomem
As the code is designed for re-entrant calls from bcm2835_property to
bcm2835_mbox and back into bcm2835_property, mark iomem as
reentrancy-safe.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230427211013.2994127-7-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
---
hw/misc/bcm2835_property.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/hw/misc/bcm2835_property.c b/hw/misc/bcm2835_property.c
index 890ae7bae5..de056ea2df 100644
--- a/hw/misc/bcm2835_property.c
+++ b/hw/misc/bcm2835_property.c
@@ -382,6 +382,13 @@ static void bcm2835_property_init(Object *obj)
memory_region_init_io(&s->iomem, OBJECT(s), &bcm2835_property_ops, s,
TYPE_BCM2835_PROPERTY, 0x10);
+
+ /*
+ * bcm2835_property_ops call into bcm2835_mbox, which in turn reads from
+ * iomem. As such, mark iomem as re-entrancy safe.
+ */
+ s->iomem.disable_reentrancy_guard = true;
+
sysbus_init_mmio(SYS_BUS_DEVICE(s), &s->iomem);
sysbus_init_irq(SYS_BUS_DEVICE(s), &s->mbox_irq);
}
--
2.39.3
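
Note on the patch above: the diff only sets the `disable_reentrancy_guard` flag; the check that consults it lives elsewhere in the memory core and is not shown here. The sketch below is an assumption about how such an opt-out could interact with a per-device guard. The `Device` and `Region` types and `region_access()` are hypothetical and much simpler than QEMU's memory API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device guard state. */
typedef struct Device {
    bool engaged_in_io;
} Device;

/* Hypothetical I/O region owned by a device. */
typedef struct Region Region;
struct Region {
    Device *dev;
    bool disable_reentrancy_guard;
    Region *nested;     /* region the handler accesses in turn, or NULL */
    int handled;        /* number of accesses that reached the handler */
};

/* Returns false if the access (or a nested one) was rejected as re-entrant. */
static bool region_access(Region *mr)
{
    bool guarded = !mr->disable_reentrancy_guard;
    bool ok = true;

    if (guarded) {
        if (mr->dev->engaged_in_io) {
            return false;       /* re-entrant access into a guarded region */
        }
        mr->dev->engaged_in_io = true;
    }

    mr->handled++;
    if (mr->nested) {
        ok = region_access(mr->nested);
    }

    if (guarded) {
        mr->dev->engaged_in_io = false;
    }
    return ok;
}
```

With both regions guarded, the nested access is rejected; once the inner region opts out, the same call chain completes, which is the behaviour the commit message describes as intended for this device.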

@ -1,354 +0,0 @@
From ff05c0b0d3414c0e5b3903048280accdc6c75ca0 Mon Sep 17 00:00:00 2001
From: Hanna Czenczek <hreitz@redhat.com>
Date: Tue, 11 Apr 2023 19:34:16 +0200
Subject: [PATCH 2/9] block: Collapse padded I/O vecs exceeding IOV_MAX
RH-Author: Hanna Czenczek <hreitz@redhat.com>
RH-MergeRequest: 189: block: Split padded I/O vectors exceeding IOV_MAX
RH-Bugzilla: 2174676
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [2/5] 84c56bd16f841a18cf2baa918dfeab3240e3944d (hreitz/qemu-kvm-c-9-s)
When processing vectored guest requests that are not aligned to the
storage request alignment, we pad them by adding head and/or tail
buffers for a read-modify-write cycle.
The guest can submit I/O vectors up to IOV_MAX (1024) in length, but
with this padding, the vector can exceed that limit. As of
4c002cef0e9abe7135d7916c51abce47f7fc1ee2 ("util/iov: make
qemu_iovec_init_extended() honest"), we refuse to pad vectors beyond the
limit, instead returning an error to the guest.
To the guest, this appears as a random I/O error. We should not return
an I/O error to the guest when it issued a perfectly valid request.
Before 4c002cef0e9abe7135d7916c51abce47f7fc1ee2, we just made the vector
longer than IOV_MAX, which generally seems to work (because the guest
assumes a smaller alignment than we really have, file-posix's
raw_co_prw() will generally see bdrv_qiov_is_aligned() return false, and
so emulate the request, so that the IOV_MAX does not matter). However,
that does not seem exactly great.
I see two ways to fix this problem:
1. We split such long requests into two requests.
2. We join some elements of the vector into new buffers to make it
shorter.
I am wary of (1), because it seems like it may have unintended side
effects.
(2) on the other hand seems relatively simple to implement, with
hopefully few side effects, so this patch does that.
To do this, the use of qemu_iovec_init_extended() in bdrv_pad_request()
is effectively replaced by the new function bdrv_create_padded_qiov(),
which not only wraps the request IOV with padding head/tail, but also
ensures that the resulting vector will not have more than IOV_MAX
elements. Putting that functionality into qemu_iovec_init_extended() is
infeasible because it requires allocating a bounce buffer; doing so
would require many more parameters (buffer alignment, how to initialize
the buffer, and out parameters like the buffer, its length, and the
original elements), which is not reasonable.
Conversely, it is not difficult to move qemu_iovec_init_extended()'s
functionality into bdrv_create_padded_qiov() by using public
qemu_iovec_* functions, so that is what this patch does.
Because bdrv_pad_request() was the only "serious" user of
qemu_iovec_init_extended(), the next patch will remove the latter
function, so the functionality is not implemented twice.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2141964
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230411173418.19549-3-hreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
(cherry picked from commit 18743311b829cafc1737a5f20bc3248d5f91ee2a)
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
---
block/io.c | 166 ++++++++++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 151 insertions(+), 15 deletions(-)
diff --git a/block/io.c b/block/io.c
index 2e267a85ab..4e8e90208b 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1439,6 +1439,14 @@ out:
* @merge_reads is true for small requests,
* if @buf_len == @head + bytes + @tail. In this case it is possible that both
* head and tail exist but @buf_len == align and @tail_buf == @buf.
+ *
+ * @write is true for write requests, false for read requests.
+ *
+ * If padding makes the vector too long (exceeding IOV_MAX), then we need to
+ * merge existing vector elements into a single one. @collapse_bounce_buf acts
+ * as the bounce buffer in such cases. @pre_collapse_qiov has the pre-collapse
+ * I/O vector elements so for read requests, the data can be copied back after
+ * the read is done.
*/
typedef struct BdrvRequestPadding {
uint8_t *buf;
@@ -1447,11 +1455,17 @@ typedef struct BdrvRequestPadding {
size_t head;
size_t tail;
bool merge_reads;
+ bool write;
QEMUIOVector local_qiov;
+
+ uint8_t *collapse_bounce_buf;
+ size_t collapse_len;
+ QEMUIOVector pre_collapse_qiov;
} BdrvRequestPadding;
static bool bdrv_init_padding(BlockDriverState *bs,
int64_t offset, int64_t bytes,
+ bool write,
BdrvRequestPadding *pad)
{
int64_t align = bs->bl.request_alignment;
@@ -1483,6 +1497,8 @@ static bool bdrv_init_padding(BlockDriverState *bs,
pad->tail_buf = pad->buf + pad->buf_len - align;
}
+ pad->write = write;
+
return true;
}
@@ -1547,8 +1563,23 @@ zero_mem:
return 0;
}
-static void bdrv_padding_destroy(BdrvRequestPadding *pad)
+/**
+ * Free *pad's associated buffers, and perform any necessary finalization steps.
+ */
+static void bdrv_padding_finalize(BdrvRequestPadding *pad)
{
+ if (pad->collapse_bounce_buf) {
+ if (!pad->write) {
+ /*
+ * If padding required elements in the vector to be collapsed into a
+ * bounce buffer, copy the bounce buffer content back
+ */
+ qemu_iovec_from_buf(&pad->pre_collapse_qiov, 0,
+ pad->collapse_bounce_buf, pad->collapse_len);
+ }
+ qemu_vfree(pad->collapse_bounce_buf);
+ qemu_iovec_destroy(&pad->pre_collapse_qiov);
+ }
if (pad->buf) {
qemu_vfree(pad->buf);
qemu_iovec_destroy(&pad->local_qiov);
@@ -1556,6 +1587,101 @@ static void bdrv_padding_destroy(BdrvRequestPadding *pad)
memset(pad, 0, sizeof(*pad));
}
+/*
+ * Create pad->local_qiov by wrapping @iov in the padding head and tail, while
+ * ensuring that the resulting vector will not exceed IOV_MAX elements.
+ *
+ * To ensure this, when necessary, the first two or three elements of @iov are
+ * merged into pad->collapse_bounce_buf and replaced by a reference to that
+ * bounce buffer in pad->local_qiov.
+ *
+ * After performing a read request, the data from the bounce buffer must be
+ * copied back into pad->pre_collapse_qiov (e.g. by bdrv_padding_finalize()).
+ */
+static int bdrv_create_padded_qiov(BlockDriverState *bs,
+ BdrvRequestPadding *pad,
+ struct iovec *iov, int niov,
+ size_t iov_offset, size_t bytes)
+{
+ int padded_niov, surplus_count, collapse_count;
+
+ /* Assert this invariant */
+ assert(niov <= IOV_MAX);
+
+ /*
+ * Cannot pad if resulting length would exceed SIZE_MAX. Returning an error
+ * to the guest is not ideal, but there is little else we can do. At least
+ * this will practically never happen on 64-bit systems.
+ */
+ if (SIZE_MAX - pad->head < bytes ||
+ SIZE_MAX - pad->head - bytes < pad->tail)
+ {
+ return -EINVAL;
+ }
+
+ /* Length of the resulting IOV if we just concatenated everything */
+ padded_niov = !!pad->head + niov + !!pad->tail;
+
+ qemu_iovec_init(&pad->local_qiov, MIN(padded_niov, IOV_MAX));
+
+ if (pad->head) {
+ qemu_iovec_add(&pad->local_qiov, pad->buf, pad->head);
+ }
+
+ /*
+ * If padded_niov > IOV_MAX, we cannot just concatenate everything.
+ * Instead, merge the first two or three elements of @iov to reduce the
+ * number of vector elements as necessary.
+ */
+ if (padded_niov > IOV_MAX) {
+ /*
+ * Only head and tail can have led to the number of entries exceeding
+ * IOV_MAX, so we can exceed it by the head and tail at most. We need
+ * to reduce the number of elements by `surplus_count`, so we merge that
+ * many elements plus one into one element.
+ */
+ surplus_count = padded_niov - IOV_MAX;
+ assert(surplus_count <= !!pad->head + !!pad->tail);
+ collapse_count = surplus_count + 1;
+
+ /*
+ * Move the elements to collapse into `pad->pre_collapse_qiov`, then
+ * advance `iov` (and associated variables) by those elements.
+ */
+ qemu_iovec_init(&pad->pre_collapse_qiov, collapse_count);
+ qemu_iovec_concat_iov(&pad->pre_collapse_qiov, iov,
+ collapse_count, iov_offset, SIZE_MAX);
+ iov += collapse_count;
+ iov_offset = 0;
+ niov -= collapse_count;
+ bytes -= pad->pre_collapse_qiov.size;
+
+ /*
+ * Construct the bounce buffer to match the length of the to-collapse
+ * vector elements, and for write requests, initialize it with the data
+ * from those elements. Then add it to `pad->local_qiov`.
+ */
+ pad->collapse_len = pad->pre_collapse_qiov.size;
+ pad->collapse_bounce_buf = qemu_blockalign(bs, pad->collapse_len);
+ if (pad->write) {
+ qemu_iovec_to_buf(&pad->pre_collapse_qiov, 0,
+ pad->collapse_bounce_buf, pad->collapse_len);
+ }
+ qemu_iovec_add(&pad->local_qiov,
+ pad->collapse_bounce_buf, pad->collapse_len);
+ }
+
+ qemu_iovec_concat_iov(&pad->local_qiov, iov, niov, iov_offset, bytes);
+
+ if (pad->tail) {
+ qemu_iovec_add(&pad->local_qiov,
+ pad->buf + pad->buf_len - pad->tail, pad->tail);
+ }
+
+ assert(pad->local_qiov.niov == MIN(padded_niov, IOV_MAX));
+ return 0;
+}
+
/*
* bdrv_pad_request
*
@@ -1563,6 +1689,8 @@ static void bdrv_padding_destroy(BdrvRequestPadding *pad)
* read of padding, bdrv_padding_rmw_read() should be called separately if
* needed.
*
+ * @write is true for write requests, false for read requests.
+ *
* Request parameters (@qiov, &qiov_offset, &offset, &bytes) are in-out:
* - on function start they represent original request
* - on failure or when padding is not needed they are unchanged
@@ -1571,26 +1699,34 @@ static void bdrv_padding_destroy(BdrvRequestPadding *pad)
static int bdrv_pad_request(BlockDriverState *bs,
QEMUIOVector **qiov, size_t *qiov_offset,
int64_t *offset, int64_t *bytes,
+ bool write,
BdrvRequestPadding *pad, bool *padded,
BdrvRequestFlags *flags)
{
int ret;
+ struct iovec *sliced_iov;
+ int sliced_niov;
+ size_t sliced_head, sliced_tail;
bdrv_check_qiov_request(*offset, *bytes, *qiov, *qiov_offset, &error_abort);
- if (!bdrv_init_padding(bs, *offset, *bytes, pad)) {
+ if (!bdrv_init_padding(bs, *offset, *bytes, write, pad)) {
if (padded) {
*padded = false;
}
return 0;
}
- ret = qemu_iovec_init_extended(&pad->local_qiov, pad->buf, pad->head,
- *qiov, *qiov_offset, *bytes,
- pad->buf + pad->buf_len - pad->tail,
- pad->tail);
+ sliced_iov = qemu_iovec_slice(*qiov, *qiov_offset, *bytes,
+ &sliced_head, &sliced_tail,
+ &sliced_niov);
+
+ /* Guaranteed by bdrv_check_qiov_request() */
+ assert(*bytes <= SIZE_MAX);
+ ret = bdrv_create_padded_qiov(bs, pad, sliced_iov, sliced_niov,
+ sliced_head, *bytes);
if (ret < 0) {
- bdrv_padding_destroy(pad);
+ bdrv_padding_finalize(pad);
return ret;
}
*bytes += pad->head + pad->tail;
@@ -1657,8 +1793,8 @@ int coroutine_fn bdrv_co_preadv_part(BdrvChild *child,
flags |= BDRV_REQ_COPY_ON_READ;
}
- ret = bdrv_pad_request(bs, &qiov, &qiov_offset, &offset, &bytes, &pad,
- NULL, &flags);
+ ret = bdrv_pad_request(bs, &qiov, &qiov_offset, &offset, &bytes, false,
+ &pad, NULL, &flags);
if (ret < 0) {
goto fail;
}
@@ -1668,7 +1804,7 @@ int coroutine_fn bdrv_co_preadv_part(BdrvChild *child,
bs->bl.request_alignment,
qiov, qiov_offset, flags);
tracked_request_end(&req);
- bdrv_padding_destroy(&pad);
+ bdrv_padding_finalize(&pad);
fail:
bdrv_dec_in_flight(bs);
@@ -2000,7 +2136,7 @@ bdrv_co_do_zero_pwritev(BdrvChild *child, int64_t offset, int64_t bytes,
/* This flag doesn't make sense for padding or zero writes */
flags &= ~BDRV_REQ_REGISTERED_BUF;
- padding = bdrv_init_padding(bs, offset, bytes, &pad);
+ padding = bdrv_init_padding(bs, offset, bytes, true, &pad);
if (padding) {
assert(!(flags & BDRV_REQ_NO_WAIT));
bdrv_make_request_serialising(req, align);
@@ -2048,7 +2184,7 @@ bdrv_co_do_zero_pwritev(BdrvChild *child, int64_t offset, int64_t bytes,
}
out:
- bdrv_padding_destroy(&pad);
+ bdrv_padding_finalize(&pad);
return ret;
}
@@ -2116,8 +2252,8 @@ int coroutine_fn bdrv_co_pwritev_part(BdrvChild *child,
* bdrv_co_do_zero_pwritev() does aligning by itself, so, we do
* alignment only if there is no ZERO flag.
*/
- ret = bdrv_pad_request(bs, &qiov, &qiov_offset, &offset, &bytes, &pad,
- &padded, &flags);
+ ret = bdrv_pad_request(bs, &qiov, &qiov_offset, &offset, &bytes, true,
+ &pad, &padded, &flags);
if (ret < 0) {
return ret;
}
@@ -2147,7 +2283,7 @@ int coroutine_fn bdrv_co_pwritev_part(BdrvChild *child,
ret = bdrv_aligned_pwritev(child, &req, offset, bytes, align,
qiov, qiov_offset, flags);
- bdrv_padding_destroy(&pad);
+ bdrv_padding_finalize(&pad);
out:
tracked_request_end(&req);
--
2.39.3
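
Note on the patch above: the element-count arithmetic in bdrv_create_padded_qiov() can be checked in isolation. The sketch below reproduces only that arithmetic, assuming a fixed limit of 1024 as stated in the commit message. `SKETCH_IOV_MAX`, `collapse_count_for()` and `final_niov_for()` are hypothetical names and no buffers are allocated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for IOV_MAX (1024 per the commit message). */
enum { SKETCH_IOV_MAX = 1024 };

/*
 * Return how many leading elements of the request vector would be merged
 * into one bounce-buffer element, or 0 if no merging is needed.
 */
static int collapse_count_for(int niov, size_t head, size_t tail)
{
    int padded_niov, surplus_count;

    assert(niov <= SKETCH_IOV_MAX);

    /* Length of the vector if head, request and tail were concatenated */
    padded_niov = !!head + niov + !!tail;
    if (padded_niov <= SKETCH_IOV_MAX) {
        return 0;
    }

    /* Merging n + 1 elements into one shortens the vector by n */
    surplus_count = padded_niov - SKETCH_IOV_MAX;
    return surplus_count + 1;
}

/* Resulting vector length after the optional collapse. */
static int final_niov_for(int niov, size_t head, size_t tail)
{
    int padded_niov = !!head + niov + !!tail;
    int collapse = collapse_count_for(niov, head, tail);

    return collapse ? padded_niov - (collapse - 1) : padded_niov;
}
```

A full-length vector with both head and tail padding gives a surplus of two, so three elements are merged and the result is exactly at the limit. This matches the "first two or three elements" wording in the patch comments.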

@ -1,56 +0,0 @@
From dfa2811e88afaf996345552330e97f0513c1803c Mon Sep 17 00:00:00 2001
From: Kevin Wolf <kwolf@redhat.com>
Date: Thu, 4 May 2023 13:57:34 +0200
Subject: [PATCH 53/56] block: Don't call no_coroutine_fns in
qmp_block_resize()
RH-Author: Kevin Wolf <kwolf@redhat.com>
RH-MergeRequest: 164: block: Fix hangs in qmp_block_resize()
RH-Bugzilla: 2185688
RH-Acked-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-Acked-by: Hanna Czenczek <hreitz@redhat.com>
RH-Acked-by: Eric Blake <eblake@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Commit: [2/4] 7ac7e34821cfc8bd5f0daadd7a1c4a5596bc60a6 (kmwolf/centos-qemu-kvm)
This QMP handler runs in a coroutine, so it must use the corresponding
no_co_wrappers instead.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2185688
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230504115750.54437-5-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 0c7d204f50c382c6baac8c94bd57af4a022b3888)
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
blockdev.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/blockdev.c b/blockdev.c
index d7b5c18f0a..eb509cf964 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -2430,7 +2430,7 @@ void coroutine_fn qmp_block_resize(const char *device, const char *node_name,
return;
}
- blk = blk_new_with_bs(bs, BLK_PERM_RESIZE, BLK_PERM_ALL, errp);
+ blk = blk_co_new_with_bs(bs, BLK_PERM_RESIZE, BLK_PERM_ALL, errp);
if (!blk) {
return;
}
@@ -2445,7 +2445,7 @@ void coroutine_fn qmp_block_resize(const char *device, const char *node_name,
bdrv_co_lock(bs);
bdrv_drained_end(bs);
- blk_unref(blk);
+ blk_co_unref(blk);
bdrv_co_unlock(bs);
}
--
2.39.1

@ -1,73 +0,0 @@
From 547f6bf93734f7c13675eebb93273ef2273f7c31 Mon Sep 17 00:00:00 2001
From: Hanna Czenczek <hreitz@redhat.com>
Date: Fri, 14 Jul 2023 10:59:38 +0200
Subject: [PATCH 5/9] block: Fix pad_request's request restriction
RH-Author: Hanna Czenczek <hreitz@redhat.com>
RH-MergeRequest: 189: block: Split padded I/O vectors exceeding IOV_MAX
RH-Bugzilla: 2174676
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [5/5] e8abc0485f6e0608a1ec55143ff40a14d273dfc8 (hreitz/qemu-kvm-c-9-s)
bdrv_pad_request() relies on requests' lengths not to exceed SIZE_MAX,
which bdrv_check_qiov_request() does not guarantee.
bdrv_check_request32() however will guarantee this, and both of
bdrv_pad_request()'s callers (bdrv_co_preadv_part() and
bdrv_co_pwritev_part()) already run it before calling
bdrv_pad_request(). Therefore, bdrv_pad_request() can safely call
bdrv_check_request32() without expecting error, too.
In effect, this patch will not change guest-visible behavior. It is a
clean-up to tighten a condition to match what is guaranteed by our
callers, and which exists purely to show clearly why the subsequent
assertion (`assert(*bytes <= SIZE_MAX)`) is always true.
Note there is a difference between the interfaces of
bdrv_check_qiov_request() and bdrv_check_request32(): The former takes
an errp, the latter does not, so we can no longer just pass
&error_abort. Instead, we need to check the returned value. While we
do expect success (because the callers have already run this function),
an assert(ret == 0) is not much simpler than just to return an error if
it occurs, so let us handle errors by returning them up the stack now.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Message-id: 20230714085938.202730-1-hreitz@redhat.com
Fixes: 18743311b829cafc1737a5f20bc3248d5f91ee2a
("block: Collapse padded I/O vecs exceeding IOV_MAX")
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
block/io.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/block/io.c b/block/io.c
index 4e8e90208b..807c9fb720 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1708,7 +1708,11 @@ static int bdrv_pad_request(BlockDriverState *bs,
int sliced_niov;
size_t sliced_head, sliced_tail;
- bdrv_check_qiov_request(*offset, *bytes, *qiov, *qiov_offset, &error_abort);
+ /* Should have been checked by the caller already */
+ ret = bdrv_check_request32(*offset, *bytes, *qiov, *qiov_offset);
+ if (ret < 0) {
+ return ret;
+ }
if (!bdrv_init_padding(bs, *offset, *bytes, write, pad)) {
if (padded) {
@@ -1721,7 +1725,7 @@ static int bdrv_pad_request(BlockDriverState *bs,
&sliced_head, &sliced_tail,
&sliced_niov);
- /* Guaranteed by bdrv_check_qiov_request() */
+ /* Guaranteed by bdrv_check_request32() */
assert(*bytes <= SIZE_MAX);
ret = bdrv_create_padded_qiov(bs, pad, sliced_iov, sliced_niov,
sliced_head, *bytes);
--
2.39.3

@ -0,0 +1,252 @@
From 2ee645a339e9ef9cd92620a8b784d18d512326be Mon Sep 17 00:00:00 2001
From: Kevin Wolf <kwolf@redhat.com>
Date: Thu, 25 Apr 2024 14:56:02 +0200
Subject: [PATCH 4/4] block: Parse filenames only when explicitly requested
RH-Author: Hana Czenczek <hczenczek@redhat.com>
RH-MergeRequest: 1: CVE 2024-4467 (PRDSC)
RH-Jira: RHEL-35611
RH-CVE: CVE-2024-4467
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Acked-by: Eric Blake <eblake@redhat.com>
RH-Commit: [4/4] f44c2941d4419e60f16dea3e9adca164e75aa78d
When handling image filenames from legacy options such as -drive or from
tools, these filenames are parsed for protocol prefixes, including for
the json:{} pseudo-protocol.
This behaviour is intended for filenames that come directly from the
command line and for backing files, which may come from the image file
itself. Higher level management tools generally take care to verify that
untrusted images don't contain a bad (or any) backing file reference;
'qemu-img info' is a suitable tool for this.
However, for other files that can be referenced in images, such as
qcow2 data files or VMDK extents, the string from the image file is
usually not verified by management tools - and 'qemu-img info' wouldn't
be suitable because in contrast to backing files, it already opens these
other referenced files. So here the string should be interpreted as a
literal local filename. More complex configurations need to be specified
explicitly on the command line or in QMP.
This patch changes bdrv_open_inherit() so that it only parses filenames
if a new parameter parse_filename is true. It is set for the top level
in bdrv_open(), for the file child and for the backing file child. All
other callers pass false and disable filename parsing this way.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Upstream: N/A, embargoed
Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
---
block.c | 90 ++++++++++++++++++++++++++++++++++++---------------------
1 file changed, 57 insertions(+), 33 deletions(-)
diff --git a/block.c b/block.c
index 468cf5e67d..50bdd197b7 100644
--- a/block.c
+++ b/block.c
@@ -86,6 +86,7 @@ static BlockDriverState *bdrv_open_inherit(const char *filename,
BlockDriverState *parent,
const BdrvChildClass *child_class,
BdrvChildRole child_role,
+ bool parse_filename,
Error **errp);
static bool bdrv_recurse_has_child(BlockDriverState *bs,
@@ -2058,7 +2059,8 @@ static void parse_json_protocol(QDict *options, const char **pfilename,
* block driver has been specified explicitly.
*/
static int bdrv_fill_options(QDict **options, const char *filename,
- int *flags, Error **errp)
+ int *flags, bool allow_parse_filename,
+ Error **errp)
{
const char *drvname;
bool protocol = *flags & BDRV_O_PROTOCOL;
@@ -2100,7 +2102,7 @@ static int bdrv_fill_options(QDict **options, const char *filename,
if (protocol && filename) {
if (!qdict_haskey(*options, "filename")) {
qdict_put_str(*options, "filename", filename);
- parse_filename = true;
+ parse_filename = allow_parse_filename;
} else {
error_setg(errp, "Can't specify 'file' and 'filename' options at "
"the same time");
@@ -3663,7 +3665,8 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict *parent_options,
}
backing_hd = bdrv_open_inherit(backing_filename, reference, options, 0, bs,
- &child_of_bds, bdrv_backing_role(bs), errp);
+ &child_of_bds, bdrv_backing_role(bs), true,
+ errp);
if (!backing_hd) {
bs->open_flags |= BDRV_O_NO_BACKING;
error_prepend(errp, "Could not open backing file: ");
@@ -3697,7 +3700,8 @@ free_exit:
static BlockDriverState *
bdrv_open_child_bs(const char *filename, QDict *options, const char *bdref_key,
BlockDriverState *parent, const BdrvChildClass *child_class,
- BdrvChildRole child_role, bool allow_none, Error **errp)
+ BdrvChildRole child_role, bool allow_none,
+ bool parse_filename, Error **errp)
{
BlockDriverState *bs = NULL;
QDict *image_options;
@@ -3728,7 +3732,8 @@ bdrv_open_child_bs(const char *filename, QDict *options, const char *bdref_key,
}
bs = bdrv_open_inherit(filename, reference, image_options, 0,
- parent, child_class, child_role, errp);
+ parent, child_class, child_role, parse_filename,
+ errp);
if (!bs) {
goto done;
}
@@ -3738,6 +3743,33 @@ done:
return bs;
}
+static BdrvChild *bdrv_open_child_common(const char *filename,
+ QDict *options, const char *bdref_key,
+ BlockDriverState *parent,
+ const BdrvChildClass *child_class,
+ BdrvChildRole child_role,
+ bool allow_none, bool parse_filename,
+ Error **errp)
+{
+ BlockDriverState *bs;
+ BdrvChild *child;
+
+ GLOBAL_STATE_CODE();
+
+ bs = bdrv_open_child_bs(filename, options, bdref_key, parent, child_class,
+ child_role, allow_none, parse_filename, errp);
+ if (bs == NULL) {
+ return NULL;
+ }
+
+ bdrv_graph_wrlock();
+ child = bdrv_attach_child(parent, bs, bdref_key, child_class, child_role,
+ errp);
+ bdrv_graph_wrunlock();
+
+ return child;
+}
+
/*
* Opens a disk image whose options are given as BlockdevRef in another block
* device's options.
@@ -3761,27 +3793,15 @@ BdrvChild *bdrv_open_child(const char *filename,
BdrvChildRole child_role,
bool allow_none, Error **errp)
{
- BlockDriverState *bs;
- BdrvChild *child;
-
- GLOBAL_STATE_CODE();
-
- bs = bdrv_open_child_bs(filename, options, bdref_key, parent, child_class,
- child_role, allow_none, errp);
- if (bs == NULL) {
- return NULL;
- }
-
- bdrv_graph_wrlock();
- child = bdrv_attach_child(parent, bs, bdref_key, child_class, child_role,
- errp);
- bdrv_graph_wrunlock();
-
- return child;
+ return bdrv_open_child_common(filename, options, bdref_key, parent,
+ child_class, child_role, allow_none, false,
+ errp);
}
/*
- * Wrapper on bdrv_open_child() for most popular case: open primary child of bs.
+ * This does mostly the same as bdrv_open_child(), but for opening the primary
+ * child of a node. A notable difference from bdrv_open_child() is that it
+ * enables filename parsing for protocol names (including json:).
*
* @parent can move to a different AioContext in this function.
*/
@@ -3796,8 +3816,8 @@ int bdrv_open_file_child(const char *filename,
role = parent->drv->is_filter ?
(BDRV_CHILD_FILTERED | BDRV_CHILD_PRIMARY) : BDRV_CHILD_IMAGE;
- if (!bdrv_open_child(filename, options, bdref_key, parent,
- &child_of_bds, role, false, errp))
+ if (!bdrv_open_child_common(filename, options, bdref_key, parent,
+ &child_of_bds, role, false, true, errp))
{
return -EINVAL;
}
@@ -3842,7 +3862,8 @@ BlockDriverState *bdrv_open_blockdev_ref(BlockdevRef *ref, Error **errp)
}
- bs = bdrv_open_inherit(NULL, reference, qdict, 0, NULL, NULL, 0, errp);
+ bs = bdrv_open_inherit(NULL, reference, qdict, 0, NULL, NULL, 0, false,
+ errp);
obj = NULL;
qobject_unref(obj);
visit_free(v);
@@ -3932,7 +3953,7 @@ static BlockDriverState * no_coroutine_fn
bdrv_open_inherit(const char *filename, const char *reference, QDict *options,
int flags, BlockDriverState *parent,
const BdrvChildClass *child_class, BdrvChildRole child_role,
- Error **errp)
+ bool parse_filename, Error **errp)
{
int ret;
BlockBackend *file = NULL;
@@ -3980,9 +4001,11 @@ bdrv_open_inherit(const char *filename, const char *reference, QDict *options,
}
/* json: syntax counts as explicit options, as if in the QDict */
- parse_json_protocol(options, &filename, &local_err);
- if (local_err) {
- goto fail;
+ if (parse_filename) {
+ parse_json_protocol(options, &filename, &local_err);
+ if (local_err) {
+ goto fail;
+ }
}
bs->explicit_options = qdict_clone_shallow(options);
@@ -4007,7 +4030,8 @@ bdrv_open_inherit(const char *filename, const char *reference, QDict *options,
parent->open_flags, parent->options);
}
- ret = bdrv_fill_options(&options, filename, &flags, &local_err);
+ ret = bdrv_fill_options(&options, filename, &flags, parse_filename,
+ &local_err);
if (ret < 0) {
goto fail;
}
@@ -4076,7 +4100,7 @@ bdrv_open_inherit(const char *filename, const char *reference, QDict *options,
file_bs = bdrv_open_child_bs(filename, options, "file", bs,
&child_of_bds, BDRV_CHILD_IMAGE,
- true, &local_err);
+ true, true, &local_err);
if (local_err) {
goto fail;
}
@@ -4225,7 +4249,7 @@ BlockDriverState *bdrv_open(const char *filename, const char *reference,
GLOBAL_STATE_CODE();
return bdrv_open_inherit(filename, reference, options, flags, NULL,
- NULL, 0, errp);
+ NULL, 0, true, errp);
}
/* Return true if the NULL-terminated @list contains @str */
--
2.39.3
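
Note on the patch above: the effect of the new parse_filename parameter can be illustrated with a small classifier. This is a simplified sketch, not the logic in block.c: `NameKind` and `classify_filename()` are hypothetical, and the prefix test (a colon that appears before any slash) is an assumption about how protocol prefixes are recognised.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical classification of how a filename string would be treated. */
typedef enum {
    NAME_LITERAL,       /* plain local filename, no prefix interpretation */
    NAME_JSON_OPTIONS,  /* json:{...} pseudo-protocol */
    NAME_PROTOCOL       /* some other "proto:" prefix */
} NameKind;

static NameKind classify_filename(const char *filename, bool parse_filename)
{
    const char *p;

    /* Callers that pass false always get a literal filename. */
    if (!parse_filename) {
        return NAME_LITERAL;
    }

    if (strncmp(filename, "json:", 5) == 0) {
        return NAME_JSON_OPTIONS;
    }

    /* Treat the name as "proto:..." only if ':' comes before any '/'. */
    p = filename + strcspn(filename, ":/");
    return *p == ':' ? NAME_PROTOCOL : NAME_LITERAL;
}
```

Under this sketch, a string such as `json:{"driver":"nbd"}` stored in an image as a data-file or extent name is treated as a literal local filename when the caller passes false, which is the case the commit message describes for references that management tools do not verify.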

@ -1,386 +0,0 @@
From 7baea25be90e184175dd5a919ee5878cbd4970c2 Mon Sep 17 00:00:00 2001
From: Kevin Wolf <kwolf@redhat.com>
Date: Thu, 4 May 2023 13:57:33 +0200
Subject: [PATCH 52/56] block: bdrv/blk_co_unref() for calls in coroutine
context
RH-Author: Kevin Wolf <kwolf@redhat.com>
RH-MergeRequest: 164: block: Fix hangs in qmp_block_resize()
RH-Bugzilla: 2185688
RH-Acked-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-Acked-by: Hanna Czenczek <hreitz@redhat.com>
RH-Acked-by: Eric Blake <eblake@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Commit: [1/4] 8ebf8486b082c30ca1b39a6ede35e471eaaccfa3 (kmwolf/centos-qemu-kvm)
These functions must not be called in coroutine context, because they
need write access to the graph.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230504115750.54437-4-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit b2ab5f545fa1eaaf2955dd617bee19a8b3279786)
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
block.c | 2 +-
block/crypto.c | 6 +++---
block/parallels.c | 6 +++---
block/qcow.c | 6 +++---
block/qcow2.c | 14 +++++++-------
block/qed.c | 6 +++---
block/vdi.c | 6 +++---
block/vhdx.c | 6 +++---
block/vmdk.c | 18 +++++++++---------
block/vpc.c | 6 +++---
include/block/block-global-state.h | 3 ++-
include/sysemu/block-backend-global-state.h | 5 ++++-
12 files changed, 44 insertions(+), 40 deletions(-)
diff --git a/block.c b/block.c
index d79a52ca74..a48112f945 100644
--- a/block.c
+++ b/block.c
@@ -680,7 +680,7 @@ int coroutine_fn bdrv_co_create_opts_simple(BlockDriver *drv,
ret = 0;
out:
- blk_unref(blk);
+ blk_co_unref(blk);
return ret;
}
diff --git a/block/crypto.c b/block/crypto.c
index ca67289187..8fd3ad0054 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -355,7 +355,7 @@ block_crypto_co_create_generic(BlockDriverState *bs, int64_t size,
ret = 0;
cleanup:
qcrypto_block_free(crypto);
- blk_unref(blk);
+ blk_co_unref(blk);
return ret;
}
@@ -661,7 +661,7 @@ block_crypto_co_create_luks(BlockdevCreateOptions *create_options, Error **errp)
ret = 0;
fail:
- bdrv_unref(bs);
+ bdrv_co_unref(bs);
return ret;
}
@@ -730,7 +730,7 @@ fail:
bdrv_co_delete_file_noerr(bs);
}
- bdrv_unref(bs);
+ bdrv_co_unref(bs);
qapi_free_QCryptoBlockCreateOptions(create_opts);
qobject_unref(cryptoopts);
return ret;
diff --git a/block/parallels.c b/block/parallels.c
index 013684801a..b49c35929e 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -613,8 +613,8 @@ static int coroutine_fn parallels_co_create(BlockdevCreateOptions* opts,
ret = 0;
out:
- blk_unref(blk);
- bdrv_unref(bs);
+ blk_co_unref(blk);
+ bdrv_co_unref(bs);
return ret;
exit:
@@ -691,7 +691,7 @@ parallels_co_create_opts(BlockDriver *drv, const char *filename,
done:
qobject_unref(qdict);
- bdrv_unref(bs);
+ bdrv_co_unref(bs);
qapi_free_BlockdevCreateOptions(create_options);
return ret;
}
diff --git a/block/qcow.c b/block/qcow.c
index 490e4f819e..a0c701f578 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -915,8 +915,8 @@ static int coroutine_fn qcow_co_create(BlockdevCreateOptions *opts,
g_free(tmp);
ret = 0;
exit:
- blk_unref(qcow_blk);
- bdrv_unref(bs);
+ blk_co_unref(qcow_blk);
+ bdrv_co_unref(bs);
qcrypto_block_free(crypto);
return ret;
}
@@ -1015,7 +1015,7 @@ qcow_co_create_opts(BlockDriver *drv, const char *filename,
fail:
g_free(backing_fmt);
qobject_unref(qdict);
- bdrv_unref(bs);
+ bdrv_co_unref(bs);
qapi_free_BlockdevCreateOptions(create_options);
return ret;
}
diff --git a/block/qcow2.c b/block/qcow2.c
index 22084730f9..0b8beb8b47 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3711,7 +3711,7 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
goto out;
}
- blk_unref(blk);
+ blk_co_unref(blk);
blk = NULL;
/*
@@ -3791,7 +3791,7 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
}
}
- blk_unref(blk);
+ blk_co_unref(blk);
blk = NULL;
/* Reopen the image without BDRV_O_NO_FLUSH to flush it before returning.
@@ -3816,9 +3816,9 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
ret = 0;
out:
- blk_unref(blk);
- bdrv_unref(bs);
- bdrv_unref(data_bs);
+ blk_co_unref(blk);
+ bdrv_co_unref(bs);
+ bdrv_co_unref(data_bs);
return ret;
}
@@ -3949,8 +3949,8 @@ finish:
}
qobject_unref(qdict);
- bdrv_unref(bs);
- bdrv_unref(data_bs);
+ bdrv_co_unref(bs);
+ bdrv_co_unref(data_bs);
qapi_free_BlockdevCreateOptions(create_options);
return ret;
}
diff --git a/block/qed.c b/block/qed.c
index 0705a7b4e2..aff2a2076e 100644
--- a/block/qed.c
+++ b/block/qed.c
@@ -748,8 +748,8 @@ static int coroutine_fn bdrv_qed_co_create(BlockdevCreateOptions *opts,
ret = 0; /* success */
out:
g_free(l1_table);
- blk_unref(blk);
- bdrv_unref(bs);
+ blk_co_unref(blk);
+ bdrv_co_unref(bs);
return ret;
}
@@ -819,7 +819,7 @@ bdrv_qed_co_create_opts(BlockDriver *drv, const char *filename,
fail:
qobject_unref(qdict);
- bdrv_unref(bs);
+ bdrv_co_unref(bs);
qapi_free_BlockdevCreateOptions(create_options);
return ret;
}
diff --git a/block/vdi.c b/block/vdi.c
index f2434d6153..08331d2dd7 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -886,8 +886,8 @@ static int coroutine_fn vdi_co_do_create(BlockdevCreateOptions *create_options,
ret = 0;
exit:
- blk_unref(blk);
- bdrv_unref(bs_file);
+ blk_co_unref(blk);
+ bdrv_co_unref(bs_file);
g_free(bmap);
return ret;
}
@@ -975,7 +975,7 @@ vdi_co_create_opts(BlockDriver *drv, const char *filename,
done:
qobject_unref(qdict);
qapi_free_BlockdevCreateOptions(create_options);
- bdrv_unref(bs_file);
+ bdrv_co_unref(bs_file);
return ret;
}
diff --git a/block/vhdx.c b/block/vhdx.c
index 81420722a1..00777da91a 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -2053,8 +2053,8 @@ static int coroutine_fn vhdx_co_create(BlockdevCreateOptions *opts,
ret = 0;
delete_and_exit:
- blk_unref(blk);
- bdrv_unref(bs);
+ blk_co_unref(blk);
+ bdrv_co_unref(bs);
g_free(creator);
return ret;
}
@@ -2144,7 +2144,7 @@ vhdx_co_create_opts(BlockDriver *drv, const char *filename,
fail:
qobject_unref(qdict);
- bdrv_unref(bs);
+ bdrv_co_unref(bs);
qapi_free_BlockdevCreateOptions(create_options);
return ret;
}
diff --git a/block/vmdk.c b/block/vmdk.c
index f5f49018fe..01ca13c82b 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -2306,7 +2306,7 @@ exit:
if (pbb) {
*pbb = blk;
} else {
- blk_unref(blk);
+ blk_co_unref(blk);
blk = NULL;
}
}
@@ -2516,12 +2516,12 @@ vmdk_co_do_create(int64_t size,
if (strcmp(blk_bs(backing)->drv->format_name, "vmdk")) {
error_setg(errp, "Invalid backing file format: %s. Must be vmdk",
blk_bs(backing)->drv->format_name);
- blk_unref(backing);
+ blk_co_unref(backing);
ret = -EINVAL;
goto exit;
}
ret = vmdk_read_cid(blk_bs(backing), 0, &parent_cid);
- blk_unref(backing);
+ blk_co_unref(backing);
if (ret) {
error_setg(errp, "Failed to read parent CID");
goto exit;
@@ -2542,14 +2542,14 @@ vmdk_co_do_create(int64_t size,
blk_bs(extent_blk)->filename);
created_size += cur_size;
extent_idx++;
- blk_unref(extent_blk);
+ blk_co_unref(extent_blk);
}
/* Check whether we got excess extents */
extent_blk = extent_fn(-1, extent_idx, flat, split, compress, zeroed_grain,
opaque, NULL);
if (extent_blk) {
- blk_unref(extent_blk);
+ blk_co_unref(extent_blk);
error_setg(errp, "List of extents contains unused extents");
ret = -EINVAL;
goto exit;
@@ -2590,7 +2590,7 @@ vmdk_co_do_create(int64_t size,
ret = 0;
exit:
if (blk) {
- blk_unref(blk);
+ blk_co_unref(blk);
}
g_free(desc);
g_free(parent_desc_line);
@@ -2641,7 +2641,7 @@ vmdk_co_create_opts_cb(int64_t size, int idx, bool flat, bool split,
errp)) {
goto exit;
}
- bdrv_unref(bs);
+ bdrv_co_unref(bs);
exit:
g_free(ext_filename);
return blk;
@@ -2797,12 +2797,12 @@ static BlockBackend * coroutine_fn vmdk_co_create_cb(int64_t size, int idx,
return NULL;
}
blk_set_allow_write_beyond_eof(blk, true);
- bdrv_unref(bs);
+ bdrv_co_unref(bs);
if (size != -1) {
ret = vmdk_init_extent(blk, size, flat, compress, zeroed_grain, errp);
if (ret) {
- blk_unref(blk);
+ blk_co_unref(blk);
blk = NULL;
}
}
diff --git a/block/vpc.c b/block/vpc.c
index b89b0ff8e2..07ddda5b99 100644
--- a/block/vpc.c
+++ b/block/vpc.c
@@ -1082,8 +1082,8 @@ static int coroutine_fn vpc_co_create(BlockdevCreateOptions *opts,
}
out:
- blk_unref(blk);
- bdrv_unref(bs);
+ blk_co_unref(blk);
+ bdrv_co_unref(bs);
return ret;
}
@@ -1162,7 +1162,7 @@ vpc_co_create_opts(BlockDriver *drv, const char *filename,
fail:
qobject_unref(qdict);
- bdrv_unref(bs);
+ bdrv_co_unref(bs);
qapi_free_BlockdevCreateOptions(create_options);
return ret;
}
diff --git a/include/block/block-global-state.h b/include/block/block-global-state.h
index 399200a9a3..cd4ea554bf 100644
--- a/include/block/block-global-state.h
+++ b/include/block/block-global-state.h
@@ -214,7 +214,8 @@ void bdrv_img_create(const char *filename, const char *fmt,
bool quiet, Error **errp);
void bdrv_ref(BlockDriverState *bs);
-void bdrv_unref(BlockDriverState *bs);
+void no_coroutine_fn bdrv_unref(BlockDriverState *bs);
+void coroutine_fn no_co_wrapper bdrv_co_unref(BlockDriverState *bs);
void bdrv_unref_child(BlockDriverState *parent, BdrvChild *child);
BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
BlockDriverState *child_bs,
diff --git a/include/sysemu/block-backend-global-state.h b/include/sysemu/block-backend-global-state.h
index 2b6d27db7c..fa83f9389c 100644
--- a/include/sysemu/block-backend-global-state.h
+++ b/include/sysemu/block-backend-global-state.h
@@ -42,7 +42,10 @@ blk_co_new_open(const char *filename, const char *reference, QDict *options,
int blk_get_refcnt(BlockBackend *blk);
void blk_ref(BlockBackend *blk);
-void blk_unref(BlockBackend *blk);
+
+void no_coroutine_fn blk_unref(BlockBackend *blk);
+void coroutine_fn no_co_wrapper blk_co_unref(BlockBackend *blk);
+
void blk_remove_all_bs(void);
BlockBackend *blk_by_name(const char *name);
BlockBackend *blk_next(BlockBackend *blk);
--
2.39.1

@ -1,74 +0,0 @@
From b1f0546548e561856252c2bc610a8f4f8fcdf007 Mon Sep 17 00:00:00 2001
From: Stefano Garzarella <sgarzare@redhat.com>
Date: Wed, 26 Jul 2023 09:48:07 +0200
Subject: [PATCH 02/14] block/blkio: do not use open flags in qemu_open()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Stefano Garzarella <sgarzare@redhat.com>
RH-MergeRequest: 194: block/blkio: backport latest fixes for virtio-blk-* drivers
RH-Bugzilla: 2225354 2225439
RH-Acked-by: Hanna Czenczek <hreitz@redhat.com>
RH-Acked-by: Alberto Faria <None>
RH-Commit: [2/6] 1ccd0ef56182bb5e2374c3b5be98ee1ec05066d6 (sgarzarella/qemu-kvm-c-9-s)
qemu_open() in blkio_virtio_blk_common_open() is used to open the
character device (e.g. /dev/vhost-vdpa-0 or /dev/vfio/vfio) or,
in the future, a unix socket.
In all these cases we cannot open the path in read-only mode,
when the `read-only` option of blockdev is on, because the exchange
of IOCTL commands for example will fail.
In order to open the device read-only, we have to use the `read-only`
property of the libblkio driver as we already do in blkio_file_open().
Fixes: cad2ccc395 ("block/blkio: use qemu_open() to support fd passing for virtio-blk")
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2225439
Reported-by: Qing Wang <qinwang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20230726074807.14041-1-sgarzare@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit a5942c177b7bcc1357e496b7d68668befcfc2bb9)
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
block/blkio.c | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/block/blkio.c b/block/blkio.c
index 3ea9841bd8..5a82c6cb1a 100644
--- a/block/blkio.c
+++ b/block/blkio.c
@@ -685,15 +685,18 @@ static int blkio_virtio_blk_common_open(BlockDriverState *bs,
* layer through the "/dev/fdset/N" special path.
*/
if (fd_supported) {
- int open_flags;
-
- if (flags & BDRV_O_RDWR) {
- open_flags = O_RDWR;
- } else {
- open_flags = O_RDONLY;
- }
-
- fd = qemu_open(path, open_flags, errp);
+ /*
+ * `path` can contain the path of a character device
+ * (e.g. /dev/vhost-vdpa-0 or /dev/vfio/vfio) or a unix socket.
+ *
+ * So, we should always open it with O_RDWR flag, also if BDRV_O_RDWR
+ * is not set in the open flags, because the exchange of IOCTL commands
+ * for example will fail.
+ *
+ * In order to open the device read-only, we are using the `read-only`
+ * property of the libblkio driver in blkio_file_open().
+ */
+ fd = qemu_open(path, O_RDWR, errp);
if (fd < 0) {
return -EINVAL;
}
--
2.39.3

@ -1,54 +0,0 @@
From ef99db21e9469f3fc946b7bf3edc1837d7b24e0b Mon Sep 17 00:00:00 2001
From: Stefano Garzarella <sgarzare@redhat.com>
Date: Tue, 25 Jul 2023 12:37:44 +0200
Subject: [PATCH 01/14] block/blkio: enable the completion eventfd
RH-Author: Stefano Garzarella <sgarzare@redhat.com>
RH-MergeRequest: 194: block/blkio: backport latest fixes for virtio-blk-* drivers
RH-Bugzilla: 2225354 2225439
RH-Acked-by: Hanna Czenczek <hreitz@redhat.com>
RH-Acked-by: Alberto Faria <None>
RH-Commit: [1/6] d91b3a465942863550130105ae2f38f47a82a360 (sgarzarella/qemu-kvm-c-9-s)
Until libblkio 1.3.0, virtio-blk drivers had completion eventfd
notifications enabled from the start, but from the next releases
this is no longer the case, so we have to explicitly enable them.
In fact, the libblkio documentation says they could be disabled,
so we should always enable them at the start if we want to be
sure to get completion eventfd notifications:
    By default, the driver might not generate completion events for
    requests so it is necessary to explicitly enable the completion
    file descriptor before use:
    void blkioq_set_completion_fd_enabled(struct blkioq *q, bool enable);
I discovered this while trying a development version of libblkio:
the guest kernel hangs during boot, while probing the device.
Fixes: fd66dbd424f5 ("blkio: add libblkio block driver")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20230725103744.77343-1-sgarzare@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 9359c459889fce1804c4e1b2a2ff8f182b4a9ae8)
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
block/blkio.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/block/blkio.c b/block/blkio.c
index afcec359f2..3ea9841bd8 100644
--- a/block/blkio.c
+++ b/block/blkio.c
@@ -844,6 +844,7 @@ static int blkio_file_open(BlockDriverState *bs, QDict *options, int flags,
QLIST_INIT(&s->bounce_bufs);
s->blkioq = blkio_get_queue(s->blkio, 0);
s->completion_fd = blkioq_get_completion_fd(s->blkioq);
+ blkioq_set_completion_fd_enabled(s->blkioq, true);
blkio_attach_aio_context(bs, bdrv_get_aio_context(bs));
return 0;
--
2.39.3
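
The hang described in this commit message can be modelled with a plain Linux eventfd. This is only a minimal sketch: `toy_queue` and the `toy_*` helpers are hypothetical stand-ins, not libblkio API, and the assumption that notifications start disabled mirrors the behaviour the message describes for releases after libblkio 1.3.0.

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical stand-in for a libblkio queue; not the real struct blkioq. */
struct toy_queue {
    int completion_fd;   /* eventfd that an event loop would poll */
    bool fd_enabled;     /* models blkioq_set_completion_fd_enabled() */
};

int toy_queue_init(struct toy_queue *q)
{
    q->completion_fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
    q->fd_enabled = false;   /* assumption: notifications start disabled */
    return q->completion_fd < 0 ? -1 : 0;
}

void toy_set_completion_fd_enabled(struct toy_queue *q, bool enable)
{
    q->fd_enabled = enable;
}

/* A request finished: signal the eventfd only if notifications are on. */
void toy_complete_request(struct toy_queue *q)
{
    if (q->fd_enabled) {
        uint64_t one = 1;
        ssize_t n = write(q->completion_fd, &one, sizeof(one));
        (void)n;
    }
}

/* Returns true if an event loop polling the fd would wake up. */
bool toy_fd_is_readable(const struct toy_queue *q)
{
    struct pollfd pfd = { .fd = q->completion_fd, .events = POLLIN };
    return poll(&pfd, 1, 0) == 1 && (pfd.revents & POLLIN);
}
```

With notifications left disabled, a completed request never makes the fd readable, so a caller waiting on it would block forever; enabling them right after obtaining the fd, as the one-line fix does, avoids that.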

@ -1,67 +0,0 @@
From c1ce3ba81698b9d52ac9dff83c01ee8141ca403d Mon Sep 17 00:00:00 2001
From: Stefano Garzarella <sgarzare@redhat.com>
Date: Thu, 27 Jul 2023 18:10:19 +0200
Subject: [PATCH 05/14] block/blkio: fall back on using `path` when `fd`
setting fails
RH-Author: Stefano Garzarella <sgarzare@redhat.com>
RH-MergeRequest: 194: block/blkio: backport latest fixes for virtio-blk-* drivers
RH-Bugzilla: 2225354 2225439
RH-Acked-by: Hanna Czenczek <hreitz@redhat.com>
RH-Acked-by: Alberto Faria <None>
RH-Commit: [5/6] c03cea95146a59b2830ffe2dd56ef77a6630ce3e (sgarzarella/qemu-kvm-c-9-s)
qemu_open() fails if called with a Unix domain socket in this way:
-blockdev node-name=drive0,driver=virtio-blk-vhost-user,path=vhost-user-blk.sock,cache.direct=on: Could not open 'vhost-user-blk.sock': No such device or address
Since virtio-blk-vhost-user does not support fd passing, let's always fall back
on using `path` if fd passing fails.
Fixes: cad2ccc395 ("block/blkio: use qemu_open() to support fd passing for virtio-blk")
Reported-by: Qing Wang <qinwang@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20230727161020.84213-4-sgarzare@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 723bea27b127969931fa26bc0de79372a3d9e148)
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
block/blkio.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/block/blkio.c b/block/blkio.c
index 93a8f8fc5c..eef80e9ce5 100644
--- a/block/blkio.c
+++ b/block/blkio.c
@@ -710,19 +710,19 @@ static int blkio_virtio_blk_connect(BlockDriverState *bs, QDict *options,
* In order to open the device read-only, we are using the `read-only`
* property of the libblkio driver in blkio_file_open().
*/
- fd = qemu_open(path, O_RDWR, errp);
+ fd = qemu_open(path, O_RDWR, NULL);
if (fd < 0) {
- return -EINVAL;
+ fd_supported = false;
+ } else {
+ ret = blkio_set_int(s->blkio, "fd", fd);
+ if (ret < 0) {
+ fd_supported = false;
+ qemu_close(fd);
+ }
}
+ }
- ret = blkio_set_int(s->blkio, "fd", fd);
- if (ret < 0) {
- error_setg_errno(errp, -ret, "failed to set fd: %s",
- blkio_get_error_msg());
- qemu_close(fd);
- return ret;
- }
- } else {
+ if (!fd_supported) {
ret = blkio_set_str(s->blkio, "path", path);
if (ret < 0) {
error_setg_errno(errp, -ret, "failed to set path: %s",
--
2.39.3
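
The control flow introduced by this patch can be sketched in isolation. Everything named `toy_*` below is a hypothetical stand-in (for qemu_open(), blkio_set_int() and blkio_set_str() respectively), and the rule that a `.sock` path cannot be opened is an assumption made only to reproduce the failure quoted in the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the libblkio handle; records what was set. */
struct toy_blkio {
    bool fd_capable;   /* whether this toy driver accepts an "fd" */
    int fd;            /* -1 when unset */
    char path[64];     /* empty when unset */
};

/* Toy replacement for qemu_open(): pretend sockets cannot be opened. */
int toy_open(const char *path)
{
    return strstr(path, ".sock") ? -1 : 3;
}

/* Toy replacement for blkio_set_int(..., "fd", fd). */
int toy_set_fd(struct toy_blkio *b, int fd)
{
    if (!b->fd_capable) {
        return -EINVAL;
    }
    b->fd = fd;
    return 0;
}

/* Toy replacement for blkio_set_str(..., "path", path). */
int toy_set_path(struct toy_blkio *b, const char *path)
{
    strncpy(b->path, path, sizeof(b->path) - 1);
    b->path[sizeof(b->path) - 1] = '\0';
    return 0;
}

/* Try fd passing first; on any failure fall back to setting the path. */
int toy_configure(struct toy_blkio *b, const char *path, bool fd_supported)
{
    if (fd_supported) {
        int fd = toy_open(path);
        if (fd < 0) {
            fd_supported = false;
        } else if (toy_set_fd(b, fd) < 0) {
            fd_supported = false;   /* the real code also closes fd here */
        }
    }
    if (!fd_supported) {
        return toy_set_path(b, path);
    }
    return 0;
}
```

The point of the sketch is that neither a failed open nor a failed `fd` assignment is reported as an error any more; both simply downgrade to the `path` route, and only a failure to set `path` is fatal.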

@ -1,205 +0,0 @@
From 545482400ea87d54b1b839587f8aaad41e30692f Mon Sep 17 00:00:00 2001
From: Stefan Hajnoczi <stefanha@redhat.com>
Date: Tue, 4 Jul 2023 14:34:36 +0200
Subject: [PATCH 36/37] block/blkio: fix module_block.py parsing
RH-Author: Stefan Hajnoczi <stefanha@redhat.com>
RH-MergeRequest: 181: block/blkio: fix module_block.py parsing
RH-Bugzilla: 2213317
RH-Acked-by: Stefano Garzarella <sgarzare@redhat.com>
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
RH-Commit: [1/2] c85df95824f4889526a73527771dec9efcb06926 (stefanha/centos-stream-qemu-kvm)
When QEMU is built with --enable-modules, the module_block.py script
parses block/*.c to find block drivers that are built as modules. The
script generates a table of block drivers called block_driver_modules[].
This table is used for block driver module loading.
The blkio.c driver uses macros to define its BlockDriver structs. This
was done to avoid code duplication but the module_block.py script is
unable to parse the macro. The result is that libblkio-based block
drivers can be built as modules but will not be found at runtime.
One fix is to make the module_block.py script or build system fancier so
it can parse C macros (e.g. by parsing the preprocessed source code). I
chose not to do this because it raises the complexity of the build,
making future issues harder to debug.
Keep things simple: use the macro to avoid duplicating BlockDriver
function pointers but define .format_name and .protocol_name manually
for each BlockDriver. This way the module_block.py is able to parse the
code.
Also get rid of the block driver name macros (e.g. DRIVER_IO_URING)
because module_block.py cannot parse them either.
Fixes: fd66dbd424f5 ("blkio: add libblkio block driver")
Reported-by: Qing Wang <qinwang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20230704123436.187761-1-stefanha@redhat.com
Cc: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit c21eae1ccc782440f320accb6f90c66cb8f45ee9)
Conflicts:
- Downstream lacks commit 28ff7b4dfbb5 ("block/blkio: convert to
blk_io_plug_call() API") so keep the .bdrv_co_io_unplug callback.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
block/blkio.c | 118 ++++++++++++++++++++++++++------------------------
1 file changed, 61 insertions(+), 57 deletions(-)
diff --git a/block/blkio.c b/block/blkio.c
index 6a6f20f923..afcec359f2 100644
--- a/block/blkio.c
+++ b/block/blkio.c
@@ -21,16 +21,6 @@
#include "block/block-io.h"
-/*
- * Keep the QEMU BlockDriver names identical to the libblkio driver names.
- * Using macros instead of typing out the string literals avoids typos.
- */
-#define DRIVER_IO_URING "io_uring"
-#define DRIVER_NVME_IO_URING "nvme-io_uring"
-#define DRIVER_VIRTIO_BLK_VFIO_PCI "virtio-blk-vfio-pci"
-#define DRIVER_VIRTIO_BLK_VHOST_USER "virtio-blk-vhost-user"
-#define DRIVER_VIRTIO_BLK_VHOST_VDPA "virtio-blk-vhost-vdpa"
-
/*
* Allocated bounce buffers are kept in a list sorted by buffer address.
*/
@@ -743,15 +733,15 @@ static int blkio_file_open(BlockDriverState *bs, QDict *options, int flags,
return ret;
}
- if (strcmp(blkio_driver, DRIVER_IO_URING) == 0) {
+ if (strcmp(blkio_driver, "io_uring") == 0) {
ret = blkio_io_uring_open(bs, options, flags, errp);
- } else if (strcmp(blkio_driver, DRIVER_NVME_IO_URING) == 0) {
+ } else if (strcmp(blkio_driver, "nvme-io_uring") == 0) {
ret = blkio_nvme_io_uring(bs, options, flags, errp);
- } else if (strcmp(blkio_driver, DRIVER_VIRTIO_BLK_VFIO_PCI) == 0) {
+ } else if (strcmp(blkio_driver, "virtio-blk-vfio-pci") == 0) {
ret = blkio_virtio_blk_common_open(bs, options, flags, errp);
- } else if (strcmp(blkio_driver, DRIVER_VIRTIO_BLK_VHOST_USER) == 0) {
+ } else if (strcmp(blkio_driver, "virtio-blk-vhost-user") == 0) {
ret = blkio_virtio_blk_common_open(bs, options, flags, errp);
- } else if (strcmp(blkio_driver, DRIVER_VIRTIO_BLK_VHOST_VDPA) == 0) {
+ } else if (strcmp(blkio_driver, "virtio-blk-vhost-vdpa") == 0) {
ret = blkio_virtio_blk_common_open(bs, options, flags, errp);
} else {
g_assert_not_reached();
@@ -1027,50 +1017,64 @@ static void blkio_refresh_limits(BlockDriverState *bs, Error **errp)
* - truncate
*/
-#define BLKIO_DRIVER(name, ...) \
- { \
- .format_name = name, \
- .protocol_name = name, \
- .instance_size = sizeof(BDRVBlkioState), \
- .bdrv_file_open = blkio_file_open, \
- .bdrv_close = blkio_close, \
- .bdrv_co_getlength = blkio_co_getlength, \
- .bdrv_co_truncate = blkio_truncate, \
- .bdrv_co_get_info = blkio_co_get_info, \
- .bdrv_attach_aio_context = blkio_attach_aio_context, \
- .bdrv_detach_aio_context = blkio_detach_aio_context, \
- .bdrv_co_pdiscard = blkio_co_pdiscard, \
- .bdrv_co_preadv = blkio_co_preadv, \
- .bdrv_co_pwritev = blkio_co_pwritev, \
- .bdrv_co_flush_to_disk = blkio_co_flush, \
- .bdrv_co_pwrite_zeroes = blkio_co_pwrite_zeroes, \
- .bdrv_co_io_unplug = blkio_co_io_unplug, \
- .bdrv_refresh_limits = blkio_refresh_limits, \
- .bdrv_register_buf = blkio_register_buf, \
- .bdrv_unregister_buf = blkio_unregister_buf, \
- __VA_ARGS__ \
- }
-
-static BlockDriver bdrv_io_uring = BLKIO_DRIVER(
- DRIVER_IO_URING,
- .bdrv_needs_filename = true,
-);
-
-static BlockDriver bdrv_nvme_io_uring = BLKIO_DRIVER(
- DRIVER_NVME_IO_URING,
-);
-
-static BlockDriver bdrv_virtio_blk_vfio_pci = BLKIO_DRIVER(
- DRIVER_VIRTIO_BLK_VFIO_PCI
-);
+/*
+ * Do not include .format_name and .protocol_name because module_block.py
+ * does not parse macros in the source code.
+ */
+#define BLKIO_DRIVER_COMMON \
+ .instance_size = sizeof(BDRVBlkioState), \
+ .bdrv_file_open = blkio_file_open, \
+ .bdrv_close = blkio_close, \
+ .bdrv_co_getlength = blkio_co_getlength, \
+ .bdrv_co_truncate = blkio_truncate, \
+ .bdrv_co_get_info = blkio_co_get_info, \
+ .bdrv_attach_aio_context = blkio_attach_aio_context, \
+ .bdrv_detach_aio_context = blkio_detach_aio_context, \
+ .bdrv_co_pdiscard = blkio_co_pdiscard, \
+ .bdrv_co_preadv = blkio_co_preadv, \
+ .bdrv_co_pwritev = blkio_co_pwritev, \
+ .bdrv_co_flush_to_disk = blkio_co_flush, \
+ .bdrv_co_pwrite_zeroes = blkio_co_pwrite_zeroes, \
+ .bdrv_co_io_unplug = blkio_co_io_unplug, \
+ .bdrv_refresh_limits = blkio_refresh_limits, \
+ .bdrv_register_buf = blkio_register_buf, \
+ .bdrv_unregister_buf = blkio_unregister_buf,
-static BlockDriver bdrv_virtio_blk_vhost_user = BLKIO_DRIVER(
- DRIVER_VIRTIO_BLK_VHOST_USER
-);
+/*
+ * Use the same .format_name and .protocol_name as the libblkio driver name for
+ * consistency.
+ */
-static BlockDriver bdrv_virtio_blk_vhost_vdpa = BLKIO_DRIVER(
- DRIVER_VIRTIO_BLK_VHOST_VDPA
-);
+static BlockDriver bdrv_io_uring = {
+ .format_name = "io_uring",
+ .protocol_name = "io_uring",
+ .bdrv_needs_filename = true,
+ BLKIO_DRIVER_COMMON
+};
+
+static BlockDriver bdrv_nvme_io_uring = {
+ .format_name = "nvme-io_uring",
+ .protocol_name = "nvme-io_uring",
+ BLKIO_DRIVER_COMMON
+};
+
+static BlockDriver bdrv_virtio_blk_vfio_pci = {
+ .format_name = "virtio-blk-vfio-pci",
+ .protocol_name = "virtio-blk-vfio-pci",
+ BLKIO_DRIVER_COMMON
+};
+
+static BlockDriver bdrv_virtio_blk_vhost_user = {
+ .format_name = "virtio-blk-vhost-user",
+ .protocol_name = "virtio-blk-vhost-user",
+ BLKIO_DRIVER_COMMON
+};
+
+static BlockDriver bdrv_virtio_blk_vhost_vdpa = {
+ .format_name = "virtio-blk-vhost-vdpa",
+ .protocol_name = "virtio-blk-vhost-vdpa",
+ BLKIO_DRIVER_COMMON
+};
static void bdrv_blkio_init(void)
{
--
2.39.3
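
Why a text-based scanner misses macro-defined names can be illustrated with a small C function. This is a simplified, hypothetical stand-in: the real module_block.py is a Python script and its exact matching rules are not shown in this patch, so `find_format_name` only models the general idea of looking for a string literal after `.format_name`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical scanner: find the first `.format_name = "<literal>"` in C
 * source text and copy the literal into out. Returns false when the value
 * is not a string literal, e.g. a macro parameter or macro name.
 */
bool find_format_name(const char *src, char *out, size_t out_len)
{
    static const char key[] = ".format_name";
    const char *p = strstr(src, key);

    if (!p || out_len == 0) {
        return false;
    }
    p += sizeof(key) - 1;
    while (*p == ' ' || *p == '\t') {
        p++;
    }
    if (*p != '=') {
        return false;
    }
    p++;
    while (*p == ' ' || *p == '\t') {
        p++;
    }
    if (*p != '"') {
        return false;   /* identifier or macro: nothing to extract */
    }
    p++;

    const char *end = strchr(p, '"');
    if (!end) {
        return false;
    }
    size_t len = (size_t)(end - p);
    if (len >= out_len) {
        return false;
    }
    memcpy(out, p, len);
    out[len] = '\0';
    return true;
}
```

A scanner of this kind finds nothing in `.format_name = name,` or `.format_name = DRIVER_IO_URING,`, but succeeds on `.format_name = "io_uring",`, which is why the patch spells the names out in each BlockDriver definition.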

@ -1,151 +0,0 @@
From 458c33c9f19ed01beeb9b2b494ce6ed10d2ed4ac Mon Sep 17 00:00:00 2001
From: Stefano Garzarella <sgarzare@redhat.com>
Date: Thu, 27 Jul 2023 18:10:17 +0200
Subject: [PATCH 03/14] block/blkio: move blkio_connect() in the drivers
functions
RH-Author: Stefano Garzarella <sgarzare@redhat.com>
RH-MergeRequest: 194: block/blkio: backport latest fixes for virtio-blk-* drivers
RH-Bugzilla: 2225354 2225439
RH-Acked-by: Hanna Czenczek <hreitz@redhat.com>
RH-Acked-by: Alberto Faria <None>
RH-Commit: [3/6] c356108d7dfe1ba2098c094f8d12b6e40853560c (sgarzarella/qemu-kvm-c-9-s)
This is in preparation for the next patch, where for virtio-blk
drivers we need to handle the failure of blkio_connect().
Let's also rename the *_open() functions to *_connect() to make
the code reflect the changes applied.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20230727161020.84213-2-sgarzare@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 69785d66ae1ec43f77fc65109a21721992bead9f)
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
block/blkio.c | 67 ++++++++++++++++++++++++++++++---------------------
1 file changed, 40 insertions(+), 27 deletions(-)
diff --git a/block/blkio.c b/block/blkio.c
index 5a82c6cb1a..85d1eed5fb 100644
--- a/block/blkio.c
+++ b/block/blkio.c
@@ -602,8 +602,8 @@ static void blkio_unregister_buf(BlockDriverState *bs, void *host, size_t size)
}
}
-static int blkio_io_uring_open(BlockDriverState *bs, QDict *options, int flags,
- Error **errp)
+static int blkio_io_uring_connect(BlockDriverState *bs, QDict *options,
+ int flags, Error **errp)
{
const char *filename = qdict_get_str(options, "filename");
BDRVBlkioState *s = bs->opaque;
@@ -626,11 +626,18 @@ static int blkio_io_uring_open(BlockDriverState *bs, QDict *options, int flags,
}
}
+ ret = blkio_connect(s->blkio);
+ if (ret < 0) {
+ error_setg_errno(errp, -ret, "blkio_connect failed: %s",
+ blkio_get_error_msg());
+ return ret;
+ }
+
return 0;
}
-static int blkio_nvme_io_uring(BlockDriverState *bs, QDict *options, int flags,
- Error **errp)
+static int blkio_nvme_io_uring_connect(BlockDriverState *bs, QDict *options,
+ int flags, Error **errp)
{
const char *path = qdict_get_try_str(options, "path");
BDRVBlkioState *s = bs->opaque;
@@ -654,11 +661,18 @@ static int blkio_nvme_io_uring(BlockDriverState *bs, QDict *options, int flags,
return -EINVAL;
}
+ ret = blkio_connect(s->blkio);
+ if (ret < 0) {
+ error_setg_errno(errp, -ret, "blkio_connect failed: %s",
+ blkio_get_error_msg());
+ return ret;
+ }
+
return 0;
}
-static int blkio_virtio_blk_common_open(BlockDriverState *bs,
- QDict *options, int flags, Error **errp)
+static int blkio_virtio_blk_connect(BlockDriverState *bs, QDict *options,
+ int flags, Error **errp)
{
const char *path = qdict_get_try_str(options, "path");
BDRVBlkioState *s = bs->opaque;
@@ -717,6 +731,13 @@ static int blkio_virtio_blk_common_open(BlockDriverState *bs,
}
}
+ ret = blkio_connect(s->blkio);
+ if (ret < 0) {
+ error_setg_errno(errp, -ret, "blkio_connect failed: %s",
+ blkio_get_error_msg());
+ return ret;
+ }
+
qdict_del(options, "path");
return 0;
@@ -736,24 +757,6 @@ static int blkio_file_open(BlockDriverState *bs, QDict *options, int flags,
return ret;
}
- if (strcmp(blkio_driver, "io_uring") == 0) {
- ret = blkio_io_uring_open(bs, options, flags, errp);
- } else if (strcmp(blkio_driver, "nvme-io_uring") == 0) {
- ret = blkio_nvme_io_uring(bs, options, flags, errp);
- } else if (strcmp(blkio_driver, "virtio-blk-vfio-pci") == 0) {
- ret = blkio_virtio_blk_common_open(bs, options, flags, errp);
- } else if (strcmp(blkio_driver, "virtio-blk-vhost-user") == 0) {
- ret = blkio_virtio_blk_common_open(bs, options, flags, errp);
- } else if (strcmp(blkio_driver, "virtio-blk-vhost-vdpa") == 0) {
- ret = blkio_virtio_blk_common_open(bs, options, flags, errp);
- } else {
- g_assert_not_reached();
- }
- if (ret < 0) {
- blkio_destroy(&s->blkio);
- return ret;
- }
-
if (!(flags & BDRV_O_RDWR)) {
ret = blkio_set_bool(s->blkio, "read-only", true);
if (ret < 0) {
@@ -764,10 +767,20 @@ static int blkio_file_open(BlockDriverState *bs, QDict *options, int flags,
}
}
- ret = blkio_connect(s->blkio);
+ if (strcmp(blkio_driver, "io_uring") == 0) {
+ ret = blkio_io_uring_connect(bs, options, flags, errp);
+ } else if (strcmp(blkio_driver, "nvme-io_uring") == 0) {
+ ret = blkio_nvme_io_uring_connect(bs, options, flags, errp);
+ } else if (strcmp(blkio_driver, "virtio-blk-vfio-pci") == 0) {
+ ret = blkio_virtio_blk_connect(bs, options, flags, errp);
+ } else if (strcmp(blkio_driver, "virtio-blk-vhost-user") == 0) {
+ ret = blkio_virtio_blk_connect(bs, options, flags, errp);
+ } else if (strcmp(blkio_driver, "virtio-blk-vhost-vdpa") == 0) {
+ ret = blkio_virtio_blk_connect(bs, options, flags, errp);
+ } else {
+ g_assert_not_reached();
+ }
if (ret < 0) {
- error_setg_errno(errp, -ret, "blkio_connect failed: %s",
- blkio_get_error_msg());
blkio_destroy(&s->blkio);
return ret;
}
--
2.39.3

@ -1,85 +0,0 @@
From ece855a71d9234c58497f37cb5498f507742167d Mon Sep 17 00:00:00 2001
From: Stefano Garzarella <sgarzare@redhat.com>
Date: Thu, 27 Jul 2023 18:10:18 +0200
Subject: [PATCH 04/14] block/blkio: retry blkio_connect() if it fails using
`fd`
RH-Author: Stefano Garzarella <sgarzare@redhat.com>
RH-MergeRequest: 194: block/blkio: backport latest fixes for virtio-blk-* drivers
RH-Bugzilla: 2225354 2225439
RH-Acked-by: Hanna Czenczek <hreitz@redhat.com>
RH-Acked-by: Alberto Faria <None>
RH-Commit: [4/6] 14ebc1f333617ce22c68693dec1c9a186d4f8a08 (sgarzarella/qemu-kvm-c-9-s)
libblkio 1.3.0 added support for the "fd" property to the virtio-blk-vhost-vdpa
driver. In QEMU, starting from commit cad2ccc395 ("block/blkio: use
qemu_open() to support fd passing for virtio-blk") we are using
`blkio_get_int(..., "fd")` to check if the "fd" property is supported
for all the virtio-blk-* drivers.
Unfortunately that property is also available for those drivers that do
not support it, such as virtio-blk-vhost-user.
So, `blkio_get_int()` is not enough to check whether the driver supports
the `fd` property or not. This is because the virtio-blk common libblkio
driver only checks whether or not `fd` is set during `blkio_connect()`
and fails with -EINVAL for those transports that do not support it
(all except vhost-vdpa for now).
So let's handle the `blkio_connect()` failure, retrying it using `path`
directly.
Fixes: cad2ccc395 ("block/blkio: use qemu_open() to support fd passing for virtio-blk")
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20230727161020.84213-3-sgarzare@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 809c319f8a089fbc49223dc29e1cc2b978beeada)
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
block/blkio.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/block/blkio.c b/block/blkio.c
index 85d1eed5fb..93a8f8fc5c 100644
--- a/block/blkio.c
+++ b/block/blkio.c
@@ -732,6 +732,35 @@ static int blkio_virtio_blk_connect(BlockDriverState *bs, QDict *options,
}
ret = blkio_connect(s->blkio);
+ /*
+ * If the libblkio driver doesn't support the `fd` property, blkio_connect()
+ * will fail with -EINVAL. So let's try calling blkio_connect() again by
+ * directly setting `path`.
+ */
+ if (fd_supported && ret == -EINVAL) {
+ qemu_close(fd);
+
+ /*
+ * We need to clear the `fd` property we set previously by setting
+ * it to -1.
+ */
+ ret = blkio_set_int(s->blkio, "fd", -1);
+ if (ret < 0) {
+ error_setg_errno(errp, -ret, "failed to set fd: %s",
+ blkio_get_error_msg());
+ return ret;
+ }
+
+ ret = blkio_set_str(s->blkio, "path", path);
+ if (ret < 0) {
+ error_setg_errno(errp, -ret, "failed to set path: %s",
+ blkio_get_error_msg());
+ return ret;
+ }
+
+ ret = blkio_connect(s->blkio);
+ }
+
if (ret < 0) {
error_setg_errno(errp, -ret, "blkio_connect failed: %s",
blkio_get_error_msg());
--
2.39.3
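
The retry logic added by this patch reduces to a small pattern that can be exercised without libblkio. In the sketch below, `toy_conn`, `toy_connect` and `toy_connect_with_retry` are hypothetical names; `toy_connect` models a driver that rejects a configured `fd` with -EINVAL at connect time when the transport does not support it, as the commit message describes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the state behind a libblkio handle. */
struct toy_conn {
    bool fd_capable;     /* transport accepts an fd (vhost-vdpa only, per the message) */
    int fd;              /* configured fd, or -1 when cleared */
    bool path_set;       /* true once a path has been configured */
    int connect_calls;   /* how many connect attempts were made */
};

/* Toy replacement for blkio_connect(): validates the fd only here. */
int toy_connect(struct toy_conn *c)
{
    c->connect_calls++;
    if (c->fd >= 0 && !c->fd_capable) {
        return -EINVAL;   /* fd was set but the transport cannot use it */
    }
    if (c->fd < 0 && !c->path_set) {
        return -EINVAL;   /* nothing configured at all */
    }
    return 0;
}

/* Connect; on -EINVAL with an fd configured, clear it and retry by path. */
int toy_connect_with_retry(struct toy_conn *c, bool fd_supported)
{
    int ret = toy_connect(c);

    if (fd_supported && ret == -EINVAL) {
        c->fd = -1;          /* the real code closes the fd and sets "fd" to -1 */
        c->path_set = true;  /* the real code sets the "path" property */
        ret = toy_connect(c);
    }
    return ret;
}
```

A capable transport connects on the first attempt and keeps its fd; an incapable one fails once, is reconfigured by path, and connects on the second attempt.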

@ -1,49 +0,0 @@
From 2f4436e7cc2f63d198229dc8ba32783460c0b185 Mon Sep 17 00:00:00 2001
From: Stefano Garzarella <sgarzare@redhat.com>
Date: Thu, 27 Jul 2023 18:10:20 +0200
Subject: [PATCH 06/14] block/blkio: use blkio_set_int("fd") to check fd
support
RH-Author: Stefano Garzarella <sgarzare@redhat.com>
RH-MergeRequest: 194: block/blkio: backport latest fixes for virtio-blk-* drivers
RH-Bugzilla: 2225354 2225439
RH-Acked-by: Hanna Czenczek <hreitz@redhat.com>
RH-Acked-by: Alberto Faria <None>
RH-Commit: [6/6] d57aafb2c3a8ed13aa3c6dcce5525a9cc8f5aa21 (sgarzarella/qemu-kvm-c-9-s)
Setting the `fd` property fails with virtio-blk-* libblkio drivers
that do not support fd passing since
https://gitlab.com/libblkio/libblkio/-/merge_requests/208.
Getting the `fd` property, on the other hand, always succeeds for
virtio-blk-* libblkio drivers even when they don't support fd passing.
This patch switches to setting the `fd` property because it is a
better mechanism for probing fd passing support than getting the `fd`
property.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20230727161020.84213-5-sgarzare@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 1c38fe69e2b8a05c1762b122292fa7e3662f06fd)
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
block/blkio.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/blkio.c b/block/blkio.c
index eef80e9ce5..8defbf744f 100644
--- a/block/blkio.c
+++ b/block/blkio.c
@@ -689,7 +689,7 @@ static int blkio_virtio_blk_connect(BlockDriverState *bs, QDict *options,
return -EINVAL;
}
- if (blkio_get_int(s->blkio, "fd", &fd) == 0) {
+ if (blkio_set_int(s->blkio, "fd", -1) == 0) {
fd_supported = true;
}
--
2.39.3

@ -1,108 +0,0 @@
From fd57241cf0f8c2906fa56118f8da1e65a5b1e4d8 Mon Sep 17 00:00:00 2001
From: Stefano Garzarella <sgarzare@redhat.com>
Date: Tue, 30 May 2023 09:19:40 +0200
Subject: [PATCH 3/5] block/blkio: use qemu_open() to support fd passing for
virtio-blk
RH-Author: Stefano Garzarella <sgarzare@redhat.com>
RH-MergeRequest: 169: block/blkio: support fd passing for virtio-blk-vhost-vdpa driver
RH-Bugzilla: 2180076
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Commit: [1/2] 9ff1a1510500db101648341207a36318a0c41c5a (sgarzarella/qemu-kvm-c-9-s)
Some virtio-blk drivers (e.g. virtio-blk-vhost-vdpa) support fd
passing. Let's expose this to the user, so the management layer
can pass the file descriptor of an already opened path.
If the libblkio virtio-blk driver supports fd passing, let's always
use qemu_open() to open the `path`, so we can handle fd passing
from the management layer through the "/dev/fdset/N" special path.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20230530071941.8954-2-sgarzare@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit cad2ccc395c7113fb30bc9390774b67b34f06c68)
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
---
block/blkio.c | 53 ++++++++++++++++++++++++++++++++++++++++++---------
1 file changed, 44 insertions(+), 9 deletions(-)
diff --git a/block/blkio.c b/block/blkio.c
index 0cdc99a729..6a6f20f923 100644
--- a/block/blkio.c
+++ b/block/blkio.c
@@ -672,25 +672,60 @@ static int blkio_virtio_blk_common_open(BlockDriverState *bs,
{
const char *path = qdict_get_try_str(options, "path");
BDRVBlkioState *s = bs->opaque;
- int ret;
+ bool fd_supported = false;
+ int fd, ret;
if (!path) {
error_setg(errp, "missing 'path' option");
return -EINVAL;
}
- ret = blkio_set_str(s->blkio, "path", path);
- qdict_del(options, "path");
- if (ret < 0) {
- error_setg_errno(errp, -ret, "failed to set path: %s",
- blkio_get_error_msg());
- return ret;
- }
-
if (!(flags & BDRV_O_NOCACHE)) {
error_setg(errp, "cache.direct=off is not supported");
return -EINVAL;
}
+
+ if (blkio_get_int(s->blkio, "fd", &fd) == 0) {
+ fd_supported = true;
+ }
+
+ /*
+ * If the libblkio driver supports fd passing, let's always use qemu_open()
+ * to open the `path`, so we can handle fd passing from the management
+ * layer through the "/dev/fdset/N" special path.
+ */
+ if (fd_supported) {
+ int open_flags;
+
+ if (flags & BDRV_O_RDWR) {
+ open_flags = O_RDWR;
+ } else {
+ open_flags = O_RDONLY;
+ }
+
+ fd = qemu_open(path, open_flags, errp);
+ if (fd < 0) {
+ return -EINVAL;
+ }
+
+ ret = blkio_set_int(s->blkio, "fd", fd);
+ if (ret < 0) {
+ error_setg_errno(errp, -ret, "failed to set fd: %s",
+ blkio_get_error_msg());
+ qemu_close(fd);
+ return ret;
+ }
+ } else {
+ ret = blkio_set_str(s->blkio, "path", path);
+ if (ret < 0) {
+ error_setg_errno(errp, -ret, "failed to set path: %s",
+ blkio_get_error_msg());
+ return ret;
+ }
+ }
+
+ qdict_del(options, "path");
+
return 0;
}
--
2.39.3

@ -1,121 +0,0 @@
From d9190117f3c701380701d6e9b2aa3c2446b9708f Mon Sep 17 00:00:00 2001
From: Stefan Hajnoczi <stefanha@redhat.com>
Date: Mon, 1 May 2023 13:34:43 -0400
Subject: [PATCH 01/21] block: compile out assert_bdrv_graph_readable() by
default
RH-Author: Kevin Wolf <kwolf@redhat.com>
RH-MergeRequest: 166: block/graph-lock: Disable locking for now
RH-Bugzilla: 2186725
RH-Acked-by: Eric Blake <eblake@redhat.com>
RH-Acked-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Commit: [1/4] d8cb4bb832c85e8216d97e57679a34c7bc6a8f71 (kmwolf/centos-qemu-kvm)
reader_count() is a performance bottleneck because the global
aio_context_list_lock mutex causes thread contention. Put this debugging
assertion behind a new ./configure --enable-debug-graph-lock option and
disable it by default.
The --enable-debug-graph-lock option is also enabled by the more general
--enable-debug option.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230501173443.153062-1-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 58a2e3f5c37be02dac3086b81bdda9414b931edf)
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
block/graph-lock.c | 3 +++
configure | 1 +
meson.build | 2 ++
meson_options.txt | 2 ++
scripts/meson-buildoptions.sh | 4 ++++
5 files changed, 12 insertions(+)
diff --git a/block/graph-lock.c b/block/graph-lock.c
index 454c31e691..259a7a0bde 100644
--- a/block/graph-lock.c
+++ b/block/graph-lock.c
@@ -265,7 +265,10 @@ void bdrv_graph_rdunlock_main_loop(void)
void assert_bdrv_graph_readable(void)
{
+ /* reader_count() is slow due to aio_context_list_lock lock contention */
+#ifdef CONFIG_DEBUG_GRAPH_LOCK
assert(qemu_in_main_thread() || reader_count());
+#endif
}
void assert_bdrv_graph_writable(void)
diff --git a/configure b/configure
index 800b5850f4..a62a3e6be9 100755
--- a/configure
+++ b/configure
@@ -806,6 +806,7 @@ for opt do
--enable-debug)
# Enable debugging options that aren't excessively noisy
debug_tcg="yes"
+ meson_option_parse --enable-debug-graph-lock ""
meson_option_parse --enable-debug-mutex ""
meson_option_add -Doptimization=0
fortify_source="no"
diff --git a/meson.build b/meson.build
index c44d05a13f..d964e741e7 100644
--- a/meson.build
+++ b/meson.build
@@ -1956,6 +1956,7 @@ if get_option('debug_stack_usage') and have_coroutine_pool
have_coroutine_pool = false
endif
config_host_data.set10('CONFIG_COROUTINE_POOL', have_coroutine_pool)
+config_host_data.set('CONFIG_DEBUG_GRAPH_LOCK', get_option('debug_graph_lock'))
config_host_data.set('CONFIG_DEBUG_MUTEX', get_option('debug_mutex'))
config_host_data.set('CONFIG_DEBUG_STACK_USAGE', get_option('debug_stack_usage'))
config_host_data.set('CONFIG_GPROF', get_option('gprof'))
@@ -3833,6 +3834,7 @@ summary_info += {'PIE': get_option('b_pie')}
summary_info += {'static build': config_host.has_key('CONFIG_STATIC')}
summary_info += {'malloc trim support': has_malloc_trim}
summary_info += {'membarrier': have_membarrier}
+summary_info += {'debug graph lock': get_option('debug_graph_lock')}
summary_info += {'debug stack usage': get_option('debug_stack_usage')}
summary_info += {'mutex debugging': get_option('debug_mutex')}
summary_info += {'memory allocator': get_option('malloc')}
diff --git a/meson_options.txt b/meson_options.txt
index fc9447d267..bc857fe68b 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -311,6 +311,8 @@ option('rng_none', type: 'boolean', value: false,
description: 'dummy RNG, avoid using /dev/(u)random and getrandom()')
option('coroutine_pool', type: 'boolean', value: true,
description: 'coroutine freelist (better performance)')
+option('debug_graph_lock', type: 'boolean', value: false,
+ description: 'graph lock debugging support')
option('debug_mutex', type: 'boolean', value: false,
description: 'mutex debugging support')
option('debug_stack_usage', type: 'boolean', value: false,
diff --git a/scripts/meson-buildoptions.sh b/scripts/meson-buildoptions.sh
index 009fab1515..30e1f25259 100644
--- a/scripts/meson-buildoptions.sh
+++ b/scripts/meson-buildoptions.sh
@@ -21,6 +21,8 @@ meson_options_help() {
printf "%s\n" ' QEMU'
printf "%s\n" ' --enable-cfi Control-Flow Integrity (CFI)'
printf "%s\n" ' --enable-cfi-debug Verbose errors in case of CFI violation'
+ printf "%s\n" ' --enable-debug-graph-lock'
+ printf "%s\n" ' graph lock debugging support'
printf "%s\n" ' --enable-debug-mutex mutex debugging support'
printf "%s\n" ' --enable-debug-stack-usage'
printf "%s\n" ' measure coroutine stack usage'
@@ -249,6 +251,8 @@ _meson_option_parse() {
--datadir=*) quote_sh "-Ddatadir=$2" ;;
--enable-dbus-display) printf "%s" -Ddbus_display=enabled ;;
--disable-dbus-display) printf "%s" -Ddbus_display=disabled ;;
+ --enable-debug-graph-lock) printf "%s" -Ddebug_graph_lock=true ;;
+ --disable-debug-graph-lock) printf "%s" -Ddebug_graph_lock=false ;;
--enable-debug-mutex) printf "%s" -Ddebug_mutex=true ;;
--disable-debug-mutex) printf "%s" -Ddebug_mutex=false ;;
--enable-debug-stack-usage) printf "%s" -Ddebug_stack_usage=true ;;
--
2.39.3
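
The compile-out pattern used in the patch above can be sketched in isolation. The names below (`DEBUG_GRAPH_LOCK_SKETCH`, `slow_invariant_holds()`, `slow_check_calls`) are hypothetical and stand in for `CONFIG_DEBUG_GRAPH_LOCK` and `reader_count()`; this is a minimal illustration, not QEMU code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical counter so the sketch can show whether the check ran. */
static unsigned slow_check_calls;

/* Stand-in for an expensive check such as reader_count(). */
bool slow_invariant_holds(void)
{
    slow_check_calls++;
    return true;
}

/*
 * When DEBUG_GRAPH_LOCK_SKETCH is not defined (the default here), the body
 * is empty and the expensive check is never evaluated.
 */
void assert_invariant(void)
{
#ifdef DEBUG_GRAPH_LOCK_SKETCH
    assert(slow_invariant_holds());
#endif
}
```

Building with `-DDEBUG_GRAPH_LOCK_SKETCH` would re-enable the check, in the same way the patch lets `--enable-debug-graph-lock` (or `--enable-debug`) turn the real assertion back on.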

@ -0,0 +1,330 @@
From a67edfb4b591acdffc5b4987601a30224376996f Mon Sep 17 00:00:00 2001
From: Stefan Hajnoczi <stefanha@redhat.com>
Date: Mon, 27 May 2024 11:58:50 -0400
Subject: [PATCH 4/5] block/crypto: create ciphers on demand
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Stefan Hajnoczi <stefanha@redhat.com>
RH-MergeRequest: 251: block/crypto: create ciphers on demand
RH-Jira: RHEL-36159
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [1/2] 22a4c87fef774cad98a6f5a79f27df50a208013d (stefanha/centos-stream-qemu-kvm)
Ciphers are pre-allocated by qcrypto_block_init_cipher() depending on
the given number of threads. The -device
virtio-blk-pci,iothread-vq-mapping= feature allows users to assign
multiple IOThreads to a virtio-blk device, but the association between
the virtio-blk device and the block driver happens after the block
driver is already open.
When the number of threads given to qcrypto_block_init_cipher() is
smaller than the actual number of threads at runtime, the
block->n_free_ciphers > 0 assertion in qcrypto_block_pop_cipher() can
fail.
Get rid of the n_threads argument of qcrypto_block_init_cipher() and
allocate ciphers on demand.
Reported-by: Qing Wang <qinwang@redhat.com>
Buglink: https://issues.redhat.com/browse/RHEL-36159
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240527155851.892885-2-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit af206c284e4c1b17cdfb0f17e898b288c0fc1751)
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
crypto/block-luks.c | 3 +-
crypto/block-qcow.c | 2 +-
crypto/block.c | 111 ++++++++++++++++++++++++++------------------
crypto/blockpriv.h | 12 +++--
4 files changed, 78 insertions(+), 50 deletions(-)
diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index 3ee928fb5a..3357852c0a 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -1262,7 +1262,6 @@ qcrypto_block_luks_open(QCryptoBlock *block,
luks->cipher_mode,
masterkey,
luks->header.master_key_len,
- n_threads,
errp) < 0) {
goto fail;
}
@@ -1456,7 +1455,7 @@ qcrypto_block_luks_create(QCryptoBlock *block,
/* Setup the block device payload encryption objects */
if (qcrypto_block_init_cipher(block, luks_opts.cipher_alg,
luks_opts.cipher_mode, masterkey,
- luks->header.master_key_len, 1, errp) < 0) {
+ luks->header.master_key_len, errp) < 0) {
goto error;
}
diff --git a/crypto/block-qcow.c b/crypto/block-qcow.c
index 4d7cf36a8f..02305058e3 100644
--- a/crypto/block-qcow.c
+++ b/crypto/block-qcow.c
@@ -75,7 +75,7 @@ qcrypto_block_qcow_init(QCryptoBlock *block,
ret = qcrypto_block_init_cipher(block, QCRYPTO_CIPHER_ALG_AES_128,
QCRYPTO_CIPHER_MODE_CBC,
keybuf, G_N_ELEMENTS(keybuf),
- n_threads, errp);
+ errp);
if (ret < 0) {
ret = -ENOTSUP;
goto fail;
diff --git a/crypto/block.c b/crypto/block.c
index 506ea1d1a3..ba6d1cebc7 100644
--- a/crypto/block.c
+++ b/crypto/block.c
@@ -20,6 +20,7 @@
#include "qemu/osdep.h"
#include "qapi/error.h"
+#include "qemu/lockable.h"
#include "blockpriv.h"
#include "block-qcow.h"
#include "block-luks.h"
@@ -57,6 +58,8 @@ QCryptoBlock *qcrypto_block_open(QCryptoBlockOpenOptions *options,
{
QCryptoBlock *block = g_new0(QCryptoBlock, 1);
+ qemu_mutex_init(&block->mutex);
+
block->format = options->format;
if (options->format >= G_N_ELEMENTS(qcrypto_block_drivers) ||
@@ -76,8 +79,6 @@ QCryptoBlock *qcrypto_block_open(QCryptoBlockOpenOptions *options,
return NULL;
}
- qemu_mutex_init(&block->mutex);
-
return block;
}
@@ -92,6 +93,8 @@ QCryptoBlock *qcrypto_block_create(QCryptoBlockCreateOptions *options,
{
QCryptoBlock *block = g_new0(QCryptoBlock, 1);
+ qemu_mutex_init(&block->mutex);
+
block->format = options->format;
if (options->format >= G_N_ELEMENTS(qcrypto_block_drivers) ||
@@ -111,8 +114,6 @@ QCryptoBlock *qcrypto_block_create(QCryptoBlockCreateOptions *options,
return NULL;
}
- qemu_mutex_init(&block->mutex);
-
return block;
}
@@ -227,37 +228,42 @@ QCryptoCipher *qcrypto_block_get_cipher(QCryptoBlock *block)
* This function is used only in test with one thread (it's safe to skip
* pop/push interface), so it's enough to assert it here:
*/
- assert(block->n_ciphers <= 1);
- return block->ciphers ? block->ciphers[0] : NULL;
+ assert(block->max_free_ciphers <= 1);
+ return block->free_ciphers ? block->free_ciphers[0] : NULL;
}
-static QCryptoCipher *qcrypto_block_pop_cipher(QCryptoBlock *block)
+static QCryptoCipher *qcrypto_block_pop_cipher(QCryptoBlock *block,
+ Error **errp)
{
- QCryptoCipher *cipher;
-
- qemu_mutex_lock(&block->mutex);
-
- assert(block->n_free_ciphers > 0);
- block->n_free_ciphers--;
- cipher = block->ciphers[block->n_free_ciphers];
-
- qemu_mutex_unlock(&block->mutex);
+ /* Usually there is a free cipher available */
+ WITH_QEMU_LOCK_GUARD(&block->mutex) {
+ if (block->n_free_ciphers > 0) {
+ block->n_free_ciphers--;
+ return block->free_ciphers[block->n_free_ciphers];
+ }
+ }
- return cipher;
+ /* Otherwise allocate a new cipher */
+ return qcrypto_cipher_new(block->alg, block->mode, block->key,
+ block->nkey, errp);
}
static void qcrypto_block_push_cipher(QCryptoBlock *block,
QCryptoCipher *cipher)
{
- qemu_mutex_lock(&block->mutex);
+ QEMU_LOCK_GUARD(&block->mutex);
- assert(block->n_free_ciphers < block->n_ciphers);
- block->ciphers[block->n_free_ciphers] = cipher;
- block->n_free_ciphers++;
+ if (block->n_free_ciphers == block->max_free_ciphers) {
+ block->max_free_ciphers++;
+ block->free_ciphers = g_renew(QCryptoCipher *,
+ block->free_ciphers,
+ block->max_free_ciphers);
+ }
- qemu_mutex_unlock(&block->mutex);
+ block->free_ciphers[block->n_free_ciphers] = cipher;
+ block->n_free_ciphers++;
}
@@ -265,24 +271,31 @@ int qcrypto_block_init_cipher(QCryptoBlock *block,
QCryptoCipherAlgorithm alg,
QCryptoCipherMode mode,
const uint8_t *key, size_t nkey,
- size_t n_threads, Error **errp)
+ Error **errp)
{
- size_t i;
+ QCryptoCipher *cipher;
- assert(!block->ciphers && !block->n_ciphers && !block->n_free_ciphers);
+ assert(!block->free_ciphers && !block->max_free_ciphers &&
+ !block->n_free_ciphers);
- block->ciphers = g_new0(QCryptoCipher *, n_threads);
+ /* Stash away cipher parameters for qcrypto_block_pop_cipher() */
+ block->alg = alg;
+ block->mode = mode;
+ block->key = g_memdup2(key, nkey);
+ block->nkey = nkey;
- for (i = 0; i < n_threads; i++) {
- block->ciphers[i] = qcrypto_cipher_new(alg, mode, key, nkey, errp);
- if (!block->ciphers[i]) {
- qcrypto_block_free_cipher(block);
- return -1;
- }
- block->n_ciphers++;
- block->n_free_ciphers++;
+ /*
+ * Create a new cipher to validate the parameters now. This reduces the
+ * chance of cipher creation failing at I/O time.
+ */
+ cipher = qcrypto_block_pop_cipher(block, errp);
+ if (!cipher) {
+ g_free(block->key);
+ block->key = NULL;
+ return -1;
}
+ qcrypto_block_push_cipher(block, cipher);
return 0;
}
@@ -291,19 +304,23 @@ void qcrypto_block_free_cipher(QCryptoBlock *block)
{
size_t i;
- if (!block->ciphers) {
+ g_free(block->key);
+ block->key = NULL;
+
+ if (!block->free_ciphers) {
return;
}
- assert(block->n_ciphers == block->n_free_ciphers);
+ /* All popped ciphers were eventually pushed back */
+ assert(block->n_free_ciphers == block->max_free_ciphers);
- for (i = 0; i < block->n_ciphers; i++) {
- qcrypto_cipher_free(block->ciphers[i]);
+ for (i = 0; i < block->max_free_ciphers; i++) {
+ qcrypto_cipher_free(block->free_ciphers[i]);
}
- g_free(block->ciphers);
- block->ciphers = NULL;
- block->n_ciphers = block->n_free_ciphers = 0;
+ g_free(block->free_ciphers);
+ block->free_ciphers = NULL;
+ block->max_free_ciphers = block->n_free_ciphers = 0;
}
QCryptoIVGen *qcrypto_block_get_ivgen(QCryptoBlock *block)
@@ -311,7 +328,7 @@ QCryptoIVGen *qcrypto_block_get_ivgen(QCryptoBlock *block)
/* ivgen should be accessed under mutex. However, this function is used only
* in test with one thread, so it's enough to assert it here:
*/
- assert(block->n_ciphers <= 1);
+ assert(block->max_free_ciphers <= 1);
return block->ivgen;
}
@@ -446,7 +463,10 @@ int qcrypto_block_decrypt_helper(QCryptoBlock *block,
Error **errp)
{
int ret;
- QCryptoCipher *cipher = qcrypto_block_pop_cipher(block);
+ QCryptoCipher *cipher = qcrypto_block_pop_cipher(block, errp);
+ if (!cipher) {
+ return -1;
+ }
ret = do_qcrypto_block_cipher_encdec(cipher, block->niv, block->ivgen,
&block->mutex, sectorsize, offset, buf,
@@ -465,7 +485,10 @@ int qcrypto_block_encrypt_helper(QCryptoBlock *block,
Error **errp)
{
int ret;
- QCryptoCipher *cipher = qcrypto_block_pop_cipher(block);
+ QCryptoCipher *cipher = qcrypto_block_pop_cipher(block, errp);
+ if (!cipher) {
+ return -1;
+ }
ret = do_qcrypto_block_cipher_encdec(cipher, block->niv, block->ivgen,
&block->mutex, sectorsize, offset, buf,
diff --git a/crypto/blockpriv.h b/crypto/blockpriv.h
index 836f3b4726..4bf6043d5d 100644
--- a/crypto/blockpriv.h
+++ b/crypto/blockpriv.h
@@ -32,8 +32,14 @@ struct QCryptoBlock {
const QCryptoBlockDriver *driver;
void *opaque;
- QCryptoCipher **ciphers;
- size_t n_ciphers;
+ /* Cipher parameters */
+ QCryptoCipherAlgorithm alg;
+ QCryptoCipherMode mode;
+ uint8_t *key;
+ size_t nkey;
+
+ QCryptoCipher **free_ciphers;
+ size_t max_free_ciphers;
size_t n_free_ciphers;
QCryptoIVGen *ivgen;
QemuMutex mutex;
@@ -130,7 +136,7 @@ int qcrypto_block_init_cipher(QCryptoBlock *block,
QCryptoCipherAlgorithm alg,
QCryptoCipherMode mode,
const uint8_t *key, size_t nkey,
- size_t n_threads, Error **errp);
+ Error **errp);
void qcrypto_block_free_cipher(QCryptoBlock *block);
--
2.39.3
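
The on-demand allocation scheme in the patch above can be sketched outside QEMU as a small free-list pool. This is a simplified illustration under stated assumptions: `struct item` stands in for `QCryptoCipher`, a plain pthread mutex replaces `QEMU_LOCK_GUARD`, and `calloc()`/`realloc()` replace `qcrypto_cipher_new()` and `g_renew()`. All names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical resource standing in for a QCryptoCipher. */
struct item {
    unsigned uses;
};

struct pool {
    pthread_mutex_t lock;
    struct item **free_items;  /* stack of idle items */
    size_t n_free;             /* items currently on the stack */
    size_t max_free;           /* slots allocated in free_items */
};

void pool_init(struct pool *p)
{
    pthread_mutex_init(&p->lock, NULL);
    p->free_items = NULL;
    p->n_free = 0;
    p->max_free = 0;
}

/* Reuse an idle item if one exists, otherwise create a new one. */
struct item *pool_pop(struct pool *p)
{
    struct item *it = NULL;

    pthread_mutex_lock(&p->lock);
    if (p->n_free > 0) {
        it = p->free_items[--p->n_free];
    }
    pthread_mutex_unlock(&p->lock);

    if (it) {
        return it;
    }
    /* Allocation happens outside the lock and may return NULL. */
    return calloc(1, sizeof(*it));
}

/* Return an item to the pool, growing the stack by one slot when full. */
int pool_push(struct pool *p, struct item *it)
{
    int ret = 0;

    pthread_mutex_lock(&p->lock);
    if (p->n_free == p->max_free) {
        struct item **grown = realloc(p->free_items,
                                      (p->max_free + 1) * sizeof(*grown));
        if (grown) {
            p->free_items = grown;
            p->max_free++;
        } else {
            ret = -1;
        }
    }
    if (ret == 0) {
        p->free_items[p->n_free++] = it;
    }
    pthread_mutex_unlock(&p->lock);
    return ret;
}

void pool_destroy(struct pool *p)
{
    /* Every popped item must have been pushed back by now. */
    assert(p->n_free == p->max_free);
    for (size_t i = 0; i < p->n_free; i++) {
        free(p->free_items[i]);
    }
    free(p->free_items);
    pthread_mutex_destroy(&p->lock);
}
```

Because the stack only grows when an item is pushed onto a full stack, `max_free` ends up equal to the number of items ever created, which is why the final assertion can compare it with `n_free`.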

@ -1,55 +0,0 @@
From 961bc392ee60743344236ddd247ab646a0eec914 Mon Sep 17 00:00:00 2001
From: Jon Maloy <jmaloy@redhat.com>
Date: Tue, 9 May 2023 10:29:03 -0400
Subject: [PATCH 07/21] checkpatch: add qemu_bh_new/aio_bh_new checks
RH-Author: Jon Maloy <jmaloy@redhat.com>
RH-MergeRequest: 165: memory: prevent dma-reentracy issues
RH-Jira: RHEL-516
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [3/13] e0473487f0e3186c42559a5c36a8650f27ab26ae (jmaloy/jmaloy-qemu-kvm-2)
Jira: https://issues.redhat.com/browse/RHEL-516
Upstream: Merged
CVE: CVE-2023-2680
commit ef56ffbdd6b0605dc1e305611287b948c970e236
Author: Alexander Bulekov <alxndr@bu.edu>
Date: Thu Apr 27 17:10:08 2023 -0400
checkpatch: add qemu_bh_new/aio_bh_new checks
Advise authors to use the _guarded versions of the APIs, instead.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <20230427211013.2994127-4-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
---
scripts/checkpatch.pl | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index d768171dcf..eeaec436eb 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2865,6 +2865,14 @@ sub process {
if ($line =~ /\bsignal\s*\(/ && !($line =~ /SIG_(?:IGN|DFL)/)) {
ERROR("use sigaction to establish signal handlers; signal is not portable\n" . $herecurr);
}
+# recommend qemu_bh_new_guarded instead of qemu_bh_new
+ if ($realfile =~ /.*\/hw\/.*/ && $line =~ /\bqemu_bh_new\s*\(/) {
+ ERROR("use qemu_bh_new_guarded() instead of qemu_bh_new() to avoid reentrancy problems\n" . $herecurr);
+ }
+# recommend aio_bh_new_guarded instead of aio_bh_new
+ if ($realfile =~ /.*\/hw\/.*/ && $line =~ /\baio_bh_new\s*\(/) {
+ ERROR("use aio_bh_new_guarded() instead of aio_bh_new() to avoid reentrancy problems\n" . $herecurr);
+ }
# check for module_init(), use category-specific init macros explicitly please
if ($line =~ /^module_init\s*\(/) {
ERROR("please use block_init(), type_init() etc. instead of module_init()\n" . $herecurr);
--
2.39.3

@ -0,0 +1,90 @@
From 0f0a3a860a07addea21a0282556a5022b9cb8b2c Mon Sep 17 00:00:00 2001
From: Xiaoyao Li <xiaoyao.li@intel.com>
Date: Thu, 29 Feb 2024 01:00:35 -0500
Subject: [PATCH 011/100] confidential guest support: Add kvm_init() and
kvm_reset() in class
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [11/91] 21d2178178bf181a8e4d0b051f64bd983f0d0cf1 (bonzini/rhel-qemu-kvm)
Confidential VMs on different architectures all need to do their own
specific initialization (and possibly reset) work with KVM. Currently
each of them exposes an individual *_kvm_init() function and lets
machine code or kvm code call it.
To facilitate the introduction of confidential guest technology from
different x86 vendors, add two virtual functions, kvm_init() and kvm_reset(),
in ConfidentialGuestSupportClass, and expose two helper functions for
invoking them.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20240229060038.606591-1-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 41a605944e3fecae43ca18ded95ec31f28e0c7fe)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
include/exec/confidential-guest-support.h | 34 ++++++++++++++++++++++-
1 file changed, 33 insertions(+), 1 deletion(-)
diff --git a/include/exec/confidential-guest-support.h b/include/exec/confidential-guest-support.h
index ba2dd4b5df..e5b188cffb 100644
--- a/include/exec/confidential-guest-support.h
+++ b/include/exec/confidential-guest-support.h
@@ -23,7 +23,10 @@
#include "qom/object.h"
#define TYPE_CONFIDENTIAL_GUEST_SUPPORT "confidential-guest-support"
-OBJECT_DECLARE_SIMPLE_TYPE(ConfidentialGuestSupport, CONFIDENTIAL_GUEST_SUPPORT)
+OBJECT_DECLARE_TYPE(ConfidentialGuestSupport,
+ ConfidentialGuestSupportClass,
+ CONFIDENTIAL_GUEST_SUPPORT)
+
struct ConfidentialGuestSupport {
Object parent;
@@ -55,8 +58,37 @@ struct ConfidentialGuestSupport {
typedef struct ConfidentialGuestSupportClass {
ObjectClass parent;
+
+ int (*kvm_init)(ConfidentialGuestSupport *cgs, Error **errp);
+ int (*kvm_reset)(ConfidentialGuestSupport *cgs, Error **errp);
} ConfidentialGuestSupportClass;
+static inline int confidential_guest_kvm_init(ConfidentialGuestSupport *cgs,
+ Error **errp)
+{
+ ConfidentialGuestSupportClass *klass;
+
+ klass = CONFIDENTIAL_GUEST_SUPPORT_GET_CLASS(cgs);
+ if (klass->kvm_init) {
+ return klass->kvm_init(cgs, errp);
+ }
+
+ return 0;
+}
+
+static inline int confidential_guest_kvm_reset(ConfidentialGuestSupport *cgs,
+ Error **errp)
+{
+ ConfidentialGuestSupportClass *klass;
+
+ klass = CONFIDENTIAL_GUEST_SUPPORT_GET_CLASS(cgs);
+ if (klass->kvm_reset) {
+ return klass->kvm_reset(cgs, errp);
+ }
+
+ return 0;
+}
+
#endif /* !CONFIG_USER_ONLY */
#endif /* QEMU_CONFIDENTIAL_GUEST_SUPPORT_H */
--
2.39.3
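
The optional-hook dispatch introduced by the patch above can be sketched without QOM. The types and functions below (`struct guest`, `struct guest_class`, `guest_init()`, `guest_reset()`, `demo_class`) are hypothetical stand-ins for `ConfidentialGuestSupport`, its class, and the two helpers; the sketch only shows the "call the hook if the class provides one, otherwise succeed" pattern.

```c
#include <assert.h>
#include <stddef.h>

struct guest;

/* Hypothetical class with optional hooks, in the spirit of
 * kvm_init() and kvm_reset(). */
struct guest_class {
    int (*init)(struct guest *g);
    int (*reset)(struct guest *g);
};

struct guest {
    const struct guest_class *klass;
    int init_calls;   /* lets the example observe which hook ran */
};

/* Helper: dispatch to the hook when present, otherwise report success. */
int guest_init(struct guest *g)
{
    if (g->klass->init) {
        return g->klass->init(g);
    }
    return 0;
}

int guest_reset(struct guest *g)
{
    if (g->klass->reset) {
        return g->klass->reset(g);
    }
    return 0;
}

/* Example backend that implements only the init hook. */
int demo_init(struct guest *g)
{
    g->init_calls++;
    return 0;
}

const struct guest_class demo_class = {
    .init = demo_init,
    .reset = NULL,
};
```

Callers use only the helpers, so a backend that has nothing to do on reset can leave that hook unset.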

@ -0,0 +1,228 @@
From 117486e0820f135f191e19f8ebb8838a98b121c6 Mon Sep 17 00:00:00 2001
From: Stefan Hajnoczi <stefanha@redhat.com>
Date: Mon, 27 May 2024 11:58:51 -0400
Subject: [PATCH 5/5] crypto/block: drop qcrypto_block_open() n_threads
argument
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Stefan Hajnoczi <stefanha@redhat.com>
RH-MergeRequest: 251: block/crypto: create ciphers on demand
RH-Jira: RHEL-36159
RH-Acked-by: Kevin Wolf <kwolf@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [2/2] 68290935b174b1f2b76aa857a926da9011e54abe (stefanha/centos-stream-qemu-kvm)
The n_threads argument is no longer used since the previous commit.
Remove it.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240527155851.892885-3-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 3ab0f063e58ed9224237d69c4211ca83335164c4)
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
block/crypto.c | 1 -
block/qcow.c | 2 +-
block/qcow2.c | 5 ++---
crypto/block-luks.c | 1 -
crypto/block-qcow.c | 6 ++----
crypto/block.c | 3 +--
crypto/blockpriv.h | 1 -
include/crypto/block.h | 2 --
tests/unit/test-crypto-block.c | 4 ----
9 files changed, 6 insertions(+), 19 deletions(-)
diff --git a/block/crypto.c b/block/crypto.c
index 21eed909c1..4eed3ffa6a 100644
--- a/block/crypto.c
+++ b/block/crypto.c
@@ -363,7 +363,6 @@ static int block_crypto_open_generic(QCryptoBlockFormat format,
block_crypto_read_func,
bs,
cflags,
- 1,
errp);
if (!crypto->block) {
diff --git a/block/qcow.c b/block/qcow.c
index ca8e1d5ec8..c2f89db055 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -211,7 +211,7 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
cflags |= QCRYPTO_BLOCK_OPEN_NO_IO;
}
s->crypto = qcrypto_block_open(crypto_opts, "encrypt.",
- NULL, NULL, cflags, 1, errp);
+ NULL, NULL, cflags, errp);
if (!s->crypto) {
ret = -EINVAL;
goto fail;
diff --git a/block/qcow2.c b/block/qcow2.c
index 0e8b2f7518..0ebd455dc8 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -321,7 +321,7 @@ qcow2_read_extensions(BlockDriverState *bs, uint64_t start_offset,
}
s->crypto = qcrypto_block_open(s->crypto_opts, "encrypt.",
qcow2_crypto_hdr_read_func,
- bs, cflags, QCOW2_MAX_THREADS, errp);
+ bs, cflags, errp);
if (!s->crypto) {
return -EINVAL;
}
@@ -1707,8 +1707,7 @@ qcow2_do_open(BlockDriverState *bs, QDict *options, int flags,
cflags |= QCRYPTO_BLOCK_OPEN_NO_IO;
}
s->crypto = qcrypto_block_open(s->crypto_opts, "encrypt.",
- NULL, NULL, cflags,
- QCOW2_MAX_THREADS, errp);
+ NULL, NULL, cflags, errp);
if (!s->crypto) {
ret = -EINVAL;
goto fail;
diff --git a/crypto/block-luks.c b/crypto/block-luks.c
index 3357852c0a..5b777c15d3 100644
--- a/crypto/block-luks.c
+++ b/crypto/block-luks.c
@@ -1189,7 +1189,6 @@ qcrypto_block_luks_open(QCryptoBlock *block,
QCryptoBlockReadFunc readfunc,
void *opaque,
unsigned int flags,
- size_t n_threads,
Error **errp)
{
QCryptoBlockLUKS *luks = NULL;
diff --git a/crypto/block-qcow.c b/crypto/block-qcow.c
index 02305058e3..42e9556e42 100644
--- a/crypto/block-qcow.c
+++ b/crypto/block-qcow.c
@@ -44,7 +44,6 @@ qcrypto_block_qcow_has_format(const uint8_t *buf G_GNUC_UNUSED,
static int
qcrypto_block_qcow_init(QCryptoBlock *block,
const char *keysecret,
- size_t n_threads,
Error **errp)
{
char *password;
@@ -100,7 +99,6 @@ qcrypto_block_qcow_open(QCryptoBlock *block,
QCryptoBlockReadFunc readfunc G_GNUC_UNUSED,
void *opaque G_GNUC_UNUSED,
unsigned int flags,
- size_t n_threads,
Error **errp)
{
if (flags & QCRYPTO_BLOCK_OPEN_NO_IO) {
@@ -115,7 +113,7 @@ qcrypto_block_qcow_open(QCryptoBlock *block,
return -1;
}
return qcrypto_block_qcow_init(block, options->u.qcow.key_secret,
- n_threads, errp);
+ errp);
}
}
@@ -135,7 +133,7 @@ qcrypto_block_qcow_create(QCryptoBlock *block,
return -1;
}
/* QCow2 has no special header, since everything is hardwired */
- return qcrypto_block_qcow_init(block, options->u.qcow.key_secret, 1, errp);
+ return qcrypto_block_qcow_init(block, options->u.qcow.key_secret, errp);
}
diff --git a/crypto/block.c b/crypto/block.c
index ba6d1cebc7..3bcc4270c3 100644
--- a/crypto/block.c
+++ b/crypto/block.c
@@ -53,7 +53,6 @@ QCryptoBlock *qcrypto_block_open(QCryptoBlockOpenOptions *options,
QCryptoBlockReadFunc readfunc,
void *opaque,
unsigned int flags,
- size_t n_threads,
Error **errp)
{
QCryptoBlock *block = g_new0(QCryptoBlock, 1);
@@ -73,7 +72,7 @@ QCryptoBlock *qcrypto_block_open(QCryptoBlockOpenOptions *options,
block->driver = qcrypto_block_drivers[options->format];
if (block->driver->open(block, options, optprefix,
- readfunc, opaque, flags, n_threads, errp) < 0)
+ readfunc, opaque, flags, errp) < 0)
{
g_free(block);
return NULL;
diff --git a/crypto/blockpriv.h b/crypto/blockpriv.h
index 4bf6043d5d..b8f77cb5eb 100644
--- a/crypto/blockpriv.h
+++ b/crypto/blockpriv.h
@@ -59,7 +59,6 @@ struct QCryptoBlockDriver {
QCryptoBlockReadFunc readfunc,
void *opaque,
unsigned int flags,
- size_t n_threads,
Error **errp);
int (*create)(QCryptoBlock *block,
diff --git a/include/crypto/block.h b/include/crypto/block.h
index 92e823c9f2..5b5d039800 100644
--- a/include/crypto/block.h
+++ b/include/crypto/block.h
@@ -76,7 +76,6 @@ typedef enum {
* @readfunc: callback for reading data from the volume
* @opaque: data to pass to @readfunc
* @flags: bitmask of QCryptoBlockOpenFlags values
- * @n_threads: allow concurrent I/O from up to @n_threads threads
* @errp: pointer to a NULL-initialized error object
*
* Create a new block encryption object for an existing
@@ -113,7 +112,6 @@ QCryptoBlock *qcrypto_block_open(QCryptoBlockOpenOptions *options,
QCryptoBlockReadFunc readfunc,
void *opaque,
unsigned int flags,
- size_t n_threads,
Error **errp);
typedef enum {
diff --git a/tests/unit/test-crypto-block.c b/tests/unit/test-crypto-block.c
index 6cfc817a92..42cfab6067 100644
--- a/tests/unit/test-crypto-block.c
+++ b/tests/unit/test-crypto-block.c
@@ -303,7 +303,6 @@ static void test_block(gconstpointer opaque)
test_block_read_func,
&header,
0,
- 1,
NULL);
g_assert(blk == NULL);
@@ -312,7 +311,6 @@ static void test_block(gconstpointer opaque)
test_block_read_func,
&header,
QCRYPTO_BLOCK_OPEN_NO_IO,
- 1,
&error_abort);
g_assert(qcrypto_block_get_cipher(blk) == NULL);
@@ -327,7 +325,6 @@ static void test_block(gconstpointer opaque)
test_block_read_func,
&header,
0,
- 1,
&error_abort);
g_assert(blk);
@@ -384,7 +381,6 @@ test_luks_bad_header(gconstpointer data)
test_block_read_func,
&buf,
0,
- 1,
&err);
g_assert(!blk);
g_assert(err);
--
2.39.3

@ -1,153 +0,0 @@
From 516bf44de08a13d97c08e210137078e642ce8e88 Mon Sep 17 00:00:00 2001
From: Kevin Wolf <kwolf@redhat.com>
Date: Wed, 17 May 2023 17:28:32 +0200
Subject: [PATCH 02/21] graph-lock: Disable locking for now
RH-Author: Kevin Wolf <kwolf@redhat.com>
RH-MergeRequest: 166: block/graph-lock: Disable locking for now
RH-Bugzilla: 2186725
RH-Acked-by: Eric Blake <eblake@redhat.com>
RH-Acked-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
RH-Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
RH-Commit: [2/4] 39d42fb527aad0491a018743289de7b762108317 (kmwolf/centos-qemu-kvm)
In QEMU 8.0, we've been seeing deadlocks in bdrv_graph_wrlock(). They
come from callers that hold an AioContext lock, which is not allowed
during polling. In theory, we could temporarily release the lock, but
callers are inconsistent about whether they hold a lock, and if they do,
some are also confused about which one they hold. While all of this is
fixable, it's not trivial, and the best course of action for 8.0.1 is
probably just disabling the graph locking code temporarily.
We don't rely on graph locking yet. It is supposed to replace
the AioContext lock eventually to enable multiqueue support, but as long
as we still have the AioContext lock, it is sufficient without the graph
lock. Once the AioContext lock goes away, the deadlock doesn't exist any
more either and this commit can be reverted. (Of course, it can also be
reverted while the AioContext lock still exists if the callers have been
fixed.)
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230517152834.277483-2-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 80fc5d260002432628710f8b0c7cfc7d9b97bb9d)
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
block/graph-lock.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/block/graph-lock.c b/block/graph-lock.c
index 259a7a0bde..2490926c90 100644
--- a/block/graph-lock.c
+++ b/block/graph-lock.c
@@ -30,8 +30,10 @@ BdrvGraphLock graph_lock;
/* Protects the list of aiocontext and orphaned_reader_count */
static QemuMutex aio_context_list_lock;
+#if 0
/* Written and read with atomic operations. */
static int has_writer;
+#endif
/*
* A reader coroutine could move from an AioContext to another.
@@ -88,6 +90,7 @@ void unregister_aiocontext(AioContext *ctx)
g_free(ctx->bdrv_graph);
}
+#if 0
static uint32_t reader_count(void)
{
BdrvGraphRWlock *brdv_graph;
@@ -105,10 +108,17 @@ static uint32_t reader_count(void)
assert((int32_t)rd >= 0);
return rd;
}
+#endif
void bdrv_graph_wrlock(void)
{
GLOBAL_STATE_CODE();
+ /*
+ * TODO Some callers hold an AioContext lock when this is called, which
+ * causes deadlocks. Reenable once the AioContext locking is cleaned up (or
+ * AioContext locks are gone).
+ */
+#if 0
assert(!qatomic_read(&has_writer));
/* Make sure that constantly arriving new I/O doesn't cause starvation */
@@ -139,11 +149,13 @@ void bdrv_graph_wrlock(void)
} while (reader_count() >= 1);
bdrv_drain_all_end();
+#endif
}
void bdrv_graph_wrunlock(void)
{
GLOBAL_STATE_CODE();
+#if 0
QEMU_LOCK_GUARD(&aio_context_list_lock);
assert(qatomic_read(&has_writer));
@@ -155,10 +167,13 @@ void bdrv_graph_wrunlock(void)
/* Wake up all coroutine that are waiting to read the graph */
qemu_co_enter_all(&reader_queue, &aio_context_list_lock);
+#endif
}
void coroutine_fn bdrv_graph_co_rdlock(void)
{
+ /* TODO Reenable when wrlock is reenabled */
+#if 0
BdrvGraphRWlock *bdrv_graph;
bdrv_graph = qemu_get_current_aio_context()->bdrv_graph;
@@ -223,10 +238,12 @@ void coroutine_fn bdrv_graph_co_rdlock(void)
qemu_co_queue_wait(&reader_queue, &aio_context_list_lock);
}
}
+#endif
}
void coroutine_fn bdrv_graph_co_rdunlock(void)
{
+#if 0
BdrvGraphRWlock *bdrv_graph;
bdrv_graph = qemu_get_current_aio_context()->bdrv_graph;
@@ -249,6 +266,7 @@ void coroutine_fn bdrv_graph_co_rdunlock(void)
if (qatomic_read(&has_writer)) {
aio_wait_kick();
}
+#endif
}
void bdrv_graph_rdlock_main_loop(void)
@@ -266,13 +284,19 @@ void bdrv_graph_rdunlock_main_loop(void)
void assert_bdrv_graph_readable(void)
{
/* reader_count() is slow due to aio_context_list_lock lock contention */
+ /* TODO Reenable when wrlock is reenabled */
+#if 0
#ifdef CONFIG_DEBUG_GRAPH_LOCK
assert(qemu_in_main_thread() || reader_count());
#endif
+#endif
}
void assert_bdrv_graph_writable(void)
{
assert(qemu_in_main_thread());
+ /* TODO Reenable when wrlock is reenabled */
+#if 0
assert(qatomic_read(&has_writer));
+#endif
}
--
2.39.3

@ -1,40 +0,0 @@
From b4645e7682aa1bde6f89df0eff2a9de83720eecc Mon Sep 17 00:00:00 2001
From: Ani Sinha <anisinha@redhat.com>
Date: Tue, 2 May 2023 15:51:53 +0530
Subject: [PATCH 3/3] hw/acpi: Mark acpi blobs as resizable on RHEL pc machines
version 7.6 and above
RH-Author: Ani Sinha <None>
RH-MergeRequest: 160: hw/acpi: limit warning on acpi table size to pc machines older than version 2.3
RH-Bugzilla: 1934134
RH-Acked-by: Igor Mammedov <imammedo@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: MST <mst@redhat.com>
RH-Commit: [2/2] 95d443af6e75c569d89d04d028012c3c56c0c3a4 (anisinha/centos-qemu-kvm)
Please look at QEMU upstream commit
1af507756bae7 ("hw/acpi: limit warning on acpi table size to pc machines older than version 2.3")
This patch adapts the above change so that it applies to RHEL pc machines of
version 7.6 and newer. These are the machine types that are currently supported
in RHEL. Q35 machines are not affected.
Signed-off-by: Ani Sinha <anisinha@redhat.com>
---
hw/i386/pc_piix.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 4d5880e249..6c7be628e1 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -914,6 +914,7 @@ static void pc_machine_rhel7_options(MachineClass *m)
m->default_machine_opts = "firmware=bios-256k.bin,hpet=off";
pcmc->default_nic_model = "e1000";
pcmc->pci_root_uid = 0;
+ pcmc->resizable_acpi_blob = true;
m->default_display = "std";
m->no_parallel = 1;
m->numa_mem_supported = true;
--
2.39.1

@ -1,101 +0,0 @@
From 3f70da88788c398877b8ded0b27689530385302b Mon Sep 17 00:00:00 2001
From: Ani Sinha <anisinha@redhat.com>
Date: Wed, 29 Mar 2023 10:27:26 +0530
Subject: [PATCH 2/3] hw/acpi: limit warning on acpi table size to pc machines
older than version 2.3
RH-Author: Ani Sinha <None>
RH-MergeRequest: 160: hw/acpi: limit warning on acpi table size to pc machines older than version 2.3
RH-Bugzilla: 1934134
RH-Acked-by: Igor Mammedov <imammedo@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: MST <mst@redhat.com>
RH-Commit: [1/2] 96c3b6d51e16734eb4e8de52635e0ca036964090 (anisinha/centos-qemu-kvm)
i440fx machine versions 2.3 and newer support dynamic RAM
resizing. See commit a1666142db6233 ("acpi-build: make ROMs RAM blocks resizeable").
All currently supported q35 machine types (versions 2.4 and newer) support
resizable RAM/ROM blocks. Therefore the warning generated when the ACPI table
size exceeds a pre-defined value does not apply to those machine versions.
Add a check limiting the warning message to only those machines that do not
support expandable RAM blocks (that is, i440fx machines with version 2.2
and older).
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20230329045726.14028-1-anisinha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 1af507756bae775028c27d30e602e2b9c72cd074)
---
hw/i386/acpi-build.c | 6 ++++--
hw/i386/pc.c | 1 +
hw/i386/pc_piix.c | 1 +
include/hw/i386/pc.h | 3 +++
4 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index ec857a117e..9bc4d8a981 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2695,7 +2695,8 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
int legacy_table_size =
ROUND_UP(tables_blob->len - aml_len + legacy_aml_len,
ACPI_BUILD_ALIGN_SIZE);
- if (tables_blob->len > legacy_table_size) {
+ if ((tables_blob->len > legacy_table_size) &&
+ !pcmc->resizable_acpi_blob) {
/* Should happen only with PCI bridges and -M pc-i440fx-2.0. */
warn_report("ACPI table size %u exceeds %d bytes,"
" migration may not work",
@@ -2706,7 +2707,8 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
g_array_set_size(tables_blob, legacy_table_size);
} else {
/* Make sure we have a buffer in case we need to resize the tables. */
- if (tables_blob->len > ACPI_BUILD_TABLE_SIZE / 2) {
+ if ((tables_blob->len > ACPI_BUILD_TABLE_SIZE / 2) &&
+ !pcmc->resizable_acpi_blob) {
/* As of QEMU 2.1, this fires with 160 VCPUs and 255 memory slots. */
warn_report("ACPI table size %u exceeds %d bytes,"
" migration may not work",
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index f216922cee..7db5a2348f 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2092,6 +2092,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
pcmc->acpi_data_size = 0x20000 + 0x8000;
pcmc->pvh_enabled = true;
pcmc->kvmclock_create_always = true;
+ pcmc->resizable_acpi_blob = true;
assert(!mc->get_hotplug_handler);
mc->async_pf_vmexit_disable = false;
mc->get_hotplug_handler = pc_get_hotplug_handler;
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index fc704d783f..4d5880e249 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -750,6 +750,7 @@ static void pc_i440fx_2_2_machine_options(MachineClass *m)
compat_props_add(m->compat_props, hw_compat_2_2, hw_compat_2_2_len);
compat_props_add(m->compat_props, pc_compat_2_2, pc_compat_2_2_len);
pcmc->rsdp_in_ram = false;
+ pcmc->resizable_acpi_blob = false;
}
DEFINE_I440FX_MACHINE(v2_2, "pc-i440fx-2.2", pc_compat_2_2_fn,
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index d218ad1628..2f514d13d8 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -130,6 +130,9 @@ struct PCMachineClass {
/* create kvmclock device even when KVM PV features are not exposed */
bool kvmclock_create_always;
+
+ /* resizable acpi blob compat */
+ bool resizable_acpi_blob;
};
#define TYPE_PC_MACHINE "generic-pc-machine"
--
2.39.1
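
The effect of the new flag on the two warning sites in acpi_build() can be sketched in isolation. This is a minimal standalone illustration, not QEMU code: the function name `should_warn_acpi_size`, its parameters, and the sizes used to exercise it are hypothetical. Only the idea of suppressing the warning when `resizable_acpi_blob` is set comes from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the patched condition: warn only when the
 * table blob exceeds the limit AND the machine type cannot resize it.
 */
bool should_warn_acpi_size(size_t blob_len, size_t limit,
                           bool resizable_acpi_blob)
{
    return blob_len > limit && !resizable_acpi_blob;
}
```

With this shape, a machine type that sets the flag to false (as pc-i440fx-2.2 does in the patch) keeps the old warning behavior, while every type inheriting the class default of true stays silent.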

@ -1,60 +0,0 @@
From 7b57aec372fc238cbaafe86557f9fb4b560895b1 Mon Sep 17 00:00:00 2001
From: Gavin Shan <gshan@redhat.com>
Date: Tue, 27 Jun 2023 20:20:09 +1000
Subject: [PATCH 2/6] hw/arm: Validate cluster and NUMA node boundary
RH-Author: Gavin Shan <gshan@redhat.com>
RH-MergeRequest: 175: hw/arm: Validate CPU cluster and NUMA node boundary for RHEL machines
RH-Bugzilla: 2171363
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Commit: [2/3] fcac7ea85d9f73613989903c642fc1bf6c51946b
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2171363
There are two NUMA-aware ARM machines: 'virt' and 'sbsa-ref'.
Both of them are required to follow the cluster-NUMA-node boundary.
Enable the validation so that a warning is raised for irregular
configurations where multiple CPUs in one cluster have been associated
with different NUMA nodes.
Signed-off-by: Gavin Shan <gshan@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230509002739.18388-3-gshan@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit fecff672351ace5e39adf7dbcf7a8ee748b201cb)
Signed-off-by: Gavin Shan <gshan@redhat.com>
---
hw/arm/sbsa-ref.c | 2 ++
hw/arm/virt.c | 2 ++
2 files changed, 4 insertions(+)
diff --git a/hw/arm/sbsa-ref.c b/hw/arm/sbsa-ref.c
index 0b93558dde..efb380e7c8 100644
--- a/hw/arm/sbsa-ref.c
+++ b/hw/arm/sbsa-ref.c
@@ -864,6 +864,8 @@ static void sbsa_ref_class_init(ObjectClass *oc, void *data)
mc->possible_cpu_arch_ids = sbsa_ref_possible_cpu_arch_ids;
mc->cpu_index_to_instance_props = sbsa_ref_cpu_index_to_props;
mc->get_default_cpu_node_id = sbsa_ref_get_default_cpu_node_id;
+ /* platform instead of architectural choice */
+ mc->cpu_cluster_has_numa_boundary = true;
}
static const TypeInfo sbsa_ref_info = {
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 9be53e9355..df6a0231bc 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -3083,6 +3083,8 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
mc->smp_props.clusters_supported = true;
mc->auto_enable_numa_with_memhp = true;
mc->auto_enable_numa_with_memdev = true;
+ /* platform instead of architectural choice */
+ mc->cpu_cluster_has_numa_boundary = true;
mc->default_ram_id = "mach-virt.ram";
object_class_property_add(oc, "acpi", "OnOffAuto",
--
2.39.3
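
The kind of check that `cpu_cluster_has_numa_boundary` turns on can be illustrated with a small standalone sketch. It is not the validation code in QEMU: `struct cpu_topo` and `clusters_respect_numa_boundary` are hypothetical names, and the sketch assumes each cluster ID is unique across the whole machine, which simplifies the real socket/cluster hierarchy.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU topology record; not a QEMU structure. */
struct cpu_topo {
    int cluster_id; /* assumed globally unique in this sketch */
    int node_id;    /* NUMA node the CPU is associated with */
};

/*
 * Return true when no two CPUs share a cluster while belonging to
 * different NUMA nodes. Quadratic on purpose, to keep the sketch short.
 */
bool clusters_respect_numa_boundary(const struct cpu_topo *cpus, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            if (cpus[i].cluster_id == cpus[j].cluster_id &&
                cpus[i].node_id != cpus[j].node_id) {
                return false;
            }
        }
    }
    return true;
}
```

A caller would print a warning, rather than fail, when the function returns false, which matches the commit message's description of warning about irregular configurations.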

@ -1,166 +0,0 @@
From a3412036477e8c91e0b71fcd91de4e24a9904077 Mon Sep 17 00:00:00 2001
From: Peter Maydell <peter.maydell@linaro.org>
Date: Tue, 25 Jul 2023 10:56:51 +0100
Subject: [PATCH 09/14] hw/arm/smmu: Handle big-endian hosts correctly
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Eric Auger <eric.auger@redhat.com>
RH-MergeRequest: 197: virtio-iommu/smmu: backport some late fixes
RH-Bugzilla: 2229133
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Peter Xu <peterx@redhat.com>
RH-Commit: [3/3] df9c8d228b25273e0c4927a10b21e66fb4bef5f0 (eauger1/centos-qemu-kvm)
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2229133
The implementation of the SMMUv3 has multiple places where it reads a
data structure from the guest and directly operates on it without
doing a guest-to-host endianness conversion. Since all SMMU data
structures are little-endian, this means that the SMMU doesn't work
on a big-endian host. In particular, this causes the Avocado test
machine_aarch64_virt.py:Aarch64VirtMachine.test_alpine_virt_tcg_gic_max
to fail on an s390x host.
Add appropriate byte-swapping on reads and writes of guest in-memory
data structures so that the device works correctly on big-endian
hosts.
As part of this we constrain queue_read() to operate only on Cmd
structs and queue_write() on Evt structs, because in practice these
are the only data structures the two functions are used with, and we
need to know what the data structure is to be able to byte-swap its
parts correctly.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20230717132641.764660-1-peter.maydell@linaro.org
Cc: qemu-stable@nongnu.org
(cherry picked from commit c6445544d4cea2628fbad3bad09f3d3a03c749d3)
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
hw/arm/smmu-common.c | 3 +--
hw/arm/smmuv3.c | 39 +++++++++++++++++++++++++++++++--------
2 files changed, 32 insertions(+), 10 deletions(-)
diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index e7f1c1f219..daa02ce798 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -192,8 +192,7 @@ static int get_pte(dma_addr_t baseaddr, uint32_t index, uint64_t *pte,
dma_addr_t addr = baseaddr + index * sizeof(*pte);
/* TODO: guarantee 64-bit single-copy atomicity */
- ret = dma_memory_read(&address_space_memory, addr, pte, sizeof(*pte),
- MEMTXATTRS_UNSPECIFIED);
+ ret = ldq_le_dma(&address_space_memory, addr, pte, MEMTXATTRS_UNSPECIFIED);
if (ret != MEMTX_OK) {
info->type = SMMU_PTW_ERR_WALK_EABT;
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 270c80b665..cfb56725a6 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -98,20 +98,34 @@ static void smmuv3_write_gerrorn(SMMUv3State *s, uint32_t new_gerrorn)
trace_smmuv3_write_gerrorn(toggled & pending, s->gerrorn);
}
-static inline MemTxResult queue_read(SMMUQueue *q, void *data)
+static inline MemTxResult queue_read(SMMUQueue *q, Cmd *cmd)
{
dma_addr_t addr = Q_CONS_ENTRY(q);
+ MemTxResult ret;
+ int i;
- return dma_memory_read(&address_space_memory, addr, data, q->entry_size,
- MEMTXATTRS_UNSPECIFIED);
+ ret = dma_memory_read(&address_space_memory, addr, cmd, sizeof(Cmd),
+ MEMTXATTRS_UNSPECIFIED);
+ if (ret != MEMTX_OK) {
+ return ret;
+ }
+ for (i = 0; i < ARRAY_SIZE(cmd->word); i++) {
+ le32_to_cpus(&cmd->word[i]);
+ }
+ return ret;
}
-static MemTxResult queue_write(SMMUQueue *q, void *data)
+static MemTxResult queue_write(SMMUQueue *q, Evt *evt_in)
{
dma_addr_t addr = Q_PROD_ENTRY(q);
MemTxResult ret;
+ Evt evt = *evt_in;
+ int i;
- ret = dma_memory_write(&address_space_memory, addr, data, q->entry_size,
+ for (i = 0; i < ARRAY_SIZE(evt.word); i++) {
+ cpu_to_le32s(&evt.word[i]);
+ }
+ ret = dma_memory_write(&address_space_memory, addr, &evt, sizeof(Evt),
MEMTXATTRS_UNSPECIFIED);
if (ret != MEMTX_OK) {
return ret;
@@ -291,7 +305,7 @@ static void smmuv3_init_regs(SMMUv3State *s)
static int smmu_get_ste(SMMUv3State *s, dma_addr_t addr, STE *buf,
SMMUEventInfo *event)
{
- int ret;
+ int ret, i;
trace_smmuv3_get_ste(addr);
/* TODO: guarantee 64-bit single-copy atomicity */
@@ -304,6 +318,9 @@ static int smmu_get_ste(SMMUv3State *s, dma_addr_t addr, STE *buf,
event->u.f_ste_fetch.addr = addr;
return -EINVAL;
}
+ for (i = 0; i < ARRAY_SIZE(buf->word); i++) {
+ le32_to_cpus(&buf->word[i]);
+ }
return 0;
}
@@ -313,7 +330,7 @@ static int smmu_get_cd(SMMUv3State *s, STE *ste, uint32_t ssid,
CD *buf, SMMUEventInfo *event)
{
dma_addr_t addr = STE_CTXPTR(ste);
- int ret;
+ int ret, i;
trace_smmuv3_get_cd(addr);
/* TODO: guarantee 64-bit single-copy atomicity */
@@ -326,6 +343,9 @@ static int smmu_get_cd(SMMUv3State *s, STE *ste, uint32_t ssid,
event->u.f_ste_fetch.addr = addr;
return -EINVAL;
}
+ for (i = 0; i < ARRAY_SIZE(buf->word); i++) {
+ le32_to_cpus(&buf->word[i]);
+ }
return 0;
}
@@ -407,7 +427,7 @@ static int smmu_find_ste(SMMUv3State *s, uint32_t sid, STE *ste,
return -EINVAL;
}
if (s->features & SMMU_FEATURE_2LVL_STE) {
- int l1_ste_offset, l2_ste_offset, max_l2_ste, span;
+ int l1_ste_offset, l2_ste_offset, max_l2_ste, span, i;
dma_addr_t l1ptr, l2ptr;
STEDesc l1std;
@@ -431,6 +451,9 @@ static int smmu_find_ste(SMMUv3State *s, uint32_t sid, STE *ste,
event->u.f_ste_fetch.addr = l1ptr;
return -EINVAL;
}
+ for (i = 0; i < ARRAY_SIZE(l1std.word); i++) {
+ le32_to_cpus(&l1std.word[i]);
+ }
span = L1STD_SPAN(&l1std);
--
2.39.3
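
The conversion this patch adds (treating each guest structure as an array of little-endian 32-bit words and converting every word to host order) can be sketched without any QEMU headers. The helpers `load_le32` and `le32_words_to_host` below are hypothetical names for illustration; in the patch itself the work is done by le32_to_cpus() and cpu_to_le32s().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Read one 32-bit little-endian value from raw bytes. Building the value
 * byte by byte gives the same result on little- and big-endian hosts.
 */
uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] |
           ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}

/*
 * Convert an array of little-endian 32-bit words to host order in place,
 * similar in spirit to looping le32_to_cpus() over a word[] array.
 */
void le32_words_to_host(uint32_t *words, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        uint8_t raw[4];

        memcpy(raw, &words[i], sizeof(raw));
        words[i] = load_le32(raw);
    }
}
```

On a little-endian host the loop leaves the data unchanged, which is why the missing conversion went unnoticed until the device model was exercised on a big-endian host.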

@ -0,0 +1,120 @@
From 41c4083269ec772b406c6c57b496ca2011f928c7 Mon Sep 17 00:00:00 2001
From: Zhenyu Zhang <zhenyzha@redhat.com>
Date: Tue, 9 Jul 2024 23:08:59 -0400
Subject: [PATCH 2/2] hw/arm/virt: Avoid unexpected warning from Linux guest on
host with Fujitsu CPUs
RH-Author: zhenyzha <None>
RH-MergeRequest: 256: hw/arm/virt: Avoid unexpected warning from Linux guest on host with Fujitsu CPUs
RH-Jira: RHEL-39936
RH-Acked-by: Gavin Shan <gshan@redhat.com>
RH-Acked-by: Sebastian Ott <sebott@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Commit: [1/1] fdf156fd05b219a06e2e2ca409fff0f728c1e2cf (zhenyzha/qemu-kvm)
JIRA: https://issues.redhat.com/browse/RHEL-39936
Multiple warning messages and corresponding backtraces are observed when a Linux
guest is booted on a host with Fujitsu CPUs. One of them is shown below.
[ 0.032443] ------------[ cut here ]------------
[ 0.032446] uart-pl011 9000000.pl011: ARCH_DMA_MINALIGN smaller than
CTR_EL0.CWG (128 < 256)
[ 0.032454] WARNING: CPU: 0 PID: 1 at arch/arm64/mm/dma-mapping.c:54
arch_setup_dma_ops+0xbc/0xcc
[ 0.032470] Modules linked in:
[ 0.032475] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-452.el9.aarch64
[ 0.032481] Hardware name: linux,dummy-virt (DT)
[ 0.032484] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 0.032490] pc : arch_setup_dma_ops+0xbc/0xcc
[ 0.032496] lr : arch_setup_dma_ops+0xbc/0xcc
[ 0.032501] sp : ffff80008003b860
[ 0.032503] x29: ffff80008003b860 x28: 0000000000000000 x27: ffffaae4b949049c
[ 0.032510] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
[ 0.032517] x23: 0000000000000100 x22: 0000000000000000 x21: 0000000000000000
[ 0.032523] x20: 0000000100000000 x19: ffff2f06c02ea400 x18: ffffffffffffffff
[ 0.032529] x17: 00000000208a5f76 x16: 000000006589dbcb x15: ffffaae4ba071c89
[ 0.032535] x14: 0000000000000000 x13: ffffaae4ba071c84 x12: 455f525443206e61
[ 0.032541] x11: 68742072656c6c61 x10: 0000000000000029 x9 : ffffaae4b7d21da4
[ 0.032547] x8 : 0000000000000029 x7 : 4c414e494d5f414d x6 : 0000000000000029
[ 0.032553] x5 : 000000000000000f x4 : ffffaae4b9617a00 x3 : 0000000000000001
[ 0.032558] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff2f06c029be40
[ 0.032564] Call trace:
[ 0.032566] arch_setup_dma_ops+0xbc/0xcc
[ 0.032572] of_dma_configure_id+0x138/0x300
[ 0.032591] amba_dma_configure+0x34/0xc0
[ 0.032600] really_probe+0x78/0x3dc
[ 0.032614] __driver_probe_device+0x108/0x160
[ 0.032619] driver_probe_device+0x44/0x114
[ 0.032624] __device_attach_driver+0xb8/0x14c
[ 0.032629] bus_for_each_drv+0x88/0xe4
[ 0.032634] __device_attach+0xb0/0x1e0
[ 0.032638] device_initial_probe+0x18/0x20
[ 0.032643] bus_probe_device+0xa8/0xb0
[ 0.032648] device_add+0x4b4/0x6c0
[ 0.032652] amba_device_try_add.part.0+0x48/0x360
[ 0.032657] amba_device_add+0x104/0x144
[ 0.032662] of_amba_device_create.isra.0+0x100/0x1c4
[ 0.032666] of_platform_bus_create+0x294/0x35c
[ 0.032669] of_platform_populate+0x5c/0x150
[ 0.032672] of_platform_default_populate_init+0xd0/0xec
[ 0.032697] do_one_initcall+0x4c/0x2e0
[ 0.032701] do_initcalls+0x100/0x13c
[ 0.032707] kernel_init_freeable+0x1c8/0x21c
[ 0.032712] kernel_init+0x28/0x140
[ 0.032731] ret_from_fork+0x10/0x20
[ 0.032735] ---[ end trace 0000000000000000 ]---
In Linux, a check is applied to every device which is exposed through
a device-tree node. The warning message is raised when the device isn't
DMA coherent and the cache line size is larger than ARCH_DMA_MINALIGN
(128 bytes). The cache line size is taken from CTR_EL0[CWG], which corresponds
to 256 bytes on the guest CPUs. The DMA coherent capability is claimed
through 'dma-coherent' in their device-tree nodes or parent nodes.
This happens even when the device doesn't implement or use DMA at all,
for legacy reasons.
Fix the issue by adding the 'dma-coherent' property to the device-tree root
node, meaning all devices are DMA coherent by default.
This both suppresses the spurious kernel warnings and also guards
against possible future QEMU bugs where we add a DMA-capable device
and forget to mark it as dma-coherent.
Signed-off-by: Zhenyu Zhang <zhenyzha@redhat.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Donald Dutile <ddutile@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-id: 20240612020506.307793-1-zhenyzha@redhat.com
[PMM: tweaked commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit dda533087ad5559674ff486e7031c88dc01e0abd)
Signed-off-by: Zhenyu Zhang <zhenyzha@redhat.com>
---
hw/arm/virt.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 3f0496cdb9..6ece67f11d 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -330,6 +330,17 @@ static void create_fdt(VirtMachineState *vms)
qemu_fdt_setprop_cell(fdt, "/", "#size-cells", 0x2);
qemu_fdt_setprop_string(fdt, "/", "model", "linux,dummy-virt");
+ /*
+ * For QEMU, all DMA is coherent. Advertising this in the root node
+ * has two benefits:
+ *
+ * - It avoids potential bugs where we forget to mark a DMA
+ * capable device as being dma-coherent
+ * - It avoids spurious warnings from the Linux kernel about
+ * devices which can't do DMA at all
+ */
+ qemu_fdt_setprop(fdt, "/", "dma-coherent", NULL, 0);
+
/* /chosen must exist for load_dtb to fill in necessary properties later */
qemu_fdt_add_subnode(fdt, "/chosen");
if (vms->dtb_randomness) {
--
2.39.3
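
The arithmetic behind the "128 < 256" in the quoted warning can be worked through with a short sketch. It assumes the architectural encoding of CTR_EL0.CWG (bits [27:24], log2 of the granule size in 4-byte words) and the 128-byte ARCH_DMA_MINALIGN stated in the commit message. The names `cwg_bytes`, `would_warn`, and `SKETCH_DMA_MINALIGN` are hypothetical, and returning 0 when the field is 0 is a simplification chosen for this sketch, not the kernel's behavior.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the 128-byte ARCH_DMA_MINALIGN mentioned in the commit message. */
#define SKETCH_DMA_MINALIGN 128u

/*
 * Decode CTR_EL0.CWG into bytes. The field holds log2 of the granule in
 * 4-byte words. A field value of 0 means "no information provided";
 * this sketch reports that as 0 bytes.
 */
unsigned cwg_bytes(uint64_t ctr_el0)
{
    unsigned cwg = (unsigned)((ctr_el0 >> 24) & 0xf);

    return cwg ? 4u << cwg : 0u;
}

/* The warning condition as the commit message describes it. */
bool would_warn(uint64_t ctr_el0, bool dma_coherent)
{
    return !dma_coherent && cwg_bytes(ctr_el0) > SKETCH_DMA_MINALIGN;
}
```

A CWG field of 6 gives 4 << 6 = 256 bytes, which exceeds 128, so a device without 'dma-coherent' trips the check. Marking the root node coherent makes the first operand false for every device.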

@ -0,0 +1,59 @@
From e3360c415f7de923d27c3167260a93cb679afabe Mon Sep 17 00:00:00 2001
From: Eric Auger <eric.auger@redhat.com>
Date: Mon, 6 May 2024 15:09:43 +0200
Subject: [PATCH 1/2] hw/arm/virt: Fix spurious call to arm_virt_compat_set()
RH-Author: Eric Auger <eric.auger@redhat.com>
RH-MergeRequest: 238: hw/arm/virt: Fix spurious call to arm_virt_compat_set()
RH-Jira: RHEL-34945
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Gavin Shan <gshan@redhat.com>
RH-Commit: [1/1] a858a3e1dff12b28e14f7e4bd2b896a9f06eacbb (eauger1/centos-qemu-kvm)
JIRA: https://issues.redhat.com/browse/RHEL-34945
Status: RHEL-only
Downstream, we apply arm_rhel_compat in place of arm_virt_compat.
This is done through arm_rhel_compat_set(), which is transparently called in
DEFINE_RHEL_MACHINE_LATEST(). So there is no need to call
arm_virt_compat_set() in rhel_machine_class_init(). Besides,
this triggers a "GLib: g_ptr_array_add: assertion 'rarray' failed"
warning.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
hw/arm/virt.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index f1af9495c6..3f0496cdb9 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -85,6 +85,7 @@
#include "hw/char/pl011.h"
#include "qemu/guest-random.h"
+#if 0 /* Disabled for Red Hat Enterprise Linux */
static GlobalProperty arm_virt_compat[] = {
{ TYPE_VIRTIO_IOMMU_PCI, "aw-bits", "48" },
};
@@ -101,7 +102,6 @@ static void arm_virt_compat_set(MachineClass *mc)
arm_virt_compat_len);
}
-#if 0 /* Disabled for Red Hat Enterprise Linux */
#define DEFINE_VIRT_MACHINE_LATEST(major, minor, latest) \
static void virt_##major##_##minor##_class_init(ObjectClass *oc, \
void *data) \
@@ -3536,7 +3536,6 @@ static void rhel_machine_class_init(ObjectClass *oc, void *data)
{
MachineClass *mc = MACHINE_CLASS(oc);
HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
- arm_virt_compat_set(mc);
mc->family = "virt-rhel-Z";
mc->init = machvirt_init;
--
2.39.3

@ -1,41 +0,0 @@
From 022529f6d0ee306da857825c72a98bf7ddf5de22 Mon Sep 17 00:00:00 2001
From: Gavin Shan <gshan@redhat.com>
Date: Tue, 27 Jun 2023 20:20:09 +1000
Subject: [PATCH 3/6] hw/arm/virt: Validate cluster and NUMA node boundary for
RHEL machines
RH-Author: Gavin Shan <gshan@redhat.com>
RH-MergeRequest: 175: hw/arm: Validate CPU cluster and NUMA node boundary for RHEL machines
RH-Bugzilla: 2171363
RH-Acked-by: Cornelia Huck <cohuck@redhat.com>
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Commit: [3/3] a396c499259b566861ca007b01f8539bf6113711
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2171363
Upstream Status: RHEL only
Set mc->cpu_cluster_has_numa_boundary to true so that the boundary of
CPU cluster and NUMA node will be validated for 'virt-rhel*' machines.
A warning message will be printed if the boundary is broken.
Signed-off-by: Gavin Shan <gshan@redhat.com>
---
hw/arm/virt.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index df6a0231bc..faf68488d5 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -3530,6 +3530,8 @@ static void rhel_machine_class_init(ObjectClass *oc, void *data)
mc->smp_props.clusters_supported = true;
mc->auto_enable_numa_with_memhp = true;
mc->auto_enable_numa_with_memdev = true;
+ /* platform instead of architectural choice */
+ mc->cpu_cluster_has_numa_boundary = true;
mc->default_ram_id = "mach-virt.ram";
object_class_property_add(oc, "acpi", "OnOffAuto",
--
2.39.3

@ -0,0 +1,73 @@
From e74980be81d641736ea9d44d0fe9af02af63a220 Mon Sep 17 00:00:00 2001
From: Michael Roth <michael.roth@amd.com>
Date: Thu, 30 May 2024 06:16:40 -0500
Subject: [PATCH 083/100] hw/i386: Add support for loading BIOS using
guest_memfd
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [83/91] 7b77d212ef7d83b66ad9d8348179ee84e64fb911 (bonzini/rhel-qemu-kvm)
When guest_memfd is enabled, the BIOS is generally part of the initial
encrypted guest image and will be accessed as private guest memory. Add
the necessary changes to set up the associated RAM region with a
guest_memfd backend to allow for this.
Current support centers around using -bios to load the BIOS data.
Support for loading the BIOS via pflash requires additional enablement
since those interfaces rely on the use of ROM memory regions which make
use of the KVM_MEM_READONLY memslot flag, which is not supported for
guest_memfd-backed memslots.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-29-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit fc7a69e177e4ba26d11fcf47b853f85115b35a11)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
hw/i386/x86-common.c | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/hw/i386/x86-common.c b/hw/i386/x86-common.c
index 35fe6eabea..6cbb76c25c 100644
--- a/hw/i386/x86-common.c
+++ b/hw/i386/x86-common.c
@@ -969,8 +969,13 @@ void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
(bios_size % 65536) != 0) {
goto bios_error;
}
- memory_region_init_ram(&x86ms->bios, NULL, "pc.bios", bios_size,
- &error_fatal);
+ if (machine_require_guest_memfd(MACHINE(x86ms))) {
+ memory_region_init_ram_guest_memfd(&x86ms->bios, NULL, "pc.bios",
+ bios_size, &error_fatal);
+ } else {
+ memory_region_init_ram(&x86ms->bios, NULL, "pc.bios",
+ bios_size, &error_fatal);
+ }
if (sev_enabled()) {
/*
* The concept of a "reset" simply doesn't exist for
@@ -991,9 +996,11 @@ void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
}
g_free(filename);
- /* map the last 128KB of the BIOS in ISA space */
- x86_isa_bios_init(&x86ms->isa_bios, rom_memory, &x86ms->bios,
- !isapc_ram_fw);
+ if (!machine_require_guest_memfd(MACHINE(x86ms))) {
+ /* map the last 128KB of the BIOS in ISA space */
+ x86_isa_bios_init(&x86ms->isa_bios, rom_memory, &x86ms->bios,
+ !isapc_ram_fw);
+ }
/* map all the bios at the top of memory */
memory_region_add_subregion(rom_memory,
--
2.39.3

@ -0,0 +1,106 @@
From c1e615d6b8f609b72a94ffe6d31a9848a41744ef Mon Sep 17 00:00:00 2001
From: Bernhard Beschow <shentey@gmail.com>
Date: Tue, 30 Apr 2024 17:06:39 +0200
Subject: [PATCH 038/100] hw/i386: Have x86_bios_rom_init() take
X86MachineState rather than MachineState
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [38/91] 59f388b1dffc5d0aa2f0fff768194d755bc3efbb (bonzini/rhel-qemu-kvm)
The function creates and leaks two MemoryRegion objects for the BIOS, which
will be moved into X86MachineState in the next steps to avoid the leakage.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240430150643.111976-3-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 848351840148f8c3b53ddf6210194506547d3ffd)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
hw/i386/microvm.c | 2 +-
hw/i386/pc_sysfw.c | 4 ++--
hw/i386/x86.c | 4 ++--
include/hw/i386/x86.h | 2 +-
4 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/hw/i386/microvm.c b/hw/i386/microvm.c
index 61a772dfe6..fec63cacfa 100644
--- a/hw/i386/microvm.c
+++ b/hw/i386/microvm.c
@@ -278,7 +278,7 @@ static void microvm_devices_init(MicrovmMachineState *mms)
default_firmware = x86_machine_is_acpi_enabled(x86ms)
? MICROVM_BIOS_FILENAME
: MICROVM_QBOOT_FILENAME;
- x86_bios_rom_init(MACHINE(mms), default_firmware, get_system_memory(), true);
+ x86_bios_rom_init(x86ms, default_firmware, get_system_memory(), true);
}
static void microvm_memory_init(MicrovmMachineState *mms)
diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
index 3efabbbab2..ef7dea9798 100644
--- a/hw/i386/pc_sysfw.c
+++ b/hw/i386/pc_sysfw.c
@@ -206,7 +206,7 @@ void pc_system_firmware_init(PCMachineState *pcms,
BlockBackend *pflash_blk[ARRAY_SIZE(pcms->flash)];
if (!pcmc->pci_enabled) {
- x86_bios_rom_init(MACHINE(pcms), "bios.bin", rom_memory, true);
+ x86_bios_rom_init(X86_MACHINE(pcms), "bios.bin", rom_memory, true);
return;
}
@@ -227,7 +227,7 @@ void pc_system_firmware_init(PCMachineState *pcms,
if (!pflash_blk[0]) {
/* Machine property pflash0 not set, use ROM mode */
- x86_bios_rom_init(MACHINE(pcms), "bios.bin", rom_memory, false);
+ x86_bios_rom_init(X86_MACHINE(pcms), "bios.bin", rom_memory, false);
} else {
if (kvm_enabled() && !kvm_readonly_mem_enabled()) {
/*
diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 2a4f3ee285..6d3c72f124 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -1128,7 +1128,7 @@ void x86_load_linux(X86MachineState *x86ms,
nb_option_roms++;
}
-void x86_bios_rom_init(MachineState *ms, const char *default_firmware,
+void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
MemoryRegion *rom_memory, bool isapc_ram_fw)
{
const char *bios_name;
@@ -1138,7 +1138,7 @@ void x86_bios_rom_init(MachineState *ms, const char *default_firmware,
ssize_t ret;
/* BIOS load */
- bios_name = ms->firmware ?: default_firmware;
+ bios_name = MACHINE(x86ms)->firmware ?: default_firmware;
filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
if (filename) {
bios_size = get_image_size(filename);
diff --git a/include/hw/i386/x86.h b/include/hw/i386/x86.h
index 4dc30dcb4d..cb07618d19 100644
--- a/include/hw/i386/x86.h
+++ b/include/hw/i386/x86.h
@@ -116,7 +116,7 @@ void x86_cpu_unplug_request_cb(HotplugHandler *hotplug_dev,
void x86_cpu_unplug_cb(HotplugHandler *hotplug_dev,
DeviceState *dev, Error **errp);
-void x86_bios_rom_init(MachineState *ms, const char *default_firmware,
+void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
MemoryRegion *rom_memory, bool isapc_ram_fw);
void x86_load_linux(X86MachineState *x86ms,
--
2.39.3

@ -0,0 +1,51 @@
From 7bb1f124413891bc5d2187f12cd19da6e794904b Mon Sep 17 00:00:00 2001
From: Xiaoyao Li <xiaoyao.li@intel.com>
Date: Wed, 3 Apr 2024 10:59:53 -0400
Subject: [PATCH 010/100] hw/i386/acpi: Set PCAT_COMPAT bit only when pic is
not disabled
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [10/91] 62110e4bf52cb3e106c8d2a902bbd31548beba00 (bonzini/rhel-qemu-kvm)
A value of 1 in PCAT_COMPAT (bit 0) of MADT.Flags indicates that the system
also has a PC-AT-compatible dual-8259 setup, i.e., the PIC. When the PIC
is not enabled (pic=off) for an x86 machine, the PCAT_COMPAT bit needs to
be cleared. The PIC probe should then print:
[ 0.155970] Using NULL legacy PIC
However, no such log is printed in the guest kernel unless PCAT_COMPAT is
cleared.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240403145953.3082491-1-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 292dd287e78e0cbafde9d1522c729349d132d844)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
hw/i386/acpi-common.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/hw/i386/acpi-common.c b/hw/i386/acpi-common.c
index 20f19269da..0cc2919bb8 100644
--- a/hw/i386/acpi-common.c
+++ b/hw/i386/acpi-common.c
@@ -107,7 +107,9 @@ void acpi_build_madt(GArray *table_data, BIOSLinker *linker,
acpi_table_begin(&table, table_data);
/* Local APIC Address */
build_append_int_noprefix(table_data, APIC_DEFAULT_ADDRESS, 4);
- build_append_int_noprefix(table_data, 1 /* PCAT_COMPAT */, 4); /* Flags */
+ /* Flags. bit 0: PCAT_COMPAT */
+ build_append_int_noprefix(table_data,
+ x86ms->pic != ON_OFF_AUTO_OFF ? 1 : 0 , 4);
for (i = 0; i < apic_ids->len; i++) {
pc_madt_cpu_entry(i, apic_ids, table_data, false);
--
2.39.3

@ -0,0 +1,164 @@
From fd6de3c5e97bdf13a39342fc71815a20c66867ae Mon Sep 17 00:00:00 2001
From: Bernhard Beschow <shentey@gmail.com>
Date: Wed, 8 May 2024 19:55:07 +0200
Subject: [PATCH 043/100] hw/i386/pc_sysfw: Alias rather than copy isa-bios
region
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [43/91] f64dab2a091838a10a9b94e3d09ea11432b0809f (bonzini/rhel-qemu-kvm)
In the -bios case the "isa-bios" memory region is an alias to the BIOS mapped
to the top of the 4G memory boundary. Do the same in the -pflash case, but only
for new machine versions for migration compatibility. This establishes common
behavior and makes pflash commands work in the "isa-bios" region, which some
real-world legacy BIOSes rely on.
Note that in the sev_enabled() case, the "isa-bios" memory region in the -pflash
case will now also point to encrypted memory, just like it already does in the
-bios case.
Running `info mtree` before and after this commit with
`qemu-system-x86_64 -S -drive \
if=pflash,format=raw,readonly=on,file=/usr/share/qemu/bios-256k.bin` and then
comparing the two outputs with `diff -u before.mtree after.mtree` shows the
following changes in the memory tree:
| --- before.mtree
| +++ after.mtree
| @@ -71,7 +71,7 @@
| 0000000000000000-ffffffffffffffff (prio -1, i/o): pci
| 00000000000a0000-00000000000bffff (prio 1, i/o): vga-lowmem
| 00000000000c0000-00000000000dffff (prio 1, rom): pc.rom
| - 00000000000e0000-00000000000fffff (prio 1, rom): isa-bios
| + 00000000000e0000-00000000000fffff (prio 1, romd): alias isa-bios @system.flash0 0000000000020000-000000000003ffff
| 00000000000a0000-00000000000bffff (prio 1, i/o): alias smram-region @pci 00000000000a0000-00000000000bffff
| 00000000000c0000-00000000000c3fff (prio 1, i/o): alias pam-pci @pci 00000000000c0000-00000000000c3fff
| 00000000000c4000-00000000000c7fff (prio 1, i/o): alias pam-pci @pci 00000000000c4000-00000000000c7fff
| @@ -108,7 +108,7 @@
| 0000000000000000-ffffffffffffffff (prio -1, i/o): pci
| 00000000000a0000-00000000000bffff (prio 1, i/o): vga-lowmem
| 00000000000c0000-00000000000dffff (prio 1, rom): pc.rom
| - 00000000000e0000-00000000000fffff (prio 1, rom): isa-bios
| + 00000000000e0000-00000000000fffff (prio 1, romd): alias isa-bios @system.flash0 0000000000020000-000000000003ffff
| 00000000000a0000-00000000000bffff (prio 1, i/o): alias smram-region @pci 00000000000a0000-00000000000bffff
| 00000000000c0000-00000000000c3fff (prio 1, i/o): alias pam-pci @pci 00000000000c0000-00000000000c3fff
| 00000000000c4000-00000000000c7fff (prio 1, i/o): alias pam-pci @pci 00000000000c4000-00000000000c7fff
| @@ -131,11 +131,14 @@
| memory-region: pc.ram
| 0000000000000000-0000000007ffffff (prio 0, ram): pc.ram
|
| +memory-region: system.flash0
| + 00000000fffc0000-00000000ffffffff (prio 0, romd): system.flash0
| +
| memory-region: pci
| 0000000000000000-ffffffffffffffff (prio -1, i/o): pci
| 00000000000a0000-00000000000bffff (prio 1, i/o): vga-lowmem
| 00000000000c0000-00000000000dffff (prio 1, rom): pc.rom
| - 00000000000e0000-00000000000fffff (prio 1, rom): isa-bios
| + 00000000000e0000-00000000000fffff (prio 1, romd): alias isa-bios @system.flash0 0000000000020000-000000000003ffff
|
| memory-region: smram
| 00000000000a0000-00000000000bffff (prio 0, ram): alias smram-low @pc.ram 00000000000a0000-00000000000bffff
Note that in both cases the "system" memory region contains the entry
00000000fffc0000-00000000ffffffff (prio 0, romd): system.flash0
but the "system.flash0" memory region only appears standalone when "isa-bios" is
an alias.
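The addresses in the memory tree above follow from simple arithmetic: the "isa-bios" window covers the last 128 KiB of the flash (or all of it, if smaller) and ends at the 1 MiB boundary. A minimal sketch of that calculation, assuming hypothetical names (`IsaBiosWindow`, `isa_bios_window()`) that do not exist in QEMU:

```c
#include <assert.h>
#include <stdint.h>

#define KIB          1024ULL
#define ISA_BIOS_MAX (128 * KIB)   /* at most the last 128 KiB is exposed */
#define ONE_MIB      0x100000ULL   /* the window ends at the 1 MiB boundary */

/* Hypothetical description of where the "isa-bios" window lands. */
typedef struct {
    uint64_t guest_start;   /* guest address where the window starts */
    uint64_t flash_offset;  /* offset into the flash region it refers to */
    uint64_t size;          /* window size in bytes */
} IsaBiosWindow;

/* Compute the window for a flash region of flash_size bytes. */
static IsaBiosWindow isa_bios_window(uint64_t flash_size)
{
    IsaBiosWindow w;

    assert(flash_size > 0);
    w.size = flash_size < ISA_BIOS_MAX ? flash_size : ISA_BIOS_MAX;
    w.guest_start = ONE_MIB - w.size;
    w.flash_offset = flash_size - w.size;
    return w;
}
```

For the 256 KiB image used above, this gives a window at 0xe0000 that refers to flash offset 0x20000 with size 0x20000, which matches the `alias isa-bios @system.flash0 0000000000020000-000000000003ffff` line.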
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-ID: <20240508175507.22270-7-shentey@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit a44ea3fa7f2aa1d809fdca1b84a52695b53d8ad0)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
hw/i386/pc.c | 1 +
hw/i386/pc_piix.c | 1 +
hw/i386/pc_q35.c | 1 +
hw/i386/pc_sysfw.c | 8 +++++++-
include/hw/i386/pc.h | 1 +
5 files changed, 11 insertions(+), 1 deletion(-)
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 1a34bc4522..660a59c63b 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1967,6 +1967,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
pcmc->has_reserved_memory = true;
pcmc->enforce_aligned_dimm = true;
pcmc->enforce_amd_1tb_hole = true;
+ pcmc->isa_bios_alias = true;
/* BIOS ACPI tables: 128K. Other BIOS datastructures: less than 4K reported
* to be used at the moment, 32K should be enough for a while. */
pcmc->acpi_data_size = 0x20000 + 0x8000;
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index bef3e8b73e..dbb7f2ed17 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -975,6 +975,7 @@ static void pc_machine_rhel7_options(MachineClass *m)
m->alias = "pc";
m->is_default = 1;
m->smp_props.prefer_sockets = true;
+ pcmc->isa_bios_alias = false;
}
static void pc_init_rhel760(MachineState *machine)
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index dedc86eec9..f9900ad798 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -735,6 +735,7 @@ static void pc_q35_machine_rhel940_options(MachineClass *m)
m->desc = "RHEL-9.4.0 PC (Q35 + ICH9, 2009)";
pcmc->smbios_stream_product = "RHEL";
pcmc->smbios_stream_version = "9.4.0";
+ pcmc->isa_bios_alias = false;
compat_props_add(m->compat_props, pc_rhel_9_5_compat,
pc_rhel_9_5_compat_len);
diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
index 82d37cb376..ac88ad4eb9 100644
--- a/hw/i386/pc_sysfw.c
+++ b/hw/i386/pc_sysfw.c
@@ -135,6 +135,7 @@ static void pc_system_flash_map(PCMachineState *pcms,
MemoryRegion *rom_memory)
{
X86MachineState *x86ms = X86_MACHINE(pcms);
+ PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
hwaddr total_size = 0;
int i;
BlockBackend *blk;
@@ -184,7 +185,12 @@ static void pc_system_flash_map(PCMachineState *pcms,
if (i == 0) {
flash_mem = pflash_cfi01_get_memory(system_flash);
- pc_isa_bios_init(&x86ms->isa_bios, rom_memory, flash_mem);
+ if (pcmc->isa_bios_alias) {
+ x86_isa_bios_init(&x86ms->isa_bios, rom_memory, flash_mem,
+ true);
+ } else {
+ pc_isa_bios_init(&x86ms->isa_bios, rom_memory, flash_mem);
+ }
/* Encrypt the pflash boot ROM */
if (sev_enabled()) {
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 467e7fb52f..3f53ec73ac 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -122,6 +122,7 @@ struct PCMachineClass {
bool enforce_aligned_dimm;
bool broken_reserved_end;
bool enforce_amd_1tb_hole;
+ bool isa_bios_alias;
/* generate legacy CPU hotplug AML */
bool legacy_cpu_hotplug;
--
2.39.3

@ -0,0 +1,53 @@
From 9bf1d368c4b53139db39649833d475e097fc98d1 Mon Sep 17 00:00:00 2001
From: Bernhard Beschow <shentey@gmail.com>
Date: Mon, 22 Apr 2024 22:06:22 +0200
Subject: [PATCH 039/100] hw/i386/pc_sysfw: Remove unused parameter from
pc_isa_bios_init()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [39/91] c0019dc2706a8e3f40486fd4a4c0dd1fbe23237b (bonzini/rhel-qemu-kvm)
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240422200625.2768-2-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit f4b63768b91811cdcf1fb7b270587123251dfea5)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
hw/i386/pc_sysfw.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
index ef7dea9798..59c7a81692 100644
--- a/hw/i386/pc_sysfw.c
+++ b/hw/i386/pc_sysfw.c
@@ -41,8 +41,7 @@
#define FLASH_SECTOR_SIZE 4096
static void pc_isa_bios_init(MemoryRegion *rom_memory,
- MemoryRegion *flash_mem,
- int ram_size)
+ MemoryRegion *flash_mem)
{
int isa_bios_size;
MemoryRegion *isa_bios;
@@ -186,7 +185,7 @@ static void pc_system_flash_map(PCMachineState *pcms,
if (i == 0) {
flash_mem = pflash_cfi01_get_memory(system_flash);
- pc_isa_bios_init(rom_memory, flash_mem, size);
+ pc_isa_bios_init(rom_memory, flash_mem);
/* Encrypt the pflash boot ROM */
if (sev_enabled()) {
--
2.39.3

@ -0,0 +1,158 @@
From e6472ff46cbed97c2a238a8ef7d321351931333a Mon Sep 17 00:00:00 2001
From: Brijesh Singh <brijesh.singh@amd.com>
Date: Thu, 30 May 2024 06:16:30 -0500
Subject: [PATCH 070/100] hw/i386/sev: Add function to get SEV metadata from
OVMF header
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [70/91] ba818dade96119c8a51ca1fb222f4f69e2752396 (bonzini/rhel-qemu-kvm)
A recent version of OVMF expanded the reset vector GUID list to add an
SEV-specific metadata GUID. The SEV metadata describes the reserved
memory regions such as the secrets and CPUID page used during the SEV-SNP
guest launch.
pc_system_get_ovmf_sev_metadata_ptr() is used to retrieve the SEV
metadata pointer from the OVMF GUID list.
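The lookup can be pictured as: take an offset measured back from the end of the flash image, check for the "ASEV" signature, and sanity-check the length before using the table. The sketch below is a simplified, standalone illustration; `SevMetadataHeader` and `find_sev_metadata()` are hypothetical names, and the bounds checks are this sketch's own choice rather than a copy of the comparisons in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed part of the metadata table; field names follow the patch. */
typedef struct __attribute__((__packed__)) {
    uint8_t  signature[4];   /* expected to be "ASEV" */
    uint32_t len;            /* total table length in bytes */
    uint32_t version;
    uint32_t num_desc;
} SevMetadataHeader;

/*
 * Return the header located offset_from_end bytes before the end of the
 * flash image, or NULL if the offset, signature or length look wrong.
 */
static const SevMetadataHeader *find_sev_metadata(const uint8_t *flash,
                                                  size_t flash_size,
                                                  uint32_t offset_from_end)
{
    const SevMetadataHeader *hdr;

    assert(flash != NULL);
    if (offset_from_end > flash_size || offset_from_end < sizeof(*hdr)) {
        return NULL;
    }
    hdr = (const SevMetadataHeader *)(flash + flash_size - offset_from_end);
    if (memcmp(hdr->signature, "ASEV", 4) != 0 ||
        hdr->len < sizeof(*hdr) ||
        hdr->len > offset_from_end) {   /* sketch-specific bound */
        return NULL;
    }
    return hdr;
}
```

A caller would then copy `hdr->len` bytes into its own buffer, as the patch does with g_memdup2(), so the table outlives the flash mapping.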
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-19-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit f3c30c575d34122573b7370a7da5ca3a27dde481)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
hw/i386/pc_sysfw.c | 4 ++++
include/hw/i386/pc.h | 26 ++++++++++++++++++++++++++
target/i386/sev-sysemu-stub.c | 4 ++++
target/i386/sev.c | 32 ++++++++++++++++++++++++++++++++
target/i386/sev.h | 2 ++
5 files changed, 68 insertions(+)
diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
index ac88ad4eb9..9b8671c441 100644
--- a/hw/i386/pc_sysfw.c
+++ b/hw/i386/pc_sysfw.c
@@ -260,6 +260,10 @@ void x86_firmware_configure(void *ptr, int size)
pc_system_parse_ovmf_flash(ptr, size);
if (sev_enabled()) {
+
+ /* Copy the SEV metadata table (if it exists) */
+ pc_system_parse_sev_metadata(ptr, size);
+
ret = sev_es_save_reset_vector(ptr, size);
if (ret) {
error_report("failed to locate and/or save reset vector");
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 3f53ec73ac..94b49310f5 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -167,6 +167,32 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int level);
#define PCI_HOST_ABOVE_4G_MEM_SIZE "above-4g-mem-size"
#define PCI_HOST_PROP_SMM_RANGES "smm-ranges"
+typedef enum {
+ SEV_DESC_TYPE_UNDEF,
+ /* The section contains the region that must be validated by the VMM. */
+ SEV_DESC_TYPE_SNP_SEC_MEM,
+ /* The section contains the SNP secrets page */
+ SEV_DESC_TYPE_SNP_SECRETS,
+ /* The section contains address that can be used as a CPUID page */
+ SEV_DESC_TYPE_CPUID,
+
+} ovmf_sev_metadata_desc_type;
+
+typedef struct __attribute__((__packed__)) OvmfSevMetadataDesc {
+ uint32_t base;
+ uint32_t len;
+ ovmf_sev_metadata_desc_type type;
+} OvmfSevMetadataDesc;
+
+typedef struct __attribute__((__packed__)) OvmfSevMetadata {
+ uint8_t signature[4];
+ uint32_t len;
+ uint32_t version;
+ uint32_t num_desc;
+ OvmfSevMetadataDesc descs[];
+} OvmfSevMetadata;
+
+OvmfSevMetadata *pc_system_get_ovmf_sev_metadata_ptr(void);
void pc_pci_as_mapping_init(MemoryRegion *system_memory,
MemoryRegion *pci_address_space);
diff --git a/target/i386/sev-sysemu-stub.c b/target/i386/sev-sysemu-stub.c
index 96e1c15cc3..fc1c57c411 100644
--- a/target/i386/sev-sysemu-stub.c
+++ b/target/i386/sev-sysemu-stub.c
@@ -67,3 +67,7 @@ void hmp_info_sev(Monitor *mon, const QDict *qdict)
{
monitor_printf(mon, "SEV is not available in this QEMU\n");
}
+
+void pc_system_parse_sev_metadata(uint8_t *flash_ptr, size_t flash_size)
+{
+}
diff --git a/target/i386/sev.c b/target/i386/sev.c
index e84e4395a5..17281bb2c7 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -597,6 +597,38 @@ SevCapability *qmp_query_sev_capabilities(Error **errp)
return sev_get_capabilities(errp);
}
+static OvmfSevMetadata *ovmf_sev_metadata_table;
+
+#define OVMF_SEV_META_DATA_GUID "dc886566-984a-4798-A75e-5585a7bf67cc"
+typedef struct __attribute__((__packed__)) OvmfSevMetadataOffset {
+ uint32_t offset;
+} OvmfSevMetadataOffset;
+
+OvmfSevMetadata *pc_system_get_ovmf_sev_metadata_ptr(void)
+{
+ return ovmf_sev_metadata_table;
+}
+
+void pc_system_parse_sev_metadata(uint8_t *flash_ptr, size_t flash_size)
+{
+ OvmfSevMetadata *metadata;
+ OvmfSevMetadataOffset *data;
+
+ if (!pc_system_ovmf_table_find(OVMF_SEV_META_DATA_GUID, (uint8_t **)&data,
+ NULL)) {
+ return;
+ }
+
+ metadata = (OvmfSevMetadata *)(flash_ptr + flash_size - data->offset);
+ if (memcmp(metadata->signature, "ASEV", 4) != 0 ||
+ metadata->len < sizeof(OvmfSevMetadata) ||
+ metadata->len > flash_size - data->offset) {
+ return;
+ }
+
+ ovmf_sev_metadata_table = g_memdup2(metadata, metadata->len);
+}
+
static SevAttestationReport *sev_get_attestation_report(const char *mnonce,
Error **errp)
{
diff --git a/target/i386/sev.h b/target/i386/sev.h
index 5dc4767b1e..cc12824dd6 100644
--- a/target/i386/sev.h
+++ b/target/i386/sev.h
@@ -66,4 +66,6 @@ int sev_inject_launch_secret(const char *hdr, const char *secret,
int sev_es_save_reset_vector(void *flash_ptr, uint64_t flash_size);
void sev_es_set_reset_vector(CPUState *cpu);
+void pc_system_parse_sev_metadata(uint8_t *flash_ptr, size_t flash_size);
+
#endif
--
2.39.3

@ -0,0 +1,165 @@
From 226cf6c3d3e2fd1a35422043dbe0b73d1216df83 Mon Sep 17 00:00:00 2001
From: Brijesh Singh <brijesh.singh@amd.com>
Date: Thu, 30 May 2024 06:16:36 -0500
Subject: [PATCH 073/100] hw/i386/sev: Add support to encrypt BIOS when SEV-SNP
is enabled
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [73/91] 844afd322c12c3e8992cf6ec692c94e70747bd0c (bonzini/rhel-qemu-kvm)
As with SEV, an SNP guest requires that the BIOS be part of the initial
encrypted/measured guest payload. Extend sev_encrypt_flash() to handle
the SNP case and plumb through the GPA of the BIOS location since this
is needed for SNP.
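The GPA passed down in this patch is derived from the fact that firmware is mapped so that it ends at the 4 GiB boundary. A minimal sketch of that arithmetic, with `flash_gpa()` and `FOUR_GIB` as hypothetical names for illustration:

```c
#include <assert.h>
#include <stdint.h>

#define FOUR_GIB 0x100000000ULL

/*
 * Flash devices are stacked downward from 4 GiB. total_size is the combined
 * size of all flash devices mapped so far, including the current one.
 */
static uint64_t flash_gpa(uint64_t total_size)
{
    assert(total_size > 0 && total_size <= FOUR_GIB);
    return FOUR_GIB - total_size;
}
```

For example, a single 256 KiB flash device would be mapped at 0xfffc0000 under this rule, and a second 128 KiB device below it at 0xfffa0000.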
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-25-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 77d1abd91e5352ad30ae2f83790f95fa6a3c0b6b)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
hw/i386/pc_sysfw.c | 12 +++++++-----
hw/i386/x86-common.c | 2 +-
include/hw/i386/x86.h | 2 +-
target/i386/sev-sysemu-stub.c | 2 +-
target/i386/sev.c | 5 +++--
target/i386/sev.h | 2 +-
6 files changed, 14 insertions(+), 11 deletions(-)
diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
index 9b8671c441..7cdbafc8d2 100644
--- a/hw/i386/pc_sysfw.c
+++ b/hw/i386/pc_sysfw.c
@@ -148,6 +148,8 @@ static void pc_system_flash_map(PCMachineState *pcms,
assert(PC_MACHINE_GET_CLASS(pcms)->pci_enabled);
for (i = 0; i < ARRAY_SIZE(pcms->flash); i++) {
+ hwaddr gpa;
+
system_flash = pcms->flash[i];
blk = pflash_cfi01_get_blk(system_flash);
if (!blk) {
@@ -177,11 +179,11 @@ static void pc_system_flash_map(PCMachineState *pcms,
}
total_size += size;
+ gpa = 0x100000000ULL - total_size; /* where the flash is mapped */
qdev_prop_set_uint32(DEVICE(system_flash), "num-blocks",
size / FLASH_SECTOR_SIZE);
sysbus_realize_and_unref(SYS_BUS_DEVICE(system_flash), &error_fatal);
- sysbus_mmio_map(SYS_BUS_DEVICE(system_flash), 0,
- 0x100000000ULL - total_size);
+ sysbus_mmio_map(SYS_BUS_DEVICE(system_flash), 0, gpa);
if (i == 0) {
flash_mem = pflash_cfi01_get_memory(system_flash);
@@ -196,7 +198,7 @@ static void pc_system_flash_map(PCMachineState *pcms,
if (sev_enabled()) {
flash_ptr = memory_region_get_ram_ptr(flash_mem);
flash_size = memory_region_size(flash_mem);
- x86_firmware_configure(flash_ptr, flash_size);
+ x86_firmware_configure(gpa, flash_ptr, flash_size);
}
}
}
@@ -249,7 +251,7 @@ void pc_system_firmware_init(PCMachineState *pcms,
pc_system_flash_cleanup_unused(pcms);
}
-void x86_firmware_configure(void *ptr, int size)
+void x86_firmware_configure(hwaddr gpa, void *ptr, int size)
{
int ret;
@@ -270,6 +272,6 @@ void x86_firmware_configure(void *ptr, int size)
exit(1);
}
- sev_encrypt_flash(ptr, size, &error_fatal);
+ sev_encrypt_flash(gpa, ptr, size, &error_fatal);
}
}
diff --git a/hw/i386/x86-common.c b/hw/i386/x86-common.c
index 67b03c913a..35fe6eabea 100644
--- a/hw/i386/x86-common.c
+++ b/hw/i386/x86-common.c
@@ -981,7 +981,7 @@ void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
*/
void *ptr = memory_region_get_ram_ptr(&x86ms->bios);
load_image_size(filename, ptr, bios_size);
- x86_firmware_configure(ptr, bios_size);
+ x86_firmware_configure(0x100000000ULL - bios_size, ptr, bios_size);
} else {
memory_region_set_readonly(&x86ms->bios, !isapc_ram_fw);
ret = rom_add_file_fixed(bios_name, (uint32_t)(-bios_size), -1);
diff --git a/include/hw/i386/x86.h b/include/hw/i386/x86.h
index b006f16b8d..d43cb3908e 100644
--- a/include/hw/i386/x86.h
+++ b/include/hw/i386/x86.h
@@ -154,6 +154,6 @@ void ioapic_init_gsi(GSIState *gsi_state, Object *parent);
DeviceState *ioapic_init_secondary(GSIState *gsi_state);
/* pc_sysfw.c */
-void x86_firmware_configure(void *ptr, int size);
+void x86_firmware_configure(hwaddr gpa, void *ptr, int size);
#endif
diff --git a/target/i386/sev-sysemu-stub.c b/target/i386/sev-sysemu-stub.c
index fc1c57c411..d5bf886e79 100644
--- a/target/i386/sev-sysemu-stub.c
+++ b/target/i386/sev-sysemu-stub.c
@@ -42,7 +42,7 @@ void qmp_sev_inject_launch_secret(const char *packet_header, const char *secret,
error_setg(errp, "SEV is not available in this QEMU");
}
-int sev_encrypt_flash(uint8_t *ptr, uint64_t len, Error **errp)
+int sev_encrypt_flash(hwaddr gpa, uint8_t *ptr, uint64_t len, Error **errp)
{
g_assert_not_reached();
}
diff --git a/target/i386/sev.c b/target/i386/sev.c
index 06401f0526..7b5c4b4874 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -1484,7 +1484,7 @@ static int sev_snp_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
}
int
-sev_encrypt_flash(uint8_t *ptr, uint64_t len, Error **errp)
+sev_encrypt_flash(hwaddr gpa, uint8_t *ptr, uint64_t len, Error **errp)
{
SevCommonState *sev_common = SEV_COMMON(MACHINE(qdev_get_machine())->cgs);
@@ -1841,7 +1841,8 @@ bool sev_add_kernel_loader_hashes(SevKernelLoaderContext *ctx, Error **errp)
/* zero the excess data so the measurement can be reliably calculated */
memset(padded_ht->padding, 0, sizeof(padded_ht->padding));
- if (sev_encrypt_flash((uint8_t *)padded_ht, sizeof(*padded_ht), errp) < 0) {
+ if (sev_encrypt_flash(area->base, (uint8_t *)padded_ht,
+ sizeof(*padded_ht), errp) < 0) {
ret = false;
}
diff --git a/target/i386/sev.h b/target/i386/sev.h
index cc12824dd6..858005a119 100644
--- a/target/i386/sev.h
+++ b/target/i386/sev.h
@@ -59,7 +59,7 @@ uint32_t sev_get_cbit_position(void);
uint32_t sev_get_reduced_phys_bits(void);
bool sev_add_kernel_loader_hashes(SevKernelLoaderContext *ctx, Error **errp);
-int sev_encrypt_flash(uint8_t *ptr, uint64_t len, Error **errp);
+int sev_encrypt_flash(hwaddr gpa, uint8_t *ptr, uint64_t len, Error **errp);
int sev_inject_launch_secret(const char *hdr, const char *secret,
uint64_t gpa, Error **errp);
--
2.39.3

@ -0,0 +1,123 @@
From a20b2e3e52b9589ac1abc8b9b818d526c86368cf Mon Sep 17 00:00:00 2001
From: Michael Roth <michael.roth@amd.com>
Date: Thu, 30 May 2024 06:16:39 -0500
Subject: [PATCH 082/100] hw/i386/sev: Use guest_memfd for legacy ROMs
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [82/91] a591e85e00c353009803b143c80852b8c9b1f15e (bonzini/rhel-qemu-kvm)
Current SNP guest kernels will attempt to access these regions with
the C-bit set, so guest_memfd is needed to handle that. Otherwise,
kvm_convert_memory() will fail when the guest kernel tries to access it
and QEMU attempts to call KVM_SET_MEMORY_ATTRIBUTES to set these ranges
to private.
Whether guests should actually try to access ROM regions in this way (or
need to deal with legacy ROM regions at all) is a separate issue to be
addressed on the kernel side, but current SNP guest kernels will exhibit
this behavior and so this handling is needed to allow QEMU to continue
running existing SNP guest kernels.
Signed-off-by: Michael Roth <michael.roth@amd.com>
[pankaj: Added sev_snp_enabled() check]
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-28-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 413a67450750e0459efeffc3db3ba9759c3e381c)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
hw/i386/pc.c | 14 ++++++++++----
hw/i386/pc_sysfw.c | 19 +++++++++++++------
2 files changed, 23 insertions(+), 10 deletions(-)
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 0aca0cc79e..b25d075b59 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -62,6 +62,7 @@
#include "hw/mem/memory-device.h"
#include "e820_memory_layout.h"
#include "trace.h"
+#include "sev.h"
#include CONFIG_DEVICES
#ifdef CONFIG_XEN_EMU
@@ -1173,10 +1174,15 @@ void pc_memory_init(PCMachineState *pcms,
pc_system_firmware_init(pcms, rom_memory);
option_rom_mr = g_malloc(sizeof(*option_rom_mr));
- memory_region_init_ram(option_rom_mr, NULL, "pc.rom", PC_ROM_SIZE,
- &error_fatal);
- if (pcmc->pci_enabled) {
- memory_region_set_readonly(option_rom_mr, true);
+ if (machine_require_guest_memfd(machine)) {
+ memory_region_init_ram_guest_memfd(option_rom_mr, NULL, "pc.rom",
+ PC_ROM_SIZE, &error_fatal);
+ } else {
+ memory_region_init_ram(option_rom_mr, NULL, "pc.rom", PC_ROM_SIZE,
+ &error_fatal);
+ if (pcmc->pci_enabled) {
+ memory_region_set_readonly(option_rom_mr, true);
+ }
}
memory_region_add_subregion_overlap(rom_memory,
PC_ROM_MIN_VGA,
diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
index 7cdbafc8d2..ef80281d28 100644
--- a/hw/i386/pc_sysfw.c
+++ b/hw/i386/pc_sysfw.c
@@ -40,8 +40,8 @@
#define FLASH_SECTOR_SIZE 4096
-static void pc_isa_bios_init(MemoryRegion *isa_bios, MemoryRegion *rom_memory,
- MemoryRegion *flash_mem)
+static void pc_isa_bios_init(PCMachineState *pcms, MemoryRegion *isa_bios,
+ MemoryRegion *rom_memory, MemoryRegion *flash_mem)
{
int isa_bios_size;
uint64_t flash_size;
@@ -51,8 +51,13 @@ static void pc_isa_bios_init(MemoryRegion *isa_bios, MemoryRegion *rom_memory,
/* map the last 128KB of the BIOS in ISA space */
isa_bios_size = MIN(flash_size, 128 * KiB);
- memory_region_init_ram(isa_bios, NULL, "isa-bios", isa_bios_size,
- &error_fatal);
+ if (machine_require_guest_memfd(MACHINE(pcms))) {
+ memory_region_init_ram_guest_memfd(isa_bios, NULL, "isa-bios",
+ isa_bios_size, &error_fatal);
+ } else {
+ memory_region_init_ram(isa_bios, NULL, "isa-bios", isa_bios_size,
+ &error_fatal);
+ }
memory_region_add_subregion_overlap(rom_memory,
0x100000 - isa_bios_size,
isa_bios,
@@ -65,7 +70,9 @@ static void pc_isa_bios_init(MemoryRegion *isa_bios, MemoryRegion *rom_memory,
((uint8_t*)flash_ptr) + (flash_size - isa_bios_size),
isa_bios_size);
- memory_region_set_readonly(isa_bios, true);
+ if (!machine_require_guest_memfd(current_machine)) {
+ memory_region_set_readonly(isa_bios, true);
+ }
}
static PFlashCFI01 *pc_pflash_create(PCMachineState *pcms,
@@ -191,7 +198,7 @@ static void pc_system_flash_map(PCMachineState *pcms,
x86_isa_bios_init(&x86ms->isa_bios, rom_memory, flash_mem,
true);
} else {
- pc_isa_bios_init(&x86ms->isa_bios, rom_memory, flash_mem);
+ pc_isa_bios_init(pcms, &x86ms->isa_bios, rom_memory, flash_mem);
}
/* Encrypt the pflash boot ROM */
--
2.39.3

@ -0,0 +1,58 @@
From 4331180aa09e44550ff8de781c618bae5e99bb70 Mon Sep 17 00:00:00 2001
From: Michael Roth <michael.roth@amd.com>
Date: Tue, 9 Apr 2024 18:07:43 -0500
Subject: [PATCH 025/100] hw/i386/sev: Use legacy SEV VM types for older
machine types
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [25/91] 8c73cd312736ccb0818b4d3216fd13712f21f3c9 (bonzini/rhel-qemu-kvm)
Newer 9.1 machine types will default to using the KVM_SEV_INIT2 API for
creating SEV/SEV-ES guests going forward. However, this API results in guest
measurement changes which are generally not expected for users of these
older guest types and can cause disruption if they switch to a newer
QEMU/kernel version. Avoid this by continuing to use the older
KVM_SEV_INIT/KVM_SEV_ES_INIT APIs for older machine types.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-ID: <20240409230743.962513-4-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit ea7fbd37537b3a598335c21ccb2ea674630fc810)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
hw/i386/pc.c | 1 +
target/i386/sev.c | 1 +
2 files changed, 2 insertions(+)
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index b9fde3cec1..1a34bc4522 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -351,6 +351,7 @@ const size_t pc_rhel_compat_len = G_N_ELEMENTS(pc_rhel_compat);
GlobalProperty pc_rhel_9_5_compat[] = {
/* pc_rhel_9_5_compat from pc_compat_pc_9_0 (backported from 9.1) */
{ TYPE_X86_CPU, "guest-phys-bits", "0" },
+ { "sev-guest", "legacy-vm-type", "true" },
};
const size_t pc_rhel_9_5_compat_len = G_N_ELEMENTS(pc_rhel_9_5_compat);
diff --git a/target/i386/sev.c b/target/i386/sev.c
index f4ee317cb0..d30b68c11e 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -1417,6 +1417,7 @@ sev_guest_instance_init(Object *obj)
object_property_add_uint32_ptr(obj, "reduced-phys-bits",
&sev->reduced_phys_bits,
OBJ_PROP_FLAG_READWRITE);
+ object_apply_compat_props(obj);
}
/* sev guest info */
--
2.39.3

@ -0,0 +1,133 @@
From ebf08d2a822576acfa60fbd5f552d26de1e4c4be Mon Sep 17 00:00:00 2001
From: Bernhard Beschow <shentey@gmail.com>
Date: Wed, 8 May 2024 19:55:04 +0200
Subject: [PATCH 040/100] hw/i386/x86: Don't leak "isa-bios" memory regions
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [40/91] bb595357c6cc2d5a80bf3873853c69553c5feee5 (bonzini/rhel-qemu-kvm)
Fix the leaks in x86_bios_rom_init() and pc_isa_bios_init() by adding an
"isa_bios" attribute to X86MachineState.
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-ID: <20240508175507.22270-4-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 32d3ee87a17fc91e981a23dba94855bff89f5920)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
hw/i386/pc_sysfw.c | 7 +++----
hw/i386/x86.c | 9 ++++-----
include/hw/i386/x86.h | 7 +++++++
3 files changed, 14 insertions(+), 9 deletions(-)
diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
index 59c7a81692..82d37cb376 100644
--- a/hw/i386/pc_sysfw.c
+++ b/hw/i386/pc_sysfw.c
@@ -40,11 +40,10 @@
#define FLASH_SECTOR_SIZE 4096
-static void pc_isa_bios_init(MemoryRegion *rom_memory,
+static void pc_isa_bios_init(MemoryRegion *isa_bios, MemoryRegion *rom_memory,
MemoryRegion *flash_mem)
{
int isa_bios_size;
- MemoryRegion *isa_bios;
uint64_t flash_size;
void *flash_ptr, *isa_bios_ptr;
@@ -52,7 +51,6 @@ static void pc_isa_bios_init(MemoryRegion *rom_memory,
/* map the last 128KB of the BIOS in ISA space */
isa_bios_size = MIN(flash_size, 128 * KiB);
- isa_bios = g_malloc(sizeof(*isa_bios));
memory_region_init_ram(isa_bios, NULL, "isa-bios", isa_bios_size,
&error_fatal);
memory_region_add_subregion_overlap(rom_memory,
@@ -136,6 +134,7 @@ void pc_system_flash_cleanup_unused(PCMachineState *pcms)
static void pc_system_flash_map(PCMachineState *pcms,
MemoryRegion *rom_memory)
{
+ X86MachineState *x86ms = X86_MACHINE(pcms);
hwaddr total_size = 0;
int i;
BlockBackend *blk;
@@ -185,7 +184,7 @@ static void pc_system_flash_map(PCMachineState *pcms,
if (i == 0) {
flash_mem = pflash_cfi01_get_memory(system_flash);
- pc_isa_bios_init(rom_memory, flash_mem);
+ pc_isa_bios_init(&x86ms->isa_bios, rom_memory, flash_mem);
/* Encrypt the pflash boot ROM */
if (sev_enabled()) {
diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 6d3c72f124..457e8a34a5 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -1133,7 +1133,7 @@ void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
{
const char *bios_name;
char *filename;
- MemoryRegion *bios, *isa_bios;
+ MemoryRegion *bios;
int bios_size, isa_bios_size;
ssize_t ret;
@@ -1173,14 +1173,13 @@ void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
/* map the last 128KB of the BIOS in ISA space */
isa_bios_size = MIN(bios_size, 128 * KiB);
- isa_bios = g_malloc(sizeof(*isa_bios));
- memory_region_init_alias(isa_bios, NULL, "isa-bios", bios,
+ memory_region_init_alias(&x86ms->isa_bios, NULL, "isa-bios", bios,
bios_size - isa_bios_size, isa_bios_size);
memory_region_add_subregion_overlap(rom_memory,
0x100000 - isa_bios_size,
- isa_bios,
+ &x86ms->isa_bios,
1);
- memory_region_set_readonly(isa_bios, !isapc_ram_fw);
+ memory_region_set_readonly(&x86ms->isa_bios, !isapc_ram_fw);
/* map all the bios at the top of memory */
memory_region_add_subregion(rom_memory,
diff --git a/include/hw/i386/x86.h b/include/hw/i386/x86.h
index cb07618d19..a07de79167 100644
--- a/include/hw/i386/x86.h
+++ b/include/hw/i386/x86.h
@@ -18,6 +18,7 @@
#define HW_I386_X86_H
#include "exec/hwaddr.h"
+#include "exec/memory.h"
#include "hw/boards.h"
#include "hw/intc/ioapic.h"
@@ -52,6 +53,12 @@ struct X86MachineState {
GMappedFile *initrd_mapped_file;
HotplugHandler *acpi_dev;
+ /*
+ * Map the upper 128 KiB of the BIOS just underneath the 1 MiB address
+ * boundary.
+ */
+ MemoryRegion isa_bios;
+
/* RAM information (sizes, addresses, configuration): */
ram_addr_t below_4g_mem_size, above_4g_mem_size;
--
2.39.3

@ -0,0 +1,105 @@
From e1f2265b5f6bf5b63bf3808bb540888f3cf8badb Mon Sep 17 00:00:00 2001
From: Bernhard Beschow <shentey@gmail.com>
Date: Wed, 8 May 2024 19:55:05 +0200
Subject: [PATCH 041/100] hw/i386/x86: Don't leak "pc.bios" memory region
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [41/91] a9cd61d8d240134c09c46e244efb89217cadf60c (bonzini/rhel-qemu-kvm)
Fix the leak in x86_bios_rom_init() by adding a "bios" attribute to
X86MachineState. Note that it is only used in the -bios case.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-ID: <20240508175507.22270-5-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 865d95321ffc8d9941e33000b10140550f094556)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
hw/i386/x86.c | 13 ++++++-------
include/hw/i386/x86.h | 6 ++++++
2 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 457e8a34a5..29167de97d 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -1133,7 +1133,6 @@ void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
{
const char *bios_name;
char *filename;
- MemoryRegion *bios;
int bios_size, isa_bios_size;
ssize_t ret;
@@ -1149,8 +1148,8 @@ void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
(bios_size % 65536) != 0) {
goto bios_error;
}
- bios = g_malloc(sizeof(*bios));
- memory_region_init_ram(bios, NULL, "pc.bios", bios_size, &error_fatal);
+ memory_region_init_ram(&x86ms->bios, NULL, "pc.bios", bios_size,
+ &error_fatal);
if (sev_enabled()) {
/*
* The concept of a "reset" simply doesn't exist for
@@ -1159,11 +1158,11 @@ void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
* the firmware as rom to properly re-initialize on reset.
* Just go for a straight file load instead.
*/
- void *ptr = memory_region_get_ram_ptr(bios);
+ void *ptr = memory_region_get_ram_ptr(&x86ms->bios);
load_image_size(filename, ptr, bios_size);
x86_firmware_configure(ptr, bios_size);
} else {
- memory_region_set_readonly(bios, !isapc_ram_fw);
+ memory_region_set_readonly(&x86ms->bios, !isapc_ram_fw);
ret = rom_add_file_fixed(bios_name, (uint32_t)(-bios_size), -1);
if (ret != 0) {
goto bios_error;
@@ -1173,7 +1172,7 @@ void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
/* map the last 128KB of the BIOS in ISA space */
isa_bios_size = MIN(bios_size, 128 * KiB);
- memory_region_init_alias(&x86ms->isa_bios, NULL, "isa-bios", bios,
+ memory_region_init_alias(&x86ms->isa_bios, NULL, "isa-bios", &x86ms->bios,
bios_size - isa_bios_size, isa_bios_size);
memory_region_add_subregion_overlap(rom_memory,
0x100000 - isa_bios_size,
@@ -1184,7 +1183,7 @@ void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
/* map all the bios at the top of memory */
memory_region_add_subregion(rom_memory,
(uint32_t)(-bios_size),
- bios);
+ &x86ms->bios);
return;
bios_error:
diff --git a/include/hw/i386/x86.h b/include/hw/i386/x86.h
index a07de79167..55c6809ae0 100644
--- a/include/hw/i386/x86.h
+++ b/include/hw/i386/x86.h
@@ -53,6 +53,12 @@ struct X86MachineState {
GMappedFile *initrd_mapped_file;
HotplugHandler *acpi_dev;
+ /*
+ * Map the whole BIOS just underneath the 4 GiB address boundary. Only used
+ * in the ROM (-bios) case.
+ */
+ MemoryRegion bios;
+
/*
* Map the upper 128 KiB of the BIOS just underneath the 1 MiB address
* boundary.
--
2.39.3
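
The patch above replaces a heap allocation that nothing ever frees with a field embedded in the machine state. A minimal sketch of the same ownership idea in plain C follows. `Region`, `Machine`, `region_init()` and `machine_bios_init()` are hypothetical stand-ins invented for illustration; they are not QEMU's MemoryRegion, X86MachineState or memory API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a memory region; not QEMU's MemoryRegion. */
typedef struct Region {
    char name[16];
    size_t size;
} Region;

/* Hypothetical stand-in for the machine state that owns the regions. */
typedef struct Machine {
    Region bios;      /* embedded: lives and dies with the Machine */
    Region isa_bios;  /* embedded alias region */
} Machine;

/* Initialise a caller-provided region in place; nothing is allocated here. */
static void region_init(Region *r, const char *name, size_t size)
{
    assert(r != NULL && name != NULL);
    strncpy(r->name, name, sizeof(r->name) - 1);
    r->name[sizeof(r->name) - 1] = '\0';
    r->size = size;
}

/*
 * Mirrors the shape of the patched code: pass &m->bios instead of a
 * freshly malloc'ed pointer that would have no owner.
 */
static void machine_bios_init(Machine *m, size_t bios_size)
{
    region_init(&m->bios, "pc.bios", bios_size);
}
```

Because the region is part of `Machine`, there is no separate pointer to lose track of, which is presumably why the patch can simply drop the `g_malloc()` call.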

@ -0,0 +1,69 @@
From b9d0c78f04160fbc1eee6cfd94b17f1133a35d83 Mon Sep 17 00:00:00 2001
From: Bernhard Beschow <shentey@gmail.com>
Date: Tue, 30 Apr 2024 17:06:38 +0200
Subject: [PATCH 037/100] hw/i386/x86: Eliminate two if statements in
x86_bios_rom_init()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [37/91] 1ef6a13214e85f6ef773f5c894c720f20330912b (bonzini/rhel-qemu-kvm)
Given that memory_region_set_readonly() is a no-op when the readonlyness is
already as requested, it is possible to simplify the pattern
    if (condition) {
        foo(true);
    }
to
    foo(condition);
which is shorter and makes the invariant of the code easier to see.
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240430150643.111976-2-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 014dbdac8798799d081abc9dff3e4876ca54f49e)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
hw/i386/x86.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 3d5b51e92d..2a4f3ee285 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -1163,9 +1163,7 @@ void x86_bios_rom_init(MachineState *ms, const char *default_firmware,
load_image_size(filename, ptr, bios_size);
x86_firmware_configure(ptr, bios_size);
} else {
- if (!isapc_ram_fw) {
- memory_region_set_readonly(bios, true);
- }
+ memory_region_set_readonly(bios, !isapc_ram_fw);
ret = rom_add_file_fixed(bios_name, (uint32_t)(-bios_size), -1);
if (ret != 0) {
goto bios_error;
@@ -1182,9 +1180,7 @@ void x86_bios_rom_init(MachineState *ms, const char *default_firmware,
0x100000 - isa_bios_size,
isa_bios,
1);
- if (!isapc_ram_fw) {
- memory_region_set_readonly(isa_bios, true);
- }
+ memory_region_set_readonly(isa_bios, !isapc_ram_fw);
/* map all the bios at the top of memory */
memory_region_add_subregion(rom_memory,
--
2.39.3

@ -0,0 +1,98 @@
From 1baf67564d4227d6ba98923217a15814c438c32b Mon Sep 17 00:00:00 2001
From: Bernhard Beschow <shentey@gmail.com>
Date: Wed, 8 May 2024 19:55:06 +0200
Subject: [PATCH 042/100] hw/i386/x86: Extract x86_isa_bios_init() from
x86_bios_rom_init()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [42/91] 1db417a5995480924f7fd0661a306f2d2bfa0a77 (bonzini/rhel-qemu-kvm)
The function is inspired by pc_isa_bios_init() and should eventually replace it.
Using x86_isa_bios_init() rather than pc_isa_bios_init() fixes pflash commands
so that they work in the isa-bios region.
While at it, replace the magic number 0x100000 with 1 * MiB to improve readability.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Message-ID: <20240508175507.22270-6-shentey@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 5c5ffec12c30d2017cbdee6798f54d8fad3f9656)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
hw/i386/x86.c | 25 ++++++++++++++++---------
include/hw/i386/x86.h | 2 ++
2 files changed, 18 insertions(+), 9 deletions(-)
diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 29167de97d..c61f4ebfa6 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -1128,12 +1128,25 @@ void x86_load_linux(X86MachineState *x86ms,
nb_option_roms++;
}
+void x86_isa_bios_init(MemoryRegion *isa_bios, MemoryRegion *isa_memory,
+ MemoryRegion *bios, bool read_only)
+{
+ uint64_t bios_size = memory_region_size(bios);
+ uint64_t isa_bios_size = MIN(bios_size, 128 * KiB);
+
+ memory_region_init_alias(isa_bios, NULL, "isa-bios", bios,
+ bios_size - isa_bios_size, isa_bios_size);
+ memory_region_add_subregion_overlap(isa_memory, 1 * MiB - isa_bios_size,
+ isa_bios, 1);
+ memory_region_set_readonly(isa_bios, read_only);
+}
+
void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
MemoryRegion *rom_memory, bool isapc_ram_fw)
{
const char *bios_name;
char *filename;
- int bios_size, isa_bios_size;
+ int bios_size;
ssize_t ret;
/* BIOS load */
@@ -1171,14 +1184,8 @@ void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
g_free(filename);
/* map the last 128KB of the BIOS in ISA space */
- isa_bios_size = MIN(bios_size, 128 * KiB);
- memory_region_init_alias(&x86ms->isa_bios, NULL, "isa-bios", &x86ms->bios,
- bios_size - isa_bios_size, isa_bios_size);
- memory_region_add_subregion_overlap(rom_memory,
- 0x100000 - isa_bios_size,
- &x86ms->isa_bios,
- 1);
- memory_region_set_readonly(&x86ms->isa_bios, !isapc_ram_fw);
+ x86_isa_bios_init(&x86ms->isa_bios, rom_memory, &x86ms->bios,
+ !isapc_ram_fw);
/* map all the bios at the top of memory */
memory_region_add_subregion(rom_memory,
diff --git a/include/hw/i386/x86.h b/include/hw/i386/x86.h
index 55c6809ae0..d7b7d3f3ce 100644
--- a/include/hw/i386/x86.h
+++ b/include/hw/i386/x86.h
@@ -129,6 +129,8 @@ void x86_cpu_unplug_request_cb(HotplugHandler *hotplug_dev,
void x86_cpu_unplug_cb(HotplugHandler *hotplug_dev,
DeviceState *dev, Error **errp);
+void x86_isa_bios_init(MemoryRegion *isa_bios, MemoryRegion *isa_memory,
+ MemoryRegion *bios, bool read_only);
void x86_bios_rom_init(X86MachineState *x86ms, const char *default_firmware,
MemoryRegion *rom_memory, bool isapc_ram_fw);
--
2.39.3
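
The address arithmetic in x86_isa_bios_init() and in the "top of memory" mapping can be checked in isolation. The sketch below recomputes the same quantities with plain integers; `IsaBiosLayout` and `isa_bios_layout()` are hypothetical names introduced only for this illustration, and the `KiB`/`MiB` macros are local definitions rather than QEMU's.

```c
#include <assert.h>
#include <stdint.h>

#define KiB 1024ULL
#define MiB (1024ULL * KiB)

/* Hypothetical result type for this sketch; not a QEMU structure. */
typedef struct IsaBiosLayout {
    uint64_t size;         /* bytes of firmware visible below 1 MiB */
    uint64_t alias_offset; /* offset of the window inside the BIOS image */
    uint64_t isa_base;     /* guest address of the ISA window */
    uint32_t top_base;     /* guest address of the full image below 4 GiB */
} IsaBiosLayout;

static IsaBiosLayout isa_bios_layout(uint64_t bios_size)
{
    IsaBiosLayout l;

    assert(bios_size > 0 && bios_size <= UINT32_MAX);
    /* At most the last 128 KiB of the image is aliased into ISA space. */
    l.size = bios_size < 128 * KiB ? bios_size : 128 * KiB;
    l.alias_offset = bios_size - l.size;
    l.isa_base = 1 * MiB - l.size;
    /* Same idea as (uint32_t)(-bios_size): wraps to 4 GiB - bios_size. */
    l.top_base = (uint32_t)(0 - bios_size);
    return l;
}
```

For a 256 KiB image this gives a 128 KiB window at 0xE0000 that aliases the image from offset 0x20000, with the full image at 0xFFFC0000; for a 64 KiB image the whole image is aliased at 0xF0000.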

@ -1,44 +0,0 @@
From 491cf9e251026d135f315b7fe0d8771841f06e9f Mon Sep 17 00:00:00 2001
From: Leonardo Bras <leobras@redhat.com>
Date: Tue, 25 Jul 2023 15:34:45 -0300
Subject: [PATCH 8/9] hw/pci: Disable PCI_ERR_UNCOR_MASK reg for machine type
<= pc-q35-rhel9.2.0
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Leonardo Brás <leobras@redhat.com>
RH-MergeRequest: 192: hw/pci: Disable PCI_ERR_UNCOR_MASK reg for machine type <= pc-q35-rhel9.2.0
RH-Bugzilla: 2223691
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Acked-by: quintela1 <quintela@redhat.com>
RH-Commit: [1/1] e57816f8ad15a9ce5f342b061c103ae011ec1223 (LeoBras/centos-qemu-kvm)
This is a downstream-only patch that turns off the property
x-pcie-err-unc-mask for machine types <= pc-q35-rhel9.2.0, allowing
live migrations to RHEL 9.2 to succeed.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2223691
Fixes: 293a34b4be ("hw/pci: Disable PCI_ERR_UNCOR_MASK register for machine
type < 8.0")
Signed-off-by: Leonardo Bras <leobras@redhat.com>
---
hw/core/machine.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 5ea52317b9..6f5117669d 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -62,6 +62,8 @@ GlobalProperty hw_compat_rhel_9_2[] = {
{ "virtio-mem", "x-early-migration", "false" },
/* hw_compat_rhel_9_2 from hw_compat_7_2 */
{ "migration", "x-preempt-pre-7-2", "true" },
+ /* hw_compat_rhel_9_2 from hw_compat_7_2 */
+ { TYPE_PCI_DEVICE, "x-pcie-err-unc-mask", "off" },
};
const size_t hw_compat_rhel_9_2_len = G_N_ELEMENTS(hw_compat_rhel_9_2);
--
2.39.3

@ -1,118 +0,0 @@
From 3ac01bb90da12538898f95b2fb4e7f6bc1557eb3 Mon Sep 17 00:00:00 2001
From: Leonardo Bras <leobras@redhat.com>
Date: Tue, 2 May 2023 21:27:02 -0300
Subject: [PATCH 18/21] hw/pci: Disable PCI_ERR_UNCOR_MASK register for machine
type < 8.0
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Leonardo Brás <leobras@redhat.com>
RH-MergeRequest: 170: hw/pci: Disable PCI_ERR_UNCOR_MASK register for machine type < 8.0
RH-Bugzilla: 2189423
RH-Acked-by: Peter Xu <peterx@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [1/1] ad62dd5a8567f386770577513c00a0bf36bd3df1 (LeoBras/centos-qemu-kvm)
Since its implementation in v8.0.0-rc0, having PCI_ERR_UNCOR_MASK
set for machine types < 8.0 causes migration to fail if the target
QEMU version is < 8.0.0:
qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x10a read: 40 device: 0 cmask: ff wmask: 0 w1cmask:0
qemu-system-x86_64: Failed to load PCIDevice:config
qemu-system-x86_64: Failed to load e1000e:parent_obj
qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:02.0/e1000e'
qemu-system-x86_64: load of migration failed: Invalid argument
The above test migrated a 7.2 machine type from QEMU master to QEMU 7.2.0,
with this cmdline:
./qemu-system-x86_64 -M pc-q35-7.2 [-incoming XXX]
In order to fix this, property x-pcie-err-unc-mask was introduced to
control when PCI_ERR_UNCOR_MASK is enabled. This property is enabled by
default, but is disabled if machine type <= 7.2.
Fixes: 010746ae1d ("hw/pci/aer: Implement PCI_ERR_UNCOR_MASK register")
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Leonardo Bras <leobras@redhat.com>
Message-Id: <20230503002701.854329-1-leobras@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1576
Tested-by: Fiona Ebner <f.ebner@proxmox.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 5ed3dabe57dd9f4c007404345e5f5bf0e347317f)
Signed-off-by: Leonardo Bras <leobras@redhat.com>
---
hw/core/machine.c | 1 +
hw/pci/pci.c | 2 ++
hw/pci/pcie_aer.c | 11 +++++++----
include/hw/pci/pci.h | 2 ++
4 files changed, 12 insertions(+), 4 deletions(-)
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 0e0120b7f2..c28702b690 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -43,6 +43,7 @@ GlobalProperty hw_compat_7_2[] = {
{ "e1000e", "migrate-timadj", "off" },
{ "virtio-mem", "x-early-migration", "false" },
{ "migration", "x-preempt-pre-7-2", "true" },
+ { TYPE_PCI_DEVICE, "x-pcie-err-unc-mask", "off" },
};
const size_t hw_compat_7_2_len = G_N_ELEMENTS(hw_compat_7_2);
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index def5000e7b..8ad4349e96 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -79,6 +79,8 @@ static Property pci_props[] = {
DEFINE_PROP_STRING("failover_pair_id", PCIDevice,
failover_pair_id),
DEFINE_PROP_UINT32("acpi-index", PCIDevice, acpi_index, 0),
+ DEFINE_PROP_BIT("x-pcie-err-unc-mask", PCIDevice, cap_present,
+ QEMU_PCIE_ERR_UNC_MASK_BITNR, true),
DEFINE_PROP_END_OF_LIST()
};
diff --git a/hw/pci/pcie_aer.c b/hw/pci/pcie_aer.c
index 103667c368..374d593ead 100644
--- a/hw/pci/pcie_aer.c
+++ b/hw/pci/pcie_aer.c
@@ -112,10 +112,13 @@ int pcie_aer_init(PCIDevice *dev, uint8_t cap_ver, uint16_t offset,
pci_set_long(dev->w1cmask + offset + PCI_ERR_UNCOR_STATUS,
PCI_ERR_UNC_SUPPORTED);
- pci_set_long(dev->config + offset + PCI_ERR_UNCOR_MASK,
- PCI_ERR_UNC_MASK_DEFAULT);
- pci_set_long(dev->wmask + offset + PCI_ERR_UNCOR_MASK,
- PCI_ERR_UNC_SUPPORTED);
+
+ if (dev->cap_present & QEMU_PCIE_ERR_UNC_MASK) {
+ pci_set_long(dev->config + offset + PCI_ERR_UNCOR_MASK,
+ PCI_ERR_UNC_MASK_DEFAULT);
+ pci_set_long(dev->wmask + offset + PCI_ERR_UNCOR_MASK,
+ PCI_ERR_UNC_SUPPORTED);
+ }
pci_set_long(dev->config + offset + PCI_ERR_UNCOR_SEVER,
PCI_ERR_UNC_SEVERITY_DEFAULT);
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index d5a40cd058..6dc6742fc4 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -207,6 +207,8 @@ enum {
QEMU_PCIE_EXTCAP_INIT = (1 << QEMU_PCIE_EXTCAP_INIT_BITNR),
#define QEMU_PCIE_CXL_BITNR 10
QEMU_PCIE_CAP_CXL = (1 << QEMU_PCIE_CXL_BITNR),
+#define QEMU_PCIE_ERR_UNC_MASK_BITNR 11
+ QEMU_PCIE_ERR_UNC_MASK = (1 << QEMU_PCIE_ERR_UNC_MASK_BITNR),
};
typedef struct PCIINTxRoute {
--
2.39.3
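
The mechanism in the patch above is a capability bit that defaults to on and that a compatibility table switches off for older machine types, so that the register is only programmed when the bit is present. A minimal sketch of that gating, assuming a hypothetical `Dev` model: only the bit number 11 is taken from the patch, and `UNC_MASK_DEFAULT_PLACEHOLDER` is a made-up value, not the real PCI_ERR_UNC_MASK_DEFAULT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit number taken from the patch; everything else is a simplified stand-in. */
#define ERR_UNC_MASK_BITNR 11
#define ERR_UNC_MASK       (1u << ERR_UNC_MASK_BITNR)

/* Placeholder default; the real PCI_ERR_UNC_MASK_DEFAULT is not reproduced. */
#define UNC_MASK_DEFAULT_PLACEHOLDER 0x1u

/* Hypothetical device model: just the capability bits and one register. */
typedef struct Dev {
    uint32_t cap_present;
    uint32_t uncor_mask_reg;   /* stands in for PCI_ERR_UNCOR_MASK */
} Dev;

/* Create a device; an old machine type would pass false here. */
static Dev dev_create(bool err_unc_mask_prop)
{
    Dev d = { 0, 0 };
    if (err_unc_mask_prop) {
        d.cap_present |= ERR_UNC_MASK;
    }
    return d;
}

/* Only program the register when the capability bit is present. */
static void aer_init(Dev *d)
{
    assert(d != NULL);
    if (d->cap_present & ERR_UNC_MASK) {
        d->uncor_mask_reg = UNC_MASK_DEFAULT_PLACEHOLDER;
    }
}
```

In this sketch a device created with the property off keeps the register at zero, which corresponds to the config-space contents an older QEMU expects on the migration target.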

@ -1,470 +0,0 @@
From d1b7a9b25c0df9016cd8e93d40837314b1a81d70 Mon Sep 17 00:00:00 2001
From: Jon Maloy <jmaloy@redhat.com>
Date: Tue, 9 May 2023 10:29:03 -0400
Subject: [PATCH 08/21] hw: replace most qemu_bh_new calls with
qemu_bh_new_guarded
RH-Author: Jon Maloy <jmaloy@redhat.com>
RH-MergeRequest: 165: memory: prevent dma-reentracy issues
RH-Jira: RHEL-516
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [4/13] bcbc67dd0023aee2b3a342665237daa83b183c7b (jmaloy/jmaloy-qemu-kvm-2)
Jira: https://issues.redhat.com/browse/RHEL-516
Upstream: Merged
CVE: CVE-2023-2680
commit f63192b0544af5d3e4d5edfd85ab520fcf671377
Author: Alexander Bulekov <alxndr@bu.edu>
Date: Thu Apr 27 17:10:09 2023 -0400
hw: replace most qemu_bh_new calls with qemu_bh_new_guarded
This protects devices from bh->mmio reentrancy issues.
Thanks: Thomas Huth <thuth@redhat.com> for diagnosing OS X test failure.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230427211013.2994127-5-alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
---
hw/9pfs/xen-9p-backend.c | 5 ++++-
hw/block/dataplane/virtio-blk.c | 3 ++-
hw/block/dataplane/xen-block.c | 5 +++--
hw/char/virtio-serial-bus.c | 3 ++-
hw/display/qxl.c | 9 ++++++---
hw/display/virtio-gpu.c | 6 ++++--
hw/ide/ahci.c | 3 ++-
hw/ide/ahci_internal.h | 1 +
hw/ide/core.c | 4 +++-
hw/misc/imx_rngc.c | 6 ++++--
hw/misc/macio/mac_dbdma.c | 2 +-
hw/net/virtio-net.c | 3 ++-
hw/nvme/ctrl.c | 6 ++++--
hw/scsi/mptsas.c | 3 ++-
hw/scsi/scsi-bus.c | 3 ++-
hw/scsi/vmw_pvscsi.c | 3 ++-
hw/usb/dev-uas.c | 3 ++-
hw/usb/hcd-dwc2.c | 3 ++-
hw/usb/hcd-ehci.c | 3 ++-
hw/usb/hcd-uhci.c | 2 +-
hw/usb/host-libusb.c | 6 ++++--
hw/usb/redirect.c | 6 ++++--
hw/usb/xen-usb.c | 3 ++-
hw/virtio/virtio-balloon.c | 5 +++--
hw/virtio/virtio-crypto.c | 3 ++-
25 files changed, 66 insertions(+), 33 deletions(-)
diff --git a/hw/9pfs/xen-9p-backend.c b/hw/9pfs/xen-9p-backend.c
index 74f3a05f88..0e266c552b 100644
--- a/hw/9pfs/xen-9p-backend.c
+++ b/hw/9pfs/xen-9p-backend.c
@@ -61,6 +61,7 @@ typedef struct Xen9pfsDev {
int num_rings;
Xen9pfsRing *rings;
+ MemReentrancyGuard mem_reentrancy_guard;
} Xen9pfsDev;
static void xen_9pfs_disconnect(struct XenLegacyDevice *xendev);
@@ -443,7 +444,9 @@ static int xen_9pfs_connect(struct XenLegacyDevice *xendev)
xen_9pdev->rings[i].ring.out = xen_9pdev->rings[i].data +
XEN_FLEX_RING_SIZE(ring_order);
- xen_9pdev->rings[i].bh = qemu_bh_new(xen_9pfs_bh, &xen_9pdev->rings[i]);
+ xen_9pdev->rings[i].bh = qemu_bh_new_guarded(xen_9pfs_bh,
+ &xen_9pdev->rings[i],
+ &xen_9pdev->mem_reentrancy_guard);
xen_9pdev->rings[i].out_cons = 0;
xen_9pdev->rings[i].out_size = 0;
xen_9pdev->rings[i].inprogress = false;
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index b28d81737e..a6202997ee 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -127,7 +127,8 @@ bool virtio_blk_data_plane_create(VirtIODevice *vdev, VirtIOBlkConf *conf,
} else {
s->ctx = qemu_get_aio_context();
}
- s->bh = aio_bh_new(s->ctx, notify_guest_bh, s);
+ s->bh = aio_bh_new_guarded(s->ctx, notify_guest_bh, s,
+ &DEVICE(vdev)->mem_reentrancy_guard);
s->batch_notify_vqs = bitmap_new(conf->num_queues);
*dataplane = s;
diff --git a/hw/block/dataplane/xen-block.c b/hw/block/dataplane/xen-block.c
index 734da42ea7..d8bc39d359 100644
--- a/hw/block/dataplane/xen-block.c
+++ b/hw/block/dataplane/xen-block.c
@@ -633,8 +633,9 @@ XenBlockDataPlane *xen_block_dataplane_create(XenDevice *xendev,
} else {
dataplane->ctx = qemu_get_aio_context();
}
- dataplane->bh = aio_bh_new(dataplane->ctx, xen_block_dataplane_bh,
- dataplane);
+ dataplane->bh = aio_bh_new_guarded(dataplane->ctx, xen_block_dataplane_bh,
+ dataplane,
+ &DEVICE(xendev)->mem_reentrancy_guard);
return dataplane;
}
diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index 7d4601cb5d..dd619f0731 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -985,7 +985,8 @@ static void virtser_port_device_realize(DeviceState *dev, Error **errp)
return;
}
- port->bh = qemu_bh_new(flush_queued_data_bh, port);
+ port->bh = qemu_bh_new_guarded(flush_queued_data_bh, port,
+ &dev->mem_reentrancy_guard);
port->elem = NULL;
}
diff --git a/hw/display/qxl.c b/hw/display/qxl.c
index 80ce1e9a93..f1c0eb7dfc 100644
--- a/hw/display/qxl.c
+++ b/hw/display/qxl.c
@@ -2201,11 +2201,14 @@ static void qxl_realize_common(PCIQXLDevice *qxl, Error **errp)
qemu_add_vm_change_state_handler(qxl_vm_change_state_handler, qxl);
- qxl->update_irq = qemu_bh_new(qxl_update_irq_bh, qxl);
+ qxl->update_irq = qemu_bh_new_guarded(qxl_update_irq_bh, qxl,
+ &DEVICE(qxl)->mem_reentrancy_guard);
qxl_reset_state(qxl);
- qxl->update_area_bh = qemu_bh_new(qxl_render_update_area_bh, qxl);
- qxl->ssd.cursor_bh = qemu_bh_new(qemu_spice_cursor_refresh_bh, &qxl->ssd);
+ qxl->update_area_bh = qemu_bh_new_guarded(qxl_render_update_area_bh, qxl,
+ &DEVICE(qxl)->mem_reentrancy_guard);
+ qxl->ssd.cursor_bh = qemu_bh_new_guarded(qemu_spice_cursor_refresh_bh, &qxl->ssd,
+ &DEVICE(qxl)->mem_reentrancy_guard);
}
static void qxl_realize_primary(PCIDevice *dev, Error **errp)
diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index 5e15c79b94..66ac9b6cc5 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -1339,8 +1339,10 @@ void virtio_gpu_device_realize(DeviceState *qdev, Error **errp)
g->ctrl_vq = virtio_get_queue(vdev, 0);
g->cursor_vq = virtio_get_queue(vdev, 1);
- g->ctrl_bh = qemu_bh_new(virtio_gpu_ctrl_bh, g);
- g->cursor_bh = qemu_bh_new(virtio_gpu_cursor_bh, g);
+ g->ctrl_bh = qemu_bh_new_guarded(virtio_gpu_ctrl_bh, g,
+ &qdev->mem_reentrancy_guard);
+ g->cursor_bh = qemu_bh_new_guarded(virtio_gpu_cursor_bh, g,
+ &qdev->mem_reentrancy_guard);
QTAILQ_INIT(&g->reslist);
QTAILQ_INIT(&g->cmdq);
QTAILQ_INIT(&g->fenceq);
diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index 55902e1df7..4e76d6b191 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -1509,7 +1509,8 @@ static void ahci_cmd_done(const IDEDMA *dma)
ahci_write_fis_d2h(ad);
if (ad->port_regs.cmd_issue && !ad->check_bh) {
- ad->check_bh = qemu_bh_new(ahci_check_cmd_bh, ad);
+ ad->check_bh = qemu_bh_new_guarded(ahci_check_cmd_bh, ad,
+ &ad->mem_reentrancy_guard);
qemu_bh_schedule(ad->check_bh);
}
}
diff --git a/hw/ide/ahci_internal.h b/hw/ide/ahci_internal.h
index 303fcd7235..2480455372 100644
--- a/hw/ide/ahci_internal.h
+++ b/hw/ide/ahci_internal.h
@@ -321,6 +321,7 @@ struct AHCIDevice {
bool init_d2h_sent;
AHCICmdHdr *cur_cmd;
NCQTransferState ncq_tfs[AHCI_MAX_CMDS];
+ MemReentrancyGuard mem_reentrancy_guard;
};
struct AHCIPCIState {
diff --git a/hw/ide/core.c b/hw/ide/core.c
index 45d14a25e9..de48ff9f86 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -513,6 +513,7 @@ BlockAIOCB *ide_issue_trim(
BlockCompletionFunc *cb, void *cb_opaque, void *opaque)
{
IDEState *s = opaque;
+ IDEDevice *dev = s->unit ? s->bus->slave : s->bus->master;
TrimAIOCB *iocb;
/* Paired with a decrement in ide_trim_bh_cb() */
@@ -520,7 +521,8 @@ BlockAIOCB *ide_issue_trim(
iocb = blk_aio_get(&trim_aiocb_info, s->blk, cb, cb_opaque);
iocb->s = s;
- iocb->bh = qemu_bh_new(ide_trim_bh_cb, iocb);
+ iocb->bh = qemu_bh_new_guarded(ide_trim_bh_cb, iocb,
+ &DEVICE(dev)->mem_reentrancy_guard);
iocb->ret = 0;
iocb->qiov = qiov;
iocb->i = -1;
diff --git a/hw/misc/imx_rngc.c b/hw/misc/imx_rngc.c
index 632c03779c..082c6980ad 100644
--- a/hw/misc/imx_rngc.c
+++ b/hw/misc/imx_rngc.c
@@ -228,8 +228,10 @@ static void imx_rngc_realize(DeviceState *dev, Error **errp)
sysbus_init_mmio(sbd, &s->iomem);
sysbus_init_irq(sbd, &s->irq);
- s->self_test_bh = qemu_bh_new(imx_rngc_self_test, s);
- s->seed_bh = qemu_bh_new(imx_rngc_seed, s);
+ s->self_test_bh = qemu_bh_new_guarded(imx_rngc_self_test, s,
+ &dev->mem_reentrancy_guard);
+ s->seed_bh = qemu_bh_new_guarded(imx_rngc_seed, s,
+ &dev->mem_reentrancy_guard);
}
static void imx_rngc_reset(DeviceState *dev)
diff --git a/hw/misc/macio/mac_dbdma.c b/hw/misc/macio/mac_dbdma.c
index 43bb1f56ba..80a789f32b 100644
--- a/hw/misc/macio/mac_dbdma.c
+++ b/hw/misc/macio/mac_dbdma.c
@@ -914,7 +914,7 @@ static void mac_dbdma_realize(DeviceState *dev, Error **errp)
{
DBDMAState *s = MAC_DBDMA(dev);
- s->bh = qemu_bh_new(DBDMA_run_bh, s);
+ s->bh = qemu_bh_new_guarded(DBDMA_run_bh, s, &dev->mem_reentrancy_guard);
}
static void mac_dbdma_class_init(ObjectClass *oc, void *data)
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 53e1c32643..447f669921 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -2917,7 +2917,8 @@ static void virtio_net_add_queue(VirtIONet *n, int index)
n->vqs[index].tx_vq =
virtio_add_queue(vdev, n->net_conf.tx_queue_size,
virtio_net_handle_tx_bh);
- n->vqs[index].tx_bh = qemu_bh_new(virtio_net_tx_bh, &n->vqs[index]);
+ n->vqs[index].tx_bh = qemu_bh_new_guarded(virtio_net_tx_bh, &n->vqs[index],
+ &DEVICE(vdev)->mem_reentrancy_guard);
}
n->vqs[index].tx_waiting = 0;
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index ac24eeb5ed..e5a468975e 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -4607,7 +4607,8 @@ static void nvme_init_sq(NvmeSQueue *sq, NvmeCtrl *n, uint64_t dma_addr,
QTAILQ_INSERT_TAIL(&(sq->req_list), &sq->io_req[i], entry);
}
- sq->bh = qemu_bh_new(nvme_process_sq, sq);
+ sq->bh = qemu_bh_new_guarded(nvme_process_sq, sq,
+ &DEVICE(sq->ctrl)->mem_reentrancy_guard);
if (n->dbbuf_enabled) {
sq->db_addr = n->dbbuf_dbs + (sqid << 3);
@@ -5253,7 +5254,8 @@ static void nvme_init_cq(NvmeCQueue *cq, NvmeCtrl *n, uint64_t dma_addr,
}
}
n->cq[cqid] = cq;
- cq->bh = qemu_bh_new(nvme_post_cqes, cq);
+ cq->bh = qemu_bh_new_guarded(nvme_post_cqes, cq,
+ &DEVICE(cq->ctrl)->mem_reentrancy_guard);
}
static uint16_t nvme_create_cq(NvmeCtrl *n, NvmeRequest *req)
diff --git a/hw/scsi/mptsas.c b/hw/scsi/mptsas.c
index c485da792c..3de288b454 100644
--- a/hw/scsi/mptsas.c
+++ b/hw/scsi/mptsas.c
@@ -1322,7 +1322,8 @@ static void mptsas_scsi_realize(PCIDevice *dev, Error **errp)
}
s->max_devices = MPTSAS_NUM_PORTS;
- s->request_bh = qemu_bh_new(mptsas_fetch_requests, s);
+ s->request_bh = qemu_bh_new_guarded(mptsas_fetch_requests, s,
+ &DEVICE(dev)->mem_reentrancy_guard);
scsi_bus_init(&s->bus, sizeof(s->bus), &dev->qdev, &mptsas_scsi_info);
}
diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index c97176110c..3c20b47ad0 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -193,7 +193,8 @@ static void scsi_dma_restart_cb(void *opaque, bool running, RunState state)
AioContext *ctx = blk_get_aio_context(s->conf.blk);
/* The reference is dropped in scsi_dma_restart_bh.*/
object_ref(OBJECT(s));
- s->bh = aio_bh_new(ctx, scsi_dma_restart_bh, s);
+ s->bh = aio_bh_new_guarded(ctx, scsi_dma_restart_bh, s,
+ &DEVICE(s)->mem_reentrancy_guard);
qemu_bh_schedule(s->bh);
}
}
diff --git a/hw/scsi/vmw_pvscsi.c b/hw/scsi/vmw_pvscsi.c
index fa76696855..4de34536e9 100644
--- a/hw/scsi/vmw_pvscsi.c
+++ b/hw/scsi/vmw_pvscsi.c
@@ -1184,7 +1184,8 @@ pvscsi_realizefn(PCIDevice *pci_dev, Error **errp)
pcie_endpoint_cap_init(pci_dev, PVSCSI_EXP_EP_OFFSET);
}
- s->completion_worker = qemu_bh_new(pvscsi_process_completion_queue, s);
+ s->completion_worker = qemu_bh_new_guarded(pvscsi_process_completion_queue, s,
+ &DEVICE(pci_dev)->mem_reentrancy_guard);
scsi_bus_init(&s->bus, sizeof(s->bus), DEVICE(pci_dev), &pvscsi_scsi_info);
/* override default SCSI bus hotplug-handler, with pvscsi's one */
diff --git a/hw/usb/dev-uas.c b/hw/usb/dev-uas.c
index 88f99c05d5..f013ded91e 100644
--- a/hw/usb/dev-uas.c
+++ b/hw/usb/dev-uas.c
@@ -937,7 +937,8 @@ static void usb_uas_realize(USBDevice *dev, Error **errp)
QTAILQ_INIT(&uas->results);
QTAILQ_INIT(&uas->requests);
- uas->status_bh = qemu_bh_new(usb_uas_send_status_bh, uas);
+ uas->status_bh = qemu_bh_new_guarded(usb_uas_send_status_bh, uas,
+ &d->mem_reentrancy_guard);
dev->flags |= (1 << USB_DEV_FLAG_IS_SCSI_STORAGE);
scsi_bus_init(&uas->bus, sizeof(uas->bus), DEVICE(dev), &usb_uas_scsi_info);
diff --git a/hw/usb/hcd-dwc2.c b/hw/usb/hcd-dwc2.c
index 8755e9cbb0..a0c4e782b2 100644
--- a/hw/usb/hcd-dwc2.c
+++ b/hw/usb/hcd-dwc2.c
@@ -1364,7 +1364,8 @@ static void dwc2_realize(DeviceState *dev, Error **errp)
s->fi = USB_FRMINTVL - 1;
s->eof_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, dwc2_frame_boundary, s);
s->frame_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, dwc2_work_timer, s);
- s->async_bh = qemu_bh_new(dwc2_work_bh, s);
+ s->async_bh = qemu_bh_new_guarded(dwc2_work_bh, s,
+ &dev->mem_reentrancy_guard);
sysbus_init_irq(sbd, &s->irq);
}
diff --git a/hw/usb/hcd-ehci.c b/hw/usb/hcd-ehci.c
index d4da8dcb8d..c930c60921 100644
--- a/hw/usb/hcd-ehci.c
+++ b/hw/usb/hcd-ehci.c
@@ -2533,7 +2533,8 @@ void usb_ehci_realize(EHCIState *s, DeviceState *dev, Error **errp)
}
s->frame_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, ehci_work_timer, s);
- s->async_bh = qemu_bh_new(ehci_work_bh, s);
+ s->async_bh = qemu_bh_new_guarded(ehci_work_bh, s,
+ &dev->mem_reentrancy_guard);
s->device = dev;
s->vmstate = qemu_add_vm_change_state_handler(usb_ehci_vm_state_change, s);
diff --git a/hw/usb/hcd-uhci.c b/hw/usb/hcd-uhci.c
index 8ac1175ad2..77baaa7a6b 100644
--- a/hw/usb/hcd-uhci.c
+++ b/hw/usb/hcd-uhci.c
@@ -1190,7 +1190,7 @@ void usb_uhci_common_realize(PCIDevice *dev, Error **errp)
USB_SPEED_MASK_LOW | USB_SPEED_MASK_FULL);
}
}
- s->bh = qemu_bh_new(uhci_bh, s);
+ s->bh = qemu_bh_new_guarded(uhci_bh, s, &DEVICE(dev)->mem_reentrancy_guard);
s->frame_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, uhci_frame_timer, s);
s->num_ports_vmstate = NB_PORTS;
QTAILQ_INIT(&s->queues);
diff --git a/hw/usb/host-libusb.c b/hw/usb/host-libusb.c
index 176868d345..f500db85ab 100644
--- a/hw/usb/host-libusb.c
+++ b/hw/usb/host-libusb.c
@@ -1141,7 +1141,8 @@ static void usb_host_nodev_bh(void *opaque)
static void usb_host_nodev(USBHostDevice *s)
{
if (!s->bh_nodev) {
- s->bh_nodev = qemu_bh_new(usb_host_nodev_bh, s);
+ s->bh_nodev = qemu_bh_new_guarded(usb_host_nodev_bh, s,
+ &DEVICE(s)->mem_reentrancy_guard);
}
qemu_bh_schedule(s->bh_nodev);
}
@@ -1739,7 +1740,8 @@ static int usb_host_post_load(void *opaque, int version_id)
USBHostDevice *dev = opaque;
if (!dev->bh_postld) {
- dev->bh_postld = qemu_bh_new(usb_host_post_load_bh, dev);
+ dev->bh_postld = qemu_bh_new_guarded(usb_host_post_load_bh, dev,
+ &DEVICE(dev)->mem_reentrancy_guard);
}
qemu_bh_schedule(dev->bh_postld);
dev->bh_postld_pending = true;
diff --git a/hw/usb/redirect.c b/hw/usb/redirect.c
index fd7df599bc..39fbaaab16 100644
--- a/hw/usb/redirect.c
+++ b/hw/usb/redirect.c
@@ -1441,8 +1441,10 @@ static void usbredir_realize(USBDevice *udev, Error **errp)
}
}
- dev->chardev_close_bh = qemu_bh_new(usbredir_chardev_close_bh, dev);
- dev->device_reject_bh = qemu_bh_new(usbredir_device_reject_bh, dev);
+ dev->chardev_close_bh = qemu_bh_new_guarded(usbredir_chardev_close_bh, dev,
+ &DEVICE(dev)->mem_reentrancy_guard);
+ dev->device_reject_bh = qemu_bh_new_guarded(usbredir_device_reject_bh, dev,
+ &DEVICE(dev)->mem_reentrancy_guard);
dev->attach_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, usbredir_do_attach, dev);
packet_id_queue_init(&dev->cancelled, dev, "cancelled");
diff --git a/hw/usb/xen-usb.c b/hw/usb/xen-usb.c
index 66cb3f7c24..38ee660a30 100644
--- a/hw/usb/xen-usb.c
+++ b/hw/usb/xen-usb.c
@@ -1032,7 +1032,8 @@ static void usbback_alloc(struct XenLegacyDevice *xendev)
QTAILQ_INIT(&usbif->req_free_q);
QSIMPLEQ_INIT(&usbif->hotplug_q);
- usbif->bh = qemu_bh_new(usbback_bh, usbif);
+ usbif->bh = qemu_bh_new_guarded(usbback_bh, usbif,
+ &DEVICE(xendev)->mem_reentrancy_guard);
}
static int usbback_free(struct XenLegacyDevice *xendev)
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 43092aa634..5186e831dd 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -909,8 +909,9 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
precopy_add_notifier(&s->free_page_hint_notify);
object_ref(OBJECT(s->iothread));
- s->free_page_bh = aio_bh_new(iothread_get_aio_context(s->iothread),
- virtio_ballloon_get_free_page_hints, s);
+ s->free_page_bh = aio_bh_new_guarded(iothread_get_aio_context(s->iothread),
+ virtio_ballloon_get_free_page_hints, s,
+ &dev->mem_reentrancy_guard);
}
if (virtio_has_feature(s->host_features, VIRTIO_BALLOON_F_REPORTING)) {
diff --git a/hw/virtio/virtio-crypto.c b/hw/virtio/virtio-crypto.c
index 802e1b9659..2fe804510f 100644
--- a/hw/virtio/virtio-crypto.c
+++ b/hw/virtio/virtio-crypto.c
@@ -1074,7 +1074,8 @@ static void virtio_crypto_device_realize(DeviceState *dev, Error **errp)
vcrypto->vqs[i].dataq =
virtio_add_queue(vdev, 1024, virtio_crypto_handle_dataq_bh);
vcrypto->vqs[i].dataq_bh =
- qemu_bh_new(virtio_crypto_dataq_bh, &vcrypto->vqs[i]);
+ qemu_bh_new_guarded(virtio_crypto_dataq_bh, &vcrypto->vqs[i],
+ &dev->mem_reentrancy_guard);
vcrypto->vqs[i].vcrypto = vcrypto;
}
--
2.39.3

@ -1,141 +0,0 @@
From 8075a9e05699ef0c4e078017eefc20db3186328f Mon Sep 17 00:00:00 2001
From: Jon Maloy <jmaloy@redhat.com>
Date: Mon, 29 May 2023 14:21:08 -0400
Subject: [PATCH 17/21] hw/scsi/lsi53c895a: Fix reentrancy issues in the LSI
controller (CVE-2023-0330)
RH-Author: Jon Maloy <jmaloy@redhat.com>
RH-MergeRequest: 165: memory: prevent dma-reentrancy issues
RH-Jira: RHEL-516
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [13/13] 0b6fa742075ef2db3a354ee672dccca3747051cc (jmaloy/jmaloy-qemu-kvm-2)
Jira: https://issues.redhat.com/browse/RHEL-516
Upstream: Merged
CVE: CVE-2023-2680
commit b987718bbb1d0eabf95499b976212dd5f0120d75
Author: Thomas Huth <thuth@redhat.com>
Date: Mon May 22 11:10:11 2023 +0200
hw/scsi/lsi53c895a: Fix reentrancy issues in the LSI controller (CVE-2023-0330)
We cannot use the generic reentrancy guard in the LSI code, so
we have to manually prevent endless reentrancy here. The problematic
lsi_execute_script() function already has a way to detect whether
too many instructions have been executed - we just have to slightly
change the logic here so that it also takes into account whether the
function has been called too often in a reentrant way.
The code in fuzz-lsi53c895a-test.c has been taken from an earlier
patch by Mauro Matteo Cascella.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1563
Message-Id: <20230522091011.1082574-1-thuth@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
---
hw/scsi/lsi53c895a.c | 23 +++++++++++++++------
tests/qtest/fuzz-lsi53c895a-test.c | 33 ++++++++++++++++++++++++++++++
2 files changed, 50 insertions(+), 6 deletions(-)
diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index 048436352b..f7d45b0b20 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -1134,15 +1134,24 @@ static void lsi_execute_script(LSIState *s)
uint32_t addr, addr_high;
int opcode;
int insn_processed = 0;
+ static int reentrancy_level;
+
+ reentrancy_level++;
s->istat1 |= LSI_ISTAT1_SRUN;
again:
- if (++insn_processed > LSI_MAX_INSN) {
- /* Some windows drivers make the device spin waiting for a memory
- location to change. If we have been executed a lot of code then
- assume this is the case and force an unexpected device disconnect.
- This is apparently sufficient to beat the drivers into submission.
- */
+ /*
+ * Some windows drivers make the device spin waiting for a memory location
+ * to change. If we have executed more than LSI_MAX_INSN instructions then
+ * assume this is the case and force an unexpected device disconnect. This
+ * is apparently sufficient to beat the drivers into submission.
+ *
+ * Another issue (CVE-2023-0330) can occur if the script is programmed to
+ * trigger itself again and again. Avoid this problem by stopping after
+ * being called multiple times in a reentrant way (8 is an arbitrary value
+ * which should be enough for all valid use cases).
+ */
+ if (++insn_processed > LSI_MAX_INSN || reentrancy_level > 8) {
if (!(s->sien0 & LSI_SIST0_UDC)) {
qemu_log_mask(LOG_GUEST_ERROR,
"lsi_scsi: inf. loop with UDC masked");
@@ -1596,6 +1605,8 @@ again:
}
}
trace_lsi_execute_script_stop();
+
+ reentrancy_level--;
}
static uint8_t lsi_reg_readb(LSIState *s, int offset)
diff --git a/tests/qtest/fuzz-lsi53c895a-test.c b/tests/qtest/fuzz-lsi53c895a-test.c
index 2012bd54b7..1b55928b9f 100644
--- a/tests/qtest/fuzz-lsi53c895a-test.c
+++ b/tests/qtest/fuzz-lsi53c895a-test.c
@@ -8,6 +8,36 @@
#include "qemu/osdep.h"
#include "libqtest.h"
+/*
+ * This used to trigger a DMA reentrancy issue
+ * leading to memory corruption bugs like stack
+ * overflow or use-after-free
+ * https://gitlab.com/qemu-project/qemu/-/issues/1563
+ */
+static void test_lsi_dma_reentrancy(void)
+{
+ QTestState *s;
+
+ s = qtest_init("-M q35 -m 512M -nodefaults "
+ "-blockdev driver=null-co,node-name=null0 "
+ "-device lsi53c810 -device scsi-cd,drive=null0");
+
+ qtest_outl(s, 0xcf8, 0x80000804); /* PCI Command Register */
+ qtest_outw(s, 0xcfc, 0x7); /* Enables accesses */
+ qtest_outl(s, 0xcf8, 0x80000814); /* Memory Bar 1 */
+ qtest_outl(s, 0xcfc, 0xff100000); /* Set MMIO Address*/
+ qtest_outl(s, 0xcf8, 0x80000818); /* Memory Bar 2 */
+ qtest_outl(s, 0xcfc, 0xff000000); /* Set RAM Address*/
+ qtest_writel(s, 0xff000000, 0xc0000024);
+ qtest_writel(s, 0xff000114, 0x00000080);
+ qtest_writel(s, 0xff00012c, 0xff000000);
+ qtest_writel(s, 0xff000004, 0xff000114);
+ qtest_writel(s, 0xff000008, 0xff100014);
+ qtest_writel(s, 0xff10002f, 0x000000ff);
+
+ qtest_quit(s);
+}
+
/*
* This used to trigger a UAF in lsi_do_msgout()
* https://gitlab.com/qemu-project/qemu/-/issues/972
@@ -124,5 +154,8 @@ int main(int argc, char **argv)
qtest_add_func("fuzz/lsi53c895a/lsi_do_msgout_cancel_req",
test_lsi_do_msgout_cancel_req);
+ qtest_add_func("fuzz/lsi53c895a/lsi_dma_reentrancy",
+ test_lsi_dma_reentrancy);
+
return g_test_run();
}
--
2.39.3
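The depth-counter approach described in the commit message above can be sketched in isolation. This is a minimal illustration, not the QEMU code: execute_script(), calls_executed and MAX_REENTRANCY are hypothetical names, and the sketch decrements the counter on every exit path, which is a choice made here for clarity.

```c
#include <stdbool.h>

#define MAX_REENTRANCY 8     /* same arbitrary bound the patch uses */

static int reentrancy_level; /* shared by all nested calls, like the patch's static */
static int calls_executed;   /* hypothetical counter, only here to observe the effect */

/* Hypothetical stand-in for a script engine whose script re-triggers itself. */
static void execute_script(bool retrigger)
{
    reentrancy_level++;

    if (reentrancy_level > MAX_REENTRANCY) {
        /* Nested too deeply: give up instead of recursing without bound. */
        reentrancy_level--;
        return;
    }

    calls_executed++;
    if (retrigger) {
        execute_script(retrigger);   /* models the script triggering itself */
    }

    reentrancy_level--;
}
```

With a self-triggering script the body runs eight times and then unwinds, leaving the counter at zero so that later, unrelated calls are not penalized.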

@ -1,76 +0,0 @@
From fcd6219a95851d17fd8bde69d87e78c6533be990 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Wed, 12 Jul 2023 17:46:57 +0200
Subject: [PATCH 24/37] hw/vfio/pci-quirks: Sanitize capability pointer
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 179: vfio: live migration support
RH-Bugzilla: 2192818
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [22/28] cb080409c1912f4365f8e31cd23c914b48f91575 (clegoate/qemu-kvm-c9s)
Bugzilla: https://bugzilla.redhat.com/2192818
commit 0ddcb39c9357
Author: Alex Williamson <alex.williamson@redhat.com>
Date: Fri Jun 30 16:36:08 2023 -0600
hw/vfio/pci-quirks: Sanitize capability pointer
Coverity reports a tainted scalar when traversing the capabilities
chain (CID 1516589). In practice I've never seen a device with a
chain so broken as to cause an issue, but it's also pretty easy to
sanitize.
Fixes: f6b30c1984f7 ("hw/vfio/pci-quirks: Support alternate offset for GPUDirect Cliques")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
hw/vfio/pci-quirks.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
index 0ed2fcd531..f4ff836805 100644
--- a/hw/vfio/pci-quirks.c
+++ b/hw/vfio/pci-quirks.c
@@ -1530,6 +1530,12 @@ const PropertyInfo qdev_prop_nv_gpudirect_clique = {
.set = set_nv_gpudirect_clique_id,
};
+static bool is_valid_std_cap_offset(uint8_t pos)
+{
+ return (pos >= PCI_STD_HEADER_SIZEOF &&
+ pos <= (PCI_CFG_SPACE_SIZE - PCI_CAP_SIZEOF));
+}
+
static int vfio_add_nv_gpudirect_cap(VFIOPCIDevice *vdev, Error **errp)
{
PCIDevice *pdev = &vdev->pdev;
@@ -1563,7 +1569,7 @@ static int vfio_add_nv_gpudirect_cap(VFIOPCIDevice *vdev, Error **errp)
*/
ret = pread(vdev->vbasedev.fd, &tmp, 1,
vdev->config_offset + PCI_CAPABILITY_LIST);
- if (ret != 1 || !tmp) {
+ if (ret != 1 || !is_valid_std_cap_offset(tmp)) {
error_setg(errp, "NVIDIA GPUDirect Clique ID: error getting cap list");
return -EINVAL;
}
@@ -1575,7 +1581,7 @@ static int vfio_add_nv_gpudirect_cap(VFIOPCIDevice *vdev, Error **errp)
d4_conflict = true;
}
tmp = pdev->config[tmp + PCI_CAP_LIST_NEXT];
- } while (tmp);
+ } while (is_valid_std_cap_offset(tmp));
if (!c8_conflict) {
pos = 0xC8;
--
2.39.3
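The bounds check added by this patch can be exercised on its own with a plain byte array standing in for config space. The sketch below is an assumption-laden illustration: the constants mirror the usual values for conventional PCI (256-byte config space, 64-byte standard header, 4-byte minimum capability), cap_chain_contains() is a hypothetical helper, and the iteration budget is an extra safeguard added here that the patch itself does not have.

```c
#include <stdbool.h>
#include <stdint.h>

#define CFG_SPACE_SIZE  256  /* conventional PCI config space size */
#define STD_HEADER_SIZE 64   /* capabilities live after the standard header */
#define CAP_SIZEOF      4    /* minimum size of a capability */
#define CAP_LIST_NEXT   1    /* offset of the "next" pointer inside a capability */

/* Same predicate as the patch, written against the local constants. */
static bool is_valid_std_cap_offset(uint8_t pos)
{
    return pos >= STD_HEADER_SIZE && pos <= (CFG_SPACE_SIZE - CAP_SIZEOF);
}

/* Hypothetical helper: true if a capability in the chain starts at 'wanted'. */
static bool cap_chain_contains(const uint8_t *config, uint8_t head,
                               uint8_t wanted)
{
    uint8_t pos = head;
    int budget = CFG_SPACE_SIZE / CAP_SIZEOF; /* not in the patch: stops cycles */

    while (is_valid_std_cap_offset(pos) && budget-- > 0) {
        if (pos == wanted) {
            return true;
        }
        pos = config[pos + CAP_LIST_NEXT];    /* in bounds: pos <= 252 */
    }
    return false;
}
```

Because the walk stops at the first pointer that falls outside the valid window, a corrupted "next" byte ends the traversal instead of indexing into the header or past the end of the array.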

@ -1,110 +0,0 @@
From dd38230a0a375fb8427fa106ff79562e56c51b6c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?C=C3=A9dric=20Le=20Goater?= <clg@redhat.com>
Date: Wed, 12 Jul 2023 17:46:57 +0200
Subject: [PATCH 18/37] hw/vfio/pci-quirks: Support alternate offset for
GPUDirect Cliques
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Cédric Le Goater <clg@redhat.com>
RH-MergeRequest: 179: vfio: live migration support
RH-Bugzilla: 2192818
RH-Acked-by: Eric Auger <eric.auger@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [16/28] 9befb7c9adaeb58e9d0b49686cf54b751c742832 (clegoate/qemu-kvm-c9s)
Bugzilla: https://bugzilla.redhat.com/2192818
commit f6b30c1984f7
Author: Alex Williamson <alex.williamson@redhat.com>
Date: Thu Jun 8 12:05:07 2023 -0600
hw/vfio/pci-quirks: Support alternate offset for GPUDirect Cliques
NVIDIA Turing and newer GPUs implement the MSI-X capability at the offset
previously reserved for use by hypervisors to implement the GPUDirect
Cliques capability. A revised specification provides an alternate
location. Add a config space walk to the quirk to check for conflicts,
allowing us to fall back to the new location or generate an error at the
quirk setup rather than when the real conflicting capability is added
should there be no available location.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
---
hw/vfio/pci-quirks.c | 41 ++++++++++++++++++++++++++++++++++++++++-
1 file changed, 40 insertions(+), 1 deletion(-)
diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
index f0147a050a..0ed2fcd531 100644
--- a/hw/vfio/pci-quirks.c
+++ b/hw/vfio/pci-quirks.c
@@ -1490,6 +1490,9 @@ void vfio_setup_resetfn_quirk(VFIOPCIDevice *vdev)
* +---------------------------------+---------------------------------+
*
* https://lists.gnu.org/archive/html/qemu-devel/2017-08/pdfUda5iEpgOS.pdf
+ *
+ * Specification for Turing and later GPU architectures:
+ * https://lists.gnu.org/archive/html/qemu-devel/2023-06/pdf142OR4O4c2.pdf
*/
static void get_nv_gpudirect_clique_id(Object *obj, Visitor *v,
const char *name, void *opaque,
@@ -1530,7 +1533,9 @@ const PropertyInfo qdev_prop_nv_gpudirect_clique = {
static int vfio_add_nv_gpudirect_cap(VFIOPCIDevice *vdev, Error **errp)
{
PCIDevice *pdev = &vdev->pdev;
- int ret, pos = 0xC8;
+ int ret, pos;
+ bool c8_conflict = false, d4_conflict = false;
+ uint8_t tmp;
if (vdev->nv_gpudirect_clique == 0xFF) {
return 0;
@@ -1547,6 +1552,40 @@ static int vfio_add_nv_gpudirect_cap(VFIOPCIDevice *vdev, Error **errp)
return -EINVAL;
}
+ /*
+ * Per the updated specification above, it's recommended to use offset
+ * D4h for Turing and later GPU architectures due to a conflict of the
+ * MSI-X capability at C8h. We don't know how to determine the GPU
+ * architecture, instead we walk the capability chain to mark conflicts
+ * and choose one or error based on the result.
+ *
+ * NB. Cap list head in pdev->config is already cleared, read from device.
+ */
+ ret = pread(vdev->vbasedev.fd, &tmp, 1,
+ vdev->config_offset + PCI_CAPABILITY_LIST);
+ if (ret != 1 || !tmp) {
+ error_setg(errp, "NVIDIA GPUDirect Clique ID: error getting cap list");
+ return -EINVAL;
+ }
+
+ do {
+ if (tmp == 0xC8) {
+ c8_conflict = true;
+ } else if (tmp == 0xD4) {
+ d4_conflict = true;
+ }
+ tmp = pdev->config[tmp + PCI_CAP_LIST_NEXT];
+ } while (tmp);
+
+ if (!c8_conflict) {
+ pos = 0xC8;
+ } else if (!d4_conflict) {
+ pos = 0xD4;
+ } else {
+ error_setg(errp, "NVIDIA GPUDirect Clique ID: invalid config space");
+ return -EINVAL;
+ }
+
ret = pci_add_capability(pdev, PCI_CAP_ID_VNDR, pos, 8, errp);
if (ret < 0) {
error_prepend(errp, "Failed to add NVIDIA GPUDirect cap: ");
--
2.39.3
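The fallback order the patch implements (prefer C8h, fall back to D4h, otherwise fail) reduces to a small decision function. A minimal sketch, assuming the two conflict flags have already been computed by a capability walk; pick_clique_cap_offset() is a hypothetical name and -1 stands in for the -EINVAL error path.

```c
#include <stdbool.h>

/* Returns the config space offset to use, or -1 if both are occupied. */
static int pick_clique_cap_offset(bool c8_conflict, bool d4_conflict)
{
    if (!c8_conflict) {
        return 0xC8;   /* original location, free on pre-Turing GPUs */
    }
    if (!d4_conflict) {
        return 0xD4;   /* alternate location from the revised specification */
    }
    return -1;         /* no free slot: report an error at quirk setup */
}
```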

@ -0,0 +1,108 @@
From c554f8768a18ceba173aedbd582c1cae43a41e2c Mon Sep 17 00:00:00 2001
From: Thomas Huth <thuth@redhat.com>
Date: Tue, 18 Jun 2024 14:19:58 +0200
Subject: [PATCH 1/2] hw/virtio: Fix the de-initialization of vhost-user
devices
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Thomas Huth <thuth@redhat.com>
RH-MergeRequest: 255: hw/virtio: Fix the de-initialization of vhost-user devices
RH-Jira: RHEL-40708
RH-Acked-by: Cédric Le Goater <clg@redhat.com>
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [1/1] c7815a249ec135993f45934cab1c1f2c038b80ea (thuth/qemu-kvm-cs9)
JIRA: https://issues.redhat.com/browse/RHEL-40708
The unrealize functions of the various vhost-user devices are
calling the corresponding vhost_*_set_status() functions with a
status of 0 to shut down the device correctly.
Now these vhost_*_set_status() functions all follow this scheme:
bool should_start = virtio_device_should_start(vdev, status);
if (vhost_dev_is_started(&vvc->vhost_dev) == should_start) {
return;
}
if (should_start) {
/* ... do the initialization stuff ... */
} else {
/* ... do the cleanup stuff ... */
}
The problem here is virtio_device_should_start(vdev, 0) currently
always returns "true" since it internally only looks at vdev->started
instead of looking at the "status" parameter. Thus once the device
got started once, virtio_device_should_start() always returns true
and thus the vhost_*_set_status() functions return early, without
ever doing any clean-up when being called with status == 0. This
causes problems, e.g. when trying to hot-plug and hot-unplug a
vhost-user device multiple times, since the de-initialization step is
completely skipped during the unplug operation.
This bug has been introduced in commit 9f6bcfd99f ("hw/virtio: move
vm_running check to virtio_device_started") which replaced
should_start = status & VIRTIO_CONFIG_S_DRIVER_OK;
with
should_start = virtio_device_started(vdev, status);
which later got replaced by virtio_device_should_start(). This blocked
the possibility to set should_start to false in case the status flag
VIRTIO_CONFIG_S_DRIVER_OK was not set.
Fix it by adjusting the virtio_device_should_start() function to
only consider the status flag instead of vdev->started. Since this
function is only used in the various vhost_*_set_status() functions
for exactly the same purpose, it should be fine to fix it in this
central place there without any risk to change the behavior of other
code.
Fixes: 9f6bcfd99f ("hw/virtio: move vm_running check to virtio_device_started")
Buglink: https://issues.redhat.com/browse/RHEL-40708
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20240618121958.88673-1-thuth@redhat.com>
Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit d72479b11797c28893e1e3fc565497a9cae5ca16)
Signed-off-by: Thomas Huth <thuth@redhat.com>
---
include/hw/virtio/virtio.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 7d5ffdc145..2eafad17b8 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -470,9 +470,9 @@ static inline bool virtio_device_started(VirtIODevice *vdev, uint8_t status)
* @vdev - the VirtIO device
* @status - the devices status bits
*
- * This is similar to virtio_device_started() but also encapsulates a
- * check on the VM status which would prevent a device starting
- * anyway.
+ * This is similar to virtio_device_started() but ignores vdev->started
+ * and also encapsulates a check on the VM status which would prevent a
+ * device from starting anyway.
*/
static inline bool virtio_device_should_start(VirtIODevice *vdev, uint8_t status)
{
@@ -480,7 +480,7 @@ static inline bool virtio_device_should_start(VirtIODevice *vdev, uint8_t status
return false;
}
- return virtio_device_started(vdev, status);
+ return status & VIRTIO_CONFIG_S_DRIVER_OK;
}
static inline void virtio_set_started(VirtIODevice *vdev, bool started)
--
2.39.3
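The behavioral difference described in the commit message can be reproduced with a reduced model of the device state. This is a sketch under assumptions: struct dev_model is a hypothetical stand-in for VirtIODevice, the use_started field reflects how virtio_device_started() is understood to pick between vdev->started and the status bits, and only the logic relevant to the bug is kept.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_CONFIG_S_DRIVER_OK 4   /* DRIVER_OK status bit */

/* Hypothetical, reduced stand-in for VirtIODevice. */
struct dev_model {
    bool vm_running;
    bool use_started;
    bool started;
};

/* Model of virtio_device_started(): may ignore 'status' entirely. */
static bool device_started(const struct dev_model *d, uint8_t status)
{
    if (d->use_started) {
        return d->started;
    }
    return status & VIRTIO_CONFIG_S_DRIVER_OK;
}

/* Behavior before the patch. */
static bool should_start_old(const struct dev_model *d, uint8_t status)
{
    if (!d->vm_running) {
        return false;
    }
    return device_started(d, status);
}

/* Behavior after the patch: only the status bits decide. */
static bool should_start_new(const struct dev_model *d, uint8_t status)
{
    if (!d->vm_running) {
        return false;
    }
    return status & VIRTIO_CONFIG_S_DRIVER_OK;
}
```

For a device that has already been started, the old variant answers true even when called with status 0, so a vhost_*_set_status() caller would see "no change" and return before its cleanup branch; the new variant answers false and lets the cleanup run.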

@ -1,62 +0,0 @@
From 0a731ac1191182546e80af5f39d178a5a2f3688f Mon Sep 17 00:00:00 2001
From: Eric Auger <eric.auger@redhat.com>
Date: Mon, 17 Jul 2023 18:21:26 +0200
Subject: [PATCH 07/14] hw/virtio-iommu: Fix potential OOB access in
virtio_iommu_handle_command()
RH-Author: Eric Auger <eric.auger@redhat.com>
RH-MergeRequest: 197: virtio-iommu/smmu: backport some late fixes
RH-Bugzilla: 2229133
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Peter Xu <peterx@redhat.com>
RH-Commit: [1/3] ecdb1e1aa6b93761dc87ea79bc0a1093ad649a74 (eauger1/centos-qemu-kvm)
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2229133
In virtio_iommu_handle_command(), when a PROBE request is handled,
output_size takes a value greater than the tail size, and on a subsequent
iteration we can get a stack out-of-bounds access. Initialize
output_size on each iteration.
The issue was found with ASAN. Credits to:
Yiming Tao(Zhejiang University)
Gaoning Pan(Zhejiang University)
Fixes: 1733eebb9e7 ("virtio-iommu: Implement RESV_MEM probe request")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reported-by: Mauro Matteo Cascella <mcascell@redhat.com>
Cc: qemu-stable@nongnu.org
Message-Id: <20230717162126.11693-1-eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit cf2f89edf36a59183166ae8721a8d7ab5cd286bd)
Signed-off-by: Eric Auger <eric.auger@redhat.com>
---
hw/virtio/virtio-iommu.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c
index 421e2a944f..17ce630200 100644
--- a/hw/virtio/virtio-iommu.c
+++ b/hw/virtio/virtio-iommu.c
@@ -728,13 +728,15 @@ static void virtio_iommu_handle_command(VirtIODevice *vdev, VirtQueue *vq)
VirtIOIOMMU *s = VIRTIO_IOMMU(vdev);
struct virtio_iommu_req_head head;
struct virtio_iommu_req_tail tail = {};
- size_t output_size = sizeof(tail), sz;
VirtQueueElement *elem;
unsigned int iov_cnt;
struct iovec *iov;
void *buf = NULL;
+ size_t sz;
for (;;) {
+ size_t output_size = sizeof(tail);
+
elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
if (!elem) {
return;
--
2.39.3
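The bug class fixed here, a per-request size that leaks from one loop iteration into the next, can be shown with a toy request loop. Everything in this sketch is hypothetical (struct req, TAIL_SIZE, last_reply_size()); it only models how the reply size is chosen, not the virtqueue handling.

```c
#include <stdbool.h>
#include <stddef.h>

#define TAIL_SIZE 4   /* hypothetical size of the fixed reply tail */

/* Hypothetical request: a non-zero extra_reply models a PROBE-like request. */
struct req {
    size_t extra_reply;
};

/* Returns the reply size that would be used for the last request. */
static size_t last_reply_size(const struct req *reqs, size_t n,
                              bool reset_each_iteration)
{
    size_t output_size = TAIL_SIZE;   /* pre-patch: initialized only once */

    for (size_t i = 0; i < n; i++) {
        if (reset_each_iteration) {
            output_size = TAIL_SIZE;  /* post-patch: fresh value per request */
        }
        if (reqs[i].extra_reply) {
            output_size = reqs[i].extra_reply + TAIL_SIZE;
        }
        /* A real handler would now copy output_size bytes into the reply. */
    }
    return output_size;
}
```

Without the per-iteration reset, a small request that follows a large one inherits the large size, which is the condition that leads to reading past the small stack buffer.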

@ -0,0 +1,68 @@
From f572a40924c7138072e387111d0f092185972477 Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini@redhat.com>
Date: Thu, 9 May 2024 19:00:39 +0200
Subject: [PATCH 044/100] i386: correctly select code in hw/i386 that depends
on other components
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [44/91] 1327a5eb2b91edacf56cc4e93255cad456abbbeb (bonzini/rhel-qemu-kvm)
fw_cfg.c and vapic.c are currently included unconditionally but
depend on other components. vapic.c depends on the local APIC,
while fw_cfg.c includes a piece of AML builder code that depends
on CONFIG_ACPI.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240509170044.190795-9-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7974e51342775c87f6e759a8c525db1045ddfa24)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
hw/i386/fw_cfg.c | 2 ++
hw/i386/meson.build | 2 +-
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/hw/i386/fw_cfg.c b/hw/i386/fw_cfg.c
index 283c3f4c16..7f97d40616 100644
--- a/hw/i386/fw_cfg.c
+++ b/hw/i386/fw_cfg.c
@@ -204,6 +204,7 @@ void fw_cfg_build_feature_control(MachineState *ms, FWCfgState *fw_cfg)
fw_cfg_add_file(fw_cfg, "etc/msr_feature_control", val, sizeof(*val));
}
+#ifdef CONFIG_ACPI
void fw_cfg_add_acpi_dsdt(Aml *scope, FWCfgState *fw_cfg)
{
/*
@@ -230,3 +231,4 @@ void fw_cfg_add_acpi_dsdt(Aml *scope, FWCfgState *fw_cfg)
aml_append(dev, aml_name_decl("_CRS", crs));
aml_append(scope, dev);
}
+#endif
diff --git a/hw/i386/meson.build b/hw/i386/meson.build
index d8b70ef3e9..d9da676038 100644
--- a/hw/i386/meson.build
+++ b/hw/i386/meson.build
@@ -1,12 +1,12 @@
i386_ss = ss.source_set()
i386_ss.add(files(
'fw_cfg.c',
- 'vapic.c',
'e820_memory_layout.c',
'multiboot.c',
'x86.c',
))
+i386_ss.add(when: 'CONFIG_APIC', if_true: files('vapic.c'))
i386_ss.add(when: 'CONFIG_X86_IOMMU', if_true: files('x86-iommu.c'),
if_false: files('x86-iommu-stub.c'))
i386_ss.add(when: 'CONFIG_AMD_IOMMU', if_true: files('amd_iommu.c'),
--
2.39.3

@ -0,0 +1,40 @@
From 127f3c60668e1bd08ec00856a317cb841adf0440 Mon Sep 17 00:00:00 2001
From: Michael Roth <michael.roth@amd.com>
Date: Thu, 30 May 2024 06:16:23 -0500
Subject: [PATCH 063/100] i386/cpu: Set SEV-SNP CPUID bit when SNP enabled
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [63/91] 0f834a6897c5cdc0e29a5b1862e621f8ce309657 (bonzini/rhel-qemu-kvm)
SNP guests will rely on this bit to determine certain feature support.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-12-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7831221941cccbde922412c1550ed8b4bce7c361)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
target/i386/cpu.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 489c853b42..13737cd703 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6822,6 +6822,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
if (sev_enabled()) {
*eax = 0x2;
*eax |= sev_es_enabled() ? 0x8 : 0;
+ *eax |= sev_snp_enabled() ? 0x10 : 0;
*ebx = sev_get_cbit_position() & 0x3f; /* EBX[5:0] */
*ebx |= (sev_get_reduced_phys_bits() & 0x3f) << 6; /* EBX[11:6] */
}
--
2.39.3
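The register layout visible in the hunk above can be checked with a standalone helper. A minimal sketch: build_sev_cpuid() is a hypothetical function that takes the feature flags and field values as plain parameters instead of calling sev_enabled() and friends; the bit values and masks are the ones shown in the diff.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: fills EAX/EBX for CPUID 0x8000001F when SEV is on. */
static void build_sev_cpuid(bool es, bool snp, uint32_t cbit_pos,
                            uint32_t reduced_phys_bits,
                            uint32_t *eax, uint32_t *ebx)
{
    *eax = 0x2;                               /* SEV */
    *eax |= es ? 0x8 : 0;                     /* SEV-ES */
    *eax |= snp ? 0x10 : 0;                   /* SEV-SNP, added by this patch */
    *ebx = cbit_pos & 0x3f;                   /* EBX[5:0] */
    *ebx |= (reduced_phys_bits & 0x3f) << 6;  /* EBX[11:6] */
}
```

Masking both fields to six bits keeps an out-of-range value from spilling into the neighboring field.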

@ -1,52 +0,0 @@
From f9d982fae156aa9db0506e1e098c1e8a7f7eec94 Mon Sep 17 00:00:00 2001
From: Bandan Das <bsd@redhat.com>
Date: Thu, 3 Aug 2023 14:29:15 -0400
Subject: [PATCH 13/14] i386/cpu: Update how the EBX register of CPUID
0x8000001F is set
RH-Author: Bandan Das <None>
RH-MergeRequest: 196: Updates to SEV reduced-phys-bits parameter
RH-Bugzilla: 2214839
RH-Acked-by: Miroslav Rezanina <mrezanin@redhat.com>
RH-Commit: [4/4] efc368b2c844fd4fbc3c755a5e2da288329e7a2c (bdas1/qemu-kvm)
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2214839
commit fb6bbafc0f19385fb257ee073ed13dcaf613f2f8
Author: Tom Lendacky <thomas.lendacky@amd.com>
Date: Fri Sep 30 10:14:30 2022 -0500
i386/cpu: Update how the EBX register of CPUID 0x8000001F is set
Update the setting of CPUID 0x8000001F EBX to clearly document the ranges
associated with fields being set.
Fixes: 6cb8f2a663 ("cpu/i386: populate CPUID 0x8000_001F when SEV is active")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <5822fd7d02b575121380e1f493a8f6d9eba2b11a.1664550870.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bandan Das <bsd@redhat.com>
---
target/i386/cpu.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 839706b430..4ac3046313 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -6008,8 +6008,8 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count,
if (sev_enabled()) {
*eax = 0x2;
*eax |= sev_es_enabled() ? 0x8 : 0;
- *ebx = sev_get_cbit_position();
- *ebx |= sev_get_reduced_phys_bits() << 6;
+ *ebx = sev_get_cbit_position() & 0x3f; /* EBX[5:0] */
+ *ebx |= (sev_get_reduced_phys_bits() & 0x3f) << 6; /* EBX[11:6] */
}
break;
default:
--
2.39.3

@ -0,0 +1,145 @@
From 14aa42bbacde75b2ce9a59d1267f73d613026461 Mon Sep 17 00:00:00 2001
From: Michael Roth <michael.roth@amd.com>
Date: Thu, 30 May 2024 06:16:42 -0500
Subject: [PATCH 076/100] i386/kvm: Add KVM_EXIT_HYPERCALL handling for
KVM_HC_MAP_GPA_RANGE
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [76/91] 3e1201c330dc826af1ec4650974d47053270eb16 (bonzini/rhel-qemu-kvm)
KVM_HC_MAP_GPA_RANGE will be used to send requests to userspace for
private/shared memory attribute updates requested by the guest.
Implement handling for that use-case along with some basic
infrastructure for enabling specific hypercall events.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-31-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 47e76d03b155e43beca550251a6eb7ea926c059f)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
target/i386/kvm/kvm.c | 55 ++++++++++++++++++++++++++++++++++++
target/i386/kvm/kvm_i386.h | 1 +
target/i386/kvm/trace-events | 1 +
3 files changed, 57 insertions(+)
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 75e75d9772..2935e3931a 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -21,6 +21,7 @@
#include <sys/syscall.h>
#include <linux/kvm.h>
+#include <linux/kvm_para.h>
#include "standard-headers/asm-x86/kvm_para.h"
#include "hw/xen/interface/arch-x86/cpuid.h"
@@ -208,6 +209,13 @@ int kvm_get_vm_type(MachineState *ms)
return kvm_type;
}
+bool kvm_enable_hypercall(uint64_t enable_mask)
+{
+ KVMState *s = KVM_STATE(current_accel());
+
+ return !kvm_vm_enable_cap(s, KVM_CAP_EXIT_HYPERCALL, 0, enable_mask);
+}
+
bool kvm_has_smm(void)
{
return kvm_vm_check_extension(kvm_state, KVM_CAP_X86_SMM);
@@ -5325,6 +5333,50 @@ static bool host_supports_vmx(void)
return ecx & CPUID_EXT_VMX;
}
+/*
+ * Currently the handling here only supports use of KVM_HC_MAP_GPA_RANGE
+ * to service guest-initiated memory attribute update requests so that
+ * KVM_SET_MEMORY_ATTRIBUTES can update whether or not a page should be
+ * backed by the private memory pool provided by guest_memfd, and as such
+ * is only applicable to guest_memfd-backed guests (e.g. SNP/TDX).
+ *
+ * Other use-cases for KVM_HC_MAP_GPA_RANGE, such as for SEV live
+ * migration, are not implemented here currently.
+ *
+ * For the guest_memfd use-case, these exits will generally be synthesized
+ * by KVM based on platform-specific hypercalls, like GHCB requests in the
+ * case of SEV-SNP, and not issued directly within the guest through the
+ * KVM_HC_MAP_GPA_RANGE hypercall. So in this case, KVM_HC_MAP_GPA_RANGE is
+ * not actually advertised to guests via the KVM CPUID feature bit, as
+ * opposed to SEV live migration where it would be. Since it is unlikely the
+ * SEV live migration use-case would be useful for guest-memfd backed guests,
+ * because private/shared page tracking is already provided through other
+ * means, these 2 use-cases should be treated as being mutually-exclusive.
+ */
+static int kvm_handle_hc_map_gpa_range(struct kvm_run *run)
+{
+ uint64_t gpa, size, attributes;
+
+ if (!machine_require_guest_memfd(current_machine))
+ return -EINVAL;
+
+ gpa = run->hypercall.args[0];
+ size = run->hypercall.args[1] * TARGET_PAGE_SIZE;
+ attributes = run->hypercall.args[2];
+
+ trace_kvm_hc_map_gpa_range(gpa, size, attributes, run->hypercall.flags);
+
+ return kvm_convert_memory(gpa, size, attributes & KVM_MAP_GPA_RANGE_ENCRYPTED);
+}
+
+static int kvm_handle_hypercall(struct kvm_run *run)
+{
+ if (run->hypercall.nr == KVM_HC_MAP_GPA_RANGE)
+ return kvm_handle_hc_map_gpa_range(run);
+
+ return -EINVAL;
+}
+
#define VMX_INVALID_GUEST_STATE 0x80000021
int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
@@ -5420,6 +5472,9 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
ret = kvm_xen_handle_exit(cpu, &run->xen);
break;
#endif
+ case KVM_EXIT_HYPERCALL:
+ ret = kvm_handle_hypercall(run);
+ break;
default:
fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
ret = -1;
diff --git a/target/i386/kvm/kvm_i386.h b/target/i386/kvm/kvm_i386.h
index 6b44844d95..34fc60774b 100644
--- a/target/i386/kvm/kvm_i386.h
+++ b/target/i386/kvm/kvm_i386.h
@@ -33,6 +33,7 @@
bool kvm_has_smm(void);
bool kvm_enable_x2apic(void);
bool kvm_hv_vpindex_settable(void);
+bool kvm_enable_hypercall(uint64_t enable_mask);
bool kvm_enable_sgx_provisioning(KVMState *s);
bool kvm_hyperv_expand_features(X86CPU *cpu, Error **errp);
diff --git a/target/i386/kvm/trace-events b/target/i386/kvm/trace-events
index b365a8e8e2..74a6234ff7 100644
--- a/target/i386/kvm/trace-events
+++ b/target/i386/kvm/trace-events
@@ -5,6 +5,7 @@ kvm_x86_fixup_msi_error(uint32_t gsi) "VT-d failed to remap interrupt for GSI %"
kvm_x86_add_msi_route(int virq) "Adding route entry for virq %d"
kvm_x86_remove_msi_route(int virq) "Removing route entry for virq %d"
kvm_x86_update_msi_routes(int num) "Updated %d MSI routes"
+kvm_hc_map_gpa_range(uint64_t gpa, uint64_t size, uint64_t attributes, uint64_t flags) "gpa 0x%" PRIx64 " size 0x%" PRIx64 " attributes 0x%" PRIx64 " flags 0x%" PRIx64
# xen-emu.c
kvm_xen_hypercall(int cpu, uint8_t cpl, uint64_t input, uint64_t a0, uint64_t a1, uint64_t a2, uint64_t ret) "xen_hypercall: cpu %d cpl %d input %" PRIu64 " a0 0x%" PRIx64 " a1 0x%" PRIx64 " a2 0x%" PRIx64" ret 0x%" PRIx64
--
2.39.3
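The argument decoding done by kvm_handle_hc_map_gpa_range() can be sketched without any KVM headers. Treat the following as an illustration under assumptions: struct hypercall_model and struct gpa_request are hypothetical, the numeric values for the hypercall number and the "encrypted" attribute bit are written from memory of linux/kvm_para.h and should be checked against the real header, and a 4 KiB page size is assumed in place of TARGET_PAGE_SIZE.

```c
#include <stdbool.h>
#include <stdint.h>

#define HC_MAP_GPA_RANGE        12           /* assumed KVM_HC_MAP_GPA_RANGE */
#define MAP_GPA_RANGE_ENCRYPTED (1ULL << 4)  /* assumed attribute bit */
#define PAGE_SIZE_4K            4096ULL      /* assumed target page size */

/* Hypothetical mirror of the fields used from kvm_run's hypercall exit. */
struct hypercall_model {
    uint64_t nr;
    uint64_t args[6];
};

/* Hypothetical decoded form of a memory attribute update request. */
struct gpa_request {
    uint64_t gpa;
    uint64_t size;
    bool to_private;
};

/* Returns 0 on success, -1 for any hypercall other than MAP_GPA_RANGE. */
static int decode_map_gpa_range(const struct hypercall_model *hc,
                                struct gpa_request *out)
{
    if (hc->nr != HC_MAP_GPA_RANGE) {
        return -1;
    }
    out->gpa = hc->args[0];                  /* start address */
    out->size = hc->args[1] * PAGE_SIZE_4K;  /* page count to bytes */
    out->to_private = (hc->args[2] & MAP_GPA_RANGE_ENCRYPTED) != 0;
    return 0;
}
```

In the patch the decoded triple is passed to kvm_convert_memory(); the sketch stops at decoding so it can be compiled and tested on its own.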

@ -0,0 +1,536 @@
From 5ead79f45e8e90b7a04586c89e70cb9d0b66b730 Mon Sep 17 00:00:00 2001
From: Sean Christopherson <sean.j.christopherson@intel.com>
Date: Thu, 29 Feb 2024 01:36:43 -0500
Subject: [PATCH 004/100] i386/kvm: Move architectural CPUID leaf generation to
separate helper
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [4/91] 06ecdbcf05ad3d658273980b114f02477d0b0475 (bonzini/rhel-qemu-kvm)
Move the architectural (for lack of a better term) CPUID leaf generation
to a separate helper so that the generation code can be reused by TDX,
which needs to generate a canonical VM-scoped configuration.
For now this is just a cleanup, so keep the function static.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-ID: <20240229063726.610065-23-xiaoyao.li@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit a5acf4f26c208a05d05ef1bde65553ce2ab5e5d0)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
target/i386/kvm/kvm.c | 417 +++++++++++++++++++++---------------------
1 file changed, 211 insertions(+), 206 deletions(-)
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 739f33db47..5f30b649a0 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -1706,195 +1706,22 @@ static void kvm_init_nested_state(CPUX86State *env)
}
}
-int kvm_arch_init_vcpu(CPUState *cs)
+static uint32_t kvm_x86_build_cpuid(CPUX86State *env,
+ struct kvm_cpuid_entry2 *entries,
+ uint32_t cpuid_i)
{
- struct {
- struct kvm_cpuid2 cpuid;
- struct kvm_cpuid_entry2 entries[KVM_MAX_CPUID_ENTRIES];
- } cpuid_data;
- /*
- * The kernel defines these structs with padding fields so there
- * should be no extra padding in our cpuid_data struct.
- */
- QEMU_BUILD_BUG_ON(sizeof(cpuid_data) !=
- sizeof(struct kvm_cpuid2) +
- sizeof(struct kvm_cpuid_entry2) * KVM_MAX_CPUID_ENTRIES);
-
- X86CPU *cpu = X86_CPU(cs);
- CPUX86State *env = &cpu->env;
- uint32_t limit, i, j, cpuid_i;
+ uint32_t limit, i, j;
uint32_t unused;
struct kvm_cpuid_entry2 *c;
- uint32_t signature[3];
- int kvm_base = KVM_CPUID_SIGNATURE;
- int max_nested_state_len;
- int r;
- Error *local_err = NULL;
-
- memset(&cpuid_data, 0, sizeof(cpuid_data));
-
- cpuid_i = 0;
-
- has_xsave2 = kvm_check_extension(cs->kvm_state, KVM_CAP_XSAVE2);
-
- r = kvm_arch_set_tsc_khz(cs);
- if (r < 0) {
- return r;
- }
-
- /* vcpu's TSC frequency is either specified by user, or following
- * the value used by KVM if the former is not present. In the
- * latter case, we query it from KVM and record in env->tsc_khz,
- * so that vcpu's TSC frequency can be migrated later via this field.
- */
- if (!env->tsc_khz) {
- r = kvm_check_extension(cs->kvm_state, KVM_CAP_GET_TSC_KHZ) ?
- kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ) :
- -ENOTSUP;
- if (r > 0) {
- env->tsc_khz = r;
- }
- }
-
- env->apic_bus_freq = KVM_APIC_BUS_FREQUENCY;
-
- /*
- * kvm_hyperv_expand_features() is called here for the second time in case
- * KVM_CAP_SYS_HYPERV_CPUID is not supported. While we can't possibly handle
- * 'query-cpu-model-expansion' in this case as we don't have a KVM vCPU to
- * check which Hyper-V enlightenments are supported and which are not, we
- * can still proceed and check/expand Hyper-V enlightenments here so legacy
- * behavior is preserved.
- */
- if (!kvm_hyperv_expand_features(cpu, &local_err)) {
- error_report_err(local_err);
- return -ENOSYS;
- }
-
- if (hyperv_enabled(cpu)) {
- r = hyperv_init_vcpu(cpu);
- if (r) {
- return r;
- }
-
- cpuid_i = hyperv_fill_cpuids(cs, cpuid_data.entries);
- kvm_base = KVM_CPUID_SIGNATURE_NEXT;
- has_msr_hv_hypercall = true;
- }
-
- if (cs->kvm_state->xen_version) {
-#ifdef CONFIG_XEN_EMU
- struct kvm_cpuid_entry2 *xen_max_leaf;
-
- memcpy(signature, "XenVMMXenVMM", 12);
-
- xen_max_leaf = c = &cpuid_data.entries[cpuid_i++];
- c->function = kvm_base + XEN_CPUID_SIGNATURE;
- c->eax = kvm_base + XEN_CPUID_TIME;
- c->ebx = signature[0];
- c->ecx = signature[1];
- c->edx = signature[2];
-
- c = &cpuid_data.entries[cpuid_i++];
- c->function = kvm_base + XEN_CPUID_VENDOR;
- c->eax = cs->kvm_state->xen_version;
- c->ebx = 0;
- c->ecx = 0;
- c->edx = 0;
-
- c = &cpuid_data.entries[cpuid_i++];
- c->function = kvm_base + XEN_CPUID_HVM_MSR;
- /* Number of hypercall-transfer pages */
- c->eax = 1;
- /* Hypercall MSR base address */
- if (hyperv_enabled(cpu)) {
- c->ebx = XEN_HYPERCALL_MSR_HYPERV;
- kvm_xen_init(cs->kvm_state, c->ebx);
- } else {
- c->ebx = XEN_HYPERCALL_MSR;
- }
- c->ecx = 0;
- c->edx = 0;
-
- c = &cpuid_data.entries[cpuid_i++];
- c->function = kvm_base + XEN_CPUID_TIME;
- c->eax = ((!!tsc_is_stable_and_known(env) << 1) |
- (!!(env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_RDTSCP) << 2));
- /* default=0 (emulate if necessary) */
- c->ebx = 0;
- /* guest tsc frequency */
- c->ecx = env->user_tsc_khz;
- /* guest tsc incarnation (migration count) */
- c->edx = 0;
-
- c = &cpuid_data.entries[cpuid_i++];
- c->function = kvm_base + XEN_CPUID_HVM;
- xen_max_leaf->eax = kvm_base + XEN_CPUID_HVM;
- if (cs->kvm_state->xen_version >= XEN_VERSION(4, 5)) {
- c->function = kvm_base + XEN_CPUID_HVM;
-
- if (cpu->xen_vapic) {
- c->eax |= XEN_HVM_CPUID_APIC_ACCESS_VIRT;
- c->eax |= XEN_HVM_CPUID_X2APIC_VIRT;
- }
-
- c->eax |= XEN_HVM_CPUID_IOMMU_MAPPINGS;
-
- if (cs->kvm_state->xen_version >= XEN_VERSION(4, 6)) {
- c->eax |= XEN_HVM_CPUID_VCPU_ID_PRESENT;
- c->ebx = cs->cpu_index;
- }
-
- if (cs->kvm_state->xen_version >= XEN_VERSION(4, 17)) {
- c->eax |= XEN_HVM_CPUID_UPCALL_VECTOR;
- }
- }
-
- r = kvm_xen_init_vcpu(cs);
- if (r) {
- return r;
- }
-
- kvm_base += 0x100;
-#else /* CONFIG_XEN_EMU */
- /* This should never happen as kvm_arch_init() would have died first. */
- fprintf(stderr, "Cannot enable Xen CPUID without Xen support\n");
- abort();
-#endif
- } else if (cpu->expose_kvm) {
- memcpy(signature, "KVMKVMKVM\0\0\0", 12);
- c = &cpuid_data.entries[cpuid_i++];
- c->function = KVM_CPUID_SIGNATURE | kvm_base;
- c->eax = KVM_CPUID_FEATURES | kvm_base;
- c->ebx = signature[0];
- c->ecx = signature[1];
- c->edx = signature[2];
-
- c = &cpuid_data.entries[cpuid_i++];
- c->function = KVM_CPUID_FEATURES | kvm_base;
- c->eax = env->features[FEAT_KVM];
- c->edx = env->features[FEAT_KVM_HINTS];
- }
cpu_x86_cpuid(env, 0, 0, &limit, &unused, &unused, &unused);
- if (cpu->kvm_pv_enforce_cpuid) {
- r = kvm_vcpu_enable_cap(cs, KVM_CAP_ENFORCE_PV_FEATURE_CPUID, 0, 1);
- if (r < 0) {
- fprintf(stderr,
- "failed to enable KVM_CAP_ENFORCE_PV_FEATURE_CPUID: %s",
- strerror(-r));
- abort();
- }
- }
-
for (i = 0; i <= limit; i++) {
+ j = 0;
if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
- fprintf(stderr, "unsupported level value: 0x%x\n", limit);
- abort();
+ goto full;
}
- c = &cpuid_data.entries[cpuid_i++];
-
+ c = &entries[cpuid_i++];
switch (i) {
case 2: {
/* Keep reading function 2 till all the input is received */
@@ -1908,11 +1735,9 @@ int kvm_arch_init_vcpu(CPUState *cs)
for (j = 1; j < times; ++j) {
if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
- fprintf(stderr, "cpuid_data is full, no space for "
- "cpuid(eax:2):eax & 0xf = 0x%x\n", times);
- abort();
+ goto full;
}
- c = &cpuid_data.entries[cpuid_i++];
+ c = &entries[cpuid_i++];
c->function = i;
c->flags = KVM_CPUID_FLAG_STATEFUL_FUNC;
cpu_x86_cpuid(env, i, 0, &c->eax, &c->ebx, &c->ecx, &c->edx);
@@ -1951,11 +1776,9 @@ int kvm_arch_init_vcpu(CPUState *cs)
continue;
}
if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
- fprintf(stderr, "cpuid_data is full, no space for "
- "cpuid(eax:0x%x,ecx:0x%x)\n", i, j);
- abort();
+ goto full;
}
- c = &cpuid_data.entries[cpuid_i++];
+ c = &entries[cpuid_i++];
}
break;
case 0x12:
@@ -1970,11 +1793,9 @@ int kvm_arch_init_vcpu(CPUState *cs)
}
if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
- fprintf(stderr, "cpuid_data is full, no space for "
- "cpuid(eax:0x12,ecx:0x%x)\n", j);
- abort();
+ goto full;
}
- c = &cpuid_data.entries[cpuid_i++];
+ c = &entries[cpuid_i++];
}
break;
case 0x7:
@@ -1991,11 +1812,9 @@ int kvm_arch_init_vcpu(CPUState *cs)
for (j = 1; j <= times; ++j) {
if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
- fprintf(stderr, "cpuid_data is full, no space for "
- "cpuid(eax:0x%x,ecx:0x%x)\n", i, j);
- abort();
+ goto full;
}
- c = &cpuid_data.entries[cpuid_i++];
+ c = &entries[cpuid_i++];
c->function = i;
c->index = j;
c->flags = KVM_CPUID_FLAG_SIGNIFCANT_INDEX;
@@ -2048,11 +1867,11 @@ int kvm_arch_init_vcpu(CPUState *cs)
cpu_x86_cpuid(env, 0x80000000, 0, &limit, &unused, &unused, &unused);
for (i = 0x80000000; i <= limit; i++) {
+ j = 0;
if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
- fprintf(stderr, "unsupported xlevel value: 0x%x\n", limit);
- abort();
+ goto full;
}
- c = &cpuid_data.entries[cpuid_i++];
+ c = &entries[cpuid_i++];
switch (i) {
case 0x8000001d:
@@ -2067,11 +1886,9 @@ int kvm_arch_init_vcpu(CPUState *cs)
break;
}
if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
- fprintf(stderr, "cpuid_data is full, no space for "
- "cpuid(eax:0x%x,ecx:0x%x)\n", i, j);
- abort();
+ goto full;
}
- c = &cpuid_data.entries[cpuid_i++];
+ c = &entries[cpuid_i++];
}
break;
default:
@@ -2094,11 +1911,11 @@ int kvm_arch_init_vcpu(CPUState *cs)
cpu_x86_cpuid(env, 0xC0000000, 0, &limit, &unused, &unused, &unused);
for (i = 0xC0000000; i <= limit; i++) {
+ j = 0;
if (cpuid_i == KVM_MAX_CPUID_ENTRIES) {
- fprintf(stderr, "unsupported xlevel2 value: 0x%x\n", limit);
- abort();
+ goto full;
}
- c = &cpuid_data.entries[cpuid_i++];
+ c = &entries[cpuid_i++];
c->function = i;
c->flags = 0;
@@ -2106,6 +1923,194 @@ int kvm_arch_init_vcpu(CPUState *cs)
}
}
+ return cpuid_i;
+
+full:
+ fprintf(stderr, "cpuid_data is full, no space for "
+ "cpuid(eax:0x%x,ecx:0x%x)\n", i, j);
+ abort();
+}
+
+int kvm_arch_init_vcpu(CPUState *cs)
+{
+ struct {
+ struct kvm_cpuid2 cpuid;
+ struct kvm_cpuid_entry2 entries[KVM_MAX_CPUID_ENTRIES];
+ } cpuid_data;
+ /*
+ * The kernel defines these structs with padding fields so there
+ * should be no extra padding in our cpuid_data struct.
+ */
+ QEMU_BUILD_BUG_ON(sizeof(cpuid_data) !=
+ sizeof(struct kvm_cpuid2) +
+ sizeof(struct kvm_cpuid_entry2) * KVM_MAX_CPUID_ENTRIES);
+
+ X86CPU *cpu = X86_CPU(cs);
+ CPUX86State *env = &cpu->env;
+ uint32_t cpuid_i;
+ struct kvm_cpuid_entry2 *c;
+ uint32_t signature[3];
+ int kvm_base = KVM_CPUID_SIGNATURE;
+ int max_nested_state_len;
+ int r;
+ Error *local_err = NULL;
+
+ memset(&cpuid_data, 0, sizeof(cpuid_data));
+
+ cpuid_i = 0;
+
+ has_xsave2 = kvm_check_extension(cs->kvm_state, KVM_CAP_XSAVE2);
+
+ r = kvm_arch_set_tsc_khz(cs);
+ if (r < 0) {
+ return r;
+ }
+
+ /* vcpu's TSC frequency is either specified by user, or following
+ * the value used by KVM if the former is not present. In the
+ * latter case, we query it from KVM and record in env->tsc_khz,
+ * so that vcpu's TSC frequency can be migrated later via this field.
+ */
+ if (!env->tsc_khz) {
+ r = kvm_check_extension(cs->kvm_state, KVM_CAP_GET_TSC_KHZ) ?
+ kvm_vcpu_ioctl(cs, KVM_GET_TSC_KHZ) :
+ -ENOTSUP;
+ if (r > 0) {
+ env->tsc_khz = r;
+ }
+ }
+
+ env->apic_bus_freq = KVM_APIC_BUS_FREQUENCY;
+
+ /*
+ * kvm_hyperv_expand_features() is called here for the second time in case
+ * KVM_CAP_SYS_HYPERV_CPUID is not supported. While we can't possibly handle
+ * 'query-cpu-model-expansion' in this case as we don't have a KVM vCPU to
+ * check which Hyper-V enlightenments are supported and which are not, we
+ * can still proceed and check/expand Hyper-V enlightenments here so legacy
+ * behavior is preserved.
+ */
+ if (!kvm_hyperv_expand_features(cpu, &local_err)) {
+ error_report_err(local_err);
+ return -ENOSYS;
+ }
+
+ if (hyperv_enabled(cpu)) {
+ r = hyperv_init_vcpu(cpu);
+ if (r) {
+ return r;
+ }
+
+ cpuid_i = hyperv_fill_cpuids(cs, cpuid_data.entries);
+ kvm_base = KVM_CPUID_SIGNATURE_NEXT;
+ has_msr_hv_hypercall = true;
+ }
+
+ if (cs->kvm_state->xen_version) {
+#ifdef CONFIG_XEN_EMU
+ struct kvm_cpuid_entry2 *xen_max_leaf;
+
+ memcpy(signature, "XenVMMXenVMM", 12);
+
+ xen_max_leaf = c = &cpuid_data.entries[cpuid_i++];
+ c->function = kvm_base + XEN_CPUID_SIGNATURE;
+ c->eax = kvm_base + XEN_CPUID_TIME;
+ c->ebx = signature[0];
+ c->ecx = signature[1];
+ c->edx = signature[2];
+
+ c = &cpuid_data.entries[cpuid_i++];
+ c->function = kvm_base + XEN_CPUID_VENDOR;
+ c->eax = cs->kvm_state->xen_version;
+ c->ebx = 0;
+ c->ecx = 0;
+ c->edx = 0;
+
+ c = &cpuid_data.entries[cpuid_i++];
+ c->function = kvm_base + XEN_CPUID_HVM_MSR;
+ /* Number of hypercall-transfer pages */
+ c->eax = 1;
+ /* Hypercall MSR base address */
+ if (hyperv_enabled(cpu)) {
+ c->ebx = XEN_HYPERCALL_MSR_HYPERV;
+ kvm_xen_init(cs->kvm_state, c->ebx);
+ } else {
+ c->ebx = XEN_HYPERCALL_MSR;
+ }
+ c->ecx = 0;
+ c->edx = 0;
+
+ c = &cpuid_data.entries[cpuid_i++];
+ c->function = kvm_base + XEN_CPUID_TIME;
+ c->eax = ((!!tsc_is_stable_and_known(env) << 1) |
+ (!!(env->features[FEAT_8000_0001_EDX] & CPUID_EXT2_RDTSCP) << 2));
+ /* default=0 (emulate if necessary) */
+ c->ebx = 0;
+ /* guest tsc frequency */
+ c->ecx = env->user_tsc_khz;
+ /* guest tsc incarnation (migration count) */
+ c->edx = 0;
+
+ c = &cpuid_data.entries[cpuid_i++];
+ c->function = kvm_base + XEN_CPUID_HVM;
+ xen_max_leaf->eax = kvm_base + XEN_CPUID_HVM;
+ if (cs->kvm_state->xen_version >= XEN_VERSION(4, 5)) {
+ c->function = kvm_base + XEN_CPUID_HVM;
+
+ if (cpu->xen_vapic) {
+ c->eax |= XEN_HVM_CPUID_APIC_ACCESS_VIRT;
+ c->eax |= XEN_HVM_CPUID_X2APIC_VIRT;
+ }
+
+ c->eax |= XEN_HVM_CPUID_IOMMU_MAPPINGS;
+
+ if (cs->kvm_state->xen_version >= XEN_VERSION(4, 6)) {
+ c->eax |= XEN_HVM_CPUID_VCPU_ID_PRESENT;
+ c->ebx = cs->cpu_index;
+ }
+
+ if (cs->kvm_state->xen_version >= XEN_VERSION(4, 17)) {
+ c->eax |= XEN_HVM_CPUID_UPCALL_VECTOR;
+ }
+ }
+
+ r = kvm_xen_init_vcpu(cs);
+ if (r) {
+ return r;
+ }
+
+ kvm_base += 0x100;
+#else /* CONFIG_XEN_EMU */
+ /* This should never happen as kvm_arch_init() would have died first. */
+ fprintf(stderr, "Cannot enable Xen CPUID without Xen support\n");
+ abort();
+#endif
+ } else if (cpu->expose_kvm) {
+ memcpy(signature, "KVMKVMKVM\0\0\0", 12);
+ c = &cpuid_data.entries[cpuid_i++];
+ c->function = KVM_CPUID_SIGNATURE | kvm_base;
+ c->eax = KVM_CPUID_FEATURES | kvm_base;
+ c->ebx = signature[0];
+ c->ecx = signature[1];
+ c->edx = signature[2];
+
+ c = &cpuid_data.entries[cpuid_i++];
+ c->function = KVM_CPUID_FEATURES | kvm_base;
+ c->eax = env->features[FEAT_KVM];
+ c->edx = env->features[FEAT_KVM_HINTS];
+ }
+
+ if (cpu->kvm_pv_enforce_cpuid) {
+ r = kvm_vcpu_enable_cap(cs, KVM_CAP_ENFORCE_PV_FEATURE_CPUID, 0, 1);
+ if (r < 0) {
+ fprintf(stderr,
+ "failed to enable KVM_CAP_ENFORCE_PV_FEATURE_CPUID: %s",
+ strerror(-r));
+ abort();
+ }
+ }
+
+ cpuid_i = kvm_x86_build_cpuid(env, cpuid_data.entries, cpuid_i);
cpuid_data.cpuid.nent = cpuid_i;
if (((env->cpuid_version >> 8)&0xF) >= 6
--
2.39.3
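
Note on the refactor above: the patch moves the CPUID leaf loops into a helper that takes the entries array and the next free slot, and funnels every "table full" case into one `full:` label. The following is a minimal sketch of that shape only. `MAX_ENTRIES`, `struct entry` and `build_entries` are hypothetical stand-ins, not QEMU names, and the sketch returns -1 on overflow where the patch calls abort(), so that it can be exercised in isolation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ENTRIES 8  /* hypothetical stand-in for KVM_MAX_CPUID_ENTRIES */

/* Hypothetical, simplified stand-in for struct kvm_cpuid_entry2. */
struct entry {
    uint32_t function;
    uint32_t index;
};

/*
 * Append one entry per leaf in [first, last], starting at slot `next`.
 * Returns the new entry count, or -1 if the table is full.
 * Assumption: the real helper aborts instead of returning an error.
 */
static int build_entries(struct entry *entries, uint32_t next,
                         uint32_t first, uint32_t last)
{
    uint32_t i;

    assert(entries != NULL);

    for (i = first; i <= last; i++) {
        if (next >= MAX_ENTRIES) {
            goto full;  /* single exit for every overflow case */
        }
        entries[next].function = i;
        entries[next].index = 0;
        next++;
    }
    return (int)next;

full:
    return -1;
}
```

A caller that has already filled some slots (as kvm_arch_init_vcpu() does for the Hyper-V, Xen and KVM signature leaves) would pass its current count as `next` and store the return value as the final entry count.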

@ -0,0 +1,91 @@
From 03e275023b482ac79b4f92ca4ceef6de3caa634f Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini@redhat.com>
Date: Thu, 9 May 2024 19:00:40 +0200
Subject: [PATCH 045/100] i386: pc: remove unnecessary MachineClass overrides
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [45/91] c03d5b57014d0d02f6ce0cdfb19a34996d100dea (bonzini/rhel-qemu-kvm)
There is no need to override these fields of MachineClass because they are
already set to the right value in the superclass.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Message-ID: <20240509170044.190795-10-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit b348fdcdac9f9fc70be9ae56c54e41765e9aae24)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
hw/i386/pc.c | 3 ---
hw/i386/x86.c | 6 +++---
include/hw/i386/x86.h | 4 ----
3 files changed, 3 insertions(+), 10 deletions(-)
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 660a59c63b..0aca0cc79e 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1979,9 +1979,6 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
mc->async_pf_vmexit_disable = false;
mc->get_hotplug_handler = pc_get_hotplug_handler;
mc->hotplug_allowed = pc_hotplug_allowed;
- mc->cpu_index_to_instance_props = x86_cpu_index_to_props;
- mc->get_default_cpu_node_id = x86_get_default_cpu_node_id;
- mc->possible_cpu_arch_ids = x86_possible_cpu_arch_ids;
mc->auto_enable_numa_with_memhp = true;
mc->auto_enable_numa_with_memdev = true;
mc->has_hotpluggable_cpus = true;
diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index c61f4ebfa6..fcef652c1e 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -443,7 +443,7 @@ void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
numa_cpu_pre_plug(cpu_slot, dev, errp);
}
-CpuInstanceProperties
+static CpuInstanceProperties
x86_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
{
MachineClass *mc = MACHINE_GET_CLASS(ms);
@@ -453,7 +453,7 @@ x86_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
return possible_cpus->cpus[cpu_index].props;
}
-int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx)
+static int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx)
{
X86CPUTopoIDs topo_ids;
X86MachineState *x86ms = X86_MACHINE(ms);
@@ -467,7 +467,7 @@ int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx)
return topo_ids.pkg_id % ms->numa_state->num_nodes;
}
-const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
+static const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
{
X86MachineState *x86ms = X86_MACHINE(ms);
unsigned int max_cpus = ms->smp.max_cpus;
diff --git a/include/hw/i386/x86.h b/include/hw/i386/x86.h
index d7b7d3f3ce..c2062db13f 100644
--- a/include/hw/i386/x86.h
+++ b/include/hw/i386/x86.h
@@ -114,10 +114,6 @@ uint32_t x86_cpu_apic_id_from_index(X86MachineState *pcms,
void x86_cpu_new(X86MachineState *pcms, int64_t apic_id, Error **errp);
void x86_cpus_init(X86MachineState *pcms, int default_cpu_version);
-CpuInstanceProperties x86_cpu_index_to_props(MachineState *ms,
- unsigned cpu_index);
-int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx);
-const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms);
CPUArchId *x86_find_cpu_slot(MachineState *ms, uint32_t id, int *idx);
void x86_rtc_set_cpus_count(ISADevice *rtc, uint16_t cpus_count);
void x86_cpu_pre_plug(HotplugHandler *hotplug_dev,
--
2.39.3

@ -0,0 +1,116 @@
From 652793962000d6906e219ceae36348a476b78c28 Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini@redhat.com>
Date: Fri, 31 May 2024 12:44:44 +0200
Subject: [PATCH 065/100] i386/sev: Add a class method to determine KVM VM type
for SNP guests
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [65/91] c6cbeac0a6f691138df212b80efaa9b1143fdaa8 (bonzini/rhel-qemu-kvm)
SEV guests can use either KVM_X86_DEFAULT_VM, KVM_X86_SEV_VM,
or KVM_X86_SEV_ES_VM depending on the configuration and what
the host kernel supports. SNP guests on the other hand can only
ever use KVM_X86_SNP_VM, so split determination of VM type out
into a separate class method that can be set accordingly for
sev-guest vs. sev-snp-guest objects and add handling for SNP.
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-14-pankaj.gupta@amd.com>
[Remove unnecessary function pointer declaration. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit a808132f6d8e855bd83a400570ec91d2e00bebe3)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
target/i386/kvm/kvm.c | 1 +
target/i386/sev.c | 15 ++++++++++++---
2 files changed, 13 insertions(+), 3 deletions(-)
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 408568d053..75e75d9772 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -166,6 +166,7 @@ static const char *vm_type_name[] = {
[KVM_X86_DEFAULT_VM] = "default",
[KVM_X86_SEV_VM] = "SEV",
[KVM_X86_SEV_ES_VM] = "SEV-ES",
+ [KVM_X86_SNP_VM] = "SEV-SNP",
};
bool kvm_is_vm_type_supported(int type)
diff --git a/target/i386/sev.c b/target/i386/sev.c
index c3daaf1ad5..072cc4f853 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -885,6 +885,11 @@ out:
return sev_common->kvm_type;
}
+static int sev_snp_kvm_type(X86ConfidentialGuest *cg)
+{
+ return KVM_X86_SNP_VM;
+}
+
static int sev_common_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
{
char *devname;
@@ -894,6 +899,8 @@ static int sev_common_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
struct sev_user_data_status status = {};
SevCommonState *sev_common = SEV_COMMON(cgs);
SevCommonStateClass *klass = SEV_COMMON_GET_CLASS(cgs);
+ X86ConfidentialGuestClass *x86_klass =
+ X86_CONFIDENTIAL_GUEST_GET_CLASS(cgs);
sev_common->state = SEV_STATE_UNINIT;
@@ -964,7 +971,7 @@ static int sev_common_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
}
trace_kvm_sev_init();
- if (sev_kvm_type(X86_CONFIDENTIAL_GUEST(sev_common)) == KVM_X86_DEFAULT_VM) {
+ if (x86_klass->kvm_type(X86_CONFIDENTIAL_GUEST(sev_common)) == KVM_X86_DEFAULT_VM) {
cmd = sev_es_enabled() ? KVM_SEV_ES_INIT : KVM_SEV_INIT;
ret = sev_ioctl(sev_common->sev_fd, cmd, NULL, &fw_error);
@@ -1441,10 +1448,8 @@ static void
sev_common_class_init(ObjectClass *oc, void *data)
{
ConfidentialGuestSupportClass *klass = CONFIDENTIAL_GUEST_SUPPORT_CLASS(oc);
- X86ConfidentialGuestClass *x86_klass = X86_CONFIDENTIAL_GUEST_CLASS(oc);
klass->kvm_init = sev_common_kvm_init;
- x86_klass->kvm_type = sev_kvm_type;
object_class_property_add_str(oc, "sev-device",
sev_common_get_sev_device,
@@ -1529,10 +1534,12 @@ static void
sev_guest_class_init(ObjectClass *oc, void *data)
{
SevCommonStateClass *klass = SEV_COMMON_CLASS(oc);
+ X86ConfidentialGuestClass *x86_klass = X86_CONFIDENTIAL_GUEST_CLASS(oc);
klass->launch_start = sev_launch_start;
klass->launch_finish = sev_launch_finish;
klass->kvm_init = sev_kvm_init;
+ x86_klass->kvm_type = sev_kvm_type;
object_class_property_add_str(oc, "dh-cert-file",
sev_guest_get_dh_cert_file,
@@ -1770,8 +1777,10 @@ static void
sev_snp_guest_class_init(ObjectClass *oc, void *data)
{
SevCommonStateClass *klass = SEV_COMMON_CLASS(oc);
+ X86ConfidentialGuestClass *x86_klass = X86_CONFIDENTIAL_GUEST_CLASS(oc);
klass->kvm_init = sev_snp_kvm_init;
+ x86_klass->kvm_type = sev_snp_kvm_type;
object_class_property_add(oc, "policy", "uint64",
sev_snp_guest_get_policy,
--
2.39.3
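
Note on the class method above: the patch stops assigning kvm_type in the common class and lets sev-guest and sev-snp-guest each install their own implementation. A minimal sketch of that dispatch in plain C follows. All type and function names here are hypothetical and do not use QOM; the enum values are placeholders for the KVM_X86_*_VM constants.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical placeholders for the KVM_X86_*_VM constants. */
enum vm_type { VM_DEFAULT, VM_SEV, VM_SEV_ES, VM_SNP };

struct guest;

/* Hypothetical class struct holding the per-type method. */
struct guest_class {
    enum vm_type (*kvm_type)(const struct guest *g);
};

struct guest {
    const struct guest_class *klass;
    int es_policy;       /* nonzero if the policy requests SEV-ES */
    int host_has_types;  /* nonzero if the host supports dedicated VM types */
};

/* SEV guests: the result depends on policy and host support. */
static enum vm_type sev_type(const struct guest *g)
{
    if (!g->host_has_types) {
        return VM_DEFAULT;
    }
    return g->es_policy ? VM_SEV_ES : VM_SEV;
}

/* SNP guests: only one VM type is ever valid. */
static enum vm_type snp_type(const struct guest *g)
{
    (void)g;
    return VM_SNP;
}

static const struct guest_class sev_class = { .kvm_type = sev_type };
static const struct guest_class snp_class = { .kvm_type = snp_type };

/* Common code calls through the class instead of naming sev_type(). */
static enum vm_type guest_vm_type(const struct guest *g)
{
    assert(g != NULL && g->klass != NULL && g->klass->kvm_type != NULL);
    return g->klass->kvm_type(g);
}
```

The point of the indirection is that common init code no longer needs to know which guest flavour it is handling.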

@ -0,0 +1,84 @@
From 82a714b79851b5c2d1389d2fa7a01548c486a854 Mon Sep 17 00:00:00 2001
From: Michael Roth <michael.roth@amd.com>
Date: Thu, 30 May 2024 06:16:20 -0500
Subject: [PATCH 060/100] i386/sev: Add a sev_snp_enabled() helper
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [60/91] c35ead095028ccfb1e1be0fe010ca4f7688530a0 (bonzini/rhel-qemu-kvm)
Add a simple helper to check if the current guest type is SNP. Also have
SNP-enabled imply that SEV-ES is enabled as well, and fix up any places
where the sev_es_enabled() check is expecting a pure/non-SNP guest.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-9-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 99190f805dca9475fe244fbd8041961842657dc2)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
target/i386/sev.c | 13 ++++++++++++-
target/i386/sev.h | 2 ++
2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/target/i386/sev.c b/target/i386/sev.c
index a81b3228d4..4edfedc139 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -325,12 +325,21 @@ sev_enabled(void)
return !!object_dynamic_cast(OBJECT(cgs), TYPE_SEV_COMMON);
}
+bool
+sev_snp_enabled(void)
+{
+ ConfidentialGuestSupport *cgs = MACHINE(qdev_get_machine())->cgs;
+
+ return !!object_dynamic_cast(OBJECT(cgs), TYPE_SEV_SNP_GUEST);
+}
+
bool
sev_es_enabled(void)
{
ConfidentialGuestSupport *cgs = MACHINE(qdev_get_machine())->cgs;
- return sev_enabled() && (SEV_GUEST(cgs)->policy & SEV_POLICY_ES);
+ return sev_snp_enabled() ||
+ (sev_enabled() && SEV_GUEST(cgs)->policy & SEV_POLICY_ES);
}
uint32_t
@@ -946,7 +955,9 @@ static int sev_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
"support", __func__);
goto err;
}
+ }
+ if (sev_es_enabled() && !sev_snp_enabled()) {
if (!(status.flags & SEV_STATUS_FLAGS_CONFIG_ES)) {
error_setg(errp, "%s: guest policy requires SEV-ES, but "
"host SEV-ES support unavailable",
diff --git a/target/i386/sev.h b/target/i386/sev.h
index bedc667eeb..94295ee74f 100644
--- a/target/i386/sev.h
+++ b/target/i386/sev.h
@@ -45,9 +45,11 @@ typedef struct SevKernelLoaderContext {
#ifdef CONFIG_SEV
bool sev_enabled(void);
bool sev_es_enabled(void);
+bool sev_snp_enabled(void);
#else
#define sev_enabled() 0
#define sev_es_enabled() 0
+#define sev_snp_enabled() 0
#endif
uint32_t sev_get_cbit_position(void);
--
2.39.3
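
Note on the helper above: after this patch an SNP guest counts as SEV-ES enabled, and the legacy SEV-ES host check only runs for non-SNP guests. The sketch below restates that boolean relation without QOM. `struct guest_cfg`, `enum guest_kind` and the `POLICY_ES` value are assumptions made for illustration; the real code derives these facts from the object type and from SEV_GUEST(cgs)->policy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define POLICY_ES 0x4u  /* assumed bit value, for illustration only */

enum guest_kind { GUEST_NONE, GUEST_SEV, GUEST_SNP };

/* Hypothetical flattened view of the confidential guest object. */
struct guest_cfg {
    enum guest_kind kind;
    uint32_t policy;  /* only meaningful for GUEST_SEV */
};

static bool snp_enabled(const struct guest_cfg *g)
{
    return g->kind == GUEST_SNP;
}

/* SNP guests derive from the common SEV type, so they count here too. */
static bool sev_enabled(const struct guest_cfg *g)
{
    return g->kind != GUEST_NONE;
}

/*
 * SNP implies SEV-ES. The SNP test comes first so that the policy
 * field is never consulted for an SNP guest.
 */
static bool es_enabled(const struct guest_cfg *g)
{
    assert(g != NULL);
    return snp_enabled(g) ||
           (sev_enabled(g) && (g->policy & POLICY_ES));
}

/* Mirrors the "sev_es_enabled() && !sev_snp_enabled()" guard. */
static bool needs_legacy_es_check(const struct guest_cfg *g)
{
    return es_enabled(g) && !snp_enabled(g);
}
```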

@ -0,0 +1,187 @@
From 0e435819540b0d39da2c828aacc0f35ecaadbdf6 Mon Sep 17 00:00:00 2001
From: Brijesh Singh <brijesh.singh@amd.com>
Date: Thu, 30 May 2024 06:16:28 -0500
Subject: [PATCH 068/100] i386/sev: Add handling to encrypt/finalize guest
launch data
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [68/91] fe77931d279aa8df061823da88a320fb5f72ffea (bonzini/rhel-qemu-kvm)
Process any queued up launch data and encrypt/measure it into the SNP
guest instance prior to initial guest launch.
This also updates the KVM_SEV_SNP_LAUNCH_UPDATE call to handle partial
update responses.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Co-developed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-17-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 9f3a6999f9730a694d7db448a99f9c9cb6515992)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
target/i386/sev.c | 112 ++++++++++++++++++++++++++++++++++++++-
target/i386/trace-events | 2 +
2 files changed, 113 insertions(+), 1 deletion(-)
diff --git a/target/i386/sev.c b/target/i386/sev.c
index e89b87d2f5..ef2e592ca7 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -756,6 +756,76 @@ out:
return ret;
}
+static const char *
+snp_page_type_to_str(int type)
+{
+ switch (type) {
+ case KVM_SEV_SNP_PAGE_TYPE_NORMAL: return "Normal";
+ case KVM_SEV_SNP_PAGE_TYPE_ZERO: return "Zero";
+ case KVM_SEV_SNP_PAGE_TYPE_UNMEASURED: return "Unmeasured";
+ case KVM_SEV_SNP_PAGE_TYPE_SECRETS: return "Secrets";
+ case KVM_SEV_SNP_PAGE_TYPE_CPUID: return "Cpuid";
+ default: return "unknown";
+ }
+}
+
+static int
+sev_snp_launch_update(SevSnpGuestState *sev_snp_guest,
+ SevLaunchUpdateData *data)
+{
+ int ret, fw_error;
+ struct kvm_sev_snp_launch_update update = {0};
+
+ if (!data->hva || !data->len) {
+ error_report("SNP_LAUNCH_UPDATE called with invalid address"
+ " / length: %p / %lx",
+ data->hva, data->len);
+ return 1;
+ }
+
+ update.uaddr = (__u64)(unsigned long)data->hva;
+ update.gfn_start = data->gpa >> TARGET_PAGE_BITS;
+ update.len = data->len;
+ update.type = data->type;
+
+ /*
+ * KVM_SEV_SNP_LAUNCH_UPDATE requires that GPA ranges have the private
+ * memory attribute set in advance.
+ */
+ ret = kvm_set_memory_attributes_private(data->gpa, data->len);
+ if (ret) {
+ error_report("SEV-SNP: failed to configure initial"
+ " private guest memory");
+ goto out;
+ }
+
+ while (update.len || ret == -EAGAIN) {
+ trace_kvm_sev_snp_launch_update(update.uaddr, update.gfn_start <<
+ TARGET_PAGE_BITS, update.len,
+ snp_page_type_to_str(update.type));
+
+ ret = sev_ioctl(SEV_COMMON(sev_snp_guest)->sev_fd,
+ KVM_SEV_SNP_LAUNCH_UPDATE,
+ &update, &fw_error);
+ if (ret && ret != -EAGAIN) {
+ error_report("SNP_LAUNCH_UPDATE ret=%d fw_error=%d '%s'",
+ ret, fw_error, fw_error_to_str(fw_error));
+ break;
+ }
+ }
+
+out:
+ if (!ret && update.gfn_start << TARGET_PAGE_BITS != data->gpa + data->len) {
+ error_report("SEV-SNP: expected update of GPA range %lx-%lx,"
+ " got GPA range %lx-%llx",
+ data->gpa, data->gpa + data->len, data->gpa,
+ update.gfn_start << TARGET_PAGE_BITS);
+ ret = -EIO;
+ }
+
+ return ret;
+}
+
static int
sev_launch_update_data(SevGuestState *sev_guest, uint8_t *addr, uint64_t len)
{
@@ -901,6 +971,46 @@ sev_launch_finish(SevCommonState *sev_common)
migrate_add_blocker(&sev_mig_blocker, &error_fatal);
}
+static void
+sev_snp_launch_finish(SevCommonState *sev_common)
+{
+ int ret, error;
+ Error *local_err = NULL;
+ SevLaunchUpdateData *data;
+ SevSnpGuestState *sev_snp = SEV_SNP_GUEST(sev_common);
+ struct kvm_sev_snp_launch_finish *finish = &sev_snp->kvm_finish_conf;
+
+ QTAILQ_FOREACH(data, &launch_update, next) {
+ ret = sev_snp_launch_update(sev_snp, data);
+ if (ret) {
+ exit(1);
+ }
+ }
+
+ trace_kvm_sev_snp_launch_finish(sev_snp->id_block, sev_snp->id_auth,
+ sev_snp->host_data);
+ ret = sev_ioctl(sev_common->sev_fd, KVM_SEV_SNP_LAUNCH_FINISH,
+ finish, &error);
+ if (ret) {
+ error_report("SNP_LAUNCH_FINISH ret=%d fw_error=%d '%s'",
+ ret, error, fw_error_to_str(error));
+ exit(1);
+ }
+
+ sev_set_guest_state(sev_common, SEV_STATE_RUNNING);
+
+ /* add migration blocker */
+ error_setg(&sev_mig_blocker,
+ "SEV-SNP: Migration is not implemented");
+ ret = migrate_add_blocker(&sev_mig_blocker, &local_err);
+ if (local_err) {
+ error_report_err(local_err);
+ error_free(sev_mig_blocker);
+ exit(1);
+ }
+}
+
+
static void
sev_vm_state_change(void *opaque, bool running, RunState state)
{
@@ -1832,10 +1942,10 @@ sev_snp_guest_class_init(ObjectClass *oc, void *data)
X86ConfidentialGuestClass *x86_klass = X86_CONFIDENTIAL_GUEST_CLASS(oc);
klass->launch_start = sev_snp_launch_start;
+ klass->launch_finish = sev_snp_launch_finish;
klass->kvm_init = sev_snp_kvm_init;
x86_klass->kvm_type = sev_snp_kvm_type;
-
object_class_property_add(oc, "policy", "uint64",
sev_snp_guest_get_policy,
sev_snp_guest_set_policy, NULL, NULL);
diff --git a/target/i386/trace-events b/target/i386/trace-events
index cb26d8a925..06b44ead2e 100644
--- a/target/i386/trace-events
+++ b/target/i386/trace-events
@@ -12,3 +12,5 @@ kvm_sev_launch_finish(void) ""
kvm_sev_launch_secret(uint64_t hpa, uint64_t hva, uint64_t secret, int len) "hpa 0x%" PRIx64 " hva 0x%" PRIx64 " data 0x%" PRIx64 " len %d"
kvm_sev_attestation_report(const char *mnonce, const char *data) "mnonce %s data %s"
kvm_sev_snp_launch_start(uint64_t policy, char *gosvw) "policy 0x%" PRIx64 " gosvw %s"
+kvm_sev_snp_launch_update(uint64_t src, uint64_t gpa, uint64_t len, const char *type) "src 0x%" PRIx64 " gpa 0x%" PRIx64 " len 0x%" PRIx64 " (%s page)"
+kvm_sev_snp_launch_finish(char *id_block, char *id_auth, char *host_data) "id_block %s id_auth %s host_data %s"
--
2.39.3
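
Note on the partial-update handling above: the loop in sev_snp_launch_update() reissues the request while length remains or the call reports -EAGAIN, then checks that the whole GPA range was covered. The sketch below models that control flow against a mock. `mock_launch_update` is hypothetical and only assumes that the request structure is advanced in place after each call, which is what the loop in the patch appears to rely on; it also assumes page-aligned lengths.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BITS 12
#define PAGE_SIZE (1u << PAGE_BITS)

/* Hypothetical, reduced form of the launch-update request. */
struct update {
    uint64_t gfn_start;
    uint64_t len;
};

/*
 * Mock backend: handles at most max_pages per call and advances the
 * request in place. A limit of zero is used to simulate a hard failure.
 */
static int mock_launch_update(struct update *u, uint64_t max_pages)
{
    uint64_t chunk = max_pages << PAGE_BITS;

    if (max_pages == 0) {
        return -EIO;
    }
    if (chunk > u->len) {
        chunk = u->len;
    }
    u->gfn_start += chunk >> PAGE_BITS;
    u->len -= chunk;
    return 0;
}

/* Retry until the range is consumed, then verify full coverage. */
static int launch_update_range(uint64_t gpa, uint64_t len,
                               uint64_t max_pages, unsigned *calls)
{
    struct update u = { .gfn_start = gpa >> PAGE_BITS, .len = len };
    int ret = 0;

    assert(calls != NULL);
    *calls = 0;

    while (u.len || ret == -EAGAIN) {
        ret = mock_launch_update(&u, max_pages);
        (*calls)++;
        if (ret && ret != -EAGAIN) {
            return ret;  /* hard error: stop retrying */
        }
    }

    if ((u.gfn_start << PAGE_BITS) != gpa + len) {
        return -EIO;  /* range was not fully covered */
    }
    return 0;
}
```

With a five-page range and a backend that accepts two pages per call, the loop issues three requests before the coverage check passes.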

@ -0,0 +1,127 @@
From 2872c423fa44dcbf50b581a5c3feac064a0473a0 Mon Sep 17 00:00:00 2001
From: Michael Roth <michael.roth@amd.com>
Date: Tue, 9 Apr 2024 18:07:41 -0500
Subject: [PATCH 024/100] i386/sev: Add 'legacy-vm-type' parameter for SEV
guest objects
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [24/91] ce35d1b09fe8aa8772ff149543f7760455c1e6b5 (bonzini/rhel-qemu-kvm)
QEMU will currently automatically make use of the KVM_SEV_INIT2 API for
initializing SEV and SEV-ES guests versus the older
KVM_SEV_INIT/KVM_SEV_ES_INIT interfaces.
However, the older interfaces will silently avoid sync'ing FPU/XSAVE
state to the VMSA prior to encryption, thus relying on behavior and
measurements that assume the related fields to be all zero.
With KVM_SEV_INIT2, this state is now synced into the VMSA, resulting in
measurement changes and, theoretically, behavioral changes, though the
latter are unlikely to be seen in practice.
To allow a smooth transition to the newer interface, while still
providing a mechanism to maintain backward compatibility with VMs
created using the older interfaces, provide a new command-line
parameter:
-object sev-guest,legacy-vm-type=true,...
and have it default to false.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-ID: <20240409230743.962513-2-michael.roth@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 023267334da375226720e62963df9545aa8fc2fd)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
qapi/qom.json | 11 ++++++++++-
target/i386/sev.c | 18 +++++++++++++++++-
2 files changed, 27 insertions(+), 2 deletions(-)
diff --git a/qapi/qom.json b/qapi/qom.json
index 85e6b4f84a..38dde6d785 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -898,6 +898,14 @@
# designated guest firmware page for measured boot with -kernel
# (default: false) (since 6.2)
#
+# @legacy-vm-type: Use legacy KVM_SEV_INIT KVM interface for creating the VM.
+# The newer KVM_SEV_INIT2 interface syncs additional vCPU
+# state when initializing the VMSA structures, which will
+# result in a different guest measurement. Set this to
+# maintain compatibility with older QEMU or kernel versions
+# that rely on legacy KVM_SEV_INIT behavior.
+# (default: false) (since 9.1)
+#
# Since: 2.12
##
{ 'struct': 'SevGuestProperties',
@@ -908,7 +916,8 @@
'*handle': 'uint32',
'*cbitpos': 'uint32',
'reduced-phys-bits': 'uint32',
- '*kernel-hashes': 'bool' } }
+ '*kernel-hashes': 'bool',
+ '*legacy-vm-type': 'bool' } }
##
# @ThreadContextProperties:
diff --git a/target/i386/sev.c b/target/i386/sev.c
index 9dab4060b8..f4ee317cb0 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -67,6 +67,7 @@ struct SevGuestState {
uint32_t cbitpos;
uint32_t reduced_phys_bits;
bool kernel_hashes;
+ bool legacy_vm_type;
/* runtime state */
uint32_t handle;
@@ -356,6 +357,16 @@ static void sev_guest_set_kernel_hashes(Object *obj, bool value, Error **errp)
sev->kernel_hashes = value;
}
+static bool sev_guest_get_legacy_vm_type(Object *obj, Error **errp)
+{
+ return SEV_GUEST(obj)->legacy_vm_type;
+}
+
+static void sev_guest_set_legacy_vm_type(Object *obj, bool value, Error **errp)
+{
+ SEV_GUEST(obj)->legacy_vm_type = value;
+}
+
bool
sev_enabled(void)
{
@@ -863,7 +874,7 @@ static int sev_kvm_type(X86ConfidentialGuest *cg)
}
kvm_type = (sev->policy & SEV_POLICY_ES) ? KVM_X86_SEV_ES_VM : KVM_X86_SEV_VM;
- if (kvm_is_vm_type_supported(kvm_type)) {
+ if (kvm_is_vm_type_supported(kvm_type) && !sev->legacy_vm_type) {
sev->kvm_type = kvm_type;
} else {
sev->kvm_type = KVM_X86_DEFAULT_VM;
@@ -1381,6 +1392,11 @@ sev_guest_class_init(ObjectClass *oc, void *data)
sev_guest_set_kernel_hashes);
object_class_property_set_description(oc, "kernel-hashes",
"add kernel hashes to guest firmware for measured Linux boot");
+ object_class_property_add_bool(oc, "legacy-vm-type",
+ sev_guest_get_legacy_vm_type,
+ sev_guest_set_legacy_vm_type);
+ object_class_property_set_description(oc, "legacy-vm-type",
+ "use legacy VM type to maintain measurement compatibility with older QEMU or kernel versions.");
}
static void
--
2.39.3

@ -0,0 +1,203 @@
From a236548a903aa8350fff9601d481b2f529c8d4a7 Mon Sep 17 00:00:00 2001
From: Pankaj Gupta <pankaj.gupta@amd.com>
Date: Thu, 30 May 2024 06:16:21 -0500
Subject: [PATCH 061/100] i386/sev: Add sev_kvm_init() override for SEV class
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [61/91] b24fcbc8712e7394e029312229da023c63803969 (bonzini/rhel-qemu-kvm)
Some aspects of the SEV init routine are specific to SEV and not
applicable to SNP guests, so move the SEV-specific bits into a
separate class method and retain only the common functionality.
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-10-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 990da8d243a8c59dafcbed78b56a0e4ffb1605d9)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
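Note: the split between common init and a per-class hook can be sketched as
follows. This is a reduced model under stated assumptions: Guest, GuestClass,
sev_hook() and common_kvm_init() are hypothetical names, and the boolean
fields merely stand in for ram_block_discard_disable() and the notifier
registrations that the real sev_kvm_init() performs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for the QEMU objects. */
typedef struct Guest {
    bool discard_disabled; /* set only by the SEV/SEV-ES hook */
    bool notifiers_added;  /* set only by the SEV/SEV-ES hook */
    bool ready;
} Guest;

typedef struct GuestClass {
    int (*kvm_init)(Guest *g); /* optional per-variant hook, may be NULL */
} GuestClass;

/* SEV-only work: pinning-related setup that SNP does not need. */
static int sev_hook(Guest *g)
{
    g->discard_disabled = true;
    g->notifiers_added = true;
    return 0;
}

/* Shared path: run the common steps, then the variant hook if present. */
static int common_kvm_init(const GuestClass *klass, Guest *g)
{
    /* ... shared host checks and launch_start would run here ... */
    if (klass->kvm_init && klass->kvm_init(g)) {
        return -1;
    }
    g->ready = true;
    return 0;
}
```

A class that leaves the hook unset still completes the common path, which is
why each error exit in the common function can return directly.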
target/i386/sev.c | 72 +++++++++++++++++++++++++++++++++--------------
1 file changed, 51 insertions(+), 21 deletions(-)
diff --git a/target/i386/sev.c b/target/i386/sev.c
index 4edfedc139..5519de1c6b 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -73,6 +73,7 @@ struct SevCommonStateClass {
/* public */
int (*launch_start)(SevCommonState *sev_common);
void (*launch_finish)(SevCommonState *sev_common);
+ int (*kvm_init)(ConfidentialGuestSupport *cgs, Error **errp);
};
/**
@@ -882,7 +883,7 @@ out:
return sev_common->kvm_type;
}
-static int sev_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
+static int sev_common_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
{
SevCommonState *sev_common = SEV_COMMON(cgs);
char *devname;
@@ -892,12 +893,6 @@ static int sev_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
struct sev_user_data_status status = {};
SevCommonStateClass *klass = SEV_COMMON_GET_CLASS(cgs);
- ret = ram_block_discard_disable(true);
- if (ret) {
- error_report("%s: cannot disable RAM discard", __func__);
- return -1;
- }
-
sev_common->state = SEV_STATE_UNINIT;
host_cpuid(0x8000001F, 0, NULL, &ebx, NULL, NULL);
@@ -911,7 +906,7 @@ static int sev_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
if (host_cbitpos != sev_common->cbitpos) {
error_setg(errp, "%s: cbitpos check failed, host '%d' requested '%d'",
__func__, host_cbitpos, sev_common->cbitpos);
- goto err;
+ return -1;
}
/*
@@ -924,7 +919,7 @@ static int sev_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
error_setg(errp, "%s: reduced_phys_bits check failed,"
" it should be in the range of 1 to 63, requested '%d'",
__func__, sev_common->reduced_phys_bits);
- goto err;
+ return -1;
}
devname = object_property_get_str(OBJECT(sev_common), "sev-device", NULL);
@@ -933,7 +928,7 @@ static int sev_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
error_setg(errp, "%s: Failed to open %s '%s'", __func__,
devname, strerror(errno));
g_free(devname);
- goto err;
+ return -1;
}
g_free(devname);
@@ -943,7 +938,7 @@ static int sev_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
error_setg(errp, "%s: failed to get platform status ret=%d "
"fw_error='%d: %s'", __func__, ret, fw_error,
fw_error_to_str(fw_error));
- goto err;
+ return -1;
}
sev_common->build_id = status.build;
sev_common->api_major = status.api_major;
@@ -953,7 +948,7 @@ static int sev_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
if (!kvm_kernel_irqchip_allowed()) {
error_setg(errp, "%s: SEV-ES guests require in-kernel irqchip"
"support", __func__);
- goto err;
+ return -1;
}
}
@@ -962,7 +957,7 @@ static int sev_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
error_setg(errp, "%s: guest policy requires SEV-ES, but "
"host SEV-ES support unavailable",
__func__);
- goto err;
+ return -1;
}
}
@@ -980,25 +975,59 @@ static int sev_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
if (ret) {
error_setg(errp, "%s: failed to initialize ret=%d fw_error=%d '%s'",
__func__, ret, fw_error, fw_error_to_str(fw_error));
- goto err;
+ return -1;
}
ret = klass->launch_start(sev_common);
if (ret) {
error_setg(errp, "%s: failed to create encryption context", __func__);
- goto err;
+ return -1;
+ }
+
+ if (klass->kvm_init && klass->kvm_init(cgs, errp)) {
+ return -1;
}
- ram_block_notifier_add(&sev_ram_notifier);
- qemu_add_machine_init_done_notifier(&sev_machine_done_notify);
qemu_add_vm_change_state_handler(sev_vm_state_change, sev_common);
cgs->ready = true;
return 0;
-err:
- ram_block_discard_disable(false);
- return -1;
+}
+
+static int sev_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
+{
+ int ret;
+
+ /*
+ * SEV/SEV-ES rely on pinned memory to back guest RAM so discarding
+ * isn't actually possible. With SNP, only guest_memfd pages are used
+ * for private guest memory, so discarding of shared memory is still
+ * possible..
+ */
+ ret = ram_block_discard_disable(true);
+ if (ret) {
+ error_setg(errp, "%s: cannot disable RAM discard", __func__);
+ return -1;
+ }
+
+ /*
+ * SEV uses these notifiers to register/pin pages prior to guest use,
+ * but SNP relies on guest_memfd for private pages, which has its
+ * own internal mechanisms for registering/pinning private memory.
+ */
+ ram_block_notifier_add(&sev_ram_notifier);
+
+ /*
+ * The machine done notify event is used for SEV guests to get the
+ * measurement of the encrypted images. When SEV-SNP is enabled, the
+ * measurement is part of the guest attestation process where it can
+ * be collected without any reliance on the VMM. So skip registering
+ * the notifier for SNP in favor of using guest attestation instead.
+ */
+ qemu_add_machine_init_done_notifier(&sev_machine_done_notify);
+
+ return 0;
}
int
@@ -1397,7 +1426,7 @@ sev_common_class_init(ObjectClass *oc, void *data)
ConfidentialGuestSupportClass *klass = CONFIDENTIAL_GUEST_SUPPORT_CLASS(oc);
X86ConfidentialGuestClass *x86_klass = X86_CONFIDENTIAL_GUEST_CLASS(oc);
- klass->kvm_init = sev_kvm_init;
+ klass->kvm_init = sev_common_kvm_init;
x86_klass->kvm_type = sev_kvm_type;
object_class_property_add_str(oc, "sev-device",
@@ -1486,6 +1515,7 @@ sev_guest_class_init(ObjectClass *oc, void *data)
klass->launch_start = sev_launch_start;
klass->launch_finish = sev_launch_finish;
+ klass->kvm_init = sev_kvm_init;
object_class_property_add_str(oc, "dh-cert-file",
sev_guest_get_dh_cert_file,
--
2.39.3

@ -0,0 +1,94 @@
From 35ceebdeccbf5dceb374c6f89a12e9981def570b Mon Sep 17 00:00:00 2001
From: Pankaj Gupta <pankaj.gupta@amd.com>
Date: Thu, 30 May 2024 06:16:22 -0500
Subject: [PATCH 062/100] i386/sev: Add snp_kvm_init() override for SNP class
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [62/91] 8fa537961c9262b99a4ffb99e1c25f080d76d1de (bonzini/rhel-qemu-kvm)
SNP does not support SMM and requires guest_memfd for
private guest memory, so add SNP specific kvm_init()
functionality in snp_kvm_init() class method.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Co-developed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-11-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 125b95a6d465a03ff30816eff0b1889aec01f0c3)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
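Note: the SMM handling reduces to resolving a tri-state setting. A minimal
sketch, assuming a hypothetical SmmSetting enum and snp_resolve_smm() helper
in place of QEMU's OnOffAuto type and the X86MachineState field:

```c
/* Hypothetical tri-state, standing in for QEMU's OnOffAuto. */
typedef enum { SMM_AUTO, SMM_ON, SMM_OFF } SmmSetting;

/*
 * "auto" is resolved to "off" because SNP cannot run with SMM.
 * An explicit "on" is a configuration error; "off" is left as is.
 * Returns 0 on success, -1 if the user explicitly requested SMM.
 */
static int snp_resolve_smm(SmmSetting *smm)
{
    if (*smm == SMM_ON) {
        return -1;
    }
    *smm = SMM_OFF;
    return 0;
}
```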
target/i386/sev.c | 24 +++++++++++++++++++++++-
1 file changed, 23 insertions(+), 1 deletion(-)
diff --git a/target/i386/sev.c b/target/i386/sev.c
index 5519de1c6b..6525b3c1a0 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -885,12 +885,12 @@ out:
static int sev_common_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
{
- SevCommonState *sev_common = SEV_COMMON(cgs);
char *devname;
int ret, fw_error, cmd;
uint32_t ebx;
uint32_t host_cbitpos;
struct sev_user_data_status status = {};
+ SevCommonState *sev_common = SEV_COMMON(cgs);
SevCommonStateClass *klass = SEV_COMMON_GET_CLASS(cgs);
sev_common->state = SEV_STATE_UNINIT;
@@ -1030,6 +1030,21 @@ static int sev_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
return 0;
}
+static int sev_snp_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
+{
+ MachineState *ms = MACHINE(qdev_get_machine());
+ X86MachineState *x86ms = X86_MACHINE(ms);
+
+ if (x86ms->smm == ON_OFF_AUTO_AUTO) {
+ x86ms->smm = ON_OFF_AUTO_OFF;
+ } else if (x86ms->smm == ON_OFF_AUTO_ON) {
+ error_setg(errp, "SEV-SNP does not support SMM.");
+ return -1;
+ }
+
+ return 0;
+}
+
int
sev_encrypt_flash(uint8_t *ptr, uint64_t len, Error **errp)
{
@@ -1752,6 +1767,10 @@ sev_snp_guest_set_host_data(Object *obj, const char *value, Error **errp)
static void
sev_snp_guest_class_init(ObjectClass *oc, void *data)
{
+ SevCommonStateClass *klass = SEV_COMMON_CLASS(oc);
+
+ klass->kvm_init = sev_snp_kvm_init;
+
object_class_property_add(oc, "policy", "uint64",
sev_snp_guest_get_policy,
sev_snp_guest_set_policy, NULL, NULL);
@@ -1778,8 +1797,11 @@ sev_snp_guest_class_init(ObjectClass *oc, void *data)
static void
sev_snp_guest_instance_init(Object *obj)
{
+ ConfidentialGuestSupport *cgs = CONFIDENTIAL_GUEST_SUPPORT(obj);
SevSnpGuestState *sev_snp_guest = SEV_SNP_GUEST(obj);
+ cgs->require_guest_memfd = true;
+
/* default init/start/finish params for kvm */
sev_snp_guest->kvm_start_conf.policy = DEFAULT_SEV_SNP_POLICY;
}
--
2.39.3

@ -0,0 +1,262 @@
From 4013364679757161d6b9754bfc33ae38be0a1b7f Mon Sep 17 00:00:00 2001
From: Michael Roth <michael.roth@amd.com>
Date: Thu, 30 May 2024 06:16:32 -0500
Subject: [PATCH 072/100] i386/sev: Add support for SNP CPUID validation
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [72/91] 080e2942552dc7de8966e69d0d0d3b8951392030 (bonzini/rhel-qemu-kvm)
SEV-SNP firmware allows a special guest page to be populated with a
table of guest CPUID values so that they can be validated through
firmware before being loaded into encrypted guest memory where they can
be used in place of hypervisor-provided values[1].
As part of SEV-SNP guest initialization, use this interface to validate
the CPUID entries reported by KVM_GET_CPUID2 prior to initial guest
start and populate the CPUID page reserved by OVMF with the resulting
encrypted data.
[1] SEV SNP Firmware ABI Specification, Rev. 0.8, 8.13.2.6
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-21-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 70943ad8e4dfbe5f77006b880290219be9d03553)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
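Note: the table layout and the fill step can be checked standalone. In the
sketch below the two packed structs are copied from this patch; SrcEntry and
snp_cpuid_fill() are hypothetical simplifications of struct kvm_cpuid_entry2
and sev_snp_cpuid_info_fill(). The packed attribute assumes GCC or Clang.
Each entry is 48 bytes, so the full table is 16 + 64 * 48 = 3088 bytes and
fits in one 4 KiB page.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SNP_CPUID_FUNCTION_MAXCOUNT 64

typedef struct {
    uint32_t eax_in;
    uint32_t ecx_in;
    uint64_t xcr0_in;
    uint64_t xss_in;
    uint32_t eax, ebx, ecx, edx;
    uint64_t reserved;
} __attribute__((packed)) SnpCpuidFunc;

typedef struct {
    uint32_t count;
    uint32_t reserved1;
    uint64_t reserved2;
    SnpCpuidFunc entries[SNP_CPUID_FUNCTION_MAXCOUNT];
} __attribute__((packed)) SnpCpuidInfo;

_Static_assert(sizeof(SnpCpuidFunc) == 48, "entry is 48 bytes");
_Static_assert(sizeof(SnpCpuidInfo) == 3088, "header plus 64 entries");
_Static_assert(sizeof(SnpCpuidInfo) <= 4096, "table fits in one page");

/* Hypothetical simplified input, standing in for struct kvm_cpuid_entry2. */
typedef struct {
    uint32_t function, index, eax, ebx, ecx, edx;
    bool has_index; /* models the significant-index flag */
} SrcEntry;

static int snp_cpuid_fill(SnpCpuidInfo *out, const SrcEntry *src, size_t n)
{
    if (n > SNP_CPUID_FUNCTION_MAXCOUNT) {
        return -1;
    }
    memset(out, 0, sizeof(*out));
    for (size_t i = 0; i < n; i++) {
        SnpCpuidFunc *f = &out->entries[i];

        f->eax_in = src[i].function;
        if (src[i].has_index) {
            f->ecx_in = src[i].index;
        }
        f->eax = src[i].eax;
        f->ebx = src[i].ebx;
        f->ecx = src[i].ecx;
        f->edx = src[i].edx;

        /* Leaf 0xD, subleaves 0 and 1: encode only the base XSAVE size. */
        if (f->eax_in == 0xD && (f->ecx_in == 0x0 || f->ecx_in == 0x1)) {
            f->ebx = 0x240;
            f->xcr0_in = 1;
            f->xss_in = 0;
        }
    }
    out->count = (uint32_t)n;
    return 0;
}
```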
target/i386/sev.c | 164 +++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 162 insertions(+), 2 deletions(-)
diff --git a/target/i386/sev.c b/target/i386/sev.c
index c57534fca2..06401f0526 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -200,6 +200,36 @@ static const char *const sev_fw_errlist[] = {
#define SEV_FW_MAX_ERROR ARRAY_SIZE(sev_fw_errlist)
+/* <linux/kvm.h> doesn't expose this, so re-use the max from kvm.c */
+#define KVM_MAX_CPUID_ENTRIES 100
+
+typedef struct KvmCpuidInfo {
+ struct kvm_cpuid2 cpuid;
+ struct kvm_cpuid_entry2 entries[KVM_MAX_CPUID_ENTRIES];
+} KvmCpuidInfo;
+
+#define SNP_CPUID_FUNCTION_MAXCOUNT 64
+#define SNP_CPUID_FUNCTION_UNKNOWN 0xFFFFFFFF
+
+typedef struct {
+ uint32_t eax_in;
+ uint32_t ecx_in;
+ uint64_t xcr0_in;
+ uint64_t xss_in;
+ uint32_t eax;
+ uint32_t ebx;
+ uint32_t ecx;
+ uint32_t edx;
+ uint64_t reserved;
+} __attribute__((packed)) SnpCpuidFunc;
+
+typedef struct {
+ uint32_t count;
+ uint32_t reserved1;
+ uint64_t reserved2;
+ SnpCpuidFunc entries[SNP_CPUID_FUNCTION_MAXCOUNT];
+} __attribute__((packed)) SnpCpuidInfo;
+
static int
sev_ioctl(int fd, int cmd, void *data, int *error)
{
@@ -788,6 +818,35 @@ out:
return ret;
}
+static void
+sev_snp_cpuid_report_mismatches(SnpCpuidInfo *old,
+ SnpCpuidInfo *new)
+{
+ size_t i;
+
+ if (old->count != new->count) {
+ error_report("SEV-SNP: CPUID validation failed due to count mismatch,"
+ "provided: %d, expected: %d", old->count, new->count);
+ return;
+ }
+
+ for (i = 0; i < old->count; i++) {
+ SnpCpuidFunc *old_func, *new_func;
+
+ old_func = &old->entries[i];
+ new_func = &new->entries[i];
+
+ if (memcmp(old_func, new_func, sizeof(SnpCpuidFunc))) {
+ error_report("SEV-SNP: CPUID validation failed for function 0x%x, index: 0x%x"
+ "provided: eax:0x%08x, ebx: 0x%08x, ecx: 0x%08x, edx: 0x%08x"
+ "expected: eax:0x%08x, ebx: 0x%08x, ecx: 0x%08x, edx: 0x%08x",
+ old_func->eax_in, old_func->ecx_in,
+ old_func->eax, old_func->ebx, old_func->ecx, old_func->edx,
+ new_func->eax, new_func->ebx, new_func->ecx, new_func->edx);
+ }
+ }
+}
+
static const char *
snp_page_type_to_str(int type)
{
@@ -806,6 +865,7 @@ sev_snp_launch_update(SevSnpGuestState *sev_snp_guest,
SevLaunchUpdateData *data)
{
int ret, fw_error;
+ SnpCpuidInfo snp_cpuid_info;
struct kvm_sev_snp_launch_update update = {0};
if (!data->hva || !data->len) {
@@ -815,6 +875,11 @@ sev_snp_launch_update(SevSnpGuestState *sev_snp_guest,
return 1;
}
+ if (data->type == KVM_SEV_SNP_PAGE_TYPE_CPUID) {
+ /* Save a copy for comparison in case the LAUNCH_UPDATE fails */
+ memcpy(&snp_cpuid_info, data->hva, sizeof(snp_cpuid_info));
+ }
+
update.uaddr = (__u64)(unsigned long)data->hva;
update.gfn_start = data->gpa >> TARGET_PAGE_BITS;
update.len = data->len;
@@ -842,6 +907,11 @@ sev_snp_launch_update(SevSnpGuestState *sev_snp_guest,
if (ret && ret != -EAGAIN) {
error_report("SNP_LAUNCH_UPDATE ret=%d fw_error=%d '%s'",
ret, fw_error, fw_error_to_str(fw_error));
+
+ if (data->type == KVM_SEV_SNP_PAGE_TYPE_CPUID) {
+ sev_snp_cpuid_report_mismatches(&snp_cpuid_info, data->hva);
+ error_report("SEV-SNP: failed update CPUID page");
+ }
break;
}
}
@@ -1004,7 +1074,8 @@ sev_launch_finish(SevCommonState *sev_common)
}
static int
-snp_launch_update_data(uint64_t gpa, void *hva, uint32_t len, int type)
+snp_launch_update_data(uint64_t gpa, void *hva,
+ uint32_t len, int type)
{
SevLaunchUpdateData *data;
@@ -1019,6 +1090,90 @@ snp_launch_update_data(uint64_t gpa, void *hva, uint32_t len, int type)
return 0;
}
+static int
+sev_snp_cpuid_info_fill(SnpCpuidInfo *snp_cpuid_info,
+ const KvmCpuidInfo *kvm_cpuid_info)
+{
+ size_t i;
+
+ if (kvm_cpuid_info->cpuid.nent > SNP_CPUID_FUNCTION_MAXCOUNT) {
+ error_report("SEV-SNP: CPUID entry count (%d) exceeds max (%d)",
+ kvm_cpuid_info->cpuid.nent, SNP_CPUID_FUNCTION_MAXCOUNT);
+ return -1;
+ }
+
+ memset(snp_cpuid_info, 0, sizeof(*snp_cpuid_info));
+
+ for (i = 0; i < kvm_cpuid_info->cpuid.nent; i++) {
+ const struct kvm_cpuid_entry2 *kvm_cpuid_entry;
+ SnpCpuidFunc *snp_cpuid_entry;
+
+ kvm_cpuid_entry = &kvm_cpuid_info->entries[i];
+ snp_cpuid_entry = &snp_cpuid_info->entries[i];
+
+ snp_cpuid_entry->eax_in = kvm_cpuid_entry->function;
+ if (kvm_cpuid_entry->flags == KVM_CPUID_FLAG_SIGNIFCANT_INDEX) {
+ snp_cpuid_entry->ecx_in = kvm_cpuid_entry->index;
+ }
+ snp_cpuid_entry->eax = kvm_cpuid_entry->eax;
+ snp_cpuid_entry->ebx = kvm_cpuid_entry->ebx;
+ snp_cpuid_entry->ecx = kvm_cpuid_entry->ecx;
+ snp_cpuid_entry->edx = kvm_cpuid_entry->edx;
+
+ /*
+ * Guest kernels will calculate EBX themselves using the 0xD
+ * subfunctions corresponding to the individual XSAVE areas, so only
+ * encode the base XSAVE size in the initial leaves, corresponding
+ * to the initial XCR0=1 state.
+ */
+ if (snp_cpuid_entry->eax_in == 0xD &&
+ (snp_cpuid_entry->ecx_in == 0x0 || snp_cpuid_entry->ecx_in == 0x1)) {
+ snp_cpuid_entry->ebx = 0x240;
+ snp_cpuid_entry->xcr0_in = 1;
+ snp_cpuid_entry->xss_in = 0;
+ }
+ }
+
+ snp_cpuid_info->count = i;
+
+ return 0;
+}
+
+static int
+snp_launch_update_cpuid(uint32_t cpuid_addr, void *hva, uint32_t cpuid_len)
+{
+ KvmCpuidInfo kvm_cpuid_info = {0};
+ SnpCpuidInfo snp_cpuid_info;
+ CPUState *cs = first_cpu;
+ int ret;
+ uint32_t i = 0;
+
+ assert(sizeof(snp_cpuid_info) <= cpuid_len);
+
+ /* get the cpuid list from KVM */
+ do {
+ kvm_cpuid_info.cpuid.nent = ++i;
+ ret = kvm_vcpu_ioctl(cs, KVM_GET_CPUID2, &kvm_cpuid_info);
+ } while (ret == -E2BIG);
+
+ if (ret) {
+ error_report("SEV-SNP: unable to query CPUID values for CPU: '%s'",
+ strerror(-ret));
+ return 1;
+ }
+
+ ret = sev_snp_cpuid_info_fill(&snp_cpuid_info, &kvm_cpuid_info);
+ if (ret) {
+ error_report("SEV-SNP: failed to generate CPUID table information");
+ return 1;
+ }
+
+ memcpy(hva, &snp_cpuid_info, sizeof(snp_cpuid_info));
+
+ return snp_launch_update_data(cpuid_addr, hva, cpuid_len,
+ KVM_SEV_SNP_PAGE_TYPE_CPUID);
+}
+
static int
snp_metadata_desc_to_page_type(int desc_type)
{
@@ -1053,7 +1208,12 @@ snp_populate_metadata_pages(SevSnpGuestState *sev_snp,
exit(1);
}
- ret = snp_launch_update_data(desc->base, hva, desc->len, type);
+ if (type == KVM_SEV_SNP_PAGE_TYPE_CPUID) {
+ ret = snp_launch_update_cpuid(desc->base, hva, desc->len);
+ } else {
+ ret = snp_launch_update_data(desc->base, hva, desc->len, type);
+ }
+
if (ret) {
error_report("%s: Failed to add metadata page gpa 0x%x+%x type %d",
__func__, desc->base, desc->len, desc->type);
--
2.39.3

@ -0,0 +1,127 @@
From b2cfd4d89026e76ba86ea7adea323f2c3a588790 Mon Sep 17 00:00:00 2001
From: Brijesh Singh <brijesh.singh@amd.com>
Date: Thu, 30 May 2024 06:16:31 -0500
Subject: [PATCH 071/100] i386/sev: Add support for populating OVMF metadata
pages
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [71/91] b563442c0e2f6ea01937425d300b56d9e641fd57 (bonzini/rhel-qemu-kvm)
OVMF reserves various pages so they can be pre-initialized/validated
prior to launching the guest. Add support for populating these pages
with the expected content.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Co-developed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-20-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 3d8c2a7f4806ff39423312e503737fd76c34dcae)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
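Note: the descriptor-to-page-type mapping added here is a small lookup with a
zero-page fallback. A minimal sketch, assuming hypothetical enum names and
values in place of the SEV_DESC_TYPE_* and KVM_SEV_SNP_PAGE_TYPE_* constants:

```c
/* Hypothetical stand-ins for the OVMF descriptor types. */
enum desc_type { DESC_SEC_MEM = 1, DESC_SECRETS, DESC_CPUID };

/* Hypothetical stand-ins for the KVM SNP page types. */
enum page_type { PAGE_ZERO, PAGE_SECRETS, PAGE_CPUID };

/*
 * Pre-validated but unmeasured memory is added as a zero page.
 * Unknown descriptor types fall back to a zero page as well.
 */
static enum page_type desc_to_page_type(int desc)
{
    switch (desc) {
    case DESC_SEC_MEM:
        return PAGE_ZERO;
    case DESC_SECRETS:
        return PAGE_SECRETS;
    case DESC_CPUID:
        return PAGE_CPUID;
    default:
        return PAGE_ZERO;
    }
}
```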
target/i386/sev.c | 74 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 74 insertions(+)
diff --git a/target/i386/sev.c b/target/i386/sev.c
index 17281bb2c7..c57534fca2 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -1003,15 +1003,89 @@ sev_launch_finish(SevCommonState *sev_common)
migrate_add_blocker(&sev_mig_blocker, &error_fatal);
}
+static int
+snp_launch_update_data(uint64_t gpa, void *hva, uint32_t len, int type)
+{
+ SevLaunchUpdateData *data;
+
+ data = g_new0(SevLaunchUpdateData, 1);
+ data->gpa = gpa;
+ data->hva = hva;
+ data->len = len;
+ data->type = type;
+
+ QTAILQ_INSERT_TAIL(&launch_update, data, next);
+
+ return 0;
+}
+
+static int
+snp_metadata_desc_to_page_type(int desc_type)
+{
+ switch (desc_type) {
+ /* Add the umeasured prevalidated pages as a zero page */
+ case SEV_DESC_TYPE_SNP_SEC_MEM: return KVM_SEV_SNP_PAGE_TYPE_ZERO;
+ case SEV_DESC_TYPE_SNP_SECRETS: return KVM_SEV_SNP_PAGE_TYPE_SECRETS;
+ case SEV_DESC_TYPE_CPUID: return KVM_SEV_SNP_PAGE_TYPE_CPUID;
+ default:
+ return KVM_SEV_SNP_PAGE_TYPE_ZERO;
+ }
+}
+
+static void
+snp_populate_metadata_pages(SevSnpGuestState *sev_snp,
+ OvmfSevMetadata *metadata)
+{
+ OvmfSevMetadataDesc *desc;
+ int type, ret, i;
+ void *hva;
+ MemoryRegion *mr = NULL;
+
+ for (i = 0; i < metadata->num_desc; i++) {
+ desc = &metadata->descs[i];
+
+ type = snp_metadata_desc_to_page_type(desc->type);
+
+ hva = gpa2hva(&mr, desc->base, desc->len, NULL);
+ if (!hva) {
+ error_report("%s: Failed to get HVA for GPA 0x%x sz 0x%x",
+ __func__, desc->base, desc->len);
+ exit(1);
+ }
+
+ ret = snp_launch_update_data(desc->base, hva, desc->len, type);
+ if (ret) {
+ error_report("%s: Failed to add metadata page gpa 0x%x+%x type %d",
+ __func__, desc->base, desc->len, desc->type);
+ exit(1);
+ }
+ }
+}
+
static void
sev_snp_launch_finish(SevCommonState *sev_common)
{
int ret, error;
Error *local_err = NULL;
+ OvmfSevMetadata *metadata;
SevLaunchUpdateData *data;
SevSnpGuestState *sev_snp = SEV_SNP_GUEST(sev_common);
struct kvm_sev_snp_launch_finish *finish = &sev_snp->kvm_finish_conf;
+ /*
+ * To boot the SNP guest, the hypervisor is required to populate the CPUID
+ * and Secrets page before finalizing the launch flow. The location of
+ * the secrets and CPUID page is available through the OVMF metadata GUID.
+ */
+ metadata = pc_system_get_ovmf_sev_metadata_ptr();
+ if (metadata == NULL) {
+ error_report("%s: Failed to locate SEV metadata header", __func__);
+ exit(1);
+ }
+
+ /* Populate all the metadata pages */
+ snp_populate_metadata_pages(sev_snp, metadata);
+
QTAILQ_FOREACH(data, &launch_update, next) {
ret = sev_snp_launch_update(sev_snp, data);
if (ret) {
--
2.39.3

@ -0,0 +1,122 @@
From 0f7432f2b968298b64fd243df793b176f67a538f Mon Sep 17 00:00:00 2001
From: Brijesh Singh <brijesh.singh@amd.com>
Date: Thu, 30 May 2024 06:16:27 -0500
Subject: [PATCH 067/100] i386/sev: Add the SNP launch start context
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [67/91] 63759a25a413a7a9a7274fb4c3b8bc2528634855 (bonzini/rhel-qemu-kvm)
SNP_LAUNCH_START is called first to create a cryptographic launch
context within the firmware.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Co-developed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-16-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d3107f882ec22cfb211eab7efa0c4e95f5ce11bb)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
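Note: launch start initializes a queue of pending page updates that later
steps append to and launch finish walks. The sketch below models that
deferred queue with a hand-rolled tail queue, since QTAILQ is QEMU-internal.
Update, UpdateQueue and all function names are hypothetical. Unlike the real
code, which stops at the first failure, this drain visits every entry and
counts the successes.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct Update {
    struct Update *next;
    uint64_t gpa;
    void *hva;
    uint64_t len;
    int type;
} Update;

typedef struct {
    Update *head;
    Update **tail; /* points at the link to fill on the next append */
} UpdateQueue;

static void queue_init(UpdateQueue *q)
{
    q->head = NULL;
    q->tail = &q->head;
}

/* Append one pending update; returns -1 if allocation fails. */
static int queue_add(UpdateQueue *q, uint64_t gpa, void *hva,
                     uint64_t len, int type)
{
    Update *u = calloc(1, sizeof(*u));

    if (!u) {
        return -1;
    }
    u->gpa = gpa;
    u->hva = hva;
    u->len = len;
    u->type = type;
    *q->tail = u;
    q->tail = &u->next;
    return 0;
}

/* Sample callback: reject entries with no host address or zero length. */
static int apply_nonempty(const Update *u)
{
    return (u->hva && u->len) ? 0 : -1;
}

/* Apply every entry in insertion order, free it, and count successes. */
static size_t queue_drain(UpdateQueue *q, int (*apply)(const Update *))
{
    size_t done = 0;
    Update *u = q->head;

    while (u) {
        Update *next = u->next;

        if (apply(u) == 0) {
            done++;
        }
        free(u);
        u = next;
    }
    queue_init(q);
    return done;
}
```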
target/i386/sev.c | 39 +++++++++++++++++++++++++++++++++++++++
target/i386/trace-events | 1 +
2 files changed, 40 insertions(+)
diff --git a/target/i386/sev.c b/target/i386/sev.c
index 43d1c48bd9..e89b87d2f5 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -39,6 +39,7 @@
#include "confidential-guest.h"
#include "hw/i386/pc.h"
#include "exec/address-spaces.h"
+#include "qemu/queue.h"
OBJECT_DECLARE_TYPE(SevCommonState, SevCommonStateClass, SEV_COMMON)
OBJECT_DECLARE_TYPE(SevGuestState, SevCommonStateClass, SEV_GUEST)
@@ -115,6 +116,16 @@ struct SevSnpGuestState {
#define DEFAULT_SEV_DEVICE "/dev/sev"
#define DEFAULT_SEV_SNP_POLICY 0x30000
+typedef struct SevLaunchUpdateData {
+ QTAILQ_ENTRY(SevLaunchUpdateData) next;
+ hwaddr gpa;
+ void *hva;
+ uint64_t len;
+ int type;
+} SevLaunchUpdateData;
+
+static QTAILQ_HEAD(, SevLaunchUpdateData) launch_update;
+
#define SEV_INFO_BLOCK_GUID "00f771de-1a7e-4fcb-890e-68c77e2fb44e"
typedef struct __attribute__((__packed__)) SevInfoBlock {
/* SEV-ES Reset Vector Address */
@@ -674,6 +685,31 @@ sev_read_file_base64(const char *filename, guchar **data, gsize *len)
return 0;
}
+static int
+sev_snp_launch_start(SevCommonState *sev_common)
+{
+ int fw_error, rc;
+ SevSnpGuestState *sev_snp_guest = SEV_SNP_GUEST(sev_common);
+ struct kvm_sev_snp_launch_start *start = &sev_snp_guest->kvm_start_conf;
+
+ trace_kvm_sev_snp_launch_start(start->policy,
+ sev_snp_guest->guest_visible_workarounds);
+
+ rc = sev_ioctl(sev_common->sev_fd, KVM_SEV_SNP_LAUNCH_START,
+ start, &fw_error);
+ if (rc < 0) {
+ error_report("%s: SNP_LAUNCH_START ret=%d fw_error=%d '%s'",
+ __func__, rc, fw_error, fw_error_to_str(fw_error));
+ return 1;
+ }
+
+ QTAILQ_INIT(&launch_update);
+
+ sev_set_guest_state(sev_common, SEV_STATE_LAUNCH_UPDATE);
+
+ return 0;
+}
+
static int
sev_launch_start(SevCommonState *sev_common)
{
@@ -1003,6 +1039,7 @@ static int sev_common_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
}
ret = klass->launch_start(sev_common);
+
if (ret) {
error_setg(errp, "%s: failed to create encryption context", __func__);
return -1;
@@ -1794,9 +1831,11 @@ sev_snp_guest_class_init(ObjectClass *oc, void *data)
SevCommonStateClass *klass = SEV_COMMON_CLASS(oc);
X86ConfidentialGuestClass *x86_klass = X86_CONFIDENTIAL_GUEST_CLASS(oc);
+ klass->launch_start = sev_snp_launch_start;
klass->kvm_init = sev_snp_kvm_init;
x86_klass->kvm_type = sev_snp_kvm_type;
+
object_class_property_add(oc, "policy", "uint64",
sev_snp_guest_get_policy,
sev_snp_guest_set_policy, NULL, NULL);
diff --git a/target/i386/trace-events b/target/i386/trace-events
index 2cd8726eeb..cb26d8a925 100644
--- a/target/i386/trace-events
+++ b/target/i386/trace-events
@@ -11,3 +11,4 @@ kvm_sev_launch_measurement(const char *value) "data %s"
kvm_sev_launch_finish(void) ""
kvm_sev_launch_secret(uint64_t hpa, uint64_t hva, uint64_t secret, int len) "hpa 0x%" PRIx64 " hva 0x%" PRIx64 " data 0x%" PRIx64 " len %d"
kvm_sev_attestation_report(const char *mnonce, const char *data) "mnonce %s data %s"
+kvm_sev_snp_launch_start(uint64_t policy, char *gosvw) "policy 0x%" PRIx64 " gosvw %s"
--
2.39.3

@ -0,0 +1,237 @@
From ec786a1ec0a76775e980862d77500f5196a937e3 Mon Sep 17 00:00:00 2001
From: Dov Murik <dovmurik@linux.ibm.com>
Date: Thu, 30 May 2024 06:16:35 -0500
Subject: [PATCH 080/100] i386/sev: Allow measured direct kernel boot on SNP
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [80/91] 11c629862519c1a279566febf5a537c63c5fcf61 (bonzini/rhel-qemu-kvm)
In SNP, the hashes page is designated by a specific metadata entry
published in AmdSev OVMF.
Therefore, if the user enabled kernel hashes (for measured direct boot),
QEMU should prepare the content of the hashes table, and during the
processing of the metadata entry it copies the content into the designated
page and encrypts it.
Note that in SNP (unlike SEV and SEV-ES) the measurement is done in
whole 4KB pages. Therefore QEMU zeros the whole page that includes the
hashes table, and fills in the kernel hashes area in that page, and then
encrypts the whole page. The rest of the page is reserved for SEV
launch secrets which are not usable anyway on SNP.
If the user disabled kernel hashes, QEMU pre-validates the kernel hashes
page as a zero page.
Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-24-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit c1996992cc882b00139f78067d6a64e2ec9cb0d8)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
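Note: the page preparation described above can be sketched on a plain buffer.
This is a minimal sketch under stated assumptions: prepare_hashes_page(), the
page-type enum and the 4 KiB page-size constant are hypothetical stand-ins
for snp_launch_update_kernel_hashes(), the KVM page types and
TARGET_PAGE_MASK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SNP_PAGE_SIZE 4096u /* assumed measurement granularity */

/* Hypothetical stand-ins for the KVM SNP page types. */
enum hashes_page_type { HASHES_PAGE_ZERO, HASHES_PAGE_NORMAL };

/*
 * With kernel hashes enabled, zero the whole page and place the table at
 * the in-page offset of the OVMF-designated area, so the page is measured
 * as normal data. With hashes disabled, leave the buffer untouched and
 * report that the page should be pre-validated as a zero page.
 */
static enum hashes_page_type
prepare_hashes_page(uint8_t *page, size_t page_len, uint64_t area_base,
                    const void *table, size_t table_len, bool enabled)
{
    size_t off;

    if (!enabled) {
        return HASHES_PAGE_ZERO;
    }

    off = (size_t)(area_base & (SNP_PAGE_SIZE - 1));
    assert(off + table_len <= page_len);

    memset(page, 0, page_len);
    memcpy(page + off, table, table_len);
    return HASHES_PAGE_NORMAL;
}
```

For an area base of 0x80C00 the table lands at offset 0xC00 of its page, and
every other byte of that page is zero when it is encrypted.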
include/hw/i386/pc.h | 2 +
target/i386/sev.c | 111 ++++++++++++++++++++++++++++++++-----------
2 files changed, 85 insertions(+), 28 deletions(-)
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 94b49310f5..ee3bfb7be9 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -175,6 +175,8 @@ typedef enum {
SEV_DESC_TYPE_SNP_SECRETS,
/* The section contains address that can be used as a CPUID page */
SEV_DESC_TYPE_CPUID,
+ /* The section contains the region for kernel hashes for measured direct boot */
+ SEV_DESC_TYPE_SNP_KERNEL_HASHES = 0x10,
} ovmf_sev_metadata_desc_type;
diff --git a/target/i386/sev.c b/target/i386/sev.c
index 3fce4c08eb..004c667ac1 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -115,6 +115,10 @@ struct SevCommonStateClass {
X86ConfidentialGuestClass parent_class;
/* public */
+ bool (*build_kernel_loader_hashes)(SevCommonState *sev_common,
+ SevHashTableDescriptor *area,
+ SevKernelLoaderContext *ctx,
+ Error **errp);
int (*launch_start)(SevCommonState *sev_common);
void (*launch_finish)(SevCommonState *sev_common);
int (*launch_update_data)(SevCommonState *sev_common, hwaddr gpa, uint8_t *ptr, uint64_t len);
@@ -154,6 +158,9 @@ struct SevSnpGuestState {
struct kvm_sev_snp_launch_start kvm_start_conf;
struct kvm_sev_snp_launch_finish kvm_finish_conf;
+
+ uint32_t kernel_hashes_offset;
+ PaddedSevHashTable *kernel_hashes_data;
};
#define DEFAULT_GUEST_POLICY 0x1 /* disable debug */
@@ -1189,6 +1196,23 @@ snp_launch_update_cpuid(uint32_t cpuid_addr, void *hva, uint32_t cpuid_len)
KVM_SEV_SNP_PAGE_TYPE_CPUID);
}
+static int
+snp_launch_update_kernel_hashes(SevSnpGuestState *sev_snp, uint32_t addr,
+ void *hva, uint32_t len)
+{
+ int type = KVM_SEV_SNP_PAGE_TYPE_ZERO;
+ if (sev_snp->parent_obj.kernel_hashes) {
+ assert(sev_snp->kernel_hashes_data);
+ assert((sev_snp->kernel_hashes_offset +
+ sizeof(*sev_snp->kernel_hashes_data)) <= len);
+ memset(hva, 0, len);
+ memcpy(hva + sev_snp->kernel_hashes_offset, sev_snp->kernel_hashes_data,
+ sizeof(*sev_snp->kernel_hashes_data));
+ type = KVM_SEV_SNP_PAGE_TYPE_NORMAL;
+ }
+ return snp_launch_update_data(addr, hva, len, type);
+}
+
static int
snp_metadata_desc_to_page_type(int desc_type)
{
@@ -1225,6 +1249,9 @@ snp_populate_metadata_pages(SevSnpGuestState *sev_snp,
if (type == KVM_SEV_SNP_PAGE_TYPE_CPUID) {
ret = snp_launch_update_cpuid(desc->base, hva, desc->len);
+ } else if (desc->type == SEV_DESC_TYPE_SNP_KERNEL_HASHES) {
+ ret = snp_launch_update_kernel_hashes(sev_snp, desc->base, hva,
+ desc->len);
} else {
ret = snp_launch_update_data(desc->base, hva, desc->len, type);
}
@@ -1823,6 +1850,58 @@ static bool build_kernel_loader_hashes(PaddedSevHashTable *padded_ht,
return true;
}
+static bool sev_snp_build_kernel_loader_hashes(SevCommonState *sev_common,
+ SevHashTableDescriptor *area,
+ SevKernelLoaderContext *ctx,
+ Error **errp)
+{
+ /*
+ * SNP: Populate the hashes table in an area that later in
+ * snp_launch_update_kernel_hashes() will be copied to the guest memory
+ * and encrypted.
+ */
+ SevSnpGuestState *sev_snp_guest = SEV_SNP_GUEST(sev_common);
+ sev_snp_guest->kernel_hashes_offset = area->base & ~TARGET_PAGE_MASK;
+ sev_snp_guest->kernel_hashes_data = g_new0(PaddedSevHashTable, 1);
+ return build_kernel_loader_hashes(sev_snp_guest->kernel_hashes_data, ctx, errp);
+}
+
+static bool sev_build_kernel_loader_hashes(SevCommonState *sev_common,
+ SevHashTableDescriptor *area,
+ SevKernelLoaderContext *ctx,
+ Error **errp)
+{
+ PaddedSevHashTable *padded_ht;
+ hwaddr mapped_len = sizeof(*padded_ht);
+ MemTxAttrs attrs = { 0 };
+ bool ret = true;
+
+ /*
+ * Populate the hashes table in the guest's memory at the OVMF-designated
+ * area for the SEV hashes table
+ */
+ padded_ht = address_space_map(&address_space_memory, area->base,
+ &mapped_len, true, attrs);
+ if (!padded_ht || mapped_len != sizeof(*padded_ht)) {
+ error_setg(errp, "SEV: cannot map hashes table guest memory area");
+ return false;
+ }
+
+ if (build_kernel_loader_hashes(padded_ht, ctx, errp)) {
+ if (sev_encrypt_flash(area->base, (uint8_t *)padded_ht,
+ sizeof(*padded_ht), errp) < 0) {
+ ret = false;
+ }
+ } else {
+ ret = false;
+ }
+
+ address_space_unmap(&address_space_memory, padded_ht,
+ mapped_len, true, mapped_len);
+
+ return ret;
+}
+
/*
* Add the hashes of the linux kernel/initrd/cmdline to an encrypted guest page
* which is included in SEV's initial memory measurement.
@@ -1831,11 +1910,8 @@ bool sev_add_kernel_loader_hashes(SevKernelLoaderContext *ctx, Error **errp)
{
uint8_t *data;
SevHashTableDescriptor *area;
- PaddedSevHashTable *padded_ht;
- hwaddr mapped_len = sizeof(*padded_ht);
- MemTxAttrs attrs = { 0 };
- bool ret = true;
SevCommonState *sev_common = SEV_COMMON(MACHINE(qdev_get_machine())->cgs);
+ SevCommonStateClass *klass = SEV_COMMON_GET_CLASS(sev_common);
/*
* Only add the kernel hashes if the sev-guest configuration explicitly
@@ -1858,30 +1934,7 @@ bool sev_add_kernel_loader_hashes(SevKernelLoaderContext *ctx, Error **errp)
return false;
}
- /*
- * Populate the hashes table in the guest's memory at the OVMF-designated
- * area for the SEV hashes table
- */
- padded_ht = address_space_map(&address_space_memory, area->base,
- &mapped_len, true, attrs);
- if (!padded_ht || mapped_len != sizeof(*padded_ht)) {
- error_setg(errp, "SEV: cannot map hashes table guest memory area");
- return false;
- }
-
- if (build_kernel_loader_hashes(padded_ht, ctx, errp)) {
- if (sev_encrypt_flash(area->base, (uint8_t *)padded_ht,
- sizeof(*padded_ht), errp) < 0) {
- ret = false;
- }
- } else {
- ret = false;
- }
-
- address_space_unmap(&address_space_memory, padded_ht,
- mapped_len, true, mapped_len);
-
- return ret;
+ return klass->build_kernel_loader_hashes(sev_common, area, ctx, errp);
}
static char *
@@ -1998,6 +2051,7 @@ sev_guest_class_init(ObjectClass *oc, void *data)
SevCommonStateClass *klass = SEV_COMMON_CLASS(oc);
X86ConfidentialGuestClass *x86_klass = X86_CONFIDENTIAL_GUEST_CLASS(oc);
+ klass->build_kernel_loader_hashes = sev_build_kernel_loader_hashes;
klass->launch_start = sev_launch_start;
klass->launch_finish = sev_launch_finish;
klass->launch_update_data = sev_launch_update_data;
@@ -2242,6 +2296,7 @@ sev_snp_guest_class_init(ObjectClass *oc, void *data)
SevCommonStateClass *klass = SEV_COMMON_CLASS(oc);
X86ConfidentialGuestClass *x86_klass = X86_CONFIDENTIAL_GUEST_CLASS(oc);
+ klass->build_kernel_loader_hashes = sev_snp_build_kernel_loader_hashes;
klass->launch_start = sev_snp_launch_start;
klass->launch_finish = sev_snp_launch_finish;
klass->launch_update_data = sev_snp_launch_update_data;
--
2.39.3

@ -0,0 +1,268 @@
From ab6197309551bd6ddd9f8239191f68dfac23684b Mon Sep 17 00:00:00 2001
From: Michael Roth <michael.roth@amd.com>
Date: Tue, 9 Jul 2024 23:10:05 -0500
Subject: [PATCH 090/100] i386/sev: Don't allow automatic fallback to legacy
KVM_SEV*_INIT
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [90/91] 2b1345faa56f993bb6e13d63e11656c784e20412 (bonzini/rhel-qemu-kvm)
Currently if the 'legacy-vm-type' property of the sev-guest object is
'off', QEMU will attempt to use the newer KVM_SEV_INIT2 kernel
interface in conjunction with the newer KVM_X86_SEV_VM and
KVM_X86_SEV_ES_VM KVM VM types.
This can lead to measurement changes if, for instance, an SEV guest was
created on a host that originally had an older kernel that didn't
support KVM_SEV_INIT2, but is booted on the same host later on after the
host kernel was upgraded.
Instead, if legacy-vm-type is 'off', QEMU should fail if the
KVM_SEV_INIT2 interface is not provided by the current host kernel.
Modify the fallback handling accordingly.
In the future, VMSA features and other flags might be added to QEMU
which will require legacy-vm-type to be 'off' because they will rely
on the newer KVM_SEV_INIT2 interface. It may be difficult to convey to
users what values of legacy-vm-type are compatible with which
features/options, so as part of this rework, switch legacy-vm-type to a
tri-state OnOffAuto option. 'auto' in this case will automatically
switch to using the newer KVM_SEV_INIT2, but only if it is required to
make use of new VMSA features or other options only available via
KVM_SEV_INIT2.
Defining 'auto' in this way would avoid inadvertently breaking
compatibility with older kernels since it would only be used in cases
where users opt into newer features that are only available via
KVM_SEV_INIT2 and newer kernels, and provide better default behavior
than the legacy-vm-type=off behavior that was previously in place, so
make it the default for 9.1+ machine types.
Cc: Daniel P. Berrangé <berrange@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
cc: kvm@vger.kernel.org
Signed-off-by: Michael Roth <michael.roth@amd.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Link: https://lore.kernel.org/r/20240710041005.83720-1-michael.roth@amd.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 9d38d9dca2a81aaf5752d45d221021ef96d496cd)
RHEL: adjust compatibility setting, applying it to 9.4 machine type
---
hw/i386/pc.c | 2 +-
qapi/qom.json | 18 ++++++----
target/i386/sev.c | 85 +++++++++++++++++++++++++++++++++++++++--------
3 files changed, 83 insertions(+), 22 deletions(-)
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index b25d075b59..e9c5ea5d8f 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -352,7 +352,7 @@ const size_t pc_rhel_compat_len = G_N_ELEMENTS(pc_rhel_compat);
GlobalProperty pc_rhel_9_5_compat[] = {
/* pc_rhel_9_5_compat from pc_compat_pc_9_0 (backported from 9.1) */
{ TYPE_X86_CPU, "guest-phys-bits", "0" },
- { "sev-guest", "legacy-vm-type", "true" },
+ { "sev-guest", "legacy-vm-type", "on" },
};
const size_t pc_rhel_9_5_compat_len = G_N_ELEMENTS(pc_rhel_9_5_compat);
diff --git a/qapi/qom.json b/qapi/qom.json
index 8bd299265e..17bd5a0cf7 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -912,12 +912,16 @@
# @handle: SEV firmware handle (default: 0)
#
# @legacy-vm-type: Use legacy KVM_SEV_INIT KVM interface for creating the VM.
-# The newer KVM_SEV_INIT2 interface syncs additional vCPU
-# state when initializing the VMSA structures, which will
-# result in a different guest measurement. Set this to
-# maintain compatibility with older QEMU or kernel versions
-# that rely on legacy KVM_SEV_INIT behavior.
-# (default: false) (since 9.1)
+# The newer KVM_SEV_INIT2 interface, from Linux >= 6.10, syncs
+# additional vCPU state when initializing the VMSA structures,
+# which will result in a different guest measurement. Set
+# this to 'on' to force compatibility with older QEMU or kernel
+# versions that rely on legacy KVM_SEV_INIT behavior. 'auto'
+# will behave identically to 'on', but will automatically
+# switch to using KVM_SEV_INIT2 if the user specifies any
+# additional options that require it. If set to 'off', QEMU
+# will require KVM_SEV_INIT2 unconditionally.
+# (default: auto) (since 9.1)
#
# Since: 2.12
##
@@ -927,7 +931,7 @@
'*session-file': 'str',
'*policy': 'uint32',
'*handle': 'uint32',
- '*legacy-vm-type': 'bool' } }
+ '*legacy-vm-type': 'OnOffAuto' } }
##
# @SevSnpGuestProperties:
diff --git a/target/i386/sev.c b/target/i386/sev.c
index 491fab74fd..b921defb63 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -144,7 +144,7 @@ struct SevGuestState {
uint32_t policy;
char *dh_cert_file;
char *session_file;
- bool legacy_vm_type;
+ OnOffAuto legacy_vm_type;
};
struct SevSnpGuestState {
@@ -1334,6 +1334,17 @@ sev_vm_state_change(void *opaque, bool running, RunState state)
}
}
+/*
+ * This helper is to examine sev-guest properties and determine if any options
+ * have been set which rely on the newer KVM_SEV_INIT2 interface and associated
+ * KVM VM types.
+ */
+static bool sev_init2_required(SevGuestState *sev_guest)
+{
+ /* Currently no KVM_SEV_INIT2-specific options are exposed via QEMU */
+ return false;
+}
+
static int sev_kvm_type(X86ConfidentialGuest *cg)
{
SevCommonState *sev_common = SEV_COMMON(cg);
@@ -1344,14 +1355,39 @@ static int sev_kvm_type(X86ConfidentialGuest *cg)
goto out;
}
+ /* These are the only cases where legacy VM types can be used. */
+ if (sev_guest->legacy_vm_type == ON_OFF_AUTO_ON ||
+ (sev_guest->legacy_vm_type == ON_OFF_AUTO_AUTO &&
+ !sev_init2_required(sev_guest))) {
+ sev_common->kvm_type = KVM_X86_DEFAULT_VM;
+ goto out;
+ }
+
+ /*
+ * Newer VM types are required, either explicitly via legacy-vm-type=off, or
+ * implicitly via legacy-vm-type=auto along with additional sev-guest
+ * properties that require the newer VM types.
+ */
kvm_type = (sev_guest->policy & SEV_POLICY_ES) ?
KVM_X86_SEV_ES_VM : KVM_X86_SEV_VM;
- if (kvm_is_vm_type_supported(kvm_type) && !sev_guest->legacy_vm_type) {
- sev_common->kvm_type = kvm_type;
- } else {
- sev_common->kvm_type = KVM_X86_DEFAULT_VM;
+ if (!kvm_is_vm_type_supported(kvm_type)) {
+ if (sev_guest->legacy_vm_type == ON_OFF_AUTO_AUTO) {
+ error_report("SEV: host kernel does not support requested %s VM type, which is required "
+ "for the set of options specified. To allow use of the legacy "
+ "KVM_X86_DEFAULT_VM VM type, please disable any options that are not "
+ "compatible with the legacy VM type, or upgrade your kernel.",
+ kvm_type == KVM_X86_SEV_VM ? "KVM_X86_SEV_VM" : "KVM_X86_SEV_ES_VM");
+ } else {
+ error_report("SEV: host kernel does not support requested %s VM type. To allow use of "
+ "the legacy KVM_X86_DEFAULT_VM VM type, the 'legacy-vm-type' argument "
+ "must be set to 'on' or 'auto' for the sev-guest object.",
+ kvm_type == KVM_X86_SEV_VM ? "KVM_X86_SEV_VM" : "KVM_X86_SEV_ES_VM");
+ }
+
+ return -1;
}
+ sev_common->kvm_type = kvm_type;
out:
return sev_common->kvm_type;
}
@@ -1442,14 +1478,24 @@ static int sev_common_kvm_init(ConfidentialGuestSupport *cgs, Error **errp)
}
trace_kvm_sev_init();
- if (x86_klass->kvm_type(X86_CONFIDENTIAL_GUEST(sev_common)) == KVM_X86_DEFAULT_VM) {
+ switch (x86_klass->kvm_type(X86_CONFIDENTIAL_GUEST(sev_common))) {
+ case KVM_X86_DEFAULT_VM:
cmd = sev_es_enabled() ? KVM_SEV_ES_INIT : KVM_SEV_INIT;
ret = sev_ioctl(sev_common->sev_fd, cmd, NULL, &fw_error);
- } else {
+ break;
+ case KVM_X86_SEV_VM:
+ case KVM_X86_SEV_ES_VM:
+ case KVM_X86_SNP_VM: {
struct kvm_sev_init args = { 0 };
ret = sev_ioctl(sev_common->sev_fd, KVM_SEV_INIT2, &args, &fw_error);
+ break;
+ }
+ default:
+ error_setg(errp, "%s: host kernel does not support the requested SEV configuration.",
+ __func__);
+ return -1;
}
if (ret) {
@@ -2037,14 +2083,23 @@ sev_guest_set_session_file(Object *obj, const char *value, Error **errp)
SEV_GUEST(obj)->session_file = g_strdup(value);
}
-static bool sev_guest_get_legacy_vm_type(Object *obj, Error **errp)
+static void sev_guest_get_legacy_vm_type(Object *obj, Visitor *v,
+ const char *name, void *opaque,
+ Error **errp)
{
- return SEV_GUEST(obj)->legacy_vm_type;
+ SevGuestState *sev_guest = SEV_GUEST(obj);
+ OnOffAuto legacy_vm_type = sev_guest->legacy_vm_type;
+
+ visit_type_OnOffAuto(v, name, &legacy_vm_type, errp);
}
-static void sev_guest_set_legacy_vm_type(Object *obj, bool value, Error **errp)
+static void sev_guest_set_legacy_vm_type(Object *obj, Visitor *v,
+ const char *name, void *opaque,
+ Error **errp)
{
- SEV_GUEST(obj)->legacy_vm_type = value;
+ SevGuestState *sev_guest = SEV_GUEST(obj);
+
+ visit_type_OnOffAuto(v, name, &sev_guest->legacy_vm_type, errp);
}
static void
@@ -2070,9 +2125,9 @@ sev_guest_class_init(ObjectClass *oc, void *data)
sev_guest_set_session_file);
object_class_property_set_description(oc, "session-file",
"guest owners session parameters (encoded with base64)");
- object_class_property_add_bool(oc, "legacy-vm-type",
- sev_guest_get_legacy_vm_type,
- sev_guest_set_legacy_vm_type);
+ object_class_property_add(oc, "legacy-vm-type", "OnOffAuto",
+ sev_guest_get_legacy_vm_type,
+ sev_guest_set_legacy_vm_type, NULL, NULL);
object_class_property_set_description(oc, "legacy-vm-type",
"use legacy VM type to maintain measurement compatibility with older QEMU or kernel versions.");
}
@@ -2088,6 +2143,8 @@ sev_guest_instance_init(Object *obj)
object_property_add_uint32_ptr(obj, "policy", &sev_guest->policy,
OBJ_PROP_FLAG_READWRITE);
object_apply_compat_props(obj);
+
+ sev_guest->legacy_vm_type = ON_OFF_AUTO_AUTO;
}
/* guest info specific sev/sev-es */
--
2.39.3
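
Editorial sketch (not part of the patch above): the VM type selection that sev_kvm_type() performs after this change can be condensed into a small standalone C function. The enum names, pick_vm_type(), and its boolean parameters are hypothetical stand-ins for QEMU's OnOffAuto values, sev_init2_required(), the SEV_POLICY_ES check, and kvm_is_vm_type_supported(); it illustrates the decision table under those assumptions and is not QEMU code.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for QEMU's OnOffAuto and the KVM VM type constants. */
typedef enum { LEGACY_AUTO, LEGACY_ON, LEGACY_OFF } LegacyVmType;
typedef enum { VM_ERROR = -1, VM_DEFAULT, VM_SEV, VM_SEV_ES } VmType;

/*
 * Mirror of the decision described in the commit message:
 *  - 'on', or 'auto' with no INIT2-only options, selects the legacy type;
 *  - otherwise the newer type is mandatory, and a host kernel that lacks it
 *    produces an error instead of a silent fallback to the legacy type.
 */
static VmType pick_vm_type(LegacyVmType legacy, bool init2_required,
                           bool es_policy, bool host_has_new_types)
{
    if (legacy == LEGACY_ON ||
        (legacy == LEGACY_AUTO && !init2_required)) {
        return VM_DEFAULT;
    }

    if (!host_has_new_types) {
        return VM_ERROR;
    }

    return es_policy ? VM_SEV_ES : VM_SEV;
}
```

With these stand-ins, pick_vm_type(LEGACY_OFF, false, false, false) yields VM_ERROR, which corresponds to the new hard failure that replaces the old automatic fallback, while pick_vm_type(LEGACY_AUTO, false, true, true) still yields VM_DEFAULT because no option currently requires KVM_SEV_INIT2.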

@ -0,0 +1,46 @@
From ebb3c3536366c383fa09b0987a4efb68d018b7b8 Mon Sep 17 00:00:00 2001
From: Michael Roth <michael.roth@amd.com>
Date: Thu, 30 May 2024 06:16:24 -0500
Subject: [PATCH 064/100] i386/sev: Don't return launch measurements for
SEV-SNP guests
RH-Author: Paolo Bonzini <pbonzini@redhat.com>
RH-MergeRequest: 245: SEV-SNP support
RH-Jira: RHEL-39544
RH-Acked-by: Thomas Huth <thuth@redhat.com>
RH-Acked-by: Bandan Das <bdas@redhat.com>
RH-Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
RH-Commit: [64/91] 5a29bb2d8b5a07aec6fd271ec37345e665e9cce4 (bonzini/rhel-qemu-kvm)
For SEV-SNP guests, launch measurement is queried from within the guest
during attestation, so don't attempt to return it as part of
query-sev-launch-measure.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Pankaj Gupta <pankaj.gupta@amd.com>
Message-ID: <20240530111643.1091816-13-pankaj.gupta@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 73ae63b162fc1fed520f53ad200712964d7d0264)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
target/i386/sev.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/target/i386/sev.c b/target/i386/sev.c
index 6525b3c1a0..c3daaf1ad5 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -795,7 +795,9 @@ sev_launch_get_measure(Notifier *notifier, void *unused)
static char *sev_get_launch_measurement(void)
{
- SevGuestState *sev_guest = SEV_GUEST(MACHINE(qdev_get_machine())->cgs);
+ ConfidentialGuestSupport *cgs = MACHINE(qdev_get_machine())->cgs;
+ SevGuestState *sev_guest =
+ (SevGuestState *)object_dynamic_cast(OBJECT(cgs), TYPE_SEV_GUEST);
if (sev_guest &&
SEV_COMMON(sev_guest)->state >= SEV_STATE_LAUNCH_SECRET) {
--
2.39.3

Some files were not shown because too many files have changed in this diff