$NetBSD: patch-XSA444,v 1.1 2023/11/15 15:59:36 bouyer Exp $

From: Andrew Cooper
Subject: x86/svm: Fix asymmetry with AMD DR MASK context switching

The handling of MSR_DR{0..3}_MASK is asymmetric between PV and HVM guests.

HVM guests context switch in based on the guest view of DBEXT, whereas PV
guests switch in based on the host capability.  Both guest types leave the
context dirty for the next vCPU.

This leads to the following issue:

 * PV or HVM guest has debugging active (%dr7 + mask)
 * Switch-out deactivates %dr7 but leaves other state stale in hardware
 * Another HVM guest with masks unavailable has debugging active
 * Switch-in loads %dr7 but leaves the mask MSRs alone

Now, the second guest's vCPU is operating in the context of the prior vCPU's
mask MSRs, while the environment the vCPU can see says there are no mask
MSRs.

As a stopgap, adjust the HVM path to switch in the masks based on host
capabilities rather than guest visibility (i.e. like the PV path).
Adjustment of the intercepts still needs to be dependent on the guest
visibility of DBEXT.

This is part of XSA-444 / CVE-2023-34327.

Fixes: c097f54912d3 ("x86/SVM: support data breakpoint extension registers")
Signed-off-by: Andrew Cooper
Reviewed-by: Jan Beulich

diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index a019d196e071..ba4069f9100a 100644
--- xen/arch/x86/hvm/svm/svm.c.orig
+++ xen/arch/x86/hvm/svm/svm.c
@@ -185,6 +185,10 @@ static void svm_save_dr(struct vcpu *v)
     v->arch.hvm.flag_dr_dirty = 0;
     vmcb_set_dr_intercepts(vmcb, ~0u);
 
+    /*
+     * The guest can only have changed the mask MSRs if we previously dropped
+     * intercepts.  Re-read them from hardware.
+     */
     if ( v->domain->arch.cpuid->extd.dbext )
     {
         svm_intercept_msr(v, MSR_AMD64_DR0_ADDRESS_MASK, MSR_INTERCEPT_RW);
@@ -216,17 +220,25 @@ static void __restore_debug_registers(struct vmcb_struct *vmcb, struct vcpu *v)
 
     ASSERT(v == current);
 
-    if ( v->domain->arch.cpuid->extd.dbext )
+    /*
+     * Both the PV and HVM paths leave stale DR_MASK values in hardware on
+     * context-switch-out.  If we're activating %dr7 for the guest, we must
+     * sync the DR_MASKs too, whether or not the guest can see them.
+     */
+    if ( boot_cpu_has(X86_FEATURE_DBEXT) )
     {
-        svm_intercept_msr(v, MSR_AMD64_DR0_ADDRESS_MASK, MSR_INTERCEPT_NONE);
-        svm_intercept_msr(v, MSR_AMD64_DR1_ADDRESS_MASK, MSR_INTERCEPT_NONE);
-        svm_intercept_msr(v, MSR_AMD64_DR2_ADDRESS_MASK, MSR_INTERCEPT_NONE);
-        svm_intercept_msr(v, MSR_AMD64_DR3_ADDRESS_MASK, MSR_INTERCEPT_NONE);
-
         wrmsrl(MSR_AMD64_DR0_ADDRESS_MASK, v->arch.msrs->dr_mask[0]);
         wrmsrl(MSR_AMD64_DR1_ADDRESS_MASK, v->arch.msrs->dr_mask[1]);
         wrmsrl(MSR_AMD64_DR2_ADDRESS_MASK, v->arch.msrs->dr_mask[2]);
         wrmsrl(MSR_AMD64_DR3_ADDRESS_MASK, v->arch.msrs->dr_mask[3]);
+
+        if ( v->domain->arch.cpuid->extd.dbext )
+        {
+            svm_intercept_msr(v, MSR_AMD64_DR0_ADDRESS_MASK, MSR_INTERCEPT_NONE);
+            svm_intercept_msr(v, MSR_AMD64_DR1_ADDRESS_MASK, MSR_INTERCEPT_NONE);
+            svm_intercept_msr(v, MSR_AMD64_DR2_ADDRESS_MASK, MSR_INTERCEPT_NONE);
+            svm_intercept_msr(v, MSR_AMD64_DR3_ADDRESS_MASK, MSR_INTERCEPT_NONE);
+        }
     }
 
     write_debugreg(0, v->arch.dr[0]);
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index f7992ff230b5..a142a63dd869 100644
--- xen/arch/x86/traps.c.orig
+++ xen/arch/x86/traps.c
@@ -2314,6 +2314,11 @@ void activate_debugregs(const struct vcpu *curr)
     if ( curr->arch.dr7 & DR7_ACTIVE_MASK )
         write_debugreg(7, curr->arch.dr7);
 
+    /*
+     * Both the PV and HVM paths leave stale DR_MASK values in hardware on
+     * context-switch-out.  If we're activating %dr7 for the guest, we must
+     * sync the DR_MASKs too, whether or not the guest can see them.
+     */
     if ( boot_cpu_has(X86_FEATURE_DBEXT) )
     {
         wrmsrl(MSR_AMD64_DR0_ADDRESS_MASK, curr->arch.msrs->dr_mask[0]);
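The asymmetry above can be modelled in isolation.  The following standalone
sketch is not part of the patch: struct vcpu, hw_dr_mask[], host_has_dbext
and the switch_in_*() helpers are simplified stand-ins, not Xen's real
interfaces.  It shows how a restore keyed to the guest-visible DBEXT bit
leaves the second vCPU running under the first vCPU's stale mask, while a
restore keyed to the host capability (the fix) does not.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

static uint64_t hw_dr_mask[4];           /* models the hardware mask MSRs */
static const bool host_has_dbext = true; /* models boot_cpu_has(X86_FEATURE_DBEXT) */

struct vcpu {
    bool guest_sees_dbext;               /* guest's CPUID view of DBEXT */
    uint64_t dr_mask[4];                 /* vCPU's architectural masks */
};

/* Pre-XSA-444 HVM behaviour: restore keyed to the guest view of DBEXT. */
static void switch_in_buggy(const struct vcpu *v)
{
    if ( v->guest_sees_dbext )
        for ( unsigned int i = 0; i < 4; i++ )
            hw_dr_mask[i] = v->dr_mask[i];
    /* else: the previous vCPU's masks stay live in hardware */
}

/* Fixed behaviour: restore keyed to the host capability, like PV. */
static void switch_in_fixed(const struct vcpu *v)
{
    if ( host_has_dbext )
        for ( unsigned int i = 0; i < 4; i++ )
            hw_dr_mask[i] = v->dr_mask[i];
}

int main(void)
{
    struct vcpu a = { .guest_sees_dbext = true,  .dr_mask = { 0xfff } };
    struct vcpu b = { .guest_sees_dbext = false, .dr_mask = { 0 } };

    switch_in_buggy(&a);
    switch_in_buggy(&b);        /* B inherits A's stale 0xfff mask */
    printf("buggy: B runs with mask0 = %#jx\n", (uintmax_t)hw_dr_mask[0]);

    switch_in_fixed(&a);
    switch_in_fixed(&b);        /* B's (zero) masks are written back */
    printf("fixed: B runs with mask0 = %#jx\n", (uintmax_t)hw_dr_mask[0]);
    return 0;
}
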
From: Andrew Cooper
Subject: x86/pv: Correct the auditing of guest breakpoint addresses

The use of access_ok() is buggy, because it permits access to the compat
translation area.  64bit PV guests don't use the XLAT area, but on AMD
hardware, the DBEXT feature allows a breakpoint to match up to a 4G aligned
region, allowing the breakpoint to reach outside of the XLAT area.

Prior to c/s cda16c1bb223 ("x86: mirror compat argument translation area for
32-bit PV"), the live GDT was within 4G of the XLAT area.

Altogether, this allowed a malicious 64bit PV guest on AMD hardware to place
a breakpoint over the live GDT, and trigger a #DB livelock (CVE-2015-8104).

Introduce breakpoint_addr_ok() and explain why __addr_ok() happens to be an
appropriate check in this case.

For Xen 4.14 and later, this is a latent bug because the XLAT area has moved
to be on its own with nothing interesting adjacent.  For Xen 4.13 and older
on AMD hardware, this fixes a PV-triggerable DoS.

This is part of XSA-444 / CVE-2023-34328.

Fixes: 65e355490817 ("x86/PV: support data breakpoint extension registers")
Signed-off-by: Andrew Cooper
Reviewed-by: Roger Pau Monné
Reviewed-by: Jan Beulich

diff --git a/xen/arch/x86/pv/misc-hypercalls.c b/xen/arch/x86/pv/misc-hypercalls.c
index 5dade2472687..681c16108fd1 100644
--- xen/arch/x86/pv/misc-hypercalls.c.orig
+++ xen/arch/x86/pv/misc-hypercalls.c
@@ -68,7 +68,7 @@ long set_debugreg(struct vcpu *v, unsigned int reg, unsigned long value)
     switch ( reg )
     {
     case 0 ... 3:
-        if ( !access_ok(value, sizeof(long)) )
+        if ( !breakpoint_addr_ok(value) )
             return -EPERM;
 
         v->arch.dr[reg] = value;
diff --git a/xen/include/asm-x86/debugreg.h b/xen/include/asm-x86/debugreg.h
index c57914efc6e8..cc298265244b 100644
--- xen/include/asm-x86/debugreg.h.orig
+++ xen/include/asm-x86/debugreg.h
@@ -77,6 +77,26 @@
     asm volatile ( "mov %%db" #reg ",%0" : "=r" (__val) ); \
     __val; \
 })
+
+/*
+ * Architecturally, %dr{0..3} can have any arbitrary value.  However, Xen
+ * can't allow the guest to breakpoint the Xen address range, so we limit
+ * the guest to the lower canonical half, or above the Xen range in the
+ * higher canonical half.
+ *
+ * Breakpoint lengths are specified to mask the low order address bits,
+ * meaning all breakpoints are naturally aligned.  With %dr7, the widest
+ * breakpoint is 8 bytes.  With DBEXT, the widest breakpoint is 4G.  Both
+ * of the Xen boundaries have >4G alignment.
+ *
+ * In principle we should account for HYPERVISOR_COMPAT_VIRT_START(d), but
+ * 64bit Xen has never enforced this for compat guests, and there's no
+ * problem (to Xen) if the guest breakpoints its alias of the M2P.
+ * Skipping this aspect simplifies the logic, and causes us not to reject
+ * a migrating guest which operated fine on prior versions of Xen.
+ */
+#define breakpoint_addr_ok(a) __addr_ok(a)
 
 long set_debugreg(struct vcpu *, unsigned int reg, unsigned long value);
 void activate_debugregs(const struct vcpu *);
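
The safety argument in the comment above (a naturally aligned breakpoint of
at most 4G can never straddle a boundary with >4G alignment, so auditing only
the base address with __addr_ok() suffices) can be sanity-checked with the
following standalone sketch.  It is not part of the patch: the boundary value
and the bp_region() helper are invented for illustration and do not reflect
Xen's real memory layout.

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Arbitrary stand-in boundary with >4G alignment (here 2^47 aligned). */
static const uint64_t boundary = 0xffff800000000000ULL;

/* Region covered by a breakpoint at 'addr' whose DBEXT mask covers the low
 * 'bits' address bits; breakpoints are naturally aligned. */
static void bp_region(uint64_t addr, unsigned int bits,
                      uint64_t *start, uint64_t *end)
{
    uint64_t mask = (bits >= 64) ? ~(uint64_t)0 : ((uint64_t)1 << bits) - 1;

    *start = addr & ~mask;
    *end   = addr | mask;
}

int main(void)
{
    const uint64_t probes[] = {
        0, 0x1234, boundary - 1, boundary, boundary + 0x89abcdef,
    };

    for ( unsigned int i = 0; i < sizeof(probes) / sizeof(probes[0]); i++ )
    {
        uint64_t s, e;

        /* Widest DBEXT breakpoint: 32 masked bits => a 4G aligned region. */
        bp_region(probes[i], 32, &s, &e);

        /* The whole region falls on one side of the >4G-aligned boundary,
         * so auditing the base address alone is sufficient. */
        assert((s < boundary) == (e < boundary));
    }
    puts("a 4G breakpoint region never straddles a >4G-aligned boundary");
    return 0;
}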