DOUBLE_FAULT exception under win7 x86 with KB4088878 #54

Closed
leeqwind opened this issue May 24, 2018 · 10 comments

leeqwind commented May 24, 2018

Description

Under Windows 7 x86 SP1 with KB4088878 (the 2018-03 security update: http://download.windowsupdate.com/c/msdownload/update/software/secu/2018/03/windows6.1-kb4088878-x86_7512ab54d6a6df9d7e3d511d84a387aaeaeef111.msu), every time the HyperPlatform driver is loaded and run, a DOUBLE_FAULT exception is raised while a CR3 access is being handled in a VM exit.

Expected behavior

The driver loads and runs normally.

Actual behavior

The DOUBLE_FAULT exception is triggered by the instruction mov cr3,ecx in the function UtilLoadPdptes.

Steps to reproduce the problem

  1. Install the 2018-03 security update for Windows 7 x86 SP1: http://download.windowsupdate.com/c/msdownload/update/software/secu/2018/03/windows6.1-kb4088878-x86_7512ab54d6a6df9d7e3d511d84a387aaeaeef111.msu

  2. Load and run the latest version of HyperPlatform (without any modification) in that Windows 7 environment.

  3. The exception is triggered immediately.

Specifications

  • Commit: 97d6ccc (the latest commit so far)

  • OS version: Windows 7 SP1 7601 with KB4088878

  • Architecture: x86

  • Hardware: VMware Workstation (any version)

  • Details:

The DOUBLE_FAULT exception is triggered by the instruction mov cr3,ecx in the function UtilLoadPdptes:

kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

UNEXPECTED_KERNEL_MODE_TRAP (7f)

Arguments:
Arg1: 00000008, EXCEPTION_DOUBLE_FAULT
Arg2: 80b93c00
Arg3: 00000000
Arg4: 00000000

TSS:  00000028 -- (.tss 0x28)
eax=00185000 ebx=7793fce0 ecx=5f2fd4a0 edx=87d5cff0 esi=003993d8 edi=00000000
eip=987bc8ad esp=87d5cdd0 ebp=87d5ce08 iopl=0         nv up di ng nz na po nc
cs=0008  ss=0010  ds=0020  es=0020  fs=0030  gs=0000             efl=00010082

STACK_TEXT:  
87d5ce08 987bdb3f 5f2fd4a0 86fa1000 987bc580 HyperPlatform!UtilLoadPdptes+0x1d [d:\github\hyperplatform\hyperplatform\util.cpp @ 835]
87d5cef4 987bf205 87d5cf40 00000000 003993d8 HyperPlatform!VmmpHandleCrAccess+0x14f [d:\github\hyperplatform\hyperplatform\vmm.cpp @ 966]
87d5cf34 987bd185 87d5cf40 87d5cfd8 00000006 HyperPlatform!VmmpHandleVmExit+0x145 [d:\github\hyperplatform\hyperplatform\vmm.cpp @ 268]
87d5cf64 987ba27e 87d5cfd8 00000000 00000000 HyperPlatform!VmmVmExitHandler+0xa5 [d:\github\hyperplatform\hyperplatform\vmm.cpp @ 205]
87d5cf6c 00000000 00000000 00000000 00000000 HyperPlatform!AsmVmmEntryPoint+0x37 [D:\GitHub\HyperPlatform\HyperPlatform\Arch\x86\x86.asm @ 117]

kd> ub 987bc8ad 
HyperPlatform!UtilLoadPdptes+0x6 [d:\github\hyperplatform\hyperplatform\util.cpp @ 828]:
987bc896 a108207c98      mov     eax,dword ptr [HyperPlatform!__security_cookie (987c2008)]
987bc89b 33c5            xor     eax,ebp
987bc89d 8945fc          mov     dword ptr [ebp-4],eax
987bc8a0 56              push    esi
987bc8a1 0f20d8          mov     eax,cr3
987bc8a4 8945cc          mov     dword ptr [ebp-34h],eax
987bc8a7 8b4d08          mov     ecx,dword ptr [ebp+8]
987bc8aa 0f22d9          mov     cr3,ecx

The current VM exit is handling a CR3 access from the system function KiKernelSysretExit:

kd> dc 87d5cf40 
87d5cf40  87d5cfd8 00000006 83f38028 00000000  ........(.......
87d5cf50  00000100 00000000 00000001 00000000  ................
87d5cf60  00000000 0051fdc8 987ba27e 87d5cfd8  ......Q.~.{.....
87d5cf70  00000000 00000000 00000000 00000000  ................
kd> u 83f38028 
nt!KiKernelSysretExit+0x28:
83f38028 0f22d9          mov     cr3,ecx
83f3802b 59              pop     ecx
83f3802c 0fa1            pop     fs
83f3802e fb              sti
83f3802f 0f35            sysexit
83f38031 cc              int     3

Detail files: https://1drv.ms/u/s!ApQpgQkWR0QOi8ZOVXUBX83WEQIE8Q

  • a Debug build version of a compiled SYS file and a PDB file
  • a log file
  • a system crash dump file
@tandasat
Owner

tandasat commented Jun 2, 2018

Thank you for the detailed report and for attaching the files. That helps. I was able to repro the same issue and plan to work on a fix this weekend.

@hzqst

hzqst commented Jun 11, 2018

// in VmmpHandleCrAccess
if (UtilIsX86Pae()) {
  UtilLoadPdptes(UtilVmRead(VmcsField::kGuestCr3));
}

The guest CR3 here is a _KPROCESS::UserDirectoryTableBase value, the same as in issue #52.

// Returns a kernel CR3 value of the current process.
/*_Use_decl_annotations_*/ static ULONG_PTR VmmpGetKernelCr3() {
  auto guest_cr3 = UtilVmRead(VmcsField::kGuestCr3);
  // Assume it is a user-mode CR3 when the lowest bit is set. If so, get CR3
  // from _KPROCESS::DirectoryTableBase.
  if (guest_cr3 & 1) {
    static const long kDirectoryTableBaseOffsetX64 = 0x28;
    static const long kDirectoryTableBaseOffsetX86 = 0x18;
    auto process = reinterpret_cast<PUCHAR>(PsGetCurrentProcess());
    if (IsX64()) {
      // Temporarily switch to the guest GS base while reading _KPROCESS.
      auto OriginalGsBase = UtilReadMsr(Msr::kIa32GsBase);
      UtilWriteMsr(Msr::kIa32GsBase, UtilVmRead(VmcsField::kGuestGsBase));

      guest_cr3 =
          *reinterpret_cast<PULONG_PTR>(process + kDirectoryTableBaseOffsetX64);

      UtilWriteMsr(Msr::kIa32GsBase, OriginalGsBase);
    } else {
      // Temporarily switch to the guest FS base while reading _KPROCESS.
      auto OriginalBase = UtilReadMsr(Msr::kIa32FsBase);
      UtilWriteMsr(Msr::kIa32FsBase, UtilVmRead(VmcsField::kGuestFsBase));

      guest_cr3 =
          *reinterpret_cast<PULONG_PTR>(process + kDirectoryTableBaseOffsetX86);

      UtilWriteMsr(Msr::kIa32FsBase, OriginalBase);
    }
  }
  return guest_cr3;
}

@tandasat
Owner

Does this resolve the issue? I was able to repro but could not find a solution last weekend.

If it does, feel free to make a PR or send me a patch so I can give you credit if you like.

@leeqwind
Author

leeqwind commented Jun 12, 2018

I'm sorry, but the issue has not been resolved. The DOUBLE_FAULT exception still occurs in the VM exit for that specific CR3 access even after I modified the code as suggested.

The guest CR3 value that caused the exception (such as 0x5f2f7920) does not have its lowest bit set, so the condition if (guest_cr3 & 1), which judges whether guest_cr3 is a user-mode CR3 in VmmpGetKernelCr3(), is never satisfied.

[Screenshot: sp180612_163342]
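
To make the observation above concrete, here is a minimal sketch that applies the bit-0 heuristic from VmmpGetKernelCr3() to the two CR3 values seen in this thread (0x5f2f7920 here and 0x5f2fd4a0 in ecx in the crash dump). The function names are hypothetical and are not part of HyperPlatform; the only hardware fact assumed is that with PAE paging, bits 31:5 of CR3 hold the 32-byte-aligned physical address of the PDPT.

```cpp
#include <cassert>
#include <cstdint>

// Heuristic quoted in this thread: a CR3 value with bit 0 set is assumed to be
// a user-mode CR3. Hypothetical helper, not a HyperPlatform function.
constexpr bool LooksLikeUserCr3(std::uint32_t cr3) {
  return (cr3 & 1u) != 0;
}

// With PAE paging, bits 31:5 of CR3 are the physical address of the
// page-directory-pointer table; the low 5 bits are not part of the address.
constexpr std::uint32_t PaePdptBase(std::uint32_t cr3) {
  return cr3 & 0xFFFFFFE0u;
}

// Both values reported in this issue have bit 0 clear, so the heuristic
// does not classify either of them as a user-mode CR3.
static_assert(!LooksLikeUserCr3(0x5f2f7920u), "bit 0 is clear");
static_assert(!LooksLikeUserCr3(0x5f2fd4a0u), "bit 0 is clear");
```

Both values are already 32-byte aligned, so the low bits carry no flag that the heuristic could pick up.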

@leeqwind
Author

BTW, when handling a CR3 write access, the value to judge should be the source operand of the mov cr3, ecx instruction rather than the value of kGuestCr3, since that value has not been written into kGuestCr3 yet.
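
A minimal sketch of how the source operand could be located, assuming the exit-qualification layout that the Intel SDM documents for control-register accesses (bits 3:0 = CR number, bits 5:4 = access type, bits 11:8 = general-purpose register). The type and function names are hypothetical and are not HyperPlatform's own, and the saved-register array is an assumption about how the VMM keeps guest GPRs.

```cpp
#include <cassert>
#include <cstdint>

// Access types encoded in bits 5:4 of the exit qualification.
enum class CrAccessType : std::uint8_t {
  kMoveToCr = 0,
  kMoveFromCr = 1,
  kClts = 2,
  kLmsw = 3,
};

// Hypothetical decoded view of a control-register access exit qualification.
struct CrAccess {
  std::uint8_t cr_number;    // bits 3:0
  CrAccessType access_type;  // bits 5:4
  std::uint8_t gpr_index;    // bits 11:8 (0 = xAX, 1 = xCX, 2 = xDX, ...)
};

constexpr CrAccess DecodeCrAccess(std::uint64_t qualification) {
  return CrAccess{
      static_cast<std::uint8_t>(qualification & 0xF),
      static_cast<CrAccessType>((qualification >> 4) & 0x3),
      static_cast<std::uint8_t>((qualification >> 8) & 0xF)};
}

// Returns the value that "mov crN, reg" is about to write, taken from the
// saved guest registers rather than from the guest CR3 field of the VMCS.
// Assumption: gprs[] is indexed in the same order as the encoding above.
// Index 4 (xSP) would have to come from the VMCS guest RSP field instead.
constexpr std::uint64_t SourceOperand(const CrAccess& access,
                                      const std::uint64_t (&gprs)[16]) {
  return gprs[access.gpr_index];
}
```

For mov cr3, ecx the qualification would be 0x103 (CR 3, move to CR, register 1), so the value to test is the saved ecx.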

@hzqst

hzqst commented Jun 13, 2018

@tandasat Sorry, I misread the issue. The best way to solve #52, #46, and this issue is to stop reading/writing guest virtual memory by switching the host CR3 in any VM-exit handler.

Emulated guest-VA access with emulated linear-address translation could do the same without any problem.

I did a small test that replaces all of the CR3-switching work with my ReadWriteGuestMemory code, based on Intel/haxm, and it works fine for now.

Idtr gdtr;
gdtr.base = (ULONG_PTR)UtilVmRead(VmcsField::kGuestGdtrBase);
gdtr.limit = (USHORT)UtilVmRead(VmcsField::kGuestGdtrLimit);

VmmpWriteGuestVirtual((ULONG_PTR)operation_address,
	sizeof(gdtr), &gdtr, sizeof(gdtr), 0);

SIZE_T VmmpWriteGuestVirtual(ULONG_PTR addr, SIZE_T dst_buflen, const void *src, SIZE_T size, ULONG flag)
{
	// TODO: use guest CPL for access checks
	const char *srcp = (const char *)src;
	SIZE_T offset = 0;
	void *hva, *hva_base;

	while (offset < size) {
		ULONG64 gpa;
		ULONG64 len = size - offset;
		int r = VmmpTranslateVirtualAddress(addr + offset, TF_WRITE, &gpa, &len,
			flag != 2);
		if (r != 0) {
			if (flag != 0)
				return offset;  // Number of bytes successfully written
			if (r & TF_GP2HP) {
				HYPERPLATFORM_LOG_ERROR_SAFE("VmmpWriteGuestVirtual(%llx, %x) failed\n", addr, size);
			}
			HYPERPLATFORM_LOG_ERROR_SAFE("VmmpWriteGuestVirtual(%llx, %x) injecting #PF\n", addr, size);
			VmmpInjectPageFault(r & 0x1f, (ULONG_PTR)addr + offset);
			return false;
		}

		hva_base = VmmpMapGuestPA(gpa, len);

		if (hva_base) {
			hva = (UCHAR *)hva_base + (gpa & 0xfff);
			RtlCopyMemory(hva, (void *)(srcp + offset), min(len, dst_buflen - offset ));
		}
		else {
			HYPERPLATFORM_LOG_ERROR_SAFE("VmmpWriteGuestVirtual Failed to VmmpMapGuestPA\n");
		}

		VmmpUnmapGuestPA(hva_base, len);

		offset += (SIZE_T)len;
	}

	return flag != 0 ? size : true;
}

VmmpTranslateVirtualAddress performs the linear-address translation by walking the PML4/PDPT/PD/PT, while MmMapIoSpace does the GPA-to-HVA mapping.
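
As an illustration of the first step of such a walk, the sketch below splits a 64-bit virtual address into the four table indices and the page offset. It assumes 4-level paging with 4 KB pages (a real walk would stop early at a large-page entry), and the names are hypothetical; this is not the VmmpTranslateVirtualAddress implementation referred to above.

```cpp
#include <cassert>
#include <cstdint>

// Indices used at each level of a 4-level (PML4/PDPT/PD/PT) walk.
struct PageTableIndices {
  std::uint16_t pml4;    // bits 47:39
  std::uint16_t pdpt;    // bits 38:30
  std::uint16_t pd;      // bits 29:21
  std::uint16_t pt;      // bits 20:12
  std::uint16_t offset;  // bits 11:0, byte offset within the 4 KB page
};

constexpr PageTableIndices SplitVirtualAddress(std::uint64_t va) {
  return PageTableIndices{
      static_cast<std::uint16_t>((va >> 39) & 0x1FF),
      static_cast<std::uint16_t>((va >> 30) & 0x1FF),
      static_cast<std::uint16_t>((va >> 21) & 0x1FF),
      static_cast<std::uint16_t>((va >> 12) & 0x1FF),
      static_cast<std::uint16_t>(va & 0xFFF)};
}
```

For the address fffff800054c6618 that appears in the log in this comment, the split gives PML4 index 0x1f0, PDPT index 0, PD index 0x2a, PT index 0xc6, and page offset 0x618. Each index selects an 8-byte entry in the table whose physical address comes from the previous level, starting from the guest CR3.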

update1: After a 2-hour test, I got a #GP fault inside PatchGuard code. I don't know whether VmmpRead/WriteGuestVirtual is the cause or not.

INITKDBG:00000001401DA96A                 mov     r9d, r10d
INITKDBG:00000001401DA96D                 cmp     r10d, 8
INITKDBG:00000001401DA971                 jb      short loc_1401DA993
INITKDBG:00000001401DA973                 mov     r11, r10
INITKDBG:00000001401DA976                 shr     r11, 3
INITKDBG:00000001401DA97A
INITKDBG:00000001401DA97A loc_1401DA97A:                          ; CODE XREF: sub_1401DA40C+585↓j
INITKDBG:00000001401DA97A                 xor     rdx, [r8]       ; crashes at this instruction with a #GP fault
INITKDBG:00000001401DA97D                 mov     ecx, [rsi+2A0h]
INITKDBG:00000001401DA983                 add     r9d, 0FFFFFFF8h
INITKDBG:00000001401DA987                 rol     rdx, cl
INITKDBG:00000001401DA98A                 add     r8, 8
INITKDBG:00000001401DA98E                 sub     r11, r13
0: kd> r
rax=ffffffffffffffff rbx=fffff880021b1268 rcx=000000000000001e
rdx=ffffffffc0000005 rsi=fffff880021b0ad0 rdi=fffff800029ec500
rip=fffff800028aa4a0 rsp=fffff880021b0a98 rbp=fffff880021b0fd0
 r8=fffffa8017b94409  r9=0000000000000000 r10=0000000000000000
r11=fffff880021b0408 r12=fffff880021b1310 r13=000000000010001f
r14=fffff880021b1130 r15=fffffa8017b92400
0: kd> dq fffffa8017b94409
fffffa80`17b94409  0002a08e`8b103349 c2d348f8`c1834100
fffffa80`17b94419  75c52b49`08c08349 0f411976`cb3b44e7
fffffa80`17b94429  000002a0`8e8b00b6 d348d033`48c5034d
fffffa80`17b94439  45e775ff`c18341c2 3302ebc2`8b481f01
fffffa80`17b94449  c33b481f`e8c148d0 b9491ff2`ba0ff575
fffffa80`17b94459  a3a03f58`91c8b4e8 fb12840f`14563b41
fffffa80`17b94469  1856894d`c28bffff 084e8b49`20468949
fffffa80`17b94479  568b41ff`fffc51e9 000002c8`8d8d4828

I uploaded my ntoskrnl for further analysis: ntoskrnl.zip

update2: It seems the missing PowerCallbackInitialization was to blame for the BSOD, since it happened when I woke the system up (I forgot to merge power_callback.cpp into my code). I will try again later.

update3: After adding PowerCallbackInitialization, it still hit a BSOD with a #GP fault, but there is no BSOD when descriptor_table_exiting = false. I will take a look at KVM to see if I have missed something.

08:58:57.350	ERR	#0	    4	    0	System         	VmmpTranslateVirtualAddress: fffff800054c6618 (W,S) mode 3
08:58:57.350	ERR	#0	    4	    0	System         	VmmpWriteGuestVirtual(fffff800054c6618, 0xa) injecting #PF
08:58:57.350	ERR	#0	    4	    0	System         	VmmpTranslateVirtualAddress: fffff800054c6608 (R,S) mode 3
08:58:57.350	ERR	#0	    4	    0	System         	VmmpReadGuestVirtual(fffff800054c6608, a) injecting #PF
08:58:57.350	ERR	#0	    4	    0	System         	VmmpTranslateVirtualAddress: fffff800054c6618 (R,S) mode 3
08:58:57.350	ERR	#0	    4	    0	System         	VmmpReadGuestVirtual(fffff800054c6618, a) injecting #PF
08:58:57.350	ERR	#1	    4	   68	System         	VmmpTranslateVirtualAddress: fffff880049085e0 (W,S) mode 3
08:58:57.350	ERR	#1	    4	   68	System         	VmmpWriteGuestVirtual(fffff880049085e0, 0xa) injecting #PF
08:58:57.350	ERR	#1	    4	   68	System         	VmmpTranslateVirtualAddress: fffff88004908588 (R,S) mode 3
08:58:57.350	ERR	#1	    4	   68	System         	VmmpReadGuestVirtual(fffff88004908588, a) injecting #PF
08:58:57.350	ERR	#1	    4	   68	System         	VmmpTranslateVirtualAddress: fffff880049085e0 (R,S) mode 3
08:58:57.350	ERR	#1	    4	   68	System         	VmmpReadGuestVirtual(fffff880049085e0, a) injecting #PF
09:00:56.877	ERR	#0	    4	    0	System         	VmmpTranslateVirtualAddress: fffff800054c6598 (W,S) mode 3
09:00:56.877	ERR	#0	    4	    0	System         	VmmpWriteGuestVirtual(fffff800054c6598, 0xa) injecting #PF
09:00:56.877	ERR	#0	    4	    0	System         	VmmpTranslateVirtualAddress: fffff800054c6588 (R,S) mode 3
09:00:56.877	ERR	#0	    4	    0	System         	VmmpReadGuestVirtual(fffff800054c6588, a) injecting #PF
09:00:56.877	ERR	#0	    4	    0	System         	VmmpTranslateVirtualAddress: fffff800054c6598 (R,S) mode 3
09:00:56.877	ERR	#0	    4	    0	System         	VmmpReadGuestVirtual(fffff800054c6598, a) injecting #PF
09:00:56.877	ERR	#1	    4	   68	System         	VmmpTranslateVirtualAddress: fffff880049085e0 (W,S) mode 3
09:00:56.877	ERR	#1	    4	   68	System         	VmmpWriteGuestVirtual(fffff880049085e0, 0xa) injecting #PF
09:00:56.877	ERR	#1	    4	   68	System         	VmmpTranslateVirtualAddress: fffff88004908588 (R,S) mode 3
09:00:56.877	ERR	#1	    4	   68	System         	VmmpReadGuestVirtual(fffff88004908588, a) injecting #PF
09:00:56.877	ERR	#1	    4	   68	System         	VmmpTranslateVirtualAddress: fffff880049085e0 (R,S) mode 3
09:00:56.877	ERR	#1	    4	   68	System         	VmmpReadGuestVirtual(fffff880049085e0, a) injecting #PF
09:02:56.389	ERR	#0	    4	    0	System         	VmmpTranslateVirtualAddress: fffff800054c6618 (W,S) mode 3
09:02:56.389	ERR	#0	    4	    0	System         	VmmpWriteGuestVirtual(fffff800054c6618, 0xa) injecting #PF
09:02:56.389	ERR	#0	    4	    0	System         	VmmpTranslateVirtualAddress: fffff800054c6608 (R,S) mode 3
09:02:56.389	ERR	#0	    4	    0	System         	VmmpReadGuestVirtual(fffff800054c6608, a) injecting #PF
09:02:56.389	ERR	#0	    4	    0	System         	VmmpTranslateVirtualAddress: fffff800054c6618 (R,S) mode 3
09:02:56.389	ERR	#0	    4	    0	System         	VmmpReadGuestVirtual(fffff800054c6618, a) injecting #PF
09:02:56.389	ERR	#0	    4	   60	System         	VmmpTranslateVirtualAddress: fffff880047f45e0 (W,S) mode 3
09:02:56.389	ERR	#0	    4	   60	System         	VmmpWriteGuestVirtual(fffff880047f45e0, 0xa) injecting #PF
09:02:56.389	ERR	#0	    4	   60	System         	VmmpTranslateVirtualAddress: fffff880047f4588 (R,S) mode 3
09:02:56.389	ERR	#0	    4	   60	System         	VmmpReadGuestVirtual(fffff880047f4588, a) injecting #PF
09:02:56.389	ERR	#0	    4	   60	System         	VmmpTranslateVirtualAddress: fffff880047f45e0 (R,S) mode 3
09:02:56.389	ERR	#0	    4	   60	System         	VmmpReadGuestVirtual(fffff880047f45e0, a) injecting #PF

@tandasat
Owner

@hzqst Thank you for looking into a correct solution and referring to the sample implementation. I agree that avoiding CR3 switching is the right approach in general.

Let me know if you find anything new. I will continue to look into both a quick fix specific to this issue and your suggestion.

@mgreshis

mgreshis commented Jul 2, 2018

I am hitting a similar issue, but via the kInvalidGuestState (0x21) VM-exit reason instead of an access to/from CR3. The exit is inside one of the new subroutines that Microsoft introduced for the Spectre/Meltdown fixes and involves a move to CR3:

nt!KxIsrLinkageShadow:
fffff803`3cb6ba40 f644241001                     test    byte ptr [rsp+10h], 1
fffff803`3cb6ba45 7440                           je      nt!KxIsrLinkageShadow+0x47 (fffff803`3cb6ba87)
fffff803`3cb6ba47 56                             push    rsi
fffff803`3cb6ba48 0f01f8                         swapgs  
fffff803`3cb6ba4b 65488b342500700000             mov     rsi, qword ptr gs:[7000h]
fffff803`3cb6ba54 650fba24251870000001           bt      dword ptr gs:[7018h], 1
fffff803`3cb6ba5e 7203                           jb      nt!KxIsrLinkageShadow+0x23 (fffff803`3cb6ba63)
fffff803`3cb6ba60 0f22de                         mov     cr3, rsi
fffff803`3cb6ba63 488d742438 ---guest IP ---->   lea     rsi, [rsp+38h]
fffff803`3cb6ba68 65488b242508700000             mov     rsp, qword ptr gs:[7008h]
fffff803`3cb6ba71 ff76f8                         push    qword ptr [rsi-8]
fffff803`3cb6ba74 ff76f0                         push    qword ptr [rsi-10h]
fffff803`3cb6ba77 ff76e8                         push    qword ptr [rsi-18h]
fffff803`3cb6ba7a ff76e0                         push    qword ptr [rsi-20h]
fffff803`3cb6ba7d ff76d8                         push    qword ptr [rsi-28h]
fffff803`3cb6ba80 ff76d0                         push    qword ptr [rsi-30h]
fffff803`3cb6ba83 488b76c8                       mov     rsi, qword ptr [rsi-38h]
fffff803`3cb6ba87 e94405ecff                     jmp     nt!KxIsrLinkage (fffff803`3ca2bfd0)
fffff803`3cb6ba8c c3                             ret     

Here is the state of the VM at the time of the exit:

[Image: VM state at the time of the exit]

The platform is Windows 10, x64. Here's a great write-up that explains the reason for this CR3 move:
https://malwaretips.com/threads/heres-how-the-new-meltdown-patch-for-windows-is-enforced-for-amd-systems.78728/
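
For reference, the kInvalidGuestState (0x21) value mentioned above is the basic exit reason, which occupies the low 16 bits of the VMCS exit-reason field; bit 31 of the same field flags a VM-entry failure. The sketch below decodes such a field. The raw value 0x80000021 is an assumption for illustration (this comment only gives the basic reason), and the names are hypothetical rather than HyperPlatform's.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical decoded view of the 32-bit VM-exit reason field.
struct ExitReason {
  std::uint16_t basic_reason;  // bits 15:0
  bool vm_entry_failure;       // bit 31
};

constexpr ExitReason DecodeExitReason(std::uint32_t raw) {
  return ExitReason{static_cast<std::uint16_t>(raw & 0xFFFF),
                    (raw >> 31) != 0};
}

// Basic exit reason 33 (0x21): VM-entry failure due to invalid guest state.
constexpr std::uint16_t kInvalidGuestStateReason = 0x21;

static_assert(DecodeExitReason(0x80000021u).basic_reason ==
                  kInvalidGuestStateReason,
              "low 16 bits carry the basic reason");
```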

@tandasat
Owner

tandasat commented Jul 2, 2018

@mgreshis Did you try with the latest commit of HyperPlatform? This is likely a separate issue, and a similar issue has already been fixed. If it still reproduces with the latest commit, please file another issue.

@mgreshis

mgreshis commented Jul 2, 2018

@tandasat Apologies, the issue has already been fixed by you. I am on the MemoryMon branch and had to backport. Thanks again.
