Skip to content
This repository has been archived by the owner on Jan 28, 2023. It is now read-only.

Windows10 guest support #204

Merged
merged 3 commits into from
Mar 6, 2020
Merged

Windows10 guest support #204

merged 3 commits into from
Mar 6, 2020

Conversation

nevilad
Copy link
Contributor

@nevilad nevilad commented May 25, 2019

Added more supported features to cpuid and PAT MSR support. This enables to install and run a windows10 x86 guest. Windows10 x64 guests are not yet working, since the amount of CR8 reads\writes is very big and the loading process stucks on handling them.
The patch was tested on a windows 7 and windows 10 guests on a windows10x64 host.
There are open questions:

  1. cr_pat is not read\written to qemu. Thus PAT MSR is not saved\restored from snapshots. I don't know how to add this support, since hax does not provide it's version to qemu, so qemu would not be able to take different actions for different hax versions - read cr_pat from the new version and don't read from the old. The same for CR8 register.
  2. in vcpu.c on line 2020 both the local and the vcpu object versions of entry_ctls are modified. This makes it possible that the check at line 2043 will be false and entry controls are not saved to vmcs. Is this OK?
  3. In cpuid filtering there are such checks:
    if (cpu_has_feature(X86_FEATURE_AESNI)) {
    cpuid_1_features_ecx |= FEATURE(AESNI);
    }
    Is this really necessary, given that the function returns the intersection of cpuid_1_features_ecx and the value returned by host cpuid:
    state->_ecx = (cpuid_1_features_ecx & state->_ecx) | FEATURE(HYPERVISOR);
    ?

core/vcpu.c Outdated
unsigned extFamilyID : 8;
unsigned reserved2 : 4;
};
} cpuid_eax, cpuid_eax2;
Copy link
Contributor

@AlexAltea AlexAltea May 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable cpuid_eax2 is not used anywhere.
Due to the strict compiler flags in the Darwin builds this triggers an error:
https://travis-ci.com/intel/haxm/jobs/203067260#L592

PS: Congrats on getting Windows 10 running, it's a huge step forward! 👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Yes, we have more strict compiler flags enabled on Mac. Unused variables should be removed.

@raphaelning
Copy link
Contributor

Thank you so much for enabling this big feature and submitting this PR!

I'm leaving Intel, and @coxuintel is the new maintainer of HAXM.

Please give Colin some time to finish this review. As my last contribution to the project, let me give a quick response to some of the open questions:

  1. in vcpu.c on line 2020 both the local and the vcpu object versions of entry_ctls are modified. This makes it possible that the check at line 2043 will be false and entry controls are not saved to vmcs. Is this OK?

Yes, it looks like a bug to me. It doesn't make sense to update vmx(vcpu, entry_ctls) before the check. Could you get rid of those questionable writes to vmx(vcpu, entry_ctls), and verify that ENTRY_CONTROL_LOAD_EFER is still enabled in the VMCS entry controls?

  1. [...] Is this really necessary, given that the function returns the intersection of cpuid_1_features_ecx and the value returned by host cpuid [...]

That's also a good catch. The cpu_has_feature() checks do seem redundant.

@AlexAltea
Copy link
Contributor

I'm leaving Intel, and @coxuintel is the new maintainer of HAXM.

@raphaelning Best of luck in your future projects! It was really great contributing to HAXM under your careful supervision.

@nevilad
Copy link
Contributor Author

nevilad commented May 27, 2019

More questions:
4) There was a comment "pat is disabled!" in cpuid filtering function, but the code for handling it's read\writes was not commented out\removed. PAT was disabled from the first public HAXM version, so it is not possible to view the commit message for the commit that disabled it. Where there any problems with it's support?
5) I can't yet find a solution to the CR8 problem. Intel docs saying TPR shadow is a solution, but it requires virtualized APIC. So far I understand, the current APIC virtualization is done through disabling rwx to it's page in EPT and handling all ept violations as reads\writes to APIC. Why was this nonstandard solution used? Am I right, that usind virtualized APIC allows to improve haxm performance, but this feature is not supported by older CPUs?

@raphaelning good luck! Leaving Intel is not an excuse for not contributing ))

@krytarowski
Copy link
Contributor

If you think of virtualized apic as APICv, then this needs very recent Intel x86. Personally, I don't have any hw that handles it..

The solution of handling CR8 is to pass it to qemu and emulate by qemu.

You can check the reference in NVMM from NetBSD as it does exactly this.

@krytarowski
Copy link
Contributor

@raphaelning good luck!

@nevilad
Copy link
Contributor Author

nevilad commented May 28, 2019

@krytarowski NVMM passes CR8 reads\writes to qemu as MMIO and is able to running windows10x64? I implemented such solution but running windows installation get's stuck using 100% of CPU. When NVMM is able to run windows10x64 using this solution, than there must be another reason of problems in my tests.

@krytarowski
Copy link
Contributor

Win10 is known to work, but I have not tested it myself.

http://m00nbsd.net/4e0798b7f2620c965d0dd9d6a7a2f296.html

@nevilad
Copy link
Contributor Author

nevilad commented May 28, 2019

The link contains a snapshot of windows 10 x86 and a comment about x64 support:
"Windows 64bit requires MTRR and MCE, and both are currently masked in NVMM; support for that hasn't yet been committed.". This means they haven't tested it yet, so they can encounter same problems with performance.

@krytarowski
Copy link
Contributor

I see, I haven't tested win10 myself - I'm just aware that CR8 was implemented. From other users of CR8 that I know there is just Solaris.. but more difficult to verify (proprietary and niche OS).

@nevilad
Copy link
Contributor Author

nevilad commented May 28, 2019

Personally, I don't have any hw that handles it..

Intel docs mention "Virtualize APIC accesses" feature. I can't find cpuids or capability MSRs to check to know if the processor supports it. How did you check your hardware for APICv support?

@krytarowski
Copy link
Contributor

https://www.redhat.com/archives/vfio-users/2016-June/msg00055.html here are ways how to detect it on Linux.

@nevilad
Copy link
Contributor Author

nevilad commented May 31, 2019

  1. Without these saves VM triple faults, so I don't delete them.
  2. Done.

@HaxmCI HaxmCI added CI:Build Pass CI:Build Pass CI:Mac Test Pass CI:Mac Test Pass labels Jun 3, 2019
@coxuintel
Copy link
Contributor

Sorry fot the late reply.

I tried to test on Win10 1903 x64 host, boot a Win10 1903 x86 or a Win7 x86 iso with upstream QEMU v4.0.0. However it triggers the IRQL_NOT_LESS_OR_EQUAL BSOD when haxm driver completing an IRP. Did you ever hit such issue?

It looks like your patch did several things, and the v2 solve the compile error on Darwin.

  1. Add FEATURE(PAT), FEATURE(PDPE1GB), FEATURE(LAHF), FEATURE(LZCNT), FEATURE(PREFETCHW).
  2. Add "IA32_PAT" to VM-entry and VM-exit controls.
  3. Re-write some CPUID feature detect code for 0x1 and 0x80000001,
  4. Some cleanup.
    Can I assume that both 1 and 2 are required to boot a x86 Windows ISO?

Overall your patches looks ok, although I didn't yet boot the iso successfully 😢.
I have several opens to discuss:

  • In addition to a single patch, it's preferred to split it to several patches:
    • Re-write CPUID feature detection code.
    • Add the features.
    • Add FEATURE(PAT) and IA32_PAT handling.
  • For the cleanup code, we may defer it so that we can do it together in future.

core/vcpu.c Outdated
@@ -1189,6 +1188,10 @@ void vcpu_save_host_state(struct vcpu_t *vcpu)
vmwrite(vcpu, HOST_EFER, hstate->_efer);
}

if (vmx(vcpu, exit_ctls) & EXIT_CONTROL_LOAD_PAT) {
vmwrite(vcpu, HOST_PAT, ia32_rdmsr(IA32_CR_PAT));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add ia32_pat to hstate so that more debug friendly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What means more debug friendly?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the context I'd say (@coxuintel correct me if I'm wrong) something analogous to the few IA32_EFER-related lines above, i.e. first storing the IA32_CR_PAT value in hstate:

    hstate->_pat = ia32_rdmsr(IA32_CR_PAT);

and then vmwriting directly from that field:

    if (vmx(vcpu, exit_ctls) & EXIT_CONTROL_LOAD_PAT) {
        vmwrite(vcpu, HOST_PAT, hstate->_pat);
    }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly. Then functions like vcpu_get_host_state() could output more information then.

core/vcpu.c Outdated
@@ -2009,11 +2016,12 @@ static void vmwrite_cr(struct vcpu_t *vcpu)
if ((state->_cr4 & CR4_PAE) && (state->_cr0 & CR0_PG) &&
(state->_cr0 & CR0_PE)) {
entry_ctls |= ENTRY_CONTROL_LOAD_EFER;
vmx(vcpu, entry_ctls) |= ENTRY_CONTROL_LOAD_EFER;
vmx(vcpu, entry_ctls) |= ENTRY_CONTROL_LOAD_EFER; //Without that VM triple faults!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the check. We can investigate later why this write is necessary.

Copy link
Contributor

@coxuintel coxuintel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some inline comments in addition to in the PR comments.

core/vcpu.c Outdated
} else {
entry_ctls &= ~ENTRY_CONTROL_LOAD_EFER;
vmx(vcpu, entry_ctls) &= ~ENTRY_CONTROL_LOAD_EFER;
vmx(vcpu, entry_ctls) &= ~ENTRY_CONTROL_LOAD_EFER; //Without that VM triple faults!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

core/vcpu.c Outdated
unsigned reserved2 : 4;
};
} cpuid_eax;
cpuid_eax.raw = state->_eax;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have some alignment issue here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On which platforms? I think it would be better to change unsigned to uint32_t.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, to align with the type of raw could eliminate the reading ambiguity.
And I didn't mean byte alignment here but the format. 😄 There seems an extra space before the piece of code.

core/vcpu.c Outdated
hw_model = (cpuid_eax.extModelID << 4) + cpuid_eax.model;
else
hw_model = cpuid_eax.model;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's much clean now and follow the pseudo code of SDM. 😄

core/vcpu.c Outdated
@@ -78,7 +78,6 @@ static void vcpu_init(struct vcpu_t *vcpu);
static void vcpu_prepare(struct vcpu_t *vcpu);
static void vcpu_init_emulator(struct vcpu_t *vcpu);

static void vmread_cr(struct vcpu_t *vcpu);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this for clean-up? If so, we can do it together in future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, cleanup.

core/vcpu.c Outdated
@@ -3766,7 +3813,7 @@ static int exit_ept_violation(struct vcpu_t *vcpu, struct hax_tunnel *htun)
htun->_exit_reason = vmx(vcpu, exit_reason).basic_reason;

if (qual->ept.gla1 == 0 && qual->ept.gla2 == 1) {
hax_panic_vcpu(vcpu, "Incorrect EPT seting\n");
hax_panic_vcpu(vcpu, "Incorrect EPT setting\n");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do the clean-up work together in future.

@nevilad
Copy link
Contributor Author

nevilad commented Jun 5, 2019

However it triggers the IRQL_NOT_LESS_OR_EQUAL BSOD when haxm driver completing an IRP. Did you ever hit such issue?

BSOD on the host? Never. I used qemu 2.12 and 3.1.

Can I assume that both 1 and 2 are required to boot a x86 Windows ISO?

PAT and LAHF are required, PDPE1GB, LZCNT and PREFETCHW are not, but I included these because they don't need any other support code.

In addition to a single patch, it's preferred to split it to several patches:

OK, but I would wait with the merge till I solve the problem of saving PAT to qemu. Question 1 in my first post.

@coxuintel
Copy link
Contributor

coxuintel commented Jun 6, 2019

However it triggers the IRQL_NOT_LESS_OR_EQUAL BSOD when haxm driver completing an IRP. Did you ever hit such issue?

BSOD on the host? Never. I used qemu 2.12 and 3.1.

I'll investigate this.

Can I assume that both 1 and 2 are required to boot a x86 Windows ISO?

PAT and LAHF are required, PDPE1GB, LZCNT and PREFETCHW are not, but I included these because they don't need any other support code.

Then I would suggest split to 2 patches so that will be helpful to bisect in case some issue that hasn't been found during current test.

  • Re-write CPUID feature detection code, add LAHF, PDPE1GB, LZCNT, PREFETCHW.
  • Add PAT and IA32_PAT VMX handling.

In addition to a single patch, it's preferred to split it to several patches:

OK, but I would wait with the merge till I solve the problem of saving PAT to qemu. Question 1 in my first post.

Oops, missed this one. HAXM has HAX_IOCTL_VERSION and HAX_IOCTL_CAPABILITY which we could leverage to take different actions from QEMU side.

@coxuintel
Copy link
Contributor

However it triggers the IRQL_NOT_LESS_OR_EQUAL BSOD when haxm driver completing an IRP. Did you ever hit such issue?

BSOD on the host? Never. I used qemu 2.12 and 3.1.

The reason seems like IRP sent with some ioctl get corrupted when calling into IoCompleteRequest(). It looks like irrelevant with your implementation, but I hit this every time booting a Windows ISO.
Could you share your complete QEMU boot parameters?

@nevilad
Copy link
Contributor Author

nevilad commented Jun 15, 2019

C:\user\qemux64\qemu-system-i386.exe -L C:\user\qemu\pc-bios\ -name win10x86 -m 2G -uuid 564db62e-e031-b5cf-5f34-a75f8cefa98f -rtc base=localtime -accel hax -hdd C:\VirtualMachines\qemu\win10x86.qcow2 -cdrom C:\user\qemu\Windows10En.iso

@nevilad
Copy link
Contributor Author

nevilad commented Jun 15, 2019

Left questions:

  1. Saving cr_pat/cr8 to/from qemu:
  • set HAX_CUR_VERSION to 5
  • add cr8 to the end of vcpu structure received through ioctl
  • create patch for qemu which increments supported API version, sets/gets cr_pat with other MSRs and add cr8 to it's vcpu structure. When qemu sees haxm of an older version, it will not set cr_pat with other MSRs.
  • when hax works with an older qemu version (no cr8 and cr_pat emulation), it will not save/restore them.
    Thus new functionality will work only, when qemu is aware of them.
  1. There was a comment "pat is disabled!"
    Still no answer.
  2. I can't yet find a solution to the CR8 problem.
    Windows 10 x64 installation enters an eternal cycle. Don't know what it wants, but previous errors where due to missed CPU functionality. Seems that the problem is not in performance due to cr8 load\store vmexits.

@coxuintel
Copy link
Contributor

coxuintel commented Jun 19, 2019

  1. Saving cr_pat/cr8 to/from qemu:
  • set HAX_CUR_VERSION to 5
  • add cr8 to the end of vcpu structure received through ioctl
  • create patch for qemu which increments supported API version, sets/gets cr_pat with other MSRs and add cr8 to it's vcpu structure. When qemu sees haxm of an older version, it will not set cr_pat with other MSRs.
  • when hax works with an older qemu version (no cr8 and cr_pat emulation), it will not save/restore them.
    Thus new functionality will work only, when qemu is aware of them.

Excellent! If you changed some common CPU code in qemu, then there could be more comments from the community. Please check this link for more information:
https://wiki.qemu.org/Documentation/GettingStartedDevelopers

  1. There was a comment "pat is disabled!"
    Still no answer.

I can't find the history why the author left such comment. We could run more tests based on you PAT enhancement, if it runs OK for both upstream QEMU and Android Emulator, it could be safe then.

  1. I can't yet find a solution to the CR8 problem.
    Windows 10 x64 installation enters an eternal cycle. Don't know what it wants, but previous errors where due to missed CPU functionality. Seems that the problem is not in performance due to cr8 load\store vmexits.

If the "nonstandard solution" you mentioned is the case that HAXM handle APIC by disable rwx in page, the reason could be we doesn't support APICv like KVM yet.
What is the specific "CR8 problem" you met in this PR? The performance is too bad, or simply handle CR8 doesn't work at all for 64bit Windows? As @krytarowski mentioned, CR8 is passed to qemu and emulate by qemu. If it's the performance issue, than APICv has to be implemented.
If you disassemble the 64-bit nt kernel, you can see 64-bit Windows IRQL mechanism relies on CR8: (Dump from WinDbg with public symbol)


0: kd> uf nt!KeGetCurrentIrql  
nt!BgpFwGetCurrentIrql:  
fffff802`2b266620 440f20c0        mov     rax,cr8  
fffff802`2b266624 c3              ret

0: kd> uf nt!KzRaiseIrql
nt!KzRaiseIrql:  
fffff802`2b266600 440f20c0        mov     rax,cr8  
fffff802`2b266604 0fb6d1          movzx   edx,cl  
fffff802`2b266607 440f22c2        mov     cr8,rdx  
fffff802`2b26660b 8b15938f5000    mov     edx,dword ptr [nt!KiIrqlFlags (fffff802`2b76f5a4)]  
fffff802`2b266611 85d2            test    edx,edx  
fffff802`2b266613 0f85eb1f1700    jne     nt!KzRaiseIrql+0x172004 (fffff802`2b3d8604) Branch  
fffff802`2b266619 c3              ret  Branch  
fffff802`2b3d8604 f6c201          test    dl,1  
fffff802`2b3d8607 0f840ce0e8ff    je      nt!KzRaiseIrql+0x19 (fffff802`2b266619)  Branch  
fffff802`2b3d860d 80f902          cmp     cl,2  
fffff802`2b3d8610 0f8203e0e8ff    jb      nt!KzRaiseIrql+0x19 (fffff802`2b266619)  Branch  
fffff802`2b3d8616 3c02            cmp     al,2  
fffff802`2b3d8618 0f83fbdfe8ff    jae     nt!KzRaiseIrql+0x19 (fffff802`2b266619)  Branch  
fffff802`2b3d861e 65488b0c2520000000 mov   rcx,qword ptr gs:[20h]  
fffff802`2b3d8627 488b91b8610000  mov     rdx,qword ptr [rcx+61B8h]  
fffff802`2b3d862e f0810a00000100  lock or dword ptr [rdx],10000h  
fffff802`2b3d8635 c3              ret

0: kd> uf nt!KzLowerIrql
nt!KzLowerIrql:
fffff802`2b2d05f0 4053            push    rbx
fffff802`2b2d05f2 4883ec20        sub     rsp,20h
fffff802`2b2d05f6 8b05a8ef4900    mov     eax,dword ptr [nt!KiIrqlFlags (fffff802`2b76f5a4)]
fffff802`2b2d05fc 0fb6d9          movzx   ebx,cl
fffff802`2b2d05ff 85c0            test    eax,eax
fffff802`2b2d0601 0f85f1881000    jne     nt!KzLowerIrql+0x108908 (fffff802`2b3d8ef8)  Branch
fffff802`2b2d0607 0fb6c3          movzx   eax,bl
fffff802`2b2d060a 440f22c0        mov     cr8,rax
fffff802`2b2d060e 4883c420        add     rsp,20h
fffff802`2b2d0612 5b              pop     rbx
fffff802`2b2d0613 c3              ret
fffff802`2b3d8ef8 a801            test    al,1
fffff802`2b3d8efa 0f840777efff    je      nt!KzLowerIrql+0x17 (fffff802`2b2d0607)  Branch
fffff802`2b3d8f00 440f20c0        mov     rax,cr8
fffff802`2b3d8f04 3c02            cmp     al,2
fffff802`2b3d8f06 0f82fb76efff    jb      nt!KzLowerIrql+0x17 (fffff802`2b2d0607)  Branch
fffff802`2b3d8f0c 80fb02          cmp     bl,2
fffff802`2b3d8f0f 0f83f276efff    jae     nt!KzLowerIrql+0x17 (fffff802`2b2d0607)  Branch
fffff802`2b3d8f15 65488b0c2520000000 mov   rcx,qword ptr gs:[20h]
fffff802`2b3d8f1e 488b81b8610000  mov     rax,qword ptr [rcx+61B8h]
fffff802`2b3d8f25 f08120fffffeff  lock and dword ptr [rax],0FFFEFFFFh
fffff802`2b3d8f2c e8eb420d00      call    nt!KiRemoveSystemWorkPriorityKick (fffff802`2b4ad21c)
fffff802`2b3d8f31 90              nop
fffff802`2b3d8f32 e9d076efff      jmp     nt!KzLowerIrql+0x17 (fffff802`2b2d0607)  Branch

Thus any operations on IRQL (which are quite frequent) will cause tons of CR8 access. To handle CR8 correctly, certain IRQs have to be handled as well.

@nevilad
Copy link
Contributor Author

nevilad commented Jun 19, 2019

If you changed some common CPU code in qemu,

No, only i386 hax working code.

We could run more tests based on you PAT enhancement,

OK. I thought, since PAT was conciously disabled, there where issues with some OSes.

What is the specific "CR8 problem" you met in this PR

The problem is that Windows 10 x64 still doesn't work. When I wrote my initial question, I thought the reason was performance - too many vmexits and returns to usermode slowed the execution so much, that it looked like the OS hung. Later investigation showed, that there are many more HPET access vmexits, which return to usermode too, and a cycle in the code, which exit condition I'm still unable to figure out. The last post about CR8 problem was just a status report, that should be read as "Windows 10x64 does not work".

@coxuintel
Copy link
Contributor

If you changed some common CPU code in qemu,

No, only i386 hax working code.

The patch also needs sent to the mailing list, if you wish you enhancement upstreamed. 😄

We could run more tests based on you PAT enhancement,

OK. I thought, since PAT was conciously disabled, there where issues with some OSes.

We will run more tests based on you patch see any unexpected shows up. Currently I can't find any record discussing why PAT disabled in the initial version.

What is the specific "CR8 problem" you met in this PR

The problem is that Windows 10 x64 still doesn't work. When I wrote my initial question, I thought the reason was performance - too many vmexits and returns to usermode slowed the execution so much, that it looked like the OS hung. Later investigation showed, that there are many more HPET access vmexits, which return to usermode too, and a cycle in the code, which exit condition I'm still unable to figure out. The last post about CR8 problem was just a status report, that should be read as "Windows 10x64 does not work".

May be we can take one step at a time and continue the investigation. Enabling 32-bit Win10 is an excellent enhancement. I'm trying to figure out why I always hit an 0xA BSOD although seems like not introduced by your patch, however it blocks me the testing boot a 32-bit Win10. Then our QA colleague could run more tests and merge your PR.

@nevilad
Copy link
Contributor Author

nevilad commented Jun 20, 2019

The version without the patch merged crashes too?

@wcwang
Copy link
Contributor

wcwang commented Feb 27, 2020

I also have questions about some hax supporting code in qemu and want to ask these here.

QEMU develop mailing list is the place that reviewing QEMU HAX code. It is also available to discuss HAX related issues in HAXM GitHub. We did not actively check launchpad regularly.
To submit QEMU patch, follow QEMU code review process here.
For HAX related code review, send patch to wcwang and coxuintel, CC qemu-devel and haxm-team.
For HAX related discussion, send to haxm-team, CC qemu-devel.
For other discussions, follow QEMU community rules.

I've created this bug for qemu - https://bugs.launchpad.net/qemu/+bug/1855617.

Not all registers are saved and restored unless it is necessary, in consideration of both necessity and performance. The officially supported VM is Android Emulator and we have a test suite to cover most of the tests. For upstream QEMU, it works but is not guaranteed to run all OSes without problem.
Back to the mismatched state, have you found which register(s) directly related to the issue you encountered? Will you expect to support all registers to save and restore?

I have added the review comments in the commits inline. You may reference them and adjust some coding styles this time. e.g., space usage, comments style, parentheses alignment, etc. Thanks.

@nevilad
Copy link
Contributor Author

nevilad commented Feb 27, 2020

For HAX related code review, send patch to wcwang and coxuintel, CC qemu-devel and haxm-team.

Maybe submit a patch to qemu MAINTAINERS file and add this info there?

Not all registers are saved and restored unless it is necessary, in consideration of both necessity and performance.

Yes, but immediately before saving register/MSR values to the snapshot, qemu should get actual values from haxm, but does this not.

Back to the mismatched state, have you found which register(s) directly related to the issue you encountered? Will you expect to support all registers to save and restore?

So far I understand, qemu saves CPUArchState to the snapshot and it contains full state (all registers/MSRs).

@wcwang
Copy link
Contributor

wcwang commented Feb 28, 2020

These are values returned by qemu for most processors, so these are tested and proven to work.

Did it have any problem using host returned value actually? If true, is it possible to replace the hard-coded values by the OR-operations of defined macros, rather than the instant numbers?

Version is checked only on cr8 access, add VM_FEATURES_CR8 flag?

Sure, I agree.

Maybe submit a patch to qemu MAINTAINERS file and add this info there?

We have submitted the patch to QEMU MAINTAINERS file.

Yes, but immediately before saving register/MSR values to the snapshot, qemu should get actual values from haxm, but does this not.

To solve the issue in PR #204, you may add those required registers that you find for now. Maybe we will consider to add more specified necessary registers for HAXM according to the feature requirement in future.

@nevilad
Copy link
Contributor Author

nevilad commented Feb 28, 2020

Did it have any problem using host returned value actually? If true, is it possible to replace the hard-coded values by the OR-operations of defined macros, rather than the instant numbers?

There was no problems using host returned values. So far I understand, qemu loads same values to kvm, so it should return these too.
Yes, macros can be done.

@wcwang
Copy link
Contributor

wcwang commented Feb 28, 2020

There was no problems using host returned values. So far I understand, qemu loads same values to kvm, so it should return these too.

Since there is no problem found, we recommended to use host value directly but macros should be okay. Does KVM use instant values, too?

@nevilad
Copy link
Contributor Author

nevilad commented Feb 28, 2020

This is an example from qemu:
struct CPUID2CacheDescriptorInfo cpuid2_cache_descriptors[] = {
[0x06] = { .level = 1, .type = ICACHE, .size = 8 * KiB,
.associativity = 4, .line_size = 32, },
[0x0A] = { .level = 1, .type = DCACHE, .size = 8 * KiB,
.associativity = 2, .line_size = 32, },
[0x1D] = { .level = 2, .type = UNIFIED_CACHE, .size = 128 * KiB,
.associativity = 2, .line_size = 64, },

Copy link
Contributor

@wcwang wcwang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for late review. The round only focused on the coding style and some of suggestions will be up to your customs. Thanks for your update.

@@ -43,6 +43,7 @@ int vcpu_set_msr(struct vcpu_t *vcpu, uint64_t index, uint64_t value);
int vcpu_get_msr(struct vcpu_t *vcpu, uint64_t index, uint64_t *value);
int vcpu_put_fpu(struct vcpu_t *vcpu, struct fx_layout *fl);
int vcpu_get_fpu(struct vcpu_t *vcpu, struct fx_layout *fl);
int get_vcpu_state_size( struct vcpu_t *vcpu );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Rename the function to vcpu_get_state_size()
  • Remove extra spaces around the parentheses
  • Move this line down before vcpu_debug()

@@ -256,6 +257,7 @@ void vcpu_vmwrite_all(struct vcpu_t *vcpu, int force_vtlb_flush);

int vcpu_teardown(struct vcpu_t *vcpu);

int get_vcpu_state_size( struct vcpu_t *vcpu );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same changes apply here.

core/vcpu.c Outdated
@@ -3964,6 +3964,13 @@ static int _copy_desc(segment_desc_t *old, segment_desc_t *new)
return flags;
}

int get_vcpu_state_size( struct vcpu_t *vcpu )
{
if( vcpu->vm->features & VM_FEATURES_CR8 )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Add a space after if
  • Remove extra spaces around parentheses

core/vcpu.c Outdated
@@ -3999,6 +4006,9 @@ int vcpu_get_regs(struct vcpu_t *vcpu, struct vcpu_state_t *ustate)
_copy_desc(&state->_gdt, &ustate->_gdt);
_copy_desc(&state->_idt, &ustate->_idt);

if( vcpu->vm->features & VM_FEATURES_CR8 )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same changes apply here.

core/vcpu.c Outdated
@@ -4124,6 +4134,9 @@ int vcpu_set_regs(struct vcpu_t *vcpu, struct vcpu_state_t *ustate)
VMWRITE_DESC(vcpu, IDTR, state->_idt);
}

if( vcpu->vm->features & VM_FEATURES_CR8 )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

@@ -398,15 +398,15 @@ NTSTATUS HaxVcpuControl(PDEVICE_OBJECT DeviceObject,
}
case HAX_VCPU_GET_REGS: {
struct vcpu_state_t *vc_state;
if(outBufLength < sizeof(struct vcpu_state_t)) {
infret = get_vcpu_state_size(cvcpu);
if(outBufLength < infret) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a space after if.

@@ -84,6 +84,7 @@ struct hstate {
uint64_t dr7;
// CR0
bool cr0_ts;
uint64_t _pat;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you think pat_msr is better?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, there is already _efer named in the same way in the same struct. There are other PATs - in vcpu it is called cr_pat and in hstate_compare it is called pat_msr. It helps to distinct them by names. If it is necessary to use common names, it should be done as a standalone feature for the entire code.

core/vcpu.c Outdated
@@ -3605,6 +3612,15 @@ static int misc_msr_write(struct vcpu_t *vcpu, uint32_t msr, uint64_t val)
return 1;
}

static inline bool pat_valid(uint64_t val)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you think is_pat_valid() is better for bool function?

core/vcpu.c Outdated Show resolved Hide resolved
core/vcpu.c Outdated
@@ -3763,7 +3779,15 @@ static int handle_msr_write(struct vcpu_t *vcpu, uint32_t msr, uint64_t val,
break;
}
case IA32_CR_PAT: {
//Attempting to write an undefined memory type encoding into the PAT causes a
//general-protection (#GP) exception to be generated.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a space after slashes and pay attention to the line wrap to ensure the long line does not exceed column 80.

// Attempting to write an undefined memory type encoding into the
// PAT causes a general-protection (#GP) exception to be generated.

@wayne-ma
Copy link
Contributor

wayne-ma commented Mar 4, 2020

Is it PR #245 for CPUID feature detection code? But I still found some CPUID feature related code sections in commit 69d5462 and acce9c7. An independent feature based commit would be better.

Yes,this PR. It is about refactoring - changing the code to make it easier to understand but not changing it's functionality. The commits in this PR are adding functionality that is needed for windows 10 support. You mean adding all commits in different PRs?

For the feature capability, could you implement the feature flag in include/hax_interface.h, instead of increasing the interface version. And then submit it as an independent commit.

Why does version increment not work? Making such a feature is another PR in itself, I have to rewrite a big part of hax and qemu patches.

For the QEMU side patch, you have to follow the QEMU upstream community process.

I've created this bug for qemu - https://bugs.launchpad.net/qemu/+bug/1855617. It was cc'ed to Colin and Yu but there is no answer for more than two monts. I'm scare that the patch to qemu will be too not answered for several months.
Colin and Yu are @coxuintel and @raphaelning ? I also have questions about some hax supporting code in qemu and want to ask these here.

Could you provide the testing steps to add SCSI disks to guests?

I've tried it now and there was no problem. Must investigate this. I've adding it with these commands:
-device virtio-scsi-pci,id=scsihw0 -device scsi-hd,bus=scsihw0.0,drive=scsi0,id=scsi0 -drive file=C:\VirtualMachines\qemu\win10x86_inst.qcow2,if=none,format=qcow2,cache=writethrough,id=scsi0

This is an empty disk on which the OS gets installed. I think the problem was when I tested it with a disk, that had already an OS installed.
Hi @nevilad , I tried to add SCSI device option when launching Win10 32 guest, however the guest always rebooted with BSOD. Did you also meet the same issue. So far I could NOT bootup the Win1032 guest after adding SCSI. Could you help to check if I use the right options? Or share me your cmd and steps? Thanks.
scsi-reboot

Signed-off-by: Alexey Romko <nevilad@yahoo.com>
Signed-off-by: Alexey Romko <nevilad@yahoo.com>
Signed-off-by: Alexey Romko <nevilad@yahoo.com>
@nevilad
Copy link
Contributor Author

nevilad commented Mar 4, 2020

Hi @wayne-ma, my results are:

  • running Win10 installation without scsi works.
  • running installed version without scsi works.
  • running this installation with scsi ends with inaccessible boot device BSOD.
  • running Win10 installation with scsi works.

Looks like it is not possible to change drive type after installation.

@wayne-ma
Copy link
Contributor

wayne-ma commented Mar 6, 2020

Hi @wayne-ma, my results are:

  • running Win10 installation without scsi works.
  • running installed version without scsi works.
  • running this installation with scsi ends with inaccessible boot device BSOD.
  • running Win10 installation with scsi works.

Looks like it is not possible to change drive type after installation.

Hi @nevilad ,yes, I almost got the same result. Regarding the 3rd * scenario, I noticed that I could boot up the OS after removing this option '-device scsi-hd,bus=scsihw0.0,drive=scsi0,id=scsi0', I am not sure if it is a must for scsi device. Anyway, paste my cmd for your reference,
$qemu-system-x86_64.exe -L C:\Binaries\qemu_pr204 -name win10x86 -cpu Penryn -machine pc-i440fx-3.0 -m 4G -vga std -rtc base=localtime -accel hax -device virtio-scsi-pci,id=scsihw0 -drive file=win1032scsi.qcow2 ’

@wcwang
Copy link
Contributor

wcwang commented Mar 6, 2020

@nevilad, thanks for your update. Could you submit the QEMU patch to the community as well and CC the maintainers as previous list so that we can receive it? Thanks.

@wcwang wcwang merged commit 7f3aaab into intel:master Mar 6, 2020
@nevilad
Copy link
Contributor Author

nevilad commented Mar 6, 2020

I've sent the qemu patch 29.02 to qemu-devel@nongnu.org and CC wenchao.wang@intel.com, colin.xu@intel.com. Did you not receive it?

@nevilad
Copy link
Contributor Author

nevilad commented Mar 6, 2020

This issue can be closed - #59

@nevilad nevilad deleted the win10_support branch March 6, 2020 09:45
@nevilad
Copy link
Contributor Author

nevilad commented Mar 6, 2020

Created an issue for SCSI discussion #270.

@wcwang
Copy link
Contributor

wcwang commented Mar 6, 2020

I've sent the qemu patch 29.02 to qemu-devel@nongnu.org and CC wenchao.wang@intel.com, colin.xu@intel.com. Did you not receive it?

Sure, I received the QEMU patch and will give you feedback soon. Thanks for your notification.

@wcwang
Copy link
Contributor

wcwang commented Mar 10, 2020

@nevilad, after we tested this patch further, we found that the Windows guest could not be launched again. When (a == 4) and (c == 0) in handle_cpuid_virtual(), the log is as below,

haxm_warning: CPUID 4.0, eax 0x1c004121, ebx 0x1c0003f, ecx 0x3f, edx 0x0	
haxm_warning: CPUID 4:0, eax 0x1c004121, ebx 0x1c0003f, ecx 0x3f, edx 0x0 AFTER	

After applying a part of your previous patch as below, the Windows guest can be booted up successfully.

        case 3: {
            state->_eax = state->_ebx = state->_ecx = state->_edx = 0;
            return;
        }
        case 4: {
            if (0 == c) {
                //1 0 01 00001 data cache, level 1, fully associative cache
                state->_eax = 0x121;
                // state->_ebx = 0x1C0003F;
                // state->_ecx = 0x3F;
                // state->_edx = 0x1;
            }
            return;
        }
haxm_warning: CPUID 4.0, eax 0x1c004121, ebx 0x1c0003f, ecx 0x3f, edx 0x0
haxm_warning: CPUID 4:0, eax 0x121, ebx 0x1c0003f, ecx 0x3f, edx 0x0 AFTER

So using the host value directly will not work, at least EAX value needs to change. Could you submit another patch to resolve this issue? You may decide whether to fully inherit the values from QEMU for other cases, but using macros for assignments. Sorry for not finding this issue before merging your patch. Thanks.

@nevilad
Copy link
Contributor Author

nevilad commented Mar 10, 2020

Created #272, this should help when passing host values to the guest. Your cache output says your host has 6 cores and the guest only one.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
CI:Build Pass CI:Build Pass CI:Mac Test Pass CI:Mac Test Pass
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants