Apple Virtual machine keeps setting itself as read-only #4840
Comments
Do you cleanly shut down the VM every time? |
I always use the "shutdown now" command in the terminal |
What kind of commands do you do that cause it? |
It happens after I finish the setup process (using iso to install and reboot) |
Is there anything interesting/weird in the |
My VM doesn't seem to boot. I will tell you the results after I create one again (I will most likely get the same results, since this is my 7th VM at this point) |
I had / have the same problem. The kernel log shows that the kernel notices an incorrect inode checksum on /dev/vda2 (= the ext4 system partition) and therefore remounts the file system read-only. Shortly after that the kernel oopses (Debian with kernel 5.10.0.158):
Another time the kernel oopsed with this:
I then installed Fedora 37 with kernel 6.0.7, which also got me a read-only file system and this log:
I then rebooted the host system and installed Debian. I was also able to upgrade the kernel. Attached log: com.apple.Virtualization.VirtualMachine_2023-01-13-113856_MacBook-Pro-von-Tim.log |
Any progress on this? |
I don't think this has anything to do with UTM. I see the same VirtIO FS corruption problems in everything using the Apple Virtualization Framework (e.g. Docker). I opened a ticket in the Feedback Assistant, but Apple asks for a reliable way to reproduce the issue, which I haven't found yet. I don't know how many people use UTM with the Apple Virtualization Framework enabled, but I know that a ton of people use Docker on macOS, which sporadically crashes for the same reason for me. As no one else seems to have the same problem there, my current guess is that it's either a faulty macOS installation or faulty hardware. I have no third-party kernel extensions loaded, and programs running in EL0 should not be able to interfere with the hypervisor. Did you try reinstalling macOS? |
I have the same issue with a Fedora 38 arm guest: it works for some time, then the fs becomes read-only and I have to restart it. I will try to get more logs later. One type of message I see is like this:
|
Here is one more case when FS becomes read-only:
And I can't even shut down properly at this point:
|
Same issue here with a Debian Testing aarch64 guest on Apple HV. Consistently triggers multiple times a day during regular use. Sometimes I end up with a corrupted disk and need to fsck from initramfs shell before being able to continue booting where it will then resolve various filesystem inconsistencies. |
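For readers who end up in the same state, here is a sketch of the repair step mentioned above, exercised on a loopback image instead of a real VM disk so it is safe to try anywhere e2fsprogs is installed. `/tmp/vda2.img` is an illustrative stand-in for the corrupted `/dev/vda2`, not a path from this thread.

```shell
# Create a small image file and put a fresh ext4 filesystem on it.
dd if=/dev/zero of=/tmp/vda2.img bs=1M count=8 status=none
mkfs.ext4 -F -q /tmp/vda2.img
# -f: check even if the filesystem is marked clean; -y: apply all fixes
# (on a real VM disk, run this from the initramfs shell or a rescue ISO
# while the filesystem is unmounted)
fsck.ext4 -f -y /tmp/vda2.img
```

On a genuinely corrupted image, fsck.ext4 exits non-zero after correcting errors, which is why the initramfs drops to a shell rather than continuing the boot silently.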
@athre0z I got my Fedora fs corrupted to the point it couldn't boot the graphical interface or install packages. I was able to recover my data through the terminal and a shared directory, so be careful. |
@pisker is probably right, I just saw the same btrfs error in a lima vm using vz on Ventura 13.4 |
FWIW I'm under the impression that it got a lot better with 13.4: I was easily running into ~4 crashes on any given workday previously, whereas now it's more like one crash every two days. That being said, with this kind of spurious bug it's also perfectly possible that it's just chance or a result of a slightly altered workload. Personally I don't really care about FS corruption: everything of value is on a share anyway, and if my VM dies I can spin up a fresh one in 30 minutes. I still prefer the crashy Apple HV with the lightning-fast virtfs share over the qemu HV with the horrible 9p network share. |
I also encountered this using the Apple Virtualization framework. I'm using the VM as a homelab server (basically a container runner) with a bridged interface. I chose the Apple Virtualization framework over qemu because it has better vNIC performance on my 10G NIC. I'm running Rocky Linux 9 and the default format is XFS. When this bug happens, the console logs an error message. The frequency of this bug is low for me, perhaps 1-2 times a month, but it is still annoying to manually reboot the VM once I find my containers are down. So my workaround is a simple Rust program that checks every 10s whether the root is readable, and force-reboots the machine if not. Here's the code:

```rust
use std::fs;
use std::io::{Result, Write};
use std::thread;
use std::time::Duration;

/// Reference: https://www.kernel.org/doc/html/latest/admin-guide/sysrq.html
pub fn force_reboot() -> Result<()> {
    let mut file = fs::File::create("/proc/sysrq-trigger")?;
    file.write_all(b"b")?; // "b": reboot immediately, without syncing or unmounting
    Ok(())
}

fn main() {
    loop {
        // If the root directory can no longer be listed, assume the filesystem
        // is gone and force a reboot; ignore errors from the trigger itself.
        if fs::read_dir("/").is_err() {
            let _ = force_reboot();
        }
        thread::sleep(Duration::from_secs(10));
    }
}
```

Hope this could help someone who also runs a server and has the same problem. |
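A complementary liveness check, as a minimal sketch (`is_readonly` and the sample-file parameter are illustrative names, not from the thread): instead of probing with a directory read, parse the mount options directly, since the kernel's emergency remount shows up as `ro` in `/proc/mounts`.

```shell
# Report whether a mount point carries the "ro" flag in a mounts table.
# $1: mount point; $2: mounts table to read (defaults to /proc/mounts).
is_readonly() {
    awk -v mp="$1" '
        $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro") { print "yes"; exit }
            print "no"; exit
        }' "${2:-/proc/mounts}"
}

# Demo against a sample table resembling a remounted-ro root from the reports:
printf '/dev/vda2 / ext4 ro,relatime 0 0\n' > /tmp/mounts.sample
is_readonly / /tmp/mounts.sample   # prints "yes"
```

On a live guest, `is_readonly /` (reading the real `/proc/mounts`) could replace the `read_dir` probe in the watchdog loop; it distinguishes a read-only remount from other I/O failures.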
Similar here, this time with a Fedora 39 beta guest. Using Apple Virtualization I get these memory corruption failures (and then other I/O failures). Fedora (albeit 38) is running fine under QEMU with UTM. I'd also previously hit an exception constantly with a RHEL 9.x guest – again, only with Apple Virtualization. |
I think by now there might be enough reports to assume that there's really a bug somewhere, and that all kinds of distros are affected (NixOS here ;), so adding yet another one seems to have diminishing returns in terms of information gained. From the reports here it seems the oldest kernel explicitly mentioned was 5.10.0.158; I have myself encountered it multiple times on 6.1.*, 6.1.54 at the moment (using both XFS and ext4). |
I can report that on macOS 13.6, M2 Mini Pro 16G, the filesystem error occurs frequently. |
My experience was the same until upgrading the Linux kernel after reading this: https://www.techradar.com/news/linux-kernel-62-is-here-and-it-now-has-mainline-support-for-apple-m1-chips. On M1, Sonoma, using Ubuntu 23.10 (Mantic) with Linux kernel 6.5, it is reasonably stable. Unfortunately, once there is one disk corruption it is difficult to know whether subsequent issues are totally fresh or a consequence of the first one. I regularly boot into an attached ISO and run fsck while the main VM is unmounted. journalctl reports NULL pointer and EXT4 issues from time to time, but it is usable. You can get the latest Ubuntu from https://cdimage.ubuntu.com |
Is it just me, or does deactivating ballooning solve the problem? |
In my experience this issue is mostly related to “something” that leads to a kernel oops and a filesystem error. ext4 is also affected. I have tried “every” virtualization platform that relies on AVF and the bug is there, always. Docker seems quite stable because of the Linux kernel it uses (a 5.* version, not 6.*). It seems to me that they've tried and tested kernels until they found a stable one. In the end, it looks like it's a combination of AVF + Linux kernel version, so the solution may be on Apple's or Linux's side… or both! |
Just in case you haven't come across it, AsahiLinux are working on the linux kernel specifically for Apple silicon purposes. The specific features they are working on are listed here. Their work finds its way into the kernel. They also provide a downloadable dual-boot solution. |
Great find @gnattu! I can confirm it works for me, and I see about a 5-10% performance penalty. I've found something else that fixes this as well. I was playing with Apple's GUI Linux VM demo, which uses virtio, and I noticed you can control the caching mode of the attached disk images:

```swift
VZDiskImageStorageDeviceAttachment(
    url: URL(fileURLWithPath: path),
    readOnly: false,
    cachingMode: .uncached,
    synchronizationMode: .full)
```

Caching mode: the difference between these options comes down to whether the host macOS uses its memory to cache access to the disk image or not. Synchronization mode – I've tried all 3 values with caching set to NVMe isn't affected by caching mode – it works well with all 3 caching options. I'll report all this to Apple through the Feedback Assistant. Maybe one day somebody will take a look at it and fix it. I've attached my report in case anybody would like to do the same to put a bit of pressure. |
@wdormann Just to make sure, did you perform stress test on 6.7 kernel also? I had dev fedora running for more than 2 weeks and haven't seen disk corruption in that one. If i use stable one, it happens almost immediately to me. |
Agreed, for some reason this issue manifested for me only recently and I don’t recall updating anything. I’ve tried running NixOS with linuxPackages_6_7 (from nixos-unstable branch at 5a09cb4b393d58f9ed0d9ca1555016a8543c2ac8) and it does not fix the issue. Running with #5919 merged, I haven’t experienced any corruption yet, although the performance impact is significant. Before switching to NVMe, btrfs scrub speed was around ~1.6 GiB/s (compared to 2–3 GiB/s with the same raw image under QEMU). Now it’s around 0.5 GiB/s at best. |
We have a new situation. Someone is encountering outright NVMe I/O errors on Debian with macOS 14.2 running the 6.1 kernel: [link to the comment]. I couldn't reproduce this in my current environment (Fedora with a 6.5 kernel on macOS 14.2) using the stress-ng method. If anyone has experienced this since macOS 14.2, please share your environment details. |
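The reproduction attempts above use stress-ng. As a rough, self-contained stand-in for its disk stressor, here is a sketch of sustained fsynced writes in Rust; `hammer`, the path, and the sizes are all illustrative choices, not from this thread.

```rust
use std::fs::OpenOptions;
use std::io::Write;

/// Write `rounds` MiB-sized chunks to `path`, fsyncing after each chunk,
/// and return the total number of bytes written.
fn hammer(path: &str, rounds: usize) -> std::io::Result<u64> {
    let mut file = OpenOptions::new().create(true).write(true).open(path)?;
    let chunk = vec![0xA5u8; 1 << 20]; // 1 MiB of patterned data
    let mut written = 0u64;
    for _ in 0..rounds {
        file.write_all(&chunk)?;
        file.sync_all()?; // force the data through the page cache to the disk
        written += chunk.len() as u64;
    }
    Ok(written)
}

fn main() -> std::io::Result<()> {
    // Small demo run; a real reproduction attempt would loop much longer.
    let written = hammer("/tmp/stress.bin", 8)?;
    println!("wrote {written} bytes");
    Ok(())
}
```

A real attempt would run this for hours inside the guest, possibly from several threads, since the corruption reportedly only shows up under sustained load.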
I'm using UTM 4.4.4 patched with #5919 on Sonoma 14.2.1; the VM is Fedora 38 with a 6.6.7 kernel, using an NVMe disk. It has worked well so far, but today I got an ext4 corruption:
Now the system doesn't boot and wants me to force an fsck. The corruption apparently manifested after a resume (of the host). |
We have been using a build based on gnattu's patch for 2 weeks now, but we experienced exactly the same as alanfranz: |
Is the cachingMode setting available via the UTM UI if I want to try this out? |
No, it is not. But it is chosen automatically if you switch off NVMe. |
I've got a response from Apple about this problem through the Feedback Assistant. They claim this problem should have been fixed in macOS 14.4, though it isn't mentioned in the release notes. I'm currently not using UTM, so it would be great if somebody else could verify whether Virtio in uncached mode no longer produces disk corruption. |
Can confirm it still happens in my case. I can't even install Kali Linux. It crashes while writing data to disk. Tested with macOS 14.4, UTM 4.5, Kali 2024.01 on a MacBook Pro M1 Pro. |
So is there any progress on this issue? I'm running into this as well: I'm actually unable to complete a Debian install because it freezes before it's finished due to this read-only disk issue. UTM 4.4.5 on an M2 MacBook Pro, macOS 14.4 |
The only known solution at this point is to build UTM yourself from the PR #5919 |
I just built UTM from that PR and it actually does not fix the issue for me. I am still unable to complete a Debian install as it randomly freezes from time to time. I have verified that it is using the NVMe interface. |
How about Virtio (NVMe disabled)? It should work because |
FWIW - I am using tart in the meantime. Gist here. |
Aha, that does indeed work! A custom build from #5919 with NVMe disabled makes the machine live happily! Thanks for the tip. Now I can go get a refund on Parallels ;) |
UTM Version 4.1.2 (Beta)
Ubuntu Version: 23.04 (Lunar Lobster)
Apple Virtualisation with Rosetta 2 enabled
None of the disks are set as "read only" inside UTM.
It sometimes works for seconds, sometimes for minutes, but it always happens.
The error usually reads something like "error: read-only file system", but it varies from command to command.