Leapdragon 2016 - Aron Hsiao Was Here

resize2fsck and what not to do with your volumes

So I have a 2TB RAID with 1TB stored on it. I wanted to clear some space for a Windows-accessible NTFS backup partition, so I ran an fsck on the device and then decided I’d use resize2fs, which I’ve never used before, to resize the filesystem downward. I expected it to run for a few hours.

Instead, 12 hours later it was still cranking with intense disk activity. 24 hours later, the disk activity had slowed to a smattering of flashes, but the process was pegging the CPU at full load. Sometime between 24 and 36 hours in, X hung rock solid, locking me out of my desktop (OOM? Who knows?), so I was forced to log in via SSH from my iPhone to monitor progress. Together, iostat and top showed an overloaded CPU and infrequent bursts (once every ten or fifteen minutes) of 10 KB or 20 KB of reading or writing.

I sent SIGSTOP to the Xorg process to hopefully head off any crashing it was about to do (which would have taken my non-nohup’ed resize2fs process with it, and trashed the filesystem completely along with all of my data). I used vbetool over SSH to turn my LCD backlight off and finally give my display a break from showing a hung desktop (power management fled when X hung). Then there was nothing to do but wait until I just couldn’t wait any more for the resize2fs process to complete.
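For what it's worth, one way to keep a long job from being tied to the desktop session is to start it in its own session with its output sent to a log file. Here is a minimal Python sketch of that idea, assuming a POSIX system. The name `launch_detached` and the log path are made up for illustration, and this is not what was actually run here.

```python
import subprocess

def launch_detached(cmd, log_path):
    """Start cmd in its own session so that a dying terminal or X session
    does not deliver SIGHUP to it. Output is appended to log_path."""
    log = open(log_path, "ab")
    try:
        return subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,   # no controlling terminal input
            stdout=log,
            stderr=subprocess.STDOUT,   # keep errors in the same log
            start_new_session=True,     # setsid() in the child (POSIX only)
        )
    finally:
        log.close()  # the child keeps its own copy of the descriptor
```

A call would look something like `launch_detached(["some-long-command", "arg"], "/tmp/job.log")`, with the command list and log path as placeholders. Running the job under nohup or inside a screen session from a text console gets a similar result without any Python.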

40 hours in, when I woke up this morning, it was still in the same state, moving 5-10 KB every 10-15 minutes and pegging at least one core all of the time. I didn’t know if it was spinning uselessly or what. The iostat statistics showed about two-thirds of a terabyte read from and about half a terabyte written to the device. I decided I’d wait until tonight at the latest (approaching 60 hours being ridiculous territory for a filesystem resize) before giving up.
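To put those iostat totals in perspective, here is a rough back-of-the-envelope calculation of the average throughput. It assumes decimal terabytes, a 40-hour window, and that the totals reflect only the resize, none of which is guaranteed.

```python
# Average throughput implied by the iostat totals quoted above.
# Assumptions: decimal units, 40 hours elapsed, totals cover only the resize.

TB = 10**12                # bytes in a decimal terabyte (assumed unit)
elapsed_s = 40 * 3600      # 40 hours in seconds

bytes_read = (2 / 3) * TB  # "about two-thirds of a terabyte" read
bytes_written = 0.5 * TB   # "about half a terabyte" written

def mb_per_s(nbytes, seconds):
    """Average rate in decimal megabytes per second."""
    return nbytes / seconds / 10**6

read_rate = mb_per_s(bytes_read, elapsed_s)
write_rate = mb_per_s(bytes_written, elapsed_s)

print(f"average read:  {read_rate:.2f} MB/s")   # 4.63 MB/s
print(f"average write: {write_rate:.2f} MB/s")  # 3.47 MB/s
```

A few megabytes per second averaged over the whole run is a small fraction of what a disk can stream sequentially, which fits with the CPU being pegged while the disk mostly sat idle toward the end.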

About 10 minutes ago, I logged in again via SSH on my iPhone to find that the resize2fs process was finally gone and CPU load was nil. Crazy with an alternating mixture of relief and dread, I rebooted, immediately logged into my desktop, started a terminal, and tried to run e2fsck on /dev/sdc1. I got:

e2fsck 1.41.9 (22-Aug-2009)
/dev/sdc1 is mounted.

WARNING!!! Running e2fsck on a mounted filesystem may cause
SEVERE filesystem damage.

Do you really want to continue (y/n)? no

The damned thing had been automounted by the Fedora desktop and was apparently fine. Sure enough, all data appears to be there. I unmounted it and am running a forced e2fsck to make sure that all is well.

THE POINT(S)?

– A terabyte is still a hell of a lot of data, even at today’s speeds

– Fedora 12 and its X environment are far from stable, especially under load

– resize2fs is a resource hog

– Manpages should give use characteristics and tool behavior, not just instructions

– 48 hours is longer than I ever want to wait for a risky, critical process again

– Don’t be cheap and try to squeeze in more; just buy more drives