Discussion:
2GB memory limit running fsck on a +6TB device
s***@usansolo.net
2008-06-09 17:33:48 UTC
Dear Sirs,

Here's the scenario: a +6TB device on a 3ware 9550SX RAID controller,
running 32-bit Debian Etch with kernel 2.6.25.4 and the default e2fsprogs
version, "1.39+1.40-WIP-2006.11.14+dfsg-2etch1".

Running "tune2fs" returns that filesystem is in EXT3_ERROR_FS state, "clean
with errors":

# tune2fs -l /dev/sda4
tune2fs 1.40.10 (21-May-2008)
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: 7701b70e-f776-417b-bf31-3693dba56f86
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal dir_index filetype needs_recovery
sparse_super large_file
Default mount options: (none)
Filesystem state: clean with errors
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 792576000
Block count: 1585146848

It's a backup storage server with more than 113 million files; this is the
output of "df -i":

# df -i /backup/
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sda4 792576000 113385959 679190041 15% /backup


Running fsck.ext3 or fsck.ext2, I get:

# fsck.ext3 /dev/sda4
e2fsck 1.40.10 (21-May-2008)
Adding dirhash hint to filesystem.

/dev/sda4 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Error allocating directory block array: Memory allocation failed
e2fsck: aborted

Here are some strace excerpts:

================================================================================
gettimeofday({1213032482, 940738}, NULL) = 0
getrusage(RUSAGE_SELF, {ru_utime={0, 0}, ru_stime={0, 16001}, ...}) = 0
write(1, "Pass 1: Checking ", 17Pass 1: Checking ) = 17
write(1, "inode", 5inode) = 5
write(1, "s, ", 3s, ) = 3
write(1, "block", 5block) = 5
write(1, "s, and sizes\n", 13s, and sizes
) = 13
mmap2(NULL, 99074048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
0) = 0x404fa000
mmap2(NULL, 99074048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
0) = 0x46376000
mmap2(NULL, 99074048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
0) = 0x4c1f2000
mmap2(NULL, 198148096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
0) = 0x5206e000
mmap2(NULL, 99074048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
0) = 0x5dd66000
mmap2(NULL, 748892160, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
0) = 0x63be2000
mmap2(NULL, 1866240000, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
-1, 0) = -1 ENOMEM (Cannot allocate memory)
brk(0x77488000) = 0x80ab000
mmap2(NULL, 1866375168, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
-1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap2(NULL, 2097152, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE,
-1, 0) = 0x90615000
munmap(0x90615000, 962560) = 0
munmap(0x90800000, 86016) = 0
mprotect(0x90700000, 135168, PROT_READ|PROT_WRITE) = 0
mmap2(NULL, 1866240000, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
-1, 0) = -1 ENOMEM (Cannot allocate memory)
================================================================================

It appears that fsck is trying to use more than 2GB of memory to store the
inode table relationships. The system has 4GB of physical RAM and 4GB of
swap; is there any way to limit the memory used by fsck, or any other
solution to check this filesystem? Would running fsck from a 64-bit LiveCD
solve the problem?

I also tried the latest stable e2fsprogs release, 1.40.10, and got the same
error :-/

Regards,

--
Santi Saez
Theodore Tso
2008-06-09 21:33:20 UTC
Post by s***@usansolo.net
It's a backup storage server with more than 113 million files.
It appears that fsck is trying to use more than 2GB of memory to store the
inode table relationships. The system has 4GB of physical RAM and 4GB of
swap; is there any way to limit the memory used by fsck, or any other
solution to check this filesystem? Would running fsck from a 64-bit LiveCD
solve the problem?
Yes, running with a 64-bit Live CD is one way to solve the problem.

If you are using e2fsprogs 1.40.10, there is another solution that may
help. Create an /etc/e2fsck.conf file with the following contents:

[scratch_files]
directory = /var/cache/e2fsck

...and then make sure /var/cache/e2fsck exists by running the command
"mkdir /var/cache/e2fsck".

This will cause e2fsck to store certain data structures (the ones that
grow very large on backup servers with vast numbers of hard-linked files)
in /var/cache/e2fsck instead of in memory. This will slow down e2fsck
by approximately 25%, but for large filesystems where you couldn't
otherwise get e2fsck to complete because you're exhausting the 2GB
per-process VM limitation of 32-bit systems, it should allow the check
to run through to completion.

- Ted
s***@usansolo.net
2008-06-10 15:34:35 UTC
Post by Theodore Tso
If you are using e2fsprogs 1.40.10, there is another solution that may
help. Create an /etc/e2fsck.conf file with the following contents:
[scratch_files]
directory = /var/cache/e2fsck
(..)
Post by Theodore Tso
This will cause e2fsck to store certain data structures (the ones that
grow very large on backup servers with vast numbers of hard-linked files)
in /var/cache/e2fsck instead of in memory. This will slow down e2fsck
by approximately 25%, but for large filesystems where you couldn't
otherwise get e2fsck to complete because you're exhausting the 2GB
per-process VM limitation of 32-bit systems, it should allow the check
to run through to completion.
I'm trying fsck.ext3 v1.40.8, backported from Lenny's package to Etch,
instead of v1.40.10, because we have the same scenario on all our backup
servers running BackupPC and the package must be distributed to all of
them. If needed, we can run tests with the latest version ;-)

fsck.ext3 started 4 hours ago and is still in "Pass 1: Checking inodes,
blocks, and sizes"; is that normal given that the filesystem has +113
million inodes?

I will send more info as Ted requested in "Call for testers w/ using
BackupPC" [1], but for now this is the scenario:

- fsck.ext3 is using more than 2GB of memory and no swap; the server has
4GB of physical RAM + 2GB of swap. This is the output of "pmap -d" with
the memory map:

# pmap -d 7014
7014: fsck.ext3 -y /dev/sda4
Address Kbytes Mode Offset Device Mapping
(..)
242fd000 1834768 rw--- 00000000242fd000 000:00000 [ anon ]
942c2000 582604 rw--- 00000000942c2000 000:00000 [ anon ]
(..)

All the output is available at: http://pastebin.com/f67115de2


- Files in "/var/cache/e2fsck" appears that grow very slow, I think, 300Kb
per hour aprox, now that's the size:

# ls -lh /var/cache/e2fsck/
total 170M
-rw------- 1 root root 76M 2008-06-10 17:24
7701b70e-f776-417b-bf31-3693dba56f86-dirinfo-VkmFXP
-rw------- 1 root root 95M 2008-06-10 17:24
7701b70e-f776-417b-bf31-3693dba56f86-icount-YO08bu


- fsck is using 100% of one CPU (it's a dual-processor motherboard);
strace output is available at:

http://pastebin.com/f68389cce


- More info:
* Kernel 2.6.25.4, i686 arch on a Debian Etch box.
* Storage: 3ware 9550SXU-16ML, 5.91TB in a RAID-5 of 14 x 500GB SATA
disks (ST3500630AS), 64kB stripe size (array is in optimal state)


Thanks to everyone for the advice :-)

[1] http://www.redhat.com/archives/ext3-users/2007-April/msg00017.html

--
Santi Saez
Theodore Tso
2008-06-10 18:38:55 UTC
Post by s***@usansolo.net
fsck.ext3 started 4 hours ago and is still in "Pass 1: Checking inodes,
blocks, and sizes"; is that normal given that the filesystem has +113
million inodes?
It depends on a lot of things: how big your files are on average, the
speed of your hard drives, and whether /var/cache/e2fsck is on the same
disk as the partition you are checking or on a separate spindle
(guess which is better :-).

It's always a good idea, when running e2fsck (aka fsck.ext3) directly
and/or on a terminal/console, to include the command-line option "-C 0".
This will display a progress bar, so you can gauge how it is
doing. (0 through 70% is pass 1, which requires scanning the inode
table and following all of the indirect blocks.)
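
For example, invoked directly on the console:

# fsck.ext3 -C 0 /dev/sda4

(With file descriptor 0, the completion bar is drawn on the terminal
itself.)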

- Ted
Santi Saez
2008-06-10 22:24:27 UTC
Post by Theodore Tso
It's always a good idea, when running e2fsck (aka fsck.ext3) directly
and/or on a terminal/console, to include the command-line option "-C 0".
This will display a progress bar, so you can gauge how it is
doing. (0 through 70% is pass 1, which requires scanning the inode
table and following all of the indirect blocks.)
Thanks for the tip! :-)

'/var/cache/e2fsck' is on the _same_ disk; perhaps mounting this directory
via iSCSI, NFS, etc. will improve things. We will try that in another
test.

I have enabled the progress bar by sending the SIGUSR1 signal to the
process, and it's still at 2% ;-(
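
For reference, e2fsck turns the progress bar on when it receives SIGUSR1
(and off again on SIGUSR2), so with the PID from the "pmap -d" output
above this was just:

# kill -USR1 7014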

"scratch_files" directory size is now 251M, it has grown 81MB in the
last 7 hours:

# ls -lh /var/cache/e2fsck/
total 251M
-rw------- 1 root root 112M 2008-06-11 00:09
7701b70e-f776-417b-bf31-3693dba56f86-dirinfo-VkmFXP
-rw------- 1 root root 139M 2008-06-11 00:09
7701b70e-f776-417b-bf31-3693dba56f86-icount-YO08bu

The strace output is the same, and memory usage is also the same.

I will give the process more time, but I think it will take too long to
complete, at least to finish pass 1; perhaps more than 50 hours? Given
that it is only at 2% after 12 hours of running, and pass 1 spans 0%
through 70%, is there any other solution?

Will ext4 solve this problem? I have not tested ext4 yet, but I have
read that it will speed up filesystem checking...

Regards,

--
Santi Saez
Theodore Tso
2008-06-10 23:01:24 UTC
Post by Santi Saez
'/var/cache/e2fsck' is on the _same_ disk; perhaps mounting this directory
via iSCSI, NFS, etc. will improve things. We will try that in another test.
I have enabled the progress bar by sending the SIGUSR1 signal to the
process, and it's still at 2% ;-(
The "scratch_files" directory is now 251M; it has grown 81MB in the last 7
hours.
hmm..... can you send me the output of dumpe2fs /dev/sdXX? You can
run that command while e2fsck is running, since it's read-only. I'm
curious exactly how big the filesystem is, and how many directories
are in the first part of the filesystem.

How big is the filesystem(s) that you are backing up via BackupPC, in
terms of size (megabytes) and files (number of inodes)? And how many
days of incremental backups are you keeping? Also, how often do files
change? Can you give a rough estimate of how many files get modified
per backup cycle?

Thanks,

- Ted
Santi Saez
2008-06-10 23:48:35 UTC
Post by Theodore Tso
hmm..... can you send me the output of dumpe2fs /dev/sdXX? You can
run that command while e2fsck is running, since it's read-only. I'm
curious exactly how big the filesystem is, and how many directories
are in the first part of the filesystem.
Oops... dumpe2fs takes about 3 minutes to complete and generates about a
133MB output file:

dumpe2fs 1.40.8 (13-Mar-2008)
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: 7701b70e-f776-417b-bf31-3693dba56f86
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal dir_index filetype sparse_super
large_file
Default mount options: (none)
Filesystem state: clean with errors
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 792576000
Block count: 1585146848
Reserved block count: 0
Free blocks: 913341561
Free inodes: 678201512
First block: 0
Block size: 4096
Fragment size: 4096
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 16384
Inode blocks per group: 512
Filesystem created: Mon Nov 13 10:12:49 2006
Last mount time: Mon Jun 9 19:37:12 2008
Last write time: Tue Jun 10 12:18:25 2008
Mount count: 37
Maximum mount count: -1
Last checked: Mon Nov 13 10:12:49 2006
Check interval: 0 (<none>)
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 128
Journal inode: 8
Default directory hash: tea
Directory Hash Seed: afabe3f6-4405-44f4-934b-76c23945db7b
Journal backup: inode blocks
Journal size: 32M

Some example output, from groups 0 through 5, is available at:

http://pastebin.com/f5341d121
Post by Theodore Tso
How big is the filesystem(s) that you are backing up via BackupPC, in
terms of size (megabytes) and files (number of inodes)? And how many
days of incremental backups are you keeping? Also, how often do files
change? Can you give a rough estimate of how many files get modified
per backup cycle?
We are backing up several servers, about 15 in this case, each with
60-80GB of data to back up and +2-3 million inodes, keeping 15 days of
incrementals. I think about 2-3% of the files change each day, but I
will ask the backup administrator for more info.

I have found an old doc with some build info for this server; the
partition was formatted with:

# mkfs.ext3 -b 4096 -j -m 0 -O dir_index /dev/sda4
# tune2fs -c 0 -i 0 /dev/sda4
# mount -o data=writeback,noatime,nodiratime,commit=60 /dev/sda4 /backup

I'm going to fetch more info about BackupPC and the backup cycles;
thanks, Ted!!

Regards,

--
Santi Saez
Theodore Tso
2008-06-11 02:18:00 UTC
Post by Santi Saez
Post by Theodore Tso
hmm..... can you send me the output of dumpe2fs /dev/sdXX? You can
run that command while e2fsck is running, since it's read-only. I'm
curious exactly how big the filesystem is, and how many directories
are in the first part of the filesystem.
Oops... dumpe2fs takes about 3 minutes to complete and generates about a
133MB output file.
True, but it compresses well. :-) And aside from the first part
of the dumpe2fs output, the part I was most interested in could have been
summarized by simply doing a "grep directories dumpe2fs.out".

But simply looking at your dumpe2fs output, and taking an average over the
first 6 block groups which you included in the pastebin, I can
extrapolate and guess that you have about 63 million directories, out
of approximately 114 million total inodes (so about 51 million regular
files, nearly all of which have hard link counts > 1). Unfortunately,
BackupPC blows all of our memory-reduction heuristics out of the
water. I estimate you need something like 2.6GB to 3GB of memory
just for these data structures alone (not to mention 94MB for each
inode bitmap, and 188MB for each block bitmap). The good news is
that 4GB of memory should do you --- just. (I'd probably put in a bit
more physical memory just to be on the safe side, or enable swap
before running e2fsck.) The bad news is you really, REALLY need a
64-bit kernel on your system.

Because /var/cache/e2fsck is on the same disk spindle as the
filesystem you are checking, you're probably getting killed on seeks.
Moving /var/cache/e2fsck to another disk partition will help (or,
better yet, a battery-backed memory device), but the best thing you can
do is get a 64-bit kernel and not need to use the auxiliary storage in
the first place.

As far as what advice to give you: why are you running e2fsck? Was
this an advisory thing triggered by the mount count and/or the length of
time between filesystem checks? Or do you have real reason to believe the
filesystem may be corrupt?

- Ted
s***@usansolo.net
2008-06-11 08:14:45 UTC
Post by Theodore Tso
True, but it compresses well. :-) And aside from the first part
of the dumpe2fs output, the part I was most interested in could have been
summarized by simply doing a "grep directories dumpe2fs.out".
:D

"grep directories" is available at:

http://santi.usansolo.net/tmp/dumpe2fs_directories.txt.gz (317K)

Full "dumpe2fs" output compressed is 34M and available at:

http://santi.usansolo.net/tmp/dumpe2fs.txt.gz
Post by Theodore Tso
But simply looking at your dumpe2fs output, and taking an average over the
first 6 block groups which you included in the pastebin, I can
extrapolate and guess that you have about 63 million directories, out
of approximately 114 million total inodes (so about 51 million regular
files, nearly all of which have hard link counts > 1).
# grep directories dumpe2fs.txt | awk '{sum += $7} END {print sum}'
78283294
Post by Theodore Tso
BackupPC blows all of our memory-reduction heuristics out of the
water. I estimate you need something like 2.6GB to 3GB of memory
just for these data structures alone (not to mention 94MB for each
inode bitmap, and 188MB for each block bitmap). The good news is
that 4GB of memory should do you --- just. (I'd probably put in a bit
more physical memory just to be on the safe side, or enable swap
before running e2fsck.) The bad news is you really, REALLY need a
64-bit kernel on your system.
Unfortunately, I have killed the process; in 21 hours only 2.5% of the
fsck was completed ;-(

The 'scratch_files' directory has grown to 311M.

===================================================================
# time fsck -y /dev/sda4
fsck 1.40.8 (13-Mar-2008)
e2fsck 1.40.8 (13-Mar-2008)
Adding dirhash hint to filesystem.

/dev/sda4 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes

/dev/sda4: e2fsck canceled.

/dev/sda4: ***** FILE SYSTEM WAS MODIFIED *****
/dev/sda4: ********** WARNING: Filesystem still has errors **********

real 1303m19.306s
user 1079m58.898s
sys 217m10.130s
===================================================================
Post by Theodore Tso
Because /var/cache/e2fsck is on the same disk spindle as the
filesystem you are checking, you're probably getting killed on seeks.
Moving /var/cache/e2fsck to another disk partition will help (or,
better yet, a battery-backed memory device), but the best thing you can
do is get a 64-bit kernel and not need to use the auxiliary storage in
the first place.
I'm trying a quick test with "mount tmpfs /var/cache/e2fsck -t tmpfs -o
size=2048M", but it appears that will take a long time to complete too,
so the next test will be with a 64-bit LiveCD :)
Post by Theodore Tso
As far as what advice to give you: why are you running e2fsck? Was
this an advisory thing triggered by the mount count and/or the length of
time between filesystem checks? Or do you have real reason to believe the
filesystem may be corrupt?
No, it's not related to the mount count and/or the length of time between
filesystem checks. When booting we get this error/warning:

EXT3-fs warning: mounting fs with errors, running e2fsck is recommended
EXT3 FS on sda4, internal journal
EXT3-fs: mounted filesystem with writeback data mode.

And "tune2fs" returns that ext3 is in "clean with errors" state.. so, we
think that completing a full fsck process is a good idea; what means in
this case "clean with errors" state, running a fsck is not needed?

Thanks again for all the help and advice!!

--
Santi Saez
s***@usansolo.net
2008-06-11 11:51:17 UTC
Post by s***@usansolo.net
Post by Theodore Tso
Because /var/cache/e2fsck is on the same disk spindle as the
filesystem you are checking, you're probably getting killed on seeks.
Moving /var/cache/e2fsck to another disk partition will help (or,
better yet, a battery-backed memory device), but the best thing you can
do is get a 64-bit kernel and not need to use the auxiliary storage in
the first place.
I'm trying a quick test with "mount tmpfs /var/cache/e2fsck -t tmpfs -o
size=2048M", but it appears that will take a long time to complete too,
so the next test will be with a 64-bit LiveCD :)
Note that putting '/var/cache/e2fsck' on a memory filesystem is
approximately 3 times faster ;-)

Some quick tests suggest that e2fsck v1.40.10 is a bit faster than
v1.40.8; does the latest version improve on this? In any case, I finally
had to cancel the process...

# ./e2fsck -nfvttC0 /dev/sda4
e2fsck 1.40.10 (21-May-2008)
Pass 1: Checking inodes, blocks, and sizes
/dev/sda4: e2fsck canceled.


/dev/sda4: ********** WARNING: Filesystem still has errors **********

Memory used: 260k/581088k (183k/78k)

Regards,

--
Santi Saez
Andreas Dilger
2008-06-11 14:59:08 UTC
Post by s***@usansolo.net
Post by s***@usansolo.net
Post by Theodore Tso
Because /var/cache/e2fsck is on the same disk spindle as the
filesystem you are checking, you're probably getting killed on seeks.
Moving /var/cache/e2fsck to another disk partition will help (or,
better yet, a battery-backed memory device), but the best thing you can
do is get a 64-bit kernel and not need to use the auxiliary storage in
the first place.
I'm trying a quick test with "mount tmpfs /var/cache/e2fsck -t tmpfs -o
size=2048M", but it appears that will take a long time to complete too,
so the next test will be with a 64-bit LiveCD :)
Note that putting '/var/cache/e2fsck' on a memory filesystem is
approximately 3 times faster ;-)
...but, isn't the problem that you don't have enough RAM? Using tdb+ramfs
isn't going to be faster than using the RAM directly.

I suspect that the only way you are going to check this filesystem efficiently
is to boot a 64-bit kernel (even just from a rescue disk), set up some swap
just in case, and run e2fsck from there.
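
For example, from the rescue environment, with /dev/sdb1 standing in for
whatever spare device is actually available for swap:

# mkswap /dev/sdb1
# swapon /dev/sdb1
# fsck.ext3 -C 0 /dev/sda4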

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
Bryan Kadzban
2008-06-11 16:49:04 UTC
Post by Andreas Dilger
Post by s***@usansolo.net
Post by s***@usansolo.net
Post by Theodore Tso
Moving /var/cache/e2fsck to another disk partition will help (or,
better yet, a battery-backed memory device), but the best thing you
can do is get a 64-bit kernel and not need to use the auxiliary
storage in the first place.
I'm trying a quick test with "mount tmpfs /var/cache/e2fsck -t tmpfs
-o size=2048M", but it appears that will take a long time to complete
too, so the next test will be with a 64-bit LiveCD :)
Note that putting '/var/cache/e2fsck' on a memory filesystem is
approximately 3 times faster ;-)
...but, isn't the problem that you don't have enough RAM? Using
tdb+ramfs isn't going to be faster than using the RAM directly.
It won't be faster, no, but it will be faster than tdb-on-disk, and much
faster than tdb on the same disk as the one that's being checked.

And it *might* allow e2fsck to allocate all the virtual memory that it
needs, depending on how the tmpfs driver works. If tmpfs uses the same
VA space as e2fsck and the rest of the kernel, then it probably won't
help. But if tmpfs can use a different pool somehow (whether that's
because the kernel uses a different set of pagetables, or whatever),
then it might.
Post by Andreas Dilger
I suspect that the only way you are going to check this filesystem
efficiently is to boot a 64-bit kernel (even just from a rescue disk),
set up some swap just in case, and run e2fsck from there.
And try to run a 64-bit e2fsck binary, too. The virtual address space
usage estimate that someone (Ted?) came up with earlier in this thread
was close to 4G, which means that even with a 64-bit kernel, a 32-bit
e2fsck binary might still run out of virtual address space. (It will
need to map lots of disk, plus any real RAM usage, plus itself and any
libraries. That last bit *might* push it over 4G, depending on how
accurate the estimate of 4G turns out to be.)

The easiest way to do this is probably to run the e2fsck from the LiveCD
itself; don't try to run the 32-bit version that the system has
installed. That version *might* work, but it'll be tight; a 64-bit
version that can use 40-odd bits in its virtual addresses (44? 48? I
think it depends on the exact CPU model -- and the kernel, of course)
will have a *lot* more headroom.
Theodore Tso
2008-06-12 05:24:29 UTC
Post by Andreas Dilger
Post by s***@usansolo.net
Note that putting '/var/cache/e2fsck' on a memory filesystem is
approximately 3 times faster ;-)
...but, isn't the problem that you don't have enough RAM? Using tdb+ramfs
isn't going to be faster than using the RAM directly.
Tmpfs is swap backed, if swap has been configured. So it can help.

Another possibility is to use a statically linked e2fsck, since the
shared libraries chew up a lot of VM address space. But in this
particular case, it probably wouldn't be enough.
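
For instance, the shared libraries that get mapped into e2fsck's address
space can be listed with (assuming the usual Debian path):

# ldd /sbin/e2fsck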

I think the best thing to do in this case is to use a 64-bit kernel and a
64-bit compiled e2fsck binary.

- Ted

Andreas Dilger
2008-06-09 21:50:32 UTC
Post by s***@usansolo.net
Here's the scenario: a +6TB device on a 3ware 9550SX RAID controller,
running 32-bit Debian Etch with kernel 2.6.25.4 and the default e2fsprogs
version, "1.39+1.40-WIP-2006.11.14+dfsg-2etch1".
Running "tune2fs" shows that the filesystem is in the EXT3_ERROR_FS state,
"clean with errors":
# tune2fs -l /dev/sda4
tune2fs 1.40.10 (21-May-2008)
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: 7701b70e-f776-417b-bf31-3693dba56f86
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal dir_index filetype needs_recovery
sparse_super large_file
Default mount options: (none)
Filesystem state: clean with errors
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 792576000
Block count: 1585146848
It's a backup storage server with more than 113 million files; this is the
output of "df -i":
# df -i /backup/
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sda4 792576000 113385959 679190041 15% /backup
# fsck.ext3 /dev/sda4
e2fsck 1.40.10 (21-May-2008)
Adding dirhash hint to filesystem.
/dev/sda4 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
I recall that e2fsck allocates on the order of 3 * block_count / 8 bytes,
and 5 * inode_count / 8 bytes, so in your case this is about:

(3 * 1585146848 + 5 * 792576000) / 8 = 1089790068 bytes = 1.0GB

at a minimum, but my estimates might be incorrect.
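
The same arithmetic, redone on the box itself:

# echo '(3 * 1585146848 + 5 * 792576000) / 8' | bc
1089790068
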
Post by s***@usansolo.net
mmap2(NULL, 99074048, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
0) = 0x404fa000
Judging by the return values of these functions, this is a 32-bit system,
and it is entirely possible that you are exceeding the per-process memory
allocation limit.
Post by s***@usansolo.net
mmap2(NULL, 748892160, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
0) = 0x63be2000
mmap2(NULL, 1866240000, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
-1, 0) = -1 ENOMEM (Cannot allocate memory)
Hmm, it seems a bit excessive to allocate 1.8GB in a single chunk.
Post by s***@usansolo.net
Error allocating directory block array: Memory allocation failed
e2fsck: aborted
This message is a bit tricky to nail down because it doesn't exist anywhere
in the code directly. It is encoded into "e2fsck abbreviations", and
the expansion that is normally in the corresponding comment is different.
It is PR_1_ALLOCATE_DBCOUNT returned from the call chain:
ext2fs_init_dblist->
make_dblist->
ext2fs_get_num_dirs()

which is counting the number of directories in the filesystem and allocating
two 12-byte array elements for each one. This implies you have 77M directories
in your filesystem, or an average of only 10 files per directory?
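
That figure can be read straight off the failed allocation in the strace,
assuming the whole 1866240000-byte mmap is those two 12-byte arrays:

# echo '1866240000 / (2 * 12)' | bc
77760000
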
Post by s***@usansolo.net
It appears that fsck is trying to use more than 2GB of memory to store the
inode table relationships. The system has 4GB of physical RAM and 4GB of
swap; is there any way to limit the memory used by fsck, or any other
solution to check this filesystem?
I don't know offhand how important the dblist structure is, so I'm not
sure if there is a way to reduce the memory usage for it. I believe
that in low-memory situations it is possible to use tdb in newer versions
of e2fsck for the dblist, but I don't know much of the details.
Post by s***@usansolo.net
Would running fsck from a 64-bit LiveCD solve the problem?
Yes, I suspect with a 64-bit kernel you could allocate the full 4GB of RAM
for e2fsck and be able to check the filesystem.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
Carlo Wood
2008-06-09 22:08:56 UTC
Post by Andreas Dilger
Post by s***@usansolo.net
Would running fsck from a 64-bit LiveCD solve the problem?
Yes, I suspect with a 64-bit kernel you could allocate the full 4GB of RAM
for e2fsck and be able to check the filesystem.
We had a similar problem with ext3grep. You have to realize that every
mmap uses memory address space, even if it's a map to disk. Therefore,
on a 32-bit machine, if the total of all normal allocations plus all
simultaneous mmaps exceeds 4GB, then you "run out of memory", even if,
say, only 1GB is really allocated and >3GB of the disk is mmap-ed.

In that case a 64-bit machine would solve the problem, because then all
RAM (2GB, I read in the Subject) can be used for normal allocations
while any disk mmap has a cazillion bytes of address space left for
itself.
--
Carlo Wood <***@alinoe.com>
Theodore Tso
2008-06-09 22:37:36 UTC
Post by Andreas Dilger
This message is a bit tricky to nail down because it doesn't exist anywhere
in the code directly. It is encoded into "e2fsck abbreviations", and
the expansion that is normally in the corresponding comment is different.
ext2fs_init_dblist->
make_dblist->
ext2fs_get_num_dirs()
which is counting the number of directories in the filesystem and allocating
two 12-byte array elements for each one. This implies you have 77M directories
in your filesystem, or an average of only 10 files per directory?
There are a number of backup solutions that use hardlinks to conserve
space between incremental snapshots. So yeah, with these workloads
you'll see something like 80-85M inodes, of which 77M-odd will be
directories. When you combine the vast number of directories used by
these filesystems with the fact that e2fsck tries to optimize memory
use by assuming that on most normal filesystems most files have a link
count of 1 (which is NOT true on these filesystems used for backups),
e2fsck's tricks to optimize for speed by caching information to avoid
re-reading it from disk end up costing a large amount of memory.
Post by Andreas Dilger
I don't know offhand how important the dblist structure is, so I'm not
sure if there is a way to reduce the memory usage for it. I believe
that in low-memory situations it is possible to use tdb in newer versions
of e2fsck for the dblist, but I don't know much of the details.
Yep, please see the [scratch_files] section in e2fsck.conf. It is
described in the e2fsck.conf(5) man page.

- Ted
Andreas Dilger
2008-06-09 22:57:59 UTC
Post by Theodore Tso
Post by Andreas Dilger
I don't know offhand how important the dblist structure is, so I'm not
sure if there is a way to reduce the memory usage for it. I believe
that in low-memory situations it is possible to use tdb in newer versions
of e2fsck for the dblist, but I don't know much of the details.
Yep, please see the [scratch_files] section in e2fsck.conf. It is
described in the e2fsck.conf(5) man page.
Hmm, maybe if the ext2fs_init_dblist() function returns PR_1_ALLOCATE_DBCOUNT,
this should be a user-fixable problem that asks whether the user wants to
use an on-disk tdb file in /var/tmp, and, if the answer is "no", points
them at the right section in /etc/e2fsck.conf?

I don't think it is reasonable to default to using /tmp, because it might
be a RAM-backed filesystem, and I suspect in most cases the root filesystem
will not run out of memory in this way... Even if it fails because /var/tmp
is read-only, or too small, it is no worse off than it is today.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
Greg Trounson
2008-06-10 03:36:52 UTC
...
Post by Andreas Dilger
Post by s***@usansolo.net
Would running fsck from a 64-bit LiveCD solve the problem?
Yes, I suspect with a 64-bit kernel you could allocate the full 4GB of RAM
for e2fsck and be able to check the filesystem.
Couldn't you achieve the same thing just by enabling PAE on your 32-bit kernel?

Greg
Theodore Tso
2008-06-10 13:18:28 UTC
Post by Greg Trounson
...
Post by Andreas Dilger
Post by s***@usansolo.net
Would running fsck from a 64-bit LiveCD solve the problem?
Yes, I suspect with a 64-bit kernel you could allocate the full 4GB of RAM
for e2fsck and be able to check the filesystem.
Couldn't you achieve the same thing just by enabling PAE on your 32-bit kernel?
No, that doesn't increase the amount of address space available to the
user process, which is the limitation here. You can have 16GB of
physical memory, but 2**32 is still 4GB, and the kernel needs address
space too, so userspace will have at most 3GB of space to itself.

- Ted