You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
120 lines
4.4 KiB
120 lines
4.4 KiB
1 year ago
|
From b827e2318ea2bb3eabca13a965c2535a1d7289e5 Mon Sep 17 00:00:00 2001
|
||
|
From: Long Li <leo.lilong@huawei.com>
|
||
|
Date: Fri, 18 Nov 2022 12:23:57 +0100
|
||
|
Subject: [PATCH] xfs: fix sb write verify for lazysbcount
|
||
|
|
||
|
Source kernel commit: 7cecd500d90164419add650e26cc1de03a7a66cb
|
||
|
|
||
|
When lazysbcount is enabled, fsstress and loop mount/unmount test report
|
||
|
the following problems:
|
||
|
|
||
|
XFS (loop0): SB summary counter sanity check failed
|
||
|
XFS (loop0): Metadata corruption detected at xfs_sb_write_verify+0x13b/0x460,
|
||
|
xfs_sb block 0x0
|
||
|
XFS (loop0): Unmount and run xfs_repair
|
||
|
XFS (loop0): First 128 bytes of corrupted metadata buffer:
|
||
|
00000000: 58 46 53 42 00 00 10 00 00 00 00 00 00 28 00 00 XFSB.........(..
|
||
|
00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
|
||
|
00000020: 69 fb 7c cd 5f dc 44 af 85 74 e0 cc d4 e3 34 5a i.|._.D..t....4Z
|
||
|
00000030: 00 00 00 00 00 20 00 06 00 00 00 00 00 00 00 80 ..... ..........
|
||
|
00000040: 00 00 00 00 00 00 00 81 00 00 00 00 00 00 00 82 ................
|
||
|
00000050: 00 00 00 01 00 0a 00 00 00 00 00 04 00 00 00 00 ................
|
||
|
00000060: 00 00 0a 00 b4 b5 02 00 02 00 00 08 00 00 00 00 ................
|
||
|
00000070: 00 00 00 00 00 00 00 00 0c 09 09 03 14 00 00 19 ................
|
||
|
XFS (loop0): Corruption of in-memory data (0x8) detected at _xfs_buf_ioapply
|
||
|
+0xe1e/0x10e0 (fs/xfs/xfs_buf.c:1580). Shutting down filesystem.
|
||
|
XFS (loop0): Please unmount the filesystem and rectify the problem(s)
|
||
|
XFS (loop0): log mount/recovery failed: error -117
|
||
|
XFS (loop0): log mount failed
|
||
|
|
||
|
This corruption will shutdown the file system and the file system will
|
||
|
no longer be mountable. The following script can reproduce the problem,
|
||
|
but it may take a long time.
|
||
|
|
||
|
#!/bin/bash
|
||
|
|
||
|
device=/dev/sda
|
||
|
testdir=/mnt/test
|
||
|
round=0
|
||
|
|
||
|
function fail()
|
||
|
{
|
||
|
echo "$*"
|
||
|
exit 1
|
||
|
}
|
||
|
|
||
|
mkdir -p $testdir
|
||
|
while [ $round -lt 10000 ]
|
||
|
do
|
||
|
echo "******* round $round ********"
|
||
|
mkfs.xfs -f $device
|
||
|
mount $device $testdir || fail "mount failed!"
|
||
|
fsstress -d $testdir -l 0 -n 10000 -p 4 >/dev/null &
|
||
|
sleep 4
|
||
|
killall -w fsstress
|
||
|
umount $testdir
|
||
|
xfs_repair -e $device > /dev/null
|
||
|
if [ $? -eq 2 ];then
|
||
|
echo "ERR CODE 2: Dirty log exception during repair."
|
||
|
exit 1
|
||
|
fi
|
||
|
round=$(($round+1))
|
||
|
done
|
||
|
|
||
|
With lazysbcount is enabled, There is no additional lock protection for
|
||
|
reading m_ifree and m_icount in xfs_log_sb(), if other cpu modifies the
|
||
|
m_ifree, this will make the m_ifree greater than m_icount. For example,
|
||
|
consider the following sequence and ifreedelta is postive:
|
||
|
|
||
|
CPU0 CPU1
|
||
|
xfs_log_sb xfs_trans_unreserve_and_mod_sb
|
||
|
---------- ------------------------------
|
||
|
percpu_counter_sum(&mp->m_icount)
|
||
|
percpu_counter_add_batch(&mp->m_icount,
|
||
|
idelta, XFS_ICOUNT_BATCH)
|
||
|
percpu_counter_add(&mp->m_ifree, ifreedelta);
|
||
|
percpu_counter_sum(&mp->m_ifree)
|
||
|
|
||
|
After this, incorrect inode count (sb_ifree > sb_icount) will be writen to
|
||
|
the log. In the subsequent writing of sb, incorrect inode count (sb_ifree >
|
||
|
sb_icount) will fail to pass the boundary check in xfs_validate_sb_write()
|
||
|
that cause the file system shutdown.
|
||
|
|
||
|
When lazysbcount is enabled, we don't need to guarantee that Lazy sb
|
||
|
counters are completely correct, but we do need to guarantee that sb_ifree
|
||
|
<= sb_icount. On the other hand, the constraint that m_ifree <= m_icount
|
||
|
must be satisfied any time that there /cannot/ be other threads allocating
|
||
|
or freeing inode chunks. If the constraint is violated under these
|
||
|
circumstances, sb_i{count,free} (the ondisk superblock inode counters)
|
||
|
maybe incorrect and need to be marked sick at unmount, the count will
|
||
|
be rebuilt on the next mount.
|
||
|
|
||
|
Fixes: 8756a5af1819 ("libxfs: add more bounds checking to sb sanity checks")
|
||
|
Signed-off-by: Long Li <leo.lilong@huawei.com>
|
||
|
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
|
||
|
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
|
||
|
Signed-off-by: Carlos Maiolino <cem@kernel.org>
|
||
|
Signed-off-by: Pavel Reichl <preichl@redhat.com>
|
||
|
---
|
||
|
libxfs/xfs_sb.c | 4 +++-
|
||
|
1 file changed, 3 insertions(+), 1 deletion(-)
|
||
|
|
||
|
diff --git a/libxfs/xfs_sb.c b/libxfs/xfs_sb.c
|
||
|
index fc33dc4a..d05f0e6e 100644
|
||
|
--- a/libxfs/xfs_sb.c
|
||
|
+++ b/libxfs/xfs_sb.c
|
||
|
@@ -970,7 +970,9 @@ xfs_log_sb(
|
||
|
*/
|
||
|
if (xfs_has_lazysbcount(mp)) {
|
||
|
mp->m_sb.sb_icount = percpu_counter_sum(&mp->m_icount);
|
||
|
- mp->m_sb.sb_ifree = percpu_counter_sum(&mp->m_ifree);
|
||
|
+ mp->m_sb.sb_ifree = min_t(uint64_t,
|
||
|
+ percpu_counter_sum(&mp->m_ifree),
|
||
|
+ mp->m_sb.sb_icount);
|
||
|
mp->m_sb.sb_fdblocks = percpu_counter_sum(&mp->m_fdblocks);
|
||
|
}
|
||
|
|
||
|
--
|
||
|
2.40.0
|
||
|
|