Causes of and fixes for the "resource deadlock avoided" error on Linux

  1. Causes

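A resource deadlock occurs when two or more processes each hold a resource that another one needs, so none of them can proceed. Linux tries to detect some of these situations at the moment a resource is requested: if granting the request would complete a cycle of waiting processes, the call fails instead of blocking forever. The failure is reported as the error EDEADLK, whose message text is "Resource deadlock avoided". Calls that can return it include fcntl() with F_SETLKW for file record locks and pthread_mutex_lock() on an error-checking mutex.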
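The message text "Resource deadlock avoided" is the C library's description of the error code EDEADLK, so a program can test for the error code rather than match the string. A minimal Python sketch, assuming a Linux system; the helper name is_deadlock_error is hypothetical and only serves the illustration:

```python
import errno
import os


def is_deadlock_error(exc: OSError) -> bool:
    """Return True when an OSError carries the error code EDEADLK."""
    return exc.errno == errno.EDEADLK


# Ask the C library for its description of EDEADLK.
message = os.strerror(errno.EDEADLK)
print(errno.EDEADLK, message)
```

A caller would typically wrap the locking call in try/except OSError and use the helper to decide whether to back off and retry.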

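The kernel log can also carry information about the blocked processes and what they are waiting for. In the sample below, the tasks test (PID 3453) and crasher (PID 3584) have each been blocked for more than 120 seconds: test inside write() and crasher inside fsync(), both waiting in wait_on_page_bit() for page writeback to finish. The last trace is a separate warning that a sleeping function (the mutex_lock() call in ext4_file_mmap()) was called from atomic context: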

[ 1325.502637] INFO: task test:3453 blocked for more than 120 seconds.
[ 1325.502688] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1325.502759] test           D ffffffff8180c540     0  3453      1 0x00000080
[ 1325.502820]  ffff88081926b0a8 0000000000000082 0000000000017840 ffffffff81b4d4c0
[ 1325.502890]  ffff88081926b1d8 ffff88087fd9c600 ffff88081926b010 ffffffff815c3400
[  1325.502946]  ffff88081926b020 ffff88087fd9c680 ffff88081926b020 ffffffff8180c540
[  1325.503009] Call Trace:
[  1325.503042]  [] ? schedule_timeout+0x210/0x210
[  1325.503117]  [] io_schedule+0x5a/0xc0
[  1325.503183]  [] wait_on_page_bit+0xe8/0x100
[  1325.503250]  [] ? wake_bit_function+0x40/0x40
[  1325.503320]  [] wait_on_page_writeback_range+0xf4/0x170
[  1325.503387]  [] ? wake_up_atomic_t+0x30/0x30
[  1325.503454]  [] ? remove_inode_hash+0x2f/0x60
[  1325.503522]  [] filemap_write_and_wait_range+0x3a/0x50
[  1325.503591]  [] generic_file_buffered_write+0xfa/0x240
[  1325.503661]  [] __generic_file_aio_write+0x364/0x490
[  1325.503723]  [] ? do_futex+0x1fb/0x4f0
[  1325.503791]  [] ? __wake_up+0x4f/0x70
[  1325.503853]  [] generic_file_aio_write+0x62/0xc0
[  1325.503917]  [] do_sync_write+0xc0/0x110
[  1325.503982]  [] ? __set_task_blocked+0x3e/0x80
[  1325.504047]  [] vfs_write+0xbd/0x1a0
[  1325.504109]  [] SyS_write+0x7c/0xf0
[  1325.504171]  [] ? audit_syscall_entry+0x2c5/0x300
[  1325.504236]  [] entry_SYSCALL_64_fastpath+0x16/0x75
[  1325.504307] INFO: task crasher:3584 blocked for more than 120 seconds.
[  1325.504360] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  1325.504431] crasher        D ffff8807f73995c0     0  3584      1 0x00000080
[  1325.504489]  ffff88080d0ab6d8 0000000000000046 0000000000017840 ffffffff81b4d4c0
[  1325.504555]  ffff88080d0ab308 ffff88077620ce00 ffff88080d0ab540 ffffffff813526b7
[  1325.504618]  ffff88080d0ab540 ffff88077620ce80 ffff88080d0ab540 ffffffff8180c540
[  1325.504690] Call Trace:
[  1325.504721]  [] ? schedule_timeout+0x210/0x210
[  1325.504807]  [] io_schedule+0x5a/0xc0
[  1325.504877]  [] wait_on_page_bit+0xe8/0x100
[  1325.504961]  [] ? wake_bit_function+0x40/0x40
[  1325.505033]  [] wait_on_page_writeback_range+0xf4/0x170
[  1325.505113]  [] ? wake_up_atomic_t+0x30/0x30
[  1325.505185]  [] ? remove_inode_hash+0x2f/0x60
[  1325.505258]  [] __filemap_fdatawait_range+0x70/0xb0
[  1325.505329]  [] filemap_fdatawait+0x2a/0x30
[  1325.505396]  [] __sync_filesystem+0x7e/0x90
[  1325.505466]  [] sync_filesystem+0x2e/0x40
[  1325.505535]  [] fsync_bdev+0x3d/0x60
[  1325.505600]  [] do_fsync+0x38/0x60
[  1325.505662]  [] SyS_fsync+0xd/0x10
[  1325.505725]  [] entry_SYSCALL_64_fastpath+0x16/0x75

[   23.321994] BUG: sleeping function called from invalid context at /build/linux-hvetwq/linux-4.6.0/fs/ext4/inode.c:1279
[   23.322019] in_atomic(): 1, irqs_disabled(): 1, pid: 313, name: ps
[   23.322046] INFO: lockdep is turned off.
[   23.322068] DEBUG_LOCKS_WARN_ON(lockdep_static_key_disabled(&EXT4_I(inode)->i_mmap_sem.dep_map))
[   23.322093] CPU: 1 PID: 313 Comm: ps Tainted: G        W       4.6.0-1-amd64 #1 Debian 4.6.4-1
[   23.322117] Hardware name: Dell Inc. OptiPlex 9020/0PC5F7, BIOS A24 07/13/2018
[   23.322152]  0000000000000000 0000000084fe243c ffff88011e415c00 ffffffff818165d2
[   23.322177]  ffff88011e415c00 ffff88060fff7140 ffffed00001d17ec 0000000000000001
[   23.322201]  0000000000000000 ffff88011e415da8 ffff88011e415d08 ffffffff812a247a
[   23.322225] Call Trace:
[   23.322282]  [] dump_stack+0x5c/0x78
[   23.322316]  [] ___might_sleep+0xfa/0x130
[   23.322345]  [] __mutex_lock_slowpath+0x190/0x2b0
[   23.322371]  [] ? __mutex_lock_slowpath+0x848/0x2b0
[   23.322398]  [] ? __mutex_lock_slowpath+0x848/0x2b0
[   23.322426]  [] ? prepare_to_wait_event+0x108/0x170
[   23.322453]  [] __mutex_lock_slowpath+0x27a/0x2b0
[   23.322480]  [] mutex_lock+0x24/0x40
[   23.322506]  [] ext4_file_mmap+0x16/0x30
[   23.322532]  [] do_mmap_pgoff+0x29e/0x320
[   23.322557]  [] SyS_mmap_pgoff+0xfa/0x190
[   23.322584]  [] entry_SYSCALL_64_fastpath+0x12/0x75

  2. Solutions

When resolving a resource deadlock, consider the following aspects:

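2.1 Identify the cause of the deadlock

First establish why the deadlock occurs. The error message identifies the process and the resource involved. If the cause is still unclear, trace the process with tools such as strace or gdb.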
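Locating the blocked processes in a long kernel log can be automated. The following sketch extracts the task name, PID, and blocked time from hung-task warnings of the form shown in the sample log above. The regular expression and the function name find_blocked_tasks are assumptions made for this illustration, and the pattern assumes task names without spaces:

```python
import re

# Matches kernel hung-task warnings such as:
# "INFO: task test:3453 blocked for more than 120 seconds."
HUNG_TASK = re.compile(
    r"INFO: task (?P<name>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_blocked_tasks(log_text: str) -> list[tuple[str, int, int]]:
    """Return (task name, pid, seconds) for every hung-task warning."""
    return [
        (m["name"], int(m["pid"]), int(m["secs"]))
        for m in HUNG_TASK.finditer(log_text)
    ]


sample = """\
[ 1325.502637] INFO: task test:3453 blocked for more than 120 seconds.
[ 1325.504307] INFO: task crasher:3584 blocked for more than 120 seconds.
"""
print(find_blocked_tasks(sample))
```

The PIDs found this way are the ones to attach strace or gdb to.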

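2.2 Review the use of shared resources

Deadlocks become likely when several processes compete for the same resource. Review the code and reduce its reliance on shared resources such as global variables, shared memory, and files.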

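2.3 Lock ordering

The order in which locks are acquired determines whether a deadlock can form. If two processes acquire the same locks in different orders, each can end up holding one lock while waiting for the other. Cooperating processes therefore need to agree on a single acquisition order.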
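A fixed acquisition order can be enforced in code by giving every lock a rank and always taking the lower rank first. The sketch below uses Python threads rather than processes for brevity; the Account class, its rank field, and the transfer function are hypothetical names chosen for this illustration:

```python
import threading


class Account:
    """Hypothetical shared resource guarded by its own lock."""

    _next_rank = 0
    _rank_guard = threading.Lock()

    def __init__(self, balance: int) -> None:
        self.balance = balance
        self.lock = threading.Lock()
        with Account._rank_guard:
            # Each account gets a unique rank that defines the lock order.
            self.rank = Account._next_rank
            Account._next_rank += 1


def transfer(src: Account, dst: Account, amount: int) -> None:
    """Move an amount, always taking the lower-ranked lock first."""
    if src is dst:
        return  # nothing to move, and locking the same lock twice would hang
    first, second = sorted((src, dst), key=lambda acc: acc.rank)
    with first.lock, second.lock:
        src.balance -= amount
        dst.balance += amount


def run(src: Account, dst: Account) -> None:
    for _ in range(1000):
        transfer(src, dst, 1)


a, b = Account(100), Account(100)
# The threads name the accounts in opposite order, yet both lock a before b.
t1 = threading.Thread(target=run, args=(a, b))
t2 = threading.Thread(target=run, args=(b, a))
t1.start()
t2.start()
t1.join(timeout=10)
t2.join(timeout=10)
print(a.balance, b.balance)
```

If transfer locked src first and dst second instead, the two threads could each hold one lock and wait for the other indefinitely.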

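2.4 Use finer-grained locks

Cooperating processes need locks to coordinate access to shared resources, but a single coarse lock increases contention and makes it more likely that unrelated operations block one another. Splitting it into finer-grained locks, each protecting a smaller piece of data, can reduce both the contention and the chance of deadlock.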
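One common way to make a lock finer-grained is lock striping: the data is split into shards and each shard has its own lock. The sketch below is an illustration under that assumption, again with threads instead of processes; the class StripedCounterMap and its methods are hypothetical:

```python
import threading


class StripedCounterMap:
    """Map of counters split into shards, each guarded by its own lock."""

    def __init__(self, shards: int = 16) -> None:
        self._locks = [threading.Lock() for _ in range(shards)]
        self._maps = [{} for _ in range(shards)]

    def _index(self, key) -> int:
        # The key's hash selects the shard, so a key always maps to one lock.
        return hash(key) % len(self._locks)

    def increment(self, key, delta: int = 1) -> None:
        i = self._index(key)
        with self._locks[i]:  # only this shard is locked
            self._maps[i][key] = self._maps[i].get(key, 0) + delta

    def get(self, key) -> int:
        i = self._index(key)
        with self._locks[i]:
            return self._maps[i].get(key, 0)


counts = StripedCounterMap()


def worker() -> None:
    for n in range(1000):
        counts.increment(f"player-{n % 10}")


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counts.get("player-0"))
```

Every operation holds exactly one shard lock at a time, so no lock-ordering problem can arise between shards. An operation that needed several shards at once would have to take their locks in a fixed order.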

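  3. Example

**Problem:** The backend of a game server uses a shared map while handling requests, and every request modifies the data in the map. The machine's logs show the message "resource deadlock avoided".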

Resolution steps:

  1. Identify the deadlocked process:

The log shows that the deadlocked process has PID 1000.

[2021-05-23 13:23:35] INFO: task Server_1 (pid=1000) blocked for more than 120 seconds.  (schedule_timeout+0x2a6/0x450)
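
  2. Confirm the cause of the deadlock:

The same log also shows that the deadlocked process was operating on the map data structure.

  3. Check how the map code is covered by locks:

The log shows that the map operations involve I/O, which is where the locking comes from. The map may share one lock with the I/O path, so consider using finer-grained locks.

  4. Replace the map implementation:

Try changing the code to use a different map implementation or a different data type. This may reduce lock contention and the number of cases where locks held by several processes overlap.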
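One possible shape for such a change is a map wrapper whose lock never covers I/O: the lock is held only long enough to copy the data, and the slow write runs on the copy. This is a sketch under assumptions and not the server's actual code. The class PlayerStore, its methods, and the JSON output format are all hypothetical:

```python
import io
import json
import threading


class PlayerStore:
    """Hypothetical shared map whose lock never covers I/O."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._scores: dict[str, int] = {}

    def update(self, player: str, score: int) -> None:
        with self._lock:
            self._scores[player] = score

    def save(self, out) -> None:
        with self._lock:
            snapshot = dict(self._scores)  # quick copy while holding the lock
        # The slow write happens after the lock has been released.
        json.dump(snapshot, out, sort_keys=True)


store = PlayerStore()
store.update("alice", 10)
store.update("bob", 7)
buf = io.StringIO()
store.save(buf)
print(buf.getvalue())
```

Request handlers calling update() then wait only for other short map operations, never for a disk write.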