- Cause
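A resource deadlock occurs when two or more processes each hold a resource that another one needs, so that none of them can make progress. Linux does not prevent deadlocks in general, but some locking interfaces check whether granting a request would close a cycle of waiting processes. When such a check fails, the call returns the error code EDEADLK instead of blocking, and the C library reports it as "Resource deadlock avoided". Typical sources are blocking file-lock requests made with fcntl() and error-checking pthread mutexes.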
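The mapping from error code to message can be checked directly. The sketch below assumes a Linux system with glibc, where the text for `EDEADLK` is "Resource deadlock avoided" (other C libraries may word it differently); the helper name `is_deadlock_error` is hypothetical and only illustrates how an application might recognize the error.

```python
import errno
import os


def is_deadlock_error(exc: OSError) -> bool:
    """Return True if the OSError carries the error code EDEADLK."""
    return exc.errno == errno.EDEADLK


# Text the C library associates with EDEADLK (wording depends on the libc).
message = os.strerror(errno.EDEADLK)
print(message)

# Simulate the exception an application would catch from a blocking lock call.
simulated = OSError(errno.EDEADLK, message)
print(is_deadlock_error(simulated))
```

An application that catches this error would typically release the locks it already holds and retry, rather than block again.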
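The kernel log can also help identify which processes are stuck and what they are waiting for. In the example below, the hung-task detector reports that the tasks test (PID 3453) and crasher (PID 3584) have each been blocked for more than 120 seconds. Both call traces end in io_schedule() via wait_on_page_bit(), one reached through SyS_write and the other through SyS_fsync, so both tasks are waiting for page writeback to complete. The last excerpt is a separate warning, "sleeping function called from invalid context", raised when mutex_lock() is reached from ext4_file_mmap() while in atomic context: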
[ 1325.502637] INFO: task test:3453 blocked for more than 120 seconds.
[ 1325.502688] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1325.502759] test D ffffffff8180c540 0 3453 1 0x00000080
[ 1325.502820] ffff88081926b0a8 0000000000000082 0000000000017840 ffffffff81b4d4c0
[ 1325.502890] ffff88081926b1d8 ffff88087fd9c600 ffff88081926b010 ffffffff815c3400
[ 1325.502946] ffff88081926b020 ffff88087fd9c680 ffff88081926b020 ffffffff8180c540
[ 1325.503009] Call Trace:
[ 1325.503042] [] ? schedule_timeout+0x210/0x210
[ 1325.503117] [] io_schedule+0x5a/0xc0
[ 1325.503183] [] wait_on_page_bit+0xe8/0x100
[ 1325.503250] [] ? wake_bit_function+0x40/0x40
[ 1325.503320] [] wait_on_page_writeback_range+0xf4/0x170
[ 1325.503387] [] ? wake_up_atomic_t+0x30/0x30
[ 1325.503454] [] ? remove_inode_hash+0x2f/0x60
[ 1325.503522] [] filemap_write_and_wait_range+0x3a/0x50
[ 1325.503591] [] generic_file_buffered_write+0xfa/0x240
[ 1325.503661] [] __generic_file_aio_write+0x364/0x490
[ 1325.503723] [] ? do_futex+0x1fb/0x4f0
[ 1325.503791] [] ? __wake_up+0x4f/0x70
[ 1325.503853] [] generic_file_aio_write+0x62/0xc0
[ 1325.503917] [] do_sync_write+0xc0/0x110
[ 1325.503982] [] ? __set_task_blocked+0x3e/0x80
[ 1325.504047] [] vfs_write+0xbd/0x1a0
[ 1325.504109] [] SyS_write+0x7c/0xf0
[ 1325.504171] [] ? audit_syscall_entry+0x2c5/0x300
[ 1325.504236] [] entry_SYSCALL_64_fastpath+0x16/0x75
[ 1325.504307] INFO: task crasher:3584 blocked for more than 120 seconds.
[ 1325.504360] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1325.504431] crasher D ffff8807f73995c0 0 3584 1 0x00000080
[ 1325.504489] ffff88080d0ab6d8 0000000000000046 0000000000017840 ffffffff81b4d4c0
[ 1325.504555] ffff88080d0ab308 ffff88077620ce00 ffff88080d0ab540 ffffffff813526b7
[ 1325.504618] ffff88080d0ab540 ffff88077620ce80 ffff88080d0ab540 ffffffff8180c540
[ 1325.504690] Call Trace:
[ 1325.504721] [] ? schedule_timeout+0x210/0x210
[ 1325.504807] [] io_schedule+0x5a/0xc0
[ 1325.504877] [] wait_on_page_bit+0xe8/0x100
[ 1325.504961] [] ? wake_bit_function+0x40/0x40
[ 1325.505033] [] wait_on_page_writeback_range+0xf4/0x170
[ 1325.505113] [] ? wake_up_atomic_t+0x30/0x30
[ 1325.505185] [] ? remove_inode_hash+0x2f/0x60
[ 1325.505258] [] __filemap_fdatawait_range+0x70/0xb0
[ 1325.505329] [] filemap_fdatawait+0x2a/0x30
[ 1325.505396] [] __sync_filesystem+0x7e/0x90
[ 1325.505466] [] sync_filesystem+0x2e/0x40
[ 1325.505535] [] fsync_bdev+0x3d/0x60
[ 1325.505600] [] do_fsync+0x38/0x60
[ 1325.505662] [] SyS_fsync+0xd/0x10
[ 1325.505725] [] entry_SYSCALL_64_fastpath+0x16/0x75
[ 23.321994] BUG: sleeping function called from invalid context at /build/linux-hvetwq/linux-4.6.0/fs/ext4/inode.c:1279
[ 23.322019] in_atomic(): 1, irqs_disabled(): 1, pid: 313, name: ps
[ 23.322046] INFO: lockdep is turned off.
[ 23.322068] DEBUG_LOCKS_WARN_ON(lockdep_static_key_disabled(&EXT4_I(inode)->i_mmap_sem.dep_map))
[ 23.322093] CPU: 1 PID: 313 Comm: ps Tainted: G W 4.6.0-1-amd64 #1 Debian 4.6.4-1
[ 23.322117] Hardware name: Dell Inc. OptiPlex 9020/0PC5F7, BIOS A24 07/13/2018
[ 23.322152] 0000000000000000 0000000084fe243c ffff88011e415c00 ffffffff818165d2
[ 23.322177] ffff88011e415c00 ffff88060fff7140 ffffed00001d17ec 0000000000000001
[ 23.322201] 0000000000000000 ffff88011e415da8 ffff88011e415d08 ffffffff812a247a
[ 23.322225] Call Trace:
[ 23.322282] [] dump_stack+0x5c/0x78
[ 23.322316] [] ___might_sleep+0xfa/0x130
[ 23.322345] [] __mutex_lock_slowpath+0x190/0x2b0
[ 23.322371] [] ? __mutex_lock_slowpath+0x848/0x2b0
[ 23.322398] [] ? __mutex_lock_slowpath+0x848/0x2b0
[ 23.322426] [] ? prepare_to_wait_event+0x108/0x170
[ 23.322453] [] __mutex_lock_slowpath+0x27a/0x2b0
[ 23.322480] [] mutex_lock+0x24/0x40
[ 23.322506] [] ext4_file_mmap+0x16/0x30
[ 23.322532] [] do_mmap_pgoff+0x29e/0x320
[ 23.322557] [] SyS_mmap_pgoff+0xfa/0x190
[ 23.322584] [] entry_SYSCALL_64_fastpath+0x12/0x75
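Reading such a log by hand gets tedious when many tasks are reported. As a minimal sketch, the hung-task lines can be extracted with a regular expression; the names `HungTask` and `find_hung_tasks` are hypothetical, and the pattern only assumes the line format seen in the excerpt above.

```python
import re
from typing import List, NamedTuple


class HungTask(NamedTuple):
    name: str
    pid: int
    seconds: int


# Matches lines such as:
# "[ 1325.502637] INFO: task test:3453 blocked for more than 120 seconds."
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<name>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text: str) -> List[HungTask]:
    """Return every task reported as blocked in the given kernel log text."""
    tasks = []
    for line in log_text.splitlines():
        match = HUNG_TASK_RE.search(line)
        if match:
            tasks.append(
                HungTask(match["name"], int(match["pid"]), int(match["secs"]))
            )
    return tasks


sample = """\
[ 1325.502637] INFO: task test:3453 blocked for more than 120 seconds.
[ 1325.503117] [] io_schedule+0x5a/0xc0
[ 1325.504307] INFO: task crasher:3584 blocked for more than 120 seconds.
"""

for task in find_hung_tasks(sample):
    print(task.name, task.pid, task.seconds)
# prints:
# test 3453 120
# crasher 3584 120
```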
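- Solutions
When addressing a resource deadlock, consider the following:
2.1 Identify the cause of the deadlock
First establish why the deadlock happened. The error output usually points to the processes and resources involved. If the cause is still unclear, trace the processes with tools such as strace or gdb.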
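Before attaching strace or gdb, it can help to know which processes are currently stuck. The sketch below assumes a Linux system with `/proc` mounted and lists processes in the uninterruptible state D, the same state shown for `test` and `crasher` in the log above. The function names `parse_stat` and `uninterruptible_tasks` are hypothetical.

```python
import os
from typing import List, Tuple


def parse_stat(stat_line: str) -> Tuple[int, str, str]:
    """Split a /proc/<pid>/stat line into (pid, command name, state).

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the LAST ')' rather than on spaces.
    """
    left, _, right = stat_line.rpartition(")")
    pid_text, _, comm = left.partition("(")
    state = right.split()[0]
    return int(pid_text), comm, state


def uninterruptible_tasks(proc_root: str = "/proc") -> List[Tuple[int, str]]:
    """List (pid, name) of processes currently in state D (Linux only)."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        path = os.path.join(proc_root, entry, "stat")
        try:
            with open(path, errors="replace") as handle:
                pid, comm, state = parse_stat(handle.read())
        except OSError:
            continue  # the process exited while we were scanning
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    if os.path.isdir("/proc"):
        for pid, comm in uninterruptible_tasks():
            print(pid, comm)
```

A PID found this way can then be passed to `strace -p` or `gdb -p` for closer inspection.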
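2.2 Review the use of shared resources
Deadlocks become likely when several processes compete for the same resource. Review the code and reduce its reliance on shared resources such as global variables, shared memory, and shared files.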
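One way to reduce sharing is to give a resource a single owner and have everyone else send it messages. The sketch below is only an illustration under assumptions: it uses threads rather than processes for brevity, and the names `owner`, `producer`, and `scores` are hypothetical.

```python
import queue
import threading


def owner(requests: queue.Queue, scores: dict) -> None:
    """Sole writer of `scores`; applies updates one at a time."""
    while True:
        item = requests.get()
        if item is None:  # sentinel: no more work
            break
        player, points = item
        scores[player] = scores.get(player, 0) + points


def producer(requests: queue.Queue, player: str) -> None:
    """Sends updates instead of touching the shared dict directly."""
    for _ in range(1000):
        requests.put((player, 1))


requests: queue.Queue = queue.Queue()
scores: dict = {}

worker = threading.Thread(target=owner, args=(requests, scores))
worker.start()

producers = [
    threading.Thread(target=producer, args=(requests, name))
    for name in ("alice", "bob")
]
for thread in producers:
    thread.start()
for thread in producers:
    thread.join()

requests.put(None)  # tell the owner to stop once the queue is drained
worker.join()

print(scores["alice"], scores["bob"])  # prints "1000 1000"
```

Because only the owner thread ever writes to `scores`, the producers need no lock of their own, which removes one place where lock ordering could go wrong.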
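2.3 Lock ordering
The order in which locks are acquired determines whether a deadlock can occur. If two processes acquire the same locks in different orders, each can end up holding one lock while waiting for the other. Processes that cooperate should therefore agree on a single locking order.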
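The ordering rule can be sketched as follows. This is a minimal illustration using threads and a hypothetical `Account` class: every lock gets a fixed rank, and `transfer` always takes the lower-ranked lock first, regardless of the direction of the transfer.

```python
import threading


class Account:
    _next_rank = 0  # hypothetical global ordering key for locks

    def __init__(self, balance: int) -> None:
        self.balance = balance
        self.lock = threading.Lock()
        self.rank = Account._next_rank
        Account._next_rank += 1


def transfer(src: Account, dst: Account, amount: int) -> None:
    """Move `amount` from src to dst, locking in rank order."""
    if src is dst:
        return  # nothing to do, and taking the same lock twice would hang
    first, second = sorted((src, dst), key=lambda account: account.rank)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount


def run(src: Account, dst: Account) -> None:
    for _ in range(10_000):
        transfer(src, dst, 1)


a = Account(100)
b = Account(100)

# The two threads transfer in opposite directions, which would risk a
# deadlock if each simply locked "src first, then dst".
t1 = threading.Thread(target=run, args=(a, b))
t2 = threading.Thread(target=run, args=(b, a))
t1.start()
t2.start()
t1.join()
t2.join()

print(a.balance, b.balance)  # prints "100 100"
```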
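2.4 Reduce lock granularity
Cooperating processes need locks to coordinate access to resources, but every lock is also a source of contention and a potential participant in a deadlock. Using finer-grained locks, each protecting a smaller piece of data, reduces both contention and the chance that unrelated operations end up waiting on each other.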
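One common form of finer-grained locking is lock striping: instead of one lock for a whole map, keys are spread over several shards, each with its own lock. The class below is a hypothetical sketch of that idea using threads, not a drop-in replacement for any particular map.

```python
import threading
from typing import Any, Callable, Hashable


class StripedDict:
    """A dict split into shards, each guarded by its own lock."""

    def __init__(self, stripes: int = 16) -> None:
        self._locks = [threading.Lock() for _ in range(stripes)]
        self._shards = [{} for _ in range(stripes)]

    def _index(self, key: Hashable) -> int:
        return hash(key) % len(self._locks)

    def get(self, key: Hashable, default: Any = None) -> Any:
        i = self._index(key)
        with self._locks[i]:
            return self._shards[i].get(key, default)

    def update(self, key: Hashable, func: Callable[[Any], Any],
               default: Any = None) -> Any:
        """Atomically replace the value for `key` with func(old value).

        `func` must not call back into this StripedDict, because the
        shard lock is held while it runs.
        """
        i = self._index(key)
        with self._locks[i]:
            value = func(self._shards[i].get(key, default))
            self._shards[i][key] = value
            return value


counts = StripedDict(stripes=4)


def worker() -> None:
    for i in range(1000):
        counts.update(f"room-{i % 8}", lambda old: old + 1, default=0)


threads = [threading.Thread(target=worker) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(counts.get("room-0"))  # prints 500
```

Each operation takes exactly one shard lock, so operations on different shards never wait for each other and no operation ever holds two locks at once.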
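- Example
Problem:
The back end of a game server uses a shared map data structure while handling requests
Resolution steps:
- Identify the deadlocked process:
The log shows that the blocked process has PID 1000.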
[2021-05-23 13:23:35] INFO: task Server_1 (pid=1000) blocked for more than 120 seconds. (schedule_timeout+0x2a6/0x450)
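- Confirm the cause of the deadlock:
The same log also indicates that the blocked process was operating on the map data structure.
- Check whether the map code holds overlapping locks:
The log shows that the map operations involve I/O, and that a lock is held while that I/O runs. The map may be sharing one lock with the I/O path, so consider using finer-grained locking.
- Replace the map implementation:
Try changing the original code to use a different map implementation or a different data type. This may reduce lock contention and the number of cases where locks held by different processes overlap.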
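If the map lock is indeed held across I/O, one possible change is to copy the data under the lock and do the slow I/O afterwards. The sketch below is an assumption about what such a fix could look like, not the server's actual code; `PlayerStore` and its methods are hypothetical names.

```python
import io
import json
import threading
from typing import TextIO


class PlayerStore:
    """A shared map whose lock is never held during I/O."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._players: dict = {}

    def set_score(self, name: str, score: int) -> None:
        with self._lock:
            self._players[name] = score

    def save(self, stream: TextIO) -> None:
        # Copy under the lock: fast, touches memory only.
        with self._lock:
            snapshot = dict(self._players)
        # Write outside the lock: a slow disk cannot stall other callers.
        json.dump(snapshot, stream, sort_keys=True)


store = PlayerStore()
store.set_score("alice", 10)
store.set_score("bob", 7)

buffer = io.StringIO()  # stands in for a real file
store.save(buffer)
print(buffer.getvalue())  # prints {"alice": 10, "bob": 7}
```

The trade-off is that the saved snapshot may be slightly out of date by the time it reaches disk, which is usually acceptable for periodic persistence.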