<2024年4月>
31123456
78910111213
14151617181920
21222324252627
2829301234
567891011

文章分类

导航

订阅

Ubuntu内核调试要点(下)——常见问题

  前两讲我们分别介绍了安装调试符号和源文件,以及设置目标机和开始调试会话,这一讲我们介绍一下大家可能遇到的两个常见问题:架构不匹配和内核地址错位。

   架构不匹配

   如果目标机是32位的,那么一般没有问题,如果是64位的,那么主机端的GDB可能误以为对方是32位的,表现出的症状就是找不到函数的符号(因为错把64位的地址当作32位,只取一半),如果观察寄存器,那么也都不对,比如:

(gdb) info registers     
    eax            0x1b 27
    ecx            0x0 0
    edx            0x67 103
    ebx            0x0 0
    esp            0x6 0x6 <irq_stack_union+6>
    ebp            0x0 0x0 <irq_stack_union>
    esi            0x0 0
    edi            0x0 0
    eip            0x246 0x246 <irq_stack_union+582>
    eflags         0x0 [ ]
    cs             0x3f40dc80 1061215360
    ss             0xffff96dd -26915
    ds             0x1e5be38 31833656
    es             0xffffaf96 -20586
    fs             0x1e5be38 31833656
    gs             0xffffaf96 -20586

  仔细观察上面的寄存器内容,对于熟悉x86架构的读者,很容易看出很多异常,比如段寄存器都是16位,数值应该很小,可上面显示的却很大,再比如程序指针寄存器(eip)的值也太小。这都是因为GDB把64位的寄存器上下文强行按32位来对待了。

  有人说,GDB真的这么傻么?真的如此。为了了解GDB里的原委,老雷还特意启动第二个GDB,以GDB调试GDB,截图如下:


  上图左侧是做内核调试的GDB,右侧便是调试左侧GDB的GDB。 

   当我们执行target remote命令时,GDB就会初始化一个gdbarch实例,而此时还未与目标机建立连接,不知道对方是32位还是64位,所以此时GDB是始终初始化一个32为的gdbarch,过程如下:    

#0  i386_gdbarch_init (info=..., arches=0x1240c00) at /build/gdb-9un5Xp/gdb-7.11.1/gdb/i386-tdep.c:8255
#1  0x00000000005e218f in gdbarch_find_by_info (info=...) at /build/gdb-9un5Xp/gdb-7.11.1/gdb/gdbarch.c:5100
#2  0x00000000005e2b71 in gdbarch_update_p (info=...) at /build/gdb-9un5Xp/gdb-7.11.1/gdb/arch-utils.c:522
#3  0x00000000006bb2c3 in target_clear_description () at /build/gdb-9un5Xp/gdb-7.11.1/gdb/target-descriptions.c:400
#4  0x0000000000600637 in target_pre_inferior (from_tty=from_tty@entry=1) at /build/gdb-9un5Xp/gdb-7.11.1/gdb/target.c:2161
#5  0x0000000000601aad in target_preopen (from_tty=from_tty@entry=1) at /build/gdb-9un5Xp/gdb-7.11.1/gdb/target.c:2220
#6  0x00000000004c6b73 in remote_open_1 (name=0x1082e7e "/dev/ttyS0", from_tty=1, target=0xc44040 <remote_ops>, extended_p=0)
    at /build/gdb-9un5Xp/gdb-7.11.1/gdb/remote.c:4886
#7  0x00000000005f4538 in open_target (args=0x1082e7e "/dev/ttyS0", from_tty=1, command=<optimized out>)
    at /build/gdb-9un5Xp/gdb-7.11.1/gdb/target.c:356
#8  0x000000000069dbc6 in execute_command (p=<optimized out>, p@entry=0x1082e70 "", from_tty=1) at /build/gdb-9un5Xp/gdb-7.11.1/gdb/top.c:475
#9  0x00000000005d490c in command_handler (command=0x1082e70 "") at /build/gdb-9un5Xp/gdb-7.11.1/gdb/event-top.c:491
#10 0x00000000005d4fef in command_line_handler (rl=<optimized out>) at /build/gdb-9un5Xp/gdb-7.11.1/gdb/event-top.c:690
#11 0x00007f9a096586f5 in rl_callback_read_char () from /lib/x86_64-linux-gnu/libreadline.so.6
#12 0x00000000005d4969 in rl_callback_read_char_wrapper (client_data=<optimized out>) at /build/gdb-9un5Xp/gdb-7.11.1/gdb/event-top.c:171
#13 0x00000000005d49b3 in stdin_event_handler (error=<optimized out>, client_data=0x0) at /build/gdb-9un5Xp/gdb-7.11.1/gdb/event-top.c:430
#14 0x00000000005d3795 in gdb_wait_for_event (block=block@entry=1) at /build/gdb-9un5Xp/gdb-7.11.1/gdb/event-loop.c:834
#15 0x00000000005d3939 in gdb_do_one_event () at /build/gdb-9un5Xp/gdb-7.11.1/gdb/event-loop.c:323
#16 0x00000000005d3a7e in start_event_loop () at /build/gdb-9un5Xp/gdb-7.11.1/gdb/event-loop.c:347
#17 0x00000000005cd443 in captured_command_loop (data=data@entry=0x0) at /build/gdb-9un5Xp/gdb-7.11.1/gdb/main.c:318
#18 0x00000000005ca25d in catch_errors (func=func@entry=0x5cd430 <captured_command_loop>, func_args=func_args@entry=0x0, 
    errstring=errstring@entry=0x7abfeb "", mask=mask@entry=RETURN_MASK_ALL) at /build/gdb-9un5Xp/gdb-7.11.1/gdb/exceptions.c:240
#19 0x00000000005ce036 in captured_main (data=data@entry=0x7ffec93a15d0) at /build/gdb-9un5Xp/gdb-7.11.1/gdb/main.c:1157
#20 0x00000000005ca25d in catch_errors (func=func@entry=0x5cd990 <captured_main>, func_args=func_args@entry=0x7ffec93a15d0, 
    errstring=errstring@entry=0x7abfeb "", mask=mask@entry=RETURN_MASK_ALL) at /build/gdb-9un5Xp/gdb-7.11.1/gdb/exceptions.c:240
#21 0x00000000005ce90b in gdb_main (args=args@entry=0x7ffec93a15d0) at /build/gdb-9un5Xp/gdb-7.11.1/gdb/main.c:1165
#22 0x000000000045ecd5 in main (argc=<optimized out>, argv=<optimized out>) at /build/gdb-9un5Xp/gdb-7.11.1/gdb/gdb.c:32

  GDB虽然代码量不小,但是设计逻辑还是很清楚的,一个current_inferior_全局变量用来记录当前的被调试对象。在GDB中,调试对象被统称为inferior,是下属晚辈的意思,老雷将其翻译为“下程”(下属程序之意)。

  在current_inferior_结构体中,有一个gdbarch成员,就是用来记录当前调试目标的架构的(32/64位之类),即:

(gdb) p *current_inferior_
$20 = {next = 0x0, num = 1, pid = 42000, fake_pid_p = 1,
  highest_thread_num = 384, control = {stop_soon = STOP_QUIETLY_REMOTE},
  removable = 0, aspace = 0x22d3fc0, pspace = 0x11992b0, args = 0x0,
  argc = 0, argv = 0x0, terminal = 0x0, environment = 0x11c9780,
  attach_flag = 0, vfork_parent = 0x0, vfork_child = 0x0,
  pending_detach = 0, waiting_for_vfork_done = 0, detaching = 0,
  continuations = 0x0, needs_setup = 0, priv = 0x0, has_exit_code = 0,
  exit_code = 0, symfile_flags = 0, tdesc_info = 0x11f2e30,
  gdbarch = 0x1242b80, registry_data = {data = 0x123d160, num_data = 7}}

  观察gdbarch,可以看到它内部的函数指向的都是32位版本(以i386开头)。

  可以用watch命令设置硬件断点监视gdbarch字段的变化,但是当GDB与目标建立连接是,这个断点并没有命中。至少在老雷调试的GDB中,它不会自动检测目标的架构并作切换。

  怎么办呢?答案是需要执行如下命令手工切换:

(gdb) set architecture i386:x86-64
The target architecture is assumed to be i386:x86-64

  这样切换后,再观察gdbarch,就是64位版本的了。

(gdb) p *gdbarch
$42 = {initialized_p = 1, obstack = 0x12e40b0, bfd_arch_info = 0x952dc0 <bfd_x86_64_arch>, byte_order = BFD_ENDIAN_LITTLE, 
  byte_order_for_code = BFD_ENDIAN_LITTLE, osabi = GDB_OSABI_LINUX, target_desc = 0x0, tdep = 0x12e3f10, dump_tdep = 0x0, nr_data = 26, 
  data = 0x12f9990, bits_big_endian = 0, short_bit = 16, int_bit = 32, long_bit = 64, long_long_bit = 64, long_long_align_bit = 32, half_bit = 16, 
  half_format = 0xc37970 <floatformats_ieee_half>, float_bit = 32, float_format = 0xc37960 <floatformats_ieee_single>, double_bit = 64, 
  double_format = 0xc37950 <floatformats_ieee_double>, long_double_bit = 128, long_double_format = 0xc37930 <floatformats_i387_ext>, ptr_bit = 64, 
  addr_bit = 64, dwarf2_addr_size = 8, char_signed = 1, read_pc = 0x0, write_pc = 0x46bf90 <amd64_linux_write_pc>, 
  virtual_frame_pointer = 0x5e29f0 <legacy_virtual_frame_pointer>, pseudo_register_read = 0x0, 
  pseudo_register_read_value = 0x45fc30 <amd64_pseudo_register_read_value>, pseudo_register_write = 0x45fb20 <amd64_pseudo_register_write>, 
  num_regs = 152, num_pseudo_regs = 52, ax_pseudo_register_collect = 0x0, ax_pseudo_register_push_stack = 0x0, sp_regnum = 7, pc_regnum = 16, 
  ps_regnum = 17, fp0_regnum = 24, stab_reg_to_regnum = 0x45f9c0 <amd64_dwarf_reg_to_regnum>, ecoff_reg_to_regnum = 0x5e2990 <no_op_reg_to_regnum>, 
  sdb_reg_to_regnum = 0x47ef30 <i386_dbx_reg_to_regnum>, dwarf2_reg_to_regnum = 0x45f9c0 <amd64_dwarf_reg_to_regnum>, 
  register_name = 0x6bb050 <tdesc_register_name>, register_type = 0x6bc1a0 <tdesc_register_type>, dummy_id = 0x45f950 <amd64_dummy_id>, 
  deprecated_fp_regnum = -1, push_dummy_call = 0x467330 <amd64_push_dummy_call>, call_dummy_location = 1, 
  push_dummy_code = 0x477b80 <i386_push_dummy_code>, print_registers_info = 0x5aef10 <default_print_registers_info>, 
  print_float_info = 0x488170 <i387_print_float_info>, print_vector_info = 0x0, register_sim_regno = 0x5e2890 <legacy_register_sim_regno>, 
  cannot_fetch_register = 0x5e29e0 <cannot_register_not>, cannot_store_register = 0x5e29e0 <cannot_register_not>, 
  get_longjmp_target = 0x45f400 <amd64_get_longjmp_target>, believe_pcc_promotion = 0, convert_register_p = 0x488c50 <i387_convert_register_p>, 
  register_to_value = 0x488c80 <i387_register_to_value>, value_to_register = 0x488dc0 <i387_value_to_register>, 
  value_from_register = 0x561a10 <default_value_from_register>, pointer_to_address = 0x561790 <unsigned_pointer_to_address>, 
  address_to_pointer = 0x5617f0 <unsigned_address_to_pointer>, integer_to_address = 0x0, return_value = 0x465e20 <amd64_return_value>, 
  return_in_first_hidden_param_p = 0x5e3880 <default_return_in_first_hidden_param_p>, skip_prologue = 0x4674a0 <amd64_skip_prologue>, 
  skip_main_prologue = 0x0, skip_entrypoint = 0x0, inner_than = 0x5e2950 <core_addr_lessthan>, 
  breakpoint_from_pc = 0x477b10 <i386_breakpoint_from_pc>, remote_breakpoint_from_pc = 0x5e3850 <default_remote_breakpoint_from_pc>, 
  adjust_breakpoint_address = 0x0, memory_insert_breakpoint = 0x5f1ba0 <default_memory_insert_breakpoint>, 
  memory_remove_breakpoint = 0x5f1c70 <default_memory_remove_breakpoint>, decr_pc_after_break = 1, deprecated_function_start_offset = 0, 
  remote_register_number = 0x6bb020 <tdesc_remote_register_number>, fetch_tls_load_module_address = 0x493be0 <svr4_fetch_objfile_link_map>, 
  frame_args_skip = 8, unwind_pc = 0x47a250 <i386_unwind_pc>, unwind_sp = 0x0, frame_num_args = 0x0, frame_align = 0x45efd0 <amd64_frame_align>, 
  stabs_argument_has_addr = 0x5e2ac0 <default_stabs_argument_has_addr>, frame_red_zone_size = 128, 
  convert_from_func_ptr_addr = 0x5e2980 <convert_from_func_ptr_addr_identity>, addr_bits_remove = 0x5e2970 <core_addr_identity>, 
  software_single_step = 0x0, single_step_through_delay = 0x0, print_insn = 0x47e170 <i386_print_insn>, 
  skip_trampoline_code = 0x613c30 <find_solib_trampoline_target>, skip_solib_resolver = 0x490380 <glibc_skip_solib_resolver>, 
  in_solib_return_trampoline = 0x5e2930 <generic_in_solib_return_trampoline>, stack_frame_destroyed_p = 0x5e2940 <generic_stack_frame_destroyed_p>, 
  elf_make_msymbol_special = 0x0, coff_make_msymbol_special = 0x5e29a0 <default_coff_make_msymbol_special>, 
  make_symbol_special = 0x5e29b0 <default_make_symbol_special>, adjust_dwarf2_addr = 0x5e29c0 <default_adjust_dwarf2_addr>, 
  adjust_dwarf2_line = 0x5e29d0 <default_adjust_dwarf2_line>, cannot_step_breakpoint = 0, have_nonsteppable_watchpoint = 0, 
  address_class_type_flags = 0x0, address_class_type_flags_to_name = 0x0, address_class_name_to_type_flags = 0x0, 
  register_reggroup_p = 0x474950 <amd64_linux_register_reggroup_p>, fetch_pointer_argument = 0x4796f0 <i386_fetch_pointer_argument>, 
  iterate_over_regset_sections = 0x46b610 <amd64_linux_iterate_over_regset_sections>, make_corefile_notes = 0x496c60 <linux_make_corefile_notes>, 
  elfcore_write_linux_prpsinfo = 0x0, find_memory_regions = 0x4976a0 <linux_find_memory_regions>, core_xfer_shared_libraries = 0x0, 
  core_xfer_shared_libraries_aix = 0x0, core_pid_to_str = 0x495e30 <linux_core_pid_to_str>, core_thread_name = 0x0, gcore_bfd_target = 0x0, 
  vtable_function_descriptors = 0, vbit_in_delta = 0, skip_permanent_breakpoint = 0x5e38c0 <default_skip_permanent_breakpoint>, max_insn_length = 16, 
  displaced_step_copy_insn = 0x467770 <amd64_displaced_step_copy_insn>, 
  displaced_step_hw_singlestep = 0x5e2810 <default_displaced_step_hw_singlestep>, displaced_step_fixup = 0x467b10 <amd64_displaced_step_fixup>, 
  displaced_step_free_closure = 0x5e2800 <simple_displaced_step_free_closure>, displaced_step_location = 0x4981d0 <linux_displaced_step_location>, 
  relocate_instruction = 0x45f0f0 <amd64_relocate_instruction>, overlay_update = 0x0, 
  core_read_description = 0x4748a0 <amd64_linux_core_read_description>, static_transform_name = 0x0, sofun_address_maybe_missing = 0, 
  process_record = 0x480860 <i386_process_record>, process_record_signal = 0x474800 <amd64_linux_record_signal>, 
  gdb_signal_from_target = 0x494280 <linux_gdb_signal_from_target>, gdb_signal_to_target = 0x4944c0 <linux_gdb_signal_to_target>, 
  get_siginfo_type = 0x48b2d0 <x86_linux_get_siginfo_type>, record_special_symbol = 0x0, 
  get_syscall_number = 0x46bf10 <amd64_linux_get_syscall_number>, xml_syscall_file = 0x79fce7 "syscalls/amd64-linux.xml", syscalls_info = 0x0, 
---Type <return> to continue, or q <return> to quit---
  stap_integer_prefixes = 0x79f3b0 <stap_integer_prefixes>, stap_integer_suffixes = 0x0, stap_register_prefixes = 0x79f3a0 <stap_register_prefixes>, 
  stap_register_suffixes = 0x0, stap_register_indirection_prefixes = 0x79f390 <stap_register_indirection_prefixes>, 
  stap_register_indirection_suffixes = 0x79f380 <stap_register_indirection_suffixes>, stap_gdb_register_prefix = 0x0, stap_gdb_register_suffix = 0x0, 
  stap_is_single_operand = 0x478270 <i386_stap_is_single_operand>, stap_parse_special_token = 0x478b80 <i386_stap_parse_special_token>, 
  dtrace_parse_probe_argument = 0x474990 <amd64_dtrace_parse_probe_argument>, dtrace_probe_is_enabled = 0x46c8e0 <amd64_dtrace_probe_is_enabled>, 
  dtrace_enable_probe = 0x46c8c0 <amd64_dtrace_enable_probe>, dtrace_disable_probe = 0x46c8a0 <amd64_dtrace_disable_probe>, has_global_solist = 0, 
  has_global_breakpoints = 0, has_shared_address_space = 0x4981c0 <linux_has_shared_address_space>, 
  fast_tracepoint_valid_at = 0x4795d0 <i386_fast_tracepoint_valid_at>, auto_charset = 0x566ae0 <default_auto_charset>, 
  auto_wide_charset = 0x566af0 <default_auto_wide_charset>, solib_symbols_extension = 0x0, has_dos_based_file_system = 0, 
  gen_return_address = 0x45f090 <amd64_gen_return_address>, info_proc = 0x494f70 <linux_info_proc>, core_info_proc = 0x4976f0 <linux_core_info_proc>, 
  iterate_over_objfiles_in_search_order = 0x60fa60 <default_iterate_over_objfiles_in_search_order>, ravenscar_ops = 0x0, 
  insn_is_call = 0x45f080 <amd64_insn_is_call>, insn_is_ret = 0x45f070 <amd64_insn_is_ret>, insn_is_jump = 0x45f060 <amd64_insn_is_jump>, 
  auxv_parse = 0x0, vsyscall_range = 0x494990 <linux_vsyscall_range>, infcall_mmap = 0x494800 <linux_infcall_mmap>, 
  infcall_munmap = 0x494710 <linux_infcall_munmap>, gcc_target_options = 0x5e3970 <default_gcc_target_options>, 
  gnu_triplet_regexp = 0x477ba0 <i386_gnu_triplet_regexp>, addressable_memory_unit_size = 0x5e39d0 <default_addressable_memory_unit_size>}

  接下来,再观察寄存器,就对了,比如:

(gdb) i r
rax            0x2f    47
rbx            0xffffffff81efadc0    -2114998848
rcx            0xffffffff81e5f568    -2115635864
rdx            0x0    0
rsi            0x246    582
rdi            0x246    582
rbp            0xffffc90000643da8    0xffffc90000643da8
rsp            0xffffc90000643da8    0xffffc90000643da8
r8             0xb57e8    743400
r9             0x207    519
r10            0xd2a81d91    3534232977
r11            0xffffffff822487ed    -2111535123
r12            0xffff880111a2be40    -131936804487616
r13            0x0    0
r14            0x55a9bbc50cf0    94187488087280
r15            0x55a9bbc503c0    94187488084928
rip            0xffffffff81141344    0xffffffff81141344 <kgdb_breakpoint+20>
eflags         0x202    [ IF ]
cs             0x10    16
ss             0x0    0
ds             0x0    0
es             0x0    0
fs             0x0    0
gs             0xb    11

 

 内核地址错位

 第二个常见的问题是,GDB里找不到函数的名字,长话短说是因为在GDB看来,内核的编译地址和运行时的地址是一样的,所以就直接用符号文件中查找到的地址来访问目标机器。但其实,今天的较高版本内核(3.14+)可能启用了KASLR(内核空间地址随机化),把内核也做了重定位。

  不妨做个小实验来理解KASLR,先观察/proc/kallsyms中的vfs_read函数的地址(这是实际运行时的地址),再观察/boot目录下编译时的地址,会发现二者明显不同。

gedu@gedu-VirtualBox:~$ sudo cat /proc/kallsyms | grep " vfs_read"
ffffffff86c31960 T vfs_readf
ffffffff86c32ce0 T vfs_read
ffffffff86c33210 T vfs_readv
gedu@gedu-VirtualBox:~$ sudo cat /boot/System.map-4.8.0-36-generic | grep " vfs_read"
ffffffff81231960 T vfs_readf
ffffffff81232ce0 T vfs_read
ffffffff81233210 T vfs_readv

KASLR是一项安全措施,与黑客捉迷藏。但这样躲躲闪闪,让GDB也蒙了。

  那么如何让告诉GDB内核搬家了呢?今天用的方法大多都是禁止KASLR,也就是在内核的命令行中加入nokaslr,然后重启问题就消除了。 

  读到这里,大家是不觉得陷阱很多啊。是的,新问题,新挑战总是有的。GNU领袖RMS在GDB教程的封面如此写道:“Don't worry if it doesn't work right. If everything did, you'd be out of job.”翻译一下:“别担心它工作的不好。如果一切都工作的好,那么你就没工作了。” 其实,我并不很赞同这句话,因为不够积极。不过RMS如此说,也只是开个玩笑而已,看看他的相册,可以看到他走到哪里就在哪里掏出本子改BUG啊。 :-) 


posted on 2018年3月10日 14:21 由 admin

Powered by Community Server Powered by CnForums.Net