- int printf(const char * format, ...);
- int global_init_var = 84;
- int global_uninit_var;
- void func1(int i)
- {
- printf("%d\n", i);
- }
- int main(void)
- {
- static int static_var=85;
- static int static_var2;
- int a = 1;
- int b;
- func1(static_var+static_var2+a+b);
- return a;
- }
The following diagram shows how the program corresponds to the object file.
- First, compile it without linking:
- [lizhuohua@lizhuohua-phy Program]$ gcc -c SimpleSection.c
- [lizhuohua@lizhuohua-phy Program]$ ls | grep SimpleSection.
- SimpleSection.c
- SimpleSection.o
- [lizhuohua@lizhuohua-phy Program]$ file SimpleSection.o
- SimpleSection.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped
- relocatable: the file can be linked into an executable or a shared object.
- [lizhuohua@lizhuohua-phy Program]$ gcc SimpleSection.c -o SimpleSection
- [lizhuohua@lizhuohua-phy Program]$ file SimpleSection
- SimpleSection: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, not stripped
- executable: the file can be run directly.
- There are also shared object files, for example files with the .so extension on Linux:
- [lizhuohua@lizhuohua-phy Program]$ file /lib/ld-2.14.90.so
- /lib/ld-2.14.90.so: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked, not stripped
- [lizhuohua@lizhuohua-phy Program]$ objdump -h SimpleSection.o
- SimpleSection.o: file format elf64-x86-64
- Sections:
- Idx Name Size VMA LMA File off Algn
- 0 .text 00000050 0000000000000000 0000000000000000 00000040 2**2
- CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
- 1 .data 00000008 0000000000000000 0000000000000000 00000090 2**2
- CONTENTS, ALLOC, LOAD, DATA
- 2 .bss 00000004 0000000000000000 0000000000000000 00000098 2**2
- ALLOC
- 3 .rodata 00000004 0000000000000000 0000000000000000 00000098 2**0
- CONTENTS, ALLOC, LOAD, READONLY, DATA
- 4 .comment 0000002d 0000000000000000 0000000000000000 0000009c 2**0
- CONTENTS, READONLY
- 5 .note.GNU-stack 00000000 0000000000000000 0000000000000000 000000c9 2**0
- CONTENTS, READONLY
- 6 .eh_frame 00000058 0000000000000000 0000000000000000 000000d0 2**3
- CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
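At this point we can draw the file layout as far as we know it:
-s prints the contents of every section in hexadecimal; -d disassembles every section that contains instructions.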
- [lizhuohua@lizhuohua-phy Program]$ objdump -s -d SimpleSection.o
- SimpleSection.o: file format elf64-x86-64
- Contents of section .text:
- 0000 554889e5 4883ec10 897dfc8b 45fc89c6 UH..H....}..E...
- 0010 bf000000 00b80000 0000e800 000000c9 ................
- 0020 c3554889 e54883ec 10c745fc 01000000 .UH..H....E.....
- 0030 8b150000 00008b05 00000000 01d00345 ...............E
- 0040 fc0345f8 89c7e800 0000008b 45fcc9c3 ..E.........E...
- Contents of section .data:
- 0000 54000000 55000000 T...U...
- Contents of section .rodata:
- 0000 25640a00 %d..
- Contents of section .comment:
- 0000 00474343 3a202847 4e552920 342e362e .GCC: (GNU) 4.6.
- 0010 33203230 31323033 30362028 52656420 3 20120306 (Red
- 0020 48617420 342e362e 332d3229 00 Hat 4.6.3-2).
- Contents of section .eh_frame:
- 0000 14000000 00000000 017a5200 01781001 .........zR..x..
- 0010 1b0c0708 90010000 1c000000 1c000000 ................
- 0020 00000000 21000000 00410e10 8602430d ....!....A....C.
- 0030 065c0c07 08000000 1c000000 3c000000 .\..........<...
- 0040 00000000 2f000000 00410e10 8602430d ..../....A....C.
- 0050 066a0c07 08000000 .j......
- Disassembly of section .text:
- 0000000000000000 <func1>:
- 0: 55 push %rbp //corresponds to the first byte, 55, in "Contents of section .text"
- 1: 48 89 e5 mov %rsp,%rbp
- 4: 48 83 ec 10 sub $0x10,%rsp
- 8: 89 7d fc mov %edi,-0x4(%rbp)
- b: 8b 45 fc mov -0x4(%rbp),%eax
- e: 89 c6 mov %eax,%esi
- 10: bf 00 00 00 00 mov $0x0,%edi
- 15: b8 00 00 00 00 mov $0x0,%eax
- 1a: e8 00 00 00 00 callq 1f <func1+0x1f>
- 1f: c9 leaveq
- 20: c3 retq
- 0000000000000021 <main>:
- 21: 55 push %rbp
- 22: 48 89 e5 mov %rsp,%rbp
- 25: 48 83 ec 10 sub $0x10,%rsp
- 29: c7 45 fc 01 00 00 00 movl $0x1,-0x4(%rbp)
- 30: 8b 15 00 00 00 00 mov 0x0(%rip),%edx # 36 <main+0x15>
- 36: 8b 05 00 00 00 00 mov 0x0(%rip),%eax # 3c <main+0x1b>
- 3c: 01 d0 add %edx,%eax
- 3e: 03 45 fc add -0x4(%rbp),%eax
- 41: 03 45 f8 add -0x8(%rbp),%eax
- 44: 89 c7 mov %eax,%edi
- 46: e8 00 00 00 00 callq 4b <main+0x2a>
- 4b: 8b 45 fc mov -0x4(%rbp),%eax
- 4e: c9 leaveq
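- 4f: c3 retq //corresponds to the last byte, c3, in "Contents of section .text"
Now look at the .data section:
The .data section holds initialized global variables and initialized local static variables, here int global_init_var = 84; and static int static_var=85;. The summary line for .data,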
1 .data 00000008 0000000000000000 0000000000000000 00000090 2**2
CONTENTS, ALLOC, LOAD, DATA
gives a length of 8 bytes, exactly the size of two ints.
The .rodata section holds read-only data. In this code, the "%d\n" in printf("%d\n", i); is placed in this section.
The .bss section holds uninitialized global variables and uninitialized local static variables. However, the summary line for .bss,
2 .bss 00000004 0000000000000000 0000000000000000 00000098 2**2
ALLOC
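shows a size of only 4, which seems to contradict the code: it contains two uninitialized variables, one global and one local static. In fact the uninitialized global variable global_uninit_var is not placed here; only static_var2 is, as the symbol table later shows.
Analyzing the ELF file structure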
- [lizhuohua@lizhuohua-phy Program]$ readelf -h SimpleSection.o
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: REL (Relocatable file) //relocatable file; for an executable this field is EXEC, for a shared object it is DYN
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x0 //entry address: once the operating system has loaded the program, execution starts here; a relocatable file normally has no entry point, so this is 0
Start of program headers: 0 (bytes into file)
Start of section headers: 400 (bytes into file) //the section header table starts at byte offset 400 (0x190) in the file
Flags: 0x0
Size of this header: 64 (bytes) //the ELF header itself is 64 bytes long
Size of program headers: 0 (bytes)
Number of program headers: 0
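Size of section headers: 64 (bytes) //this value is sizeof(Elf64_Shdr), a structure defined in /usr/include/elf.h; the object file is ELF64 (it was built on a 64-bit system), so the 64-bit structure applies
Number of section headers: 13 //13 section headers in total
Section header string table index: 10 //index of the section that holds the section-name string table; here it is section 10, .shstrtab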
- [lizhuohua@lizhuohua-phy Program]$ readelf -S SimpleSection.o
- There are 13 section headers, starting at offset 0x190:
- Section Headers:
- [Nr] Name Type Address Offset Size EntSize Flags Link Info Align
- [ 0] NULL 0000000000000000 00000000 0000000000000000 0000000000000000 0 0 0
- [ 1] .text PROGBITS 0000000000000000 00000040 0000000000000050 0000000000000000 AX 0 0 4
- [ 2] .rela.text RELA 0000000000000000 000006b8 0000000000000078 0000000000000018 11 1 8
- [ 3] .data PROGBITS 0000000000000000 00000090 0000000000000008 0000000000000000 WA 0 0 4
- [ 4] .bss NOBITS 0000000000000000 00000098 0000000000000004 0000000000000000 WA 0 0 4
- [ 5] .rodata PROGBITS 0000000000000000 00000098 0000000000000004 0000000000000000 A 0 0 1
- [ 6] .comment PROGBITS 0000000000000000 0000009c 000000000000002d 0000000000000001 MS 0 0 1
- [ 7] .note.GNU-stack PROGBITS 0000000000000000 000000c9 0000000000000000 0000000000000000 0 0 1
- [ 8] .eh_frame PROGBITS 0000000000000000 000000d0 0000000000000058 0000000000000000 A 0 0 8
- [ 9] .rela.eh_frame RELA 0000000000000000 00000730 0000000000000030 0000000000000018 11 8 8
- [10] .shstrtab STRTAB 0000000000000000 00000128 0000000000000061 0000000000000000 0 0 1
- [11] .symtab SYMTAB 0000000000000000 000004d0 0000000000000180 0000000000000018 12 11 8
- [12] .strtab STRTAB 0000000000000000 00000650 0000000000000066 0000000000000000 0 0 1
- Key to Flags:
- W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
- I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
- O (extra OS processing required) o (OS specific), p (processor specific)
At this point a complete diagram of the file structure can be drawn.
As shown, there are 13 section headers in total, numbered from 0.
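Type:
PROGBITS: program contents, such as the code section and the data sections
RELA: a relocation table. Take .rela.text: the symbol table it refers to is section 11 (Link is 11), i.e. .symtab, and the section the relocations apply to is section 1 (Info is 1), i.e. .text.
SYMTAB: the section contains a symbol table
STRTAB: the section contains a string table
NOBITS: the section has no contents in the file, e.g. .bss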
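Flag:
W (write): the section is writable in the process address space
A (alloc): the section needs space allocated in the process address space
X (execute): the section can be executed, e.g. the code section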
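The hex dump below shows the file from offset 0x650, which the section table gives as the start of .strtab (section 12, the string table holding symbol names):
- 00000650 00 53 69 6d 70 6c 65 53 65 63 74 69 6f 6e 2e 63 |.SimpleSection.c|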
- 00000660 00 73 74 61 74 69 63 5f 76 61 72 2e 31 35 39 36 |.static_var.1596|
- 00000670 00 73 74 61 74 69 63 5f 76 61 72 32 2e 31 35 39 |.static_var2.159|
- 00000680 37 00 67 6c 6f 62 61 6c 5f 69 6e 69 74 5f 76 61 |7.global_init_va|
- 00000690 72 00 67 6c 6f 62 61 6c 5f 75 6e 69 6e 69 74 5f |r.global_uninit_|
- 000006a0 76 61 72 00 66 75 6e 63 31 00 70 72 69 6e 74 66 |var.func1.printf|
- 000006b0 00 6d 61 69 6e 00 00 00 11 00 00 00 00 00 00 00 |.main...........|
- 000006c0 0a 00 00 00 05 00 00 00 00 00 00 00 00 00 00 00 |................|
- [lizhuohua@lizhuohua-phy Program]$ readelf -s SimpleSection.o
- Symbol table '.symtab' contains 16 entries:
- Num: Value Size Type Bind Vis Ndx Name
- 0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
- 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS SimpleSection.c
- 2: 0000000000000000 0 SECTION LOCAL DEFAULT 1
- 3: 0000000000000000 0 SECTION LOCAL DEFAULT 3
- 4: 0000000000000000 0 SECTION LOCAL DEFAULT 4
- 5: 0000000000000000 0 SECTION LOCAL DEFAULT 5
- 6: 0000000000000004 4 OBJECT LOCAL DEFAULT 3 static_var.1596
- 7: 0000000000000000 4 OBJECT LOCAL DEFAULT 4 static_var2.1597
- 8: 0000000000000000 0 SECTION LOCAL DEFAULT 7
- 9: 0000000000000000 0 SECTION LOCAL DEFAULT 8
- 10: 0000000000000000 0 SECTION LOCAL DEFAULT 6
- 11: 0000000000000000 4 OBJECT GLOBAL DEFAULT 3 global_init_var
- 12: 0000000000000004 4 OBJECT GLOBAL DEFAULT COM global_uninit_var
- 13: 0000000000000000 33 FUNC GLOBAL DEFAULT 1 func1
- 14: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND printf
- 15: 0000000000000021 47 FUNC GLOBAL DEFAULT 1 main
Bind:
LOCAL: local symbol, not visible to other files
GLOBAL: global symbol, visible externally
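Type:
FILE: the symbol is a file name
SECTION: the symbol represents a section; it must have LOCAL binding
FUNC: the symbol is a function or other executable code
OBJECT: the symbol is a data object, such as a variable or an array
NOTYPE: the symbol's type is unspecified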
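Ndx:
Normally the index, in the section header table, of the section in which the symbol is defined. For example, global_init_var is in section 3, i.e. .data.
COM: the symbol is a "COMMON" symbol; uninitialized global variables are generally of this kind, e.g. global_uninit_var.
UND: undefined symbol, e.g. printf. It is called in our code but not defined there.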