Recovering Lost LVM Devices

2012-07-14 ulovko

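Some time ago a friend asked me for help in a panic: while reinstalling an operating system he had initialized two LVM devices on shared storage and lost the data.
The scene of the incident:
Two servers formed an HA pair, both attached over Fibre Channel to one shared storage array. One of the HA nodes developed a problem and had to be reinstalled. The fibre cables were not unplugged during the reinstall, so when the installer initialized the disks, two partitions on the shared storage were initialized by mistake. The installation itself completed normally, and he noticed nothing wrong at the time: the other server was still running, and the two initialized LVM devices could still be accessed from it. A little over two hours later the server crashed, and only when it came back up did he find that the two LVM devices were gone, along with the data.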
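Problem analysis:
Initializing the LVM devices only wiped the LVM header at the start of each disk; the data on the disks was not erased. It is therefore enough to write the LVM header back to the start of the initialized disks, using the backed-up logical volume metadata. Fortunately the server's system had been backed up, so the LVM metadata could be found.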
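
Whether a device still carries an LVM header can be checked without writing to it. The following is a minimal sketch and was not part of the original recovery. It assumes that the LVM2 label, which starts with the ASCII signature `LABELONE`, lies within the first four 512-byte sectors of the device, and that GNU grep is available. The function name `has_lvm_label` is made up for illustration.

```shell
# Hypothetical helper: report whether a device or image file still carries
# an LVM2 label. Assumption: the label starts with the ASCII signature
# "LABELONE" and lies within the first four 512-byte sectors.
has_lvm_label() {
    # Read only the first 2048 bytes; -a makes GNU grep treat binary data as text.
    dd if="$1" bs=512 count=4 2>/dev/null | LC_ALL=C grep -a -q 'LABELONE'
}

# Example (read-only; real block devices need root):
#   has_lvm_label /dev/sdb1 && echo "LVM label present" || echo "no LVM label"
```

The check only reads from the device, so it should be safe to run on a disk that is about to be recovered.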
 
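Whenever a VG is created or changed, LVM by default backs up its metadata automatically: the most recent copy goes to /etc/lvm/backup, and older versions are kept under /etc/lvm/archive. To be safe, copy the files under /etc/lvm to another location at regular intervals.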
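
A regular copy of /etc/lvm could look like the sketch below. The function name `backup_lvm_config` and the default destination `/var/backups/lvm` are assumptions chosen for illustration; in practice the destination should be on a different machine or disk than the one holding the volume groups.

```shell
# Hypothetical helper: pack an LVM configuration directory into a
# timestamped tarball and print the path of the archive.
# Both defaults are assumptions; adjust them to the local setup.
backup_lvm_config() {
    src_dir=${1:-/etc/lvm}
    dest_dir=${2:-/var/backups/lvm}
    stamp=$(date +%Y%m%d-%H%M%S)
    mkdir -p "$dest_dir" || return 1
    archive="$dest_dir/lvm-config-$stamp.tar.gz"
    # -C keeps the paths inside the tarball relative (lvm/backup/..., lvm/archive/...)
    tar -czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")" || return 1
    printf '%s\n' "$archive"
}

# Example, e.g. from a daily cron job (run as root):
#   vgcfgbackup            # refresh /etc/lvm/backup first
#   backup_lvm_config /etc/lvm /mnt/remote/lvm-backups
```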
 
 
The following simulates the LVM recovery in a virtual machine.
Create the logical volume:

    [root@oracle ~]# pvcreate /dev/sdb1
    Physical volume "/dev/sdb1" successfully created
    [root@oracle ~]# vgcreate lanv /dev/sdb1
    Volume group "lanv" successfully created
    [root@oracle ~]# lvcreate -n lgl -L 500M lanv
    Logical volume "lgl" created
    [root@oracle lgl]# pvs
    PV         VG   Fmt  Attr PSize    PFree
    /dev/sdb1  lanv lvm2 a-   1016.00M 516.00M
    [root@oracle lgl]# lvs
    LV   VG   Attr   LSize   Origin Snap% Move Log Copy% Convert
    lgl  lanv -wi-ao 500.00M
    [root@oracle lgl]# vgs
    VG   #PV #LV #SN Attr   VSize    VFree
    lanv   1   1   0 wz--n- 1016.00M 516.00M
    [root@oracle ~]# df -h /lanv/lgl
    Filesystem            Size  Used Avail Use% Mounted on
    /dev/mapper/lanv-lgl  485M   11M  449M   3% /lanv/lgl
    [root@oracle ~]# cat /etc/fstab
    /dev/lanv/lgl /lanv/lgl ext3 defaults 0 0
    [root@oracle lgl]# pwd
    /lanv/lgl
    [root@oracle lgl]# echo "Hello World" >lgl.txt
    [root@oracle lgl]# cat lgl.txt
    Hello World

Delete the LVM information:

    [root@oracle lgl]# fdisk /dev/sdb
    Command (m for help): p
    Disk /dev/sdb: 1073 MB, 1073741824 bytes
    255 heads, 63 sectors/track, 130 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
       Device Boot      Start         End      Blocks   Id  System
    /dev/sdb1               1         130     1044193+  83  Linux
    Command (m for help): d
    Selected partition 1
    Command (m for help): p
    Disk /dev/sdb: 1073 MB, 1073741824 bytes
    255 heads, 63 sectors/track, 130 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
       Device Boot      Start         End      Blocks   Id  System
    Command (m for help): w
    The partition table has been altered!
    Calling ioctl() to re-read partition table.
    WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
    The kernel still uses the old table.
    The new table will be used at the next reboot.
    Syncing disks.

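At this point the LVM device is still accessible and can be read and written normally, because the kernel keeps using the old partition table. After a while errors appear and the system crashes.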

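Rebooting the system produces errors.

To recover, first recreate the partition with the same start and end cylinders as before: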

    [root@oracle backup]# fdisk /dev/sdb
    Command (m for help): p
    Disk /dev/sdb: 1073 MB, 1073741824 bytes
    255 heads, 63 sectors/track, 130 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
       Device Boot      Start         End      Blocks   Id  System
    Command (m for help): n
    Command action
       e   extended
       p   primary partition (1-4)
    p
    Partition number (1-4): 1
    First cylinder (1-130, default 1):
    Using default value 1
    Last cylinder or +size or +sizeM or +sizeK (1-130, default 130):
    Using default value 130
    Command (m for help): w
    The partition table has been altered!
    Calling ioctl() to re-read partition table.
    Syncing disks.
    [root@oracle backup]# fdisk -l
    Disk /dev/sda: 32.2 GB, 32212254720 bytes
    255 heads, 63 sectors/track, 3916 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
       Device Boot      Start         End      Blocks   Id  System
    /dev/sda1   *           1        3655    29358756   83  Linux
    /dev/sda2            3656        3916     2096482+  82  Linux swap / Solaris
    Disk /dev/sdb: 1073 MB, 1073741824 bytes
    255 heads, 63 sectors/track, 130 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes
       Device Boot      Start         End      Blocks   Id  System
    /dev/sdb1               1         130     1044193+  83  Linux

The vgcfgrestore command can now restore the LVM configuration from the backed-up metadata.

First take a look at the backed-up metadata:

    [root@oracle backup]# more lanv
    # Generated by LVM2 version 2.02.46-RHEL5 (2009-06-18): Sun Jul 8 19:44:21 2012
    contents = "Text Format Volume Group"
    version = 1
    description = "Created *after* executing 'vgs'"
    creation_host = "oracle" # Linux oracle 2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:48 EDT 2009 x86_64
    creation_time = 1341747861 # Sun Jul 8 19:44:21 2012
    lanv {
        id = "hlt8bK-xsoA-Z3GJ-iosl-7cwk-e5b1-TUx3OB"
        seqno = 4
        status = ["RESIZEABLE", "READ", "WRITE"]
        flags = []
        extent_size = 8192 # 4 Megabytes
        max_lv = 0
        max_pv = 0
        physical_volumes {
            pv0 {
                id = "qm6Oo3-92NX-E0ca-e0GV-2eA6-q2Jm-0CzGre"
                device = "/dev/sdb" # Hint only
                status = ["ALLOCATABLE"]
                flags = []
                dev_size = 2088387 # 1019.72 Megabytes
                pe_start = 384
                pe_count = 254 # 1016 Megabytes
            }
        }
        logical_volumes {
            lgl {
                id = "ifkQpI-bf7f-6SvS-8V9J-HVmO-wAZj-0jWY4B"
                status = ["READ", "WRITE", "VISIBLE"]
                flags = []
                segment_count = 1
                segment1 {
                    start_extent = 0
                    extent_count = 125 # 500 Megabytes
                    type = "striped"
                    stripe_count = 1 # linear
                    stripes = [
                        "pv0", 0
                    ]
                }
            }
        }
    }
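
Two values in this file matter for the recovery: the PV `id` (the UUID the PV has to be recreated with) and `dev_size` (the PV size in 512-byte sectors, which the recreated partition should match). The sketch below pulls either field out of the `physical_volumes` section with awk. The function name `pv_field_from_backup` is made up for illustration, and the parsing assumes the layout shown above, with each opening brace on the same line as its section name.

```shell
# Hypothetical helper: print the value of FIELD for every PV listed in the
# physical_volumes section of an LVM metadata backup file.
# Usage: pv_field_from_backup FILE FIELD
pv_field_from_backup() {
    awk -v field="$2" '
        # Start tracking when the physical_volumes section opens.
        /physical_volumes[ \t]*[{]/ { in_pvs = 1; depth = 0 }
        in_pvs {
            if ($1 == field && $2 == "=") {
                value = $3
                gsub(/"/, "", value)    # strip the quotes around string values
                print value
            }
            # Track brace nesting so the section ends at its own closing brace.
            depth += gsub(/[{]/, "{") - gsub(/[}]/, "}")
            if (depth <= 0) in_pvs = 0
        }
    ' "$1"
}

# Example:
#   pv_field_from_backup /etc/lvm/backup/lanv id        # PV UUID
#   pv_field_from_backup /etc/lvm/backup/lanv dev_size  # size in 512-byte sectors
#   blockdev --getsz /dev/sdb1                          # actual partition size, same unit
```

For the file above this yields qm6Oo3-92NX-E0ca-e0GV-2eA6-q2Jm-0CzGre and 2088387. The latter agrees with the 1044193+ blocks of 1 KiB that fdisk reports for /dev/sdb1 (2088387 / 2 = 1044193.5), which suggests the recreated partition has the same size as the original PV.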

By default, vgcfgrestore looks for the metadata under /etc/lvm/backup when restoring a volume group:

    [root@oracle ~]# vgcfgrestore lanv
    Couldn't find device with uuid 'qm6Oo3-92NX-E0ca-e0GV-2eA6-q2Jm-0CzGre'.
    Cannot restore Volume Group lanv with 1 PVs marked as missing.
    Restore failed.

Following the error message, recreate the PV with the UUID that vgcfgrestore is looking for:

    [root@oracle backup]# pvcreate --uuid qm6Oo3-92NX-E0ca-e0GV-2eA6-q2Jm-0CzGre /dev/sdb1
    Can't initialize physical volume "/dev/sdb1" of volume group "lanv" without -ff
    [root@oracle backup]# pvcreate --uuid qm6Oo3-92NX-E0ca-e0GV-2eA6-q2Jm-0CzGre /dev/sdb1 -ff
    Really INITIALIZE physical volume "/dev/sdb1" of volume group "lanv" [y/n]? y
    WARNING: Forcing physical volume creation on /dev/sdb1 of volume group "lanv"
    Physical volume "/dev/sdb1" successfully created

    [root@oracle backup]# pvs
    PV         VG   Fmt  Attr PSize    PFree
    /dev/sdb1       lvm2 --   1019.72M 1019.72M

Now run vgcfgrestore again and activate the volume group:

    [root@oracle backup]# vgcfgrestore lanv
    Restored volume group lanv
    [root@oracle backup]# vgs
    VG   #PV #LV #SN Attr   VSize    VFree
    lanv   1   1   0 wz--n- 1016.00M 516.00M
    [root@oracle backup]# lvs
    LV   VG   Attr   LSize   Origin Snap% Move Log Copy% Convert
    lgl  lanv -wi--- 500.00M
    [root@oracle backup]# vgchange -ay lanv
    1 logical volume(s) in volume group "lanv" now active

Check the result:

    [root@oracle backup]# lvs
    LV   VG   Attr   LSize   Origin Snap% Move Log Copy% Convert
    lgl  lanv -wi-a- 500.00M
    [root@oracle backup]# mount /dev/lanv/lgl /lanv/lgl/
    [root@oracle backup]# df -h /lanv/lgl/
    Filesystem            Size  Used Avail Use% Mounted on
    /dev/mapper/lanv-lgl  485M   11M  449M   3% /lanv/lgl
    [root@oracle backup]# cd /lanv/lgl/
    [root@oracle lgl]# ls
    lgl.txt
    [root@oracle lgl]# cat lgl.txt
    Hello World
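
At this point the LVM volume is restored and the data in it is intact.

The recovery procedure is not difficult; it comes down to two commands, pvcreate --uuid and vgcfgrestore. In production, though, this situation should be avoided as far as possible: for example, disconnect all external storage before installing a system, so that data is not wiped by accident.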
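
The recovery sequence can be summarized as a small dry-run helper that only prints the commands for review and executes nothing. This is a sketch under stated assumptions rather than the exact procedure used above: the function name `print_lvm_recovery_commands` is made up, the backup file is assumed to be /etc/lvm/backup/<vg> unless given, and `--restorefile` is added because newer LVM2 releases expect it (or `--norestorefile`) together with `--uuid`, whereas the RHEL5 version used in this article accepted `--uuid` alone.

```shell
# Hypothetical dry-run helper: print the recovery commands, run nothing.
# Usage: print_lvm_recovery_commands VG DEVICE PV_UUID [BACKUP_FILE]
print_lvm_recovery_commands() {
    vg=$1
    device=$2
    pv_uuid=$3
    # Assumption: the latest metadata backup lives at /etc/lvm/backup/<vg>.
    backup_file=${4:-/etc/lvm/backup/$vg}
    # Step 1: recreate the PV label with its original UUID.
    printf 'pvcreate --uuid %s --restorefile %s %s\n' "$pv_uuid" "$backup_file" "$device"
    # Step 2: restore the volume group metadata from the backup file.
    printf 'vgcfgrestore -f %s %s\n' "$backup_file" "$vg"
    # Step 3: activate the logical volumes so they can be mounted again.
    printf 'vgchange -ay %s\n' "$vg"
}

# Example:
#   print_lvm_recovery_commands lanv /dev/sdb1 qm6Oo3-92NX-E0ca-e0GV-2eA6-q2Jm-0CzGre
```

Printing the commands instead of running them leaves room to double-check the device name and UUID, since pvcreate on the wrong device would overwrite another disk's header.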





 