On a very bad day your VMFS volume can become corrupt. In most cases the cause is a damaged partition table: the data is still on the disk but is no longer accessible. I ran into this at a large hospital where all VMFS volumes were corrupt. In this article I explain how I recovered the VMFS partition tables.
You can list all your volumes, together with their device names, with the following command:
[root@myesxserver vmhba2]# esxcfg-vmhbadevs
vmhba0:0:0 /dev/cciss/c0d0
vmhba1:0:1 /dev/sda
vmhba1:0:2 /dev/sdb
vmhba1:4:2 /dev/sdc
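If you have many LUNs to check, it can help to turn that listing into a lookup table. The following is only a small sketch: it assumes the two-column output format shown above, and the function name parse_vmhbadevs is my own, not part of any VMware tool.

```python
def parse_vmhbadevs(output):
    """Map each vmhba path (adapter:target:LUN) to its service console device."""
    mapping = {}
    for line in output.splitlines():
        fields = line.split()
        # Skip blank lines and anything that is not a "vmhbaA:T:L /dev/..." pair.
        if len(fields) != 2 or not fields[0].startswith("vmhba"):
            continue
        mapping[fields[0]] = fields[1]
    return mapping


# Sample text copied from the listing above.
sample = """\
vmhba0:0:0 /dev/cciss/c0d0
vmhba1:0:1 /dev/sda
vmhba1:0:2 /dev/sdb
vmhba1:4:2 /dev/sdc
"""

devices = parse_vmhbadevs(sample)
print(devices["vmhba1:0:2"])
```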
If you now list the partition table of the desired volume with fdisk you get:
[root@myesxserver vmhba2]# fdisk -lu /dev/sdb
Disk /dev/sdb: 400.0 GB 401234249287 bytes
255 heads, 63 sectors/track, 39162 cylinders, total 629145600 sectors
Units = sectors of 1*512=512 bytes
Disk /dev/sdb doesn’t contain a valid partition table
This output means that the partition table is corrupt or missing.
The normal output should be:
[root@myesxserver vmhba2]# fdisk -lu /dev/sdb
Disk /dev/sdb: 400.0 GB 401234249287 bytes
255 heads, 63 sectors/track, 39162 cylinders, total 629145600 sectors
Units = sectors of 1*512=512 bytes
Device Boot      Start         End      Blocks   Id  System
/dev/sdb1          128   629145591   314572732   fb  Unknown
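To make the difference between the two listings more concrete, here is a rough sketch of the check fdisk performs on the first sector of the disk. It assumes a classic MBR layout (four 16-byte entries at offset 446, boot signature 0x55 0xAA at offset 510). The function names are mine, and the sample sector is built in memory from the values in the healthy listing rather than read from a real disk.

```python
import struct

SECTOR_SIZE = 512


def read_partition_table(mbr):
    """Return the used primary partition entries, or None if the MBR is invalid."""
    if len(mbr) < SECTOR_SIZE:
        raise ValueError("need the full first sector (512 bytes)")
    # A valid MBR ends with the boot signature bytes 0x55 0xAA.
    if mbr[510:512] != b"\x55\xaa":
        return None
    entries = []
    for index in range(4):
        raw = mbr[446 + 16 * index:446 + 16 * (index + 1)]
        type_id = raw[4]  # partition system id, fb for VMFS
        # Bytes 8-15: start sector and sector count, little-endian 32-bit.
        start, count = struct.unpack("<II", raw[8:16])
        if type_id == 0 or count == 0:
            continue  # unused slot
        entries.append({"number": index + 1, "id": type_id,
                        "start": start, "end": start + count - 1})
    return entries


def build_sample_mbr(start, end, type_id):
    """Build an in-memory MBR with one primary partition (illustration only)."""
    mbr = bytearray(SECTOR_SIZE)
    mbr[446 + 4] = type_id
    mbr[446 + 8:446 + 16] = struct.pack("<II", start, end - start + 1)
    mbr[510:512] = b"\x55\xaa"
    return bytes(mbr)


# An all-zero sector has no signature, like the corrupt volume above.
blank_result = read_partition_table(bytes(SECTOR_SIZE))
# A sector carrying the values from the healthy listing.
table = read_partition_table(build_sample_mbr(128, 629145591, 0xFB))
print(blank_result)
print(table)
```

On a real system you would feed it the first 512 bytes of the device, for example a copy taken with dd, and never write anything back from such a script.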
A hexdump of the device lets you check whether it was a VMFS volume or not:
[root@myesxserver vmhba2]# hexdump -C /dev/sdb | more
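Paging through a hexdump of a 400 GB LUN by eye is slow, so a signature search can help. The sketch below rests on assumptions: that a VMFS3 volume carries the LVM magic 0xC001D00D (bytes 0d d0 01 c0 on disk), the value used by the open-source vmfs-tools project, and that it sits 0x100000 bytes into the partition. Verify both against a known-good volume before you rely on them. The function name find_signature is my own, and the demo runs on an in-memory image, not on a device.

```python
import io

# Assumption: VMFS3 LVM magic 0xC001D00D, stored little-endian on disk.
VMFS_LVM_MAGIC = bytes.fromhex("0dd001c0")


def find_signature(stream, signature, limit, chunk_size=1 << 20):
    """Return the offset of the first match within the first `limit` bytes, else None."""
    keep = len(signature) - 1
    offset = 0      # absolute offset of the next chunk to read
    tail = b""      # end of the previous chunk, to catch matches across chunks
    while offset < limit:
        chunk = stream.read(min(chunk_size, limit - offset))
        if not chunk:
            break
        window = tail + chunk
        pos = window.find(signature)
        if pos != -1:
            return offset - len(tail) + pos
        tail = window[-keep:] if keep else b""
        offset += len(chunk)
    return None


# Demo image: partition assumed to start at sector 128 (0x10000 bytes),
# magic assumed 0x100000 bytes into the partition.
image = bytearray(2 * 1024 * 1024)
image[0x110000:0x110004] = VMFS_LVM_MAGIC
found = find_signature(io.BytesIO(bytes(image)), VMFS_LVM_MAGIC, len(image))
print(hex(found))
```

Against a real device you would open /dev/sdb read-only in binary mode and limit the search to the first few megabytes.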
To recover the partition table of /dev/sdb, perform the following steps:
Open fdisk on the volume:
[root@myesxserver vmhba2]# fdisk -u /dev/sdb
Create a new partition
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4):1
First cylinder(1-39162,default 1): Take default
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-39162, default 39162): Take default
Using default value 39162
Change partition system id to fb for VMFS partition type
Command(m for help):t
Selected partition 1
Hex code (type L to list codes):fb
Changed system type of partition 1 to fb (Unknown)
Move the beginning of the partition to sector 128, which is the default start for VMFS volumes
Command (m for help): x
Expert command (m for help): b
Partition number (1-4): 1
New beginning of data (63-629137529, default 63):128
Write this table to disk and exit
Expert command(m for help):w
The partition table has been altered
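The default end that fdisk proposes is simply the last sector of the last full cylinder. A short worked calculation, using only the geometry and the example listings from this article, shows why that default differs from the end of the healthy example partition. The function name is mine.

```python
def last_sector_of_cylinder(cylinder, heads, sectors_per_track):
    """Last sector (0-based) of the given 1-based cylinder in CHS geometry."""
    return cylinder * heads * sectors_per_track - 1


# Geometry reported by fdisk: 255 heads, 63 sectors/track, 39162 cylinders.
default_end = last_sector_of_cylinder(39162, 255, 63)

# End sector shown in the healthy example listing.
healthy_end = 629145591

print(default_end)                # upper bound fdisk offered for the partition
print(healthy_end - default_end)  # sectors the default falls short by
```

The first value matches the upper bound of 629137529 that fdisk printed in expert mode, and it lies 8062 sectors before the end shown in the healthy example. That is why the default cannot simply be trusted.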
You can’t know the ending sector of this partition beforehand. To find the correct ending sector, run the following command:
[root@myesxserver vmhba2]# vmkfstools -V
This is an undocumented VMware command. It gives you the ending block of the previous partition; also check the /var/log/vmkernel log after running it. The log warns that the new partition isn’t the same size as the previous partition, and it mentions the actual size and the stored blocks of the previous partition. The ending value that you need for your partition table follows from:
actual blocks - stored blocks - ending block = -(calculated last cylinder)
The result is a negative number. That number without the minus sign is your last cylinder.
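Written out as code, the calculation looks like the sketch below. It follows the formula exactly as stated above. The parameter names are my own labels for the three values you read from vmkfstools -V and the vmkernel log, and the numbers in the example are placeholders, not values from a real log.

```python
def calculated_last_cylinder(actual_blocks, stored_blocks, ending_block):
    """Apply: actual blocks - stored blocks - ending block = -(last cylinder)."""
    difference = actual_blocks - stored_blocks - ending_block
    if difference >= 0:
        # The article expects a negative result; anything else means the
        # values were probably copied from the log incorrectly.
        raise ValueError("expected a negative result, re-check the log values")
    return -difference


# Placeholder numbers purely to show the arithmetic.
example = calculated_last_cylinder(actual_blocks=1000,
                                   stored_blocks=200,
                                   ending_block=5000)
print(example)
```

The guard against a non-negative result is my own addition: if the subtraction does not come out negative, stop and re-read the log before you write anything to disk.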
Delete the partition you just created:
[root@myesxserver vmhba2]# fdisk -u /dev/sdb
Command (m for help): d
Command (m for help):w
Create a new partition
[root@myesxserver vmhba2]# fdisk -u /dev/sdb
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4):1
First cylinder(1-39162,default 1):Take default
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-39162, default 39162): calculated last cylinder
Using calculated last cylinder
Change partition system id to fb for VMFS partition type
Command(m for help):t
Selected partition 1
Hex code (type L to list codes):fb
Changed system type of partition 1 to fb (Unknown)
Move the beginning of the partition to sector 128, which is the default start for VMFS volumes
Command (m for help): x
Expert command (m for help): b
Partition number (1-4): 1
New beginning of data (63-629137529, default 63):128
Write this table to disk and exit
Expert command(m for help):w
The partition table has been altered
Run vmkfstools -V again and check whether the vmkernel log still complains about ending sector sizes that don’t match. Normally you shouldn’t get this error now.
In the VI client choose:
Configuration->Storage and hit Refresh
The lost volume should reappear and the data should be accessible.
Congratulations, you just recreated the partition table manually and avoided a long restore operation.