GRUB GPT HOWTO

This guide shows how to use GRUB to boot a disk that has a GPT partition table.

This guide assumes you want a boot filesystem and an LVM physical volume on your GPT partitioned disk.

You may need to recompile Linux. Select “EFI GUID Partition support” under Partition Types in the Filesystems area. You may also like to select the RAID and LVM modules in order to use LVM. You probably don’t need EFI firmware support unless your machine has EFI; if it does not, you need to get a traditional OS loader to function, as described here.

Normally, GRUB 1 does not understand GPT partition tables and needs to be tricked into starting from one. You need to create a very small partition at the start of your disk to hold the GRUB stage2 (or stage1.5, if you would like to start stage2 from /boot).

Parted

First, create the GPT partition table on your device. I suggest making the first partition the smallest size that parted lets you get away with, as long as it is bigger than the stage2 image. The second partition will be your /boot and holds Linux and its ramfs images; I suggest around a gigabyte for this. The rest of the hard disk is allocated to the third and final partition, which is your LVM volume.

mklabel gpt
mkpart non-fs 0 2
mkpart ext3 2 130
mkpart lvm 130 30401
GNU Parted 1.8.9
Using /dev/hda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) unit cyl
(parted) p
Model: SAMSUNG SP2514N (ide)
Disk /dev/hda: 30401cyl
Sector size (logical/physical): 512B/512B
BIOS cylinder,head,sector geometry: 30401,255,63.  Each cylinder is 8225kB.
Partition Table: gpt

Number  Start   End       Size      File system  Name    Flags
 1      0cyl    2cyl      1cyl                   non-fs
 2      2cyl    130cyl    128cyl    ext3         ext3
 3      130cyl  30401cyl  30271cyl               lvm

The filesystem types were non-fs, ext3, and lvm, respectively.

Gptsync

Now, as GRUB does not understand this, you need a fake MBR. Enter gptsync. You can get this program by installing refit. Run it on the device you just set up, after exiting parted.

gptsync /dev/hda

gptsync sets up the MBR to point to the fake partitions; however, the partition IDs will need correcting with fdisk next. Notice that the partition numbers are increased by one compared to GPT, because there is an extra partition as the first one.
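
The numbering shift matters later, when GRUB is pointed at /boot. As a minimal sketch, assuming gptsync places the protective entry first as described above and that GRUB 1 counts partitions from zero, the mapping can be written out like this (the function names are illustrative and not part of any tool):

```shell
#!/bin/bash
# Hypothetical helpers; not part of gptsync or GRUB.

# MBR slot that holds a given GPT partition: slot 1 is the
# protective ee entry, so every GPT partition moves up by one.
gpt_to_mbr() {
    echo $(( $1 + 1 ))
}

# GRUB 1 reads the MBR and numbers partitions from zero.
mbr_to_grub1() {
    echo $(( $1 - 1 ))
}

for gpt in 1 2 3; do
    mbr=$(gpt_to_mbr "$gpt")
    echo "GPT ${gpt} -> MBR ${mbr} -> GRUB 1 (hdN,$(mbr_to_grub1 "$mbr"))"
done
```

For the /boot partition (GPT 2) this gives MBR 3 and (hdN,2), which is the partition index used in the GRUB 1 scripts in this guide. The drive number N depends on how GRUB enumerates your disks.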

Fixup with Fdisk

fdisk /dev/hda
/dev/hda1               1           1          16+  ee  EFI GPT
/dev/hda2   *           1           3       16048+  da  Non-FS data
/dev/hda3               3         131     1028160   83  Linux
/dev/hda4             131       30402   243154342   8e  Linux LVM

Set the first MBR partition type ee, as it is your GPT partition table.

The second partition in MBR is your GPT table's first partition. We use this to store the Grub stageloader, so set the type to da meaning this is not a filesystem.

The third partition is your GPT second, being the /boot partition and given the 83 type to say so.

The fourth is the LVM to-be; in GPT it is the third, and it has the type 8e.

To Linux, the GPT table takes precedence over the MBR (but check that you have only 3 partition device nodes, not 4, to be sure), thus /dev/hda1 maps to the stageloader area rather than to the GPT partition table.
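
The device node check can be done by counting the partitions the kernel lists for the disk. This is a rough sketch that reads text in the format of /proc/partitions; the function name and the sample figures are illustrative only:

```shell
#!/bin/bash
# Count the partition entries of one disk in /proc/partitions-style
# input, where each line reads "major minor #blocks name".
# Argument: disk name without /dev/, e.g. hda.
count_partitions() {
    awk -v d="$1" '$4 ~ ("^" d "[0-9]+$") { n++ } END { print n + 0 }'
}

# Illustrative sample; on a live system use:
#   count_partitions hda < /proc/partitions
sample='major minor  #blocks  name

   3        0  244196032 hda
   3        1      16048 hda1
   3        2    1028160 hda2
   3        3  243154342 hda3'

printf '%s\n' "$sample" | count_partitions hda
```

A count of 3 suggests the kernel is reading the GPT; a count of 4 would suggest it fell back to the MBR written by gptsync.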

If you forgot to compile GPT support into Linux, you can probably get away with it as long as your LVM partition is not bigger than the maximum that the MBR partition table can support. Just access the GRUB partition through hda2 instead of hda1 until you get round to recompiling Linux.

Filesystems

At this time you will want to format the second GPT partition with ext3 and install the /boot files in there.

Also, initialise the LVM partition as a physical volume and add a volume group, say system, then add logical volumes as desired, say root and swap.

GRUB 1

Now it is time to install GRUB.

34 is the offset in sectors from the start of the device to the first GPT partition. Each sector is customarily 512 bytes long, so you can find the start of the first GPT partition at offset 0x4400 in hexedit.
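
The offset can be recomputed for other sector numbers or sector sizes. A small sketch, assuming 512-byte sectors by default (the function name is made up for this example):

```shell
#!/bin/bash
# Byte offset of a sector, printed in hex as a hex editor shows it.
# Arguments: SECTOR [SECTOR_SIZE]; the size defaults to 512 bytes.
sector_to_hex_offset() {
    printf '0x%X\n' $(( $1 * ${2:-512} ))
}

sector_to_hex_offset 34    # prints 0x4400
```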

The Grub stage1 in the first sector of your hard disk is to load the stage2 from this partition.

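#!/bin/bash
# erase partition (dd stops with "No space left on device" once the
# partition is full; that is expected)
dd if=/dev/zero of=/dev/hda1

# length in sectors of stage2, rounded up to whole 512-byte sectors
FILE=/boot/grub/stage2
S=$(( ( $(stat -c %s "${FILE}") + 511 ) / 512 ))

# put loader in partition
cat "${FILE}" > /dev/hda1

# install grub: stage1 goes in the MBR and loads S sectors starting
# at sector 34, the start of the first GPT partition
grub --no-floppy --batch << EOF
root (hd1,2)
install (hd1,2)/grub/stage1 (hd1) (hd1)34+${S} (hd1,2)/grub/menu.lst
EOF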
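
The sector count S above is a ceiling division of the file size by 512. Before writing to the partition it may be worth checking that the image actually fits. The following is only a sketch with made-up function names, demonstrated on a scratch file so that no real device is touched:

```shell
#!/bin/bash
# Number of 512-byte sectors needed to hold a file, rounded up.
# Uses GNU stat, as the scripts in this guide do.
sectors_needed() {
    bytes=$(stat -c %s "$1") || return 1
    echo $(( (bytes + 511) / 512 ))
}

# Succeeds if FILE fits into a target of SECTORS sectors.
# Arguments: FILE SECTORS.
fits_in_sectors() {
    [ "$(sectors_needed "$1")" -le "$2" ]
}

# Demonstration on a scratch file.
tmp=$(mktemp)
head -c 1000 /dev/zero > "$tmp"
echo "1000 bytes need $(sectors_needed "$tmp") sectors"
rm -f "$tmp"
```

On a real system one might compare sectors_needed /boot/grub/stage2 against the partition size in sectors, which blockdev --getsz /dev/hda1 reports.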

Or, if you prefer, use a stage 1.5. This makes starting up slower, but GRUB can then be upgraded by replacing the stage2 file on /boot.

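#!/bin/bash
# erase partition (dd stops with "No space left on device" once the
# partition is full; that is expected)
dd if=/dev/zero of=/dev/hda1

# length in sectors of chosen stage1_5, rounded up to whole sectors
FILE=/boot/grub/e2fs_stage1_5
S=$(( ( $(stat -c %s "${FILE}") + 511 ) / 512 ))

# put loader in partition
cat "${FILE}" > /dev/hda1

# install grub: stage1 loads the stage1_5 from sector 34, which then
# reads stage2 from the /boot filesystem
grub --no-floppy --batch << EOF
root (hd1,2)
install (hd1,2)/grub/stage1 (hd1) (hd1)34+${S} (hd1,2)/grub/stage2 (hd1,2)/grub/menu.lst
EOF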

GRUB 2

You may also use the non-fs partition to improve resilience when starting to use GRUB 2 in place of GRUB 1. GRUB 2 can load itself from the non-fs partition and therefore avoid using blocklists.

The non-fs partition has to be marked for the installation of GRUB; the parted commands below set the bios_grub flag on it.

In this situation you may not need to use gptsync any more as GRUB 2 understands the GPT tables. boot.img replaces the GRUB1 stage1 and goes in the MBR area. core.img replaces stage2 and will be copied into the non-fs partition.

It is also possible to pre-load GRUB 2 with LVM support. Then no /boot volume is needed: we can use a partition for core.img with LVM and ext2 support embedded, and an LVM physical volume containing at least the root filesystem, where GRUB will locate the remaining modules that are not embedded, the config files and the background images. We can confirm this by editing grub-install to echo grub_mkimage and checking that lvm will be embedded.

parted /dev/hda
set 1 bios_grub on
quit

grub-install /dev/hda

Draft of Soft RAID mirror with GPT and LVM

Here we have the current setup of this system, a RAID mirror with the superblock at the end. We may use Intel matrixRAID where the baseboard uses that, and mdadm 1.0 superblock otherwise.

Find out how big a mirror Matrix RAID gives by configuring one to try out. It is good to iterate over the commands to become very familiar with RAID setup and teardown before relying on the OS, in case errors in this or other guides, issues with the utilities or unfamiliarity lead to data loss. We had to create a container and then allocate block devices inside it; this allows the user to have a mirror for a filesystem and stripes for a swap area if that is wanted. I prefer to have a single mirror for now.

mdadm -v -v --create -l container -e imsm --raid-devices=2 imsm0 /dev/sda /dev/sdb

Subsequently mdadm --assemble --scan can be used to set it up. See the container with mdadm --detail /dev/md/imsm0

Now allocate a block volume for the mirror in the RAID set. We will use all the space:

mdadm --create -l mirror stat --raid-devices=2 /dev/md/imsm0

It is also possible to create a single drive array on some matrixraid systems.

mdadm -v --create -f -l container -e imsm --raid-devices=1 nn /dev/sda
mdadm -v --create -f -l stripe n --raid-devices=1 /dev/md127
mdadm -v --stop /dev/md126
mdadm -v --kill-subarray=0 /dev/md127
mdadm -v --stop /dev/md127
mdadm -v --zero-superblock /dev/sda

The block devices can be accessed again with mdadm --incremental /dev/md/imsm0 and deleted with a command such as mdadm --kill-subarray=0 /dev/md/imsm0

We made a mirror pair, which can be inspected with mdadm --detail /dev/md/stat, and Linux now checks that the two halves are the same. The first sector of the mirror is at sector 0 of the drives, so we can put an MBR or GPT on it and start from it.

We can stop (unmount) it without deleting it with mdadm --stop /dev/md/stat

At this point we destroyed the array and recreated it for real with a 1.0 superblock. The -z option reduces the space allocated to the mirror to match what we would get from IMSM, allowing easy migration of the drives to a Matrix RAID baseboard later.

mdadm -v --create -l mirror -e 1.0 --assume-clean --raid-devices=2  -z $((0x3A381400000 / 1024)) stat /dev/disk/by-id/ata-WDC_WD40EZRX-00SPEB0_WD-WCC4E0227*
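
The -z argument is given in KiB, which is why the byte count is divided by 1024. The arithmetic can be checked separately; this sketch only restates the figure from the command above in different units:

```shell
#!/bin/bash
# Per-device size used in this guide, in bytes (taken from the
# mdadm command above).
size_bytes=$(( 0x3A381400000 ))

# mdadm -z (--size) expects KiB.
size_kib=$(( size_bytes / 1024 ))

# The same size in 512-byte sectors, the unit parted uses with "s".
size_sectors=$(( size_bytes / 512 ))

echo "bytes:   ${size_bytes}"
echo "KiB:     ${size_kib}"
echo "sectors: ${size_sectors}"
```

The sector count comes out at 7814029312, which agrees with the disk size that parted reports for the mirror.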

We partitioned the resulting mirror like so; the LVM runs from sector 2048 to the end of the RAID volume.

Model: Linux Software RAID Array (md)
Disk /dev/md127: 7814029312s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt

Number  Start  End          Size         File system  Name                 Flags
 1      34s    2047s        2014s                     BIOS boot partition  bios_grub
 2      2048s  7814029278s  7814027231s               Linux LVM            lvm
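
The end and size figures in the table follow from the GPT layout. As a sketch, assuming 512-byte logical sectors and the standard 128-entry partition array (so that the backup table and header take the last 33 sectors), they can be reproduced like this; the function names are illustrative:

```shell
#!/bin/bash
# Last sector usable for partitions, given the total sector count.
# Sectors are numbered from 0, and the final 33 hold the backup GPT.
gpt_last_usable() {
    echo $(( $1 - 34 ))
}

# Size of a partition from its inclusive start and end sectors.
# Arguments: START END.
part_size() {
    echo $(( $2 - $1 + 1 ))
}

total=7814029312
last=$(gpt_last_usable "$total")
echo "last usable sector: ${last}"
echo "LVM partition size: $(part_size 2048 "$last") sectors"
```

The same reserved area at the front of the disk (protective MBR, header and partition array) is why the first partition cannot start before sector 34.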

If instead one has an EFI-capable system, the table might look like this. We do away with the bios_grub partition, as the EFI partition can hold a full core.img.

EFI is typically mounted as /boot/efi and contains files like /boot/efi/shell.efi and /boot/efi/efi/grub/grub.efi

Model: Linux Software RAID Array (md)
Disk /dev/md126: 1953517568s
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags: 

Number  Start     End          Size         File system  Name        Flags
 1      2048s     2097151s     2095104s     fat32        EFI System  boot, esp
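 2      2097152s  1953517534s  1951420383s               Linux LVM   lvm

Re-read the partition table so that the kernel sees the new partitions, then initialise the LVM partition as a physical volume:

blockdev --rereadpt /dev/md127
pvcreate /dev/md127p2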

At this point we can set up the minimal OS on the LVM, for example a 20GB root and a 2GB swap logical volume. No dedicated /boot filesystem is used; it is kept on the root filesystem.

Also recommended here is to edit /etc/lvm/lvm.conf to exclude the raw drives from scanning for volumes: filter = [ "r|/dev/sda|", "r|/dev/sdb|" ] then check with vgscan -vvv 2>&1 | grep regex and commit to the initramfs with update-initramfs -u. We can do a thorough check by unpacking the initrd in /tmp to inspect it: < /boot/initrd.img-`uname -r` gunzip | cpio -i
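
To see which device paths a reject-only filter like this one would exclude, the matching can be modelled in a few lines. This is a rough approximation and not LVM's own code: it treats the patterns as plain strings, which for these two patterns is equivalent to an unanchored regular-expression match, and it accepts any device that no pattern matches.

```shell
#!/bin/bash
# Rough model of a reject-only LVM filter. The patterns are plain
# strings here; real filters may use full regular expressions.
reject_patterns="/dev/sda /dev/sdb"

# Succeeds if the device path in $1 would still be scanned.
lvm_would_scan() {
    for pat in $reject_patterns; do
        case "$1" in
            *"$pat"*) return 1 ;;   # a reject pattern matched
        esac
    done
    return 0                        # nothing matched: accepted
}

for dev in /dev/sda /dev/sdb /dev/sda1 /dev/md127p2; do
    if lvm_would_scan "$dev"; then
        echo "${dev}: scanned"
    else
        echo "${dev}: skipped"
    fi
done
```

The raw drives and anything whose path contains their names are skipped, while the partition on the mirror is still scanned, which is the intent of the filter.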

Some tricks may be needed to persuade GRUB 2 to install here. It may refuse to install on the mirror block device; if so, we can still install it on one of the hard disks directly and copy the affected sectors over to the other using a utility such as dd. The sectors that comprise the mirror should be identical across the 2 drives.

grub-install --debug --modules=$'raid lvm ext2' /dev/disk/by-id/ata-WDC_WD40EZRX-00SPEB0_WD-WCC4E0227*
grub-install --debug --modules=$'mdraid09 mdraid1x raid lvm ext2' /dev/sd[bc]
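
Copying the affected sectors with dd could look roughly like the sketch below. It is demonstrated on two scratch files standing in for the drives; the function name is made up, and which sectors are affected on a real system is an assumption that should be checked first.

```shell
#!/bin/bash
# Copy the first COUNT 512-byte sectors of SRC over DST without
# truncating DST. Arguments: SRC DST COUNT.
copy_boot_sectors() {
    dd if="$1" of="$2" bs=512 count="$3" conv=notrunc 2>/dev/null
}

# Two scratch files stand in for the drives.
a=$(mktemp)
b=$(mktemp)
head -c 4096 /dev/zero | tr '\0' 'A' > "$a"
head -c 4096 /dev/zero | tr '\0' 'B' > "$b"

copy_boot_sectors "$a" "$b" 2

# The first two sectors now match; the rest of b is left as it was.
if [ "$(head -c 1024 "$a" | cksum)" = "$(head -c 1024 "$b" | cksum)" ]; then
    echo "first 2 sectors identical"
fi
rm -f "$a" "$b"
```

With the BIOS layout shown earlier, a count of 2048 would cover the boot sector, the GPT and the bios_grub partition. Double-check the source and destination device names before running anything like this on real drives, since dd overwrites without asking.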

Trial startup

GRUB 2 may show a rescue prompt at startup, which we will see if we rename the logical volume containing root. There is neither help nor tab completion, so recovery commands have to be entered in full, at least until we load the normal module.

grub rescue>ls⏎
grub rescue>set⏎

GRUB 2 lists the detected devices; hopefully we see the root LVM volume amongst them. If not, LVM support was not installed into GRUB and we need to reinstall it.

grub rescue>set root=(vg-root)
grub rescue>set prefix=(vg-root)/boot/grub
grub rescue>insmod normal⏎
grub rescue>normal⏎

Once normal is loaded we can start a kernel manually.

grub>linux <path to kernel> root=/dev/mapper/<root device> <additional options>
grub>initrd <path to initrd>
grub>boot

When the system has started, we can correct the GRUB config so that it loads normally.

Advantages

With the 1.0 mirrored GPT, if either drive fails we can easily swap the remaining drive to the primary channel to recover, and we are less dependent on a non-redundant startup drive. Then, when we get a replacement for the failed unit, we can instruct mdadm to re-introduce it to the array.
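
Re-introducing a replacement drive might be done along the following lines. This is a cautious sketch: the array and drive names are placeholders, the function is hypothetical, and by default it only prints the commands so that they can be reviewed before being run.

```shell
#!/bin/bash
# Print (or, with RUN=1 in the environment, execute) the commands to
# add a replacement drive to the mirror and show the array state.
# Arguments: ARRAY NEW_DRIVE. Both are placeholders in this example.
readd_drive() {
    for cmd in "mdadm --manage $1 --add $2" "mdadm --detail $1"; do
        if [ "${RUN:-0}" = "1" ]; then
            $cmd
        else
            echo "$cmd"
        fi
    done
}

readd_drive /dev/md/stat /dev/sdb
```

Because the mirror starts at sector 0 of each drive, the resync should also bring the partition table and the boot sectors onto the new drive.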

Other OS integration

Use bcdedit to call out to GRUB from the Windows boot manager.