Boot issues can be among the most challenging problems to troubleshoot in Linux systems. When your AlmaLinux system fails to boot properly, having the knowledge and tools to diagnose and fix the problem is essential. This comprehensive guide covers common boot issues, their causes, and detailed recovery methods to get your system running again.
Understanding the Boot Process
Before diving into troubleshooting, it’s crucial to understand the AlmaLinux boot process:
Boot Sequence Overview
- BIOS/UEFI: Hardware initialization and boot device selection
- GRUB2: Bootloader loads kernel and initramfs
- Kernel: Hardware detection and driver initialization
- initramfs: Initial RAM filesystem prepares root filesystem
- systemd: System and service manager starts services
- Target Units: Multi-user or graphical target reached
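Several recovery steps later in this guide differ between BIOS and UEFI systems, so it helps to know which firmware mode the machine booted in. The following is a minimal sketch, assuming the usual Linux behavior that /sys/firmware/efi exists only on a UEFI boot; the function name `detect_firmware` and its optional sysfs-root argument are illustrative, not part of any AlmaLinux tool.

```shell
#!/bin/bash
# detect_firmware [SYSFS_ROOT]
# Prints "UEFI" if an efi directory exists under SYSFS_ROOT/firmware,
# otherwise "BIOS". SYSFS_ROOT defaults to /sys; the parameter is a
# hypothetical convenience so the function can be tried on a fake tree.
detect_firmware() {
    local sysfs_root="${1:-/sys}"
    if [ -d "$sysfs_root/firmware/efi" ]; then
        echo "UEFI"
    else
        echo "BIOS"
    fi
}

echo "Firmware mode: $(detect_firmware)"
```

From live media the result describes how the live environment was booted, which is normally the same mode the installed system uses.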
Key Boot Components
# View boot messages
dmesg | less
journalctl -b
# Check boot time
systemd-analyze
systemd-analyze blame
systemd-analyze critical-chain
# View kernel command line
cat /proc/cmdline
# Check the default target and list the active targets
systemctl get-default
systemctl list-units --type=target
Common Boot Issues
1. System Hangs During Boot
Symptoms: Boot process stops at a specific point
Common Causes:
- Failing hardware
- Corrupted system files
- Problematic services
- Driver issues
Diagnosis:
# Boot with minimal parameters
# Add to kernel command line:
systemd.unit=multi-user.target
# or
systemd.unit=rescue.target
# Enable verbose boot
# Remove 'quiet' and 'rhgb' from kernel parameters
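Removing 'quiet' and 'rhgb' is a plain word-level edit of the kernel command line. As an illustration only, the sketch below applies that edit to a command-line string; the function name `strip_quiet_args` and the sample command line (including its UUID) are made up for the example.

```shell
#!/bin/bash
# strip_quiet_args "CMDLINE"
# Prints CMDLINE without the 'quiet' and 'rhgb' words, keeping every other
# parameter in its original order. The body runs in a subshell so that
# disabling glob expansion does not leak into the caller.
strip_quiet_args() (
    set -f                      # split on whitespace only, no glob expansion
    out=""
    for word in $1; do
        case "$word" in
            quiet|rhgb) ;;      # drop the parameters that hide boot messages
            *) out="${out:+$out }$word" ;;
        esac
    done
    echo "$out"
)

# Example with a made-up command line
strip_quiet_args "BOOT_IMAGE=/vmlinuz ro root=UUID=1234 rhgb quiet"
```

At the GRUB menu the same edit is made by hand on the line starting with "linux". To make it permanent on an installed system, `grubby --update-kernel=ALL --remove-args="rhgb quiet"` removes the two parameters from all boot entries.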
2. Black Screen After GRUB
Symptoms: Screen goes black after selecting kernel
Solutions:
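# Disable kernel mode setting if the graphics driver fails to initialize
# Add to kernel command line:
nomodeset
# or disable a single video output (the connector name is an example)
video=SVIDEO-1:d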
3. Boot Loop
Symptoms: System continuously reboots
Common Causes:
- Kernel panic
- Critical service failure
- Hardware issues
4. “No Bootable Device” Error
Symptoms: BIOS/UEFI cannot find boot device
Solutions:
- Check boot order in BIOS/UEFI
- Verify disk connections
- Repair boot sector
GRUB Bootloader Problems
GRUB Rescue Mode
When GRUB cannot find its configuration:
# In GRUB rescue mode
grub rescue> ls
grub rescue> ls (hd0,msdos1)/
grub rescue> set root=(hd0,msdos1)
# If /boot is a separate partition, use (hd0,msdos1)/grub2 as the prefix instead
grub rescue> set prefix=(hd0,msdos1)/boot/grub2
grub rescue> insmod normal
grub rescue> normal
Reinstalling GRUB
For BIOS Systems:
# Boot from live media and chroot
mount /dev/sda1 /mnt
mount --bind /dev /mnt/dev
mount --bind /proc /mnt/proc
mount --bind /sys /mnt/sys
chroot /mnt
# Reinstall GRUB
grub2-install /dev/sda
grub2-mkconfig -o /boot/grub2/grub.cfg
# Exit chroot
exit
umount -R /mnt
reboot
For UEFI Systems:
# Boot from live media
mount /dev/sda2 /mnt # Root partition
mount /dev/sda1 /mnt/boot/efi # EFI partition
mount --bind /dev /mnt/dev
mount --bind /proc /mnt/proc
mount --bind /sys /mnt/sys
chroot /mnt
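# Reinstall GRUB and shim for UEFI (x86_64 package names)
dnf reinstall grub2-efi-x64 shim-x64
# Regenerate the configuration. On AlmaLinux 9 the file under /boot/efi/EFI/almalinux/
# is a small stub that loads /boot/grub2/grub.cfg, so write the real configuration there.
# On AlmaLinux 8, write to /boot/efi/EFI/almalinux/grub.cfg instead.
grub2-mkconfig -o /boot/grub2/grub.cfg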
# Exit and reboot
exit
umount -R /mnt
reboot
Repairing GRUB Configuration
# Regenerate GRUB configuration
grub2-mkconfig -o /boot/grub2/grub.cfg
# For UEFI systems on AlmaLinux 8 (AlmaLinux 9 uses /boot/grub2/grub.cfg for both BIOS and UEFI)
grub2-mkconfig -o /boot/efi/EFI/almalinux/grub.cfg
# Set default kernel
grub2-set-default 0
# Verify configuration
grub2-editenv list
GRUB Password Recovery
If GRUB is password-protected and you’ve forgotten the password:
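# Boot from live media, mount the system, and chroot
mount /dev/sda1 /mnt
mount --bind /dev /mnt/dev
mount --bind /proc /mnt/proc
mount --bind /sys /mnt/sys
chroot /mnt
# A password set with grub2-setpassword is stored in /boot/grub2/user.cfg
rm -f /boot/grub2/user.cfg
# If password entries were added by hand, remove them from /etc/grub.d/01_users
vi /etc/grub.d/01_users
grub2-mkconfig -o /boot/grub2/grub.cfg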
Kernel-Related Issues
Kernel Panic
Symptoms: System crashes with kernel panic message
Common Causes:
- Hardware failure
- Corrupted kernel
- Driver issues
- Out of memory
Recovery Steps:
- Boot with Previous Kernel:
# At GRUB menu, select older kernel
# If successful, remove the problematic kernel
# (removing kernel-core also removes the matching kernel and kernel-modules packages)
rpm -qa 'kernel*' | sort
dnf remove kernel-core-[version]
- Boot with Minimal Parameters:
# Add to kernel command line:
init=/bin/bash
# or
single
# or
emergency
- Disable Problematic Drivers:
# Add to kernel command line:
modprobe.blacklist=driver_name
Missing or Corrupted Kernel
# Boot from live media and chroot
mount /dev/sda1 /mnt
mount --bind /dev /mnt/dev
mount --bind /proc /mnt/proc
mount --bind /sys /mnt/sys
chroot /mnt
# Reinstall kernel
dnf reinstall kernel kernel-core kernel-modules
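# Regenerate initramfs for every installed kernel
# (inside a chroot, plain "dracut --force" targets the live media's running kernel)
dracut --regenerate-all --force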
# Update GRUB
grub2-mkconfig -o /boot/grub2/grub.cfg
initramfs Issues
# Regenerate initramfs for current kernel
dracut --force
# Regenerate for a specific kernel (replace <version> with a directory name from /lib/modules)
dracut --force /boot/initramfs-<version>.img <version>
# Regenerate for all kernels
dracut --regenerate-all --force
# Add drivers to initramfs
dracut --add-drivers "driver1 driver2" --force
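Before regenerating anything, it can be useful to see which installed kernels actually lack an initramfs. The sketch below assumes the standard /boot naming (vmlinuz-VERSION paired with initramfs-VERSION.img); the function name `list_missing_initramfs` and its directory argument are illustrative.

```shell
#!/bin/bash
# list_missing_initramfs [BOOT_DIR]
# For every vmlinuz-<version> in BOOT_DIR (default /boot), prints <version>
# when initramfs-<version>.img is missing or empty.
list_missing_initramfs() {
    local boot_dir="${1:-/boot}" kernel version
    for kernel in "$boot_dir"/vmlinuz-*; do
        [ -e "$kernel" ] || continue            # the glob matched nothing
        version="${kernel##*/vmlinuz-}"
        case "$version" in
            *rescue*) continue ;;               # rescue entry is not a regular kernel version
        esac
        [ -s "$boot_dir/initramfs-$version.img" ] || echo "$version"
    done
}

list_missing_initramfs /boot
```

Each version it prints can then be rebuilt with `dracut --force --kver VERSION`.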
Filesystem and Mount Issues
Root Filesystem Errors
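When the root filesystem has errors:
# Boot to emergency mode
# Add to kernel command line:
systemd.unit=emergency.target
# At the emergency prompt, check the filesystem while root is still mounted read-only.
# Do not remount read-write before the check: fsck on a writable filesystem can corrupt it.
fsck -f /dev/sda1
# XFS cannot be repaired while mounted; run xfs_repair from rescue or live media instead.
# Remount read-write only afterwards, when you need to edit files:
mount -o remount,rw /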
Fixing Filesystem Errors
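# For ext4 filesystems (the filesystem must be unmounted)
e2fsck -f /dev/sda1
# Answer "yes" to every repair prompt
e2fsck -f -y /dev/sda1
# For XFS filesystems (the filesystem must be unmounted)
xfs_repair /dev/sda1
# Last resort: zero the log. Metadata changes not yet replayed are discarded, which can lose data
xfs_repair -L /dev/sda1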
/etc/fstab Issues
Common fstab problems and solutions:
# Boot with init=/bin/bash
# Remount root as read-write
mount -o remount,rw /
# Edit fstab
vi /etc/fstab
# Common fixes:
# - Comment out problematic entries
# - Fix UUID references
# - Correct mount options
# Verify UUIDs
blkid
# Test fstab without rebooting
mount -a
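Stale UUID references are a frequent cause of fstab-related boot failures, and comparing fstab against `blkid` by eye is error-prone. The following sketch automates that comparison; the function name `find_stale_fstab_uuids` and the temporary file path in the usage comment are assumptions made for the example.

```shell
#!/bin/bash
# find_stale_fstab_uuids FSTAB_FILE KNOWN_UUIDS_FILE
# Prints every UUID used as UUID=... in the first field of FSTAB_FILE that
# is not listed in KNOWN_UUIDS_FILE (one UUID per line). Comment lines are
# ignored because their first field starts with '#'.
find_stale_fstab_uuids() {
    local fstab="$1" known="$2" uuid
    awk '$1 ~ /^UUID=/ { sub(/^UUID=/, "", $1); gsub(/"/, "", $1); print $1 }' "$fstab" |
    while read -r uuid; do
        grep -qxF -- "$uuid" "$known" || echo "$uuid"
    done
}

# Typical use as root (the temporary file name is arbitrary):
#   blkid -s UUID -o value > /tmp/known-uuids
#   find_stale_fstab_uuids /etc/fstab /tmp/known-uuids
```

Any UUID it prints belongs to an fstab entry whose device no longer exists under that identifier. `findmnt --verify` performs a broader consistency check of fstab and is worth running as well.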
LVM Recovery
# Activate LVM volumes
vgscan
vgchange -ay
# Check LVM status
pvs
vgs
lvs
# Repair LVM metadata
vgcfgrestore vg_name
# Force activation
vgchange -ay --activationmode partial
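After activation it is worth confirming that no logical volume was left inactive. The sketch below reads the two-column output of `lvs --noheadings -o lv_path,lv_attr` and relies on the documented meaning of the fifth lv_attr character (state, where 'a' is active); the function name `list_inactive_lvs` and the sample volume names are invented.

```shell
#!/bin/bash
# list_inactive_lvs
# Reads "lv_path lv_attr" pairs on stdin and prints the path of every
# logical volume whose fifth attribute character is not 'a' (active).
list_inactive_lvs() {
    local path attr
    while read -r path attr; do
        [ -n "$attr" ] || continue          # skip blank or malformed lines
        case "$attr" in
            ????a*) ;;                      # active: nothing to report
            *) echo "$path" ;;
        esac
    done
}

# Sample input in the lvs output format; volume names are made up
printf '%s\n' '  /dev/vg0/root -wi-ao----' '  /dev/vg0/data -wi-------' | list_inactive_lvs
```

On a real system the input would come from `lvs --noheadings -o lv_path,lv_attr | list_inactive_lvs`, and each reported path can be activated individually with `lvchange -ay`.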
systemd Boot Problems
Debugging systemd Boot
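# Enable a root shell on tty9 (Ctrl+Alt+F9) that starts early in boot.
# It asks for no password, so disable it again once debugging is finished.
systemctl enable debug-shell.service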
# Boot with systemd debug
# Add to kernel command line:
systemd.log_level=debug
systemd.log_target=console
# List failed services
systemctl --failed
# Analyze boot
systemd-analyze
systemd-analyze blame
systemd-analyze critical-chain
Fixing Failed Services
# Identify failed services
systemctl list-units --failed
# Check service status
systemctl status service-name
# View service logs
journalctl -u service-name
# Disable problematic service
systemctl disable service-name
# Mask service to prevent starting
systemctl mask service-name
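When several services fail at once, extracting just the unit names makes it easy to loop over them. This is a small sketch rather than a complete tool: it assumes the row layout of `systemctl list-units --failed --no-legend`, where some systemd versions prefix each row with a "●" marker, and the function name `failed_unit_names` is illustrative.

```shell
#!/bin/bash
# failed_unit_names
# Reads `systemctl list-units --failed --no-legend` output on stdin and
# prints one unit name per line. Unit names always carry a type suffix
# (.service, .mount, ...), so the first field containing a dot is used;
# this skips the optional leading marker column.
failed_unit_names() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /\./) { print $i; break } }'
}

# Typical use on a running system:
#   systemctl list-units --failed --no-legend | failed_unit_names |
#       while read -r unit; do journalctl -b -u "$unit" -n 20 --no-pager; done
```

The loop in the usage comment prints the last 20 log lines of the current boot for each failed unit.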
Emergency Shell Access
# Add to kernel command line for emergency shell:
systemd.unit=emergency.target
# For rescue mode:
systemd.unit=rescue.target
# For a specific target (here, multi-user without a graphical session):
systemd.unit=multi-user.target
Emergency and Rescue Modes
Accessing Emergency Mode
# Method 1: At boot
# Press 'e' at GRUB menu
# Add to kernel line:
systemd.unit=emergency.target
# Method 2: From running system
systemctl emergency
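# If the running system is hung: SysRq
# SysRq does not enter emergency mode; it helps regain control of a hung system.
# Enable all SysRq functions
echo 1 > /proc/sys/kernel/sysrq
# Alt+SysRq+E sends SIGTERM to all processes except init.
# Alt+SysRq+S, then U, then B syncs disks, remounts read-only, and reboots.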
Rescue Mode Operations
# In rescue mode
# Mount filesystems
mount -a
# Start networking (AlmaLinux uses NetworkManager; there is no "network" service by default)
systemctl start NetworkManager
# Check system logs
journalctl -xb
# Fix issues and reboot
systemctl reboot
Single User Mode
# Traditional single user mode
# Add to kernel command line:
single
# or
s
# or
1
# Tasks in single user mode:
# - Reset root password
# - Fix configuration files
# - Repair filesystems
# - Remove problematic software
Recovery Using Live Media
Creating Recovery Media
# Download AlmaLinux ISO
wget https://repo.almalinux.org/almalinux/9/isos/x86_64/AlmaLinux-9-latest-x86_64-dvd.iso
# Create bootable USB (replace /dev/sdX with the USB device; all data on it is destroyed)
dd if=AlmaLinux-9-latest-x86_64-dvd.iso of=/dev/sdX bs=4M status=progress conv=fsync
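A corrupted download produces recovery media that fails exactly when it is needed, so it is sensible to check the ISO's SHA-256 digest before writing it. The helper below is a sketch; the function name `verify_sha256` is hypothetical, and it assumes you obtain the expected digest from the checksum file your mirror publishes next to the ISO.

```shell
#!/bin/bash
# verify_sha256 FILE EXPECTED_HEX
# Returns 0 if the SHA-256 digest of FILE equals EXPECTED_HEX
# (compared case-insensitively), non-zero otherwise or if FILE is unreadable.
verify_sha256() {
    local file="$1" expected actual
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    actual=$(sha256sum "$file" 2>/dev/null | awk '{print $1}')
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Typical use (EXPECTED is the digest copied from the mirror's checksum file):
#   verify_sha256 AlmaLinux-9-latest-x86_64-dvd.iso "$EXPECTED" \
#       && echo "ISO verified" || echo "Checksum mismatch, download again"
```

Only write the image to the USB device once the check succeeds.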
Chroot Recovery Process
# Boot from live media
# Identify system partitions
lsblk
fdisk -l
# Mount system
mount /dev/sda2 /mnt # Root partition
mount /dev/sda1 /mnt/boot # Boot partition
# Mount system directories
mount --bind /dev /mnt/dev
mount --bind /proc /mnt/proc
mount --bind /sys /mnt/sys
# Chroot into system
chroot /mnt
# Perform repairs
# - Fix packages: dnf reinstall [package]
# - Fix configurations
# - Rebuild initramfs: dracut --regenerate-all --force
# - Update GRUB: grub2-mkconfig -o /boot/grub2/grub.cfg
# Exit and unmount
exit
umount -R /mnt
reboot
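The mount order matters: root first, then /boot and the EFI partition, then the bind mounts. One way to avoid mistakes is to print the planned commands and review them before running anything. The sketch below only prints; the function name `chroot_mount_plan` and the device names in the example are assumptions.

```shell
#!/bin/bash
# chroot_mount_plan ROOT_DEV [BOOT_DEV] [EFI_DEV]
# Prints, in the required order, the mount commands for preparing a chroot
# under /mnt. Nothing is executed; the output is meant to be reviewed first.
chroot_mount_plan() {
    local root_dev="$1" boot_dev="${2:-}" efi_dev="${3:-}" target="/mnt" dir
    echo "mount $root_dev $target"
    [ -n "$boot_dev" ] && echo "mount $boot_dev $target/boot"
    [ -n "$efi_dev" ] && echo "mount $efi_dev $target/boot/efi"
    for dir in dev proc sys; do
        echo "mount --bind /$dir $target/$dir"
    done
}

# Example: root on /dev/sda2 with a separate /boot on /dev/sda1
chroot_mount_plan /dev/sda2 /dev/sda1
```

If the printed plan matches the output of `lsblk`, it can be piped to `sh` as root, followed by `chroot /mnt`.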
Network Recovery
# In live environment
# Start networking
nmcli device wifi list
nmcli device wifi connect SSID password PASSWORD
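# Or for wired (the interface name is an example; list names with "nmcli device")
nmcli device connect enp1s0
# Make DNS resolution available inside the chroot, then install packages
cp /etc/resolv.conf /mnt/etc/resolv.conf
chroot /mnt
dnf install [needed-packages]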
Advanced Recovery Techniques
Password Recovery
Root Password Reset:
# Method 1: init=/bin/bash
# At GRUB, add to kernel line:
init=/bin/bash
# After boot:
mount -o remount,rw /
passwd root
touch /.autorelabel # For SELinux
exec /sbin/init
# Method 2: rd.break
# At GRUB, add to kernel line:
rd.break
# After boot:
mount -o remount,rw /sysroot
chroot /sysroot
passwd root
touch /.autorelabel
exit
exit
SELinux Issues
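# Boot with SELinux in permissive mode (preferred: denials are logged, labels stay intact)
# Add to kernel command line:
enforcing=0
# Or disable SELinux entirely (files created in the meantime are left unlabeled):
selinux=0
# After booting in permissive mode, fix contexts
restorecon -Rv /
# or, equivalently
fixfiles relabel
# Or trigger a full relabel on the next boot
touch /.autorelabel
reboot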
Hardware-Related Boot Issues
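# The following are kernel command-line parameters, added at the GRUB menu
# Disable ACPI
acpi=off
# Disable the I/O APIC
noapic
# Disable the USB subsystem
usbcore.nousb
# Disable kernel mode setting for graphics
nomodeset
# Limit usable memory
mem=4G
# Memory test: memtest86+ is a separate boot image, not a kernel parameter.
# Install the memtest86+ package or boot it from rescue media.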
RAID Recovery
# Assemble RAID arrays
mdadm --assemble --scan
# Check RAID status
cat /proc/mdstat
# Force assembly
mdadm --assemble --force /dev/md0 /dev/sda1 /dev/sdb1
# Rebuild RAID
mdadm --manage /dev/md0 --add /dev/sdc1
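The status string at the end of each array entry in /proc/mdstat shows one character per member: "U" for a healthy device and "_" for a missing or failed one. The sketch below uses that to list degraded arrays; the function name `degraded_md_arrays` and its optional file argument are illustrative.

```shell
#!/bin/bash
# degraded_md_arrays [MDSTAT_FILE]
# Prints the name of each md array whose member-status string (for example
# [U_]) contains an underscore. Reads /proc/mdstat unless a file is given.
degraded_md_arrays() {
    awk '
        /^md[0-9a-zA-Z_]* *:/ { name = $1 }      # remember the array being described
        /\[[U_]+\]/ && name != "" {
            if (match($0, /\[[U_]+\]/) && index(substr($0, RSTART, RLENGTH), "_"))
                print name
            name = ""
        }
    ' "${1:-/proc/mdstat}"
}

# Report on the local system only when software RAID is present
[ -r /proc/mdstat ] && degraded_md_arrays
```

An array listed here is running without full redundancy and needs a replacement member added.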
Preventive Measures
Regular Maintenance
# Keep system updated
dnf update -y
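# Remove dependencies that are no longer needed
dnf autoremove
# Remove old kernels, keeping the two newest (autoremove does not do this)
dnf remove --oldinstallonly --setopt installonly_limit=2 kernel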
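# Check filesystem health
# Create monthly cron job (read-only checks; nothing is modified)
cat > /etc/cron.monthly/check-filesystems << 'EOF'
#!/bin/bash
# The checkers need the block device, not the mount point. Results for mounted
# filesystems are advisory; run a full check from rescue media if errors appear.
findmnt -t ext4,xfs -n -o SOURCE,FSTYPE | sort -u | while read -r dev fstype; do
    echo "Checking $dev ($fstype)"
    if [[ $fstype == "xfs" ]]; then
        # xfs_repair refuses a filesystem that is mounted read-write
        xfs_repair -n "$dev" || echo "xfs_repair could not check $dev or found problems"
    else
        e2fsck -n "$dev"
    fi
done
EOF
chmod +x /etc/cron.monthly/check-filesystems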
Backup Critical Files
# Backup boot configuration
mkdir -p /root/boot-backup
cp -r /boot/grub2 /root/boot-backup/
cp -r /boot/efi /root/boot-backup/ 2>/dev/null
cp /etc/fstab /root/boot-backup/
cp /etc/default/grub /root/boot-backup/
# Create rescue information
cat > /root/boot-backup/system-info.txt << EOF
Date: $(date)
Kernel: $(uname -r)
Root Device: $(findmnt -n -o SOURCE /)
Boot Device: $(findmnt -n -o SOURCE /boot)
GRUB Version: $(grub2-install --version)
Disk Layout:
$(lsblk)
Partition Table:
$(fdisk -l)
LVM Information:
$(pvs && echo && vgs && echo && lvs)
EOF
Monitoring Boot Health
# Monitor boot performance
systemd-analyze > /var/log/boot-performance-$(date +%Y%m%d).log
# Check for boot errors
journalctl -b -p err > /var/log/boot-errors-$(date +%Y%m%d).log
# Create boot monitoring script
cat > /usr/local/bin/boot-monitor.sh << 'EOF'
#!/bin/bash
LOG_DIR="/var/log/boot-monitor"
mkdir -p "$LOG_DIR"
# Check last boot time (the total follows "=" on the "Startup finished" line)
BOOT_TIME=$(systemd-analyze | sed -n 's/^Startup finished.*= //p')
echo "$(date): Boot time was $BOOT_TIME" >> "$LOG_DIR/boot-times.log"
# Check for failed services
FAILED=$(systemctl --failed --no-legend | wc -l)
if [ "$FAILED" -gt 0 ]; then
echo "$(date): $FAILED failed services detected" >> "$LOG_DIR/failures.log"
systemctl --failed >> "$LOG_DIR/failures.log"
fi
EOF
chmod +x /usr/local/bin/boot-monitor.sh
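# Add to root's crontab without overwriting existing entries.
# The delay lets startup finish so systemd-analyze can report a total.
(crontab -l 2>/dev/null; echo "@reboot sleep 120 && /usr/local/bin/boot-monitor.sh") | crontab -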
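The first line of `systemd-analyze` output lists per-stage timings followed by the total after an "=" sign. The sketch below extracts that total; the function name `total_boot_time` is hypothetical and the timings in the sample line are invented.

```shell
#!/bin/bash
# total_boot_time
# Reads `systemd-analyze` output on stdin and prints whatever follows "= "
# on the "Startup finished" line, for example "13.8s" or "1min 2.345s".
# Prints nothing when that line is absent, such as while boot is still running.
total_boot_time() {
    sed -n 's/^Startup finished.*= //p' | head -n 1
}

# Sample line in the systemd-analyze format; the timings are invented
echo "Startup finished in 1.5s (kernel) + 2.3s (initrd) + 10.0s (userspace) = 13.8s" | total_boot_time
```

Taking everything after "=" rather than a fixed field keeps the result correct on UEFI systems, where extra firmware and loader stages shift the field positions, and for boots longer than a minute.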
Boot Troubleshooting Tools
Essential Tools
# Install recovery tools
dnf install -y \
systemd-container \
dracut-tools \
grub2-tools \
mdadm \
lvm2 \
xfsprogs \
e2fsprogs
# Boot analysis: systemd-analyze ships with systemd; render the boot timeline as SVG
systemd-analyze plot > /root/boot-timeline.svg
Creating Custom Recovery ISO
# Install required packages
dnf install -y lorax
# Create custom recovery ISO
cat > /tmp/recovery.ks << 'EOF'
# Custom recovery kickstart
text
cdrom
lang en_US.UTF-8
keyboard us
rootpw --plaintext recovery
firewall --disabled
selinux --disabled
timezone UTC
%packages
@core
kernel
grub2
dracut
dracut-tools
systemd
bash
vim
tar
gzip
bzip2
xz
e2fsprogs
xfsprogs
lvm2
mdadm
gdisk
parted
openssh-server
NetworkManager
wget
curl
%end
%post
# Configure recovery environment
echo "Welcome to AlmaLinux Recovery Environment" > /etc/motd
%end
EOF
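# Build ISO with livemedia-creator, which the lorax package provides.
# A --no-virt build also needs anaconda installed, and the kickstart must name a
# network repository ("url --url=...") instead of "cdrom".
livemedia-creator --make-iso --no-virt --ks=/tmp/recovery.ks --project="AlmaLinux-Recovery" --resultdir=/var/tmp/recovery-iso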
Emergency Boot USB
# Create emergency USB with tools
# Format USB
mkfs.ext4 /dev/sdX1
# Mount and create structure
mount /dev/sdX1 /mnt
mkdir -p /mnt/{boot,tools,scripts,docs}
# Copy kernel and initramfs
cp /boot/vmlinuz-$(uname -r) /mnt/boot/
cp /boot/initramfs-$(uname -r).img /mnt/boot/
# Install GRUB
grub2-install --boot-directory=/mnt/boot /dev/sdX
# Create recovery scripts
cat > /mnt/scripts/fix-grub.sh << 'EOF'
#!/bin/bash
echo "GRUB Recovery Script"
echo "===================="
# Script content here
EOF
# Copy documentation (-r is needed because the backup contains directories)
cp -r /root/boot-backup/. /mnt/docs/
Creating Recovery Plans
Documentation Template
# The delimiter is unquoted so that $(hostname) and the other substitutions are filled in
cat > /root/recovery-plan.md << EOF
# System Recovery Plan
## System Information
- Hostname: $(hostname)
- OS Version: $(cat /etc/almalinux-release)
- Kernel: $(uname -r)
- Architecture: $(uname -m)
## Boot Configuration
- Boot Mode: [BIOS/UEFI]
- Boot Device: $(findmnt -n -o SOURCE /boot)
- Root Device: $(findmnt -n -o SOURCE /)
## Critical Services
1. sshd - Remote access
2. NetworkManager - Network connectivity
3. [Add your critical services]
## Recovery Procedures
### 1. Cannot Boot - GRUB Issues
1. Boot from AlmaLinux installation media
2. Select "Troubleshooting" > "Rescue a AlmaLinux system"
3. Run: chroot /mnt/sysimage
4. Reinstall GRUB: grub2-install /dev/sda
5. Regenerate config: grub2-mkconfig -o /boot/grub2/grub.cfg
### 2. Kernel Panic
1. At GRUB menu, select previous kernel
2. If successful, remove bad kernel:
dnf remove kernel-[version]
### 3. Filesystem Errors
1. Boot to emergency mode
2. Run fsck on affected partition:
fsck -f /dev/sdXY
### 4. Forgotten Root Password
1. At GRUB, press 'e'
2. Add 'rd.break' to kernel line
3. Press Ctrl+X to boot
4. mount -o remount,rw /sysroot
5. chroot /sysroot
6. passwd root
7. touch /.autorelabel
8. exit twice (first the chroot, then the rd.break shell) to continue booting
## Important Files Locations
- GRUB Config: /boot/grub2/grub.cfg
- Kernel Parameters: /etc/default/grub
- Boot Logs: /var/log/boot.log
- System Logs: /var/log/messages
## Emergency Contacts
- System Administrator: [Contact Info]
- Vendor Support: [Support Info]
EOF
Automated Recovery Script
#!/bin/bash
# /root/auto-recovery.sh
# Automated recovery attempt script
LOG="/var/log/auto-recovery.log"
echo "=== Auto Recovery Started at $(date) ===" >> $LOG
# Function to check and fix common issues
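fix_filesystem() {
    echo "Checking filesystems..." >> $LOG
    # Check only unmounted filesystems; repairing a mounted one can corrupt it
    lsblk -rpno NAME,FSTYPE | while read -r dev fstype; do
        [ "$fstype" = ext4 ] || [ "$fstype" = xfs ] || continue
        if ! findmnt -n -S "$dev" >/dev/null 2>&1; then
            echo "Checking $dev" >> $LOG
            if [ "$fstype" = xfs ]; then
                xfs_repair "$dev" >> $LOG 2>&1   # fsck is a no-op for XFS
            else
                fsck -y "$dev" >> $LOG 2>&1
            fi
        fi
    done
}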
fix_selinux() {
echo "Checking SELinux contexts..." >> $LOG
if [ -f /.autorelabel ]; then
echo "SELinux relabel already scheduled" >> $LOG
else
restorecon -Rv / >> $LOG 2>&1
fi
}
fix_services() {
echo "Checking critical services..." >> $LOG
for service in sshd NetworkManager; do
if ! systemctl is-active $service >/dev/null 2>&1; then
echo "Starting $service" >> $LOG
systemctl start $service >> $LOG 2>&1
fi
done
}
rebuild_initramfs() {
echo "Rebuilding initramfs..." >> $LOG
dracut --force >> $LOG 2>&1
}
# Main recovery
fix_filesystem
fix_selinux
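fix_services
# rebuild_initramfs is defined above but not run by default; call it here if the initramfs is suspect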
echo "=== Auto Recovery Completed at $(date) ===" >> $LOG
Conclusion
Boot issues in AlmaLinux can range from simple configuration problems to complex hardware failures. This guide has covered:
- Understanding the boot process and common failure points
- Diagnosing boot problems using various tools and techniques
- Recovering from GRUB, kernel, filesystem, and systemd issues
- Using emergency modes and live media for recovery
- Implementing preventive measures and monitoring
- Creating comprehensive recovery plans
Key takeaways:
- Always maintain recent backups of critical boot files
- Keep a bootable recovery medium available
- Document your system configuration and recovery procedures
- Regular maintenance prevents many boot issues
- Test recovery procedures before you need them
Remember that successful recovery often depends on proper preparation. Regular system maintenance, monitoring, and having a well-documented recovery plan will help minimize downtime when boot issues occur.