🤖 AlmaLinux Automation with Ansible Complete Guide
Ready to transform from manual server management to automation mastery? 🚀 This comprehensive guide will turn you into an Ansible wizard, covering everything from basic automation concepts to advanced enterprise orchestration that manages thousands of servers with a single command!
Ansible automation isn’t just about saving time – it’s about creating consistent, reliable, and scalable infrastructure that eliminates human error while enabling rapid deployment and configuration management. Let’s build an automation empire that works while you sleep! 💤⚡
🤔 Why is Ansible Automation Important?
Imagine managing 100 servers manually – updating each one, configuring software, and ensuring consistency. That’s weeks of work! 😵 Here’s why Ansible automation is revolutionary:
- ⚡ Lightning-Fast Deployment: Configure hundreds of servers in minutes instead of weeks
- 🎯 Perfect Consistency: Every server configured exactly the same way, every time
- 🛡️ Error Elimination: No more typos, missed steps, or configuration drift
- 📈 Massive Scalability: Scale from 1 to 10,000 servers with the same effort
- 🔄 Repeatable Processes: Document and version your infrastructure as code
- 💰 Cost Reduction: Reduce manual labor and increase operational efficiency
- 🔐 Enhanced Security: Automated security updates and compliance checking
- 📊 Better Compliance: Ensure all systems meet regulatory requirements automatically
🎯 What You Need
Before we dive into automation mastery, let’s make sure you have everything ready:
✅ AlmaLinux control machine (your automation command center!)
✅ Target servers (AlmaLinux machines to automate; VMs are fine for testing)
✅ SSH access to all target machines (Ansible uses SSH for communication)
✅ Python 3 installed on all machines (Ansible's modules run on it)
✅ Sudo privileges on target machines (for system-level automation)
✅ Network connectivity between control and target machines
✅ Text editor (nano, vim, or VS Code for writing playbooks)
✅ Curiosity and patience (automation is powerful but requires practice!)
📝 Step 1: Ansible Installation and Setup
Let’s install Ansible and create the perfect automation environment! Think of this as setting up your mission control center. 🎛️
# Update system packages
sudo dnf update -y
# Ensures you have the latest packages and security updates
# Install EPEL repository for additional packages
sudo dnf install -y epel-release
# EPEL provides extra packages not in the standard repositories
# Install Ansible and dependencies
sudo dnf install -y ansible python3-pip git vim
# ansible: the automation engine
# python3-pip: for additional Python modules
# git: for version control of your automation code
# vim: for editing configuration files
# Verify Ansible installation
ansible --version
# Should show Ansible version and configuration details
# Install additional useful modules
pip3 install --user requests netaddr
# requests: for HTTP API interactions
# netaddr: for advanced network operations
Set up SSH key authentication for passwordless access:
# Generate SSH key pair (if you don't have one)
ssh-keygen -t rsa -b 4096 -C "ansible-automation"
# Creates a secure RSA key pair for authentication
# Copy public key to target servers
ssh-copy-id user@target-server-1
ssh-copy-id user@target-server-2
ssh-copy-id user@target-server-3
# Replace with your actual server IPs or hostnames
# Test passwordless connection
ssh user@target-server-1 "hostname"
# Should connect without prompting for password
Create Ansible directory structure:
# Create Ansible project directory
mkdir -p ~/ansible-automation
cd ~/ansible-automation
# Create standard Ansible directory structure
mkdir -p {inventories,playbooks,roles,group_vars,host_vars,files,templates}
# Create the main inventory file
cat << EOF > inventories/production.ini
[web_servers]
web-01 ansible_host=192.168.1.101 ansible_user=admin
web-02 ansible_host=192.168.1.102 ansible_user=admin
web-03 ansible_host=192.168.1.103 ansible_user=admin
[database_servers]
db-01 ansible_host=192.168.1.201 ansible_user=admin
db-02 ansible_host=192.168.1.202 ansible_user=admin
[load_balancers]
lb-01 ansible_host=192.168.1.51 ansible_user=admin
[production:children]
web_servers
database_servers
load_balancers
[production:vars]
ansible_python_interpreter=/usr/bin/python3
ansible_become=yes
ansible_become_method=sudo
EOF
# Replace IP addresses with your actual server IPs
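An out-of-range octet in an `ansible_host` value is an easy typo to miss, and Ansible only reports it once a connection fails. The sketch below is one possible pre-check, assuming every `ansible_host` entry is a plain IPv4 address rather than a hostname. The helper names `is_valid_ipv4` and `check_inventory_hosts` are hypothetical and not part of Ansible.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when $1 is a dotted-quad IPv4 address
# whose four octets are all in the range 0-255.
is_valid_ipv4() {
    local ip="$1" octet
    local IFS=.
    # Reject empty strings, non-numeric characters, and misplaced dots.
    case "$ip" in
        *[!0-9.]*|''|.*|*.|*..*) return 1 ;;
    esac
    # Split on dots (IFS is "." inside this function).
    set -- $ip
    [ "$#" -eq 4 ] || return 1
    for octet in "$@"; do
        [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
    done
    return 0
}

# Hypothetical helper: reports every invalid ansible_host value in a file.
check_inventory_hosts() {
    local inventory="$1" status=0 host
    for host in $(grep -o 'ansible_host=[^ ]*' "$inventory" | cut -d= -f2); do
        if ! is_valid_ipv4 "$host"; then
            echo "Invalid IPv4 address: $host"
            status=1
        fi
    done
    return "$status"
}
```

Running `check_inventory_hosts inventories/production.ini` before the first playbook run would list any address that cannot be valid.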
Create Ansible configuration file:
# Create ansible.cfg for project-specific settings
cat << EOF > ansible.cfg
[defaults]
inventory = inventories/production.ini
remote_user = admin
host_key_checking = False
retry_files_enabled = False
gathering = smart
fact_caching = memory
roles_path = roles
stdout_callback = yaml
[privilege_escalation]
become = True
become_method = sudo
become_user = root
become_ask_pass = False
[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=60s -o UserKnownHostsFile=/dev/null
pipelining = True
control_path = ~/.ansible/cp/%%h-%%p-%%r
EOF
echo "✅ Ansible configuration created!"
Create initial setup script:
# Create Ansible setup verification script
cat << 'EOF' > setup-verification.sh
#!/bin/bash
echo "🤖 ANSIBLE SETUP VERIFICATION"
echo "============================="
echo "📦 Ansible Version:"
ansible --version | head -1
echo ""
echo "🔗 Testing connectivity to all hosts:"
ansible all -m ping
echo ""
echo "📊 Gathering facts from all hosts:"
ansible all -m setup -a "filter=ansible_distribution*"
echo ""
echo "💾 Checking disk space on all hosts:"
ansible all -m shell -a "df -h | head -5"
echo ""
echo "🔍 Checking Python version on all hosts:"
ansible all -m shell -a "python3 --version"
echo ""
echo "✅ Setup verification complete!"
EOF
chmod +x setup-verification.sh
# Run the setup verification
./setup-verification.sh
# This tests connectivity and basic functionality
🔧 Step 2: Creating Your First Playbooks
Time to write your automation magic! 📜 Playbooks are like recipes that tell Ansible exactly what to do.
Create a basic system update playbook:
# Create system update playbook
cat << 'EOF' > playbooks/system-update.yml
---
- name: System Update and Maintenance
  hosts: all
  become: yes
  gather_facts: yes

  vars:
    packages_to_install:
      - vim
      - htop
      - curl
      - wget
      - git
      - tree
      - nc
      - yum-utils   # provides needs-restarting, used by the reboot check

  tasks:
    - name: Update all packages
      dnf:
        name: "*"
        state: latest
      register: update_result

    - name: Display update results
      debug:
        msg: "{{ update_result.results | length }} packages were updated"

    - name: Enable EPEL repository (htop is packaged there)
      dnf:
        name: epel-release
        state: present

    - name: Install essential packages
      dnf:
        name: "{{ packages_to_install }}"
        state: present

    - name: Remove unneeded dependency packages
      dnf:
        autoremove: yes

    # needs-restarting -r exits 1 when a reboot is needed and 0 otherwise
    - name: Check if reboot is required
      command: needs-restarting -r
      register: reboot_required
      changed_when: false
      failed_when: reboot_required.rc not in [0, 1]

    - name: Notify if reboot is needed
      debug:
        msg: "Reboot required on {{ inventory_hostname }}"
      when: reboot_required.rc == 1

    - name: Create maintenance log
      lineinfile:
        path: /var/log/ansible-maintenance.log
        line: "{{ ansible_date_time.iso8601 }} - System updated by Ansible"
        create: yes
        mode: '0644'

    - name: Set timezone
      timezone:
        name: America/New_York
      notify: restart chronyd

  handlers:
    - name: restart chronyd
      systemd:
        name: chronyd
        state: restarted
        enabled: yes
EOF
Create a web server setup playbook:
# Create web server deployment playbook
cat << 'EOF' > playbooks/web-server-setup.yml
---
- name: Web Server Configuration
  hosts: web_servers
  become: yes
  gather_facts: yes

  vars:
    nginx_port: 80
    website_name: "AlmaLinux Automation Demo"
    document_root: /var/www/html

  tasks:
    - name: Install Nginx web server
      dnf:
        name: nginx
        state: present

    - name: Create custom index page
      template:
        src: ../templates/index.html.j2
        dest: "{{ document_root }}/index.html"
        mode: '0644'
      notify: reload nginx

    - name: Create Nginx virtual host configuration
      template:
        src: ../templates/nginx-vhost.conf.j2
        dest: /etc/nginx/conf.d/default.conf
        backup: yes
      notify: reload nginx

    - name: Start and enable Nginx
      systemd:
        name: nginx
        state: started
        enabled: yes

    - name: Configure firewall for HTTP
      firewalld:
        service: http
        permanent: yes
        state: enabled
        immediate: yes

    - name: Create web server status check script
      copy:
        content: |
          #!/bin/bash
          curl -s http://localhost/server-status || echo "Server status unavailable"
        dest: /usr/local/bin/check-web-status.sh
        mode: '0755'

    - name: Verify web server is responding
      uri:
        url: "http://{{ ansible_default_ipv4.address }}"
        method: GET
        status_code: 200
      register: web_check

    - name: Display web server status
      debug:
        msg: "Web server on {{ inventory_hostname }} is responding correctly"
      when: web_check.status == 200

  handlers:
    - name: reload nginx
      systemd:
        name: nginx
        state: reloaded
EOF
Create templates for the web server:
# Create templates directory and files
mkdir -p templates
# Create HTML template
cat << 'EOF' > templates/index.html.j2
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>{{ website_name }}</title>
<style>
body {
font-family: Arial, sans-serif;
max-width: 800px;
margin: 50px auto;
text-align: center;
background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
color: white;
padding: 20px;
}
.container {
background: rgba(255,255,255,0.1);
padding: 40px;
border-radius: 10px;
backdrop-filter: blur(10px);
}
.success { color: #28a745; }
.info { margin: 20px 0; }
</style>
</head>
<body>
<div class="container">
<h1>🚀 {{ website_name }}</h1>
<p class="success">✅ Automated deployment successful!</p>
<div class="info">
<h3>Server Information:</h3>
<p><strong>Hostname:</strong> {{ ansible_hostname }}</p>
<p><strong>IP Address:</strong> {{ ansible_default_ipv4.address }}</p>
<p><strong>Operating System:</strong> {{ ansible_distribution }} {{ ansible_distribution_version }}</p>
<p><strong>Deployed on:</strong> {{ ansible_date_time.iso8601 }}</p>
<p><strong>Ansible Controller:</strong> {{ ansible_controller_host | default('Unknown') }}</p>
</div>
<p>🤖 This server was configured automatically using Ansible!</p>
</div>
</body>
</html>
EOF
# Create Nginx configuration template
cat << 'EOF' > templates/nginx-vhost.conf.j2
server {
listen {{ nginx_port }};
server_name {{ ansible_default_ipv4.address }} {{ ansible_hostname }};
root {{ document_root }};
index index.html index.htm;
# Security headers
add_header X-Frame-Options DENY;
add_header X-Content-Type-Options nosniff;
add_header X-XSS-Protection "1; mode=block";
# Logging
access_log /var/log/nginx/{{ ansible_hostname }}_access.log;
error_log /var/log/nginx/{{ ansible_hostname }}_error.log;
location / {
try_files $uri $uri/ =404;
}
# Server status endpoint
location /server-status {
stub_status on;
access_log off;
allow 127.0.0.1;
allow {{ ansible_default_ipv4.address }};
deny all;
}
# Security - deny access to hidden files
location ~ /\. {
deny all;
access_log off;
log_not_found off;
}
}
EOF
Create playbook execution script:
# Create playbook execution helper
cat << 'EOF' > run-playbooks.sh
#!/bin/bash

run_playbook() {
    local playbook="$1"
    local extra_vars="$2"

    echo "🤖 Running playbook: $playbook"
    echo "======================================="

    if [ ! -f "playbooks/$playbook" ]; then
        echo "❌ Playbook not found: playbooks/$playbook"
        return 1
    fi

    # Build the argument list once so the playbook path stays quoted
    local args=("playbooks/$playbook")
    if [ -n "$extra_vars" ]; then
        args+=(--extra-vars "$extra_vars")
    fi

    if ansible-playbook "${args[@]}"; then
        echo "✅ Playbook completed successfully"
    else
        echo "❌ Playbook failed"
        return 1
    fi
}

case "$1" in
    update)
        run_playbook "system-update.yml"
        ;;
    web)
        run_playbook "web-server-setup.yml"
        ;;
    all)
        echo "🚀 Running all playbooks..."
        run_playbook "system-update.yml" && run_playbook "web-server-setup.yml"
        ;;
    *)
        echo "Usage: $0 {update|web|all}"
        echo " update - Run system update playbook"
        echo " web - Run web server setup playbook"
        echo " all - Run all playbooks in sequence"
        ;;
esac
EOF
chmod +x run-playbooks.sh
# Test your first playbook
./run-playbooks.sh update
# This will update all systems in your inventory
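Because the helper script applies changes straight away, a dry run first can be a useful habit. The sketch below wraps two standard `ansible-playbook` options, `--syntax-check` and `--check --diff`, in a function named `preflight`. The function name is hypothetical, and the `playbooks/` prefix assumes the directory layout created in Step 1.

```shell
#!/bin/bash
# Hypothetical pre-flight helper: validates a playbook without changing hosts.
preflight() {
    local playbook="playbooks/$1"
    # Stop early if the YAML or the play structure is invalid.
    ansible-playbook "$playbook" --syntax-check || return 1
    # Check mode reports what would change; --diff shows file-level differences.
    ansible-playbook "$playbook" --check --diff
}
```

For example, `preflight system-update.yml` could be run before `./run-playbooks.sh update`. Tasks that run arbitrary commands are skipped in check mode, so a clean dry run is a good sign rather than a guarantee.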
🌟 Step 3: Advanced Ansible Roles
Let’s create reusable automation components! 🧩 Roles are like building blocks that make your automation modular and scalable.
Create a database role:
# Create role structure for MySQL database
ansible-galaxy init roles/mysql-database
# Create the main tasks file
cat << 'EOF' > roles/mysql-database/tasks/main.yml
---
- name: Install MySQL server and client
  dnf:
    name:
      - mysql-server
      - mysql
      - python3-PyMySQL
    state: present

- name: Start and enable MySQL service
  systemd:
    name: mysqld
    state: started
    enabled: yes

# Only upstream MySQL packages write a temporary password to this log;
# when nothing is found, the task below falls back to other credentials.
- name: Get MySQL root password
  shell: "grep 'temporary password' /var/log/mysqld.log | awk '{print $NF}' | tail -1"
  register: mysql_temp_password
  failed_when: false
  changed_when: false

- name: Set MySQL root password
  mysql_user:
    name: root
    password: "{{ mysql_root_password }}"
    login_user: root
    login_password: "{{ mysql_temp_password.stdout if mysql_temp_password.stdout else mysql_root_password }}"
    # Try a passwordless root login first; the AlmaLinux AppStream build of
    # MySQL starts with an empty root password rather than a temporary one.
    check_implicit_admin: yes
    host_all: yes
  when: mysql_root_password is defined

- name: Create MySQL configuration file
  template:
    src: my.cnf.j2
    dest: /etc/my.cnf
    backup: yes
  notify: restart mysql

- name: Create application database
  mysql_db:
    name: "{{ mysql_database_name }}"
    state: present
    login_user: root
    login_password: "{{ mysql_root_password }}"
  when: mysql_database_name is defined

- name: Create application user
  mysql_user:
    name: "{{ mysql_app_user }}"
    password: "{{ mysql_app_password }}"
    priv: "{{ mysql_database_name }}.*:ALL"
    state: present
    login_user: root
    login_password: "{{ mysql_root_password }}"
  when: mysql_app_user is defined and mysql_app_password is defined

- name: Configure firewall for MySQL
  firewalld:
    service: mysql
    permanent: yes
    state: enabled
    immediate: yes
  when: mysql_allow_remote | default(false)

- name: Run MySQL secure installation equivalent
  mysql_query:
    login_user: root
    login_password: "{{ mysql_root_password }}"
    query:
      - "DELETE FROM mysql.user WHERE User=''"
      - "DELETE FROM mysql.user WHERE User='root' AND Host NOT IN ('localhost', '127.0.0.1', '::1')"
      - "DROP DATABASE IF EXISTS test"
      - "DELETE FROM mysql.db WHERE Db='test' OR Db='test\\_%'"
      - "FLUSH PRIVILEGES"
  when: mysql_secure_installation | default(true)
EOF
# Create MySQL configuration template
cat << 'EOF' > roles/mysql-database/templates/my.cnf.j2
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
# Log directory used by the AlmaLinux mysql-server package (writable by mysqld)
log-error=/var/log/mysql/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
# Basic settings
bind-address = {{ mysql_bind_address | default('127.0.0.1') }}
port = {{ mysql_port | default('3306') }}
# Performance settings
# (the query cache was removed in MySQL 8.0, so query_cache_size is not set)
innodb_buffer_pool_size = {{ mysql_innodb_buffer_pool_size | default('256M') }}
max_connections = {{ mysql_max_connections | default('100') }}
# Logging
slow_query_log = 1
slow_query_log_file = /var/log/mysql/mysqld-slow.log
long_query_time = {{ mysql_slow_query_time | default('2') }}
# Security
# The loose- prefix lets mysqld start even if the validate_password
# component is not installed.
loose-validate_password.policy = {{ mysql_password_policy | default('MEDIUM') }}
[mysql]
default-character-set = utf8mb4
[client]
default-character-set = utf8mb4
EOF
# Create role variables
cat << 'EOF' > roles/mysql-database/defaults/main.yml
---
mysql_root_password: "SecureRootPass123!"
mysql_bind_address: "127.0.0.1"
mysql_port: 3306
mysql_max_connections: 200
mysql_innodb_buffer_pool_size: "512M"
mysql_slow_query_time: 2
mysql_password_policy: "MEDIUM"
mysql_secure_installation: true
mysql_allow_remote: false
# Application database settings
mysql_database_name: "appdb"
mysql_app_user: "appuser"
mysql_app_password: "AppUserPass456!"
EOF
# Create handlers
cat << 'EOF' > roles/mysql-database/handlers/main.yml
---
- name: restart mysql
systemd:
name: mysqld
state: restarted
EOF
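The role defaults above keep passwords in plain text, which is fine for a lab but risky once the project goes into version control. One option is Ansible Vault. The sketch below uses the standard `ansible-vault encrypt_string` subcommand with its `--stdin-name` option. The wrapper name `vault_encrypt_var` is hypothetical, and writing the result to a vault file under `group_vars` is an assumption about how you organise variables.

```shell
#!/bin/bash
# Hypothetical helper: prints a vault-encrypted YAML entry for one variable.
# The secret is read from stdin so it does not end up in shell history.
vault_encrypt_var() {
    local var_name="$1"
    ansible-vault encrypt_string --stdin-name "$var_name"
}
```

A possible use is `printf '%s' "$SECRET" | vault_encrypt_var mysql_root_password`, pasting the output into a variables file that overrides the role default. Playbooks that read vaulted values then need a vault password, for example via `--ask-vault-pass`.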
Create a monitoring role:
# Create monitoring role
ansible-galaxy init roles/system-monitoring
# Create monitoring tasks
cat << 'EOF' > roles/system-monitoring/tasks/main.yml
---
# htop, nethogs, and ncdu are packaged in EPEL on AlmaLinux
- name: Enable EPEL repository
  dnf:
    name: epel-release
    state: present

- name: Install monitoring packages
  dnf:
    name:
      - htop
      - iotop
      - nethogs
      - sysstat
      - ncdu
    state: present

- name: Create monitoring scripts directory
  file:
    path: /usr/local/bin/monitoring
    state: directory
    mode: '0755'

- name: Install system monitoring script
  template:
    src: system-monitor.sh.j2
    dest: /usr/local/bin/monitoring/system-monitor.sh
    mode: '0755'

- name: Install log monitoring script
  template:
    src: log-monitor.sh.j2
    dest: /usr/local/bin/monitoring/log-monitor.sh
    mode: '0755'

- name: Create monitoring configuration
  template:
    src: monitoring.conf.j2
    dest: /etc/monitoring.conf
    mode: '0644'

- name: Set up log rotation for monitoring logs
  template:
    src: monitoring-logrotate.j2
    dest: /etc/logrotate.d/monitoring
    mode: '0644'

- name: Create monitoring cron jobs
  cron:
    name: "{{ item.name }}"
    minute: "{{ item.minute }}"
    hour: "{{ item.hour }}"
    job: "{{ item.job }}"
  loop:
    - name: "System health check"
      minute: "*/15"
      hour: "*"
      job: "/usr/local/bin/monitoring/system-monitor.sh >> /var/log/system-health.log 2>&1"
    - name: "Log analysis"
      minute: "0"
      hour: "1"
      job: "/usr/local/bin/monitoring/log-monitor.sh >> /var/log/log-analysis.log 2>&1"

- name: Ensure monitoring logs exist
  file:
    path: "{{ item }}"
    state: touch
    mode: '0644'
  loop:
    - /var/log/system-health.log
    - /var/log/log-analysis.log
EOF
# Create monitoring script templates
cat << 'EOF' > roles/system-monitoring/templates/system-monitor.sh.j2
#!/bin/bash
# System monitoring script generated by Ansible
TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')
HOSTNAME=$(hostname)
echo "[$TIMESTAMP] $HOSTNAME System Health Check"
echo "============================================="
# CPU Usage
CPU_USAGE=$(top -bn1 | grep "Cpu(s)" | awk '{print $2}' | cut -d'%' -f1)
echo "CPU Usage: ${CPU_USAGE}%"
# Memory Usage
MEM_USAGE=$(free | grep Mem | awk '{printf("%.1f", $3/$2 * 100.0)}')
echo "Memory Usage: ${MEM_USAGE}%"
# Disk Usage
DISK_USAGE=$(df / | awk 'NR==2{print $5}' | cut -d'%' -f1)
echo "Disk Usage: ${DISK_USAGE}%"
# Load Average
LOAD_AVG=$(uptime | awk -F'load average:' '{print $2}')
echo "Load Average:$LOAD_AVG"
# Check critical thresholds
if [ "${CPU_USAGE%.*}" -gt {{ monitoring_cpu_threshold | default('80') }} ]; then
echo "WARNING: High CPU usage detected!"
fi
if [ "${MEM_USAGE%.*}" -gt {{ monitoring_memory_threshold | default('85') }} ]; then
echo "WARNING: High memory usage detected!"
fi
if [ "$DISK_USAGE" -gt {{ monitoring_disk_threshold | default('90') }} ]; then
echo "WARNING: High disk usage detected!"
fi
echo "Health check completed."
echo ""
EOF
cat << 'EOF' > roles/system-monitoring/templates/log-monitor.sh.j2
#!/bin/bash
# Log monitoring script generated by Ansible
TIMESTAMP=$(date '+%Y-%m-%d %H:%M:%S')
echo "[$TIMESTAMP] Log Analysis Report"
echo "================================"
# Check for errors in system logs
ERROR_COUNT=$(journalctl --since "24 hours ago" --priority=err | wc -l)
echo "System errors (last 24h): $ERROR_COUNT"
# Check authentication failures
AUTH_FAILURES=$(journalctl --since "24 hours ago" | grep -i "authentication failure" | wc -l)
echo "Authentication failures: $AUTH_FAILURES"
# Check disk space warnings
DISK_WARNINGS=$(journalctl --since "24 hours ago" | grep -i "no space left\|disk.*full" | wc -l)
echo "Disk space warnings: $DISK_WARNINGS"
# Service status checks
FAILED_SERVICES=$(systemctl --failed --no-legend | wc -l)
echo "Failed services: $FAILED_SERVICES"
if [ "$FAILED_SERVICES" -gt 0 ]; then
echo "Failed services list:"
systemctl --failed --no-legend
fi
echo "Log analysis completed."
echo ""
EOF
# Create role defaults
cat << 'EOF' > roles/system-monitoring/defaults/main.yml
---
monitoring_cpu_threshold: 80
monitoring_memory_threshold: 85
monitoring_disk_threshold: 90
monitoring_log_retention_days: 30
EOF
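The threshold checks in system-monitor.sh.j2 rely on the shell expansion `${VAR%.*}`, which drops everything from the last dot onward so that a decimal reading can be compared with an integer test. The sketch below isolates that step. The function name `exceeds_threshold` is hypothetical and the sample values are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper mirroring the threshold test in system-monitor.sh.j2.
# Usage: exceeds_threshold <value> <threshold>
exceeds_threshold() {
    local value="$1" threshold="$2"
    # ${value%.*} removes the shortest suffix matching ".*", so 85.7 becomes 85.
    local whole="${value%.*}"
    # A value such as ".5" leaves an empty string; treat it as 0.
    [ "${whole:-0}" -gt "$threshold" ]
}

# Compare a few made-up CPU readings against a threshold of 80.
for sample in 79.9 80.4 81.2 100; do
    if exceeds_threshold "$sample" 80; then
        echo "$sample: WARNING"
    else
        echo "$sample: ok"
    fi
done
```

Because the fraction is truncated rather than rounded, a reading of 80.4 does not trigger a warning at a threshold of 80, while 81.2 does. If that margin matters, the thresholds in the role defaults can be set one point lower.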
Create a comprehensive site playbook that uses roles:
# Create site-wide deployment playbook
cat << 'EOF' > playbooks/site-deployment.yml
---
- name: Complete Infrastructure Deployment
  hosts: all
  become: yes
  gather_facts: yes

  vars:
    deployment_timestamp: "{{ ansible_date_time.iso8601 }}"

  pre_tasks:
    - name: Display deployment information
      debug:
        msg: |
          🚀 Starting deployment on {{ inventory_hostname }}
          📅 Timestamp: {{ deployment_timestamp }}
          🖥️ OS: {{ ansible_distribution }} {{ ansible_distribution_version }}
          💾 Memory: {{ ansible_memtotal_mb }}MB
          🔗 IP: {{ ansible_default_ipv4.address }}

  roles:
    - role: system-monitoring
      tags: monitoring

- name: Database Server Configuration
  hosts: database_servers
  become: yes
  gather_facts: yes

  roles:
    - role: mysql-database
      vars:
        mysql_database_name: "production_db"
        mysql_app_user: "webapp"
        mysql_app_password: "WebApp789!"
        mysql_allow_remote: true
        mysql_bind_address: "0.0.0.0"
      tags: database

  post_tasks:
    - name: Verify database is running
      systemd:
        name: mysqld
      register: mysql_status

    - name: Display database status
      debug:
        msg: "MySQL service is {{ mysql_status.status.ActiveState }}"

# A playbook cannot be included as a task; import it as a play instead.
# The path is resolved relative to this file.
- import_playbook: web-server-setup.yml

- name: Web Server Deployment Report
  hosts: web_servers
  become: yes
  gather_facts: yes

  tasks:
    - name: Create deployment report
      template:
        src: ../templates/deployment-report.html.j2
        dest: /var/www/html/deployment-report.html
        mode: '0644'

- name: Load Balancer Configuration
  hosts: load_balancers
  become: yes
  gather_facts: yes

  tasks:
    - name: Install HAProxy
      dnf:
        name: haproxy
        state: present

    - name: Configure HAProxy
      template:
        src: ../templates/haproxy.cfg.j2
        dest: /etc/haproxy/haproxy.cfg
        backup: yes
      notify: restart haproxy

    - name: Start and enable HAProxy
      systemd:
        name: haproxy
        state: started
        enabled: yes

  handlers:
    - name: restart haproxy
      systemd:
        name: haproxy
        state: restarted

  post_tasks:
    - name: Verify load balancer is responding
      uri:
        url: "http://{{ ansible_default_ipv4.address }}:80"
        method: GET
        status_code: 200
      register: lb_check
      failed_when: false

    - name: Display load balancer status
      debug:
        msg: "Load balancer status: {{ 'OK' if lb_check.status == 200 else 'Failed' }}"
EOF
Create additional templates:
# Create deployment report template
cat << 'EOF' > templates/deployment-report.html.j2
<!DOCTYPE html>
<html>
<head>
<title>Deployment Report - {{ ansible_hostname }}</title>
<style>
body { font-family: Arial; margin: 40px; }
.header { background: #f8f9fa; padding: 20px; border-radius: 8px; }
.section { margin: 20px 0; padding: 15px; border-left: 4px solid #007bff; }
.success { color: #28a745; }
.info { color: #17a2b8; }
table { width: 100%; border-collapse: collapse; margin: 10px 0; }
th, td { padding: 8px; text-align: left; border-bottom: 1px solid #ddd; }
th { background: #f8f9fa; }
</style>
</head>
<body>
<div class="header">
<h1>🚀 Deployment Report</h1>
<p class="success">✅ Automated deployment completed successfully</p>
<p><strong>Server:</strong> {{ ansible_hostname }} ({{ ansible_default_ipv4.address }})</p>
<p><strong>Deployed:</strong> {{ ansible_date_time.iso8601 }}</p>
</div>
<div class="section">
<h2>📊 System Information</h2>
<table>
<tr><th>Property</th><th>Value</th></tr>
<tr><td>Operating System</td><td>{{ ansible_distribution }} {{ ansible_distribution_version }}</td></tr>
<tr><td>Kernel Version</td><td>{{ ansible_kernel }}</td></tr>
<tr><td>Architecture</td><td>{{ ansible_architecture }}</td></tr>
<tr><td>CPU Cores</td><td>{{ ansible_processor_vcpus }}</td></tr>
<tr><td>Total Memory</td><td>{{ ansible_memtotal_mb }}MB</td></tr>
<tr><td>Python Version</td><td>{{ ansible_python_version }}</td></tr>
</table>
</div>
<div class="section">
<h2>🌐 Network Configuration</h2>
<table>
<tr><th>Interface</th><th>IP Address</th><th>MAC Address</th></tr>
{% for interface in ansible_interfaces %}
{% if ansible_facts[interface]['ipv4'] is defined %}
<tr>
<td>{{ interface }}</td>
<td>{{ ansible_facts[interface]['ipv4']['address'] }}</td>
<td>{{ ansible_facts[interface]['macaddress'] | default('N/A') }}</td>
</tr>
{% endif %}
{% endfor %}
</table>
</div>
<div class="section">
<h2>💾 Storage Information</h2>
<table>
<tr><th>Mount Point</th><th>Size</th><th>Used</th><th>Available</th><th>Filesystem</th></tr>
{% for mount in ansible_mounts %}
<tr>
<td>{{ mount.mount }}</td>
<td>{{ (mount.size_total / 1024**3) | round(1) }}GB</td>
<td>{{ ((mount.size_total - mount.size_available) / 1024**3) | round(1) }}GB</td>
<td>{{ (mount.size_available / 1024**3) | round(1) }}GB</td>
<td>{{ mount.fstype }}</td>
</tr>
{% endfor %}
</table>
</div>
<p class="info">🤖 This report was generated automatically by Ansible</p>
</body>
</html>
EOF
# Create HAProxy configuration template
cat << 'EOF' > templates/haproxy.cfg.j2
global
log stdout local0
chroot /var/lib/haproxy
stats socket /var/lib/haproxy/stats mode 660 level admin
stats timeout 30s
user haproxy
group haproxy
daemon
defaults
mode http
log global
option httplog
option dontlognull
option http-server-close
option forwardfor except 127.0.0.0/8
option redispatch
retries 3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout http-keep-alive 10s
timeout check 10s
maxconn 3000
frontend web_frontend
bind *:80
default_backend web_servers
backend web_servers
balance roundrobin
option httpchk GET /
{% for host in groups['web_servers'] %}
server {{ host }} {{ hostvars[host]['ansible_default_ipv4']['address'] }}:80 check
{% endfor %}
listen stats
bind *:8404
stats enable
stats uri /stats
stats refresh 30s
stats admin if LOCALHOST
EOF
✅ Step 4: Advanced Automation Strategies
Let’s implement enterprise-grade automation patterns! 🏢 We’ll cover error handling, testing, and complex workflows.
Create error handling and testing playbook:
# Create advanced automation playbook with error handling
cat << 'EOF' > playbooks/advanced-automation.yml
---
- name: Advanced Automation with Error Handling
  hosts: all
  become: yes
  gather_facts: yes
  serial: "25%"            # Process 25% of hosts at a time
  max_fail_percentage: 10  # Allow up to 10% failures

  vars:
    deployment_id: "{{ ansible_date_time.epoch }}"
    rollback_required: false

  pre_tasks:
    - name: Create deployment checkpoint
      file:
        path: "/tmp/ansible-checkpoint-{{ deployment_id }}"
        state: touch
        mode: '0644'

    - name: Validate prerequisites
      block:
        - name: Check minimum disk space
          assert:
            that:
              - ansible_mounts | selectattr('mount', 'equalto', '/') | map(attribute='size_available') | first > 1073741824
            fail_msg: "Insufficient disk space. Need at least 1GB free."

        - name: Check minimum memory
          assert:
            that:
              - ansible_memtotal_mb > 1024
            fail_msg: "Insufficient memory. Need at least 1GB RAM."

        - name: Verify network connectivity
          uri:
            # The default gateway is exposed as ansible_default_ipv4.gateway
            url: "http://{{ ansible_default_ipv4.gateway }}"
            method: GET
            timeout: 10
          failed_when: false
          register: network_check

      rescue:
        - name: Log prerequisite failure
          lineinfile:
            path: "/var/log/ansible-deployment.log"
            line: "{{ ansible_date_time.iso8601 }} - FAILED: Prerequisites not met on {{ inventory_hostname }}"
            create: yes

        - name: Fail gracefully
          fail:
            msg: "Prerequisites validation failed. Check logs for details."

  tasks:
    - name: Execute deployment with error handling
      block:
        - name: Update system packages safely
          dnf:
            name: "*"
            state: latest
            update_cache: yes
          register: update_result
          retries: 3
          delay: 10

        - name: Install monitoring agent
          dnf:
            name:
              - htop
              - iotop
              - tcpdump
            state: present
          register: install_result

        # Keep a known-good copy so the rescue section has something to restore
        - name: Back up SSH configuration
          copy:
            src: /etc/ssh/sshd_config
            dest: "/etc/ssh/sshd_config.pre-{{ deployment_id }}"
            remote_src: yes
            mode: '0600'
          register: sshd_config_backup

        - name: Configure security settings
          lineinfile:
            path: /etc/ssh/sshd_config
            regexp: "{{ item.regexp }}"
            line: "{{ item.line }}"
            backup: yes
          loop:
            - { regexp: '^#?PermitRootLogin', line: 'PermitRootLogin no' }
            - { regexp: '^#?PasswordAuthentication', line: 'PasswordAuthentication no' }
            - { regexp: '^#?MaxAuthTries', line: 'MaxAuthTries 3' }
          notify: restart sshd

        - name: Verify service status
          systemd:
            name: sshd
          register: sshd_status
          failed_when: sshd_status.status.ActiveState != "active"

      rescue:
        - name: Log deployment failure
          lineinfile:
            path: "/var/log/ansible-deployment.log"
            line: "{{ ansible_date_time.iso8601 }} - FAILED: Deployment failed on {{ inventory_hostname }}"
            create: yes

        - name: Set rollback flag
          set_fact:
            rollback_required: true

        - name: Attempt rollback
          block:
            - name: Restore SSH configuration
              copy:
                src: "{{ sshd_config_backup.dest }}"
                dest: /etc/ssh/sshd_config
                remote_src: yes
              when: sshd_config_backup is defined

            - name: Restart SSH service
              systemd:
                name: sshd
                state: restarted

          rescue:
            - name: Emergency rollback failed
              fail:
                msg: "Emergency rollback failed. Manual intervention required."

      always:
        - name: Log deployment completion
          lineinfile:
            path: "/var/log/ansible-deployment.log"
            line: "{{ ansible_date_time.iso8601 }} - {{ 'COMPLETED' if not rollback_required else 'ROLLED_BACK' }}: Deployment on {{ inventory_hostname }}"
            create: yes

        - name: Send notification (placeholder)
          debug:
            msg: "Deployment {{ 'completed successfully' if not rollback_required else 'required rollback' }} on {{ inventory_hostname }}"

  handlers:
    - name: restart sshd
      systemd:
        name: sshd
        state: restarted

  post_tasks:
    - name: Generate deployment summary
      template:
        src: ../templates/deployment-summary.j2
        dest: "/tmp/deployment-summary-{{ deployment_id }}.txt"
        mode: '0644'

    - name: Cleanup temporary files
      file:
        path: "/tmp/ansible-checkpoint-{{ deployment_id }}"
        state: absent
EOF
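The `serial: "25%"` setting splits the host list into batches, and it helps to know how large those batches are before a run. The sketch below estimates that, assuming Ansible rounds the percentage down to a whole number of hosts and never uses fewer than one host per batch. The helper names `batch_size` and `show_batches` are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: hosts per batch for a percentage-based serial value,
# assuming the result is rounded down with a floor of one host.
batch_size() {
    local hosts="$1" percent="$2"
    local size=$(( hosts * percent / 100 ))
    [ "$size" -lt 1 ] && size=1
    echo "$size"
}

# Hypothetical helper: prints the batches a run would be split into.
show_batches() {
    local hosts="$1" percent="$2" size remaining batch=1
    size=$(batch_size "$hosts" "$percent")
    remaining=$hosts
    while [ "$remaining" -gt 0 ]; do
        # The final batch holds whatever is left over.
        [ "$size" -gt "$remaining" ] && size=$remaining
        echo "batch $batch: $size host(s)"
        remaining=$(( remaining - size ))
        batch=$(( batch + 1 ))
    done
}

# Six hosts, as in the sample inventory, at 25% per batch.
show_batches 6 25
```

Under that assumption the six sample hosts are processed one at a time, because 25% of six rounds down to one. A larger fleet, or a higher percentage, is needed before batches actually run in parallel.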
Create CI/CD integration playbook:
# Create CI/CD automation playbook
cat << 'EOF' > playbooks/cicd-deployment.yml
---
- name: CI/CD Application Deployment
hosts: web_servers
become: yes
gather_facts: yes
serial: 1 # Deploy one server at a time (blue-green deployment)
vars:
app_name: "demo-application"
app_version: "{{ app_version | default('latest') }}"
deployment_strategy: "{{ deployment_strategy | default('rolling') }}"
health_check_retries: 5
health_check_delay: 10
pre_tasks:
- name: Validate deployment parameters
assert:
that:
- app_name is defined
- app_version is defined
fail_msg: "Missing required deployment parameters"
- name: Create application directories
file:
path: "{{ item }}"
state: directory
mode: '0755'
loop:
- "/opt/{{ app_name }}"
- "/opt/{{ app_name }}/releases"
- "/opt/{{ app_name }}/shared"
- "/var/log/{{ app_name }}"
tasks:
- name: Blue-Green Deployment Strategy
block:
- name: Stop application service
systemd:
name: "{{ app_name }}"
state: stopped
failed_when: false
- name: Create release directory
file:
path: "/opt/{{ app_name }}/releases/{{ app_version }}"
state: directory
mode: '0755'
- name: Deploy application files (simulate)
copy:
content: |
#!/bin/bash
# {{ app_name }} v{{ app_version }}
# Generated: {{ ansible_date_time.iso8601 }}
echo "Application {{ app_name }} v{{ app_version }} is running on {{ inventory_hostname }}"
while true; do
echo "$(date): Service running..."
sleep 30
done
dest: "/opt/{{ app_name }}/releases/{{ app_version }}/{{ app_name }}.sh"
mode: '0755'
- name: Create application configuration
template:
src: ../templates/app-config.json.j2
dest: "/opt/{{ app_name }}/releases/{{ app_version }}/config.json"
mode: '0644'
- name: Create symbolic link to current release
file:
src: "/opt/{{ app_name }}/releases/{{ app_version }}"
dest: "/opt/{{ app_name }}/current"
state: link
force: yes
- name: Create systemd service file
template:
src: ../templates/app-service.j2
dest: "/etc/systemd/system/{{ app_name }}.service"
mode: '0644'
notify: reload systemd
- name: Start application service
systemd:
name: "{{ app_name }}"
state: started
enabled: yes
daemon_reload: yes
- name: Perform health checks
uri:
url: "http://{{ ansible_default_ipv4.address }}:8080/health"
method: GET
status_code: 200
register: health_check
retries: "{{ health_check_retries }}"
delay: "{{ health_check_delay }}"
failed_when: false
- name: Validate deployment success
assert:
that:
- health_check.status == 200
fail_msg: "Health check failed after deployment"
rescue:
- name: Rollback on failure
block:
- name: Find previous release
find:
paths: "/opt/{{ app_name }}/releases"
file_type: directory
register: releases
- name: Rollback to previous version
file:
src: "{{ releases.files | sort(attribute='mtime') | reverse | list | second | path if releases.files | length > 1 else releases.files[0].path }}"
dest: "/opt/{{ app_name }}/current"
state: link
force: yes
when: releases.files | length > 0
- name: Restart service with previous version
systemd:
name: "{{ app_name }}"
state: restarted
always:
- name: Log rollback action
lineinfile:
path: "/var/log/{{ app_name }}/deployment.log"
line: "{{ ansible_date_time.iso8601 }} - ROLLBACK: Failed deployment of {{ app_version }}, rolled back to previous version"
create: yes
- name: Fail deployment
fail:
msg: "Deployment failed and rollback completed"
post_tasks:
- name: Cleanup old releases (keep last 3)
shell: |
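# Stop if the directory is missing so rm never runs somewhere else
cd /opt/{{ app_name }}/releases || exit 1
# -r: do nothing when there are three or fewer releases
ls -t | tail -n +4 | xargs -r rm -rf --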
args:
executable: /bin/bash
- name: Log successful deployment
lineinfile:
path: "/var/log/{{ app_name }}/deployment.log"
line: "{{ ansible_date_time.iso8601 }} - SUCCESS: Deployed {{ app_name }} v{{ app_version }} on {{ inventory_hostname }}"
create: yes
handlers:
- name: reload systemd
systemd:
daemon_reload: yes
EOF
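The cleanup task above keeps the three newest release directories by modification time. As a minimal sketch of the same idea with one extra safety check, the helper below also refuses to delete the directory that the current symlink points to. The function name `prune_releases` and its arguments are hypothetical, and it assumes GNU find and release names without newlines:

```shell
# Hypothetical helper: prune_releases <releases_dir> <current_link> [keep]
# Deletes all but the newest <keep> release directories and never removes
# the directory that <current_link> resolves to.
prune_releases() {
    local releases_dir="$1" current_link="$2" keep="${3:-3}"
    local current_target dir count=0
    current_target="$(readlink -f "$current_link" 2>/dev/null || true)"

    # List release directories newest first (GNU find prints mtime, then path).
    find "$releases_dir" -mindepth 1 -maxdepth 1 -type d -printf '%T@ %p\n' \
        | sort -rn | cut -d' ' -f2- \
        | while IFS= read -r dir; do
            count=$((count + 1))
            # Keep the newest <keep> entries.
            [ "$count" -le "$keep" ] && continue
            # Never delete the release that is currently live.
            [ "$(readlink -f "$dir")" = "$current_target" ] && continue
            rm -rf -- "$dir"
        done
}

# Example call (app name is a placeholder):
# prune_releases /opt/myapp/releases /opt/myapp/current 3
```

This matters after a rollback, when the live release may no longer be among the three newest directories.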
Create a comprehensive automation management script:
# Create comprehensive automation management script
cat << 'EOF' > automation-manager.sh
#!/bin/bash
ANSIBLE_DIR="$HOME/ansible-automation"
INVENTORY="inventories/production.ini"
VAULT_FILE="group_vars/all/vault.yml"
cd "$ANSIBLE_DIR" || exit 1
deploy_application() {
local environment="$1"
local version="$2"
echo "🚀 DEPLOYING APPLICATION"
echo "========================"
echo "Environment: $environment"
echo "Version: $version"
echo ""
# Validate inputs
if [ -z "$environment" ] || [ -z "$version" ]; then
echo "❌ Usage: deploy <environment> <version>"
return 1
fi
# Check if environment exists (match the quoted name, not any substring)
if ! ansible-inventory --list | grep -q "\"$environment\""; then
echo "❌ Environment '$environment' not found in inventory"
return 1
fi
# Run pre-deployment checks
echo "🔍 Running pre-deployment checks..."
ansible "$environment" -m ping || {
echo "❌ Connectivity check failed"
return 1
}
# Deploy application (test the playbook's exit status directly)
if ansible-playbook playbooks/cicd-deployment.yml \
--limit "$environment" \
--extra-vars "app_version=$version" \
--ask-become-pass; then
echo "✅ Deployment completed successfully"
else
echo "❌ Deployment failed"
return 1
fi
}
rollback_application() {
local environment="$1"
echo "🔄 ROLLING BACK APPLICATION"
echo "==========================="
echo "Environment: $environment"
echo ""
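# Refuse to run without a target
if [ -z "$environment" ]; then
echo "❌ Usage: rollback <environment>"
return 1
fi
# Trigger rollback playbook
ansible-playbook playbooks/rollback.yml \
--limit "$environment" \
--ask-become-pass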
}
infrastructure_health_check() {
echo "🏥 INFRASTRUCTURE HEALTH CHECK"
echo "=============================="
echo ""
# System status
echo "📊 System Status:"
ansible all -m shell -a "uptime" | grep -E "(=>|UNREACHABLE)"
echo ""
echo "💾 Disk Usage:"
ansible all -m shell -a "df -h / | tail -1" | grep -A1 "=>"
echo ""
echo "🔥 Service Status:"
# is-active prints one word per host and exits non-zero when the service is down
ansible web_servers -m command -a "systemctl is-active nginx"
ansible database_servers -m command -a "systemctl is-active mysqld"
echo ""
echo "🌐 Network Connectivity:"
ansible all -m ping | grep -E "(SUCCESS|FAILED|UNREACHABLE)"
}
run_maintenance() {
local maintenance_type="$1"
echo "🔧 RUNNING MAINTENANCE"
echo "======================"
echo "Type: $maintenance_type"
echo ""
case "$maintenance_type" in
update)
ansible-playbook playbooks/system-update.yml --ask-become-pass
;;
security)
ansible-playbook playbooks/security-hardening.yml --ask-become-pass
;;
backup)
ansible-playbook playbooks/backup.yml --ask-become-pass
;;
*)
echo "❌ Unknown maintenance type: $maintenance_type"
echo "Available types: update, security, backup"
return 1
;;
esac
}
generate_reports() {
echo "📊 GENERATING INFRASTRUCTURE REPORTS"
echo "===================================="
echo ""
# Create a dated reports directory (computed once so every file lands in the same folder)
local report_dir
report_dir="reports/$(date +%Y%m%d)"
mkdir -p "$report_dir"
# Generate inventory report
echo "📋 Generating inventory report..."
ansible-inventory --list > "$report_dir/inventory.json"
# Generate system facts
echo "🖥️ Collecting system facts..."
ansible all -m setup --tree "$report_dir/facts/"
# Generate performance report
echo "📈 Generating performance report..."
ansible all -m shell -a "top -bn1 | head -20" > "$report_dir/performance.txt"
# Generate security report (recent logins)
echo "🔒 Generating security report..."
ansible all -m shell -a "last | head -10" > "$report_dir/security.txt"
echo "✅ Reports generated in $report_dir/"
}
interactive_mode() {
echo "🤖 ANSIBLE AUTOMATION MANAGER"
echo "=============================="
echo ""
while true; do
echo "Available options:"
echo "1. Deploy application"
echo "2. Health check"
echo "3. Run maintenance"
echo "4. Generate reports"
echo "5. Rollback application"
echo "6. Exit"
echo ""
read -p "Select option (1-6): " choice
case $choice in
1)
read -p "Environment: " env
read -p "Version: " ver
deploy_application "$env" "$ver"
;;
2)
infrastructure_health_check
;;
3)
read -p "Maintenance type (update/security/backup): " type
run_maintenance "$type"
;;
4)
generate_reports
;;
5)
read -p "Environment: " env
rollback_application "$env"
;;
6)
echo "👋 Goodbye!"
exit 0
;;
*)
echo "❌ Invalid option. Please select 1-6."
;;
esac
echo ""
read -p "Press Enter to continue..."
echo ""
done
}
case "$1" in
deploy)
deploy_application "$2" "$3"
;;
health)
infrastructure_health_check
;;
maintenance)
run_maintenance "$2"
;;
reports)
generate_reports
;;
rollback)
rollback_application "$2"
;;
interactive)
interactive_mode
;;
*)
echo "Usage: $0 {deploy|health|maintenance|reports|rollback|interactive}"
echo ""
echo "Commands:"
echo " deploy <env> <version> - Deploy application"
echo " health - Check infrastructure health"
echo " maintenance <type> - Run maintenance tasks"
echo " reports - Generate infrastructure reports"
echo " rollback <env> - Rollback application"
echo " interactive - Launch interactive mode"
;;
esac
EOF
chmod +x automation-manager.sh
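The deploy command above only checks that a version was supplied. Because the value is passed straight into --extra-vars, a stricter check can prevent surprises such as a stray space turning into a second variable. The sketch below is one way to do that. The function name `validate_version` is hypothetical, and the accepted format (an optional "v" followed by three dot-separated numbers, as in v2.1.0) is an assumption about your versioning scheme:

```shell
# Hypothetical helper: succeed only for versions such as v2.1.0 or 2.1.0.
validate_version() {
    # Reject empty input and any character other than digits, dots, and "v"
    # (this also rules out spaces, newlines, and shell metacharacters).
    case "$1" in
        '' | *[!0-9.v]*) return 1 ;;
    esac
    # Require the exact shape: optional "v", then MAJOR.MINOR.PATCH.
    printf '%s\n' "$1" | grep -Eq '^v?[0-9]+\.[0-9]+\.[0-9]+$'
}

# Possible use inside deploy_application, before calling ansible-playbook:
# validate_version "$version" || { echo "Invalid version: $version"; return 1; }
```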
Create additional configuration templates:
# Create app configuration template
cat << 'EOF' > templates/app-config.json.j2
{
"application": {
"name": "{{ app_name }}",
"version": "{{ app_version }}",
"environment": "{{ ansible_environment | default('production') }}",
"host": "{{ ansible_hostname }}",
"ip": "{{ ansible_default_ipv4.address }}"
},
"database": {
"host": "{{ groups['database_servers'][0] if groups['database_servers'] is defined else 'localhost' }}",
"port": 3306,
"name": "{{ mysql_database_name | default('appdb') }}"
},
"monitoring": {
"enabled": true,
"endpoint": "/health",
"port": 8080
},
"deployment": {
"timestamp": "{{ ansible_date_time.iso8601 }}",
"deployed_by": "ansible"
}
}
EOF
# Create systemd service template
cat << 'EOF' > templates/app-service.j2
[Unit]
Description={{ app_name }} Application Service
After=network.target
Wants=network.target
[Service]
Type=simple
User=nobody
Group=nobody
WorkingDirectory=/opt/{{ app_name }}/current
ExecStart=/opt/{{ app_name }}/current/{{ app_name }}.sh
Restart=always
RestartSec=10
StandardOutput=journal
StandardError=journal
SyslogIdentifier={{ app_name }}
[Install]
WantedBy=multi-user.target
EOF
🎮 Quick Examples
Let’s see your Ansible automation in action with real-world scenarios! 🎯
Example 1: Zero-Downtime Deployment
# Run zero-downtime deployment
./automation-manager.sh deploy web_servers v2.1.0
# Monitor deployment progress
watch -n 2 "ansible web_servers -m shell -a 'systemctl status demo-application'"
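Instead of watching the output by eye, you can poll until the service answers. The sketch below is a small generic retry helper in the spirit of the playbook's retries and delay settings. The function name `retry` is hypothetical, and the commented curl call assumes the /health endpoint on port 8080 from the config template plus a placeholder host name:

```shell
# Hypothetical helper: retry <attempts> <delay_seconds> <command...>
# Runs the command until it succeeds or the attempt limit is reached.
retry() {
    local attempts="$1" delay="$2" n=1
    shift 2
    until "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            echo "Gave up after $n attempts: $*" >&2
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}

# Example (host name is a placeholder):
# retry 5 10 curl -fsS http://web1.example.com:8080/health
```

The helper returns non-zero when every attempt fails, so it can gate a follow-up step such as a rollback.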
Example 2: Infrastructure Scaling
# Create auto-scaling playbook
cat << 'EOF' > playbooks/auto-scale.yml
---
- name: Auto-Scale Infrastructure
  hosts: localhost
  gather_facts: no
  vars:
    min_servers: 2
    max_servers: 10
    cpu_threshold: 80
  tasks:
    - name: Check current server load
      uri:
        url: "http://{{ item }}:8080/metrics"
        method: GET
      register: server_metrics
      loop: "{{ groups['web_servers'] }}"
      failed_when: false
    # Skip servers that did not return JSON so one dead host does not break the run
    - name: Collect CPU readings from servers that responded
      set_fact:
        cpu_values: "{{ server_metrics.results | selectattr('json', 'defined') | map(attribute='json.cpu_usage') | list }}"
    # There is no built-in average filter, so divide the sum by the count
    - name: Calculate average CPU usage
      set_fact:
        avg_cpu: "{{ (cpu_values | sum) / (cpu_values | length) }}"
      when: cpu_values | length > 0
    - name: Scale up if high load
      debug:
        msg: "Scaling up - CPU usage: {{ avg_cpu }}%"
      when: avg_cpu is defined and avg_cpu | int > cpu_threshold
    - name: Scale down if low load
      debug:
        msg: "Scaling down - CPU usage: {{ avg_cpu }}%"
      when: avg_cpu is defined and avg_cpu | int < 30 and groups['web_servers'] | length > min_servers
EOF
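The playbook above only prints a message, so the decision rule itself is easy to prototype in the shell first. Here is a rough sketch using the same thresholds (80% to scale up, 30% to scale down, at least 2 servers). It also caps growth at max_servers, which the playbook defines but never checks. The function name `scale_decision` is hypothetical:

```shell
# Hypothetical helper: scale_decision <avg_cpu> <server_count>
# Prints "up", "down", or "hold" using the thresholds from the playbook vars.
scale_decision() {
    local avg_cpu="$1" count="$2"
    local cpu_threshold=80 low_threshold=30 min_servers=2 max_servers=10
    # Drop any fractional part so integer comparisons work (85.5 becomes 85),
    # which matches the truncation done by the playbook's int filter.
    avg_cpu="${avg_cpu%.*}"
    if [ "$avg_cpu" -gt "$cpu_threshold" ] && [ "$count" -lt "$max_servers" ]; then
        echo up
    elif [ "$avg_cpu" -lt "$low_threshold" ] && [ "$count" -gt "$min_servers" ]; then
        echo down
    else
        echo hold
    fi
}

# Example: scale_decision 85.5 4
```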
Example 3: Disaster Recovery
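# Run a backup from the interactive menu
./automation-manager.sh interactive
# Select option 3 (maintenance) and then 'backup'
# This runs playbooks/backup.yml to back up critical data and configurations
# Non-interactive equivalent:
./automation-manager.sh maintenance backup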
🚨 Fix Common Problems
Don’t worry when automation issues arise – here are solutions to common Ansible problems! 🛠️
Problem 1: SSH Connection Failures
Symptoms: “UNREACHABLE” errors, SSH timeouts
# Test SSH connectivity manually
ssh -v user@target-server
# Common fixes:
# 1. Verify SSH key authentication
ssh-copy-id user@target-server
# 2. Check SSH configuration
ansible target-server -m ping -vvv
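# 3. Update inventory with correct SSH settings
# (this is an inventory variable, not a shell command; it turns off host key
# verification, so use it only in a lab you trust)
ansible_ssh_common_args='-o StrictHostKeyChecking=no'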
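With many hosts, it helps to pull just the unreachable names out of the ping output so you can test each one with ssh -v. A minimal sketch follows, assuming the default ad-hoc output where each result starts with a line like "web2 | UNREACHABLE! => {". The function name `unreachable_hosts` is hypothetical:

```shell
# Hypothetical helper: read `ansible ... -m ping` output on stdin and print
# the names of hosts reported as UNREACHABLE, one per line.
unreachable_hosts() {
    # Split each line on " | " and print the host name from matching lines.
    awk -F' [|] ' '/ [|] UNREACHABLE!/ { print $1 }'
}

# Usage:
# ansible all -m ping | unreachable_hosts
```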
Problem 2: Permission Denied Errors
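Symptoms: “Missing sudo password” or “sudo: a password is required” errors
# Use become password
ansible-playbook playbook.yml --ask-become-pass
# Or configure passwordless sudo on the target (replace "username" with the Ansible user)
echo "username ALL=(ALL) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/ansible
sudo chmod 0440 /etc/sudoers.d/ansible
# Validate the new file before relying on it
sudo visudo -cf /etc/sudoers.d/ansible
# Test privilege escalation (should print "root" for every host)
ansible all -b -m command -a "whoami"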
Problem 3: Playbook Syntax Errors
Symptoms: YAML parsing errors, indentation issues
# Check playbook syntax
ansible-playbook playbook.yml --syntax-check
# Validate YAML syntax
python3 -c "import yaml; yaml.safe_load(open('playbook.yml'))"
# Use ansible-lint for best practices
pip3 install ansible-lint
ansible-lint playbook.yml
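To catch syntax errors before they reach production, the same check can be run over every playbook in a directory. The sketch below does that. The function name `check_playbooks` is hypothetical, the default checker is ansible-playbook --syntax-check, and any other command can be passed in its place:

```shell
# Hypothetical helper: check_playbooks <dir> [checker command...]
# Runs the checker on every *.yml file in <dir>; returns non-zero if any fail.
check_playbooks() {
    local dir="$1" failed=0 playbook
    shift
    # Fall back to Ansible's own syntax check when no checker is given.
    [ "$#" -gt 0 ] || set -- ansible-playbook --syntax-check
    for playbook in "$dir"/*.yml; do
        # With no matches the glob stays literal, so skip non-existent paths.
        [ -e "$playbook" ] || continue
        if "$@" "$playbook" >/dev/null 2>&1; then
            echo "OK   $playbook"
        else
            echo "FAIL $playbook"
            failed=1
        fi
    done
    return "$failed"
}

# Usage:
# check_playbooks playbooks
```

Returning a non-zero status on any failure makes the function usable as a Git pre-commit hook or a CI step.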
Problem 4: Module Not Found Errors
Symptoms: “MODULE FAILURE” or module import errors
# Check Python path on target hosts
ansible all -m shell -a "which python3"
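# Install required Python modules (uses the pip module; pip3 must exist on the targets)
ansible all -b -m pip -a "name=requests executable=pip3"
# Set the Python interpreter in the inventory (a host or group variable, not a shell command)
ansible_python_interpreter=/usr/bin/python3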
📋 Simple Commands Summary
Here’s your Ansible automation quick reference guide! 📚
| Task | Command | Purpose |
|---|---|---|
| Test Connectivity | ansible all -m ping | Test connection to all hosts |
| Run Playbook | ansible-playbook playbook.yml | Execute automation playbook |
| Check Syntax | ansible-playbook playbook.yml --syntax-check | Validate playbook syntax |
| Dry Run | ansible-playbook playbook.yml --check | Test without making changes |
| Run Command | ansible all -m shell -a "command" | Execute command on all hosts |
| Gather Facts | ansible all -m setup | Collect system information |
| List Inventory | ansible-inventory --list | Show inventory configuration |
| Run Role | ansible-playbook site.yml --tags "role_name" | Run only the tasks or roles tagged role_name |
| Limit Hosts | ansible-playbook playbook.yml --limit web_servers | Run on specific group |
| Vault Operations | ansible-vault encrypt/decrypt file.yml | Manage encrypted variables |
| Custom Scripts | ./automation-manager.sh health | Run custom automation scripts |
| Interactive Mode | ./automation-manager.sh interactive | Launch automation manager |
💡 Tips for Success
Follow these expert strategies to master Ansible automation! 🌟
🎯 Automation Best Practices
- Start small and iterate – Begin with simple tasks and gradually add complexity
- Use version control – Keep all your automation code in Git repositories
- Test thoroughly – Always use --check mode before running playbooks in production
- Document everything – Clear variable names and comments make maintenance easier
🔧 Performance Optimization
- Use parallel execution – Ansible runs each task on several hosts at once (5 forks by default); raise the forks setting for larger inventories
- Implement fact caching – Cache gathered facts to speed up subsequent runs
- Minimize SSH connections – Use connection persistence and pipelining
- Group related tasks – Use blocks to organize logically related tasks
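Most of these optimizations come down to a few ansible.cfg settings. The snippet below writes an example file you can review and merge into your own configuration. The file name ansible.cfg.example, the fork count, the cache path, and the timeouts are illustrative assumptions rather than recommended values:

```shell
# Write an example config; review it before merging into your real ansible.cfg
cat << 'EOF' > ansible.cfg.example
[defaults]
# Number of hosts handled in parallel (the default is 5)
forks = 20
# Gather facts only when they are not already cached
gathering = smart
# Cache facts as JSON files for 24 hours
fact_caching = jsonfile
fact_caching_connection = /tmp/ansible_fact_cache
fact_caching_timeout = 86400

[ssh_connection]
# Fewer SSH operations per task; requires that sudo on the targets does not enforce requiretty
pipelining = True
# Reuse SSH connections between tasks
ssh_args = -o ControlMaster=auto -o ControlPersist=60s
EOF
```

Writing to a separate example file avoids overwriting an ansible.cfg you may already have in the project directory.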
🛡️ Security and Compliance
- Use Ansible Vault – Encrypt sensitive variables and passwords
- Implement proper RBAC – Use different service accounts for different environments
- Audit automation runs – Log all playbook executions and results
- Regular security updates – Automate security patching across all systems
🚀 Advanced Techniques
- Dynamic inventory – Use scripts to automatically discover infrastructure
- Custom modules – Write your own modules for specific automation needs
- Error handling – Implement comprehensive error handling and rollback procedures
- CI/CD integration – Integrate Ansible with your deployment pipelines
🏆 What You Learned
Congratulations! You’ve mastered Ansible automation on AlmaLinux! 🎉 Here’s your incredible achievement:
- ✅ Built a complete automation infrastructure with control machines and managed nodes
- ✅ Created comprehensive playbooks for system management and application deployment
- ✅ Developed reusable roles for database, monitoring, and web server automation
- ✅ Implemented advanced error handling with rollback and recovery procedures
- ✅ Set up CI/CD automation with blue-green deployments and health checks
- ✅ Created management scripts for enterprise-scale automation operations
- ✅ Mastered inventory management with dynamic and static configurations
- ✅ Implemented security best practices with encrypted variables and proper access control
- ✅ Built monitoring and reporting systems for automation oversight
- ✅ Developed troubleshooting skills for common automation challenges
🎯 Why This Matters
Ansible automation expertise is absolutely essential in today’s IT landscape! 💎
Every organization needs reliable, scalable automation to manage their infrastructure efficiently. From startups managing a few servers to enterprises with thousands of systems, Ansible automation enables consistent, error-free operations at any scale.
These skills open doors to high-paying DevOps Engineer, Cloud Architect, and Site Reliability Engineer positions. Companies desperately need automation experts who can design, implement, and maintain infrastructure-as-code solutions that reduce costs and improve reliability.
Remember, you haven’t just learned a tool – you’ve mastered the art of infrastructure automation. Your ability to codify and version infrastructure management puts you at the forefront of modern IT operations.
Keep automating, keep optimizing, and keep pushing the boundaries of what’s possible with infrastructure-as-code! Your expertise will power the next generation of scalable, reliable systems! 🤖⚡🙌