📘 Network Automation: Netmiko and NAPALM

🎯 Introduction

Welcome to the exciting world of network automation! 🎉 In this guide, we’ll explore how Netmiko and NAPALM can revolutionize the way you manage network devices using Python.

You’ll discover how these powerful libraries can transform tedious manual network configurations into automated, reliable, and scalable solutions. Whether you’re managing routers 🔌, switches 🖥️, or firewalls 🛡️, understanding network automation is essential for modern network engineering.

By the end of this tutorial, you’ll feel confident automating network tasks with Python! Let’s dive in! 🏊‍♂️

📚 Understanding Network Automation

🤔 What are Netmiko and NAPALM?

Network automation is like having a smart assistant for your network devices 🤖. Think of it as teaching your computer to speak the language of routers and switches!

Netmiko is like a universal translator 🌍 that helps Python communicate with network devices through SSH. It’s built on top of Paramiko and simplifies connecting to various vendor devices.

NAPALM (Network Automation and Programmability Abstraction Layer with Multivendor support) is like a Swiss Army knife 🔧 for network automation. It provides a unified API to interact with different network device operating systems.

In Python terms, these libraries help you:

✨ Connect to network devices programmatically
🚀 Execute commands and retrieve outputs
🛡️ Configure devices consistently across vendors
📊 Extract operational data in structured formats

💡 Why Use Network Automation?

Here’s why network engineers love automation:

Consistency 🔒: Apply configurations uniformly across devices
Speed ⚡: Configure hundreds of devices in minutes
Accuracy 🎯: Reduce human errors from manual typing
Documentation 📖: Code serves as living documentation
Scalability 📈: Manage growing networks efficiently

Real-world example: Imagine updating VLAN configurations across 100 switches 🏢. With automation, you can complete this task in minutes instead of hours!

🔧 Basic Syntax and Usage

📝 Netmiko Basics

Let’s start with connecting to a device using Netmiko:

from netmiko import ConnectHandler

# 🔌 Device connection details
cisco_device = {
    'device_type': 'cisco_ios',  # 🏷️ Specify device type
    'host': '192.168.1.1',        # 🏠 Device IP address
    'username': 'admin',          # 👤 Login username
    'password': 'secure_pass',    # 🔐 Login password
    'secret': 'enable_pass'       # 🔑 Enable password
}

# 🚀 Connect to the device
connection = ConnectHandler(**cisco_device)
connection.enable()  # 📈 Enter enable mode

# 💡 Send a command
output = connection.send_command('show version')
print(f"🖥️ Device info:\n{output}")

# 🔒 Always close the connection!
connection.disconnect()

💡 Explanation: The device_type value tells Netmiko which vendor-specific driver to use for prompts, paging, and configuration mode. Netmiko supports many vendors including Cisco, Juniper, Arista, and more.
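
To make the role of the device type concrete, here is a minimal sketch of a helper that builds connection dictionaries for a mixed-vendor inventory. The helper name `build_device` and the `PLATFORM_TO_DEVICE_TYPE` mapping are hypothetical and not part of Netmiko; the device type strings themselves (`cisco_ios`, `juniper_junos`, `arista_eos`) are values Netmiko recognizes.

```python
# Hypothetical mapping from a short vendor label to a Netmiko device_type
PLATFORM_TO_DEVICE_TYPE = {
    "cisco": "cisco_ios",
    "juniper": "juniper_junos",
    "arista": "arista_eos",
}


def build_device(host, platform, username, password, secret=None):
    """Return a dict that can be passed as ConnectHandler(**device)."""
    try:
        device_type = PLATFORM_TO_DEVICE_TYPE[platform]
    except KeyError:
        raise ValueError(f"Unsupported platform: {platform!r}") from None
    device = {
        "device_type": device_type,
        "host": host,
        "username": username,
        "password": password,
    }
    if secret is not None:
        device["secret"] = secret  # only needed where an enable password is used
    return device


inventory = [
    build_device("192.168.1.1", "cisco", "admin", "secure_pass", secret="enable_pass"),
    build_device("192.168.1.2", "arista", "admin", "secure_pass"),
]
for device in inventory:
    print(device["host"], "->", device["device_type"])
```

Keeping the mapping in one place means a typo in a device type fails early with a clear error instead of at connection time.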

🎯 NAPALM Basics

Here’s how to use NAPALM for vendor-agnostic operations:

from napalm import get_network_driver

# 🎨 Get the appropriate driver
driver = get_network_driver('ios')  # 🏷️ Cisco IOS driver

# 🔧 Create device object
device = driver(
    hostname='192.168.1.1',
    username='admin',
    password='secure_pass',
    optional_args={'secret': 'enable_pass'}
)

# 🌟 Open connection
device.open()

# 📊 Get device facts
facts = device.get_facts()
print(f"🏢 Device Model: {facts['model']}")
print(f"🔢 Serial Number: {facts['serial_number']}")
print(f"💾 OS Version: {facts['os_version']}")

# 🎯 Get interfaces
interfaces = device.get_interfaces()
for intf, details in interfaces.items():
    status = '🟢 UP' if details['is_up'] else '🔴 DOWN'  # match the icon to the state
    print(f"🔌 {intf}: {status}")

# 🔒 Close connection
device.close()
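
Because NAPALM returns plain dictionaries, the results can be post-processed without any device-specific parsing. The sketch below filters for interfaces that are administratively enabled but operationally down. The function name `find_down_interfaces` and the sample data are made up for illustration; the `is_up` and `is_enabled` keys follow the shape of `get_interfaces()` output.

```python
def find_down_interfaces(interfaces):
    """Return names of interfaces that are enabled but not up."""
    return sorted(
        name
        for name, details in interfaces.items()
        if details["is_enabled"] and not details["is_up"]
    )


# Sample data in the shape returned by get_interfaces() (values are invented)
sample_interfaces = {
    "GigabitEthernet0/1": {"is_up": True, "is_enabled": True, "description": "uplink"},
    "GigabitEthernet0/2": {"is_up": False, "is_enabled": True, "description": "server"},
    "GigabitEthernet0/3": {"is_up": False, "is_enabled": False, "description": "unused"},
}

print(find_down_interfaces(sample_interfaces))
```

Interfaces that were shut down on purpose are skipped, so only unexpected outages are reported.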

💡 Practical Examples

🏢 Example 1: Bulk Configuration Updater

Let’s build a tool to update configurations across multiple devices:

from netmiko import ConnectHandler
from concurrent.futures import ThreadPoolExecutor
import logging

# 📝 Setup logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

class NetworkConfigurator:
    def __init__(self, devices):
        self.devices = devices  # 📋 List of device dictionaries
        
    def configure_device(self, device):
        """🔧 Configure a single device"""
        try:
            # 🚀 Connect to device
            logger.info(f"🔌 Connecting to {device['host']}")
            connection = ConnectHandler(**device)
            connection.enable()
            
            # 📋 Configuration commands
            config_commands = [
                'ntp server 10.1.1.1',           # ⏰ NTP server
                'logging host 10.1.1.100',       # 📊 Syslog server
                'banner motd # Authorized Use Only! #',  # 🚪 Login banner (ASCII only)
                'service timestamps debug datetime msec',     # ⏱️ Timestamps
            ]
            
            # 💫 Send configuration
            output = connection.send_config_set(config_commands)
            logger.info(f"✅ Configured {device['host']} successfully!")
            
            # 💾 Save configuration
            connection.save_config()
            logger.info(f"💾 Configuration saved on {device['host']}")
            
            connection.disconnect()
            return {'device': device['host'], 'status': 'success', 'output': output}
            
        except Exception as e:
            logger.error(f"❌ Failed to configure {device['host']}: {str(e)}")
            return {'device': device['host'], 'status': 'failed', 'error': str(e)}
    
    def configure_all(self, max_threads=5):
        """🚀 Configure all devices in parallel"""
        results = []
        
        # 🏃‍♂️ Use ThreadPoolExecutor for parallel execution
        with ThreadPoolExecutor(max_workers=max_threads) as executor:
            futures = [executor.submit(self.configure_device, device) 
                      for device in self.devices]
            
            # 📊 Collect results
            for future in futures:
                results.append(future.result())
        
        # 📈 Summary
        success_count = sum(1 for r in results if r['status'] == 'success')
        print(f"\n🎉 Configuration Summary:")
        print(f"✅ Successful: {success_count}/{len(self.devices)}")
        print(f"❌ Failed: {len(self.devices) - success_count}/{len(self.devices)}")
        
        return results

# 🎮 Let's use it!
devices = [
    {
        'device_type': 'cisco_ios',
        'host': '192.168.1.1',
        'username': 'admin',
        'password': 'pass123',
        'secret': 'enable123'
    },
    {
        'device_type': 'cisco_ios',
        'host': '192.168.1.2',
        'username': 'admin',
        'password': 'pass123',
        'secret': 'enable123'
    }
]

configurator = NetworkConfigurator(devices)
results = configurator.configure_all()

🎯 Try it yourself: Add error handling for specific configuration failures and implement rollback functionality!
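
As a starting point for the error-handling part, one possible approach is to scan the text returned by send_config_set for error lines before saving. This is a sketch under assumptions: the function name `find_config_errors` is hypothetical, the markers cover common Cisco IOS CLI error messages only, and the sample output is invented.

```python
# Prefixes of common Cisco IOS CLI error messages (not an exhaustive list)
ERROR_MARKERS = ("% Invalid input", "% Incomplete command", "% Ambiguous command")


def find_config_errors(output):
    """Return the lines of device output that look like IOS error messages."""
    return [
        line.strip()
        for line in output.splitlines()
        if line.strip().startswith(ERROR_MARKERS)
    ]


# Invented sample output containing one mistyped command
sample_output = """\
switch(config)#ntp server 10.1.1.1
switch(config)#loging host 10.1.1.100
                 ^
% Invalid input detected at '^' marker.
switch(config)#end
"""

errors = find_config_errors(sample_output)
if errors:
    print(f"Found {len(errors)} error(s); skipping save_config()")
```

Inside configure_device, a check like this could run between send_config_set and save_config so that a rejected command is reported as a failure rather than silently saved around.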

🎮 Example 2: Network Health Monitor

Let’s create a comprehensive network health monitoring system:

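from napalm import get_network_driver
from datetime import datetime
import logging

# 📝 Setup logging (used by the error handler in check_device_health)
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)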

class NetworkHealthMonitor:
    def __init__(self):
        self.health_data = []  # 📊 Store health metrics
        
    def check_device_health(self, device_info):
        """🏥 Check health of a single device"""
        driver = get_network_driver(device_info['driver'])
        device = driver(**device_info['connection_params'])
        
        try:
            device.open()
            health_report = {
                'timestamp': datetime.now().isoformat(),  # ⏰ Current time
                'hostname': device_info['connection_params']['hostname'],
                'checks': {}
            }
            
            # 🔍 Check 1: Device facts
            facts = device.get_facts()
            health_report['device_info'] = {
                'model': facts['model'],
                'uptime': facts['uptime'],
                'vendor': facts['vendor']
            }
            
            # 💾 Check 2: Memory usage
            environment = device.get_environment()
            if 'memory' in environment:
                memory = environment['memory']
                used_percent = (memory['used_ram'] / memory['available_ram']) * 100
                health_report['checks']['memory'] = {
                    'status': '✅' if used_percent < 80 else '⚠️',
                    'used_percent': round(used_percent, 2),
                    'message': f"Memory usage: {used_percent:.1f}%"
                }
            
            # 🌡️ Check 3: Temperature sensors
            # NAPALM reports temperatures under 'temperature'; 'cpu' only holds '%usage'
            temp_sensors = environment.get('temperature', {})
            if temp_sensors:
                temps = [sensor['temperature'] for sensor in temp_sensors.values()]
                avg_temp = sum(temps) / len(temps)
                health_report['checks']['temperature'] = {
                    'status': '✅' if avg_temp < 70 else '🔥',
                    'avg_celsius': round(avg_temp, 1),
                    'message': f"Avg temp: {avg_temp:.1f}°C"
                }
            
            # 🔌 Check 4: Interface errors
            interfaces = device.get_interfaces_counters()
            error_interfaces = []
            for intf, counters in interfaces.items():
                if counters['rx_errors'] > 100 or counters['tx_errors'] > 100:
                    error_interfaces.append(intf)
            
            health_report['checks']['interfaces'] = {
                'status': '✅' if not error_interfaces else '⚠️',
                'error_count': len(error_interfaces),
                'message': f"Interfaces with errors: {len(error_interfaces)}"
            }
            
            # 🎯 Overall health score
            health_score = self._calculate_health_score(health_report['checks'])
            health_report['health_score'] = health_score
            health_report['health_emoji'] = self._get_health_emoji(health_score)
            
            device.close()
            return health_report
            
        except Exception as e:
            logger.error(f"❌ Health check failed: {str(e)}")
            return {
                'hostname': device_info['connection_params']['hostname'],
                'status': 'failed',
                'error': str(e)
            }
    
    def _calculate_health_score(self, checks):
        """📊 Calculate overall health score (0-100)"""
        scores = {
            '✅': 100,
            '⚠️': 70,
            '❌': 30,
            '🔥': 50
        }
        
        if not checks:
            return 0
            
        total_score = sum(scores.get(check['status'], 0) for check in checks.values())
        return total_score // len(checks)
    
    def _get_health_emoji(self, score):
        """🎨 Get emoji based on health score"""
        if score >= 90:
            return "💚"  # Excellent
        elif score >= 70:
            return "💛"  # Good
        elif score >= 50:
            return "🧡"  # Warning
        else:
            return "❤️"  # Critical
    
    def generate_report(self, devices):
        """📈 Generate health report for all devices"""
        print("🏥 Network Health Check Report")
        print("=" * 50)
        
        for device in devices:
            report = self.check_device_health(device)
            if 'error' not in report:
                print(f"\n🏢 Device: {report['hostname']}")
                print(f"   Health: {report['health_emoji']} {report['health_score']}%")
                for check_name, check_data in report['checks'].items():
                    print(f"   {check_data['status']} {check_name}: {check_data['message']}")
            else:
                print(f"\n❌ Device: {report['hostname']} - Check Failed!")

# 🎮 Usage example
devices = [
    {
        'driver': 'ios',
        'connection_params': {
            'hostname': '192.168.1.1',
            'username': 'admin',
            'password': 'pass123',
            'optional_args': {'secret': 'enable123'}
        }
    }
]

monitor = NetworkHealthMonitor()
monitor.generate_report(devices)

🚀 Advanced Concepts

🧙‍♂️ Configuration Templating with Jinja2

When you’re ready to level up, combine network automation with templating:

from jinja2 import Template
from netmiko import ConnectHandler

# 🎨 Create a configuration template
vlan_template = Template("""
{% for vlan in vlans %}
vlan {{ vlan.id }}
 name {{ vlan.name }}
 {% if vlan.description %}
 description {{ vlan.description }}
 {% endif %}
!
interface vlan {{ vlan.id }}
 description {{ vlan.name }} SVI
 ip address {{ vlan.ip }} {{ vlan.mask }}
 no shutdown
!
{% endfor %}
""")

# 📊 VLAN data
vlan_data = {
    'vlans': [
        {'id': 10, 'name': 'SALES', 'description': 'Sales Department', 
         'ip': '10.1.10.1', 'mask': '255.255.255.0'},
        {'id': 20, 'name': 'IT', 'description': 'IT Department',
         'ip': '10.1.20.1', 'mask': '255.255.255.0'},
        {'id': 30, 'name': 'GUEST', 'description': 'Guest Network',
         'ip': '10.1.30.1', 'mask': '255.255.255.0'}
    ]
}

# 🔧 Generate configuration
config = vlan_template.render(vlan_data)
print("📋 Generated Configuration:")
print(config)

# 🚀 Apply to device
def apply_templated_config(device_params, config_text):
    connection = ConnectHandler(**device_params)
    connection.enable()
    
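    # 💫 Send configuration (skip blank lines left over from the template)
    commands = [line for line in config_text.splitlines() if line.strip()]
    output = connection.send_config_set(commands)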
    connection.save_config()
    connection.disconnect()
    
    return output

🏗️ Event-Driven Automation

For the brave automators, implement reactive network automation:

import asyncio
from napalm import get_network_driver

class NetworkEventHandler:
    def __init__(self):
        self.thresholds = {
            'cpu_usage': 80,      # 🔥 CPU threshold
            'memory_usage': 85,   # 💾 Memory threshold
            'interface_errors': 100  # ⚠️ Error threshold
        }
        
    async def monitor_device(self, device_info):
        """🔍 Continuously monitor device and trigger actions"""
        driver = get_network_driver(device_info['driver'])
        
        while True:
            try:
                device = driver(**device_info['connection_params'])
                device.open()
                
                # 📊 Get environment data
                env = device.get_environment()
                
                # 🔥 Check CPU
                if 'cpu' in env:
                    cpu_usage = list(env['cpu'].values())[0]['%usage']
                    if cpu_usage > self.thresholds['cpu_usage']:
                        await self.handle_high_cpu(device_info, cpu_usage)
                
                # 💾 Check Memory
                if 'memory' in env:
                    memory = env['memory']
                    memory_usage = (memory['used_ram'] / memory['available_ram']) * 100
                    if memory_usage > self.thresholds['memory_usage']:
                        await self.handle_high_memory(device_info, memory_usage)
                
                device.close()
                
            except Exception as e:
                print(f"❌ Monitoring error: {str(e)}")
            
            # ⏱️ Wait before next check
            await asyncio.sleep(60)  # Check every minute
    
    async def handle_high_cpu(self, device_info, cpu_usage):
        """🔥 Handle high CPU usage event"""
        print(f"🚨 HIGH CPU ALERT on {device_info['connection_params']['hostname']}!")
        print(f"   CPU Usage: {cpu_usage}%")
        print(f"   🔧 Triggering automated response...")
        
        # Implement automated response (e.g., clear ARP cache, restart process)
        # This is where you'd add your remediation logic
        
    async def handle_high_memory(self, device_info, memory_usage):
        """💾 Handle high memory usage event"""
        print(f"🚨 HIGH MEMORY ALERT on {device_info['connection_params']['hostname']}!")
        print(f"   Memory Usage: {memory_usage:.1f}%")
        print(f"   🔧 Triggering automated response...")

⚠️ Common Pitfalls and Solutions

😱 Pitfall 1: Not Handling Connection Timeouts

# ❌ Risky - relies on the default timeouts
connection = ConnectHandler(**device_params)
output = connection.send_command('show running-config')  # 💥 May time out on slow devices!

# ✅ Better - set timeouts explicitly
connection = ConnectHandler(
    **device_params,
    conn_timeout=30,      # 🕐 TCP connection timeout (seconds)
    session_timeout=60    # ⏱️ Session lock timeout for concurrent requests
)

# Also use command-specific timeouts
output = connection.send_command(
    'show running-config',
    read_timeout=120  # 📊 Long commands need more time!
)
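
Longer timeouts help with slow commands, but a device can still fail to answer. A common complement is to retry a few times with an increasing delay. The sketch below is generic and runs without a device: `run_with_retries` and `flaky_show_version` are hypothetical names, and the built-in TimeoutError stands in for whatever timeout exception your connection library raises (with Netmiko, that would typically be NetmikoTimeoutException).

```python
import time


def run_with_retries(operation, attempts=3, base_delay=1.0, retry_on=(TimeoutError,)):
    """Call operation(); on a listed exception, wait and try again.

    The delay doubles after each failed attempt (base_delay, 2x, 4x, ...).
    The last failure is re-raised to the caller.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on as exc:
            if attempt == attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)
            print(f"Attempt {attempt} failed ({exc}); retrying in {delay:.2f}s")
            time.sleep(delay)


# Demo with a stand-in operation that times out twice, then succeeds
calls = {"count": 0}


def flaky_show_version():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("device did not respond")
    return "Cisco IOS Software (sample output)"


result = run_with_retries(flaky_show_version, attempts=3, base_delay=0.01)
print(result)
```

Retries suit read-only commands best; repeating a configuration change blindly can apply it twice, so treat those with more care.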

🤯 Pitfall 2: Not Saving Configurations

# ❌ Dangerous - changes lost on reboot!
connection.send_config_set(['interface gi0/1', 'description Important Link'])
connection.disconnect()  # 💥 Config not saved!

# ✅ Safe - always save your work!
connection.send_config_set(['interface gi0/1', 'description Important Link'])
connection.save_config()  # 💾 Save to startup-config
print("✅ Configuration saved successfully!")
connection.disconnect()

🔒 Pitfall 3: Hardcoding Credentials

# ❌ Security nightmare - never do this!
device = {
    'username': 'admin',
    'password': 'MyPassword123!'  # 🚨 Exposed credential!
}

# ✅ Secure way - use environment variables or a vault
import os
from getpass import getpass

device = {
    'username': os.environ.get('NETWORK_USER'),
    'password': os.environ.get('NETWORK_PASS') or getpass('Password: ')
}

# 🔐 Even better - use a secrets management system!

🛠️ Best Practices

🎯 Use Context Managers: Always ensure connections are closed
📝 Log Everything: Track all changes and operations
🛡️ Implement Rollback: Have a way to undo changes
🔄 Test in Lab First: Never test automation in production
✨ Use Version Control: Track your automation scripts
🚀 Parallelize Carefully: Don’t overwhelm devices with connections
📊 Monitor Impact: Track CPU/memory during automation
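
The first practice can be sketched with a small wrapper that always disconnects, even when a command raises. The names `managed_connection` and `FakeConnection` are hypothetical; the fake class stands in for a real connection so the example runs without a device. Netmiko connection objects can also be used directly in a with statement, which is usually the simpler choice.

```python
from contextlib import contextmanager


@contextmanager
def managed_connection(connect, **params):
    """Open a connection and guarantee disconnect() runs afterwards."""
    connection = connect(**params)
    try:
        yield connection
    finally:
        connection.disconnect()


class FakeConnection:
    """Stand-in for a device connection, used only for this demo."""

    def __init__(self, host):
        self.host = host
        self.closed = False

    def send_command(self, command):
        return f"{self.host}: output of {command!r}"

    def disconnect(self):
        self.closed = True


with managed_connection(FakeConnection, host="192.168.1.1") as conn:
    print(conn.send_command("show version"))

print(f"Connection closed: {conn.closed}")
```

With a real device, `connect` would be ConnectHandler and the keyword arguments would be the device dictionary.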

🧪 Hands-On Exercise

🎯 Challenge: Build a Network Compliance Checker

Create a tool that checks network devices for compliance:

📋 Requirements:

✅ Check for required NTP servers
🔒 Verify SSH is enabled and Telnet is disabled
📊 Ensure logging is configured
🛡️ Check for banner messages
📈 Generate compliance report

🚀 Bonus Points:

Add automatic remediation for non-compliant items
Create a web dashboard for compliance status
Implement configuration backup before changes

💡 Solution

from netmiko import ConnectHandler
import re
from datetime import datetime
import json

class ComplianceChecker:
    def __init__(self):
        self.compliance_rules = {
            'ntp_servers': ['10.1.1.1', '10.1.1.2'],  # 🕐 Required NTP
            'required_banner': 'Authorized',           # 🚪 Banner keyword
            'syslog_server': '10.1.1.100',            # 📊 Logging host
            'ssh_enabled': True,                       # 🔐 SSH required
            'telnet_disabled': True                    # 🚫 No telnet
        }
        
    def check_device_compliance(self, device_params):
        """🔍 Check device against compliance rules"""
        results = {
            'device': device_params['host'],
            'timestamp': datetime.now().isoformat(),
            'compliant': True,
            'checks': {}
        }
        
        try:
            connection = ConnectHandler(**device_params)
            connection.enable()
            
            # 🕐 Check NTP servers
            ntp_output = connection.send_command('show run | include ntp server')
            configured_ntp = re.findall(r'ntp server (\S+)', ntp_output)
            ntp_compliant = all(ntp in configured_ntp for ntp in self.compliance_rules['ntp_servers'])
            
            results['checks']['ntp'] = {
                'compliant': ntp_compliant,
                'status': '✅' if ntp_compliant else '❌',
                'message': f"NTP servers: {', '.join(configured_ntp)}"
            }
            
            # 🚪 Check banner
            banner_output = connection.send_command('show run | include banner')
            banner_compliant = self.compliance_rules['required_banner'] in banner_output
            
            results['checks']['banner'] = {
                'compliant': banner_compliant,
                'status': '✅' if banner_compliant else '❌',
                'message': 'Login banner configured' if banner_compliant else 'Banner missing!'
            }
            
            # 📊 Check syslog
            syslog_output = connection.send_command('show run | include logging host')
            syslog_compliant = self.compliance_rules['syslog_server'] in syslog_output
            
            results['checks']['syslog'] = {
                'compliant': syslog_compliant,
                'status': '✅' if syslog_compliant else '❌',
                'message': f"Syslog server: {'configured' if syslog_compliant else 'not configured'}"
            }
            
            # 🔐 Check SSH/Telnet
            ssh_output = connection.send_command('show ip ssh')
            ssh_enabled = 'SSH Enabled' in ssh_output
            
            vty_output = connection.send_command('show run | section line vty')
            telnet_disabled = 'transport input ssh' in vty_output or 'transport input none' in vty_output
            
            results['checks']['remote_access'] = {
                'compliant': ssh_enabled and telnet_disabled,
                'status': '✅' if (ssh_enabled and telnet_disabled) else '❌',
                'message': f"SSH: {'enabled' if ssh_enabled else 'disabled'}, Telnet: {'disabled' if telnet_disabled else 'enabled'}"
            }
            
            # 🎯 Overall compliance
            results['compliant'] = all(check['compliant'] for check in results['checks'].values())
            results['compliance_score'] = sum(1 for check in results['checks'].values() if check['compliant']) / len(results['checks']) * 100
            
            # 🔧 Suggest remediation commands (they are not applied automatically)
            if not results['compliant']:
                results['remediation_available'] = True
                results['remediation_commands'] = self._generate_remediation(results['checks'])
            
            connection.disconnect()
            
        except Exception as e:
            results['error'] = str(e)
            results['compliant'] = False
            
        return results
    
    def _generate_remediation(self, checks):
        """🔧 Generate commands to fix non-compliant items"""
        commands = []
        
        if not checks.get('ntp', {}).get('compliant'):
            for ntp in self.compliance_rules['ntp_servers']:
                commands.append(f'ntp server {ntp}')
        
        if not checks.get('banner', {}).get('compliant'):
            commands.append(f'banner motd # {self.compliance_rules["required_banner"]} Use Only! #')
        
        if not checks.get('syslog', {}).get('compliant'):
            commands.append(f'logging host {self.compliance_rules["syslog_server"]}')
        
        if not checks.get('remote_access', {}).get('compliant'):
            commands.extend([
                'ip ssh version 2',
                'line vty 0 15',
                'transport input ssh'
            ])
        
        return commands
    
    def generate_report(self, results):
        """📊 Generate compliance report"""
        print("\n📋 NETWORK COMPLIANCE REPORT")
        print("=" * 60)
        print(f"Generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
        print(f"Devices Checked: {len(results)}")
        
        compliant_count = sum(1 for r in results if r['compliant'])
        print(f"\n📊 Summary:")
        print(f"   ✅ Compliant: {compliant_count}/{len(results)}")
        print(f"   ❌ Non-Compliant: {len(results) - compliant_count}/{len(results)}")
        
        print("\n🔍 Detailed Results:")
        for result in results:
            print(f"\n🏢 Device: {result['device']}")
            if 'error' in result:
                print(f"   ❌ Error: {result['error']}")
            else:
                score_emoji = '💚' if result['compliance_score'] == 100 else '💛' if result['compliance_score'] >= 75 else '❤️'
                print(f"   Score: {score_emoji} {result['compliance_score']:.0f}%")
                for check_name, check_data in result['checks'].items():
                    print(f"   {check_data['status']} {check_name}: {check_data['message']}")

# 🎮 Test it out!
devices = [
    {
        'device_type': 'cisco_ios',
        'host': '192.168.1.1',
        'username': 'admin',
        'password': 'pass123',
        'secret': 'enable123'
    }
]

checker = ComplianceChecker()
results = [checker.check_device_compliance(device) for device in devices]
checker.generate_report(results)

🎓 Key Takeaways

You’ve learned so much! Here’s what you can now do:

✅ Connect to network devices programmatically with Netmiko 💪
✅ Use NAPALM for vendor-agnostic operations 🛡️
✅ Automate configurations across multiple devices 🎯
✅ Build monitoring tools for network health 🐛
✅ Implement compliance checking and remediation 🚀

Remember: Network automation is about making your life easier while improving reliability! Start small, test thoroughly, and gradually expand your automation toolkit. 🤝

🤝 Next Steps

Congratulations! 🎉 You’ve mastered network automation basics!

Here’s what to do next:

💻 Practice with the exercises above on lab devices
🏗️ Build an automation project for your network
📚 Explore advanced topics like NETCONF/RESTCONF
🌟 Share your automation scripts with the community!

Remember: Every network automation expert started with their first script. Keep automating, keep learning, and most importantly, have fun transforming your network operations! 🚀

Happy automating! 🎉🚀✨
