NA004 - Autocon3 hallway track
Posted on June 22, 2025 • 7 min read • 1,289 words
The Autocon3 three-part hallway track

Network Automagic EP004 - Autocon3 hallway track
🎙️ LIVE FROM AUTOCON3! 🎙️
Get ready for an exclusive three-part journey through the bustling hallways of Autocon3! We’re bringing you the raw, unfiltered conversations that happen between sessions - the kind of deep dives that only happen when network automation experts get together.
🚀 PART 1: Catch up with Damien Garros from OpsMill for an impromptu hallway chat that dives deep into the evolution of network data management and automation strategy.
⚡ PART 2: Join the heated Terraform discussion with Eduardo Pozo & Christian Drefke as they share battle-tested insights from the trenches of enterprise network automation.
🔥 PART 3: Wrap up with John Howard exploring the cutting-edge world of network telemetry and observability - what’s hot in 2025!
This is conference content at its finest - authentic, technical, and packed with real-world wisdom you won’t find anywhere else.
Episode Guests:
- Damien Garros - OpsMill - Hallway chat (LinkedIn)
- Eduardo Pozo & Christian Drefke - Terraform discussion (Eduardo on LinkedIn, Christian on LinkedIn)
- John Howard - Network telemetry and observability in 2025 (LinkedIn)
Listen to the show anywhere:
- YouTube: @networkautomagic
- Spotify: Network AutoMagic
- Apple Podcasts: Network AutoMagic
- RSS Feed: Anchor.fm
Show notes resources:
What we cover:
Key Topics Discussed:
Damien Garros
Network Data Management & Schema Evolution
- Infrastructure data management platforms vs “network source of truth”
- The evolution and controversy around “source of truth” terminology
- YANG schema development and its 25-year history
- Phil Shafer’s contributions to NETCONF and YANG standards
- Open models vs closed vendor models debate
Schema Design Philosophy
- XML Schema (XSD) influence on YANG development
- Starting with vendor-agnostic, topology-level models
- Transformation layers for device-specific implementations
- The pitfalls of trying to build comprehensive schemas from the start
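To make the "vendor-agnostic model plus transformation layer" idea concrete, here is a minimal sketch. It is not taken from the episode: the `Vlan` class, the `render` function, and the platform labels are hypothetical names chosen for illustration, and the two output formats only approximate IOS-like and Junos-like syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vlan:
    """Vendor-agnostic, topology-level intent: no device syntax in here."""
    vlan_id: int
    name: str


def render_ios_style(vlan: Vlan) -> list[str]:
    """Hypothetical transformation for an IOS-like CLI."""
    return [f"vlan {vlan.vlan_id}", f" name {vlan.name}"]


def render_junos_style(vlan: Vlan) -> list[str]:
    """Hypothetical transformation for a Junos-like 'set' syntax."""
    return [f"set vlans {vlan.name} vlan-id {vlan.vlan_id}"]


# The transformation layer: one small function per platform.
TRANSFORMS = {
    "ios": render_ios_style,
    "junos": render_junos_style,
}


def render(vlan: Vlan, platform: str) -> list[str]:
    """Turn the same abstract model into device-specific lines."""
    try:
        transform = TRANSFORMS[platform]
    except KeyError:
        raise ValueError(f"no transformation registered for {platform!r}") from None
    return transform(vlan)
```

The model stays small and only grows when a real use case needs a new field, which is one way to avoid building a comprehensive schema up front. Supporting another platform means adding one function to `TRANSFORMS`, not changing the model.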
Automation Strategy & Implementation
- Starting small and scaling automation projects incrementally
- Workflow analysis and time mapping for teams
- Identifying quick wins vs comprehensive overhauls
- The importance of understanding current team processes before automating
Building Trust in Automation Tools
- Why automation tools often go unused by operations teams
- The “black box” problem in network automation
- Making automation predictable and transparent
- Providing visibility into automation processes and stages
Metrics & Validation
- Establishing baseline metrics before implementing automation
- Demonstrating value through before/after comparisons
- The challenge of selling automation benefits to organizations
- Management buy-in and adoption strategies
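A before/after comparison can be as simple as the sketch below. It assumes the team has recorded how long a task took (in minutes) before and after automation. The function name and the choice of the median are illustrative assumptions, not something prescribed in the conversation.

```python
from statistics import median


def improvement(baseline_minutes, automated_minutes):
    """Compare typical task duration before and after automation."""
    before = median(baseline_minutes)
    after = median(automated_minutes)
    saved_pct = round(100 * (before - after) / before, 1)
    return {"before": before, "after": after, "saved_pct": saved_pct}
```

The comparison only works if the baseline samples are collected before the automation goes live, which is why the baseline comes first in the list above.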
Practical Considerations
- Balancing feature completeness with usability
- Service-centric vs device-centric modeling approaches
- The reality of budget constraints and gradual implementation
- Learning from automation failures and building better tools
Eduardo & Christian
Topics Discussed: Network Automation & Terraform Expert Panel
Participants
- steinzi (Host)
- Eduardo Pozo (Terraform Expert, Healthcare Environment)
- Christian Drefke (Network Automation Expert, Enterprise Integrator)
1. Terraform Provider Evolution & Challenges
Initial Provider Pain Points
- 3+ years ago: Providers were “half-baked” with ~50% of resources non-functional
- Many resources could create but not modify or delete configurations
- Frequent crashes and apparent lack of vendor testing
- Vendors prioritized development based on client demand and revenue
Solution Strategy
- Direct vendor engagement and collaboration
- Opening issues and contributing fixes
- Community-driven development approach
- Beta testing partnerships with vendors
Current State Improvements
- Palo Alto: New provider version with significant improvements
- Cisco Catalyst Center: Updated provider with better functionality
- Overall ecosystem maturity has dramatically improved
2. State File Management & Best Practices
Key Challenges Discussed
- Understanding state file mechanics
- Difference between refresh, apply -refresh, and regular refresh operations
- State file security and validation
- Managing distributed state across multiple environments
Recommended Approaches
- Split State Files: Essential for scaling (300-400 pipeline runs/day mentioned)
- Single Repository Strategy: Avoid repository sprawl per state file
- Tools: Terra* (Terramate) for scaling without vendor lock-in
- Dev Containers: Consistent development environments across teams
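One reason split state files scale is that a pipeline only has to run for the states a change actually touches. The sketch below illustrates that idea under a hypothetical single-repository layout in which each directory under `stacks/` owns its own state file. It is not how Terramate or any other tool implements change detection, and it ignores changes to shared modules.

```python
from pathlib import PurePosixPath


def affected_stacks(changed_files, stacks_root="stacks"):
    """Return the stack directories touched by a list of changed file paths.

    Assumes a layout like stacks/<stack-name>/<files>, one state per stack.
    """
    stacks = set()
    for changed in changed_files:
        parts = PurePosixPath(changed).parts
        # Only files that live inside a stack directory count.
        if len(parts) >= 3 and parts[0] == stacks_root:
            stacks.add(parts[1])
    return sorted(stacks)
```

Fed with the file list of a merge request, the result tells the pipeline which stacks to plan, so a change to one site does not trigger runs for every other state in the repository.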
3. Data Validation & Security
Multi-Layer Validation Strategy
Eduardo’s Approach:
- Syntax Validation: Data correctness checks
- Semantic Validation: Business logic (e.g., BGP neighbor dependencies)
- Critical Resource Protection: Preventing deletion of critical VLANs
- Custom Python Validation: Instead of Terraform Sentinel
Christian’s Approach:
- Strict Input Validation: Fail-fast methodology
- External Data Files: Never embed data directly in HCL
- Secured Data Repository: Restricted access with Git monitoring
- GitLab Security Controls: Repository access management
Fail-Fast Philosophy
“If we let the wrong data come into Terraform, maybe Terraform will go through 1000 objects before seeing that the data is wrong… you already wasted like ten minutes of your time” - Eduardo
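The layered, fail-fast validation described above could look roughly like the following sketch, run before Terraform is ever invoked. The input structure (`vlans`, `bgp_neighbors`, `source_vlan`), the protected VLAN IDs, and the specific rules are all hypothetical. The episode only names the layers, not their implementation.

```python
import ipaddress

# Hypothetical set of VLAN IDs that must never be deleted.
PROTECTED_VLANS = {1, 100}


def check_syntax(proposed):
    """Layer 1: is each value well-formed on its own?"""
    errors = []
    for vlan in proposed.get("vlans", []):
        vid = vlan.get("id")
        if not isinstance(vid, int) or isinstance(vid, bool) or not 1 <= vid <= 4094:
            errors.append(f"vlan id out of range: {vid!r}")
    for nbr in proposed.get("bgp_neighbors", []):
        try:
            ipaddress.ip_address(str(nbr.get("ip", "")))
        except ValueError:
            errors.append(f"invalid neighbor ip: {nbr.get('ip')!r}")
    return errors


def check_semantics(proposed):
    """Layer 2: do the values make sense together? (illustrative rule)"""
    defined = {v.get("id") for v in proposed.get("vlans", [])}
    return [
        f"neighbor {n.get('ip')} references undefined vlan {n.get('source_vlan')}"
        for n in proposed.get("bgp_neighbors", [])
        if n.get("source_vlan") not in defined
    ]


def check_protected(current, proposed):
    """Layer 3: would this change delete a protected VLAN?"""
    before = {v.get("id") for v in current.get("vlans", [])}
    after = {v.get("id") for v in proposed.get("vlans", [])}
    removed = (before - after) & PROTECTED_VLANS
    return [f"protected vlan {vid} would be deleted" for vid in sorted(removed)]


def validate(current, proposed):
    """Run the layers in order and stop at the first one that fails."""
    layers = (
        lambda: check_syntax(proposed),
        lambda: check_semantics(proposed),
        lambda: check_protected(current, proposed),
    )
    for layer in layers:
        errors = layer()
        if errors:
            return errors
    return []
```

A pipeline job would call `validate` on the data files and exit non-zero if the returned list is not empty, so bad input is rejected in seconds instead of partway through a long plan.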
4. Source of Truth Evolution
Current Challenges with NetBox
- Limited modeling capabilities for specific network values
- Need for distributed source of truth solutions
Emerging Solutions
Eduardo’s Direction:
- Moving to NetBox-based distributed source of truth
- Custom plugins and models
- Northbound/Southbound API integration
- Custom work runners for pipeline triggers
Christian’s Approach:
- Evaluating Infrahub solution
- Workflow-driven data feeding
- Integration beyond networking (Kubernetes, OpenShift, VMware)
- Reliable data sourcing with audit history
5. Development Best Practices
Module Design Philosophy
- Start Small: Begin with simple modules (e.g., single VLAN creation)
- Building Block Approach: Lego-style incremental development
- Always Use Modules: Even for non-reusable code (organization benefits)
- Versioning Strategy: Module registry with proper version control
Development Environment
- Dev Containers: Consistent environments across team members
- Same Tool Versions: Terraform, Python packages, plugins
- Easy Handoffs: Seamless collaboration and vacation coverage
6. Emergency Change Management
Brownfield Challenges
- Manual emergency changes in production
- Maintaining state file accuracy
- Audit trail requirements in regulated environments
Proposed Solutions
- Audit Log Integration: Pulling manual changes from system logs
- Workflow Automation: Converting manual changes to approved workflows
- TACACS/RADIUS Integration: Triggering Terraform refreshes on manual logins
- Slack Bot Integration: Real-time change notifications and approvals
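Whatever triggers the check (an audit log entry, a TACACS/RADIUS login event, a chat command), the core step is comparing what the state file recorded with what the device reports after a manual change. The sketch below shows only that comparison, using flat dictionaries with made-up keys. In real Terraform, `terraform plan -refresh-only` reports this kind of drift. The function and data shapes here are assumptions for illustration.

```python
def detect_drift(recorded, observed):
    """Compare recorded state with observed device state.

    Both arguments are flat dicts of setting name -> value.
    A value of None in the result means the key was absent on that side.
    """
    drift = {}
    for key in sorted(recorded.keys() | observed.keys()):
        if recorded.get(key) != observed.get(key):
            drift[key] = {
                "recorded": recorded.get(key),
                "observed": observed.get(key),
            }
    return drift
```

An empty result means the emergency change left nothing to reconcile. A non-empty one is what a notification or approval workflow would show to the team before the state is updated.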
7. AI/ML Integration Considerations
Current Usage
- GitHub Copilot: Security scanning in pull requests
- Basic AI Tools: Limited use due to data sensitivity
Future Potential
- Regulatory Environments: Government approval required for healthcare/sensitive data
- Troubleshooting Applications: AI-assisted problem diagnosis
- Security Analysis: Automated compliance checking
- Configuration Review: AI as “second pair of eyes”
MCP (Model Context Protocol) Considerations
- Limited input file integration
- Chatbot-driven state file edits
- Zero-conflict automated changes
- Current limitations in sensitive environments
8. Vendor-Specific Insights
Cisco Ecosystem
Catalyst Center Provider:
- Significant improvements over past 2 years
- Better API foundation compared to Meraki
- Community collaboration success story
ACI Provider:
- Consistently reliable from day one
- Minimal issues with resource functionality
- Active migration to new Terraform SDK
Provider Quality Assessment
- Need for star rating system for providers
- Evaluation criteria: API foundation, issue resolution, community support
- Importance of underlying technology (REST API vs gRPC)
9. Team Structure & Skills
The “Unicorn” Debate
Controversial Opinion: The DevOps unicorn (network engineer + developer) isn’t scalable for sophisticated projects.
Recommended Team Structure
- Senior Network Engineers: Product knowledge and design expertise
- Dedicated Developers: Software engineering principles and SOLID architecture
- Collaborative Approach: Bridge-building between departments
- Specialized Skills: High-level expertise in respective domains
Reality Check
“A network engineer can always begin to code, but I feel like we are never going to be as good as a software developer that studied for that” - Eduardo
10. Scaling Considerations
Production Metrics
- Pipeline Frequency: 300-400 runs per day
- Multi-Environment: One repository per location/hospital
- Closed Environment: Local provider/module caching
- Container Strategy: Version-controlled runner environments
Architecture Decisions
- Tool Selection: Terraform vs Pulumi considerations
- State Management: Local vs remote state strategies
- CI/CD Integration: GitLab CI pipelines and orchestration
- Security: NDA-compliant implementations in healthcare
Key Takeaways
- Provider Maturity: Significant improvements in the last 2-3 years through vendor collaboration
- Start Small: Incremental module development approach
- Team Composition: Balance of network expertise and software development skills
- Validation is Critical: Multi-layer validation prevents costly pipeline failures
- State Management: Proper splitting and security essential for scale
- Community Engagement: Direct vendor collaboration accelerates provider improvement
- Future-Ready: AI integration coming but with regulatory considerations