Course Outline
Introduction to AIOps
Origins and evolution of AIOps
The importance of AIOps in modern IT
AIOps vs. IT Operations Analytics – key differences
Core technologies and concepts
AIOps system lifecycle
Related practices and methodologies
AIOps in the Organizational Context
Key drivers and influencing factors
Integration with DevOps
The role of AIOps in Site Reliability Engineering (SRE)
AIOps and IT security concerns
Data, telemetry, and system complexity
A new paradigm for understanding system health
Core Technologies – Data
What is Big Data?
The 5 Vs of Big Data
Characteristics of Big Data in AIOps
Data sources and types in AIOps environments
Data diversity and processing challenges
Core Technologies – Machine Learning (ML)
AI, ML, and their role in AIOps
Supervised vs. unsupervised learning in AIOps
Machine learning vs. traditional analytics
ML models and their application in AIOps
The future of AI in IT operations
Comparing ML with data analytics approaches
AIOps and Operational Metrics
Key operational metrics for IT environments
Important indicators across various systems
SLA, SLO, and KPI – definitions and usage
Incident-related metrics: detection and classification
Time-based metrics: MTTD, MTBF, MTTA, MTTR
Managing service level agreements
Use Cases and Organizational Mindset Shift
From reactive to proactive operations
Characteristics of a reactive IT operations model
Moving from deterministic to probabilistic approaches
Real-world use cases of AIOps
Organizational change driven by AIOps
Understanding the past, predicting the future
Measuring the Impact of AIOps
Key AIOps metrics for IT operations
Synergy between AIOps, DevOps, and SRE
Improving AI accuracy through AIOps
Enhancing system observability
Tracking AIOps impact on operations
Connecting AIOps metrics with DORA indicators
Implementing AIOps in the Organization
Avoiding common pitfalls
Ethics and machine learning in AIOps
Implementation paths and strategies
Data quality and process alignment
Organizational culture and supporting practices
Data regulations and compliance
Handling ML model errors
Privacy and user data protection
Requirements
A basic understanding of IT terminology and hands-on experience working with information technology.
Testimonials
There were many practical exercises, supervised and assisted by the trainer.