Well-Architected Framework: Operational excellence pillar
The operational excellence pillar in the Google Cloud Well-Architected Framework provides recommendations to operate workloads efficiently on Google Cloud. Operational excellence in the cloud involves designing, implementing, and managing cloud solutions that provide value, performance, security, and reliability. The recommendations in this pillar help you to continuously improve and adapt workloads to meet the dynamic and ever-evolving needs in the cloud.
The operational excellence pillar is relevant to the following audiences:
- Managers and leaders: A framework to establish and maintain operational excellence in the cloud and to ensure that cloud investments deliver value and support business objectives.
- Cloud operations teams: Guidance to manage incidents and problems, plan capacity, optimize performance, and manage change.
- Site reliability engineers (SREs): Best practices that help you to achieve high levels of service reliability, including monitoring, incident response, and automation.
- Cloud architects and engineers: Operational requirements and best practices for the design and implementation phases, to help ensure that solutions are designed for operational efficiency and scalability.
- DevOps teams: Guidance about automation, CI/CD pipelines, and change management, to help enable faster and more reliable software delivery.
To achieve operational excellence, you should embrace automation, orchestration, and data-driven insights. Automation helps to eliminate toil. It also streamlines and builds guardrails around repetitive tasks. Orchestration helps to coordinate complex processes. Data-driven insights enable evidence-based decision-making. By using these practices, you can optimize cloud operations, reduce costs, improve service availability, and enhance security.
Operational excellence in the cloud goes beyond technical proficiency in cloud operations. It includes a cultural shift that encourages continuous learning and experimentation. Teams must be empowered to innovate, iterate, and adopt a growth mindset. A culture of operational excellence fosters a collaborative environment where individuals are encouraged to share ideas, challenge assumptions, and drive improvement.
For operational excellence principles and recommendations that are specific to AI and ML workloads, see AI and ML perspective: Operational excellence in the Well-Architected Framework.
Core principles
The recommendations in the operational excellence pillar of the Well-Architected Framework are mapped to the following core principles:
- Ensure operational readiness and performance using CloudOps: Ensure that cloud solutions meet operational and performance requirements by defining service level objectives (SLOs) and by performing comprehensive monitoring, performance testing, and capacity planning.
- Manage incidents and problems: Minimize the impact of cloud incidents and prevent recurrence through comprehensive observability, clear incident response procedures, thorough retrospectives, and preventive measures.
- Manage and optimize cloud resources: Optimize and manage cloud resources through strategies like right-sizing and autoscaling, and by using effective cost monitoring tools.
- Automate and manage change: Automate processes, streamline change management, and alleviate the burden of manual labor.
- Continuously improve and innovate: Focus on ongoing enhancements and the introduction of new solutions to stay competitive.
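The first principle above mentions defining service level objectives (SLOs). As a rough illustration only, the following Python sketch shows one common way to turn an availability SLO into an error budget and to check how much of that budget remains. The function names `error_budget` and `budget_remaining`, and the request counts in the usage note, are hypothetical and are not part of the Well-Architected Framework or of any Google Cloud API.

```python
def error_budget(slo_target: float, total_events: int) -> float:
    """Return how many events may fail in a window without breaching the SLO.

    slo_target is a fraction such as 0.999 (99.9% of events should be good).
    This is an illustrative helper, not an API from any monitoring product.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events < 0:
        raise ValueError("total_events must be non-negative")
    return (1.0 - slo_target) * total_events


def budget_remaining(slo_target: float, total_events: int, bad_events: int) -> float:
    """Return the unspent fraction of the error budget.

    1.0 means nothing has been spent; 0.0 means the budget is exhausted;
    a negative value means the SLO has been breached for this window.
    """
    budget = error_budget(slo_target, total_events)
    if budget == 0:
        # No traffic in the window, so no budget could have been spent.
        return 1.0
    return 1.0 - bad_events / budget
```

For example, with an assumed 99.9% target and 1,000,000 requests in the window, `error_budget(0.999, 1_000_000)` is about 1,000 failed requests, and 250 failures would leave roughly three quarters of the budget unspent. A team might use a figure like this to decide whether to slow down risky changes, but the thresholds are a policy choice that this sketch does not prescribe.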
Contributors
Authors:
- Ryan Cox | Principal Architect
- Hadrian Knotz | Enterprise Architect
Other contributors:
- Daniel Lees | Cloud Security Architect
- Filipe Gracio, PhD | Customer Engineer, AI/ML Specialist
- Gary Harmson | Principal Architect
- Jose Andrade | Customer Engineer, SRE Specialist
- Kumar Dhanagopal | Cross-Product Solution Developer
- Nicolas Pintaux | Customer Engineer, Application Modernization Specialist
- Radhika Kanakam | Program Lead, Google Cloud Well-Architected Framework
- Samantha He | Technical Writer
- Zach Seils | Networking Specialist
- Wade Holmes | Global Solutions Director
Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.
Last updated 2024-10-31 UTC.