
DataOps is an Agile approach to designing, implementing and maintaining a distributed data architecture that will support a wide range of open source tools and frameworks in production. The goal of DataOps is to create business value from big data.
Inspired by the DevOps movement, the DataOps strategy strives to speed the production of applications running on big data processing frameworks. DataOps also seeks to break down silos across IT operations, data management and software development teams, encouraging line-of-business stakeholders to work with data engineers, data scientists and analysts. The goal is to ensure the organization's data can be used in the most flexible, effective manner possible to achieve positive and reliable business outcomes.
Since it incorporates so many elements from the data lifecycle, DataOps spans a number of information technology disciplines, including data development, data transformation, data extraction, data quality, data governance, data access control, data center capacity planning and system operations. DataOps teams are often managed by an organization's chief data scientist or chief analytics officer and supported by data engineers, data analysts, data stewards and others with responsibilities for data.
As with DevOps, there are no DataOps-specific software tools -- only frameworks and related tool sets that support a DataOps approach to collaboration and increased agility. These tools include ETL/ELT tools, data curation and cataloging tools, log analyzers and systems monitors. Software that supports microservices architectures, as well as open source software that lets applications blend structured and unstructured data, is also associated with the DataOps movement. This software can include MapReduce, HDFS, Kafka, Hive and Spark.
The goal of DataOps is to combine DevOps and Agile methodologies to manage data in alignment with business goals. If the goal is to raise the lead conversion rate, for example, DataOps would position data to make better recommendations for marketing products, thus converting more leads. Agile processes are used for data governance and analytics development, while DevOps processes are used to optimize code, product builds and delivery.
Building new code is only one part of DataOps. Streamlining and improving the data warehouse is equally important. Similar to lean manufacturing, DataOps uses statistical process control (SPC) to monitor and verify the data analytics pipeline consistently. SPC checks that statistics remain within feasible ranges, which improves data processing efficiency and raises data quality. If an anomaly or error occurs, SPC helps alert data analysts immediately so they can respond.
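The SPC idea described above can be illustrated with a minimal Python sketch. It assumes the monitored statistic is the row count of each pipeline run and that control limits are set at three standard deviations around a baseline mean. The function names, the sample numbers and the three-sigma choice are illustrative assumptions, not part of any specific DataOps tool.

```python
from statistics import mean, stdev


def control_limits(baseline, sigmas=3.0):
    """Return (lower, upper) control limits from a baseline of past runs."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation
    return center - sigmas * spread, center + sigmas * spread


def out_of_control(values, limits):
    """Return (index, value) pairs that fall outside the control limits."""
    lower, upper = limits
    return [(i, v) for i, v in enumerate(values) if not lower <= v <= upper]


# Hypothetical row counts from earlier, known-good pipeline runs.
baseline_row_counts = [1000, 1020, 980, 1010, 990, 1005, 995]
limits = control_limits(baseline_row_counts)

# Hypothetical row counts from new runs; the second run lost many rows.
new_runs = [1002, 640, 1015]
alerts = out_of_control(new_runs, limits)

for index, value in alerts:
    print(f"Run {index}: row count {value} is outside {limits}")
```

In practice, the same check could be applied to other statistics, such as null rates or processing times, with the alert routed to the analysts responsible for the pipeline.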
The volume of data is expected to continue growing exponentially, making implementation of a DataOps strategy critical. The first step to DataOps involves cleaning raw data and developing an infrastructure that makes it readily available for use, typically in a self-service model. Once data is made accessible, software, platforms and tools should be developed or deployed that orchestrate data and integrate with current systems. These components will then continuously process new data, monitor performance and produce real-time insights.
A few best practices associated with implementing a DataOps strategy include the following:
Transitioning to a DataOps strategy can bring an organization the following benefits:
A DataOps framework needs to harmonize and improve on several key elements and practices.
Cross-functional communication. DataOps starts with the same core paradigm for Agile development practices that support improved collaboration across business, development, quality assurance and operations teams and extends this collaboration to data engineers, data scientists and business analysts.
Agile mindset. It's essential to find ways to break various data processes into small chunks that can be adapted incrementally -- analogous to continuous development and continuous integration pipelines.
Integrated data pipeline. Enterprises need to automate the common handoffs between data processes such as ingestion, ETL/ELT, data quality, metadata management, storage, data preparation, feature engineering and deployment.
Data-driven culture. Enterprises should adopt a long-term and ongoing program for cultivating data literacy across the organization and guiding data users who are finding new ways to incorporate data into different analytics tools.
Continuous feedback. Various teams also need to develop a process to aggregate insights from transforming and vetting data to help data engineering teams prioritize infrastructure improvements.
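As a rough illustration of the integrated data pipeline element above, the following Python sketch chains ingestion, a data quality check and a transformation so that each handoff happens automatically. The stage names, the record layout and the `run_pipeline` helper are hypothetical, and a real deployment would typically rely on an orchestration tool rather than a plain loop.

```python
def ingest(_records):
    """Stand-in for reading raw records from a source system."""
    return [
        {"id": 1, "amount": "10.5"},
        {"id": 2, "amount": None},  # incomplete record
        {"id": 3, "amount": "7"},
    ]


def quality_check(records):
    """Drop records that are missing a required field."""
    return [r for r in records if r["amount"] is not None]


def transform(records):
    """Convert the amount from text to a number."""
    return [{**r, "amount": float(r["amount"])} for r in records]


def run_pipeline(stages, records=None):
    """Pass the output of each stage to the next and log record counts."""
    records = records or []
    for stage in stages:
        records = stage(records)
        print(f"{stage.__name__}: {len(records)} records")
    return records


result = run_pipeline([ingest, quality_check, transform])
```

Logging the record count after each stage also gives teams a simple signal to feed back to data engineering when a handoff starts dropping more data than expected.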
DataOps tools address many capabilities required to ingest, transform, clean, orchestrate and load data. In some cases, these tools emerged to complement the vendor's other tools. In other cases, vendors focus on specific DataOps workflows. Popular DataOps tools include the following:
Several trends are driving the future of DataOps, including integration, augmentation and observability.
Increased integration with other data disciplines. DataOps will increasingly need to interoperate with and support related data management practices. Gartner has identified MLOps, ModelOps and PlatformOps as complementary approaches to manage specific ways of using data. MLOps is geared to machine learning development and versioning, and ModelOps focuses on model engineering, training, experimentation and monitoring. Gartner characterizes PlatformOps as a comprehensive AI orchestration platform that includes DataOps, MLOps, ModelOps and DevOps.
Augmented DataOps. AI is beginning to help manage and orchestrate the data infrastructure itself. Data catalogs are evolving into augmented data catalogs and analytics into augmented analytics infused with AI. Similar techniques will be gradually applied to all other aspects of the DataOps pipeline.
Data observability. For years, the DevOps community has widely used application performance management tools powered by observability infrastructure to help pinpoint and prioritize issues with applications. Vendors like Acceldata, Monte Carlo, Precisely, Soda and Unravel are developing comparable data observability tools focused on the data infrastructure itself. DataOps tools will increasingly consume data observability feeds to help optimize DataOps pipelines through development, integration, partnerships and acquisitions.
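One common signal in data observability is freshness, meaning how recently a data set was last loaded. The Python sketch below shows the general idea under stated assumptions: the table names, the timestamps and the 24-hour threshold are invented for illustration, and the code does not represent the API of any vendor named above.

```python
from datetime import datetime, timedelta, timezone


def freshness_ok(last_loaded_at, now, max_age):
    """Return True if the data was loaded within the allowed age."""
    return now - last_loaded_at <= max_age


# A fixed "current time" keeps the example deterministic.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)

# Hypothetical last-load timestamps, as a monitoring feed might report them.
last_loaded = {
    "orders": now - timedelta(minutes=20),
    "customers": now - timedelta(hours=30),
}

max_age = timedelta(hours=24)
stale = sorted(
    name
    for name, loaded_at in last_loaded.items()
    if not freshness_ok(loaded_at, now, max_age)
)

print("Stale tables:", stale)
```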
