Visual analytics is a multidisciplinary science and technology field that emerged from information visualization and scientific visualization. It focuses on how analytical reasoning can be facilitated by interactive visual interfaces.[1]
Visual analytics is "the science of analytical reasoning facilitated by interactive visual interfaces."[2] It can address problems whose size, complexity, and need for closely coupled human and machine analysis may make them otherwise intractable.[3] Visual analytics advances scientific and technological development across multiple domains, including analytical reasoning, human–computer interaction, data transformations, visual representation for computation and analysis, analytic reporting, and the transition of new technologies into practice.[4] As a research agenda, visual analytics brings together several scientific and technical communities from computer science, information visualization, cognitive and perceptual sciences, interactive design, graphic design, and social sciences.
Visual analytics integrates new computational and theory-based tools with innovative interactive techniques and visual representations to enable human-information discourse. The design of the tools and techniques is based on cognitive, design, and perceptual principles. This science of analytical reasoning provides the reasoning framework upon which one can build both strategic and tactical visual analytics technologies for threat analysis, prevention, and response. Analytical reasoning is central to the analyst's task of applying human judgments to reach conclusions from a combination of evidence and assumptions.[2]
Visual analytics has some overlapping goals and techniques with information visualization and scientific visualization. There is currently no clear consensus on the boundaries between these fields, but broadly speaking the three areas can be distinguished as follows:
Visual analytics seeks to marry techniques from information visualization with techniques from computational transformation and analysis of data. Information visualization forms part of the direct interface between user and machine, amplifying human cognitive capabilities in six basic ways:[2][5]
These capabilities of information visualization, combined with computational data analysis, can be applied to analytic reasoning to support the sense-making process.
As an interdisciplinary approach, visual analytics has its roots in information visualization, cognitive sciences, and computer science. The term and scope of the field were defined in the early 2000s by researchers such as Jim Thomas, Kristin A. Cook, John Stasko, Pak Chung Wong, Daniel A. Keim and David S. Ebert. In reaction to the September 11, 2001 attacks, the United States Department of Homeland Security was established in late 2002, combining dozens of previously separate government agencies. Building upon earlier work on visual data mining by Daniel A. Keim starting in the late 1990s, this led to the development of a research agenda for visual analytics.[6][7] As part of these efforts the National Visualization and Analytics Center (NVAC) at Pacific Northwest National Laboratory was established in 2004, with a charter to develop systems that mitigate information overload in the intelligence community after the September 11, 2001 attacks. Its research work identified core challenges, posed open research questions, and positioned visual analytics as a new research domain, in particular through the 2005 research agenda Illuminating the Path.[2] In 2006, the IEEE VIS community, led by Pak Chung Wong and Daniel A. Keim, launched the annual IEEE Conference on Visual Analytics Science and Technology (VAST), providing a dedicated venue for research into visual analytics; in 2020 VAST merged to form the IEEE Visualization conference. In 2008, the scope and challenges of visual analytics were conceptually defined by Daniel A. Keim and Jim Thomas in their influential book on visual data mining.[8] The domain was further refined as part of the European Commission's FP7 VisMaster program in the late 2000s.[9]
Visual analytics is a multidisciplinary field that includes the following focus areas:[2]
Analytical reasoning techniques are the method by which users obtain deep insights that directly support situation assessment, planning, and decision making. Visual analytics must facilitate high-quality human judgment with a limited investment of the analysts’ time. Visual analytics tools must enable diverse analytical tasks such as:[2]
These tasks will be conducted through a combination of individual and collaborative analysis, often under extreme time pressure. Visual analytics must enable hypothesis-based and scenario-based analytical techniques, providing support for the analyst to reason based on the available evidence.[2]
Data representations are structured forms suitable for computer-based transformations. These structures must exist in the original data or be derivable from the data themselves. They must retain the information and knowledge content and the related context within the original data to the greatest degree possible. The structures of underlying data representations are generally neither accessible nor intuitive to the user of the visual analytics tool. They are frequently more complex in nature than the original data and are not necessarily smaller in size than the original data. The structures of the data representations may contain hundreds or thousands of dimensions and be unintelligible to a person, but they must be transformable into lower-dimensional representations for visualization and analysis.[2]
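The transformation of such high-dimensional data representations into lower-dimensional forms can be sketched, for example, with principal component analysis. The snippet below is a minimal illustration, assuming NumPy and scikit-learn are available; the data set and its dimensionality are hypothetical, and PCA is only one of many possible reduction techniques, not one prescribed by the sources.

```python
# A minimal sketch: projecting a high-dimensional data representation
# down to two dimensions so that it becomes intelligible in a visualization.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Hypothetical data representation: 1,000 records with 300 derived dimensions.
X = rng.normal(size=(1000, 300))

# Reduce to a 2-D representation suitable for plotting and visual analysis.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)            # shape (1000, 2)
print(pca.explained_variance_ratio_)   # how much structure the 2-D view keeps
```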
Theories of visualization include:[3]
Visual representations translate data into a visible form that highlights important features, including commonalities and anomalies. These visual representations make it easy for users to perceive salient aspects of their data quickly. Augmenting the cognitive reasoning process with perceptual reasoning through visual representations permits the analytical reasoning process to become faster and more focused.[2]
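As a simple illustration of a visual representation that makes anomalies salient, the following sketch (assuming NumPy and Matplotlib; the data and the three-sigma threshold are hypothetical choices, not part of the cited agenda) plots a time series and highlights points that deviate strongly from the overall pattern.

```python
# A minimal sketch: highlight anomalies so they can be perceived at a glance.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
values = rng.normal(loc=50.0, scale=5.0, size=365)  # hypothetical daily metric
values[[40, 200, 310]] += 40                         # inject a few anomalies

# Flag points more than three standard deviations from the mean.
z = (values - values.mean()) / values.std()
is_anomaly = np.abs(z) > 3

plt.plot(values, color="lightgray", linewidth=1)
plt.scatter(np.where(is_anomaly)[0], values[is_anomaly],
            color="red", zorder=3, label="anomaly")
plt.legend()
plt.title("Salient anomalies highlighted against the overall pattern")
plt.show()
```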
The input for the visual analytics process consists of heterogeneous data sources (e.g., the internet, newspapers, books, scientific experiments, expert systems). From these rich sources, the data sets S = S1, ..., Sm are chosen, where each Si, i ∈ {1, ..., m}, consists of attributes Ai1, ..., Aik. The goal or output of the process is insight I. Insight is either obtained directly from the set of created visualizations V or through confirmation of hypotheses H as the results of automated analysis methods. This formalization of the visual analytics process is illustrated in the following figure. Arrows represent the transitions from one set to another.
More formally, the visual analytics process is a transformation F : S → I, where F is a concatenation of functions f ∈ {DW, VX, HY, UZ} defined as follows:
DW describes the basic data pre-processing functionality, with DW : S → S and W ∈ {T, C, SL, I}, including the data transformation functions DT, data cleaning functions DC, data selection functions DSL, and data integration functions DI that are needed to make analysis functions applicable to the data set.
VX, X ∈ {S, H}, symbolizes the visualization functions, which either visualize data, VS : S → V, or visualize hypotheses, VH : H → V.
HY, Y ∈ {S, V}, represents the hypothesis generation process. We distinguish between functions that generate hypotheses from data, HS : S → H, and functions that generate hypotheses from visualizations, HV : V → H.
Moreover, user interactions UZ, Z ∈ {V, H, CV, CH}, are an integral part of the visual analytics process. User interactions can either affect only visualizations, UV : V → V (e.g., selecting or zooming), or affect only hypotheses, UH : H → H, by generating new hypotheses from given ones. Furthermore, insight can be concluded from visualizations, UCV : V → I, or from hypotheses, UCH : H → I.
The typical data pre-processing, applying data cleaning, data integration, and data transformation functions, is defined as DP = DT(DI(DC(S1, ..., Sm))). After the pre-processing step, either automated analysis methods HS = {fs1, ..., fsq} (e.g., statistics, data mining) or visualization methods VS : S → V, VS = {fv1, ..., fvs}, are applied to the data in order to reveal patterns, as shown in the figure above.[11]
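The composition of these functions can be sketched in code. The following is an illustrative, non-normative example, assuming pandas is available; the function and variable names mirror the notation above but are otherwise hypothetical. It chains the pre-processing DP = DT(DI(DC(S1, ..., Sm))), then applies a simple automated analysis step HS and a visualization step VS.

```python
# A schematic sketch of the formalization above (illustrative names only).
from typing import List
import pandas as pd

def D_C(sources: List[pd.DataFrame]) -> List[pd.DataFrame]:
    """Data cleaning: drop duplicates and completely empty rows in every source."""
    return [s.drop_duplicates().dropna(how="all") for s in sources]

def D_I(sources: List[pd.DataFrame]) -> pd.DataFrame:
    """Data integration: combine the heterogeneous sources into one table."""
    return pd.concat(sources, ignore_index=True)

def D_T(data: pd.DataFrame) -> pd.DataFrame:
    """Data transformation: derive attributes usable by the analysis functions."""
    numeric = data.select_dtypes("number")
    return (numeric - numeric.mean()) / numeric.std()   # z-score scaling

def H_S(data: pd.DataFrame) -> dict:
    """Automated analysis generating a (trivial) hypothesis from the data."""
    return {"most_variable_attribute": data.var().idxmax()}

def V_S(data: pd.DataFrame):
    """Visualization of the pre-processed data (returns a matplotlib Axes)."""
    return data.plot(kind="box")

# D_P = D_T(D_I(D_C(S1, ..., Sm))) with two hypothetical sources
sources = [pd.DataFrame({"a": [1, 2, 2], "b": [4.0, None, 6.0]}),
           pd.DataFrame({"a": [3, 5], "b": [7.0, 8.0]})]
prepared = D_T(D_I(D_C(sources)))
hypothesis = H_S(prepared)   # hypothesis generation from data
axes = V_S(prepared)         # visualization of the data
```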
In general, the following paradigm is used to process the data:
Analyse First – Show the Important – Zoom, Filter and Analyse Further – Details on Demand[8]
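One illustrative reading of this paradigm in code is given below; it is a hypothetical sketch assuming NumPy and pandas, not a prescribed implementation, and the "relevance score" stands in for whatever automated analysis the first step applies.

```python
# A schematic walk through the paradigm: analyse first, show the important,
# zoom/filter and analyse further, details on demand. All names are illustrative.
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
events = pd.DataFrame({
    "id": range(10_000),
    "score": rng.exponential(scale=1.0, size=10_000),  # hypothetical relevance
})

# 1. Analyse first: rank all events by an automatically computed score.
ranked = events.sort_values("score", ascending=False)

# 2. Show the important: present an overview of only the top results.
overview = ranked.head(100)

# 3. Zoom, filter and analyse further: narrow the overview interactively.
focus = overview[overview["score"] > overview["score"].quantile(0.9)]

# 4. Details on demand: retrieve the full record for a selected item.
selected_id = int(focus.iloc[0]["id"])
details = events.loc[events["id"] == selected_id]
print(details)
```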