In theoretical computer science, the subgraph isomorphism problem is a computational task in which two graphs G and H are given as input, and one must determine whether G contains a subgraph that is isomorphic to H. Subgraph isomorphism is a generalization of both the maximum clique problem and the problem of testing whether a graph contains a Hamiltonian cycle, and is therefore NP-complete.[1] However, certain other cases of subgraph isomorphism may be solved in polynomial time.[2]
Sometimes the name subgraph matching is also used for the same problem. This name puts emphasis on finding such a subgraph as opposed to the bare decision problem.
To prove subgraph isomorphism is NP-complete, it must be formulated as a decision problem. The input to the decision problem is a pair of graphs G and H. The answer to the problem is positive if H is isomorphic to a subgraph of G, and negative otherwise.
Formal question:
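Let $G=(V,E)$ and $H=(V',E')$ be graphs. Is there a subgraph $G_0=(V_0,E_0)$, with $V_0\subseteq V$ and $E_0\subseteq E\cap(V_0\times V_0)$, such that $G_0\cong H$? That is, does there exist a bijection $f\colon V_0\to V'$ such that $\{v_1,v_2\}\in E_0\iff\{f(v_1),f(v_2)\}\in E'$?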
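The formal question above can be checked directly, if very inefficiently, by trying every injective assignment of the vertices of H to vertices of G. The following is a minimal sketch, assuming undirected simple graphs given as vertex and edge lists; the function name and the input representation are illustrative choices, not part of the problem statement. It tests for a subgraph that need not be induced, so G may have extra edges among the chosen vertices.

```python
from itertools import permutations

def has_subgraph_isomorphic_to(g_vertices, g_edges, h_vertices, h_edges):
    """Return True if G contains a (not necessarily induced) subgraph isomorphic to H."""
    g_edge_set = {frozenset(e) for e in g_edges}
    h_vertices = list(h_vertices)
    # Each permutation is one injective map f from V(H) into V(G).
    for image in permutations(g_vertices, len(h_vertices)):
        f = dict(zip(h_vertices, image))
        # f is a valid embedding if every edge of H lands on an edge of G.
        if all(frozenset((f[u], f[v])) in g_edge_set for u, v in h_edges):
            return True
    return False
```

The running time grows like the number of injective maps, which is exponential in the size of H in general, so this is only suitable for very small inputs.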
The proof that subgraph isomorphism is NP-complete is simple and based on a reduction from the clique problem, an NP-complete decision problem in which the input is a single graph G and a number k, and the question is whether G contains a complete subgraph with k vertices. To translate this to a subgraph isomorphism problem, simply let H be the complete graph $K_k$; then the answer to the subgraph isomorphism problem for G and H is equal to the answer to the clique problem for G and k. Since the clique problem is NP-complete, this polynomial-time many-one reduction shows that subgraph isomorphism is also NP-complete.[3]
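The clique reduction can be sketched in a few lines. In the example below, `contains_subgraph` is a hypothetical brute-force stand-in for any subgraph isomorphism solver, and `has_clique` is an illustrative name; only the construction of H as the complete graph on k vertices comes from the reduction itself.

```python
from itertools import combinations, permutations

def contains_subgraph(g_vertices, g_edges, h_vertices, h_edges):
    """Brute-force subgraph isomorphism test (exponential time, illustration only)."""
    g_edge_set = {frozenset(e) for e in g_edges}
    h_vertices = list(h_vertices)
    for image in permutations(g_vertices, len(h_vertices)):
        f = dict(zip(h_vertices, image))
        if all(frozenset((f[u], f[v])) in g_edge_set for u, v in h_edges):
            return True
    return False

def has_clique(g_vertices, g_edges, k):
    """Decide the clique problem by asking whether G contains a copy of K_k."""
    h_vertices = list(range(k))
    h_edges = list(combinations(h_vertices, 2))  # every pair is adjacent in K_k
    return contains_subgraph(g_vertices, g_edges, h_vertices, h_edges)
```

Building the complete graph on k vertices takes time quadratic in k, which is what makes the translation a polynomial-time reduction regardless of how the solver itself behaves.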
An alternative reduction from the Hamiltonian cycle problem translates a graph G which is to be tested for Hamiltonicity into the pair of graphs G and H, where H is a cycle having the same number of vertices as G. Because the Hamiltonian cycle problem is NP-complete even for planar graphs, this shows that subgraph isomorphism remains NP-complete even in the planar case.[4]
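The Hamiltonian cycle reduction has the same shape: only the pattern graph changes. The sketch below assumes simple undirected graphs, so it treats graphs with fewer than three vertices as non-Hamiltonian; `contains_subgraph` is again a hypothetical brute-force solver and the names are illustrative.

```python
from itertools import permutations

def contains_subgraph(g_vertices, g_edges, h_vertices, h_edges):
    """Brute-force subgraph isomorphism test (exponential time, illustration only)."""
    g_edge_set = {frozenset(e) for e in g_edges}
    h_vertices = list(h_vertices)
    for image in permutations(g_vertices, len(h_vertices)):
        f = dict(zip(h_vertices, image))
        if all(frozenset((f[u], f[v])) in g_edge_set for u, v in h_edges):
            return True
    return False

def has_hamiltonian_cycle(g_vertices, g_edges):
    """Decide Hamiltonicity by asking whether G contains a cycle on all its vertices."""
    n = len(g_vertices)
    if n < 3:  # assumption: a simple graph needs at least 3 vertices for a cycle
        return False
    h_vertices = list(range(n))
    h_edges = [(i, (i + 1) % n) for i in range(n)]  # the cycle graph C_n
    return contains_subgraph(g_vertices, g_edges, h_vertices, h_edges)
```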
Subgraph isomorphism is a generalization of the graph isomorphism problem, which asks whether G is isomorphic to H: the answer to the graph isomorphism problem is true if and only if G and H both have the same numbers of vertices and edges and the subgraph isomorphism problem for G and H is true. However, the complexity-theoretic status of graph isomorphism remains an open question.
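The relationship to graph isomorphism can be written out as a short check. This is a sketch under the assumption that both graphs are simple and undirected; `contains_subgraph` is a hypothetical brute-force solver and `are_isomorphic` is an illustrative name.

```python
from itertools import permutations

def contains_subgraph(g_vertices, g_edges, h_vertices, h_edges):
    """Brute-force subgraph isomorphism test (exponential time, illustration only)."""
    g_edge_set = {frozenset(e) for e in g_edges}
    h_vertices = list(h_vertices)
    for image in permutations(g_vertices, len(h_vertices)):
        f = dict(zip(h_vertices, image))
        if all(frozenset((f[u], f[v])) in g_edge_set for u, v in h_edges):
            return True
    return False

def are_isomorphic(g_vertices, g_edges, h_vertices, h_edges):
    """G and H are isomorphic iff their sizes agree and H embeds into G."""
    same_vertex_count = len(set(g_vertices)) == len(set(h_vertices))
    same_edge_count = (len({frozenset(e) for e in g_edges})
                       == len({frozenset(e) for e in h_edges}))
    if not (same_vertex_count and same_edge_count):
        return False
    return contains_subgraph(g_vertices, g_edges, h_vertices, h_edges)
```

When the counts agree, an embedding of H uses every vertex and every edge of G, so no edge of G is left over and the embedding is an isomorphism.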
In the context of the Aanderaa–Karp–Rosenberg conjecture on the query complexity of monotone graph properties, Gröger (1992) showed that any subgraph isomorphism problem has query complexity $\Omega(n^{3/2})$; that is, solving the subgraph isomorphism problem requires an algorithm to check the presence or absence in the input of $\Omega(n^{3/2})$ different edges in the graph.[5]
Ullmann (1976) describes a recursive backtracking procedure for solving the subgraph isomorphism problem. Although its running time is, in general, exponential, it takes polynomial time for any fixed choice of H (with a polynomial that depends on the choice of H). When G is a planar graph (or more generally a graph of bounded expansion) and H is fixed, the running time of subgraph isomorphism can be reduced to linear time.[2]
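To illustrate the general shape of a recursive backtracking search, the following sketch assigns the vertices of H one at a time and undoes an assignment when it cannot be extended. It is a simplified illustration, not Ullmann's algorithm: the degree-based candidate filter and the vertex ordering are assumptions standing in for the refinement procedure of the 1976 paper, and the adjacency-dictionary input format and function name are illustrative.

```python
def find_embedding(g_adj, h_adj):
    """Return a dict mapping each vertex of H to a vertex of G, or None.

    g_adj and h_adj map each vertex to the set of its neighbours.
    """
    # Assign high-degree pattern vertices first; they are the most constrained.
    order = sorted(h_adj, key=lambda v: len(h_adj[v]), reverse=True)
    # A vertex of G can host v only if it has at least as many neighbours as v.
    candidates = {
        v: [w for w in g_adj if len(g_adj[w]) >= len(h_adj[v])] for v in h_adj
    }
    mapping = {}
    used = set()

    def extend(i):
        if i == len(order):
            return True  # every vertex of H has been placed
        v = order[i]
        for w in candidates[v]:
            if w in used:
                continue
            # Already-placed neighbours of v must be mapped to neighbours of w.
            if all(mapping[u] in g_adj[w] for u in h_adj[v] if u in mapping):
                mapping[v] = w
                used.add(w)
                if extend(i + 1):
                    return True
                del mapping[v]  # backtrack
                used.remove(w)
        return False

    return dict(mapping) if extend(0) else None
```

The recursion depth equals the number of vertices of H and each level tries at most every vertex of G, so for a fixed H with k vertices the search visits on the order of n^k partial assignments, which is polynomial in the size of G.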
Ullmann (2010) is a substantial update to the 1976 subgraph isomorphism algorithm paper.
Cordella (2004) proposed another algorithm based on Ullmann's, VF2, which improves the refinement process using different heuristics and uses significantly less memory.
Bonnici & Giugno (2013) proposed a further algorithm, which improves the initial ordering of the vertices using heuristics.
The current state-of-the-art solver for moderately-sized, hard instances is the Glasgow Subgraph Solver (McCreesh, Prosser & Trimble (2020)).[6] This solver adopts a constraint programming approach, using bit-parallel data structures and specialized propagation algorithms for performance. It supports most common variations of the problem and is capable of counting or enumerating solutions as well as deciding whether one exists.
For large graphs, state-of-the-art algorithms include CFL-Match and Turboiso, and extensions thereof such as DAF by Han et al. (2019).
Subgraph isomorphism has been applied in the area of cheminformatics to find similarities between chemical compounds from their structural formulas; in this area the term substructure search is often used.[7] A query structure is often defined graphically using a structure editor program; SMILES-based database systems typically define queries using SMARTS, a SMILES extension.
The closely related problem of counting the number of isomorphic copies of a graph H in a larger graph G has been applied to pattern discovery in databases,[8] the bioinformatics of protein-protein interaction networks,[9] and in exponential random graph methods for mathematically modeling social networks.[10]
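For very small graphs, the counting variant can be sketched by enumerating all embeddings and correcting for the symmetries of H. The example below assumes simple undirected graphs and counts copies that need not be induced; the function names are illustrative. Each copy of H in G is hit by exactly as many embeddings as H has automorphisms, so dividing by that number gives the count of distinct copies.

```python
from itertools import permutations

def count_embeddings(g_vertices, g_edges, h_vertices, h_edges):
    """Count injective maps from V(H) to V(G) that send edges of H to edges of G."""
    g_edge_set = {frozenset(e) for e in g_edges}
    h_vertices = list(h_vertices)
    count = 0
    for image in permutations(g_vertices, len(h_vertices)):
        f = dict(zip(h_vertices, image))
        if all(frozenset((f[u], f[v])) in g_edge_set for u, v in h_edges):
            count += 1
    return count

def count_copies(g_vertices, g_edges, h_vertices, h_edges):
    """Count distinct subgraphs of G that are isomorphic to H."""
    # Embeddings of H into itself are exactly the automorphisms of H.
    automorphisms = count_embeddings(h_vertices, h_edges, h_vertices, h_edges)
    embeddings = count_embeddings(g_vertices, g_edges, h_vertices, h_edges)
    return embeddings // automorphisms
```

For example, the complete graph on four vertices admits 24 embeddings of a triangle, and a triangle has 6 automorphisms, giving 4 distinct triangles.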
The problem is also of interest in artificial intelligence, where it is considered part of a family of problems on pattern matching in graphs; an extension of subgraph isomorphism known as graph mining is also of interest in that area.[11]
The original Cook (1971) paper that proves the Cook–Levin theorem already showed subgraph isomorphism to be NP-complete, using a reduction from 3-SAT involving cliques.
de la Higuera, Colin; Janodet, Jean-Christophe; Samuel, Émilie; Damiand, Guillaume; Solnon, Christine (2013), "Polynomial algorithms for open plane graph and subgraph isomorphisms" (PDF), Theoretical Computer Science, 498: 76–99, doi:10.1016/j.tcs.2013.05.026, MR 3083515. "It is known since the mid-70's that the isomorphism problem is solvable in polynomial time for plane graphs. However, it has also been noted that the subisomorphism problem is still NP-complete, in particular because the Hamiltonian cycle problem is NP-complete for planar graphs."
Ohlrich, Miles; Ebeling, Carl; Ginting, Eka; Sather, Lisa (1993), "SubGemini: identifying subcircuits using a fast subgraph isomorphism algorithm", Proceedings of the 30th international Design Automation Conference, pp. 31–37, doi:10.1145/157485.164556, ISBN 978-0-89791-577-9, S2CID 5889119.
Jamil, Hasan (2011), "Computing Subgraph Isomorphic Queries using Structural Unification and Minimum Graph Structures", 26th ACM Symposium on Applied Computing, pp. 1058–1063.
Carletti, V.; Foggia, P.; Saggese, A.; Vento, M. (2018), "Challenging the time complexity of exact subgraph isomorphism for huge and dense graphs with VF3", IEEE Transactions on Pattern Analysis and Machine Intelligence, 40 (4): 804–818, doi:10.1109/TPAMI.2017.2696940, PMID 28436848, S2CID 3709576.
McCreesh, Ciaran; Prosser, Patrick; Trimble, James (2020), "The Glasgow Subgraph Solver: Using Constraint Programming to Tackle Hard Subgraph Isomorphism Problem Variants", Graph Transformation - 13th International Conference, ICGT 2020, Held as Part of STAF 2020, Bergen, Norway, June 25-26, 2020, Proceedings, Lecture Notes in Computer Science, vol. 12150, Springer, pp. 316–324, doi:10.1007/978-3-030-51372-6_19, ISBN 978-3-030-51371-9, PMC 7314700.