Movatterモバイル変換

Color-coding

From Wikipedia, the free encyclopedia

This article is about a technique in the design of graph algorithms. For the use of color to display information, seecolor code. For other uses, seeColor code (disambiguation).

Incomputer science andgraph theory, the termcolor-coding refers to analgorithmic technique which is useful in the discovery ofnetwork motifs. For example, it can be used to detect asimple path of lengthk in a givengraph. The traditional color-coding algorithm isprobabilistic, but it can bederandomized without much overhead in the running time.

Color-coding also applies to the detection ofcycles of a given length, and more generally it applies to thesubgraph isomorphism problem (anNP-complete problem), where it yieldspolynomial time algorithms when the subgraph pattern that it is trying to detect has boundedtreewidth.

The color-coding method was proposed and analyzed in 1994 byNoga Alon,Raphael Yuster, andUri Zwick.^[1]^[2]

Results

[edit]

The following results can be obtained through the method of color-coding:

For every fixed constantk, if a graphG = (V,E) contains a simple cycle of sizek, then such a cycle can be found in:
- $O(|V|^{\omega })$ expected time, or
- $O(|V|^{\omega }\log |V|)$ worst-case time, whereω is the exponent ofmatrix multiplication.^[3]
For every fixed constantk, and every graphG = (V,E) that is in any nontrivialminor-closed graph family (e.g., aplanar graph), ifG contains a simple cycle of sizek, then such cycle can be found in:
- O(V) expected time, or
- O(V logV) worst-case time.
If a graphG = (V,E) contains a subgraph isomorphic to a boundedtreewidth graph which hasO(logV) vertices, then such a subgraph can be found inpolynomial time.

The method

[edit]

To solve the problem of finding a subgraph $H=(V_{H},E_{H})$ in a given graphG = (V,E), whereH can be a path, a cycle, or any boundedtreewidth graph where $|V_{H}|=O(\log |V|)$ , the method of color-coding begins by randomly coloring each vertex ofG with $k=|V_{H}|$ colors, and then tries to find a colorful copy ofH in coloredG. Here, a graph is colorful if every vertex in it is colored with a distinct color. This method works by repeating (1) random coloring a graph and (2) finding colorful copy of the target subgraph, and eventually the target subgraph can be found if the process is repeated a sufficient number of times.

Suppose a copy ofH inG becomes colorful with some non-zero probabilityp. It immediately follows that if the random coloring is repeated⁠1/p⁠ times, then this copy is expected to become colorful once. Note that thoughp is small, it is shown that if $|V_{H}|=O(\log |V|)$ ,p is only polynomially small. Suppose again there exists an algorithm such that, given a graphG and a coloring which maps each vertex ofG to one of thek colors, it finds a copy of colorfulH, if one exists, within some runtimeO(r). Then the expected time to find a copy ofH inG, if one exists, is $O({\tfrac {r}{p}})$ .

Sometimes it is also desirable to use a more restricted version of colorfulness. For example, in the context of finding cycles inplanar graphs, it is possible to develop an algorithm that finds well-colored cycles. Here, a cycle is well-colored if its vertices are colored by consecutive colors.

Example

[edit]

An example would be finding a simple cycle of lengthk in graphG = (V,E).

By applying random coloring method, each simple cycle has a probability of $k!/k^{k}>e^{-k}$ to become colorful, since there are $k^{k}$ ways of coloring thek vertices on the cycle, among which there are $k! {\displaystyle k!}$ colorful occurrences. Then an algorithm (described next) can be used to find colorful cycles in the randomly colored graphG in time $O(V^{\omega })$ , where $\omega$ is the matrix multiplication constant. Therefore, it takes $e^{k}\cdot O(V^{\omega })$ overall time to find a simple cycle of lengthk inG.

The colorful cycle-finding algorithm works by first finding all pairs of vertices inV that are connected by a simple path of lengthk − 1, and then checking whether the two vertices in each pair are connected. Given a coloring functionc :V → {1, ...,k} to color graphG, enumerate all partitions of the color set{1, ...,k} into two subsetsC₁,C₂ of size $k/2$ each. Note thatV can be divided intoV₁ andV₂ accordingly, and letG₁ andG₂ denote the subgraphs induced byV₁ andV₂ respectively. Then, recursively find colorful paths of length $k/2-1$ in each ofG₁ andG₂. Suppose the boolean matrixA₁ andA₂ represent the connectivity of each pair of vertices inG₁ andG₂ by a colorful path, respectively, and letB be the matrix describing the adjacency relations between vertices ofV₁ and those ofV₂, the boolean product $A_{1}BA_{2}$ gives all pairs of vertices inV that are connected by a colorful path of lengthk − 1. Thus, the recursive relation of matrix multiplications is $t(k)\leq 2^{k}\cdot t(k/2)$ , which yields a runtime of $2^{O(k)}\cdot V^{\omega }$ . Although this algorithm finds only the end points of the colorful path, another algorithm by Alon and Naor^[4] that finds colorful paths themselves can be incorporated into it.

Derandomization

[edit]

Thederandomization of color-coding involves enumerating possible colorings of a graphG, such that the randomness of coloringG is no longer required. For the target subgraphH inG to be discoverable, the enumeration has to include at least one instance where theH is colorful. To achieve this, enumerating ak-perfect familyF of hash functions from{1, ..., |V|} to{1, ...,k} is sufficient. By definition,F isk-perfect if for every subsetS of{1, ..., |V|} where $|S|=k$ , there exists a hash functionh inF such thath :S → {1, ...,k} isperfect. In other words, there must exist a hash function inF that colors any givenk vertices withk distinct colors.

There are several approaches to construct such ak-perfect hash family:

The best explicit construction is byMoni Naor,Leonard J. Schulman, andAravind Srinivasan,^[5] where a family of size $e^{k}k^{O(\log k)}\log |V|$ can be obtained. This construction does not require the target subgraph to exist in the original subgraph finding problem.
Another explicit construction byJeanette P. Schmidt and Alan Siegel^[6] yields a family of size $2^{O(k)}\log ^{2}|V|$ .
Another construction that appears in the original paper ofNoga Alon et al.^[2] can be obtained by first building ak-perfect family that maps{1, ..., |V|} to{1, ...,k²}, followed by building anotherk-perfect family that maps{1, ...,k²} to{1, ...,k}. In the first step, it is possible to construct such a family with2nlogk random bits that are almost2logk-wise independent,^[7]^[8] and the sample space needed for generating those random bits can be as small as $k^{O(1)}\log |V|$ . In the second step, it has been shown by Jeanette P. Schmidt and Alan Siegel^[6] that the size of suchk-perfect family can be $2^{O(k)}$ . Consequently, by composing thek-perfect families from both steps, ak-perfect family of size $2^{O(k)}\log |V|$ that maps from{1, ..., |V|} to{1, ...,k} can be obtained.

In the case of derandomizing well-coloring, where each vertex on the subgraph is colored consecutively, ak-perfect family of hash functions from{1, ..., |V|} to{1, ...,k!} is needed. A sufficientk-perfect family which maps from{1, ..., |V|} to{1, ...,k^k} can be constructed in a way similar to the approach 3 above (the first step). In particular, it is done by usingnklogk random bits that are almostklogk independent, and the size of the resultingk-perfect family will be $k^{O(k)}\log |V|$ .

The derandomization of color-coding method can be easily parallelized, yielding efficientNC algorithms.

Applications

[edit]

Recently, color-coding has attracted much attention in the field of bioinformatics. One example is the detection ofsignaling pathways inprotein-protein interaction (PPI) networks. Another example is to discover and to count the number ofmotifs in PPI networks. Studying bothsignaling pathways andmotifs allows a deeper understanding of the similarities and differences of many biological functions, processes, and structures among organisms.

Due to the huge amount of gene data that can be collected, searching for pathways or motifs can be highly time consuming. However, by exploiting the color-coding method, the motifs or signaling pathways with $k=O(\log n)$ vertices in a networkG withn vertices can be found very efficiently in polynomial time. Thus, this enables us to explore more complex or larger structures in PPI networks.

References

[edit]

^Alon, N., Yuster, R., and Zwick, U. 1994. Color-coding: a new method for finding simple paths, cycles and other small subgraphs within large graphs. In Proceedings of the Twenty-Sixth Annual ACM Symposium on theory of Computing (Montreal, Quebec, Canada, May 23–25, 1994). STOC '94. ACM, New York, NY, 326–335. DOI=http://doi.acm.org/10.1145/195058.195179
^^a ^bAlon, N., Yuster, R., and Zwick, U. 1995. Color-coding. J. ACM 42, 4 (Jul. 1995), 844–856. DOI=http://doi.acm.org/10.1145/210332.210337
^Coppersmith–Winograd Algorithm
^Alon, N. and Naor, M. 1994 Derandomization, Witnesses for Boolean Matrix Multiplication and Construction of Perfect Hash Functions. Technical Report. UMI Order Number: CS94-11., Weizmann Science Press of Israel.
^Naor, M., Schulman, L. J., and Srinivasan, A. 1995. Splitters and near-optimal derandomization. In Proceedings of the 36th Annual Symposium on Foundations of Computer Science (October 23–25, 1995). FOCS. IEEE Computer Society, Washington, DC, 182.
^^a ^bSchmidt, J. P.; Siegel, A. (1990). "The spatial complexity of oblivious k-probe Hash functions".SIAM J. Comput.19 (5):775–786.doi:10.1137/0219054.
^Naor, J. and Naor, M. 1990. Small-bias probability spaces: efficient constructions and applications. In Proceedings of the Twenty-Second Annual ACM Symposium on theory of Computing (Baltimore, Maryland, United States, May 13–17, 1990). H. Ortiz, Ed. STOC '90. ACM, New York, NY, 213-223. DOI=http://doi.acm.org/10.1145/100216.100244
^Alon, N., Goldreich, O., Hastad, J., and Peralta, R. 1990. Simple construction of almost k-wise independent random variables. In Proceedings of the 31st Annual Symposium on Foundations of Computer Science (October 22–24, 1990). SFCS. IEEE Computer Society, Washington, DC, 544-553 vol.2.doi:10.1109/FSCS.1990.89575

Retrieved from "https://en.wikipedia.org/w/index.php?title=Color-coding&oldid=1258011184"

Category:

Graph algorithms

[8]ページ先頭

Movatterモバイル変換

Results

The method

Example

Derandomization

Applications

Further reading

References