Movatterモバイル変換


[0]ホーム

URL:


Jump to content
WikipediaThe Free Encyclopedia
Search

Set cover problem

From Wikipedia, the free encyclopedia
Classical problem in combinatorics
Example of an instance of set cover problem.
icon
This articleneeds additional citations forverification. Please helpimprove this article byadding citations to reliable sources. Unsourced material may be challenged and removed.
Find sources: "Set cover problem" – news ·newspapers ·books ·scholar ·JSTOR
(September 2025) (Learn how and when to remove this message)

Theset cover problem is a classical question incombinatorics,computer science,operations research, andcomplexity theory.

Given aset of elements{1, 2, …,n}(henceforth referred to as theuniverse, specifying all possible elements under consideration) and a collection, referred to asS, of a givenm subsets whoseunion equals the universe, the set cover problem is to identify a smallest sub-collection ofS whose union equals the universe.

For example, consider the universe,U = {1, 2, 3, 4, 5} and the collection of setsS = { {1, 2, 3}, {2, 4}, {3, 4}, {4, 5} }. In this example,m is equal to 4, as there are four subsets that comprise this collection. The union ofS is equal toU. However, we can cover all elements with only two sets:{ {1, 2, 3}, {4, 5} }‍, see picture, but not with only one set. Therefore, the solution to the set cover problem for thisU andS has size 2.

More formally, given a universeU{\displaystyle {\mathcal {U}}} and a familyS{\displaystyle {\mathcal {S}}} of subsets ofU{\displaystyle {\mathcal {U}}}, aset cover is a subfamilyCS{\displaystyle {\mathcal {C}}\subseteq {\mathcal {S}}} of sets whose union isU{\displaystyle {\mathcal {U}}}.

The decision version of set covering isNP-complete. It is one ofKarp's 21 NP-complete problems shown to beNP-complete in 1972. The optimization/search version of set cover isNP-hard.[1] It is a problem "whose study has led to the development of fundamental techniques for the entire field" ofapproximation algorithms.[2]

Variants

[edit]

In theweighted set cover problem, each set is assigned a positive weight (representing its cost), and the goal is to find a set cover with a smallest weight. The usual (unweighted) set cover corresponds to all sets having a weight of 1.

In thefractional set cover problem, it is allowed to select fractions of sets, rather than entire sets. A fractional set cover is an assignment of a fraction (a number in [0,1]) to each set inS{\displaystyle {\mathcal {S}}}, such that for each elementx in the universe, the sum of fractions of sets that containx is at least 1. The goal is to find a fractional set cover in which the sum of fractions is as small as possible. Note that a (usual) set cover is equivalent to a fractional set cover in which all fractions are either 0 or 1; therefore, the size of the smallest fractional cover is at most the size of the smallest cover, but may be smaller. For example, consider the universeU = {1, 2, 3} and the collection of setsS = { {1, 2}, {2, 3}, {3, 1} }. The smallest set cover has a size of 2, e.g.{ {1, 2}, {2, 3} }. But there is a fractional set cover of size 1.5, in which a 0.5 fraction of each set is taken.

Covering/packing-problem pairs
Covering problemsPacking problems
Minimum set coverMaximum set packing
Minimum edge coverMaximum matching
Minimum vertex coverMaximum independent set
Bin coveringBin packing
Polygon coveringRectangle packing

Linear program formulation

[edit]

Theset cover problem can be formulated as the followinginteger linear program (ILP).[3]

minimizesSxs{\displaystyle \sum _{s\in {\mathcal {S}}}x_{s}}(minimize the number of sets)
subject tos:esxs1{\displaystyle \sum _{s\colon e\in s}x_{s}\geqslant 1}for alleU{\displaystyle e\in {\mathcal {U}}}(cover every element of the universe)
xs{0,1}{\displaystyle x_{s}\in \{0,1\}}for allsS{\displaystyle s\in {\mathcal {S}}}.(every set is either in the set cover or not)

For a more compact representation of the covering constraint, one can define anincidence matrixA{\displaystyle A}, where each row corresponds to an element and each column corresponds to a set, andAe,s=1{\displaystyle A_{e,s}=1} if element e is in set s, andAe,s=0{\displaystyle A_{e,s}=0} otherwise. Then, the covering constraint can be written asAx1{\displaystyle Ax\geqslant 1}.

Weighted set cover is described by a program identical to the one given above, except that the objective function to minimize issSwsxs{\displaystyle \sum _{s\in {\mathcal {S}}}w_{s}x_{s}}, wherews{\displaystyle w_{s}} is the weight of setsS{\displaystyle s\in {\mathcal {S}}}.

Fractional set cover is described by a program identical to the one given above, except thatxs{\displaystyle x_{s}} can be non-integer, so the last constraint is replaced by0xs1{\displaystyle 0\leq x_{s}\leq 1}.

This linear program belongs to the more general class of LPs forcovering problems, as all the coefficients in the objective function and both sides of the constraints are non-negative. Theintegrality gap of the ILP is at mostlogn{\displaystyle \scriptstyle \log n} (wheren{\displaystyle \scriptstyle n} is the size of the universe). It has been shown that itsrelaxation indeed gives a factor-logn{\displaystyle \scriptstyle \log n}approximation algorithm for the minimum set cover problem.[4] Seerandomized rounding#setcover for a detailed explanation.

Hitting set formulation

[edit]

The set cover problem is equivalent to the hitting set problem. A subsetH{\displaystyle H} ofU{\displaystyle U} is called ahitting set whenHSj{\displaystyle H\cap S_{j}\neq \emptyset } for all1jm{\displaystyle 1\leq j\leq m} (i.e.,H{\displaystyle H} intersects or “hits” all subsets inS{\displaystyle S}). Thehitting set problem is to find a minimum hitting setH{\displaystyle H} for a givenU{\displaystyle U} andS{\displaystyle S}.

To show that the problems are equivalent, for a universeU{\displaystyle U} of sizen{\displaystyle n} and collection of setsS{\displaystyle S} of sizem{\displaystyle m}, constructU={1,2,,m}{\displaystyle U'=\{1,2,\ldots ,m\}} andSi={jiSj}{\displaystyle S'_{i}=\{j\mid i\in S_{j}\}}. Then a set coverC{\displaystyle C} ofS{\displaystyle S} is equivalent to a hitting setH{\displaystyle H'} ofU{\displaystyle U'} whereSjCjH{\displaystyle S_{j}\in C\iff j\in H'}, and vice versa.

This equivalence can also be visualized by representing the problem as abipartite graph ofn+m{\displaystyle n+m} vertices, withn{\displaystyle n} vertices on the left representing elements ofU{\displaystyle U}, andm{\displaystyle m} vertices on the right representing elements ofS{\displaystyle S}, and edges representing set membership (i.e., there is an edge between thei{\displaystyle i}-th vertex on the left and thej{\displaystyle j}-th vertex of the rightiff.iSj{\displaystyle i\in S_{j}}). Then a set cover is a subsetC{\displaystyle C} of right vertices such that each left vertex is adjacent to at least one member ofC{\displaystyle C}, while a hitting set is a subsetH{\displaystyle H} of left vertices such that each right vertex is adjacent to at least one member ofH{\displaystyle H}. These definitions are exactly the same except thatleft andright are swapped. But there is nothing special about the sides in the bipartite graph; we could have put the elements ofU{\displaystyle U} on the right side, and the elements ofS{\displaystyle S} on the left side, creating a graph that is a mirror image of the one described above. This shows that set covers in the original graph are equivalent to hitting sets in the mirrored graph, and vice versa.

In the field ofcomputational geometry, a hitting set for a collection of geometrical objects is also called astabbing set orpiercing set.[5]

Greedy algorithm

[edit]

There is agreedy algorithm for polynomial time approximation of set covering that chooses sets according to one rule: at each stage, choose the set that contains the largest number of uncovered elements. This method can be implemented in time linear in the sum of sizes of the input sets, using abucket queue to prioritize the sets.[6] It achieves an approximation ratio ofH(s){\displaystyle H(s)}, wheres{\displaystyle s} is the size of the set to be covered.[7] In other words, it finds a covering that may beH(n){\displaystyle H(n)} times as large as the minimum one, whereH(n){\displaystyle H(n)} is then{\displaystyle n}-thharmonic number:H(n)=k=1n1klnn+1{\displaystyle H(n)=\sum _{k=1}^{n}{\frac {1}{k}}\leq \ln {n}+1}

This greedy algorithm actually achieves an approximation ratio ofH(s){\displaystyle H(s^{\prime })} wheres{\displaystyle s^{\prime }} is the maximum cardinality set ofS{\displaystyle S}. Forδ{\displaystyle \delta -}dense instances, however, there exists aclnm{\displaystyle c\ln {m}}-approximation algorithm for everyc>0{\displaystyle c>0}.[8]

Tight example for thegreedy algorithm with k=3

There is a standard example on which the greedy algorithm achieves an approximation ratio oflog2(n)/2{\displaystyle \log _{2}(n)/2}.The universe consists ofn=2(k+1)2{\displaystyle n=2^{(k+1)}-2} elements. The set system consists ofk{\displaystyle k} pairwise disjoint setsS1,,Sk{\displaystyle S_{1},\ldots ,S_{k}} with sizes2,4,8,,2k{\displaystyle 2,4,8,\ldots ,2^{k}} respectively, as well as two additional disjoint setsT0,T1{\displaystyle T_{0},T_{1}},each of which contains half of the elements from eachSi{\displaystyle S_{i}}. On this input, the greedy algorithm takes the setsSk,,S1{\displaystyle S_{k},\ldots ,S_{1}}, in that order, while the optimal solution consists only ofT0{\displaystyle T_{0}} andT1{\displaystyle T_{1}}.An example of such an input fork=3{\displaystyle k=3} is pictured on the right.

Inapproximability results show that the greedy algorithm is essentially the best-possible polynomial time approximation algorithm for set cover up to lower order terms(seeInapproximability results below), under plausible complexity assumptions. A tighter analysis for the greedy algorithm shows that the approximation ratio is exactlylnnlnlnn+Θ(1){\displaystyle \ln {n}-\ln {\ln {n}}+\Theta (1)}.[9]

Low-frequency systems

[edit]

If each element occurs in at mostf sets, then a solution can be found in polynomial time that approximates the optimum to within a factor off usingLP relaxation.

If the constraintxS{0,1}{\displaystyle x_{S}\in \{0,1\}} is replaced byxS0{\displaystyle x_{S}\geq 0} for allS inS{\displaystyle {\mathcal {S}}} in the integer linear program shownabove, then it becomes a (non-integer) linear programL. The algorithm can be described as follows:

  1. Find an optimal solutionO for the programL using some polynomial-time method of solving linear programs.
  2. Pick all setsS for which the corresponding variablexS has value at least 1/f in the solutionO.[10]

Inapproximability results

[edit]

Whenn{\displaystyle n} refers to the size of the universe,Lund & Yannakakis (1994) showed that set covering cannot be approximated in polynomial time to within a factor of12log2n0.72lnn{\displaystyle {\tfrac {1}{2}}\log _{2}{n}\approx 0.72\ln {n}}, unlessNP hasquasi-polynomial time algorithms.Feige (1998) improved this lower bound to(1o(1))lnn{\displaystyle {\bigl (}1-o(1){\bigr )}\cdot \ln {n}} under the same assumptions, which essentially matches the approximation ratio achieved by the greedy algorithm.Raz & Safra (1997) established a lower boundofclnn{\displaystyle c\cdot \ln {n}}, wherec{\displaystyle c} is a certain constant, under the weaker assumption thatP{\displaystyle \not =}NP.A similar result with a higher value ofc{\displaystyle c} was recently proved byAlon, Moshkovitz & Safra (2006).Dinur & Steurer (2013) showed optimal inapproximability by proving that it cannot be approximated to(1o(1))lnn{\displaystyle {\bigl (}1-o(1){\bigr )}\cdot \ln {n}} unlessP={\displaystyle =}NP.

In low-frequency systems,Dinur et al. (2003) proved it is NP-hard to approximate set cover to better thanf1ϵ{\displaystyle f-1-\epsilon }.If theUnique games conjecture is true, this can be improved tofϵ{\displaystyle f-\epsilon } as proven byKhot & Regev (2008).

Trevisan (2001) proves that set cover instances with sets of size at mostΔ{\displaystyle \Delta } cannot be approximated to a factor better thanlnΔO(lnlnΔ){\displaystyle \ln \Delta -O(\ln \ln \Delta )} unlessP={\displaystyle =}NP, thus making the approximation oflnΔ+1{\displaystyle \ln \Delta +1} of the greedy algorithm essentially tight in this case.

Weighted set cover

[edit]
[icon]
This sectionneeds expansion. You can help byadding to it.(November 2017)

Relaxing the integer linear program for weighted set cover statedabove, one may userandomized rounding to get anO(logn){\displaystyle O(\log n)}-factor approximation. Non weighted set cover can be adapted to the weighted case.[11]

Fractional set cover

[edit]
[icon]
This sectionneeds expansion. You can help byadding to it.(September 2023)

Related problems

[edit]
  • Hitting set is an equivalent reformulation of Set Cover.
  • Vertex cover is a special case of Hitting Set.
  • Edge cover is a special case of Set Cover.
  • Geometric set cover is a special case of Set Cover when the universe is a set of points inRd{\displaystyle \mathbb {R} ^{d}} and the sets are induced by the intersection of the universe and geometric shapes (e.g., disks, rectangles).
  • Set packing
  • Maximum coverage problem is to choose at most k sets to cover as many elements as possible.
  • Dominating set is the problem of selecting a set of vertices (the dominating set) in a graph such that all other vertices are adjacent to at least one vertex in the dominating set. The Dominating set problem was shown to be NP complete through a reduction from Set cover.
  • Exact cover problem is to choose a set cover with no element included in more than one covering set.
  • Red-blue set cover.[12]
  • Set-cover abduction.
  • Monotone dualization is a computational problem equivalent to either listing all minimal hitting sets or listing all minimal set covers of a given set family.[13]

Notes

[edit]
  1. ^Korte & Vygen 2012, p. 414.
  2. ^Vazirani (2001, p. 15)
  3. ^Vazirani (2001, p. 108)
  4. ^Vazirani (2001, pp. 110–112)
  5. ^Nielsen, Frank (2000-09-06)."Fast stabbing of boxes in high dimensions"(PDF).Theoretical Computer Science.246 (1):53–72.doi:10.1016/S0304-3975(98)00336-3.ISSN 0304-3975.
  6. ^Cormen, Thomas H.;Leiserson, Charles E.;Rivest, Ronald L.;Stein, Clifford (2009) [1990], "Exercise 35.3-3",Introduction to Algorithms (3rd ed.), MIT Press and McGraw-Hill, p. 1122,ISBN 0-262-03384-4
  7. ^Chvatal, V. (August 1979), "A Greedy Heuristic for the Set-Covering Problem",Mathematics of Operations Research,4 (3):233–235,doi:10.1287/moor.4.3.233,JSTOR 3689577
  8. ^Karpinski & Zelikovsky 1998
  9. ^Slavík PetrA tight analysis of the greedy algorithm for set cover. STOC'96, Pages 435-441,doi:10.1145/237814.237991
  10. ^Vazirani (2001, pp. 118–119)
  11. ^Vazirani (2001, Chapter 14)
  12. ^Information., Sandia National Laboratories. United States. Department of Energy. United States. Department of Energy. Office of Scientific and Technical (1999).On the Red-Blue Set Cover Problem. United States. Dept. of Energy.OCLC 68396743.
  13. ^Gainer-Dewar, Andrew; Vera-Licona, Paola (2017), "The minimal hitting set generation problem: algorithms and computation",SIAM Journal on Discrete Mathematics,31 (1):63–100,arXiv:1601.02939,doi:10.1137/15M1055024,MR 3590650,S2CID 9240467

References

[edit]

External links

[edit]
Wikimedia Commons has media related toSet cover problem.
Retrieved from "https://en.wikipedia.org/w/index.php?title=Set_cover_problem&oldid=1313299814"
Categories:
Hidden categories:

[8]ページ先頭

©2009-2025 Movatter.jp