Monday, January 21, 2008

LP integrality gaps for Min-Set-Cover

As promised, herein I will discuss "LP integrality gap instances" for Set-Cover. Recall that these are instances for which Lp() is small but Opt() is large.

Let's start with one that looks quite similar to the LP+Randomized-Rounding algorithmic gap we saw in class:

Ground elements: all nonzero elements of {0,1}q -- thought of as 2q, the q-dimensional vector space over the field of size 2. There are n=2q-1 ground elements.

Sets: For each α2q we have a set Sα={e2q\{0}:αe=1}. Here denotes the usual dot product in 2q: α1e1+...+αqeq (mod 2). There are M=2q sets. (The set S0 is kind of pathetic, being the empty set, but never mind.)

Analysis: Each ground element is contained in exactly half of the sets. (Standard undergrad probability exercise.) It follows that in an LP solution, if we take each set fractionally to the extent 2/M, then each element is fractionally covered to the extent 1.

Hence Lp()M(2/M)=2. (In fact, Lp is less than this...)

On the other hand, we claim Opt()q. Suppose by way of contradiction that there are some q-1 sets Sα1,...,Sαq-1 covering all ground elements. So

Sα1...Sαq-1=2q\{0}

i=1qSαi¯=

i=1q{e2q:αie=0}={0}.

But each set in the intersection above is a hyperplane in 2q. This gives the contradiction, because in dimension q, you can't have q-1 hyperplanes intersecting on just the point 0.

Conclusion: This instance is a 2 vs. log2 (n+1) integrality gap for Set-Cover. This yields a ratio of Ω(lnn), but not the full (1-ε)lnn we've come to expect.

--------------------

Comparing with the way things have gone before, I bet you think I'm going to now tell you a straightforward way to extend this to get the full (1-ε)lnn gap, perhaps by looking at an instance with ground elements kn.

Well, I'm here to tell you that I don't know how to do this. Quite likely it's possible, but I couldn't see how. I'd be very very interested if one of you could give such an example.

--------------------

So is there another kind of integrality gap instance? Or perhaps Lp and Opt can never be off by more than a (log2n)/2 factor...

Since Feige proved that the (1-\eps)lnn decision problem is hard, we know that the former must be the case. (Otherwise, the efficient algorithm of computing and outputting Lp would contradict his theorem.) Here is one way to show it:


Fix an abstract set of ground elements, Ω, of cardinality n. Introduce sets S1,...,SM by picking them randomly (and indendently). For each Si, we choose it by including each element of Ω with probability 1/k. We will take M=(Ck2)logn for a large C to be named later.

Analysis: Each element is expected to be contained in about a 1/k fraction of the sets. If we take C large enough, then a Chernoff bound + union bound will imply that with high probability, every element is contained in at least a (1/k)(1-ε) fraction of sets. (Here we can make ε as small as we like by making C large.) Thus we can get a valid fractional solution to the LP by taking each set to the extent (k/M)/(1-ε); this solution has value k/(1-ε). Summarizing: with high probability, Lpk/(1-ε).

We now would like to show that with high probability, Optt:=(1-ε)klnn. This will give us an Opt/Lp integrality gap of (1-ε)2lnn, which is good enough.

To see this, we fix any set of t out of M set indices and show that the probability these sets end up being a valid cover is very small. If the probability is significantly smaller even than 1/Mt then we can union bound over all possible size-t covers and be done.

Now each element e has a 1-1/k chance of being uncovered by a given set, and thus our fixed t sets have a (1-1/k)t chance of leaving e uncovered. With our choice of t, this quantity is essentially exp(-(1-ε)lnn)=1/n1-ε (slight cheat here as the inequality goes the wrong way, but one can fix this easily). The events that the t sets cover each element are actually independent across elements. So the probability that all elements are covered is at most (1-1/n1-ε)nexp(-nε). This is indeed much less than 1/Mt, since this latter quantity only is of the order 1/(logn)O(logn) (assuming k and ε are "constants"), much smaller than exp(-nε).

2 comments:

Anonymous said...

Does the integrality gap depend on the nature of LP formulation. For example if we have some alternative LP formulation for set cover problem, do we end up getting different integrality gap.

Ryan O'Donnell said...

Yes, it does depend on the LP formulation.

Note, though, that since we know NP-hardness of deciding Set-Cover to within a factor of (1-eps) ln n, there cannot be any poly-size LP formulation for Set-Cover with integrality gap less than (1-eps) ln n. (Unless P = NP!)

[Actually, we only know this hardness subject to NP not in DTIME(n^{O(log log n)}), so "unless NP in DTIME(n^{O(log log n)}}".]