15-854B: Advanced Approximation Algorithms

Another clarification on HW6

2008-04-30T20:43:00.002-04:00

For 2a, it should say "expressed as a sum of products of up to k of f's Fourier coefficients".

Thanks to Jonah for pointing this out.

Clarification on HW6

2008-04-29T12:30:00.007-04:00

Ok, yeah, part (b) was super-vague, sorry about that. Here is a more precise question, I think.

Consider the matrix $M_2(y)$; if this is psd, then we know, e.g., that $y_{abcd}$ can be written as the dot product of vectors $z_{ab}$ and $z_{cd}$, and also as the dot products of vectors $z_{bc}$ and $z_{ad}$ etc. Similarly $y_{ab}$ can be written as $z_{a}$ dot $z_b$, and also as $z_{ab}$ dot $z_\emptyset$. Hence we now have these vectors $z$'s, each subscripted by at most two vertices, with some constraints relating them.

The goal of the problem is to show that the SDP $Q_1(K_{VC})$ is at least as strong as a natural vertex cover SDP, which assigns a vector $v_i$ to each node $i$ to minimize $\sum_i v_i \cdot v_i$ subject to

$(v_\emptyset - v_a)(v_\emptyset - v_b) = 0$ for all edges $ab$.
$v_\emptyset \cdot v_\emptyset = 1$ (say).

Can you show that the SDP you've created implies these constraints?

Does your SDP also imply the triangle inequalities:

$||v_a - v_b||^2 + ||v_b - v_c||^2 \geq ||v_a - v_c||^2$?

If things are still vague (or if you see some mistakes above), please email/post yr questions.

Course evaluation time...

2008-04-28T09:33:00.003-04:00

Hi all: when you get a chance, could you go to https://my.cmu.edu/site/main/page.academics and give yr feedback on the course. thanks!

Integrality Gaps for Sparsest Cut

2008-04-27T22:38:00.006-04:00

A quick comment about the sparsest cut integrality gaps we saw last time, and the connection to the dual. (BTW, the notes for Lec 27 are up, thanks Varun!)

In Lecture 27, the arguments for the gaps went thus: at least $k$ of the demands need to travel through $p$ edges in the network, hence sending $\tau$ fraction of each unit demand uses $\tau \cdot k \cdot p$ volume. And since we dealt with unit capacities, there was only $m$ volume in the network, hence the maximum concurrent flow is at most $m/(pk)$. For the 5-node example, $p=2$, $k=4$ and $m=6$; for the expander, $k = \Omega(n^2)$, $p = \Omega(\log n)$, and $m = O(n)$.

But suppose we are more sophisticated:

Suppose we assign lengths $d_e$ to each edge $e$, so that the shortest path according to these edge lengths from $i$ to $j$ is now $d_{ij}$. Think of each of the edges $(i,j)$ as pipes with capacity $C_{ij}$ and length $d_{ij}$.
Now if the pair $ij$ sends $\tau D_{ij}$ flow, each unit of flow must travel $d_{ij}$ distance. Hence the total volume used by this pair is $\tau D_{ij} \cdot d_{ij}$. And the total volume used is $\sum_{ij} \tau \cdot D_{ij} \cdot d_{ij}$.
But the total volume in the network is now $\sum_{(i,j) \in E} C_{ij} d_{ij}$.

Hence, the maximum concurrent flow $\tau^*$ is at most $\sum_{(i,j) \in E} C_{ij} d_{ij}$ divided by $\sum_{ij} \tau \cdot D_{ij} \cdot d_{ij}$. Moreover, this is true for any setting of the edge lengths $d_e$. Hence, $\tau$ is at most the minimum of $(C.d)/(D.d)$ over all metrics $d$, or in other words, $\tau \leq \lambda^*$!

What we have just proved is weak LP-duality for the case of sparsest cut. The fact that $\tau^*$ equals $\lambda^*$ is given by strong duality.

Hardness of Independent Set

2008-04-21T22:29:00.003-04:00

I wanted to say a few more things about the hardness of Max-Independent-Set (equivalently, Min-Vertex-Cover). Recall that in Lecture 25, we sketched the proof of factor $1/n^{1-\epsilon}$ NP-hardness for this problem; specifically, Hastad's result shows that distinguishing $1/n^{\epsilon}$ vs. $1/n^{1-\epsilon}$ is NP-hard for all $\epsilon > 0$.

This is a great hardness factor, but (as they used to say) "the gap is in the wrong place" -- or at least, it's not in the strongest possible place. It is of great interest to try to obtain a hardness result of the form $\Omega(1)$ vs. $o(1)$. This will give a worse "factor", but the result is essentially stronger. Note that we can't hope for $\Omega(1)$ vs. $1/n^{1-\epsilon}$ NP-hardness; Boppana and Halldorsson showed, roughly, that $1/k$ vs. $1/n^{(k-2)/(k-1)}$ is in P (using Ramsey theory) and this was improved by Alon and Kahale to, roughly, $1/k$ vs. $1/n^{(k-2)/(k+1)}$, using the Karger-Motwani-Sudan SDP coloring algorithm.

One reason hardness results like $\Omega(1)$ vs. "small" are better is that they better "factor" hardness for Min-Vertex-Cover. Recall that, trivially, a $c$ vs. $s$ hardness result for Min-Independent-Set is a $1-c$ vs. $1 - s$ hardness result for Max-Vertex-Cover, and thus a $(1-s)/(1-c)$-factor hardness for Max-Vertex-Cover. For example, Hastad's hardness gives only $1+o(1)$ factor hardness for Vertex-Cover, but a result like $1/3$ vs. $o(1)$ hardness for Independent-Set would give factor $3/2 - o(1)$ hardness for Vertex-Cover.

For a long time, the best NP-hardness result like this we knew for Independent-Set came simply from taking Hastad's $1 - \epsilon$ vs. $1/2 + \epsilon$ hardness result for Max-E3Lin and applying the FGLSS reduction to it. You should (easily) check for yourself that this gives $1/4 - \epsilon$ vs. $1/8 + \epsilon$ NP-hardness for Independent-Set. Hence it gives $3/4 + \epsilon$ vs. $7/8 - \epsilon$ NP-hardness for Vertex-Cover; i.e., factor $7/6 + \epsilon$.

In 2002, in the same conference as Khot's Unique Games Conjecture paper, Dinur and Safra published a stronger NP-hardness result for Independent-Set. This was a pretty amazing paper, because the outline of the proof was:

a) Prove a super-weak variant of the Unique Games Conjecture;
b) Do a very difficult Dictator Test based on the combinatorics of $2$-intersecting families.

Part (a), as mentioned, was done contemporaneously with the UGC paper, while part (b) was done before the work on Ek-Independent-Set described in our Lectures 8 and 9.

I like to remember their result as $1/3$ vs. $1/9$ NP-hardness for Independent-Set (omitting the $\epsilon$'s), and hence factor $(8/9)/(2/3) = 4/3$ hardness for Vertex-Cover. In fact, they pushed it a little further. Specifically, for $p = (3 - \sqrt{5})/2$, they showed $p$ vs. $4p^3 - 3p^4$ NP-hardness for Independent-Set. This is, like, .382 vs. .159. This yields factor 1.36 or so NP-hardness for Vertex-Cover.

That stands as the best NP-hardness result known. Relatively soon after the UGC paper, Khot and Regev showed showed that assuming UGC, $1/2 - \epsilon$ vs. $\epsilon$ hardness holds for Min-Independent-Set, and thus factor $2-\epsilon$ hardness for Vertex-Cover. Indeed, they showed UGC-hardness of $1/k - \epsilon$ vs. $\epsilon$ for Min-Ek-Independent-Set.

End of Lecture 25

2008-04-21T17:39:00.002-04:00

Here is the end of the proof I didn't get a chance to finish from Lecture 25.

Homework 5 graded

2008-04-21T16:39:00.002-04:00

Homework 5 is graded and will be handed back tomorrow. The average was 79%.

A few comments:

1. Most people got this right (one way or another).

2. For part b), the two equally simplest counterexamples are {$x_1$, $x_1$, $1$}, or {$x_1$, $x_1$, $x_1$}. Aaron, Kanat, and Mike all gave the same, far-from-simplest counterexample. What a coincidence...

3. Many people failed to prove a characterization (if and only if statement) in c), and just gave an implication. Many people also had complicated solutions for the support-3 problem. Ali and Amitabh gave roughly the same solution, which is probably the slickest. Suppose f = a $\chi_S$ + $b \chi_T$ + $c \chi_U$, where the coefficients $a, b, c$ are nonzero and $S$, $T$, $U$ are distinct. By Parseval,

$a^2 + b^2 + c^2 = 1$ (*).

Substituting in $(1, ..., 1)$, we get $a + b + c \in {-1,1}$. Squaring this and subtracting (*) yields

$ab + bc + ac = 0$ (**).

Since $S$, $T$, $U$ are distinct, there must be an index $i$ which is in exactly 1 or 2 of them. Substituting in the string which is 1 everywhere except that it's $-1$ on the $i$th coordinate, we conclude $\sigma_1 a + \sigma_2 b + \sigma_3 c \in {-1,1}$, where exactly 1 or 2 of the signs $\sigma_i$ are $-1$. Squaring this and subtracting (*) yields

$\tau_1 ab + \tau_2 bc + \tau_3 ac = 0$ (***),

where exactly two of the signs $\tau_i$ are $-1$. Now add (**) and (***) and conclude that one of $ab$, $bc$, or $ac$ must equal 0, a contradiction.

4. Most people got (a) and (b). For (c), several people estimated the degree-1 Fourier coefficients of Majority and concluded that it's, say, $(1/n, 1/2)$-quasirandom. Only Eric truly ventured to high degree, pointing out that $Inf_i(Maj)$ can be seen to be $\Theta(1/\sqrt{n})$, and therefore by 5a, Maj is $(\Theta(1/\sqrt{n}),0)$-quasirandom. In fact, one can show that the largest Fourier coefficients for Maj are the ones at level $1$, and hence Maj is $(1/n, 0)$-quasirandom.

5. Most people more or less got this.

6. Most people mostly got this; the key in b) is picking the right diameter.

7. Most people got this too, although many failed to do both parts of f).

Homework 6 out

2008-04-15T15:14:00.000-04:00

Homework 6 is out: it's due May 1st.

Lecture 24: UGC-Hardness of Max-Cut

2008-04-14T11:36:00.001-04:00

The completion of the proof of the reduction from Unique Label-Cover to Max-Cut can be found here.

Homework 4

2008-04-03T15:05:00.002-04:00

A high-scoring one: the average was 88. Notes and comments soon.

Hwk 5 clarifications

2008-03-31T17:11:00.002-04:00

HW5 has been extended to Thursday Apr 3rd. Here are some typos to watch out for ---

Q6, the demands are all integers.
Q7e, it should be $\epsilon/\log n$, and not $\epsilon \log n$.

Tomorrow's class

2008-03-31T16:34:00.002-04:00

Tomorrow we're going to dive more deeply into hardness of approximation results for CSPs. Please take a look again at the ones we've already proved: for Max-Coverage (Lecture 3) and for Min-Ek-Hypergraph-Independent-Set (Lectures 8 and 9). (

Lecture notes 16 and 17

2008-03-29T22:34:00.000-04:00

are posted. Sorry for the delay.

Prep for lecture tomorrow: recap Lecture 18

2008-03-24T09:28:00.005-04:00

For tomorrow's lecture (#19), please recall the definition of metrics embedding into trees (problem #6 from HW4). Also, have a look at the correctness proof for low-diameter decompositions posted on Thursday, and try to prove the following extension of that result:

Extension: For a vertex $v$ and radius $r$, let $B(v,r) = \{ w | d_{vw} \leq r\}$ be the "ball" of all vertices within distance $r$ to $v$. Extend the proof from Lecture 16 to show that the probability of an edge $(u,v)$ being cut is at most

$O(1) \times (d_{uv} / r) \times \log (|B(u,2r)| / |B(u,r/4)|)$.

(Hint: The previous proof sums $1/j$ for all $j$ from $1$ to $n$ to get $H_n$. Can you show it suffices to sum over a smaller range? You may also want to consider cases depending whether or not $d_{uv}$ is at least $r/4$.)

Homework 5

2008-03-21T01:08:00.001-04:00

Sorry for the delay, HW5 is up on the web page.

Lecture 18: Last bits of the proof

2008-03-20T15:03:00.005-04:00

Recall the low-diameter decomposition procedure: we picked a random radius $X$ and a random permutation $\pi$. We then constructed the random partition by considering all the vertices one by one, in the order given by $\pi$: when considering $w$, we assigned all the yet-unassigned vertices $v$ with $d(w,v) \leq X$ to $w$'s partition.

Just for the analysis: suppose we renumber the vertices $w_1, w_2, \dots, w_n$ in order of their distance from the edge $e = (u,v)$.

The crucial definitions (that avoid pain later) are the following:

At some time instant in this procedure, one (or both) of $u$ or $v$ gets assigned to some $w_i$. Say that $w_i$ settles the edge $(u,v)$.
At the moment the edge is settled, if only one endpoint of this edge gets assigned, then we say that $w_i$ cuts the edge $(u,v)$.

Note that each edge is settled at exactly one time instant in the procedure, and it may or may not be cut at that point in time. Of course, once the edge is settled (with or without being cut), it is never cut in the future.

Consider $w_j$ and let its distance $d(w_j,u) = a_j$ and $d(w_j, v) = b_j$. Assume $a_j < b_j$, the other case is identical. If $w_j$ cuts $(u,v)$ when the random values are $X$ and $\pi$, the following two properties must hold:

The random variable $X$ must lie in the interval $[a_j, b_j]$ (else either none or both of $(u,v)$ would get marked).
The node $w_j$ must come before $w_1, ..., w_{j-1}$ in the permutation $\pi$.

Suppose not, and one of them came before $w_j$ in the permutation. Since all these vertices are closer to the edge than $w_j$ is, then for the current value of $X$, they would have settle the edge (either capture one or both of the endpoints) at some previous time point, and hence $w_j$ would not settle---and hence not cut---the edge $(u,v)$.

Now the rest of the argument is as in class: Pr[ edge $e$ is cut ] = $\sum_j$ Pr[ $w_j$ cuts the edge $e$]. Moreover, Pr[$w_j$ cuts $e$] $\leq$ Pr[$X \in [a_j, b_j]$ and $w_j$ comes before $w_1, ... w_{j-1}$ in the permutation $\pi$] $\leq$ $(d_{uv}/(r/2)) \times (1/j)$. And summing this over all vertices gives us $2 (d_{uv}/r) H_n$.

Homework 4, problem 2b comment

2008-03-16T23:28:00.002-04:00

Just show you can get a gap at least this bad. (In other words, although you will need to construct a good SDP solution for $C_5$, you need not prove it is optimal. Unless you want to. Say, by using duality.)

Homework 3 graded

2008-03-16T23:20:00.003-04:00

I'll return them on Tuesday. The average was 79%. Some comments:

1a) Several people erroneously claimed a log*|D| approximation here. (In fact, no factor approximation is possible.) The reason the AKC algorithm does not work is as follows: Assume we have correctly guessed R = Opt. At first, we indeed know that there k vertices which R-cover all of D. So the first Set-Cover approximation returns some set of vertices A R-covering D. But now, we do not know that there are k vertices which R-cover A. So we are stuck.

2. This was surprisingly not done so well. Most people seemed to get the key point that ($c_0$, $s_0$) hardness implies hardness along the two line segments (in your diagram) joining this point to (1,1) and to (0,0). But people seemed to have a very hard time deducing the easy region (due to the Vertex-Cover algorithm), which is the triangle with vertices (1,1), (1, 0), and (1-1/k, 0).

3. Most everybody got this one, except for failing to point out in part b the simple but key fact that the reduction preserves Opt = 1.

4. As mentioned earlier, this question was kind of screwed up. In honor of that, everyone received 3 bonus points. Mainly, we just wanted you to revisit the Max-Coverage proof, pay attention to the parameters, and see how the parameters from #3 fit in.

5. Almost no one tried this, although it is not too hard.

6. Most people who tried this more or less got it, although some failed to argue why shifting preserves the weight of the set. For a solution, see the original paper.

Mathematical odds and sods for SDP

2008-03-16T16:48:00.008-04:00

Hello all, hope you had a good spring break. Scribe notes for the last two lectures are up. Sorry for the delay; my fault, not the scribes'.

I thought I would record here a few small mathematical facts that came up in our lectures on SDP algorithms for Max-Cut and Coloring.

The GW constant: Recall the GW constant,

$\alpha := min_{-1 \leq \rho \leq 1} [\arccos(\rho)/\pi]/[1/2 - \rho/2]$.

Using calculus and some trigonometric substitutions, you can determine that $\alpha = 2/(\pi sin \theta)$, where $\theta$ is the positive solution of $\theta = \tan(\theta/2)$. As far as I know, it is not known if $\alpha$ is irrational, although presumably it is transcendental.

Gaussians: Remember that a real random variable X has the "standard Gaussian (normal) distribution" if its pdf is $\phi(x) := (1/\sqrt{2\pi}) \exp(-x^2/2)$. The annoying thing about this pdf is that it does not have a closed-form integrals although several closely related expressions do. For example, $x\phi(x)$ has $-\phi(x)$ as an antiderivative. It's easy to check that E[$X^k$] = 0 whenever $k$ is an odd positive integer, and with enough integration-by-parts effort you can check that E[$X^k]= (k-1)(k-3) \cdot \cdot \cdot 3 \cdot 1$ whenever $k$ is an even positive integer. In particular, X has mean 0 and variance 1.

In class we were particularly interested in the "tail", $\mathcal{N}(t) := $ Pr[X $\geq t$] $= \int_{t}^\infty \phi(x) dx$, where $t$ is positive. Here is how to accurately estimate this quantity:

For an upper bound, just insert a factor of $x/t$ into the integrand, which is always at least $1$ on the range. Then you can immediately integrate and get the upper bound

$\mathcal{N}(t) \leq (1/t) \phi(t)$.

For a lower bound, you need a cleverer trick. Insert a factor of $1 - 3/x^4$ into the integral (!), which is always less than $1$. You can check that the resulting quantity, $(1-3/x^4)\phi(x)$ has an explicit antiderivative: $(1/x^3 - 1/x)\phi(x)$. Thus you can conclude

$\mathcal{N}(t) \geq (1/t - 1/t^3) \phi(t)$.

(I think this trick is due to Feller.) Thus we have excellent bounds for $\mathcal{N}(t)$ for large $t$: in particular,

$\mathcal{N}(t) \sim (1/t) \phi(t)$

as mentioned in class.

How to draw a Gaussian random variable: In practice, just use "randn" in Matlab :) In theory, suppose you have access to an oracle which gives you a uniform real from the interval [0,1]. Then the easiest thing is to draw two random numbers S and $\theta$ from [0,1] and then set g = $\sqrt{\ln(1/s)} \cos(2\pi \theta)$, h = $\sqrt{\ln(1/s)} \sin(2\pi \theta)$. Then g and h are two independent Gaussians! (This is the "Box-Muller Transformation".) It is a nice exercise to see why this is true.

Of course, in practice in theory (if you will), the usual model of randomness is to assume you have random bits. Also, of course, you can't compute ln, cos, sin, $\pi$, etc. to arbitrary precision. So in reality, you need to make some approximations.

Just as with the issue of solving SDPs exactly, these approximations introduce additive errors of $\epsilon$, with running time overhead of polylog($1/\epsilon$). And just as with the SDP-solving issue, we will try to never speak of such annoying details again :)

Compendium of approximability

2008-03-16T16:47:00.001-04:00

[somehow I deleted the original of this posting...]

This site, by Pierluigi Crescenzi and Viggo Kann describes the best known approximation guarantees and inapproximability results for a very large number of problems.

It's quite out of date, but still a useful resource.

k-Set-Cover

2008-03-04T18:24:00.003-05:00

Jeremiah asked me today about the status of the problem "k-Set-Cover" for small k. This is the Min-Set-Cover problem where all the sets have cardinality at most k.

I wasn't exactly sure on the status, actually. On the homework we showed that the greedy algorithm gives an $H_k$ ($k$th harmonic number) factor approximation. I guessed that this was known to be more or less sharp.

Turns out that is true to some extent. With some augmentations to Feige's result, Trevisan showed that the problem is hard to approximate within a factor of $\ln k - \Omega(\ln \ln k)$.

While this looks pretty sharp, it's actually useless for the relatively interesting case of small constant $k$. For $k = 2$ the problem is AKA "Edge Cover", and it's solvable in P.

For $k \geq 3$ the problem is known to be NP-hard, and furthermore known to be "APX-hard", meaning that it is hard to approximate within a factor of $1 + \epsilon$ for some universal positive $\eps > 0$. (Result due to Papadimitriou and Yannakakis?) Improving this to an explicit decent value of $\epsilon$, depending on $k$, would be a nice problem.

On the positive side, the case of most interest to Jeremiah was $k = 3$. Here the best result is due to Duh and Fürer: a factor 4/3 algorithm. Actually, they get a factor $H_k - 1/2$ algorithm for any $k$.

There have been subsequent marginal improvements. A result slightly better than 19/12 is known for $k = 4$, and the world record is $H_k - .59$ for all $k \geq 6$; that appears in this recent paper of Athanassopoulos, Caragiannis, and Kaklamanis.

Homework 2: Notes and Comments

2008-02-29T11:30:00.003-05:00

Here are solution sketches and comments on the rest of problems in HW2.

For the naive algorithm on unweighted graphs, several answers said: "if an edge is uncovered, the algorithm will pick two vertices when the optimal algorithm picks one vertex". While this is true (and is indeed the basis for the proof), you need to show that we are not over-counting. One way is to clearly state a lower bound on OPT (the size of any maximal matching), and state why the algorithm is at most twice the lower bound.

For the randomized VC algorithm, see the solutions to problem 2b from this previous semester's course.

For the primal-dual algorithm, we write down the dual which maximizes $\sum_e y_e$ such that for each vertex $\sum_{e: v \in e} y_e \leq c_v$, and the $y_e$ values are non-negative. We start with the zero primal and dual solutions, and raise all the $y_e$'s uniformly until some constraint becomes tight (i.e., for some vertex $v$, $\sum_{e: v \in e} y_e = c_v$. Pick the vertex in the primal solution, and "freeze" all the variables involved in this constraint. Now continue raising all the unfrozen edge variables, picking vertices and freezing edges when other contraints become right etc, until all the edge variables become tight: we now have have a feasible primal vertex cover $C$. (why?) Now the cost of this solution is $\sum_{v \in C} c_v = \sum_{v \in C} \sum_{e: v \in e} y_e$ $\leq \sum_{e} \sum_{v \in C \cap e} y_e \leq 2 \sum_e y_e$, which is the dual value. (The last inequality is because each edge contains at most 2 vertices.) Hence the vertex cover costs twice the feasible dual solution.

For the greedy algorithm, if you have a star with center cost $1+\epsilon$, and $n$ leaves with costs $1, 1/2, 1/3, \ldots, 1/n$, then you will pick all the leaves instead of the center, and hence only get an $H_n$ approx. An $O(\log m) = O(\log n)$ upper bound follows from the standard set cover analysis.
One point often missed: this problem defined $C^*$ and $F^*$ to be the facility and connection costs for the optimal integer solution, and not for the optimal LP solution.

Also, if you claim that after filtering you can use the standard Set Cover randomized rounding to obtain a solution, you need to show how the LP solution for facility location can be mapped to a LP solution for Set Cover.
No surprises here.
Build a bipartite graph on clauses and variables, and use Hall's theorem to infer the existence of a perfect matching.

Homework 2 graded

2008-02-28T11:46:00.001-05:00

Sorry for the delay. We'll hand it back today. Overall average was 84%.

Solutions to Homework 2, #6

2008-02-28T10:43:00.004-05:00

7a. Take any $s$ sets in $\mathcal{F}_{s,t,\ell}$. By definition, each misses at most $\ell$ elements from $[t + s\ell]$. Thus their intersection misses at most $s \ell$ elements from $[t + s\ell]$ and thus has size at least $t$.

7b. Extend the definition of $S_{ij}$ to sets in the natural way. It's obvious that if $A$ is in $\mathcal{F}_{s,t,\ell}$ then $S_{ij}(A)$ is too. Hence shifting has no effect on $\mathcal{F}_{s,t,\ell}$.

7c. This one definitely takes much more explanation, but I don't think it's actually tricky -- you just have to follow your nose all the way through. However, you do have to avoid the following two extremely common mistakes:

"$S_{ij}\mathcal{F}$ is given by taking $S_{ij}(A)$ for all $A \in \mathcal{F}$." -- Not true.

"For any $A' \in S_{ij}\mathcal{F}$, there is an $A \in \mathcal{F}$ such that $S_{ij}(A) = A'$." -- False false false! Consider $i = 1$, $j = 2$, $n = 2$, $\mathcal{F} = \{ \{1\}, \{2\} \}$. Then $S_{12} \mathcal{F} = \mathcal{F}$, and yet for $\{2\} \in S_{ij} \mathcal{F}$, there is no set $A \in \mathcal{F}$ such that $S_{12}(A) = \{2\}$!

Here is my two-part proof:

Part 1.
Suppose by way of contradiction that there are $s$ sets $B_1, ..., B_s$ in $S_{ij} \mathcal{F}$ whose intersection is of size less than $t$. Let $A_1, ..., A_s \in \mathcal{F}$ be the sets they "came from". So for each $k$, either $B_k = S_{ij}(A_k)$ or $B_k = A_k$, the latter occurring when $S_{ij}(A_k)$ is also in $\mathcal{F}$. Clearly, the $A_k$'s are all be distinct, and so the intersection of the $A_k$'s has size at least $t$.

How could the $A_k$'s have intersection at least $t$ and yet the $B_k$'s have intersection less than $t$? Certainly nothing changes on coordinates other than $i$ or $j$. It's also clear $S_{ij}$ can't cause coordinate $i$ to leave the intersection. This means that it must be the case that coordinate $j$ left the intersection. In particular, $j$ is in the intersection of the $A_k$'s and not in the intersection of the $B_k$'s.

Further, since we know that $j$ is in the intersection of the $A_k$'s, it follows that $i$ is not in the intersection of the $A_k$'s. Otherwise, we have both $i$ and $j$ in each $A_k$, which surely means that $i$ and $j$ are both in each $B_k$. But we've already established that $j$ is not in the intersection of the $B_k$'s.

Finally, since $i$ is not in the intersection of the $A_k$'s, and since shifting caused the intersection size to go down, $i$ had better not be in the intersection of the $B_k$'s.

To summarize, $j$ but not $i$ is in the intersection of the $A_k$'s, and neither $i$ nor $j$ is in the intersection of the $B_k$'s. Further, it must be the case that the $A_k$'s have intersection size exactly $t$.

Part 2.
Focus on the coordinates $i$ and $j$ in the $A_k$'s now. For each $A_k$ with a $0$ in the $i$ coordinate and a $1$ in the $j$ coordinate (and there is at least one such $A_k$), how come all those $1$'s didn't shift over into the $i$th coordinate, causing all the $B_k$'s to have $1$'s in their $i$th coordinate?

It must be because there is at least one $A_k$ -- say $A_1$ WOLOG -- such that $S_{ij}(A_1)$ was already in $\mathcal{F}$. So let's look at the intersection of the following $s$ distinct sets in $\mathcal{F}$: $S_{ij}(A_1), A_2, ..., A_s$. These $s$ sets must also have intersection size at least $t$. Now the intersection of these is very similar to the intersection of $A_1, ..., A_s$; however, $S_{ij}(A_j)$ does not have a $0$ in the $j$th coordinate. This means $j$ is not in the intersection of $S_{ij}(A_1), A_2, ..., A_s$.

But this is a risky situation, because we concluded in Part 1 that $A_1, ..., A_s$ had intersection size only $t$, and this included the fact that $j$ was in the intersection. The only way $S_{ij}(A_1), A_2, ..., A_s$ could still have intersection size $t$ is if miraculously, putting in $S_{ij}(A_1)$ causes $i$ to be in the intersection, where it wasn't before.

But the only way for this to happen is if $A_1$ was the only set among $A_1, ..., A_s$ with the pattern $0, 1$ in coordinates $i, j$. In this case, every other $A_k$ has the pattern $1, 1$, which certainly implies $B_k = A_k$. Further, we have $B_1 = A_1$ by assumption (since $S_{ij}(A_1)$ was already in $\mathcal{F}$). So in fact all $B_k$'s equal their $A_k$'s. This is a contradiction because in Part 1 we concluded that $j$ was in the intersection of the $A_k$'s but not in the intersection of the $B_k$'s.

Lecture 13: The Hardness Endgame

2008-02-26T17:17:00.007-05:00

Here's the final steps to the hardness of Priority Steiner tree. The details are in the paper On the Approximability of Network Design Problems, J. Chuzhoy, A. Gupta, J. Naor and A. Sinha, SODA 2006.

What we saw Today in Class. Recall that we assumed that for some $h$, we had set cover instances with universe of size $u$, $s$ sets, each set of size $z$, where

the Yes instances had a solution using at most $X$ sets, but
the No instances had the property that even picking $h\cdot X$ sets would leave a $2^{-c \cdot h}$-fraction of the universe uncovered. (Let us call this the $h$-factor property.)

The Yes instances mapped to instances of PST where the solution cost was at most $X$.

In the No instances, we proved a claim that if the total cost of edges bought in levels $1, 2, \ldots, j-1$ was at most $hX/2$, then the cost incurred at level $j$ itself would be at least $(u/4z)\cdot 2^{-ch}$. In particular, if $k$ were (at least) $(4zhX/u)\cdot 2^{ch}$, this claim implies that the total cost in the No instance is at least $hX/2$.

And hence we have a gap of $h/2$ between the Yes and the No instances.

Parameters (The Questions). So how do we set the parameters so that

the set cover instance satisfies the $h$-factor property.
The number of levels $k$ is at least $(4zhX/u) \cdot 2^{ch}$.
the size of the construction $N \approx (2z)^k \cdot s$ is reasonable.

Our Friend the Set Cover Instance. The construction of the set cover instance is the usual thing as discussed in Lecture 11 (or in Homework 3): note that we can start from a SAT instance of size $n$, and get a label cover instance with $n'$ nodes in each partition, each node participating in $d'$ constraints, label set $L'$, key set $K'$, where $n' = n^{- \log \eta}$, and $d', |L'|, |K'|$ are all $poly(1/\eta)$.

In turn, this gives us a set cover instance with universe size $u = n'd'2^{|K'|}$, number of sets $s = n'|L'|$, size of each set being $z = d' 2^{|K'| - 1}$, the optimal solution size is $X = 2n'$, and it satisfies the $h$-factor property above if $1/h^2 > O(\eta)$.

Parameters (The Answers). To answer the questions above:

the set cover instance satisfies the $h$-factor property as long as $h^2 < O(1/\eta)$. This is satisfied if we set $\eta \approx h \approx \Theta(\log \log n)$.
Note that the parameters derived above satisfy $zX/u = 4$, and hence $k = 16h 2^{ch} \approx \Theta(\log n)$ is sufficient.
the size of the construction is $N \approx s (2z)^k = n^{- \log \eta} \cdot poly(1/\eta)^{(2^h)} \approx n^{\log \log \log n}$.

Finally, the gap between the Yes and No instances is $h = \Omega(\log \log n) = \log \log N$ as well. This completes the proof.

Questions and comments about today's lecture go here, as usual.