15-854B: Advanced Approximation Algorithms: April 2008

Wednesday, April 30, 2008

Another clarification on HW6

For 2a, it should say "expressed as a sum of products of up to k of f's Fourier coefficients".

Thanks to Jonah for pointing this out.

Tuesday, April 29, 2008

Ok, yeah, part (b) was super-vague, sorry about that. Here is a more precise question, I think.

Consider the matrix $M_2(y)$; if this is psd, then we know, e.g., that $y_{abcd}$ can be written as the dot product of vectors $z_{ab}$ and $z_{cd}$, and also as the dot products of vectors $z_{bc}$ and $z_{ad}$ etc. Similarly $y_{ab}$ can be written as $z_{a}$ dot $z_b$, and also as $z_{ab}$ dot $z_\emptyset$. Hence we now have these vectors $z$'s, each subscripted by at most two vertices, with some constraints relating them.

The goal of the problem is to show that the SDP $Q_1(K_{VC})$ is at least as strong as a natural vertex cover SDP, which assigns a vector $v_i$ to each node $i$ to minimize $\sum_i v_i \cdot v_i$ subject to

$(v_\emptyset - v_a)(v_\emptyset - v_b) = 0$ for all edges $ab$.
$v_\emptyset \cdot v_\emptyset = 1$ (say).

Can you show that the SDP you've created implies these constraints?

Does your SDP also imply the triangle inequalities:

$||v_a - v_b||^2 + ||v_b - v_c||^2 \geq ||v_a - v_c||^2$?

If things are still vague (or if you see some mistakes above), please email/post yr questions.

Monday, April 28, 2008

Course evaluation time...

Hi all: when you get a chance, could you go to https://my.cmu.edu/site/main/page.academics and give yr feedback on the course. thanks!

Sunday, April 27, 2008

Integrality Gaps for Sparsest Cut

A quick comment about the sparsest cut integrality gaps we saw last time, and the connection to the dual. (BTW, the notes for Lec 27 are up, thanks Varun!)

In Lecture 27, the arguments for the gaps went thus: at least $k$ of the demands need to travel through $p$ edges in the network, hence sending $\tau$ fraction of each unit demand uses $\tau \cdot k \cdot p$ volume. And since we dealt with unit capacities, there was only $m$ volume in the network, hence the maximum concurrent flow is at most $m/(pk)$. For the 5-node example, $p=2$, $k=4$ and $m=6$; for the expander, $k = \Omega(n^2)$, $p = \Omega(\log n)$, and $m = O(n)$.

But suppose we are more sophisticated:

Suppose we assign lengths $d_e$ to each edge $e$, so that the shortest path according to these edge lengths from $i$ to $j$ is now $d_{ij}$. Think of each of the edges $(i,j)$ as pipes with capacity $C_{ij}$ and length $d_{ij}$.
Now if the pair $ij$ sends $\tau D_{ij}$ flow, each unit of flow must travel $d_{ij}$ distance. Hence the total volume used by this pair is $\tau D_{ij} \cdot d_{ij}$. And the total volume used is $\sum_{ij} \tau \cdot D_{ij} \cdot d_{ij}$.
But the total volume in the network is now $\sum_{(i,j) \in E} C_{ij} d_{ij}$.

Hence, the maximum concurrent flow $\tau^*$ is at most $\sum_{(i,j) \in E} C_{ij} d_{ij}$ divided by $\sum_{ij} \tau \cdot D_{ij} \cdot d_{ij}$. Moreover, this is true for any setting of the edge lengths $d_e$. Hence, $\tau$ is at most the minimum of $(C.d)/(D.d)$ over all metrics $d$, or in other words, $\tau \leq \lambda^*$!

What we have just proved is weak LP-duality for the case of sparsest cut. The fact that $\tau^*$ equals $\lambda^*$ is given by strong duality.

Monday, April 21, 2008

Hardness of Independent Set

I wanted to say a few more things about the hardness of Max-Independent-Set (equivalently, Min-Vertex-Cover). Recall that in Lecture 25, we sketched the proof of factor $1/n^{1-\epsilon}$ NP-hardness for this problem; specifically, Hastad's result shows that distinguishing $1/n^{\epsilon}$ vs. $1/n^{1-\epsilon}$ is NP-hard for all $\epsilon > 0$.

This is a great hardness factor, but (as they used to say) "the gap is in the wrong place" -- or at least, it's not in the strongest possible place. It is of great interest to try to obtain a hardness result of the form $\Omega(1)$ vs. $o(1)$. This will give a worse "factor", but the result is essentially stronger. Note that we can't hope for $\Omega(1)$ vs. $1/n^{1-\epsilon}$ NP-hardness; Boppana and Halldorsson showed, roughly, that $1/k$ vs. $1/n^{(k-2)/(k-1)}$ is in P (using Ramsey theory) and this was improved by Alon and Kahale to, roughly, $1/k$ vs. $1/n^{(k-2)/(k+1)}$, using the Karger-Motwani-Sudan SDP coloring algorithm.

One reason hardness results like $\Omega(1)$ vs. "small" are better is that they better "factor" hardness for Min-Vertex-Cover. Recall that, trivially, a $c$ vs. $s$ hardness result for Min-Independent-Set is a $1-c$ vs. $1 - s$ hardness result for Max-Vertex-Cover, and thus a $(1-s)/(1-c)$-factor hardness for Max-Vertex-Cover. For example, Hastad's hardness gives only $1+o(1)$ factor hardness for Vertex-Cover, but a result like $1/3$ vs. $o(1)$ hardness for Independent-Set would give factor $3/2 - o(1)$ hardness for Vertex-Cover.

For a long time, the best NP-hardness result like this we knew for Independent-Set came simply from taking Hastad's $1 - \epsilon$ vs. $1/2 + \epsilon$ hardness result for Max-E3Lin and applying the FGLSS reduction to it. You should (easily) check for yourself that this gives $1/4 - \epsilon$ vs. $1/8 + \epsilon$ NP-hardness for Independent-Set. Hence it gives $3/4 + \epsilon$ vs. $7/8 - \epsilon$ NP-hardness for Vertex-Cover; i.e., factor $7/6 + \epsilon$.

In 2002, in the same conference as Khot's Unique Games Conjecture paper, Dinur and Safra published a stronger NP-hardness result for Independent-Set. This was a pretty amazing paper, because the outline of the proof was:

a) Prove a super-weak variant of the Unique Games Conjecture;
b) Do a very difficult Dictator Test based on the combinatorics of $2$-intersecting families.

Part (a), as mentioned, was done contemporaneously with the UGC paper, while part (b) was done before the work on Ek-Independent-Set described in our Lectures 8 and 9.

I like to remember their result as $1/3$ vs. $1/9$ NP-hardness for Independent-Set (omitting the $\epsilon$'s), and hence factor $(8/9)/(2/3) = 4/3$ hardness for Vertex-Cover. In fact, they pushed it a little further. Specifically, for $p = (3 - \sqrt{5})/2$, they showed $p$ vs. $4p^3 - 3p^4$ NP-hardness for Independent-Set. This is, like, .382 vs. .159. This yields factor 1.36 or so NP-hardness for Vertex-Cover.

That stands as the best NP-hardness result known. Relatively soon after the UGC paper, Khot and Regev showed showed that assuming UGC, $1/2 - \epsilon$ vs. $\epsilon$ hardness holds for Min-Independent-Set, and thus factor $2-\epsilon$ hardness for Vertex-Cover. Indeed, they showed UGC-hardness of $1/k - \epsilon$ vs. $\epsilon$ for Min-Ek-Independent-Set.

End of Lecture 25

Here is the end of the proof I didn't get a chance to finish from Lecture 25.

Homework 5 graded

Homework 5 is graded and will be handed back tomorrow. The average was 79%.

A few comments:

1. Most people got this right (one way or another).

2. For part b), the two equally simplest counterexamples are {$x_1$, $x_1$, $1$}, or {$x_1$, $x_1$, $x_1$}. Aaron, Kanat, and Mike all gave the same, far-from-simplest counterexample. What a coincidence...

3. Many people failed to prove a characterization (if and only if statement) in c), and just gave an implication. Many people also had complicated solutions for the support-3 problem. Ali and Amitabh gave roughly the same solution, which is probably the slickest. Suppose f = a $\chi_S$ + $b \chi_T$ + $c \chi_U$, where the coefficients $a, b, c$ are nonzero and $S$, $T$, $U$ are distinct. By Parseval,

$a^2 + b^2 + c^2 = 1$ (*).

Substituting in $(1, ..., 1)$, we get $a + b + c \in {-1,1}$. Squaring this and subtracting (*) yields

$ab + bc + ac = 0$ (**).

Since $S$, $T$, $U$ are distinct, there must be an index $i$ which is in exactly 1 or 2 of them. Substituting in the string which is 1 everywhere except that it's $-1$ on the $i$th coordinate, we conclude $\sigma_1 a + \sigma_2 b + \sigma_3 c \in {-1,1}$, where exactly 1 or 2 of the signs $\sigma_i$ are $-1$. Squaring this and subtracting (*) yields

$\tau_1 ab + \tau_2 bc + \tau_3 ac = 0$ (***),

where exactly two of the signs $\tau_i$ are $-1$. Now add (**) and (***) and conclude that one of $ab$, $bc$, or $ac$ must equal 0, a contradiction.

4. Most people got (a) and (b). For (c), several people estimated the degree-1 Fourier coefficients of Majority and concluded that it's, say, $(1/n, 1/2)$-quasirandom. Only Eric truly ventured to high degree, pointing out that $Inf_i(Maj)$ can be seen to be $\Theta(1/\sqrt{n})$, and therefore by 5a, Maj is $(\Theta(1/\sqrt{n}),0)$-quasirandom. In fact, one can show that the largest Fourier coefficients for Maj are the ones at level $1$, and hence Maj is $(1/n, 0)$-quasirandom.

5. Most people more or less got this.

6. Most people mostly got this; the key in b) is picking the right diameter.

7. Most people got this too, although many failed to do both parts of f).

Tuesday, April 15, 2008

Homework 6 out

Homework 6 is out: it's due May 1st.

Monday, April 14, 2008

Lecture 24: UGC-Hardness of Max-Cut

The completion of the proof of the reduction from Unique Label-Cover to Max-Cut can be found here.

Thursday, April 3, 2008

Homework 4

A high-scoring one: the average was 88. Notes and comments soon.

15-854B: Advanced Approximation Algorithms