## Tuesday, February 19, 2008

### Integrality Gap for Group Steiner

The integrality gap of $\Omega(\log^2 k)$ was proved in Integrality Ratio for Group Steiner Trees and Directed Steiner Trees, E. Halperin, G. Kortsarz, R. Krauthgamer, A. Srinivasan, and N. Wang. SIAM Journal on Computing, Vol. 36, 1494-1511, 2007. Here's the idea.
1. Create a balanced $\delta$-ary tree with height $H+1$ levels, where the edge costs decrease exponentially: edges at level $h$ cost $1/2^h$. (Hence edges incident to the root cost $1/2$, and edges incident to the leaves cost $1/2^{H}$.)

2. The arity of the tree is $\delta = c \log n$ for some constant $c$. So the height is $H = \log n/\log \delta$. The number of leaves is $n$, the total number of nodes is $n(1+o(1))$. The number of groups/colors/elements is $k = 2^{2H}$.

3. Do the following independently for each of the $k$ elements $x$. The element $x$ starts off at the root. For each of the $\delta$ child edges, it flips a fair coin independently: if the coin comes up heads, a copy of $x$ slides down that edge, where it begins such a random process anew. Note that the expected number of copies of $x$ at level $i$ is $(\delta/2)^i$. The group for $x$ is the set of all the leaves that contain a copy of $x$.
Small fractional solution. It is not that difficult to show that this instance has a fractional solution of value $O(H)$ with high probability (where the probability is over the random choice of groups above):
• Consider some node at level $i$ in this tree. Conditioned on this node getting a copy of $x$, the expected number of leaves containing $x$ below it is $\mu_h = (\delta/2)^{H-h}$. With high probability, the actual number lies in $(\mu_h/3, 3\mu_h)$. (Use Chernoff bounds!)

• For an edge $e$ going from level $i-1$ to level $i$, set $x_e = 9/\mu_e$ --- using the above facts, this is a feasible solution (whp) with cost $9H$.
No small integer solution. Next, we show that "with high probability, all integer solutions have cost $\Omega(H^2 \log k) = \Omega(H \log^2 k)$." This proves the $\Omega(\log^2 k)$ integrality gap for the problem.
• Since each leaf-edge costs $1/2^H$, no tree of cost $C$ can use more than $C2^H$ of these. There are $\delta^H$ leaf-edges, so the number of possible trees of cost $C$ is $[(\delta^H) choose (C2^H)] \leq \exp(CH2^H log \delta)$. Call this value $M(C)$.

• Suppose $Fail(C)$ is the probability that for a fixed tree of cost $C$, it misses the group for $x$ created above. By independence of groups, the probability that this fixed tree hits all the groups is at most $(1 - Fail(C))^k \leq \exp(-k \times Fail(C)) = Hitall(C)$.

• Hence, taking a trivial union bound over all trees, the chance that there exists some tree that hits all the groups is at most $M(C) \times Hitall(C)$.

• If this product is $o(1)$ for some $C$, then with probability $1 - o(1)$ no tree of cost $C$ is feasible. We would like to show this holds true for some constant $\epsilon > 0$ and $C = \epsilon H^2 \log k$.
Now for the most technical part.

One needs to prove that for some fixed tree of cost at most $\epsilon H^2 \log k$, and some $x$, none of the leafs in this tree are hit by the random process above. The proof strategy seems natural (in hindsight), but also somewhat delicate: details in Section 2.3.1 of the paper above. Question: can one get a slightly worse bound but without the case analysis at the end? That may be useful to know...

This analysis of the integrality gap (and an extension of it) is actually used for the proof of soundness in the hardness reduction we saw today! I will write/say some more about this later.