Monday, January 14, 2008

Lecture 1: Definitions; greedy algorithm for Set-Cover & Max-Coverage

Any questions or comments about Lecture 1 can go here.


I also wanted to include just a little bit of my own opinion on why studying approximation algorithms is worthwhile.


The "usual" motivation given is the following: In practice we are faced with NP-hard problems, and we nevertheless want to do something besides just give up. This is definitely true. The motivation suggests that running an approximation algorithm is a good idea -- they have worst-case performance guarantees! This is partly true, but it's not the whole story.

As an example, at one point in late 2001, the best known approximation algorithm for "Uncapacitated Metric Facility Location" (sounds complicated, but it's actually a very natural problem studied in industry; we'll discuss it later in this class) was a 1.73-factor approximation [Guha-Charikar]. It used an unholy combination of linear programming, primal-dual methods, and greedy methods. It's doubtful anyone ever ran it. At the same time, there was a relatively simple, quadratic-time greedy algorithm achieving a 1.86-factor approximation [Mahdian-Markakis-Saberi-V.Vazirani]. One would be hard-pressed to say that the 1.73-factor algorithm was a better heuristic in practice for the problem. (The current best known algorithm is 1.52-factor [Mahdian-Ye-Zhang] and is combinatorial.)

On the other hand, take Set-Cover. Even though we know that the Greedy algorithm achieves only a somewhat weak guarantee -- a ln n approximation factor -- we would and do run it anyway.
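
For concreteness, here is a minimal sketch of that Greedy rule in Python -- repeatedly grab the set covering the most still-uncovered elements. The function name and input format are my own, just for illustration:

    def greedy_set_cover(universe, sets):
        # Greedy Set-Cover: repeatedly take the set covering the most
        # still-uncovered elements.  Gives an H_n <= ln(n) + 1 approximation.
        uncovered = set(universe)
        cover = []
        while uncovered:
            best = max(range(len(sets)), key=lambda i: len(uncovered & sets[i]))
            if not uncovered & sets[best]:
                raise ValueError("the sets do not cover the universe")
            cover.append(best)
            uncovered -= sets[best]
        return cover

    # Example: universe {1,...,7} and five candidate sets.
    print(greedy_set_cover(range(1, 8),
                           [{1, 2, 3}, {3, 4, 5, 6}, {6, 7}, {1, 4, 7}, {2, 5}]))
    # prints [1, 0, 2]: three sets, which for this instance happens to be optimal.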

So breakthroughs in the analysis of a problem's approximability don't necessarily help you out at all in practice.

(I should emphasize, though, that sometimes they do: for example, the Goemans-Williamson .878-factor Max-Cut algorithm has had a huge impact on practice, not because it is a .878-factor approximation algorithm, but because it gave powerful evidence in favor of an algorithmic technique (semidefinite programming), which is today a key component in practical Max-Cut algorithms.)
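
As a concrete illustration of that technique, here is a bare-bones sketch of the Goemans-Williamson approach in Python: solve the SDP relaxation (here via cvxpy and numpy), then round with a random hyperplane. The function name and graph format are my own, and this is nothing like the tuned implementations used in practice.

    import numpy as np
    import cvxpy as cp

    def gw_max_cut(n, edges, seed=0):
        # SDP relaxation: maximize sum over edges of (1 - X[i,j]) / 2
        # over PSD matrices X with unit diagonal.
        X = cp.Variable((n, n), PSD=True)
        objective = cp.Maximize(sum((1 - X[i, j]) / 2 for i, j in edges))
        cp.Problem(objective, [cp.diag(X) == 1]).solve()

        # Recover unit vectors v_1,...,v_n with <v_i, v_j> = X[i,j]
        # (an eigendecomposition stands in for a Cholesky factorization).
        w, V = np.linalg.eigh(X.value)
        vectors = V * np.sqrt(np.maximum(w, 0))

        # Random hyperplane rounding: the sign of <v_i, r> decides vertex i's side.
        r = np.random.default_rng(seed).standard_normal(n)
        side = vectors @ r >= 0
        return side, sum(1 for i, j in edges if side[i] != side[j])

    # Example: a 5-cycle, whose maximum cut has 4 edges.
    print(gw_max_cut(5, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]))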



Regardless of the "heuristics for practice" motivation, there are additional reasons to study approximability:

1. It helps you understand "hard instances": for example, what properties of Set-Cover instances make them hard? What properties make them easy? Studying approximation algorithms for the problem usually reveals this and helps you design algorithms for special cases. See also Luca Trevisan's opinion on the whys of approximation algorithms.

2. It tells you something about P vs. NP. (This is my personal reason for studying approximability.) Take Max-Cut, for example. We know that .878-approximating it is in P. We also know [Håstad-Trevisan-Sorkin-Sudan-Williamson] that .942-approximating it is NP-hard. What about the algorithmic problem of .9-approximating Max-Cut: is it in P or is it NP-hard? No one knows. In fact, experts have contradictory guesses. And this is for Max-Cut, which I'd argue is the simplest possible NP optimization problem. How can such a simple problem evade being classified as either in P or NP-hard? I find this an even more galling situation than the unclear status of Factoring (which most people at least guess is not in P) and Graph-Isomorphism (which most people at least guess is in P).
