\documentclass[11pt]{article}
\usepackage{chao}
% \usepackage{natbib}
\title{Sparsest Cut}
\author{}
\date{}
\DeclareMathOperator*{\opt}{OPT}
\DeclareMathOperator*{\len}{len}
\newcommand{\scut}{\textsc{Sparsest Cut}}
\newcommand{\uscut}{\textsc{Uniform Sparsest Cut}}
\newcommand{\nonuscut}{\textsc{Non-Uniform Sparsest Cut}}
\newcommand{\expansion}{\textsc{Expansion}}
\begin{document}
\maketitle
\section{Introduction}
\scut{} is a fundamental problem in graph algorithms with connections to various other cut-related problems.
\begin{problem}[\nonuscut]
The input is a graph $G=(V,E)$ with edge capacities $c:E\to \R_+$ and a set of vertex pairs $\{s_1,t_1\},\dots,\{s_k,t_k\}$ along with demand values $D_1,\dots,D_k\in \R_+$. The goal is to find a cut $\delta(S)$ of $G$ such that $\frac{c(\delta(S))}{\sum_{i:|S\cap \set{s_i,t_i}|=1}D_i}$ is minimized.
\end{problem}
In other words, \nonuscut{} asks for the cut that minimizes its capacity divided by the total demand of the vertex pairs it separates. There are two important variants of \nonuscut{}. Note that we always consider unordered pairs $\{s_i,t_i\}$, i.e., we do not distinguish $\{s_i,t_i\}$ from $\{t_i,s_i\}$.
\uscut{} is the uniform version of \nonuscut{}: the demand is 1 for every possible vertex pair $\{s_i,t_i\}$, so the pairs and demands can be omitted from the input. The goal becomes to minimize $\frac{c(\delta(S))}{|S||V\setminus S|}$.
\expansion{} further simplifies the objective of \uscut{} to $\min_{|S|\leq n/2}\frac{c(\delta(S))}{|S|}$.
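The definitions can be checked by brute force on small instances. The following sketch enumerates all cuts of a small graph and reports the one of minimum uniform sparsity; the example graph and the helper name \texttt{uniform\_sparsest\_cut} are ours, chosen purely for illustration.
\begin{verbatim}
from itertools import combinations

def uniform_sparsest_cut(vertices, capacities):
    """Brute-force Uniform Sparsest Cut: minimize c(delta(S)) / (|S| * |V - S|).

    `capacities` maps an unordered edge frozenset({u, v}) to c(uv).
    Exponential in |V|; only meant to illustrate the definition.
    """
    n = len(vertices)
    best_val, best_set = float("inf"), None
    for k in range(1, n // 2 + 1):           # by symmetry |S| <= n/2 suffices
        for S in combinations(vertices, k):
            S = set(S)
            cut = sum(c for e, c in capacities.items()
                      if len(e & S) == 1)    # edges with exactly one endpoint in S
            val = cut / (len(S) * (n - len(S)))
            if val < best_val:
                best_val, best_set = val, frozenset(S)
    return best_val, best_set

# Example: a 4-cycle with unit capacities; the best cut separates two
# adjacent vertices from the other two and has sparsity 2 / (2 * 2) = 0.5.
edges = [(1, 2), (2, 3), (3, 4), (4, 1)]
caps = {frozenset(e): 1.0 for e in edges}
print(uniform_sparsest_cut([1, 2, 3, 4], caps))
\end{verbatim}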
% \subsection{importance and connections}
These problems are interesting since they are related to central concepts in graph theory and help in designing algorithms for hard graph problems. One connection is to expander graphs, whose importance is thoroughly surveyed in \cite{hoory_expander_2006}. The optimum of \expansion{} is also known as the Cheeger constant (or conductance) of a graph. \uscut{} provides a 2-approximation of \expansion{}, which is especially relevant in the context of expander graphs since it measures the edge expansion of a graph. \nonuscut{} is related to other cut problems such as Multicut and Balanced Separator.
From a more mathematical perspective, the techniques developed for approximating \scut{} are deeply related to metric embeddings, a fundamental topic in geometry.
Besides its theoretical interest, \scut{} is useful in practical scenarios such as image segmentation and some machine learning algorithms.
\subsection{Related work}
\nonuscut{} is APX-hard \cite{juliaJACMapxhard} and, assuming the Unique Games Conjecture, admits no polynomial-time constant-factor approximation algorithm \cite{chawla_hardness_2005}. \uscut{} admits no PTAS \cite{uniformhardnessFocs07}, assuming that NP-complete problems cannot be solved in randomized subexponential time. The currently best approximation algorithm for \uscut{} has ratio $O(\sqrt{\log n})$ and running time $\tilde{O}(n^2)$ \cite{arora_osqrtlogn_2010}. %
% Prior to this currently optimal result, there is a long line of research optimizing both the approximation ratio and the complexity, see \cite{arora_expander_2004,leighton_multicommodity_1999}.
For \nonuscut{}, the best known approximation ratio is $O(\sqrt{\log n}\log\log n)$ \cite{arora_euclidean_2005,arora_frechet_2007}.
There are also works on approximating \scut{} for special graph classes such as planar graphs \cite{lee_genus_2010} and graphs with low treewidth \cite{chlamtac_approximating_2010,gupta2013sparsestcutboundedtreewidth, Chalermsook_2024}.
The seminal work of \cite{leighton_multicommodity_1988,leighton_multicommodity_1999} started this line of research. They studied the multicommodity flow problem and proved an $O(\log n)$ flow-cut gap for \uscut{}.
% (the tight $\Theta(\log n)$ gap for \nonuscut{} was proven by \cite{garg_approximate_1996}).
Using a technique called region growing, they developed an $O(\log n)$ approximation algorithm for \uscut{}. They also established an $\Omega(\log n)$ lower bound on the gap.
Note that the flow-cut gap is the ratio of the minimum sparsity of a cut to the maximum concurrent flow. \cite{garg_approximate_1996} studied the analogous gap between min Multicut and max multicommodity flow, which is also $\Theta(\log n)$. The result of Garg, Vazirani and Yannakakis \cite{garg_approximate_1996} provides an $O(\log n)$ approximation algorithm for Multicut, which implies an $O(\log^2 n)$ approximation for \nonuscut{}.
Although \cite{leighton_multicommodity_1999} showed an $\Omega(\log n)$ lower bound on the flow-cut gap, better approximations for \scut{} are still possible through other methods.
For \nonuscut{}, the $O(\log^2 n)$ approximation was further improved by \cite{Linial_London_Rabinovich_1995} and \cite{lognGapAumann98}. \cite{lognGapAumann98} applied metric embeddings to \nonuscut{} and obtained an $O(\log n)$ flow-cut gap as well as an $O(\log n)$ approximation algorithm for \nonuscut{}. The connection between metric embeddings and \nonuscut{} has been influential. \nonuscut{} can be formulated as an integer program, and \cite{lognGapAumann98}, \cite{aumann_rabani_1995} and \cite{Linial_London_Rabinovich_1995} considered the metric relaxation of this IP. They observed that the relaxation is exact on trees and, more generally, whenever its optimal metric is an $\ell_1$ metric. The $O(\log n)$ gap then follows from the $O(\log n)$ distortion in Bourgain's metric embedding theorem.
\cite{arora_expander_2004} and \cite{arora_osqrtlogn_2010} further improved the approximation ratio for \uscut{} to $O(\sqrt{\log n})$ via a semidefinite relaxation. This is currently the best known approximation ratio for \uscut{} on general undirected graphs.
For \nonuscut{}, the approximation ratio was improved to $O(\sqrt{\log n} \log \log n)$ \cite{arora_euclidean_2005,arora_frechet_2007}. Later, \cite{guruswami_approximating_2013} gave a $\frac{1+\delta}{\e}$ approximation running in time $2^{r/(\delta \e)}\poly(n)$, provided that $\lambda_r\geq \opt / (1-\delta)$.
There is also plenty of research on \scut{} restricted to special graph classes, for example \cite{bonsma_complexity_2012}. One of the most studied classes is graphs of bounded treewidth. \cite{Chalermsook_2024} gave an $O(k^2)$ approximation algorithm running in time $2^{O(k)}\poly(n)$ on treewidth-$k$ graphs. \cite{Cohen-Addad_Mömke_Verdugo_2024} obtained a 2-approximation algorithm for \scut{} on treewidth-$k$ graphs with running time $2^{2^{O(k)}}\poly(n)$.
\scut{} is easy on trees, and the flow-cut gap is 1 for trees. One explanation\footnote{\url{https://courses.grainger.illinois.edu/cs598csc/fa2024/Notes/lec-sparsest-cut.pdf}} is that the shortest path metric of a tree is an $\ell_1$ metric. There are also works on planar graphs and, more generally, graphs of bounded genus.
For general graphs, \cite{leighton_multicommodity_1999} provided an $\Omega(\log n)$ lower bound on the flow-cut gap for \scut{}. For planar graphs, however, the gap is conjectured to be $O(1)$, while the best known upper bound is still $O(\sqrt{\log n})$ \cite{rao_small_1999}.
For graphs of bounded genus, \cite{lee_genus_2010} gives an $O(\sqrt{\log g})$ approximation for \scut{}, where $g$ is the genus of the input graph. For the flow-cut gap in planar graphs, the techniques are mainly based on metric embedding theory\footnote{\url{https://home.ttic.edu/~harry/teaching/teaching.html}}.
\section{Approximations}
This section surveys techniques for approximating \scut{}.
\subsection{LP $\Theta(\log n)$ - \nonuscut{}}
\begin{minipage}{0.47\linewidth}
\begin{equation}\label{IP}
\begin{aligned}
\min& & \frac{\sum_e c_e x_e}{\sum_{i} D_i y_i}& & &\\
s.t.& & \sum_{e\in p} x_e&\geq y_i & &\forall p\in \mathcal{P}_{s_i,t_i}, \forall i\\
& & x_e,y_i&\in \{0,1\}
\end{aligned}
\end{equation}
\end{minipage}
\begin{minipage}{0.47\linewidth}
\begin{equation}\label{LP}
\begin{aligned}
\min& & \sum_e c_e x_e& & &\\
s.t.& & \sum_i D_iy_i&=1 & &\\
& & \sum_{e\in p} x_e&\geq y_i & &\forall p\in \mathcal{P}_{s_i,t_i}, \forall i\\
& & x_e,y_i&\geq 0
\end{aligned}
\end{equation}
\end{minipage}
\bigskip
\begin{minipage}{0.47\linewidth}
\begin{equation}\label{dual}
\begin{aligned}
\max& & \lambda& & &\\
s.t.& & \sum_{p\in\mathcal{P}_{s_i,t_i}} y_p&\geq \lambda D_i & &\forall i\\
& & \sum_i \sum_{p\in \mathcal{P}_{s_i,t_i}, p\ni e}y_p&\leq c_e & &\forall e\\
& & y_p&\geq 0
\end{aligned}
\end{equation}
\end{minipage}
\begin{minipage}{0.47\linewidth}
\begin{equation}\label{metric}
\begin{aligned}
\min& & \sum_{uv\in E} c_{uv}d(u,v)& & &\\
s.t.& & \sum_i D_i d(s_i,t_i)&=1 & &\\
& & \text{$d$ is a metric on $V$}
\end{aligned}
\end{equation}
\end{minipage}
\newcommand{\lp}{\texttt{LP\ref{LP}}}
\newcommand{\ip}{\texttt{IP\ref{IP}}}
\newcommand{\dual}{\texttt{LP\ref{dual}}}
\newcommand{\metric}{\texttt{LP\ref{metric}}}
\begin{enumerate}
\item \ip{} $\geq$ \lp{}. Given any feasible solution to \ip{}, we can scale all $x_e$ and $y_i$ simultaneously by the factor $1/\sum_i D_i y_i$. The scaled solution is feasible for \lp{} and achieves the same objective value.
\item \lp{} $=$ \dual{} by LP duality.
\item \metric{} $=$ \lp{}. It is easy to see that \metric{} $\geq$ \lp{}, since any metric feasible for \metric{} induces a feasible solution to \lp{} (set $x_{uv}=d(u,v)$ and $y_i=d(s_i,t_i)$). Conversely, an optimal solution to \lp{} also induces a feasible metric. Consider a solution $x_e,y_i$ to \lp{} and let $d_x$ be the shortest path metric on $V$ with edge lengths $x_e$. It suffices to show that $y_i=d_x(s_i,t_i)$. This can be seen from a reformulation of \lp{}: drop the constraint $\sum_i D_i y_i=1$ and minimize $\sum_e c_e x_e / \sum_i D_i y_i$ instead; this does not change the optimum. Now suppose that in an optimal solution some $y_i$ is strictly smaller than $d_x(s_i,t_i)$. Raising $y_i$ to $d_x(s_i,t_i)$ keeps the solution feasible and increases the denominator $\sum_i D_i y_i$, contradicting the optimality of $x_e,y_i$. (A small computational sketch of solving \metric{} is given after this list.)
\end{enumerate}
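Since \metric{} has one variable per vertex pair and polynomially many constraints, it can be solved directly by an LP solver. The following sketch is illustrative only: the instance, the variable indexing, and the helper name \texttt{solve\_metric\_lp} are ours, and it assumes \texttt{scipy} is available. It encodes all triangle inequalities explicitly, which is fine for small $n$.
\begin{verbatim}
from itertools import combinations, permutations
import numpy as np
from scipy.optimize import linprog

def solve_metric_lp(n, capacities, demands):
    """Solve the metric relaxation: minimize sum_{uv in E} c_uv d(u,v)
    subject to sum_i D_i d(s_i,t_i) = 1 and d being a metric on {0,...,n-1}.

    capacities: dict {(u, v): c_uv}; demands: dict {(s, t): D_i}.
    One LP variable per unordered pair, indexed by `idx`.
    """
    pairs = list(combinations(range(n), 2))
    idx = {frozenset(p): k for k, p in enumerate(pairs)}
    m = len(pairs)

    # Objective: sum of c_uv * d(u, v) over the edges.
    obj = np.zeros(m)
    for (u, v), cap in capacities.items():
        obj[idx[frozenset((u, v))]] += cap

    # Triangle inequalities d(u,w) - d(u,v) - d(v,w) <= 0 for all triples.
    rows = []
    for u, v, w in permutations(range(n), 3):
        row = np.zeros(m)
        row[idx[frozenset((u, w))]] += 1.0
        row[idx[frozenset((u, v))]] -= 1.0
        row[idx[frozenset((v, w))]] -= 1.0
        rows.append(row)

    # Normalization: sum_i D_i d(s_i, t_i) = 1.
    a_eq = np.zeros(m)
    for (s, t), D in demands.items():
        a_eq[idx[frozenset((s, t))]] += D

    res = linprog(obj, A_ub=np.array(rows), b_ub=np.zeros(len(rows)),
                  A_eq=a_eq.reshape(1, -1), b_eq=[1.0], bounds=(0, None))
    return res.fun, res.x   # res.fun lower-bounds the optimal sparsity

# Example: a 4-cycle with unit capacities and uniform demands.
caps = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (3, 0): 1.0}
dems = {p: 1.0 for p in combinations(range(4), 2)}
print(solve_metric_lp(4, caps, dems)[0])
\end{verbatim}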
\begin{theorem}[Japanese Theorem]
Let $D$ be a demand matrix. Then $D$ is routable in $G$ if and only if for every length function $l:E\to \R_+$ we have $\sum_e c_e l(e)\geq \sum_{uv} D(u,v) d_l(u,v)$, where $d_l$ is the shortest path distance induced by the edge lengths $l(e)$.
\end{theorem}
Note that $D$ is routable iff the optimum of \dual{} (equivalently, of \lp{} and \metric{}) is at least 1. The theorem then follows directly from \metric{}: every length function $l$ induces the metric $d_l$, and every metric is induced by some length function.
\paragraph{$\Theta(\log n)$ flow-cut gap}
The flow-cut gap is defined as $\opt(\ip{})/\opt(\lp{})$, and the $\Theta(\log n)$ bound was proven in \cite{leighton_multicommodity_1999}.
Suppose that $G$ satisfies the cut condition, that is, $c(\delta(S))$ is at least the demand separated by $\delta(S)$ for every $S\subset V$. This implies $\opt(\ip{})\geq 1$; after scaling the demands so that $\opt(\ip{})=1$ (which does not change the gap), the flow-cut gap equals $1/\opt(\lp{})$.
For the 1-commodity and 2-commodity flow problems the gap is 1 \cite{Ford_Fulkerson_1956,Hu_1963}.
However, for $k\geq 3$ commodities the gap can be larger\footnote{\url{https://en.wikipedia.org/wiki/Approximate_max-flow_min-cut_theorem}}.
It is mentioned in \cite{leighton_multicommodity_1999} that \cite{schrijver_homotopic_1990} proved that if the demand graph contains neither three disjoint edges nor a triangle together with a disjoint edge, then the gap is 1.
For the $\Omega(\log n)$ lower bound, consider a \uscut{} instance on a 3-regular graph $G$ with unit capacities. In \cite{leighton_multicommodity_1999} it is further required that $G$ is an expander: there is a small constant $c$ such that $|\delta(S)|\geq c \min(|S|,|\bar S|)$ for every $S\subset V$. Then the value of the sparsest cut is at least $\frac{c}{n-1}$. Observe that for any fixed vertex $v$, there are at most $n/2$ vertices within distance $\log n-3$ of $v$. Thus at least half of the $\binom{n}{2}$ demand pairs are at distance at least $\log n-2$ from each other. If every pair concurrently routes $f$ units of flow, the total flow volume is at least $\frac{1}{2}\binom{n}{2}(\log n -2)f$, while the total capacity is $|E|=3n/2$. Hence any feasible concurrent flow satisfies $f\leq \frac{3n}{\binom{n}{2}(\log n -2)}$, and the gap is therefore $\Omega(\log n)$.
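Putting the two bounds together (a quick sanity check of the constants, not part of the original argument):
\[
\frac{\opt(\ip{})}{\opt(\lp{})}
\;\geq\; \frac{c/(n-1)}{3n\big/\big(\binom{n}{2}(\log n-2)\big)}
\;=\; \frac{c\binom{n}{2}(\log n-2)}{3n(n-1)}
\;=\; \frac{c}{6}(\log n-2)
\;=\;\Omega(\log n).
\]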
For the upper bound, it suffices to show that there exists a cut of sparsity $O(f\log n)$, where $f$ is the maximum concurrent flow value.
\cite{leighton_multicommodity_1999} gave an algorithmic proof based on \metric{}. This can also be proven using metric embedding results.
We can solve \metric{} in polynomial time and obtain a metric on $V$. By Bourgain's theorem, there is an embedding of $V$ into $\R^d$ equipped with the $\ell_1$ metric whose distortion is $O(\log n)$.
Since every $\ell_1$ metric lies in the cut cone, the embedded metric is a conic combination of cut metrics, which implies\footnote{This requires some work. See \url{https://courses.grainger.illinois.edu/cs598csc/fa2024/Notes/lec-sparsest-cut.pdf}} that some cut in the conic combination has sparsity at most $O(\log n)\opt(\metric{})$.
To find such a cut it suffices to compute a conic combination of cut metrics that equals our $\ell_1$ metric on $\R^d$. One way to do this is to test $(n-1)d$ threshold cuts (see the sketch after the list below), based on the following observations:
\begin{enumerate}
\item Every coordinate of $\R^d$ corresponds to a line metric;
\item The $\ell_1$ metric on $\R^d$ is the sum of these line metrics;
\item Every line metric on $n$ points can be represented as some conic combination of $n-1$ cut metrics.
\end{enumerate}
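The following sketch makes the last observation concrete: for each coordinate, sorting the points and taking the $n-1$ threshold (sweep) cuts expresses the corresponding line metric as a conic combination of cut metrics, so it suffices to evaluate these $(n-1)d$ cuts and return the sparsest one. The embedding is assumed to be given as an $n\times d$ array; the function name \texttt{best\_threshold\_cut} and the uniform-demand objective are our own illustrative choices.
\begin{verbatim}
import numpy as np

def best_threshold_cut(embedding, capacities):
    """Given an n x d array of l1-embedded points and edge capacities
    {(u, v): c_uv}, evaluate the (n-1)*d threshold cuts and return the one
    of smallest uniform sparsity c(delta(S)) / (|S| * |V - S|)."""
    n, d = embedding.shape

    def sparsity(S):
        S = set(S)
        cut = sum(c for (u, v), c in capacities.items() if (u in S) != (v in S))
        return cut / (len(S) * (n - len(S)))

    best_val, best_set = float("inf"), None
    for coord in range(d):
        order = np.argsort(embedding[:, coord])
        for k in range(1, n):                # cut after the k-th smallest point
            S = order[:k].tolist()
            val = sparsity(S)
            if val < best_val:
                best_val, best_set = val, S
    return best_val, best_set
\end{verbatim}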
The gap can be improved to $O(\log k)$ via a stronger metric embedding theorem, where $k$ is the number of demand pairs.
\begin{remark}
I believe the latter method is more general and works for \nonuscut{}, while the former method is limited to \uscut{}. However, the proof in \cite{leighton_multicommodity_1999} may have connections with the proof of Bourgain's theorem? Why does their method fail to work on \nonuscut{}?
\end{remark}
\subsection{SDP $O(\sqrt{\log n})$ - \uscut{}}
This $O(\sqrt{\log n})$ approximation via SDP was developed in \cite{arora_expander_2004}; it is also described in \cite[Section 15.4]{Williamson_Shmoys_2011}.
\begin{equation*}
\begin{aligned}
\min& & \frac{\sum_{ij\in E}c_{ij}(x_i-x_j)^2}{\sum_{ij\in V\times V}(x_i-x_j)^2}& & &\\
s.t.& & (x_i-x_j)^2 + (x_j-x_k)^2&\geq (x_i-x_k)^2 & &\forall i,j,k\in V\\
& & x_i&\in \{+1,-1\} & &\forall i \in V
\end{aligned}
\end{equation*}
This quadratic program models \uscut{}: every assignment of $x$ corresponds to a cut $\delta(S)$ with $S=\{i: x_i=1\}$, and the objective is the sparsity of that cut up to a constant factor, since $\sum_{ij\in E}c_{ij}(x_i-x_j)^2=4c(\delta(S))$ and $\sum_{ij\in V\times V}(x_i-x_j)^2=8|S||V\setminus S|$ (counting ordered pairs). The constant does not affect the asymptotic approximation ratio. Now we consider a relaxation, which plays a role similar to \lp{}.
\begin{equation*}
\begin{aligned}
\min& & \sum_{ij\in E}c_{ij}\|v_i-v_j\|^2& & &\\
s.t.& & \sum_{ij\in V\times V}\|v_i-v_j\|^2&=1 & &\\
& & \|v_i-v_j\|^2 + \|v_j-v_k\|^2&\geq \|v_i-v_k\|^2 & &\forall i,j,k\in V\\
& & v_i&\in \R^n & &\forall i \in V
\end{aligned}
\end{equation*}
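For completeness, here is a small numerical sketch of this relaxation. It works with the Gram matrix $X$, $X_{ij}=v_i\cdot v_j$, so that $\|v_i-v_j\|^2=X_{ii}+X_{jj}-2X_{ij}$, and lists all $\ell_2^2$ triangle inequalities explicitly, which is only practical for small $n$. It assumes \texttt{cvxpy} with an SDP-capable solver (e.g.\ SCS) is installed; the instance and the name \texttt{solve\_sdp\_relaxation} are illustrative.
\begin{verbatim}
from itertools import permutations
import cvxpy as cp
import numpy as np

def solve_sdp_relaxation(n, capacities):
    """Solve the vector-program relaxation of Uniform Sparsest Cut via its
    Gram-matrix formulation. capacities: dict {(u, v): c_uv}."""
    X = cp.Variable((n, n), PSD=True)        # X[i, j] = <v_i, v_j>

    def sqdist(i, j):                        # ||v_i - v_j||^2
        return X[i, i] + X[j, j] - 2 * X[i, j]

    constraints = [sum(sqdist(i, j) for i in range(n) for j in range(n)) == 1]
    # l2-squared triangle inequalities for all triples of distinct vertices.
    for i, j, k in permutations(range(n), 3):
        constraints.append(sqdist(i, j) + sqdist(j, k) >= sqdist(i, k))

    obj = cp.Minimize(sum(c * sqdist(u, v) for (u, v), c in capacities.items()))
    prob = cp.Problem(obj, constraints)
    prob.solve()

    # Recover embedding vectors (rows) from the PSD Gram matrix.
    w, U = np.linalg.eigh(X.value)
    vectors = U * np.sqrt(np.clip(w, 0, None))
    return prob.value, vectors

# Example: a 4-cycle with unit capacities.
caps = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0, (3, 0): 1.0}
print(solve_sdp_relaxation(4, caps)[0])
\end{verbatim}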
To get an $O(\sqrt{\log n})$ (randomized) approximation algorithm, we first solve the SDP and then round the solution to a cut $\delta(S)$ with $c(\delta(S))\leq O(n\sqrt{\log n})\,|S|\opt(SDP)$. Suppose we can find two sets $S,T\subset V$, both of size $\Omega(n)$, that are well-separated in the sense that $\|v_s-v_t\|^2=\Omega(1/\sqrt{\log n})$ for every $s\in S$ and $t\in T$. Then, schematically,
\[
\frac{c(\delta(S))}{|S||V-S|}
\leq n|S| \frac{\sum_{ij\in E} c_{ij}\|v_i-v_j\|^2}{\sum_{i\in S,j\in T} \|v_i-v_j\|^2}
\leq |S| \frac{\sum_{ij\in E} c_{ij}\|v_i-v_j\|^2}{n} O(\sqrt{\log n})
\leq O(\sqrt{\log n}) \opt(SDP).
\]
This is the framework of the proof in \cite{arora_expander_2004}; the main technical ingredient is showing that such well-separated sets exist and can be found for the SDP solution.
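The first inequality above hides a region-growing step; we sketch one standard way to carry it out (this is our paraphrase under the separation assumption, not the exact argument of \cite{arora_expander_2004}). The triangle inequality constraints make $d(i,j)=\|v_i-v_j\|^2$ a metric, so for well-separated $S,T$ at distance $\Delta$ one can sweep over $\rho\in(0,\Delta)$ and consider $S_\rho=\{i\in V: d(i,S)\leq \rho\}$:
\[
\int_0^{\Delta} c(\delta(S_\rho))\,\mathrm{d}\rho
\;\leq\; \sum_{ij\in E} c_{ij}\bigl|d(i,S)-d(j,S)\bigr|
\;\leq\; \sum_{ij\in E} c_{ij}\|v_i-v_j\|^2.
\]
Hence some $\rho\in(0,\Delta)$ gives $c(\delta(S_\rho))\leq \frac{1}{\Delta}\sum_{ij\in E}c_{ij}\|v_i-v_j\|^2$, while $S\subseteq S_\rho$ and $T\cap S_\rho=\emptyset$ guarantee $|S_\rho|,|V\setminus S_\rho|=\Omega(n)$.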
\bibliographystyle{alpha}
\bibliography{ref}
\end{document}