Graph sparsification

graphs
ML
Removing edges while maintaining connectivity
Author

Akshay Balsubramani

Modified

September 12, 2022

Introduction: thinning out edges in a graph

Approximating a given graph by a graph with fewer edges is called sparsification. It turns out that every weighted undirected graph can be efficiently sparsified into a graph that is “close” to it on all scales, fine and coarse. This is done by a remarkable algorithm, whose principles are so simple and elegant that I couldn’t resist trying it myself on some gene network data.

Since I found a scarcity of well-documented modular Python implementations of this algorithm1, I implemented a version in this notebook that directly uses the ideas of the seminal work that introduced the algorithm (Spielman and Srivastava 2011). The version we’ll discuss here scales easily on a laptop CPU to graphs with tens of thousands of vertices, and tens of millions of edges.

Let’s get started.

What do we get from sparsification?

A sparsification method operates on an undirected weighted graph \(G\). It creates a sparsifier \(H\) – a graph with the same vertices as \(G\) but many fewer edges – which is similar to \(G\) in terms of connectivity. There are some clear benefits to this:

  • Dense graphs are produced by many routines; kernel and similarity matrices are dense by nature, and sparsification enables us to store and manipulate them efficiently.
  • The sparse graphs produced by the method of (Spielman and Srivastava 2011) that we implement can be produced at any desired level of sparsity, which translates proportionally into gains in storage and calculation efficiency.
  • As an intermediate computation, we implement an efficient and principled way to locally characterize the connectivity of the graph.

Sparsification preserves cuts

Remarkably, the sparsification method preserves graph cuts – any subset of vertices touches approximately the same weight of edges in \(H\) as in \(G\). The method guarantees this for every subset of vertices simultaneously. This is a powerful property – it ensures the topology of the graph is preserved after sparsification, at coarse and fine scales.

Sparsifying a graph by resampling edges

The core algorithm here is easy to describe. Given \(G\), it produces a sparsified graph with num_edges edges constructed using a simple procedure:

  1. Calculate weights over edges of \(G\).
  2. Sample num_edges edges with replacement from the distribution induced by the weights.

We can write this as a function, with the weight calculation abstracted out. The weight calculation requires a precomputed quantity for each edge, called the effective resistance, which we will discuss next. For now, it is included in the API of our function.

CODE
import numpy as np, pandas as pd, time, scipy, scipy.linalg
import scipy.sparse as sp
import scipy.sparse.csgraph   # so that sp.csgraph is available below
import matplotlib.pyplot as plt
import anndata, scanpy as sc
%matplotlib inline

print('Packages imported.')
Packages imported.
CODE
def sample_sparsifier(adj_mat, num_edges, R, ultrasparsifier=False):
    """
    Parameters
    ----------
    adj_mat : sparse matrix
        The adjacency matrix of the graph.
    num_edges : int
        The number of samples for the sparsifier.
    R : array-like
        Effective resistance for each edge (ordered as adj_mat.nonzero()).
    ultrasparsifier : bool, optional
        Tweak the sampling probabilities so that no edge gets scaled up too much.
        
    Returns
    -------
    H : sparse matrix
        The adjacency matrix of the sparsified graph with at most num_edges edges.
    """
    rows, cols = adj_mat.nonzero()
    weights = np.ravel(adj_mat[rows, cols])   # Weight of each edge.
    probs = compute_sampling_probs(adj_mat, R, ultrasparsifier=ultrasparsifier)
    sampled = np.random.choice(len(probs), num_edges, p=probs, replace=True)
    H = sp.lil_matrix(adj_mat.shape)
    for idx in sampled:
        H[rows[idx], cols[idx]] += weights[idx] / num_edges / probs[idx]
    return H.tocsr()

Edge sampling weights using effective resistances

How do we compute edge weights to guide this sparsification-by-sampling procedure? We want to put higher weight on edges that are bottlenecks to the flow of random walks on the graph, as such edges are more essential to preserving the overall topology of the graph.

One intuitive way of measuring this property between any pair of vertices \(a\) and \(b\) is the commute time \(C_{ab}\) - the expected number of steps that a random walker would take to diffuse from \(a\) to \(b\) and back, which is higher around bottlenecks. It turns out that if the commute times are used as edge sampling weights, the resulting sparsifier preserves the graph topology at fine and coarse scales. This is what we’ll implement.

Effective resistance

We’ll find it more convenient to compute a quantity called the effective resistance \(R_{ab}\) across each edge \((a, b)\) (Klein and Randić 1993). This is equivalent to computing the commute time \(C_{ab}\), as \(R_{ab}\) is proportional to \(C_{ab}\) in any given graph (Chandra et al. 1996).

\(R_{ab}\) can be understood in other equivalent ways:

  • \(R_{ab}\) is proportional to the probability the edge appears in a random spanning tree of the graph (for unweighted graphs). Bottleneck edges that are essential to the topology of the graph will be in the most spanning trees, and will have highest weight.
  • \(R_{ab}\) is the effective electrical resistance between two nodes in an electrical network laid out according to the graph, with a resistor placed on each edge according to the weights. It is a vast generalization to arbitrary networks of the well-known formulas for series and parallel resistances.
  • As such, the effective resistance is a natural graph-based measure of distance that accounts for the multiplicity of paths between any two nodes.
CODE
def compute_sampling_probs(adj_mat, R, ultrasparsifier=False):
    """
    Parameters
    ----------
    adj_mat : sparse matrix
        The adjacency matrix of the graph.
    R : array-like
        Effective resistance for each edge (ordered as adj_mat.nonzero()).
    ultrasparsifier : bool, optional
        Change probabilities so that no edge gets scaled up much.
        Only guarantees spectral closeness up to 2 * epsilon, but with the benefit of lowering variance.
    
    Returns
    -------
    probs : ndarray
        Sampling probability for each edge.
    """
    rows, cols = adj_mat.nonzero()
    weights = np.ravel(adj_mat[rows, cols])
    probs = weights * np.asarray(R)       # Proportional to w_e * R_e for each edge.
    probs /= np.sum(probs)
    if ultrasparsifier:
        degrees = adj_mat.sum(1)[np.array((rows, cols))].squeeze().T
        mins = 1 / np.min(degrees, axis=1) / adj_mat.shape[0]
        probs += mins
        probs /= 2
    probs /= np.sum(probs)
    return np.ravel(probs)

Effective resistances and the graph Laplacian

For any vertex \(v\), let \(e_{v} \in \mathbb{R}^{n}\) be the vector with a 1 in position \(v\) and zero elsewhere: the \(v^{th}\) vertex basis vector. For any edge \(e = (u, v)\), let \(\chi_{e} := e_{u} - e_{v}\): the \(e^{th}\) edge basis vector. Note that for edges \(e \neq f\) that share no vertex, \(\chi_{e}^{\top} \chi_{f} = 0\).

The graph Laplacian \(L\) is a matrix \(L = D - W\), where \(D = diag(\textbf{1}^{\top} W)\). It can be written in terms of the edges: \(L = \sum_{e \in E} w_{e} \chi_{e} \chi_{e}^\top\).
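To make the edge decomposition concrete, here is a minimal check on a hypothetical 3-node weighted triangle that summing \(w_{e} \chi_{e} \chi_{e}^\top\) over edges reproduces \(D - W\):

CODE
# Minimal check (sketch): the Laplacian D - W equals the sum of w_e * chi_e chi_e^T over edges.
import numpy as np

W = np.array([[0. , 2.0, 0.5],
              [2.0, 0. , 1.0],
              [0.5, 1.0, 0. ]])   # small weighted graph (hypothetical)
L = np.diag(W.sum(axis=0)) - W

L_edges = np.zeros_like(W)
n = W.shape[0]
for u in range(n):
    for v in range(u + 1, n):     # each undirected edge once
        if W[u, v] != 0:
            chi = np.zeros(n); chi[u], chi[v] = 1, -1
            L_edges += W[u, v] * np.outer(chi, chi)

assert np.allclose(L, L_edges)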

It turns out that using the electrical circuit theory interpretation of effective resistance (Kirchhoff’s current law and Ohm’s law), the effective resistance can be written in terms of the pseudoinverse \(L^{+}\) of the Laplacian: for any pair of vertices \(e = (u, v)\),

\[ R_{u v} = \chi_{e}^{\top} L^{+} \chi_{e} \]

CODE
def compute_exact_resistances(adj_mat):
    # Compute the effective resistance across each edge of the graph, via the Laplacian pseudoinverse.
    if scipy.sparse.issparse(adj_mat):
        L = sp.csgraph.laplacian(adj_mat).tocsr()
        L_inv = np.linalg.pinv(L.todense())
    else:
        L = sp.csgraph.laplacian(adj_mat)
        L_inv = np.linalg.pinv(L)
    rows, cols = adj_mat.nonzero()
    R = []
    for i, j in zip(rows, cols):
        R.append(L_inv[i, i] + L_inv[j, j] - L_inv[j, i] - L_inv[i, j])
    return (rows, cols, R)
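As a quick sanity check of the commute-time interpretation, here is a small sketch on a hypothetical unweighted toy graph: it compares the resistance from compute_exact_resistances against commute times estimated by simulating random walks, which by (Chandra et al. 1996) should agree up to the factor \(2m\), where \(m\) is the number of edges.

CODE
# Sanity check (sketch): for an unweighted graph with m edges, C_ab = 2m * R_ab.
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)    # small unweighted toy graph
m = A.sum() / 2

rows, cols, R = compute_exact_resistances(A)
a, b, R_ab = rows[0], cols[0], R[0]           # pick one edge

def commute_time(A, a, b, n_walks=2000):
    # Estimate E[steps to go a -> b -> a] by simulating random walks.
    total = 0
    for _ in range(n_walks):
        for start, end in [(a, b), (b, a)]:
            v = start
            while v != end:
                v = rng.choice(np.flatnonzero(A[v]))
                total += 1
    return total / n_walks

print(2 * m * R_ab, commute_time(A, a, b))    # should be approximately equal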

The sparsifier preserves graph cuts

Putting this all together, we get a graph sparsifier.

This sparsifier is special – it preserves all graph cuts.

More formally, for any set of vertices \(T \subseteq V\), let \(p_G (T)\) and \(p_H (T)\) be the total weight of edges crossing the cut defined by \(T\) (edges with exactly one endpoint in \(T\)) in graph \(G\) and its sparsifier \(H\). Then for any \(\epsilon \in (0,1)\), \[ (1 - \epsilon) p_G (T) \leq p_H (T) \leq (1 + \epsilon) p_G (T) \] if we draw \(H\) to have \(O \left(\frac{n \log n }{ \epsilon^2 } \right)\) edges.

\(\textbf{Details}\)

The Laplacian \(L\) completely characterizes the connectivity of the graph \(G\), because the weight of edges crossing any cut defined by a subset of vertices \(T \subseteq V\) – which we call \(p_{G} (T)\) – can be written in terms of \(L\). If \(C(T)\) is the cut-set of \(T\), i.e. the set of edges with exactly one endpoint in \(T\), then since \(L = \sum_{e \in E} w_{e} \chi_{e} \chi_{e}^\top\), we can write: \[p_{G} (T) = \sum_{e \in C(T)} w_{e} = \sum_{e \in E} w_{e} (\textbf{1}_{T}^{\top} \chi_{e})^2 = \textbf{1}_{T}^{\top} L \textbf{1}_{T}\]

As a linear operator, the Laplacian therefore characterizes all graph cuts.

The sparsifier we have constructed very strictly approximates the original graph - their Laplacian matrices are spectrally close, i.e. close as linear operators. In detail, (Spielman and Srivastava 2011) proves that the sparsifier’s Laplacian \(L_{H}\) satisfies \((1 - \epsilon) x^{\top} L_{G} x \leq x^{\top} L_{H} x \leq (1 + \epsilon) x^{\top} L_{G} x\) for all \(x\). Substituting \(x = \textbf{1}_{T}\), we get \(p_{G} (T) = x^{\top} L_{G} x\) and \(p_{H} (T) = x^{\top} L_{H} x\), giving the result.
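To see the cut guarantee empirically, here is a minimal sketch that sparsifies a small dense random graph with the functions defined above and compares \(\textbf{1}_{T}^{\top} L \textbf{1}_{T}\) for the original and sparsified graphs on a few random vertex subsets \(T\) (the graph size and number of samples are arbitrary illustrative choices):

CODE
# Empirical check (sketch): cut weights of random subsets should be close in G and its sparsifier H.
import numpy as np
import scipy.sparse as sp

rng = np.random.default_rng(0)
n = 60
A = sp.csr_matrix(np.triu(rng.random((n, n)), k=1))
A = A + A.T                                   # dense random weighted graph

rows, cols, R = compute_exact_resistances(A)  # defined above
H = sample_sparsifier(A, num_edges=2000, R=R)

def cut_weight(A, x):
    # x^T L x computed edgewise: (1/2) * sum_{u,v} A[u,v] * (x_u - x_v)^2
    rows, cols = A.nonzero()
    w = np.ravel(A[rows, cols])
    return 0.5 * float(np.sum(w * (x[rows] - x[cols]) ** 2))

for _ in range(5):
    T = (rng.random(n) < 0.5).astype(float)   # indicator vector of a random subset
    print(f'p_G(T) = {cut_weight(A, T):.1f}   p_H(T) = {cut_weight(H, T):.1f}')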

Experiments highlight scalability issues

Current implementations compute \(R_{uv}\) exactly using the formula above. Unfortunately, computing the pseudoinverse \(L^{+}\) exactly is very slow, as it requires solving a linear system of equations in \(n\) variables. We can see this by timing the graph processing package pygsp, a state-of-the-art package for such graph algorithms that computes resistances exactly, on a range of networks.

Loading the data

I’ll load a network of human genes to demonstrate this method. The network, from this recent paper, is constructed so that nearby genes have similar functions.

CODE
amat = np.load('GLS_p.npy')                        # matrix of pairwise gene-gene scores
gene_names = pd.read_csv('genes.txt', header=None)
coess_adj = -np.log(amat)                          # higher weight = stronger association
coess_adj[np.isinf(coess_adj)] = 0                 # zero out infinities arising from log(0)

adj_mat = coess_adj

Trying it out

Running the pygsp version of graph sparsification, we run into scalability issues from using \(R_{uv}\) directly. At the time of writing, it takes ~30 seconds on my laptop to process a 2000-node subgraph of the initial adjacency matrix, and ~6 hours to process the entire matrix. The runtime scales poorly as the number of nodes increases.

CODE
from pygsp import reduction, graphs
from matplotlib import pyplot as plt

times_list = []
kn_list = [100, 160, 260, 470, 670, 1070, 1600, 2500, 4000]

for kn in kn_list:
    G = graphs.Graph(adj_mat[:kn, :kn])   # pygsp graph on a kn-node subgraph
    itime = time.time()
    Gs = reduction.graph_sparsify(G, epsilon=0.99)
    times_list.append(time.time() - itime)

plt.plot(kn_list, times_list)
plt.xlabel('number of nodes')
plt.ylabel('Time (s)')
plt.xscale('log')
plt.yscale('log')
plt.show()

We can check that our exact-resistance implementation behaves similarly, with comparable runtime; the timings below bear this out.


Conclusions

This leads to an inescapable conclusion:

Exact computation is a non-starter for large graphs.

Can we get away with approximately computing \(L^{+}\)? There are a couple of major hurdles to doing this efficiently:

  • The computation is particularly plagued by instability problems that might occur over particular edges, and it’s difficult to guarantee that all the edges would be “uniformly” well approximated.
  • \(L\) is often sparse (e.g. for kNN graphs), but \(L^{+}\) is dense; how can we do this in less than \(O( n^2 )\) space?

The best approximation, in a sense, is an elegant algorithm by (Spielman and Srivastava 2011) that approximates the effective resistances of all edges simultaneously, well enough to preserve all cuts of the graph. This beautiful method warrants a bit more detail.

CODE
num_edges = 20000
kn = 2000
am = adj_mat[:kn, :kn]
itime = time.time()
rows, cols, R = compute_exact_resistances(am)
print(time.time() - itime)
#compute_sampling_probs(adj_mat, R, ultrasparsifier=False)

sparsified_graph = sample_sparsifier(am, num_edges, R, ultrasparsifier=True)
print(time.time() - itime)
6.543221950531006
14.336249828338623

Approximating resistances over edges with random projections

So let’s tackle the problem of approximately computing resistances over edges. To understand and implement the algorithm for doing this, we’ll need to rewrite the Laplacian as \(L = B^\top S^2 B\), where \[ B_{e, v} = \begin{cases} 1,\;\;\;\text{ if e has destination vertex $v$} \\ -1,\;\;\;\text{ if e has source vertex $v$} \\ 0,\;\;\;\text{ if e does not hit vertex $v$} \end{cases} \qquad , \qquad S_{e_1, e_2} = \begin{cases} \sqrt{w_{e_1}},\;\;\text{ if } e_1 = e_2 \\ 0,\;\;\text{ otherwise} \end{cases} \] (When the graph is undirected, we orient it arbitrarily to apply the “destination” and “source” designations of each edge.)

A little linear algebra (Spielman and Srivastava 2011) shows that the resistance \(R_{u v}\) between any two vertices \(u\) and \(v\) is the squared Euclidean distance between the \(u^{th}\) and \(v^{th}\) columns of the \(|E| \times n\) matrix \[ M := S B L^{+} \]
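As a small check of this identity, here is a sketch on a hypothetical random graph that builds \(S\) and \(B\) explicitly, forms \(M = S B L^{+}\), and compares the squared column distance against the \(\chi_{e}^{\top} L^{+} \chi_{e}\) formula:

CODE
# Check (sketch): R_uv equals the squared distance between columns u and v of M = S B L^+.
import numpy as np
import scipy.sparse.csgraph

rng = np.random.default_rng(0)
n = 20
A = np.triu(rng.random((n, n)) * (rng.random((n, n)) < 0.4), k=1)
A = A + A.T                                     # random weighted undirected graph

rows, cols = np.nonzero(np.triu(A))             # each undirected edge once, arbitrary orientation
w = A[rows, cols]
num_edges = len(w)

S = np.diag(np.sqrt(w))                         # |E| x |E| diagonal of sqrt edge weights
B = np.zeros((num_edges, n))                    # signed edge incidence matrix
B[np.arange(num_edges), rows] = 1
B[np.arange(num_edges), cols] = -1

L_pinv = np.linalg.pinv(scipy.sparse.csgraph.laplacian(A))
M = S @ B @ L_pinv                              # |E| x n

u, v = rows[0], cols[0]
R_formula = L_pinv[u, u] + L_pinv[v, v] - 2 * L_pinv[u, v]
R_columns = np.linalg.norm(M[:, u] - M[:, v]) ** 2
assert np.isclose(R_formula, R_columns)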

Scaling up the implementation with random projections

Here we run into an implementation hurdle:

Computing and storing \(M\) explicitly is not scalable, as each data point is represented by \(|E|\) coordinates, far more than the number of data points \(n\).

The core of our implementation, following (Spielman and Srivastava 2011), is in overcoming this hurdle. We efficiently approximate the matrix \(M\) by representing its \(n\) columns in a reduced \(\log (n)\)-dimensional space, such that their pairwise distances are preserved. This is done using a random projection.

A random projection is a linear operator \(Q \in \mathbb{R}^{k \times |E|}\) that reduces the \(|E|\)-dimensional data to a smaller number of dimensions \(k\). The reduced data can be stacked together as the columns of a \(k \times n\) matrix \(Z\): \[ Z = Q M = Q S B L^{+} \] Resistances can be calculated from \(Z\) for any edge by reading off the relevant columns; the resistance is the squared Euclidean distance \(\vert \vert \cdot \vert \vert_2^2\) between these columns.

CODE
def compute_effective_resistances(Z, edges):
    """
    Parameters:
    ----------
    Z : (k x n) matrix from which to efficiently compute approximate effective resistances.
    edges : A tuple of two lists: the row and column indices of the respective edges.
    
    Returns:
    --------
    R : Effective resistances for each edge.
    """
    rows, cols = edges
    R = []
    for i, j in zip(rows, cols):
        R.append(np.linalg.norm(Z[:, i] - Z[:, j]) ** 2)
    return np.array(R)

For this use, we want a random projection matrix \(Q\) that preserves pairwise distances between \(n\) points. This happens when the entries are i.i.d. random variables (Arriaga and Vempala 2006); we will use two such special cases below, dense Gaussian projections and very sparse random projections (Li, Hastie, and Church 2006).

How many random projection dimensions \(k\) do we need as the number of data points \(n\) grows? Incredibly, (Spielman and Srivastava 2011) shows that \(k \sim \log(n)\) dimensions suffice. This relates to a major benefit of using random projections: they provide a seamless way to trade off between computation and accuracy (i.e., choosing \(k\)), and require very little computation to get a good result2.
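As a quick illustration, the sketch below (with hypothetical synthetic data) shows how slowly the Johnson-Lindenstrauss dimension bound grows with \(n\), and checks empirically that pairwise distances survive a sparse random projection:

CODE
# Illustration (sketch): random projections preserve pairwise distances with few dimensions.
import numpy as np
from sklearn.random_projection import SparseRandomProjection, johnson_lindenstrauss_min_dim

# The JL dimension bound grows only logarithmically with the number of points:
print(johnson_lindenstrauss_min_dim(n_samples=[1e2, 1e4, 1e6], eps=0.25))

# Empirically, pairwise distances survive the projection (hypothetical data):
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 5000))
Z = SparseRandomProjection(n_components=700, dense_output=True, random_state=42).fit_transform(X)
i, j = rng.integers(0, 500, size=(2, 8))
print(np.linalg.norm(Z[i] - Z[j], axis=1) / np.linalg.norm(X[i] - X[j], axis=1))  # ratios near 1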

Computing \(QSB\) efficiently

This is all very well, except that computing the matrices \(Q, S, B\) explicitly involves at the very least looping over \(|E|\) edges, which again does not scale. However, these matrices only figure into the calculation through the \(k \times n\) matrix \(P := Q S B\), which we can calculate by alternative means.

In essence, we calculate \(P\) from \(A\) by computing each of the \(k\) coordinates in turn for each node. We do this by summing the map’s contributions over the edges around the node, signed according to their orientation and scaled by the square root of the edge weights. This can be parallelized over all nodes at once.

In more detail, let’s implement these steps.

A) Preprocess the adjacency matrix

  • Take the elementwise \(\sqrt{\cdot}\) of \(A\).
  • Orient the graph \(A\) by blocking out the lower triangle of the matrix with zeroes.
  • Arbitrarily assign some of the remaining edges to be negative.
  • Finally, anti-symmetrize the matrix again to re-fill the lower triangle, forming a new graph \(A_2\).

It’s easy to add a little code at the commented-out placeholder below to arbitrarily flip the orientation of some edges by making them negative; it’s a good sanity check to see that this does not affect the results.

CODE
def preprocess_adjmat(adj_mat):
    if not scipy.sparse.issparse(adj_mat):
        adj2 = np.sqrt(adj_mat)
        adj2 = np.triu(adj2)
    else:
        adj_mat = adj_mat.copy()    # avoid modifying the caller's matrix in place
        adj_mat.data = np.sqrt(adj_mat.data)
        adj2 = scipy.sparse.triu(adj_mat)

    # Arbitrarily assign some edges to be negative. 
    pass

    # Anti-symmetrize the matrix again.
    adj2 = adj2 - adj2.T
    return adj2
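A quick sanity check, sketched below on a hypothetical random graph: the output should be antisymmetric, and its entrywise square should recover the original weights.

CODE
# Sanity check (sketch): preprocess_adjmat returns an antisymmetric matrix of +/- sqrt(w_e) entries.
import numpy as np
import scipy.sparse as sp

A = sp.triu(sp.random(50, 50, density=0.2, random_state=0), k=1)
A = (A + A.T).tocsr()                                            # symmetric weighted graph, no self-loops

adj2 = preprocess_adjmat(A)
assert np.allclose((adj2 + adj2.T).toarray(), 0)                 # antisymmetric orientation
assert np.allclose(adj2.multiply(adj2).toarray(), A.toarray())   # entries square back to the weights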

B) Compute the \(k \times n \times n\) random map \(T\), which should be zero outside edges

We’ll focus on cases where \(Q\) has i.i.d. entries. In these cases, \(T\) accordingly has i.i.d. entries at the positions corresponding to edges, and zeros elsewhere.

CODE
import sklearn.random_projection

def draw_random_featmap(
    adj_mat, 
    num_components = 40, 
    projection_mode='sparse', 
    verbose=False
):
    # Number of edge slots the random projection acts on.
    if not scipy.sparse.issparse(adj_mat):
        num_edges = np.prod(adj_mat.shape)
    else:
        num_edges = len(adj_mat.data)
    # Dummy all-zero data matrix: only its shape is used to fit the projection.
    dummy_data = scipy.sparse.csr_matrix((adj_mat.shape[0], num_edges))
    
    if (num_components * num_edges) > 1e6:
        projection_mode = 'sparse'   # Otherwise memory can't handle it
    
    itime = time.time()
    if projection_mode == 'sparse':
        form = sklearn.random_projection.SparseRandomProjection(n_components=num_components, random_state=42)
    else:
        form = sklearn.random_projection.GaussianRandomProjection(n_components=num_components, random_state=42)
    form.fit(dummy_data)
    if verbose:
        print(time.time() - itime)

    embmat = form.components_       # (num_components x num_edges) projection matrix Q
    list_tensor = []
    if scipy.sparse.issparse(adj_mat):
        nnzrows, nnzcols = adj_mat.nonzero()
    for component_index in range(num_components):
        # One row of Q, placed onto the (i, j) positions of the graph's edges.
        if scipy.sparse.issparse(embmat):
            newdata = np.ravel(embmat[component_index, :].toarray())
        else:
            newdata = np.ravel(embmat[component_index, :])
        if scipy.sparse.issparse(adj_mat):
            random_map = scipy.sparse.csr_matrix((newdata, (nnzrows, nnzcols)), shape=adj_mat.shape)
        else:
            random_map = np.reshape(newdata, adj_mat.shape)
        list_tensor.append(random_map)
    return list_tensor

C) Compute each coordinate in the random map

We do this in a vectorized way for all nodes in the graph for each coordinate \(i = 1, \dots, k\). We elementwise multiply the corresponding feature map (slice of \(T\)) by \(A_2\), and then find the row sums of that matrix to get a column vector \(v_i \in \mathbb{R}^{n}\). This gives the \(k \times n\) matrix \(P := Q S B\).

CODE
def compute_reduced_vertices(list_tensor_random, adj_mat):
    # For each of the k random maps, elementwise-multiply by the oriented, sqrt-weighted
    # adjacency A_2 and take row sums, giving one n x 1 coordinate vector per map.
    adj2 = preprocess_adjmat(adj_mat)
    list_apply_tensor = [x.multiply(adj2).sum(axis=1) for x in list_tensor_random]
    return list_apply_tensor

D) Putting it together to compute \(QSB\)

Stacking the \(k\) column vectors \(v_{1}, \dots, v_{k}\) as rows yields the \(k \times n\) matrix \(P := Q S B\). A short sketch of this assembly follows; the same stacking is used when solving for \(Z\) in the next section.
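This is a minimal sketch of the assembly, assuming a sparse adjacency matrix and the helper functions defined above (the name compute_QSB is hypothetical):

CODE
# Sketch: assemble the k x n matrix P = QSB from the per-coordinate reduced vertex vectors.
import numpy as np

def compute_QSB(adj_mat, num_components=40):
    # adj_mat is assumed sparse, since compute_reduced_vertices relies on .multiply().
    list_tensor = draw_random_featmap(adj_mat, num_components=num_components)
    reduced = compute_reduced_vertices(list_tensor, adj_mat)    # k vectors, each n x 1
    return np.asarray(np.hstack(reduced)).T                     # stack as rows -> k x n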


Matrix inversion as gradient descent

There are huge time and space savings by using \(Z\) instead of \(R\), because this choice represents each of the \(n\) points by \(k \sim \log (n)\) coordinates instead of \(|E|\) coordinates. However, the time to naively compute \(Z\) is dominated by the \(O (n^{3})\) cost of computing \(L^{+}\), so we need to compute \(Z\) without explicitly computing \(L^{+}\).

We do this by writing \[ L Z^{\top} \approx B^{\top} S Q^{\top} \] and finding \[ Z = \arg\min_{Z} \vert\vert L Z^{\top} - B^{\top} S Q^{\top} \vert\vert_{2}^{2} \] as the least-squares solution to this system.

So we can efficiently find \(Z\) by gradient descent on the objective function \[ f(Z) := \vert\vert L Z^{\top} - B^{\top} S Q^{\top} \vert\vert_{2}^{2} = \vert\vert Z L^{\top} - Q S B \vert\vert_{2}^{2} = \vert\vert Z L - Q S B \vert\vert_{2}^{2} \]
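For reference, differentiating this objective (using the symmetry of \(L\)) gives the gradient used in the update step of the code below: \[ \nabla_{Z} f(Z) = 2 \left( Z L - Q S B \right) L , \] so each descent step adds a multiple of the residual times \(L\) to \(Z\), which is exactly the Z += eta_t * L.dot(residual.T).T update.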

CODE
def calculate_reduced_resistances(
    adj_mat, 
    epsilon=1e-1, eta=1e-3, max_iters=1000, convergence_after = 10,
    tolerance=1e-2, log_every=10, compute_exact_loss=False
):
    """ 
    Computes the Z matrix using gradient descent.
    
    Parameters:
    -----------
    adj_mat : sparse matrix
        The adjacency matrix of the graph.
    epsilon : float, optional
        Tolerance for deviations w.r.t. spectral norm of the sparsifier. Smaller epsilon leads to a higher
        dimensionality of the Z matrix.
    eta : float, optional
        Step size for the gradient descent.
    max_iters : int, optional
        Maximum number of iterations.
    convergence_after : int, optional
        If the loss did not decrease significantly for this amount of iterations, the gradient descent will abort.
    tolerance : float, optional
        The minimum amount of energy decrease that is expected for iterations. If for a certain number of iterations
        no overall energy decrease is detected, the gradient descent will abort.
    log_every : int, optional
        Log the loss after each log_every iterations.
    compute_exact_loss : bool, optional
        Only for debugging. If set, computes the actual pseudoinverse without down-projection and checks
        whether the pairwise distances among Z's columns match, with respect to the Frobenius norm.
    
    Returns:
    --------
    Z : ndarray, shape [k, N]
        Matrix from which to efficiently compute approximate resistances.
    """
    # Compute L, S, B
    L = sp.csr_matrix(sp.csgraph.laplacian(adj_mat))
    rows, cols = adj_mat.nonzero()
    weights = np.sqrt(np.array(adj_mat[rows, cols].tolist()))
    S = sp.diags(weights, [0])
    # Construct signed edge incidence matrix
    num_vertices = adj_mat.shape[0]
    num_edges = S.shape[0]
    assert(num_edges == len(rows) and num_edges == len(cols))
    B = sp.coo_matrix((
        ([1] * num_edges) + ([-1] * num_edges),
        (list(range(num_edges)) * 2, list(rows) + list(cols))
    ), shape=[num_edges, num_vertices])

    k = int(np.ceil(np.log(B.shape[1] / epsilon**2)))
    # Assemble the k x n matrix P = QSB from the per-coordinate reduced vertex vectors.
    list_tensor = draw_random_featmap(adj_mat, num_components=k)
    Y_red = np.asarray(np.hstack(compute_reduced_vertices(list_tensor, adj_mat))).T
    
    # Use gradient descent to solve for Z
    Z = np.random.randn(k, L.shape[1])
    best_loss = np.inf
    best_iter = np.inf
    for it in range(max_iters):
        eta_t = eta * (1.0 / np.sqrt(it + 10))    # Decaying step size
        residual = Y_red - sp.csr_matrix.dot(Z, L)
        loss = np.linalg.norm(residual)
        if it % log_every == 0: 
            print(f'Loss before iteration {it}: {loss}')
            if compute_exact_loss:
                # Debug only: assumes `pairwise_dist_gnd` (ground-truth pairwise distances) has been precomputed.
                pairwise_dist = Z.T.dot(Z)
                exact_loss = np.linalg.norm(pairwise_dist - pairwise_dist_gnd)
                print(f'Loss w.r.t. exact pairwise distances {exact_loss}')
        if loss + tolerance < best_loss:
            best_loss = loss
            best_iter = it
        elif it > best_iter + convergence_after:
            # No improvement for `convergence_after` iterations
            print(f'Convergence after {it - 1} iterations.')
            break
        Z += eta_t * L.dot(residual.T).T
    return Z
    

Putting it all together to sparsify graphs

All this results in a streamlined procedure for sparsifying a graph to have any desired number of edges num_edges. To summarize, given a graph \(G\), we:

  1. Calculate reduced-dimension representation \(Z\) of the graph \(G\).
  2. Compute effective resistances over edges in \(G\).
  3. Run weighted edge bootstrap sampling to generate the sparsified graph \(H\), with edge sampling probabilities proportional to the product of edge weight and effective resistance.
CODE
def spectral_sparsify(
    adj_mat, q=None, epsilon=1e-1, eta=1e-3, max_iters=1000, convergence_after=100,
    tolerance=1e-2, log_every=10, prevent_vertex_blow_up=False
):
    """ Computes a spectral sparsifier of the graph given by an adjacency matrix.
    
    Parameters:
    ----------
    adj_mat : sparse matrix, shape [N, N]
        The adjacency matrix of the graph.
    q : int or None
        The number of samples for the sparsifier. If None, q will be set to N * log(N) / (epsilon ** 2).
    epsilon : float, optional
        The desired spectral similarity of the sparsifier. Defaults to 0.1.
    eta : float, optional
        Step size for the gradient descent when computing resistances. Defaults to 1e-3.
    max_iters : int, optional
        Maximum number of iterations when computing resistances. Defaults to 1000.
    convergence_after : int, optional
        If the loss did not decrease significantly for this amount of iterations, 
        the gradient descent will abort. Defaults to 100.
    tolerance : float, optional
        The minimum amount of energy decrease that is expected for iterations. 
        If for a certain number of iterations no overall energy decrease 
        is detected, the gradient descent will abort.
    prevent_vertex_blow_up : bool, optional
        If the probabilities will be tweaked in order to ensure that the vertices are not blown up too much. 
        Note that this will only guarantee a spectral closeness up to a factor of 2 * epsilon.
    log_every : int, optional
        How often to log the current loss (once every `log_every` iterations).
    
    Returns:
    --------
    H : sp.csr_matrix, shape [N, N]
        Sparsified graph with at most q edges.
    """
    if q is None:
        q = int(np.ceil(adj_mat.shape[0] * np.log(adj_mat.shape[0]) / (epsilon ** 2)))   # or 6n*ln(n) , by https://www.cs.ubc.ca/~nickhar/Cargese3.pdf
    edges = adj_mat.nonzero()
    Z = calculate_reduced_resistances(adj_mat, epsilon=epsilon, max_iters=max_iters, convergence_after=convergence_after, eta=eta, tolerance=tolerance, compute_exact_loss=False)
    R = compute_effective_resistances(Z, edges)
    return sample_sparsifier(adj_mat, q, R, ultrasparsifier=prevent_vertex_blow_up)
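As a usage sketch (the subgraph size and q are arbitrary illustrative choices), here is how the full pipeline runs on the coessentiality network loaded earlier:

CODE
import time
import scipy.sparse as sp

# Usage sketch: sparsify a 2000-node subgraph of the coessentiality network loaded above.
am = sp.csr_matrix(adj_mat[:2000, :2000])
itime = time.time()
H = spectral_sparsify(am, q=20000)
print(f'{am.nnz} edges -> {H.nnz} edges in {time.time() - itime:.1f} s')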

Summary

  • The weighted edge bootstrap algorithm of (Spielman and Srivastava 2011) guarantees a sparsifier with high probability for any undirected weighted graph.
  • An efficient implementation, described in (Spielman and Srivastava 2011), uses random projections to construct a subspace of dimension \(\log n\) in which the pairwise distances between the \(n\) vertices are approximately preserved.
  • Though simple in concept, this sparsification method is an extremely useful and powerful primitive in designing efficient graph algorithms.

References

Arriaga, Rosa I, and Santosh Vempala. 2006. “An Algorithmic Theory of Learning: Robust Concepts and Random Projection.” Machine Learning 63 (2): 161–82.
Chandra, Ashok K, Prabhakar Raghavan, Walter L Ruzzo, Roman Smolensky, and Prasoon Tiwari. 1996. “The Electrical Resistance of a Graph Captures Its Commute and Cover Times.” Computational Complexity 6 (4): 312–40.
Dasgupta, Sanjoy, and Anupam Gupta. 2003. “An Elementary Proof of a Theorem of Johnson and Lindenstrauss.” Random Structures & Algorithms 22 (1): 60–65.
Klein, Douglas J, and Milan Randić. 1993. “Resistance Distance.” Journal of Mathematical Chemistry 12 (1): 81–95.
Li, Ping, Trevor J Hastie, and Kenneth W Church. 2006. “Very Sparse Random Projections.” In Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 287–96.
Spielman, Daniel A, and Nikhil Srivastava. 2011. “Graph Sparsification by Effective Resistances.” SIAM Journal on Computing 40 (6): 1913–26.

Footnotes

  1. The pygsp package deserves special mention for including sparsification in a broader context of algorithms that I will explore in future posts. But its actual implementation is not efficient enough to scale as ours does.↩︎

  2. By Johnson-Lindenstrauss theory and related statistics, there is no such method that can use fewer dimensions to achieve a given reconstruction error guarantee.↩︎