a simple stable inner circle algorithm

Welcome to the Second Coming of Twinkdestroyer Analytics. Today, we are interested in finding out a stable inner circle of some node in a large social network. This is similar, but not quite, like the influence maximization problem that is commonly solved with diffusion models.

So why not use the other nice stuff? this is a simple problem that takes you literally a few matrix multiplications. We are not interested in a probabilistic approach, as we are just trying to find a stable non-growing network from known edges.

Let $G = (V,E)$ be a directed graphs with loops but no repeated edges. Let us consider a non-empty seed set $D_0 \subset V$ that we want to expand into a stable, non-growing community $D_T \subset V$. Let a directed edge denote some notion of affinity or admission. This could be X has donated to Y, X has visited Y, X called Y, X follows Y etc. Let us define a threshold criteria for a set $S$ to be $\tau(S)=\max{1,\ \lfloor p,|S|\rfloor}$ where $p \in [0,1] $ denotes the percentage of seed set members that has to have edges point out to candidate node in order for the candidate node to be admitted to the seed set. This threshold changes dynamically at each epoch. In practice, the threshold percentage should be 1 billion times less handwavey than what I outlined here, there is a lot of bells and whistles you can add.

Some further notation:

  • The child nodes of a node $u$ is $\Gamma^{+}(u);=;{,v\in V:\ (u,v)\in E,}.$
  • The child nodes of any set $S$ is $\Gamma^{+}(S);=; \bigcup_{u \in S} \Gamma^{+} (u)$
  • In degree from set $S$ to node $v$ is $d^{\text{in}}_S (v) = |{,u\in S:\ (u,v)\in E} |$
  • Recall theshold criteria is $\tau(S)=\max{1,\ \lfloor p,|S|\rfloor}$ where $p \in [0,1] $ is a percentage

A simple algorithm for doing this is the following:

  1. At each iteration $t$, compute candidate set by finding all the child hodes of the seed set $D_t$ at iteration $t$

    $C_t = \Gamma^+ (D_t) \backslash D_t$

  2. Admit new nodes to the seed set if the number of parent nodes in $D_i$ exceeds the threshold criteria $A_t = {v \in C_t : d^{\text{in}}_{D_t} (v)\ \geq \ \tau(D_t)}$

  3. Update the seed network $D_{t+1} = D_t \cup A_t$

  4. Stopping condition: stop at first $T$ where $D_{T+1} = D_T$, the seed network stops expanding. Output stable community $D_T$.

If we know the entire graph, we can compute this using adjacency matrices:

import numpy as np
from typing import Iterable, Optional

def find_stable_inner_circle(adj_matrix, seed_indices: Iterable[int], threshold_pct: float = 0.9, max_iters: Optional[int] = 100) -> np.ndarray:
    A = adj_matrix
    n = A.shape[0]
    seeds = np.asarray(list(seed_indices), dtype=int)
    mask = np.zeros(n, dtype=bool)
    mask[seeds] = True
    # Iterate until convergence
    for iteration in range(max_iters):
        # Count incoming edges from current set
        incoming = A[mask].sum(axis=0)
        if hasattr(incoming, 'A1'):  # Handle sparse matrices
            incoming = incoming.A1
        else:
            incoming = np.asarray(incoming).ravel()
        # Compute threshold: max of 0 or threshold_pct of current set size
        threshold = max(1, int(np.floor(threshold_pct * mask.sum())))
        # Expand set with nodes meeting threshold
        new_mask = mask | (incoming >= threshold)
        # Check convergence
        if np.array_equal(new_mask, mask):
            break
        mask = new_mask
    #return set without convergence after max iter
    return np.flatnonzero(mask)

#usage for A = np.array() 
inner_circle = find_stable_inner_circle(A, seed_indices=[0], threshold_pct=0.9)
#scipy sparse
from scipy.sparse import csr_matrix
A_sparse = csr_matrix(A)
inner_circle = find_stable_inner_circle(A_sparse, seed_indices=[0])

If you found this useful, please cite my twitter or cite this write-up as:

a simple stable inner circle algorithm. jean.land. https://www.jean.land/[objects](/objects.html)/social-networks.html