Incremental Community Detection on Streaming Graphs with Bounded Memory and Distributed Merge Guarantees
Abstract
Streaming interaction data induces graphs whose vertex and edge sets evolve continuously, often at rates that make batch community detection impractical. In such settings, the detection objective is typically not only to track latent group structure, but also to do so under strict limits on memory, latency, and communication, while tolerating out-of-order updates and distributed ingestion. This paper studies incremental community detection on streaming graphs under bounded memory, emphasizing mergeable state representations that admit deterministic, distributed reconciliation with formal guarantees. The core premise is that community detection in a stream should be treated as an online optimization and inference problem whose sufficient statistics are carefully chosen to support approximate yet stable updates without retaining full adjacency. We formulate a model that unifies modularity-like partition quality, stochastic blockmodel-inspired likelihood, and embedding-based separation, and we derive incremental update rules that operate per edge in small amortized time. To respect memory budgets, we use bounded retention windows and sketch-based estimators for degrees, cuts, and inter-community affinities, with explicit approximation error propagation. For distributed ingestion, we define merge operators over compact community summaries and prove associativity and commutativity properties needed for eventual consistency, along with bounds on quality loss relative to a centralized baseline. We also address performance engineering in storage internals and distributed execution, and we outline evaluation protocols designed for reproducibility under streaming nondeterminism.
Downloads
Published
Issue
Section
License
Copyright (c) 2020 authors

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.