Yun Mao, Hani Jamjoom, Shu Tao and Jonathan Smith
ACM Internet Measurement Conference (IMC)
San Diego, CA, October 2007
Abstract. Health monitoring, automated failure localization and
diagnosis have all become critical to service
providers of large distribution networks (e.g.,
digital cable and fiber-to-the-home), due to the
increases in scale and complexity of their offered
services. Existing automated failure diagnosis
solutions typically assume complete knowledge of
network topology, which in practice is rarely
available. The solution presented in this
paper—Network Management and Diagnosis
(NetworkMD)—is an automated failure diagnosis system
that can infer failure groups based on historical
failure data, and optionally geographical
information. The inferred failure groups mirror
missing topologies, and can be used to localize
failures, diagnose root causes of problems, and
detect misconfiguration in known
topologies. NetworkMD uses an unsupervised learning
algorithm based on non-negative matrix factorization
(NMF) to infer failure groups. Using cable networks
as the primary example, we demonstrate the
effectiveness of NetworkMD in both simulated settings
and a real environment, using data collected from a
commercial network serving hundreds of thousands of
customers via thousands of intermediate network
devices.
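To make the NMF idea concrete, here is a minimal sketch of how failure groups might be recovered from historical failure data. It is not NetworkMD's implementation: the abstract does not specify the exact formulation, so this uses the generic multiplicative-update NMF for the Frobenius objective. The function names (`nmf`, `failure_groups`), the toy matrix, and the choice of rank are all assumptions made for illustration.

```python
import numpy as np

def nmf(V, rank, iters=1000, seed=0, eps=1e-9):
    """Approximate V (devices x failure episodes) as W @ H with W, H >= 0.

    Uses the standard multiplicative updates for the squared Frobenius
    error. This is a generic NMF sketch, not the paper's algorithm.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps   # device-to-group weights
    H = rng.random((rank, m)) + eps   # group-to-episode activity
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def failure_groups(W):
    """Assign each device to the latent group with the largest weight."""
    return W.argmax(axis=1)

# Hypothetical toy data: 6 devices observed over 8 failure episodes.
# Devices 0-2 fail together in episodes 0-3, devices 3-5 in episodes 4-7,
# as if each trio sat behind a shared (unknown) upstream component.
V = np.zeros((6, 8))
V[0:3, 0:4] = 1.0
V[3:6, 4:8] = 1.0

W, H = nmf(V, rank=2)
groups = failure_groups(W)
print(groups)  # devices that fail together should share a group label
```

In this sketch, each column of `W` plays the role of one inferred failure group, standing in for a piece of the missing topology. The group labels themselves are arbitrary; only the partition of devices is meaningful. Real data would be noisy, and choosing the rank would need care.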
Bibtex.
@inproceedings{jamjoom-IMC-07,
author = {Yun Mao and Hani Jamjoom and Shu Tao and Jonathan Smith},
title = {{NetworkMD: Topology Inference and Failure Diagnosis in the Last Mile}},
booktitle = {ACM Internet Measurement Conference (IMC)},
address = {San Diego, CA},
month = {October},
year = {2007}
}