Part of Advances in Neural Information Processing Systems 26 (NIPS 2013)
Karthika Mohan, Judea Pearl, Jin Tian
We address the problem of deciding whether there exists a consistent estimator of a given relation Q, when data are missing not at random. We employ a formal representation called `Missingness Graphs' to explicitly portray the causal mechanisms responsible for missingness and to encode dependencies between these mechanisms and the variables being measured. Using this representation, we define the notion of \textit{recoverability} which ensures that, for a given missingness-graph $G$ and a given query $Q$ an algorithm exists such that in the limit of large samples, it produces an estimate of $Q$ \textit{as if} no data were missing. We further present conditions that the graph should satisfy in order for recoverability to hold and devise algorithms to detect the presence of these conditions.