We develop upper and lower bounds for the probability of Boolean functions by treating multiple occurrences of variables as independent and assigning them new individual probabilities. We call this approach dissociation and give an exact characterization of optimal oblivious bounds, i.e. when the new probabilities are chosen independent of the probabilities of all other variables. Our motivation comes from the weighted model counting problem (or, equivalently, the problem of computing the probability of a Boolean function), which is #Phard in general. By performing several dissociations, one can transform a Boolean formula whose probability is difficult to compute, into one whose probability is easy to compute, and which is guaranteed to provide an upper or lower bound on the probability of the original formula by choosing appropriate probabilities for the dissociated variables. Our new bounds shed light on the connection between previous relaxationbased and modelbased approximations and unify them as concrete choices in a larger design space.
We also show how our theory allows a standard relational database management system (DBMS) to both upper and lower bound hard probabilistic queries in guaranteed polynomial time. We call this approach query dissociation and change from the commonlyused possibleworld semantics to a new semantics that we call propagation. Conceptually, a dissociated query is obtained from an unsafe query by adding extraneous variables to some atoms until the query becomes safe. We show that the scores for chain queries before and after dissociation correspond to two wellknown scoring functions on graphs, namely network reliability (which is #Phard), and propagation (which is related to PageRank and in PTIME). We then define a unique propagation score for an answer to a selfjoinfree conjunctive query and prove that it is always an upper bound to the score given by the possible world semantics. We further show that the propagation score can always be evaluated with an instanceindependent query plan, that the propagation score is identical to the possible world semantics for all safe queries, and that the concept of propagation is a natural generalization of plans for safe queries to plans for unsafe queries.
Publications

Dissociation and propagation for approximate lifted inference with standard relational database management systems
Wolfgang Gatterbauer, Dan Suciu
VLDBJ (Special Issue of VLDB Journal on VLDB 2015). pp. 126, 2016.
new [paper (Springer)]
[slides from Uncertainty in Computation workshop Oct 2016 (2MB)]
Extended 33 page online version with all proofs: [paper (arXiv:1310.6257)], (Version June 2016)
Extends and complements PVLDB 8(5):629640, 2015
This paper proposes an approach to uncertain query evaluation by which every query is evaluated entirely in the database engine by evaluating a fixed number of query plans, each providing an upper bound on the true probability, then taking their minimum. We provide an algorithm that takes into account important schema information to enumerate only the minimal necessary plans among all possible plans. Importantly, this algorithm is a strict generalization of all known PTIME selfjoinfree conjunctive queries: A query is in PTIME if and only if our algorithm returns one single plan. Furthermore, our approach is a generalization of a family of efficient ranking functions from graphs to hypergraphs. We also note that the techniques developed in this paper apply immediately to lifted inference from statistical relational models since lifted inference corresponds to PTIME plans in probabilistic databases.

Approximate lifted inference with probabilistic databases
Wolfgang Gatterbauer, Dan Suciu
PVLDB 8(5):629640, 2015.
[paper], [paper (arXiv:1412.1069)], [bib]
Invited to the Special Issue of VLDB Journal on VLDB 2015.
This paper proposes a new approach for approximate evaluation of #Phard queries with probabilistic databases. In our approach, every query is evaluated entirely in the database engine by evaluating a fixed number of query plans, each providing an upper bound on the true probability, then taking their minimum. We provide an algorithm that takes into account important schema information to enumerate only the minimal necessary plans among all possible plans. Importantly, this algorithm is a strict generalization of all known results of PTIME selfjoinfree conjunctive queries: A query is safe if and only if our algorithm returns one single plan. We also apply three relational query optimization techniques to evaluate all minimal safe plans very fast. We give a detailed experimental evaluation of our approach and, in the process, provide a new way of thinking about the value of probabilistic methods over nonprobabilistic methods for ranking query answers.

Oblivious bounds on the probability of Boolean functions
Wolfgang Gatterbauer, Dan Suciu
ACM TODS, vol. 39, no. 1, pp. 191208, 2014.
[paper], [paper (preprint)], [paper (arXiv:1409.6052)], [SQL code], [bib]
Prior superseded working paper "Optimal Upper and Lower Bounds for Boolean Expressions by Dissociation": arXiv:1105.2813
This paper develops upper and lower bounds for the probability of Boolean functions by treating multiple occurrences of variables as independent and assigning them new individual probabilities. We call this approach dissociation and give an exact characterization of optimal oblivious bounds, i.e. when the new probabilities are chosen independent of the probabilities of all other variables. Our motivation comes from the weighted model counting problem (or, equivalently, the problem of computing the probability of a Boolean function), which is #Phard in general. By performing several dissociations, one can transform a Boolean formula whose probability is difficult to compute, into one whose probability is easy to compute, and which is guaranteed to provide an upper or lower bound on the probability of the original formula by choosing appropriate probabilities for the dissociated variables. Our new bounds shed light on the connection between previous relaxationbased and modelbased approximations and unify them as concrete choices in a larger design space. We also show how our theory allows a standard relational database management system (DBMS) to both upper and lower bound hard probabilistic queries in guaranteed polynomial time.
People & Affiliations
(Assistant Professor @ CMU)  
(Professor @ UW) 
Funding
This project has been generously supported by the NSF via grants IIS0915054 (the BeliefDB project) and IIS1553547. Any opinions, findings, and conclusions or recommendations expressed in this project are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.