Re: Learning BNs with hidden variables

Jim Myers (myers29@erols.com)
Wed, 7 Apr 1999 12:54:25 -0400

Frank,

I'm just completing my dissertation. My research topic is learning Bayesian
networks from incomplete data, by which I mean both missing values and
hidden variables. I used stochastic search methods (MCMC and evolutionary
algorithms) to search over both structures and completions of the
incomplete data (a form of multiple imputation).
I have a paper that was accepted by GECCO99 and a paper submitted to UAI99.
Neither paper currently addresses the hidden variable problem. My
dissertation, however, does address it, and I get some good results. Unlike
Binder et al. (who learn CPTs for a given structure), my work has
concentrated on inducing network structure with hidden variables.
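
As a rough illustration of what stochastic search over structures can look
like, here is a minimal sketch. It is not the algorithm from the
dissertation or the papers: it assumes complete binary data, a BIC-style
score, and a Metropolis rule that toggles one edge at a time. All names
(make_chain_data, bic_score, metropolis_search) and the toy chain
X0 -> X1 -> X2 are hypothetical choices made for the example.

```python
import math
import random


def make_chain_data(n=500, seed=1):
    """Toy complete data sampled from the chain X0 -> X1 -> X2 (all binary)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x0 = int(rng.random() < 0.5)
        x1 = int(rng.random() < (0.9 if x0 else 0.1))
        x2 = int(rng.random() < (0.9 if x1 else 0.1))
        rows.append((x0, x1, x2))
    return rows


def is_acyclic(parents):
    """Return True if the graph {node: set(parents)} has no directed cycle."""
    remaining = {v: set(ps) for v, ps in parents.items()}
    while remaining:
        roots = [v for v, ps in remaining.items() if not ps]
        if not roots:
            return False  # every remaining node has a parent, so a cycle exists
        for r in roots:
            del remaining[r]
        for ps in remaining.values():
            ps.difference_update(roots)
    return True


def bic_score(data, parents):
    """BIC-style score of a binary network: log-likelihood minus a penalty."""
    n = len(data)
    score = 0.0
    for v, ps in parents.items():
        order = sorted(ps)
        counts = {}  # parent configuration -> [count of v=0, count of v=1]
        for row in data:
            key = tuple(row[p] for p in order)
            counts.setdefault(key, [0, 0])[row[v]] += 1
        for c0, c1 in counts.values():
            for c in (c0, c1):
                if c:
                    score += c * math.log(c / (c0 + c1))
        # One free parameter per parent configuration of a binary node.
        score -= 0.5 * math.log(n) * 2 ** len(order)
    return score


def metropolis_search(data, n_vars, steps=2000, seed=0):
    """Metropolis walk over DAGs; returns the best-scoring structure visited."""
    rng = random.Random(seed)
    current = {v: set() for v in range(n_vars)}
    current_score = bic_score(data, current)
    best, best_score = current, current_score
    for _ in range(steps):
        a, b = rng.sample(range(n_vars), 2)
        proposal = {v: set(ps) for v, ps in current.items()}
        proposal[b] ^= {a}  # toggle the edge a -> b
        if not is_acyclic(proposal):
            continue  # reject cyclic proposals and stay where we are
        new_score = bic_score(data, proposal)
        delta = new_score - current_score
        # Uphill moves are always accepted; downhill moves sometimes are.
        if delta >= 0 or rng.random() < math.exp(delta):
            current, current_score = proposal, new_score
            if current_score > best_score:
                best, best_score = current, current_score
    return best, best_score


data = make_chain_data()
best, best_score = metropolis_search(data, n_vars=3)
print(best, best_score)
```

The occasional acceptance of a lower-scoring structure is what separates
this from greedy hill climbing. With incomplete data, the score would also
have to account for the missing entries, which this sketch does not attempt.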

The reason I used stochastic search is to avoid getting "stuck" at the
nearest local optimum. Multiple imputation also has an advantage over
techniques such as EM in that it explicitly deals with the uncertainty in
the missing data, whereas EM and similar approaches find only point
estimates of the uncertain parameters. The results were pretty good, but
there's still a good bit of work to do, as this is a very difficult problem.
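
The contrast between a point estimate and an imputation-based treatment of
the uncertainty can be sketched on a toy problem. The example below is an
illustration only, not the method from the dissertation: it assumes a fixed
two-node network X -> Y with binary variables, values of X missing completely
at random, and uniform Beta(1,1) priors. The function names and the toy data
generator are hypothetical. EM returns a single value for P(X=1); the
imputation loop (a simple data-augmentation Gibbs sampler) returns a set of
draws whose spread reflects the uncertainty due to the missing values.

```python
import random


def make_toy_data(n=200, missing_rate=0.3, seed=1):
    """Toy (x, y) pairs from X -> Y; x is None where it has been hidden."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = int(rng.random() < 0.3)
        y = int(rng.random() < (0.8 if x else 0.2))
        if rng.random() < missing_rate:
            x = None
        data.append((x, y))
    return data


def posterior_x1(y, p_x, p_y):
    """P(X=1 | Y=y) under parameters p_x = P(X=1), p_y[x] = P(Y=1 | X=x)."""
    l1 = p_x * (p_y[1] if y else 1 - p_y[1])
    l0 = (1 - p_x) * (p_y[0] if y else 1 - p_y[0])
    return l1 / (l1 + l0)


def em_point_estimate(data, iters=100):
    """Maximum-likelihood EM (no smoothing); returns one value per parameter."""
    p_x, p_y = 0.5, [0.4, 0.6]
    for _ in range(iters):
        n_x = [0.0, 0.0]   # expected number of cases with X=v
        n_xy = [0.0, 0.0]  # expected number of cases with X=v and Y=1
        for x, y in data:
            # E-step: soft-assign a missing x, hard-assign an observed one.
            w1 = posterior_x1(y, p_x, p_y) if x is None else float(x)
            for v, w in ((0, 1.0 - w1), (1, w1)):
                n_x[v] += w
                n_xy[v] += w * y
        # M-step: re-estimate the parameters from the expected counts.
        p_x = n_x[1] / len(data)
        p_y = [n_xy[v] / n_x[v] for v in (0, 1)]
    return p_x, p_y


def imputation_draws(data, n_draws=200, burn_in=50, seed=0):
    """Alternate imputing the missing x values and sampling the parameters."""
    rng = random.Random(seed)
    p_x, p_y = 0.5, [0.4, 0.6]
    draws = []
    for t in range(burn_in + n_draws):
        # Imputation step: fill in each missing x with a random draw.
        completed = []
        for x, y in data:
            if x is None:
                x = int(rng.random() < posterior_x1(y, p_x, p_y))
            completed.append((x, y))
        # Posterior step: sample parameters given the completed data.
        n1 = sum(x for x, _ in completed)
        n0 = len(completed) - n1
        p_x = rng.betavariate(1 + n1, 1 + n0)
        p_y = []
        for v, n_v in ((0, n0), (1, n1)):
            k = sum(y for x, y in completed if x == v)
            p_y.append(rng.betavariate(1 + k, 1 + n_v - k))
        if t >= burn_in:
            draws.append(p_x)
    return draws


data = make_toy_data()
p_x_em, _ = em_point_estimate(data)
draws = imputation_draws(data)
print("EM point estimate of P(X=1):", p_x_em)
print("range of imputation draws:", min(draws), max(draws))
```

Each pass of the sampler produces one completed data set, so the same loop
could feed a structure score instead of a parameter draw. That step is left
out here.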

I'd send you my dissertation, but it's not quite ready (it needs some
editing and formatting). In addition, it's too large for electronic
dissemination. My public defense is next week. If you'd like, I can send
copies of the papers along with further explanation of my approach for
dealing with hidden variables. The papers themselves deal with missing data
only; papers describing the results with hidden variables are forthcoming.

>We are also interested in experiences with implementations of
>the EM algorithm for learning Bayesian networks and how these or
>similar problems have been solved in that context.

See
Friedman, "The Bayesian Structural EM Algorithm", UAI98

Friedman, "Learning Belief Networks in the Presence of Missing Values and
Hidden Variables", ML97

Meila and Jordan, "Estimating Dependency Structure as a Hidden Variable",
NIPS 10, 1998

Geiger, Heckerman, and Meek, "Asymptotic Model Selection for Directed
Networks with Hidden Variables", UAI96

Lauritzen, "The EM Algorithm for Graphical Association Models with Missing
Data", Computational Statistics and Data Analysis, V19, pp191-201, 1995

Best Regards
Jim Myers