What the network says is, "given the data, AND assuming the independencies
represented in the naive Bayes model are correct, I'm 95% accurate".
Probably, the assumptions inherent in your naive model are violated to some
extent. For example, suppose someone's job type is "Manager" and there are
two questions "Salary" and "Tendency to delegate tasks". Your naive model
says that within the class of managers, there is no relationship between
these two variables, whereas I believe that high-paid managers are even
more eager to delegate.
So, to answer your second question, my guess would be to perform a
factor analysis to group questions into categories, create nodes for
those categories, and make questions children of their category. That way,
covariances among findings are represented by the mutual dependence of
findings via their parent; such a covariance would not be blocked by
observation of the job type.
regards,
Hans.
- --------------------------------------------------
Hans van Leijen
NOTION / Universiteit Nyenrode
The Netherlands Business School
Straatweg 25
3621 BG Breukelen
The Netherlands
Phone: + 31 346 291313
Fax: + 31 346 291250
E-mail: h.vleijen@nyenrode.nl
URL: http://www.nyenrode.nl/int/research_faculty/cscm/notion/notion.html
This archive was generated by hypermail 2b29 : Thu Feb 08 2001 - 12:09:49 PST