Hi Samuel,
I started to look at the issues around feature selection. Most of the
methods in current use are fairly non-stochastic. The statistical methods are
robust to features that are independent of the target (i.e. features that tell
you nothing about the target class/value), and so not many statisticians have
considered reducing the number of features.
In the more traditional machine learning arena this problem has been looked
at by a number of authors. Principal Component Analysis is commonly used -
search on http://www.researchindex.org/ (a rough sketch of the PCA route is
at the end of this message). A new branch of research claims
to work well with highly dependent variables, and uses clustering. I
presented a seminar on this branch of work, which you can find at:
http://www.cs.auckland.ac.nz/~pat/760_2001/seminars/nicks760.html
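To give a flavour of the clustering idea - this is only a hypothetical
sketch of the general approach (group highly correlated features and keep
one representative per group), not the specific method from the seminar,
and the libraries and names are just what I would reach for to illustrate
it:

    # Hypothetical sketch: cluster highly dependent (correlated) features
    # and keep one representative per cluster.  Synthetic data only.
    import numpy as np
    from scipy.cluster.hierarchy import fcluster, linkage

    rng = np.random.default_rng(1)
    base = rng.normal(size=(200, 5))                 # 5 underlying signals
    # 20 raw features: noisy copies of the 5 underlying signals.
    X = np.hstack([base + 0.1 * rng.normal(size=(200, 5)) for _ in range(4)])

    # Distance between two features: 1 - |correlation|.
    corr = np.corrcoef(X, rowvar=False)
    dist = 1.0 - np.abs(corr)
    condensed = dist[np.triu_indices_from(dist, k=1)]   # condensed form for linkage

    Z = linkage(condensed, method="average")
    labels = fcluster(Z, t=0.3, criterion="distance")   # cut the tree at distance 0.3

    # Keep the lowest-index feature in each cluster as its representative.
    selected = [int(np.where(labels == c)[0][0]) for c in np.unique(labels)]
    print("kept feature indices:", selected)
    X_selected = X[:, selected]                          # reduced feature matrix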
I should point out that the mathematical justification for these
clustering-based methods is immature - some shortcomings are highlighted on
the website.
;)
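For comparison, here is what the PCA route mentioned above looks like - a
minimal sketch on synthetic data, assuming scikit-learn is to hand. Note
that PCA builds new features as linear combinations of the old ones rather
than selecting a subset of them:

    # Minimal PCA sketch on synthetic data (scikit-learn assumed).
    # PCA projects onto new axes rather than picking original features.
    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 20))            # 200 samples, 20 raw features

    pca = PCA(n_components=5)                 # keep the 5 strongest components
    X_reduced = pca.fit_transform(X)          # shape (200, 5)

    print(X_reduced.shape)
    print(pca.explained_variance_ratio_)      # variance captured per component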
Regards,
Nick Hynes.