Xiaoli Z. Fern, Ph.D
Associate Professor
To contact me:
Office: Kelly 3073
Phone: (541)737-2557
e-mail: xfern AT eecs.oregonstate.edu
Quick links: Teaching, Research Projects, Students, CV, Publication
list, The bioacoustic project, Career Grant
Education:
Ph.D., Computer Engineering, ECE, Purdue University, Indiana, USA, 2005
M.S., Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, China, 2000
B.S., Automation, Shanghai Jiao Tong University, Shanghai, China, 2000
Short Biography:
Dr. Xiaoli Fern has been an assistant professor at the School of
Electrical Engineering and Computer Science, Oregon State
University, Corvallis, OR, since 2005. She received her Ph.D.
degree in Computer Engineering from Purdue University, West
Lafayette, IN, in 2005 and her M.S. degree from Shanghai Jiao Tong
University (SJTU), Shanghai, China, in 2000. Her general research
interest is in the area of machine learning and data mining. She
received an NSF CAREER Award in 2011. She co-organized the first
International Workshop on Discovering, Utilizing and Summarizing
Multiple Clustering (MultiClust KDD 2010), and served as the
publicity chair for the International Conference on Machine
Learning in 2007. Dr. Fern is currently an editorial board member
of the Machine Learning Journal and serves regularly on the program
committees of a number of top-tier international conferences on
machine learning and data mining, such as ICML, ECML, AAAI, KDD,
ICDM, and SIAM SDM.
Research:
My primary research interest is in the areas of machine learning and
data mining. My research is largely driven by practical applications
and the challenges they present to traditional machine learning and
data mining techniques. Below is a sample of my current and past
research projects and some selected publications related to these
projects. See my
publication list for a more complete list of publications.
Explorative data clustering involves grouping objects into clusters
such that similar objects are grouped together. My research
attempts to advance the field of unsupervised clustering in a
number of directions.
First, objects in a data set may be similar to each other in
multiple different ways, so different clustering structures may
exist in the same data. I am therefore interested in exploratively
examining data in different ways to produce different clusterings.
Such clusterings can sometimes be combined to provide a more
reliable view of the structure of the data via cluster ensemble
methods, and at other times examined individually, as they may
provide different insights (non-redundant clustering).
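As a rough illustration of how several clusterings of the same data might be combined, the sketch below uses a co-association matrix, one common cluster ensemble technique. It is not taken from the publications listed on this page; the function names `co_association` and `consensus_clusters` and the 0.5 threshold are assumptions made for the example.

```python
def co_association(labelings):
    """Return an n x n matrix whose (i, j) entry is the fraction of
    base clusterings that place objects i and j in the same cluster."""
    n = len(labelings[0])
    m = len(labelings)
    counts = [[0.0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0
    return [[c / m for c in row] for row in counts]


def consensus_clusters(labelings, threshold=0.5):
    """Link objects whose co-association exceeds the threshold and
    return the connected components as consensus cluster labels.
    (Hypothetical helper; the threshold is an illustrative choice.)"""
    coassoc = co_association(labelings)
    n = len(coassoc)
    parent = list(range(n))  # union-find forest over the objects

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if coassoc[i][j] > threshold:
                parent[find(i)] = find(j)

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Three base clusterings of five objects (made-up labels).
base = [
    [0, 0, 1, 1, 2],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
]
print(consensus_clusters(base))
```

Pairs that most base clusterings agree on end up together in the consensus, while disagreements among individual clusterings are averaged out.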
Clustering with partial supervision and active learning. Due to the
inherently ambiguous nature of the clustering task, in practice it
is very useful to consider some user-provided side information
so that a learning algorithm can seek an underlying clustering
structure that is most consistent with the side information. For
example, such information can be expressed in terms of pairwise
constraints requiring some objects to be placed together or apart.
I am interested in developing clustering techniques that can take
into consideration richer forms of user constraints (e.g.,
comparative constraints involving multiple objects), and active
learning frameworks that effectively acquire various forms of user
input without imposing a heavy burden on the users. This work is
supported by my NSF CAREER grant. For more details, please see the
project page here.
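To make the idea of pairwise constraints concrete, here is a minimal sketch that scores candidate clusterings by how many must-link (together) and cannot-link (apart) constraints they violate. The names `count_violations` and `most_consistent` are hypothetical, and this is only an illustration of the constraint format, not the method developed in the project.

```python
def count_violations(labels, must_link, cannot_link):
    """Count pairwise constraints that a clustering violates.

    labels      -- cluster label for each object, indexed by object id
    must_link   -- pairs (i, j) that should share a cluster
    cannot_link -- pairs (i, j) that should be in different clusters
    """
    broken_together = sum(1 for i, j in must_link if labels[i] != labels[j])
    broken_apart = sum(1 for i, j in cannot_link if labels[i] == labels[j])
    return broken_together + broken_apart


def most_consistent(candidates, must_link, cannot_link):
    """Pick the candidate clustering with the fewest violated constraints."""
    return min(
        candidates,
        key=lambda labels: count_violations(labels, must_link, cannot_link),
    )


# Two candidate clusterings of four objects and two user constraints
# (all values are made up for the example).
candidates = [[0, 0, 1, 1], [0, 1, 0, 1]]
must_link = [(0, 1)]
cannot_link = [(1, 2)]
print(most_consistent(candidates, must_link, cannot_link))
```

A constrained clustering algorithm would typically fold such a penalty into its objective rather than choose among fixed candidates; the selection step here just keeps the example short.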
Selected publications:
- Wei Zhang, Akshat Surve, Xiaoli Z. Fern and Thomas Dietterich,
Learning Non-redundant Codebooks for Classifying Complex Objects,
In Proceedings of International Conference on Machine Learning,
ICML 2009. PDF
- Javad Azimi and Xiaoli Fern, Adaptive Cluster Ensemble
Selection, In Proceedings of International Joint Conference on
Artificial Intelligence, IJCAI 2009. PDF
- Xiaoli Z. Fern and Wei Lin, Cluster Ensemble Selection,
Journal of Statistical Analysis and Data Mining, Special Issue on
Best of SDM08, Volume 1, Issue 3, Pages 128-141, 2008. Preprint
- Xiaoli Z. Fern and Wei Lin, Cluster Ensemble Selection, In
Proceedings of 2008 SIAM International Conference on Data Mining
(SDM08). PDF
- Ying Cui, Xiaoli Z. Fern and Jennifer Dy, Non-redundant
multi-view clustering via orthogonalization, in Proceedings of
7th IEEE International Conference on Data Mining (ICDM07). PDF
- Xiaoli Z. Fern and Carla E. Brodley, "Cluster ensembles for
high dimensional data clustering: An empirical study", Technical
report CS06-30-02.
- Xiaoli Z. Fern and Carla E. Brodley, "Solving cluster
ensemble problems by bipartite graph partitioning", in Proceedings
of 21st International Conference on Machine Learning (ICML2004),
PDF file, Matlab implementation of the algorithm (Note: this code
is provided on an "as is" basis for research use only.)
- Xiaoli Z. Fern and Carla E. Brodley, "Random Projection for
High Dimensional Data Clustering: A Cluster Ensemble
Approach", in Proceedings of 20th International Conference
on Machine Learning (ICML2003), PDF file
- The Bird-Bioacoustic Project. In this project, we
record and analyze bird songs to predict bird species and to
understand the behavior of birds through their songs. This
is a joint project with Dr. Raviv Raich (ECE, OSU) and Dr.
Matthew Betts (Forestry, OSU). See the project webpage for more
details and publications related to this topic.
- Forrest Briggs, Xiaoli Fern and Raviv Raich,
Context-Aware MIML Instance Annotation, To appear in
Proceedings of IEEE International Conference on Data
Mining (ICDM 2013)
- Forrest Briggs, Xiaoli Fern and Raviv Raich, Rank Loss
Support Instance Machines for MIML Instance Annotation, In
Proceedings of ACM International Conference on Knowledge
Discovery and Data Mining (KDD 2012)
- Forrest Briggs, Balaji Lakshminarayanan, Lawrence Neal,
Xiaoli Z. Fern, Raviv Raich, Sarah Frey, Adam Hadley, and
Matthew G. Betts, Acoustic classification of multiple
simultaneous bird species: A multi-instance multi-label
approach, To appear in Journal of the Acoustical Society of
America, 2012
- Lawrence Neal, Forrest Briggs, Raviv Raich and Xiaoli
Fern, Time-frequency segmentation of bird song in noisy
acoustic environments, to appear in Proceedings of the
36th International Conference on Acoustics, Speech and
Signal Processing (ICASSP) 2011
- Forrest Briggs, Raviv Raich, and Xiaoli Z. Fern, Audio
Classification of Bird Species: a Statistical Manifold
Approach, to appear in Proceedings
of International Conference on Data Mining, ICDM
2009, pdf
- B. Lakshminarayanan, R. Raich, and X. Fern, A
syllable-level probabilistic framework for bird species
identification, in
Proc. IEEE International Conference on Machine Learning
and Applications, 2009.
- Experimental design and Bayesian Optimization. In this project,
we develop machine learning techniques to help scientists and
engineers design better microbial fuels by allowing them
to experiment more efficiently with different nano-structures.
This is an experimental design problem: how to choose
appropriate experiments to perform under a budget
constraint.
- Ali Jalali, Javad Azimi, and Xiaoli Fern, A Lipschitz
Exploration-Exploitation Scheme for Bayesian Optimization,
In Proceedings of ECML/PKDD 2013
- Javad Azimi, Ali Jalali, and Xiaoli Fern, Hybrid Batch
Bayesian Optimization, In Proceedings of International
Conference on Machine Learning (ICML 2012)
- Javad Azimi, Alan Fern and Xiaoli Fern, Batch Active
Learning via Coordinated Matching, In Proceedings of
International Conference on Machine Learning (ICML 2012)
- Javad Azimi, Alan Fern and Xiaoli Fern, Budgeted
Optimization with Concurrent Stochastic-Duration
Experiments, In NIPS 2011 (spotlight)
- Javad Azimi, Alan Fern and Xiaoli Fern, Batch Bayesian
Optimization via Simulation Matching, In Advances in Neural
Information Processing Systems (NIPS-2010) PDF
- Javad Azimi, Xiaoli Fern, Alan Fern, Elizabeth Burrows,
Frank Chaplen, Yanzhen Fan, Hong Liu, Jun Jiao and Rebecca
Schaller, Myopic Policies for Budgeted Optimization with
Constrained Experiments, In Proceedings of AAAI Conference on
Artificial Intelligence (AAAI-2010) PDF.
- Mining human strategies
(behavioral patterns) from human computer interaction data.
In this research, we apply data mining to HCI log
data with the goal of finding interesting behavioral
patterns that shed some light on the strategies users
employ while using software for problem solving. We
collaborate with the HCI researchers at the EUSES
consortium on this project.
- Xiaoli Fern, Chaitanya Komireddy, Valentina Grigoreanu,
and Margaret Burnett, "Mining Problem-Solving Strategies
from HCI Data", In ACM Transactions on Computer-Human
Interaction (TOCHI) 2010.
- Neeraja Subrahmaniyan, Laura Beckwith, Valentina
Grigoreanu, Margaret Burnett, Susan Wiedenbeck, Vaishnavi
Narayanan, Karin Bucht, Russell Drummond, and Xiaoli Fern,
“Testing vs. Code Inspection vs. ... What Else? Male and
Female End Users’ Debugging Strategies”, ACM Conference on
Human-Computer Interaction, April 2008 (CHI2008)
- Xiaoli Fern, Chaitanya Komireddy, Margaret Burnett,
"Mining Interpretable Human Strategies: A Case Study", to
appear in Proceedings of 7th IEEE International Conference
on Data Mining (ICDM07) pdf. A
longer tech report version can be found here pdf.
- Grigoreanu, V., Beckwith, L., Fern, X., Yang, S.,
Komireddy, C., Narayanan, V., Cook, C., Burnett, M.,
Gender Differences in End-User Debugging, Revisited: What
the Miners Found. In Proceedings of IEEE Symposium on
Visual Languages and Human-Centric Computing Languages and
Environments, Brighton, England, September 2006. pdf
- Correlation analysis on
Earth science data. In this project, the goal is to
find out how one environmental factor impacts another, for
example Sea Surface Temperature (SST) versus Sea Surface
Pressure (SSP), and vegetation index versus precipitation.
We developed a method that builds a mixture of linear
canonical correlation analysis (CCA) models to explain
correlation patterns between different domains.
- Xiaoli Z. Fern, Carla E. Brodley and Mark A. Friedl,
"Correlation clustering for learning mixture of
canonical correlation models", SIAM International
Conference on Data Mining (SDM2005). PDF file; Matlab code for
generating synthetic data with a prespecified correlation
structure.