Information & Data Management and Analytics (IDEA) Laboratory

The vast amounts of digital data now available, i.e., Big Data, can bring about exciting advancements in various areas of science and technology. We set forth the foundations of, and build systems for, easy, effective, and efficient data management and analytics. Our research lies primarily in the areas of databases and data management.
  • Email: termehca [at]
  • Address: 3053 Kelley Engineering Center, Corvallis, OR 97330-5501

Current projects

  • CHARM: Autonomous Communication of Humans and Information Sources

    One can gain invaluable insights by integrating and analyzing available data sources, such as traditional data systems, sensors, and social media. Data sources must also interact with each other to provide the information needed for many important queries and analyses. Unfortunately, different data sources express information in different forms, and humans have their own ways of expressing their information and needs. As a result, humans and data sources cannot communicate effectively, which keeps many valuable insights out of reach. CHARM aims to design algorithms and systems that enable information sources and humans to automatically develop an effective mutual understanding and a common language through interaction. Check out the project webpage for publications and more information.
  • READY: Representation Independent Data Analytics

    The output of data analytics algorithms depends heavily on the structure and representation of their input data. To use current database analytics algorithms, users have to find the representation each algorithm expects and transform (wrangle) their data into it. These tasks are hard and time-consuming, and they are major obstacles to unlocking the value of data. READY aims to develop algorithms that return the desired results no matter how their input data is represented. Check out the project webpage for more information.

Recent News

  • Congratulations to Ben and Vahid for their paper, "The Data Interaction Game", being selected as one of the best papers of SIGMOD 2018!
  • Jose demonstrates CastorX, a system that efficiently learns over multiple heterogeneous databases using novel sampling techniques, at VLDB 2018. He presents its fundamental ideas at SIGMOD DEEM 2018.
  • Ben presents his work on helping humans and large-scale data sources to progressively and automatically develop a mutual language for effective communication via reinforcement learning at SIGMOD 2018.
  • Jose demonstrates AutoMode, a system that automatically sets the language bias for learning systems over relational data at ICDE 2018.
  • People usually believe that to get effective results for vague queries, e.g., ambiguous keyword queries, data systems have to spend a lot of time and explore many potential answers in the data. We present a lightning talk on how to query large databases both effectively and efficiently using caching techniques at ICDE 2018.
  • We present an analysis of learning strategies in game-theoretic data interaction at the SIGKDD IDEA workshop.
  • We present our work on managing evolving and heterogeneous relational databases at DBPL 2017.
  • Ben presents an overview of our work on modeling human users and data sources as rational agents who want to establish a common language, their learning mechanisms, and the interesting equilibria that appear in their interactions at HILDA 2017.
  • Jose will present his paper Schema Independent Relational Learning at SIGMOD 2017. His paper measures the robustness of learning algorithms to data representation and proposes a representationally robust, accurate, and efficient approach to learning over relational data. Here is the one-slide teaser.
  • Jose presents his paper on automatically setting the language bias of learning systems over relational data, the so-called "black magic" of relational learning, at SIGMOD-DEEM 2017.
  • Yodsawalai will present our paper Cost-effective Concept Annotation Using Taxonomies, which addresses the tradeoff between the usability and overhead of organizing data sets, at WebDB 2017.

Selected awards

  • SIGMOD best papers selection, 2018.
  • Distinguished PC member of SIGMOD 2017.
  • Best student paper award, ICDE 2011.
  • Yahoo! Key Scientific Challenges Award, 2011.
  • ICDE best papers selection, 2011.
