Information & Data Management and Analytics (IDEA) Laboratory

The vast amounts of digital data available today, i.e., Big Data, can bring about exciting advancements in various areas of science and technology. This potential will be realized only if people can explore and analyze databases easily, effectively, and efficiently. We set forth the foundations of, and build systems for, easy, effective, and efficient data management and analytics. Our research lies primarily in the areas of databases and data management.
  • Email: termehca [at]
  • Address: 3053 Kelley Engineering Center, Corvallis, OR 97330-5501

Current projects

  • RIDE: Representation Independent Data Exploration

    The output of database exploration and analytics algorithms depends heavily on the structure and representation of their input data. To use current database analytics algorithms, users have to find the representation each algorithm expects and transform (wrangle) their data into it. These tasks are hard and time-consuming, and they are major obstacles to unlocking the value of data. RIDE aims to develop algorithms that return the desired results no matter how their input data is represented.
  • CHAIN: Strategic Communication of Humans And Information Systems

    Because humans and information systems express information in different forms, they do not usually communicate effectively: users cannot precisely express their information needs, and database systems do not understand users' intents. CHAIN aims to design interaction strategies and interfaces that enable users and database systems to quickly establish an effective mutual understanding and a common language. It leverages concepts from game theory to analyze the interaction between users and database systems and to find its eventual stable states.


  • Our paper: Schema independent Relational Learning will appear in SIGMOD 2017.
  • We will give an overview of our work on representation independent relational learning at ILP 2016.
  • Yodsawalai will present her work on representation independent similarity and proximity search at CIKM 2016.
  • We will give an invited talk on representation independent graph analytics at the Eighth Linked Data Benchmark Council (LDBC) Technical User Community Meeting at the Oracle Conference Center in Redwood Shores.
  • We have a manuscript on game theoretic and language game modeling of database querying and interaction.
  • We will demo Castor, our schema independent and scalable relational learning system, at VLDB'16.
  • Jose will present our schema independent learning system at the LearningSys Workshop at NIPS 2015 and at the Alberto Mendelzon International Workshop on Foundations of Data Management 2016.
  • Our paper: A Signaling Game Approach to Database Querying and Interaction will appear at the SIGIR International Conference on the Theory of Information Retrieval, 2015. It considers querying as a collaboration between two potentially rational agents, the user and the database system, to establish a mutual language for representing information and intents. We formalize this collaboration as a signaling game, where each mutual language is an equilibrium of the game.
  • We have two new manuscripts from our representation independent analytics project:
    • Our first manuscript shows that the results of relational learning algorithms tend to vary substantially with the choice of database schema, in terms of both learning accuracy and efficiency, which complicates their off-the-shelf application. It therefore proposes a schema independent, efficient, and effective learning algorithm to solve this problem.
    • Our second manuscript sets forth a novel framework to explore the representation independence of similarity and proximity search in graph data. It further proposes novel, effective similarity and proximity search algorithms that are robust to widely used representational changes.
  • We will demonstrate Universal DB, our representation independent graph analytics system, at VLDB 2015.
  • We have a new manuscript out on automatically finding the necessary amount of specificity in database design.
  • Our work on Automatic Data Organization will appear in the June issue of ACM Transactions on Database Systems (TODS), 2015.
  • Check out our vision paper on representation independent data analytics. Because the results of current data mining and machine learning algorithms depend on how their input databases are represented, developers have to spend a great deal of time and resources transforming the databases into the representations these algorithms expect. The paper argues for representation independent methods for data analytics.
  • Thanks to NSF for supporting our research through the award "III-Generalizable Similarity and Proximity Metrics For Data Exploration". The grant will fund our work on representation independent graph analytics.

Selected awards

  • Best Student Paper Award, ICDE, 2011.
  • Yahoo! Key Scientific Challenges Award, 2011.
  • ICDE Best Papers Selection, 2011.
