CS 589 Fall 2011
Empirical Lab Studies (Quantitative) of Programming

(To understand humans' use of languages, environments, and practices of software development)

Instructor: Dr. Burnett
Office: KEC 3051
E-Mail: burnett@cs.orst.edu

Office hours are listed on my home page.

Course Description

The course is about empirical methods for understanding humans' use of languages, environments, etc., in software development. There are many empirical methods, and we can't cover them all. Thus, this course will focus on one method that is very useful for certain kinds of questions about the HCI (human aspects) of software development, but is not well understood in computer science: how to scientifically conduct and analyze (statistically oriented) laboratory studies with human participants.

This course will cover how you go about designing, preparing for, running, analyzing, and writing up for publication quantitative (statistical) lab experiments of programming situations involving human participants. This is end-to-end coverage of the entire process, and will put you in a position to conduct lab studies of your own with human participants.

Note that this is not a statistics course, although we will cover a couple of basic stats. If what you really want is a statistics-for-experiments course, I recommend the Stats 511 and/or Stats 515 courses, which are excellent, and are specifically targeted to non-stats grad students.

Course objectives

You can think of this as a "research methods" course, focusing on the research method of doing this type of empirical work. The goals are that by the end of this course, you will be able to:

Choose when a (statistically oriented) lab study is the right choice of empirical work.
Design, conduct, and gather data in such lab studies...
... according to accepted ethical principles of dealing with human subjects.
Analyze data in lab studies using quantitative (statistical) methods.
Report quantitative lab study empirical work in research publications.

How to sign up

This course is listed as CS 589 (Special Topics in Programming Languages). Register for 4 hours of CS 589. It will count toward the "Languages/Systems" category of requirements.

Prerequisites

Contrary to what the catalog or scheduling system might say, the only prerequisite for this course is grad standing in Computer Science.

How the course will be conducted, method of instruction

You'll actually do a lab study of some programming language, tool, or practice, with a team member. You and your teammate will choose the language/tool/practice you want to study:

Any idea within the scope of this course will be fine; it does not have to relate to your research or thesis activity. However, it is allowed to relate to your research if you want it to (read on).
Optionally, this can be a study you need for your research, with the IRB approval already in place very early in the course (so it is probably something your advisor has been planning to do for a while).
Optionally, what you do in this class could turn out to be a pilot in preparation for a "real" study related to your thesis and/or research.

I'm anticipating little or no programming in this course. About half of the classes will be lectures by me. The other half will be more studio/discussion style, with teams discussing and critiquing each other's work, based on team presentations in which the class jointly provides feedback on some aspect of a team's case study. In short, it will be highly interactive.

Textbooks

Required, ISBN/SKU 9781848000438, Shull et al., Guide to Advanced Empirical Software Engineering, Springer, 2008.

We will also have selected readings from other sources, but you don't have to buy those.

Components of your grade

There will be one midterm, one final exam, assignments (most of which will contribute toward your team project), and a final presentation of your study. In addition, every student will need to participate as a human subject in 3 (tentative number) of the studies your classmates are conducting.

Note: This class does not have a TA, and so not all assignments can be graded formally. You will not know in advance which assignments will be graded formally and which will not.

I have high expectations, and expect performance worthy of graduate students in computer science. Thus, in this class, "A" does not mean "adequate" or "nothing wrong" -- it means "excellent". For an A, you should expect to dig deep and get the most you can out of the class.

Weights:

Midterm = 20%
Final exam = 25%
Interim assignments and presentations = 30%
Final project = 25%

Tentative schedule

Subject to change! Please check back every week for updates.

Week 1 (Sept 27-...): Introduction to empirical studies.
- Types of empirical studies, when to use what.
- Ethical issues when working with human subjects.
- ==Read chapter 11: sections 1-5 and section 8.
- ==Skim chapter 9.
- ==HW#1: Take the "CITI" IRB ethics of human subjects tutorial. Hardcopy certificate due in class Thurs 9/29.
  - If you have previously obtained this certificate, turning in a copy of your old one is fine -- you do not have to take the tutorial again.
  - It is wise to keep a pdf of this certificate for reuse later, because the IRB sometimes loses their copy.
- ==HW#2 (see below for due date): Start thinking about who your team member should be and what you would like to do for your study project. Here is a resource page.

Weeks 2-3 (Oct 4-...): Designing statistical studies.
- Win-win research questions.
- What types of data you might collect to answer the research questions.
- The concept of statistical significance, statistical dangers (noise and where it can come from, type I errors, type II errors).
- The humans: population sources (real people, mechanical turks, ...), between-subject vs. within-subject, how many groups, random assignment, counterbalancing, Latin Squares.
- Designing the tutorial.
- Designing the task, materials, time limits.
- ==Read Mini-Crowdsourcing End-User Assessment of Intelligent Assistants: A Cost-Benefit Study, (sample experiment paper).
- ==Read about Mechanical Turk: Amazon Mechanical Turk. Requester Best Practices Guide and Crowdsourcing user studies with Mechanical Turk.
- ==HW#2: Choose your platform, initial research questions, teammate. Here is a resource page. Turn in electronically by sending email to Dr. Burnett (my lastname at eecs.oregonstate.edu) by Tues., Oct. 4, 4:00 pm
- ==HW#3a: Design your tutorial. Turn in electronically by Tues., Oct. 11, 4:00 pm.
- ==Presentation: On Tues., Oct. 11, one team will present their tutorial, and the rest of us will critique it. That team is: Tyler/Jennifer
- ==HW #3b: Critique the team's tutorial you have been given. (The presenting team is exempt from this assignment.) Turn in electronically by Thurs., Oct. 13, 4:00 pm.
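
To make the counterbalancing topics above a little more concrete, here is a minimal sketch of how a within-subject design might assign condition orders with a Latin square and random assignment. It is only an illustration under assumptions: the function names (latin_square, assign_orders), the condition labels, and the participant IDs are all hypothetical, and a simple cyclic square is used, which balances the position of each condition but not which condition immediately precedes which.

```python
import random

def latin_square(conditions):
    """Build a cyclic Latin square: row i is the condition list rotated left by i.

    Every condition appears exactly once in each row (one participant's order)
    and exactly once in each column (each position in the session).
    """
    n = len(conditions)
    return [[conditions[(i + j) % n] for j in range(n)] for i in range(n)]

def assign_orders(participants, conditions, seed=0):
    """Randomly assign participants to rows of the Latin square.

    Participants are shuffled first (random assignment), then dealt across
    the rows in turn so each order is used about equally often.
    """
    rng = random.Random(seed)  # fixed seed so the plan can be reproduced
    shuffled = list(participants)
    rng.shuffle(shuffled)
    square = latin_square(conditions)
    return {p: square[k % len(square)] for k, p in enumerate(shuffled)}

# Hypothetical example: three tool conditions, six participants.
conditions = ["Tool A", "Tool B", "Tool C"]
participants = ["P1", "P2", "P3", "P4", "P5", "P6"]

orders = assign_orders(participants, conditions)
for participant in sorted(orders):
    print(participant, "->", orders[participant])
```

With six participants and three conditions, each of the three orders is used twice. If carryover from one condition to the next is a concern, a design that also balances immediate sequence would be a better choice than this simple cyclic square.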

Week 4 (Oct 18-...): The data collection devices
- Surveys/questionnaires and log data: how to do them, problems to watch out for.
- ==Read chapter 3 (more readings may be added too).
- ==HW #4: Design your experiment plan. Turn in electronically by sending email to Dr. Burnett (my lastname at eecs.oregonstate.edu) by Thurs., Oct. 20, 4:00 pm, structured as described on the projects page (you can add roles and other things that don't quite fit into their structure at the end). Your homework should include the following information:
  - All human subject decisions: How many subjects, Mechanical Turks or "live", within/between, how you will recruit, whether you'll use Latin square, randomization plan, etc. (This is for your real experiment -- the CS589 version will have only 4-5 subjects, but we'll find ways to pretend like you got more.)
  - Experiment roles: who on your team will fulfill what roles.
  - Procedures, task, and related materials: what are your subjects going to do and in what order. If there are materials they'll use during the task (e.g., source code, prototype, handouts), include pictures of them in the packet you turn in.
  - The following team will present their project's response to the above 3 bullets on Tues. Oct. 18: Qingqing/Yonglei.
  - Data: What data you'll collect: questionnaires, log data files, any other data you intend to collect. For questionnaires, include the questionnaire itself. For log data files, include the list of data items you'll have in the log file.
  - Include the research questions (you turned these in before, but you can update them now if you want to) -- and with each, point to the data item that will enable you to answer that question.
  - The following team will present their project's plan for the above 2 bullets on Tues. Oct. 18: Karl, Balaji, Saikat.

Week 5 (Oct 25-...): Formative user studies and pilots (studies to perform before the "real" study, aka "studying for the test"): Design and analysis.
- No class (Dr. Burnett out of town).
- ==Read the Ko et al. paper sent via email to the class list (except Sections IV and IX, which are optional.)
- ==Midterm exam on Thursday Oct. 27. (Covers lecture/readings through Week #5's guest lecture.)
- ==HW #5: Run 1 to 3 sandbox pilots. Due Tues., Nov. 1. Turn in the following electronically by 4:00 pm. (Any of ppt, doc, pdf, or txt will be fine):
  - Any problems found (with fixes) with your tutorial(s).
  - Any problems found (with fixes) with the design/use of your prototype for this experiment.
  - Any problems found (with fixes) with your questionnaire(s).
  - Any problems found (with fixes) regarding answering your RQs with the data actually collected.

Week 6 (Nov 1-...): Theory
- Empirical studies and theory
- The following team will present their HW #5 (results of their sandboxes) on Tues. Nov. 1: Wojtek/Mohammad/Faezeh.
- ==Read chapter 12.
- ==Project: Sometime between Nov. 1 and Nov. 11, actually conduct your study, using the rest of the class as your participants. (If you need the EUSES lab for this, coordinate with Dr. Burnett.)

Week 7 (Nov 8-...): Statistically analyzing the data
- No class on Tues., Nov. 8. (Dr. Burnett out of town.)
- ==Project: When you've finalized your design, you should update your write-up of your experiment's design, procedures, etc. (See my Papers page for examples. Good choices for statistical examples include 2010/#4, 2008/#1, 2008/#6. Section 5 of 2010/#3 provides a different kind of statistical example, but had to be very short due to space.)
- ==HW #6: Be a participant ("subject") in two of the empirical studies your classmates are running:
  - Participants in Karl/Saikat/Balaji's CoScripter study will be: Faezeh, Qingqing, Tyler, Wojtek, Mohammad
  - Participants in Qingqing/Yonglei's Eclipse study will be: Faezeh, Jennifer, Tyler, Saikat, Mohammad
  - Participants in Tyler/Jennifer's testing tool study will be: Qingqing, Wojtek, Balaji, Karl, Yonglei
  - Participants in Wojtek/Mohammad/Faezeh's Tasktracer study will be: Jennifer, Balaji, Saikat, Karl, Yonglei
- ==Read chapter 6.
- ==Project: No later than Nov. 12, you should start statistically analyzing your data.
- Here is a directory of R-related examples.
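
As a rough illustration of what "statistically analyzing your data" can look like for a small between-subject study, here is a minimal sketch of a two-sided permutation test on the difference in group means. It is written in Python rather than R, uses only the standard library, and is not taken from the course's example directory. The function name permutation_test and the task-time numbers are hypothetical, made up purely for illustration.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_resamples=5000, seed=0):
    """Estimate a two-sided p-value for the difference in group means.

    Under the null hypothesis the group labels are interchangeable, so we
    repeatedly reshuffle the pooled data and count how often a random split
    gives a mean difference at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add 1 to numerator and denominator so the estimate is never exactly 0.
    return (extreme + 1) / (n_resamples + 1)

# Hypothetical task completion times in minutes (made-up data).
control = [12.1, 14.3, 11.8, 15.2, 13.5]
treatment = [9.4, 10.2, 8.9, 11.1, 9.8]

p_value = permutation_test(control, treatment)
print(f"estimated p-value: {p_value:.4f}")
```

The p-value estimates how often a difference this large would arise by chance if the treatment had no effect, which ties back to the type I error discussion from Weeks 2-3. With only a handful of participants per group, as in the class versions of the studies, the smallest achievable p-value is limited by the number of distinct ways the data can be split.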

Week 8 (Nov 15-...): Statistically analyzing (cont.)
- Thur., Nov. 17 is "lab day". Bring your statistical issues in some kind of sharable form, and we'll brainstorm about them together.

Week 9 (Nov 22-...): Validity (Thanksgiving holiday Nov. 25-26):
- ==Read chapter 11's section 7, revisit chap 6's section 3.2, handout from Wohlin et al.

Week 10 (Nov 29-...): Project presentations.
- Tuesday presentations by: everyone. Staying late, i.e., until about 6:30 (pizza provided).
- No class on Thursday (Dr. Burnett out of town).
- ==Turn in Final project (email) by midnight Sunday, Dec. 4. What to turn in:
  - Your powerpoint (can be updated).
  - Your written report. You should combine your write-up of your design and research questions that you did earlier (updated if needed) with your new "Results section" that describes the statistical results in the usual way (as per acceptable standards). It should look like the Results section of an empirical research paper.

Finals week: Final exam is on Tues., Dec. 6, from 9:30 am to 11:20 am. (Note that this date is fixed, so make your vacation plans accordingly.)

Things I may work in if time permits:

Replication (Chapter 14)

Handy link to internal page.

Margaret M. Burnett
Date of last update: Nov. 30, 2011
