CS 475 / 575 -- (Desktop and Mobile) Parallel Programming

Spring Quarter 2012

http://cs.oregonstate.edu/~mjb/cs575




This page was last updated: August 6, 2012


Announcements:


What We Will Be Doing This Quarter

The goal of this course is to leave you "career-ready" (i.e., both work-ready and research-ready) for tasks that require desktop parallelism, both on a CPU and on a GPU.

CS 475/575 topics include:



Prerequisites

This course will be a very intense experience in C/C++ programming. As such, you should come in already proficient in, at least, C. This will not be a good time to learn C from scratch. Familiarity with data structures would also be helpful.

Some knowledge about computer architecture (e.g., cores, cache) would be a plus.

Learning Outcomes

On completion of the course, students will have demonstrated the ability to:

  1. Use physics and Moore's Law to explain the clock-speed limitations of computing
  2. Use Amdahl's Law to explain the limitations of parallel computing
  3. Demonstrate "parallel thinking" in program design
  4. Explain the difference between ILP, TLP, DLP, and SIMD
  5. Demonstrate the ability to program parallel algorithms in TLP, DLP, and SIMD
  6. Characterize what types of problems are best suited to parallelization
  7. Characterize different parallel programming patterns and what types of problems they best address
  8. Characterize how cache issues affect parallel performance
  9. Demonstrate the proper use of synchronization to avoid race conditions, deadlock, and livelock
  10. Characterize the benefits of using a CPU versus using a GPU for parallel programming

    In addition, those taking this course as CS 575 will also have demonstrated the ability to:

  11. Perform and benchmark reduction in OpenCL



Professor

The class is being taught by Professor Mike Bailey.

If you need it, you will have access to the graphics systems in OSU's Computer Graphics Education Lab (CGEL) in Batcheller Hall 244 for OpenCL GPU access.
Office: Kelley 2117
Phone: 541-737-2542
E-mail: mjb@cs.oregonstate.edu
Web site: http://cs.oregonstate.edu/~mjb

Office Hours:

Sundays 7:00-8:00 (PM) Instant Messaging
Mondays 2:00 - 4:00 Kelley 2117
Tuesdays 2:30 - 4:30 Kelley 2117
Fridays 10:30 - 12:30 Kelley 2117
    or, anytime my office door is open
    or, by appointment -- send email

The Virtual Hand Raise (VHR)

I recognize that it sometimes takes a certain amount of courage to ask a question in class. But, the worst thing of all is to not ask! So, this class also offers a feature called the Virtual Hand Raise. Click here to get into it. It will allow you to send me a question or comment, completely anonymously. I will answer questions submitted this way by email to the class or in class.



Textbook

You are expected to have the following book handy:

Peter Pacheco, An Introduction to Parallel Programming, Morgan-Kaufmann, 2011.

This is available from the OSU bookstore. There will be assigned readings from it. You don't need your own copy, but at least have one you can share. Other course material will consist of web pages, handouts, and notes taken in class.



Other Good References



Lecture Schedule

To see an Academic Year calendar, click here.

Class lecture time is: Monday, Wednesday, and Friday, 1:00 - 1:50. Unless otherwise announced, all classes will be in Owen 101.

  Date Reading Topics
1 April 2   Introduction. Syllabus. What this course is ... and isn't.
Timing. Graphing.
The two things we care about Parallel Processing for. Examples.
2 April 4 Chapter 2.1: 15-18, 29-32 Von Neumann architecture.
Clock speed.
Moore's Law. What holds, what doesn't.
Multicore. Hyperthreading.
A special kind of parallelism: Single Instruction Multiple Data (SIMD). Intel SSE instructions: what they are, how to use them. Types of problems that work this way.
3 April 6   SSE: Multiplying two arrays into a third array. Project #2.
Fourier analysis. Autocorrelation.
SSE: Multiplying two arrays and summing them. Project #3.
4 April 9 Chapter 2.2: 25-29 More introduction to parallel programming.
ILP, TLP, DLP.
Flynn's taxonomy.
5 April 11 Chapter 2.3: 32-34, 53 Definitions used in the discussion of parallel programming.
Threads. Thread safety.
Functional decomposition. Data decomposition.
6 April 13 Chapter 5.1, 5.2, 5.3: 209-221 OpenMP: fork-join model, pragmas, what it does for you, what it doesn't do for you.
7 April 16 Chapter 2.4.3: 49 ; Chapter 5.5: 224-232 SIGHPC: ACM's Special Interest Group on High Performance Computing (a special presentation by Prof. Cherri Pancake)
OpenMP: parallelizing for-loops
OpenMP: variable sharing, dynamic vs. static thread assignment.
Chunksize.
Summing. Not doing anything special vs. critical vs. atomic vs. reduction.
Trapezoid integration.
Synchronization. Race conditions. Deadlock, livelock.
Mutexes.
Barriers.
8 April 18 Chapter 2.6: 58-65 Project #4.
OpenMP: sections, tasks.
Project #5.
Timing. Speedup. Amdahl's Law. Parallel efficiency. Parallel scalability.
9 April 20 Chapter 2.2, 2.3: 19-25, 43-45 ; Chapter 5.9: 251-256 Caches. Architecture. Hits. Misses.
10 April 23 Chapter 5.4-8: 221-251 Caches. False sharing.
Project #6.
11 April 25 Chapter 2.7: 65-70 Designing parallel programs.
12 April 27   Designing parallel programs.
13 April 30 Chapter 2.2, 2.3: 19-25, 43-45 ; Chapter 5.9: 251-256 Test #1 review
14 May 2   Pthreads. Library calls.
15 May 4   Test #1
16 May 7   Go over the test answers.
Pthreads.
17 May 9   Pthreads. Mutexes, barriers, condition variables.
18 May 11   GPU 101.
Architecture.
What they are good at. What they are not good at. Why?
19 May 14 Chapter 4.6: 168-171, 176-190 More GPU 101.
20 May 16   OpenCL: What is it? Diagram. Mapping onto GPU architecture.
OpenCL library. Querying configurations.
21 May 18 ---- Engineering Expo -- No Class Today
22 May 21   More OpenCL.
Project 8.
23 May 23   Outline of the Project 8 C++ OpenCL program.
OpenCL Reduction.
24 May 25   OpenCL Events
OpenCL / OpenGL Interoperability
25 May 28 ---- Memorial Day -- OSU Holiday -- No Class Today
26 May 30   More OpenCL / OpenGL Interoperability
Project #9.
27 June 1   Guest Speaker: Patrick Neill, NVIDIA: "GPU Architectures"
28 June 4   Dumping OpenCL Assembly Language.
The Message Passing Interface (MPI)
29 June 6   Guest Speaker: Michael Wrinn, Intel: "Parallel Design Patterns"
30 June 8   "Count the Parallelisms"
Class Evaluations.
More Information.
Test #2 review.
* June 13   Test #2 Wednesday, June 13 9:30 - 11:00 AM.



Projects

Project # Points Title Due Date
1 20 Register your Grade-Posting Alias April 4
2 60 SSE: Array multiplication. April 11
3 100 SSE: Autocorrelation April 18
4 100 OpenMP: Numeric integration April 27
5 100 OpenMP: N-body problem May 4
6 70 False sharing. May 11
7 100 Pthreads Functional Decomposition May 21
8 100 OpenCL Array Multiplication May 30
8B 60 CS 575: OpenCL Reduction May 30
9 120 OpenCL/OpenGL Particle System June 11

Project Notes


Project Turn-In Procedures


Bonus Days and Late Assignments

Projects are due at 23:59:59 on the listed due date, with the following exception:

Each of you has been granted five Bonus Days, which are no-questions-asked one-day extensions which may be applied to any project, subject to the following rules:

  1. No more than 3 Bonus Days may be applied to any one project
  2. Bonus Days cannot be applied to tests
  3. Bonus Days cannot be applied such that they extend a project due date past the start of Test #2.

Click here to get a copy of the Bonus Day Submission Form. Fill this out and turn it in at the next class period after turning in your project.

Once the due date has passed and you have exhausted all your eligible Bonus Days, projects can still be turned in for 50% credit, as long as you turn them in within two weeks of the original due date. You still need to turn in your paper bundle. (That's how I will know to go look for your project in the turn-in area.)



Grading

Grades will be posted through this web page. To protect your privacy, they will be posted under the alias you give me in Project #1.

Click here to see the current grade posting.

CS 475/575 will be graded on a fill-the-bucket basis. There will be 10 projects and two tests. In addition, the CS 575 people have an extra project. You get to keep all the points you earn.

Your final grade will be based on your overall class point total. Based on an available point total of 1030, grade cutoffs will be no higher than:

Points Grade
1000
970 B+
940
910 C+
880
850 D+
830



Downloadable Files

  1. simd.h
  2. simd.cpp



Handouts

Don't print these until you are told to do so in the Announcements section.
Sometimes I will put notes out here that are not quite complete, just to show you where we are headed.

  1. Moore's Law: 1pp, 2pp, 6pp
  2. More Background Information 1pp, 2pp, 6pp
  3. Parallel programming using OpenMP 1pp, 2pp, 6pp
  4. Trapezoid Integration with OpenMP 1pp, 2pp, 6pp
  5. Speed-up 1pp, 2pp, 6pp
  6. Bubble Sort Case Study 1pp, 2pp, 6pp
  7. Caching Issues in Multicore Performance 1pp, 2pp, 6pp
  8. Parallel Program Design Patterns and Strategies 1pp, 2pp, 6pp
  9. pthreads 1pp, 2pp, 6pp
  10. GPU 101 1pp, 2pp, 6pp
  11. OpenCL 1pp, 2pp, 6pp
  12. OpenCL Files first.cpp, first.cl
  13. OpenCL Reduction 1pp, 2pp, 6pp
  14. OpenCL Events 1pp, 2pp, 6pp
  15. OpenCL / OpenGL Vertex Buffer Interoperability 1pp, 2pp, 6pp
  16. OpenCL / OpenGL Texture Interoperability 1pp, 2pp, 6pp
  17. Dumping OpenCL Assembly language 1pp, 2pp, 6pp
  18. The Message Passing Interface 1pp, 2pp, 6pp
  19. August 6, 2012 New with OpenGL 4.3 -- Compute Shaders! 1pp, 2pp, 6pp
  20. Finding More Information PDF



Class Rules



Students With Disabilities

Accommodations are collaborative efforts between students, faculty and Disability Access Services (DAS). Students with accommodations approved through DAS are responsible for contacting the faculty member in charge of the course prior to or during the first week of the term to discuss accommodations. Students who believe they are eligible for accommodations but who have not yet obtained approval through DAS should contact DAS immediately at 737-4098.



Other Useful Online Parallel Programming Information