ECE679 Special Topic on Computer Engineering

Welcome to the Computer Memory Systems course homepage! This page provides access to class information outside of the classroom.

This page is currently under construction.


Course Description

An introduction to current semiconductor memory technologies and to design issues in modern computer memory systems. Topics include memory technologies and trends; control and access modes; memory hierarchies and allocation algorithms; characteristics of programs; and system organization.

Course Information

  • Suggested Reference books:
    1. Betty Prince, Semiconductor Memories - A Handbook of Design, Manufacture, and Application
    2. S. Przybylski, Cache and Memory Hierarchy Design: A Performance-Directed Approach, Morgan Kaufmann Publishers.
  • Pre-req: course 472/572 or equiv. with consent of instructor.

    Course Work

    Course work will consist of reading a selection of research papers, participating in class discussion, and doing a project. The project requires both an oral presentation and a written report, and all groups will present to the class. The first 7 weeks of the course will focus on the textbook and the papers; the final 3 weeks will be used for working on the projects and presentations. During the first half of those 7 weeks I will lecture on fundamental topics. After that you are required to read 2-3 papers before each lecture, prepare a summary of each paper, and discuss the papers during class time. To summarize a paper, you are to identify the following:

    The summaries are to be handed in electronically (ASCII) at the beginning of each lecture. For the discussion, each person will be expected to lead the discussion of some subset of the papers, and everyone is expected to participate.

    The course project consists of writing a survey of research papers on a chosen topic and/or conducting a simulation study that is related to your chosen topic. The survey topic is to be chosen from the topics given in the reading list of the papers that we will be discussing in class. The simulation study may be one of the future studies proposed when discussing the papers. This project is to be done in groups of 2. While we are going through the papers, you should be choosing your partner as well as a topic and thinking of a simulation study to conduct. The papers in the reading list are a starting point and you are expected to add at least 2 more papers to your survey.

    How to give a presentation?

  • Oral Presentation Advice from Mark Hill

    Lecture Notes

    1. Introduction (last update 4/1/97)
    2. Memory Hierarchy (last update 4/7/97)
    3. More on Cache (last update 4/7/97)
    4. More on Cache (last update 4/8/97)
    5. Memory Technology (last update 4/1/97)

    Tentative Reading List

    Memory Technologies - Introduction

    1. Shih-Lien Lu. Memory Devices (text in PS) (Figures in PS) Section 48, The Electronics Handbook, Edited by Jerry Whitaker, CRC Press. 1996
    2. Betty Prince. "Semiconductor Memories - A Handbook of Design, Manufacture, and Application". Wiley, 1996.
    3. Lanny Lewyn. Physical Limits of VLSI DRAM's. IEEE Journal of Solid-State Circuits, Vol. SC-20, No. 1, Feb. 1985.

    Program Locality and Memory Systems Hierarchy

    1. Doug Burger, Jim Goodman and G. Sohi. Memory Systems in The Handbook of Computer Science and Engineering, CRC Press, 1997. Also to appear in The Handbook of Electrical Engineering, CRC Press, 1997.
    2. P. J. Denning. Working Sets Past and Present. IEEE Trans. on Software Engineering, Vol. SE-6, No. 1, Jan. 1980 pp.64-84.
    3. B. Ramakrishna Rau. Program Behavior and the Performance of Interleaved Memories. IEEE Trans. on Computers, Vol. C-28, No. 3, Mar. 1979 pp.191-199.
    4. K. R. Kaplan and R. O. Winder. Cache-Based Computer Systems. Computer, March, 1973 pp.30-36.
    5. A. J. Smith. Cache Memories. Computing Surveys, Vol. 14, No. 3, pp. 473-530, Sept. 1982.
    6. Doug Burger, Jim Goodman and Alain Kägi. "Memory Bandwidth Limitations of Future Microprocessors," 23rd International Symposium on Computer Architecture (ISCA), May, 1996.

    Multi-Level Caching

    1. D. Nagle, R. Uhlig, T. Mudge, S. Sechrest. Optimal Allocation of On-chip Memory for Multiple-API Operating Systems. Proceedings of the International Symposium on Computer Architecture, pages 358-369, April 1994.
    2. M. Farrens, G. Tyson, A. Pleszkun. A Study of Single-Chip Processor/Cache Organizations for Large Numbers of Transistors. Proceedings of the International Symposium on Computer Architecture, pages 338-347, April 1994.
    3. B.L. Jacob, P.M. Chen, S.R. Silverman, T.N. Mudge. An Analytical Model for Designing Memory Hierarchies. IEEE Transactions on Computers, Vol 45, No 10, pages 1180-1194, October 1996.

    Instruction Fetching

    1. R. Yung. Design Decisions Influencing the UltraSPARC's Instruction Fetch Architecture. Proceedings of the International Symposium on Microarchitecture, pages 178-190, December 1996.
    2. E. Rotenberg, S. Bennett, J.E. Smith. Trace Cache: A Low Latency Approach to High Bandwidth Instruction Fetching. Proceedings of the International Symposium on Microarchitecture, pages 24-34, December 1996. (Technical Report version)
    3. D. Lee, J.-L. Baer, B. Calder, D. Grunwald. Instruction Cache Fetch Policies for Speculative Execution. Proceedings of the International Symposium on Computer Architecture, pages 357-367, June 1995.
    4. J. Pierce, T. Mudge. Wrong-Path Instruction Prefetching. Proceedings of the International Symposium on Microarchitecture, pages 165-175, December 1996.

    Data Fetching

    1. N.P. Jouppi. Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Associative Cache and Prefetch Buffers (DEC WRL Technical Report version). Proceedings of the International Symposium on Computer Architecture, pages 364-373, May 1990.
    2. K.I. Farkas and N.P. Jouppi. Complexity/Performance Tradeoffs with Non-Blocking Loads (DEC WRL Technical Report version). Proceedings of the International Symposium on Computer Architecture, pages 211-222, April 1994.
    3. T.-F. Chen, J.-L. Baer. Effective Hardware Based Data Prefetching for High-Performance Processors. IEEE Transactions on Computers, Vol 44, No 5, pages 609-623, May 1995.
    4. K.I. Farkas, N.P. Jouppi, P. Chow. How Useful are non-blocking Loads, Stream Buffers and Speculative Execution in Multiple Issue Processors? (DEC WRL Technical Report version). Proceedings of the International Symposium on High-Performance Computer Architecture, pages 78-89, January 1995.
    5. G. Tyson, Matthew Farrens, John Matthews, and Andrew Pleszkun. A Modified Approach to Data Cache Management. Proceedings of the 28th Annual Symposium on Microarchitecture, Nov 28 - Dec 1, 1995.

    Tools and Modeling

    1. Jeffrey D. Gee, Mark D. Hill, Dionisios N. Pnevmatikatos, Alan Jay Smith. Cache Performance of the SPEC92 Benchmark Suite, IEEE Micro, August 1993.
    2. R. E. Kessler, Mark D. Hill, David A. Wood. A Comparison of Trace-Sampling Techniques for Multi-Megabyte Caches, IEEE Transactions on Computers, June 1994.
    3. Alvin R. Lebeck and David A. Wood. Cache Profiling and the SPEC Benchmarks: A Case Study, IEEE COMPUTER, pages 15-26, October 1994.
    4. Alvin R. Lebeck and David A. Wood. Active Memory: A New Abstraction For Memory System Simulation. ACM SIGMETRICS, May 1995. (a more extended version) To appear in ACM Transactions on Modeling and Computer Simulation.
    5. Vijay S. Pai, Parthasarathy Ranganathan, and Sarita V. Adve, RSIM: An Execution-Driven Simulator for ILP-Based Shared-Memory Multiprocessors and Uniprocessors, Proceedings of the 3rd Workshop on Computer Architecture Education (held in conjunction with the 3rd International Symposium on High Performance Computer Architecture), February 1997.

    Other Unsorted Papers - May be Useful for Class Project

    1. S. J. E. Wilton and N. Jouppi. An Enhanced Access and Cycle Time Model for On-Chip Caches, DEC WRL Research Reports #WRL-TR-93.5, July, 1994.
    2. Teresa L. Johnson and Wen-mei W. Hwu. Run-time Adaptive Cache Hierarchy Management via Reference Analysis. Proceedings of the 24th International Symposium on Computer Architecture, June 2-4, 1997.
    3. Cheng-Hsueh A. Hsieh, Marie T. Conte, Teresa L. Johnson, John C. Gyllenhaal and Wen-mei W. Hwu. A Study of the Cache and Branch Performance Issues with Running Java on Current Hardware Platforms. Proceedings of COMPCON, 1997.
    4. Teresa L. Johnson and Wen-mei W. Hwu. Run-time Cache Hierarchy Management via Reference Analysis. IMPACT Technical Report, IMPACT-96-01, University of Illinois, Urbana, IL, 1996.
    5. Yoji Yamada, Teresa L. Johnson, Grant Haab, John C. Gyllenhaal, and Wen-mei W. Hwu. Reducing Cache Misses in Numerical Applications Using Data Relocation and Prefetching. Technical Report CRHC-95-04, Center for Reliable and High-Performance Computing, University of Illinois, Urbana, IL, 1995.
    6. D.C. Burger, S. Kaxiras, and J.R. Goodman. "DataScalar Architectures," 24th International Symposium on Computer Architecture (ISCA), June, 1997.
    7. Patterson, D. A Case for Intelligent RAM: IRAM. To appear, IEEE Micro, April 1997.
    8. Perissakis et al. "Scaling Processors to 1 Billion Transistors and Beyond: IRAM" (Draft), Submitted to IEEE Computer Special Issue: Future Microprocessors - How to use a Billion Transistors, September 1997.
    9. Richard Fromm et al. "The Energy Efficiency of IRAM Architectures," To appear in ISCA '97: The 24th Annual International Symposium on Computer Architecture, Denver, CO, 2-4 June 1997.
    10. Patterson et al. "A Case for Intelligent DRAM: IRAM."
    11. Patterson et al. (pdf) "Intelligent RAM (IRAM): Chips that remember and compute," 1997 IEEE International Solid-State Circuits Conference, San Francisco, CA, 6-8 February 1997.

    Simulation Tools and Traces

  • Tycho and Dinero Tools. DineroIII can be found under /nfs/bedrock/u2/dlx/dinero.

    Some traces should also be available on the OSU site. I am also in the process of getting the Monster trace set (2GB) from Intel. While waiting for it, you may want to look into:

  • BYU Trace Archive
  • NMSU Trace Database

    Other Links to Real Memory Systems

  • Pentium Pro L2 Info
  • HP's PA-RISC L1 Cache
    Created: 6 Jan 1997
    Last Updated: 8 April 1997 by Shih-Lien Lu
    Return to Shih-Lien Lu's Home page