Welcome
The STAR Lab focuses on research, development, and educational endeavors in the broad area of computing systems and applications. We perform cutting-edge research on a variety of technologies to improve the performance, energy efficiency, reliability, and security of computing systems across a growing landscape, from embedded and mobile devices to supercomputers and data centers. Recent focus areas include machine learning accelerators, post-Moore-era architectures, extreme-scale computing, applications of AI/ML in architecture design, and the Internet of Things (IoT). We are also interested in exploring novel applications in machine learning and natural language processing (e.g., large language models) that are enabled by efficient computing systems. Below are a few ongoing and past projects.
Machine Learning for Natural Language Processing
- Large language model for simultaneous translation [Simul-LLM]
- Optimizations for large language models [e.g., extreme model compression]
- Improvements to simultaneous speech translation [ICML'23, ACL Findings'23]
- Linearized transformer models for autoregressive NLP tasks [ECML'23]
Machine Learning for Computer Architecture and System
- Intelligent dynamic resource allocation for edge network servers [FGCS'23 (IF 7.5), Neurocomputing'23 (IF 6.0)]
- Improving data center peak power shaving with deep reinforcement learning [ICAI'21]
- Deep reinforcement learning framework for architectural exploration [HPCA'20 (Best Paper Nomination)]
- Survey of machine learning applied to computer architecture [arXiv:1909.12373]
- Utilizing machine learning to characterize data communication patterns [ICCD'19]
- Improving memory controller placement in GPUs with deep learning [CAL'19]
- Founded and organizing the annual International Workshop on AI-assisted Design for Architecture (AIDArc)
Machine Learning Accelerators
- Survey on sparsity exploration in transformer-based accelerators [Electronics'23]
- Polymorphic accelerators for deep neural networks [TC'21]
- Exploring cross-layer data reuse in deep neural network accelerators [HPCA'19]
- Tolerating soft errors in deep learning accelerators [NAS'18 (Best Paper Nomination)]
- Flexible on-chip memory architecture for DCNN accelerators [AIM'17]
- Ultra-low-power accelerator for intelligent biosignal analytics in wearable IoT devices, detecting heart and brain diseases [ISCA'17]
GPU Architectures and Extreme-Scale Computing
- Silicon-interposer based chiplet GPU systems [HPCA'20]
- Removing on-chip network bottlenecks in general-purpose GPUs [IPDPS'20]
- High-performance and energy-efficient on-chip networks [HPCA'18]
- Efficient utilization of GPU cache resources and memory bandwidth [ICS'19, ICCD'18]
Harnessing Dark Silicon for Post-Moore Era Computing
- Performance-aware network-on-chip (NoC) power reduction for the dark silicon era [ISLPED'19, HPCA'15, HPCA'14]
- Reducing NoC static power with core-state-awareness [ISLPED'14]
- Effective power-gating of on-chip routers [MICRO'12]
Cache and Memory Bandwidth Partitioning
- Partitioning last-level cache with high associativity [MICRO'14]
- Analytical performance modeling for partitioning memory bandwidth [IPDPS'13]
Deadlock-free Interconnection Networks
- Resource-efficient deadlock avoidance in wormhole-switched networks [HPCA'13]
- Bubble-based deadlock-free schemes [ICS'13, JPDC'12, IPDPS'11]
Application-aware Optimizations for Many-core Processors
- Temperature-aware application mapping [DATE'15]
- Application mapping for express channel-based chip multiprocessors [DATE'14]
- Balancing on-chip latency for multiple applications [IPDPS'14]
- Region-aware interference reduction [IPDPS'13]
Transactional Memory & Parallel Programming
- Mitigating the mismatch between coherence protocols and conflict detection [IPDPS'14, SC'13]
- Reducing energy and contention in transactional memory [HPCA'13]