Zhongliang Chen, Ph.D. Candidate, Dept. of ECE, Northeastern University

I am a PhD candidate in the Department of Electrical and Computer Engineering at Northeastern University. My advisor is Dr. David Kaeli. I am a member of the Northeastern University Computer Architecture Research Group (NUCAR). I received my master's degree in Computer Science from the State Key Laboratory of Computer Architecture (CARCH), Institute of Computing Technology, Chinese Academy of Sciences (ICT, CAS) in 2010, and my bachelor's degree in Information Engineering from the School of Information and Communication Engineering, Beijing University of Posts and Telecommunications (BUPT) in 2007.


My research interests include parallel computing with Graphics Processing Units (GPUs) and computer architecture. Currently, I am working on identifying and analyzing compiler- and architecture-level scalar opportunities in GPGPU applications. These opportunities can then be exploited on a novel scalar-vector GPU architecture to improve performance and power efficiency. I also have extensive experience in GPU porting, profiling, and optimization.

During my master's studies, I conducted intensive research on microprocessor reliability in the Reliable Design Research Group.


We are designing and implementing a full-system CPU-GPU simulator that enables efficient hardware-software design exploration. The simulator contains an ARM CPU, an AMD GPU, and a bus model in SystemC. We are also developing a GPU driver and an OpenCL library for the hardware platform.
We are modeling the NVIDIA Fermi and Kepler microarchitectures on the Multi2Sim simulation framework and implementing a native ISA-level (or SASS-level) simulator.
We are porting a 3D finite-difference time-domain (FDTD) application written in Fortran to a GPU cluster using the OpenCL and MPI programming models.
We accelerated a growing neural gas network algorithm using CUDA.
We developed a forward ray tracer for 3D simulation and inversion for whole-body imaging. To make real-time inversion feasible, we used the NVIDIA OptiX ray tracing engine and the CUDA programming model.


I worked as a GPU Architecture Intern in the Advanced Processor Lab at Samsung Research America from May to August 2014. I worked on modeling and implementing a complete tessellation pipeline, including the hull shader, tessellator, and domain shader.


I worked as a Performance Compiler Engineer Intern in the AMD Shader Compiler Group from July to December 2011. My work focused primarily on compile-time scalar opportunity analysis in GPGPU applications and performance evaluation of scalar coprocessors in AMD Southern Islands GPUs.


  1. Yu Hu, Zhongliang Chen, and Xiaowei Li, "OWARE: operand width aware redundant execution for whole-processor error detection," Intelligent Automation and Soft Computing, vol. 17, no. 6, pp. 771-780, 2011.
  1. Yash Ukidave, Fanny Nina Paravecino, Leiming Yu, Charu Kalra, Amir Momeni, Zhongliang Chen, Nick Materise, Brett Daley, Perhaad Mistry, and David Kaeli, "NUPAR: a benchmark suite for modern GPU architectures," ICPE, 2015.
  2. Kathryn Williams, Luis Tirado, Zhongliang Chen, Borja Gonzalez-Valdes, Jose Martinez-Lorenzo, and Carey Rappaport, "Ray tracing simulation tool for portal-based millimeter-wave security systems using the NVIDIA OptiX ray tracing engine," USNC-URSI Radio Science Meeting, 2014.
  3. Rafael Ubal, Dana Schaa, Perhaad Mistry, Xiang Gong, Yash Ukidave, Zhongliang Chen, Gunar Schirner, and David R. Kaeli, "Exploring the heterogeneous design space for both performance and reliability," DAC, 2014.
  4. Ayse Yilmazer, Zhongliang Chen, and David Kaeli, "Scalar waving: improving the efficiency of SIMD execution on GPUs," IPDPS, 2014.
  5. Zhongliang Chen, David Kaeli, and Norman Rubin, "Characterizing scalar opportunities in GPGPU applications," ISPASS, 2013.
  6. Kathryn Williams, Zhongliang Chen, Luis Tirado, Borja Gonzalez-Valdes, Jose Martinez-Lorenzo, and Carey Rappaport, "Ray tracing for 3D simulation and inversion for whole-body imaging," APSURSI, 2012.
  7. Zhongliang Chen and David Kaeli, "Delivering 100x speedup for three-dimensional finite difference time domain (FDTD) on GPU," Workshop on Advances in GPU Computing, 2011.
  8. Yu Hu, Zhongliang Chen, and Xiaowei Li, "Using data-level parallelism to accelerate instruction-level temporal redundancy," the 4th Conference on Dependable Computing (CDC), 2010.
  9. Li Zhao, Zhongliang Chen, Yu Hu, and Xiaowei Li, "Software-hardware co-simulation based evaluation platform for reliable design of microprocessors (in Chinese)," China Test Conference (CTC), 2010.
  10. Zhongliang Chen, Yu Hu, and Xiaowei Li, "Overview of software-based fault tolerance," China Fault Tolerance Conference (CFTC), 2009.
  11. Zhongliang Chen and Yubin Huang, "The design of low-cost audio signal infrared transceiver (in Chinese)," the 3rd Annual Conference of School of Information Engineering, Beijing University of Posts and Telecommunications, 2007.
  1. Zhongliang Chen, "Research on operand-width aware fault tolerance for microprocessors (in Chinese)," Master's Thesis, Institute of Computing Technology, Chinese Academy of Sciences, 2010.
  2. Zhongliang Chen, "Research on reconfigurable boundary scan technique (in Chinese)," Bachelor's Thesis, Beijing University of Posts and Telecommunications, 2007.


In Spring 2011, I assisted Professor Kaeli in organizing a GPU seminar for undergraduate students at Northeastern University.