Publications

Academic Publications

  • Dario Amodei, Rishita Anubhai, Eric Battenberg, Carl Case, Jared Casper, Bryan Catanzaro, Jingdong Chen, Mike Chrzanowski, Adam Coates, Greg Diamos, Erich Elsen, Jesse Engel, Linxi Fan, Christopher Fougner, Awni Y. Hannun, Billy Jun, Tony Han, Patrick LeGresley, Xiangang Li, Libby Lin, Sharan Narang, Andrew Y. Ng, Sherjil Ozair, Ryan Prenger, Sheng Qian, Jonathan Raiman, Sanjeev Satheesh, David Seetapun, Shubho Sengupta, Chong Wang, Yi Wang, Zhiqian Wang, Bo Xiao, Yan Xie, Dani Yogatama, Jun Zhan, Zhenyao Zhu, “Deep Speech 2: End-to-end Speech Recognition in English and Mandarin”, International Conference on Machine Learning 2016, June 2016. pdf.
  • Greg Diamos, Shubho Sengupta, Bryan Catanzaro, Mike Chrzanowski, Adam Coates, Erich Elsen, Jesse Engel, Awni Y. Hannun, Sanjeev Satheesh. “Persistent RNNs: Stashing Recurrent Weights On-Chip”, International Conference on Machine Learning 2016, June 2016. pdf.
  • Song Han, Jeff Pool, Sharan Narang, Huizi Mao, Shijian Tang, Erich Elsen, Bryan Catanzaro, John Tran, William J. Dally, “DSD: Regularizing Deep Neural Networks with Dense-Sparse-Dense Training Flow”. arXiv. 2016.
  • Dario Amodei, Rishita Anubhai, Eric Battenberg, Carl Case, Jared Casper, Bryan Catanzaro, Jingdong Chen, Mike Chrzanowski, Adam Coates, Greg Diamos, Erich Elsen, Jesse Engel, Linxi Fan, Christopher Fougner, Tony Han, Awni Hannun, Billy Jun, Patrick LeGresley, Libby Lin, Sharan Narang, Andrew Ng, Sherjil Ozair, Ryan Prenger, Jonathan Raiman, Sanjeev Satheesh, David Seetapun, Shubho Sengupta, Yi Wang, Zhiqian Wang, Chong Wang, Bo Xiao, Dani Yogatama, Jun Zhan, Zhenyao Zhu, “Deep Speech 2: End-to-end Speech Recognition in English and Mandarin”. arXiv. 2015.
  • Awni Hannun, Carl Case, Jared Casper, Bryan Catanzaro, Greg Diamos, Erich Elsen, Ryan Prenger, Sanjeev Satheesh, Shubho Sengupta, Adam Coates, Andrew Ng, “Deep Speech: Scaling up end-to-end speech recognition”. arXiv. 2014.
  • Sharan Chetlur, Cliff Woolley, Philippe Vandermersch, Jonathan Cohen, John Tran, Bryan Catanzaro, Evan Shelhamer, “cuDNN: Efficient Primitives for Deep Learning”. arXiv. 2014.
  • Saurav Muralidharan, Manu Shantharam, Mary Hall, Michael Garland, Bryan Catanzaro, “Nitro: A Framework for Adaptive Code Variant Tuning”, International Parallel and Distributed Processing Symposium, 2014. pdf
  • Bryan Catanzaro, Alexander Keller, Michael Garland, “A Decomposition for In-place Matrix Transposition”. Principles and Practice of Parallel Programming (PPoPP) 2014, pages 193-206, Orlando, Florida. pdf
  • Adam Coates, Brody Huval, Tao Wang, David Wu, Andrew Ng, Bryan Catanzaro, “Deep learning with COTS HPC systems”. International Conference on Machine Learning 2013, pages 1337-1345, Atlanta, Georgia. pdf
  • Michael Anderson, Bryan Catanzaro, Jike Chong, Ekaterina Gonina, Kurt Keutzer, Chao-Yue Lai, Mark Murphy, David Sheffield, Bor-Yiing Su, Narayanan Sundaram, “Considerations When Evaluating Microprocessor Platforms”. USENIX Workshop on Hot Topics in Parallelism, May 2011. pdf
  • Bryan Catanzaro, Michael Garland, Kurt Keutzer, “Copperhead: Compiling an Embedded Data Parallel Language”. Principles and Practice of Parallel Programming (PPoPP) 2011, pages 47-56. pdf
  • Bryan Catanzaro, Armando Fox, Kurt Keutzer, David Patterson, Bor-Yiing Su, Marc Snir, Kunle Olukotun, Pat Hanrahan, Hassan Chafi, “Ubiquitous Parallel Computing from Berkeley, Illinois and Stanford”. IEEE Micro, Volume 30, Number 2, pages 41-55, March 2010. pdf
  • Bryan Catanzaro, Bor-Yiing Su, Narayanan Sundaram, Yunsup Lee, Mark Murphy, Kurt Keutzer, “Efficient, High-Quality Image Contour Detection”. International Conference on Computer Vision, pages 2381-2388, Kyoto, Japan, 2009. pdf
  • Andreas Klöckner, Nicolas Pinto, Yunsup Lee, Bryan Catanzaro, Paul Ivanov, Ahmed Fasih. “PyCUDA: GPU Run-Time Code Generation for High-Performance Computing.” Computing Research Repository, 2009. pdf
  • Bryan Catanzaro, Shoaib Kamil, Yunsup Lee, Krste Asanovic, James Demmel, Kurt Keutzer, John Shalf, Kathy Yelick, Armando Fox. “SEJITS: Getting Productivity and Performance with Selective Embedded JIT Specialization”. Programming Models for Emerging Architectures, Raleigh, NC, 2009. pdf
  • Bryan Catanzaro, Narayanan Sundaram and Kurt Keutzer, “Fast Support Vector Machine Training and Classification on Graphics Processors”. International Conference on Machine Learning 2008, pages 104-111, Helsinki, Finland. pdf
  • Bryan Catanzaro, Kurt Keutzer and Bor-Yiing Su, “Parallelizing CAD: A Timely Research Agenda for EDA”, in Proceedings of the Design Automation Conference 2008, pages 12-17, ACM. pdf
  • Bryan Catanzaro, Narayanan Sundaram and Kurt Keutzer, “A Map Reduce Framework for Programming Graphics Processors”, Workshop on Software Tools for Multi-Core Systems 2008. pdf
  • Jike Chong, Nadathur Satish, Bryan Catanzaro, Kaushik Ravindran, Kurt Keutzer, “Efficient Parallelization of H.264 Decoding with Macro Block Level Scheduling”, IEEE International Conference on Multimedia & Expo 2007. pdf
  • Krste Asanovic, Rastislav Bodik, Bryan Catanzaro, Joseph Gebis, Parry Husbands, Kurt Keutzer, Dave Patterson, William Plishker, John Shalf, Sam Williams and Kathy Yelick, “The Landscape of Parallel Computing Research: A View from Berkeley”, Technical Report UCB/EECS-2006-183, EECS Department, University of California, Berkeley, December 18, 2006. pdf
  • Bryan Catanzaro and Brent Nelson, “Higher Radix Floating-Point Representations for FPGA-Based Arithmetic”, IEEE Symposium on Field-Programmable Custom Computing Machines, pages 161-170, 2005. pdf

Theses

  • Bryan Catanzaro, “Compilation Techniques for Embedded Data Parallel Languages”. PhD thesis, University of California, Berkeley, 2011. pdf
  • Bryan Catanzaro, “Higher Radix Floating-Point Representations for FPGA-Based Arithmetic”. MS thesis, Brigham Young University, 2005. pdf

Book Chapters

  • Andreas Klöckner, Nicolas Pinto, Yunsup Lee, Bryan Catanzaro, Paul Ivanov, Ahmed Fasih. “GPU Scripting and Code Generation with PyCUDA,” In GPU Computing Gems, Volume 2. Morgan Kaufmann, 2011.

Magazine Articles

  • Bryan Catanzaro and Kurt Keutzer, “Parallel computing with patterns and frameworks”, XRDS: Crossroads, the ACM Magazine for Students. Volume 17, Issue 1, Pages 22-27, Fall 2010. pdf