A 28nm 12.1TOPS/W Dual-Mode CNN Processor Using Effective-Weight-Based Convolution and Error-Compensation-Based Prediction
H. Mo1, W. Zhu1, W. Hu1, G. Wang1, Q. Li2, A. Li1, S. Yin1, S. Wei1, L. Liu1; 1 Institute of Microelectronics of Tsinghua University, Beijing, China; 2 Intel, Beijing, China
mm-Wave Transceivers for Communication and Radar
A 1V W-Band Bidirectional Transceiver Front-End with <1dB T/R Switch Loss, <1°/dB Phase/Gain Resolution and 12.3% TX PAE at 15.1dBm Output Power in 65nm CMOS Technology
W. Zhu, J. Wang, R. Wang, Y. Wang, Institute of Microelectronics of Tsinghua University, Beijing, China
A 2.75-to-75.9TOPS/W Computing-in-Memory NN Processor Supporting Set-Associate Block-Wise Zero Skipping and Ping-Pong CIM with Simultaneous Computation and Weight Updating
J. Yue1,2, X. Feng1, Y. He1, Y. Huang1, Y. Wang2, Z. Yuan1, M. Zhan1, J. Liu1, J-W. Su3, Y-L. Chung3, P-C. Wu3, L-Y. Hung3, M-F. Chang3, N. Sun1, X. Li1, H. Yang1, Y. Liu1; 1 Tsinghua University, Beijing, China; 2 Pi2star Technology, Beijing, China; 3 National Tsing Hua University, Hsinchu, Taiwan
A 5.99-to-691.1TOPS/W Tensor-Train In-Memory-Computing Processor Using Bit-Level-Sparsity-Based Optimization and Variable-Precision Quantization
R. Guo1, Z. Yue1, X. Si2, T. Hu1, H. Li1, L. Tang1, Y. Wang1, L. Liu1, M-F. Chang3, Q. Li2, S. Wei1, S. Yin1; 1 Tsinghua University, Beijing, China; 2 University of Electronic Science and Technology of China, Chengdu, China; 3 National Tsing Hua University, Hsinchu, Taiwan
A 60GHz 186.5dBc/Hz FoM Quad-Core Fundamental VCO Using Circular Triple-Coupled Transformer with No Mode Ambiguity in 65nm CMOS
H. Jia, W. Deng, P. Guan, Z. Wang, B. Chi, Tsinghua University, Beijing, China
A 250kHz-BW 93dB-SNDR 4th-Order Noise-Shaping SAR Using Capacitor Stacking and Dynamic Buffering
J. Liu1, D. Li2, Y. Zhong1, X. Tang3, N. Sun1,3; 1 Tsinghua University, Beijing, China; 2 Xidian University, Xi’an, China; 3 University of Texas, Austin, TX
ML Processors From Cloud to Edge
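Professors Shaojun Wei, Leibo Liu, and colleagues published a paper titled "A 28nm 12.1TOPS/W Dual-Mode CNN Processor Using Effective-Weight-Based Convolution and Error-Compensation-Based Prediction", presenting QNAP, an accelerator chip for quantized convolutional neural networks (CNNs). Exploiting the heavy redundancy among the weights of a quantized CNN model, the team proposed an optimization that significantly reduces the redundant multiplications caused by redundant weights, which lowers hardware power consumption. They also proposed a prediction method that reduces the redundant multiply-accumulate operations caused by the ReLU activation function, which significantly improves the runtime performance of the CNN hardware. In addition, for the widely used residual structure, the team proposed a dedicated pipeline structure that removes a large number of off-chip memory accesses. Built in TSMC 28 nm technology, the QNAP chip achieves an energy efficiency of up to 12.1 TOPS/W in an area of only 1.9 mm², significantly better than previously reported results.
QNAP chip photo and hardware specifications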
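The redundant-weight idea can be illustrated with a small software sketch. This is not the QNAP hardware design, only one common way to exploit repeated weight values: because a quantized kernel contains few distinct weight values, the activations that share a weight can be summed first and multiplied once per distinct value. The function names `dot_direct` and `dot_grouped` and the sample data are hypothetical.

```python
from collections import defaultdict

def dot_direct(weights, activations):
    """Baseline dot product: one multiplication per weight."""
    return sum(w * a for w, a in zip(weights, activations))

def dot_grouped(weights, activations):
    """Sum activations per distinct weight value, then multiply once per value.

    Returns (result, number_of_multiplications). Zero weights are skipped.
    This is an illustrative assumption, not the paper's circuit.
    """
    sums = defaultdict(int)
    for w, a in zip(weights, activations):
        if w != 0:
            sums[w] += a  # additions only
    result = sum(w * s for w, s in sums.items())
    return result, len(sums)

# Hypothetical 3x3 kernel flattened to 9 low-precision weights.
weights = [1, -2, 1, 0, 3, -2, 1, 3, 0]
activations = [5, 7, 2, 9, 4, 1, 6, 8, 3]

result, multiplies = dot_grouped(weights, activations)
print(result, multiplies)  # 33 3
```

Here the direct form needs nine multiplications, while the grouped form needs three, one per distinct nonzero weight value.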
mm-Wave Transceivers for Communication and Radar
Die micrograph of the high-performance bidirectional W-band phased-array transceiver front-end chip
Compute-in-Memory Processors for Deep Neural Networks
Compute-in-memory neural network processor chip and hardware specifications
TT@CIM chip and hardware specifications
High-Performance VCOs
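Professors Zhihua Wang, Baoyong Chi, and colleagues published a paper titled "A 60GHz 186.5dBc/Hz FoM Quad-Core Fundamental VCO Using Circular Triple-Coupled Transformer with No Mode Ambiguity in 65nm CMOS". The phase noise of today's fundamental-frequency oscillators is limited by the finite gain of transistors in silicon processes and by the insertion loss of on-chip inductors, which makes it hard to meet the requirements of high-order digital modulation in 5G mm-wave communication. To address this, the team analyzed in depth how the quality factor of high-frequency on-chip inductors degrades and proposed a circular inductor structure that eliminates the negative coupling across the inner diameter of small inductors, greatly improving the inductor quality factor. A triple-coil transformer also couples four oscillator cores together, which lowers the phase noise by a further 6 dB. The oscillator is designed and fabricated in 65 nm CMOS, oscillates at 60 GHz, and has a phase noise of -104.7 dBc/Hz at a 1 MHz offset, the best phase-noise performance reported so far for a CMOS fundamental oscillator in a similar frequency band. Because of its simple design, small area, and excellent performance, this oscillator structure is expected to see wide use in 5G mm-wave communication.
Oscillator chip photo and performance comparison with the state of the art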
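The 6 dB figure for four coupled cores matches the standard idealized rule that coupling N identical oscillator cores lowers phase noise by 10·log10(N) dB. The sketch below only evaluates that textbook relation; the function name is hypothetical, and the assumption of identical, ideally coupled cores is mine rather than a detail taken from the paper.

```python
import math

def multicore_phase_noise_gain_db(n_cores: int) -> float:
    """Ideal phase-noise improvement in dB from coupling n identical cores.

    Assumes perfectly matched, ideally coupled cores (a simplification).
    """
    return 10.0 * math.log10(n_cores)

gain_db = multicore_phase_noise_gain_db(4)
print(f"{gain_db:.2f} dB")  # 6.02 dB
```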
Discrete-Time ADCs
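Professor Nan Sun and colleagues published a paper titled "A 250kHz-BW 93dB-SNDR 4th-Order Noise-Shaping SAR Using Capacitor Stacking and Dynamic Buffering". The team proposed a new discrete-time integrator technique that performs integration through capacitor stacking and dynamic buffering, which avoids the signal attenuation caused by passive integration and needs no operational amplifiers or similar circuits. Compared with existing integrators, the technique offers high PVT robustness, low signal loss, simple circuit implementation, and good scalability to higher orders. A fourth-order noise-shaping SAR ADC chip built on this integrator achieves 93 dB SNDR over a 250 kHz bandwidth while consuming 340 µW, for an energy-efficiency figure of merit of 182 dB. It is the first noise-shaping SAR ADC chip to reach both more than 90 dB of SNDR and more than 100 kHz of bandwidth.
ADC chip architecture (top), chip photo (bottom left), and measured spectrum (bottom right)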
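The reported figure of merit can be cross-checked from the other numbers in the summary. The sketch below assumes the figure of merit is the Schreier FoM, SNDR + 10·log10(BW / P); the article does not name the definition, so that choice and the function name are assumptions.

```python
import math

def schreier_fom_db(sndr_db: float, bandwidth_hz: float, power_w: float) -> float:
    """Schreier figure of merit in dB: SNDR + 10*log10(BW / P)."""
    return sndr_db + 10.0 * math.log10(bandwidth_hz / power_w)

# Values quoted in the summary: 93 dB SNDR, 250 kHz bandwidth, 340 uW power.
fom_db = schreier_fom_db(93.0, 250e3, 340e-6)
print(round(fom_db, 1))  # 181.7
```

The result rounds to the 182 dB quoted in the summary, which suggests the Schreier definition is the one intended.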