Shang Yang

Ph.D. Student
MIT EECS
Cambridge, MA
shangy [at] mit [dot] edu


I am a second-year Ph.D. student in the HAN Lab at MIT EECS, advised by Prof. Song Han. Before that, I received my Bachelor's degree with highest honors from the Department of Electronic Engineering, Tsinghua University, China, where I was fortunate to be advised by Prof. Yu Wang.

My long-term goal is to build efficient machine learning systems for applications at different scales, especially Large Language Models (LLMs). Recently, I have been actively working on efficient inference systems for LLMs and VLMs.

News

Selected Publications

  1. QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving
    Yujun Lin*, Haotian Tang*, Shang Yang*, Zhekai Zhang, Guangxuan Xiao, Chuang Gan, Song Han.
    arXiv, 2024.

  2. AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
    Ji Lin*, Jiaming Tang*, Haotian Tang†, Shang Yang†, Wei-Ming Chen, Wei-Chen Wang, Guangxuan Xiao, Xingyu Dang, Chuang Gan, Song Han.
    The Seventh Annual Conference on Machine Learning and Systems (MLSys), 2024.

  3. TorchSparse++: Efficient Training and Inference Framework for Sparse Convolution on GPUs
    Haotian Tang*, Shang Yang*, Zhijian Liu, Ke Hong, Zhongming Yu, Xiuyu Li, Guohao Dai, Yu Wang, Song Han.
    56th IEEE/ACM International Symposium on Microarchitecture (MICRO), 2023.

  4. Heuristic Adaptability to Input Dynamics for SpMM on GPUs
    Guohao Dai, Guyue Huang, Shang Yang, Zhongming Yu, Hengrui Zhang, Yufei Ding, Yuan Xie, Huazhong Yang, Yu Wang.
    59th Design Automation Conference (DAC), 2022.

Blogs

  1. Running large language models (LLMs) on the edge is of great importance. In this blog, we introduce TinyChat, an efficient and lightweight system for LLM deployment on the edge. It runs Meta's latest LLaMA-2 model at 30 tokens/second on NVIDIA Jetson Orin and can easily support different models and hardware.

  2. Explore the latest advancement in TinyChat and AWQ: the integration of Visual Language Models (VLMs) on the edge! VLMs allow LLMs to comprehend visual inputs, enabling image understanding tasks like caption generation, question answering, and more. With the latest release, TinyChat supports leading VLMs such as VILA, which can be easily quantized with AWQ, giving users a seamless experience for image understanding tasks.


© Copyright 2024 Shang Yang. Powered by Jekyll and Minimal Light theme.