Feng (Jeff) Liang, 梁丰

I am a PhD student at UT Austin, where I am fortunate to work with Prof. Diana Marculescu. I obtained my master's and bachelor's degrees from Tsinghua University and Huazhong University of Science and Technology, respectively.

My current research interests lie in vision-language models and generative AI with a special interest in efficiency. If you find any research interests that we might share, feel free to drop me an email. I am always open to potential collaborations.

Email  /  CV  /  Google Scholar  /  LinkedIn  /  Zhihu  /  Twitter

News
  • May 2024: Check out our StreamV2V with code and demo!
  • May 2024: Honored to be selected as a 2024 MLCommons ML and Systems Rising Star!
  • February 2024: Two papers (FlowVid and Fairy) were accepted to CVPR 2024!
  • January 2024: After being rejected four times, Supervised MAE (SupMAE) was finally accepted to the AAAI Edge Intelligence Workshop (EIW) 2024 with the Best Poster Award!
  • December 2023: Check out our video-to-video synthesis work FlowVid and instruction-based Fairy!
  • March 2023: I will intern at Meta Gen AI this summer and am fortunate to work with Dr. Bichen Wu again!
  • February 2023: One paper was accepted to CVPR 2023!
  • November 2022: Check out our Open-vocabulary Segmentation (OVSeg) with code and demo!
  • August 2022: Check out our Supervised MAE (SupMAE) with code and models!
  • June 2022: Three papers were accepted to ICML 2022 workshops!
  • April 2022: One paper was accepted to IJCAI 2022 as a long oral!
  • March 2022: One paper was accepted to CVPRW ECV 2022!
  • February 2022: I will intern at Meta Reality Labs this summer and am fortunate to work with Dr. Bichen Wu!
  • January 2022: One paper was accepted to ICLR 2022!
  • October 2021: Check out our Data efficient CLIP (DeCLIP) with code and models!
  • July 2021: One paper was accepted to ICCV 2021!
  • April 2021: I was awarded the UT Austin Engineering Fellowship!
Publications
Looking Backward: Streaming Video-to-Video Translation with Feature Banks
Feng Liang, Akio Kodaira, Chenfeng Xu, Masayoshi Tomizuka, Kurt Keutzer, Diana Marculescu
Manuscript
project page, arxiv, code, Huggingface demo

We present StreamV2V to support real-time video-to-video translation for streaming input.

FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis
Feng Liang, Bichen Wu, Jialiang Wang, Licheng Yu, Kunpeng Li, Yinan Zhao, Ishan Misra, Jia-Bin Huang, Peizhao Zhang, Peter Vajda, Diana Marculescu
CVPR, 2024, Highlight
project page, arxiv, 5-min video

We leverage temporal optical flow cues within videos to enhance temporal consistency for text-guided video-to-video synthesis.

Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP
Feng Liang, Bichen Wu, Xiaoliang Dai, Kunpeng Li, Yinan Zhao, Hang Zhang, Peizhao Zhang, Peter Vajda, Diana Marculescu
CVPR, 2023
project page, arxiv, code, Huggingface demo, 7-min video, 1-hour talk (Chinese)

For the first time, we show open-vocabulary generalist models match the performance of supervised specialist models without dataset-specific adaptations.

SupMAE: Supervised Masked Autoencoders Are Efficient Vision Learners
Feng Liang, Yangguang Li, Diana Marculescu
AAAI EIW, 2024, Best Poster Award
arxiv, code, award

SupMAE extends MAE to a fully-supervised setting by adding a supervised classification branch, thereby enabling MAE to effectively learn global features from golden labels.

Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm
Yangguang Li*, Feng Liang*, Lichen Zhao*, Yufeng Cui, Wanli Ouyang, Jing Shao, Fengwei Yu, Junjie Yan
ICLR, 2022
arxiv, bibtex, code, video presentation

We propose Data efficient CLIP (DeCLIP), a method to train CLIP efficiently by utilizing the widespread supervision among image-text data.

ANT: Adapt Network Across Time for Efficient Video Processing
Feng Liang, Ting-Wu Chin, Yang Zhou, Diana Marculescu
CVPRW ECV, 2022
arxiv, bibtex

We propose the ANT framework to harness redundancies across video frames and reduce the computational cost of video processing. ANT adapts a purpose-fit network by inspecting the semantic differences between frames.

RePre: Improving Self-Supervised Vision Transformer with Reconstructive Pre-training
Luya Wang, Feng Liang, Yangguang Li, Honggang Zhang, Wanli Ouyang, Jing Shao
IJCAI, 2022, Long oral
arxiv, bibtex

We propose RePre to extend contrastive frameworks by adding a branch for reconstructing raw image pixels in parallel with the existing contrastive objective.

Computation Reallocation for Object Detection
Feng Liang, Chen Lin, Ronghao Guo, Ming Sun, Wei Wu, Junjie Yan, Wanli Ouyang
ICLR, 2020
arXiv, bibtex

We present CRNAS, which can learn computation reallocation strategies across different feature resolutions and spatial positions directly on the target detection dataset.

Once Quantization-Aware Training: High Performance Extremely Low-bit Architecture Search
Mingzhu Shen, Feng Liang, Ruihao Gong, Yuhang Li, Chuming Li, Chen Lin, Fengwei Yu, Junjie Yan, Wanli Ouyang
ICCV, 2021
arxiv, bibtex, code

We present Once Quantization-Aware Training (OQAT), a novel framework that searches for efficient quantized models and deploys their quantized weights at the same time, without additional post-processing.

Inception Convolution with Efficient Dilation Search
Jie Liu, Chuming Li, Feng Liang, Chen Lin, Junjie Yan, Wanli Ouyang, Dong Xu
CVPR, 2021, Oral
arxiv, bibtex, code

We propose a new mutant of dilated convolution, namely inception (dilated) convolution, where the convolutions have independent dilation among different axes, channels, and layers.

Fully Quantized Network for Object Detection
Rundong Li, Yan Wang, Feng Liang, Hongwei Qin, Junjie Yan, Rui Fan
CVPR, 2019
CVF, bibtex, code

We apply our techniques to produce fully quantized 4-bit detectors based on RetinaNet and Faster RCNN, and show that these achieve state-of-the-art performance for quantized detectors.

NASGEM: Neural Architecture Search via Graph Embedding Method
Hsin-Pai Cheng, Tunhou Zhang, Yixing Zhang, Shiyu Li, Feng Liang, Feng Yan, Meng Li, Vikas Chandra, Hai Li, Yiran Chen
AAAI, 2021
arxiv, bibtex

We propose NASGEM, which stands for Neural Architecture Search via Graph Embedding Method. NASGEM is driven by a novel graph embedding method equipped with similarity measures to capture graph topology information.

ScaleNAS: One-Shot Learning of Scale-Aware Representations for Visual Recognition
Hsin-Pai Cheng*, Feng Liang*, Meng Li, Bowen Cheng, Feng Yan, Hai Li, Vikas Chandra, Yiran Chen
AutoML-Conf, 2022
arxiv, bibtex

We present ScaleNAS, a one-shot learning method for exploring scale-aware representations. ScaleNAS solves multiple tasks at a time by searching multi-scale feature aggregation.

Selected Honors
  • MLCommons ML and Systems Rising Stars by MLCommons, 2024.
  • Qualcomm Innovation Fellowship Finalist by Qualcomm, 2024.
  • UT Austin Engineering Fellowship by UT Austin, 2021 & 2023.
  • Excellent Student Leader by Tsinghua University, 2018.
  • National Scholarship by Ministry of Education of China, 2014 & 2015.
Service

  • Reviewer of Journals: TPAMI, IJCV, TNNLS
  • Reviewer of Conferences: CVPR 2023/2024, ICCV 2023, NeurIPS 2023, ICLR 2024, ECCV 2024

  • Thanks to Jon Barron