Deng Cai (蔡登)

I work as a research scientist on the frontier of large language models (LLMs) and artificial general intelligence (AGI).

We are hiring!

Email: thisisjcykcd AT gmail.com

Past

I am a senior researcher at Tencent AI Lab. My current research focuses on large language models [an incomplete summary of my personal views in mid-2023]:

compute/data-efficient pretraining
generalist/specialist alignment
fast and high-quality decoding
multimodality/retrieval augmentation

I received my PhD from The Chinese University of Hong Kong, where I was advised by Prof. Wai Lam. Before that, I was an MS student at Shanghai Jiao Tong University, supervised by Prof. Hai Zhao. In the past, I also worked with Meta AI, Amazon AWS AI, Microsoft Research Redmond, and Alibaba DAMO Academy.

I have a broad interest in natural language processing and machine learning. My work has spanned from fundamental language analysis (e.g., semantic parsing) to real-world NLP applications (e.g., chatbots & translation). From a systematic view, my research is driven by the ultimate goal of building more interpretable and extensible AI systems. To achieve that, my research has revolved around symbolic semantics and reasoning (ACL20, AAAI20, EMNLP21), and explicit and external memory (NAACL19, EMNLP20, ACL21, ICLR23, ICLR24).

Activities

Tutorials

Oct. 2022, Tutorial Speaker, CCL2022

Jul. 2022, Co-organized a tutorial on Retrieval-Augmented Text Generation at SIGIR 2022

Jul. 2022, Co-organized a tutorial on Retrieval-Augmented Text Generation at IJCAI 2022

More

May 2024, CCF TechFrontier

Sep. 2023, MLNLP Outstanding Speaker

Submissions are welcome for the 1st Workshop on Taming Large Language Models @ SIGDIAL 2023 & INLG 2023

Nov. 2022, Invited Talk at Tsinghua University (hosted by Prof. Minlie Huang)

Nov. 2022, Invited Talk at Technical University of Darmstadt (hosted by Prof. Iryna Gurevych)

Sep. 2022, Guest Speaker, NLPCC2022 Student Workshop

Sep. 2022, Invited Talk at Central South University

Sep. 2022, Invited Talk at Tsinghua University (hosted by Prof. Bowen Zhou)

Jul. 2022, Invited Talk at MLNLP Webinar

Jul. 2022, Invited Talk at Peking University (hosted by Prof. Yuexian Zou)

Jun. 2022, Passed my PhD thesis defense

Mar. 2022, Invited Talk at NLG Student Webinar, Chinese Information Processing Society of China

Mar. 2022, Invited Talk at Bytedance

Feb. 2022, Invited Talk at The Chinese University of Hong Kong (hosted by Prof. Helen Meng)

Jan. 2022, Invited Talk at Xiamen University (hosted by Prof. Jinsong Su)

Dec. 2021, Invited Talk at Hunan University

Oct. 2021, Invited Talk at Amazon AWS AI

Sep. 2021, Invited Talk at Institute of Computing Technology, Chinese Academy of Sciences

Jul. 2021, Invited Talk at Tencent Research

Papers (Google Scholar Profile)

(*: equal contribution, ☨: correspondence)

Selected Preprints

On the Transformations across Reward Model, Parameter Update, and In-Context Prompt [arxiv]
Deng Cai, Huayang Li, Tingchen Fu, Siheng Li, Weiwen Xu, Shuaiyi Li, Bowen Cao, Zhisong Zhang, Xinting Huang, Leyang Cui, Yan Wang, Lemao Liu, Taro Watanabe, Shuming Shi
arXiv, 2024.
A Survey on the Honesty of Large Language Models [arxiv]
Siheng Li, Cheng Yang, Taiqiang Wu, Chufan Shi, Yuji Zhang, Xinyu Zhu, Zesen Cheng, Deng Cai, Mo Yu, Lemao Liu, Jie Zhou, Yujiu Yang, Ngai Wong, Xixin Wu, Wai Lam
arXiv, 2024.
Inferflow: an Efficient and Highly Configurable Inference Engine for Large Language Models [arxiv] [code]
Shuming Shi, Enbo Zhao, Deng Cai, Leyang Cui, Xinting Huang, Huayang Li
arXiv, 2023.
Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models [arxiv]
Yue Zhang, Yafu Li, Leyang Cui, Deng Cai, Lemao Liu, Tingchen Fu, Xinting Huang, Enbo Zhao, Yu Zhang, Yulong Chen, Longyue Wang, Anh Tuan Luu, Wei Bi, Freda Shi, Shuming Shi
arXiv, 2023.
A Survey on Retrieval-Augmented Text Generation [arxiv]
Huayang Li^*, Yixuan Su^*, Deng Cai^*, Yan Wang^*, Lemao Liu^*
arXiv, 2022.
Narrative Incoherence Detection [arxiv]
Deng Cai, Yizhe Zhang, Yichen Huang, Wai Lam, Bill Dolan
arXiv, 2021.
Chinese Word Segmentation: Another Decade Review (2007-2017) [arxiv]
Zhao Hai, Cai Deng, Huang Changning, Kit Chunyu
arXiv, 2017.

Selected Publications

On the Worst Prompt Performance of Large Language Models [arxiv]
Bowen Cao, Deng Cai^☨, Zhisong Zhang, Yuexian Zou, Wai Lam
NeurIPS 2024
StrategyLLM: Large Language Models as Strategy Generators, Executors, Optimizers, and Evaluators for Problem Solving [arxiv]
Chang Gao, Haiyun Jiang, Deng Cai, Shuming Shi, Wai Lam
NeurIPS 2024
Unchosen Experts Can Contribute Too: Unleashing MoE Models’ Power by Self-Contrast [arxiv]
Chufan Shi, Cheng Yang, Xinyu Zhu, Jiahao Wang, Taiqiang Wu, Siheng Li, Deng Cai, Yujiu Yang, Yu Meng
NeurIPS 2024
GLBench: A Comprehensive Benchmark for Graph with Large Language Models
Yuhan Li, Peisong Wang, Xiao Zhu, Aochuan Chen, Haiyun Jiang, Deng Cai, Victor Wai Kin Chan, Jia Li
NeurIPS 2024 (Datasets and Benchmarks)
A Thorough Examination of Decoding Methods in the Era of LLMs [arxiv]
Chufan Shi^*, Haoran Yang^*, Deng Cai^☨, Zhisong Zhang, Yifan Wang, Yujiu Yang, Wai Lam
EMNLP 2024
Consecutive Batch Model Editing with HooK Layers
Shuaiyi Li, Yang Deng, Deng Cai, Hongyuan Lu, Liang Chen, Wai Lam
EMNLP 2024
Cross-lingual Contextualized Phrase Retrieval
Huayang Li, Deng Cai^☨, Zhi Qu, Qu Cui, Hidetaka Kamigaito, Lemao Liu, Taro Watanabe
EMNLP 2024 (Findings)
Not All Preference Pairs Are Created Equal: A Recipe for Annotation-Efficient Iterative Preference Learning
Sen Yang, Leyang Cui, Deng Cai, Xinting Huang, Shuming Shi, Wai Lam
EMNLP 2024 (Findings)
With Greater Text Comes Greater Necessity: Inference-Time Training Helps Long Text Generation
Yan Wang^*, DM^*, Deng Cai
COLM 2024
GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation [arxiv]
Zhanyu Wang, Longyue Wang, Zhen Zhao, Minghao Wu, Chenyang Lyu, Huayang Li, Deng Cai, Luping Zhou, Shuming Shi, Zhaopeng Tu
ACM MM 2024
Best Paper Nomination (26/4340)
Reasons to Reject? Aligning Language Models with Judgments [arxiv]
Weiwen Xu, Deng Cai^☨, Zhisong Zhang, Wai Lam, Shuming Shi
ACL 2024 (Findings)
TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild [blog] [demo] [arxiv] [code]
Huayang Li^*, Siheng Li^*, Deng Cai^*,☨, Longyue Wang, Lemao Liu, Taro Watanabe, Yujiu Yang, Shuming Shi
ACL 2024 (Findings)
Disperse-Then-Merge: Pushing the Limits of Instruction Tuning via Alignment Tax Reduction [arxiv]
Tingchen Fu, Deng Cai^☨, Lemao Liu, Shuming Shi, Rui Yan
ACL 2024 (Findings)
WatME: Towards Lossless Watermarking Through Lexical Redundancy [arxiv]
Liang Chen, Yatao Bian, Yang Deng, Deng Cai, Shuaiyi Li, Peilin Zhao, Kam-fai Wong
ACL 2024
A Frustratingly Simple Decoding Method for Neural Text Generation [arxiv]
Haoran Yang, Deng Cai^☨, Huayang Li, Wei Bi, Wai Lam, Shuming Shi
COLING 2024
Retrieval is Accurate Generation [paper]
Bowen Cao, Deng Cai^☨, Leyang Cui, Xuxin Cheng, Wei Bi, Yuexian Zou, Shuming Shi
ICLR 2024
The Reasonableness Behind Unreasonable Translation Capability of Large Language Model [paper]
Tingchen Fu, Lemao Liu, Deng Cai, Guoping Huang, Shuming Shi, Rui Yan
ICLR 2024
Knowledge Fusion of Large Language Models [paper]
Fanqi Wan, Xinting Huang, Deng Cai, Xiaojun Quan, Wei Bi, Shuming Shi
ICLR 2024
Specialist or Generalist? Instruction Tuning for Specific NLP Tasks [paper]
Chufan Shi, Yixuan Su, Cheng Yang, Yujiu Yang, Deng Cai^☨
EMNLP 2023
Large Language Models Meet Harry Potter: A Bilingual Dataset for Aligning Dialogue Agents with Characters [arxiv] [paper]
Nuo Chen, Yan Wang, Haiyun Jiang, Deng Cai, Yuhan Li, Ziyang Chen, Longyue Wang, Jia Li
EMNLP 2023 (Findings)
Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective [arxiv] [paper] [code]
Huayang Li, Tian Lan, Zihao Fu, Deng Cai, Lemao Liu, Nigel Collier, Taro Watanabe, Yixuan Su
NeurIPS 2023
PandaGPT: One Model To Instruction-Follow Them All [blog] [demo] [arxiv] [code]
Yixuan Su^*, Tian Lan^*, Huayang Li^*, Jialu Xu, Yan Wang, Deng Cai^*,☨
TLLM Workshop 2023
Effidit: An Assistant for Improving Writing Efficiency [paper] [demo]
Tencent AI Lab
ACL 2023 (Demo)
Copy is All You Need [paper]
Tian Lan^*, Deng Cai^*,☨, Yan Wang, Heyan Huang, Xian-Ling Mao
ICLR 2023
Retrofitting Multilingual Sentence Embeddings with Abstract Meaning Representation [arxiv] [paper] [code]
Deng Cai, Xin Li, Jackie Chun-Sing Ho, Lidong Bing, Wai Lam
EMNLP 2022
Linearizing Transformer with Key-Value Memory [arxiv] [paper] [code]
Yizhe Zhang^*, Deng Cai^*
EMNLP 2022
N-gram Is Back: Residual Learning of Neural Text Generation with n-gram Language Model [arxiv] [paper]
Huayang Li, Deng Cai, Jin Xu, Taro Watanabe
EMNLP 2022 (Findings)
Measuring and Reducing Model Update Regression in Structured Prediction for NLP [arxiv] [blog] [paper]
Deng Cai, Elman Mansimov, Yi-An Lai, Yixuan Su, Lei Shu, Yi Zhang
NeurIPS 2022
Learning to Break the Loop: Analyzing and Mitigating Repetitions for Neural Text Generation [arxiv]
Jin Xu, Xiaojiang Liu, Jianhao Yan, Deng Cai, Huayang Li, Jian Li
NeurIPS 2022
Recent Advances in Retrieval-Augmented Text Generation
Deng Cai, Yan Wang, Lemao Liu, Shuming Shi
SIGIR 2022 (Tutorial)
Recent Advances in Retrieval-Augmented Text Generation
Deng Cai, Yan Wang, Lemao Liu, Shuming Shi
IJCAI 2022 (Tutorial)
Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System [arxiv] [paper] [code]
Yixuan Su, Lei Shu, Elman Mansimov, Arshit Gupta, Deng Cai, Yi-An Lai, Yi Zhang
ACL 2022
Multilingual AMR Parsing with Noisy Knowledge Distillation [arxiv] [paper] [code]
Deng Cai, Xin Li, Jackie Chun-Sing Ho, Lidong Bing, Wai Lam
EMNLP 2021 (Findings)
Exploiting Reasoning Chains for Multi-hop Science Question Answering [arxiv] [paper] [code]
Weiwen Xu, Yang Deng, Huihui Zhang, Deng Cai, Wai Lam
EMNLP 2021 (Findings)
Neural Machine Translation with Monolingual Translation Memory [arxiv] [paper] [code] [slides]
Deng Cai, Yan Wang, Huayang Li, Wai Lam, Lemao Liu
ACL 2021
Outstanding Paper Award (6/3350)
Dialogue Response Selection with Hierarchical Curriculum Learning [arxiv] [paper] [code]
Yixuan Su^*, Deng Cai^*, Qingyu Zhou, Zibo Lin, Simon Baker, Yunbo Cao, Shuming Shi, Nigel Collier, Yan Wang
ACL 2021
Dynamic Semantic Graph Construction and Reasoning for Explainable Multi-hop Question Answering [arxiv] [paper] [code]
Weiwen Xu, Huihui Zhang, Deng Cai, Wai Lam
ACL 2021 (Findings)
Assessing Dialogue Systems with Distribution Distances [arxiv] [paper] [code]
Jiannan Xiang^*, Yahui Liu^*, Deng Cai, Huayang Li, Defu Lian, Lemao Liu
ACL 2021 (Findings)
Non-Autoregressive Text Generation with Pre-trained Language Models [arxiv] [paper]
Yixuan Su, Deng Cai, Yan Wang, David Vandyke, Simon Baker, Piji Li, Nigel Collier
EACL 2021
The World is Not Binary: Learning to Rank with Grayscale Data for Dialogue Response Selection [arxiv] [paper]
Zibo Lin^*, Deng Cai^*, Yan Wang, Xiaojiang Liu, Hai-Tao Zheng, Shuming Shi
EMNLP 2020
Describe What to Change: A Text-guided Unsupervised Image-to-Image Translation Approach [arxiv]
Yahui Liu, Marco De Nadai, Deng Cai, Huayang Li, Xavier Alameda-Pineda, Nicu Sebe, Bruno Lepri
ACM MM 2020
AMR Parsing via Graph-Sequence Iterative Inference [arxiv] [paper] [code] [slides]
Deng Cai, Wai Lam
ACL 2020
Graph Transformer for Graph-to-Sequence Learning [arxiv] [paper] [code] [slides]
Deng Cai, Wai Lam
AAAI 2020
Core Semantic First: A Top-down Approach for AMR Parsing [arxiv] [paper] [code] [slides]
Deng Cai, Wai Lam
EMNLP 2019
Retrieval-guided Dialogue Response Generation via a Matching-to-Generation Framework [paper] [code]
Deng Cai, Yan Wang, Wei Bi, Zhaopeng Tu, Xiaojiang Liu, Shuming Shi
EMNLP 2019
Charge-Based Prison Term Prediction with Deep Gating Network [arxiv]
Huajie Chen^*, Deng Cai^*, Wei Dai, Zehui Dai, Yadong Ding
EMNLP 2019
Skeleton-to-Response: Dialogue Generation Guided by Retrieval Memory [arxiv] [paper] [code] [slides]
Deng Cai, Yan Wang, Wei Bi, Zhaopeng Tu, Xiaojiang Liu, Wai Lam, Shuming Shi
NAACL 2019
Unsupervised Learning helps Supervised Neural Word Segmentation [preprint]
Xiaobin Wang, Deng Cai, Guangwei Xu, Hai Zhao, Linlin Li, Luo Si
AAAI 2019
Translating a Math Word Problem to a Expression Tree [paper]
Lei Wang, Yan Wang, Deng Cai, Dongxiang Zhang, Xiaojiang Liu
EMNLP 2018
Fast and Accurate Neural Word Segmentation for Chinese [paper] [code]
Deng Cai, Hai Zhao, Zhisong Zhang, Yuan Xin, Yongjian Wu, Feiyue Huang
ACL 2017
Neural Word Segmentation Learning for Chinese [paper] [code] [slides]
Deng Cai, Hai Zhao
ACL 2016
Exploring Dense Retrieval for Dialogue Response Selection
Tian Lan, Deng Cai^☨, Yan Wang, Yixuan Su, Heyan Huang, Xian-Ling Mao
ACM Transactions on Information Systems, 2023.
PROTOTYPE-TO-STYLE: Dialogue Generation With Style-Aware Editing on Retrieval Memory
Yixuan Su, Yan Wang, Deng Cai, Simon Baker, Anna Korhonen, Nigel Collier
IEEE Transactions on Audio, Speech and Language Processing, 2021.
Neural Machine Translation with Noisy Lexical Constraints
Huayang Li, Guoping Huang, Deng Cai, Lemao Liu
IEEE Transactions on Audio, Speech and Language Processing, 2020.
A Hybrid Model for Chinese Spelling Check [paper]
Deng Cai^*, Hai Zhao^*, Yang Xin, Yuzhu Wang, Zhongye Jia
ACM Transactions on Asian and Low-Resource Language Information Processing, 2017.

Open Source Projects (Github Profile)

BERT
A simple yet complete implementation of the popular BERT model, tested in a large-scale distributed environment (more than 80 GPUs). A variant of it has been adopted by Tencent.
DyNet
I used to contribute actively to DyNet. A list of my contributions can be found here.
Biaffine Dependency Parser
A DyNet implementation of the popular biaffine dependency parser, achieving state-of-the-art performance. It is much simpler than the original TensorFlow code. I experimented with many domain adaptation methods on this model.
Implicit Discourse Relation Classification
Pair-aware sentence modeling for Implicit Discourse Relation Classification. A summary of this work: [paper]

Selected Awards and Honors

ACM MM-2024 Best Paper Nomination
Young Elite Scientist Sponsorship Program, China Association for Science and Technology, 2023 （中国科协青年人才托举工程）
EACL-2023 Outstanding Reviewer
ACL-2021 Outstanding Paper
EMNLP-2020 Outstanding Reviewer
AAAI-2020 Scholarship Award
National Scholarship for Graduate Student (top 2% students), Ministry of Education of P.R.China, 2016
Excellent Undergraduate Thesis Award, XMU, 2015
Dean's List, School of Information Science and Engineering, XMU, 2014
Gold Medal, The 5th Fujian Provincial University Programming Contest, 2014
Bronze Medal, ACM-ICPC Asia Regional Programming Contest, 2013

Education

Aug. 2018 - Jul. 2022
PhD, Dept. of Systems Engineering and Engineering Management, The Chinese University of Hong Kong
Sep. 2015 - Mar. 2018
MS, Dept. of Computer Science, Shanghai Jiao Tong University
Sep. 2011 - Jun. 2015
BE, Dept. of Computer Science, Xiamen University

Research Experience

Summer 2021, Applied scientist intern with Elman Mansimov and Yi Zhang
Amazon AWS AI, Seattle (remote)
Spring 2021, Research intern with Yizhe Zhang, Michel Galley, and Bill Dolan
Microsoft Research, Redmond (remote)
2018 - 2020, Student Researcher with NLP Center led by Shuming Shi
Tencent AI Lab, Shenzhen
Jan. 2018 - Mar. 2018, Research intern with Xiaobin Wang, Guangwei Xu, and Linlin Li
Alibaba DAMO Academy, Hangzhou
Jul. 2017 - Dec. 2017, Visiting scholar, Advisor: Prof. Yue Zhang
Singapore University of Technology and Design, Singapore

Professional Service

Area Chair

COLING(2022), ACL(2024), EMNLP(2024), NAACL(2025), ACL Rolling Review(2024-), NeurIPS(2025)

Program Committee Member/Reviewer

ACL Rolling Review(2021-), ACL(2017-2023), EMNLP(2019-2023), NAACL(2021)
NeurIPS(2022-2024), ICML(2022-2024), ICLR(2023-2024), AAAI(2020-2022), SIGKDD(2022-2023), WSDM(2022-2023)
COLING(2016, 2018, 2020, 2022), AACL(2020, 2022-2023), EACL(2017, 2021, 2023), LREC(2018, 2020, 2022), INLG(2019-2021), etc.

Journal Reviewer

Computational Linguistics
IEEE Transactions on Pattern Analysis and Machine Intelligence
ACM Transactions on Information Systems
IEEE Transactions on Audio, Speech and Language Processing
Neurocomputing
Pattern Analysis and Applications

Miscellaneous

What?! I remember clearly, last time I played the game, I was an international master.

I cannot swim well because my density is too large.