Deng Cai (蔡登)
I work as a research scientist on the frontier of large language models (LLMs) and artificial general intelligence (AGI).
Interested in working with me? Drop me an email.
Email: thisisjcykcd AT gmail.com
Past
I am a senior researcher at Tencent AI Lab. My current research focuses on large language models [an incomplete summary of my personal views in mid 2023]:
- compute/data-efficient pretraining
- generalist/specialist alignment
- fast and high-quality decoding
- multimodality/retrieval augmentation
I have a broad interest in natural language processing and machine learning. My work has spanned from fundamental language analysis (e.g., semantic parsing) to real-world NLP applications (e.g., chatbots & translation). From a systematic view, my research is driven by the ultimate goal of building more interpretable and extensible AI systems. To achieve that, my research has revolved around symbolic semantics and reasoning (ACL20, AAAI20, EMNLP21), and explicit and external memory (NAACL19, EMNLP20, ACL21, ICLR23, ICLR24).
Activities
- Oct. 2022, Tutorial Speaker, CCL2022
- Jul. 2022, Co-organized a tutorial on Retrieval-Augmented Text Generation at SIGIR 2022
- Jul. 2022, Co-organized a tutorial on Retrieval-Augmented Text Generation at IJCAI 2022
- May 2024, CCF TechFrontier
- Sep. 2023, MLNLP Outstanding Speaker
- Welcome submissions to the 1st Workshop on Taming Large Language Models @ SIGDIAL 2023 & INLG 2023
- Nov. 2022, Invited Talk at Tsinghua University (hosted by Prof. Minlie Huang)
- Nov. 2022, Invited Talk at Technical University of Darmstadt (hosted by Prof. Iryna Gurevych)
- Sep. 2022, Guest Speaker, NLPCC2022 Student Workshop
- Sep. 2022, Invited Talk at Central South University
- Sep. 2022, Invited Talk at Tsinghua University (hosted by Prof. Bowen Zhou)
- Jul. 2022, Invited Talk at MLNLP Webinar
- Jul. 2022, Invited Talk at Peking University (hosted by Prof. Yuexian Zou)
- Jun. 2022, Passed my PhD thesis defense
- Mar. 2022, Invited Talk at NLG Student Webinar, Chinese Information Processing Society of China
- Mar. 2022, Invited Talk at Bytedance
- Feb. 2022, Invited Talk at The Chinese University of Hong Kong (hosted by Prof. Helen Meng)
- Jan. 2022, Invited Talk at Xiamen University (hosted by Prof. Jinsong Su)
- Dec. 2021, Invited Talk at Hunan University
- Oct. 2021, Invited Talk at Amazon AWS AI
- Sep. 2021, Invited Talk at Institute of Computing Technology, Chinese Academy of Sciences
- Jul. 2021, Invited Talk at Tencent Research
Tutorials
Papers (Google Scholar Profile)
(*: equal contribution, ☨: correspondence)
Selected Preprints
- On the Transformations across Reward Model, Parameter Update, and In-Context Prompt [arxiv]
Deng Cai, Huayang Li, Tingchen Fu, Siheng Li, Weiwen Xu, Shuaiyi Li, Bowen Cao, Zhisong Zhang, Xinting Huang, Leyang Cui, Yan Wang, Lemao Liu, Taro Watanabe, Shuming Shi
arXiv, 2024.
- Inferflow: an Efficient and Highly Configurable Inference Engine for Large Language Models [arxiv] [code]
Shuming Shi, Enbo Zhao, Deng Cai, Leyang Cui, Xinting Huang, Huayang Li
arXiv, 2023.
- Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models [arxiv]
Yue Zhang, Yafu Li, Leyang Cui, Deng Cai, Lemao Liu, Tingchen Fu, Xinting Huang, Enbo Zhao, Yu Zhang, Yulong Chen, Longyue Wang, Anh Tuan Luu, Wei Bi, Freda Shi, Shuming Shi
arXiv, 2023.
- A Survey on Retrieval-Augmented Text Generation [arxiv]
Huayang Li*, Yixuan Su*, Deng Cai*, Yan Wang*, Lemao Liu*
arXiv, 2022.
- Narrative Incoherence Detection [arxiv]
Deng Cai, Yizhe Zhang, Yichen Huang, Wai Lam, Bill Dolan
arXiv, 2021.
- Chinese Word Segmentation: Another Decade Review (2007-2017) [arxiv]
Zhao Hai, Cai Deng, Huang Changning, Kit Chunyu
arXiv, 2017.
Selected Publications
- On the Worst Prompt Performance of Large Language Models [arxiv]
Bowen Cao, Deng Cai☨, Zhisong Zhang, Yuexian Zou, Wai Lam
NeurIPS 2024
- StrategyLLM: Large Language Models as Strategy Generators, Executors, Optimizers, and Evaluators for Problem Solving [arxiv]
Chang Gao, Haiyun Jiang, Deng Cai, Shuming Shi, Wai Lam
NeurIPS 2024
- Unchosen Experts Can Contribute Too: Unleashing MoE Models’ Power by Self-Contrast [arxiv]
Chufan Shi, Cheng Yang, Xinyu Zhu, Jiahao Wang, Taiqiang Wu, Siheng Li, Deng Cai, Yujiu Yang, Yu Meng
NeurIPS 2024
- GLBench: A Comprehensive Benchmark for Graph with Large Language Models
Yuhan Li, Peisong Wang, Xiao Zhu, Aochuan Chen, Haiyun Jiang, Deng Cai, Victor Wai Kin Chan, Jia Li
NeurIPS 2024 (Datasets and Benchmarks)
- A Thorough Examination of Decoding Methods in the Era of LLMs [arxiv]
Chufan Shi*, Haoran Yang*, Deng Cai☨, Zhisong Zhang, Yifan Wang, Yujiu Yang, Wai Lam
EMNLP 2024
- Consecutive Batch Model Editing with HooK Layers
Shuaiyi Li, Yang Deng, Deng Cai, Hongyuan Lu, Liang Chen, Wai Lam
EMNLP 2024
- Cross-lingual Contextualized Phrase Retrieval
Huayang Li, Deng Cai☨, Zhi Qu, Qu Cui, Hidetaka Kamigaito, Lemao Liu, Taro Watanabe
EMNLP 2024 (Findings)
- Not All Preference Pairs Are Created Equal: A Recipe for Annotation-Efficient Iterative Preference Learning
Sen Yang, Leyang Cui, Deng Cai, Xinting Huang, Shuming Shi, Wai Lam
EMNLP 2024 (Findings)
- With Greater Text Comes Greater Necessity: Inference-Time Training Helps Long Text Generation
Yan Wang*, DM*, Deng Cai
COLM 2024
- Reasons to Reject? Aligning Language Models with Judgments [arxiv]
Weiwen Xu, Deng Cai☨, Zhisong Zhang, Wai Lam, Shuming Shi
ACL 2024 (Findings)
- TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild [blog] [demo] [arxiv] [code]
Huayang Li*, Siheng Li*, Deng Cai*,☨, Longyue Wang, Lemao Liu, Taro Watanabe, Yujiu Yang, Shuming Shi
ACL 2024 (Findings)
- Disperse-Then-Merge: Pushing the Limits of Instruction Tuning via Alignment Tax Reduction [arxiv]
Tingchen Fu, Deng Cai☨, Lemao Liu, Shuming Shi, Rui Yan
ACL 2024 (Findings)
- WatME: Towards Lossless Watermarking Through Lexical Redundancy [arxiv]
Liang Chen, Yatao Bian, Yang Deng, Deng Cai, Shuaiyi Li, Peilin Zhao, Kam-fai Wong
ACL 2024
- A Frustratingly Simple Decoding Method for Neural Text Generation [arxiv]
Haoran Yang, Deng Cai☨, Huayang Li, Wei Bi, Wai Lam, Shuming Shi
COLING 2024
- Retrieval is Accurate Generation [paper]
Bowen Cao, Deng Cai☨, Leyang Cui, Xuxin Cheng, Wei Bi, Yuexian Zou, Shuming Shi
ICLR 2024
- The Reasonableness Behind Unreasonable Translation Capability of Large Language Model [paper]
Tingchen Fu, Lemao Liu, Deng Cai, Guoping Huang, Shuming Shi, Rui Yan
ICLR 2024
- Knowledge Fusion of Large Language Models [paper]
Fanqi Wan, Xinting Huang, Deng Cai, Xiaojun Quan, Wei Bi, Shuming Shi
ICLR 2024
- Specialist or Generalist? Instruction Tuning for Specific NLP Tasks [paper]
Chufan Shi, Yixuan Su, Cheng Yang, Yujiu Yang, Deng Cai☨
EMNLP 2023
- Large Language Models Meet Harry Potter: A Bilingual Dataset for Aligning Dialogue Agents with Characters [arxiv] [paper]
Nuo Chen, Yan Wang, Haiyun Jiang, Deng Cai, Yuhan Li, Ziyang Chen, Longyue Wang, Jia Li
EMNLP 2023 (Findings)
- Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective [arxiv] [paper] [code]
Huayang Li, Tian Lan, Zihao Fu, Deng Cai, Lemao Liu, Nigel Collier, Taro Watanabe, Yixuan Su
NeurIPS 2023
- PandaGPT: One Model To Instruction-Follow Them All [blog] [demo] [arxiv] [code]
Yixuan Su*, Tian Lan*, Huayang Li*, Jialu Xu, Yan Wang, Deng Cai*,☨
TLLM Workshop 2023
- Effidit: An Assistant for Improving Writing Efficiency [paper] [demo]
Tencent AI Lab
ACL 2023 (Demo)
- Copy is All You Need [paper]
Tian Lan*, Deng Cai*,☨, Yan Wang, Heyan Huang, Xian-Ling Mao
ICLR 2023
- Retrofitting Multilingual Sentence Embeddings with Abstract Meaning Representation [arxiv] [paper] [code]
Deng Cai, Xin Li, Jackie Chun-Sing Ho, Lidong Bing, Wai Lam
EMNLP 2022
- Linearizing Transformer with Key-Value Memory [arxiv] [paper] [code]
Yizhe Zhang*, Deng Cai*
EMNLP 2022
- N-gram Is Back: Residual Learning of Neural Text Generation with n-gram Language Model [arxiv] [paper]
Huayang Li, Deng Cai, Jin Xu, Taro Watanabe
EMNLP 2022 (Findings)
- Measuring and Reducing Model Update Regression in Structured Prediction for NLP [arxiv] [blog] [paper]
Deng Cai, Elman Mansimov, Yi-An Lai, Yixuan Su, Lei Shu, Yi Zhang
NeurIPS 2022
- Learning to Break the Loop: Analyzing and Mitigating Repetitions for Neural Text Generation [arxiv]
Jin Xu, Xiaojiang Liu, Jianhao Yan, Deng Cai, Huayang Li, Jian Li
NeurIPS 2022
- Recent Advances in Retrieval-Augmented Text Generation
Deng Cai, Yan Wang, Lemao Liu, Shuming Shi
SIGIR 2022 (Tutorial)
- Recent Advances in Retrieval-Augmented Text Generation
Deng Cai, Yan Wang, Lemao Liu, Shuming Shi
IJCAI 2022 (Tutorial)
- Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System [arxiv] [paper] [code]
Yixuan Su, Lei Shu, Elman Mansimov, Arshit Gupta, Deng Cai, Yi-An Lai, Yi Zhang
ACL 2022
- Multilingual AMR Parsing with Noisy Knowledge Distillation [arxiv] [paper] [code]
Deng Cai, Xin Li, Jackie Chun-Sing Ho, Lidong Bing, Wai Lam
EMNLP 2021 (Findings)
- Exploiting Reasoning Chains for Multi-hop Science Question Answering [arxiv] [paper] [code]
Weiwen Xu, Yang Deng, Huihui Zhang, Deng Cai, Wai Lam
EMNLP 2021 (Findings)
- Neural Machine Translation with Monolingual Translation Memory [arxiv] [paper] [code] [slides]
Deng Cai, Yan Wang, Huayang Li, Wai Lam, Lemao Liu
ACL 2021
Outstanding Paper Award (6/3350)
- Dialogue Response Selection with Hierarchical Curriculum Learning [arxiv] [paper] [code]
Yixuan Su*, Deng Cai*, Qingyu Zhou, Zibo Lin, Simon Baker, Yunbo Cao, Shuming Shi, Nigel Collier, Yan Wang
ACL 2021
- Dynamic Semantic Graph Construction and Reasoning for Explainable Multi-hop Question Answering [arxiv] [paper] [code]
Weiwen Xu, Huihui Zhang, Deng Cai, Wai Lam
ACL 2021 (Findings)
- Assessing Dialogue Systems with Distribution Distances [arxiv] [paper] [code]
Jiannan Xiang*, Yahui Liu*, Deng Cai, Huayang Li, Defu Lian, Lemao Liu
ACL 2021 (Findings)
- Non-Autoregressive Text Generation with Pre-trained Language Models [arxiv] [paper]
Yixuan Su, Deng Cai, Yan Wang, David Vandyke, Simon Baker, Piji Li, Nigel Collier
EACL 2021
- The World is Not Binary: Learning to Rank with Grayscale Data for Dialogue Response Selection [arxiv] [paper]
Zibo Lin*, Deng Cai*, Yan Wang, Xiaojiang Liu, Hai-Tao Zheng, Shuming Shi
EMNLP 2020
- Describe What to Change: A Text-guided Unsupervised Image-to-Image Translation Approach [arxiv]
Yahui Liu, Marco De Nadai, Deng Cai, Huayang Li, Xavier Alameda-Pineda, Nicu Sebe, Bruno Lepri
ACM MM 2020
- AMR Parsing via Graph-Sequence Iterative Inference [arxiv] [paper] [code] [slides]
Deng Cai, Wai Lam
ACL 2020
- Graph Transformer for Graph-to-Sequence Learning [arxiv] [paper] [code] [slides]
Deng Cai, Wai Lam
AAAI 2020
- Core Semantic First: A Top-down Approach for AMR Parsing [arxiv] [paper] [code] [slides]
Deng Cai, Wai Lam
EMNLP 2019
- Retrieval-guided Dialogue Response Generation via a Matching-to-Generation Framework [paper] [code]
Deng Cai, Yan Wang, Wei Bi, Zhaopeng Tu, Xiaojiang Liu, Shuming Shi
EMNLP 2019
- Charge-Based Prison Term Prediction with Deep Gating Network [arxiv]
Huajie Chen*, Deng Cai*, Wei Dai, Zehui Dai, Yadong Ding
EMNLP 2019
- Skeleton-to-Response: Dialogue Generation Guided by Retrieval Memory [arxiv] [paper] [code] [slides]
Deng Cai, Yan Wang, Wei Bi, Zhaopeng Tu, Xiaojiang Liu, Wai Lam, Shuming Shi
NAACL 2019
- Unsupervised Learning helps Supervised Neural Word Segmentation [preprint]
Xiaobin Wang, Deng Cai, Guangwei Xu, Hai Zhao, Linlin Li, Luo Si
AAAI 2019
- Translating a Math Word Problem to a Expression Tree [paper]
Lei Wang, Yan Wang, Deng Cai, Dongxiang Zhang, Xiaojiang Liu
EMNLP 2018
- Fast and Accurate Neural Word Segmentation for Chinese [paper] [code]
Deng Cai, Hai Zhao, Zhisong Zhang, Yuan Xin, Yongjian Wu, Feiyue Huang
ACL 2017
- Neural Word Segmentation Learning for Chinese [paper] [code] [slides]
Deng Cai, Hai Zhao
ACL 2016
- Exploring Dense Retrieval for Dialogue Response Selection
Tian Lan, Deng Cai☨, Yan Wang, Yixuan Su, Heyan Huang, Xian-Ling Mao
ACM Transactions on Information Systems, 2023.
- PROTOTYPE-TO-STYLE: Dialogue Generation With Style-Aware Editing on Retrieval Memory
Yixuan Su, Yan Wang, Deng Cai, Simon Baker, Anna Korhonen, Nigel Collier
IEEE Transactions on Audio, Speech and Language Processing, 2021.
- Neural Machine Translation with Noisy Lexical Constraints
Huayang Li, Guoping Huang, Deng Cai, Lemao Liu
IEEE Transactions on Audio, Speech and Language Processing, 2020.
- A Hybrid Model for Chinese Spelling Check [paper]
Deng Cai*, Hai Zhao*, Yang Xin, Yuzhu Wang, Zhongye Jia
ACM Transactions on Asian and Low-Resource Language Information Processing, 2017.
Open Source Projects (GitHub Profile)
(Most of my research work is open-sourced. Here are some additional projects.)
- BERT
A simple yet complete implementation of the popular BERT model, tested in a large-scale distributed environment (more than 80 GPUs). A variant of it has been adopted by Tencent.
- Dynet
I used to contribute actively to Dynet. A list of my contributions can be found here.
- Biaffine Dependency Parser
A Dynet implementation of the popular biaffine dependency parser, achieving state-of-the-art performance. Much simpler than the original TensorFlow code. I experimented with many domain adaptation methods on this model.
- Implicit Discourse Relation Classification
Pair-aware sentence modeling for Implicit Discourse Relation Classification. A summary of this work: [paper]
Selected Awards and Honors
- Young Elite Scientist Sponsorship Program, China Association for Science and Technology, 2023
- EACL-2023 Outstanding Reviewer
- ACL-2021 Outstanding Paper
- EMNLP-2020 Outstanding Reviewer
- AAAI-2020 Scholarship Award
- National Scholarship for Graduate Student (top 2% students), Ministry of Education of P.R.China, 2016
- Excellent Undergraduate Thesis Award, XMU, 2015
- Dean's List, School of Information Science and Engineering, XMU, 2014
- Gold Medal, The 5th Fujian Provincial University Programming Contest, 2014
- Bronze Medal, ACM-ICPC Asia Regional Programming Contest, 2013
Education
- Aug. 2018 - Jul. 2022
PhD, Dept. of Systems Engineering and Engineering Management, The Chinese University of Hong Kong
- Sep. 2015 - Mar. 2018
MS, Dept. of Computer Science, Shanghai Jiao Tong University
- Sep. 2011 - Jun. 2015
BE, Dept. of Computer Science, Xiamen University
Research Experience
- summer 2021, Applied scientist intern with Elman Mansimov and Yi Zhang
Amazon AWS AI, Seattle (remote)
- spring 2021, Research intern with Yizhe Zhang, Michel Galley, and Bill Dolan
Microsoft Research, Redmond (remote)
- 2018 - 2020, Student Researcher with Xiaojiang Liu, Yan Wang, and Shuming Shi
Tencent AI Lab, Shenzhen
- Jan. 2018 - Mar. 2018, Research intern with Xiaobin Wang, Guangwei Xu, and Linlin Li
Alibaba DAMO Academy, Hangzhou
- Jul. 2017 - Dec. 2017, Visiting scholar, Advisor: Prof. Yue Zhang
Singapore University of Technology and Design, Singapore
Professional Service
Area Chair:
- COLING(2022), ACL(2024), EMNLP(2024)
Program Committee Member/Reviewer:
- ACL Rolling Review(2021-), ACL(2017-2023), EMNLP(2019-2023), NAACL(2021)
- NeurIPS(2022-2024), ICML(2022-2024), ICLR(2023-2024), AAAI(2020-2022), SIGKDD(2022-2023), WSDM(2022-2023)
- COLING(2016, 2018, 2020, 2022), AACL(2020, 2022-2023), EACL(2017, 2021, 2023), LREC(2018, 2020, 2022), INLG(2019-2021), etc.
Journal Reviewer:
- Computational Linguistics
- IEEE Transactions on Pattern Analysis and Machine Intelligence
- ACM Transactions on Information Systems
- IEEE Transactions on Audio, Speech and Language Processing
- Neurocomputing
- Pattern Analysis and Applications
Miscellaneous
- What?! I remember clearly, last time I played the game, I was an international master.
- I cannot swim well because my density is too large.