Deng Cai (蔡登)

I am a senior researcher at Tencent AI Lab. I received my PhD from The Chinese University of Hong Kong, where I was advised by Prof. Wai Lam. Before that, I was an MS student at Shanghai Jiao Tong University, supervised by Prof. Hai Zhao. In the past, I also worked with Meta AI, Amazon AWS AI, Microsoft Research Redmond, and Alibaba DAMO Academy.

I have a broad interest in natural language processing and machine learning. My work has ranged from fundamental language analysis (e.g., semantic parsing) to real-world NLP applications (e.g., chatbots & translation). From a systematic perspective, my research is driven by the ultimate goal of building more interpretable and extensible AI systems. To that end, my research has revolved around symbolic semantics and reasoning (ACL20, AAAI20, EMNLP21), and explicit and external memory (NAACL19, EMNLP20, ACL21, ICLR23, ICLR24).

My current research focuses on large language models:

compute/data-efficient pretraining
generalist/specialist alignment
fast and high-quality decoding
multimodality/retrieval augmentation

Looking for an internship or collaboration? Drop me an email.

Email: thisisjcykcd AT gmail.com

Activities

Tutorials

Oct. 2022, Tutorial Speaker, CCL2022

Jul. 2022, Co-organized a tutorial on Retrieval-Augmented Text Generation at SIGIR 2022

Jul. 2022, Co-organized a tutorial on Retrieval-Augmented Text Generation at IJCAI 2022

More

Sep. 2023, MLNLP Outstanding Speaker

Submissions are welcome to the 1st Workshop on Taming Large Language Models @ SIGDIAL 2023 & INLG 2023

Nov. 2022, Invited Talk at Tsinghua University (hosted by Prof. Minlie Huang)

Nov. 2022, Invited Talk at Technical University of Darmstadt (hosted by Prof. Iryna Gurevych)

Sep. 2022, Guest Speaker, NLPCC2022 Student Workshop

Sep. 2022, Invited Talk at Central South University

Sep. 2022, Invited Talk at Tsinghua University (hosted by Prof. Bowen Zhou)

Jul. 2022, Invited Talk at MLNLP Webinar

Jul. 2022, Invited Talk at Peking University (hosted by Prof. Yuexian Zou)

Jun. 2022, Passed my PhD thesis defense

Mar. 2022, Invited Talk at NLG Student Webinar, Chinese Information Processing Society of China

Mar. 2022, Invited Talk at Bytedance

Feb. 2022, Invited Talk at The Chinese University of Hong Kong (hosted by Prof. Helen Meng)

Jan. 2022, Invited Talk at Xiamen University (hosted by Prof. Jinsong Su)

Dec. 2021, Invited Talk at Hunan University

Oct. 2021, Invited Talk at Amazon AWS AI

Sep. 2021, Invited Talk at Institute of Computing Technology, Chinese Academy of Sciences

Jul. 2021, Invited Talk at Tencent Research

Papers (Google Scholar Profile)

(*: equal contribution, ☨: corresponding author)

Preprints

A Thorough Examination of Decoding Methods in the Era of LLMs [arxiv]
Chufan Shi^*, Haoran Yang^*, Deng Cai^☨, Zhisong Zhang, Yifan Wang, Yujiu Yang, Wai Lam
arXiv, 2024.
Inferflow: an Efficient and Highly Configurable Inference Engine for Large Language Models [arxiv] [code]
Shuming Shi, Enbo Zhao, Deng Cai, Leyang Cui, Xinting Huang, Huayang Li
arXiv, 2023.
Reasons to Reject? Aligning Language Models with Judgments [arxiv]
Weiwen Xu, Deng Cai^☨, Zhisong Zhang, Wai Lam, Shuming Shi
arXiv, 2023.
StrategyLLM: Large Language Models as Strategy Generators, Executors, Optimizers, and Evaluators for Problem Solving [arxiv]
Chang Gao, Haiyun Jiang, Deng Cai, Shuming Shi, Wai Lam
arXiv, 2023.
TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild [blog] [demo] [arxiv] [code]
Huayang Li^*, Siheng Li^*, Deng Cai^*,☨, Longyue Wang, Lemao Liu, Taro Watanabe, Yujiu Yang, Shuming Shi
arXiv, 2023.
Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models [arxiv]
Yue Zhang, Yafu Li, Leyang Cui, Deng Cai, Lemao Liu, Tingchen Fu, Xinting Huang, Enbo Zhao, Yu Zhang, Yulong Chen, Longyue Wang, Anh Tuan Luu, Wei Bi, Freda Shi, Shuming Shi
arXiv, 2023.
Multi-Task Instruction Tuning of LLaMa for Specific Scenarios: A Preliminary Study on Writing Assistance [arxiv]
Yue Zhang, Leyang Cui, Deng Cai, Xinting Huang, Tao Fang, Wei Bi
arXiv, 2023.
A Survey on Retrieval-Augmented Text Generation [arxiv]
Huayang Li^*, Yixuan Su^*, Deng Cai^*, Yan Wang^*, Lemao Liu^*
arXiv, 2022.
Narrative Incoherence Detection [arxiv]
Deng Cai, Yizhe Zhang, Yichen Huang, Wai Lam, Bill Dolan
arXiv, 2021.

Selected Publications

A Frustratingly Simple Decoding Method for Neural Text Generation
Haoran Yang, Deng Cai^☨, Huayang Li, Wei Bi, Wai Lam, Shuming Shi
In Proceedings of the International Conference on Computational Linguistics, 2024. (COLING 2024)
Retrieval is Accurate Generation [paper]
Bowen Cao, Deng Cai^☨, Leyang Cui, Xuxin Cheng, Wei Bi, Yuexian Zou, Shuming Shi
In Proceedings of the International Conference on Learning Representations, 2024. (ICLR 2024)
The Reasonableness Behind Unreasonable Translation Capability of Large Language Model [paper]
Tingchen Fu, Lemao Liu, Deng Cai, Guoping Huang, Shuming Shi, Rui Yan
In Proceedings of the International Conference on Learning Representations, 2024. (ICLR 2024)
Knowledge Fusion of Large Language Models [paper]
Fanqi Wan, Xinting Huang, Deng Cai, Xiaojun Quan, Wei Bi, Shuming Shi
In Proceedings of the International Conference on Learning Representations, 2024. (ICLR 2024)
Specialist or Generalist? Instruction Tuning for Specific NLP Tasks [paper]
Chufan Shi, Yixuan Su, Cheng Yang, Yujiu Yang, Deng Cai^☨
In Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2023. (EMNLP 2023)
Large Language Models Meet Harry Potter: A Bilingual Dataset for Aligning Dialogue Agents with Characters [arxiv] [paper]
Nuo Chen, Yan Wang, Haiyun Jiang, Deng Cai, Yuhan Li, Ziyang Chen, Longyue Wang, Jia Li
In Findings of the Association for Computational Linguistics: EMNLP, 2023. (EMNLP 2023 Findings)
Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective [arxiv] [paper] [code]
Huayang Li, Tian Lan, Zihao Fu, Deng Cai, Lemao Liu, Nigel Collier, Taro Watanabe, Yixuan Su
In Proceedings of the Conference on Neural Information Processing Systems, 2023. (NeurIPS 2023)
PandaGPT: One Model To Instruction-Follow Them All [blog] [demo] [arxiv] [code]
Yixuan Su^*, Tian Lan^*, Huayang Li^*, Jialu Xu, Yan Wang, Deng Cai^*,☨
In Proceedings of the 1st Workshop on Taming Large Language Models, 2023. (TLLM 2023)
Effidit: An Assistant for Improving Writing Efficiency [paper] [demo]
Tencent AI Lab
In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (System Demonstrations), 2023. (ACL 2023 demo)
Copy is All You Need [paper]
Tian Lan^*, Deng Cai^*,☨, Yan Wang, Heyan Huang, Xian-Ling Mao
In Proceedings of the International Conference on Learning Representations, 2023. (ICLR 2023)
Retrofitting Multilingual Sentence Embeddings with Abstract Meaning Representation [arxiv] [paper] [code]
Deng Cai, Xin Li, Jackie Chun-Sing Ho, Lidong Bing, Wai Lam
In Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2022. (EMNLP 2022)
Linearizing Transformer with Key-Value Memory [arxiv] [paper] [code]
Yizhe Zhang^*, Deng Cai^*
In Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2022. (EMNLP 2022)
N-gram Is Back: Residual Learning of Neural Text Generation with n-gram Language Model [arxiv] [paper]
Huayang Li, Deng Cai, Jin Xu, Taro Watanabe
In Findings of the Association for Computational Linguistics: EMNLP, 2022. (EMNLP 2022 Findings)
Measuring and Reducing Model Update Regression in Structured Prediction for NLP [arxiv] [blog] [paper]
Deng Cai, Elman Mansimov, Yi-An Lai, Yixuan Su, Lei Shu, Yi Zhang
In Proceedings of the Conference on Neural Information Processing Systems, 2022. (NeurIPS 2022)
Learning to Break the Loop: Analyzing and Mitigating Repetitions for Neural Text Generation [arxiv]
Jin Xu, Xiaojiang Liu, Jianhao Yan, Deng Cai, Huayang Li, Jian Li
In Proceedings of the Conference on Neural Information Processing Systems, 2022. (NeurIPS 2022)
Recent Advances in Retrieval-Augmented Text Generation
Deng Cai, Yan Wang, Lemao Liu, Shuming Shi
SIGIR Tutorial 2022, 2022. (SIGIR 2022 Tutorial)
Recent Advances in Retrieval-Augmented Text Generation
Deng Cai, Yan Wang, Lemao Liu, Shuming Shi
IJCAI Tutorial 2022, 2022. (IJCAI 2022 Tutorial)
Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System [arxiv] [paper] [code]
Yixuan Su, Lei Shu, Elman Mansimov, Arshit Gupta, Deng Cai, Yi-An Lai, Yi Zhang
In Proceedings of the Annual Meeting of the Association for Computational Linguistics, 2022. (ACL 2022)
Multilingual AMR Parsing with Noisy Knowledge Distillation [arxiv] [paper] [code]
Deng Cai, Xin Li, Jackie Chun-Sing Ho, Lidong Bing, Wai Lam
In Findings of the Association for Computational Linguistics: EMNLP, 2021. (EMNLP 2021 Findings)
Exploiting Reasoning Chains for Multi-hop Science Question Answering [arxiv] [paper] [code]
Weiwen Xu, Yang Deng, Huihui Zhang, Deng Cai, Wai Lam
In Findings of the Association for Computational Linguistics: EMNLP, 2021. (EMNLP 2021 Findings)
Neural Machine Translation with Monolingual Translation Memory [arxiv] [paper] [code] [slides]
Deng Cai, Yan Wang, Huayang Li, Wai Lam, Lemao Liu
In Proceedings of the Annual Meeting of the Association for Computational Linguistics, 2021. (ACL 2021)
Outstanding Paper Award
Dialogue Response Selection with Hierarchical Curriculum Learning [arxiv] [paper] [code]
Yixuan Su^*, Deng Cai^*, Qingyu Zhou, Zibo Lin, Simon Baker, Yunbo Cao, Shuming Shi, Nigel Collier, Yan Wang
In Proceedings of the Annual Meeting of the Association for Computational Linguistics, 2021. (ACL 2021)
Dynamic Semantic Graph Construction and Reasoning for Explainable Multi-hop Question Answering [arxiv] [paper] [code]
Weiwen Xu, Huihui Zhang, Deng Cai, Wai Lam
In Findings of the Association for Computational Linguistics: ACL, 2021. (ACL 2021 Findings)
Assessing Dialogue Systems with Distribution Distances [arxiv] [paper] [code]
Jiannan Xiang^*, Yahui Liu^*, Deng Cai, Huayang Li, Defu Lian, Lemao Liu
In Findings of the Association for Computational Linguistics: ACL, 2021. (ACL 2021 Findings)
Non-Autoregressive Text Generation with Pre-trained Language Models [arxiv] [paper]
Yixuan Su, Deng Cai, Yan Wang, David Vandyke, Simon Baker, Piji Li, Nigel Collier
In Proceedings of the Conference of the European Chapter of the Association for Computational Linguistics, 2021. (EACL 2021)
The World is Not Binary: Learning to Rank with Grayscale Data for Dialogue Response Selection [arxiv] [paper]
Zibo Lin^*, Deng Cai^*, Yan Wang, Xiaojiang Liu, Hai-Tao Zheng, Shuming Shi
In Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2020. (EMNLP 2020)
Describe What to Change: A Text-guided Unsupervised Image-to-Image Translation Approach [arxiv]
Yahui Liu, Marco De Nadai, Deng Cai, Huayang Li, Xavier Alameda-Pineda, Nicu Sebe, Bruno Lepri
In Proceedings of the 28th ACM International Conference on Multimedia, 2020. (ACM MM 2020)
AMR Parsing via Graph-Sequence Iterative Inference [arxiv] [paper] [code] [slides]
Deng Cai, Wai Lam
In Proceedings of the Annual Meeting of the Association for Computational Linguistics, 2020. (ACL 2020)
Graph Transformer for Graph-to-Sequence Learning [arxiv] [paper] [code] [slides]
Deng Cai, Wai Lam
In Proceedings of the AAAI Conference on Artificial Intelligence, 2020. (AAAI 2020)
Core Semantic First: A Top-down Approach for AMR Parsing [arxiv] [paper] [code] [slides]
Deng Cai, Wai Lam
In Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2019. (EMNLP 2019)
Retrieval-guided Dialogue Response Generation via a Matching-to-Generation Framework [paper] [code]
Deng Cai, Yan Wang, Wei Bi, Zhaopeng Tu, Xiaojiang Liu, Shuming Shi
In Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2019. (EMNLP 2019)
Skeleton-to-Response: Dialogue Generation Guided by Retrieval Memory [arxiv] [paper] [code] [slides]
Deng Cai, Yan Wang, Wei Bi, Zhaopeng Tu, Xiaojiang Liu, Wai Lam, Shuming Shi
In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics, 2019. (NAACL 2019)
Unsupervised Learning helps Supervised Neural Word Segmentation [preprint]
Xiaobin Wang, Deng Cai, Guangwei Xu, Hai Zhao, Linlin Li, Luo Si
In Proceedings of the AAAI Conference on Artificial Intelligence, 2019. (AAAI 2019)
Translating a Math Word Problem to a Expression Tree [paper]
Lei Wang, Yan Wang, Deng Cai, Dongxiang Zhang, Xiaojiang Liu
In Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2018. (EMNLP 2018)
Fast and Accurate Neural Word Segmentation for Chinese [paper] [code]
Deng Cai, Hai Zhao, Zhisong Zhang, Yuan Xin, Yongjian Wu, Feiyue Huang
In Proceedings of the Annual Meeting of the Association for Computational Linguistics, 2017. (ACL 2017)
Neural Word Segmentation Learning for Chinese [paper] [code] [slides]
Deng Cai, Hai Zhao
In Proceedings of the Annual Meeting of the Association for Computational Linguistics, 2016. (ACL 2016)
Exploring Dense Retrieval for Dialogue Response Selection
Tian Lan, Deng Cai^☨, Yan Wang, Yixuan Su, Heyan Huang, Xian-Ling Mao
ACM Transactions on Information Systems, 2023. (TOIS)
PROTOTYPE-TO-STYLE: Dialogue Generation With Style-Aware Editing on Retrieval Memory
Yixuan Su, Yan Wang, Deng Cai, Simon Baker, Anna Korhonen, Nigel Collier
IEEE Transactions on Audio, Speech and Language Processing, 2021. (TASLP)
Neural Machine Translation with Noisy Lexical Constraints
Huayang Li, Guoping Huang, Deng Cai, Lemao Liu
IEEE Transactions on Audio, Speech and Language Processing, 2020. (TASLP)
A Hybrid Model for Chinese Spelling Check [paper]
Deng Cai^*, Hai Zhao^*, Yang Xin, Yuzhu Wang, Zhongye Jia
ACM Transactions on Asian and Low-Resource Language Information Processing, 2017. (TALLIP)

Open Source Projects (Github Profile)

BERT
A simple yet complete implementation of the popular BERT model, tested in a large-scale distributed environment (more than 80 GPUs). A variant of it has been adopted by Tencent.
Dynet
I used to contribute actively to Dynet. A list of my contributions can be found here.
Biaffine Dependency Parser
A Dynet implementation of the popular biaffine dependency parser, achieving state-of-the-art performance. It is much simpler than the original TensorFlow code. I experimented with many domain adaptation methods on this model.
Implicit Discourse Relation Classification
Pair-aware sentence modeling for Implicit Discourse Relation Classification. A summary of this work: [paper]

Selected Awards and Honors

Young Elite Scientist Sponsorship Program, China Association for Science and Technology, 2023
EACL-2023 Outstanding Reviewer
ACL-2021 Outstanding Paper
EMNLP-2020 Outstanding Reviewer
AAAI-2020 Scholarship Award
National Scholarship for Graduate Student (top 2% students), Ministry of Education of P.R.China, 2016
Excellent Undergraduate Thesis Award, XMU, 2015
Dean's List, School of Information Science and Engineering, XMU, 2014
Gold Medal, The 5th Fujian Provincial University Programming Contest, 2014
Bronze Medal, ACM-ICPC Asia Regional Programming Contest, 2013

Education

Aug. 2018 - Jul. 2022
PhD, Dept. of Systems Engineering and Engineering Management, The Chinese University of Hong Kong
Sep. 2015 - Mar. 2018
MS, Dept. of Computer Science, Shanghai Jiao Tong University
Sep. 2011 - Jun. 2015
BE, Dept. of Computer Science, Xiamen University

Research Experience

Summer 2021, Applied scientist intern with Elman Mansimov and Yi Zhang
Amazon AWS AI, Seattle (remote)
Spring 2021, Research intern with Yizhe Zhang, Michel Galley, and Bill Dolan
Microsoft Research, Redmond (remote)
2018 - 2020, Research collaborator with Xiaojiang Liu, Yan Wang, and Shuming Shi
Tencent AI Lab, Shenzhen
Jan. 2018 - Mar. 2018, Research intern with Xiaobin Wang, Guangwei Xu, and Linlin Li
Alibaba DAMO Academy, Hangzhou
Jul. 2017 - Dec. 2017, Visiting scholar, Advisor: Prof. Yue Zhang
Singapore University of Technology and Design, Singapore

Professional Service

Area Chair

COLING(2022), ACL(2024)

Program Committee Member/Reviewer

ACL Rolling Review(2021-), ACL(2017-2023), EMNLP(2019-2023), NAACL(2021)
NeurIPS(2022-2023), ICML(2022-2024), ICLR(2023-2024), AAAI(2020-2022), SIGKDD(2022-2023), WSDM(2022-2023)
COLING(2016, 2018, 2020, 2022), AACL(2020, 2022-2023), EACL(2017, 2021, 2023), LREC(2018, 2020, 2022), INLG(2019-2021), etc.

Journal Reviewer

Computational Linguistics
IEEE Transactions on Pattern Analysis and Machine Intelligence
ACM Transactions on Information Systems
IEEE Transactions on Audio, Speech and Language Processing
Neurocomputing
Pattern Analysis and Applications

Miscellaneous

What?! I remember clearly, last time I played the game, I was an international master.

I cannot swim well because my density is too large.