Xiaodong GU


Associate Professor

School of Computer Science, Shanghai Jiao Tong University

Contact:

Room 1208, Software Building, No.800 Dongchuan Road, Shanghai, China
Email:

Research Interests:

My research focuses on large language models for natural and programming languages. I develop efficient machine learning methodologies for software code. My research topics are:

GitHub

Selected Publications

[Full List] [Google Scholar]

[FSE 2026] Neuron-Guided Interpretation of Code LLMs: Where, Why, and How?
Zhe Yin, Xiaodong Gu, Beijun Shen

[FSE 2026] Beyond Language Boundaries: Uncovering Program Language Families with Code Language Models
Shangho Yun, Xiaodong Gu, Jianghong Huang, Beijun Shen

[FSE 2026] In Line with Context: Repository-Level Code Generation via Context Inlining
Chao Hu, Wenhao Zeng, Yuling Shi, Beijun Shen, Xiaodong Gu

[ICSE 2026] SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution
Han Li, Yuling Shi, Shaoxin Lin, Xiaodong Gu, Heng Lian, Xin Wang, Yantao Jia, Tao Huang, Qianxiang Wang
[paper]

[ICSE 2026 SEIP] EVOC2RUST: A Skeleton-guided Framework for Project-Level C-to-Rust Translation
Chaofan Wang, Tingrui Yu, Chen Xie, Jie Wang, Dong Chen, Wenrui Zhang, Yuling Shi, Xiaodong Gu, Beijun Shen
[paper]

[ICSE 2026] From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging
Yuling Shi, Songsong Wang, Chengcheng Wan, Min Wang and Xiaodong Gu
[paper] [code]

[AAAI 2026] Anti-Adversarial Learning: Desensitizing Prompts for Large Language Models
Xuan Li, Zhe Yin, Xiaodong Gu, and Beijun Shen
[paper]

[AAMAS 2026] HyperAgent: Leveraging Hypergraphs for Topology Optimization in Multi-Agent Communication
Heng Zhang, Yuling Shi, Xiaodong Gu, Zijian Zhang, Haochen You, Lubin Gan, Yilei Yuan, Jin Huang

[AAMAS 2026] GraphTracer: Graph-Guided Failure Tracing in LLM Agents for Robust Multi-Turn Deep Search
Heng Zhang, Yuling Shi, Xiaodong Gu, Zijian Zhang, Haochen You, Lubin Gan, Yilei Yuan, Jin Huang

[AAMAS 2026] D³MAS: Decompose, Deduce, and Distribute for Enhanced Knowledge Sharing in Multi-Agent Systems
Heng Zhang, Yuling Shi, Xiaodong Gu, Zijian Zhang, Haochen You, Lubin Gan, Yilei Yuan, Jin Huang

[TSE 2025] Synthetic Malware at Scale: Malicious Code Generation with Code Transplanting
Guangzhan Wang, Diwei Chen, Yuting Chen, Beijun Shen, Xiaodong Gu

[ASE 2025] LongCodeZip: Compress Long Context for Code Language Models
Yuling Shi, Yichun Qian, Hongyu Zhang, Beijun Shen, Xiaodong Gu
[paper] [code]

[EMNLP 2025] Transplant Then Regenerate: A New Paradigm for Text Data Augmentation
Guangzhan Wang, Hongyu Zhang, Beijun Shen and Xiaodong Gu
[paper] [code]

[EMNLP 2025 Findings] LastingBench: Defend Benchmarks Against Knowledge Leakage
Yixiong Fang, Tianran Sun, Yuling Shi, Min Wang and Xiaodong Gu
[paper] [code]

[ICSE 2025] Between Lines of Code: Unraveling the Distinct Patterns of Machine and Human Programmers
Yuling Shi, Hongyu Zhang, Chengcheng Wan and Xiaodong Gu
[paper] [code] [bibtex]

[TOSEM 2025] On the Effectiveness of Large Language Models in Domain-Specific Code Generation
Xiaodong Gu, Meng Chen, Yalan Lin, Yuhan Hu, Hongyu Zhang, Chengcheng Wan, Zhao Wei, Yong Xu, Juhong Wang
(ESI Highly Cited Paper)
[paper]

[ASE 2024] How Effectively Do Code Language Models Understand Poor-Readability Code?
Chao Hu, Yitian Chai, Hao Zhou, Fandong Meng, Jie Zhou and Xiaodong Gu
[paper] [code] [bibtex]

[TSE 2024] VarGAN: Adversarial Learning of Variable Semantic Representations
Yalan Lin, Chengcheng Wan, Shuwen Bai, Xiaodong Gu
[paper] [code]

[ASE 2023] On the Evaluation of Neural Code Translation: Taxonomy and Benchmark
Mingsheng Jiao, Tingrui Yu, Xuan Li, Guanjie Qiu, Xiaodong Gu, Beijun Shen
[paper] [slides] [code]

[ASE 2023] InfeRE: Step-by-Step Regex Generation via Chain of Inference
Shuai Zhang, Xiaodong Gu, Yuting Chen, Beijun Shen
[paper] [slides] [code] [bibtex]

[ESEC/FSE 2023] Self-Supervised Query Reformulation for Code Search
Yuetian Mao, Chengcheng Wan, Yuze Jiang, Xiaodong Gu
[paper] [slides] [code] [bibtex]

[FSE 2022] Diet Code Is Healthy: Simplifying Programs for Pre-Trained Models of Code
Zhaowei Zhang, Hongyu Zhang, Beijun Shen, Xiaodong Gu
[paper] [slides] [code] [bibtex]

[ICSE 2022] Cross-Domain Deep Code Search with Meta Learning
Yitian Chai, Hongyu Zhang, Beijun Shen and Xiaodong Gu
[paper] [code] [slides] [bibtex]

Teaching

Students

I am grateful to the wonderful students I have been collaborating with.

Alumni

Grants

Services

Program Committee: ASE (2025), ACL (2023), EMNLP (2021, 2022, 2023), COLING (2020, 2022, 2024), IJCAI (2023), EACL (2023)
Reviewer Board: Automated Software Engineering (AUSE), Empirical Software Engineering (EMSE)
Journal Reviewer: TSE, TOSEM, EMSE, IST, JSS, FCS