Associate Professor
School of Computer Science, Shanghai Jiao Tong University
Contact:
Room 1208, Software Building, No.800 Dongchuan Road, Shanghai, China
Email:
Research Interest:
My research focuses on large language models for natural and programming languages. I develop efficient machine learning methodologies for software code. My research topics are:
[FSE 2026] Neuron-Guided Interpretation of Code LLMs: Where, Why, and How?
[FSE 2026] Beyond Language Boundaries: Uncovering Program Language Families with Code Language Models
[FSE 2026] In Line with Context: Repository-Level Code Generation via Context Inlining
[ICSE 2026] SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution
[paper]
[ICSE 2026 SEIP] EVOC2RUST: A Skeleton-guided Framework for Project-Level C-to-Rust Translation
[paper]
[ICSE 2026] From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging
[paper]
[code]
[AAAI 2026] Anti-Adversarial Learning: Desensitizing Prompts for Large Language Models
[paper]
[AAMAS 2026] HyperAgent: Leveraging Hypergraphs for Topology Optimization in Multi-Agent Communication
[AAMAS 2026] GraphTracer: Graph-Guided Failure Tracing in LLM Agents for Robust Multi-Turn Deep Search
[AAMAS 2026] D³MAS: Decompose, Deduce, and Distribute for Enhanced Knowledge Sharing in Multi-Agent Systems
[TSE 2025] Synthetic Malware at Scale: Malicious Code Generation with Code Transplanting
[ASE 2025] LongCodeZip: Compress Long Context for Code Language Models
[paper]
[code]
[EMNLP 2025] Transplant Then Regenerate: A New Paradigm for Text Data Augmentation
[paper]
[code]
[EMNLP 2025 Findings] LastingBench: Defend Benchmarks Against Knowledge Leakage
[paper]
[code]
[ICSE 2025] Between Lines of Code: Unraveling the Distinct Patterns of Machine and Human Programmers
[paper]
[code]
[bibtex]
[TOSEM 2025] On the Effectiveness of Large Language Models in Domain-Specific Code Generation
(ESI Highly Cited Paper)
[paper]
[ASE 2024] How Effectively Do Code Language Models Understand Poor-Readability Code?
[paper]
[code]
[bibtex]
[TSE 2024] VarGAN: Adversarial Learning of Variable Semantic Representations
[paper]
[code]
[ASE 2023] On the Evaluation of Neural Code Translation: Taxonomy and Benchmark
[paper]
[slides]
[code]
[ASE 2023] InfeRE: Step-by-Step Regex Generation via Chain of Inference
[paper]
[slides]
[code]
[bibtex]
[ESEC/FSE 2023] Self-Supervised Query Reformulation for Code Search
[paper]
[slides]
[code]
[bibtex]
[FSE 2022] Diet Code Is Healthy: Simplifying Programs for Pre-Trained Models of Code
[paper]
[slides]
[code]
[bibtex]
[ICSE 2022] Cross-Domain Deep Code Search with Meta Learning
[paper]
[code]
[slides]
[bibtex]
I am grateful to the wonderful students I have been collaborating with.
Alumni
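, Scenario-Knowledge-Enhanced Automatic Java Code Generation, 2024.9.1-2025.2.25, Principal Investigator
, Malicious Code Sample Generation Based on Large Models, 2023.5.1-2024.4.31, Principal Investigator
| Program Committee | ASE (2025), ACL (2023), EMNLP (2021, 2022, 2023), COLING (2020, 2022, 2024), IJCAI (2023), EACL (2023) |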
| Reviewer Board | Automated Software Engineering (AUSE), Empirical Software Engineering (EMSE) |
| Journal Reviewer | TSE, TOSEM, EMSE, IST, JSS, FCS |