I am a master's student in the School of Computer Science and Technology at Harbin Institute of Technology, Shenzhen. I have the privilege of being advised by Prof. Cuiyun Gao. I received my Bachelor's degree from Harbin Institute of Technology, Shenzhen. My research focuses primarily on Artificial Intelligence for Software Engineering (AI4SE) and Code Intelligence.
I welcome any communication, so feel free to reach out! You can contact me via email at 200111115@stu.hit.edu.cn or on WeChat at gghhzzxhkazz22.
Annual Conference on Neural Information Processing Systems (CCF-A Conference)
We introduce Repo2Run, the first LLM-based agent that aims to automate building executable test environments for arbitrary repositories at scale. Specifically, given a code repository, Repo2Run iteratively builds the Docker image, runs unit tests based on feedback from the build, and synthesizes the Dockerfile until the entire pipeline executes successfully.
🏆 Spotlight Paper
The IEEE/ACM International Conference on Automated Software Engineering (CCF-A Conference)
We propose CodeVisionary, the first agent-based evaluation framework for complex code generation. CodeVisionary consists of two stages: (1) A requirement-guided multi-dimensional context distillation stage, which first formulates a detailed evaluation plan by decomposing task requirements and then collects multi-dimensional contextual information for each requirement step by step. (2) A fine-grained scoring and summarization stage, which defines self-directed and negotiation-based actions, allowing multiple judges to comprehend complex code from fine-grained and diverse viewpoints and to reach a consensus through discussion. A comprehensive evaluation report is also generated for enhanced explainability.
🏆 Directly Accepted Without Revision (9.9%)
The ACM International Conference on the Foundations of Software Engineering (CCF-A Conference)
We propose AEGIS, an automated framework for generating bug reproduction scripts. AEGIS consists of two main modules: (1) A bug-related context summarization module, which condenses the retrieved information into structural context through further reranking and summarization. (2) A finite state machine (FSM)-guided script generation module, which guides the script modification process with a proposed FSM that contains predefined modification rules.
The International ACM SIGIR Conference on Research and Development in Information Retrieval (CCF-A Conference)
We introduce CodeRepoQA, a large-scale benchmark specifically designed for evaluating repository-level question-answering capabilities in the field of software engineering. CodeRepoQA is a multi-turn question-answering benchmark with 585,687 entries. It covers a diverse array of software engineering scenarios, with an average of 6.62 dialogue turns per entry.
The IEEE/ACM International Conference on Software Engineering (CCF-A Conference)
We propose an automated data collection framework and construct ReposVul, the first high-quality repository-level vulnerability dataset. The proposed framework contains three main modules: (1) A vulnerability untangling module, which distinguishes vulnerability-fixing code changes from tangled patches. (2) A multi-granularity dependency extraction module, which captures the inter-procedural call relationships of vulnerabilities. (3) A trace-based filtering module, which filters out outdated patches.
🏆 Best Paper Award of the Track
The IEEE/ACM International Conference on Automated Software Engineering (CCF-A Conference)
We propose PILOT, a novel model for vulnerability detection. It contains two main modules: (1) A distance-aware label selection module, which generates pseudo-labels for selected unlabeled data using an inter-class distance prototype and progressive fine-tuning. (2) A mixed-supervision representation learning module, which further alleviates the influence of noise and enhances the discrimination of representations.
We propose Trae Agent, the first agent-based ensemble reasoning approach for repository-level issue resolution. Trae Agent formulates the goal as an optimal solution search problem and addresses two key challenges, namely large ensemble spaces and repository-level understanding, through modular agents for generation, pruning, and selection. We conduct extensive experiments using three leading LLMs on the widely adopted SWE-bench benchmark, comparing Trae Agent against four state-of-the-art ensemble reasoning techniques.
Recent advances in large language models (LLMs) have shown significant potential to automate various software development tasks, including code completion, test generation, and bug fixing. However, the application of LLMs for automated bug fixing remains challenging due to the complexity and diversity of real-world software systems. In this paper, we introduce MarsCode Agent, a novel framework that leverages LLMs to automatically identify and repair bugs in software code. MarsCode Agent combines the power of LLMs with advanced code analysis techniques to accurately localize faults and generate patches. Our approach follows a systematic process of planning, bug reproduction, fault localization, candidate patch generation, and validation to ensure high-quality bug fixes. We evaluated MarsCode Agent on SWE-bench, a comprehensive benchmark of real-world software projects, and our results show that MarsCode Agent achieves a high success rate in bug fixing compared to most of the existing automated approaches.
2024.06 – now, Trae Research, ByteDance, China. Working with Chao Peng and Pengfei Gao.
2024.09 – now, M.Eng. in Computer Science and Technology, Harbin Institute of Technology, Shenzhen
2020.09 – 2024.06, B.Eng. in Computer Science and Technology, Harbin Institute of Technology, Shenzhen