Publications

Reinforcement Learning from Automatic Feedback for High-Quality Unit Test Generation PAPER

Published in DeepTest (ICSE Workshop), 2025

We propose a novel technique called Reinforcement Learning from Static Quality Metrics (RLSQM).

Recommended citation: Benjamin Steenhoek, Michele Tufano, Neel Sundaresan, and Alexey Svyatkovskiy. 2025. Reinforcement Learning from Automatic Feedback for High-Quality Unit Test Generation. In 2025 Sixth International Workshop on Deep Learning for Testing and Testing for Deep Learning (DeepTest ’25), April 27–May 3, 2025, Ottawa, Canada. https://arxiv.org/abs/2412.14308

Closing the Gap: A User Study on the Real-world Usefulness of AI-powered Vulnerability Detection & Repair in the IDE PAPER

Published in ICSE, 2025

We present DeepVulGuard, an AI-powered vulnerability detection & repair tool built into the VSCode IDE, and the results of a user study on this tool with 17 professional software developers.

Recommended citation: Benjamin Steenhoek, Siva Sivaraman, Renata Saldivar, Yevhen Mohylevskyy, Roshanak Zilouchian Moghaddam, and Wei Le. 2025. Closing the Gap: A User Study on the Real-world Usefulness of AI-powered Vulnerability Detection & Repair in the IDE. In 2025 IEEE/ACM 47th International Conference on Software Engineering (ICSE ’25), April 27–May 3, 2025, Ottawa, Canada. https://arxiv.org/abs/2412.14306

Understanding and improving deep learning models for vulnerability detection DISSERTATION

Published in Iowa State University, ProQuest Dissertations & Theses Global, 2024

In this dissertation, we comprehensively evaluate state-of-the-art (SOTA) DL vulnerability detection models and propose a body of approaches for improving DL for vulnerability detection using static and dynamic analysis.

Recommended citation: Benjamin Steenhoek. 2024. Understanding and improving deep learning models for vulnerability detection (Publication No. 31562057). Available from Dissertations & Theses @ Iowa State University; ProQuest Dissertations & Theses Global. https://benjijang.com/files/2024-12-19-dissertation.pdf

To Err is Machine: Vulnerability Detection Challenges LLM Reasoning PAPER

Published in ArXiv, 2024

We present a challenging code reasoning task: vulnerability detection. We evaluated the vulnerability detection capabilities of SOTA LLMs, systematically searched for the best-performing prompts, and analyzed the models’ reasoning.

Recommended citation: Benjamin Steenhoek, Md Mahbubur Rahman, Monoshi Kumar Roy, Mirza Sanjida Alam, Hengbo Tong, Swarna Das, Earl T. Barr, and Wei Le. 2024. To Err is Machine: Vulnerability Detection Challenges LLM Reasoning. ArXiv. https://arxiv.org/pdf/2403.17218

Dataflow Analysis-Inspired Deep Learning for Efficient Vulnerability Detection PAPER

Published in ICSE, 2024

We present DeepDFA, a dataflow analysis-inspired graph learning framework and an embedding technique that enables graph learning to simulate dataflow computation.

Recommended citation: Benjamin Steenhoek, Hongyang Gao, and Wei Le. 2024. Dataflow Analysis-Inspired Deep Learning for Efficient Vulnerability Detection. In 2024 IEEE/ACM 46th International Conference on Software Engineering (ICSE ’24), April 14–20, 2024, Lisbon, Portugal. https://doi.org/10.48550/arXiv.2212.08108

TRACED: Execution-aware Pre-training for Source Code PAPER

Published in ICSE, 2024

We introduce TRACED, an execution-aware pre-training strategy for source code wherein we pre-train code language models with a combination of source code, executable inputs, and corresponding execution traces.

Recommended citation: Yangruibo Ding, Benjamin Steenhoek, Kexin Pei, Gail Kaiser, Wei Le, and Baishakhi Ray. 2024. TRACED: Execution-aware Pre-training for Source Code. In 2024 IEEE/ACM 46th International Conference on Software Engineering (ICSE ’24), April 14–20, 2024, Lisbon, Portugal. ACM, New York, NY, USA, 12 pages. https://doi.org/10.48550/arXiv.2306.07487

Do Language Models Learn Semantics of Code? A Case Study in Vulnerability Detection PAPER

Published in ArXiv, 2023

In this paper, we analyze language models to investigate whether the models have learned the semantics of code relevant to vulnerability detection, namely bug semantics, and if so, how the alignment to bug semantics relates to model performance.

Recommended citation: Benjamin Steenhoek, Md Mahbubur Rahman, Shaila Sharmin, & Wei Le. (2023). Do Language Models Learn Semantics of Code? A Case Study in Vulnerability Detection. ArXiv. https://arxiv.org/abs/2311.04109

An Empirical Study of Deep Learning Models for Vulnerability Detection PAPER

Published in ICSE, 2023

In this paper, we surveyed and reproduced 9 state-of-the-art (SOTA) deep learning models on 2 widely used vulnerability detection datasets: Devign and MSR.

Recommended citation: Benjamin Steenhoek, Md Mahbubur Rahman, Richard Jiles, and Wei Le. 2023. An Empirical Study of Deep Learning Models for Vulnerability Detection. In Proceedings of the 45th International Conference on Software Engineering (ICSE 2023). https://doi.org/10.48550/arXiv.2212.08109

A Study of Static Warning Cascading Tools (Experience Paper) PREPRINT

Published in ArXiv, 2023

In this paper, we report the challenges of cascading warnings generated from two versions of programs. We investigated program differencing tools and extended them to perform warning cascading automatically.

Recommended citation: Guo, X., Joshy, A. K., Steenhoek, B., Le, W., & Flynn, L. (2023). A Study of Static Warning Cascading Tools (Experience Paper). ArXiv. https://doi.org/10.48550/arXiv.2305.02515

An Empirical Study of Open-Source Development Practices for Safety Certified Software COURSE PROJECT

Published in Iowa State University, COM S 515 Final Project, 2022

This paper extends a dataset of open-source safety-critical software with details about each project’s development practices and safety goals.

Recommended citation: Steenhoek, Benjamin. (2022). An Empirical Study of Open-Source Development Practices for Safety Certified Software (Final Project). Iowa State University COM S 515. https://benjijang.com/files/2022-04-26-coms515-opensource.pdf

Refactoring programs to improve the performance of deep learning for vulnerability detection POSTER

Published in Iowa State University 6th Annual Research Day, 2022

This poster is about refactoring programs as a method of data augmentation.

Recommended citation: Steenhoek, Benjamin. (2022). Refactoring programs to improve the performance of deep learning for vulnerability detection (Poster). Presented at: Iowa State University 6th Annual Research Day. https://benjijang.com/files/2022-04-01-poster.pdf

Refactoring programs to improve the performance of deep learning for vulnerability detection THESIS

Published in Iowa State University, ProQuest Dissertations & Theses Global, 2021

This thesis is about refactoring programs as a method of data augmentation.

Recommended citation: Steenhoek, Benjamin. (2021). Refactoring programs to improve the performance of deep learning for vulnerability detection (Publication No. 28648161). Available from Dissertations & Theses @ Iowa State University; ProQuest Dissertations & Theses Global. https://www.proquest.com/dissertations-theses/refactoring-programs-improve-performance-deep/docview/2625295478/se-2?accountid=10906

Validating static warnings via testing code fragments PAPER

Published in ISSTA, 2021

In this paper, we present a novel solution that automatically generates test cases based on static warnings to validate true and false positives.

Recommended citation: Ashwin Kallingal Joshy, Xueyuan Chen, Benjamin Steenhoek, and Wei Le. 2021. Validating static warnings via testing code fragments. In Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2021). Association for Computing Machinery, New York, NY, USA, 540–552. https://doi.org/10.1145/3460319.3464832