Sumon Biswas
LLM
Plan Then Action: High-Level Planning Guidance Reinforcement Learning for LLM Reasoning
We propose PTA-GRPO, a two-stage framework that improves LLM reasoning by combining high-level planning guidance with guidance-aware reinforcement learning.
Zhihao Dou, Qinjian Zhao, Zhongwei Wan, Dinggen Zhang, Weida Wang, Towsif Raiyan, Benteng Chen, Qingtao Pan, Yang Ouyang, Zhiqiang Gao, Shufei Zhang, Sumon Biswas
ArXiv
Bias Testing and Mitigation in Black Box LLMs using Metamorphic Relations
We propose a unified framework using metamorphic relations for systematic bias evaluation and mitigation in black-box LLMs.
Sina Salimian, Gias Uddin, Sumon Biswas, Henry Leung
ArXiv
Are Prompt Engineering and TODO Comments Friends or Foes? An Evaluation on GitHub Copilot
We show that GitHub Copilot can generate code with the symptoms of self-admitted technical debt (SATD), both prompted and unprompted. We also demonstrate the tool’s ability to automatically repay SATD under different circumstances, and we qualitatively investigate the characteristics of successful and unsuccessful comments.
David OBrien, Sumon Biswas, Sayem Imtiaz, Rabe Abdalkareem, Emad Shihab, Hridesh Rajan
DOI