Create guidelines around AI assisted code changes and code review #89343

Open
nashif opened this issue Apr 30, 2025 · 2 comments
Labels
Process Tracked by the process WG RFC Request For Comments: want input from the community

Comments

@nashif (Member) commented Apr 30, 2025

Create guidelines around AI-assisted code changes and code reviews using Copilot in GitHub.

@nashif nashif added Process Tracked by the process WG RFC Request For Comments: want input from the community labels Apr 30, 2025
@github-project-automation github-project-automation bot moved this to To do in Process Apr 30, 2025
@nashif nashif changed the title Create guidelines around AI assisgted code changes and code review Create guidelines around AI assisted code changes and code review Apr 30, 2025
@keith-zephyr (Collaborator)

One agreement from the Process WG: Copilot reviews are suggestions only. Copilot-requested changes are non-blocking and require a human reviewer to make a blocking change request.

As a project, Zephyr needs to formally establish that using Copilot for reviews is allowed.

@kartben - Does the Process WG think Copilot is adding noise or providing value?

  • The summary can be seen as noise or as helpful, depending on the size of the PR.
  • @fabiobaltieri - doesn't like the summary feature
  • @nashif - there are other tools that can identify complexity issues better than Copilot (see the first sketch after this list)
  • @kartben - Copilot is finding real issues
  • @nashif - we could use the hide feature when Copilot doesn't add anything useful, to reduce noise
  • @keith-zephyr - Copilot is providing both value and noise
  • @kartben - hopefully it continues to improve
  • @nashif - would like to see the project move to always enabling it once we agree there is value
  • @keith-zephyr - if low-value comments are hidden, can we track this to measure signal to noise? (see the second sketch below)
  • @kartben - for well-formed PRs, Copilot will only provide a summary
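
As a hypothetical illustration of the "other tools" point above (no specific tool was named in the discussion), an off-the-shelf complexity analyzer such as the open-source lizard package can flag functions whose cyclomatic complexity exceeds a threshold. A minimal sketch; the file path and the threshold of 10 are assumptions, not Zephyr policy:

    # Sketch only: report overly complex functions in a C file using the
    # "lizard" analyzer (pip install lizard). The path and the CCN
    # threshold of 10 are illustrative, not Zephyr policy.
    import lizard

    info = lizard.analyze_file("drivers/serial/uart_sample.c")  # hypothetical path
    for func in info.function_list:
        if func.cyclomatic_complexity > 10:
            print(f"{func.long_name} (line {func.start_line}): "
                  f"CCN={func.cyclomatic_complexity}, NLOC={func.nloc}")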

Reviewers do need to enable Copilot on their individual accounts before the option to add Copilot as a reviewer shows up.
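
On keith-zephyr's tracking question, GitHub's GraphQL API does expose whether a comment has been minimized ("hidden"), so one rough signal-to-noise measure could be counting hidden Copilot comments per PR. A minimal sketch, assuming a GITHUB_TOKEN environment variable, the requests library, an illustrative PR number, and a login-based filter for Copilot's bot account:

    # Sketch only: count minimized ("hidden") Copilot comments on one PR
    # via the GitHub GraphQL API. The PR number and the "copilot" login
    # match are assumptions.
    import os
    import requests

    QUERY = """
    query($owner: String!, $name: String!, $pr: Int!) {
      repository(owner: $owner, name: $name) {
        pullRequest(number: $pr) {
          comments(first: 100) {
            nodes { isMinimized minimizedReason author { login } }
          }
        }
      }
    }
    """

    resp = requests.post(
        "https://api.github.com/graphql",
        json={"query": QUERY,
              "variables": {"owner": "zephyrproject-rtos", "name": "zephyr", "pr": 12345}},
        headers={"Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}"},
    )
    nodes = resp.json()["data"]["repository"]["pullRequest"]["comments"]["nodes"]
    copilot = [n for n in nodes if n["author"] and "copilot" in n["author"]["login"].lower()]
    hidden = sum(n["isMinimized"] for n in copilot)
    print(f"Copilot comments: {len(copilot)}, hidden as low-value: {hidden}")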

Action Items

  • Need to review costs
  • Continue gathering data to assess value to the project. @kartben to investigate the costs as GitHub rolls this out

@aescolar (Member) commented May 7, 2025

My thoughts (this issue is not limited to Copilot).
First, a couple of serious concerns:

  • With regard to AI-generated or partially generated code: I think the project should not accept it (beyond trivial changes), and should be explicit about that in the contribution guidelines. Why: a) Contributors must have rights over the code they contribute and sign it off (see the sketch after this list). With code produced by an AI agent, a contributor cannot know where it came from. With some agents collating search results and LLM-generated filler, we are likely to end up in a situation where the AI agent is just hiding the fact that the code is somebody else's copyrighted or incompatibly licensed code. b) AI-generated code can look fine at first glance but be completely incorrect or pointless, while a too-permissive or hurried reviewer may accept it. Being AI-generated means the barrier for a contributor to generate and submit it is very low (and therefore the code may not even be tested or built at all; we have seen this already in PRs). The risk of crud getting in, and of reviewers' time being wasted, is too high.

  • About commit messages, PR descriptions, or answers in discussions: we should be explicit about requiring contributors to either write them themselves or be very critical of any AI-generated ones. Why: we already have several examples of PR descriptions, commits, and PR messages generated by AI, where the user did not disclose this fact, and where on a quick, isolated first look the PR description or commit message seems fine, but it then becomes evident that they do not match the actual commits. LLMs are very good at generating correct-sounding text, which need not be factually correct. At best this will waste reviewers' time and create pointless confusion. At worst: a) crud will get into our repository, and b) we end up in a situation where we have "contributors" acting as interfaces to AI agents, copy-pasting the discussion and the AI agent's responses, wasting reviewers' time in senseless back-and-forths, or, even worse, somebody automates an AI agent to do it without the human middleman. (As a maintainer I absolutely do not want to waste my time thinking I am helping a new contributor, patiently trying to explain things, while all I am doing is interacting with a pretender AI agent.)
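
On the sign-off point in (a) above, one hypothetical way to make AI assistance explicit would be a disclosure trailer alongside the DCO Signed-off-by line that Zephyr already requires. A minimal sketch; the Assisted-by trailer and the names are made up, not an existing Zephyr convention:

    drivers: serial: illustrative one-line summary

    Commit body describing the change. Note: the Assisted-by trailer
    below is a hypothetical disclosure convention, not current Zephyr
    policy.

    Assisted-by: GitHub Copilot (suggestions reviewed, built, and tested by author)
    Signed-off-by: Jane Developer <jane@example.com>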

I think both of these are real, serious issues we are starting to face, and unless we are clear about them from today, they will only get much worse over time.

About using Copilot in reviews:

  • I would be quite wary of it, and especially of starting to use it systematically or spending any money on it. To me it seems like MS is just offering the technology for free or cheap for now while they mature it, covering the cost while they beta test it with us. The technology may improve, but if it does, the cost will certainly rise. This technology is quite a resource burner, so if the cost is passed on to us, I cannot imagine it being worth our project's money any time soon if anybody can just press the "Copilot, do some work now" button. GitHub's CI is hardly price competitive; I cannot imagine Copilot would be.

So far I have seen Copilot provide some valid review feedback (typos, or issues a static analyzer could also detect) and some that was pointless or senseless. I can also imagine Copilot helping somebody understand a PR they otherwise wouldn't, but you may also wonder whether a reviewer who needs Copilot to explain a PR to them should be reviewing that PR at all, or, even worse, basing their feedback on the understanding they just gained from a Copilot-generated explanation.
