posted by jelizondo on Friday November 21, @04:45AM   Printer-friendly

Developers tend to scrutinize AI-generated code less critically and they learn less from it:

When two software developers collaborate on a programming project—known in technical circles as 'pair programming'—it tends to yield a significant improvement in the quality of the resulting software. 'Developers can often inspire one another and help avoid problematic solutions. They can also share their expertise, thus ensuring that more people in their organization are familiar with the codebase,' explains Sven Apel, professor of computer science at Saarland University. Together with his team, Apel has examined whether this collaborative approach works equally well when one of the partners is an AI assistant. [...]

For the study, the researchers used GitHub Copilot, an AI-powered coding assistant introduced by Microsoft in 2021, which, like similar products from other companies, has now been widely adopted by software developers. These tools have significantly changed how software is written. 'It enables faster development and the generation of large volumes of code in a short time. But this also makes it easier for mistakes to creep in unnoticed, with consequences that may only surface later on,' says Sven Apel. The team wanted to understand which aspects of human collaboration enhance programming and whether these can be replicated in human-AI pairings. Participants were tasked with developing algorithms and integrating them into a shared project environment.

'Knowledge transfer is a key part of pair programming,' Apel explains. 'Developers will continuously discuss current problems and work together to find solutions. This does not involve simply asking and answering questions, it also means that the developers share effective programming strategies and volunteer their own insights.' According to the study, such exchanges also occurred in the AI-assisted teams—but the interactions were less intense and covered a narrower range of topics. 'In many cases, the focus was solely on the code,' says Apel. 'By contrast, human programmers working together were more likely to digress and engage in broader discussions and were less focused on the immediate task.'

One finding particularly surprised the research team: 'The programmers who were working with an AI assistant were more likely to accept AI-generated suggestions without critical evaluation. They assumed the code would work as intended,' says Apel. 'The human pairs, in contrast, were much more likely to ask critical questions and were more inclined to carefully examine each other's contributions.' He believes this tendency to trust AI more readily than human colleagues may extend to other domains as well. 'I think it has to do with a certain degree of complacency—a tendency to assume the AI's output is probably good enough, even though we know AI assistants can also make mistakes.' Apel warns that this uncritical reliance on AI could lead to the accumulation of 'technical debt': the hidden cost of the work that will later be needed to correct these mistakes, which complicates further development of the software.


Original Submission
This discussion was created by jelizondo (653) for logged-in users only, but now has been archived. No new comments can be posted.
  • (Score: 4, Interesting) by JoeMerchant (3937) on Friday November 21, @07:22PM (#1424885)

    'I think it has to do with a certain degree of complacency—a tendency to assume the AI's output is probably good enough, even though we know AI assistants can also make mistakes.'

    So, I've been using Cursor for 3 weeks, Claude Code (virtually the same thing) for about a month in earnest. My primary conclusion is: the AI's initial output is probably NOT good enough. It needs more specification, oversight during development, testing, verification, validation, and in general "mature software development processes" than 95% of the human programmers I have ever worked with.

    AI loves to tell you it's done what you asked, 100% complete, MISSION ACCOMPLISHED!!! and, yet, when you ask it to review requirements it sheepishly fills out a new to-do list of things that were "deferred to a later phase of development" by it, unilaterally.

    It's an awful lot like working with people. It doesn't get anything 100% right 100% of the time. If you don't spell out very precise "done" criteria, it gets very creative about how to define its own "done" that happens before another compacting of its context window. It forgets stuff, particularly after compacting its context window.

    If you're using an AI to write code, make sure to ask it to "Review for technical debt, report." and then have it clean up its messes completely before letting anyone else see your AI generated masterpieces. I would also recommend fully developed and human reviewed specifications, test driven development, minimum 95% unit test coverage, integration tests as appropriate, and periodic reviews for consistency and conflicts between the specification documents themselves and between the specifications and the implementation.
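    To make the test-driven part concrete: one way to pin down "done" criteria before accepting AI-generated code is to write the tests yourself first, so the assistant's output has to pass your definition of correct rather than its own. A minimal sketch in Python, with an invented example function (`parse_version` and its spec are hypothetical, purely for illustration):

    ```python
    # Test-first gate for AI-generated code: the human writes the spec and the
    # tests; the generated implementation must satisfy them before acceptance.
    # The function below stands in for whatever the AI was asked to produce.

    def parse_version(tag: str) -> tuple:
        """Parse a 'vMAJOR.MINOR.PATCH' tag into a tuple of three ints."""
        if not tag.startswith("v"):
            raise ValueError(f"tag must start with 'v': {tag!r}")
        parts = tag[1:].split(".")
        if len(parts) != 3:
            raise ValueError(f"expected three components: {tag!r}")
        # int() raises ValueError on non-numeric components, which is the
        # behavior the spec (written before the code) requires.
        return tuple(int(p) for p in parts)

    # Tests written *before* the implementation was accepted -- these, not the
    # AI's own "MISSION ACCOMPLISHED" report, define what "done" means.
    def test_parse_version():
        assert parse_version("v1.2.3") == (1, 2, 3)
        assert parse_version("v10.0.42") == (10, 0, 42)
        for bad in ("1.2.3", "v1.2", "v1.2.x"):
            try:
                parse_version(bad)
            except ValueError:
                pass
            else:
                raise AssertionError(f"expected ValueError for {bad!r}")

    test_parse_version()
    ```

    The coverage threshold can then be enforced mechanically rather than by trust, e.g. with pytest-cov: `pytest --cov --cov-fail-under=95` fails the build when coverage drops below the bar, which keeps the AI (or anyone else) from quietly deferring untested paths "to a later phase."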

    Once you get through all that, muck out the steaming pile of self-congratulatory documentation the AI generates, review and refine the code until you wouldn't be embarrassed to call it your own. After all, it is _your_ code, you are just using a tool to help generate it.
