posted by jelizondo on Friday November 21, @04:45AM   Printer-friendly

Developers tend to scrutinize AI-generated code less critically and they learn less from it:

When two software developers collaborate on a programming project—known in technical circles as 'pair programming'—it tends to yield a significant improvement in the quality of the resulting software. 'Developers can often inspire one another and help avoid problematic solutions. They can also share their expertise, thus ensuring that more people in their organization are familiar with the codebase,' explains Sven Apel, professor of computer science at Saarland University. Together with his team, Apel has examined whether this collaborative approach works equally well when one of the partners is an AI assistant. [...]

For the study, the researchers used GitHub Copilot, an AI-powered coding assistant introduced by Microsoft in 2021, which, like similar products from other companies, has now been widely adopted by software developers. These tools have significantly changed how software is written. 'It enables faster development and the generation of large volumes of code in a short time. But this also makes it easier for mistakes to creep in unnoticed, with consequences that may only surface later on,' says Sven Apel. The team wanted to understand which aspects of human collaboration enhance programming and whether these can be replicated in human-AI pairings. Participants were tasked with developing algorithms and integrating them into a shared project environment.

'Knowledge transfer is a key part of pair programming,' Apel explains. 'Developers will continuously discuss current problems and work together to find solutions. This does not involve simply asking and answering questions, it also means that the developers share effective programming strategies and volunteer their own insights.' According to the study, such exchanges also occurred in the AI-assisted teams—but the interactions were less intense and covered a narrower range of topics. 'In many cases, the focus was solely on the code,' says Apel. 'By contrast, human programmers working together were more likely to digress and engage in broader discussions and were less focused on the immediate task.'

One finding particularly surprised the research team: 'The programmers who were working with an AI assistant were more likely to accept AI-generated suggestions without critical evaluation. They assumed the code would work as intended,' says Apel. 'The human pairs, in contrast, were much more likely to ask critical questions and were more inclined to carefully examine each other's contributions.' He believes this tendency to trust AI more readily than human colleagues may extend to other domains as well. 'I think it has to do with a certain degree of complacency—a tendency to assume the AI's output is probably good enough, even though we know AI assistants can also make mistakes.' Apel warns that this uncritical reliance on AI could lead to the accumulation of 'technical debt', which can be thought of as the hidden cost of future work needed to correct these mistakes, thereby complicating the further development of the software.


Original Submission

This discussion was created by jelizondo (653) for logged-in users only, but now has been archived. No new comments can be posted.
  • (Score: 4, Insightful) by Snotnose on Friday November 21, @02:08PM (3 children)

    by Snotnose (1623) Subscriber Badge on Friday November 21, @02:08PM (#1424866)

When I'm coding the last thing I want is someone talking to me. Every time I've asked AI for code it gives me something that is almost, but not quite, correct. By the time I fix the AI's problems, I could have coded the damned thing myself.

    I suspect the kind of programmer that works well at pair programming is the kind that doesn't notice the AI code's problems and just trusts it.

    --
Recent research has shown that 1 out of 3 Trump supporters is as stupid as the other 2.
    • (Score: 2) by JoeMerchant on Friday November 21, @07:25PM (2 children)

      by JoeMerchant (3937) on Friday November 21, @07:25PM (#1424886)

      Pair programming works brilliantly, in the right circumstances.

      Over the last 37 years, I can think of at least five pair programming sessions I participated in that were stellar examples of what "pair programming should be."

      I also tried about ten more that were equivocal in their outcome... awkward and uncomfortable socially, and as a result not particularly productive on the programming side.

      Like any partnership, the first key ingredient is: two eager and willing partners.

      --
      🌻🌻🌻 [google.com]
      • (Score: 2) by Thexalon on Friday November 21, @08:43PM (1 child)

        by Thexalon (636) on Friday November 21, @08:43PM (#1424890)

        Ditto for whiteboard design sessions: When you have every reason to respect the people you are working with based on your shared work history, you can debate and argue and land on something everyone is happy with. Add one or two people you know are morons and/or jerks, and that all goes away.

        Pair sessions work really well when 2 smart people think and talk through a problem the way I suspect Dennis and Ken did many many times back in the creation of Unix. They don't work if one partner is way more capable than the other.

        --
        "Think of how stupid the average person is. Then realize half of 'em are stupider than that." - George Carlin
        • (Score: 2) by JoeMerchant on Friday November 21, @10:01PM

          by JoeMerchant (3937) on Friday November 21, @10:01PM (#1424894)

          They don't work if one partner is way more capable than the other.

True-ish. A couple of the good sessions I have had were in cases where one partner was capable AND confident, and the other partner was capable but not so confident. The confident partner can give minimal prompting, letting the other build their confidence along the way.

          When one partner is clueless and not trying to improve themselves, the session is doomed. I've had a couple of those, too.

          --
          🌻🌻🌻 [google.com]
  • (Score: 0) by Anonymous Coward on Friday November 21, @02:09PM (1 child)

    by Anonymous Coward on Friday November 21, @02:09PM (#1424867)

    pair programming: paying two incompetent programmers to do the job of one.

    • (Score: 2) by JoeMerchant on Friday November 21, @11:01PM

      by JoeMerchant (3937) on Friday November 21, @11:01PM (#1424896)

      If you're hiring (paying) incompetent programmers, at least they do half the damage per hour while paired.

      --
      🌻🌻🌻 [google.com]
  • (Score: 4, Interesting) by JoeMerchant on Friday November 21, @07:22PM

    by JoeMerchant (3937) on Friday November 21, @07:22PM (#1424885)

    'I think it has to do with a certain degree of complacency—a tendency to assume the AI's output is probably good enough, even though we know AI assistants can also make mistakes.'

    So, I've been using Cursor for 3 weeks, Claude Code (virtually the same thing) for about a month in earnest. My primary conclusion is: the AI's initial output is probably NOT good enough. It needs more specification, oversight during development, testing, verification, validation, and in general "mature software development processes" than 95% of the human programmers I have ever worked with.

    AI loves to tell you it's done what you asked, 100% complete, MISSION ACCOMPLISHED!!! and, yet, when you ask it to review requirements it sheepishly fills out a new to-do list of things that were "deferred to a later phase of development" by it, unilaterally.

    It's an awful lot like working with people. It doesn't get anything 100% right 100% of the time. If you don't spell out very precise "done" criteria, it gets very creative about how to define its own "done" that happens before another compacting of its context window. It forgets stuff, particularly after compacting its context window.

    If you're using an AI to write code, make sure to ask it to "Review for technical debt, report." and then have it clean up its messes completely before letting anyone else see your AI generated masterpieces. I would also recommend fully developed and human reviewed specifications, test driven development, minimum 95% unit test coverage, integration tests as appropriate, periodic reviews for consistency / conflicts between specification documents, themselves and the implementation.
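A minimal sketch of the test-first workflow described above (the function and its tests are invented for illustration; the 95% figure is just the threshold suggested in this comment, not anything the tools enforce):

```python
# Tests are written first and pin down the "done" criteria, so AI-generated
# code is judged against explicit acceptance checks rather than its own
# claims of completion. parse_version is a hypothetical example function.

def parse_version(s):
    """Split a dotted version string like '1.2.3' into a tuple of ints."""
    return tuple(int(part) for part in s.split("."))

# Acceptance criteria, written before accepting any generated implementation:
assert parse_version("1.2.3") == (1, 2, 3)
assert parse_version("10.0") == (10, 0)
```

A coverage gate along the lines of `pytest --cov --cov-fail-under=95` can then back the unit-test threshold with an automated check instead of relying on the assistant's self-report.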

    Once you get through all that, muck out the steaming pile of self-congratulatory documentation the AI generates, review and refine the code until you wouldn't be embarrassed to call it your own. After all, it is _your_ code, you are just using a tool to help generate it.

    --
    🌻🌻🌻 [google.com]