Cursor’s New Bugbot Is Designed to Save Vibe Coders From Themselves



But the competitive landscape for AI-assisted coding platforms is crowded. Startups Windsurf, Replit, and Poolside also sell AI code-generation tools to developers. Cline is a popular open-source alternative. GitHub’s Copilot, which was developed in collaboration with OpenAI, is described as a “pair programmer” that auto-completes code and offers debugging assistance.

Most of these code editors are relying on a combination of AI models built by major tech companies, including OpenAI, Google, and Anthropic. For example, Cursor is built on top of Visual Studio Code, an open-source editor from Microsoft, and Cursor users are generating code by tapping into AI models like Google Gemini, DeepSeek, and Anthropic’s Claude Sonnet.

Several developers tell WIRED that they now run Anthropic’s coding assistant, Claude Code, alongside Cursor (or instead of it). Since May, Claude Code has offered various debugging options. It can analyze error messages, do step-by-step problem solving, suggest specific changes, and run unit tests in code.

All of which might raise the question: How buggy is AI-written code compared to code written by fallible humans? Earlier this week, the AI code-generation tool Replit reportedly went rogue and made changes to a user’s code despite the project being in a “code freeze,” or pause. It ended up deleting the user’s entire database. Replit’s founder and CEO said on X that the incident was “unacceptable and should never be possible.” And yet, it was. That’s an extreme case, but even small bugs can wreak havoc for coders.

Anysphere didn’t have a clear answer to the question of whether AI code demands more AI code debugging. Kaplan argues it is “orthogonal to the fact that people are vibe coding a lot.” Even if all of the code is written by a human, it’s still very likely that there will be bugs, he says.

Anysphere product engineer Rohan Varma estimates that on professional software teams, as much as 30 to 40 percent of code is being generated by AI. This is in line with estimates shared by other companies; Google, for example, has said that around 30 percent of the company’s code is now suggested by AI and reviewed by human developers. Most organizations are still making human engineers responsible for checking code before it’s deployed. Notably, one recent randomized controlled trial with 16 experienced coders suggested that it took them 19 percent longer to complete tasks when they used AI tools than when they were not allowed to use them.

Bugbot is meant to supercharge that. “The heads of AI at our larger customers are looking for the next step with Cursor,” Varma says. “The first step was, ‘Let’s increase the velocity of our teams, get everyone moving quicker.’ Now that they’re moving quicker, it’s, ‘How do we make sure we’re not introducing new problems, we’re not breaking things?’” He also emphasized that Bugbot is designed to spot specific kinds of bugs—hard-to-catch logic bugs, security issues, and other edge cases.

One incident that validated Bugbot for the Anysphere team: A couple of months ago, the (human) coders at Anysphere realized that they hadn’t gotten any comments from Bugbot on their code for a few hours. Bugbot had gone down. Anysphere engineers began investigating the issue and found the pull request that was responsible for the outage.

There in the logs, they saw that Bugbot had commented on the pull request, warning a human engineer that if they made this change it would break the Bugbot service. The tool had correctly predicted its own demise. Ultimately, it was a human who broke it.




Ariel Shapiro
