Stop Freaking Out About Devin, It’s Not Going To Take Your Job
There’s a new AI in town called Devin. Cognition AI markets it as the world’s first ‘AI Software Engineer’.
Summary
The article discusses the capabilities and limitations of Devin, an AI software engineer developed by Cognition AI, and argues that while it is an impressive technology, it is not a threat to human programming jobs due to its current limitations and the complexity of software development.
Abstract
The article "Stop Freaking Out About Devin, It’s Not Going To Take Your Job" examines the recent concerns about Devin, an AI marketed as the world's first 'AI Software Engineer.' The author, after analyzing Devin's demonstrations and capabilities, concludes that the AI is not as autonomous as it appears, relying heavily on human prompts and existing large language models like GPT-4. The author points out that Devin's showcased speed is likely exaggerated and that the AI's current state suggests it is not ready to replace human engineers, especially for complex programming tasks. The article also criticizes the cherry-picked nature of Devin's successful Upwork job completion, emphasizing that the task was simple and not representative of the diverse challenges faced in software development. The author suggests that Devin may find a niche in addressing straightforward issues in open-source projects but doubts its ability to handle more intricate codebases without significant costs, which might make hiring human developers more feasible and effective.
Opinions
There’s a new AI in town called Devin. Cognition AI markets it as the world’s first ‘AI Software Engineer’.
And if you look at the comments of that video you’ll see that people are pretty freaked out about Devin. You see comments like:
Bro sitting on the couch engineered his own layoff as well as ours
I was taking a bootcamp to learn coding and get a job, build a career for the rest of my life. This is the last month of my bootcamp so thank you for ruining everything Devin & Cognition.
“Devin is a tireless”. This is some scary stuff, why would anyone want to hire human engineers when this machine can spit code 24/7.
That last one made me think of a certain movie. “Devin can’t be bargained with. He can’t be reasoned with. He doesn’t feel pity or remorse.”
But I actually beg to differ. I do not think Devin is a threat to programming jobs. At least not any more than existing AI technologies.
So first of all let’s look at how Devin works. It’s not exactly clear but I’m pretty sure Devin is just a wrapper around existing large language models, namely GPT-4.
So you can see in the video how Devin has a ‘blackboard’ area, a browser, an editor, and a terminal. I’m guessing it’s just using GPT-4 to tell it what to do with these interfaces.
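To make that guess concrete, here is a minimal sketch of what a "wrapper" loop around a language model could look like. Everything in it is an assumption for illustration: the tool names just mirror the panes in the demo, and `scripted_model` is a hard-coded stand-in where a real system would call an LLM. It is not Devin's actual design.

```python
from typing import Callable, Dict, List, Tuple

# An action is (tool name, argument). Tool names are hypothetical.
Action = Tuple[str, str]


def scripted_model(history: List[str]) -> Action:
    """Stand-in for an LLM call: picks the next action from a fixed script."""
    script = [
        ("editor", "print('hello')"),
        ("terminal", "python main.py"),
        ("done", ""),
    ]
    return script[min(len(history), len(script) - 1)]


def run_agent(
    model: Callable[[List[str]], Action],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 10,
) -> List[str]:
    """Ask the model for an action, run the tool, feed the result back, repeat."""
    history: List[str] = []
    for _ in range(max_steps):
        tool, arg = model(history)
        if tool == "done":
            break
        observation = tools[tool](arg)
        history.append(f"{tool}: {observation}")
    return history


# Fake tools that only describe what they would have done.
tools = {
    "editor": lambda text: f"wrote {len(text)} characters",
    "terminal": lambda cmd: f"ran '{cmd}'",
}

log = run_agent(scripted_model, tools)
for entry in log:
    print(entry)
```

The point is that the loop itself is short. All the hard work sits inside the model call, which is the part a wrapper borrows from someone else.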
Now originally when I saw this video I was like, “It’s obviously sped up” because I’ve integrated GPT-4 into my app and let me tell you, it is not fast. I mean it originally was pretty fast but then OpenAI must have tweaked something and now it is slow as molasses. Which is why I’m using Gemini now. It’s so much faster.
There is no way that Devin can work that fast. Especially with compiling. And then I looked closer and noticed something even weirder.
If you look in the area titled ‘Devin’s Workspace’ you can see what appears to be human prompts:



Pretty sneaky. When I first watched this I was like, “Oh, wow, Devin is doing all this itself.” No, it’s not. It’s basically a glorified wrapper around ChatGPT. You could accomplish something similar by just prompting ChatGPT with what you want and then manually copying and pasting the code into your editor.
And some of these prompts are pretty technical too. ‘Measure the wall clock runtime of the API and use the rest API directly to be fair.’ And this will replace software engineers. Give me a break. Now Devin will automate some of the processes like debugging but how well can it do these things?
It’s a little funny that Devin uses print statements to debug. Sure, I use print statements and they work fine. But Devin is an AI. You’d expect it to know what the memory looks like at each line or something. It just goes to show how limited Devin is.
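For comparison, here is a rough sketch of what "knowing the state at each line" could mean in practice, using Python's standard `sys.settrace` hook to snapshot local variables before every line runs. The function names are made up for this example, and nothing here reflects how Devin actually debugs.

```python
import sys


def running_total(prices):
    total = 0
    for p in prices:
        total += p
    return total


def trace_locals(func, *args):
    """Run func and record (line number, copy of locals) before each line executes."""
    snapshots = []

    def tracer(frame, event, arg):
        # Only trace the function we care about, not everything it calls.
        if frame.f_code is not func.__code__:
            return None
        if event == "line":
            snapshots.append((frame.f_lineno, dict(frame.f_locals)))
        return tracer

    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)  # always switch tracing off again
    return result, snapshots


result, snapshots = trace_locals(running_total, [1, 2, 3])
for lineno, local_vars in snapshots:
    print(lineno, local_vars)
```

A print statement shows you one value you thought to ask for. A trace like this shows every local at every step, which is closer to what you would expect a machine to lean on.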
But then you may ask, “What about the future? What if it gets better?” Well, it could get better. But coding is a very difficult task and things don’t always scale that well. Like in my post ‘No Code App Development Is A Trap’ I said (about no-code app development):
You want to make things easy to do but at the same time you want to do everything. That’s not going to go well. And because of this you get tons of problems
Turns out doing everything in programming is very very hard. So, sure, this tech demo isn’t that hard to pull off. Especially as it looks like there’s a human prompting the AI to do stuff. But when these coding problems scale they get pretty tricky and I doubt Devin will be able to do it.
And there’s another thing I wanted to mention. Cognition AI also got it to do an Upwork job:
And people are like, “Wow, Devin is going to do all the Upwork jobs. No need for programmers now.” You know this example is cherry-picked, right? I mean, if it wasn’t, Devin would be doing all the Upwork jobs and Cognition AI would be filthy rich.
So let’s see how this was cherry-picked. This is the job.

Do you notice anything? It’s incredibly simple and not defined well. It just wants you to do ‘inferences’ on a bunch of images using EC2. Oh, wait, it doesn’t want you to do them, it just wants instructions on how to do them.
This reminds me of when I was in University and one of the assignments was to fix an issue in a GitHub project. So you know what we did? We looked through hundreds of issues to find one that was incredibly easy. I think one of our team members actually found an issue where a commenter already said how to fix it. 🤣 This seems like the exact same thing.
It is impressive that Devin got the code to actually work, though. It appears that at least one dependency had to be changed. Then again, it is possible that the console output said which one had to be changed, in which case it’s less impressive. I would like to see Devin fix Gradle build issues. If it could do that then I’d be impressed.
Well, anyways, it generated this report:

Which, by the way, I’m pretty sure is not what the client wanted. The task specifically asked for detailed instructions on how to do it on an EC2 instance, not to actually do it for them. Oh well.
Also, I looked up the job in question. It’s pretty recent. I expected it to have been resolved months ago, but so far (as of writing) no one has completed it.
The fact that this is so recent suggests that Devin is not ready for prime time yet and is currently very much in development.
And another way I know this is cherry-picked? It’s a small codebase on a very isolated project. In fact all of the examples shown appear to show only very limited projects. You know what I want to see? I want to see Devin take on a full app.
Also something else I found worrying: this video. Now it built some test cases and caught a bug which is impressive. But I’m not very convinced that the way it actually fixed the bug was ideal. Because it fixed it with this line.

Maybe there’s some documentation I’m not seeing but I cannot for the life of me figure out what this code is doing. And I’m not even sure if it’s a good ‘fix’. It could be that this line only hides the underlying problem. Also, for the record, I hate if statements without curly braces. They caused Apple’s ‘goto fail’ SSL bug. I know the anti-goto cult likes to blame it on gotos but don’t listen to them. It was if statements without curly braces.
So it may seem like I’m really hard on Devin. And I am a little. This is another Silicon Valley company with a slick demo that lacks substance when you look into it. Honestly, I think we’re in an AI dot-com bubble. I was thinking of writing a post on that but I can’t find a good angle yet.
A lot of these AI companies are not going to succeed. Like Rabbit and Humane.
Devin may be able to carve out a niche, though. There are open-source projects with thousands and thousands of issues that are probably pretty easy to fix for someone with enough knowledge of the codebase. But there are not enough of these people, so the bugs never get fixed. I’m talking about things like Flutter, which currently has over 12k issues.
I think Devin could help in situations like this. Because open source projects like this are pretty self-contained and Devin should be able to handle it. In fact Devin has already done something like this:
More complex codebases like Flutter, however, will require Devin to look at a lot of code. Good thing we’re going to get LLMs with 1 million input tokens (about 4 million characters) soon. It will be pretty expensive, though.
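As a back-of-the-envelope check, the sketch below applies the same 4-characters-per-token rule of thumb to estimate whether a codebase fits in a 1 million token window and what one pass over it might cost. The price argument is a placeholder I made up for the example, not a real quote from any provider.

```python
CHARS_PER_TOKEN = 4  # rule of thumb: 1 million tokens is about 4 million characters


def estimate_tokens(num_chars: int) -> int:
    """Rough token count for a given number of characters, rounded up."""
    return -(-num_chars // CHARS_PER_TOKEN)


def fits_in_context(num_chars: int, context_tokens: int = 1_000_000) -> bool:
    """True if the text would fit in a context window of the given size."""
    return estimate_tokens(num_chars) <= context_tokens


def prompt_cost(num_chars: int, price_per_million_tokens: float) -> float:
    """Cost of sending the text once, given a (hypothetical) input price."""
    return estimate_tokens(num_chars) / 1_000_000 * price_per_million_tokens


codebase_chars = 4_000_000
print(estimate_tokens(codebase_chars))
print(fits_in_context(codebase_chars))
# 10.0 is a made-up price per million input tokens, for illustration only.
print(prompt_cost(codebase_chars, price_per_million_tokens=10.0))
```

Keep in mind that an agent would not send the codebase just once. Every step that re-reads the code pays that price again, which is where the bill starts to add up.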
I wonder how much it will cost to operate Devin? Especially as Cognition AI will presumably be taking its cut. It might be cheaper to just hire a dev on Upwork. And, honestly, looking at the current state of Devin, you might get better quality code too.
If you liked this post and would like to stay updated with my future articles consider using my RSS app Stratum on iOS and Android. Also check out my language learning app Litany (iOS, Android).