This is the last part in a five-part blog series, in which I attempted to prompt the JetBrains AI Assistant (the JBAI) into generating code by describing the tests needed to cover the functions that I wanted it to create.
A big thanks to Denise Yu for letting me use her Gophermon drawings for this blog series!
This resulted in a chat-style TDD approach to coding, or what I'm calling, for lack of a better term, Chat-Driven Development. Seriously, if you have a better phrase to describe it, let's use that.
Here are the other parts in the series:
Lessons Learned (or, Adjourning)
Series tl;dr: I was able to chat with the JBAI so that it created a unit test for a shell function, then asked it to implement that function. And it freaking worked.
This post's tl;dr: The best way I can describe the experience is to say that it's pair programming without someone asking clarifying questions after you tell them what to code.
Just want to see the code? The code for this part of the series has been merged into the main branch in this GitHub repo.
At this point I'm pretty much ready to wrap up. The JBAI generated most of the code for the data service layer and the REST handler functions, and all I have to do is make a few tweaks here and there.
But Does It Work?
Because it's all for naught if it doesn't, right?
In keeping with what I learned previously, I decided it's best to wire in all of these request handlers manually rather than explain to the JBAI how to do it.
At the top of /endpoints/user_endpoints.go, I added the following exported function:
func RegisterUserEndpoints(e *gin.Engine) {
	e.GET("/users", getUsersHandler)
	e.GET("/users/:user-id", getUserByIdHandler)
	e.POST("/users", createUserHandler)
	e.PUT("/users/:user-id", upsertUserHandler)
	e.DELETE("/users/:user-id", deleteUserHandler)
}
Plus I added a test to cover it.
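In case it's useful, here's a rough sketch of what a test like that could look like, checking gin's route table rather than invoking the handlers. This is my own illustration, not the test that's in the repo, so the actual version may be structured differently:

package endpoints

import (
	"testing"

	"github.com/gin-gonic/gin"
)

// Sketch only: verifies that RegisterUserEndpoints wires up every expected
// method/path pair, using gin's Routes() listing.
func TestRegisterUserEndpoints(t *testing.T) {
	gin.SetMode(gin.TestMode)
	e := gin.New()
	RegisterUserEndpoints(e)

	expected := map[string]bool{
		"GET /users":             true,
		"GET /users/:user-id":    true,
		"POST /users":            true,
		"PUT /users/:user-id":    true,
		"DELETE /users/:user-id": true,
	}

	for _, route := range e.Routes() {
		delete(expected, route.Method+" "+route.Path)
	}

	if len(expected) != 0 {
		t.Errorf("expected routes were not registered: %v", expected)
	}
}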
Then I called RegisterUserEndpoints from the Start function in /app/app.go:
func Start() {
	r := gin.Default()
	registerHealthEndpoint(r)
	endpoints.RegisterUserEndpoints(r) // <- right here
	r.GET("/", func(c *gin.Context) {
		c.JSON(200, gin.H{
			"message": "hello, world",
		})
	})
	r.Run() // listen and serve on 0.0.0.0:8080
}
At this point, it was time to test out my endpoints. Here's the initial result:
[x] GET /users
[x] GET /users/1
[ ] POST /users
- Nothing sets the Id field of the newly created user if the API client doesn't explicitly set it in the request, which it shouldn't do (one possible fix is sketched after this list)
[ ] PUT /users/6 (New User)
- The newly created user isn't returned
[ ] PUT /users/1 (Existing User)
- The updated user isn't returned
[x] DELETE /users/2
(This Insomnia collection is in the repo, by the way, if you'd like to try it out.)
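For illustration, here's a rough sketch of the direction a fix for the POST and PUT issues above could take: assign the Id server-side when the client omits it, and return the created/updated user in the response body. The handler names come from the registration snippet earlier; the bodies, the User shape, and the in-memory map are stand-ins of my own. The real handlers in the repo go through the data service layer, so the details there differ:

package endpoints

import (
	"net/http"
	"strconv"

	"github.com/gin-gonic/gin"
)

// Sketch only: assumed shape of the user type and a throwaway in-memory store.
// The actual project has its own User struct and delegates persistence to the
// data service layer.
type User struct {
	Id   int    `json:"id"`
	Name string `json:"name"`
}

var users = map[int]User{}
var nextId = 1

// Sketch only: assign an Id when the client doesn't provide one, then return
// the created user to the caller.
func createUserHandler(c *gin.Context) {
	var u User
	if err := c.ShouldBindJSON(&u); err != nil {
		c.JSON(http.StatusBadRequest, gin.H{"error": err.Error()})
		return
	}
	if u.Id == 0 {
		u.Id = nextId
		nextId++
	}
	users[u.Id] = u
	c.JSON(http.StatusCreated, u)
}

// Sketch only: return the upserted user, with 201 for a newly created user
// and 200 for an update to an existing one.
func upsertUserHandler(c *gin.Context) {
	id, err := strconv.Atoi(c.Param("user-id"))
	if err != nil {
		c.JSON(http.StatusBadRequest, gin.H{"error": "invalid user id"})
		return
	}
	var u User
	if err := c.ShouldBindJSON(&u); err != nil {
		c.JSON(http.StatusBadRequest, gin.H{"error": err.Error()})
		return
	}
	u.Id = id
	_, existed := users[id]
	users[id] = u
	status := http.StatusCreated
	if existed {
		status = http.StatusOK
	}
	c.JSON(status, u)
}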
Honestly, that's not bad. My first project working with the JBAI, and we were able to get most of the statements covered with somewhat decent API functionality.
It took about 10-15 minutes to get those two endpoints to behave as expected, plus add tests for the failure conditions that the JBAI had missed, which I pointed out in earlier posts. That, plus the test for RegisterUserEndpoints, brought test coverage in the endpoints package to 100%.
And yes, I get that going for 100% can be overkill sometimes.
But I'm a completionist. And this is for science.
The Retro
The best way to think about this experience is to call it a weird version of pair programming. One side of the pair does all of the thinking and dictating, and the other makes assumptions about what code should be generated.
Unlike in real pair programming, the JBAI won't ask clarifying questions. Conversational AI is supposedly still a long way away, and in this case that comes to the detriment of a true pair programming experience. Whatever assumptions the JBAI makes - you get to fix them.
Some Surprises
The biggest surprise to me was how the JBAI treated its output context like a scratch pad, often deleting existing code and replacing it with a comment like // existing code. I'd be curious to find out more about the motivation behind that approach.
Another big one was how the JBAI was unaware of the fully indexed project the way the IDE typically is. While the IDE could already give me details about complex code and exported structs in adjacent packages, the JBAI had a tough time keeping up with what was happening in the same package as its execution context.
Similarly, and likely as a result of the previous two points, I was really shocked that the execution context wasn't aware of its parent module - that is, the import "module/url/app" seen in part 1 of this series.
I was also surprised at the general inconsistency of the coding style used by the JBAI, whether in how it mocked, tested, or even generated functional code. Maybe this just shows how little I know about training an LLM.
Going Forward
Figuring out how to be as explicit as possible without being prescriptive was extremely hard, and I don't think I've quite mastered it. That's something I'll continue to work on as I start using this at work.
I'd also like to revisit the Generate Unit Tests feature that I wrote off in the beginning of this series. It's a pretty big feature, and while the experience I had with it was really poor, I'd imagine that's just a result of not yet knowing how to use it.
In a more real-world use case, I won't wait until all of the code is written to test it out, or run the unit tests and track coverage. We generally don't do that when we code manually, and in this case I did it for the sake of the blog, as opposed to focusing on productivity.
What would be super neat is if the JBAI could generate the Insomnia collection for me as it coded. Sounds like a blog post for another day.
All that said, I will definitely use the JBAI to write code at work and when I contribute to open source projects. I can poke at the flaws all day, but the end result is still a ton of time saved when you compare this experience to writing all of the code by hand. I also saw some approaches to testing that I wasn't familiar with, so I was able to learn something by reading the code written by the JBAI.
It Will Get Smarter
One of the lessons learned that keeps coming up in my mind is that I'm dealing with a backend service that lives - we can assume - in a cloud somewhere. As consumers of that service, we have no control over which AI model we're working with.
It also means that those models will evolve over time without us being alerted to that fact. For all I know, each of the issues I had with code generation will be solved within a month of this being published.
If that's the case, then it leaves us, the consumers, with a conundrum. We can learn to work around the shortcomings and stick to a pattern that works today; but in doing so, in the long term, we may miss out on the benefits of the evolving models.
Effectively, that means we have to keep trying more complex prompts for our AI Assistants, including the one provided by JetBrains. It may be time consuming, and it will get tedious, but we'll continue to learn to work with it as it learns to work with us.