Chat-Driven Development, Part 2: Early Stumbles and Small Wins

Or, "Storming"

In my last post, I laid out my plans for taking the JetBrains AI Assistant (the JBAI for short) for a test drive. The project is simple: create a RESTful microservice with a layered structure and some in-memory data, with the JBAI doing as much of the work as possible, while achieving maximum test coverage.

A big thanks to Denise Yu for letting me use her Gophermon drawings for this blog series!

This is the second part of my pair-programming experience with the JBAI. There will be a total of five parts:

  1. Getting Started (or, Forming)

  2. Early Stumbles and Small Wins (or, Storming)

  3. Hitting a Groove (or, Norming)

  4. Connecting the Pieces (or, Performing)

  5. Lessons Learned (or, Adjourning)

Series tl;dr: I chatted with the JBAI to get it to create a unit test for a shell function, then asked it to implement that function. And it freaking worked.

This post's tl;dr: Lessons learned:

  • The Generate Unit Tests functionality has a long way to go before it's ready for real-world use.

  • The JBAI loses context of previously generated code between prompts.

  • Treat the generated code like you would treat a code review, but where you have to make the suggested changes.

Just want to see the code? The code for this part of the series lives on a branch in this GitHub repo.

Baby Steps

At this point, my app.Start function was still just as empty as it looks at the top of this post. So, after clicking on Code Generation in the AI Actions context menu, I asked JBAI to create a simple web server:

Me > Generate Code: Can you please create a simple web server inside the Start function, using Gin as the RESTful library?

The results were pretty good:

package app

import "github.com/gin-gonic/gin"

func Start() {
    r := gin.Default()

    r.GET("/", func(c *gin.Context) {
        c.JSON(200, gin.H{
            "message": "hello, world",
        })
    })

    r.Run() // listen and serve on 0.0.0.0:8080
}

That comment about running on :8080 was added by the JBAI, as it were.

I ran the code from the main function, and it worked!

$ curl http://localhost:8080/

{"message":"hello, world"}%

Auto-Generated Unit Tests

So that's cool and all, but it's not tested yet. The way it's written, that code would actually be pretty hard to test. It calls Gin directly, without allowing for an easy mock.

In the right-click context menu, under AI Actions, there's an option to Generate Unit Tests.

On the one hand, that's not what I originally had in mind when I started this project. On the other, if achieving good coverage is that easy, I'm not against it. The big question: is it really smart enough to write unit tests with full coverage?

Here's what I would want it to do:

  • refactor so that Gin could be mocked

  • have the Start function call an indirect reference to Gin

  • create a test file where it mocks Gin

  • call the Start function, send a test HTTP request to the web server, and run assertions against the response.

That's not what happened.

I clicked on that, and it created /app/app_test.go, with the following content:

package app

import (
    "net/http"
    "net/http/httptest"
    "testing"

    "github.com/gin-gonic/gin"
    "github.com/stretchr/testify/assert"
)

func TestStart(t *testing.T) {
    t.Run("GET root path", func(t *testing.T) {
        gin.SetMode(gin.TestMode)

        router := gin.Default()

        router.GET("/", func(c *gin.Context) {
            c.JSON(200, gin.H{
                "message": "hello, world",
            })
        })

        request, err := http.NewRequest(http.MethodGet, "/", nil)
        if err != nil {
            t.Fatalf("Couldn't create request: %s\n", err)
        }

        respRec := httptest.NewRecorder()
        router.ServeHTTP(respRec, request)

        assert.Equal(t, http.StatusOK, respRec.Code)
        assert.Equal(t, `{"message":"hello, world"}`, respRec.Body.String())
    })
}

If you read through that, you'll notice that it didn't create a unit test for the Start function; it created a test that duplicates the code inside of the Start function. That means it covers zero lines of Start, and if the code inside Start changes, the test is no longer even running against the same functionality.

Next conclusion - the Generate Unit Tests functionality has a long way to go before it's ready for real-world use.

Chat-Driven Development, Maiden Voyage

I thought it might be better to take a different approach, something similar to what we've seen in the TDD ways of old.

(Yeah. Sometimes it pays to be old enough to remember when Rails was all the rage, and no one took you seriously if you didn't live and breathe red-green-refactor. I do not miss those days, as it were.)

I wrote out a stub of a function, which I would then ask the JBAI to test, fail, and then implement. The function looked like this:

func registerHealthEndpoint(engine *gin.Engine) {}

My prompt was entirely focused on the behavior of the unit test. This was to make sure the JBAI knew to test the function itself, and not the code that would go into the function, as it did above.

In the context menu inside /app/app_test.go, I entered the following into the Generate Code prompt:

Me > Generate Code: Can you please write a test for a registerHealthEndpoint function that takes a *gin.Engine as an argument? It should assert that a request to /health returns a 200 status code, and that the JSON response is equal to {"status":"ok"}

Freaking nailed it!

func TestRegisterHealthEndpoint(t *testing.T) {
    gin.SetMode(gin.TestMode)
    r := gin.Default()

    registerHealthEndpoint(r)

    req, _ := http.NewRequest("GET", "/health", nil)
    resp := httptest.NewRecorder()
    r.ServeHTTP(resp, req)

    if resp.Code != http.StatusOK {
        t.Errorf("Expected status code %d, but got %d", http.StatusOK, resp.Code)
    }

    expectedResponse := `{"status":"ok"}`
    body := resp.Body.String()
    if body != expectedResponse {
        t.Errorf("Expected body '%s', but got '%s'", expectedResponse, body)
    }
}

It was time to implement and get that passing test. I asked the following of the Generate Code prompt in /app/app.go:

Me > Generate Code: Can you please refactor registerHealthEndpoint so that TestRegisterHealthEndpoint will pass?

And the end result was pretty solid:

func registerHealthEndpoint(engine *gin.Engine) {
    engine.GET("/health", func(c *gin.Context) {
        c.JSON(http.StatusOK, gin.H{
            "status": "UP",
        })
    })
}

It's just that when I ran the test again...

$ go test ./...

--- FAIL: TestRegisterHealthEndpoint (0.00s)
    app_test.go:27: Expected body '{"status":"ok"}', but got '{"status":"UP"}'
FAIL

...the F?

Ok, so it's not perfect. I had to modify the function to match the expected response in the unit test. Next conclusion - the JBAI loses context of previously generated code between prompts. I also suspected this had something to do with the code being in different files: /app/app_test.go and /app/app.go.

Also, another conclusion: treat the generated code like you would treat a code review, but where you have to make the suggested changes.

BUT! As an added bonus, the JBAI was smart enough to automatically add a line calling the registerHealthEndpoint function inside the Start function:

func Start() {
    r := gin.Default()

    // THE JBAI PUT THIS HERE, NOT ME
    registerHealthEndpoint(r)

    r.GET("/", func(c *gin.Context) {
        c.JSON(200, gin.H{
            "message": "hello, world",
        })
    })

    r.Run() // listen and serve on 0.0.0.0:8080
}

When I restarted the server and tried it out...

$ curl http://localhost:8080/health

{"status":"ok"}%

It worked!

Keep in mind, after the JBAI wrote the test and filled in the shell function with actual functionality, all I did was tweak the JSON output so that the test passed. Since the JBAI was smart enough to call registerHealthEndpoint on its own, I didn't need to do anything else to get a working health endpoint.

What's Next?

Thus ends the Storming phase of this pair-programming experiment.

In my next post, I'll go through the process of having the JBAI build out and test the entire data service layer, with 100% coverage. That was the Norming phase. It wasn't terribly complicated, but it helped me get used to how the JBAI works so I could use it effectively.