GitHub Copilot Agent Isn’t a Silver Bullet — Here’s How to Actually Make It Work

For developers who’ve gone beyond the hype — real-world strategies to avoid common Copilot Agent pitfalls and reclaim your coding flow.

GitHub Copilot Agent has been available for a while now, and it competes with Anthropic's and Google's coding agents. These tools are a genuinely exciting way to code, regardless of whether developers - note that I didn't say engineers - fear losing their jobs.

People are still learning how to use coding agents efficiently. The first mistake is trusting them to do everything from scratch, including the thinking. To some extent it's fine to let the agent start the work, that's true. The reality is that, productivity-wise, you hit diminishing returns the longer you let the agent decide on its own, for three reasons.

Reasons for lower productivity with GitHub Copilot

The first reason is that the more the code changes, the more time you have to spend understanding it before you can make a change yourself.

Second, if you rely only on the agent, you will be surprised how often it doesn't fully understand what you want: either because you didn't phrase the prompt in the way the LLM needed to produce the desired result, or because it simply doesn't know how to achieve such a specific request in your particular context.

Third, and last, agents generate a huge number of tokens that are kept in your context. The context window fills up fast, and the agent loses track of the original ask.

It is critical to keep these weaknesses in mind because coding agents are not there yet. Is there a way to take full advantage of GitHub Copilot despite them? Yes. We just need to be a bit smart in how we use it and apply a few tricks.

Let's tackle these points one by one.

How to fix your productivity with GitHub Copilot

None of this is rocket science. I will show you the easiest way to remove these productivity blockers. Trust me, it's simpler than you think, but not so easy to apply because it requires a lot of discipline.

Human in the loop

As a reminder, what slows you down here is the time required to understand what was changed and why. Naturally, you may think that the more you let the agent work, the better, because you will only have to review once.

Most of the time, you start reviewing once something goes wrong. And that's exactly where the mistake is: you assume the last iteration is where things went wrong. There's a high chance it's actually an accumulation of changes that drifted from an uncertain start. So you either have to understand everything going on at that moment and decide how to handle your issue, possibly changing a lot of code in multiple places, or you have to read every change in chronological order until you identify where things went wrong.

We can agree that both approaches are very time- and energy-consuming. The best approach is to review incrementally, in other words, at each iteration. You review multiple times, but only a few lines of code and files each time. You stay in control of the full process, you fully understand what is happening, and you can manually adjust the very specific places where your expertise shines.

The temptation is high to run two iterations back-to-back thinking it's OK; don't get fooled. Moreover, instead of trying to correct a faulty iteration with another one, just undo it and go back to the last checkpoint. In GitHub Copilot, scroll up in the chat to your prompt; hovering over it reveals a cross to delete that iteration along with its changes. Then adjust your prompt for a better result.
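If you prefer a safety net outside the chat UI, you can mirror the same checkpoint discipline with plain git: a minimal sketch, assuming you commit after every agent iteration you have reviewed and approved (the repo path and file names are illustrative).

```shell
# Throwaway demo repo to illustrate the checkpoint flow.
rm -rf /tmp/copilot-checkpoints
mkdir -p /tmp/copilot-checkpoints && cd /tmp/copilot-checkpoints
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

# After reviewing and approving an agent iteration, snapshot it.
echo "iteration 1 output" > feature.txt
git add feature.txt
git commit -q -m "agent iteration 1: reviewed and approved"

# The next iteration goes wrong...
echo "iteration 2 output (faulty)" > feature.txt

# ...so instead of asking the agent to fix its own mistake,
# discard the change and return to the last good checkpoint.
git checkout -- feature.txt
cat feature.txt
```

The point is the same as the chat checkpoint: each reviewed iteration becomes a commit you can fall back to, so a bad iteration never has to be "repaired" by another one.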

LLM as a Language

An interesting fact about coding with agents is that every company claims its LLM is the best and tops this or that benchmark. Don't get trapped by the marketing. It's true that models keep getting better. Regardless, you should always try them yourself. Let me share my experience.

Recently Google claimed that Gemini 2.5 Pro outperforms Claude Sonnet 3.7. I gave it a try on one of my projects, and it was an utter disaster: the code wasn't at all what I was looking for. No matter how I rephrased, it kept failing to understand my request. Surprising, since Gemini is supposed to "outperform" Claude.

Think about it: the issue is probably not the model itself, but how I communicate with it. After using Claude for a very long time, I naturally adjusted my wording to what works best for it. The two models were trained on very different datasets, and so their embeddings differ.

All in all: explore different LLMs, understand how they behave, find the one that works for you, optimise your prompts, and stay critical of the marketing.

Context overflow

Thanks to a discussion with a friend and fellow engineer (@Stephane Souron), I came to understand this issue better.

I noticed that over time, the quality of the agent's answers deteriorated badly at some point. The reason is really simple: agents chain prompts in and out, moving a massive amount of tokens, so hitting the cap of the context window comes really quickly. Even with a huge context window of two million tokens, you will be surprised how fast you fill it.
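To get a feel for how quickly that happens, here is a back-of-envelope sketch in Python. The per-iteration token figures are illustrative assumptions, not measurements from any particular model.

```python
# Rough estimate of how many agent iterations fit in a context window.
# All figures below are illustrative assumptions, not benchmarks.
CONTEXT_WINDOW = 2_000_000  # tokens (a "huge" window)

# One agent iteration typically re-reads files, reasons, and writes diffs.
tokens_per_iteration = (
    20_000    # files and snippets pulled into context
    + 5_000   # planning / reasoning tokens
    + 3_000   # generated diff and explanation
)

iterations_before_full = CONTEXT_WINDOW // tokens_per_iteration
print(iterations_before_full)  # 71
```

Seventy-odd iterations sounds comfortable, but it's optimistic: earlier turns stay in the context, so the effective capacity shrinks with every exchange, and quality usually degrades well before the hard cap.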

There is a way around this, simpler than you think. The idea is to add a specific instruction to your prompt asking the agent to maintain a file containing context information optimised for its own use. Think of it as agent-readable long-term memory. Thanks to this file, you can safely reset your chat instance, which resets the context window as well. I won't give you a set-in-stone rule for when to reset, because it depends on too many factors, like the LLM and how many thinking steps the agent takes per iteration. Do it on a regular basis; as a rule of thumb, I would reset at each feature.
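As an illustration, here is the kind of instruction you could append to your prompt. Both the wording and the filename `AGENT_MEMORY.md` are hypothetical examples I use here, not an official Copilot feature:

```
Maintain a file named AGENT_MEMORY.md at the repository root.
After each iteration, update it with: the original goal, key decisions
made so far, files touched, and open TODOs. Keep it short and optimised
for your own future reading, not for humans.
```

When answers start degrading, open a fresh chat and begin with something like "Read AGENT_MEMORY.md before doing anything": the new instance starts with an empty context window but picks up roughly where the old one left off.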

Want more?

I know this seems very theoretical without concrete examples. Depending on your interest, I will extend this into in-depth articles. There is so much more to explore. Share your experience; I will be glad to discuss it with every one of you.

Writer at Jack of all trades
