Discover how to enhance AI code assistants with practical tips from Eran Yahav. Learn about Retrieval-Augmented Generation (RAG) and optimizing AI for better code quality.

TestMu AI
January 30, 2026
AI code assistants are now a standard part of developers’ toolchains, but their effectiveness often comes into question. In this session, Eran Yahav, CTO of Tabnine, addresses these issues by offering practical tips on how to enhance AI tools. He explains how Retrieval-Augmented Generation (RAG) improves AI suggestions by incorporating external information, leading to more relevant recommendations.
He also highlights the importance of integrating AI with the existing codebase, such as through IDEs and code repositories, to boost performance. He discusses how to customize and fine-tune AI assistants for their specific needs, ensuring they are both effective and secure. The session provides valuable insights into optimizing AI tools for better productivity and code quality.
If you couldn’t catch all the sessions live, don’t worry! You can access the recordings at your convenience by visiting the TestMu AI YouTube Channel.
Eran opened the session by highlighting the significant shift in software development, where AI is beginning to play a pivotal role. He illustrated this by demonstrating how, in the near future, developers might simply point to a JIRA ticket in their IDE, and AI would autonomously generate a fully functional application, such as a trading app, based on those requirements. While we’re not entirely there yet, we’re moving toward a reality where AI assists with an increasing number of tasks in the software development life cycle (SDLC).
He emphasized that AI is already delivering substantial productivity gains, ranging from 20% to 50%, by supporting various stages of the SDLC. Currently, human engineers drive the process, with AI acting as an assistant across different steps such as coding, testing, code review, and deployment. However, the future holds a more streamlined process involving three key phases: ideation, generation, and validation.

In this evolving landscape, the human engineer collaborates with AI during the ideation phase, defining designs and requirements. The generation phase sees AI taking over code creation entirely, leaving humans to focus on refining tasks. The final phase, validation, becomes the crucial bottleneck in ensuring software quality. As AI-generated code scales rapidly, so must the validation tools and processes, making the role of quality assurance more vital than ever.
Continuing from his initial discussion, Eran delved deeper into the concept of delegating tasks to AI within the SDLC, framing it as a “delegation problem.” He introduced what he called the “Fundamental Theorem of Generativity.” This theorem suggests that it only makes sense to delegate a task to an AI agent if the combined cost of instructing the agent and verifying its output is significantly lower than the cost of doing the task manually.
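The “delegation problem” can be written as a simple inequality. A minimal sketch in Python, with purely illustrative cost numbers (the talk did not quantify costs):

```python
def should_delegate(cost_instruct: float, cost_verify: float,
                    cost_manual: float) -> bool:
    """Informal 'Fundamental Theorem of Generativity': delegate a task to
    an AI agent only if instructing the agent plus verifying its output
    is cheaper than doing the task yourself."""
    return cost_instruct + cost_verify < cost_manual

# Boilerplate CRUD code: cheap to specify and to verify.
print(should_delegate(cost_instruct=1, cost_verify=2, cost_manual=10))  # True
# A subtle concurrency fix: verification may cost almost as much as doing it.
print(should_delegate(cost_instruct=2, cost_verify=9, cost_manual=10))  # False
```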
The graphic shared by @yahave is enough to prove that AI is instrumental in realising tasks at every stage of the SDLC lifecycle!
— LambdaTest (@testmuai) August 21, 2024
Eran explained that for AI to be truly useful in the SDLC, two critical aspects need to be addressed:
However, AI test agents like KaneAI by TestMu AI address these challenges head-on by simplifying the creation, debugging, and management of tests through natural language commands.
KaneAI is a GenAI smart test agent, guiding teams through the entire testing lifecycle. From creating and debugging tests to managing complex workflows, KaneAI uses natural language to simplify the process. It enables faster, more intuitive test automation, allowing teams to focus on delivering high-quality software with reduced manual effort.
With the rise of AI in testing, it’s crucial to stay competitive by upskilling or polishing your skill set. The KaneAI Certification proves your hands-on AI testing skills and positions you as a future-ready, high-value QA professional.
Eran expanded on the concept of trust in AI-assisted software development by emphasizing the importance of “tight loop integration” between the human engineer and the AI assistant.

He described this collaboration as a continuous feedback loop where the human provides lightweight specifications, and the AI generates code based on those inputs. The human engineer then reviews the generated code, applying their judgment to decide whether to accept, modify, or reject it. He highlighted two key points on Human-AI collaboration:
Eran provided a practical example to illustrate this process. Suppose a developer writes a function signature, such as get_tweet. The AI might offer two possible completions: one using a well-known library like Tweepy, and another performing an HTTP request directly using the Requests library. The human engineer then evaluates these options, considering factors like the reliability and quality of the libraries involved, and makes a decision based on their judgment.
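The two candidate completions might look roughly like this sketch. The Tweepy call is left as a comment to keep the example dependency-free, and the direct-HTTP variant uses the standard library in place of Requests; the endpoint URL and bearer-token handling are illustrative assumptions, not verified API details:

```python
import json
import urllib.request

# Completion A (library-based), roughly what the AI might suggest with Tweepy:
#   client = tweepy.Client(bearer_token=TOKEN)
#   tweet = client.get_tweet(tweet_id)

# Completion B: a direct HTTP request (stdlib used here instead of Requests).
def get_tweet(tweet_id: str, token: str) -> dict:
    req = urllib.request.Request(
        f"https://api.twitter.com/2/tweets/{tweet_id}",  # illustrative endpoint
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The reviewer's judgment call is exactly the one Eran describes: prefer a maintained, well-tested library, or avoid the extra dependency and own the HTTP details yourself.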
Eran continued by discussing scenarios where a developer might ask the AI to improve or refactor existing code. For instance, a simple function that converts temperatures from Fahrenheit to Celsius could be refactored by the AI to include a common conversion function, making the code more concise and maintainable.
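A made-up before/after pair in the same spirit (not the snippet from the slides):

```python
# Before: the same conversion formula repeated in several places.
def print_boiling_point():
    print((212 - 32) * 5 / 9)

def print_freezing_point():
    print((32 - 32) * 5 / 9)

# After: the AI extracts a common, reusable conversion function.
def fahrenheit_to_celsius(f: float) -> float:
    return (f - 32) * 5 / 9

def print_boiling_point_refactored():
    print(fahrenheit_to_celsius(212))
```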

For instance, Eran presented the example of an AI-generated refactor in which repetitive code was replaced with a common function. However, he acknowledged that in many cases, the decision to accept AI-generated refactoring may come down to personal preference or specific project requirements.
Eran emphasized that while AI can significantly assist in enhancing code, the final judgment always rests with the human developer, ensuring that the code meets the project’s standards and requirements. This balance between AI assistance and human oversight is essential for maintaining trust in the development process.
In this segment, Eran dove into the critical aspects of establishing trust in AI engineers, highlighting three key strategies that can help organizations confidently integrate AI into their software development processes.
If done correct, AI engineer will always bag “Employee of the month award” 👇🏼
— LambdaTest (@testmuai) August 21, 2024
Eran emphasized that for an AI engineer to be effective, it must be thoroughly personalized to understand the company’s specific code base and corporate standards. Without proper onboarding, an AI engineer might generate code that doesn’t align with the organization’s best practices, leading to issues such as:
The second crucial strategy is ensuring that those responsible for quality—whether they are architects, development managers, or test engineers—have full control over the AI’s output. This control allows them to:
The final strategy Eran discussed involves gradually increasing the autonomy of AI engineers while maintaining clear delegation boundaries. The goal is to make it easy for developers to:
By successfully implementing these strategies, Eran suggested that the AI engineer could effectively become the “employee of the month” consistently: knowing the codebase, adhering to company standards, and producing high-quality output that requires minimal oversight.
Eran also addressed a question regarding which tasks in the SDLC can be effectively handled by AI to save time and improve accuracy. He identified two primary areas:
Eran acknowledged that there are risks associated with relying on AI for code generation, especially when human oversight is not sufficiently rigorous. These risks include:
To mitigate these risks, Eran highlighted the importance of maintaining tight human-AI integration and ensuring that the AI is personalized and well-controlled within the organization.
In discussing the future of AI in software development, Eran emphasized a crucial separation within the system: the roles of generation and validation agents.

Here’s a closer look at these components:
As AI-generated code scales, so must the validation processes. This balance between generation and validation is critical for maintaining high standards and efficiency in software development.
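The generation/validation split can be pictured as a loop in which a generation agent proposes code and a validation agent gates it. The agents below are trivial stand-ins, purely to show the control flow:

```python
from typing import Callable, Optional

def generate_and_validate(generate: Callable[[str], str],
                          validate: Callable[[str], bool],
                          spec: str, max_attempts: int = 3) -> Optional[str]:
    """Ask the generation agent for code until the validation agent
    accepts it; give up (escalate to a human) after max_attempts."""
    for _ in range(max_attempts):
        candidate = generate(spec)
        if validate(candidate):
            return candidate
    return None

# Stub agents: 'generation' returns a fixed snippet, 'validation' checks it.
result = generate_and_validate(
    generate=lambda spec: "def add(a, b):\n    return a + b",
    validate=lambda code: "return a + b" in code,
    spec="add two numbers",
)
```

The point of the structure is that the validator scales independently of the generator, which is exactly where the quality-assurance bottleneck described above sits.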
Eran shared some valuable insights into how test-generation agents are transforming the software development landscape.

Here’s a summary of what he discussed:
Here is how test generation agents work:
Here are some of the benefits of test-generation agents:
The challenges associated with test generation agents include:
Eran’s insights underscore the strengths of test-generation agents in expanding test coverage and managing routine tasks, while also highlighting the need for ongoing human oversight, especially for complex testing scenarios.
In his talk, Eran laid out what he envisions as the ideal AI engineer.

Here’s a breakdown of the key qualities he hopes for when hiring an AI agent:
Eran’s vision highlights the importance of a personalized, efficient, and reliable AI engineer who can communicate effectively and seek clarification when necessary.
Moving on with the session, Eran highlighted the importance of personalizing AI engineers to ensure they effectively integrate and perform within an organization.

Here’s a closer look at the four key aspects of personalization:
In his talk, Eran emphasized the fundamental role context plays in the effectiveness of AI agents.

Here’s a detailed look at how context impacts AI performance and the challenges associated with managing it:
Context is everything when working with AI agents. The quality of results from an AI agent improves dramatically when it is provided with appropriate and comprehensive context. This means that the AI should be aware of a wide range of information relevant to the task at hand.
The level of context an AI agent can access varies. At one end, the agent might have no context beyond the immediate task. At the other end, it could be fully contextualized with access to the entire organization’s information, including non-code sources like JIRA and Confluence. This range impacts how well the AI can understand and address requests.

When given a simple request, such as displaying a table of student scores, the AI agent considers a multitude of factors:
This broad awareness helps the AI deliver more accurate and relevant results, avoiding unnecessary duplication and technical debt.
Language models, which power AI agents, have a limited context window, measured in tokens. Typical context windows range from 8,000 to 200,000 tokens, with recent advancements aiming for up to 1 million tokens. However, as more information is added to the context window, latency increases and the model’s accuracy can decrease.
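A crude way to see the budget problem: approximate token counts and greedily keep the highest-priority context until the window is full. The 4-characters-per-token heuristic and the tiny window are illustrative assumptions, not any model's real tokenizer:

```python
def approx_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    return max(1, len(text) // 4)

def fit_context(snippets: list, window_tokens: int) -> list:
    """Greedily keep snippets (earlier in the list = higher priority)
    until the context window's token budget is exhausted."""
    kept, used = [], 0
    for snippet in snippets:
        cost = approx_tokens(snippet)
        if used + cost <= window_tokens:
            kept.append(snippet)
            used += cost
    return kept

# Current file first, then open tabs, then the rest of the repo.
context = fit_context(["current file...", "open tabs...", "whole repo..."],
                      window_tokens=8)
```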

The challenge lies in balancing and prioritizing the vast amounts of information available within the context window. For instance, when generating results, the AI agent must efficiently integrate and convey this information to produce accurate and useful outputs.
Eran’s insights highlight the intricate balance required in managing context to maximize the effectiveness of AI agents. The better the AI is at contextualizing information, the more precise and useful its outputs will be.
To handle context effectively, techniques such as Retrieval-Augmented Generation (RAG) are used. This involves:

The retrieval algorithm plays a crucial role in enhancing context awareness.
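A toy version of the retrieve-then-generate flow: score candidate snippets by word overlap with the query and prepend the best matches to the prompt. Production RAG systems use embedding-based similarity; the overlap score here is a deliberately simple stand-in:

```python
def retrieve(query: str, corpus: list, k: int = 2) -> list:
    """Rank snippets by word overlap with the query (a stand-in for
    embedding similarity) and return the top-k."""
    q = set(query.lower().split())
    scored = sorted(corpus,
                    key=lambda s: len(q & set(s.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query: str, corpus: list) -> str:
    # Augment the model's prompt with the retrieved context.
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nTask: {query}"

corpus = ["def add(a, b): return a + b",
          "JIRA-123: the add endpoint must validate input types",
          "deployment notes for the billing service"]
print(build_prompt("implement the add endpoint", corpus))
```

Note how a non-code source (the hypothetical JIRA ticket) outranks unrelated code, which is the point Eran makes about contextualizing agents beyond the codebase itself.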
Coaching provides a control layer for AI agents, allowing users to set specific guidelines and rules.

For instance, during code reviews, tools like Tabnine can automatically flag issues based on predefined rules, such as ensuring rate limiting for server requests. Users can also add custom rules in natural language.
This approach ensures that AI agents operate effectively within the given context, improving their reliability and alignment with organizational standards.
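A coaching layer can be approximated as natural-language rules paired with checks that flag violations. The rate-limiting rule mirrors the example above, but the rule format and checks are invented for illustration and are not Tabnine's actual mechanism:

```python
import re

# Each rule: (natural-language guideline, predicate that flags a violation).
RULES = [
    ("Server request handlers must apply rate limiting",
     lambda code: "def handle_request" in code and "rate_limit" not in code),
    ("No hard-coded credentials",
     lambda code: re.search(r"password\s*=\s*['\"]", code) is not None),
]

def review(code: str) -> list:
    """Return the guidelines this code violates."""
    return [text for text, violated in RULES if violated(code)]

snippet = "def handle_request(req):\n    return db.query(req.sql)"
print(review(snippet))  # flags the missing rate limiting
```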
Eran discussed how to effectively use AI agents to enforce coding standards and best practices:

Eran’s session highlighted key aspects of leveraging AI agents to enhance code quality. He stressed the importance of context, noting that AI performance improves with rich contextual information from code bases and organizational tools like JIRA. The onboarding process involves careful selection of relevant systems to ensure effective operation and avoid pulling in legacy code.
The discussion also touched on balancing reliability, accuracy, and test coverage, with current efforts focusing on improving coverage. Overall, Eran’s insights offer a path to using AI agents for better coding standards and higher-quality software development.
Eran: Code generation with AI agents carries certain risks. Firstly, the quality of the generated code depends significantly on human judgment. If developers accept code without careful review, it can compromise the overall code quality. Secondly, AI agents might produce duplicate or similar pieces of code, leading to technical debt.
This redundancy can create refactoring challenges and clutter the codebase. While efforts are underway to mitigate these issues, it’s essential for users to be aware of these risks and apply diligent oversight when integrating AI into their development workflows.
Eran: AI agents excel at generating test data and can significantly boost test coverage quickly. They are particularly effective in taking coverage from a low to a high level by creating diverse and interesting test cases that might not be immediately obvious to humans. For instance, an AI agent might add tests for scenarios like a future date of birth, which a human might overlook.
However, these agents may not be as effective in addressing the final, complex edge cases that require nuanced human reasoning. Their strength lies in rapidly enhancing coverage rather than fine-tuning the last details of test scenarios.
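The future-date-of-birth case can be made concrete. The validator and the "generated" test cases below are illustrative:

```python
from datetime import date

def is_valid_date_of_birth(dob: date, today: date) -> bool:
    """A date of birth must not be in the future and must be plausible."""
    return dob <= today and today.year - dob.year <= 150

# Cases an AI test-generation agent might propose, including the
# future-date edge case a human reviewer could easily overlook:
today = date(2024, 8, 21)
assert is_valid_date_of_birth(date(1990, 1, 1), today)      # ordinary case
assert not is_valid_date_of_birth(date(2030, 1, 1), today)  # future DOB
assert not is_valid_date_of_birth(date(1800, 1, 1), today)  # implausible age
```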
Eran: One challenge with generating tests from existing source code is that the AI agent often has to infer specifications rather than work from clear, high-level requirements. This can lead to gaps between what the code is expected to do and what the tests actually cover.
However, by integrating with tools like JIRA to access detailed specifications and descriptions, future AI agents will be better able to align tests with defined test scenarios, bridging this gap and ensuring that tests accurately reflect the intended specifications.
Eran: Currently, the focus is on generating tests to improve coverage rather than optimizing test latency. The team is aware of the importance of considering latency but has not yet integrated this factor into the process. This is a valuable point for future consideration, and they are open to exploring how to address it.
Please don’t hesitate to ask questions or seek clarification within the TestMu AI Community.