Large codebase guide #8932

digitarald · 2025-09-26T05:32:48Z

Outline

pamelafox · 2025-09-26T15:52:00Z

docs/copilot/guides/large-complex-codebases-guide.md

+---
+# Set up large complex codebases for AI pair programming
+
+This guide shows you how to scale context engineering for large, complex codebases with thousands of files, multiple teams, and intricate architectural dependencies. Building on the [context engineering guide](/docs/copilot/guides/context-engineering-guide.md), this covers advanced strategies for managing AI context in brownfield codebases where traditional "vibe coding" approaches break down.


Some nitpicks on the terminology and whether people will know them:

"brownfield" is a term that I only learnt at Microsoft, and always gives me weird imagery. Would "existing" codebases be just as descriptive? Or do most people know that term?

It always feels a bit funny to say "traditional" vibe coding, as it's too young to have traditions, and there's so much disagreement about what it means.

"Context engineering" is a hot term right now for AI app developers, but I don't know if it will make immediate sense to developers who haven't built AI agents/RAG systems yet. I really like the title of the article as I think that makes it super clear what you're describing, just have some concerns about whether people will be thrown off by not connecting "context engineering" to software development.

AFAIK, brownfield is a common term in DevOps spaces, but we can make sure it is explained where it's used.

Agreed, can be reworded to highlight the scale in control and oversight

We have another guide that explains it, but this one could use an aside for sure.

See also below, not sure if brownfield codebases should be part of this guide. They have their own particular problem space, which is broader than just the size of it.
I'd create a separate guide for dealing with existing / legacy codebases. That guide would then likely reference the context engineering and this large codebases guide.

pamelafox · 2025-09-26T15:53:36Z

docs/copilot/guides/large-complex-codebases-guide.md

+
+### Context inheritance patterns
+
+VS Code combines all applicable instruction files automatically:


Is it concatenated in this order? That was a question I got from developers, and I ended up going into Chat Debug View to try to answer their question. If we can be clear about the order, developers will appreciate it.

It's not; there is no priority order or callout in the instructions, and each model has slightly different best practices (like mentioning important things at the top of the system prompt and again at the end).

We should make it clearer as we instruct the model on what takes precedence (like AGENTS.md in the root vs. in sub-folders vs. chat mode, etc.), and share that with users.

pamelafox · 2025-09-26T15:54:40Z

docs/copilot/guides/large-complex-codebases-guide.md

+VS Code combines all applicable instruction files automatically:
+
+- **Repository-wide** (`.github/copilot-instructions.md`): Core architecture, shared conventions
+- **Team-level** (`team-*.instructions.md`): Tech stack specifics, team workflows


Is VS Code actually looking at whether the filename has "team" versus "module" in it?

Good callout; they are made-up namespaces, as it's a very flexible system.
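
To illustrate the point about naming, here is a minimal sketch of a scoped instructions file. It assumes that the `applyTo` glob in the front matter, not the file name, determines where the file applies, so a prefix such as `team-` or `module-` is only a naming convention. The file name (`.github/instructions/payments.instructions.md`), the glob, and the rules are all hypothetical.

```markdown
---
description: Payments module conventions (hypothetical example)
applyTo: "services/payments/**"
---
<!-- The file name, glob, and rules below are illustrative assumptions, not part of the guide. -->
# Payments module conventions

- Keep handlers thin; put business logic in the domain layer.
- Add an integration test for every new endpoint.
```

Renaming this file to `team-payments.instructions.md` would, under that assumption, change nothing about when it is applied.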

pamelafox · 2025-09-26T15:55:09Z

docs/copilot/guides/large-complex-codebases-guide.md

+Guidelines for implementing logging, monitoring, and observability features...
+```
+
+The AI agent can automatically load this instruction file when it detects the conversation involves logging, monitoring, or observability concepts.### Context scoping strategies


Newline missing?

pamelafox · 2025-09-26T15:55:59Z

docs/copilot/guides/large-complex-codebases-guide.md

+
+- `.github/copilot-instructions.md` for repository-wide context
+- Focused `.instructions.md` files with specific `applyTo` patterns for subsystems
+- Reference documentation using Markdown links


People always ask if Markdown links are auto-concatenated. I think it'd help to clarify somewhere that links are NOT auto-concatenated, but Copilot will be able to fetch them when needed.

Good point. In some files, like custom-instructions.md, they are auto-included, but for agent mode we usually aim for tool-driven file reads.
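
For reference, a minimal sketch of what "reference documentation using Markdown links" could look like in a repository-wide instructions file. The excerpt is hypothetical; the linked paths reuse the `docs/ai/` layout proposed elsewhere in this PR. Whether the linked files are inlined or read on demand through tools depends on the context, as discussed above.

```markdown
<!-- Hypothetical excerpt of .github/copilot-instructions.md -->
# Repository context

- Architecture decisions: [architectural-decisions.md](../docs/ai/architectural-decisions.md)
- Service communication: [integration-patterns.md](../docs/ai/integration-patterns.md)
```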

pamelafox · 2025-09-26T15:59:48Z

docs/copilot/guides/large-complex-codebases-guide.md

+
+### Role-based chat modes
+
+Different roles need different AI personas and tool access. Create [custom chat modes](/docs/copilot/customization/custom-chat-modes.md) with specific instructions and tool sets:


When you say "roles" here, I'm not sure if you're referring to job roles of engineers, or if you mean that you're giving GitHub Copilot a job role. Might help to clarify? "Each member of your team can use a custom chat mode that personalizes the persona of GitHub Copilot and grants specific toolsets. For example, a frontend developer can use a "frontend" mode with access to tools like Playwright and Figma."
Don't know if that's clearer.
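
As a rough illustration of the frontend example, a custom chat mode file could look like the sketch below. It assumes the `.chatmode.md` format with `description` and `tools` front matter described in the linked custom chat modes documentation. The file name (`frontend.chatmode.md`), the tool identifiers, and the instructions are hypothetical placeholders; the tool names actually available depend on the installed extensions and MCP servers.

```markdown
---
description: Frontend mode (hypothetical example)
tools: ['search', 'fetch', 'playwright']
---
<!-- Tool names above are placeholders; replace them with tools available in your environment. -->
# Frontend mode instructions

You are assisting a frontend developer in this repository.

- Prefer existing UI components over creating new ones.
- Verify UI changes in the browser before reporting them as done.
```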

pamelafox · 2025-09-26T16:00:30Z

docs/copilot/guides/large-complex-codebases-guide.md

+
+<!-- TODO: Add complete example chat mode files with tool configurations for each role -->
+
+### Team-specific workflows


This heading sounds kind of like role-based workflows, given there's usually a "frontend" team.

pamelafox · 2025-09-26T16:01:16Z

docs/copilot/guides/large-complex-codebases-guide.md

+
+### Team-specific workflows
+
+Create chat modes that encode team workflows:


Maybe something like "You can also create chat modes that are specific to common project phases and team workflows. For example"

digitarald · 2025-09-26T16:04:12Z

docs/copilot/guides/large-complex-codebases-guide.md

+
+These files can be referenced in your chat mode instructions to provide persistent context.
+
+<!-- TODO: Add workflow for when and how to update memory files during development -->


Add a prompt for how to create them, and instructions that keep them updated.

digitarald · 2025-09-26T16:04:27Z

docs/copilot/guides/large-complex-codebases-guide.md

+
+In large codebases, context windows fill up quickly. Implement systematic context compaction:
+
+#### Memory file patterns


We might want to remove that; it's less related to large codebases.

pamelafox · 2025-09-26T16:04:54Z

docs/copilot/guides/large-complex-codebases-guide.md

+- **Performance mode**: Application + infrastructure optimization
+- **Migration mode**: Coordinated changes across multiple services
+
+## Context compaction techniques


Is there a less technical term for describing these techniques?

"Context optimization techniques"?

pamelafox · 2025-09-26T16:05:54Z

docs/copilot/guides/large-complex-codebases-guide.md

+
+```
+docs/ai/
+  current-tasks.md          # Active work and blockers


This is interesting, as most people use task trackers for this. I only create this file locally, for TODOs on the current branch, and then delete it before opening a PR, once the TODOs are complete. May want to clarify? People may be surprised by the suggestion to store tasks in the repo.

pamelafox · 2025-09-26T16:06:40Z

docs/copilot/guides/large-complex-codebases-guide.md

+  current-tasks.md          # Active work and blockers
+  architectural-decisions.md # Key design choices and rationale
+  integration-patterns.md   # How services communicate
+  common-pitfalls.md        # Frequent mistakes and solutions


Would you reference these from copilot-instructions.md?

Ah or would you reference them from the chat modes that you mention after this section?

ntrogh · 2025-09-29T13:01:14Z

docs/copilot/guides/large-complex-codebases-guide.md

+
+### Documentation and knowledge preservation
+
+Use AI to document tribal knowledge before it's lost:


Suggested change

-Use AI to document tribal knowledge before it's lost:
+Use AI to document institutional knowledge before it's lost:

ntrogh · 2025-09-29T13:05:21Z

docs/copilot/guides/large-complex-codebases-guide.md

+3. Specific module patterns
+4. Implementation details
+
+## Brownfield codebase strategies


Should this be a separate guide? Brownfield does not necessarily equate to a large codebase and the problem-space for existing codebases is broader than the size of the codebase.

ntrogh · 2025-09-29T13:09:08Z

docs/copilot/guides/large-complex-codebases-guide.md

+- **Domain expert modes**: Capture and share specialized knowledge
+- **Cross-team modes**: Facilitate knowledge sharing between teams
+
+## Context window optimization


Should this be grouped with the section on context compacting? They're all about optimizing the context size.

ntrogh · 2025-09-29T13:32:08Z

docs/copilot/guides/large-complex-codebases-guide.md

+
+<!-- TODO: Add step-by-step guide for organizing existing documentation into instruction files -->
+
+### Context inheritance patterns


This section feels overlapping with the previous one. Should they be combined? The main topics seem to be:

Multiple instructions files per topic/team/module/concept

How they are applied

ntrogh · 2025-09-29T13:37:39Z

docs/copilot/guides/large-complex-codebases-guide.md

+---
+```
+
+## Advanced chat mode patterns


Just "Chat mode patterns"?
Suggest adding a quick intro sentence about what chat modes are and why they're relevant for large codebases.

ntrogh · 2025-09-29T13:42:11Z

docs/copilot/guides/large-complex-codebases-guide.md

+- **Migration mode**: Coordinated changes across multiple services
+
+## Context compaction techniques
+


Add a quick intro about the problems related to context when working with a large codebase.

ntrogh · 2025-09-29T13:44:16Z

docs/copilot/guides/large-complex-codebases-guide.md

+
+#### Chat mode delegation
+
+Use specialized chat modes for context-heavy tasks:


By using specialized chat modes, how does this compact the context? Or does the user have to do something extra?

Large codebase guide outline

6c49720

vs-code-engineering bot assigned digitarald Sep 26, 2025

vs-code-engineering bot added this to the September 2025 milestone Sep 26, 2025

bpasero approved these changes Sep 26, 2025


pamelafox reviewed Sep 26, 2025


digitarald commented Sep 26, 2025


pamelafox reviewed Sep 26, 2025


ntrogh reviewed Sep 29, 2025


digitarald modified the milestones: September 2025, October 2025 Oct 3, 2025

kieferrm mentioned this pull request Oct 13, 2025

Iteration Plan for October 2025 microsoft/vscode#271045

Open

93 tasks




5 participants
