How we introduced an anti-corruption layer over a legacy database without breaking the rest of the system
If you've ever built a modern service on top of a legacy database, you probably know the feeling. The code starts simple, then slowly accumulates exceptions, special cases, temporary fixes, and defensive mapping logic until the original design is no longer recognizable.
That's exactly the situation we had in our Missions service. The service was part of a modern domain-driven architecture, but the persistence layer had to talk to a database designed more than a decade ago for other legacy systems. We couldn't redesign the schema. We couldn't remove weird conventions. And we definitely couldn't break backward compatibility.
So instead of trying to fix the database, we decided to isolate it. This article walks through that decision: the constraints we worked under, the architectural choices we made, and the tradeoffs we accepted along the way.
The Problem: A Modern Service on Top of a Messy Legacy DB
Our service manages three core concepts:
- Mission
- MissionTemplate
- Route
At the domain level, these are clean, explicit, and modeled as separate aggregates.
The database, of course, had a very different opinion.
Tables were reused for multiple concepts. The same table could store both a mission and a mission template, with the real meaning determined by flags, type IDs, or nullable columns. Route points and mission activities shared the same physical structure, but had to be interpreted differently depending on context. Some rows were malformed or incomplete, which meant that during loading we sometimes had to repair data on the fly just to get a usable object out.
The target handling was its own problem. Some records represented real POIs, some were copies of POIs taken at a specific point in time, and some were plain addresses. The difference mattered for how they had to be read and written. On top of that, certain target categories had to exist with exactly the same name as the mission or template they belonged to, a convention with no enforcement in the schema itself.
Letting the domain model talk directly to EF entities would have made every legacy quirk a domain problem. We needed a boundary.
The Constraints We Had
We had two hard constraints going in. First, we could not change the legacy database schema. Second, we could not break the application layer while refactoring infrastructure.
The first constraint had real consequences. We had to accept that the schema would remain inconsistent, that the same tables would continue serving multiple meanings, and that legacy consumers would keep relying on the current shape. The new service had to be resilient to malformed or partial rows, and any fix had to happen in code, not in the database design.
The second constraint shaped how we structured the solution. We kept a single public interface, MissionRepository, unchanged from the outside. The split into multiple repositories happened only inside Infrastructure, which let us do a significant internal refactor without touching handlers, queries, or any other application code.
The First Important Decision: Keep 1:1 Legacy Models
One of the best choices we made was introducing models that mirror the legacy tables almost 1:1.
At first glance, that can feel like duplication. You end up with EF entities for the database, domain entities for the business logic, and mapping code to move between the two. But in our case, that duplication was exactly what protected the domain.
Instead of pretending the legacy DB was already our domain, we treated it as an external system. We created dedicated legacy models like RouteLegacy, TargetLegacy, and TravelStatusLegacy. These classes represent the database as it really is, not as we wish it were.
That gave us a clean boundary: legacy models describe persistence reality, domain models describe business reality, and mapping code explains how to move between the two. This is the core of the anti-corruption layer.
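As a minimal sketch of that boundary (the class names `TargetLegacy` come from our codebase, but the field shapes and the `TargetLegacyMapper` helper here are illustrative assumptions, not the real implementation):

```csharp
// Legacy model: mirrors the table as it really is, quirks included.
public sealed class TargetLegacy
{
    public int Id { get; set; }
    public string? CategoryName { get; set; }   // sometimes must equal the mission name
    public int? SourceTargetId { get; set; }    // set when the row is a POI snapshot copy
    public string? Address { get; set; }
}

// Domain model: describes business reality only.
public sealed record Target(int Id, string Address);

// Mapping: the only place that knows how to cross the boundary.
public static class TargetLegacyMapper
{
    public static Target ToDomain(TargetLegacy row) =>
        new(row.Id, row.Address ?? string.Empty);
}
```

The domain `Target` never sees the nullable columns or the snapshot-copy convention; those stay on the legacy side of the mapper.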
The Refactor Goal
Our goal was not "make the repository smaller". The real goal was to make legacy complexity explicit and local.
In practice, that meant stopping the habit of putting every concern inside one giant repository. We wanted to separate read orchestration from write orchestration, isolate legacy query logic, mapping logic, and repair rules, and keep the public application contract unchanged throughout the process.
The target architecture looked like this:
Application
    MissionRepository

Infrastructure
    MissionRepositoryFacade : MissionRepository
        MissionPersistenceRepository
        MissionTemplatePersistenceRepository
        RoutePersistenceRepository
    Legacy\
        Lookups\
        Readers\
        Snapshots\
        Mapping\
From the outside, nothing changed. From the inside, everything did.
Read Side: Turning Legacy Rows into Clean Domain Objects
The read side was the easiest place to create order first.
The Main Idea
We introduced a simple pipeline:
lookup/reader -> snapshot -> mapper -> domain
Each step has one job.
1. Lookups and Readers
Lookups are small, focused components for simple legacy queries, like travel status or user fleet. Instead of scattering EF queries around the repository, these components centralize them.
Readers build on top of lookups and load raw legacy rows to compose all the data needed to build a domain aggregate. For example, a mission reader loads the main route row together with points, statuses, types, targets, and attributes. A route reader loads the route row and its route point rows. A template reader loads the template row plus its related data.
2. Snapshots
A snapshot is just a grouped representation of the raw legacy data needed for one aggregate.
It’s not domain yet. It’s just “all the pieces we loaded”.
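The lookup → reader → snapshot shape can be sketched like this (a simplified illustration: the row shapes, `RoutePointLookup`, and the in-memory lists standing in for EF `DbSet`s are all assumptions):

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Raw legacy rows, loaded as-is (shapes are illustrative).
public sealed class RouteLegacy { public int Id; public string? Name; }
public sealed class RoutePointLegacy { public int RouteId; public int TypeId; }

// Lookup: one small, focused legacy query.
public sealed class RoutePointLookup
{
    private readonly List<RoutePointLegacy> table; // stands in for an EF DbSet
    public RoutePointLookup(List<RoutePointLegacy> table) => this.table = table;

    public Task<List<RoutePointLegacy>> GetByRouteIdAsync(int routeId) =>
        Task.FromResult(table.Where(p => p.RouteId == routeId).ToList());
}

// Snapshot: "all the pieces we loaded" for one aggregate — not domain yet.
public sealed record MissionLegacySnapshot(RouteLegacy Route, List<RoutePointLegacy> Points);

// Reader: composes lookups into a snapshot.
public sealed class MissionLegacyReader
{
    private readonly List<RouteLegacy> routes;
    private readonly RoutePointLookup points;

    public MissionLegacyReader(List<RouteLegacy> routes, RoutePointLookup points)
        => (this.routes, this.points) = (routes, points);

    public async Task<MissionLegacySnapshot> GetByIdAsync(int missionId)
    {
        var route = routes.Single(r => r.Id == missionId);
        return new MissionLegacySnapshot(route, await points.GetByRouteIdAsync(route.Id));
    }
}
```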
3. Mappers
Mappers convert a snapshot into a proper domain object. This is where legacy conventions become explicit. For example, a row is treated as a mission activity only if the related type has IsActivity set. A target is considered an address POI if its category name matches the mission or template name. Route points are filtered by a computed type, so only those that match belong to the actual route.
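Two of those conventions, written out as code (a sketch: the row shapes and the `MissionConventionMapper` helper are hypothetical, but the rules themselves are the ones described above):

```csharp
using System;
using System.Collections.Generic;

// Illustrative row shapes; the real legacy models are richer.
public sealed record RoutePointRowLegacy(int TypeId, string Name);
public sealed record PointTypeLegacy(int Id, bool IsActivity);
public sealed record TargetRowLegacy(string CategoryName);

public static class MissionConventionMapper
{
    // Legacy convention: a row is a mission activity only if its type has IsActivity set.
    public static bool IsMissionActivity(
        RoutePointRowLegacy row, IReadOnlyDictionary<int, PointTypeLegacy> types) =>
        types.TryGetValue(row.TypeId, out var type) && type.IsActivity;

    // Legacy convention: a target is an address POI if its category name
    // matches the owning mission or template name exactly.
    public static bool IsAddressPoi(TargetRowLegacy target, string missionName) =>
        string.Equals(target.CategoryName, missionName, StringComparison.Ordinal);
}
```

The point is not the code itself but that each convention now has a name and one home, instead of being re-derived wherever a row is read.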
Why This Helped So Much
Before this refactor, the repository itself was doing everything: loading rows, interpreting them, validating them, building domain objects, handling errors, skipping malformed data, and understanding legacy conventions. That made the code hard to read and hard to safely modify.
With the new structure, readers know how to load, mappers know how to translate, and repositories just orchestrate. That sounds simple, but it changes the maintenance experience a lot.
Splitting the Old Mapping Monolith
One of the biggest cleanup steps was removing the old LegacyMappers monolith. Originally, there was a single giant static mapper that knew how to convert almost everything: missions, mission activities, mission types, mission statuses, mission templates, template activities, route points, route point types, and mission attributes. It worked, but it had become a magnet for every new exception and every new special rule.
So we split it into smaller, focused mappers: MissionLegacyAggregateMapper, MissionTemplateLegacyAggregateMapper, and RouteLegacyMapper. That gave us smaller, more coherent files, and made it much clearer where a new rule should go.
A Small Example: Read Flow
Here’s a simplified version of the new read flow for a mission:
public async Task<Mission> GetMissionByIdAsync(MissionId missionId)
{
    var missionSnapshot = await missionLegacyReader.GetByIdAsync(missionId);
    return missionLegacySnapshotMapper.Map(missionSnapshot);
}
That's the kind of code we wanted in repositories: short, boring, and predictable. The interesting logic now lives in the mapper and reader, where it actually belongs.
Write Side: The Hard Part
Reads were only half the story. Writes were more painful, because the service does not just save data: it often has to repair or adapt legacy persistence rules while saving, and that is where a legacy system usually becomes dangerous.
Why Writes Were Harder
Writing a mission or activity was not just "map and save". We had to ensure that target categories existed with the correct name, clone existing POIs into mission-specific snapshots, assign new PointId values in a legacy-compatible way, update route metadata stored in JSON, and persist multiple rows across multiple legacy tables in the right sequence. On top of that, missions and mission templates required different behavior throughout the whole process.
So we needed stronger separation here.
The Write-Side Approach
For writes, we introduced a structure like this:
state-reader -> validation/repair -> writing factory -> persistence
A state reader loads the existing legacy rows needed to perform an update safely. For example, updating a mission template may require loading the current legacy route row before touching anything else.
From there, we validate and repair the loaded state before proceeding. We started treating issues as three distinct categories:
- Fatal: the data is too broken to infer what to do
- Repairable: we can safely fix it in code
- Warning: the data is weird, but still usable
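One way to encode those three categories (a sketch under assumptions: `LegacyIssue`, `IssueSeverity`, and `LegacyStateValidator` are hypothetical names, not our real types):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum IssueSeverity { Fatal, Repairable, Warning }

public sealed record LegacyIssue(IssueSeverity Severity, string Description);

public static class LegacyStateValidator
{
    // Decide whether a write can proceed, given the issues found in the loaded state.
    public static void EnsureWritable(IReadOnlyList<LegacyIssue> issues)
    {
        var fatal = issues.Where(i => i.Severity == IssueSeverity.Fatal).ToList();
        if (fatal.Count > 0)
            throw new InvalidOperationException(
                "Legacy state too broken to update: " +
                string.Join("; ", fatal.Select(i => i.Description)));
        // Repairable issues are fixed in code before writing;
        // warnings are logged and tolerated.
    }
}
```

Making the severity explicit forces every new edge case through the same decision: is this fatal, repairable, or merely weird?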
Writing factories then take the repaired state and build the legacy rows needed for persistence. This was an important separation: EF models should represent database structure, not carry business-aware construction logic. Dedicated factories like RouteLegacyFactory, RouteLegacyMetadataUpdater, and TargetLegacyFactory own that responsibility instead.
Example: Building Legacy Rows Outside EF Models
Instead of doing this inside RouteLegacy itself:
var missionLegacy = RouteLegacy.FromMission(mission);
we now do it through a dedicated factory:
var missionLegacy = RouteLegacyFactory.FromMission(mission);
That may look like a small naming change, but conceptually it's a big improvement. RouteLegacy is just a legacy row model, while RouteLegacyFactory is the place where we describe how a domain mission becomes a legacy route row. That's a much cleaner boundary.
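A trimmed-down sketch of what such a factory owns (the field names, the `Mission` shape, and the metadata layout are assumptions for illustration):

```csharp
using System.Text.Json;

// EF-style legacy row model: structure only, no business logic.
public sealed class RouteLegacy
{
    public string? Name { get; set; }
    public bool IsTemplate { get; set; }
    public string? MetadataJson { get; set; }
}

public sealed record Mission(string Name, double PlannedDistanceKm);

public static class RouteLegacyFactory
{
    // Business-aware construction lives here, not on the EF model.
    public static RouteLegacy FromMission(Mission mission) => new()
    {
        Name = mission.Name,
        IsTemplate = false, // missions and templates share the table; the flag disambiguates
        MetadataJson = JsonSerializer.Serialize(
            new { plannedDistanceKm = mission.PlannedDistanceKm }),
    };
}
```

The row model stays a dumb container, and every rule about how a domain mission turns into a legacy row is readable in one place.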
The POI / Target Problem
One of the most interesting parts of the write side was target handling. When creating a mission activity, we had two different cases.
Case 1: the activity is based on a plain address
We create a new TargetLegacy row to store the address snapshot.
Case 2: the activity is based on an existing POI
We do not want the activity to keep pointing directly at the live POI forever. If the original POI changes later, the mission should still reflect the information it had when the activity was created. So for missions, we duplicate the POI into a new legacy target snapshot.
That behavior is intentional, and encoding it in a dedicated factory made the code much easier to follow:
var targetCopy = TargetDetailLegacyFactory.Create(
    sourceTargetId: originalTarget.Id,
    city: originalTarget.City,
    province: originalTarget.Province,
    country: originalTarget.Country,
    address: originalTarget.Address,
    streetNumber: originalTarget.StreetNumber,
    zipCode: originalTarget.ZipCode,
    latitude: originalTarget.Latitude,
    longitude: originalTarget.Longitude,
    ...
);
This logic used to be buried inside a huge repository method. Now it has a clear home.
Mission vs MissionTemplate: Same Tables, Different Rules
A big part of the complexity came from the fact that missions and mission templates share parts of the legacy schema, but they don't behave the same. That's why the internal repository split was worth it.
We now have:
- MissionPersistenceRepository
- MissionTemplatePersistenceRepository
- RoutePersistenceRepository

while still keeping a single public MissionRepository through MissionRepositoryFacade.
This let us separate the write logic where behavior actually differs. Mission activities duplicate POIs, while template activities reuse them differently. Route creation has different persistence context depending on whether you're dealing with a mission or a template. Templates also require fallback handling for expected dates that missions don't need.
That separation reduced the amount of "if this is a template, do X, otherwise do Y" inside a single file.
Keeping the Public Contract Stable
One of the best parts of this migration was that we did not force the application layer to change.
So from the application point of view, this still existed:
public interface MissionRepository
{
    // existing methods
}
But the infrastructure implementation changed to a facade:
internal sealed class MissionRepositoryFacade : MissionRepository
{
    private readonly MissionPersistenceRepository missionRepository;
    private readonly MissionTemplatePersistenceRepository missionTemplateRepository;
    private readonly RoutePersistenceRepository routeRepository;

    // delegates calls to the right internal repository
}
It gave us the benefits of separation immediately, without forcing a broader migration everywhere else.
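The delegation itself is mechanical. A compilable sketch (everything beyond `GetMissionByIdAsync` and the type names already shown above is an assumption, including the stubbed-out bodies):

```csharp
using System.Threading.Tasks;

public sealed record Mission(int Id);
public sealed record MissionTemplate(int Id);

public interface MissionRepository
{
    Task<Mission> GetMissionByIdAsync(int missionId);
    Task<MissionTemplate> GetTemplateByIdAsync(int templateId);
}

// Internal repositories own the behavior (stubbed here for illustration).
internal sealed class MissionPersistenceRepository
{
    public Task<Mission> GetMissionByIdAsync(int id) => Task.FromResult(new Mission(id));
}

internal sealed class MissionTemplatePersistenceRepository
{
    public Task<MissionTemplate> GetTemplateByIdAsync(int id) =>
        Task.FromResult(new MissionTemplate(id));
}

// The facade keeps the public contract stable and routes each call
// to the internal repository that actually owns it.
internal sealed class MissionRepositoryFacade : MissionRepository
{
    private readonly MissionPersistenceRepository missions = new();
    private readonly MissionTemplatePersistenceRepository templates = new();

    public Task<Mission> GetMissionByIdAsync(int missionId) =>
        missions.GetMissionByIdAsync(missionId);

    public Task<MissionTemplate> GetTemplateByIdAsync(int templateId) =>
        templates.GetTemplateByIdAsync(templateId);
}
```

Because every method is a one-line forward, the facade adds no behavior of its own; it exists purely to keep the application-facing contract in one piece.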
Lessons We Learned
1. Don't fight the legacy DB by pretending it's clean
Trying to map legacy rows directly into rich domain objects usually just spreads the pain everywhere. A clear anti-corruption layer is slower to build at first, but much cheaper to maintain.
2. 1:1 legacy models are not duplication, they are protection
In this kind of system, a legacy model is not a domain model with worse naming. It's a contract with a messy external reality, and that distinction matters.
3. Read and write deserve different designs
When you treat reads and writes as the same problem, you end up with repositories that do too much. Loading data and saving data have fundamentally different concerns, and mixing them is what turns a repository into a maintenance nightmare.
4. Big refactors are easier when the public contract stays stable
Keeping MissionRepository unchanged let us move fast internally without breaking the rest of the service. But the broader lesson is that when you need to do a significant internal restructuring, the best thing you can do is find a boundary that does not need to move. If the public contract is stable, the refactor stays contained. If it is not, every change ripples outward and the scope grows until the refactor becomes unmanageable.
5. "Fixing it in code" is acceptable, if the fix is explicit
Sometimes legacy data really is malformed, and the system needs to repair it to keep working. That's fine, as long as those repairs are deterministic, visible, localized, and easy to test.
6. Understand what the system guarantees before touching it
The existing layer had poor test coverage, and in a system this complex that was a real problem. The legacy schema had accumulated edge cases over the years, most of them undocumented, many tied to backward compatibility requirements that were not obvious from reading the code alone. When we started the refactor, we hit those cases the hard way. Behavior that looked like a bug turned out to be intentional. A path that seemed dead was actually exercised by a legacy consumer. We had to pause and invest time in writing tests first, focusing specifically on those strange compatibility paths before touching anything else. In retrospect, that pause was the right call. The tests gave us a safety net, but more importantly they gave us a map of what the system actually had to do, not just what we thought it did.
Final Thoughts
This refactor was not about elegance for its own sake. It was about making a system survivable.
The database is still legacy. The constraints are still there. We did not remove a single weird convention, and we did not fix a single malformed row at the schema level. What we did was stop pretending that complexity did not exist, and start giving it explicit boundaries instead.
Before this refactor, legacy rules were scattered everywhere: in repository methods, in mapping logic, in ad-hoc fixes buried inside handlers. Every new requirement meant navigating that implicit knowledge again, and hoping nothing broke in the process.
Now that complexity has names. It has dedicated components. It lives in a layer that is explicitly designed to contain it, not in the domain layer that is supposed to ignore it.
That's what a good anti-corruption layer should do. Not magically remove legacy complexity, but make it visible, local, and survivable.