Chapter 8 – Get Your Hands Dirty on Clean Architecture

Mapping Between Boundaries

In the previous chapters, we discussed the web, application, domain, and persistence layers and what each of those layers contributes to implementing a use case.

We have, however, barely touched upon the dreaded and omnipresent topic of mapping between the models of each layer. I bet you have had a discussion at some point about whether to use the same model in two layers in order to avoid implementing a mapper.

The argument might have gone something like this:

Pro-Mapping Developer:

If we don't map between layers, we have to use the same model in both layers, which means that the layers will be tightly coupled.

Contra-Mapping Developer:

But if we do map between layers, we produce a lot of boilerplate code, which is overkill for many use cases, since they're only doing CRUD and have the same model across layers anyway.

As is often the case in discussions like this, there's truth to both sides of the argument. Let's discuss some mapping strategies with their pros and cons and see whether we can help those developers make a decision.

The "No Mapping" Strategy

The first strategy is actually not mapping at all:

Figure 8.1: If the port interfaces use the domain model as the input and output model, we don't need to map between layers

The preceding figure shows the components that are relevant to the "Send Money" use case from our BuckPal example application.

In the web layer, the web controller calls the SendMoneyUseCase interface to execute the use case. This interface takes an Account object as an argument. This means that both the web layer and the application layer need access to the Account class – both are using the same model.

On the other side of the application, we have the same relationship between the persistence and application layers. Since all layers use the same model, we don't need to implement a mapping between them.
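A minimal sketch of this setup might look like the following. The method signatures, the UpdateAccountPort name, and the in-memory store are assumptions made for illustration; they are not taken from the BuckPal code. The point is only that a single Account class travels through every layer.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

// Domain model, used as-is by the web, application, and persistence layers.
final class Account {
    private final String id;
    private BigDecimal balance;

    Account(String id, BigDecimal balance) {
        this.id = id;
        this.balance = balance;
    }

    String getId() { return id; }
    BigDecimal getBalance() { return balance; }

    void withdraw(BigDecimal amount) { balance = balance.subtract(amount); }
    void deposit(BigDecimal amount) { balance = balance.add(amount); }
}

// Incoming port: takes domain objects directly (signature is an assumption).
interface SendMoneyUseCase {
    void sendMoney(Account source, Account target, BigDecimal amount);
}

// Outgoing port: also speaks the domain model (hypothetical name).
interface UpdateAccountPort {
    void update(Account account);
}

// Application service: no mapping on either side.
final class SendMoneyService implements SendMoneyUseCase {
    private final UpdateAccountPort updateAccountPort;

    SendMoneyService(UpdateAccountPort updateAccountPort) {
        this.updateAccountPort = updateAccountPort;
    }

    @Override
    public void sendMoney(Account source, Account target, BigDecimal amount) {
        source.withdraw(amount);
        target.deposit(amount);
        updateAccountPort.update(source);
        updateAccountPort.update(target);
    }
}

// Stand-in for a persistence adapter: stores the very same Account objects.
final class InMemoryAccountStore implements UpdateAccountPort {
    private final Map<String, Account> accounts = new HashMap<>();

    @Override
    public void update(Account account) { accounts.put(account.getId(), account); }

    Account find(String id) { return accounts.get(id); }
}
```

A web controller would call sendMoney() with Account objects it built from the request, and the store receives those same objects without any translation step in between.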

But what are the consequences of this design?

The web and persistence layers may have special requirements for their models. If our web layer exposes its model via REST, for instance, the model classes might need some annotations that define how to serialize certain fields into JSON. The same is true for the persistence layer if we are using an ORM framework, which might require some annotations that define the database mapping.

In the example, all of those special requirements have to be dealt with in the Account domain model class even though the domain and application layers are not interested in them. This violates the Single Responsibility Principle since the Account class has to change for reasons related to the web, application, and persistence layers.

Aside from the technical requirements, each layer might require certain custom fields in the Account class. This might lead to a fragmented domain model with certain fields only relevant in one layer.

Does this mean, though, that we should never, ever, implement a "no mapping" strategy? Certainly not.

Even though it might feel dirty, a "no mapping" strategy can be perfectly valid.

Consider a simple CRUD use case. Do we really need to map the same fields from the web model into the domain model and from the domain model into the persistence model? I'd say we don't.

And what about those JSON or ORM annotations on the domain model? Do they really bother us? Even if we have to change an annotation or two in the domain model if something changes in the persistence layer, so what?

As long as all layers need exactly the same information in exactly the same structure, a "no mapping" strategy is a perfectly valid option.

However, as soon as we are dealing with web or persistence issues in the application or domain layer (aside from annotations, perhaps), we should move to another mapping strategy.

There is a lesson for the two developers from the introduction here: even though we have decided on a certain mapping strategy in the past, we can change it later.

In my experience, many use cases start their life as simple CRUD use cases. Later, they might grow into full-fledged business use cases with rich behavior and validations that justify a more expensive mapping strategy. Or they might forever keep their CRUD status, in which case we are glad that we haven't invested in a different mapping strategy.

The "Two-Way" Mapping Strategy

A mapping strategy where each layer has its own model is what I call the "two-way" mapping strategy, outlined in the following figure:

Figure 8.2: With each adapter having its own model, the adapters are responsible for mapping their model into the domain model and back

Each layer has its own model, which may have a structure that is completely different from the domain model.

The web layer maps the web model into the domain model that is expected by the incoming ports. It also maps domain objects returned by the incoming ports back into the web model.

The persistence layer is responsible for a similar mapping between the domain model, which is used by the outgoing ports, and the persistence model.

Both layers map in two directions, hence the name "two-way" mapping.
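The following sketch illustrates the two directions for both adapters. AccountResource, AccountRow, and the two mapper classes are hypothetical names, and the choice to present the balance as a string on the web side and store it as cents on the persistence side is an assumption made only to show that each model can have its own shape.

```java
import java.math.BigDecimal;

// Domain model: free of JSON or ORM concerns.
final class Account {
    private final long id;
    private final BigDecimal balance;

    Account(long id, BigDecimal balance) {
        this.id = id;
        this.balance = balance;
    }

    long getId() { return id; }
    BigDecimal getBalance() { return balance; }
}

// Web model (hypothetical): shaped for presentation.
record AccountResource(String accountId, String balance) {}

// Persistence model (hypothetical): shaped for storage, balance kept in cents.
record AccountRow(long id, long balanceInCents) {}

// The web adapter maps its model into the domain model and back.
final class AccountWebMapper {
    static Account toDomain(AccountResource resource) {
        return new Account(Long.parseLong(resource.accountId()),
                new BigDecimal(resource.balance()));
    }

    static AccountResource toResource(Account account) {
        return new AccountResource(String.valueOf(account.getId()),
                account.getBalance().toPlainString());
    }
}

// The persistence adapter does the same for its own model.
final class AccountPersistenceMapper {
    static Account toDomain(AccountRow row) {
        return new Account(row.id(), BigDecimal.valueOf(row.balanceInCents(), 2));
    }

    static AccountRow toRow(Account account) {
        // longValueExact() throws if the balance has fractions of a cent.
        return new AccountRow(account.getId(),
                account.getBalance().movePointRight(2).longValueExact());
    }
}
```

Note that the Account class knows nothing about either mapper. Both adapters depend on the domain model, never the other way around.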

With each layer having its own model, each layer can modify its own model without affecting the other layers (as long as the contents are unchanged). The web model can have a structure that allows the optimal presentation of the data. The domain model can have a structure that best allows implementing the use cases. And the persistence model can have the structure needed by an ORM for persisting objects to a database.

This mapping strategy also leads to a clean domain model that is not dirtied by web or persistence concerns. It does not contain JSON or ORM mapping annotations. The Single Responsibility Principle is satisfied.

Another bonus of "two-way" mapping is that, after the "no mapping" strategy, it's conceptually the simplest mapping strategy. The mapping responsibilities are clear: the outer layers/adapters map into the model of the inner layers and back. The inner layers only know their own model and can concentrate on the domain logic instead of mapping.

As with every mapping strategy, "two-way" mapping also has its drawbacks.

First of all, it usually results in a lot of boilerplate code. Even if we use one of the many mapping frameworks out there to reduce the amount of code, implementing the mapping between models usually takes up a good portion of our time. This is partly due to the fact that debugging mapping logic is a pain – especially when using a mapping framework that hides its inner workings behind a layer of generic code and reflection.

Another drawback is that the domain model is used to communicate across layer boundaries. The incoming ports and outgoing ports use domain objects as input parameters and return values. This makes the domain objects vulnerable to changes that are triggered by the needs of the outer layers, whereas it's desirable for the domain model to evolve only due to the needs of the domain logic.

Just like the "no mapping" strategy, the "two-way" mapping strategy is not a silver bullet. In many projects, however, this kind of mapping is considered a holy law that we have to comply with throughout the whole codebase, even for the simplest CRUD use cases. This unnecessarily slows down development.

No mapping strategy should be considered an iron law. Instead, we should decide for each use case.

The "Full" Mapping Strategy

Another mapping strategy is what I call the "full" mapping strategy, sketched in the following figure:

Figure 8.3: With each operation requiring its own model, the web adapter and application layer each map their model into the model expected by the operation they want to execute

This mapping strategy introduces a separate input and output model per operation. Instead of using the domain model to communicate across layer boundaries, we use a model specific to each operation, such as SendMoneyCommand, which acts as an input model to the SendMoneyUseCase port in the figure. We can call those models "commands," "requests," or something similar.

The web layer is responsible for mapping its input into the command object of the application layer. Such a command makes the interface to the application layer very explicit, with little room for interpretation. Each use case has its own command with its own fields and validations. There's no guessing involved as to which fields should be filled and which fields better be left empty since they would otherwise trigger a validation we don't want for our current use case.

The application layer is then responsible for mapping the command object into whatever it needs to modify the domain model according to the use case.
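As a rough sketch of both mapping steps, consider the code below. The fields of SendMoneyCommand, the single validation rule, the controller signature, and the map of accounts standing in for an outgoing port are all assumptions for illustration and may differ from the BuckPal code.

```java
import java.math.BigDecimal;
import java.util.Map;
import java.util.Objects;

// Input model for exactly one use case; it validates itself on construction.
record SendMoneyCommand(long sourceAccountId, long targetAccountId, BigDecimal amount) {
    SendMoneyCommand {
        Objects.requireNonNull(amount, "amount must not be null");
        if (amount.signum() <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
    }
}

// Incoming port: accepts the command, not a domain object.
interface SendMoneyUseCase {
    void sendMoney(SendMoneyCommand command);
}

// Simplified domain model.
final class Account {
    private BigDecimal balance;

    Account(BigDecimal balance) { this.balance = balance; }

    BigDecimal getBalance() { return balance; }
    void withdraw(BigDecimal amount) { balance = balance.subtract(amount); }
    void deposit(BigDecimal amount) { balance = balance.add(amount); }
}

// Web adapter: maps raw request input into the command.
final class SendMoneyController {
    private final SendMoneyUseCase useCase;

    SendMoneyController(SendMoneyUseCase useCase) { this.useCase = useCase; }

    void sendMoney(String sourceId, String targetId, String amount) {
        useCase.sendMoney(new SendMoneyCommand(
                Long.parseLong(sourceId), Long.parseLong(targetId), new BigDecimal(amount)));
    }
}

// Application service: maps the command into operations on the domain model.
final class SendMoneyService implements SendMoneyUseCase {
    // Stand-in for loading accounts through an outgoing port;
    // this sketch assumes both accounts exist.
    private final Map<Long, Account> accounts;

    SendMoneyService(Map<Long, Account> accounts) { this.accounts = accounts; }

    @Override
    public void sendMoney(SendMoneyCommand command) {
        accounts.get(command.sourceAccountId()).withdraw(command.amount());
        accounts.get(command.targetAccountId()).deposit(command.amount());
    }
}
```

Because the command rejects invalid input in its constructor, the service never sees a command it would have to second-guess.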

Naturally, mapping from one layer into many different commands requires even more mapping code than mapping between a single web model and a domain model. This mapping, however, is significantly easier to implement and maintain than a mapping that has to handle the needs of many use cases instead of only one.

I don't advocate this mapping strategy as a global pattern. It plays out its advantages best between the web layer (or any other incoming adapter) and the application layer to clearly demarcate the state-modifying use cases of the application. I would not use it between the application and persistence layers due to the mapping overhead.

Also, in some cases, I would restrict this kind of mapping to the input model of operations and simply use a domain object as the output model. SendMoneyUseCase might then return an Account object with the updated balance, for instance.

This shows that mapping strategies can and should be mixed. No mapping strategy needs to be a global rule across all layers.

The "One-Way" Mapping Strategy

There is yet another mapping strategy with another set of pros and cons – the "one-way" strategy sketched in the following figure:

Figure 8.4: With the domain model and the adapter models implementing the same "state" interface, each layer only needs to map objects it receives from other layers – one way

In this strategy, the models in all layers implement the same interface, which encapsulates the state of the domain model by providing getter methods for the relevant attributes.

The domain model itself can implement rich behavior, which we can access from our services within the application layer. If we want to pass a domain object to the outer layers, we can do so without mapping, since the domain object implements the state interface expected by the incoming and outgoing ports.

The outer layers can then decide whether they can work with the interface or whether they need to map it into their own model. They cannot inadvertently modify the state of the domain object since the modifying behavior is not exposed by the state interface.

Objects we pass from an outer layer into the application layer also implement this state interface. The application layer then has to map them into the real domain model in order to get access to its behavior. This mapping plays well with the DDD concept of a factory. A factory in terms of DDD is responsible for reconstituting a domain object from a certain state, which is exactly what we are doing (Domain-Driven Design by Eric Evans, Addison-Wesley, 2004, p. 158).

The mapping responsibility is clear: if a layer receives an object from another layer, we map it into something the layer can work with. Thus, each layer only maps one way, making this the "one-way" mapping strategy.
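To make this concrete, here is a minimal sketch. AccountState, AccountForm, DepositMoneyUseCase, and the fromState() factory method are hypothetical names, and the accessor style (id() rather than getId()) is chosen only so that a Java record can implement the interface directly.

```java
import java.math.BigDecimal;

// Shared state interface: read-only accessors, no modifying behavior.
interface AccountState {
    long id();
    BigDecimal balance();
}

// Web model (hypothetical): a plain data holder implementing the state interface.
record AccountForm(long id, BigDecimal balance) implements AccountState {}

// Domain model: implements the state interface and adds behavior.
final class Account implements AccountState {
    private final long id;
    private BigDecimal balance;

    private Account(long id, BigDecimal balance) {
        this.id = id;
        this.balance = balance;
    }

    // Factory in the DDD sense: reconstitutes a domain object from state.
    static Account fromState(AccountState state) {
        return new Account(state.id(), state.balance());
    }

    @Override public long id() { return id; }
    @Override public BigDecimal balance() { return balance; }

    void deposit(BigDecimal amount) { balance = balance.add(amount); }
}

// The incoming port only talks in terms of the state interface.
interface DepositMoneyUseCase {
    AccountState deposit(AccountState account, BigDecimal amount);
}

final class DepositMoneyService implements DepositMoneyUseCase {
    @Override
    public AccountState deposit(AccountState state, BigDecimal amount) {
        Account account = Account.fromState(state); // the single mapping step
        account.deposit(amount);
        return account; // passed outward without mapping
    }
}
```

The caller receives an AccountState and therefore cannot call deposit() on the result. It can read the state directly or map it into its own model if it needs a different shape.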

With the mapping distributed across layers, however, this strategy is conceptually more difficult than the other strategies.

This strategy plays out its strength best if the models across the layers are similar. For read-only operations, for instance, the web layer might not need to map into its own model at all, since the state interface provides all the information it needs.

When to Use Which Mapping Strategy?

This is the million-dollar question, isn't it?

The answer is the usual, unsatisfying "It depends."

Since each mapping strategy has different advantages and disadvantages, we should resist the urge to define a single strategy as a hard-and-fast global rule for the whole codebase. This goes against our instincts, as it feels untidy to mix patterns within the same codebase. But knowingly choosing a pattern that is not the best pattern for a certain job, just to serve our sense of tidiness, is irresponsible, plain and simple.

Also, as software evolves over time, the strategy that was the best for the job yesterday might not still be the best for the job today. Instead of starting with a fixed mapping strategy and keeping it over time – no matter what – we might start with a simple strategy that allows us to quickly evolve the code and later move to a more complex one that helps us to better decouple the layers.

In order to decide which strategy to use and when, we need to agree upon a set of guidelines within the team. These guidelines should answer the question of which mapping strategy should be the first choice in which situation. They should also state why that strategy is the first choice so that we are able to evaluate whether those reasons still apply after some time.

We might, for example, define different mapping guidelines for modifying use cases than we do for queries. Also, we might want to use different mapping strategies between the web and application layers and between the application and persistence layers.

Guidelines for these situations might look like this:

If we are working on a modifying use case, the "full mapping" strategy is the first choice between the web and application layers, in order to decouple the use cases from one another. This gives us clear per-use-case validation rules and we won't have to deal with fields we don't need in a certain use case.

If we are working on a modifying use case, the "no mapping" strategy is the first choice between the application and persistence layers in order to be able to quickly evolve the code without mapping overhead. As soon as we have to deal with persistence issues in the application layer, however, we move to a "two-way" mapping strategy to keep persistence issues in the persistence layer.

If we are working on a query, the "no mapping" strategy is the first choice between the web and application layers and between the application and persistence layers in order to be able to quickly evolve the code without mapping overhead. As soon as we have to deal with web or persistence issues in the application layer, however, we move to a "two-way" mapping strategy between the web and application layers or the application layer and the persistence layer, respectively.
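If it helps to keep them in mind, the three example guidelines above can be written down as a small decision function. This is only an illustrative sketch; the enum and method names are made up, and the outerConcernsLeak flag stands for "we have to deal with web or persistence issues in the application layer."

```java
enum UseCaseKind { MODIFYING, QUERY }

enum Boundary { WEB_TO_APPLICATION, APPLICATION_TO_PERSISTENCE }

enum MappingStrategy { NO_MAPPING, TWO_WAY, FULL, ONE_WAY }

final class MappingGuidelines {
    // Returns the first-choice strategy according to the example guidelines.
    static MappingStrategy firstChoice(UseCaseKind kind, Boundary boundary,
                                       boolean outerConcernsLeak) {
        // Modifying use cases get their own input model at the web boundary.
        if (kind == UseCaseKind.MODIFYING && boundary == Boundary.WEB_TO_APPLICATION) {
            return MappingStrategy.FULL;
        }
        // Everywhere else: start without mapping and switch to two-way
        // mapping once outer-layer issues show up in the application layer.
        return outerConcernsLeak ? MappingStrategy.TWO_WAY : MappingStrategy.NO_MAPPING;
    }
}
```

A real team's guidelines will differ, and the reasons behind each choice still need to be written down next to the rule.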

In order to successfully apply guidelines like these, they must be present in the minds of developers. So, the guidelines should be discussed and revised continuously as a team effort.

How Does This Help Me Build Maintainable Software?

With incoming and outgoing ports acting as gatekeepers between the layers of our application, they define how the layers communicate with each other and thus whether and how we map between layers.

With narrow ports in place for each use case, we can choose different mapping strategies for different use cases, and even evolve them over time without affecting other use cases, thus selecting the best strategy for a certain situation at a certain time.

This selection of mapping strategies per situation certainly is harder and requires more communication than simply using the same mapping strategy for all situations, but it will reward the team with a codebase that does just what it needs to do and is easier to maintain, as long as the mapping guidelines are known.