Domain-Driven Design: Tackling Complexity in the Heart of Software

Author: Eric Evans
4.3
All Stack Overflow 161
This Year Stack Overflow 5
This Month Stack Overflow 1

Comments

by cottsak   2019-04-25
Largely agree with this except the initial objection "Data structure and functions should not be bound together" and the "Why was OO popular?" which is missing the response - because using the notion that "objects" from the real world encapsulate state and behaviour was not only a solid premise, but it was an attempt to interface the computer world with the real. This was not a failed concept. It makes some sense. Just very few teams find the discipline to model their core domain this way.

Which brings be back to the original objection. I think this is true most of the time, except when it's not - which is your core Domain Model. The first 1/2 of the Blue Book[1] lays out straightforward means to arrange code, functions, data/state and related behaviours in a way which can be managed and maintained over time. This is pretty important as most folks who've spent any length of time maintaining vast applications will know that it's incredibly hard to reason about a first-class concept in an application without clear boundaries around said concept, it's structures and it's behaviour. Most of us are unlucky and find this scattered across the landscape. Few applications take the focus to "model" these concepts clearly.

Does this modelling have to be done with "Domain Model", or DDD, or something else that can be loosely coupled with OOD - probably not. But another developer absolutely has to be able to reason about said structures and behaviour. They have to be able to read it easily and grok it quickly. And having done that, they don't want to be surprised by some distant missing element, 20 calls or 1000 lines or 15 modules (repos, submodules, etc, etc) away! This is possibly the biggest time-sink and therefore "cost" of development. One could also take this further and postulate that about 1/2 of us are employed as a direct result of applications whose core concepts are so poorly designed or hard to reason about, that a massive volume of work (time?) is dedicated to unwinding the ambiguity that results.

I don't want to suggest that OOP or OOD/DDD/{other modelling process} would necessarily fix this, but the attempt to clarify and find a means to make modelling these critical concepts easier and less costly is admirable IMO.

It's ok if your infrastructure takes a different approach, or is "functional" or "dynamic" in nature. If your test suite uses completely different patterns and paradigms because the tooling makes that easy then - awesome! But if the core model/concepts of your application are hard to understand, reason about, and therefore maintain, then you're pretty fucked.

OO doesn't "suck". It's spirit is just largely lost and like many other things in life, it's been hijacked and mutilated into something many of us come to loathe because we've never seen it deliver on the promises. I guess we will be having this conversation again in another decade about something else that's hugely popular right now.

[1] https://www.amazon.com/Domain-Driven-Design-Tackling-Complex...

by anonymous   2019-01-13

Before you choose an integration mode, there's analysis to be done on a strategic level about the business subdomains, teams that will maintain these services, who owns what, how you expect the models to evolve, etc.

I'd recommend having a look at the Context Mapping part of the DDD book and then Enterprise Integration Patterns by Hohpe and Woolf for concrete implementation.

by karmajunkie   2018-11-10
If you're going to dive into CQRS/ES, I'd recommend:

* Enterprise Integration Patterns (basically an entire book about messaging architectures) [1] * Vaughn Vernon's books and online writing [2], * Domain Driven Design by Eric Evans [3], * and most of what Greg Young, Udi Dahan, and that constellation of folks has done online (lots of talks and blog articles.)

Depending on your platform of choice, there may be others worth reading. For my 2¢, the dragons are mostly in the design phase, not the implementation phase. The mechanics of ES are pretty straightforward—there are a few things to look out for, like detection of dropped messages, but they're primarily the risks you see with any distributed system, and you have a collection of tradeoffs to weigh against each other.

In design, however, your boundaries become very important, because you have to live with them for a long time and evolving them takes planning. If you create highly coupled bounded contexts, you're in for a lot of pain over the years you maintain a system. However, if you do a pretty good job with them, there's a lot of benefits completely aside from ES.

[1] https://www.amazon.com/Enterprise-Integration-Patterns-Desig...

[2] https://www.amazon.com/Domain-Driven-Design-Tackling-Complex...

by anonymous   2018-05-09
@AppleBook89 You can read the great tutorial from http://cqrs.nu/ Event-sourcing (and CQRS) works great with (Domain driven design )[https://www.amazon.com/Domain-Driven-Design-Tackling-Complexity-Software/dp/0321125215]. There are a lot of other sources on the internet, too many in fact to be put here. Look also (here)[https://stackoverflow.com/tags/domain-driven-design/info]
by ChicagoDave   2018-01-25
I can only refer you to Eric Evans book (https://www.amazon.com/Domain-Driven-Design-Tackling-Complex...) and other domain driven design material.

Boundaries are by domain, and yes that's not a simple thing to define. Sometimes, domains have varying interfaces, which makes building micro-services more complex, especially when trying to adhere to REST/Swagger standards (something I'm not overly find of).

But keeping things as simple as possible is really the best approach.

All micro-services should be small. When I see someone say "big", then I'm guessing there are a lot of ad-hoc actions...those need to be broken down into their proper domain or relegated to a query service.

by jgrodziski   2017-10-19
Identifying changing "stuff" in the real world is for me a fundamental topic of any serious data modeling for any kind of software (be it an API, a traditional database stuff, etc). Identity is also at the center of the entity concept of Domain-Driven Design (see the seminal book of Eric Evans on that: https://www.amazon.com/Domain-Driven-Design-Tackling-Complex...).

I started changing my way of looking at identity by reading the rationale of clojure (https://www.amazon.com/Data-Reality-Perspective-Perceiving-I....

More specifically concerning the article, I do agree with the point of view of the author distinguishing access by identifier and hierarchical compound name better represented as a search. On the id stuff, I find the amazon approach of using URN (in summary: a namespaced identifier) very appealing: http://philcalcado.com/2017/03/22/pattern_using_seudo-uris_w.... And of course, performance matters concerning IDs and UUID: https://tomharrisonjr.com/uuid-or-guid-as-primary-keys-be-ca....

Happy data modeling :)

EDIT: - add an excerpt from the clojure rationale

by anonymous   2017-10-01

Implement Repository pattern is pretty easy, you just need to create interface with CRUD methods and use it in your domain logic.

For example:

class CreateEntityException;
class ReadEntityException;
class UpdateEntityException;
class DeleteEntityException;

interface Repository<Entity> {
    Entity create(Entity entity) throws CreateEntityException;
    Entity read(long entityId) throws ReadEntityException;
    Entity update(Entity entity) throws UpdateEntityException;
    void delete(long entityId) throws DeleteEntityException;
}

Methods count and signature can be different in your own project but an approach is the same. After it you can create concrete implementation of repository that encapsulate one or another datasource - ContentProviderRepository, OrmLiteRepository, RealmRepository etc. Then by using Dependency Injection principle you should inject correct implementation.

There are few good books that covered Repository patterns. Pattern is independent from platform so it is easy to implement and use every platform.

https://www.amazon.com/Domain-Driven-Design-Tackling-Complexity-Software/dp/0321125215

https://www.manning.com/books/functional-and-reactive-domain-modeling

by anonymous   2017-09-18
I agree that this is the best approach. For a very detailed overview of this design pattern, check out the great Eric Evans book, Domain Driven Design (https://www.amazon.com/Domain-Driven-Design-Tackling-Complexity-Software/dp/0321125215).
by galeaspablo   2017-08-20
> If you aren’t familiar with the database pattern known as event sourcing (don’t worry — it’s relatively new),

It's not relatively new. That “transaction file” thing in your database? Event Sourcing.

https://www.amazon.co.uk/Domain-driven-Design-Tackling-Compl... https://www.amazon.co.uk/Implementing-Domain-Driven-Design-V...

-----------------

If that doesn't do it for you, please just remember the good old CAP theorem.

https://en.wikipedia.org/wiki/CAP_theorem

by jpalomaki   2017-08-20
Two books that affected my thinking on the subject were Domain Driven Design[1] by Eric Evans and Object Thinking[2] by David West. Many years since I read the books and I don't claim to have studied them in detail so I'm not saying if they were good or bad, but at least I got some ideas out of them.

[1] https://www.amazon.com/Domain-Driven-Design-Tackling-Complex...

[2] https://www.microsoftpressstore.com/store/object-thinking-97...

by anonymous   2017-08-20

Active Record Pattern

Django is tailored towards the use of the Active Record Pattern as described on this Django Design Philosophy page.

Your second example follows this pattern - the model itself has its properties, behaviour and data access contained within.

You can still use this pattern in a more DDD-like way, if you push more behaviour onto the model. e.g. in your example, a more effective use of the pattern would be to wrap the line

poll.question += '?'

in an intention revealing method on the poll object, so that the update_poll method is:

views.py
def update_poll(poll_id):
    poll = models.Polls.objects.get(poll_id)
    poll.add_question()
    poll.save()

This has the advantage of separating the business logic (pushed into the model) from the application flow logic (the update_poll method)

Although I'd suggest using a name that actually illustrates the intent or purpose of the method rather than just add_question.

But even if you do this, you are still using the Active Record pattern, not pure DDD.

You ask "Is there a need for DDD with ORM?"

DDD and an ORM are attempting to solve different problems. ORMs provide a convenient way of abstracting the set-like record-oriented world of databases in a more object oriented fashion.

DDD is an approach to assist with modelling complex real world situations in code.

Many DDD systems use ORMs to solve the infrastructure concerns of retrieving and persisting from the database (sometimes wrapping the ORM in a repository for a variety of reasons), but the DDD focus is on the domain models and how suitably they model the domain under consideration.

So - in your example, the benefits of DDD are difficult to see, as the business logic is so relatively simple.

I recommend reading the authoritive source on DDD - Domain Driven Design by Eric Evans for a language agnostic overview of the approach and the situations where it adds value.

Update

You ask:

Can you update me with one good example where using DDD with an ORM makes sense

and

If we use ORM I think there is no need for a DDD-repository

I think a better way to think about it is - when using an ORM, the ORM is the repository. You ask it for a model and it returns a model. That is the purpose of a repository. When people wrap it in a class called 'repository' it is usually because they want to do one of a few things:

  1. make it easier to inject a mock repository to simplify unit testing
  2. abstract the specific orm technology used to give flexibility to change the ORM later without having to touch the services or domain

This overview of the repository pattern provides another good writeup of the ddd repository pattern.

by anonymous   2017-08-20

Repository pattern emerges in the context of DDD, and the best book to read on that would be the original Domain-Driven Design: Tackling Complexity in the Heart of Software

One of the things that Evans keeps saying over and over is that all of the concepts of the domain must be explicit.

What does your menu represent? Does it have a name in the domain?

Lets say it does, and it is the concept of "weekly promoted products", then you could make this concept explicit - and introduce a separate class for that, with a separate repository, that could read data from products table, and from categories table to come up with the list of promoted products.

Please note that according to Evans our domain and persistence models do not have to match, and we don't need to have a repository per domain object (specifically, we should only have repositories for aggregate roots, and references all other entities should be obtained by traversing the association from the aggregate root)

Alternatively, in a design, somewhat departed from DDD, you could place the selection of those promoted products in the product repository, pre-load the categories and just have a TopCategory calculation property on a product

Please note, that if the new entity "promoted products" is purely virtual, and you are only planning to do read operations, without mutating its state, then it makes a lot of sense to apply Command Query Responsibility Seggregation, and have a separate read model for "promoted products" outside of your main domain model.

by anonymous   2017-08-20

The whole Chapter 6 of Eric Evans book is devoted to the problems you are describing.

First of all, Factory in DDD doesn't have to be a standalone service -

Evans DDD, p. 139:

There are many ways to design FACTORIES. Several special-purpose creation patterns - FACTORY METHOD, ABSTRACT FACTORY, and BUILDER - were thoroughly treated in Gamma et. al 1995. <...> The point here is not to delve deeply into designing factories, but rather to show the place of factories as important components of a domain design.

Each creation method in Evans FACTORY enforces all invariants of the created object, however, object reconstitution is a special case

Evans DDD, p. 145:

A FACTORY reconstituting an object will handle violation of an invariant differently. During creation of a new object, a FACTORY should simply balk when invariant isn't met, but a more flexible response may be necessary in reconstitution.

This is important, because it leads us to creating separate FACTORIES for creation and reconstitution. (in the diagram on page 155 TradeRepository uses a specialized SQL TradeOrderFactory, not a general general purpose TradeOrderFactory )

So you need to implement a separate logic for reconstitution, and there are several ways to do it (You can find the full theory in Martin J Fowler Patterns Of Enterprise Application Architecture, on page 169 there's a subheading Mapping Data to Domain Fields, but not all of methods described look suitable(for example making the object fields package-private in java is seems to be too intrusive) so I'd prefer only one of the following two options

  • You can create a separate FACTORY and document it so that developers should only use it only for persistence or testing.
  • You can set the private field values with reflection, as for example Hibernate does.

Regarding the anemic domain model with setters/and getters, the upcoming Vaughn Vernon book criticizes this approach a lot so I dare say it is an antipattern in DDD.

by anonymous   2017-08-20

As far as I can see contracts don't exist without cars hence cars is the aggregate root.

This is not necessarily true. 'Don't exist without' is not enough for an entity to become a part of an Aggregate Root. Consider classic order processing domain. You have an Order that is an Aggregate Root. You also have a Customer that is an Aggregate Root. Order can not exist without a Customer but it does not mean that Orders are part of the Customer Aggregate. In DDD entities inside one Aggregate can have references to other Aggregate Roots. From DDD book:

Objects within the AGGREGATE can hold references to other AGGREGATE roots.

Aggregate is a life cycle and data exchange unit. It is essentially a cluster of objects that enforces invariants. This is something you want to be locked if you have multiple users changing domain at the same time.

Back to your question, my understanding is that the domain is something like rent / lease a car / truck / limo / bulldozer. I think that HireContract may not be a part of Car aggregate because they may have different lifecycles and HireContract just makes sense on its own, without a Car. It seem to be more of a Order-Product relationship that is also a classic example of two different Aggregates referencing each other. This theory is also confirmed by the fact that business needs to see "All Contracts". They probably don't think of Car containing all Contracts. If this is true than you need to keep your ContractsRepository.

On an unrelated note, you might be interested in reading this answer about repository interface design.

by anonymous   2017-08-20

What you are looking for is what in xUnit Test Patterns is called Test-Specific Equality.

While you can sometimes choose to override the Equals method, this may lead to Equality Pollution because the implementation you need to the test may not be the correct one for the type in general.

For example, Domain-Driven Design distinguishes between Entities and Value Objects, and those have vastly different equality semantics.

When this is the case, you can write a custom comparison for the type in question.

If you get tired doing this, AutoFixture's Likeness class offers general-purpose Test-Specific Equality. With your Student class, this would allow you to write a test like this:

[TestMethod]
public void VerifyThatStudentAreEqual()
{
    Student st1 = new Student();
    st1.ID = 20;
    st1.Name = "ligaoren";

    Student st2 = new Student();
    st2.ID = 20;
    st2.Name = "ligaoren";

    var expectedStudent = new Likeness<Student, Student>(st1);

    Assert.AreEqual(expectedStudent, st2);
}

This doesn't require you to override Equals on Student.

Likeness performs a semantic comparison, so it can also compare two different types as long as they are semantically similar.

by anonymous   2017-08-20

10,000 foot answer:

You might find Domain Driven Design and Clean Code useful as they give you a set of patterns that work well together and a set of principals for evaluating when to apply a pattern. DDD resources: the book, free quick intro, excellent walkthrough. Clean Code resources: summary, SOLID principles.

Specific answer:

You are already using the Repository pattern (your utility classes) which I'd probably use here as well. The static members can make the code difficult to test but otherwise aren't a problem. If the Repositories become too complex, break out the low-level API communication into Gateways.

Since an entity is split across multiple data sources, consider modelling this explicitly. For example: Person, HumanResourcesPerson, AccountingPerson. Use names understood by the external systems and their business owners (e.g. Employee, Resource). See Single Responsibilty Principle and Ubiquitous Language for some reasoning. These may be full Entities or just Data Transfer Objects (DTOs) depending on how complex they are.

The synchronization might be performed by an Application Service that coordinates the Repositories and Entities.

by anonymous   2017-08-20

It seems like you are asking about the difference between the data model and the domain model – the latter is where you can find the business logic and entities as perceived by your end user, the former is where you actually store your data.

Furthermore, I've interpreted the 3rd part of your question as: how to notice failure to keep these models separate.

These are two very different concepts and it's always hard to keep them separate. However, there are some common patterns and tools that can be used for this purpose.

About the Domain Model

The first thing you need to recognize is that your domain model is not really about data; it is about actions and questions such as "activate this user", "deactivate this user", "which users are currently activated?", and "what is this user's name?". In classical terms: it's about queries and commands.

Thinking in Commands

Let's start by looking at the commands in your example: "activate this user" and "deactivate this user". The nice thing about commands is that they can easily be expressed by small given-when-then scenario's:

given an inactive user
when the admin activates this user
then the user becomes active
and a confirmation e-mail is sent to the user
and an entry is added to the system log
(etc. etc.)

Such scenario's are useful to see how different parts of your infrastructure can be affected by a single command – in this case your database (some kind of 'active' flag), your mail server, your system log, etc.

Such scenario's also really help you in setting up a Test Driven Development environment.

And finally, thinking in commands really helps you create a task-oriented application. Your users will appreciate this :-)

Expressing Commands

Django provides two easy ways of expressing commands; they are both valid options and it is not unusual to mix the two approaches.

The service layer

The service module has already been described by @Hedde. Here you define a separate module and each command is represented as a function.

services.py

def activate_user(user_id):
    user = User.objects.get(pk=user_id)

    # set active flag
    user.active = True
    user.save()

    # mail user
    send_mail(...)

    # etc etc

Using forms

The other way is to use a Django Form for each command. I prefer this approach, because it combines multiple closely related aspects:

  • execution of the command (what does it do?)
  • validation of the command parameters (can it do this?)
  • presentation of the command (how can I do this?)

forms.py

class ActivateUserForm(forms.Form):

    user_id = IntegerField(widget = UsernameSelectWidget, verbose_name="Select a user to activate")
    # the username select widget is not a standard Django widget, I just made it up

    def clean_user_id(self):
        user_id = self.cleaned_data['user_id']
        if User.objects.get(pk=user_id).active:
            raise ValidationError("This user cannot be activated")
        # you can also check authorizations etc. 
        return user_id

    def execute(self):
        """
        This is not a standard method in the forms API; it is intended to replace the 
        'extract-data-from-form-in-view-and-do-stuff' pattern by a more testable pattern. 
        """
        user_id = self.cleaned_data['user_id']

        user = User.objects.get(pk=user_id)

        # set active flag
        user.active = True
        user.save()

        # mail user
        send_mail(...)

        # etc etc

Thinking in Queries

You example did not contain any queries, so I took the liberty of making up a few useful queries. I prefer to use the term "question", but queries is the classical terminology. Interesting queries are: "What is the name of this user?", "Can this user log in?", "Show me a list of deactivated users", and "What is the geographical distribution of deactivated users?"

Before embarking on answering these queries, you should always ask yourself two questions: is this a presentational query just for my templates, and/or a business logic query tied to executing my commands, and/or a reporting query.

Presentational queries are merely made to improve the user interface. The answers to business logic queries directly affect the execution of your commands. Reporting queries are merely for analytical purposes and have looser time constraints. These categories are not mutually exclusive.

The other question is: "do I have complete control over the answers?" For example, when querying the user's name (in this context) we do not have any control over the outcome, because we rely on an external API.

Making Queries

The most basic query in Django is the use of the Manager object:

User.objects.filter(active=True)

Of course, this only works if the data is actually represented in your data model. This is not always the case. In those cases, you can consider the options below.

Custom tags and filters

The first alternative is useful for queries that are merely presentational: custom tags and template filters.

template.html

<h1>Welcome, {{ user|friendly_name }}</h1>

template_tags.py

@register.filter
def friendly_name(user):
    return remote_api.get_cached_name(user.id)

Query methods

If your query is not merely presentational, you could add queries to your services.py (if you are using that), or introduce a queries.py module:

queries.py

def inactive_users():
    return User.objects.filter(active=False)


def users_called_publysher():
    for user in User.objects.all():
        if remote_api.get_cached_name(user.id) == "publysher":
            yield user 

Proxy models

Proxy models are very useful in the context of business logic and reporting. You basically define an enhanced subset of your model.

models.py

class InactiveUserManager(models.Manager):
    def get_query_set(self):
        query_set = super(InactiveUserManager, self).get_query_set()
        return query_set.filter(active=False)

class InactiveUser(User):
    """
    >>> for user in InactiveUser.objects.all():
    …        assert user.active is False 
    """

    objects = InactiveUserManager()
    class Meta:
        proxy = True

Query models

For queries that are inherently complex, but are executed quite often, there is the possibility of query models. A query model is a form of denormalization where relevant data for a single query is stored in a separate model. The trick of course is to keep the denormalized model in sync with the primary model. Query models can only be used if changes are entirely under your control.

models.py

class InactiveUserDistribution(models.Model):
    country = CharField(max_length=200)
    inactive_user_count = IntegerField(default=0)

The first option is to update these models in your commands. This is very useful if these models are only changed by one or two commands.

forms.py

class ActivateUserForm(forms.Form):
    # see above

    def execute(self):
        # see above
        query_model = InactiveUserDistribution.objects.get_or_create(country=user.country)
        query_model.inactive_user_count -= 1
        query_model.save()

A better option would be to use custom signals. These signals are of course emitted by your commands. Signals have the advantage that you can keep multiple query models in sync with your original model. Furthermore, signal processing can be offloaded to background tasks, using Celery or similar frameworks.

signals.py

user_activated = Signal(providing_args = ['user'])
user_deactivated = Signal(providing_args = ['user'])

forms.py

class ActivateUserForm(forms.Form):
    # see above

    def execute(self):
        # see above
        user_activated.send_robust(sender=self, user=user)

models.py

class InactiveUserDistribution(models.Model):
    # see above

@receiver(user_activated)
def on_user_activated(sender, **kwargs):
        user = kwargs['user']
        query_model = InactiveUserDistribution.objects.get_or_create(country=user.country)
        query_model.inactive_user_count -= 1
        query_model.save()

Keeping it clean

When using this approach, it becomes ridiculously easy to determine if your code stays clean. Just follow these guidelines:

  • Does my model contain methods that do more than managing database state? You should extract a command.
  • Does my model contain properties that do not map to database fields? You should extract a query.
  • Does my model reference infrastructure that is not my database (such as mail)? You should extract a command.

The same goes for views (because views often suffer from the same problem).

  • Does my view actively manage database models? You should extract a command.

Some References

Django documentation: proxy models

Django documentation: signals

Architecture: Domain Driven Design

by anonymous   2017-08-20

If you are looking for deeper insights in how to layer real software and what proper names they should have you should read about Domain Driven Design

First and classic book (be aware that it's very general). As for something practical you can check out this book or just google for some online examples.

by Ruben Bartelink   2017-08-20

Other hanselminutes episodes on testing:

Other podcasts:

Other questions like this:

Blog posts:

I know you didn't ask for books but... Can I also mention that Beck's TDD book is a must read, even though it may seem like a dated beginner book on first flick through (and Working Effectively with Legacy Code by Michael C. Feathers of course is the bible). Also, I'd append Martin(& Martin)'s Agile Principles, Patterns & Techniques as really helping in this regard. In this space (concise/distilled info on testing) also is the excellent Foundations of programming ebook. Goob books on testing I've read are The Art of Unit Testing and xUnit Test Patterns. The latter is an important antidote to the first as it is much more measured than Roy's book is very opinionated and offers a lot of unqualified 'facts' without properly going through the various options. Definitely recommend reading both books though. AOUT is very readable and gets you thinking, though it chooses specific [debatable] technologies; xUTP is in depth and neutral and really helps solidify your understanding. I read Pragmatic Unit Testing in C# with NUnit afterwards. It's good and balanced though slightly dated (it mentions RhinoMocks as a sidebar and doesnt mention Moq) - even if nothing is actually incorrect. An updated version of it would be a hands-down recommendation.

More recently I've re-read the Feathers book, which is timeless to a degree and covers important ground. However it's a more 'how, for 50 different wheres' in nature. It's definitely a must read though.

Most recently, I'm reading the excellent Growing Object-Oriented Software, Guided by Tests by Steve Freeman and Nat Pryce. I can't recommend it highly enough - it really ties everything together from big to small in terms of where TDD fits, and various levels of testing within a software architecture. While I'm throwing the kitchen sink in, Evans's DDD book is important too in terms of seeing the value of building things incrementally with maniacal refactoring in order to end up in a better place.