Exploring CQRS and Event Sourcing PDF
system to showcase various CQRS and ES concepts, challenges, and techniques. The development team
did not work in isolation; we actively sought input from industry experts and from a wide group of
advisors to ensure that the guidance is both detailed and practical.
The CQRS pattern and event sourcing are not mere simplistic solutions to the problems associated
with large-scale, distributed systems. By providing you with both a working application and written
guidance, we expect you’ll be well prepared to embark on your own CQRS journey.
guidance that includes both production quality source code and documentation. The guidance is
designed to help software development teams:
Make critical design and technology selection decisions by highlighting the appropriate solution
architectures, technologies, and Microsoft products for common scenarios
Dominic Betts
Julián Domínguez
Grigori Melnik
Fernando Simonazzi
Mani Subramanian
978-1-62114-017-7
This document is provided “as-is”. Information and views expressed in
this document, including URL and other Internet Web site references,
may change without notice.
Some examples depicted herein are provided for illustration only and are
fictitious. No real association or connection is intended or should be
inferred.
This document does not provide you with any legal rights to any
intellectual property in any Microsoft product. You may copy and use
this document for your internal, reference purposes. You may modify this
document for your internal, reference purposes.
© 2012 Microsoft. All rights reserved.
Microsoft, MSDN, SQL Azure, SQL Server, Visual Studio, Windows, and
Windows Azure are trademarks of the Microsoft group of companies. All
other trademarks are property of their respective owners.
Contents
Architecture
Conference Management bounded context
Patterns and concepts
Event sourcing
Identifying aggregates
Task-based UI
CRUD
Integration between bounded contexts
Pushing changes from the Conference Management bounded context
Pushing changes to the Conference Management bounded context
Choosing when to update the read-side data
Distributed transactions and event sourcing
Autonomy versus authority
Favoring autonomy
Favoring authority
Choosing between autonomy and authority
Approaches to implementing the read side
Eventual consistency
Implementation details
The Conference Management bounded context
Integration with the Orders and Registration bounded context
The Payments bounded context
Integration with online payment services, eventual consistency, and command validation
Event sourcing
Raising events when the state of an aggregate changes
Persisting events to the event store
Replaying events to rebuild state
Issues with the simple event store implementation
Windows Azure table storage-based event store
Calculating totals
Impact on testing
Timing issues
Involving the domain expert
Summary
More information
Journey 6: Versioning Our System
Working definitions for this chapter
User stories
No down time upgrade
Display remaining seat quantities
Handle zero-cost seats
Architecture
This is another excellent guide from the patterns & practices team—real software engineering with
no comforting illusions taken or offered. This guide provides a detailed journal of the practitioners
implementing a real production system using the CQRS and Event Sourcing patterns, and also highlights
the tradeoffs and teaches the principles that underlie them. The topics presented are relevant
and useful, especially if you are building highly scalable Windows Azure applications. You’ll be both
challenged and inspired!
—Scott Guthrie, Corporate Vice-President, Azure App Platform, Microsoft
Having participated and co-authored various guides from patterns & practices, the “CQRS Journey”
follows the same walkthrough, scenario-based style, but adding even more fresh empirical content.
It’s a true testament of a skilled development team without previous CQRS experience, going through
the journey of implementing a complex system and documenting their adventures and lessons learnt
in this diary. If I had to recommend to someone where to start with CQRS, I would definitely point
them to this guide.
—Matias Woloski, CTO, Auth10 LLC
The “CQRS Journey” guide is an excellent resource for developers who want to begin developing a
CQRS system or convert their current system. It’s a true “trial by fire” approach to the concepts and
implementation hurdles that a team would encounter when adopting CQRS. I would recommend
reading it twice as I picked up even more lessons the second time through.
—Dan Piessens, Lead Software Architect, Zywave
I think it’s a really big step in communication with the developer community. You not only share your
development experience with a broad audience (which is very valuable by itself) but you’re also open
for learning from the community. While working on real projects it’s difficult to stop, find some time
to structure your knowledge, prepare it in the form understandable for others. It’s very cool that you
found time and resources for such educational effort, I really appreciate this.
—Ksenia Mukhortova, Business Applications Developer, Intel
I’m very excited about A CQRS Journey for a number of reasons. It explores, with an even hand and a
fair mind, a topic where opinions are both diverse and numerous. True to its name, the guide captures
the progression of learning. Conclusions are not simply stated; they arrive as a result of experience.
Additionally, the project embraced a passionate community with a spirit of inclusion and transparency.
The result is friendly-to-read guidance that is both diligent in execution and rigorous in its research.
—Christopher Bennage, Software Development Engineer, Microsoft
The journey project used Windows Azure SQL Database (backing write & read models), Service Bus
(for reliable messaging), and Tables (for event store). Production-quality, scalable cloud services that
can be provisioned on-demand with a few mouse-clicks (or API calls) can turn some tough infrastructure
problems into trivial ones.
—Bill Wilder, MVP, Independent Consultant
Perhaps the best lessons out of this guidance will be just how easy it is to work with Microsoft now
that they are embracing more community and open source.
—Adam Dymitruk, Systems Architect
The work that patterns & practices is doing here is very important as it is packaging the concepts in
a digestible fashion and helping developers to wade through the ambiguities of CQRS. The real world
experiences captured within the journey project will be invaluable to folks looking at applying CQRS
within their application development.
—Glenn Block, Senior Program Manager, Microsoft, Windows Azure SDK for Node.js,
Organizer at ALT.NET Seattle Chapter
The p&p team’s dedication and hard work go hand-in-hand with the very high level of competency
present on the team. Their attention to detail, insistence on clarity, and open collaboration with the
community all led to the creation of material representing enormous value to consumers of the guidance.
I definitely plan on referencing this material and code in future engagements because I think my
clients will derive many benefits from it–a win-win for everyone!
—Josh Elster, Principal, Liquid Electron
CQRS is a very important pattern, and a tool that any cloud developer should have in his or her toolbelt.
It is particularly well-suited for the cloud since it allows for the implementation of massively
scalable solutions based on simple, common patterns (like queues, event handlers, and view models,
to name a few). Like all patterns, there are several concrete, correct ways of implementing CQRS. A
journey of the type undertaken by Microsoft’s patterns & practices team is a great way to explore the
different options, tradeoffs, and even possible mistakes one can make along the way, and accelerate
one’s learning of the CQRS pattern.
—Shy Cohen, Principal, Shy Cohen Consulting
patterns & practices assembled many of the active and key people in the CQRS community to join
them on their journey with CQRS and along the way discovered confusing terminology and concepts
that created opportunities for leaders in the community to bring clarity to a broad audience.
The material produced is influenced from the results of building a real world application and expresses
the experiences from advisors and the patterns & practices team during the development
process. By request from the community to allow outside contributions, everything has been open
sourced on GitHub. Anyone interested is encouraged to take a look at the guide or implementation.
The patterns & practices team has been very welcoming to anyone who wants to collaborate on
covering additional areas, alternative implementations or further extending what is currently in place.
—Kelly Sommers, Developer
Congratulations on getting to what looks to be nice guidance. I know that the announcement that
p&p was going to embark on this project caused a twitter firestorm but you seem to have come
through it well. I’m a fan of the p&p books and think you’ve done a great job in sharing good practices
with the community.
—Neil Mackenzie, Windows Azure MVP
CQRS is as much about architecture community as it is about concrete patterns—thus the project is
aptly named “CQRS Journey.” The community involvement and engagement in this project is unprecedented
for Microsoft and reflects the enthusiasm amongst the many (if I may say: young) software
architects from across the industry who are rediscovering proven architecture patterns and are recomposing
them in new ways to solve today’s challenges. For me, one takeaway from this project is that
the recipes developed here need to be carefully weighed against their alternatives. As with any software
architecture approaches that promise easy scalability or evolvability of solutions, the proof will
be in concrete, larger production implementations and how they hold up to changing needs over time.
Thus, the results of this Journey project mark a start and not a finish line.
—Clemens Vasters, Principal Technical Lead, Microsoft Corporation
The experiences and conclusions of the p&p team match up very well with our own real-world experiences.
Their conclusions in Chapter 8 are spot on. One of the best aspects of this guidance is that
the p&p team exposes more of their thought processes and learning throughout the Journey than
most write-ups that you may read. From arguments between Developer 1 and Developer 2 on the
team, to discussions with experts such as Greg Young and Udi Dahan, to an excellent post-project
review in Chapter 8, the thought process is out there for you to learn from.
Thanks for this great work, guys. I hope you keep this style with your upcoming guidance pieces.
—Jon Wagner, SVP & Chief Architect, eMoney Advisor
The CQRS journey release by patterns & practices provides real world insight into the increasingly
popular CQRS pattern used in distributed systems that rely upon asynchronous, message based approaches
to achieve very large scale. The exploration of the issues the team faced throughout the
implementation of the pattern is extremely useful for organizations considering CQRS, both to determine
where the pattern is appropriate for them, and to go into the design and implementation with a
baseline understanding of the complexity it will introduce. I really enjoyed the candor around the
approach taken, the issues encountered, and the early design choices that the team would change in
hindsight. This is a must read for any organization embarking upon CQRS, regardless of what platform
they are using.
—Chris Keyser, VP Engineering, CaseNetwork
I’d like to personally thank the team for putting together such a transparent journey throughout this
project. I’m very pleased with the final release.
—Truong Nguyen, CEO, Nepsoft
I started off the new year on January 3rd with a few hour long meeting showing the team at patterns
& practices a bit about Command and Query Responsibility Segregation (CQRS) and Event Sourcing
(ES). Most of the team had previously not been exposed to these ideas. Today is almost exactly six
months later and they have produced a document of over 200 pages of discussions and guidance as
well as a full end to end example hosted in Windows Azure. This is certainly not a small feat.
When the announcement of the project came out, the twitter stream near instantly went negative
as many thought that Microsoft was building a CQRS framework; which was premature from the
community. The process followed similar paths to other patterns & practices projects with a large
advisor board being set up. I believe however that the most interesting part of the process was the
decision to host the work on GitHub and allow pull requests which is an extremely open and transparent
way of communicating during the project.
One of the main benefits for the community as a whole of going through such a process is that
people were forced to refine their vocabularies. There are in the DDD/CQRS/ES communities many
different voices and often times, especially in younger groups, vocabularies will go down divergent paths
leading to a fractured community. An example of nebulous terminologies can be seen in the terms “saga,”
“process manager,” and “workflow”; the community as a whole I believe benefited from the discussions
over defining what each actually is. One of the most interesting conversations brought up for me personally
was defining the difference between an Event Store and a Transaction Log as legitimate arguments
can be made that either is a higher level abstraction of the other. This has led not only to many interesting
discussions in the community but to a far stricter future definition of what an Event Store is.
“For the things we have to learn before we can do them, we learn by doing them. ~Aristotle”
The quote above was the team motto during the project. Many will be looking towards the guidance
presented as being authoritative guidance of how things should be done. This is however not the
optimal way to look at the guidance as presented (though it does contain many bits of good authoritative
guidance). The main benefit of the guidance is the learning experience that it contains. It is
important to remember that the team came into the ideas presented as non-experienced in CQRS and
they learned in the process of doing. This gives a unique perspective throughout much of the text
where things are learned along the way or are being seen through fresh eyes of someone recently
having learned and attempted to apply the ideas. This perspective has also brought up many interesting
conversations within the community. The patterns & practices team deserves credit for digging
deep, facilitating these discussions, and bringing to light various incongruities, confusions and
inconsistencies as they went along.
Keeping in mind the origination point of the team, the most valuable bits in the text that a reader
should focus on aside from general explanations are places where tradeoffs are discussed. There is an
unfortunate tendency to seek authoritative answers that “things should be done in this way” when
they in fact do not exist. There are many ways to proverbially skin a cat and all have their pros and
cons. The text is quite good at discussing alternative points of view that came up as possible answers,
or that received heavy discussion within the advisor group; these can often be seen in the “developer 1/
developer 2 discussions.” One such discussion I mentioned previously in defining the difference between
event sourcing and a transaction log. Many of these types of discussions come at the end of
the guidance.
How might things be approached differently? One of my favourite discussions towards the end
of the guidance dealing with performance is the independent realization that messaging is not equivalent
to distribution. This is a very hard lesson for many people to understand and the way that it
comes up rather organically and much like it would on most teams as a performance problem is a great
explanation. I can say 100 times to apply the first law of distributed computing, don’t distribute;
however seeing it from the eyes of a team dealing with a performance problem who has already made
the mistake of equating the two is a very understandable path and a great teaching tool. This section
also contains a smörgåsbord of information and insights in terms of how to build performant applications
in Windows Azure.
Out in the wild, there are plenty of naïve samples of CQRS/ES implementations, which are great
for describing the concepts. There are details and challenges that will not surface till you work on a
complex, real-world production system. The value of p&p’s sample application is that it uses a
fairly complex domain, and the team went through multiple releases and focused on infrastructure
hardening, performance optimizations, dealing with transient faults, versioning, and many other
practical issues that you face when implementing CQRS and ES.
As with any project, people may disagree with implementation choices and decisions made. It is
important to remember the scoping of the project. The guidance is not coming from an expert viewpoint
throughout the process, but that of a group “learning by doing.” The process was and remains
open to contributions, and in fact this version has been reviewed, validated, and guided by experts in
the community. In the spirit of OSS “send a pull request.” This guide can serve as a valuable point to
start discussions, clear up misconceptions, and refine how we explain things, as well as drive improvement
both in the guidance itself and in getting consistent viewpoints throughout the community.
In conclusion I think patterns & practices has delivered to the community a valuable service in the
presentation of this guidance. The view point the guidance is written from is both an uncommon and
valuable one. It has also really been a good overall exercise for the community in terms of setting the
bar for what is being discussed and refining of the vocabularies that people speak in. Combine this
with the amount of previously difficult to find Windows Azure guidance and the guidance becomes
quite valuable to someone getting into the ideas.
Greg Young
Preface
Why are we embarking on this journey?
This written guidance is itself split into three distinct sections that you can read independently: a
description of the journey we took as we learned about CQRS, a collection of CQRS reference materials,
and a collection of case studies that describe the experiences other teams have had with the
CQRS pattern. The map in Figure 1 illustrates the relationship between the first two sections: a
journey with some defined stopping points that enables us to explore a space.
Figure 1
A CQRS journey
A CQRS journey
This section is closely related to the reference implementation (RI) and the chapters follow the
chronology of the project to develop the RI. Each chapter describes relevant features of the domain model, infrastructure elements,
architecture, and user interface (UI) that the team was concerned with during that phase of the
project. Some parts of the system are discussed in several chapters, and this reflects the fact that the
team revisited certain areas during later stages. Each of these chapters discusses how and why particular
CQRS patterns and concepts apply to the design and development of particular bounded contexts,
describes the implementation, and highlights any implications for testing.
Other chapters look at the big picture. For example, there is a chapter that explains the rationale
for splitting the RI into the bounded contexts we chose, another chapter analyzes the implications of
our approach for versioning the system, and other chapters look at how the different bounded contexts
in the RI communicate with each other.
This section describes our journey as we learned about CQRS, and how we applied that learning
to the design and implementation of the RI. It is not prescriptive guidance and is not intended
to illustrate the only way to apply the CQRS approach to our RI. We have tried wherever possible
to capture alternative viewpoints through consultation with our advisors and to explain why we
made particular decisions. You may disagree with some of those decisions; please let us know at
cqrsjourney@microsoft.com.
This section of the written guidance makes frequent cross-references to the material in the second
section for readers who wish to explore any of the concepts or patterns in more detail.
CQRS reference
The second section of the written guidance is a collection of reference material collated from many
sources. It is not the definitive collection, but should contain enough material to help you to understand
the core patterns, concepts, and language of CQRS.
A CQRS journey
• Chapter 1, “The Contoso Conference Management System,” introduces our sample application
and our team of (fictional) experts.
• Chapter 2, “Decomposing the Domain,” provides a high-level view of the sample application
and describes the bounded contexts that make up the application.
• Chapter 3, “Orders and Registrations Bounded Context,” introduces our first bounded
context, explores some CQRS concepts, and describes some elements of our infrastructure.
• Chapter 4, “Extending and Enhancing the Orders and Registrations Bounded Context,”
describes adding new features to the bounded context and discusses our testing approach.
• Chapter 5, “Preparing for the V1 Release,” describes adding two new bounded contexts and
handling integration issues between them, and introduces our event-sourcing implementation.
This is our first pseudo-production release.
• Chapter 6, “Versioning Our System,” discusses how to version the system and handle upgrades
with minimal down time.
• Chapter 7, “Adding Resilience and Optimizing Performance,” describes what we did to make
the system more resilient to failure scenarios and how we optimized the performance of the
system. This was the last release of the system in our journey.
• Chapter 8, “Lessons Learned,” collects the key lessons we learned from our journey and
suggests how you might continue the journey.
CQRS reference
• Chapter 1, “CQRS in Context,” provides some context for CQRS, especially in relation to the
domain-driven design approach.
• Chapter 2, “Introducing the Command Query Responsibility Segregation Pattern,” provides a
conceptual overview of the CQRS pattern.
• Chapter 3, “Introducing Event Sourcing,” provides a conceptual overview of event sourcing.
• Chapter 4, “A CQRS and ES Deep Dive,” describes the CQRS pattern and event sourcing in
more depth.
• Chapter 5, “Communicating between Bounded Contexts,” describes some options for
communicating between bounded contexts.
• Chapter 6, “A Saga on Sagas,” explains our choice of terminology: process manager instead of
saga. It also describes the role of process managers.
• Chapter 7, “Technologies Used in the Reference Implementation,” provides a brief overview
of some of the other technologies we used, such as the Windows Azure Service Bus.
• Appendix 1, “Release Notes,” contains detailed instructions for downloading, building, and
running the sample application and test suites.
• Appendix 2, “Migrations,” contains instructions for performing the code and data migrations
between the pseudo-production releases of the Contoso Conference Management System.
• Non-trivial. The domain must be complex enough to exhibit real problems, but at the same
time simple enough for most people to understand without weeks of study. The problems
should involve dealing with temporal data, stale data, receiving out-of-order events, and versioning.
The domain should enable us to illustrate solutions using event sourcing, sagas, and
event merging.
• Collaborative. The domain must contain collaborative elements where multiple actors can
operate simultaneously on shared data.
• End to end. We wanted to be able to illustrate the concepts and patterns in action from the
back-end data store through to the user interface. This might include disconnected mobile and
smart clients.
• Cloud friendly. We wanted to have the option of hosting parts of the RI on Windows Azure
and be able to illustrate how you can use CQRS for cloud-hosted applications.
• Large. We wanted to be able to show how our domain can be broken down into multiple
bounded contexts to highlight when to use and when not to use CQRS. We also wanted to
illustrate how multiple architectural approaches (CQRS, CQRS/ES, and CRUD) and legacy
systems can co-exist within the same domain. We also wanted to show how multiple development
teams could carry out work in parallel.
• Easily deployable. The RI needed to be easily deployable so that you can install it and experiment
with it as you read this guidance.
As a result, we chose to implement the conference management system that Chapter 1, “Our Domain:
The Contoso Conference Management System,” introduces.
Arrow legend
Many illustrations in the guidance have arrows. Here is their associated meaning.
Figure 2
Legend for arrows: event message, command message, method call, flow of data, object relationship
Specifically, we’d like to acknowledge the following people who have contributed to the journey
in many different ways:
• Greg Young for your pragmatism, patience with us, continuous mentoring and irreplaceable
advice;
• Udi Dahan for challenging us and offering alternative views on many concepts;
• Clemens Vasters for pushing back on terminology and providing a very valuable perspective from
the distributed database field;
• Kelly Sommers for believing in us and bringing sanity to the community as well as for deep
technical insights;
• Adam Dymitruk for jumpstarting us on git and extending the RI;
• Glenn Block for encouraging us to go all the way with the OSS initiative and for introducing us
to many community members;
• Our GM Lori Brownell and our director Björn Rettig for providing sponsorship of the initiative
and believing in our vision;
• Scott Guthrie for supporting the project and helping amplify the message;
• Josh Elster for exploring and designing the MIL (Messaging Intermediate Language) and pushing
us to make it easier to follow the workflow of messages in code;
• Cesar De la Torre Llorente for helping us spike on the alternatives and bringing up terminological
incongruities between various schools and thought leaders;
• Rinat Abdullin for active participation at the beginning of the project and contributing a case
study;
• Bruno Terkaly and Ricardo Villalobos for exploring the disconnected client scenario that would
integrate with the RI;
• Einar Otto Stangvik for spiking on the Schedule Builder bounded context implementation in
Node.js;
• Mark Seemann for sending the very first pull request focusing on code quality;
• Christopher Bennage for helping us overcome GitHub limitations by creating the pundit review
system and the export-to-Excel script to manage iteration backlog more effectively;
• Bob Brumfield, Eugenio Pace, Carlos Farre, Hanz Zhang, and Rohit Sharma for many insights
especially on the perf and hardening challenges;
• Chris Tavares for putting out the first CQRS experiment at p&p and suggesting valuable scenarios;
• Tim Sharkinian for your perspectives on CQRS and for getting us on the SpecFlow train;
• Jane Sinyagina for helping solicit and process feedback from the advisors;
• Howard Wooten and Thomas Petchel for feedback on the UI style and usability;
• Kelly Leahy for sharing your experience and making us aware of potential pitfalls;
• Dylan Smith for early conversations and support of this project in pre-flight times;
• Evan Cooke, Tim Walton, Alex Dubinkov, Scott Brown, Jon Wagner, and Gabriel N. Schenker for
sharing your experiences and contributing mini-case studies.
We feel honored to be supported by such an incredible group of people.
Thank you!
Journey 1: Our Domain: Conference Management System
The starting point: Where have we come from,
what are we taking, and who is coming with us?
This chapter introduces a fictitious company named Contoso. It describes Contoso’s plans to launch
the Contoso Conference Management System, a new online service that will enable other companies
or individuals to organize and manage their own conferences and events. This chapter describes, at a
high-level, some of the functional and non-functional requirements of the new system, and why
Contoso wants to implement parts of it using the Command Query Responsibility Segregation (CQRS)
pattern and event sourcing (ES). As with any company considering this process, there are many issues
to consider and challenges to be met, particularly because this is the first time Contoso has used both
the CQRS pattern and event sourcing. The chapters that follow show, step by step, how Contoso
designed and built its conference management application.
This chapter also introduces a panel of fictional experts to comment on the development efforts.
“Defining the CQRS pattern is easy. Realizing the benefits that implementing the
CQRS pattern can offer is not always so straightforward.”
“It’s not easy to balance the needs of the company, the users, the IT organization, the
developers, and the technical platforms we rely on.”
“I don’t care what architecture you want to use for the application; I’ll make it work.”
Carlos is the domain expert. He understands all the ins and outs of
conference management. He has worked in a number of organizations that
help people run conferences. He has also worked in a number of different
roles: sales and marketing, conference management, and consultant.
“I want to make sure that the team understands how this business works so that we can
deliver a world-class online conference management system.”
“Running complex applications in the cloud involves challenges that are different than
the challenges in managing on-premises applications. I want to make sure our new
conference management system meets our published service-level agreements (SLA).”
Beth is a business manager. She helps companies to plan how their business will
develop. She understands the market that the company operates in, the resources
that the company has available, and the goals of the company. She has both a
strategic view and an interest in the day-to-day operations of the company.
“Organizations face many conflicting demands on their resources. I want to make sure that our
company balances those demands and adopts a business plan that will make us successful in the
medium and long term.”
If you have a particular area of interest, look for notes provided by the specialists whose interests
align with yours.
Creating a conference
A business customer can create new conferences and manage information about the conference such
as its name, description, and dates. The business customer can also make a conference visible on the
Contoso Conference Management System website by publishing it, or hide it by unpublishing it.
Additionally, the business customer defines the seat types and available quantity of each seat type
for the conference.
Contoso also plans to enable the business customer to specify the following characteristics of a
conference:
• Whether the paper submission process will require reviewers.
• What the fee structure for paying Contoso will be.
• Who key personnel, such as the program chair and the event planner, will be.
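The conference operations described in this section (creating a conference, publishing or unpublishing it, and defining seat types with quantities) can be sketched as a small data structure. The following Python sketch is illustrative only: the class names, field names, the rule that quantities cannot be negative, and the choice to start new conferences unpublished are all assumptions made for this example, not details of the reference implementation.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class SeatType:
    """A kind of seat offered at a conference (hypothetical shape)."""
    name: str
    quantity: int  # number of seats of this type that are available


@dataclass
class Conference:
    """Information a business customer manages about one conference."""
    name: str
    description: str
    start_date: date
    end_date: date
    is_published: bool = False  # assumption: new conferences start hidden
    seat_types: List[SeatType] = field(default_factory=list)

    def publish(self) -> None:
        """Make the conference visible on the website."""
        self.is_published = True

    def unpublish(self) -> None:
        """Hide the conference from the website."""
        self.is_published = False

    def add_seat_type(self, name: str, quantity: int) -> SeatType:
        """Define a seat type and its available quantity."""
        if quantity < 0:  # assumption: negative quantities are rejected
            raise ValueError("quantity must not be negative")
        seat_type = SeatType(name, quantity)
        self.seat_types.append(seat_type)
        return seat_type
```

The planned characteristics listed above (reviewers, fee structure, key personnel) are left out because the text gives no detail about their shape.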
Nonfunctional requirements
Contoso has two major nonfunctional requirements for its conference management system, scalability
and flexibility, and it hopes that the CQRS pattern will help it meet them.
Scalability
The conference management system will be hosted in the cloud; one of the reasons Contoso chose a
cloud platform was its potential for elastic scalability.
Although cloud platforms such as Windows Azure enable you to scale applications by adding (or
removing) role instances, you must still design your application to be scalable. By splitting
responsibility for the application’s read and write operations into separate objects, the CQRS pattern allows
Contoso to split those operations into separate Windows Azure roles that can scale independently of
each other. This recognizes the fact that for many applications, the number of read operations vastly
exceeds the number of write operations. This gives Contoso the opportunity to scale the conference
management system more efficiently, and make better use of the Windows Azure role instances it uses.
Flexibility
The market that the Contoso Conference Management System operates in is very competitive, and very
fast moving. In order to compete, Contoso must be able to quickly and cost effectively adapt the
conference management system to changes in the market. This requirement for flexibility breaks down
into a number of related aspects:
• Contoso must be able to evolve the system to meet new requirements and to respond to changes in
the market.
• The system must be able to run multiple versions of its software simultaneously in order to
support customers who are in the middle of a conference and who do not wish to upgrade to a new
version immediately. Other customers may wish to migrate their existing conference data to a new
version of the software as it becomes available.
• Contoso intends the software to last for at least five years. It must be able to accommodate
significant changes over that period.
• Contoso does not want the complexity of some parts of the system to become a barrier to change.
• Contoso would like to be able to use different developers for different elements of the system,
using cheaper developers for simpler tasks and restricting its use of more expensive and
experienced developers to the more critical aspects of the system.
Contoso plans to compete by being quick to respond to changes in the market and to changing
customer requirements. Contoso must be able to evolve the system quickly and painlessly.
More information
All links in this book are accessible from the book’s online bibliography
available at: http://msdn.microsoft.com/en-us/library/jj619274.
In this chapter, we provide a high-level overview of the Contoso Conference Management System.
The discussion will help you understand the structure of the application, the integration points, and
how the parts of the application relate to each other.
Here we describe this high-level structure in terms borrowed from the domain-driven design
(DDD) approach that Eric Evans describes in his book, Domain-Driven Design: Tackling Complexity in
the Heart of Software (Addison-Wesley Professional, 2003). Although there is no universal consensus
that DDD is a prerequisite for implementing the Command Query Responsibility Segregation (CQRS)
pattern successfully, our team decided to use many of the concepts from the DDD approach, such as
domain, bounded context, and aggregate, in line with common practice within the CQRS community.
Chapter 1, “CQRS in Context,” in the Reference Guide discusses the relationship between the DDD
approach and the CQRS pattern in more detail.
Journey two
The business customer can specify the following information about a conference:
• The name, description, and slug (part of the URL used to access the conference).
• The start and end dates of the conference.
• The different types and quotas of seats available at the conference.
Additionally, the business customer can control the visibility of the conference on the public website
by either publishing or unpublishing the conference.
The business customer can also use the conference management website to view a list of orders
and attendees.
The Payments bounded context: The payments bounded context is responsible for managing the
interactions between the conference management system and external payment systems. It forwards
the necessary payment information to the external system and receives an acknowledgement that the
payment was either accepted or rejected. It reports the success or failure of the payment back to the
conference management system.
Initially, the payments bounded context will assume that the business customer has an account
with the third-party payment system (although not necessarily a merchant account), or that the
business customer will accept payment by invoice.
“The Allegator is the same, as the Crocodile, and differs only in Name.”
John Lawson
Journey three
Seat. A seat represents the right to be admitted to a conference or to access a specific session at
the conference such as a cocktail party, a tutorial, or a workshop. The business customer may change
the quota of seats for each conference. The business customer may also change the quota of seats for
each session.
Reservation. A reservation is a temporary hold on one or more seats. The ordering process
creates reservations. When a registrant begins the ordering process, the system makes reservations
for the number of seats requested by the registrant. These seats are then not available for other
registrants to reserve. The reservations are held for n minutes during which the registrant can complete
the ordering process by making a payment for those seats. If the registrant does not pay for the seats
within n minutes, the system cancels the reservation and the seats become available to other
registrants to reserve.
Seat availability. Every conference tracks seat availability for each type of seat. Initially, all of the
seats are available to reserve and purchase. When a seat is reserved, the number of available seats of
that type is decremented. If the system cancels the reservation, the number of available seats of that
type is incremented. The business customer defines the initial number of each seat type to be made
available; this is an attribute of a conference. A conference owner may adjust the numbers for the
individual seat types.
Conference site. You can access every conference defined in the system by using a unique URL.
Registrants can begin the ordering process from this site.
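The seat availability rules above amount to simple bookkeeping. The following is a minimal sketch,
not code from the reference implementation: the SeatAvailabilityTracker class, its method names,
and the use of a string to identify a seat type are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: tracks how many seats of each type remain available.
public class SeatAvailabilityTracker
{
    // Remembers what a reservation is holding so it can be given back on cancel.
    private sealed class Hold
    {
        public string SeatType;
        public int Quantity;
    }

    private readonly Dictionary<string, int> available = new Dictionary<string, int>();
    private readonly Dictionary<Guid, Hold> holds = new Dictionary<Guid, Hold>();

    // The business customer defines the initial quota for each seat type.
    public void DefineSeatType(string seatType, int quota)
    {
        this.available[seatType] = quota;
    }

    public int Available(string seatType)
    {
        return this.available[seatType];
    }

    // Decrements availability; returns false and changes nothing if too few seats remain.
    public bool TryReserve(Guid reservationId, string seatType, int quantity)
    {
        if (quantity <= 0) throw new ArgumentOutOfRangeException(nameof(quantity));
        int remaining;
        if (!this.available.TryGetValue(seatType, out remaining) || remaining < quantity)
        {
            return false;
        }
        this.available[seatType] = remaining - quantity;
        this.holds[reservationId] = new Hold { SeatType = seatType, Quantity = quantity };
        return true;
    }

    // Increments availability again when the system cancels a reservation.
    public void Cancel(Guid reservationId)
    {
        Hold hold;
        if (this.holds.TryGetValue(reservationId, out hold))
        {
            this.available[hold.SeatType] += hold.Quantity;
            this.holds.Remove(reservationId);
        }
    }
}
```

With a quota of 10, reserving 4 seats leaves 6 available, a further request for 7 is refused, and
cancelling the first reservation restores the count to 10.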
Each of the terms defined here was formulated through active discussions between the
development team and the domain experts. The following is a sample conversation between
developers and domain experts that illustrates how the team arrived at a definition of the term
attendee.
Developer 1: Here’s an initial stab at a definition for attendee. “An attendee is someone who
has paid to attend a conference. An attendee can interact with the system to perform tasks
such as manage his agenda, print his badge, and provide feedback after the conference.”
Domain Expert 1: Not all attendees will pay to attend the conference. For example, some
conferences will have volunteer helpers, also speakers typically don’t pay. And, there may be
some cases where an attendee gets a 100% discount.
Domain Expert 1: Don’t forget that it’s not the attendee who pays; that’s done by the regis-
trant.
Developer 1: So we need to say that Attendees are people who are authorized to attend a
conference?
Developer 2: We need to be careful about the choice of words here. The term authorized will
make some people think of security and authentication and authorization.
Developer 1: How about entitled?
Domain Expert 1: When the system performs tasks such as printing badges, it will need to
know what type of attendee the badge is for. For example, speaker, volunteer, paid attendee,
and so on.
Orders and Registrations Bounded Context
Developer 1: Now we have this as a definition that captures everything we’ve discussed. An
attendee is someone who is entitled to attend a conference. An attendee can interact with the
system to perform tasks such as manage his agenda, print his badge, and provide feedback after
the conference. An attendee could also be a person who doesn’t pay to attend a conference
such as a volunteer, speaker, or someone with a 100% discount. An attendee may have multiple
associated attendee types (speaker, student, volunteer, track chair, and so on).
Figure 1
Ordering UI mockups
These UI mockups helped the team in several ways, allowing them to:
• Communicate the core team’s vision for the system to the graphic designers who are on an
independent team at a third-party company.
• Communicate the domain expert’s knowledge to the developers.
• Refine the definition of terms in the ubiquitous language.
• Explore “what if” questions about alternative scenarios and
approaches.
• Form the basis for the system’s suite of acceptance tests.
Architecture
The application is designed to deploy to Windows Azure. At this stage in the journey, the
application consists of a web role that contains the ASP.NET MVC web application and a worker role
that contains the message handlers and domain objects. The application uses a Windows Azure SQL
Database instance for data storage, both on the write side and the read side. The application uses
the Windows Azure Service Bus to provide its messaging infrastructure.
While you are exploring and testing the solution, you can run it locally, either using the Windows
Azure compute emulator or by running the MVC web application directly and running a console
application that hosts the handlers and domain objects. When you run the application locally, you
can use a local SQL Server Express database instead of SQL Database, and use a simple messaging
infrastructure implemented in a SQL Server Express database.
For more information about the options for running the application, see Appendix 1, “Release Notes.”
A frequently cited advantage of the CQRS pattern is that it enables you to scale the read side and
write side of the application independently to support the different usage patterns. In this
bounded context, however, the number of read operations from the UI is not likely to hugely
outnumber the write operations: this bounded context focuses on registrants creating orders.
Therefore, the read side and the write side are deployed to the same Windows Azure worker role
rather than to two separate worker roles that could be scaled independently.
This scenario considers what happens when a registrant tries to book several seats at a
conference. The system must:
• Check that sufficient seats are available.
• Record details of the registration.
• Update the total number of seats booked for the conference.
We deliberately kept the scenario simple to avoid distractions while the team examines the
alternatives. These examples do not illustrate the final implementation of this bounded context.
The first approach considered by the team, shown in Figure 2, uses two separate aggregates.
Figure 2
Approach 1: Two separate aggregates
The second approach considered by the team, shown in Figure 3, uses a single aggregate in place of two.
Figure 3
Approach 2: A single aggregate
The third approach considered by the team, shown in Figure 4, uses a process manager to coordinate
the interaction between two aggregates.
Figure 4
Approach 3: Using a process manager
In the first model, the validation must take place in either the Order or SeatsAvailability
aggregate. If it is the former, the Order aggregate must discover the current seat availability
from the SeatsAvailability aggregate before the reservation is made and before it raises the
event. If it is the latter, the SeatsAvailability aggregate must somehow notify the Order
aggregate that it cannot reserve the seats, and that the Order aggregate must undo (or compensate
for) any work that it has completed so far.
Undo is just one of many compensating actions that occur in real life. The compensating actions
could even be outside of the system implementation and involve human actors: for example, a
Contoso clerk or the business customer calls the registrant to tell them that an error was made
and that they should ignore the last confirmation email they received from the Contoso system.
The second model behaves similarly, except that it is Order and SeatsAvailability entities
cooperating within a Conference aggregate.
In the third model, with the process manager, the aggregates exchange messages through the
process manager about whether the registrant can make the reservation at the current time.
All three models require entities to communicate about the validation process, but the third
model with the process manager appears more complex than the other two.
Transaction boundaries
An aggregate, in the DDD approach, represents a consistency boundary. Therefore, the first model
with two aggregates, and the third model with two aggregates and a process manager will involve
two transactions: one when the system persists the new Order aggregate and one when the system
persists the updated SeatsAvailability aggregate.
The term consistency boundary refers to a boundary within which
you can assume that all the elements remain consistent with each
other all the time.
To ensure the consistency of the system when a registrant creates an order, both transactions
must succeed. Because the two transactions are separate, the system can only be eventually
consistent; to guarantee that it gets there, the infrastructure must reliably deliver messages
to aggregates.
In the second approach, which uses a single aggregate, we will
only have a single transaction when a registrant makes an order. This
appears to be the simplest approach of the three.
Concurrency
The registration process takes place in a multi-user environment where many registrants could
attempt to purchase seats simultaneously. The team decided to use the Reservation pattern to address
the concurrency issues in the registration process. In this scenario, this means that a registrant
initially reserves seats (which are then unavailable to other registrants); if the registrant completes the
payment within a timeout period, the system retains the reservation; otherwise the system cancels
the reservation.
This reservation system introduces the need for additional message types; for example, an event
to report that a registrant has made a payment, or report that a timeout has occurred.
This timeout also requires the system to incorporate a timer somewhere to track when
reservations expire.
Modeling this complex behavior with sequences of messages and the requirement for a timer is
best done using a process manager.
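To make the timeout behavior concrete, here is a minimal sketch of a single reservation modeled
as a small state machine. The ReservationProcess class, its state names, and the choice to pass
the current time in with each message (rather than reading a clock) are assumptions made for
illustration; the process manager in the reference implementation is more involved.

```csharp
using System;

public enum ReservationState { AwaitingPayment, Confirmed, Expired }

// Hypothetical sketch of the Reservation pattern: seats are held until
// either a payment message or a timeout message settles the reservation.
public class ReservationProcess
{
    private readonly DateTime expiresAt;

    public ReservationProcess(DateTime createdAt, TimeSpan timeout)
    {
        this.expiresAt = createdAt + timeout;
        this.State = ReservationState.AwaitingPayment;
    }

    public ReservationState State { get; private set; }

    // Message: the registrant has made a payment.
    public void HandlePaymentReceived(DateTime now)
    {
        if (this.State != ReservationState.AwaitingPayment)
        {
            return; // already settled; a late or duplicate message changes nothing
        }
        this.State = now < this.expiresAt
            ? ReservationState.Confirmed
            : ReservationState.Expired;
    }

    // Message: the timer fired. A timer that fires early is ignored.
    public void HandleTimeout(DateTime now)
    {
        if (this.State == ReservationState.AwaitingPayment && now >= this.expiresAt)
        {
            this.State = ReservationState.Expired;
        }
    }
}
```

Passing the time in as a parameter keeps the sketch deterministic, which makes the transitions
easy to exercise in a unit test without waiting for a real timer.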
Implementation details
This section describes some of the significant features of the Orders and Registrations bounded
context implementation. You may find it useful to have a copy of the code so you can follow along.
You can download it from the Download center, or check the evolution of the code in the repository
on GitHub: mspnp/cqrs-journey-code.
Do not expect the code samples to match the code in the reference implementation exactly. This
chapter describes a step in the CQRS journey; the implementation may well change as we learn
more and refactor the code.
High-level architecture
As we described in the previous section, the team initially decided to implement the reservations
story in the conference management system using the CQRS pattern but without using event
sourcing. Figure 5 shows the key elements of the implementation: an MVC web application, a data store
implemented using a Windows Azure SQL Database instance, the read and write models, and some
infrastructure components.
We’ll describe what goes on inside the read and write models later in this section.
Figure 5
High-level architecture of the registrations bounded context
The following sections relate to the numbers in Figure 5 and provide more detail about these
elements of the architecture.
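        // "conference" is the entity loaded from the data store earlier in this
        // method (those lines are not shown). The DTO needs its own variable name;
        // reusing "conference" here would not compile.
        var conferenceDTO =
            new Conference.Web.Public.Models.Conference
            {
                Code = conference.Code,
                Name = conference.Name,
                Description = conference.Description
            };
        return conferenceDTO;
    }
}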
The read model retrieves the information from the data store and returns it to the controller using a
data transfer object (DTO) class.
2. Issuing commands
The web application sends commands to the write model through a command bus. This command bus
is an infrastructure element that provides reliable messaging. In this scenario, the bus delivers
messages asynchronously and once only to a single recipient.
The RegistrationController class can send a RegisterToConference command to the write
model in response to user interaction. This command sends a request to register one or more seats at
the conference. The RegistrationController class then polls the read model to discover whether the
registration request succeeded. See the section “6. Polling the Read Model” below for more details.
The following code sample shows how the RegistrationController sends a RegisterToConference
command:
var viewModel = this.UpdateViewModel(conferenceCode, contentModel);
var command =
    new RegisterToConference
    {
        OrderId = viewModel.Id,
        ConferenceId = viewModel.ConferenceId,
        Seats = viewModel.Items.Select(x =>
            new RegisterToConference.Seat
            {
                SeatTypeId = x.SeatTypeId,
                Quantity = x.Quantity
            }).ToList()
    };
this.commandBus.Send(command);
All of the commands are sent asynchronously and do not expect return values.
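The following sketch illustrates the two properties described in this section: sending a command
returns nothing and does not run any handler, and each command type has exactly one recipient. It
is an in-memory stand-in written for illustration, not the Service Bus based bus used in the
reference implementation. InMemoryCommandBus, Register, and ProcessPending are hypothetical names,
and the RegisterToConference class shown here is reduced to the minimum needed to compile. The
sketch is neither thread-safe nor durable.

```csharp
using System;
using System.Collections.Generic;

public interface ICommand
{
    Guid Id { get; }
}

// Hypothetical in-memory bus: Send only enqueues; a separate step dispatches.
public class InMemoryCommandBus
{
    private readonly Queue<ICommand> queue = new Queue<ICommand>();
    private readonly Dictionary<Type, Action<ICommand>> handlers =
        new Dictionary<Type, Action<ICommand>>();

    // A command has a single recipient, so a second registration is an error.
    public void Register<T>(Action<T> handler) where T : ICommand
    {
        if (this.handlers.ContainsKey(typeof(T)))
        {
            throw new InvalidOperationException(
                "A handler is already registered for " + typeof(T).Name);
        }
        this.handlers.Add(typeof(T), c => handler((T)c));
    }

    // Fire-and-forget: no return value, and no handler runs during this call.
    public void Send(ICommand command)
    {
        this.queue.Enqueue(command);
    }

    // Stands in for a background worker; returns the number of commands dispatched.
    // Throws KeyNotFoundException if a command has no registered handler.
    public int ProcessPending()
    {
        int count = 0;
        while (this.queue.Count > 0)
        {
            ICommand command = this.queue.Dequeue();
            this.handlers[command.GetType()](command);
            count++;
        }
        return count;
    }
}

// Reduced illustration of the command; the real class carries more data.
public class RegisterToConference : ICommand
{
    public RegisterToConference()
    {
        this.Id = Guid.NewGuid();
    }

    public Guid Id { get; private set; }
    public Guid OrderId { get; set; }
}
```

Because Send never reports an outcome, the sender has to learn the result some other way, which
is why the controller polls the read model.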
3. Handling commands
Command handlers register with the command bus; the command bus can then forward commands
to the correct handler.
The OrderCommandHandler class handles the RegisterToConference command sent from the
UI. Typically, the handler is responsible for initiating any business logic in the domain and for persisting
any state changes to the data store.
The following code sample shows how the OrderCommandHandler class handles the
RegisterToConference command:
        repository.Save(order);
    }
}
[HttpPost]
public ActionResult StartRegistration(string conferenceCode,
    OrderViewModel contentModel)
{
    ...
    this.commandBus.Send(command);
    if (draftOrder != null)
    {
        if (draftOrder.State == "Booked")
        {
            return RedirectToAction(
                "SpecifyPaymentDetails",
                new { conferenceCode = conferenceCode, orderId = viewModel.Id });
        }
        else if (draftOrder.State == "Rejected")
        {
            return View("ReservationRejected", viewModel);
        }
    }
The team later replaced this mechanism for checking whether the system saves the order with an
implementation of the Post-Redirect-Get pattern. The following code sample shows the new version
of the StartRegistration action method.
For more information about the Post-Redirect-Get pattern see the article Post/Redirect/Get on
Wikipedia.
[HttpPost]
public ActionResult StartRegistration(string conferenceCode,
    OrderViewModel contentModel)
{
    ...
    this.commandBus.Send(command);
    return RedirectToAction(
        "SpecifyRegistrantDetails",
        new { conferenceCode = conferenceCode, orderId = command.Id });
}
The action method now redirects to the SpecifyRegistrantDetails view immediately after it sends
the command. The following code sample shows how the SpecifyRegistrantDetails action polls for
the order in the repository before returning a view.
[HttpGet]
public ActionResult SpecifyRegistrantDetails(string conferenceCode, Guid orderId)
{
    var draftOrder = this.WaitUntilUpdated(orderId);
    ...
}
The advantages of this second approach, which uses the Post-Redirect-Get pattern instead of
polling in the StartRegistration post action, are that it works better with the browser’s forward
and back navigation buttons, and that it gives the infrastructure more time to process the
command before the MVC controller starts polling.
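The polling step can be sketched as a small helper that repeatedly queries the read model until a
result appears or a deadline passes. This is an illustration only: the body of WaitUntilUpdated is
not shown in this chapter, and the Polling class, the WaitUntil method, and its parameters are
hypothetical names.

```csharp
using System;
using System.Threading;

public static class Polling
{
    // Calls 'query' until it returns a non-null result or 'timeout' elapses.
    // Returns null if nothing turned up in time.
    public static T WaitUntil<T>(Func<T> query, TimeSpan timeout, TimeSpan interval)
        where T : class
    {
        DateTime deadline = DateTime.UtcNow + timeout;
        while (true)
        {
            T result = query();
            if (result != null)
            {
                return result;
            }
            if (DateTime.UtcNow >= deadline)
            {
                return null;
            }
            Thread.Sleep(interval);
        }
    }
}
```

A controller action could call such a helper with a lambda that looks the order up in the read
model, and show an "unknown status" view when the helper returns null. The query always runs at
least once, even with a zero timeout.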
Aggregates
The following code sample shows the Order aggregate.
public class Order : IAggregateRoot, IEventPublisher
{
    public static class States
    {
        public const int Created = 0;
        public const int Booked = 1;
        public const int Rejected = 2;
        public const int Confirmed = 3;
    }
    ...
        this.State = States.Booked;
    }
        this.State = States.Rejected;
    }
}
Notice how the properties of the class are not virtual. In the original version of this class, the
properties Id, UserId, ConferenceId, and State were all marked as virtual. The following conversation
between two developers explores this decision.
Developer 1: I’m really convinced you should not make the property virtual, except if required by
the object-relational mapping (ORM) layer. If this is just for testing purposes, entities and
aggregate roots should never be tested using mocking. If you need mocking to test your entities, this is
a clear smell that something is wrong in the design.
Developer 2: I prefer to be open and extensible by default. You never know what needs may arise
in the future, and making things virtual is hardly a cost. This is certainly controversial and a bit
non-standard in .NET, but I think it’s OK. We may only need virtuals on lazy-loaded collections.
Developer 1: Since CQRS usually makes the need for lazy load vanish, you should not need it
either. This leads to even simpler code.
Developer 2: CQRS does not dictate usage of event sourcing (ES), so if you’re using an aggregate
root that contains an object graph, you’d need that anyway, right?
Developer 1: This is not about ES, it’s about DDD. When your aggregate boundaries are right,
you don’t need delay loading.
Developer 2: To be clear, the aggregate boundary is here to group things that should change
together for reasons of consistency. A lazy load would indicate that things that have been grouped
together don’t really need this grouping.
Developer 1: I agree. I have found that lazy-loading in the command side means I have it modeled
wrong. If I don’t need the value in the command side, then it shouldn’t be there. In addition, I
dislike virtuals unless they have an intended purpose (or some artificial requirement from an
object-relational mapping (ORM) tool). In my opinion, it violates the Open-Closed principle: you have
opened yourself up for modification in a variety of ways that may or may not be intended and
where the repercussions might not be immediately discoverable, if at all.
Developer 2: Our Order aggregate in the model has a list of Order Items. Surely we don’t need
to load the lines to mark it as Booked? Do we have it modeled wrong there?
Developer 1: Is the list of Order Items that long? If it is, the modeling may be wrong because you
don’t necessarily need transactionality at that level. Often, doing a late round trip to get the
updated Order Items can be more costly than loading them up front: you should evaluate the usual
size of the collection and do some performance measurement. Make it simple first, optimize if
needed.
—Thanks to Jérémie Chassaing and Craig Wilson
Figure 6 shows the entities that exist in the write-side model. There are two aggregates, Order and
SeatsAvailability, each one containing multiple entity types. Also there is a
RegistrationProcessManager class to manage the interaction between the aggregates.
The table in Figure 6 shows how the process manager behaves given a current state and a
particular type of incoming message.
Figure 6
Domain objects in the write model
The process of registering for a conference begins when the UI sends a RegisterToConference
command. The infrastructure delivers this command to the Order aggregate. The result of this command
is that the system creates a new Order instance, and that the new Order instance raises an
OrderPlaced event. The following code sample from the constructor in the Order class shows this
happening. Notice how the system uses GUIDs to identify the different entities.
    this.events.Add(
        new OrderPlaced
        {
            OrderId = this.Id,
            ConferenceId = this.ConferenceId,
            UserId = this.UserId,
            Seats = this.Lines.Select(x =>
                new OrderPlaced.Seat
                {
                    SeatTypeId = x.SeatTypeId,
                    Quantity = x.Quantity
                }).ToArray()
        });
}
To see how the infrastructure elements deliver commands and events, see Figure 7.
The system creates a new RegistrationProcessManager instance to manage the new order. The
following code sample from the RegistrationProcessManager class shows how the process manager
handles the event.
        this.AddCommand(
            new MakeSeatReservation
            {
                ConferenceId = message.ConferenceId,
                ReservationId = this.ReservationId,
                NumberOfSeats = message.Items.Sum(x => x.Quantity)
            });
    }
    else
    {
        throw new InvalidOperationException();
    }
}
The code sample shows how the process manager changes its state and sends a new
MakeSeatReservation command that the SeatsAvailability aggregate handles. The code sample also
illustrates how the process manager is implemented as a state machine that receives messages,
changes its state, and sends new messages.
When the SeatsAvailability aggregate receives a MakeSeatReservation command, it makes a
reservation if there are enough available seats. The following code sample shows how the
SeatsAvailability class raises different events depending on whether or not there are sufficient
seats.
            {
                Delay = TimeSpan.FromMinutes(15),
            });
    }
    else
    {
        throw new InvalidOperationException();
    }
}
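The behavior described for the SeatsAvailability aggregate, raising a different event depending on
whether enough seats remain, can be sketched as follows. This is an illustration under stated
assumptions rather than the class from the reference implementation: SeatsAvailabilitySketch, the
IEvent marker interface, and the ReservationAccepted event with its properties are hypothetical;
only the MakeReservation method name and the ReservationRejected event name come from this chapter.

```csharp
using System;
using System.Collections.Generic;

public interface IEvent { }

// Assumed event raised when the seats could be reserved.
public class ReservationAccepted : IEvent
{
    public Guid ReservationId { get; set; }
    public int NumberOfSeats { get; set; }
}

// Event raised when too few seats remain.
public class ReservationRejected : IEvent
{
    public Guid ReservationId { get; set; }
}

// Hypothetical stand-in for the SeatsAvailability aggregate.
public class SeatsAvailabilitySketch
{
    private readonly List<IEvent> events = new List<IEvent>();
    private int remainingSeats;

    public SeatsAvailabilitySketch(int totalSeats)
    {
        this.remainingSeats = totalSeats;
    }

    // Events raised so far, in order; infrastructure would publish these.
    public IList<IEvent> Events
    {
        get { return this.events; }
    }

    public int RemainingSeats
    {
        get { return this.remainingSeats; }
    }

    public void MakeReservation(Guid reservationId, int numberOfSeats)
    {
        if (numberOfSeats <= this.remainingSeats)
        {
            // Enough seats: take them and report success.
            this.remainingSeats -= numberOfSeats;
            this.events.Add(new ReservationAccepted
            {
                ReservationId = reservationId,
                NumberOfSeats = numberOfSeats
            });
        }
        else
        {
            // Not enough seats: state is unchanged and the failure is reported.
            this.events.Add(new ReservationRejected { ReservationId = reservationId });
        }
    }
}
```

A process manager that subscribes to both events can then decide whether to mark the order as
booked or to reject it.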
Infrastructure
The sequence diagram in Figure 7 shows how the infrastructure elements interact with the domain
objects to deliver messages.
Figure 7
Infrastructure sequence diagram
A typical interaction begins when an MVC controller in the UI sends a message using the command
bus. The message sender invokes the Send method on the command bus asynchronously. The
command bus then stores the message until the message recipient retrieves the message and forwards it
to the appropriate handler. The system includes a number of command handlers that register with the
command bus to handle specific types of commands. For example, the OrderCommandHandler class
defines handler methods for the RegisterToConference, MarkOrderAsBooked, and RejectOrder
commands. The following code sample shows the handler method for the MarkOrderAsBooked
command. Handler methods are responsible for locating the correct aggregate instance, calling
methods on that instance, and then saving that instance.
        if (order != null)
        {
            order.MarkAsBooked();
            repository.Save(order);
        }
    }
}
            repo.Save(process);
        }
    }
}
Figure 8
Message flows through a Windows Azure Service Bus topic
In the initial implementation, the CommandBus and EventBus classes are very similar. The only
difference between the Send method and the Publish method is that the Send method expects the
message to be wrapped in an Envelope class. The Envelope class enables the sender to specify a time
delay for the message delivery.
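As a rough sketch of the idea, an envelope can carry the message body together with a delay, and
the queue can hold the message back until the delay has elapsed. The shapes below are assumptions
for illustration: this Envelope class is a simplified stand-in, and DelayedQueueSketch with its
Send and ReceiveDue methods is hypothetical. In the reference implementation the Service Bus does
the scheduling.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in: wraps a message body and an optional delivery delay.
public class Envelope<T>
{
    public Envelope(T body)
    {
        this.Body = body;
    }

    public T Body { get; private set; }
    public TimeSpan Delay { get; set; }
}

// Hypothetical in-memory queue that honors Envelope.Delay.
public class DelayedQueueSketch<T>
{
    private sealed class Pending
    {
        public DateTime DueAt;
        public T Body;
    }

    private readonly List<Pending> pending = new List<Pending>();

    // 'now' is passed in so that the sketch does not depend on a real clock.
    public void Send(Envelope<T> envelope, DateTime now)
    {
        this.pending.Add(new Pending { DueAt = now + envelope.Delay, Body = envelope.Body });
    }

    // Removes and returns, in send order, every message whose delay has elapsed.
    public List<T> ReceiveDue(DateTime now)
    {
        var due = new List<T>();
        for (int i = this.pending.Count - 1; i >= 0; i--)
        {
            if (this.pending[i].DueAt <= now)
            {
                due.Add(this.pending[i].Body);
                this.pending.RemoveAt(i);
            }
        }
        due.Reverse(); // the loop ran backwards, so restore send order
        return due;
    }
}
```

A process manager could use a delayed message of this kind to send itself a reminder that a
reservation is about to expire.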
Events can have multiple recipients. In the example shown in Figure 8, the ReservationRejected
event is sent to the RegistrationProcessManager, the WaitListProcessManager, and one other
destination. The EventProcessor class identifies the list of handlers to receive the event by examining
its list of registered handlers.
A command has only one recipient. In Figure 8, the MakeSeatReservation command is sent to the
SeatsAvailability aggregate. There is just a single handler registered for this subscription. The
CommandProcessor class identifies the handler to receive the command by examining its list of
registered handlers.
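The fan-out side of this distinction can be sketched with a dispatcher that keeps a list of
handlers per event type and delivers each published event to all of them. This is an in-memory
illustration, not the EventProcessor class from the reference implementation; the
EventDispatcherSketch class and its Subscribe and Publish methods are hypothetical, and the
ReservationRejected class is reduced to a single property.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical dispatcher: any number of handlers may subscribe to an event type.
public class EventDispatcherSketch
{
    private readonly Dictionary<Type, List<Action<object>>> handlers =
        new Dictionary<Type, List<Action<object>>>();

    public void Subscribe<T>(Action<T> handler)
    {
        List<Action<object>> list;
        if (!this.handlers.TryGetValue(typeof(T), out list))
        {
            list = new List<Action<object>>();
            this.handlers[typeof(T)] = list;
        }
        list.Add(e => handler((T)e));
    }

    // Delivers the event to every subscriber of its exact type;
    // returns how many handlers received it (zero is not an error).
    public int Publish(object message)
    {
        List<Action<object>> list;
        if (!this.handlers.TryGetValue(message.GetType(), out list))
        {
            return 0;
        }
        foreach (Action<object> handle in list)
        {
            handle(message);
        }
        return list.Count;
    }
}

// Reduced illustration of the event.
public class ReservationRejected
{
    public Guid ReservationId { get; set; }
}
```

A command dispatcher differs in one respect: it would refuse a second registration for the same
command type, because a command must have exactly one handler.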
The following code sample from the SubscriptionReceiver class shows how it receives a message
from the topic subscription.
private SubscriptionClient client;
...
        try
        {
            message = this.receiveRetryPolicy
                .ExecuteAction(this.DoReceiveMessage);
        }
        catch (Exception e)
        {
            // Note the trailing space: without it the two literals run together.
            Trace.TraceError(
                "An unrecoverable error occurred while trying to receive " +
                "a new message:\r\n{0}",
                e);
            throw;
        }
        try
        {
            if (message == null)
            {
                Thread.Sleep(100);
                continue;
            }
        }
    }
}
object payload;
using (var stream = message.GetBody<Stream>())
using (var reader = new StreamReader(stream))
{
    payload = this.serializer.Deserialize(reader);
}
try
{
    ...
    ProcessMessage(payload);
    ...
}
catch (Exception e)
{
    // Too many failed deliveries: move the message to the dead-letter queue.
    if (args.Message.DeliveryCount > MaxProcessingRetries)
    {
        Trace.TraceWarning(
            "An error occurred while processing a new message and " +
            "will be dead-lettered:\r\n{0}",
            e);
        message.SafeDeadLetter(e.Message, e.ToString());
    }
    else
    {
        // Otherwise release the message so that it can be delivered again.
        Trace.TraceWarning(
            "An error occurred while processing a new message and " +
            "will be abandoned:\r\n{0}",
            e);
        message.SafeAbandon();
    }
    return;
}
This example uses an extension method to invoke the Complete and Abandon methods of the
BrokeredMessage reliably using the Transient Fault Handling Application Block.
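The idea behind such a "safe" wrapper can be sketched without the Service Bus SDK: retry an
operation a bounded number of times and report failure instead of throwing. The sketch below does
not use the Transient Fault Handling Application Block and has no back-off between attempts.
SafeMessagingSketch, TryWithRetries, and ChooseFailureAction are hypothetical names introduced
for illustration.

```csharp
using System;

public static class SafeMessagingSketch
{
    // Runs 'operation' up to maxAttempts times. Returns true on the first success,
    // false if every attempt threw. Exceptions go to 'onError' and are not rethrown.
    public static bool TryWithRetries(
        Action operation, int maxAttempts, Action<Exception> onError = null)
    {
        if (maxAttempts < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        }
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            try
            {
                operation();
                return true;
            }
            catch (Exception e)
            {
                if (onError != null)
                {
                    onError(e);
                }
            }
        }
        return false;
    }

    // Mirrors the branch in the sample above: a message that has been delivered
    // more than the allowed number of times is dead-lettered; otherwise abandoned.
    public static string ChooseFailureAction(int deliveryCount, int maxProcessingRetries)
    {
        return deliveryCount > maxProcessingRetries ? "dead-letter" : "abandon";
    }
}
```

Swallowing the exception is a deliberate choice in this sketch: if abandoning a message fails,
its lock eventually expires and the broker makes it available again anyway.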
How are commands and events serialized?
The Contoso Conference Management System uses the Json.NET serializer. For details on how the
application uses this serializer, see “Technologies Used in the Reference Implementation” in the
Reference Guide.
“You should consider whether you always need to use the Windows Azure Service Bus for commands.
Commands are typically used within a bounded context and you may not need to send them across a
process boundary (on the write side you may not need additional tiers), in which case you could
use an in memory queue to deliver your commands.”
Greg Young, conversation with the patterns & practices team
Impact on testing
Because this was the first bounded context the team tackled, one of the key concerns was how to
approach testing given that the team wanted to adopt a test-driven development approach. The
following conversation between two developers about how to do TDD when they are implementing the
CQRS pattern without event sourcing summarizes their thoughts:
Developer 1: If we were using event sourcing, it would be easy to use a TDD approach when
we were creating our domain objects. The input to the test would be a command (that
perhaps originated in the UI), and we could then test that the domain object fires the expected
events. However if we’re not using event sourcing, we don’t have any events: the behavior of
the domain object is to persist its changes in data store through an ORM layer.
Developer 2: So why don’t we raise events anyway? Just because we’re not using event
sourcing doesn’t mean that our domain objects can’t raise events. We can then design our tests in
the usual way to check for the correct events firing in response to a command.
Developer 1: Isn’t that just making things more complicated than they need to be? One of
the motivations for using CQRS is to simplify things! We now have domain objects that need
to persist their state using an ORM layer and raise events that report on what they have
persisted just so we can run our unit tests.
Developer 2: I see what you mean.
Developer 1: Perhaps we’re getting stuck on how we’re doing the tests. Maybe instead of
designing our tests based on the expected behavior of the domain objects, we should think
about testing the state of the domain objects after they’ve processed a command.
Developer 2: That should be easy to do; after all, the domain objects will have all of the data
we want to check stored in properties so that the ORM can persist the right information to
the store.
Developer 1: So we really just need to think about a different style of testing in this scenario.
Developer 2: There is another aspect of this we’ll need to consider: we might have a set of
tests that we can use to test our domain objects, and all of those tests might be passing. We
might also have a set of tests to verify that our ORM layer can save and retrieve objects suc-
cessfully. However, we will also have to test that our domain objects function correctly
when we run them against the ORM layer. It’s possible that a domain object performs the
correct business logic, but can’t properly persist its state, perhaps because of a problem re-
lated to how the ORM handles specific data types.
[TestMethod]
public void when_reserving_less_seats_than_total_then_succeeds()
{
    var sut = this.given_available_seats();

    // The test passes if MakeReservation does not throw.
    sut.MakeReservation(Guid.NewGuid(), 4);
}

[TestMethod]
[ExpectedException(typeof(ArgumentOutOfRangeException))]
public void when_reserving_more_seats_than_total_then_fails()
{
    var sut = this.given_available_seats();

    // More seats are requested than are available, so an exception is expected.
    sut.MakeReservation(Guid.NewGuid(), 11);
}
These two tests work together to verify the behavior of the SeatsAvailability aggregate. In the first
test, the expected behavior is that the MakeReservation method succeeds and does not throw an
exception. In the second test, the expected behavior is for the MakeReservation method to throw
an exception because there are not enough free seats available to complete the reservation.
It is difficult to test the behavior in any other way without the aggregate raising events. For
example, if you tried to test the behavior by checking that the correct call is made to persist the
aggregate to the data store, the test becomes coupled to the data store implementation (which is a
smell); if you want to change the data store implementation, you will need to change the tests on the
aggregates in the domain model.
The following code sample shows an example of a test written using the state of the objects
under test. This style of test is the one used in the project.
public class given_available_seats
{
    private static readonly Guid SeatTypeId = Guid.NewGuid();
    ...
        this.sut = this.sutProvider.PersistReload(this.sut);
    }

    public given_available_seats()
        : this(new NoPersistenceProvider())
    {
    }

    [Fact]
    public void when_reserving_less_seats_than_total_then_seats_become_unavailable()
    {
        this.sut.MakeReservation(Guid.NewGuid(), 4);
        this.sut = this.sutProvider.PersistReload(this.sut);

        Assert.Equal(6, this.sut.RemainingSeats);
    }

    [Fact]
    public void when_reserving_more_seats_than_total_then_rejects()
    {
        var id = Guid.NewGuid();
        sut.MakeReservation(id, 11);

        Assert.Equal(1, sut.Events.Count());
        Assert.Equal(id, ((ReservationRejected)sut.Events.Single()).ReservationId);
    }
}
The two tests shown here test the state of the SeatsAvailability aggregate after invoking the
MakeReservation method. The first test covers the scenario in which there are enough seats available. The
second test covers the scenario in which there are not enough seats available. This second test can make
use of the behavior of the SeatsAvailability aggregate because the aggregate does raise an event if it
rejects a reservation.
Summary
In the first stage in our journey, we explored some of the basics of implementing the CQRS pattern
and made some preparations for the next stages.
The next chapter describes how we extended and enhanced the work already completed by
adding more features and functionality to the Orders and Registrations bounded context. We will also
look at some additional testing techniques to see how they might help us on our journey.
More information
All links in this book are accessible from the book’s online bibliography available at:
http://msdn.microsoft.com/en-us/library/jj619274.
Journey 4: Extending and Enhancing the Orders and Registrations Bounded Context
“I see that it is by no means useless to travel, if a man wants to see something new.”
Jules Verne, Around the World in Eighty Days
Architecture
The application is designed to deploy to Windows Azure. At this stage in the journey, the application
consists of a web role that contains the ASP.NET MVC web application and a worker role that con-
tains the message handlers and domain objects. The application uses Windows Azure SQL Database
(SQL Database) instances for data storage, both on the write side and the read side. The application
uses the Windows Azure Service Bus to provide its messaging infrastructure. Figure 1 shows this
high-level architecture.
Figure 1
Contoso Conference Management System high-level architecture
While you are exploring and testing the solution, you can run it
locally, either using the Windows Azure compute emulator or by run-
ning the MVC web application directly and running a console applica-
tion that hosts the handlers and domain objects. When you run the
application locally, you can use a local SQL Server Express database
instead of SQL Database, and use a simple messaging infrastructure
implemented in a SQL Server Express database.
For more information about the options for running the applica-
tion, see Appendix 1, “Release Notes.”
Record locators
The system uses access codes instead of passwords so the registrant
is not forced to set up an account with the system. Many registrants
may use the system only once, so there is no need to create a perma-
nent account with a user ID and a password.
The system needs to be able to retrieve order information quick-
ly based on the registrant’s email address and access code. To provide
a minimum level of security, the access codes that the system gener-
ates should not be predictable, and the order information that regis-
trants can retrieve should not contain any sensitive information.
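The requirement that access codes be unpredictable can be sketched as follows. This is only an illustration under stated assumptions, not the project's implementation: the AccessCodeGenerator name, the alphabet, and the use of RandomNumberGenerator.GetInt32 (available in .NET Core 3.0 and later) are choices made for this example.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper for illustration; not the code used by the project.
public static class AccessCodeGenerator
{
    // Assumed alphabet: look-alike characters (0/O, 1/I) are left out.
    private const string Alphabet = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789";

    public static string Generate(int length)
    {
        if (length <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(length));
        }

        var chars = new char[length];
        for (var i = 0; i < length; i++)
        {
            // Each character comes from a cryptographically strong source,
            // so one code gives no hint about the next one.
            chars[i] = Alphabet[RandomNumberGenerator.GetInt32(Alphabet.Length)];
        }

        return new string(chars);
    }
}
```

A short code on its own is easy to guess by brute force, which is why the text pairs it with the registrant's email address and keeps sensitive data out of the retrievable order information.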
The team will split the database into two and explore options for
pushing changes from the normalized write side to the denormalized
read side in a later stage of the journey. For an example of using
Windows Azure blob storage instead of SQL tables for storing the read-
side data, see the SeatAssignmentsViewModelGenerator class.
Figure 2
The read side storing data in a relational database
The Repository pattern mediates between the domain and data mapping layers using
a collection-like interface for accessing domain objects. For more info see Martin
Fowler, Catalog of Patterns of Enterprise Application Architecture, Repository.
Simplicity
• This approach uses a thin abstraction layer over the underlying
database. Many ORMs support this approach and it minimizes
the amount of code that you must write.
• You only need to define a single repository and a single Query
method.
• You don’t need a separate query object. On the read side, the
queries should be simple because you have already denormalized
the data from the write side to support the read-side clients.
• You can make use of Language-Integrated Query (LINQ) to
provide support for features such as filtering, paging, and
sorting on the client.
Testability
• You can use LINQ to Objects for mocking.
There are possible objections to this approach including that:
• It is not easy to replace the data store with a non-relational
database that does not expose an IQueryable object. However,
you can choose to implement the read model differently in each
bounded context using an approach that is appropriate to that
bounded context.
• The client might abuse the IQueryable interface by performing
operations that can be done more efficiently as a part of the
denormalization process. You should ensure that the denormalized
data fully meets the requirements of the clients.
• Using the IQueryable interface hides the queries away. However,
since you denormalize the data from the write side, the
queries against the relational database tables are unlikely to be
complex.
• It’s hard to know if your integration tests cover all the different
uses of the Query method.
In the RI, using Entity Framework, we didn’t need to write any code at all to
expose the IQueryable instance. We also had just a single ViewRepository class.
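The single-repository approach discussed above, and the testability point about LINQ to Objects, can be sketched roughly as follows. The IViewRepository shape, the OrderSummary DTO, and the InMemoryViewRepository class are assumptions made for this example; the RI's own types may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical read-side DTO for illustration.
public class OrderSummary
{
    public Guid OrderId { get; set; }
    public string RegistrantEmail { get; set; }
}

// Assumed shape of a generic read-side repository with a single Query method.
public interface IViewRepository
{
    IQueryable<T> Query<T>() where T : class;
}

// In-memory stand-in backed by LINQ to Objects, intended for unit tests.
public class InMemoryViewRepository : IViewRepository
{
    private readonly Dictionary<Type, object> sets = new Dictionary<Type, object>();

    public void Add<T>(params T[] items) where T : class
    {
        if (!this.sets.TryGetValue(typeof(T), out var set))
        {
            set = new List<T>();
            this.sets[typeof(T)] = set;
        }

        ((List<T>)set).AddRange(items);
    }

    public IQueryable<T> Query<T>() where T : class
    {
        // AsQueryable lets client code compose the same LINQ operators
        // it would use against a database-backed IQueryable.
        return this.sets.TryGetValue(typeof(T), out var set)
            ? ((List<T>)set).AsQueryable()
            : Enumerable.Empty<T>().AsQueryable();
    }
}
```

A test can seed the repository with Add and then run the client's query, for example Query<OrderSummary>().Where(o => o.RegistrantEmail == email), without touching a database. Note that LINQ to Objects accepts expressions that a database provider might reject, so a passing test here does not prove the query translates to SQL.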
You could also choose to use different DAO classes. This would make it easier to access different data
sources.
var ordersummary = OrderSummaryDAO.FindAll(userId);
var orderdetails = OrderDetailsDAO.Get(orderId);
Simplicity
• Dependencies are clearer for the client. For example, the client references an explicit IOrder-
SummaryDAO instance rather than a generic IViewRepository instance.
• For the majority of queries, there are only one or two predefined ways to access the object.
Different queries typically return different projections.
Flexibility
• The Get and Find methods hide details such as the partitioning of the data store and the data
access methods such as an object relational mapping (ORM) or executing SQL code explicitly.
This makes it easier to change these choices in the future.
• The Get and Find methods could use an ORM, LINQ, and the IQueryable interface behind the
scenes to get the data from the data store. This is a choice that you could make on a method-
by-method basis.
Performance
• You can easily optimize the queries that the Find and Get methods run.
• The data access layer executes all queries. There is no risk that the client MVC controller action
tries to run complex and inefficient LINQ queries against the data source.
Testability
• It is easier to specify unit tests for the Find and Get methods than to create suitable unit tests
for the range of possible LINQ queries that a client could specify.
Maintainability
• All of the queries are defined in the same location, the DAO classes, making it easier to modify
the system consistently.
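The DAO alternative described above might look like the following sketch. LocateOrder is the method name that an OrderController sample later in this chapter calls; the IOrderDao interface shape and the InMemoryOrderDao class are assumptions introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed DAO contract: one explicit method per supported query.
public interface IOrderDao
{
    Guid? LocateOrder(string email, string accessCode);
}

// Hypothetical in-memory implementation for illustration and unit tests.
public class InMemoryOrderDao : IOrderDao
{
    private readonly List<(Guid OrderId, string Email, string AccessCode)> rows =
        new List<(Guid, string, string)>();

    public void Add(Guid orderId, string email, string accessCode)
    {
        this.rows.Add((orderId, email, accessCode));
    }

    public Guid? LocateOrder(string email, string accessCode)
    {
        // The query is fixed inside the DAO, so callers cannot compose
        // arbitrary (and possibly inefficient) LINQ against the store.
        return this.rows
            .Where(r => r.Email == email && r.AccessCode == accessCode)
            .Select(r => (Guid?)r.OrderId)
            .FirstOrDefault();
    }
}
```

Because the client depends only on IOrderDao, the implementation behind LocateOrder could later switch to an ORM, explicit SQL, or a non-relational store without changing the calling code.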
Figure 3
The new architecture of the reservation process
protected Order()
{
    ...
    this.AccessCode = HandleGenerator.Generate(5);
}
To retrieve an Order instance, a registrant must provide her email address and the order access code.
The system will use these two items to locate the correct order. This logic is part of the read side.
The following code sample from the OrderController class in the web application shows how
the MVC controller submits the query to the read side using the LocateOrder method to discover the
unique OrderId value. This Find action passes the OrderId value to a Display action that displays the
order information to the registrant.
[HttpPost]
public ActionResult Find(string email, string accessCode)
{
    // Query the read side for the order that matches both values.
    var orderId = orderDao.LocateOrder(email, accessCode);

    if (!orderId.HasValue)
    {
        // No match: send the registrant back to the Find page.
        return RedirectToAction(
            "Find",
            new { conferenceCode = this.ConferenceCode });
    }

    return RedirectToAction(
        "Display",
        new
        {
            conferenceCode = this.ConferenceCode,
            orderId = orderId.Value
        });
}
        this.ReservationExpirationDate = expirationDate;
        this.Items.Clear();
        this.Items.AddRange(
            seats.Select(
                seat => new OrderItem(seat.SeatType, seat.Quantity)));
    }
    ...
        if (this.ExpirationCommandId == Guid.Empty)
        {
            var bufferTime = TimeSpan.FromMinutes(5);
            var expirationCommand =
                new ExpireRegistrationProcess { ProcessId = this.Id };
            this.ExpirationCommandId = expirationCommand.Id;

            this.AddCommand(new Envelope<ICommand>(expirationCommand)
            {
                Delay = expirationTime.Subtract(DateTime.UtcNow).Add(bufferTime),
            });
        }

        this.AddCommand(new MarkSeatsAsReserved
        {
            OrderId = this.OrderId,
            Seats = message.ReservationDetails.ToList(),
            Expiration = expirationTime,
        });
    }
    ...
}
The MVC RegistrationController class retrieves the order information on the read side. The
DraftOrder class includes the reservation expiry time that the controller passes to the view using the
ViewBag class, as shown in the following code sample.
[HttpGet]
public ActionResult SpecifyRegistrantDetails(string conferenceCode, Guid orderId)
{
    var repo = this.repositoryFactory();
    using (repo as IDisposable)
    {
        var draftOrder = repo.Find<DraftOrder>(orderId);
        var conference = repo.Query<Conference>()
            .Where(c => c.Code == conferenceCode)
            .FirstOrDefault();

        this.ViewBag.ConferenceName = conference.Name;
        this.ViewBag.ConferenceCode = conference.Code;
        // Ticks are 100-nanosecond units; dividing by 10000 gives milliseconds
        // relative to EpochTicks.
        this.ViewBag.ExpirationDateUTCMilliseconds =
            draftOrder.BookingExpirationDate.HasValue
                ? ((draftOrder.BookingExpirationDate.Value.Ticks - EpochTicks) / 10000L)
                : 0L;
        this.ViewBag.OrderId = orderId;
        ...
using System;
using System.ComponentModel.DataAnnotations;
using Common;
...
    [Required(AllowEmptyStrings = false)]
    public string FirstName { get; set; }

    [Required(AllowEmptyStrings = false)]
    public string LastName { get; set; }

    [Required(AllowEmptyStrings = false)]
    public string Email { get; set; }
}
The MVC view uses this command class as its model class. The following code sample from the
SpecifyRegistrantDetails.cshtml file shows how the model is populated.
@model Registration.Commands.AssignRegistrantDetails
...
The Web.config file configures the client-side validation based on the DataAnnotations attributes,
as shown in the following snippet.
<appSettings>
    ...
    <add key="ClientValidationEnabled" value="true" />
    <add key="UnobtrusiveJavaScriptEnabled" value="true" />
</appSettings>
The server-side validation occurs in the controller before it sends the command. The following code
sample from the RegistrationController class shows how the controller uses the IsValid property to
validate the command. Remember that this example uses an instance of the command as the model.
[HttpPost]
public ActionResult SpecifyRegistrantDetails(
    string conferenceCode,
    Guid orderId,
    AssignRegistrantDetails command)
{
    if (!ModelState.IsValid)
    {
        // Validation failed: redisplay the form instead of sending the command.
        return SpecifyRegistrantDetails(conferenceCode, orderId);
    }

    this.commandBus.Send(command);

    return RedirectToAction(
        "SpecifyPaymentDetails",
        new { conferenceCode = conferenceCode, orderId = orderId });
}
For an additional example, see the RegisterToConference command and the StartRegistration action
in the RegistrationController class.
For more information, see Models and Validation in ASP.NET MVC on MSDN.
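Because the validation rules live on the command class as DataAnnotations attributes, they can also be evaluated outside MVC, for example in a unit test. The following sketch assumes a simplified stand-in for the command (AssignRegistrantDetailsSketch) and a hypothetical CommandValidation helper; it uses the Validator class from System.ComponentModel.DataAnnotations.

```csharp
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

// Simplified stand-in for the command class; not the project's type.
public class AssignRegistrantDetailsSketch
{
    [Required(AllowEmptyStrings = false)]
    public string FirstName { get; set; }

    [Required(AllowEmptyStrings = false)]
    public string LastName { get; set; }

    [Required(AllowEmptyStrings = false)]
    public string Email { get; set; }
}

// Hypothetical helper that evaluates the attributes directly.
public static class CommandValidation
{
    public static IList<ValidationResult> Validate(object command)
    {
        var results = new List<ValidationResult>();

        // validateAllProperties: true checks every attribute on each property,
        // not only [Required].
        Validator.TryValidateObject(
            command,
            new ValidationContext(command),
            results,
            validateAllProperties: true);

        return results;
    }
}
```

An empty result list means that the command satisfies its annotations. This checks only the attribute rules; it does not replace the domain's own validation when the command is handled.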
Column                      Description
OrderId                     A unique identifier for the Order
ReservationExpirationDate   The time when the seat reservations expire
StateValue                  The state of the Order: Created, PartiallyReserved,
                            ReservationCompleted, Rejected, Confirmed
RegistrantEmail             The email address of the Registrant
AccessCode                  The Access Code that the Registrant can use to access the Order

Column                      Description
OrderItemId                 A unique identifier for the Order Item
SeatType                    The type of seat requested
RequestedSeats              The number of seats requested
ReservedSeats               The number of seats reserved
OrderID                     The OrderId in the parent OrdersView table
To populate these tables in the read model, the read side handles events raised by the write side and
uses them to write to these tables. See Figure 3 above for more details.
The OrderViewModelGenerator class handles these events and updates the read-side repository.
public OrderViewModelGenerator(
    Func<ConferenceRegistrationDbContext> contextFactory)
{
    this.contextFactory = contextFactory;
}
...
            context.Save(dto);
        }
    }
    ...
}
...
        if (entry.State == System.Data.EntityState.Detached)
            this.Set<T>().Add(entity);

        this.SaveChanges();
    }
}
Figure 4
The SeatsAvailability aggregate and its associated commands and events
The domain now includes a SeatQuantity value type that you can use to represent a quantity of
a particular seat type.
Previously, the aggregate raised either a ReservationAccepted or a ReservationRejected event,
depending on whether there were sufficient seats. Now the aggregate raises a SeatsReserved event
that reports how many seats of a particular type it could reserve. This means that the number of seats
reserved may not match the number of seats requested; this information is passed back to the UI for
the registrant to make a decision on how to proceed with the registration.
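The partial-reservation behavior described above can be sketched as follows. The class and event names carry a Sketch suffix because they are simplified stand-ins invented for this example; the real SeatsAvailability aggregate and SeatsReserved event have a different shape.

```csharp
using System;

// Hypothetical event shape: reports both the requested and the reserved count.
public class SeatsReservedSketch
{
    public Guid SeatType { get; set; }
    public int Requested { get; set; }
    public int Reserved { get; set; }
}

// Simplified stand-in for an aggregate that tracks one seat type.
public class SeatsAvailabilitySketch
{
    private readonly Guid seatType;
    private int remaining;

    public SeatsAvailabilitySketch(Guid seatType, int remaining)
    {
        this.seatType = seatType;
        this.remaining = remaining;
    }

    public SeatsReservedSketch MakeReservation(int requested)
    {
        if (requested < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(requested));
        }

        // Reserve as many seats as possible instead of rejecting the request.
        var reserved = Math.Min(requested, this.remaining);
        this.remaining -= reserved;

        return new SeatsReservedSketch
        {
            SeatType = this.seatType,
            Requested = requested,
            Reserved = reserved
        };
    }
}
```

A caller compares Reserved with Requested to decide whether to show the registrant a partial reservation.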
Impact on testing
This section discusses some of the testing issues addressed during this stage of the journey.
Background:
Given the Business Customer selected the Create Conference option
...
        ScenarioContext.Current.Set(
            table.Rows[0]["Email"],
            Constants.EmailSessionKey);
        ScenarioContext.Current.Set(
            Browser.FindText(Slug.FindBy),
            Constants.AccessCodeSessionKey);
    }
    ...
    ...
    if (create)
    {
        Browser.SetInput("OwnerName", row["Owner"]);
        Browser.SetInput("OwnerEmail", row["Email"]);
        Browser.SetInput("name", row["Email"], "ConfirmEmail");
        Browser.SetInput("Slug", Slug.CreateNew().Value);
    }

    Browser.SetInput("Tagline", Constants.UI.TagLine);
    Browser.SetInput("Location", Constants.UI.Location);
    Browser.SetInput("TwitterSearch", Constants.UI.TwitterSearch);
You can see how this approach simulates clicking on, and entering text into, UI elements in the web
browser.
The second approach is to implement the tests by interacting with the MVC controller classes. In
the longer term, this approach will be less fragile at the cost of an initially more complex
implementation that requires some knowledge of the internal implementation of the system. The following code
samples show an example of this approach.
First, an example scenario from the SelfRegistrationEndToEndWithControllers.feature file in
the Features\UserInterface\Controllers\Registration project folder:
    Assert.NotNull(redirect);
    {
        //ReservationUnknown
        var result = registrationController.SpecifyRegistrantAndPaymentDetails(
            (Guid)redirect.RouteValues["orderId"],
            registrationController.ViewBag.OrderVersion);
        Assert.IsNotType<RedirectToRouteResult>(result);
        registrationViewModel =
            RegistrationHelper.GetModel<RegistrationViewModel>(result);
    }

    Assert.NotNull(
        registrationViewModel,
        "Could not make the reservation and get the RegistrationViewModel");
}
...
        Assert.NotNull(orderItem);
        Assert.Equal(Int32.Parse(row["quantity"]), orderItem.ReservedSeats);
    }
}
You can see how this approach uses the RegistrationController MVC
class directly.
Note: In these code samples, you can see how the values in the
attributes link the step implementation to the statements in the
related SpecFlow feature files.
The team chose to implement these steps as xUnit.net tests. To run these
tests within Visual Studio, you can use any of the test runners supported
by xUnit.net such as ReSharper, CodeRush, or TestDriven.NET.
Remember that these acceptance tests are not the only tests performed on the
system. The main solution includes comprehensive unit and integration tests,
and the test team also performed exploratory and performance testing on the
application.
Using tests to help developers understand message flows
A common comment about implementations that use the CQRS pattern
or that use messaging extensively is the difficulty in understanding
how all of the different pieces of the application fit together
through sending and receiving commands and events. You can help
someone to understand your code base through appropriately designed
unit tests.
Consider this first example of a unit test for the Order aggregate:
public class given_placed_order
{
    ...
    public given_placed_order()
    {
        this.sut = new Order(
            OrderId, new[]
            {
                new OrderPlaced
                {
                    ConferenceId = ConferenceId,
                    Seats = new[] { new SeatQuantity(SeatTypeId, 5) },
                    ReservationAutoExpiration = DateTime.UtcNow
                }
            });
    }

    [Fact]
    public void when_updating_seats_then_updates_order_with_new_seats()
    {
        this.sut.UpdateSeats(new[] { new OrderItem(SeatTypeId, 20) });
        ...
    }
This unit test creates an Order instance and directly invokes the UpdateSeats method. It does not
provide any information to the person reading the test code about the command or event that causes
this method to be invoked.
Now consider this second example that performs the same test, but in this case by sending a
command:
public class given_placed_order
{
    ...
    public given_placed_order()
    {
        this.sut = new EventSourcingTestHelper<Order>();
        this.sut.Setup(
            new OrderCommandHandler(sut.Repository, pricingService.Object));
        this.sut.Given(
            new OrderPlaced
            {
                SourceId = OrderId,
                ConferenceId = ConferenceId,
                Seats = new[] { new SeatQuantity(SeatTypeId, 5) },
                ReservationAutoExpiration = DateTime.UtcNow
            });
    }

    [Fact]
    public void when_updating_seats_then_updates_order_with_new_seats()
    {
        this.sut.When(
            new RegisterToConference
            {
                ConferenceId = ConferenceId,
                OrderId = OrderId,
                Seats = new[] { new SeatQuantity(SeatTypeId, 20) }
            });
        ...
    }
This example uses a helper class that enables you to send a command to the Order instance. Now
someone reading the test can see that when you send a RegisterToConference command, you expect
to see an OrderUpdated event.
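To make the given/when style more concrete, here is a much-reduced sketch of what such a helper could look like. It is not the project's EventSourcingTestHelper: the GivenWhenSketch class, the message types with a Sketch suffix, and the function-based handler are all assumptions chosen to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical messages for illustration.
public class OrderPlacedSketch { public Guid OrderId; }
public class RegisterToConferenceSketch { public Guid OrderId; public int Quantity; }
public class OrderUpdatedSketch { public Guid OrderId; public int Quantity; }

// Minimal helper: past events go in via Given, a command goes in via When,
// and the events produced by the handler come out.
public class GivenWhenSketch<TCommand>
{
    private readonly List<object> history = new List<object>();
    private readonly Func<IReadOnlyList<object>, TCommand, IEnumerable<object>> handler;

    public GivenWhenSketch(
        Func<IReadOnlyList<object>, TCommand, IEnumerable<object>> handler)
    {
        this.handler = handler;
    }

    public void Given(params object[] pastEvents) => this.history.AddRange(pastEvents);

    public IReadOnlyList<object> When(TCommand command) =>
        this.handler(this.history, command).ToList();
}

public static class OrderHandlerSketch
{
    // Emits an update event only if the order was placed earlier.
    public static IEnumerable<object> Handle(
        IReadOnlyList<object> history, RegisterToConferenceSketch command)
    {
        var placed = history.OfType<OrderPlacedSketch>()
            .Any(e => e.OrderId == command.OrderId);
        if (!placed)
        {
            throw new InvalidOperationException("Order not found.");
        }

        yield return new OrderUpdatedSketch
        {
            OrderId = command.OrderId,
            Quantity = command.Quantity
        };
    }
}
```

A test written against this helper names the prior event, the command, and the resulting event in one place, which is the readability benefit the second example in the text is after.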
Testing is important
I once believed that well-factored applications are easy to comprehend, no matter how large or
broad the codebase. Any time I had a problem understanding how some feature of an application
behaved, the fault would lie with the code and not with me.
Never let your ego get in the way of common sense.
Truth was, up until a certain point in my career, I simply hadn’t had exposure to a large,
well-factored codebase. I wouldn’t have known what one looked like if it walked up and hit me in the face.
Thankfully, as I got more experienced reading code, I learned to recognize the difference.
Note: In any well-organized project, tests are a cornerstone of comprehension for developers
seeking to understand the project. Topics ranging from naming conventions and coding styles to
design approaches and usage patterns are baked into test suites, providing an excellent starting
point for integrating into a codebase. It’s also good practice in code literacy, and practice makes
perfect!
My first action after cloning the Conference code was to skim the tests. After a perusal of the integra-
tion and unit test suites in the Conference Visual Studio solution, I focused my attention on the
Conference.AcceptanceTests Visual Studio solution that contains the SpecFlow acceptance tests.
Other members of the project team had done some initial work on the .feature files, which worked
out nicely for me since I wasn’t familiar with the details of the business rules. Implementing step
bindings for these features would be an excellent way to both contribute to the project and learn
about how the system worked.
Domain tests
My goal then was to take a feature file looking something like this:
Feature: Self Registrant scenarios for making a Reservation for
a Conference site with all Order Items initially available
In order to reserve Seats for a conference
As an Attendee
I want to be able to select an Order Item from one or many of
the available Order Items and make a Reservation
Background:
Given the list of the available Order Items for the CQRS
Summit 2012 conference with the slug code SelfRegFull
| seat type | rate | quota |
| General admission | $199 | 100 |
| CQRS Workshop | $500 | 100 |
| Additional cocktail party | $50 | 100 |
And the selected Order Items
| seat type | quantity |
| General admission | 1 |
| CQRS Workshop | 1 |
| Additional cocktail party | 1 |
Scenario: All the Order Items are available and all get reserved
When the Registrant proceeds to make the Reservation
Then the Reservation is confirmed for all the selected Order Items
And these Order Items should be reserved
| seat type |
| General admission |
| CQRS Workshop |
| Additional cocktail party |
And the total should read $749
And the countdown started
All at a level just below the UI, but above (and beyond) infrastructure
concerns. Testing is tightly focused on the behavior of the overall
solution domain, which is why I’ll call these types of tests Domain
Tests. Other terms such as behavior-driven development (BDD) can
be used to describe this style of testing.
It may seem a little redundant to rewrite application logic already
implemented on the website, but there are a number of reasons why
it is worth the time:
• You aren’t interested (for these purposes) in testing how the
website or any other piece of infrastructure behaves; you’re
only interested in the domain. Unit and integration-level tests
will validate the correct functioning of that code, so there’s
no need to duplicate those tests.
• When iterating stories with product owners, spending time
on pure UI concerns can slow down the feedback cycle,
reducing the quality and usefulness of feedback.
• Discussing a feature in more abstract terms can lead to a
better understanding of the problem that the business is
trying to solve, given the sometimes large mismatches
between the vocabularies used by different people when
they discuss technological issues.
• Obstacles encountered in implementing the testing logic can
help improve the system’s overall design quality. Difficulty in
separating infrastructure code from application logic is generally
regarded as a smell.
These “below the UI” tests are also known as subcutaneous tests (see
Meszaros, G., Melnik, G., Acceptance Test Engineering Guide).
Note: There are many more reasons not listed here why these
types of tests are a good idea, but these are the important ones
for this example.
The architecture for the Contoso Conference Management System is
loosely coupled, utilizing messages to transfer commands and events
to interested parties. Commands are routed to a single handler via a
command bus, while events are routed to their 0...N handlers via an
event bus. A bus isn’t tied to any specific technology as far as consum-
ing applications are concerned, allowing arbitrary implementations to
be created and used throughout the system in a manner transparent
to users.
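The routing rule in the paragraph above (one handler per command, zero or more handlers per event) can be sketched with a small in-memory bus. The InMemoryBus class and the two message types are hypothetical and exist only for this illustration; the project's bus interfaces and its Windows Azure Service Bus implementation are different.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical messages for illustration.
public class PlaceOrderSketch { }
public class OrderPlacedEventSketch { }

// Hypothetical in-memory bus; not the project's implementation.
public class InMemoryBus
{
    private readonly Dictionary<Type, Action<object>> commandHandlers =
        new Dictionary<Type, Action<object>>();
    private readonly Dictionary<Type, List<Action<object>>> eventHandlers =
        new Dictionary<Type, List<Action<object>>>();

    public void RegisterCommandHandler<T>(Action<T> handler)
    {
        // Commands are routed to exactly one handler.
        if (this.commandHandlers.ContainsKey(typeof(T)))
        {
            throw new InvalidOperationException("A command can have only one handler.");
        }

        this.commandHandlers[typeof(T)] = c => handler((T)c);
    }

    public void Subscribe<T>(Action<T> handler)
    {
        // Events can have any number of handlers, including none.
        if (!this.eventHandlers.TryGetValue(typeof(T), out var list))
        {
            list = new List<Action<object>>();
            this.eventHandlers[typeof(T)] = list;
        }

        list.Add(e => handler((T)e));
    }

    public void Send(object command)
    {
        if (!this.commandHandlers.TryGetValue(command.GetType(), out var handler))
        {
            throw new InvalidOperationException(
                "No handler registered for " + command.GetType().Name);
        }

        handler(command);
    }

    public void Publish(object @event)
    {
        if (this.eventHandlers.TryGetValue(@event.GetType(), out var list))
        {
            foreach (var handler in list)
            {
                handler(@event);
            }
        }
    }
}
```

Publishing an event with no subscribers is a silent no-op, whereas sending a command with no handler is treated as an error in this sketch; that asymmetry is a design choice of the example rather than something the text prescribes.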
Another bonus when it comes to behavioral testing of a loosely coupled message architecture is
related to the fact that BDD (or similarly styled) tests do not involve themselves with the inner work-
ings of application code. They only care about the observable behavior of the application under test.
This means that for the SpecFlow tests, we need only concern ourselves with publishing some com-
mands to the bus and examining the outward results by asserting expected message traffic and pay-
loads against the actual traffic/data.
Note: It’s OK to use mocks and stubs with these types of tests where appropriate. An appropriate
example would be in using a mock ICommandBus object instead of the AzureCommandBus type.
Mocking a complete domain service is an example where it is not appropriate. Use mocking
minimally, limiting yourself to infrastructure concerns, and you’ll make your life, and your tests, a
lot less stressful.
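As an illustration of mocking at the infrastructure boundary only, a hand-rolled test double for the command bus might look like this. The ICommandBusSketch interface is a deliberately minimal assumption and does not reproduce the project's ICommandBus signature.

```csharp
using System.Collections.Generic;

// Assumed minimal bus contract for this sketch only.
public interface ICommandBusSketch
{
    void Send(object command);
}

// Test double that records commands instead of sending them anywhere.
public class RecordingCommandBus : ICommandBusSketch
{
    private readonly List<object> sent = new List<object>();

    // Tests assert against this list to check the outgoing message traffic.
    public IReadOnlyList<object> Sent => this.sent;

    public void Send(object command) => this.sent.Add(command);
}
```

The domain code under test still runs for real; only the transport is replaced, so assertions stay focused on which messages were produced.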
2. Locate the first place in the source code where either or both of the above occur. In this
case, it’s the Handle method in the RegistrationProcessManagerRouter class. Important:
this does not necessarily mean that the process is a command handler! Process managers are
responsible for creating and retrieving aggregate roots (AR) from storage for the purpose of
routing messages to the AR, so while they have methods similar in name and signature to an
ICommandHandler implementation, they do not implement a command’s logic.
3. Take note of the message type that is received as a parameter to the method where the state
change occurs, since we now need to figure out where that message originated.
• We also note that a new command, MakeSeatReservation, is being issued by the
RegistrationProcessManager.
• As mentioned above, this command isn’t actually published by the process issuing it; rather,
publication occurs when the process is saved to disk.
• These heuristics will need to be repeated to some degree or another on any commands
issued as side-effects of a process handling a command.
4. Do a find references on the OrderPlaced symbol to locate the (or a) top-most (external
facing) component that publishes a message of that type via the Send method on the
ICommandBus interface.
• Since internally issued commands are indirectly published (by a repository) on save, it may
be safe to assume that any non-infrastructure logic that directly calls the Send method is
an external point of entry.
While there is certainly more to these heuristics than noted here, what is there is likely sufficient to
demonstrate the point that even discussing the interactions is a rather lengthy, cumbersome process.
That makes it easily prone to misinterpretation. You can come to understand the various command/
event messaging interactions in this manner, but it is not very efficient.
Note: As a rule, a person can really only maintain between four and eight distinct thoughts in their
head at any given time. To illustrate this concept, let’s take a conservative count of the number of
simultaneous items you’ll need to maintain in your short-term memory while following the above
heuristics:
Process type + Process state property + Initial State (NotStarted) + new() location + message type
+ intermediary routing class types + 2 *N^n Commands issued (location, type, steps) +
discrimination rules (logic is data too!) > 8.
When infrastructure requirements get mixed into the equation, the issue of information saturation
becomes even more apparent. Being the competent, capable developers that we all are (right?), we
can start looking for ways to optimize these steps and improve the signal-to-noise ratio of relevant
information.
To summarize, we have two problems:
• The number of items we are forced to keep in our heads is too great to allow efficient
comprehension.
• Discussion and documentation for messaging interactions is verbose, error-prone, and
complicated.
Fortunately, it is quite possible to kill two birds with a single stone, with MIL (messaging intermediate
language).
MIL began as a series of LINQPad scripts and snippets that I created to help juggle all these facts
while answering questions. Initially, all that these scripts accomplished was to reflect through one or
more project assemblies and output the various types of messages and handlers. In discussions with
members of the team it became apparent that others were experiencing the same types of problems
I had. After a few chats and brainstorming sessions with members of the patterns & practices team,
we came up with the idea of introducing a small domain-specific language (DSL) that would encapsulate
the interactions being discussed. The tentatively named SawMIL toolbox, located at
http://jelster.github.com/CqrsMessagingTools/, provides utilities, scripts, and examples that enable
you to use MIL as part of your development and analysis process.
In MIL, messaging components and interactions are represented in a specific manner: commands,
since they are requests for the system to perform some action, are denoted by ?, as in DoSomething?.
Events represent something definite that happened in the system, and hence gain a ! suffix, as in
SomethingHappened!.
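The suffix convention is simple enough to check mechanically. The following sketch classifies a MIL message token by its last character; the MilNotation class and MilMessageKind enum are hypothetical names for this example and are not part of the SawMIL toolbox.

```csharp
using System;

public enum MilMessageKind
{
    Command,
    Event
}

// Hypothetical helper for illustration; not part of any MIL tooling.
public static class MilNotation
{
    // '?' marks a command (a request); '!' marks an event (a fact).
    public static MilMessageKind Classify(string token)
    {
        if (string.IsNullOrEmpty(token))
        {
            throw new ArgumentException("Token must not be empty.", nameof(token));
        }

        switch (token[token.Length - 1])
        {
            case '?':
                return MilMessageKind.Command;
            case '!':
                return MilMessageKind.Event;
            default:
                throw new FormatException("Token has no MIL message suffix: " + token);
        }
    }
}
```

For example, Classify("DoSomething?") yields MilMessageKind.Command and Classify("SomethingHappened!") yields MilMessageKind.Event.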
Another important element of MIL is message publication and reception. Messages received from
a messaging source (such as Windows Azure Service Bus, NServiceBus, and so forth) are always
preceded by the -> symbol, while messages that are being sent are followed by it. To keep the
examples simple for now, the optional nil element, (a period, . ) is used to indicate explicitly a no-op
(in other words, nothing is receiving the message). The following snippet shows an example of the nil
element syntax:
SendCustomerInvoice? -> .
CustomerInvoiceSent! -> .
Once a command or event has been published, something needs to do something with it. Commands
have one and only one handler, while events can have multiple handlers. MIL represents this relation-
ship between message and handler by placing the name of the handler on the other side of the mes-
saging operation, as shown in the following snippet:
SendCustomerInvoice? -> CustomerInvoiceHandler
CustomerInvoiceSent! ->
-> CustomerNotificationHandler
-> AccountsAgeingViewModelGenerator
Notice how the command handler is on the same line as the command, while the event is separated
from its handlers? That’s because in CQRS, there is a 1:1 correlation between commands and com-
mand handlers. Putting them together helps reinforce that concept, while keeping events separate
from event handlers helps reinforce the idea that a given event can have 0...N handlers.
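The cardinality rule described above can also be expressed in code. The following is a minimal sketch only, not part of MIL or of the journey code: the MessageRegistry class and its method names are hypothetical, and the SendCustomerInvoice and CustomerInvoiceSent classes are empty stand-ins for the messages in the snippet. It refuses a second handler for a command, and treats an event with no handlers as a valid no-op.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical registry that enforces "one handler per command, 0...N handlers per event".
public class MessageRegistry
{
    private readonly Dictionary<Type, Action<object>> commandHandlers =
        new Dictionary<Type, Action<object>>();
    private readonly Dictionary<Type, List<Action<object>>> eventHandlers =
        new Dictionary<Type, List<Action<object>>>();

    // A command type can be registered only once.
    public void RegisterCommandHandler<TCommand>(Action<TCommand> handler)
    {
        if (this.commandHandlers.ContainsKey(typeof(TCommand)))
        {
            throw new InvalidOperationException(
                "A handler is already registered for " + typeof(TCommand).Name);
        }

        this.commandHandlers[typeof(TCommand)] = message => handler((TCommand)message);
    }

    // An event type can be registered any number of times.
    public void RegisterEventHandler<TEvent>(Action<TEvent> handler)
    {
        List<Action<object>> handlers;
        if (!this.eventHandlers.TryGetValue(typeof(TEvent), out handlers))
        {
            handlers = new List<Action<object>>();
            this.eventHandlers[typeof(TEvent)] = handlers;
        }

        handlers.Add(message => handler((TEvent)message));
    }

    // Sending a command that has no handler is an error.
    public void Send(object command)
    {
        Action<object> handler;
        if (!this.commandHandlers.TryGetValue(command.GetType(), out handler))
        {
            throw new InvalidOperationException(
                "No handler is registered for " + command.GetType().Name);
        }

        handler(command);
    }

    // Publishing an event that has no handlers is allowed.
    // Returns the number of handlers that were invoked.
    public int Publish(object @event)
    {
        List<Action<object>> handlers;
        if (!this.eventHandlers.TryGetValue(@event.GetType(), out handlers))
        {
            return 0;
        }

        foreach (var handler in handlers)
        {
            handler(@event);
        }

        return handlers.Count;
    }
}

// Empty stand-ins for the messages used in the MIL snippet.
public class SendCustomerInvoice { }
public class CustomerInvoiceSent { }
```

Registering CustomerNotificationHandler and AccountsAgeingViewModelGenerator logic as two event handlers for CustomerInvoiceSent would succeed in this sketch, while a second registration for SendCustomerInvoice would throw.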
Aggregate roots are prefixed with the @ sign, a convention that should be familiar to anyone who
has ever used Twitter. Aggregate roots never handle commands, but occasionally may handle events.
Aggregate roots are most frequently event sources, raising events in response to business operations
invoked on the aggregate. Something that should be made clear about these events, however, is that
in most systems there are other elements that decide upon and actually perform the publication of
domain events. This is an interesting case where business and technical requirements blur boundaries,
with the requirements being met by infrastructure logic rather than application or business logic. An
example of this lies in the journey code: in order to ensure consistency between event sources and
event subscribers, the implementation of the repository that persists the aggregate root is the element
responsible for actually publishing the events to a bus. The following snippet shows an example of the
AggregateRoot syntax:
SendCustomerInvoice? -> CustomerInvoiceHandler
@Invoice::CustomerInvoiceSent! -> .
In the above example, a new language element called the scope context operator appears alongside
the @AggregateRoot. Denoted by double colons (::), the scope context element may or may not have
whitespace between its two characters, and is used to identify relationships between two objects.
Above, the AR ‘@Invoice’ is generating the CustomerInvoiceSent! event in response to logic invoked by
the CustomerInvoiceHandler command handler. The next example demonstrates use of the scope element
on an AR, which generates multiple events in response to a single command:
SendCustomerInvoice? -> CustomerInvoiceHandler
@Invoice:
:CustomerInvoiceSent! -> .
:InvoiceAged! -> .
Scope context is also used to signify intra-element routing that does not involve infrastructure mes-
saging apparatus:
SendCustomerInvoice? -> CustomerInvoiceHandler
@Invoice::CustomerInvoiceSent! ->
-> InvoiceAgeingProcessRouter::InvoiceAgeingProcess
The last element that I’ll introduce is the State Change element. State changes are one of the best
ways to track what is happening within a system, and thus MIL treats them as first-class citizens. These
statements must appear on their own line of text, and are prefixed with the ‘*’ character. It’s the only
time in MIL that there is any mention or appearance of assignment because it’s just that important!
The following snippet shows an example of the State Change element:
SendCustomerInvoice? -> CustomerInvoiceHandler
@Invoice::CustomerInvoiceSent! ->
-> InvoiceAgeingProcessRouter::InvoiceAgeingProcess
*InvoiceAgeingProcess.ProcessState = Unpaid
Summary
We’ve just walked through the basic steps used when describing messaging interactions in a loosely
coupled application. Although the interactions described are only a subset of possible interactions,
MIL is evolving into a way to compactly describe the interactions of a message-based system. Differ-
ent nouns and verbs (elements and actions) are represented by distinct, mnemonically significant
symbols. This provides a cross-substrate (squishy human brains <-> silicon CPU) means of communicating
meaningful information about systems as a whole. Although the language describes some types
of messaging interactions very well, it is very much a work in progress with many elements of the
language and tooling in need of development or improvement. This presents some great opportunities
for people looking to contribute to OSS, so if you’ve been on the fence about contributing or are
wondering about OSS participation, there’s no time like the present to head over to
http://jelster.github.com/CqrsMessagingTools/, fork the repos, and get started!
More information
All links in this book are accessible from the book’s online bibliography available at:
http://msdn.microsoft.com/en-us/library/jj619274.
Journey 5: Preparing for the V1 Release
“Most people, after accomplishing something, use it over and over again like a gramophone record
till it cracks, forgetting that the past is just the stuff with which to make more future.”
Freya Stark
In addition, you can use event sourcing as a source of audit data, as a way to query historic state,
gain new business insights from past data, and replay events for debugging and problem analysis.
Eventual consistency. Eventual consistency is a consistency model that does not guarantee im-
mediate access to updated values. After an update to a data object, the storage system does not
guarantee that subsequent accesses to that object will return the updated value. However, the storage
system does guarantee that if no new updates are made to the object during a sufficiently long period
of time, then eventually all accesses can be expected to return the last updated value.
User stories
The team implemented the user stories described below during this stage of the journey.
Architecture
Figure 1 illustrates the key architectural elements of the Contoso Conference Management System in
the V1 release. The application consists of two websites and three bounded contexts. The infrastruc-
ture includes Windows Azure SQL Database (SQL Database) instances, an event store, and messaging
infrastructure.
The table that follows Figure 1 lists all of the messages that the artifacts (aggregates, MVC control-
lers, read-model generators, and data access objects) shown in the diagram exchange with each other.
Note: For reasons of clarity, the handlers (such as the OrderCommandHandler class) that
deliver the messages to the domain objects are not shown.
Figure 1
Architecture of the V1 release
* These events are only used for persisting aggregate state using event sourcing.
** The ConferenceViewModelGenerator creates these commands from the SeatCreated and
SeatUpdated events that it handles from the Conference Management bounded context.
The following list outlines the message naming conventions in the Contoso Conference Management
System:
• All events use the past tense in the naming convention.
• All commands use the imperative naming convention.
• All DTOs are nouns.
The application is designed to deploy to Windows Azure. At this stage in the journey, the application
consists of two web roles that contain the ASP.NET MVC web applications and a worker role that
contains the message handlers and domain objects. The application uses SQL Database instances for
data storage, both on the write side and the read side. The Orders and Registrations bounded context
now uses an event store to persist the state from the write side. This event store is implemented using
Windows Azure table storage to store the events. The application uses the Windows Azure Service
Bus to provide its messaging infrastructure.
While you are exploring and testing the solution, you can run it locally, either using the Windows
Azure compute emulator or by running the ASP.NET MVC web application directly and running a
console application that hosts the handlers and domain objects. When you run the application locally,
you can use a local SQL Server Express database instead of SQL Database, a simple messaging
infrastructure implemented in a SQL Server Express database, and a simple event store also
implemented using a SQL Server Express database.
Note: The SQL-based implementations of the event store and the messaging infrastructure are only
intended to help you run the application locally for exploration and testing. They are not intended
to illustrate a production-ready approach.
For more information about the options for running the application, see Appendix 1, “Release Notes.”
One of the issues to consider when choosing between storage mechanisms in Windows Azure is cost.
If you use SQL Database, you are billed based on the size of the database; if you use Windows Azure
table or blob storage, you are billed based on the amount of storage you use and the number of storage
transactions. You need to carefully evaluate the usage patterns on the different aggregates in your system
to determine which storage mechanism is the most cost effective. It may turn out that different storage
mechanisms make sense for different aggregate types. You may be able to introduce optimizations that
lower your costs, for example by using caching to reduce the number of storage transactions.
“My rule of thumb is that if you’re doing green-field development, you need very good arguments in
order to choose a SQL Database. Windows Azure Storage Services should be the default choice. However,
if you already have an existing SQL Server database that you want to move to the cloud, it’s a
different case.”
Mark Seemann, CQRS Advisors Mail List
Identifying aggregates
In the Windows Azure table storage-based implementation of the event store that the team created for
the V1 release, we used the aggregate ID as the partition key. This makes it efficient to locate the
partition that holds the events for any particular aggregate.
In some cases, the system must locate related aggregates. For example, an order aggregate may have a
related registrations aggregate that holds details of the attendees assigned to specific seats. In this
scenario, the team decided to reuse the same aggregate ID for the related pair of aggregates (the
Order and Registration aggregates) in order to facilitate look-ups.
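As a minimal sketch of using the aggregate ID as the partition key, the following in-memory class stands in for a partitioned table. It is not the event store from the reference implementation: the PartitionedEventStore and StoredEvent names, the string payload, and the use of one store instance per aggregate type are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shape: one row per event.
public class StoredEvent
{
    public string PartitionKey { get; set; } // the aggregate ID
    public int Version { get; set; }         // position in the aggregate's event stream
    public string Payload { get; set; }      // serialized event (a plain string in this sketch)
}

// Hypothetical in-memory stand-in for a table that is partitioned by aggregate ID.
public class PartitionedEventStore
{
    private readonly Dictionary<string, List<StoredEvent>> partitions =
        new Dictionary<string, List<StoredEvent>>();

    public void Append(Guid aggregateId, int version, string payload)
    {
        var partitionKey = aggregateId.ToString();

        List<StoredEvent> partition;
        if (!this.partitions.TryGetValue(partitionKey, out partition))
        {
            partition = new List<StoredEvent>();
            this.partitions[partitionKey] = partition;
        }

        partition.Add(new StoredEvent
        {
            PartitionKey = partitionKey,
            Version = version,
            Payload = payload
        });
    }

    // Loading an aggregate's stream touches a single partition.
    public IList<StoredEvent> Load(Guid aggregateId)
    {
        List<StoredEvent> partition;
        if (!this.partitions.TryGetValue(aggregateId.ToString(), out partition))
        {
            return new List<StoredEvent>();
        }

        return partition.OrderBy(e => e.Version).ToList();
    }
}
```

With one store for orders and another for registrations, the same Guid can be passed to Load on both stores to retrieve the related pair of aggregates without a separate look-up table.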
Figure 2
Example UIs for conference registration
On the first screen, the labels on the buttons reflect the underlying
CRUD operations that the system will perform when the user clicks
the Submit button, rather than displaying more user-focused action
words. Unfortunately, the first screen also requires the user to apply
some deductive knowledge about how the screen and the application
function. For example, the function of the Add button is not immedi-
ately apparent.
A typical implementation behind the first screen will use a data transfer object (DTO) to exchange
data between the back end and the UI. The UI will request data from the back end that will arrive
encapsulated in a DTO, it will modify the data in the DTO, and then return the DTO to the back end.
The back end will use the DTO to figure out what CRUD operations it must perform on the underly-
ing data store.
The second screen is more explicit about what is happening in terms of the business process: the
user is selecting quantities of seat types as a part of the conference registration task. Thinking about
the UI in terms of the task that the user is performing makes it easier to relate the UI to the write
model in your implementation of the CQRS pattern. The UI can send commands to the write side, and
those commands are a part of the domain model on the write side. In a bounded context that imple-
ments the CQRS pattern, the UI typically queries the read side and receives a DTO, and sends com-
mands to the write side.
Figure 3
Task-based UI flow
The following conversation between several developers and the domain expert highlights some of
the key issues that the team needed to address in planning how to implement this integration.
Developer 1: I want to talk about how we should implement two pieces of the integration story
associated with our CRUD-style, Conference Management bounded context. First of all, when a
business customer creates a new conference or defines new seat types for an existing confer-
ence in this bounded context, other bounded contexts such as the Orders and Registrations
bounded context will need to know about the change. Secondly, when a business customer
changes the quota for a seat type, other bounded contexts will need to know about this change
as well.
Developer 2: So in both cases you are pushing changes from the Conference Management
bounded context. It’s one way.
Developer 1: Correct.
Developer 2: What are the significant differences between the scenarios you outlined?
Developer 1: In the first scenario, these changes are relatively infrequent and typically happen
when the business customer creates the conference. Also, these are append-only changes. We
don’t allow a business customer to delete a conference or a seat type after the conference has
been published for the first time. In the second scenario, the changes might be more frequent
and a business customer might increase or decrease a seat quota.
Developer 2: What implementation approaches are you considering for these integration sce-
narios?
Developer 1: Because we have a two-tier CRUD-style bounded context, for the first scenario
I was planning to expose the conference and seat-type information directly from the database
as a simple read-only service. For the second scenario, I was planning to publish events whenever
the business customer updates the seat quotas.
Developer 2: Why use two different approaches here? It would be simpler to use a single ap-
proach. Using events is more flexible in the long run. If additional bounded contexts need this
information, they can easily subscribe to the event. Using events provides for less coupling be-
tween the bounded contexts.
Developer 1: I can see that it would be easier to adapt to changing requirements in the future
if we used events. For example, if a new bounded context required information about who
changed the quota, we could add this information to the event. For existing bounded contexts,
we could add an adapter that converted the new event format to the old.
Developer 2: You implied that the events that notify subscribers of quota changes would send
the change that was made to the quota. For example, let’s say the business customer increased
a seat quota by 50. What happens if a subscriber wasn’t there at the beginning and therefore
doesn’t receive the full history of updates?
Developer 1: We may have to include some synchronization mechanism that uses snapshots of
the current state. However, in this case the event could simply report the new value of the quo-
ta. If necessary, the event could report both the delta and the absolute value of the seat quota.
Developer 2: How are you going to ensure consistency? You need to guarantee that your
bounded context persists its data to storage and publishes the events on a message queue.
Developer 1: We can wrap the database write and add-to-queue operations in a transaction.
Developer 2: There are two reasons that’s going to be problematic later when the size of the
network increases, response times get longer, and the probability of failure increases. First, our
infrastructure uses the Windows Azure Service Bus for messages. You can’t use a single transac-
tion to combine the sending of a message on the Service Bus and a write to a database. Second,
we’re trying to avoid two-phase commits because they always cause problems in the long run.
Domain Expert: We have a similar scenario with another bounded context that we’ll be looking
at later. In this case, we can’t make any changes to the bounded context; we no longer have an
up-to-date copy of the source code.
Developer 1: What can we do to avoid using a two-phase commit? And what can we do if we
don’t have access to the source code and thus can’t make any changes?
Developer 2: In both cases, we use the same technique to solve the problem. Instead of publish-
ing the events from within the application code, we can use another process that monitors the
database and sends the events when it detects a change in the database. This solution may in-
troduce a small amount of latency, but it does avoid the need for a two-phase commit and you
can implement it without making any changes to the application code.
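The technique that Developer 2 describes, a separate process that watches the database and publishes events when it sees a change, might look something like the following sketch. Everything here is hypothetical: the SeatQuotaMonitor and SeatQuotaChanged names, the delegate that stands in for a database query, and the delegate that stands in for the message bus. The event reports both the new absolute value and the delta, as discussed in the conversation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical integration event that carries both the absolute value and the delta.
public class SeatQuotaChanged
{
    public Guid SeatTypeId { get; set; }
    public int NewQuota { get; set; }
    public int Delta { get; set; }
}

// Hypothetical monitor that runs outside the CRUD application and polls its data.
public class SeatQuotaMonitor
{
    private readonly Dictionary<Guid, int> lastSeen = new Dictionary<Guid, int>();
    private readonly Func<IDictionary<Guid, int>> readQuotas; // stands in for a database query
    private readonly Action<SeatQuotaChanged> publish;        // stands in for a message bus

    public SeatQuotaMonitor(
        Func<IDictionary<Guid, int>> readQuotas,
        Action<SeatQuotaChanged> publish)
    {
        this.readQuotas = readQuotas;
        this.publish = publish;
    }

    // Intended to be called on a timer. Returns the number of events published.
    public int Poll()
    {
        var published = 0;
        foreach (var row in this.readQuotas())
        {
            int previous;
            var known = this.lastSeen.TryGetValue(row.Key, out previous);
            if (known && previous == row.Value)
            {
                continue; // no change since the last poll
            }

            this.publish(new SeatQuotaChanged
            {
                SeatTypeId = row.Key,
                NewQuota = row.Value,
                Delta = row.Value - (known ? previous : 0)
            });

            // Record the value only after publishing, so that a failure in publish
            // causes the change to be detected again on the next poll.
            this.lastSeen[row.Key] = row.Value;
            published++;
        }

        return published;
    }
}
```

In this sketch the last-seen values live in memory, so a restart republishes the current state of every row; a real monitor would need to persist its position, and subscribers would need to tolerate duplicate events.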
Another issue concerns when and where to persist integration events. In the example discussed above,
the Conference Management bounded context publishes the events and the Orders and Registrations
bounded context handles them and uses them to populate its read model. If a failure occurs that
causes the system to lose the read-model data, then without saving the events there is no way to
recreate that read-model data.
Whether you need to persist these integration events will depend on the specific requirements and
implementation of your application. For example:
• The write side may handle the integration instead of the read side, as in the current example. The
events will then result in changes on the write side that are persisted as other events.
• Integration events may represent transient data that does not need to be persisted.
• Integration events from a CRUD-style bounded context may contain state data so that only the last
event is needed. For example, if the event from the Conference Management bounded context
includes the current seat quota, you may not be interested in previous values.
“Another approach to consider is to use an event store that many bounded contexts share. In this way,
the originating bounded context (for example the CRUD-style Conference Management bounded
context) could be responsible for persisting the integration events.”
Greg Young, conversation with the patterns & practices team
Note: To see how the current approach works, look at the OrderEventHandler class in the
Conference project.
This is a key challenge you should address if you decide to implement an event store yourself.
If you are designing a scalable event store that you plan to deploy in a distributed environment
such as Windows Azure, you must be very careful to ensure that you meet this requirement.
Favoring autonomy
The autonomous approach assigns the responsibility for calculating
the order total to the Orders and Registrations bounded context. The
Orders and Registrations bounded context is not dependent on an-
other bounded context when it needs to perform the calculation
because it already has the necessary data. At some point in the past,
it will have collected the pricing information it needs from other
bounded contexts (such as the Conference Management bounded
context) and cached it.
The advantage of this approach is that the Orders and Registra-
tions bounded context is autonomous. It doesn’t rely on the avail-
ability of another bounded context or service.
The disadvantage is that the pricing information could be out of
date. The business customer might have changed the pricing informa-
tion in the Conference Management bounded context, but that
change might not yet have reached the Orders and Registrations
bounded context.
Favoring authority
In this approach, the part of the system that calculates the order total
obtains the pricing information from the bounded contexts (such as
the Conference Management bounded context) at the point in time
that it performs the calculation. The Orders and Registrations bound-
ed context could still perform the calculation, or it could delegate the
calculation to another bounded context or service within the system.
The advantage of this approach is that the system always uses the
latest pricing information whenever it is calculating an order total.
The disadvantage is that the Orders and Registrations bounded
context is dependent on another bounded context when it needs to
determine the total for the order. It either needs to query the Conference Management bounded
context for the up-to-date pricing information, or call another service that performs the calculation.
Choosing between autonomy and authority
The choice between the two alternatives is a business decision. The specific business requirements of
your scenario should determine which approach to take. Autonomy is often the preference for large,
online systems.
This choice may change depending on the state of your system. Consider an overbooking scenario. The
autonomy strategy may optimize for the normal case when lots of conference seats are still available,
but as a particular conference fills up, the system may need to become more conservative and favor
authority, using the latest information on seat availability.
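The difference between the two strategies can be sketched in code. The example below is illustrative only and is not taken from the reference implementation: the IPriceSource interface, both implementations, and the OrderTotalCalculator class are hypothetical names, and a delegate stands in for the call to the Conference Management bounded context.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical abstraction over "where do prices come from?"
public interface IPriceSource
{
    decimal GetPrice(Guid seatTypeId);
}

// Favoring authority: ask the owning bounded context at calculation time.
public class AuthoritativePriceSource : IPriceSource
{
    private readonly Func<Guid, decimal> queryConferenceManagement; // stands in for a remote call

    public AuthoritativePriceSource(Func<Guid, decimal> queryConferenceManagement)
    {
        this.queryConferenceManagement = queryConferenceManagement;
    }

    public decimal GetPrice(Guid seatTypeId)
    {
        return this.queryConferenceManagement(seatTypeId);
    }
}

// Favoring autonomy: use a local copy that was filled from earlier notifications.
public class CachedPriceSource : IPriceSource
{
    private readonly Dictionary<Guid, decimal> prices = new Dictionary<Guid, decimal>();

    // Called when a price change notification arrives from the owning bounded context.
    public void OnPriceChanged(Guid seatTypeId, decimal price)
    {
        this.prices[seatTypeId] = price;
    }

    public decimal GetPrice(Guid seatTypeId)
    {
        return this.prices[seatTypeId]; // may be out of date, but needs no remote call
    }
}

public class OrderTotalCalculator
{
    private readonly IPriceSource prices;

    public OrderTotalCalculator(IPriceSource prices)
    {
        this.prices = prices;
    }

    // quantities maps a seat type ID to the number of seats ordered.
    public decimal Calculate(IDictionary<Guid, int> quantities)
    {
        var total = 0m;
        foreach (var line in quantities)
        {
            total += this.prices.GetPrice(line.Key) * line.Value;
        }

        return total;
    }
}
```

Because the calculator depends only on the interface, a system could switch from the cached source to the authoritative one as a conference fills up, which is one way to act on the overbooking consideration above.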
The team decided to add a note to the view page warning users
about this possibility, although a production system is likely to update
the read model faster than a debug version of the application running
locally.
So long as the registrant knows that the changes have been persisted, and that what the UI displays
could be a few seconds out of date, they are not going to be concerned.
Implementation details
This section describes some of the significant features of the implementation of the Orders and
Registrations bounded context. You may find it useful to have a copy of the code so you can follow
along. You can download a copy from the Download center, or check the evolution of the code in the
repository on GitHub: https://github.com/mspnp/cqrs-journey-code. You can download the code from
the V1 release from the Tags page on GitHub.
Note: Do not expect the code samples to match exactly the code
in the reference implementation. This chapter describes a step in
the CQRS journey; the implementation may well change as we
learn more and refactor the code.
Figure 4
Overview of the payment process
Figure 4 shows how the Orders and Registrations bounded context, the Payments bounded con-
text, and the external payments service all interact with each other. In the future, registrants will also
be able to pay by invoice instead of using a third-party payment processing service.
The registrant makes a payment as a part of the overall flow in the UI, as shown in Figure 3. The
PaymentController controller class does not display a view unless it has to wait for the system to
create the ThirdPartyProcessorPayment aggregate instance. Its role is to forward payment informa-
tion collected from the registrant to the third-party payment processor.
Typically, when you implement the CQRS pattern, you use events as the mechanism for commu-
nicating between bounded contexts. However, in this case, the RegistrationController and Payment-
Controller controller classes send commands to the Payments bounded context. The Payments
bounded context does use events to communicate with the RegistrationProcessManager instance
in the Orders and Registrations bounded context.
The Payments bounded context implements the CQRS pattern without
event sourcing.
The write-side model contains an aggregate called ThirdPartyProcessorPayment that consists of
two classes: ThirdPartyProcessorPayment and ThirdPartyProcessorPaymentItem. Instances of
these classes are persisted to a SQL Database instance by using Entity Framework. The PaymentsDb-
Context class implements an Entity Framework context.
The ThirdPartyProcessorPaymentCommandHandler implements a command handler for the
write side.
The read-side model is also implemented using Entity Framework. The PaymentDao class ex-
poses the payment data on the read side. For an example, see the GetThirdPartyProcessorPayment-
Details method.
Figure 5 illustrates the different parts that make up the read side and the write side of the Pay-
ments bounded context.
Figure 5
The read side and the write side in the Payments bounded context
Integration with online payment services, eventual consistency, and command validation
Typically, online payment services offer two levels of integration with your site:
• The simple approach, for which you don’t need a merchant account with the payments provider,
works through a simple redirect mechanism. You redirect your customer to the payment service.
The payment service takes the payment, and then redirects the customer back to a page on
your site along with an acknowledgement code.
• The more sophisticated approach, for which you do need a merchant account, is based on an
API. It typically executes in two steps. First, the payment service verifies that your customer can
pay the required amount, and sends you a token. Second, you can use the token within a fixed
time to complete the payment by sending the token back to the payment service.
The abstract base class of the Order class defines the Update method. The following code sample
shows this method and the Id and Version properties in the EventSourced class.
private readonly Guid id;
private int version = -1;
The Update method sets the Id and increments the version of the
aggregate. It also determines which of the event handlers in the ag-
gregate it should invoke to handle the event type.
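The behavior described for the Update method can be illustrated with a simplified, self-contained sketch. This is not the EventSourced class from the reference implementation: the type names ending in Sketch, the SeatsChosen event, and the Handles registration method are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical base type for events that record their source and version.
public abstract class VersionedEventSketch
{
    public Guid SourceId { get; set; }
    public int Version { get; set; }
}

// Simplified sketch of an event-sourced base class.
public abstract class EventSourcedSketch
{
    private readonly Dictionary<Type, Action<VersionedEventSketch>> handlers =
        new Dictionary<Type, Action<VersionedEventSketch>>();
    private readonly List<VersionedEventSketch> pendingEvents =
        new List<VersionedEventSketch>();
    private readonly Guid id;
    private int version = -1;

    protected EventSourcedSketch(Guid id)
    {
        this.id = id;
    }

    public Guid Id { get { return this.id; } }
    public int Version { get { return this.version; } }
    public IEnumerable<VersionedEventSketch> Events { get { return this.pendingEvents; } }

    // Derived classes register one handler method per event type.
    protected void Handles<TEvent>(Action<TEvent> handler) where TEvent : VersionedEventSketch
    {
        this.handlers.Add(typeof(TEvent), e => handler((TEvent)e));
    }

    // Stamps the event with the aggregate ID and the next version, invokes the
    // handler registered for the event's type, and queues the event for saving.
    protected void Update(VersionedEventSketch e)
    {
        e.SourceId = this.id;
        e.Version = this.version + 1;
        this.handlers[e.GetType()].Invoke(e);
        this.version = e.Version;
        this.pendingEvents.Add(e);
    }
}

// Hypothetical event and aggregate used to exercise the base class.
public class SeatsChosen : VersionedEventSketch
{
    public int Quantity { get; set; }
}

public class OrderSketch : EventSourcedSketch
{
    public OrderSketch(Guid id) : base(id)
    {
        this.Handles<SeatsChosen>(this.OnSeatsChosen);
    }

    public int Quantity { get; private set; }

    // Command method: raises an event instead of changing state directly.
    public void ChooseSeats(int quantity)
    {
        this.Update(new SeatsChosen { Quantity = quantity });
    }

    // Event handler method: the only place where state changes.
    private void OnSeatsChosen(SeatsChosen e)
    {
        this.Quantity = e.Quantity;
    }
}
```

Routing every state change through a handler method keeps the aggregate's behavior identical whether an event is newly raised by a command or, in a fuller implementation, replayed from storage.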
The following code sample shows the event handler methods in the Order class that are invoked
when the command methods shown above are called.
private void OnOrderPartiallyReserved(OrderPartiallyReserved e)
{
    this.seats = e.Seats.ToList();
}
if (order != null)
{
order.MarkAsReserved(command.Expiration, command.Seats);
repository.Save(order);
}
}
The following code sample shows the initial simple implementation of the Save method in the Sql-
EventSourcedRepository class.
Note: These examples refer to a SQL Server-based event store. This was the initial approach that
was later replaced with an implementation based on Windows Azure table storage. The SQL Server-
based event store remains in the solution as a convenience; you can run the application locally and
use this implementation to avoid any dependencies on Windows Azure.
public void Save(T eventSourced)
{
    // TODO: guarantee that only incremental versions of the event are stored
    var events = eventSourced.Events.ToArray();
    using (var context = this.contextFactory.Invoke())
    {
        foreach (var e in events)
        {
            using (var stream = new MemoryStream())
            {
                this.serializer.Serialize(stream, e);
                var serialized = new Event
                {
                    AggregateId = e.SourceId,
                    Version = e.Version,
                    Payload = stream.ToArray()
                };
                context.Set<Event>().Add(serialized);
            }
        }
        context.SaveChanges();
    }
We later found that using event sourcing and being able to replay
events was invaluable as a technique for analyzing bugs in the
production system running in the cloud. We could make a local copy
of the event store, then replay the event stream locally and debug the
application in Visual Studio to understand exactly what happened in
the production system.
The following code sample from the OrderCommandHandler class shows how calling the Find
method in the repository initiates this process.
public void Handle(MarkSeatsAsReserved command)
{
    var order = repository.Find(command.OrderId);
    ...
}
if (deserialized.Any())
{
return entityFactory.Invoke(id, deserialized);
}
return null;
}
}
The following code sample shows the constructor in the Order class that rebuilds the state of the
order from its event stream when it is invoked by the Invoke method in the previous code sample.
public Order(Guid id, IEnumerable<IVersionedEvent> history) : this(id)
{
    this.LoadFrom(history);
}
The LoadFrom method is defined in the EventSourced class, as shown in the following code sample.
For each stored event in the history, it determines the appropriate handler method to invoke in the
Order class and updates the version number of the aggregate instance.
Note: This code sample also illustrates how a duplicate key error is used to identify a concurrency
error.
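The idea of treating a duplicate key as a concurrency error can be shown with a small in-memory sketch. It is an assumption-laden illustration rather than the repository code: the VersionKeyedEventStore and ConcurrencyException names are hypothetical, and a dictionary keyed by aggregate ID and version stands in for a table whose key is made up of those two values.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical exception type raised when two writers save the same version.
public class ConcurrencyException : Exception
{
    public ConcurrencyException(string message, Exception inner)
        : base(message, inner)
    {
    }
}

// Hypothetical store in which (aggregate ID, version) must be unique.
public class VersionKeyedEventStore
{
    private readonly Dictionary<Tuple<Guid, int>, string> rows =
        new Dictionary<Tuple<Guid, int>, string>();

    public void Save(Guid aggregateId, int version, string payload)
    {
        try
        {
            // Dictionary.Add throws ArgumentException when the key already exists,
            // which plays the role of the storage layer's duplicate key error.
            this.rows.Add(Tuple.Create(aggregateId, version), payload);
        }
        catch (ArgumentException ex)
        {
            throw new ConcurrencyException(
                "Version " + version + " of aggregate " + aggregateId + " was already saved.",
                ex);
        }
    }
}
```

If two handlers load the same aggregate at version 3 and both try to save an event as version 4, the second save fails in this sketch, and the caller can reload the aggregate and retry.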
The Save method in the repository class is shown below. This method is invoked by the event handler
classes, invokes the Save method shown in the previous code sample, and invokes the SendAsync
method of the EventStoreBusPublisher class.
public void Save(T eventSourced)
{
    var events = eventSourced.Events.ToArray();
    var serialized = events.Select(this.Serialize);
    // ... (this excerpt omits the lines that define partitionKey and save the serialized events)
    this.publisher.SendAsync(partitionKey);
}
In the case of a failure, the system must include a mechanism for scanning all of the partitions in
table storage for aggregates with unpublished events and then publishing those events. This process
will take some time to run, but will only need to run when the application restarts.
Calculating totals
To ensure its autonomy, the Orders and Registrations bounded context calculates order totals without
accessing the Conference Management bounded context. The Conference Management bounded context
is responsible for maintaining the prices of seats for conferences.
You should also ensure that the domain expert attends bug triage meetings. He or she can help
clarify the expected behavior of the system, and during the discussion may uncover new user stories.
For example, during the triage of a bug related to unpublishing a conference in the Conference Man-
agement bounded context, the domain expert identified a requirement to allow the business cus-
tomer to add a redirect link for the unpublished conference to a new conference or alternate page.
Summary
During this stage of our journey, we completed our first pseudo-production release of the Contoso
Conference Management System. It now comprises several integrated bounded contexts, has a more
polished UI, and uses event sourcing in the Orders and Registrations bounded context.
There is still more work for us to do, and the next chapter will describe the next stage in our
CQRS journey as we head towards the V2 release and address the issues associated with versioning
our system.
More information
All links in this book are accessible from the book’s online bibliography available at:
http://msdn.microsoft.com/en-us/library/jj619274.
Journey 6: Versioning Our System
The top-level goal for this stage in the journey is to learn about how to upgrade a system that includes
bounded contexts that implement the CQRS pattern and event sourcing. The user stories that the
team implemented in this stage of the journey involve both changes to the code and changes to the
data: some existing data schemas changed and new data schemas were added. In addition to upgrading
the system and migrating the data, the team planned to do the upgrade and migration with no
downtime for the live system running in Windows Azure.
User stories
The team implemented the following user stories during this phase of
the project.
Architecture
The application is designed to deploy to Windows Azure. At this stage
in the journey, the application consists of web roles that contain the
ASP.NET MVC web applications and a worker role that contains the
message handlers and domain objects. The application uses Windows
Azure SQL Database (SQL Database) instances for data storage, both
on the write side and the read side. The application uses the Windows
Azure Service Bus to provide its messaging infrastructure. Figure 1
shows this high-level architecture.
Figure 1
The top-level architecture in the V2 release
While you are exploring and testing the solution, you can run it locally, either using the Windows
Azure compute emulator or by running the MVC web application directly and running a console ap-
plication that hosts the handlers and domain objects. When you run the application locally, you can
use a local SQL Server Express database instead of SQL Database, and use a simple messaging infra-
structure implemented in a SQL Server Express database.
For more information about the options for running the application, see Appendix 1, “Release
Notes.”
Figure 2
Using one subscription per event handler
5. The EventProcessor instance catches the exception and abandons the event message. The
message is automatically put back into the subscription.
6. The EventProcessor instance receives the OrderPlaced event from the all subscription for a
second time.
7. It invokes the two Handle methods, causing the RegistrationProcessManagerRouter class
to retry the message and the OrderViewModelGenerator class to process the message a
second time.
8. Every time the RegistrationProcessManagerRouter class throws an exception, the
OrderViewModelGenerator class processes the event.
In the V2 model, if a handler class throws an exception, the EventProcessor instance puts the event
message back on the subscription associated with that handler class. The retry logic now only causes
the EventProcessor instance to retry the handler that raised the exception, so no other handlers
reprocess the message.
The first option is not always viable. In this particular case it would
work because the same team is implementing both bounded contexts
and the infrastructure, making it easy to use a shared event store.

Although from a purist's perspective the first option breaks the strict
isolation between bounded contexts, in some scenarios it may be an
acceptable and pragmatic solution.

A possible risk with the third option is that the set of events that
are needed may change in the future. If we don't save events now,
they are lost for good.
Although the fifth option stores all the commands and events,
some of which you might never need to refer to again, it does provide
a complete log of everything that happens in the system. This could
be useful for troubleshooting, and also helps you to meet requirements
that have not yet been identified. The team chose this option
over option two because it offers a more general-purpose mechanism
that may have future benefits.
The purpose of persisting the events is to enable them to be
played back when the Orders and Registrations bounded context
needs the information about current seat quotas in order to calculate
the number of remaining seats. To calculate these numbers consistently,
you must always play the events back in the same order. There
are several choices for this ordering:
• The order in which the events were sent by the Conference
Management bounded context.
• The order in which the events were received by the Orders and
Registrations bounded context.
• The order in which the events were processed by the Orders
and Registrations bounded context.
Most of the time these orderings will be the same. There is no correct
order; you just need to choose one to be consistent. Therefore, the
choice is determined by simplicity. In this case, the simplest approach
is to persist the events in the order that the handler in the Orders and
Registrations bounded context receives them (the second option).

This choice does not typically arise with event sourcing. Each aggregate
creates events in a fixed order, and that is the order that the system uses
to persist the events. In this scenario, the integration events are not
created by a single aggregate.

There is a similar issue with saving timestamps for these events.
Timestamps may be useful in the future if there is a requirement to
look at the number of remaining seats at a particular time. The choice
here is whether you should create a timestamp when the event is created
in the Conference Management bounded context or when it is
received by the Orders and Registrations bounded context. It's possible
that the Orders and Registrations bounded context is offline for some
reason when the Conference Management bounded context creates
an event; therefore, the team decided to create the timestamp when
the Conference Management bounded context publishes the event.
Message ordering
The acceptance tests that the team created and ran to verify the V1
release highlighted a potential issue with message ordering: the
acceptance tests that exercised the Conference Management bounded
context sent a sequence of commands to the Orders and Registrations
bounded context that sometimes arrived out of order.

This effect was not noticed when a human user tested this part of the
system because the time delay between the times that the commands were
sent was much greater, making it less likely that the messages would
arrive out of order.

The team considered two alternatives for ensuring that messages
arrive in the correct order.
• The first option is to use message sessions, a feature of the
Windows Azure Service Bus. If you use message sessions, this
guarantees that messages within a session are delivered in the
same order that they were sent.
• The second alternative is to modify the handlers within the
application to detect out-of-order messages through the use of
sequence numbers or timestamps added to the messages when
they are sent. If the receiving handler detects an out-of-order
message, it rejects the message and puts it back onto the queue
or topic to be processed later, after it has processed the
messages that were sent before the rejected message.
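
The second alternative can be sketched as follows. This is a minimal illustration rather than code from the reference implementation: the SequencedMessage and OrderedHandler types, and the assumption that each source numbers its messages from 1, are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical message shape: SourceId and SequenceNumber are illustrative names.
public sealed class SequencedMessage
{
    public SequencedMessage(string sourceId, long sequenceNumber, string body)
    {
        SourceId = sourceId;
        SequenceNumber = sequenceNumber;
        Body = body;
    }

    public string SourceId { get; }
    public long SequenceNumber { get; }
    public string Body { get; }
}

public sealed class OrderedHandler
{
    // Last sequence number processed for each source.
    private readonly Dictionary<string, long> lastProcessed =
        new Dictionary<string, long>();

    public List<string> Processed { get; } = new List<string>();

    // Returns true when the message needs no further delivery (it was processed,
    // or it was a duplicate). Returns false when it arrived too early and the
    // caller should put it back on the queue or topic to be retried later.
    public bool TryHandle(SequencedMessage message)
    {
        long last;
        if (!lastProcessed.TryGetValue(message.SourceId, out last))
        {
            last = 0; // Assumption: each source numbers its messages from 1.
        }

        if (message.SequenceNumber <= last)
        {
            return true; // Already processed: safe to drop.
        }

        if (message.SequenceNumber > last + 1)
        {
            return false; // An earlier message is still missing: reject.
        }

        Processed.Add(message.Body);
        lastProcessed[message.SourceId] = message.SequenceNumber;
        return true;
    }
}
```

In this sketch a rejected message is simply reported to the caller; with a real queue or topic the caller would abandon the message so that it is delivered again after the earlier messages have been handled.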
Implementation details
This section describes some of the significant features of the
implementation of the Orders and Registrations bounded context. You may
find it useful to have a copy of the code so you can follow along. You
can download a copy from the Download center, or check the evolution
of the code in the repository on GitHub:
https://github.com/mspnp/cqrs-journey-code. You can download the code
from the V2 release from the Tags page on GitHub.
Note: Do not expect the code samples to exactly match the code
in the reference implementation. This chapter describes a step in
the CQRS journey; the implementation may well change as we
learn more and refactor the code.
[HttpPost]
public ActionResult SpecifyRegistrantAndPaymentDetails(
    AssignRegistrantDetails command,
    string paymentType,
    int orderVersion)
{
    ...
    switch (paymentType)
    {
        case ThirdPartyProcessorPayment:
            return CompleteRegistrationWithThirdPartyProcessorPayment(
                command,
                pricedOrder,
                orderVersion);
        case InvoicePayment:
            break;
        default:
            break;
    }
    ...
}
Data migration
The Conference Management bounded context stores order information
from the Orders and Registrations bounded context in the
PricedOrders table in its Windows Azure SQL Database instance.
Previously, the Conference Management bounded context received
the OrderPaymentConfirmed event; now it receives the OrderConfirmed
event that contains an additional IsFreeOfCharge property.
This becomes a new column in the database.

We didn't need to modify the existing data in this table during the
migration because the default value for a Boolean is false. All of the
existing entries were created before the system supported zero-cost orders.

During the migration, any in-flight ConfirmOrderPayment commands
could be lost because they are no longer handled by the Order
aggregate. You should verify that none of these commands are
currently on the command bus.

We need to plan carefully how to deploy the V2 release so that we can be
sure that all the existing, in-flight ConfirmOrderPayment commands are
processed by a worker role instance running the V1 release.

The system persists the state of RegistrationProcessManager
class instances to a SQL Database table. There are no changes to the
schema of this table. The only change you will see after the migration
is an additional value in the StateValue column. This reflects the
additional PaymentConfirmationReceived value in the ProcessState
enumeration in the RegistrationProcessManager class, as shown in
the following code sample:
public enum ProcessState
{
    NotStarted = 0,
    AwaitingReservationConfirmation = 1,
    ReservationConfirmationReceived = 2,
    PaymentConfirmationReceived = 3,
}
In the V1 release, the events that the event sourcing system persisted
for the Order aggregate included the OrderPaymentConfirmed
event. Therefore, the event store contains instances of this event
type. In the V2 release, the OrderPaymentConfirmed event is
replaced with the OrderConfirmed event.
The team decided for the V2 release not to introduce mapping
and filtering events at the infrastructure level when events are
deserialized. This means that the handlers must understand both the old and
new events when the system replays these events from the event
store. The following code sample shows this in the
SeatAssignmentsHandler class:
static SeatAssignmentsHandler()
{
    Mapper.CreateMap<OrderPaymentConfirmed, OrderConfirmed>();
}
public SeatAssignmentsHandler(
    IEventSourcedRepository<Order> ordersRepo,
    IEventSourcedRepository<SeatAssignments> assignmentsRepo)
{
    this.ordersRepo = ordersRepo;
    this.assignmentsRepo = assignmentsRepo;
}
You can also see the same technique in use in the
OrderViewModelGenerator class.
The approach is slightly different in the Order class because this
is one of the events that is persisted to the event store. The following
code sample shows part of the protected constructor in the Order
class:
protected Order(Guid id)
    : base(id)
{
    ...
    base.Handles<OrderPaymentConfirmed>(e =>
        this.OnOrderConfirmed(Mapper.Map<OrderConfirmed>(e)));
    base.Handles<OrderConfirmed>(this.OnOrderConfirmed);
    ...
}

Handling the old events in this way was straightforward for this scenario
because the only change needed was to the name of the event. It would be
more complicated if the properties of the event changed as well. In the
future, Contoso will consider doing the mapping in the infrastructure to
avoid polluting the domain model with legacy events.
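
To make the replay behavior concrete, here is a minimal, self-contained sketch of an aggregate that accepts both the old and the new event type. It is an illustration under stated assumptions: OrderSketch, its handler table, and the LoadFrom method are hypothetical stand-ins for the event-sourced base class, and the old event is translated by hand instead of with a mapping library.

```csharp
using System;
using System.Collections.Generic;

// New V2 event.
public class OrderConfirmed
{
    public Guid SourceId { get; set; }
}

// Legacy V1 event that may still exist in the event store.
public class OrderPaymentConfirmed
{
    public Guid SourceId { get; set; }
}

public class OrderSketch
{
    private readonly Dictionary<Type, Action<object>> handlers =
        new Dictionary<Type, Action<object>>();

    public OrderSketch()
    {
        // Old event: translate it to the new type, then reuse the new handler.
        Handles<OrderPaymentConfirmed>(e =>
            OnOrderConfirmed(new OrderConfirmed { SourceId = e.SourceId }));
        Handles<OrderConfirmed>(OnOrderConfirmed);
    }

    public bool IsConfirmed { get; private set; }

    // Replays persisted events, dispatching each one by its runtime type.
    public void LoadFrom(IEnumerable<object> pastEvents)
    {
        foreach (var pastEvent in pastEvents)
        {
            handlers[pastEvent.GetType()](pastEvent);
        }
    }

    private void Handles<TEvent>(Action<TEvent> handler)
    {
        handlers[typeof(TEvent)] = e => handler((TEvent)e);
    }

    private void OnOrderConfirmed(OrderConfirmed e)
    {
        IsConfirmed = true;
    }
}
```

Because both registrations end in the same OnOrderConfirmed method, a stream that contains only V1 events rehydrates to the same state as a stream that contains the V2 event.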
Note: The ConferenceViewModelGenerator class does not use the SeatCreated and
SeatUpdated events.
The ConferenceViewModelGenerator class in the Orders and Registrations bounded context now
handles these events and uses them to calculate and store the information about seat type quantities
in the read model. The following code sample shows the relevant handlers in the
ConferenceViewModelGenerator class:
public void Handle(AvailableSeatsChanged @event)
{
    this.UpdateAvailableQuantity(@event, @event.Seats);
}
// The signature of UpdateAvailableQuantity is not shown in this excerpt; its body follows.
{
    using (var repository = this.contextFactory.Invoke())
    {
        var dto = repository.Set<Conference>()
            .Include(x => x.Seats)
            .FirstOrDefault(x => x.Id == @event.SourceId);
        if (dto != null)
        {
            // Apply the event only if it is newer than the last one applied.
            if (@event.Version > dto.SeatsAvailabilityVersion)
            {
                foreach (var seat in seats)
                {
                    var seatDto = dto.Seats
                        .FirstOrDefault(x => x.Id == seat.SeatType);
                    if (seatDto != null)
                    {
                        seatDto.AvailableQuantity += seat.Quantity;
                    }
                    else
                    {
                        Trace.TraceError(
                            "Failed to locate Seat Type read model being updated with id {0}.",
                            seat.SeatType);
                    }
                }
                dto.SeatsAvailabilityVersion = @event.Version;
                repository.Save(dto);
            }
            else
            {
                Trace.TraceWarning ...
            }
        }
        else
        {
            Trace.TraceError ...
        }
    }
}
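
The version comparison in the handler above is what makes redelivered events harmless. The following is a minimal in-memory sketch of that idea, not the read-model code itself: ConferenceReadModel, SeatQuantityChange, and the Apply method are hypothetical names, and a dictionary stands in for the database tables.

```csharp
using System;
using System.Collections.Generic;

public class SeatQuantityChange
{
    public Guid SeatType { get; set; }
    public int Quantity { get; set; }
}

public class ConferenceReadModel
{
    // Version of the last event applied; -1 means nothing has been applied yet.
    public int SeatsAvailabilityVersion { get; private set; } = -1;

    public Dictionary<Guid, int> AvailableQuantity { get; } =
        new Dictionary<Guid, int>();

    // Applies the changes only if this event version has not been seen before.
    // Returns false for a duplicate or stale event, which is ignored.
    public bool Apply(int eventVersion, IEnumerable<SeatQuantityChange> changes)
    {
        if (eventVersion <= SeatsAvailabilityVersion)
        {
            return false;
        }

        foreach (var change in changes)
        {
            int current;
            AvailableQuantity.TryGetValue(change.SeatType, out current);
            AvailableQuantity[change.SeatType] = current + change.Quantity;
        }

        SeatsAvailabilityVersion = eventVersion;
        return true;
    }
}
```

Without the version check, delivering the same event twice would add the quantity twice, because the update is a relative change and not an absolute value.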
return viewModel;
}
Data migration
The database table that holds the conference read-model data now has a new column to hold the
version number that is used to check for duplicate events, and the table that holds the seat type
read-model data now has a new column to hold the available quantity of seats.
As part of the data migration, it is necessary to replay all of the events in the event store for each
of the SeatsAvailability aggregates in order to correctly calculate the available quantities.
The BuildMessage method in the CommandBus class uses the command Id to create a unique
message Id that the Windows Azure Service Bus can use to detect duplicates:
private BrokeredMessage BuildMessage(Envelope command)
{
    var stream = new MemoryStream();
    ...
    ...
    return message;
}
try
{
namespaceManager.CreateSubscription(subscriptionDescription);
}
catch (MessagingEntityAlreadyExistsException) { }
}
The following code sample from the SessionSubscriptionReceiver class shows how to use sessions
to receive messages:
private void ReceiveMessages(CancellationToken cancellationToken)
{
    while (!cancellationToken.IsCancellationRequested)
    {
        MessageSession session;
        try
        {
            session =
                this.receiveRetryPolicy.ExecuteAction(this.DoAcceptMessageSession);
        }
        catch (Exception e)
        {
            ...
        }
        if (session == null)
        {
            Thread.Sleep(100);
            continue;
        }
        while (!cancellationToken.IsCancellationRequested)
        {
            BrokeredMessage message = null;
            try
            {
                try
                {
                    message = this.receiveRetryPolicy.ExecuteAction(
                        () => session.Receive(TimeSpan.Zero));
                }
                catch (Exception e)
                {
                    ...
                }
                if (message == null)
                {
                    // If we have no more messages for this session,
                    // exit and try another.
                    break;
                }
In this way, you can guarantee that all of the messages from an
individual source will be received in the correct order.

You may find it useful to compare this version of the ReceiveMessages
method that uses message sessions with the original version in the
SubscriptionReceiver class.
In the V2 release, the team changed the way the system creates the Windows Azure Service Bus topics and
subscriptions. Previously, the SubscriptionReceiver class created them if they didn’t exist already. Now, the system
creates them using configuration data when the application starts up. This happens early in the start-up process to
avoid the risk of losing messages if one is sent to a topic before the system initializes the subscriptions.
resetEvent.WaitOne();
if (exception != null)
{
throw exception;
}
}
The Kind property specifies whether the message is a command or an event. The MessageId
and CorrelationId properties are set by the messaging infrastructure. The remaining properties are
set from the message metadata.
The following code sample shows the definition of the partition and row keys for these messages:
PartitionKey = message.EnqueuedTimeUtc.ToString("yyyMM"),
RowKey = message.EnqueuedTimeUtc.Ticks.ToString("D20") + "_" + message.MessageId
Notice how the row key preserves the order in which the messages
were originally sent and adds on the message ID to guarantee uniqueness
just in case two messages were enqueued at exactly the same time.
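
The following sketch shows why this key layout sorts chronologically. It reuses the two format strings from the sample above, but the MessageLogKeys class and its method names are hypothetical, and the use of CultureInfo.InvariantCulture is an assumption added here to make the output independent of the machine's culture settings.

```csharp
using System;
using System.Globalization;

public static class MessageLogKeys
{
    // One partition per calendar month, for example "201207" for July 2012.
    public static string PartitionKey(DateTime enqueuedTimeUtc)
    {
        return enqueuedTimeUtc.ToString("yyyMM", CultureInfo.InvariantCulture);
    }

    // Zero-padding the tick count to 20 digits makes ordinal string order
    // match chronological order; the message id breaks ties.
    public static string RowKey(DateTime enqueuedTimeUtc, string messageId)
    {
        return enqueuedTimeUtc.Ticks.ToString("D20", CultureInfo.InvariantCulture)
            + "_" + messageId;
    }
}
```

Because every tick count is padded to the same width, comparing two row keys as strings gives the same result as comparing the enqueue times, and two messages with an identical enqueue time still receive distinct keys.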
Data migration
When Contoso migrates the system from V1 to V2, it will use the
message log to rebuild the conference and priced-order read models
in the Orders and Registrations bounded context.

Contoso can use the message log whenever it needs to rebuild the read
models that are built from events that are not associated with an
aggregate, such as the integration events from the Conference Management
bounded context.

The conference read model holds information about conferences
and contains information from the ConferenceCreated, ConferenceUpdated,
ConferencePublished, ConferenceUnpublished, SeatCreated, and
SeatUpdated events that come from the Conference
Management bounded context.
The priced-order read model holds information from the SeatCreated
and SeatUpdated events that come from the Conference
Management bounded context.
However, in V1, these event messages were not persisted, so the
read models cannot be repopulated in V2. To work around this problem,
the team implemented a data migration utility that uses a best-effort
approach to generate events that contain the missing data to
store in the message log. For example, after the migration to V2, the
message log does not contain any ConferenceCreated events, so the
migration utility finds this information in the database used by the
Conference Management bounded context and creates the missing
events. You can see how this is done in the
GeneratePastEventLogMessagesForConferenceManagement method in the
Migrator class in the MigrationToV2 project.
The RegenerateViewModels method in the Migrator class
shown below rebuilds the read models. It retrieves all the events from
the message log by invoking the Query method, and then uses the
ConferenceViewModelGenerator and PricedOrderViewModelUpdater
classes to handle the messages.
    Database.SetInitializer<ConferenceRegistrationDbContext>(null);
    try
    {
        var dispatcher = new MessageDispatcher(handlers);
        var events = logReader.Query(new QueryCriteria { });
        dispatcher.DispatchMessages(events);
    }
    catch
    {
        using (var context =
            new ConferenceRegistrationMigrationDbContext(dbConnectionString))
        {
            context.RollbackTablesMigration();
        }
        throw;
    }
}

The original events are not updated in any way and are treated as being immutable.

Impact on testing
During this stage of the journey, the test team continued to expand
the set of acceptance tests. They also created a set of tests to verify
the data migration process.
SpecFlow revisited
Previously, the set of SpecFlow tests was implemented in two ways:
either simulating user interaction by automating a web browser, or by
operating directly on the MVC controllers. Both approaches had their
advantages and disadvantages, which are discussed in Chapter 4,
“Extending and Enhancing the Orders and Registrations Bounded Contexts.”
After discussing these tests with another expert, the team also
implemented a third approach. From the perspective of the
domain-driven design (DDD) approach, the UI is not part of the domain
model, and the focus of the core team should be on understanding the
domain with the help of the domain expert and implementing the
business logic in the domain. The UI is just the mechanical part added
to enable users to interact with the domain. Acceptance testing should
therefore include verifying that the domain model functions in
the way that the domain expert expects. For this reason, the team created
a set of acceptance tests using SpecFlow that are designed to exercise
the domain without the distraction of the UI parts of the system.
The following code sample shows the
SelfRegistrationEndToEndWithDomain.feature file in the
Features\Domain\Registration folder in the Conference.AcceptanceTests
Visual Studio solution. Notice how the When and Then clauses use
commands and events.
Typically, you would expect the When clauses to send commands and
the Then clauses to see events or exceptions if your domain model uses
just aggregates. However, in this example, the domain model includes
a process manager that responds to events by sending commands. The
test is checking that all of the expected commands are sent and all of
the expected events are raised.
Feature: Self Registrant end to end scenario for making a Registration for
a Conference site with Domain Commands and Events
In order to register for a conference
As an Attendee
I want to be able to register for the conference, pay for the
Registration Order and associate myself with the paid Order automatically
The following code sample shows some of the step implementations for the feature file. The steps
use the command bus to send the commands.
[When(@"the Registrant proceed to make the Reservation")]
public void WhenTheRegistrantProceedToMakeTheReservation()
{
registerToConference = ScenarioContext.Current.Get<RegisterToConference>();
var conferenceAlias = ScenarioContext.Current.Get<ConferenceAlias>();
registerToConference.ConferenceId = conferenceAlias.Id;
orderId = registerToConference.OrderId;
this.commandBus.Send(registerToConference);
Assert.NotNull(order);
Assert.Equal(orderId, order.Id);
}
Assert.NotNull(orderPlaced);
Assert.True(orderPlaced.Seats.All(os =>
registerToConference.Seats.Count(cs =>
cs.SeatType == os.SeatType && cs.Quantity == os.Quantity) == 1));
}

Testing the migration process not only verifies that the migration runs as
expected, but potentially reveals bugs in the application itself.

Summary
During this stage of our journey, we versioned our system and
completed the V2 pseudo-production release. This new release included
some additional functionality and features, such as support for
zero-cost orders and more information displayed in the UI.
We also made some changes in the infrastructure. For example,
we made more messages idempotent and now persist integration
events. The next chapter describes the final stage of our journey as we
continue to enhance the infrastructure and harden the system in
preparation for our V3 release.
More information
All links in this book are accessible from the book’s online bibliography
available at: http://msdn.microsoft.com/en-us/library/jj619274.
Journey 7: Adding Resilience and Optimizing Performance
The three primary goals for this last stage in our journey are to make the system more resilient to
failures, to improve the responsiveness of the UI, and to ensure that our design is scalable. The effort
to harden the system focuses on the RegistrationProcessManager class in the Orders and
Registrations bounded context. Performance improvement efforts are focused on the way the UI interacts
with the domain model during the order creation process.
Snapshots. Snapshots are an optimization that you can apply to event sourcing; instead of
replaying all of the persisted events associated with an aggregate when it is rehydrated, you load a recent
copy of the state of the aggregate and then replay only the events that were persisted after saving the
snapshot. In this way you can reduce the amount of data that you must load from the event store.
Idempotency. Idempotency is a characteristic of an operation that means the operation can be
applied multiple times without changing the result. For example, the operation “set the value x to ten”
is idempotent, while the operation “add one to the value of x” is not. In a messaging environment, a
message is idempotent if it can be delivered multiple times without changing the result: either because
of the nature of the message itself, or because of the way the system handles the message.
Eventual consistency. Eventual consistency is a consistency model that does not guarantee
immediate access to updated values. After an update to a data object, the storage system does not
guarantee that subsequent accesses to that object will return the updated value. However, the storage
system does guarantee that if no new updates are made to the object during a sufficiently long period
of time, then eventually all accesses can be expected to return the last updated value.
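
The snapshot optimization defined above can be sketched as follows. This is an illustrative example only: the SeatsChanged, SeatsSnapshot, and SeatsAvailabilitySketch types are hypothetical, and the events are filtered in memory, whereas a real event store would be asked to return only the events persisted after the snapshot version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical event: a signed change to the number of available seats.
public class SeatsChanged
{
    public int Version { get; set; }
    public int Delta { get; set; }
}

// Hypothetical snapshot: the aggregate state as of a given version.
public class SeatsSnapshot
{
    public int Version { get; set; }
    public int Available { get; set; }
}

public class SeatsAvailabilitySketch
{
    public int Version { get; private set; }
    public int Available { get; private set; }

    // Rehydrates from an optional snapshot, then replays only the later events.
    public static SeatsAvailabilitySketch Rehydrate(
        SeatsSnapshot snapshot, IEnumerable<SeatsChanged> persistedEvents)
    {
        var aggregate = new SeatsAvailabilitySketch();
        if (snapshot != null)
        {
            aggregate.Version = snapshot.Version;
            aggregate.Available = snapshot.Available;
        }

        // Capture the starting version before replaying, because the loop
        // below advances aggregate.Version.
        var fromVersion = aggregate.Version;
        var laterEvents = persistedEvents
            .Where(candidate => candidate.Version > fromVersion)
            .OrderBy(candidate => candidate.Version)
            .ToList();

        foreach (var seatsChanged in laterEvents)
        {
            aggregate.Available += seatsChanged.Delta;
            aggregate.Version = seatsChanged.Version;
        }

        return aggregate;
    }
}
```

Rehydrating with a snapshot and rehydrating from the full event stream should produce the same state; the snapshot only reduces how many events have to be loaded and applied.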
Architecture
The application is designed to deploy to Windows Azure. At this stage in the journey, the application
consists of web roles that contain the ASP.NET MVC web applications and a worker role that contains
the message handlers and domain objects. The application uses Windows Azure SQL Database (SQL
Database) instances for data storage, both on the write side and the read side. The application also
uses Windows Azure table storage on the write side and blob storage on the read side in some places.
The application uses the Windows Azure Service Bus to provide its messaging infrastructure. Figure
1 shows this high-level architecture.
Figure 1
The top-level architecture in the V3 release
While you are exploring and testing the solution, you can run it locally, either using the Windows
Azure compute emulator or by running the MVC web application directly and running a console
application that hosts the handlers and domain objects. When you run the application locally, you can
use a local SQL Server Express database instead of SQL Database, and use a simple messaging
infrastructure implemented in a SQL Server Express database.
For more information about the options for running the application, see Appendix 1, “Release
Notes.”
Adding resilience
During this stage of the journey the team looked at options for hardening the
RegistrationProcessManager class. This class is responsible for managing the interactions between the
aggregates in the Orders and Registrations bounded context and for ensuring that they are all
consistent with each other. It is important that this process manager is resilient to a wide range of
failure conditions if the bounded context as a whole is to maintain its consistent state.
Optimizing performance
During this stage of the journey we ran performance and stress tests
using Visual Studio 2010 to analyze response times and identify
bottlenecks. The team used Visual Studio Load Test to simulate different
numbers of users accessing the application, and added additional
tracing into the code to record timing information for detailed analysis.
The team created the performance test environment in Windows
Azure, running the test controller and test agents in Windows Azure
VM role instances. This enabled us to test how the Contoso Conference
Management System performed under different loads by using
the test agents to simulate different numbers of virtual users.
As a result of this exercise, the team made a number of changes
to the system to optimize its performance.

Although in this journey the team did their performance testing and
optimization work at the end of the project, it typically makes sense to
do this work as you go, addressing scalability issues and hardening the
code as soon as possible. This is especially true if you are building your
own infrastructure and need to be able to handle high volumes of throughput.

UI flow before optimization
When a registrant creates an order, she visits the following sequence
of screens in the UI.
1. The register screen. This screen displays the ticket types for
the conference and the number of seats currently available
according to the eventually consistent read model. The
registrant selects the quantities of each seat type that she
would like to purchase.
2. The checkout screen. This screen displays a summary of the
order that includes a total price and a countdown timer that
tells the registrant how long the seats will remain reserved.
The registrant enters her details and preferred payment
method.
3. The payment screen. This simulates a third-party payment
processor.
4. The registration success screen. This displays if the payment
succeeded. It displays to the registrant an order locator code
and link to a screen that enables the registrant to assign
attendees to seats.
See the section “Task-based UI” in Chapter 5, “Preparing for the V1
Release” for more information about the screens and flow in the UI.
In the V2 release, the system must process the following commands
and events between the register screen and the checkout
screen:
• RegisterToConference
• OrderPlaced
• MakeSeatReservation
• SeatsReserved
• MarkSeatsAsReserved
• OrderReservationCompleted
• OrderTotalsCalculated

Because implementing the CQRS pattern leads to a very clear separation
of responsibilities for the many different parts that make up the system,
we found it relatively easy to add optimizations and hardening because
many of the necessary changes were very localized within the system.
Optimizing the UI
The team discussed with the domain expert whether or not it is always
necessary to validate seat availability before the UI sends the
RegisterToConference command to the domain.
The domain expert was clear that the system should confirm that
seats are available before taking payment. Contoso does not want to
sell seats and then have to explain to a registrant that those seats are
not available. Therefore, the team looked for ways to streamline the
process up to the point where the registrant sees the payment screen.

This cautious strategy is not appropriate in all scenarios. In some cases,
the business may prefer to take the money even if it cannot immediately
fulfill the order. The business may know that the stock will be replenished
soon, or that the customer will be happy to wait. In our scenario, although
Contoso could refund the money to a registrant if tickets turned out not to
be available, a registrant may decide to purchase flight tickets that are
not refundable in the belief that the conference registration is confirmed.
This type of decision is clearly one for the business and the domain expert.

The team identified the following two optimizations to the UI
flow.
UI optimization 1
Most of the time, there are plenty of seats available for a conference
and registrants do not have to compete with each other to reserve
seats. It is only for a brief time, as the conference comes close to
selling out, that registrants do end up competing for the last few
available seats.
If there are plenty of available seats for the conference, then
there is minimal risk that a registrant will get as far as the payment
screen only to find that the system could not reserve the seats. In this
case, some of the processing that the V2 release performs before
getting to the checkout screen can be allowed to happen asynchronously
while the registrant is entering information on the checkout
screen. This reduces the chance that the registrant experiences a delay
before seeing the checkout screen.
However, if the controller checks and finds that there are not
enough seats available to fulfill the order before it sends the
RegisterToConference command, it can re-display the register screen to
enable the registrant to update her order based on current availability.

Essentially, we are relying on the fact that a reservation is likely to
succeed, avoiding a time-consuming check. We still perform the check to
ensure the seats are available before the registrant makes a payment.
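
The controller decision described in UI optimization 1 can be sketched in a few lines. This is a simplified illustration and not the MVC controller from the reference implementation: RegistrationFlowSketch, NextScreen, and the delegate that stands in for the command bus are hypothetical, and seat types are identified by plain strings.

```csharp
using System;
using System.Collections.Generic;

public enum NextScreen
{
    Register,
    Checkout
}

public class RegistrationFlowSketch
{
    // Seat counts as reported by the eventually consistent read model.
    private readonly IDictionary<string, int> availableFromReadModel;

    // Stand-in for sending the RegisterToConference command on the command bus.
    private readonly Action<IDictionary<string, int>> sendRegisterToConference;

    public RegistrationFlowSketch(
        IDictionary<string, int> availableFromReadModel,
        Action<IDictionary<string, int>> sendRegisterToConference)
    {
        this.availableFromReadModel = availableFromReadModel;
        this.sendRegisterToConference = sendRegisterToConference;
    }

    public NextScreen StartRegistration(IDictionary<string, int> requested)
    {
        foreach (var item in requested)
        {
            int available;
            availableFromReadModel.TryGetValue(item.Key, out available);
            if (item.Value > available)
            {
                // Not enough seats according to the read model: send nothing
                // and let the registrant adjust the order.
                return NextScreen.Register;
            }
        }

        // Seats look plentiful: send the command and move on without waiting
        // for the reservation to complete.
        sendRegisterToConference(requested);
        return NextScreen.Checkout;
    }
}
```

The read model may be slightly out of date, so this check is only a cheap first filter; the authoritative reservation still happens in the domain before the registrant pays.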
UI optimization 2
In the V2 release, the MVC controller cannot display the checkout
screen until the domain publishes the OrderTotalsCalculated event
and the system updates the priced-order view model. This event is the
last event that occurs before the controller can display the screen.
If the system calculates the total and updates the priced-order
view model earlier, the controller can display the checkout screen
sooner. The team determined that the Order aggregate could
calculate the total when the order is placed instead of when the
reservation is complete. This will enable the UI flow to move more quickly to
the checkout screen than in the V2 release.
The second set of optimizations that the team added in this stage of
the journey related to the infrastructure of the system. These changes
addressed both the performance and the scalability of the system.
The following sections describe the most significant changes we
made here.
Sending and receiving commands and events
asynchronously
As part of the optimization process, the team updated the system to
ensure that all messages sent on the Service Bus are sent
asynchronously. This optimization is intended to improve the overall
responsiveness of the application and improve the throughput of messages.
As part of this change, the team also used the Transient Fault Handling
Application Block to handle any transient errors encountered when
using the Service Bus.

This optimization resulted in major changes to the infrastructure code.
Combining asynchronous calls with the Transient Fault Handling Application
Block is complex; we would benefit from some of the new simplifying syntax
in C# 5.0 (released with .NET Framework 4.5)!
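
To illustrate the general idea of retrying an asynchronous send after a transient error, here is a small hand-rolled sketch. It does not use or reproduce the API of the Transient Fault Handling Application Block, and it relies on the async and await keywords rather than the callback style the team had available at the time. The TransientRetry class, its parameters, and the doubling back-off are all assumptions made for the example.

```csharp
using System;
using System.Threading.Tasks;

public static class TransientRetry
{
    // Runs the operation, retrying with a doubling delay while the caller's
    // predicate classifies the failure as transient and attempts remain.
    public static async Task RetryAsync(
        Func<Task> operation,
        Func<Exception, bool> isTransient,
        int maxAttempts,
        TimeSpan initialDelay)
    {
        var delay = initialDelay;
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                await operation();
                return;
            }
            catch (Exception e) when (attempt < maxAttempts && isTransient(e))
            {
                // Transient failure with attempts remaining: wait below, then retry.
            }

            await Task.Delay(delay);
            delay = TimeSpan.FromTicks(delay.Ticks * 2);
        }
    }
}
```

A caller would wrap its asynchronous send in the operation delegate and supply a predicate that recognizes the transient errors of the service in use. Non-transient exceptions, and the final failure once the attempts are exhausted, propagate to the caller unchanged.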
Treat the Service Bus just like any other critical component of your
system. This means you should ensure that your service bus can
be scaled. Also, remember that not all data has the same value to
your business. Just because you have a Service Bus, doesn’t mean
everything has to go through it. It’s prudent to eliminate low-value,
high-cost traffic.
No down-time migration
“Preparation, I have often said, is rightly two-thirds of any
venture.”
Amelia Earhart
The team planned to have a no-downtime migration from the V2
to the V3 release in Windows Azure. To achieve this, the migration
process uses an ad-hoc processor running in a Windows Azure worker
role to perform some of the migration steps.
The migration process still requires you to complete a configuration
step to switch off the V2 processor and switch on the V3 processor.
In retrospect, we would have used a different mechanism to
streamline the transition from the V2 to the V3 processor based on
feedback from the handlers themselves to indicate when they have
finished their processing.
For details of these steps, see Appendix 1, “Release Notes.”

Each service (Windows Azure Service Bus, SQL Database, Windows Azure
storage) has its own particular way of implementing throttling behavior
and notifying you when it is placed under heavy load. For example, see
SQL Azure Throttling. It's important to be aware of all the throttling that
your application may be subjected to by different services your
application uses.

The team also considered using the Windows Azure SQL Database Business edition instead of the
Windows Azure SQL Database Web edition but, upon investigation, we determined that at present the
only difference between the editions is the maximum database size. The different editions are not tuned
to support different types of workload, and both editions implement the same throttling behavior.
this.AddCommand(new Envelope<ICommand>(seatReservationCommand)
{
TimeToLive = expirationWindow.Add(TimeSpan.FromMinutes(1)),
});
...
}
Then, when it handles the SeatsReserved event, it checks that the CorrelationId property of the
event matches the most recent value of the SeatReservationCommandId variable, as shown in the
following code sample:
public void Handle(Envelope<SeatsReserved> envelope)
{
    if (this.State == ProcessState.AwaitingReservationConfirmation)
    {
        if (envelope.CorrelationId != null)
        {
            if (string.CompareOrdinal(
                    this.SeatReservationCommandId.ToString(),
                    envelope.CorrelationId)
                != 0)
            {
                // Skip this event.
                Trace.TraceWarning(
                    "Seat reservation response for reservation id {0} " +
                    "does not match the expected correlation id.",
                    envelope.Body.ReservationId);
                return;
            }
        }
        ...
    }
pm.Handle(@event);
context.Save(pm);
}
}
The following code sample from the SqlProcessDataContext class shows how the system
persists all the commands along with the state of the process manager:
public void Save(T process)
{
    var entry = this.context.Entry(process);
    if (entry.State == System.Data.EntityState.Detached)
        this.context.Set<T>().Add(process);
    try
    {
        this.context.SaveChanges();
    }
    catch (DbUpdateConcurrencyException e)
    {
        throw new ConcurrencyException(e.Message, e);
    }
    this.DispatchMessages(undispatched, commands);
}
The following code sample from the SqlProcessDataContext class shows how the system tries to
send the command messages:
private void DispatchMessages(UndispatchedMessages undispatched,
List<Envelope<ICommand>> deserializedCommands = null)
{
if (undispatched != null)
{
if (deserializedCommands == null)
{
deserializedCommands = this.serializer
.Deserialize<IEnumerable<Envelope<ICommand>>>(
undispatched.Commands).ToList();
}
throw;
}
The DispatchMessages method is also invoked from the Find method in the SqlProcessDataContext
class so that it tries to send any un-dispatched messages whenever the system rehydrates a
RegistrationProcessManager instance.
    if (!ModelState.IsValid)
    {
        return View(viewModel);
    }
    if (needsExtraValidation)
    {
        return View(viewModel);
    }
    command.ConferenceId = this.ConferenceAlias.Id;
    this.commandBus.Send(command);
    return RedirectToAction(
        "SpecifyRegistrantAndPaymentDetails",
        new
        {
            conferenceCode = this.ConferenceCode,
            orderId = command.OrderId,
            orderVersion = orderVersion
        });
}
If there are not enough available seats, the controller redisplays the current screen, displaying the
currently available seat quantities to enable the registrant to revise her order.
The remaining part of the change is in the SpecifyRegistrantAndPaymentDetails method in the
RegistrationController class. The following code sample from the V2 release shows that before the
optimization, the controller calls the WaitUntilSeatsAreConfirmed method before continuing to the
registrant screen:
[HttpGet]
[OutputCache(Duration = 0, NoStore = true)]
public ActionResult SpecifyRegistrantAndPaymentDetails(
    Guid orderId,
    int orderVersion)
{
    var order = this.WaitUntilSeatsAreConfirmed(orderId, orderVersion);
    if (order == null)
    {
        return View("ReservationUnknown");
    }
    if (order.State == DraftOrder.States.PartiallyReserved)
    {
        return this.RedirectToAction(
            "StartRegistration",
            new
            {
                conferenceCode = this.ConferenceCode,
                orderId, orderVersion = order.OrderVersion
            });
    }
    if (order.State == DraftOrder.States.Confirmed)
    {
        return View("ShowCompletedOrder");
    }
    if (order.ReservationExpirationDate.HasValue
        && order.ReservationExpirationDate < DateTime.UtcNow)
    {
        return RedirectToAction(
            "ShowExpiredOrder",
            new { conferenceCode = this.ConferenceAlias.Code, orderId = orderId });
    }
    this.ViewBag.ExpirationDateUTC = order.ReservationExpirationDate;
    return View(
        new RegistrationViewModel
        {
            RegistrantDetails = new AssignRegistrantDetails { OrderId = orderId },
            Order = pricedOrder
        });
}
The following code sample shows the V3 version of this method, which no longer waits for the
reservation to be confirmed:
[HttpGet]
[OutputCache(Duration = 0, NoStore = true)]
public ActionResult SpecifyRegistrantAndPaymentDetails(
    Guid orderId,
    int orderVersion)
{
    var pricedOrder = this.WaitUntilOrderIsPriced(orderId, orderVersion);
    if (pricedOrder == null)
    {
        return View("PricedOrderUnknown");
    }
    if (!pricedOrder.ReservationExpirationDate.HasValue)
    {
        return View("ShowCompletedOrder");
    }
    return View(
        new RegistrationViewModel
        {
            RegistrantDetails = new AssignRegistrantDetails { OrderId = orderId },
            Order = pricedOrder
        });
}
Note: We made this method asynchronous later on during this stage of the journey.
The second optimization in the UI flow is to perform the calculation of the order total earlier in
the process. In the previous code sample, the SpecifyRegistrantAndPaymentDetails method still
calls the WaitUntilOrderIsPriced method, which pauses the UI flow until the system calculates an
order total and makes it available to the controller by saving it in the priced-order view model on the
read side.
The key change to implement this is in the Order aggregate. The constructor in the Order class
now invokes the CalculateTotal method and raises an OrderTotalsCalculated event, as shown in the
following code sample:
public Order(
    Guid id,
    Guid conferenceId,
    IEnumerable<OrderItem> items,
    IPricingService pricingService)
    : this(id)
{
    var all = ConvertItems(items);
    var totals = pricingService.CalculateTotal(conferenceId, all.AsReadOnly());
    this.Update(new OrderPlaced
    {
        ConferenceId = conferenceId,
        Seats = all,
        ReservationAutoExpiration = DateTime.UtcNow.Add(ReservationAutoExpiration),
        AccessCode = HandleGenerator.Generate(6)
    });
    this.Update(
        new OrderTotalsCalculated
        {
            Total = totals.Total,
            Lines = totals.Lines != null ? totals.Lines.ToArray() : null,
            IsFreeOfCharge = totals.Total == 0m
        });
}
Previously, in the V2 release the Order aggregate waited until it received a MarkAsReserved
command before it called the CalculateTotal method.
container.RegisterInstance<ICommandBus>(synchronousCommandBus);
container.RegisterInstance<ICommandHandlerRegistry>(synchronousCommandBus);
container.RegisterType<ICommandHandler, OrderCommandHandler>(
"OrderCommandHandler");
container.RegisterType<ICommandHandler, ThirdPartyProcessorPaymentCommandHandler>(
"ThirdPartyProcessorPaymentCommandHandler");
container.RegisterType<ICommandHandler, SeatAssignmentsHandler>(
"SeatAssignmentsHandler");
Note: There is similar code in the Conference.Azure.cs file to configure the worker role to send
some commands in-process.
The following code sample shows how the SynchronousCommandBusDecorator class implements
the sending of a command message:
public class SynchronousCommandBusDecorator : ICommandBus, ICommandHandlerRegistry
{
private readonly ICommandBus commandBus;
private readonly CommandDispatcher commandDispatcher;
...
Trace.TraceInformation(
"Command with id {0} was not handled locally. Sending it through the bus.",
command.Body.Id);
this.commandBus.Send(command);
}
}
...
try
{
var traceIdentifier =
string.Format(
CultureInfo.CurrentCulture,
" (local handling of command with id {0})",
command.Body.Id);
handled = this.commandDispatcher.ProcessMessage(traceIdentifier,
command.Body, command.MessageId, command.CorrelationId);
}
catch (Exception e)
{
Trace.TraceWarning(
"Exception handling command with id {0} synchronously: {1}",
command.Body.Id,
e.Message);
}
return handled;
}
}
Notice how this class tries to send the command synchronously without using the Service Bus, but if
it cannot find a handler for the command, it reverts to using the Service Bus. The following code
sample shows how the CommandDispatcher class tries to locate a handler and deliver a command
message:
...
return this.originatorEntityFactory
if (deserialized.Any())
{
return this.entityFactory.Invoke(id, deserialized);
}
}
return null;
}
If the cache entry was updated in the last few seconds, there is a high probability that it is not stale
because we have a single writer for high-contention aggregates. Therefore, we optimistically avoid
checking for new events in the event store since the memento was created. Otherwise, we check in
the event store for events that arrived after the memento was created.
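The freshness decision described above can be sketched as follows. The CachedMemento and MementoFreshness names, and the choice to pass the freshness window in as a parameter, are assumptions made for this illustration; they are not taken from the repository code, which is only partly reproduced in this chapter.

```csharp
using System;

// Hypothetical cache entry: the aggregate version captured by the snapshot
// and the time at which the snapshot was written to the cache.
public class CachedMemento
{
    public int Version { get; set; }
    public DateTime CachedAtUtc { get; set; }
}

public static class MementoFreshness
{
    // Returns true when the snapshot is recent enough that, with a single
    // writer per aggregate, we optimistically skip the event store lookup.
    public static bool CanSkipEventStoreCheck(
        CachedMemento memento, DateTime utcNow, TimeSpan freshnessWindow)
    {
        if (memento == null)
        {
            return false; // nothing cached: the aggregate must be loaded from its events
        }

        return utcNow - memento.CachedAtUtc <= freshnessWindow;
    }
}
```

When the method returns false for an existing memento, the caller would query the event store only for events with a version greater than the memento's Version and apply them on top of the snapshot.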
The following code sample shows how the SeatsAvailability class adds a snapshot of its state
data to the memento object to be cached:
...
</Topic>
...
container.RegisterInstance<IProcessor>(
"SeatsAvailabilityCommandProcessor",
seatsAvailabilityCommandProcessor);
The following code sample shows the new abstract SeatsAvailabilityCommand class that includes a
session ID based on the conference that the command is associated with:
public abstract class SeatsAvailabilityCommand : ICommand, IMessageSessionProvider
{
    public SeatsAvailabilityCommand()
    {
        this.Id = Guid.NewGuid();
    }
    string IMessageSessionProvider.SessionId
    {
        get { return "SeatsAvailability_" + this.ConferenceId.ToString(); }
    }
}
TimeSpan timeToCache;
if (seatTypes.All(x => x.AvailableQuantity > 200 || x.AvailableQuantity <= 0))
{
    timeToCache = TimeSpan.FromMinutes(5);
}
else if (seatTypes.Any(x => x.AvailableQuantity < 30 && x.AvailableQuantity > 0))
{
    // There are just a few seats remaining. Do not cache.
    timeToCache = TimeSpan.Zero;
}
else if (seatTypes.Any(x => x.AvailableQuantity < 100 && x.AvailableQuantity > 0))
{
    timeToCache = TimeSpan.FromSeconds(20);
}
else
{
    timeToCache = TimeSpan.FromMinutes(1);
}
The system now also uses a cache to hold seat type descriptions in the
PricedOrderViewModelGenerator class.
Sequential GUIDs
Previously, the system generated the GUIDs that it used for the IDs of aggregates such as orders and
reservations using the Guid.NewGuid method, which generates random GUIDs. If these GUIDs are
used as primary key values in a SQL Database instance, this causes frequent page splits in the indexes,
which has a negative impact on the performance of the database. In the V3 release, the team added a
utility class that generates sequential GUIDs. This ensures that new entries in the SQL Database tables
are always appends; this improves the overall performance of the database. The following code sample
shows the new GuidUtil class:
/// <summary>
/// Creates a sequential GUID according to SQL Server’s ordering rules.
/// </summary>
public static Guid NewSequentialId()
{
// This code was not reviewed to guarantee uniqueness under most
// conditions, nor completely optimize for avoiding page splits in SQL
// Server when doing inserts from multiple hosts, so do not re-use in
// production systems.
var guidBytes = Guid.NewGuid().ToByteArray();
For further information, see The Cost of GUIDs as Primary Keys and Good Page Splits and Sequential
GUID Key Generation.
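As a rough illustration of the general idea (not the GuidUtil implementation, which is only partly reproduced above), the following sketch overwrites the last six bytes of a random GUID with a millisecond timestamp. It assumes that SQL Server gives those six bytes the highest significance when it orders uniqueidentifier values. The SequentialGuidSketch name and the Create signature are hypothetical, and the same caveats about uniqueness and multi-host inserts apply.

```csharp
using System;

public static class SequentialGuidSketch
{
    private static readonly DateTime Epoch =
        new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);

    // Builds a GUID whose bytes 10 to 15 hold the low 48 bits of the number
    // of milliseconds since the epoch, most significant byte first, so that
    // later timestamps sort after earlier ones on those bytes.
    public static Guid Create(DateTime utcNow)
    {
        var guidBytes = Guid.NewGuid().ToByteArray();
        long milliseconds = (long)(utcNow - Epoch).TotalMilliseconds;

        for (var i = 0; i < 6; i++)
        {
            // i = 0 writes the most significant of the six timestamp bytes.
            guidBytes[10 + i] = (byte)(milliseconds >> (8 * (5 - i)));
        }

        return new Guid(guidBytes);
    }
}
```

Taking the timestamp as a parameter keeps the sketch deterministic and easy to test; production code would typically pass DateTime.UtcNow. GUIDs created within the same millisecond still differ in their random bytes, but their relative order is not defined.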
this.tokenProvider = TokenProvider.CreateSharedSecretTokenProvider(
settings.TokenIssuer,
settings.TokenAccessKey);
this.serviceUri = ServiceBusEnvironment.CreateServiceUri(
settings.ServiceUriScheme,
settings.ServiceNamespace,
settings.ServicePath);
else
{
this.client.PrefetchCount = 14;
}
...
}
For more information, see Code First Data Annotations on the MSDN
website.
With the optimistic concurrency check in place, we also removed
the C# lock in the SessionSubscriptionReceiver class that was a
potential bottleneck in the system.
Impact on testing
During this stage of the journey the team reorganized the
Conference.Specflow project in the Conference.AcceptanceTests Visual
Studio solution to better reflect the purpose of the tests.
Integration tests
The tests in the Features\Integration folder in the Conference.Specflow
project are designed to test the behavior of the domain directly,
verifying the behavior of the domain by looking at the commands and
events that are sent and received. These tests are designed to be
understood by programmers rather than domain experts and are formulated
using a more technical vocabulary than the ubiquitous language.
In addition to verifying the behavior of the domain and helping
developers to understand the flow of commands and events in the system,
these tests proved to be useful in testing the behavior of the domain
in scenarios in which events are lost or are received out of order.
The Conference folder contains integration tests for the Conference
Management bounded context, and the Registration folder
contains tests for the Orders and Registrations bounded context.
These integration tests make the assumption that the command
handlers trust the sender of the commands to send valid command
messages. This may not be appropriate for other systems you may be
designing tests for.
Summary
The focus of the final stage in our CQRS journey and the V3
pseudo-production release was on resilience and performance. The next
chapter summarizes the lessons we have learned during the entire
journey and also suggests some things that we might have done
differently if we had the chance to start over with the knowledge we’ve
gained.
More information
All links in this book are accessible from the book’s online bibliography
available at: http://msdn.microsoft.com/en-us/library/jj619274.
Journey 8:
Epilogue: Lessons Learned
This was borne out in practice during our journey and we benefited
significantly from this separation when we did need to solve a
performance issue.
During the last stage of our journey, testing revealed a set of
performance issues in our application. When we investigated them, it
turned out they had less to do with the way we had implemented the
CQRS pattern and more to do with the way we were using our
infrastructure. Discovering the root cause of these problems was the hard
part; with so many moving parts in the application, getting the right
tracing and the right data for analysis was the challenge. Once we
identified the bottlenecks, fixing them turned out to be relatively
easy, largely because of the way the CQRS pattern enables you to
clearly separate different elements of the system, such as reads and
writes. Although the separation of concerns that results from
implementing the CQRS pattern can make it harder to identify an issue,
once you have identified one, it is not only easier to fix, but also
easier to prevent its return. The decoupled architecture makes it simpler
to write unit tests that reproduce issues.
The challenges we encountered in tackling the performance issues
in the system had more to do with the fact that our system is a
distributed, message-based system than the fact that it implements
the CQRS pattern.
Chapter 7, “Adding Resilience and Optimizing Performance” provides
more information about the ways we addressed the performance
issues in the system and makes some suggestions about additional
changes that we would like to make, but didn’t have time to implement.
Although our event store is not production-ready, the current
implementation gives a good indication of the type of issues you
should address if you decide to implement your own event store.
Implementing a message-driven system is far from simple
Our approach to infrastructure on this project was to develop it as
needed during the journey. We didn’t anticipate (and had no
forewarning of) how much time and effort we would need to create the
robust infrastructure that our application required. We spent at least
twice as much time as we originally planned on many development
tasks because we continued to uncover additional infrastructure-related
requirements. In particular, we learned that having a robust
event store from the beginning is essential. Another key idea we took
away from the experience is that all I/O on the message bus should be
asynchronous.
Although our application is not large, it illustrated clearly to us the
importance of having end-to-end tracing available, and the value of
tools that help us understand all of the message flows in the system.
Chapter 4, “Extending and Enhancing the Orders and Registrations
Bounded Context,” describes the value of tests in helping us understand
the system, and discusses the messaging intermediate language
(MIL) created by Josh Elster, one of our advisors.
It would also help if we had a standard notation for messaging that
would help us communicate some of the issues with the domain
experts and people outside of the core team.
In summary, many of the issues we met along the way were not
related specifically to the CQRS pattern, but were more related to the
distributed, message-driven nature of our solution.
We found that having a single bus abstraction in our code obscured the fact that
some messages are handled locally in-process and some are handled in a different
role instance. To see how this is implemented, look at the ICommandBus interface
and the CommandBus and SynchronousCommandBusDecorator classes.
Chapter 7, “Adding Resilience and Optimizing Performance” includes a discussion
of the SynchronousCommandBusDecorator class.
CQRS is different
At the start of our journey we were warned that although the CQRS
pattern appears to be simple, in practice it requires a significant shift
in the way you think about many aspects of the project. Again, this
was borne out by our experiences during the journey. You must be
prepared to throw away many assumptions and preconceived ideas,
and you will probably need to implement the CQRS pattern in several
bounded contexts before you begin to fully understand the
benefits you can derive from the pattern.
An example of this is the concept of eventual consistency. If you
come from a relational database background and are accustomed to
the ACID properties of transactions, then embracing eventual consistency
and understanding its implications at all levels in the system is
a big step to take. Chapter 5, “Preparing for the V1 Release” and
Chapter 7, “Adding Resilience and Optimizing Performance” both discuss
eventual consistency in different areas of the system.
In addition to being different from what you might be familiar
with, there is also no single correct way to implement the CQRS pattern.
We made more false starts on pieces of functionality and estimated
poorly how long things would take due to our unfamiliarity
with the pattern and approach. As we become more comfortable with
the approach, we hope to become faster at identifying how to implement
the pattern in specific circumstances and improve the accuracy
of our estimates.
The CQRS pattern is conceptually simple; the devil is in the details.
Another situation in which we took some time to understand the
CQRS approach and its implications was during the integration between
our bounded contexts. Chapter 5, “Preparing for the V1 Release,”
includes a detailed discussion of how the team approached the
integration issue between the Conference Management and the Orders
and Registrations bounded contexts. This part of the journey
uncovered some additional complexity that relates to the level of
coupling between bounded contexts when you use events as the
integration mechanism. Our assumption that events should only contain
information about the change in the aggregate or the bounded context
proved to be unhelpful; events can contain additional information
that is useful to one or more subscribers and helps to reduce the
amount of work that a subscriber must perform.
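The point about events carrying additional information can be illustrated with a small sketch. The SeatTypeUpdated event and the SeatDescriptionReadModel subscriber are hypothetical names invented for this example, not types from the reference implementation. Because the event carries the name and price as well as the identifier, the subscriber can update its own view without calling back into the publishing bounded context.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical enriched integration event: besides the identifier of the
// changed seat type, it carries data that subscribers are likely to need.
public class SeatTypeUpdated
{
    public Guid SeatTypeId { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
}

// Hypothetical subscriber in another bounded context. It builds its view
// purely from the event payload.
public class SeatDescriptionReadModel
{
    private readonly Dictionary<Guid, string> descriptions =
        new Dictionary<Guid, string>();

    public void Handle(SeatTypeUpdated e)
    {
        this.descriptions[e.SeatTypeId] = string.Format(
            CultureInfo.InvariantCulture, "{0} ({1:0.00})", e.Name, e.Price);
    }

    // Returns the stored description, or null if the seat type is unknown.
    public string Describe(Guid seatTypeId)
    {
        string description;
        return this.descriptions.TryGetValue(seatTypeId, out description)
            ? description
            : null;
    }
}
```

The trade-off is a somewhat tighter contract between the contexts: every extra field in the event is something the publisher must keep supplying.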
More information
All links in this book are accessible from the book’s online bibliography available at:
http://msdn.microsoft.com/en-us/library/jj619274.
Reference 1:
CQRS in Context
This chapter is intended to provide some context for the main subject of this guide: a discussion of
the Command Query Responsibility Segregation (CQRS) pattern. It is useful to understand some of
the origins of the CQRS pattern and some of the terminology you will encounter in this guide and in
other material that discusses the CQRS pattern. It is particularly important to understand that the
CQRS pattern is not intended for use as the top-level architecture of your system; rather, it should be
applied to those subsystems that will gain specific benefits from the application of the pattern.
Before we look at the issues surrounding the use of different architectures within a complex
application, we need to introduce some of the terminology that we will use in this chapter and
subsequent chapters of this reference guide. Much of this terminology comes from an approach to
developing software systems known as domain-driven design (DDD). There are a few important points to
note about our use of this terminology:
• We are using the DDD terminology because many CQRS practitioners also use this terminology,
and it is used in much of the existing CQRS literature.
• There are other approaches that tackle the same problems that DDD tackles, with similar
concepts, but with their own specific terminologies.
• Using a DDD approach can lead naturally to an adoption of the CQRS pattern. However, the
DDD approach does not always lead to the use of the CQRS pattern, nor is the DDD approach
a prerequisite for using the CQRS pattern.
• You may question our interpretation of some of the concepts of DDD. The intention of this
guide is to take what is useful from DDD to help us explain the CQRS pattern and related
concepts, not to provide guidance on how to use the DDD approach.
To learn more about the foundational principles of DDD, you should read the book Domain-Driven
Design: Tackling Complexity in the Heart of Software by Eric Evans (Addison-Wesley Professional, 2003).
To see how these principles apply to a concrete development project on the .NET platform, along with
insights and experimentation, you should read the book Applying Domain-Driven Design and Patterns
by Jimmy Nilsson (Addison-Wesley Professional, 2006).
In addition, to see how Eric Evans describes what works and what doesn’t in DDD, and for his
view on how much has changed over the previous five years, we recommend his talk at QCon London
2009.
For a summary of the key points in Eric Evans’ book, you should read the free book,
Domain-Driven Design Quickly by Abel Avram and Floyd Marinescu (C4Media, 2007).
Domain model
At the heart of DDD lies the concept of the domain model. This
model is built by the team responsible for developing the system in
question, and that team consists of both domain experts from the
business and software developers. The domain model serves several
functions:
• It captures all of the relevant domain knowledge from the
domain experts.
• It enables the team to determine the scope and verify the
consistency of that knowledge.
• The model is expressed in code by the developers.
• It is constantly maintained to reflect evolutionary changes in the
domain.
DDD focuses on the domain because that’s where the business value
is. An enterprise derives its competitive advantage and generates
business value from its core domains. The role of the domain model is to
capture what is valuable or unique to the business.
Much of the DDD approach focuses on how to create, maintain,
and use these domain models. Domain models are typically composed
of elements such as entities, value objects, and aggregates, and are
described using terms from a ubiquitous language.
Ubiquitous language
The concept of a ubiquitous language is very closely related to that of
the domain model. One of the functions of the domain model is to
foster a common understanding of the domain between the domain
experts and the developers. If both the domain experts and the
developers use the same terms for objects and actions within the domain
(for example, conference, chair, attendee, reserve, waitlist), the risk of
confusion or misunderstanding is reduced. More specifically, if everyone
uses the same language, there are less likely to be misunderstandings
resulting from translations between languages. For example, if a
developer has to think, “if the domain expert talks about a delegate,
he is really talking about an attendee in the software,” then eventually
something will go wrong as a result of this lack of clarity.
In our journey, we used SpecFlow to express business rules as
acceptance tests. They helped us to communicate information
about our domain with clarity and brevity, and formulate a
ubiquitous language in the process. For more information,
see Chapter 4, “Extending and Enhancing the Orders
and Registrations Bounded Context” in Exploring CQRS
and Event Sourcing.
Bounded contexts
So far, the DDD concepts and terminology that we have briefly introduced are related to creating,
maintaining, and using a domain model. For a large system, it may not be practical to maintain a single
domain model; the size and complexity make it difficult to keep it coherent and consistent. To manage
this scenario, DDD introduces the concepts of bounded contexts and multiple models. Within a
system, you might choose to use multiple smaller models rather than a single large model, each one
focusing on some aspect or grouping of functionality within the overall system. A bounded context
is the context for one particular domain model. Similarly, each bounded context (if implemented
following the DDD approach) has its own ubiquitous language, or at least its own dialect of the
domain’s ubiquitous language.
Figure 1
Bounded contexts within a large, complex system
Figure 1 shows an example of a system that is divided into multiple
bounded contexts. In practice, there are likely to be more bounded
contexts than the three shown in the diagram.
There are no hard and fast rules that specify how big a bounded
context should be. Ultimately it’s a pragmatic issue that is determined
by your requirements and the constraints on your project.
“A given bounded context should be divided into business
components, where these business components have full
UI through DB code, and are put together in composite UI’s
and other physical pipelines to fulfill the system’s functionality.
A business component can exist in only one bounded context.”
—Udi Dahan, Udi & Greg
Reach CQRS Agreement
“For me, a bounded context is an abstract concept (and it’s still an
important one!) but when it comes to technical details, the
business component is far more important than the bounded
context.”
—Greg Young, Conversation with the patterns & practices team
Eric Evans makes the case for larger bounded contexts:
“Favoring larger bounded contexts:
• Flow between user tasks is smoother when more is handled with
a unified model.
• It is easier to understand one coherent model than two distinct
ones plus mappings.
• Translation between two models can be difficult (sometimes
impossible).
• Shared language fosters clear team communication.
Favoring smaller bounded contexts:
• Communication overhead between developers is reduced.
• Continuous Integration is easier with smaller teams and code
bases.
• Larger contexts may call for more versatile abstract models,
requiring skills that are in short supply.
• Different models can cater to special needs or encompass the
jargon of specialized groups of users, along with specialized
dialects of the Ubiquitous Language.”
—Eric Evans, Domain-Driven Design: Tackling Complexity in
the Heart of Software, page 383.
You decide which patterns and approaches to apply (for example,
whether to use the CQRS pattern or not) within a bounded context,
not for the system.
BC is often used as an acronym for bounded contexts (in DDD) and
business components (in service-oriented architecture (SOA)). Do
not confuse them. In our guidance, BC means “bounded context.”
Anti-corruption layers
Different bounded contexts have different domain models. When
your bounded contexts communicate with each other, you need to
ensure that concepts specific to one domain model do not leak into
another domain model. An anti-corruption layer functions as a gatekeeper
between bounded contexts and helps you keep your domain
models clean.
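A minimal sketch of an anti-corruption layer, under stated assumptions, might look like the following. The ManagementSeatType, RegistrationSeatType, and SeatTypeTranslator names are hypothetical and exist only for this illustration. The translator is the only piece of code that knows both models, so changes to the upstream model stop at this boundary.

```csharp
using System;

// Hypothetical model owned by an upstream bounded context.
public class ManagementSeatType
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
    public int Quantity { get; set; }
    public string InternalNotes { get; set; } // meaningful only upstream
}

// Hypothetical model owned by the downstream bounded context. It exposes
// only the concepts that this context cares about.
public class RegistrationSeatType
{
    public RegistrationSeatType(Guid id, string name, int availableQuantity)
    {
        this.Id = id;
        this.Name = name;
        this.AvailableQuantity = availableQuantity;
    }

    public Guid Id { get; private set; }
    public string Name { get; private set; }
    public int AvailableQuantity { get; private set; }
}

// The anti-corruption layer: translates upstream concepts into downstream
// ones and drops anything that should not leak across the boundary.
public static class SeatTypeTranslator
{
    public static RegistrationSeatType ToRegistrationModel(ManagementSeatType source)
    {
        if (source == null)
        {
            throw new ArgumentNullException("source");
        }

        return new RegistrationSeatType(source.Id, source.Name, source.Quantity);
    }
}
```

In a message-based system the same translation would typically happen in the handler that receives integration events from the other context.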
Context maps
A large complex system can have multiple bounded contexts that interact
with one another in various ways. A context map is the documentation
that describes the relationships between these bounded
contexts. It might be in the form of diagrams, tables, or text.
A context map helps you visualize the system at a high level, showing
how some of the key parts relate to each other. It also helps to
clarify the boundaries between the bounded contexts. It shows where
and how the bounded contexts exchange and share data, and where
you must translate data as it moves from one domain model to another.
A business entity, such as a customer, might exist in several
bounded contexts. However, it may need to expose different facets
or properties that are relevant to a particular bounded context. As a
customer entity moves from one bounded context to another you
may need to translate it so that it exposes the relevant facets or
properties for its current context.
“I think context mapping is perhaps one thing in there
that should be done on every project. The context map helps
you keep track of all the models you are using.”
—Eric Evans, What I’ve learned about DDD since the book
“Sometimes the process of gathering information to draw
the context map is more important than the map itself.”
—Alberto Brandolini, Context Mapping in action
Bounded contexts and multiple architectures
A bounded context typically represents a slice of the overall system
with clearly defined boundaries separating it from other bounded
contexts within the system. If a bounded context is implemented by
following the DDD approach, the bounded context will have its own
domain model and its own ubiquitous language. Bounded contexts are
also typically vertical slices through the system, so the implementation
of a bounded context will include everything from the data store,
right up to the UI.
The same domain concept can exist in multiple bounded contexts.
For example, the concept of an attendee in a conference management
system might exist in the bounded context that deals with bookings, in
the bounded context that deals with badge printing, and in the bounded
context that deals with hotel reservations. From the perspective of
the domain expert, these different versions of the attendee may require
different behaviors and attributes. For example, in the bookings bounded
context the attendee is associated with a registrant who makes the
bookings and payments. Information about the registrant is not relevant
in the hotel reservations bounded context, where information such as
dietary requirements or smoking preferences is important.
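The attendee example can be sketched as two separate classes, one per bounded context. The class and property names below are hypothetical; the point is only that each context models the facets it needs and that the two models share nothing except the attendee's identity.

```csharp
using System;

// Bookings bounded context: the attendee is tied to the registrant who
// makes the bookings and payments.
public class BookingAttendee
{
    public Guid AttendeeId { get; set; }
    public string Name { get; set; }
    public Guid RegistrantId { get; set; }
}

// Hotel reservations bounded context: registrant details are irrelevant
// here, but dietary and smoking preferences matter.
public class HotelAttendee
{
    public Guid AttendeeId { get; set; }
    public string Name { get; set; }
    public string DietaryRequirements { get; set; }
    public bool PrefersSmokingRoom { get; set; }
}
```

Keeping the classes separate, rather than sharing one large Attendee type, lets each bounded context evolve its model independently.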
One important consequence of this split is that you can use different
implementation architectures in different bounded contexts.
For example, one bounded context might be implemented using a
DDD layered architecture, another might use a two-tier CRUD
architecture, and another might use an architecture derived from the
CQRS pattern. Figure 2 illustrates a system with multiple bounded
contexts, each using a different architectural style. It also highlights
that each bounded context is typically end-to-end, from the persistence
store through to the UI.
CQR S in Context 219
Figure 2
Multiple architectural styles within a large, complex application
In addition to managing complexity, there is another benefit of dividing the system into bounded
contexts. You can use an appropriate technical architecture for different parts of the system to
address the specific characteristics of each part. For example, you can address such questions as
whether it is a complex part of the system, whether it contains core domain functionality, and what
its expected lifetime is.
However, many people can point to projects where they have seen real benefits from implementing
the CQRS pattern while not using the DDD approach for the domain analysis and model design.
In summary, the DDD approach is not a prerequisite for implementing the CQRS pattern, but in
practice they do often go together.
“It is something of a tradition to connect both paradigms because using DDD can lead naturally
into CQRS, and also the available literature about CQRS tends to use DDD terminology. However,
DDD is mostly appropriate for very large and complex projects. On the other hand, there is no
reason why a small and simple project cannot benefit from CQRS. For example, a relatively small
project that would otherwise use distributed transactions could be split into a write side and a
read side with CQRS to avoid the distributed transaction, but it may be simple enough that
applying DDD would be overkill.”
—Alberto Población (Customer Advisory Council)
More information
All links in this book are accessible from the book’s online bibliography available at:
http://msdn.microsoft.com/en-us/library/jj619274.
Reference 2:
What is CQRS?
In his book “Object Oriented Software Construction,” Bertrand Meyer introduced the term
“Command Query Separation” to describe the principle that an object’s methods should be either
commands or queries. A query returns data and does not alter the state of the object; a command
changes the state of an object but does not return any data. The benefit is that you have a better
understanding of what does, and what does not, change the state in your system.
CQRS takes this principle a step further to define a simple pattern.
“CQRS is simply the creation of two objects where there was previously only one. The separation
occurs based upon whether the methods are a command or a query (the same definition that is used
by Meyer in Command and Query Separation: a command is any method that mutates state and a
query is any method that returns a value).”
—Greg Young, CQRS, Task Based UIs, Event Sourcing agh!
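As a minimal sketch of the command/query distinction described above, the following Python class keeps its query free of side effects and has its command return nothing. The Account class and its method names are hypothetical, chosen only for illustration; they do not come from Meyer's book or from this guide.

```python
class Account:
    """Hypothetical example object used only to illustrate the principle."""

    def __init__(self, balance: int = 0) -> None:
        self._balance = balance

    # Query: returns data and does not alter the state of the object.
    def balance(self) -> int:
        return self._balance

    # Command: changes the state of the object and does not return any data.
    def deposit(self, amount: int) -> None:
        if amount <= 0:
            raise ValueError("amount must be positive")
        self._balance += amount


account = Account()
account.deposit(50)
print(account.balance())  # 50
```

Because balance() never changes state, it can be called any number of times, in any order, without affecting the outcome of later commands.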
224 R eference t wo
What is important and interesting about this simple pattern is how, where, and why you use it
when you build enterprise systems. Using this simple pattern enables you to meet a wide range of
architectural challenges, such as achieving scalability, managing complexity, and managing
changing business rules in some portions of your system.
“CQRS is a simple pattern that strictly segregates the responsibility of handling command input
into an autonomous system from the responsibility of handling side-effect-free query/read access
on the same system. Consequently, the decoupling allows for any number of homogeneous or
heterogeneous query/read modules to be paired with a command processor. This principle presents
a very suitable foundation for event sourcing, eventual-consistency state replication/fan-out
and, thus, high-scale read access. In simple terms, you don’t service queries via the same module
of a service that you process commands through. In REST terminology, GET requests wire up to a
different thing from what PUT, POST, and DELETE requests wire up to.”
—Clemens Vasters (CQRS Advisors Mail List)
The following conversation between Greg Young and Udi Dahan highlights some of the important
aspects of the CQRS pattern:
Udi Dahan: If you are going to be looking at applying CQRS, it should be done within a specific
bounded context, rather than at the whole system level, unless you are in a special case, when
your entire system is just one single bounded context.
Greg Young: I would absolutely agree with that statement. CQRS is not a top-level architecture.
CQRS is something that happens at a much lower level, where your top level architecture is
probably going to look more like SOA and EDA [service-oriented or event-driven architectures].
Udi Dahan: That’s an important distinction. And that’s something that a lot of people who are
looking to apply CQRS don’t give enough attention to: just how important on the one hand, and
how difficult on the other, it is to identify the correct bounded contexts or services, or
whatever you call that top-level decomposition and the event-based synchronization between them.
A lot of times, when discussing CQRS with clients, when I tell them “You don’t need CQRS
for that,” their interpretation of that statement is that, in es-
sence, they think I’m telling them that they need to go back
to an N-tier type of architecture, when primarily I mean that
a two-tier style of architecture is sufficient. And even when I
say two-tier, I don’t necessarily mean that the second tier
needs to be a relational database. To a large extent, for a lot
of systems, a NoSQL, document-style database would prob-
ably be sufficient with a single data management-type tier
operated on the client side. As an alternative to CQRS, it’s
important to lay out a bunch of other design styles or ap-
proaches, rather than thinking either you are doing N-tier
object relational mapping or CQRS.
Introducing the Comm a nd Query R esponsibilit y Segregation Pattern 225
When asked whether he considers CQRS to be an approach or a pattern, and if it’s a pattern,
what problem it specifically solves, Greg Young answered:
“If we were to go by the definition that we set up for CQRS a number of years ago, it’s going
to be a very simple low-level pattern. It’s not even that interesting as a pattern; it’s more just
pretty conceptual stuff; you just separate. What’s more interesting about it is what it en-
ables. It’s the enabling that the pattern provides that’s interesting. Everybody gets really
caught up in systems and they talk about how complicated CQRS is with Service Bus and all
the other stuff they are doing, and in actuality, none of that is necessary. If you go with the
simplest possible definition, it would be a pattern. But it’s more what happens once you ap-
ply that pattern—the opportunities that you get.”
Figure 1
A possible architectural implementation of the CQRS pattern
In Figure 1, you can see how this portion of the system is split into a read side and a write side.
The object or objects on the read side contain only query methods, and the objects on the write side
contain only command methods.
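The split described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the reference implementation: the class names, method names, and the in-memory SeatStore are all hypothetical, and for brevity both sides share one store, whereas Figure 1 shows two.

```python
from dataclasses import dataclass, field


@dataclass
class SeatStore:
    """Hypothetical in-memory store shared by both sides in this sketch."""
    reserved: dict[str, str] = field(default_factory=dict)  # seat -> attendee


class SeatReservationCommands:
    """Write side: command methods only; each changes state and returns nothing."""

    def __init__(self, store: SeatStore) -> None:
        self._store = store

    def reserve_seat(self, seat: str, attendee: str) -> None:
        if seat in self._store.reserved:
            raise ValueError(f"seat {seat} is already reserved")
        self._store.reserved[seat] = attendee

    def cancel_reservation(self, seat: str) -> None:
        self._store.reserved.pop(seat, None)


class SeatReservationQueries:
    """Read side: query methods only; none of them changes state."""

    def __init__(self, store: SeatStore) -> None:
        self._store = store

    def is_reserved(self, seat: str) -> bool:
        return seat in self._store.reserved

    def seats_for(self, attendee: str) -> list[str]:
        return sorted(s for s, a in self._store.reserved.items() if a == attendee)


store = SeatStore()
commands = SeatReservationCommands(store)
queries = SeatReservationQueries(store)
commands.reserve_seat("E23", "Alice")
commands.reserve_seat("E24", "Alice")
print(queries.seats_for("Alice"))  # ['E23', 'E24']
```

The business rule (a seat cannot be reserved twice) lives only on the write side, so the read-side object stays trivially simple.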
There are several motivations for this segregation including:
• In many business systems, there is a large imbalance between the number of reads and the
number of writes. A system may process thousands of reads for every write. Segregating the
two sides enables you to optimize them independently. For example, you can scale out the
read side to support the larger number of read operations independently of the write side.
• Typically, commands involve complex business logic to ensure that the system writes correct
and consistent data to the data store. Read operations are often much simpler than write
operations. A single conceptual model that tries to encapsulate both read and write opera-
tions may do neither well. Segregating the two sides ultimately results in simpler, more
maintainable, and more flexible models.
• Segregation can also occur at the data store level. The write side may use a database schema
that is close to third normal form (3NF) and optimized for data modifications, while the read
side uses a denormalized database that is optimized for fast query operations.
Note: Although Figure 1 shows two data stores, applying the CQRS pattern does not mandate
that you split the data store, or that you use any particular persistence technology such as a
relational database, NoSQL store, or event store (which in turn could be implemented on top of a
relational database, NoSQL store, file storage, blob storage, and so forth). You should view CQRS
as a pattern that facilitates splitting the data store and enables you to select from a range of
storage mechanisms.
Figure 1 might also suggest a one-to-one relationship between the write side and the read side. How-
ever, this is not necessarily the case. It can be useful to consolidate the data from multiple write
models into a single read model if your user interface (UI) needs to display consolidated data. The
point of the read-side model is to simplify what happens on the read side, and you may be able to
simplify the implementation of your UI if the data you need to display has already been combined.
There are some questions that might occur to you about the practicalities of adopting an
architecture such as the one shown in Figure 1.
• Although the individual models on the read side and write side might be simpler than a single
compound model, the overall architecture is more complex than a traditional approach with a
single model and a single data store. So, haven’t we just shifted the complexity?
• How should we manage the propagation of changes in the data store on the write side to the
read side?
• What if there is a delay while the updates on the write side are propagated to the read side?
• What exactly do we mean when we talk about models?
The remainder of this chapter will begin to address these questions and to explore the motivations
for using the CQRS pattern. Later chapters will explore these issues in more depth.
The reasons for identifying context boundaries for your domain models are not necessarily the
same reasons for choosing the portions of the system that should use the CQRS pattern. In DDD, a
bounded context defines the context for a model and the scope of a ubiquitous language. You
should implement the CQRS pattern to gain certain benefits for your application such as
scalability, simplicity, and maintainability. Because of these differences, it may make sense to
think about applying the CQRS pattern to business components rather than bounded contexts.
It is quite possible that your bounded contexts map exactly onto your business components.
“A given bounded context should be divided into business components, where these business
components have full UI through DB code, and are ultimately put together in composite UIs and
other physical pipelines to fulfill the system’s functionality. A business component can exist in
only one bounded context. CQRS, if it is to be used at all, should be used within a business
component.”
—Udi Dahan, Udi & Greg Reach CQRS Agreement.
Note: Throughout this guide, we use the term bounded context in preference to the term business
component to refer to the context within which we are implementing the CQRS pattern.
In summary, you should not apply the CQRS pattern to the top level of your system. You should
clearly identify the different portions of your system that you can design and implement largely
independently of each other, and then only apply the CQRS pattern to those portions where there
are clear business benefits in doing so.
Events are notifications; they report something that has already happened to other interested
parties. For example, “the customer’s credit card has been billed $200” or “ten seats have been booked
for conference X.” Events can be processed multiple times, by multiple consumers.
Both commands and events are types of message that are used to exchange data between objects.
In DDD terms, these messages represent business behaviors and therefore help the system capture
the business intent behind the message.
A possible implementation of the CQRS pattern uses separate data stores for the read side and
the write side; each data store is optimized for the use cases it supports. Events provide the basis of
a mechanism for synchronizing the changes on the write side (that result from processing commands)
with the read side. If the write side raises an event whenever the state of the application changes, the
read side should respond to that event and update the data that is used by its queries and views. Figure
2 shows how commands and events can be used if you implement the CQRS pattern.
Figure 2
Commands and events in the CQRS pattern
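The flow described above, in which the write side raises an event for each state change and the read side updates its own data in response, might look like the following Python sketch. All names here (ReserveSeats, SeatsReserved, WriteSide, ReadSide) are hypothetical, and the in-process subscriber list stands in for whatever messaging infrastructure a real system would use.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ReserveSeats:      # command: asks the write side to change state
    conference_id: str
    quantity: int


@dataclass(frozen=True)
class SeatsReserved:     # event: reports something that has already happened
    conference_id: str
    quantity: int


class WriteSide:
    def __init__(self, capacity: dict[str, int]) -> None:
        self._available = dict(capacity)
        self._subscribers: list[Callable[[SeatsReserved], None]] = []

    def subscribe(self, handler: Callable[[SeatsReserved], None]) -> None:
        self._subscribers.append(handler)

    def handle(self, command: ReserveSeats) -> None:
        remaining = self._available[command.conference_id] - command.quantity
        if remaining < 0:
            raise ValueError("not enough seats available")
        self._available[command.conference_id] = remaining
        # Raise an event for the state change; any number of consumers may react.
        event = SeatsReserved(command.conference_id, command.quantity)
        for handler in self._subscribers:
            handler(event)


class ReadSide:
    def __init__(self) -> None:
        self._booked: dict[str, int] = {}

    def on_seats_reserved(self, event: SeatsReserved) -> None:
        current = self._booked.get(event.conference_id, 0)
        self._booked[event.conference_id] = current + event.quantity

    def seats_booked(self, conference_id: str) -> int:
        return self._booked.get(conference_id, 0)


write_side = WriteSide({"X": 100})
read_side = ReadSide()
write_side.subscribe(read_side.on_seats_reserved)
write_side.handle(ReserveSeats("X", 10))
print(read_side.seats_booked("X"))  # 10
```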
Scalability
In many enterprise systems, the number of reads vastly exceeds the number of writes, so your
scalability requirements will be different for each side. By separating the read side and the
write side into separate models within the bounded context, you now have the ability to scale
each one of them independently. For example, if you are hosting applications in Windows Azure,
you can use a different role for each side and then scale them independently by adding a
different number of role instances to each.
Scalability should not be the only reason why you choose to implement the CQRS pattern in a
specific bounded context:
“In a non-collaborative domain, where you can horizontally add more database servers to support
more users, requests, and data at the same time you’re adding web servers, there is no real
scalability problem (until you’re the size of Amazon, Google, or Facebook). Database servers can
be cheap if you’re using MySQL, SQL Server Express, or others.”
—Udi Dahan, When to avoid CQRS.
Reduced complexity
In complex areas of your domain, designing and implementing objects that are responsible for
both reading and writing data can exacerbate the complexity. In many cases, the complex business
logic is only applied when the system is handling updates and transactional operations; in
comparison, read logic is often much simpler. When the business logic and read logic are mixed
together in the same model, it becomes much harder to deal with difficult issues such as multiple
users, shared data, performance, transactions, consistency, and stale data. Separating the read
logic and business logic into separate models makes it easier to separate out and address these
complex issues. However, in many cases it may require some effort to disentangle and understand
the existing model in the domain.
Like many patterns, you can view the CQRS pattern as a mechanism for shifting some of the
complexity inherent in your domain into something that is well known, well understood, and that
offers a standard approach to solving certain categories of problems.
Another potential benefit of simplifying the bounded context by separating out the read logic
and the business logic is that it can make testing easier.
Separation of concerns is the key motivation behind Bertrand Meyer’s Command Query Separation
Principle:
“The really valuable idea in this principle is that it’s extremely handy if you can clearly
separate methods that change state from those that don’t. This is because you can use queries in
many situations with much more confidence, introducing them anywhere, changing their order. You
have to be more careful with modifiers.”
—Martin Fowler, CommandQuerySeparation
Flexibility
The flexibility of a solution that uses the CQRS pattern largely derives
from the separation into the read-side and the write-side models. It
becomes much easier to make changes on the read side, such as add-
ing a new query to support a new report screen in the UI, when you
can be confident that you won’t have any impact on the behavior of
the business logic. On the write side, having a model that concerns
itself solely with the core business logic in the domain means that you
have a simpler model to deal with than a model that includes read
logic as well.
In the longer term, a good, useful model that accurately describes
your core domain business logic will become a valuable asset. It will
enable you to be more agile in the face of a changing business environ-
ment and competitive pressures on your organization.
This flexibility and agility relates to the concept of continuous integration in DDD:
“Continuous integration means that all work within the context is being merged and made
consistent frequently enough that when splinters happen they are caught and corrected quickly.”
—Eric Evans, “Domain-Driven Design,” p342.
In some cases, it may be possible to have different development teams working on the write side
and the read side, although in practice this will probably depend on how large the particular
bounded context is.
Collaborative domains
Both Udi Dahan and Greg Young identify collaboration as the characteristic of a bounded context
that provides the best indicator that you may see benefits from applying the CQRS pattern.
“In a collaborative domain, an inherent property of the domain is that multiple actors operate
in parallel on the same set of data. A reservation system for concerts would be a good example
of a collaborative domain; everyone wants the good seats.”
—Udi Dahan, Why you should be using CQRS almost everywhere...
The CQRS pattern is particularly useful where the collaboration involves complex decisions about
what the outcome should be when you have multiple actors operating on the same, shared data. For
example, does the rule “last one wins” capture the expected business outcome for your scenario,
or do you need something more sophisticated? It’s important to note that actors are not
necessarily people; they could be other parts of the system that can operate independently on
the same data.
Note: Collaborative behavior is a good indicator that there will
be benefits from applying the CQRS pattern; however, this is not
a hard and fast rule!
Such collaborative portions of the system are often the most com-
plex, fluid, and significant bounded contexts. However, this character-
istic is only a guide: not all collaborative domains benefit from the
CQRS pattern, and some non-collaborative domains do benefit from
the CQRS pattern.
Stale data
In a collaborative environment where multiple users can operate on
the same data simultaneously, you will also encounter the issue of
stale data; if one user is viewing a piece of data while another user
changes it, then the first user’s view of the data is stale.
Whatever architecture you choose, you must address this prob-
lem. For example, you can use a particular locking scheme in your
database, or define the refresh policy for the cache from which your
users read data.
The two previous examples show two different areas in a system where you might encounter and
need to deal with stale data; in most collaborative enterprise systems there will be many more.
The CQRS pattern helps you address the issue of stale data explicitly at the architecture level.
Changes to data happen on the write side; users view data by querying the read side. Whatever
mechanism you choose to use to push the changes from the write side to the read side is also the
mechanism that controls when the data on the read side becomes stale, and how long it remains
so. This differs from other architectures, where management of stale data is more of an
implementation detail that is not always addressed in a standard or consistent manner.
“Standard layered architectures don’t explicitly deal with either of these issues. While putting
everything in the same database may be one step in the direction of handling collaboration,
staleness is usually exacerbated in those architectures by the use of caches as a
performance-improving afterthought.”
—Udi Dahan talking about collaboration and staleness, Clarified CQRS.
In the chapter “A CQRS and ES Deep Dive,” we will look at how the synchronization mechanism
between write side and the read side determines how you manage the issue of stale data in your
application.
More information
All links in this book are accessible from the book’s online bibliogra-
phy available at: http://msdn.microsoft.com/en-us/library/jj619274.
Reference 3:
Event sourcing (ES) and Command Query Responsibility Segregation (CQRS) are frequently men-
tioned together. Although neither one necessarily implies the other, you will see that they do comple-
ment each other. This chapter introduces the key concepts that underlie event sourcing, and provides
some pointers on the potential relationship with the CQRS pattern. This chapter is an introduction;
Chapter 4, “A CQRS and ES Deep Dive,” explores event sourcing and its relationship with CQRS in
more depth.
To help understand event sourcing, it’s important to have a basic definition of events that cap-
tures their essential characteristics:
• Events happen in the past. For example, “the speaker was booked,” “the seat was reserved,” “the
cash was dispensed.” Notice how we describe these events using the past tense.
• Events are immutable. Because events happen in the past, they cannot be changed or undone.
However, subsequent events may alter or negate the effects of earlier events. For example, “the
reservation was cancelled” is an event that changes the result of an earlier reservation event.
• Events are one-way messages. Events have a single source (publisher) that publishes the event.
One or more recipients (subscribers) may receive events.
• Typically, events include parameters that provide additional information about the event. For
example, “Seat E23 was booked by Alice.”
• In the context of event sourcing, events should describe business intent. For example, “Seat E23
was booked by Alice” describes in business terms what has happened and is more descriptive than,
“In the bookings table, the row with key E23 had the name field updated with the value Alice.”
We will also assume that the events discussed in this chapter are associated with aggregates; see the
chapter “CQRS in Context” for a description of the DDD terms: aggregates, aggregate roots, and
entities. There are two features of aggregates that are relevant to events and event sourcing:
• Aggregates define consistency boundaries for groups of related entities; therefore, you can use
an event raised by an aggregate to notify interested parties that a transaction (consistent set of
updates) has taken place on that group of entities.
• Every aggregate has a unique ID; therefore, you can use that ID to record which aggregate in
the system was the source of a particular event.
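The characteristics listed above (past-tense names, immutability, parameters, business intent, and an aggregate ID) can be captured in a small Python sketch. The event classes and field names below are hypothetical illustrations, not types from the reference implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from uuid import UUID, uuid4


@dataclass(frozen=True)       # frozen: fields cannot be reassigned after creation
class SeatBooked:             # past-tense name that states the business intent
    aggregate_id: UUID        # identifies the aggregate that raised the event
    seat: str                 # parameters carry the details of what happened
    attendee: str
    occurred_at: datetime


@dataclass(frozen=True)
class ReservationCancelled:   # a later event that negates the earlier one's effect
    aggregate_id: UUID
    seat: str
    occurred_at: datetime


conference_id = uuid4()
booked = SeatBooked(conference_id, "E23", "Alice", datetime.now(timezone.utc))
cancelled = ReservationCancelled(conference_id, "E23", datetime.now(timezone.utc))
```

Note that the cancellation does not modify or delete the booking event; both remain in the history, and the later one changes the outcome.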
236 R eference thr ee
For the remainder of this chapter, we will use the term aggregate to refer to a cluster of associ-
ated objects that are treated as a unit for the purposes of data changes. This does not mean that event
sourcing is directly related to the DDD approach; we are simply using the terminology from DDD to
try to maintain some consistency in our language in this guide.
Figure 1
Using an object-relational mapping layer
Figure 1 provides a deliberately simplified view of the process. In practice, the mapping performed by
the ORM layer will be significantly more complex. You will also need to consider exactly when the
load and save operations must happen to balance the demands of consistency, reliability, scalability,
and performance.
Figure 2
Using event sourcing
Figure 2 illustrates the second approach—using event sourcing in place of an ORM layer and a rela-
tional database management system (RDBMS).
Note: You might decide to implement the event store using an RDBMS. The relational schema will
be much simpler than the schema used by the ORM layer in the first approach. You can also use a
custom event store.
Introducing Ev ent Sourcing 239
What you have also gained with the second approach is a com-
plete history, or audit trail, of the bookings and cancellations for a
conference. Therefore, the event stream becomes your only source of
truth. There’s no need to persist aggregates in any other form or shape
since you can easily replay the events and restore the state of the
system to any point in time.
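Replaying events to restore state at any point in time can be sketched as a simple fold over the event stream. This is a minimal Python illustration under stated assumptions: the event classes, the integer version numbers, and the seats_available function are hypothetical, and the stream is assumed to be ordered by version.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class SeatsBooked:
    version: int    # position of the event in the aggregate's stream (assumed)
    quantity: int


@dataclass(frozen=True)
class SeatsCancelled:
    version: int
    quantity: int


Event = Union[SeatsBooked, SeatsCancelled]


def seats_available(capacity: int, events: list[Event],
                    up_to_version: Optional[int] = None) -> int:
    """Rebuild the number of available seats by replaying events in order."""
    available = capacity
    for event in events:
        if up_to_version is not None and event.version > up_to_version:
            break  # stop replaying at the requested point in the history
        if isinstance(event, SeatsBooked):
            available -= event.quantity
        elif isinstance(event, SeatsCancelled):
            available += event.quantity
    return available


stream = [SeatsBooked(1, 5), SeatsBooked(2, 3), SeatsCancelled(3, 2)]
print(seats_available(10, stream))                   # 4
print(seats_available(10, stream, up_to_version=2))  # 2
```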
In some domains, such as accounting, event sourcing is the natural, well-established approach:
accounting systems store individual transactions from which it is always possible to reconstruct
the current state of the system. Event sourcing can bring similar benefits to other domains.
For additional insights into using events as a storage mechanism, see Events as a Storage
Mechanism by Greg Young.
Why should I use event sourcing?
So far, the only justification we have offered for the use of event sourcing is the fact that it
stores a complete history of the events associated with the aggregates in your domain. This is a
vital feature in some domains, such as accounting, where you need a complete audit trail of the
financial transactions, and where events must be immutable. Once a transaction has happened, you
cannot delete or change it, although you can create a new corrective or reversing transaction if
necessary.
The primary benefit of using event sourcing is a built-in audit mechanism that ensures
consistency of transactional data and audit data because these are the same data. Representation
via events allows you to reconstruct the state of any object at any moment in time.
—Paweł Wilkosz (Customer Advisory Council)
“Another problem with the having of two models is that it is necessarily more work. One must
create the code to save the current state of the objects and one must write the code to generate
and publish the events. No matter how you go about doing these things it cannot possibly be
easier than only publishing events, even if you had something that made storing current state
completely trivial to say a document storage, there is still the effort of bringing that into
the project.”
—Greg Young - Why use Event Sourcing?
The following list describes some of the additional benefits that you can derive from using
event sourcing. The significance of the individual benefits will vary depending on the domain
you are working in.
• Performance. Because events are immutable, you can use an append-only operation when you save
them. Events are also simple, standalone objects. Both these factors can lead to better
performance and scalability for the system than approaches that use complex relational
storage models.
• Simplification. Events are simple objects that describe what has happened in the system. By
simply saving events, you are avoiding the complications associated with saving complex
domain objects to a relational store; namely, the object-relational impedance mismatch.
• Audit trail. Events are immutable and store the full history of
the state of the system. As such, they can provide a detailed
audit trail of what has taken place within the system.
• Integration with other subsystems. Events provide a useful
way of communicating with other subsystems. Your event store
can publish events to notify other interested subsystems of
changes to the application’s state. Again, the event store
provides a complete record of all the events that it published to
other systems.
• Deriving additional business value from the event history. By storing events, you have the
ability to determine the state of the system at any previous point in time by querying the
events associated with a domain object up to that point in time. This enables you to answer
historical questions from the business about the system. In addition, you cannot predict what
questions the business might want to ask about the information stored in a system. If you
store your events, you are not discarding information that may prove to be valuable in the
future.
• Production troubleshooting. You can use the event store to troubleshoot problems in a
production system by taking a copy of the production event store and replaying it in a test
environment. If you know the time that an issue occurred in the production system, then you
can easily replay the event stream up to that point to observe exactly what was happening.
• Fixing errors. You might discover a coding error that results in the system calculating an
incorrect value. Rather than fixing the coding error and performing a risky manual adjustment
on a stored item of data, you can fix the coding error and replay the event stream so that the
system calculates the value correctly based on the new version of the code.
• Testing. All of the state changes in your aggregates are recorded as events. Therefore, you
can test that a command had the expected effect on an aggregate by simply checking for the
event.
• Flexibility. A sequence of events can be projected to any desired structural representation.
“Event sourcing can also help with complex testing scenarios where you need to verify that a
given action triggered a specific result. This is especially relevant for negative results,
where you need to verify that an action did not trigger a result; this is frequently not
verified when writing tests, but can easily be instrumented when the changes are being recorded
through events.”
—Alberto Población (Customer Advisory Council)
“As long as you have a stream of events, you can project it to any form, even a conventional SQL
database. For instance, my favorite approach is to project event streams into JSON documents
stored in a cloud storage.”
—Rinat Abdullin, Why Event Sourcing?
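The Testing and Flexibility benefits in the list above can be illustrated together with a short Python sketch. The SeatsAvailability aggregate, its dictionary-shaped events, and the project_to_json function are hypothetical; the sketch only shows the idea of asserting on recorded events (including the negative case, where a rejected command records nothing) and of projecting a stream into another form.

```python
import json


class SeatsAvailability:
    """Hypothetical aggregate that records each state change as an event."""

    def __init__(self, capacity: int) -> None:
        self.remaining = capacity
        self.events: list[dict] = []

    def reserve(self, quantity: int) -> None:
        if quantity > self.remaining:
            return  # command rejected: no state change, so no event is recorded
        self.remaining -= quantity
        self.events.append({"type": "SeatsReserved", "quantity": quantity})


def project_to_json(events: list[dict]) -> str:
    """Project the event sequence into one possible structure: a JSON summary."""
    total = sum(e["quantity"] for e in events if e["type"] == "SeatsReserved")
    return json.dumps({"reserved": total, "reservations": len(events)})


aggregate = SeatsAvailability(capacity=10)
aggregate.reserve(4)    # accepted: exactly one event is expected
aggregate.reserve(20)   # rejected: no further event is expected

# Testing by checking for the event, including the negative result.
assert aggregate.events == [{"type": "SeatsReserved", "quantity": 4}]

print(project_to_json(aggregate.events))  # {"reserved": 4, "reservations": 1}
```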
“From experience, ORMs lead Chapter 4, “A CQRS and ES Deep Dive,” discusses these benefits
you down the path of a in more detail. There are also many illustrations of these benefits in
structural model while ES the reference implementation described in the companion guide Ex-
leads you down the path of a ploring CQRS and Event Sourcing.
behavioral model. Sometimes one just makes more sense than the other. For example, in my own domain
(not model) I get to integrate with other parties that send a lot of really non-interesting
information that I need to send out again later when something interesting happens on my end. It’s
inherently structural. Putting those things into events would be a waste of time, effort, and space.
Contrast this with another part of the domain that benefits a lot from knowing what happened, why
it happened, when it did or didn’t happen, where time and historical data are important to make the
next business decision. Putting that into a structural model is asking for a world of pain. It
depends, get over it, choose wisely, and above all: make your own mistakes.”
—Yves Reynhout (CQRS Advisors Mail List)
Event sourcing concerns
The previous section described some of the benefits you might realize if you decide to use event
sourcing in your system. However, there are some concerns that you may need to address, including:
• Performance. Although event sourcing typically improves the performance of updates, you may need
to consider the time it takes to load domain object state by querying the event store for all of
the events that relate to the state of an aggregate. Using snapshots may enable you to limit the
amount of data that you need to load because you can go back to the latest snapshot and replay the
events from that point forward. See the chapter “A CQRS and ES Deep Dive,” for more information
about snapshots.
• Versioning. You may find it necessary to change the definition of a particular event type or
aggregate at some point in the future. You must consider how your system will be able to handle
multiple versions of an event type and aggregates.
• Querying. Although it is easy to load the current state of an object by replaying its event
stream (or its state at some point in the past), it is difficult or expensive to run a query such
as, “find all my orders where the total value is greater than $250.” However, if you are
implementing the CQRS pattern, you should remember that such queries will typically be executed on
the read side where you can ensure that you can build data projections that are specifically
designed to answer such questions.
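The read-side projection mentioned in the Querying point can be sketched as follows. This is a minimal illustration, not code from the sample application: the OrderPlaced and OrderItemAdded events and the OrderTotalsProjection class are hypothetical names, and the in-memory dictionary stands in for whatever read-side store you choose.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical events, for illustration only.
public sealed class OrderPlaced
{
    public OrderPlaced(Guid orderId) { this.OrderId = orderId; }
    public Guid OrderId { get; }
}

public sealed class OrderItemAdded
{
    public OrderItemAdded(Guid orderId, decimal price, int quantity)
    {
        this.OrderId = orderId;
        this.Price = price;
        this.Quantity = quantity;
    }

    public Guid OrderId { get; }
    public decimal Price { get; }
    public int Quantity { get; }
}

// A read-side projection: denormalized order totals kept up to date from events.
public sealed class OrderTotalsProjection
{
    private readonly Dictionary<Guid, decimal> totals = new Dictionary<Guid, decimal>();

    public void Handle(OrderPlaced e)
    {
        this.totals[e.OrderId] = 0m;
    }

    public void Handle(OrderItemAdded e)
    {
        decimal current;
        this.totals.TryGetValue(e.OrderId, out current); // current is 0 if the order is unknown
        this.totals[e.OrderId] = current + (e.Price * e.Quantity);
    }

    // Answers the query directly, without replaying any event stream.
    public IReadOnlyList<Guid> OrdersWithTotalGreaterThan(decimal threshold)
    {
        return this.totals.Where(t => t.Value > threshold).Select(t => t.Key).ToList();
    }
}
```

The query cost now depends on the size of the projection rather than on the length of every order's event stream.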
CQRS/ES
The CQRS pattern and event sourcing are frequently combined; each adding benefit to the other.
Chapter 2, “Introducing the Command Query Responsibility Segregation Pattern,” suggested that
events can form the basis of the push synchronization of the application’s state from the data
store on the write side to the data store on the read side. Remember that typically the read-side
data store contains denormalized data that is optimized for the queries that are run against your
data; for example, to display information in your application’s UI.
You can use the events you persist in your event store to propagate all the updates made on the
write side to the read side. The read side can use the information contained in the events to
maintain whatever denormalized data you require on the read side to support your queries.
“ES is a great pattern to use to implement the link between the thing that writes and the thing
that reads. It’s by no means the only possible way to create that link, but it’s a reasonable one
and there’s plenty of prior art with various forms of logs and log shipping. The major tipping
point for whether the link is “ES” seem to be whether the log is ephemeral or a permanent source of
truth. The CQRS pattern itself merely mandates a split between the write and the read thing, so ES
is strictly complementary.”
—Clemens Vasters (CQRS Advisors Mail List)
Figure 3
CQRS and event sourcing
Notice how the write side publishes events after it persists them to the event store. This avoids the
need to use a two-phase commit, which you would need if the aggregate were responsible for saving
the event to the event store and publishing the event to the read side.
Normally, these events will enable you to keep the data on the read side up to date practically in
real time; there will be some delay due to the transport mechanism, and Chapter 4, “A CQRS and ES
Deep Dive” discusses the possible consequences of this delay.
You can also rebuild the data on the read side from scratch at any time by replaying the events
from your event store on the write side. You might need to do this if the read side data store got out
of synchronization for some reason, or because you needed to modify the structure of the read-side
data store to support a new query.
You need to be careful replaying the events from the event store to rebuild the read-side data
store if other bounded contexts also subscribe to the same events. It might be easy to empty the
read-side data store before replaying the events; it might not be so easy to ensure the consistency of
another bounded context if it sees a duplicate stream of events.
Remember that the CQRS pattern does not mandate that you use different stores on the read
side and write side. You could decide to use a single relational store with a schema in third normal form
and a set of denormalized views over that schema. However, replaying events is a very convenient
mechanism for resynchronizing the read-side data store with the write-side data store.
Event stores
If you are using event sourcing, you will need a mechanism to store your events and to return the
stream of events associated with an aggregate instance so that you can replay the events to recreate
the state of the aggregate. This storage mechanism is typically referred to as an event store.
You may choose to implement your own event store, or use a third-party offering, such as Jonathan
Oliver’s EventStore. Although you can implement a small-scale event store relatively easily, a
production-quality, scalable one is more of a challenge.
Chapter 8, “Epilogue: Lessons Learned,” summarizes the experiences that our team had implementing
our own event store.
Basic requirements
Typically, when you implement the CQRS pattern, aggregates raise events to publish information to
other interested parties, such as other aggregates, process managers, read models, or other bounded
contexts. When you use event sourcing, you persist these same events to an event store. This enables
you to use those events to load the state of an aggregate by replaying the sequence of events
associated with that aggregate.
Therefore, whenever an aggregate instance raises an event, two things must happen. The system
must persist the event to the event store, and the system must publish the event.
Note: In practice, not all events in a system necessarily have subscribers. You may raise some events
solely as a way to persist some properties of an aggregate.
Whenever the system needs to load the current state of an aggregate, it must query the event store
for the list of past events associated with that aggregate instance.
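The two requirements above (persist and publish on every raised event, and load the past events for one aggregate instance) can be sketched with a deliberately simple in-memory store. InMemoryEventStore, StoredEvent, and the publish callback are hypothetical names used only for illustration, and this sketch makes no attempt to keep the persist and publish steps atomic.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical event envelope: aggregate ID, version, and an opaque payload.
public sealed class StoredEvent
{
    public StoredEvent(Guid aggregateId, int version, string payload)
    {
        this.AggregateId = aggregateId;
        this.Version = version;
        this.Payload = payload;
    }

    public Guid AggregateId { get; }
    public int Version { get; }
    public string Payload { get; }
}

// Illustrative in-memory store; persisting and publishing are NOT atomic here.
public sealed class InMemoryEventStore
{
    private readonly Dictionary<Guid, List<StoredEvent>> streams =
        new Dictionary<Guid, List<StoredEvent>>();
    private readonly Action<StoredEvent> publish;

    public InMemoryEventStore(Action<StoredEvent> publish)
    {
        this.publish = publish;
    }

    public void Append(StoredEvent e)
    {
        List<StoredEvent> stream;
        if (!this.streams.TryGetValue(e.AggregateId, out stream))
        {
            stream = new List<StoredEvent>();
            this.streams.Add(e.AggregateId, stream);
        }

        stream.Add(e);   // 1. persist the event
        this.publish(e); // 2. publish the event
    }

    // Returns the past events for one aggregate instance, ordered by version.
    public IReadOnlyList<StoredEvent> Load(Guid aggregateId)
    {
        List<StoredEvent> stream;
        return this.streams.TryGetValue(aggregateId, out stream)
            ? stream.OrderBy(x => x.Version).ToList()
            : new List<StoredEvent>();
    }
}
```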
Underlying storage
Events are not complex data structures; typically, they have some standard metadata that includes the
ID of the aggregate instance they are associated with and a version number, and a payload with the
details of the event itself. You do not need to use a relational database to store your events; you could
use a NoSQL store, a document database, or a file system.
When you load the persisted events, you will load them in the order in which they were originally
saved. If you are using a relational database, the records should be keyed using the aggregate ID
and a field that defines the ordering of events.
If an aggregate instance has a large number of events, this may affect the time that it takes to
replay all of the events to reload the state of the aggregate. One option to consider in this
scenario is to use a snapshot mechanism. In addition to the full stream of events in the event
store, you can store a snapshot of the state of the aggregate at some recent point in time. To
reload the state of the aggregate, you first load the most recent snapshot, then replay all of the
subsequent events. You could generate the snapshot during the write process; for example, by
creating a snapshot every 100 events.
Note: How frequently you should take snapshots depends on the performance characteristics of
your underlying storage. You will need to measure how long it takes to replay different lengths of
event streams to determine the optimum time to create your snapshots.
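As a rough sketch of the snapshot mechanism described above, the following code reloads an aggregate from an optional snapshot and then replays only the later events. The Account aggregate, the AmountChanged event, and the AccountSnapshot class are hypothetical; a real aggregate would have richer state and several event types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical event: a change to a balance, tagged with the aggregate version.
public sealed class AmountChanged
{
    public AmountChanged(int version, decimal delta) { this.Version = version; this.Delta = delta; }
    public int Version { get; }
    public decimal Delta { get; }
}

// Hypothetical snapshot: the aggregate state as of a given version.
public sealed class AccountSnapshot
{
    public AccountSnapshot(int version, decimal balance) { this.Version = version; this.Balance = balance; }
    public int Version { get; }
    public decimal Balance { get; }
}

public sealed class Account
{
    public int Version { get; private set; }
    public decimal Balance { get; private set; }

    // Rehydrates from the latest snapshot (if any) plus the events saved after it.
    public static Account Load(AccountSnapshot snapshot, IEnumerable<AmountChanged> events)
    {
        var account = new Account();
        if (snapshot != null)
        {
            account.Version = snapshot.Version;
            account.Balance = snapshot.Balance;
        }

        // Events at or below the snapshot version are already reflected in the state.
        var fromVersion = account.Version;
        foreach (var e in events.Where(x => x.Version > fromVersion).OrderBy(x => x.Version))
        {
            account.Balance += e.Delta;
            account.Version = e.Version;
        }

        return account;
    }

    public AccountSnapshot TakeSnapshot()
    {
        return new AccountSnapshot(this.Version, this.Balance);
    }
}
```

Loading with a null snapshot replays the full stream, so both paths can be compared when you measure replay times.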
As an alternative, you could cache heavily used aggregate instances in memory to avoid repeatedly
replaying the event stream.
When an event store persists an event, it must also publish that event. To preserve the
consistency of the system, both operations must succeed or fail together. The traditional approach
to this type of scenario is to use a distributed, two-phase commit transaction that wraps together
the data store append operation and the messaging infrastructure publishing operation. In practice,
you may find that support for two-phase commit transactions is limited in many data stores and
messaging platforms. Using two-phase commit transactions may also limit the performance and
scalability of the system.
Note: For a discussion of two-phase commit transactions and the impact on scalability, see the
article “Your Coffee Shop Doesn’t Use Two-Phase Commit” by Gregor Hohpe.
One of the key problems you must solve if you choose to implement your own event store is how to
achieve this consistency. For example, an event store built on top of Windows Azure table storage
could take the following approach to maintain consistency between persisting and publishing events:
use a transaction to write copies of the event to two entities in the same partition in the same table;
one entity stores an immutable event that constitutes part of the event stream of the aggregate; the
other entity stores an event that is part of a list of events pending publication. You can then have a
process that reads the list of events pending publication, guarantees to publish those events at least
once, and then after publication removes each event from the pending list.
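The pending-publication approach described above can be illustrated with an in-memory stand-in. In this sketch, which is not the team's implementation, a lock plays the role of the single-partition transaction, and EventPartition and PendingEventPublisher are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// In-memory stand-in for one storage partition. The lock is only a placeholder
// for the storage transaction that writes both entities together.
public sealed class EventPartition
{
    private readonly object sync = new object();
    private readonly List<string> stream = new List<string>();  // the aggregate's event stream
    private readonly List<string> pending = new List<string>(); // events awaiting publication

    public void Append(string @event)
    {
        lock (this.sync)
        {
            this.stream.Add(@event);
            this.pending.Add(@event);
        }
    }

    public IReadOnlyList<string> Stream
    {
        get { lock (this.sync) { return this.stream.ToList(); } }
    }

    public IReadOnlyList<string> Pending
    {
        get { lock (this.sync) { return this.pending.ToList(); } }
    }

    public void MarkPublished(string @event)
    {
        lock (this.sync) { this.pending.Remove(@event); }
    }
}

public static class PendingEventPublisher
{
    // Publishes each pending event and removes it only after publication succeeds.
    // A failure between the two steps leads to a later re-publish: at-least-once delivery.
    public static void PublishPending(EventPartition partition, Action<string> publish)
    {
        foreach (var e in partition.Pending)
        {
            publish(e);
            partition.MarkPublished(e);
        }
    }
}
```

Because an event can be published more than once, subscribers need to tolerate duplicate messages.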
An additional set of problems related to consistency occurs if you plan to scale out your event
store across multiple storage nodes, or use multiple writers to write to the store. In this scenario, you
must take steps to ensure the consistency of your data. The data on the write side should be fully
consistent, not eventually consistent. For more information about the CAP theorem and maintaining
consistency in distributed systems, see the next chapter “A CQRS and ES Deep Dive.”
More information
All links in this book are accessible from the book’s online bibliography available at:
http://msdn.microsoft.com/en-us/library/jj619274.
Reference 4: A CQRS and ES Deep Dive
Introduction
This chapter begins with a brief recap of some of the key points from the previous chapters, then
explores in more detail the important concepts that relate to the Command Query Responsibility
Segregation (CQRS) pattern and event sourcing (ES).
If you segregate your data into a write-side store and a read-side store, you are now making it
explicit in your architecture that when you query data, it may be out of date, but that the data on the
read side will be eventually consistent with the data on the write side. This helps you to simplify the
design of the application and makes it easier to implement collaborative applications where multiple
users may be trying to modify the same data simultaneously on the write side.
    public Guid Id
    {
        get { return this.id; }
    }
    ...
    {
        this.handlers.Add(typeof(TEvent), @event => handler((TEvent)@event));
    }
    ...
}
...
Commands
Commands are imperatives; they are requests for the system to perform a task or action. Two
examples are: “book two places on conference X” or “allocate speaker Y to room Z.” Commands are
usually processed just once, by a single recipient.
Both the sender and the receiver of a command should be in the same bounded context. You should not
send a command to another bounded context because you would be instructing that other bounded
context, which has separate responsibilities in another consistency boundary, to perform some work
for you. However, a process manager may not belong to any particular bounded context in the system,
but it still sends commands. Some people also take the view that the UI is not a part of the
bounded context, but the UI still sends commands.
“I think that in most circumstances (if not all), the command should succeed (and that makes the
async story way easier and practical). You can validate against the read model before submitting a
command, and this way being almost certain that it will succeed.”
—Julian Dominguez (CQRS Advisors Mail List)
Command handlers
Commands are sent to a specific recipient, typically an aggregate instance. The command handler
performs the following tasks:
1. It receives a command instance from the messaging infrastructure.
2. It validates that the command is a valid command.
3. It locates the aggregate instance that is the target of the command. This may involve creating
a new aggregate instance or locating an existing instance.
4. It invokes the appropriate method on the aggregate instance, passing in any parameters from the
command.
5. It persists the new state of the aggregate to storage.
Typically, you will organize your command handlers so that you have a class that contains all of
the handlers for a specific aggregate type.
Your messaging infrastructure should ensure that it delivers just a single copy of a command to a
single command handler. Commands should be processed once, by a single recipient.
“I don’t see the reason to retry the command here. When you see that a command could not always be
fulfilled due to race conditions, go talk with your business expert and analyze what happens in
this case, how to handle compensation, offer an alternate solution, or deal with overbooking. As
far as I can see, the only reason to retry is for technical transient failures such as those that
could occur when accessing the state storage.”
—Jérémie Chassaing (CQRS Advisors Mail List)
The following code sample shows a command handler class that handles commands for Order instances.
public class OrderCommandHandler :
    ICommandHandler<RegisterToConference>,
    ICommandHandler<MarkSeatsAsReserved>,
    ICommandHandler<RejectOrder>,
    ICommandHandler<AssignRegistrantDetails>,
    ICommandHandler<ConfirmOrder>
{
    private readonly IEventSourcedRepository<Order> repository;
    ...
        else
        {
            order.UpdateSeats(items);
        }
        repository.Save(order, command.Id.ToString());
    }
    ...
}
This handler handles five different commands for the Order aggregate. The RegisterToConference
command is an example of a command that creates a new aggregate instance. The ConfirmOrder
command is an example of a command that locates an existing aggregate instance. Both examples use
the Save method to persist the instance.
If this bounded context uses an ORM, then the Find and Save methods in the repository class will
locate and persist the aggregate instance in the underlying database.
If this bounded context uses event sourcing, then the Find method will replay the aggregate’s
event stream to recreate the state, and the Save method will append the new events to the aggregate’s
event stream.
Note: If the aggregate generated any events when it processed the command, then these events are
published when the repository saves the aggregate instance.
Developer 1: One of the claims that I often hear for using event sourcing is that it enables you
to capture the user’s intent, and that this is valuable data. It may not be valuable right now,
but if we capture it, it may turn out to have business value at some point in the future.
Developer 2: Sure. For example, rather than saving just a customer’s latest address, we might
want to store a history of the addresses the customer has had in the past. It may also be useful
to know why a customer’s address was changed; perhaps they moved into a new house or
you discovered a mistake with the existing address that you have on file.
Developer 1: So in this example, the intent might help you to understand why the customer
hadn’t responded to offers that you sent, or might indicate that now might be a good time to
contact the customer about a particular product. But isn’t the information about intent, in
the end, just data that you should store? If you do your analysis right, you’d capture the fact
that the reason an address changes is an important piece of information to store.
Developer 2: By storing events, we can automatically capture all intent. If we miss something
during our analysis, but we have the event history, we can make use of that information later.
If we capture events, we don’t lose any potentially valuable data.
Developer 1: But what if the event that you stored was just, “the customer address was
changed?” That doesn’t tell me why the address was changed.
Developer 2: OK. You still need to make sure that you store useful events that capture what is
meaningful from the perspective of the business.
Developer 1: So what do events and event sourcing give me that I can’t get with a well-designed
relational database that captures everything I may need?
Developer 2: It really simplifies things. The schema is simple. With a relational database you
have all the problems of versioning if you need to start storing new or different data. With
event sourcing, you just need to define a new event type.
Developer 1: So what do events and event sourcing give me that I can’t get with a standard
database transaction log?
Developer 2: Using events as your primary data model makes it very easy and natural to do
time-related analysis of data in your system; for example, “what was the balance on the account
at a particular point in time?” or, “what would the customer’s status be if we’d introduced
the reward program six months earlier?” The transactional data is not hidden away and
inaccessible on a tape somewhere; it’s there in your system.
Developer 1: So back to this idea of intent. Is it something special that you can capture using
events, or is it just some additional data that you save?
Developer 2: I guess in the end, the intent is really there in the commands that originate from
the users of the system. The events record the consequences of those commands. If those
events record the consequences in business terms then it makes it easier for you to infer the
original intent of the user.
—Thanks to Clemens Vasters and Adam Dymitruk
The first approach uses an action-based contract that couples the events to a particular aggregate
type. The second approach uses a uniform contract that uses a resource field as a hint to associate
the event with an aggregate type.
Note: How the events are actually stored is a separate issue. This discussion is focusing on how to
model your events.
The advantages of the first approach are:
• Strong typing.
• More expressive code.
• Better testability.
Events
Events report that something has happened. An aggregate or process manager publishes events as
one-way, asynchronous messages to multiple recipients. For example: SeatsUpdated, PaymentCompleted,
and EmailSent.
Variable environment state needs to be stored alongside events in order to have an accurate
representation of the circumstances at the time when the command resulting in the event was
executed, which means that we need to save everything!
Sample Code
The following code sample shows a possible implementation of an event that is used to communicate
between aggregates or process managers. It implements the IEvent interface.
public interface IEvent
{
    Guid SourceId { get; }
}
...
The following code sample shows a possible implementation of an event that is used in an event
sourcing implementation. It extends the VersionedEvent abstract class.
public abstract class VersionedEvent : IVersionedEvent
{
    public Guid SourceId { get; set; }
    ...
The Version property refers to the version of the aggregate. The version is incremented whenever the
aggregate receives a new event.
Event handlers
Events are published to multiple recipients, typically aggregate instances or process managers. The
event handler performs the following tasks:
1. It receives an event instance from the messaging infrastructure.
2. It locates the aggregate or process manager instance that is the target of the event. This may
involve creating a new aggregate instance or locating an existing instance.
3. It invokes the appropriate method on the aggregate or process manager instance, passing in
any parameters from the event.
4. It persists the new state of the aggregate or process manager to storage.
Sample code
public void Handle(SeatsAdded @event)
{
    var availability = this.repository.Find(@event.ConferenceId);
    if (availability == null)
        availability = new SeatsAvailability(@event.ConferenceId);
    availability.AddSeats(@event.SourceId, @event.AddedQuantity);
    this.repository.Save(availability);
}
If this bounded context uses an ORM, then the Find and Save
methods in the repository class will locate and persist the aggregate
instance in the underlying database.
If this bounded context uses event sourcing, then the Find
method will replay the aggregate’s event stream to recreate the state,
and the Save method will append the new events to the aggregate’s
event stream.
The concept of eventual consistency offers a way to make it appear from the outside that we are
meeting these three guarantees. In the CAP theorem, the consistency guarantee specifies that all
the nodes should see the same data at the same time; instead, with eventual consistency we state
that all the nodes will eventually see the same data. It’s important that changes are propagated to
other nodes in the system at a faster rate than new changes arrive in order to avoid the
differences between the nodes continuing to increase. Another way of viewing this is to say that we
will accept that, at any given time, some of the data seen by users of the system could be stale.
For many business scenarios, this turns out to be perfectly acceptable: a business user will accept
that the information they are seeing on a screen may be a few seconds, or even minutes out of date.
Depending on the details of the scenario, the business user can refresh the display a bit later on
to see what has changed, or simply accept that what they see is always slightly out of date. There
are some scenarios where this delay is unacceptable, but they tend to be the exception rather than
the rule.
Note: To better understand the tradeoffs described by the CAP theorem, check out the special issue
of IEEE Computer magazine dedicated to it (Vol.45(no.2), Feb 2012).
“Very often people attempting to introduce eventual consistency into a system run into problems
from the business side. A very large part of the reason of this is that they use the word
consistent or consistency when talking with domain experts / business stakeholders.
...
Business users hear “consistency” and they tend to think it means that the data will be wrong.
That the data will be incoherent and contradictory. This is not actually the case. Instead try
using the words stale or old. In discussions when the word stale is used the business people tend
to realize that it just means that someone could have changed the data, that they may not have the
latest copy of it.”
—Greg Young, Quick Thoughts on Eventual Consistency.
Figure 1
Using a distributed transaction to maintain consistency
The problems that may result from this approach relate to performance and availability. Firstly, both
sides will need to hold locks until both sides are ready to commit; in other words, the transaction can
only complete as fast as the slowest participant can.
This transaction may include more than two participants. If we are scaling the read side by adding
multiple instances, the transaction must span all of those instances.
Secondly, if one node fails for any reason or does not complete the transaction, the transaction
cannot complete. In terms of the CAP theorem, by guaranteeing consistency, we cannot guarantee
the availability of the system.
If you decide to relax your consistency constraint and specify that your read side only needs to
be eventually consistent with the write side, you can change the scope of your transaction. Figure 2
shows how you can make the read side eventually consistent with the write side by using a reliable
messaging transport to propagate the changes.
Figure 2
Using a reliable message transport
In this example, you can see that there is still a transaction. The
scope of this transaction includes saving the changes to the data store
on the write side, and placing a copy of the change onto the queue
that pushes the change to the read side.
This solution does not suffer from the potential performance
problems that you saw in the original solution if you assume that the
messaging infrastructure allows you to quickly add messages to a
queue. This solution is also no longer dependent on all of the read-
side nodes being constantly available because the queue acts as a
buffer for the messages addressed to the read-side nodes.
Note: In practice, the messaging infrastructure is likely to use a publish/subscribe topology
rather than a queue to enable multiple read-side nodes to receive the messages.
This eventual consistency might not be able to guarantee the same order of updates on the read side
as on the write side.
This third example (Figure 3) shows a way you can avoid the need for
a distributed transaction.
Figure 3
No distributed transactions
This example depends on functionality in the write-side data store: it must be able to send a message
in response to every update that the write-side model makes to the data. This approach lends itself
particularly well to the scenario in which you combine CQRS with event sourcing. If the event store
can send a copy of every event that it saves onto a message queue, then you can make the read side
eventually consistent by using this infrastructure feature.
There are two ways to make use of the version number in the aggregate instance:
• Optimistic: Append the event to the event stream if the latest event in the event stream is the
same version as the current, in-memory, instance.
• Pessimistic: Load all the events from the event stream that have a version number greater than
the version of the current, in-memory, instance.
“These are technical performance optimizations that can be implemented on case-by-case bases.”
—Rinat Abdullin (CQRS Advisors Mail List)
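The two uses of the version number listed above can be sketched as follows. VersionedStream and ConcurrencyException are hypothetical names, the stream holds events for a single aggregate instance, and the version is simply the count of stored events; a real event store would enforce the check inside its own storage transaction.

```csharp
using System;
using System.Collections.Generic;

public sealed class ConcurrencyException : Exception
{
    public ConcurrencyException(string message) : base(message) { }
}

// Hypothetical event stream for one aggregate instance.
public sealed class VersionedStream
{
    private readonly object sync = new object();
    private readonly List<string> events = new List<string>();

    // Optimistic: append only if nobody else has appended since the caller
    // loaded the aggregate at expectedVersion. Returns the new version.
    public int Append(string @event, int expectedVersion)
    {
        lock (this.sync)
        {
            if (this.events.Count != expectedVersion)
            {
                throw new ConcurrencyException(
                    "Expected version " + expectedVersion + " but the stream is at " + this.events.Count + ".");
            }

            this.events.Add(@event);
            return this.events.Count;
        }
    }

    // Pessimistic: load the events saved after the in-memory instance's version
    // so that the instance can catch up before it handles the next command.
    public IReadOnlyList<string> LoadAfter(int version)
    {
        lock (this.sync)
        {
            return this.events.GetRange(version, this.events.Count - version);
        }
    }
}
```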
Messaging considerations
Whenever you use messaging, there are a number of issues to consider.
This section describes some of the most significant issues when you
are working with commands and events in a CQRS implementation.
Duplicate messages
An error in the messaging infrastructure or in the message receiving code may cause a message to be
delivered multiple times to its recipient.
Some messaging infrastructures offer a guarantee of at least once delivery. This implies that you
should explicitly handle the duplicate message delivery scenario in your application code.
There are two potential approaches to handling this scenario.
• Design your messages to be idempotent so that duplicate messages have no impact on the
consistency of your data.
• Implement duplicate message detection. Some messaging infrastructures provide a configurable
duplicate detection strategy that you can use instead of implementing it yourself.
For a detailed discussion of idempotency in reliable systems, see the
article “Idempotence Is Not a Medical Condition” by Pat Helland.
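A minimal sketch of duplicate message detection in application code follows. It assumes that every message carries a unique ID assigned by the sender; SeatsReserved and SeatCounter are hypothetical names, and in a real system the set of processed IDs would have to be persisted together with the state it protects.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical message with a unique ID assigned by the sender.
public sealed class SeatsReserved
{
    public SeatsReserved(Guid messageId, int quantity)
    {
        this.MessageId = messageId;
        this.Quantity = quantity;
    }

    public Guid MessageId { get; }
    public int Quantity { get; }
}

public sealed class SeatCounter
{
    private readonly HashSet<Guid> processed = new HashSet<Guid>();

    public int Reserved { get; private set; }

    // Returns false, and changes nothing, when the message was already handled.
    public bool Handle(SeatsReserved message)
    {
        if (!this.processed.Add(message.MessageId))
        {
            return false; // duplicate delivery
        }

        this.Reserved += message.Quantity;
        return true;
    }
}
```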
Lost messages
An error in the messaging infrastructure may cause a message not to
be delivered to its recipient.
Many messaging infrastructures offer guarantees that messages are not lost and are delivered at
least once to their recipient. Alternative strategies that you could implement to detect when
messages have been lost include a handshake process to acknowledge receipt of a message to the
sender, or assigning sequence numbers to messages so that the recipient can determine if it has not
received a message.
Out-of-order messages
The messaging infrastructure may deliver messages to a recipient in an
order different than the order in which the sender sent the messages.
In some scenarios, the order that messages are received in is not significant. If message ordering
is important, some messaging infrastructures can guarantee ordering. Otherwise, you can detect
out-of-order messages by assigning sequence numbers to messages as they are sent. You could also
implement a process manager in the receiver that can hold out-of-order messages until it can
reassemble messages into the correct order.
If messages need to be ordered within a group, you may be able to send the related messages as a
single batch.
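The idea of holding out-of-order messages until they can be reassembled can be sketched with a small receiver-side buffer. Resequencer is a hypothetical class; it assumes the sender assigns consecutive sequence numbers and it keeps held messages only in memory.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical receiver-side buffer that releases messages in sequence order.
public sealed class Resequencer<T>
{
    private readonly Dictionary<long, T> held = new Dictionary<long, T>();
    private readonly Action<T> deliver;
    private long next;

    public Resequencer(long firstSequenceNumber, Action<T> deliver)
    {
        this.next = firstSequenceNumber;
        this.deliver = deliver;
    }

    public int HeldCount
    {
        get { return this.held.Count; }
    }

    public void Receive(long sequenceNumber, T message)
    {
        if (sequenceNumber < this.next || this.held.ContainsKey(sequenceNumber))
        {
            return; // already delivered or already held: ignore the duplicate
        }

        this.held.Add(sequenceNumber, message);

        // Release every message that is now contiguous with the last one delivered.
        T ready;
        while (this.held.TryGetValue(this.next, out ready))
        {
            this.held.Remove(this.next);
            this.deliver(ready);
            this.next++;
        }
    }
}
```

A gap that never closes indicates a lost message, so a production version would also need a timeout or a way to request redelivery.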
Unprocessed messages
A client may retrieve a message from a queue and then fail while it is
processing the message. When the client restarts, the message has
been lost.
Some messaging infrastructures allow you to include the read of the message from the infrastructure
as part of a distributed transaction that you can roll back if the message processing fails.
Another approach, offered by some messaging infrastructures, is to make reading a message a
two-phase operation. First you lock and read the message, then when you have finished processing
the message you mark it as complete and it is removed from the queue or topic. If the message does
not get marked as complete, the lock on the message times out and it becomes available to read
again.
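The two-phase read described above can be illustrated with a toy in-memory queue. PeekLockQueue is a hypothetical class, not the API of any real messaging product: it is not thread-safe, it reuses the message ID as the lock token, and the caller passes in the current time so that the lock timeout is easy to follow.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory queue illustrating a lock-then-complete read.
public sealed class PeekLockQueue
{
    private sealed class Entry
    {
        public Guid Id = Guid.NewGuid();
        public string Body;
        public DateTime LockedUntil = DateTime.MinValue;
    }

    private readonly List<Entry> entries = new List<Entry>();
    private readonly TimeSpan lockDuration;

    public PeekLockQueue(TimeSpan lockDuration)
    {
        this.lockDuration = lockDuration;
    }

    public void Send(string body)
    {
        this.entries.Add(new Entry { Body = body });
    }

    // Phase 1: lock and read the first message that is not currently locked.
    // Returns false when no message is available at the given time.
    public bool TryReceive(DateTime now, out Guid lockToken, out string body)
    {
        foreach (var entry in this.entries)
        {
            if (entry.LockedUntil <= now)
            {
                entry.LockedUntil = now + this.lockDuration;
                lockToken = entry.Id;
                body = entry.Body;
                return true;
            }
        }

        lockToken = Guid.Empty;
        body = null;
        return false;
    }

    // Phase 2: mark the message as complete, which removes it from the queue.
    public void Complete(Guid lockToken)
    {
        this.entries.RemoveAll(e => e.Id == lockToken);
    }
}
```

If the client fails before calling Complete, the message becomes visible again once the lock expires, so the handler may see it more than once.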
If a message still cannot be processed after a number of retries, it is typically sent to a
dead-letter queue for further investigation.
Event versioning
As your system evolves, you may find that you need to make changes to the events that your system
uses. For example:
• Some events may become redundant in that they are no longer raised by any class in your system.
• You may need to define new events that relate to new features or functionality within your
system.
• You may need to modify existing event definitions.
The following sections discuss each of these scenarios in turn.
Redundant events
If your system no longer uses a particular event type, you may be able to simply remove it from the
system. However, if you are using event sourcing, your event store may hold many instances of this
event, and these instances may be used to rebuild the state of your aggregates. Typically, you treat the
events in your event store as immutable. In this case, your aggregates must continue to be able to
handle these old events when they are replayed from the event store even though the system will no
longer raise new instances of this event type.
Note: If the semantic meaning of an event changes, then you should treat that as a new event type,
and not as a new version of an existing event.
Where you have multiple versions of an event type, you have two basic choices of how to handle the
multiple versions: you can either continue to support multiple versions of the event in your domain
classes, or use a mechanism to convert old versions of events to the latest version whenever they are
encountered by the system.
The first option may be the quickest and simplest approach to adopt because it typically doesn’t
require any changes to your infrastructure. However, this approach will eventually pollute your domain
classes as they end up supporting more and more versions of your events, but if you don’t anticipate
many changes to your event definitions this may be acceptable.
The second approach is a cleaner solution: your domain classes only need to support the latest
version of each event type. However, you do need to make changes to your infrastructure to
translate the old event types to the latest type. The issue here is to decide where in your
infrastructure to perform this translation.
One option is to add filtering functionality into your messaging infrastructure so that events are
translated as they are delivered to their recipients; you could also add the translation functionality into
your event handler classes. If you are using event sourcing, you must also ensure that old versions of
events are translated as they are read from the event store when you are rehydrating your aggregates.
Whatever solution you adopt, it must perform the same translation wherever the old version of
the event originates from—another bounded context, an event store, or even from the same bounded
context if you are in the middle of a system upgrade.
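The translation step can be sketched as a single function that every entry point calls, whether the event arrives from the messaging infrastructure or is read from the event store. SeatsAddedV1, SeatsAddedV2, EventUpgrader, and the "General" default are all hypothetical and chosen only for illustration.

```csharp
using System;

// Hypothetical old version of an event.
public sealed class SeatsAddedV1
{
    public Guid ConferenceId { get; set; }
    public int Quantity { get; set; }
}

// Hypothetical latest version, which adds a SeatType property.
public sealed class SeatsAddedV2
{
    public Guid ConferenceId { get; set; }
    public int Quantity { get; set; }
    public string SeatType { get; set; }
}

public static class EventUpgrader
{
    // Translates old versions to the latest one; current events pass through unchanged.
    public static object Upgrade(object @event)
    {
        var v1 = @event as SeatsAddedV1;
        if (v1 == null)
        {
            return @event;
        }

        return new SeatsAddedV2
        {
            ConferenceId = v1.ConferenceId,
            Quantity = v1.Quantity,
            SeatType = "General" // assumed default for events saved before the property existed
        };
    }
}
```

Keeping the translation in one place makes it easier to apply the same rule regardless of where the old event originates.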
Your choice of serialization format may make it easier to handle different versions of events; for
example, JavaScript Object Notation (JSON) deserialization can simply ignore deleted properties, or
the class that the object is deserialized to can provide a meaningful default value for any new property.
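As an illustration of the serialization point, the following sketch uses the System.Text.Json serializer, which by default skips JSON properties that have no matching member and leaves unmatched members at their initial values. PaymentCompletedData, the deleted LegacyCode property, and the "USD" default are hypothetical.

```csharp
using System;
using System.Text.Json;

// Hypothetical current definition of an event payload: an older version had a
// LegacyCode property (since deleted) and had no Currency property.
public sealed class PaymentCompletedData
{
    public Guid OrderId { get; set; }
    public decimal Amount { get; set; }
    public string Currency { get; set; } = "USD"; // default used when old JSON lacks the property
}

public static class EventJson
{
    public static PaymentCompletedData Read(string json)
    {
        // Unknown JSON properties (such as LegacyCode) are ignored by default.
        return JsonSerializer.Deserialize<PaymentCompletedData>(json);
    }
}
```

Reading an old payload such as `{"OrderId":"...","Amount":99.5,"LegacyCode":"X1"}` therefore succeeds, with Currency falling back to its default.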
Task-based UIs
In Figure 3 above, you can see that in a typical implementation of the CQRS pattern, the UI queries
the read side and receives a DTO, and sends commands to the write side. This section describes some
of the impact this has on the design of your UI.
In a typical three-tier architecture or simple CRUD system, the UI also receives data in the form
of DTOs from the service tier. The user then manipulates the DTO through the UI. The UI then sends
the modified DTO back to the service tier. The service tier is then responsible for persisting the
changes to the data store. This can be a simple, mechanical process of identifying the CRUD
operations that the UI performed on the DTO and applying equivalent CRUD operations to the data
store.
There are several things to notice about this typical architecture:
• It uses CRUD operations throughout.
• If you have a domain model, you must translate the CRUD operations from the UI into
something that the domain understands.
• It can lead to complexity in the UI if you want to provide a more natural and intuitive UI that
uses domain concepts instead of CRUD concepts.
• It does not necessarily capture the user’s intent.
• It is simple and well understood.
The following list identifies the changes that occur in your architecture if you implement the CQRS
pattern and send commands from the UI to the write side:
• It does not use CRUD-style operations.
• The domain can act directly in response to the commands from
the UI.
• You can design the UI to construct the commands directly,
making it easier to build a natural and intuitive UI that uses
concepts from the domain.
• It is easier to capture the user’s intent in a command.
• It is more complex and assumes that you have a domain model
in the write side.
• The behavior is typically in one place: the write model.
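The contrast between the two lists can be sketched in code. In this hedged illustration, OrderDto stands for the kind of object a CRUD-style UI sends back, and CancelOrder stands for a command built by the UI. All of the type and member names are hypothetical and are not taken from the reference implementation.

```csharp
using System;

// CRUD style: the UI edits a DTO and sends the whole object back.
// The service can see that Status changed, but not why.
public class OrderDto
{
    public Guid Id { get; set; }
    public string Status { get; set; }
    public int Seats { get; set; }
}

// Command style: the message is named after what the user wants to do.
public class CancelOrder
{
    public Guid OrderId { get; set; }
    public string Reason { get; set; }
}

// Minimal stand-in for a write-model class that acts directly on the command.
public class Order
{
    public Guid Id { get; private set; }
    public bool IsCancelled { get; private set; }
    public string CancellationReason { get; private set; }

    public Order(Guid id)
    {
        Id = id;
    }

    public void Handle(CancelOrder command)
    {
        if (command.OrderId != Id)
        {
            throw new ArgumentException("Command is for a different order.");
        }

        if (IsCancelled)
        {
            return; // Cancelling twice has no further effect.
        }

        IsCancelled = true;
        CancellationReason = command.Reason;
    }
}

public static class Program
{
    public static void Main()
    {
        var orderId = Guid.NewGuid();
        var order = new Order(orderId);

        // The command states the user's intent, not which fields to overwrite.
        order.Handle(new CancelOrder { OrderId = orderId, Reason = "Speaker withdrew" });

        Console.WriteLine(order.IsCancelled);
        Console.WriteLine(order.CancellationReason);
    }
}
```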
A task-based UI is a natural, intuitive UI based on domain concepts that the users of the system
already understand. It does not impose the CRUD operations on the UI or the user. If you implement
the CQRS pattern, your task-based UI can create commands to send to the domain model on the write
side. The commands should map very closely onto the mental model that your users have of the domain,
and should not require any translation before the domain model receives and processes them.
“Every human-computer interaction (HCI) professional I have worked with has been in favor of
task-based UIs. Every user that I have met that has used both styles of UI, task based and grid
based, has reported that they were more productive when using the task-based UI for interactive
work. Data entry is not interactive work.”
—Udi Dahan - Tasks, Messages, & Transactions.
In many applications, especially where the domain is relatively simple, the costs of implementing
the CQRS pattern and adding a task-based UI will outweigh any benefits. Task-based UIs are
particularly useful in complex domains.
There is no requirement to use a task-based UI when you implement the CQRS pattern. In some
scenarios a simple CRUD-style UI is all that’s needed.
“The concept of a task-based UI is more often than not assumed to be part of CQRS; it is not; it is
there so the domain can have verbs, but also capturing the intent of the user is important in
general.”
—Greg Young - CQRS, Task Based UIs, Event Sourcing agh!
Taking advantage of Windows Azure
In Chapter 2, “Introducing the Command Query Responsibility Segregation Pattern,” we suggested that
the motivations for hosting an application in the cloud were similar to the motivations for
implementing the CQRS pattern: scalability, elasticity, and agility. This section describes in more
detail how a CQRS implementation might use some of the specific features of the Windows Azure
platform to provide some of the infrastructure that you typically need when you implement the CQRS
pattern.
Persisting events
The following code sample shows how the implementation persists an event to Windows Azure table
storage.
public void Save(string partitionKey, IEnumerable<EventData> events)
{
    var context = this.tableClient.GetDataServiceContext();
    foreach (var eventData in events)
    {
        // Zero-pad the version to ten digits so that row keys sort in version order.
        var formattedVersion = eventData.Version.ToString("D10");
        context.AddObject(
            this.tableName,
            new EventTableServiceEntity
            {
                PartitionKey = partitionKey,
                RowKey = formattedVersion,
                SourceId = eventData.SourceId,
                SourceType = eventData.SourceType,
                EventType = eventData.EventType,
                Payload = eventData.Payload
            });
        ...
    }

    try
    {
        // Submit all of the pending inserts as a single batch operation.
        this.eventStoreRetryPolicy.ExecuteAction(() =>
            context.SaveChanges(SaveChangesOptions.Batch));
    }
    catch (DataServiceRequestException ex)
    {
        // Translate an HTTP 409 Conflict response into a ConcurrencyException.
        var inner = ex.InnerException as DataServiceClientException;
        if (inner != null && inner.StatusCode == (int)HttpStatusCode.Conflict)
        {
            throw new ConcurrencyException();
        }
        throw;
    }
}
Retrieving events
The following code sample shows how to retrieve the list of events associated with an aggregate.
public IEnumerable<EventData> Load(string partitionKey, int version)
{
    // Row keys are zero-padded versions, so this is the lowest row key to return.
    var minRowKey = version.ToString("D10");
    var query = this.GetEntitiesQuery(partitionKey, minRowKey,
        RowKeyVersionUpperLimit);
    var all = this.eventStoreRetryPolicy.ExecuteAction(() => query.Execute());
    return all.Select(x => new EventData
    {
        Version = int.Parse(x.RowKey),
        SourceId = x.SourceId,
        SourceType = x.SourceType,
        EventType = x.EventType,
        Payload = x.Payload
    });
}
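The events are returned in the correct order because the version number, formatted as a zero-padded,
ten-digit string, is used as the row key: table storage returns the entities in a partition sorted by
row key, and the padding makes that text ordering match the numeric ordering of the versions.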
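The effect of the "D10" format string on ordering can be checked with a few lines of code. This is a small stand-alone sketch that uses no storage API; the ToRowKey helper is a hypothetical name introduced for the illustration.

```csharp
using System;
using System.Linq;

public static class Program
{
    // Same format string as the Save and Load listings: ten digits, zero padded.
    public static string ToRowKey(int version)
    {
        return version.ToString("D10");
    }

    public static void Main()
    {
        var versions = new[] { 10, 2, 1 };

        // Unpadded keys sort as text, so "10" comes before "2".
        var unpadded = versions.Select(v => v.ToString())
                               .OrderBy(k => k, StringComparer.Ordinal);
        Console.WriteLine(string.Join(", ", unpadded)); // 1, 10, 2

        // Padded keys sort in version order.
        var padded = versions.Select(v => ToRowKey(v))
                             .OrderBy(k => k, StringComparer.Ordinal);
        Console.WriteLine(string.Join(", ", padded)); // 0000000001, 0000000002, 0000000010
    }
}
```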
Publishing events
To guarantee that every event is published as well as persisted, you can use the transactional behavior
of Windows Azure table partitions. When you save an event, you also add a copy of the event to a
virtual queue on the same partition as part of a transaction. The following code sample shows a
complete version of the save method that saves two copies of the event.
public void Save(string partitionKey, IEnumerable<EventData> events)
{
    var context = this.tableClient.GetDataServiceContext();
    foreach (var eventData in events)
    {
        var formattedVersion = eventData.Version.ToString("D10");
        context.AddObject(
            this.tableName,
            new EventTableServiceEntity
            {
                PartitionKey = partitionKey,
                RowKey = formattedVersion,
                SourceId = eventData.SourceId,
                SourceType = eventData.SourceType,
                EventType = eventData.EventType,
                Payload = eventData.Payload
            });
        ...
    }

    try
    {
        this.eventStoreRetryPolicy.ExecuteAction(() =>
            context.SaveChanges(SaveChangesOptions.Batch));
    }
    catch (DataServiceRequestException ex)
    {
        var inner = ex.InnerException as DataServiceClientException;
        if (inner != null && inner.StatusCode == (int)HttpStatusCode.Conflict)
        {
            throw new ConcurrencyException();
        }
        throw;
    }
}
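The idea of saving an event and its unpublished copy in one atomic operation can be sketched without any storage SDK. In the illustration below, InMemoryPartition is a hypothetical in-memory stand-in for a single table partition, not the Windows Azure storage API, and the "Unpublished_" row key prefix is an assumption made for the sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// In-memory stand-in for one table partition. It only mimics an
// all-or-nothing batch insert that fails on a duplicate row key.
public class InMemoryPartition
{
    private readonly SortedDictionary<string, string> rows =
        new SortedDictionary<string, string>(StringComparer.Ordinal);

    public void InsertBatch(IEnumerable<KeyValuePair<string, string>> batch)
    {
        var items = batch.ToList();

        // Reject the whole batch if any row key already exists or is repeated.
        if (items.Any(i => rows.ContainsKey(i.Key)) ||
            items.Select(i => i.Key).Distinct().Count() != items.Count)
        {
            throw new InvalidOperationException("Conflict: batch rejected.");
        }

        foreach (var item in items)
        {
            rows.Add(item.Key, item.Value);
        }
    }

    public IList<string> KeysStartingWith(string prefix)
    {
        return rows.Keys.Where(k => k.StartsWith(prefix, StringComparison.Ordinal)).ToList();
    }

    public void Delete(string rowKey)
    {
        rows.Remove(rowKey);
    }
}

public static class EventStoreSketch
{
    // Hypothetical prefix that marks the rows of the "virtual queue".
    public const string UnpublishedPrefix = "Unpublished_";

    public static void Save(InMemoryPartition partition, int version, string payload)
    {
        var rowKey = version.ToString("D10");

        // Both copies succeed or fail together.
        partition.InsertBatch(new[]
        {
            new KeyValuePair<string, string>(rowKey, payload),
            new KeyValuePair<string, string>(UnpublishedPrefix + rowKey, payload)
        });
    }
}

public static class Program
{
    public static void Main()
    {
        var partition = new InMemoryPartition();
        EventStoreSketch.Save(partition, 1, "{ \"seats\": 2 }");

        // A publisher would send each pending copy, then delete it.
        foreach (var key in partition.KeysStartingWith(EventStoreSketch.UnpublishedPrefix))
        {
            Console.WriteLine("Publishing " + key);
            partition.Delete(key);
        }
    }
}
```

Because the two rows share a partition and are written in one batch, an event can never be persisted without also being queued for publishing.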
You can use a task to process the unpublished events: read the unpublished event from the virtual
queue, publish the event on the messaging infrastructure, and delete the copy of the event from the
unpublished queue. The following code sample shows a possible implementation of this behavior.
private readonly BlockingCollection<string> enqueuedKeys;
...
    {
        try
        {
            var pending = this.queue.GetPending(key).AsCachedAnyEnumerable();
            if (pending.Any())
            {
                foreach (var record in pending)
                {
                    var item = record;
                    // Publish the event, then remove its copy from the unpublished queue.
                    this.sender.Send(() => BuildMessage(item));
                    this.queue.DeletePending(item.PartitionKey, item.RowKey);
                }
            }
        }
        catch
        {
            // Re-enqueue the key so that this partition is processed again.
            this.enqueuedKeys.Add(key);
            throw;
        }
    }
}
A word of warning
For example, a process manager (described in Chapter 6, “A Saga on Sagas”) may process a maximum of
two messages per second during its busiest periods. Because a process manager must maintain
consistency when it persists its state and sends messages, it requires transactional behavior. In
Windows Azure, adding this kind of transactional behavior is nontrivial, and you may find yourself
writing code to support this behavior: using at-least-once messaging and ensuring that all of the
message recipients are idempotent. This is likely to be more complex to implement than a simple
distributed transaction.
“Oftentimes when writing software that will be cloud deployed you need to take on a whole slew of
non-functional requirements that you don’t really have...”
—Greg Young (CQRS Advisors Mail List)
More information
All links in this book are accessible from the book’s online bibliography available at:
http://msdn.microsoft.com/en-us/library/jj619274.
Reference 5: Communicating Between Bounded Contexts
Introduction
Bounded contexts are autonomous components, with their own domain models and their own ubiquitous
language. They should not have any dependencies on each other at run time and should be capable of
running in isolation. However, they are a part of the same overall system and do need to exchange
data with one another. If you are implementing the CQRS pattern in a bounded context, you should use
events for this type of communication: your bounded context can respond to events that are raised
outside of the bounded context, and your bounded context can publish events that other bounded
contexts may subscribe to. Events (one-way, asynchronous messages that publish information about
something that has already happened) enable you to maintain the loose coupling between your bounded
contexts. This guidance uses the term integration event to refer to an event that crosses bounded
contexts.
Context maps
A large system, with dozens of bounded contexts, and hundreds of different integration event types,
can be difficult to understand. A valuable piece of documentation records which bounded contexts
publish which integration events, and which bounded contexts subscribe to which integration events.
You can also use the anti-corruption layer to translate incoming integration events. This transla-
tion might include the following operations:
• Mapping to a different event type when the publishing bounded context has changed the type
of an event to one that the receiving bounded context does not recognize.
• Converting to a different version of the event when the publishing bounded context uses a
different version to the receiving bounded context.
If you determine that you need to persist your integration events from a legacy bounded context,
you also need to decide where to store those events: in the legacy publishing bounded context, or the
receiving bounded context. Because you use the integration events in the receiving bounded context,
you should probably store them in the receiving bounded context.
Your event store must have a way to store events that are not associated with an aggregate.
Note: As a practical solution, you could also consider allowing the legacy bounded context to
persist events directly into the event store that your CQRS bounded context uses.
Reference 6: A Saga on Sagas
Process Managers, Coordinating Workflows, and Sagas
Although we have chosen to use the term process manager, sagas may still have a part to play in
a system that implements the CQRS pattern in some of its bounded contexts. Typically, you would
expect to see a process manager routing messages between aggregates within a bounded context, and
you would expect to see a saga managing a long-running business process that spans multiple bounded
contexts.
The following section describes what we mean by the term process manager. This is the working
definition we used during our CQRS journey project.
Note: For a time the team developing the Reference Implementation used the term coordinating
workflow before settling on the term process manager. This pattern is described in the book
“Enterprise Integration Patterns” by Gregor Hohpe and Bobby Woolf.
Process Manager
This section outlines our definition of the term process manager. Before describing the process
manager there is a brief recap of how CQRS typically uses messages to communicate between aggregates
and bounded contexts.
Figure 1
Order processing without using a process manager
In the example shown in Figure 1, each aggregate sends the appropriate command to the aggregate
that performs the next step in the process. The Order aggregate first sends a MakeReservation
command to the Reservation aggregate to reserve the seats requested by the customer. After the
seats have been reserved, the Reservation aggregate raises a SeatsReserved event to notify the
Order aggregate, and the Order aggregate sends a MakePayment command to the Payment aggregate.
If the payment is successful, the Order aggregate raises an OrderConfirmed event to notify
the Reservation aggregate that it can confirm the seat reservation, and the customer that the order
is now complete.
Figure 2
Order processing with a process manager
The example shown in Figure 2 illustrates the same business process as that shown in Figure 1, but this
time using a process manager. Now, instead of each aggregate sending messages directly to other
aggregates, the messages are mediated by the process manager.
This appears to complicate the process: there is an additional object (the process manager) and a
few more messages. However, there are benefits to this approach.
Firstly, the aggregates no longer need to know what the next step in the process is. Originally, the
Order aggregate needed to know that after making a reservation it should try to make a payment by
sending a message to the Payment aggregate. Now, it simply needs to report that an order has been
created.
Secondly, the definition of the message flow is now located in a single place, the process manager,
rather than being scattered throughout the aggregates.
In a simple business process such as the one shown in Figure 1 and Figure 2, these benefits are
marginal. However, if you have a business process that involves six aggregates and tens of messages,
the benefits become more apparent. This is especially true if this is a volatile part of the system where
there are frequent changes to the business process: in this scenario, the changes are likely to be
localized to a limited number of objects.
In Figure 3, to illustrate this point, we introduce wait listing to the process. If some of the seats
requested by the customer cannot be reserved, the system adds these seat requests to a waitlist. To
make this change, we modify the Reservation aggregate to raise a SeatsNotReserved event to report
how many seats could not be reserved in addition to the SeatsReserved event that reports how many
seats could be reserved. The process manager can then send a command to the WaitList aggregate
to waitlist the unfulfilled part of the request.
Figure 3
Order processing with a process manager and a waitlist
It’s important to note that the process manager does not perform any business logic. It only routes
messages, and in some cases translates between message types. For example, when it receives a
SeatsNotReserved event, it sends an AddToWaitList command.
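The routing role described above can be sketched as a class whose handlers each turn one incoming event into one outgoing command. This is an illustrative sketch rather than the reference implementation: the message names SeatsReserved, SeatsNotReserved, MakeReservation, MakePayment, and AddToWaitList come from the text, while OrderCreated, the message properties, and the OrderProcessManager class are assumptions.

```csharp
using System;
using System.Collections.Generic;

// Events (the OrderCreated name and all properties are assumptions).
public class OrderCreated { public Guid OrderId { get; set; } public int Seats { get; set; } }
public class SeatsReserved { public Guid OrderId { get; set; } public int Seats { get; set; } }
public class SeatsNotReserved { public Guid OrderId { get; set; } public int Seats { get; set; } }

// Commands.
public class MakeReservation { public Guid OrderId { get; set; } public int Seats { get; set; } }
public class MakePayment { public Guid OrderId { get; set; } }
public class AddToWaitList { public Guid OrderId { get; set; } public int Seats { get; set; } }

// Routes messages only: no handler contains a business rule.
public class OrderProcessManager
{
    private readonly Action<object> send;

    public OrderProcessManager(Action<object> send)
    {
        this.send = send;
    }

    public void Handle(OrderCreated e)
    {
        send(new MakeReservation { OrderId = e.OrderId, Seats = e.Seats });
    }

    public void Handle(SeatsReserved e)
    {
        send(new MakePayment { OrderId = e.OrderId });
    }

    public void Handle(SeatsNotReserved e)
    {
        // Translates between message types: an event in, a command out.
        send(new AddToWaitList { OrderId = e.OrderId, Seats = e.Seats });
    }
}

public static class Program
{
    public static void Main()
    {
        var sent = new List<object>();
        var manager = new OrderProcessManager(sent.Add);
        var orderId = Guid.NewGuid();

        manager.Handle(new OrderCreated { OrderId = orderId, Seats = 5 });
        manager.Handle(new SeatsReserved { OrderId = orderId, Seats = 3 });
        manager.Handle(new SeatsNotReserved { OrderId = orderId, Seats = 2 });

        foreach (var command in sent)
        {
            Console.WriteLine(command.GetType().Name);
        }
    }
}
```

Adding the waitlist step touched only the process manager and the new message types, which is the localization of change that the text describes.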
Reference 7:
Queues
Windows Azure Service Bus queues provide a durable mechanism for senders to send one-way mes-
sages for delivery to a single consumer.
Figure 1 shows how a queue delivers messages.
Figure 1
Windows Azure Service Bus Queue
Figure 2
Windows Azure Service Bus Topic
Reading messages
A consumer can use one of two modes to retrieve messages from queues or subscriptions:
ReceiveAndDelete mode and PeekLock mode.
In the ReceiveAndDelete mode, a consumer retrieves a message in a single operation: the Service
Bus delivers the message to the consumer and marks the message as deleted. This is the simplest mode
to use, but there is a risk that a message could be lost if the consumer fails between retrieving the
message and processing it.
In the PeekLock mode, a consumer retrieves a message in two steps. First, the consumer requests
the message, and the Service Bus delivers the message to the consumer and marks the message on the
queue or subscription as locked. Then, when the consumer has finished processing the message, it
informs the Service Bus so that it can mark the message as deleted. In this scenario, if the consumer
fails between retrieving the message and completing its processing, the message is re-delivered when
the consumer restarts. A timeout ensures that locked messages become available again if the consumer
does not complete the second step.
In the PeekLock mode, it is possible that a message could be delivered twice in the event of a
failure. This is known as at-least-once delivery. You must either ensure that the messages are
idempotent, or add logic to the consumer to detect duplicate messages and ensure exactly-once
processing. Every message has a unique, unchanging Id, which facilitates checking for duplicates.
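Duplicate detection in the consumer can be sketched as a check against the set of message Ids that have already been processed. The Message and DeduplicatingConsumer types below are hypothetical stand-ins and do not use the Service Bus API. The sketch keeps the Ids in memory; a real consumer would need to persist them together with the state they protect.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical message shape: only the unique Id matters for this sketch.
public class Message
{
    public string Id { get; set; }
    public int Amount { get; set; }
}

public class DeduplicatingConsumer
{
    private readonly HashSet<string> processedIds = new HashSet<string>();

    public int Balance { get; private set; }

    // Returns true if the message was applied, false if it was a duplicate.
    public bool Handle(Message message)
    {
        // HashSet.Add returns false when the Id has been seen before.
        if (!processedIds.Add(message.Id))
        {
            return false;
        }

        Balance += message.Amount;
        return true;
    }
}

public static class Program
{
    public static void Main()
    {
        var consumer = new DeduplicatingConsumer();
        var message = new Message { Id = "m-1", Amount = 50 };

        Console.WriteLine(consumer.Handle(message)); // True: first delivery
        Console.WriteLine(consumer.Handle(message)); // False: redelivery is ignored
        Console.WriteLine(consumer.Balance);         // 50
    }
}
```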
You can use the PeekLock mode to make your application more robust when it receives messages.
You can maintain consistency between the messages you receive and a database without using
a distributed transaction.
Sending messages
When you create a client to send messages, you can set the RequiresDuplicateDetection and
DuplicateDetectionHistoryTimeWindow properties in the QueueDescription or TopicDescription class. You
can use the duplicate detection feature to ensure that a message is sent only once. This is useful if
you retry sending a message after a failure and you don’t know whether it was previously sent.
The duplicate detection feature also makes your application more robust when it sends messages: you
can maintain consistency between the messages you send and a database without using a distributed
transaction.
Expiring messages
When you create a BrokeredMessage object, you can specify an expiry time using the ExpiresAtUtc
property or a time to live using the TimeToLive property. You can specify whether an expired message
is sent to a dead letter queue or discarded.
Serializing messages
You must serialize your Command and Event objects if you are sending them over the Windows Azure
Service Bus.
The Contoso Conference Management System uses the Json.NET serializer to serialize command and
event messages. The team chose to use this serializer because of its flexibility and resilience to version
changes.
The following code sample shows the adapter class in the Common project that wraps the Json.NET
serializer.
public class JsonSerializerAdapter : ISerializer
{
    private JsonSerializer serializer;
    ...
        this.serializer.Serialize(writer, graph);
    ...
        return this.serializer.Deserialize(reader);
    }
}
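The resilience to version changes that motivated the choice of a JSON serializer can be illustrated with a short example. To keep the sketch free of external packages it uses the System.Text.Json serializer that ships with current versions of .NET, not Json.NET, so it shows the general behavior rather than the adapter above. The OrderPlaced class and its properties are hypothetical.

```csharp
using System;
using System.Text.Json;

// Latest shape of a hypothetical event. Compared with the stored JSON below,
// it no longer has a Notes property and has gained a Currency property.
public class OrderPlaced
{
    public decimal Total { get; set; }

    // Default value used when an older message does not contain the property.
    public string Currency { get; set; } = "USD";
}

public static class Program
{
    public static OrderPlaced Read(string json)
    {
        return JsonSerializer.Deserialize<OrderPlaced>(json);
    }

    public static void Main()
    {
        // Written by an older version: it has Notes and lacks Currency.
        var oldJson = "{\"Total\": 12.5, \"Notes\": \"legacy field\"}";

        var placed = Read(oldJson);

        // The unknown Notes property is ignored; Currency keeps its default.
        Console.WriteLine(placed.Currency); // USD
    }
}
```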
Further information
For general information about the Windows Azure Service Bus, see Service Bus on MSDN.
For more information about Service Bus topologies and patterns, see Overview of Service Bus
Messaging Patterns on MSDN.
For information about scaling the Windows Azure Service Bus infrastructure, see Best Practices
for Performance Improvements Using Service Bus Brokered Messaging on MSDN.
For information about Json.NET, see Json.NET.
DependencyResolver.SetResolver(new UnityServiceLocator(this.container));
...
}
The MVC controller classes no longer have parameter-less constructors. The following code sample
shows the constructor from the RegistrationController class:
private ICommandBus commandBus;
private Func<IViewRepository> repositoryFactory;
Further information
For more information about the Unity Application Block, see Unity Application Block on MSDN.
Tales from the Trenches
Twilio
Product overview
Twilio provides high-availability voice and SMS APIs, hosted in the cloud, that enable developers to
add automated voice and SMS capabilities to a wide range of applications.
Although Twilio did not explicitly implement the CQRS pattern or use event sourcing, many of
the fundamental concepts implicit in their designs are very similar to concepts that relate to the CQRS
pattern, including splitting read and write models and relaxing consistency requirements.
Lessons learned
This section summarizes some of the key lessons learned by Twilio during the development of the
Twilio APIs and services.
• It’s important to understand the units of failure for the different pieces that make up a system,
and then to design the system to be resilient to those failures. Typical
units of failure might be an individual host, a datacenter or zone, a geographic region, or a cloud
service provider. Identifying units of failure applies both to code deployed by Twilio, and to
technologies provided by a vendor, such as data storage or queuing infrastructure. From the
perspective of a risk profile, units of failure at the level of a host are to be preferred because it
is easier and cheaper to mitigate risk at this level.
• Not all data requires the same level of availability. Twilio gives its developers different primitives
to work with that offer three levels of availability for data: a distributed queuing system that is
resilient to host and zone failures, a replicated database engine that replicates across regions,
and an in-memory distributed data store for high availability. These primitives enable the
developers to select a storage option with a specified unit of failure. They can then choose a
store with appropriate characteristics for a specific part of the application.
Idempotency
An important lesson that Twilio learned in relation to idempotency is the importance of assigning the
token that identifies the specific operation or transaction that must be idempotent as early in the
processing chain as possible. The later the token is assigned, the harder it is to test for correctness and
the more difficult it is to debug. Although Twilio doesn’t currently offer this, it would like to be
able to allow its customers to set the idempotency token when they make a call to one of the Twilio APIs.
No-downtime deployments
To enable no-downtime migrations as part of the continuous deployment of their services, Twilio uses
risk profiles to determine what process must be followed for specific deployments. For example, a
change to the content of a website can be pushed to production with a single click, while a change to
a REST API requires continuous integration testing and a human sign-off. Twilio also tries to ensure
that changes to data schemas do not break existing code, so the application can keep running
without losing requests while the model is updated using a pivoting process.
Some features are also initially deployed in a learning mode. This means that the full processing
pipeline is deployed with a no-op at the end so that the feature can be tested with production traffic,
but without any impact on the existing system.
Performance
Twilio has four different environments: a development environment, an integration environment, a
staging environment, and a production environment. Performance testing, which is part of cluster
testing, happens automatically in the integration and staging environments. The performance tests
that take a long time to run happen on an ongoing basis in the integration environment and may not
be repeated in the staging environment.
If load levels are predictable, there is less of a requirement to use asynchronous service
implementations within the application because you can scale your worker pools to handle the demand.
However, when you experience big fluctuations in demand and you don’t want to use a callback mechanism
because you want to keep the request open, then it makes sense to make the service implementation
itself asynchronous.
Twilio identified a trade-off in how to effectively instrument their systems to collect performance
monitoring data. One option is to use a common protocol for all service interactions that enables the
collection of standard performance metrics through a central instrumentation server. However, it’s
not always desirable to enforce the use of a common protocol and specific interfaces
because they may not be the best choice in all circumstances. Different teams at Twilio make their
own choices about protocols and instrumentation techniques based on the specific requirements of
the pieces of the application they are responsible for.
References
For further information relating to Twilio, see:
• Twilio.com
• High-Availability Infrastructure in the Cloud
• Scaling Twilio
• Asynchronous Architectures for Implementing Scalable Cloud Services
• Why Twilio Wasn’t Affected by Today’s AWS Issues
Tales from the Trenches
Lokad Hub
Project overview
Lokad Hub is an infrastructure element that unifies the metered, pay-as-you-go, forecasting
subscription offered by Lokad. It also provides an intelligent, self-managing, business backend for
Lokad’s internal teams.
Lokad requires this piece of infrastructure to be extremely flexible, focused, self-managing, and
capable of surviving cloud outages. Key features of Lokad Hub include:
• Multi-tenancy
• Scalability
• Instant data replication to multiple locations
• Deployable to any cloud
• Supports multiple production deployments daily
• Full audit logs and the ability to roll back to any point in time
• Integration with other systems
The current version was developed using the domain-driven design (DDD) approach, implements the
CQRS pattern, and uses event sourcing (ES). It is a replacement for a legacy, CRUD-style system.
For Lokad, the two key benefits of the new system are the low development friction that makes
it possible to perform multiple deployments per day, and the ability to respond quickly to changes in
the system’s complex business requirements.
Lessons learned
This section summarizes some of the key lessons learned by Lokad during the development of
Lokad Hub.
Benefits of DDD
The team at Lokad adopted the DDD approach in the design and development of Lokad Hub. The
DDD approach helped to divide the complex domain into multiple bounded contexts. It was then
possible to model each bounded context separately and select the most appropriate technologies for
that bounded context. In this project, Lokad chose a CQRS/ES implementation for each bounded
context.
Lokad captured all the business requirements for the system in the models as code. This code
became the foundation of the new system.
However, it did take some time (and multiple iterations) to build these models and correctly
capture all of the business requirements.
Reducing dependencies
The core business logic depends only on message contracts and the Lokad.CQRS portability
interfaces. Therefore, the core business logic does not have any dependencies on specific storage
providers, object-relational mappers, specific cloud services, or dependency injection containers.
This makes it extremely portable, and simplifies the development process.
Using sagas
Lokad decided not to use sagas in Lokad Hub because they found them to be overly complex and
non-transparent. Lokad also found issues with trying to use sagas when migrating data from the
legacy CRUD system to the new event-sourced system.
Migration to ES
Lokad developed a custom tool to migrate data from the legacy SQL data stores into event streams
for the event-sourced aggregates in the new system.
Using projections
Projections of read-side data, in combination with state-of-the-art UI technologies, made it quicker
and easier to build a new UI for the system.
The development process also benefited from the introduction of smart projections that are
rebuilt automatically on startup if the system detects any changes in them.
Event sourcing
Event sourcing forms the basis of the cloud failover strategy for the system, by continuously
replicating events from the primary system. This strategy has three goals:
• All data should be replicated to multiple clouds and datacenters within one second.
• There should be read-only versions of the UI available immediately if the core system becomes
unavailable for any reason.
• A full read/write backup system can be enabled manually if the primary system becomes
unavailable.
Although it would be possible to push this further and even have a zero-downtime strategy, this
would bring additional complexity and costs. For this system, a guaranteed recovery within a dozen
minutes is more than adequate.
The most important aspect of this strategy is the ability to keep valuable customer data safe and
secure even in the face of global cloud outages.
Event sourcing also proved invaluable when a glitch in the code was discovered soon after the
initial deployment. It was possible to roll the system back to a point in time before the glitch
manifested itself, fix the problem in the code, and then restart the system.
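Rolling back to a point in time is possible because, in an event-sourced system, state is derived by replaying events. The following sketch rebuilds a value from only those events recorded up to a chosen cutoff. It is a generic illustration and does not describe Lokad's implementation: the BalanceChanged event and the Replay class are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical event: a signed change to a balance.
public class BalanceChanged
{
    public DateTime OccurredAtUtc { get; set; }
    public decimal Delta { get; set; }
}

public static class Replay
{
    // Rebuilds the state from the events recorded up to and including the cutoff.
    public static decimal BalanceAsOf(IEnumerable<BalanceChanged> history, DateTime cutoffUtc)
    {
        return history
            .Where(e => e.OccurredAtUtc <= cutoffUtc)
            .Sum(e => e.Delta);
    }
}

public static class Program
{
    public static void Main()
    {
        var start = new DateTime(2012, 1, 1, 0, 0, 0, DateTimeKind.Utc);
        var history = new List<BalanceChanged>
        {
            new BalanceChanged { OccurredAtUtc = start, Delta = 100m },
            new BalanceChanged { OccurredAtUtc = start.AddHours(1), Delta = -30m },
            // Suppose this event was produced by the faulty code.
            new BalanceChanged { OccurredAtUtc = start.AddHours(2), Delta = -1000m }
        };

        // State as it was before the faulty event was recorded.
        Console.WriteLine(Replay.BalanceAsOf(history, start.AddHours(1))); // 70
    }
}
```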
Infrastructure
When there are multiple bounded contexts to integrate (at least a dozen in the case of Lokad Hub)
it’s important to have a high-level view of how they integrate with each other. The infrastructure that
supports the integration should also make it easy to support and manage the integration in a clean and
enabling fashion.
Once you have over 100,000 events to keep and replay, simple file-based or blob-based event
stores become limiting. With these volumes, it is better to use a dedicated event-streaming server.
References
For further information relating to Lokad Hub, see:
• Case: Lokad Hub
• Lokad.com
• Lokad Team
Tales from the Trenches
Project overview
The following is a list of the overall goals of the project. We wanted to:
• Build a sample reference architecture for enterprise level applications with the main emphasis
on performance, scalability, reliability, extensibility, testability, and modularity.
• Enforce SOLID (single responsibility, open-closed, Liskov substitution, interface segregation,
and dependency inversion) principles.
• Utilize test-driven development and evaluate performance early and often as part of our
application lifecycle management (ALM).
• Provide abstraction and interoperability with third-party and legacy systems.
• Address infrastructure concerns such as authentication (by using claims-based, trusted
subsystems), and server and client side caching (by using AppFabric for Windows Server).
• Include the capabilities necessary to support various types of clients.
We wanted to use the CQRS pattern to help us to improve the performance, scalability, and
reliability of the system.
On the read side, we have a specialized query context that exposes the data in the exact format
that the UI clients require, which minimizes the amount of processing they must perform. This
separation provided great value in terms of a performance boost and enabled us to get very close to the
optimal performance of our web server with the given hardware specification.
On the write side, our command service allows us to add queuing for commands if necessary and
to add event sourcing to create an audit log of the changes performed, which is a critical component
for any financial system. Commands provided a very loosely coupled model to work with our domain.
From the ALM perspective, commands provide a useful abstraction for our developers enabling them
to work against a concrete interface and with clearly defined contracts. Handlers can be maintained
independently and changed on demand through a registration process: this won’t break any service
contracts, and no code recompilation will be required.
This case study is based on contributions by Alex Dubinkov and Tim Walton.
The initial reference architecture application deals with financial advisor allocation models. The
application shows the customers assigned to the financial advisor, and the distribution of their alloca-
tions as compared to the modeled distribution that the customer and financial advisor had agreed
upon.
Lessons learned
This section summarizes some of the lessons learned during this project.
Query performance
While testing queries against a de-normalized context for one of the pilot applications, we couldn’t
get the throughput, measured in requests per second, that we expected even though the CPU and
memory counters were all showing in-range values. Later on, we observed severe saturation of the
network both on the testing clients and on the server. Reviewing the amount of data we were
querying for each call, we discovered it to be about 1.6 Mb.
To resolve this issue we:
• Enabled compression on IIS, which significantly reduced the amount of data returned from the
Open Data Protocol (OData) service.
• Created a highly de-normalized context that invokes a stored procedure that uses pivoting in
SQL to return just the final “model layout” back to the client.
• Cached the results in the query service.
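Two of these measures, compressing the response and caching results in the query service, can be illustrated with a small Python sketch. It is not the IIS or OData configuration the team used. The CachedQueryService class, the 30-second time-to-live, and the fake query are all assumptions made for the example.

```python
import gzip
import json
from typing import Callable, Dict, Tuple


class CachedQueryService:
    """Hypothetical wrapper: caches compressed query results for ttl_seconds."""

    def __init__(self, run_query: Callable[[str], list], ttl_seconds: float,
                 clock: Callable[[], float]) -> None:
        self._run_query = run_query
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[float, bytes]] = {}
        self.backend_calls = 0

    def get_compressed(self, advisor_id: str) -> bytes:
        now = self._clock()
        hit = self._cache.get(advisor_id)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh: skip the database entirely
        self.backend_calls += 1
        rows = self._run_query(advisor_id)
        payload = gzip.compress(json.dumps(rows).encode("utf-8"))
        self._cache[advisor_id] = (now, payload)
        return payload


def fake_query(advisor_id: str) -> list:
    # Stand-in for the de-normalized query; repetitive rows compress well.
    return [{"advisor": advisor_id, "asset_class": "Equity", "pct": 60}] * 500


fake_time = [0.0]  # injectable clock so the example needs no sleeping
service = CachedQueryService(fake_query, ttl_seconds=30, clock=lambda: fake_time[0])

raw_size = len(json.dumps(fake_query("a-1")).encode("utf-8"))
compressed = service.get_compressed("a-1")   # first call hits the backend
service.get_compressed("a-1")                # second call is served from the cache
fake_time[0] = 31.0
service.get_compressed("a-1")                # entry expired, backend is hit again
```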
Commands
We developed both execute and compensate operations for command handlers and use a technique
of batching commands that are wrapped in a transaction scope. It is important to use the correct
scope in order to reduce the performance impact.
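A rough Python sketch of pairing execute and compensate operations in a batch follows. It uses explicit compensation only and does not model a database transaction scope. The run_batch and buy helpers and the balances dictionary are hypothetical.

```python
from typing import Callable, List, Tuple

Step = Tuple[Callable[[], None], Callable[[], None]]  # (execute, compensate)


def run_batch(steps: List[Step]) -> bool:
    """Run each execute step; on failure, compensate completed steps in reverse order."""
    done: List[Callable[[], None]] = []
    for execute, compensate in steps:
        try:
            execute()
        except Exception:
            for undo in reversed(done):
                undo()
            return False
        done.append(compensate)
    return True


balances = {"cash": 100, "equity": 0}  # hypothetical state touched by the commands


def buy(amount: int) -> Step:
    def execute() -> None:
        if balances["cash"] < amount:
            raise ValueError("insufficient cash")
        balances["cash"] -= amount
        balances["equity"] += amount

    def compensate() -> None:
        balances["cash"] += amount
        balances["equity"] -= amount

    return execute, compensate


ok = run_batch([buy(60), buy(30)])
failed = run_batch([buy(5), buy(50)])  # second step fails, so the first is undone
```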
One-way commands needed a special way to pass error notifications or results back to the caller.
Different messaging infrastructures (Windows Azure Service Bus, NServiceBus) support this function-
ality in different ways, but for our on-premises solution, we had to come up with our own custom
approach.
Because the system is very loosely coupled, it is critical that we have a highly organized bootstrap-
ping mechanism that is generic enough to provide modularity and materialization for the specific
container, mapping and logging choices.
More information
All links in this book are accessible from the book’s online bibliography available at:
http://msdn.microsoft.com/en-us/library/jj619274.
Tales from the Trenches
Digital Marketing
Refactoring an existing application has many challenges. Our story is about refactoring an existing
application over an extended period of time while still delivering new features. We didn’t start with
CQRS as the goal, which was a good thing. It became a good fit as we went along. Our product is
composed of multiple pieces, of which our customer facing portal (CFP) uses CQRS.
There are many aspects of the DMS that fit well with CQRS, but there were two main problems
we were trying to solve: slow reads and bloated View Objects (VO).
The CFP has a very large dataset with many tables containing tens of millions of rows; at the ex-
treme some tables have millions of rows for a single client. Generally, the best practice for this amount
of data in SQL Server is highly denormalized tables—ours is no exception. A large portion of our
value add is presenting structured and reporting data together, allowing clients to make the most
informed decision when altering their structured data. The combination of structured and reporting data required
many SQL joins and some of our page load times were over 20 seconds. There was a lot of friction for
users to make simple changes.
The combination of structured and reporting data also resulted in bloated View Objects. The CFP
suffered from the same woes that many long lived applications do—lots of cooks in the kitchen but
a limited set of ingredients. Our application has a very rich UI resulting in the common Get/Modify/
Save pattern. A VO started out with a single purpose: we need data on screen A. A few months later
we needed a similar screen B that had some of the same data. Fear not, we already had most of that,
we just needed to show it on screen B too—after all we wouldn’t want to duplicate code. Fast for-
ward a few months and our two screens have evolved independently even though they represented
“basically the same data.” Worse yet, our VO has been used in two more screens and one of them has
already been deprecated. At this point we are lucky if the original developers still remember what
values from the VO are used on which screens. Oh wait, it’s a few years later and the original develop-
ers don’t even work here anymore! We would often find ourselves trying to persist a VO from the UI
and unable to remember which magical group of properties must be set. It is very easy to violate the
Single Responsibility Principle in the name of reuse. There are many solutions to these problems and
CQRS is but one tool for making better software.
Before trying to make large architectural changes there are a few things we found to be very
successful for the CFP: Dependency Injection (DI) and Bounded Contexts.
Make your objects injectable and go get a DI Container. Changing a legacy application to be in-
jectable is a very large undertaking and will be painful and difficult. Often the hardest part is sorting
out the object dependencies. But this was completely necessary later on. As the CFP became inject-
able it was possible to write unit tests allowing us to refactor with confidence. Now that our applica-
tion was modular, injectable, and unit tested we could choose any architecture we wanted.
Since we decided to stick with CQRS, it was a good time to think about bounded contexts. First
we needed to figure out the major components of the overall product. The CFP is one bounded
context and only a portion of our overall application. It is important to determine bounded contexts
because CQRS is best applied within a bounded context and not as an integration strategy.
One of our challenges with CQRS has been physically separating our bounded contexts. Refactor-
ing has to deal with an existing application and the previous decisions that were made. In order to split
the CFP into its own bounded context we needed to vastly change the dependency graph. Code that
handles cross cutting concerns was factored into reference assemblies; our preference has been NuGet
packages built and hosted by TeamCity. All the remaining code that was shared between bounded
contexts needed to be split into separate solutions. Long term we would recommend separate re-
positories to ensure that code is not referenced across the bounded contexts. For the CFP we had too
much shared code to be able to completely separate the bounded contexts right away, but having
done so would have spared much grief later on.
It is important to start thinking about how your bounded contexts will communicate with each
other. Events and event sourcing are often associated with CQRS for good reason. The CFP uses
events to keep an auditable change history which results in a very obvious integration strategy of
eventing.
At this point the CFP is modular, injectable, testable (not necessarily fully tested), and beginning
to be divided by bounded context but we have yet to talk about CQRS. All of this ground work is
necessary to change the architecture of a large application—don’t be tempted to skip it.
The first piece of CQRS we started with was the commands and queries. This might seem ob-
tusely obvious but I point it out because we did not start with eventing, event sourcing, caching, or
even a bus. We created some commands and a bit of wiring to map them to command handlers. If you
took our advice earlier and you are using an Inversion of Control (IoC) container, the mapping of
command to command handler can be done in less than a day. Since the CFP is now modular and in-
jectable our container can create the command handler dependencies with minimal effort which al-
lowed us to wire our commands into our existing middleware code. Most applications already have a
remoting or gateway layer that performs this function of translating UI calls into middleware / VO
functions. In the CFP, the commands and queries replaced that layer.
One of our challenges has been to refactor an existing UI to a one-way command model. We have
not been able to make a strict one-way contract mainly due to database side ID generation. We are
working towards client side ID generation which will allow us to make commands fire and forget. One
technique that has helped a bit was to wrap the one-way asynchronous bus in a blocking bus. This
helped us to minimize the amount of code that depends on the blocking capability. Even with that we
have too much code that relies upon command responses simply because the functionality was avail-
able, so try not to do this if possible.
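The wrapping technique might look roughly like the Python sketch below. It is an illustration under assumptions, not the CFP code: OneWayBus, BlockingBus, send_and_wait, and the create_campaign handler are hypothetical names, and a worker thread stands in for the real messaging infrastructure.

```python
import itertools
import queue
import threading
import uuid
from typing import Callable, Dict, List, Optional


class OneWayBus:
    """Hypothetical fire-and-forget bus: send() returns nothing to the caller."""

    def __init__(self, handler: Callable[[dict], object]) -> None:
        self._queue: "queue.Queue[dict]" = queue.Queue()
        self._handler = handler
        self.completed_callbacks: List[Callable[[Optional[str], object], None]] = []
        threading.Thread(target=self._work, daemon=True).start()

    def send(self, command: dict) -> None:
        self._queue.put(command)

    def _work(self) -> None:
        while True:
            command = self._queue.get()
            result = self._handler(command)
            for callback in list(self.completed_callbacks):
                callback(command.get("id"), result)


class BlockingBus:
    """Hypothetical wrapper: the only place that waits for a command result."""

    def __init__(self, inner: OneWayBus) -> None:
        self._inner = inner
        self._pending: Dict[str, "queue.Queue[object]"] = {}
        self._lock = threading.Lock()
        inner.completed_callbacks.append(self._complete)

    def send_and_wait(self, command: dict, timeout: float = 5.0) -> object:
        command_id = str(uuid.uuid4())
        slot: "queue.Queue[object]" = queue.Queue(maxsize=1)
        with self._lock:
            self._pending[command_id] = slot
        self._inner.send({**command, "id": command_id})
        try:
            return slot.get(timeout=timeout)  # raises queue.Empty on timeout
        finally:
            with self._lock:
                self._pending.pop(command_id, None)

    def _complete(self, command_id: Optional[str], result: object) -> None:
        with self._lock:
            slot = self._pending.get(command_id)
        if slot is not None:
            slot.put(result)


next_id = itertools.count(1)  # stands in for database-side ID generation


def create_campaign(command: dict) -> dict:
    return {"campaign_id": next(next_id)}


blocking = BlockingBus(OneWayBus(create_campaign))
result = blocking.send_and_wait({"type": "CreateCampaign", "name": "spring"})
```

Keeping the wait inside one wrapper class makes it easier to find, and later remove, the callers that depend on a response.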
Unfortunately we could only do this for so long before we realized it is just the same application
with a fancy new façade. The application was easier to work on, but that was more likely due to the
DI changes than to the commands and queries. We ran into the problem of where to put certain types
of logic. Commands and queries themselves should be very light weight objects with no dependencies
on VOs. There were a few occasions we were tempted during a complicated refactor to use an exist-
ing VO as part of a query but inevitably we found ourselves back down the path of bloated objects.
We also became tempted to use complex properties (getters and setters with code) on commands
and queries but this resulted in hidden logic—ultimately we found it better to put the logic in the
command handler or better yet in the domain or command validator.
At this point we also began to run into difficulties accomplishing tasks. We were in the middle of
a pattern switch and it was difficult to cleanly accomplish a goal. Should command handlers dispatch
other commands? How else will they exercise any logic that is now embedded in a command handler?
For that matter, what should be a command handler’s single responsibility?
We found that these questions could not be answered by writing more commands and queries
but rather by fleshing out our CQRS implementation. The next logical choice was either the read or
the write model. Starting with the cached read model felt like the best choice since it delivers tangible
business value. We chose to use events to keep our read model up to date, but where do the events
come from? It became obvious that we were forced to create our write model first.
Choose a strategy for the write model that makes sense in your bounded context. That is, after
all, what CQRS allows: separating reads and writes to decouple the requirements of each. For the CFP
we use domains that expose behavior. We do not practice DDD, but a domain model fits well with
CQRS. Creating a domain model is very hard; we spent a lot of time talking about what our aggregate
roots are—do not underestimate how hard this will be.
When creating the write model we were very careful about introducing any dependencies to the
domain assembly. This will allow the domain to outlive other application specific technologies, but
was not without pain points. Our domain started out with a lot of validation that was eventually
moved into command validators; dependencies required for validation were not available from within
the domain. In the end, the domain simply translates behavior (methods) into events (class instances).
Most of our pain points were centered on saving the events without taking dependencies into the
domain assembly. The CFP does not use event sourcing; instead, we were able to translate the domain events
into our existing SQL tables with objects we call Event Savers. This allows our domain to focus on
translating behavior to events and the command handler can publish and save the events. To prevent
the command handler from doing too much, we use a repository pattern to get and save a domain.
This allows us to switch to event sourcing in a later refactoring of the application if desired with
minimal effect on the domain. The Event Savers are simple classes that map an event to a stored
procedure call or table(s). We use RabbitMQ to publish the events after saving; publishing is not
transactional, but that has been OK so far.
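The division of labor described here can be sketched in Python, with SQLite standing in for SQL Server and a plain list standing in for RabbitMQ. All names (Campaign, BudgetChanged, save_budget_changed, CampaignRepository) are hypothetical. For brevity the repository both saves and publishes, which may not match where the CFP draws that line.

```python
import sqlite3
from dataclasses import dataclass
from typing import Callable, Dict, List, Type


@dataclass(frozen=True)
class BudgetChanged:
    """Hypothetical domain event."""
    campaign_id: int
    new_budget: int


class Campaign:
    """Hypothetical domain object: translates behavior (methods) into events."""

    def __init__(self, campaign_id: int) -> None:
        self.campaign_id = campaign_id
        self.pending_events: List[object] = []

    def change_budget(self, new_budget: int) -> None:
        self.pending_events.append(BudgetChanged(self.campaign_id, new_budget))


def save_budget_changed(db: sqlite3.Connection, event: BudgetChanged) -> None:
    # An "Event Saver": maps one event type onto the existing table layout.
    db.execute("UPDATE campaigns SET budget = ? WHERE id = ?",
               (event.new_budget, event.campaign_id))


class CampaignRepository:
    def __init__(self, db: sqlite3.Connection, savers: Dict[Type, Callable],
                 publish: Callable[[object], None]) -> None:
        self._db, self._savers, self._publish = db, savers, publish

    def save(self, campaign: Campaign) -> None:
        events, campaign.pending_events = campaign.pending_events, []
        with self._db:  # commits the table changes, or rolls back on error
            for event in events:
                self._savers[type(event)](self._db, event)
        for event in events:  # published after the commit, so not transactional
            self._publish(event)


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE campaigns (id INTEGER PRIMARY KEY, budget INTEGER)")
db.execute("INSERT INTO campaigns VALUES (1, 100)")
db.commit()

published: List[object] = []
repo = CampaignRepository(db, {BudgetChanged: save_budget_changed}, published.append)

campaign = Campaign(1)
campaign.change_budget(250)
repo.save(campaign)
stored_budget = db.execute("SELECT budget FROM campaigns WHERE id = 1").fetchone()[0]
```

Because the domain class imports nothing from the persistence side, replacing the savers with an event store later would leave it untouched.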
As events become more ubiquitous it is possible to keep a read model up to date. We have a
separate service that subscribes to events and updates a Redis cache. By keeping this code separate
we isolate the dependencies for Redis and make our caching solution more pluggable. The choice of
caching technology is difficult and the best solution is likely to change over time. We needed the
flexibility to test multiple options and compare the performance vs. maintainability.
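A small Python sketch of a pluggable read-model updater follows. An in-memory dictionary stands in for Redis, and the Cache protocol, CampaignReadModelUpdater, and the event shape are assumptions for illustration only.

```python
import json
from typing import Dict, Optional, Protocol


class Cache(Protocol):
    """The only surface the updater depends on, so the backing store can be swapped."""

    def set(self, key: str, value: str) -> None: ...

    def get(self, key: str) -> Optional[str]: ...


class InMemoryCache:
    """Hypothetical stand-in for a Redis client."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def set(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


class CampaignReadModelUpdater:
    """Hypothetical subscriber: applies each event to the cached view."""

    def __init__(self, cache: Cache) -> None:
        self._cache = cache

    def handle(self, event: dict) -> None:
        if event["type"] == "BudgetChanged":
            key = f"campaign:{event['campaign_id']}"
            view = json.loads(self._cache.get(key) or "{}")
            view["budget"] = event["new_budget"]
            self._cache.set(key, json.dumps(view))


cache = InMemoryCache()
updater = CampaignReadModelUpdater(cache)
updater.handle({"type": "BudgetChanged", "campaign_id": 1, "new_budget": 250})
updater.handle({"type": "BudgetChanged", "campaign_id": 1, "new_budget": 300})
cached_view = json.loads(cache.get("campaign:1"))
```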
Once our cache was in place we discovered the oldest known theorem of caching: That which is
cached becomes stale. Invalid cache results can occur in many different ways; we found enough that a
temporary measure was introduced to update items in the cache on a rolling schedule. The plan was
(and still is) to find and eliminate all sources of inconsistency. Database integrations or people/depart-
ments that update the write model directly will need to be routed through the domain to prevent the
cache from becoming incorrect. Our goal is total elimination of these discrepancies for complete
confidence in cached results.
More information
All links in this book are accessible from the book’s online bibliography available at:
http://msdn.microsoft.com/en-us/library/jj619274.
Tales from the Trenches
TOPAZ Technologies
What were the biggest challenges and how did we overcome them?
One of the biggest challenges was to convince management and other stakeholders in our company
to believe in the benefits of this new approach. Initially they were skeptical or even frightened at the
thought of not having the data stored in an RDBMS. DBAs, concerned about potential job loss, also
tried to influence management in a subtle, negative way regarding this new architecture.
We overcame these objections by implementing just one product using CQRS/ES, then showing
the stakeholders how it worked, and demonstrating how much faster we finished the implementa-
tion. We also demonstrated the significantly improved quality of the product compared to our
other products.
This study is contributed by Gabriel N. Schenker, Chief Software Architect, TOPAZ Technologies LLC
Another challenge was the development team’s lack of knowledge in this area. For everyone,
CQRS and ES were completely new.
As an architect, I did a lot of teaching in the form of lunch-and-learns in which I discussed the
fundamental aspects of this new architecture. I also performed live coding in front of the team and
developed some end-to-end exercises, which all developers were required to solve. I encouraged our
team to watch the various free videos in which Greg Young was presenting various topics related to
CQRS and event sourcing.
Yet another challenge is the fact that this type of architecture is still relatively new and not fully
established. Thus, finding good guidance or adhering to best practices is not as straightforward as with
more traditional architectures. How to do CQRS and ES right still provokes lively discussions, and
people have very different opinions about both the overall architecture and individual elements of it.
Further information
This blog series discusses the details of the implementation.
More information
All links in this book are accessible from the book’s online bibliography available at:
http://msdn.microsoft.com/en-us/library/jj619274.
Tales from the Trenches
eMoney Nexus
This study is contributed by Jon Wagner, SVP & Chief Architect, eMoney Advisor
System overview
The job of the Nexus is to fetch account data from a number of financial institutions, and publish that
data to a number of application servers.
Inputs
• Users – can tell the system to create a subscription to data updates from a source, force an
instant refresh of data, or modify processing rules for their accounts.
• Bulk Files – arrive daily with large workloads for account updates
• Timed Updates – arrive scheduled throughout the night to update individual subscriptions.
Subscribers
• Users – user interfaces need to update when operations complete or data changes.
• Planning Applications – multiple application instances need to be notified when data changes.
• Outgoing Bulk Files – enterprise partners need a daily feed of the changes to the account data.
Design Goals
• Decoupled Development – building and upgrading the Nexus should not be constrained by
application deployment lifecycles.
• Throughput Resilience – processing load for queries should not affect the throughput of the
data updates and vice versa.
• High Availability – processing should be fault tolerant for node outages.
• Continuous Deployment – connections and business logic should be upgradable during business
hours and should decouple Nexus changes from other systems.
• Long-Running Processes – data acquisition can take a long time, so an update operation must be
decoupled from any read/query operations.
• Re-playable Operations – data acquisition has a high chance of failure due to network errors,
timeouts, and so on, so operations must be re-playable for retry scenarios.
• Strong Diagnostics – since update operations are complex and error-prone, diagnostic tools
are a must for the infrastructure.
• Non-Transactional – because our data is not the system of record, there is less of a need for
data rollbacks (we can just get a new update), and eventual consistency of the data is acceptable
to the end user.
The first step was to decouple the processing engine from the application system. We did that by
adding a service layer to accept change requests and a publishing system to send change events back
to the application. The application would have its own copy of the account data that is optimized for
the planning and search operations for end users. The Nexus could store the data in the best way
possible for high-throughput processing.
Partitioning the system allows us to decouple any changes to the Nexus from the other systems. Like
all good Partition / Bounded Context / Service boundaries, the interfaces between the systems are
contracts that must be adhered to, but can evolve over time with some coordination between the
systems. For example, we have upgraded the publishing interface to the core application 5 or 6 times
to add additional data points or optimize the data publishing process. Note that we publish to a SQL
Server Service Broker, but this could be another application server in some scenarios.
This allowed us to achieve our first two design goals: Decoupled Development and Throughput
Resilience. Large query loads on the application would be directed at its own database, and bulk load
operations on the back end do not slow down the user experience. The Nexus could be deployed on
a separate schedule from the application and we could continue to make progress on the system.
Next, we added Windows Load Balancing and WCF services to expose the Command service to
consumers.
This allows us to add additional processing nodes, as well as remove nodes from the pool in order to
upgrade them. This got us to our goal of High Availability, as well as Continuous Deployment. In
most scenarios, we can take a node out of the pool during the day, upgrade it, and return it to the pool
to take up work.
For processing, we decided to break up each unit of work into “Messages.” Most Messages are
Commands that tell the system to perform an operation. Messages can dispatch other messages as
part of their processing, causing an entire workflow process to unfold. We don’t have a great separa-
tion between Sagas (the coordination of Commands) and Commands themselves, and that is some-
thing we can improve in future builds.
Whenever a client calls the Command service, if the request cannot be completed immediately, it
is placed in a queue for processing. This can be an end user, or one of the automated data load schedul-
ers. We use SQL Server Service Broker for our Message processing Queues. Because each of our data
sources has different throughput and latency requirements, we wrote our own thread pooling
mechanism to allow us to apportion the right number of threads-per-source at runtime through a
configuration screen. We also took advantage of Service Broker’s message priority function to allow
user requests to jump to the front of the worker queues to keep end users happy. We also separated
the Command (API) service from the Worker service so we can scale the workloads differently.
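The effect of message priority, where user requests jump ahead of scheduled work, can be illustrated with a plain in-process priority queue in Python. This is not Service Broker and does not model the per-source thread pools. PriorityWorkQueue and the two priority constants are hypothetical.

```python
import heapq
import itertools
from typing import List, Tuple

USER, SCHEDULED = 0, 1  # hypothetical priority levels; lower numbers are served first


class PriorityWorkQueue:
    """Hypothetical queue: highest priority first, first-in first-out within a priority."""

    def __init__(self) -> None:
        self._heap: List[Tuple[int, int, str]] = []
        self._sequence = itertools.count()  # tie-breaker that preserves arrival order

    def enqueue(self, priority: int, message: str) -> None:
        heapq.heappush(self._heap, (priority, next(self._sequence), message))

    def dequeue(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


work = PriorityWorkQueue()
work.enqueue(SCHEDULED, "nightly update: subscription 17")
work.enqueue(SCHEDULED, "nightly update: subscription 18")
work.enqueue(USER, "instant refresh requested by user")

order = [work.dequeue() for _ in range(len(work))]
```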
This message processing design gave us a lot of benefits. First of all, with Command/Query Separation,
you are forced to deal with the fact that a Command may not complete immediately. By implementing
clients that need to wait for results, you are naturally going to be able to support Long-Running
Processes. In addition, you can persist the Command messages to a store and easily support Re-
playable Operations to handle retry logic or system restores. The Nexus Service has its own sched-
uler that sends itself Commands to start jobs at the appropriate time.
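Persisting Commands so they can be replayed might be sketched as below, in Python with an in-memory list standing in for the durable store. CommandStore, replay_unfinished, and the flaky fetch handler are hypothetical. The sketch also assumes handlers are safe to run more than once.

```python
from typing import Callable, Dict, List


class CommandStore:
    """Hypothetical store: records commands before processing so failures can be replayed."""

    def __init__(self) -> None:
        self._entries: List[Dict] = []

    def append(self, command: dict) -> None:
        self._entries.append({"command": command, "status": "pending", "attempts": 0})

    def replay_unfinished(self, handler: Callable[[dict], None],
                          max_attempts: int = 3) -> None:
        for entry in self._entries:
            if entry["status"] == "done" or entry["attempts"] >= max_attempts:
                continue
            entry["attempts"] += 1
            try:
                handler(entry["command"])
                entry["status"] = "done"
            except ConnectionError:
                entry["status"] = "failed"  # left in the store for the next replay

    def statuses(self) -> List[str]:
        return [entry["status"] for entry in self._entries]


calls = {"count": 0}


def fetch(command: dict) -> None:
    # Simulated data acquisition: the "bank-b" source times out on early attempts.
    calls["count"] += 1
    if command["source"] == "bank-b" and calls["count"] <= 2:
        raise ConnectionError("timeout")


store = CommandStore()
store.append({"type": "UpdateSubscription", "source": "bank-a"})
store.append({"type": "UpdateSubscription", "source": "bank-b"})

store.replay_unfinished(fetch)
first_pass = store.statuses()
store.replay_unfinished(fetch)   # only the failed command runs again
second_pass = store.statuses()
```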
One unexpected benefit of using a queue infrastructure was more scalable performance. Partitioning
the workloads (in our case, by data source) allows for more optimal use of resources. When workloads
begin to block due to some resource slowness, we can dynamically partition that workload into a
separate processing queue so other work can continue.
One of the most important features that we added early on in development was Tracing and Di-
agnostics. When an operation is started (by a user or by a scheduled process), the system generates a
GUID (a “Correlation ID”) that is assigned to the message. The Correlation ID is passed throughout the
system, and any logging that occurs is tied to the ID. Even if a message dispatches another message to
be processed, the Correlation ID is along for the ride. This lets us easily figure out which log events in
the system go together (GUIDs are translated to colors for easy visual association). Strong Diagnostics
was one of our goals. When the processing of a system gets broken into individual asynchronous
pieces, it’s almost impossible to analyze a production system without this feature.
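Correlation ID propagation can be sketched in a few lines of Python. The dispatch function and the log list are hypothetical. The color_for mapping is only one possible way to turn a GUID into a color, since the text does not say how the Nexus does it.

```python
import hashlib
import uuid
from typing import List, Optional, Sequence, Tuple

log: List[Tuple[str, str]] = []  # (correlation id, message name)


def dispatch(name: str, correlation_id: Optional[uuid.UUID] = None,
             children: Sequence[str] = ()) -> uuid.UUID:
    """Log a message and pass the same Correlation ID on to every child message."""
    if correlation_id is None:
        correlation_id = uuid.uuid4()  # a new operation starts here
    log.append((str(correlation_id), name))
    for child in children:
        dispatch(child, correlation_id)
    return correlation_id


def color_for(correlation_id: uuid.UUID) -> str:
    # Assumed mapping: hash the GUID and use the first three bytes as an RGB color.
    return "#" + hashlib.sha256(correlation_id.bytes).hexdigest()[:6]


operation_id = dispatch("BatchUpdate",
                        children=("SubscriptionUpdate", "SubscriptionUpdate"))
related = [name for cid, name in log if cid == str(operation_id)]
color = color_for(operation_id)
```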
To drive operations, the application calls the Nexus with Commands such as CreateSubscription,
UpdateSubscription, and RepublishData. Some of these operations can take a few minutes to com-
plete, and the user must wait until the operation is finished. To support this, each long-running Com-
mand returns an ActivityID. The application polls the Nexus periodically to determine whether the
activity is still running or if it has completed. An activity is considered completed when the update
has completed AND the data has been published to the read replica. This allows the application to
immediately perform a query on the read replica to see the data results.
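The polling contract described here can be sketched in Python as follows. ActivityTracker and poll_until_complete are hypothetical names, and the background work is simulated by a callback between polls. A real client would sleep or back off between polls instead.

```python
import itertools
from typing import Callable, Dict, Optional


class ActivityTracker:
    """Hypothetical tracker: complete only when updated AND published to the read replica."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._state: Dict[int, Dict[str, bool]] = {}

    def start(self) -> int:
        activity_id = next(self._ids)
        self._state[activity_id] = {"updated": False, "published": False}
        return activity_id

    def mark(self, activity_id: int, step: str) -> None:
        self._state[activity_id][step] = True

    def is_complete(self, activity_id: int) -> bool:
        return all(self._state[activity_id].values())


def poll_until_complete(tracker: ActivityTracker, activity_id: int,
                        between_polls: Callable[[], None],
                        max_polls: int = 10) -> Optional[int]:
    """Return the number of polls needed, or None if the activity never completed."""
    for polls in range(1, max_polls + 1):
        if tracker.is_complete(activity_id):
            return polls
        between_polls()  # a real client would wait here
    return None


tracker = ActivityTracker()
activity_id = tracker.start()

# Simulated background progress: the update finishes first, then the publish.
remaining_steps = iter(["updated", "published"])
polls_needed = poll_until_complete(
    tracker, activity_id,
    between_polls=lambda: tracker.mark(activity_id, next(remaining_steps)))
```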
Lessons learned
We’ve been running the Nexus in production for several years now, and for this type of system, the
benefits of CQRS and ES are evident, at least for the read-write separation and data change events that
we use in our system.
• CQRS = Service Boundary + Separation of Concerns – the core of CQRS is creating service
boundaries for your inputs and outputs, then realizing that input and output operations are
separate concerns and don’t need to have the same (domain) model.
• Partitions are Important – define your Bounded Context and boundaries carefully. You will have
to maintain them over time.
• External systems introduce complexity – particularly when replaying an event stream, managing
state against an external system or isolating against external state may be difficult. Martin
Fowler has some great thoughts on it here.
• CQRS usually implies async but not always – because you generally want to see the results of
your Commands as Query results. It is possible to have Commands complete immediately if it’s
not a Query. In fact, it’s easier that way sometimes. We allow the CreateSubscription Com-
mand to return a SubscriptionID immediately. Then an async process fetches the data and
updates the read model.
• User Experience for async is hard – users want to know when their operation completes.
• Build in Diagnostics from the beginning – trust me on this.
• Decomposing work into Commands is good – our BatchUpdate message just spawns off a lot
of little SubscriptionUpdate messages. It makes it easier to extend and reuse workflows over
time.
• Queue or Bus + Partitions = Performance Control – this lets you fan out or throttle your
workload as needs change.
• Event Sourcing lets you have totally different read systems for your data – we split our event
stream and send it to a relational database for user queries and into flat files for bulk delivery to
partners.
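The last lesson, one event stream feeding different read systems, can be illustrated with a Python sketch in which SQLite stands in for the relational database and an in-memory CSV stands in for the partner flat file. The event shape and both projection functions are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical account-change events, in the order they were produced.
events = [
    {"account_id": "a-1", "balance": 100},
    {"account_id": "a-2", "balance": 250},
    {"account_id": "a-1", "balance": 120},
]


def project_to_sql(db: sqlite3.Connection, event: dict) -> None:
    # The relational read model keeps only the latest state per account.
    db.execute("INSERT OR REPLACE INTO accounts (account_id, balance) VALUES (?, ?)",
               (event["account_id"], event["balance"]))


def project_to_flat_file(writer, event: dict) -> None:
    # The bulk feed keeps every change, one row per event.
    writer.writerow([event["account_id"], event["balance"]])


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE accounts (account_id TEXT PRIMARY KEY, balance INTEGER)")
flat_file = io.StringIO()
writer = csv.writer(flat_file)

for event in events:  # the same stream drives both projections
    project_to_sql(db, event)
    project_to_flat_file(writer, event)
db.commit()

current_state = db.execute(
    "SELECT account_id, balance FROM accounts ORDER BY account_id").fetchall()
feed_lines = flat_file.getvalue().splitlines()
```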
If you want some more good practical lessons on CQRS, you should read Chapter 8, “Epilogue: Lessons
Learned.”
Making it better
Like any system, there are many things we would like to do better.
• Workflow Testing is Difficult – we didn’t do quite enough work to remove dependencies from
our objects and messages, so it is tough to test sequences of events without setting up large
test cases. Doing a cleanup pass for DI/IOC would probably make this a lot easier.
• UI code is hard with AJAX and polling – but now that there are push libraries like SignalR, this
can be a lot easier.
• Tracking the Duration of an Operation – because our workflows are long, but the user needs to
know when they complete, we track each operation with an Activity ID. Client applications poll
the server periodically to see if an operation completes. This isn’t a scalability issue yet, but we
will need to do more work on this at some point.
As you can see, this implementation isn’t 100% pure CQRS/ES, but the practical benefits of these
patterns are real.
For more information, see Jon Wagner’s blog Zeros, Ones and a Few Twos.
Appendix 1
Release Notes
These release notes apply to the Reference Implementation – Contoso Conference Management
System. This RI complements the “Exploring CQRS and Event Sourcing” guide and is for learning pur-
poses only.
System evolution
The system has gone through three pseudo-production releases and additional improvements after V3.
Note: While the team went through actual deployments to Windows Azure and performed
migrations, the releases are referred to as ‘pseudo-production’ because they lack critical security
and other features necessary for a full production release that are not the focus of this guidance.
The notes apply to the latest version (packaged in this self-extractable zip) unless specified otherwise.
To follow the project evolution, please check out specific versions of the entire system tagged V1-
pseudo-prod, V2-pseudo-prod or V3-pseudo-prod in the git repository history. Also, see the Migra-
tion notes and Chapter 5, “Preparing for the V1 Release,” Chapter 6, “Versioning Our System” and
Chapter 7, “Adding Resilience and Optimizing Performance” of the Guide.
5. Deploying the application to Windows Azure and using the Windows Azure Service Bus and
an event store that uses Windows Azure table storage.
Note: The local message bus and event store use SQL Express and are intended to help you run the
application locally for demonstration purposes. They are not intended to illustrate a production-
ready scenario.
Note: Scenarios 1, 2, 3 and 4 use SQL Express for other data storage requirements. Scenario 5
requires you to use SQL Database instead of SQL Express.
Note: The source code download for the V3 release also includes a Conference.NoAzureSDK
solution that enables you to build and run the sample application without installing the Windows
Azure SDK. This solution supports scenarios 1 and 2 only.
Prerequisites
Before you begin, you should install the following pre-requisites:
• Visual Studio 2010 or later
• SQL Server 2008 Express or later
• ASP.NET MVC 3 and MVC 4 for the V1 and V2 releases
• ASP.NET MVC 4 Installer (Visual Studio 2010) for the V3 release
• Windows Azure SDK for .NET - November 2011 for the V1 and V2 releases
• Windows Azure SDK for .NET - June 2012 or later for the V3 release
Note: The V1 and V2 releases of the sample application used ASP.NET MVC 3 in addition to ASP.
NET MVC 4. As of the V3 release all of the web applications in the project use ASP.NET MVC 4.
Note: The Windows Azure SDK is not a pre-requisite if you plan to use the Conference.
NoAzureSDK solution.
You can download and install all of these except for Visual Studio by using the Microsoft Web Platform
Installer 4.0.
You can install the remaining dependencies from NuGet by running the script install-packages.
ps1 included with the downloadable source.
If you plan to deploy any part of the RI to Windows Azure (scenarios 2, 4, 5), you must have a
Windows Azure subscription. You will need to configure a Windows Azure storage account (for blob
storage), a Windows Azure Service Bus namespace, and a SQL Database instance (they do not necessarily
need to be in the same Windows Azure subscription). You should be aware that, depending on
your Windows Azure subscription type, you may incur usage charges when you use the Windows
Azure Service Bus, Windows Azure table storage, and when you deploy and run the RI in Windows
Azure.
At the time of writing, you can sign up for a Windows Azure free trial that enables you to run the
RI in Windows Azure.
Note: Scenario 1 enables you to run the RI locally without using the Windows Azure compute and
storage emulators.
Note: The command above is displayed in multiple lines for better readability. This command should
be entered as a single line.
You must then modify the ServiceConfiguration.Cloud.cscfg file in the Conference.Azure project to
use the following connection strings.
SQL Database Connection String:
Server=tcp:[your-sql-azure-server].database.windows.net;Database=myDataBase;
User ID=[your-sql-azure-username]@[your-sql-azure-server];
Password=[your-sql-azure-password];Trusted_Connection=False;Encrypt=True;
MultipleActiveResultSets=True;
Conference.Azure\ServiceConfiguration.Cloud.cscfg:
<?xml version="1.0" encoding="utf-8"?>
<ServiceConfiguration serviceName="Conference.Azure" osFamily="1" osVersion="*"
xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration">
<Role name="Conference.Web.Admin">
<Instances count="1" />
<ConfigurationSettings>
<Setting name="Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString"
value="[your-windows-azure-connection-string]" />
<Setting name="Diagnostics.ScheduledTransferPeriod" value="00:02:00" />
<Setting name="Diagnostics.LogLevelFilter" value="Warning" />
<Setting name="Diagnostics.PerformanceCounterSampleRate" value="00:00:30" />
<Setting name="DbContext.ConferenceManagement"
value="[your-sql-azure-connection-string]" />
<Setting name="DbContext.SqlBus"
value="[your-sql-azure-connection-string]" />
</ConfigurationSettings>
</Role>
<Role name="Conference.Web.Public">
<Instances count="1" />
<ConfigurationSettings>
<Setting name="Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString"
value="[your-windows-azure-connection-string]" />
<Setting name="Diagnostics.ScheduledTransferPeriod" value="00:02:00" />
<Setting name="Diagnostics.LogLevelFilter" value="Warning" />
<Setting name="Diagnostics.PerformanceCounterSampleRate" value="00:00:30" />
<Setting name="DbContext.Payments"
value="[your-sql-azure-connection-string]" />
<Setting name="DbContext.ConferenceRegistration"
value="[your-sql-azure-connection-string]" />
<Setting name="DbContext.SqlBus"
value="[your-sql-azure-connection-string]" />
<Setting name="DbContext.BlobStorage"
value="[your-sql-azure-connection-string]" />
</ConfigurationSettings>
</Role>
<Role name="WorkerRoleCommandProcessor">
<Instances count="1" />
<ConfigurationSettings>
<Setting name="Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString"
value="[your-windows-azure-connection-string]" />
<Setting name="Diagnostics.ScheduledTransferPeriod" value="00:02:00" />
<Setting name="Diagnostics.LogLevelFilter" value="Information" />
<Setting name="Diagnostics.PerformanceCounterSampleRate" value="00:00:30" />
<Setting name="DbContext.Payments"
value="[your-sql-azure-connection-string]" />
<Setting name="DbContext.EventStore"
value="[your-sql-azure-connection-string]" />
<Setting name="DbContext.ConferenceRegistrationProcesses"
value="[your-sql-azure-connection-string]" />
<Setting name="DbContext.ConferenceRegistration"
value="[your-sql-azure-connection-string]" />
<Setting name="DbContext.SqlBus"
value="[your-sql-azure-connection-string]" />
<Setting name="DbContext.BlobStorage"
value="[your-sql-azure-connection-string]" />
<Setting name="DbContext.ConferenceManagement"
value="[your-sql-azure-connection-string]" />
</ConfigurationSettings>
</Role>
</ServiceConfiguration>
Note: The LogLevelFilter values for these roles are set to either Warning or Information. If you
want to capture logs from the application into the WADLogsTable, you should change these values
to Verbose.
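As a hedged illustration of the change described in the note above, the fragment below shows the Diagnostics.LogLevelFilter setting from the configuration listing with its value changed to Verbose. The setting name and value come from the listing and the note; making the edit inside each role's ConfigurationSettings element is an assumption about where you apply it.

```xml
<!-- Sketch only: assumed to go inside the <ConfigurationSettings> element
     of each role whose logs you want to capture in WADLogsTable. -->
<Setting name="Diagnostics.LogLevelFilter" value="Verbose" />
```

Verbose logging is likely to increase the volume of data transferred to storage, so you may prefer to restore the original value once you have captured the logs you need.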
Note: You cannot currently use the Windows Azure storage emulator for the event store. You must
use a real Windows Azure storage account.
Building the RI
Open the Conference Visual Studio solution file in the code repository that you downloaded and
un-zipped.
You can use NuGet to download and install all of the dependencies by running the script
install-packages.ps1 before building the solution.
Build Configurations
The solution includes a number of build configurations. These are described in the following sections:
Release
Use the Release build configuration if you plan to deploy the RI to Windows Azure (scenario 5).
This solution uses the Windows Azure Service Bus to provide the messaging infrastructure.
Debug
Use the Debug build configuration if you plan either to deploy your application locally to the
Windows Azure compute emulator or to run the application locally and stand-alone without using the
Windows Azure compute emulator.
This solution uses the Windows Azure Service Bus to provide the messaging infrastructure and the
event store based on Windows Azure table storage (scenarios 2 and 4).
DebugLocal
Use the DebugLocal build configuration if you plan to either deploy your application locally to the
Windows Azure compute emulator or run the application on a local web server without using the
Windows Azure compute emulator.
This solution uses a local messaging infrastructure and event store built using SQL Server (scenarios
1 and 3).
Running the RI
When you run the RI, you should first create a conference, add at least one seat type, and then publish
the conference using the Conference.Web.Admin site.
After you have published the conference, you can use the Conference.Web site to order seats and
complete the simulated payment process.
The following sections describe how to run the RI in the different scenarios.
Scenario 1. Local Web Server, SQL Event Bus, SQL Event Store
To run this scenario you should build the application using the DebugLocal configuration.
Run the WorkerRoleCommandProcessor project as a console application.
Run the Conference.Web.Public and Conference.Web.Admin projects (located in the
Conference-Management folder) as web applications.
For more information about how you can run these tests, please visit the xUnit.net site on Codeplex.
You can use NuGet to download and install all of the dependencies by running the script
install-packages.ps1 before building this solution.
The acceptance tests are created using SpecFlow. For more information about SpecFlow, please visit
SpecFlow.
The Conference.AcceptanceTests solution uses the same build configurations as the Conference
solution to control whether you run the acceptance tests against either the local SQL-based
messaging infrastructure and event store or the Windows Azure Service Bus messaging infrastructure
and Windows Azure table storage based event store.
You can use the xUnit console runner or a third-party tool with Visual Studio integration and xUnit
support (for example TDD.net) to run the tests. The xUnit GUI tool is not supported.
Known issues
The list of known issues is attached and is also available online.
More information
All links in this book are accessible from the book’s online bibliography available at:
http://msdn.microsoft.com/en-us/library/jj619274.
Appendix 2
Migrations
Note: You can change the value of the MaintenanceMode property in the Windows Azure
management portal.
The Settings.xml file contains the names of the new Windows Azure tables that the V2 release
uses. If you are migrating data from V1 to V2, ensure that the name of the EventSourcing table is
different from the name of the table used by the V1 release. The name of the table used by the V1
release is hardcoded in the Program.cs file in the MigrationToV2 project:
var originalEventStoreName = "ConferenceEventStore";
<EventSourcing>
<ConnectionString>...</ConnectionString>
<TableName>ConferenceEventStoreApplicationDemoV2</TableName>
</EventSourcing>
Note: The migration utility assumes that the V2 event sourcing table is in the same Windows Azure
storage account as the V1 event sourcing table. If this is not the case, you will need to modify the
MigrationToV2 application code.
The App.config file contains the DbContext.ConferenceManagement connection string. The migration
utility uses this connection string to connect to the SQL Database instance that contains the SQL
tables used by the application. Ensure that this connection string points to the Windows Azure SQL
Database that contains your production data. You can verify which SQL Database instance your
production environment uses by looking in the active ServiceConfiguration.cscfg file.
Note: If you are running the application locally using the Debug configuration, the
DbContext.ConferenceManagement connection string will point to a local SQL Express database.
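The fragment below is a minimal sketch of how such an entry might look. It assumes that the migration utility's App.config uses the standard .NET connectionStrings section. The source does not show this file, so the element layout and the providerName value are assumptions. The placeholder value follows the convention used in the service configuration listing.

```xml
<!-- Hypothetical App.config fragment (layout and providerName are assumptions).
     Replace the placeholder with the connection string for the SQL Database
     instance that contains your production data. -->
<connectionStrings>
  <add name="DbContext.ConferenceManagement"
       connectionString="[your-sql-azure-connection-string]"
       providerName="System.Data.SqlClient" />
</connectionStrings>
```

Before you run the utility, compare this value with the DbContext.ConferenceManagement setting in the active service configuration file so that both point to the same database.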
Note: To avoid data transfer charges, you should run the migration utility inside a Windows Azure
worker role instead of on-premises. The solution includes an empty Windows Azure worker role,
configured with diagnostics, in the MigrationToV2.Azure project that you can use for this purpose.
For information about how to run an application inside a Windows Azure role instance, see Using
Remote Desktop with Windows Azure Roles.
Note: Migration from V1 to V2 is not supported if you are using the DebugLocal configuration.
Exploring CQRS and Event Sourcing
This guide is focused on building highly scalable, highly available, and
maintainable applications with the Command & Query Responsibility
Segregation and the Event Sourcing architectural patterns. It presents a
learning journey, not definitive guidance. It describes the experiences of a
development team with no prior CQRS proficiency in building, deploying (to
Windows Azure), and maintaining a sample real-world, complex, enterprise
system to showcase various CQRS and ES concepts, challenges, and
techniques. The development team did not work in isolation; we actively
sought input from industry experts and from a wide group of advisors to
ensure that the guidance is both detailed and practical.
The CQRS pattern and event sourcing are not mere simplistic solutions to
the problems associated with large-scale, distributed systems. By providing
you with both a working application and written guidance, we expect you’ll
be well prepared to embark on your own CQRS journey.
patterns & practices
Proven practices for predictable results
Save time and reduce risk on your software development projects by
incorporating patterns & practices, Microsoft’s applied engineering
guidance that includes both production quality source code and documentation.
The guidance is designed to help software development teams:
Make critical design and technology selection decisions by highlighting
the appropriate solution architectures, technologies, and Microsoft products
for common scenarios