									                                   Large Entity Framework Models




                                          Large Models
I’m not talking about size 14s on the catwalk. I’m talking about entity models of 200 entities or more.
Some of our customers have models with more than 1000 entities.

This essay covers design issues, Entity Framework limitations, and techniques for overcoming those
limitations using DevForce from IdeaBlade. A companion application and video afford concrete evidence
of those techniques; they can be found (along with this essay) on the “Ward’s Corner” page of our
website.

I will argue that big models are bad design. But we should decide for ourselves what is good or bad
design; if we want big models, we should be able to have them. If I want a big model can I have one?

The Entity Framework Limit

The answer is “no” if I want to use the Microsoft ADO.NET Entity Framework (EF) for Object Mapping
and Persistence. You may be able to squeak by but most developers will find that Entity Framework
either crawls or chokes.

The EF team addressed the subject head-on in a two-part 2008 blog post by Srikanth Mandadi, an Entity
Framework Development Lead (see part one and part two). Mr. Srikanth gives his honest appraisal of
the problem and proposes some workarounds.

His appraisal in brief:

    • The execution performance is awful.
    • The EF visual designer is virtually unusable with more than 30 entities or for entities that have
      large numbers of properties.
    • The CLR namespace bloats and IntelliSense is awash with names.
    • Visual Studio and tools like ReSharper may run out of memory trying to make sense of the huge
      generated class file.

According to the Entity Framework team architect, this particular situation will not improve in the next
version (EF v.4). Version 4 brings plenty of other improvements but the visual designer is essentially the
same and the ~200 entity limit remains.

The Object Mapper from IdeaBlade’s DevForce is an alternative EDMX editor that makes it much easier
to view and modify any model with more than thirty entities.

If you want all of your entity classes in one project … hey, you asked for a big model and that’s what
IntelliSense is going to present to you. Lots of memory plus new versions of Visual Studio may restore an
acceptable development experience.

But none of these options can or will improve Entity Framework’s performance. Per Mr. Srikanth, you
must either shrink your model or split it into sub-models.




IdeaBlade, Inc                                      1                                            7-Nov-09

Shrinking the Model

The path to shrinking the model is obvious. Exclude the tables you don’t absolutely need. You can often
dump the reference entities (status, color, unit-of-measure) and pull them into your code base as
enumeration objects whose Ids match the foreign key values. The database tables survive to provide
referential integrity at the database level and to support reporting … if you happen to run reports out of
your transaction database.

I know one major client who shrank the model. They shed roughly a hundred reference entities by
creating corresponding statically-defined enumeration objects, and they validate those objects with tests
that compare them to the reference tables in the database. They built a second, EF “Reference Entity”
model specifically for this purpose.

The reference objects live apart from the core model which knows nothing about them or the Reference
Entity model. The “Color” property of the Product entity, for example, is a business logic property that
trades in value objects of the reference type, “Color”; under the hood, the “Product.Color” property gets
and sets the hidden “Product.ColorId” property which is the foreign key value that is actually persisted.
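
A minimal sketch of this arrangement follows. It is my own illustration, not the client's code: the Color values, their Ids, the internal ColorId stand-in, and the ReferenceCheck helper are all assumptions. The helper shows the kind of comparison a test could make against (id, name) rows read from the reference table.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical reference type: a statically defined "enumeration object"
// whose Id matches the foreign key value stored in the Product table.
public sealed class Color
{
    public static readonly Color Red = new Color(1, "Red");
    public static readonly Color Green = new Color(2, "Green");
    public static readonly Color Blue = new Color(3, "Blue");

    // Declared after the instances so they exist when the lookup is built.
    private static readonly Dictionary<int, Color> ById =
        new[] { Red, Green, Blue }.ToDictionary(c => c.Id);

    public int Id { get; private set; }
    public string Name { get; private set; }

    private Color(int id, string name) { Id = id; Name = name; }

    public static IEnumerable<Color> All { get { return ById.Values; } }

    // Look up the value object for a persisted foreign key value.
    public static Color FromId(int id)
    {
        Color color;
        if (!ById.TryGetValue(id, out color))
            throw new ArgumentOutOfRangeException("id", "Unknown Color id: " + id);
        return color;
    }
}

public partial class Product
{
    // Stand-in for the generated, persisted foreign key property.
    // "internal" keeps it out of sight of code outside the model assembly.
    internal int ColorId { get; set; }

    // Business logic property that trades in Color value objects.
    public Color Color
    {
        get { return Color.FromId(ColorId); }
        set
        {
            if (value == null) throw new ArgumentNullException("value");
            ColorId = value.Id;
        }
    }
}

public static class ReferenceCheck
{
    // Compare the static definitions with (id, name) rows from the reference table.
    public static List<string> FindMismatches(IDictionary<int, string> tableRows)
    {
        var problems = new List<string>();
        foreach (var color in Color.All)
        {
            string name;
            if (!tableRows.TryGetValue(color.Id, out name))
                problems.Add("No table row for " + color.Name);
            else if (name != color.Name)
                problems.Add("Name differs for id " + color.Id);
        }
        foreach (var id in tableRows.Keys.Where(k => Color.All.All(c => c.Id != k)))
            problems.Add("No enumeration object for id " + id);
        return problems;
    }
}
```

A test in the spirit of the client's approach would load the reference rows through the separate “Reference Entity” model and assert that FindMismatches returns an empty list.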

This works well for them because their reference entities are stable; the fifty United States don’t change
often. Many of these entities are involved in business logic such that when the reference entities
change, the application must be recompiled and redeployed anyway.

The client might have used the separate “Reference Entity” model to construct application reference
objects dynamically, reducing the need to recompile and redeploy in many cases.

Why Smaller Models?

Shrinking the model is not an option for my customers whose applications store data in 1000 tables.
They must break up the model or choose a different technology (such as our “DevForce Classic” product
for .NET 2.0 which has its own ORM and persistence mechanism).

Before we examine how to break up a large EF model, I’d like to talk about why smaller models are a
good idea anyway.

Let me repeat, the Entity Framework should not dictate our model design. Available technology always
constrains design. But what a shame when the constraint is patently unnecessary; plenty of persistence
abstraction technologies handle far more than 200 entities comfortably and it’s a minor scandal that EF
cannot.

Nonetheless, I believe that a model of more than a hundred entities is almost always a poor design. I have
no quarrel with an application that has thousands of them. I do think that such an application should be
conceived and constructed as a federation of smaller models with clear boundaries and cross-model
bridging of some sort.

Application software is hard to write and maintain. We are more successful when the people who
develop that software and the people who consume it have a shared understanding of the concepts and
processes embodied in that software.

This shared understanding is always elusive and not just in software. All communities, small and large,
struggle to maintain a common identity and purpose as evidenced by the rituals, stories, politics,
normative rules, and judicial proceedings that bind the community. These are at work even in the small:
in family and marriage.

A common language is essential. I mean “language” in a broad sense, understood to include the
meaning words have as experienced within the community itself.

Domain Driven Design (DDD) puts what I would call the “application community” at the center of its
philosophy so it follows that “ubiquitous language” is one of its core concepts. In brief, every element of
a domain model is couched in terms that everyone – developers, analysts and users – recognize as terms
in a “ubiquitous language”. The language is “ubiquitous” within that circle meaning that everyone
understands every word and every phrase and that this shared understanding is authentic. Of course it
frays in practice. But these insiders think that they actually do understand each other and believe they
mean the same thing when they say the same thing.

We are members of many communities. In the durable communities we prize highly, our shared
language is large and rich. If a community is less important, we won’t invest in learning or maintaining
the distinctive dialect of that community. When the community itself is in flux, members have less time
to master the language so it must be smaller.

News flash: no application domain is all that important and almost all members are transient. We simply
cannot afford to require a domain language with a large vocabulary. This conclusion applies directly to
the size of our application domain models.

I submit, with little evidence or fear of contradiction, that a model with more than 200 entities is a
model no one really understands. At best a few developers can identify all of them. These are the long
suffering developers, the ones who wrestle the beast every day.

What the entities actually do – their business purpose and their meaning to end users – this will elude
even them. There are too many moving parts and too few people who understand them.

I worry when I hear a customer speak proudly of thousands of entities, the über-database in which any
entity can be reached from any other entity. This is disaster in the making, certain death lurking in the
shadows.

Fortunately, the reality is usually different. While the database may hold thousands of tables – and our
tooling is capable of generating thousands of corresponding entity classes – the lucky truth is that the
entities actually coalesce into sub-models which collaborating teams can individually understand.

Technically, you could navigate from any entity to any other entity. In practice, you don’t. What is
missing is conscious modularity, an explicit bounding of sub-models that has yet to be made manifest
and enforced by the code. The sub-models are in there. We just haven’t taken the time to reveal them.

Entity Framework will oblige us to take the time. While that may seem like time wasted, I believe it will
be repaid in short order with code that is easier to read, write, and maintain.

Of course I will have to build a bridge between the models. A module grounded in Model A will need
information about customers … information derived from customers as they are known in Model B.

 Domain Driven Design speaks both to separating models and to bringing them together. Concepts such
as Bounded Context, Context Mapping, Shared Kernel, Domain Events, and Anti-Corruption Layers are
central to DDD thinking. Be sure to spend some time with them.


End of sermon. So how do we do it?

Separate Models, Separate Databases

DDD architects will tell you that different models should be backed by different data stores. When the
data of Model A and Model B are stored in the same database you cannot ensure that “A” entities will
be safe from corruption by consumers of “B” entities. You will lose the ability to evolve the models
independently if they share the same database.

Two models may seem to share the same entity – a Customer for instance – but seasoned architects
know that these are not really the same “Customer” thing. What it means to be a “customer” – what a
customer can do and what we know about it – will differ from model to model. The definitions drift
apart as the customer use cases are explored independently. The seeming similarity will eventually
disappear. Trying to force the Model A Customer type to match the Model B Customer type impedes the
development of both.

This makes sense to me and is confirmed by my experience. How many times have we seen a Customer
record become the dumping ground for data columns whose meaning and usage are long forgotten?

Security and scalability may also drive you toward separate databases. Some information may be so
sensitive that it requires special attention. Indeed, you may be required by law to keep and protect
certain data in a carefully controlled environment. But you need not impose those constraints on all
your application data. You can isolate the sensitive information in its own database which can be
physically and logically guarded more closely than data held elsewhere.

Certain data are more volatile than others; some data are more valuable than others. High volume,
volatile, low risk data such as shopping cart or shopping history data could go in a cloud database. You
may feel more comfortable keeping customer information (including credit cards) on premises.

If I have a choice and I have time to work out the details, I separate both the models and their backing
stores.

Separate Models, One Database

Sometimes you don’t have the choice or the time.

Few of us have the luxury of developing a green field application. If you’ve read this far, you probably
have an existing database with a boat load of tables and you’re not walking away from it. You feel that
arguing about design-first versus data-first is a waste of time; the “last responsible moment” for
committing to a schema happened long ago; generating 1000 types is vastly more appealing than coding
them by hand. You want an entity model that matches up and you want to get on with it.

While pontificating on Bounded Context and Context Mapping, we easily overlook how hard this is to
do properly. Our understanding of the domain is in great flux, especially at the beginning. Boundaries
won’t settle down and neither will the code for communicating across them.

The prospect of maintaining two separate Customer tables is most unappealing. When they differ, which
of the two is “right”? We can barely recognize that two customer entities are actually the same
customer when they map to the same table row; it gets harder when customer data are scattered across
multiple schemas and multiple databases. This problem has dogged developers for decades.



You have to think about cross-model transactions. Will you pay the performance and complexity cost of
distributed transactions that make ACID guarantees? Or will you follow the trend toward BASE
transactions which promise “eventual consistency” – eventual data integrity – after multiple, separate
database postings outside of transaction boundaries?

The smart talk about “eventual consistency” and “compensating transactions” sounds great. Without
automated ACID guarantees, you must ensure that data become consistent in the time allotted by your
service level agreement. If you time out and have to undo the changes, you must know all the system
and human side effects of the now-failed transaction; you have to figure out how to roll back the data
across the databases and to make compensating adjustments.

This is astonishingly difficult work and completely unnecessary for many (most) business applications.

The DDD doomsayers need a reality check; most business applications have operated successfully for
years on the back of a single database.

“Get serious!” you say. “We’re going to have one database now … and for the foreseeable future!”

Ok, you’re the boss. I get it. We can have one database, the big one, the one you have now.

“Please, will you get on with it and show me how to make sub-models in Entity Framework?”




    Dividing a Large Entity Framework Model into Sub-models
In part one I talked about big models – models with more than 200 entities. Entity Framework is not kind
to owners of big models. The only practical solution, if you want to use EF, is to break up the big model
into sub-models. I explored why this is a good idea anyway, why you might want to back each model
with its own database … and why you probably won’t.

In this part, I’ll explain how to break up your Entity Framework model in the context of an IdeaBlade
DevForce application. Much of what I will say may be applicable if you don’t use DevForce; that’s dandy
but my purpose and demonstration are unapologetically DevForce-focused. The application and
accompanying video are available on the “Ward’s Corner” page of the IdeaBlade website.

Sharing Common Types

There is nothing to it if you have completely separate models, if they have no entities in common.
Create two EDMX models, two DevForce domain models, deploy them, and you’re done. You know how
to do it once; just do it twice.

Life is rarely that simple. I assume that you have at least one type that is nominally the same in two or
more sub-models. The challenge is to maintain that commonality without duplicating code.

A Scenario

Imagine two teams charged with building two separate modules of an application.

A customer relationship management (CRM) module tracks customers and their orders for products
drawn from the company catalog. We reference products but we don’t maintain the product catalog in
this module.

We maintain the product catalog in the second module. Activities include adding products, retiring
products, negotiating prices with suppliers, etc. In this module we don’t think about customers, sales
reps, orders, etc.

We keep all application data in the Northwind database. We’ll have two models: “Model A” for the CRM
module and “Model B” for the product catalog module.

Both models have a notion of “Product”. They each have a “Product” entity mapped to the same
Product table. Our two models look like this:

Model A                    Model B
Customer                   Product
Employee (SalesRep)        Category
Order                      Supplier
OrderDetail
Product

Divide the EF Model

In part two of the EF team blog post on this subject, Mr. Srikanth Mandadi describes two approaches to
splitting up an Entity Framework model and sharing types.


His first approach involves splitting out multiple CSDL files, a tortuous eleven-step sequence that is
brittle and frightful to maintain. He observes that “this would not solve the performance problems ...”
Don’t even think of going this route; just forget about it.

In his second approach, he creates two separate models … and a third CSDL file to hold the conceptual
model for the type(s) in common. An illustration of the EF component files makes this clearer.

Model File        Model A                  Model B                  Common
Conceptual        Customer                 Category                 Product
(CSDL)            Employee (SalesRep)      Supplier
                  Order
                  OrderDetail
Storage           Customer                 Product
(SSDL)            Employee                 Category
                  Order                    Supplier
                  OrderDetail
                  Product
Map (MSL)         - Same as in SSDL -      - Same as in SSDL -

When you first define the two models, you’ll map Product in both. Then you factor the conceptual
Product into a third CSDL file and give it its own namespace.

I see far too many treacherous steps to follow. Don’t do this either.

EF Models With Duplicate Types

A third way is simply to have two models that both define Product. The Product-defining XML appears
in all three spaces: CSDL, SSDL, and MSL. Don’t mess about with a carved out Product CSDL file. Don’t
expose the component EDMX files either; keep them together in the consolidated EDMXs.

Mr. Srikanth mentions this third way briefly before dismissing it:

          You might want to consider duplicating information in CSDL too. This would allow you to work
          with the designer. But if you are dividing the model for performance and maintainability reasons
          and you actually want to use these smaller models in a single application, duplicating the
          information would not be a viable option. There are definitely other disadvantages with
          duplicating information across multiple model files (typically the same problems that you would
          see with duplicate code).

I disagree entirely. This is the best way to go.

When you move the Product into a separate CSDL as he proposes, you cannot use the visual designer at
all; that’s a huge sacrifice.

His way, you have to juggle the three separate component files (CSDL.xml, SSDL.xml, MSL.xml) for each
model … plus the Product CSDL. That means careful manual coordination of seven files instead of the
automated handling of two EDMX files – another headache you’d best avoid.

Duplicating the Product entity in two EDMX files does not harm performance. Does it harm
maintainability?



Any duplication impairs maintainability. But don’t think his recommendation is more maintainable. The
forces of change to a conceptual entity come mostly from schema and mapping changes. Add a column
to the Product table and you’ll visit both models … without the aid of the EF visual designer. Even
renaming a Product property would require an update to both EDMX models because the mapping that
relates that property to a column in the Product table is buried in both MSL files.

So pay the price and duplicate the conceptual Product definition in both EDMX files. When change
comes – a new column, renamed property, whatever – you’ll carefully repeat the change to all models
that include Product. Someday someone will write a tool to “diff” entities in multiple EDMXs so you can
detect unwanted differences. Meanwhile be careful.
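
Until such a tool exists, a rough check is not hard to write. The sketch below is my own illustration, not part of DevForce or Entity Framework. It assumes that each EDMX file holds its conceptual model under an element whose local name is ConceptualModels, and that entities appear there as EntityType elements with Property children carrying Name and Type attributes. It matches on local names so it does not depend on a particular EDM schema version. It compares scalar properties only and ignores navigation properties and mappings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for spotting drift between duplicated entity definitions.
public static class EdmxEntityDiff
{
    // Returns "Name:Type" for each scalar property of the named conceptual entity.
    public static HashSet<string> ConceptualProperties(XDocument edmx, string entityName)
    {
        // Restrict the search to the conceptual section; the storage section
        // also contains EntityType elements that we must not pick up.
        var conceptual = edmx.Descendants()
            .FirstOrDefault(e => e.Name.LocalName == "ConceptualModels");
        if (conceptual == null) return new HashSet<string>();

        var properties = conceptual.Descendants()
            .Where(e => e.Name.LocalName == "EntityType"
                     && (string)e.Attribute("Name") == entityName)
            .SelectMany(e => e.Elements().Where(p => p.Name.LocalName == "Property"))
            .Select(p => (string)p.Attribute("Name") + ":" + (string)p.Attribute("Type"));

        return new HashSet<string>(properties);
    }

    // Lists the properties that appear in one model's entity but not the other's.
    public static List<string> Diff(XDocument first, XDocument second, string entityName)
    {
        var a = ConceptualProperties(first, entityName);
        var b = ConceptualProperties(second, entityName);

        var result = new List<string>();
        result.AddRange(a.Except(b).OrderBy(s => s).Select(s => "only in first: " + s));
        result.AddRange(b.Except(a).OrderBy(s => s).Select(s => "only in second: " + s));
        return result;
    }
}
```

You would load each file with XDocument.Load and call Diff for every entity the sub-models share. An empty list means the scalar property definitions agree; anything else deserves a look.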

 I am pretty sure that you won’t have a lot of entities that cross sub-models. You need to keep a close
watch on these intersection points anyway. The boundary between domain models always deserves
your utmost attention. This is where the two development teams are likely to misunderstand each
other; this is where bugs are likely to occur. Code sharing fosters the dangerous pretense that both
teams mean the same thing by Product.

On the bright side, the two Product CLR classes emerging from the two models consist of generated
code, not code you’ll have to maintain by hand. Pay attention to the mapping and the generated code
will take care of itself.

        Disclosure: at this writing, DevForce cannot cope with a carve-out CSDL file and we could not
        support either of Mr. Srikanth’s favored approaches even if we agreed with them.

“BigModelBreakup” Solution

To explore the Customer/Product scenario described above, I’ve built a DevForce Visual Studio solution
with seven small projects:

Project                         Purpose
BigModelBreakup                 MS Testing of the separated models
DevForceIntegrationTesting      Custom asserts used frequently in DevForce integration testing
ModelA.DF                       DevForce “Model A”
ModelA.EF                       Entity Framework “Model A”
ModelB.DF                       DevForce “Model B”
ModelB.EF                       Entity Framework “Model B”
ModelIntegration                Support for cross module behaviors
I could have thrown everything into one pot but I much prefer to keep these concerns in separate
projects. I strongly favor separating the Entity Framework model from the DevForce model although we
rarely do so in our demonstrations. I find it cleaner, clearer, and easier to maintain; maybe it’s just me.

Two EF Models

The two EF models were built exactly as I described in the “third way” above. Each holds an EDMX that
completely describes its model. Here are the entity diagrams.




“Model A” – Customer Order




“Model B” – Product Catalog




Both models are generated from the same database, “NorthwindIB”, which is a slight variation on the
famous Microsoft “Northwind”. NorthwindIB is shipped with the IdeaBlade tutorials.

The two Product entities are almost identical. They differ in their treatment of foreign keys. The foreign
key ID properties that support the links to Category and Supplier are exposed in Model A. Model A lacks
the Category and Supplier entities and their associations to Product.

These foreign key properties are missing in Model B because Entity Framework v.1 insists on hiding
them. We’ll be able to see them in EF v.4 but they are hidden for now.

To make the Product definitions strictly identical, I could sever the associations among “Model B”
Product, Category, and Supplier entities. Such surgery is involved and delicate and not worth the
trouble. In fact, I rather like that I can navigate from Product to Category in “Model B”. I see no good
reason to deprive myself of that facility nor would I prefer to hand code that navigation as Mr. Srikanth
is obliged to do in his example.

Two DevForce Models

If you are familiar with DevForce, you know the next step is to launch the DevForce Object Mapper from
the Visual Studio toolbar and generate DevForce models for each one. I did just that, putting each
DevForce model in its own project.

Plenty of tweaking is possible but I didn’t bother in my first attempt. I did let the Object Mapper
“pluralize” the models.

Two critical steps must not be overlooked:

First, each model must have its own “Data Source Key Name”. I called them “DefaultA” and “DefaultB”
respectively. You set these key names when viewing the EDMX node of the browser. Distinct key names
are essential because we must maintain separate connection information for each EDMX model. The
connection strings will be identical (the database is the same) but the EDMX component file
specifications will differ by model.

Second, each DevForce model should have its own namespace. I called them “ModelA” and “ModelB”
respectively. We require two namespaces because we have two entities named “Product” and we need
to distinguish them in code.

We could change the entity names to “ProductA” and “ProductB” but this feels unnatural and draws
attention to the artificial model names “A” and “B”. Yuck. Moreover, I rather like using a namespace to
distinguish our models; it lightens the load on IntelliSense too.
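
To make the first of those two steps concrete, here is a sketch of what “identical connection strings, different component files” means. It only builds strings in the Entity Framework connection string format. The resource names, the server, and the dictionary keyed by data source key name are assumptions for illustration; this is not how DevForce stores or resolves its keys.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of per-model connection information.
public static class ModelConnections
{
    // Assumed store connection; both models point at the same database.
    private const string Store =
        "Data Source=.;Initial Catalog=NorthwindIB;Integrated Security=True";

    // Builds an EF-style connection string whose metadata section names one
    // model's CSDL/SSDL/MSL resources. The resource names are assumptions.
    public static string For(string modelName)
    {
        return string.Format(
            "metadata=res://*/{0}.csdl|res://*/{0}.ssdl|res://*/{0}.msl;" +
            "provider=System.Data.SqlClient;" +
            "provider connection string=\"{1}\"",
            modelName, Store);
    }

    // One entry per data source key name used in this essay.
    public static Dictionary<string, string> ByKeyName()
    {
        return new Dictionary<string, string>
        {
            { "DefaultA", For("ModelA") },
            { "DefaultB", For("ModelB") }
        };
    }
}
```

The two entries share the provider connection string and differ only in the metadata section, which is why a single key name cannot serve both models.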

Code generation yields two entities named Product: ModelA.Product and ModelB.Product. They have
approximately the same shape and the same properties. The ModelB.Product has navigation properties
that return Category and Supplier entities. The ModelA.Product lacks these properties but exposes the
CategoryID and SupplierID foreign key values.
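
When a single source file must mention both types, namespace qualification or a using alias keeps them apart. The sketch below uses hand-written stand-ins for the generated classes; the property lists are trimmed, and the ProductDescriptions helper is hypothetical.

```csharp
using System;
using ProductA = ModelA.Product;
using ProductB = ModelB.Product;

namespace ModelA
{
    // Stand-in for the generated Model A entity: foreign keys exposed,
    // no Category or Supplier navigation.
    public partial class Product
    {
        public int ProductID { get; set; }
        public string ProductName { get; set; }
        public int? CategoryID { get; set; }
        public int? SupplierID { get; set; }
    }
}

namespace ModelB
{
    public partial class Category
    {
        public string CategoryName { get; set; }
    }

    // Stand-in for the generated Model B entity: navigation property,
    // foreign key hidden.
    public partial class Product
    {
        public int ProductID { get; set; }
        public string ProductName { get; set; }
        public Category Category { get; set; }
    }
}

// Hypothetical helper that handles both Product types side by side.
public static class ProductDescriptions
{
    public static string Describe(ProductA product)
    {
        return product.ProductName + " (category id " + product.CategoryID + ")";
    }

    public static string Describe(ProductB product)
    {
        var category = product.Category == null ? "none" : product.Category.CategoryName;
        return product.ProductName + " (category " + category + ")";
    }
}
```

The compiler picks the overload from the static type of the argument, so the two Product classes never collide even though they share a simple name.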

I explained the root cause of these differences above. I confess that I’m not fond of the foreign key
properties in ModelA.Product; we’ll get rid of them later.

Meanwhile, we are ready to go. We build our application modules against each model separately.

We can combine models in the same application module if we want to do so.

We can even combine entities from each model in the same EntityManager (that’s the DevForce
equivalent of the client “Context” container for folks not versed in DevForce). The integration test,
“BreakupTestFixture.CanMixProductFromBothModelsInSameManager”, demonstrates this point.

Important note of caution: within a single EntityManager you can make changes to entities from both
models and save those changes in a single transaction. You should be aware that, under the covers,
DevForce (and the Entity Framework) will use a distributed transaction (using the Distributed
Transaction Coordinator – DTC) even though all of the tables involved belong to the same database.




I do not believe that EF is “smart” enough to realize that the database connection strings for the
two Entity Framework models are the same and that DTC is unnecessary. Distributed transactions are
considerably slower than simple transactions. Significant adverse performance consequences may
follow. You should measure to be sure that this isn’t a problem in your application.

I suggest that you avoid mixing entities from different models in the same EntityManager and, if you
must mix them, avoid saving changes to multiple models. While we support distributed saves, proceed
with appropriate awareness and judgment.

“Product” Code in Common

ModelA.Product and ModelB.Product are distinct types backed by the same database table. They have
most of the same properties. I smell code duplication.

We discussed this point earlier. I argued that, while duplication is regrettable, it is impractical to avoid it
in mapping and generated code. I said nothing about the business logic code that of necessity we write
by hand.

The generated entity model is always somewhat anemic. DevForce generates a small amount of
“business logic” automatically, mostly validation, based on information it gleans from the database
structure. A primary key is obviously a required field; a string property backed by an nvarchar(30) field
should have a string length maximum of 30 characters. I’m grateful not to have to write these
validations myself. But on their own they do not constitute a model rich in behavior.
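
To give a feel for the kind of rule involved, here is a hand-written equivalent of those two checks. This is not DevForce's generated code and does not use the DevForce validation API; the Contact entity, its column sizes, and the SchemaDerivedRules helper are assumptions made for the sketch.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity; the names and the nvarchar(30) column are assumptions.
public class Contact
{
    public int? Id { get; set; }      // primary key column
    public string Name { get; set; }  // backed by an nvarchar(30) column
}

public static class SchemaDerivedRules
{
    public const int NameMaxLength = 30;

    // Returns one message per violated rule; an empty list means the entity passes.
    public static List<string> Validate(Contact contact)
    {
        var errors = new List<string>();

        // A primary key is a required field.
        if (contact.Id == null)
            errors.Add("Id is required.");

        // A string property cannot exceed the width of its backing column.
        if (contact.Name != null && contact.Name.Length > NameMaxLength)
            errors.Add("Name cannot exceed " + NameMaxLength + " characters.");

        return errors;
    }
}
```

Rules like these protect data integrity, but they say nothing about what a Contact is for, which is the sense in which the generated model stays anemic.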

In DevForce as in Entity Framework, the main road to entity enrichment lies in elaborating the partial
classes. To extend Customer, we create a Customer class file with (in C#) a class definition that begins
“public partial class Customer {…}”.
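
A small sketch shows how the two halves fit together. The first class stands in for the generated code, which would normally live in a machine-written file; the property names and the DisplayName logic are my own illustration.

```csharp
using System;

namespace ModelA
{
    // Stand-in for the generated half of the entity (normally machine-written).
    public partial class Customer
    {
        public string CompanyName { get; set; }
        public string ContactName { get; set; }
    }

    // Hand-written half, kept in its own file so that regenerating the
    // model never overwrites it.
    public partial class Customer
    {
        // Derived, non-persisted property built from generated properties.
        public string DisplayName
        {
            get
            {
                return string.IsNullOrEmpty(ContactName)
                    ? CompanyName
                    : CompanyName + " (" + ContactName + ")";
            }
        }
    }
}
```

The compiler merges both declarations into one Customer class, so the hand-written members can use the generated properties directly.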

When we write such code for a “shared” entity such as Product, we ask ourselves, “Do we have the same
logic for the Product in Model A as we do for the Product in Model B?”

Let’s suppose we have such common logic. We certainly don’t want to maintain that logic in two places.

We don’t have to. We can write the common logic in a single partial class file and share that file across
the model projects that need it. In the example below, I’ve added the file to the “ModelB.DF” project
and I link to it in the “ModelA.DF” project.

If you have written a Silverlight application you know about file links. Rather than copy a file into the
project, you point to (link to) a file residing somewhere else, thus avoiding file duplication and the
consequences thereof.

Linked files appear in the Visual Studio Solution Explorer window with a shortcut adorner on the file icon as
seen here:




The effect is that the same physical file is compiled multiple times, once for each project. We have only
one canonical source file; changes to it are propagated through compilation to all projects that link to it.

Here is a Product partial class file that illustrates the point:
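// Shared partial class file: it lives in ModelB.DF and is linked into ModelA.DF.
// Each project defines exactly one of the MODELA / MODELB compilation symbols.
#if MODELA
namespace ModelA {
#elif MODELB
namespace ModelB {
#else
#error Define MODELA or MODELB in the conditional compilation symbols of the project
#endif
    public partial class Product {

        private string _foo;

        // Logic common to both models; falls back to a default when unset.
        public string Foo {
          get { return _foo ?? "Common Foo"; }
          set { _foo = value; }
        }
    }
}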

The “Foo” property is implemented the same way for both models.

At the top we’ve used compiler directives to swap namespace names depending upon whether the
compiler is working on the “ModelA.DF” project or the “ModelB.DF” project.

“MODELA” and “MODELB” are conditional compilation symbols specific to each project. You set one or
the other on the “Build” tab when you open the “Property” window for the project.

This is “hacky”. Cover the children’s eyes and draw a curtain across the indiscretion. You will survive it.

Deliberately Divergent “Product” Code

No law says the two Product implementations have to be identical. Once you’ve divided your large
model into sub-models based on their distinctive uses, you start to see valid reasons for ModelB.Product
to behave differently than ModelA.Product. The data integrity rules may be the same but the usage
scenarios are different. You may find activities that make sense only in one model or the other.

The DDD folks told us this would happen. We start to realize that the two things called “Product” mean
something different in each domain. Maybe they really should be different types; see part one of this
essay for a bit more on this point.

Don’t be shy; create another partial class file that adds special logic to the model that needs it, as we do
here in “Product_B_Only.cs”. Do not link to this file in the project for Model A:
namespace ModelB {

    public partial class Product {

        private string _bar;

        /// <summary>Model B (only) Bar property</summary>
        public string Bar {
          get { return _bar ?? "Model B Bar"; }
          set { _bar = value; }
        }
    }
}



Now that I see the value of diverging Product types, I might want to revisit my entity mapping.

For example, it’s unnecessary and even a little dirty for my ModelA.Product to expose the CategoryID
and SupplierID. These are foreign keys to notional types (Category, Supplier) that are utterly
meaningless in Model A.

I reopen the DevForce Object Mapper and make those properties “protected”. I won’t erase them from
the definition because I have some games in store for our demonstration. But I’ll certainly hide them
from law-abiding consumers of the ModelA.Product.

I decide that consumers of ModelA.Product shouldn’t be able to change a Product. While I’m in the
DevForce Object Mapper, I make all generated property Setters private. I go on to add guard logic that
prevents attempts to save a changed product because I’m worried that developers will get sneaky and
I’m feeling especially paranoid. The net effect is that users of Model A can see but not change Products.

I hasten to add that these departures from the initial Product modeling are entirely optional. While you
must cauterize the missing associations in Model A, if you want your Product entities to be otherwise
identical, you may have your wish. I’m merely observing that you are technically free to define Product
differently in the two models to reflect their distinctive model semantics.

Speaking for myself, I welcome the opportunity to explore the individual design possibilities of these
two domains. Having the database table in common will keep the Product entities from straying too far
apart. But only you know the degrees of freedom you can tolerate in your environment for your team.

Communicating Across Models About Products

All is well if modules built around these separate models do not have to communicate. That kind of
isolation is rare.

How would Module A – pinned to Model A – communicate something about Products to Module B?
Suppose, for example, that the user is looking at a Customer Order line item. She wants to know more
about the product; perhaps she is even authorized to make a change to that product. How does she
launch a Product Catalog management session to explore this particular product?

Let’s look in the opposite direction too. Consider the Product Catalog manager who wants to know order
volume for a product so she can negotiate a better price with the supplier.

These are the kinds of “what if” use cases that lead project managers to lobby for large, comprehensive
models. “You see,” they say, “everything is potentially related to everything else.”

“Potentially” yes. But upon closer examination, it turns out that these are activities outside of the
module mainstream. They must be satisfied but not at the expense of a compromised domain model.

You’ll notice that the foreign information is typically read-only. You are not required to change a Product
from within the Customer Order module; you can jump from a ModelA.Product in one module to the
“same” product in a module built upon Model B.





The Product Catalog manager wants a snippet of information from the Customer Order system; if she
really needs to go spelunking among the orders, she’ll transition to a Model A-oriented module where
the facilities for investigation are richer.

The sanctity of our sub-models remains intact. We just need a way to communicate loosely from one
model environment to the other.

The demonstration code shows us a few ways to jump the gap.

Crossing the Chasm

Suppose, as we said, that a user of Module A decides to drill in on the product named in a line item.
Some UI gesture launches a product deep-dive in Module B. How can Module A tell Module B which
product to pursue?

I would arrange for Module A to send a “Show Product” message to Module B. Thus, Module A and
Module B are completely decoupled and know nothing about each other’s domain models. I would send
the message via a lightweight, in-process messaging technology such as an Event Aggregator like the one
in Prism.

What goes in the body of the message?

The easy, obvious answer is the product ID – the minimum information necessary to identify the product
for any Product message receiver in our application. Module B hears the call, extracts the ID, queries the
database, and proceeds from there.
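The exchange just described can be sketched in a few lines. This is only an illustration under stated assumptions: “SimpleEventAggregator” is a hypothetical, minimal stand-in for an in-process event aggregator (it is not the Prism API), and “ShowProductMessage” is a made-up message type that carries nothing but the product ID.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical message type: carries only the product's key.
public sealed class ShowProductMessage {
    public int ProductId { get; private set; }
    public ShowProductMessage(int productId) { ProductId = productId; }
}

// Hypothetical, minimal stand-in for an event aggregator (not the Prism API).
public sealed class SimpleEventAggregator {
    private readonly Dictionary<Type, List<Delegate>> _handlers =
        new Dictionary<Type, List<Delegate>>();

    // A module registers interest in one message type.
    public void Subscribe<T>(Action<T> handler) {
        List<Delegate> list;
        if (!_handlers.TryGetValue(typeof(T), out list)) {
            list = new List<Delegate>();
            _handlers[typeof(T)] = list;
        }
        list.Add(handler);
    }

    // The sender needs to know only the message type, not the receivers.
    public void Publish<T>(T message) {
        List<Delegate> list;
        if (!_handlers.TryGetValue(typeof(T), out list)) return;
        foreach (var handler in list.ToArray()) {
            ((Action<T>)handler)(message);
        }
    }
}

public static class ShowProductDemo {
    // Returns the ID that "Module B" received, to show the round trip.
    public static int Run() {
        var aggregator = new SimpleEventAggregator();
        int received = -1;

        // Module B side: a real handler would query for the product by ID.
        aggregator.Subscribe<ShowProductMessage>(m => received = m.ProductId);

        // Module A side: publish the ID taken from the order line item.
        aggregator.Publish(new ShowProductMessage(42));
        return received;
    }
}
```

Neither side references the other’s model; the only shared type is the message itself, which could live in a small common assembly.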

I think this is the preferred way most of the time. Be wary of anything more complicated … such as the
approach I’m about to show you.

Let’s change the scenario slightly. The user wants to be able to drill in on every product on the order,
maybe every product ever ordered by this customer.

Module B doesn’t know about Customers or Orders; its model refers to Products, Categories, and
Suppliers. Module A could extract the unique Product IDs and pass them in the body of the message to
Module B. Again, this may be the best approach.

But you’re losing sleep over the performance implications. You’ve got information in client memory in
one module and my recommended approach requires you to make a trip to the database to get
essentially the same information again for the second module.

Consider an alternative. Module A might ask a “transfer service” to translate ModelA.Products to
ModelB.Products without that “expensive” trip to the server. I demonstrate just such an approach with
the help of a class in the ModelIntegration project.

Unlike Module A, the ModelIntegration project has a reference to both Model A and Model B. It exists to
assist in regulated flows across the model boundaries. Here’s an extension method to do the job …
namespace ModelIntegration {
  public static class ProductExtensions {

    public static ModelB.Product ToModelB(
        this ModelA.Product productA) {
      var result = new ModelB.Product();
      CopyProductToModelB(productA, result);
      return result;
    }
    // ...
  }
}


… and an example of how to call it:
        productB = productA.ToModelB();

Imagine that this method is buried in the guts of a “Show Products” method in a Model Integration
Service class. With these basics in place, the communication goes something like this:

           Module A calls “Show Products” with a list of ModelA.Products.

           The service uses the “ToModelB” extension method to translate the products and fills the
            message body with ModelB.Products; it then raises the appropriate message event.

           Module B receives the message, extracts the ModelB.Products, and attaches them to its
            EntityManager for immediate use.

           Module B displays the Product Catalog Management screen with these products in a grid.

Module B won’t have to query the database for these products; all of their persistent state values were
transferred (even data that were hidden from Model A consumers).

Server trips may become necessary later as the user drills for more detail on some of these products.
But the initial list of products presented to the user will be constructed entirely from resources in client
memory.

Several “BreakupTestFixture” tests such as “CanCreateManyProductBsFromProductAs” demonstrate
this approach.

A Little More Detail

The “ToModelB” extension method delegates to a private static method, “CopyProductToModelB”. That
method could require intimate knowledge of both Product entities, making it difficult to maintain. If I
change the schema or rename a property, I have to edit this method.

In this example, I write a more robust translator using DevForce entity indexing.
private static void CopyProductToModelB(
    ModelA.Product productA, ModelB.Product productB) {

    // Assuming property names are identical
    // (except for foreign keys)
    // copy values from ProductA to Product B
    var dataProps =
      productA.EntityAspect.EntityMetadata.DataProperties;

    foreach (var prop in dataProps) {

        var targetName = prop.Name;

        // Model B exposes these foreign keys under DevForce-generated names
        if (targetName == "CategoryID")
          targetName = "Category_fk_CategoryID";
        if (targetName == "SupplierID")
          targetName = "Supplier_fk_SupplierID";

        productB.EntityAspect[targetName] = productA.EntityAspect[prop];
    }
}

The method assumes that the property names are the same for both types of Product entity with a few
notable exceptions. They differ where they must, in the treatment of foreign key IDs.

         Entity Framework v.1 hides the foreign key IDs behind the associations. DevForce exposes them
         through properties it generates using the “EntityName_fk_ID” naming convention as seen in
         “Category_fk_CategoryID”. We see these special properties in Model B but not in Model A
         which lacks these associations. I can hardly wait to be done with this business in EF v.4.

Access and assignment via entity indexing bypasses the properties themselves. Entity indexing works
even when the properties are private, as they are for ModelA.Product. Use this DevForce backdoor
wisely and sparingly.
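If the list of differing names grows, the chain of “if” statements in “CopyProductToModelB” could give way to a lookup table. The sketch below is a hypothetical variation, not the demonstration code: plain dictionaries stand in for the name-indexed entity values so that it runs without DevForce, and “PropertyNameMap” is a name invented for the illustration.

```csharp
using System.Collections.Generic;

// Hypothetical helper; plain dictionaries stand in for entity indexing.
public static class PropertyNameMap {

    // Model A name -> Model B name, only for properties whose names differ.
    private static readonly Dictionary<string, string> Renames =
        new Dictionary<string, string> {
            { "CategoryID", "Category_fk_CategoryID" },
            { "SupplierID", "Supplier_fk_SupplierID" },
        };

    // Names without an entry are assumed identical in both models.
    public static string ToModelBName(string modelAName) {
        string renamed;
        return Renames.TryGetValue(modelAName, out renamed)
            ? renamed
            : modelAName;
    }

    // Copies every value, translating the key where the models disagree.
    public static Dictionary<string, object> CopyValues(
        IDictionary<string, object> source) {
        var target = new Dictionary<string, object>();
        foreach (var pair in source) {
            target[ToModelBName(pair.Key)] = pair.Value;
        }
        return target;
    }
}
```

The table keeps the exceptions in one place, so a new renamed property means a new entry rather than another branch in the loop.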

Creative Projection

One day we discover that we must show the Category and Supplier names when we display order line
items in Module A. Do we break down and expand Model A to include Category and Supplier?

We slide down a slippery slope if we do; our Model A could easily mushroom to an unwieldy size again.

We really don’t want the full Category and Supplier entities in Model A. We don’t care about their
business logic and we shouldn’t burden Model A (or Module A) with those responsibilities. All we want
are some names.

One approach would be to add a view to the database and model it as an entity. This is an effective but
heavy-handed solution with potentially difficult deployment implications. Perhaps you lack authority to
change the database or the organizational barriers are too high.

Instead, you might consider using a Product projection object, a “Data Transfer Object” (DTO), that
conveys just the information you need back to Module A.

We have a small problem. Module A cannot compose a projection query that involves the Category and
Supplier entities without knowing about those entities … and Model A does not know about those
entities. Only code using Model B has access to those entities. We don’t want Module A to take a
dependency on Model B either.

I recommend turning to an integration “service” as we did before when we translated “A” products to
“B” products. This time, the lingua franca will be a ProductDto class such as this one:
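using System.ComponentModel;        // ReadOnlyAttribute
using System.Runtime.Serialization; // DataContract, DataMember

[DataContract]
[ReadOnly(true)] // UI hint makes class ReadOnly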
public class ProductDto : IKnownType {

    [DataMember]
    public int ProductID { get; set; }

    [DataMember]
    public String ProductName { get; set; }

    [DataMember]
    public String QuantityPerUnit { get; set; }

    [DataMember]
    public Decimal? UnitPrice { get; set; }

    [DataMember]
    public String CategoryName { get; set; }

    [DataMember]
    public String SupplierName { get; set; }
}

This class resides in the ModelIntegration project which is referenced by Module A. The integration
service belongs to the ModelIntegration assembly which, as we know, has a reference to Model B. The
service sports methods such as the following:
public ProductDto GetProductDtoById(
  EntityManager manager, int productId) {
  var q = manager.GetQuery<Product>()
      .Where(p => p.ProductID == productId)
      .Select(p => new ProductDto {
        ProductID = p.ProductID,
        ProductName = p.ProductName,
        QuantityPerUnit = p.QuantityPerUnit,
        UnitPrice = p.UnitPrice,
        CategoryName = p.Category.CategoryName,
        SupplierName = p.Supplier.CompanyName,
      });

  return q.FirstOrDefault();
}

Note the projection into the ProductDto class. In an n-tier application, DevForce executes the projection
on the middle tier before serializing the ProductDto instance over the wire to the client. This explains
the serialization markup on the DTO.

The net effect is that Module A receives the extended Product information it requires without widening
its model to include entities it neither needs nor wants.
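The shape of that projection can be tried without a server or a database. The following is a minimal in-memory sketch using LINQ to Objects; the “Category”, “Supplier”, “Product”, and “ProductNamesDto” classes are pared-down, hypothetical stand-ins and not the generated Model B entities. The explicit null checks are needed only because the query runs against objects in memory.

```csharp
using System.Collections.Generic;
using System.Linq;

// Pared-down, hypothetical stand-ins for the Model B entities.
public class Category { public string CategoryName { get; set; } }
public class Supplier { public string CompanyName { get; set; } }

public class Product {
    public int ProductID { get; set; }
    public string ProductName { get; set; }
    public Category Category { get; set; }
    public Supplier Supplier { get; set; }
}

// Flat carrier: names only, no Category or Supplier objects.
public class ProductNamesDto {
    public int ProductID { get; set; }
    public string ProductName { get; set; }
    public string CategoryName { get; set; }
    public string SupplierName { get; set; }
}

public static class ProjectionSketch {

    // Flattens the two navigations into plain strings on the DTO.
    // Returns null when no product has the requested ID.
    public static ProductNamesDto FindById(
        IEnumerable<Product> products, int productId) {
        return products
            .Where(p => p.ProductID == productId)
            .Select(p => new ProductNamesDto {
                ProductID = p.ProductID,
                ProductName = p.ProductName,
                CategoryName =
                    p.Category == null ? null : p.Category.CategoryName,
                SupplierName =
                    p.Supplier == null ? null : p.Supplier.CompanyName,
            })
            .FirstOrDefault();
    }
}
```

The consumer of “ProductNamesDto” sees two strings and never touches a Category or Supplier type, which is the property Module A relies on.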

Conclusion

You can break a large Entity Framework entity model into two or more sub-models. It requires some
careful thought about what those sub-models ought to be – effort repaid by a modular design that
should be easier to understand and maintain.

The mechanics are simple and acceptably robust. You will create multiple EF models with some
redundant entity class mappings. We can tame the duplication, if not eliminate it, by confining it to the
mapping, by relying on code generation, and by minimizing the number of entities that are conceptually
shared across models.

When we require a duplicate entity, we can write its common business logic in a single class file and use
file-linking to ensure that the same code base is compiled into all of the applicable model assemblies.





In this essay we saw simple techniques for cauterizing entities that would otherwise require navigations
to objects outside the local model. We saw simple techniques for communicating across modules about
unknown entities.

Such techniques are resilient in the face of changes to the database schema and changes to the models
themselves. We retain use of the visual designers and the other productivity tools that shield us from
raw XML surgery and keep our deployment relatively simple and obvious.

I concede that this is more complicated than it should be. How annoying to have these “mitigations”
thrust upon us by unwarranted technology constraints. But life isn’t fair and, as horrors go, this
bug-a-boo need not disturb our sleep.







                                            Sample Code
An example test application, “BigModelBreakup”, and a video explaining it accompany this essay; they
can be found on the “Ward’s Corner” page of the IdeaBlade website. In this appendix I explain how to
build and run that application.

I recommend that you install DevForce first; it’s free and makes everything I describe below so much
easier. The application will build and run without DevForce installed in case you’re just kicking the tires.

The tests depend upon the “NorthwindIB” database which is installed with DevForce. You’ll find a zipped
copy of this database inside “BigModelBreakup.zip” which you can expand and attach yourself.

The “NorthwindIB” database is built for SQL Server 2005. The connection strings that appear in several
configuration files (the “app.config” file in the BigModelBreakup project most importantly) assume you
have installed the database in the usual way and can access it with “Integrated Security”; you’ll have to
adjust these strings to suit your environment if you have departed from our suggested installation.

The application file structure is as follows:




The “Lib” directory contains DevForce assemblies for those who have not installed DevForce. If you have
installed DevForce, you can delete this directory … and you should delete it! All of the projects reference
the assemblies in this Lib directory but they will fall back to the GAC when DevForce is installed.

The solution holds seven small projects as described here:

Project                          Purpose
BigModelBreakup                  MS Testing of the separated models
DevForceIntegrationTesting       Custom asserts used frequently in DevForce integration testing
ModelA.DF                        DevForce “Model A”
ModelA.EF                        Entity Framework “Model A”
ModelB.DF                        DevForce “Model B”
ModelB.EF                        Entity Framework “Model B”
ModelIntegration                 Support for cross-module behaviors



You must build the solution successfully before proceeding. There should be no errors or warnings.

The application consists of a battery of tests in the BigModelBreakup project. This should be the
“Startup Project”. The tests that are most pertinent to this discussion are in the “BreakupTestFixture.cs”
file.

These tests are defined with the expectation that you will run them with MS Test which comes with
Visual Studio. Feel free to convert to the test suite of your choice.

If the BigModelBreakup project is your “Startup Project”, running the tests is as simple as hitting F5. Here’s
hoping they are all green!

Running Tests in 3 Tiers

The code as delivered runs 2-tier (client/server). You can run these tests in 3 tiers by “deploying” the
server as a console application and re-running the tests against this server.

Follow these four steps:

    1.   Create and populate a “ConsoleServer” directory
    2.   Run “ServerConsole.exe” in that directory
    3.   Modify the “app.config” in the “BigModelBreakup” test project so that it uses the console server
    4.   Run the tests.

I’ve included a DOS command file, “CreateAndRunConsoleServer.cmd,” that combines steps #1 and #2.
You’ll find it among the solution artifacts.

         If you haven’t installed DevForce, you should substitute
         “CreateAndRunConsoleServer – DevForceNotInstalled.cmd”.

After running “CreateAndRunConsoleServer.cmd” you should see two console windows. The first
creates the “ConsoleServer” directory and copies the necessary material into it. The second is the
running Console Server application.

Note: the operating system may first ask for your permission to run the Console Server.








You can close the first window as it has done its job of creating the “ConsoleServer” directory.

        If something goes wrong, inspect that first window to confirm that all files were copied
        correctly.

        It may be instructive to examine the “CreateAndRunConsoleServer.cmd” command file to see
        what was intended.

Observe that the server is running. It is listening on port 9009 and will continue to do so until you shut it
down by pressing “Enter” or closing the command window.

Next open “app.config” in the “BigModelBreakup” project and set the “isDistributed” attribute to “true”.




Now run the tests as you did before.





How can you be sure that you were running 3-tier? By examining the server-side debug log.

       Open the “ConsoleServer” directory in Windows Explorer
       Find “Debuglog.xml” and double-click it; you should see it appear in a browser.
       Look for entries that show server activity such as this one:




Close the “ConsoleServer” command window to terminate the server.

You can delete the “ConsoleServer” directory if you wish.

When you are finished running 3-tier, remember to reset the “isDistributed” attribute back to “false”.
Case matters in XML so always enter “true” and “false” in lower case.

If you neglect to reset the flag to false, the tests will continue to try to run 3-tier and they will fail with
the exception:

        “Unable to create instance of class BigModelBreakup.BreakupTestFixture. Error:
        IdeaBlade.EntityModel.EntityServerException: Unable to connect to
        http://localhost:9009/EntityService. The server or internet connection may be down.”



