Archive for the ‘NHibernate’ Category

Persisting enumeration classes with NHibernate

As part of my “Crafting Wicked Domain Models” talk, I walk through the concept of enumeration classes, yanked from Java and on Jon Skeet’s list of biggest C# mistakes (or missing features). In my talk, I leave out how to …

Ditching domain models for reads

Last week was a tipping point for me.  We had an issue where a production service failed because NHibernate was trying to issue thousands of UPDATE calls for domain objects that we didn’t update.  It turned out that we had added a new colum…

Ad-hoc mapping with NHibernate

In my recent adventures with massive bulk processing, there are some times when I want to pull bulk loaded tables from SQL, but don’t want to go through all the trouble of building a mapping in NHibernate.  For example, one recent project had an intermediate processing table of something like:

[Image: definition of the intermediate bulk-load table]

This table is used in a bulk copy scenario, so it’s very string-based to ease the burden of bulk loading.  Later, we’ll transactionally process this table to update our actual customer table.  In the meantime, we want to use this data in a .NET application.  We have a few options:

  • Load into a DataSet
  • Stream from an IDataReader
  • Map using NHibernate
  • Ad-hoc map using NHibernate

Many times, I like to go with the last option.  DataSets and data readers can be a pain to deal with, as most of the code I write has nothing to do with dealing with the data, but just getting it out in a sane format.

NHibernate supports transformers, which are used to transform the results of the query into something useful.  To make things easy, I’ll create a simple representation of this table:

public class BulkLoadCustomer
{
    public string CustomerId { get; set; }
    public string RegisteredDate { get; set; }
}

I’ll create some generic query:

var sql = "SELECT CustomerId, RegisteredDate FROM BulkLoad.Customer";

var sqlQuery = _unitOfWork.CurrentSession.CreateSQLQuery(sql);

I just have the NHibernate ISession object exposed through a Unit of Work pattern.  With the ISQLQuery object that gets created with the CreateSQLQuery() method, I can then specify that I want the results projected into my custom DTO:

var results = sqlQuery
    .SetResultTransformer(Transformers.AliasToBean(typeof(BulkLoadCustomer)))
    .List<BulkLoadCustomer>();

The AliasToBean method is a factory method on the static Transformers class.  I tell NHibernate to build a transformer to my DTO type, and finally use the List() method to execute the query.  I don’t have to specify any additional mapping file, and NHibernate never needs to know about that BulkLoadCustomer type until I build up the query.

The name “AliasToBean” is a relic of the Java Hibernate roots, which is why it didn’t jump out at me at first.  But it’s a great tool to use when you want to just map any table into a DTO, as long as the DTO matches up well to the underlying query results.
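
When the column names don't line up with the DTO, SQL aliases can bridge the gap.  Here's a minimal sketch of that idea, not code from the project: the cust_id and reg_date column names and the BulkLoadQueries helper are made up for illustration, and AddScalar is used to pin down each alias and its type.

```csharp
using System.Collections.Generic;
using NHibernate;
using NHibernate.Transform;

// Same DTO shape as above, repeated so the sketch stands alone.
public class BulkLoadCustomer
{
    public string CustomerId { get; set; }
    public string RegisteredDate { get; set; }
}

public static class BulkLoadQueries
{
    public static IList<BulkLoadCustomer> LoadCustomers(ISession session)
    {
        // cust_id and reg_date are hypothetical column names; the aliases
        // are what AliasToBean matches against the DTO property names.
        const string sql =
            "SELECT cust_id AS CustomerId, reg_date AS RegisteredDate " +
            "FROM BulkLoad.Customer";

        return session.CreateSQLQuery(sql)
            .AddScalar("CustomerId", NHibernateUtil.String)
            .AddScalar("RegisteredDate", NHibernateUtil.String)
            .SetResultTransformer(Transformers.AliasToBean(typeof(BulkLoadCustomer)))
            .List<BulkLoadCustomer>();
    }
}
```

Each alias has to match a property name on the DTO, otherwise the transformer has nowhere to put the value.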

Bulk processing with NHibernate

On a recent project, much of the application integration is done through bulk, batch processing.  Lots of FTP shuffling, moving files around, and processing large CSV or XML files.  Everything worked great with our normal NHibernate usage, until recently when we had to process historical transactional data.

The basic problem is that we had to calculate historical customer order totals.  Normally, this would be rather easy to do with a SQL-level bulk update.  However, in our case, we had to process the items one-by-one, as each transaction could potentially trigger an event, “reward earned”.  And naturally, the rules for this trigger are much too complex to attempt to do in straight T-SQL.

This meant that we still had to run our historical feed through the domain model.

A normal, daily drop of customer orders is processed fairly easily.  However, historical data over just the past year totaled close to 5 million transactions, each transaction being a single line in a CSV file.

Since a normal daily file would take about 2 hours to process, we simply could not wait the projected time to process all these transactions.  However, with some handy tips from the twitterverse, we were able to speed things up considerably.

Bulk import and export

The first thing we found is that if you have to do bulk, set-based processing, it’s important to determine whether the work is a bulk import or a bulk process.  Bulk import is extremely quick with tools like SQL Bulk Copy.  If you need to do a bulk import or export, use the right tool.  A bulk import of these rows into a single table takes about 2.5 minutes, versus days and days one transaction at a time.

In another process we needed to bulk load customer data.  The customer data matched fairly closely to our existing customer table, so we crafted a process that basically followed:

  • Create table to match shape of CSV file
  • Load CSV into table
  • Issue single UPDATE statement to target table with the WHERE clause joining to the CSV-imported table

In this manner we were able to import millions of customer records very quickly.  However, if it’s not straight bulk import or export, we have to go through other channels.
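
The load-then-update steps above could be sketched roughly like this with SqlBulkCopy.  This is an illustration under assumptions, not the code we ran: the dbo.Customer target table, the column names, the batch size of 5000 and the CustomerBulkLoader class are all placeholders, the staging table is assumed to exist already, and the join is written with JOIN syntax rather than in a WHERE clause.

```csharp
using System.Data;
using System.Data.SqlClient;

public static class CustomerBulkLoader
{
    // csvRows is assumed to hold the parsed CSV, with columns matching
    // the staging table. All table and column names are hypothetical.
    public static void Load(string connectionString, DataTable csvRows)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();

            // Load the CSV rows into the staging table in bulk.
            using (var bulkCopy = new SqlBulkCopy(connection))
            {
                bulkCopy.DestinationTableName = "BulkLoad.Customer";
                bulkCopy.BatchSize = 5000;
                bulkCopy.WriteToServer(csvRows);
            }

            // One set-based UPDATE that joins the target table to the
            // staging table, converting the string date along the way.
            const string updateSql = @"
                UPDATE c
                SET c.RegisteredDate = CONVERT(datetime, b.RegisteredDate)
                FROM dbo.Customer c
                INNER JOIN BulkLoad.Customer b ON b.CustomerId = c.CustomerId";

            using (var command = new SqlCommand(updateSql, connection))
            {
                command.ExecuteNonQuery();
            }
        }
    }
}
```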

Optimizing NHibernate for bulk processing

It was a lot of work tinkering with several different ideas, but ultimately the churn was worth it.  Here are a few tips I picked up along the way:

Utilize the Unit of Work pattern and explicit transactions

When I first started, all processing was done with implicit transactions, and copious use of the Flush() method on ISession.  This meant that every. single. save. was in its own transaction.  When you look at the number of roundtrips this entailed, our database was just getting completely hammered.

Additionally, an ISession instance was disposed after every write to the database.  This meant that we could not take advantage of any of the first-level-cache support (aka, identity map) inherent in ISession.

Instead, I switched the codebase to use an actual Unit of Work, where I created a class that controlled the begin, commit and rollback of a unit of work:

public interface IUnitOfWork : IDisposable
{
    ISession CurrentSession { get; }
    void Begin();
    void Commit();
    void RollBack();
    bool IsActive { get; }
}

Before, we really had no control or understanding of the lifecycle of the ISession object.  With this pattern, its lifecycle is tied to the IUnitOfWork, allowing me to take advantage of other NHibernate features.
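
For illustration, one way such an interface might be implemented is sketched below.  The UnitOfWork class name and the choice to open the session in Begin() are my assumptions here, not necessarily how the real class was written.

```csharp
using System;
using NHibernate;

// Assumes the IUnitOfWork interface shown above.
public class UnitOfWork : IUnitOfWork
{
    private readonly ISessionFactory _sessionFactory;
    private ISession _session;
    private ITransaction _transaction;

    public UnitOfWork(ISessionFactory sessionFactory)
    {
        _sessionFactory = sessionFactory;
    }

    public ISession CurrentSession
    {
        get { return _session; }
    }

    public bool IsActive
    {
        get { return _transaction != null && _transaction.IsActive; }
    }

    public void Begin()
    {
        // One session and one explicit transaction per unit of work.
        _session = _sessionFactory.OpenSession();
        _transaction = _session.BeginTransaction();
    }

    public void Commit()
    {
        // With the default flush mode, committing also flushes pending changes.
        _transaction.Commit();
    }

    public void RollBack()
    {
        _transaction.Rollback();
    }

    public void Dispose()
    {
        if (_transaction != null) _transaction.Dispose();
        if (_session != null) _session.Dispose();
    }
}
```

Because the session lives as long as the unit of work, everything loaded inside it shares the same first-level cache.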

Use MultiCriteria/MultiQuery to batch SELECTs

MultiCriteria and MultiQuery allow you to send multiple SELECTs down the wire.  For each row in our table, we had to issue a SELECT, as processing a single transaction meant I needed to potentially affect the customer record as well as any previous order transaction records.  Doing this one SELECT at a time can be quite chatty, so I batched several together using MultiCriteria.

Just going from 1 at a time to 10 at a time, while insignificant for <100 records, can really add up once you get into the millions.
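
As a rough sketch of the idea, here are two criteria queries batched with MultiCriteria.  The Customer and OrderTransaction entities, their properties and the TransactionLookups helper are hypothetical stand-ins, and the entities are assumed to be mapped elsewhere.

```csharp
using System.Collections;
using NHibernate;
using NHibernate.Criterion;

// Hypothetical entities, assumed to be mapped elsewhere.
public class Customer
{
    public virtual int Id { get; set; }
}

public class OrderTransaction
{
    public virtual int Id { get; set; }
    public virtual int CustomerId { get; set; }
}

public static class TransactionLookups
{
    // Sends both SELECTs together where the driver supports it.
    // Index 0 of the result holds the matching customers,
    // index 1 holds that customer's order transactions.
    public static IList LoadCustomerWithOrders(ISession session, int customerId)
    {
        return session.CreateMultiCriteria()
            .Add(session.CreateCriteria(typeof(Customer))
                .Add(Restrictions.Eq("Id", customerId)))
            .Add(session.CreateCriteria(typeof(OrderTransaction))
                .Add(Restrictions.Eq("CustomerId", customerId)))
            .List();
    }
}
```

Each entry in the returned list is itself a list, in the order the criteria were added.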

Use statement batching

In addition to SELECTs, we can also batch together INSERTs and UPDATEs.  In our case, we parameterized the processing of the file to a certain batch size (say, 250).  We then enabled NHibernate’s statement batching in the hibernate.cfg.xml file:

<property name="adonet.batch_size">250</property>

And now instead of one INSERT being sent down the line at a time, we send a whole messload at once.  Profiling showed us that statement batching alone dropped the time by 50%.

NHibernate is very, very smart about knowing when and in what order to save things, so as long as items are persistent, we only really need to commit the transaction for the bulk processing to go through.

Process bulk updates in batches

Finally, once we had a proper Unit of Work implementation in place, we could now process the giant file as if it were many, smaller files.  We split the incoming work into batches of 250, then created a Unit of Work per batch.  This meant that an entire set of 250 was processed in a single transaction, instead of 5 million individual transactions.

Without a proper Unit of Work in place, we would not be able to do this.
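
A minimal sketch of that loop, assuming the IUnitOfWork interface from earlier.  The InBatchesOf helper and the createUnitOfWork and processLine hooks are names invented for this example.

```csharp
using System;
using System.Collections.Generic;

public static class BatchingExtensions
{
    // Splits a sequence into consecutive lists of at most batchSize items.
    public static IEnumerable<List<T>> InBatchesOf<T>(this IEnumerable<T> source, int batchSize)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException("batchSize");

        var batch = new List<T>(batchSize);
        foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<T>(batchSize);
            }
        }

        // Hand back the final, partially filled batch if there is one.
        if (batch.Count > 0) yield return batch;
    }
}

public static class BatchProcessor
{
    // createUnitOfWork and processLine are hypothetical hooks supplied by the caller.
    public static void Process(
        IEnumerable<string> lines,
        int batchSize,
        Func<IUnitOfWork> createUnitOfWork,
        Action<string, IUnitOfWork> processLine)
    {
        foreach (var batch in lines.InBatchesOf(batchSize))
        {
            // One unit of work, and so one transaction, per batch.
            using (var unitOfWork = createUnitOfWork())
            {
                unitOfWork.Begin();
                try
                {
                    foreach (var line in batch) processLine(line, unitOfWork);
                    unitOfWork.Commit();
                }
                catch
                {
                    unitOfWork.RollBack();
                    throw;
                }
            }
        }
    }
}
```

A failure rolls back only the batch it happened in; batches that already committed stay committed.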

Profiling is your friend

Finally, we needed a way to test our processing and the resulting speed improvements.  A simple automated test with stopwatches in place let us tinker with the internals and observe the result.  With tools like NHProf, we could also observe what extra fetching we might need to do along the way.  Its suggestions also keyed us in to the various improvements we added along the way.
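
The stopwatch part can be as small as the helper below.  The Timing class is a made-up name, and this is only a sketch of the idea, not our actual test harness.

```csharp
using System;
using System.Diagnostics;

public static class Timing
{
    // Runs the action once and returns how long it took.
    public static TimeSpan Measure(Action action)
    {
        var stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

Wrapping a run over a fixed sample file in Timing.Measure() before and after each change gives a quick comparison.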

Wrapping it up

Bottom line is, if you can reduce the operation to a bulk import or export, the SQL tools will be orders of magnitude faster than processing one at a time.  But if you’re forced to go through your domain model and NHibernate, just be aware that your normal NHibernate usage will not scale.  Instead, you’ll need to lean on some of the built-in features that you don’t normally use to really squeeze as much performance as you can.

Migrating to Fluent NHibernate

Recently, I’ve been entrenched in migrating our existing hbm.xml mapping files to Fluent NHibernate.  Having been through other OSS upgrades, I was expecting something along the lines of pulling teeth.  I pictured a branch, tedious work to try and move mappings over, all the while making parallel changes to changes going on back in the trunk’s hbm files.  Instead, Fluent NHibernate supports a painless migration path, really surprising me.  I haven’t finished the migration, but all of the difficult mappings are behind me, and I haven’t run into a situation that couldn’t be migrated yet.  However, I have found a few things along the way that have helped make the migration much easier.

But first, let’s see how we can take an existing NHibernate application, chock full of hbm goodness, and start the migration to Fluent NHibernate.

Switching configuration

When migrating our mapping configuration, the name of the game is to preserve as much as we can, so that we make as few changes as possible.  The more changes we introduce, the harder it becomes to tell whether a failure in our application comes from a misconfigured Fluent NHibernate or from something else.  Although Fluent NHibernate provides fluent code-based configuration, we’re not going to do that yet.  First, we’ll take our existing configuration strategy, whether it’s using the Configuration object, the hibernate.cfg.xml file, or .config file, and simply augment it so that we can integrate it into Fluent NHibernate’s configuration model.

In our case, we used the hibernate.cfg.xml file, but it all comes down to how you originally create the Configuration object.  One thing we need to augment in our configuration file:

<?xml version="1.0" encoding="utf-8" ?>
<hibernate-configuration xmlns='urn:nhibernate-configuration-2.2' >
    <session-factory>
        <property name="connection.driver_class">NHibernate.Driver.SqlClientDriver</property>
        <property name="connection.connection_string">Data Source=.\sqlexpress;Initial Catalog=IBuySpy;Integrated Security=true</property>
        <property name="show_sql">false</property>

        <property name="dialect">NHibernate.Dialect.MsSql2005Dialect</property>

        <property name="cache.provider_class">NHibernate.Caches.SysCache.SysCacheProvider,NHibernate.Caches.SysCache</property>
        <property name='proxyfactory.factory_class'>NHibernate.ByteCode.Castle.ProxyFactoryFactory, NHibernate.ByteCode.Castle</property>
        <property name="cache.use_query_cache">true</property>
        <property name="adonet.batch_size">1000</property>
        <property name="hbm2ddl.keywords">none</property>
        <mapping assembly="IBuySpy.Core" />

    </session-factory>
</hibernate-configuration>

The piece to change is the one at the bottom: the “mapping assembly” element.  We need to remove that, as we’re going to switch to use Fluent NHibernate to pick up all of our entity mapping configuration.  But other than that, leave that configuration file alone!  Whether you do web.config, hibernate.cfg.xml, or programmatic configuration, leave it alone.  Again, we don’t want to change too many things at once to make the switch to Fluent NHibernate.

The next piece is to find that part in your application that creates the NHibernate Configuration object.  That object is used to create the SessionFactory, so you’ll likely only find it in one place in your application, and only called once per AppDomain.  In any case, find the piece that instantiates a Configuration object.  It might look something like this:

using NHibernate.Cfg;

public class NHibernateBootstrapper
{
    public static Configuration Build()
    {
        var configuration = new Configuration();

        // Picks up the existing settings (hibernate.cfg.xml in our case)
        configuration.Configure();

        return configuration;
    }
}

From here, you can now download the latest Fluent NHibernate release and add a reference to it in whichever project loads NHibernate (the Core project for us).  Next, change that final piece that returns the Configuration object and replace it with this:

public static Configuration Build()
{
    var configuration = new Configuration();

    configuration.Configure();

    return Fluently.Configure(configuration)
        .Mappings(cfg =>
        {
            cfg.HbmMappings.AddFromAssemblyOf<Customer>();
        }).BuildConfiguration();
}

The Fluently class is the main window into Fluent NHibernate configuration.  We’ll use the overload of Configure() and pass in the existing Configuration object we created through our previous setup.  In our case, this means we get all the configuration from the hibernate.cfg.xml file.  This is important because we have quite a bit of deployment code around this XML file that we’d not want to change right now.

Next, we tell FNH to load our existing entity mappings: the hbm.xml files embedded as resources in the assembly that contains Customer (or any other type in the assembly where you keep your HBMs).  This lets us keep all of our existing hbm’s, but just use FNH to load them up instead.

That’s it for the initial migration.  We used our existing method for creating the Configuration object, but modified it such that Fluent NHibernate is responsible for the final Configuration object used.  We didn’t touch any of the HBMs, but merely instructed FNH to use them.

Migrating individual mapping files

Once the initial shim for Fluent NHibernate is in place, our application will work exactly as before.  But that’s not the interesting part of FNH, of course; we want to start taking advantage of the nice fluent mappings.  So what’s the general migration strategy for migrating individual hbm’s?  Pretty easy:

  1. Create a ClassMap for a single entity, matching the original hbm exactly
  2. Delete the original hbm.xml
  3. Perform a schema compare to make sure nothing’s changed
  4. Commit, rinse and repeat for each class map

That’s it!  You can move hbm’s one at a time, with no need to do an all-or-nothing switch.  FNH supports side-by-side HBM and fluent mappings seamlessly, even if you do crazy things like joined subclasses split between fluent and xml files.  Of course, you will need to add your fluent mappings to the Fluently.Configure piece, but that’s pretty straightforward:

return Fluently.Configure(configuration)
    .Mappings(cfg =>
    {
        cfg.HbmMappings.AddFromAssemblyOf<Customer>();
        cfg.FluentMappings.AddFromAssemblyOf<Customer>();
    }).BuildConfiguration();

Now as you add the fluent maps, one at a time, don’t try to “fix” the existing maps.  You can do all that after the fluent map replaces the existing HBM.  But we don’t want to change too many things at once, and modifying the DB schema and switching to FNH is too many changes at once.
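
To give a feel for step one, here is a hedged sketch of a ClassMap that mirrors a simple hbm.xml.  The Customer properties, the table name and the column names are invented for the example; a real map should copy whatever the original hbm says.

```csharp
using FluentNHibernate.Mapping;

// Hypothetical entity shape; members are virtual so NHibernate can proxy them.
public class Customer
{
    public virtual int Id { get; set; }
    public virtual string Name { get; set; }
}

public class CustomerMap : ClassMap<Customer>
{
    public CustomerMap()
    {
        // Spell out table and column names so they match the old hbm exactly.
        Table("Customer");
        Id(x => x.Id).Column("CustomerId").GeneratedBy.Identity();
        Map(x => x.Name).Column("Name").Length(100).Not.Nullable();
    }
}
```

Being explicit about names and lengths at this stage makes the schema compare in step three easier to trust.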

However, there are some things we can start doing to enhance our mappings as we go.

Discovering and enforcing conventions

Although HBM files themselves have sensible defaults, they do not allow us to create new defaults across all of our HBMs.  For example, we can’t tell our HBMs that all collections are accessed through fields, all foreign keys are suffixed with “Id” and so on.  However, as we add the fluent maps, we should keep an eye on any duplication in our maps.  We can do things like:

  • Define naming conventions for primary/foreign key columns
  • Set access for collections
  • Configure custom NHibernate types for your own custom types (like the Enumeration class)
  • Create inheritance hierarchies in our ClassMaps to match any layer supertype hierarchy in our entities
  • Build extension methods to encapsulate configuring common components (like that Address class everywhere)

We can still preserve our existing DB schema, but it’s quite likely that your team has already settled on things like naming conventions, access conventions and so on.  But with HBMs, you couldn’t describe these things en masse.  With FNH, we can add conventions as we go.  Keep an eye out for these common conventions early; it’s a lot easier to put conventions in place as you go than after your HBMs have all been converted.
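
As a sketch of what two such conventions might look like, the classes below cover the “Id” suffix on foreign keys and field access for collections.  The class names are mine, and the exact convention interfaces can differ between Fluent NHibernate versions, so treat this as an outline rather than a drop-in.

```csharp
using FluentNHibernate.Conventions;
using FluentNHibernate.Conventions.Instances;

// Foreign key columns for many-to-one references get an "Id" suffix,
// e.g. a Customer reference maps to a CustomerId column.
public class ForeignKeyIdConvention : IReferenceConvention
{
    public void Apply(IManyToOneInstance instance)
    {
        instance.Column(instance.Name + "Id");
    }
}

// One-to-many collections are accessed through camel-cased backing fields,
// e.g. an Orders property backed by an orders field.
public class CollectionFieldAccessConvention : IHasManyConvention
{
    public void Apply(IOneToManyCollectionInstance instance)
    {
        instance.Access.CamelCaseField();
    }
}
```

Conventions like these would then be registered alongside the fluent mappings, for example through the Conventions property on FluentMappings, so that individual ClassMaps no longer need to repeat the same settings.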

I still can’t get over how easy it was to migrate to FNH.  It was completely seamless, and we were able to take advantage of the conventions almost immediately.  Strongly-typed mappings are great, but it’s been the conventions that have really impressed me.  If you’re still on the old HBMs, upgrade now and ditch that XML!
