Wednesday, July 30, 2008

Common Base Class for LINQ to SQL

Language-Integrated Query (LINQ) is a set of features in Visual Studio 2008 that extends powerful query capabilities to the language syntax of C# and Visual Basic. As a part of LINQ, LINQ to SQL provides a run-time architecture for managing relational data as objects. To some extent, it equals to ORM tool or framework such as NHibernate and Castle based on .NET framework. It becomes our preferred choice gradually when we want to access the database.

In LINQ to SQL, all variables in the Data Model of a relational database can be strongly typed which provides the benefit of compile-time validation and IntelliSense. We can fetch the data from the database using query expression (it includes query syntax and method syntax.)

However, the strongly typed feature is not conducive to abstract the common logic of data operations so the developer has to define a specific class to handle the entity object. It results in a large number of repeated codes. If we can implement the base class which encapsulates the common operations such as Select, Where, Add, Update and Delete, it will be useful for N-Tier application.

Fortunately, the generic type will help us to achieve our goal. We can invoke a method called GetTable() in DataContext of LINQ. For example, we can implement the Where method which accepts the Lambda expression to find the result we want to get:

public IList<TEntity> Where(Func<TEntity, bool> predicate)

{

    InitDataContext();

    return m_context.GetTable<TEntity>().Where(predicate).ToList<TEntity>();

}

 

It’s simple. Even, we can expose the method which accepts the condition clause using Dynamic Query:

public static class DynamicQueryable

{

    public static IQueryable<T> Where<T>(this IQueryable<T> source, string predicate, params object[] values)

    {

        return (IQueryable<T>)Where((IQueryable)source, predicate, values);

    }

 

    public static IQueryable Where(this IQueryable source, string predicate, params object[] values)

    {

        if (source == null) throw new ArgumentNullException(“source”);

        if (predicate == null) throw new ArgumentNullException(“predicate”);

        LambdaExpression lambda = DynamicExpression.ParseLambda(source.ElementType, typeof(bool), predicate, values);

        return source.Provider.CreateQuery(

            Expression.Call(

                typeof(Queryable), “Where”,

                new Type[] { source.ElementType },

                source.Expression, Expression.Quote(lambda)));

    }

}

public IList<TEntity> Where(string predicate, params object[] values)

{

    InitDataContext();

    return m_context.GetTable<TEntity>().Where(predicate, values).ToList<TEntity>();

}

 

Of course, the query function is not a problem because we don’t need the properties of the entity when we invoke these methods and don’t care about the composition of Lambda Expression.

The key issue is how to update or delete the record of data table. You know, we must fetch the entity which is going to be operated before updating or deleting it. The keyword to search is often its identity. Furthermore, Object Identity and Change Tracking in Data Context need the object’s identity to track the changing of it. Usually, we would add the Id column which is identity or Guid for each table. Then we can fetch the entity object according to it:

    public void Update(Employee employee)

    {

        LinqSampleDataContext context = new LinqSampleDataContext();

 

        Employee emp = this.Where(e => e.EmployeeID == employee.EmployeeID);

        emp.FirstName = “First Name”;

        emp.LastName = “Last Name”;

 

        context.SubmitChanges();

    }

 

Obviously, we don’t know the entity’s properties if we use the generic type such as TEntity like above. Without the keyword how do we associate the changes with the existing record? LINQ introduces the Attach method which can attach the changed object to the context. There are three overloaded versions as below:
Attach(Object entity): Attaches an entity to the DataContext in an unmodified state;
Attach(Object entity, bool asModified): Attaches all entities of a collection to the DataContext in either a modified or unmodified state.
Attach(Object entity, Object orginal): Attaches an entity to the DataContext in either a modified or unmodified state by specifying both the entity and its original state.

Attach method is used to associate the deserialized entities to a new instance of a DataContext. However, we can associate the entity in one DataContext to another DataContext by using it. In update or delete case, this method is very useful. For example, we want to update the record according to a changed entity from another DataContext:

    public void Update(TEntity changedEntity)

    {

        InitDataContext();

 

        try

        {

            m_context.GetTable<TEntity>().Attach(changedEntity, true);

            m_context.SubmitChanges();

        }

        catch (ChangeConflictException)

        {

            m_context.ChangeConflicts.ResolveAll(RefreshMode.KeepCurrentValues);

            m_context.SubmitChanges();

        }

    }

 

It looks good, but not enough. We can’t attach a changed entity unless the entity has a TimeStamp column in the corresponding table. Therefore, we should add a column whose type is TimeStamp in SQL Server 2005, or set the IsVersion property to true for Id property in LINQ to SQL Designer. My advice is to create a TimeStamp column for your datatable, and it will improve the performance because it won’t check all columns whether they are changed during handling concurrency.

In fact, we can pass the original entity with generic type also. The only problem is how to pass the values you want to change. The solution to this problem is using Action delegate. The code snippet is as below:

    public void Update(TEntity originalEntity, Action<TEntity> update)

    {

        InitDataContext();

        try

        {

            m_context.GetTable<TEntity>().Attach(originalEntity);

            update(originalEntity);

 

            m_context.SubmitChanges();

        }

        catch (ChangeConflictException)

        {

            m_context.ChangeConflicts.ResolveAll(RefreshMode.KeepCurrentValues);

            m_context.SubmitChanges();

        }

    }

 

Now we can invoke it passing Lambda expression like this:

    [TestMethod()]

    public void UpdateWithAction()

    {

        LinqSampleDataContext context = new LinqSampleDataContext();

        EmployeeAccessor accessor = new EmployeeAccessor();

 

        Employee employee = context.Employees.Single(e => e.EmployeeID == 1);

 

        accessor.Update(employee, t => { t.FirstName = “First”; t.LastName = “Last”; });

    }

 

Unfortunately, the test case can’t pass sometimes. It will throw a NotSupportedException. And the message of exception is:
An attempt has been made to Attach or Add an entity that is not new, perhaps having been loaded from another DataContext. This is not supported.

Why? What happens? The real reason is the entity we handle has some associations with other entities. Please see figure 1:

linqbaseclass01

Figure 1 : The entity with association

If you remove all relationships in Employee table, and re-generated the data model, the test case would pass.

What shall we do? Of course, to remove all relationships of table explicitly is not a good way to solve the issue. It will impact on the whole data model and can’t be accepted when we develop the application. I found one guy, Steve Michelotti, raised one solution to solve it. That was to provide one method called Detach to remove the relationships using partial class:

public partial class Contact

{

    public void Detach()

    {

        foreach (Address address in this.Addresses)

        {

            address.Detach();

        }

    }

}

 

public partial class Address

{

    public void Detach()

    {

        this._AddressType = default(EntityRef<AddressType>);

        this._State = default(EntityRef<State>);

    }

}

 

Good job! But it is not perfect. First, it’s too complex because we must define the Detach method for every entity which has association with others. Second, we can’t abstract it to the base class in this way. In the base class, we don’t know the specific type of TEntity. In this case, we should turn to the reflection technology. It’s my way:

    private void Detach(TEntity entity)

    {

        foreach (FieldInfo fi in entity.GetType().GetFields(BindingFlags.NonPublic | BindingFlags.Instance))

        {

            if (fi.FieldType.ToString().Contains(“EntityRef”))

            {

                var value = fi.GetValue(entity);

                if (value != null)

                {

                    fi.SetValue(entity, null);

                }

            }

            if (fi.FieldType.ToString().Contains(“EntitySet”))

            {

                var value = fi.GetValue(entity);

                if (value != null)

                {

                    MethodInfo mi = value.GetType().GetMethod(“Clear”);

                    if (mi != null)

                    {

                        mi.Invoke(value, null);

                    }

 

                    fi.SetValue(entity, value);

                }

            }

        }

    }

 

For EntityRef fields, we may set their values to null by calling the SetValue of FieldInfo class to remove the relationship. However, we can’t do EntitySet in the same way because it is a collection. If set to null, it will throw exception. So I get the method information of the field and invoke the Clear method to clear all items in this collection. Finally, my implementation of Update method is as below:

    public void Update(TEntity originalEntity, Action<TEntity> update, bool hasRelationship)

    {

        InitDataContext();

        try

        {

            if (hasRelationship)

            {

                //Remove the relationship between the entitis

                Detach(originalEntity);

            }

            m_context.GetTable<TEntity>().Attach(originalEntity);

            update(originalEntity);

 

            m_context.SubmitChanges();

        }

        catch (ChangeConflictException)

        {

            m_context.ChangeConflicts.ResolveAll(RefreshMode.KeepCurrentValues);

            m_context.SubmitChanges();

        }

    }

 

Delete operation is similar except we don’t need invoke the second version of Attach (Attach(object entity, bool asModified)). Here is the code snippet:

    public void Delete(TEntity entity, bool hasRelationship)

    {

        InitDataContext();

        try

        {

            if (hasRelationship)

            {

                //Remove the relationship between the entities;

                Detach(entity);

            }

 

            m_context.GetTable<TEntity>().Attach(entity);

            m_context.GetTable<TEntity>().DeleteOnSubmit(entity);

            m_context.SubmitChanges();

        }

        catch (ChangeConflictException)

        {

            m_context.ChangeConflicts.ResolveAll(RefreshMode.KeepCurrentValues);

            m_context.SubmitChanges();

        }

    }

    public void Delete(IList<TEntity> entities, bool hasRelationship)

    {

        InitDataContext();

        try

        {

            if (hasRelationship)

            {

                //Remove the relationship

                foreach (TEntity entity in entities)

                {

                    Detach(entity);

                }

            }

 

            m_context.GetTable<TEntity>().AttachAll(entities);

            m_context.GetTable<TEntity>().DeleteAllOnSubmit(entities);

 

            m_context.SubmitChanges();

        }

        catch (ChangeConflictException)

        {

            m_context.ChangeConflicts.ResolveAll(RefreshMode.KeepCurrentValues);

            m_context.SubmitChanges();

        }

    }

 

Don’t worry about the correctness of final result when we remove the relationship between the entities. The Attach method is just responsible for associating the entity to a new instance of DataContext to track the change. When you submit the changes, DataContext will check the real value in the mapping database and update or delete the record according to the passed entity. Especially, you should take an action such as Cascade in the database if you want to cascade delete between foreign key table and primary key table, see figure 2.

linqbaseclass02

Figure 2: Set the delete rule

If no action, the System.Data.SqlClient.SqlException will be thrown and the message of it when you delete the entity as Primary Key table, like this:
The DELETE statement conflicted with the REFERENCE constraint “FK_Orders_Employees”. The conflict occurred in database “Northwind”, table “dbo.Orders”, column ‘EmployeeID’.

Maybe you notice the InitDataContext method is invoked in all methods to access the data. Its implementation like this:

    private TContext m_context = null;

 

    private TContext CreateContext()

    {

        return Activator.CreateInstance<TContext>() as TContext;

    }

 

    private void InitDataContext()

    {

        m_context = CreateContext();

    }

 

Why do we need create a new instance of DataContext for each method? The reason is Caching policy in DataContext. If you create a new instance of DataContext and query the data from the database though by it, then change its value and execute the next query though by the same instance, the DataContext will return the data stored in the internal cache rather than remapping the row to the table. For more information, please refer to the Linq in Action.

So, the best practice is to create a new instance of DataContext for each operation. Don’t worry about the performance, the DataContext is lightweight resource.

Let’s consider about the concurrency issue. The default option is Optimistic Concurrency. When the values are to be saved, the DataContext would check the previous values to see if they have been changed. If conflict occurs, the DataContext would need to know whether to automatically overwrite the previous changes, keep the previous changes, or somehow merge the changes.

The concurrency issue is not within the scope of this article. We can’t say which one of three ways is the best or worst, it depends on the business situation. Usually, I will handle concurrency by the rule of last submit win. So I encapsulate the SubmitChanges method and define it as visual method, if necessary, the subclass may override it:

public class AccessorBase<TEntity, TContext>

    where TEntity : class, new()

    where TContext : DataContext, new()

{

    private TContext m_context = null;

 

    /// <summary>

    /// It provides the default policy to handle the corrency conflict

    /// </summary>

    /// <param name=”context”>Data Context</param>

    protected virtual void SubmitChanges(TContext context)

    {

        try

        {

            context.SubmitChanges(ConflictMode.ContinueOnConflict);

        }

        catch (ChangeConflictException)

        {

            context.ChangeConflicts.ResolveAll(RefreshMode.KeepCurrentValues);

            context.SubmitChanges();

        }

        catch (Exception ex)

        {

            throw ex;

        }

    }

 

    /// <summary>

    /// Update the entity which was passed

    /// The changedEntity cann’t have the relationship between the entities

    /// </summary>

    /// <param name=”originalEntity”>It must be unchanged entity in another data context</param>       

    /// <param name=”update”>It is Action<T>delegate, it can accept Lambda Expression.</param>

    /// <param name=”hasRelationship”>Has relationship between the entities</param>

    public void Update(TEntity originalEntity, Action<TEntity> update, bool hasRelationship)

    {

        InitDataContext();

 

        try

        {

            if (hasRelationship)

            {

                //Remove the relationship between the entitis

                Detach(originalEntity);

            }

 

            m_context.GetTable<TEntity>().Attach(originalEntity);

 

            update(originalEntity);

 

            SubmitChanges(m_context);

        }

        catch (InvalidCastException ex)

        {

            throw ex;

        }

        catch (NotSupportedException ex)

        {

            throw ex;

        }

        catch (Exception ex)

        {

            throw ex;

        }

    }

}

 

Now, we have a common base class which can be derived to handle the entity. For example:

public class EmployeeAccessor : AccessorBase<Employee, NorthwindDataContext>

{

}

 

You don’t need implement any method. It’s more convenient to access the Employee table by using an instance of EmployeeAccessor:

    [TestMethod()]

    public void UpdateEmployee()

    {

        EmployeeAccessor accessor = new EmployeeAccessor();

        IList<Employee> entities = accessor.Where(e => e.EmployeeID == 1);

 

        if (entities != null && entities.Count > 0)

        {

            entities[0].FirstName = “Bruce”;

            entities[0].LastName = “Zhang”;

 

            accessor.Update(entities[0], true, true);

        }

    }

 

You may let Employee entity derive my base class directly even:

public partial class Employee : AccessorBase<Employee, NorthwindDataContext>

{

}

 

Its behavior is very similar with Rich Domain Model like Martin Fowler said in the article which title is Anemic Domain Model.

 

Download Source Code: LinqSample_Src.zip     368KB

 

Note: The database which my sample use is Northwind in SQL Server 2005 and I add a new column which type is TimeStamp for each table.

No comments: