Are your Models killing performance?

Sgt. Sitecore

7 years ago

Recently I blogged about my design pattern for Sitecore presentation components: Item->Repository->ViewModel->Controller->View. For brevity, let’s refer to it as Item-Repository-ViewModel.

In that post I admonished users to pay careful attention to code responsible for generating ViewModels. This post will reveal some common mistakes around Item retrieval and ViewModel generation and discuss how to get maximum efficiency out of your Repository and Model building layers.

Model Generation

Let’s have a look at the most common ViewModel strategies in Sitecore development.

Item Facades

An Item Facade is any Object that contains or descends from Sitecore.Data.Item. (CustomItem is a good example.) Until recently, Item Facades were a very popular construct in Sitecore development, and are not without advantages:

Base classes provided by Sitecore allow for easy extension and interoperability with other aspects of the Sitecore API.
Significant Item field values are exposed through properties, allowing for as-you-type code validation.
In “Custom Item” derived objects, Properties are not populated until the value is requested, which gives better performance.

Technically, objects like this violate the spirit of Item-Repository-ViewModel. They expose a significant footprint of the Sitecore API to your Views, so they are no longer considered best practice.
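For illustration, a minimal Item Facade might look something like the sketch below. The class name NewsPageItem and the field names "Headline" and "Summary" are hypothetical; CustomItem and InnerItem come from the Sitecore API. Each property reads from the inner Item only when it is requested, and the full Item remains reachable from any View that receives the object, which is exactly the leak described above.

```csharp
using Sitecore.Data.Items;

// Hypothetical facade over a "News Page" Item (names are assumptions).
public class NewsPageItem : CustomItem
{
    public NewsPageItem(Item innerItem) : base(innerItem)
    {
    }

    // Nothing is read from the Item until a property is actually used.
    public string Headline
    {
        get { return InnerItem["Headline"]; }
    }

    public string Summary
    {
        get { return InnerItem["Summary"]; }
    }
}
```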

ContentSearch API

If you’re using Content Search to retrieve data for display, you should also be taking advantage of the POCO hydration facilities available in the ContentSearch API. The IQueryable<T> style interface allows you to skip any intermediate Sitecore objects and hydrate your ViewModels directly from the search index. However, developers without an excellent working knowledge of the Content Search API and/or some understanding of the power of the Solr server that backs this API can cause all kinds of performance problems.

Boring Old Property Mapping

Let’s assume you’ve adopted the concept that Views should have discrete ViewModels and not directly interact with Items. To keep things simple, let’s also assume you are using naked Sitecore, no ORM kit like Glass or Synthesis.

Chances are your code looks something like this:

var model = new MyModel
  {
    Property1 = myItem.Fields["FieldName1"],                 // the Field object itself
    Property2 = myItem.Fields["FieldName2"].Value,           // the raw field value
    Property3 = ((LinkField)myItem.Fields["FieldName3"]).Url,
    Property4 = FieldRenderer.Render(myItem, "FieldName4")   // rendered field markup
    // etc...
  };

This works, although it is tedious and prone to typos. It’s surprisingly lightweight and therefore should perform well. You only access the Item Fields you need, and explicitly retrieve relevant Field properties directly.

ORM Frameworks

Developers who do property mapping by hand will eventually tire of it and begin looking for quality-of-life improvements. Frameworks like Constellation ModelMapper, Glass Mapper and Synthesis can remove uninteresting concerns by handling the mapping internally. Depending on the implementation of the ORM framework, you may inadvertently introduce performance problems or violate the separation of Item and View concerns:

ORM frameworks by nature rely on .NET Reflection technology to figure out how to assign Item field values to Model properties. This is a process that involves lots of little loops and can cause performance problems.
Depending upon the implementation of the ORM, you may actually be implementing an Item Facade strategy. Know your evil wizards!
Even if your ORM is completely “itemless”, pay attention to how it resolves Property values. Most ORMs are two-way: you have full read & write access to Item Field values. This flexibility introduces additional overhead on Model creation, since your Model is likely proxied by a dynamic object that has a significant amount of change-event wiring attached to it.
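To make the “lots of little loops” concrete, here is a toy mapper. It is not taken from any of the frameworks named above; ToyMapper and TeaserModel are hypothetical names, and a dictionary stands in for Item fields. It caches the reflection lookup per type, but still loops over every property each time a model is created:

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public class TeaserModel
{
    public string Headline { get; set; }
    public string Summary { get; set; }
}

public static class ToyMapper
{
    // One GetProperties() call per type, cached for later calls.
    private static readonly Dictionary<Type, PropertyInfo[]> Cache =
        new Dictionary<Type, PropertyInfo[]>();

    public static T Map<T>(IDictionary<string, string> fields) where T : new()
    {
        PropertyInfo[] properties;
        lock (Cache)
        {
            if (!Cache.TryGetValue(typeof(T), out properties))
            {
                properties = typeof(T).GetProperties();
                Cache[typeof(T)] = properties;
            }
        }

        var model = new T();

        // This loop runs for every property on every model instance created.
        foreach (var property in properties)
        {
            string value;
            if (property.CanWrite
                && property.PropertyType == typeof(string)
                && fields.TryGetValue(property.Name, out value))
            {
                property.SetValue(model, value, null);
            }
        }

        return model;
    }
}
```

Even in this stripped-down form, the work scales with the number of properties multiplied by the number of models, which is one reason small ViewModels matter.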

A Note about Code Generation

Developers who dislike mapping properties probably dislike writing ViewModels too, and may rely on code generation tools to produce models based on Item Templates. Be wary:

Automatically generated Models tend to lack the fine-grained scope of a true ViewModel, containing many more properties than are actually required. Aside from violating principles, each extra field contributes to lost performance.
The runtime resolution of a ViewModel instance from an Item can be very slow depending upon the implementation. In some cases, the ORM compensates by generating mapping tables on app startup, but this just aggregates all that time into Sitecore’s already lengthy startup process.
In a Helix-style environment, Model generation based on Templates and Inheritance may cause Feature separation headaches and namespace collisions. These may not affect system performance, but chasing these issues down will impact your developer productivity.

Item Retrieval

Repositories in our pattern are responsible for generating ViewModels from Items. While we’ve discussed some performance pitfalls in models themselves, without question the biggest Repository performance hit comes from poor decisions when retrieving Items from Sitecore.

XPath Pitfalls

While the trend is to use the ContentSearch API, sometimes it’s more legible to rely on Sitecore’s older Database object’s Item retrieval methods. When you consider that the RenderingContext object provides you with access to your Rendering’s Datasource Item as well as the Page’s Item, you have a logical stepping off point for querying the Sitecore content tree using XPath. It’s important to be able to identify performance-sapping queries.

Example

Solving XPath query problems requires some understanding of the requirements that drove the XPath creation in the first place. Let’s look at a not-unrealistic scenario:

query = "//sites/*[@@key='somesitename']/*[@@templateName='News List']/*[@@templateName='News Folder']/*[@@templateName='News Page']";

In the example above, the developer clearly wants all the News Page Items stored in one or more News Lists, but only for a named Site.

Avoid “//” like The Plague

The “//” path expression will force Sitecore to evaluate each and every Item below the query’s starting node. When Sitecore processes the XPath query, the search first goes to the bottom of each branch rather than evaluating all Items at a given tree level. This is not only extremely inefficient, but it returns Items in an order that is seldom useful, forcing yet another organizing loop on the result set. To eliminate this performance-draining operation we can replace “//” with an absolute path:

query = "/sitecore/content/tenants/*/sites/*[@@key='somesitename']/*[@@templateName='News List']/*[@@templateName='News Folder']/*[@@templateName='News Page']";

Not pretty, but it performs far better, because Sitecore no longer has to walk the entire tree.

Try to minimize Attribute parameters

Further interrogation of the developer reveals:

Only News Folders are allowed to be inserted below News Lists.
Only News Pages are allowed to be inserted below News Folders.

This immediately allows for simplification as we don’t need to specify the type of Items we’re looking for, or that of their parents:

query = "/sitecore/content/tenants/*/sites/*[@@key='somesitename']/*[@@templateName='News List']/*/*";

All searches should have context

Since this post is about the Sitecore presentation layer, one can make a few key assumptions:

The query is being called from a Rendering.
The Rendering presents a specific list of News Pages.

The following approaches in development can give us additional simplification and therefore performance:

The Rendering should have a Datasource, and that Datasource should be an Item with a Template of “News List”.
To ease the Author’s ability to select only the current site’s News List Items, the Rendering’s Datasource Location can be set to something like the following: "./ancestor-or-self::*[@@templateName='Site']/home"
We can use the Datasource Item as the context node of the query.

Putting these facts together produces code which greatly simplifies our XPath statement and retrieves the desired list of Items with excellent performance:

var query = "./*/*";
var newsList = RenderingContext.Current.ContextItem;
var items = Sitecore.Data.Query.Query.SelectItems(query, newsList);

We have now:

Improved query performance.
Made the query more portable by removing Template and root path specifications.
Improved the Author’s editing experience by forcing them to specify which News List they want to display.
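In practice the Datasource may be missing or of the wrong type, so a Repository method would probably guard against that before running the query. A minimal sketch, assuming a hypothetical NewsRepository class and the "News List" template name used above; the Query.SelectItems(query, contextNode) call is the one shown in this post:

```csharp
using System.Collections.Generic;
using System.Linq;
using Sitecore.Data.Items;

// Hypothetical Repository; only the guard logic is the point here.
public class NewsRepository
{
    public IEnumerable<Item> GetNewsPages(Item newsList)
    {
        // No Datasource, or the Author picked something that is not a News List.
        if (newsList == null || newsList.TemplateName != "News List")
        {
            return Enumerable.Empty<Item>();
        }

        // News List -> News Folder -> News Page: two levels below the context node.
        var items = Sitecore.Data.Query.Query.SelectItems("./*/*", newsList);

        // Defensive null check; treating "no matches" as possibly null is an assumption.
        return items ?? Enumerable.Empty<Item>();
    }
}
```

A Controller could then call something like repository.GetNewsPages(RenderingContext.Current.ContextItem) and hand the result to the ViewModel builder.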

XPath Considerations Rollup:

Never start a query from the root (“/” or “/sitecore”). Know the most significant node in your information architecture and start your query there.
If you don’t have a starting node in mind, consider the context of Site, and start from SiteContext.StartPath. Aside from trimming the number of inspected Items, this simple rule will also prevent the appearance of _Standard Values Items in your result set.
Know which tree level you need to query and include it in your query. “./*/*/*[@@templatename='Page']” is surprisingly efficient in comparison to “//” and can often be used to achieve the same effect.
Instead of using Database.SelectItems(), consider using Sitecore.Data.Query.Query.SelectItems(query, contextNode) which forces you to explicitly define the top node of your search and implicitly specifies the Database and Language of the query through the provided contextNode, making your query context-safe.
Remember that @@templateName and @@templateId do not support inheritance. Keep your queries future-proof by avoiding these attributes.
Store queried Items together in the Content Tree to simplify Query parameters. Every “@” or “@@” in your query incurs processing cost.
If you need to use a compound query “|” to retrieve all of your results, consider the complexity of each discrete query. If the queries are not extremely simple, consider using SearchContext instead.
If you truly need to use “//” you should switch to SearchContext.
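As a rough illustration of the “start from SiteContext.StartPath” rule, the sketch below resolves the site’s start Item and queries a fixed number of levels beneath it. SiteQueries and GetThirdLevelItems are hypothetical names, and the three-level depth is only an example:

```csharp
using Sitecore.Data.Items;
using Sitecore.Sites;

public static class SiteQueries
{
    // Hypothetical helper: returns Items three levels below the site's start Item.
    public static Item[] GetThirdLevelItems(SiteContext site)
    {
        // Resolve the site's start Item in the site's own Database.
        Item startItem = site.Database.GetItem(site.StartPath);
        if (startItem == null)
        {
            return new Item[0];
        }

        // The explicit context node keeps the query inside this site's branch.
        return Sitecore.Data.Query.Query.SelectItems("./*/*/*", startItem) ?? new Item[0];
    }
}
```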

ContentSearch Pitfalls

Sitecore’s Solr-backed ContentSearch API is an incredibly powerful and fast data access system. However, it is often hobbled by poor implementation.

Example

Here’s our poorly performing example, which was cribbed from a similar query seen in a production installation.

items = query.Where(i => i.Path.StartsWith("/sitecore/content/"))
          .Where(i => i.Name.Contains(searchWord) 
          || i.Headline.Contains(searchWord)
          || i.Content.Contains(searchWord)).ToList()
          .Where(r => r.Language == contextLanguage.Name);

There’s a lot going on here. Let’s fix the obvious things first.

.ToList() should always be last.

The .ToList() extension method actually executes the query and makes it “real”. Any further LINQ operations occurring after .ToList() are against the realized result set and are not part of the original query. As a rule of thumb, you should consider any LINQ operation after .ToList() to be an additional foreach loop through your results. Let’s re-order:

items = query.Where(i => i.Path.StartsWith("/sitecore/content/"))
          .Where(i => i.Name.Contains(searchWord) 
          || i.Headline.Contains(searchWord)
          || i.Content.Contains(searchWord))
          .Where(r => r.Language == contextLanguage.Name).ToList();

One should always be wary about using .ToList() on an IQueryable because you may be shooting yourself in the foot, particularly if you need pagination in your result set. The better UX libraries for managing paginated results can actually handle an IQueryable directly, allowing you to avoid writing pagination code yourself.
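If you do end up handling pagination yourself, the same rule applies: keep Skip() and Take() in front of .ToList() so they become part of the query rather than a trim of an already-realized list. A minimal sketch in plain LINQ; Paging and GetPage are hypothetical names and nothing here is Sitecore-specific:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class Paging
{
    // pageIndex is zero-based.
    public static List<T> GetPage<T>(IQueryable<T> query, int pageIndex, int pageSize)
    {
        return query
            .Skip(pageIndex * pageSize) // still part of the query
            .Take(pageSize)             // still part of the query
            .ToList();                  // executes once and returns a single page
    }
}
```

For example, calling Paging.GetPage(Enumerable.Range(1, 10).AsQueryable(), 1, 3) returns the second page of three: 4, 5 and 6.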

Relevance Matters, Use Filters for Speed

In general, one should order the .Where() clauses from least-specific to most-specific. With Sitecore 9, one can also take advantage of Solr’s edismax Filters, which are sort of like database Views, or Index columns, with caching. Use “Filter” on the broad strokes of the query as follows:

items = query.Filter(r => r.Language == contextLanguage.Name) 
          .Filter(i => i.Path.StartsWith("/sitecore/content/"))
          .Where(i => i.Name.Contains(searchWord) 
          || i.Headline.Contains(searchWord)
          || i.Content.Contains(searchWord))
          .OrderBy(r => r.Name).ToList();

The re-ordered and filtered query should now be significantly better performing. Here’s my filter priority rule of thumb:

Language
Search Start Location/Ancestor Item
Item Template (if applicable or necessary)

Start your search from a known location in the content tree

Just like XPath, we want to limit our search. This prevents the following from being exposed to the public in search results:

_Standard Values
Branch Templates
System settings
Dictionary Items
Rendering Definitions
Items from other Sites in the installation

In our example, there’s an attempt to establish a path, but the syntax is incorrect. Here’s the example with the correct syntax for establishing a “context” node for the search:

var siteRoot = contextDatabase.GetItem(contextSite.StartPath, contextLanguage);
items = query.Filter(r => r.Language == contextLanguage.Name) 
          .Filter(i => i.Paths.Contains(siteRoot.ID))
          .Where(i => i.Name.Contains(searchWord) 
          || i.Headline.Contains(searchWord)
          || i.Content.Contains(searchWord))
          .OrderBy(r => r.Name).ToList();

The magic is in i.Paths.Contains(ID) which provides a high-performance way to limit a query to a specific area of the Content Tree.
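Combining this with the filter priority rule of thumb from earlier, a query that also restricts by Template might be sketched as follows. FilterOrderExample and its parameters are hypothetical; the sketch uses the stock SearchResultItem type, so it only searches the Name field:

```csharp
using System.Collections.Generic;
using System.Linq;
using Sitecore.ContentSearch.Linq;
using Sitecore.ContentSearch.SearchTypes;
using Sitecore.Data;

public static class FilterOrderExample
{
    public static List<SearchResultItem> Search(
        IQueryable<SearchResultItem> query,
        string languageName,
        ID startItemId,
        ID templateId,
        string searchWord)
    {
        return query
            .Filter(i => i.Language == languageName)     // 1. Language
            .Filter(i => i.Paths.Contains(startItemId))  // 2. Start location / ancestor Item
            .Filter(i => i.TemplateId == templateId)     // 3. Template, when it applies
            .Where(i => i.Name.Contains(searchWord))     // the scored part of the query
            .ToList();
    }
}
```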

Use .Like() and .Boost() to search text

Unfortunately, it’s very difficult to divine the correct way to search for text within a field. The answer is not C# intuition. It requires an understanding of the Solr query parsers used behind the scenes. Here are some basic takeaways:

Use .Like(text, slop) instead of .Contains()
Slop is the allowed distance between words in a phrase. 0.0f means the phrase must occur as supplied. It’s a safe default value.
Use .Boost(float) to set matching priority when you’re searching multiple fields with a query.
The .Boost() value needs to be between 0 and 1, and no two .Boost() values should be the same in a given query. Higher numbers give a field higher matching priority.

Here’s our example with Like, slop, and Boost:

var siteRoot = contextDatabase.GetItem(contextSite.StartPath, contextLanguage);

var slop = 0.0f;
var nameBoost = 1.0f;
var headlineBoost = 0.9f;
var contentBoost = 0.8f;

items = query.Filter(r => r.Language == contextLanguage.Name) 
          .Filter(i => i.Paths.Contains(siteRoot.ID))
          .Where(i => i.Name.Like(searchWord, slop).Boost(nameBoost)
          || i.Headline.Like(searchWord, slop).Boost(headlineBoost)
          || i.Content.Like(searchWord, slop).Boost(contentBoost))
          .ToList();

The above will produce very “natural” looking search results for the search term, and do it very quickly.

Use Query<T> to build your models for you

This is bad:

var results = query.ToList();
foreach (var result in results)
{
  modelList.Add(mapper.MapToNew<ViewModel>(result.GetItem()));
}

We’re ruining performance by adding a complete loop through the result set and hitting the Sitecore database for the full Item for each record in our search results, just so we can hydrate ViewModels.

This is a significantly better approach:

public class ViewModel : Sitecore.ContentSearch.SearchTypes.SearchResultItem
{
  [IndexField("field1")]
  public string Property1 {get; set;}

  [IndexField("field2")]
  public string Property2 {get; set;}
  // etc...
}

// and now in your Repository...
public IEnumerable<ViewModel> GetModel(Item contextItem, Site contextSite, Language contextLanguage)
{
  // "context" is the search context for your index; creating it is not shown here.
  IQueryable<ViewModel> query = context.GetQueryable<ViewModel>();
  query = query.Where(...something...);

  return query.ToList(); // A list of ViewModels for your Controller.
}

ViewModel is now a SearchResultItem and can be hydrated directly by the ContentSearch API. No extra loops, and we never hit the Sitecore database.

Your ViewModel doesn’t have to descend from SearchResultItem

Inheriting from SearchResultItem has some drawbacks:

SearchResultItem contains many methods and facade properties that hide potential database calls.
SearchResultItem contains many facts that your View doesn’t need.
Assuming you’re building a modern, AJAX-enabled web application, SearchResultItem has several properties that resist basic JSON serialization.

Consider the following alternative base class:

public class SearchResultViewModel
{
  #region Utility Fields borrowed from SearchResultItem
  [IndexField("_group")] 
  [ScriptIgnore] 
  [TypeConverter(typeof(IndexFieldIDValueConverter))] 
  public virtual ID ItemId { get; set; }   

  [IndexField("_name")] 
  [ScriptIgnore] 
  public virtual string Name { get; set; }   

  [IndexField("_database")] 
  [ScriptIgnore] 
  public virtual string DatabaseName { get; set; }   

  [IndexField("_language")] 
  [ScriptIgnore] 
  public virtual string Language { get; set; }   

  [IndexField("_path")] 
  [ScriptIgnore]    
  [TypeConverter(typeof(IndexFieldEnumerableConverter))] 
  public IEnumerable<ID> Paths { get; set; }

  [IndexField("_template")] 
  [ScriptIgnore] 
  [TypeConverter(typeof(IndexFieldIDValueConverter))] 
  public virtual ID TemplateId { get; set; }
  #endregion
}

A base class like this provides the critical search facets needed to build queries, but intentionally hides them from JSON serialization, and contains no performance-sucking methods to trip up developers building Views.
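A concrete ViewModel built on this base class, and a Repository method that hydrates it, might look something like the following. This is a sketch under assumptions: NewsTeaserViewModel, NewsTeaserRepository, the "headline" and "summary" index field names, and the "sitecore_web_index" index name are all placeholders to swap for your own.

```csharp
using System.Collections.Generic;
using System.Linq;
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.Linq;
using Sitecore.Data.Items;

// Hypothetical ViewModel; the index field names are assumptions.
public class NewsTeaserViewModel : SearchResultViewModel
{
    [IndexField("headline")]
    public string Headline { get; set; }

    [IndexField("summary")]
    public string Summary { get; set; }
}

public class NewsTeaserRepository
{
    public IList<NewsTeaserViewModel> GetTeasers(Item siteRoot, string languageName)
    {
        // Assumed index name; use whichever index serves your site.
        var index = ContentSearchManager.GetIndex("sitecore_web_index");

        using (var context = index.CreateSearchContext())
        {
            // The utility fields on the base class drive the filters;
            // only Headline and Summary are meant for the View.
            return context.GetQueryable<NewsTeaserViewModel>()
                .Filter(m => m.Language == languageName)
                .Filter(m => m.Paths.Contains(siteRoot.ID))
                .ToList();
        }
    }
}
```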

Next Steps

We’ve discussed performance pitfalls you can avoid in your Repositories. Here’s a small list of things to remember:

Use the Item-Repository-ViewModel pattern to isolate Item retrieval.
Use the ContentSearch API when possible for performance.
For both ContentSearch and XPath, establish a Context Item for your searches.
Build your ViewModels as small as possible:
- This enforces separation of concerns
- If you use a model-mapping technology, fewer ViewModel properties means fewer loops in .NET Reflection land.
- Serializing ViewModels is easier if you start from scratch. You can then control the serialization method of each Property.

Following the recommendations in this post will keep your Sitecore build speedy and bug free. Time to go inspect your own code!
