What does a great Sitecore implementation look like?
Thoughts from a software engineer who has spent 10+ years architecting, building, enhancing, and upgrading Sitecore implementations
Traditional Sitecore vs Frameworks
Traditional Sitecore MVC applications start at a much lower level than newer frameworks like Sitecore SXA or XM Cloud that have been built on top of Sitecore MVC. There is no base application to start from and the Sitecore Information Architecture (IA) is for all intents and purposes empty. While this gives the development team incredible freedom to build out the implementation as they wish, it also leaves the team wide open to issues that can be challenging to fix toward the end of an implementation. These development choices can often impact the user experience for content authors as well as limit the out of the box functionality that Sitecore provides.
This article will review common issues that I’ve seen in more than 10 years of building and maintaining Sitecore applications. This isn’t to say that any implementation that has some of these issues is a bad implementation. It’s more of a list of red flags to look for to get a better feel for where the implementation is today.
What a great MVC implementation looks like
A great implementation can be a joy to work on. It’s intuitive, follows consistent patterns, and it’s easy to add features or resolve issues. I’ve found that there are 5 key areas of an implementation that can make or break the development experience: workstation setup, dependency management, the information architecture, the solution architecture, and hotfixes.
Workstation Setup
A new developer on a project should be able to spin up a development environment very quickly. A developer on a great implementation can be up and running within 15 to 20 minutes of getting started. A developer working on a less than optimal solution can take multiple days to get up and running.
Easy Startup
The best implementations have the bootstrap and startup processes scripted. A new developer shouldn’t have to know every detail of the implementation just to get up and running. Older implementations often have a bootstrap script that verifies system dependencies, installs Sitecore, and synchronizes serialized templates and content into the databases. This lets new developers hit the ground running with an up-to-date environment very quickly.
More recent versions of Sitecore are built around Docker containers. Docker containers can be thought of like virtual machines that are built via scripts that the development team creates. This allows for an even better development experience because whenever the underlying containers are updated, each development machine gets the updates as well. As all of the dependencies for Sitecore are in the containers, getting up and running with the latest version is as easy as running the start.ps1 script.
If, on the other hand, developers are asking about database backups, Sitecore installation errors, login credentials, or serialization errors, you may have some startup issues to resolve.
Documentation
Great implementations have a Readme.md file in the codebase. This documentation isn’t meant to go into great depth about the implementation. Rather, it covers getting environments up and running as well as high level information about the solution in general. It’s also a great place for documenting runtime options and environment variable options.
Content Data Updates
Templates in Sitecore are expected to be in source control. The content items defined by the templates are typically not stored in source control as that can mean a very large number of files. Typically there is sample content stored in source control that provides enough content to launch and demo the functionality of the site.
What happens though when a bug is reported in production that isn’t reproducible with development content? Content needs to be synchronized from a higher level environment down to a developer’s workstation. In modern environments this would be handled via the Sitecore command line interface (CLI). It can be helpful to keep the CLI environment and configurations stored in a separate repository to keep the primary repository clean.
This approach replaces the outdated technique of using Sitecore packages to pass templates and/or content items between environments. Creating packages typically takes longer and is more error prone than using the CLI.
Dependency Management
Dependency management is critically important for any software solution but there are some unique considerations for Sitecore implementations. This includes library references, Sitecore references, third party dependencies, and Sitecore content packages (historically provided by Sitecore).
Versioning
Sitecore as a product is unlike many other enterprise applications in that any code modifications are deployed on top of the existing deployment. This leads to the overwriting of the out of the box DLL and/or configuration files with ones from the developer’s solution/deployment artifacts. It’s important that DLL versions remain consistent with the Sitecore versions.
This can be problematic when installing NuGet packages that have different requirements than the deployed Sitecore versions. It’s typically common libraries like Newtonsoft.Json that are impacted. Care should be taken to confirm that changing the version doesn’t introduce breaking changes within Sitecore. I would also typically expect to see a Packages.props file that defines the specific versions of libraries used throughout the solution.
While it’s not uncommon to see some version changes in an implementation, it would be a red flag to see many DLLs with inconsistent versions. You can find the assembly version in the release notes for a particular Sitecore release.
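A quick way to spot this red flag is to compare the versions the solution pins against the versions that ship with the deployed Sitecore release. The following is a minimal sketch of that comparison in Python, used here only for brevity. The library lists and version numbers are made-up placeholders, not the contents of any real Sitecore release, and the function name is hypothetical.

```python
# Placeholder data (assumption): versions shipped with the deployed Sitecore
# release and versions pinned by the solution. Real values would come from the
# release's published assembly list and from the solution's Packages.props.
SITECORE_ASSEMBLIES = {
    "Newtonsoft.Json": "11.0.2",
    "Microsoft.Extensions.DependencyInjection": "2.1.1",
}

SOLUTION_PACKAGES = {
    "Newtonsoft.Json": "13.0.1",
    "Microsoft.Extensions.DependencyInjection": "2.1.1",
    "Some.Custom.Library": "1.4.0",  # not shipped by Sitecore, so never flagged
}


def find_version_mismatches(shipped, pinned):
    """Return {library: (shipped_version, pinned_version)} for every library
    that Sitecore ships and the solution pins to a different version."""
    return {
        name: (shipped[name], version)
        for name, version in pinned.items()
        if name in shipped and shipped[name] != version
    }


mismatches = find_version_mismatches(SITECORE_ASSEMBLIES, SOLUTION_PACKAGES)
for name, (shipped, pinned) in sorted(mismatches.items()):
    print(f"{name}: Sitecore ships {shipped}, solution pins {pinned}")
# prints: Newtonsoft.Json: Sitecore ships 11.0.2, solution pins 13.0.1
```

One or two entries in that output may be a deliberate, tested choice. A long list is the red flag described above.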
Sitecore Dependencies
Older implementations would typically have a source controlled version of the Sitecore reference DLLs like Sitecore.Kernel.dll. Modern implementations will use package references that refer directly to Sitecore’s NuGet feed. Keeping DLL dependencies within source control can be problematic as it both increases the size of the repository and complicates paths to those libraries.
Third Party or Private Dependencies
Third party libraries that don’t have public NuGet feeds should be stored in an enterprise-hosted NuGet repository. This includes any hotfix DLLs provided by Sitecore. Azure DevOps provides NuGet repository hosting called Azure Artifacts that can be used for this purpose.
This allows for the third party DLLs to be deployed as part of the solution when included as a package reference.
Information Architecture
Sitecore’s Information Architecture (IA) is extremely limited out of the box. While this on one hand allows for an extremely flexible IA, it often results in unforeseen complications down the road. Often these complications relate to adding additional sites. Does the IA support regional sites? Or what about separate sites per language? How is content shared between sites? All of these are directly impacted by the IA, and it can be challenging to change by the time the issues actually arise.
Site Structure
A good Information Architecture (IA) in Sitecore clearly organizes your site’s content, structure, and data in a logical, scalable, and maintainable manner. The content is structured around how content authors use and navigate the content rather than how an organization is structured internally. A good Sitecore implementation typically has well thought out approaches for multisite, shared content, data sourcing, external content, and template design.
Shared Content
Shared content is any content that is typically shared between sites. This type of content could be anything from press releases, news articles, or even shared taxonomy.
Data Sourcing
A Sitecore page is composed of a Layout and Renderings. Each layout has many renderings on it. The header, the footer, calls to action (CTAs), text areas, etc. are all typically renderings (often referred to as components). Sitecore’s magic is its ability to change the content of renderings based upon all types of personalization rules that can truly provide unique experiences to site visitors. This magic is powered through the use of rendering data sources.
The data source on a rendering points to a content item that could contain many types of content like text fields or images. These items are typically built with the rendering in mind. A CTA content item might contain an image field, a text field, and a target URL field. That would allow content authors to dynamically switch out the CTA based upon whatever rules they like.
Multivariate testing (similar to A/B Testing) is also powered by rendering data sources. Typically each option being tested is a different data source on the rendering. A typical red flag on a Sitecore implementation is the lack of use of rendering data sources. A significant amount of out of the box functionality is lost without them.
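To make the idea concrete, here is a minimal conceptual sketch of a rendering that swaps its data source based on rules. It is written in Python for brevity and is not Sitecore’s API: the Rendering class, the visitor dictionary, and the item paths are all hypothetical, and the "first matching rule wins, otherwise use the default" behavior is an assumption of the sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A visitor is modeled as a plain dictionary of attributes (assumption).
Visitor = Dict[str, str]


@dataclass
class Rendering:
    """Conceptual model of a rendering with a default data source and
    an ordered list of (condition, data source path) personalization rules."""
    name: str
    default_data_source: str
    rules: List[Tuple[Callable[[Visitor], bool], str]] = field(default_factory=list)

    def resolve_data_source(self, visitor: Visitor) -> str:
        # Check the rules in order; the first condition that matches wins.
        for condition, data_source in self.rules:
            if condition(visitor):
                return data_source
        # No rule matched, so fall back to the default content item.
        return self.default_data_source


cta = Rendering(
    name="CTA",
    default_data_source="/Data/CTAs/Default",
    rules=[(lambda v: v.get("country") == "CA", "/Data/CTAs/Canada Promo")],
)

print(cta.resolve_data_source({"country": "CA"}))  # prints: /Data/CTAs/Canada Promo
print(cta.resolve_data_source({"country": "US"}))  # prints: /Data/CTAs/Default
```

The rendering code never changes here. Only the content item it reads from does, which is why hard-coding content into a rendering gives up personalization and testing.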
Too Much Content
Teams building their first Sitecore implementation often fall into a bit of a trap: everything on the website should be in Sitecore. They’ve realized the power of Sitecore and want to leverage every bit of that power throughout the implementation. The reality though is that there can be too much data stored in Sitecore.
Under the hood, all Sitecore items are stored in a single SQL table. That does give you room for many, many items before you run into performance issues but it does put a scalability limit on the content management (CM) environment. A good rule of thumb is to only add items into Sitecore if content authors are expected to edit that content.
An example of this would be a commerce site that sells car parts. There could potentially be millions of SKUs that would correspond to different Sitecore items. It would be better for that data to live in its own SQL database rather than in Sitecore directly. Often a single page in Sitecore renders all of those SKUs as a product page. This same approach applies to any content that is controlled by an external system of record. A red flag here would be seeing Sitecore bucket items containing vast numbers of items that all appear to be content editable.
Templates and Template Inheritance
Templates are the definitions for content items in Sitecore. You might have a CTA template that has fields for Text Message, Image, and Url. This would allow you to make as many CTA items as you like that each have those fields. Template inheritance is where you can make a base template, let’s say the one we just defined, and use it to create additional templates.
An example of this would be to create a Product CTA and a Newsletter CTA. Both would inherit the fields from the Base CTA but also have their own distinct fields:
Product CTA:
  Text Message
  Image
  Url
  Product Name
  Short Product Description
Newsletter CTA:
  Text Message
  Image
  Url
  Newsletter Byline
You can keep going down the rabbit hole with this and at first glance it seems like a good thing. A common practice in software development is Don’t Repeat Yourself (DRY) and this seems to fit that perfectly. The downside only comes when you need to reorganize the content.
Finding where a field comes from becomes a more challenging process. If a template is only one level deep it isn’t much of an issue. I have seen inheritance structures going 10 layers deep and that is more of a problem. There is no simple way to track a field down.
Changes to base templates can cause unexpected regressions. With many developers on a project it often happens that base templates get used in more places than the original developer of a feature may have expected. An Author base template could be used on nearly any content type. Any change to the base type would impact any rendering that is expecting the old type. This would typically be the type of thing you would expect to catch in a code review, but serialized Sitecore content is much more difficult to review than plain code changes in a pull request.
Additionally, template fields can share the same name. Many base templates will have fields named something like Display Name. Now imagine that your child template has 3 base templates all with the same field name. Where is the data coming from? It becomes a bit of a challenge to find out.
Generally speaking, I would consider any implementation that uses template inheritance more than one level deep to be a red flag and likely technical debt that will come back to bite you someday.
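Both problems, deep inheritance and duplicate field names, are easy to detect once the template definitions are available as data. The sketch below is a minimal illustration in Python, used only for brevity. The template dictionary is hypothetical (in practice it might be built from serialized template items), the Promo and Featured Product CTA templates are invented for the example, and the sketch ignores the standard fields that every Sitecore template inherits.

```python
# Hypothetical template definitions: name -> (base templates, own fields).
TEMPLATES = {
    "Base CTA": ([], ["Text Message", "Image", "Url"]),
    "Product CTA": (["Base CTA"], ["Product Name", "Short Product Description"]),
    "Newsletter CTA": (["Base CTA"], ["Newsletter Byline"]),
    "Author": ([], ["Display Name"]),
    "Promo": ([], ["Display Name"]),
    "Featured Product CTA": (["Product CTA", "Author", "Promo"], []),
}


def inheritance_depth(name, templates):
    """Number of inheritance levels above a template (0 = no base templates).
    Assumes the inheritance graph has no cycles."""
    bases, _ = templates[name]
    if not bases:
        return 0
    return 1 + max(inheritance_depth(base, templates) for base in bases)


def field_sources(name, templates):
    """Map each field name to the templates that define it, walking up
    through every base template exactly once."""
    sources = {}
    stack, seen = [name], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        bases, fields = templates[current]
        for field_name in fields:
            sources.setdefault(field_name, []).append(current)
        stack.extend(bases)
    return sources


depth = inheritance_depth("Featured Product CTA", TEMPLATES)
ambiguous = {
    field_name: sorted(owners)
    for field_name, owners in field_sources("Featured Product CTA", TEMPLATES).items()
    if len(owners) > 1
}

print(depth)      # prints: 2
print(ambiguous)  # prints: {'Display Name': ['Author', 'Promo']}
```

A depth greater than one or a non-empty ambiguous result is a prompt for a closer look at the template design.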
Solution Architecture
Helix
Helix is the recommended architecture for more traditional Sitecore MVC or SXA solutions that would typically be hosted on-prem or platform as a service (PaaS). It provides structural guidelines for the files in the solution, best practices for dependency management, and guidelines for the information architecture. Generally speaking an implementation should follow Helix principles provided that it isn’t headless, XM Cloud, or following a composable DXP strategy.
A solution is typically broken down into three layers: Project, Feature, and Foundation. Typically a Helix solution will break the source tree down into the same structure:
Src
  Project
  Feature
  Foundation
These three layers also tie directly into dependency management. The project layer only contains configuration, styling, layouts, and the composition of features. That means that a Project layer can only reference Feature and Foundation projects, never other Projects.
Features contain the business logic and reusable functionality that is independent of a Project. This is typically for functionality like navigation, search, banners, or calls to action. Features can reference Foundation projects but never other features or projects.
The Foundation layer generally contains the under the hood functionality shared across features and projects. This covers things like logging, service abstractions, data access, or any other broad set of functionality that can be used by many features or projects.
Generally speaking it’s a red flag to see features referencing other features or projects referencing other projects.
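These reference rules are mechanical enough to check automatically. The sketch below is a minimal illustration in Python, used only for brevity. The project names, the "Layer.Module" naming convention, and the reference map are hypothetical (a real check would read them from the project files), and allowing Foundation projects to reference other Foundation projects is an assumption that goes beyond the rules described above.

```python
# Layers that a project in each layer is allowed to reference.
# Foundation -> Foundation is an assumption not covered in the text above.
ALLOWED = {
    "Project": {"Feature", "Foundation"},
    "Feature": {"Foundation"},
    "Foundation": {"Foundation"},
}


def layer_of(project):
    """Layer is the first segment of a name such as 'Feature.Navigation'
    (hypothetical naming convention)."""
    return project.split(".", 1)[0]


def find_violations(references):
    """Return sorted (project, referenced project) pairs that break the layer rules."""
    violations = []
    for project, targets in references.items():
        for target in targets:
            if layer_of(target) not in ALLOWED[layer_of(project)]:
                violations.append((project, target))
    return sorted(violations)


# Hypothetical solution: Feature.Search wrongly references another feature.
REFERENCES = {
    "Project.Website": ["Feature.Navigation", "Feature.Search", "Foundation.Logging"],
    "Feature.Navigation": ["Foundation.Logging"],
    "Feature.Search": ["Feature.Navigation", "Foundation.Logging"],
    "Foundation.Logging": [],
}

violations = find_violations(REFERENCES)
print(violations)  # prints: [('Feature.Search', 'Feature.Navigation')]
```

Running a check like this as part of a build keeps the red flag from creeping in one reference at a time.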
Configuration Transforms
A Sitecore application and all of its underlying functionality are defined via XML configuration files. Every implementation will involve updating the configuration to support new functionality. This is everything from implementing a 404 page (updating the item not found processor) to custom fields in the Solr indexes.
Remember that a Sitecore application is deployed by dropping the newly built code over top of an existing installation. When reviewing a legacy codebase, it’s quite common to find that updated copies of the stock configuration files were added to source control and replace the originals when deployed. This approach, while common, is technical debt.
A better approach is to use configuration transform files for any change that needs to be applied to the configuration. Typically these files would be added to each Feature or Foundation project that needs it. At application startup, Sitecore picks up these transforms and builds out the final configuration that is used. You can view the merged configuration of a Sitecore installation by navigating to:
https://localhost.cm/sitecore/admin/showconfig.aspx
It is a huge red flag if a solution doesn’t use configuration transforms. That typically means that a much deeper audit of the solution is necessary to uncover all of the technical debt.
Code Generation
By default, Sitecore’s data access library is not strongly typed and has no typed fields for any of the custom fields that you’ve created in the content tree (AKA the information architecture). Older Sitecore solutions would typically use a separately purchased product called Team Development for Sitecore (TDS) to synchronize changes from the Sitecore content tree, typically structural content and templates, into source control.
Virtually every implementation that uses TDS also uses a feature of TDS that generates strongly typed classes for every template that has been synchronized into the solution. On one hand that’s a great thing, as it gives developers strongly typed objects to use and greatly reduces the occurrence of runtime errors for missing template fields. On the other hand it generates a class for every template that has every field in that class. This often leads to very large models being passed throughout the application when really only a few fields are needed for any given instance.
It has become a better practice to manually create the classes as they are needed rather than creating every possibility as part of a code generation process. I wouldn’t necessarily call it a red flag to see code generation in use, but I would treat it as a signal to keep an eye out for other deprecated practices that may still be in use.
Glass Mapper
Since the out of the box data access library is rather simplistic, nearly every classic Sitecore implementation uses a third-party library called Glass Mapper. It’s essentially a replacement that can map Sitecore items directly to your custom C# classes. It also supports lazy loading for fields that refer to other items and has its own caching layer.
While it would be fine for a relatively small site to use the out of the box Sitecore data access library, it would be very rare for a site with any complexity to use it. As Glass Mapper is the only alternative that is actively maintained, it would be a red flag if a complex site wasn’t using it.
Hotfixes
One common theme that I’ve seen in most Sitecore solutions is the failure to apply hotfixes in development environments. It’s critical that the underlying Sitecore code in development matches that of the higher level environments. Too often the hotfixes from Sitecore get passed off to a managed services team that only applies them to production environments. Hotfixes can and have introduced changes to overall site performance when run at scale. They can also potentially overwrite customizations that have been built.
A better approach is to include hotfixes as part of the development process. That can be a pull request into a hotfix branch that also gets merged back into the development branch. It won’t slow down the deployment process and may in fact save time by having more eyes on the changes before they're deployed.
Sitecore and the future
At the time of this writing, it looks increasingly likely that Sitecore XM Cloud is going to be the new paradigm for Sitecore implementations. To begin future proofing your implementation you will need to consider that XM Cloud is headless, that serialization has standardized on Sitecore Serialization, and that backend customization is not officially supported.
XM Cloud sites are headless and built very similarly to a Sitecore JSS site. That means all of an implementation’s Razor views will need to be converted into React components. Starting to build React components now can save time during a future migration.
Another challenge is the change to item serialization. The oldest projects typically still use TDS for serialization while newer projects use Unicorn. In either case, the serialized items will need to be replaced with items serialized with Sitecore’s own serialization tools.
Customizations are one of the biggest pain points when moving forward toward XM Cloud. Common pipeline modifications like HttpRequestBegin need to be replaced as the frontend is now decoupled from the backend. Other common issues are with dynamic placeholders and URL rerouting.
Changes to the backend of Sitecore like custom field types and replacements for the rich text editor will also be problematic. XM Cloud has introduced a replacement to the Content Editor called Pages that has very limited support compared to the Content Editor. A good rule of thumb is that any customization that could be broken by Sitecore deploying an update to your implementation should be avoided.