Generate sitemap.xml on the fly in Umbraco CMS

Simple sitemap.xml Umbraco handler

A sitemap.xml file is an important part of SEO: it tells search engines which pages of your website should be indexed. Search engine crawlers read this file to discover the pages of your site and build their indexes from it.

Since the content of an Umbraco website is dynamic, it does not make much sense to maintain a static sitemap.xml file. It is a lot better to generate the XML on the fly from the current content and structure of the site.

The sitemap should include only pages that are visible to the visitor. That means excluding pages which do not have a template defined (data nodes that Umbraco uses only for storing data) and pages which are marked as invisible through some property (usually a bool property such as umbVisible).

The code that generates the XML has to apply both of these rules.
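The two rules above can be sketched independently of Umbraco. The example below is only an illustration: Page is a hypothetical stand-in for a content node, not an Umbraco type, and I assume that a page which does not define the visibility flag at all counts as visible.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a content node; this is not an Umbraco type.
public sealed class Page
{
    public string Url { get; set; }
    public int TemplateId { get; set; }   // 0 means no template (a data node)
    public bool? Visible { get; set; }    // null means the flag is not defined on the page
}

public static class SitemapFilter
{
    // A page is listed only if it has a template and is not explicitly hidden.
    public static bool IsListed(Page page)
    {
        return page.TemplateId > 0 && (page.Visible ?? true);
    }

    // Returns the URLs of all listed pages, in their original order.
    public static IEnumerable<string> ListedUrls(IEnumerable<Page> pages)
    {
        return pages.Where(IsListed).Select(p => p.Url);
    }
}
```

Treating a missing flag as visible is a design choice: otherwise every document type without the property would silently disappear from the sitemap.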

Regarding the structure, the file should follow the recommendations of Google as the dominant search engine these days, but it also needs to comply with others like Bing and Yahoo. Recommendations about the structure of sitemap.xml can be found in the help documentation of these search engines.
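To make the expected structure concrete, here is a minimal sketch that builds a single entry with LINQ to XML. The class name SitemapSketch, the example.com URL and the date are placeholders of mine. The important detail is that every element name is qualified with the sitemap namespace.

```csharp
using System;
using System.Globalization;
using System.Xml.Linq;

// Hypothetical helper used only to illustrate the sitemap structure.
public static class SitemapSketch
{
    private static readonly XNamespace Ns = "http://www.sitemaps.org/schemas/sitemap/0.9";

    // Builds one <url> entry with a W3C datetime in <lastmod>.
    public static XElement UrlEntry(string loc, DateTimeOffset lastModified)
    {
        return new XElement(Ns + "url",
            new XElement(Ns + "loc", loc),
            new XElement(Ns + "lastmod",
                lastModified.ToString("yyyy-MM-ddTHH:mm:sszzz", CultureInfo.InvariantCulture)),
            new XElement(Ns + "changefreq", "weekly"));
    }

    // Wraps the entries in the <urlset> root element.
    public static XElement UrlSet(params XElement[] entries)
    {
        return new XElement(Ns + "urlset", entries);
    }
}
```

Calling SitemapSketch.UrlSet(SitemapSketch.UrlEntry("https://www.example.com/", new DateTimeOffset(2015, 6, 1, 10, 30, 0, TimeSpan.Zero))).ToString() should produce:

```
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2015-06-01T10:30:00+00:00</lastmod>
    <changefreq>weekly</changefreq>
  </url>
</urlset>
```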

Since this output does not need to be managed directly from the back-end, as it is generated automatically, I decided to implement it as an HttpHandler. To make it work, the handler needs to be registered in the web.config file.

<configuration>
	<system.webServer>
		<handlers>
			<add verb="*" path="sitemap.xml" name="Sitemap" type="Umbraco.Cms.Custom.SEO.SitemapHandler, Umbraco.Cms.Custom" />
		</handlers>
	</system.webServer>
</configuration>
    

The following code can be used out of the box. It mainly implements the sitemap.xml structure recommended by Google.

using System;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Text;
using System.Web;
using System.Web.Caching;
using System.Xml.Linq;
using Umbraco.Core;
using Umbraco.Core.Models;
using Umbraco.Web;

namespace Umbraco.Cms.Custom.SEO
{
    public class SitemapHandler : IHttpHandler
    {
        // Unique per application lifetime, so the entry cannot clash with other cache items.
        private static readonly string CACHE_KEY = Guid.NewGuid().ToString();

        // The handler keeps no per-request state, so one instance can serve all requests.
        public bool IsReusable
        {
            get { return true; }
        }

        public void ProcessRequest(HttpContext context)
        {
            // Make sure an UmbracoContext is available for the UmbracoHelper used below.
            UmbracoContext.EnsureContext(
                new HttpContextWrapper(context),
                ApplicationContext.Current,
                true);

            GetSitemapXml(context);
        }

        // Drops the cached sitemap so that the next request rebuilds it.
        public static void ClearCache()
        {
            // HttpRuntime.Cache is the same cache as context.Cache, but it is also available outside a request.
            HttpRuntime.Cache.Remove(CACHE_KEY);
        }

        private void GetSitemapXml(HttpContext context)
        {
            HttpResponse response = context.Response;
            XDocument xdoc = context.Cache[CACHE_KEY] as XDocument;

            if (xdoc == null)
            {
                UmbracoHelper uHelper = new UmbracoHelper(UmbracoContext.Current);
                IPublishedContent siteRoot = uHelper.TypedContentAtRoot().First();

                XNamespace ns = "http://www.sitemaps.org/schemas/sitemap/0.9";
                XNamespace xhtml = "http://www.w3.org/1999/xhtml";

                // Every element must be in the sitemap namespace. Qualifying the names with ns
                // also makes LINQ to XML emit the default xmlns declaration on the root.
                XElement root = new XElement(ns + "urlset",
                    new XAttribute(XNamespace.Xmlns + "xhtml", xhtml));

                xdoc = new XDocument(new XDeclaration("1.0", "utf-8", "yes"), root);

                // Include only pages that have a template and are not marked as hidden.
                // "umbVisible" is the property alias mentioned above; adjust it to your document types.
                // Pages that do not define the property are treated as visible.
                foreach (IPublishedContent content in siteRoot.DescendantsOrSelf()
                    .Where(d => d.TemplateId > 0)
                    .Where(d => !d.HasProperty("umbVisible") || d.GetPropertyValue<bool>("umbVisible")))
                {
                    root.Add(new XElement(ns + "url",
                        new XElement(ns + "loc", content.UrlWithDomain()),
                        new XElement(ns + "lastmod", content.UpdateDate.ToString("yyyy-MM-ddTHH:mm:sszzz", CultureInfo.InvariantCulture)),
                        new XElement(ns + "changefreq", "weekly")));
                }

                // Keep the generated document for one day.
                context.Cache.Insert(CACHE_KEY, xdoc, null, DateTime.Now.AddDays(1), Cache.NoSlidingExpiration);
            }

            response.Clear();
            response.ContentType = "text/xml";

            using (StreamWriter streamWriter = new StreamWriter(response.OutputStream, Encoding.UTF8))
            {
                xdoc.Save(streamWriter);
            }
            response.End();
        }
    }
}

    

Since the code walks through the whole site structure, it makes sense to cache the output to reduce the load generated by search engine crawlers. Set the expiration period according to how often your content is updated. In this example I set it to one day, which suits my website since it is rarely updated more than once a day. The public ClearCache method can be called to drop the cached document earlier.
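The caching idea itself, an absolute expiration plus an explicit reset, can be sketched without ASP.NET. The handler above uses the ASP.NET cache. ExpiringValue below is a hypothetical helper of mine, not a framework class, and its clock is injectable only so that the expiry can be demonstrated.

```csharp
using System;

// Hypothetical helper (not part of ASP.NET or Umbraco): caches one value until an absolute expiry time.
public sealed class ExpiringValue<T> where T : class
{
    private readonly object _gate = new object();
    private readonly TimeSpan _lifetime;
    private readonly Func<DateTime> _utcNow;
    private T _value;
    private DateTime _expiresUtc;

    public ExpiringValue(TimeSpan lifetime, Func<DateTime> utcNow = null)
    {
        _lifetime = lifetime;
        _utcNow = utcNow ?? (() => DateTime.UtcNow);
    }

    // Returns the cached value, or builds and stores a fresh one when it is missing or expired.
    public T GetOrCreate(Func<T> factory)
    {
        lock (_gate)
        {
            DateTime now = _utcNow();
            if (_value == null || now >= _expiresUtc)
            {
                _value = factory();
                _expiresUtc = now + _lifetime;
            }
            return _value;
        }
    }

    // Drops the cached value so that the next call rebuilds it.
    public void Clear()
    {
        lock (_gate)
        {
            _value = null;
        }
    }
}
```

With a lifetime of TimeSpan.FromDays(1), the factory runs at most once a day unless Clear is called, for example after content has been published.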

Disclaimer

The code contained in the snippets or available for download in this article is intended solely for learning and demo purposes. The author will not be held responsible for any failure or damage caused by any other usage.


About the author

DEJAN STOJANOVIC

Dejan is a passionate Software Architect/Developer. He is highly experienced in the .NET programming platform, including ASP.NET MVC and WebApi. He likes working on new technologies and exciting, challenging projects.