Passed the BCS Foundation Certificate in Systems Development exam.

I have been a professional member of the British Computer Society for years now, and I recently started to look at what exams they offer. I noticed they offer a Foundation Certificate in Systems Development exam which, judging by the website, you could prepare for by self-study using a book.

I wanted to give a BCS exam a try, but I didn’t want to pay out for expensive training courses and take time off work. The exam description page referenced the BCS book titled Developing Information Systems.

I bought the book and started studying the material. To my surprise, the material was reasonably up to date with modern software development practices. It covered most of the formal systems development models in use today and gave a nice history of where these practices came from.

The book covered the full systems development life-cycle from requirements engineering to business analysis to software development methodologies.

The exam itself had 40 questions, and the book covers more than enough for you to pass it. If you already have a lot of software development experience, most of the material will be pretty familiar to you.

All in all, the exam is probably worth doing, as the book is worth a read, especially if you are new to the industry. I booked my exam through Pearson VUE, which can be accessed from the BCS website.

Pattern for creating scalable SharePoint 2010 BCS connectors

I am currently working on a project that requires me to pull in a large amount of data from an external database into a SharePoint farm and index the content for use in our search service application.

The dataset is currently around a million items and getting larger every day, so the search crawler obviously needed to be scalable and not take too much time indexing all this content.

I set about creating a standard .Net Connector Assembly that implemented the Finder (ReadList) and SpecificFinder (ReadItem) methods. The architecture for the BCS and Search framework looks like this (from MSDN)

I won’t go into detail on how to create .Net Connector Assemblies and BDC models, because there are other articles out there that show you how. In this blog post I am going to detail how I made our connector scalable and performant by caching the content in memory and allowing the crawler to index the content from memory instead of making numerous (1 million+) calls to the database.

I encapsulated this caching mechanism into a library so that I can re-use the logic throughout my application on many BCS connectors.

I have also published a library containing the code so that this pattern can be re-used on other projects. Feel free to download it and use it in your own projects.

You can download the code to follow along here http://www.athousandthreads.com/att.sharepoint.patterns.zip

Right down to the detail

The search crawler working on external connectors uses the following workflow to crawl all the content.

  1. The crawler first calls the Finder method (ReadList) on your .Net Connector Assembly. Your Finder method needs to return the identifiers of all the items you want to be indexed.
  2. The crawler then calls your SpecificFinder method (ReadItem) once for each item it wants to index, passing it that item's identifier.

Now when the crawler initially calls my Finder method I need to go off to the database and retrieve all the items I want to index and return them to the crawler for indexing.

When the crawler then calls my SpecificFinder method, I don’t want to go back to the database; I want to retrieve the item from the cache. I implemented this using a static collection of items that gets stored in the memory space of the mssdmn.exe process (the filter daemon process that does the indexing).
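To make the saving concrete, here is a minimal sketch that mimics the crawl workflow described above and counts database round trips with and without a cache. FakeDatabase and CrawlSimulation are hypothetical names invented for this illustration; they are not part of the library or of the SharePoint API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the external database; it only counts round trips.
public class FakeDatabase
{
    private readonly Dictionary<long, string> rows;

    public int RoundTrips { get; private set; }

    public FakeDatabase(int rowCount)
    {
        rows = Enumerable.Range(1, rowCount)
                         .ToDictionary(i => (long)i, i => "Item " + i);
    }

    // One round trip that returns every row.
    public List<KeyValuePair<long, string>> ReadAll()
    {
        RoundTrips++;
        return rows.ToList();
    }

    // One round trip that returns a single row.
    public string ReadOne(long id)
    {
        RoundTrips++;
        return rows[id];
    }
}

public static class CrawlSimulation
{
    // Mimics the crawler: one Finder call, then one SpecificFinder call per identifier.
    // Returns the number of database round trips the crawl needed.
    public static int Crawl(FakeDatabase db, bool useCache)
    {
        // Finder (ReadList): always goes to the database once.
        List<KeyValuePair<long, string>> all = db.ReadAll();

        Dictionary<long, string> cache = null;
        if (useCache)
        {
            cache = all.ToDictionary(p => p.Key, p => p.Value);
        }

        // SpecificFinder (ReadItem): served from memory when the cache is enabled.
        foreach (long id in all.Select(p => p.Key))
        {
            string item = useCache ? cache[id] : db.ReadOne(id);
            if (item == null)
            {
                throw new InvalidOperationException("Item " + id + " not found.");
            }
        }

        return db.RoundTrips;
    }
}
```

Calling CrawlSimulation.Crawl(new FakeDatabase(1000), false) and then the same with true shows the difference between one round trip per item and a single round trip for the whole crawl.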

There is some logic needed to synchronise the access to this shared cache and I have encapsulated this into the following class:

Caching

using System;
using System.Collections.Generic;
using System.Linq;
// Namespaces below are assumed for the patterns & practices ILogger and for EventSeverity.
using Microsoft.Practices.SharePoint.Common.Logging;
using Microsoft.SharePoint.Administration;

/// <summary>
/// Provides a caching mechanism for BCS external connectors.
/// </summary>
/// <typeparam name="T">Type of BDC entity to store in this cache.</typeparam>
/// <typeparam name="I">Type of the identifier for the BDC entity.</typeparam>
public class CachedConnectorService<T, I> where T : BDCEntity<I>
{
    #region Private members

    private List<T> cache;
    private CachedConnectorParameters<T, I> parameters;
    private static object lockObject = new object();

    #endregion

    #region Constructor

    public CachedConnectorService(CachedConnectorParameters<T, I> parameters)
    {
        this.parameters = parameters;
    }

    #endregion

    #region Public methods

    /// <summary>
    /// Reads individual entity from the cache.
    /// </summary>
    /// <param name="identifier">Identifier of the entity to read.</param>
    /// <returns>BDC entity.</returns>
    public T ReadItem(I identifier)
    {
        this.LogToOperations("Reading BDC Item: " + typeof(T).ToString() +
            ", identifier=" + identifier, EventSeverity.Information);

        T entity = default(T);

        try
        {
            // The cache is lost whenever the hosting process is recycled.
            if (cache == null)
            {
                this.LogToOperations(typeof(T).ToString() +
                    " cache is null, reloading cache from database.",
                    EventSeverity.Information);

                this.ReadList();
            }

            entity = this.GetFromCache(identifier);

            // Fall back to a single database call on a cache miss.
            if (entity == null)
            {
                this.LogToOperations(
                    "Identifier not found in local cache, getting from database.",
                    EventSeverity.Information);

                entity = this.GetFromDatabase(identifier, parameters.DatabaseCall);
            }
        }
        catch (Exception)
        {
            this.LogToOperations("Exception occurred reading BDC Item: " +
                typeof(T).ToString() + ", Identifier=" + identifier,
                EventSeverity.Error);
        }

        return entity;
    }

    /// <summary>
    /// Reads list of entities into the cache.
    /// </summary>
    /// <returns>Collection of entities.</returns>
    public IEnumerable<T> ReadList()
    {
        if (cache == null)
        {
            lock (lockObject)
            {
                try
                {
                    // Check again inside the lock in case another thread loaded the cache first.
                    if (cache == null)
                    {
                        this.LogToOperations("Getting list of " +
                            typeof(T).ToString(), EventSeverity.Information);

                        // Fill a temporary list so a failed load never leaves a half-filled cache.
                        List<T> cacheTemp = new List<T>();
                        this.parameters.PopulateCache.Invoke(cacheTemp);

                        this.LogToOperations("Loaded " +
                            cacheTemp.Count.ToString() + " " + typeof(T).ToString(),
                            EventSeverity.Information);

                        cache = cacheTemp;
                    }
                }
                catch (Exception)
                {
                    this.LogToOperations("Exception occurred getting list of " +
                        typeof(T).ToString(), EventSeverity.Error);
                }
            }
        }

        // The cache is still null if the load failed; return an empty result instead of throwing.
        return cache != null ? cache.ToArray() : new T[0];
    }

    #endregion

    #region Private methods

    /// <summary>
    /// Gets entity from the cache
    /// </summary>
    /// <param name="identifier">Identifier of entity to return.</param>
    /// <returns>Entity instance.</returns>
    private T GetFromCache(I identifier)
    {
        lock (lockObject)
        {
            // Report a miss if the cache could not be loaded, so the caller tries the database.
            if (cache == null)
            {
                return default(T);
            }

            return cache.FirstOrDefault(a => a.Identifier.Equals(identifier));
        }
    }

    /// <summary>
    /// Gets entity from the database using the specified delegate.
    /// </summary>
    /// <param name="identifier">Identifier of entity to return.</param>
    /// <param name="databaseCall">Delegate that does the work of
    /// retrieving entity from database</param>
    /// <returns>Entity instance.</returns>
    private T GetFromDatabase(I identifier, Func<I, T> databaseCall)
    {
        return databaseCall.Invoke(identifier);
    }

    private void LogToOperations(string message, EventSeverity severity)
    {
        if (this.parameters.Logger != null)
        {
            this.parameters.Logger.LogToOperations(message, severity);
        }
    }

    #endregion
}
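One thing worth considering with a dataset of this size: GetFromCache searches the cached list item by item, so every ReadItem call costs time proportional to the size of the cache. The sketch below shows one possible alternative, a dictionary keyed on the identifier. It is not part of the published library; IndexedCache and SampleEntity are hypothetical names, and BDCEntity is repeated here only so that the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;

// Minimal copy of the post's base entity so the sketch compiles on its own.
public abstract class BDCEntity<I>
{
    public I Identifier { get; set; }
}

// Hypothetical entity used only for this illustration.
public class SampleEntity : BDCEntity<long>
{
    public string Name { get; set; }
}

// Hypothetical helper: indexes the cached entities by identifier once,
// so that each lookup is a hash lookup rather than a scan of the list.
public class IndexedCache<T, I> where T : BDCEntity<I>
{
    private readonly Dictionary<I, T> index;

    public IndexedCache(IEnumerable<T> entities)
    {
        index = new Dictionary<I, T>();
        foreach (T entity in entities)
        {
            // A later duplicate overwrites an earlier one instead of throwing.
            // (A list scan with FirstOrDefault would return the first one.)
            index[entity.Identifier] = entity;
        }
    }

    public int Count
    {
        get { return index.Count; }
    }

    // Returns null when the identifier is not cached.
    public T Find(I identifier)
    {
        T entity;
        return index.TryGetValue(identifier, out entity) ? entity : default(T);
    }
}
```

Because the dictionary is never modified after the constructor finishes, concurrent lookups do not need a lock. The trade-off is some extra memory for the index on top of the entities themselves.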

 

Parameters

The class uses a set of parameters that stores two delegates: one for populating the cache and one for calling the database to get an individual item. It also allows you to pass in a logger from the Microsoft patterns and practices logging library.

Your connector assemblies use these delegates to pass in their own logic for getting items into the cache. The parameters class looks like this:

public class CachedConnectorParameters<T, I> where T : BDCEntity<I>
{
    #region Public properties

    public Action<List<T>> PopulateCache { get; set; }
    public Func<I, T> DatabaseCall { get; set; }
    public ILogger Logger { get; set; }

    #endregion
}


Base entity

One last class is the BDCEntity<T> class, which is used as a base class for all your BDC model entities. The class is simple and just allows the caching class to filter on identifiers.

public abstract class BDCEntity<T>
{
    #region Public members

    public T Identifier { get; set; }

    #endregion
}

 

.Net Connector Service & Entities

These classes form the reusable library that provides caching to all your BDC connector assemblies. An example of a class that uses this caching pattern is shown below:

using System;
using System.Collections.Generic;
using System.Linq;

public class MyService
{
    #region Private static members

    private static CachedConnectorService<MyEntity, Int64> service;
    private static CachedConnectorParameters<MyEntity, Int64> parameters;
    private static readonly object syncRoot = new object();

    #endregion

    #region Public methods

    /// <summary>
    /// Reads specified entity from the database.
    /// </summary>
    /// <param name="id">ID of the entity to retrieve from the database.</param>
    /// <returns>Instance of an MyEntity.</returns>
    public static MyEntity ReadItem(long id)
    {
        return ServiceInstance().ReadItem(id);
    }

    /// <summary>
    /// Reads a list of all entities from the database.
    /// </summary>
    /// <returns>Collection of MyEntity.</returns>
    public static IEnumerable<MyEntity> ReadList()
    {
        return ServiceInstance().ReadList();
    }

    #endregion

    #region Private static methods

    private static CachedConnectorService<MyEntity, Int64> ServiceInstance()
    {
        // Lock so that concurrent callers share one service, and therefore one cache.
        lock (syncRoot)
        {
            if (service == null)
            {
                if (parameters == null)
                {
                    parameters = new CachedConnectorParameters<MyEntity, Int64>();
                    parameters.DatabaseCall = GetDatabaseDelegate();
                    parameters.PopulateCache = GetPopulateCacheDelegate();
                    parameters.Logger = new SPLogger();
                }

                service = new CachedConnectorService<MyEntity, Int64>(parameters);
            }

            return service;
        }
    }

    private static Action<List<MyEntity>> GetPopulateCacheDelegate()
    {
        // Loads every row in one pass and converts each one to a BDC entity.
        return (entities) =>
        {
            using (MyWorkScope scope = new MyWorkScope(DatabaseManager.EFConnectionString))
            {
                foreach (Entity entity in scope.CurrentContext.MySet)
                {
                    entities.Add(GetEntity(entity));
                }
            }
        };
    }

    private static Func<Int64, MyEntity> GetDatabaseDelegate()
    {
        // Loads a single row by identifier; used only on a cache miss.
        return (identifier) =>
        {
            using (MyWorkScope scope = new MyWorkScope(DatabaseManager.EFConnectionString))
            {
                return GetEntity(scope.CurrentContext.ReadEntity(identifier).FirstOrDefault());
            }
        };
    }

    /// <summary>
    /// Returns an entity from the specified object.
    /// </summary>
    /// <param name="entity">Entity to turn into an MyEntity.</param>
    /// <returns>Instance of MyEntity.</returns>
    private static MyEntity GetEntity(Entity entity)
    {
        // The single-row lookup returns null when the identifier does not exist.
        if (entity == null)
        {
            return null;
        }

        MyEntity myEntity = new MyEntity();
        myEntity.Identifier = entity.Id;
        myEntity.Name = entity.FormattedName;
        myEntity.SiteUrl = entity.SiteUrl;
        myEntity.LastModifiedTimeStampField = entity.CC_ModifiedDate;

        return myEntity;
    }

    #endregion
}

Our BDC model entity class looks like this:

public partial class MyEntity : BDCEntity<Int64>
{
    public string Name { get; set; }
    public string SiteUrl { get; set; }
    public DateTime? LastModifiedTimeStampField { get; set; }
}

 

Memory Limits

There is one last thing to note about this approach. Because we cache all items within the mssdmn.exe process, the memory footprint can get very large, and on our server we hit a limit that is set by the filter daemon. When the filter daemon limit is hit, the BCS connector assembly and its memory space are thrown away, and hence the cache is reset. The above code handles this, but as a consequence, when ReadItem is called and the cache has gone away, we have to reload it from the database. You want to avoid doing this too many times, for obvious reasons, so we found we had to increase the memory limit of the filter daemon to get better performance from the indexer.
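Before changing any limits, it can help to get a rough idea of how much memory the cache needs. The sketch below is one way to estimate that outside SharePoint, by comparing the managed heap size before and after filling a list. SampleRow and CacheFootprint are hypothetical names for this illustration, and the figure is only an approximation of managed memory, not of the whole process.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical row type with roughly the same shape as a cached BDC entity.
public class SampleRow
{
    public long Identifier { get; set; }
    public string Name { get; set; }
    public string SiteUrl { get; set; }
}

public static class CacheFootprint
{
    // Returns the approximate number of managed bytes retained by the list
    // that 'populate' fills, and reports how many items were loaded.
    public static long Measure(Action<List<SampleRow>> populate, out int count)
    {
        // Passing true asks the runtime to collect garbage before measuring.
        long before = GC.GetTotalMemory(true);

        List<SampleRow> cache = new List<SampleRow>();
        populate(cache);

        long after = GC.GetTotalMemory(true);
        count = cache.Count;

        // Keep the list reachable until after the second measurement.
        GC.KeepAlive(cache);
        return after - before;
    }
}
```

Measuring a sample of, say, 100,000 representative rows and scaling the result up to the full dataset gives a rough idea of how far the filter daemon memory limit needs to be raised.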

You can find out how to do this from the links below:

Where can I get the library?

You can download the full source code for the library here. It can be used by anyone free of charge.

http://www.athousandthreads.com/att.sharepoint.patterns.zip