Understanding processing time

I want to give a brief tutorial on understanding CPU utilization, since it is a common source of confusion during performance analysis.

CPU time

CPU time is the time a processor spends executing instructions for a task, as opposed to waiting. This is distinct from wall-clock time, which is the total elapsed time the whole computer takes to perform an operation. The “load” on a system is the fraction of clock ticks spent performing operations rather than idling over a given time period. Thus, load only makes sense as an average over a particular time period. During any single clock tick, the CPU is either doing work or sitting idle after a HLT.
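As a minimal sketch of that calculation, the snippet below treats load as the share of ticks in a measurement window that did useful work. The class name, method name, and tick counts are made up for illustration; a real monitoring tool reads these counters from the operating system.

```csharp
using System;

public static class CpuLoad
{
    // Fraction of ticks in the window during which the CPU was doing work.
    public static double Utilization(long busyTicks, long idleTicks)
    {
        long total = busyTicks + idleTicks;
        if (total == 0)
            throw new ArgumentException("The measurement window contains no ticks.");
        return (double)busyTicks / total;
    }

    public static void Main()
    {
        // Hypothetical one-second window on a 3 GHz core: 750 million busy ticks.
        double load = Utilization(750000000L, 2250000000L);
        Console.WriteLine("Load: {0:P0}", load);
    }
}
```

With these numbers a quarter of the ticks were busy, so the window averages 25% load even though every individual tick was either fully busy or fully idle.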

Idle time

The CPU processes a set of instructions fed from memory. Memory holds opcodes (commands) that tell the CPU which operation to perform, along with the memory locations of the data to operate on. The rate of clock ticks is constant (several gigahertz in a modern CPU), and during each clock tick the CPU is either executing an instruction to do some work or idling after a HLT, an opcode which tells the CPU to keep its components idle until the next interrupt arrives.

Monitoring CPU time

So if we were to “observe” a CPU at the clock-tick level of detail, we would see that it’s always either working or waiting. But we can’t actually do that, since the more closely we monitor clock cycles, the more clock cycles are needed to do the monitoring. So to get an accurate picture of activity, we have to step back to a level where the monitoring tool does not interfere too much with the process being observed.

Input/output overhead

Any given computer task has several components: CPU instructions, memory IO, disk IO, network IO, and many others. Only extremely simple tasks (like calculating π) can fit in the small (but very fast) memory buffers on the CPU itself. Real-world tasks will almost always require the CPU to wait for other components to finish shuffling data back and forth into a state where it is ready for the CPU to work on it. But the ideal scenario is for the CPU to be kept as busy as possible (constant 100% load) until the task is completed. This is the minimum possible time in which a task can be completed. If we were to monitor CPU activity during such a task, we’d see the load jump to 100% during the task and then drop back to the baseline when it’s done. But if the task is shorter than the sampling interval of the CPU monitoring tool, its load is averaged into one or two sampling periods, or it might not be detected at all.
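The averaging effect can be sketched with a toy timeline. The code below is only an illustration: the one-flag-per-millisecond model and the names SampleLoad and Burst are assumptions, not how any particular monitoring tool works.

```csharp
using System;
using System.Linq;

public static class SamplingDemo
{
    // Average load per sampling window, given one busy/idle flag per millisecond.
    public static double[] SampleLoad(bool[] busyPerMs, int windowMs)
    {
        int windows = busyPerMs.Length / windowMs;
        var result = new double[windows];
        for (int w = 0; w < windows; w++)
        {
            int busy = busyPerMs.Skip(w * windowMs).Take(windowMs).Count(b => b);
            result[w] = (double)busy / windowMs;
        }
        return result;
    }

    // Timeline of the given length that is fully busy in [startMs, endMs) and idle otherwise.
    public static bool[] Burst(int lengthMs, int startMs, int endMs)
    {
        return Enumerable.Range(0, lengthMs).Select(t => t >= startMs && t < endMs).ToArray();
    }

    public static void Main()
    {
        // A 400 ms task at full load straddling the boundary between two 1-second samples.
        double[] samples = SampleLoad(Burst(2000, 800, 1200), 1000);
        foreach (double s in samples)
            Console.WriteLine(s);
    }
}
```

The task ran at 100% for 400 ms, yet each one-second sample reports only 20% load, because 200 ms of the burst fell into each window.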


The situation is more complicated when there are multiple tasks to run, each of which requires some fraction of CPU time. The operating system will run the tasks concurrently. Both the processing time and the IO of the tasks will overlap, but because the operating system takes turns running each task, the total CPU load will be some combination which may be higher or lower than the sum of the individual tasks, depending on the other competing resources the tasks use. For example, two disk-intensive tasks will use less CPU than the sum of both running individually because the disk IO will be the bottleneck. But two CPU-intensive tasks will need more than double the CPU time of one, because the CPU has to run both tasks and also handle the context switching between them.

Performance considerations for the Entity Framework execution pipeline

An ORM provides value by doing a lot of things for you – virtualizing databases as native objects and converting types automatically. But the work the ORM does to reduce developer work has a cost too – it has an inherent performance penalty and may encourage some bad development practices.

An ORM by definition has to do some work to convert a database schema into a native view. ORMs can try to minimize this performance penalty by defining the transformation at design time or by caching the database-to-object transformation at different points.

Entity Framework is built on top of ADO.Net, which is just an API that does not “know” anything about the database. To convert .Net code into SQL queries and back into CLR objects, Entity Framework performs a set of operations at different stages: (1) compile time, (2) first run, and (3) each execution. To improve performance, EF allows you to shift some work from stage 3 to stage 1 or 2. But the details can be tricky, and choosing the best optimization strategy requires understanding the EF query execution pipeline.

The EF execution pipeline

The following information is based on this MSDN EF Performance page.

There are six steps in EF execution that I want to comment on:

  1. loading metadata
  2. generating views
  3. preparing the query
  4. executing the query
  5. tracking
  6. materializing objects

1: Loading metadata: The metadata is the mapping defined in your EDMX file. This is a very expensive operation (see the breakdown here), but it only happens once. EF applications use more memory and have a warm-up penalty, which should be ignored in performance analysis if you are not concerned with startup times. Models with a very large number of entities (200+) have a number of problems: see this and this.

2: Generating views: The local query views are static objects which are cached per application domain. This is also an expensive operation but it can be pre-generated and embedded in the application if you care about first run times.

3: Preparing the query: Each unique query must be compiled into the EF version of a stored procedure before it is executed. Microsoft says that the commands are cached for later executions, but I’m confused about what exactly is cached, because profiling shows that the query is compiled every time. In any case, Microsoft suggests caching the compiled query to avoid this penalty.

Caching precompiled queries is somewhat unwieldy, so it is only advisable in performance-critical contexts, but it makes a big difference for frequently executed queries.
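A hedged sketch of what caching a compiled query can look like with the EF 4 ObjectContext API. CompiledQuery.Compile is the real entry point, but the Product entity, the ShopEntities context, and the CheaperThan query are hypothetical stand-ins for the classes generated from your own EDMX model.

```csharp
using System;
using System.Data.Objects;
using System.Linq;

// Hypothetical entity, standing in for a class generated from an EDMX model.
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
    public decimal Price { get; set; }
}

// Hypothetical context; the connection string must point at the model metadata.
public class ShopEntities : ObjectContext
{
    private ObjectSet<Product> _products;

    public ShopEntities(string connectionString) : base(connectionString, "ShopEntities") { }

    public ObjectSet<Product> Products
    {
        get { return _products ?? (_products = CreateObjectSet<Product>()); }
    }
}

public static class ProductQueries
{
    // Built once and held in a static field, so later calls reuse the cached translation.
    public static readonly Func<ShopEntities, decimal, IQueryable<Product>> CheaperThan =
        CompiledQuery.Compile((ShopEntities context, decimal maxPrice) =>
            context.Products.Where(p => p.Price < maxPrice));
}
```

A caller would invoke it like any delegate, for example ProductQueries.CheaperThan(context, 10m).ToList(). The static field is what makes this unwieldy: every frequently used query shape needs its own delegate declared up front.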

(Note: Take care when using .Count() or .Any(): avoid unnecessary query recompilation, and avoid enumerating a result set when a simple boolean check will do.)

By the way, all the steps above can be skipped by using direct SQL queries or stored procedures with EF. The performance of direct entity SQL falls between pure ADO.Net and Entity Framework queries, so it is only useful as the occasional exception to LINQ queries against a data store.

4: Executing the query: Query execution time depends on the underlying data source. Entity Framework is just a library built on top of ADO.Net, so pure ADO.Net queries are the benchmark for any ORM built on top of it. Because ADO.Net will always be faster than any framework built on top of it, it can be used as a fallback when other options have been exhausted.

5: Tracking: Change tracking records modifications to loaded entities so that they can be saved as updates. If you only need to read data, you can get a small performance boost by disabling it.
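One way to disable tracking for a read-only query, sketched against the EF 4 ObjectContext API by setting MergeOption.NoTracking on the entity set before querying. The Customer entity, CrmEntities context, and LoadForDisplay method are hypothetical names used only for illustration.

```csharp
using System.Collections.Generic;
using System.Data.Objects;
using System.Linq;

// Hypothetical entity, standing in for a class generated from an EDMX model.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical context; the connection string must point at the model metadata.
public class CrmEntities : ObjectContext
{
    private ObjectSet<Customer> _customers;

    public CrmEntities(string connectionString) : base(connectionString, "CrmEntities") { }

    public ObjectSet<Customer> Customers
    {
        get { return _customers ?? (_customers = CreateObjectSet<Customer>()); }
    }
}

public static class ReadOnlyQueries
{
    public static List<Customer> LoadForDisplay(CrmEntities context)
    {
        ObjectSet<Customer> customers = context.Customers;
        // Returned objects are not registered with the context's change tracker.
        customers.MergeOption = MergeOption.NoTracking;
        return customers.OrderBy(c => c.Name).ToList();
    }
}
```

Objects loaded this way are detached, so changes made to them will not be picked up by SaveChanges. That is the trade-off for the faster read.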

6: Materializing objects: Each object returned from the database must be converted into a class instance before it can be used. There is no way to avoid this penalty, but it is worth keeping in mind that the less data there is, the faster it will be materialized, not to mention transferred over the wire.

For best performance, queries should be as specific as possible. I have noticed that ORMs encourage the bad habit of always selecting an entire row. This is because developers using an ORM tend to think of the database as an object repository rather than as a relational store, whereas raw SQL queries encourage manually selecting the needed columns. (Unless one has the awful habit of “select *”.) So to minimize overhead, pull just the data you need. (The MVVM pattern is helpful in this regard.)
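A small sketch of pulling only the needed data by projecting into a view model. The Article and ArticleListItem classes are hypothetical, and the query is written against IQueryable so the same method works with an EF entity set or, as in Main, with an in-memory stand-in.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity with a wide column that the list view never shows.
public class Article
{
    public int Id { get; set; }
    public string Title { get; set; }
    public string Body { get; set; }
}

// Hypothetical view model holding only what the view displays.
public class ArticleListItem
{
    public int Id { get; set; }
    public string Title { get; set; }
}

public static class ArticleQueries
{
    // Against an EF entity set the Select becomes part of the SQL,
    // so only Id and Title are fetched and materialized.
    public static List<ArticleListItem> ListTitles(IQueryable<Article> articles)
    {
        return articles
            .OrderBy(a => a.Title)
            .Select(a => new ArticleListItem { Id = a.Id, Title = a.Title })
            .ToList();
    }

    public static void Main()
    {
        // In-memory stand-in for a context's entity set.
        var source = new List<Article>
        {
            new Article { Id = 1, Title = "Zebra", Body = "..." },
            new Article { Id = 2, Title = "Apple", Body = "..." }
        }.AsQueryable();

        foreach (ArticleListItem item in ListTitles(source))
            Console.WriteLine("{0}: {1}", item.Id, item.Title);
    }
}
```

The Body column never leaves the database when the source is an EF set, which saves both transfer and materialization time.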

Final thoughts:

As the ADO.NET program manager himself has pointed out, Entity Framework is inherently slower than ADO.Net SQL queries. But performance should always be balanced against productivity. There are some applications which are definitely not suitable for an ORM and many others that are. Some operations can only be done using raw SQL or take forever using an ORM (“truncate table”, temp tables, etc). I think the best approach is to use an ORM where appropriate and optimize in the specific scenarios where performance is inadequate. I do think that Entity Framework is the best, simplest, and safest choice out of all .Net ORMs.

These are just a few tips on Entity Framework performance. There are other tips and much more information in the pages below and the links therein.

More reading:

Rules for good unit tests

  • Test method names should be sentences (This_method_does_this(){})
  • Test the happy path – the most common/important functionality (acceptance criteria should be executable)
  • Test at the highest level that is practical
  • Unit tests should not:
      • Talk to the database
      • Talk to the network
      • Touch the file system
      • Change business logic to write the test
      • Depend on any other tests (can be run at any time, in any order)
      • Depend on environment variables (USE MOCKS!)
  • Tests should be fast (lengthy tests are doing something wrong)
  • Less than 10 lines of code
  • Only one or two logical asserts per test
  • Don’t write tests after development is done
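To make the rules concrete, here is a minimal sketch of a test that follows them, assuming NUnit as the test framework. The Greeter class, the IClock interface, and the FixedClock fake are hypothetical examples, not part of any library.

```csharp
using System;
using NUnit.Framework;

// Hypothetical dependency: the clock is injected so tests never read the system time.
public interface IClock
{
    DateTime Now { get; }
}

// Hypothetical code under test.
public class Greeter
{
    private readonly IClock _clock;

    public Greeter(IClock clock) { _clock = clock; }

    public string Greet()
    {
        return _clock.Now.Hour < 12 ? "Good morning" : "Good afternoon";
    }
}

// Hand-written fake; a mocking library would serve the same purpose.
public class FixedClock : IClock
{
    public FixedClock(DateTime now) { Now = now; }

    public DateTime Now { get; private set; }
}

[TestFixture]
public class GreeterTests
{
    [Test]
    public void Greets_with_good_morning_before_noon()
    {
        var greeter = new Greeter(new FixedClock(new DateTime(2011, 1, 1, 9, 0, 0)));

        Assert.That(greeter.Greet(), Is.EqualTo("Good morning"));
    }
}
```

The test name reads as a sentence, the body is two lines with a single assert, and nothing touches the database, network, file system, or the machine's clock.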

Render tags in templates as a partial view with MVC3

Suppose you are rendering templated content. Sometimes you want to reference partial views (or actions) in your templates and have them render with attributes provided by your template. One option is to use a Razor templating engine. But I just needed to render partial views based on a custom tag format, so I came up with my own solution:

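        /// <summary>
        ///   Render parameterized tags as a partial view in MVC3 templates
        ///   Supports tags such as &lt;view: name attribute="value" /&gt;
        /// </summary>
        /// <param name="helper">The helper.</param>
        /// <param name="content">The content.</param>
        private static string RenderPartialViewTagsInTemplate(HtmlHelper helper, string content)
        {
            // Maps each matched tag to the markup that should replace it.
            var controls = new Dictionary<string, string>();
            // Only the "name" group is referenced below; "attr" and "value" are placeholder group names.
            MatchCollection matches = Regex.Matches(content, @"<view: (?<name>\S+)(\s+(?<attr>[^=\s]+)=""?(?<value>[^""\s]+)""?)*?\s*/>", RegexOptions.ExplicitCapture);
            foreach (Match tag in matches)
            {
                string viewName = tag.Groups["name"].Value;
                var routeValues = new RouteValueDictionary();
                for (int i = 0; i < tag.Groups["attr"].Captures.Count; i++)
                {
                    routeValues[tag.Groups["attr"].Captures[i].Value] = tag.Groups["value"].Captures[i].Value;
                }
                // … render viewName with routeValues and store the result in controls[tag.Value] …
            }
            controls.ToList().ForEach(c => { content = content.Replace(c.Key, c.Value); });
            return content;
        }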

Getting around packet-inspecting firewalls with free VPN+proxy tools

Surfing the Internet in China requires some creativity to work around the government’s packet sniffing firewall which monitors all traffic into the country. “Packet sniffing” means that a simple proxy will not work – you must encrypt the traffic to prevent the contents of the data from being inspected. Here is a quick tutorial.

The most important part is the tunneling VPN. For this, I chose Hamachi, a free VPN solution from LogMeIn. Because a VPN only provides an encrypted tunnel, you still need a proxy server running on the outside.

You have many options for your proxy. Privoxy is easiest to configure and by default it blocks ads and other junk, improving your experience and saving you bandwidth.

Next, you need something to help you manage the Proxy settings on your machine. You can enable it manually, but generally you do not want the proxy enabled for 100% of your traffic. For this I suggest Proxy Switchy – a Google Chrome browser plugin to auto-proxy blocked sites. For Firefox there is Foxy Proxy, but it is not as easy to use. Proxy Switchy makes its settings global, so other apps also use its settings.

Here is my stack:

  1. My browser on a slow Chinese network
  2. Hamachi VPN tunneling to encrypt everything
  3. Squid on a fast connection inside the Great Firewall for high-speed local proxy
  4. Privoxy in the USA

Because my home connection is slow, I use Squid, a caching web proxy, to cache data on a computer near me. You can also run Squid on your local PC.

Other proxy servers which you can use to speed up your connection:

  • Polipo for DNS caching, HTTP optimization, pipelining, etc
  • Apache with PageSpeed for optimizing web page content (combining, inlining, minifying, image optimization, etc)

You can use these proxies instead of Privoxy or you can layer them together in sequence.

Addendum: how to configure Proxy Switchy:

Proxy Switchy is a proxy helper extension for the Google Chrome browser. It works with your existing VPN/proxy solution. The cool thing it does is automatically switch over to the proxy just for the sites that need it so you get a seamless transition. To get you started, it comes with a default switch rule list which works for most sites blocked by the GFC.  Even though this extension is for Google Chrome, it exports its settings to the system settings, so it works with any browser.

You can use the online rule list at http://autoproxy-gfwlist.googlecode.com/svn/trunk/gfwlist.txt

You can see how I configured some of the rules below:

Reading Excel files in .Net

This should work for most Excel versions, including both xls and xlsx:

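// Requires: using System.Data; using System.Data.OleDb; using System.Linq;
// plus a reference to System.Data.DataSetExtensions for AsEnumerable() and Field<T>().
const string ExcelConnString =
    @"Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=""Excel 12.0 Xml;HDR=YES"";";
var adapter = new OleDbDataAdapter("SELECT * FROM [Sheet1$]", String.Format(ExcelConnString, physicalPath));
var ds = new DataSet();
adapter.Fill(ds, "anyNameHere");
var data = ds.Tables["anyNameHere"].AsEnumerable();
// Skip rows whose "tag" column is empty or missing and map the rest to Tag objects.
EnumerableRowCollection<Tag> tags =
    data.Where(x => !string.IsNullOrEmpty(x.Field<string>("tag"))).Select(x =>
        new Tag
        {
            Description = x.Field<string>("Description")
        });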

Cross browser multi-columns with jQuery and CSS3

column-count was proposed in January 2001 and became a candidate recommendation in December 2009. It’s supported in WebKit and Mozilla via vendor extensions and in Opera directly, but it’s not in IE9. Y U no support columns IE9? That’s OK, we can work around this with columnizer:

if ($.browser.msie && $.browser.version < 10) { // am I a hopeless romantic for assuming that IE10 will support it?
    // Selector assumed to match the CSS rule below
    $('div#multicolumn, .multicolumn').columnize({
        width: 600,
        columns: 3
    });
}

/* Support for Webkit, Mozilla, Opera */
div#multicolumn, .multicolumn {
	-moz-column-count: 3;
	-moz-column-gap: 20px;
	-webkit-column-count: 3;
	-webkit-column-gap: 20px;
	column-count: 3;
	column-gap: 20px;
}
