Beware mixing List.Sort() and Thread.Abort() | Patrick Smacchia


After more than a decade of full-time development with .NET, I still discover surprising unfixed bugs in the heart of the framework. Today, while I was smoke-testing new stuff in the NDepend UI, I stumbled on an InvalidOperationException thrown by a call to List<T>.Sort() ?!

After 15 minutes of searching the cause of the bug, I found this unfixed bug on Microsoft Connect: List.Sort() throws InvalidOperationException (instead of re-throw ThreadAbortException) while thread is being aborted

In other words, if your application is using Thread.Abort() somehow, and that the abortable code is calling List<T>.Sort(), you are exposed to the risk of getting an InvalidOperationException instead of a ThreadAbortException.

I already hear that it is a good practice to avoid calling Thread.Abort(), but the fact is that if one wants a responsive UI, one needs some sort of abortable/timed-out background work, and at a point Thread.Abort() will have to be called for that. As long as calling Thread.Abort() is done in a safe way (see this Joe Duffy detailled post) we didn’t meet any problems so far. On millions of production runs logged, we cannot see any problem except … that List<T>.Sort() throws an InvalidOperationException with the frequency of around one on every 50K production runs. And today, this bug just popped to me in a reproductible way thanks to some refactoring I was doing.

To workaround this problem once for all, I enclosed all calls to any override of List<T>.Sort() in some extension methods in a dedicated class. These BugFreeSort() methods catch the InvalidOperationException. This is fine because anyway the thread is in a ThreadState.AbortRequested and the CLR will throw the ThreadAbortException at the end of the catch block. Also, attached to this class through an attribute, I declare a CQLinq code rule in the source code to make sure that all calls to List<T>.Sort() are made through one of these BugFreeSort() methods. Here is the code:

Here is the code rule declared in the code. Hopefully any .NET developer acquainted with the LINQ syntax, should be able to read this piece to code:

And here is a screenshot of the NDepend methods violating the code rule before refactoring:

We have almost 200 code rules declared in source code like this newly added one. They protect us from stumbling again and again in tricky identified scenarios. In other words with these rules we abide by the tenet: Fool me once, don’t fool me twice.

In a team context, doing so is especially useful since such rule is warning a developer when he’s doing something wrong. Such warning is actually a just-in-time automatic piece of education for newcomers developers, about how things must be done. Communication in the team is enhanced, the overall friction is decreased, and I found this pretty cool :)