Coding Horror

programming and human factors

Is DoEvents Evil, Revisited

A colleague of mine had some excellent comments on the surprising reentrancy issues you'll run into when using Application.DoEvents():

The Application.DoEvents method is often used to allow applications to repaint while some longer task is taking place. This is usually the result of polling instead of using events/delegates. That's fine, but developers need to understand that DoEvents processes all of the messages in the message queue, not just paint messages. This can lead to unexpected reentrancy issues. A simple example is shown below.

// some data we care about
private int _count;

// the click handler for a button
private void buttonUserAction_Click(object sender, System.EventArgs e)
{
    _count++;
    // _count won't always be 1; it depends how many times
    // this method was reentered during the DoEvents calls
    Console.WriteLine(_count);
    // simulate a longer task whose status we are polling;
    // call DoEvents so the window can repaint
    for (int i = 0; i < 100000; i++)
        Application.DoEvents();
    _count--;
}

In this example, it doesn't matter if the method is reentered. But it might matter in other methods or applications. Always remember that DoEvents can cause methods to be reentered. And understand which methods are affected: any method that is called directly or indirectly in response to processing messages in the message queue.

It would be useful if the Form object had a Busy property; when busy, the form would only process paint-related messages and skip input-related messages like menus, clicks, and keystrokes.
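
There's no such property today, but a rough approximation -- my sketch, not my colleague's code -- is to disable the form around the polling loop. A disabled form still processes paint messages; it just ignores mouse and keyboard input, so the click handler can't be reentered:

private void buttonUserAction_Click(object sender, System.EventArgs e)
{
    // a disabled form still repaints, but ignores user input,
    // so DoEvents can't reenter this handler via another click
    this.Enabled = false;
    try
    {
        for (int i = 0; i < 100000; i++)
            Application.DoEvents();
    }
    finally
    {
        // always restore input, even if the task throws
        this.Enabled = true;
    }
}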

I put together a VS.NET 2003 winforms solution (6kb) demonstrating the code sample above.

This may make you wonder: Is DoEvents Evil?

I think it's definitely the lesser of two evils: it's either this simplified cooperative yielding or full-bore multithreaded code. DoEvents can be a big win with minimal effort in the right situations. For example, how about using it to improve perceived form load performance?
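
The trick there, roughly sketched -- PopulateList stands in for whatever slow initialization your form does, and is hypothetical:

private void MainForm_Load(object sender, System.EventArgs e)
{
    // force the form to become visible and paint itself first,
    // so the user sees a window immediately...
    this.Show();
    Application.DoEvents();
    // ...then do the slow startup work
    PopulateList();
}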


Clean Sources Plus

Omar Shahine's Clean Sources is a nifty little right-click app for .NET developers:

This application does one thing. It adds an explorer shell menu to folders that when selected will recursively delete the contents of the bin, obj and setup folders. If you have a .NET project that you wish to share with someone, this is useful to remove the unnecessary stuff from the folder before you zip it up and send it off.

There's one glaring omission here, though. The source control bindings aren't removed! And neither are the local user setting files. I finally had some time, so...

Presenting Clean Sources Plus. It adds a right-click menu to folders that does the following:

  • Removes bin, obj, Debug and Release folders
  • Removes source control bindings from project and solution files
  • Removes user setting files

The result is a very clean, minimal set of .NET solution files, suitable for upload or sharing.

Updated to version 1.1 on 11/10/05 with the following new features:

  • Added second context menu "Clean and Zip Sources" which also zips the entire folder contents into a single zip file. This zip file is placed in the root folder and shares the same name as the folder.
  • The regex patterns used to determine what files and folders to delete are now set in the .config file. This way you can customize what gets deleted without recompiling.
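
The matching side of that is simple enough to sketch; note the key name and code below are my illustration of the approach, not the actual Clean Sources Plus source:

using System;
using System.Configuration;
using System.IO;
using System.Text.RegularExpressions;

class Cleaner
{
    static void Main(string[] args)
    {
        // hypothetical .config entry:
        // <add key="FolderPattern" value="^(bin|obj|Debug|Release)$" />
        Regex folderPattern = new Regex(
            ConfigurationSettings.AppSettings["FolderPattern"],
            RegexOptions.IgnoreCase);
        CleanFolder(new DirectoryInfo(args[0]), folderPattern);
    }

    static void CleanFolder(DirectoryInfo dir, Regex folderPattern)
    {
        foreach (DirectoryInfo sub in dir.GetDirectories())
        {
            if (folderPattern.IsMatch(sub.Name))
                sub.Delete(true);   // true = recursive delete
            else
                CleanFolder(sub, folderPattern);
        }
    }
}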

I tested this with both C# and VB projects, but I'm not 100% sure it works for all other types of projects. It shouldn't break anything, but I may have missed some oddball project type (database? setup?) source bindings. And I only tested against the typical SourceSafe bindings. Anyway, test it out and let me know if there are any issues.*

You may also be interested in TreeTrim, which is based on this project.

* If this app deletes all your source code, then it's Omar's fault.


Despite the incredible slowness and the sparseness of features, this is really, really cool

If you were about to throw out your C++ compilers because of my post on the productivity benefits of managed code and scripting languages, hold your horses. Although managed code is pretty darn fast, sometimes performance still comes first. As Ole Eichorn points out in the comments:

[You said] "given the abandonment of C/C++ for mainstream programming".

Really?

Name any OS which isn't coded in C/C++. I mean, a real one.

Name any Office package which isn't coded in C/C++. I mean, one with measurable market share.

Name any database which isn't coded in C/C++.

Name any X which isn't coded in C/C++. Where X = webserver, application server, financial application, image analysis package, etc.

I think your definition of "mainstream" must be different from mine, because from my point of view EVERY mainstream program is written in C/C++, and nothing is even close.

What I actually said was mainstream programming, not mainstream programs. Consider the total volume of code written in a given year for the PC platform. What percentage of that code will end up in commercial, shipping applications -- much less an operating system? And in which of those applications will performance be the primary consideration? It's an incredibly tiny fraction!

But Ole's comment is still valid, insofar as it goes. I'm not proposing a world where all applications are written in managed code, or Python, or Ruby, or whatever the cool scripting language of the moment happens to be. It just doesn't make sense. To prove that point, here's an amazing quote from a first look at the ill-fated Corel Office for Java beta from way back in 1997:

The pre-beta version of WordPerfect on display is very basic, a few fonts, a few formatting commands -- not like the full-featured Word Processing apps we're used to. Still, it's enough to play around with.

As I mentioned before, it's very slow. All us fast typists will be frustrated, as there seems to be a two second delay between typing each letter and seeing it displayed.

Despite the incredible slowness and the sparseness of features, this is really, really cool and I hope Corel can pull this off quickly. If they can, it should open up the software market -- no longer would software companies be developing for platforms, they would be developing for one big market. Then it would be up to the Operating Systems themselves to attract users by their merits, not by what they can run.

So, er, good luck with that.

This is a stretch even on today's hardware, so I can't even begin to imagine what they were thinking back in 1997, when a 300 MHz CPU was top of the line. Where is Corel Office for Java now? Seriously, where is it? I can't find any mention of it.

And that's why C, C++, and even assembler are still part of a developer's toolkit. I argue that they are of increasingly diminished importance, but I would never propose that every application should be written in .NET.

At least not with a straight face.


The myth of infinite detail: Bilinear vs. Bicubic

Have you ever noticed how, in movies and television, actors can take a crappy, grainy low-res traffic camera picture of a distant automobile and somehow "enhance" the image until they can read the license plate perfectly?

Yeah.

I don't know what kind of crazy infinite-detail fractal images these scriptwriters think we have. Here in the real world, bitmaps don't scale worth a damn. Take this bitmap, for example:

Hello Kitty, biatch!

If we blow that up 300% using the simplest possible algorithm -- a naive nearest neighbor (aka pixel resize) approach -- we get this:

Hello Kitty, enlarged 300% using naive nearest neighbor

Pixel-tastic! But there's a well-known way of interpolating the pixels in the image so it doesn't look quite so bad when upsized -- something called bilinear filtering. Bilinear filtering samples nearby pixels in an effort to guesstimate what the missing pixels would look like in a larger image. Let's enlarge the image 300% using bilinear filtering and see what happens:

Hello Kitty, enlarged 300% using Bilinear Filtering

A bit blurry, yes, but clearly superior to giant chunky pixels.

There's also something called bicubic filtering, which is supposed to be an improvement over bilinear filtering. Video cards have offered bilinear filtering for years, but they don't bother with bicubic filtering to this day -- and that's with millions of transistors to burn. If bicubic is only offered by paint programs, you have to wonder: is it really worth it? Here's the same image enlarged 300% using bicubic filtering:

Hello Kitty, enlarged 300% using Bicubic Filtering

Interesting. It's sharper, but I'm not sure it's all that much better. And there's a bit of an oversharpening or halation effect at some color borders, too.
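
Incidentally, you can generate all three of these enlargements yourself in .NET; System.Drawing exposes each algorithm through the Graphics.InterpolationMode property. A minimal sketch:

using System.Drawing;
using System.Drawing.Drawing2D;

class Scaler
{
    // enlarge a bitmap by the given percentage with the chosen filter:
    // InterpolationMode.NearestNeighbor, .Bilinear, or .Bicubic
    public static Bitmap Scale(Bitmap source, int percent, InterpolationMode mode)
    {
        int w = source.Width * percent / 100;
        int h = source.Height * percent / 100;
        Bitmap result = new Bitmap(w, h);
        using (Graphics g = Graphics.FromImage(result))
        {
            g.InterpolationMode = mode;
            g.DrawImage(source, 0, 0, w, h);
        }
        return result;
    }
}

(There are also HighQualityBilinear and HighQualityBicubic variants, which add prefiltering.)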

There's another image sample at Interpolate This with a writeup that implies bicubic is flat-out superior, but I'm not sure that's the case. Either way you're interpolating*; it's just a question of how sharp you like your simulated pixels to be.

The best solution of all is to move to a vector representation and give up on bitmaps -- and interpolation -- entirely.

* A reader pointed out an interesting algorithm for interpolating low-res images called 2xSAI. Here's a screenshot I generated of a SNES game with 2xSAI interpolation enabled. Compare to the original screenshot.


Are All Programming Languages The Same?

There's a chart in Code Complete that compares the productivity of working in different languages:

Programmers working with high-level languages achieve better productivity and quality than those working with lower-level languages. Languages such as C++, Java, Smalltalk, and Visual Basic have been credited with improving productivity, reliability, and comprehensibility by factors of 5 to 15 over low-level languages such as assembly and C (Brooks 1987, Jones 1998, Boehm 2000). You save time when you don't need to have an awards ceremony every time a C statement does what it's supposed to do. Moreover, higher-level languages are more expressive than lower-level languages. Each line of code says more. The [following table] shows typical ratios of source statements in several high-level languages to the equivalent code in C. A higher ratio means that each line of code in the language listed accomplishes more than does each line of code in C.

Language           Level Relative to C
C                  1
C++                2.5
Fortran            2
Java               2.5
Perl               6
Python             6
Smalltalk          6
MS Visual Basic    4.5

Fair enough. Des Traynor wondered if this table was valid, so he performed a simple test: he provides examples of a tiny "read a file and print it to the console" app in Java, Perl, Python, and Ruby. I'll reprint the smallest version here, which happens to be the Python implementation:

filename = "readAFile.py"
try:
    for line in open(filename, 'r').readlines(): print line
except: print "Problem with %s" % filename

For comparison, here's the VB.NET 2005 version:

Module Module1
    Sub Main()
        Dim filename As String = "readAFile.vb"
        Try
            For Each line As String In System.IO.File.ReadAllLines(filename)
                Console.WriteLine(line)
            Next
        Catch
            Console.WriteLine("Error reading file, or file not found.")
        End Try
    End Sub
End Module

And the C# 2005 version:

class Module1 {
    static void Main(string[] args) {
        string filename = @"readAFile.cs";
        try {
            foreach (string line in System.IO.File.ReadAllLines(filename)) {
                System.Console.WriteLine(line);
            }
        }
        catch {
            System.Console.WriteLine("File not found or error reading file.");
        }
    }
}

I had to edit the C# sample quite a bit to get rid of things that would have made the line count ridiculously large. Most notably, I removed the stupid always-on namespace declaration (don't get me started), added the System prefix to avoid the using, and folded leading curlies into the same line.

Anyway. Including the examples provided on Des' page, that gives us a final line count tally of:

Language       Lines of code
Java           15
C# 2005        8
VB.NET 2005    8
Ruby           6
Perl           5
Python         4

So, even with this trivial little example, there is a wide gap between "scripting" and "non-scripting" languages when it comes to lines of code. There's plenty of existing research to support the claim that scripting languages offer higher productivity, such as the 2000 IEEE paper An Empirical Comparison of Seven Programming Languages (free draft PDF):

Despite these caveats, directly comparing different programming languages can provide meaningful insights. For example, I conclude from the study that Java's memory overhead is still huge compared to C or C++, but its runtime efficiency has become quite acceptable. The scripting languages, however, offer reasonable alternatives to C and C++, even for tasks that must handle fair amounts of computation and data. Their relative runtime and memory-consumption overhead will often be acceptable, and they may offer significant advantages with respect to programmer productivity, at least for small programs like the phonecode problem.

That was written in 2000. Five years later, I am wondering if this distinction between "scripting" and "non-scripting" languages is as meaningful in a .NET world. If you examine the code samples above, you'll notice that most of the overhead in the "non-scripting" languages comes from the cruft associated with classes, functions, and object orientation. The main work loop, if considered alone, is almost identical in every language!
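
To see this, strip the class and Main scaffolding from the C# sample; what's left is nearly a line-for-line match for the Python version:

try {
    foreach (string line in System.IO.File.ReadAllLines(filename)) {
        System.Console.WriteLine(line);
    }
}
catch {
    System.Console.WriteLine("File not found or error reading file.");
}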

So then, if language isn't the real difference, what is? That very same language comparison paper offers this insight:

For all program aspects investigated, the performance variability that derives from differences among programmers of the same language -- as described by the bad-to-good ratios -- is on average as large or larger than the variability found among the different languages.

It's currently all the rage to propose that Ruby is changing the face of software development. I can definitely respect the passion behind this statement, but the actual data doesn't support a magic bullet language effect. Given...

  1. the abandonment of C++ and C for mainstream programming
  2. the huge influence of individual programmer skill
  3. the slow but steady adoption of scripting/dynamic language conventions in Java and .NET
... maybe all modern programming languages really are the same.
