.NET REST Frustrations

I've had a frustrating few days, and I'm going to vent about it a little bit here, because the experience has been something that I think many programmers have to deal with from time to time.

This weekend I found myself with 2 evenings home by myself, while my wife works her new 2nd shift job. While I miss having my evenings with her, part of me had been looking forward to having some evenings all to myself, to really dig into some extra-curricular programming projects.

I'm in an unusual position right now. Unusual for myself anyway: I have a lot of ideas of things to work on. Some are for myself, and some are for work, to get a little extra satisfaction out of the job, and to make things more tolerable during the 9 to 5 when I'm on the hook for some less-than-enjoyable tasks. One thing most all of these ideas have in common is that they revolve around creating web services. Which is something I've never done before.

A few weeks back I wrote a meandering rumination, documenting my thoughts as I ventured into the web services world... Specifically, the REST world. I found myself leaning toward a particular approach, and this weekend I had the time and opportunity to attempt to start down that road. Unfortunately, I hit some roadblocks. OpenRasta is just not going to work for me right now, I'm afraid. If I'm lucky that will change in the near future, but for now it's the state of things. So today I went back to the drawing board. Starting with whether I should persevere on the goal of using REST.

I didn't spend much time back at the very beginning here. I know that it is fairly easy to whip up an RPC-style web service with WCF. But I also know that RPC is (for most use cases) becoming a far less appropriate option. I've been doing a lot of reading on REST, and it is starting to make some fundamental sense to me. It's a paradigm I *want* to work in because it feels natural in so many ways. But I look at all the resources on REST in WCF and think to myself that it really just seems like a lot more work than it needs to be. Not to mention the fact that the whole point of WCF is to be protocol-agnostic, which is of course the opposite of REST, which embraces the HTTP protocol. Using WCF for REST seems like a frivolous waste of resources: a framework for its own sake, rather than for any value derived from the generality. So, I really wanted to avoid WCF for the learning curve and for the design conflict. But what are my other options?

I've read about how awesomely intuitive and simple the Sinatra framework is for REST, but I don't know Ruby, and really have zero comfort with Linux from a programming, admin, or server standpoint. So Sinatra really just has too many hurdles at this point. That's a lot of learning I'd have to do before I can make a simple hello world web service. Not sure I am willing to wait so long before I get some ROI. I'll learn Ruby properly, eventually. But preferably not when I'm also trying to learn 4 other things from scratch.

There are a couple of .NET frameworks around that claim to be heavily inspired by Sinatra. First there was Martin, and now Nancy. Unfortunately, I can't really find any recent activity on or around Martin. Maybe it was just a whim that didn't get the necessary care and investment.

Nancy looks more promising. It looks like a really nice start. I hope it will be a viable platform in the future, and I'll be keeping an eye on it. But right now it's brand new and not ready for production usage. Which puts it out of the running for all but my personal projects. It seems like a train I'd want to be on, and I'd love to contribute to the effort. But well, here I've got a couple more hurdles. Once again, I'm a web services n00b. I have my sense of design and elegance, and not much else. No real appreciation for the realities and complexities of the problem space or the underlying technologies. This is a journey I'm now looking to embark on for the first time. I'm far from ready to try blazing a trail for others. The second hurdle is that it's hosted on Github, and alas I'm still a Git n00b as well. I have successfully installed Git on one machine, and haven't actually successfully used it a single time yet.

What's left for REST? OpenRasta again. Which I've already established won't work. But why? It looks so... right. I can't just give up, can I? OpenRasta looks really nice in a lot of the same ways as Nancy, except that it's a far more mature product. And the author, Sebastien Lambla, has a ton of OSS experience. It's on Github too, like Nancy, but given a stable branch/build that shouldn't be a problem. But here's where things go bad. The stable binaries seem to have vanished from the place they are supposed to be, at ohloh.net. And the two ostensibly stable branches are currently in a state of... instability... as in they won't build. None of which contributes to a n00b-friendly situation. So here we are indeed back to the same obstacles I have with Nancy.

It's important for me to say that I don't begrudge Sebastien or his OSS compatriots for this situation. He's got a lot of irons in the fire with his OSS efforts, and a lot of people with more important complaints than mine are looking for their issues to be attended to. But it makes me sad, because this seemed like my best bet, and I feel like it's hanging there taunting me just outside my reach.

So... I'm coming out of the weekend very discouraged. I have all these things I want to do, and I feel like I'm being foiled at every turn by a combination of my own ignorance and unfortunate circumstance. My ideas are frozen until I can overcome some of these challenges, but right now I feel like I really need the motivation of progress on these ideas to propel me through the work of ramping up on... well, on anything. I have enough other things to do with my time that it's a challenge to prioritize learning new tech if I won't also be making progress toward producing something.

I feel like I have to make a choice between doing things the way I want and having to spend a long time digging through the muck of learning unfamiliar territory compounded with the pitfalls of immature tools, or I can compromise my design sense and go with WCF in the interest of making some material progress on my ideas.

At least WCF is a place to start where there is an established "on ramp" with controlled unknowns. I could get some preliminary, controlled exposure to the problem space and the technology stack while learning WCF, and still preserve a safety net of sorts, in the form of MS's mature and thorough framework support. Then once I have cleared those learning curves, I can revisit the question and do things the way I'd prefer to next time. So alas, that is probably what I will do.

It's just very discouraging to end up back at the place I started. The one place I didn't want to be. But that's life. Sometimes you can't fight the constraints. Sometimes you have to accept them and just do what you have to do to "ship", even if it means choosing a technology you'd rather not rely on.

Taking IoC beyond DI with Autofac Part 2: Relationships

In my last post I began to discuss the applications of Autofac as a tool for accomplishing true Inversion of Control, beyond just simple Dependency Injection. Specifically, last time I focused on creation strategies and lifetime control.

This week I want to talk a little more about lifetime control, and a lot more about categories of dependency, also known as relationship types. Nicholas Blumhardt, the principal author of Autofac, has a great blog post on what he calls “the relationship zoo” which I think goes a long way toward covering this space. I was originally going to do something similar, but his post is far more authoritative, and I’m quite certain I couldn’t do better. So instead, I'll abstain from the explanation and code samples and stick to analysis and rumination. So go read his post, then please do come back here and I’ll expand on it with some of my own thoughts.



The reality is that I take issue with some of the dependency abstractions provided by Autofac. It can be quite dangerous to your design to rely on some of them without very good reason. Used properly and prudently, they can absolutely address some particularly painful problems, especially when the dependencies consist of code you don't own and can't change. But it’s also easy to let them creep in wherever they appear to be convenient, and severely corrupt your design and architecture by doing so.

Let me start with the warnings, and then I’ll move on to extoll some virtues that I think Nicholas neglected to identify.

The first relationship type to be wary of is Lazy<T>. The intended purpose, as Nicholas explains, is to avoid expensive operations or construction until or unless it’s necessary. The idea of a lazy object is an old one. It’s a pattern that has been around for a while. My opposition to the usage of this relationship type is primarily that the need for laziness is usually a function not of the usage by the dependent class, but rather of the implementation of the module that is depended upon. Where possible, the consumer should be ignorant of implementation details of its dependencies. Let’s recognize that expensive operations are rarely truly transparent or ignorable, but the fact remains that the responsibility for this situation lies with the module being depended on, not on the consumer.
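To make that concrete, here's a minimal sketch of the consumer's side of a Lazy<T> relationship (the report-building names are invented; Autofac will satisfy a Lazy<T> constructor parameter automatically whenever T itself is registered):

public interface IReportBuilder
{
    void Build();
}

public class ReportScreen
{
    private readonly Lazy<IReportBuilder> _reports;

    public ReportScreen(Lazy<IReportBuilder> reports)
    {
        _reports = reports;   // nothing expensive has happened yet
    }

    public void Print()
    {
        // The underlying IReportBuilder is only constructed on first access.
        _reports.Value.Build();
    }
}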

I strongly believe that if the code of the module that will be resolved for this dependency is under your control, then it behooves you to wrap the laziness around this functionality elsewhere... either building it into the implementation, or wrapping it with some sort of facade. Where Lazy<T> comes in handy is when you don’t have control over this code, and the class poorly encapsulates its functionality such that it can’t sufficiently be wrapped in a facade. At that point the consumer cannot pretend to ignore the situation, and may benefit from delegating the responsibility for the laziness to the container.

I’ll attach my second warning to the Func<T> relationship. I’m wary of the plain Func<T> because it overlaps a great deal with Lazy<T>. It shares the same issues as Lazy<T> while adding the sin of looking suspiciously like a service locator. Service locators can be dangerous because they subvert the goal of inverting control. A service locator hands control back to the consumer by saying, “just call when you need X”, rather than handing off the dependency and saying “I know you need X so here, use this for that”. This is very rarely the appropriate way of expressing the relationship. The exception would be in the case that the consumer knows for certain that it will require multiple instances of the service it depends on.

Let’s spin things back to the positive by looking at Func<X, T> and the like. How does adding the extra generic arguments change this from a smell to proper IoC? It expresses a very particular type of dependency. Specifically, it says that this class depends on at least one instance of T, but which instance or instances are needed is a function of a few arguments which won’t be known until each instance is actually needed. This is useful if you have a service which must be constructed for each use and requires some primitive types to guide its behavior, such as an encryption object which requires a key at construction. Or if you have a few different implementations of a service which are mapped to different run-time states, such as a set of logging mechanisms for different error severities.
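Here's a rough sketch of the encryption example, with invented names. The consumer asks for a Func<string, IEncryptor>, and Autofac matches the string argument to the key parameter of the implementation's constructor, filling in any other dependencies on its own:

public interface IEncryptor
{
    byte[] Encrypt(byte[] data);
}

public class Encryptor : IEncryptor
{
    private readonly string _key;

    public Encryptor(string key) { _key = key; }

    public byte[] Encrypt(byte[] data) { /* encryption elided */ return data; }
}

public class MessageSender
{
    private readonly Func<string, IEncryptor> _encryptorFactory;

    public MessageSender(Func<string, IEncryptor> encryptorFactory)
    {
        _encryptorFactory = encryptorFactory;
    }

    public void Send(byte[] payload, string key)
    {
        // Each call constructs a fresh IEncryptor around the supplied key.
        var encryptor = _encryptorFactory(key);
        encryptor.Encrypt(payload);
    }
}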

The IEnumerable<T> relationship is similar to this last scenario in that it offers a way to say “I depend on all implementations of T, no matter what they are”. This is probably a rarer scenario. Usually a service will have one key implementation, or a couple with very specific and differing purposes which will be used separately in different places. The most likely way for an inclusive but generalized need like this to arise is in an add-on or plug-in scenario. And in that case, you’re likely going to need some up-front processing to get things loaded up before the implementations can be passed off as dependencies to other objects.
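A quick, hypothetical sketch of what that plug-in flavor looks like:

public interface IExportPlugin
{
    string Name { get; }
    void Export(string path);
}

public class ExportMenu
{
    private readonly IEnumerable<IExportPlugin> _plugins;

    // "I depend on every registered IExportPlugin, whatever they happen to be."
    public ExportMenu(IEnumerable<IExportPlugin> plugins)
    {
        _plugins = plugins;
    }
}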

I should note that it is probably not all that unusual for IEnumerable arguments to show up in constructors. But more often than not, this will be in data classes which will be instantiated directly, or a single step removed via a factory method, rather than resolved by the container. These aren’t truly “services” and are very unlikely to be registered with the IoC container. The factory may be a service, but what it creates in this case is more of a data structure than anything. Data structures mean state, and usually highly contextual state at that. Autofac is more context-sensitive than many IoC containers, but even it has its limits. With data structures, usually the best solution is either a direct constructor call, or a light factory.

One more small step along the path is Meta<T, M>. This is a relationship type that specifies a dependency not only on a service of type T, but on some piece of metadata of type M about the module that will be provided at runtime. This metadata may be used to decide whether or not to go through with making use of the service, or how to do it. In fact, metadata is a great way to handle the special considerations a consuming object may need to make for a lazy service implementation involving a long-running operation. Maybe 90% of the time, the app can simply halt for an operation, but for certain implementations of the service, it’s more prudent to display a friendly “please wait” message and a progress bar. Attaching metadata at registration is a great way to enable these types of decisions intelligently without hard-coding a universal expectation one way or the other.
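Here's roughly how that "please wait" decision might look, using the string-keyed Meta<T> form (the typed Meta<T, M> variant works along the same lines). The search-related names and the "LongRunning" key are inventions of mine:

public interface ISearchProvider
{
    void Find(string term);
}

public class ArchiveSearch : ISearchProvider
{
    public void Find(string term) { /* slow, disk-bound search elided */ }
}

public class SearchScreen
{
    private readonly Meta<ISearchProvider> _search;

    public SearchScreen(Meta<ISearchProvider> search)
    {
        _search = search;
    }

    public void Run(string term)
    {
        // The metadata attached at registration travels with the service.
        if ((bool)_search.Metadata["LongRunning"])
            ShowPleaseWait();

        _search.Value.Find(term);
    }

    private void ShowPleaseWait() { /* progress UI elided */ }
}

// And in the bootstrapper, the hint gets attached at registration:
builder.RegisterType<ArchiveSearch>()
       .As<ISearchProvider>()
       .WithMetadata("LongRunning", true);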

The final relationship type to address is Owned<T>. This one can be quite handy. At first blush it seems like this might violate the same principles as Lazy<T>, but if you think about it, they are actually quite different. Owned<T> indicates that the object expects to be solely responsible for the fate of the dependency passed to it. This tells the programmer, “I’m going to use this up, and it won’t be any good when I’m done with it, so just clean it up afterward.” Believe it or not, this fits in perfectly with the recommended implementation pattern for the .NET IDisposable interface. That is, that at some point, some object takes ownership of the resources, and responsibility for calling IDisposable, absolving all other transient handlers of the same. Ownership is sort of the dual, or inverse, of a constructor dependency. The constructor dependency says “I know I’m going to need this object”, and the ownership claim says “when I’m done with it, it can be tossed out.” And Autofac happily obliges, releasing the dependencies when the consumer itself is released, calling IDisposable.Dispose as appropriate.
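A small, hypothetical sketch of what that looks like in practice:

public interface IImportReader : IDisposable
{
    string ReadBatch();
}

public class ImportJob
{
    private readonly Owned<IImportReader> _reader;

    public ImportJob(Owned<IImportReader> reader)
    {
        _reader = reader;   // "this one is mine, and mine alone"
    }

    public void Run()
    {
        var batch = _reader.Value.ReadBatch();
        // ... process the batch ...

        // Disposing the Owned<T> ends its little lifetime scope, and Autofac
        // disposes the IImportReader along with it.
        _reader.Dispose();
    }
}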

After a twitter conversation with Nicholas himself, it became obvious to me that Owned<T> is probably the better solution to my qualms about InstancePerLifetimeScope. By establishing that the consumer “owns” its dependencies, we have essentially established exactly the limited and definite lifetime context that I was asking for! Behind the scenes, a lifetime scope is spun up when the object is instantiated, with nested resolution rules being applied as a matter of course. And when the object is released, then so is the lifetime scope and anything that was resolved as a new instance in that scope. However, we do have a symmetrical limitation here. Both the creation and the destruction of the context are tied to this object. Once the context is created, it can’t and/or shouldn’t be shared with anything either above or adjacent to this object in the dependency graph, or the deterministic disposal Autofac prides itself on will be subverted and become unreliable.

In these past two posts, I’ve covered the whole continuum of dependency creation, lifecycle, and relationship strategies that go beyond simple Dependency Injection to fill out the breadth of what Inversion of Control really means. And they’re all available via Autofac to be handled (for the most part) separately from the implementation of the consumer, and without writing boilerplate factory classes or resource management classes to make it work. I hope that in reading these posts some people may see that IoC is a broad space of patterns and solutions, and that IoC containers are powerful and useful far beyond the naive use and even abuse that people put them to for simple DI.

Taking IoC beyond DI with Autofac Part 1: Lifecycle Control

People who have to listen to me talk about programming know that I’m a big proponent of Inversion of Control (IoC) containers and what they can do to clean up your code. Most people get introduced to IoC containers via Dependency Injection (DI). DI is another great way to clean up your code. It’s hard to argue against it, really. It decouples your code in a big way. Not only does this make it more testable, but because it aids in separating concerns/responsibilities, this also makes it easier for you to track down bugs when they do show up. But people rightly point out that you don’t need an IoC container to use DI and get these benefits.

I’m usually both happy and sad to hear that argument. On the positive side, it means that people are acknowledging the benefits of DI, which is great. The more people are using DI, the fewer god classes full of spaghetti code there are out there for me to unearth in future maintenance efforts. Another reason I’m happy to hear that argument is because it means that people aren’t confusing the means for the end. DI is a good thing, for the reasons I established above, not because, as some people seem to think, IoC containers are good and DI is what IoC containers do.

That last sentence there leads into the reason that I’m sad to hear people dismiss IoC containers as being unnecessary for DI. The problem is, DI isn’t the only benefit of IoC containers. DI isn’t what IoC containers do. IoC containers, as their name would indicate, invert control. They take a number of concerns that have traditionally been assigned to the class under consideration, and extract them out to the context in which that class is used. That context may be the immediate consumers of the class, or it may be coordinating code, infrastructure, data access, or any number of other locations in the application. But the point is that the responsibilities are removed from the class itself and given to other classes whose business it is to know what should be created, when, with what initialization, and how it should be disposed of.

A good IoC container does more than just wire up constructor dependencies. It goes beyond that and lives up to the breadth of this definition of IoC. And that is why I laud and evangelize the glories of IoC containers.

Let’s take a look at one particular container that I’ve come to know and love: Autofac. It’s an amazing framework that offers an answer to nearly everything that the principles of IoC ask of it. Autofac has features for lifecycle control, object ownership, and deterministic disposal. It has features for nested scoping, contextual dependency resolution, and varied construction mechanisms. These are all concerns that are a function not of the consumer of a dependency, but of the cloud of code and functionality that surrounds it, of the nature and design of your application as a composition. And Autofac gives you ways to deal with them on those terms.

Autofac boasts strong support for robust lifecycle control. In the project wiki you’ll find it laid out under the topic “Deterministic Disposal”, but it’s about creation as much as it’s about disposal. When you register a module with an Autofac container, you have the opportunity to specify a scope strategy. This strategy will determine whether a new instance is created upon the request, or whether an existing one is pulled from the container. Furthermore, it will also determine when references to the instance or instances are released and Dispose called on IDisposables. I’ll be doing some explaining, but if you want to do your own reading on the available strategies, you can do so here: http://code.google.com/p/autofac/wiki/InstanceScope

The two simple lifetime scopes that everyone tends to be conceptually familiar with are found in Autofac’s SingleInstance and InstancePerDependency scopes. The former is roughly equivalent to the function of a singleton pattern implementation, while the latter corresponds to a factory pattern implementation. Autofac goes well beyond this, however, and gives you two more scoping strategies that let you manage creation and disposal of your components in a more nuanced and more powerful way.

Both of the two more nuanced scope strategies depend on Autofac’s support for “nested containers”. A nested container is essentially a scoping mechanism, similar to a method or class definition, or a using block. Nested containers are useful for establishing the architectural layer boundaries of your application. For example, in a desktop application you may have many windows coming in and out of existence, all operating on the same domain objects, persisting them via the same handful of repository classes. These windows may be created in nested container contexts that are spun up as needed, and disposed when the windows are closed. Some objects will be created new each time this happens, while others are unique and shared across the entire UI. The nested container is what allows Autofac to make the appropriate distinction.

Imagine you are writing a file diff’ing application. You have to show two documents at once in side-by-side windows. They are the same thing, in terms of functionality and data, and so could just be two different instances of the same object.... But they will share some dependencies, and have references to their very own copies of certain others.

Let’s pick out a few pieces of this puzzle and tie them to Autofac’s features. You will probably have a file access service that allows you to operate on the files that you have loaded. There’s no reason to have more than a single copy of this in the entire app, so it can effectively be a singleton. The way you would express this to Autofac is via the registration.

Given the class and interface definition:
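Something along these lines will do (the file-reading behavior here is just a stand-in):

public interface IFileAccess
{
    string ReadAllText(string path);
}

public class FileAccess : IFileAccess
{
    public string ReadAllText(string path)
    {
        return System.IO.File.ReadAllText(path);
    }
}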

You would register the component as a singleton like this:
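Continuing the sketch from above:

var builder = new ContainerBuilder();

builder.RegisterType<FileAccess>()
       .As<IFileAccess>()
       .SingleInstance();

var container = builder.Build();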

The run-time behavior you would see based on this registration is that only one instance of the FileAccess component will ever be produced by the container. Subsequent requests will just return the one that’s already constructed.

The next layer on top of that is the UI. You decide that you may want to be able to have multiple comparison windows open at once, without having to run separate instances of the app. But each of those windows is still essentially a full instance of your app's interface. They’re not exactly singletons, but they should be unique within their separate stacks. Whatever sub-context they are in, there should be only one.

Given the class and interface definition:
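Again, a representative stand-in (the window's real responsibilities are elided):

public interface IComparisonWindow
{
    void Show();
}

public class ComparisonWindow : IComparisonWindow
{
    private readonly IFileAccess _fileAccess;

    public ComparisonWindow(IFileAccess fileAccess)
    {
        _fileAccess = fileAccess;
    }

    public void Show() { /* window plumbing elided */ }
}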

You would register the component using the InstancePerMatchingLifetimeScope strategy, like this:
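The "DocumentPane" tag here is an arbitrary name of my choosing; the important part is that both panes' scopes will carry the same tag:

builder.RegisterType<ComparisonWindow>()
       .As<IComparisonWindow>()
       .InstancePerMatchingLifetimeScope("DocumentPane");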

One weakness of Autofac for this use case is that in order to establish the proper resolution contexts, you have to refer directly to the container. In this case it’s probably okay, since you can be fairly certain you’ll only ever need a Left context and a Right context. So you can probably create the lifetime scopes up front, store them away somewhere, and explicitly resolve these objects from the separate contexts when needed. A sample setup for this can be seen below.
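Roughly, and with invented variable names:

var container = builder.Build();

// One tagged scope per pane, created up front and stored away.
var leftContainer = container.BeginLifetimeScope("DocumentPane");
var rightContainer = container.BeginLifetimeScope("DocumentPane");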


Then you’d need to make sure that the LeftContainer and RightContainer were used explicitly to resolve the left and right ComparisonWindow components.

This works for us in this situation. But it’s not at all difficult to imagine a scenario, maybe even in this same app, where the contexts aren’t predetermined and static. For example, you may want to have a worker thread pool, where each thread has its own resolution context. In fact this is a situation addressed explicitly in the Autofac wiki. Even there, it seems to be accepted that an explicit container reference in the thread pool is necessary. It doesn’t seem like there’s a great solution to this challenge at this time, though I will surely be keeping an eye out for one. This is messy, concern-leaking infrastructure code that I would really prefer not to have to write.

There’s another scoping type that’s related to this one. In fact its use is a bit simpler. This is the InstancePerLifetimeScope strategy. Note the subtle lack of the “Matching” adjective in the name. What this indicates is that the context is implied rather than explicit. The behavior specified by this strategy is that at most one instance will be created within the resolution context where the resolution happens, at whatever level of nesting that happens to be. Functionally, this differs from InstancePerMatchingLifetimeScope in that it doesn’t search the context stack for one particular context in which to do the resolution.

This strategy can be effective when you have a set architecture with a trivial tree structure. Well-defined layering is crucial. All the leaf nodes of your context tree need to be at the same depth in order to be certain when a new instance will be created and when not. For example, a website where you have a base layer for the whole web app, and a leaf layer nested within for each individual web request. In our diffing app, if we can be certain that we need no more deeply nested containers beyond our LeftContainer and our RightContainer, and all significant service resolution will happen at those layers, then we may have a use for this strategy for the dependencies of our Left and Right windows and controllers.

The registration for this strategy is very similar to the others. Given a class and interface definition such as this:
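For example, a per-pane controller (names invented):

public interface IDiffController
{
    void Compare();
}

public class DiffController : IDiffController
{
    public void Compare() { /* diffing logic elided */ }
}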

The registration would look like this:
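Continuing the sketch:

builder.RegisterType<DiffController>()
       .As<IDiffController>()
       .InstancePerLifetimeScope();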

The final scoping strategy is InstancePerDependency. As noted before, the behavior is roughly equivalent to an implementation of a factory pattern. Every resolution request for a service registered as InstancePerDependency will result in the creation of a new instance. In our diffing app, we may find use for this strategy with something like an alert dialog. There’s no need to keep an alert around when it’s not being shown, and in fact it should almost certainly *not* carry any state from one alert to the next.

So given a class and interface such as this:
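A representative stand-in:

public interface IAlertDialog
{
    void Show(string message);
}

public class AlertDialog : IAlertDialog
{
    public void Show(string message) { /* display logic elided */ }
}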

The registration would look like this:
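Continuing the sketch:

builder.RegisterType<AlertDialog>()
       .As<IAlertDialog>()
       .InstancePerDependency();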

That covers most all of Autofac’s lifecycle control functionality. There are four strategies available: SingleInstance which approximates Singleton, InstancePerDependency which approximates Factory, and the more subtle and, honestly, difficult to use, InstancePerLifetimeScope and InstancePerMatchingLifetimeScope, which are heavily contextual. I really wish that these last two were more manageable and directable. If they were, I think that Autofac could claim to easily address most any lifecycle control need with very little overhead and requiring few concessions to the framework. This would be a very noble goal. But as it is, their behavior will tend to be circumstantial rather than controlled and intentional. And the only hope of improving that situation lies in taking great pains to organize your design to account for the container’s shortcomings and then going on to break the rule of not referencing the container.

Despite these shortcomings, I believe there are many benefits to be found in relying on Autofac for lifecycle control where it’s possible and not overly problematic. Certainly I’ve saved myself some headaches in doing so. And we haven’t even begun to explore the dependency relationship patterns that Autofac supports out of the box. We'll dive into those next time!

A Noob's Scattered Thoughts on REST

Note: I hope you'll bear with me as I meander through my still very disorganized thoughts on the topic. This post is very much for myself, with the goal of distilling the information floating around in my head into something more concrete and which I can actually make use of...

I have a few projects on the horizon, both at work and on my own, which will involve web services in some form or another. I have to admit, I've never been much of a "web guy" before. My career to this point hasn't demanded that this change, but it looks like it's time for me to move off the desktop at least part-time. So I've dived into web services.

I develop for .NET at work. So while I may explore something else for my personal projects, for work I have to focus on .NET frameworks. Of course REST is all the rage these days, so it's obligatory that I research that. So I've read a number of blog posts describing what REST is about. My assessment of REST is that it seems like a nice idiomatic design philosophy for a web service that deals primarily with persistent data entities and operations over them.

It seems clear to me though that there must be problems which might be addressed via a web service which aren't necessarily natural to model using REST. Of course the fallback, if one were to decide that the barrier is too great, would be to use some sort of RPC style. RPC comes naturally to us developers, of course, because it's essentially just a web-enabled version of the idioms we use ubiquitously in code.

My own feelings about RPC as compared to REST are less critical than many. I understand the philosophical imperative to maintain the webbishness of services operating over the web. However, I also hold very strong to the opinion that solutions should be created using metaphors that lend themselves naturally to the problem space. REST, as defended by many, seems to me to be less about the problem space than it is about the mechanism the solution is built on.

However, if REST can be said to consist of two pillars, resource-orientation and mapping operations to the HTTP verbs, then it seems the latter is really the limiting part. After all, a resource is not so different from the "objects" that permeate our code as programmers. But imagine if, in your object-oriented design, you were allowed only 4 methods per object, and each of them had to fit one of four very specific patterns. That feels arbitrarily limiting, doesn't it? It does for me.

Now, having said that, I do see value in keeping things simple. The web is already a high level abstraction. And building convoluted metaphors on top of it that don't map easily to the mechanisms that underlie it can cause a lot of unnecessary headaches. I also believe that it may not be unfair to compare RPC-heavy designs to the old style of procedural programming. That is, that they have all the nasty kinds of coupling and inertia, and few of the good kinds. Especially when the data entities involved are mutable.

That last point hits home hard for me, though. And I think it will end up strongly informing how my designs play out. The reason for this has largely to do with how my overall coding style has evolved over the past couple of years. I've found lately that, with the exception of DTOs and the primary domain objects in my solutions, a great many of the objects I work with tend to be immutable. And the algorithmic bits that operate on them tend to come in nuggets of functionality that are highly composable. This strikes me as being fairly compatible with REST's resource-oriented design.

Another factor I have to consider is the frameworks that are available to me. Comparing what I can find on the web about WCF and ASP.NET, the primary non-REST web service frameworks, and OpenRasta.... I have to say that the code and configuration of OpenRasta solutions seems to be much simpler, more elegant, and clearer of intent. This appeals to me greatly. It seems like it would be much easier not only to spin up something from scratch into a working solution, but also to evolve smoothly from my first fumbling attempts through to a final product that still has clarity of structure and intent. If I've learned anything in my career so far, it's the importance of that.

So my initial forays, at least, will likely be based in OpenRasta. If it turns out that OpenRasta can't give me what I need, or that there's too much friction, I'll look for help of course. But I also won't be afraid to try to weave some RPC into my design if it's called for.

Clean Injection of Individual Settings Values

The Setup

Today I again encountered a challenge that I have dealt with numerous times while working on the product I develop for my employer. It's not an insurmountable challenge. Honestly, it's not even all that challenging. But it has been an irritant to me in the past because all the solutions that I have come up with felt unsatisfactory in some way.

That challenge is the injection of simple settings values into the classes that need to refer to them. By simple I mean single-value settings embodied by an integer, a string, or a floating-point, for example. This seems like a simple problem. And honestly, I thought this was going to be a short and sweet post, until I realized that there is value in delineating all the dead ends I've followed on the road to my current favored solution. As you'll see, I've put a fair amount of thought and analysis into it.

Being a hip programmer, I use an IoC container to constructor-inject all my complex dependencies. And my IoC container of choice, Autofac, is only too happy to auto-wire these so I don't have to worry about them. But mixed in among these are the simple settings values that guide the functionality of certain classes. A component/service model doesn't really apply to these values. No one string value is implementing the "string" service. The values can't be easily auto-wired because unlike the complex classes there is certain to be more than one important value floating around for the primitive type in question.

Our app processes images. Large files, and large volumes. And worse, the working sets are very large, so we can't handle things piecemeal. We have a serious need in a few different places for disk caching of "active" entities that are still very much in flux. Having multiple caching points necessarily means that there are several parameters that may need to be tweaked to keep things optimized. For each cache we want to be able to set the disk location for the cache and the in-memory entity limit, just to start.

Taking a simple case with two cache points, we have two string settings and two integer settings. So already we are in a position where in order to inject the values, we'll need to do some sort of "switching". Some run-time decision of which string and which integer go where.

Pitfalls and Red Herrings

As I noted in my opening, I have solved this in several ways in the past. One is to hook into the IoC via Autofac's "OnPreparing" event, where I can supply values for particular constructor parameters. This is nice, because it means I can avoid setter injection. But it complicates the IoC bootstrapper by adding exceptions and special handling for particular classes. Just as undesirable, it couples the IoC bootstrapper directly to the settings mechanism.

What about setter injection? Autofac provides a post-construction OnActivated event that is perfect for setter injection, but this is subject to the exact same disadvantages as the pre-construction event. We could leave the setters alone and let some other object fill them in, but that leaves us with a couple different problems. First, there's just as much coupling; it's just outside the IoC, which may or may not be a marginal improvement depending on how it's implemented. If you end up with some class that must be aware of both the app settings mechanism and the class(es) that receive those settings then this is really not much of an improvement.

But beyond that, refraining from providing the values until after the components are obtained is undesirable for a few more reasons. First and foremost, it means that your services will exist in a state of incomplete initialization. The risk of getting hold of an incompletely initialized service makes calling code more brittle. And protecting against the possibility makes it more complex. Furthermore, setter injection for these particular types of values is undesirable because it implies that the values are free to change after initialization. The truth is that the last thing you want is for some errant code to change the cache location on disk after a bunch of files have been stored there. And putting in protection against such post-initialization changes is pathologically unintuitive: it subverts the very nature and purpose of a setter.

So we've established that these direct injection routes are problematic in a number of ways. Let's move on to indirect injection. What does that mean? Basically it means putting a provider object in the middle. Our classes can take a dependency on the setting provider, which can wrap the settings mechanism itself, or act as a facade for a bundle of different mechanisms.

The option that at first appears simplest is to have a single settings provider object through which all the app's settings can be accessed. The classes can all depend on this object, which the IoC can provide with a singleton lifecycle if we desire, for maximum consistency. But now what we have essentially done is created a service locator for settings. This is another thing that's good to avoid for two reasons. For one, it creates a huge common coupling point, and for two, it violates "tell, don't ask". Why should my dependent class have to depend on a generic interface and worry about asking for the appropriate thing, when all it cares about is just that one thing?

This is especially dangerous if the app-side interface to your settings keeps them grouped just as they are in the user-side (i.e. the config file), as the built-in .NET config mechanism is wont to do. The needs of the user for the purpose of managing settings individually or en masse are vastly different than the needs of the application whose behavior is driven by those settings. While a user likely thinks the most intuitive arrangement is for all 15 of the paths to be bundled together, it's highly unlikely that any particular class in the application is going to care about more than one or two individual paths. And if the class doesn't need them, then they shouldn't be offered to it.

A Light in the Dark

So where do we go from here? If you can believe it after all this meandering and rambling, we're very close. From here we take a little tip from the DDD community: eschew primitives. If you think back to the beginning, the whole problem centers on the fact that primitives are just too darn generic. The type doesn't mean something specific enough for it to be an indicator of what exactly the dependency is. How do we fix this? Encapsulate the individual setting in a type specific to that need. Given that the explicit purpose of these classes will be to provide particular settings to the classes that need them, it is appropriate for these to couple to the configuration mechanism, whatever that may be, and more importantly, encapsulate it in useful bite-size chunks. And because the providers will themselves be injected where needed, the coupling is one-way, one level, down the layer hierarchy, which is arguably the best kind of coupling.

Show Me The Code

Enough talking. Now that I've set the stage, here's some code to furnish and light it.

First, the setting providers. As you can see, they tend to be nice and short and sweet.
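The snippets that follow are a representative sketch rather than production code; the class names, setting keys, and the use of ConfigurationManager are my own assumptions.

using System.Configuration;

// Each provider encapsulates exactly one setting.
public class PreviewCacheLocationSetting
{
    public string Value
    {
        get { return ConfigurationManager.AppSettings["PreviewCacheLocation"]; }
    }
}

public class PreviewCacheEntityLimitSetting
{
    public int Value
    {
        get { return int.Parse(ConfigurationManager.AppSettings["PreviewCacheEntityLimit"]); }
    }
}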

Next, the caches that depend on them. Note how the setting providers are used in the constructors.
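Again, sketched with invented names:

public class PreviewCache
{
    private readonly string _location;
    private readonly int _entityLimit;

    // The providers are ordinary constructor dependencies, so Autofac
    // auto-wires them like any other service.
    public PreviewCache(PreviewCacheLocationSetting location,
                        PreviewCacheEntityLimitSetting entityLimit)
    {
        _location = location.Value;
        _entityLimit = entityLimit.Value;
    }

    // ... caching behavior elided ...
}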

Finally, I'll show just how easy it can be to wire these up. If you bundle all your setting providers in one namespace, you can even safely auto-register them all in one fell swoop!
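Assuming the usual using Autofac; and using System.Reflection; directives, and a namespace name of my own invention:

var builder = new ContainerBuilder();

// Auto-register every setting provider living in its dedicated namespace.
builder.RegisterAssemblyTypes(Assembly.GetExecutingAssembly())
       .Where(t => t.Namespace == "ImageApp.Settings");

builder.RegisterType<PreviewCache>().SingleInstance();

var container = builder.Build();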

Objections?

There are a few things that I can anticipate people would object to. One is the potential for proliferation of tiny classes. I don't see this as a bad thing at all. I think it's fairly well established that small classes, and methods, with laser-focused responsibilities are far easier to maintain, evolve, and comprehend. I can say from personal experience that I am utterly convinced of this. And if anecdote isn't good enough for you, I'll add an appeal to authority to it =) Go read up on what the most respected programmers out there today are saying, and you'll see them express the same sentiment, and justify it very well.

Another thing I expect people to object to is that when taken as a whole, this pile of small classes looks like a bit of a heavy solution. And it is heavy in this context, where none of the actual app code is surrounding it. But nestle it inside a 50K line desktop app with hundreds of classes and it will start to look a lot better. For one, those classes and their namespace create sort of a bubble. It's a codespace that has boundaries and purpose. You know what's inside it, and you know what's not. It's a goal-oriented mental anchor to latch onto while you code, and that's a darn useful thing.

Community Service

As I see it, programming questions tend to take one of two forms:
  1. How do I do X?
  2. How do I do X in a clean, maintainable, and elegant way?
Matt Gemmell has a great response to questions of the first form. I have nothing to add to his excellent post on that topic.

But I think it's crucially important to make a distinction between the two forms. The second form is a far more difficult question, asked far less commonly, and to which good answers are even rarer. When questions of this form are asked in forums, people are usually given a link to a page describing a pattern, or if they are really "lucky" a page with a sample implementation of that pattern. To be fair, patterns usually are the answer to these types of questions. But what we find written up on the web, or even in books, is most commonly pathetically oversimplified, without context, and often even without guidance on what support patterns are necessary to obtain the benefit or when alternatives may be preferable.

Essentially, most developers are left with no choice but to apply Matt's answer to form 1, to questions of form 2, in a much less information-rich environment. I contend that, while it may be one of those proverbial activities that "build character", it is ultimately more likely to be harmful to their immediate productivity--possibly even to their continued professional growth.

What we end up with is a pandemic of developers trying to hack out pattern implementations, being discouraged by the fact that the pattern seems to have no accounting for any possible deviations or complications in the form of the problem. Worse, developers are often dismayed to find that the one pattern they were told to use is merely the tip of a huge iceberg of support patterns without which the first may actually be more problematic than an ad hoc solution. Most often the developer in this position will end up determining that their deadlines will never allow them to go through the painful trial and error process on every one of these patterns, and accordingly drop back to the ad hoc solution.

It's time that we acknowledge that software development, whether you consider it an engineering discipline, an art, or a craft, has a history--albeit a short one. Things have been tried. Some work, some don't. There do exist "solved problems" in our problem space. To say that every developer should try and fail at all these efforts on their own ignores and devalues the collective experience of our community. Worse, it stunts the growth of the software development industry as a whole.

Yes, one can learn these things by trial and error. Yes, the understanding gained in this way is deeper and stronger than that gained initially by being tutored on how to apply the solution. And yes, there's a certain pride that comes with getting things done in this way. But this is not scalable. Putting each person forcibly through this crucible is not a sustainable strategy for creating experienced, productive, wise programmers. Every hour spent grappling in isolation with a question of "why won't this pattern do what I've been told it will" is an hour that could be spent creating functionality, or heaven forbid solving a new problem.

That is why those of us who have managed to obtain understanding of these "solved problems" must be willing to shoulder the responsibility of mentoring the less experienced. Being willing to explain, willing to discuss specifics, mutations, deviations, exceptions. Willing to discuss process, mindset, methodology. These are the things that make the distinction between a programmer and a software developer, between a software developer and a software engineer.

The internet may very well not be the appropriate place to seek or provide this type of mentoring. I suspect it's not. And unfortunately, there are too many development teams out there, buried in companies whose core competencies are not software, consisting solely of these discouraged developers, lacking an experienced anchor, or even a compass. There are online communities that attempt to address the problem at least partially. ALT.NET is one such, for .NET technologies. But there really is no substitute for direct, personal mentorship.

So I would encourage young developers out there to seek out more experienced developers, and ask them the tough, form 2 questions. And I would even more strongly encourage experienced developers to keep a watchful eye for those in need of such guidance. Maybe even consider proactively forming a local group and seeking out recruits. Be willing, able, and happy to provide this guidance, because it benefits all of us. Every developer you aid is one less developer creating an ad hoc solution which you or I will be condemned to maintain, overhaul, or triage somewhere down the line.

Incidental Redundancy

A note: I originally composed this blog post in June 2008, but lost it in my backlog and it never got posted. In the interim, Robert C. "Uncle Bob" Martin has addressed the issue in his Clean Code Tip of the Week #1. He describes the issue and the dilemma far more concisely than I have here, and even provides a strategy for dealing with it. By all means feel free to skip my post here and consider him the "official source" on this topic. =)

I am a big fan of "lazy programming". By that of course I mean avoiding repetition and tedium wherever possible. I mean, that's what most programming is really about, right? You have a problem: some process that is annoying, unwieldy, or even impossible to perform with existing tools. And the solution is to write a program that takes care of all the undesirable parts of that process automatically, based on a minimized set of user input or configuration data.

The realities of programming on an actual computer rather than in an idealized theoretical environment rarely allow this ascending staircase of productivity to be climbed to the top step. But it is nevertheless a goal we should always strive for.

Hence Jeff Atwood's recent post about eliminating redundancies.

Anything that removes redundancy from our code should be aggressively pursued
An honorable goal, to be sure. But no rule is without exception, right?

I humbly submit to you the concept of "incidental redundancy". Incidental redundancy is a repetition of code syntax or semantics that tempts the programmer to refactor, but if carried out the refactoring could damage the elegance and discoverability of the program.

The difference between incidental redundancy and regular redundancy in code is that the redundancy does not arise because of any substantive, or at least relevant, similarity between the two problems in question. Here are two ways I can think of off the top of my head for this to happen:

  1. The solutions you have come up with to this point for each situation just happen to have taken a similar form. Given a different creative whim, or even just an alternative refactoring, the commonality may never have arisen.
  2. The problems are, in fact, similar at some level. But the level at which they are similar is far above or below the level where you are working, and so not truly relevant or helpful to the immediate problems you are trying to solve.
The first situation should be acceptable enough to anyone who has spent a decent amount of time programming. Sooner or later, you will spend some precious time carefully crafting a solution to a problem only to later discover that a year and a half previous, you had solved the same exact problem, in a completely different and maybe even better way.

The second situation may sound a little incredible. But allow me to point out an example from the world of physics. Please bear with me as this is an area of personal interest, and it really is the best example that comes to mind.

There are four forces in the known universe which govern all interactions of matter and energy, at least one of which I'm sure you've heard of: the electromagnetic, weak nuclear, strong nuclear, and gravitational forces. It is known now that the first two of those apparently very different forces are in fact two different aspects of the same phenomenon (the electroweak force), which only show up as different when ambient temperature is comparatively low. Most physicists are pretty sure that the third force is yet another aspect of that same phenomenon that splits off only at even higher temperatures. And it is suspected that gravity can be unified with the rest at temperatures higher still.

The point of all this is that there are four different phenomena which in fact bear undeniable similarities in certain respects, and these similarities continue to drive scientists to create a generalized theory that can explain it all. But no one in his right mind would try to design, for example, a complex electrical circuit based entirely on the generalized theory of the electroweak force.

The analogy to programming is that, were we to try to shift up to that higher level and formulate an abstraction to remove the redundancy, the effect on the problem at hand would be to make the solution unwieldy, or opaque, or verbose, or any number of other undesirable code smells. All at the cost of removing a little redundancy.

What we have in both physics and in our programming dilemma, is noise. We are straining our eyes in the dark to find patterns in the problem space, and our mind tells us we see shapes. For the moment they appear clear and inevitable, but in the fullness of time they will prove to have been illusions. But making things worse, the shadow you think you see may really exist, it's just not what you thought it was. That's the insidious nature of this noise: what you see may in fact be truth in a grander context. But in our immediate situation, it is irrelevant and problematic.

This concept is admittedly inspired heavily, though indirectly, by Raganwald's concept of "incidental complexity", where a solution takes a cumbersome form because of the act of projecting it upon the surface of a particular programming language, not unlike the way the picture from a digital projector becomes deformed if you point it at the corner of a room.

The real and serious implication of this is that, to put it in Raganwald's terms, if you refactor an incidental redundancy, the message your solution ends up sending to other programmers, and to yourself in the future, ceases to be useful in understanding the problem that is being solved. It starts sending a signal that there is a real and important correlation between the two things that you've just bound up in one generalization. When in fact, it's just chance. And so when people start to build on top of that generalization with those not quite correct assumptions, unnecessary complexities can quite easily creep in. And of course that inevitably impacts maintenance and further development.

Noise is ever-present in the world of programming, as in other creative and engineering disciplines. But it doesn't just come from the intrusive environment of our programming language or our tools as Raganwald pointed out. It can come from our own past experience, and even come from the problem itself.

So be wary. Don't submit unwittingly to the siren song of one more redundancy elimination. Think critically before you click that "refactor" button. Because eliminating an incidental redundancy is like buying those "as seen on tv" doodads to make your life "easier" in some way that was somehow never an issue before. You think it's going to streamline some portion of your life, like roasting garlic or cooking pastries. But in your quest to squeeze a few more drops of efficiency out of the tiny percentage of your time that you spend in these activities, you end up out some cash, and the proud owner of a toaster oven that won't cook anything flat, or yet another muffin pan, and a bunch of common cooking equipment that you probably already own.

So remember, redundancy should always be eliminated, except when it shouldn't. And when it shouldn't is when it is just noise bubbling up from your own mind, or from the problem space itself.

The First Rule of Extension Methods

Extension methods are a tremendously useful feature of C# 3. Briefly, they allow you to bundle new behavior into an existing class that wasn't included by the class's original author, but without opening up the implementation internals of the class. In a very general sense, this is useful if you come up with useful behavior related to a class and the best place for it is IN that class, but you don't want to bend or break encapsulation by inheriting. I won't spend any more words on an introduction, but rather offer a caution, for those of you who have seen their usefulness and would like to start taking advantage.

Before you even begin to think about whether a particular behavior belongs on an existing class, or as a service, or what have you, you should internalize one cardinal rule of dealing with extension methods. Thankfully this First Rule of Extension Methods is nothing like the First Rule of Fight Club, because if it were, I wouldn't be able to help you at all. No, the First Rule of Extension Methods is: DO NOT export extension methods in the same namespace as your other public classes.

The reason for this is very very simple: odds are very good that if you have come up with an idea of including a particular new behavior on an existing class, someone else has or will as well. There will only be a finite number of relevant names to give this new method, which means that if it's a simple method, especially one without parameters, there's a decent chance that the signatures of your method and this other programmer's will be identical. And if both you and this other programmer happen to include the method on public static classes in a namespace with other functionality the user needs, then someone trying to use both libraries could very likely run up against identifier collisions with the two methods.

Note that the rule begins with "do not export..." This is important, because you can be 100% certain to save your users any collision headaches if you just don't export your extension method. Why wouldn't you export your extension method? Well, there's a very good chance that just because you found your extension method to be useful and, dare I say, clever (Tyler Durden: "how's that working out for you?"), that doesn't mean the consumer of your library will. So consider carefully whether you should even make the method public at all, or rather just keep it internal.

If you decide that your method is just so darned handy that to keep it locked up would just be sadistic, then make a separate namespace for each batch of related extension methods you want to export. If the user decides they'd like to use them too, they can import the namespace as necessary. This will minimize the chance that they will need to fully qualify the namespaces of the other library with the identically named method whose author wasn't as responsible as you were and polluted their namespaces with extension methods.
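To illustrate, here's a rough sketch of what that layout might look like; the library, namespace, and method names are all made up:

namespace MyCompany.Widgets
{
    // The library's ordinary public types live here.
    public class Widget
    {
        public string Name { get; set; }
    }
}

namespace MyCompany.Widgets.StringExtensions
{
    // Extension methods are quarantined in their own namespace, so consumers
    // only see them if they add "using MyCompany.Widgets.StringExtensions;".
    public static class StringHelpers
    {
        public static bool IsBlank(this string value)
        {
            return value == null || value.Trim().Length == 0;
        }
    }
}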

This tactic won't guarantee anything, because there's always a chance some other library could use the same namespace identifiers as well. But with each additional namespace, you dramatically increase the possible naming permutations, and proportionally decrease the chance of collisions.

Useful Extension Methods 1 through 3 of N

Quite often when I'm writing code I'll notice a very small bit of logic that keeps popping up all over the place. It's usually something so trivial that most people barely notice it. But it's also usually something that shows up so often that, despite being small and ignorable, it constitutes a fairly constant level of noise in the code. By noise I simply mean something that takes up more characters than it needs to, obscuring the real meat of the logic of your code. Common functionality like this that everyone knows and understands should just get out of the way, fade into the background, and let the unique logic stand out.

You may have also heard of this noise idea by another name: accidental complexity.

As Reg Braithwaite has pointed out, looping syntax is one of these things. With the ubiquity of IEnumerable and the advent of extension methods in C#, there is almost never a good reason to write an explicit for loop anymore. Looping is a ubiquitous bit of logic that nonetheless takes up quite a lot of characters. Even the vaunted foreach loop is now officially more verbose than it very often needs to be.

Microsoft made it easy to get rid of the explicit loop when what you are doing is essentially a mapping operation, with the inclusion of the IEnumerable.Select extension method.

Say your Foo class has a static function taking a Bar and returning a Foo, and you want to use this function to take a collection of Bars and create a collection of Foos.

You could do this:
IEnumerable<Bar> bars = GetAllBars();
List<Foo> foos = new List<Foo>();

foreach (var bar in bars)
    foos.Add(Foo.FromBar(bar));

Or you could do this, which is obviously much more concise:
IEnumerable<Bar> bars = GetAllBars();
IEnumerable<Foo> foos = bars.Select(Foo.FromBar).ToList();

Note that the Select function completely takes care of the looping logic. Once you know that, this code reveals itself as being extremely elegant. But what if the action you're taking doesn't return anything? You're "stuck" writing an explicit loop, right? Not at all.

Take this code:
IEnumerable<Foo> foos = bar.GetAllItems();

foreach (var foo in foos)
    PrintToScreen(foo);

To start removing the noise, first define an extension method for IEnumerable called ForEach. This is extension method #1.
public static IEnumerable<X> ForEach<X>(this IEnumerable<X> lhs, Action<X> func)
{
    foreach (X x in lhs)
    {
        func(x);
        yield return x;
    }
}

Then rewrite:
bar.GetAllItems()
.ForEach(PrintToScreen)
.ToList();

Now there's just the issue of that nasty little ToList call. Right now, we need that in order to force the collection to be iterated. The yield return syntax essentially causes a function's execution to be deferred until an element is actually requested. This is actually potentially useful even if none of the things you need to do will return values. You can chain together a bunch of actions on the collection by chain-calling ForEach with different delegates. But it's still silly to create and throw away a List just to do this.
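For example, you could chain a logging step and a printing step (LogToFile is just a hypothetical second action here), still leaning on ToList to force the single iteration at the end:

bar.GetAllItems()
    .ForEach(LogToFile)       // deferred: nothing runs yet
    .ForEach(PrintToScreen)   // still deferred
    .ToList();                // iteration happens here, once, running both actions per item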

So we create an Evaluate function that does a simple explicit iteration and nothing more, to force the iterator to be evaluated. This is extension method #2.
public static void Evaluate<X>(this IEnumerable<X> lhs)
{
    foreach (X x in lhs) ;
}
And now you can replace ToList with Evaluate, which will iterate the collection without allocating a new List.
bar.GetAllItems()
.ForEach(PrintToScreen)
.Evaluate();

This is nice, and is going to be useful if we need to chain ForEach calls. But when we don't need that there's still that Evaluate call at the end that's going to be repeated every time we want this functionality, which could be an awful lot. So, let's get rid of that too.

To do that we define a Visit function (named for the Visitor Pattern), that will call ForEach with the given delegate, and then Evaluate as well. This is extension method #3.
public static void Visit<X>(this IEnumerable<X> lhs, Action<X> func)
{
    lhs
        .ForEach(func)
        .Evaluate();
}

Now we can finally get all this done in a single function call:
bar.GetAllItems()
.Visit(PrintToScreen);

This takes some getting used to. But it really has the potential to condense your code. It isn't readily apparent looking at one bit of code, but once you start talking about nested loops or consecutive loops, you'll see the difference. Not to mention the total effect it will have across your codebase. Loops are everywhere. Shave off 50 characters from each one and you're talking about a lot of characters in aggregate.

For my part, after I determined to avoid explicit loops whenever possible, the comparatively verbose explicit looping syntax became almost painfully extraneous to my eyes. I feel like function calls are much more elegant.

Decoupling Domain Model from Persistence

For the last several months, I've been working on a Windows desktop application. This application has a number of pretty common aspects: file manipulation, local data repository, GUI, user settings, collection/dataset manipulation (think drag-and-drop listview type stuff), communication with peripheral devices, and communication with remote services (such as a web service). I think just about every desktop application has a good cross-section of these aspects, and maybe a few others that are slipping my mind at the moment. As a result, there are patterns of application architecture that will arise, and which the development efforts of well-designed, robust applications will have in common. I'm not talking about the typical "design patterns" here à la Gang of Four (GoF), but rather "application architecture patterns". Big patterns. A set of 4 or 5 patterns that, when meshed together, encompass the big, meaty chunks accounting for 95% or better of your application.

I know this is true. I know these patterns exist. But as this is my first effort in developing a desktop app of this complexity, I don't really know what the patterns are. One that GoF did account for was MVC/MVP, and sure I know those... But honestly, up until recently (and maybe even still) my knowledge of that pattern was very hazy and academic. Actually implementing it was something I'd never had to do before. It took a lot of blood, sweat, and tears--lots of trial and error--to get to a point where I really feel like I'm starting to grasp the "right" way to put together an MVP structure.

So now I've moved on to persistence. I have my domain model. But I need to be able to persist the objects in my domain model to and from disk (not a database!). I've been struggling to find ways to add in this functionality without tainting the domain model with persistence logic. Granted, mixing the two is a valid way of doing things, as is evidenced by Martin Fowler's Active Record pattern. But it really doesn't make for ease of unit testing. Your business logic and persistence logic get all mixed up in the same classes and you can't easily isolate one from the other for testing purposes.

Then I saw Fowler's Repository pattern. I thought, okay, maybe that will help. I can create an interface that I can implement which will take care of all the logic for disk access to all the right places, based on what objects or collections are being retrieved. And I can just mock the interface for purposes of testing the domain model and business logic. I have a simple three-level hierarchy of domain interfaces, so it would be fairly simple to implement all the querying and CRUD I need for each domain class on the repository interface. But then I thought about how we plan to have multiple implementations of these interfaces in the future, with different permutations of under-the-covers data to be persisted, and thought maybe it would be best to have the repository be metadata-driven, rather than hard-coding the query logic for the different domain classes. This of course naturally leads to a record-based design, and that would mean I'd want data mappers to translate from my domain classes to repository records. And of course, the repository class should be strictly independent of the persistence mechanism itself (in this case the disk) so that I can unit test the repository logic without worrying about maintaining a test repository just for that purpose.
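To make that first idea a little more concrete, the kind of interface I have in mind looks roughly like this (Customer is just a stand-in for one of my domain classes, and the member names are hypothetical):

public interface IRepository<T> where T : class
{
    T GetById(int id);
    IEnumerable<T> GetAll();
    void Add(T item);
    void Remove(T item);
    void Save();
}

// The business logic depends only on the interface, so the disk never enters into a unit test.
public class CustomerService
{
    private readonly IRepository<Customer> _customers;

    public CustomerService(IRepository<Customer> customers)
    {
        _customers = customers;
    }

    public void Deactivate(int customerId)
    {
        Customer customer = _customers.GetById(customerId);
        customer.IsActive = false;   // assumes an IsActive property on the domain class
        _customers.Save();
    }
}

In a test, the IRepository<Customer> can then be replaced by a mock or a simple in-memory fake.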

And suddenly I realized that I had mentally re-created ADO.NET. DataSets and DataTables are the repository mechanism (with typed datasets and the ADO.NET designer even generating data mappers for you). DataAdapters are the persistence service.

So where this leaves me is with the question of whether I should just design my on-disk storage schema, build myself an app-specific data adapter mechanism, and let ADO.NET do all the work. If I really wanted to minimize the amount of mapping and repository-access code I have to write, I might even be able to use the Entity Framework to do my dirty work for me. Admittedly I have no idea how much work it would take to implement my own data adapter. The interfaces seem straightforward enough, but I have a nagging feeling that's really just the tip of the iceberg.

After all my radio silence here, I don't know if anyone is still listening... But if anyone has any thoughts on the wisdom of this approach, I'm listening intently.

Don't Give Up Assembly Privacy For Sake of Unit Testing

Just a little PSA to other .NET unit testing newbs out there like me. This info is available various places on the internet, but you'll be lucky to find it without just the right search terms. So hopefully adding another blog post to the mix will make it easier to stumble on.


Unit testing frameworks need to instantiate your types, in order to run unit tests. What this means is that they need to be able to see them. The easiest way to do this is to make your classes public and/or place the tests right inside your project.


Placing the tests right in your project means that you'll very likely have to distribute the unit test framework assemblies along with your product. This might give more information to potential hackers than you would like. And making your classes public brings with it the often undesirable side-effect of opening up essentially all the types in your assembly to be used by anyone who knows where the assembly file is, in essentially any way they like.


But you don't have to make these concessions. You can move the tests out of your assembly, to keep them out of the deployment package, and still keep your classes internal (though not private). The .NET framework allows an assembly to declare "friend" assemblies that are allowed to see its internal classes. (Yes, very similar to the old C++ friend keyword). This is accomplished by adding an assembly attribute called InternalsVisibleTo to your AssemblyInfo.cs file.


If your unit test project does not have a strong name, it's as simple as referencing its assembly name:


[assembly: InternalsVisibleTo("MyCoolApp.UnitTests")]

However, I strongly recommend giving your unit test assembly a strong name. A strong name involves a public/private key pair, which the .NET framework uses, along with hashing and digital signatures, to prevent other people from creating assemblies that can masquerade as your own. Furthermore, if you give your app itself a strong name (which you should if you plan to distribute it), any libraries it references will need strong names too, including the ones it merely allows to see its internals.


So, if you decide to give your unit test project a strong name, you'll need the public key (not just the token) as well:


[assembly: InternalsVisibleTo("MyCoolApp.UnitTests, PublicKey={Replace this, including curly braces, with the public key}")]



(If you need to learn about strong names, and/or how to extract the public key from your assembly, this is a good place to start: http://msdn.microsoft.com/en-us/library/wd40t7ad.aspx.)
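For the impatient, the gist with the sn.exe tool that ships with the SDK is roughly this (the file names are whatever you like):

sn -k MyCoolApp.UnitTests.snk
sn -p MyCoolApp.UnitTests.snk MyCoolApp.UnitTests.pub
sn -tp MyCoolApp.UnitTests.pub

The first command generates the key pair you sign the test project with, the second extracts just the public key from that pair, and the third prints the full public key so you can paste it into the attribute.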


Once you've done this, you should be able to compile and run your unit tests from a separate project or even solution, and still keep the classes you're testing from being "public".


This is all well and good, but if you're working with mocks at all, you probably have another problem on your hands. The most popular .NET mock frameworks (e.g. RhinoMocks, Moq, and NMock) use the Castle Project's DynamicProxy library to create proxies for your types on the fly at runtime. Unfortunately, this means that the Castle DynamicProxy library ALSO needs to be able to reference your internal types. So you might end up with an error message like this:


'DynamicProxyGenAssembly2, Version=0.0.0.0, Culture=neutral, PublicKeyToken=a621a9e7e5c32e69' is attempting to implement an inaccessible interface.

Complicating this fact is that the Castle DynamicProxy library places the proxies it generates into a temporary assembly, which you can't just run the strong name tool against, because the temporary assembly doesn't exist as a stand-alone file. Fortunately, there are programmatic ways of extracting this information, and the work has been done for us. The public key for this assembly, at the time of this writing, has been made available, here and here. You might find some code at those links that could help you extract the public key from any future releases of Castle as well.


The important information is basically that, as of today, you need to add this to your AssemblyInfo.cs file, without line breaks:


[assembly: InternalsVisibleTo("DynamicProxyGenAssembly2, PublicKey=002400000480000094000000060200000024000052534131
0004000001000100c547cac37abd99c8db225ef2f6c8a360
2f3b3606cc9891605d02baa56104f4cfc0734aa39b93bf78
52f7d9266654753cc297e7d2edfe0bac1cdcf9f717241550
e0a7b191195b7667bb4f64bcb8e2121380fd1d9d46ad2d92
d2d15605093924cceaf74c4861eff62abf69b9291ed0a340
e113be11e6a7d3113e92484cf7045cc7"
)]

Caution: The name and public key of this temporary assembly was different in earlier versions, and could change again in later versions, but at least now you know a little more about what to look for, should it change.


So remember: You don't have to open wide your assembly to just anyone who wants to reference your types, just for sake of unit testing. It takes a bit of work, but you don't need to compromise.

Surviving WinForms Databinding

I've rarely had the freedom in my career to implement a Windows application using MVC-style separation of concerns. Generally I am told "just get it working". Now, if I already knew how to tier out an application, this wouldn't be a problem. But since I don't, and it would take me a good deal of time to figure out a satisfactory way of doing it, I haven't been able to justify spending the time, on the company dime.


But recently, I've been fortunate enough to work on a new software product without a hard ship-date, and having other obligations on my plate. This has given me the freedom to not spend every minute producing functionality. So I've been experimenting with implementing MVC with Windows Forms.


There are essentially two ways to get this done. You can:

  1. Write a bunch of manual event code to trigger view changes from model changes, and vice versa.
  2. Use databinding and save yourself the explicit event code.

Databinding (option 2) works great if all your datasources are DataTables, or DataViews, or other such framework classes that are designed from the ground up to work with WinForms databinding. But should you have the misfortune of not dealing with any record-based data, you'll find you have a much tougher road to walk.


If you, like I, choose option 2, and you happen to be working on anything more complicated than a hello world application, and you are truly committed to doing MVC both correctly, and with as little unnecessary code as possible, then you will undoubtedly spend, like I have, a lot of time banging your head against a brick wall, trying to figure out why your databinding isn't doing what it is supposed to. This is a terrible shame, because if databinding in WinForms worked properly and was easy to use, it would be a spectacular tool for saving time and shrinking your codebase.


Truth is, you can still save time and code. But not as much as you might think when first introduced to the promise of databinding, at least not without decent information on the obstacles you'll encounter. Compounding the above hurdles is the fact that what information can be found online about them is scattered to the four winds, and no one bit references the rest. So, I've decided to blog the headaches I encounter, and my resolutions, as I find them. This should increase the searchability of the issues at least a bit, by tying together the separate references with my blog as a link between them, and also by containing the different bits of information within one website. Or it would, if I had readers...


My results so far have produced 5 rules, to help you preserve your sanity while using WinForms databinding.


Rule 1: Use the Binding.Format and Binding.Parse events.


This first rule isn't actually too hard to find information on. Format and Parse essentially let you bind to a datasource property that doesn't have the same data type as the property on the control. So you can bind a Currency object to a TextBox.Text property, for example.


The Format event will let you convert from your datasource property type to your control property type, and Parse will let you convert from your control property type to your datasource property type. The MSDN examples at the links above are pretty good. If you use them as a template, you won't go wrong. But if you start to switch things up, beware Rule 2...


Rule 2: If you use Format or Parse events, DO NOT add the Binding to the control until after you register the event handlers.


I honestly don't know what the deal is with this one. I just know that if you add your events to your Binding object after you've already passed it to the Control.DataBindings.Add function, they won't get called. I don't know why this should be, unless the control only gets a copy of your actual Binding object, not a reference to it.


Unfortunately, I have lost the references I had to the forum posts that talked about this. There were several, and now I can find none of them. I know I saw them, though, and I saw the symptoms of the other ordering, so as for me, I'm going to make sure to follow this rule.
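Putting Rules 1 and 2 together, the ordering that has worked for me looks something like this (the Price property, the decimal-to-currency conversion, and the control and datasource names are all made up for the example):

// Create the binding, but do NOT add it to the control yet.
Binding priceBinding = new Binding("Text", myDataSource, "Price");

// Rule 1: Format converts datasource -> control, Parse converts control -> datasource.
priceBinding.Format += (sender, e) => e.Value = ((decimal)e.Value).ToString("C");
priceBinding.Parse += (sender, e) => e.Value = decimal.Parse((string)e.Value,
    System.Globalization.NumberStyles.Currency);

// Rule 2: only now hand the binding to the control.
priceTextBox.DataBindings.Add(priceBinding);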


Rule 3: Use INotifyPropertyChanged and/or INotifyPropertyChanging.


I ran across this info in a post on Rick Strahl's blog, during a desperate scramble to find out why my datasource ceased to be updated by user actions on the controls after the initial binding occurred. The INotifyPropertyChanged interface is intended to be implemented by a datasource class that has properties that will be bound to. It provides a public event called PropertyChanged, which is registered with the data binding mechanism when you add the Binding to your Control. Your class then raises this event in the desired property setters, after the new property value has been set. Make sure to pass "this" as the sender, and the name of the property in the event arguments object. Notice that the property name is provided as a String, which means that there is reflection involved. This will become relevant in Rule 4. Also note that there is an INotifyPropertyChanging interface, which exposes an event that is meant to be raised immediately before you apply the datasource change. This is generally less useful for databinding, but I include it here to save some poor soul the type of frustration I have recently endured.
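A bare-bones sketch of what that looks like on a datasource class (the class and property names are hypothetical; the interface lives in System.ComponentModel):

public class Person : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler PropertyChanged;

    private string _name;
    public string Name
    {
        get { return _name; }
        set
        {
            if (_name == value) return;
            _name = value;
            OnPropertyChanged("Name");   // raise AFTER the new value has been set
        }
    }

    protected void OnPropertyChanged(string propertyName)
    {
        PropertyChangedEventHandler handler = PropertyChanged;
        if (handler != null)
            handler(this, new PropertyChangedEventArgs(propertyName));   // "this" as sender, property name as a string
    }
}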


Rule 4: If you implement INotifyPropertyChanged, don't include any explicit property change events ending with "Changed".


As I mentioned, the databinding mechanism uses reflection. And in so doing, it manages to outsmart itself. There is a very good chance you're going to run into a situation for which these databinding mechanisms aren't useful, and you'll have to implement your own explicit property change events on your datasource class. And of course, you're going to name these events in the style of "NameChanged", "AddressChanged", "HairColorChanged", etc. However, the binding mechanism thinks it's smart, and rather than just registering for the INotifyPropertyChanged.PropertyChanged event, it will also register with any public event whose name ends with "Changed". And if you didn't happen to make your event follow the standard framework event signature pattern--that is, void delegate(Object sender, EventArgs e)--then you will get errors when the initial binding is attempted, as the mechanism attempts to register its own standard-style delegates with your custom events, and you get a casting error.


I solved this one by following a crazy whim, but I also tried to verify the information online. All I could find was one old post buried in an obscure forum somewhere.
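In practical terms, if you need your own notification event on the same class, either keep its name from ending in "Changed" or give it the standard framework signature. Something like this (a hypothetical event, shown both ways; obviously you'd pick one):

// Risky: ends in "Changed" AND has a non-standard signature, so the binding
// mechanism will try to register its own delegate with it and fail with a casting error.
public event Action<string> HairColorChanged;

// Safe: same name, but with the standard (object sender, EventArgs e) signature.
public event EventHandler HairColorChanged;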


Rule 5: Don't bind to clickable Radio Buttons


I know how great it would be if you could just bind your bunch of radio buttons to an enum property. I really do. You think you're just going to hook up some Format and Parse events to translate back to your enum, and all will be well. It would be so darn convenient, if it actually worked. But WinForms just isn't cut out for this. For 3 full releases now (or is it 3.5 releases?), this has been the case. It's because of the event order, which is not something that MS can go switching up without causing thousands of developers to get really cheesed off.


The problem really comes down to the fact that unlike other controls' data properties, the Checked property of a radio button doesn't actually change until focus leaves the radio button. And as with all WinForms controls the focus doesn't actually leave the radio button until after focus is given to another control, and in fact not until after the Click event of the newly focused control has fired. The result of this, as it pertains to radio buttons, is that if you try to bind to them, the bound properties in your datasource will actually lag your radio buttons' visual state by one click. If you have just two radio buttons, the datasource will be exactly opposite the visible state, until you click somewhere else that doesn't trigger an action that references those datasource properties. Which can make this a really infuriating bug to track down. I almost thought I was hallucinating.


Now, in all honesty, it's possible to make it work. But it is the kludgiest kludge that ever kludged. Okay maybe it's not that bad... but it's a messy hack for sure. It takes a lot of work for something that really should already be available. As near as I can tell, the only way to solve this problem without giving up the databinding mechanism is to essentially make your own RadioButton control, with a property change and event order that is actually useful. You can either write one from scratch, or sub-class RadioButton and override all the event logic with custom message handling.


So....


There's the result of 3 weeks of frustration. I hope it helps someone else out there someday. I'll make list addendum posts if/when I come across any other mind-boggling flaws in the WinForms databinding model. And in the meantime, I welcome any additions or corrections that anyone is willing to contribute in the comments.

Is There a Place for Deletionism in Wikipedia?

Just dipping a toe in the water with something too big for Twitter, but that I definitely have a few thoughts on which I am willing to pitch out onto the internet.

Yesterday evening and this morning I watched (Or overheard? No idea what the appropriate term here is...) an exchange over Twitter between Tim Bray and Jeff Atwood about "deletionism". Tim's thoughts on the matter were apparently too strong to be held within the restrictive confines of Twitter and so he wrote a strongly worded post on his blog as well.

The post is obviously... passionate. But putting aside the bulk of the post that simply expresses that passion, he makes a couple really strong points.

The first point is that deletionists are just your garden variety elitists.
"the arguments from the deletionists are jargon-laden (hint: real experts use language that the people they’re talking to can understand)"

The other point really could have comprised the whole post and, IMHO, would have been a sufficient argument all by itself.

"What harm would ensue were Wikipedia to contain an accurate if slightly boring entry on someone who was just an ordinary person and entirely fame-free? Well, Wikipedia’s “encyclopedia-ness” might be impaired... but I thought the purpose of Wikipedia was to serve the Net’s users, not worry about how closely it adheres to the traditional frameworks of the reference publishing industry?" [emphasis added]

This, to me, undermines the deletionists' whole platform. That is what Wikipedia was supposed to be: a new model of information archival. A model not subject to the sensibilities of some authoritarian arbiters of what is "notable" or "interesting". And the deletionists are unequivocally trying to undermine this goal, by deciding what "deserves" to be archived.

I say, if they want to take that old dead-tree encyclopedia model and just port it to the web, they can go find their own site, and their own content, to do it with. I've even got a couple recommendations for them.

Identity Crisis in Computer Science Education

A while back, the seeds of a post started rolling around in my head, inspired by my lack of satisfaction with the preparation my education provided me for a programming career. But I didn't quite know what I thought. Then not long ago, Joel Spolsky presented a radical repackaging of Computer Science degrees, supposedly geared toward producing better programmers, and my thoughts started to gel.

I knew what my personal complaint was, and I knew that it was connected to a larger problem with the state of computer science education in general. But I didn't know exactly what the larger problem was, let alone have any ideas worth sharing on what might be done about it.

Now that seemingly the entire remainder of the blogosphere has weighed in on this and related topics, I think I am finally ready to throw my two cents in. I'm going to barrage you with links in the following paragraphs, to ensure that I credit everyone I read who assisted me in coming to my final conclusions. Feel free not to click through, but be aware that they represent a rich cross-section of an important discussion.

It took me several weeks to realize that what we have going on is essentially a three-way tug of war from people in different regions of the vast sphere of software development, who need very different things out of their workers, and hence out of their education. Below I will give a run-down of some of the claims made, expressing the different forces pulling on CS graduates these days. You'll quickly see it's no wonder that the schools are so confused....

The Artisan Programmer

Joel laments the uselessness of theory courses in computer science curricula, saying "I remember the exact moment I vowed never to go to graduate school" and then proceeding to recall a terrible experience he had with a Dynamic Logic class. Jeff Atwood insists that real-world development environments need to be in place and mandatory, including, but not limited to, source control, bug tracking, deployment, and user feedback. Then, as mentioned above, Joel proposes offering BFAs in software development, to make darn well sure that none of the academic (in the pejorative sense) theory stuff gets mixed in unnecessarily. The upshot of most of these points is that computer science / programming degrees should spend as much time as possible teaching people what they need to know to go into a career in software development, writing business software or software products.

The Computer Scientist

Brian Hurt comes in from another direction entirely, and in the process makes some very good points about the true purpose of higher education. He lays the blame for the flood of single-language programmers entering the workforce at the feet of schools who do just exactly what Joel and Jeff are asking for. He makes some great points. And while he sounds more than a little reminiscent of the classic Joel post about the perils of java schools, his argument is much more thorough than just blaming the tools. Chris Cummer joins this party, wishing that his theory foundations had been firmer, and making an excellent analogy to the difference between someone who studies a language, and someone who studies language. We also have the respectable Raganwald, who although he has admirably pointed out good points from all sides, doesn't shy from offering his opinion that programmers ignore CS fundamentals at risk of their own career advancement.

The Software Engineer

But that's not all. Several people have weighed in from yet another direction. Robert Dewar and Edmond Schonberg wrote one of the posts that started off this blog firestorm. Along with alluding to a similar sentiment as Hurt and Cummer, they heavily criticize the state of software engineering education for focusing too much on how to use specific, limited tools, when there are more sophisticated ones available. They claim understanding these will allow software engineers to easily pick up the other tools that may come along and direct them to appropriate purposes. Ravi Mohan stops short of calling the education satisfactory, instead sensibly pointing out simply that an engineer who doesn't use standard engineering tools such as system modeling isn't really an engineer. Mohan comes on a little too strong for me in the comments, but the posts themselves (of which there are also a precursor and a successor, and should soon be one more) are worth reading. Like the others he makes valid points.

Resolving the Crisis

Mark Guzdial is one of the few people that really puts his finger near the pressure point. Though maybe not near enough to feel the pulse beneath it when he did so. At risk of quoting a little too heavily....
Rarely, and certainly not until the upper division courses, do we emphasize creativity and novel problem-solving techniques. That meshes with good engineering practice. That does not necessarily mesh with good science practice.

Computer scientists do not need to write good, clean code. Science is about critical and creative thinking. Have you ever read the actual source code for great programs like Sketchpad, or Eliza, or Smalltalk, or APL 360? The code that I have seen produced by computational scientists and engineers tends to be short, without comments, and is hard to read. In general, code that is about great ideas is not typically neat and clean. Instead, the code for the great programs and for solving scientific problems is brilliant. Coders for software engineers need to write factory-quality software. Brilliant code can be factory-quality. It does not have to be though. Those are independent factors.
And there it is.... Different environments require different mindsets/approaches/philosophies. Research requires one mindset/philosophy of work, engineering requires another, and in-the-trenches programming requires yet a third.

When a person suffers from a personality fracture, the resolution is often to merge the personalities by validating each as part of a whole. Fortunately, since we are not dealing with a person, we have the freedom to go another direction: make the split real and permanent.

Associate of Science in Computer Programming

To fill Joel and Jeff's need, the student who wants to work in the craft of software development / computer programming, who wants to be an artisan, needs to have an appropriate degree. It needs to provide them with an exposure to the generalized idea of the programming platforms and tools that they will have to deal with for the rest of their career. Lose the lambda calculus, compiler-writing projects, etc. These things aren't necessary for them to get stuff done in the trenches. But they do need to be exposed to the fundamental generalities that pervade programming. And they need to be prepared to learn at an accelerated rate while in the field. That just comes with the territory. Focus on core programming skills like program analysis, debugging, and test practices. Introduce industry-standard tools (emphasizing generality and platform-independence) such as source-control, bug tracking, etc.

I think a two-year associate degree is perfect for the code-monkeys and business programmers who just love to dig in and mess around with code, and don't want to be burdened with the overarching concerns. Especially with these jobs increasingly being pushed offshore, computer science grads are rapidly being priced out of the market. An associate degree is cheap enough to be worth the investment for a lower-paying programming job. And it doesn't carry the overhead of any unnecessary theoretical content that they may not be interested in learning. It should be noted though that this type of programming job enters the realm of the trades, with all the associated benefits and drawbacks.

If you're looking for a more well-rounded individual capable of moving up out of this position into a lead position, or even management, then a 4-year bachelor of science (or Joel's BFA, but I tend not to think so) may be a viable option as well.

Bachelor of Science in Computer Science

There's not much to say about this degree, because if you look at all the schools that are famous for their CS degrees, this is pretty much what you'll find. Lighter on general studies, heavy on theory, heavy on math. Light on tools because the students will be expected to find (or make) tools that work for them. Light on specific language education because students will be expected to adapt to whatever language is necessary for their problem domain.

This is a degree that will produce people primed for going on to masters and doctorates. They will end up in research, "disruptive" startups, or working on new languages, OSes, etc. This degree is designed for people who want to work at the edge of things. Who want to solve new problems and push the boundaries. They are people upon whom will be placed the burden of pushing the state of knowledge in CS into the next era.

Bachelor of Science in Software Engineering

I am hesitant to propose this degree, because I am not certain that the practice of Software Engineering has evolved to the point where we have 4 years' worth of general knowledge that's worth teaching, and that won't be out of style by the time the students graduate.

It seems that some people, when they talk about Software Engineering, are talking about architecture and design, and others are talking about process, resource allocation, estimation, etc. To be frank, I don't think the former qualifies as a true engineering discipline. At least not yet. I don't know how much the modeling of programs that Ravi Mohan talks about is going on out there in the industry. I suspect that it happens more in the process of really big projects, and maybe in digital security. The second type of engineering people think of, however, I think is very similar to what we see in manufacturing, with industrial and process engineers. These are people who get an intimate knowledge of the domain, and then figure out ways to get everything to run smoother, more efficiently, and producing higher quality.

I can definitely see some education possibilities here, though I am not sure myself how to fill out the whole degree. It should at least encompass a good portion of the Associate of Science in Computer Programming, because they need to understand the intricacies involved. I can also see this degree teaching some of the more established measurement and estimation techniques found among the industry's established and experienced software project managers. Generally more management-related topics such as resource allocation, planning, product design, feature negotiation, etc. might fit in well here. Different project processes, testing/QA models, and of course an ability to keep up to date with technologies and platforms, are all par for the course as it's all critical for making decisions in the industry.

Conclusion

I really, honestly believe that Computer Science education as a whole needs a makeover. It needs more structure, more integrity in the vision of what each degree means, across schools. When someone has one of these degrees, you need to be able to reliably assume they should have learned certain things, regardless of what school they went to. Many of the degrees currently on offer don't satisfactorily prepare their students for any one of the possible careers discussed above. I'm not saying their hand needs to be held from enrollment right on through to their first job. That's not the purpose of college. The purpose of college is to provide a cohesive education, directed to some relatively well-defined goal of capability and knowledge. Today this is tragically non-uniform at best, and absent altogether at worst.

So I see plenty of room in software development education for a clarification of purpose, and a readjustment of goals and curricula. A few different tracks, each geared toward a distinct section of the sphere with different goals and different responsibilities. And if we resolve to use existing terminology with some respect for the historical meaning of the words, we can re-use our existing nomenclature. But there can be no more of this muddy slurry of computer science, craft of programming, and software engineering all overlapping in claims of purpose, treading on each others' territory without care. Everyone can have what they are asking for. They just need to accept that no one can claim the "one true way".

I am not a Computer Scientist

Prepare yourselves. I have an embarrassing and melodramatic admission to make.

My career is a sham.

Although my degree and education are in a field that is typically referred to as "computer science", I am not actually a "scientist". Nor do I "practice science". But I won't be satisfied to go down alone for this charade. I'm going on record saying that I am convinced that for the vast majority of people who were educated in or work in the field of "computer science", the ubiquitous presence of the word "science" in proximity to our work or education is a tragic misnomer.

I don't know how long this has been on my mind, but I know almost precisely when I became conscious of it. It was a couple months ago. I was newly exposed to devlicio.us, and perusing the blogs hosted there, when I came across a post by Bill McCafferty about a lack of respect and discipline in our field.

Early in the post, Bill reveals an injustice he encountered during his education.

...When I started my undergrad in this subject, I recall reading articles debating whether it should be called a science at all. Gladly, I do not see this argument thrown around much anymore.

I think I am probably not going to make the exact argument here that he disagreed with back then. The things we all studied in school are definitely part of a nebulous field of study that may rightfully be called "computer science". As Bill points out,

"From Knuth's classic work in The Art of Computer Programming to the wide-spread use of pure mathematics in describing algorithmic approaches, computer science has the proper foundations to join other respected sciences such as physics, concrete mathematics, and engineering. Like other sciences, computer science demands of its participants a high level of respect and pursuit of knowledge."

I have no argument with any of this. He's right on. Donald Knuth (who is indeed my homeboy in the sense that we share our hometown) studied and practiced computer science (which if you know anything about Knuth, you'll know is an almost tragic understatement). And thousands of people who have followed in Knuth's footsteps can lay the same claim. However, that's not me. Nor is it more than 99% of the programmers in the field today.

Computer science suffers the same type of misnomer as many other disciplines who have adopted the word "science" into their name, such as political science, social science, animal science, food science, etc. And it seems that most such fields, if not all, have done so because the very validity of the field of study itself was subject to severe criticism at some point in the past. So we take on the term "science" to get it through people's heads that there is a root in formal practices and honest intellectual exploration. But to then blanket every profession that derives from this root with the term "science" is a misappropriation of the term.

I can think of a number of examples.... The programmer working for the bank to develop their website, or for the manufacturing company to manage their transaction processing system is no more necessarily a "computer scientist" than the election commentator is necessarily a "political scientist". When someone gets an electrical engineering degree and goes to design circuits for a living we do not say he "works in electrical science". We say he is an electrical engineer. When someone gets a technical degree in mechanics and then goes to support or produce custom machinery, we do not say he "works in mechanical science". We say he is a mechanic, or a technician. Why, then, when someone gets an education that amounts to a "programming degree", and then goes to work doing programming, do we say that he "works in computer science"? It's a uselessly vague and largely inappropriate label.

By contrast, if you have a doctorate in computer science, I'm prepared to say you deserve the label. If you write essays, papers, articles, books, etc. for use by the general practitioner, you probably deserve the label. If you do research, or work on the unexplored fringes of the field--if you are exploring the substance and nature of the information or practices that the rest of us simply consume and implement, then in all likelihood you deserve the label.

Please, please understand that I am by no means belittling the value of our work, or the nobility of our profession. Often we simply consume the information produced by true "computer scientists". But we transform it from theory into practice. We resolve the concrete instances of the abstract problems that the true scientists formally define. We take the pure thought-stuff produced by scientists and turn it into tangible benefit.

This is not trivial. It is not easy. It deserves respect, discipline, study, and care. But it is not "practicing science".

I should say in closing that I am not as upset about all this as the tone of this post might imply. I don't even really have a big problem with the use of the word "science" to refer to a field of study or work that largely does not include research-type activities. I don't like it, but I accept that it happens. But "computer science" has a problem that other similar "sciences" don't. When someone says they work in "political science" or "food science", you can make a guess as to the type of work they do, and it's hard to be significantly incorrect. Though maybe it's my outsider's naïveté that allows me to make this claim. At any rate, "computer science" as a field is so broad and vague that I don't think the term communicates a useful amount of information. But you wouldn't know that by talking to programmers, who seem only too ready to attempt to take hold of the term and own it for themselves.

I think this is one small facet of a larger and far more critical issue in our field in general, which I fully intend to write more about very soon. But until then, let's take the small step of starting to consider what we really mean when we use much of the popular but often ambiguous terminology when discussing our profession.

I work in the field of computer science. This tells you nothing except that I am unlikely to be a prime specimen of the wondrous human physiology. But.... I am a programmer. I have a degree in Computer Engineering. I am interested in programming theory. I work as a software development consultant. And now, you know something about what I know and what I do.

Now what about you?

Update: I forgot to note that in McCafferty's blog entry, he himself makes use of "trade" terminology to categorize different levels of reading materials. Which belies the uncertain nature of programming as a profession. We certainly wouldn't say that a carpenter works in the "wood sciences", would we?

What is a Senior Programmer?

My friend and co-worker Nate recently wrote about some hurdles he has encountered in pursuing his professional ambitions as a software developer. I know what he's going through because I entered both my internship and my first post-college job with tragic misconceptions (non-conceptions really, in the case of my internship) as to how my career would develop.

My first mistake was that, as I began my internship, I had no idea how my career would progress, how it should progress, or how active a participant I would be in whatever progression did occur. I knew that I enjoyed twiddling around with code, and that I seemed to have more of a natural talent for it than most of my classmates. And I figured that if I was going to make a career out of anything, it should be something that I did at least passingly well, and that I enjoyed.

As I entered my first post-college job, I had decided I most definitely did not want to become a manager. I enjoyed programming too much to give it up, for one thing. Further, managers had so far stood mostly as obstacles to my involvement in interesting work, and as incriminating figures more prepared to remonstrate with me about my professional flaws than to empower me to better myself as a programmer. So I wanted nothing to do with that. Instead I set a near-term goal of becoming a "senior developer", at which point I would re-evaluate my career trajectory and adjust course if necessary.

An important question in gauging the advancement of your career as a programmer is to ask what exactly it means to be a senior programmer. I have come to see this title as being tied to the professional respect that one has accumulated in a programming career. A senior programmer is not someone who has served a certain amount of time "in the trenches". Nor is it even someone with a broad experience base.

No, I think there is something a bit more intangible, that identifies someone deserving of the "senior developer" title. Something less measurable. Something that is probably sensed, but not necessarily explicable by those with less experience. But something that would be conspicuous by its absence.

As I see it, what distinguishes someone deserving of the "senior developer" title is a sense of stability. A senior developer is someone who can stand in the middle of the chaos that arises in a project, and exert a calming influence on the people and efforts swirling around him. A senior developer is an anchor to which a project can be tied to keep it from drifting into dangerous waters. He is a sounding board against which claims will ring true or false and goals will ring possible or impossible. He is the steady hand, not necessarily on the rudder of your project's ship, but wherever that hand is needed most. And he is a strong voice that reliably calls out your bearing relative to your destination.

Of course, these metaphors sound a bit grandiose. But the general picture I think is accurate. A senior developer is someone that you put on a project to ensure there's some measure of certainty in the mix. It doesn't mean your project is guaranteed to succeed. But it should mean that you can sleep a little easier knowing that where you think the project is, is where it really is. And that if it's not where it needs to be, that there is someone involved who has a decent idea of how to get it there.

Naturally, these things do come with time and experience. So what I said earlier isn't completely true, in that a senior developer is someone with tenure and experience. However, these are necessary, but not sufficient conditions. Not everyone with 20 years of experience on 5 platforms and 15 languages qualifies. And not everyone with 5 years of experience on 1 platform and 2 languages doesn't qualify. Rather, if you show that you can learn from experiences both positive and negative, port technical and non-technical knowledge from one domain to another, and educate, inspire, or empower colleagues and junior developers.... Then you are showing yourself to have what it takes.

If you're like me, and Nate, you don't feel that you're there yet, but you hope one day to proudly contribute this kind of value. Don't lose hope. Every new experience, technical and non-technical, is a growth opportunity. But it is important to broaden your horizons in both of those respects. If you hope to educate, inspire, and empower others, you must first learn to do so for yourself. And if there's one thing I've learned, it's that you can't do that if you feel stagnant. In that case, your first responsibility to yourself is to educate your boss of the professional value you could offer with a little broader exposure. And if that doesn't work, address the issue yourself, dedicating a bit of time outside work. If you can find them, working together with some like-minded friends or co-workers can be very encouraging, like what we have done with our "book club". Remember, nothing changes if things just stay the same.

LINQ: Not just for queries anymore

Take a look at this LINQ raytracer, discovered via Scott Hanselman's latest source code exploration post.

For those of you who don't know what the heck LINQ is, it was created as a language extension for .NET to allow querying of datasets using an in-place SQL-inspired notation. This was a huge step up from maintaining separate bodies of stored queries, or hard-coding and dynamically assembling queries as strings in your source code.
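If you've never seen it, a simple query over a hypothetical in-memory list of orders looks something like this:

var bigSpenders =
    from order in orders
    let total = order.Quantity * order.UnitPrice
    where total > 1000m
    orderby total descending
    select new { order.CustomerName, Total = total };

That let clause, incidentally, is the same construct the raytracer leans on so heavily.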

Lately LINQ has been enhanced and expanded upon even further to become a more broadly usable miniature functional/declarative language within .NET. This is impressively illustrated by the raytracer code. The author acknowledges up front that it's probably not the best way to write it, which is probably true. But this in no way detracts from its illustration of the power, expressiveness, and flexibility of LINQ.

I love it! It's a great example of taking a deceptively simple language and showing its power by doing things with it that aren't strictly within the purview of its design. It reminds me of some SQL code that I've written for my current employer, to get functionality out of inefficient PL/SQL and into blazing-fast pure SQL. Stuff that it was said couldn't be done in pure SQL. Of course, this is usually said by people who think that SQL is a second-class citizen among languages, not realizing the power of its declarative foundations. This is something that I hope to write about more extensively in the future, either here or in a new SQL blog I'm considering starting with a friend and coworker.

Anyway, I'm tempted to say that there's a bit of a down side in that this LINQ raytracer is composed mostly of let clauses, which feel more imperative than declarative. However, I've long claimed that just such functionality in actual SQL would make it a tremendously more compact and elegant language, so I won't complain too much about that. =)

My gut reaction upon seeing this code is that it feels like a crazy hybrid of LISP and SQL, but it's written in C#. Which all makes my head spin, but in a good way. I love that C# is becoming such a great hybrid of procedural and functional programming paradigms.

Where will your programming job be in 7 years?

There's been a lot of noise recently about the future of programming, as a profession. Okay, let's be honest, people have been talking about the imminent programming crash for a long time. But they didn't know what they were talking about. When I graduated college a few years back, we were warned that many of us would not find jobs, but this fear turned out to be overblown.

But now the noise is coming from some different directions. It's not analysts, or people who've been laid off, or the people who got screwed in the "dot.com bust". Though to be fair, the analysts are still saying it, louder than ever. But it's also now coming from the people who hire programmers. And unfortunately, if you are a programmer, those are the people you care about. So, what's different this time? Why don't companies need programmers anymore?

Well, they do need programmers. But they need programmers who can do more than just program. So odds are good that if your job title is "programmer", or "programming" constitutes 90% or better of your responsibilities on the job, they're looking at you like day-old tuna salad.

From the ComputerWorld article:
"It's not that you don't need technical skills, but there's much more of a need for the business skills, the more rounded skills"
"Crap," you say to yourself. "I hate business stuff." Or maybe you're saying "Crap. Why didn't I go for that business minor or second major back in school?" Easy, cougar. Don't get too worked up yet. Let's look at the context of this statement. We already know who's saying it: "business people". People working in IT in companies throughout the nation. Not software companies. Not consulting companies. Just regular companies. This is crucial information. It means the positions we are talking about are mostly going to be be "business programming" jobs.

Now we have a few more questions. Where are the jobs going? Why did I put quotes around "business programming"? And why are these jobs going away for real this time?

First answer: they're going to go to people with Associate degrees, and people who are self-taught, and consultants. Some to high-quality on-shore resources, but an awful lot to people who speak Hindi or Urdu (or Chinese, or Korean). Wages and education requirements will be cut for in-house employees, and other jobs are going to consultants. Offshore in most cases.

Next... The reason I put quotes around "business programming" is to distinguish these jobs from other types of programming positions. If you are a "business programmer", you're part of what is currently a pretty huge job market. You're part of a population that does a lot of work for a lot of companies, but is often looked at as an unfortunate necessity by other people at those companies. "Business programming" is the type of programming that, in the past, a company that sells shoes would hire some people full-time and in-house to do. It's order processing, financial reporting, CRM, network management, database scripting, and so on and so forth. And for a long time, management types have had an uneasy feeling that they were getting kind of a raw deal on these guys. And I hate to break it to any business programmers out there.... but they were right.

According to the Occupational Outlook Handbook entry for computer programmers, there are very specific and very real reasons that programming jobs are being phased out.
"Sophisticated computer software now has the capability to write basic code, eliminating the need for many programmers to do this routine work. The consolidation and centralization of systems and applications, developments in packaged software, advances in programming languages and tools, and the growing ability of users to design, write, and implement more of their own programs mean that more of the programming functions can be transferred from programmers to other types of information workers, such as computer software engineers.

Another factor limiting growth in employment is the outsourcing of these jobs to other countries. Computer programmers can perform their job function from anywhere in the world and can digitally transmit their programs to any location via e-mail. Programmers are at a much higher risk of having their jobs outsourced abroad than are workers involved in more complex and sophisticated information technology functions, such as software engineering, because computer programming has become an international language, requiring little localized or specialized knowledge. Additionally, the work of computer programmers can be routinized, once knowledge of a particular programming language is mastered."
I would add to this list that "business programming" is almost inherently redundant. Every company out there that employs in-house developers is reinventing the wheel. 99% of the problems their programmers are solving have been solved by thousands of other programmers at other companies around the country. When I look at it that way, it feels like such a tremendous waste of money and time. These programmers could be working on real problems like true AI, building Skynet, or bringing about the rise of the machines and their subsequent domination over the human race.

So essentially, the big reason is that there is finally a way for a company to separate their technical needs from their business needs. Packaged software has finally come to a point where it solves the general problems, while still providing the minimum amount of flexibility necessary to handle a company's critical peculiarities. When they have a need that isn't fulfilled by some packaged solution out there, contracting resources have become plentiful and cheap enough to fill that need. The company can move the business knowledge into more generic manager roles that are more useful to the company. (Roles with better pay, and job titles such as "software engineer", "system analyst", and "project manager".) And the technical knowledge can be moved mostly out of the company into an "unpluggable" resource that they only need to pay as long as the work is actually being performed.

So, what's a programmer to do?

Well, first of all, stop being just a programmer. Don't be a code monkey. Yes, the under-appreciated coder-geeks of the world have "owned" that pejorative term. But in the eyes of corporate America, the words "code monkey" are always preceded by the words "just a". Code monkeys abound. They are replaceable. If you don't evolve (pun intended), you're going to end up obsolete when your company finds an offshore contractor that doesn't suck. (Yes, they do exist!)

One way you can evolve is by taking the advice of all the articles and reports I've linked to in this post. You can take courses, get certifications, etc., and become a "software engineer" or a "system analyst" or a "project manager". The numbers show there are plenty of companies out there that will be willing to pay you to do this as long as you don't suck at it. (And some that will even if you do suck. But I strongly advise against sucking.)

Many programmers go this route at some point. And there's no shame in it, if you can let go of coding every day (or at all) as part of your job and not hate yourself for it. I think I might go that route one day, but I'm not ready for it yet. For one thing, I don't feel my experience base is either broad or deep enough to be as effective as I would want to be. But also, there are just too many cool languages and libraries out there. Too many programs that haven't been written yet. Too many problems that haven't been solved.

So what is there for me, and those like me? Those who don't want to give up programming as the core of our job, but don't want to end up teaching programming to high school kids by day while we sate our code-lust at night in thankless open source efforts? (No, not all open source is thankless. But there are a lot of projects that go nowhere and just end up abandoned.)

Two answers:
  1. Consulting. Not everyone is willing to send the work around the world, dealing with language barriers, time-zone difficulties, and security uncertainties, to get it done. There's a definite place in the market for on-shore, face-to-face consulting. This spot is getting tighter though, so you had better make sure you're a top-shelf product.
  2. Software companies. The companies that sell the software that's putting code monkeys out of jobs are still raking it in. And they're actually solving new problems, building new technologies, etc. All the exciting stuff I dream of working on as I churn out yet another database script for a client.
These are your new targets. The sooner you make the jump the better. Not only will you be trading up, as far as job satisfaction (and probably pay), but you'll also be contributing your own small part to the further expansion of software technology. Both by actually working on it, and by condemning the tremendously redundant field of "business programming" to the death it has deserved for so long. You'll probably still have to learn that business and management stuff, but you can also probably avoid it taking over your job.

A new kind of Democracy

Friday night I was listening to the Boston-based NPR show On Point. The episode was about how reading affects the brain. It was interesting, but that's not what I want to talk about. There were two guests on the show. The main guest was a professor of child development at Tufts, and the other was a professor of educational communication and technology at Wisconsin's very own UW Madison. This latter guest was brought in as a counterpoint to the main guest's fear that people spending more time with videogames and on the internet were missing out on many of the amazing benefits that reading bestows upon the brain.

Given the UW prof's area of focus, the conversation of course eventually came around, briefly, to blogs. She claimed that just as the Gutenberg printing press enabled the "democratization of knowledge", blogging will bring about another fundamental shift. One where, as before, some very nice things will be left behind, forgotten by most, in sacrifice to the emergence of a new mode of interaction that will bring about its own useful new evolutions and societal impacts. I found this a tremendously interesting proposition, and hoped they would follow this line of thought a bit more, but unfortunately the host had latched onto some earlier point and swung back around to that, never to return.

I'm not sure I heard another word of the broadcast anyway, though, because my mind was off and racing.

Many people have already staked a claim on the blogosphere as the next frontier in journalism. Rejoicing has already begun that journalism will migrate into the blogosphere and morph into something new and different than ye olde journalisme, bringing enlightenment to all and ushering in the Age of Aquarius.... Certainly it will offer a new value proposition, but I am not yet convinced of the extent to which blog journalism will supplant traditional journalism. Regardless of that, however, these people are thinking too narrowly.

Let's go back for a minute to that statement about "democratized knowledge". I've heard this turn of phrase several times in reference to the rise of the printing press, but never really stopped to think about what is actually being expressed there. The history of human communication seems to me to track very closely the history of significant "advancements" in civilization, and with a bit of thought I think it becomes abundantly clear why this should be.

When speech and body language were all that was available, the spread of knowledge was a social affair. You needed to have at least two people, in close proximity, interacting more or less directly. The advent of writing freed knowledge from the bounds of proximity in space and time. But consumption was serial because the recordings required a huge time investment. First one person, then another, then another.... If the recording was not lost or stored away in a private library. The printing press resolved these issues. It became trivial to make multiple copies of a manuscript. Copies could be made as long as there was demand for them. If some were destroyed others would survive.

The printing press made knowledge available to all those who desired to consume it. It was no longer the privilege of rich or powerful men. If you could read, you could learn, regardless of whether someone was available to teach you personally. One man's idea could be made available to thousands, across time and space, and generally without concern for the social position of the consumer. This is what is meant by the democratization of knowledge, and it really was a significant turning point in human history.

I contend that the rise of blogging is the next great evolution of human communication. Sir Isaac Newton is quoted as having said "If I have seen farther, it is only by standing on the shoulders of giants." How do we know they were giants? Because they had the time and resources to make recordings of their ideas, and people valued them enough to demand and preserve them. In the past there were generally two ways you could get your ideas disseminated to humanity at large. If you were privileged enough to have the time and resources, you could finance a recording of your idea regardless of its perceived value. Or if you really believed in your idea, you could make a big sacrifice to get it recorded, and pray that people valued it enough to compensate you for your investment.

With blogging, these restrictions are largely abolished. If you have access to a computer, you can create a blog, for free, as I've done here. You can record your ideas as quickly as you can type them, and once you post them, they are instantly available for billions of people to simultaneously search for, stumble upon, consume, discuss, and share.

Newton and Einstein stood on the shoulders of giants. With blogging, we no longer have need of the giants. For you and I to see farther, we can assemble a great pyramid of normal human beings, each with a small contribution that in aggregate has potential that the "little people" of the past could only dream of participating in.

We can also move beyond ideas and into expression and experience. Blogging allows us all to share with the world our unique experiences, our viewpoints, our disasters, our epiphanies, our humanity. No matter how mundane our individualities may be judged by most, we now have the potential to find the other people out there who are weird in the same ways we are. With increasing frequency and ease we can now connect online with someone else who has shared our joys and sufferings, our beliefs and needs. Someone with whom we can identify and be comforted. Or we can peer into the "mundane" individualities of others, and experience the wide breadth of simple but beautiful diversities of humanity. These are often things that are difficult to get published in print, unless the presentation is particularly engaging, or the story is something that we expect will touch large groups of people. Online these barriers do not exist. And we need not simply consume the others' narratives either. We are empowered to interact with the speaker and participate in their narrative.

I could go on into more specifics, but instead maybe I will dedicate another post to that sometime down the line when I've thought about it a bit more.

To put it succinctly, Gutenberg's printing press enabled the democratization of the consumption of knowledge. With the emergence of blogging, we are witnessing the democratization of the production of knowledge and of expression of the human experience.

Maybe not the beginning of the Age of Aquarius.... But I think it's a bit of a "big deal" nonetheless. Though I am probably late to the party on this.

My Grand Entrance

I have officially joined the blogosphere. I'm still not sure I like that word. It sounds like something Dan Simmons would have made up. But here I am nonetheless. My entry was unceremonious, not surprisingly, as my online presence has heretofore consisted solely of my college Senior Design project page and scattered posts on a few unrelated forums in various corners of the internet (some of which I am now thoroughly ashamed of).

I created this blog as an outlet for my many and varied interests. It has been exceedingly rare to find someone who is willing to listen to me talk about them, let alone actually hold up one end of a discussion. So, hopefully this will provide me a pressure valve, where I can let out some of the thoughts that build up in my head without thoroughly irritating my friends and coworkers.

If I'm lucky, someone will read one of my posts all the way through before realizing they've wasted their time on the ramblings of some idiot on the internet.

And if I'm really lucky, it'll turn out that there's another person out there who actually cares enough about the same things as I do to tell me how ridiculous my ideas are and that I should just close up shop and go back to my day job.

Anyway, here are a few of the things I'll likely blog about:
  • Software Development
  • Programming Languages (and theory)
  • Technology and Culture
  • Theoretical Physics
  • Education
  • Math
  • Professional Development
  • Language / Linguistics
If these obscure and esoteric topics look interesting to you, God help you.... I mean stay tuned!