Kevin Hegg: March 2008

Monday, March 31, 2008

Throwing out the first pitch at a Major League Baseball game

My 9-year-old nephew, Charlie Costa, is a die-hard St. Louis Cardinals baseball fan. I have fun talking trash with him because I have been a die-hard Chicago Cubs fan since about his age.

A little over one year ago Charlie was diagnosed with a pretty serious brain tumor. The doctors treated it right away, but in the fall of last year they determined they hadn't gotten all of it. At the beginning of this year he went in for another 6 weeks of treatment. This time the doctors think they got all of it, but only time will tell. This last year has been very tough on Charlie, his mother (my sister), and their family. There have been a lot of trips from Springfield, Illinois to St. Louis for treatment.

The hospital that Charlie has been treated at has an affiliation with the St. Louis Cardinals. When the opportunity for a patient at the hospital to throw out the first pitch came up, Charlie was the first one everyone thought of. There is no bigger fan than him. He will be throwing out the first pitch when the St. Louis Cardinals play the Washington Nationals on April 6. This is such a huge thrill for him and a nice end to a difficult period in his life. So, if you are watching the baseball game and you see a little kid throwing out the first pitch you know the story behind it.

Monday, March 24, 2008

High functioning teams

Over the last week on the ALT.NET list there has been a discussion about the "expert" ALT.NET'ers joining forces to form the best IT consulting business. This idea is not a new one. Many have discussed this in other venues and some have acted on it. Recently, Martin Fowler wrote about PreferDesignSkills. This is not a new idea either. What is common between both of these discussions is the focus on the individual. For the first 20 years of my career I fell into the "best individual" camp; however, over the last 10 years I have been in the "best team" camp. This change in thought can lead to very different hiring practices and company culture.

When interviewing a candidate I start by evaluating the person on the things they claim to know. If you don't know what you claim to know or aren't as good as you claim to be in that area then that is grounds for immediate rejection. Next, I evaluate the candidate for capacity and willingness to learn. This is where Fowler and I start to disagree. Fowler values design skills above anything else, as far as I can tell. I take a much broader view. I value candidates with a strong foundation. Design skills are important, but they are just one part of the foundation. Regarding capacity to learn, I am looking for someone who can reason about a new method, technology, process, tool, etc. and develop an informed opinion about if and when to use it. If the candidate only believes in the "one true way" or "one true tool" then they have lost their capacity to learn, at least in that area. Regarding willingness to learn, I am trying to determine how self-motivated the candidate is to learn new things. Next, I evaluate the candidate on their ability to make a team better. Is this candidate willing to play a specific role on the team? Will this candidate help make other team members better? Will this candidate be respectful to other team members? Will this candidate show loyalty to the company, project, and customer? Finally, I try to form an assessment of how much of an investment is required before the candidate is a contributing member of the team.

Getting back to Fowler's design vs. platform skills choice, there are a couple of problems with his conclusions when you look at it from a team point of view. First, what are the skills that are required to round out the team this candidate will be put on? If the team doesn't have all of the skills necessary for the project to succeed then I believe the most important thing is to fill the holes in the team. It is easy to say that you are hiring the person for the future and not for the project, but if you put a person on a project that they aren't a good fit for then you haven't done the candidate, company, project team, or customer any favors. Putting any butt in any seat doesn't work, in my opinion. Second, I don't believe that you can make a blanket statement "that a good programmer should be able to pick up a new platform relatively quickly". Some projects require solving hard problems and that may require a deep understanding of the platform. Third, Fowler shows a clear bias toward design skills over platform skills. He implies that platform skilled people will never be able to acquire design skills or at least not easily. After a couple of years working in Microsoft Consulting Services I have a very different perspective. I can match his horror stories one-for-one: ripping out 1,000 lines of non-working Spring.NET code and replacing it with 20 lines of .NET platform-optimized code, reliance on Rational XDE design-generated code that resulted in a sub-optimal .NET solution requiring substantial performance tuning, or a sub-optimal data access layer that failed to leverage any of the features of the DBMS. My point is not that platform is preferred over design, it's that both play an equally important role in the success of a project.

NOTE: I have nothing against Spring.NET. It is something that I have used successfully, but in this case it was used when it wasn't necessary and it was used incorrectly. The person who wrote this code was a design-focused person who was looking to implement IoC and AOP everywhere, including places where it wasn't needed to meet the requirements.

There is a perpetual debate that goes like "1 top-2% developer can outperform 10 bottom-50% developers". You can replace any of the four numbers with whatever you want, but the story is the same. A highly-talented person can outperform a bunch of below-average people. As Craig Andera said at our last ALT.NET meeting, by its very definition we are always going to have many more below-average than highly-talented developers. I've been there and done that. I've been the top developer that outperformed X other developers. So, what good came from it? Certainly, my customers appreciated the efforts, but the team didn't. There are only so many hours in the day that a highly-talented person can work and because there are so few of them it isn't a scalable practice. When I left a project the team wasn't able to continue on the same path or velocity because they didn't get it. When the customer finally realizes the maintenance issues it is too late. Is there something better than this? I think so and have been increasingly working differently over the last 10 years. Instead of stating that I can outperform X developers I ask "How can I make X developers and the team as a whole more productive?". I don't have any numbers to support this claim, but I believe this change in attitude can lead to higher functioning teams. I believe the "outperform" attitude causes you to write off a bunch of people unnecessarily. As in any industry, the IT profession has some useless individuals, but I think the number is a lot less than elitists believe it is.

Back to the original question, can a bunch of experts join forces to form the best IT consulting business? It is certainly possible, but I don't believe it is probable and it's certainly harder than forming a corporate structure. I believe the question is too focused on the individual. It takes a lot more than individual effort, no matter how expert, to build the best company. It takes a lot more than individual effort, no matter how expert, to grow beyond a small company. High functioning teams are an essential ingredient to building the best company.

Friday, March 14, 2008

Are we going to see major changes in data management?

Row-oriented Relational Database Management Systems (R-RDBMS) have grown in popularity since the early 1980's to the point where the overwhelming majority of data management by the early 2000's was handled by RDBMS's. There have always been alternatives to R-RDBMS's, but until recently none of the alternatives have provided sustainable technical advantages or gained significant market share. So, what's different now and is it going to impact the life of a software developer?

Early in my career I had the opportunity to program IBM's SQL/DS and shortly after that Oracle (2.0 or 3.0, I forget). Relational database development was simple for me. In late 1986 Sybase released SQL Server 1.0. I was assigned to a project that was going to use it, but since the project had large-scale data and significant performance requirements I needed special training. I spent a couple of weeks at Sybase learning SQL Server internals and there was no going back. Since then I have kept up with Sybase, Microsoft SQL Server, and Oracle database internals and have spent a lot of time designing and tuning databases. At this point in my career I feel that I can squeeze every drop of performance out of any R-RDBMS. Now that we have established that I am an experienced database guy, let's resume the discussion. :-)

As R-RDBMS's matured vendors began throwing everything into a single product. Michael Stonebraker discusses the "one size fits all" approach and the problems with the approach here and here. He lays the foundation for when and why R-RDBMS's start to fall apart. From my personal experience building applications to process sensor data and financial data feeds it took a lot of expertise, effort, and cost to tune commercial R-RDBMS's to meet the performance requirements.

With the rapidly decreasing cost and increasing capacity of CPU, RAM, and disk storage the alternatives to R-RDBMS's start to become much more attractive. Cheaper hardware has also increased the appetite to process and store substantially more data, and this is now testing the limits of R-RDBMS technology. As solutions scale into the petabyte range many of the traditional data modeling techniques like Ralph Kimball's dimensional modeling are less successful. The same can be said for indexing (bit-map) and physical partitioning schemes. What worked for 10 GB - 10 TB doesn't work nearly as well for 10 TB - 10+ PB. Also, the problems become more severe as your processing requirements approach real-time.

Over the last couple of years the number of columnar storage solutions has increased noticeably, led by Google's BigTable, Sybase IQ, Stonebraker's research with C-Store, etc. Much of the benefit from columnar storage comes from the ability to compress data and to substantially reduce the I/O's needed to complete a query. Google decided not to use SQL for a query language while other columnar solutions stuck with SQL, but what this showed is that SQL is also reaching its limits.
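To make the I/O and compression argument a little more concrete, here is a minimal Python sketch. The sample table, its column names, and the function names are all made up for illustration, and the "fields read" counter is only a stand-in for real disk I/O. It does not reflect how BigTable, Sybase IQ, or C-Store are actually implemented; it only shows why storing values by column can help a query that touches one column, and why a column of repeated values compresses well (run-length encoding is used here as one simple example of a compression scheme).

```python
from itertools import groupby

# Hypothetical sample table: (order_id, region, amount)
ROWS = [
    (1, "east", 10.0),
    (2, "east", 12.5),
    (3, "east", 7.0),
    (4, "west", 3.0),
    (5, "west", 8.5),
    (6, "west", 1.0),
]


def to_columns(rows):
    """Pivot row tuples into one list per column (a column-oriented layout)."""
    return [list(col) for col in zip(*rows)]


def run_length_encode(values):
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def sum_amount_row_store(rows):
    """Row layout: every field of every row is read to total one column."""
    fields_read = sum(len(row) for row in rows)
    return sum(row[2] for row in rows), fields_read


def sum_amount_column_store(columns):
    """Column layout: only the one column the query needs is read."""
    amounts = columns[2]
    return sum(amounts), len(amounts)


columns = to_columns(ROWS)
row_total, row_fields = sum_amount_row_store(ROWS)
col_total, col_fields = sum_amount_column_store(columns)
region_rle = run_length_encode(columns[1])

print(row_total, row_fields)  # same total, all 18 fields touched
print(col_total, col_fields)  # same total, only 6 fields touched
print(region_rle)             # six region values stored as two runs
```

Both layouts produce the same total, but the column layout touches a third of the fields, and the gap widens as the table gets wider. The region column also shrinks from six values to two runs, which is the kind of saving that is much harder to get when each value is interleaved with the rest of its row.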

Next, Stonebraker published his research on H-Store. He proposes that pure OLTP applications can see dramatic improvements in performance from massive simplification of the database engine and performing in-memory, distributed processing of data. Also, he proposes to do away with SQL as the query language. Werner Vogels, who is well-respected in the distributed systems community, expresses some scepticism. While he likes Stonebraker's challenge to provide 50x improvements, he is worried that Stonebraker is only solving the scale-up problem when instead he should be focused on the scale-out problem, similar to what Google and Amazon do. That was my concern initially, but the more I think about it the less I believe that H-Store is inherently unable to scale out. I think it is just a matter of time. Now that there is a working implementation of H-Store, they can focus on scaling out.

This led into a (contentious, if you read the blog comments) debate between Curt Monash and Stonebraker about how many different types of databases should be supported. The exact categorizations don't matter so much. What does matter is that the R-RDBMS world is "hitting the wall" with increasing regularity with the "one size fits all" solution and that is driving the database market to come up with a variety of alternate solutions for each category of data management. I don't believe that R-RDBMS's offer a good solution for managing XML, semi-structured, and unstructured data, especially as the amount of data increases and the processing requirements approach real-time. Also, I don't believe that SQL is the correct language for processing this data.

I disagree with some of Stonebraker's comments though. He thinks the low-end OLTP market will go almost entirely to open-source databases. I don't believe it is that simple. First, brand loyalty should not be underestimated. The cost difference between low-end commercial and open-source R-RDBMS's isn't significant enough to drive people in one direction or another. Second, the cost and complexity of swapping out R-RDBMS's for legacy systems far outweigh the license cost savings. Third, I think that solutions like Amazon's SimpleDB and Microsoft's SQL Server Data Services will be a more attractive option for the low-end than open-source. Not only do they eliminate the software license fees, but they eliminate the hardware and system/database administration costs. Head-count reduction is far more important to many organizations than software license reduction.

I agree with Stonebraker that the current R-RDBMS vendors are at risk of getting caught in the middle as we undergo change in the data management market. Also, I agree that the R-RDBMS's are getting too complex and this complexity is unnecessary. Finally, there is one conclusion that I would like to add. If solutions like H-Store are able to eliminate transactions, concurrency management, and other complexities then the benefits will be so great that they will quickly permeate into the mid-range solutions.

How is the developer's life going to change? First, they will potentially have to unlearn a lot of relational database design and programming habits. For an H-Store type of solution, this could result in a substantial reduction in the amount and complexity of the code. It will bring us a lot closer to being able to automatically generate the data access layer from a data model since much of the programmer intervention is due to transaction and concurrency management. Second, if we find a suitable replacement for SQL then we can potentially eliminate another big pain, Object/Relational Mapping. Am I the only one who thinks that every O/RM tool completely sucks? I know what problem they are trying to solve. I just don't think they are solving the problem. Yes, I can build a working application with them, but I feel so dirty afterwards. I feel like I am just trading one problem for another. Third, in the area of columnar storage and non-relational data solutions I think we will have to be prepared for more developer effort in the short-term. The Google-imitators are just now learning that BigTable and MapReduce types of solutions are no free ride. The lack of tool support and best practices is something that will be fixed over time, but in the short-term it will be an issue.

When are the major changes going to happen? They already are happening. When is it going to impact the average customer or developer? I am not smart enough to accurately predict this, but I think it is close enough that I am paying attention. If and when changes start to happen, I want to be ready.

Wednesday, March 12, 2008

MiX 2008 Conference Summary

I attended the MiX 2008 Conference last week. Overall, it was a good conference and I recommend attending it in the future.

The keynote speeches were pretty dry. The best part was the pre-keynote entertainment by Vince Mira. He did an incredible job singing Johnny Cash songs. Ray Ozzie, Steve Ballmer, and Guy Kawasaki were boring. Dean Hachamovitch provided a good overview of IE 8. Scott Guthrie talked about Silverlight, ASP.NET futures, and the Visual Studio 2008/Windows Server 2008/SQL Server 2008 launch. Deep Zoom (previously called Seadragon) looks interesting.

The "Crossing the Usability Chasm - Advanced and Adaptive User Interfaces" talk was well-done. Gil Hupert-Graff provided good advice on user interface development based upon his research.

The "Building Rich Internet Applications Using Microsoft Silverlight 2, Parts 1 & 2" talks provided an introduction into Silverlight 2.0 development. Since I haven't done much with Silverlight this provided me with exactly the introduction I was looking for.

The "RESTful Data Services with the ADO.NET Data Services Framework" and "Building RESTful Real World Applications with ADO.NET Data Services" talks provided a good overview of the ADO.NET Data Services Framework. I liked what I heard and plan on diving into this deeper over the next month.

The "Introducing SQL Server Data Services" talk provided a decent overview of SQL Server Data Services (SSDS). Contrary to the SSDS team's blog, the current state of SSDS makes it a very comparable product to Amazon's SimpleDB. I am sure that over time Microsoft will expose more of SQL Server through SSDS and that will allow it to surpass SimpleDB. I am waiting for my account so that I can experiment with SSDS. I am interested in seeing how complex of a data model you can build or how large of a database you can create before it no longer makes sense to use SSDS. Just like SimpleDB, SSDS should be able to satisfy the small, simple database niche well. As Microsoft rolls out more of their Software plus Services offerings I think they are uniquely positioned to benefit from product integration. I wonder how long it will take before we hear more anti-trust complaints.

The "Building Great AJAX Applications from Scratch Using ASP.NET 3.5 and Visual Studio 2008" talk provided an introduction to AJAX development features in .NET 3.5 and Visual Studio 2008. Even though .NET 3.5 and Visual Studio 2008 were just released this information has been out in various forums for many months. Since I have already been playing with these bits for a couple of months there wasn't any new information for me in this talk.

The "Cross-Browser Layout with Internet Explorer 8" talk provided a lot of useful information. The most important thing that was discussed (also in the keynote) was the commitment that Microsoft is making to CSS 2.1 compliance. This should calm a lot of web developers down if compliance is mostly achieved. Microsoft seems to be doing a lot to build unit tests to prove compliance. It will be interesting to see how much of the non-Microsoft community contributes to the suite of unit tests. The performance improvements in IE 8 look promising too.

The "Developing ASP.NET Applications Using the Model View Controller (MVC) Pattern" talk was probably the best at this conference. Scott Hanselman is always entertaining, but I think the MVC bits are going to be a huge hit for architects. Scott made a point of saying that MVC is optional, but if the MVC bits evolve as I believe they will evolve then there is not going to be much of a reason to develop ASP.NET applications any other way. I am looking forward to digging into the MVC bits.

The "Using an Internet Service Bus to Build Next Generation Applications and Services" talk provided a good overview of BizTalk Services, which is not BizTalk Server. Just like SSDS, BizTalk Services is going to address a niche set of problems. I look forward to playing with these bits.

The "Using the Microsoft Sync Framework and FeedSync" talk provided an excellent overview of the Sync Framework. I think this is going to be a huge hit with developers. I look forward to playing with these bits.

Leaving Microsoft

I joined Microsoft in 2005 to work in the Intelligence and Homeland Security practice of Microsoft Services. I resigned from Microsoft on Feb. 29, 2008.

I am grateful for the opportunity I had to work for Microsoft. During my stay at Microsoft I worked as a software development consultant, architect, and manager, sometimes all at the same time. My managers were always very supportive of me. They put me in charge of growing the business and managing the project delivery in a couple of key areas and gave me a lot of freedom to do so. My team and I didn't disappoint. We had very significant revenue growth driven by successful project delivery due to a lot of hard work.

If everything was so great then why did I leave Microsoft? I am at a point in my life where I can do things differently than most. I am 49 years old, my children have been raised, I am virtually debt-free, I have a decent amount of money saved, and I live very frugally. So, I am leaving Microsoft because I can, not because of any specific negative reason. I have a couple of itches that need to be scratched. Who knows, after I have scratched my itches I might consider rejoining Microsoft.

What's next? First, I am not retiring even though I can afford to. Anyone who knows me knows that I like to work hard. I will probably continue working until I drop. I figure I have at least 20 years of productive work life remaining. Second, I am considering whether to resume my Ph.D. studies. Over 20 years ago I dropped out of the Computer Science Ph.D. program at University of Michigan because I had a wife and three children to support. While I can handle the academics I am not sure if I have the temperament to go through a formal program. Third, I like entrepreneurial work. I enjoyed the couple of start-ups that I have been involved with. I see one or more new ventures in my future.

At a minimum, I am taking the month of March off. I have a long list of technical topics that I want to dig into. After March I have no specific timeline for my next venture.

Kevin Hegg