Friday, May 15, 2009

Innovation in the computer industry

Most innovations are combinations of existing ideas; this is how innovation works. In a recent ZDNet blog, Larry Dignan examined how effective Microsoft, IBM and others are at profiting from their R&D spending. The article seems quite reasonable and the claims seem well supported.
But in the comments, a recurring meme claimed that there is no innovation in FLOSS. For example, mikefarinha claims that "all of the big name OSS projects" exist to steal market share from Microsoft. His list of FLOSS projects is Firefox, OpenOffice.org, Samba, Wine and Lindows. For starters, Lindows is hardly a major FLOSS project. I would list it well behind Apache, any of the BSDs, Linux and OpenJDK, none of which made his list of 'big name projects'.

Firefox comes from Mozilla. Mozilla comes from Netscape. Apparently, Firefox exists to take market share away from Internet Explorer. If you look at the User-Agent string from Internet Explorer, you will read:
Mozilla/4.0 (compatible; MSIE 7.0b; Windows NT 6.0)
Interesting, it looks like Internet Explorer is emulating Mozilla. If you really want to look for the earliest web browsers, you will discover ViolaWWW. Rather than running on Windows, ViolaWWW ran on Unix and the X Window System. You may recognize X, since X.org is a 'big name' FOSS project.

OpenOffice.org is simply the open source version of StarOffice. StarOffice began with StarWriter, just like Microsoft Office began with Microsoft Word. But StarWriter was originally a German word processor for the Zilog Z80 and CP/M. CP/M is an operating system that was originally developed by Gary Kildall of Digital Research. CP/M is the ancestor of several DOS systems, including MS-DOS. In Microprocessor Report (Vol 8, No. 13, October 3, 1994), John Wharton concluded "The Origins of DOS" with:
The strong impression I drew 13 years ago was that Microsoft programmers were untrained, undisciplined, and content merely to replicate other people’s ideas, and that they did not seem to appreciate the importance of defining operating systems and user interfaces with an eye to the future. In the end it was this latter vision, I feel, that set Gary Kildall so far apart from his peers.
Not exactly a rousing defense of software innovation at Microsoft. So, we find that StarWriter was developed for an OS that predates MS-DOS. If I recall correctly, the GUI version of MS Word was actually developed for the Macintosh. So it is not clear to me how OpenOffice.org is an example of a Microsoft innovation that others copied, since it can trace its code base back further than MS Word can.

Next, we have the case of Samba. Samba implements the SMB protocol, which Barry Feigenbaum developed at IBM. Microsoft implemented a heavily modified version of SMB. Andrew Tridgell and the Samba team worked to create a version of SMB that worked with DEC Pathworks. It was only later that they tried to understand Microsoft's undocumented modifications so their SMB server could work with more of the computers in their employer's network. Apparently, trying to achieve file sharing between Windows and Unix is just an attempt to steal market share. To my way of thinking, when Microsoft added undocumented changes to an otherwise open protocol, it was an attempt to achieve vendor lock-in and steal market share from the rest of the industry.

That leaves only Wine, which is a compatibility layer that allows *nix computers to run Windows applications. Personally, I haven't had much luck with Wine, so I don't see how Microsoft has much to worry about from it.

In addition to mikefarinha, we have Rick S._z, who states:
But here's a TECHNICAL creation which changed the computing world, and was almost totally invented by Microsoft: truetype fonts. Before MS built Windows 3.1 around them, no one had thought to use the SAME fonts for your printers and your screens. Fantastic idea, and implemented beautifully.
The time is about right, but the source of the innovation is wrong. TrueType Fonts were developed by Apple, as noted by Microsoft.

I am not actually just an 'anything but Microsoft' zealot, but it is just wrong to state that FOSS does nothing other than 'steal market share from Microsoft.' For example, Microsoft did release NT, which does represent several significant innovations over DOS. According to the New York Times, the original name for NT was OS/2 3.0. OS/2 was an operating system developed by IBM. Microsoft did a great deal to improve the network stack. Namely, they used the BSD Unix network stack (see the Defcon archives for evidence to support this claim). There is nothing wrong with this, but you have to admit that the innovations for the network stack came from those open source Unix developers at Berkeley. Just for giggles, open ftp.exe (it's in your C:\Windows\System32 directory) with Notepad. You will see the BSD license. Microsoft is not amazing because they invented everything; they are amazing because they integrated innovations from all over the world.

Remember Pablo Picasso's observation, "Bad artists copy. Good artists steal." If you want to be creative, you have to steal the good ideas.

Tuesday, May 5, 2009

Choosing SCD Type at Run Time

When a data warehouse is designed, the architect must choose a dimension type for each dimension in the warehouse. The two most common types of slowly changing dimension (SCD) identified by Ralph Kimball are Type 1 and Type 2. In the Type 1 SCD, changes in dimensional attributes are overwritten. In the Type 2 SCD, changes are not overwritten, but are recorded in multiple records for each object in the dimension table.
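To make the difference concrete, here is a minimal sketch in Python. The column names (surrogate_key, customer_id, city, valid_from, valid_to) and the two helper functions are my own assumptions for illustration; they do not come from Kimball or from any particular ETL tool.

```python
from datetime import date


def apply_type1(dimension, customer_id, new_attrs):
    """Type 1: overwrite the attributes in place, so the old values are lost."""
    for row in dimension:
        if row["customer_id"] == customer_id:
            row.update(new_attrs)
    return dimension


def apply_type2(dimension, customer_id, new_attrs, change_date):
    """Type 2: close the current record and append a new version of it."""
    current = next(
        row for row in dimension
        if row["customer_id"] == customer_id and row["valid_to"] is None
    )
    current["valid_to"] = change_date
    new_row = {
        **current,
        **new_attrs,
        "surrogate_key": max(row["surrogate_key"] for row in dimension) + 1,
        "valid_from": change_date,
        "valid_to": None,
    }
    dimension.append(new_row)
    return dimension


# Hypothetical customer who moves from Austin to Denver.
type1_dim = [{"surrogate_key": 1, "customer_id": "C42", "city": "Austin"}]
apply_type1(type1_dim, "C42", {"city": "Denver"})

type2_dim = [{
    "surrogate_key": 1, "customer_id": "C42", "city": "Austin",
    "valid_from": date(2008, 1, 1), "valid_to": None,
}]
apply_type2(type2_dim, "C42", {"city": "Denver"}, date(2009, 5, 5))
```

After the move, the Type 1 table still holds one record for the customer, while the Type 2 table holds two: the closed Austin record and the open Denver record.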

After using Pentaho's PDI, it became clear that you can specify the dimension type at the attribute level as long as the underlying table is a Type 2 SCD. Still, these decisions need to be made when the dimension table's ETL is being designed.

What happens if you would rather delay the decision until run time? This flexibility can be provided with an auxiliary table for each dimension table. The auxiliary table links each record in the dimension table with the most recent record for that object. With that table defined, you can easily set up a view over a Type 2 dimension table that behaves like a Type 1 dimension.
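One way this could look is sketched below, using Python's sqlite3 module so the example runs on its own. All table, view and column names (customer_dim, customer_dim_current, customer_dim_as_type1, sales_fact) are hypothetical, and I am assuming the ETL keeps the auxiliary table up to date whenever a new version of an object is added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Type 2 dimension: one record per version of each customer.
CREATE TABLE customer_dim (
    surrogate_key INTEGER PRIMARY KEY,
    customer_id   TEXT NOT NULL,
    city          TEXT NOT NULL,
    valid_from    TEXT NOT NULL,
    valid_to      TEXT
);

-- Auxiliary table: maps every record to the most recent record
-- for the same customer (assumed to be maintained by the ETL).
CREATE TABLE customer_dim_current (
    surrogate_key INTEGER PRIMARY KEY REFERENCES customer_dim (surrogate_key),
    current_key   INTEGER NOT NULL REFERENCES customer_dim (surrogate_key)
);

-- View keyed like the Type 2 table but carrying only current attributes.
CREATE VIEW customer_dim_as_type1 AS
SELECT aux.surrogate_key, cur.customer_id, cur.city
FROM customer_dim_current AS aux
JOIN customer_dim AS cur ON cur.surrogate_key = aux.current_key;

CREATE TABLE sales_fact (customer_key INTEGER, amount REAL);

INSERT INTO customer_dim VALUES
    (1, 'C42', 'Austin', '2008-01-01', '2009-05-05'),
    (2, 'C42', 'Denver', '2009-05-05', NULL);
INSERT INTO customer_dim_current VALUES (1, 2), (2, 2);
INSERT INTO sales_fact VALUES (1, 100.0), (2, 50.0);
""")


def sales_by_city(dimension):
    """Total sales per city; `dimension` is a trusted, hard-coded name."""
    query = f"""
        SELECT d.city, SUM(f.amount)
        FROM sales_fact AS f
        JOIN {dimension} AS d ON d.surrogate_key = f.customer_key
        GROUP BY d.city
        ORDER BY d.city
    """
    return conn.execute(query).fetchall()


# The choice between the two behaviors is made when the query is run.
type2_result = sales_by_city("customer_dim")
type1_result = sales_by_city("customer_dim_as_type1")
```

Joining the fact table to customer_dim attributes each sale to the city the customer lived in at the time, while joining to customer_dim_as_type1 attributes every sale to the customer's current city. The fact table and the dimension table are the same in both cases.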

I am surprised that I don't find any mention of this in standard data warehouse references or in online articles. If anyone can point me to a reference, I would be grateful.