Monday, February 26, 2007

Curious as it sounds, Microsoft's open source lab has published an article on how to install and configure PostgreSQL on a Windows Server system.  This is very interesting, considering how PostgreSQL can be considered a competitor to Microsoft's own SQL Server 2005.

For those of you not familiar with PostgreSQL, it is an open-source, industrial-strength relational database with most of the features you'd find in a mature database server.  It has triggers, stored procedures and views, and can handle a variety of stored procedure languages, including Java, PHP and Ruby.  On the client side, PostgreSQL ships with both OLEDB and a .NET data providers.  One of the fascinating features of PostgreSQL is its ability to have inherited tables.  I can see where this could dovetail with OR mappings for subclasses.

posted on Monday, February 26, 2007 9:43:05 AM (Central Standard Time, UTC-06:00) by Christopher S. Velazquez
 Wednesday, January 31, 2007

I have this habit of underestimating Microsoft, especially when it comes to the Office product line. At first blush, I was unimpressed with Office 2007.  To the average user, on first glance, it looks like Microsoft has traded one set of GUI widgets for another set of more complicated GUI widgets.  Combining this with the IT pundits declaring Office 2007 dead on arrival, I ignored it.  It's hard for me to get very excited about new widgets.

But last night I chanced upon the VSTO site (Visual Studio Tools for Office), and took a peek at this video.  I was amazed, and had to concede that Microsoft had pulled it off once again.  They have raised the bar for us programmers and expectations of users.  Briefly, your WinForms user controls can plug right into the Office environment, into either the ribbon or in the side docking panel.  Having worked on a system in C# that emulates a similar behavior, I was very interested, and a little disappointed that my craftily written code would soon be obsolete.

The thing about Office, and Word and Excel in particular, is that everyone uses it.  People get a warm fuzzy from software they use frequently and have some understanding of.  The number one feature request I get is to be able to download data into an Excel spreadsheet.  And VSTO 2005 SE (Second Edition) makes the job for programmers integrating into Office a whole lot better.  And it gives a compelling reason to upgrade to Office 2007.  This is only a minor upgrade for the end-user, but this is a sea-change upgrade for the Office developer.

VSTO 2005 SE will also allow Office programmers to achieve the type of service-oriented architecture that was difficult to achieve with our previous tools.  Since your user controls can do all those wonderful things like consume web services, chat over networks, query databases, access the file system, etc., the integration possibilities are nearly endless.  And the ability to dock your components right at the user's fingertips within the Office application makes the software convenient to summon for the end-user.

Now to work on my "reasons to upgrade" spiel for my clients...

posted on Wednesday, January 31, 2007 11:13:09 AM (Central Standard Time, UTC-06:00) by Christopher S. Velazquez
 Wednesday, January 03, 2007

I found an interesting alternative data access layer (DAL) to NHibernate called SubSonic.  It was formerly called "ActionPack", but I guess there must be enough confusion with Microsoft's "Action Pack" that they changed the name.  SubSonic is open-source (MPL)

SubSonic has been hailed as "Ruby on Rails" for .NET developers.  There is a screencast of it here, prepare to be amazed.  It works kind of like this:

1) Add a reference to the SubSonic DLL
2) Modify your web.config with connection string info and SubSonic build provider info
3) Build the project
4) ...
Copyright (C) Sidney Harris.

5) Amazing!  Domain objects and data access layer have been automagically created!

I foresee this as a great way to quickly get a data-driven site up with a minimum of development effort.  Realizing this is not the end-all and be-all of data access, you are still free to use "traditional" ADO.NET techniques.  A recent benefit is that CodePlex has come out with an XmlProvider, so that if you would rather persist to XML instead of a database, it can handle that data access layer, too.  This thing keeps getting better and better.

posted on Wednesday, January 03, 2007 10:27:55 AM (Central Standard Time, UTC-06:00) by Christopher S. Velazquez
 Tuesday, January 02, 2007

I just recently started using Microsoft SQL Server Management Studio Express (SSMSE).  This free tool is intended for SQL Server 2005 databases, but it works just fine with MSDE and SQL 2000 databases, as far as I can tell.  SSMSE obsoletes the tools I have been using up to this point, namely Enterprise Manager and Query Analyzer.

Enterprise Manager is slow to open on most systems I've used.  I have a server instance with 20 databases and I can take a nap before the thing finishes opening up.  SSMSE uses demand loading of the tree nodes, so performance is a lot better.

The whole integration between Enterprise Manager and Query Analyzer has a more natural feel.  Nice job, MS!

posted on Tuesday, January 02, 2007 4:25:55 PM (Central Standard Time, UTC-06:00) by Christopher S. Velazquez

OK, I'm definitely not the first person in the world to figure this out, but it bit me in the behind last week.  So I figured I would share this bit of wisdom with you so it doesn't happen to you.

Most of the data that's stored in this MSDE (SQL Server) database lives in an "engineering variable data" table.  In this parallel universe, the term "variable" doesn't mean quite what we programmers think it should mean, so I'll call it a "datum"  A "datum" can have a value, a specification (that determines how it is used in a calculation), upper and lower limits, residuals, bound type, and other properties of interest to engineers.

I experienced a dreadful performance problem in the production system.  Out of necessity, the program would sometimes have to delete 10,000 records from the VariableData table and insert a new copy of those records.  In the development system with a much smaller database, this worked as intended.  However, when we started getting upwards of a million records in the table, this data replacement step would bring the program to its knees for 30-120 seconds.  Sometimes the queries would time out, gobble up RAM like candy on Halloween, or both.  I tried various hacks and workarounds to improve performance, including using a bulk insert, but to no avail.

We finally took a look at the indexes:

IX_SolutionCaseID: key to an integer in SolutionCase table
IX_Specification: key to a varchar(7), whose value can be one of 7 possibilities
IX_Guid: unique key for a uniqueidentifier
IX_VariableID: key to an integer in Variable table
IX_VariableVariableID: a redundant key

I realized that the way I get data out of the database is by SolutionCaseID and VariableID, which uniquely identifies a datum.  So the queries I was currently running were using one of the existing indexes and then doing a table scan to find the record I actually wanted. Ick.  So I added a new two-field key:

IX_SolutionCaseIDVariableID: added key for both SolutionCaseID and VariableID

Adding this index to a populated table only took a few seconds and it cut the wait time in half!

Next I started hacking away indexes that I never use.

IX_VariableVariableID: deleted - this had no noticeable effect on performance.
IX_Specification : deleted - this improved performance.  This is analogous to have an index on sex "M" or "F", which is always a bad idea.
IX_Guid: deleted - I never look up data by Guid, this is for UI magic only.  This had a huge benefit on performance

So the conclusions can be:

  • Avoid indexes on GUID columns, unless they are primary keys and that's how you look things up
  • If you look up on two fields a lot, make sure you have an index that contains both fields
  • Don't index on columns that have lots of repetitive data.  You might actually win with a table scan on this one!

And if you didn't know this already, you can always force an index to be used in a select query (use at your own discretion):

SELECT Field1, Field2, Field3 FROM TableName WITH (INDEX = MyIndexName) WHERE Field4 = 42

And now my client is much happier, and there was much rejoicing!

posted on Tuesday, January 02, 2007 2:19:36 PM (Central Standard Time, UTC-06:00) by Christopher S. Velazquez