Windows Installer and the creation of WiX

I just want to add some more specific technical information on the Windows Installer technology itself, and some of the history leading up to the creation of the WiX toolkit since this post may be found by people who are just getting into the field of installers, WiX and Windows Installer.

This is intended as a quick introduction to WiX and MSI from a developer’s perspective. There is also a somewhat popular serverfault.com article that might be useful to grasp Windows Installer’s benefits: The corporate benefits of using MSI files (many developers dislike Windows Installer, but the corporate deployment benefits are actually quite significant – perhaps worth a quick skim if you think MSI is more trouble than it is worth).

The origin of the WiX toolkit

MSI files are essentially stripped-down relational databases stored as COM structured storage files (they are queried with a restricted SQL dialect, but they are not SQL Server files). This is the same container format that older versions of Microsoft Office used for their documents (newer versions use Office Open XML), and it was designed as a way to store hierarchical data within a single file. It is essentially a file system within a file, with storage streams of various types, one of which can hold the files to install inside one or more cab files.
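As a rough illustration of the container format: every COM structured storage (compound) file starts with the same 8-byte signature, so a quick check can tell whether a file uses that container at all. The sketch below is only that. The helper names and the sample files are made up for the example, and a matching signature does not prove the file is an MSI (old Office documents match too).

```python
import os
import tempfile

# Signature that opens every COM structured storage (compound file) container.
COMPOUND_FILE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_compound_file(path):
    """Return True if the file starts with the compound file signature."""
    with open(path, "rb") as handle:
        return handle.read(len(COMPOUND_FILE_MAGIC)) == COMPOUND_FILE_MAGIC


def write_sample(directory, name, payload):
    """Write a small stand-in file (hypothetical test data, not a real MSI)."""
    path = os.path.join(directory, name)
    with open(path, "wb") as handle:
        handle.write(payload)
    return path


with tempfile.TemporaryDirectory() as workdir:
    # A fake "MSI": just the signature followed by padding.
    fake_msi = write_sample(workdir, "fake.msi", COMPOUND_FILE_MAGIC + b"\x00" * 24)
    # Office Open XML files are ZIP archives, so they start with "PK" instead.
    zip_like = write_sample(workdir, "fake.docx", b"PK\x03\x04" + b"\x00" * 24)
    print(looks_like_compound_file(fake_msi))   # True
    print(looks_like_compound_file(zip_like))   # False
```

Pointing `looks_like_compound_file` at a real .msi file should give the same answer as for the stand-in here, since only the first eight bytes are inspected.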

Early on, MSI files / databases were best modified directly using third-party tools such as InstallShield, Advanced Installer and Wise Package Studio (no longer available due to legal issues – see a comparison of currently available MSI tools). These tools stored the MSI file in its native “installable” format as a COM structured storage file. This meant your MSI file was both source and executable, and in binary format, which made source control of your installation project difficult. Binary diffs of different MSI databases were hard to read, and due to database referential integrity even the most basic change in the MSI will cascade through dozens of tables, making it difficult to see what changed even for trained eyes.
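To make the cascade concrete, here is a minimal sketch that diffs two snapshots of a few tables. Component, File, FeatureComponents and Directory are real MSI table names, but the rows are heavily simplified and hypothetical (real tables have more columns), and `diff_tables` is just a helper written for this example. Renaming a single component key shows up as changes in three tables.

```python
def diff_tables(old, new):
    """Return {table: (removed_rows, added_rows)} for tables whose rows differ."""
    changes = {}
    for table in sorted(set(old) | set(new)):
        before = set(old.get(table, ()))
        after = set(new.get(table, ()))
        if before != after:
            changes[table] = (sorted(before - after), sorted(after - before))
    return changes


# Simplified, hypothetical rows: (primary key, foreign key) pairs only.
old = {
    "Component": [("AppExe", "INSTALLDIR")],
    "File": [("app.exe", "AppExe")],
    "FeatureComponents": [("MainFeature", "AppExe")],
    "Directory": [("INSTALLDIR", "ProgramFilesFolder")],
}
# Same package after renaming the component key from AppExe to MainExe.
new = {
    "Component": [("MainExe", "INSTALLDIR")],
    "File": [("app.exe", "MainExe")],
    "FeatureComponents": [("MainFeature", "MainExe")],
    "Directory": [("INSTALLDIR", "ProgramFilesFolder")],
}

changes = diff_tables(old, new)
for table, (removed, added) in changes.items():
    print(table, "removed:", removed, "added:", added)
```

One logical change touches every table that references the component, which is why a row-by-row comparison of two binary MSI files gets noisy so quickly.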

WiX came around as a way for developers to create a binary MSI file from regular text source files. Just like a regular EXE binary, an MSI binary is “compiled” from WiX XML text files. This is a quantum leap in terms of managing your release process and understanding changes in the MSI file. The toolkit is very comprehensive and much more intuitive for a developer, and it features a degree of “automagic” in that it shields the developer from some of the intricacies of the MSI database schema, since changes are made in an XML format with its own schema and not in the database itself. In effect WiX takes MSI from its database origins into the “XML age” of today, so that developers work with text files and the MSI files can be seen as compiled executables as opposed to database source files.

It is actually possible to make good MSI files without knowing too much about the inner workings of the MSI file, provided you follow WiX best practices – and trust me, as a developer you will want to stay out of MSI files. They are complex, and distinctly unorthodox and counterintuitive for a developer mindset. This has to do with the complexity of storing a whole installer as a single database. It is almost entirely declarative and not procedural, but some parts are sequential and define installation order. There are lots of moving parts and a clockwork of “conspiratory complexity” (gotchas that you discover just when you thought everything was fine).

These sequencing constructs are some of the most complex parts of an MSI involving “elevated rights” and file system operations run as a database transaction. When you learn MSI as a developer you are bound to feel that “something is wrong with this design”, and the truth is that the whole technology was designed around the deployment requirements for Office back in the day – and it became as complex as it had to be. Furthermore MSI files may be a preview of things to come – perhaps Windows will use SQL Server as its main storage solution in the future, and MSI is the first step in turning deployment into a “declarative language” or a huge SQL statement for what is going to happen on the target system during deployment? This is just speculation though.

Some practical WiX advice

Keep it simple, follow best practice and whatever you do don’t fight the design – it fights back. If WiX can’t do it, it is likely trying to help you avoid deployment problems. Fight your manager to simplify or change requirements, not MSI – for once it’s easier :-).

Most of the time we find that unusual setup designs and the use of custom actions cause a lot of unnecessary complexity, or deployment anti-patterns if you like, and the problem can often be avoided by small changes in application design, or by using built-in MSI constructs. A good manager will allow efforts to simplify deployment, but they need to understand why it is necessary. I like licensing as an example of how you can do things differently and make deployment simpler by avoiding old-fashioned or needlessly complicated application and deployment solutions.

Avoid unnecessary (read/write) custom actions at all cost – they quadruple a setup’s complexity and risk. Ask here on Stack Overflow and search to see if there is a built-in alternative. In most or at least many cases, there are equivalent built-in constructs in MSI to get the job done.

This particular advice cannot be overstated. In my personal opinion, read-only custom actions (that may set properties) are the opposite: they are recommended. They do not cause significant extra risk in most cases, since they make no changes on the system that would require rollback support, and they can be used very effectively to gather setup logic in one place. Crucially, they also make it easy for co-workers to pick up each other’s work when written in simple scripting languages such as JavaScript (some clunky aspects when dealing with the MSI API) or VBScript (poor error handling and overall language features, but well tested with the MSI API). Frankly, it seems like Microsoft is trying to “kill” VBScript, whereas JavaScript is at least “alive and well” and in heavy use for web-stuff.

To wrap things up with regard to scripts: there is general agreement among deployment specialists that script actions of all types are hard to debug, vulnerable to anti-virus interference and lacking in the language features needed to implement advanced coding constructs. In conclusion: it is hard to write robust code with scripts of any type. Managed code custom actions are possible (.NET), but since they require .NET to be installed, the safe recommendation is to write custom actions in C++. This allows minimal dependencies, very good debugging and advanced language constructs. There is a long “discussion” of this issue here: pros and cons of different custom action types (not great, just a dump of real-world experience). It might be worth a skim though – custom actions are the leading cause of deployment failures (link to my propaganda against them), and this brings us to the next point: the overall complexity of deployment (and how to deal with it).

The Complexity of Deployment

Deployment is the complex process of migrating heterogeneous target computers from one stable state to another – this requires a disciplined approach since:

  1. Errors are cumulative in nature – you often cause more problems the more you try to fix things with a quick fix. Pretty soon you have something impossible to maintain on your hands, since the problem is generally “in the wild” (published) and must be dealt with as a delivery process, each iteration carrying its own added risk, rather than as a single problem to debug until you have a fix.
  2. Errors are extremely hard to debug when you have no access to the system in question. Logging can help when done right, but it is often not delivered to you when you need it for debugging, or it is in the wrong format or verbosity, or just plain useless altogether since custom actions often don’t log things properly.
  3. The target systems (and target environment) differ in just about every way imaginable (this is the case even in a standard operating environment (SOE), which most companies use, with standardized OS installations and packages): hardware and driver differences (large and small), fat client / thin client, terminal server, OS platform (x86/x64/etc…), OS version (Win7, Win10, WinXP, etc…), OS edition (ultimate, home, etc…), OS language version, OS upgrade status and patch level, malware situation, disk space issues, partitioning scheme, file system types, encryption issues (file system, network), user rights setup, UAC configuration, system privilege configuration (NTRights), connection type, connection speed, network configuration (domain, workgroup, etc…), sub-netting, proxy setup, email system and configuration (Exchange, Outlook, Novell, etc…), Active Directory, authentication scheme, network shares and drives, application estate, scripting availability, scripting lock-down, all kinds of runtime versions (C, C++, MFC, ATL, ADO, OLE DB, ADO.NET, Java, scripting runtimes), COM and DCOM object registration and configuration, COM+, IIS and web servers, path variables and environment variables, file associations and shell operations, wireless software setup, number of users, .NET versions & configuration, language packs, GAC and WinSxS state and configuration (policy files), software firewalls, emulated / virtualized systems, deployment system (SCCM, Tivoli, etc…). It goes on and on.
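On the logging point in item 2: one way to get a usable log is to ask for verbose logging when launching the install (`msiexec /i <package> /l*v <logfile>`) and then look for actions that ended with return value 3, which Windows Installer writes when an action fails. The sketch below is a minimal illustration under those assumptions. The helper names, file names and sample log lines are invented for the example, and real log files may be UTF-16 encoded, so they may need decoding before being scanned.

```python
import re

# Matches lines such as: "Action ended 10:15:04: InstallFiles. Return value 3."
ACTION_END = re.compile(
    r"Action ended \d{1,2}:\d{2}:\d{2}: (?P<action>\S+)\. Return value (?P<code>\d+)\."
)


def build_logged_install_command(msi_path, log_path):
    """Build an msiexec command line that installs with verbose logging."""
    return ["msiexec", "/i", msi_path, "/l*v", log_path]


def failed_actions(log_text):
    """Return the names of actions whose 'Action ended' line reports value 3."""
    return [
        match.group("action")
        for match in ACTION_END.finditer(log_text)
        if match.group("code") == "3"
    ]


# Illustrative log excerpt; WriteLicenseFile is a hypothetical custom action.
SAMPLE_LOG = """\
Action ended 10:15:02: CostFinalize. Return value 1.
Action start 10:15:03: WriteLicenseFile.
Action ended 10:15:04: WriteLicenseFile. Return value 3.
Action ended 10:15:04: INSTALL. Return value 3.
"""

print(build_logged_install_command("setup.msi", "install.log"))
print(failed_actions(SAMPLE_LOG))
```

The first failing action in the list is usually the place to start reading the log, since later failures tend to be consequences of it.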

Deployment is a simple concept with a complicated mix of variables that can cause the most mysterious errors, including the developer favorite: the intermittent bug. As we all know, the seriousness of such bugs cannot be overstated, as they are often impossible to debug properly.

More on deployment and what a modern setup program might need to do: What is the benefit and real purpose of program installation? This is a summary of the tasks a setup can be required to do, peppered with various technical details. Too many details may have been added, perhaps destroying the “overview quality” of the answer. However, the intent is to stay relevant for developers.


Related MSI Tools

A Visual Studio MSI project file is a lightweight way to create an MSI file from within Visual Studio, but it is extremely limited in its feature set. There were talks of replacing the MSI project type within Visual Studio with a WiX XML project, and WiX is generally how people build their MSI files now. Don’t use the Visual Studio MSI project type: it has caused serious problems for many users due to its lack of flexibility and serious bugs.

Orca is the Windows SDK tool which allows binary MSI files to be opened, edited and to a certain degree compared. It was in fact written mostly by Rob Mensching, the man who later created the WiX toolkit itself, while he was working in the Windows Installer team at Microsoft. The tool also allows other operations, such as generating transform files for modifying MSI files, along with some other technical operations. Though it is a very basic tool lacking most of the advanced features available in commercial tools, it remains an application packager favorite to have available for debugging and small fixes due to its reliability, simplicity and “cleanliness” – it doesn’t add “default junk” to an MSI when saving it (third-party tools add custom tables and similar junk). I use it for small MSI updates, debugging, inspection of the summary stream, creation of basic transforms, viewing patches, package validation, and other important operations.

In fact, I guess it is an advanced tool, with a simple interface – and not a basic tool at all :-). In order to get hold of Orca, you need to install the Windows SDK (!). A bit over the top really when the size of the tool is so small, but at least it is easy to know where it is available instead of hunting for a separate download.

UPDATE: If you have Visual Studio and the SDK installed search for Orca-x86_en-us.msi and install it. If you don’t, maybe have a friend with Visual Studio installed search for it and then send it to you? It is a small file.

There are also some alternative, free tools available as described here (towards bottom): How can I compare the content of two (or more) MSI files?

DTF – Deployment Tools Foundation is a .NET suite of classes for dealing with MSI files programmatically. Well written, easy to use and very powerful, it is now included with the main WiX download. It is a crucial component in any project to automate corporate use of MSI files. Here is a brief answer on serverfault.com discussing its use and describing its basic components. The help files included with DTF will get you going quickly with the toolkit, and you will never look back at using Win32 functions or COM classes to access MSI files.

There are many other Windows Installer tools on the market, and some of them you can find compared to WiX at What installation product to use? InstallShield, WiX, Wise, Advanced Installer, etc (same link as above).
