How do I avoid distributing sensitive information in my MSI by accident?

Super Condensed: Install Orca, get someone else involved to help and go through the raw tables in sequence and then any custom action code.


All of this is obvious – if this has happened to you and you have sensitive information in the wild: all you can do is pull back the MSI (from download hopefully – it was even worse in the days of optical media), change any passwords or whatever else was revealed – and then make sure you don’t experience it again. Now for the important part: how to avoid it in the future.

In addition to the information below on sensitive information, please also remember that some files you want to include in your setup may not be legally redistributable. Typical examples would be debugging tools from Microsoft or from third-party SDKs. Please read the documentation thoroughly and avoid using such “hacky tools” in your custom actions.


Short Version

UPDATE: Let me jot down, before I forget, that you should eliminate the “downloaded file blocking flag” from all setup files (and generally read-only flags too).

All that is being suggested below is essentially to 1) scan your finalized MSI with Orca, 2) look over installed settings files as well as any template installation scripts delivered with your MSI, 3) review your compiled custom action sources very well and maybe improve release build configuration practice (#ifdef _DEBUG for example, see below), 4) review your script custom actions by checking what is actually in your MSI (extract them), and crucially 5) get some help from other people for all manual testing – get some accomplices :-). Seriously: the setup is as important as the application – to make your solution succeed, it is your duty to get QA personnel and other people involved in testing – and also to tell them what and how to test.

I would avoid trying to automate such checking yourself. There is no substitute for real eyeballs on the data. Perhaps a community solution would help long term. It could become part of a validation suite? Semi-automatic help might work, but fully auto-magic: forget it. There are too many ways to use all the rope you’ve got to shoot yourself in the foot.

Sensitive data may be the wrong term, maybe “invalid content” is more appropriate. Problems can result from your application pointing to your test server rather than your production server on launch. Unexpected message boxes may pop up from custom actions (sometimes revealing sensitive data), and there are similar release blunders beyond pure sensitive data being exposed.


QA – Bug Quest

Checking for sensitive data included by accident obviously ties into general QA of your package, and it should be done alongside general testing. QA people are so busy with application testing that you really have to push this deployment testing, and make a test plan. Nothing fancy, but do test all installation modes: install, reinstall, repair, self-repair, upgrade, patching, uninstall, administrative install and resumed / suspended install (setup reboot issues). You should also test publishing and advertisement if you have the equipment and network for it. Test all custom action functionality thoroughly. Realistically and minimally you must test install, reinstall, uninstall and upgrade, but please test all modes.

And if you are localizing, test in all core regions in all editions. Also run English in German locales, and vice versa, just for smoke testing. In fact, test English in all regions – obvious I guess. Custom actions could easily fail on localized machines, triggered by a random state on that machine (a custom action trying to access a hard coded English path, for example, so that an exception results), or show some forgotten English message box from your exception handler code that was never triggered on an English box. Bad, oh yes, and I have seen it often enough that it should be chalked up as an issue.

And I guess the words of an experienced developer should be mentioned: “…don’t hit too many people with testing until every bug found is a genuine surprise”. And also – his funnier advice – for pre-releases leave in a couple of known bugs and tell the QA guys that there are such and such number of bugs to find – just for some motivation to focus the mind :-). P.S: I like to refer to this experienced developer as “The Elder Grasshopper”, or as he is more commonly known “Veggie Boy”. Confucius says: “Never trust a man who can be bribed with (organic) carrots!”

A large digression, back to the real topic: erroneous inclusion of sensitive data.


Checking MSI Files

I keep it simple when it comes to checking my MSI files for sensitive information.

  1. First a quick once-over of the source files (WiX, InstallShield, Advanced Installer, or whichever tool you use) for hard coded dev-box sins.
  2. Then significant attention checking the finished release-candidate MSI file itself. The real McCoy. All tables, and also extraction of some embedded content for verification (scripts, custom action dlls, etc…).
  3. Actual installations in all installation modes, as described above. Sensitive content can be revealed, but so can a lot of other issues – for example unexpected message boxes popping up – sometimes with sensitive debugging info (custom action test focus).

How to check? Some scripted checks could be useful, but from experience I don’t get fancy about it. I prefer a second pair of eyes over fancy script checking if I am honest. Just my two cents from real release work.

  1. Install Orca
    • Orca is as down to the wire as you get with MSI – other tools often show bogus, proprietary tables. Orca is a straight up view of the file content.
    • Search for “Orca” here to find the installer quickly if you have Visual Studio installed – or ask someone with Visual Studio installed to send you the installer MSI.
    • You can also try “Super Orca” – but Orca is recommended.
  2. Now just open your release-candidate MSI with Orca – and skim through the tables.
    • And just to say the obvious:
      • Enforce any changes in the real source, don’t hotfix the finished MSI.
      • If you ask me, no in-situ hotfixing at all – you need a fix at the source and a full MSI file rebuild in my opinion. Then you label your source code (if you got proper, old-fashioned source control with delicious revisions and labels).
    • Most vulnerable tables are probably: Registry, Property, IniFile – but there could be something in several other locations.
    • If you actually use the MSI GUI: tables relating to GUI are also vulnerable.
      • Many people just use the standard GUI without modifications. This should eliminate most risks.
      • If you have a custom GUI, then there are quite a few tables involved in the MSI GUI declaration. I would eyeball them all.
      • Perhaps particular focus on: ListBox, ComboBox, UIText, Dialog
      • Obviously with extra focus on your own, custom dialogs – if any.
    • Third party tools feature vulnerable custom tables for things such as XML file updates. Eyeball these as well.
      • Anything that looks like XMLFile, SQLUpdates, etc…
      • There are more and more such custom tables from different vendors. They relate to all kinds of things now, not just config files (firewall rules, SQL scripts, etc…)
    • Check any included scripts.
      • Check in source control, but also…
      • The CustomAction table or Binary table (the latter requires you to stream out any scripts – or check them in their source locations).
  3. Check any settings files the application installs (via MSI’s File table).
    • INI files with hard coded settings could be installed via the File table and hence not have their values visible in Orca (as opposed to the IniFile table, which shows all fields to write to an INI file).
      • The difference here is essentially whether the file is handled as a file or as a set of section / key / value entries to write to an INI file. The latter approach is the “correct” one.
      • Note that some INI files may need to be installed as files, if they have non-standard formatting (extra fields and various strangeness contrary to normal key-value pair formatting), or even more common: INI files may have huge comments sections with help information (often for developer tools) that you want to preserve – and you can’t via the INI file table. Then the option is to install it as a file.
    • Other settings files such as XML files could be installed the same way. Very often in fact.
      • As stated above third party tools often feature support for writing updates from a custom table viewable in Orca.
      • There can be many such different tables (Encrypted fields? What is in there?)
    • Files included like this are generally maintained by developers, but it is still a release responsibility to check. Make an administrative installation of your MSI and check the extracted settings files in the created network image: msiexec.exe /a "Your.msi", or setup.exe /a (InstallShield), or setup.exe /extract (Advanced Installer).
  4. Check supporting batch installation scripts, PowerShell scripts, or other forms of scripts delivered with your setup – with the intent to do the actual installation of the software.
    • Sometimes you see ready-made scripts delivered with some setups to help automate deployment. Some form of hard coded information can often sneak in here (UNC paths, even IP addresses, or other kinds of test data).
    • These scripts, sometimes delivered as a separate download, may be an afterthought that ends up being neglected for QA in my experience.
    • Use these scripts actively during QA and testing if available, or better yet: document large scale deployment in a single page PDF instead (much more generic and less error prone).
  5. Warning: compiled custom actions can still contain sensitive stuff even if nothing is visible in Orca – obviously. Easy to forget sometimes in the heat of the moment. This is of high cruciality (new favorite word) – back to the source code.
    • Compiled custom actions are not directly “viewable” so any hard coded sensitive stuff is “less exposed”.
    • However, an erroneous hard-coded IP address could cause all your users to try to connect to your test server or whatever other server you want for yourself… I would suspect this wouldn’t happen during setup, but during first application launch.
    • Again: get help – a second pair of eyes will save you trouble, but this time let them read the actual source code as well. Tell them to focus on unexpected, hard-coded values and weird defines – anything that looks “experimental”.
      • Such “white box” or “transparent” QA is probably good here. Recruit another developer? I would focus on just eyeballing code for “weird stuff” rather than testing actual functionality (that is for black box testing).
      • Obviously the code should work with MSI PUBLIC PROPERTIES only for values input by the user or set at the command line. Anything hard coded shouldn’t really be there. In the real world most developers end up setting something hard coded in debug builds however.
      • QA professionals should be told how to test the same custom action “black box” obviously – and also the first application launch to test that correct values were written to wherever they were going.
      • Can you provide some convenient application-level logging for the QA guys that they know exists, know how to use, and know they are expected to check? Then you should be down to just a few minutes before they discover that your release application hits your internal test server and not your production server.
    • For compiled C++ custom actions I would suggest using good nitpicking debugging practice if you insist on hard coding stuff. Use #ifdef _DEBUG to wrap debugging message boxes and any hard coded test variables. See C++ snippet below. This means no experimental values are ever in release builds at all (the pre-processor will remove all debugging constructs).
    • Perhaps also add NOMB to your release build as well? See sample below as well – should prevent stray release build message boxes – the define essentially “forbids” them (other, possible defines: How to tame the Windows headers (useful defines)?).
      • I have only tested this briefly. I have had my fair share of forgotten C++ message boxes pop up in a release build though – I have to admit – no disasters luckily (knock on wood and such forth and whatnot, etc…).
      • Keep in mind that such a message box could mysteriously halt a setup run remotely in silent mode, stopping it dead in its tracks without any warning or clear reason (usually no log message).
      • Such a message could typically be triggered by some sort of error condition or exception not generally triggered for most installations – so it is suddenly there, on some PCs. Nothing you can do to recover when deploying remotely. The setup doesn’t correctly roll-back, it is just stuck. If there is no user logged on locally there is no way to dismiss it on the machine either. Not strictly about sensitive information outright, but related (exactly what is displayed on the box?), and something to look at when testing custom actions.
      • A natural question is whether there is a way to make message-boxes time-out auto-magically? I briefly smoke-tested this suggested MessageBoxTimeout method (from user32.dll), and it seems it even supports the above NOMB feature as well as timeout. In other words you can set the message box both forbidden in release builds and to time-out in debug builds. Not tested thoroughly.
    • C++ is not my thing, use best practice for release builds as defined by your company. Maybe look for defines and string variables. Or all settings may be in a settings file only included by debug builds (but weird stuff tends to creep in here and there anyway).
    • For managed code, the question I ask myself is: how de-compilable is this managed binary? I have little experience here. I have never taken the time to de-compile managed binaries.
      • There should be nothing sensitive in the code anyway – unless you have a hidden private key, a license key or something similarly weird in there – which I would definitely not do, leave it for the application to do.
      • For features such as setting up a trial period for your application I suppose you could want to “hide the implementation” better. Some sort of obfuscation is probably common that I am not up to speed on. Maybe this is one of the bigger problems in the .NET world? Experts on the topic: please educate us.
      • I’d focus on the same issues as above: debugging constructs that are erroneously included in release mode binaries and erroneous links and paths set to test servers and test resources.
  6. Including debug build binaries in your official release by accident
    • One more issue that in certain cases can easily happen: you include debug versions of your custom action DLL(s) in your MSI.
    • This can obviously happen to any file in your setup, not just your custom action DLL, but the custom action DLL is particularly “hidden” in your package after building (embedded in the MSI’s Binary table – verify it).
    • Perhaps make sure to add a d to the file name for your compiled custom action dll – or any other file for that matter? Even if it causes you some extra work?
    • I am not sure how “sensitive” a debug dll really is (a proper C++ expert must elaborate) – but I sure don’t want to distribute such files in my setups unintentionally. I sometimes (rarely) make debug build MSI files for QA teams containing only debug binaries and symbols for testing purposes, and in my opinion these setups should expire after a month or two and never be easy to install and never be used outside your QA team. Passwords to install could be added, but MSI is an open format and can still be extracted. No drama, just something to keep in mind and manage I guess.
  7. Now this is pushing it a little for the topic of “sensitive data”, but how about a thorough malware scan of anything you intend to sign digitally and release publicly? Signed malware is not something you want to experience.
    • Verify your digital signature on your release file (if any). Test with UAC, etc…
    • Maybe use Virustotal.com or an equivalent malware scanning service / solution to scan your final MSI file for malware (or false positives).
    • Use procexp64.exe (direct download of Sysinternals Process Explorer) to scan all your running processes after test installation. See some suggested usage steps for the tool here.
    • Using these tools may help you eliminate false-positives for your solution as well. A terrible problem that seems to be getting worse as security software tightens security and malware becomes more prevalent.
      • False-positives may cause endless self-repair (see issue 7 in that link) for your deployed package as files are quarantined repeatedly and then put back by Windows Installer via self-repair.
      • The False Positive Irony: For real malware you tell your users to rebuild their computers. For a false-positive the pressure is on you to resolve the problem with security software vendors. Now how to do that for dozens of security tools and suites?
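
As a semi-automatic aid for item 6 above (not a replacement for eyeballing), here is a minimal C++ sketch that looks for references to the debug C runtime inside a binary you have streamed out of the Binary table. The function names findDebugRuntimeReferences and readFileBytes are made up for illustration. The check assumes a custom action DLL built with a Visual Studio 2015 or later toolset and linked dynamically against the CRT, in which case a debug build normally imports ucrtbased.dll, vcruntime140d.dll or msvcp140d.dll. A statically linked debug build will not be caught this way, so treat an empty result as "nothing found", not as proof.

```cpp
#include <algorithm>
#include <cctype>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Read a whole file into a string of raw bytes (empty if the file cannot be opened).
inline std::string readFileBytes(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Return the debug CRT DLL names that appear anywhere in the given bytes.
// Heuristic: import names are stored as plain ASCII, so a case-insensitive
// substring search is enough for a first look.
inline std::vector<std::string> findDebugRuntimeReferences(const std::string& binaryBytes) {
    // Debug runtime names for the VS2015+ toolsets (assumption: dynamic CRT).
    static const std::vector<std::string> debugRuntimes = {
        "ucrtbased.dll", "vcruntime140d.dll", "msvcp140d.dll"};

    std::string lowered(binaryBytes);
    std::transform(lowered.begin(), lowered.end(), lowered.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    std::vector<std::string> hits;
    for (const std::string& name : debugRuntimes) {
        if (lowered.find(name) != std::string::npos) {
            hits.push_back(name);
        }
    }
    return hits;
}
```

Typical use would be findDebugRuntimeReferences(readFileBytes("CustomAction.dll")), where the file name is whatever you exported from the MSI. Any hit is a reason to go back and check the build configuration.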

Debug only message box in C++ custom action:

I use message boxes in order to attach the debugger to C++ custom action code. How to avoid these critters showing up in a release build? Here is one suggestion:

#ifdef _DEBUG // Display debug information only in debug builds
     // TEXT() keeps the call compiling in both ANSI and Unicode builds
     MessageBox(NULL, TEXT("Text"), TEXT("Caption"), MB_OK | MB_SYSTEMMODAL);
#endif

Advanced C++ guys will immediately see that they should make themselves a better macro for this – wrapping it all – I am no C++ wiz, so I’ll leave that out for now (SafeMessageBox? DeploymentMessageBox?).
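
For what it is worth, one possible shape for such a wrapper is sketched here. The macro name DEBUG_MESSAGE_BOX and the function prepareInstallStep are made up for illustration, and this has not been tried in a real custom action project. In a debug build on Windows it calls MessageBoxA, and in a release build it expands to nothing, so the call site needs no #ifdef of its own. The non-Windows branch is only there so the sketch compiles elsewhere.

```cpp
#include <cstdio>

#ifdef _WIN32
#include <windows.h>
#endif

#if defined(_DEBUG) && defined(_WIN32)
    // Debug build on Windows: show a real, blocking message box.
    #define DEBUG_MESSAGE_BOX(text, caption) \
        MessageBoxA(NULL, (text), (caption), MB_OK | MB_SYSTEMMODAL)
#elif defined(_DEBUG)
    // Debug build elsewhere (assumption: only useful for trying the sketch off Windows).
    #define DEBUG_MESSAGE_BOX(text, caption) \
        std::fprintf(stderr, "[%s] %s\n", (caption), (text))
#else
    // Release build: the call disappears entirely, including its string arguments.
    #define DEBUG_MESSAGE_BOX(text, caption) ((void)0)
#endif

// Hypothetical custom action step showing the call site.
inline int prepareInstallStep() {
    DEBUG_MESSAGE_BOX("Attach the debugger now", "MyCustomAction");
    return 0; // 0 stands in for a success code here
}
```

Because the release branch never mentions MessageBoxA, this should coexist with the NOMB define described next: direct MessageBox calls stop compiling in release builds, while calls through the macro vanish.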

In stdafx.h, maybe additionally enable NOMB (should prevent MessageBox from compiling unless wrapped with #ifdef _DEBUG – making MessageBoxes only available in debug builds):

#ifndef _DEBUG // Forbid MessageBox in Release builds
    #define NOMB
#endif

(a fair bet this may become one of your most hated defines ever :-). Who smells a commented out section? I wouldn’t use #undef to add an ad-hoc release message box – it ruins the whole protection feature – likely causing precisely what you hope to avoid: a stray message box. Perhaps just comment out the #define in stdafx.h if you have to, and enable the define again – automatically via the build process – for a real, public release build triggering a compile error for any stray message boxes)

And as mentioned above, you could try the new MessageBoxTimeout method (from user32.dll, apparently available since XP) to show message boxes that don’t get “stuck” but timeout after a specified number of seconds. Not for release use, but might be useful for debugging and QA.

Some context: #ifdef DEBUG versus #if DEBUG. People who actually know C++ properly, feel free to clarify or elaborate as required. The above is from a very old C++ project.

That is basically it – hardly rocket science – just “trifles that bite”. Some further discussion on the topic below, but there is no substitute for this manual scanning IMHO. My honest suggestion, grab some people (managers are just fine 🙂 – pull them in as accomplices!), install Orca for them and just tell them to click through the tables and look in all settings files – and get a developer to help with the compiled custom action code. Just looking at the raw Orca tables may even be effective in order to find other bugs or imperfections as well.


Sensitive Information

There is plenty of opportunity to include sensitive information in your MSI sources by accident during development: login credentials, passwords, database connection strings, user names, share names, IP addresses, machine names, FTP passwords, web host login credentials or other sensitive data.

Your MSI should obviously not contain any such sensitive information at all – unless you want to point to your own web-site of course, or provide a contact email or telephone number. However, anything else is almost always undesirable – and it is easy to forget to remove such hard coded information from production MSI-files after development experimentation (often in script custom actions – or compiled custom actions for that matter – even worse and not detectable by the Orca review approach suggested above, but generally not viewable by the user unless it is shown via an unexpected message box – or if .NET managed code is disassembled).
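
If you do want some semi-automatic help alongside the manual review, here is a minimal C++ sketch that flags lines in a text file (an extracted settings file, a deployment script, an exported table) that look like the kinds of data listed above. The names Finding and scanForSuspiciousContent are made up for illustration, and the three patterns are only examples that you would tune for your own product. Expect false positives (a four-part file version looks just like an IP address) and false negatives, so a second pair of eyes is still needed.

```cpp
#include <cstddef>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// One flagged line: 1-based line number, the rule that matched, and the line itself.
struct Finding {
    std::size_t lineNumber;
    std::string rule;
    std::string text;
};

// Scan text line by line and report every (line, rule) pair that matches.
inline std::vector<Finding> scanForSuspiciousContent(const std::string& content) {
    struct Rule {
        const char* name;
        std::regex pattern;
    };
    // Illustrative rules only (assumptions, not a complete list).
    static const std::vector<Rule> rules = {
        {"ipv4-address", std::regex(R"(\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b)")},
        {"unc-path", std::regex(R"(\\\\[A-Za-z0-9_.$-]+\\[A-Za-z0-9_.$-]+)")},
        {"credential-keyword",
         std::regex(R"((password|pwd|user id|uid)\s*=)", std::regex::icase)},
    };

    std::vector<Finding> findings;
    std::istringstream lines(content);
    std::string line;
    std::size_t lineNumber = 0;
    while (std::getline(lines, line)) {
        ++lineNumber;
        for (const Rule& rule : rules) {
            if (std::regex_search(line, rule.pattern)) {
                findings.push_back({lineNumber, rule.name, line});
            }
        }
    }
    return findings;
}
```

Run it over the files produced by an administrative installation and hand the findings to a reviewer. The output is a list of places to look, not a verdict.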

If actually required for the install, such “sensitive” information should be parameters (properties) that are set by the end user at install time, either via the setup’s interactive GUI or set via PUBLIC PROPERTIES or transforms at the command line when the setup is being installed. There is some information here on using transforms and PUBLIC PROPERTIES: How to make better use of MSI files for silent, corporate deployment of MSI files (the linked answer also provides a rather ad-hoc description of MSI problems and benefits in a more general sense).
