Configuration backups: the forgotten WMQ security control

Update: IBM has reconsidered and has announced that dmpmqcfg will be fixed as a defect! Subscribe if you would like a notification when the fix is announced. But please do read the post, especially if you are using amqoamd for anything.

Most of the time when someone says “security” they are actually thinking of intrusion prevention: the parts that keep unauthorized people out of the system.  But security is so much more than that.  We need to know if our security fails.  That part is intrusion detection.  After an incident we want to be able to determine how it happened and what to fix so it doesn’t happen again.  The logging and accountability functions are what support that forensic analysis.  Of course we also need to recover from an incident.  That’s the business continuity aspect.  All of these fall under the broad security umbrella and any security assessment that I perform includes aspects of all of these.

But there is an implicit assumption that the tools to perform these functions are available.  In particular, the ability to back up a configuration for later recovery is something so fundamental that many people consider it part of the baseline functionality of any Enterprise-class product.  But what if it’s not?  Or worse, what if you think it is but the provided tools do not actually do what you are expecting?  Would you know?

If your shop is concerned about the security of the WebSphere MQ network, then chances are you consciously and deliberately configure your queue managers to prevent administration by adjacent queue managers in the network.  If you do not routinely do this, then by default compromise of any one queue manager on the network compromises the entire network.  Such a compromise would include any FTE, Broker, Advanced Message Security or other components or applications that can be administered by sending messages to their command queues.

The only problem is, IBM does not currently provide a tool in the product that backs these security settings up accurately.  There are two tools documented in the WebSphere MQ Infocenter and a Technote which are described as providing configuration backup functionality.  Assuming you want that backup to be complete, both are broken.  IBM has officially responded in both cases that they are working as designed.

The Tooling

The official tool that IBM provides for backing up the objects and access control lists is dmpmqcfg.  Recently, IBM published Technote 16667771: Unexpected results for WebSphere MQ dmpmqcfg output describing problems with that utility.

In the Technote, IBM recommends use of amqoamd, an undocumented utility supplied with WebSphere MQ.  Because this is the first official acknowledgement of that component, its problems are NOT documented by IBM.  More on this later.

Until recently, SupportPac MS03 provided the saveqmgr utility.  This SupportPac was withdrawn after dmpmqcfg became available and is no longer maintained.  This is unfortunate because, without maintenance, saveqmgr is not capable of backing up v7.5 queue managers.  Prior to cessation of maintenance, saveqmgr was the only IBM WebSphere MQ utility that correctly backed up MQ objects and access control lists.  Since its retirement, there are no IBM WebSphere MQ utilities that correctly back up MQ objects and access control lists.

Yes, you got that right.  There is no IBM method to back up all your WMQ objects and access control lists.  There are 3rd party tools that do and I’ll provide links below.  Tivoli may be capable of backing up the configurations but I haven’t been able to confirm or disprove that statement.

The dmpmqcfg utility

The Technote explains how dmpmqcfg works:

The dmpmqcfg utility reads the QMGR configuration (that is, the pre-defined objects in the queue manager) and then for each object found it lists the setmqaut commands needed to recreate the authority records.

According to the Technote, the consequence of this is that authority records for dynamic queues are not recorded by dmpmqcfg because these do not have a local object definition.  It says that the utility first looks for local objects, then looks for authorization profiles that match those objects.  Dynamic queues are not first-class objects and are not found by dmpmqcfg.  Unfortunately, these are not the only entities on which you can place an access control list and for which no local object exists.  Not by a long shot.

Anyone who knows WMQ security at all knows that securing a cluster was a Ninja-level Black Art prior to WMQ v7.1.  Why?  That’s when WMQ got the ability to set access control lists against remote queue managers.  Before that, securing the cluster required running two exits per channel which, of course, nobody actually did.  So the moment that WMQ got the capability to secure the cluster without exits, I began showing people how to use it.  The WMQ security foils for every IMPACT conference since v7.1 was released contain a section on this.  It relies on the new capability to set authorization records on something that is most definitely not defined locally – the remote queue manager.

Does dmpmqcfg capture access control lists set against remote queue managers?  No.

Topics also fall into this category.  They do not have to be defined locally.  In fact, IBM specifically recommends picking a specific queue manager in the cluster on which to define all the topics.  So it’s well established that these are frequently non-local objects and that it’s possible to set access control lists against them.

Does dmpmqcfg capture access control lists set against non-local topics?  No.

“But wait,” I hear you asking.  “Can’t we define access control lists against clustered queues?”  Yes.  When WMQ gained the capability to set access control lists against topics, this opened the door to set access control on other non-local objects.  Shortly after, the ability to issue setmqaut commands against non-local queues was introduced.  You can now run a setmqaut command against any valid queue name, regardless of whether the named queue exists locally, exists out in the cluster somewhere, or doesn’t exist anywhere at all.

Does dmpmqcfg capture access control lists set against non-local queues?  No.

Someone who knows nothing about WebSphere MQ looking at this might conclude that clustered queues, clustered topics and remote queue managers are all exotic use cases for which a security test case is not necessary.  But these are the most basic use cases of the product so how did it get out the door like this?

The Technote as written drastically under-reports the problems with dmpmqcfg and misleads customers into believing that they are OK using the utility, as long as they do not use dynamic queues.  Customers harboring this belief may continue confidently using a tool that does not provide a complete backup of their configuration.  It is possible they will not find out that the tool is broken until they actually need to recover a queue manager in Production and things start to break just when they believe they are almost done recovering from a breach.  They would then be forced to choose between extending the outage and disabling security just to get things working again.  That’s not something I want to have to explain to my executive management.  How about you?

And if you think that’s a bad outcome, moving to amqoamd is worse.

The amqoamd utility

Why is amqoamd worse?  For starters, it isn’t documented.  When you ask IBM to provide defect support, the contract between you and them is the documentation.  If it’s documented and does not behave as described, IBM will fix it.  Why include things that are supported but not documented?  That reserves the right for IBM to change the behavior of that component without notice. If you depend on undocumented behavior and it breaks, shame on you.  IBM has no obligation to fix it urgently, if at all.

A Technote sorta, kinda documents a component.  But does that mean IBM will fix it?  After you read the rest of this section, maybe you will want to open a PMR and see because amqoamd is broken worse than dmpmqcfg.  When I reported it several years ago I was told it is “working as designed” and it has to this day not been fixed.  But then again, it was not documented and therefore it was possible to respond that the utility is working as designed with no obligation to fix the problem.

The new Technote says that the amqoamd “output includes all setmqaut commands (including permanent dynamic queues).”  That’s not in the Infocenter but it might be enough of a contract to get the component fixed, because the output of amqoamd provably does not include all setmqaut commands.  If a Technote really provides sufficient contractual commitment from IBM to fix amqoamd as a defect, then a PMR should resolve it in no time.

To understand the issue, consider authorizing an MQ network or cluster.  Preventing an adjacent queue manager from administering the local one requires setting the MCAUSER of the RCVR/RQSTR/CLUSRCVR channels to a low-privileged account.  I typically specify ‘mqmmca’ as the group and on Windows, where the ID cannot match the group name, I use ‘mqmmcaid’ for the account.  The group is then authorized with a set of rules that look something like this:

setmqaut -m QMGR -g mqmmca -n '**' -t queue -all +put +setall
setmqaut -m QMGR -g mqmmca -n 'SYSTEM.**' -t queue -all +none

This sets a default of “allow limited access to all queues” and then blacklists the exceptions.  A similar technique of broad whitelisting followed by targeted blacklisting is usually applied to anything that needs to dynamically resolve destinations but which does not need full admin access.  The IIB Broker is often authorized this way.

Many shops use similar rules on SVRCONN channels because whitelisting every possible use case across many queues and many people is too resource-intensive.  Generic grants followed by small blacklist entries are the practical alternative.  This is a very common technique for any shop that needs slightly granular user security, or anything greater.

The problem with all of these scenarios is that amqoamd does not record the profiles where the permission on the queue was +none.  It does, however, happily record the profiles where access is granted.  So if you restore a queue manager from the output of amqoamd, your applications, users or even external business partners will be massively over-authorized, possibly even to the point of giving them full administrative access.  When I reported this several years ago I was told amqoamd is working as designed and that it would not be fixed.  Part of the justification for this was that amqoamd is not a documented component.  When I checked today on a v7.5 queue manager it still has this behavior.

Oddly, if you grant access to a remote queue manager name with +put, amqoamd records it as +none.  This is better than silently over-authorizing but very inconvenient when you discover that after restoring your queue manager, nothing can put to non-local destinations.  Let’s hope that you saved the input scripts and were not actually relying on the backup tool to know who is supposed to be authorized to what.
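To see why a dropped +none line matters so much, here is a toy model in Python.  It is only a sketch under loud assumptions: it is not how the OAM is implemented, it handles nothing but exact names and a trailing ‘**’, and it uses “longest literal prefix wins” as a stand-in for “most specific profile wins.”  The function names are made up for illustration.  The two rule sets mirror the mqmmca example above, before and after a restore that loses the SYSTEM.** entry.

```python
def matches(profile, name):
    """Toy matcher: supports exact names and a trailing '**' only."""
    if profile.endswith("**"):
        return name.startswith(profile[:-2])
    return profile == name


def effective(rules, name):
    """Return the authorities of the most specific matching profile.

    'Most specific' is approximated by the longest literal prefix.
    No matching profile means no authority at all.
    """
    candidates = [p for p in rules if matches(p, name)]
    if not candidates:
        return set()
    best = max(candidates, key=lambda p: len(p.rstrip("*")))
    return set(rules[best])


# Rules as originally scripted: broad grant plus a +none blacklist entry.
original = {"**": {"put", "setall"}, "SYSTEM.**": set()}

# Rules as restored from a backup that silently dropped the +none entry.
restored = {"**": {"put", "setall"}}

queue = "SYSTEM.ADMIN.COMMAND.QUEUE"
print("original:", sorted(effective(original, queue)))
print("restored:", sorted(effective(restored, queue)))
```

With the blacklist entry in place the model grants nothing on the command queue.  Without it, the broad grant applies and the low-privileged group can put administrative commands.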

The saveqmgr utility

Forget about it.  Although saveqmgr was once the only utility associated with the product itself that correctly saved MQ security settings, saveqmgr has gone to the retirement home of old code and has not been updated recently.  It does not correctly parse WMQ v7.5 objects and I suspect it barfs on v7.1 as well. (I verified on v7.5 but didn’t dig up an old v7.1 VM to test on.)

If you have a queue manager old enough that saveqmgr still works,  keep using saveqmgr.  If you know C and wish to give back to the WMQ community, you might consider updating saveqmgr to work with modern versions of WMQ.  It was delivered as source code and one hopes that IBM would freely grant a derivative works license in lieu of having to invest in a utility that provides a complete backup.  Of course, the best solution would be to have a tool that both works and is supported by IBM.

See for yourself

Want to test all this?  If you have WebSphere MQ installed on your Windows workstation, cut and paste this into a command window to see for yourself:

crtmqm DUMMY 
strmqm DUMMY 
amqoamd  -m DUMMY -s | findstr Guest 
setmqaut -m DUMMY -p Guest -n ** -t queue -all +put +inq +dsp
setmqaut -m DUMMY -p Guest -n MYTOPIC -t topic -all +pub +sub 
setmqaut -m DUMMY -p Guest -n REMOTEQMGR -t rqmname -all +put
setmqaut -m DUMMY -p Guest -n SYSTEM.CLUSTER.COMMAND.QUEUE -t queue -all +none 
amqoamd  -m DUMMY -s | findstr Guest 
dmpmqcfg -m DUMMY -n * -x all -o 1line | findstr Guest

Note, I used -p Guest here because this account should be present everywhere on Windows and works for our purposes even if you’ve disabled it.  WMQ only needs to know it exists and resolve it.  Please do not EVER use -p except on Windows hosts and even then always specify the ID in user@domain format.

This is what I get when I paste in those commands (less the bits where the queue manager is created and started):

C:\Users\T.Rob>amqoamd  -m DUMMY -s | findstr Guest

C:\Users\T.Rob>setmqaut -m DUMMY -p Guest -n ** -t queue -all +put +inq +dsp
The setmqaut command completed successfully.

C:\Users\T.Rob>setmqaut -m DUMMY -p Guest -n MYTOPIC -t topic -all +pub +sub
The setmqaut command completed successfully.

C:\Users\T.Rob>setmqaut -m DUMMY -p Guest -n REMOTEQMGR -t rqmname -all +put
The setmqaut command completed successfully.

C:\Users\T.Rob>setmqaut -m DUMMY -p Guest -n SYSTEM.CLUSTER.COMMAND.QUEUE -t queue -all +none
The setmqaut command completed successfully.

C:\Users\T.Rob>amqoamd  -m DUMMY -s | findstr Guest
setmqaut -m DUMMY -n '**' -t queue -p Guest@M4700 +inq +put +dsp
setmqaut -m DUMMY -n MYTOPIC -t topic -p Guest@M4700 +pub +sub
setmqaut -m DUMMY -n REMOTEQMGR -t rqmname -p Guest@M4700 +None

C:\Users\T.Rob>dmpmqcfg -m DUMMY -n * -x all -o 1line | findstr Guest
SET AUTHREC PROFILE('**') PRINCIPAL('Guest@M4700') OBJTYPE(QUEUE) AUTHADD(DSP,INQ,PUT)
SET AUTHREC PROFILE('**') PRINCIPAL('Guest@M4700') OBJTYPE(QUEUE) AUTHADD(DSP,INQ,PUT)
SET AUTHREC PROFILE('SYSTEM.CLUSTER.COMMAND.QUEUE') PRINCIPAL('Guest@M4700') OBJTYPE(QUEUE) AUTHADD(NONE)

The first line uses findstr (the Windows equivalent of grep) to see whether any security profiles exist for the Guest account. None do when the queue manager is first created.  After running the setmqaut commands, findstr returns different sets of records depending on the utility used.

Note that dmpmqcfg appears to do exactly what is described in the Technote – find all objects, then print matching security profiles.  The setmqaut against ‘**’ naturally matches all queues and so dmpmqcfg dutifully returns duplicate lines for that profile, one per defined queue.  This too was reported early and is said to be working as designed.  It clutters up your backup and reveals the architectural nature of the defect, but otherwise doesn’t make you any less secure.
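The duplicate lines are easy to spot mechanically.  As a minimal sketch, assuming the 1line output always looks like the three SET AUTHREC lines in the transcript above (real dmpmqcfg output may carry attributes this pattern does not handle), a few lines of Python can collapse the duplicates and count them.  The regular expression and function names are illustrative and not part of any IBM tooling.

```python
import re
from collections import Counter

# Pattern modelled only on the SET AUTHREC lines shown in the transcript.
AUTHREC = re.compile(
    r"SET AUTHREC PROFILE\('(?P<profile>[^']*)'\) "
    r"(?:PRINCIPAL|GROUP)\('(?P<entity>[^']*)'\) "
    r"OBJTYPE\((?P<objtype>\w+)\) AUTHADD\((?P<auths>[^)]*)\)"
)


def parse_authrec(line):
    """Return (profile, objtype, entity, authorities) or None if no match."""
    match = AUTHREC.search(line)
    if match is None:
        return None
    # Treat AUTHADD(NONE) as an empty authority set.
    auths = frozenset(a for a in match.group("auths").split(",") if a != "NONE")
    return (match.group("profile"), match.group("objtype"),
            match.group("entity"), auths)


def count_records(lines):
    """Count how many times each distinct authority record appears."""
    counts = Counter()
    for line in lines:
        record = parse_authrec(line)
        if record is not None:
            counts[record] += 1
    return counts


dump = [
    "SET AUTHREC PROFILE('**') PRINCIPAL('Guest@M4700') OBJTYPE(QUEUE) AUTHADD(DSP,INQ,PUT)",
    "SET AUTHREC PROFILE('**') PRINCIPAL('Guest@M4700') OBJTYPE(QUEUE) AUTHADD(DSP,INQ,PUT)",
    "SET AUTHREC PROFILE('SYSTEM.CLUSTER.COMMAND.QUEUE') PRINCIPAL('Guest@M4700') OBJTYPE(QUEUE) AUTHADD(NONE)",
]
counts = count_records(dump)
for (profile, objtype, entity, auths), seen in counts.items():
    print(profile, objtype, entity, sorted(auths), "x", seen)
```

Keying on the whole record rather than just the profile name means two entries only collapse when the entity, object type and authorities all agree.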

Did you notice what doesn’t print?  The amqoamd output omits the blacklisted queue that was set with +none.  The dmpmqcfg output omits the topic and the remote queue manager.  Both of those are profiles for which there is no local object, and neither is a dynamic queue.  So even if you found the Technote and followed its advice, you might not realize that security after a restore will be much more relaxed than it had been previously.  Since no legitimate app will fail, you might never know, unless of course you are breached because of it.

On top of all this, the amqoamd command has changed the permission granted to the remote queue manager from +put to +none.  This one will at least cause failures and alert you to the problem.  Unfortunately, if you are trying to recover from an actual outage those failures are likely to come as a complete surprise.
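The comparison between what was entered and what amqoamd gave back can also be automated.  The following is a rough sketch in Python, not a replacement for a working backup tool.  It parses setmqaut commands from the input script and from the amqoamd -s output in the transcript above, then reports profiles that are missing or whose authorities changed.  The helper names are hypothetical, and stripping the @host suffix so that Guest matches Guest@M4700 is a simplifying assumption that would be wrong wherever the same ID exists in more than one domain.

```python
import shlex

# setmqaut flags that take a value, mapped to the field they fill.
VALUE_FLAGS = {"-m": "qmgr", "-n": "profile", "-t": "objtype",
               "-p": "entity", "-g": "entity"}


def parse_setmqaut(line):
    """Parse one setmqaut command into (key, authorities), or None."""
    tokens = shlex.split(line)
    if not tokens or tokens[0] != "setmqaut":
        return None
    fields, auths = {}, set()
    i = 1
    while i < len(tokens):
        token = tokens[i]
        if token in VALUE_FLAGS and i + 1 < len(tokens):
            fields[VALUE_FLAGS[token]] = tokens[i + 1]
            i += 2
            continue
        # '+none' grants nothing; removals such as '-all' are ignored here.
        if token.startswith("+") and token.lower() != "+none":
            auths.add(token[1:].lower())
        i += 1
    # Assumption: drop any @host or @domain suffix so IDs compare equal.
    entity = fields.get("entity", "").split("@")[0].lower()
    key = (fields.get("qmgr", ""), fields.get("profile", ""),
           fields.get("objtype", ""), entity)
    return key, auths


def load_profiles(lines):
    """Map each profile key to the authorities of its last command."""
    profiles = {}
    for line in lines:
        parsed = parse_setmqaut(line.strip())
        if parsed is not None:
            key, auths = parsed
            profiles[key] = auths
    return profiles


def compare(intended, backup):
    """Return (missing, changed) profile keys, each sorted."""
    missing = sorted(k for k in intended if k not in backup)
    changed = sorted(k for k in intended
                     if k in backup and intended[k] != backup[k])
    return missing, changed


intended_script = [
    "setmqaut -m DUMMY -p Guest -n ** -t queue -all +put +inq +dsp",
    "setmqaut -m DUMMY -p Guest -n MYTOPIC -t topic -all +pub +sub",
    "setmqaut -m DUMMY -p Guest -n REMOTEQMGR -t rqmname -all +put",
    "setmqaut -m DUMMY -p Guest -n SYSTEM.CLUSTER.COMMAND.QUEUE -t queue -all +none",
]
amqoamd_output = [
    "setmqaut -m DUMMY -n '**' -t queue -p Guest@M4700 +inq +put +dsp",
    "setmqaut -m DUMMY -n MYTOPIC -t topic -p Guest@M4700 +pub +sub",
    "setmqaut -m DUMMY -n REMOTEQMGR -t rqmname -p Guest@M4700 +None",
]

missing, changed = compare(load_profiles(intended_script),
                           load_profiles(amqoamd_output))
print("missing from backup:", missing)
print("changed in backup:  ", changed)
```

A check like this only works if you kept the input scripts, which is exactly the dependency a trustworthy backup tool is supposed to remove.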

You are betting the business on your messaging.  Security is important.  Tools capable of accurately, reliably, and completely saving and restoring the security configuration should be basic functionality.  Neither of the IBM official tools actually captures the entire configuration and, thus far, IBM’s response has been that both of these are working as designed.

If this concerns you, tell IBM.  Leave negative feedback on the Technote explaining why it’s a problem for your shop.  Open a PMR.  It’s bad enough that these tools do not provide a complete backup and that customers using them do not know it.  But it’s even worse if IBM either doesn’t know the tools don’t work or simply doesn’t acknowledge that they are defective.  Either the documentation is defective and needs to be changed to say clearly that none of the provided tools provide a complete backup, or else the lack of a complete backup is a code defect and needs to be addressed.

Alternatives

Many people save the scripts from which the authorization settings were generated.  This remains the best way to know what was intended.  However, the run-time settings often differ from what was documented.  We absolutely need good run-time tools that capture the state of a running system specifically so that we can see how it might differ from the documented version, or how it changes over time.

If you have a central tool such as Infrared-360 which works, use that.  I do not know whether Tivoli produces a complete backup but it being an IBM product doesn’t dispel the uncertainty in either direction.  Other tools may or may not work and I’m happy to report here any that can be confirmed either way.

The MQSCX tool from MQGem works.  I verified this with Paul when the dmpmqcfg issues first broke on the list server.  It has client functionality and can print the objects on one line like dmpmqcfg and saveqmgr for scripting.  You will however need to supply it with runmqsc commands like DIS AUTHREC to get the definitions.

Michael Dag tells me that MQDocument also captures a complete set of configurations.  I do not have any experience with that product but I invite you to go to the MQSystems web site and read more about it.

If you know of any additional products – and can confirm whether or not they properly save the profiles I’ve supplied here – please leave a comment.  You can subscribe to comments on this post if you wish to receive updates as comments are added.
