Saturday, May 14, 2016

Image Sharing: Are we there yet? It seems not.

Short version: Why are we still using CDs? It's not the lack of standards or commercial solutions; it seems to be the lack of will, aka incentives.

Long version.

In Joe Biden's Full Interview With Robin Roberts on the Cancer Moonshot he rightly bemoans (at 08:20 minutes in) the inability of two prestigious organizations, Walter Reed Hospital in Washington, D.C., and MD Anderson Cancer Center in Houston, TX, to share his son's medical imaging data electronically, without resorting to flying discs across the country (and even that apparently required the intervention of his son-in-law, who is a surgeon). Unfortunately, he attributes this to an absence of a "common language", which for this particular case is not true (since we have DICOM, which is the lingua franca of images). Earlier in the interview, the issue of incentives is discussed though.

This experience mirrors my own, dealing with family attending Memorial Sloan Kettering Cancer Center (MSKCC)  in New York, NY. The only mechanism I have to obtain images from there is again via CD. Speaking to one of the radiologists at Memorial, I was told that the inbound problem is just as bad; they employ 10 (!) FTEs whose only function is to stuff CDs received into drives to import them. Apparently they do have one of the commercial network image sharing alternatives installed, but are planning on ditching it and going with another vendor, not sure why. "Continuing bandwidth issues" were cited as a concern. MSKCC has a limited patient portal, which does have radiology results available through it (plain text of course, nothing structured to download), but apparently making images available (whether to View, Download or Transmit) through the portal is not a priority. It does make paying the bills easier though (I guess that is important for them).

Now, it is great that CDs work at all, and work relatively well. And of course they are thoroughly standardized (using the DICOM PS3.10 files that are specified by IHE PDI), as long as they don't come from older Stentor/Philips crap. But surely, well into the 21st Century, we can do better than "sneaker net", especially between major medical centers.

Yesterday, on a call with the HIMSS-SIIM Enterprise Imaging Joint Workgroup Best Practice Image Exchange and Sharing (Team 3) (which I have belatedly joined), there was a discussion about reorganizing the work groups and starting a new one on Standards and Interoperability. I was keen to emphasize that I don't think the interoperability problem is one of a lack of standards or implementation of them, but rather a lack of incentives, funding, prioritization or indeed a clearly articulated value proposition for deploying solutions, using the standards that we already have (or even using a non-standard solution, if it works).

When the UK folks were facing the problem of image sharing, and the NHS failed to deliver a suitable central solution, an ad hoc network of push-driven sharing evolved, the Image Exchange Portal (IEP), which has been bought and expanded by Sectra. They claim that:

"100% of NHS Acute Trusts in England plus private hospitals are connected to one another via the IEP network".

As I understand it, these guys were no more incentivized to develop, join or use the IEP sharing than are their counterparts in the US, nor were there any disincentives for not bothering to share images. Perhaps there were just no funds available to employ an army of CD-stuffers to work around the problem, so the pain was being felt by the decision makers. Or perhaps the resources for repeat imaging were more tightly controlled (as opposed to being a potential source of more revenue in the US), so the shared images were the only images available. I am just guessing, but I doubt it was because the Brits are any more altruistic or sensible than their Cousins (I can say that, since I am nominally a Brit, even though I have lived and worked in the US for decades).

The Canadians have their much vaunted, centrally funded, regional Diagnostic Image Repositories (DI-r's), but I am told that, in some provinces at least, you are lucky if you can get out what you put in, and there is little if any useful access to images submitted by other sites. Some provinces have apparently been able to do better though.

Regardless, all of us who work in medical imaging IT know that the technology is there, and is affordable, and the workflow is manageable despite having to deal with stupid things like the lack of a single national patient identifier. It doesn't really matter for the sharing use case which standard or combination of standards you choose for the transfer, as long as the payload is DICOM. Whether you push them or pull them, use traditional DICOM protocols or DICOMweb or XDS-I RAD-69 or XDR-I or some proprietary mechanism, or follow IHE Import Reconciliation Workflow (IRWF) to deal with the identifiers or do it your own way, with a little configuration, the images are going to get where they need to be. It is really just a question of motivating sites to get off their collective asses.

In the "collective" probably lies part of the problem, since on a large scale, what motivates competitors to share?

For once though, the problem can hardly be laid at the door of the evil vendors who might be accused of "data blocking". For image sharing, there is an army of vendors willing to help solve your sharing problem, as well as open source components to assemble your own, there are no format issues, the problem is way simpler than that of general EHR interoperability, and there is no debate over documents versus APIs (all of the radiology and cardiology images, at least, are already in DICOM format and document-like in that respect).

When I discussed this in late 2012 with Farzad Mostashari, after expressing my disappointment that the MU2 didn't insist on image sharing, he wrote that:

"My hope is that the business case for this is so clear that it will happen regardless (perhaps with some help from convening, best practices, etc) and we can point to the on-the-ground reality in two years as the ultimate refutation of the concerns."
Now here we are three and a half years later, not two, with a plethora of commercial solutions as well as a multitude of standards for image sharing, but the "business case" is apparently not so clear after all, if the Vice President of the United States still needs to arrange to fly CDs around.

Shame on us all for failing him and his family.


PS. As far as I have been able to ascertain, the MACRA proposed rule doesn't provide any incentives or requirements for image sharing either. This may be as much because nobody has submitted sharing related performance measures as the lack of central recognition that this is important or a priority. Maybe the VP should submit comments on it!

PPS. In the same interview, Joe Biden also takes a shot at the much reviled editor of the NEJM, Jeffrey Drazen,  over his ill-considered "data parasites" comments (actually "research parasites", in the editorial co-authored with Deputy Editor Dan Longo). While Drazen may be well on his way to becoming the most hated man in America (perhaps overshadowing Martin Shkreli, the AIDS drug robber baron) the issues raised in Drazen's editorial are about a different kind of "sharing" than the subject of this post.

No doubt Drazen's comments reflect the opinion of many in the "elite healthcare research establishment", who seem to regard the right to solely exploit their taxpayer-funded research and data in order to exclude success by their funding competitors (not to mention their unwillingness to have their own data and analysis scrutinized for integrity and repeatability) as something akin to the divine right of kings. Again, this all seems to be a matter of incentives, this time the perverse incentives of the research funding infrastructure that encourage data hoarding rather than sharing due to the competitive nature of the process. NIH, perhaps crippled by the Bayh–Dole Act, doesn't seem to have any teeth in its data sharing policy when it comes to reviewing and approving grant applications or monitoring their performance, so there is no "level playing field" of mandatory and immediate sharing. Since most of what is published is probably false anyway, perhaps it doesn't matter:(

There is something for everyone in the interview, and the lack of open access to research publications comes in for its share of criticism too. Hear, hear!

I wish the VP every success in his crusade.

Sunday, May 8, 2016

To C-MOVE is human; to C-GET, divine

Summary: C-GET is superior to C-MOVE for use beyond the firewall; contrary to some misleading reports, it has NOT been retired from DICOM, and implementations do exist.

Long Version.

With apologies to Alexander Pope, I wanted to draw attention to what appears to be a common misconception, that DICOM C-GET is retired or obsolete or deprecated.

C-GET is not retired; it most definitely is alive and well, and more importantly, useful.

C-GET is especially useful for DICOM use over the public Internet, beyond the local area network.

As you know, by far the most common way to retrieve a study, series or individual instances is to use a C-MOVE request, which instructs the server (SCP) to initiate the necessary C-STORE operations on one or more different connections (associations) to transfer the data.

This necessitates:
  • the requester being able to listen for and accept inbound connections (i.e., be a C-STORE SCP),
  • that any impediments on the network (like firewalls) allow such inbound connections,
  • that the sender be configured with the host/IP address and port of the requester (since only the Destination AET is communicated in the C-MOVE request), and
  • that Network Address Translation (NAT) be correctly configured to forward the inbound connections to the requester.
By comparison, a C-GET request does not depend on separate associations being established, but rather "turns around" the same connection on which the request is made, and re-uses it to receive the inbound C-STORE operations. I.e., it is just like an HTTP GET in that all the data comes back on the same connection. It is similar in functionality to a Passive FTP transfer, although in FTP there are actually two separate connections (one for commands and one for data), both of which are initiated by the requester.
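To make the configuration difference concrete, here is a toy model in Python. It is only a sketch: ToyArchive, its methods and the example AE titles and addresses are all hypothetical, and no real DICOM networking takes place. The one real detail is the status code 0xA801, which is what a C-MOVE SCP returns when it does not recognize the Move Destination.

```python
from dataclasses import dataclass, field

SUCCESS = 0x0000
MOVE_DESTINATION_UNKNOWN = 0xA801  # DICOM status: Refused, Move Destination unknown


@dataclass
class ToyArchive:
    """Hypothetical stand-in for a Q/R SCP; it only models who connects to whom."""

    # AE title -> (host, port). Only C-MOVE needs this table, because the
    # C-MOVE request carries nothing but the Destination AET.
    destinations: dict = field(default_factory=dict)

    def c_move(self, destination_aet):
        """C-MOVE: look up the destination, then open a NEW outbound association."""
        if destination_aet not in self.destinations:
            return MOVE_DESTINATION_UNKNOWN, None
        host, port = self.destinations[destination_aet]
        return SUCCESS, f"new association to {host}:{port}"

    def c_get(self):
        """C-GET: the C-STORE sub-operations come back on the requester's association."""
        return SUCCESS, "same association as the request"


archive = ToyArchive(destinations={"WORKSTATION1": ("10.0.0.5", 11112)})
print(archive.c_move("WORKSTATION1"))  # works, but only because it was pre-configured
print(archive.c_move("LAPTOP"))        # refused: nobody told the archive about LAPTOP
print(archive.c_get())                 # works for any requester that can connect
```

The point of the sketch is simply that the lookup table (and the inbound route through the firewall and NAT that goes with it) exists only on the C-MOVE path.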

With all three protocols, DICOM C-GET, HTTP GET and Passive FTP GET, there is:
  • no need for the requester to be able to respond to inbound connections
  • no need to configure firewalls to allow inbound connections or perform NAT, and
  • no need (other than for access control) to configure the sender to know anything about the requester.
Of course, firewalls may also restrict outbound connections, but that affects all protocols similarly.

All three protocols can of course communicate over secured channels, whether by using TLS or a VPN.

So, if C-GET is so useful, why is it not as commonly implemented?

Historically, when DICOM was first getting started and being used mostly for mini-PACS clusters of acquisition modalities and workstations, the thinking of the designers went something like this. First, I have to be able to send and receive images by pushing them around, so I have to implement C-STORE as an SCU and SCP. Now, the product manager says I have to allow users to pull them too, so the easiest way is to write a C-MOVE SCU and SCP to command that the transfer takes place, but I can just reuse the existing C-STORE SCU and SCP code that I have already written. I only have a handful of devices to connect on the LAN, so the administrative burden of configuring them all to know about each other is not an issue. QED.

As smaller systems were scaled to enterprise level, and larger proprietary systems added DICOM Q/R capability to allow the same mini-PACS workstations to gain access to the archive, the use of C-MOVE became entrenched, without much further thought being given to the potential future benefits of C-GET for use beyond the walls of the enterprise or on a really large scale. Much later, IHE specified C-MOVE for the Retrieve Images (RAD-16) transaction (in Year 2 for 2000), which subsequently became part of the Scheduled Workflow Profile, but did not mention C-GET, presumably because the conventional wisdom at the time was that C-MOVE was much more widely implemented.

So who does support C-GET?

A Google search reveals quite a few systems that do. There are some open source or freely available SCUs and SCPs too. When I monitor at Connectathons, it is extremely convenient to be able to retrieve stuff from testers' systems (to compare what they have with what is expected) without having to go and bother them to add my configuration for C-MOVE, and off hand I would guess about 15-25% of the systems respond to a C-GET, including, of course, the central archive, which for the last few years has been dcm4chee. Dave Harvey's publicly accessible server and PixelMed's support C-GET, as do clients like Osirix, though I don't think either ClearCanvas or K-PACS do:(

The tricky thing with implementing C-GET as an SCU is the Association Negotiation, and particularly the (annoying, gratuitous, arbitrary) limit on the total number of Presentation Contexts caused by the "odd integers between 1 and 255" requirement on the single byte Presentation-context-ID. The naive (though inefficient) approach of listing all possible (storage) SOP Classes permuted with all possible Transfer Syntaxes reaches that limit quickly nowadays. Allowing the SCP to choose the Transfer Syntax, and using SOP Classes in Study from an earlier STUDY level C-FIND (or using plausible SOP Classes based on Modalities in Study, or if these are not supported as return keys by the C-FIND SCP, Modality from a SERIES level C-FIND, or worst case, the SOP Class UID from an IMAGE level C-FIND) helps a lot with this, though does limit the re-usability of the Association if you want to keep it alive in a "connection pool" for later retrievals.
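The arithmetic behind that limit is easy to sketch. The Python below is an illustration only: the function names are mine, the counts of 100 SOP Classes and 10 Transfer Syntaxes are made-up round numbers rather than a census of the standard, and the tuples stand in for whatever structure your toolkit actually uses to propose contexts. The UIDs themselves are the real ones.

```python
# Presentation-context-ID is a single byte restricted to odd values 1..255.
CONTEXT_IDS = list(range(1, 256, 2))
MAX_CONTEXTS = len(CONTEXT_IDS)  # 128

STUDY_ROOT_GET = "1.2.840.10008.5.1.4.1.2.2.3"  # Study Root Q/R Information Model - GET
CT_IMAGE_STORAGE = "1.2.840.10008.5.1.4.1.1.2"
MR_IMAGE_STORAGE = "1.2.840.10008.5.1.4.1.1.4"
IMPLICIT_VR_LE = "1.2.840.10008.1.2"
EXPLICIT_VR_LE = "1.2.840.10008.1.2.1"


def naive_context_count(n_storage_sop_classes, n_transfer_syntaxes):
    """One context per (SOP Class, Transfer Syntax) pair: the naive approach."""
    return n_storage_sop_classes * n_transfer_syntaxes


def targeted_contexts(sop_classes_in_study, transfer_syntaxes):
    """One context for the C-GET itself, then one per SOP Class actually in the study.

    Each storage context offers every transfer syntax, leaving the choice to the SCP.
    """
    storage = sorted(set(sop_classes_in_study))
    if 1 + len(storage) > MAX_CONTEXTS:
        raise ValueError("too many SOP Classes for a single association")
    proposals = [(CONTEXT_IDS[0], STUDY_ROOT_GET, (IMPLICIT_VR_LE,))]
    for context_id, sop_class in zip(CONTEXT_IDS[1:], storage):
        proposals.append((context_id, sop_class, tuple(transfer_syntaxes)))
    return proposals


# Illustrative (assumed) counts: the naive permutation blows well past the limit.
print(naive_context_count(100, 10), "contexts needed, but only", MAX_CONTEXTS, "allowed")

# SOP Classes in Study as they might come back from a STUDY level C-FIND.
for proposal in targeted_contexts(
    [CT_IMAGE_STORAGE, MR_IMAGE_STORAGE, CT_IMAGE_STORAGE],
    [IMPLICIT_VR_LE, EXPLICIT_VR_LE],
):
    print(proposal)
```

Note that a real C-GET SCU also has to negotiate the SCP role for each storage SOP Class (SCU/SCP Role Selection), which this sketch leaves out.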

From a performance perspective, single connection C-GET and C-MOVE are similar, which is not surprising since both are often limited by latency effects on the synchronous C-STORE response. In the absence of Asynchronous Operations support, it is obviously easier to accelerate C-MOVE by opening multiple return Associations across which to spread the C-STORE operations, which one can't do with C-GET, unless one selectively retrieves at the IMAGE level, which is possible, but tedious to set up and requires an initial IMAGE level C-FIND to get SOP Instance UIDs. Using large multi-frame image instances mitigates this issue.
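A back-of-the-envelope model shows why latency dominates. This is a deliberately crude sketch, not a measurement: it assumes a stop-and-wait exchange in which every C-STORE costs its transmission time plus one round trip, and the link speed (100 Mbit/s), round trip time (50 ms) and frame size are numbers I have picked purely for illustration.

```python
def effective_throughput(instance_bytes, bandwidth_bytes_per_s, rtt_s, associations=1):
    """Bytes per second under a crude stop-and-wait model of synchronous C-STORE."""
    # Each C-STORE must be acknowledged before the next one is sent on that
    # association, so every instance costs its transmission time plus one round trip.
    seconds_per_instance = instance_bytes / bandwidth_bytes_per_s + rtt_s
    per_association = instance_bytes / seconds_per_instance
    # Parallel associations overlap their waiting, but cannot exceed the link itself.
    return min(associations * per_association, bandwidth_bytes_per_s)


LINK = 12_500_000.0     # 100 Mbit/s expressed in bytes per second (assumed)
RTT = 0.050             # 50 ms round trip (assumed WAN latency)
FRAME = 512 * 512 * 2   # one uncompressed 512 x 512 16-bit frame

single_frame = effective_throughput(FRAME, LINK, RTT)
multi_frame = effective_throughput(100 * FRAME, LINK, RTT)  # 100 frames in one instance
four_associations = effective_throughput(FRAME, LINK, RTT, associations=4)

print(f"single-frame instances, one association:   {single_frame / LINK:.0%} of the link")
print(f"100-frame instances, one association:      {multi_frame / LINK:.0%} of the link")
print(f"single-frame instances, four associations: {four_associations / LINK:.0%} of the link")
```

Under these assumptions a single association sending one slice at a time spends more than half of its time waiting for acknowledgements, while either parallel associations (the C-MOVE trick) or multi-frame instances (which work equally well for C-GET) recover most of the link.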

It would be interesting to see, for the simple pull use case, how close the C-GET with Asynchronous Operations support could approach raw socket transfer speeds though, and how it would compare with an HTTP GET or Passive FTP GET.

The security considerations (including channel confidentiality, access control and audit trail) would seem to be similar for C-GET and C-MOVE, and both TLS and user identity communication are available if necessary.


PS. I was motivated to write this when I noticed that Sébastien Jodogne says in Note 1 of his description of "C-Move: Query/retrieve" documenting his Orthanc server:

"Even if C-Move may seem counter-intuitive, it is the only way to initiate a query/retrieve. Once upon a time, there was a conceptually simpler C-Get command, but this command is now deprecated."

I asked Sébastien where he got this impression, and he attributes the source of his confusion to this post by Roni Zaharia. Both are incorrect in this respect.

During the great DICOM purge of 2006 (Sup 98), though the Patient/Study Only Query/Retrieve Information Model was retired from the Query/Retrieve Service, C-GET was left alone, and none of the other Supplements or CPs related to retirement touched it either. On the contrary, subsequent additions to the standard to support Instance and Frame Level Retrieve and Composite Instance Retrieve Without Bulk Data (Sup 119) extended the use of C-GET significantly.

Sébastien profusely apologizes for relying on hearsay and failing to check the standard, and hopes to implement C-GET when he has a chance.

PPS. I observe in passing that Roni also recommends the use of Patient Root rather than Study Root queries, which I would strongly disagree with. In the early days, many systems' databases were implemented with the study as the top level and the patient's identifiers and characteristics were managed as attributes of the study, if for no other reason than HIS/RIS integration was not as common as it is today, and patient level stuff was often inconsistent and/or incorrect. IHE, for example, when Q/R was added in Year Two, specified the Study Root C-FIND as required and the Patient Root as optional for the Query Images (RAD-14) and Retrieve Images (RAD-16) transactions, and that is still true in Scheduled Workflow today. I never use Patient Root if I can avoid it, and Roni's assertion that "everyone supports it" certainly didn't used to be true.

PPPS. Some old comp.protocols.dicom posts on the subject of C-GET include the following, which show the "evolution" of my thinking:

Difference between C-GET and C-MOVE
DICOM retrieve (C-GET-RQ) example anyone?
C-GET vs C-MOVE (was Retrieving off-line studies from DICOM archive)
C-Get versus C-Move, was Re: C-Move

Wednesday, March 2, 2016

DICOM and SNOMED back in bed together

Summary: Users and commercial and open source DICOM developers can be reassured that they may continue to use the subset of SNOMED concepts in the DICOM standard in their products and software, globally and without a fee or individual license.

Long Version.

The news from IHTSDO and a summary of the relationship can be found at this IHTSDO DICOM Partnership page, including links to the text of the agreement and a press release.

DICOM has used SNOMED since the days of the SNOMED DICOM Microglossary in the mid-nineties. This was the work of Dean Bidgood, who was not only very actively involved in DICOM but also a member of the SNOMED Editorial Board. As SNOMED evolved over time, it became necessary to reach an agreement with the original producers, the College of American Pathologists. This allowed DICOM to continue to publish and use SNOMED codes in software and products without a fee, and in return DICOM continued to contribute imaging concepts to be added to SNOMED.

This has worked out really well so far, so it is reassuring that we now have a similar agreement in place with the new owners, IHTSDO.

The subset of SNOMED concepts that DICOM may use includes all concepts that are currently in the standard as of the 2016a release and that are active in the SNOMED 2016 INT release, as well as those in some upcoming Supplements and CPs. I have been going through and cleaning up any concepts that have been inactivated in SNOMED (due to errors, duplicates, ambiguities, etc.) and adding them to CP 1495 to replace them and mark them as retired. This is pretty tedious but with the XML DocBook source of the standard, a lot of the checking can be automated, so this process should converge pretty soon. Note that per both the original agreement with CAP and the new agreement with IHTSDO, there is recognition that products and software that use retired inactive codes may continue to do so if necessary.

A small subset of codes (for non-human applications) have been handed off by IHTSDO to the maintainers of the Veterinary Extension of SNOMED CT, and we have been reassured by those folks that it is OK for us to continue to use them too.

If anyone actually needs a tabulated list of all the concepts in the SNOMED DICOM subset in some more convenient form than the PDF that lists the concept identifiers, just let me know and I can send you some of my working files. I also have some XSLT style sheets that can be used to trawl the source for both coded tuples and codes in tables, so if you need to do that sort of thing, just let me know (I will add these to the source and rendering archive file in the next release of the DICOM standard).


Tuesday, March 1, 2016

How many (medical image exchange) standards can dance on the head of a pin?

Summary: There are too many alternative standards for sharing images. For the foreseeable future, traditional DICOM DIMSE services will remain the mainstay of modality and intra-enterprise image management, perhaps with the exception of viewers used internally. The WADO-URI and WADO-RS services are attractive in their simplicity and have sufficient features for many other uses, including submission of other 'ology images using STOW (WIC). If one has not already deployed it (and even then), one might want to give serious consideration to "skipping over" XDS-I as a dead-end digression and going straight to the more mobile and ZFP friendly WADO-RS instead (including potentially revised MHD-I). The RSNA Image Share Validation program for XDS-I is perhaps not such a cool idea, and should be refocused on validating WADO-RS-based services. How/if FHIR ImagingStudy and ImagingObjectSelection fit in remains to be determined.

Long Version.

Do standards have location in space, but not extension, so the answer is an infinite number? Or no location at all, so, perhaps none?

We certainly have no shortage of standards in general, as the sarcastic quote from Andy Tanenbaum ("The nice thing about standards is that you have so many to choose from") illustrates. This xkcd cartoon explains one among many reasons for their proliferation.

Some of the drivers that encourage excessive proliferation of multiple standards for the same thing include:
  • extension of an existing successful standard into new domains to compete with an incumbent
  • "technology refreshment" (wanting to use the latest and greatest trendy buzzword compliant mechanisms that may or may not offer real benefit)
  • simpler solutions to address real or perceived complexity of existing standards
  • "not invented here"
  • laziness (easier to write than read)
  • pettiness (we hate your standard and the horse it rode in on)
  • low barrier to entry (anyone can use the word "standard")
  • bad standards (seemed like a good idea to someone at the time)
So what does this mean for medical image sharing, both for traditional radiology and cardiology applications, as well as the other 'ologies?

If we just consider DICOM image and related "payloads" for the moment, and focus strictly on the exchange services, currently one has a choice of several overlapping mainstream "standard" services:
as well as some niche services for specific purposes:
Each of these can be considered from many perspectives, including:
  • installed base (for various scenarios)
  • intra-enterprise (LAN) capability
  • extra-enterprise (remote, WAN) capability
  • cross-enterprise (WAN, cross identity and security domain) capability
  • performance (bandwidth and latency)
  • functionality (to support simple and advanced use cases)
  • complexity (from developer, deployment and dependency aspect)
  • security support
  • scalability support (server load, load balancing, caching)
  • reliability support
  • ...
However, to cut a long story short, at one end of the spectrum we have the ancient DICOM services. These are used ubiquitously:
  • between traditional acquisition modalities and the PACS or VNA
  • for pushing stuff around inside an enterprise
  • for pushing (over secure connections) to central/regional/national archives (like Canadian DIrs)
  • for interfacing to traditional "workstations" for RT, advanced image processing, etc.
Many people hate traditional DICOM for inbound queries, whine about "performance" issues (largely due to poor/lazy implementations that are excessively latency sensitive due to the default protocol's need for acknowledgement), and rarely bother to secure it (whether over TLS or with use of any of its user identity features). Certainly traditional DICOM protocols are excessively complicated and obscurely documented in arcane OSI-reminiscent terminology, making it much harder for newbies to implement it from scratch. But it works just fine, and everybody sensible uses a robust open-source or commercial toolkit to hide the protocol details; but that creates a dependency, which in an ideal world would be avoidable.

At the other end of the spectrum, there is the closest thing to a "raw socket" (the network developers' ideal), which is an HTTP GET or POST from/to an endpoint specified by a URL. In terms of medical imaging standards this means WADO-URI or WADO-RS for fetching stuff, STOW-RS for sending stuff, and QIDO-RS for finding it. FHIR's ImagingStudy resource also happens to have a means for actually including the payload in the resource as opposed to using WADO URLs.
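To show just how little is involved, here is a sketch in Python that does nothing more than build the URLs and headers for the four services. The base URLs and UIDs are placeholders of my own invention, and the function names are hypothetical; the query parameter names, resource paths and media types are the ones defined in DICOM PS3.18.

```python
from urllib.parse import urlencode


def wado_uri(base, study_uid, series_uid, object_uid, content_type="application/dicom"):
    """WADO-URI: everything goes in the query string of a single GET."""
    query = urlencode({
        "requestType": "WADO",
        "studyUID": study_uid,
        "seriesUID": series_uid,
        "objectUID": object_uid,
        "contentType": content_type,
    })
    return f"{base}?{query}"


def wado_rs_instance(base, study_uid, series_uid, instance_uid):
    """WADO-RS: the UIDs form the resource path; the media type goes in Accept."""
    url = f"{base}/studies/{study_uid}/series/{series_uid}/instances/{instance_uid}"
    headers = {"Accept": 'multipart/related; type="application/dicom"'}
    return url, headers


def qido_rs_studies(base, **matching_keys):
    """QIDO-RS: search for studies, with matching keys as query parameters."""
    url = f"{base}/studies"
    if matching_keys:
        url += "?" + urlencode(matching_keys)
    return url, {"Accept": "application/dicom+json"}


def stow_rs_studies(base):
    """STOW-RS: POST a multipart body of PS3.10 files to the studies resource."""
    headers = {"Content-Type": 'multipart/related; type="application/dicom"'}
    return f"{base}/studies", headers


# Placeholder endpoints and UIDs, for illustration only.
print(wado_uri("https://pacs.example.org/wado", "1.2.3", "1.2.3.4", "1.2.3.4.5"))
print(wado_rs_instance("https://pacs.example.org/dicomweb", "1.2.3", "1.2.3.4", "1.2.3.4.5"))
print(qido_rs_studies("https://pacs.example.org/dicomweb", PatientID="12345"))
print(stow_rs_studies("https://pacs.example.org/dicomweb"))
```

Actually sending these is a job for whatever HTTP client you already have; parsing the multipart response from WADO-RS is the only part that takes more than a few lines.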

Nothing is ever as simple as it seems though, and many committee hours have been spent on the low level details, like parameters, accept headers, character sets, media types and transfer syntaxes. There is insufficient experience to know whether the lack of a SOP Class specific negotiation mechanism really matters or not. But certainly for the simple use cases of getting DICOM PS3.10 or rendered JPEG "files", a few examples probably suffice to get a non-DICOM literate developer handwriting the code on either end without resorting to a toolkit or the need for too many dependencies. If one puts aside the growing "complexity" of HTTP itself, especially HTTP 2.0 with all of its optimizations, in its degenerate form, this WADO-URI and WADO-RS stuff can be really "simple". Theoretically, WADO-RS is also supposed to be "RESTful", whatever that is, if anyone actually cares.

But its main claim to fame is there is no SOAP involved. On the subject of which ...

Somewhere in the middle (or off to one side) we have the old-fashioned SOAP Web Services based XDS-I.b, and the retrospectively DICOM-standardized and extended version of its transfer mechanism, WADO-WS. XDS-I.b includes SOAP services to interact with a registry to find stuff (documents and image manifests), and then the image manifest can be used to fetch the DICOM images, either using another SOAP transaction (RAD-69 based on ITI-43) or various DICOM or WADO mechanisms.

Born of a well-intentioned but perhaps misguided attempt to leverage the long defunct OASIS ebXML standard, and built on the now universally-despised SOAP-based web services, the entire XDS family suffers from being both complex and not terribly developer friendly. Though the underlying XDS standards are gaining some traction (perhaps because there really weren't too many competing standards for moving documents around), there are not that many XDS-I.b implementations actually being used, even if some vendors have certainly implemented it (and a few aggressively promote it).

Or to put it another way, with the benefit of 20-20 hindsight, XDS-I.b is beginning to look like the worst of all worlds - excessively complex, bloated, dependent on a moribund technology and with a negligible installed base.

What XDS-I.b does bring to the table is an architectural concept with registries and repositories and sources. So, rather than throw the baby out with the bathwater, there is ongoing IHE work to get rid of the SOAP stuff and make FHIR-based MHD the new profile on which to implement the same architecture (though it is not phrased in terms of "getting rid" of anything, of course, at least not yet). In IHE Radiology there is ongoing work to redo the first try at MHD-I to use WADO-URI and WADO-RS and the FHIR ImagingObjectSelection resource as a manifest.

Of course, it is very easy to be critical of XDS-I.b in retrospect.

Long before it became "obvious" (?) that simple HTTP+URL was sufficient for most use cases, as long as XDS-I, and later XDS-I.b, were the "only" non-DICOM-protocol approaches sanctioned by IHE, we all ran around promoting it as preferable to proprietary solutions, myself included. There was tacit acceptance that DICOM protocol detractors would never be satisfied with a non-port 80 solution, and so XDS-based image exchange was the only theoretical game in town.

Fortunately, hardly anybody listened.

I am oversimplifying, as well as eliding numerous subtleties (e.g., difficulties of cross-community exchange without URL rewriting, or benefits for caching, concerns about how to pass SAML assertions, benefits of leveraging same services and architecture as documents). And I am probably underestimating the size of the installed base (just as protagonists probably exaggerate it).

But the core message is important ... should we abandon XDS-I.b now, before it is too late?

I am increasingly convinced that for every objection some XDS-loving Neanderthal raises against using a light-weight HTTP non-SOAP no-action-semantics-in-the-payload URL-only pseudo-RESTful solution (LWHNSNASITPUOPRS), there is a solution somewhere out in the "real" (non-healthcare) world. Religious wars have been fought over less, but I think I have finally come around to the SOAP Sucks camp, not because XDS-I.b can't be made to work, obviously it can, but because nobody in this day and age needs to be burdened with trying to do so.

Since DICOM and HL7 embraced the RESTful way, it really seems like a waste of time to be swimming against the current, so to mitigate the issue of standards proliferation leading to barriers to interoperability, something has to be sacrificed, and the older less palatable approach may need to die.

Unfortunately, some folks are pulling in the wrong direction. One major imaging vendor (GE) is totally obsessed with XDS, and some (though not all) of its representatives jump up and down like Cartman having a tantrum whenever it is suggested that we retire the no-longer-useful and potentially harmful standards like WADO-WS (and even XDS-I.b itself perhaps). A few small vendors who have bet the farm on XDS join the chorus, to prove the point that somebody somewhere has actually used XDS-I.b for something. Right now there is a discussion in IHE Radiology about extending XDS-I.b to include more of the WADO-WS transactions like fetching rendered images, etc., which is quite the opposite of retirement.

So, as usual, the standards organizations like DICOM and IHE go back to the cycle of developing and promoting the union of alternatives, not the intersection, and almost everyone suffers. Not least of whom is the customer who has to (a) pay for all the development and testing effort for their vendors to maintain all of these competing interfaces, (b) endure poor performance from any one of these interfaces on which insufficient effort has been devoted to optimization, and (c) accept a restricted choice of products when incompatible choices of competing standards have been implemented. Once upon a time the value proposition for IHE was navigating through the morass of standards, but now it is an equal opportunity offender.

Some folks make out like bandits amongst this chaos, of course, including the more agile newbie VNA vendors who make it their bread and butter to try and support every imaginable interface (some even claim to support MINT). Whether they work properly or add any actual value is another matter, but there will always be an opportunity for those who make the glue. Can you say "HL7 Interface Engine"?

Sadly, RSNA has recently jumped on the XDS-I.b bandwagon with the announcement of their RSNA Image Share Validation program. To be fair, I was among those who years ago encouraged the RSNA Image Share developers to use out-of-the-box XDS-I.b transactions to implement the original Edge Server to Clearinghouse and PHR connections, in lieu of any standard alternatives (given that they wouldn't just use DICOM). But the government handout from the Recovery Act is drying up, it is clear that patients aren't rushing to pay to subscribe to PHRs, much less image-enabled ones, and frankly, this project has run its course. I am not really sure why RSNA wants to get involved in the image sharing certification business in the first place (which is what the prospectus describes), but in XDS-I.b they may have picked the wrong standard for this day and age.

Of course, maybe we should just give up now and start making a new even simpler completely different universal standard that covers everyone's use cases :)
Oops, that was FHIR, wasn't it? Subject for another day perhaps.


PS.  You may respond that my complaining about the "complexity" of XDS-I.b is a case of the pot calling the kettle black: I am an advocate of DICOM, and DICOM is hardly "simple" in terms of either its encoding or its information model (which is why the official DICOM XML and more recently DICOM JSON representations are, at the very least, superficially attractive), or the size of its documentation (which we have been trying to improve in terms of navigability).

And I would agree with you. But trying to simplify the payload, it turns out, is a lot harder than trying to simplify the exchange and query protocols, and if we can do the latter before yet another bloated and excessively complicated standard is inflicted on the developers and users, why not?

PPS. Few people notice it, but there is actually yet another DICOM standard for exchanging images, and that is in PS3.19 Application Hosting interfaces, which define SOAP-based WS transport intended for interoperability between host and applications written in different languages and running on the same machine. It is theoretically usable across multiple machines though. Using SOAP to pass parameters seemed like the best alternative at the time to making up something new, particularly given the tooling available to implement it in various popular languages. There has been talk in WG 23 of revisiting this with REST instead, but nothing has got off the ground yet; but think JSON with JAX-RS and JAXB, or similar. Since "API" is the buzzword du jour, maybe there is life in that idea!

Saturday, October 31, 2015

The slings and arrows of outrageous radiologists - I want my FLA.

Summary: We don't need fewer arrows. We need more arrows more often. And we need better arrows (in the sense that they are hyperlinked to the findings in the report when images are rendered, i.e., are Findings-Linked Annotations (FLA)). The term "arrows" being a surrogate for "visual indication of location".

Long Version.

I came across the strangest article about "arrows" in EJR.

Now, I don't normally read EJR because it is a little expensive, it doesn't come along with any professional society membership I have, I don't work at an institution that gets it, most of its articles are not open access (there is an EJR Open companion journal though), and it doesn't have a lot of informatics content. But this paper happened to be quoted in full for some reason on Aunt Minnie Europe, so I got a squizz without having to wait to receive a pre-print from the authors via ResearchGate or some other mechanism.

The thesis of the radiologist authors seems to be that "arrows" on images pointing to findings are a bad thing, and that recipients of the report should read the report instead of having access to such visual aids.

This struck me as odd, from the perspective of someone who has spent the last two decades or so building and evangelizing about standards and systems to do exactly that, i.e., to make annotations on images and semantically link them to specific report content so that they can be visualized interactively (ideally through DICOM Structured Reports, less ideally through the non-semantic but more widely available DICOM Softcopy Presentation States, and in the worst case in a pre-formatted multimedia rather than plain text report).

What are the authors' arguments against arrows? To summarize (fairly I hope), arrows:
  • are aesthetically ugly, especially if multitudinous, and may obscure underlying features
  • draw attention from unmarked less obvious findings (may lead to satisfaction of search)
  • are not a replacement for the more detailed account in the report
  • are superfluous in the presence of the more detailed account in the report
  • might be removed (or not be distributed)
  • detract from the role of the radiologist as a "readily accessible collaborator"
For the sake of argument, I will assume that what the authors mean by "arrows" includes any "visual indication of location" rendered on an image, passively or interactively. They actually describe them as "an unspoken directional signal".

The authors appear to conflate the presence of arrows with either the absence of, or perhaps the ignorance of, the report ("relying on an arrow alone as a manifestation of our special capabilities", "are merely a figurative crutch we can very well do without").

I would never assert that arrows alone (or any form of selective annotation) substitute for a good report, nor, it would seem to me, would it be best or even common practice to fail to produce a full report. The implication in the paper seems to be that when radiologists use arrows (that they expect will be visible to the report recipient), they record less detail about the location in the report, or the recipient does not read the report. Is that actually the case? Do the authors put forth any evidence to support that assertion? No, they do not; nor any evidence about what recipients actually prefer.

I would completely agree with the authors that there is an inherent beauty in many images, and they are best served in that respect unadorned. That's why we have buttons to toggle annotations on and off, including not only arrows but those in the corners for demographics and management as well. And why lead markers suck. And who really cares whether we can check to see if we have the right patient or not? OK, so there are safety issues to consider, but that's another story.

As for concerns about satisfaction of search, one could equally argue that one should not include an impression or conclusion in a report either, since I gather few recipients will take the time to read more than that. Perhaps they should be forced to wade through reams of verbosity just in case they miss something subtle not restated in its entirety in the impression anyway. And there is no rule that says one can't point out subtle findings with arrows too. Indeed, I was led to believe during my training that it was the primary interpreting radiologist's function (and major source of added value) to detect, categorize and highlight (positively or negatively) those subtle findings that might be missed in the face of the obvious.

Wrt. superfluousness, I don't know about you, but when I read a long prose description in a report that attempts to describe the precise location of a finding, whether it uses:
  • identifiers ("in series 3, on slice 9, approximately 13.8 mm lateral to the left margin of the descending aorta", which assumes incorrectly that the recipient's viewer numbers things the same way the radiologist's does),
  • approximate regions ("left breast MLO 4 o'clock position"), or
  • anatomical descriptions ("apical segment of the right lower lobe")
even if I find something on the image that is plausibly or even undeniably associated with the description, I am always left wondering if I am looking at exactly the same thing as the reporting radiologist is talking about, and with the suspicion that I have missed something. My level of uncertainty is significantly higher than it needs to be. Arrows are not superfluous, they are complementary and add significant clarity.

Or to put it another way, there is a reason the wax pencil was invented.

In my ideal world, every significant localized finding in a report would be intimately linked electronically with a specific set of coordinates in an image, whether that be its center (which might be rendered as an arrow, or a cross-hair, or some other user interface element), or its outline (which might be a geometric shape like an ellipse or rectangle, or an actual outline or filled in region that has been semi-automatically segmented, if volume measurements are reported). Further, the display of such locations would be under my interactive control as a recipient (just as one turns on and off CAD marks, or applies presentation states selectively); this would address the "aesthetic" concern of the annotation obscuring underlying structure.
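
To make that idea concrete, here is a minimal sketch of what a findings-linked annotation might look like as a data structure. It is not DICOM SR encoding or any product's API; the class names, the `annotations_to_draw` helper, the image identifiers and the coordinates are all hypothetical, chosen only to show a finding carrying its own coordinate references and the recipient controlling whether they are displayed.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Annotation:
    image_uid: str                      # identifies the referenced image (hypothetical values below)
    graphic_type: str                   # e.g. "POINT" for a center, "ELLIPSE" for an outline
    points: List[Tuple[float, float]]   # (column, row) coordinates within that image


@dataclass
class Finding:
    text: str                           # the prose that appears in the report
    annotations: List[Annotation] = field(default_factory=list)


def annotations_to_draw(findings: List[Finding], image_uid: str,
                        selected: Optional[Finding] = None,
                        visible: bool = True) -> List[Annotation]:
    """Return the annotations to overlay on one image.

    visible=False models the recipient toggling all annotations off;
    selected restricts the overlay to the finding being read in the report.
    """
    if not visible:
        return []
    chosen = findings if selected is None else [f for f in findings if f is selected]
    return [a for f in chosen for a in f.annotations if a.image_uid == image_uid]


# Hypothetical report content and image identifiers, for illustration only.
nodule = Finding("Nodule in the apical segment of the right lower lobe",
                 [Annotation("1.2.3.4", "POINT", [(210.0, 145.5)])])
cyst = Finding("Second finding on a different image",
               [Annotation("1.2.3.5", "ELLIPSE",
                           [(100.0, 80.0), (140.0, 80.0), (120.0, 70.0), (120.0, 90.0)])])
findings = [nodule, cyst]

print(annotations_to_draw(findings, "1.2.3.4"))                 # the nodule's point only
print(annotations_to_draw(findings, "1.2.3.4", visible=False))  # nothing drawn
```

The point of the sketch is only that the link runs from the finding to the coordinates, so a viewer can highlight exactly what the report sentence refers to, and hide it again on request.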

We certainly have the standards. Coordinate references in reports were one of the core elements of Dean Bidgood's Structured Reporting (SR) initiative in DICOM ("Documenting the information content of images", 1997). I used a (contrived) example of a human-generated report to emphasize the point in Figure 1 of my 2000 DICOM SR textbook (long due for revision, I know). There was even work to port the DICOM SR coordinate reference pattern into HL7 CDA (although of late this has been de-emphasized in favor of leaving these in the DICOM realm and referencing them, e.g., in PS3.20).

Nor is this beyond the state of the art of authoring and rendering applications, even if it is not commonly implemented or used. The primary barriers to adoption seem to be:
  • the diversity of the heterogeneous mix of image display, voice reporting and report display systems that are difficult to integrate tightly enough to achieve this,
  • coupled with the real or perceived difficulty of enabling the radiologist to author more highly linked content without reducing their "productivity" (as currently incentivized).
In a world in which the standard of care in the community is the fax of a printed report, possibly coupled with a CD full of images with a brain-dead viewer (and no presentation state or structured report coordinate rendering), the issue of any arrows at all is probably moot. The financial or quality incentives are focused on embellishing the report not with clinically useful content but instead with content for reimbursement optimization. The best we can probably do for these scenarios is the (non-interactive) "multimedia report", i.e., the one that has the selected images or regions of images pre-windowed and embedded in the report with arrows and numbers shared with the findings in the prose, or similar. An old concept once labelled as an "illustrated" report, recently revisited or renamed (MERR), but still rarely implemented AFAIK.

Even within a single enterprise, the "hyperlink" between specific findings in the report content and the image annotations is usually absent. The EHR and PACS may be nominally "integrated" to the point of being able to trigger the PACS viewer whilst reading the report (whether to get Well Meaningful Use Brownie Points or to actually serve the needs of the users), and the PACS may be able to render the radiologist's arrows (e.g., if they are stored as presentation states in the PACS). While this scenario is way better than having no arrows at all, it is not IMHO as good as "findings-linked annotations" (let's call them FLA, since we need more acronyms like we need a hole in the head). Such limited integrated deployments are typically present when the lowest common denominator for "report interchange" is essentially the same old plain text report, perhaps "masquerading" as something more sophisticated (e.g., by wrapping the text in CDA or DICOM SR, with or without a few section headings but without "semantic" links from embedded findings to image coordinates or references to softcopy presentation states).

Likewise, though the radiology and cardiology professional societies have been strongly pushing so-called "structured reporting" again lately, these efforts are pragmatic and only an incremental extension to the lowest common denominator. They are still essentially limited to standardization of layout and section headings, and do not extend to visual hyperlinking of findings to images. Not to dismiss the importance of these efforts; they are a vital next step, and when adopted offer valuable improvements, but IMHO they are not sufficient to communicate most effectively with the report recipients.

So, as radiologists worry about their inevitable outsourcing and commodification, perhaps they should be more concerned about how to provide added value beyond the traditional verbose prose, rather than bemoaning the hypothetical (if not entirely spurious) disadvantages of visual cues. The ability to "illustrate" a report effectively may become a key component of one's "competitive advantage" at some point.

I suggest that we need more FLA to truly enable radiologists to be "informative and participatory as caregivers, alerting our colleagues with more incisiveness and counsel" (paraphrasing the authors). That is, to more effectively combine the annotations and the report, rather than to exaggerate the importance of one over the other.


PS. Patients read their reports and look at their images too, and they really seem to like arrows, not many of them being trained anatomists.

PPS.  I thought for a moment that the article might be a joke, and that the authors were being sarcastic, but it's Halloween not April Fool's, the paper was submitted in August and repeated on Aunt Minnie, so I guess it is a serious piece with the intention of being provocative rather than being taken literally. It certainly provoked me!

PPPS. Do not interpret my remarks to in any way advocate a "burned in" arrow, i.e., one that replaces the original underlying pixel values and which is then sent as the only "version" of the image; that is obviously unacceptable. I understand the authors' article to be referring to arrows in general and not that abhorrent encoding mechanism in particular.

Thursday, October 22, 2015

I think she's dead ... no I'm not ... Is PACS pining for the fiords?

Summary: The death of PACS, and its deconstruction, have been greatly exaggerated. Not just recently, but 12 years ago.

Long Version:

Mixing quotes from different Monty Python sketches (Death of Mary Queen of Scots, Pet Shop) is probably almost as bad as mixing metaphors, but as I grow older it is more effort to separate these early associations.

These lines came to mind when I was unfortunately reminded of one of the most annoying articles published in the last few years, "PACS in 2018: An Autopsy", which is in essence an unapologetic unsubstantiated promotion of the VNA concept.

Quite apart from the fact that nobody can agree on WTF a VNA actually is (despite my own lame attempt at a retrospective Wikipedia definition), this paper is a weird collage of observable technological trends in standards and products, marketing repackaging of existing technology with new labels, and fanciful desiderata that lack real market drivers or evidence of efficacy (or the regulatory (mis-)incentives that sometimes serve in lieu).

That's fine though, since it is reasonable to discuss alternative architectures and consider their pros and cons. But wait, surprise, there is actually very little if any substance there? No discussion of the relative merits or drivers for change? Is this just a fluff piece, the sort of garbage that one might see in a vendor's press release or in one of those junk mail magazines that clutter one's physical mailbox? All hype and no substance? What is it doing in a supposedly peer-reviewed scientific journal like JDI?

OK, so it's cute, and it's provocative, and let's give the paper the benefit of the doubt and categorize it as editorial rather than scientific, which allows for some latitude.

And no doubt, somewhat like Keeping Up with the Kardashians and its ilk, since folks seem to be obsessed with train wrecks, it is probably destined to become the "most popular JDI article of all time".

And let's be even more generous and forgive the drawing of pretty boxes that smells like "Marchitecture". Or, that it would be hard for a marketing executive to draft a more buzzword compliant brochure. And perhaps as an itemized list of contemporary buzzwords, it has some utility.

My primary issue is with the title, specifically the mention of "autopsy".

Worse, the author's follow up at the SIIM 2015 meeting in his opening address entitled "The Next Imaging Evolution: A World Without PACS (As We Know It)" perpetuated this theme of impending doom for PACS, a theme that dominated the meeting.

Indeed, though the SIIM 2015 meeting was, overall, very enjoyable and relatively informative, albeit repetitive, the main message I returned home with was the existence of a pervasive sense of desperation among the attendees, many of whom seem to fear not just commoditization (Paul Chang's theme in past years) but perhaps even total irrelevance in the face of the emerging "threat" that is enterprise image management. I.e., PACS administrators and radiologists are doomed to become redundant. Or at least they are if they don't buy products with different labels, or re-implement the same solutions with different technology.

When did SIIM get hijacked by fear-mongers and doubters? We should be demanding more rigidly defined areas of doubt and uncertainty ... wait, no, wrong radio show.

OK, I get that many sites are faced with the challenge of expanding imaging beyond radiology and cardiology, and indeed many folks like the VA have been doing that for literally decades. And I get that Meaningful Use consumes all available resources. And that leveraging commodity technology potentially lowers barriers to entry. And that mobile devices need to be integrated. And that radiology will no longer be a significant revenue stream as it becomes a cost rather than profit center (oops, who said that). But surely the message that change may be coming can be spun positively, as an opportunity rather than a threat, as incremental improvement rather than revolution. Otherwise uninformed decision makers as well as uneducated worker bees who respond to hyperbole rather than substance, or who are seeking excuses, may be unduly influenced in undesirable or unpredictable ways.

More capable commentators than I have criticized this trend of hyping the supposed forthcoming "death of PACS", ranging from Mike Cannavo to Herman O's review of SIIM 2015 and the equally annoying deconstruction mythology.

Call me a Luddite, but these sorts of predictions of PACS demise are not new; indeed, I just came across an old RSNA 2003 abstract by Nogah Haramati entitled "Web-based Viewers as Image Distribution Solutions: Is PACS Already a Dead Concept?". Actually, encountering that abstract was what prompted me to write this diatribe, and triggered the festering irritation to surface. It is interesting to consider the current state of the art in terms of web viewing and what is currently labelled as "PACS" in light of that paper, considering it was written and presented 12 years ago. Unfortunately I don't have the slides, just the abstract, but I will let you know if/when I do get hold of them.

One has to wonder to what extent recent obsession with this morbid terminology represents irresponsible fear mongering, detachment from whatever is going on in the "real world" (something I am often accused of), self-serving promotion of a new industry segment, extraordinary popular delusions and the madness of crowds, or just a desire to emulate the breathless sky-is-falling reporting style that seems to have made the transition from cable news even to documentary narrative (judging by the "Yellowstone fauna are doomed" program we watched at home on Animal Planet the other night). Where is David Attenborough when you need him? Oh wait, I think he's dead. No he's not!


plus c'est la même chose

Sunday, October 4, 2015

What's that 'mean', or is 'mean' 'meaningless'?

Summary: The current SNOMED code for "mean" used in DICOM is not defined to have a particular meaning of mean, which comes to light when considering adding geometric as opposed to arithmetic mean. Other sources like NCI Thesaurus have unambiguously defined terms. The STATO formal ontology does not help because of its circular and incomplete definitions.

Long Version:

In this production company closing logo for Far Field Productions, a boy points to a tree and says "what's that mean?"

One might well ask when reading DICOM PS3.16 and trying to decide when to use the coded  "concept" (R-00317, SRT, "Mean") (SCT:373098007).

This question arose when Mathieu Malaterre asked about adding "geometric mean", which means (!) it is now necessary to distinguish "geometric" from "arithmetic" mean.

As you probably know, DICOM prefers not to "make up" its own "concepts" for such things, but to defer to external sources when possible. SNOMED is a preferred such external source (at least for now, pending an updated agreement with IHTSDO that will allow DICOM to continue to add SNOMED terms to PS3.16 and allow implementers to continue to use them without license or royalty payments, like the old agreement). However, when we do this, we do not provide explicit (textual or ontologic) definitions, though we may choose to represent one of multiple possible alternative terms (synonyms) rather than the preferred term, or indeed make up our own "code meaning" (which is naughty, probably, if it subtly alters the interpretation).

So what does "mean" "mean"?

Well, SNOMED doesn't say anything useful about (R-00317, SRT, "Mean") (SCT:373098007). The SNOMED "concept" for "mean" has parents:

  > Qualifier value (qualifier value) 
     > Descriptor (qualifier value) 
        > Numerical descriptors (qualifier value)

which doesn't help a whole lot. This is pretty par for the course with SNOMED, even though some SNOMED "concepts" (not this one) have (in addition to their "Is a" hierarchy), a more formal definition produced by other types of relationship (e.g., "Procedure site - direct", "Method"), etc. I believe these are called "fully defined" (as distinct from "primitive").

So one is left to interpret the SNOMED "term" that is supplied as best one can.

UMLS has (lexically) mapped SCT:373098007 to UMLS:C1298794, which is "Mean - numeric estimation technique", and unfortunately has no mappings to other schemes (i.e., it is a dead end). UMLS seems to have either consciously or accidentally not linked the SNOMED-specific meaningless mean with any of (C0444504, UMLS, "Statistical mean"), (C2347634, UMLS, "Population mean") or (C2348143, UMLS, "Sample mean").

There is no UMLS entry for "arithmetic mean" that I could find, but the "statistical mean" that UMLS reports, is linked to the "mean" from NCI Thesaurus, (C53319, NCIt, "Mean"), which is defined textually as one might expect, as "the sum of a set of values divided by the number of values in the set". This is consistent with how Wikipedia, the ultimate albeit evolving source of all knowledge, defines "arithmetic mean".

SNOMED has no "geometric mean" but UMLS and NCI Thesaurus do. UMLS:C2986759 maps to NCIt:C94906.
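
To see why the two need distinguishing, here is a small sketch of both calculations. The arithmetic version follows the NCI Thesaurus wording quoted above; the geometric version uses the usual textbook definition (the n-th root of the product of n positive values), computed through logarithms. The function names and sample values are mine, purely for illustration.

```python
import math


def arithmetic_mean(values):
    """The sum of a set of values divided by the number of values in the set."""
    values = list(values)
    if not values:
        raise ValueError("mean of an empty set is undefined")
    return sum(values) / len(values)


def geometric_mean(values):
    """The n-th root of the product of n positive values, computed via logarithms."""
    values = list(values)
    if not values:
        raise ValueError("mean of an empty set is undefined")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


samples = [1.0, 4.0, 16.0]
print(arithmetic_mean(samples))  # 7.0
print(geometric_mean(samples))   # approximately 4.0 (subject to floating point rounding)
```

Same input, quite different answers, so a bare code for "mean" attached to a measurement really does leave the recipient guessing.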

One might expect that one should be able to do better than arbitrary textual definitions for a field as formalized as statistics. Sure enough I managed to find STATO, a general-purpose STATistics Ontology, which looked promising on the face of it. One can poke around in it on-line (hint: look at the classes tab and expand the tree), or download the OWL file and use a tool like Protégé.

If you are diligent, and are willing to wade through the Basic Formal Ontology (BFO) based hierarchy:

> continuant
  > dependent continuant
    > generic dependent continuant
      > information content entity
        > data item
          > measurement data item
            > measure of central tendency
              > average value

one finally gets to a child, "average value", which has an "alternative term" of "arithmetic mean".


But wait, what is its definition? There is a textual annotation "definition" that is "a data item that is produced as the output of an averaging data transformation and represents the average value of the input data".

F..k! After all that work, can you say "circular"? I am sure Mr. Rogers can.

More formally, STATO says "average value" is equivalent to "is_specified_output_of some 'averaging data transformation'". OK, maybe there is hope there, so let's look at the definition of "averaging data transformation" in the "occurrent" hierarchy (don't ask; read the "Building Ontologies with Basic Formal Ontology" book).

Textual definition: "An averaging data transformation is a data transformation that has objective averaging". Equivalent to "(has_specified_output some 'average value') or (achieves_planned_objective some 'averaging objective')".


Shades of lexical semantics (Cruse is a good read, by the way), and about as useful for our purposes:(

At least though, we know that STATO:'average value' is a sub-class of STATO:'measure of central tendency', which has a textual definition of "a measure of central tendency is a data item which attempts to describe a set of data by identifying the value of its centre", so I guess we are doing marginally better than SNOMED in this respect (but that isn't a very high bar). Note that in the previous sentence I didn't show "codes" for the STATO "concepts", because it doesn't seem to define "codes", and just uses the human-readable "labels" (but Cimino-Desiderata-non-compliance is a subject for another day).

In my quest to find a sound ontological source for the "concept" of "geometric mean", I was also thwarted. No such animal in STATO apparently, yet, as far as I could find (maybe I should ask them).

So not only does STATO have useless circular definitions but it is not comprehensive either. Disappointed!

So I guess the best we can do in DICOM for now, given that the installed base (especially of ultrasound devices) probably use (R-00317, SRT, "Mean") a lot, is to add text that says when we use that code, we really "mean" "mean" in the sense of "arithmetic mean", and not the more generic concept of other things called "mean", and add a new code that is explicitly "geometric mean". Perhaps SNOMED will add a new "concept" for "geometric mean" on request and/or improve their "numerical descriptors" hierarchy, but in the interim either the NCI Thesaurus term NCIt:C94906 or the UMLS entry UMLS:C2986759 would seem to be adequate for our purposes. Sadly, the more formal ontologies have not been helpful in this respect, at least the one I could find anyway.

Maybe we should also be extremely naughty and replace all uses of (R-00317, SRT, "Mean") in the DICOM Standard with (R-00317, SRT, "Arithmetic mean"), just to be sure there is no ambiguity in the DICOM usage (and suggest to SNOMED that they add it as an alternative term). This would be less disruptive to the DICOM installed base than replacing the inadequately defined SNOMED code with the precisely defined NCI Thesaurus code.
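
A rough sketch of why relabelling is the less disruptive option, assuming (as is the usual DICOM expectation) that receivers match coded entries on the code value and coding scheme and treat the code meaning as display text only. The `CodedConcept` class and its `same_concept` method are hypothetical, and the "Geometric mean" label attached to the NCI Thesaurus code is my own wording, not a quoted term.

```python
from typing import NamedTuple


class CodedConcept(NamedTuple):
    value: str     # code value, e.g. "R-00317"
    scheme: str    # coding scheme designator, e.g. "SRT"
    meaning: str   # human-readable code meaning, for display only

    def same_concept(self, other: "CodedConcept") -> bool:
        # Match on value and scheme only; the meaning string is ignored (assumed behavior).
        return (self.value, self.scheme) == (other.value, other.scheme)


installed = CodedConcept("R-00317", "SRT", "Mean")
relabelled = CodedConcept("R-00317", "SRT", "Arithmetic mean")
geometric = CodedConcept("C94906", "NCIt", "Geometric mean")  # label is illustrative

print(installed.same_concept(relabelled))  # True: old devices and new text still agree
print(installed.same_concept(geometric))   # False: a genuinely different concept
```

Any receiver that (wrongly) compares the meaning string as well would of course see the relabelled code as something new, which is part of why changing it would be naughty.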


PS. I italicize "concept" because there is debate over what SNOMED historically and currently defines "concept" to be, quite apart from the philosophical distinctions made by "realist" and "idealist" ontologists (or is it "nominalists" and "conceptualists"). I guess you know you are in trouble when you invoke Aristotle. Sort of like invoking Lincoln I suppose (sounds better when James McEachin says it).

Sunday, October 26, 2014

Keeping up with Mac Java - Bundling into Executable Apps

Summary: Packaging a Java application into an executable Mac bundle is not difficult, but has changed over time; JavaApplicationStub is replaced by JavaAppLauncher; manually building the package content files and hand editing the Info.plist is straightforward, but the organization and properties have changed. Still irritating that JWS/JNLP does not work properly in Safari.

Long Version.

I have long been a fan of Macs and of Java, and I have a pathological aversion to writing single-platform code, if for no other reason than my favorite platforms tend to vanish without much notice. Since I am a command-line weenie, use XCode only for text editing and never bother much with "integrated development environments" (since they tend to vanish too), I am also a fan of "make", and tend to use it in preference to "ant" for big projects. I am sure "ant" is really cool but editing all those build.xml files just doesn't appeal to me. This probably drives the users of my source code crazy, but c'est la vie.

The relevance of the foregoing is that my Neanderthal approach makes keeping up with Apple's and Oracle's changes to the way in which Java is developed and deployed on the Mac a bit of a challenge. I do need to keep up, because my primary development platform is my Mac laptop, since it has the best of all three "worlds" running on it, the Mac stuff, the Unix stuff and the Windows stuff (under Parallels), and I want my tools to be as useful to as many folks as possible, irrespective of their platform of choice (or that which is inflicted upon them).

Most of the tools in my PixelMed DICOM toolkit, for example, are intended to be run from the command line, but occasionally I try to make something vaguely useful with a user interface (not my forte), like the DoseUtility or DicomCleaner. I deploy these as Java Web Start, which fortunately continues to work fine for Windows, as well as for Firefox users on any platform, but since an unfortunate "security fix" from Apple, is not so great in Safari anymore (it downloads the JNLP file, which you have to go find and open manually, rather than automatically starting; blech!). I haven't been able to find a way to restore JNLP files to the "CoreTypes safe list", since the "XProtect.meta.plist" and "XProtect.plist" files in "/System/Library/CoreServices/CoreTypes.bundle/Contents/Resources/" don't seem to be responsible for this undesirable change in behavior, and I haven't found an editable file that is yet.

Since not everyone likes JWS, and in some deployment environments it is disabled, I have for a while now also been creating selected downloadable executable bundles, both for Windows and the Mac.

Once upon a time, the way to build Mac applications was with a tool that Apple supplied called "jarbundler". This did the work of populating the tree of files that constitute a Mac application "bundle"; every Mac application is really a folder called "", and it contains various property files and resources, etc., including a binary executable file. In the pre-Oracle days, when Apple supplied its own flavor of Java, the necessary binary file was "JavaApplicationStub", and jarbundler would stuff that into the necessary place when it ran. There is obsolete documentation of this still available from Apple.

Having used jarbundler once, to see what folder structure it made, I stopped using it and just manually cut and pasted stuff into the right places for each new application, and mirrored what jarbundler did to the Info.plist file when JVM options needed to be added (such as to control the heap size), and populated the resources with the appropriate jar files, updated the classpaths in Info.plist, etc. Automating updates to such predefined structures in the Makefiles was trivial. Since I was using very little if anything that was Apple-JRE specific in my work, when Apple stopped doing the JRE and Oracle took over, it had very little impact on my process. So now I am in the habit of using various bleeding edge OpenJDK versions depending on the phase of the moon, and everything still seems to work just fine (putting aside changes in the appearance and performance of graphics, a story for another day).

Even though I have been compiling to target the 1.5 JVM for a long time, just in case anybody was still on such an old unsupported JRE, I finally decided to bite the bullet and switch to 1.7. This seemed sensible when I noticed that Java 9 (with which I was experimenting) would no longer compile to such an old target. After monkeying around with the relevant javac options (-target, -source, and -bootclasspath) to silence various (important) warnings, everything seemed good to go.

Until I copied one of these 1.7 targeted jar files into a Mac application bundle, and thought hey, why not rev up the JVMVersion property from "1.5+" to "1.7+"? Then it didn't work anymore and gave me a warning about "unsupported versions".
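
For the curious, the kind of Info.plist edit involved can be sketched as follows. This is only an illustration of the property change, not a fix for the warning, and not how my Makefiles actually do it. It assumes the legacy JavaApplicationStub-style layout in which a "Java" dictionary holds "JVMVersion" and "VMOptions" keys; the sample plist content, the bundle name and the `update_java_settings` helper are all hypothetical.

```python
import plistlib

# Hypothetical minimal Info.plist; the "Java" dictionary and its keys are an
# assumption about the legacy layout, not copied from a real bundle.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>CFBundleName</key>
    <string>ExampleApp</string>
    <key>Java</key>
    <dict>
        <key>JVMVersion</key>
        <string>1.5+</string>
        <key>VMOptions</key>
        <string>-Xmx512m</string>
    </dict>
</dict>
</plist>
"""


def update_java_settings(plist_bytes, jvm_version=None, vm_options=None):
    """Return new plist bytes with the requested Java settings changed."""
    info = plistlib.loads(plist_bytes)
    java = info.setdefault("Java", {})      # create the dictionary if it is absent
    if jvm_version is not None:
        java["JVMVersion"] = jvm_version    # e.g. "1.7+"
    if vm_options is not None:
        java["VMOptions"] = vm_options      # e.g. a heap size option
    return plistlib.dumps(info)             # serialized as XML by default


updated = update_java_settings(SAMPLE, jvm_version="1.7+", vm_options="-Xmx1g")
print(plistlib.loads(updated)["Java"])
```

Reading and writing through a plist parser rather than with text substitution at least guarantees the file stays well-formed, whatever the launcher then makes of its content.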

Up to this point, for years I had been smugly ignoring all sorts of anguished messages on the Mac Java mailing list about some new tool called "appbundler" described by Oracle, and the Apple policy that executable apps could no longer depend on the installed JRE, but instead had to be bundled with their own complete copy of the appropriate JRE (see this link). I was content being a fat dumb and happy ostrich, since things were working fine for me, at least as soon as I disabled all that Gatekeeper nonsense by allowing apps from "anywhere" to run (i.e., not just from the App Store, and without signatures), which I do routinely.

So, when my exposed ostrich butt got bitten by my 1.7 target changes (or whatever other incidental change was responsible), I finally realized that I had to either deal with this properly, or give up on using and sharing Mac executables. Since I have no idea how many, if any, users of my tools are dependent on these executables (I suspect not many), giving up wouldn't have been so bad, except that (a) I don't like to give up so easily, and (b) occasionally the bundled applications are useful to me, since they support such things as putting them in the Dock, dragging and dropping to an icon, etc.

How hard can this be, I thought? Just run appbundler, right? Well, it turns out that appbundler depends on using ant, which I don't normally use, and its configuration out of the box doesn't seem to handle the JVM options I wanted to specify. It is available for download, along with its documentation. I noticed it seemed to be a little old (two years) and doesn't seem to be actively maintained by Oracle, which is a bit worrying. It turns out there is a fork of it that is maintained by others (infinitekind) that has more configuration options, but this all seemed to be getting a little more complicated than I wanted to have to deal with. I found a post from Michael Hall on the Mac Java developers mailing list that mentioned a tool he had written, AppConverter, which would supposedly convert the old to the new. Sounded just like what I needed. Unfortunately, it did nothing when I tried it (did not respond to a drag and drop of an app bundle as promised).

I was a bit bummed at this point, since it looked like I was going to have to trawl through the source of one of the appbundler variants or AppConverter, but then I decided I would first try and just cheat, and see if I could find an example of an already bundled Java app, and copy it.

AppConverter turned out to be useful after all, if only to provide a template for me to copy, since when I opened it up to show the Package Contents, sure enough, it was a Java application, contained a copy of the java binary executable JavaAppLauncher, which is what is used now instead of JavaApplicationStub, and had an Info.plist that showed what was necessary. In addition, it was apparent that the folder where the jar files go has moved, from being in "Contents/Resources/Java" to "Contents/Java" (and various posts on the Mac Java developers mailing list mentioned that too).

So, with a bit of manual editing of the file structure and the Info.plist, and copying the JavaAppLauncher out of AppConverter, I got it to work just fine, without the need to figure out how to run and configure appbundler.
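For what it is worth, the file shuffling part of this could be scripted. What follows is a rough sketch rather than what I actually did (I edited by hand). It assumes the usual bundle layout in which the executable lives in "Contents/MacOS", and the function name and arguments are made up for illustration. It moves the jar files from "Contents/Resources/Java" to "Contents/Java" and swaps JavaApplicationStub for a JavaAppLauncher copied from somewhere else; the Info.plist still has to be updated separately.

```python
import os
import shutil


def convert_bundle_layout(app_dir, launcher_path):
    """Rearrange an old-style Java app bundle into the newer layout.

    app_dir is the path to the .app folder; launcher_path is a
    JavaAppLauncher binary obtained elsewhere (e.g. copied from
    another bundled application).
    """
    contents = os.path.join(app_dir, "Contents")
    old_java = os.path.join(contents, "Resources", "Java")
    new_java = os.path.join(contents, "Java")
    macos = os.path.join(contents, "MacOS")  # assumed location of the executable

    # Move the jar files from the old location to the new one.
    if os.path.isdir(old_java):
        os.makedirs(new_java, exist_ok=True)
        for name in os.listdir(old_java):
            shutil.move(os.path.join(old_java, name),
                        os.path.join(new_java, name))
        os.rmdir(old_java)

    # Remove the old stub, if present, and install the new launcher.
    os.makedirs(macos, exist_ok=True)
    old_stub = os.path.join(macos, "JavaApplicationStub")
    if os.path.exists(old_stub):
        os.remove(old_stub)
    new_launcher = os.path.join(macos, "JavaAppLauncher")
    shutil.copyfile(launcher_path, new_launcher)

    # copyfile does not preserve permissions, so set the executable bits.
    os.chmod(new_launcher, 0o755)
    return new_launcher
```

The explicit chmod is there because a launcher that is not marked executable cannot be started, which is easy to overlook when the binary has been extracted from an archive.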

By way of example, here is the Package Contents of DicomCleaner the old way:

and here it is the new way:

And here is the old Info.plist:

and here is the new Info.plist:

Note that it is no longer necessary to specify the classpath (not even sure how to); apparently the JavaAppLauncher adds everything in Contents/Java to the classpath automatically.

Rather than keeping all the Java properties under a single Java key, the JavaAppLauncher seems to use a JVMMainClassName key instead of Java/MainClass, and JVMOptions instead of Java/VMOptions. Also, I found that in the absence of a specific Java/Properties/apple.laf.useScreenMenuBar key, setting the same property through another item in JVMOptions would work.
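The key renaming is mechanical enough to sketch in a few lines. This is an illustration under some assumptions, not a tested converter: it assumes the old Java/VMOptions value may be either a single string or an array, that each entry in Java/Properties can be passed as a "-Dname=value" item in JVMOptions, and that CFBundleExecutable should name the new launcher. The old ClassPath and JVMVersion entries are simply dropped, and the function names are hypothetical.

```python
import plistlib


def convert_info_plist(old):
    """Return a new-style Info.plist dict built from an old-style one."""
    # Keep every top-level key except the old "Java" dictionary.
    new = {key: value for key, value in old.items() if key != "Java"}
    java = old.get("Java", {})

    # The bundle executable is now the new launcher (assumption: same name
    # as the binary copied into the bundle).
    new["CFBundleExecutable"] = "JavaAppLauncher"

    if "MainClass" in java:
        new["JVMMainClassName"] = java["MainClass"]

    # Old VMOptions may be one space-separated string or an array of strings.
    vm_options = java.get("VMOptions", [])
    if isinstance(vm_options, str):
        vm_options = vm_options.split()
    options = list(vm_options)

    # Assumption: old Java/Properties entries become -D system properties.
    for name, value in sorted(java.get("Properties", {}).items()):
        options.append("-D%s=%s" % (name, value))

    if options:
        new["JVMOptions"] = options
    return new


def convert_info_plist_file(path):
    """Rewrite the Info.plist at path in place using convert_info_plist."""
    with open(path, "rb") as f:
        old = plistlib.load(f)
    with open(path, "wb") as f:
        plistlib.dump(convert_info_plist(old), f)
```

Keeping the conversion as a function on plain dictionaries makes it easy to inspect the result before writing anything back to the bundle.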

Why whoever wrote appbundler thought that they had to introduce these gratuitous inconsistencies, when they could have perpetuated the old Package Content structure and Java/Properties easily enough, I have no idea, but at least the structure is sufficiently "obvious" so as to permit morphing one to the other.

Though I had propagated various properties that jarbundler had originally included, and added one that AppConverter had used (Bundle display name), I was interested to know just what the minimal set was, so I started removing stuff to see if it would keep working, and sure enough it would. Here is the bare minimum that "works" (assuming you don't need any JVM options, don't care what name is displayed in the top line and despite the Apple documentation's list of "required" properties):

To reiterate, I used the JavaAppLauncher copied out of AppConverter, because it worked, and it wasn't obvious where to get it "officially".

I did try copying the JavaAppLauncher binary that is present at "com/oracle/appbundler/JavaAppLauncher" in appbundler-1.0.jar, but for some reason that didn't work. I also poked around inside javapackager (vide infra), and extracted "com/oracle/tools/packager/mac/JavaAppLauncher" from the JDK's "lib/ant-javafx.jar", but that didn't work either (it reported " ... Job failed to exec(3) for weird reason: 13"), so I will give up for now and stick with what works.

It would be nice to have an "official" source for JavaAppLauncher though.

In case it has any impact, I was using OS 10.8.5 and JDK 1.8.0_40-ea whilst doing these experiments.


PS. What I have not done is figure out how to include a bundled JRE, since I haven't had a need to do this myself yet (and am not motivated to bother with the AppStore), but I dare say it should be easy enough to find another example and copy it. I did find what looks like a fairly thorough description in this blog entry by Danno Ferrin about getting stuff ready for the AppStore.

PPS. I will refrain from (much) editorial comment about the pros and cons of requiring an embedded JRE in every tiny app; suffice it to say I haven't found many reasons to do it, except for turnkey applications (such as on a CD), where I do this on Windows a bit, just because one can. I am happy Apple/Oracle have enabled it, but surprised that Apple mandated it (for the AppStore).

PPPS. There is apparently also something from Oracle called "javafxpackager", which is pretty well documented, and which is supposed to be able to package non-FX apps as well, but I haven't tried it. Learning it looked more complicated than just doing it by hand. Digging deeper, it seems that this has been renamed to just "javapackager" and is distributed with current JDKs.

PPPPS. There is apparently an effort to develop a binary app that works with either the Apple or Oracle Package Contents and Info.plist properties, called "universalJavaApplicationStub", but I haven't tried that either.