Sunday, June 17, 2007

Where to get images for research and testing - Public collections, routine re-use, and the possibility of direct patient contributions

Summary: Large useful collections of publicly accessible medical images for testing and research are few in number; despite public initiatives to build such collections, progress is slow though improving; the additional possibility of having individual members of the public contribute their own images and data directly has been raised; logistic and legal concerns are significant but surmountable, and there would seem to be few privacy and human research regulation issues.

Long version:

I have long fantasized about the existence of a large collection of complete sets of images suitable for research and testing purposes, whether for testing image pixel data in studies of compression, display, analysis and the like, or for more mundane tasks like checking for DICOM compliance or testing DICOM-capable tools like PACS and workstations against the installed base of equipment. Indeed, I first developed an interest in DICOM in the early nineties not for clinical interchange, but as a means of formatting and organizing my own teaching and research collections. Little did I know where that would lead !

Traditionally in academic research studies, one begins with a laborious exercise of collecting patient-related images prospectively or retrospectively; this often involves multi-site collaboration, approval by Institutional Review Boards (IRBs), etc.; this is very expensive, time consuming, and frankly, beyond the capabilities of many scientists, engineers, programmers and students who just want to test their ideas, algorithms and code. Further, the folks who need the images may not have the academic affiliations, credibility or stature to even get to first base as far as funding or approval is concerned.

Some of us are fortunate enough to be actively engaged in large scale multi-center clinical trials and industry testing collaborations and we can often find ways of re-purposing and reusing images gathered for other purposes, with the appropriate approvals and permissions. This avenue is not open to many folks who need images though. Some of the NIH folks are keen to remedy this problem by recruiting images from other studies and making them publicly accessible via such mechanisms as the National Cancer Image Archive (NCIA) and the Alzheimer's Disease Neuroimaging Initiative (ADNI) projects, to name just two of several. These projects emphasize the importance of gathering not just any images, but complete sets, in a relatively homogeneous manner with respect to acquisition protocol, at multiple time points in the course of diseases that need to be followed over time, and with additional related data, such as experts' assessment of lesion location and outcome, and historical data where relevant. Such efforts still require significant resources and involve sometimes difficult negotiations with respect to funding and permission.

Another option that I have considered in the past is to somehow capture images and associated information as a "side effect" of routine clinical use. For example, many facilities are partially or totally digital already, with respect to images, diagnosis codes and reports if not the entire medical record. Further, many such sites already use "off-site storage" provided by third-parties either as their primary archive or to support disaster recovery. Would it be a difficult step to go a little further and automatically collect and de-identify all such image and related data and make it publicly available for research ?

From a legal perspective, possibly all it would take would be for the facility to add consent and authorization for such routine (as opposed to prospectively identified) re-use purposes; however, each IRB would undoubtedly weigh in with policy and risk-management related issues that might be difficult to get past. And frankly, many physicians might feel threatened by releasing what they otherwise consider their proprietary material, which potentially provides them with a competitive advantage with respect to grant applications and publishing papers. To put it another way, one would need to provide a facility with one hell of an incentive to get past the obstacles that naysayers might raise.

One such incentive might be to provide free or really cheap storage; how many CFOs or CIOs would drool over the possibility of reducing or eliminating bulk data storage costs if a third party (such as a non-profit organization established for the benefit of the public research community) were to underwrite these costs, on the proviso that the data in de-identified form be made available ? Such an incentive might serve to significantly undermine any opposition within an institution. It might be possible to leverage the capabilities of existing commercial providers of off-site archives, who could offer a reduced price for such data sets. Conversely, however, less well-intentioned folks might see this as a commercial opportunity and explore the possibility of selling the data instead of making it publicly available for free.

Some existing archive providers also provide the opportunity for patients to contribute and maintain their own images, allowing access to their health care providers as appropriate, myNDMA being an example (though I noticed as I was researching this post that myNDMA are "accepting no new registrations at this time"). Patient empowerment and patient-centric control of one's own destiny is perhaps a concept whose time has come, though obviously only a subset of the population will be willing to or capable of taking on such responsibility. An example of extending this concept to one's entire record is the MedCommons project.

On a previous occasion, frustrated by the difficulty of getting images from a broad range of installed modalities to test DICOM software, I had considered setting up a publicly accessible archive that would also allow anybody from the public at large to contribute. My plan was to canvass the community of digital imaging and PACS users as well as ordinary people undergoing imaging to submit material that I would then de-identify and make available for testing. At the time my primary interest was in the "DICOM-ishness" of the data and not the research applications, though I was interested in complete sets rather than individual images. I did not pursue this, since at about the same time NEMA was initiating an effort to gather images from modality vendors for similar sorts of testing (the NEMA DICOM Object Library). However I was sorely disappointed when, despite my strong protests, the NEMA vendors decided to keep this a closed and secret database not accessible to non-NEMA members or the public, which it remains to this date. Bet you didn't even know about it, did you ?

However, I was reminded of the possibility of direct patient contribution to image archives at a recent Cancer Research and Prevention Foundation Lung Cancer Workshop, during which the concept of approaching patients, people undergoing screening, and survivors for image contributions was raised. A lively conversation among the participants ensued, led by Jim Mulshine, David Yankelevitz and Rick Avila. In essence, most of the attendees were quite excited by this concept, particularly since there is an opportunity to leverage the good will of the survivor-driven charitable organizations to organize and promote such an activity. Kitware has kindly volunteered to coordinate some of this work and you can follow along on their Wiki once it gets under way. Though this was discussed in the context of lung cancer, and particularly with respect to gathering images for CAD testing and validation, the concept is obviously generalizable.

For example, in the absence of a good publicly available collection of images for digital (as opposed to digitized) mammography image compression research, one might consider attempting to build such a collection with the assistance of contributions from individual women. One of the obvious problems with this is the relatively low prevalence of disease; i.e., one might receive far more normal contributions than abnormal, which makes performing research on disease-enriched data more difficult, or conversely, means storing and curating a large amount of data for a relatively low yield of useful information. However, unlike the unfortunate situation for lung cancer, a far higher proportion of women either have a negative biopsy or survive their disease, and potentially a high yield of images with positive findings could be obtained from this group.

Another problem is the matter of gathering additional outcome data; for many types of experiment it is necessary to have some knowledge of the truth beyond what can be ascertained from the images themselves. Contribution of pathology reports and/or follow-up images would be desirable. The former presents problems in that these reports are less often accessible to patients (or screening participants) in digital form, though perhaps they could be scanned or faxed. The latter might be contributed on a separate occasion, but if de-identified, how are they to be linked to the same (anonymized) individual ?

In general, the problem of reliable de-identification and anonymization (or pseudonymization) on a large scale is hard. Sure, one can clean the DICOM header information well enough, especially if one can discard most of the descriptive string and private attributes without affecting reuse, though even that is non-trivial in the general case. Identification burned into the pixel data can at least be detected in a subset of images (by automated algorithms examining header patterns as well as OCR-like analysis of the pixel data), which can then be sequestered for manual review. Anything that is not an image though, such as a scanned or faxed document, or even a PDF, HL7 plain text or DICOM structured report, will likely require manual (and hence error-prone) attention. The resource burden of manual de-identification (and the QC process to check on it) is not to be underestimated.
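
To make the header-cleaning step concrete, here is a minimal sketch using the pydicom library; the attribute list and pseudonym handling are illustrative assumptions only, since a real de-identification profile involves a far longer list of attributes, and this does nothing about identification burned into the pixel data.

```python
# Minimal sketch of DICOM header de-identification using pydicom.
# The attribute list below is illustrative, NOT complete; a real
# profile would be far more extensive, and burned-in pixel data
# is not addressed here at all.
import pydicom

# A few attributes that obviously carry identity.
IDENTIFYING_KEYWORDS = [
    "PatientName", "PatientID", "PatientBirthDate", "PatientAddress",
    "OtherPatientIDs", "AccessionNumber", "InstitutionName",
    "ReferringPhysicianName", "OperatorsName",
]

def deidentify(in_path, out_path, pseudonym):
    ds = pydicom.dcmread(in_path)
    # Discard private attributes wholesale, since their content is unknown.
    ds.remove_private_tags()
    # Blank the known identifying string attributes.
    for keyword in IDENTIFYING_KEYWORDS:
        if keyword in ds:
            ds.data_element(keyword).value = ""
    # Re-identify consistently under a pseudonym so that complete
    # sets from the same subject remain associated after cleaning.
    ds.PatientName = pseudonym
    ds.PatientID = pseudonym
    ds.save_as(out_path)
```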

One approach would be to have the contributors themselves actually perform the de-identification by providing them with an appropriate web-deployed tool with which to contribute, view and edit the content; that way they could both do the work and absolve the archiver from future responsibility in this respect. Indeed, if all the work were performed client-side, the central server would never need to have access to or knowledge of the actual Protected Health Information (PHI), which might considerably simplify the necessary security measures. Continuity across contributions would be more difficult, but could be achieved with some sort of registration or identity hash based mechanism, as sketched below. It would be a shame if this additional burden were to prove a disincentive to contribute, though.
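
As a sketch of one such identity hash based mechanism (the function and its inputs are my assumptions, not an established scheme): the pseudonym is derived client-side from identifiers the contributor knows plus a secret only they hold, so the server can link serial contributions from the same subject without ever receiving the PHI itself.

```python
# Sketch of a client-side identity hash: the server sees only the
# digest, never the name, ID or passphrase from which it is derived.
# The contributor-held passphrase prevents anyone from recovering the
# pseudonym by hashing dictionaries of common names and identifiers.
import hashlib

def make_pseudonym(patient_name, patient_id, passphrase):
    """Derive a stable pseudonym; the same inputs on a later occasion
    yield the same value, so follow-up contributions can be linked to
    the same (anonymized) individual."""
    material = "|".join([patient_name, patient_id, passphrase])
    return "SUBJ_" + hashlib.sha256(material.encode("utf-8")).hexdigest()[:16]

# Two contributions on different occasions link together.
first = make_pseudonym("DOE^JANE", "123456", "only she knows this")
later = make_pseudonym("DOE^JANE", "123456", "only she knows this")
assert first == later  # same subject, same pseudonym, no PHI disclosed
```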

Thorough de-identification in the general case remains non-trivial though, especially if one goes so far as to consider facial information possibly recognizable from a 3D rendering of images of the head; there are means to disrupt the data to prevent this, but that would make it useless for many (though not all) potential future uses. Though trials on the matter of recognizability are currently under way, there is no consensus on this yet, and perhaps it would be easiest just to have the contributor consent around this issue.

Indeed, on the matter of consent, this might be more challenging than all the procedural and technical and resource issues put together. One would have to be sure that the contribution agreement would stand legal scrutiny, cover all potential uses of the data, irrevocably, and allow for the archive maintainer to disclaim any liability. Liability might include not only privacy concerns, but also responsibility to feed back any findings with respect to the data to the contributor. For example, in the case of CAD testing, one would not want the contributor to have the (unrealistic) expectation that if a future CAD experiment found something undesirable that they would receive feedback that would impact their care. Such an agreement would somehow need to be "signed", presumably, to have any legal standing, and a mechanism to do this via the web at the time of contribution and to archive the signature would be necessary.

Note that I distinguish the matter of the individual contribution agreement with respect to permission and liability from the matter of permission from others. To my knowledge, at least in the US, there are no regulations that would govern the establishment of such a repository of images. Whilst the HIPAA Privacy and Security rules might provide helpful guidance, the repository would not in and of itself be a Covered Entity, and hence would not be subject to the rules. Further, since contributions would be directly from individuals rather than Covered Entities, no HIPAA provisions on the sending side would come into play.

Would some form of IRB approval be required, either to contribute, maintain or to use any of the data ? The US federal regulation on Protection of Human Subjects, which potentially applies to federally funded activities, specifically exempts "research involving the collection or study of existing data ... if these sources are publicly available or if the information is recorded ... in such a manner that subjects cannot be identified ..." (45 CFR 46.101(b)(4)).

However, whilst there might be no formal need for IRB approval, review of the policies and procedures and agreements by some form of central IRB might well be worthwhile to mitigate any concern that the rights of the contributors might be abused. Perhaps the NCI's Central IRB (CIRB) Initiative might be willing to take on this responsibility. One could envisage drafting a set of standard "open source" pre-approved documents that would allow any number of willing organizations to implement and replicate this strategy.

This is of course a somewhat US-centric view of the privacy and human research situation biased by my own experience; since any such repository might be open to global contributions, a further analysis of the issues in other countries is desirable.

But the bottom line is that there would seem to be few if any restrictions on a person who has access to their own record in electronic form to use it in any manner they see fit, and hence to contribute it to such a research collection for the public good. Whilst one may debate who actually "owns" the data, I hope few would be so crass as to attempt to restrict an individual's use of their own personal data in such a manner.

What remains now is for those of us who see merit in this approach to take action to make it happen, and in such a manner that the data becomes useful in advancing the state of the art.

David

Sunday, June 3, 2007

On the lack of DICOM Police, the example of IHE content profiles, and the need for usability standards and cross-certification ...

Summary: Neither DICOM nor IHE may be sufficient to solve users' real world problems with usability of imaging devices, neither a hypothetical DICOM police nor the existing IHE Connectathon process would solve this problem, and there may be a need for a new type of "usability" standard and certification process, even to the extent of cross-certification of combinations of devices.

Long version:

As everyone is fond of saying, there are no "DICOM police".

NEMA, for example, specifically disclaims responsibility for policing or enforcing compliance to the DICOM standard. There is, for example, no DICOM "certification".

Nor is there an "IHE police", nor, for the time being, IHE "certification".

Some folks are under the mistaken impression that successful participation at an IHE Connectathon represents some sort of certification, but what is tested at IHE is not necessarily a product and may be a prototype, and often is not representative of what you can go out and buy, now, or ever. Furthermore, the IHE tests are limited in scope and depth, not only by the bounds of the "profiles" being tested but also by the rigor of the tests themselves. For example, though vendors may demonstrate transfer of images within a specified workflow with the correct identifiers during the Connectathon, whether those images will be usable in any meaningful fashion by the receiver is not tested. These issues may be addressed over time as the IHE testing approach matures and is revised, and more "content" profiles like NM and Mammo are developed and tested. The Connectathon is a fantastic cooperative effort and an enormous investment of time and resources that results in considerable progress, but the fact remains that products are not certified during this process.

Hence the publicly posted "Connectathon Results" are only a guide to what vendors might or might not choose to make available as product, and one is left to rely on so-called "self-certification" by the vendors. Vendors dutifully provide DICOM Conformance Statements and IHE Integration Statements, which both guide users with respect to what features are supposed to be available and outline what a product is supposed to do, but it seems that not infrequently products remain deficient in some small or significant way, either with respect to what is claimed, or even with respect to correct implementation of the underlying standard.

Who then, will police the compliance of the vendor in this respect? Currently, this is left to the users, or the experts with whom they consult. The vendors mostly appear to act in good faith, but when problems arise some are none too swift to acknowledge that they are at fault or to provide a solution.

But even if there were a DICOM (or IHE) police, would it actually help the users ?

Take for example the matter of compliance with the standard with respect to the encoding of images for a particular modality, say projection X-ray using the DICOM DX image object. Consider a frontal chest X-ray, which, depending on whether it is taken AP or PA, might, from the perspective of the pixels read out from the detector, have the left or the right side of the patient oriented towards the right side of the image. Now, the DICOM standard does NOT say that the transmitted image must be oriented in any particular manner; rather it says that the orientation of the rows and columns must be sent. In this case the row orientation would be sent either as towards the patient's left, meaning that the pixel data, if rendered that way, would look the way (most) radiologists would expect, or as towards the patient's right, meaning that the receiver could use this orientation to flip the image into the expected orientation.

And therein lies the rub, since no standard, DICOM or IHE, currently requires that the receiver flip the image into the "desired" orientation for display based on the encoded orientation parameters. So a completely DICOM (and IHE) compliant storage SCU (Acquisition Modality actor) could encode an image in one orientation, and a DICOM (and IHE) compliant storage SCP (Image Display actor) could display it, and the user would still be unsatisfied and have to manually flip the image. No DICOM (or IHE) police or certification or anything else would be able to solve this problem for the user, beyond explaining it.
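
To illustrate, here is a minimal sketch of the receiver-side flip logic that no standard currently mandates, assuming pydicom and the simple case in which Patient Orientation (0020,0020) carries only the primary direction letters:

```python
# Sketch of display-side logic that DICOM does not require: use
# Patient Orientation (0020,0020) to flip a frontal chest image so
# that the patient's left appears on the viewer's right.
import numpy as np
import pydicom

def flip_for_display(ds):
    """Return the pixel array flipped, if necessary, so that rows run
    towards the patient's left and columns towards the feet, the
    conventional presentation for a frontal chest."""
    pixels = ds.pixel_array
    row_dir, col_dir = ds.PatientOrientation  # e.g., ["R", "F"]
    if row_dir.startswith("R"):      # rows run towards patient's right
        pixels = np.fliplr(pixels)   # flip horizontally
    if col_dir.startswith("H"):      # columns run towards the head
        pixels = np.flipud(pixels)   # flip vertically
    return pixels
```

Nothing here is difficult; the point is that nothing obliges the Image Display actor to do it.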

Conversely, if the modality were not to send the orientation at all, violating the DICOM standard in this respect, then as long as the pixels happened to be oriented correctly the user experience would be satisfactory, and no problem would be perceived (except perhaps for the absence of an orientation marker to indicate the side). Indeed this would typically be the case for devices that use the older CR image object in DICOM, which allows the orientation to be empty, ostensibly on the grounds that sometimes it won't be known (e.g., there is a plate reader but no means for the operator to enter this information on the QC workstation, if there is one).

The acquisition modality vendors may solve this problem by making the sending device configurable in such a manner as to "flip" the images as necessary to give the expected result at the other end, either automatically or with the assistance of the operator, but the fact remains that this sort of configurability is not required by the standards.

Another example would be the matter of display shutters, such as to blank out the perimeter around a circular or rectangular angiography or RF acquisition, so that it remains black regardless of whether the image is inverted or not. The DICOM standard defines their existence encoded within an image, but does not mandate their application by the display (unlike in a presentation state). I was recently reminded of this when there was a compatibility issue between one vendor's acquisition device and another's PACS. The modality was sending a display shutter and the PACS was ignoring it, and the resulting white background was unacceptable to the user. A modality vendor would typically provide a configuration option to burn the background in as black in this case and thereby handle the lame PACS (resulting in white when inverted, but you can't configure around everything), but this particular modality did not have that feature. The PACS vendor had, I am told, only just released display shutter capability in a new and expensive release, so the user was essentially out of luck. Again, there would be no help from the DICOM police in this regard, assuming they could only act within the bounds of the "law" (what is written in the standard). Furthermore, it is very difficult to ascertain a priori from conformance statements what is possible in these situations, there typically being little if any documentation of the scope of configuration possible on the sending end, or of the display behavior on the receiving end.
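
For the circular case, applying the shutter at the receiving end amounts to nothing more than masking everything outside the circle, along the lines of the following sketch (assuming pydicom; a robust implementation would also handle the RECTANGULAR and POLYGONAL shapes, and combinations thereof):

```python
# Sketch of what the PACS in this anecdote failed to do: apply a
# CIRCULAR display shutter from the image header so that the
# periphery stays at a fixed value no matter how the grayscale
# is subsequently inverted.
import numpy as np
import pydicom

def apply_circular_shutter(ds, background=0):
    """Mask everything outside the circular shutter; Shutter
    Presentation Value (0018,1622), if present, supplies the
    background instead of the assumed default of 0."""
    pixels = ds.pixel_array.copy()
    if "CIRCULAR" in ds.get("ShutterShape", ""):
        center_row, center_col = ds.CenterOfCircularShutter
        radius = ds.RadiusOfCircularShutter
        background = ds.get("ShutterPresentationValue", background)
        rows, cols = np.ogrid[:pixels.shape[0], :pixels.shape[1]]
        outside = (rows - center_row) ** 2 + (cols - center_col) ** 2 > radius ** 2
        pixels[outside] = background
    return pixels
```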

So, one is inevitably led to the conclusion that the standards are insufficient to satisfy the users' needs in this regard, and that DICOM police or certification, whilst arguably necessary, would not be sufficient in their own right.

Or to put it another way, there seems to be a need for "usability" standards, perhaps layered on top of the DICOM and IHE standards. This is an area that vendors may be reluctant to address, since such standards might potentially erode what they see as "added value" (though many users might argue the same are "bare necessities"), and are a source of risk in that if vendors fail to offer the complete spectrum of "usability" requirements, their products might be unmarketable.

There are two categories of precedent for this sort of thing that may be relevant. One category includes the IHE Radiology "content" profiles, specifically NM and Mammography; the other is the federally-mandated certification effort, exemplified currently by the Certification Commission for Healthcare Information Technology (CCHIT).

The IHE content profiles differ from much of the other radiology work in IHE in that they are less about workflow and more about modality-display interaction. Anyone with NM experience knows exactly how woeful most general purpose PACS are with respect to handling NM images, whether in terms of providing interface tools with which NM physicians are comfortable, or layout and reconstruction capability appropriate to different types of acquisition, not to mention analytic tools for quantitative measurements, especially cardiac. The NM folks (in the form of the SNM) finally said enough and ultimately decided to work through the IHE framework to achieve their goal. I have little experience in this area, so cannot say to what extent this profile has actually influenced purchasable products or helped the users in the real world, but this effort paved the way for content profiles that specified image display behavior in detail.

The IHE Mammo profile, on the other hand, is one that I was directly involved in. In this case a bunch of very disgruntled users who had faced the realities of owning multiple vendors' FFDM equipment and trying to use it in high volume environments expressed their disappointment at a special SCAR session, which resulted in the formation of a sub-committee in IHE to address the concerns, and ultimately a profile that specified mutually compatible requirements for both modalities and displays.

The process by which the Mammo profile was developed is instructive. First the users expressed their concerns and requirements with respect to real world experience with products; second, the FFDM and dedicated display system vendors admitted that there were problems and expressed willingness to engage in a dialog; third, everyone met together face-to-face to hash out what the priorities were and where there was common ground. There was considerable argument on the fringes, especially with respect to exactly how much application behavior could be standardized or required as a minimum, and which of several competing solutions to choose for particular problems when there existed an installed base of incompatible solutions, but ultimately a reasonable compromise was reached. The users insisted that deployment be swift and arranged a series of public demonstrations at short intervals to ensure that progress was made.

What distinguishes the Mammo profile is that it is very specific about how displays behave and in particular what features they must have, e.g., the ability to display images all at the same size, current versus prior, regardless of vendor and detector pitch, to display true size, to display CAD marks, to annotate in a particular way to meet regulatory and quality requirements, and which DICOM header attributes to use in what manner to implement those features. Further, given the different type of processing and grayscale contrast from the various detectors, the display is required to implement all of the possible grayscale contrast windowing and lookup table mechanisms, not just a vendor-specific subset. I.e., in some cases the vendors agreed to standardize the "intersection" of various different possibilities, and in other cases the "union" of all possible, depending on the impact on the installed base and the usability of the feature.

This cooperative effort seems successful so far, though I am biased in this assessment having been intimately involved. However, is it scalable to more ambitious "content", "functional" or "usability" specifications, either within IHE or elsewhere ?

The mammography effort was made considerably easier by the fact that the digital mammography user and vendor communities are relatively small and tightly focused, if by no other factor than the regulatory burden imposed by MQSA. Everyone knows everyone else, basically everybody gets along and likes one another, and it is hard to take too unreasonable a stance in this group for very long. A certain amount of "cat herding" was required of course, but on a level of difficulty scale of 1 to 10, I would rate this one about a 4.

One risk to scalability is that "users" will not bother to ask for the IHE profile in their RFPs and contracts, and will buy whatever non-compliant lame "mammo package" their existing PACS vendor deigns to offer and force their radiologists to use it. This risk could be mitigated if the FDA were to require that only certified products were used for primary interpretation, but this would be a very special case since mammography is about the only area in which the FDA has authority to regulate the practice of medicine, and is not generally applicable. Other organizations, like JCAHO or third party payors could require certified compliance, but would there be any benefit for them to do so ?

Another risk with respect to generalizing the approach is the lack of interest by users in developing usability standards. The mammography and NM examples were perhaps atypical in that there were highly motivated individuals to champion the cause who devoted enormous amounts of time and energy with the support of their organizations. Is this degree of user involvement likely to be repeatable in other areas where the problems may not be so acutely felt, where the scope is broader, or the problem is larger in scale ?

Likewise, there is the risk that the vendors will be unresponsive to such efforts. Both DICOM and IHE development have been characterized by the active participation (some might say total domination) of vendors and have as a consequence been at least somewhat successful. Externally imposed standards to which there may be outright vendor opposition would be less likely to be successful.

On the subject of scale, it is potentially enormous, if one were to go to the extent of defining the required functionality of an entire PACS with respect to usability of workflow and display. Anyone who has written requirements specifications and test scripts for the implementation of such products is familiar with the level of effort, but then again, since this has already been done internally by vendors many times over, this is not a new experience.

To that end it may be instructive to review the work of CCHIT so far; kick-started by federal funding and a requirement to certify ambulatory EHRs, this effort has produced some interesting materials, even if one is not a fan of the politics involved. On their web site you can find documentation of their process, the functional requirements against which certification takes place, the actual test scripts that are used, as well as the public comments received as these materials were being developed, which give an interesting insight into the vendors' opinion of the process and the expense, as well as the heavy-handedness of the CCHIT.

I have no involvement in this process at all, so can't speak to its success or value so far, and you can read the materials as well as I can. It is interesting though, to review the functionality criteria for an ambulatory EHR and envisage how one might write similar criteria for a PACS. Likewise, to review the test script for these criteria from the perspective of perhaps testing an Image Display with the same approach. To return to the example at the beginning of this entry, one could envisage a criterion for a PACS such as "shall be able to display a frontal chest x-ray rotated or flipped into the correct orientation based on the DICOM orientation description" and a corresponding test script entry with a range of test materials that included images encoded in a manner that required such flipping. This is exactly the sort of testing that we did for the IHE Mammo profile.
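
To give a flavor of what such a test script entry might look like in executable form, here is a hedged sketch; the present() function is a hypothetical stand-in for the display pipeline under test, not actual certification material:

```python
# Sketch of a test script entry in executable form: present frontal
# chest test patterns encoded with each Patient Orientation row
# direction and assert that the displayed result is identical.
import numpy as np

def present(pixels, row_dir):
    """Hypothetical stand-in for the display pipeline under test:
    flip so that rows always run towards the patient's left."""
    return np.fliplr(pixels) if row_dir.startswith("R") else pixels

def test_flip_criterion():
    # An asymmetric test pattern whose correct presentation is known.
    as_acquired_pa = np.arange(12).reshape(3, 4)  # rows towards patient's left
    as_acquired_ap = np.fliplr(as_acquired_pa)    # rows towards patient's right
    # Both encodings must yield the same displayed presentation.
    assert np.array_equal(present(as_acquired_pa, "L"),
                          present(as_acquired_ap, "R"))

test_flip_criterion()
```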

If this were to be done, would self-attestation or self-certification be sufficient or would there need to be in addition external verification and certification such as CCHIT performs ?

Who would require either form of certification ? The users themselves ? The payors ? The government ?

What would be the appropriate organization to perform such work ? Would CCHIT take on imaging or do they have enough on their plate, not to mention no expertise in this area ? Could or would IHE do it, particularly now that it has grown well beyond radiology into other domains that have their own issues and priorities ? Would ACR, who are all very eager to "accredit" modalities, be interested in or capable of this ? SIIM would perhaps be a logical choice, were it not for the apparent influence vendors have on their decision making process about things controversial. How about RSNA, or are they too invested in IHE already to begin a separate effort if one were thought to be necessary ?

Or is there a need for yet another independent organization to do this ? If so who would start it ? Who would run it ? Who would pay for it ?

And ultimately, would "standalone" certification against criteria be of sufficient benefit ? It would be a start, but if there is one thing that the IHE Connectathons have demonstrated it is that the proof is in the testing of multiple devices working together. To that end, does one need an infrastructure to support certification of permutations and combinations of devices inter-operating together, either in a test environment or in the field ?

One could envisage an approach in which the two (or more) vendors involved submitted a "joint application" for certification of a combination evaluated against specific criteria based on the first actual deployment. Funding, implementing, monitoring and promulgating this information would be a challenge, but perhaps not insurmountable.

Imagine in the display shutter example that the forward-thinking purchaser of the PACS had included in their support contract a requirement that the PACS vendor participate in such cross-certification activities as new modalities were acquired by the site; likewise, before accepting the new modality the site would have required the same of the modality vendor. If both had been previously cross-tested satisfactorily they would already be certified, and indeed the purchaser would have known this by consulting the certifying authority's web site; any limitations would have been publicly documented and disclosed. If the particular combination had not been tested, then a first-time test would need to be performed against the certification criteria, supervised by some sort of "designated examiner" trained and licensed by the certifying authority. The result, whether successful or not, would be promulgated in full. Fees to cover the cost would be payable by the pair of vendors, and they would recover this in their service contracts or purchase price. If one or other of the vendors refused to participate then the user could still execute the (publicly available) test script themselves at their own expense, with or without an examiner, the results could still be promulgated with or without either vendor's prior approval, and failure might be a clue to the user not to accept the modality or to plan to replace their PACS.

So we have come full circle, in that this is exactly the sort of paradigm that the IHE Connectathon supports. Except that it would involve products rather than experimental or prototype systems, the details of test script execution would be fully public, rather than being reduced to a simple pass/fail or prevented from disclosure by confidentiality agreements, a considerably more comprehensive range of old and new products would be tested, the result would be a formal certification, the criteria would be at a level that addressed functionality and usability, not just message transfer and workflow, and the users and sites could specify certifications as criteria in their purchase and support contracts.

Or perhaps, the "great learning experience" for engineers, which is essentially what the Connectathon is, could be translated into a formalized process of direct, rather than only indirect (albeit very important), benefit to the user.

David

Welcome to David Clunie's Blog ...

... in which you will find various ruminations (calm, lengthy considerations ?) and periodic stream-of-consciousness dumps, in a less structured form than the regular material on this site ... feel free to leave comments but do not be offended if I edit or delete them at will ...