30 May 2016 Draft Report HDRMX Meeting, BNL 26-28 May 2016
ACA 2016, 23 July
DRAFT 7 August
This is the second of a series of three meetings in 2016 on changes needed to existing major software packages for support of very high data rate macromolecular crystallography. This meeting was an informal dinner gathering at the meeting of the American Crystallographic Association in Denver, Colorado, 22 – 26 July 2016. It was chaired by Herbert J. Bernstein of Rochester Institute of Technology. It was a follow-on to the first meeting, which was held at Brookhaven National Laboratory, 26 – 28 May 2016. The BNL meeting was organized by Herbert J. Bernstein of Rochester Institute of Technology, Nicholas K. Sauter of Lawrence Berkeley National Laboratory and Robert M. Sweet of Brookhaven National Laboratory.
This is a draft report by HJB for review, comments and correction. A raw BlueJeans recording is available at
The attendees at th
Alun Ashton, Diamond Light Source
Kaden Badalian, Binghamton University
Herbert J. Bernstein, Rochester Institute of Technology
Frances C. Bernstein, Brookhaven National Laboratory (ret.)
Aaron Brewster, Lawrence Berkeley National Laboratory
Gerard Bricogne, Global Phasing Ltd.
James Holton, UCSF/LBNL/SLAC
Andrew Howard, Illinois Institute of Technology
Wladek Minor, University of Virginia
Katherine McAuley, Diamond Light Source
Marcus Mueller, DECTRIS Ltd.
Jie Nan, MAX IV, Lund University
George Phillips, Rice University
John Rose, SER-CAT, University of Georgia
Gerold Rosenbaum, Argonne National Laboratory
Nicholas K. Sauter, Lawrence Berkeley National Laboratory
Clemens Schulze-Briese, DECTRIS Ltd.
Marian Szebenyi, MacCHESS, Cornell University
Thomas Ursby, MAX IV, Lund University
Nineteen participants attended on-site and one participant attended remotely.
Issues Discussed at the Meeting
Inasmuch as some of the participants were new to these discussions, the meeting began with a brief review of why the new generation of detectors and the increased brightness at new MX beamlines require a careful reconsideration and re-engineering of networks, computers, software and process flows for macromolecular crystallography, so that the new data flows can be handled efficiently and without significant delays. The report of the May meeting (http://www.medsbio.org/meetings/HDRMX_Meeting_BNL_26_28May_Report.html) was distributed to all participants, and they were referred to the http://hdrmx.medsbio.org web site.
There was a brief discussion of whether our concern should be high structure-rate macromolecular crystallography, rather than high data-rate macromolecular crystallography. Inasmuch as many beamlines are facing an immediate need to get organized for high data-rate macromolecular crystallography, the discussion returned to that issue.
At the May meeting, Wladek Minor agreed to provide 200 TB of storage for relevant datasets to facilitate software testing. Participants were encouraged to deposit such data promptly.
The participants were informed of the major progress in, and the desirability of, having applications read the native HDF5 data format directly to save significant time in processing. It was noted that DIALS has made the transition, as has a test version of ESRF’s Dozor, and that, thanks to Takanori Nakane’s eiger2cbf, a framework is now available that appears to avoid some of the HDF5 library blocking issues that have been a problem with such adaptations in the past.
The rest of the discussion centered on issues of ensuring the most useful and portable data. There was considerable discussion of what metadata belongs with the images. It was the sense of the meeting that it is very important to provide complete metadata on detector geometry and on goniometer geometry, essentially the NeXus mapping of the full CBF metadata that has already been demonstrated for CSPAD data. The objective would be to extend the current, simpler NXmx application definition with optional metadata that meets this need. If this community can come to agreement on the details before the NeXus International Advisory Committee meeting in Copenhagen in October 2016, that extended application definition will be proposed for formal adoption. Informal use would, of course, continue in the meantime.
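As a rough illustration of what "complete" geometry metadata could mean in practice, the following Python sketch checks that every axis on which the sample or detector depends is actually described. It borrows the NeXus convention of a depends_on chain terminated by "."; the dictionary layout, the axis names and vectors, the REQUIRED_KEYS list and the function name are assumptions made for this example and are not taken from the NXmx definition.

```python
# Keys each axis description must carry in this sketch. They are modelled on
# the attributes NeXus attaches to transformation axes; the exact set is an
# assumption for illustration.
REQUIRED_KEYS = ("transformation_type", "vector", "units", "depends_on")

# Hypothetical axis descriptions; names and vectors are illustrative only.
AXES = {
    "phi": {"transformation_type": "rotation", "vector": (1.0, 0.0, 0.0),
            "units": "deg", "depends_on": "kappa"},
    "kappa": {"transformation_type": "rotation", "vector": (0.64, 0.77, 0.0),
              "units": "deg", "depends_on": "omega"},
    "omega": {"transformation_type": "rotation", "vector": (1.0, 0.0, 0.0),
              "units": "deg", "depends_on": "."},
    "det_z": {"transformation_type": "translation", "vector": (0.0, 0.0, 1.0),
              "units": "mm", "depends_on": "two_theta"},
    "two_theta": {"transformation_type": "rotation", "vector": (1.0, 0.0, 0.0),
                  "units": "deg", "depends_on": "."},
}


def dependency_chain(axes, start):
    """Return the axis names from `start` down to the root marker '.'.

    Raises KeyError if an axis is referenced but not described, and
    ValueError if an axis lacks a required key or the chain is circular.
    """
    chain = []
    seen = set()
    name = start
    while name != ".":
        if name in seen:
            raise ValueError(f"circular dependency at axis {name!r}")
        if name not in axes:
            raise KeyError(f"axis {name!r} is referenced but not described")
        missing = [key for key in REQUIRED_KEYS if key not in axes[name]]
        if missing:
            raise ValueError(f"axis {name!r} lacks {missing}")
        seen.add(name)
        chain.append(name)
        name = axes[name]["depends_on"]
    return chain


sample_chain = dependency_chain(AXES, "phi")
detector_chain = dependency_chain(AXES, "det_z")
```

A file that passes a check of this kind carries enough information to rebuild the geometry without outside notes; a file that fails it identifies which axis description is missing.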
It was the sense of the group that it would be desirable to coordinate image metadata for deposition with the IUCr Diffraction Data Deposition Working Group, and HJB was charged with trying to arrange participation by a DDDWG representative in the HDRMX ECM-30 satellite meeting on 2 September 2016.
We discussed requesting that, as data is added to the HDRMX website, specific 'gold standard' datasets be included, not necessarily from an Eiger detector, that are representative of fully compliant NXmx and/or NeXus data. These gold standards would derive from easily processable data (for example lysozyme) and include 1) a simple NXmx dataset with a single-panel detector orthogonal to the beam and a single-axis goniometer, and 2) a complex dataset with a multi-panel detector that includes tilts, rotations and translations off of the main beam, and with a multi-axis goniometer whose axes are not parallel to the X, Y or Z axes, including a kappa axis. With these standards, developers would be able to test their processing methods.
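To suggest why the complex case is a useful test, the sketch below composes the rotations of a three-axis kappa goniometer in plain Python using Rodrigues' rotation formula. The axis directions, the 50 degree kappa angle, the stacking order (phi mounted on kappa mounted on omega) and all function names are assumptions chosen for illustration; a real dataset would take these values from its own metadata, and detector tilts and translations are not covered here.

```python
import math


def rotation_matrix(axis, angle_deg):
    """Rodrigues' formula: 3x3 matrix for a rotation of angle_deg about axis."""
    x, y, z = axis
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    t = 1.0 - c
    return [
        [t * x * x + c, t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, t * y * y + c, t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, t * z * z + c],
    ]


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(m, v):
    """Apply a 3x3 matrix to a 3-vector."""
    return tuple(sum(m[i][k] * v[k] for k in range(3)) for i in range(3))


# Hypothetical axis directions at the zero (datum) setting. The kappa axis
# is deliberately not parallel to X, Y or Z.
ALPHA = math.radians(50.0)
OMEGA_AXIS = (1.0, 0.0, 0.0)
KAPPA_AXIS = (math.cos(ALPHA), math.sin(ALPHA), 0.0)
PHI_AXIS = (1.0, 0.0, 0.0)


def goniometer_rotation(omega_deg, kappa_deg, phi_deg):
    """Sample rotation for phi mounted on kappa mounted on omega.

    The innermost axis (phi) acts first, so the product is
    R_omega * R_kappa * R_phi, with every axis given at its datum position.
    """
    r_omega = rotation_matrix(OMEGA_AXIS, omega_deg)
    r_kappa = rotation_matrix(KAPPA_AXIS, kappa_deg)
    r_phi = rotation_matrix(PHI_AXIS, phi_deg)
    return matmul(r_omega, matmul(r_kappa, r_phi))


# Rotate an example vector fixed to the sample.
sample_vector = (0.0, 1.0, 0.0)
rotated = apply(goniometer_rotation(30.0, 45.0, 60.0), sample_vector)
```

Software that implicitly assumes a single rotation axis along X would agree with this composition only when kappa is zero, which is the kind of discrepancy a complex gold-standard dataset is meant to expose.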
It was noted that community members may find these links useful:
ImageCIF to NeXus mapping: https://sites.google.com/site/nexuscbf/
NXmx specification: http://download.nexusformat.org/sphinx/classes/applications/NXmx.html
There were two major conclusions specific to this meeting. The first was the need to encourage dataset deposition at Wladek Minor’s site. It would be most useful if datasets relevant to the HDRMX data processing initiative could be specially tagged (as "HDRMX") or perhaps highlighted at the web site with a dedicated link. The second was that inclusion of full-CBF-based metadata in data files is essential, in particular so that multi-panel detectors such as the CSPAD can be properly represented in standards-conformant HDF5/NXmx files. To allow further processing of such data at a later time, at different sites, or with different software, full metadata (especially on geometry) needs to be available with the data. This will hopefully reduce the present need to reconstruct such metadata from laboratory notebooks, which might not be available.