30 May 2016 Draft Report HDRMX Meeting, BNL 26-28 May 2016

 

 

Report of the

High Data-Rate Macromolecular Crystallography Meeting

 ACA 2016 23 July 2016

DRAFT Report Date:  7 August 2016

 

This is the second of a series of three meetings in spring and summer 2016 on changes needed to existing major software packages for support of very high data rate macromolecular crystallography.  This meeting was an informal dinner gathering at the meeting of the American Crystallographic Association in Denver, Colorado, 22 – 26 July 2016.  It was chaired by Herbert J. Bernstein of Rochester Institute of Technology.  It was a follow-on to the first meeting, which was held at Brookhaven National Laboratory, 26 – 28 May 2016.  The BNL meeting was organized by Herbert J. Bernstein of Rochester Institute of Technology, Nicholas K. Sauter of Lawrence Berkeley National Laboratory and Robert M. Sweet of Brookhaven National Laboratory.

 

This is a draft report by HJB for review, comment and correction.  A raw BlueJeans recording is available at

 

https://bluejeans.com/s/a2gV/

 

The editor of the text is HJB (yayahjb at gmail dot com) to whom comments and corrections should be directed.

 

The attendees at the meeting were:

 

Name                                           Institution

On-site Participants

 

Alun Ashton                                 Diamond Light Source

Kaden Badalian                           Binghamton University

Herbert J. Bernstein                   Rochester Institute of Technology

Frances C. Bernstein                  Brookhaven National Laboratory (ret.)

Aaron Brewster                           Lawrence Berkeley National Laboratory

Gerard Bricogne                          Global Phasing Ltd.

James Holton                              UCSF/LBNL/SLAC

Andrew Howard                          Illinois Institute of Technology

Wladek Minor                              University of Virginia

Katherine McAuley                     Diamond Light Source

Marcus Mueller                            DECTRIS Ltd.

Jie Nan                                         MAX IV Lund University

George Phillips                           Rice University

John Rose                                   SER-CAT University of Georgia

Gerold Rosenbaum                   Argonne National Laboratory

Nicholas K. Sauter                       Lawrence Berkeley National Laboratory

Clemens Schulze-Briese           DECTRIS Ltd.

Marian Szebenyi                        MacCHESS Cornell University

Thomas Ursby                            MAX IV Lund University

 

Electronic Participants

Keitaro Yamashita                       SPring-8

 

19 participants attended on-site; 1 participant attended remotely.

 

Issues Discussed at the Meeting

 

Inasmuch as some of the participants were new to these discussions, the meeting began with a brief review of the fact that the new generation of detectors and the increased brightness at new MX beamlines require a careful reconsideration and re-engineering of networks, computers, software and process flows for macromolecular crystallography in order to deal efficiently with the new data flows without significant delays.  The report of the May meeting (http://www.medsbio.org/meetings/HDRMX_Meeting_BNL_26_28May_Report.html) was distributed to all participants and they were referred to the http://hdrmx.medsbio.org web site.

 

There was a brief discussion of whether our concern should be high structure-rate macromolecular crystallography, rather than high data-rate macromolecular crystallography.  Inasmuch as many beamlines are facing an immediate need to get organized for high data-rate macromolecular crystallography, the discussion returned to that issue.

 

In the May meeting, Wladek Minor agreed to provide 200 TB of storage for relevant datasets to facilitate software testing.  Participants were encouraged to deposit such data promptly.

 

 

The participants were informed of the major progress toward, and the desirability of, having applications read the native HDF5 data format directly, which saves significant time in processing.  It was noted that DIALS has made the transition, as has a test version of ESRF’s Dozor, and that, thanks to Takanori Nakane’s eiger2cbf, a framework now appears to be available that avoids some of the HDF5 library blocking issues that have been a problem with such adaptations in the past.

 

The rest of the discussion centered on issues of ensuring the most useful and portable data.  There was considerable discussion of what metadata belongs with the images.  It was the sense of the meeting that it is very important to provide complete metadata on detector geometry and on goniometer geometry, essentially the NeXus mapping of the full CBF metadata that has already been demonstrated for CSPAD data.  The objective would be to extend the current, simpler NXmx application definition with optional metadata consistent with this objective.  If this community can come to agreement on the details before the NeXus International Advisory Committee meeting in Copenhagen in October 2016, that extended application definition will be proposed for formal adoption.  Informal use would, of course, continue in the meantime.
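
As an illustration of what "complete metadata on detector geometry and on goniometer geometry" could mean in practice, the following minimal Python sketch checks a flat mapping of HDF5-style paths for a handful of geometry items.  It is not a validator for the NXmx application definition: the list of paths is an assumption chosen for this example (the field names follow NXmx usage, but the detector module group name "module" is hypothetical), and the function name missing_geometry is likewise invented here.

```python
# Illustrative only: this path list is an assumption made for the sketch,
# not the official set of required NXmx fields. The detector module group
# name "module" is hypothetical; real files may name it differently.
GEOMETRY_PATHS = [
    "/entry/instrument/beam/incident_wavelength",
    "/entry/instrument/detector/depends_on",
    "/entry/instrument/detector/x_pixel_size",
    "/entry/instrument/detector/y_pixel_size",
    "/entry/instrument/detector/module/fast_pixel_direction",
    "/entry/instrument/detector/module/slow_pixel_direction",
    "/entry/sample/depends_on",
]


def missing_geometry(metadata, required=GEOMETRY_PATHS):
    """Return the required paths that are absent or empty in `metadata`.

    `metadata` is a plain dict mapping path strings to values, standing in
    for whatever a real reader would extract from the image file.
    """
    return [path for path in required if metadata.get(path) in (None, "")]


# A made-up, deliberately incomplete example record.
example = {
    "/entry/instrument/beam/incident_wavelength": 0.9795,
    "/entry/instrument/detector/x_pixel_size": 7.5e-5,
    "/entry/instrument/detector/y_pixel_size": 7.5e-5,
    "/entry/sample/depends_on": "/entry/sample/transformations/omega",
}

for path in missing_geometry(example):
    print("missing:", path)
```

A check of this kind could be run at deposition time, so that gaps in the geometry description are caught while the people who collected the data can still fill them in.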

 

It was the sense of the group that it would be desirable to coordinate image metadata for deposition with the IUCr Diffraction Data Deposition Working Group, and HJB was charged with trying to arrange participation by a DDDWG representative in the HDRMX ECM-30 satellite meeting on 2 September 2016.

 

We discussed requesting that, as data are added to the HDRMX website, specific 'gold standard' datasets be included, not necessarily from an Eiger detector, that are representative of fully compliant NXmx and/or NeXus data.  These gold standards would derive from easily processable data (for example lysozyme) and include 1) a simple NXmx dataset with a single-panel detector orthogonal to the beam and a single-axis goniometer, and 2) a complex dataset with a multi-panel detector that includes tilts, rotations and translations off the main beam, and a multi-axis goniometer with axes not parallel to the X, Y or Z axes, including a kappa axis.  With these standards, developers would be able to test their processing methods.
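
To make the "complex" case concrete, the sketch below composes rotations about arbitrary axes, which is the operation processing software has to get right when goniometer axes are not parallel to X, Y or Z.  It is a minimal illustration in plain Python, not code from any of the packages discussed.  The axis directions, the 50 degree kappa tilt, the angle values and the helper names are all assumptions made for this example; a real dataset would take its axes from the file metadata.

```python
import math


def rotation_matrix(axis, angle_deg):
    """Right-handed rotation by angle_deg about `axis` (Rodrigues formula)."""
    x, y, z = axis
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    k = 1.0 - c
    return [
        [c + x * x * k, x * y * k - z * s, x * z * k + y * s],
        [y * x * k + z * s, c + y * y * k, y * z * k - x * s],
        [z * x * k - y * s, z * y * k + x * s, c + z * z * k],
    ]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][n] * b[n][j] for n in range(3)) for j in range(3)]
            for i in range(3)]


def apply(m, v):
    """Apply a 3x3 matrix to a 3-vector."""
    return [sum(m[i][n] * v[n] for n in range(3)) for i in range(3)]


def goniometer_rotation(settings):
    """Compose (axis, angle) pairs, outermost axis first, axes given at zero setting."""
    total = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    for axis, angle_deg in settings:
        total = matmul(total, rotation_matrix(axis, angle_deg))
    return total


# Hypothetical kappa geometry: omega and phi along X at zero setting,
# kappa axis tilted 50 degrees away from omega in the XY plane.
ALPHA = math.radians(50.0)
OMEGA_AXIS = (1.0, 0.0, 0.0)
KAPPA_AXIS = (math.cos(ALPHA), math.sin(ALPHA), 0.0)
PHI_AXIS = (1.0, 0.0, 0.0)

R = goniometer_rotation([(OMEGA_AXIS, 30.0), (KAPPA_AXIS, 45.0), (PHI_AXIS, 10.0)])
print(apply(R, (0.0, 0.0, 1.0)))
```

A gold-standard dataset of the second kind would let developers confirm that their own composition of axes reproduces the recorded sample orientation, rather than relying on a fixed single-axis assumption.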

 

It was noted that community members may find these links useful:

ImageCIF: http://www.bernstein-plus-sons.com/software/CBF/doc/cif_img_1.7.8.html

NeXus: http://www.nexusformat.org/

ImageCIF to NeXus mapping: https://sites.google.com/site/nexuscbf/

NXmx specification: http://download.nexusformat.org/sphinx/classes/applications/NXmx.html

 

Conclusions:

 

There were two major conclusions specific to this meeting.  The first was the need to encourage dataset deposition at Wladek Minor’s site.  It would be most useful if datasets relevant to the HDRMX data processing initiative could be specially tagged (as "HDRMX") or perhaps highlighted at the web site with a dedicated link.  The second was that inclusion of full-CBF-based metadata in data files is essential, in particular so that multi-panel detectors such as the CSPAD can be properly represented in standards-conformant HDF5/NXmx files.  In order to allow further processing of such data at a later time, at different sites, or with different software, full metadata (especially on geometry) needs to be available with the data.  This will hopefully reduce the present need to reconstruct such metadata from laboratory notebooks, which might not be available.