30 May 2016 Draft Report HDRMX Meeting, BNL 26-28 May 2016
Report of the High Data-Rate Macromolecular Crystallography Meeting
ACA 2016, 23 July 2016
DRAFT Report
Date: 7 August 2016
This is the second of a series of three meetings in spring and summer 2016 on
changes needed to existing major software packages to support very high
data-rate macromolecular crystallography. This meeting was an informal dinner
gathering at the meeting of the American Crystallographic Association in
Denver, Colorado, 22 – 26 July 2016. It was chaired by Herbert J. Bernstein of
Rochester Institute of Technology. It was a follow-on to the first meeting,
which was held at Brookhaven National Laboratory, 26 – 28 May 2016. The BNL
meeting was organized by Herbert J. Bernstein of Rochester Institute of
Technology, Nicholas K. Sauter of Lawrence Berkeley National Laboratory and
Robert M. Sweet of Brookhaven National Laboratory.
This is a draft report by HJB for review, comments and corrections. A raw BlueJeans recording is
available at
The editor of the text is HJB (yayahjb at gmail dot com) to whom
comments and corrections should be directed.
The attendees at the meeting were:
Name                      Institution

On-site Participants
Alun Ashton               Diamond Light Source
Kaden Badalian            Binghamton University
Herbert J. Bernstein      Rochester Institute of Technology
Frances C. Bernstein      Brookhaven National Laboratory (ret.)
Aaron Brewster            Lawrence Berkeley National Laboratory
Gerard Bricogne           Global Phasing Ltd.
James Holton              UCSF/LBNL/SLAC
Andrew Howard             Illinois Institute of Technology
Wladek Minor              University of Virginia
Katherine McAuley         Diamond Light Source
Marcus Mueller            DECTRIS Ltd.
Jie Nan                   MAX IV Lund University
George Phillips           Rice University
John Rose                 SER-CAT University of Georgia
Gerold Rosenbaum          Argonne National Laboratory
Nicholas K. Sauter        Lawrence Berkeley National Laboratory
Clemens Schulze-Briese    DECTRIS Ltd.
Marian Szebenyi           MacCHESS Cornell University
Thomas Ursby              MAX IV Lund University

Electronic Participants
Keitaro Yamashita         SPring-8
Nineteen participants attended on-site; one participant attended remotely.
Issues Discussed at the Meeting
Inasmuch as some of the participants were new to these discussions, the meeting
began with a brief review of the fact that the new generation of detectors and
the increased brightness at new MX beamlines require a careful reconsideration
and re-engineering of networks, computers, software and process flows for
macromolecular crystallography in order to deal efficiently with the new data
flows without significant delays.
The report of the May meeting (http://www.medsbio.org/meetings/HDRMX_Meeting_BNL_26_28May_Report.html) was distributed to all
participants and they were referred to the http://hdrmx.medsbio.org web site.
There was a brief discussion of whether our concern should be high
structure-rate macromolecular crystallography, rather than high data-rate
macromolecular crystallography. Inasmuch as many beamlines are facing an
immediate need to get organized for high data-rate macromolecular
crystallography, the discussion returned to that issue.
In the May meeting, Wladek Minor agreed to provide 200 TB of storage for
relevant datasets to facilitate software testing. Participants were encouraged
to deposit such data promptly.
The participants were informed of the major progress toward, and the
desirability of, having applications read the native HDF5 data format directly
to save significant processing time. It was noted that DIALS has made the
transition, as has a test version of ESRF’s Dozor, and that, thanks to Takanori
Nakane’s eiger2cbf, a framework now appears to be available that avoids some of
the HDF5 library blocking issues that have been a problem with such adaptations
in the past.
The rest of the discussion centered on issues of ensuring the most useful and
portable data. There was considerable discussion of what metadata belongs with
the images. It was the sense of the meeting that it is very important to
provide complete metadata on detector geometry and on goniometer geometry,
essentially the NeXus mapping of the full CBF metadata that has already been
demonstrated for CSPAD data. The objective would be to extend the current,
simpler NXmx application definition with optional metadata consistent with this
objective. If this community can come to agreement on the details before the
NeXus International Advisory Committee meeting in Copenhagen in October 2016,
that extended application definition will be proposed for formal adoption.
Informal use would, of course, continue in the meantime.
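As an illustration of what "complete metadata on detector geometry and on goniometer geometry" could mean in practice, the following minimal Python sketch checks a set of metadata for missing geometry fields. It is not part of the meeting discussion. The field paths are illustrative assumptions modeled loosely on NXmx-style naming; they are not a statement of what the NXmx application definition requires, and the function name missing_geometry is hypothetical. The metadata is represented as a plain dictionary of path/value pairs rather than an actual HDF5 file.

```python
# Illustrative geometry fields grouped by subsystem.
# ASSUMPTION: these paths are examples only, not the NXmx requirement list.
GEOMETRY_FIELDS = {
    "detector": [
        "/entry/instrument/detector/depends_on",
        "/entry/instrument/detector/x_pixel_size",
        "/entry/instrument/detector/y_pixel_size",
        "/entry/instrument/detector/module/fast_pixel_direction",
        "/entry/instrument/detector/module/slow_pixel_direction",
    ],
    "goniometer": [
        "/entry/sample/depends_on",
        "/entry/sample/transformations/omega",
    ],
}


def missing_geometry(metadata):
    """Return {group: [paths]} for fields that are absent or empty."""
    report = {}
    for group, paths in GEOMETRY_FIELDS.items():
        missing = [p for p in paths if metadata.get(p) in (None, "")]
        if missing:
            report[group] = missing
    return report


# Hypothetical metadata with pixel sizes but no module or axis description.
example = {
    "/entry/instrument/detector/depends_on":
        "/entry/instrument/detector/transformations/det_z",
    "/entry/instrument/detector/x_pixel_size": 7.5e-5,
    "/entry/instrument/detector/y_pixel_size": 7.5e-5,
    "/entry/sample/depends_on": "/entry/sample/transformations/omega",
}

for group, paths in missing_geometry(example).items():
    print(group, "is missing:")
    for path in paths:
        print("   ", path)
```

A check of this kind could be run at deposition time, so that gaps in the geometry description are caught while the information is still available at the beamline.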
It was the sense of the group
that it would be desirable to coordinate image metadata for deposition with the
IUCr Diffraction Data Deposition Working Group, and HJB was charged with trying
to arrange participation by a DDDWG representative in the HDRMX ECM-30
satellite meeting on 2 September 2016.
We discussed requesting that, as data is added to the HDRMX website, specific
'gold standard' datasets be included, not necessarily from an Eiger detector,
that are representative of fully compliant NXmx and/or NeXus data. These gold
standards would derive from easily processable data (for example, lysozyme) and
include:
1) a simple NXmx dataset with a single-panel detector orthogonal to the beam
and a single-axis goniometer;
2) a complex dataset with a multi-panel detector that includes tilts, rotations
and translations off the main beam, and a multi-axis goniometer with axes not
parallel to the X, Y or Z axes, including a kappa axis.
With these standards, developers would be able to test their processing
methods.
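To indicate why the complex gold-standard case is a useful test, the following Python sketch composes the rotations of a multi-axis goniometer by following a chain of axis dependencies, in the spirit of the depends_on chains used for axes in NeXus. It is an illustration under stated assumptions, not a description of any particular beamline or package: the axis directions, the 50 degree kappa angle, the right-handed rotation convention and the names AXES and to_lab are all hypothetical.

```python
import math


def rotation_matrix(axis, angle_deg):
    """Rotation matrix (Rodrigues formula) about an arbitrary axis."""
    x, y, z = axis
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    t = 1.0 - c
    return [
        [t * x * x + c,     t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, t * y * y + c,     t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, t * z * z + c],
    ]


def apply(matrix, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(matrix[i][j] * v[j] for j in range(3)) for i in range(3)]


# ASSUMPTION: a hypothetical kappa geometry. The kappa axis is tilted by
# 50 degrees from the omega axis, so it is not parallel to X, Y or Z.
ALPHA = math.radians(50.0)
AXES = {
    "phi":   {"vector": (1.0, 0.0, 0.0), "depends_on": "kappa"},
    "kappa": {"vector": (math.cos(ALPHA), math.sin(ALPHA), 0.0),
              "depends_on": "omega"},
    "omega": {"vector": (1.0, 0.0, 0.0), "depends_on": "."},
}


def to_lab(vector, start, axes, angles):
    """Rotate a sample-frame vector into the laboratory frame.

    The chain is walked from the innermost axis (start) outward until
    the dependency "." is reached, applying each rotation in turn.
    """
    v = list(vector)
    name = start
    while name != ".":
        axis = axes[name]
        v = apply(rotation_matrix(axis["vector"], angles[name]), v)
        name = axis["depends_on"]
    return v


settings = {"phi": 30.0, "kappa": 45.0, "omega": 10.0}
print(to_lab((0.0, 1.0, 0.0), "phi", AXES, settings))
```

A processing program that assumes every axis lies along X, Y or Z will give wrong results for a geometry like this one, which is the kind of error the complex gold-standard dataset is meant to expose.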
It was noted that community members may find these links useful:
ImageCIF: http://www.bernstein-plus-sons.com/software/CBF/doc/cif_img_1.7.8.html
NeXus: http://www.nexusformat.org/
ImageCIF to NeXus mapping: https://sites.google.com/site/nexuscbf/
NXmx specification: http://download.nexusformat.org/sphinx/classes/applications/NXmx.html
Conclusions:
There were two major conclusions specific to this meeting. The first was the
need to encourage dataset deposition at Wladek Minor’s site. It would be most
useful if datasets relevant to the HDRMX data processing initiative could be
specially tagged (as "HDRMX") or perhaps highlighted at the web site with a
dedicated link. The second was that inclusion of full-CBF-based metadata in
data files is essential, in particular so that multi-panel detectors such as
the CSPAD can be properly represented in standards-conformant HDF5/NXmx files.
To allow further processing of such data at a later time, at different sites,
or with different software, full metadata (especially on geometry) needs to be
available with the data. This should reduce the present need to reconstruct
such metadata from laboratory notebooks, which might not be available.