1.4. Strategies for storing information in NeXus data files

NeXus may appear daunting to use at first: the number of base classes is quite large, as is the number of application definitions. This chapter describes some of the strategies that have been recommended for storing information in NeXus data files.

When we use the term storing, it may help some readers to think of it as a description of how to classify their data.

This chapter is intended to grow, with different use cases added as they are suggested.

1.4.1. Strategies: The simplest case(s)

Perhaps the simplest case is a step scan with two or more columns of data. Another simple case is a single image acquired by an area detector. In either of these hypothetical cases, the situation is so simple that there is little additional information available to be described (for whatever reason).

1.4.1.1. Step scan with two or more data columns

Consider the case where we wish to store the data from a step scan. This case may involve two or more related 1-D arrays of data to be saved, each having the same length. For our hypothetical case, we’ll have these positioners and detectors as arrays, and assume that the default plot is photodiode vs. ar:

positioner arrays: ar, ay, dy
detector arrays: I0, I00, time, Epoch, photodiode

Data file structure for Step scan with two or more data columns

file.nxs: NeXus HDF5 data file
   @default = entry
   entry: NXentry
      @NX_class = NXentry
      @default = data
      data: NXdata
         @NX_class = NXdata
         @signal = photodiode
         @axes = ar
         ar: NX_FLOAT[]
         ay: NX_FLOAT[]
         dy: NX_FLOAT[]
         I0: NX_FLOAT[]
         I00: NX_FLOAT[]
         time: NX_FLOAT[]
         Epoch: NX_FLOAT[]
         photodiode: NX_FLOAT[]
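
As a rough illustration of how a reader might use this structure, the sketch below follows the @default, @signal and @axes attributes to locate the default plot. It does not read a real HDF5 file: the tree is modelled with nested Python dicts, and the helper names (make_group, find_default_plot) and all numeric values are hypothetical, chosen only for this example.

```python
# Hypothetical stand-in for file.nxs: a group is a dict with "attrs" and
# "children"; a dataset is a plain list. All numbers are illustrative only.
def make_group(attrs, children):
    return {"attrs": attrs, "children": children}


names = ["ar", "ay", "dy", "I0", "I00", "time", "Epoch", "photodiode"]
# Every column has the same length, as required for a step scan.
columns = {name: [0.0, 0.0, 0.0, 0.0] for name in names}
columns["ar"] = [10.0, 10.1, 10.2, 10.3]
columns["photodiode"] = [5.0, 42.0, 40.0, 6.0]

data_group = make_group(
    {"NX_class": "NXdata", "signal": "photodiode", "axes": "ar"}, columns
)
entry_group = make_group(
    {"NX_class": "NXentry", "default": "data"}, {"data": data_group}
)
file_root = make_group({"default": "entry"}, {"entry": entry_group})


def find_default_plot(root):
    """Follow @default twice, then @signal and @axes, to get the plot data."""
    entry = root["children"][root["attrs"]["default"]]
    data = entry["children"][entry["attrs"]["default"]]
    signal_name = data["attrs"]["signal"]
    axes_names = data["attrs"]["axes"]
    if isinstance(axes_names, str):  # a single axis may be given as one name
        axes_names = [axes_names]
    x_name = axes_names[0]
    return x_name, data["children"][x_name], signal_name, data["children"][signal_name]


x_name, x, y_name, y = find_default_plot(file_root)
print(f"default plot: {y_name} vs. {x_name}")
```

The other columns (ay, dy, I0, and so on) remain in the NXdata group and are simply not selected by the default plot.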

1.4.2. Strategies: The wavelength

Where should the wavelength of my experiment be written? This is one of the Frequently Asked Questions. The canonical location to store wavelength has been:

/NXentry/NXinstrument/NXcrystal/wavelength

Partial data file structure for canonical location to store wavelength

entry: NXentry
   @NX_class = NXentry
   instrument: NXinstrument
      @NX_class = NXinstrument
      crystal: NXcrystal
         @NX_class = NXcrystal
         wavelength: NX_FLOAT

More recently, many have found that this location makes more sense:

/NXentry/NXinstrument/NXmonochromator/wavelength

Partial data file structure for the more recent location to store wavelength

entry: NXentry
   @NX_class = NXentry
   instrument: NXinstrument
      @NX_class = NXinstrument
      monochromator: NXmonochromator
         @NX_class = NXmonochromator
         wavelength: NX_FLOAT

NXcrystal describes a crystal monochromator or analyzer. Recently, scientists with monochromatic radiation not defined by a crystal, such as from an electron-beam undulator or a neutron helical velocity selector, were not satisfied with creating a fictitious instance of a crystal just to preserve the wavelength from their instrument. This led to the addition of the NXmonochromator base class to NeXus, which also allows “energy” to be specified if one is so inclined.

Note

See the Class path specification section for a short discussion of the difference between the HDF5 path and the NeXus symbolic class path.
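
To make the distinction concrete, here is a minimal sketch of resolving a symbolic class path (matching on NX_class) rather than an HDF5 path (matching on group names), trying the NXmonochromator location first and falling back to NXcrystal. The file is again modelled with nested dicts instead of real HDF5, and the function names, group names (scan_1, beamline, mono, xtal) and wavelength values are hypothetical.

```python
# Hypothetical in-memory model: a group is a dict with "attrs" and
# "children"; a field such as wavelength is a plain float.
def make_group(nx_class, children):
    return {"attrs": {"NX_class": nx_class}, "children": children}


def find_by_class_path(group, class_path, field):
    """Yield each `field` found under groups whose NX_class matches class_path."""
    if not class_path:
        if field in group["children"]:
            yield group["children"][field]
        return
    for child in group["children"].values():
        # Match on the NX_class attribute, not on the group's name.
        if isinstance(child, dict) and child["attrs"].get("NX_class") == class_path[0]:
            yield from find_by_class_path(child, class_path[1:], field)


def find_wavelength(root):
    """Prefer NXmonochromator; fall back to the older NXcrystal location."""
    for holder in ("NXmonochromator", "NXcrystal"):
        hits = list(
            find_by_class_path(root, ["NXentry", "NXinstrument", holder], "wavelength")
        )
        if hits:
            return holder, hits[0]
    return None


# Group names are arbitrary; only the NX_class attributes matter here.
newer_file = {"attrs": {}, "children": {
    "scan_1": make_group("NXentry", {
        "beamline": make_group("NXinstrument", {
            "mono": make_group("NXmonochromator", {"wavelength": 1.54}),
        }),
    }),
}}
older_file = {"attrs": {}, "children": {
    "scan_1": make_group("NXentry", {
        "beamline": make_group("NXinstrument", {
            "xtal": make_group("NXcrystal", {"wavelength": 0.71}),
        }),
    }),
}}

print(find_wavelength(newer_file))
print(find_wavelength(older_file))
```

Because the search matches on NX_class, it finds the wavelength whether the groups are called entry/instrument/monochromator or anything else.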

1.4.3. Strategies: Time-stamped data

How should I store time-stamped data? Time-stamped data can be stored in both NXlog and NXevent_data. NXevent_data is used for storing neutron event data, and NXlog is used for storing any other time-stamped data, e.g. sample temperature, chopper top-dead-centre, motor position, etc.

Both NXlog and NXevent_data have additional support for storing time-stamped data in the form of cues; cues can be used to place markers in the data that allow one to quickly look up coarse time ranges of interest. This coarse range of data can then be manually trimmed to be more selective, if required. The application writing the NeXus file is responsible for writing the cues and for deciding when they are written. For example, a cue could be written every 10 seconds, every pulse, every 100 data points, and so on.

Let’s consider the case where NXlog is being used to store sample temperature data that has been sampled once every three seconds. The application that wrote the data has added cues every 20 seconds. Pictorially, this may look something like this:

_images/timestamp-cues-example.png

If we wanted to retrieve the mean temperature between 30 and 40 seconds, we would use the cues to grab the data between 20 seconds and 40 seconds, and then trim that data to get the data we want. Obviously in this simple example this does not gain us a lot, but it is easy to see that in a large dataset having appropriately placed cues can save significant computational time when looking up values in a certain time-stamp range.

In the NeXus Features repository, the feature ECB064453EDB096D shows example code that uses cues to select time-stamped data.
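
The lookup described in this section can also be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code from the Features repository: the lists cue_times and cue_index are loosely modelled on the cue fields of NXlog, the convention that each cue index points at the first sample at or after the cue time is an assumption of this sketch, the time window is treated as half-open (start included, stop excluded), and the temperature values are synthetic.

```python
from bisect import bisect_left, bisect_right

# Synthetic log: one temperature sample every 3 seconds for one minute.
times = list(range(0, 60, 3))            # 0, 3, ..., 57 seconds
values = [300 + t for t in times]        # made-up temperatures

# Cues every 20 seconds. Assumption: each cue index points at the first
# sample at or after the cue time.
cue_times = [0, 20, 40]
cue_index = [bisect_left(times, c) for c in cue_times]


def select_with_cues(times, values, cue_times, cue_index, t_start, t_stop):
    """Return samples with t_start <= t < t_stop, using cues for a coarse cut."""
    # Coarse lower bound: last cue at or before t_start.
    i = bisect_right(cue_times, t_start) - 1
    lo = cue_index[i] if i >= 0 else 0
    # Coarse upper bound: first cue at or after t_stop.
    j = bisect_left(cue_times, t_stop)
    hi = cue_index[j] if j < len(cue_index) else len(times)

    coarse_t = times[lo:hi]
    coarse_v = values[lo:hi]

    # Fine trim inside the coarse block only.
    first = bisect_left(coarse_t, t_start)
    last = bisect_left(coarse_t, t_stop)
    return coarse_t[first:last], coarse_v[first:last]


t_sel, v_sel = select_with_cues(times, values, cue_times, cue_index, 30, 40)
mean_temperature = sum(v_sel) / len(v_sel)
print(t_sel, mean_temperature)
```

In a real file the coarse slice is what would be read from disk, so only the block between two cues is loaded before trimming.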

1.4.4. Strategies: The next case

The NIAC (the NeXus International Advisory Committee) welcomes suggestions for additional sections in this chapter.