PX MetaData Proposal

Dear Colleagues,

I have promised Bob a strawman list for discussion. I will state it in CBF terms and then, once we agree on the final list, we can run it through the concordance and put it out in NeXus as well. For reference, I append first Graeme's wish list and then Chris Nielsen's CBF template to satisfy Mosflm.

_diffrn.id and _diffrn.crystal_id
-- which experiment and which crystal. I would suggest making both mandatory.

_diffrn_source.diffrn_id, _diffrn_source.source, _diffrn_source.current,
_diffrn_source.type
-- details on the source of the radiation used. Even though this is useful, I would suggest not making this mandatory.

_diffrn_radiation.diffrn_id, _diffrn_radiation.wavelength_id,
_diffrn_radiation.probe, _diffrn_radiation.monochromator,
_diffrn_radiation.polarizn_source_ratio, _diffrn_radiation.polarizn_source_norm,
_diffrn_radiation.div_x_source, _diffrn_radiation.div_y_source,
_diffrn_radiation.div_x_y_source, _diffrn_radiation.collimation,
_diffrn_radiation_wavelength.id, _diffrn_radiation_wavelength.wavelength,
_diffrn_radiation_wavelength.wt
-- these are the actual details of the radiation used, which as Graeme notes are definitely needed (especially, of course, the wavelength). To simplify the metadata for the monochromatic experiment, I would suggest allowing the specification of a single, representative wavelength, e.g. as _diffrn_radiation.wavelength.

_diffrn_detector.diffrn_id, _diffrn_detector.id, _diffrn_detector.type,
_diffrn_detector.details, _diffrn_detector.number_of_axes,
_diffrn_detector_axis.detector_id, _diffrn_detector_axis.axis_id
-- broad details on what sort of detector, with how many degrees of freedom, was used. All of it is important, and at the very least the detector type, number of axes, and list of axes should be mandatory.

_diffrn_detector_element.id, _diffrn_detector_element.detector_id
-- lists the detector elements.
As we move to more complex detectors, this will become increasingly important and should be mandatory.

_diffrn_data_frame.id, _diffrn_data_frame.detector_element_id,
_diffrn_data_frame.array_id, _diffrn_data_frame.binary_id
-- provides the information needed to associate specific arrays of data and pixel layouts with specific frames. This should be mandatory.

_diffrn_measurement.diffrn_id, _diffrn_measurement.id,
_diffrn_measurement.number_of_axes, _diffrn_measurement.method,
_diffrn_measurement.details,
_diffrn_measurement_axis.measurement_id, _diffrn_measurement_axis.axis_id
-- provides information on the positioner for the sample. This should be mandatory.

_diffrn_scan.id, _diffrn_scan.frame_id_start, _diffrn_scan.frame_id_end,
_diffrn_scan.frames,
_diffrn_scan_axis.scan_id, _diffrn_scan_axis.axis_id,
_diffrn_scan_axis.angle_start, _diffrn_scan_axis.angle_range,
_diffrn_scan_axis.angle_increment, _diffrn_scan_axis.displacement_start,
_diffrn_scan_axis.displacement_range, _diffrn_scan_axis.displacement_increment,
_diffrn_scan_frame.frame_id, _diffrn_scan_frame.frame_number,
_diffrn_scan_frame.integration_time, _diffrn_scan_frame.scan_id,
_diffrn_scan_frame.date,
_diffrn_scan_frame_axis.frame_id, _diffrn_scan_frame_axis.axis_id,
_diffrn_scan_frame_axis.angle, _diffrn_scan_frame_axis.displacement
-- provides information to relate frames to particular axis settings -- mandatory.

_axis.id, _axis.type, _axis.equipment, _axis.depends_on,
_axis.vector[1], _axis.vector[2], _axis.vector[3],
_axis.offset[1], _axis.offset[2], _axis.offset[3]
-- provides details on the layout of axes used in the experiment. This is absolutely essential data to be able to get the physics of the experiment right. For FELs we need to add an alternate form of dependency in which a nested coordinate frame can depend on a rotation axis and a rotation angle. This will be used to specify the exact layout of FEL detector elements.
All of this is mandatory.

_array_structure_list.array_id, _array_structure_list.index,
_array_structure_list.dimension, _array_structure_list.precedence,
_array_structure_list.direction, _array_structure_list.axis_set_id,
_array_structure_list_axis.axis_set_id, _array_structure_list_axis.axis_id,
_array_structure_list_axis.displacement,
_array_structure_list_axis.displacement_increment,
_array_structure.id, _array_structure.encoding_type,
_array_structure.compression_type, _array_structure.byte_order
-- this is essential metadata on the layout of pixels in images. It is all mandatory in the sense that it must be recorded somewhere.

_array_intensities.array_id, _array_intensities.binary_id,
_array_intensities.linearity, _array_intensities.gain,
_array_intensities.gain_esd, _array_intensities.overload,
_array_intensities.undefined_value
-- this is essential metadata on the meanings of the values stored in the pixels -- mandatory.

_array_data.array_id, _array_data.binary_id, _array_data.data
-- associates the actual data with the related metadata. It must be recorded somewhere.

What is missing from the above are the pixel mask arrays -- they are just additional images, but we should standardize the notation for them. The Dectris Eiger conventions seem a possible starting point.

Regards,
Herbert

>
> Dear All,
>
> A while back I promised some thoughts in advance of Jonathan’s visit to Long Island on what we should promise in MX single crystal diffraction data. Here goes.
>
> Basic requirements:
>
> · Use one single well defined coordinate frame implicitly or explicitly defined within the file – explicitly is nice as this will allow validation, but critically enough that I can determine the positional relationship between all elements taking part in the experiment.
>
> · Use e.g. SI (+ degrees and eV) units for the definition of displacement and rotation
>
> · In addition to values add the option to store the expected variance of the value e.g. if the value refines substantially beyond this value trigger a warning or error – these may be derived from analysis of historical data
>
> · Timestamps on all values to allow for storage of refined values maintaining access to original values (e.g. variants)
>
> Specific minimum requirements:
>
> · Structure of the beam e.g. wavelength, energy dispersion, size, divergence, profile, energy spectrum (for X-FEL in particular) and direction (defined either sample to source or the reverse, but defined within the standard.)
>
> · All rotation axes, their composition order, those being scanned and those which are fixed, all of their settings and offsets, perhaps their derivatives if possible (i.e. “axis speed” if not constant)
>
> · Position, orientation and extent of the detector(s) and their mechanical relationships e.g. the structure of the detector relevant to refinement – the detector is made of 4 quadrants which are in turn constructed from 8 elements etc. This may be expressed as conventional use of axes in imgCIF.
>
> · Image masks for dead pixels, hot pixels, tile join regions etc. and appropriate values for these.
>
> · Limits on trusted pixel values e.g. above ‘X’ ignore value as overloaded.
>
> · Exposure time / beam attenuation or flux
>
> · Sample xyz translation applied
>
> · Sample unique ID
>
> · Sample group unique ID
>
> Last two are to learn automatically from the structure of the experiment, the sample ID and the timestamps the intent of the experiment. These should not necessarily take any format but scans with the same sample ID should be from the same sample, and scans from the same sample group should be somehow related e.g. multi-crystal experiments.
>
> Clearly these are just a starting point but should be a useful place to start the discussion with our colleagues at BNL.
>
> Best wishes,
>
> Graeme
>
> ###CBF: VERSION 1.1
>
> data_image_1
>
>
> # category DIFFRN
>
> loop_
> _diffrn.id
> _diffrn.crystal_id
> DIFFRN_ID DIFFRN_CRYSTAL_ID
>
>
> # category DIFFRN_SOURCE
>
> loop_
> _diffrn_source.diffrn_id
> _diffrn_source.source
> _diffrn_source.current
> _diffrn_source.type
> DIFFRN_ID synchrotron 100.0 'SSRL beamline 1-5'
>
>
> # category DIFFRN_RADIATION
>
> loop_
> _diffrn_radiation.diffrn_id
> _diffrn_radiation.wavelength_id
> _diffrn_radiation.probe
> _diffrn_radiation.monochromator
> _diffrn_radiation.polarizn_source_ratio
> _diffrn_radiation.polarizn_source_norm
> _diffrn_radiation.div_x_source
> _diffrn_radiation.div_y_source
> _diffrn_radiation.div_x_y_source
> _diffrn_radiation.collimation
> DIFFRN_ID WAVELENGTH1 x-ray 'Si 111' 0.8 0.0 0.08 0.01 0.00 '0.20 mm x 0.20 mm'
>
>
> # category DIFFRN_RADIATION_WAVELENGTH
>
> loop_
> _diffrn_radiation_wavelength.id
> _diffrn_radiation_wavelength.wavelength
> _diffrn_radiation_wavelength.wt
> WAVELENGTH1 0.98 1.0
>
>
> # category DIFFRN_DETECTOR
>
> loop_
> _diffrn_detector.diffrn_id
> _diffrn_detector.id
> _diffrn_detector.type
> _diffrn_detector.details
> _diffrn_detector.number_of_axes
> DIFFRN_ID ADSCQ4 'ADSC QUANTUM4' 'slow mode' 4
>
>
> # category DIFFRN_DETECTOR_AXIS
>
> loop_
> _diffrn_detector_axis.detector_id
> _diffrn_detector_axis.axis_id
> ADSCQ4 DETECTOR_X
> ADSCQ4 DETECTOR_Y
> ADSCQ4 DETECTOR_Z
> ADSCQ4 DETECTOR_PITCH
>
>
> # category DIFFRN_DETECTOR_ELEMENT
>
> loop_
> _diffrn_detector_element.id
> _diffrn_detector_element.detector_id
> ELEMENT1 ADSCQ4
>
>
> # category DIFFRN_DATA_FRAME
>
> loop_
> _diffrn_data_frame.id
> _diffrn_data_frame.detector_element_id
> _diffrn_data_frame.array_id
> _diffrn_data_frame.binary_id
> FRAME1 ELEMENT1 ARRAY1 1
>
>
> # category DIFFRN_MEASUREMENT
>
> loop_
> _diffrn_measurement.diffrn_id
> _diffrn_measurement.id
> _diffrn_measurement.number_of_axes
> _diffrn_measurement.method
> _diffrn_measurement.details
> DIFFRN_ID GONIOMETER 3 rotation
> 'i0=1.000 i1=1.000 i2=1.000 ib=1.000 beamstop=20 mm 0% attenuation'
>
>
> # category DIFFRN_MEASUREMENT_AXIS
>
> loop_
> _diffrn_measurement_axis.measurement_id
> _diffrn_measurement_axis.axis_id
> GONIOMETER GONIOMETER_PHI
> GONIOMETER GONIOMETER_KAPPA
> GONIOMETER GONIOMETER_OMEGA
>
>
> # category DIFFRN_SCAN
>
> loop_
> _diffrn_scan.id
> _diffrn_scan.frame_id_start
> _diffrn_scan.frame_id_end
> _diffrn_scan.frames
> SCAN1 FRAME1 FRAME1 1
>
>
> # category DIFFRN_SCAN_AXIS
>
> loop_
> _diffrn_scan_axis.scan_id
> _diffrn_scan_axis.axis_id
> _diffrn_scan_axis.angle_start
> _diffrn_scan_axis.angle_range
> _diffrn_scan_axis.angle_increment
> _diffrn_scan_axis.displacement_start
> _diffrn_scan_axis.displacement_range
> _diffrn_scan_axis.displacement_increment
> SCAN1 GONIOMETER_OMEGA 0.0 0.0 0.0 0.0 0.0 0.0
> SCAN1 GONIOMETER_KAPPA 0.0 0.0 0.0 0.0 0.0 0.0
> SCAN1 GONIOMETER_PHI 0.0 0.0 0.0 0.0 0.0 0.0
> SCAN1 DETECTOR_Z 0.0 0.0 0.0 0.0 0.0 0.0
> SCAN1 DETECTOR_Y 0.0 0.0 0.0 0.0 0.0 0.0
> SCAN1 DETECTOR_X 0.0 0.0 0.0 0.0 0.0 0.0
> SCAN1 DETECTOR_PITCH 0.0 0.0 0.0 0.0 0.0 0.0
>
>
> # category DIFFRN_SCAN_FRAME
>
> loop_
> _diffrn_scan_frame.frame_id
> _diffrn_scan_frame.frame_number
> _diffrn_scan_frame.integration_time
> _diffrn_scan_frame.scan_id
> _diffrn_scan_frame.date
> FRAME1 1 0.0 SCAN1 1997-12-04T10:23:48
>
>
> # category DIFFRN_SCAN_FRAME_AXIS
>
> loop_
> _diffrn_scan_frame_axis.frame_id
> _diffrn_scan_frame_axis.axis_id
> _diffrn_scan_frame_axis.angle
> _diffrn_scan_frame_axis.displacement
> FRAME1 GONIOMETER_OMEGA 0.0 0.0
> FRAME1 GONIOMETER_KAPPA 0.0 0.0
> FRAME1 GONIOMETER_PHI 0.0 0.0
> FRAME1 DETECTOR_Z 0.0 0.0
> FRAME1 DETECTOR_Y 0.0 0.0
> FRAME1 DETECTOR_X 0.0 0.0
> FRAME1 DETECTOR_PITCH 0.0 0.0
>
>
> # category AXIS
>
> loop_
> _axis.id
> _axis.type
> _axis.equipment
> _axis.depends_on
> _axis.vector[1] _axis.vector[2] _axis.vector[3]
> _axis.offset[1] _axis.offset[2] _axis.offset[3]
> GONIOMETER_OMEGA rotation goniometer . 1 0 0 . . .
> GONIOMETER_KAPPA rotation goniometer GONIOMETER_OMEGA 0.64279 0 0.76604 . . .
> GONIOMETER_PHI rotation goniometer GONIOMETER_KAPPA 1 0 0 . . .
> SOURCE general source . 0 0 1 . . .
> GRAVITY general gravity . 0 -1 0 . . .
> DETECTOR_Z translation detector . 0 0 -1 0 0 0
> DETECTOR_Y translation detector DETECTOR_Z 0 1 0 0 0 0
> DETECTOR_X translation detector DETECTOR_Y 1 0 0 0 0 0
> DETECTOR_PITCH rotation detector DETECTOR_X 0 1 0 0 0 0
> ELEMENT_X translation detector DETECTOR_PITCH 1 0 0 -94.0032 94.0032 0
> ELEMENT_Y translation detector ELEMENT_X 0 1 0 0 0 0
>
>
> # category ARRAY_STRUCTURE_LIST
>
> loop_
> _array_structure_list.array_id
> _array_structure_list.index
> _array_structure_list.dimension
> _array_structure_list.precedence
> _array_structure_list.direction
> _array_structure_list.axis_set_id
> ARRAY1 1 2304 1 increasing ELEMENT_X
> ARRAY1 2 2304 2 increasing ELEMENT_Y
>
>
> # category ARRAY_STRUCTURE_LIST_AXIS
>
> loop_
> _array_structure_list_axis.axis_set_id
> _array_structure_list_axis.axis_id
> _array_structure_list_axis.displacement
> _array_structure_list_axis.displacement_increment
> ELEMENT_X ELEMENT_X 0.0408 0.0816
> ELEMENT_Y ELEMENT_Y -0.0408 -0.0816
>
>
> # category ARRAY_INTENSITIES
>
> loop_
> _array_intensities.array_id
> _array_intensities.binary_id
> _array_intensities.linearity
> _array_intensities.gain
> _array_intensities.gain_esd
> _array_intensities.overload
> _array_intensities.undefined_value
> ARRAY1 1 linear 0.23 0.03 65000 0
>
>
> # category ARRAY_STRUCTURE
>
> loop_
> _array_structure.id
> _array_structure.encoding_type
> _array_structure.compression_type
> _array_structure.byte_order
> ARRAY1 "signed 32-bit integer" packed little_endian
>
>
> # category ARRAY_DATA
>
> loop_
> _array_data.array_id
> _array_data.binary_id
> _array_data.data
> ARRAY1 1 ?
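
As a rough illustration of why the _axis.depends_on column matters, here is a minimal Python sketch that follows the dependency links in the _axis loop of the template above, from a detector element or goniometer axis back to its base axis. The names AXIS_ROWS, parse_axes and dependency_chain are made up for this example and are not part of CBFlib or any existing package. The rows are simply copied from the template and split on whitespace, which only works because none of these particular rows contain quoted strings; a real reader would need a proper CIF parser.

```python
# Rows copied from the _axis loop of the template; "." means "no value".
# Column order: id, type, equipment, depends_on, vector[1..3], offset[1..3].
AXIS_ROWS = """
GONIOMETER_OMEGA rotation goniometer . 1 0 0 . . .
GONIOMETER_KAPPA rotation goniometer GONIOMETER_OMEGA 0.64279 0 0.76604 . . .
GONIOMETER_PHI rotation goniometer GONIOMETER_KAPPA 1 0 0 . . .
SOURCE general source . 0 0 1 . . .
GRAVITY general gravity . 0 -1 0 . . .
DETECTOR_Z translation detector . 0 0 -1 0 0 0
DETECTOR_Y translation detector DETECTOR_Z 0 1 0 0 0 0
DETECTOR_X translation detector DETECTOR_Y 1 0 0 0 0 0
DETECTOR_PITCH rotation detector DETECTOR_X 0 1 0 0 0 0
ELEMENT_X translation detector DETECTOR_PITCH 1 0 0 -94.0032 94.0032 0
ELEMENT_Y translation detector ELEMENT_X 0 1 0 0 0 0
"""


def parse_axes(text):
    """Map each axis id to the axis it depends on (None for a base axis)."""
    parents = {}
    for line in text.strip().splitlines():
        fields = line.split()
        axis_id, depends_on = fields[0], fields[3]
        parents[axis_id] = None if depends_on == "." else depends_on
    return parents


def dependency_chain(parents, axis_id):
    """List axis_id followed by every axis it depends on, innermost first."""
    chain = []
    seen = set()
    current = axis_id
    while current is not None:
        if current not in parents:
            raise KeyError(f"axis {current!r} is not defined in the _axis loop")
        if current in seen:
            raise ValueError(f"cyclic dependency involving axis {current!r}")
        seen.add(current)
        chain.append(current)
        current = parents[current]
    return chain


if __name__ == "__main__":
    axes = parse_axes(AXIS_ROWS)
    for name in ("GONIOMETER_PHI", "ELEMENT_Y"):
        print(name, "->", " -> ".join(dependency_chain(axes, name)))
```

A check of this kind (every depends_on target defined, no cycles) is one simple validation that becomes possible once the axis list is mandatory.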