nexusLIMS.schemas package

Submodules

nexusLIMS.schemas.activity module

class nexusLIMS.schemas.activity.AcquisitionActivity(start=datetime.datetime(2021, 11, 29, 16, 47, 43, 296775), end=datetime.datetime(2021, 11, 29, 16, 47, 43, 296778), mode='', unique_params=None, setup_params=None, unique_meta=None, files=None, previews=None, sigs=None, meta=None, warnings=None)[source]

Bases: object

A collection of files/metadata attributed to a physical acquisition activity

Instances of this class correspond to AcquisitionActivity nodes in the NexusLIMS schema

Parameters:
  • start (datetime.datetime) -- The start point of this AcquisitionActivity

  • end (datetime.datetime) -- The end point of this AcquisitionActivity

  • mode (str) -- The microscope mode for this AcquisitionActivity (e.g. 'IMAGING', 'DIFFRACTION', 'SCANNING', etc.)

  • unique_params (set) -- A set of dictionary keys that comprises all unique metadata keys contained within the files of this AcquisitionActivity

  • setup_params (dict) -- A dictionary containing metadata about the data that is shared amongst all data files in this AcquisitionActivity

  • unique_meta (list) -- A list of dictionaries (one for each file in this AcquisitionActivity) containing metadata key-value pairs that are unique to each file in files (i.e. those that could not be moved into setup_params)

  • files (list) -- A list of filenames belonging to this AcquisitionActivity

  • previews (list) -- A list of filenames pointing to the previews for each file in files

  • sigs (list) -- A list of lazy (to minimize loading times) HyperSpy signals in this AcquisitionActivity. HyperSpy is used to facilitate metadata reading

  • meta (list) -- A list of dictionaries containing the "important" metadata for each signal/file in sigs and files

  • warnings (list) -- A list of metadata values that may be untrustworthy because of the software

add_file(fname, generate_preview=True)[source]

Add a file to this activity's file list, parse its metadata (storing a flattened copy of it in this activity), generate a preview thumbnail, determine the file's type, and load a lazy HyperSpy signal

Parameters:
  • fname (str) -- The file to be added to the file list

  • generate_preview (bool) -- Whether or not to create the preview thumbnail images

as_xml(seqno, sample_id, indent_level=1, print_xml=False)[source]

Build an XML string representation of this AcquisitionActivity (for use in instances of the NexusLIMS schema)

Parameters:
  • seqno (int) -- An integer number representing what number activity this is in a sequence of activities.

  • sample_id (str) -- A unique identifier for the sample. No checks are done on this value; it is merely reproduced in the XML output

  • indent_level (int) -- The level of indentation to use when exporting (Default: 1). If 0, no lines will be indented. A value of 1 should be appropriate for most cases as used in the Nexus schema

  • print_xml (bool) -- Whether to print the XML output to the console or not (Default: False)

Returns:

activity_xml -- A string representing this AcquisitionActivity (note: this is not a well-formed, complete XML document, since it has no header or namespace definitions)

Return type:

str
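The indentation behaviour described above can be illustrated with a small standalone sketch. It is not the NexusLIMS implementation: the function name activity_fragment and the element names (acquisitionActivity, sampleID, setup, param) are assumptions chosen for illustration, and only the handling of seqno, sample_id, and indent_level follows the description above.

```python
from xml.sax.saxutils import escape, quoteattr


def activity_fragment(seqno, sample_id, setup_params, indent_level=1):
    """Build an XML fragment (no header, no namespaces) for one activity.

    Hypothetical helper: element names are illustrative assumptions.
    """
    # Two spaces per level is an assumption; indent_level=0 disables indentation
    pad = "  " * indent_level
    lines = [
        f"{pad}<acquisitionActivity seqno={quoteattr(str(seqno))}>",
        # sample_id is reproduced as given, apart from XML escaping
        f"{pad * 2}<sampleID>{escape(sample_id)}</sampleID>",
        f"{pad * 2}<setup>",
    ]
    for name, value in sorted(setup_params.items()):
        lines.append(
            f"{pad * 3}<param name={quoteattr(name)}>{escape(str(value))}</param>"
        )
    lines += [f"{pad * 2}</setup>", f"{pad}</acquisitionActivity>"]
    return "\n".join(lines)
```

Because the result has a single root element but no XML declaration or namespace definitions, it is intended to be embedded in a larger document rather than saved on its own.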

store_setup_params(values_to_search=None)[source]

Search the metadata of files in this AcquisitionActivity for those containing identical values over all files, which will then be defined as parameters attributed to experimental setup, rather than individual datasets.

Stores a dictionary containing the metadata keys and values that are consistent across all files in this AcquisitionActivity as an attribute (self.setup_params).

Parameters:

values_to_search (list) -- A list (or tuple, set, or other iterable type) containing values to search for in the metadata dictionary list. If None (default), all values contained in any file will be searched.
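The idea of separating setup parameters from per-file metadata can be sketched on plain dictionaries. This is a minimal illustration, not the method's actual code: the function name common_setup_params is hypothetical, values_to_search is interpreted as a collection of metadata keys, and a key is assumed to count as shared only if it is present with an equal value in every file.

```python
def common_setup_params(meta, values_to_search=None):
    """Return the key/value pairs that are identical across every dict in meta.

    meta is a list of per-file metadata dictionaries (one per file).
    """
    if not meta:
        return {}
    if values_to_search is None:
        # Default: consider every key that appears in any file
        keys = set().union(*(m.keys() for m in meta))
    else:
        keys = set(values_to_search)

    setup = {}
    for key in keys:
        # Assumption: the key must exist in all files and have equal values
        if all(key in m for m in meta) and all(
            m[key] == meta[0][key] for m in meta
        ):
            setup[key] = meta[0][key]
    return setup


meta = [
    {"Voltage": 200, "Mode": "IMAGING", "Exposure": 1.0},
    {"Voltage": 200, "Mode": "IMAGING", "Exposure": 2.0},
]
setup_params = common_setup_params(meta)
```

Here Voltage and Mode would be treated as setup parameters, while Exposure differs between files and stays with the individual datasets.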

store_unique_metadata()[source]

For each file in this AcquisitionActivity, stores the metadata that is unique to that file rather than common to the entire AcquisitionActivity (the common values are kept in self.setup_params).
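As a rough sketch of this step on plain dictionaries (the function name unique_metadata is hypothetical, and the real method works on the instance's attributes rather than on arguments), the per-file unique metadata is whatever remains once the shared setup keys are removed:

```python
def unique_metadata(meta, setup_params):
    """For each file's metadata dict, keep only keys not in setup_params."""
    return [
        {key: value for key, value in m.items() if key not in setup_params}
        for m in meta
    ]


meta = [
    {"Voltage": 200, "Mode": "IMAGING", "Exposure": 1.0},
    {"Voltage": 200, "Mode": "IMAGING", "Exposure": 2.0},
]
# Assumed to have been computed beforehand from the shared values
setup_params = {"Voltage": 200, "Mode": "IMAGING"}
unique_meta = unique_metadata(meta, setup_params)
```

The result has one dictionary per file, in the same order as the input list, which matches the description of unique_meta in the class parameters.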

store_unique_params()[source]

Analyze the metadata keys contained in this AcquisitionActivity and store the unique keys in a set (self.unique_params)
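On plain dictionaries, this amounts to taking the union of the keys across all files. The following is only an illustrative sketch with a hypothetical function name (unique_param_keys); the actual method stores its result on the instance instead of returning it.

```python
def unique_param_keys(meta):
    """Collect every metadata key that appears in at least one file."""
    keys = set()
    for m in meta:
        keys.update(m)  # iterating a dict yields its keys
    return keys


meta = [
    {"Voltage": 200, "Exposure": 1.0},
    {"Voltage": 200, "Tilt": 15.0},
]
unique_params = unique_param_keys(meta)
```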

nexusLIMS.schemas.activity.cluster_filelist_mtimes(filelist)[source]

Perform a statistical clustering of the timestamps (mtime values) of a list of files to find "relatively" large gaps in acquisition time. The definition of "relatively" depends on the context of the entire list of files. For example, if many files are acquired simultaneously, the "inter-file" time spacing between them will be very small (near zero), meaning even fairly short gaps between files may be important. Conversely, if files are saved every 30 seconds or so, the tolerance for a "large gap" will need to be correspondingly larger.

The approach this method uses is to detect minima in the Kernel Density Estimation (KDE) of the file modification times. To determine the optimal bandwidth parameter to use in KDE, a grid search over possible appropriate bandwidths is performed, using Leave One Out cross-validation. This approach allows the method to determine the important gaps in file acquisition times with sensitivity controlled by the distribution of the data itself, rather than a pre-supposed optimum. The KDE minima approach was suggested here.

Parameters:

filelist (list) -- The files (as a list) whose timestamps will be interrogated to find "relatively" large gaps in acquisition time (as a means to find the breaks between discrete Acquisition Activities)

Returns:

aa_boundaries -- A list of the mtime values that represent boundaries between discrete Acquisition Activities

Return type:

list
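The approach described above (KDE minima, with the bandwidth chosen by leave-one-out cross-validation over a grid) can be sketched without third-party dependencies as follows. This is a simplified illustration, not the function's actual implementation: it takes timestamps directly rather than file paths, uses a Gaussian kernel, and the function name kde_gap_boundaries, the log-spaced bandwidth candidates, and the evaluation grid size are all assumptions.

```python
import math


def _kde(x, points, h):
    """Gaussian kernel density estimate at x with bandwidth h."""
    norm = len(points) * h * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - p) / h) ** 2) for p in points) / norm


def _loo_log_likelihood(points, h):
    """Sum of log densities, each point scored by a KDE fit without it."""
    total = 0.0
    for i, p in enumerate(points):
        dens = _kde(p, points[:i] + points[i + 1:], h)
        if dens <= 0.0:  # underflow: this bandwidth is far too narrow
            return float("-inf")
        total += math.log(dens)
    return total


def kde_gap_boundaries(mtimes, n_bandwidths=25, n_grid=1000):
    """Return timestamps at KDE minima, i.e. boundaries between clusters."""
    points = sorted(float(t) for t in mtimes)
    span = points[-1] - points[0] if points else 0.0
    if len(points) < 3 or span == 0:
        return []

    # Grid search: candidate bandwidths from span/1000 up to span (assumed range)
    candidates = [
        span * 10 ** (-3 + 3 * k / (n_bandwidths - 1))
        for k in range(n_bandwidths)
    ]
    h = max(candidates, key=lambda b: _loo_log_likelihood(points, b))

    # Evaluate the density on a grid padded by 3 bandwidths on each side
    lo, hi = points[0] - 3 * h, points[-1] + 3 * h
    xs = [lo + (hi - lo) * i / (n_grid - 1) for i in range(n_grid)]
    dens = [_kde(x, points, h) for x in xs]

    # Locate the density peaks, then take the lowest point between each pair
    peaks = [
        i for i in range(1, n_grid - 1)
        if dens[i] > dens[i - 1] and dens[i] >= dens[i + 1]
    ]
    boundaries = []
    for left, right in zip(peaks, peaks[1:]):
        i_min = min(range(left, right + 1), key=dens.__getitem__)
        boundaries.append(xs[i_min])
    return boundaries


# Two bursts of four files each, separated by a long pause
boundaries = kde_gap_boundaries([0, 1, 2, 3, 100, 101, 102, 103])
```

Taking the lowest point between adjacent peaks, rather than testing each grid point against its neighbours, keeps the sketch robust when the density underflows to a flat run of zeros inside a very long gap. In practice the timestamps would come from something like os.path.getmtime applied to each file in filelist.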