nexusLIMS.schemas package
Submodules
nexusLIMS.schemas.activity module
class nexusLIMS.schemas.activity.AcquisitionActivity(start=datetime.datetime(2021, 11, 29, 16, 47, 43, 296775), end=datetime.datetime(2021, 11, 29, 16, 47, 43, 296778), mode='', unique_params=None, setup_params=None, unique_meta=None, files=None, previews=None, sigs=None, meta=None, warnings=None)
Bases: object
A collection of files/metadata attributed to a physical acquisition activity
Instances of this class correspond to AcquisitionActivity nodes in the NexusLIMS schema
- Parameters:
start (datetime.datetime) -- The start point of this AcquisitionActivity
end (datetime.datetime) -- The end point of this AcquisitionActivity
mode (str) -- The microscope mode for this AcquisitionActivity (e.g. 'IMAGING', 'DIFFRACTION', 'SCANNING', etc.)
unique_params (set) -- A set of dictionary keys that comprises all unique metadata keys contained within the files of this AcquisitionActivity
setup_params (dict) -- A dictionary containing metadata about the data that is shared amongst all data files in this AcquisitionActivity
unique_meta (list) -- A list of dictionaries (one for each file in this AcquisitionActivity) containing metadata key-value pairs that are unique to each file in files (i.e. those that could not be moved into setup_params)
files (list) -- A list of filenames belonging to this AcquisitionActivity
previews (list) -- A list of filenames pointing to the previews for each file in files
sigs (list) -- A list of lazy (to minimize loading times) HyperSpy signals in this AcquisitionActivity. HyperSpy is used to facilitate metadata reading
meta (list) -- A list of dictionaries containing the "important" metadata for each signal/file in sigs and files
warnings (list) -- A list of metadata values that may be untrustworthy because of the software
add_file(fname, generate_preview=True)
Add a file to this activity's file list, parse its metadata (storing a flattened copy of it to this activity), generate a preview thumbnail, and get the file's type and a lazy HyperSpy signal
as_xml(seqno, sample_id, indent_level=1, print_xml=False)
Build an XML string representation of this AcquisitionActivity (for use in instances of the NexusLIMS schema)
- Parameters:
seqno (int) -- An integer number representing what number activity this is in a sequence of activities.
sample_id (str) -- A unique identifier pointing to a sample identifier. No checks are done on this value; it is merely reproduced in the XML output
indent_level (int) -- (Default is 1) the level of indentation to use in exporting. If 0, no lines will be indented. A value of 1 should be appropriate for most cases as used in the Nexus schema
print_xml (bool) -- Whether to print the XML output to the console or not (Default: False)
- Returns:
activity_xml -- A string representing this AcquisitionActivity (note: is not a properly-formed complete XML document since it does not have a header or namespace definitions)
- Return type:
str
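The effect of indent_level, and the reason the returned string is only a fragment, can be illustrated with a small standalone sketch. It does not call as_xml; the helper indent_fragment, the element names (activity, sample, record), and the two-space indent are all placeholders chosen for illustration and are not taken from the NexusLIMS schema.

```python
import xml.etree.ElementTree as ET


def indent_fragment(lines, indent_level=1, indent="  "):
    """Prefix every line with indent_level copies of indent (0 = no indent)."""
    prefix = indent * indent_level
    return "\n".join(prefix + line for line in lines)


# Placeholder element names, not taken from the NexusLIMS schema.
fragments = [
    indent_fragment(
        [f'<activity seqno="{seqno}">', "  <sample>sample-1</sample>", "</activity>"],
        indent_level=1,
    )
    for seqno in range(2)
]

# Two sibling fragments are not a document on their own; an enclosing
# root element (hypothetical here) turns them into parseable XML.
document = "<record>\n" + "\n".join(fragments) + "\n</record>"
root = ET.fromstring(document)
seqnos = [child.get("seqno") for child in root]
```

With indent_level=1 each fragment sits one level below the enclosing element, which is presumably why that value suits most uses inside a larger record.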
store_setup_params(values_to_search=None)
Search the metadata of files in this AcquisitionActivity for those containing identical values over all files, which will then be defined as parameters attributed to experimental setup, rather than individual datasets.
Stores a dictionary containing the metadata keys and values that are consistent across all files in this AcquisitionActivity as an attribute (self.setup_params).
- Parameters:
values_to_search (list) -- A list (or tuple, set, or other iterable type) containing values to search for in the metadata dictionary list. If None (default), all values contained in any file will be searched.
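The idea of separating shared values from per-file values can be sketched independently of the class. The function below is a minimal illustration, assuming each file's metadata is already a flat dictionary; split_setup_params and its return shape are hypothetical and are not the library's implementation.

```python
def split_setup_params(file_metadata, values_to_search=None):
    """Split per-file metadata into shared and per-file parts.

    file_metadata is a list of flat dicts, one per file (a stand-in for
    the flattened metadata an activity stores).
    """
    if not file_metadata:
        return {}, []

    # Default: consider every key that appears in any file.
    if values_to_search is None:
        keys = set().union(*(m.keys() for m in file_metadata))
    else:
        keys = set(values_to_search)

    setup_params = {}
    for key in keys:
        # A key qualifies only if every file has it with the same value.
        if all(key in m for m in file_metadata):
            first = file_metadata[0][key]
            if all(m[key] == first for m in file_metadata):
                setup_params[key] = first

    # Whatever was not shared stays with the individual file.
    unique_meta = [
        {k: v for k, v in m.items() if k not in setup_params}
        for m in file_metadata
    ]
    return setup_params, unique_meta


meta = [
    {"Voltage": 200, "Mode": "IMAGING", "Exposure": 0.5},
    {"Voltage": 200, "Mode": "IMAGING", "Exposure": 1.0},
]
shared, per_file = split_setup_params(meta)
```

Here Voltage and Mode end up in the shared dictionary because both files agree on them, while Exposure remains per-file.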
nexusLIMS.schemas.activity.cluster_filelist_mtimes(filelist)
Perform a statistical clustering of the timestamps (mtime values) of a list of files to find "relatively" large gaps in acquisition time. The definition of relatively depends on the context of the entire list of files. For example, if many files are simultaneously acquired, the "inter-file" time spacing between these will be very small (near zero), meaning even fairly short gaps between files may be important. Conversely, if files are saved every 30 seconds or so, the tolerance for a "large gap" will need to be correspondingly larger.
The approach this method uses is to detect minima in the Kernel Density Estimation (KDE) of the file modification times. To determine the optimal bandwidth parameter to use in KDE, a grid search over possible appropriate bandwidths is performed, using Leave One Out cross-validation. This approach allows the method to determine the important gaps in file acquisition times with sensitivity controlled by the distribution of the data itself, rather than a pre-supposed optimum. The KDE minima approach was suggested here.
- Parameters:
filelist (list) -- The files (as a list) whose timestamps will be interrogated to find "relatively" large gaps in acquisition time (as a means to find the breaks between discrete Acquisition Activities)
- Returns:
aa_boundaries -- A list of the mtime values that represent boundaries between discrete Acquisition Activities
- Return type:
list
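The KDE-minima approach described for cluster_filelist_mtimes can be sketched with the standard library alone. This is a simplified illustration under stated assumptions, not the module's code: it takes mtime values directly (for real files they could come from os.path.getmtime), uses a Gaussian kernel, scores each candidate bandwidth by leave-one-out log-likelihood, and looks for minima on an evenly spaced grid. The function names, the bandwidth candidates, and the grid size are all hypothetical choices.

```python
import math


def log_kde(x, points, bandwidth):
    """Log of a Gaussian KDE at x, using log-sum-exp to avoid underflow."""
    exponents = [-0.5 * ((x - p) / bandwidth) ** 2 for p in points]
    peak = max(exponents)
    log_sum = peak + math.log(sum(math.exp(e - peak) for e in exponents))
    return log_sum - math.log(len(points) * bandwidth * math.sqrt(2 * math.pi))


def loo_log_likelihood(times, bandwidth):
    """Leave-one-out score: each point is evaluated against all the others."""
    return sum(
        log_kde(t, times[:i] + times[i + 1:], bandwidth)
        for i, t in enumerate(times)
    )


def find_activity_boundaries(mtimes, bandwidths, n_grid=500):
    """Return grid positions where the estimated density has a local minimum."""
    times = sorted(mtimes)
    if len(times) < 2 or times[0] == times[-1]:
        return []

    # Grid search: keep the bandwidth with the best leave-one-out score.
    best = max(bandwidths, key=lambda bw: loo_log_likelihood(times, bw))

    step = (times[-1] - times[0]) / (n_grid - 1)
    grid = [times[0] + i * step for i in range(n_grid)]
    dens = [log_kde(x, times, best) for x in grid]

    # Minima of the log-density coincide with minima of the density.
    # A tie between two neighbouring grid points resolves to the left one.
    return [
        grid[i]
        for i in range(1, n_grid - 1)
        if dens[i] < dens[i - 1] and dens[i] <= dens[i + 1]
    ]


# Two bursts of files separated by a long pause (times in seconds).
mtimes = [0, 2, 3, 5, 6, 100, 101, 103, 104, 107, 108]
boundaries = find_activity_boundaries(mtimes, bandwidths=[1, 2, 5, 10, 20])
```

Because the bandwidth is selected from the data, the same code adapts to tightly spaced or widely spaced files; depending on the bandwidth chosen, smaller minima inside a burst may also be reported alongside the one in the long pause.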
