nexusLIMS.extractors package

This module contains the code used to harvest metadata from various file types generated from instruments in the Electron Microscopy Nexus facility.

Extractors should return a dictionary containing the values to be displayed in NexusLIMS as a sub-dictionary under the key nx_meta. The remaining keys will be for the metadata as extracted. Under nx_meta, a few keys are expected (although not enforced):

  • 'Creation Time' - ISO format date and time as a string

  • 'Data Type' - a human-readable description of the data type, with words separated by underscores - e.g. "STEM_Imaging", "TEM_EDS", etc.

  • 'DatasetType' - determines the value of the Type attribute for the dataset (defined in the schema)

  • 'Data Dimensions' - dimensions of the dataset as a string, surrounded by parentheses and separated by commas - e.g. '(12, 1024, 1024)'

  • 'Instrument ID' - instrument PID pulled from the instrument database
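
As an illustration of the layout described above, a returned dictionary might look like the following sketch. Every value here is a made-up placeholder (including the 'DatasetType' value and the instrument PID), and the 'raw_tag' key simply stands in for whatever metadata an extractor found in the file.

```python
# Illustrative only: every value below is a placeholder, not real instrument data.
dimensions = (12, 1024, 1024)

example_metadata = {
    "nx_meta": {
        "Creation Time": "2020-01-01T12:00:00",   # ISO format string
        "Data Type": "STEM_Imaging",              # words joined by underscores
        "DatasetType": "Image",                   # assumed example value
        "Data Dimensions": str(dimensions),       # string form of a tuple
        "Instrument ID": "example-instrument-pid",
    },
    # The remaining root-level keys hold the metadata as extracted.
    "raw_tag": {"nested": "value"},
}

# The expected keys are not enforced, so a consumer may want to check for them.
expected_keys = {
    "Creation Time",
    "Data Type",
    "DatasetType",
    "Data Dimensions",
    "Instrument ID",
}
missing = expected_keys - example_metadata["nx_meta"].keys()
```

Because the keys are expected but not enforced, a check like the one at the end is one way for downstream code to notice an incomplete nx_meta dictionary.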

nexusLIMS.extractors.flatten_dict(d, parent_key='', separator=' ')[source]

Utility method to take a nested dictionary structure and flatten it into a single level, separating the levels by a string as specified by separator

Cribbed from: https://stackoverflow.com/a/6027615/1435788

Parameters:
  • d (dict) -- The dictionary to flatten

  • parent_key (str) -- The "root" key to add to the existing keys

  • separator (str) -- The string to use to separate values in the flattened keys (i.e. {'a': {'b': 'c'}} would become {'a' + separator + 'b': 'c'})

Returns:

flattened_dict -- The dictionary with depth one, with nested dictionaries flattened into root-level keys

Return type:

dict
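
The flattening behaviour can be sketched as below. This is a minimal illustration in the spirit of the linked Stack Overflow answer, not necessarily the exact code in the package; the name flatten_dict_sketch is hypothetical, and string keys are assumed.

```python
from collections.abc import MutableMapping


def flatten_dict_sketch(d, parent_key="", separator=" "):
    """Flatten a nested dictionary into a single level (illustrative sketch)."""
    items = []
    for key, value in d.items():
        # Prefix the key with its parent's key, unless we are at the root.
        new_key = parent_key + separator + key if parent_key else key
        if isinstance(value, MutableMapping):
            # Recurse into nested dictionaries, carrying the joined key along.
            items.extend(flatten_dict_sketch(value, new_key, separator).items())
        else:
            items.append((new_key, value))
    return dict(items)


flat = flatten_dict_sketch({"a": {"b": "c"}, "d": 1}, separator=".")
```

With separator='.', the nested key path a -> b becomes the single key 'a.b', while the top-level key 'd' is left unchanged.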

nexusLIMS.extractors.parse_metadata(fname, write_output=True, generate_preview=True, overwrite=True)[source]

Given an input filename, read the file, determine what "type" of file (i.e. what instrument it came from) it is, filter the metadata (if necessary) to what we are interested in, and return it as a dictionary (writing to the NexusLIMS directory as JSON by default). Also calls the preview generation method, if desired.

Parameters:
  • fname (str) -- The filename from which to read data

  • write_output (bool) -- Whether to write the metadata dictionary as a json file in the NexusLIMS folder structure

  • generate_preview (bool) -- Whether to generate the thumbnail preview of this dataset (the preview is not generated by this method itself; it is only called from here so that both steps can be done at the same time)

  • overwrite (bool) -- Whether or not to overwrite the .json metadata file and thumbnail image if either exists

Returns:

  • nx_meta (dict or None) -- The "relevant" metadata that is of use for NexusLIMS. If None, the file could not be opened

  • preview_fname (str or None) -- The file path of the generated preview image, or None if it was not requested
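
One way to picture the "determine what type of file" step is a lookup from file extension to an extractor function. The sketch below is an assumption about the general approach, not the package's actual logic: the table name, the stub extractors, and the choice to dispatch on extension alone are all hypothetical.

```python
import os


def _stub_extractor(label):
    """Build a stand-in extractor; real extractors would parse the file."""
    def extract(fname):
        return {"nx_meta": {"stub source": label}, "path": fname}
    return extract


# Hypothetical lookup table: extension -> extractor function.
EXTENSION_READERS = {
    ".dm3": _stub_extractor("digital_micrograph"),
    ".dm4": _stub_extractor("digital_micrograph"),
    ".ser": _stub_extractor("fei_emi"),
    ".tif": _stub_extractor("quanta_tif"),
}


def pick_extractor(fname):
    """Return the extractor registered for this file's extension, or None."""
    extension = os.path.splitext(fname)[1].lower()
    return EXTENSION_READERS.get(extension)


extractor = pick_extractor("session/image_001.DM3")
result = extractor("session/image_001.DM3") if extractor else None
```

Returning None for an unknown extension lets the caller decide how to handle unsupported files.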

Submodules

nexusLIMS.extractors.digital_micrograph module

nexusLIMS.extractors.digital_micrograph.get_dm3_metadata(filename)[source]

Returns the metadata (as a dict) from a .dm3 file saved by Gatan's DigitalMicrograph in the Nexus Microscopy Facility, with some non-relevant information stripped out, and instrument-specific metadata parsed and added by one of the instrument-specific parsers.

Parameters:

filename (str) -- path to a .dm3 file saved by Gatan's Digital Micrograph

Returns:

metadata -- The metadata of interest extracted from the file. If None, the file could not be opened

Return type:

dict or None

nexusLIMS.extractors.digital_micrograph.get_pre_path(mdict)[source]

Get the path into a dictionary where the important DigitalMicrograph metadata is expected to be found. If the .dm3/.dm4 file contains a stack of images, the important metadata for NexusLIMS is not at its usual place and is instead under a plane info tag, so this method will determine if the stack metadata is present and return the correct path. pre_path will be something like ['ImageList', 'TagGroup0', 'ImageTags', 'plane info', 'TagGroup0', 'source tags'].

Parameters:

mdict (dict) -- A metadata dictionary as returned by get_dm3_metadata()

Returns:

pre_path -- A list containing the subsequent keys that need to be traversed to get to the point in the mdict where the important metadata is stored

Return type:

list
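
A list of keys such as pre_path can be walked with a small helper like the one sketched here. The helper name get_nested and the toy dictionary contents are hypothetical; only the example path comes from the description above.

```python
from functools import reduce


def get_nested(mdict, path):
    """Follow each key in path, in order, and return the node reached."""
    return reduce(lambda node, key: node[key], path, mdict)


pre_path = ["ImageList", "TagGroup0", "ImageTags",
            "plane info", "TagGroup0", "source tags"]

# Build a toy nested dictionary that contains the example path.
toy_mdict = {}
node = toy_mdict
for key in pre_path:
    node[key] = {}
    node = node[key]
node["Microscope Info"] = {"Voltage": 300000}  # placeholder value

voltage = get_nested(toy_mdict, pre_path + ["Microscope Info", "Voltage"])
```

Appending further keys to pre_path, as in the last line, reaches values stored below the returned location.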

nexusLIMS.extractors.digital_micrograph.parse_642_jeol(mdict)[source]

Add/adjust metadata specific to the 642 JEOL microscope ('**REMOVED** in **REMOVED**') to the metadata dictionary

Parameters:

mdict (dict) -- "raw" metadata dictionary as parsed by get_dm3_metadata()

Returns:

mdict -- The original metadata dictionary with added information specific to files originating from this microscope with "important" values contained under the nx_meta key at the root level

Return type:

dict

nexusLIMS.extractors.digital_micrograph.parse_642_titan(mdict)[source]

Add/adjust metadata specific to the 642 FEI Titan ('**REMOVED** in **REMOVED**') to the metadata dictionary

Parameters:

mdict (dict) -- "raw" metadata dictionary as parsed by get_dm3_metadata()

Returns:

mdict -- The original metadata dictionary with added information specific to files originating from this microscope with "important" values contained under the nx_meta key at the root level

Return type:

dict

nexusLIMS.extractors.digital_micrograph.parse_643_titan(mdict)[source]

Add/adjust metadata specific to the 643 FEI Titan ('**REMOVED** in **REMOVED**') to the metadata dictionary

Parameters:

mdict (dict) -- "raw" metadata dictionary as parsed by get_dm3_metadata()

Returns:

mdict -- The original metadata dictionary with added information specific to files originating from this microscope with "important" values contained under the nx_meta key at the root level

Return type:

dict

nexusLIMS.extractors.digital_micrograph.parse_dm3_eds_info(mdict)[source]

Parses metadata from the DigitalMicrograph tag structure that concerns any EDS acquisition or spectrometer settings, placing it in an EDS dictionary underneath the root-level nx_meta node. Metadata values that are commonly incorrect or may be placeholders are specified in a list under the nx_meta.warnings node.

Parameters:

mdict (dict) -- A metadata dictionary as returned by get_dm3_metadata()

Returns:

mdict -- The metadata dictionary with all the "EDS-specific" metadata added as sub-node under the nx_meta root level dictionary

Return type:

dict

nexusLIMS.extractors.digital_micrograph.parse_dm3_eels_info(mdict)[source]

Parses metadata from the DigitalMicrograph tag structure that concerns any EELS acquisition or spectrometer settings, placing it in an EELS dictionary underneath the root-level nx_meta node.

Parameters:

mdict (dict) -- A metadata dictionary as returned by get_dm3_metadata()

Returns:

mdict -- The metadata dictionary with all the "EELS-specific" metadata added as sub-node under the nx_meta root level dictionary

Return type:

dict

nexusLIMS.extractors.digital_micrograph.parse_dm3_microscope_info(mdict)[source]

Parse the "important" metadata that is saved at specific places within the DM3 tag structure into a consistent place in the metadata dictionary returned by get_dm3_metadata(). Specifically looks at the "Microscope Info", "Session Info", and "Meta Data" nodes of the tag structure (these are not present on every microscope)

Parameters:

mdict (dict) -- A metadata dictionary as returned by get_dm3_metadata()

Returns:

mdict -- The same metadata dictionary with some values added under the root-level nx_meta key

Return type:

dict

nexusLIMS.extractors.digital_micrograph.parse_dm3_spectrum_image_info(mdict)[source]

Parses metadata from the DigitalMicrograph tag structure that concerns any spectrum imaging information (from the "SI" tag) and places it in a "Spectrum Imaging" dictionary underneath the root-level nx_meta node. Metadata values that are commonly incorrect or may be placeholders are specified in a list under the nx_meta.warnings node.

Parameters:

mdict (dict) -- A metadata dictionary as returned by get_dm3_metadata()

Returns:

mdict -- The metadata dictionary with all the "Spectrum Imaging-specific" metadata added as sub-node under the nx_meta root level dictionary

Return type:

dict

nexusLIMS.extractors.digital_micrograph.process_tecnai_microscope_info(microscope_info, delimiter='\u2028')[source]

Process the Microscope_Info metadata string from an FEI Titan TEM into a dictionary of key-value pairs

Parameters:
  • microscope_info (str) -- The string of data obtained from the original_metadata.ImageList.TagGroup0.ImageTags.Tecnai.Microscope_Info leaf of the metadata tree obtained when loading a .dm3 file as a HyperSpy signal

  • delimiter (str) -- The value (a unicode string) used to split the microscope_info string. Should not need to be provided (this value is hard-coded in DigitalMicrograph), but specified as a parameter for future flexibility

Returns:

info_dict -- The information contained in the string, in a more easily-digestible form.

Return type:

dict

nexusLIMS.extractors.fei_emi module

nexusLIMS.extractors.fei_emi.get_emi_from_ser(ser_fname)[source]

Get the accompanying .emi filename from a .ser filename. This method assumes that the .ser file will have the same name as the .emi file, but with an underscore and a digit appended, e.g. file.emi would result in .ser files named file_1.ser, file_2.ser, etc.

Parameters:

ser_fname (str) -- The absolute path of an FEI TIA .ser data file

Returns:

  • emi_fname (str) -- The absolute path of the accompanying .emi metadata file

  • index (int) -- The number of this .ser file (i.e. 1, 2, 3, etc.)

Raises:

FileNotFoundError -- If the accompanying .emi file cannot be resolved to be a file
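
The naming convention described above can be sketched with a regular expression. This is an illustration only: the helper name is hypothetical, one or more trailing digits are accepted as an assumption, and the check that the .emi file actually exists (which is what raises FileNotFoundError in the documented function) is left out so the example runs without any files on disk.

```python
import os
import re


def emi_name_from_ser(ser_fname):
    """Derive the .emi path and the .ser index from a .ser path (sketch)."""
    root, _extension = os.path.splitext(ser_fname)
    # Expect "<emi root>_<index>"; the greedy first group keeps any earlier
    # underscores as part of the .emi name.
    match = re.fullmatch(r"(.*)_(\d+)", root)
    if match is None:
        raise ValueError(f"{ser_fname!r} does not end in '_<number>.ser'")
    return match.group(1) + ".emi", int(match.group(2))


emi_fname, index = emi_name_from_ser("/data/session/file_2.ser")
```

Here '/data/session/file_2.ser' resolves to '/data/session/file.emi' with index 2.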

nexusLIMS.extractors.fei_emi.get_ser_metadata(filename)[source]

Returns metadata (as a dict) from an FEI .ser file + its associated .emi files, with some non-relevant information stripped.

Parameters:

filename (str) -- Path to FEI .ser file

Returns:

metadata -- Metadata of interest which is extracted from the passed files. If files cannot be opened, at least basic metadata will be returned (creation time, etc.)

Return type:

dict

nexusLIMS.extractors.fei_emi.map_keys(term_mapping, base, metadata)[source]

Given a term mapping dictionary and a metadata dictionary, translate the input keys within the "raw" metadata into a parsed value in the "nx_meta" metadata structure.

Parameters:
  • term_mapping (dict) -- Dictionary where keys are tuples of strings (the input terms), and values are either a single string or a list of strings (the output terms).

  • base (list) -- The 'root' path within the metadata dictionary of where to start applying the input terms

  • metadata (dict) -- A metadata dictionary as returned by get_ser_metadata()

Returns:

metadata -- The same metadata dictionary with some values added under the root-level nx_meta key, as specified by term_mapping

Return type:

dict

Notes

The term_mapping parameter should be a dictionary of the form:

{
    ('val1_1', 'val1_2') : 'output_val_1',
    ('val1_1', 'val2_2') : 'output_val_2',
    # etc.
}

Assuming base is ['ObjectInfo', 'AcquireInfo'], this would map the term present at ObjectInfo.AcquireInfo.val1_1.val1_2 into nx_meta.output_val_1, and ObjectInfo.AcquireInfo.val1_1.val2_2 into nx_meta.output_val_2, and so on. If one of the output terms is a list, the resulting metadata will be nested. For example, ['output_val_1', 'output_val_2'] would get mapped to nx_meta.output_val_1.output_val_2.
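
The mapping described in these notes can be sketched as follows. This is a simplified illustration rather than the package's implementation: the function name is hypothetical, and silently skipping input terms that are missing from the metadata is an assumption.

```python
def map_keys_sketch(term_mapping, base, metadata):
    """Copy values found under base + input term into nx_meta (sketch)."""
    nx_meta = metadata.setdefault("nx_meta", {})
    for in_term, out_term in term_mapping.items():
        # Walk down from the root through base and then the input term.
        node = metadata
        try:
            for key in list(base) + list(in_term):
                node = node[key]
        except (KeyError, TypeError):
            continue  # assumption: terms that are not present are skipped

        # A single string is one key; a list of strings is a nested path.
        out_path = [out_term] if isinstance(out_term, str) else list(out_term)
        target = nx_meta
        for key in out_path[:-1]:
            target = target.setdefault(key, {})
        target[out_path[-1]] = node
    return metadata


raw = {"ObjectInfo": {"AcquireInfo": {"val1_1": {"val1_2": 5, "val2_2": 7}}}}
mapping = {
    ("val1_1", "val1_2"): "output_val_1",
    ("val1_1", "val2_2"): ["group", "output_val_2"],
    ("val1_1", "absent"): "never_written",
}
mapped = map_keys_sketch(mapping, ["ObjectInfo", "AcquireInfo"], raw)
```

The second entry shows the list form of an output term, which produces the nested nx_meta.group.output_val_2 entry; the key name 'group' is a placeholder.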

nexusLIMS.extractors.fei_emi.parse_acquire_info(metadata)[source]

Parse the metadata that is saved at specific places within the .emi tag structure into a consistent place in the metadata dictionary returned by get_ser_metadata(). Specifically looks at the "AcquireInfo" node of the metadata structure.

Parameters:

metadata (dict) -- A metadata dictionary as returned by get_ser_metadata()

Returns:

metadata -- The same metadata dictionary with some values added under the root-level nx_meta key

Return type:

dict

nexusLIMS.extractors.fei_emi.parse_data_type(s, metadata)[source]

Determine "Data Type" and "DatasetType" for the given .ser file based off of metadata and signal characteristics. This method is used to determine whether the image is TEM or STEM, Image or Diffraction, Spectrum or Spectrum Image, etc.

Due to lack of appropriate metadata written by the FEI software, a heuristic of axis limits and size is used to determine whether a spectrum's data type is EELS or EDS. This may not be a perfect determination.

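Parameters:
  • s (hyperspy.signal.BaseSignal (or subclass)) -- The HyperSpy signal loaded from the .ser file, whose axes and size are inspected

  • metadata (dict) -- A metadata dictionary as returned by get_ser_metadata()

Returns: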

  • data_type (str) -- The string that should be stored at metadata['nx_meta']['Data Type']

  • dataset_type (str) -- The string that should be stored at metadata['nx_meta']['DatasetType']

nexusLIMS.extractors.fei_emi.parse_experimental_conditions(metadata)[source]

Parse the metadata that is saved at specific places within the .emi tag structure into a consistent place in the metadata dictionary returned by get_ser_metadata(). Specifically looks at the "ExperimentalConditions" node of the metadata structure.

Parameters:

metadata (dict) -- A metadata dictionary as returned by get_ser_metadata()

Returns:

metadata -- The same metadata dictionary with some values added under the root-level nx_meta key

Return type:

dict

nexusLIMS.extractors.fei_emi.parse_experimental_description(metadata)[source]

Parse the metadata that is saved at specific places within the .emi tag structure into a consistent place in the metadata dictionary returned by get_ser_metadata(). Specifically looks at the "ExperimentalDescription" node of the metadata structure.

Parameters:

metadata (dict) -- A metadata dictionary as returned by get_ser_metadata()

Returns:

metadata -- The same metadata dictionary with some values added under the root-level nx_meta key

Return type:

dict

Notes

The terms to extract in this section were

nexusLIMS.extractors.fei_emi.split_fei_metadata_units(metadata_term)[source]

If present, separate a metadata term into its value and units. In the FEI metadata structure, units are appended to the end of the term after an underscore, e.g. High tension_kV indicates that the High tension metadata value has units of kV.

Parameters:

metadata_term (str) -- The metadata term read from the FEI tag structure

Returns:

mdata_and_unit -- A length-2 tuple with the metadata value name as the first item and the unit (if present) as the second item

Return type:

tuple of str
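
A minimal sketch of this split is shown below. The function name is hypothetical, and returning None for a term without units is an assumption about how the "if present" case is represented. This simple version also treats anything after the last underscore as a unit, which may not hold for every term.

```python
def split_units_sketch(metadata_term):
    """Split 'name_unit' into (name, unit); unit is None if absent (sketch)."""
    # rpartition splits on the LAST underscore, where the unit is expected.
    name, separator, unit = metadata_term.rpartition("_")
    if not separator:
        # No underscore at all, so the whole term is the name.
        return metadata_term, None
    return name, unit


with_unit = split_units_sketch("High tension_kV")
without_unit = split_units_sketch("Mode")
```

For 'High tension_kV' this gives the name 'High tension' and the unit 'kV'.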

nexusLIMS.extractors.quanta_tif module

nexusLIMS.extractors.quanta_tif.get_quanta_metadata(filename)[source]

Returns the metadata (as a dictionary) from a .tif file saved by the FEI Quanta SEM in the Nexus Microscopy Facility. Specific tags of interest are duplicated under the root-level nx_meta node in the dictionary.

Parameters:

filename (str) -- path to a .tif file saved by the Quanta

Returns:

mdict -- The metadata text extracted from the file

Return type:

dict

nexusLIMS.extractors.quanta_tif.parse_beam_info(mdict, beam_name)[source]
Parameters:
  • mdict (dict) -- A metadata dictionary as returned by get_quanta_metadata()

  • beam_name (str) -- The "beam name" read from the root-level Beam node of the metadata dictionary

Returns:

mdict -- The same metadata dictionary with some values added under the root-level nx_meta key

Return type:

dict

nexusLIMS.extractors.quanta_tif.parse_det_info(mdict, det_name)[source]

Parses the Detector portion of the metadata dictionary from the Quanta to get values such as brightness, contrast, signal, etc.

Parameters:
  • mdict (dict) -- A metadata dictionary as returned by get_quanta_metadata()

  • det_name (str) -- The "detector name" read from the root-level Beam node of the metadata dictionary

Returns:

mdict -- The same metadata dictionary with some values added under the root-level nx_meta key

Return type:

dict

nexusLIMS.extractors.quanta_tif.parse_nx_meta(mdict)[source]

Parse the "important" metadata that is saved at specific places within the Quanta tag structure into a consistent place in the metadata dictionary returned by get_quanta_metadata().

Parameters:

mdict (dict) -- A metadata dictionary as returned by get_quanta_metadata()

Returns:

mdict -- The same metadata dictionary with some values added under the root-level nx_meta key

Return type:

dict

nexusLIMS.extractors.quanta_tif.parse_scan_info(mdict, scan_name)[source]

Parses the Scan portion of the metadata dictionary (on a Quanta this is always "EScan") to get values such as dwell time, field width, and pixel size

Parameters:
  • mdict (dict) -- A metadata dictionary as returned by get_quanta_metadata()

  • scan_name (str) -- The "scan name" read from the root-level Beam node of the metadata dictionary

Returns:

mdict -- The same metadata dictionary with some values added under the root-level nx_meta key

Return type:

dict

nexusLIMS.extractors.quanta_tif.parse_system_info(mdict)[source]

Parses the System portion of the metadata dictionary from the Quanta to get values such as software version, chamber config, etc.

Parameters:

mdict (dict) -- A metadata dictionary as returned by get_quanta_metadata()

Returns:

mdict -- The same metadata dictionary with some values added under the root-level nx_meta key

Return type:

dict

nexusLIMS.extractors.thumbnail_generator module

nexusLIMS.extractors.thumbnail_generator.add_annotation_markers(s)[source]

Read annotations from a signal originating from DigitalMicrograph and convert those that can be converted into HyperSpy markers for plotting. Adapted from a currently (at the time of writing) open pull request in HyperSpy.

Parameters:

s (hyperspy.signal.BaseSignal (or subclass)) -- The HyperSpy signal for which a thumbnail should be generated

nexusLIMS.extractors.thumbnail_generator.down_sample_image(fname, out_path, output_size=None, factor=None)[source]

Load an image file from disk, down-sample it to the requested output size or by the requested factor, and save. Sometimes the data doesn't need to be loaded as a HyperSpy signal, and it's better just to down-sample existing image data (such as for .tif files created by the Quanta SEM).

Parameters:
  • fname (str) -- The filepath that will be resized. All formats supported by PIL.Image.open() can be used

  • out_path (str) -- A path to the desired thumbnail filename. All formats supported by PIL.Image.Image.save() can be used.

  • output_size (tuple) -- A tuple of ints specifying the width and height of the output image. Either this argument or factor should be provided (not both).

  • factor (int) -- The multiple of the image size to reduce by (i.e. a value of 2 results in an image that is 50% of each original dimension). Either this argument or output_size should be provided (not both).
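
The "one or the other, not both" rule for output_size and factor can be sketched as a small pure-Python helper. The helper name, the ValueError, and the use of integer division are assumptions made for illustration; the resulting size could then be handed to an image library's resize call.

```python
def resolve_output_size(original_size, output_size=None, factor=None):
    """Work out the target (width, height) from output_size or factor."""
    # Exactly one of the two arguments must be supplied.
    if (output_size is None) == (factor is None):
        raise ValueError("Provide exactly one of output_size or factor")
    if output_size is not None:
        return tuple(output_size)
    width, height = original_size
    # Integer division by the factor; never shrink below one pixel.
    return (max(1, width // factor), max(1, height // factor))


halved = resolve_output_size((1024, 884), factor=2)
explicit = resolve_output_size((1024, 884), output_size=(500, 500))
```

A factor of 2 halves each dimension, matching the "50% of each original dimension" description above.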

nexusLIMS.extractors.thumbnail_generator.sig_to_thumbnail(s, out_path, dpi=92)[source]

Generate a preview thumbnail from an arbitrary HyperSpy signal. For a 2D signal, the signal from the first navigation position is used (most likely the top- and left-most position). For a 1D signal (i.e. a spectrum or spectrum image), the output depends on the number of navigation dimensions:

  • 0: Image of spectrum

  • 1: Image of linescan (a la DigitalMicrograph)

  • 2: Image of spectra sampled from navigation space

  • 2+: As for 2 dimensions

Parameters:
  • s (hyperspy.signal.BaseSignal (or subclass)) -- The HyperSpy signal for which a thumbnail should be generated

  • out_path (str) -- A path to the desired thumbnail filename. All formats supported by savefig() can be used.

  • dpi (int) -- The "dots per inch" resolution for the outputted figure

Returns:

f -- Handle to a matplotlib Figure

Return type:

matplotlib.figure.Figure

Notes

This method relies heavily on HyperSpy's existing plotting functions to figure out how best to display the image.