nexusLIMS package

The NexusLIMS back-end software.

This module contains the software required to monitor a database for sessions logged by users on instruments that are part of the NIST Electron Microscopy Nexus Facility. Based on this information, records representing individual experiments are automatically generated and uploaded to the front-end NexusLIMS CDCS instance for users to browse, query, and edit.

Example

In most cases, the only code you need to run directly is the record builder, which looks for new sessions. Start it by running the record_builder module:

$ python -m nexusLIMS.builder.record_builder

Refer to Record building workflow for more details.

Configuration variables

The following variables should be defined as environment variables in your session, or in the .env file in the root of this package's repository (if you are running using pipenv).

nexusLIMS_user

The username used to authenticate to calendar resources and CDCS

nexusLIMS_pass

The password used to authenticate to calendar resources and CDCS

mmfnexus_path

The path (which should already be mounted) to the root folder containing data from the Electron Microscopy Nexus. This folder is accessible read-only, and it is where instruments in the Electron Microscopy Nexus write their data. The file paths for specific instruments (specified in the NexusLIMS database) are relative to this root.

nexusLIMS_path

The root path used by NexusLIMS for various needs. This folder is used to store the NexusLIMS database, generated records, individual file metadata dumps and preview images, and anything else that is needed by the back-end system.

nexusLIMS_db_path

The direct path to the NexusLIMS SQLite database file that contains information about the instruments in the Nexus Facility, as well as logs for the sessions created by users using the Session Logger Application.
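As a quick sanity check before running the record builder, the variables listed above can be verified from Python. The following is a minimal sketch only: missing_config_vars is a hypothetical helper written for this illustration and is not part of the nexusLIMS package, and it only tests whether each variable is set and non-empty (it does not check that paths exist or that credentials are valid).

```python
import os

# The five configuration variables documented above.
REQUIRED_VARS = (
    "nexusLIMS_user",
    "nexusLIMS_pass",
    "mmfnexus_path",
    "nexusLIMS_path",
    "nexusLIMS_db_path",
)


def missing_config_vars(environ=None):
    """Return the names of documented variables that are unset or empty.

    Hypothetical helper for illustration; not part of nexusLIMS.
    """
    # Default to the real process environment; a plain dict can be passed
    # instead, which makes the check easy to try out.
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]
```

Calling missing_config_vars() with no arguments inspects the current session's environment and returns an empty list when everything is defined.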

nexusLIMS.get_config()[source]
nexusLIMS.validate_config(config)[source]

Subpackages

Submodules

nexusLIMS.cdcs module

nexusLIMS.cdcs.delete_record(record_id)[source]

Delete a Data record from the NexusLIMS CDCS instance via REST API

Parameters:

record_id (str) -- The id value (on the CDCS server) of the record to be deleted

Returns:

r -- The REST response returned from the CDCS instance after attempting the delete

Return type:

Response

nexusLIMS.cdcs.get_template_id()[source]

Get the template ID for the schema (so the record can be associated with it)

Returns:

template_id -- The template ID

Return type:

str

nexusLIMS.cdcs.get_workspace_id()[source]

Get the workspace ID that the user has access to (should be the Global Public Workspace)

Returns:

workspace_id -- The workspace ID

Return type:

str

nexusLIMS.cdcs.upload_record_content(xml_content, title)[source]

Upload a single XML record to the NexusLIMS CDCS instance.

Parameters:
  • xml_content (str) -- The actual content of an XML record (rather than a file)

  • title (str) -- The title to give to the record in CDCS

Returns:

  • post_r (Response) -- The REST response returned from the CDCS instance after attempting the upload

  • record_id (str) -- The id (on the server) of the record that was uploaded

nexusLIMS.cdcs.upload_record_files(files_to_upload, progress=False)[source]

Upload a list of .xml files (or all .xml files in the current directory) to the NexusLIMS CDCS instance using upload_record_content()

Parameters:
  • files_to_upload (list or None) -- The list of .xml files to upload. If None, all .xml files in the current directory will be used instead.

  • progress (bool) -- Whether or not to show a progress bar for uploading

Returns:

  • files_uploaded (list of str) -- A list of the files that were successfully uploaded

  • record_ids (list of str) -- A list of the record id values (on the server) that were uploaded
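The documented default for files_to_upload (None means "every .xml file in the current directory") can be sketched with the standard library. This is an illustration under assumptions, not the package's implementation: resolve_files_to_upload and its directory parameter are hypothetical names introduced here.

```python
import glob
import os


def resolve_files_to_upload(files_to_upload=None, directory="."):
    """Hypothetical helper mirroring the documented files_to_upload default.

    If no list is given, fall back to every .xml file found directly in
    ``directory`` (the current directory unless stated otherwise).
    """
    if files_to_upload is None:
        # Sorted so the result is deterministic across filesystems.
        return sorted(glob.glob(os.path.join(directory, "*.xml")))
    return list(files_to_upload)
```

An explicit list is returned unchanged (as a copy), so callers can always iterate over the result in the same way.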

nexusLIMS.instruments module

Attributes

nexusLIMS.instruments.instrument_db

A dictionary of Instrument objects.

Each object in this dictionary represents an instrument detected in the NexusLIMS remote database.

Type:

dict

class nexusLIMS.instruments.Instrument(api_url=None, calendar_name=None, calendar_url=None, location=None, name=None, schema_name=None, property_tag=None, filestore_path=None, computer_ip=None, computer_name=None, computer_mount=None)[source]

Bases: object

A simple object to hold information about an instrument in the Microscopy Nexus facility, fetched from the external NexusLIMS database

Parameters:
  • api_url (str or None) -- The calendar API url for this instrument

  • calendar_name (str or None) -- The "user-friendly" name of the calendar for this instrument as displayed on the sharepoint resource (e.g. "FEI Titan TEM")

  • calendar_url (str or None) -- The URL to this instrument's web-accessible calendar on the sharepoint resource

  • location (str or None) -- The physical location of this instrument (building and room number)

  • name (str or None) -- The unique identifier for an instrument in the Nexus Microscopy facility

  • schema_name (str or None) -- The name of instrument as defined in the Nexus Microscopy schema and displayed in the records

  • property_tag (str or None) -- The NIST property tag for this instrument

  • filestore_path (str or None) -- The path (relative to the Nexus facility root) on the central file storage where this instrument stores its data

  • computer_name (str or None) -- The name of the 'support PC' connected to this instrument

  • computer_ip (str or None) -- The REN IP address of the 'support PC' connected to this instrument

  • computer_mount (str or None) -- The full path where the files are saved on the 'support PC' for the instrument (e.g. 'M:/')

nexusLIMS.instruments.get_instr_from_api_url(url)[source]

Using the NexusLIMS database, get an instrument object by a calendar API url.

Parameters:

url (str) -- The unique API url for the instrument calendar.

Returns:

instrument -- An Instrument instance matching the API url, or None if no match was found.

Return type:

Instrument or None

Examples

>>> inst = get_instr_from_api_url('URL_FOR_YOUR_INSTRUMENT_CALENDAR')
>>> str(inst)
'JEOL_TEM_01 in Location_01'

nexusLIMS.instruments.get_instr_from_calendar_name(cal_name)[source]

Using the NexusLIMS database, get an instrument object by a calendar name.

Parameters:

cal_name (str) -- A calendar name (e.g. "REMOVED") that will be used to search for a matching instrument in the api_url values

Returns:

instrument -- An Instrument instance matching the calendar name, or None if no match was found.

Return type:

Instrument or None

Examples

>>> inst = get_instr_from_calendar_name('JEOL_TEM_01')
>>> str(inst)
'JEOL_TEM_01 in Location_01'

nexusLIMS.instruments.get_instr_from_filepath(path)[source]

Using the NexusLIMS database, get an instrument object by a given path.

Parameters:

path (str) -- A path (relative or absolute) to a file saved in the central filestore that will be used to search for a matching instrument

Returns:

instrument -- An Instrument instance matching the path, or None if no match was found

Return type:

Instrument or None

Examples

>>> inst = get_instr_from_filepath('/mnt/**REMOVED**_mmfnexus/Titan/**REMOVED**/' +
...                                '190628 - **REMOVED** Training/' +
...                                '6_28_2019 Box6 4S/4_330mm.dm3')
>>> str(inst)
'**REMOVED** in **REMOVED**'

nexusLIMS.instruments.get_instrument_db()[source]

Connect to the NexusLIMS database and get all the instruments contained within, collected into a dictionary.

Returns:

instrument_db -- A dictionary of Instrument instances that describe all the instruments that were found in the instruments table of the NexusLIMS database

Return type:

dict
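To make the shape of this dictionary and the lookup functions above more concrete, here is a small stand-alone sketch. Everything in it is an assumption made for illustration: SimpleInstrument is a hypothetical stand-in holding only three of the documented fields, find_by_api_url is a hypothetical lookup, the dictionary is assumed to be keyed by instrument name, and the URL is a placeholder. Only the "name in location" string form is taken from the examples above.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SimpleInstrument:
    """Hypothetical stand-in for Instrument with only three fields."""

    name: Optional[str] = None
    location: Optional[str] = None
    api_url: Optional[str] = None

    def __str__(self):
        # Same "<name> in <location>" form shown in the examples above.
        return f"{self.name} in {self.location}"


def find_by_api_url(instrument_db: Dict[str, SimpleInstrument], url: str):
    """Return the first instrument whose api_url equals url, else None."""
    for instrument in instrument_db.values():
        if instrument.api_url == url:
            return instrument
    return None


# Assumed layout: one entry per instrument, keyed by its unique name.
db = {
    "JEOL_TEM_01": SimpleInstrument(
        name="JEOL_TEM_01",
        location="Location_01",
        api_url="https://example.org/calendar/jeol_tem_01",  # placeholder
    )
}
```

Returning None for a miss matches the documented behavior of the get_instr_from_* functions, so callers need to check the result before using it.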

nexusLIMS.utils module

nexusLIMS.utils.find_dirs_by_mtime(path, dt_from, dt_to)[source]

Given two timestamps, find the directories under a path that were last modified between the two

Deprecated since version 0.0.9: find_dirs_by_mtime is not recommended for finding files to include in a record. A later modification to a directory (e.g. the user wrote a text file or did some analysis afterwards) changes its modification time, so the directory is not searched and no files are returned from it.

Parameters:
  • path (str) -- The root path from which to start the search

  • dt_from (datetime.datetime) -- The "starting" point of the search timeframe

  • dt_to (datetime.datetime) -- The "ending" point of the search timeframe

Returns:

dirs -- A list of the directories that have modification times within the time range provided

Return type:

list of str

nexusLIMS.utils.find_files_by_mtime(path, dt_from, dt_to)[source]

Given two timestamps, find files under a path that were last modified between the two.

Parameters:
  • path (str) -- The root path from which to start the search

  • dt_from (datetime.datetime) -- The "starting" point of the search timeframe

  • dt_to (datetime.datetime) -- The "ending" point of the search timeframe

Returns:

files -- A list of the files that have modification times within the time range provided (sorted by modification time)

Return type:

list
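A pure-Python search of this kind can be sketched with os.walk. This is an illustration, not the package's code: files_modified_between is a hypothetical name, the bounds are assumed to be inclusive, and naive datetime values are interpreted in local time (which is how datetime.timestamp() treats them).

```python
import os
from datetime import datetime


def files_modified_between(path, dt_from, dt_to):
    """Return files under path with dt_from <= mtime <= dt_to, oldest first.

    Hypothetical sketch; inclusive bounds are an assumption.
    """
    ts_from, ts_to = dt_from.timestamp(), dt_to.timestamp()
    matches = []
    for dirpath, _dirnames, filenames in os.walk(path):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            mtime = os.path.getmtime(full_path)
            if ts_from <= mtime <= ts_to:
                matches.append((mtime, full_path))
    # Sorting the (mtime, path) pairs orders the result by modification time.
    return [full_path for _mtime, full_path in sorted(matches)]
```

Because every file is examined individually, a directory whose own modification time falls outside the window is still searched, which avoids the problem described for find_dirs_by_mtime.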

nexusLIMS.utils.get_from_db(query)[source]

Get contents and column names from a table in the NexusLIMS database file.

Parameters:

query (str) -- Query for the database, e.g. "SELECT * from instruments"

Returns:

  • results (list) -- All (remaining) rows of the query result, as fetched from the sqlite3 cursor. The list is empty if fetching failed.

  • col_names (list) -- A list of column names defined in the table. Note that the list is different from the 7-tuple defined in the original cursor.description. Only the first item in the 7-tuple is saved in the list.
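The relationship between cursor.description and the returned col_names can be shown with an in-memory SQLite database. This is a sketch under assumptions: fetch_rows_and_columns is a hypothetical helper that takes an open connection (the documented function takes only a query), and the two-column instruments table is invented for the example.

```python
import sqlite3


def fetch_rows_and_columns(connection, query):
    """Hypothetical helper: run query and return (rows, column names)."""
    cursor = connection.cursor()
    cursor.execute(query)
    rows = cursor.fetchall()
    # cursor.description holds one 7-tuple per column; item 0 is the name.
    col_names = [column[0] for column in cursor.description]
    return rows, col_names


# Invented two-column table, used only to exercise the helper.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE instruments (name TEXT, location TEXT)")
connection.execute(
    "INSERT INTO instruments VALUES ('JEOL_TEM_01', 'Location_01')"
)
rows, col_names = fetch_rows_and_columns(connection, "SELECT * FROM instruments")
```

Here rows is a list of tuples (one per record) and col_names is a flat list of strings, so the two can be zipped together to build one dictionary per row.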

nexusLIMS.utils.get_nested_dict_key(nested_dict, key_to_find, prepath=())[source]

Use a recursive method to find a key in a dictionary of dictionaries (such as the metadata dictionaries we receive from the file parsers). Cribbed from: https://stackoverflow.com/a/22171182/1435788

Parameters:
  • nested_dict (dict) -- Dictionary to search

  • key_to_find (object) -- Key to search for

  • prepath (tuple) -- "path" to prepend to the search to limit the search to only part of the dictionary

Returns:

path -- The "path" through the dictionary (expressed as a tuple of keys) where the key was found. If None, the key was not found in the dictionary.

Return type:

tuple or None

nexusLIMS.utils.get_nested_dict_value(nested_dict, value, prepath=())[source]

Use a recursive method to find a value in a dictionary of dictionaries (such as the metadata dictionaries we receive from the file parsers). Cribbed from: https://stackoverflow.com/a/22171182/1435788

Parameters:
  • nested_dict (dict) -- Dictionary to search

  • value (object) -- Value to search for

  • prepath (tuple) -- "path" to prepend to the search to limit the search to only part of the dictionary

Returns:

path -- The "path" through the dictionary (expressed as a tuple of keys) where value was found. If None, the value was not found in the dictionary.

Return type:

tuple or None
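The recursive search described above can be sketched as follows. This is an illustration in the spirit of the linked Stack Overflow answer, not the package's code: find_value_path is a hypothetical name, the metadata dictionary is invented, and in this sketch prepath is simply prefixed to the returned tuple.

```python
def find_value_path(nested_dict, value, prepath=()):
    """Return the tuple of keys leading to value, or None if it is absent.

    Hypothetical sketch of a depth-first search through nested dicts.
    """
    for key, item in nested_dict.items():
        path = prepath + (key,)
        if item == value:
            return path
        if isinstance(item, dict):
            # Recurse into the sub-dictionary, carrying the path so far.
            found = find_value_path(item, value, path)
            if found is not None:
                return found
    return None


# Invented metadata, loosely shaped like a parser's output.
metadata = {"nx_meta": {"Microscope": {"Voltage": 300}, "Mode": "TEM"}}
```

The search stops at the first match in iteration order, so if the same value appears more than once only one path is reported.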

nexusLIMS.utils.get_nested_dict_value_by_path(nest_dict, path)[source]

Get the value from within a nested dictionary structure by following the keys in path into the dictionary and returning the value found there

Parameters:
  • nest_dict (dict) -- A dictionary of dictionaries that is to be queried

  • path (tuple) -- A tuple (or other iterable type) that specifies the successive keys needed to get to a value within nest_dict

Returns:

value -- The value at the path within the nested dictionary; if there's no value there, return the string "not found"

Return type:

object or str
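A path lookup with the documented "not found" fallback might look like the sketch below. value_at_path and its default parameter are hypothetical names introduced for this example; the actual implementation may differ.

```python
def value_at_path(nest_dict, path, default="not found"):
    """Follow the keys in path and return the value reached.

    Hypothetical sketch; returns ``default`` when a key is missing or when
    the path tries to descend into something that is not a mapping.
    """
    current = nest_dict
    for key in path:
        try:
            current = current[key]
        except (KeyError, TypeError):
            return default
    return current
```

Catching TypeError as well as KeyError covers paths that run "past" a leaf value, such as asking for a sub-key of an integer.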

nexusLIMS.utils.get_nist_div_and_group(username)[source]

Query the NIST active directory to get division and group information for a user.

Parameters:

username (str) -- a valid NIST username (the short format: e.g. "ear1" instead of ernst.august.ruska@nist.gov).

Returns:

div, group -- The division and group numbers for the user (as strings)

Return type:

str

nexusLIMS.utils.gnu_find_files_by_mtime(path, dt_from, dt_to, extensions)[source]

Given two timestamps, find files under a path that were last modified between the two. Uses the system-provided GNU find command. In basic testing, this method was found to be approximately 3 times faster than using find_files_by_mtime() (which is implemented in pure Python).

Parameters:
  • path (str) -- The root path from which to start the search

  • dt_from (datetime.datetime) -- The "starting" point of the search timeframe

  • dt_to (datetime.datetime) -- The "ending" point of the search timeframe

  • extensions (list of str) -- A list of strings representing the extensions to find

Returns:

files -- A list of the files that have modification times within the time range provided (sorted by modification time)

Return type:

list of str

Raises:
  • NotImplementedError -- If the system running this code is not Linux-based

  • RuntimeError -- If the find command cannot be found, or running it results in output to stderr

nexusLIMS.utils.is_subpath(path, of_paths)[source]

Helper function to determine if a given path is a "subpath" of a set of paths. Useful to help determine which instrument a given file comes from, given the instrument's filestore_path and the path of the file to test.

Parameters:
  • path (str) -- The path of the file (or directory) to test. This will usually be the absolute path to a file on the local filesystem (to be compared using the host-specific mmf_nexus_root_path).

  • of_paths (str or list) -- The "higher-level" path to test against (or list thereof). In typical use, this will be an instrument's filestore_path joined to the root-level mmf_nexus_root_path

Returns:

result -- Whether or not path is a subpath of one of the directories in of_paths

Return type:

bool
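One way to implement a subpath test like the one described above is with os.path.commonpath. The sketch below is an assumption about how such a check could be written, not the package's code: is_under_any is a hypothetical name, a path equal to one of the parents counts as a match, and the paths in the usage note are placeholders.

```python
import os


def is_under_any(path, of_paths):
    """Return True if path lies inside (or equals) any of of_paths.

    Hypothetical sketch; accepts a single parent path or a list of them.
    """
    if isinstance(of_paths, str):
        of_paths = [of_paths]
    target = os.path.abspath(path)
    for parent in of_paths:
        parent = os.path.abspath(parent)
        # Comparing whole path components avoids the false positive that a
        # plain string prefix test gives for ".../Titan" vs ".../TitanX".
        if os.path.commonpath([target, parent]) == parent:
            return True
    return False
```

For example, is_under_any('/mnt/nexus/Titan/a/b.dm3', '/mnt/nexus/Titan') is true, while the same test against '/mnt/nexus/JEOL' is false.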

Examples

>>> is_subpath('/mnt/**REMOVED**_mmfnexus/Titan/**REMOVED**/190628 - **REMOVED** ' +
...            'Training/6_28_2019 Box6 4S/4_330mm.dm3',
...            os.path.join(CONFIG['mmfnexus_path'],
...                         titan.filestore_path))
True

nexusLIMS.utils.local_datetime(dt, tz)[source]

Convert a UTC datetime to a local datetime.

Parameters:
  • dt (datetime.datetime) -- Datetime in the UTC timezone.

  • tz (str) -- Local timezone information, e.g. "America/Chicago".

Returns:

local_dt -- New datetime in the local timezone.

Return type:

datetime.datetime
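A UTC-to-local conversion of this kind can be sketched with the standard library's zoneinfo module (Python 3.9+). This is an illustration only: utc_to_local is a hypothetical name, the package may use a different timezone library, and treating a naive input as UTC is an assumption of the sketch. zoneinfo also needs timezone data to be available on the system.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def utc_to_local(dt, tz):
    """Convert a UTC datetime to the timezone named by tz.

    Hypothetical sketch; a naive dt is assumed to already be in UTC.
    """
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(ZoneInfo(tz))


# 17:00 UTC on a summer day, converted to US Central time.
local = utc_to_local(datetime(2019, 6, 28, 17, 0), "America/Chicago")
```

The returned datetime is timezone-aware, so daylight saving time is accounted for by the timezone database rather than by a fixed offset.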

nexusLIMS.utils.nexus_req(url, fn, basic_auth=False, **kwargs)[source]

A helper method that wraps a function from requests, but adds a local certificate authority chain to validate the SharePoint server's certificates and authenticates using NTLM.

Parameters:
  • url (str) -- The URL to fetch

  • fn (function) -- The function from the requests library to use (e.g. get(), put(), post(), etc.)

  • basic_auth (bool) -- If True, use only username and password for authentication rather than NTLM (like what is used for CDCS access rather than for NIST network resources)

  • **kwargs (dict, optional) -- Other keyword arguments are passed along to the fn

Returns:

r -- A requests response object

Return type:

requests.Response

nexusLIMS.utils.parse_xml(xml, xslt_file, **kwargs)[source]

Parse and translate an XML string from the API into a nicer format

Parameters:
  • xml (str or bytes) -- A string containing XML, such as that returned by fetch_xml()

  • xslt_file (str or io.BytesIO) -- Path to the XSLT file to use for transformation

  • **kwargs (dict, optional) -- Other keyword arguments are passed as parameters to the XSLT transformer. None values are converted to an empty string.

Returns:

simplified_dom

Return type:

lxml.etree._XSLTResultTree

nexusLIMS.utils.set_nested_dict_value(nest_dict, path, value)[source]

Set a value within a nested dictionary structure by following the keys in path into the dictionary and changing the entry found there to value. Cribbed from https://stackoverflow.com/a/13688108/1435788

Parameters:
  • nest_dict (dict) -- A dictionary of dictionaries that is to be queried

  • path (tuple) -- A tuple (or other iterable type) that specifies the successive keys needed to get to a value within nest_dict

  • value (object) -- The value which will be given to the path in the nested dictionary

Returns:

value -- The value at the path within the nested dictionary

Return type:

object
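Setting a value by path can be sketched in a few lines, following the general approach of the linked Stack Overflow answer. set_value_at_path is a hypothetical name, and creating missing intermediate dictionaries is an assumption of this sketch (the package's behavior for missing keys is not documented here). An empty path is not handled.

```python
def set_value_at_path(nest_dict, path, value):
    """Set nest_dict[path[0]][path[1]]... = value and return value.

    Hypothetical sketch; missing intermediate dictionaries are created.
    """
    *parents, last = path
    current = nest_dict
    for key in parents:
        # Descend one level, creating an empty dict if the key is absent.
        current = current.setdefault(key, {})
    current[last] = value
    return value
```

The dictionary is modified in place, so the return value is mostly a convenience for chaining or logging.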

nexusLIMS.utils.setup_loggers(log_level)[source]

Set logging level of all NexusLIMS loggers

Parameters:

log_level (int) -- The level of logging, such as logging.DEBUG
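Setting one level on a family of loggers can be done through the standard logging module. The sketch below is an assumption about how this could be achieved, not the package's code: set_level_for_prefix is a hypothetical name, and it assumes the relevant loggers are named "nexusLIMS" or start with "nexusLIMS.".

```python
import logging


def set_level_for_prefix(log_level, prefix="nexusLIMS"):
    """Set log_level on every existing logger named prefix or prefix.*.

    Hypothetical sketch; returns the sorted names that were changed.
    """
    changed = []
    # loggerDict maps the names of all loggers created so far.
    for name in list(logging.root.manager.loggerDict):
        if name == prefix or name.startswith(prefix + "."):
            logging.getLogger(name).setLevel(log_level)
            changed.append(name)
    return sorted(changed)
```

Only loggers that already exist are affected, so a call such as set_level_for_prefix(logging.DEBUG) is best made after the relevant modules have been imported.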

nexusLIMS.utils.try_getting_dict_value(d, key)[source]

This method will try to get a value from a dictionary (potentially nested) and fail silently if the value is not found, returning the string "not found".

Parameters:
  • d (dict) -- The dictionary from which to get a value

  • key (str or tuple) -- The key to query, or if an iterable container type (tuple, list, etc.) is given, the path into a nested dictionary to follow

Returns:

val -- The value of the dictionary specified by key. If the dictionary does not have the key, returns the string "not found" without raising an error

Return type:

object or str