nexusLIMS.harvester package

This package contains the code used to harvest metadata from calendar sources. The sharepoint_calendar submodule downloads calendar information from a Microsoft SharePoint calendar resource, and the google_calendar submodule retrieves event information from Google Calendar.

Submodules

nexusLIMS.harvester.google_calendar module

class nexusLIMS.harvester.google_calendar.GCalendarEvent(title=None, instrument=None, updated=None, username=None, created_by=None, start_time=None, end_time=None, category_value=None, experiment_purpose=None, sample_details=None, project_id=None)[source]

Bases: object

A representation of a single calendar "entry" returned from the Google Calendar API. Datetime attributes should be timezone-aware, i.e. matching the timezone used in the instrument calendar.
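As a minimal sketch of what "timezone-aware" means here, the snippet below attaches a timezone to naive datetime values before they would be stored on an event. The fixed UTC-5 offset and the make_aware helper are assumptions for illustration only; they are not part of nexusLIMS, and a real calendar timezone would also need to handle daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed offset standing in for the instrument calendar's timezone
CALENDAR_TZ = timezone(timedelta(hours=-5))


def make_aware(naive_dt, tz=CALENDAR_TZ):
    """Attach a timezone to a naive datetime without shifting the clock time."""
    if naive_dt.tzinfo is not None:
        return naive_dt  # already timezone-aware; leave untouched
    return naive_dt.replace(tzinfo=tz)


start = make_aware(datetime(2020, 1, 15, 9, 0))
end = make_aware(datetime(2020, 1, 15, 12, 30))
print(start.isoformat())  # 2020-01-15T09:00:00-05:00
```

Aware values like these can be compared and subtracted safely, which matters when event times are matched against session timestamps.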

title

The title of the event (present at /feed/entry/content/m:properties/d:TitleOfExperiment)

Type:

str

instrument

The instrument associated with this calendar entry (fetched using the name of the calendar, present at /feed/title)

Type:

Instrument

updated

The time this event was last updated (present at /feed/entry/updated)

Type:

datetime.datetime

username

The NIST "short" username of the user indicated in this event (present at /feed/entry/link[@title="UserName"]/m:inline/feed/entry/content/m:properties/d:UserName)

Type:

str

created_by

The NIST "short" username of the user that created this event (present at /feed/entry/link[@title="CreatedBy"]/m:inline/feed/entry/content/m:properties/d:UserName)

Type:

str

start_time

The time this event was scheduled to start (present at /feed/entry/content/m:properties/d:StartTime). The API response returns this value without a timezone, in the timezone of the SharePoint server.

Type:

datetime.datetime

end_time

The time this event was scheduled to end (present at /feed/entry/content/m:properties/d:EndTime)

Type:

datetime.datetime

category_value

The "type" or category of this event (such as User session, service, etc.) (present at /feed/entry/content/m:properties/d:CategoryValue)

Type:

str

experiment_purpose

The user-entered purpose of this experiment (present at /feed/entry/content/m:properties/d:ExperimentPurpose)

Type:

str

sample_details

The user-entered sample details for this experiment (present at /feed/entry/content/m:properties/d:SampleDetails)

Type:

str

project_id

The user-entered project identifier for this experiment (present at /feed/entry/content/m:properties/d:ProjectID)

Type:

str

classmethod from_dict(query)[source]

nexusLIMS.harvester.google_calendar.build_service(service, cred='credentials.json')[source]

Build Google Calendar service through OAuth2. For more information: https://developers.google.com/workspace/guides/create-credentials

Parameters:
  • service (dict) -- An empty dictionary. This is needed because Process is used to time out the authorization process. If the user does nothing when the Google OAuth2 page pops up, or uses a Google account without access to the Calendar, the process is killed after 10 seconds.

  • cred (str) -- Path for the JSON credentials file downloaded from the Google Cloud Console. Default: "credentials.json" in the same folder as this script.

Returns:

service -- A Resource instance built for "calendar" and "v3" using the credentials from the cred file.

Return type:

Resource

nexusLIMS.harvester.google_calendar.fetch_dict(instrument, dt_from=None, dt_to=None)[source]

Get a dict of information from the Google Calendar event that best matches the datetime period specified by dt_from and dt_to for one instrument. Currently, if dt_from or dt_to is None, the most recent Calendar event is returned. In the future this may change to a list of events (up to a certain maximum number).

Due to how the Google Calendar API was set up, dt_from and dt_to have to be fairly close to the event of interest, since the query will only get a maximum of 1000 events between the two dates.

Parameters:
  • instrument (Instrument) -- One of the NexusLIMS instruments contained in the instrument_db database. Contains information about the api_url to be used to connect to the instrument Calendar.

  • dt_from (datetime or None) -- A datetime object representing the start of a calendar event to search for. Must be an RFC3339 timestamp with mandatory time zone offset. For example, 2011-06-03T10:00:00-07:00, 2011-06-03T10:00:00Z. If both dt_from and dt_to are None, no date filtering will be done. If just dt_from is None, all events from the beginning of the calendar record will be returned up until dt_to.

  • dt_to (datetime or None) -- A datetime object representing the end of a calendar event to search for. Must be an RFC3339 timestamp with mandatory time zone offset. For example, 2011-06-03T10:00:00-07:00, 2011-06-03T10:00:00Z. If dt_from and dt_to are None, no date filtering will be done. If just dt_to is None, all events from dt_from to the present will be returned.

Returns:

query -- A dict with detailed information about the Google Calendar events returned from the query. The two most important keys are "timeZone" and "items". query["timeZone"] is the timezone of the Google Calendar, e.g. "America/Chicago"; query["items"] is a list of Calendar events meeting the dt_from and dt_to time criteria, up to 1000 events. The events are sorted by time in ascending order. Each event is itself a dict containing information about the event.

Return type:

dict
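The following sketch illustrates the two data shapes described above: an RFC3339 timestamp built from a timezone-aware datetime, and a response dict with "timeZone" and "items" keys. The to_rfc3339 and summarize helpers and the sample dict are hypothetical and written by hand for illustration. The per-event "summary" and "start"/"dateTime" keys are assumed from the Google Calendar event resource and are not confirmed by this documentation.

```python
from datetime import datetime, timedelta, timezone


def to_rfc3339(dt):
    """Format a timezone-aware datetime as an RFC 3339 string."""
    if dt.tzinfo is None or dt.utcoffset() is None:
        raise ValueError("datetime must be timezone-aware")
    return dt.isoformat(timespec="seconds")


def summarize(query):
    """Return (calendar timezone, [(summary, start string), ...])."""
    events = [
        (item.get("summary"), item.get("start", {}).get("dateTime"))
        for item in query.get("items", [])
    ]
    return query.get("timeZone"), events


pacific = timezone(timedelta(hours=-7))
dt_from = datetime(2011, 6, 3, 10, 0, tzinfo=pacific)
print(to_rfc3339(dt_from))  # 2011-06-03T10:00:00-07:00

# Hand-written stand-in for a response; a real response has many more keys
sample_query = {
    "timeZone": "America/Chicago",
    "items": [
        {"summary": "Session A", "start": {"dateTime": "2011-06-03T10:00:00-07:00"}},
        {"summary": "Session B", "start": {"dateTime": "2011-06-03T13:00:00-07:00"}},
    ],
}
tz_name, events = summarize(sample_query)
print(tz_name, len(events))
```

Rejecting naive datetimes up front mirrors the "mandatory time zone offset" requirement stated for dt_from and dt_to.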

nexusLIMS.harvester.sharepoint_calendar module

exception nexusLIMS.harvester.sharepoint_calendar.AuthenticationError(message)[source]

Bases: Exception

Exception class for errors related to authentication.

nexusLIMS.harvester.sharepoint_calendar.dump_calendars(instrument=None, user=None, dt_from=None, dt_to=None, group=None, division=None, filename='cal_events.xml')[source]

Write the results of get_events() to a file.

Parameters:
  • instrument (Instrument or str) -- One of the NexusLIMS instruments contained in the database. Controls what instrument calendar is used to get events. If value is a string, it should be one of the instrument PIDs from the Nexus facility

  • dt_from (datetime or None) -- A datetime object representing the start of a calendar event to search for, as in fetch_xml(). If dt_from and dt_to are None, no date filtering will be done. If just dt_from is None, all events from the beginning of the calendar record will be returned up until dt_to.

  • dt_to (datetime or None) -- A datetime object representing the end of a calendar event to search for, as in fetch_xml(). If dt_from and dt_to are None, no date filtering will be done. If just dt_to is None, all events from dt_from to the present will be returned.

  • user (None or str) -- Either None or a valid NIST username (the short format: e.g. "ear1" instead of ernst.august.ruska@nist.gov). If None, no user filtering will be performed. No verification of username is performed, so it is up to the user to make sure this is correct.

  • division (None or str) -- The division number of the project. If provided, this string will be replicated under the "project" information in the outputted XML. If None (and user is provided), the division will be queried from the active directory server.

  • group (None or str) -- The group number of the project. If provided, this string will be replicated under the "project" information in the outputted XML. If None (and user is provided), the group will be queried from the active directory server.

  • filename (str) -- The filename to which the events should be written

nexusLIMS.harvester.sharepoint_calendar.fetch_xml(instrument, dt_from=None, dt_to=None)[source]

Get the XML responses from the Nexus Sharepoint calendar for one instrument.

Parameters:
  • instrument (Instrument) -- As defined in get_events(), one of the NexusLIMS instruments contained in the database. Controls what instrument calendar is used to get events

  • dt_from (datetime or None) -- A datetime object representing the start of a calendar event to search for. If dt_from and dt_to are None, no date filtering will be done. If just dt_from is None, all events from the beginning of the calendar record will be returned up until dt_to.

  • dt_to (datetime or None) -- A datetime object representing the end of a calendar event to search for. If dt_from and dt_to are None, no date filtering will be done. If just dt_to is None, all events from dt_from to the present will be returned.

Returns:

api_response -- A string containing the XML calendar information for each instrument requested, stripped of the empty default namespace. If dt_from and dt_to are provided, it will contain just one "entry" representing a single event on the calendar

Return type:

str

Notes

To find the right event, an API request to the Sharepoint Calendar will be made for all events starting on the same day as dt_from. This could result in multiple events being returned if there is more than one session scheduled on that microscope for that day. To find the right one, the timespan between each event's StartTime and EndTime returned from the calendar will be compared with the timespan between dt_from and dt_to. The event with the greatest overlap will be taken as the correct one. This approach should allow for some flexibility in terms of non-exact matching between the reserved timespans and those recorded by the session logger.
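The overlap comparison described in these notes can be sketched as follows. This is an illustration of the idea only, not the package's implementation: the overlap and best_match helpers and the (start, end, label) tuples are hypothetical, and the real code works on the XML returned by the calendar.

```python
from datetime import datetime, timedelta


def overlap(start_a, end_a, start_b, end_b):
    """Return the length of the overlap between two time spans (zero if disjoint)."""
    latest_start = max(start_a, start_b)
    earliest_end = min(end_a, end_b)
    return max(earliest_end - latest_start, timedelta(0))


def best_match(events, dt_from, dt_to):
    """Pick the (start, end, label) tuple that overlaps most with dt_from..dt_to."""
    if not events:
        return None
    return max(events, key=lambda ev: overlap(ev[0], ev[1], dt_from, dt_to))


day = datetime(2020, 1, 15)
events = [
    (day.replace(hour=8), day.replace(hour=10), "morning"),
    (day.replace(hour=10), day.replace(hour=14), "midday"),
    (day.replace(hour=15), day.replace(hour=17), "afternoon"),
]
# Session activity recorded from 09:45 to 13:30
match = best_match(events, day.replace(hour=9, minute=45), day.replace(hour=13, minute=30))
print(match[2])  # midday
```

Here the recorded session overlaps the morning reservation by 15 minutes and the midday reservation by 3.5 hours, so the midday reservation is chosen even though the session started slightly early.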

nexusLIMS.harvester.sharepoint_calendar.get_auth(filename='credentials.ini', basic=False)[source]

Set up NTLM authentication for the Microscopy Nexus using an account specified in a file that lives in the package root named .credentials (or some other value provided as a parameter). Alternatively, the stored credentials can be overridden by supplying two environment variables: nexusLIMS_user and nexusLIMS_pass. These variables are queried first; if they are not found, the function falls back to the credentials file.

Parameters:
  • filename (str) -- Name relative to this file (or absolute path) of file from which to read the parameters

  • basic (bool) -- If True, return only username and password rather than NTLM authentication (like what is used for CDCS access rather than for NIST network resources)

Returns:

auth -- NTLM authentication handler for requests, or a tuple of username and password if basic is True

Return type:

requests_ntlm.HttpNtlmAuth or tuple

Notes

The credentials file is expected to have a section named [nexus_credentials] and two values: username and password. See the credentials.ini.example file included in the repository as an example.
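A minimal sketch of the lookup order described for get_auth(), assuming the file layout given in these notes. The read_credentials helper is hypothetical and takes the file contents as a string so that the example is self-contained; falling back to the file whenever either environment variable is missing is also an assumption. The actual function reads from a path and can return an NTLM handler instead of a tuple.

```python
import configparser
import os


def read_credentials(ini_text, environ=None):
    """Return (username, password), preferring the environment variables."""
    environ = os.environ if environ is None else environ
    user = environ.get("nexusLIMS_user")
    password = environ.get("nexusLIMS_pass")
    if user is not None and password is not None:
        return user, password

    # Disable interpolation so a "%" in a password is read literally
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    section = parser["nexus_credentials"]
    return section["username"], section["password"]


example_ini = """
[nexus_credentials]
username = example_user
password = example_password
"""

print(read_credentials(example_ini, environ={}))
```

Passing an explicit environ mapping keeps the sketch testable without touching the real process environment.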

nexusLIMS.harvester.sharepoint_calendar.get_events(instrument=None, dt_from=None, dt_to=None, user=None, division=None, group=None, wrap=True)[source]

Get calendar events for a particular instrument on the Microscopy Nexus, on some date, or by some user.

Parameters:
  • instrument (Instrument or str) -- One of the NexusLIMS instruments contained in the database. Controls what instrument calendar is used to get events. If string, value should be one of the instrument PIDs from the Nexus facility.

  • dt_from (datetime or None) -- A datetime object representing the start of a calendar event to search for, as in fetch_xml(). If dt_from and dt_to are None, no date filtering will be done. If just dt_from is None, all events from the beginning of the calendar record will be returned up until dt_to.

  • dt_to (datetime or None) -- A datetime object representing the end of a calendar event to search for, as in fetch_xml(). If dt_from and dt_to are None, no date filtering will be done. If just dt_to is None, all events from dt_from to the present will be returned.

  • user (None or str) -- Either None or a valid NIST username (the short format: e.g. "ear1" instead of ernst.august.ruska@nist.gov). If None, no user filtering will be performed. No verification of username is performed, so it is up to the user to make sure this is correct.

  • division (None or str) -- The division number of the project. If provided, this string will be replicated under the "project" information in the outputted XML. If None (and user is provided), the division will be queried from the active directory server.

  • group (None or str) -- The group number of the project. If provided, this string will be replicated under the "project" information in the outputted XML. If None (and user is provided), the group will be queried from the active directory server.

  • wrap (bool) -- Boolean used to choose whether to apply the _wrap_events() function to the output XML string.

Returns:

output -- A well-formed XML document in a string, containing one or more <event> tags that contain information about each reservation, including title, instrument, user information, reservation purpose, sample details, description, and date/time information.

Return type:

str
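Because the output is a well-formed XML string, it can be read with the standard library. The sketch below is only an illustration: the sample string, the events wrapper element, and the title child element are hypothetical, since the exact element names inside each event are not specified here.

```python
import xml.etree.ElementTree as ET

# Hypothetical output; the real wrapper and child element names may differ
sample_output = """<events>
  <event><title>Session A</title></event>
  <event><title>Session B</title></event>
</events>"""

root = ET.fromstring(sample_output)
# iter() searches the whole tree, so the name of the wrapper element does not matter
event_elements = list(root.iter("event"))
titles = [ev.findtext("title") for ev in event_elements]
print(len(event_elements), titles)
```

Using iter("event") rather than a fixed path keeps the sketch independent of how the events are wrapped.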