Dashboards

The following dashboards provide the status of the MEG systems mainly

  • the QD Helium recovery system

  • the MEG-KIT system

  • the MEG-OPM system

  • the Vpixx devices system

The dashboard are updated weekly to show any incident or problem.

QD Helium Recovery Dashboard

MEG-KIT system status dashboard

MEG-OPM system status dashboard

Vpixx devices system status dashboard

Data Quality Dashboards

An MEG signal is a measurement of a very small magnetic field, in the order of 100 fT, where femtotesla \(1fT = 10^{-15} T\) and picotesla \(1pT = 10^{-12} T\).

EEG scalp signals are about 50 to 100 \(\mu\text{V}\)

Note

The dashboards are automatically updated every day at 10:00 am UAE time check the github actions for more information

Data Quality metrics

The metrics, defined in the table below, will serve as basis to asess the quality of empty-room data acquired from either MEG-KIT or MEG-OPM systems. The SNR (Signal to Noise Ratio) is a measure that can be qualitatively evaluated given the measurements of the different metrics. Poor SNR, can have multiple reasons: new unindentified recurrent noise source, defect in the equipment, specific event causing noise that is usually absent. Poor SNR can lead to experiments needing more number of trials, or artifact removal analysis.

Noise Metrics Table

Metric

Formula

Description

Threshold

Label

Average value

\(A(t) = \frac{1}{(m - p + 1)} \sum_{i=p}^{m} a_i\)

The average or mean of a set of data points is simply the sum of all the data points divided by the total number of data points.

?

Max value

\(max(s(t))\)

<3fT

Variance

\(\text{Variance} = \frac{1}{(m - p + 1)} \sum_{i=p}^{m} (a_i - A(t))^2\)

It is calculated by finding the average of the squared differences from the mean

?

FFT (Fast Fourier Transform)

FFT is used to convert from time to frequency domain

<= 10 fT

KIT Data Quality Dashboard

This dashboard monitors the quality of the data generated from the KIT-MEG system. Empty room data is recorded from the KIT system every couple of days, then for each dataset,the metrics are computed. The results are displayed automatically in the following dashboards.

Plot of Average and Variance Over Time

Perspectives on Data Quality dashboards

  • checkout the lab manual to define new metrics and threshold

  • Whitening the data  compute a noise vector

  • optimise the memory usage of process_con_files_for_table.py by cropping the data to 10 seconds only

  • migrate the dashboard generation to a dedicated server triggered on file adding

  • track dataset files, if one is already processed, avoid redownloading again, unless metrics have been changed

  • create the .csv statistics table dynamically, everytime a file is processed, the entry in the csv is added (and not after processing all the files)

  • currently metrics are computed over all the channels, it would be ideal to have other views showing a per channel computation, this can help detect faulty sensors and troubleshoot any channel based issue

Documentation for dashboards

This section is a documentation guide targeted to users who would like to understand the current dashboards and some technical details regarding their generation. The dashboards are meant to monitor the status of the systems and the quality of the data by measuring noise levels in empty-room data while providing informative labels, for quick access and over all numerical values in a simple format. It has graphs showing the computation of different metrics (e.g., average and variance) of each empty-room data file, as well as a table listing the current state of each empty-room data dataset. The state indicates whether or not the dataset is within the “good” noise thresholds for each metric.

1. The following use cases are enabled by the dashboards:

  • Easily monitor the status of the systems using the table and track periods of time where incidents had happened

  • Get a summary of several measurements of metrics that evaluates the quality of data

  • Track down in history any data-quality issues when an experiment has been performed on a specific day

1. The source of this data is empty-room data hosted on the NYU-BOX data drive.

1. Overview of the table: the used data quality metrics are present in docs/source/data/noise_metrics.csv

File Details

Column Name

Description

Obtained

Status

Gives the status of the given file’s average if it’s above or within the threshold. The status is indicated by a color: green for safe, red for above the threshold. The threshold is defined as below 3 fT.

Calculates the average of the signal over time and compares it to the threshold.

File Name

It is a combination of the time and the name of the file, separated by a ‘_’.

Obtained from the metadata available in the NYU-DATA box.

Average

Calculates the average of the signal over time.

Calculated by the simple functions defining the average function in Python.

Variance

Calculates the variance of the signal over time.

Calculated by the simple functions defining the variance function in Python.

Date

The date is defined as: format=”%d-%m-%y %H:%M:%S”.

Obtained from the metadata of the ‘last-modified’ field.

Details

Describes the details of the day and/or experiment that might explain the results obtained in the file.

Added by the user, default is “Nothing added yet”.

1. Future directions and perspectives:
  • build a database to host the data instead of having them as files

  • identify other metrics to be added to the existing list

  • get system status values by executing automated system tests

Dashboard Generation Developer Guide

Overview

The dashboard is generated from empty room data hosted on the NYU-BOX storage drive. The scripts for generating the dashboard are located under docs/source/scripts/dashboard-generating-scripts. This guide explains how to download empty room data from the NYU-BOX storage using Python scripts. It covers setting up the Box SDK, authenticating using JWT, accessing folder data, and downloading .con files. It also includes information on processing these downloaded files.

The stack being used comprises:

  • backend: boxsdk, readthedocs

  • frontend: sphinx documentation, plotly

The conf.py is a sphinx documentation backend script that executes several operation necessary for building the documentation website, we use it to execute the following scripts for dashboard generation:

  • System dashboard generation: generate_system_status_dashboards.py
    • takes a .csv file as input, that contains the data of the status of a system (timestamp, status, sub-system name)

    • computes the weekly status activities from the timestamp

    • generates the .html files for the display of the system status dashboards

  • Data quality dashboard generation:
    • box_script.py connects to NYU-BOX using the BOX-SDK and downloads empty room data to the build server (Read The Docs)
      • uses private keys, which can be provided as an .env file on your machine or set as environment variables in your build

      • step will vary depending on your setup, so it’s important to include error handling.

      • an NYU Box app has been approved with the permissions required to access and download the files, the secrets are generated from the approved app

    • processing_empty_room_data_files.py
      • for each downloaded file, computes the data-quality metrics

      • produces a .csv with the results con_file_statistics.csv

      • generates the .html files to plot the dashboards

    • convert_csv_to_rst.py converts two .csv
      • ‘9-dashboard/data/noise_metrics.csv’ containing the definition of the data quality metrics and acceptable thresholds for each

      • ‘9-dashboard/data/con_file_statistics.csv’ containing the data quality metrics computation for each dataset and whether or not the file is in the thresholds

Installation

First, you need to install the boxsdk library. If you are using a .env file, you will also need to install python-dotenv:

pip install boxsdk
pip install python-dotenv

Setting Up Authentication

Define your private keys, such as client_id, client_secret, and any other necessary keys. Then, set up JWT authentication:

from boxsdk import JWTAuth, Client

auth = JWTAuth(
    client_id=client_id,
    client_secret=client_secret,
    jwt_key_id=public_key_id,
    # Add any additional keys needed
)
client = Client(auth)

Accessing Folders

After accessing the Box data correctly, you need to create a function that retrieves the ID of folders (the unique address for each folder). This function will start at the root directory and traverse the path, which is a list of folder names separated by “/”. It begins with the root folder ID and checks each folder name in the path. If it finds a folder with the matching name, it updates the folder_id to that folder’s ID and continues to the next folder:

def get_folder_id_by_path(path):
    # Root folder id is "0"
    folder_id = "0"
    for folder_name in path.split("/"):
        items = client.folder(folder_id).get_items()
        folder_id = None
        for item in items:
            if item.type == "folder" and item.name == folder_name:
                folder_id = item.id
                break
        if folder_id is None:
            raise ValueError(f'Folder "{folder_name}" not found in path.')
    return folder_id

Downloading Files

Next, create a function that downloads files from a specified directory. This function will download all .con files, and if it finds a folder, it will call the function again recursively:

import os

def download_con_files_from_folder(folder_id, path):
    folder = client.folder(folder_id).get()
    items = folder.get_items(limit=100, offset=0)

    for item in items:
        # Define the type of file you want to download
        if item.type == "file" and item.name.endswith(".con"):
            file_id = item.id
            file = client.file(file_id).get()
            filename = f"{file.name}"
            file_path = os.path.join(path, filename)
            with open(file_path, "wb") as open_file:
                file.download_to(open_file)
        elif item.type == "folder":
            new_folder_path = os.path.join(path, item.name)
            os.makedirs(new_folder_path, exist_ok=True)
            download_con_files_from_folder(item.id, new_folder_path)

To get the date when a file was last modified, you can use file.modified_at.

Data Preparation

  • processing_con_files_for_table.py processes the .con files, computes metrics, and generates a .csv file with the results.

import os
import numpy as np
import pandas as pd
import mne

def process_all_con_files(base_folder):
    results = []

    for root, _, files in os.walk(base_folder):
        for file in files:
            if file.endswith(".con"):
                file_path = os.path.join(root, file)
                # Get the results of the function that calculates the average, variance, and status
                avg, var, status = process_con_file(file_path)
                # A function that extracts the date
                date = extract_date(file)
                # Default value for details
                details = "Nothing added yet"
                # Format the date string to your needs
                date_str = (
                    date.strftime("%d-%m-%y %H:%M:%S") if date else "Unknown Date"
                )
                results.append(
                    {
                        "Status": status,
                        "File Name": file,
                        "Average": avg,
                        "Variance": var,
                        "Date": date_str,
                        "Details": details,
                    }
                )

    return results

This script processes all .con files, calculating the average and variance of each signal. It also checks the date to see if it falls within a specified threshold.

def process_con_file(file_path):
    # Load the .con file using MNE
    threshold = 3  # Set the threshold
    raw = mne.io.read_raw_kit(file_path, preload=True)
    raw.pick_types(meg=True, eeg=False)

    # Get data for all channels
    data, times = raw.get_data(return_times=True)
    # Calculate average and variance across all channels
    avg = (np.mean(data)) * 1e15  # Convert to femtotesla
    var = np.var(data)
    status = [
        f"🟢 In the threshold" if avg < threshold else f"🔴 Above the threshold"
    ]

    return avg, var, status

The script generates a .csv file with the results and creates graphs to display the numerical values.

def save_results_to_csv(results, output_file):
    # Ensure the directory exists
    os.makedirs(os.path.dirname(output_file), exist_ok=True)

    # Save results to CSV
    df = pd.DataFrame(results)
    df.to_csv(output_file, index=False)
  • convert_csv_to_rst.py generates .rst pages from the CSV files. It accesses all the .csv files in a specific directory, converts them into reStructuredText format, and saves them in the output folder.