Dashboards
The following dashboards report the status of the MEG systems, mainly:
the QD Helium recovery system
the MEG-KIT system
the MEG-OPM system
the Vpixx devices system
The dashboards are updated weekly to show any incident or problem.
QD Helium Recovery Dashboard
MEG-KIT system status dashboard
MEG-OPM system status dashboard
Vpixx devices system status dashboard
Data Quality Dashboards
An MEG signal is a measurement of a very small magnetic field, on the order of 100 fT, where one femtotesla is \(1\,\text{fT} = 10^{-15}\,\text{T}\) and one picotesla is \(1\,\text{pT} = 10^{-12}\,\text{T}\).
EEG scalp signals are about 50 to 100 \(\mu\text{V}\).
Note
The dashboards are automatically updated every day at 10:00 am UAE time. Check the GitHub Actions for more information.
Data Quality metrics
The metrics defined in the table below serve as the basis to assess the quality of empty-room data acquired from either the MEG-KIT or the MEG-OPM system. The SNR (signal-to-noise ratio) can be qualitatively evaluated from the measurements of the different metrics. Poor SNR can have multiple causes: a new, unidentified recurrent noise source, a defect in the equipment, or a specific event causing noise that is usually absent. Poor SNR can lead to experiments needing more trials or additional artifact-removal analysis.
| Metric | Formula | Description | Threshold | Label |
|---|---|---|---|---|
| Average value | \(A(t) = \frac{1}{m - p + 1} \sum_{i=p}^{m} a_i\) | The average or mean of a set of data points is the sum of all the data points divided by the total number of data points. | ? | |
| Max value | \(\max(s(t))\) | | < 3 fT | |
| Variance | \(\text{Variance} = \frac{1}{m - p + 1} \sum_{i=p}^{m} (a_i - A(t))^2\) | It is calculated by finding the average of the squared differences from the mean. | ? | |
| FFT (Fast Fourier Transform) | | FFT is used to convert from the time domain to the frequency domain. | <= 10 fT | |
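The metrics in the table can be sketched with NumPy as follows. This is a minimal illustration, not the project's script: the function name `compute_metrics`, the `sfreq` (sampling frequency) argument, and the choice of reporting the largest single-sided FFT amplitude are assumptions, and the signal is assumed to be a 1-D array in tesla.

```python
import numpy as np


def compute_metrics(signal, sfreq):
    """Return average, max, variance and peak FFT amplitude of a 1-D signal.

    Hypothetical helper for illustration; values stay in the input units.
    """
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    average = signal.mean()                      # A(t) in the table
    variance = np.mean((signal - average) ** 2)  # mean squared deviation
    maximum = signal.max()
    # Single-sided amplitude spectrum: 2/n scaling is valid for bins
    # other than DC (and Nyquist), which is why DC is skipped below.
    amplitudes = 2.0 * np.abs(np.fft.rfft(signal)) / n
    freqs = np.fft.rfftfreq(n, d=1.0 / sfreq)
    peak = int(np.argmax(amplitudes[1:])) + 1
    return {
        "average": average,
        "max": maximum,
        "variance": variance,
        "peak_frequency": freqs[peak],
        "peak_amplitude": amplitudes[peak],
    }
```

Multiplying the returned values by 1e15 (or 1e30 for the variance) expresses them in femtotesla units, which is the scale used by the thresholds in the table.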
KIT Data Quality Dashboard
This dashboard monitors the quality of the data generated by the KIT-MEG system. Empty-room data is recorded from the KIT system every couple of days, and the metrics are then computed for each dataset. The results are displayed automatically in the following dashboards.
Metric computation for KIT empty-room data
..
.. csv-table:: Metric computation for KIT empty-room data
:file: ../data/kit-con-files-statistics.csv
:header-rows: 1
Plot of Average and Variance Over Time
KIT average metric plot
KIT variance plot
KIT max value plot
KIT FFT plot
Perspectives on Data Quality dashboards
check the lab manual to define new metrics and thresholds
whiten the data and compute a noise vector
optimise the memory usage of process_con_files_for_table.py by cropping the data to 10 seconds only
migrate the dashboard generation to a dedicated server, triggered when a file is added
track dataset files: if one has already been processed, avoid downloading it again, unless the metrics have changed
create the .csv statistics table dynamically: every time a file is processed, its entry is added to the csv (rather than after all the files have been processed)
currently, metrics are computed over all the channels; it would be ideal to have other views showing a per-channel computation, which can help detect faulty sensors and troubleshoot any channel-based issue
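As a starting point for the per-channel view mentioned in the last item, the computation could look like the sketch below. It assumes the data is a 2-D array of shape (n_channels, n_samples) in tesla, as returned by MNE's `raw.get_data()`. The function name `per_channel_metrics`, the use of the absolute average, and the default 3 fT threshold are assumptions made for illustration.

```python
import numpy as np


def per_channel_metrics(data, threshold_ft=3.0):
    """Compute average and variance per channel and flag suspicious channels.

    Hypothetical helper: `data` has shape (n_channels, n_samples), in tesla.
    """
    data_ft = np.asarray(data, dtype=float) * 1e15  # tesla -> femtotesla
    averages = data_ft.mean(axis=1)   # one value per channel
    variances = data_ft.var(axis=1)   # in fT^2, one value per channel
    # Assumption: a channel is flagged when |average| reaches the threshold
    flagged = np.flatnonzero(np.abs(averages) >= threshold_ft)
    return averages, variances, flagged
```

The indices in `flagged` can then be mapped to channel names to point at a possibly faulty sensor.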
Documentation for dashboards
This section is a documentation guide for users who would like to understand the current dashboards and some technical details of their generation. The dashboards monitor the status of the systems and the quality of the data by measuring noise levels in empty-room data, while providing informative labels for quick access and overall numerical values in a simple format. They include graphs showing the different metrics (e.g., average and variance) computed for each empty-room data file, as well as a table listing the current state of each empty-room dataset. The state indicates whether or not the dataset is within the “good” noise thresholds for each metric.
1. The following use cases are enabled by the dashboards:
Easily monitor the status of the systems using the table and track periods of time when incidents happened
Get a summary of several metrics that evaluate the quality of the data
Trace back any data-quality issues for an experiment performed on a specific day
1. The source of this data is empty-room data hosted on the NYU-BOX data drive.
1. Overview of the table: the data quality metrics used are defined in docs/source/data/noise_metrics.csv
File Details

| Column Name | Description | Obtained |
|---|---|---|
| Status | Indicates whether the file's average is within or above the threshold. The status is shown by a color: green for within the threshold, red for above it. A file is within the threshold when its average is below 3 fT. | Calculates the average of the signal over time and compares it to the threshold. |
| File Name | A combination of the time and the name of the file, separated by a '_'. | Obtained from the metadata available in the NYU-DATA box. |
| Average | The average of the signal over time. | Computed with a simple averaging function in Python. |
| Variance | The variance of the signal over time. | Computed with a simple variance function in Python. |
| Date | The date, formatted as format="%d-%m-%y %H:%M:%S". | Obtained from the 'last-modified' metadata field. |
| Details | Describes details of the day and/or experiment that might explain the results obtained for the file. | Added by the user; the default is "Nothing added yet". |
1. Future directions and perspectives:
build a database to host the data instead of having them as files
identify other metrics to be added to the existing list
get system status values by executing automated system tests
Dashboard Generation Developer Guide
Overview
The dashboard is generated from empty room data hosted on the NYU-BOX storage drive. The scripts for generating the dashboard are located under docs/source/scripts/dashboard-generating-scripts. This guide explains how to download empty room data from the NYU-BOX storage using Python scripts. It covers setting up the Box SDK, authenticating using JWT, accessing folder data, and downloading .con files. It also includes information on processing these downloaded files.
The stack being used comprises:
backend: boxsdk, readthedocs
frontend: sphinx documentation, plotly
conf.py is the Sphinx configuration script; it executes several operations needed to build the documentation website. We use it to run the following dashboard-generation scripts:
- System dashboard generation: generate_system_status_dashboards.py
  - takes a .csv file as input, containing the status data of a system (timestamp, status, sub-system name)
  - computes the weekly status activities from the timestamps
  - generates the .html files that display the system status dashboards
- Data quality dashboard generation:
  - box_script.py connects to NYU-BOX using the Box SDK and downloads empty-room data to the build server (Read the Docs)
    - uses private keys, which can be provided as a .env file on your machine or set as environment variables in your build; this step will vary depending on your setup, so it is important to include error handling
    - an NYU Box app has been approved with the permissions required to access and download the files; the secrets are generated from the approved app
  - processing_empty_room_data_files.py
    - for each downloaded file, computes the data-quality metrics
    - produces a .csv file with the results (con_file_statistics.csv)
    - generates the .html files that plot the dashboards
  - convert_csv_to_rst.py converts two .csv files to .rst pages:
    - ‘9-dashboard/data/noise_metrics.csv’, containing the definition of the data quality metrics and the acceptable thresholds for each
    - ‘9-dashboard/data/con_file_statistics.csv’, containing the data quality metrics computed for each dataset and whether or not the file is within the thresholds
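The weekly aggregation performed for the system status dashboards could look like the sketch below. The column names (`timestamp`, `status`, `subsystem`), the ISO timestamp format, the status values, and the rule that a week counts as an incident if any entry that week is not "ok" are all assumptions made for illustration; generate_system_status_dashboards.py may work differently.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime


def weekly_status(csv_text):
    """Summarise status entries per (sub-system, ISO year, ISO week).

    Hypothetical helper: expects columns timestamp, status, subsystem.
    """
    weeks = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        iso = datetime.fromisoformat(row["timestamp"]).isocalendar()
        # iso[0] is the ISO year, iso[1] the ISO week number
        weeks[(row["subsystem"], iso[0], iso[1])].append(row["status"])
    # Assumed rule: a week is "ok" only if every entry that week is "ok"
    return {
        key: "ok" if all(s == "ok" for s in statuses) else "incident"
        for key, statuses in weeks.items()
    }
```

Each resulting entry corresponds to one cell of a weekly status view, which a plotting step can then render as .html.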
Installation
First, you need to install the boxsdk library. If you are using a .env file, you will also need to install python-dotenv:
pip install boxsdk
pip install python-dotenv
Setting Up Authentication
Define your private keys, such as client_id, client_secret, and any other necessary keys. Then, set up JWT authentication:
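from boxsdk import JWTAuth, Client

# client_id, client_secret, enterprise_id, public_key_id, private_key and
# passphrase are the secrets generated from the approved Box app; define
# them beforehand (e.g. from environment variables)
auth = JWTAuth(
    client_id=client_id,
    client_secret=client_secret,
    enterprise_id=enterprise_id,
    jwt_key_id=public_key_id,
    rsa_private_key_data=private_key,
    rsa_private_key_passphrase=passphrase,  # bytes
)
client = Client(auth)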
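The private keys can be read from environment variables with explicit error handling before the authentication object is built. The sketch below is one possible approach. The variable names (`BOX_CLIENT_ID`, `BOX_CLIENT_SECRET`, `BOX_PUBLIC_KEY_ID`) and the function name `load_box_secrets` are hypothetical. When a .env file is used, calling `load_dotenv()` from python-dotenv beforehand copies its entries into the environment.

```python
import os

# Hypothetical variable names; use the ones defined in your own setup
REQUIRED_KEYS = ("BOX_CLIENT_ID", "BOX_CLIENT_SECRET", "BOX_PUBLIC_KEY_ID")


def load_box_secrets(environ=os.environ):
    """Return the required secrets, failing early with a clear message."""
    missing = [key for key in REQUIRED_KEYS if not environ.get(key)]
    if missing:
        raise RuntimeError("Missing Box credentials: " + ", ".join(missing))
    return {key: environ[key] for key in REQUIRED_KEYS}
```

Failing early in this way makes a misconfigured build stop with a readable message instead of an authentication error later in the download step.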
Accessing Folders
After accessing the Box data correctly, you need to create a function that retrieves the ID of folders (the unique address for each folder). This function will start at the root directory and traverse the path, which is a list of folder names separated by “/”. It begins with the root folder ID and checks each folder name in the path. If it finds a folder with the matching name, it updates the folder_id to that folder’s ID and continues to the next folder:
def get_folder_id_by_path(path):
    # Root folder id is "0"
    folder_id = "0"
    for folder_name in path.split("/"):
        # List the content of the current folder and look for the next name
        items = client.folder(folder_id).get_items()
        folder_id = None
        for item in items:
            if item.type == "folder" and item.name == folder_name:
                folder_id = item.id
                break
        if folder_id is None:
            raise ValueError(f'Folder "{folder_name}" not found in path.')
    return folder_id
Downloading Files
Next, create a function that downloads files from a specified directory. This function will download all .con files, and if it finds a folder, it will call the function again recursively:
import os


def download_con_files_from_folder(folder_id, path):
    # Make sure the local target directory exists
    os.makedirs(path, exist_ok=True)
    folder = client.folder(folder_id).get()
    items = folder.get_items(limit=100, offset=0)
    for item in items:
        # Define the type of file you want to download
        if item.type == "file" and item.name.endswith(".con"):
            file_path = os.path.join(path, item.name)
            with open(file_path, "wb") as open_file:
                client.file(item.id).download_to(open_file)
        elif item.type == "folder":
            # Recurse into sub-folders, mirroring the folder structure locally
            new_folder_path = os.path.join(path, item.name)
            download_con_files_from_folder(item.id, new_folder_path)
To get the date when a file was last modified, you can use file.modified_at.
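The sketch below shows one way to turn that value into the date format used in the dashboard table. It assumes `modified_at` is an ISO 8601 string with a UTC offset (for example `2024-03-05T14:30:00-08:00`); the function name `format_modified_at` is hypothetical.

```python
from datetime import datetime


def format_modified_at(modified_at):
    """Convert an ISO 8601 timestamp to the dashboard's date format."""
    # Assumption: the offset is written as +HH:MM or -HH:MM, not "Z"
    parsed = datetime.fromisoformat(modified_at)
    return parsed.strftime("%d-%m-%y %H:%M:%S")
```

The formatted string keeps the local time of the timestamp; converting to a common timezone first would be a separate design choice.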
Data Preparation
processing_con_files_for_table.py processes the .con files, computes metrics, and generates a .csv file with the results.
import os
import numpy as np
import pandas as pd
import mne


def process_all_con_files(base_folder):
    results = []
    for root, _, files in os.walk(base_folder):
        for file in files:
            if file.endswith(".con"):
                file_path = os.path.join(root, file)
                # Get the results of the function that calculates the average, variance, and status
                avg, var, status = process_con_file(file_path)
                # extract_date is a helper (not shown here) that extracts the date
                date = extract_date(file)
                # Default value for details
                details = "Nothing added yet"
                # Format the date string to your needs
                date_str = (
                    date.strftime("%d-%m-%y %H:%M:%S") if date else "Unknown Date"
                )
                results.append(
                    {
                        "Status": status,
                        "File Name": file,
                        "Average": avg,
                        "Variance": var,
                        "Date": date_str,
                        "Details": details,
                    }
                )
    return results
This script processes all .con files, calculating the average and variance of each signal. It also checks whether the average falls within the specified threshold.
def process_con_file(file_path):
    threshold = 3  # Threshold on the average, in femtotesla
    # Load the .con file using MNE and keep only the MEG channels
    raw = mne.io.read_raw_kit(file_path, preload=True)
    raw.pick_types(meg=True, eeg=False)
    # Get data for all channels (shape: n_channels x n_samples, in tesla)
    data = raw.get_data()
    # Calculate average and variance across all channels
    avg = np.mean(data) * 1e15  # Convert to femtotesla
    var = np.var(data)  # Left in tesla squared
    # Plain string, so the label is written to the CSV as-is
    status = "🟢 In the threshold" if avg < threshold else "🔴 Above the threshold"
    return avg, var, status
The script generates a .csv file with the results and creates graphs to display the numerical values.
def save_results_to_csv(results, output_file):
    # Ensure the directory exists (dirname is empty for a bare file name)
    output_dir = os.path.dirname(output_file)
    if output_dir:
        os.makedirs(output_dir, exist_ok=True)
    # Save results to CSV
    df = pd.DataFrame(results)
    df.to_csv(output_file, index=False)
convert_csv_to_rst.py generates .rst pages from the CSV files. It accesses all the .csv files in a specific directory, converts them into reStructuredText format, and saves them in the output folder.
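The conversion step can be illustrated with the sketch below, which turns CSV text into a reStructuredText `list-table` directive. It is only an assumption about the output form: the function name `csv_to_rst_list_table` is hypothetical, and convert_csv_to_rst.py may emit a different table style.

```python
import csv
import io


def csv_to_rst_list_table(csv_text, title):
    """Render CSV text as a reStructuredText list-table directive."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    lines = [f".. list-table:: {title}", "   :header-rows: 1", ""]
    for row in rows:
        for index, cell in enumerate(row):
            # The first cell of a row opens a new list item, the others continue it
            prefix = "   * - " if index == 0 else "     - "
            lines.append(prefix + cell)
    return "\n".join(lines) + "\n"
```

Writing the returned string to a .rst file in the output folder gives a page that Sphinx can include in the documentation build.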