Data Management

Data Uploader

The Data Uploader is the main place for uploading data into a notebook. With this interface it is possible to parse and upload multiple files of the same type in a single operation. It can be accessed from any notebook, by clicking the icon-upld button in the NavBar.

The workflow is straightforward:

  1. select the type of data to upload

  2. drag&drop or select all files (or folders) to upload

  3. select file format type (if required)

  4. configure the parser (if required)

  5. check parser output preview

  6. upload everything

_images/data.uploader.1.png

Data Uploader interface (IMAGE data type preview).

It is important to know that data uploaded this way become data records inside the AMAD database. Please read the section about Data Records to better understand what happens to uploaded files.

The upper part of the panel includes the drag&drop area, the type/format selectors and a few global options for the naming conventions applied to the data records placed in the database.

The lower part of the panel shows the preview of the parser output. The icon-prev/icon-next buttons and record selector can be used to toggle the preview object.

Warning

In principle it is possible to drag&drop as many files as desired, but in practice the browser appears to prevent this. Users have observed that files become empty when too many are selected, and even when repeatedly uploading smaller batches of ~100 files at a time. Should this happen, refresh the page and proceed to upload more files.

The following sections discuss supported data formats and their parser configurations.

icon-FILE FILE

Files uploaded as FILE data records are not parsed: the record data contains an exact copy of the original file. This format is suitable for all files that need to be stored safely in the notebook, but do not contain parsable data (PDFs, text files, …).

One preview object is generated for each selected file; the preview only shows the file name and size.

The default name for a FILE record is the name of the original file.

icon-IMAGE IMAGE

Naturally, image files (jpg, png, svg, …) should be uploaded as IMAGE data records. Just like FILE, no parsing occurs and the record’s data is an exact copy of the original image. However, IMAGE records can be displayed anywhere in the notebook by the Image Element.

One preview object is generated for each selected file; the preview shows the file name, size, and the image itself.

The default name for an IMAGE record is the name of the original file.

icon-DATA1D DATA1D

DATA1D represents one-dimensional arrays, parsed from csv or numpy files. These records can be visualised in several notebook elements such as Table Element or Line/Scatter Plot Element. The accepted formats are:

  • csv: one DATA1D record for each column in the file

  • npy (2D): one DATA1D record for each column in the numpy matrix

  • npy (1D): one DATA1D record for each file

One preview object is generated for each selected file, and the preview shows a table with all records that could be extracted from the file.

The default record names are extracted from the source file if it is in csv format, or automatically generated for npy files.
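The csv rule above (one DATA1D record per column, named after the source file content) can be pictured with a minimal sketch. This is not the platform's parser: it assumes the first row of the file is a header holding the column names, and the function name columns_from_csv and the sample content are hypothetical.

```python
import csv
import io


def columns_from_csv(text):
    """Split CSV text into named columns, one per prospective DATA1D record."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]  # assumption: first row holds the names
    columns = {name: [] for name in header}
    for row in body:
        for name, cell in zip(header, row):
            try:
                columns[name].append(float(cell))
            except ValueError:
                columns[name].append(cell)  # keep non-numeric cells as text
    return columns


# Hypothetical file content: two columns, so two records would be expected.
sample = "time,current\n0.0,1.5\n0.1,1.7\n0.2,1.6\n"
columns = columns_from_csv(sample)
for name, values in columns.items():
    print(name, values)
```

Running a check like this on a file before uploading gives a quick idea of how many records to expect in the preview table and what they will be called.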

icon-DATA2D DATA2D

DATA2D is used to store matrices, and can be visualised by the 2D Plot Element. Currently, these records can only be parsed from numpy files that contain a two-dimensional tensor.

Each file generates one preview object, displayed as a 2D plot.

The default record name is the source file name.

icon-DATA3D DATA3D

DATA3D is used to store three-dimensional tensors, parsed from numpy files only. At present there is no notebook element for their visualisation.

Each file generates one preview object, displaying a partial list of tensor values and an isosurface plot (can take some time to render).

The default record name is the source file name.
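Since the numpy-based formats for DATA1D, DATA2D and DATA3D differ only in the dimensionality of the stored array, it can help to check a file before uploading it. The following is a sketch under stated assumptions: it uses the standard numpy.save and numpy.load functions, while the file names, the describe_npy helper and the EXPECTED_TYPE mapping are hypothetical and only restate the rules given in the sections above.

```python
import os
import tempfile

import numpy as np

# Data type matching each array dimensionality, as described in the text.
EXPECTED_TYPE = {1: "DATA1D", 2: "DATA1D or DATA2D", 3: "DATA3D"}


def describe_npy(path):
    """Return (shape, matching data type) for a .npy file."""
    array = np.load(path)
    return array.shape, EXPECTED_TYPE.get(array.ndim, "unsupported")


workdir = tempfile.mkdtemp()
arrays = {
    "profile.npy": np.linspace(0.0, 1.0, 5),  # 1D: one DATA1D record per file
    "image.npy": np.zeros((4, 3)),            # 2D: DATA2D, or one DATA1D per column
    "volume.npy": np.ones((2, 3, 4)),         # 3D: DATA3D
}
for filename, array in arrays.items():
    np.save(os.path.join(workdir, filename), array)

results = {name: describe_npy(os.path.join(workdir, name)) for name in arrays}
for name, (shape, data_type) in results.items():
    print(name, shape, data_type)
```

A two-dimensional file is valid for both DATA1D and DATA2D, so the data type chosen in the uploader decides whether it becomes one matrix record or one record per column.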

icon-STRUCT STRUCTURE

STRUCTURE records are designed to describe an atomistic structure, such as those commonly used in molecular dynamics or electronic structure simulations. These records can be visualised in the Atomic Structure Element. The following common formats are parseable by the platform:

  • XYZ

  • FHI-AIMS

  • VASP POSCAR

  • LAMMPS (data file)

  • CASTEP

  • PDB

Note

Only atomic positions and types are parsed and stored in the data record. Any additional information in the file will not be included in the database.

Note

An XYZ file may contain multiple frames of a trajectory, but only the first frame is actually parsed and stored in the database.

Each file generates one preview object, displaying the table of atomic coordinates and a 3D rendering of the system.

The default record name is the source file name.
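Because only the first frame of a multi-frame XYZ file is parsed, one possible workaround is to split a trajectory into single-frame files before uploading. The sketch below assumes the common XYZ layout (an atom-count line, a comment line, then one line per atom); the function name split_xyz_frames and the output file names are hypothetical.

```python
def split_xyz_frames(text):
    """Split multi-frame XYZ text into a list of single-frame strings."""
    lines = text.splitlines()
    frames = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1  # tolerate blank lines between frames
            continue
        n_atoms = int(lines[i].split()[0])
        block = lines[i:i + n_atoms + 2]  # count line + comment line + atoms
        if len(block) < n_atoms + 2:
            raise ValueError("truncated frame starting at line %d" % (i + 1))
        frames.append("\n".join(block) + "\n")
        i += n_atoms + 2
    return frames


# Hypothetical two-frame trajectory of a hydrogen molecule.
trajectory = (
    "2\nframe 0\nH 0.0 0.0 0.0\nH 0.0 0.0 0.74\n"
    "2\nframe 1\nH 0.0 0.0 0.0\nH 0.0 0.0 0.80\n"
)
frames = split_xyz_frames(trajectory)
for index, frame in enumerate(frames):
    print("frame_%04d.xyz" % index, "->", frame.splitlines()[1])
```

Each returned string can be written to its own file, so that every frame becomes a separate STRUCTURE record named after that file.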

icon-SPM SPM

This parser can extract data from raw scanning probe data files, and organises it into COMPOSITE records of SPM subtype. These records can then be visualised and processed in the Scanning Probe Microscopy. Currently supported raw data file formats are:

  • Nanonis SXM

  • IBW (Asylum Research)

  • Createc (.dat)

  • Omicron Matrix* (see SPM / Matrix)

Note

Only multi-channel 2D images are supported.

Each file generates one preview object, where the user can inspect the 2D plots of the channels and a partial list of extracted metadata (depending on the source file format).

The default record name is [SPM] filename.

icon-SPM SPM / Matrix

This parser extracts data from raw files generated by Omicron Matrix controllers, and organises it into COMPOSITE records of SPM subtype. These records can then be visualised in the Scanning Probe Microscopy.

Note

The parser requires the user to drag&drop or select folders instead of files. The system determines how to find and organise information based on the location of _0001.mtrx files in the overall selection.

Note

Only multi-channel 2D images are supported.

The parser generates one preview object (and its related data record) for each cycle/session/dataset/sample found in the list of selected files. All matrix data files belonging to the same sample should be in the same folder as their _0001.mtrx master file. The parser will further split them into separate records according to their dataset, session ID and cycle number.

The output record name is [SPM] sample_name-data_set_name-session-cycle.
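The folder requirement above can be checked locally before dragging a selection into the uploader. The following is a minimal sketch, not the platform's parser: it only looks for folders that contain a file ending in _0001.mtrx and lists the files sitting next to it. The helper name group_by_master and all file and folder names in the demonstration are hypothetical.

```python
import tempfile
from pathlib import Path


def group_by_master(root):
    """Map each folder containing a *_0001.mtrx file to the file names in it."""
    groups = {}
    for master in sorted(Path(root).rglob("*_0001.mtrx")):
        folder = master.parent
        if folder not in groups:
            groups[folder] = sorted(p.name for p in folder.iterdir() if p.is_file())
    return groups


# Build a throwaway selection; all file and folder names are hypothetical.
root = Path(tempfile.mkdtemp())
for relative in ["sample_a/exp_0001.mtrx", "sample_a/scan1.Z_mtrx", "stray/scan2.Z_mtrx"]:
    path = root / relative
    path.parent.mkdir(parents=True, exist_ok=True)
    path.touch()

groups = group_by_master(root)
for folder, names in groups.items():
    print(folder.name, names)
```

In this example the stray folder is not reported because it has no master file, which hints that its data files would need to be moved next to their _0001.mtrx file before upload.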

icon-NMR NMR / Bruker

This parser extracts data from raw Bruker TopSpin files, and organises it into COMPOSITE records of NMR subtype. These records can then be visualised in the Nuclear Magnetic Resonance.

Note

The parser requires the user to drag&drop or select folders instead of files. The system determines how to find and organise information based on the location of proc files in the overall selection.

Note

Only the NMR spectra and some metadata are encoded in the data record. Most files describing the state of the instrument are ignored.

Note

Only 1D and 2D spectra are supported.

Each detected experiment generates one preview object, where the user can inspect the plots of NMR channels.

The default record name is [NMR] foldername.

Data Records

Data is seamlessly incorporated into notebooks through the editor. Typically, when data is uploaded, the client browser is responsible for parsing the file and compiling a data record to upload to the server. The platform supports records of several basic types, listed here:

Built-in data types

type        description
DATA0D      Scalar value
DATA1D      One-dimensional array of elements (numbers or not)
DATA2D      Two-dimensional array (matrix)
DATA3D      Three-dimensional tensor
FILE        Generic file, not parsed by the system
IMAGE       Image file (e.g. jpg or png), not parsed by the system
STRUCTURE   Descriptor of an atomistic structure
COMPOSITE   Complex data structure made of other records

Data parsing and upload is done by the notebook elements (e.g. images, tables, …). Each data type can be uploaded from the configuration panel of a notebook element that is intended to visualise or operate with that particular type. For example, IMAGE records are uploaded through image elements, while DATA1D records are parsed out of the columns of CSV files in the table elements.

Files intended to be uploaded without parsing will become FILE records, uploaded by the Data Explorer upload tab.

Each record consists of a data structure with various fields. On the server they are python dictionaries, transferred to the client as JSON strings, where they finally become JavaScript objects.

Internal representation of a data record

field   type     description
name    string   name of the record (unique in the notebook)
uuid    string   a unique ID (within the notebook, likely also everywhere)
type    TYPE     the built-in data type (see Built-in data types)
units   string   physical units of the data (see Physical Quantities)
data    object   data content, variable format depending on type
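The fields listed above can be pictured as a Python dictionary that is serialised to a JSON string for the client. This is an illustrative sketch only: the field names come from the table, while the example values, the use of uuid4 for the ID and the list layout of the data field are assumptions.

```python
import json
import uuid

record = {
    "name": "temperature",          # unique within the notebook
    "uuid": str(uuid.uuid4()),      # assumed ID format, not confirmed by the source
    "type": "DATA1D",               # one of the built-in data types
    "units": "K",                   # physical units of the data
    "data": [293.1, 293.4, 293.9],  # assumed layout for DATA1D content
}

payload = json.dumps(record)    # the JSON string that would travel to the client
restored = json.loads(payload)  # stands in for the JavaScript object
print(payload)
```

The round trip leaves the record unchanged, which is why the same structure can be described once for the server, the wire format and the client.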

Warning

In principle there is no limit to the size of a data record. In practice, however, the browser limits memory usage to a fairly small portion of the total available memory, so the client cannot open and parse files that are too large. The maximum file size depends on the browser’s configuration (typically about 100 MB).

Note

Records of type FILE and IMAGE are not parsed, but sent directly as a binary stream, thus they do not have size limits (apart from reasonableness).
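The size constraint described above can be anticipated with a local check before uploading. The sketch below is only a rough aid: the 100 MB threshold is taken from the "typically about 100 MB" figure and will vary with browser configuration, and the helper name too_large_to_parse is hypothetical.

```python
import os
import tempfile

PARSE_LIMIT_BYTES = 100 * 1024 * 1024  # assumed from "typically about 100 MB"
UNPARSED_TYPES = {"FILE", "IMAGE"}     # sent as a binary stream, not parsed


def too_large_to_parse(paths, data_type, limit=PARSE_LIMIT_BYTES):
    """Return the files likely to exceed the browser's parsing limit."""
    if data_type in UNPARSED_TYPES:
        return []  # unparsed records are not subject to the parsing limit
    return [p for p in paths if os.path.getsize(p) > limit]


workdir = tempfile.mkdtemp()
small = os.path.join(workdir, "small.csv")
large = os.path.join(workdir, "large.csv")
with open(small, "w") as handle:
    handle.write("a,b\n1,2\n")
with open(large, "w") as handle:
    handle.write("x" * 2048)

# A tiny limit keeps the demonstration fast; the default is about 100 MB.
flagged = too_large_to_parse([small, large], "DATA1D", limit=1024)
unparsed = too_large_to_parse([small, large], "FILE", limit=1024)
print(flagged, unparsed)
```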

Metadata Systems

Metadata is an important tool to curate and sort data, making it easy to search, use and understand. AMAD offers three metadata systems: tags, value-tags and inventory-tags.

Metadata for a record, or a selection of multiple records, is accessed from the Data Explorer explore tab, using the icon-tag edit button in the metadata icon-kvp details section.

icon-tag Tags

Tags are simple labels that can be attached to data records. For example, data coming from experiments can be tagged with an experimental tag. Custom user tags can be defined using the Tags Tab of the Data Explorer.

Note

Tag definitions are global in the AMAD database. All users will be able to attach any tag to their data, regardless of who defined it.

icon-kvp Value-Tags

Value-tags (v-tags) are tags that can hold a custom value, also known as key-value pairs. The tag name and description are set when the v-tag is defined. Its value is assigned by the user when the v-tag is attached to a data record. v-tags can be useful to tag data records depending on some external conditions such as acquisition temperature or pressure, or a particular methodology. Custom v-tags can be defined using the Value-Tags Tab of the Data Explorer.

Note

v-tag definitions are global in the AMAD database. All users will be able to attach any defined v-tag to their data.

icon-inv Inventory Tags

Inventory tags are labels, much like normal tags, but restricted to inventory items (see Inventory). These are automatically defined when an item (tool, instrument, …) is added to a Lab. Users can use inventory tags corresponding to all the items in all the labs where they are members.

Data Explorer

The data explorer (icon-data) is the main access panel for dealing with data and metadata.

Explore Tab

By default, the Data Explorer opens with the explore tab, which looks like the usual file manager on any graphical operating system.

_images/data.dataexplorer.explore.png

Data Explorer - explore tab.

Most data records uploaded into the notebook will appear as files in the notebook root folder; composite data records, such as SPM or NMR, are shown as folders, and their child data records can be found inside. Additional folders are created for certain notebook elements that create their own data records. For example, the Atomic Structure Element, when configured in static mode, will generate an IMAGE record with a snapshot of the rendering. Similarly, the DATA0D records associated with an Active Table Element are placed inside a dedicated folder named after the table (e.g. Table 1).

Records can be renamed as in a conventional file manager: select the item to rename and click it once more to start editing its name.

Note

It might not be possible to rename certain data records generated by the system.

When a single data record is selected, the panel on the right will show some more details about it, if available, and an editable text area with its physical units, useful when the original raw data files do not contain physical units information (which is often the case).

Note

Changing physical units will not perform a conversion on the data: this will only replace the current record units with the given ones.
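The difference between relabelling and converting can be shown with a small sketch. The record layout and both helper functions are hypothetical and only illustrate the behaviour described in the note: editing the units replaces the label and leaves the values untouched, so any real conversion has to be applied to the data separately, for instance before upload.

```python
record = {"name": "height", "units": "nm", "data": [1.0, 2.5, 4.0]}


def relabel_units(rec, new_units):
    """Mimic the units editor: the label changes, the values do not."""
    return {**rec, "units": new_units}


def convert_units(rec, new_units, factor):
    """An explicit conversion: the values are rescaled along with the label."""
    return {**rec, "units": new_units, "data": [v * factor for v in rec["data"]]}


relabelled = relabel_units(record, "pm")
converted = convert_units(record, "pm", 1000.0)  # 1 nm = 1000 pm
print(relabelled["data"], converted["data"])
```

After relabelling, the record would claim heights of 1.0 to 4.0 pm, which is wrong by a factor of 1000 if the values were really measured in nm.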

Interaction buttons to edit metadata (icon-kvp, icon-edit), download (icon-dwld) or delete (icon-del) the record are also shown in the panel.

Clicking the icon-edit button in the icon-kvp metadata section will open the Metadata Editor. Metadata can be edited in bulk for a selection of multiple records at once if needed.

The icon-dwld button will open a selection panel where the user can choose the format to use for retrieving the data.

Warning

Some data types cannot currently be downloaded.

The icon-del delete button, or alternatively pressing the delete key on the keyboard, will permanently, inevitably and unrecoverably delete the selected records from the notebook so that their data and any trace of them will no longer exist anywhere on the platform. A confirmation dialog will be shown first, where the user is required to type the record name once more before its complete and utter obliteration.

Warning

Absolute, undeniable erasure of the data!

Explore Tab - Metadata Editor

This tab is part of the Explore tab, but it is shown when the user edits the metadata for a selection of records.

_images/data.dataexplorer.metaedit.png

Data Explorer - metadata editor tab.

The tab lists the records for which metadata will be applied, and the metadata they have. Tags found only in some records, but not all, are rendered differently, with a warning, and a button to harmonise the tag across the whole record selection.
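The distinction between tags shared by the whole selection and tags found only on some records amounts to simple set operations. The sketch below is an illustration under stated assumptions, not the editor's implementation: the record and tag names, and the helpers split_tags and harmonise, are hypothetical.

```python
def split_tags(records):
    """Return (tags on every record, tags on only some records)."""
    tag_sets = [set(tags) for tags in records.values()]
    if not tag_sets:
        return set(), set()
    common = set.intersection(*tag_sets)
    partial = set.union(*tag_sets) - common
    return common, partial


def harmonise(records, tag):
    """Attach a tag to every record in the selection."""
    return {name: sorted(set(tags) | {tag}) for name, tags in records.items()}


# Hypothetical selection of two records and their tags.
selection = {"scan_01": ["experimental", "STM"], "scan_02": ["experimental"]}
common, partial = split_tags(selection)
harmonised = harmonise(selection, "STM")
print(common, partial, harmonised)
```

Here STM is the tag that would be flagged with a warning, and harmonising it leaves no partially applied tags in the selection.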

The lower side of the tab shows the metadata type and tag selectors, and the button to add the chosen tag to the selection.

Tags Tab

_images/data.dataexplorer.tags.png

Data Explorer - tags tab.

With this tab, users can search for available metadata tags (see Metadata Systems) and add their own definitions.

To add a new tag definition, fill in the name and info text fields and click the add button. Tag names must be unique in the database.

Note

Definitions are global in the AMAD database. Please look for suitable, existing tags before defining new ones, and report duplicates to system administrators.

Value-Tags Tab

This tab is analogous to the Tags Tab but operates with value-tags (see Metadata Systems).

To add a new v-tag definition, fill in the name and info text fields and click the add button. Tag names must be unique in the database.

Note

Definitions are global in the AMAD database. Please look for suitable, existing v-tags before defining new ones, and report duplicates to system administrators.

Import Tab

_images/data.dataexplorer.import.png

Data Explorer - import tab.

The import tab of the data explorer lets the user import data records from any other accessible notebook into the currently open one. This will not clone the data, but only create a reference to the source record, with the same metadata.

In order to import data, choose the source notebook from the dropdown list: this will update the list of records available in that notebook. Then press the icon-add button of the desired record to import it.

The system will show a dialog asking for the local alias to use as record name within the notebook, thus avoiding possible name conflicts with pre-existing records.

Note

Imported records initially have the same metadata as the source records. Any later metadata changes to the imported record will not affect the source records.

The tab also shows a list of imported records, where it is possible to remove them from the notebook.

Note

Deleting imported data will not delete the original record, but only the reference.

Warning

If the source record is deleted, the reference will lose its meaning, as the actual data no longer exists.
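The reference semantics of imported records can be summarised with a small sketch. It is a conceptual model only, with hypothetical names and data layout: the import stores an alias and a pointer to the source record, the metadata starts as an independent copy, and the data can no longer be resolved once the source record is deleted.

```python
source_notebook = {
    "rec-1": {"name": "spectrum", "data": [1.0, 2.0, 3.0], "tags": ["experimental"]},
}


def import_record(source, key, alias):
    """Create a reference: no data is copied, metadata starts as a copy."""
    return {"alias": alias, "source_key": key, "tags": list(source[key]["tags"])}


def resolve(source, reference):
    """Return the referenced data, or None if the source record is gone."""
    record = source.get(reference["source_key"])
    return None if record is None else record["data"]


reference = import_record(source_notebook, "rec-1", alias="spectrum_imported")
reference["tags"].append("reviewed")  # does not touch the source record
data_before = resolve(source_notebook, reference)
source_tags = list(source_notebook["rec-1"]["tags"])

del source_notebook["rec-1"]  # the source record is deleted
data_after = resolve(source_notebook, reference)
print(data_before, source_tags, data_after)
```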