Principle

The overall structure of the Data Mining Lab Freiberg database is derived from the data storage philosophy developed at the Institute of Mechanical Process Engineering and Mineral Processing. The two most important components of the database are samples and experiments. Samples are unique and are described by a set of metadata. Because the Data Mining Lab Freiberg is concerned with battery recycling, the principal sample types are battery, solid, liquid, and suspension. The sample type battery is introduced to identify original end-of-life batteries that have not yet entered the recycling process, as well as newly produced battery cells. The required metadata depends on the sample type so that it can accommodate the characteristic properties of each type.

Principal structure of data in the Data Mining Lab

Samples can be analyzed or processed by methods. Every time a method is used on a sample, an experiment is performed. For example, a sieve analysis produces the particle size distribution of a sample without necessarily altering it, while a crushing step alters the sample and produces a new sample. A derived sample is therefore defined by its parent sample and by the specific experiment that produced it. In this way, any sample in the database can be traced back to its original parent sample, which will (most likely) be of type battery. Note that methods can be analysis devices as well as procedures such as sample splitting.

An experiment produces measurement data and metadata. The metadata of an experiment consists of all relevant parameter values as well as information on certain parts of the method, e.g., the measurement ranges of sensors. The measurement data is the actual data produced during the experiment and may comprise both physical measurements and derived properties or analysis results.
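The relationship between samples, experiments, and methods described above can be sketched in a few lines of Python. This is only an illustration, not the actual DMLF data model: the class names, field names, and example values (`gap_width_mm`, `mesh_sizes_mm`, and so on) are assumptions made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Experiment:
    """One application of a method to a sample (hypothetical structure)."""
    method: str                  # e.g. "crushing" or "sieve analysis"
    sample: Sample               # the sample the method was applied to
    metadata: dict = field(default_factory=dict)          # parameters, sensor ranges
    measurement_data: dict = field(default_factory=dict)  # measured and derived values


@dataclass
class Sample:
    """A unique sample; derived samples point to a parent and an experiment."""
    name: str
    sample_type: str             # "battery", "solid", "liquid", or "suspension"
    parent: Optional[Sample] = None
    created_by: Optional[Experiment] = None


def lineage(sample: Sample) -> list[Sample]:
    """Return the chain from the given sample back to its original parent."""
    chain = [sample]
    while chain[-1].parent is not None:
        chain.append(chain[-1].parent)
    return chain


# A crushing step alters the battery and therefore produces a new sample.
cell = Sample("cell-001", "battery")
crushing = Experiment("crushing", cell, metadata={"gap_width_mm": 10})
fragments = Sample("cell-001-crushed", "solid", parent=cell, created_by=crushing)

# A sieve analysis characterizes the sample without producing a new one.
sieving = Experiment(
    "sieve analysis",
    fragments,
    metadata={"mesh_sizes_mm": [1.0, 2.0, 4.0]},
    measurement_data={"mass_fraction": [0.2, 0.3, 0.5]},
)

origin = lineage(fragments)[-1]
```

Following the `parent` references from `fragments` ends at `cell`, the sample of type battery from which it was derived.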

Database Structure

The database itself consists of the aforementioned tables for samples, experiments, and methods. Additional tables further specify the relationships among these three tables: institute, staff (member), project, and funding body.

Conceptual model of the Data Mining Lab database
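To make the table layout more concrete, the following sketch creates a simplified relational schema with the seven tables named above. It is an illustration under stated assumptions: the column names and foreign keys are guesses, not the actual DMLF schema, and an in-memory SQLite database stands in for the production database so that the example is self-contained.

```python
import sqlite3

# Hypothetical, simplified schema; columns and relations are assumptions.
SCHEMA = """
CREATE TABLE institute    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE staff        (id INTEGER PRIMARY KEY, name TEXT NOT NULL,
                           institute_id INTEGER REFERENCES institute(id));
CREATE TABLE funding_body (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE project      (id INTEGER PRIMARY KEY, name TEXT NOT NULL,
                           funding_body_id INTEGER REFERENCES funding_body(id));
CREATE TABLE method       (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE sample       (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE,
                           sample_type TEXT NOT NULL CHECK (sample_type IN
                               ('battery', 'solid', 'liquid', 'suspension')),
                           parent_id INTEGER REFERENCES sample(id));
CREATE TABLE experiment   (id INTEGER PRIMARY KEY,
                           sample_id  INTEGER NOT NULL REFERENCES sample(id),
                           method_id  INTEGER NOT NULL REFERENCES method(id),
                           staff_id   INTEGER REFERENCES staff(id),
                           project_id INTEGER REFERENCES project(id));
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Minimal example data: one method, one sample, one experiment linking them.
conn.execute("INSERT INTO method (name) VALUES ('sieve analysis')")
conn.execute("INSERT INTO sample (name, sample_type) VALUES ('cell-001', 'battery')")
conn.execute("INSERT INTO experiment (sample_id, method_id) VALUES (1, 1)")

# Resolve the experiment to the names of its sample and method.
rows = conn.execute(
    """
    SELECT sample.name, method.name
    FROM experiment
    JOIN sample ON sample.id = experiment.sample_id
    JOIN method ON method.id = experiment.method_id
    """
).fetchall()
```

The `CHECK` constraint restricts `sample_type` to the four principal sample types, and the self-referencing `parent_id` column is one possible way to record the parent of a derived sample.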

Three user groups exist:

standard user: can access the database, write queries via the provided API, and download datasets
creator user: can create new samples and experiments by uploading measurement data and metadata via the website or the API
admin user: can create new methods, staff, projects, funding bodies, and users of the two previous groups
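The three user groups can be pictured as a small permission table. The sketch below is a plain Python illustration rather than the actual access control of the Data Mining Lab Freiberg: the action names are hypothetical, and it assumes that the groups are cumulative, i.e. that a creator user can also do everything a standard user can, which the description above does not state explicitly.

```python
from enum import Enum


class Role(Enum):
    STANDARD = "standard"
    CREATOR = "creator"
    ADMIN = "admin"


# Hypothetical action names; cumulative permissions are an assumption.
STANDARD_ACTIONS = {"query", "download"}
CREATOR_ACTIONS = STANDARD_ACTIONS | {"create_sample", "create_experiment"}
ADMIN_ACTIONS = CREATOR_ACTIONS | {
    "create_method",
    "create_staff",
    "create_project",
    "create_funding_body",
    "create_user",
}

PERMISSIONS = {
    Role.STANDARD: STANDARD_ACTIONS,
    Role.CREATOR: CREATOR_ACTIONS,
    Role.ADMIN: ADMIN_ACTIONS,
}


def can(role: Role, action: str) -> bool:
    """Return True if the given role is allowed to perform the action."""
    return action in PERMISSIONS[role]
```

For example, `can(Role.CREATOR, "create_sample")` is true under these assumptions, while `can(Role.STANDARD, "create_sample")` is not.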

Users with access to the database benefit from using the API to query the data, because the frontend of the Data Mining Lab Freiberg provides only minor sorting capabilities.

Implementation

The conceptual model of the Data Mining Lab Freiberg shown above is implemented as an SQL (PostgreSQL) database. Because the datasets differ widely between methods, the actual sample and experimental data are stored as zip archives in the form in which the user uploaded them. Uploaded datasets should be consistent for a given method, and their structure can be documented in supplementary files uploaded for that method.

Logical SQL model of the Data Mining Lab database
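Because the data are kept as uploaded zip archives, working with a dataset typically means packing and unpacking such an archive. The following sketch uses only the Python standard library and shows one possible round trip. The file names `metadata.json` and `measurement.csv` and the example values are assumptions for illustration; the real archive layout depends on the method and is not prescribed here.

```python
import io
import json
import zipfile


def build_upload(metadata: dict, measurement_csv: str) -> bytes:
    """Pack metadata and measurement data into an in-memory zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # Hypothetical file names; a real method may define its own layout.
        archive.writestr("metadata.json", json.dumps(metadata))
        archive.writestr("measurement.csv", measurement_csv)
    return buffer.getvalue()


def read_upload(payload: bytes) -> tuple[dict, str]:
    """Unpack an archive created by build_upload."""
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        metadata = json.loads(archive.read("metadata.json"))
        measurement = archive.read("measurement.csv").decode("utf-8")
    return metadata, measurement


payload = build_upload(
    {"method": "sieve analysis", "mesh_sizes_mm": [1.0, 2.0, 4.0]},
    "mesh_size_mm,mass_fraction\n1.0,0.2\n2.0,0.3\n4.0,0.5\n",
)
metadata, measurement = read_upload(payload)
```

Keeping the archive layout identical for every experiment of a given method is what makes such a reader reusable across datasets.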

Because the Data Mining Lab Freiberg is implemented in Django (Python), its users benefit from the object-relational mapper (ORM), with which datasets can be queried without writing any SQL. At present, standard users can access the data via the RESTful DMLF API, which is built with Django REST framework.