DATAmaestro Analytics is a web based application dedicated to advanced analytics and allowing the user to perform all types of analytics:
- Descriptive analytics to describe past behaviour/situation of systems
- Diagnostic analytics to understand why such system behaviours/situations happened in the past
- Predictive analytics to predict outcome, behaviour of the system
- Prescriptive analytics to discover optimum decisions so that the system behaves in an optimal way
Analytics workflow are structured in a logical way :
- Project are used to save all tasks built on the data
- Data is used to link datasource(s) to the project
- Select is used to define subset of data to be analysed
- Transform is used to create new variables (columns) useful for further processing
- Visualize is used to explore and analyse data with visual tools and charts
- Model is used to apply machine learning models on selected data
- Report is used to extract results of the data analytics tasks
DATAmaestro is dedicated to end users with limited or no skills in data science. Limited or no coding is required to use DATAmaestro.
In a DATAmaestro Analytics project, the main features are organized in a project workflow in the top menu. To access your saved project tasks, open the sidebar on the left.
Items in the menu bar help you manage your projects through every stage.
|Project||Create, open, delete or copy a project. You can also set preferences to customize the page appearance.|
Manage the data sources that support your projects. You can upload CSV or Excel files, or use data connections like AllegroGraph or DATAserver.
To upload Excel files you must install a plugin, contact Technical Support.
Create a new Record set or Variable set. For more information, see Data.
|Transform||Use operators to create or edit Function Variables expressions, as well as process control features including CUSUM, Statistical Process Control (SPC), and Principal Component Analysis (PCA).|
|Visualize||Create histograms, scatter plots, box plots, trends and dendrograms based on data you select.|
|Models ||Use learning and test sets to build automatic learning models. Train your data using robust classification, regression and clustering techniques.|
|Reports ||Export subsets of your data to use for reporting purposes outside of DATAmaestro.|
|Help||Find answers to questions in the collection of user references and tutorials.|
|User Menu||Switch between Lake, Analytics or Dashboards, Change password, change Language preferences, and more. |
In DATAmaestro Analytics, use the sidebar to access the elements that have been saved in your project. Elements are saved with their associated feature, which is represented by a grey icon. Hold your mouse pointer over an icon to see the name of the feature and the number of elements that are saved. For example, histograms:
|The current project has nine saved histograms.|
To expand and collapse all tasks using the sidebar, click All at the top of the sidebar. The expanded panel displays all the saved elements for each feature.
Search: Use the search field at the top to find saved elements. A space in the filter field works as an “AND” E.g. “temp qual” searches for anything with ”temp” AND “qual” (in that order)
Sort task: By default, reverse chronological order (latest task at top of list). Click () to switch to sort in alphabetical or reverse alphabetical order.
Pin: To pin the sidebar panel and view all of the current page, click ( ).
Open: To open an element, click the name in the sidebar.
New: To create a new element, click Add new... in the appropriate section.
Manage: To edit ( ) or delete ( ) elements, click the appropriate icon.
Deleting and data dependency
If the element you want to delete is used by another task, the title of the task is displayed in the dialog box for you to confirm. If you delete the element, the depending tasks are also removed. For example, if you confirm and delete a function variable that is used for a histogram, a scatter plot and a decision tree, they will all be removed.
|Data sources||Data source(s) uploaded to support the project. The number of files is indicated on the active icon.|
|Variable sets||Variable sets that have been created for the project. By default, all the individual variables appear in the Variable List.|
|Record sets||Record sets are created to establish specific record groups for a given data source. For example, records that correspond to a production regime, records associated with the normal variation range of a KPI variable.|
|Function variables||Function variables are created to transform variables for models and advanced analysis.|
|Histograms||Histograms are typically used to illustrate the process distribution and are used to make predictions about a stable process.|
Scatter plots are typically used to show how much one variable is affected by another. Each row in the data table is represented by a marker whose position depends on its values in the columns set on the X- and Y-axes.
|Box plots ||Box plots provide effective views to help identify outliers. They show a data set’s lowest value, highest value, median value, and the size of the first and third quartiles. The box plot is useful in analyzing small data sets that do not lend themselves easily to histograms.|
|Curves ||Curves are typically used to compare the temporal dynamic behaviour/trend of two or more numerical variables of a database. |
|Dendrograms||Dendrograms and Correlation matrices provide a means of assessing dependency levels between variables. A dendrogram is a tree-structured graph used in heat maps to visualize the result of a hierarchical clustering calculation.|
|Linear regressions ||A simple linear regression is an approach to modeling the relationship between a scalar dependent variable Y and one explanatory variable denoted X. Several input variables may be combined to predict an output variable Y in the shape of a multi-linear regression.|
|Trees ||Decision trees and Regression trees in are important modeling methods that are used to accomplish a prediction objective (classification or regression).|
|Ensemble trees |
Ensemble trees are an extension of the regular tree methods; Extra Trees, Adaboost & MART.
|Clustering ||K-Means and Subclu models in the project. |
|Other tasks||All other items can be found under the Other menu, including models: Artificial Neural Networks, Partial Least Squares, k-Nearest Neighbors, Partial Dependence Plots, Sensitivity Analysis, Dynamic Inputs, PRIM analysis & Optimizer, Statistical tests: T test, Spearman and Pearson, as well as, Data Exports, Summary Charts, Process Flow and more. |