Difference between revisions of "Principal component analysis"


Revision as of 08:51, 19 April 2016

In general, a Principal Component Analysis (PCA) analyzes a data set to discover a set of coordinates that capture the most representative features of the data. The term PCA classification is often used loosely: PCA itself is not a classification method. The classification is performed on the features extracted through PCA.

In Dynamo, PCA is the process of finding a reduced set of "eigenvolumes" that allow each particle in the data set to be represented approximately as a combination of these eigenvolumes. With this representation, a generic particle is described by the contribution of each eigenvolume to the particle, i.e., by a set of "eigencomponents", normally not many more than 20.
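The idea of representing each particle by a few eigencomponents can be illustrated with a minimal NumPy sketch. This is not Dynamo's implementation: the function names `pca_eigenvolumes` and `reconstruct` are hypothetical, the particles are assumed to be already aligned and stored as one array of shape (n_particles, nx, ny, nz), and a plain SVD of the mean-subtracted data stands in for whatever decomposition the software actually uses.

```python
import numpy as np

def pca_eigenvolumes(particles, n_components):
    """Return (mean volume, eigenvolumes, eigencomponents).

    particles: array of shape (n_particles, nx, ny, nz), assumed aligned.
    """
    n = particles.shape[0]
    vol_shape = particles.shape[1:]
    # Flatten every particle into one row vector of voxels.
    X = particles.reshape(n, -1).astype(float)
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are orthonormal directions in voxel space,
    # ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    basis = Vt[:n_components]
    # Eigencomponents: projection of each particle onto each eigenvolume.
    eigencomponents = Xc @ basis.T
    eigenvolumes = basis.reshape((n_components,) + vol_shape)
    return mean.reshape(vol_shape), eigenvolumes, eigencomponents

def reconstruct(mean, eigenvolumes, eigencomponents):
    """Approximate the particles as mean + sum_k component_k * eigenvolume_k."""
    k = eigenvolumes.shape[0]
    flat = eigencomponents @ eigenvolumes.reshape(k, -1) + mean.ravel()
    return flat.reshape((-1,) + mean.shape)
```

Each particle is then summarized by a row of `eigencomponents`, a short vector of scalars instead of a full volume.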

Once the particles are represented by small sets of scalars, they can be classified with standard methods like k-means.
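As an illustration of this last step, the following is a small, self-contained k-means sketch acting on an array of eigencomponents (one row per particle). The function name `kmeans` and its parameters are hypothetical and not part of Dynamo; the farthest-point initialization is a simplification chosen here to keep the example short.

```python
import numpy as np

def kmeans(features, n_classes, n_iter=100, seed=0):
    """Plain Lloyd-style k-means on an (n_particles, n_features) array."""
    features = np.asarray(features, dtype=float)
    rng = np.random.default_rng(seed)
    # Farthest-point initialization: start from one random particle, then
    # repeatedly add the particle farthest from the centers chosen so far.
    centers = [features[rng.integers(len(features))]]
    for _ in range(1, n_classes):
        dist = np.min(
            [np.linalg.norm(features - c, axis=1) for c in centers], axis=0
        )
        centers.append(features[np.argmax(dist)])
    centers = np.array(centers)

    labels = None
    for _ in range(n_iter):
        # Assign every particle to its nearest center.
        dist = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dist.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        # Move each center to the mean of its members (skip empty classes).
        for c in range(n_classes):
            members = features[labels == c]
            if len(members) > 0:
                centers[c] = members.mean(axis=0)
    return labels, centers
```

A call such as `labels, centers = kmeans(eigencomponents, 4)` would assign each particle to one of four classes based only on its eigencomponents.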

Operative steps

Operatively, completing a PCA-based classification requires four steps:

Selecting the input
a data folder, a table and a mask.
Computing a cross-correlation matrix
this is typically the most time-consuming part, as it involves comparing every particle in the data folder against every other particle.
Computing the eigenvalues, eigenvolumes and eigencomponents
Using the eigencomponents to create a classification
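The middle steps of this pipeline can be sketched in NumPy as follows. This is a simplified illustration under stated assumptions, not the code Dynamo runs: the function name `ccmatrix_pca` is hypothetical, the particles are assumed to be aligned and held in memory as a single array, the mask is a boolean volume, and the cross-correlation is evaluated at zero shift only, ignoring any missing-wedge handling a real tomography pipeline may need.

```python
import numpy as np

def ccmatrix_pca(particles, mask, n_components):
    """Cross-correlation matrix, then eigenvalues, eigenvolumes, eigencomponents.

    particles: array (n_particles, nx, ny, nz); mask: boolean array (nx, ny, nz).
    Assumes no particle is constant inside the mask and that the leading
    n_components eigenvalues are strictly positive.
    """
    n = particles.shape[0]
    # Keep only the voxels inside the mask, one row per particle.
    X = particles.reshape(n, -1)[:, mask.ravel()].astype(float)
    # Normalize each particle to zero mean and unit norm inside the mask.
    X = X - X.mean(axis=1, keepdims=True)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)

    # All-against-all comparison: this n x n product is the costly step.
    cc = X @ X.T

    # Eigendecomposition of the symmetric matrix; eigh returns ascending order.
    eigenvalues, vectors = np.linalg.eigh(cc)
    order = np.argsort(eigenvalues)[::-1][:n_components]
    eigenvalues = eigenvalues[order]
    vectors = vectors[:, order]

    # Eigenvolumes (masked voxels only) as unit-norm combinations of particles.
    eigenvolumes = (vectors.T @ X) / np.sqrt(eigenvalues)[:, None]
    # Eigencomponents: projection of every particle onto every eigenvolume.
    eigencomponents = X @ eigenvolumes.T
    return cc, eigenvalues, eigenvolumes, eigencomponents
```

The resulting `eigencomponents` array has one row per particle and can be passed to any standard clustering method to create the classification.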

GUIs for PCA classification

PCA

There are two GUIs available to cover the pipeline: dynamo_ccmatrix_project_manager

Tutorials

There are some PDF tutorials available inside the Dynamo distribution:

  • General introduction to PCA-based classification.
  • Command line classification.