hanaml.MDS.Rd
hanaml.MDS is an R wrapper for the SAP HANA PAL multidimensional scaling (MDS) algorithm.
hanaml.MDS(
  data,
  matrix.type,
  key,
  features = NULL,
  thread.ratio = NULL,
  dim = NULL,
  metric = NULL,
  minkowski.power = NULL
)
DataFrame
DataFrame containing the data.
character
"observation.feature"
: Observation-feature matrix.
"dissimilarity"
: Dissimilarity matrix.
character
Name of the ID column.
character or list of characters, optional
Specifies the attribute columns to apply scaling to.
Defaults to all non-ID columns.
double, optional
Controls the proportion of available threads that this function can use.
The value ranges from 0 to 1, where 0 indicates a single thread
and 1 indicates all available threads.
Values between 0 and 1 use up to that proportion of available threads.
Values outside this range are ignored.
Defaults to 0.
integer, optional
The number of dimensions to which the input dataset is reduced.
Defaults to 2.
character, optional
"manhattan"
: Manhattan distance.
"euclidean"
: Euclidean distance.
"minkowski"
: Minkowski distance.
Only valid when matrix.type = "observation.feature".
Defaults to "euclidean".
double, optional
Specifies the power used when computing the Minkowski distance.
Only valid if matrix.type = "observation.feature" and metric = "minkowski".
Defaults to 3.
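The three metric options can be illustrated locally with base R. The sketch below is only an illustration of the textbook distance definitions, assuming PAL follows the same definitions and that minkowski.power plays the role of the p argument of dist(); the toy matrix x is made up for the example and nothing here runs on SAP HANA.

```r
# Two toy observations with two features each (illustrative values only).
x <- rbind(a = c(0, 0), b = c(3, 4))

# Base R dist() supports the three metrics named above.
d_manhattan <- dist(x, method = "manhattan")         # |3| + |4|
d_euclidean <- dist(x, method = "euclidean")         # sqrt(3^2 + 4^2)
d_minkowski <- dist(x, method = "minkowski", p = 3)  # (3^3 + 4^3)^(1/3)

# as.matrix() expands the result into a full symmetric matrix with a zero
# diagonal, the same shape as a "dissimilarity" input.
print(as.matrix(d_minkowski))
```

A matrix built this way has the layout expected when matrix.type = "dissimilarity", whereas the raw observations x correspond to matrix.type = "observation.feature".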
Returns a list of DataFrames:
DataFrame 1
Scaling results (the embedded coordinates), structured as follows:
DATA_ID: name as shown in input DataFrame.
DIMENSION: index of the output dimension.
VALUE: coordinate of the entity along that dimension.
DataFrame 2
Statistic results, structured as follows:
STAT_NAME: statistic name.
STAT_VALUE: statistic value.
This function serves as a tool for dimensionality reduction or data visualization. It embeds samples from an N-dimensional space into a lower K-dimensional space by applying a non-linear transformation, classical multidimensional scaling. This transformation preserves the distances between entities as closely as possible after the reduction to the lower dimension.
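For intuition, the textbook form of classical multidimensional scaling can be sketched in a few lines of base R. This is a minimal local sketch under stated assumptions, not PAL's implementation: it double-centers the squared dissimilarities, takes the leading eigenvectors, and scales them by the square roots of their eigenvalues. The matrix d is the 4 x 4 dissimilarity matrix from the example on this page; the variable names are illustrative.

```r
# Dissimilarity matrix taken from the example input on this page.
d <- matrix(c(0.0000000, 0.9047814, 0.9085961, 0.9103063,
              0.9047814, 0.0000000, 0.2514457, 0.5975016,
              0.9085961, 0.2514457, 0.0000000, 0.4403572,
              0.9103063, 0.5975016, 0.4403572, 0.0000000),
            nrow = 4, byrow = TRUE)

n <- nrow(d)
k <- 2                                   # target dimension (the dim argument)

J <- diag(n) - matrix(1 / n, n, n)       # centering matrix
B <- -0.5 * J %*% (d^2) %*% J            # double-centered squared dissimilarities

e <- eigen(B, symmetric = TRUE)          # eigenvalues in decreasing order
vals <- pmax(e$values[1:k], 0)           # guard against small negative eigenvalues
coords <- e$vectors[, 1:k, drop = FALSE] %*% diag(sqrt(vals), nrow = k)

# Base R ships the same technique as stats::cmdscale(); the two embeddings
# should agree up to the sign of each axis.
ref <- cmdscale(d, k = k)

print(coords)
```

Because eigenvectors are only defined up to sign, coordinates from different implementations may be mirrored along an axis while the distances between points stay the same.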
Input DataFrame data:
> data$Collect()
  ID        X1        X2        X3        X4
1  1 0.0000000 0.9047814 0.9085961 0.9103063
2  2 0.9047814 0.0000000 0.2514457 0.5975016
3  3 0.9085961 0.2514457 0.0000000 0.4403572
4  4 0.9103063 0.5975016 0.4403572 0.0000000
Call the function:
> mds <- hanaml.MDS(data,
                    key = "ID",
                    matrix.type = "dissimilarity",
                    thread.ratio = 0.5)
Output:
> mds$labels$Collect()
  ID DIMENSION       VALUE
1  1         1  0.65191741
2  1         2 -0.01585861
3  2         1 -0.21773716
4  2         2 -0.25319456
5  3         1 -0.24990695
6  3         2 -0.07294968
7  4         1 -0.18427330
8  4         2  0.34200285
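The output above is in long format, with one row per entity and dimension. For plotting or further analysis it is often easier to work with one row per entity. The sketch below assumes the result has already been collected into a local data.frame; here the values are typed in by hand from the output above so that the snippet runs without a HANA connection, and the names labels, wide and embedded are illustrative.

```r
# Long-format result copied from the example output above.
labels <- data.frame(
  ID        = rep(1:4, each = 2),
  DIMENSION = rep(1:2, times = 4),
  VALUE     = c( 0.65191741, -0.01585861,
                -0.21773716, -0.25319456,
                -0.24990695, -0.07294968,
                -0.18427330,  0.34200285)
)

# Pivot to one row per ID; reshape() names the new columns VALUE.1 and VALUE.2.
wide <- reshape(labels, idvar = "ID", timevar = "DIMENSION", direction = "wide")
print(wide)

# Euclidean distances between the embedded points, which can be compared
# with the input dissimilarity matrix to judge how well distances survive.
embedded <- dist(as.matrix(wide[, c("VALUE.1", "VALUE.2")]))
print(round(as.matrix(embedded), 4))
```

The two VALUE columns can be passed directly to plot() to visualize the 2-dimensional embedding.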