hanaml.DataFrame {hana.ml.r}    R Documentation

hanaml DataFrame

Description

This module represents a database query as a DataFrame. Most operations are designed to never bring data back from the database unless explicitly asked for.
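The deferred behaviour described above can be sketched as follows. This is a minimal illustration, not the package's own example: `preview_rows` is a hypothetical helper, `dataframe` is assumed to be an existing hanaml DataFrame, and it is assumed that `Head` only narrows the underlying query while `Collect` is the step that actually transfers rows to R.

```r
# Hypothetical helper: fetch a small preview of a database-backed DataFrame.
# Only Head() and Collect(), both documented on this page, are called.
preview_rows <- function(dataframe, n = 5) {
  # Head() returns a new DataFrame limited to the first n rows.
  limited <- dataframe$Head(n = n)
  # Collect() copies the (already limited) result into an R data frame.
  limited$Collect()
}
```

Calling `preview_rows(dataframe, n = 10)` would then bring back at most ten rows rather than the full table, under the assumptions stated above.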

Usage

hanaml.DataFrame(conn.context = NULL,
                 select.statement = NULL,
                 name = NULL)

Arguments

conn.context

ConnectionContext, optional
Contains a handle to a database connection.

select.statement

character, optional
The SQL query for the DataFrame.

name

character, optional
Name of the DataFrame.

Format

An object of class R6ClassGenerator of length 24.

Value

An R6Class object with methods for a DataFrame that is backed by a database SQL statement.

Methods

AddId(id)

Adds an ID column based on ROW_NUMBER() as the first column.

Usage: dataframe$AddId(id = "NEW_ID")
Arguments:

Returns: DataFrame with an ID column based on ROW_NUMBER() built-in.

Alias(aliasName)

Returns a new DataFrame with an alias set.

Usage: NewDf <- dataframe$Alias('TABLE1')
Arguments:

Returns: DataFrame with an alias set.

cast(cols, new.type)

Converts columns from one datatype to another specified datatype.

Usage: dataframe$cast("ID", new.type = "DOUBLE")
Arguments:

Returns: DataFrame with new datatype.

Collect()

Copies this DataFrame to an R DataFrame.

Usage: dataframe$Collect()
Returns: R DataFrame containing this DataFrame's data.

Count()

Computes the number of rows in a DataFrame.

Usage: dataframe$Count()
Returns: integer, number of rows in the DataFrame.
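Because `Collect()` copies the whole result into R, it can be useful to check `Count()` first. The sketch below is one possible guard; `collect_if_small` and its `max.rows` parameter are hypothetical names introduced for illustration, and `dataframe` is assumed to be an existing hanaml DataFrame.

```r
# Hypothetical guard: only collect when the row count is below a limit.
collect_if_small <- function(dataframe, max.rows = 10000) {
  # Count() is documented to return the number of rows.
  n.rows <- dataframe$Count()
  if (n.rows > max.rows) {
    stop("Refusing to collect ", n.rows, " rows; limit is ", max.rows)
  }
  # Safe to transfer: copy the data into an R data frame.
  dataframe$Collect()
}
```

Note that this issues two requests (one for the count, one for the data), which is a deliberate trade-off against accidentally pulling a very large table.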

Describe(cols=NULL)

Generates descriptive statistics that summarize the central tendency,
dispersion and shape of a dataset’s distribution.

Usage: dataframe$Describe()
Arguments:

Returns: DataFrame with descriptive statistics.

distinct(cols=NULL)

Return distinct values.

Usage: dataframe$distinct()
Arguments:

Returns: DataFrame with distinct values.

Drop(cols)

Returns a new DataFrame after removing specified columns.

Usage: dataframe$Drop('colList')
Arguments:

Returns: DataFrame, new DataFrame retaining only columns not in cols.

DropDuplicates(subset.dataframe=NULL)

Returns a DataFrame with duplicate rows removed.

Usage: dataframe$DropDuplicates('subsetList')
Arguments:

Returns: DataFrame with only one copy of duplicate rows.

DropNa(how = NULL, thresh = NULL, subset = NULL)

Returns a new DataFrame with NULLs removed.

Usage: dataframe$DropNa(how = 'any',thresh = 1,subset = 'subsetone')
Arguments:

Returns: DataFrame with a select statement that removes NULLs.

dtypes(subset.col = NULL)

Return column names and their data types as a list.

Usage: dataframe$dtypes()
Arguments:

Returns: list of column names and their data types.

FillNa(value, subset.dataframe = NULL)

Returns a DataFrame with NULLs replaced with the fill value. Only supports filling numeric columns.

Usage: dataframe$FillNa(0, 'col1')
Arguments:

Returns: DataFrame, new DataFrame with NULLs filled.

Filter(condition)

Selects rows matching the given condition. The condition string is not sanity-checked in any way. Do not take condition strings from untrusted input, as this can easily be used for SQL injection.

Usage: dataframe$Filter("col1 = 'A'")
Arguments:

Returns: DataFrame with only rows matching the given condition.
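Since the condition string is passed through unchecked, one way to reduce the injection risk is to build it from an allow-listed column name and an escaped literal instead of pasting raw input. The helper below is a sketch in base R only; `build_equals_condition` and `allowed.columns` are hypothetical names, and doubling single quotes is the standard SQL way of escaping a string literal, not something this package is documented to do for you.

```r
# Hypothetical helper: build a "column = 'value'" condition defensively.
build_equals_condition <- function(column, value, allowed.columns) {
  # Reject any column name that is not explicitly allow-listed.
  if (!(column %in% allowed.columns)) {
    stop("Column not allowed: ", column)
  }
  # Accept exactly one non-missing character value.
  stopifnot(is.character(value), length(value) == 1L, !is.na(value))
  # Escape embedded single quotes by doubling them.
  escaped <- gsub("'", "''", value, fixed = TRUE)
  sprintf("%s = '%s'", column, escaped)
}
```

The resulting string could then be passed as `dataframe$Filter(build_equals_condition("col1", user.value, c("col1", "col2")))`, where `user.value` stands for whatever input is being filtered on. This narrows the attack surface but is not a substitute for treating untrusted input with care.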

GenerateColname(prefix = 'GEN_COL')

Generates a new column name for the DataFrame.

Usage: dataframe$GenerateColname('COL1')
Arguments:

Returns: character, newly generated column name.

GetDf(select.statement, name = NULL)

Creates a new DataFrame.

Usage: dataframe$GetDf('SELECT * FROM TEMP;', name = 'DF1')
Arguments:

Returns: DataFrame

GetDfCounter()

Returns the number of DataFrames.

Usage: dataframe$GetDfCounter()

Returns: integer.

GetNRows()

Sets the value of DataFrame's nrows.df.

Usage: dataframe$GetNRows()
Returns: No return value.

Has(col)

Returns TRUE if a column is in the DataFrame.

Usage: dataframe$Has('col1')
Arguments:

Returns: logical, TRUE if the column exists in the DataFrame's projection list.
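`Has` can be used to validate inputs before building further queries. The following sketch reports which of a set of expected columns are absent; `missing_columns` is a hypothetical helper, and `dataframe` is assumed to be an existing hanaml DataFrame.

```r
# Hypothetical helper: return the names in `cols` that the DataFrame lacks.
missing_columns <- function(dataframe, cols) {
  # Ask the DataFrame about each column; treat anything but TRUE as absent.
  present <- vapply(cols, function(col) isTRUE(dataframe$Has(col)), logical(1))
  cols[!present]
}
```

An empty result means every expected column is in the projection list, so a check such as `length(missing_columns(dataframe, c("ID", "LABEL"))) == 0` could gate later steps.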

Head(n = 1)

Returns a new DataFrame containing the first n rows of the DataFrame.

Usage: dataframe$Head(n = 5)
Arguments:

Returns: DataFrame, new DataFrame of the first n rows of this DataFrame.

Join(other, on.expression, how = 'inner')

Returns a new DataFrame that is a join of this DataFrame with another DataFrame.

Usage: dataframe$Join(other = DF1, on.expression = 'col', how = 'outer')
Arguments:

Returns: DataFrame, new DataFrame object that joins the current DataFrame with another DataFrame.

rename.columns(new.col.names)

Updates the column names.

Usage: dataframe$rename.columns(list("A", "C"))
Arguments:

Returns: DataFrame with renamed columns.

RunQuery(Query)

Runs the given SQL query.

Usage: b <- dataframe$RunQuery('select "target" from IRIS')
Arguments:

Returns: DataFrame, new DataFrame generated by the SQL query.

save(table, table.type = NULL, force = TRUE, schema = NULL)

Creates a table holding this DataFrame's data.

Usage: dataframe$save('TAB1', 'ROW')
Arguments:

Returns: DataFrame representing the new table.

Select(cols)

Returns a new DataFrame with columns derived from the current DataFrame.

Usage: dataframe$Select('col1') OR
col.list <- list('*','select')
cols <- sets::as.tuple(x = col.list)
dataframe$Select(cols)
Arguments:

Returns: DataFrame, new DataFrame object with the specified columns projected.

Sort(cols, desc = FALSE)

Returns a new DataFrame sorted by the specified columns.

Usage: dataframe$Sort('COL1')
Arguments:

Returns: DataFrame, new DataFrame object with rows in sorted order.
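Combining `Sort` with `Head` gives a simple "top n" query. The sketch below assumes `dataframe` is an existing hanaml DataFrame and that the chained calls compose as the method descriptions on this page suggest; `top_rows` is a hypothetical helper name.

```r
# Hypothetical helper: return the n rows with the largest values in `col`.
top_rows <- function(dataframe, col, n = 5) {
  # Sort descending, keep the first n rows, then copy the result into R.
  dataframe$Sort(col, desc = TRUE)$Head(n = n)$Collect()
}
```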

WithColumnRenamed(original, newName)

Returns a DataFrame with a new name for one column.

Usage: dataframe$WithColumnRenamed('col1','colnew')
Arguments:

Returns: DataFrame, the same data as this DataFrame, with one changed column name.


[Package hana.ml.r version 1.0.8 Index]