This module represents a database query as a DataFrame. Most operations are designed to never bring data back from the database unless explicitly asked for.

hanaml.DataFrame(
  connection.context = NULL,
  select.statement = NULL,
  name = NULL
)

Arguments

connection.context

ConnectionContext
SAP HANA Database connection object.

select.statement

character, optional
The SQL query for the DataFrame.

name

character, optional
Name of the DataFrame.

Value

An R6Class object with methods for a DataFrame that is backed by a database SQL statement.
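
The deferred behaviour described above can be sketched as follows. This is a minimal sketch, assuming a ConnectionContext has already been created elsewhere; SummarizeTable and the table name passed to it are hypothetical and serve only as illustration.

```r
# Sketch only: `conn` is assumed to be an existing ConnectionContext and
# `table.name` a table that exists in the database (both hypothetical).
SummarizeTable <- function(conn, table.name) {
  # Per the description above, constructing the DataFrame is not expected
  # to fetch rows; it only carries the SELECT statement.
  df <- hanaml.DataFrame(
    connection.context = conn,
    select.statement = sprintf('SELECT * FROM "%s"', table.name)
  )
  # Count() and Collect() are explicit requests for results.
  list(n.rows = df$Count(), data = df$Collect())
}
```

Calling SummarizeTable(conn, "IRIS") would then return the row count together with an ordinary R data frame.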

Methods

AddId(id)

Adds an ID column based on ROW_NUMBER() as the first column.
Usage: dataframe$AddId(id = "NEW_ID")
Arguments:

  • id: character, name of the added ID column.

Returns: DataFrame with an ID column based on ROW_NUMBER() built-in.

Alias(aliasName)

Returns a new DataFrame with an alias set.
Usage: NewDf <- dataframe$Alias("TABLE1")
Arguments:

  • aliasName: character, alias name of the DataFrame.

Returns: DataFrame with an alias set.

cast(cols, new.type)

Converts columns from one datatype to another specified datatype.
Usage: dataframe$cast("ID", new.type = "DOUBLE")
Arguments:

  • cols: list of characters, the columns to be converted.

  • new.type: character, the datatype to convert expression to.

Returns: DataFrame with new datatype.

Collect()

Copies this DataFrame to an R DataFrame.
Usage: dataframe$Collect()
Returns: R DataFrame containing this DataFrame's data.

Count()

Computes the number of rows in a DataFrame.
Usage: dataframe$Count()
Returns: integer, number of rows in the DataFrame.

Describe(cols=NULL)

Generates descriptive statistics that summarize the central tendency,
dispersion and shape of a dataset’s distribution.
Usage: dataframe$Describe()
Arguments:

  • cols: list of characters, optional, the columns to be summarized. Defaults to summarizing all columns.

Returns: DataFrame with descriptive statistics.

distinct(cols=NULL)

Return distinct values.
Usage: dataframe$distinct()
Arguments:

  • cols: list of characters, optional, names of the columns for which to return distinct values.

Returns: DataFrame with distinct values.

Drop(cols)

Returns a new DataFrame after removing specified columns.
Usage: dataframe$Drop("colList")
Arguments:

  • cols: list of characters, list of column names to drop.

Returns: DataFrame, new DataFrame retaining only columns not in cols.

DropDuplicates(subset.dataframe=NULL)

Returns DataFrame with duplicate rows removed.
Usage: dataframe$DropDuplicates("subsetList")
Arguments:

  • subset.dataframe: list of characters, optional,
    List of columns to consider when deciding whether rows are duplicates of each other. Defaults to all columns.

Returns: DataFrame with only one copy of duplicate rows.

DropNa(how = NULL, thresh = NULL, subset = NULL)

Returns a new DataFrame with NULLs removed.
Usage: dataframe$DropNa(how = "any", thresh = 1,subset = "subsetone")
Arguments:

  • how: ("any", "all"), optional, if provided, "any" eliminates rows with any NULLs, and "all" eliminates rows that are entirely NULLs. If neither how nor thresh is provided, how defaults to "any".

  • thresh: integer, optional, if provided, keeps rows with at least thresh non-NULL values and drops rows with fewer. how and thresh cannot both be provided.

  • subset: list of characters, optional, columns to consider when looking for NULLs. Values in other columns will be ignored, NULL or not. Defaults to all columns in the DataFrame.

Returns: DataFrame with a select statement that removes NULLs.
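
The how and thresh rules can be illustrated locally on an ordinary R data.frame, with NA standing in for NULL. This is a sketch of the documented semantics only; DropNaLocal is a hypothetical helper and does not show how the library builds its select statement.

```r
# Local illustration of the documented DropNa rules on a plain R data.frame.
# DropNaLocal is a hypothetical helper, not part of the library.
DropNaLocal <- function(data, how = NULL, thresh = NULL, subset = NULL) {
  if (!is.null(how) && !is.null(thresh)) {
    stop("how and thresh cannot both be provided")
  }
  # Only the chosen columns are inspected; the default is all columns.
  cols <- if (is.null(subset)) names(data) else unlist(subset)
  # Number of non-missing values per row within the chosen columns.
  non.null <- rowSums(!is.na(data[, cols, drop = FALSE]))
  if (!is.null(thresh)) {
    keep <- non.null >= thresh           # keep rows with enough non-NULL values
  } else if (identical(how, "all")) {
    keep <- non.null > 0                 # drop rows that are entirely NULL
  } else {
    keep <- non.null == length(cols)     # "any" (default): drop rows with any NULL
  }
  data[keep, , drop = FALSE]
}

rows <- data.frame(A = c(1, NA, NA), B = c(2, 5, NA))
DropNaLocal(rows)               # keeps the first row only
DropNaLocal(rows, how = "all")  # keeps the first two rows
DropNaLocal(rows, thresh = 1)   # keeps the first two rows
```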

dtypes(subset.col = NULL)

Return column names and their data types as a list.
Usage: dataframe$dtypes()
Arguments:

  • subset.col: list of characters, selected columns to show datatype.

Returns: list of column names and their data types.

FillNa(value, subset.dataframe = NULL)

Returns a DataFrame with NULLs replaced with the fill value. Only supports filling numeric columns.
Usage: dataframe$FillNa(0, "col1")
Arguments:

  • value: integer or double, value to replace NULLs with. value should have type appropriate for the selected columns.

  • subset.dataframe: list of characters, optional, list of columns in which to replace NULLs. Defaults to all columns.

Returns: DataFrame, new DataFrame with NULLs filled.

Filter(condition)

Selects rows matching the given condition. The condition string is not sanity-checked in any way. Do not take condition strings from untrusted input, as this can easily be used for SQL injection.
Usage: dataframe$Filter("col1 = 'A'")
Arguments:

  • condition: character, condition to filter on. This should be in the format of a SQL WHERE clause test (not including the word "WHERE").

Returns: DataFrame with only rows matching the given condition.
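
Because the condition string is passed through unchecked, values that originate outside the program should at least be quoted before being pasted into it. The helpers below are a hypothetical sketch using standard SQL quote doubling; they are not part of the library and are not a complete defence against injection.

```r
# Hypothetical helpers for building a Filter() condition from R values.
# They only double embedded quotes, following standard SQL escaping.
QuoteIdentifier <- function(name) {
  paste0('"', gsub('"', '""', name, fixed = TRUE), '"')
}
QuoteLiteral <- function(value) {
  paste0("'", gsub("'", "''", value, fixed = TRUE), "'")
}

# The condition is a WHERE clause test without the word "WHERE".
condition <- paste(QuoteIdentifier("col1"), "=", QuoteLiteral("O'Brien"))
# dataframe$Filter(condition)  # `dataframe` is assumed to exist already
```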

GenerateColname(prefix = "GEN_COL")

Generates a new column name for the DataFrame.
Usage: dataframe$GenerateColname("COL1")
Arguments:

  • prefix: character, optional, name of the column. If no name is provided, it creates a default column named "GEN_COL".

Returns: character, newly generated column name.

GetDf(select.statement, name = NULL)

Creates a new DataFrame.
Usage: dataframe$GetDf("SELECT * FROM TEMP;", name = "DF1")
Arguments:

  • select.statement: character, SQL query for the DataFrame.

  • name: character, optional, name of the DataFrame.

Returns: DataFrame

GetDfCounter()

Returns the number of DataFrames.
Usage: dataframe$GetDfCounter()
Returns: integer.

GetNRows()

Sets the value of DataFrame's nrows.df.
Usage: dataframe$GetNRows()
Returns: No return value.

Has(col)

Returns TRUE if a column is in the DataFrame.
Usage: dataframe$Has("col1")
Arguments:

  • col: character, name of the column to search for in the projection list of this DataFrame.

Returns: logical, TRUE if the column exists in the DataFrame's projection list.

Join(other, on.expression, how = "inner")

Returns a new DataFrame that is a join of this DataFrame with another DataFrame.
Usage: dataframe$Join(other = DF1, on.expression = "col", how = "outer")
Arguments:

  • other: DataFrame, the DataFrame to join with.

  • on.expression: character, join expression.

  • how: ("inner", "left", "right", "outer"), optional, type of join. Defaults to "inner".

Returns: DataFrame, new DataFrame object that joins the current DataFrame with another DataFrame.

rename.columns(new.col.names)

Updates the column names.
Usage: dataframe$rename.columns(list("A", "C"))
Arguments:

  • new.col.names: list of characters, list of the new column names.

Returns: DataFrame with renamed columns.

RunQuery(Query)

Performs the query.
Usage: b <- dataframe$RunQuery('select "target" from IRIS')
Arguments:

  • Query: character, SQL statement.

Returns: DataFrame, new DataFrame generated by the SQL query.

save(table, table.type = NULL, force = TRUE, schema = NULL)

Creates a table holding this DataFrame's data.
Usage: dataframe$save("TAB1", "ROW")
Arguments:

  • table: character, table name. save() will fail if a conflicting table already exists.

  • table.type: character, optional, what kind of table to create. Case-insensitive. Can be one of "ROW", "COLUMN", "HISTORY COLUMN", "GLOBAL TEMPORARY", "GLOBAL TEMPORARY COLUMN", "LOCAL TEMPORARY", or "LOCAL TEMPORARY COLUMN". Defaults to "LOCAL TEMPORARY COLUMN" if `table` starts with "#" and "COLUMN" otherwise.

  • force: logical, optional, if TRUE, the existing table will be replaced. Defaults to TRUE.

  • schema: character, optional, schema name. save() will fail if a conflicting table already exists.

Returns: DataFrame representing the new table.
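
The table.type rules above can be sketched in plain R. This is an illustration of the documented behaviour only, assuming the name tested for a leading "#" is the table name; ResolveTableType and kTableTypes are hypothetical names and not part of the library.

```r
# Hypothetical sketch of the documented table.type handling.
kTableTypes <- c("ROW", "COLUMN", "HISTORY COLUMN",
                 "GLOBAL TEMPORARY", "GLOBAL TEMPORARY COLUMN",
                 "LOCAL TEMPORARY", "LOCAL TEMPORARY COLUMN")

ResolveTableType <- function(table, table.type = NULL) {
  if (is.null(table.type)) {
    # Assumption: the default depends on whether the table name starts with "#".
    return(if (startsWith(table, "#")) "LOCAL TEMPORARY COLUMN" else "COLUMN")
  }
  # table.type is documented as case-insensitive.
  resolved <- toupper(table.type)
  if (!resolved %in% kTableTypes) {
    stop("unsupported table.type: ", table.type)
  }
  resolved
}

ResolveTableType("#TMP")         # default for a name starting with "#"
ResolveTableType("TAB1")         # default otherwise
ResolveTableType("TAB1", "row")  # explicit, case-insensitive
```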

Select(cols)

Returns a new DataFrame with columns derived from the current DataFrame.
Usage: dataframe$Select("col1") OR
col.list <- list("*", "select")
cols <- sets::as.tuple(x = col.list)
dataframe$Select(cols)
Arguments:

  • cols: character or (character, character) tuple, columns of the new DataFrame. A string is treated as the name of a column to select; a (character, character) tuple is treated as (SQL expression, alias). As a special case, "*" is expanded to all columns of the original DataFrame.

Returns: DataFrame, new DataFrame object with the specified columns projected.

Sort(cols, desc = FALSE)

Returns a new DataFrame sorted by the specified columns.
Usage: dataframe$Sort(list("COL1"))
Arguments:

  • cols: list of characters, list of columns to sort by. Must be a list, even for sorting by one column.

  • desc: logical, optional, TRUE to sort in descending order, FALSE for ascending order. Defaults to FALSE.

Returns: DataFrame, new DataFrame object with rows in sorted order.

WithColumnRenamed(original, newName)

Returns a DataFrame with a new name for one column.
Usage: dataframe$WithColumnRenamed("col1", "colnew")
Arguments:

  • original: character, original column name.

  • newName: character, new column name.

Returns: DataFrame, the same data as this DataFrame, with one changed column name.