hanaml.DataFrame
This module represents a database query as a DataFrame. Most operations are designed never to bring data back from the database unless explicitly requested.
hanaml.DataFrame( connection.context = NULL, select.statement = NULL, name = NULL )
| Argument | Description |
|---|---|
| connection.context | |
| select.statement | |
| name | |
An R6Class object with methods for a DataFrame that is
backed by a database SQL statement.
AddId(id)
Adds an ID column based on ROW_NUMBER() as the first column.
Usage: dataframe$AddId(id = "NEW_ID")
Arguments:
id: character, name of the added ID column.
Returns: DataFrame with an ID column based on the ROW_NUMBER() built-in.
Alias(aliasName)
Returns a new DataFrame with an alias set.
Usage: NewDf <- dataframe$Alias("TABLE1")
Arguments:
aliasName: character, alias name of the DataFrame.
Returns: DataFrame with an alias set.
cast(cols, new.type)
Converts columns from one datatype to another specified datatype.
Usage: dataframe$cast("ID", new.type = "DOUBLE")
Arguments:
cols: list of characters, the columns to be converted.
new.type: character, the datatype to convert the columns to.
Returns: DataFrame with the new datatype.
Collect()
Copies this DataFrame to an R DataFrame.
Usage: dataframe$Collect()
Returns: R DataFrame containing this DataFrame's data.
Count()
Computes the number of rows in a DataFrame.
Usage: dataframe$Count()
Returns: integer, number of rows in the DataFrame.
Describe(cols=NULL)
Generates descriptive statistics that summarize the central tendency,
dispersion and shape of a dataset’s distribution.
Usage: dataframe$Describe()
Arguments:
cols: list of characters, optional, the columns to be summarized.
Defaults to summarizing all columns.
Returns: DataFrame with descriptive statistics.
distinct(cols=NULL)
Returns distinct values.
Usage: dataframe$distinct()
Arguments:
cols: list of characters, optional, names of the columns for which distinct values are returned.
Returns: DataFrame with distinct values.
Drop(cols)
Returns a new DataFrame after removing specified columns.
Usage: dataframe$Drop("colList")
Arguments:
cols: list of characters, list of column names to drop.
Returns: DataFrame,
new DataFrame retaining only columns not in cols.
DropDuplicates(subset.dataframe=NULL)
Returns DataFrame with duplicate rows removed.
Usage: dataframe$DropDuplicates("subsetList")
Arguments:
subset.dataframe: list of characters, optional,
List of columns to consider when deciding whether rows are duplicates
of each other. Defaults to all columns.
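To make the effect of `subset.dataframe` concrete, here is a minimal base R sketch on a local data.frame. It is only a local analogue: the documented method builds a SQL statement and runs in the database, and this page does not say which of several duplicate rows is kept there. The helper name `drop_duplicates_local` is hypothetical and not part of the library.

```r
# Local analogue of DropDuplicates(); hypothetical helper, not part of the library.
drop_duplicates_local <- function(df, subset = NULL) {
  # Default: consider every column when comparing rows.
  if (is.null(subset)) subset <- names(df)
  # duplicated() marks each repeat of an earlier row; keep everything else.
  df[!duplicated(df[, subset, drop = FALSE]), , drop = FALSE]
}

df <- data.frame(A = c(1, 1, 2), B = c("x", "y", "y"))
all.cols <- drop_duplicates_local(df)                # compares columns A and B
only.a   <- drop_duplicates_local(df, subset = "A")  # compares column A only
```

With all columns considered, no two rows in this sample match, so nothing is removed; restricting the comparison to `A` treats the first two rows as duplicates of each other.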
Returns: DataFrame with only one copy of duplicate rows.
DropNa(how = NULL, thresh = NULL, subset = NULL)
Returns a new DataFrame with NULLs removed.
Usage:
dataframe$DropNa(how = "any", subset = "subsetone")
Arguments:
how : ("any", "all"), optional, if provided,
"any" eliminates rows with any NULLs, and "all" eliminates rows
that are entirely NULLs. If neither how nor thresh is provided,
how defaults to "any".
thresh: integer, optional, if provided, keep
rows with at least thresh non-NULL values and drop rows with fewer.
how and thresh cannot both be provided.
subset: list of characters, optional, columns to
consider when looking for NULLs. Values in other columns will be
ignored, NULL or not. Defaults to all columns in the DataFrame.
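The interaction of `how`, `thresh` and `subset` can be sketched with base R on a local data.frame, with R's `NA` standing in for SQL NULL. This is a local analogue under those assumptions, not the library's implementation (which produces a select statement); the helper name `drop_na_local` is hypothetical.

```r
# Local analogue of DropNa(); hypothetical helper, not part of the library.
# R's NA stands in for SQL NULL.
drop_na_local <- function(df, how = NULL, thresh = NULL, subset = NULL) {
  if (!is.null(how) && !is.null(thresh)) {
    stop("how and thresh cannot both be provided")
  }
  if (is.null(subset)) subset <- names(df)
  # Count non-NA values per row, looking only at the considered columns.
  non.null <- rowSums(!is.na(df[, subset, drop = FALSE]))
  keep <- if (!is.null(thresh)) {
    non.null >= thresh            # keep rows with at least thresh non-NA values
  } else if (identical(how, "all")) {
    non.null > 0                  # drop only rows that are entirely NA
  } else {
    non.null == length(subset)    # "any" (the default): drop rows with any NA
  }
  df[keep, , drop = FALSE]
}

df <- data.frame(A = c(1, NA, NA), B = c(2, 3, NA))
res.any    <- drop_na_local(df)                 # default behaves like how = "any"
res.all    <- drop_na_local(df, how = "all")
res.thresh <- drop_na_local(df, thresh = 1)
res.subset <- drop_na_local(df, subset = "B")   # NAs in column A are ignored
```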
Returns: DataFrame with a select statement that removes NULLs.
dtypes(subset.col = NULL)
Returns column names and their data types as a list.
Usage: dataframe$dtypes()
Arguments:
subset.col: list of characters, optional, columns whose data types are shown.
Returns: list of column names and their data types.
FillNa(value, subset.dataframe = NULL)
Returns a DataFrame with NULLs replaced with the fill value. Only
supports filling numeric columns.
Usage: dataframe$FillNa(0, "col1")
Arguments:
value: integer or double, value to replace NULLs with.
value should have type appropriate for the selected columns.
subset.dataframe: character, optional, list of columns in which
to replace NULLs. Defaults to all columns.
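As a rough illustration of the fill semantics, the following base R sketch replaces `NA` (standing in for SQL NULL) in the numeric columns of a local data.frame. The helper `fill_na_local` is hypothetical, and skipping non-numeric columns silently is an assumption of this sketch; this page only states that numeric columns are the supported case.

```r
# Local analogue of FillNa(); hypothetical helper, not part of the library.
fill_na_local <- function(df, value, subset = NULL) {
  stopifnot(is.numeric(value), length(value) == 1)
  if (is.null(subset)) subset <- names(df)
  for (col in subset) {
    # Assumption: non-numeric columns are left untouched.
    if (is.numeric(df[[col]])) {
      df[[col]][is.na(df[[col]])] <- value
    }
  }
  df
}

df <- data.frame(A = c(1, NA), B = c(NA, "x"))
filled <- fill_na_local(df, 0)
```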
Returns: DataFrame, new DataFrame with NULLs filled.
Filter(condition)
Selects rows matching the given condition. The condition string is not
sanity-checked in any way. Do not take condition strings from untrusted
input, as this can easily be used for SQL injection.
Usage:
dataframe$Filter("col1 = 'A'")
Arguments:
condition: character, condition to filter on. This
should be in the format of a SQL WHERE clause test (not including the word "WHERE").
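Because the condition string is passed through unchecked, any value interpolated into it should be escaped first. The sketch below shows one common precaution, doubling embedded single quotes as in standard SQL string literals. The helper `sql_string_literal` is hypothetical and not part of the library, and it is not a complete defense; keeping untrusted input out of condition strings remains the safer choice.

```r
# Hypothetical helper: wrap a value as a SQL string literal, doubling any
# embedded single quotes (standard SQL escaping).
sql_string_literal <- function(x) {
  stopifnot(is.character(x), length(x) == 1, !is.na(x))
  paste0("'", gsub("'", "''", x, fixed = TRUE), "'")
}

user.value <- "O'Brien"
condition <- paste("col1 =", sql_string_literal(user.value))
# The condition omits the word WHERE, as the argument description requires.
# filtered <- dataframe$Filter(condition)  # needs an existing hanaml DataFrame
```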
Returns: DataFrame with only rows matching the given condition.
GenerateColname(prefix = "GEN_COL")
Generates a new column name for the DataFrame.
Usage: dataframe$GenerateColname("COL1")
Arguments:
prefix: character, optional, name of the column. If no name is provided, it
creates a default column named "GEN_COL".
Returns: character, newly generated column name.
GetDf(select.statement, name = NULL)
Creates a new DataFrame.
Usage:
dataframe$GetDf("SELECT * FROM TEMP;", name = "DF1")
Arguments:
select.statement: character, SQL query of the DataFrame.
name: character, optional, name of the DataFrame.
Returns: DataFrame.
GetDfCounter()
Returns the number of DataFrames.
Usage: dataframe$GetDfCounter()
Returns: integer.
GetNRows()
Sets the value of DataFrame's nrows.df.
Usage: dataframe$GetNRows()
Returns: No return value.
Has(col)
Returns TRUE if a column is in the DataFrame.
Usage: dataframe$Has("col1")
Arguments:
col: character, name of the column to search for in the
projection list of this DataFrame.
Returns: logical, TRUE if the column exists in the DataFrame's projection list.
Join(other, on.expression, how = "inner")
Returns a new DataFrame that is a join of this DataFrame with another DataFrame.
Usage:
dataframe$Join(other = DF1, on.expression = "col", how = "outer")
Arguments:
other: DataFrame, the DataFrame to join with.
on.expression: character, join expression.
how: ("inner", "left", "right", "outer"), optional,
type of join. Defaults to "inner".
Returns: DataFrame,
new DataFrame object that joins the current DataFrame with another DataFrame.
rename.columns(new.col.names)
Updates the column names.
Usage: dataframe$rename.columns(list("A", "C"))
Arguments:
new.col.names: list of characters, list of the new column names.
Returns: DataFrame with renamed columns.
RunQuery(Query)
Performs the query.
Usage: b <- dataframe$RunQuery('select "target" from IRIS')
Arguments:
Query: character, SQL statement.
Returns: DataFrame, new DataFrame generated by the SQL query.
save(table, table.type = NULL, force = TRUE, schema = NULL)
Creates a table holding this DataFrame's data.
Usage: dataframe$save("TAB1", "ROW")
Arguments:
table: character
Table name. save() will fail if a
conflicting table already exists.
table.type: character, optional, what kind of table
to create. Case-insensitive. Can be one of "ROW", "COLUMN", "HISTORY COLUMN",
"GLOBAL TEMPORARY", "GLOBAL TEMPORARY COLUMN", "LOCAL TEMPORARY", or
"LOCAL TEMPORARY COLUMN". Defaults to "LOCAL TEMPORARY COLUMN" if the table name starts
with "#" and "COLUMN" otherwise.
force: logical, optional,
if TRUE, an existing table will be replaced. Defaults to TRUE.
schema: character, optional,
schema name. save() will fail if a conflicting table already exists.
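The default rule for `table.type` can be written out as a small base R sketch. It assumes the "#" check applies to the table name; the helper `default_table_type` is hypothetical and only mirrors the rule described above, it does not call the library or validate the type against the allowed list.

```r
# Hypothetical helper mirroring the documented default for table.type.
default_table_type <- function(table, table.type = NULL) {
  if (!is.null(table.type)) {
    return(toupper(table.type))   # table.type is case-insensitive
  }
  # Assumption: a leading "#" in the table name marks a local temporary table.
  if (startsWith(table, "#")) "LOCAL TEMPORARY COLUMN" else "COLUMN"
}

temp.type     <- default_table_type("#TMP_RESULT")
regular.type  <- default_table_type("RESULT")
explicit.type <- default_table_type("RESULT", "row")
```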
Returns: DataFrame representing the new table.
Select(cols)
Returns a new DataFrame with columns derived from the current DataFrame.
Usage: dataframe$Select("col1") OR
col.list <- list("*", "select")
cols <- sets::as.tuple(x = col.list)
dataframe$Select(cols)
Arguments:
cols: character or (character, character) tuple
Columns of the new DataFrame. A string is treated as the name of a
column to select; a (character, character) tuple is treated as
(SQL expression, alias). As a special case, "*" is expanded to
all columns of the original DataFrame.
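The three kinds of entries in `cols` (plain column name, (SQL expression, alias) pair, and "*") can be illustrated by a sketch that turns such a specification into a SQL projection list. This is an assumption-laden illustration: the helper `build_projection` is hypothetical, a length-2 character vector stands in for the tuple, and the library's actual SQL generation is not shown on this page and may differ.

```r
# Hypothetical helper: build a SQL projection list from a Select()-style spec.
# A length-2 character vector stands in for a (SQL expression, alias) tuple.
build_projection <- function(cols, all.cols) {
  parts <- lapply(cols, function(col) {
    if (length(col) == 2) {
      sprintf('%s AS "%s"', col[[1]], col[[2]])  # (SQL expression, alias)
    } else if (identical(col, "*")) {
      sprintf('"%s"', all.cols)                  # expand to every original column
    } else {
      sprintf('"%s"', col)                       # plain column name
    }
  })
  paste(unlist(parts), collapse = ", ")
}

projection <- build_projection(
  list("*", c("A + B", "TOTAL")),
  all.cols = c("A", "B")
)
```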
Returns: DataFrame,
new DataFrame object with the specified columns projected.
Sort(cols, desc = FALSE)
Returns a new DataFrame sorted by the specified columns.
Usage: dataframe$Sort(list("COL1"))
Arguments:
cols: list of characters, list of columns to
sort by. Must be a list, even for sorting by one column.
desc: logical, optional, TRUE to sort in
descending order, FALSE for ascending order. Defaults to FALSE.
Returns: DataFrame,
new DataFrame object with rows in sorted order.
WithColumnRenamed(original, newName)
Returns a DataFrame with a new name for one column.
Usage: dataframe$WithColumnRenamed("col1", "colnew")
Arguments:
original: character, original column name.
newName: character, new column name.
Returns: DataFrame,
the same data as this DataFrame, with one changed column name.