| hanaml.DataFrame {hana.ml.r} | R Documentation |
This module represents a database query as a DataFrame. Most operations are designed to never bring data back from the database unless explicitly asked for.
hanaml.DataFrame(conn.context = NULL,
select.statement = NULL,
name = NULL)
conn.context |
|
select.statement |
|
name |
|
An object of class R6ClassGenerator of length 24.
An R6Class object with methods for a DataFrame that is
backed by a database SQL statement.
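As a rough illustration of this lazy behaviour, the sketch below chains a few of the methods documented on this page and only calls Collect() at the end. It assumes `df` is an already constructed hanaml.DataFrame; the helper name `top_rows` and the column name "AGE" are hypothetical and not part of hana.ml.r.

```r
# Sketch only: `df` is assumed to be an existing hanaml.DataFrame.
# `top_rows` and the column name "AGE" are hypothetical.
top_rows <- function(df, n = 5) {
  # Each step returns a new DataFrame that wraps a SQL statement;
  # no rows are expected to leave the database at this point.
  filtered <- df$Filter("\"AGE\" > 30")
  sorted <- filtered$Sort(list("AGE"), desc = TRUE)
  first <- sorted$Head(n = n)
  # Collect() is the explicit request to copy the result into R.
  first$Collect()
}
```

Keeping Collect() as the last step means only the filtered, sorted and truncated rows are transferred to the R session.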
AddId(id)
Adds an ID column based on ROW_NUMBER() as the first column.
Usage: dataframe$AddId(id = "NEW_ID")
Arguments:
id: character, name of the added ID column.
Returns: DataFrame with an ID column based on ROW_NUMBER() built-in.
Alias(aliasName)
Returns a new DataFrame with an alias set.
Usage: NewDf <- dataframe$Alias('TABLE1')
Arguments:
aliasName: character, alias name of the DataFrame.
Returns: DataFrame with an alias set.
cast(cols, new.type)
Converts columns from one data type to another specified data type.
Usage: dataframe$cast("ID", new.type = "DOUBLE")
Arguments:
cols: list of characters, the columns to be converted.
new.type: character, the data type to convert the columns to.
Returns: DataFrame with the selected columns converted to the new data type.
Collect()
Copies this DataFrame to an R data.frame.
Usage: dataframe$Collect()
Returns: R data.frame containing this DataFrame's data.
Count()
Computes the number of rows in a DataFrame.
Usage: dataframe$Count()
Returns: integer, number of rows in the DataFrame.
Describe(cols=NULL)
Generates descriptive statistics that summarize the central tendency,
dispersion and shape of a dataset’s distribution.
Usage: dataframe$Describe()
Arguments:
cols: list of characters, optional, the columns to be summarized. Defaults to summarizing all columns.
Returns: DataFrame with descriptive statistics.
distinct(cols=NULL)
Returns distinct values.
Usage: dataframe$distinct()
Arguments:
cols: list of characters, optional, names of the columns for which to return distinct values.
Returns: DataFrame with distinct values.
Drop(cols)
Returns a new DataFrame after removing specified columns.
Usage: dataframe$Drop('colList')
Arguments:
cols: list of characters, list of column names to drop.
Returns: DataFrame,
new DataFrame retaining only columns not in cols.
DropDuplicates(subset.dataframe=NULL)
Returns DataFrame with duplicate rows removed.
Usage: dataframe$DropDuplicates('subsetList')
Arguments:
subset.dataframe: list of characters, optional,
List of columns to consider when deciding whether rows are duplicates
of each other. Defaults to all columns.
Returns: DataFrame with only one copy of duplicate rows.
DropNa(how = NULL, thresh = NULL, subset = NULL)
Returns a new DataFrame with NULLs removed.
Usage:
dataframe$DropNa(how = 'any', subset = 'subsetone')
Arguments:
how: ('any', 'all'), optional, if provided,
'any' eliminates rows with any NULLs, and 'all' eliminates rows
that are entirely NULLs. If neither how nor thresh is provided,
how defaults to 'any'.
thresh: integer, optional, if provided, keep
rows with at least thresh non-NULL values and drop rows with fewer.
how and thresh cannot both be provided.
subset: list of characters, optional, columns to
consider when looking for NULLs. Values in other columns will be
ignored, NULL or not. Defaults to all columns in the DataFrame.
Returns: DataFrame with a select statement that removes NULLs.
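The interaction of how, thresh and subset can be easier to see on a small local example. The sketch below mimics the documented rules on a base R data.frame, with NA standing in for SQL NULL. `drop_na_local` and the `toy` data are hypothetical and only illustrate the semantics; this is not how hana.ml.r builds its select statement.

```r
# Local illustration only: mimics the documented how/thresh/subset rules
# on a base R data.frame. `drop_na_local` is a hypothetical helper, not
# part of hana.ml.r, and R's NA stands in for SQL NULL.
drop_na_local <- function(data, how = NULL, thresh = NULL, subset = NULL) {
  if (!is.null(how) && !is.null(thresh)) {
    stop("how and thresh cannot both be provided")
  }
  cols <- if (is.null(subset)) names(data) else subset
  # Count the non-missing values per row over the considered columns.
  non_null <- rowSums(!is.na(data[, cols, drop = FALSE]))
  keep <- if (!is.null(thresh)) {
    non_null >= thresh              # keep rows with enough non-missing values
  } else if (identical(how, "all")) {
    non_null > 0                    # drop only rows that are entirely missing
  } else {
    non_null == length(cols)        # 'any' (the default): drop rows with any missing value
  }
  data[keep, , drop = FALSE]
}

toy <- data.frame(A = c(1, NA, NA), B = c("x", "y", NA))
drop_na_local(toy)                 # keeps row 1 only
drop_na_local(toy, how = "all")    # keeps rows 1 and 2
drop_na_local(toy, thresh = 1)     # keeps rows 1 and 2
drop_na_local(toy, subset = "B")   # keeps rows 1 and 2
```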
dtypes(subset.col = NULL)
Returns column names and their data types as a list.
Usage: dataframe$dtypes()
Arguments:
subset.col: list of characters, optional, the columns whose data types are shown.
Returns: list of column names and their data types.
FillNa(value, subset.dataframe = NULL)
Returns a DataFrame with NULLs replaced with the fill value. Only
supports filling numeric columns.
Usage: dataframe$FillNa(0, 'col1')
Arguments:
value: integer or double, value to replace NULLs with.
value should have type appropriate for the selected columns.
subset.dataframe: character, optional, list of columns in which
to replace NULLs. Defaults to all columns.
Returns: DataFrame, new DataFrame with NULLs filled.
Filter(condition)
Selects rows matching the given condition. The condition string is not
sanity-checked in any way. Do not take condition strings from untrusted
input, as this can easily be used for SQL injection.
Usage:
dataframe$Filter("col1 = 'A'")
Arguments:
condition: character, condition to filter on. This
should be in the format of a SQL WHERE clause test (not including the word "WHERE").
Returns: DataFrame with only rows matching the given condition.
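Since the condition string is passed through unchecked, any value interpolated into it has to be escaped by the caller. The sketch below shows one common precaution, doubling embedded single quotes when building a SQL string literal. `quote_sql_literal`, the column "NAME" and the sample value are hypothetical; the helper is not part of hana.ml.r, and it reduces rather than removes the need to validate untrusted input.

```r
# Hypothetical helper, not part of hana.ml.r: wraps a single value as a
# SQL string literal by doubling any embedded single quotes.
quote_sql_literal <- function(value) {
  stopifnot(is.character(value), length(value) == 1, !is.na(value))
  paste0("'", gsub("'", "''", value, fixed = TRUE), "'")
}

user_value <- "O'Brien"
# Build only the WHERE-clause test, without the word "WHERE".
condition <- paste0("\"NAME\" = ", quote_sql_literal(user_value))
print(condition)

# With an existing hanaml.DataFrame named `dataframe`, the condition
# would then be passed on as: dataframe$Filter(condition)
```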
GenerateColname(prefix = 'GEN_COL')
Generates a new column name for the DataFrame.
Usage: dataframe$GenerateColname('COL1')
Arguments:
prefix: character, optional, prefix of the column name. If no prefix is provided,
the default 'GEN_COL' is used.
Returns: character, newly generated column name.
GetDf(select.statement, name = NULL)
Creates a new DataFrame.
Usage:
dataframe$GetDf('SELECT * FROM TEMP;', name = 'DF1')
Arguments:
select.statement: character, SQL query of the DataFrame.
name: character, optional, name of the DataFrame.
Returns: DataFrame
GetDfCounter()
Returns the number of DataFrames.
Usage: dataframe$GetDfCounter()
Returns: integer.
GetNRows()
Sets the value of DataFrame's nrows.df.
Usage: dataframe$GetNRows()
Returns: No return value.
Has(col)
Returns TRUE if a column is in the DataFrame.
Usage: dataframe$Has('col1')
Arguments:
col: character, name of the column to search for in the
projection list of this DataFrame.
Returns:
logical, TRUE if the column exists in the DataFrame's projection list.
Head(n = 1)
Returns a new DataFrame containing the first n rows of the DataFrame.
Usage: dataframe$Head(n = 5)
Arguments:
n: integer, optional, number of rows to return.
Defaults to 1.
Returns: DataFrame,
new DataFrame of the first n rows of this DataFrame.
Join(other, on.expression, how = 'inner')
Returns a new DataFrame that is a join of this DataFrame with another DataFrame.
Usage:
dataframe$Join(other = DF1, on.expression = 'col', how = 'outer')
Arguments:
other: DataFrame, the DataFrame to join with.
on.expression: character, join expression.
how: ('inner', 'left', 'right', 'outer'), optional,
type of join. Defaults to 'inner'.
Returns:
DataFrame,
new DataFrame object that joins the current DataFrame with another DataFrame.
rename.columns(new.col.names)
Updates the column names.
Usage: dataframe$rename.columns(list("A", "C"))
Arguments:
new.col.names: list of characters, the new column names.
Returns: DataFrame with renamed columns.
RunQuery(Query)
Performs the query.
Usage: b <- dataframe$RunQuery('select "target" from IRIS')
Arguments:
Query: character, SQL statement.
Returns: DataFrame, new DataFrame generated by the SQL query.
save(table, table.type = NULL, force = TRUE, schema = NULL)
Creates a table holding this DataFrame's data.
Usage: dataframe$save('TAB1', 'ROW')
Arguments:
table: character,
table name. save() will fail if a
conflicting table already exists.
table.type: character, optional, the kind of table
to create. Case-insensitive. Can be one of "ROW", "COLUMN", "HISTORY COLUMN",
"GLOBAL TEMPORARY", "GLOBAL TEMPORARY COLUMN", "LOCAL TEMPORARY", or
"LOCAL TEMPORARY COLUMN". Defaults to "LOCAL TEMPORARY COLUMN" if table starts
with "#" and "COLUMN" otherwise.
force: logical, optional,
if TRUE, an existing table is replaced. Defaults to TRUE.
schema: character, optional,
schema name.
Returns: DataFrame representing the new table.
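The default described for table.type depends only on whether the table name starts with "#". The sketch below restates that rule as a small function; `default_table_type` and the table names are hypothetical and are not the library's own code.

```r
# Hypothetical helper restating the documented default for table.type;
# this is an illustration, not code taken from hana.ml.r.
default_table_type <- function(table) {
  stopifnot(is.character(table), length(table) == 1)
  # Names starting with "#" default to a local temporary column table.
  if (startsWith(table, "#")) "LOCAL TEMPORARY COLUMN" else "COLUMN"
}

default_table_type("#TMP_RESULT")
default_table_type("RESULT")
```

Passing table.type explicitly avoids relying on this naming convention.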
Select(cols)
Returns a new DataFrame with columns derived from the current DataFrame.
Usage: dataframe$Select('col1') OR
col.list <- list('*','select')
cols <- sets::as.tuple(x = col.list)
dataframe$Select(cols)
Arguments:
cols: character or (character, character) tuple
Columns of the new DataFrame. A string is treated as the name of a
column to select; a (character, character) tuple is treated as
(SQL expression, alias). As a special case, '*' is expanded to
all columns of the original DataFrame.
Returns: DataFrame,
new DataFrame object with the specified columns projected.
Sort(cols, desc = FALSE)
Returns a new DataFrame sorted by the specified columns.
Usage: dataframe$Sort(list('COL1'))
Arguments:
cols: list of characters, list of columns to
sort by. Must be a list, even for sorting by one column.
desc: logical, Optional, TRUE to sort in
descending order, FALSE for ascending order. Defaults to FALSE.
Returns: DataFrame,
new DataFrame object with rows in sorted order.
WithColumnRenamed(original, newName)
Returns a DataFrame with a new name for one column.
Usage: dataframe$WithColumnRenamed('col1','colnew')
Arguments:
original: character, original column name.
newName: character, new column name.
Returns: DataFrame,
the same data as this DataFrame, with one changed column name.