What’s New in hana.ml.r v2.10.210918
New Functions:
- TextMining : TF.Analysis, Text.Classification, Get.Related.Doc, Get.Relevant.Doc, Get.Related.Term, Get.Relevant.Term, Get.Suggested.Term, Text.Collector, Text.TFIDF.
- New function hanaml.OnlineLinearRegression(), an online version of linear regression.
- New function hanaml.OnlineMultiLogisticRegression(), an online version of multi logistic regression.
- New Bayesian Change-Point Detection function hanaml.BCPD().
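To illustrate what "online" means for functions such as hanaml.OnlineLinearRegression(), here is a minimal Python sketch. It is not the hana.ml.r or PAL implementation: it assumes a single feature and keeps running statistics so the fit can be updated one observation at a time without storing past data. The class and method names are hypothetical.

```python
class OnlineSimpleLinearRegression:
    """Illustrative one-feature least-squares fit updated per observation.

    Hypothetical helper, not part of hana.ml.r.
    """

    def __init__(self):
        self.n = 0
        self.mean_x = 0.0
        self.mean_y = 0.0
        self.sxx = 0.0  # running sum of squared deviations of x
        self.sxy = 0.0  # running sum of co-deviations of x and y

    def update(self, x, y):
        """Fold one (x, y) observation into the running statistics."""
        self.n += 1
        dx = x - self.mean_x  # deviation from the previous mean of x
        self.mean_x += dx / self.n
        self.mean_y += (y - self.mean_y) / self.n
        self.sxx += dx * (x - self.mean_x)
        self.sxy += dx * (y - self.mean_y)  # uses the updated mean of y

    def coefficients(self):
        """Return (intercept, slope); needs at least two distinct x values."""
        if self.sxx == 0.0:
            raise ValueError("need at least two distinct x values")
        slope = self.sxy / self.sxx
        return self.mean_y - slope * self.mean_x, slope


model = OnlineSimpleLinearRegression()
for x, y in [(0, 1), (1, 3), (2, 5), (3, 7)]:  # points on y = 2x + 1
    model.update(x, y)
intercept, slope = model.coefficients()
print(round(intercept, 6), round(slope, 6))
```

Each call to update() touches only a handful of numbers, which is the property that makes this style of fitting suitable for data that arrives in a stream.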
Enhancement:
- Categorical feature support in AdditiveModelForecast().
- Supports SAP HANA Data Lake in ModelStorage.
API Changes:
- Added a new parameter ‘categorical.variable’ in hanaml.AdditiveModelForecast().
What’s New in hana.ml.r v2.9.210619
Bug Fixed:
- Fixed missing default value of parameter ‘cols’ in hanaml.CovarianceMatrix and hanaml.PearsonrMatrix.
- Fixed a missing schema setting in hanaml.ConvertToHANADataFrame when JDBC methods are used to write the HANA table.
- Fixed a letter-case error in the spelling of ACCURACY_MEASURE.
- Corrected the value of the parameter ‘intercept’ in linear regression.
What’s New in hana.ml.r v2.8.210321
Version 2.8.210321 supports SAP HANA SPS05 and SAP HANA Cloud
New Functions:
- Added tbl() to support dbplyr.
- Added hana2dbplyr() and dbplyr2hana() to support dbplyr.
- New Unified functions: hanaml.UnifiedRegression, hanaml.UnifiedClustering.
- New Markov Chain Monte Carlo function: hanaml.mcmc.
- New ‘hana.version’ function is provided by hanaml.ConnectionContext.
- New function hanaml.OnlineARIMA.
- New function hanaml.VectorARIMA.
- New function hanaml.sqltrace.
API Changes:
- Added two parameters, ‘use.fast.library’ and ‘use.float’, in hanaml.KMeans.
- Added a parameter ‘distance.level’ in hanaml.UnifiedClustering, used when ‘func’ is AgglomerateHierarchicalClustering or DBSCAN. Please refer to the documentation for details.
- Added a parameter ‘range.penalty’ in hanaml.CPD.
- Provided more ‘decomposition’ methods in various regression functions. Please refer to the documentation of the specific algorithm.
- Added a parameter ‘key’ in prediction() function of hanaml.ARIMA and hanaml.AutoARIMA.
- Added two parameters, ‘min.measure’ and ‘max.consequent’, in hanaml.KORD.
- Added a parameter ‘output.threshold’ in hanaml.AUC to enable the output of threshold values in the ROC table.
- Removed RODBC dependency.
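As a rough illustration of what threshold values in a ROC table look like (the ‘output.threshold’ option of hanaml.AUC listed above), the following Python sketch builds one row per distinct score threshold. It is not how hanaml.AUC computes its result; the function names, the column order, and the sample data are assumptions made for illustration.

```python
def roc_table(labels, scores):
    """Return (threshold, fpr, tpr) rows, one per distinct score, highest first.

    Hypothetical helper: a sample is predicted positive when score >= threshold.
    """
    positives = sum(1 for label in labels if label == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")
    rows = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for l, s in zip(labels, scores) if s >= threshold and l == 1)
        fp = sum(1 for l, s in zip(labels, scores) if s >= threshold and l != 1)
        rows.append((threshold, fp / negatives, tp / positives))
    return rows


def auc_from_rows(rows):
    """Trapezoidal area under the ROC curve, starting from the point (0, 0)."""
    area, prev_fpr, prev_tpr = 0.0, 0.0, 0.0
    for _, fpr, tpr in rows:
        area += (fpr - prev_fpr) * (tpr + prev_tpr) / 2.0
        prev_fpr, prev_tpr = fpr, tpr
    return area


rows = roc_table([1, 1, 0, 1, 0], [0.9, 0.8, 0.7, 0.6, 0.2])
for row in rows:
    print(row)
print(auc_from_rows(rows))
```

Keeping the threshold next to each (FPR, TPR) pair is what lets a reader pick an operating point from the table rather than only reading off the area.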
Enhancement:
- Improved the data upload in ConvertToHANADataFrame() for ODBC and JDBC connections.
- Enhanced hanaml.UnifiedClustering to support ‘distance.level’ in AgglomerateHierarchicalClustering and DBSCAN functions. Please refer to the documentation for details.
- Enhanced the speed of data import when ODBC is used.
- Enhanced the validation of types of parameters.
- Enhanced the parameter check of season.start and allow.linear in hanaml.AutoARIMA.
- Enhanced the model storage for hanaml.ARIMA, hanaml.AutoARIMA, hanaml.VectorARIMA and hanaml.OnlineARIMA.
Bug Fixed:
- Fixed the displacement of parameter ‘dispersion’ in hanaml.CPD.
- Fixed the displacement of parameter ‘category.weight’ in hanaml.GaussianMixture.
What’s New in hana.ml.r v2.6.201016
API Changes:
- HybridGradientBoostingClassifier, HybridGradientBoostingRegressor: added a parameter ‘adopt_prior’ to indicate whether to adopt the prior distribution as the initial point.
- LinearRegression: added a parameter ‘features.must.select’ to specify the column name that must be included in the final training model when variable selection is executed.
What’s New in hana.ml.r v2.6.200928
API Changes:
- Added encrypt, validateCertificate and autocommit options to JDBC ConnectionContext.
- SVC, SVR, OneClassSVM, SVRanking: added parameters ‘compression’, ‘max.bits’, ‘max.quantization.iter’ for model compression.
- RandomForestClassifier: added parameters ‘compression’, ‘max.bits’, ‘quantize.rate’ for model compression.
- RandomForestRegressor: added parameters ‘compression’, ‘max.bits’, ‘quantize.rate’, ‘fittings.quantization’ for model compression.
- In the prediction functions of ARIMA and AutoARIMA, a new value ‘truncation.algorithm’ of forecast_method is introduced to improve prediction performance.
- New parameters ‘string.variable’ and ‘variable.weight’ added in KNNClassifier, KNNRegressor and DBSCAN to enable distance calculation based on string distance.
- New parameters ‘extrapolation’, ‘smooth.width’, ‘auxiliary.normalitytest’ are added in SeasonalDecompose.
New Functions:
- Clustering: SlightSilhouette.
- Preprocessing : SMOTETomek, TomekLinks.
Bug Fixed:
- Fixed version check to support SAP HANA Cloud.
- Fixed the error in random distribution sampling when the full set of distribution parameters is specified.
- Fixed HAS_ID error in TomekLinks.
- Fixed the Collect() method returning ‘No Data’ when the data contains a column of type ST_GEOMETRY (in particular, for GeoDBSCAN).
What’s New in hana.ml.r v2.5.200626
API Changes:
- Removed parameter ConnectionContext in functions.
- Added a new parameter ‘decay’ in SOM() to replace ‘learning.rate’, as the latter name is misleading.
- Added a new parameter ‘stratified.columns’ in hanaml.Sampling() to replace ‘features’, whose name is misleading.
- Added a new parameter ‘col.types’ in ConvertToHANADataFrame() to replace ‘clob.columns’ for enhancement.
- Changed parameter ‘seed’ to ‘random.state’ for parameter name consistency in MLPClassifier() and MLPRegressor().
- Updated function names for consistency: Arima -> ARIMA, AutoArima -> AutoARIMA, Auc -> AUC, Kmeans -> KMeans, Kmedian -> KMedian, Kmedoid -> KMedoid, Kord -> KORD.
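The function renames above could be applied to existing scripts with a small migration helper. The Python sketch below is one possible approach, not an official tool. It assumes the old names were called with the ‘hanaml.’ prefix (the list above gives only the bare names); the helper name and the sample R line, including its argument names, are hypothetical.

```python
import re

# Old -> new function names, taken from the rename list above.
RENAMES = {
    "Arima": "ARIMA",
    "AutoArima": "AutoARIMA",
    "Auc": "AUC",
    "Kmeans": "KMeans",
    "Kmedian": "KMedian",
    "Kmedoid": "KMedoid",
    "Kord": "KORD",
}

# Assumption: calls are written as hanaml.<Name>. The trailing word boundary
# keeps a short name from matching the start of a longer identifier.
PATTERN = re.compile(r"\bhanaml\.(" + "|".join(RENAMES) + r")\b")


def migrate_names(r_source):
    """Rewrite old hana.ml.r function names in a string of R source text."""
    return PATTERN.sub(lambda m: "hanaml." + RENAMES[m.group(1)], r_source)


old_line = "fit <- hanaml.AutoArima(data = df); km <- hanaml.Kmeans(data = df)"
print(migrate_names(old_line))
```

The match is case-sensitive, so names that already use the new spelling are left untouched. Renamed parameters (for example ‘seed’ to ‘random.state’) are deliberately not handled here, since a plain text substitution could not tell which function a parameter belongs to.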
New Functions:
- Recommender System Algorithms : Alternating Least Square, Factorized Polynomial Regression Models, Field-Aware Factorization Machine.
- Regression : Cox Proportional Hazard Model.
- Statistics Functions : Cumulative Distribution Function (CDF), Distribution Fitting, Distribution Quantile, Entropy, Equal Variance Test, Factor Analysis, Wilcoxon Signed Rank Test, Grubbs’ Test, Kaplan-Meier Survival Analysis, Kernel Density Estimation, One-Sample Median Test.
- Time Series : Fast DTW
- Preprocessing : SMOTE
- Miscellaneous : ABC Analysis, T-Distributed Stochastic Neighbour Embedding (TSNE), Weighted Score Table.
- Unified classification
- Model Selection
Bug Fixed:
- Fixed parameter ‘formula’ parsing issue when a single feature is entered.
- Fixed incorrect recognition of the query statement type with JDBC in sqlQueryMix().
- Fixed phrasing error of parameter ‘timeout’ in Decision Tree, Linear Regression, Logistic Regression, Naive Bayes and SVM functions.
Enhancement:
- Added cross-validation options to some functions (Decision Tree Classifier/Regressor, Gradient Boosting Classifier/Regressor, Hybrid Gradient Boosting Classifier/Regressor, Generalised Linear Models (GLM), Naive Bayes, Linear Regression, Logistic Regression, Multi-Layer Perceptron Classifier/Regressor, Support Vector Machines functions, K-Nearest Neighbors Classifier/Regressor, Alternating Least Square (ALS), Factorized Polynomial Regression Models (FRM), Polynomial Regression).
- Improved the robustness of ConvertToHANADataFrame().
- Enhanced ConvertToHANADataFrame() to support data.frame input with missing (NA) values.
What’s New in hana.ml.r v1.0.8
Bug Fixed:
- Fixed wrong error message with RJDBC connection. Added two error messages: one for a missing RJDBC connection and one for a wrong RJDBC connection.context type.
- Fixed label type error in RandomForest. Added a check that the label type is continuous for regression and categorical for classification.
- Fixed wrong API for DecisionTreeRegressor. Removed type conversion for result.
- Fixed wrong cast error in Neighbors. The result now uses NVARCHAR instead of DOUBLE.
- Fixed wrong cast error in HGBT. The result now uses NVARCHAR instead of DOUBLE.
What’s New in hana.ml.r v1.0.7
New Algorithms:
- Association : Apriori, Apriorilite, FP-Growth, KORD, Sequential Pattern Mining (SPM)
- Clustering : Affinity Propagation, Agglomerate Hierarchical Clustering, DBSCAN, Geometry DBSCAN, Latent Dirichlet Allocation, Self-Organizing Maps (SOM)
- Classification : Conditional Random Field (CRF), Confusion Matrix, Hybrid Gradient Boosting Tree (HGBT), Logistic Regression, Multilayer Perceptron (MLP).
- Regression : Bi-Variate Geometric Regression, Bi-Variate Natural Logarithmic Regression, Exponential Regression.
- Time Series : ARIMA, Auto ARIMA, FFT, Seasonal Decompose, Trend Test, White Noise Test, Single/Double/Triple/Auto/Brown Exponential Smoothing, Change-Point Detection, Croston’s Method, Linear Regression With Damped Trend And Seasonal Adjust, Additive Model Forecast, Hierarchical Forecast, Correlation Function.
- Preprocessing : Discretize, Inter-Quartile Range (IQR), Missing Value Handling, Multidimensional Scaling (MDS), Partition, Random Distribution Sampling, Variance Test function.
- Statistics Functions : Chi-Squared Test Functions, T-Test Functions, Analysis Of Variance Functions (ANOVA), Univariate/Multivariate Analysis Functions.
- Random Distribution Sampling Functions : Bernoulli, Beta, Binomial, Cauchy, Chi_Squared, Exponential, Extreme_Value, F, Gamma, Geometric, Gumbel, Lognormal, Negative_Binomial, Normal, Pert, Poisson, Student_T, Uniform, Weibull, Multinomial.
- Social Networks : Link Prediction, PageRank.
- Connection Context : JDBC Option.
- Model Storage Services.
- Dataframe Functions : ConvertToHANADataFrame.
Quality Improvements:
- Use Anonymous Block.
- Remove Unnecessary Temporary Table.
- Error Message Enhancement.
- Fix Documentation.
- Code Quality Improvement.