hana-ml
2.27.260206
  • Python Machine Learning Client for SAP HANA
    • Prerequisites
    • SAP HANA DataFrame
    • Machine Learning API
    • End-to-End Example: Using SAP HANA Predictive Analysis Library (PAL) Module
    • End-to-End Example: Using SAP HANA Automated Predictive Library (APL) Module
    • Visualizers Module
    • Spatial and Graph Features
    • Summary
  • Installation Guide
  • hana-ml Tutorials
  • Changelog
  • hana_ml.dataframe
    • quotename()
    • ConnectionContext
      • ConnectionContext.enable_abap_sql()
      • ConnectionContext.disable_abap_sql()
      • ConnectionContext.close()
      • ConnectionContext.add_primary_key()
      • ConnectionContext.drop_primary_key()
      • ConnectionContext.add_auto_incremented_key()
      • ConnectionContext.copy()
      • ConnectionContext.create_pse()
      • ConnectionContext.create_certificate()
      • ConnectionContext.add_certificate_to_pse()
      • ConnectionContext.create_schema()
      • ConnectionContext.create_table()
      • ConnectionContext.create_remote_source()
      • ConnectionContext.create_virtual_table()
      • ConnectionContext.drop_procedure()
      • ConnectionContext.drop_view()
      • ConnectionContext.truncate_table()
      • ConnectionContext.drop_table()
      • ConnectionContext.copy_to_data_lake()
      • ConnectionContext.explain_plan_statement()
      • ConnectionContext.has_schema()
      • ConnectionContext.has_table()
      • ConnectionContext.hana_version()
      • ConnectionContext.get_current_schema()
      • ConnectionContext.get_tables()
      • ConnectionContext.get_schemas()
      • ConnectionContext.get_procedures()
      • ConnectionContext.get_temporary_tables()
      • ConnectionContext.get_connection_id()
      • ConnectionContext.cancel_session_process()
      • ConnectionContext.restart_session()
      • ConnectionContext.clean_up_temporary_tables()
      • ConnectionContext.hana_major_version()
      • ConnectionContext.is_cloud_version()
      • ConnectionContext.sql()
      • ConnectionContext.execute_sql()
      • ConnectionContext.table()
      • ConnectionContext.upsert_streams_data()
      • ConnectionContext.update_streams_data()
      • ConnectionContext.to_sqlalchemy()
      • ConnectionContext.create_vector_index()
      • ConnectionContext.drop_vector_index()
      • ConnectionContext.embed_query()
    • DataFrame
      • DataFrame.columns
      • DataFrame.shape
      • DataFrame.name
      • DataFrame.quoted_name
      • DataFrame.description
      • DataFrame.description_ext
      • DataFrame.declare_lttab_usage()
      • DataFrame.disable_validate_columns()
      • DataFrame.enable_validate_columns()
      • DataFrame.add_vector()
      • DataFrame.add_id()
      • DataFrame.add_constant()
      • DataFrame.alias()
      • DataFrame.count()
      • DataFrame.diff()
      • DataFrame.drop()
      • DataFrame.distinct()
      • DataFrame.drop_duplicates()
      • DataFrame.dropna()
      • DataFrame.deselect()
      • DataFrame.has_constant_columns()
      • DataFrame.drop_constant_columns()
      • DataFrame.dtypes()
      • DataFrame.empty()
      • DataFrame.filter()
      • DataFrame.has()
      • DataFrame.head()
      • DataFrame.hasna()
      • DataFrame.fillna()
      • DataFrame.get_table_structure()
      • DataFrame.join()
      • DataFrame.set_name()
      • DataFrame.set_index()
      • DataFrame.smart_save()
      • DataFrame.save()
      • DataFrame.save_nativedisktable()
      • DataFrame.split_column()
      • DataFrame.concat_columns()
      • DataFrame.nullif()
      • DataFrame.replace()
      • DataFrame.sort()
      • DataFrame.sort_values()
      • DataFrame.sort_index()
      • DataFrame.sort_by_similarity()
      • DataFrame.select()
      • DataFrame.set_operations()
      • DataFrame.union()
      • DataFrame.collect()
      • DataFrame.geometries
      • DataFrame.srids
      • DataFrame.rename_columns()
      • DataFrame.auto_cast()
      • DataFrame.cast()
      • DataFrame.tail()
      • DataFrame.to_head()
      • DataFrame.to_tail()
      • DataFrame.summary()
      • DataFrame.stats
      • DataFrame.describe()
      • DataFrame.bin()
      • DataFrame.agg()
      • DataFrame.is_numeric()
      • DataFrame.corr()
      • DataFrame.min()
      • DataFrame.max()
      • DataFrame.sum()
      • DataFrame.median()
      • DataFrame.mean()
      • DataFrame.stddev()
      • DataFrame.value_counts()
      • DataFrame.pivot_table()
      • DataFrame.generate_table_type()
      • DataFrame.rearrange()
      • DataFrame.set_source_table()
      • DataFrame.to_pickle()
      • DataFrame.to_datetime()
      • DataFrame.generate_feature()
      • DataFrame.mutate()
    • read_pickle()
    • create_dataframe_from_pandas()
    • create_dataframe_from_spark()
    • melt()
    • create_dataframe_from_shapefile()
    • import_csv_from()
  • hana_ml.algorithms.apl package
    • hana_ml.algorithms.apl.gradient_boosting_classification
      • GradientBoostingClassifier
        • GradientBoostingClassifier.set_params()
        • GradientBoostingClassifier.fit()
        • GradientBoostingClassifier.score()
        • GradientBoostingClassifier.get_metrics_per_class()
        • GradientBoostingClassifier.build_report()
        • GradientBoostingClassifier.set_metric_samplings()
        • GradientBoostingClassifier.disable_hana_execution()
        • GradientBoostingClassifier.enable_hana_execution()
        • GradientBoostingClassifier.export_apply_code()
        • GradientBoostingClassifier.generate_html_report()
        • GradientBoostingClassifier.generate_notebook_iframe_report()
        • GradientBoostingClassifier.get_apl_version()
        • GradientBoostingClassifier.get_artifacts_recorder()
        • GradientBoostingClassifier.get_best_iteration()
        • GradientBoostingClassifier.get_debrief_report()
        • GradientBoostingClassifier.get_evalmetrics()
        • GradientBoostingClassifier.get_feature_importances()
        • GradientBoostingClassifier.get_fit_operation_log()
        • GradientBoostingClassifier.get_indicators()
        • GradientBoostingClassifier.get_model_info()
        • GradientBoostingClassifier.get_params()
        • GradientBoostingClassifier.get_performance_metrics()
        • GradientBoostingClassifier.get_predict_operation_log()
        • GradientBoostingClassifier.get_summary()
        • GradientBoostingClassifier.is_fitted()
        • GradientBoostingClassifier.load_model()
        • GradientBoostingClassifier.predict()
        • GradientBoostingClassifier.save_artifact()
        • GradientBoostingClassifier.save_model()
        • GradientBoostingClassifier.schedule_fit()
        • GradientBoostingClassifier.schedule_predict()
        • GradientBoostingClassifier.set_framework_version()
        • GradientBoostingClassifier.set_scale_out()
        • GradientBoostingClassifier.set_shapley_explainer_of_predict_phase()
        • GradientBoostingClassifier.set_shapley_explainer_of_score_phase()
      • GradientBoostingBinaryClassifier
        • GradientBoostingBinaryClassifier.set_params()
        • GradientBoostingBinaryClassifier.score()
        • GradientBoostingBinaryClassifier.build_report()
        • GradientBoostingBinaryClassifier.disable_hana_execution()
        • GradientBoostingBinaryClassifier.enable_hana_execution()
        • GradientBoostingBinaryClassifier.export_apply_code()
        • GradientBoostingBinaryClassifier.fit()
        • GradientBoostingBinaryClassifier.generate_html_report()
        • GradientBoostingBinaryClassifier.generate_notebook_iframe_report()
        • GradientBoostingBinaryClassifier.get_apl_version()
        • GradientBoostingBinaryClassifier.get_artifacts_recorder()
        • GradientBoostingBinaryClassifier.get_best_iteration()
        • GradientBoostingBinaryClassifier.get_debrief_report()
        • GradientBoostingBinaryClassifier.get_evalmetrics()
        • GradientBoostingBinaryClassifier.get_feature_importances()
        • GradientBoostingBinaryClassifier.get_fit_operation_log()
        • GradientBoostingBinaryClassifier.get_indicators()
        • GradientBoostingBinaryClassifier.get_model_info()
        • GradientBoostingBinaryClassifier.get_params()
        • GradientBoostingBinaryClassifier.get_performance_metrics()
        • GradientBoostingBinaryClassifier.get_predict_operation_log()
        • GradientBoostingBinaryClassifier.get_summary()
        • GradientBoostingBinaryClassifier.is_fitted()
        • GradientBoostingBinaryClassifier.load_model()
        • GradientBoostingBinaryClassifier.predict()
        • GradientBoostingBinaryClassifier.save_artifact()
        • GradientBoostingBinaryClassifier.save_model()
        • GradientBoostingBinaryClassifier.schedule_fit()
        • GradientBoostingBinaryClassifier.schedule_predict()
        • GradientBoostingBinaryClassifier.set_framework_version()
        • GradientBoostingBinaryClassifier.set_metric_samplings()
        • GradientBoostingBinaryClassifier.set_scale_out()
        • GradientBoostingBinaryClassifier.set_shapley_explainer_of_predict_phase()
        • GradientBoostingBinaryClassifier.set_shapley_explainer_of_score_phase()
    • hana_ml.algorithms.apl.gradient_boosting_regression
      • GradientBoostingRegressor
        • GradientBoostingRegressor.set_params()
        • GradientBoostingRegressor.predict()
        • GradientBoostingRegressor.score()
        • GradientBoostingRegressor.build_report()
        • GradientBoostingRegressor.disable_hana_execution()
        • GradientBoostingRegressor.enable_hana_execution()
        • GradientBoostingRegressor.export_apply_code()
        • GradientBoostingRegressor.fit()
        • GradientBoostingRegressor.generate_html_report()
        • GradientBoostingRegressor.generate_notebook_iframe_report()
        • GradientBoostingRegressor.get_apl_version()
        • GradientBoostingRegressor.get_artifacts_recorder()
        • GradientBoostingRegressor.get_best_iteration()
        • GradientBoostingRegressor.get_debrief_report()
        • GradientBoostingRegressor.get_evalmetrics()
        • GradientBoostingRegressor.get_feature_importances()
        • GradientBoostingRegressor.get_fit_operation_log()
        • GradientBoostingRegressor.get_indicators()
        • GradientBoostingRegressor.get_model_info()
        • GradientBoostingRegressor.get_params()
        • GradientBoostingRegressor.get_performance_metrics()
        • GradientBoostingRegressor.get_predict_operation_log()
        • GradientBoostingRegressor.get_summary()
        • GradientBoostingRegressor.is_fitted()
        • GradientBoostingRegressor.load_model()
        • GradientBoostingRegressor.save_artifact()
        • GradientBoostingRegressor.save_model()
        • GradientBoostingRegressor.schedule_fit()
        • GradientBoostingRegressor.schedule_predict()
        • GradientBoostingRegressor.set_framework_version()
        • GradientBoostingRegressor.set_scale_out()
        • GradientBoostingRegressor.set_shapley_explainer_of_predict_phase()
        • GradientBoostingRegressor.set_shapley_explainer_of_score_phase()
    • hana_ml.algorithms.apl.time_series
      • AutoTimeSeries
        • AutoTimeSeries.set_params()
        • AutoTimeSeries.fit()
        • AutoTimeSeries.predict()
        • AutoTimeSeries.fit_predict()
        • AutoTimeSeries.forecast()
        • AutoTimeSeries.get_model_components()
        • AutoTimeSeries.get_performance_metrics()
        • AutoTimeSeries.get_horizon_wide_metric()
        • AutoTimeSeries.load_model()
        • AutoTimeSeries.export_apply_code()
        • AutoTimeSeries.build_report()
        • AutoTimeSeries.generate_html_report()
        • AutoTimeSeries.generate_notebook_iframe_report()
        • AutoTimeSeries.disable_hana_execution()
        • AutoTimeSeries.enable_hana_execution()
        • AutoTimeSeries.get_apl_version()
        • AutoTimeSeries.get_artifacts_recorder()
        • AutoTimeSeries.get_debrief_report()
        • AutoTimeSeries.get_fit_operation_log()
        • AutoTimeSeries.get_indicators()
        • AutoTimeSeries.get_model_info()
        • AutoTimeSeries.get_params()
        • AutoTimeSeries.get_predict_operation_log()
        • AutoTimeSeries.get_summary()
        • AutoTimeSeries.is_fitted()
        • AutoTimeSeries.save_artifact()
        • AutoTimeSeries.save_model()
        • AutoTimeSeries.schedule_fit()
        • AutoTimeSeries.schedule_predict()
        • AutoTimeSeries.set_scale_out()
    • hana_ml.algorithms.apl.classification
      • AutoClassifier
        • AutoClassifier.fit()
        • AutoClassifier.predict()
        • AutoClassifier.score()
        • AutoClassifier.disable_hana_execution()
        • AutoClassifier.enable_hana_execution()
        • AutoClassifier.export_apply_code()
        • AutoClassifier.get_apl_version()
        • AutoClassifier.get_artifacts_recorder()
        • AutoClassifier.get_debrief_report()
        • AutoClassifier.get_feature_importances()
        • AutoClassifier.get_fit_operation_log()
        • AutoClassifier.get_indicators()
        • AutoClassifier.get_model_info()
        • AutoClassifier.get_params()
        • AutoClassifier.get_performance_metrics()
        • AutoClassifier.get_predict_operation_log()
        • AutoClassifier.get_summary()
        • AutoClassifier.is_fitted()
        • AutoClassifier.load_model()
        • AutoClassifier.save_artifact()
        • AutoClassifier.save_model()
        • AutoClassifier.schedule_fit()
        • AutoClassifier.schedule_predict()
        • AutoClassifier.set_params()
        • AutoClassifier.set_scale_out()
    • hana_ml.algorithms.apl.regression
      • AutoRegressor
        • AutoRegressor.fit()
        • AutoRegressor.predict()
        • AutoRegressor.score()
        • AutoRegressor.disable_hana_execution()
        • AutoRegressor.enable_hana_execution()
        • AutoRegressor.export_apply_code()
        • AutoRegressor.get_apl_version()
        • AutoRegressor.get_artifacts_recorder()
        • AutoRegressor.get_debrief_report()
        • AutoRegressor.get_feature_importances()
        • AutoRegressor.get_fit_operation_log()
        • AutoRegressor.get_indicators()
        • AutoRegressor.get_model_info()
        • AutoRegressor.get_params()
        • AutoRegressor.get_performance_metrics()
        • AutoRegressor.get_predict_operation_log()
        • AutoRegressor.get_summary()
        • AutoRegressor.is_fitted()
        • AutoRegressor.load_model()
        • AutoRegressor.save_artifact()
        • AutoRegressor.save_model()
        • AutoRegressor.schedule_fit()
        • AutoRegressor.schedule_predict()
        • AutoRegressor.set_params()
        • AutoRegressor.set_scale_out()
    • hana_ml.algorithms.apl.clustering
      • AutoUnsupervisedClustering
        • AutoUnsupervisedClustering.fit()
        • AutoUnsupervisedClustering.fit_predict()
        • AutoUnsupervisedClustering.get_metrics()
        • AutoUnsupervisedClustering.disable_hana_execution()
        • AutoUnsupervisedClustering.enable_hana_execution()
        • AutoUnsupervisedClustering.export_apply_code()
        • AutoUnsupervisedClustering.get_apl_version()
        • AutoUnsupervisedClustering.get_artifacts_recorder()
        • AutoUnsupervisedClustering.get_debrief_report()
        • AutoUnsupervisedClustering.get_fit_operation_log()
        • AutoUnsupervisedClustering.get_indicators()
        • AutoUnsupervisedClustering.get_model_info()
        • AutoUnsupervisedClustering.get_params()
        • AutoUnsupervisedClustering.get_predict_operation_log()
        • AutoUnsupervisedClustering.get_summary()
        • AutoUnsupervisedClustering.is_fitted()
        • AutoUnsupervisedClustering.load_model()
        • AutoUnsupervisedClustering.predict()
        • AutoUnsupervisedClustering.save_artifact()
        • AutoUnsupervisedClustering.save_model()
        • AutoUnsupervisedClustering.schedule_fit()
        • AutoUnsupervisedClustering.schedule_predict()
        • AutoUnsupervisedClustering.set_params()
        • AutoUnsupervisedClustering.set_scale_out()
      • AutoSupervisedClustering
        • AutoSupervisedClustering.set_params()
        • AutoSupervisedClustering.fit()
        • AutoSupervisedClustering.predict()
        • AutoSupervisedClustering.fit_predict()
        • AutoSupervisedClustering.get_metrics()
        • AutoSupervisedClustering.load_model()
        • AutoSupervisedClustering.disable_hana_execution()
        • AutoSupervisedClustering.enable_hana_execution()
        • AutoSupervisedClustering.export_apply_code()
        • AutoSupervisedClustering.get_apl_version()
        • AutoSupervisedClustering.get_artifacts_recorder()
        • AutoSupervisedClustering.get_debrief_report()
        • AutoSupervisedClustering.get_fit_operation_log()
        • AutoSupervisedClustering.get_indicators()
        • AutoSupervisedClustering.get_model_info()
        • AutoSupervisedClustering.get_params()
        • AutoSupervisedClustering.get_predict_operation_log()
        • AutoSupervisedClustering.get_summary()
        • AutoSupervisedClustering.is_fitted()
        • AutoSupervisedClustering.save_artifact()
        • AutoSupervisedClustering.save_model()
        • AutoSupervisedClustering.schedule_fit()
        • AutoSupervisedClustering.schedule_predict()
        • AutoSupervisedClustering.set_scale_out()
    • hana_ml.algorithms.apl.drift_detector
      • DriftDetector
        • DriftDetector.fit()
        • DriftDetector.detect()
        • DriftDetector.fit_detect()
        • DriftDetector.build_report()
        • DriftDetector.generate_html_report()
        • DriftDetector.generate_notebook_iframe_report()
        • DriftDetector.get_detect_operation_log()
        • DriftDetector.disable_hana_execution()
        • DriftDetector.enable_hana_execution()
        • DriftDetector.export_apply_code()
        • DriftDetector.get_apl_version()
        • DriftDetector.get_artifacts_recorder()
        • DriftDetector.get_debrief_report()
        • DriftDetector.get_fit_operation_log()
        • DriftDetector.get_indicators()
        • DriftDetector.get_model_info()
        • DriftDetector.get_params()
        • DriftDetector.get_predict_operation_log()
        • DriftDetector.get_summary()
        • DriftDetector.is_fitted()
        • DriftDetector.load_model()
        • DriftDetector.save_artifact()
        • DriftDetector.save_model()
        • DriftDetector.schedule_fit()
        • DriftDetector.schedule_predict()
        • DriftDetector.set_params()
        • DriftDetector.set_scale_out()
  • hana_ml.algorithms.pal package
    • Algorithms
      • PAL Base
        • PALBase
      • Auto ML
        • AutomaticClassification
        • AutomaticRegression
        • AutomaticTimeSeries
        • Preprocessing
        • MassiveAutomaticClassification
        • MassiveAutomaticRegression
        • MassiveAutomaticTimeSeries
      • Unified Interface
        • UnifiedClassification
        • UnifiedRegression
        • UnifiedClustering
        • UnifiedExponentialSmoothing
        • UnifiedTimeSeries
        • MassiveUnifiedTimeSeries
      • Clustering
        • AffinityPropagation
        • AgglomerateHierarchicalClustering
        • DBSCAN
        • GeometryDBSCAN
        • HDBSCAN
        • KMeans
        • KMedians
        • KMedoids
        • SpectralClustering
        • ConstrainedClustering
        • KMeansOutlier
        • GaussianMixture
        • SOM
        • SlightSilhouette
        • outlier_detection_kmeans
      • Classification
        • LinearDiscriminantAnalysis
        • LogisticRegression
        • OnlineMultiLogisticRegression
        • NaiveBayes
        • KNNClassifier
        • MLPClassifier
        • SVC
        • OneClassSVM
        • DecisionTreeClassifier
        • RDTClassifier
        • HybridGradientBoostingClassifier
        • MLPMultiTaskClassifier
      • Regression
        • LinearRegression
        • OnlineLinearRegression
        • KNNRegressor
        • MLPRegressor
        • PolynomialRegression
        • GLM
        • ExponentialRegression
        • BiVariateGeometricRegression
        • BiVariateNaturalLogarithmicRegression
        • CoxProportionalHazardModel
        • SVR
        • DecisionTreeRegressor
        • RDTRegressor
        • HybridGradientBoostingRegressor
        • MLPMultiTaskRegressor
      • Preprocessing
        • FeatureNormalizer
        • FeatureSelection
        • IsolationForest
        • KBinsDiscretizer
        • Imputer
        • Discretize
        • MDS
        • SMOTE
        • SMOTETomek
        • TomekLinks
        • Sampling
        • ImputeTS
        • PowerTransform
        • QuantileTransform
        • OutlierDetectionRegression
        • PCA
        • CATPCA
        • train_test_val_split
        • variance_test
      • Time Series
        • Unified Time Series
        • Automatic Time Series
        • ARIMA Family
        • Exponential Smoothing Family
        • Intermittent Demand Forecasting
        • Deep Learning
        • Time Series Explainability
        • Time Series Classification
        • Test and Diagnostics
        • Change and Anomaly Detection
        • Signal Processing and Spectral Analysis
        • Distance and Alignment
      • Statistics
        • bernoulli
        • beta
        • binomial
        • cauchy
        • chi_squared
        • exponential
        • gumbel
        • f
        • gamma
        • geometric
        • lognormal
        • negative_binomial
        • normal
        • pert
        • poisson
        • student_t
        • uniform
        • weibull
        • multinomial
        • mcmc
        • chi_squared_goodness_of_fit
        • chi_squared_independence
        • ttest_1samp
        • ttest_ind
        • ttest_paired
        • f_oneway
        • f_oneway_repeated
        • univariate_analysis
        • covariance_matrix
        • pearsonr_matrix
        • iqr
        • wilcoxon
        • median_test_1samp
        • grubbs_test
        • entropy
        • condition_index
        • cdf
        • ftest_equal_var
        • factor_analysis
        • kaplan_meier_survival_analysis
        • quantile
        • distribution_fit
        • ks_test
        • interval_quality
        • benford_analysis
        • pairwise_distances
        • KDE
      • Association
        • Apriori
        • AprioriLite
        • FPGrowth
        • KORD
        • SPM
      • Recommender System
        • ALS
        • FRM
        • FFMClassifier
        • FFMRegressor
        • FFMRanker
        • MLPRecommender
      • Social Network Analysis
        • LinkPrediction
        • PageRank
      • Ranking
        • SVRanking
      • Miscellaneous
        • abc_analysis
        • weighted_score_table
        • create_model_card
        • parse_model_card
        • create_dataset_card
        • TSNE
        • FairMLClassification
        • FairMLRegression
      • Metrics
        • accuracy_score
        • auc
        • confusion_matrix
        • multiclass_auc
        • r2_score
        • binary_classification_debriefing
      • Model and Pipeline
        • ParamSearchCV
        • GridSearchCV
        • RandomSearchCV
        • Pipeline
      • Text Processing
        • CRF
        • LatentDirichletAllocation
        • VectorPCA
      • PAL Scheduler
        • ScheduledExecution
      • Massive Interface
        • MassiveAutomaticClassification
        • MassiveAutomaticRegression
        • MassiveAutomaticTimeSeries
        • UnifiedClassification
        • UnifiedRegression
        • UnifiedClustering
        • UnifiedExponentialSmoothing
        • MassiveUnifiedTimeSeries
        • AdditiveModelForecast
        • ARIMA
        • AutoARIMA
        • Croston
        • CrostonTSB
        • fft
        • OnlineBCPD
        • OutlierDetectionTS
        • accuracy_measure
        • IsolationForest
    • Topics
      • Model Evaluation and Parameter Selection
        • Resampling Methods
        • Search Strategies
      • Successive Halving and Hyperband for Parameter Selection
        • Key Relevant Parameters
      • Biased Linear Models
      • Model State for Real-Time Scoring
      • Local Interpretability of Models
        • SHAP
        • Surrogate
        • Direct Explanation
        • Models/Algorithms in hana_ml.algorithms.pal Packages that Support Local Interpretability
        • Key Relevant Parameters in hana_ml.algorithms.pal Package
      • Explaining the Forecasts of ARIMA
      • Methods for Residual Extraction in Time-Series Outlier Detection
        • 1. Residual from Median Filter
        • 2. Residual from Seasonal Decomposition
        • 3. Residual Extraction from Median Filter and Seasonal Decomposition
        • 4. Meaningless Parameter Combination to be Avoided
      • Methods of Outlier Detection from Residual
        • 1. Z1 Score
        • 2. Z2 Score
        • 3. IQR Score
        • 4. MAD Score
        • 5. Isolation Forest Score
        • 6. DBSCAN
      • Genetic Optimization in AutoML
        • Individual Representation
        • Selection
        • Crossover
        • Mutation
        • Evolutional Iteration Step
        • Control Parameters
      • Probability Density Functions for MCMC Sampling
      • Miscellaneous Topics
        • Early Stop in HGBT
        • Feature Grouping in HGBT
        • Histogram Splitting in HGBT
        • Model Compression for Random Decision Trees
        • Model Compression for Support Vector Machine
        • Seasonalities in Additive Model Forecast
      • Precomputed Distance Matrix as Input Data in UnifiedClustering
      • Parameters for Missing Value Handling in HANA DataFrame
      • Permutation Feature Importance
      • Permutation Feature Importance for Time Series
        • Parameters
        • Returns
        • Examples
    • Parameter Mappings
  • hana_ml.visualizers package
    • hana_ml.visualizers.eda
      • quarter_plot()
      • seasonal_plot()
      • timeseries_box_plot()
      • bubble_plot()
      • parallel_coordinates()
      • plot_acf()
      • plot_pacf()
      • plot_time_series_outlier()
      • plot_change_points()
      • plot_moving_average()
      • plot_rolling_stddev()
      • plot_seasonal_decompose()
      • kdeplot()
      • hist()
      • plot_psd()
      • EDAVisualizer
        • EDAVisualizer.distribution_plot()
        • EDAVisualizer.pie_plot()
        • EDAVisualizer.correlation_plot()
        • EDAVisualizer.scatter_plot()
        • EDAVisualizer.bar_plot()
        • EDAVisualizer.box_plot()
        • EDAVisualizer.ax
        • EDAVisualizer.cmap
        • EDAVisualizer.reset()
        • EDAVisualizer.set_ax()
        • EDAVisualizer.set_cmap()
        • EDAVisualizer.set_size()
        • EDAVisualizer.size
      • Profiler
        • Profiler.description()
        • Profiler.set_size()
    • hana_ml.visualizers.metrics
      • MetricsVisualizer
        • MetricsVisualizer.plot_confusion_matrix()
        • MetricsVisualizer.ax
        • MetricsVisualizer.cmap
        • MetricsVisualizer.reset()
        • MetricsVisualizer.set_ax()
        • MetricsVisualizer.set_cmap()
        • MetricsVisualizer.set_size()
        • MetricsVisualizer.size
    • hana_ml.visualizers.m4_sampling
      • get_min_index()
      • get_max_index()
      • m4_sampling()
    • hana_ml.visualizers.model_debriefing
      • TreeModelDebriefing
        • TreeModelDebriefing.tree_debrief()
        • TreeModelDebriefing.tree_parse()
        • TreeModelDebriefing.tree_debrief_with_dot()
        • TreeModelDebriefing.tree_debrief_with_text()
        • TreeModelDebriefing.tree_export()
        • TreeModelDebriefing.tree_export_with_dot()
        • TreeModelDebriefing.tree_export_with_text()
        • TreeModelDebriefing.shapley_explainer()
    • hana_ml.visualizers.dataset_report
      • DatasetReportBuilder
        • DatasetReportBuilder.build()
        • DatasetReportBuilder.set_framework_version()
        • DatasetReportBuilder.get_report_html()
        • DatasetReportBuilder.get_iframe_report_html()
        • DatasetReportBuilder.generate_html_report()
        • DatasetReportBuilder.generate_notebook_iframe_report()
    • hana_ml.visualizers.shap
      • ShapleyExplainer
        • ShapleyExplainer.get_feature_value_and_effect()
        • ShapleyExplainer.get_force_plot_item()
        • ShapleyExplainer.get_beeswarm_plot_item()
        • ShapleyExplainer.get_bar_plot_item()
        • ShapleyExplainer.get_dependence_plot_items()
        • ShapleyExplainer.get_enhanced_dependence_plot_items()
        • ShapleyExplainer.force_plot()
        • ShapleyExplainer.summary_plot()
      • TimeSeriesExplainer
        • TimeSeriesExplainer.explain_arima_model()
        • TimeSeriesExplainer.explain_additive_model()
    • hana_ml.visualizers.unified_report
      • UnifiedReport
        • UnifiedReport.set_model_report_style()
        • UnifiedReport.set_dataset_report_style()
        • UnifiedReport.build()
        • UnifiedReport.set_metric_samplings()
        • UnifiedReport.tree_debrief()
        • UnifiedReport.display()
        • UnifiedReport.get_iframe_report()
    • hana_ml.visualizers.visualizer_base
      • forecast_line_plot()
    • hana_ml.visualizers.digraph
      • Node
      • InPort
      • OutPort
      • Edge
      • DigraphConfig
        • DigraphConfig.set_text_layout()
        • DigraphConfig.set_digraph_layout()
        • DigraphConfig.set_node_sep()
        • DigraphConfig.set_rank_sep()
      • BaseDigraph
        • BaseDigraph.add_model_node()
        • BaseDigraph.add_python_node()
        • BaseDigraph.add_edge()
      • Digraph
        • Digraph.to_json()
        • Digraph.build()
        • Digraph.generate_html()
        • Digraph.generate_notebook_iframe()
        • Digraph.add_edge()
        • Digraph.add_model_node()
        • Digraph.add_python_node()
      • MultiDigraph
        • MultiDigraph.ChildDigraph
        • MultiDigraph.add_child_digraph()
        • MultiDigraph.to_json()
        • MultiDigraph.build()
TomekLinks

class hana_ml.algorithms.pal.preprocessing.TomekLinks(distance_level=None, minkowski_power=None, thread_ratio=None, search_method=None, sampling_strategy=None, category_weights=None)

Performs under-sampling by removing Tomek's links.
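
For intuition: a Tomek link is a pair of samples from different classes that are each other's nearest neighbor. The following plain-Python sketch illustrates that condition and the removal of the majority-class member of each link. It is not the PAL implementation; the helper names `nearest_neighbor`, `find_tomek_links` and `remove_majority_members` are hypothetical, Euclidean distance is assumed, and neighbors are found by brute force.

```python
import math
from collections import Counter


def nearest_neighbor(points, i):
    """Index of the point closest to points[i] (Euclidean, brute force)."""
    return min(
        (j for j in range(len(points)) if j != i),
        key=lambda j: math.dist(points[i], points[j]),
    )


def find_tomek_links(points, labels):
    """Pairs (i, j), i < j, that are mutual nearest neighbors with different labels."""
    nn = [nearest_neighbor(points, i) for i in range(len(points))]
    return [
        (i, j)
        for i, j in enumerate(nn)
        if i < j and nn[j] == i and labels[i] != labels[j]
    ]


def remove_majority_members(points, labels):
    """Drop the majority-class member of every Tomek link (assumed 'majority' behaviour)."""
    majority = Counter(labels).most_common(1)[0][0]
    drop = {
        k
        for pair in find_tomek_links(points, labels)
        for k in pair
        if labels[k] == majority
    }
    keep = [k for k in range(len(points)) if k not in drop]
    return [points[k] for k in keep], [labels[k] for k in keep]


# Toy data: class "A" is the majority; two A/B pairs sit very close together.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.2, 1.0), (3.0, 3.0), (3.1, 3.0)]
labels = ["A", "A", "A", "B", "A", "B"]

links = find_tomek_links(points, labels)
new_points, new_labels = remove_majority_members(points, labels)
```

In this toy data the pair of points 0 and 1 are mutual nearest neighbors but share a label, so they do not form a link; only the two mixed-class pairs do.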

Parameters
distance_level : str, optional

Specifies the method used to compute the distance between data points.

  • 'manhattan'

  • 'euclidean'

  • 'minkowski'

  • 'chebyshev'

  • 'cosine'

Defaults to 'euclidean'.

minkowski_power : float, optional

Specifies the value of power for Minkowski distance calculation.

Defaults to 3.

Valid only when distance_level is 'minkowski'.
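
The distance options above correspond to standard metrics. As a rough illustration in plain Python (not PAL code), the sketch below computes each one for a pair of numeric vectors. The function name `distance` is hypothetical, and 'cosine' is assumed here to mean one minus the cosine similarity; PAL's exact definition may differ.

```python
import math


def distance(x, y, distance_level="euclidean", minkowski_power=3.0):
    """Distance between two equal-length numeric vectors (illustrative only)."""
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if distance_level == "manhattan":
        return sum(diffs)
    if distance_level == "euclidean":
        return math.sqrt(sum(d * d for d in diffs))
    if distance_level == "minkowski":
        # minkowski_power is only consulted for this metric.
        return sum(d ** minkowski_power for d in diffs) ** (1.0 / minkowski_power)
    if distance_level == "chebyshev":
        return max(diffs)
    if distance_level == "cosine":
        # Assumption: cosine distance = 1 - cosine similarity; vectors must be non-zero.
        dot = sum(a * b for a, b in zip(x, y))
        norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
        return 1.0 - dot / norm
    raise ValueError(f"unknown distance_level: {distance_level!r}")


x, y = (1.0, 0.0), (4.0, 4.0)
results = {
    name: distance(x, y, name)
    for name in ("manhattan", "euclidean", "minkowski", "chebyshev", "cosine")
}
```

With a power of 3, the Minkowski value falls between the Chebyshev and Euclidean values for the same pair of vectors.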

thread_ratio : float, optional

Adjusts the percentage of available threads to use, from 0 to 1. A value of 0 indicates the use of a single thread, while 1 implies the use of all currently available threads. Values outside this range are ignored, in which case the function heuristically determines the number of threads to use.

Defaults to 0.

search_method : str, optional

Specifies the search method used to find the k nearest neighbors.

  • 'brute-force'

  • 'kd-tree'

Defaults to 'brute-force'.

sampling_strategy : str, optional

Specifies the classes targeted by resampling:

  • 'majority' : resamples only the majority class

  • 'non-minority' : resamples all classes except the minority class

  • 'non-majority' : resamples all classes except the majority class

  • 'all' : resamples all classes

Defaults to 'majority'.
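
One way to read the four options is as a rule for choosing which classes are allowed to lose samples. The sketch below follows the option descriptions literally, using class counts. `target_classes` is a hypothetical helper, not part of hana-ml, and how PAL breaks ties between equally sized classes is not covered here.

```python
from collections import Counter


def target_classes(labels, sampling_strategy="majority"):
    """Set of class labels that resampling is allowed to touch (illustrative only)."""
    counts = Counter(labels)
    majority = max(counts, key=counts.get)  # most frequent class
    minority = min(counts, key=counts.get)  # least frequent class
    if sampling_strategy == "majority":
        return {majority}
    if sampling_strategy == "non-minority":
        return set(counts) - {minority}
    if sampling_strategy == "non-majority":
        return set(counts) - {majority}
    if sampling_strategy == "all":
        return set(counts)
    raise ValueError(f"unknown sampling_strategy: {sampling_strategy!r}")


# Three classes with 6, 3 and 1 samples respectively.
labels = ["A"] * 6 + ["B"] * 3 + ["C"] * 1
targets = {
    s: target_classes(labels, s)
    for s in ("majority", "non-minority", "non-majority", "all")
}
```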

category_weights : float, optional

Specifies the weight for categorical attributes.

Defaults to 0.707 if not provided.

Attributes
None

Methods

fit_transform(data[, key, label, ...])

Perform under-sampling on given datasets by removing Tomek's links.

Examples

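>>> from hana_ml.algorithms.pal.preprocessing import TomekLinks
>>> # df is assumed to be a hana_ml DataFrame with a class label column named 'TYPE'
>>> tomeklinks = TomekLinks(search_method='kd-tree',
...                         sampling_strategy='majority')
>>> res = tomeklinks.fit_transform(data=df, label='TYPE')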

fit_transform(data, key=None, label=None, categorical_variable=None, variable_weight=None)

Perform under-sampling on given datasets by removing Tomek's links.

Parameters
data : DataFrame

DataFrame that contains the data to be under-sampled.

key : str, optional

Specifies the name of the ID column.

If key is not provided, then:

  • if data is indexed by a single column, then key defaults to that index column;

  • otherwise, it is assumed that data contains no ID column.

label : str, optional

Specifies the dependent variable by name.

If not specified, defaults to the 1st non-key column in data.

categorical_variable : str or a list of str, optional

Specifies which INTEGER columns should be treated as categorical, with all other INTEGER columns treated as continuous.

No default value.

variable_weight : dict, optional

Specifies the weights of variables participating in distance calculation in a dictionary:

  • key : variable(column) name

  • value : weight for distance calculation

No default value.

Returns
DataFrame
  • Under-sampled result, with the same structure as the input data.

Inherited Methods from PALBase

Besides the methods described above, the TomekLinks class also inherits methods from the PALBase class. Refer to PAL Base for more details.


© Copyright 2026, SAP.
