mcmc
- hana_ml.algorithms.pal.random.mcmc(conn_context, distribution, location=0.0, scale=1.0, shape=1.0, dof=1.0, chain_iter=None, random_state=None, init_radius=None, adapt=None, warmup=None, thin=None, adapt_gamma=None, adapt_delta=None, adapt_kappa=None, adapt_offset=None, adapt_init_buffer=None, adapt_term_buffer=None, adapt_window=None, stepsize=None, stepsize_jitter=None, max_depth=None, mu=None, sigma=None, xi=None, alpha=None, beta=None, nu=None, omega=None, L=None, y_min=None, lambda_=None)
Given a distribution, this function generates samples of the distribution using Markov chain Monte Carlo simulation.
- Parameters:
- conn_context : ConnectionContext
Connection to the HANA database.
- distribution : str
Specifies the name of the distribution.
Valid options include:
'normal' : normal distribution
'skew_normal' : skew normal distribution
'student_t' : student-t distribution
'cauchy' : Cauchy distribution
'laplace' : Laplace distribution
'logistic' : Logistic distribution
'gumbel' : Gumbel distribution
'exponential' : Exponential distribution
'chi_square' : Chi-square distribution
'invchi_square' : Inverse Chi-square distribution
'gamma' : Gamma distribution
'weibull' : Weibull distribution
'frechet' : Frechet distribution
'rayleigh' : Rayleigh distribution
'multinormal' : Multivariate normal distribution
'multinormalprec' : Multivariate normal distribution with precision parameterization
'multinormalcholesky' : Multivariate normal distribution with Cholesky parameterization
'multistudent_t' : Multivariate student-t distribution
'dirichlet' : Dirichlet distribution
'lognormal' : Lognormal distribution
'invgamma' : Inverse Gamma distribution
'beta' : Beta distribution
'pareto' : Pareto distribution
'lomax' : Lomax distribution
For the parameterized probability density function (PDF) of each distribution listed above, see Probability Density Functions.
- location : float, optional (deprecated)
Specifies the location parameter for a distribution.
Valid when distribution is set to one of the following values: 'normal', 'skew_normal', 'student_t', 'cauchy', 'laplace', 'logistic'.
Defaults to 0.
This parameter is deprecated; use xi for the skew normal distribution and mu for the other valid distributions instead.
- scale : float, optional (deprecated)
Specifies the scale parameter for a distribution.
Valid only when distribution is set to one of the following values: 'normal', 'skew_normal', 'student_t', 'cauchy', 'laplace', 'logistic', 'gumbel', 'exponential'.
Defaults to 1.0.
This parameter is deprecated; use omega for the skew normal distribution, beta for the Gumbel and exponential distributions (the value must be inverted for the exponential distribution), and sigma for the other valid distributions instead.
- shape : float, optional (deprecated)
Specifies the shape parameter for a distribution.
Valid only when distribution is set to 'skew_normal'.
Defaults to 1.0.
This parameter is deprecated; use alpha instead.
- dof : float, optional (deprecated)
Specifies the degrees of freedom of a distribution.
Valid only when distribution is 'student_t' or 'chi_square'.
Defaults to 1.0.
This parameter is deprecated; use nu instead.
- chain_iter : int, optional
Specifies the number of iterations for each Markov chain, including warm-up.
Defaults to 2000.
- random_state : int, optional
Specifies the seed used to initialize the random number generator. A value of 0 uses the current system time as the seed; any other value is used as the seed directly.
Defaults to 0.
- init_radius : float, optional
Specifies the radius to initialize the process.
Defaults to 2.0.
- adapt : bool, optional
Specifies whether or not to use adaptation.
Defaults to True.
- warmup : int, optional
Specifies the number of warm-up iterations.
Defaults to half of chain_iter.
- thin : int, optional
Specifies the period for saving samples.
Defaults to 1.
- adapt_gamma : float, optional
Specifies the regularization scale for adaptation; must be positive.
Not valid when adapt is False.
Defaults to 0.05.
- adapt_delta : float, optional
Specifies the target Metropolis acceptance rate; must be between 0 and 1 (inclusive of both limits).
Not valid when adapt is False.
Defaults to 0.8.
- adapt_kappa : float, optional
Specifies the relaxation exponent; must be positive.
Not valid when adapt is False.
Defaults to 0.75.
- adapt_offset : float, optional
Specifies the adaptation iteration offset; must be positive.
Not valid when adapt is False.
Defaults to 10.0.
- adapt_init_buffer : int, optional
Specifies the width of the initial fast adaptation interval.
Not valid when adapt is False.
Defaults to 75.
- adapt_term_buffer : int, optional
Specifies the width of the terminal (final) fast adaptation interval.
Not valid when adapt is False.
Defaults to 50.
- adapt_window : int, optional
Specifies the initial width of the slow adaptation interval.
Not valid when adapt is False.
Defaults to 25.
- stepsize : float, optional
Specifies the value for discretizing the time interval.
Defaults to 1.0.
- stepsize_jitter : float, optional
Specifies the uniform random jitter of step-size.
Defaults to 0.
- max_depth : int, optional
Specifies the maximum tree depth.
Defaults to 10.
- mu : float, list or numpy.ndarray, optional
Specifies the value of parameter \(\mu\) in a probability density function.
Mandatory and valid only when the corresponding distribution takes \(\mu\) as a parameter in its probability density function. See distribution for more details.
- sigma : float, list or numpy.ndarray, optional
Specifies the value of parameter \(\sigma\) or \(\Sigma\) in a probability density function.
Mandatory and valid only when the corresponding distribution takes \(\sigma\) or \(\Sigma\) as a parameter in its probability density function. See distribution for more details.
- xi : float, optional
Specifies the value of parameter \(\xi\) for the probability density function of skew normal distribution.
Valid only when distribution is 'skew_normal'.
- alpha : float, list or numpy.ndarray, optional
Specifies the value of parameter \(\alpha\) in a probability density function.
Mandatory and valid only when the corresponding distribution takes \(\alpha\) as a parameter in its probability density function. See distribution for more details.
- beta : float, optional
Specifies the value of parameter \(\beta\) in a probability density function.
Mandatory and valid only when the corresponding distribution takes \(\beta\) as a parameter in its probability density function. See distribution for more details.
- nu : float, optional
Specifies the value of parameter \(\nu\) in a probability density function.
Mandatory and valid only when the corresponding distribution takes \(\nu\) as a parameter in its probability density function. See distribution for more details.
- omega : float, list or numpy.ndarray, optional
Specifies the value of parameter \(\omega\) or \(\Omega\) in a probability density function.
Mandatory and valid only when the corresponding distribution takes \(\omega\) or \(\Omega\) as a parameter in its probability density function. See distribution for more details.
- L : list or numpy.ndarray, optional
Specifies the value of parameter L in the probability density function of multivariate normal distribution with Cholesky parameterization. It should be a lower triangular matrix provided in the form of either a list or a numpy.ndarray.
Mandatory and valid only when distribution is 'multinormalcholesky'.
- y_min : float, optional
Specifies the value of parameter \(y_{min}\) in the Pareto distribution.
Mandatory and valid only when distribution is 'pareto'.
- lambda_ : float, optional
Specifies the value of parameter \(\lambda\) in the Lomax distribution.
Mandatory and valid only when distribution is 'lomax'.
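The mapping from the deprecated arguments (location, scale, shape, dof) to their replacements, as described in the parameter list above, can be sketched as a small helper. This is an illustration only: translate_deprecated_args is a hypothetical name and not part of hana_ml, and the sketch assumes that inverting the value for the exponential distribution means taking the reciprocal of scale.

```python
# Hypothetical helper, NOT part of hana_ml: translates the deprecated
# arguments into the parameter names documented as their replacements.

LOCATION_DISTS = {"normal", "skew_normal", "student_t",
                  "cauchy", "laplace", "logistic"}
SCALE_DISTS = LOCATION_DISTS | {"gumbel", "exponential"}
DOF_DISTS = {"student_t", "chi_square"}


def translate_deprecated_args(distribution, location=None, scale=None,
                              shape=None, dof=None):
    """Return a dict of current parameter names for the given deprecated ones.

    Deprecated arguments that are not valid for the distribution are ignored.
    """
    new_args = {}
    if location is not None and distribution in LOCATION_DISTS:
        # xi for skew normal, mu for the other valid distributions.
        key = "xi" if distribution == "skew_normal" else "mu"
        new_args[key] = location
    if scale is not None and distribution in SCALE_DISTS:
        if distribution == "skew_normal":
            new_args["omega"] = scale
        elif distribution == "gumbel":
            new_args["beta"] = scale
        elif distribution == "exponential":
            # Assumption: "inverted" means the reciprocal of scale.
            new_args["beta"] = 1.0 / scale
        else:
            new_args["sigma"] = scale
    if shape is not None and distribution == "skew_normal":
        new_args["alpha"] = shape
    if dof is not None and distribution in DOF_DISTS:
        new_args["nu"] = dof
    return new_args
```

Under these assumptions, translate_deprecated_args('exponential', scale=2.0) returns {'beta': 0.5}, and the resulting dictionary could be passed to mcmc as keyword arguments.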
- Returns:
- DataFrame
Samples of the specified distribution generated by the Markov chain Monte Carlo process.
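To illustrate what chain_iter, warmup and thin mean for a chain of samples, the following toy random-walk Metropolis sampler targets a normal distribution in pure Python. It is a sketch under stated assumptions and not the algorithm used by this function: the name metropolis_normal, the step argument, and the rule for which draws are kept are all illustrative, and the number of rows returned by mcmc may be determined differently.

```python
import math
import random


def metropolis_normal(chain_iter=2000, warmup=None, thin=1,
                      mu=0.0, sigma=1.0, step=1.0, seed=1):
    """Toy random-walk Metropolis sampler for a normal(mu, sigma) target."""
    if warmup is None:
        # Mirrors the documented default: half of chain_iter.
        warmup = chain_iter // 2
    rng = random.Random(seed)

    def log_density(x):
        # Log of the normal PDF, up to an additive constant.
        return -0.5 * ((x - mu) / sigma) ** 2

    x = mu  # start the chain at the mode
    kept = []
    for i in range(chain_iter):
        proposal = x + rng.uniform(-step, step)
        # Accept with probability min(1, p(proposal) / p(x)).
        log_ratio = log_density(proposal) - log_density(x)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x = proposal
        # Illustrative rule: drop warm-up draws, then keep every thin-th draw.
        if i >= warmup and (i - warmup) % thin == 0:
            kept.append(x)
    return kept


samples = metropolis_normal(chain_iter=2000, thin=10)
```

In this toy version, 2000 iterations with the default warm-up of 1000 and thin=10 leave 100 retained draws.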
Examples
The following code shows how to generate MCMC samples from a Student's t distribution with specified distribution parameters.
>>> from hana_ml.algorithms.pal.random import mcmc
>>> res = mcmc(conn, distribution='student_t', mu=0, sigma=1,
...            nu=1, chain_iter=50, thin=10, init_radius=0)
>>> res.collect()
   ID   SAMPLES
0   0 -1.728452
1   1  1.575337
2   2  1.185957
3   3  4.913828
4   4  0.220282
5   5 -5.588809
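A Student's t distribution with nu = 1 is a Cauchy distribution, whose mean is undefined, so the median is a more meaningful summary of such samples than the mean. The sketch below uses only the Python standard library on the values printed above; with a live connection, the same values would come from the SAMPLES column of the collected result. The variable names are illustrative.

```python
import statistics

# Sample values copied from the example output above.
samples = [-1.728452, 1.575337, 1.185957, 4.913828, 0.220282, -5.588809]

# The median is robust to the heavy tails of the Cauchy distribution,
# whereas the sample mean does not converge for this target.
sample_median = statistics.median(samples)
print(sample_median)
```

With only six retained draws the estimate is very rough; a longer chain (larger chain_iter) would be needed for a stable summary.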