mcmc

hana_ml.algorithms.pal.random.mcmc(conn_context, distribution, location=0.0, scale=1.0, shape=1.0, dof=1.0, chain_iter=None, random_state=None, init_radius=None, adapt=None, warmup=None, thin=None, adapt_gamma=None, adapt_delta=None, adapt_kappa=None, adapt_offset=None, adapt_init_buffer=None, adapt_term_buffer=None, adapt_window=None, stepsize=None, stepsize_jitter=None, max_depth=None, mu=None, sigma=None, xi=None, alpha=None, beta=None, nu=None, omega=None, L=None, y_min=None, lambda_=None)

Generates samples from a specified distribution using Markov chain Monte Carlo (MCMC) simulation.

Parameters
conn_context : ConnectionContext

Connection to HANA database.

distribution : str

Specifies the name of the distribution.

Valid options include:

  • 'normal' : normal distribution

  • 'skew_normal' : skew normal distribution

  • 'student_t' : student-t distribution

  • 'cauchy' : Cauchy distribution

  • 'laplace' : Laplace distribution

  • 'logistic' : Logistic distribution

  • 'gumbel' : Gumbel distribution

  • 'exponential' : Exponential distribution

  • 'chi_square' : Chi-square distribution

  • 'invchi_square' : Inverse Chi-square distribution

  • 'gamma' : Gamma distribution

  • 'weibull' : Weibull distribution

  • 'frechet' : Frechet distribution

  • 'rayleigh' : Rayleigh distribution

  • 'multinormal' : Multivariate normal distribution

  • 'multinormalprec' : Multivariate normal distribution with precision parameterization

  • 'multinormalcholesky' : Multivariate normal distribution with Cholesky parameterization

  • 'multistudent_t' : Multivariate student-t distribution

  • 'dirichlet' : Dirichlet distribution

  • 'lognormal' : Lognormal distribution

  • 'invgamma' : Inverse Gamma distribution

  • 'beta' : Beta distribution

  • 'pareto' : Pareto distribution

  • 'lomax' : Lomax distribution

For the parameterized probability density function (PDF) of each distribution listed above, see Probability Density Functions.

location : float, optional (deprecated)

Specifies the location parameter for a distribution.

Valid only when distribution is set to one of the following values: 'normal', 'skew_normal', 'student_t', 'cauchy', 'laplace', 'logistic'.

Defaults to 0.

This parameter is deprecated. Use xi for the skew normal distribution and mu for the other valid distributions instead.

scale : float, optional (deprecated)

Specifies the scale parameter for a distribution.

Valid only when distribution is set to one of the following values: 'normal', 'skew_normal', 'student_t', 'cauchy', 'laplace', 'logistic', 'gumbel', 'exponential'.

Defaults to 1.0.

This parameter is deprecated. Use omega for the skew normal distribution, beta for the Gumbel and exponential distributions (for the exponential distribution the value must be inverted), and sigma for the other valid distributions instead.

shape : float, optional (deprecated)

Specifies the shape parameter for a distribution.

Valid only when distribution is set to 'skew_normal'.

Defaults to 1.0.

This parameter is deprecated. Use alpha instead.

dof : float, optional (deprecated)

Specifies the degrees of freedom of a distribution.

Valid only when distribution is 'student_t' or 'chi_square'.

Defaults to 1.0.

This parameter is deprecated. Use nu instead.
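
The replacement rules for the four deprecated parameters can be collected in one place. The sketch below is a hypothetical helper (`migrate_deprecated_params` is not part of hana_ml) that applies only the mappings stated above. It assumes that "inverted" for the exponential distribution means taking the reciprocal of scale.

```python
def migrate_deprecated_params(distribution, location=None, scale=None,
                              shape=None, dof=None):
    """Map deprecated mcmc() arguments to their replacements.

    Hypothetical helper for illustration; not part of hana_ml.
    """
    location_dists = {'normal', 'skew_normal', 'student_t',
                      'cauchy', 'laplace', 'logistic'}
    scale_dists = location_dists | {'gumbel', 'exponential'}
    new_params = {}

    # location -> xi (skew normal) or mu (other valid distributions)
    if location is not None and distribution in location_dists:
        key = 'xi' if distribution == 'skew_normal' else 'mu'
        new_params[key] = location

    # scale -> omega, beta, or sigma depending on the distribution
    if scale is not None and distribution in scale_dists:
        if distribution == 'skew_normal':
            new_params['omega'] = scale
        elif distribution == 'gumbel':
            new_params['beta'] = scale
        elif distribution == 'exponential':
            # Assumption: "inverted" means the reciprocal of scale.
            new_params['beta'] = 1.0 / scale
        else:
            new_params['sigma'] = scale

    # shape -> alpha (skew normal only)
    if shape is not None and distribution == 'skew_normal':
        new_params['alpha'] = shape

    # dof -> nu (student-t and chi-square only)
    if dof is not None and distribution in {'student_t', 'chi_square'}:
        new_params['nu'] = dof

    return new_params
```

The returned dictionary could then be passed on as keyword arguments, for example `mcmc(conn, distribution='exponential', **migrate_deprecated_params('exponential', scale=2.0))`, where `conn` is an existing ConnectionContext.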

chain_iter : int, optional

Specifies the number of iterations for each Markov chain, including warm-up.

Defaults to 2000.

random_state : int, optional

Specifies the seed used to initialize the random number generator. A value of 0 uses the current system time as the seed; any other value is used as the seed directly.

Defaults to 0.

init_radius : float, optional

Specifies the radius to initialize the process.

Defaults to 2.0.

adapt : bool, optional

Specifies whether or not to use adaptation.

Defaults to True.

warmup : int, optional

Specifies the number of warm-up iterations.

Defaults to half of chain_iter.

thin : int, optional

Specifies the period for saving samples.

Defaults to 1.

adapt_gamma : float, optional

Specifies the regularization scale for adaptation. Must be positive.

Not valid when adapt is False.

Defaults to 0.05.

adapt_delta : float, optional

Specifies the target Metropolis acceptance rate. Must be between 0 and 1, inclusive.

Not valid when adapt is False.

Defaults to 0.8.

adapt_kappa : float, optional

Specifies the relaxation exponent, must be positive.

Not valid when adapt is False.

Defaults to 0.75.

adapt_offset : float, optional

Specifies the adaptation iteration offset. Must be positive.

Not valid when adapt is False.

Defaults to 10.0.
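
The source does not describe how these four values are used. Their names and defaults (0.05, 0.8, 0.75, 10) resemble the dual-averaging step-size adaptation commonly paired with the No-U-Turn Sampler, so the sketch below assumes that scheme. It is an illustration of how such parameters typically interact, not a description of the PAL implementation. The function name and the `accept_stats` input are hypothetical.

```python
import math


def dual_averaging_stepsize(accept_stats, stepsize=1.0, adapt_delta=0.8,
                            adapt_gamma=0.05, adapt_kappa=0.75,
                            adapt_offset=10.0):
    """Return an adapted step size from per-iteration acceptance statistics.

    Assumption: standard dual averaging; PAL may differ in detail.
    """
    if not accept_stats:
        return stepsize

    mu = math.log(10.0 * stepsize)  # point the iterates are shrunk toward
    h_bar = 0.0                     # running average of (target - observed)
    log_eps_bar = 0.0               # averaged log step size

    for t, accept in enumerate(accept_stats, start=1):
        # adapt_offset damps the influence of early iterations
        weight = 1.0 / (t + adapt_offset)
        h_bar = (1.0 - weight) * h_bar + weight * (adapt_delta - accept)

        # adapt_gamma controls how strongly the iterate is pulled toward mu
        log_eps = mu - math.sqrt(t) / adapt_gamma * h_bar

        # adapt_kappa sets how quickly older iterates are forgotten
        eta = t ** (-adapt_kappa)
        log_eps_bar = eta * log_eps + (1.0 - eta) * log_eps_bar

    return math.exp(log_eps_bar)
```

Under this scheme, acceptance statistics that stay above adapt_delta push the step size up, and statistics that stay below it push the step size down.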

adapt_init_buffer : int, optional

Specifies the width of the initial fast adaptation interval.

Not valid when adapt is False.

Defaults to 75.

adapt_term_buffer : int, optional

Specifies the width of the terminal (final) fast adaptation interval.

Not valid when adapt is False.

Defaults to 50.

adapt_window : int, optional

Specifies the initial width of the slow adaptation interval.

Not valid when adapt is False.

Defaults to 25.
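
The constraints stated for adapt_gamma, adapt_delta, adapt_kappa and adapt_offset can be checked on the client side before calling mcmc. The sketch below is a hypothetical helper (`check_adapt_settings` is not part of hana_ml) that encodes only the rules given above. How mcmc itself reports violations is not covered by this section.

```python
def check_adapt_settings(adapt=True, adapt_gamma=None, adapt_delta=None,
                         adapt_kappa=None, adapt_offset=None):
    """Return a list of problems with the given adaptation settings.

    Hypothetical helper for illustration; not part of hana_ml.
    """
    supplied = {
        name: value
        for name, value in {
            "adapt_gamma": adapt_gamma,
            "adapt_delta": adapt_delta,
            "adapt_kappa": adapt_kappa,
            "adapt_offset": adapt_offset,
        }.items()
        if value is not None
    }

    # None of the adapt_* values are valid once adaptation is switched off.
    if not adapt:
        return [f"{name} is not valid when adapt is False" for name in supplied]

    problems = []
    for name in ("adapt_gamma", "adapt_kappa", "adapt_offset"):
        if name in supplied and supplied[name] <= 0:
            problems.append(f"{name} must be positive")

    # adapt_delta is an acceptance rate, so both limits are allowed.
    if "adapt_delta" in supplied and not 0.0 <= supplied["adapt_delta"] <= 1.0:
        problems.append("adapt_delta must lie in [0, 1]")

    return problems
```

An empty list means the supplied values satisfy every documented constraint.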

stepsize : float, optional

Specifies the value for discretizing the time interval.

Defaults to 1.0.

stepsize_jitter : float, optional

Specifies the uniform random jitter of the step size.

Defaults to 0.

max_depth : int, optional

Specifies the maximum tree depth.

Defaults to 10.

mu : float, list or numpy.ndarray, optional

Specifies the value of parameter \(\mu\) in a probability density function.

Mandatory and valid only when the corresponding distribution takes \(\mu\) as a parameter in its probability density function. See distribution for more details.

sigma : float, list or numpy.ndarray, optional

Specifies the value of parameter \(\sigma\) or \(\Sigma\) in a probability density function.

Mandatory and valid only when the corresponding distribution takes \(\sigma\) or \(\Sigma\) as a parameter in its probability density function. See distribution for more details.

xi : float, optional

Specifies the value of parameter \(\xi\) for the probability density function of the skew normal distribution.

Valid only when distribution is 'skew_normal'.

alpha : float, list or numpy.ndarray, optional

Specifies the value of parameter \(\alpha\) in a probability density function.

Mandatory and valid only when the corresponding distribution takes \(\alpha\) as a parameter in its probability density function. See distribution for more details.

beta : float, optional

Specifies the value of parameter \(\beta\) in a probability density function.

Mandatory and valid only when the corresponding distribution takes \(\beta\) as a parameter in its probability density function. See distribution for more details.

nu : float, optional

Specifies the value of parameter \(\nu\) in a probability density function.

Mandatory and valid only when the corresponding distribution takes \(\nu\) as a parameter in its probability density function. See distribution for more details.

omega : float, list or numpy.ndarray, optional

Specifies the value of parameter \(\omega\) or \(\Omega\) in a probability density function.

Mandatory and valid only when the corresponding distribution takes \(\omega\) or \(\Omega\) as a parameter in its probability density function. See distribution for more details.

L : list or numpy.ndarray, optional

Specifies the value of parameter L in the probability density function of multivariate normal distribution with Cholesky parameterization. It should be a lower triangular matrix provided in the form of either a list or a numpy.ndarray.

Mandatory and valid only when distribution is 'multinormalcholesky'.
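
A lower triangular matrix for L is usually obtained by factoring a covariance matrix. The sketch below assumes the conventional Cholesky parameterization, in which the covariance matrix equals L multiplied by its transpose; the source only states that L must be lower triangular. It uses plain Python lists so that it runs without extra dependencies, and `cholesky_lower` is a hypothetical helper name. With NumPy available, `numpy.linalg.cholesky` returns the same lower triangular factor.

```python
import math


def cholesky_lower(sigma):
    """Return the lower triangular Cholesky factor of a symmetric,
    positive definite matrix given as a list of lists.

    Hypothetical helper for illustration; not part of hana_ml.
    """
    n = len(sigma)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Contribution of the columns already computed
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                diag = sigma[i][i] - partial
                if diag <= 0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(diag)
            else:
                lower[i][j] = (sigma[i][j] - partial) / lower[j][j]
    return lower


# Covariance matrix chosen for illustration only
factor = cholesky_lower([[4.0, 2.0], [2.0, 3.0]])
```

Here `factor` is a list of lists with zeros above the diagonal, which matches the list form accepted for L.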

y_min : float, optional

Specifies the value of parameter \(y_{min}\) in the Pareto distribution.

Mandatory and valid only when distribution is 'pareto'.

lambda_ : float, optional

Specifies the value of parameter \(\lambda\) in the Lomax distribution.

Mandatory and valid only when distribution is 'lomax'.

Returns
DataFrame

Samples of the specified distribution, generated by the Markov chain Monte Carlo process.

Examples

The following code generates MCMC samples from a student_t distribution with the specified distribution parameters.

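>>> from hana_ml.algorithms.pal.random import mcmc
>>> # conn is an existing ConnectionContext to the HANA database
>>> res = mcmc(conn, distribution='student_t', mu=0, sigma=1,
...            nu=1, chain_iter=50, thin=10, init_radius=0)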
>>> res.collect()
   ID   SAMPLES
0   0 -1.728452
1   1  1.575337
2   2  1.185957
3   3  4.913828
4   4  0.220282
5   5 -5.588809