“early_warnings” subpackage
This is the antiCPy.early_warnings subpackage. Up to now it contains a LangevinEstimation class that provides a few possibilities to parameterize the drift and diffusion function in terms of polynomials. The parameters of these functions can be estimated via Markov Chain Monte Carlo (MCMC) sampling and the drift slope can be calculated in a rolling window approach in order to detect changes in the resilience and noise level of a system.
- class antiCPy.early_warnings.langevin_estimation.LangevinEstimation(data, time, drift_model='3rd order polynomial', diffusion_model='constant', prior_type='Non-informative linear and Gaussian Prior', prior_range=None, scales=None, detrending_of_whole_dataset=None, gauss_filter_mode='reflect', gauss_filter_sigma=6, gauss_filter_order=0, gauss_filter_cval=0.0, gauss_filter_truncate=4.0, plot_detrending=False)[source]
The
LangevinEstimationclass includes tools to estimate a polynomial Langevin equation with various drift and diffusion terms and provides the drift slope \(\zeta\) as a resilience measure.- Parameters
data (One dimensional numpy array of floats) – A one dimensional numpy array containing the times series to analyse.
time (One dimensional numpy array of floats) – A one dimensional numpy array containing the time samples of the time series data.
drift_model (str) – Defines the drift model. Default is
'3rd order polynomial'. Additionally, a'first order polynomial'can be chosen.diffusion_model (str) – Defines the diffusion model. Default is
'constant'. Additionally, a'first order polynomial'can be chosen.prior_type – Defines the used prior to calculate the posterior distribution of the data. Default is
'Non-informative linear and Gaussian Prior'. A flat prior can be chosen with'Flat Prior'.prior_range (Two-dimensional numpy array of floats, optional) – Customize the prior range of the estimated polynomial parameters :math:` heta_i` starting for :math:` heta_0` in row with index 0 and upper limits in the first column. Default is
Nonewhich means prior ranges of -50 to 50 for the drift parameters and 0 to 50 for constant diffusion terms.scales (One-dimensional numpy array with two float entries) – Two tailed percentiles to create credibility bands of the estimated measures.
theta (One-dimensional numpy array of floats) – Array that contains the estimated drift and diffusion parameters with drift and ending with diffusion. The lowest order is mentioned first.
ndim (int) – Total number of parameters to estimate.
nwalkers (int) – Number of MCMC chains.
nsteps (int) – Length of Markov chains.
nburn (int) – Length of burn in period of the MCMC chains.
data_size (int) – Length of the data array.
dt (float) – Time intervall of the equidistant time array.
drift_slope (One-dimensional numpy array of floats) – Array with the current drift slope estimate in the 0th component. Component 1, 2 and 3, 4 contain the upper and lower bound of the credibility intervals defined by the
scalesvariable.noise_level_estimate (One-dimenional numpy array of floats) – Array with the noise level estimate in the 0th component. Component 1, 2 and 3, 4 contain the upper and lower bound of the credibility intervals defined by the
scalesvariable.loop_range (One-dimensional (
data_size - window_size / window_shift + 1) numpy array of integers) – Array with the start indices of the time array for each rolled time window.window_size (int) – Size of rolling windows.
data_window (One-dimensional numpy array of floats) – Array that contains data of a certain interval of the dataset in
data.time_window (One-dimensional numpy array of floats) – Array that contains time of a certain interval of the dataset in
data.increments (One-dimensional numpy array of floats) – Array that contains the increments from \(t\) to \(t+1\).
joint_samples (Two-dimensional (
ndim, num_of_joint_samples) numpy array of floats) – Array that contains drawed parameter samples from their joint posterior probability density function that is estimated from the data in the currentdata_window.slope_estimates (One-dimensional numpy array of floats) – Estimates of the drift slope on the current
data_window.fixed_point_estimate (float) – Fixed point estimate calculated as the mean of the current
data_window.starting_guesses (Two-dimensional numpy array of floats) – Array that contains the initial guesses of the MCMC sampling.
theta_array (Two-dimensional numpy array of floats) – Array that contains the drift and diffusion parameters sampled with MCMC.
slope_storage (Two-dimensional (
5, loop_range.size) numpy array of floats) – Array that contains the drift slope estimates \(\hat{\zeta}\) of the rolling windows. The columns encode each rolling window with the drift slope estimate \(\hat{\zeta}\) in the 0th row and in rows 1,2 and 3,4 the upper and lower bound of the credibility intervals defined byscales.noise_level_storage (Two-dimensional (
5, loop_range.size) numpy array of floats) – Array that contains the noise level estimates \(\hat{\sigma}\) of the rolling windows. The columns encode each rolling window with the noise level estimate \(\hat{\sigma}\) in the 0th row and in rows 1,2 and 3,4 the upper and lower bound of the credibility intervals defined byscales.slope_kernel_density_obj (
scipykernel density object of thescipy.gaussian_kde(...)function.) – Kernel density object ofscipy.gaussian_kde(...)of the slope posterior created with a Gaussian kernel and a bandwidth computed by silverman’s rule.noise_kernel_density_obj – Kernel density object of
scipy.gaussian_kde(...)of the noise level posterior created with a Gaussian kernel and a bandwidth computed by silverman’s rule.slow_trend (One-dimensional numpy array of floats.) – Contains the subtracted slow trend if a detrending is applied to the whole data or each window separately.
detrending_of_whole_dataset (str) – Default is
None. If'linear'the wholedatais detrended linearly. IfGaussian kernel smootheda Gaussian smoothed curve is subtracted from the wholedata. The parameters of the kernel smoothing can be set bygauss_filter_mode,gauss_filter_sigma,gauss_filter_order,gauss_filter_cvalandgauss_filter_truncate.gauss_filter_mode (str) – According to the
scipy.ndimage.filters.gaussian_filteroption.gauss_filter_sigma (float) – According to the
scipy.ndimage.filters.gaussian_filteroption.gauss_filter_order (int) – According to the
scipy.ndimage.filters.gaussian_filteroption.gauss_filter_cval (float) – According to the
scipy.ndimage.filters.gaussian_filteroption.gauss_filter_truncate (float) – According to the
scipy.ndimage.filters.gaussian_filteroption.plot_detrending (bool) – Default is
False. IfTrue, theself.dataas well as theself.slow_trendand the detrended version are shown.
- static detrend(time, data, detrend_mode='linear', gauss_filter_mode='reflect', gauss_filter_sigma=6, gauss_filter_order=0, gauss_filter_cval=0.0, gauss_filter_truncate=4.0, plot_detrending=False)[source]
Method to apply a linear or Gaussian kernel smoothed detrending to the whole data or to each data window separately.
- Parameters
time (One-dimensional numpy array of floats.) – Time points of the time series.
data (One-dimensional numpy array of floats.) – Time series that is to be detrended.
detrend_mode (str) – Each window is linearly detrended for
detrend = 'linear'and with a Gaussian kernel filter viadetrend = 'gauss_kernel'. The linear detrending is the default option.gauss_filter_mode (str) – According to the
scipy.ndimage.filters.gaussian_filteroption.gauss_filter_sigma (float) – According to the
scipy.ndimage.filters.gaussian_filteroption.gauss_filter_order (int) – According to the
scipy.ndimage.filters.gaussian_filteroption.gauss_filter_cval (float) – According to the
scipy.ndimage.filters.gaussian_filteroption.gauss_filter_truncate (float) – According to the
scipy.ndimage.filters.gaussian_filteroption.
- Returns
The first return object contains the detrended data, the second one the subtracted slow trend.
- Return type
One-dimensional numpy arrays.
- third_order_polynom_slope_in_fixed_point()[source]
Calculate the slope_estimates of a third order polynomial drift function with joint_samples of the posterior distribution around a fixed point estimate given as the data mean of the current window.
- static calc_D1(theta, data, drift_model)[source]
Returns the drift parameterized by a first or third order polynomial for input data. Static function in order to speed up computation time via
numba.jit(nopython = True).
- static calc_D2(theta, data, diffusion_model)[source]
Returns the diffusion parameterized by a constant or first order polynomial for input data. Static function in order to speed up computation time via
numba.jit(nopython = True).
- log_prior()[source]
Returns the logarithmic prior probability for a given set of parameters based on the short time propagator depending on the drift and diffusion models and the given
prior_rangeof the parameterstheta. A flat prior for the parameters apart from restriction to positive constant diffusion and a non-informative prior for linear parts and Gaussian priors for higher orders are implemented. The standard deviation of the Gaussian priors is given by thescalesvariable.
- log_likelihood()[source]
Returns the logarithmic likelihood of the data for the given model parametrization.
- log_posterior(theta)[source]
Returns the logarithmic posterior probability of the data for the given model parametrization.
- neg_log_posterior(theta)[source]
Returns the negative logarithmic posterior distribution of the data for the given model parametrization.
- declare_MAP_starting_guesses(nwalkers=None, nsteps=None)[source]
Declare the maximum a posterior (MAP) starting guesses for a MCMC sampling with
nwalkersandnsteps.
- compute_posterior_samples(print_AC_tau, ignore_AC_error, thinning_by, print_progress)[source]
Compute the
theta_arraywith \(nwalkers \cdot nsteps\) Markov Chain Mone Carlo samples. Ifignore_AC_error = Falsethe calculation will terminate with error if the autocorrelation of the sampled chains is to high compared to the chain length. Otherwise the highest autocorrelation length will be used to thin the sampled chains. In order to run tests in shorter time you can setignore_AC_error = Trueand define athinning_byn steps. Ifprint_AC_tau = Truethe autocorrelation lengths of the sampled chains is printed.
- static init()[source]
Defines the animation background for easy visualization of a rolling window scan of a given time series. It prepares plots for the time series and the time evolution of the drift slope estimate \(\hat{\zeta}\) and noise level estimate \(\hat{\sigma}\) and its probability density estimate. The method needs to be static. Some variables have to be declared global in to guarantee the functionality of the animation procedure.
- animation(i, slope_grid, noise_grid, nwalkers, nsteps, nburn, n_joint_samples, n_slope_samples, n_noise_samples, cred_percentiles, print_AC_tau, ignore_AC_error, thinning_by, print_progress, detrending_per_window, gauss_filter_mode, gauss_filter_sigma, gauss_filter_order, gauss_filter_cval, gauss_filter_truncate, plot_detrending)[source]
Function that is called iteratively by the
matplotlib.animation.FuncAnimation(...)tool to generate an animation of the rolling window scan of a time series. The time series, the rolling windows, the time evolution of the drift slope estimate \(\hat{\zeta}\) and the noise level estimate \(\hat{\sigma}\) and its posterior probality density is shown in the animation.
- calc_driftslope_noise(slope_grid, noise_grid, nwalkers=50, nsteps=10000, nburn=200, n_joint_samples=50000, n_slope_samples=50000, n_noise_samples=50000, cred_percentiles=[16, 1], print_AC_tau=False, ignore_AC_error=False, thinning_by=60, print_progress=False, detrending_per_window=None, gauss_filter_mode='reflect', gauss_filter_sigma=6, gauss_filter_order=0, gauss_filter_cval=0.0, gauss_filter_truncate=4.0, plot_detrending=False)[source]
Calculates the drift slope estimate \(\hat{\zeta}\) and the noise level \(\hat{\sigma}\) from the MCMC sampled parameters of a Langevin model for a given
window_sizeand givenwindow_shift. In the course of the computations the parametersjoint_samples, anoise_kernel_density_obj, aslope_kernel_density_obj, thedrift_slopewith credibility intverals, thenoise_levelwith credibility intervals and in case of a third order polynomial drift the estimated fixed_point_estimate of the window will be stored.- Parameters
slope_grid (One-dimensional numpy array of floats) – Array on which the drift slope kernel density estimate is evaluated.
noise_grid (One-dimensional numpy array of floats) – Array on which the noise level kernel density estimate is evaluated.
nwalkers (int) – Number of walkers that are initialized for the MCMC sampling via the package
emcee. Default is 50.nsteps (int) – Length of the sampled MCMC chains. Default is 10000.
nburn (int) – Number of data points at the beginning of the Markov chains which are discarded in terms of a burn in period. Default is 200.
n_joint_samples (int) – Number of joint samples that are drawn from the estimated joint posterior probability in order to calculate the drift slope estimate \(\zeta\) and corresponding credibility bands. Default is 50000.
n_slope_samples (int) – Number of drift slope samples that are drawn from the estimated drift slope posterior computed based on the sampling results of the joint posterior probability of the Langevin parameters :math:` heta_i`. Default is 50000.
n_noise_samples (int) – Number of constant noise level samples that are drawn from the estimated marignal probility distribution of the diffusion model is chosen constant. Default is 50000.
cred_percentiles (One-dimensional numpy array of integers) – Two entries to define the percentiles of the calculated credibility bands of the estimated parameters. Default is
numpy.array([16,1]).print_AC_tau – If
Truethe estimated autocorrelation lengths of the Markov chains is shown. The maximum length is used for thinning of the chains. Default isFalse.ignore_AC_error (bool) – If
Truethe autocorrelation lengths of the Markov chains is not estimated and thus, it is not checked whether the chains are too short to give an reliable autocorrelation estimate. This avoids error interruption of the procedure, but can lead to unreliable results if the chains are too short. The option should be chosen in order to save computation time, e.g. when debugging your code with short Markov chains. Default isFalse.thinning_by (int) – If
ignore_AC_error = Truethe chains are thinned bythinning_by. Everythinning_by-th data point is stored. Default is 60.print_progress (bool) – If
Truethe progress of the MCMC sampling is shown. Default isFalse.detrending_per_window (str) – Default is
None. If'linear', theself.data_windowis detrended linearly. IfGaussian kernel smoothed, a Gaussian smoothed curve is subtracted fromself.data_window. The parameters of the kernel smoothing can be set bygauss_filter_mode,gauss_filter_sigma,gauss_filter_order,gauss_filter_cvalandgauss_filter_truncate. Drift slope and noise level are estimated after detrending in these cases.gauss_filter_mode (str) – According to the
scipy.ndimage.filters.gaussian_filteroption.gauss_filter_sigma (float) – According to the
scipy.ndimage.filters.gaussian_filteroption.gauss_filter_order (int) – According to the
scipy.ndimage.filters.gaussian_filteroption.gauss_filter_cval (float) – According to the
scipy.ndimage.filters.gaussian_filteroption.gauss_filter_truncate (float) – According to the
scipy.ndimage.filters.gaussian_filteroption.plot_detrending (bool) – Default is
False. IfTrue, theself.dataas well as theself.slow_trendand the detrended version are shown.
- perform_resilience_scan(window_size, window_shift, slope_grid, noise_grid, nwalkers=50, nsteps=10000, nburn=200, n_joint_samples=50000, n_slope_samples=50000, n_noise_samples=50000, cred_percentiles=[16, 1], print_AC_tau=False, ignore_AC_error=False, thinning_by=60, print_progress=False, slope_save_name='default_save_slopes', noise_level_save_name='default_save_noise', save=True, create_animation=False, ani_save_name='default_animation_name', animation_title='', mark_critical_point=None, mark_noise_level=None, detrending_per_window=None, gauss_filter_mode='reflect', gauss_filter_sigma=6, gauss_filter_order=0, gauss_filter_cval=0.0, gauss_filter_truncate=4.0, plot_detrending=False)[source]
Performs an automated window scan with defined
window_shiftover the whole time series. In each window the drift slope and noise level estimates with corresponding credibility bands are computed and saved in theslope_storageand thenoise_level_storage. It can also be used to create an animation of the sliding window approach plotting the time series, the moving window, and the time evolution of the drift slope estimates \(\hat{\zeta}\), the noise level \(\hat{\sigma}\) and the noise kernel density estimate. The start indices of the shifted windows is also saved in order to facilitate customized plots.- Parameters
window_size (int) – Time window size.
window_shift (int) – The rolling time window is shifted about
window_shiftdata points.slope_grid (One-dimensional numpy array of floats) – Array on which the drift slope kernel density estimate is evaluated.
noise_grid (One-dimensional numpy array of floats) – Array on which the noise level kernel density estimate is evaluated.
nwalkers (int) – Number of walkers that are initialized for the MCMC sampling via the package
emcee. Default is 50.nsteps (int) – Length of the sampled MCMC chains. Default is 10000.
nburn (int) – Number of data points at the beginning of the Markov chains which are discarded in terms of a burn in period. Default is 200.
n_joint_samples (int) – Number of joint samples that are drawn from the estimated joint posterior probability in order to calculate the drift slope estimate \(\zeta\) and corresponding credibility bands. Default is 50000.
n_slope_samples (int) – Number of drift slope samples that are drawn from the estimated drift slope posterior computed based on the sampling results of the joint posterior probability of the Langevin parameters :math:` heta_i`. Default is 50000.
n_noise_samples (int) – Number of constant noise level samples that are drawn from the estimated marignal probility distribution of the diffusion model is chosen constant. Default is 50000.
cred_percentiles (One-dimensional numpy array of integers) – Two entries to define the percentiles of the calculated credibility bands of the estimated parameters. Default is
numpy.array([16,1]).print_AC_tau – If
Truethe estimated autocorrelation lengths of the Markov chains is shown. The maximum length is used for thinning of the chains. Default isFalse.ignore_AC_error (bool) – If
Truethe autocorrelation lengths of the Markov chains is not estimated and thus, it is not checked whether the chains are too short to give an reliable autocorrelation estimate. This avoids error interruption of the procedure, but can lead to unreliable results if the chains are too short. The option should be chosen in order to save computation time, e.g. when debugging your code with short Markov chains. Default isFalse.thinning_by (int) – If
ignore_AC_error == Truethe chains are thinned bythinning_by. Everythinning_by-th data point is stored. Default is 60.print_progress (bool) – If
Truethe progress of the MCMC sampling is shown. Default isFalse.slope_save_name (str) – Name of the file in which the
slope_storagearray will be saved. Default isdefault_save_slopes.noise_level_save_name (str) – Name of the file in which the
noise_level_storagearray will be saved. Default isdefault_save_noise.save (bool) – If
Truetheslope_storageand thenoise_level_storagearrays are saved in the end of the scan in an .npy file. Default isTrue.create_animation (bool) –
If
Truean automated animation of the time evolution of the drift slope estimate :math:hat{zeta}`, the noise level \(\hat{\sigma}\) and the noise kernel density is shown together with the time series and rolling windows. Default isFalse.Warning
It makes use of global variables to incorporate the movie animation tool in the class. The
create_animationparameter is only intended to easily reconstruct the plot results of the related publication. In other circumstances it should not be used in order to avoid conflicts with the global variable names.ani_save_name (str) – If
create_animation = Truethe animation is saved in a .mp4 file with this name. Default isdefault_animation_name.animation_title (str) – A title can be given to the animation.
mark_critical_point (float) – A red dotted vertical line is shown at time
mark_critical_point. Default isNone.mark_noise_level (float) – A green dotted line is shown at the noise level
mark_noise_levelin the noise kernel density plot and the time evolution plot of the noise level. Default isNone.