| Title: | Spatial Structural Change Detection by an Analysis of Variability Between Blocks of Observations |
|---|---|
| Description: | Provides methods to detect structural changes in time series or random fields (spatial data). Focus is on the detection of abrupt changes or trends in independent data, but the package also provides a function to de-correlate data with dependence. The functions are based on the test suggested in Schmidt (2024) <DOI:10.3150/23-BEJ1686> and the work in Görz and Fried (2025) <DOI:10.48550/arXiv.2512.11599>. |
| Authors: | Sheila Goerz [aut, cre], Roland Fried [ctb, ths] |
| Maintainer: | Sheila Goerz <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 0.1.1 |
| Built: | 2026-05-26 15:51:53 UTC |
| Source: | https://github.com/sgoerz/schangeblock |
Estimates the autocovariance matrix for a given data matrix X. Via the parameter direction, it is possible to estimate only
row- or columnwise autocovariance matrices, which is useful if the autocovariance function is separable.
autocov(X, b, M = as.integer(c(1, 1)), direction = 0L, type = 0L)autocov(X, b, M = as.integer(c(1, 1)), direction = 0L, type = 0L)
X |
numeric matrix, |
b |
numeric vector containing two integer values: the bandwidths for the row- resp. column-wise estimation.
(Up to which lag should the autocovariances be estimated?) If |
M |
numeric vector containing two integer values, only needed for |
direction |
0: all directions, 1: only row-wise autocovariances, 2: only column-wise autocovariances. |
type |
0: ordinary autocovariance estimation, 1: difference-based autocovariance estimation. See Details. |
In this function, the autocovariance matrix of X is interpreted as the autocovariance matrix of
x = as.vector(X), i.e. where X is ordered into a vector column-wise.
If type = 0, the autocovariance to lags is estimated using the regular estimator
If type = 1, the autocovariance to lags is estimated by a difference-based version, inspired by
the estimator of Tecuapetla-Gómez and Munk (2017) for time series:
with
where M[1], M[2].
A numeric matrix of size . If direction = 0 then prod(dim(X)).
If direction = 1 then ncol(X), if direction = 2 then nrow(X).
Tecuapetla‐Gómez, I., & Munk, A. (2017). Autocovariance estimation in regression with a discontinuous signal and m‐dependent errors: A difference‐based approach. Scandinavian Journal of Statistics, 44(2), 346-368.
X <- genField(c(20, 20)) autocov(X, c(4, 4))[1:10, 1:100] # if separable: Sigma1 <- autocov(X, 4, direction = 1) Sigma2 <- autocov(X, 4, direction = 2) kronecker(Sigma1, Sigma2)[1:10, 1:100]X <- genField(c(20, 20)) autocov(X, c(4, 4))[1:10, 1:100] # if separable: Sigma1 <- autocov(X, 4, direction = 1) Sigma2 <- autocov(X, 4, direction = 2) kronecker(Sigma1, Sigma2)[1:10, 1:100]
Calculate MSE-optimal bandwidths according to Andrews (1991).
bandwidth(X, p1 = 0.3, p2 = 0.3, lag = 1)bandwidth(X, p1 = 0.3, p2 = 0.3, lag = 1)
X |
numeric vector or matrix. |
p1, p2
|
exponents for sample size n resp. estimated dependency, between 0 and 1. |
lag |
lag to which the autocorrelations are to be estimated. Integer > 0 but smaller than the length resp. number of rows and columns of X. |
Bandwidth is estimated via
where and are the mean row- resp. column-wise Spearman autocorrelations to lag 1.
A numeric vector containing one or two elements, depending on if a vector or matrix is supplied. In case of a matrix: the first value is the bandwidth for the row-wise and the second one for the column-wise estimation.
Andrews, D. W. (1991). “Heteroskedasticity and autocorrelation consistent covariance matrix estimation”. In: Econometrica: Journal of the Econometric Society, pp. 817–858.
X1 <- genField(c(50, 50), Theta = genTheta(1, 0.4)) bandwidth(X1, 0.3, 0.3) Theta <- matrix(c(0.08, 0.1, 0.08, 0.8, 1, 0.8, 0.08, 0.1, 0.08), ncol = 3) X2 <- genField(c(50, 50), Theta = Theta) bandwidth(X2, 1/3, 2/3)X1 <- genField(c(50, 50), Theta = genTheta(1, 0.4)) bandwidth(X1, 0.3, 0.3) Theta <- matrix(c(0.08, 0.1, 0.08, 0.8, 1, 0.8, 0.08, 0.1, 0.08), ncol = 3) X2 <- genField(c(50, 50), Theta = Theta) bandwidth(X2, 1/3, 2/3)
Returns the p-value of a test statistic according to the block test for structural changes.
block_pValue(tn, fun = "gmd")block_pValue(tn, fun = "gmd")
tn |
test statistic |
fun |
Character string; one of "gmd" (default), "var", "jb", "grubbs", "ANOVA", "ad", "sw" (see |
A numeric value between 0 and 1.
Computes test statistics of the block test on structural changes.
block_stat(x, s, fun = "gmd", varEstim = var)block_stat(x, s, fun = "gmd", varEstim = var)
x |
times series or random field to be tested. Either a numeric vector or a numeric matrix. |
s |
parameter for the size of the blocks, 0.5 < s < 1, block length |
fun |
Character string; one of "gmd" (default), "var", "jb", "grubbs", "ANOVA". |
varEstim |
variance estimator or variance estimation of the whole field or times series. Either a function to estimate the variance with, or a numeric value. |
First, the time series or random field is divided into blocks and the means of
the blocks are computed. Then the function fun is applied to the block means:
gmd: Gini's mean difference
var: Ordinary variance estimator
jb: Jarque-Bera test
grubbs: Grubbs test for outliers
ANOVA: simple ANOVA.
A numeric value.
For fun = "grubbs" it has the attribute n
indicating the number of blocks, i.e. the number of observations used in the Grubbs test.
For fun = "ANOVA" it has the attributes k (number of blocks) and
N (total number of observations).
# time series with a shift x <- arima.sim(model = list(ar = 0.5), n = 100) x[1:50] <- x[1:50] + 1 block_stat(x, sOpt(100, 0.6)) # field without shift and ordinary variance X <- genField(c(50, 50)) block_stat(X, sOpt(50, 0.6), "var") # field with a shift and ordinary variance X <- genField(c(50, 50), type = 2) block_stat(X, sOpt(50, 0.6), "var") # GMD test statistic, scaling variance estimated by the mad block_stat(X, 0.6, fun = "var", varEstim = mad)# time series with a shift x <- arima.sim(model = list(ar = 0.5), n = 100) x[1:50] <- x[1:50] + 1 block_stat(x, sOpt(100, 0.6)) # field without shift and ordinary variance X <- genField(c(50, 50)) block_stat(X, sOpt(50, 0.6), "var") # field with a shift and ordinary variance X <- genField(c(50, 50), type = 2) block_stat(X, sOpt(50, 0.6), "var") # GMD test statistic, scaling variance estimated by the mad block_stat(X, 0.6, fun = "var", varEstim = mad)
A test to detect whether an underlying time series or random field is stationary (hypothesis) or if there is a location shift present in a region (alternative).
block_test(x, s, fun = "gmd", varEstim = var)block_test(x, s, fun = "gmd", varEstim = var)
x |
times series or random field to be tested. Either a numeric vector or a numeric matrix. |
s |
parameter for the size of the blocks, 0.5 < s < 1, block length |
fun |
Character string; one of "gmd" (default), "var", "jb", "grubbs", "ANOVA", "ad", "sw" (see |
varEstim |
variance estimator or variance estimation of the whole field or times series. Either a function to estimate the variance with, or a numeric value. |
A list of the class "htest" containing the following components:
statistic |
value of the test statistic (numeric). |
p.value |
p-value (numeric). |
alternative |
alternative hypothesis (character string). |
method |
name of the performed test (character string). |
data.name |
name of the data (character string). |
Görz, S. and Fried, R. (2025). "Detecting changes in the mean of spatial random fields on a regular grid", arXiv preprint arXiv:2512.11599
Schmidt, S. K. (2024). Detecting changes in the trend function of heteroscedastic time series. Bernoulli, 30(4), 2598-2622.
# time series with a shift x <- arima.sim(model = list(ar = 0.5), n = 100) x[1:50] <- x[1:50] + 1 block_test(x, sOpt(100, 0.6)) # field without a shift and ordinary variance X <- genField(c(50, 50)) block_test(X, sOpt(50, 0.6), "var") # field with a shift and ordinary variance X <- genField(c(50, 50), type = 2) block_test(X, sOpt(50, 0.6), "var") #' # GMD test statistic, scaling variance estimated by the mad block_stat(X, 0.6, fun = "var", varEstim = mad)# time series with a shift x <- arima.sim(model = list(ar = 0.5), n = 100) x[1:50] <- x[1:50] + 1 block_test(x, sOpt(100, 0.6)) # field without a shift and ordinary variance X <- genField(c(50, 50)) block_test(X, sOpt(50, 0.6), "var") # field with a shift and ordinary variance X <- genField(c(50, 50), type = 2) block_test(X, sOpt(50, 0.6), "var") #' # GMD test statistic, scaling variance estimated by the mad block_stat(X, 0.6, fun = "var", varEstim = mad)
Generates the indices for different types of change regions.
changeRegion(n, s, type = 1L, middle, delta = 0.15, distFun = dist)changeRegion(n, s, type = 1L, middle, delta = 0.15, distFun = dist)
n |
dimensions of the random field. Numeric vector. |
s |
parameter for the size of the blocks, 0.5 < s < 1, block length |
type |
change region type (integer or character). See "Details". |
middle, delta, distFun
|
parameters for type 4L. See "Details". |
Change region types:
1L or "1a": exactly one block is shifted
"1b": size of the shift region is one block, but the shift region lies in two blocks
"1c": size of the shift region is one block, but the shift region lies in four blocks
2L: exactly half of the data is shifted
3L: there is a steady increase from left to right
4L: shift is at
5L: for demonstration purposes: the field is divided into 10 "columns". Every other column is shifted
6L: shift is at (upper triangle)
7L: a "circle" including everything within distFun delta from middle is shifted
Types 1a, 1b, 1c, 2, 4, 5
|
A vector of indices. |
Type 3 |
A numeric (n x n) matrix containing the heights of the shifts at the corresponding locations. |
changeRegion(c(50, 50), 0.6, "1a") changeRegion(c(50, 50), type = 2) changeRegion(c(50, 50), type = 3L) changeRegion(c(50, 50), type = 7L, middle = c(10, 10))changeRegion(c(50, 50), 0.6, "1a") changeRegion(c(50, 50), type = 2) changeRegion(c(50, 50), type = 3L) changeRegion(c(50, 50), type = 7L, middle = c(10, 10))
De-correlates a random field or time series, so that the resulting values can be treated as independent.
decorr(X, lags, method = 1L, separable = FALSE, M = 1, type = 0)decorr(X, lags, method = 1L, separable = FALSE, M = 1, type = 0)
X |
Random Field, numeric matrix, or time series |
lags |
numeric vector containing two integer values: the bandwidths for the row- reps. column-wise autocovariance estimation. (Up to which lag should the autocovariances be estimated?) Lags must be smaller than the dimensions of X. |
method |
1L: square root of the matrix via |
separable |
if the autocovariance function is (assumed to be) separable in the two directions of X, those two autocovariances can be estimated separately and then combined (after square root and inversion) as a Kronecker product. |
M |
numeric vector containing two integer values, only needed for |
type |
0: ordinary autocovariance estimation, 1: difference-based autocovariance estimation. See |
The contents of X are ordered into a vector column-wise. The autocovariance matrix of is estimated by autocov().
is taken the square root of and being inverted using the functions specified in method. Then
Then is ordered back into a matrix with the same dimension as .
De-correlated random field or time series; same data type and size as input X.
x <- arima.sim(list(ar = 0.4), 200) y <- decorr(x, 3) oldpar <- par(mfrow = c(2, 2)) acf(x) pacf(x) acf(y) pacf(y) par(oldpar) X <- genField(c(20, 20), Theta = genTheta(1, 0.4)) Y <- decorr(X, c(2, 2)) # custom function: require(robcp) Y2 <- decorr(X, c(2, 2), method = function(x) modifChol(solve(x)))x <- arima.sim(list(ar = 0.4), 200) y <- decorr(x, 3) oldpar <- par(mfrow = c(2, 2)) acf(x) pacf(x) acf(y) pacf(y) par(oldpar) X <- genField(c(20, 20), Theta = genTheta(1, 0.4)) Y <- decorr(X, c(2, 2)) # custom function: require(robcp) Y2 <- decorr(X, c(2, 2), method = function(x) modifChol(solve(x)))
Generate random fields on a regular grid. Dependency can be modeled to some extend.
genField( n, distr = rnorm, type = 0L, H = 100, Theta = NULL, q = NULL, param = NULL, ... )genField( n, distr = rnorm, type = 0L, H = 100, Theta = NULL, q = NULL, param = NULL, ... )
n |
dimensions of the requested random field. Numeric vector. |
distr |
function specifying the error distribution. |
type |
change region type (integer or character). See |
H |
height of the location shift (numeric value). |
Theta |
matrix specifying the dependency between observations of a 2-dim random field. Has to be a 2-dim. matrix where there is an odd number of entries in both dimensions. Explanations are given in genTheta. |
q |
dependency parameter for the model order of the MA field (integer > 0). |
param |
dependency parameters for the model parameters of the MA field (numeric vector.)
Both |
... |
additional arguments for the generation of |
Set dim(X) and . The dependent random field follows the equation:
where .
Only for type = 3L the term is replaced by .
The function returns a matrix of dimension n that has the class "RandomField".
genField(c(50, 50)) genField(c(50, 50), type = 3, Theta = genTheta(1, 0.2))genField(c(50, 50)) genField(c(50, 50), type = 3, Theta = genTheta(1, 0.2))
This function generates a symmetric dependency matrix of a
specific type of spatial MA(q) model.
genTheta(q, param, structure = "MA")genTheta(q, param, structure = "MA")
q |
model order (integer). |
param |
parameter (numeric value between -1 and 1). |
structure |
Character string, either "MA" or "AR" indicating the structure of the dependency matrix. Details below. |
Symmetric spatial MA(q) model (or an approximation to a spatial AR(1) model) for 2-dim. random fields:
.
For "MA":
For "AR":
A matrix of size (2q + 1) x (2q + 1).
genTheta(1, 0.2, "MA") genTheta(40, 0.2, "AR")genTheta(1, 0.2, "MA") genTheta(40, 0.2, "AR")
Calculates Gini's mean difference of a given vector x.
GMD(x)GMD(x)
x |
numeric vector. |
A numeric value.
x <- rnorm(100) GMD(x)x <- rnorm(100) GMD(x)
Computes the test statistic and the critical value of the outlier test according to Grubbs.
grubbs(x, varEstim = var) crit.grubbs(n, alpha = 0.05)grubbs(x, varEstim = var) crit.grubbs(n, alpha = 0.05)
x |
numeric vector. |
varEstim |
Variance estimation or estimation function. Either a numeric value or a function taking one argument. |
n |
sample size; positive numeric value. |
alpha |
significance level; between 0 and 1. |
numeric value.
x <- rnorm(100) grubbs(x) > crit.grubbs(100, 0.05) # add outlier x[1] <- x[1] + 100 grubbs(x) > crit.grubbs(100, 0.05)x <- rnorm(100) grubbs(x) > crit.grubbs(100, 0.05) # add outlier x[1] <- x[1] + 100 grubbs(x) > crit.grubbs(100, 0.05)
This function returns a vector of the block means for a given random field X.
Mu(x, group = NULL, l = NULL)Mu(x, group = NULL, l = NULL)
x |
Numeric vector or matrix. |
group |
strictly positive integer vector or matrix indicating the group (or block) of the corresponding observation in X.
Overwrites |
l |
block length. Integer vector of length 1 or 2, depending on the number of dimensions of X, with strictly positive entries. |
A numeric vector of length floor(n[1] / l[1]) * floor(n[2] / l[2]), n = dim(x).
X <- genField(c(50, 100), H = 100, type = 2) M <- Mu(X, l = c(10, 20)) plot(X) image(matrix(M, ncol = 5))X <- genField(c(50, 100), H = 100, type = 2) M <- Mu(X, l = c(10, 20)) plot(X) image(matrix(M, ncol = 5))
Plot random field
## S3 method for class 'RandomField' plot(x, main = "", colors, name = "value", alpha = 1, p = NULL, ...)## S3 method for class 'RandomField' plot(x, main = "", colors, name = "value", alpha = 1, p = NULL, ...)
x |
|
main |
title to the plot. |
colors |
vector of colors over which a gradient is created. |
name |
title of the legend, character string. |
alpha |
opacity of the raster tiles. Numeric value between 0 and 1. |
p |
additional ggplot2 component added BEFORE this function's components. |
... |
other parameters to be passed through to plotting functions. |
a ggplot2 object. Function is mostly called for its side effect.
plot(genField(c(50, 50))) plot(genField(c(50, 50), type = 2))plot(genField(c(50, 50))) plot(genField(c(50, 50), type = 2))
Generates a random sample from a mixture normal distribution.
rmix(n, q = 0.01, h = 10, sigma = 1)rmix(n, q = 0.01, h = 10, sigma = 1)
n |
sample size, numeric. |
q |
probability that observation is drawn from the contamination distribution, numeric. |
h |
mean of the contamination distribution, numeric. |
sigma |
standard deviation of the contamination distribution, numeric. |
The resulting sample is drawn from the distribution
Numeric vector of length n containing the random sample.
# random sample with 0.01 chance of contamination distribution with mean 10 rmix(100) # random sample with 0.01 chance of contamination distribution with standard deviation 10 # IMPORTANT: h needs to be set to 0! rmix(100, h = 0, sigma = 1)# random sample with 0.01 chance of contamination distribution with mean 10 rmix(100) # random sample with 0.01 chance of contamination distribution with standard deviation 10 # IMPORTANT: h needs to be set to 0! rmix(100, h = 0, sigma = 1)
Compute the skewness and the kurtosis of the data vector x.
skewness(x) kurtosis(x)skewness(x) kurtosis(x)
x |
numeric vector. |
A numeric value.
skewness(rnorm(100)) skewness(rexp(100, 2)) kurtosis(rnorm(100)) kurtosis(rt(100, 5))skewness(rnorm(100)) skewness(rexp(100, 2)) kurtosis(rnorm(100)) kurtosis(rt(100, 5))
Calculates the best parameter for a given approximation s, such that .
sSizes(n, lower = 0.5, upper = 1, step = 0.1) sOpt(n, s = 0.6)sSizes(n, lower = 0.5, upper = 1, step = 0.1) sOpt(n, s = 0.6)
n |
Sample size(s), numeric (vector). |
lower, upper
|
lower and upper search border, between 0 and 1. |
step |
size of the step for the search, between 0 and 1. |
s |
Desired exponent, |
sSizes returns a data frame containing
n |
the given sample size |
s |
the exponent in question |
ln |
the resulting block length |
bn |
the corresponding number of block |
ln.bn |
block length times number of blocks |
diff |
difference between the given sample size and the number of observations covered by the blocks |
sOpt returns a numeric vector of the optimal exponent(s).
sSizes(50) sSizes(50, 0.6, 0.8, 0.01) sOpt(50, 0.6) sOpt(100, 0.6)sSizes(50) sSizes(50, 0.6, 0.8, 0.01) sOpt(50, 0.6) sOpt(100, 0.6)