06 May, 2022
Fabricated data:
\[ g(E(y)) = \beta_0 + x_1\beta_1 \]
dat.lm = lm(y~x, data=dat)
dat2.lm = lm(y~x, data=dat2)
autoplot(list(dat.lm, dat2.lm), which=1)
grid.arrange(
    ggpredict(dat.lm, ~x) %>% plot(rawdata=TRUE),
    ggpredict(dat2.lm, ~x) %>% plot(rawdata=TRUE),
    nrow=1
    )
\[g(E(y)) = \beta_0 + x_1\beta_1 + x_1^2\beta_2 + x_1^3\beta_3 \]
dat.lm2 = lm(y~x+I(x^2)+I(x^3), data=dat)
dat2.lm2 = lm(y~x+I(x^2)+I(x^3), data=dat2)
autoplot(list(dat.lm2, dat2.lm2), which=1)
grid.arrange(
    ggpredict(dat.lm2, ~x) %>% plot(rawdata=TRUE),
    ggpredict(dat2.lm2, ~x) %>% plot(rawdata=TRUE),
    nrow=1)
Can we:
Splines defined by basis functions.
\[ g(E(y)) = \beta_0 + \sum^k_{j=1} b_j(x_1)\beta_j \]
## # A tibble: 5 × 5
##   term                          estimate std.error statistic  p.value
##   <chr>                            <dbl>     <dbl>     <dbl>    <dbl>
## 1 (Intercept)                      1.71      0.300     5.71  8.45e- 7
## 2 bs(x, knots = k, degree = 3)1    2.95      0.600     4.92  1.21e- 5
## 3 bs(x, knots = k, degree = 3)2   -7.49      0.515   -14.5   1.27e-18
## 4 bs(x, knots = k, degree = 3)3   -0.199     0.562    -0.354 7.25e- 1
## 5 bs(x, knots = k, degree = 3)4   -4.66      0.392   -11.9   1.65e-15
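The coefficient table above has one row per basis function: four `bs(x, knots = k, degree = 3)` terms, which is what a cubic B-spline basis with a single interior knot produces. The following is a minimal sketch of that idea, not the original analysis. The data (`x`, `y`) and the knot location `k = 0.5` are assumptions made up for illustration.

```r
library(splines)

set.seed(1)
# Hypothetical data; NOT the dataset used in the slides
x <- seq(0, 1, length.out = 50)
y <- sin(2 * pi * x) + rnorm(50, sd = 0.3)

# One interior knot (assumed location) with a cubic basis
# gives 3 + 1 = 4 basis functions b_1(x), ..., b_4(x)
k <- 0.5
B <- bs(x, knots = k, degree = 3)
dim(B)

# Each column of B is one basis function; lm() estimates its weight beta_j
fit <- lm(y ~ bs(x, knots = k, degree = 3))
length(coef(fit))   # intercept plus four basis coefficients

# The fitted curve is beta_0 plus the weighted sum of the basis functions
manual <- coef(fit)[1] + B %*% coef(fit)[-1]
all.equal(as.numeric(manual), as.numeric(fitted(fit)))
```

Because the spline is just a set of extra columns in the design matrix, the model is still fitted by ordinary least squares; only the number and placement of knots control its flexibility.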
\[ g(E(y)) = \beta_0 + \sum^k_{j=1}{b_j(x_1)\beta_j} \]
Smoothing penalty (maximize penalized likelihood):
\[ 2L(\boldsymbol{\beta}) - \boldsymbol{\lambda} (\boldsymbol{\beta}^T\boldsymbol{S}\boldsymbol{\beta}) \]
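To make the penalty term concrete, the sketch below fits a penalized smooth with `mgcv::gam()` and evaluates \(\lambda\,\boldsymbol{\beta}^T\boldsymbol{S}\boldsymbol{\beta}\) from the fitted object. The data are simulated for illustration, and the choice of `bs = "cr"`, `k = 10` and the fixed value `sp = 1e4` are assumptions, not settings taken from the slides. Note that mgcv stores a rescaled version of the penalty matrix, so the numerical value is specific to that scaling.

```r
library(mgcv)

set.seed(2)
# Hypothetical data for illustration only
x <- runif(200)
y <- sin(2 * pi * x) + rnorm(200, sd = 0.3)

# Penalized fit; the smoothing parameter lambda is estimated by REML
m <- gam(y ~ s(x, bs = "cr", k = 10), method = "REML")

sm     <- m$smooth[[1]]                          # the single smooth term
beta   <- coef(m)[sm$first.para:sm$last.para]    # its coefficients
S      <- sm$S[[1]]                              # its penalty matrix (as stored by mgcv)
lambda <- m$sp                                   # estimated smoothing parameter

# Wiggliness penalty: lambda * beta' S beta
penalty <- lambda * drop(t(beta) %*% S %*% beta)
penalty

# Fixing a much larger lambda (assumed value) penalizes wiggliness harder,
# which shows up as fewer effective degrees of freedom
m_stiff <- gam(y ~ s(x, bs = "cr", k = 10), sp = 1e4)
c(estimated = sum(m$edf), stiff = sum(m_stiff$edf))
```

As \(\lambda\) grows the fit is pulled towards the functions that \(\boldsymbol{S}\) does not penalize (here, a straight line); as it shrinks towards zero the fit approaches the unpenalized regression spline.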
mgcv

| Basis function | Name | Properties |
|---|---|---|
| tp | Thin plate spline | Low rank (far fewer parameters than data), isotropic (equal smoothing in any direction) regression splines |
| ts | Thin plate spline | Thin plate spline with penalties on the null space such that it can be shrunk to zero. Useful in combination with by= as this does not penalize the null space. |
| cr | Cubic regression spline | Splines with knots spread evenly throughout the covariate domain |
| cc | Cyclic cubic regression spline | Cubic regression splines with ends that meet (have the same second order derivatives) |
| ps | Penalized B-spline | Allows the distribution of knots to be based on the data distribution |
| cp | Cyclic penalized B-spline | Penalized B-splines with ends that meet |
| gp | Gaussian process | Gaussian process smooths with five sets of correlation structures |
| mrf | Markov random fields | Useful for modelling of discrete space. Uses penalties based on neighbourhood matrices (pairwise distances between discrete locations) |
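The basis is selected with the `bs=` argument of `s()`. The sketch below fits the same model with three of the bases listed above and checks the defining property of the cyclic basis. The data (`doy`, `y`), `k = 10` and the cycle end points `0.5` and `365.5` are assumptions chosen for illustration.

```r
library(mgcv)

set.seed(3)
# Hypothetical data: a seasonal signal over day of year
doy <- sample(1:365, 300, replace = TRUE)
y   <- cos(2 * pi * doy / 365) + rnorm(300, sd = 0.3)

# Same model, three different bases chosen with bs=
m_tp <- gam(y ~ s(doy, bs = "tp", k = 10), method = "REML")
m_cr <- gam(y ~ s(doy, bs = "cr", k = 10), method = "REML")
m_cc <- gam(y ~ s(doy, bs = "cc", k = 10), method = "REML",
            knots = list(doy = c(0.5, 365.5)))   # assumed end points of the cycle

# Only the cyclic basis forces the fit to agree at the two ends of the cycle
ends <- data.frame(doy = c(0.5, 365.5))
p_cc <- predict(m_cc, ends)
p_cr <- predict(m_cr, ends)
p_cc
p_cr
```

Supplying the two end points through `knots=` tells the cyclic basis where the cycle joins, so that the last day of one year connects smoothly to the first day of the next.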
mgcv

| Basis function | Name | Properties |
|---|---|---|
| re | Random effects | Parametric terms penalized via identity matrices. Equivalent to i.i.d. random effects in mixed effects models |
| fs | Factor smooth interaction | For single dimension smoothers. Duplicates the basis functions for each level of the categorical covariate, yet uses a single smoothing parameter across all. |
| sos | Splines on the sphere | Isotropic 2D splines in spherical space. Useful for large spatial domains. |
| so | Soap film smooths | Smooths within polygon boundaries. These are useful for modelling complex spatial areas. |
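As a rough illustration of the `re` and `fs` bases, the sketch below fits a random intercept and a factor smooth interaction to simulated grouped data. The data frame `dat_g`, the number of groups and `k = 5` are all hypothetical choices, not values from the slides.

```r
library(mgcv)

set.seed(4)
# Hypothetical grouped data: 8 groups sharing a trend but with different baselines
n_grp <- 8
dat_g <- data.frame(
  g = factor(rep(seq_len(n_grp), each = 30)),
  x = runif(n_grp * 30)
)
grp_shift <- rnorm(n_grp, sd = 1)
dat_g$y <- sin(2 * pi * dat_g$x) + grp_shift[as.integer(dat_g$g)] +
  rnorm(nrow(dat_g), sd = 0.3)

# Random intercept for g alongside a common smooth of x
m_re <- gam(y ~ s(x) + s(g, bs = "re"), data = dat_g, method = "REML")

# A separate smooth of x for each level of g, built from a duplicated basis
m_fs <- gam(y ~ s(x, g, bs = "fs", k = 5), data = dat_g, method = "REML")

length(m_re$sp)     # one smoothing parameter for s(x), one for the random effect
length(coef(m_fs))  # intercept plus k coefficients for each of the 8 levels
```

The `re` term behaves like a random intercept in a mixed model, while the `fs` term lets the shape of the curve vary between groups without fitting an unrelated smooth to each one.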
mgcv

| Smoother definition functions | Type | Properties |
|---|---|---|
| s() | General spline smoothers | For multidimensional smooths, assumes that all components are on the same scale, as there is only a single smoothing parameter for the smooth |
| te() | Tensor product smooth | Smooth functions of several covariates, built as the tensor product of the comprising smooths (and penalties); models the interaction between several terms, each with its own smoothing parameter that penalizes the average wiggliness of that term |
| ti() | Tensor product interaction smooth | Smooths that include only the highest order interactions (exclude the basis functions associated with the main effects of the comprising smooths) |
| t2() | Alternative tensor product smooth with non-overlapping … | Creates basis functions and penalties for paired combinations of separate penalized and unpenalized components of each term |
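The practical difference between `s()`, `te()` and `ti()` shows up in how many smoothing parameters each model estimates. The sketch below uses simulated covariates on deliberately different scales; the variable names (`temp`, `depth`) and the data-generating function are assumptions for illustration only.

```r
library(mgcv)

set.seed(5)
# Hypothetical data with covariates on very different scales
n     <- 400
temp  <- runif(n, 10, 30)
depth <- runif(n, 0, 2000)
y     <- sin(temp / 4) + cos(depth / 400) + rnorm(n, sd = 0.3)
d     <- data.frame(y, temp, depth)

# Isotropic smooth: one smoothing parameter shared by both directions
m_s  <- gam(y ~ s(temp, depth), data = d, method = "REML")

# Tensor product: a separate smoothing parameter for each margin
m_te <- gam(y ~ te(temp, depth), data = d, method = "REML")

# Main effects plus a pure interaction term
m_ti <- gam(y ~ s(temp) + s(depth) + ti(temp, depth), data = d, method = "REML")

# Number of smoothing parameters estimated by each model
c(s = length(m_s$sp), te = length(m_te$sp), ti = length(m_ti$sp))
```

When covariates are measured in different units, as here, the tensor product forms are usually the safer choice, and the `ti()` decomposition makes it possible to ask whether the interaction is needed on top of the main effects.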