The rema
(rare event meta-analysis) package performs a
permutation-based meta-analysis for heterogeneous, rare event data. Rare
events and sparse data often cause conventional approaches to fail.
While recent work in this area has provided options for sparse data,
problems can still arise when heterogeneity across the available studies
differs based on treatment group. This package provides a meta-analysis
approach that often outperforms traditional methods when such patterns
of rare events and heterogeneity are observed.
It is important to note that rema
is designed for small
data sets with rare events. Computation time and memory usage increase
rapidly as the number of observed events increases. Particularly,
rema
should not be used on a machine with a standard amount
of RAM if the total number of observations is relatively large
and the observed event rates are relatively large, or if the
total number of observed events is relatively large.
rema
packageThe following table contains information from a data set regarding the use of antibiotics, which we will use to illustrate this package (data provided by Spinks et al. (2013), Analysis 4.1 and used in Friedrich et al. (2007)). Sixteen studies, conducted from 1950 to 2000, compare the use of antibiotics to prevent acute rheumatic fever, a sore throat resulting from inadequately treated strep throat or scarlet fever, compared to a placebo. The event of interest is the occurrence of acute rheumatic fever.
Study | Antibiotics Events | Antibiotics Total | Placebo Events | Placebo Total |
---|---|---|---|---|
1 | 0 | 121 | 0 | 118 |
2 | 0 | 59 | 0 | 58 |
3 | 0 | 358 | 0 | 164 |
4 | 0 | 454 | 0 | 216 |
5 | 0 | 238 | 0 | 268 |
6 | 0 | 62 | 0 | 59 |
7 | 0 | 186 | 0 | 97 |
8 | 0 | 87 | 0 | 94 |
9 | 0 | 369 | 0 | 386 |
10 | 0 | 257 | 2 | 109 |
11 | 2 | 798 | 17 | 804 |
12 | 5 | 978 | 35 | 996 |
13 | 0 | 605 | 2 | 608 |
14 | 2 | 277 | 5 | 198 |
15 | 2 | 157 | 1 | 50 |
16 | 26 | 650 | 12 | 220 |
We will use the rema
package to analyze this data. We
note that the event of acute rheumatic fever is rare, and the data is
sparse. We also notice the relatively large number of events in some of
the cells (particularly, in study 16 and studies 11 and 12 for the
placebo group), which will increase computation time and memory usage.
Accordingly, we ran this data set on a system with an Intel Xeon
processor E5-2689 v4 @ 3.10GHz with 192GB of RAM, and it took about 1
hour and 20 minutes to run.
The default arguments of the rema
function are
rema(trt.events = NULL, trt.total = NULL, ctrl.events = NULL, ctrl.total = NULL,
rema.obj, mid.p = TRUE, distr = TRUE, one.sided.p = FALSE, alpha = 0.05)
where trt.events
, trt.total
,
ctrl.events
, and ctrl.total
are numeric
vectors containing, for each independent study, the number of observed
events in the treatment group, the total number of observations in the
treatment group, the number of observed events in the control group, and
the total number of observations in the control group, respectively. The
remaining arguments specify whether the p-values and confidence interval
should be adjusted using the mid-p correction to reduce conservatism
(mid.p
), whether the permutational distribution of the test
statistic is returned (distr
), whether the one-sided
p-value is returned (one.sided.p
), and the level of
confidence for a (1 - alpha)% confidence interval for the overall
treatment effect (alpha
).
For the antibiotics data example, we will create the four needed vectors as follows.
anti.events <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 5, 0, 2, 2, 26)
anti.total <- c(121, 59, 358, 454, 238, 62, 186, 87, 369, 257, 798, 978, 605,
277, 157, 650)
plac.events <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 17, 35, 2, 5, 1, 12)
plac.total <- c(118, 58, 164, 216, 268, 59, 97, 94, 386, 109, 804, 996, 608,
198, 50, 220)
It is important to note that no continuity correction should be
applied to studies with zero observed events to data sets passed into
rema
.
rema
will return a list with elements that include the
combined odds ratio, its associated (1 - alpha)% confidence interval,
and the two-sided p-value. The following examples illustrate how the
returned object and resulting output differ depending on the values
passed to the various arguments listed above.
This example runs rema
with the default values. The
output contains the combined odds ratio, its associated 95% confidence
interval, and the mid-p corrected two-sided p-value. Under the details
section, the method used when computing the odds ratio and its
confidence interval is reported. If the observed test statistic is not
at either extreme (the minimum or maximum value) of the distribution,
the conditional maximum likelihood estimate (CMLE) will be used;
otherwise, the median unbiased estimate (MUE) will be employed. Here,
the CMLE was used. Overall, this method indicates no decrease the risk
of acute rheumatic fever from the use of antibiotics.
antibiotics.rema <- rema(anti.events, anti.total, plac.events, plac.total)
summary(antibiotics.rema)
# Call:
# rema(trt.events = anti.events, trt.total = anti.total, ctrl.events = plac.events,
# ctrl.total = plac.total)
#
# OR 95%-CI p-value
# 0.5708 [0.1947; 1.5984] 0.2556
#
# Details on meta-analytical method:
# - Rare event, heterogeneous meta-analysis method
# - Two-sided p-value returned (mid.p = TRUE)
# - Conditional Maximum Likelihood Estimate (CMLE) used when computing the odds ratio
Some may find it interesting to view the permutational distribution.
Since the distribution is saved by default, it can be called and printed
or visualized with the plot
function. The asterisk in the
plot marks the observed value of the test statistic, and the dark grey
bars mark the area that contributes to the two-sided p-value. The
observed value of the test statistics is also saved in the
rema
object.
antibiotics.rema$dist
#> test.stat norm.probs
#> 1 27 6.716051e-12
#> 2 28 6.243005e-10
#> 3 29 4.605874e-08
#> 4 30 1.178779e-06
#> 5 31 1.810190e-05
#> 6 32 1.900967e-04
#> 7 33 1.361220e-03
#> 8 34 6.418245e-03
#> 9 35 2.290489e-02
#> 10 36 5.697199e-02
#> 11 37 1.160254e-01
#> 12 38 1.694432e-01
#> 13 39 2.084688e-01
#> 14 40 1.724891e-01
#> 15 41 1.360043e-01
#> 16 42 6.433921e-02
#> 17 43 3.335175e-02
#> 18 44 8.452742e-03
#> 19 45 3.156814e-03
#> 20 46 3.155256e-04
#> 21 47 8.395551e-05
#> 22 48 2.683688e-06
#> 23 49 7.058532e-07
#> 24 50 6.078317e-09
#> 25 51 1.895755e-09
#> 26 52 3.216023e-12
#> 27 53 5.180658e-14
antibiotics.rema$tstat
#> [1] 37
For the remaining examples, we will use the
antibiotics.rema
object as input instead of the first four
vectors. Running rema
once with the first four vectors
allows us to obtain the permutational distribution. Once we have that
distribution, the various arguments pertaining to inference
(e.g. mid.p
, alpha
) can be changed without the
need to recompute the distribution, saving computation time.
This example requests the one-sided p-value be returned. The summary
output does not change, but the returned object now has the element
pval.one.sided
that can be accessed. This p-value is the
sum of the probabilities from test statistics less than or equal to the
observed test statistic (the direction of the treatment effect), with
the mid-p adjustment.
This example applies no correction to the confidence bounds and the two-sided p-value. As expected, the 95% confidence interval is wider, and the two-sided p-value is larger when the mid-p correction to reduce conservatism is not applied.
rema(antibiotics.rema, mid.p = FALSE)
#> Call:
#> rema(trt.events = antibiotics.rema, mid.p = FALSE)
#>
#> OR 95%-CI p-value
#> 0.5708 [0.1738; 1.7631] 0.3136
#>
#> Details on meta-analytical method:
#> - Rare event, heterogeneous meta-analysis method
#> - Two-sided p-value returned (mid.p = FALSE)
#> - Conditional Maximum Likelihood Estimate (CMLE) used when computing the odds ratio
This example sets the alpha
argument to 0.1, resulting
in a 90% confidence interval. As expected, increasing alpha decreases
the confidence interval width.
rema(antibiotics.rema, alpha = 0.1)
#> Call:
#> rema(trt.events = antibiotics.rema, alpha = 0.1)
#>
#> OR 90%-CI p-value
#> 0.5708 [0.2331; 1.3585] 0.2556
#>
#> Details on meta-analytical method:
#> - Rare event, heterogeneous meta-analysis method
#> - Two-sided p-value returned (mid.p = TRUE)
#> - Conditional Maximum Likelihood Estimate (CMLE) used when computing the odds ratio