`vignettes/D_functional_perspectives.Rmd`

`D_functional_perspectives.Rmd`

The greatest value of a picture is when it forces us to notice what we never expected to see.

(John W. Tukey)^{1}

The `riskyr`

data structures essentially describe networks of dependencies. This is best illustrated by the network diagram (see examples of `plot_fnet`

in the user guide and data formats). However, sometimes it is instructive to view all possible values of some parameter as a function of others. A functional perspective illustrates how the value of some parameter (or its values) changes as a function of other parameters (and their values).

The basic format of a function is \(y = f(x)\), which illustrates how values of \(y\) depend on values of \(x\) given some function \(f\). `riskyr`

provides 2 functions for viewing parameters as a function of other parameters (and their values).

The function `plot_curve`

draws the curves (or lines) of selected parameters as a function of the prevalence (with `prev`

ranging from 0 to 1) for a given decision process or diagnostic test (i.e., given values of `sens`

and `spec`

):

\[y \ = \ f(\texttt{prev} \textrm{, from 0 to 1}) \textrm{ with } y \in \{\texttt{PPV}, \texttt{NPV}, \texttt{ppod}, \texttt{acc}\} \ \ \ \ \ \ (1)\]

As an example, reconsider our original scenario (on mammography screening, see user guide). Earlier, we computed a positive predictive value (PPV) of 7.8%. But rather than just computing a single value, we could ask: How do values of PPV develop as a function of prevalence? The `plot_curve`

function illustrates this relationship:

```
plot_curve(prev = .01, sens = .80, spec = (1 - .096),
what = c("prev", "PPV", "NPV"),
title_lbl = "Mammography screening", cex.lbl = .8)
```

The curves illustrate that values of `PPV`

and `NPV`

crucially depend on the value of prevalence `prev`

in the current population. In fact, they actually vary across their entire range (i.e., from 0 to 1), rendering any communication of their value utterly meaningless without specifying the current population’s prevalence value `prev`

.

The dependency of `PPV`

and `NPV`

on `prev`

can be illustrated by assuming a higher prevalence rate. For instance, if we knew that some woman was genetically tested and known to exhibit the notorious BRCA1 mutation, the prevalence value of her corresponding population (given a positive mammography result in a routine screening) is increased to about 60% (graph not shown here to save space, but try running the following code for yourself):

```
high.prev <- .60 # assume increased prevalence due to BRCA1 mutation
plot_curve(prev = high.prev, sens = .80, spec = (1 - .096),
what = c("prev", "PPV", "NPV"),
title_lbl = "Mammography screening (BRCA1 mutation)", cex.lbl = .80)
```

This shows that — given an increased prevalence value `prev`

of 60% — the positive predictive value `PPV`

of a positive test result increases from 7.8% (in the standard population) to around 93% (given the BRCA1 mutation).

In addition, the actual values of population and test parameters are often unclear. The `plot_curve`

function reflects this by providing an uncertainty parameter `uc`

that is expressed as a percentage of the specified value. For instance, the following assumes that our parameter values may deviate up to 5% from the specified values and marks the corresponding ranges of uncertainty as shaded areas around the curves that assume exact parameter values.

Both the notions of expressing probabilities as a function of prevalence and of uncertainty ranges for imprecise parameter estimates can be extended to other probabilities. The following curves show the full set of curves currently drawn by `plot_curve`

. In addition to the predictive values `PPV`

and `NPV`

, we see that the proportion of positive decisions `ppod`

and the overall accuracy `acc`

also vary as a function of prevalence `prev`

:

```
high.prev <- .60 # assume increased prevalence due to BRCA1 mutation
plot_curve(prev = high.prev, sens = .80, spec = (1 - .096),
what = c("prev", "PPV", "NPV", "ppod", "acc"),
title_lbl = "Mammography screening (BRCA1 mutation)", uc = .05, cex.lbl = .80)
```

The function `plot_plane`

draws a plane for a selected parameter as a function of sensitivity and specificity (with `sens`

and `spec`

both ranging from 0 to 1) for a given prevalence `prev`

:

\[y \ = \ f(\texttt{sens} \times\ \texttt{spec} \textrm{, both from 0 to 1, for given value of } \texttt{prev}) \textrm{ with } y \in \{\texttt{PPV}, \texttt{NPV}, \texttt{ppod}, \texttt{acc}\} \ \ \ \ \ \ \ (2)\]

Some examples (not shown here, but please try evaluating the following function calls):

```
plot_plane(prev = high.prev, sens = .80, spec = (1 - .096), what = "PPV",
title_lbl = "A. Mammography (BRCA1)", cex.lbl = .8)
```

Related plots (showing different probabilities) include:

```
plot_plane(prev = high.prev, sens = .80, spec = (1 - .096), what = "NPV",
title_lbl = "B. Mammography (BRCA1)", cex.lbl = .8)
plot_plane(prev = high.prev, sens = .80, spec = (1 - .096), what = "ppod", what_col = "firebrick",
title_lbl = "C. Mammography (BRCA1)", phi = 45, cex.lbl = .8)
plot_plane(prev = high.prev, sens = .80, spec = (1 - .096), what = "acc", what_col = "forestgreen",
title_lbl = "D. Mammography (BRCA1)", cex.lbl = .8)
```

Overall, viewing conditional probabilities (like `PPV`

/`NPV`

, but also `ppod`

and `acc`

) as a function of other probabilities (e.g., `prev`

, `sens`

, `spec`

or `fart`

) often provides unexpected relationships and new insights.

The following resources and versions are currently available:

Type: | Version: | URL: |
---|---|---|

A. `riskyr` (R package): |
Release version | https://CRAN.R-project.org/package=riskyr |

Development version | https://github.com/hneth/riskyr | |

B. `riskyrApp` (R Shiny code): |
Online version | http://riskyr.org |

Development version | https://github.com/hneth/riskyrApp | |

C. Online documentation: | Release version | https://hneth.github.io/riskyr |

Development version | https://hneth.github.io/riskyr/dev |

We appreciate your feedback, comments, or questions.

Please report any

`riskyr`

-related issues at https://github.com/hneth/riskyr/issues.Email us at contact.riskyr@gmail.com if you want to modify or share this software.

Nr. | Vignette | Content |
---|---|---|

A. | User guide | Motivation and general instructions |

B. | Data formats | Data formats: Frequencies and probabilities |

C. | Confusion matrix | Confusion matrix and accuracy metrics |

D. | Functional perspectives | Adopting functional perspectives |

E. | Quick start primer | Quick start primer |

Tukey, J.W. (1977),

*Exploratory data analysis*. Reading, MA: Addison-Wesley. (p. vi).↩