Skip to content

Commit

Permalink
Don't subscript formulas of species in expr.species() and ratlab()
Browse files Browse the repository at this point in the history
git-svn-id: svn://scm.r-forge.r-project.org/svnroot/chnosz/pkg/CHNOSZ@853 edb9625f-4e0d-4859-8d74-9fd3b1da38cb
  • Loading branch information
jedick committed Dec 1, 2024
1 parent 7963906 commit 6f56b71
Show file tree
Hide file tree
Showing 7 changed files with 69 additions and 45 deletions.
4 changes: 2 additions & 2 deletions DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
Date: 2024-11-30
Date: 2024-12-01
Package: CHNOSZ
Version: 2.1.0-25
Version: 2.1.0-26
Title: Thermodynamic Calculations and Diagrams for Geochemistry
Authors@R: c(
person("Jeffrey", "Dick", , "[email protected]", role = c("aut", "cre"),
Expand Down
6 changes: 4 additions & 2 deletions R/util.expression.R
Original file line number Diff line number Diff line change
Expand Up @@ -58,7 +58,8 @@ expr.species <- function(species, state = "aq", value = NULL, log = FALSE, molal
var <- "a"
if(molality) var <- "m"
if(state %in% c("g", "gas")) var <- "f"
expr <- substitute(italic(a)[b], list(a = var, b = expr))
# For better readability, no longer using subscripts for species formulas 20241201
expr <- substitute(italic(a)*b, list(a = var, b = expr))
# Use the logarithm?
if(log) expr <- substitute(log ~ a, list(a = expr))
# Write the value if not NULL or NA
Expand Down Expand Up @@ -311,7 +312,8 @@ ratlab <- function(top = "K+", bottom = "H+", molality = FALSE) {
# With molality, change a to m
a <- ifelse(molality, "m", "a")
# The final expression
substitute(log~(italic(a)[expr.top]^exp.top / italic(a)[expr.bottom]^exp.bottom),
# For better readability, no longer using subscripts for species formulas 20241201
substitute(log~(italic(a)^exp.top*expr.top / italic(a)^exp.bottom*expr.bottom),
list(a = a, expr.top = expr.top, exp.top = exp.top, expr.bottom = expr.bottom, exp.bottom = exp.bottom))
}

Expand Down
48 changes: 35 additions & 13 deletions inst/NEWS.Rd
Original file line number Diff line number Diff line change
Expand Up @@ -15,55 +15,77 @@
\newcommand{\Cp}{\ifelse{latex}{\eqn{C_P}}{\ifelse{html}{\out{<I>C<sub>P</sub></I>}}{Cp}}}
\newcommand{\DG0}{\ifelse{latex}{\eqn{{\Delta}G^{\circ}}}{\ifelse{html}{\out{&Delta;<I>G</I>&deg;}}{ΔG°}}}

\section{Changes in CHNOSZ version 2.1.0-25 (2024-11-30)}{
\section{Changes in CHNOSZ version 2.1.0-26 (2024-12-01)}{

\itemize{

\item Move \code{read.fasta()}, \code{count.aa()}, and \code{aasum()} to canprot package with different names.
\item Move \code{read.fasta()}, \code{count.aa()}, and \code{aasum()} to
canprot package with different names.

\item Remove \code{seq2aa()}.

\item Remove \code{add.alpha()}. Now \code{grDevices::adjustcolor()} is
used in \code{stack_mosaic()} to add transparency.

\item Add FAQ question: Why are mineral stability boundaries curved on mosaic diagrams?
\item Add FAQ question: Why are mineral stability boundaries curved on
mosaic diagrams?

\item \code{check.EOS()} now uses values of Born coefficients \emph{X} and \emph{Q} that are consistent with either SUPCRT92 or DEW, depending on the \code{model} defined in OBIGT (HKF or DEW, respectively).
\item \code{check.EOS()} now uses values of Born coefficients \emph{X}
and \emph{Q} that are consistent with either SUPCRT92 or DEW, depending
on the \code{model} defined in OBIGT (HKF or DEW, respectively).

\item Add aqueous uranyl complexes from \href{https://doi.org/10.1016/j.gca.2024.04.023}{Migdisov et al. (2024)} and related U-bearing minerals from \href{https://doi.org/10.1787/bf86a907-en}{Grenthe et al. (2020)}.
\item Add aqueous uranyl complexes from
\href{https://doi.org/10.1016/j.gca.2024.04.023}{Migdisov et al. (2024)}
and related U-bearing minerals from
\href{https://doi.org/10.1787/bf86a907-en}{Grenthe et al. (2020)}.

\item Except for HUO\s{4}\S{-}, aqueous uranium species from \href{https://doi.org/10.1016/S0016-7037(97)00240-8}{Shock et al. (1997)} have been moved to \file{SLOP98.csv}.
\item Except for HUO\s{4}\S{-}, aqueous uranium species from
\href{https://doi.org/10.1016/S0016-7037(97)00240-8}{Shock et al. (1997)}
have been moved to \file{SLOP98.csv}.

\item Remove \code{ispecies} from output of \code{check.OBIGT()} to avoid superfluous diffs.
\item Remove \code{ispecies} from output of \code{check.OBIGT()} to avoid
superfluous diffs.

\item Move americium complexes from \file{slop98.dat} back to \file{inorganic_aq.csv} (entropy of the element Am has been available for checking GHS self-consistency since version 1.4.0).
\item Move americium complexes from \file{slop98.dat} back to
\file{inorganic_aq.csv} (entropy of the element Am has been available for
checking GHS self-consistency since version 1.4.0).

\item Add alkylamines, benzylamines, and aminiums from \href{https://doi.org/10.1016/j.gca.2024.03.013}{Robinson et al. (2024)}.
\item Add alkylamines, benzylamines, and aminiums from
\href{https://doi.org/10.1016/j.gca.2024.03.013}{Robinson et al. (2024)}.

\item \samp{OBIGT/inorganic_cr.csv}: Add cerianite (CeO\s{2}) and
chromite (FeCr\s{2}O\s{4}) from
\href{https://doi.org/10.3133/b2131}{Robie and Hemingway (1995)}.

\item Add \file{demo/total_S.R} (total activity of S--pH diagram for Fe-S-O-C minerals).
\item Add \file{demo/total_S.R} (total activity of S--pH diagram for
Fe-S-O-C minerals).

\item Restore \code{lines} to the output of \code{diagram()} for the x and y values of predominance field boundaries.
\item Restore \code{lines} to the output of \code{diagram()} for the x
and y values of predominance field boundaries.

\item OBIGT: Restore steam (H\s{2}O,g) in \file{inorganic_gas.csv}.

\item OBIGT: Fix sign of \emph{c} parameter for nesquehonite. Thanks to James Leong and Grayson Boyer.
\item OBIGT: Fix sign of \emph{c} parameter for nesquehonite. Thanks to
James Leong and Grayson Boyer.

\item Add \file{demo/uranyl.R}, total (carbonate|sulfate)-pH diagrams
for uranyl species, after
\href{https://doi.org/10.1016/j.gca.2024.04.023}{Migdisov et al. (2024)}.

\item Adjust \code{axis.label()} to show sum symbol.
\item Adjust \code{axis.label()} to show sum symbol for
\code{mosaic()}-ed basis species used as a plot variable.

\item Rename \file{demo/total_S.R} to \file{demo/sum_S.R}, change axes,
and add solubility contours for Fe and Au.

\item Add \file{OBIGT/IGEM24.csv} with updated data for Au and Cu
complexes and SO\s{4}\S{-2}-bearing species from
researchers at IGEM RAS.

\item For better readability, formulas of species are no longer
subscripted by \code{axis.label()} (via \code{expr.species()}). An
example of the new formatting is log \emph{f}O\s{2}). Similar formatting
for other expressions has been used in the vignettes.

}

Expand Down
12 changes: 6 additions & 6 deletions vignettes/FAQ.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -185,14 +185,14 @@ HCO3_ <- "HCO<sub>3</sub><sup>−</sup>"
O2 <- "O<sub>2</sub>"
S2 <- "S<sub>2</sub>"
log <- "log&thinsp;"
aHCO2_ <- "<i>a</i><sub>HCO<sub>2</sub><sup>−</sup></sub>"
aHCO3_ <- "<i>a</i><sub>HCO<sub>3</sub><sup>−</sup></sub>"
logfO2 <- "log&thinsp;<i>f</i><sub>O<sub>2</sub></sub>"
aHCO2_ <- "<i>a</i>HCO<sub>2</sub><sup>−</sup>"
aHCO3_ <- "<i>a</i>HCO<sub>3</sub><sup>−</sup>"
logfO2 <- "log&thinsp;<i>f</i>O<sub>2</sub>"
Ctot <- "C<sub>tot</sub>"
C3H5O2_ <- "C<sub>3</sub>H<sub>5</sub>O<sub>2</sub><sup>−</sup>"
a3HCO3_ <- "<i>a</i><sup>3</sup><sub>HCO<sub>3</sub><sup>−</sup></sub>"
aC3H5O2_ <- "<i>a</i><sub>C<sub>3</sub>H<sub>5</sub>O<sub>2</sub><sup>−</sup></sub>"
a2HCO3_ <- "<i>a</i><sup>2</sup><sub>HCO<sub>3</sub><sup>−</sup></sub>"
a3HCO3_ <- "<i>a</i><sup>3<sub>HCO<sub>3</sub><sup>−</sup>"
aC3H5O2_ <- "<i>a</i>C<sub>3</sub>H<sub>5</sub>O<sub>2</sub><sup>−</sup>"
a2HCO3_ <- "<i>a</i><sup>2</sup>HCO<sub>3</sub><sup>−</sup>"
logCtot <- "log&thinsp;C<sub>tot</sub>"
CO2 <- "CO<sub>2</sub>"
H2O <- "H<sub>2</sub>O"
Expand Down
22 changes: 11 additions & 11 deletions vignettes/anintro.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -643,7 +643,7 @@ acr <- affinity(aaq)
diagram(acr, add = TRUE, bold = acr$species$state=="cr", limit.water = FALSE)
# Add legend
legend <- c(
bquote(log * italic(a)["Mn(aq)"] == .(logact)),
bquote(log ~ italic(a)*"Mn(aq)" == .(logact)),
bquote(italic(T) == .(T) ~ degree*C)
)
legend("topright", legend = as.expression(legend), bty = "n")
Expand Down Expand Up @@ -761,7 +761,7 @@ a$values <- lapply(a$values, `*`, -0.001)
```

```{r rainbow_diagram, fig.margin=TRUE, fig.width=4, fig.height=4, out.width="100%", echo=FALSE, message=FALSE, cache=TRUE, fig.cap="Affinities of organic synthesis in a hydrothermal system, after Shock and Canovas (2010).", pngquant=pngquant, timeit=timeit}
diagram(a, balance = 1, ylim = c(-100, 100), ylab = quote(italic(A)*", kcal/mol"),
diagram(a, balance = 1, ylim = c(-100, 100), ylab = quote(italic(A)~"(kcal/mol)"),
col = rainbow(8), lwd = 2, bg = "slategray3")
abline(h = 0, lty = 2, lwd = 2)
```
Expand Down Expand Up @@ -1160,7 +1160,7 @@ subcrt("LYSC_CHICK")$out[[1]][1:6, ]
Let's compare experimental values of heat capacity of four proteins, from @PM90, with those calculated using group additivity.
We divide Privalov and Makhatadze's experimental values by the lengths of the proteins to get per-residue values, then plot them.

```{r protein_Cp, fig.margin=TRUE, fig.width=4, fig.height=4, small.mar=TRUE, out.width="100%", echo=FALSE, message=FALSE, fig.cap='The heat capacity calculated by group additivity closely approximates experimental values for aqueous proteins. For a related figure showing the effects of ionization in the calculations, see <span style="color:blue">`?ionize.aa`</span>.', cache=TRUE, pngquant=pngquant, timeit=timeit}
```{r protein_Cp, fig.margin=TRUE, fig.width=4, fig.height=4, small.mar=TRUE, out.width="100%", echo=FALSE, message=FALSE, fig.cap='The heat capacity calculated by group additivity closely approximates experimental values for aqueous proteins.', cache=TRUE, pngquant=pngquant, timeit=timeit}
PM90 <- read.csv(system.file("extdata/cpetc/PM90.csv", package = "CHNOSZ"))
plength <- protein.length(colnames(PM90)[2:5])
Cp_expt <- t(t(PM90[, 2:5]) / plength)
Expand Down Expand Up @@ -1193,7 +1193,7 @@ The following plot shows the calculated affinity of reaction between nonionized
Charged and uncharged sets of basis species are used to to activate and suppress the ionization calculations.
The calculation of affinity for the ionized proteins returns multiple values (as a function of pH), but there is only one value of affinity returned for the nonionized proteins, so we need to use R's `as.numeric()` to avoid subtracting nonconformable arrays:

```{r protein_ionization, fig.margin=TRUE, fig.width=4, fig.height=4, small.mar=TRUE, out.width="100%", echo=FALSE, results="hide", message=FALSE, fig.cap='Affinity of ionization of proteins. See [<span style="color:blue">demo(ionize)</span>](../demo) for ionization properties calculated as a function of temperature and pH.', cache=TRUE, pngquant=pngquant, timeit=timeit}
```{r protein_ionization, fig.margin=TRUE, fig.width=4, fig.height=4, small.mar=TRUE, out.width="100%", echo=FALSE, results="hide", message=FALSE, fig.cap='Affinity of ionization of proteins. See [demo(ionize)](../demo) for ionization properties calculated as a function of temperature and pH.', cache=TRUE, pngquant=pngquant, timeit=timeit}
ip <- pinfo(c("CYC_BOVIN", "LYSC_CHICK", "MYG_PHYCA", "RNAS1_BOVIN"))
basis("CHNOS+")
a_ion <- affinity(pH = c(0, 14), iprotein = ip)
Expand Down Expand Up @@ -1309,20 +1309,20 @@ legend("topright", legend = c("Archaea", "Bacteria", "Eukaryota"),
Let's look at different ways of representing the chemical compositions of the proteins.
<span style="color:green">`protein.basis()`</span> returns the stoichiometry for the formation reaction of each proteins.
Dividing by <span style="color:green">`protein.length()`</span> gives the per-residue reaction coefficients.
Using the set of basis species we have seen before (CO<sub>2</sub>, NH<sub>3</sub>, H<sub>2</sub>S, `r h2o`, `r o2`) there is a noticeable correlation between *n*<sub>`r o2`</sub> and `r zc`, but even more so between *n*<sub>`r h2o`</sub> and `r zc` (shown in the plots on the left).
Using the set of basis species we have seen before (CO<sub>2</sub>, NH<sub>3</sub>, H<sub>2</sub>S, `r h2o`, `r o2`) there is a noticeable correlation between *n*`r o2` and `r zc`, but even more so between *n*`r h2o` and `r zc` (shown in the plots on the left).
```{marginfigure}
The calculation of *Z*<sub>C</sub>, which sums elemental ratios, is not affected by the choice of basis species.
```
The `QEC` keyword to <span style="color:red">`basis()`</span> loads basis species including three amino acids (glutamine, glutamic acid, cysteine, `r h2o`, `r o2`).
This basis strengthens the relationship between `r zc` and *n*<sub>`r o2`</sub>, but weakens that between `r zc` and *n*<sub>`r h2o`</sub> (shown in the plots on the right).
This basis strengthens the relationship between `r zc` and *n*`r o2`, but weakens that between `r zc` and *n*`r h2o` (shown in the plots on the right).

```{r rubisco_O2, fig.margin=TRUE, fig.width=4, fig.height=4, small.mar=TRUE, out.width="100%", echo=FALSE, results="hide", message=FALSE, fig.cap="Chemical metrics (*n*<sub>O<sub>2</sub></sub> and *n*<sub>H<sub>2</sub>O</sub>) calculated with different sets of basis species, compared to carbon oxidation state (*Z*<sub>C</sub>) which does not depend on the basis species.", cache=TRUE, pngquant=pngquant, timeit=timeit}
```{r rubisco_O2, fig.margin=TRUE, fig.width=4, fig.height=4, small.mar=TRUE, out.width="100%", echo=FALSE, results="hide", message=FALSE, fig.cap="Chemical metrics (*n*O<sub>2</sub> and *n*H<sub>2</sub>O) calculated with different sets of basis species, compared to carbon oxidation state (*Z*<sub>C</sub>) which does not depend on the basis species.", cache=TRUE, pngquant=pngquant, timeit=timeit}
layout(matrix(1:4, nrow = 2))
par(mgp = c(1.8, 0.5, 0))
pl <- protein.length(aa)
ZClab <- axis.label("ZC")
nO2lab <- expression(italic(n)[O[2]])
nH2Olab <- expression(italic(n)[H[2]*O])
nO2lab <- expression(italic(n)*O[2])
nH2Olab <- expression(italic(n)*H[2]*O)
lapply(c("CHNOS", "QEC"), function(thisbasis) {
basis(thisbasis)
pb <- protein.basis(aa)
Expand All @@ -1336,10 +1336,10 @@ lapply(c("CHNOS", "QEC"), function(thisbasis) {
```{r rubisco_O2, eval=FALSE}
```

By projecting the compositions of proteins into the `QEC` set of basis species, *n*<sub>`r o2`</sub> emerges as a strong indicator of oxidation state, while *n*<sub>`r h2o`</sub> is a relatively uncorrelated (i.e. independent) variable.
By projecting the compositions of proteins into the `QEC` set of basis species, *n*`r o2` emerges as an indicator of oxidation state, while *n*`r h2o` is a relatively uncorrelated (i.e. independent) variable.
These independent variables make it easier to distinguish the effects of oxidation and hydration state in proteomic datasets [@DYT20].

- The [canprot](https://github.com/jedick/canprot) package has functions to calculate chemical metrics (*Z*<sub>C</sub>, *n*<sub>`r o2`</sub>, and *n*<sub>`r h2o`</sub>) directly from amino acid compositions of proteins, and to read amino acid compositions from FASTA files (`canprot::read_fasta()`).
- The [canprot](https://github.com/jedick/canprot) package has functions to calculate chemical metrics (*Z*<sub>C</sub>, *n*`r o2`, and *n*`r h2o`) directly from amino acid compositions of proteins, and to read amino acid compositions from FASTA files (`canprot::read_fasta()`).

## Normalization to residues

Expand Down
12 changes: 6 additions & 6 deletions vignettes/equilibrium.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -292,13 +292,13 @@ figure:
than two species.

* Diagrams **E** and **F** both use the equilibration method to calculate
activities of species as a function of log*a*~H2O~ at log*f*~O2~ = -66.
Diagram **F** shows the default setting, where *a*~CO2~ is taken from the sum
activities of species as a function of log*a*H~2~O at log*f*O~2~ = -66.
Diagram **F** shows the default setting, where *a*CO~2~ is taken from the sum
of initial activities of species (each 10^-3^). The positions of equal
activities (where the lines cross) are the same as in Diagram **D** at
log*f*~O2~ = -66.
log*f*O~2~ = -66.

* Diagram **F** is drawn for a lower total activity of *a*~CO2~. Not only are
* Diagram **F** is drawn for a lower total activity of *a*CO~2~. Not only are
the activities of all amino acids decreased, but glycine starts to become
predominant at some conditions. This is because, compared to other amino
acids, it is smaller and has a lower coefficient on CO~2~ in its formation
Expand Down Expand Up @@ -425,7 +425,7 @@ label.figure("F", yfrac=0.9, xfrac=0.1, font = 2)

Diagrams **A**, **C**, and **E** are predominance (equal-activity) diagrams
made with different balancing constraints. Diagrams **B**, **D**, and **F**
show a cross-section of the same system at log*a*~H2O~ = 0.
show a cross-section of the same system at log*a*H~2~O = 0.

* In Diagrams **A** and **B**, the reactions are balanced on protein length.
The drastic transitions between proteins in **B** result from the large
Expand Down Expand Up @@ -473,7 +473,7 @@ title(main = "Equilibrium activities of proteins, normalize = TRUE")

Although it is below the stability limit of H~2~O (vertical dashed line), there
is an interesting convergence of the metastable equilibrium activities of some
proteins at low log *f*~O2~.
proteins at low log *f*O~2~.

## Document history

Expand Down
10 changes: 5 additions & 5 deletions vignettes/multi-metal.Rmd
Original file line number Diff line number Diff line change
Expand Up @@ -122,7 +122,7 @@ The methods are **mashing**, **mixing**, **mosaic stacking**, and **secondary ba

Mashing or simple overlay refers to independent calculations for two different systems that are displayed on the same diagram.

This example starts with a log*f*~O<sub>2</sub>~--pH base diagram for the C-O-H system then overlays a diagram for S-O-H.
This example starts with a log f*O<sub>2</sub>--pH base diagram for the C-O-H system then overlays a diagram for S-O-H.
The second call to `affinity()` uses the argument recall feature, where the arguments after the first are taken from the previous command.
This allows calculations to be run at the same conditions for a different system.
This feature is also used in other examples in this vignette.
Expand Down Expand Up @@ -915,15 +915,15 @@ subcrt(c("CuCl3-2", "Cu+", "Cl-"), c(-1, 1, 3), T = T)$out$logK

The higher stability of these complexes from @Hel69 causes the 35 ppm contour to move closer to the position shown by @Sve87.

Interestingly, the calculation here also predict substantial increases of Cu concentration of Cu at high *f*~S<sub>2</sub>~ and low *f*~O<sub>2</sub>~, due to the formation of bisulfide complexes with Cu.
Interestingly, the calculation here also predict substantial increases of Cu concentration of Cu at high *f*S<sub>2</sub> and low *f*O<sub>2</sub>, due to the formation of bisulfide complexes with Cu.
The aqueous species considered in the calculation can be seen like this:
```{r iCu.aq}
names(iCu.aq)
```

CuHS and Cu(HS)~2~^-^ can be excluded by removing S from the `retrieve()` call above (i.e. only `c("O", "H", "Cl")` as the elements in possible ligands); doing so precludes a high concentration of aqueous Cu in the highly reduced, sulfidic region.

The third plot for the concentation of SO~4~^-2^ is simply made by using `affinity()` to calculate the affinity of its formation reaction as a function of *f*~S<sub>2</sub>~ and *f*~O<sub>2</sub>~ at pH 6 and 125 °C, then using `solubility()` to calculate the solubility of S~2~(gas), expressed in terms of moles of SO~4~^-2^ in order to calculate parts per million (ppm) by weight.
The third plot for the concentation of SO~4~^-2^ is simply made by using `affinity()` to calculate the affinity of its formation reaction as a function of *f*S<sub>2</sub> and *f*O<sub>2</sub> at pH 6 and 125 °C, then using `solubility()` to calculate the solubility of S~2~(gas), expressed in terms of moles of SO~4~^-2^ in order to calculate parts per million (ppm) by weight.

## Secondary Balancing

Expand Down Expand Up @@ -1058,8 +1058,8 @@ legend("topleft", legend = lTP(400, 2000), bty = "n")
This example was prompted by Figure 3 of @MH85; earlier versions of the diagram are in @HBL69 and @Hel70a.

In some ways this is like the inverse of the **mosaic stacking** example.
There, reactions were balanced on Fe or Cu, and *f*~O<sub>2</sub>~ and pH were used as plotting variables.
Here, the reactions are balanced on O~2~ and implicitly on H^+^ through the activity ratios with *a*~Fe^+2^~ and *a*~Cu^+^~, which are the plotting variables.
There, reactions were balanced on Fe or Cu, and *f*O<sub>2</sub> and pH were used as plotting variables.
Here, the reactions are balanced on O~2~ and implicitly on H^+^ through the activity ratios with *a*Fe^+2^ and *a*Cu^+^, which are the plotting variables.

More common diagrams of this type are balanced on Si or Al.
See `demo(saturation)` for an example in the H~2~O-CO~2~-CaO-MgO-SiO~2~ system.
Expand Down

0 comments on commit 6f56b71

Please sign in to comment.