Joss review fixes (#115)
* add note about julia version; fixes issue #111

* Other improvements to README.md

* add dictionary and named tuple constructors; fixes #110

* Improve docstrings for feature types; fixes #114

* docstring refinements

* add vignette for reading in data; fixes #112

* attempt to resolve Documenter issues

* solved doctest issues
kescobo authored Nov 5, 2021
1 parent 7dd8455 commit 64dc7d9
Showing 8 changed files with 157 additions and 28 deletions.
10 changes: 7 additions & 3 deletions README.md
@@ -1,15 +1,16 @@
![Microbiome.jl logo](logo.png)

## Latest Release
![EcoJulia maintainer: kescobo](https://img.shields.io/badge/EcoJulia%20Maintainer-kescobo-blue.svg)
## Latest Release

[![Latest Release](https://img.shields.io/github/release/EcoJulia/Microbiome.jl.svg)](https://github.com/EcoJulia/Microbiome.jl/releases/latest)

[![Docs stable](https://img.shields.io/badge/docs-stable-blue.svg)](https://docs.ecojulia.org/Microbiome.jl/stable/)
[![License](https://img.shields.io/badge/license-MIT-green.svg)](https://github.com/EcoJulia/Microbiome.jl/blob/master/LICENSE)
[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active)

## Development Status

![EcoJulia maintainer: kescobo](https://img.shields.io/badge/EcoJulia%20Maintainer-kescobo-blue.svg)
[![Docs dev](https://img.shields.io/badge/docs-latest-blue.svg)](https://docs.ecojulia.org/Microbiome.jl/latest/)
[![CI](https://github.com/EcoJulia/Microbiome.jl/workflows/CI/badge.svg)](https://github.com/EcoJulia/Microbiome.jl/actions?query=workflow%3ACI)
[![codecov](https://codecov.io/gh/EcoJulia/Microbiome.jl/branch/main/graph/badge.svg)](https://codecov.io/gh/EcoJulia/Microbiome.jl)
@@ -21,7 +22,10 @@ microbiome and microbial community data.

## Installation

Install Microbiome from the Julia REPL:
To use the latest version of `Microbiome.jl`,
you must be on `julia` v1.6 or greater.

Install `Microbiome.jl` from the Julia REPL:

```julia
julia> using Pkg
2 changes: 2 additions & 0 deletions docs/Project.toml
@@ -1,4 +1,6 @@
[deps]
CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
Dictionaries = "85a47980-9c8c-11e8-2b9f-f7ca1fa99fb4"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
Microbiome = "3bd8f0ae-a0f2-5238-a5af-e1b399a4940c"
99 changes: 99 additions & 0 deletions docs/src/profiles.md
@@ -5,6 +5,15 @@ DocTestSetup = quote
using Microbiome.SparseArrays
using Random
Random.seed!(42)
open("taxprof.csv", "w") do io
print(io, """
features,s1,s2
s__Escherichia_coli,1,0
s__Bifidobacterium_longum,3,2
s__Prevotella_copri,5,4
""")
end
end
```

@@ -304,6 +313,96 @@ julia> metadata(comm)
(sample = "s3", subject = "annelle", foo = "baz", other = missing)
```

## Importing from tabular data

The `CommunityProfile` type is compatible with `Tables.jl`,
so you should be able to import data from tabular sources
with the help of some other packages.

For example, using the [`DataFrames.jl`](https://github.com/JuliaData/DataFrames.jl) package
(which you can install at the REPL with `] add DataFrames`):

```jldoctest
julia> using DataFrames, Microbiome
julia> df = DataFrame(features = ["gene$i|s__Some_species" for i in 1:5], s1 = 1:2:10, s2 = 0:2:9)
5×3 DataFrame
Row │ features s1 s2
│ String Int64 Int64
─────┼─────────────────────────────────────
1 │ gene1|s__Some_species 1 0
2 │ gene2|s__Some_species 3 2
3 │ gene3|s__Some_species 5 4
4 │ gene4|s__Some_species 7 6
5 │ gene5|s__Some_species 9 8
julia> gfs = genefunction.(df.features)
5-element Vector{GeneFunction}:
GeneFunction("gene1", Taxon("Some_species", :species))
GeneFunction("gene2", Taxon("Some_species", :species))
GeneFunction("gene3", Taxon("Some_species", :species))
GeneFunction("gene4", Taxon("Some_species", :species))
GeneFunction("gene5", Taxon("Some_species", :species))
julia> mss = MicrobiomeSample.(names(df)[2:end])
2-element Vector{MicrobiomeSample}:
MicrobiomeSample("s1", {})
MicrobiomeSample("s2", {})
julia> CommunityProfile(Matrix(df[!, 2:end]), gfs, mss)
CommunityProfile{Int64, GeneFunction, MicrobiomeSample} with 5 features in 2 samples
Feature names:
gene1, gene2, gene3, gene4, gene5
Sample names:
s1, s2
```

Alternatively, with a CSV file, and the [`CSV.jl`](https://github.com/JuliaData/CSV.jl) package:

```jldoctest csvexample
julia> println.(eachline("taxprof.csv"));
features,s1,s2
s__Escherichia_coli,1,0
s__Bifidobacterium_longum,3,2
s__Prevotella_copri,5,4
julia> using CSV, CSV.Tables
julia> tbl = CSV.read("taxprof.csv", Tables.columntable);
julia> txs = taxon.(tbl[1])
3-element Vector{Taxon}:
Taxon("Escherichia_coli", :species)
Taxon("Bifidobacterium_longum", :species)
Taxon("Prevotella_copri", :species)
julia> mss = [MicrobiomeSample(string(k)) for k in keys(tbl)[2:end]]
2-element Vector{MicrobiomeSample}:
MicrobiomeSample("s1", {})
MicrobiomeSample("s2", {})
julia> mat = hcat([tbl[i] for i in 2:length(tbl)]...)
3×2 Matrix{Int64}:
1 0
3 2
5 4
julia> CommunityProfile(mat, txs, mss)
CommunityProfile{Int64, Taxon, MicrobiomeSample} with 3 features in 2 samples
Feature names:
Escherichia_coli, Bifidobacterium_longum, Prevotella_copri
Sample names:
s1, s2
```

You may also be interested in [`BiobakeryUtils.jl`](https://github.com/EcoJulia/BiobakeryUtils.jl)
which has convenient functions for reading in file types generated by
the `bioBakery` suite of computational tools.

## Types and Methods

```@docs
9 changes: 5 additions & 4 deletions docs/src/samples_features.md
@@ -19,9 +19,10 @@ At its most basic, an `AbstractSample` simply encodes a `name`
The concrete type [`MicrobiomeSample`](@ref) is implemented with these two fields,
the latter of which is a `Dictionary` from [`Dictionaries.jl`](https://github.com/andyferris/Dictionaries.jl).


You can instantiate a `MicrobiomeSample` with just a name (in which case the metadata dictionary will be empty),
using keyword arguments for metadata entries,
or with existing metadata in the form of a dictionary.
or with existing metadata in the form of a dictionary (with keys of type `Symbol`) or a `NamedTuple`.

```jldoctest sampletypes
julia> s1 = MicrobiomeSample("sample1")
@@ -30,8 +31,8 @@ MicrobiomeSample("sample1", {})
julia> s2 = MicrobiomeSample("sample2"; age=37)
MicrobiomeSample("sample2", {:age = 37})
julia> s3 = MicrobiomeSample("sample3", Dictionary([:gender, :age], ["female", 23]))
MicrobiomeSample("sample3", {:gender = "female", :age = 23})
julia> s3 = MicrobiomeSample("sample3", Dict(:gender=>"female", :age=>23))
MicrobiomeSample("sample3", {:age = 23, :gender = "female"})
```

### Working with metadata
@@ -45,7 +46,7 @@ julia> insert!(s1, :age, 50)
MicrobiomeSample("sample1", {:age = 50})
julia> set!(s3, :gender, "nonbinary")
MicrobiomeSample("sample3", {:gender = "nonbinary", :age = 23})
MicrobiomeSample("sample3", {:age = 23, :gender = "nonbinary"})
julia> delete!(s3, :gender)
MicrobiomeSample("sample3", {:age = 23})
35 changes: 23 additions & 12 deletions src/features.jl
@@ -42,7 +42,9 @@ Microbial taxon with a name and a rank that can be one of
9. `:strain`
or `missing`. Constructors can also use numbers 0-9, or pass a string alone
(in which case the `taxon` will be stored as `missing`)
(in which case the `taxon` will be stored as `missing`).
See also [`taxon`](@ref Microbiome.taxon).
"""
struct Taxon <: AbstractFeature
name::String
@@ -94,7 +96,7 @@ end
"""
taxrank(t::Union{Taxon, missing})
Get the `rank` field from an `Taxon`.
Get the `rank` field from a [`Taxon`](@ref) `t`.
Returns `missing` if the rank is not set.
"""
taxrank(t::Taxon) = t.rank
@@ -103,7 +105,9 @@ taxrank(::Missing) = missing
"""
hasrank(t::Taxon)::Bool
Pretty self-explanatory.
Boolean function that returns `true` if the `rank`
field in [`Taxon`](@ref) `t` is not `missing`,
or `false` if it is `missing`
"""
hasrank(t::Taxon) = !ismissing(taxrank(t))

@@ -123,32 +127,39 @@ GeneFunction(n::AbstractString) = GeneFunction(n, missing)
GeneFunction(n::AbstractString, t::AbstractString) = GeneFunction(n, taxon(t))

"""
taxon(t::GeneFunction)
taxon(gf::GeneFunction)
Get the `taxon` field from a `GeneFunction`.
Get the `taxon` field from a [`GeneFunction`](@ref), `gf`.
Returns `missing` if the taxon is not set.
"""
taxon(gf::GeneFunction) = gf.taxon

"""
hastaxon(t::GeneFunction)::Bool
hastaxon(gf::GeneFunction)::Bool
Pretty self-explanatory.
Boolean function that returns `true` if the `taxon`
field in a [`GeneFunction`](@ref) `gf` is not `missing`,
or `false` if it is `missing`
"""
hastaxon(gf::GeneFunction) = !ismissing(taxon(gf))

"""
taxrank(gf::GeneFunction)
Get the `rank` field from the Taxon, if `gf` has one.
Returns `missing` if the taxon or rank is not set.
Get the `rank` field from the `taxon` field of a [`GeneFunction`](@ref) `gf`
if it has one.
Returns `missing` if the `taxon` or `rank` is not set.
"""
taxrank(gf::GeneFunction) = taxrank(taxon(gf))

"""
hasrank(t::GeneFunction)::Bool
hasrank(gf::GeneFunction)::Bool
Boolean function that returns:
Pretty self-explanatory.
- `true` if `gf` has a [`Taxon`](@ref) with a non-missing `rank` field,
- `false` if there's no `Taxon`, or
- `false` if the `Taxon` has no `rank`
"""
hasrank(gf::GeneFunction) = hastaxon(gf) && !ismissing(taxrank(gf))
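
The `missing`-propagating accessors documented in these docstrings can be sketched in isolation. This is only a minimal illustration, not the package's code: `Tx`, `GF`, `rankof`, and `hasrankof` are hypothetical stand-ins for `Taxon`, `GeneFunction`, `taxrank`, and `hasrank`.

```julia
# Hypothetical stand-ins for Taxon and GeneFunction (names are assumptions).
struct Tx
    name::String
    rank::Union{Missing,Symbol}
end

struct GF
    name::String
    taxon::Union{Missing,Tx}
end

# Accessor with a fallback method for `missing`, so calls can be chained
# without checking for a missing taxon first.
rankof(t::Tx) = t.rank
rankof(::Missing) = missing
rankof(gf::GF) = rankof(gf.taxon)

# `true` only when a taxon is present and its rank is set.
hasrankof(gf::GF) = !ismissing(rankof(gf))

with_rank = GF("gene1", Tx("Some_species", :species))
no_rank   = GF("gene2", Tx("Some_species", missing))
no_taxon  = GF("gene3", missing)
```

In this sketch the `missing` fallback method alone is enough to make `hasrankof` return `false` when there is no taxon; the explicit `hastaxon(gf) &&` guard in the diff states the same condition more directly.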

@@ -157,7 +168,7 @@ Base.String(gf::GeneFunction) = hastaxon(gf) ? string(name(gf), '|', String(taxo
"""
genefunction(n::AbstractString)
Make a gene function from a string,
Make a [`GeneFunction`](@ref) from a string,
converting anything after an initial `|` to a [`Taxon`](@ref).
"""
function genefunction(n::AbstractString)
1 change: 1 addition & 0 deletions src/samples.jl
@@ -167,3 +167,4 @@ struct MicrobiomeSample <: AbstractSample
end

MicrobiomeSample(n::AbstractString; kwargs...) = isempty(kwargs) ? MicrobiomeSample(n, Dictionary{Symbol, Any}()) : MicrobiomeSample(n, dictionary(kwargs))
MicrobiomeSample(n::AbstractString, d::Union{AbstractDict,NamedTuple}) = MicrobiomeSample(n; pairs(d)...)
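
The added method forwards a dictionary or `NamedTuple` to the keyword constructor by splatting `pairs(d)`. A dependency-free sketch of that pattern follows, assuming a hypothetical `Sample` type that stores its metadata in a plain `Dict` instead of a `Dictionaries.jl` `Dictionary`; it is not the package's implementation.

```julia
# Hypothetical stand-in for MicrobiomeSample, backed by a plain Dict.
struct Sample
    name::String
    metadata::Dict{Symbol,Any}
end

# Keyword constructor: every keyword argument becomes a metadata entry.
Sample(n::AbstractString; kwargs...) = Sample(String(n), Dict{Symbol,Any}(kwargs))

# Forwarding constructor: splat the key => value pairs as keyword arguments,
# so dictionaries and named tuples share the keyword code path.
Sample(n::AbstractString, d::Union{AbstractDict,NamedTuple}) = Sample(n; pairs(d)...)

from_kwargs = Sample("sample2"; age=10, birthtype="vaginal")
from_dict   = Sample("sample3", Dict(:age => 10, :birthtype => "vaginal"))
from_nt     = Sample("sample4", (; age=10, birthtype="vaginal"))
```

Because keyword names are `Symbol`s, this forwarding approach only accepts dictionaries keyed by `Symbol`, which matches the note added to `docs/src/samples_features.md`.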
2 changes: 2 additions & 0 deletions test/Project.toml
@@ -1,4 +1,6 @@
[deps]
CSV = "336ed68f-0bac-5ca0-87d4-7b16caf5d00b"
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
SparseArrays = "2f01184e-e22b-5df5-ae63-d93ebab69eaf"
27 changes: 18 additions & 9 deletions test/runtests.jl
@@ -36,16 +36,25 @@ using Documenter
ms.thing2 == "metadata2"
end

# kwargs constructor
ms2 = MicrobiomeSample("sample2"; age=10, birthtype="vaginal", allergies=true)
@test ms2.age == 10
@test ms2.birthtype == "vaginal"
@test ms2.allergies

@test_throws ArgumentError insert!(ms2, (; birthtype="cesarean"))
insert!(ms2, (; foo=10))
@test ms2.foo == 10
set!(ms2, (; birthtype="cesarean"))
@test ms2.birthtype == "cesarean"
# Dict constructor
ms3 = MicrobiomeSample("sample3", Dict(:age=>10, :birthtype=>"vaginal", :allergies=>true))
# NamedTuple constructor
ms4 = MicrobiomeSample("sample4", (;age=10, birthtype="vaginal", allergies=true))

for ms in [ms2, ms3, ms4]
@test ms.age == 10
@test ms.birthtype == "vaginal"
@test Bool(ms.allergies)

@test_throws ArgumentError insert!(ms, (; birthtype="cesarean"))
insert!(ms, (; foo=10))
@test ms.foo == 10
set!(ms, (; birthtype="cesarean"))
@test ms.birthtype == "cesarean"
end

end

@testset "Taxa" begin