Biodiversity page: query by taxon through an extendable tree #35

Open
NicBailly opened this issue May 1, 2019 · 0 comments

@QQ-Sortiz

  1. Page

http://www.seaaroundus.org/data/#/topic/biodiversity

  2. Rationale

It seems that users are confused by the labels of the supraspecific taxa.
Example:
Take the genus Salmo (common name: trouts) and two of its species.
Assume that the total catches reported for the three taxa are:
Salmo 3
Salmo salar 5
Salmo trutta 7

In the current interface, selecting Salmo displays the graph of the catch reported at the genus level only, i.e., 3.
However, a number of users seem to think that selecting Salmo also selects S. salar and S. trutta by default, and therefore that the graph shows the cumulative catch, i.e., 3+5+7 = 15.

There are also cases of catch reported as NEI (= not elsewhere included). The correct structure is, e.g., for sharks:
- Sharks
+ Sharks NEI
+ Families of sharks
+ ...

and not
- Sharks NEI
- Sharks
+ Families of sharks
+ ...

In fact, in the FAO ASFIS table, all supraspecific taxa are labelled NEI, which is not the case in SAU.

  3. Goal and objectives

Reorganize the page to include an extendable tree built from the taxon table, so that users can query the catch data graphically by taxon.

3.1. Query through an extendable tree
The tree should reflect the hierarchical structure of the taxon table and offer the cumulative catch of a node plus that of all its children, down to the terminal nodes.
Deng and Vulcan have already proposed a solution through Mongoose/JavaScript/Json (MJJ hereafter) [PLACE A LINK TO WHERE THE ZIP FILE IS DEPOSITED].
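
As a rough illustration of the difference between the catch allocated to a node and the cumulative catch of its subtree, here is a minimal Python sketch. It is not the MJJ solution; the row layout (name, parent, catch) and the function names are assumptions made for this example, and the figures are the illustrative ones from the Rationale.

```python
from collections import defaultdict

# Hypothetical rows: (taxon name, parent name or None, catch allocated to that node).
# The figures are the illustrative ones from the Rationale, not real data.
ROWS = [
    ("Salmo", None, 3),
    ("Salmo salar", "Salmo", 5),
    ("Salmo trutta", "Salmo", 7),
]

def build_children(rows):
    """Map each parent name to the list of its direct children."""
    children = defaultdict(list)
    for name, parent, _ in rows:
        if parent is not None:
            children[parent].append(name)
    return children

def cumulative_catch(name, own_catch, children):
    """Catch allocated to `name` plus the catch of every descendant."""
    total = own_catch.get(name, 0)
    for child in children.get(name, []):
        total += cumulative_catch(child, own_catch, children)
    return total

own_catch = {name: catch for name, _, catch in ROWS}
children = build_children(ROWS)

print(own_catch["Salmo"])                              # node only: 3
print(cumulative_catch("Salmo", own_catch, children))  # subtree: 15
```

Nodes without allocated catch simply contribute 0, so the same function would also work for intermediate nodes that exist only to group their children.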

3.2. Selection of catches at each node
The first difficulty lies in the interface: how to let users choose between the catch allocated to a node only and the cumulative catch of the node with all its children (the subtree starting from the node).
The second, minor difficulty is to prepare the data for the page http://www.seaaroundus.org/data/#/taxon through dedicated queries.

  4. taxon_tree new table

Constraints:

  • the taxon table records only taxa with catch values.
  • the table generally sticks to the FAO names, but not in every case.

4.1. Three issues
4.1.1. Missing levels
In the taxon table, the structure of the classification tree does not include all levels (genera, families, orders, etc.): levels with no reported catch are missing.
Example:
The three species of Gadus (Gadus macrocephalus, Gadus morhua, Gadus ogac) have reported catch, but no catch is reported under the genus Gadus itself.
Currently under MJJ, each of the three species can be selected separately, but there is no Gadus node that would allow selecting the genus and its three species at the same time (selection of one subtree).
However, the field Lineage stores these nodes. This may make it possible to show the intermediary levels. To be explored.
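
To explore the idea, here is a minimal sketch of how intermediary nodes could be recovered from lineage strings. The layout of the Lineage field is not described in this issue, so the "|" separator, the sample values, and the function name `parent_links` are all assumptions.

```python
# Hypothetical Lineage values; the real field layout and separator are not
# described in this issue, so "|" is an assumption.
LINEAGES = {
    "Gadus macrocephalus": "Gadiformes|Gadidae|Gadus|Gadus macrocephalus",
    "Gadus morhua": "Gadiformes|Gadidae|Gadus|Gadus morhua",
    "Gadus ogac": "Gadiformes|Gadidae|Gadus|Gadus ogac",
}

def parent_links(lineages, sep="|"):
    """Return {node: parent} for every name found in the lineages."""
    parent_of = {}
    for lineage in lineages.values():
        parent = None
        for name in lineage.split(sep):
            # Keep the first parent seen for a name; roots get None.
            parent_of.setdefault(name, parent)
            parent = name
    return parent_of

parent_of = parent_links(LINEAGES)

# Nodes that appear only inside a lineage, i.e. taxa without a record of their own.
missing = sorted(set(parent_of) - set(LINEAGES))
print(missing)  # ['Gadidae', 'Gadiformes', 'Gadus']
```

With such links, a Gadus node could be shown in the tree even though no catch is reported at the genus level.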

Consequence: the nodes at the same level of the tree are not necessarily of the same taxonomic rank. This is acceptable.

image

Fig.1: Snapshot of the MJJ solution showing the Salmo and Gadus examples. Note this is based on data from 2015. In particular, the red nodes (nodes without catch data) have been removed from the taxon table since then.

4.1.2. Missing taxa
Some nodes like Selachii (sharks) are not in the table taxon at all, neither as a record nor in Lineages.

With reference to the constraints above, the consequence of the second point is that it is difficult to add records with no catch to the taxon table, whether taxa that are already in Lineages or taxa that do not exist at all. It is therefore reasonable to create a second taxon table (taxon_tree), so as not to disturb the routines, queries, and webpages that are based on the taxon table. The drawback is that whenever the taxon table is updated, the updates will also have to be inserted into the taxon_tree table. However, the taxon table is not modified often, so this should remain a minor issue.

4.1.3. Non-taxonomic names
With changes in classification methods, the classification itself has changed considerably. For example, shrimps, which used to be grouped together, are now split into two groups, one of which is more closely related to crabs and lobsters than to the other group of shrimps. However, in the framework of fisheries, it seems more reasonable to keep such groups. These groups then do not fit into the classification, so the structure of the taxon_tree table will need to accommodate them and will not fully match the taxon table.

4.2. Structure and content of the table
Prior to further work, the taxon table was cleaned (standardization of common names, misspellings, common names added, missing data or corrections in the taxon level fields). An Excel spreadsheet reports the changes (also in the table TTT) [PLACE A LINK TO WHERE THE EXCEL FILE IS DEPOSITED].

Then a table named taxon_tree was created with only the taxonomic fields (copied from the taxon table). Currently the taxon table contains one field per taxonomic rank, plus the field Lineages, which concatenates all the taxon rank fields.

To anticipate some programming needs, fields were added and filled in to make the table self-referencing (taxon parent-child relationship) and to store the listing sequence of names.
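
For illustration only, the sketch below shows one plausible reading of these fields: a parent identifier on each row, and a listing sequence derived from a depth-first traversal with siblings sorted by name. The column layout (taxon_id, parent_id, name), the ids, and the ordering rule are assumptions, not the actual content of taxon_tree.

```python
# Hypothetical taxon_tree rows: (taxon_id, parent_id, name). The ids and
# column names are illustrative and not taken from the real table.
ROWS = [
    (1, None, "Gadidae"),
    (2, 1, "Gadus"),
    (3, 2, "Gadus morhua"),
    (4, 2, "Gadus macrocephalus"),
    (5, 2, "Gadus ogac"),
]

def listing_sequence(rows):
    """Assign each row a 1-based position in a depth-first, alphabetical listing."""
    names = {tid: name for tid, _, name in rows}
    children = {}
    for tid, parent, _ in rows:
        children.setdefault(parent, []).append(tid)

    sequence = {}

    def visit(tid):
        # A parent is always listed before its children.
        sequence[tid] = len(sequence) + 1
        for child in sorted(children.get(tid, []), key=names.get):
            visit(child)

    for root in sorted(children.get(None, []), key=names.get):
        visit(root)
    return sequence

seq = listing_sequence(ROWS)
for tid, _, name in sorted(ROWS, key=lambda row: seq[row[0]]):
    print(seq[tid], name)
```

Storing such a sequence in the table would let a page list the names in tree order with a simple sort, without walking the tree at display time.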

Some nodes were added (thus with no catch), but the nodes that exist only in the Lineage field (and not as a record) were not created as records. We first need to check whether the ltree or 'Recursive' [FIND THE CORRECT NAME] commands in pgSQL could help here.
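
As a starting point for that check: in PostgreSQL, recursive queries are written as common table expressions introduced by `WITH RECURSIVE`, while ltree is a separate extension module. The sketch below selects a subtree with a recursive query. It runs on SQLite through Python's `sqlite3` module purely as a stand-in, since SQLite accepts the same `WITH RECURSIVE` form; the table layout (taxon_id, parent_id, name) and the sample rows are assumptions.

```python
import sqlite3

# In-memory stand-in for the database; column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE taxon_tree (taxon_id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO taxon_tree VALUES (?, ?, ?)",
    [
        (1, None, "Gadidae"),
        (2, 1, "Gadus"),
        (3, 2, "Gadus macrocephalus"),
        (4, 2, "Gadus morhua"),
        (5, 2, "Gadus ogac"),
        (6, 1, "Melanogrammus"),
    ],
)

# Start from the selected node, then repeatedly add the children of the
# rows already collected until no new row is found.
SUBTREE_SQL = """
WITH RECURSIVE subtree(taxon_id, name) AS (
    SELECT taxon_id, name FROM taxon_tree WHERE name = ?
    UNION ALL
    SELECT t.taxon_id, t.name
    FROM taxon_tree AS t
    JOIN subtree AS s ON t.parent_id = s.taxon_id
)
SELECT name FROM subtree ORDER BY name
"""

names = [row[0] for row in conn.execute(SUBTREE_SQL, ("Gadus",))]
print(names)  # ['Gadus', 'Gadus macrocephalus', 'Gadus morhua', 'Gadus ogac']
```

Joining the result to the catch data would then give the cumulative catch of the selected subtree in a single query.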

  5. Display

5.1. Form and functionalities of the tree
NB presented to DP various trees on the web (FB, WoRMS, CoL, Pensoft publisher), but DP wants to stick with the fluffy MJJ stuff.

5.2. Size of the tree framework
The framework of the tree has a constant size in the web page. When many nodes are extended, it becomes messy, and nodes and labels overlap. Although there is little chance that a user would want to see everything (>2000 terminal nodes), we should check whether the size can be extended in some way.

5.3. Node iconography
NB suggests that:

  • the intermediary nodes are represented by triangles with a vertical base on the left and the apex pointing to the right, indicating that there are children. Plain (filled) triangles indicate that there is catch data allocated to that node, and empty triangles that there is none (the data are with the children). One click on the triangle extends the children, another retracts them (we may play with the orientation of the triangle, let's see later). Maybe plain and empty triangles should be in 2 different colors.
  • only the terminal nodes are represented by (plain) circles (plain = catch data allocated to this node, which is always the case for terminal nodes).
  • when hovering over a node, details of the node are shown (as currently in MJJ), but we have to review the information actually listed there; it seems it was there mainly to test MJJ.
  • when double-clicking a node, display the catch graph (for the intermediary nodes, selection of the subtree); when double-clicking the name of a (plain) node, display the catch graph with only the data allocated to that node.

These possibilities have to be tested, and other solutions can be proposed at this stage.
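
To make the suggested rules easier to test, they can be written down as a small decision function. This is only a sketch of the proposal above, not existing MJJ code; the names `Icon`, `node_icon`, and `catch_scope` are hypothetical, and `filled` stands for what the list calls plain.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Icon:
    shape: str    # "triangle" for intermediary nodes, "circle" for terminal ones
    filled: bool  # True when catch data is allocated to the node itself

def node_icon(has_children: bool, has_own_catch: bool) -> Icon:
    """Pick the icon for a node following the suggested iconography."""
    if has_children:
        return Icon("triangle", has_own_catch)
    # Per the proposal, terminal nodes always have catch data, so always filled.
    return Icon("circle", True)

def catch_scope(double_clicked: str) -> str:
    """Return which catch to graph: 'subtree' for the icon, 'node' for the name."""
    if double_clicked == "icon":
        return "subtree"
    if double_clicked == "name":
        return "node"
    raise ValueError(f"unknown target: {double_clicked!r}")

print(node_icon(has_children=True, has_own_catch=False))
print(catch_scope("name"))
```

Keeping these rules in one place would make it simple to try alternatives, such as a different color for empty triangles.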

image
Fig.2: Suggestion for a modified and extended iconography. Note that trouts = genus Salmo, which has no catch data allocated.

End.
