Merge pull request #326 from ZJUEarthData/web
perf: add the built-in dataset for abnormal detection and update the docs.
SanyHe authored Mar 29, 2024
2 parents b7c5fd4 + a710271 commit e9e212d
Showing 4 changed files with 158 additions and 27 deletions.
68 changes: 63 additions & 5 deletions README.md
@@ -61,19 +61,29 @@ Eos Website: https://eos.org/editor-highlights/machine-learning-for-geochemists

## Quick Installation

Our software is well tested on **macOS** and **Windows** systems with **Python 3.9**. Other systems and Python versions are not guaranteed.
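
Before installing, it can help to confirm that the local environment matches the tested combination. The following is a minimal sketch, not part of geochemistrypi: the function name `is_tested_environment` and the two constants are assumptions that simply mirror the sentence above.

```python
import platform
import sys

# Assumption: these values mirror the README's "well tested" statement;
# they are not read from the geochemistrypi package itself.
TESTED_SYSTEMS = {"Darwin", "Windows"}  # platform.system() reports macOS as "Darwin"
TESTED_PYTHON = (3, 9)


def is_tested_environment(system, python_version):
    """Return True when the OS and Python major.minor match the tested setup."""
    return system in TESTED_SYSTEMS and tuple(python_version[:2]) == TESTED_PYTHON


if __name__ == "__main__":
    ok = is_tested_environment(platform.system(), sys.version_info)
    print("tested environment" if ok else "untested environment; installation may still work")
```

An untested environment is not necessarily a broken one; the check only reports whether the setup falls inside the combination named above.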

One instruction to install from the **command line**, such as Terminal on macOS or PowerShell on Windows.

```
pip install geochemistrypi
```

Install the latest version explicitly to avoid issues found in older versions, such as problems when downloading dependencies.
```
pip install "geochemistrypi==0.5.0"
```

One instruction to download on **Jupyter Notebook** or **Google Colab**.

```
!pip install geochemistrypi
```

Check the latest version of our software:
Install the latest version explicitly to avoid issues found in older versions, such as problems when downloading dependencies.
```
!pip install "geochemistrypi==0.5.0"
```
Check the downloaded version of our software:

```
geochemistrypi --version
@@ -95,13 +105,52 @@ One instruction to download on **Jupyter Notebook** or **Google Colab**.
!pip install --upgrade geochemistrypi
```

Check the latest version of our software:
Check the updated version of our software:

```
geochemistrypi --version
```
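
The same version check can be done from Python, which is convenient inside a notebook. This is a sketch that uses only the standard library's `importlib.metadata`; the helper name `installed_version` is an assumption and is not provided by geochemistrypi.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("geochemistrypi")
    print(found if found is not None else "geochemistrypi is not installed")
```

Returning `None` instead of raising keeps the check usable before the package has been installed.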

## Example
## Data Preparation

To use the functions provided by our software, your own data set should:

- have the suffix **.xlsx** or **.csv**, both of which are supported by Microsoft Excel.
- optionally contain location information in two columns, **LATITUDE** and **LONGITUDE**.

If you want to run a **classification** algorithm, your data set should also include:

- a label column. You can name it as you wish, for example **Label**.

Column name specification:

- There is no restriction on column names. You can name them as you want, except for the two special, optional columns **LATITUDE** and **LONGITUDE**.

- Every column can have only one column name. Multi-level column names are not allowed.

- A completely empty column can exist between two columns that contain values.

The following are seven built-in data sets in our software, stored on Google Drive and Tencent Docs. Take a look at them. For the algorithm you intend to run, you can refer to the data format of the corresponding data set.

+ Data_Regression.xlsx [[Google Drive]](https://docs.google.com/spreadsheets/d/13MB4t_2PiZ90tTMJKw7HcBUi2sb3tXej/edit?usp=sharing&ouid=110717816678586054594&rtpof=true&sd=true) | [[Tencent Docs]](https://docs.qq.com/document/DQ3VmdWZCTGV3bmpM?&u=6868f96d4a384b309036e04e637e367a)

+ ApplicationData_Regression.xlsx [[Google Drive]](https://docs.google.com/spreadsheets/d/1FCek2OOYQD887jfQz21g0ovqVuUJIjVoNI77D-Ufr9Y/edit?usp=sharing) | [[Tencent Docs]](
https://docs.qq.com/document/DQ3BDeHhxRGNzSXZN)

+ Data_Classification.xlsx [[Google Drive]](https://docs.google.com/spreadsheets/d/1xFBCYVmtZfuEAbeBljUlzqBjxVuLAt8x/edit?usp=sharing&ouid=110717816678586054594&rtpof=true&sd=true) | [[Tencent Docs]](https://docs.qq.com/document/DQ0JUaUFsZnRaZkNG?&u=6868f96d4a384b309036e04e637e367a)

+ ApplicationData_Classification.xlsx [[Google Drive]](https://docs.google.com/spreadsheets/d/1J7QvdvbbHJMlKtiumBgKDW7ALghfQQZyKGEoOqhKQjw/edit?usp=sharing) | [[Tencent Docs]](https://docs.qq.com/document/DQ2dnQWtubHRBTGtB)

+ Data_Clustering.xlsx [[Google Drive]](https://docs.google.com/spreadsheets/d/1sbuJdOzGNQ2Pk-bVURfPYg1rltyBbn5J/edit?usp=sharing&ouid=110717816678586054594&rtpof=true&sd=true) | [[Tencent Docs]](https://docs.qq.com/document/DQ3dKdGtlWkhZS2xR?&u=6868f96d4a384b309036e04e637e367a)

+ Data_Decomposition.xlsx [[Google Drive]](https://docs.google.com/spreadsheets/d/1kix82qj5--vhnm8-KhuUBH9dqYH6zcY8/edit?usp=sharing&ouid=110717816678586054594&rtpof=true&sd=true) | [[Tencent Docs]](https://docs.qq.com/document/DQ29oZ0lhUGtZUmdN?&u=6868f96d4a384b309036e04e637e367a)

+ Data_AbnormalDetectioon.xlsx [[Google Drive]](https://docs.google.com/spreadsheets/d/1NqTQZCkv74Sn_iOJOKRc-QnJzpaWmnzC_lET_0ZreiQ/edit?usp=sharing) | [[Tencent Docs]](
https://docs.qq.com/document/DQ2hqQ2N2ZGlOUWlT)

**Note**: For more details on data preparation, please refer to our online documentation in **Model Example** under the **FOR USER** section.
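
The data preparation rules above can be sketched as a small pre-flight check. This is an illustration only, not the validation that geochemistrypi performs: the function `check_data_set` is hypothetical, it takes the header row as a plain list, and it assumes that a lone **LATITUDE** or **LONGITUDE** column is a mistake and that the suffix comparison is case-insensitive.

```python
from pathlib import PurePath

SUPPORTED_SUFFIXES = {".xlsx", ".csv"}
LOCATION_COLUMNS = ("LATITUDE", "LONGITUDE")


def check_data_set(filename, header, label_column=None):
    """Return a list of problems; an empty list means the basic rules hold.

    Hypothetical helper: `header` is the list of column names from the first
    row, and `label_column` is only needed for classification.
    """
    problems = []
    if PurePath(filename).suffix.lower() not in SUPPORTED_SUFFIXES:
        problems.append("file suffix must be .xlsx or .csv")
    # The location columns are optional; this sketch assumes they come as a pair.
    present = [name for name in LOCATION_COLUMNS if name in header]
    if len(present) == 1:
        problems.append(f"{present[0]} is present without its partner column")
    if label_column is not None and label_column not in header:
        problems.append(f"label column {label_column!r} is missing")
    return problems
```

For example, `check_data_set("data.csv", ["U(PPM)"], label_column="Label")` reports the missing label column, while the same call without `label_column` returns an empty list.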

## Running Example

**How to run:** After a successful installation, run this instruction from the **command line / Jupyter Notebook / Google Colab**, in any directory.

Expand Down Expand Up @@ -181,6 +230,12 @@ For more details: Please refer to:

- MLflow UI user guide - Geochemistry π v0.5.0 [[Bilibili]](https://b23.tv/CW5Rjmo) | [[YouTube]](https://www.youtube.com/watch?v=Yu1nzNeLfRY)

The following screenshot shows the download and launch of our software on macOS:

<p align="center">
<img src="https://github.com/ZJUEarthData/geochemistrypi/assets/47497750/70728795-59b7-4741-ab5b-9e63d284ad37" alt="Downloads and Launching on macOS" width="450" />
</p>

## Roadmap

### First Phase
Expand Down Expand Up @@ -247,7 +302,6 @@ The whole package is under construction and the documentation is progressively e

+ Jianming Zhao (Jamie, Zhejiang University, China)
+ Jianhao Sun (Jin, China University of Geosciences, Wuhan, China)
+ Kaixin Zheng (Hayne, Sun Yat-sen University, China)
+ Yongkang Chan (Kill-virus, Lanzhou University, China)
+ Mengying Ye (Mary, Jilin University, China)
+ Mengqi Gao (China University of Geosciences, Beijing, China)
@@ -261,6 +315,9 @@ The whole package is under construction and the documentation is progressively e
+ Yucheng Yan (Andy, University of Sydney, Australia)
+ Ruitao Chang (China University of Geosciences Beijing, China)
+ Junchi Liao (Roceda, University of Electronic Science and Technology of China, China)
+ Panyan Weng (The University of Sydney, Australia)
+ Siqi Yao (Clara, Dongguan University of Technology, China)
+ Zhelan Lin (Lan, Fuzhou University, China)

## Join Us :)

Expand Down Expand Up @@ -327,6 +384,7 @@ More Videos will be recorded soon.
+ Shengxin Wang (Samson, Lanzhou University, China)
+ Wenyu Zhao (Molly, Zhejiang University, China)
+ Qiuhao Zhao (Brad, Zhejiang University, China)
+ Kaixin Zheng (Hayne, Sun Yat-sen University, China)
+ Anzhou Li (Andrian, Zhejiang University, China)
+ Dan Hu (Notre Dame University, United States)
+ Xunxin Liu (Tante, China University of Geosciences, Wuhan, China)
@@ -9,26 +9,40 @@ Firstly you need to start the geochemistrypi programm via command line instrucit

To use the functions provided by our software, your own data set should:

- have the suffix **.xlsx**, which is supported by Microsoft Excel.
- contain location information in two columns, **LATITUDE** and **LONGITUDE**.
- have the suffix **.xlsx** or **.csv**, both of which are supported by Microsoft Excel.
- optionally contain location information in two columns, **LATITUDE** and **LONGITUDE**.

If you want to run a **classification** algorithm, your data set should also include:

- Tag column **LABEL** to differentiate the data.
- a label column. You can name it as you wish, for example **Label**.

The following are four built-in data sets in our software, stored on Google Drive. Take a look at them. For the algorithm you intend to run, you can refer to the data format of the corresponding data set.
Column name specification:

+ [Data_Regression.xlsx (International - Google drive)](https://docs.google.com/spreadsheets/d/13MB4t_2PiZ90tTMJKw7HcBUi2sb3tXej/edit?usp=sharing&ouid=110717816678586054594&rtpof=true&sd=true)
+ [Data_Regression.xlsx (China - Tencent Docs)](https://docs.qq.com/document/DQ3VmdWZCTGV3bmpM?&u=6868f96d4a384b309036e04e637e367a)
- There is no restriction on column names. You can name them as you want, except for the two special, optional columns **LATITUDE** and **LONGITUDE**.

- Every column can have only one column name. Multi-level column names are not allowed.

- A completely empty column can exist between two columns that contain values.

The following are seven built-in data sets in our software, stored on Google Drive and Tencent Docs. Take a look at them. For the algorithm you intend to run, you can refer to the data format of the corresponding data set.

+ Data_Regression.xlsx [[Google Drive]](https://docs.google.com/spreadsheets/d/13MB4t_2PiZ90tTMJKw7HcBUi2sb3tXej/edit?usp=sharing&ouid=110717816678586054594&rtpof=true&sd=true) | [[Tencent Docs]](https://docs.qq.com/document/DQ3VmdWZCTGV3bmpM?&u=6868f96d4a384b309036e04e637e367a)

+ ApplicationData_Regression.xlsx [[Google Drive]](https://docs.google.com/spreadsheets/d/1FCek2OOYQD887jfQz21g0ovqVuUJIjVoNI77D-Ufr9Y/edit?usp=sharing) | [[Tencent Docs]](
https://docs.qq.com/document/DQ3BDeHhxRGNzSXZN)

+ Data_Classification.xlsx [[Google Drive]](https://docs.google.com/spreadsheets/d/1xFBCYVmtZfuEAbeBljUlzqBjxVuLAt8x/edit?usp=sharing&ouid=110717816678586054594&rtpof=true&sd=true) | [[Tencent Docs]](https://docs.qq.com/document/DQ0JUaUFsZnRaZkNG?&u=6868f96d4a384b309036e04e637e367a)

+ ApplicationData_Classification.xlsx [[Google Drive]](https://docs.google.com/spreadsheets/d/1J7QvdvbbHJMlKtiumBgKDW7ALghfQQZyKGEoOqhKQjw/edit?usp=sharing) | [[Tencent Docs]](https://docs.qq.com/document/DQ2dnQWtubHRBTGtB)

+ Data_Clustering.xlsx [[Google Drive]](https://docs.google.com/spreadsheets/d/1sbuJdOzGNQ2Pk-bVURfPYg1rltyBbn5J/edit?usp=sharing&ouid=110717816678586054594&rtpof=true&sd=true) | [[Tencent Docs]](https://docs.qq.com/document/DQ3dKdGtlWkhZS2xR?&u=6868f96d4a384b309036e04e637e367a)

+ Data_Decomposition.xlsx [[Google Drive]](https://docs.google.com/spreadsheets/d/1kix82qj5--vhnm8-KhuUBH9dqYH6zcY8/edit?usp=sharing&ouid=110717816678586054594&rtpof=true&sd=true) | [[Tencent Docs]](https://docs.qq.com/document/DQ29oZ0lhUGtZUmdN?&u=6868f96d4a384b309036e04e637e367a)

+ Data_AbnormalDetectioon.xlsx [[Google Drive]](https://docs.google.com/spreadsheets/d/1NqTQZCkv74Sn_iOJOKRc-QnJzpaWmnzC_lET_0ZreiQ/edit?usp=sharing) | [[Tencent Docs]](
https://docs.qq.com/document/DQ2hqQ2N2ZGlOUWlT)

+ [Data_Clustering.xlsx (International - Google drive)](https://docs.google.com/spreadsheets/d/1sbuJdOzGNQ2Pk-bVURfPYg1rltyBbn5J/edit?usp=sharing&ouid=110717816678586054594&rtpof=true&sd=true)
+ [Data_Clustering.xlsx (China - Tencent Docs)](https://docs.qq.com/document/DQ3dKdGtlWkhZS2xR?&u=6868f96d4a384b309036e04e637e367a)

+ [Data_Decomposition.xlsx (International - Google drive)](https://docs.google.com/spreadsheets/d/1kix82qj5--vhnm8-KhuUBH9dqYH6zcY8/edit?usp=sharing&ouid=110717816678586054594&rtpof=true&sd=true)
+ [Data_Decomposition.xlsx (China - Tencent Docs)](https://docs.qq.com/document/DQ29oZ0lhUGtZUmdN?&u=6868f96d4a384b309036e04e637e367a)
#### Loading Data

By running the start command, there will be a prompt if your dataset is successfully loaded:
@@ -43,6 +57,7 @@ By running the start command, there will be a prompt if your dataset is successf
47 - U(PPM)
--------------------
(Press Enter key to move forward.)

#### World Map Projection

After successfully loading your data, you will be asked if you would like to plot a world map projection for a specific element:
78 changes: 68 additions & 10 deletions docs/source/Home/Introduction.md
@@ -62,19 +62,29 @@ Eos Website: https://eos.org/editor-highlights/machine-learning-for-geochemists

## Quick Installation

Our software is well tested on **macOS** and **Windows** systems with **Python 3.9**. Other systems and Python versions are not guaranteed.

One instruction to install from the **command line**, such as Terminal on macOS or PowerShell on Windows.

```
pip install geochemistrypi
```

Install the latest version explicitly to avoid issues found in older versions, such as problems when downloading dependencies.
```
pip install "geochemistrypi==0.5.0"
```

One instruction to download on **Jupyter Notebook** or **Google Colab**.

```
!pip install geochemistrypi
```

Check the latest version of our software:
Install the latest version explicitly to avoid issues found in older versions, such as problems when downloading dependencies.
```
!pip install "geochemistrypi==0.5.0"
```
Check the downloaded version of our software:

```
geochemistrypi --version
@@ -96,13 +106,52 @@ One instruction to download on **Jupyter Notebook** or **Google Colab**.
!pip install --upgrade geochemistrypi
```

Check the latest version of our software:
Check the updated version of our software:

```
geochemistrypi --version
```

## Example
## Data Preparation

To use the functions provided by our software, your own data set should:

- have the suffix **.xlsx** or **.csv**, both of which are supported by Microsoft Excel.
- optionally contain location information in two columns, **LATITUDE** and **LONGITUDE**.

If you want to run a **classification** algorithm, your data set should also include:

- a label column. You can name it as you wish, for example **Label**.
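
Before running a classification algorithm, it can be useful to see which classes the label column actually contains. The sketch below uses only the standard library and works on CSV text; the helper `label_counts` is hypothetical, and the sample values in the usage note are made-up placeholders rather than data from the built-in data sets.

```python
import csv
import io
from collections import Counter


def label_counts(csv_text, label_column):
    """Count how often each class appears in the chosen label column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or label_column not in reader.fieldnames:
        raise KeyError(f"label column {label_column!r} not found")
    return Counter(row[label_column] for row in reader)
```

Calling `label_counts("U(PPM),Label\n1.0,A\n2.0,B\n3.0,A\n", "Label")` counts two rows for class `A` and one for class `B`. Because the label column may carry any name, it is passed in rather than hard-coded.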

Column name specification:

- There is no restriction on column names. You can name them as you want, except for the two special, optional columns **LATITUDE** and **LONGITUDE**.

- Every column can have only one column name. Multi-level column names are not allowed.

- A completely empty column can exist between two columns that contain values.

The following are seven built-in data sets in our software, stored on Google Drive and Tencent Docs. Take a look at them. For the algorithm you intend to run, you can refer to the data format of the corresponding data set.

+ Data_Regression.xlsx [[Google Drive]](https://docs.google.com/spreadsheets/d/13MB4t_2PiZ90tTMJKw7HcBUi2sb3tXej/edit?usp=sharing&ouid=110717816678586054594&rtpof=true&sd=true) | [[Tencent Docs]](https://docs.qq.com/document/DQ3VmdWZCTGV3bmpM?&u=6868f96d4a384b309036e04e637e367a)

+ ApplicationData_Regression.xlsx [[Google Drive]](https://docs.google.com/spreadsheets/d/1FCek2OOYQD887jfQz21g0ovqVuUJIjVoNI77D-Ufr9Y/edit?usp=sharing) | [[Tencent Docs]](
https://docs.qq.com/document/DQ3BDeHhxRGNzSXZN)

+ Data_Classification.xlsx [[Google Drive]](https://docs.google.com/spreadsheets/d/1xFBCYVmtZfuEAbeBljUlzqBjxVuLAt8x/edit?usp=sharing&ouid=110717816678586054594&rtpof=true&sd=true) | [[Tencent Docs]](https://docs.qq.com/document/DQ0JUaUFsZnRaZkNG?&u=6868f96d4a384b309036e04e637e367a)

+ ApplicationData_Classification.xlsx [[Google Drive]](https://docs.google.com/spreadsheets/d/1J7QvdvbbHJMlKtiumBgKDW7ALghfQQZyKGEoOqhKQjw/edit?usp=sharing) | [[Tencent Docs]](https://docs.qq.com/document/DQ2dnQWtubHRBTGtB)

+ Data_Clustering.xlsx [[Google Drive]](https://docs.google.com/spreadsheets/d/1sbuJdOzGNQ2Pk-bVURfPYg1rltyBbn5J/edit?usp=sharing&ouid=110717816678586054594&rtpof=true&sd=true) | [[Tencent Docs]](https://docs.qq.com/document/DQ3dKdGtlWkhZS2xR?&u=6868f96d4a384b309036e04e637e367a)

+ Data_Decomposition.xlsx [[Google Drive]](https://docs.google.com/spreadsheets/d/1kix82qj5--vhnm8-KhuUBH9dqYH6zcY8/edit?usp=sharing&ouid=110717816678586054594&rtpof=true&sd=true) | [[Tencent Docs]](https://docs.qq.com/document/DQ29oZ0lhUGtZUmdN?&u=6868f96d4a384b309036e04e637e367a)

+ Data_AbnormalDetectioon.xlsx [[Google Drive]](https://docs.google.com/spreadsheets/d/1NqTQZCkv74Sn_iOJOKRc-QnJzpaWmnzC_lET_0ZreiQ/edit?usp=sharing) | [[Tencent Docs]](
https://docs.qq.com/document/DQ2hqQ2N2ZGlOUWlT)

**Note**: For more details on data preparation, please refer to our online documentation in **Model Example** under the **FOR USER** section.

## Running Example

**How to run:** After a successful installation, run this instruction from the **command line / Jupyter Notebook / Google Colab**, in any directory.

Expand Down Expand Up @@ -176,10 +225,17 @@ Copy the URL shown on the console into any browser to open the MLflow web interf

For more details, please refer to:

+ [Manual v1.1.0 for Geochemistry π - Beta (International - Google drive)](https://drive.google.com/file/d/1yryykCyWKM-Sj88fOYbOba6QkB_fu2ws/view?usp=sharing)
+ [Manual v1.1.0 for Geochemistry π - Beta (China - Tencent Docs)](https://docs.qq.com/pdf/DQ0l5d2xVd2VwcnVW?&u=6868f96d4a384b309036e04e637e367a)
+ [Geochemistry π - Download and Run the Beta Version (International - Youtube)](https://www.youtube.com/watch?v=EeVaJ3H7_AU&list=PLy8hNsI55lvh1UHjhVhqNUj3xPdV9sEiM&index=9)
+ [Geochemistry π - Download and Run the Beta Version (China - Bilibili)](https://www.bilibili.com/video/BV1UM4y1Q7Ju/?spm_id_from=333.999.0.0&vd_source=27944ab3b73a78970c1a52a5dcbb9140)
- Manual v1.1.0 for Geochemistry π - Beta [[Tencent Docs]](https://docs.qq.com/pdf/DQ0l5d2xVd2VwcnVW?&u=6868f96d4a384b309036e04e637e367a) | [[Google Drive]](https://drive.google.com/file/d/1yryykCyWKM-Sj88fOYbOba6QkB_fu2ws/view?usp=sharing)

- Geochemistry π - Download and Run the Beta Version [[Bilibili]](https://www.bilibili.com/video/BV1UM4y1Q7Ju/?spm_id_from=333.999.0.0&vd_source=27944ab3b73a78970c1a52a5dcbb9140) | [[YouTube]](https://www.youtube.com/watch?v=EeVaJ3H7_AU&list=PLy8hNsI55lvh1UHjhVhqNUj3xPdV9sEiM&index=9)

- MLflow UI user guide - Geochemistry π v0.5.0 [[Bilibili]](https://b23.tv/CW5Rjmo) | [[YouTube]](https://www.youtube.com/watch?v=Yu1nzNeLfRY)

The following screenshot shows the download and launch of our software on macOS:

<p align="center">
<img src="https://github.com/ZJUEarthData/geochemistrypi/assets/47497750/70728795-59b7-4741-ab5b-9e63d284ad37" alt="Downloads and Launching on macOS" width="450" />
</p>

## Roadmap

Expand Down Expand Up @@ -236,7 +292,6 @@ The whole package is under construction and the documentation is progressively e

![Geochemistry π.png](https://github.com/ZJUEarthData/geochemistrypi/assets/97781484/e77b1f11-41ab-4354-9064-6d62cc1bf1e4)


## Team Info

**Leader:**
@@ -248,7 +303,6 @@ The whole package is under construction and the documentation is progressively e

+ Jianming Zhao (Jamie, Zhejiang University, China)
+ Jianhao Sun (Jin, China University of Geosciences, Wuhan, China)
+ Kaixin Zheng (Hayne, Sun Yat-sen University, China)
+ Yongkang Chan (Kill-virus, Lanzhou University, China)
+ Mengying Ye (Mary, Jilin University, China)
+ Mengqi Gao (China University of Geosciences, Beijing, China)
@@ -262,6 +316,9 @@ The whole package is under construction and the documentation is progressively e
+ Yucheng Yan (Andy, University of Sydney, Australia)
+ Ruitao Chang (China University of Geosciences Beijing, China)
+ Junchi Liao (Roceda, University of Electronic Science and Technology of China, China)
+ Panyan Weng (The University of Sydney, Australia)
+ Siqi Yao (Clara, Dongguan University of Technology, China)
+ Zhelan Lin (Lan, Fuzhou University, China)

## Join Us :)

Expand Down Expand Up @@ -328,6 +385,7 @@ More Videos will be recorded soon.
+ Shengxin Wang (Samson, Lanzhou University, China)
+ Wenyu Zhao (Molly, Zhejiang University, China)
+ Qiuhao Zhao (Brad, Zhejiang University, China)
+ Kaixin Zheng (Hayne, Sun Yat-sen University, China)
+ Anzhou Li (Andrian, Zhejiang University, China)
+ Dan Hu (Notre Dame University, United States)
+ Xunxin Liu (Tante, China University of Geosciences, Wuhan, China)
Binary file not shown.
