Merge pull request #454 from usc-isi-i2/dev
release 1.0.0
saggu authored Oct 22, 2021
2 parents b8860c8 + 038b180 commit dbb28b8
Showing 199 changed files with 31,881 additions and 6,462 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/python-publish.yml
@@ -17,7 +17,7 @@ jobs:
- name: Set up Python
uses: actions/setup-python@v2
with:
python-version: '3.7'
python-version: '3.8'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
40 changes: 29 additions & 11 deletions .github/workflows/run-tests.yml
@@ -11,25 +11,43 @@ jobs:
- name: Install Python
uses: actions/setup-python@v2
with:
python-version: '3.7'
- name: Add conda to system path
run: |
echo $CONDA/bin >> $GITHUB_PATH
python-version: '3.8'
# 11-Oct-2021: Adding the system conda gives pip
# access to python 3.9 site packages:
#
# /usr/share/miniconda/lib/python3.9/site-packages
#
# We'd prefer to use python 3.8 consistently while
# chasing down problems.
#
# - name: Add conda to system path
# run: |
# echo $CONDA/bin >> $GITHUB_PATH
- name: Setup conda
uses: s-weigand/setup-conda@v1
with:
update-conda: true
python-version: '3.8'
conda-channels: anaconda, conda-forge
- name: Setup env
run: |
pip install --upgrade pip
pip install -e .
pip install coveralls
python -m spacy download en_core_web_sm
conda install -c conda-forge graph-tool
pip uninstall -y rdflib
pip install git+https://github.com/RDFLib/rdflib.git@master
# 11-Oct-2021: Do not use rdflib 6.0.0 and later, at
# the moment there's a fatal interaction, probably with ETK.
# See issues #537 and #538.
#
# pip uninstall -y rdflib
# pip install git+https://github.com/RDFLib/rdflib.git@master
- name: Run Tests
run: |
cd tests
coverage run --source=kgtk -m unittest discover
- name: Coveralls Finished
uses: coverallsapp/github-action@master
with:
github-token: ${{ secrets.COVERALLS_TOKEN }}
parallel-finished: true
- name: Coverage
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
coveralls --service=github
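
The workflow comments above name two environment hazards: pip seeing site-packages from a different Python version (3.9 from the system conda instead of 3.8), and rdflib 6.0.0 or later. A minimal sketch of a sanity check for both follows. The helper names `foreign_site_packages` and `rdflib_is_supported` are hypothetical and not part of KGTK or the workflow, and the "below 6" rule is an assumption taken from the workflow comment.

```python
import re
import sys


def foreign_site_packages(paths, major, minor):
    """Return the entries of `paths` that name a pythonX.Y other than major.minor."""
    expected = f"python{major}.{minor}"
    pattern = re.compile(r"python(\d+)\.(\d+)")
    foreign = []
    for path in paths:
        match = pattern.search(path)
        if match and match.group(0) != expected:
            foreign.append(path)
    return foreign


def rdflib_is_supported(version):
    """True when the rdflib major version is below 6 (assumption from the comment above)."""
    return int(version.split(".")[0]) < 6


if __name__ == "__main__":
    # Report any sys.path entry that belongs to another Python version.
    current = sys.version_info
    print("foreign entries:", foreign_site_packages(sys.path, current.major, current.minor))
    try:
        from importlib.metadata import version  # available in Python 3.8+
        print("rdflib supported:", rdflib_is_supported(version("rdflib")))
    except Exception as exc:  # rdflib not installed, or importlib.metadata missing
        print("rdflib not checked:", exc)
```

Run inside the test environment, an empty list for the foreign entries suggests pip and the interpreter agree on a single Python version.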
108 changes: 5 additions & 103 deletions README.md
@@ -20,24 +20,16 @@ KGTK can process Wikidata-sized KGs with billions of edges on a laptop. We have

KGTK is open source software, well documented, actively used and developed, and released using the MIT license. We invite the community to try KGTK. It is easy to get started with our tutorial notebooks available and executable online.



## Getting started

### Documentation

https://kgtk.readthedocs.io/en/latest/
### Online Documentation

### Demo: try KGTK online with MyBinder
The easiest, no-cost way of trying out KGTK is through [MyBinder](https://mybinder.org/). We have made available several **example notebooks** to show some of the features of KGTK, which can be run in two environments:
You can read our latest documentation online with:

* **Basic KGTK functionality**: This notebook may take **5-10 minutes** to launch, please be patient. Note that in this notebook some KGTK commands (graph analytics and embeddings) **will not run**. To launch the notebook in your browser, click on the "Binder" icon: [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/usc-isi-i2/kgtk/master?filepath=examples%2FExample5%20-%20AIDA%20AIF.ipynb)

* **Advanced KGTK functionality**: This notebook may take **10-20 minutes to launch**. It includes basic KGTK functionality and **graph analytics and embedding capabilities** of KGTK: [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/dgarijo/kgtk/dev?filepath=%2Fkgtk%2Fexamples%2FCSKG%20Use%20Case.ipynb)
https://kgtk.readthedocs.io/en/latest/

For executing KGTK with large datasets, **we recommend a Docker/local installation**.

### KGTK notebooks
### KGTK Notebooks

The [examples folder](examples/) provides a larger and constantly increasing number of easy-to-follow Jupyter Notebooks which showcase different functionalities of KGTK. These include computing:
* Embeddings for ConceptNet nodes
@@ -52,97 +44,7 @@ The [examples folder](examples/) provides a larger and constantly increasing num

## Installation


### Installation through Docker

```
docker pull uscisii2/kgtk
```

To run KGTK in the command line:

```
docker run -it --rm --user root -e NB_GID=100 -e GEN_CERT=yes -e GRANT_SUDO=yes uscisii2/kgtk:latest /bin/bash
```

Note: if you want to load data from your local machine, you will need to [mount a volume](https://docs.docker.com/storage/volumes/).
For example, to mount the current directory (`$PWD`) and launch KGTK in command line mode:

```
docker run -it --rm -v $PWD:/out --user root -e NB_GID=100 -e GEN_CERT=yes -e GRANT_SUDO=yes uscisii2/kgtk:latest /bin/bash
```

If you want to run KGTK in a **Jupyter notebook**, mounting the current directory (`$PWD`) as a folder called `/out` then you will have to type:
```
docker run -it -v $PWD:/out -p 8888:8888 uscisii2/kgtk:latest /bin/bash -c "jupyter notebook --ip='*' --port=8888 --no-browser"
```

More information about versions and tags is available here: https://hub.docker.com/repository/docker/uscisii2/kgtk. For example, the `dev` branch is available at `uscisii2/kgtk:latest-dev`.

See additional examples in [the documentation](https://kgtk.readthedocs.io/en/latest/install/).

### Local installation

Our installation will be in a **conda environment**. If you don't have conda installed, follow [link](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to install it. Once installed, follow the instructions below:

1. Set up your own conda environment:
```
conda create -n kgtk-env python=3.7
conda activate kgtk-env
```
**Note:** Installing Graph-tool is problematic on python 3.8 and out of a virtual environment. Thus: **the advised installation path is by using a virtual environment.**

2. Install (the dev branch at this point): `pip install kgtk`

You can test if `kgtk` is installed properly now with: `kgtk -h`.

3. Download the English model of SpaCY: `python -m spacy download en_core_web_sm`

4. Install `graph-tool`: `conda install -c conda-forge graph-tool`. If you don't use conda or run into problems, see these [instructions](https://git.skewed.de/count0/graph-tool/-/wikis/installation-instructions).

5. Python library rdflib has a known [issue](https://github.com/RDFLib/rdflib/issues/1043), where the ttl serialization of decimal values is incorrect. The library will add a `.0` at the end of decimal values in scientific notation. This will make the ttl invalid and cannot be loaded into a triplestore.

To solve this issue, run the following commands after the `kgtk` installation is complete.
```
pip uninstall rdflib
pip install git+https://github.com/RDFLib/rdflib.git@master
```

The code fix for this bug is already merged into the library, but has not been released as a `pypi` package. This step will be removed after `rdflib` version 6 is released.

### Updating your KGTK installation
To update your version of KGTK, just follow the instructions below:

- If you installed KGTK through Docker, then just pull the most recent image: `docker pull <image_name>`, where `<image_name>` is the tag of the image of interest (e.g. `uscisii2/kgtk:latest`).
- If you installed KGTK from pip, then type `pip install -U kgtk`.
- If you installed KGTK from GitHub, then type `git pull && pip install` . Alternatively, you may execute: `git pull && python setup.py install`.
- If you installed KGTK in development mode (i.e., `pip install -e`), then you only need to update your repository: `git pull`.

## Running KGTK commands

To list all the available KGTK commands, run:

```
kgtk -h
```

To see the arguments of a particular command, run:

```
kgtk <command> -h
```

An example command that computes instances of the subclasses of two classes:

```
kgtk instances --transitive --class Q13442814,Q12345678
```

## Running unit tests locally
```
cd kgtk/tests
python -W ignore -m unittest discover
```
Please see our [installation document](/docs/install.md) for installation procedures.

## KGTK Text Search API

14 changes: 9 additions & 5 deletions docker/Dockerfile
@@ -10,20 +10,24 @@ RUN apt-get update && apt-get install -y \
libxrandr-dev \
libxinerama-dev \
pv \
gcc
gcc

RUN pip install thinc==7.4.0
RUN apt-get install --reinstall build-essential -y

RUN git clone https://github.com/usc-isi-i2/kgtk/
RUN pip install huggingface-hub==0.0.17

RUN cd /kgtk && python setup.py install
RUN git clone https://github.com/usc-isi-i2/kgtk/

RUN cd /kgtk && python setup.py install

RUN conda update -n base -c defaults conda

RUN conda install -c conda-forge graph-tool

RUN conda install -c conda-forge jupyterlab

RUN pip install chardet

ARG NB_USER=jovyan
ARG NB_UID=1000
ENV USER ${NB_USER}
@@ -34,7 +38,7 @@ RUN adduser --disabled-password \
--gecos "Default user" \
--uid ${NB_UID} \
${NB_USER}

COPY . ${HOME}
USER root
RUN chown -R ${NB_UID} ${HOME}
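
The Dockerfile above pins `thinc==7.4.0` and `huggingface-hub==0.0.17`. A minimal sketch of how one might confirm, inside the built image, that the installed versions still match those pins is shown below. The `PINS` mapping and the `pin_mismatches` helper are hypothetical illustrations, not part of KGTK; the sketch assumes Python 3.8 or later for `importlib.metadata`.

```python
from importlib import metadata

# Pins copied from the Dockerfile above.
PINS = {"thinc": "7.4.0", "huggingface-hub": "0.0.17"}


def pin_mismatches(pins, lookup=metadata.version):
    """Return {package: installed_version_or_None} for every pin that is not satisfied."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = lookup(name)
        except (metadata.PackageNotFoundError, KeyError):
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = installed
    return mismatches


if __name__ == "__main__":
    # An empty dict means every pinned package is installed at the pinned version.
    print(pin_mismatches(PINS))
```

The `lookup` parameter exists only so the check can be exercised without the packages installed; by default it queries the real environment.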
12 changes: 8 additions & 4 deletions docker/dev/Dockerfile
@@ -10,20 +10,24 @@ RUN apt-get update && apt-get install -y \
libxrandr-dev \
libxinerama-dev \
pv \
gcc
gcc

RUN pip install thinc==7.4.0
RUN apt-get install --reinstall build-essential -y

RUN pip install huggingface-hub==0.0.17

RUN git clone https://github.com/usc-isi-i2/kgtk/ --branch dev

RUN cd /kgtk && python setup.py install
RUN cd /kgtk && python setup.py install

RUN conda update -n base -c defaults conda

RUN conda install -c conda-forge graph-tool

RUN conda install -c conda-forge jupyterlab

RUN pip install chardet

ARG NB_USER=jovyan
ARG NB_UID=1000
ENV USER ${NB_USER}
@@ -34,7 +38,7 @@ RUN adduser --disabled-password \
--gecos "Default user" \
--uid ${NB_UID} \
${NB_USER}

COPY . ${HOME}
USER root
RUN chown -R ${NB_UID} ${HOME}
2 changes: 2 additions & 0 deletions docker/lite/readme.md
@@ -1,5 +1,7 @@
## KGTK-lite as a Docker image

## This Docker version of KGTK is no longer maintained. Please use dev or main (in the parent folder).

This version of KGTK does not incorporate graph-tool and embeddings to be lighter.

To use this Dockerfile, you can build it yourself: