started work on dplace 3
xrotwang committed Nov 15, 2023
1 parent 9a94a80 commit e7f35e8
Showing 71 changed files with 47,462 additions and 16,978 deletions.
16 changes: 7 additions & 9 deletions .github/workflows/python-package.yml
@@ -12,24 +12,22 @@ jobs:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: [3.6, 3.7, 3.8, 3.9]
        python-version: [3.8, 3.9, "3.10", "3.11"]

    steps:
    - uses: actions/checkout@v2
    - uses: actions/checkout@v4
    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v2
      uses: actions/setup-python@v4
      with:
        python-version: ${{ matrix.python-version }}
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        sudo apt-add-repository ppa:ubuntugis/ubuntugis-unstable
        sudo apt-get update
        sudo apt-get install gdal-bin libgdal-dev
        pip install GDAL==3.2.3
        pip install .[test]
    - name: Test with pytest
      run: |
        pytest
    - name: "Convert coverage"
      run: "python -m coverage xml"
    - name: "Upload coverage to Codecov"
      uses: "codecov/codecov-action@v1"
      with:
        fail_ci_if_error: true
87 changes: 68 additions & 19 deletions README.md
@@ -1,42 +1,91 @@
# pydplace

A Python library to access [D-PLACE](https://d-place.org) data.
A Python library to curate [D-PLACE](https://d-place.org) data.

[![Build Status](https://github.com/D-PLACE/pydplace/workflows/tests/badge.svg)](https://github.com/D-PLACE/pydplace/actions?query=workflow%3Atests)
[![codecov](https://codecov.io/gh/D-PLACE/pydplace/branch/master/graph/badge.svg)](https://codecov.io/gh/D-PLACE/pydplace)
[![PyPI](https://img.shields.io/pypi/v/pydplace.svg)](https://pypi.org/project/pydplace)


To install `pydplace` you need a python installation on your system, running python 2.7 or >3.4. Run
To install `pydplace` run

```
pip install pydplace
```

to install the requirements, `pydplace` and the command line interface `dplace`.
## Usage

`pydplace` is built to access data in a local clone or export of D-PLACE's data repository https://github.com/D-PLACE/dplace-data
### Bootstrapping a `pydplace`-curated dataset

`pydplace` provides a `cldfbench` dataset template to create the skeleton of files and directories for a
D-PLACE dataset, to be run with [cldfbench new](https://github.com/cldf/cldfbench/#creating-a-skeleton-for-a-new-dataset-directory).

## CLI
Running

Command line functionality is implemented via sub-commands of `dplace`. The list of
available sub-commands can be inspected running
```shell
cldfbench new --template dplace_dataset
```
$ dplace --help
usage: deplace [-h] [--verbosity VERBOSITY] [--log-level LOG_LEVEL]
[--repos REPOS]
command ...
...

Use 'dplace help <cmd>' to get help about individual commands.
will create a dataset skeleton looking as follows
```shell
$ tree testtree/
```

## Python API

D-PLACE data can also be accessed programmatically. All functionality is mediated through an instance of `pydplace.api.Repos`, e.g.
### Implementing CLDF creation

```python
>>> from pydplace.api import Repos
>>> api = Repos('.')
Implementing CLDF creation means - as for any other `cldfbench`-curated dataset - filling in the
`cmd_makecldf` method of the `Dataset` subclass in `cldfbench_<id>.py`.


### Running CLDF creation

With `cmd_makecldf` implemented, CLDF creation can be triggered running
```shell
cldfbench makecldf cldfbench_<id>.py
```

The resulting CLDF dataset can be validated running
```shell
pytest
```


### Release workflow

```shell
cldfbench makecldf --glottolog-version v4.8 --with-cldfreadme cldfbench_<id>.py
pytest
cldfbench zenodo --communities dplace cldfbench_<id>.py
cldfbench cldfviz.map cldf --pacific-centered --format png --width 20 --output map.png --with-ocean --no-legend
cldfbench readme cldfbench_<id>.py
dplace check cldfbench_<id>.py
git commit -a -m"release vX.Y"
git push origin
```

Then create a release on GitHub, thereby pushing the repos to Zenodo.


### Using the datasets

```shell
$ csvgrep -c Var_ID -m AnnualMeanTemperature cldf/data.csv | csvstat -c Value
4. "Value"

Type of data: Number
Contains null values: False
Unique values: 1649
Smallest value: -19,45
Largest value: 29,153
Sum: 32.700,717
Mean: 16,449
Median: 19,721
StDev: 9,684
Most common values: 14,392 (9x)
21,66 (6x)
6,96 (6x)
23,335 (5x)
21,619 (5x)

Row count: 1988
```
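
The `csvkit` pipeline above can also be approximated in plain Python. The following is a minimal sketch and not part of this commit: it assumes that `cldf/data.csv` has the `Var_ID` and `Value` columns shown above, while the helper name `value_stats`, the other column names and the inline sample rows are hypothetical and only serve to keep the example self-contained.

```python
import csv
import io
import statistics


def value_stats(csv_file, var_id, id_col="Var_ID", value_col="Value"):
    """Summarise the numeric values recorded for one variable.

    `csv_file` is any iterable of CSV lines, e.g. an open file object.
    """
    values = [
        float(row[value_col])
        for row in csv.DictReader(csv_file)
        # Keep only rows for the requested variable that have a value.
        if row[id_col] == var_id and row[value_col] != ""
    ]
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


# Made-up sample rows standing in for cldf/data.csv.
SAMPLE = """ID,Soc_ID,Var_ID,Value
1,s1,AnnualMeanTemperature,10.0
2,s2,AnnualMeanTemperature,20.0
3,s3,AnnualMeanTemperature,30.0
4,s1,Elevation,500
"""

stats = value_stats(io.StringIO(SAMPLE), "AnnualMeanTemperature")
print(stats)
```

For a real dataset, the `io.StringIO(SAMPLE)` argument would be replaced by a file opened with `open("cldf/data.csv", newline="", encoding="utf-8")`.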
3 changes: 3 additions & 0 deletions pyproject.toml
@@ -0,0 +1,3 @@
[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"
89 changes: 85 additions & 4 deletions setup.cfg
@@ -1,10 +1,82 @@
[metadata]
name = pydplace
version = 3.0.0.dev0
author = Robert Forkel
author_email = [email protected]
description = A cldfbench plugin to curate D-PLACE datasets
long_description = file: README.md
long_description_content_type = text/markdown
keywords = linguistics
license = Apache 2.0
license_files = LICENSE
url = https://github.com/D-PLACE/pydplace
project_urls =
    Bug Tracker = https://github.com/D-PLACE/pydplace/issues
platforms = any
classifiers =
    Development Status :: 5 - Production/Stable
    Intended Audience :: Developers
    Intended Audience :: Science/Research
    Natural Language :: English
    Operating System :: OS Independent
    Programming Language :: Python :: 3
    Programming Language :: Python :: 3.8
    Programming Language :: Python :: 3.9
    Programming Language :: Python :: 3.10
    Programming Language :: Python :: 3.11
    Programming Language :: Python :: 3.12
    Programming Language :: Python :: Implementation :: CPython
    Programming Language :: Python :: Implementation :: PyPy
    License :: OSI Approved :: Apache Software License

[options]
zip_safe = False
packages = find:
package_dir =
    = src
python_requires = >=3.8
install_requires =
    pybtex
    attrs>=19.1
    cldfbench
    clldutils>=3.5.0
    csvw>=1.6
    pyglottolog>=3.0
    pycldf>=1.14
    fiona
    shapely
include_package_data = True

[options.packages.find]
where = src

[options.package_data]
pycldf =
    dataset_template/*
    *.json
    *.geojson

[options.entry_points]
console_scripts =
    dplace = pydplace.__main__:main
cldfbench.scaffold =
    dplace_dataset = pydplace.scaffold:DatasetTemplate

[options.extras_require]
dev =
    tox
    flake8
    wheel>=0.36
    twine
test =
    pytest>=5
    pytest-mock
    pytest-cov
    coverage>=4.2

[easy_install]
zip_ok = false

[metadata]
description-file = README.md
license_file = LICENSE

[bdist_wheel]
universal = 1

@@ -26,3 +98,12 @@ source =
[coverage:report]
show_missing = true
skip_covered = true

[tox:tox]
envlist = py38, py39, py310, py311, py312
isolated_build = true
skip_missing_interpreter = true

[testenv]
deps = .[test]
commands = pytest {posargs}
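
Note on the `[options.entry_points]` section added to `setup.cfg` above: the `cldfbench.scaffold` group registers `dplace_dataset`, which is presumably how `cldfbench new --template dplace_dataset` locates the template. A minimal sketch of how such a group can be inspected with the standard library follows. It is not code from this commit, the helper name `scaffold_entry_points` is hypothetical, and the result depends on which packages are installed in the current environment.

```python
from importlib import metadata


def scaffold_entry_points(group="cldfbench.scaffold"):
    """Return the entry points registered under `group` as a list."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points are filtered with select().
        return list(eps.select(group=group))
    # Python 3.8/3.9: entry_points() returns a dict keyed by group name.
    return list(eps.get(group, []))


for ep in scaffold_entry_points():
    # With pydplace installed, one would expect to see dplace_dataset here.
    print(ep.name, "->", ep.value)
```

Each returned object also offers `load()`, which imports and returns the object the entry point refers to.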
63 changes: 2 additions & 61 deletions setup.py
@@ -1,63 +1,4 @@
from setuptools import setup, find_packages
from setuptools import setup


setup(
    name='pydplace',
    version='2.4.1.dev0',
    license='Apache 2.0',
    description='programmatic access to D-PLACE/dplace-data',
    long_description=open('README.md').read(),
    long_description_content_type='text/markdown',
    classifiers=[
        "Programming Language :: Python",
        "Programming Language :: Python :: 3",
        "Programming Language :: Python :: 3.6",
        "Programming Language :: Python :: 3.7",
        "Programming Language :: Python :: 3.8",
        "Programming Language :: Python :: 3.9",
    ],
    author='Robert Forkel',
    author_email='[email protected]',
    url='https://d-place.org',
    keywords='data',
    packages=find_packages(where='src'),
    package_dir={'': 'src'},
    include_package_data=True,
    zip_safe=False,
    platforms='any',
    python_requires='>=3.6',
    install_requires=[
        'pybtex',
        'attrs>=19.1',
        'clldutils>=3.5.0',
        'cldfcatalog',
        'csvw>=1.6',
        'pyglottolog>=3.0',
        'python-nexus>=2.2.0',
        'pycldf>=1.14',
        # ete3 doesn't install its dependencies properly, so we have to:
        'six',
        'numpy',
        'ete3>=3.1.2',
        #'pygdal>=1.11.3.3',
        #'fiona',
        #'shapely',
    ],
    extras_require={
        'dev': ['flake8', 'wheel', 'twine'],
        'test': [
            'pytest>=5',
            'pytest-mock',
            'pytest-cov',
            'coverage>=4.2',
        ],
    },
    entry_points={
        'console_scripts': [
            'dplace=pydplace.__main__:main',
        ],
        'cldfbench.commands': [
            'dplace=pydplace.cldfbench_commands',
        ],
    },
)
setup()
5 changes: 3 additions & 2 deletions src/pydplace/__init__.py
@@ -1,5 +1,6 @@
#
from pydplace.api import Repos as API
from .dataset import DatasetWithSocieties, DatasetWithoutSocieties

DPLACE = API # provide a more specific alias for the API class
__version__ = "2.4.1.dev0"
assert DatasetWithSocieties
assert DatasetWithoutSocieties
10 changes: 1 addition & 9 deletions src/pydplace/__main__.py
@@ -1,22 +1,15 @@
import sys
import pathlib
import contextlib

from clldutils.loglib import Logging
from clldutils.clilib import get_parser_and_subparsers, register_subcommands, PathType
from clldutils.clilib import get_parser_and_subparsers, register_subcommands

import pydplace
from pydplace.api import Repos
import pydplace.commands


def main(args=None, catch_all=False, parsed_args=None, log=None):
    parser, subparsers = get_parser_and_subparsers('dplace')
    parser.add_argument(
        '--repos',
        type=PathType(type='dir'),
        default=pathlib.Path('dplace-data'),
        help='Location of clone of D_PLACE/dplace-data')
    register_subcommands(subparsers, pydplace.commands)

    args = parsed_args or parser.parse_args(args=args)
@@ -30,7 +23,6 @@ def main(args=None, catch_all=False, parsed_args=None, log=None):
            stack.enter_context(Logging(args.log, level=args.log_level))
        else:
            args.log = log
        args.repos = Repos(args.repos)
        try:
            return args.main(args) or 0
        except KeyboardInterrupt:  # pragma: no cover
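
Note on the `main` function in `src/pydplace/__main__.py` above: it dispatches to a sub-command through `args.main(args) or 0`, so a handler that returns `None` yields exit code 0. The pattern can be sketched with the standard library alone. This is an illustration under assumptions, not the `clldutils.clilib` implementation: the handler `cmd_check` and its `dataset` argument are hypothetical, and only the sub-command name `check` is taken from the README's release workflow.

```python
import argparse


def cmd_check(args):
    """Hypothetical handler; returning None signals success."""
    print("checking", args.dataset)


def main(argv=None):
    parser = argparse.ArgumentParser(prog="dplace")
    subparsers = parser.add_subparsers(dest="command", required=True)

    check = subparsers.add_parser("check", help="check a dataset")
    check.add_argument("dataset")
    # Attach the handler to the namespace, so that the dispatch below
    # needs no knowledge of individual sub-commands.
    check.set_defaults(main=cmd_check)

    args = parser.parse_args(argv)
    # Handlers may return an int exit code; None is mapped to 0.
    return args.main(args) or 0


exit_code = main(["check", "cldfbench_example.py"])
```

Storing the handler on the parsed namespace keeps `main` free of per-command branching, which is why dropping the `--repos` option in this commit only touches the shared setup code.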