Follow Google's Python style guide (#11)
Migrate PyScaffold to 4.5, set up pre-commit hooks, and change the Sphinx configuration to document private and special methods. Update docstrings, the tutorial, and text throughout.
jkanche authored Aug 22, 2023
1 parent 3cfa83e commit 5ab629b
Showing 14 changed files with 294 additions and 141 deletions.
36 changes: 26 additions & 10 deletions .github/workflows/pypi-test.yml
Original file line number Diff line number Diff line change
@@ -3,7 +3,13 @@

name: Test the library

on: [push, pull_request]
on:
push:
branches:
- main
tags:
- "*"
pull_request:

jobs:
test:
@@ -25,16 +31,30 @@ jobs:
run: |
python -m pip install --upgrade pip
pip install flake8 pytest tox cython numpy
- name: Download rds2cpp deps
run: |
cd extern/rds2cpp
cmake .
cd ../..
- name: Test with tox
run: |
python setup.py build_ext --inplace
tox
- name: Build docs
run: |
tox -e docs
touch ./docs/_build/html/.nojekyll
- name: GH Pages Deployment
if: github.event_name == 'push' && startsWith(github.ref, 'refs/tags/')
uses: JamesIves/[email protected]
with:
branch: gh-pages # The branch the action should deploy to.
folder: ./docs/_build/html
clean: true # Automatically remove deleted files from the deploy branch

build_wheels:
name: Build wheels on ${{ matrix.os }}
if: github.event_name == 'push' && startsWith(github.ref, 'refs/tags/')
runs-on: ${{ matrix.os }}
strategy:
matrix:
@@ -45,17 +65,19 @@ jobs:
with:
submodules: true

- name: Install dependencies
- name: Download dependencies
run: |
cd extern/rds2cpp
cd extern/knncolle
cmake .
cd ../..
- name: Build wheels
uses: pypa/[email protected]
env:
CIBW_ARCHS_MACOS: x86_64 arm64

CIBW_ARCHS_LINUX: x86_64 # remove this later so we build for all linux archs
CIBW_PROJECT_REQUIRES_PYTHON: ">=3.9"
CIBW_SKIP: pp*
- uses: actions/upload-artifact@v3
with:
path: ./wheelhouse/*.whl
@@ -68,12 +90,6 @@ jobs:
with:
submodules: true

- name: Install dependencies
run: |
cd extern/rds2cpp
cmake .
cd ../..
- name: Build sdist
run: pipx run build --sdist
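
The `if:` guard added to the `GH Pages Deployment` step and to the `build_wheels` job can be read as a simple predicate over the triggering event and ref. The sketch below restates it in Python for illustration only; `should_deploy` is a hypothetical name and is not part of the workflow or of any GitHub API.

```python
def should_deploy(event_name: str, ref: str) -> bool:
    """Mirror of: github.event_name == 'push' && startsWith(github.ref, 'refs/tags/')."""
    # Deploy only for push events whose ref points at a tag.
    return event_name == "push" and ref.startswith("refs/tags/")


# Example refs are made up for illustration.
print(should_deploy("push", "refs/tags/0.3.0"))          # True
print(should_deploy("push", "refs/heads/main"))          # False
print(should_deploy("pull_request", "refs/tags/0.3.0"))  # False
```

Under this reading, pushes to `main` and pull requests still run the tests and build the docs, but only a pushed tag publishes the docs or builds wheels.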

67 changes: 67 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,67 @@
exclude: '^docs/conf.py'

repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.4.0
hooks:
- id: trailing-whitespace
- id: check-added-large-files
- id: check-ast
- id: check-json
- id: check-merge-conflict
- id: check-xml
- id: check-yaml
- id: debug-statements
- id: end-of-file-fixer
- id: requirements-txt-fixer
- id: mixed-line-ending
args: ['--fix=auto'] # replace 'auto' with 'lf' to enforce Linux/Mac line endings or 'crlf' for Windows

## If you want to automatically "modernize" your Python code:
# - repo: https://github.com/asottile/pyupgrade
# rev: v3.7.0
# hooks:
# - id: pyupgrade
# args: ['--py37-plus']

## If you want to avoid flake8 errors due to unused vars or imports:
# - repo: https://github.com/PyCQA/autoflake
# rev: v2.1.1
# hooks:
# - id: autoflake
# args: [
# --in-place,
# --remove-all-unused-imports,
# --remove-unused-variables,
# ]

- repo: https://github.com/PyCQA/isort
rev: 5.12.0
hooks:
- id: isort

- repo: https://github.com/psf/black
rev: 23.7.0
hooks:
- id: black
language_version: python3

## If like to embrace black styles even in the docs:
# - repo: https://github.com/asottile/blacken-docs
# rev: v1.13.0
# hooks:
# - id: blacken-docs
# additional_dependencies: [black]

- repo: https://github.com/PyCQA/flake8
rev: 6.1.0
hooks:
- id: flake8
## You can add flake8 plugins via `additional_dependencies`:
# additional_dependencies: [flake8-bugbear]

## Check for misspells in documentation files:
# - repo: https://github.com/codespell-project/codespell
# rev: v2.2.5
# hooks:
# - id: codespell
5 changes: 4 additions & 1 deletion CHANGELOG.md
@@ -1,6 +1,9 @@
# Changelog

## Version 0.1 (development)
## Version 0.3.0 (development)

A few changes to update pyscaffold, development environment and link sphinx objects to relevant objects.
## Version 0.1

- Feature A added
- FIX: nasty bug #1729 fixed
33 changes: 15 additions & 18 deletions README.md
@@ -1,6 +1,6 @@
# rds2py

Parse and construct Python representations for datasets stored in RDS files. It supports a few base classes from R and Bioconductor's `SummarizedExperiment` and `SingleCellExperiment` S4 classes. ***This is possible because of [Aaron's rds2cpp library](https://github.com/LTLA/rds2cpp).***
Parse and construct Python representations for datasets stored in RDS files. `rds2py` supports a few base classes from R and Bioconductor's `SummarizedExperiment` and `SingleCellExperiment` S4 classes. **_This is possible because of [Aaron's rds2cpp library](https://github.com/LTLA/rds2cpp)._**

The package uses memory views (except for strings) to access the same memory from C++ in Python (through Cython of course). This is especially useful for large datasets so we don't make multiple copies of data.

@@ -14,42 +14,40 @@ pip install rds2py

## Usage

If you do not have an RDS object handy, feel free to download from [single-cell-test-files](https://github.com/jkanche/random-test-files/releases).
If you do not have an RDS object handy, feel free to download one from [single-cell-test-files](https://github.com/jkanche/random-test-files/releases).

```python
from rds2py import as_SCE, read_rds
from rds2py import as_summarized_experiment, read_rds

rObj = read_rds(<path_to_file>)
```

Once we have a realized structure of the RDS file, we can now build useful Python representations.
Once we have a dictionary representation of the RDS file, we can now build useful Python representations from these objects.

This `rObj` contains the realized structure of the RDS file as a Python `dict` object, it contains two keys
- `data`: if atomic entities, contains the numpy view of the memory space.
- `attributes`: additional properties available for the object.
This `rObj` contains two keys

The package provides friendly functions to easily convert few R representations to Python representations.
- `data`: If atomic entities, contains the numpy view of the memory space.
- `attributes`: Additional properties available for the object.

The package provides friendly functions to easily convert a few R representations to Python.

```python
from rds2py import as_spase_matrix, as_SCE
from rds2py import as_spase_matrix, as_summarized_experiment

# to convert an robject to a sparse matrix
sp_mat = as_sparse(rObj)

# to convert an robject to SCE
sce = as_SCE(rObj)
sce = as_summarized_experiment(rObj)
```

For more use cases converting `data.frame`, `dgCMatrix`, `dgRMatrix`, `dgTMatrix` to Python, checkout the [documentation](https://biocpy.github.io/rds2py/).

***If you want to add more representations, feel free to send a PR on this repository!***

For more examples converting `data.frame`, `dgCMatrix`, `dgRMatrix`, `dgTMatrix` to Python, checkout the [documentation](https://biocpy.github.io/rds2py/).

## Developer Notes

This project uses Cython to provide bindings from C++ to Python. It tries to use the same memory space (except for strings) instead of making copy of the data.
This project uses Cython to provide bindings from C++ to Python.

Steps to setup dependencies -
Steps to setup dependencies -

- git submodules is initialized in `extern/rds2cpp`
- `cmake .` in `extern/rds2cpp` directory to download dependencies, especially the `byteme` library
@@ -66,10 +64,9 @@ For typical development workflows, run
python setup.py build_ext --inplace && tox
```


<!-- pyscaffold-notes -->

## Note

This project has been set up using PyScaffold 4.3. For details and usage
This project has been set up using PyScaffold 4.5. For details and usage
information on PyScaffold see https://pyscaffold.org/.
13 changes: 12 additions & 1 deletion docs/conf.py
@@ -106,7 +106,7 @@

# General information about the project.
project = "rds2py"
copyright = "2022, jkanche"
copyright = "2023, jkanche"

# The version info for the project you're documenting, acts as replacement for
# |version| and |release|, also used in various other places throughout the
@@ -166,6 +166,15 @@
# If this is True, todo emits a warning for each TODO entries. The default is False.
todo_emit_warnings = True

autodoc_default_options = {
'special-members': True,
'undoc-members': False,
'exclude-members': '__weakref__, __dict__, __str__, __module__, __init__'
}

autosummary_generate = True
autosummary_imported_members = True


# -- Options for HTML output -------------------------------------------------

@@ -299,6 +308,8 @@
"scipy": ("https://docs.scipy.org/doc/scipy/reference", None),
"setuptools": ("https://setuptools.pypa.io/en/stable/", None),
"pyscaffold": ("https://pyscaffold.org/en/stable", None),
"singelcellexperiment": ("https://biocpy.github.io/SingleCellExperiment", None),
"summarizedexperiment": ("https://biocpy.github.io/SummarizedExperiment", None),
}

print(f"loading configurations for {project} {version} ...", file=sys.stderr)
25 changes: 12 additions & 13 deletions docs/tutorial.md
@@ -1,22 +1,22 @@
# Tutorial

If you do not have an RDS object handy, feel free to download from [single-cell-test-files](https://github.com/jkanche/random-test-files/releases).
If you do not have an RDS object handy, feel free to download one from [single-cell-test-files](https://github.com/jkanche/random-test-files/releases).

## Step 1: Read a RDS file in Python

The first step is to read an RDS file and get the equivalent representation in Python.
First we need to read the RDS file that can be easily explored in Python. The `read_rds` parses the R object and returns
a dictionary of the R object.

```python
from rds2py import read_rds

rObj = read_rds(<path_to_file>)
```

Once we have a realized structure of the RDS file, we can now start to build useful Python representations.
Once we have a realized structure, we can now convert this object to useful Python representations. It contains two keys

This `rObj` contains the realized structure of the RDS file as a Python `dict` object, it contains two keys
- `data`: if atomic entities, contains the numpy view of the memory space.
- `attributes`: additional properties available for the object.
- `data`: If atomic entities, contains the numpy view of the memory space.
- `attributes`: Additional properties available for the object.

The package provides friendly functions to convert some R representations to useful Python representations.

@@ -26,8 +26,7 @@ The package provides friendly functions to convert some R representations to use

Use these methods if the RDS file contains either a sparse matrix (`dgCMatrix`, `dgRMatrix`, or `dgTMatrix`) or a dense matrix.


***Note: If an R object contains `dims` in the `attributes`, we consider this as a matrix.***
**_Note: If an R object contains `dims` in the `attributes`, we consider this as a matrix._**

```python
from rds2py import as_spase_matrix, as_dense_matrix
@@ -52,15 +51,15 @@ df = as_pandas(rObj)

### S4 classes: specifically `SingleCellExperiment` or `SummarizedExperiment`

We also support `SingleCellExperiment` or `SummarizedExperiment` from Bioconductor. the `as_SCE` method is how we one can do this operation.
We also support `SingleCellExperiment` or `SummarizedExperiment` from Bioconductor. the `as_summarized_experiment` method is how we one can do this operation.

***Note: This method also serves as an example on how to convert complex R structures into Python representations.***
**_Note: This method also serves as an example on how to convert complex R structures into Python representations._**

```python
from rds2py import as_SCE
from rds2py import as_summarized_experiment

# to convert an robject to SCE
sp_mat = as_SCE(rObj)
sp_mat = as_summarized_experiment(rObj)
```

Well thats it, hack on & create more base representations to encapsulate complex structures. If you want to add more representations, feel free to send a PR on this repository!
Well thats it, hack on & create more base representations to encapsulate complex structures. If you want to add more representations, feel free to send a PR!
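
The tutorial text above describes the parsed object as a `dict` with `data` and `attributes` keys, and notes that an object whose `attributes` contain `dims` is treated as a matrix. A minimal sketch of that check follows, assuming the dictionary has exactly that shape. `looks_like_matrix` is a hypothetical helper, not a function from `rds2py`, and the mock objects are hand-built stand-ins that use plain lists instead of numpy views.

```python
def looks_like_matrix(robj: dict) -> bool:
    """Return True when the parsed object carries a `dims` entry in `attributes`."""
    # Tolerate a missing or empty `attributes` key.
    attributes = robj.get("attributes") or {}
    return "dims" in attributes


# Assumed shapes of what `read_rds` might return; real objects may nest differently.
mock_matrix = {"data": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0], "attributes": {"dims": [2, 3]}}
mock_vector = {"data": [1, 2, 3], "attributes": {}}

print(looks_like_matrix(mock_matrix))  # True
print(looks_like_matrix(mock_vector))  # False
```

A check like this could help decide whether a matrix conversion is worth attempting before calling one of the package's converters.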
18 changes: 18 additions & 0 deletions pyproject.toml
@@ -6,3 +6,21 @@ build-backend = "setuptools.build_meta"
[tool.setuptools_scm]
# See configuration details in https://github.com/pypa/setuptools_scm
version_scheme = "no-guess-dev"

[tool.ruff]
line-length = 100
src = ["src"]

[tool.ruff.pydocstyle]
convention = "google"

[tool.ruff.per-file-ignores]
"__init__.py" = ["E402", "F401"]

[tool.isort]
profile = "black"
known_first_party = "rds2py"
skip = ["__init__.py"]

[tool.black]
force-exclude = "__init__.py"
6 changes: 4 additions & 2 deletions setup.cfg
@@ -109,7 +109,7 @@ formats = bdist_wheel

[flake8]
# Some sane defaults for the code style checker flake8
max_line_length = 88
max_line_length = 100
extend_ignore = E203, W503
# ^ Black-compatible
# E203 and W503 have edge cases handled by black
@@ -119,11 +119,13 @@ exclude =
dist
.eggs
docs/conf.py
per-file-ignores = __init__.py:F401

[pyscaffold]
# PyScaffold's parameters when the project was created.
# This will be used when updating. Do not change!
version = 4.3
version = 4.5
package = rds2py
extensions =
markdown
pre_commit