Follow Google's Python style guide (#11)

Migrate PyScaffold to 4.5, set up pre-commit hooks, and adjust the Sphinx configuration to document private and special methods. Update docstrings, the tutorial, and text throughout.
BiocPy · Aug 22, 2023 · 5ab629b
1 parent 3cfa83e
commit 5ab629b
Showing 14 changed files with 294 additions and 141 deletions.
diff --git a/.github/workflows/pypi-test.yml b/.github/workflows/pypi-test.yml
@@ -3,7 +3,13 @@
 
 name: Test the library
 
-on: [push, pull_request]
+on:
+  push:
+    branches:
+      - main
+    tags:
+      - "*"
+  pull_request:
 
 jobs:
   test:
@@ -25,16 +31,30 @@ jobs:
         run: |
           python -m pip install --upgrade pip
           pip install flake8 pytest tox cython numpy
+      - name: Download rds2cpp deps
+        run: |
           cd extern/rds2cpp
           cmake .
           cd ../..
       - name: Test with tox
         run: |
           python setup.py build_ext --inplace
           tox
+      - name: Build docs
+        run: |
+          tox -e docs
+          touch ./docs/_build/html/.nojekyll
+      - name: GH Pages Deployment
+        if: github.event_name == 'push' && startsWith(github.ref, 'refs/tags/')
+        uses: JamesIves/[email protected]
+        with:
+          branch: gh-pages # The branch the action should deploy to.
+          folder: ./docs/_build/html
+          clean: true # Automatically remove deleted files from the deploy branch
 
   build_wheels:
     name: Build wheels on ${{ matrix.os }}
+    if: github.event_name == 'push' && startsWith(github.ref, 'refs/tags/')
     runs-on: ${{ matrix.os }}
     strategy:
       matrix:
@@ -45,17 +65,19 @@ jobs:
         with:
           submodules: true
 
-      - name: Install dependencies
+      - name: Download dependencies
         run: |
-          cd extern/rds2cpp
+          cd extern/knncolle
           cmake .
           cd ../..
 
       - name: Build wheels
         uses: pypa/[email protected]
         env:
           CIBW_ARCHS_MACOS: x86_64 arm64
-
+          CIBW_ARCHS_LINUX: x86_64 # remove this later so we build for all linux archs
+          CIBW_PROJECT_REQUIRES_PYTHON: ">=3.9"
+          CIBW_SKIP: pp*
       - uses: actions/upload-artifact@v3
         with:
           path: ./wheelhouse/*.whl
@@ -68,12 +90,6 @@ jobs:
         with:
           submodules: true
 
-      - name: Install dependencies
-        run: |
-          cd extern/rds2cpp
-          cmake .
-          cd ../..
-
       - name: Build sdist
         run: pipx run build --sdist
 

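The GH Pages deployment step and the `build_wheels` job in the workflow above share one condition: the event must be a push, and the ref must start with `refs/tags/`. As a rough illustration only, the same predicate can be sketched in Python. `is_tag_push` is a hypothetical helper name, not part of GitHub Actions or this repository, and the tag name `0.3.0` is made up for the example. GitHub's `startsWith` ignores case, whereas this sketch compares case-sensitively for simplicity.

```python
def is_tag_push(event_name: str, ref: str) -> bool:
    """Approximate: github.event_name == 'push' && startsWith(github.ref, 'refs/tags/')."""
    return event_name == "push" and ref.startswith("refs/tags/")


# A push to main runs the tests but skips deployment and wheel builds.
print(is_tag_push("push", "refs/heads/main"))  # False
# A pushed tag (hypothetical name) additionally triggers deployment and wheel builds.
print(is_tag_push("push", "refs/tags/0.3.0"))  # True
# A pull request never triggers them.
print(is_tag_push("pull_request", "refs/pull/11/merge"))  # False
```
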
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -0,0 +1,67 @@
+exclude: '^docs/conf.py'
+
+repos:
+- repo: https://github.com/pre-commit/pre-commit-hooks
+  rev: v4.4.0
+  hooks:
+  - id: trailing-whitespace
+  - id: check-added-large-files
+  - id: check-ast
+  - id: check-json
+  - id: check-merge-conflict
+  - id: check-xml
+  - id: check-yaml
+  - id: debug-statements
+  - id: end-of-file-fixer
+  - id: requirements-txt-fixer
+  - id: mixed-line-ending
+    args: ['--fix=auto']  # replace 'auto' with 'lf' to enforce Linux/Mac line endings or 'crlf' for Windows
+
+## If you want to automatically "modernize" your Python code:
+# - repo: https://github.com/asottile/pyupgrade
+#   rev: v3.7.0
+#   hooks:
+#   - id: pyupgrade
+#     args: ['--py37-plus']
+
+## If you want to avoid flake8 errors due to unused vars or imports:
+# - repo: https://github.com/PyCQA/autoflake
+#   rev: v2.1.1
+#   hooks:
+#   - id: autoflake
+#     args: [
+#       --in-place,
+#       --remove-all-unused-imports,
+#       --remove-unused-variables,
+#     ]
+
+- repo: https://github.com/PyCQA/isort
+  rev: 5.12.0
+  hooks:
+  - id: isort
+
+- repo: https://github.com/psf/black
+  rev: 23.7.0
+  hooks:
+  - id: black
+    language_version: python3
+
+## If like to embrace black styles even in the docs:
+# - repo: https://github.com/asottile/blacken-docs
+#   rev: v1.13.0
+#   hooks:
+#   - id: blacken-docs
+#     additional_dependencies: [black]
+
+- repo: https://github.com/PyCQA/flake8
+  rev: 6.1.0
+  hooks:
+  - id: flake8
+  ## You can add flake8 plugins via `additional_dependencies`:
+  #  additional_dependencies: [flake8-bugbear]
+
+## Check for misspells in documentation files:
+# - repo: https://github.com/codespell-project/codespell
+#   rev: v2.2.5
+#   hooks:
+#   - id: codespell
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -1,6 +1,9 @@
 # Changelog
 
-## Version 0.1 (development)
+## Version 0.3.0 (development)
+
+A few changes to update pyscaffold, development environment and link sphinx objects to relevant objects.
+## Version 0.1
 
 - Feature A added
 - FIX: nasty bug #1729 fixed

diff --git a/README.md b/README.md
@@ -1,6 +1,6 @@
 # rds2py
 
-Parse and construct Python representations for datasets stored in RDS files. It supports a few base classes from R and Bioconductor's `SummarizedExperiment` and `SingleCellExperiment` S4 classes. ***This is possible because of [Aaron's rds2cpp library](https://github.com/LTLA/rds2cpp).***
+Parse and construct Python representations for datasets stored in RDS files. `rds2py` supports a few base classes from R and Bioconductor's `SummarizedExperiment` and `SingleCellExperiment` S4 classes. **_This is possible because of [Aaron's rds2cpp library](https://github.com/LTLA/rds2cpp)._**
 
 The package uses memory views (except for strings) to access the same memory from C++ in Python (through Cython of course). This is especially useful for large datasets so we don't make multiple copies of data.
 
@@ -14,42 +14,40 @@ pip install rds2py
 
 ## Usage
 
-If you do not have an RDS object handy, feel free to download from [single-cell-test-files](https://github.com/jkanche/random-test-files/releases).
+If you do not have an RDS object handy, feel free to download one from [single-cell-test-files](https://github.com/jkanche/random-test-files/releases).
 
 ```python
-from rds2py import as_SCE, read_rds
+from rds2py import as_summarized_experiment, read_rds
 
 rObj = read_rds(<path_to_file>)
 ```
 
-Once we have a realized structure of the RDS file, we can now build useful Python representations.
+Once we have a dictionary representation of the RDS file, we can now build useful Python representations from these objects.
 
-This `rObj` contains the realized structure of the RDS file as a Python `dict` object, it contains two keys 
-- `data`: if atomic entities, contains the numpy view of the memory space.
-- `attributes`: additional properties available for the object. 
+This `rObj` contains two keys
 
-The package provides friendly functions to easily convert few R representations to Python representations.
+- `data`: If atomic entities, contains the numpy view of the memory space.
+- `attributes`: Additional properties available for the object.
+
+The package provides friendly functions to easily convert a few R representations to Python.
 
 ```python
-from rds2py import as_spase_matrix, as_SCE
+from rds2py import as_spase_matrix, as_summarized_experiment
 
 # to convert an robject to a sparse matrix
 sp_mat = as_sparse(rObj)
 
 # to convert an robject to SCE
-sce = as_SCE(rObj)
+sce = as_summarized_experiment(rObj)
 ```
 
-For more use cases converting `data.frame`, `dgCMatrix`, `dgRMatrix`, `dgTMatrix` to Python, checkout the [documentation](https://biocpy.github.io/rds2py/).
-
-***If you want to add more representations, feel free to send a PR on this repository!***
-
+For more examples converting `data.frame`, `dgCMatrix`, `dgRMatrix`, `dgTMatrix` to Python, checkout the [documentation](https://biocpy.github.io/rds2py/).
 
 ## Developer Notes
 
-This project uses Cython to provide bindings from C++ to Python. It tries to use the same memory space (except for strings) instead of making copy of the data.
+This project uses Cython to provide bindings from C++ to Python.
 
-Steps to setup dependencies - 
+Steps to setup dependencies -
 
 - git submodules is initialized in `extern/rds2cpp`
 - `cmake .` in `extern/rds2cpp` directory to download dependencies, especially the `byteme` library
@@ -66,10 +64,9 @@ For typical development workflows, run
 python setup.py build_ext --inplace && tox
 ```
 
-
 <!-- pyscaffold-notes -->
 
 ## Note
 
-This project has been set up using PyScaffold 4.3. For details and usage
+This project has been set up using PyScaffold 4.5. For details and usage
 information on PyScaffold see https://pyscaffold.org/.
diff --git a/docs/conf.py b/docs/conf.py
@@ -106,7 +106,7 @@
 
 # General information about the project.
 project = "rds2py"
-copyright = "2022, jkanche"
+copyright = "2023, jkanche"
 
 # The version info for the project you're documenting, acts as replacement for
 # |version| and |release|, also used in various other places throughout the
@@ -166,6 +166,15 @@
 # If this is True, todo emits a warning for each TODO entries. The default is False.
 todo_emit_warnings = True
 
+autodoc_default_options = {
+    'special-members': True,
+    'undoc-members': False,
+    'exclude-members': '__weakref__, __dict__, __str__, __module__, __init__'
+}
+
+autosummary_generate = True
+autosummary_imported_members = True
+
 
 # -- Options for HTML output -------------------------------------------------
 
@@ -299,6 +308,8 @@
     "scipy": ("https://docs.scipy.org/doc/scipy/reference", None),
     "setuptools": ("https://setuptools.pypa.io/en/stable/", None),
     "pyscaffold": ("https://pyscaffold.org/en/stable", None),
+    "singelcellexperiment": ("https://biocpy.github.io/SingleCellExperiment", None),
+    "summarizedexperiment": ("https://biocpy.github.io/SummarizedExperiment", None),
 }
 
 print(f"loading configurations for {project} {version} ...", file=sys.stderr)
diff --git a/docs/tutorial.md b/docs/tutorial.md
@@ -1,22 +1,22 @@
 # Tutorial
 
-If you do not have an RDS object handy, feel free to download from [single-cell-test-files](https://github.com/jkanche/random-test-files/releases).
+If you do not have an RDS object handy, feel free to download one from [single-cell-test-files](https://github.com/jkanche/random-test-files/releases).
 
 ## Step 1: Read a RDS file in Python
 
-The first step is to read an RDS file and get the equivalent representation in Python.
+First we need to read the RDS file that can be easily explored in Python. The `read_rds` parses the R object and returns
+a dictionary of the R object.
 
 ```python
 from rds2py import read_rds
 
 rObj = read_rds(<path_to_file>)
 ```
 
-Once we have a realized structure of the RDS file, we can now start to build useful Python representations.
+Once we have a realized structure, we can now convert this object to useful Python representations. It contains two keys
 
-This `rObj` contains the realized structure of the RDS file as a Python `dict` object, it contains two keys 
-- `data`: if atomic entities, contains the numpy view of the memory space.
-- `attributes`: additional properties available for the object. 
+- `data`: If atomic entities, contains the numpy view of the memory space.
+- `attributes`: Additional properties available for the object.
 
 The package provides friendly functions to convert some R representations to useful Python representations.
 
@@ -26,8 +26,7 @@ The package provides friendly functions to convert some R representations to use
 
 Use these methods if the RDS file contains either a sparse matrix (`dgCMatrix`, `dgRMatrix`, or `dgTMatrix`) or a dense matrix.
 
-
-***Note: If an R object contains `dims` in the `attributes`, we consider this as a matrix.***
+**_Note: If an R object contains `dims` in the `attributes`, we consider this as a matrix._**
 
 ```python
 from rds2py import as_spase_matrix, as_dense_matrix
@@ -52,15 +51,15 @@ df = as_pandas(rObj)
 
 ### S4 classes: specifically `SingleCellExperiment` or `SummarizedExperiment`
 
-We also support `SingleCellExperiment` or `SummarizedExperiment` from Bioconductor. the `as_SCE` method is how we one can do this operation. 
+We also support `SingleCellExperiment` or `SummarizedExperiment` from Bioconductor. The `as_summarized_experiment` method is how one can do this operation.
 
-***Note: This method also serves as an example on how to convert complex R structures into Python representations.***
+**_Note: This method also serves as an example on how to convert complex R structures into Python representations._**
 
 ```python
-from rds2py import as_SCE
+from rds2py import as_summarized_experiment
 
 # to convert an robject to SCE
-sp_mat = as_SCE(rObj)
+sp_mat = as_summarized_experiment(rObj)
 ```
 
-Well thats it, hack on & create more base representations to encapsulate complex structures. If you want to add more representations, feel free to send a PR on this repository!
+Well, that's it; hack on and create more base representations to encapsulate complex structures. If you want to add more representations, feel free to send a PR!
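
The tutorial text in this diff describes the parsed object as a dictionary with `data` and `attributes` keys, and notes that an object carrying `dims` in its `attributes` is treated as a matrix. Below is a minimal sketch of that rule, assuming hand-built stand-in dictionaries rather than real `read_rds` output. `looks_like_matrix` is a hypothetical helper, not an `rds2py` function, and the exact layout of `attributes` in real output may differ.

```python
def looks_like_matrix(robj: dict) -> bool:
    """Return True when the parsed object carries `dims` in its attributes."""
    attributes = robj.get("attributes") or {}
    return "dims" in attributes


# Hand-built stand-ins for parsed objects (assumed shape, for illustration only).
vector_like = {"data": [1.0, 2.0, 3.0], "attributes": {}}
matrix_like = {"data": [1.0, 2.0, 3.0, 4.0], "attributes": {"dims": [2, 2]}}

print(looks_like_matrix(vector_like))  # False
print(looks_like_matrix(matrix_like))  # True
```

A check like this could guard a call to one of the matrix converters, so that non-matrix objects are routed elsewhere instead of failing during conversion.
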
diff --git a/pyproject.toml b/pyproject.toml
@@ -6,3 +6,21 @@ build-backend = "setuptools.build_meta"
 [tool.setuptools_scm]
 # See configuration details in https://github.com/pypa/setuptools_scm
 version_scheme = "no-guess-dev"
+
+[tool.ruff]
+line-length = 100
+src = ["src"]
+
+[tool.ruff.pydocstyle]
+convention = "google"
+
+[tool.ruff.per-file-ignores]
+"__init__.py" = ["E402", "F401"]
+
+[tool.isort]
+profile = "black"
+known_first_party = "rds2py"
+skip = ["__init__.py"]
+
+[tool.black]
+force-exclude = "__init__.py"
diff --git a/setup.cfg b/setup.cfg
@@ -109,7 +109,7 @@ formats = bdist_wheel
 
 [flake8]
 # Some sane defaults for the code style checker flake8
-max_line_length = 88
+max_line_length = 100
 extend_ignore = E203, W503
 # ^  Black-compatible
 #    E203 and W503 have edge cases handled by black
@@ -119,11 +119,13 @@ exclude =
     dist
     .eggs
     docs/conf.py
+per-file-ignores = __init__.py:F401
 
 [pyscaffold]
 # PyScaffold's parameters when the project was created.
 # This will be used when updating. Do not change!
-version = 4.3
+version = 4.5
 package = rds2py
 extensions =
     markdown
+    pre_commit