Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update CI and switch to python 3.11 #57

Merged
merged 6 commits into from
Mar 14, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
34 changes: 19 additions & 15 deletions .github/workflows/build.yml
Original file line number Diff line number Diff line change
Expand Up @@ -6,29 +6,33 @@ on:
branches:
- main
pull_request:

schedule:
# Run every Monday at 18:00 UTC (middle of US working hours) to ensure re-running
# tests with the latest dependencies.
# * is a special character in YAML so you have to quote this string
- cron: "0 18 * * 1"

jobs:
test:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ["3.10"]
python-version: ["3.11"]

steps:
- uses: actions/checkout@v2
- uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
- uses: actions/checkout@v2
- uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}

- uses: pre-commit/[email protected]
- uses: pre-commit/[email protected]

- uses: google-github-actions/setup-gcloud@v0
- name: Start GCP Datastore emulator
# NOTE: this emulator is used by the tests below
run: |
gcloud components install --quiet beta cloud-datastore-emulator
gcloud beta emulators datastore start --project foobar --host-port 127.0.0.1:8099 --consistency=1.0 --no-store-on-disk &
- uses: google-github-actions/setup-gcloud@v2
- name: Start GCP Datastore emulator
# NOTE: this emulator is used by the tests below
run: |
gcloud components install --quiet beta cloud-datastore-emulator
gcloud beta emulators datastore start --project foobar --host-port 127.0.0.1:8099 --consistency=1.0 --no-store-on-disk &

- run: pip install -r requirements.txt -r requirements-dev.txt
- run: pytest --color=yes --datastore_emulated
- run: pip install -r requirements.txt -r requirements-dev.txt
- run: pytest --color=yes --datastore_emulated
9 changes: 5 additions & 4 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
default_language_version:
python: python3.10
python: python3.11
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.4.0
Expand All @@ -23,7 +23,8 @@ repos:
rev: v0.991
hooks:
- id: mypy
args: ["--strict", "--show-error-codes", "--pretty", "--show-error-context"]
args:
["--strict", "--show-error-codes", "--pretty", "--show-error-context"]
additional_dependencies:
- pandas==1.5.2
- pydantic==1.10.4
- pandas==1.5.2
- pydantic==1.10.4
10 changes: 5 additions & 5 deletions articat/config.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,9 +4,9 @@
import warnings
from collections.abc import Mapping, Sequence
from configparser import ConfigParser
from enum import Enum
from enum import StrEnum
from pathlib import Path
from typing import TYPE_CHECKING, Any, Type
from typing import TYPE_CHECKING, Any

from articat.utils.class_or_instance_method import class_or_instance_method

Expand All @@ -16,7 +16,7 @@
from articat.catalog import Catalog


class ArticatMode(str, Enum):
class ArticatMode(StrEnum):
local = "local"
gcp_datastore = "gcp_datastore"
test = "test"
Expand Down Expand Up @@ -63,7 +63,7 @@ def register_config(
cls,
config_paths: Sequence[str] | None = None,
config_dict: Mapping[str, Mapping[str, Any]] = {},
) -> "Type[ArticatConfig]":
) -> "type[ArticatConfig]":
"""
Register configuration from config paths and config dictionary. Config
paths are read in order, `config_dict` is applied after config paths.
Expand Down Expand Up @@ -96,7 +96,7 @@ def mode(self) -> ArticatMode:
return ArticatMode(self._config.get("main", "mode", fallback=ArticatMode.local))

@class_or_instance_method
def catalog(self) -> "Type[Catalog]":
def catalog(self) -> "type[Catalog]":
"""Returns the Catalog implementation for given mode"""
if self.mode() == ArticatMode.local:
from articat.catalog_local import CatalogLocal
Expand Down
7 changes: 3 additions & 4 deletions articat/tests/utils_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -44,14 +44,13 @@ def test_download__empty():
download_artifact(a, dst)


def test_download__weird_valid_pattern():
def test_download__invalid_double_star():
src = get_source_path_that_looks_like_path_from_catalog()
src.joinpath("part-1").write_text("sth-2")
a = TestFSArtifact(id="sth", files_pattern=f"{src}/**part-1")
dst = Path(tempfile.mktemp())
r = download_artifact(a, dst)
assert dst.joinpath("part-1").read_text() == "sth-2"
assert r == f"{dst}/**part-1"
with pytest.raises(ValueError, match="Invalid pattern"):
download_artifact(a, dst)


def test_download__weird_invalid_pattern():
Expand Down
4 changes: 3 additions & 1 deletion articat/utils/utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -89,7 +89,9 @@ def download_artifact(artifact: FSArtifact, local_dir: PathType) -> str:
prefix = artifact.main_dir
# Note: glob results don't have fs scheme
prefix_no_scheme = re.sub(FSArtifact._fs_scheme_regex, "", prefix)
to_copy = src_fs.glob(artifact.files_pattern)
# Root directory can be included in glob results if a trailing ** is used (which it often is)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I remember this being inconsistent across FSs, and I remember disliking this logic a lot, is there any chance to simplify the current implementation now given the (assumption!) more consistent glob results?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mm, I gave it another read-through, and I don't see too much wrong with it. If anything, I guess I could see using the more consistent glob behavior to change what glob patterns were used upstream, but that sounds like potentially a complicated effort to track down callsites.. and moreover, I don't think it would rank high on the list of company objectives 😉

# Don't include it in the list of files to copy, since we create it above when creating local_dir
to_copy = [f for f in src_fs.glob(artifact.files_pattern) if f != prefix]
if len(to_copy) == 0:
raise ValueError(f"Nothing to copy in `{artifact.files_pattern}`")
for f in to_copy:
Expand Down
Loading