chore: dependency bumps, release commit for 0.16.12 #3831

Merged: 4 commits, Jan 5, 2025
6 changes: 3 additions & 3 deletions CHANGELOG.md
@@ -1,15 +1,15 @@
-## 0.16.12-dev5
+## 0.16.12

 ### Enhancements

 - **Prepare auto-partitioning for pluggable partitioners**. Move toward a uniform partitioner call signature so a custom or override partitioner can be registered without code changes.
-- **Add NDJSON file type support**
+- **Add NDJSON file type support.**

 ### Features

 ### Fixes

-- Base image has been updated, trigger new workflows
+- **Base image has been updated.**
 - **Upgrade ruff to latest.** Previously the ruff version was pinned to <0.5. Remove that pin and fix the handful of lint items that resulted.
 - **CSV with asserted XLS content-type is correctly identified as CSV.** Resolves a bug where a CSV file with an asserted content-type of `application/vnd.ms-excel` was incorrectly identified as an XLS file.
 - **Improve element-type mapping for Chinese text.** Fixes bug where Chinese text would produce large numbers of false-positive `Title` elements.
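Reviewer note on the CSV/XLS fix listed in the changelog above: the idea of trusting file content over an asserted content-type can be sketched as follows. This is a minimal illustration only, not the library's detection code; `resolve_file_type`, `OLE2_MAGIC`, and the comma heuristic are hypothetical names and assumptions introduced here. The one external fact relied on is that legacy `.xls` files are OLE2 compound files, which begin with the signature bytes `D0 CF 11 E0 A1 B1 1A E1`.

```python
# Hypothetical sketch; NOT the actual detection logic shipped in this release.

# Signature at the start of every OLE2 compound file (legacy .xls, .doc, .ppt).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def resolve_file_type(asserted_mime: str, head: bytes) -> str:
    """Pick a file type from the first bytes, using the asserted MIME type only as a hint."""
    # Binary signature wins regardless of what the caller asserted.
    if head.startswith(OLE2_MAGIC):
        return "xls"
    # Asserted as Excel, but the bytes look like delimited text: treat as CSV.
    # (Assumption: "no NUL bytes and at least one comma" is a good-enough text test.)
    if asserted_mime == "application/vnd.ms-excel" and b"\x00" not in head and b"," in head:
        return "csv"
    return "unknown"
```

For example, `resolve_file_type("application/vnd.ms-excel", b"a,b\n1,2\n")` falls through the signature check and is classified as CSV, which is the behavior the changelog entry describes.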
8 changes: 4 additions & 4 deletions requirements/base.txt
@@ -20,11 +20,11 @@ cffi==1.17.1
 # via cryptography
 chardet==5.2.0
 # via -r ./base.in
-charset-normalizer==3.4.0
+charset-normalizer==3.4.1
 # via
 # requests
 # unstructured-client
-click==8.1.7
+click==8.1.8
 # via
 # nltk
 # python-oxmsg
@@ -64,7 +64,7 @@ langdetect==1.0.9
 # via -r ./base.in
 lxml==5.3.0
 # via -r ./base.in
-marshmallow==3.23.1
+marshmallow==3.23.2
 # via
 # dataclasses-json
 # unstructured-client
@@ -88,7 +88,7 @@ packaging==24.2
 # via
 # marshmallow
 # unstructured-client
-psutil==6.1.0
+psutil==6.1.1
 # via -r ./base.in
 pycparser==2.22
 # via cffi
6 changes: 3 additions & 3 deletions requirements/dev.txt
@@ -8,7 +8,7 @@ build==1.2.2.post1
 # via pip-tools
 cfgv==3.4.0
 # via pre-commit
-click==8.1.7
+click==8.1.8
 # via
 # -c ./base.txt
 # -c ./test.txt
@@ -17,7 +17,7 @@ distlib==0.3.9
 # via virtualenv
 filelock==3.16.1
 # via virtualenv
-identify==2.6.3
+identify==2.6.4
 # via pre-commit
 importlib-metadata==8.5.0
 # via
@@ -51,7 +51,7 @@ tomli==2.2.1
 # -c ./test.txt
 # build
 # pip-tools
-virtualenv==20.28.0
+virtualenv==20.28.1
 # via pre-commit
 wheel==0.45.1
 # via pip-tools
8 changes: 4 additions & 4 deletions requirements/extra-paddleocr.txt
@@ -16,7 +16,7 @@ certifi==2024.12.14
 # httpcore
 # httpx
 # requests
-charset-normalizer==3.4.0
+charset-normalizer==3.4.1
 # via
 # -c ./base.txt
 # requests
@@ -58,7 +58,7 @@ imageio==2.36.1
 # scikit-image
 imgaug==0.4.0
 # via unstructured-paddleocr
-importlib-resources==6.4.5
+importlib-resources==6.5.1
 # via matplotlib
 kiwisolver==1.4.7
 # via matplotlib
@@ -104,7 +104,7 @@ paddlepaddle==3.0.0b1
 # via -r ./extra-paddleocr.in
 pdf2image==1.17.0
 # via unstructured-paddleocr
-pillow==11.0.0
+pillow==11.1.0
 # via
 # imageio
 # imgaug
@@ -119,7 +119,7 @@ protobuf==4.25.5
 # paddlepaddle
 pyclipper==1.3.0.post6
 # via unstructured-paddleocr
-pyparsing==3.2.0
+pyparsing==3.2.1
 # via matplotlib
 python-dateutil==2.9.0.post0
 # via
24 changes: 12 additions & 12 deletions requirements/extra-pdf-image.txt
@@ -16,7 +16,7 @@ cffi==1.17.1
 # via
 # -c ./base.txt
 # cryptography
-charset-normalizer==3.4.0
+charset-normalizer==3.4.1
 # via
 # -c ./base.txt
 # pdfminer-six
@@ -40,11 +40,11 @@ filelock==3.16.1
 # huggingface-hub
 # torch
 # transformers
-flatbuffers==24.3.25
+flatbuffers==24.12.23
 # via onnxruntime
 fonttools==4.55.3
 # via matplotlib
-fsspec==2024.10.0
+fsspec==2024.12.0
 # via
 # huggingface-hub
 # torch
@@ -79,11 +79,11 @@ idna==3.10
 # via
 # -c ./base.txt
 # requests
-importlib-resources==6.4.5
+importlib-resources==6.5.1
 # via matplotlib
 iopath==0.1.10
 # via layoutparser
-jinja2==3.1.4
+jinja2==3.1.5
 # via torch
 kiwisolver==1.4.7
 # via matplotlib
@@ -149,13 +149,13 @@ pdfminer-six==20231228
 # via
 # -r ./extra-pdf-image.in
 # pdfplumber
-pdfplumber==0.11.4
+pdfplumber==0.11.5
 # via layoutparser
 pi-heif==0.21.0
 # via -r ./extra-pdf-image.in
-pikepdf==9.4.2
+pikepdf==9.5.0
 # via -r ./extra-pdf-image.in
-pillow==11.0.0
+pillow==11.1.0
 # via
 # layoutparser
 # matplotlib
@@ -165,7 +165,7 @@ pillow==11.0.0
 # pikepdf
 # torchvision
 # unstructured-pytesseract
-portalocker==3.0.0
+portalocker==3.1.1
 # via iopath
 proto-plus==1.25.0
 # via
@@ -193,13 +193,13 @@ pycparser==2.22
 # via
 # -c ./base.txt
 # cffi
-pyparsing==3.2.0
+pyparsing==3.2.1
 # via matplotlib
 pypdf==5.1.0
 # via
 # -c ./base.txt
 # -r ./extra-pdf-image.in
-pypdfium2==4.30.0
+pypdfium2==4.30.1
 # via pdfplumber
 python-dateutil==2.9.0.post0
 # via
@@ -233,7 +233,7 @@ requests==2.32.3
 # transformers
 rsa==4.9
 # via google-auth
-safetensors==0.4.5
+safetensors==0.5.0
 # via
 # timm
 # transformers
2 changes: 1 addition & 1 deletion requirements/extra-pptx.txt
@@ -6,7 +6,7 @@
 #
 lxml==5.3.0
 # via python-pptx
-pillow==11.0.0
+pillow==11.1.0
 # via python-pptx
 python-pptx==1.0.2
 # via -r ./extra-pptx.in
10 changes: 5 additions & 5 deletions requirements/huggingface.txt
@@ -8,11 +8,11 @@ certifi==2024.12.14
 # via
 # -c ./base.txt
 # requests
-charset-normalizer==3.4.0
+charset-normalizer==3.4.1
 # via
 # -c ./base.txt
 # requests
-click==8.1.7
+click==8.1.8
 # via
 # -c ./base.txt
 # sacremoses
@@ -21,7 +21,7 @@ filelock==3.16.1
 # huggingface-hub
 # torch
 # transformers
-fsspec==2024.10.0
+fsspec==2024.12.0
 # via
 # huggingface-hub
 # torch
@@ -33,7 +33,7 @@ idna==3.10
 # via
 # -c ./base.txt
 # requests
-jinja2==3.1.4
+jinja2==3.1.5
 # via torch
 joblib==1.4.2
 # via
@@ -74,7 +74,7 @@ requests==2.32.3
 # transformers
 sacremoses==0.1.1
 # via -r ./huggingface.in
-safetensors==0.4.5
+safetensors==0.5.0
 # via transformers
 sentencepiece==0.2.0
 # via -r ./huggingface.in
22 changes: 11 additions & 11 deletions requirements/test.txt
@@ -12,7 +12,7 @@ anyio==4.7.0
 # httpx
 appdirs==1.4.4
 # via label-studio-sdk
-argcomplete==3.5.2
+argcomplete==3.5.3
 # via datamodel-code-generator
 attrs==24.3.0
 # via
@@ -30,16 +30,16 @@ certifi==2024.12.14
 # httpcore
 # httpx
 # requests
-charset-normalizer==3.4.0
+charset-normalizer==3.4.1
 # via
 # -c ./base.txt
 # requests
-click==8.1.7
+click==8.1.8
 # via
 # -c ./base.txt
 # black
 # nltk
-coverage[toml]==7.6.9
+coverage[toml]==7.6.10
 # via
 # -r ./test.in
 # pytest-cov
@@ -98,7 +98,7 @@ iniconfig==2.0.0
 # via pytest
 isort==5.13.2
 # via datamodel-code-generator
-jinja2==3.1.4
+jinja2==3.1.5
 # via datamodel-code-generator
 joblib==1.4.2
 # via
@@ -126,7 +126,7 @@ mccabe==0.7.0
 # via flake8
 multidict==6.1.0
 # via yarl
-mypy==1.13.0
+mypy==1.14.1
 # via -r ./test.in
 mypy-extensions==1.0.0
 # via
@@ -152,7 +152,7 @@ pandas==2.2.3
 # via label-studio-sdk
 pathspec==0.12.1
 # via black
-pillow==11.0.0
+pillow==11.1.0
 # via label-studio-sdk
 platformdirs==4.3.6
 # via black
@@ -164,13 +164,13 @@ pycodestyle==2.12.1
 # via
 # flake8
 # flake8-print
-pydantic[email]==2.10.3
+pydantic[email]==2.10.4
 # via
 # -r ./test.in
 # datamodel-code-generator
 # jsf
 # label-studio-sdk
-pydantic-core==2.27.1
+pydantic-core==2.27.2
 # via pydantic
 pyflakes==3.2.0
 # via
@@ -218,7 +218,7 @@ rpds-py==0.22.3
 # referencing
 rstr==3.2.2
 # via jsf
-ruff==0.8.3
+ruff==0.8.5
 # via -r ./test.in
 semantic-version==2.10.0
 # via liccheck
@@ -279,7 +279,7 @@ urllib3==1.26.20
 # -c ./base.txt
 # requests
 # vcrpy
-vcrpy==6.0.2
+vcrpy==7.0.0
 # via -r ./test.in
 wrapt==1.17.0
 # via
@@ -26,7 +26,7 @@
 },
 "date_created": "1689435368.0",
 "date_modified": "1689435537.0",
-"filesize_bytes": 9189
+"filesize_bytes": 9179
 }
 }
 }
@@ -26,7 +26,7 @@
 },
 "date_created": "1690248382.0",
 "date_modified": "1690248401.0",
-"filesize_bytes": 9207
+"filesize_bytes": 9197
 }
 }
 }
@@ -26,7 +26,7 @@
 },
 "date_created": "1688960344.0",
 "date_modified": "1689460572.0",
-"filesize_bytes": 9254
+"filesize_bytes": 9244
 }
 }
 }
2 changes: 1 addition & 1 deletion unstructured/__version__.py
@@ -1 +1 @@
-__version__ = "0.16.12-dev5" # pragma: no cover
+__version__ = "0.16.12" # pragma: no cover