Azure Blob Storage: Fix unstructured format (airbytehq#34084)
Joe Reuter authored Jan 10, 2024
1 parent bbdd6d8 commit e728128
Showing 4 changed files with 5 additions and 7 deletions.
@@ -86,6 +86,8 @@ acceptance_tests:
         status: succeed
       - config_path: secrets/jsonl_newlines_config.json
         status: succeed
+      - config_path: secrets/unstructured_config.json
+        status: succeed
   discovery:
     tests:
       - config_path: secrets/config.json
@@ -7,7 +7,7 @@ data:
   connectorSubtype: file
   connectorType: source
   definitionId: fdaaba68-4875-4ed9-8fcd-4ae1e0a25093
-  dockerImageTag: 0.3.0
+  dockerImageTag: 0.3.1
   dockerRepository: airbyte/source-azure-blob-storage
   documentationUrl: https://docs.airbyte.com/integrations/sources/azure-blob-storage
   githubIssueLabel: source-azure-blob-storage
@@ -59,7 +59,6 @@ def get_matching_files(
             if not globs or self.file_matches_globs(remote_file, globs):
                 yield remote_file
 
-    @contextmanager
     def open_file(self, file: RemoteFile, mode: FileReadMode, encoding: Optional[str], logger: logging.Logger) -> IOBase:
         try:
             result = open(
@@ -73,8 +72,4 @@ def open_file(self, file: RemoteFile, mode: FileReadMode, encoding: Optional[str
                 f"We don't have access to {file.uri}. The file appears to have become unreachable during sync."
                 f"Check whether key {file.uri} exists in `{self.config.azure_blob_storage_container_name}` container and/or has proper ACL permissions"
             )
-        # see https://docs.python.org/3/library/contextlib.html#contextlib.contextmanager for why we do this
-        try:
-            yield result
-        finally:
-            result.close()
         return result
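
The commit does not state why the decorator was dropped. One plausible reading, given the `-> IOBase` annotation, is that callers of `open_file` expect a file-like object, whereas a function decorated with `@contextmanager` returns a context-manager wrapper that only yields the file inside a `with` block. The sketch below illustrates that difference with an in-memory stream; `open_wrapped` and `open_plain` are hypothetical stand-ins, not connector or CDK code.

```python
import io
from contextlib import contextmanager


@contextmanager
def open_wrapped(data: bytes):
    """Old shape (hypothetical stand-in): yields the stream, closes it on exit."""
    result = io.BytesIO(data)
    try:
        yield result
    finally:
        result.close()


def open_plain(data: bytes) -> io.IOBase:
    """New shape (hypothetical stand-in): returns the stream itself."""
    return io.BytesIO(data)


wrapped = open_wrapped(b"hello")
plain = open_plain(b"hello")

# Calling the decorated function gives a context-manager object, not a stream.
wrapped_is_file = isinstance(wrapped, io.IOBase)
wrapped_has_read = hasattr(wrapped, "read")
plain_is_file = isinstance(plain, io.IOBase)

# The decorated version only exposes the stream inside a `with` block.
with open_wrapped(b"hello") as fp:
    wrapped_content = fp.read()

# The plain return value is readable directly; closing it is the caller's job.
with plain:
    plain_content = plain.read()
```

With the plain return, any code that treats the result as a stream (for example by calling `.read()` on it) works without a `with` block, at the cost of making the caller responsible for closing it.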
1 change: 1 addition & 0 deletions docs/integrations/sources/azure-blob-storage.md
@@ -193,6 +193,7 @@ To perform the text extraction from PDF and Docx files, the connector uses the [
 
 | Version | Date | Pull Request | Subject |
 |:--------|:-----------|:---------------------------------------------------------|:--------------------------------------------------------------------------------|
+| 0.3.1 | 2024-01-10 | [34084](https://github.com/airbytehq/airbyte/pull/34084) | Fix bug for running check with document file format |
 | 0.3.0 | 2023-12-14 | [33411](https://github.com/airbytehq/airbyte/pull/33411) | Bump CDK version to auto-set primary key for document file streams and support raw txt files |
 | 0.2.5 | 2023-12-06 | [33187](https://github.com/airbytehq/airbyte/pull/33187) | Bump CDK version to hide source-defined primary key |
 | 0.2.4 | 2023-11-16 | [32608](https://github.com/airbytehq/airbyte/pull/32608) | Improve document file type parser |
