Update dependency io.openlineage:openlineage-java to v1.28.0 #3002

Open
wants to merge 1 commit into
base: main
from

Conversation

renovate[bot]
Contributor

@renovate renovate bot commented Jan 1, 2025

This PR contains the following updates:

Package                            Change
io.openlineage:openlineage-java    1.23.0 -> 1.28.0

Release Notes

OpenLineage/OpenLineage (io.openlineage:openlineage-java)

v1.28.0

Compare Source

Added
  • Java: enable specifying custom SSL context #3444 @​pawel-big-lebowski
    Enable providing configuration for SSL context within HTTP transport.
  • Spark: make Spark nodes filtering configurable. #3442 @​pawel-big-lebowski
    The Spark integration filters OpenLineage events for specific plan node classes. This filtering can now be extended with extra config entries: allowedSparkNodes and deniedSparkNodes. See the Spark Configuration documentation for more details.
  • Java: add task queue based async circuit breaker. #3437 @​aritrabandyo
    This circuit breaker executes tasks on a queue-backed thread pool, gives up on tasks if the queue is full, and keeps track of rejected tasks.
  • dbt: added initial support for Trino adapter #3429 @​whitleykeith
    This allows Trino integration to emit proper events containing Trino datasets.
  • Spark: increased coverage for Spark DML commands #3430 @​ssanthanam185
    Adds coverage for AlterTableRecoverPartitionsCommandVisitor, RefreshTableCommandVisitor, RepairTableCommandVisitor.
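The task-queue-based circuit breaker in the list above (#3437) can be pictured with a minimal sketch built only on the Java standard library. This is not the class shipped in openlineage-java: the name QueueBackedBreaker, its constructor arguments, and its methods are hypothetical and chosen for illustration. It only shows the three behaviours the release note names: run tasks on a thread pool backed by a bounded queue, drop a task when the queue is full, and count the drops.

```java
import java.util.Optional;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; not the OpenLineage implementation.
class QueueBackedBreaker implements AutoCloseable {
  private final ThreadPoolExecutor pool;
  private final AtomicLong rejected = new AtomicLong();

  QueueBackedBreaker(int threads, int queueCapacity) {
    // A bounded queue plus AbortPolicy makes a full queue throw
    // RejectedExecutionException instead of blocking the caller.
    this.pool = new ThreadPoolExecutor(
        threads, threads, 0L, TimeUnit.MILLISECONDS,
        new ArrayBlockingQueue<>(queueCapacity),
        new ThreadPoolExecutor.AbortPolicy());
  }

  // Returns the task's Future if it was accepted, or empty if it was dropped.
  <T> Optional<Future<T>> submit(Callable<T> task) {
    try {
      return Optional.of(pool.submit(task));
    } catch (RejectedExecutionException e) {
      rejected.incrementAndGet(); // keep track of rejected tasks
      return Optional.empty();
    }
  }

  long rejectedCount() {
    return rejected.get();
  }

  @Override
  public void close() {
    pool.shutdownNow();
  }
}
```

With one worker thread and a queue capacity of one, a third task submitted while the first is still running is dropped and rejectedCount() increases by one. The caller is never blocked, which is the point of giving up on tasks rather than waiting for queue space.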
Changed
  • Spark: the OpenLineageSparkListener was refactored to have a public, single-argument constructor taking an instance of SparkConf. #3425 @​d-m-h
    This presents no functional change to the listener, however it will allow for improved initialisation of the listener in the future.
  • Spark: Unsupported catalog exception should be less verbose. #3435 @​pawel-big-lebowski
    In case of unsupported classes, warn logs without a stacktrace should be produced.
  • Spark: Directly expose the LogicalPlan and SparkPlan objects inside OpenLineageContext. #3443 @​d-m-h
    This is an initial refactor to a larger code base change that will see the removal of direct access of the QueryExecution object. It has no functional change on the way the integration behaves.
Fixed
  • Spark: improve column lineage by including inputs within COMPLETE events. #3434 @​pawel-big-lebowski
    Send input datasets in COMPLETE events while making sure the version facet is attached on START only.
  • dbt: ParentRunFacet is now correctly attached when using structured logs option. #3432 @​MassyB
    Fixes incorrect structure of ParentRunFacet.

v1.27.0

Compare Source

Added
  • Flink: Experimental version for flink native lineage listener. #3099 @​pawel-big-lebowski
    New flink listener to extract lineage through native Flink interfaces. Supports Flink SQL. Requires Flink 2.0.
  • dbt: add support for consuming dbt structured logs for test and build commands. #3362 @MassyB
    The structured-logs option of the dbt integration now handles the test and build commands too.
  • Spark: allow attaching custom facets to RDDExecContext events. #3379 @​ssanthanam185
    Events emitted from RDDExecutionContext now include custom facets that get loaded as part of InternalHandlerFactory.
  • spec: add DatasetTypeDatasetFacet. #3390 @​ssanthanam185
Changed
  • Python: allow adding additionalProperties to Python facets #3391 @​JDarDagran
    Adds a with_additional_properties method that creates a modified instance of a facet with additional properties.
  • Spark: SerializedFromObject events aren't filtered for non-Delta plans. #3403 @ssanthanam185
    Those events shouldn't be filtered outside the Databricks/Delta ecosystem.
Fixed
  • Spark: fixed ClassLoader handling for OpenLineageExtensionProvider. #3368 @​ddebowczyk92
    Fixes ClassNotFoundException issue when using the openlineage-spark integration alongside a Spark connector that implements the spark-extension-interfaces due to class loader conflicts.
  • SQL: add minimal support for Snowflake LATERAL. #3368 @​cisenbe
    SQL parser won't error on Snowflake's LATERAL keyword.
  • dbt: handle errors in parse_assertions #3311 @​dsaxton-1password
    dbt integration won't fail when looking at tests on seeds.
  • Spark: fix infinite loop in RDD flatten & perf optimization. #3379 @​ssanthanam185
    Spark integration now correctly handles complex jobs that have cycles and nested RDD trees.
  • Python: FileTransport now correctly attaches json file extension. #3404 @​kacpermuda
    When append=False, the json file extension wasn't properly added before.
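The infinite-loop fix for RDD flattening listed above (#3379) concerns traversing a dependency structure that may contain cycles and nested trees. The sketch below is not the integration's code. It shows one common way to make such a traversal terminate: an iterative walk with an identity-based visited set. The RddNode type and the CycleSafeFlatten.flatten method are hypothetical stand-ins for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for an RDD and the RDDs it depends on.
class RddNode {
  final String name;
  final List<RddNode> parents = new ArrayList<>();

  RddNode(String name) {
    this.name = name;
  }
}

class CycleSafeFlatten {
  // Iterative depth-first walk. Each node is emitted at most once,
  // so cycles and shared subtrees cannot cause an endless loop.
  static List<RddNode> flatten(RddNode root) {
    Set<RddNode> visited = Collections.newSetFromMap(new IdentityHashMap<>());
    List<RddNode> out = new ArrayList<>();
    Deque<RddNode> stack = new ArrayDeque<>();
    stack.push(root);
    while (!stack.isEmpty()) {
      RddNode node = stack.pop();
      if (!visited.add(node)) {
        continue; // already seen: skip instead of descending again
      }
      out.add(node);
      for (RddNode parent : node.parents) {
        stack.push(parent);
      }
    }
    return out;
  }
}
```

Using an explicit stack instead of recursion also avoids deep call stacks on heavily nested trees, and the visited set means shared subtrees are processed once rather than once per path.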

v1.26.0

Compare Source

Added
  • dbt: Consume dbt structured logs and report progress in real time. #3314 @​MassyB
    If --consume-structured-logs flag is set, dbt integration will consume dbt structured logs and report execution progress in real time.
  • Java: Add transform transport to allow event modification. #3301 @​pawel-big-lebowski
    The new transport type allows modifying events based on the specified transformer class.
  • Java: Parallel event emitting for composite transport. #3305 @pawel-big-lebowski
    Emit events in parallel for composite transport. Running in parallel is the default behaviour, with continueOnFailure set to true. The default value of continueOnFailure changed from false to true.
  • Spark: Collect ScanReport and CommitReport in OpenLineage events when dealing with Iceberg tables. #3256 @​pawel-big-lebowski
    Collects additional Iceberg metrics for datasets read or written through the library. Visit Dataset Metrics docs for more details.
  • dbt: add support for duckdb adapter #3280 @​mobuchowski
    Adds support for duckdb adapter for dbt integration.
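The parallel composite emit described in the list above (#3305) can be sketched as a fan-out over several sinks. This is not the OpenLineage CompositeTransport: the Sink interface, the ParallelFanOut class, and the exact semantics are assumptions for illustration. The sketch assumes that continueOnFailure=true means "emit to all sinks in parallel and tolerate individual failures", and that continueOnFailure=false means "emit one by one and stop at the first failure".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for a single transport.
interface Sink {
  void emit(String event);
}

class ParallelFanOut {
  // Returns the number of sinks that failed.
  static int emitAll(List<Sink> sinks, String event, boolean continueOnFailure) {
    if (!continueOnFailure) {
      // Sequential mode: the first exception propagates, later sinks are skipped.
      for (Sink sink : sinks) {
        sink.emit(event);
      }
      return 0;
    }
    ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, sinks.size()));
    try {
      List<CompletableFuture<Void>> futures = new ArrayList<>();
      for (Sink sink : sinks) {
        futures.add(CompletableFuture.runAsync(() -> sink.emit(event), pool));
      }
      int failures = 0;
      for (CompletableFuture<Void> future : futures) {
        try {
          future.join(); // wait for this sink to finish
        } catch (CompletionException e) {
          failures++; // one failing sink does not stop the others
        }
      }
      return failures;
    } finally {
      pool.shutdown();
    }
  }
}
```

In parallel mode a slow or failing sink does not delay or block delivery to the others, which is why tolerating failures and emitting in parallel fit together naturally.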
Changed
  • Spark: Add DatasetFactory to support Dataset creation. #3207 @​pawel-big-lebowski
    Adds DatasetFactory to support Dataset creation. This class is used to create Dataset instances for DatasetFactory.
Fixed

v1.25.0

Compare Source

Added
  • dbt: Add support for Column-Level Lineage in dbt integration. #3264 @mayurmadnani
    The dbt integration now uses the SQL parser to add collected column-level lineage information.
  • Spark: Add input and output statistics about datasets read and written. #3240 #3263 @pawel-big-lebowski
    Fixes issues in the existing output statistics collection mechanism and fetches input statistics. Output statistics now contain the number of files written, the size in bytes, and the number of records written. Input statistics contain the size in bytes and the number of files read, while the record count is collected only for DataSourceV2 sources.
  • Introduced InputStatisticsInputDatasetFacet #3238 @​pawel-big-lebowski
    Extend spec with a new facet InputStatisticsInputDatasetFacet modelled after a similar OutputStatisticsOutputDatasetFacet to contain statistics about input dataset read by a job.
Changed
  • Spark: Exclude META-INF/*TransportBuilder from Spark Extension Interfaces #3244 @​tnazarew
    Excludes META-INF/*TransportBuilder to avoid version conflicts
  • Spark: enables building input/output facets through DatasetFactory #3207 @​pawel-big-lebowski
    Adds extra capabilities into DatasetFactory class, marks some public developers' API methods as deprecated.
Fixed
  • dbt: fix compatibility with dbt v1.8 #3228 @​NJA010
    dbt integration now takes into account modified test_metadata field
  • Spark: enabled Delta 3.x version compatibility #3253 @​Jorricks
    Take into account modified initialSnapshot name

v1.24.2

Compare Source

Added
  • Spark: Add Dataproc run facet to include jobType property #3167 @​codelixir
    Updates the GCP Dataproc run facet to include jobType property
  • Add EnvironmentVariablesRunFacet to core spec #3186 @​JDarDagran
    Use EnvironmentVariablesRunFacet in Python client
  • Add assertions for format in test events #3221 @​JDarDagran
  • Spark: Add integration tests for EMR #3142 @​arturowczarek
    Spark integration has integration tests for EMR
Changed
  • Move Kinesis to separate module, migrate HTTP transport to httpclient5 #3205 @​mobuchowski
    Moves Kinesis integration to a separate module and updates HTTP transport to use HttpClient 5.x
  • Docs: Upgrade docusaurus to 3.6 #3219 @​arturowczarek
  • Spark: Limit the Seq size in RddPathUtils::extract() #3148 @​codelixir
    Adds flag to limit the logs in RddPathUtils::extract() to avoid OutOfMemoryError for large jobs
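The idea behind limiting the Seq size in RddPathUtils::extract() (#3148) is that rendering a very large collection into a log message can exhaust memory. A minimal sketch of that idea follows. It is not the integration's code, and the BoundedLog class, the preview method, and the limit parameter are hypothetical names used for illustration.

```java
import java.util.Collection;
import java.util.stream.Collectors;

// Hypothetical helper; illustrates bounding the size of a logged collection.
class BoundedLog {
  // Renders at most `limit` elements and notes how many were left out,
  // so the log string stays small however large the collection is.
  static String preview(Collection<?> items, int limit) {
    String head = items.stream()
        .limit(limit)
        .map(item -> String.valueOf(item))
        .collect(Collectors.joining(", "));
    int omitted = Math.max(0, items.size() - limit);
    return omitted == 0
        ? "[" + head + "]"
        : "[" + head + ", ... (" + omitted + " more)]";
  }
}
```

For a five-element list and a limit of two, preview returns "[1, 2, ... (3 more)]". Only the first two elements are ever converted to strings.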
Fixed

v1.24.1

Compare Source

v1.24.0

Compare Source


Configuration

📅 Schedule: Branch creation - "every 3 months on the first day of the month" (UTC), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever PR is behind base branch, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

This PR was generated by Mend Renovate. View the repository job log.

netlify bot commented Jan 1, 2025

Deploy Preview for peppy-sprite-186812 failed.

Name Link
🔨 Latest commit ebaf22c
🔍 Latest deploy log https://app.netlify.com/sites/peppy-sprite-186812/deploys/67a556ea35723e0008d225ff

@renovate renovate bot force-pushed the renovate/openlineageversion branch from d415db2 to a641137 Compare January 20, 2025 19:42
@renovate renovate bot changed the title from "Update dependency io.openlineage:openlineage-java to v1.26.0" to "Update dependency io.openlineage:openlineage-java to v1.27.0" Jan 20, 2025
codecov bot commented Jan 20, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.18%. Comparing base (cfff11d) to head (dc448cf).

Additional details and impacted files
@@            Coverage Diff            @@
##               main    #3002   +/-   ##
=========================================
  Coverage     81.18%   81.18%           
  Complexity     1506     1506           
=========================================
  Files           268      268           
  Lines          7356     7356           
  Branches        325      325           
=========================================
  Hits           5972     5972           
  Misses         1226     1226           
  Partials        158      158           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@renovate renovate bot force-pushed the renovate/openlineageversion branch 7 times, most recently from 403306b to 96a3b7b Compare January 22, 2025 13:56
@renovate renovate bot force-pushed the renovate/openlineageversion branch 2 times, most recently from 12587ff to dc448cf Compare February 5, 2025 17:38
Signed-off-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
@renovate renovate bot force-pushed the renovate/openlineageversion branch from dc448cf to ebaf22c Compare February 7, 2025 00:42
@renovate renovate bot changed the title from "Update dependency io.openlineage:openlineage-java to v1.27.0" to "Update dependency io.openlineage:openlineage-java to v1.28.0" Feb 7, 2025