This release addresses the following items:
- Can now split text in documents when writing them to MarkLogic. Chunks of text can be added to the source document itself or written to separate sidecar documents.
- Can now add embeddings to chunks in documents before writing them to MarkLogic. You can reuse the Flux embedding model integrations available from the Flux releases site by adding one or more of these JAR files to your Spark classpath.
- When reading rows via an Optic query, the Optic query no longer requires the use of `op.fromView`. However, when not using `op.fromView`, the Optic query will be executed in a single call to MarkLogic.
- When writing files to a directory, the given path will be created automatically if it does not exist, matching the behavior of Spark file-based data sources.
Please see the writing guide for more information on the splitter and embedder features.
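
To illustrate the difference between adding chunks to the source document and writing them to sidecar documents, here is a minimal, library-free sketch in Python. It is not the connector's implementation: the fixed-size character splitting, the function names (`split_text`, `add_chunks_to_source`, `build_sidecars`), the `chunks` and `source-uri` field names, and the sidecar URI pattern are all assumptions made up for this example.

```python
def split_text(text: str, max_chunk_size: int, overlap: int = 0) -> list[str]:
    """Split text into chunks of at most max_chunk_size characters.

    Each chunk after the first repeats the last `overlap` characters of the
    previous chunk. This is a simple character-based strategy chosen for
    illustration only.
    """
    if max_chunk_size <= 0:
        raise ValueError("max_chunk_size must be positive")
    if not 0 <= overlap < max_chunk_size:
        raise ValueError("overlap must be in the range [0, max_chunk_size)")

    step = max_chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chunk_size])
        if start + max_chunk_size >= len(text):
            break  # the chunk just added reached the end of the text
        start += step
    return chunks


def add_chunks_to_source(doc: dict, chunks: list[str]) -> dict:
    """Return a copy of the source document with the chunks embedded in it.

    The "chunks" field name is hypothetical.
    """
    return {**doc, "chunks": [{"text": c} for c in chunks]}


def build_sidecars(source_uri: str, chunks: list[str],
                   max_chunks_per_doc: int = 1) -> list[dict]:
    """Group chunks into separate sidecar documents that point back to the source.

    The URI pattern and the "source-uri" field name are hypothetical.
    """
    if max_chunks_per_doc <= 0:
        raise ValueError("max_chunks_per_doc must be positive")

    sidecars = []
    for i in range(0, len(chunks), max_chunks_per_doc):
        group = chunks[i:i + max_chunks_per_doc]
        sidecars.append({
            "uri": f"{source_uri}-chunks-{len(sidecars) + 1}.json",
            "source-uri": source_uri,
            "chunks": [{"text": c} for c in group],
        })
    return sidecars


if __name__ == "__main__":
    doc = {"uri": "/article.json", "text": "abcdefghij"}
    chunks = split_text(doc["text"], max_chunk_size=4, overlap=1)
    print(add_chunks_to_source(doc, chunks))
    print(build_sidecars(doc["uri"], chunks, max_chunks_per_doc=2))
```

Embedding the chunks keeps everything in one document, while sidecar documents leave the source untouched and let chunks be managed separately. Which option fits better depends on how the chunks will be queried.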