From adc87873606fb14ab7b74208ad733ed62a23d281 Mon Sep 17 00:00:00 2001
From: Austin Walker
Date: Fri, 7 Jun 2024 12:52:32 -0400
Subject: [PATCH] chore/move client specific params to their own section (#68)

---
 api-reference/api-services/api-parameters.mdx | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/api-reference/api-services/api-parameters.mdx b/api-reference/api-services/api-parameters.mdx
index f069a9e1..4ee859f8 100644
--- a/api-reference/api-services/api-parameters.mdx
+++ b/api-reference/api-services/api-parameters.mdx
@@ -23,7 +23,6 @@ The only required parameter is `files` - the file you wish to process.
 | `output_format` (_str_) | `outputFormat` (_string_) | The format of the response. Supported formats are `application/json` and `text/csv`. Default: `application/json`. |
 | `pdf_infer_table_structure` (_bool_) | `pdfInferTableStructure` (_boolean_) | **Deprecated!** If True and strategy=hi_res, any Table Elements extracted from a PDF will include an additional metadata field, 'text_as_html', where the value (string) is a just a transformation of the data into an HTML table. |
 | `skip_infer_table_types` (_List[str]_) | `skipInferTableTypes` (_string[]_) | The document types that you want to skip table extraction with. Default: [] |
-| `split_pdf_page` (_bool_) | `splitPdfPage` (_boolean_) | Should the pdf file be split at client. Ignored on backend. |
 | `starting_page_number` (_int_) | `startingPageNumber` (_number_) | Indicates what page number should be assigned to the first page in the document. This information will be reflected in elements' metadata and can be be especially useful when partitioning a document that is part of a larger document. |
 | `strategy` (_str_) | `strategy` (_string_) | The strategy to use for partitioning PDF/image. Options are `fast`, `hi_res`, `auto`. Default: `auto` |
 | `unique_element_ids` (_bool_) | `uniqueElementIds` (_boolean_) | When True, assign UUIDs to element IDs, which guarantees their uniqueness (useful when using them as primary keys in database). Otherwise a SHA-256 of element text is used. Default: False |
@@ -41,4 +40,11 @@ The following parameters only apply when a `chunking_strategy` is specified. Oth
 | `overlap` (_int_) | `overlap` (_number_) | A prefix of this many trailing characters from the prior text-split chunk is applied to second and later chunks formed from oversized elements by text-splitting. Default: None |
 | `overlap_all` (_bool_) | `overlapAll` (_boolean_) | When True, overlap is also applied to 'normal' chunks formed by combining whole elements. Use with caution as this can introduce noise into otherwise clean semantic units. Default: None |
 
+The following parameters are specific to the Python and Javascript clients and are not sent to the server.
+
+| Python & direct call | JavaScript | Description |
+|---------------------------------------|----------------------------|--------------------------------------------------------------------------------------------------------------------------------|
+| `split_pdf_page` (_bool_) | `splitPdfPage` (_boolean_) | Should the pdf file be split at client. See [page splitting](/api-reference/api-services/sdk#page-splitting) for more details. |
+| `split_pdf_concurrency_level` (_int_) | _Not supported yet_ | Number of split files to be sent concurrently. Default: 5, maximum: 15 |
+
 Need help getting started? Check out the [Examples page](/api-reference/api-services/examples) for some inspiration.
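
The new section separates parameters that stay on the client from those sent to the server. The following is a minimal sketch of that separation, not the clients' actual implementation. The helper `separate_params` and the constants are hypothetical names introduced for illustration. The lower bound of 1 and the choice to raise an error on an out-of-range value are assumptions; only the default of 5 and the maximum of 15 come from the table in this patch.

```python
# Hypothetical helper for illustration; not part of any Unstructured client.
CLIENT_ONLY_PARAMS = {"split_pdf_page", "split_pdf_concurrency_level"}
DEFAULT_CONCURRENCY = 5   # default stated in the new table
MAX_CONCURRENCY = 15      # maximum stated in the new table


def separate_params(params):
    """Split one flat dict into (client_options, server_params)."""
    # Client-only keys are consumed locally and never sent to the server.
    client_options = {k: v for k, v in params.items() if k in CLIENT_ONLY_PARAMS}
    server_params = {k: v for k, v in params.items() if k not in CLIENT_ONLY_PARAMS}

    # Fall back to the documented default when no level is given.
    level = client_options.get("split_pdf_concurrency_level", DEFAULT_CONCURRENCY)
    # Assumption: reject out-of-range values instead of clamping them.
    if not 1 <= level <= MAX_CONCURRENCY:
        raise ValueError(
            f"split_pdf_concurrency_level must be between 1 and {MAX_CONCURRENCY}"
        )
    client_options["split_pdf_concurrency_level"] = level
    return client_options, server_params


client_options, server_params = separate_params(
    {"strategy": "hi_res", "split_pdf_page": True}
)
print(client_options)
print(server_params)
```

In this sketch only `server_params` would be placed in the request body, while `client_options` controls how the file is split and uploaded.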