chore/move client specific params to their own section (#68)
awalker4 authored Jun 7, 2024
1 parent 380956e commit adc8787
Showing 1 changed file with 7 additions and 1 deletion.
8 changes: 7 additions & 1 deletion api-reference/api-services/api-parameters.mdx
@@ -23,7 +23,6 @@ The only required parameter is `files` - the file you wish to process.
| `output_format` (_str_) | `outputFormat` (_string_) | The format of the response. Supported formats are `application/json` and `text/csv`. Default: `application/json`. |
| `pdf_infer_table_structure` (_bool_) | `pdfInferTableStructure` (_boolean_) | **Deprecated!** If True and strategy=hi_res, any Table Elements extracted from a PDF will include an additional metadata field, 'text_as_html', where the value (string) is just a transformation of the data into an HTML table. |
| `skip_infer_table_types` (_List[str]_) | `skipInferTableTypes` (_string[]_) | The document types for which table extraction should be skipped. Default: [] |
| `split_pdf_page` (_bool_) | `splitPdfPage` (_boolean_) | Should the pdf file be split at client. Ignored on backend. |
| `starting_page_number` (_int_) | `startingPageNumber` (_number_) | Indicates what page number should be assigned to the first page in the document. This information will be reflected in elements' metadata and can be especially useful when partitioning a document that is part of a larger document. |
| `strategy` (_str_) | `strategy` (_string_) | The strategy to use for partitioning PDF/image. Options are `fast`, `hi_res`, `auto`. Default: `auto` |
| `unique_element_ids` (_bool_) | `uniqueElementIds` (_boolean_) | When True, assign UUIDs to element IDs, which guarantees their uniqueness (useful when using them as primary keys in a database). Otherwise a SHA-256 of element text is used. Default: False |
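
The two ID schemes described for `unique_element_ids` can be sketched roughly as follows. This is an illustration only, not the service's implementation: the helper name `make_element_id` is hypothetical, and the real service may hash additional fields or shorten the digest.

```python
import hashlib
import uuid


def make_element_id(text: str, unique_element_ids: bool = False) -> str:
    """Hypothetical helper illustrating the two ID schemes.

    - unique_element_ids=True: a random UUID, unique per element.
    - unique_element_ids=False: a SHA-256 hex digest of the element text,
      so identical text always yields an identical ID.
    """
    if unique_element_ids:
        return str(uuid.uuid4())
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Hash-based IDs collide for repeated text, UUIDs do not.
same_a = make_element_id("Total revenue")
same_b = make_element_id("Total revenue")
uuid_a = make_element_id("Total revenue", unique_element_ids=True)
uuid_b = make_element_id("Total revenue", unique_element_ids=True)
```

Under these assumptions, two elements with identical text share a hash-based ID, which is why UUIDs are the safer choice when IDs serve as primary keys.
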
@@ -41,4 +40,11 @@ The following parameters only apply when a `chunking_strategy` is specified. Oth
| `overlap` (_int_) | `overlap` (_number_) | A prefix of this many trailing characters from the prior text-split chunk is applied to second and later chunks formed from oversized elements by text-splitting. Default: None |
| `overlap_all` (_bool_) | `overlapAll` (_boolean_) | When True, overlap is also applied to 'normal' chunks formed by combining whole elements. Use with caution as this can introduce noise into otherwise clean semantic units. Default: None |
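
The `overlap` behaviour for text-split chunks can be pictured with a small sketch. It assumes a plain character-based split with a hypothetical size limit argument `max_characters`; the function `split_with_overlap` is made up for illustration, and the actual chunker may choose split points differently (for example, at word boundaries).

```python
def split_with_overlap(text: str, max_characters: int, overlap: int = 0) -> list[str]:
    """Hypothetical sketch: split oversized text into chunks of at most
    `max_characters`, where every chunk after the first begins with the
    last `overlap` characters of the previous chunk."""
    if max_characters <= 0:
        raise ValueError("max_characters must be positive")
    if not 0 <= overlap < max_characters:
        raise ValueError("overlap must be >= 0 and smaller than max_characters")

    chunks: list[str] = []
    start = 0
    while start < len(text):
        end = min(start + max_characters, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        # Step back so the next chunk repeats the tail of this one.
        start = end - overlap
    return chunks


chunks = split_with_overlap("abcdefghij", max_characters=4, overlap=1)
# chunks == ["abcd", "defg", "ghij"]
```

Each later chunk starts with the final character of the chunk before it. With `overlap_all`, the same kind of prefix would also be added to chunks built from whole elements, which is where the noise mentioned in the table can come from.
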

The following parameters are specific to the Python and JavaScript clients and are not sent to the server.

| Python & direct call | JavaScript | Description |
|---------------------------------------|----------------------------|--------------------------------------------------------------------------------------------------------------------------------|
| `split_pdf_page` (_bool_) | `splitPdfPage` (_boolean_) | Whether to split the PDF on the client. See [page splitting](/api-reference/api-services/sdk#page-splitting) for more details. |
| `split_pdf_concurrency_level` (_int_) | _Not supported yet_ | Number of split files to be sent concurrently. Default: 5, maximum: 15 |
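
The idea that these parameters stay on the client can be sketched as a filtering step before the request is built. The helper names below (`separate_client_params`, `effective_concurrency`) are hypothetical and not part of either client library, and clamping out-of-range values to the documented maximum is an assumption; the real clients may reject or handle such values differently.

```python
# Parameters documented above as client-only (never sent to the server).
CLIENT_ONLY_PARAMS = {"split_pdf_page", "split_pdf_concurrency_level"}
DEFAULT_CONCURRENCY = 5   # documented default
MAX_CONCURRENCY = 15      # documented maximum


def separate_client_params(params: dict) -> tuple[dict, dict]:
    """Split a parameter dict into (client_only, server_bound) parts."""
    client = {k: v for k, v in params.items() if k in CLIENT_ONLY_PARAMS}
    server = {k: v for k, v in params.items() if k not in CLIENT_ONLY_PARAMS}
    return client, server


def effective_concurrency(client_params: dict) -> int:
    """Return the concurrency level, clamped to 1..MAX_CONCURRENCY (assumed behaviour)."""
    level = client_params.get("split_pdf_concurrency_level", DEFAULT_CONCURRENCY)
    return max(1, min(level, MAX_CONCURRENCY))


client, server = separate_client_params(
    {"strategy": "hi_res", "split_pdf_page": True, "split_pdf_concurrency_level": 20}
)
```

Here `server` holds only `strategy`, while `client` keeps the two splitting options for local use.
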

Need help getting started? Check out the [Examples page](/api-reference/api-services/examples) for some inspiration.
