Commit

doc update for tag python-v0.17.4
deltars committed May 11, 2024
1 parent befdbe8 commit a07b64a
Showing 10 changed files with 420 additions and 65 deletions.
34 changes: 33 additions & 1 deletion api/delta_table/index.html
@@ -3703,7 +3703,7 @@ <h4 id="deltalake.DeltaTable.to_pyarrow_dataset" class="doc doc-heading">


</h4>
<div class="doc-signature highlight"><pre><span></span><code><span id="__span-0-1"><a id="__codelineno-0-1" name="__codelineno-0-1" href="#__codelineno-0-1"></a><span class="nf">to_pyarrow_dataset</span><span class="p">(</span><span class="n">partitions</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">List</span><span class="p">[</span><span class="n">Tuple</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="nb">str</span><span class="p">,</span> <span class="n">Any</span><span class="p">]]]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span> <span class="n">filesystem</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">Union</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="n">pa_fs</span><span class="o">.</span><span class="n">FileSystem</span><span class="p">]]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span> <span class="n">parquet_read_options</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">ParquetReadOptions</span><span class="p">]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">pyarrow</span><span class="o">.</span><span class="n">dataset</span><span class="o">.</span><span class="n">Dataset</span>
<div class="doc-signature highlight"><pre><span></span><code><span id="__span-0-1"><a id="__codelineno-0-1" name="__codelineno-0-1" href="#__codelineno-0-1"></a><span class="nf">to_pyarrow_dataset</span><span class="p">(</span><span class="n">partitions</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">List</span><span class="p">[</span><span class="n">Tuple</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="nb">str</span><span class="p">,</span> <span class="n">Any</span><span class="p">]]]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span> <span class="n">filesystem</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">Union</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="n">pa_fs</span><span class="o">.</span><span class="n">FileSystem</span><span class="p">]]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span> <span class="n">parquet_read_options</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">ParquetReadOptions</span><span class="p">]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span> <span class="n">schema</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">pyarrow</span><span class="o">.</span><span class="n">Schema</span><span class="p">]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span> <span class="n">as_large_types</span><span class="p">:</span> <span class="nb">bool</span> <span class="o">=</span> <span class="kc">False</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="n">pyarrow</span><span class="o">.</span><span class="n">dataset</span><span class="o">.</span><span class="n">Dataset</span>
</span></code></pre></div>

<div class="doc doc-contents ">
@@ -3765,6 +3765,38 @@ <h4 id="deltalake.DeltaTable.to_pyarrow_dataset" class="doc doc-heading">
<code>None</code>
</td>
</tr>
<tr>
<td><code>schema</code></td>
<td>
<code><span title="typing.Optional">Optional</span>[<a class="autorefs autorefs-external" title="pyarrow.Schema" href="https://arrow.apache.org/docs/python/generated/pyarrow.Schema.html#pyarrow.Schema">Schema</a>]</code>
</td>
<td>
<div class="doc-md-description">
<p>The schema to use for the dataset. If None, the schema of the DeltaTable will be used. This can be used to force reading of Parquet/Arrow datatypes
that DeltaLake can't represent in its schema (e.g. LargeString).
If you only need to read the schema with large types (e.g. for compatibility with Polars), you may want to use the <code>as_large_types</code> parameter instead.</p>
</div>
</td>
<td>
<code>None</code>
</td>
</tr>
<tr>
<td><code>as_large_types</code></td>
<td>
<code>bool</code>
</td>
<td>
<div class="doc-md-description">
<p>If True, read all variable-size types (list, binary, string) as their large variants (with int64 offsets).
This is for compatibility with systems like Polars that only support the large versions of Arrow types.
If <code>schema</code> is passed, it takes precedence over this option.</p>
</div>
</td>
<td>
<code>False</code>
</td>
</tr>
</tbody>
</table>
<p>More info: https://arrow.apache.org/docs/python/generated/pyarrow.dataset.ParquetReadOptions.html</p>
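The two parameters added in this diff interact: an explicit schema wins over as_large_types, and the table's own schema is used only when neither is given. The sketch below is a toy model of that documented precedence, not the library's implementation. It represents a schema as a plain dict of column name to type name, and the names resolve_schema, LARGE_VARIANTS and open_large_dataset are hypothetical helpers introduced for illustration. Only the to_pyarrow_dataset(as_large_types=True) call comes from the signature shown in this diff.

```python
from typing import Dict, Optional

# Toy stand-in: a schema is a dict of column name -> Arrow type name.
# The real API takes a pyarrow.Schema; this simplification is an assumption.
Schema = Dict[str, str]

# Variable-size types and their large variants, as listed in the parameter docs.
LARGE_VARIANTS = {
    "string": "large_string",
    "binary": "large_binary",
    "list": "large_list",
}


def resolve_schema(
    table_schema: Schema,
    schema: Optional[Schema] = None,
    as_large_types: bool = False,
) -> Schema:
    """Model the documented precedence: schema > as_large_types > table schema."""
    if schema is not None:
        # An explicit schema takes precedence over as_large_types.
        return dict(schema)
    if as_large_types:
        # Swap each variable-size type for its large variant; leave others alone.
        return {name: LARGE_VARIANTS.get(t, t) for name, t in table_schema.items()}
    return dict(table_schema)


def open_large_dataset(table_uri: str):
    """Real call per the signature in this diff; requires deltalake to be installed."""
    from deltalake import DeltaTable  # imported lazily so the toy model runs without it

    return DeltaTable(table_uri).to_pyarrow_dataset(as_large_types=True)


table_schema = {"id": "int64", "name": "string", "payload": "binary"}
explicit = {"id": "int64", "name": "large_string", "payload": "binary"}

print(resolve_schema(table_schema))
print(resolve_schema(table_schema, as_large_types=True))
print(resolve_schema(table_schema, schema=explicit, as_large_types=True))
```

In the last call the explicit schema is returned unchanged even though as_large_types is True, so the payload column stays binary. That is the case where passing schema is the better fit: it lets a caller widen only selected columns rather than all variable-size ones.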
