Commit

deploy: 9500d04
ryannikolaidis committed Oct 11, 2023
1 parent e7ab50c commit 12f8511
Showing 58 changed files with 66 additions and 74 deletions.
2 changes: 1 addition & 1 deletion .buildinfo
@@ -1,4 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
-config: 39ac1536ba6b738ef4f304e6af7e643a
+config: 0df4f9145e5dec97b8895ea31e54e8f0
tags: 645f666f9bcd5a90fca523b33c5a78b7
2 changes: 1 addition & 1 deletion _sources/introduction/getting_started.rst.txt
@@ -173,7 +173,7 @@ of the table will be available in the element metadata under ``element.metadata.
table extraction is available, the ``partition`` function will extract tables automatically if they are present.
For PDFs and images, table extraction requires a relatively expensive call to a table recognition model, and so for those
document types table extraction is an option you need to enable. If you would like to extract tables for PDFs or images,
-pass in ``infer_table_structured=True``. Here is an example (Note: this example requires the ``pdf`` extra. This can be installed with ``pip install "unstructured[pdf]"``):
+pass in ``infer_table_structure=True``. Here is an example (Note: this example requires the ``pdf`` extra. This can be installed with ``pip install "unstructured[pdf]"``):

.. code:: python
4 changes: 2 additions & 2 deletions _sources/introduction/key_concepts.rst.txt
@@ -6,7 +6,7 @@ Natural Language Processing (NLP) encompasses a broad spectrum of tasks and meth
Data Ingestion
--------------

-Unstructured's ``upstream connectors`` make data ingestion easy. They ensure that your data is accessible, up to date, and usable for any downstream task. If you'd like to read more on our upstream connectors, you can find details `here <../upstream_connectors.html>`__.
+Unstructured's ``upstream connectors`` make data ingestion easy. They ensure that your data is accessible, up to date, and usable for any downstream task. If you'd like to read more on our upstream connectors, you can find details `here <https://unstructured-io.github.io/unstructured/source_connectors.html>`__.

Data Preprocessing
^^^^^^^^^^^^^^^^^^
@@ -61,7 +61,7 @@ is particularly beneficial for organizations that lack the means to develop and

A RAG workflow can be broken down into the following steps:

-1. **Data ingestion**: The first step is acquiring data from your relevant sources. At Unstructured we make this super easy with our `data connectors <https://unstructured-io.github.io/unstructured/upstream_connectors.html>`__.
+1. **Data ingestion**: The first step is acquiring data from your relevant sources. At Unstructured we make this super easy with our `data connectors <https://unstructured-io.github.io/unstructured/source_connectors.html>`__.

2. **Data preprocessing and cleaning**: Once you've identified and collected your data sources a good practice is to remove any unnecessary artifacts within the dataset. At Unstructured we have a variety of different tools to remove unneccesary elements. Found `here <https://unstructured-io.github.io/unstructured/bricks.html>`_

3 changes: 0 additions & 3 deletions _sources/metadata.rst.txt
@@ -48,9 +48,6 @@ the source file:
+-----------------------------+----------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| emphasized_text_tags | Tags on text that is emphasized in the original document | |
+-----------------------------+----------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| num_characters | The number of characters used | Used for chunking. |
| | for max_characters in add_chunking_strategy | |
+-----------------------------+----------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| is_continuation | True if element is a continuation of a previous element | Only relevant for chunking, if an element was divided into two due to ``max_characters``. |
+-----------------------------+----------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| detection_class_prob | Detection model class probabilities | From unstructured-inference, hi-res strategy. |
2 changes: 1 addition & 1 deletion _static/documentation_options.js
@@ -1,6 +1,6 @@
var DOCUMENTATION_OPTIONS = {
URL_ROOT: document.getElementById("documentation_options").getAttribute('data-url_root'),
-VERSION: '0.10.19',
+VERSION: '0.10.20',
LANGUAGE: 'en',
COLLAPSE_INDEX: false,
BUILDER: 'html',
2 changes: 1 addition & 1 deletion api.html
@@ -7,7 +7,7 @@
<link rel="index" title="Index" href="genindex.html" /><link rel="search" title="Search" href="search.html" /><link rel="next" title="Bricks" href="bricks.html" /><link rel="prev" title="Docker Installation" href="installation/docker.html" />

<link rel="shortcut icon" href="_static/unstructured_small.png" /><meta name="generator" content="sphinx-6.2.1, furo 2023.07.26" />
-<title>Unstructured API - Unstructured 0.10.19 documentation</title>
+<title>Unstructured API - Unstructured 0.10.20 documentation</title>
<link rel="stylesheet" type="text/css" href="_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="_static/styles/furo.css?digest=369552022d0b975c8e74270ce6eabe0fb7978f24" />
<link rel="stylesheet" type="text/css" href="_static/tabs.css" />
2 changes: 1 addition & 1 deletion best_practices.html
@@ -7,7 +7,7 @@
<link rel="index" title="Index" href="genindex.html" /><link rel="search" title="Search" href="search.html" /><link rel="next" title="Strategies" href="best_practices/strategies.html" /><link rel="prev" title="Integrations" href="integrations.html" />

<link rel="shortcut icon" href="_static/unstructured_small.png" /><meta name="generator" content="sphinx-6.2.1, furo 2023.07.26" />
-<title>Best Practices - Unstructured 0.10.19 documentation</title>
+<title>Best Practices - Unstructured 0.10.20 documentation</title>
<link rel="stylesheet" type="text/css" href="_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="_static/styles/furo.css?digest=369552022d0b975c8e74270ce6eabe0fb7978f24" />
<link rel="stylesheet" type="text/css" href="_static/tabs.css" />
2 changes: 1 addition & 1 deletion best_practices/models.html
@@ -7,7 +7,7 @@
<link rel="index" title="Index" href="../genindex.html" /><link rel="search" title="Search" href="../search.html" /><link rel="prev" title="Strategies" href="strategies.html" />

<link rel="shortcut icon" href="../_static/unstructured_small.png" /><meta name="generator" content="sphinx-6.2.1, furo 2023.07.26" />
-<title>Models - Unstructured 0.10.19 documentation</title>
+<title>Models - Unstructured 0.10.20 documentation</title>
<link rel="stylesheet" type="text/css" href="../_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="../_static/styles/furo.css?digest=369552022d0b975c8e74270ce6eabe0fb7978f24" />
<link rel="stylesheet" type="text/css" href="../_static/tabs.css" />
2 changes: 1 addition & 1 deletion best_practices/strategies.html
@@ -7,7 +7,7 @@
<link rel="index" title="Index" href="../genindex.html" /><link rel="search" title="Search" href="../search.html" /><link rel="next" title="Models" href="models.html" /><link rel="prev" title="Best Practices" href="../best_practices.html" />

<link rel="shortcut icon" href="../_static/unstructured_small.png" /><meta name="generator" content="sphinx-6.2.1, furo 2023.07.26" />
-<title>Strategies - Unstructured 0.10.19 documentation</title>
+<title>Strategies - Unstructured 0.10.20 documentation</title>
<link rel="stylesheet" type="text/css" href="../_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="../_static/styles/furo.css?digest=369552022d0b975c8e74270ce6eabe0fb7978f24" />
<link rel="stylesheet" type="text/css" href="../_static/tabs.css" />
2 changes: 1 addition & 1 deletion bricks.html
@@ -7,7 +7,7 @@
<link rel="index" title="Index" href="genindex.html" /><link rel="search" title="Search" href="search.html" /><link rel="next" title="Partitioning" href="bricks/partition.html" /><link rel="prev" title="Unstructured API" href="api.html" />

<link rel="shortcut icon" href="_static/unstructured_small.png" /><meta name="generator" content="sphinx-6.2.1, furo 2023.07.26" />
-<title>Bricks - Unstructured 0.10.19 documentation</title>
+<title>Bricks - Unstructured 0.10.20 documentation</title>
<link rel="stylesheet" type="text/css" href="_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="_static/styles/furo.css?digest=369552022d0b975c8e74270ce6eabe0fb7978f24" />
<link rel="stylesheet" type="text/css" href="_static/tabs.css" />
2 changes: 1 addition & 1 deletion bricks/chunking.html
@@ -7,7 +7,7 @@
<link rel="index" title="Index" href="../genindex.html" /><link rel="search" title="Search" href="../search.html" /><link rel="next" title="Embedding" href="embedding.html" /><link rel="prev" title="Staging" href="staging.html" />

<link rel="shortcut icon" href="../_static/unstructured_small.png" /><meta name="generator" content="sphinx-6.2.1, furo 2023.07.26" />
-<title>Chunking - Unstructured 0.10.19 documentation</title>
+<title>Chunking - Unstructured 0.10.20 documentation</title>
<link rel="stylesheet" type="text/css" href="../_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="../_static/styles/furo.css?digest=369552022d0b975c8e74270ce6eabe0fb7978f24" />
<link rel="stylesheet" type="text/css" href="../_static/tabs.css" />
2 changes: 1 addition & 1 deletion bricks/cleaning.html
@@ -7,7 +7,7 @@
<link rel="index" title="Index" href="../genindex.html" /><link rel="search" title="Search" href="../search.html" /><link rel="next" title="Extracting" href="extracting.html" /><link rel="prev" title="Partitioning" href="partition.html" />

<link rel="shortcut icon" href="../_static/unstructured_small.png" /><meta name="generator" content="sphinx-6.2.1, furo 2023.07.26" />
-<title>Cleaning - Unstructured 0.10.19 documentation</title>
+<title>Cleaning - Unstructured 0.10.20 documentation</title>
<link rel="stylesheet" type="text/css" href="../_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="../_static/styles/furo.css?digest=369552022d0b975c8e74270ce6eabe0fb7978f24" />
<link rel="stylesheet" type="text/css" href="../_static/tabs.css" />
2 changes: 1 addition & 1 deletion bricks/embedding.html
@@ -7,7 +7,7 @@
<link rel="index" title="Index" href="../genindex.html" /><link rel="search" title="Search" href="../search.html" /><link rel="next" title="Source Connectors" href="../source_connectors.html" /><link rel="prev" title="Chunking" href="chunking.html" />

<link rel="shortcut icon" href="../_static/unstructured_small.png" /><meta name="generator" content="sphinx-6.2.1, furo 2023.07.26" />
-<title>Embedding - Unstructured 0.10.19 documentation</title>
+<title>Embedding - Unstructured 0.10.20 documentation</title>
<link rel="stylesheet" type="text/css" href="../_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="../_static/styles/furo.css?digest=369552022d0b975c8e74270ce6eabe0fb7978f24" />
<link rel="stylesheet" type="text/css" href="../_static/tabs.css" />
2 changes: 1 addition & 1 deletion bricks/extracting.html
@@ -7,7 +7,7 @@
<link rel="index" title="Index" href="../genindex.html" /><link rel="search" title="Search" href="../search.html" /><link rel="next" title="Staging" href="staging.html" /><link rel="prev" title="Cleaning" href="cleaning.html" />

<link rel="shortcut icon" href="../_static/unstructured_small.png" /><meta name="generator" content="sphinx-6.2.1, furo 2023.07.26" />
-<title>Extracting - Unstructured 0.10.19 documentation</title>
+<title>Extracting - Unstructured 0.10.20 documentation</title>
<link rel="stylesheet" type="text/css" href="../_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="../_static/styles/furo.css?digest=369552022d0b975c8e74270ce6eabe0fb7978f24" />
<link rel="stylesheet" type="text/css" href="../_static/tabs.css" />
2 changes: 1 addition & 1 deletion bricks/partition.html
@@ -7,7 +7,7 @@
<link rel="index" title="Index" href="../genindex.html" /><link rel="search" title="Search" href="../search.html" /><link rel="next" title="Cleaning" href="cleaning.html" /><link rel="prev" title="Bricks" href="../bricks.html" />

<link rel="shortcut icon" href="../_static/unstructured_small.png" /><meta name="generator" content="sphinx-6.2.1, furo 2023.07.26" />
-<title>Partitioning - Unstructured 0.10.19 documentation</title>
+<title>Partitioning - Unstructured 0.10.20 documentation</title>
<link rel="stylesheet" type="text/css" href="../_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="../_static/styles/furo.css?digest=369552022d0b975c8e74270ce6eabe0fb7978f24" />
<link rel="stylesheet" type="text/css" href="../_static/tabs.css" />
2 changes: 1 addition & 1 deletion bricks/staging.html
@@ -7,7 +7,7 @@
<link rel="index" title="Index" href="../genindex.html" /><link rel="search" title="Search" href="../search.html" /><link rel="next" title="Chunking" href="chunking.html" /><link rel="prev" title="Extracting" href="extracting.html" />

<link rel="shortcut icon" href="../_static/unstructured_small.png" /><meta name="generator" content="sphinx-6.2.1, furo 2023.07.26" />
-<title>Staging - Unstructured 0.10.19 documentation</title>
+<title>Staging - Unstructured 0.10.20 documentation</title>
<link rel="stylesheet" type="text/css" href="../_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="../_static/styles/furo.css?digest=369552022d0b975c8e74270ce6eabe0fb7978f24" />
<link rel="stylesheet" type="text/css" href="../_static/tabs.css" />
2 changes: 1 addition & 1 deletion destination_connectors.html
@@ -7,7 +7,7 @@
<link rel="index" title="Index" href="genindex.html" /><link rel="search" title="Search" href="search.html" /><link rel="next" title="Delta Table" href="destination_connectors/delta_table.html" /><link rel="prev" title="Wikipedia" href="source_connectors/wikipedia.html" />

<link rel="shortcut icon" href="_static/unstructured_small.png" /><meta name="generator" content="sphinx-6.2.1, furo 2023.07.26" />
-<title>Destination Connectors - Unstructured 0.10.19 documentation</title>
+<title>Destination Connectors - Unstructured 0.10.20 documentation</title>
<link rel="stylesheet" type="text/css" href="_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="_static/styles/furo.css?digest=369552022d0b975c8e74270ce6eabe0fb7978f24" />
<link rel="stylesheet" type="text/css" href="_static/tabs.css" />
2 changes: 1 addition & 1 deletion destination_connectors/azure_cognitive_search.html
@@ -7,7 +7,7 @@
<link rel="index" title="Index" href="../genindex.html" /><link rel="search" title="Search" href="../search.html" /><link rel="next" title="Metadata" href="../metadata.html" /><link rel="prev" title="Delta Table" href="delta_table.html" />

<link rel="shortcut icon" href="../_static/unstructured_small.png" /><meta name="generator" content="sphinx-6.2.1, furo 2023.07.26" />
-<title>Azure Cognitive Search - Unstructured 0.10.19 documentation</title>
+<title>Azure Cognitive Search - Unstructured 0.10.20 documentation</title>
<link rel="stylesheet" type="text/css" href="../_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="../_static/styles/furo.css?digest=369552022d0b975c8e74270ce6eabe0fb7978f24" />
<link rel="stylesheet" type="text/css" href="../_static/tabs.css" />
2 changes: 1 addition & 1 deletion destination_connectors/delta_table.html
@@ -7,7 +7,7 @@
<link rel="index" title="Index" href="../genindex.html" /><link rel="search" title="Search" href="../search.html" /><link rel="next" title="Azure Cognitive Search" href="azure_cognitive_search.html" /><link rel="prev" title="Destination Connectors" href="../destination_connectors.html" />

<link rel="shortcut icon" href="../_static/unstructured_small.png" /><meta name="generator" content="sphinx-6.2.1, furo 2023.07.26" />
-<title>Delta Table - Unstructured 0.10.19 documentation</title>
+<title>Delta Table - Unstructured 0.10.20 documentation</title>
<link rel="stylesheet" type="text/css" href="../_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="../_static/styles/furo.css?digest=369552022d0b975c8e74270ce6eabe0fb7978f24" />
<link rel="stylesheet" type="text/css" href="../_static/tabs.css" />
2 changes: 1 addition & 1 deletion examples.html
@@ -7,7 +7,7 @@
<link rel="index" title="Index" href="genindex.html" /><link rel="search" title="Search" href="search.html" /><link rel="next" title="Integrations" href="integrations.html" /><link rel="prev" title="Metadata" href="metadata.html" />

<link rel="shortcut icon" href="_static/unstructured_small.png" /><meta name="generator" content="sphinx-6.2.1, furo 2023.07.26" />
-<title>Examples - Unstructured 0.10.19 documentation</title>
+<title>Examples - Unstructured 0.10.20 documentation</title>
<link rel="stylesheet" type="text/css" href="_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="_static/styles/furo.css?digest=369552022d0b975c8e74270ce6eabe0fb7978f24" />
<link rel="stylesheet" type="text/css" href="_static/tabs.css" />
2 changes: 1 addition & 1 deletion genindex.html
@@ -5,7 +5,7 @@
<meta name="viewport" content="width=device-width,initial-scale=1" />
<meta name="color-scheme" content="light dark"><link rel="index" title="Index" href="#" /><link rel="search" title="Search" href="search.html" />

-<link rel="shortcut icon" href="_static/unstructured_small.png" /><meta name="generator" content="sphinx-6.2.1, furo 2023.07.26" /><title>Index - Unstructured 0.10.19 documentation</title>
+<link rel="shortcut icon" href="_static/unstructured_small.png" /><meta name="generator" content="sphinx-6.2.1, furo 2023.07.26" /><title>Index - Unstructured 0.10.20 documentation</title>
<link rel="stylesheet" type="text/css" href="_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="_static/styles/furo.css?digest=369552022d0b975c8e74270ce6eabe0fb7978f24" />
<link rel="stylesheet" type="text/css" href="_static/tabs.css" />
2 changes: 1 addition & 1 deletion index.html
@@ -7,7 +7,7 @@
<link rel="index" title="Index" href="genindex.html" /><link rel="search" title="Search" href="search.html" /><link rel="next" title="Introduction" href="introduction.html" />

<link rel="shortcut icon" href="_static/unstructured_small.png" /><meta name="generator" content="sphinx-6.2.1, furo 2023.07.26" />
-<title>Unstructured 0.10.19 documentation</title>
+<title>Unstructured 0.10.20 documentation</title>
<link rel="stylesheet" type="text/css" href="_static/pygments.css" />
<link rel="stylesheet" type="text/css" href="_static/styles/furo.css?digest=369552022d0b975c8e74270ce6eabe0fb7978f24" />
<link rel="stylesheet" type="text/css" href="_static/tabs.css" />