20240912 flatten module refs (#2591)
* Flatten module refs

* Move files in module dir

* Update redirects for flattening

* fix links

* missing savefile

* Update page title

* fix path
databyjp authored Sep 12, 2024
1 parent 5bde5f4 commit fee4850
Showing 51 changed files with 108 additions and 151 deletions.
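
The link changes in the files below follow a consistent pattern: module pages lose the intermediate section directory, and links to a section index now point at the model providers page. The page does not show how the rewrite was actually performed, so the following is only a rough Python sketch of that pattern. The function name `flatten_module_link` and the regular expression are hypothetical; only the before/after URL shapes are taken from the diff.

```python
import re

# Section directories that appear in the old URLs of this diff.
_SECTIONS = ("retriever-vectorizer-modules", "reader-generator-modules")

# Matches the old prefix plus an optional page slug, e.g.
# /developers/weaviate/modules/retriever-vectorizer-modules/img2vec-neural
_OLD_LINK = re.compile(
    r"/developers/weaviate/modules/(?:"
    + "|".join(_SECTIONS)
    + r")(?:/(?P<page>[a-z0-9-]+))?"
)


def flatten_module_link(text: str) -> str:
    """Rewrite old nested module links in `text` to the flattened form."""

    def _swap(match: re.Match) -> str:
        page = match.group("page")
        if page:
            # A specific module page: drop the section directory.
            return f"/developers/weaviate/modules/{page}"
        # A bare section index: point at the model providers page instead.
        return "/developers/weaviate/model-providers"

    return _OLD_LINK.sub(_swap, text)


print(flatten_module_link(
    "/developers/weaviate/modules/retriever-vectorizer-modules/img2vec-neural#nearimage-search"
))
```

Anchors and host prefixes are left untouched because the pattern only matches the path segment. Links that are already flattened do not match, so running the function twice gives the same result as running it once. Paths such as `.../reader-generator-modules/index.md` are not covered by this sketch and would need separate handling.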
2 changes: 1 addition & 1 deletion blog/2022-09-07-weaviate-1-15-release/index.mdx
@@ -406,7 +406,7 @@ Which would return the following result:
}
```
-Head to the [Summarization Module docs page](/developers/weaviate/modules/reader-generator-modules/sum-transformers) to learn more.
+Head to the [Summarization Module docs page](/developers/weaviate/modules/sum-transformers) to learn more.
### Hugging Face Module
The Hugging Face module (`text2vec-huggingface`) opens up doors to over 600 [Hugging Face sentence similarity models](https://huggingface.co/models?pipeline_tag=sentence-similarity), ready to be used in Weaviate as a vectorization module.
@@ -45,7 +45,7 @@ For a use case like ours, where we would like to identify similar types of dogs,
The `img2vec-neural` module in Weaviate is designed to solve this exact problem! The module vectorizes each image to something that represents its contents, so that we can search images based on their semantic similarity. In other words, we can use `img2vec-neural` to query our database to see how similar the dogs are based on the image.

### Img2vec-neural Module
-[Weaviate's img2vec-neural module](/developers/weaviate/modules/retriever-vectorizer-modules/img2vec-neural) is a flexible vectorizer that enables conversion of images to meaningful vectors. `ResNet-50` is the first model that is supported on Weaviate. `ResNet-50` is a Convolutional Neural Network (CNN) that was trained on the [ImageNet database](https://www.image-net.org/). The model was trained on more than 10 million images and 20,000 classes.
+[Weaviate's img2vec-neural module](/developers/weaviate/modules/img2vec-neural) is a flexible vectorizer that enables conversion of images to meaningful vectors. `ResNet-50` is the first model that is supported on Weaviate. `ResNet-50` is a Convolutional Neural Network (CNN) that was trained on the [ImageNet database](https://www.image-net.org/). The model was trained on more than 10 million images and 20,000 classes.

## Weaviate Database
### Setup
@@ -215,7 +215,7 @@ app.config["UPLOAD_FOLDER"] = "/temp_images"
client = weaviate.Client("http://localhost:8080")
```

-We will use the [`nearImage`](/developers/weaviate/modules/retriever-vectorizer-modules/img2vec-neural#nearimage-search) operator in Weaviate, so that it will search for images closest to the image uploaded by the user. To do this we will construct the `weaviate_img_search` function to get the relevant results. The response from our search query will include the closest objects in the Dog class. From the response, the function will output the dog image with the breed name and filepath. Note that the query is also formulated so that the response is limited to two results.
+We will use the [`nearImage`](/developers/weaviate/modules/img2vec-neural#nearimage-search) operator in Weaviate, so that it will search for images closest to the image uploaded by the user. To do this we will construct the `weaviate_img_search` function to get the relevant results. The response from our search query will include the closest objects in the Dog class. From the response, the function will output the dog image with the breed name and filepath. Note that the query is also formulated so that the response is limited to two results.

```python
def weaviate_img_search(img_str):
4 changes: 2 additions & 2 deletions blog/2022-11-01-weaviate-1-16-release/_core-1-16-include.mdx
@@ -168,7 +168,7 @@ Every great choreographed dance needs a conductor. And the conductor of this who
## ref2vec-centroid Module
![ref2vec-centroid Module](./img/ref2vec-centroid.png)

-Weaviate `1.16` unveils the [ref2vec-centroid](/developers/weaviate/modules/retriever-vectorizer-modules/ref2vec-centroid) module! Ref2Vec is about representing a data object based on the objects it references. The `ref2vec-centroid` module uses the average, or centroid vector, of the cross-referenced vectors to represent the referencing object.
+Weaviate `1.16` unveils the [ref2vec-centroid](/developers/weaviate/modules/ref2vec-centroid) module! Ref2Vec is about representing a data object based on the objects it references. The `ref2vec-centroid` module uses the average, or centroid vector, of the cross-referenced vectors to represent the referencing object.

Or in other words, if you have an object (i.e. a shopping basket) that contains a number of cross-references (i.e. *"shorts"*, *"shoes"*, and a *"t-shirt"*), Ref2Vec can provide you with a vector that is at the center (i.e. close to all other similar clothing items). This way you can use the references to find more relevant objects.

@@ -201,7 +201,7 @@ As excited as we are about the applications in personalization and recommendatio

### Learn more

-Check the [ref2vec-centroid](/developers/weaviate/modules/retriever-vectorizer-modules/ref2vec-centroid) documentation to learn how to work with Ref2Vec.
+Check the [ref2vec-centroid](/developers/weaviate/modules/ref2vec-centroid) documentation to learn how to work with Ref2Vec.

## Node Status API
![Node status API](./img/node-status-api.png)
2 changes: 1 addition & 1 deletion blog/2022-11-23-ref2vec-centroid/index.mdx
@@ -11,7 +11,7 @@ description: "Weaviate introduces Ref2Vec, a new module that utilises Cross-Refe

<!-- truncate -->

-Weaviate 1.16 introduced the [Ref2Vec](/developers/weaviate/modules/retriever-vectorizer-modules/ref2vec-centroid) module. In this article, we give you an overview of what Ref2Vec is and some examples in which it can add value such as recommendations or representing long objects.
+Weaviate 1.16 introduced the [Ref2Vec](/developers/weaviate/modules/ref2vec-centroid) module. In this article, we give you an overview of what Ref2Vec is and some examples in which it can add value such as recommendations or representing long objects.

## What is Ref2Vec?
The name Ref2Vec is short for reference-to-vector, and it offers the ability to vectorize a data object with its cross-references to other objects. The Ref2Vec module currently holds the name ref2vec-**centroid** because it uses the average, or centroid vector, of the cross-referenced vectors to represent the **referencing** object. The benefit of this approach is that the referencing object can be characterized from its actions and relationships as well as refined over time.
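
To make the "average, or centroid vector" concrete, here is a minimal Python sketch of the arithmetic. It only illustrates the idea and is not the `ref2vec-centroid` module's implementation; the function name `centroid` and the sample vectors are assumptions made for this example.

```python
from typing import Sequence


def centroid(vectors: Sequence[Sequence[float]]) -> list[float]:
    """Return the element-wise mean of equally sized vectors."""
    if not vectors:
        raise ValueError("at least one referenced vector is required")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same dimensionality")
    # Average each dimension across all referenced vectors.
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


# Hypothetical vectors of three cross-referenced objects.
referenced = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
print(centroid(referenced))
```

Adding or removing a cross-reference changes the set of vectors being averaged, which is why the referencing object's representation can be refined over time.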
2 changes: 1 addition & 1 deletion blog/2023-01-31-weaviate-podcast-search/index.mdx
@@ -130,7 +130,7 @@ In terms of the user experience for a **single podcast**... I think this is an i
I think the real innovation in the podcasting user experience is in a **collection of podcasts**. A collection of podcasts now achieves knowledge base reference functionality, similar to code documentation or survey papers!

Further I think this will transform podcast **recommendation**. One way of doing this could be to use each chunk of a podcast to search across all the atomic chunks of other podcasts. Or we could construct a single embedding for the entire podcast episode.
-This is one of my favorite applications of our new [`ref2vec` module](/developers/weaviate/modules/retriever-vectorizer-modules/ref2vec-centroid). The first solution to ref2vec describes constructing Podcast (hasSegment) Segment cross-references, and then averaging the vectors of the segments to form the `Podcast` vector. We are exploring additional aggregation ideas as well such as:
+This is one of my favorite applications of our new [`ref2vec` module](/developers/weaviate/modules/ref2vec-centroid). The first solution to ref2vec describes constructing Podcast (hasSegment) Segment cross-references, and then averaging the vectors of the segments to form the `Podcast` vector. We are exploring additional aggregation ideas as well such as:

* Ref2vec-centroids: Cluster the return multiple vectors<!-- TODO --> to represent `Podcast`. This also requires solving how we want to add multi-vector object representations to Weaviate.

4 changes: 2 additions & 2 deletions blog/2023-02-28-solution-to-tl-drs/index.mdx
@@ -19,7 +19,7 @@ In this day and age, this is a more common problem than ever. For a while now, t

You probably already know that Weaviate, as a vector database, can help with information cataloging and discovery. But did you know that Weaviate can also summarize information during retrieval?

-Our summarizer module ([`sum-transformers`](/developers/weaviate/modules/reader-generator-modules/sum-transformers)) can be added to a Weaviate instance to do exactly that.
+Our summarizer module ([`sum-transformers`](/developers/weaviate/modules/sum-transformers)) can be added to a Weaviate instance to do exactly that.

And as a bonus, we will also show you how to use our new generative module (`generative-openai`) to do the same thing as well.

@@ -409,7 +409,7 @@ It will run much faster on systems that support GPU acceleration with CUDA. CPUs

Currently, the `sum-transformers` module uses the `bart-large-cnn` model under the hood by default, with an option for the `pegasus-xsum` model. Both of these are well-known, high-performance models trained by Facebook and Google respectively.

-In addition to these two models, however, you can use any model from the Hugging Face Hub (or your own) by following [this guide](/developers/weaviate/modules/reader-generator-modules/sum-transformers#use-another-summarization-module-from-hugging-face).
+In addition to these two models, however, you can use any model from the Hugging Face Hub (or your own) by following [this guide](/developers/weaviate/modules/sum-transformers#use-another-summarization-module-from-hugging-face).

Even when looking only at language models that are trained for summarization tasks, there is still a wide range of choices in terms of sheer numbers, which vary in the target domain (e.g. medical, legal, scientific, etc.) and size (number of parameters, i.e. speed). If you have specific needs, we recommend investigating other models.

@@ -160,7 +160,7 @@ go_memstats_heap_inuse_bytes
Batch latency is important as batch operations are the most efficient way to write data to
Weaviate. Monitoring this can give an indication if there is a problem with indexing data. This metric has a label `operation` which
-allows you to see how long objects, vectors, and inverted index sub operations take. If you are using a [vectorizer module](/developers/weaviate/modules/retriever-vectorizer-modules) you will see additional latency due to the overhead of sending data to the module.
+allows you to see how long objects, vectors, and inverted index sub operations take. If you are using a [vectorizer module](/developers/weaviate/model-providers/) you will see additional latency due to the overhead of sending data to the module.
```
rate(batch_durations_ms_sum[30s])/rate(batch_durations_ms_count[30s])
2 changes: 1 addition & 1 deletion blog/2023-05-23-pdfs-to-weaviate/index.mdx
@@ -135,7 +135,7 @@ from weaviate.embedded import EmbeddedOptions
import os
```

-In this example, we are using [Embedded Weaviate](/developers/weaviate/installation/embedded). You can also run it on [WCD](https://console.weaviate.cloud) or [docker](/developers/weaviate/installation/docker-compose). This demo is also using OpenAI for vectorization; you can choose another `text2vec` module [here](/developers/weaviate/modules/retriever-vectorizer-modules).
+In this example, we are using [Embedded Weaviate](/developers/weaviate/installation/embedded). You can also run it on [WCD](https://console.weaviate.cloud) or [docker](/developers/weaviate/installation/docker-compose). This demo is also using OpenAI for vectorization; you can choose another `text2vec` module [here](/developers/weaviate/model-providers).

```python
client = weaviate.Client(
2 changes: 1 addition & 1 deletion blog/2023-07-18-automated-testing/index.mdx
@@ -66,7 +66,7 @@ But Embedded Weaviate is not the only way to make testing easier. In the followi
## Scoping tests

While you may be familiar with tests and integration tests in general, here are some specific suggestions for Weaviate-powered applications:
-* **Whether to test search quality**: This depends primarily on the model used for vectorization, such as by a [Weaviate vectorizer module](/developers/weaviate/modules/retriever-vectorizer-modules). We suggest evaluating models separately, but not tested as a part of the application.
+* **Whether to test search quality**: This depends primarily on the model used for vectorization, such as by a [Weaviate vectorizer module](/developers/weaviate/model-providers). We suggest evaluating models separately, but not tested as a part of the application.
* **Focus on interactions with the inference provider**: Search itself is a core Weaviate functionality that we can trust. So, we suggest any integration tests focus on the interaction with the inference provider. For example,
* is the vectorization model the expected one?
* if switching to a different inference provider or model, does the application still function as expected?
2 changes: 1 addition & 1 deletion blog/2024-01-30-weaviate-non-english-unicode/index.mdx
@@ -28,7 +28,7 @@ Different languages, such as vocabulary, grammar, and alphabets, differ in many

### Language models

-Whether you are using English or any other language, you need to make sure the embedding model and LLM you are using support your specific language. For example, [`text2vec-cohere`’s `embed-multilingual-v3.0`](https://weaviate.io/blog/cohere-multilingual-with-weaviate) or `text2vec-openai`’s `text-embedding-ada-002` both support multiple languages. Make sure to check the chosen [vectorizer module’s](https://weaviate.io/developers/weaviate/modules/retriever-vectorizer-modules) documentation to ensure the embedding model supports your language of choice. The same applies to the [generator modules](https://weaviate.io/developers/weaviate/modules/reader-generator-modules).
+Whether you are using English or any other language, you need to make sure the embedding model and LLM you are using support your specific language. For example, [`text2vec-cohere`’s `embed-multilingual-v3.0`](https://weaviate.io/blog/cohere-multilingual-with-weaviate) or `text2vec-openai`’s `text-embedding-ada-002` both support multiple languages. Make sure to check the chosen [vectorizer module’s](https://weaviate.io/developers/weaviate/model-providers) documentation to ensure the embedding model supports your language of choice. The same applies to the [generator integrations](/developers/weaviate/model-providers).

### Character encoding

4 changes: 2 additions & 2 deletions blog/2024-02-13-fine-tuning-cohere-reranker/index.mdx
@@ -418,7 +418,7 @@ print(json.dumps(response, indent=2))
"Get": {
"BlogsFineTuned": [
{
-"content": "---\ntitle: What is Ref2Vec and why you need it for your recommendation system\nslug: ref2vec-centroid\nauthors: [connor]\ndate: 2022-11-23\ntags: ['integrations', 'concepts']\nimage: ./img/hero.png\ndescription: \"Weaviate introduces Ref2Vec, a new module that utilises Cross-References for Recommendation!\"\n---\n![Ref2vec-centroid](./img/hero.png)\n\n<!-- truncate -->\n\nWeaviate 1.16 introduced the [Ref2Vec](/developers/weaviate/modules/retriever-vectorizer-modules/ref2vec-centroid) module. In this article, we give you an overview of what Ref2Vec is and some examples in which it can add value such as recommendations or representing long objects. ## What is Ref2Vec? The name Ref2Vec is short for reference-to-vector, and it offers the ability to vectorize a data object with its cross-references to other objects. The Ref2Vec module currently holds the name ref2vec-**centroid** because it uses the average, or centroid vector, of the cross-referenced vectors to represent the **referencing** object."
+"content": "---\ntitle: What is Ref2Vec and why you need it for your recommendation system\nslug: ref2vec-centroid\nauthors: [connor]\ndate: 2022-11-23\ntags: ['integrations', 'concepts']\nimage: ./img/hero.png\ndescription: \"Weaviate introduces Ref2Vec, a new module that utilises Cross-References for Recommendation!\"\n---\n![Ref2vec-centroid](./img/hero.png)\n\n<!-- truncate -->\n\nWeaviate 1.16 introduced the [Ref2Vec](/developers/weaviate/modules/ref2vec-centroid) module. In this article, we give you an overview of what Ref2Vec is and some examples in which it can add value such as recommendations or representing long objects. ## What is Ref2Vec? The name Ref2Vec is short for reference-to-vector, and it offers the ability to vectorize a data object with its cross-references to other objects. The Ref2Vec module currently holds the name ref2vec-**centroid** because it uses the average, or centroid vector, of the cross-referenced vectors to represent the **referencing** object."
},
{
"content": "In other words, the User vector is being updated in real-time here to take into account their preferences and actions, which helps to produce more relevant results at speed. Another benefit of Ref2Vec is that this calculation is not compute-heavy, leading to low overhead. With Ref2Vec, you can use Weaviate to provide Recommendation with \"user-as-query\". This is a very common and powerful way to build Home Feed style features in apps. This can be done by sending queries like this to Weaviate:\n\n```graphql\n{\n Get {\n Product (\n nearObject: {\n id: \"8abc5-4d5...\" # id for the User object with vector defined by ref2vec-centroid\n }\n ) {\n product_name\n price\n }\n }\n}\n```\n\nThis short query encapsulates the power of Ref2Vec."
@@ -499,7 +499,7 @@ print(json.dumps(response, indent=2))
}
]
},
-"content": "---\ntitle: What is Ref2Vec and why you need it for your recommendation system\nslug: ref2vec-centroid\nauthors: [connor]\ndate: 2022-11-23\ntags: ['integrations', 'concepts']\nimage: ./img/hero.png\ndescription: \"Weaviate introduces Ref2Vec, a new module that utilises Cross-References for Recommendation!\"\n---\n![Ref2vec-centroid](./img/hero.png)\n\n<!-- truncate -->\n\nWeaviate 1.16 introduced the [Ref2Vec](/developers/weaviate/modules/retriever-vectorizer-modules/ref2vec-centroid) module. In this article, we give you an overview of what Ref2Vec is and some examples in which it can add value such as recommendations or representing long objects. ## What is Ref2Vec? The name Ref2Vec is short for reference-to-vector, and it offers the ability to vectorize a data object with its cross-references to other objects. The Ref2Vec module currently holds the name ref2vec-**centroid** because it uses the average, or centroid vector, of the cross-referenced vectors to represent the **referencing** object."
+"content": "---\ntitle: What is Ref2Vec and why you need it for your recommendation system\nslug: ref2vec-centroid\nauthors: [connor]\ndate: 2022-11-23\ntags: ['integrations', 'concepts']\nimage: ./img/hero.png\ndescription: \"Weaviate introduces Ref2Vec, a new module that utilises Cross-References for Recommendation!\"\n---\n![Ref2vec-centroid](./img/hero.png)\n\n<!-- truncate -->\n\nWeaviate 1.16 introduced the [Ref2Vec](/developers/weaviate/modules/ref2vec-centroid) module. In this article, we give you an overview of what Ref2Vec is and some examples in which it can add value such as recommendations or representing long objects. ## What is Ref2Vec? The name Ref2Vec is short for reference-to-vector, and it offers the ability to vectorize a data object with its cross-references to other objects. The Ref2Vec module currently holds the name ref2vec-**centroid** because it uses the average, or centroid vector, of the cross-referenced vectors to represent the **referencing** object."
},
{
"_additional": {
@@ -25,8 +25,6 @@ To use generative search, a `generative-xxx` module must be enabled in the Weavi

If you are using WCD, generative modules are enabled by default ([see docs](/developers/wcs#configuration). Otherwise, you must configure your Weaviate instance to make sure that a generative module is enabled.

-This is outside the scope of this unit, but you can refer to the [module configuration](/developers/weaviate/modules/reader-generator-modules/index.md) for information on how to configure each module.
-
### <i class="fa-solid fa-code"></i> Configure classes

If only one generative module is enabled for the Weaviate instance, Weaviate will automatically use that module for all generative tasks.