Feat/954 llama cpp #1000 (Open)

bikash119 wants to merge 30 commits into base: develop
Conversation

bikash119

This PR adds llama-cpp support to create embeddings.

  from distilabel.embeddings import LlamaCppEmbeddings

  embeddings = LlamaCppEmbeddings(model="second-state/all-MiniLM-L6-v2-Q2_K.gguf")
  embeddings.load()

  results = embeddings.encode(inputs=["distilabel is awesome!", "and Argilla!"])
  # [
  #   [-0.05447685346007347, -0.01623094454407692, ...],
  #   [4.4889533455716446e-05, 0.044016145169734955, ...],
  # ]

bikash119 commented Sep 25, 2024

@davidberenstein1957: I have added a working version of llama-cpp support to generate embeddings. I will add more parameters as we have here.

…beddings should be normalized

- Added test cases for embedding normalization
@bikash119 bikash119 marked this pull request as ready for review September 25, 2024 14:01

bikash119 (Author)

  • Create embeddings using a model from local disk
  • Create embeddings using a Hugging Face Hub model
  • Normalize embeddings by passing normalize_embeddings=True to the LlamaCppEmbeddings class (see the sketch below)
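
For illustration, a minimal sketch of the normalization option (the local path is a placeholder; the model and normalize_embeddings parameters follow the usage described above):

  from distilabel.embeddings import LlamaCppEmbeddings

  # Load a GGUF model from local disk (placeholder path) and request
  # normalized embeddings via the flag added in this PR.
  embeddings = LlamaCppEmbeddings(
      model="models/all-MiniLM-L6-v2-Q2_K.gguf",  # hypothetical local path
      normalize_embeddings=True,
  )
  embeddings.load()

  results = embeddings.encode(inputs=["distilabel is awesome!", "and Argilla!"])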

bikash119 (Author)

@davidberenstein1957: May I request you to review the changes? Also, I couldn't add the documentation to the Component Library or the API Reference; I need a little help to add/update the documentation.

davidberenstein1957 (Member) left a comment

Looking good already. I left some initial comments. Have you already seen #999?

Review threads on: pyproject.toml, src/distilabel/embeddings/llamacpp.py, tests/unit/conftest.py

bikash119 and others added 8 commits September 26, 2024 08:08

- Accept recommended suggestion (Co-authored-by: David Berenstein <[email protected]>)
- Incorporated changes suggested in review comments
- Use atexit to forcefully invoke cleanup (see the sketch below)
- Add test_encode_batch_consistency to ensure consistent results
- Test large batch processing capability
- Verify embedding dimensions and count for different batch sizes
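
The atexit cleanup mentioned above could look roughly like this; a simplified sketch, not the actual diff, with illustrative method names:

  import atexit

  class LlamaCppEmbeddings:  # simplified stand-in for the real class
      def load(self) -> None:
          self._model = ...  # the llama.cpp model would be loaded here
          # Register unload so resources are freed even if it is never
          # called explicitly.
          atexit.register(self.unload)

      def unload(self) -> None:
          self._model = None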

davidberenstein1957 commented Sep 30, 2024 via email

bikash119 (Author)

@davidberenstein1957: OK, let me create a separate PR to handle that scenario. I will keep this PR untouched, and once the scenario is handled using mixins, I will remove the token-handling code from the LlamaCppEmbeddings class.

@bikash119 bikash119 mentioned this pull request Oct 2, 2024
bikash119 (Author)

Hi @davidberenstein1957: I have made the changes to make the model download from the Hub reusable through a mixin. Please share your feedback. If this looks good, I will create another PR to handle the model download in the llamacpp LLM class too.
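
As a rough illustration of the mixin idea (not the actual diff; the class and method names here are hypothetical):

  import os

  from huggingface_hub import hf_hub_download

  class HubDownloaderMixin:
      """Hypothetical mixin that resolves a Hub repo to a local GGUF file."""

      def download_model(self, repo_id: str, filename: str) -> str:
          # If HF_TOKEN is set in the environment, gated/private repos
          # can be downloaded as well.
          return hf_hub_download(
              repo_id=repo_id,
              filename=filename,
              token=os.environ.get("HF_TOKEN"),
          )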

davidberenstein1957 (Member) left a comment

Don't forget to update the docs and API references :)

Review threads on: src/distilabel/embeddings/llamacpp.py, src/distilabel/mixins/hub_downloader.py

…ngs."

This reverts commit 778532f.
HF_TOKEN can be set as env variable to download gated model
- alligned the attribute as per the review comments.

bikash119 commented Oct 3, 2024

> Don't forget to update the docs and API references :)

I have included the LlamaCppEmbeddings class in __init__.py under the embeddings folder. On executing mkdocs serve, I could see the documentation in the Component Gallery and the API reference.
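
For reference, exposing the class for the docs build typically amounts to an import plus an __all__ entry (a sketch of src/distilabel/embeddings/__init__.py; other entries omitted):

  # src/distilabel/embeddings/__init__.py (sketch)
  from distilabel.embeddings.llamacpp import LlamaCppEmbeddings

  __all__ = [
      "LlamaCppEmbeddings",
      # ...existing embedding classes
  ]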

…r --cpu-only.

`pytest tests/unit/embeddings/test_llamacpp.py --cpu-only` will generate embeddings using the CPU
`pytest tests/unit/embeddings/test_llamacpp.py` will generate embeddings using the GPU
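
A --cpu-only flag like the one used above is typically registered in conftest.py; a minimal sketch (the fixture name is illustrative):

  # tests/unit/conftest.py (sketch)
  import pytest

  def pytest_addoption(parser):
      # Register the --cpu-only flag used by the llama-cpp embedding tests.
      parser.addoption(
          "--cpu-only",
          action="store_true",
          default=False,
          help="Generate embeddings on CPU only (e.g. n_gpu_layers=0).",
      )

  @pytest.fixture
  def use_cpu(request) -> bool:
      return request.config.getoption("--cpu-only")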
bikash119 (Author)

Hi @davidberenstein1957: Thank you for taking the time to review my work. I have tried to incorporate your feedback and made a few improvements.

  • Added examples for using both public and private models. For private models, the user has to run export HF_TOKEN=hf... (see the sketch below).
  • Added a --cpu-only option to test embedding generation on CPU.
  • Added examples for public, private, and CPU-based generation.

Please share your feedback.
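
For example, with the token exported beforehand, a private Hub model could be used like this (the repo id is a placeholder; only the HF_TOKEN handling is the point here):

  import os

  from distilabel.embeddings import LlamaCppEmbeddings

  # Assumes `export HF_TOKEN=hf_...` was run beforehand; the token is read
  # from the environment when downloading gated/private models.
  assert os.environ.get("HF_TOKEN"), "set HF_TOKEN to access private models"

  embeddings = LlamaCppEmbeddings(model="my-org/private-minilm-Q2_K.gguf")  # placeholder
  embeddings.load()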

davidberenstein1957 (Member) left a comment

Thanks for the updates. Left some minor comments. :)

Review threads on: src/distilabel/embeddings/llamacpp.py, tests/unit/embeddings/test_llamacpp.py, tests/unit/conftest.py, .gitignore

Inline review comment on src/distilabel/embeddings/llamacpp.py:

  from llama_cpp import Llama


  class LlamaCppEmbeddings(Embeddings, CudaDevicePlacementMixin):

Aren't some of the attributes already present in the parent class?

Review threads on: src/distilabel/embeddings/llamacpp.py, tests/unit/conftest.py, tests/unit/embeddings/test_llamacpp.py
bikash119 and others added 4 commits October 14, 2024 17:19

- try/except block is not needed (Co-authored-by: David Berenstein <[email protected]>)
- Hidden attributes shouldn't be documented (Co-authored-by: David Berenstein <[email protected]>)