If you've embedded a dataset with nomic-embed-text-v1.5, you can enable "process SAE" in the embed step.
This annotates each row with SAE (sparse autoencoder) features from https://enjalot.github.io/latent-taxonomy/articles/about
You can then explore what are essentially the concepts that the embedding model uses to represent each data point.
You can also filter by a particular SAE feature to see which rows strongly activate for that concept.
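
To make the filtering idea concrete, here is a minimal sketch in Python. It is not the tool's actual code or storage format: it assumes, purely for illustration, that each row's SAE annotation is stored as two arrays of its top-k feature ids and their non-negative activations. The function name `rows_activating_feature`, the array layout, and the `threshold` parameter are all hypothetical.

```python
import numpy as np


def rows_activating_feature(top_indices, top_acts, feature_id, threshold=0.0):
    """Find rows where `feature_id` is active above `threshold`.

    Assumed (hypothetical) layout: `top_indices[i]` holds the top-k SAE
    feature ids for row i, and `top_acts[i]` holds the matching
    non-negative activations.

    Returns (row_ids, activations), sorted by descending activation.
    """
    top_indices = np.asarray(top_indices)
    top_acts = np.asarray(top_acts, dtype=float)

    # True where a slot holds the requested feature with a strong enough activation.
    mask = (top_indices == feature_id) & (top_acts > threshold)

    # Activation of the feature per row; 0.0 where the feature is absent.
    acts = np.where(mask, top_acts, 0.0).max(axis=1)

    rows = np.flatnonzero(mask.any(axis=1))
    order = np.argsort(-acts[rows], kind="stable")
    return rows[order], acts[rows][order]


# Toy data: 4 rows, top-2 features each (made-up values).
top_indices = [[3, 7], [7, 1], [2, 5], [7, 3]]
top_acts = [[0.9, 0.2], [0.8, 0.1], [0.5, 0.4], [0.05, 0.6]]

rows, acts = rows_activating_feature(top_indices, top_acts, feature_id=7, threshold=0.1)
print(rows.tolist(), acts.tolist())
```

With the toy data, rows 1 and 0 are returned in that order; row 3 also contains feature 7 but falls below the threshold. Sorting by activation mirrors the idea of surfacing the rows that most strongly express a concept.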
![Screenshot 2024-12-20 at 11 05 43 AM](https://private-user-images.githubusercontent.com/96189/397796441-1f52ffc3-9ccd-4c6d-89f5-474bdd156dd8.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkyNTI1MzIsIm5iZiI6MTczOTI1MjIzMiwicGF0aCI6Ii85NjE4OS8zOTc3OTY0NDEtMWY1MmZmYzMtOWNjZC00YzZkLTg5ZjUtNDc0YmRkMTU2ZGQ4LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTElMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjExVDA1MzcxMlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTU3MDg5YTNjNTdkNDJmYmIxYmRmNDAzNjU5ZGQzMGEzNzUxZDE4MmFiODUzNjIzNzRmNDEwNjg0YjQ3OGRlYWUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.BrkbQ9WNDgDt2iMroUglegKs6IOFzu4HXf6SvcCSZjE)
![Screenshot 2024-12-20 at 11 05 51 AM](https://private-user-images.githubusercontent.com/96189/397796461-c568b020-aa83-47c9-ad3b-597bef6b6533.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkyNTI1MzIsIm5iZiI6MTczOTI1MjIzMiwicGF0aCI6Ii85NjE4OS8zOTc3OTY0NjEtYzU2OGIwMjAtYWE4My00N2M5LWFkM2ItNTk3YmVmNmI2NTMzLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTElMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjExVDA1MzcxMlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTk1YWUzNGM5YWQ0YjA0MWJiYjIxNGY0OWZkY2Y3ZDhjMTExY2U3YzE3YmQ4OWIxNDY5NWI0MDc2OWQ0ZDQ3ZTcmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.MBWaEetCALs-KRxyxVUgYsV_CADIbr30sTU6mmt8dC4)
![Screenshot 2024-12-20 at 11 06 19 AM](https://private-user-images.githubusercontent.com/96189/397796471-9c49e9a7-c47b-41b2-adfb-7963237b0332.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MzkyNTI1MzIsIm5iZiI6MTczOTI1MjIzMiwicGF0aCI6Ii85NjE4OS8zOTc3OTY0NzEtOWM0OWU5YTctYzQ3Yi00MWIyLWFkZmItNzk2MzIzN2IwMzMyLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTElMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjExVDA1MzcxMlomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWUxZTU3YzJlODNkM2ViOTA2Mzc5ZDI3MjAyODA2YmYzZjBjYjA0MDhlY2YyYTY4Mzc1OTk5MmU4MDI4YzkwZDkmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.VND3HDSeVjG44s6WiQRJxsrCmpZK01OF8S7UwDWPn7w)