Added different embedding models #1800
base: master
Conversation
Walkthrough
The changes update the `LLMProcessor` to generate embeddings from multiple models.
Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant LLMProcessor
    Client->>LLMProcessor: train()
    LLMProcessor->>LLMProcessor: load_sparse_embedding_model()
    LLMProcessor->>LLMProcessor: load_rerank_embedding_model()
    Note over LLMProcessor: Initialize embedding models & configurations
    Client->>LLMProcessor: get_embedding(sentence)
    LLMProcessor->>LLMProcessor: check_empty_string(sentence)
    LLMProcessor->>LLMProcessor: get_sparse_embedding(sentence)
    LLMProcessor->>LLMProcessor: get_rerank_embedding(sentence)
    LLMProcessor->>Client: Return embeddings as dict keyed by model names
```
Actionable comments posted: 0
🧹 Nitpick comments (7)
kairon/shared/llm/processor.py (7)
Lines 76-77: Consider lazy loading or error handling.
Loading both sparse and rerank embedding models during training can be time-consuming if the models are large. You might consider lazy loading, or implement error handling in case a model download fails.
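One way to act on this suggestion, sketched below with assumed names (`_download_sparse_model` is a hypothetical stand-in for the real download step, e.g. a fastembed model constructor): load the model on first use, cache it, and surface download failures instead of letting `train()` crash mid-way.

```python
# Hypothetical sketch, not the PR's exact code: lazy-load a model on
# first use and log/re-raise if the download or initialization fails.
import logging

logger = logging.getLogger(__name__)


def _download_sparse_model():
    # Stand-in for the real model download (e.g. fastembed's BM25 model).
    return {"model": "Qdrant/bm25"}


class LLMProcessor:
    _sparse_embedding = None  # cached after the first successful load

    @classmethod
    def load_sparse_embedding_model(cls):
        if cls._sparse_embedding is None:
            try:
                cls._sparse_embedding = _download_sparse_model()
            except Exception:
                logger.exception("Failed to load sparse embedding model")
                raise
        return cls._sparse_embedding
```

Subsequent calls reuse the cached model, so training only pays the load cost once.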
Lines 118-120: Remove commented-out code if no longer needed.
These lines referencing the old single-vector approach are commented out. Consider removing them or adding a clear comment explaining why they are retained.
Lines 520-527: Use logging instead of print.
"SPARSE MODEL LOADED" is useful feedback, but consider `logging.info(...)` instead of `print(...)` for standardized logging in production.
Lines 528-535: Same note on print statements.
As with sparse loading, prefer `logging.info(...)` for "RERANK MODEL LOADED" to maintain consistent logging.
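A minimal sketch of the suggested swap (the logger name mirrors the module path; the function body is a placeholder, not the PR's actual loading code):

```python
# Route status messages through the logging framework instead of print,
# so they carry levels/timestamps and can be filtered in production.
import logging

logger = logging.getLogger("kairon.shared.llm.processor")


def load_sparse_embedding_model():
    # ... actual model initialization would go here ...
    logger.info("SPARSE MODEL LOADED")  # was: print("SPARSE MODEL LOADED")
```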
Lines 536-557: Consider removing debug prints and safeguarding array access.
- Replace direct `print(embedding)` with logging, or remove it.
- If `self._sparse_embedding.passage_embed(sentence)` unexpectedly returns an empty list, `embedding[0]` will fail. Consider a pre-check or exception handling.
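The pre-check could look like the following sketch (`FakeSparseModel` is a stand-in for the real sparse model; the real `passage_embed` signature may differ):

```python
class FakeSparseModel:
    """Stand-in for the real sparse embedding model (e.g. BM25)."""

    def passage_embed(self, sentences):
        # Yields one embedding per passage, mimicking the real API shape.
        for _ in sentences:
            yield {"indices": [1, 7], "values": [0.4, 0.6]}


def get_sparse_embedding(sparse_model, sentence):
    embeddings = list(sparse_model.passage_embed([sentence]))
    if not embeddings:
        # Fail loudly with context instead of an opaque IndexError.
        raise ValueError(f"no sparse embedding produced for {sentence!r}")
    return embeddings[0]
```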
Lines 559-573: Fix docstring mismatch in rerank embedding.
The docstring says "Generate a sparse embedding," but this method actually returns a rerank embedding from the ColBERT model. Update the docstring to avoid confusion.
Lines 574-581: Simplify `check_empty_string` and handle potential None.
According to static analysis, you can simplify this function to return a boolean in one line. Make sure you handle `None` safely if it's ever passed. Example fix:

```diff
-        if not value:
-            return True
-        if not value.strip():
-            return True
-        else:
-            return False
+        return not value or not value.strip()
```

🧰 Tools
🪛 Ruff (0.8.2)
578-581: Return the condition `bool(not value.strip())` directly. Replace with `return bool(not value.strip())` (SIM103)
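The simplified helper, with the `None` case covered by short-circuit evaluation, could read:

```python
def check_empty_string(value) -> bool:
    """Return True when value is None, empty, or whitespace-only."""
    # `not value` catches None and "" before .strip() is ever called.
    return not value or not value.strip()
```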
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
kairon/shared/llm/processor.py (9 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
kairon/shared/llm/processor.py
578-581: Return the condition `bool(not value.strip())` directly. Replace with `return bool(not value.strip())` (SIM103)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: Codacy Static Code Analysis
- GitHub Check: Analyze (python)
- GitHub Check: Python CI
🔇 Additional comments (12)
kairon/shared/llm/processor.py (12)
Lines 33-34: Be mindful of concurrency issues with class-level embeddings.
Since `_sparse_embedding` and `_rerank_embedding` are defined as class-level attributes rather than instance-level, concurrent usage or multithreading might cause unexpected behavior or race conditions. If multiple bots and threads share the same class, ensure proper synchronization or refactor to instance-level attributes as needed.
Lines 45-57: Configuration approach looks good.
Defining `vectors_config` for multiple models is a clean approach to adding new or variant embeddings. No major concerns here.
Lines 59-62: Sparse config is an effective placeholder.
Keeping a separate config dictionary (`sparse_vectors_config`) for BM25 is future-friendly. Just confirm that you no longer need the commented-out `vector_config`.
Lines 121-132: Validate matching array lengths when building points.
You build a dictionary of embeddings per model for each index. Verify that each embedding array matches the length of `vector_ids` to avoid misalignment issues.
Line 206: Docstring aligns with multi-model approach.
Your updated docstring clarifies that multiple models are used for embeddings. Great clarity for maintainers.
Line 212: Properly initializing the embeddings dictionary.
Storing embeddings in a dict by model name is a solid approach for organizing multi-model outputs.
Line 221: Litellm usage looks correct.
You're extracting embedding vectors from `result["data"]` properly.
Lines 223-227: BM25 embedding usage.
These lines populate the `'bm25'` key with sparse embeddings. Ensure your downstream logic fully supports sparse vectors if you plan to query them.
Lines 228-231: ColBERT embeddings stored correctly.
You invoke `get_rerank_embedding` to populate the `'colbertv2.0'` key. The usage here is consistent with the multi-model design.
Line 234: Good single-text return structure.
Returning a dictionary of model → single embedding for single-text inputs keeps usage consistent with multi-text outputs.
Lines 345-347: Review Qdrant multivector support.
Assigning both `'vectors'` and `'sparse_vectors'` in the creation payload is correct for multi-vector indexing, but confirm your Qdrant deployment supports these features (version compatibility, etc.).
Line 518: Return statement is straightforward.
No issues here; the function properly modifies and returns the user's message.
Summary by CodeRabbit
New Features
Refactor