Painting to Imagine #1096
Comments
The tokenizer is only used for token counting; this is used for message truncation so that the conversation fits the model's (set) max prompt size. It is only something to worry about if you're hitting max context limits or want to reduce memory usage. I intend to remove that debug line, as it has confused other folks as well.
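For illustration, here is a minimal sketch of token-count-based truncation, assuming a tiktoken encoding; this is not Khoj's actual implementation, and the names are made up:

```python
# Minimal sketch of token-count-based message truncation (illustrative,
# not Khoj's actual code). Assumes the tiktoken package is installed.
import tiktoken

def truncate_to_fit(messages: list[str], max_prompt_tokens: int) -> list[str]:
    """Drop the oldest messages until the total token count fits the limit."""
    enc = tiktoken.get_encoding("cl100k_base")  # encoding choice is an assumption
    while messages and sum(len(enc.encode(m)) for m in messages) > max_prompt_tokens:
        messages.pop(0)  # discard the oldest message first
    return messages
```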
I see how this message can be confusing, but the tokenizer name is not the same as the chat model name. Khoj can identify the tokenizer for offline models run directly within Khoj, but it cannot for chat models run via API. For such models, you can set the tokenizer field in the chat model settings on the admin panel to the huggingface repo corresponding to the tokenizer for your chat model. For llama models, the tokenizer can be set to the corresponding repo. Again, it is only something to worry about if you're hitting max context limits or want to reduce memory usage. Otherwise this isn't really relevant to your issue with chat model selection and image generation, IMO.
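As a hedged illustration of what such a tokenizer repo setting resolves to (the repo id below is an example, not a recommendation; gated repos may require an access token):

```python
# Sketch: counting tokens with a tokenizer pulled from a Hugging Face repo.
# The repo id is an illustrative example; gated repos need authentication.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3.1-8B-Instruct")
prompt = "Paint a watercolor of a lighthouse at dusk."
print(len(tokenizer.encode(prompt)))  # token count used for truncation decisions
```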
Server chat settings are prioritized over user chat settings to allow server admins more control over the appropriate model to use for intermediary and background tasks. It is (meant to be) an optional feature. The priority order of default chat model selection is implemented in khoj/src/khoj/database/adapters/__init__.py, lines 1112 to 1113 (at commit a3b5ec4).
Note: The agent chat model is currently only used for the final response generation step, not for the intermediate steps. Questions: ...
You should not; that was the reason why I came across that function. IMHO the logging output should be, depending on the level, much, much more verbose. On the one hand, again, I think that there should not be a static assignment of a model. On the other hand, ...
As said in my original post, I set the tokenizer field of the model (the one of the agent as well as the one used for the prompt enhancement), but still the one out of ... is used. That is a nice hint and should at least be part of an example within the docs for on-prem constellations. But, ...
..., you saw my originating prompt: this should not lead to hitting the limits, right? (And, before you ask: for these tests I set up a new instance without any notes, docs, ... in it.) As said, ...
Well, ... IMHO, and especially given that there are both centrally/admin-maintained and user-maintained agents, everything needed should be maintained/stored/set within the agent specification. No user setting, no server setting, ... In fact, everything starts with a conversation with a (user-selected) agent. This agent has its personality (system prompt(s)), model, parameters, ... And this agent definition should be used for everything. Perhaps an implicit default model (e.g. when an agent does not have an image generation model explicitly set), but if it is set within the agent, then no other parameter, model, ... should be used anywhere. I really was very confused when realizing that, while the agent has model_1 configured, the image generation with THIS agent uses model_2 for prompt enhancement.
Forcing Khoj to use the direct one. ;)
The agent chat model cannot override the chat model set by the server admin for the intermediate steps. But this commit enables using the agent's chat model instead of the default user chat model (set via the /settings page) for the intermediate steps (including image prompt generation) when no server chat settings are set. To have the agent chat model be used for intermediate steps, remove the server chat setting, as the priority order for intermediate steps is server > agent > user chat model.
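A minimal sketch of that stated priority order, with illustrative names rather than Khoj's actual function:

```python
# Sketch of the priority for intermediate steps described above:
# server chat setting > agent chat model > user chat model.
# Function and parameter names are illustrative, not Khoj's actual API.
from typing import Optional

def resolve_intermediate_chat_model(
    server_chat_setting: Optional[str],
    agent_chat_model: Optional[str],
    user_chat_model: str,
) -> str:
    if server_chat_setting:
        return server_chat_setting  # admin-set server setting always wins
    if agent_chat_model:
        return agent_chat_model     # used only when no server setting exists
    return user_chat_model          # fallback: user's default from /settings
```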
Discussed in #1095
Originally posted by nesretep-anp1 January 22, 2025
I have some trouble with the "Painting to Imagine" step in the "Image Generation" process.
...
It seems that there are several issues with selecting the models to use. My agent's model is `bunny-llama-3-8b-v`. This results in the effect explained in the above-mentioned discussion. The logs show ...
OK, now the effect with "violent acts" is explained; it comes out of `gpt-4o`. But why is it used? In the function `truncate_messages`, the model `gpt-4o` is set statically as the default tokenizer. The funny thing is that this static assignment only gets executed if all other cases fail.
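As an illustration of the fallback pattern being described (a sketch under assumptions, not Khoj's verbatim code):

```python
# Sketch of a tokenizer lookup that falls back to a static default
# ("gpt-4o") only when everything else fails. Illustrative, not Khoj's code.
import tiktoken

def get_encoder(model_name: str, configured_tokenizer: str | None = None):
    try:
        if configured_tokenizer:
            # tokenizer repo configured in the admin panel, if any
            from transformers import AutoTokenizer
            return AutoTokenizer.from_pretrained(configured_tokenizer)
        return tiktoken.encoding_for_model(model_name)
    except Exception:
        # the static default the poster ran into
        return tiktoken.encoding_for_model("gpt-4o")
```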
So, I set a tokenizer for the used model. Nothing changed, still `gpt-4o`. Why is a set tokenizer not taken into account? (I did not get to that point in my research yet.)
I added `default_tokenizer = model_name` directly below the `except`. Now the following happened in the logs ... (which then certainly fails, but ...)
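In terms of the sketch above, that edit amounts to something like the following (again illustrative; for a model the tokenizer library does not know, the lookup then raises, which matches the failure mentioned):

```python
# The poster's change, applied to the sketch above: fall back to the
# model's own name instead of "gpt-4o". For a model tiktoken does not
# know (e.g. bunny-llama-3-8b-v), this second lookup then fails too.
import tiktoken

def get_encoder_patched(model_name: str, configured_tokenizer: str | None = None):
    try:
        from transformers import AutoTokenizer
        return AutoTokenizer.from_pretrained(configured_tokenizer or model_name)
    except Exception:
        default_tokenizer = model_name  # the poster's edit (was "gpt-4o")
        return tiktoken.encoding_for_model(default_tokenizer)  # raises for unknown models
```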
I now realized (what is already shown in the first log snippet) that not the agent's model `bunny-llama-3-8b-v` is shown; instead `meta-llama-3.1-8b-instruct` is used!?! OK, why is the agent's model not used?
I went to "Settings" and set "Chat" under "Models" to `bunny-llama-3-8b-v`. Still `meta-llama-3.1-8b-instruct` is used!
I then changed the server chat settings in the "Admin Panel" to `bunny-llama-3-8b-v`. Now `bunny-llama-3-8b-v` is used.
IMHO the real bug described here is the weird decision of which model is taken into action in which situation and constellation?!
Especially, ...
Additionally, ...