Expanded the list of providers for which models are fetched dynamically to include Ollama and Llamafile, removing the need for manual model addition in the user interface for these providers. This simplifies the user experience and ensures users always have access to the latest models without manual intervention.
Adds a new "Temporary Chat" mode for quick, non-persistent conversations. The new mode is available in the header bar and will trigger a visually distinct chat experience with a temporary background color. Temporary chats do not save to the chat history and are meant for short, one-off interactions. This feature enhances flexibility and provides a more convenient option for users who need to quickly interact with the AI without committing the conversation to their history.
Adds a new setting to control the maximum number of tokens generated by the model. This provides more control over the length of responses and can be useful for limiting the amount of text generated in certain situations.
This commit introduces a new feature that displays generation information for each message in the chat.
The generation info is displayed in a popover and includes details about the model used, the prompt, and other relevant information. This helps users understand how their messages were generated and troubleshoot any issues that may arise.
The generation info is retrieved from the LLM response and is stored in the database alongside other message details.
This commit also includes translations for the generation info label in all supported languages.