We present an approach to managing short-term memory in chatbots that combines storage techniques with automatic summarization to optimize the conversational context. The method relies on a dynamic memory structure that bounds the stored data size while preserving essential information through intelligent summaries. This approach not only improves the fluidity of interactions but also ensures contextual continuity over long dialogue sessions. Additionally, asynchronous techniques ensure that memory management operations do not interfere with the chatbot's responsiveness.
This section explains how to use the shortterm-memory package to manage a chatbot's memory. First install the dependencies and the package itself, then verify the installation:
pip install torch transformers
pip install shortterm-memory
pip show shortterm-memory
from shortterm_memory.ChatbotMemory import ChatbotMemory
# Initialize the chatbot memory
chat_memory = ChatbotMemory()
# Update the memory with a new exchange
user_input = "Hello, how are you?"
bot_response = "I'm doing well, thank you! And you?"
chat_memory.update_memory(user_input, bot_response)
# Retrieve the conversation history
historique = chat_memory.get_memory()
print(historique)
The package exposes the following functions:

- update_memory(user_input: str, bot_response: str): Updates the conversation history with a new question-response pair.
- get_memory(): Returns the complete conversation history as a list.
- memory_counter(conv_hist: list) -> int: Counts the total number of words in the conversation history.
- compressed_memory(conv_hist: list) -> list: Compresses the conversation history using a summarization model.
Ensure that user inputs and bot responses are valid strings. If the history becomes too large, the package automatically compresses older conversations to save memory.
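For finer control, the counting and compression steps can also be invoked manually. The snippet below is a minimal sketch building on the example above; it assumes memory_counter and compressed_memory are callable on the ChatbotMemory instance (they may equally be module-level helpers), and the 1000-word threshold is an arbitrary illustration, not a documented package default.

# Minimal sketch of manual compression; the threshold value and the
# placement of these helpers on chat_memory are assumptions.
historique = chat_memory.get_memory()
if chat_memory.memory_counter(historique) > 1000:
    historique = chat_memory.compressed_memory(historique)
    print(len(historique))  # the compressed history should be shorter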
In this section, we mathematically formalize conversation memory management in the chatbot. The memory is structured as a list of pairs representing exchanges between the user and the bot.
The conversation memory can be defined as an ordered list of pairs
$$M = [(u_1, b_1), (u_2, b_2), \dots, (u_n, b_n)],$$
where $u_i$ denotes the user's $i$-th input and $b_i$ the bot's corresponding response. When a new exchange occurs, a new pair $(u_{n+1}, b_{n+1})$ is appended to the memory:
$$M \leftarrow M \,\Vert\, [(u_{n+1}, b_{n+1})],$$
where $\Vert$ denotes list concatenation. To manage memory space and decide when compression is necessary, we calculate the total number of words
$$W(M) = \sum_{i=1}^{n} \left( |u_i| + |b_i| \right),$$
where $|x|$ is the number of words in the string $x$. When $W(M)$ exceeds a fixed threshold $\theta$, the history is compressed:
$$M \leftarrow S(M),$$
where $S$ is a summarization model that maps the conversation history to a shorter list of exchanges while preserving the essential information. The language model uses the compressed context to generate relevant responses. The prompt $P$ submitted to the model is built by concatenating the (possibly compressed) history with the latest user input:
$$P = \operatorname{concat}(M, u_{n+1}),$$
where $\operatorname{concat}$ denotes text concatenation.
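To make the formalization concrete, here is a self-contained sketch of the count, compress, and prompt-construction steps. It is not the package's implementation: the threshold $\theta = 1000$, the choice of a Hugging Face summarization pipeline for $S$ (consistent with the torch and transformers dependencies installed earlier), and all function names are illustrative assumptions.

from transformers import pipeline

THETA = 1000  # illustrative word threshold; the package's actual value may differ

# Hypothetical choice for the summarization model S; any Hugging Face
# summarization pipeline would do (a default model is downloaded on first use).
summarizer = pipeline("summarization")

def word_count(memory):
    """W(M): total words over all (user, bot) pairs in the history."""
    return sum(len(u.split()) + len(b.split()) for u, b in memory)

def compress(memory):
    """S(M): collapse the full history into a single summarized exchange."""
    text = " ".join(f"User: {u} Bot: {b}" for u, b in memory)
    summary = summarizer(text, max_length=100, min_length=20, do_sample=False)
    return [("[summary]", summary[0]["summary_text"])]

def build_prompt(memory, user_input):
    """P = concat(M, u_{n+1}), compressing first if W(M) exceeds THETA."""
    if word_count(memory) > THETA:
        memory = compress(memory)
    context = "\n".join(f"User: {u}\nBot: {b}" for u, b in memory)
    return f"{context}\nUser: {user_input}\nBot:"

# Usage: start from an existing history, then build the next prompt
memory = [("Hello, how are you?", "I'm doing well, thank you! And you?")]
print(build_prompt(memory, "Can you summarize our chat so far?"))

Counting words rather than tokens keeps the check model-agnostic; a production version would more likely measure the history with the language model's own tokenizer.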
This approach ensures that the chatbot always has an up-to-date conversational context, enabling more natural and engaging interactions with the user.