Version
Desktop 1.2.0 (Windows CPU)
Description
I'm not talking about the interval. That can stay at 15 or 20 or whatever. I'm talking about an option to lag the run a bit to account for swipes/regenerations, and for cases where the user deletes the LLM message, edits their last user message to steer the LLM better, and then triggers the model to generate a new message (those of us who do a lot of editorial work on our chats). In those cases, belated summary generation would be beneficial.
To put it another way, let's say the summary interval is 20 messages. Once you get to the 20th message, it runs the summary and memory manager. But if you swipe on that 20th message, it does not run again and therefore does not account for how you may have changed the story in that swipe. If you delete the message and generate a new one, it does run again, but I notice memory and summary artifacts left behind. I can revert those in the activity, but that doesn't fix the context summary. I just saw you have another commit for that so I'll leave that alone for now.
But I realized the entire thing could be avoided if the user could set a delay so that the window of messages 1-20 isn't processed until they're on the 22nd or 24th message or so. Let the dust settle on the swipes and edits before running memory maintenance. That's what I'm getting at.
I was thinking maybe it could be a new setting in the dynamic memory advanced options.

Like a "Summary Delay" (Delay memory maintenance by X messages) with a default of 0, since that's how it runs right now. I haven't seen anyone else mention it on Discord, so I'm guessing everyone else likes it in its present state. But those of us who edit heavily could put in something like 2 or 4, and that integer would get added to the check for when to run memory maintenance. So it wouldn't change the interval window (I still want 20 messages), only when the run happens with respect to that window (e.g. run on messages 1-20 when you hit the 24th message).
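Here's a rough sketch of the check I have in mind. I haven't looked at how the app actually decides when to run, so the function names and parameters below are made up for illustration and aren't taken from the codebase.

```python
# Hypothetical sketch only: these names do not come from the app.

def should_run_maintenance(message_count: int, interval: int, delay: int = 0) -> bool:
    """Return True when a full window is due, `delay` messages after it filled up."""
    if interval <= 0 or delay < 0:
        raise ValueError("interval must be positive and delay must not be negative")
    settled = message_count - delay  # messages old enough to be considered final
    return settled > 0 and settled % interval == 0


def window_to_summarize(message_count: int, interval: int, delay: int = 0) -> tuple[int, int]:
    """Return the 1-based, inclusive message range to process at this trigger point."""
    end = message_count - delay
    return (end - interval + 1, end)


# Interval 20, delay 4: nothing happens at message 20, the run fires at message 24.
print(should_run_maintenance(20, 20, 4))  # False
print(should_run_maintenance(24, 20, 4))  # True
print(window_to_summarize(24, 20, 4))     # (1, 20)
```

With a delay of 0 this behaves the same as today (the run fires on message 20 for messages 1-20). Swipes don't change the message count, so anything swiped before message 24 would already be settled by the time the run happens. If someone deletes messages and lands on message 24 a second time, this simple check would fire again, so a real version would probably want to remember the last window it processed.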
Does that hopefully make sense?