At Tessell, our goal is to take the operational headache out of managing databases. One of the things we offer in our MSSQL DBaaS is automated backup management, which includes full daily backups using Microsoft's Virtual Device Interface (VDI) and transaction log backups every five minutes. These backups are crucial for point-in-time recovery and are pushed to Amazon S3 for durable storage.
So far, this setup has worked great. But as always, real-world use cases have a way of nudging architectures to evolve.
Our Existing Transaction Log Backup Flow
Here’s how things worked before:
- A component called Logsweep would periodically grab the transaction logs from the SQL Server.
- These logs were temporarily written to the local disk.
- Once written, they were uploaded to Amazon S3.
- After a successful upload, the local copy was deleted to keep the disk clean.
This design is simple, efficient, and keeps things moving smoothly. The local disk is just a transient hop on the way to S3.
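For illustration, here's a minimal Python sketch of that original flow, assuming boto3 for the S3 upload; the directory, bucket name, and function name are placeholders, not our actual Logsweep code:

```python
# Minimal sketch of the old sweep -> upload -> delete flow.
# ARCHIVE_DIR and S3_BUCKET are illustrative placeholders.
import os
import boto3

ARCHIVE_DIR = r"R:\Backup"           # transient hop for swept log backups
S3_BUCKET = "example-log-backups"    # hypothetical bucket name

s3 = boto3.client("s3")

def sweep_and_upload():
    for name in os.listdir(ARCHIVE_DIR):
        local_path = os.path.join(ARCHIVE_DIR, name)
        if not os.path.isfile(local_path):
            continue
        s3.upload_file(local_path, S3_BUCKET, f"txn-logs/{name}")
        os.remove(local_path)  # old behavior: delete as soon as the upload succeeds
```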

The Problem: Retain Logs Longer for AWS DMS
Some of our customers, especially those using AWS Database Migration Service (DMS), needed to retain transaction logs on disk for at least 24 hours. DMS scans the transaction logs directly from the disk for continuous replication, and our default behavior of deleting logs after upload broke that flow.
So now we had a challenge: how do we keep the logs around long enough for DMS to use them, without risking the local disk filling up and bringing down backups?
The Fix: Smarter Cleanup with Configurable Retention
We decided to keep things flexible. Instead of deleting logs immediately after upload, we now retain them by default and have introduced a new scheduled cleanup task that runs every 30 minutes.
This task doesn't just blindly delete files. It uses a set of configurable thresholds to decide how much to clean up, and when. The idea is to free up space only when the disk is getting full, and to retain logs as long as possible otherwise.
How it’s Configured
By default, we ship with a config file here:
T:\tessell\python\Lib\site-packages\tessell\resources\mssql\archive_config.json
But here's the important part: this file gets overwritten on every upgrade. So if you're a customer and want to customize retention behavior, you should create your own config file at:
T:\tessell_custom_configs\custom_archive_config.json
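The loading logic that makes this work can be as simple as reading the shipped defaults and overlaying the custom file when it exists. Here's a minimal sketch under that assumption; the function name and the key-by-key merge are illustrative, not our exact implementation:

```python
# Sketch of config loading with a customer override.
# Assumes custom keys simply take precedence over the shipped defaults.
import json
import os

DEFAULT_CONFIG = r"T:\tessell\python\Lib\site-packages\tessell\resources\mssql\archive_config.json"
CUSTOM_CONFIG = r"T:\tessell_custom_configs\custom_archive_config.json"

def load_archive_config():
    with open(DEFAULT_CONFIG) as f:
        config = json.load(f)
    if os.path.exists(CUSTOM_CONFIG):    # survives upgrades, unlike the default file
        with open(CUSTOM_CONFIG) as f:
            config.update(json.load(f))  # custom keys win
    return config
```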
Here’s what a typical config looks like:
{ "custom_archive_disk_config_path": "T:\\tessell_custom_configs\\custom_archive_config.json", "archive_disk_max_limit_in_percentage": 75, "archive_disk_floor_limit_in_percentage": 35, "archive_retain_time_in_min": 360, "reduce_archive_retain_time_in_min": 30, "minimum_archive_retain_time_in_min": 180, "archive_disk_location": "R:\\Backup", "data_disk_location": "E:\\data", "disk_usage_alert_limit_in_percentage": 90}
- Retention time is defined in minutes. Here, we're keeping logs for at least 6 hours (360 min), and will gradually delete them in chunks of 30 minutes if needed.
- Cleanup only kicks in when disk usage crosses 75%, and stops once it drops below 35%.
- If disk usage still doesn't drop enough after cleanup, that's a red flag: either the disk is too small, or logs are arriving faster than expected.
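To make those thresholds concrete, here's a rough Python sketch of what a single cleanup pass could look like, using shutil.disk_usage for the usage check; the helper names and exact deletion strategy are assumptions based on the config above, not our production code:

```python
# Rough sketch of one 30-minute cleanup pass, driven by the config above.
import os
import shutil
import time

def disk_usage_pct(path):
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100

def delete_older_than(directory, minutes):
    cutoff = time.time() - minutes * 60
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            os.remove(path)

def cleanup_pass(cfg):
    loc = cfg["archive_disk_location"]
    if disk_usage_pct(loc) < cfg["archive_disk_max_limit_in_percentage"]:
        return  # below 75%: retain everything
    retain = cfg["archive_retain_time_in_min"]  # start at the full 360 min
    while disk_usage_pct(loc) > cfg["archive_disk_floor_limit_in_percentage"]:
        delete_older_than(loc, retain)
        if retain <= cfg["minimum_archive_retain_time_in_min"]:
            break  # never touch logs younger than 180 min
        retain -= cfg["reduce_archive_retain_time_in_min"]  # shrink window by 30 min
```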
Proactive Alerts
To avoid unpleasant surprises, we’ve built in an alert system. If the cleanup isn’t enough to keep the disk below the threshold, we raise an alert so that the SRE team can step in and resize the disk.
This prevents the backup system from failing due to a full archive disk, a failure that can have cascading consequences for log sweep and restore operations.
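Continuing the sketch above, the check itself is just a comparison against disk_usage_alert_limit_in_percentage after a cleanup pass; raise_alert below is a hypothetical stand-in for our internal alerting hook:

```python
def raise_alert(message):
    # Hypothetical stand-in for the real alerting hook.
    print(f"ALERT: {message}")

def check_disk_alert(cfg):
    pct = disk_usage_pct(cfg["archive_disk_location"])      # helper from the sketch above
    if pct > cfg["disk_usage_alert_limit_in_percentage"]:   # 90% by default
        raise_alert(f"Archive disk at {pct:.1f}% even after cleanup; "
                    "the disk likely needs resizing.")
```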
Flexibility for Customers
One of the best parts of this setup is that it puts control in the hands of the user. If a customer needs longer retention for compliance, replication, or auditing, they can tweak the retention period in their custom config. No code changes needed, just a simple config update.
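For example, a DMS customer who needs a full 24 hours of logs on disk could pin retention at 1440 minutes with a tiny custom config, assuming (as in the loading sketch earlier) that custom keys overlay the shipped defaults:

```json
{
  "archive_retain_time_in_min": 1440,
  "minimum_archive_retain_time_in_min": 1440
}
```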
Final Thoughts
This was one of those changes driven directly by real customer feedback, and it turned out to be a great enhancement for everyone. It gives our platform more resilience, more observability, and gives our users the flexibility they need to integrate with tools like AWS DMS without compromising on safety or efficiency.