11
Getting a good consistent backup of the SM data drive
Question asked by echoDreamz - 1/16/2020 at 2:38 AM
Unanswered
SmarterMail is not VSS-aware, so I am concerned about getting a good consistent snapshot of the SM data array. Software like MySQL, SimpleDNS and a few other systems we use have a way to tell the server or application "Hey, backup is about to start... do what you need to do to commit whatever to disk". For example, with MySQL we can lock and flush the tables to make sure that any VSS snapshot contains a fully consistent state of the server. Once the VSS snapshot has finished, we have MySQL release the table locks so it can resume normal operations.

SmarterMail has a ton of moving parts: files are constantly being created, updated, and written to. So in the event of a full catastrophic failure, the odds of our VSS-based backups being in a consistent state are probably 0. We could shut down the service, start the backup, wait for the snapshot to complete, then restart the service, but stopping and starting a mail server nightly isn't great.

Exchange is of course VSS-aware, and Zimbra offers a backup/restore system that creates safe backups. I'm wondering what the ST developers think about getting SmarterMail to allow us to create safe and consistent backups, especially since we are venturing into Exchange territory with EWS, EAS, and MAPI.

16 Replies

0
Good point Christopher! This is a very important note to be aware of.

I think you should stop the SmarterMail service, start the VSS backup, and then start the SmarterMail service again without waiting for the full backup to finish.
The only part you must wait for is the VSS snapshot, which completes BEFORE the backup starts the real copy of data, BUT I don't know how to detect when the VSS snapshot has completed...

We use Veeam for our backup...
0
echoDreamz Replied
The system we use has “hook points” where you can call different apps or batch files etc. at different points during the backup process, such as onStart, beforeSnapshot, duringSnapshot etc.
0
Matt Petty Replied
Employee Post
Just curious, how long do snapshots usually take? I could see the potential for stopping individual services during the snapshot, things like indexing and spool.
Matt Petty
Software Developer
SmarterTools Inc.
(877) 357-6278
www.smartertools.com
0
Ron Raley Replied
Snapshots in our VM environment take about 30 seconds.

Ron
0
echoDreamz Replied
A few minutes for us. 
3
echoDreamz Replied
I don't think stopping the service every single night is a good idea, unfortunately. We have customers all around the world who would immediately notice and start submitting tickets, calls, chats etc. We could explain why until we are blue in the face, but customers don't care; they will just be upset that mail is going up and down daily.

I realize this is the best way to do this, but with mail going up and down nightly, we wouldn't have any customers left to worry about backing up. I think if SM wants to compete in the Enterprise space, directly against Exchange, we need a clear way to make consistent backups without having to shut down the server.

Matt - Your idea for stopping services within SM is nice, but it is also destructive. Take MySQL, for example (far fewer moving parts, I am sure). When we lock the tables, user apps can still read from them OK, but queries like INSERT, UPDATE, REPLACE etc. are held up until the lock on the tables is released. This doesn't break user applications or report any errors; it is basically seamless to the end user apart from a little slowdown for write operations.
0
I agree, Christopher
0
echoDreamz Replied
So I created a simple app that uses the SM API to stop all the services within SM, then we call an iisreset /stop to prevent users from doing any actions via the SM interface.

-- Backup app starts
-- Backup app says it's about to create a VSS snapshot
--- App shuts down the SM services and stops IIS
-- Backup app starts the snapshot
-- Backup app completes the snapshot
--- App starts IIS and then starts the SM services

When SM stops a service though, does all activity stop for that service or is there a delay?
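
The ordering described above can be sketched in Python as a context manager that stops the services before the snapshot and always restarts them afterwards, even if the snapshot step fails. This is only an illustration under stated assumptions: `sm-services.exe` is a hypothetical stand-in for the custom app that calls the SM API, and the `runner` parameter exists so the sequence can be exercised without touching a real server. A real backup product would normally invoke the stop and start steps from separate pre-snapshot and post-snapshot hooks.

```python
import subprocess
from contextlib import contextmanager

# Assumption: "sm-services.exe" is a hypothetical helper that stops/starts
# the SmarterMail services through the SM API. "iisreset" is the IIS tool
# mentioned in the post above.
STOP_COMMANDS = [
    ["sm-services.exe", "stop"],
    ["iisreset", "/stop"],
]
START_COMMANDS = [
    ["iisreset", "/start"],
    ["sm-services.exe", "start"],
]


def run_all(commands, runner=subprocess.run):
    """Run each command in order, raising if one of them fails."""
    for cmd in commands:
        runner(cmd, check=True)


@contextmanager
def quiesced(stop_commands, start_commands, runner=subprocess.run):
    """Stop services on entry and always restart them on exit."""
    try:
        run_all(stop_commands, runner)
        yield
    finally:
        # Runs even if stopping or the snapshot raised, so mail is not
        # left down because of a failed backup.
        run_all(start_commands, runner)
```

The snapshot itself would go inside a `with quiesced(STOP_COMMANDS, START_COMMANDS):` block. Keeping the restart in `finally` is the main design point: a failed snapshot should cost one backup, not the mail service.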
3
JerseyConnect Team Replied
I'd like to reignite this conversation. We use Veeam and recently had to restore our SmarterMail server from backups. After the restore we experienced several issues with domains and users that were all caused by corrupted JSON files. We believe that this is directly related to the fact that the SM service is running when the backups are taken. 

In every instance we were able to get the affected domains and users back up and running by pulling good copies of the corrupted JSON files from another backup and then reattaching the user or reloading the domain. Obviously, dealing with corrupted objects just exacerbated an already major outage, and we'd like to avoid this scenario in the future.

I agree with echo that stopping the service every night is not ideal, but we're going to explore doing this with Veeam as it looks like the only option at the moment.
4
echoDreamz Replied
Thanks for confirming my worst fear... I was 99.99% sure there would be some corruption as our server is busy round the clock. As an ISP, we have customers in the US, but also all over the world, who expect their mail to be up and running. Daily shutdowns are just not something we can do to get good consistent backups.

Need some way to signal to SM that we are wanting to do a backup...do whatever to make the data-on-disk consistent please...

2
We also need a SAFE method to make a consistent backup (at least daily...) of SmarterMail...

I guess (I hope?...) that SmarterTools also uses SmarterMail for their internal mail...

How do they do their backups?
3
echoDreamz Replied
Just going off the IMAP banner...

 OK IMAP4rev1 SmarterMail
Looks like they are using SM (at least on their primary MX record) - And of course... mail.smartertools.com shows an SM login :)

But yes, that is a damn good question Gabriele, what are you using @ST?
2
Kyle Kerst Replied
Employee Post
Hello everyone! I saw a question aimed at us so figured I'd chime in here. First, I like the idea of signaling SmarterMail for VSS backups and I've created a request to have this discussed internally to see if there is anything we can do to help in these areas. My discussion request may or may not be doable, so I'll leave this thread unmarked until I hear back.

Next, we do back up nightly and use Veeam to complete these backups at the hardware level (Hyper-V). We do have a pretty busy server, but likely not as busy as yours! That said, we've never run into a corruption issue in my time here, so Veeam seems to handle this well. I believe they are using VSS to back up the Hyper-V instances as well. I hope this helps!
Kyle Kerst
Technical Support Specialist
SmarterTools Inc.
(877) 357-6278
www.smartertools.com
1
JerseyConnect Team Replied
Been testing some whole-VM restores and can confirm that using VSS-enabled backups results in a corruption-free server. I also did several whole-VM restores from non-VSS-enabled backups, and every single time there were a number of corrupted JSON files.
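
For anyone repeating this kind of restore test, a quick way to spot damaged files is to walk the restored data directory and try to parse every `*.json` file. The sketch below is a minimal example and makes some assumptions: the files are taken to be UTF-8 (with or without a BOM), and the function name and directory argument are made up for illustration. A file that parses is not guaranteed to be logically consistent with the rest of the data; this only catches truncated or garbled files.

```python
import json
from pathlib import Path


def find_corrupt_json(root):
    """Return paths of *.json files under root that fail to parse."""
    bad = []
    for path in sorted(Path(root).rglob("*.json")):
        try:
            # Assumption: files are UTF-8; "utf-8-sig" also accepts a BOM.
            with open(path, "r", encoding="utf-8-sig") as fh:
                json.load(fh)
        except (ValueError, OSError):
            # JSONDecodeError and UnicodeDecodeError are both ValueErrors.
            bad.append(path)
    return bad
```

Running it against a restored copy of the data drive gives a list of files to pull from another backup.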
0
I am not sure of the exact process that SM uses for the open files (are we only talking about the JSON files that are getting corrupted each time?). Does SM open them and just leave them open the entire time? If so, I was thinking you (SmarterPeeps) could set them up as a sort of cache file instead. So, let's say the file is open, and only the last 5 minutes of activity are in the open (cache) file. After 5 minutes the data from the open file is flushed to a permanent copy of the file. Then call the backup task.

Or tie a backup hook / trigger into SmarterMail. Call in "Prepare to do a backup", and SM then flushes the open cache files to permanent copies. SM then reports back "Backup prepped", and you can still do an online backup.

In this case, if a full restore was necessary, the most you would lose is 5 minutes of data.

I posted this back in 2019, and I think it may sort of be in alignment with this.

www.HawaiianHope.org - Providing technology services to non profit organizations, homeless shelters, clean and sober houses and prisoner reentry programs. in 2018, in just one year, we gave away 1,000 Free Computers !

3
Matt Petty Replied
Employee Post
Most of our JSON files, especially the settings-related ones, are read into memory on first use, and every subsequent request for data hits our memory-cached version. The only time we touch a JSON file after that first read is when we save changes to it, and we apply those same changes to our in-memory version without re-reading the file.
All of our JSON files follow this series of steps when saving:
-COPY x.json -> x.json.bak
-WRITE (new json data) -> x.json.tmp
-MOVE/OVERWRITE x.json.tmp -> x.json
We do it in this crazy order so that our only experience with x.json when writing is asking the OS to move x.json.tmp to x.json in one swift operation. If the service or server shuts down unexpectedly the x.json is either old or new, never half written.
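
The copy / write-temp / move pattern above can be sketched in Python as follows. This is an illustration of the general technique, not SmarterMail's actual code: the function name is made up, and the explicit `flush`/`fsync` before the move is an added assumption that the steps above do not mention. `os.replace` overwrites the destination in a single rename call, which is atomic on POSIX file systems.

```python
import json
import os
import shutil


def save_json_atomically(path, data):
    """Save data to path via a .bak copy and a .tmp file moved into place."""
    bak, tmp = path + ".bak", path + ".tmp"

    # COPY x.json -> x.json.bak (skipped on the very first save)
    if os.path.exists(path):
        shutil.copy2(path, bak)

    # WRITE (new json data) -> x.json.tmp
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(data, fh)
        # Assumption: push the bytes to disk before the move.
        fh.flush()
        os.fsync(fh.fileno())

    # MOVE/OVERWRITE x.json.tmp -> x.json in one operation
    os.replace(tmp, path)
```

After two saves, `x.json` holds the newest data and `x.json.bak` holds the previous version, so a reader never sees a partially written `x.json`.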

Hopefully these technical notes shed some light on how we access JSON files. All other files have different implementations, though we use the save pattern above for a couple of other important file types, like cfg and grp files.
Matt Petty
Software Developer
SmarterTools Inc.
(877) 357-6278
www.smartertools.com
