Is it a Diff Merge or a Full?

This is a question that every Datto partner has had to ask at least once: “Hi, this is for serial XYZ. Is the backup running for agent 192.0.2.1 a full or a differential merge?”

In case anyone isn’t a Datto user, a “differential merge” is a method used to fix problems with a backup. It reads the entire source disk looking for changes, but still only writes the changes. This is opposed to a new full, which also reads the entire disk and can fix any issues, but has the obvious consequence of requiring enough disk space for a full backup.

It’s tough to differentiate between the two. Differential merges will usually run faster, but that only helps if you have an idea of what a normal transfer speed looks like for that agent during a backup.

One method I’ve been using to tell them apart is to check how much data is changing on the volume. As you know, Datto uses ZFS snapshots as backup points, and changes are written to the live data (this is the “inverse chain” that you’ll hear referenced). To check how much data has changed on the live set, you can use:

zfs list -o name,creation,written homePool/home/agents/192.0.2.1

You’ll get something like this:

NAME                            CREATION               WRITTEN
homePool/home/agents/192.0.2.1  Tue Jul 15 22:32 2016    3.41G

This is similar to the commands used to view the real backup size. The ‘written’ column isn’t normally part of the output used for that because we’re normally running these after the fact to determine whether an agent took a full. ‘WRITTEN’ refers to the amount of data changed on the live filesystem.

To check for a full, preface this with a ‘watch’ command:

watch zfs list -o name,creation,written homePool/home/agents/192.0.2.1

If this number climbs steadily, roughly in step with the amount the backup has transferred, you have a full. If it stays relatively static (or at least increases at a rate that matches a normal backup for this agent), you have a diff merge. To double check, view this while you’re watching the progress of a backup in the web UI. If the agent is going through the volume and the number in the written column stays the same or is much lower than what the UI shows as complete, you have a diff merge.
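The comparison against the web UI can be roughed out in a few shell functions. This is only a sketch: the function names are made up for illustration, the "bytes transferred" figure has to be read off the UI by hand, and the 50% cutoff is an arbitrary assumption, not a Datto value.

```shell
# written_bytes DATASET
# Exact byte count of the dataset's "written" property (-H: no header, -p: raw numbers).
written_bytes() {
    zfs get -Hp -o value written "$1"
}

# written_percent START END TRANSFERRED
# Percentage of the bytes the UI reports as transferred that actually landed on disk.
written_percent() {
    if [ "$3" -le 0 ]; then
        echo 0
        return
    fi
    echo $(( ($2 - $1) * 100 / $3 ))
}

# guess_backup_type PERCENT [THRESHOLD]
# THRESHOLD (default 50) is an illustrative cutoff only.
guess_backup_type() {
    if [ "$1" -ge "${2:-50}" ]; then
        echo "full"
    else
        echo "diff merge"
    fi
}

# Possible usage on the device (dataset name taken from the example above):
#   start=$(written_bytes homePool/home/agents/192.0.2.1)
#   ...wait a few minutes and note the bytes transferred in the UI...
#   end=$(written_bytes homePool/home/agents/192.0.2.1)
#   guess_backup_type "$(written_percent "$start" "$end" "$transferred")"
```

Treat the result as a hint rather than a verdict; an agent with a very high change rate could push a diff merge over whatever cutoff you pick.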

As always, if unsure about any of the above, open a ticket with support.

Datto Won’t Check In

I have a device that is not checking in. After contacting the client, the device is confirmed online.

In a remote session to a server on the client’s network, the device is answering pings and I can SSH into it. That’s when the trouble starts:


root@datto:~# checkin
Updating checkin script (using device.dattobackup.com)...
/datto/scripts/DoBackup.sh: line 93: /datto/scripts/pre-commands.sh: Read-only file system
Failed to communicate with checkin server, this is usually a result of network connectivity problems, exiting...
root@datto:~#

That’s the same “read-only” filesystem error message that I previously discussed in the context of a failed OS upgrade.

In that scenario, the issue is that fstab doesn’t have the right information for the software RAID array and is only able to mount read only.

The contents of fstab are below:


root@datto:~# cat /etc/fstab
/etc/fstab: static file system information.
#
# Use 'blkid -o value -s UUID' to print the universally unique identifier
# for a device; this may be used with UUID= as a more robust way to name
# devices that works even if disks are added and removed. See fstab(5).
#
#
proc /proc proc nodev,noexec,nosuid 0 0
# / was on /dev/sda1 during installation
UUID=c01732e2-8b21-4c3c-980f-97acab2326f3e
/ ext4 errors=remount-ro 0 1
# swap was on /dev/sda5 during installation
UUID=22ab332d-22e3-462b-98ba-d80a6c0956ea none swap sw 0 0
# array
root@datto:~#

What the heck? The first line should be commented out, but it isn’t. Additionally, the mount options for ‘/’ got broken into two lines. This seems like something that would be caused by a parsing script encountering input that it didn’t expect. Regardless, this certainly explains why the filesystem won’t mount.
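Damage like this can be spotted mechanically. The sketch below is a rough heuristic, not a real fstab parser: the function name is hypothetical, and it only assumes that a valid entry has four to six whitespace-separated fields and that the second field is either a path, "none", or "swap".

```shell
# check_fstab [FILE]
# Print a warning for every non-comment line that does not look like an fstab entry.
check_fstab() {
    awk '
        /^[ \t]*(#|$)/ { next }    # skip comments and blank lines
        NF < 4 || NF > 6 {
            printf "line %d: expected 4-6 fields, found %d\n", NR, NF
            next
        }
        $2 !~ /^\// && $2 != "none" && $2 != "swap" {
            printf "line %d: unexpected mount point \"%s\"\n", NR, $2
        }
    ' "${1:-/etc/fstab}"
}
```

Run against the file shown above, it should complain about line 1 (the uncommented header) and about lines 10 and 11 (the two halves of the split root entry), while leaving the proc and swap entries alone.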

So how do you fix this if it happens to you?

First, verify that the software RAID array is assembled. There are a few ways to do that, but I want to verify both that the array exists as a block device and that both members are a part of it:


root@datto:~# blkid /dev/md1
/dev/md1: UUID="c01732e2-8b21-4c3c-980f-97acab2326f3e" TYPE="ext4"
root@datto:~#


root@datto:~# cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md1 : active raid1 sdb1[1] sda1[0]
878916032 blocks [2/2] [UU]

unused devices: <none>
root@datto:~#
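If you would rather not eyeball the status flags, the same check can be sketched in a few lines of awk. The function name is hypothetical; the only assumption is the usual mdstat layout, where each array's status line carries a flag block such as [UU], with an underscore marking a missing member.

```shell
# degraded_arrays [FILE]
# Print the name of every md array whose member flags contain "_" (a missing disk).
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { array = $1 }          # remember which array we are in
        match($0, /\[[U_]+\]/) {              # flag block such as [UU] or [U_]
            if (index(substr($0, RSTART, RLENGTH), "_")) print array
        }
    ' "${1:-/proc/mdstat}"
}
```

No output means every member is present, which is what the [UU] in the listing above indicates.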

Both drives are present, so we’re not looking at a data loss issue. The next thing to do is bring the device online read/write. Use the exact same method that we used while fixing the upgrade error:


root@datto:~# mount -o remount,rw /dev/md1 /
root@datto:~#
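The remount prints nothing on success, so it can be worth confirming that it took. A minimal sketch, assuming the standard /proc/mounts layout (device, mount point, type, options); the function name is made up for illustration:

```shell
# mount_mode MOUNTPOINT [FILE]
# Print "rw" or "ro" for the last entry in FILE that matches MOUNTPOINT.
mount_mode() {
    awk -v mp="$1" '
        $2 == mp {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro" || opts[i] == "rw") mode = opts[i]
        }
        END { if (mode != "") print mode }
    ' "${2:-/proc/mounts}"
}
```

Calling `mount_mode /` on the device should print `rw` once the remount has succeeded.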

Now you can fix fstab. Similar to the previous article, this is probably a good time to call support.

That being said, if you want to fix it yourself, the solution here is to comment out the first line and join lines 10 and 11 so that the mount options are on one line as intended.
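Before touching the file, it may help to preview the result. The sketch below wraps a sed one-liner in a hypothetical helper that only prints a repaired copy to stdout. The line numbers (1, 10, and 11) match the damaged file shown above and are assumptions for any other device, so check them first.

```shell
# preview_fstab_fix [FILE]
# Print a repaired copy of FILE to stdout; FILE itself is not modified.
#   1s/^/# /          comment out the stray header on line 1
#   10{N;s/\n/ /;}    pull line 11 up onto line 10, separated by a space
preview_fstab_fix() {
    sed -e '1s/^/# /' -e '10{N;s/\n/ /;}' "${1:-/etc/fstab}"
}
```

If the preview looks right, make the same two changes in an editor after taking a copy of the original file.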

Then your mount will work:


root@datto:~# mount -a
root@datto:~#

You’re pretty much set here. In my case, the lock on checkin was still held, so the appliance wouldn’t check in. I wanted to reboot anyway to verify that everything came back up.

Even if you did successfully get this up and running, I’d still suggest getting support to take a look.