Multiple LJ backup programs have a syncitems bug.

Apr 03, 2008 06:06

Many LJ backup programs, including the official jbackup.pl client, have a bug in their syncitems handling that shows up when the first batch of items is all comments (i.e. no posts).  LiveJournal Backup / Search Tool and ljArchive also appear to be susceptible to this bug.

I noticed it because my journal is like this.  After the LJ-Sec developer reviewed it, he found that the first batch of syncitems for me was all comments.  Later I experimented with various backup programs and found that the bug manifested in many of them.

It seems that most software assumes there's at least one post in the first batch, and therefore gets stuck repeating syncitems or otherwise doesn't do the backup properly.

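In jbackup.pl, if you haven't created a GDBM database before, $lastsync won't be set.  Normally this isn't a problem, because if there's a single post ("L" type) in the first batch of syncitems, it will update $lastsync.  If the first batch is all comments, though, $lastsync never gets updated, so presumably the next syncitems request starts from the same point and gets the same batch back.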

This fix shouldn't cause any problems, unless I'm mistaken.  You only need to modify do_sync's main while loop as follows:

foreach my $item (@{$hash->{syncitems} || []}) {
    # Update $lastsync regardless of item type
    $lastsync = $item->{'time'}
        if $item->{'time'} gt $lastsync;
    next unless $item->{item} =~ /L-(\d+)/;
    $synccount++;
    $sync{$1} = [ $item->{action}, $item->{'time'} ];
#  $lastsync = $item->{'time'}
#  if $item->{'time'} gt $lastsync;
#  $bak{"event:realtime:$1"} = $item->{'time'};
}
$bak{'event:lastsync'} = $lastsync;
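
To see why the position of the $lastsync update matters, here is a small self-contained simulation.  It's only a sketch, not jbackup.pl itself: fake_syncitems, run_sync, the sample items, the batch size of 2 and the "stop on an empty batch" rule are all made up for illustration, and I'm assuming the server returns items strictly newer than $lastsync.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up server contents: the first batch (2 items) is all comments.
my @server_items = (
    { item => 'C-1', action => 'create', time => '2008-01-01 10:00:00' },
    { item => 'C-2', action => 'create', time => '2008-01-01 10:05:00' },
    { item => 'L-1', action => 'create', time => '2008-01-02 09:00:00' },
    { item => 'C-3', action => 'create', time => '2008-01-02 09:30:00' },
);

# Hypothetical stand-in for the syncitems call: returns up to $batch
# items whose time is strictly later than $lastsync (assumption).
sub fake_syncitems {
    my ($lastsync, $batch) = @_;
    my @newer = grep { $_->{time} gt $lastsync } @server_items;
    splice(@newer, $batch) if @newer > $batch;   # keep only the first $batch
    return \@newer;
}

# Runs the sync loop.  $fixed selects where $lastsync gets updated.
# Returns (requests made, posts seen, finished flag).
sub run_sync {
    my ($fixed, $max_requests) = @_;
    my $lastsync = '';
    my %sync;
    my $requests = 0;
    while ($requests < $max_requests) {
        $requests++;
        my $items = fake_syncitems($lastsync, 2);
        return ($requests, scalar(keys %sync), 1) unless @$items;
        foreach my $item (@$items) {
            # Fixed variant: advance $lastsync for every item type.
            $lastsync = $item->{time}
                if $fixed && $item->{time} gt $lastsync;
            next unless $item->{item} =~ /L-(\d+)/;
            $sync{$1} = [ $item->{action}, $item->{time} ];
            # Original variant: advance $lastsync only for posts.
            $lastsync = $item->{time}
                if !$fixed && $item->{time} gt $lastsync;
        }
    }
    return ($requests, scalar(keys %sync), 0);   # gave up without finishing
}

printf "original: requests=%d posts=%d finished=%d\n", run_sync(0, 10);
printf "fixed:    requests=%d posts=%d finished=%d\n", run_sync(1, 10);
```

With this sample data, the original variant keeps getting the same two comments and gives up after 10 requests without seeing a post, while the fixed variant finishes in 3 requests and picks up the one post.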

Side note/question: jbackup.pl, when run under ActiveState's Perl for Windows, seems to be really slow (running for minutes on a Pentium 4 system) at processing comments AFTER they've been downloaded, but is fast when comments aren't processed at all.  This holds true even when no new comments have been added: checking for new comments appears to take about as long as when there are many new ones.

There's not a lot of CPU usage, but there is a lot of hard drive access.  I'm not sure why it would do this, especially when there are no new comments, so the database doesn't need to be updated and nothing new needs to be downloaded.

Tags: client: sync, client: export, client, code: perl
