"Dear lazyweb", have you done backups for Apache Lucene?

Oct 18, 2007 03:38

It pains me to write this.

Has anybody ever done backups for Apache Lucene? (In particular, Solr, though I've no reason to believe that means anything special for backup/recovery.) What did you do, in broad strokes?

Note that I'm interested in responses in accordance with jwz's rules here. I do not want to know what you think might work ...

Comments 11

reddragdiva October 18 2007, 11:02:40 UTC
Not quite a real answer, but I'm now wondering just what Wikimedia does with their hacked-up Lucene. (They reimplemented it in C#, then reimplemented that back into Java. o_0)

grumpy_sysadmin October 19 2007, 06:57:28 UTC
[reddragdiva, I do apologize for the duplicated "comment reply" notifications. I will, eventually, either learn to use the preview button or decide that proper spelling and grammar don't actually matter in blog comments.]

It would appear that Lucene pretends that it only ever writes files to disk atomically. The Solr community provides several utilities (actually, those are links to documentation about them; the utilities are in the distribution) that perform backups / site redundancy by making hard links to the important files and then copying the data through those hard links.

Because, apparently, Lucene uses a brand new inode every time it writes data out, and then removes the predecessor link in the file system. I'm so sure this is faster than, um, using mmap(2), for example. It definitely makes sense to run every single actual I/O through the file system. Oh, wait, maybe not ( ... )
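
For the record, here is a rough sketch of the hard-link-then-copy idea as I understand it from the description above. This is not the Solr utilities themselves; the function names and directory layout are made up for illustration, and it assumes the index files are never modified in place (only replaced via a new inode), that the snapshot directory sits on the same file system as the index, and that the platform supports os.link.

```python
import os
import shutil
from pathlib import Path


def snapshot_index(index_dir, snapshot_dir):
    """Hard-link every regular file in index_dir into a new snapshot_dir.

    Hypothetical helper: a hard link is cheap and pins the current inode,
    so a later "write new file, unlink old one" in index_dir does not
    change what the snapshot sees.
    """
    index_dir, snapshot_dir = Path(index_dir), Path(snapshot_dir)
    snapshot_dir.mkdir(parents=True)  # fails if the snapshot already exists
    for entry in sorted(index_dir.iterdir()):
        if entry.is_file():
            os.link(entry, snapshot_dir / entry.name)
    return snapshot_dir


def backup_index(index_dir, snapshot_dir, backup_dir):
    """Take a hard-link snapshot, then copy the snapshot's data elsewhere."""
    snap = snapshot_index(index_dir, snapshot_dir)
    # The slow copy reads from the frozen snapshot, not the live index.
    shutil.copytree(snap, backup_dir)
    return Path(backup_dir)
```

Note what this does not protect against: if anything ever rewrites a file in place, the snapshot changes with it, because both names point at the same inode. It also says nothing about whether the set of files linked at one instant forms a consistent index.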


ultranurd October 18 2007, 21:02:54 UTC
Your jwz-style posts are working, because I'm scared to reply.

grumpy_sysadmin October 19 2007, 06:40:36 UTC
And yet, it curiously didn't stop you from doing so. But it's okay because you actually know me in person. (I really do mean that it didn't upset me to have you make this comment, though it's precisely the sort of comment I intended to preclude.)

ultranurd October 19 2007, 12:42:26 UTC
Still, I hesitated. On a somewhat more on-topic note, my second SATA drive + enclosure is on order, so soon I will have a drive I can store in a locked drawer at work as well as the live backup on my desktop. Sadly, I know jack about the specific backup method at hand.

grumpy_sysadmin October 20 2007, 00:08:50 UTC
HAHA: but it is doubly-on-topic, given jwz's home user backup rant last(?) week...

See above, but it sounds like the authors' stance is a lot of hand-waving. I won't believe them until I do some test restores, then purposely break things in the middle of a backup of a backup of that restored (non-prod) system, and then test a restore from that. But, you know, it's open source, so clearly my (er, their, actually) backup reliability is somebody else's learning experience.

Sigh.
