It pains me to write this.
Has anybody ever done backups for Apache Lucene? (In particular, Solr, though I've no reason to believe that means anything special for backup/recovery.) What did you do, in broad strokes?
Note that I'm interested in responses in accordance with jwz's rules here. I do not want to know what you think might work
Comments 11
It would appear that Lucene pretends that it only ever writes files to disk atomically. The Solr community provides several utilities (actually, those are links to documentation about them; the utilities are in the distribution) that perform backups / site redundancy by making hard links to the important files and then copying the data through those hard links.
Because, apparently, Lucene uses a brand new inode every time it writes data out, and then removes the predecessor link in the file system. I'm so sure this is faster than, um, using mmap(2), for example. It definitely makes sense to run every single actual I/O through the file system. Oh, wait, maybe not ( ... )
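For anyone who wants to see why the hard-link trick holds together, here is a minimal sketch in Python. It is not the Solr utilities' code: the function names, the `segment.dat` file name, and the "write a new file, then swap it in under the old name" writer behaviour are all my assumptions, based only on the description above. Hard links also only work when the snapshot directory is on the same file system as the index.

```python
import os
import shutil
import tempfile


def make_snapshot(index_dir, snapshot_dir):
    """Hard-link every regular file in index_dir into a new snapshot_dir."""
    os.makedirs(snapshot_dir)
    for name in os.listdir(index_dir):
        src = os.path.join(index_dir, name)
        if os.path.isfile(src):
            # A hard link is a second name for the same inode; no data is copied.
            os.link(src, os.path.join(snapshot_dir, name))


def copy_snapshot(snapshot_dir, backup_dir):
    """Copy the snapshot's contents to the backup location."""
    shutil.copytree(snapshot_dir, backup_dir)


def demo():
    """Simulate a writer replacing a file between the snapshot and the copy."""
    root = tempfile.mkdtemp()
    index = os.path.join(root, "index")
    os.makedirs(index)
    seg = os.path.join(index, "segment.dat")  # hypothetical file name
    with open(seg, "w") as f:
        f.write("old")

    make_snapshot(index, os.path.join(root, "snap"))

    # Assumed writer behaviour: data goes to a brand new inode, which then
    # takes over the old name. The snapshot still points at the old inode.
    tmp = seg + ".tmp"
    with open(tmp, "w") as f:
        f.write("new")
    os.replace(tmp, seg)

    copy_snapshot(os.path.join(root, "snap"), os.path.join(root, "backup"))
    with open(os.path.join(root, "backup", "segment.dat")) as f:
        backed_up = f.read()
    with open(seg) as f:
        live = f.read()
    shutil.rmtree(root)
    return backed_up, live
```

Calling `demo()` shows the backup holding the contents from snapshot time while the live index has moved on. Note that this only holds if the writer never modifies a file in place, which is exactly the claim I'd want to test before trusting it.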
See above, but it sounds like the authors' stance is a lot of hand-waving. I won't believe them until I do some test restores, then purposely break things in the middle of a backup of a backup of that restored (non-prod) system, and then test a restore from that. But, you know, it's open source, so clearly my (er, their, actually) backup reliability is somebody else's learning experience.
Sigh.