
ZFS Gets Built-In Deduplication

ScuttleMonkey posted more than 4 years ago | from the sounds-like-a-resource-hog-waiting-to-happen dept.

Sun Microsystems

elREG writes to mention that Sun's ZFS now has built-in deduplication utilizing a master hash function to map duplicate blocks of data to a single block instead of storing multiples. "File-level deduplication has the lowest processing overhead but is the least efficient method. Block-level dedupe requires more processing power, and is said to be good for virtual machine images. Byte-range dedupe uses the most processing power and is ideal for small pieces of data that may be replicated and are not block-aligned, such as e-mail attachments. Sun reckons such deduplication is best done at the application level since an app would know about the data. ZFS provides block-level deduplication, using SHA256 hashing, and it maps naturally to ZFS's 256-bit block checksums. The deduplication is done inline, with ZFS assuming it's running with a multi-threaded operating system and on a server with lots of processing power. A multi-core server, in other words."
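To make the block-level idea concrete, here is a minimal in-memory sketch in Python of hash-keyed deduplication (purely illustrative; the class, the 8 KB block size and the dictionary-based store are this example's own inventions, not ZFS code, which keeps its dedup table on disk and reference-counts blocks):

    import hashlib

    BLOCK_SIZE = 8192  # illustrative; real ZFS records are typically larger

    class DedupStore:
        """Toy store: each unique block is kept once, keyed by its SHA-256 digest."""
        def __init__(self):
            self.blocks = {}   # digest -> block bytes (stored once)
            self.files = {}    # filename -> ordered list of digests

        def write(self, name, data):
            digests = []
            for i in range(0, len(data), BLOCK_SIZE):
                block = data[i:i + BLOCK_SIZE]
                digest = hashlib.sha256(block).digest()
                self.blocks.setdefault(digest, block)  # store only if never seen before
                digests.append(digest)
            self.files[name] = digests

        def read(self, name):
            return b"".join(self.blocks[d] for d in self.files[name])

    store = DedupStore()
    store.write("vm-a.img", b"\x00" * 4 * BLOCK_SIZE)
    store.write("vm-b.img", b"\x00" * 4 * BLOCK_SIZE)   # identical content
    print(len(store.blocks))   # 1 -- eight logical blocks, one physical copy

Writing two files with identical content stores the underlying block once; ZFS does the equivalent per block, keyed by the SHA256 checksum it already computes.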


Does that mean... (4, Funny)

Anonymous Coward | more than 4 years ago | (#29956668)

Duplicate slashdot articles will be links back to the original one?

Re:Does that mean... (1)

Shikaku (1129753) | more than 4 years ago | (#29956822)

Er, isn't block deduplication really, really bad from a hard drive block failure point of view? You'd have to compress or otherwise change the data to have a separate copy now, or it'd just be marked redundant; if the block that all those redundant references point to goes bad, all of those files are now bad.

Re:Does that mean... (3, Insightful)

ezzzD55J (697465) | more than 4 years ago | (#29956882)

The single block is still stored redundantly, of course. Just not redundantly more than once.

Re:Does that mean... (0)

Anonymous Coward | more than 4 years ago | (#29957006)

Doesn't matter in ZFS's case - if there's a single unrecoverable bad block anywhere in the filesystem, it becomes unusable. (To be fair, it's really good at recovering from bad blocks.)

Re:Does that mean... (2, Insightful)

hedwards (940851) | more than 4 years ago | (#29957436)

That requires a citation.

ZFS isn't that much different from traditional file systems. I'm not quite sure how that reconciles with the fact that it reports unrecoverable bits of information to you when it couldn't self-heal. If it really became unusable that easily, there'd be no point. Additionally, there isn't really much likelihood of that happening, considering that ZFS isn't really supposed to be used outside of a ZMIRROR or RAIDZ environment. Sure, you can do it, but most of the goodness comes from multiple disks.

Re:Does that mean... (2, Insightful)

Methlin (604355) | more than 4 years ago | (#29957036)

Er, isn't block deduplication really, really bad from a hard drive block failure point of view? You'd have to compress or otherwise change the data to have a separate copy now, or it'd just be marked redundant; if the block that all those redundant references point to goes bad, all of those files are now bad.

If you were concerned about block level failure or even just drive level failure, you wouldn't be running your ZFS pool without redundancy (mirror or raidz(2)).

This is good news... (1, Offtopic)

The Ancients (626689) | more than 4 years ago | (#29956676)

...and would normally make me happy; except I'm a Mac user. Still good news, but could've been better for a certain sub-set of the population, darn it.

File systems are one area where computer technology is lagging, comparatively speaking, so good to see innovation such as this.

Re:This is good news... (0)

Anonymous Coward | more than 4 years ago | (#29956730)

In the same boat. Had a Solaris x86 box around here at one time, years ago.

They must be reading my mind. I can't tell you how many times in the datacenter I wished this existed.

Re:This is good news... (4, Insightful)

bcmm (768152) | more than 4 years ago | (#29956770)

...and would normally make me happy; except I'm a Mac user. Still good news, but could've been better for a certain sub-set of the population, darn it.

Use open source, get cutting edge things.

Re:This is good news... (3, Funny)

jeffb (2.718) (1189693) | more than 4 years ago | (#29957162)

Use open source, get cutting edge things.

The last time I tried to build an Intel box for Linux work, I lost my grip on the cheap generic case, and sustained a cut that sent me to the emergency room. One of the things I like about my Mac is the lack of cutting edges.

Re:This is good news... (4, Funny)

Anonymous Coward | more than 4 years ago | (#29957248)

Shoulda gone with a blade server, then you wouldn't have had to worry about the emergency room.

Re:This is good news... (1)

MrCrassic (994046) | more than 4 years ago | (#29957298)

This is called doin it wrong! :)

Re:This is good news... (2, Informative)

Tynin (634655) | more than 4 years ago | (#29957714)

Not sure when you tried building it, but I build cheap computers for friends / family, at least 2 or 3 computers a year. Almost a decade ago... maybe really only 8 years ago, all cheapo generic cases stopped having razor sharp edges. I used to get cuts all the time, but cheap cases, at least in the realm of having sharp edges, haven't been an issue in a long time. (I purchase all my cheapo cases from newegg these days)

Re:This is good news... (1)

The Ancients (626689) | more than 4 years ago | (#29957240)

Use open source, get cutting edge things.

Cutting edge is nice for the functionality; unfortunately it more often than not comes with unintended functionality. I like standing back a bit - not too much mind you, but enough to avoid the bleeding edge.

Re:This is good news... (0)

Anonymous Coward | more than 4 years ago | (#29957338)

You surely meant: 'the bloody edge'.

damn that edge.

Open Source Cures Cancer (1, Insightful)

sjbe (173966) | more than 4 years ago | (#29957442)

Use open source, get cutting edge things.

Like cutting-edge CAD packages, games, financial management and office suites? Good thing we had you to tell us that open source will solve our every problem just by virtue of being open source. I'm sure every print shop is going to dump Photoshop for GIMP, every finance firm will dump Excel for OpenOffice Calc and every engineering firm will dump AutoCAD for... what exactly?

Maybe, just maybe open source isn't the answer for everything after all...

Re:This is good news... (1)

MBCook (132727) | more than 4 years ago | (#29956906)

It's neat. I can see it being rather useful for our systems at work to de-duplicate our VMs (and perhaps our DB files, since we have replicated slaves). Network storage (where multiple users may have their own copies of static documents that they've never edited) could benefit, perhaps email storage as well.

Personally though, I don't think there is too much on my hard drive that would benefit from this. I would love for OS X to get the built-in checksumming that ZFS has, so it can detect silent corruption that may have happened during a bad boot/power loss, etc., when I try to read the file later.

It's pretty obvious that HFS+ will have to be replaced soon, and Apple is reportedly working on it (since they ditched ZFS). I'd really like the checksumming; at this point (having so much cheap storage and extra CPU cycles) it should be a gimme.

This is the year of Solaris on the desktop (1)

jotaeleemeese (303437) | more than 4 years ago | (#29957132)

Where did I hear that one?

Re:This is good news... (4, Informative)

Trepidity (597) | more than 4 years ago | (#29957720)

If you're running a normal desktop or laptop, this isn't likely to be of great use in any case. There's non-negligible overhead in the deduplication process, and drive space at consumer-level sizes is dirt cheap, so it's only really worth doing if you have a lot of block-level duplicate data. That might be the case if, e.g., you have 30 VMs on the same machine, each with a separate install of the same OS, but it is unlikely to be the case on a normal Mac laptop.

First posts! (0)

Anonymous Coward | more than 4 years ago | (#29956686)

I wrote two first posts, but I guess /. is on ZFS now.

Re:First posts! (1)

BitZtream (692029) | more than 4 years ago | (#29956710)

Why did you write your first 'first post' to say that you wrote 'two' first posts? You must have, or they wouldn't be duplicate blocks, and wouldn't have been deduplicated.

Re:First posts! (0)

Anonymous Coward | more than 4 years ago | (#29956800)

Because I then proceeded to communicate to myself back in time, telling myself I must write it just like that and that the reason why would become obvious. I then replied to my future self that, given the title of the story, the reason why was already obvious, and my future self said "Oh yeah, now that you mention it, I remember thinking that".

ehem (0)

oldhack (1037484) | more than 4 years ago | (#29956712)

Before we get all excited and look all silly, can somebody confirm with Netcraft first?

Hash Collisions (2, Interesting)

UltimApe (991552) | more than 4 years ago | (#29956720)

Surely with high amounts of data (that ZFS is supposed to be able to handle), a hash collision may occur? I'm sure a block is > 256 bits. Do they just expect this never to happen?

Although I suppose they could just be using it as a way to narrow down candidates for deduplication... doing a final bit for bit check before deciding the data is the same.

Re:Hash Collisions (1)

Score Whore (32328) | more than 4 years ago | (#29956760)

Yeah. If you are concerned by the fact that a block might be 128 KB and the hashed value is only 256 bits, then an option like:

zfs set dedup=verify tank

Might be helpful.

Re:Hash Collisions (2, Interesting)

dotgain (630123) | more than 4 years ago | (#29957546)

Before the instruction you posted, I found this explanation in TFA:

An enormous amount of the world's commerce operates on this assumption, including your daily credit card transactions. However, if this makes you uneasy, that's OK: ZFS provides a 'verify' option that performs a full comparison of every incoming block with any alleged duplicate to ensure that they really are the same, and ZFS resolves the conflict if not. To enable this variant of dedup, just specify 'verify' instead of 'on':

I fail to see how someone can sit down and rationally decide whether their data will be more susceptible to hash collisions or not. While I would be very surprised if any two blocks on my computer hashed to the same value in spite of being different, it seems to me that someone's going to get hit by this sooner rather than later. And what a nasty way to find hash collisions! Who would have thought my Aunt's chocolate cake recipe had the same SHA1 as hello.jpg from goatse.cx!

On one hand, 2^256 is a damn big keyspace. I've heard people say a collision is about as likely as winning every lottery in the world simultaneously, and then doing it again next week. But give enough computers with enough blocks enough time, and find a SHA1 collision you will. Depending on what kind of data it happens to, you might not even notice it.

Re:Hash Collisions (1)

sgbett (739519) | more than 4 years ago | (#29957670)

Hey! If no-one will notice then it won't be a problem ;)

Re:Hash Collisions (1)

SLi (132609) | more than 4 years ago | (#29957796)

No. We're talking about such amounts of data that there's no conceivable way, now or in the near (1000-year) future, that such a collision would be found by accident, and even after that only on some supercomputer that is larger than Earth and is powered by its own sun. It's not going to happen by accident. The probabilities are just so much against it, given any conceivable amount of data, and there are elementary limits from physics that cannot be surpassed. Moore's law will stop working sooner or later, and then humanity will not be much closer to finding an SHA-256 collision by accident.

The only realistic way you're going to have a hash collision is malice (or perhaps fate or divine intervention, if you believe in such). That's not anywhere near realistic now, but if a significant weakness were found in SHA-256, it could become a possibility one day (and judging from history, I'd say it's probable it will be broken sooner or later). An attacker who can store a file on your filesystem could then replace your precious data with crafted data that has the same hash.

Some other smaller attack vectors come to mind though, depending on how it's implemented. If the deduplication shows on filesystem usage, an attacker could use it to check if you have a certain block of data on the filesystem (in a file inaccessible to him). For example.

Re:Hash Collisions (1)

Shikaku (1129753) | more than 4 years ago | (#29956786)

If blocks that are supposedly from different files have the same block data, does it really matter if it's marked redundant?

Not only that, do you really think a SHA256 hash collision can occur? And even if it does: for the sake of CPU time, a hash table is used as a quick check, rather than comparing every piece of data to be written against all the data already on disk. If two blocks somehow have the same hash, the data SHOULD be checked byte by byte to see if it really is the same, THEN marked redundant.

Re:Hash Collisions (2, Funny)

icebike (68054) | more than 4 years ago | (#29956910)

If blocks that are supposedly from different files have the same block data, does it really matter if it's marked redundant?

I think the hash collision people are worrying about is when two blocks/files/byte-ranges hash to the same value but in fact differ.

When that happens, your PowerPoint presentation contains your boss's bedroom-cam shots.

Re:Hash Collisions (2, Informative)

Rising Ape (1620461) | more than 4 years ago | (#29956856)

The probability of a hash collision for a 256 bit hash (or even a 128 bit one) is negligible.

How negligible? Well, the probability of a collision is never more than N^2 / 2^h, where N is the number of blocks stored and h is the number of bits in the hash. So, if we have 2^64 blocks stored (a mere billion terabytes or so for 128-byte blocks), the probability of a collision is less than 2^(-128), or about 10^(-38). Hardly worth worrying about.

And that's an upper limit, not the actual value.
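For anyone who wants to check the parent's arithmetic, the bound is easy to evaluate exactly (a quick back-of-the-envelope script, nothing to do with ZFS itself):

    from fractions import Fraction

    N = 2 ** 64                      # blocks stored, as in the example above
    h = 256                          # bits in the hash (SHA256)
    bound = Fraction(N * N, 2 ** h)  # collision probability is never more than N^2 / 2^h
    print(float(bound))              # ~2.9e-39, i.e. comfortably below 10^-38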

Re:Hash Collisions (2, Funny)

pclminion (145572) | more than 4 years ago | (#29956866)

Suppose you can tolerate a chance of collision of 10^-18 per-block. Given a 256-bit hash, it would take 4.8e29 blocks to achieve this collision probability. Supposing a block size of 512 bytes, that's 223517417907714843750 terabytes.

Now, supposing you have a 223517417907714843750 terabyte drive, and you can NOT tolerate a collision probability of 10^-18, then you can just do a bit-for-bit check of the colliding blocks before deciding if they are identical or not.
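The parent's figures can be reproduced with the usual birthday approximation P ~ N^2 / 2^(h+1), solved for N (a sketch only; the 10^-18 is read as a whole-filesystem budget, per the correction in the follow-up comment below):

    import math

    p = 1e-18                                 # tolerated collision probability (whole filesystem)
    h = 256                                   # hash bits
    blocks = math.sqrt(2 * p) * 2 ** (h / 2)  # invert p ~ N^2 / 2^(h+1)
    terabytes = blocks * 512 / 2 ** 40        # 512-byte blocks, as assumed above
    print(f"{blocks:.1e} blocks, about {terabytes:.3e} TB")   # ~4.8e+29 blocks, ~2.235e+20 TB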

Re:Hash Collisions (2, Interesting)

pclminion (145572) | more than 4 years ago | (#29956894)

Oops. I didn't mean 10^-18 per-block, I meant 10^-18 for the entire filesystem. (Obviously it doesn't make sense the other way)

Re:Hash Collisions (4, Informative)

shutdown -p now (807394) | more than 4 years ago | (#29956960)

Before I left Acronis, I was the lead developer and designer for deduplication in Acronis Backup & Recovery 10 [acronis.com] . We also used SHA256 there, and naturally the possibility of a hash collision was investigated. After we did the math, it turned out that you're about 10^6 times more likely to lose data because of hardware failure (even considering RAID) than you are to lose it because of a hash collision.

Re:Hash Collisions (1)

buchner.johannes (1139593) | more than 4 years ago | (#29957354)

I have an idea for an attack vector.

Say File A is one block big. File A is publicly available on the server, not writable by users. Eve produces a SHA256 hash collision of file A and stores this file B in ~. Someone wants to retrieve file A but gets file B (e.g. like evilize exe [mscs.dal.ca] for MD5).
Alternatively, if the oldest file is always kept, Eve has to know the next version of the file.

Given big blocks, and time until cryptanalysis of SHA256 reaches the state it's at with MD5, why not?

Re:Hash Collisions (1)

hedwards (940851) | more than 4 years ago | (#29957506)

If I'm not mistaken, that would be a waste of time. Ultimately, in most cases you're looking to get a file executed, in which case you don't really need a collision, you just need some other exploit. If you do need to get a particular file retrieved, there are better ways of doing that as well.

Re:Hash Collisions (1)

TheRaven64 (641858) | more than 4 years ago | (#29957628)

Yes, it's a valid attack once you can generate hash collisions for SHA256, in the same way that 'sit between two parties and decrypt their communication' is a valid attack on RSA once you can factorise the product of two primes quickly. Currently, the best known attack on SHA256 is not feasible (and won't be for a very long time if computers only follow Moore's law).

Re:Hash Collisions (1)

shutdown -p now (807394) | more than 4 years ago | (#29957706)

Say File A is one block big. File A is publicly available on the server, not writable by users. Eve produces a SHA256 hash collision of file A

The whole point of a cryptographic hash function [wikipedia.org] is that you're not supposed to be able to produce input matching a given hash value other than by brute force - that is, 2^N evaluations, where N is the digest size in bits. That's the ideal; in practice the number of evaluations can be reduced, and this is also the case for SHA256 [iacr.org], but for this particular scenario (finding a message corresponding to a known hash, rather than just any two messages that collide on a random hash), it is still way beyond the number that is practical for a successful real-world attack.

Re:Hash Collisions (1)

SLi (132609) | more than 4 years ago | (#29957846)

But then you could just use your magic SHA-256 breaking skillz to divert bank transactions and many outright vital things in commerce and communications, so it seems to me that replacing the contents of a file on some file system would be petty crime compared to that.

Re:Hash Collisions (1)

Just Some Guy (3352) | more than 4 years ago | (#29957334)

Surely with high amounts of data (that ZFS is supposed to be able to handle), a hash collision may occur?

The birthday paradox says you'd have to look at about 2^(n/2) candidates, on average, before any two of them collide for a given n-bit hash. In this case, that means you'd have to look at about 2^128 blocks before expecting to see a collision at all.

On my home server, the default block size is 128KB. With a terabyte drive, that gives about 8.4 million blocks.

GmPy says the likelihood of an event with probability 1/(2^128) not happening 8.4 million times (well, 1024^4/(128*1024) times) in a row is 0.99999999999999999999999999999997534809671184338108088348233. In other words, that's how likely you are to fill a 1TB drive with 128KB blocks without a single hash collision.

I can live with that.
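The same figure can be reproduced with nothing but the standard library's decimal module; this simply reruns the commenter's arithmetic (8,388,608 blocks against a 1-in-2^128 per-block chance) and is not a statement about ZFS internals:

    from decimal import Decimal, getcontext

    getcontext().prec = 60                 # enough precision to see the interesting digits
    blocks = (1024 ** 4) // (128 * 1024)   # a 1TB drive in 128KB blocks = 8,388,608
    p = Decimal(2) ** -128                 # per-block collision chance used above
    print((1 - p) ** blocks)               # 0.999999999999999999999999999999975348...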

Re:Hash Collisions (0)

Anonymous Coward | more than 4 years ago | (#29957452)

Apparently you suck at math as much as in life.

Re:Hash Collisions (1)

Junta (36770) | more than 4 years ago | (#29957876)

They have the 'verify' mode to do what you prescribe, though I'm presuming it comes with a hefty performance penalty.

I have no idea if they do this up front, inducing latency on all write operations, or as it goes.

What I would like to see is a strategy where it does the hash calculation, writes the block to a new part of the disk assuming it is unique, records the block's location as unverified in a hash table, and schedules a dedupe scan if one is not already pending. Then a very low-priority I/O task could scan that structure for block locations that have yet to be verified, compare all the blocks that match a given hash for sameness, and update the structures to retroactively collapse them into a single copy (effectively unlinking a block deemed duplicate after the fact). That gives the absolute hard guarantee of sameness without a write-time performance penalty.

I'm very far from a filesystem designer, and I recognize the likelihood of a collision given sufficiently large block size is low, but I'd really be wary of something that relies on not having bad luck to accidentally lose data on a write due to an unlikely hash collision.
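A rough sketch of the bookkeeping the parent is proposing (hypothetical Python, illustrating the write-now-verify-later idea only; it is not how ZFS's inline dedup or its 'verify' option is implemented):

    import hashlib
    from collections import defaultdict

    class LazyDedup:
        """Write every block immediately; a low-priority pass later collapses
        blocks whose hashes match and whose bytes verify as identical."""
        def __init__(self):
            self.store = {}                    # block id -> bytes
            self.by_hash = defaultdict(list)   # digest -> block ids awaiting verification
            self.next_id = 0

        def write(self, data):
            # Fast path: hash and store, no comparisons, so no extra write latency.
            digest = hashlib.sha256(data).digest()
            bid = self.next_id
            self.next_id += 1
            self.store[bid] = data
            self.by_hash[digest].append(bid)
            return bid

        def dedupe_pass(self):
            # Background scan: byte-compare blocks sharing a digest, then unify.
            for digest, ids in self.by_hash.items():
                keeper, survivors = ids[0], [ids[0]]
                for bid in ids[1:]:
                    if self.store[bid] == self.store[keeper]:
                        self.store[bid] = self.store[keeper]   # share one copy (a real FS would remap and refcount)
                    else:
                        survivors.append(bid)                  # genuine collision: keep both blocks
                self.by_hash[digest] = survivors

The trade-off, as the parent notes, is that duplicate blocks temporarily occupy space between passes, in exchange for keeping the byte-for-byte guarantee off the write path.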

Any other file systems with that feature? (2)

Dwedit (232252) | more than 4 years ago | (#29956740)

Are there any other filesystems with that feature? If not, I'm very strongly considering writing my own.

Re:Any other file systems with that feature? (1)

mrmeval (662166) | more than 4 years ago | (#29956782)

While you're at it write one in assembler as a replacement for the Apple II and 1541 so us retrogeeks can store MORE on a floppy. ;)

I know of all the compression schemes but this block level stuff is fascinating.

Re:Any other file systems with that feature? (1)

Korin43 (881732) | more than 4 years ago | (#29957608)

Wouldn't compression do this? I've never written a program involving compression, but it seems like the first thing you'd look for is two places that have the same data, and then you could just store them as references to the original data.

Re:Any other file systems with that feature? (5, Informative)

iMaple (769378) | more than 4 years ago | (#29956802)

Windows Storage Server 2003 (yes, yes, I know it's from Microsoft) shipped with this feature (which is called Single Instance Storage)
http://blogs.technet.com/josebda/archive/2008/01/02/the-basics-of-single-instance-storage-sis-in-wss-2003-r2-and-wudss-2003.a [technet.com]

Re:Any other file systems with that feature? (4, Informative)

buchner.johannes (1139593) | more than 4 years ago | (#29957456)

From that link: it is file-based and a service indexes it (whereas in ZFS it is block-based and on-the-fly). And they first introduced it in Windows Server 2000. Amazing. I'm sure it is an ugly hack, since Windows has no soft/hard links IIRC.

Re:Any other file systems with that feature? (0)

Anonymous Coward | more than 4 years ago | (#29957694)

Windows Storage Server 2003 (yes, yes, I know it's from Microsoft) shipped with this feature (which is called Single Instance Storage)
http://blogs.technet.com/josebda/archive/2008/01/02/the-basics-of-single-instance-storage-sis-in-wss-2003-r2-and-wudss-2003.a [technet.com]

Not quite. From the above link it works at the file level:

The files don’t need to be on the same folder, have the same name or have the same date, but they do need to be in the same volume, have exactly the same size and the contents of both need to be exactly the same.

ZFS' dedupe (and similar technologies like NetApp's A-SIS) works at the block level. From one of the leads of ZFS:

Data can be deduplicated at the level of files, blocks, or bytes.

File-level assigns a hash signature to an entire file. File-level dedup has the lowest overhead when the natural granularity of data duplication is whole files, but it also has significant limitations: any change to any block in the file requires recomputing the checksum of the whole file, which means that if even one block changes, any space savings is lost because the two versions of the file are no longer identical. This is fine when the expected workload is something like JPEG or MPEG files, but is completely ineffective when managing things like virtual machine images, which are mostly identical but differ in a few blocks.

Block-level dedup has somewhat higher overhead than file-level dedup when whole files are duplicated, but unlike file-level dedup, it handles block-level data such as virtual machine images extremely well. Most of a VM image is duplicated data -- namely, a copy of the guest operating system -- but some blocks are unique to each VM. With block-level dedup, only the blocks that are unique to each VM consume additional storage space. All other blocks are shared. [...]

ZFS provides block-level deduplication because this is the finest granularity that makes sense for a general-purpose storage system.

http://blogs.sun.com/bonwick/en_US/entry/zfs_dedup
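A toy comparison of the two granularities described above (illustrative Python only; the 4 KB block size, the helper names and the zero-filled "image" are made up for the example):

    import hashlib

    BLOCK = 4096

    def file_level_bytes(files):
        # File-level dedup: keep one copy per distinct whole-file hash.
        unique = {hashlib.sha256(f).digest(): f for f in files}
        return sum(len(f) for f in unique.values())

    def block_level_bytes(files):
        # Block-level dedup: keep one copy per distinct block hash.
        unique = {hashlib.sha256(f[i:i + BLOCK]).digest(): f[i:i + BLOCK]
                  for f in files for i in range(0, len(f), BLOCK)}
        return sum(len(b) for b in unique.values())

    base = bytes(64 * BLOCK)                                  # stand-in for a guest OS image
    vm_a = base
    vm_b = base[:BLOCK] + b"\x01" * BLOCK + base[2 * BLOCK:]  # identical except one block

    print(file_level_bytes([vm_a, vm_b]))    # 524288 -- one changed block, all savings lost
    print(block_level_bytes([vm_a, vm_b]))   # 8192   -- only the changed block costs extra

Which is exactly the virtual-machine-image argument quoted above.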

Re:Any other file systems with that feature? (1)

jack2000 (1178961) | more than 4 years ago | (#29956810)

Meet NTFS, it has this thing named SiS, for Single Instance Storage. There's a service known as the SiS groveller: it scans your files and links them if they are duplicates, and it does that for parts of your files as well.

Re:Any other file systems with that feature? (2, Informative)

hapalibashi (1104507) | more than 4 years ago | (#29956982)

Yes, Venti. I believe it originated in Plan9 from Bell Labs.

Re:Any other file systems with that feature? (0)

Anonymous Coward | more than 4 years ago | (#29956984)

Plan 9 pioneered block-level deduplication in its backup filesystem.

Re:Any other file systems with that feature? (2, Interesting)

ZerdZerd (1250080) | more than 4 years ago | (#29957078)

I hope btrfs will get it. Or else you will have to add it :)

Re:Any other file systems with that feature? (2, Interesting)

TheSpoom (715771) | more than 4 years ago | (#29957388)

What I'm wondering about all of this is what happens when you edit one of the files? Does it "reduplicate" them? And if so, isn't that inefficient in terms of the time needed to update a large file (in that it would need to recopy the file over to another section of the disk in order to maintain the fact that there are two now-different copies)?

Re:Any other file systems with that feature? (3, Informative)

hedwards (940851) | more than 4 years ago | (#29957554)

ZFS is a copy-on-write filesystem; it already creates a temporary second copy so that the file system is always consistent, if not quite up to date. I'd venture to guess that the new version of the file, not being identical to the old file, would just be treated like a copy to a new name.

Re:Any other file systems with that feature? (1)

PRMan (959735) | more than 4 years ago | (#29957658)

And worse: what happens when you go through a set of files A and change a single IP address in each of them, defeating the deduplication, while filesets B & C still point to the same set? Now you have just increased your disk space usage by 200% while not increasing the "size" of the files at all.

This will be extremely counter-intuitive when you run out of disk space by globally changing "192.168.1.1" to "192.168.1.2" in a huge set of files.

Re:Any other file systems with that feature? (1)

TheRaven64 (641858) | more than 4 years ago | (#29957660)

ZFS is copy on write, so every time you write a block it generates a new copy and then decrements the reference count of the old copy. The 'reduplication' doesn't require any additional support; it will work automatically. Of course, you also want to check if the new block can be deduplicated...

More reason to be a ZFS fanboy (3, Insightful)

BitZtream (692029) | more than 4 years ago | (#29956752)

I'm wondering how long it's going to take for them to do something with ZFS that actually makes me slow down my overwhelming ZFS fanboyism.

I just love these guys.

My virtual machine NFS server is going to have to get this as soon as FBSD imports it, and I'll no longer have to worry about having backup software (like BackupPC, good stuff btw) that does this.

I don't use high-end SANs, but it seems to me that they are rapidly losing any particular advantage over a Solaris or FBSD file server.

Re:More reason to be a ZFS fanboy (3, Informative)

HockeyPuck (141947) | more than 4 years ago | (#29957016)

The advantages of SANs are easy to realize, and they need not necessarily be a FibreChannel vs NAS (NFS/CIFS) argument, as a SAN could be iSCSI, FCoE, FCIP, FICON, etc.

-Storage Consolidation compared with internal disk.
-Fewer components in your servers that can break.
-Server admins don't have to focus on Storage except at the VolMgr/Filesystem level
-Higher Utilization (a WebServer might not need 500GB of internal disk).
-Offloading storage-based functions (RAID in the array vs RAID on your server's CPU; I'd rather the CPU perform application work than calculate parity, replace failed disks, etc.). The benefit increases when you want to replicate to a DR site.

This is not a ZFS vs SANs argument. I think ZFS running on SAN based storage is a great idea as ZFS replaces/combines two applications that are already on the host (volmgr & filesystem).

Re:More reason to be a ZFS fanboy (1)

phoenix_rizzen (256998) | more than 4 years ago | (#29957386)

Or, use ZFS to create a SAN for your other servers. Just create a ZVol, and share it out via iSCSI. On Solaris, it's as simple as setting shareiscsi for the dataset. On FreeBSD, you have to install an iSCSI target (there are a handful available in the ports tree) and configure it to share out the ZVol.

Re:More reason to be a ZFS fanboy (5, Informative)

Anonymous Coward | more than 4 years ago | (#29957072)

How about this: you can't remove a top-level vdev without destroying your storage pool. That means that if you accidentally use the "zpool add" command instead of "zpool attach" to add a new disk to a mirror, you are in a world of hurt.

How about this: after years of ZFS being around, you still can't add or remove disks from a RAID-Z.

How about this: If you have a mirror between two devices of different sizes, and you remove the smaller one, you won't be able to add it back. The vdev will autoexpand to fill the larger disk, even if no data is actually written, and the disk that was just a moment ago part of the mirror is now "too small".

How about this: the whole system was designed with the implicit assumption that your storage needs would only ever grow, with the result that in nearly all cases it's impossible to ever scale a ZFS pool down.

Re:More reason to be a ZFS fanboy (4, Informative)

Methlin (604355) | more than 4 years ago | (#29957380)

Mod parent up. These are all legit deficiencies in ZFS that really need to be fixed at some point. Currently the only solution to these is to build a new storage pool, either on the same system or a different system, and export/import; a big PITA and potentially expensive. Off the top of my head I can't think of anyone that lets you do #2 except enterprise storage solutions and Drobo.

Re:More reason to be a ZFS fanboy (0)

Anonymous Coward | more than 4 years ago | (#29957400)

You make some good points about ZFS annoyances.

I've seen some recent activity around the first limitation you mention (i.e. you can't remove a top-level vdev), so hopefully we'll see a fix soon.

You may have missed that there's now a ZFS property you can set to control whether pools automatically expand into free space. Note that previously autoexpansion could only happen if you gave ZFS entire disks without partitions.

Re:More reason to be a ZFS fanboy (1)

Just Some Guy (3352) | more than 4 years ago | (#29957458)

What do you know - you and I actually agree on something. Yeah, FreeBSD + ZFS is a complete win for pretty much everything involving file transfer. I honestly can't think of a single thing I don't like about it. The instant FreeBSD imports this, I'm swapping in a quad-core CPU to give it as much crunching power as it wants to do its thing.

well ... (1)

wsanders (114993) | more than 4 years ago | (#29957690)

There are enough tales of woe in the discussion groups about ZFS file systems that have melted down on people that I would not start shorting the midrange storage companies' stock just yet. I myself have an 18TB ZFS filesystem on an X4540, and it was brought to a standstill a few weeks ago by one dead SATA disk. Didn't lose any data, and it might be buggy hardware and drivers, but still, Sun support had no explanation. That should not happen!

I'm still a ZFS fanboy though - for about $1 per GB, how can you lose? The host is a backup / virtual tape library server, so it's not super high availability, and it's hella fast. No problem stuffing data into it at 2 x 1000baseT wire speed.

SAN, ZFS with dedupe is not a backup system (1)

caseih (160668) | more than 4 years ago | (#29957780)

Don't mistake in-filesystem deduplication and snapshots for a backup system. It's most certainly not backup and if you treat it as such you will eventually be very sorry. A SAN with ZFS, snapshots, and deduplication features is at best an archive, which is distinct in form and purpose from a backup. Still very useful, though. Ideally you have both archive and backup systems. To get a feel for the difference, consider that an archive is for when a user says, "I overwrote a file last week sometime. Can you recover the version before I made this change or saved over this file?" Whereas a backup is for recovering an entire system from when there's a catastrophic failure (like a SAN dying). Very distinct things. Both are useful.

I get strange looks when I tell people that a Time Capsule is not a backup. Nor is a single Time Machine external disk. Now 2, 3 or even 4 external disks could constitute a backup (and as a bonus with Time Machine an archive also).

Dupe dedupe de dupe dupe! (0)

dangitman (862676) | more than 4 years ago | (#29956764)

Dee dupe de dupe!

Drey dupe de drupes!

Dey dook dour dobbs!

Dey took Lou Dobbs!

Dey drook our jobs!

They took our jobs!

Signed,

Slashdot editors

Next home server will be OpenSolaris (or fBSD) (2, Insightful)

0100010001010011 (652467) | more than 4 years ago | (#29956784)

ZFS, from what I can tell, kicks ass. I've played around with it in virtual machines, taking drives off line, recreating them, adding drives, etc.

When I search NewEgg I also search OpenSolaris' compatibility list.

The two areas where Linux is playing catch-up are filesystems (like this) and sound (OSS, Pulse, ALSA, oh my!). And before you go pointing out the btrfs project: this has been in servers for years. It's proven in an enterprise environment. Your file system is still in beta with a huge "Don't use this for important stuff" warning.

Re:Next home server will be OpenSolaris (or fBSD) (2, Funny)

buchner.johannes (1139593) | more than 4 years ago | (#29957526)

Oh yeah? Well tux is cuter so I'm not switching.

Re:Next home server will be OpenSolaris (or fBSD) (2, Interesting)

buchner.johannes (1139593) | more than 4 years ago | (#29957536)

I'm sure btrfs -- once fully implemented and tested -- will also have problems reaching the performance of reiser4.

Re:Next home server will be OpenSolaris (or fBSD) (0)

Anonymous Coward | more than 4 years ago | (#29957878)

You can download Sun's prebuilt storage appliance VM here [sun.com] .

It gives you a free GUI storage appliance wrapper around OpenSolaris and ZFS, so you can start using the features without being an expert in either (just like NetApp with BSD and WAFL). You can replace the virtual disks with real ones if you want to store serious data.

Another Lawsuit? (1)

yukonbob (410399) | more than 4 years ago | (#29956788)

Considering [netapp.com] what's going on between NetApp and Sun currently, I wonder what they'll think [netapp.com] of this?

-yb

Wake me when they build it into the hard disk (4, Interesting)

icebike (68054) | more than 4 years ago | (#29956796)

Imagine the amount of stuff you could (unreliably) store on a hard disk if massive de-duplication was built into the drive electronics. It could even do this quietly in the background.

I say unreliably, because years ago we had a Novell server that used an automated compression scheme. Eventually, the drive got full anyway, and we had to migrate to a larger disk.

But since the copy operation de-compressed files on the fly, we couldn't copy because any attempt to reference several large compressed files instantly consumed all remaining space on the drive. What ensued was a nightmare of copying and deleting files, beginning with the smallest and working our way up to the largest. It took over a day of manual effort before we freed up enough space to mass-move the remaining files.

De-duplication is pretty much the same thing: compression by recording and eliminating duplicates. But any minor automated update of some files runs the risk of changing them such that what was a duplicate must now be stored separately.

This could trigger a similar situation where there was suddenly not enough room to store the same amount of data that was already on the device. (For some values of "suddenly" and "already").

For archival stuff or OS components (executables, and source code etc) which virtually never change this would be great.

But there is a hell to pay somewhere down the road.

Re:Wake me when they build it into the hard disk (1)

Shikaku (1129753) | more than 4 years ago | (#29956896)

That's actually very easy to explain, and ZFS could have a very similar situation:

Say you have on your hard drive these two files, each of which in reality is 1GB worth of data (the space separates the two files):

ABCDABCD ABCDABCD

Every letter has equal weight, so those two files are stored .5GB without compression. Let's change it a little bit:

AeBCDABfCD ABCgDABChD

efgh are 1 byte.

You now have 2GB worth of space taken :) that's a gotcha if I ever saw one.

Re:Wake me when they build it into the hard disk (1)

Shikaku (1129753) | more than 4 years ago | (#29956928)

Oh, I guess I should mention the blocks in my case are stupidly large, and the point is data insertion/shifting can cause sudden increases in size with block level deduplication.
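The parent's ABCD example can be run directly; this tiny experiment (illustrative Python with an absurdly small block size, not measured ZFS behaviour) shows how a single inserted byte shifts every block boundary and defeats fixed-block dedup:

    import hashlib

    BLOCK = 4   # deliberately tiny, in the spirit of the "stupidly large" letters above

    def unique_blocks(data):
        return {hashlib.sha256(data[i:i + BLOCK]).digest()
                for i in range(0, len(data), BLOCK)}

    original = b"ABCDABCDABCDABCD"    # every 4-byte block is "ABCD"
    modified = b"e" + original        # one byte inserted at the front

    print(len(unique_blocks(original)))                        # 1
    print(len(unique_blocks(modified)))                        # 3 ("eABC", "DABC", "D")
    print(unique_blocks(original) & unique_blocks(modified))   # set() -- nothing dedupes any more

Appending to a file, by contrast, leaves the earlier blocks aligned and still deduplicable, which is the point made further down the thread about adding data to the end of an identical file.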

Re:Wake me when they build it into the hard disk (-1, Troll)

Anonymous Coward | more than 4 years ago | (#29957570)

you fail ..

Re:Wake me when they build it into the hard disk (1)

dgatwood (11270) | more than 4 years ago | (#29957012)

That's just classic bad design. There's no reason for the decompressed files to exist on disk at all just to decompress them. The software should have decompressed to RAM on the fly instead of storing the decompressed files as temp files on the hard drive. It's all probably because they made a poor attempt at shoehorning compression into a VFS layer that was too block-centric. Classic bad design all around.

Re:Wake me when they build it into the hard disk (3, Interesting)

icebike (68054) | more than 4 years ago | (#29957208)

Bad design on Novell's part, but the problem persists in the de-duplicated world, where de-duplicating to memory only is not a solution.

Imagine a hundred very large files containing largely the same content. Now imagine CHANGING just a few characters in each file via some automated process. Now 100 files which were actually stored as ONE file balloon to 100 large files.

On a drive that was already full, changing just a few characters (not adding any total content) could cause a disk full error.

You really can't fake what you don't have. You either have enough disk to store all of your data or you run the risk of hindsight telling you it was a really bad design.

Re:Wake me when they build it into the hard disk (3, Informative)

ArsonSmith (13997) | more than 4 years ago | (#29957392)

No, you'd still have it stored at the size of one file + 100 blocks. You'd need a substantially large number of random changes throughout all 100 files to balloon up from 1x file size to 100x file size.

Re:Wake me when they build it into the hard disk (1)

dgatwood (11270) | more than 4 years ago | (#29957394)

True, but that's going to fail when you change the very first file, and one would hope that the process would go no further.

Re:Wake me when they build it into the hard disk (1)

geniusj (140174) | more than 4 years ago | (#29957398)

ZFS dedupe is block level. This would be a problem, however, in file-level dedupe schemes.

Re:Wake me when they build it into the hard disk (0)

Anonymous Coward | more than 4 years ago | (#29957074)

True, though commodity grade hard drives are so inexpensive these days that the cost of providing a generously larger amount of them than what you plan to store is usually not a big deal.
The only really expensive drives today are high end enterprise type SAS / SAN / SCSI units or FLASH based ones. If you're storing media like digitized video, the benefits of dedup are usually insignificant since you're unlikely to accidentally / routinely have duplicated data at anything less than the file level, and at the file level you'd probably have easy options not to duplicate that content by design if so desired.

The REAL "wake me up when they integrate it into the drive itself" list for me is:
* drive integrated mirroring at a head/platter level with the functions of different platters being independent enough so you could still stand a good chance of reading functional ones even after one head/platter is damaged.

* drive integrated gigabit / 10GbE Ethernet interfaces for commodity drives and an iSCSI protocol over IPv6.

* drive integrated ECC and spatial data striping at user selectable and much higher than default levels so that even a single drive could give you much better data reliability / redundancy across platters.

* drives with integrated encryption being the norm

* drives with built in ZFS / NAS and the ability to link to each other over e.g. PCIE / infiniband so you could set up small clusters of RAIDZ'd drives just with a few cables and inexpensive drives.

Re:Wake me when they build it into the hard disk (1)

Znork (31774) | more than 4 years ago | (#29957336)

But there is a hell to pay somewhere down the road.

I'd certainly expect that. I don't quite get what people are so desperate to de-duplicate anyway. A stripped VM OS image is less than a gigabyte; you can fit 150 of them on a drive that costs less than $100. You'd have to have vast ranges of perfectly synchronized virtual machines before you'd have made back even the cost of the time spent listening to the sales pitch.

I can't really see many situations where the extra complexity and cost would end up actually saving money. The few I can see are where somebody's been tricked into buying such excruciatingly expensive SAN storage that they can barely afford to store anything on it any more, or where their storage is a complete mess and they can't use more intelligent means of not storing the same thing many times (snapshots, shared file systems, overlay devices, etc.). In those cases it seems there would be more to gain by solving the actual problem than tacking another patch onto the stack. Storage, for most purposes, is dirt cheap today.

Re:Wake me when they build it into the hard disk (1)

icebike (68054) | more than 4 years ago | (#29957512)

>I can't really see many situations where the extra complexity and cost would end up actually saving money.

I could see it for write-only media.
With the proper byte-range selection, you could probably find enough duplicate blocks in just about anything to greatly expand capacity.

Re:Wake me when they build it into the hard disk (1)

PRMan (959735) | more than 4 years ago | (#29957752)

It would be great for ISPs, where each of their user instances has files in common. Also for a backup drive for user PCs, where each user has the OS and probably a lot of documents in common.

Re:Wake me when they build it into the hard disk (1)

perzquixle (213538) | more than 4 years ago | (#29957448)

This doesn't apply to ZFS due to the way it uses drives. All drives are added to a storage pool, and drives are used as needed based on speed and reliability requirements. So to upgrade, you'd just add a new drive to the pool, mark the old drive for removal, wait as it moves the blocks to any other drive(s) in the pool, then remove the old drive.

Re:Wake me when they build it into the hard disk (1)

icebike (68054) | more than 4 years ago | (#29957612)

You could STILL be stuck with a transaction in mid-flight when you exhaust your storage because what was one block replicated hundreds of times now becomes hundreds of blocks exhausting all storage.

The ease with which you can add storage only makes it somewhat more palatable. It doesn't hand-wave the problem away.

Sooner or later you have to upgrade storage on almost every platform. The problem with a platform that uses compression or de-duplication to store more than can really fit on its drives is that you can SUDDENLY run out of storage due to seemingly innocuous tasks. No steadily falling free-disk space to warn you ahead of time.

Re:Wake me when they build it into the hard disk (1)

c6gunner (950153) | more than 4 years ago | (#29957518)

This could trigger a similar situation where there was suddenly not enough room to store the same amount of data that was already on the device. (For some values of "suddenly" and "already").

Yes, but what's the likelihood of that occurring? We're talking about block-level deduplication here. If you have two identical files and you add a bit to the end of one, you're not creating a duplicate file - you're just adding a few blocks while still referencing the original de-dupped file. Now, if you were doing file-level deduplication it might be an issue, but this way ... I can't see it ever being a problem unless your array is already at 99.9% capacity (and that's just a bad idea in general).

Re:Wake me when they build it into the hard disk (1, Informative)

Anonymous Coward | more than 4 years ago | (#29957810)

I say unreliably, because years ago we had a Novell server that used an automated compression scheme. Eventually, the drive got full anyway, and we had to migrate to a larger disk.

But since the copy operation de-compressed files on the fly, we couldn't copy because any attempt to reference several large compressed files instantly consumed all remaining space on the drive. What ensued was a nightmare of copying and deleting files, beginning with the smallest and working our way up to the largest. It took over a day of manual effort before we freed up enough space to mass-move the remaining files.

This is because you didn't use NetWare's tools to copy the files - the command line NCOPY, for example, with /Ror and /RU (available when file compression was introduced with NetWare 4) would have copied the files in their compressed format, avoiding this (Link: http://support.novell.com/techcenter/articles/ana19940603.html [novell.com] ). Using the Novell Client for Windows, I'd imagine that its Explorer shell integration would give you GUI tools, too, though I no longer have a NetWare server to verify this, and always preferred the command line anyway :).

No offense, but the scenario you describe is the result of ignorance, not poor design.

Building it in makes no sense (1)

saleenS281 (859657) | more than 4 years ago | (#29957856)

First, why would you want it built into a hard drive? Your deduplication ratio would then be limited to what you can store on one drive. The drive would have no way to reference blocks on other drives in the same system. Doing it in software allows you to reference (in this case) all data within the entire zpool. That could be petabytes of storage (theoretically it could be far more, but that's probably the realistic limit today due to hardware/performance constraints).

As for your "hell to pay later", that's not true for two reasons. First, there is no "modify in place". All data is allocated from new blocks; that's how a copy-on-write filesystem works. If it's "updated", you'd be allocating new blocks. If you're concerned with filling a pool up completely, you can put quotas in place to prevent it.

Second, if you "run out of space", you just add new drives to the raid group and continue on your merry way. You can grow a zpool on the fly.

Nice, but can it ... (0)

Anonymous Coward | more than 4 years ago | (#29957026)

... strategically populate the available space with duplicates of commonly read blocks, for increased fault tolerance and performance?

Re:Nice, but can it ... (1)

Per Wigren (5315) | more than 4 years ago | (#29957418)

... strategically populate the available space with duplicates of commonly read blocks, for increased fault tolerance and performance?

yes, it can [sun.com] .

Re:Nice, but can it ... (0)

Anonymous Coward | more than 4 years ago | (#29957802)

The GP's observation was that the so-called "free" space can be harnessed in such a manner that a storage pool needn't have any "unused" or empty blocks at all. /Caching/ is not the answer here.

What's the point? (1)

Mask (87752) | more than 4 years ago | (#29957268)

The amount of resources it reportedly takes makes this not so practical.

What would one want deduplication for? The cost of disk storage has two big elements - speed (latency & throughput) and backup.

It does not seem that this technology would help much in the speed department; it might actually hurt. Managing copy-on-write has several potential costs. It may help backup if the backup program knows the fine details of deduplication, but that means that old backup software will have to be replaced.

It reminds me of the compressed file system I used to have on my old SLS Linux PC, which had a small disk (1992, if memory serves me right). It was dog slow to run X11 on it. I have not seen a compressed file system since; there was no need. Disk storage grows much faster than my need for data.

Re:What's the point? (1)

myowntrueself (607117) | more than 4 years ago | (#29957790)

It reminds me of the compressed file system I used to have on my old SLS Linux PC, which had a small disk (1992, if memory serves me right).

Soft Landings from DOS bailouts!!! Yaaay!

I had a Windows 3.x PC on which I was coding some simple turbo pascal stuff to do pretty graphics.

This Windows PC didn't have a lot of disk so I was using Stacker (or some such disk compression thing).

One time one of my programs crashed. Just a simple graphics thing, but it crashed the PC, had to hit the reset button.

Erm... sadly the disk compression did not survive this.

A friend at university was *just* getting into this thing he called "Linux" and it ran on PCs, so I thought I'd give it a go. It was the SLS distro and he had a pile of 5.25" floppies. My PC had a 3.5" floppy. So I sat in the computer center for an afternoon copying disks...

And that was how I got into Linux; a soft landing from a DOS bailout :D

I never had to run disk compression under Linux though, never realised SLS supported that. Cool.

Re:What's the point? (1)

TheRaven64 (641858) | more than 4 years ago | (#29957866)

The canonical use case for dedup is backup servers. Imagine you have one Solaris file server serving 40 workstations. Each of these does a full backup of its 10GB Windows (or Linux, or whatever) install. You then have 400GB of data, but only about 12GB of unique data. Dedup lets you store only that 12GB, and you can store it with n redundant copies so it's easier to recover in case of partial hardware failure. Each workstation then does incremental backups, copying files with any changes to the server. The server dedups these and only stores the changed blocks.

The clients can be using NFS, CIFS, or iSCSI for the backup, and the server has a complete disk image (and periodic snapshots) of the clients' disks, but uses a tiny fraction of the space that this may require.

Oh, and with regard to this:

It may help backup if the backup program knows the fine details of deduplication

The entire point of dedup in the FS layer is that the backup software can be completely unaware of it. As long as it produces a copy of the data on the server, the server will handle turning it from a full backup or a per-file incremental backup into a per-block incremental backup.

I Heard ISPs Were Doing This (1)

sexconker (1179573) | more than 4 years ago | (#29957584)

I Heard ISPs Were Doing This With Broadband.
Simply duplicate your advertised pipe across 100 subscribers.

If they want to access it at the same time, just shift stuff around.

If they want to access it at the same time, and you don't have room to shift stuff around, just impose caps and bill them progressively out the ass.

I worked with De-duplication (0)

Anonymous Coward | more than 4 years ago | (#29957772)

You are losing reliability. Some hashes will collide on some computer somewhere.
The idea is that if you assume that blocks on the HD are random, then the odds of hitting a hash collision are tiny.
But data is not random - humans and programs make it non-random!

Here is an example:
What are the odds that 256 people going across the street will all be men?
That would be 2^-256 - that will never happen.
But guess what? Imagine that you see a parade and 300 marines are marching by...
It just happened.
Do you want to bet your server data on that?
