Scareh,



To be completely honest, I fully understand and appreciate your concern. I have the exact same concern and I'm always hesitant to do an upgrade. But if you were around when 9.1.0 was in beta testing, you'll remember that we had some users with problems similar to the ones you are concerned about.



A handful of users had non-standard configurations that the server admins had screwed up when setting up their pools and it made pools unmountable in 9.1.0. Of course, some of these people had also upgraded their pool to v5000 so they couldn't downgrade back to 8.3.1. Their data wasn't in jeopardy, but it was completely inaccessible to the owner. So from their perspective the data was lost. One of the FreeNAS developers came up with a patch for those users that was used by a handful of people and it did get their pools back online.



Overall, I share your scepticism with regard to playing with people's data. But I think that if all hell broke loose and a bunch of users suffered a bug that didn't destroy the pool, the FreeNAS developers would be interested in troubleshooting the problem and coming up with a solution.



By far, the biggest "threat" I've seen to people's data is doing things that are incredibly stupid. Usually it centers around not understanding the need for quality hardware with ECC RAM, then extends to them not understanding what they are doing in the WebGUI and making one or more *serious* noob mistakes that lead to them killing their pool. In the most recent case I've worked on, the user did this:



1. Had a 4-disk RAIDZ1 (remember how much I criticize RAIDZ1s...)

2. Did a disk replacement without a disk being bad. (In fact, all 5 of their disks were good and tested good by me last week)

3. Had non-ECC RAM. RAM tests done last week found no errors, but that doesn't rule out bit-flips etc.

4. After doing the disk replacement, never left the server up long enough to finish a resilver. (Why did they do this? They didn't know better!) They'd boot up the server and use it for an hour to copy files to or from the server, then promptly shut down to "save power". Some of their hard drives had more power cycles than power-on hours. This was part of their daily cycle to back up small amounts of data at the end of the workday, which typically took 15-20 minutes at the most.

5. About a week or two later they booted up the server to find that the pool was in an unmountable status.



I was unable to get their pool back despite spending more than 15 hours reading and trying various things. The pool just would not mount under any circumstances.
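For anyone who powers their box down every night, here is a minimal sketch of a pre-shutdown check that would have caught step 4. It is only an illustration, not anything FreeNAS ships: the function names are made up, and it assumes `zpool status` reports an unfinished resilver with the phrase "resilver in progress", which is the wording I have seen but which may differ between ZFS versions.

```python
import subprocess


def resilver_in_progress(status_text: str) -> bool:
    """Return True if `zpool status` output reports an unfinished resilver.

    Assumption: the status output contains the phrase
    "resilver in progress" while a resilver is running.
    """
    return "resilver in progress" in status_text.lower()


def safe_to_shut_down(pool: str) -> bool:
    """Run `zpool status <pool>` and return False while a resilver is running.

    Raises subprocess.CalledProcessError if zpool exits with an error,
    for example when the pool name does not exist.
    """
    result = subprocess.run(
        ["zpool", "status", pool],
        capture_output=True,
        text=True,
        check=True,
    )
    return not resilver_in_progress(result.stdout)
```

Calling `safe_to_shut_down("tank")` (with `tank` standing in for your own pool name) from whatever script powers the machine off would let you skip the shutdown until the resilver has finished.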



Overall, I'd think that there are 3 situations that could occur if a serious bug were found:



1. Bug is found that prevents access to the data, but the data is safe. This is the scenario that we saw pre-9.1.0.

2. Bug is found that causes some kind of damage to the pool, but the damage can be undone. I'm fairly confident that in this situation a patch of some kind would be created.

3. Bug is found that destroys data with no chance of recovery. This would obviously be very crappy, and it is quite possible that if this situation were to happen it would result in the project never being trusted again.



Edit: Finished my sentences. :)