Post by mkretzer » Feb 01, 2017 7:01 am

[UPDATE] October 15, 2018

The solution is to install the September 2018 Windows Update (KB4343884) or later, since Windows updates are cumulative.



Hello,



I have already posted several threads in the last few days, but I have to post again about what happened to us last night.



First of all, we are currently in the middle of migrating to ReFS repos. We made the mistake of using 4k blocks on our temporary 120 TB repo. We thought it was no big deal, as at first it seemed to affect only the performance of file operations. We monitored memory and CPU usage and did not see the memory pressure others saw, because the system is luckily oversized. So we continued migrating to the new repos, and that went successfully.



All went well for a few days. We have to wait 28 days before we can format our "production" backup storage, and we were optimistic that we would "survive" that time because of the ReFS space savings.



Then I got a message from our monitoring system last night. Our Veeam server was completely unreachable. I went on-site and found that I could move the mouse but not much more. I had to do a hard reset. After the system came up, I saw that it was trying to create 3 synthetic fulls at the same time, plus a tape backup and some copy jobs. All in all nothing unusual - this had worked well the nights before. So I disabled the tape job, set a limit of 12 concurrent tasks on the repos (before there was no limit) to regulate the load a little, and drove back home.



10 minutes later the next alert came in - we had another crash. So I drove back to the company, did a hard reboot, and limited the ReFS repos to 1 concurrent task so that at least our BCJs can finish at some point in the future. Then I started to roll back to our old NTFS repository with active fulls, which I have to do for 1600 machines/140 TB.



Opening an Explorer window on the ReFS volume now takes half a minute even without any load, so it is definitely the ReFS volume that has issues...



BTW, I opened a sev1 case with MS - no response yet...



Markus