Dec 23, 2009
Dear dCache community,

We recently detected the following flaw in the new xrootd implementation distributed with dCache-1.9.5 and dCache-1.9.6.

Description

When WRITING data with dCache releases 1.9.5 (before 1.9.5-11) and 1.9.6 (before 1.9.6-2) using the default xrootd implementation in dCache, there is a non-zero probability that the data file on the disk pool is corrupted. The corruption happens silently: no log file gives any indication that an error occurred.

Solution(s)

a) Quick solution: enable the old xrootd implementation on all pool nodes that receive data via the xrootd protocol. Change the following line in the pool configuration file (/opt/d-cache/config/pool.batch) from

b) Proper solution: an upgrade to 1.9.5-11 will fix the problem. For 1.9.6, a fix is not yet available.

Cleaning pools

To detect corrupted files in your system, you need to compute checksums of the data files written with xrootd and dCache 1.9.5 or 1.9.6 and compare them with the values stored in the catalogues/frameworks of the experiments. You can do this directly on your pool nodes. To our current knowledge, xrootd in dCache is only used by the Alice experiment, so please ask your Alice representative for a list of checksums of the files written with dCache 1.9.5/1.9.6. The dCache billing file can help you narrow down the set of possibly affected files. Please let us know if the procedure is not clear or if you need help contacting the experiment.

The Alice experiment and our CERN storage contact have already been informed. Please don't hesitate to contact us if you need further information or advice.

Thanks
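
As an illustration of the checksum comparison described under "Cleaning pools", here is a minimal Python sketch that could be run on a pool node. It is not part of dCache. The function names (file_checksums, find_mismatches), the choice of Adler-32 and MD5, and the idea of keying the expected checksums by the file name in the pool's data directory are all assumptions made for this example; use whatever algorithm and file naming the experiment's catalogue actually provides.

```python
import hashlib
import zlib
from pathlib import Path


def file_checksums(path, chunk_size=1024 * 1024):
    """Return (adler32_hex, md5_hex) for the file at `path`.

    The file is read in chunks so that large data files do not have to
    fit into memory. Both algorithms are assumptions for this sketch.
    """
    adler = 1  # defined starting value of Adler-32
    md5 = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            adler = zlib.adler32(chunk, adler)
            md5.update(chunk)
    return format(adler & 0xFFFFFFFF, "08x"), md5.hexdigest()


def find_mismatches(expected, data_dir, algorithm="adler32"):
    """Compare files in `data_dir` against `expected`.

    `expected` is a hypothetical mapping {file name: hex checksum}, e.g.
    built from the list supplied by the experiment. Returns a list of
    (file name, expected checksum, actual checksum) tuples for every file
    that differs; the actual checksum is None if the file is missing.
    """
    index = 0 if algorithm == "adler32" else 1
    mismatches = []
    for name, want in sorted(expected.items()):
        path = Path(data_dir) / name
        if not path.is_file():
            mismatches.append((name, want, None))
            continue
        got = file_checksums(path)[index]
        # Hex strings are compared case-insensitively.
        if got.lower() != want.lower():
            mismatches.append((name, want, got))
    return mismatches
```

Calling find_mismatches(expected, "/path/to/pool/data") with the experiment's checksum list returns only the entries that need attention. Restricting `expected` to the files that the billing file shows as written via xrootd keeps the run short.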