17.2.2010
Pool to pool tuning
Improved pool-to-pool tuning in 1.9.5
The issue, then, is how the Pool Manager knows when a pool is "hot".
The long-standing algorithm uses a fixed threshold value. The combined cost is compared against a threshold value; if the pool's cost exceeds that threshold (and on-demand replication is enabled) then a read request that would normally be sent to this pool will, instead, trigger a pool-to-pool copy. The cut-off value may be configured in the admin interface using the "set costcuts" command.
[srm-devel.desy.de] (PoolManager) admin > set costcuts -p2p=0.5
costcuts;idle=0.0;p2p=0.5;alert=0.0;halt=0.0;fallback=0.0
[srm-devel.desy.de] (PoolManager) admin > save
The disadvantage of this approach is that, rather than detecting pools that are serving "many more" read requests than their fellow pools (and so are "hot"), the algorithm selects those pools whose cost is greater than some threshold value. The threshold value must be carefully chosen to select only pools that are hot; should the dCache system change, a new threshold value may be more appropriate. This dependency on the dCache instance means that, to achieve good hot-spot replication, a site-admin must continually tune the threshold to match circumstances.
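The weakness of a fixed threshold can be illustrated with a minimal Python sketch. This is not dCache code: the helper names (triggers_p2p, hot_pools), the pool names and the cost values are all made up for illustration; only the 0.5 cut-off comes from the example above.

```python
def triggers_p2p(pool_cost, p2p_cut, on_demand_replication=True):
    """Hypothetical helper: would a read aimed at this pool trigger a pool-to-pool copy?"""
    # Fixed-threshold rule: replicate only if the pool's cost exceeds the cut.
    return on_demand_replication and pool_cost > p2p_cut

def hot_pools(costs, cut):
    """Names of the pools whose cost exceeds the fixed cut."""
    return sorted(name for name, cost in costs.items() if triggers_p2p(cost, cut))

P2P_CUT = 0.5  # same value as "set costcuts -p2p=0.5"

# Invented cost snapshots for the same three pools under light and heavy load.
quiet_day = {"pool_a": 0.10, "pool_b": 0.15, "pool_c": 0.40}
busy_day = {"pool_a": 0.60, "pool_b": 0.65, "pool_c": 0.90}

print(hot_pools(quiet_day, P2P_CUT))  # nothing exceeds 0.5, although pool_c is far busier than its peers
print(hot_pools(busy_day, P2P_CUT))   # everything exceeds 0.5, so every pool counts as "hot"
```

In both snapshots pool_c stands out from its peers, yet the fixed cut selects either no pool or all of them, which is why the threshold needs retuning whenever the overall load changes.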
With dCache v1.9.5-nn there is a new, adaptive algorithm for triggering hot-spot replication. Instead of using a constant value as the cut-off cost for triggering pool-to-pool replication, a percentile cost is used; for example, specifying the fiftieth percentile sets the cut-off value to be the median pool cost. The value is calculated dynamically, taking into account new pool costs as they are received from the pools.
To use the new algorithm, you must configure the cut-off cost to be some number with the percentage symbol as a suffix; for example, to use the median value, specify "50%". The "%" at the end of the number indicates that the new algorithm should be used.
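How the suffix selects the algorithm can be sketched in a few lines of Python. This is only an illustration of the rule described above, not dCache's own parser; the function name parse_p2p_cut and the "fixed"/"percentile" labels are hypothetical.

```python
def parse_p2p_cut(value):
    """Hypothetical parser: split a p2p setting into (mode, number)."""
    text = value.strip()
    if text.endswith("%"):
        # A trailing "%" selects the adaptive, percentile-based cut-off.
        return ("percentile", float(text[:-1]))
    # A plain number keeps the long-standing fixed-threshold behaviour.
    return ("fixed", float(text))

print(parse_p2p_cut("0.5"))
print(parse_p2p_cut("50%"))
```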
Specifying the ninety-fifth percentile (configured as "95%") means that the cost cut-off for hot-spot replication is the ninety-fifth percentile cost. This is the cost of the pool that is 95% of the way along a list of dCache pools sorted in ascending order of cost. If "95%" is specified as the cost cut-off then (roughly) 95% of pools will have a cost below the cut-off value and read requests to those pools will not trigger pool-to-pool transfers.
[srm-devel.desy.de] (PoolManager) admin > set costcuts -p2p=95%
costcuts;idle=0.0;p2p=95.0%;alert=0.0;halt=0.0;fallback=0.0
[srm-devel.desy.de] (PoolManager) admin > save
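The percentile cut-off can be sketched as follows. The sketch assumes the common "nearest-rank" definition of a percentile; the exact rounding dCache uses is not stated here, so treat the indexing as an assumption. The function name percentile_cut and the twenty evenly spaced pool costs are invented for the example.

```python
import math

def percentile_cut(costs, percentile):
    """Cost of the pool found `percentile` percent along the ascending cost list.

    Uses the nearest-rank rule (an assumption, not necessarily dCache's rounding).
    """
    ordered = sorted(costs)
    rank = math.ceil(percentile * len(ordered) / 100)  # 1-based position in the list
    index = max(rank - 1, 0)                           # convert to a 0-based index
    return ordered[index]

# Twenty hypothetical pools with costs 0.05, 0.10, ..., 1.00.
costs = [i / 20 for i in range(1, 21)]

cut = percentile_cut(costs, 95)
hot = [c for c in costs if c > cut]   # pools whose cost exceeds the cut-off
print(cut, len(hot))

median_cut = percentile_cut(costs, 50)
print(median_cut, len([c for c in costs if c > median_cut]))
```

With twenty pools, the "95%" setting leaves a single pool (5% of them) above the cut-off, while "50%" leaves ten. If every cost doubles, the cut-off moves with the costs and the number of pools above it stays the same.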
What this means is that the adaptive algorithm will trigger p2p replication for a fixed number of pools, rather than for a fixed cost. As the distribution of load across the pools changes, read requests destined for a particular pool may or may not trigger replication; however, at any one time, read requests that target pools from a fixed-size list will trigger replication. For the above "95%" example, read requests to the 5% most heavily loaded pools will trigger pool-to-pool replication.
This approach works because the likelihood of a read request triggering replication depends on how likely it is that the request will land on a "hot" pool. The (percentage) number of pools that trigger replication is fixed. If the requests are evenly spread over all available pools then the likelihood of a read request triggering replication is simply (100% - cut-off). For the "95%" example, read requests involving 5% of pools will trigger replication; if the requests are evenly spread then the likelihood of a read request triggering replication is 5%.
However, if the requests are somehow correlated then the likelihood that a read request will involve a hot pool will increase. For example, if all the files from some experiment's dataset are stored on the same pool then jobs in the batch-system that are processing the data will introduce a correlation; the likelihood of a read request using this particular pool will increase. If a batch farm is processing jobs that read files from the same dataset, and that dataset is stored exclusively on a single pool, then the likelihood of read requests involving a "hot" pool (and so triggering replication) will be high. If there is no other activity in the storage element, it can be 100%.
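The two cases, evenly spread and correlated requests, can be compared with a small worked example in Python. The function name replication_likelihood, the pool names and the request counts are invented; the hot pool is simply taken to be the single most heavily loaded of twenty pools, as in the "95%" example.

```python
def replication_likelihood(requests_per_pool, hot_pools):
    """Fraction of read requests that land on one of the hot pools."""
    total = sum(requests_per_pool.values())
    on_hot = sum(requests_per_pool[name] for name in hot_pools)
    return on_hot / total

pools = ["pool_%02d" % i for i in range(20)]
hot = ["pool_00"]  # 1 of 20 pools, i.e. the top 5%

# Evenly spread: every pool receives 5 of the 100 requests.
even = {name: 5 for name in pools}
print(replication_likelihood(even, hot))

# Correlated: a popular dataset on pool_00 attracts 81 of the 100 requests.
skewed = {name: 1 for name in pools}
skewed["pool_00"] = 81
print(replication_likelihood(skewed, hot))
```

The number of hot pools is the same in both runs; only the share of requests that reach them differs, which is what makes the percentile cut-off react to hot spots rather than to overall load.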
After a file has been replicated, the likelihood that a read request for that file will involve one of the fixed number of hot pools will decrease. Depending on demand, additional replicas may be made until there is a balance between the number of replicas and the load.