
The copy module

The purpose of the copy module is to copy the content of a pool to one or more other pools, so that the data remains available while the source pool is unavailable.

The Concept

Vacating a pool is done in three steps.

  • The pool to be vacated has to be set to read-only, so that its content doesn't change during the copy procedure. This is not a technical requirement; it simply ensures consistency.

  • A snapshot is taken of the repository of the source pool. Filters can be applied to this repository listing to select or deselect classes of files to be copied. Classes can either be file states such as 'cached' or 'precious', or storage groups such as cms:generated@osm.

  • The actual copy is initiated. The copy is done only on the preselected list. The destination pools may be specified individually or as a PoolManager pool group.

Setting a pool read only

In order to ensure consistency during a data copy process, a pool needs to be set 'read-only' within the pool manager. Subsequently, no operation that modifies the repository (write or restore) will be submitted to the pool.

(PoolManager) admin >psu set pool <PoolName> rdonly

This command doesn't stop ongoing transfers to the pool, so one needs to wait until all write and restore movers have finished.

Creating or attaching to a maintenance task

All subsequent operations need to be done within the framework of a copy task. So one either has to create a new copy task or reuse an existing, idle one.

(maintenance) admin > create task my-copy-task copy-module

or, if there is already an idle task:

(maintenance) admin > attach my-copy-task

Getting and customizing the source repository listing

The next step is to obtain a listing of the source pool repository.

(maintenance) admin > load pool <poolName>

Depending on the size of the source pool listing, this may take a while. Check 'task info' to learn when the fetch operation is done.

(maintenance) admin > ls stat

shows the content of the source pool repository. It provides a table of file classes containing the number of bytes and the number of files per class. For simplification, a file class may either be the status of a file (precious, cached, pinned, locked or bad) or the storage class, like cms:generated@osm.
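To make the shape of this table concrete, here is a minimal Python sketch of how such per-class statistics could be computed from a snapshot. It is not dCache code: the RepoEntry type, the sample IDs and the storage class 'atlas:raw@osm' are hypothetical, and the sketch assumes that each file has exactly one status and one storage class and is counted under both.

```python
from collections import defaultdict
from typing import NamedTuple


class RepoEntry(NamedTuple):
    """One file in a (hypothetical) repository snapshot."""
    file_id: str
    state: str          # 'precious', 'cached', 'pinned', 'locked' or 'bad'
    storage_class: str  # e.g. 'cms:generated@osm'
    size: int           # bytes


def class_statistics(entries):
    """Return {file_class: (file_count, byte_count)} for a snapshot.

    Assumption: every file is counted once under its status and once
    under its storage class, so the rows do not add up to pool totals.
    """
    stats = defaultdict(lambda: [0, 0])
    for entry in entries:
        for file_class in (entry.state, entry.storage_class):
            stats[file_class][0] += 1
            stats[file_class][1] += entry.size
    return {name: tuple(counts) for name, counts in stats.items()}


snapshot = [
    RepoEntry("0001", "precious", "cms:generated@osm", 100),
    RepoEntry("0002", "cached", "cms:generated@osm", 250),
    RepoEntry("0003", "cached", "atlas:raw@osm", 50),
]
table = class_statistics(snapshot)
for name in sorted(table):
    files, size = table[name]
    print(f"{name:<20} {files:>5} files {size:>8} bytes")
```

The real 'ls stat' output format may differ; the sketch only illustrates the grouping by file class.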

(maintenance) admin > ls files

gives a full listing of all files found in the repository.

The ls files/stat listing is the basis of the subsequent copy process. The listing may be customized to select or deselect particular file classes for copying. Any class may be excluded from the repository listing by

(maintenance) admin > exclude <fileClass>

where the file class may be 'cached', 'precious', 'pinned', 'locked' or 'bad', or a storage class like 'cms:generated@osm'. The command

(maintenance) admin > keeponly <fileClass>

excludes all file classes except for the specified one. Both 'exclude' and 'keeponly' may be used repeatedly until the repository listing fits your needs.
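The effect of chaining 'exclude' and 'keeponly' can be illustrated with a small Python model. This is only a sketch of the filtering semantics described above, not the module's implementation; the RepoEntry type, the file IDs and the storage class 'atlas:raw@osm' are hypothetical, and a file is assumed to belong to a class if the class equals either its status or its storage class.

```python
from typing import NamedTuple


class RepoEntry(NamedTuple):
    """One file in a (hypothetical) repository listing."""
    file_id: str
    state: str          # 'precious', 'cached', 'pinned', 'locked' or 'bad'
    storage_class: str  # e.g. 'cms:generated@osm'


def matches(entry, file_class):
    """Assumed rule: a class matches a file's status or storage class."""
    return file_class in (entry.state, entry.storage_class)


def exclude(entries, file_class):
    """Drop every file that belongs to file_class."""
    return [e for e in entries if not matches(e, file_class)]


def keeponly(entries, file_class):
    """Drop every file that does not belong to file_class."""
    return [e for e in entries if matches(e, file_class)]


listing = [
    RepoEntry("0001", "precious", "cms:generated@osm"),
    RepoEntry("0002", "cached", "cms:generated@osm"),
    RepoEntry("0003", "precious", "atlas:raw@osm"),
    RepoEntry("0004", "bad", "cms:generated@osm"),
]

# Filters are applied one after another to the current listing.
listing = exclude(listing, "cached")
listing = keeponly(listing, "cms:generated@osm")
remaining = [e.file_id for e in listing]
print(remaining)
```

Because each filter works on the result of the previous one, the order of the commands matters only in that every step can shrink the listing, never grow it.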

Starting the copy process

Two parameters should be set before starting the copy procedure. All parameters are valid for all subsequent copy operations or until they are changed.

  • task set copy-mode nn|same|precious|cached defines the mode of the replicated file.

    Warning

    Replicating precious files, or creating precious replicas, from or to pools which are connected to an HSM system will very likely result in a serious error condition.

    Mode can be one of the following:

    • precious Independent of the source mode, the replica will be created in 'precious' mode.

    • cached Independent of the source mode, the replica will be created in 'cached' mode.

    • same The mode of the destination file will be identical to the mode of the source file.

    • nn The mode of the destination file is determined by the setting of the destination pool. (Don't use)

  • task set parallel <numberOfParallelTransfers> defines the number of concurrent copy operations. This doesn't override the 'max mover' settings of the pools themselves.
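The four copy modes listed above amount to a small decision rule. The following Python sketch restates that rule for illustration only; the function name and the pool_default_mode parameter (standing in for "the setting of the destination pool") are hypothetical and not part of dCache.

```python
def replica_mode(copy_mode, source_mode, pool_default_mode):
    """Return the mode of the new replica for a given copy-mode setting.

    copy_mode         -- 'precious', 'cached', 'same' or 'nn'
    source_mode       -- mode of the file on the source pool
    pool_default_mode -- hypothetical stand-in for the destination
                         pool's own setting (only used by 'nn')
    """
    if copy_mode in ("precious", "cached"):
        # Forced mode, independent of the source file.
        return copy_mode
    if copy_mode == "same":
        return source_mode
    if copy_mode == "nn":
        # Left to the destination pool; the text advises against this.
        return pool_default_mode
    raise ValueError(f"unknown copy mode: {copy_mode!r}")


print(replica_mode("cached", "precious", "precious"))
print(replica_mode("same", "precious", "cached"))
```

With HSM-connected pools, any combination that yields 'precious' falls under the warning above.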

The actual copy operation is triggered by either copyto pools <PoolName> [<PoolName> [...]] or copyto group <PoolGroupName>. The command is processed asynchronously, and progress may be checked with 'task info'. The process may be stopped using 'halt'; transfers that have already started will be finished first.

After the process has finished, the 'ls stat -l' command shows the files that were not transferred, together with the reason why each transfer wasn't performed. This includes a manual 'halt', which makes it possible to resume after a 'halt'.
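As a recap of the procedure so far, the following Python sketch assembles the admin commands used in this chapter into an ordered list. It only builds text and does not talk to dCache; the function name, the pool names (pool_a, pool_b, pool_c) and the pairing of each command with a cell name are assumptions made for the illustration. In practice the commands cannot be submitted blindly in one go: one has to wait for running movers to finish after 'rdonly' and for 'load pool' to complete (see 'task info') before continuing.

```python
VALID_MODES = ("precious", "cached", "same", "nn")


def vacate_commands(source_pool, task_name, copy_mode, parallel,
                    dest_pools=(), dest_group=None, excluded=()):
    """Return [(cell, command), ...] for vacating source_pool.

    Exactly one of dest_pools / dest_group must be given.
    """
    if bool(dest_pools) == bool(dest_group):
        raise ValueError("give exactly one of dest_pools or dest_group")
    if copy_mode not in VALID_MODES:
        raise ValueError(f"unknown copy mode: {copy_mode!r}")

    commands = [
        ("PoolManager", f"psu set pool {source_pool} rdonly"),
        ("maintenance", f"create task {task_name} copy-module"),
        ("maintenance", f"load pool {source_pool}"),
    ]
    # Optional filtering of the loaded repository listing.
    commands += [("maintenance", f"exclude {c}") for c in excluded]
    commands += [
        ("maintenance", f"task set copy-mode {copy_mode}"),
        ("maintenance", f"task set parallel {parallel}"),
    ]
    if dest_group:
        commands.append(("maintenance", f"copyto group {dest_group}"))
    else:
        commands.append(
            ("maintenance", "copyto pools " + " ".join(dest_pools)))
    return commands


commands = vacate_commands("pool_a", "my-copy-task", "cached", 4,
                           dest_pools=("pool_b", "pool_c"),
                           excluded=("cached",))
for cell, command in commands:
    print(f"({cell}) admin > {command}")
```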

Remarks:

  • Replicating a file to a pool which already contains a copy of this file is not an error condition and is silently ignored.

  • To get a fair distribution of files among the destination pools, one should allow for a reasonable parallelism (task set parallel).

  • Space management: Among the available destination pools, the copy process will always choose the pool with the maximum 'free + replaceable' space. In the worst case this may lead to deleting 'cached' files even though 'real free' space would have been available on a different pool. We are currently trying to improve the distribution algorithm.
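The space management remark can be made concrete with a short Python sketch of the selection rule as described, including the worst case. It is a model of the stated behaviour, not the actual distribution code; the PoolSpace type, the pool names and the numbers are hypothetical.

```python
from typing import NamedTuple


class PoolSpace(NamedTuple):
    """Space figures of a (hypothetical) destination pool, in bytes."""
    name: str
    free: int         # 'real free' space
    replaceable: int  # space held by 'cached' files that may be deleted


def choose_destination(pools):
    """Pick the pool with the maximum 'free + replaceable' space."""
    return max(pools, key=lambda p: p.free + p.replaceable)


def needs_eviction(pool, file_size):
    """True if the file only fits after deleting 'cached' files."""
    return file_size > pool.free


pools = [
    PoolSpace("pool_b", free=10, replaceable=500),
    PoolSpace("pool_c", free=400, replaceable=0),
]
chosen = choose_destination(pools)
evicts = needs_eviction(chosen, file_size=50)
print(chosen.name, evicts)
```

Here pool_b is chosen because 10 + 500 exceeds 400 + 0, so a 50-byte file forces the removal of cached data on pool_b although pool_c could have taken it from real free space.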

Copy Module Reference Manual

task load pool <PoolName>

The pool repository of pool <PoolName> is loaded and becomes the current pool listing. Use 'info' to check for completion of the command.

[task] info

Provides information on the current status of this copy task.

[task] ls stat [-l]

Provides information on the currently loaded pool repository.

[task] ls files [-l]

Lists all files of the currently loaded pool repository. This may be a very long list. (Don't use)

[task] exclude pinned|cached|precious|bad|locked|<storageClass>
[task] keeponly pinned|cached|precious|bad|locked|<storageClass>

Either excludes the specified class from the pool listing 'ls stat' or only keeps the specified class. The result may be viewed by 'ls stat -l'.

[task] halt

This command finishes the currently active transfers but doesn't launch new ones. The 'ls stat' command reports on files not yet transferred. Use the 'copyto' command(s) to resume transfers.

[task] clear

Clears the internal status of a command. This might become necessary if a component (e.g. a pool) doesn't respond to a request.

[task] set copy-mode precious|cached|same|nn

Sets the copy mode, valid for subsequent 'copyto' requests.

Warning

Replicating precious files, or creating precious replicas, from or to pools which are connected to an HSM system will very likely result in a serious error condition.

  • precious Independent of the source mode, the replica will be created in 'precious' mode.

  • cached Independent of the source mode, the replica will be created in 'cached' mode.

  • same The mode of the destination file will be identical to the mode of the source file.

  • nn The mode of the destination file is determined by the setting of the destination pool. (Don't use)

[task] set parallel <NumberOfParallelStreams>

Sets the number of parallel transfers. This is actually the number of transfer requests. Depending on the pp and p2p values of the involved pools, the effective number may be lower.

[task] copyto pools <PoolName> [<PoolName> [...]]
[task] copyto group <PoolGroupName>

Starts the actual transfer of all files prepared by the 'load pool' and 'exclude/keeponly' commands. The destination pools may either be specified directly using the 'copyto pools' command or indirectly as a PoolManager pool group using 'copyto group'. Check 'task info' for the progress of the command.

Copyright dCache.org © 2003 - 2008