[an error occurred while processing this directive]

How to monitor what’s going on

This section briefly describes the commands and mechanisms to monitor the TSS PUT, GET and REMOVE operations. dCache provides a configurable logging facility and a Command Line Admin Interface to query and manipulate transfer and waiting queues.

[return to top]

Log Files

By default dCache is configured to only log information if something unexpected happens. However, to get familiar with Tertiary Storage System interactions you might be interested in more details. This section provides advice on how to obtain this kind of information.

[return to top]

The executable log file

Since you provide the executable, interfacing dCache and the TSS, it is in your responsibility to ensure sufficient logging information to be able to trace possible problems with either dCache or the TSS. Each request should be printed with the full set of parameters it receives, together with a timestamp. Furthermore information returned to dCache should be reported.

[return to top]

dCache log files in general

In dCache, each domain (e.g. dCacheDomain, <pool>Domain etc) prints logging information into its own log file named after the domain. The location of those log files it typically the /var/log or /var/log/dCache directory depending on the individual configuration. In the default logging setup only errors are reported. This behavior can be changed by either modifying /opt/d-cache/etc/logback.xml or using the dCache CLI to increase the log level of particular components as described below.

[return to top]

Increase the dCache log level by changes in /opt/d-cache/etc/logback.xml

If you intend to increase the log level of all components on a particular host you would need to change the /opt/d-cache/etc/logback.xml file as described below. dCache components need to be restarted to activate the changes.

<threshold>
     <appender>stdout</appender>
     <logger>root</logger>
     <level>warn</level>
   </threshold>

needs to be changed to

<threshold>
     <appender>stdout</appender>
     <logger>root</logger>
     <level>info</level>
   </threshold>

Important

The change might result in a significant increase in log messages. So don’t forget to change back before starting production operation. The next section describes how to change the log level in a running system.

[return to top]

Increase the dCache log level via the Command Line Admin Interface

Example:

Login into the dCache Command Line Admin Interface and increase the log level of a particular service, for instance for the poolmanager service:

[example.dcache.org] (local) admin > cd PoolManager
[example.dcache.org] (PoolManager) admin > log set stdout ROOT INFO
[example.dcache.org] (PoolManager) admin > log ls
stdout:
  ROOT=INFO
  dmg.cells.nucleus=WARN*
  logger.org.dcache.cells.messages=ERROR*
.....

[return to top]

Obtain information via the dCache Command Line Admin Interface

The dCache Command Line Admin Interface gives access to information describing the process of storing and fetching files to and from the TSS, as there are:

  • The Pool Manager Restore Queue. A list of all requests which have been issued to all pools for a FETCH FILE operation from the TSS (rc ls)
  • The Pool Collector Queue. A list of files, per pool and storage group, which will be scheduled for a STORE FILE operation as soon as the configured trigger criteria match.
  • The Pool STORE FILE Queue. A list of files per pool, scheduled for the STORE FILE operation. A configurable amount of requests within this queue are active, which is equivalent to the number of concurrent store processes, the rest is inactive, waiting to become active.
  • The Pool FETCH FILE Queue. A list of files per pool, scheduled for the FETCH FILE operation. A configurable amount of requests within this queue are active, which is equivalent to the number of concurrent fetch processes, the rest is inactive, waiting to become active.

For evaluation purposes, the pinboard of each component can be used to track down dCache behavior. The pinboard only keeps the most recent 200 lines of log information but reports not only errors but informational messages as well.

Example:

Check the pinboard of a service, here the poolmanager service.

[example.dcache.org] (local) admin > cd PoolManager
[example.dcache.org] (PoolManager) admin > show pinboard 100
08.30.45  [Thread-7] [pool_1 PoolManagerPoolUp] sendPoolStatusRelay: ...
08.30.59  [writeHandler] [NFSv41-dcachetogo PoolMgrSelectWritePool ...
....

Example:

The PoolManager Restore Queue.  Remove the file test.root with the pnfs-ID 00002A9282C2D7A147C68A327208173B81A6.

[example.dcache.org] (pool_1) admin > rep rm  00002A9282C2D7A147C68A327208173B81A6

Request the file test.root

[user] $ /opt/d-cache/dcap/bin/dccp dcap://example.dcache.org:/data/test.root test.root

Check the PoolManager Restore Queue:

[example.dcache.org] (local) admin > cd PoolManager
[example.dcache.org] (PoolManager) admin > rc ls
0000AB1260F474554142BA976D0ADAF78C6C@0.0.0.0/0.0.0.0-*/* m=1 r=0 [pool_1] [Staging 08.15 17:52:16] {0,}

Example:

The Pool Collector Queue. 

[example.dcache.org] (local) admin > cd pool_1
[example.dcache.org] (pool_1) admin > queue ls -l queue
                   Name: chimera:alpha
              Class@Hsm: chimera:alpha@osm
 Expiration rest/defined: -39 / 0   seconds
 Pending   rest/defined: 1 / 0
 Size      rest/defined: 877480 / 0
 Active Store Procs.   :  0
  00001BC6D76570A74534969FD72220C31D5D


[example.dcache.org] (pool_1) admin > flush ls
Class                 Active   Error  Last/min  Requests    Failed
dteam:STATIC@osm           0       0         0         1         0

Example:

The pool STORE FILE Queue. 

[example.dcache.org] (local) admin > cd pool_1
[example.dcache.org] (pool_1) admin > st ls
0000EC3A4BFCA8E14755AE4E3B5639B155F9  1   Fri Aug 12 15:35:58 CEST 2011

Example:

The pool FETCH FILE Queue. 

[example.dcache.org] (local) admin > cd pool_1
[example.dcache.org] (pool_1) admin >  rh ls
0000B56B7AFE71C14BDA9426BBF1384CA4B0  0   Fri Aug 12 15:38:33 CEST 2011

To check the repository on the pools run the command rep ls that is described in the beginning of the section called “How to Store-/Restore files via the Admin Interface”.