Highlights

  • Experimental CEPH support.
  • Automatic detection of PostgreSQL master.
  • Self describing billing files.
  • Pool manager setup is stored in ZooKeeper.
  • Pool manager read-only state is stored in setup file and ZooKeeper.
  • New performance cost calculation for improved stability of hot spot replication.
  • Allow batching of flush to tape to be “disabled”.
  • Allow polling HSM scripts.
  • Mover queues can be created at runtime.
  • Abort upload when SRM TURL is invalidated.
  • Allow xrootd clients to select mover queues.
  • Support the HAProxy proxy protocol.
  • Production ready support for high availability deployments.

Incompatibilities

  • Compatibility with pools older than 2.16 has been dropped.
  • Properties marked deprecated in 2.16 are now obsolete.
  • Several deprecated or obsolete admin commands have been removed.
  • The admin door no longer supports DSA keys.
  • The billing service no longer outputs records for which no format string was defined.
  • The pool manager and pool configuration files now only support the commands considered setup commands.
  • Support for the legacy UDP discovery service used in 2.15 and earlier has been dropped.
  • dCacheDomain is no longer a core domain by default.
  • DCAP doors no longer support the dcapLock feature to halt new requests.
  • Pool manager now automatically persists its setup in ZooKeeper and uses it after a restart rather than poolmanager.conf.
  • The way pool manager calculates performance cost has changed and cost limits in pool manager may have to be retuned.
  • The pool to pool client queue in pools no longer has a configurable limit.
  • Return code 72 is now a special value for HSM scripts.
  • The failure semantics for empty uploads have changed.
  • The https-jglobus value has been removed for the webdav.authn.protocol property.

Acknowledgments

Thanks to Onno Zweers for several contributions to this release.

Release 3.0.41

dcache-resilience

When a checksum error or broken-file message is generated, Resilience makes a best effort to (a) remove the broken copy and (b) make another replica. This is not always possible, particularly if the broken file is the only accessible copy. Such cases resulted in faulty behavior, in particular the thrashing observed when a restaging operation results in a checksum error. This is now fixed.

The current release improved error handling for resilience. It fixed unnecessary Migration Task exceptions resulting from source pools with no replica in the repository.

Now it should be possible for Resilience to use pools blocked only for writes from doors.

Changelog 3.0.40..3.0.41

2ed30f8
[maven-release-plugin] prepare release 3.0.41
1c8c56c
bad commit put DOWN twice
57c9ccd
dcache-resilience: define non-writable pool to mean p2p-client is disabled
71419a4
dcache: fix remote pool monitor wait bug
c49e444
dcache-resilience: repair handling of broken files*
0d58321
[maven-release-plugin] prepare for next development iteration
05a7853
dcache-resilience: fix bug in source handling with Clear Cache Location messages

Release 3.0.40

cells

The current release added explicit ZooKeeper/Curator monitoring. Events generated by ZooKeeper and Curator are now logged, which may help diagnose problems that are suspected to come from bad ZooKeeper interaction.

frontend

The current release improved the error handling to work with Jackson exceptions.
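The changelog entry for this fix describes mapping requests with bad JSON to an HTTP 400 Bad Request status code. The frontend itself is written in Java and uses Jackson; the following Python sketch only illustrates the general idea of turning a parse failure into a client error rather than a server error. The function name and the shape of the return value are hypothetical.

```python
import json


def handle_request(body):
    """Parse a JSON request body; report malformed JSON as a 400 response.

    Returns a (status_code, payload_or_message) tuple. This is an
    illustrative sketch, not the frontend's actual request handling.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as error:
        # Malformed input is the client's fault, so answer with 400.
        return 400, "Bad Request: " + error.msg
    return 200, payload
```

With this approach a truncated body such as `{` produces a 400 status, while a well-formed body is passed on to normal processing.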

Changelog 3.0.39..3.0.40

2fad831
[maven-release-plugin] prepare release 3.0.40
86ccd1c
dcache-resilience: fix wrong assumption about error type in Message
bb5973e
cells: add explicit ZooKeeper/Curator monitoring
e3140e3
frontend: Map requests with bad JSON to HTTP 400 Bad Request status code
74b5c02
[maven-release-plugin] prepare for next development iteration

Release 3.0.39

nfs

NFS door has been updated to return NFS4ERR_LAYOUTUNAVAILABLE for DOT files.

star

The current release improved the documentation to give dCache admins a better understanding of how to generate StAR records.

The current release fixed the printing of the exception error message by the dcache-star script when a problem occurs while running with newer versions of Python.

Changelog 3.0.38..3.0.39

6b4180c
[maven-release-plugin] prepare release 3.0.39
2776992
nfs: return LAYOUTUNAVAILABLE for DOT files
73bdcf7
star: fix printing exception error message
d40890c
star: update documentation to provide better description of script
9b96b08
[maven-release-plugin] prepare for next development iteration

Release 3.0.38

info

The info service collects information about who is allowed to reserve space. Since some of this information, like VOs, usernames and gids, may be considered sensitive information, this update allows admins to control whether or not to publish them. The default behaviour is unchanged from the previous behaviour, i.e. info publishes everything. If a site admin wants to change this, the info.limits.show-only-vo-authz property can be set to true.

scripts

The dcache script and manpage still referred explicitly to Java 6. This patch changes the phrasing of the respective text.

Changelog 3.0.37..3.0.38

f4d6bc6511
[maven-release-plugin] prepare release 3.0.38
7b41e14c48
scripts: update reference to JDK to avoid mentioning specific java version
17df369055
info: allow admin to control whether non-VO / non-FQAN identities are shown
b7bfc8b314
[maven-release-plugin] prepare for next development iteration

Release 3.0.37

chimera

The current release fixed previously introduced issues with lost+found directory permissions. The lost+found directory permissions are now updated without causing problems if that directory has been removed or its permissions have been modified.

Changelog 3.0.36..3.0.37

89c6a8b
[maven-release-plugin] prepare release 3.0.37
bdfb780
chimera: correct previous attempt to fix ‘lost+found’ directory permission
a74b73f
[maven-release-plugin] prepare for next development iteration

Release 3.0.36

pnfsmanager

The current release improves the help text of the set log slow threshold admin command.

spacemanager

When a user tried to upload into dCache using a space token for which there was no selectable link, the user was presented with a generic error message; for example,

No write links configured for [net=131.169.71.98,protocol=GFtp/2,store=dot:user@osm,cache=,linkgroup=].

This behavior has changed: an improved error message is now returned to users who attempt to upload data into dCache using a space reservation in a way that the poolmanager configuration prevents.

webdav

The current release improved error handling when dCache is full.

Changelog 3.0.35..3.0.36

38de80c
[maven-release-plugin] prepare release 3.0.36
d6d321b
systemtest: work with new OpenSSL DN format
a2a4534
spacemanager: provide space-specific error message on bad upload
1928311
webdav: return 507 Insufficient Storage when dCache is full
6351c8a
pnfsmanager: update slow logging admin command help
2fd15ed
[maven-release-plugin] prepare for next development iteration

Release 3.0.35

cells

dCache no longer logs stack-traces when running multiple cells with the same name.

pool

Closing dcap mover connection no longer logs a stack trace.

For certain failures, the pool was logging transfer failures twice. This is now fixed.

rpm

dCache now ensures that user ‘dcache’ is a member of group ‘dcache’.

star

The current release introduced new star.db.* properties, which allow sites that run PostgreSQL on non-standard ports to use StAR.

statistics

Timeout in contacting PoolManager no longer results in a stack-trace being logged.

Changelog 3.0.34..3.0.35

f7547d7
[maven-release-plugin] prepare release 3.0.35
6c2cb7d
[maven-release-plugin] prepare for next development iteration
a212275
pool: fix double logging on remote FTP transfer error
e5ff9ab
pool: Fix how certain bugs are logged
dfc024b
star: support PostgreSQL running on non-standard TCP ports
6c721e5
rpm: don’t assume existing dcache user is member of dcache group
fc563b9
cells: don’t log stack-trace on starting cell with same name as running cell
1c42791
statistics: avoid stack-trace on internal timeout
2304f35
pool: fix stack-trace when closing dcap mover connection

Release 3.0.34

admin

The admin interface reported an attempt to connect to an absent cell as a bug. The current release fixed the issue.

httpd

Requests to httpd targeting an unknown resource were returning a 200 OK response code, although 404 NOT FOUND is the closer fit. This is now fixed.

nfs

The current release corrects inaccurate documentation of nfs.enable.pnfsmanager-query-on-move.

Changelog 3.0.33..3.0.34

bcd8d31
[maven-release-plugin] prepare release 3.0.34
e8014a2
nfs: fix documentation of nfs.enable.pnfsmanager-query-on-move
233b6fd
admin: do not report attempts to connect to missing cell as a bug
9a0752d
httpd: return 404 status code on an unknown page
1cd37d0
[maven-release-plugin] prepare for next development iteration

Release 3.0.33

Changes affecting multiple services

dCache no longer logs stack-traces if a Java VirtualMachineError occurs. This is unnecessary as dCache was (presumably) working fine until Java discovered a problem.

chimera

Sites updating to dCache 2.15 or later might observe that a lost+found directory with incorrect permissions was created during the update. This patch ensures correct permissions. Since we cannot know if the current permissions in lost+found are intended, this patch does not modify any existing lost+found directory permissions.

pool

The sweeper free command no longer logs a stack trace if it is started with incorrect input information.

An irrelevant stack trace was logged by the pool. This release corrects that.

Changelog 3.0.32..3.0.33

52aa1504eb
[maven-release-plugin] prepare release 3.0.33
ce3a5fae78
chimera: update schema migration when creating ‘lost+found’ directory.
35db8236df
pool: fix stack-trace on bad command input
f6b22647da
pool: fix stacktrace on FaultEvent logging
da4ba36bbe
system: Don’t log stack-trace on fatal JVM error
74b2e62277
[maven-release-plugin] prepare for next development iteration

Release 3.0.32

alarms

Until now, the sorting order of alarms did not provide a correct ordering for all types of alarms. With this release, alarms are now implicitly ordered by at least their latest modification timestamp.
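A minimal sketch of an ordering that compares the latest modification timestamp first and falls back to other fields only on ties. The actual alarm class is Java (the changelog mentions LogEntry.compareTo()); the field names, the tie-breakers and the ascending direction used here are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    last_update: int  # latest modification time, epoch milliseconds (assumed field)
    kind: str         # alarm type (assumed field)
    key: str          # alarm identifier (assumed field)


def sort_alarms(alarms):
    """Order alarms by timestamp first; type and key only break ties."""
    return sorted(alarms, key=lambda a: (a.last_update, a.kind, a.key))
```

Because the timestamp is the leading element of the sort key, two alarms of different types are still ordered consistently relative to one another.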

resilience

One of the features of resilience is the enforcement of file partitioning on pools according to pool tags. The pool tag restrictions are observed whenever a file is copied. In addition, they are rechecked when a storage unit is updated, in order to make sure the files are distributed correctly according to the new requirements. This is done by removing the offending copies and recopying them in a new location.

Should files get redistributed, however, by rebalancing or a migration job, it is possible that the partitioning will be violated, since only resilience observes it.

The resilience service now verifies that files are distributed according to the requirements specified by pool tags while doing periodic scans (or scans initiated through the admin command).
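The kind of check such a scan performs can be sketched as follows. This is not the resilience implementation: the function name and data layout are hypothetical, and the sketch assumes that partitioning means at most one replica of a file per distinct value of a given pool tag.

```python
from collections import defaultdict


def partition_violations(replica_pools, pool_tags, tag_name):
    """Find tag values shared by more than one pool holding a replica.

    replica_pools: pools that hold a replica of one file.
    pool_tags: mapping of pool name to its {tag name: tag value} dict.
    Returns {tag value: [pools]} for every value used more than once.
    """
    by_value = defaultdict(list)
    for pool in replica_pools:
        value = pool_tags.get(pool, {}).get(tag_name)
        if value is not None:
            by_value[value].append(pool)
    return {value: pools for value, pools in by_value.items() if len(pools) > 1}
```

A non-empty result indicates replicas that would have to be removed and recopied elsewhere, which is the corrective step the text above describes.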

statistics

A possible race condition was removed from the implementation of the create stat admin command.

Changelog 3.0.31..3.0.32

b65e0caea0
[maven-release-plugin] prepare release 3.0.32
ce83a48128
statistics: fix race in “create stat” admin command
a57ac478f7
srm: fix stacktrace on database failure
c47a65e81c
alarms: revert LogEntry.compareTo() to throw NPE on null object
049419c772
resilience: force tag partition checking on scans from admin command and periodic checks
aaad9d2599
alarms: fix natural order comparator to use timestamp first
ea6aed2009
[maven-release-plugin] prepare for next development iteration

Release 3.0.31

dcap

A pool that has gone offline and comes back up again may become very slow to respond due to a large amount of superfluous error messages to dcap clients that disconnected in the meantime. This patch ensures a more responsive reaction to these cases by introducing a time-to-live value for such messages.
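The effect of a time-to-live value can be illustrated with a small Python sketch: messages whose TTL has elapsed are dropped instead of being delivered. The class and function names are hypothetical, and how dCache actually encodes the TTL in dcap messages is not shown here.

```python
import time


class TimedMessage:
    """A message that should be discarded once its time-to-live has elapsed."""

    def __init__(self, payload, ttl_seconds, now=None):
        created = time.monotonic() if now is None else now
        self.payload = payload
        self.expires_at = created + ttl_seconds

    def is_expired(self, now=None):
        current = time.monotonic() if now is None else now
        return current >= self.expires_at


def deliverable(messages, now=None):
    """Keep only the messages that are still worth delivering."""
    return [m for m in messages if not m.is_expired(now)]
```

The optional `now` argument exists only to make the sketch easy to test; real code would rely on the clock.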

pool

Error reporting was improved for cases of IO errors in pools.

Changelog 3.0.30..3.0.31

f7e46f453c
[maven-release-plugin] prepare release 3.0.31
d403eb76f7
pool: avoid ‘null’ and other nondescript error messages
7008ae46a8
[maven-release-plugin] prepare for next development iteration
7ae58e2bc8
dcap: add TTL information to dcap messages

Release 3.0.30

Changes affecting multiple services

Many dCache components use RemotePoolMonitor to provide fast access to the information that PoolManager has about pools. In order to facilitate system diagnosis, the ‘info’ admin command was augmented by information about the current status of the RemotePoolMonitor.

admin

The Apache sshd library used by the admin interface was updated. Although we believe that dCache is not affected, some security vulnerability scanners alert users to a potentially vulnerable library version. The update serves mostly to avoid those false alarms.

pool

A bug in gfal2 results in FTP transfers being aborted some 50 ms after being initiated. This results in the door killing the mover shortly after the pool received the PoolDeliverFile message. If the mover is no longer queued, but not yet fully started, this may lead to the pool disabling itself. This patch corrects that problem, ensuring that the pool continues to run despite any aborted transfers.

xrootd

In the 4.7 releases, the xrootd client started enforcing protocol requirements for kXR_login which, unfortunately, broke access to dCache. The xrootd client expects an answer with a 16-character session ID from the door and then the pool after the redirection. Without this ID, the client would retry (without success) repeatedly and appear to hang.

dCache’s xrootd implementation has been augmented with the session ID, enabling it to work with xrootd clients of version 4.7 and up.
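As a rough illustration, a server can generate a 16-byte session ID and place it at the start of the login response body. The Python sketch below assumes that layout (session ID first, optional security information after it); the function names are hypothetical and this is not dCache's xrootd code.

```python
import secrets

SESSION_ID_LENGTH = 16  # the client expects a 16-character session ID


def new_session_id():
    """Create a random session ID of the expected length."""
    return secrets.token_bytes(SESSION_ID_LENGTH)


def login_response_body(session_id, security_info=b""):
    """Build a login response body: session ID, then optional security info.

    The layout is an assumption made for this sketch.
    """
    if len(session_id) != SESSION_ID_LENGTH:
        raise ValueError("session ID must be exactly 16 bytes")
    return session_id + security_info
```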

Changelog 3.0.29..3.0.30

8aa487149e
[maven-release-plugin] prepare release 3.0.30
a2aa5fc13b
many: add diagnostic information about remote pool monitor
666b22a5bb
pool: dont disable pool if mover cancelled before open
cdc124dce6
update apache sshd version due to security vulnerability
b730eec759
[maven-release-plugin] prepare for next development iteration
1322fd2dcc
dcache: fix bug in PoolSelectionUnitV2 match()
eee56dee0a
dcache-xrootd: Fix login handshake to support xrootd clients (> 4.7.0)

Release 3.0.29

cells

The current release improved some problematic error messages, providing better error reporting to ensure bugs are understood.

httpd

The current release fixed the change of format in transfers.txt that appeared after upgrading from 2.13 to 2.16.

pool

The current release fixed the duplication of log entries and alarms when rebuilding a broken entry.

webdav

Regression fixed in OPTIONS output.

Changelog 3.0.28..3.0.29

f265e50
[maven-release-plugin] prepare release 3.0.29
0eec693
httpd: restore millis to transfer time for transfers.txt
539d3ed
pool: fix log and alarms duplication when rebuilding broken entry
38f5b2d
cells: better error reporting to ensure bugs are understood
c506caf
webdav: fix regression in OPTIONS response
342e21b
[maven-release-plugin] prepare for next development iteration

Release 3.0.28

pool

The current release fixed a regression in the GridFTP OPTS CKSM command, which is now supported again.

The current release fixed a regression that prevented data integrity checks for third-party pull transfers using GridFTP, with dCache acting as the GridFTP client.

resilience

When resilience tries to copy or remove a file that is no longer in the namespace, this should be a fatal error, unlike the discovery that a replica is no longer in cache. These two errors, however, were treated the same way. This is now fixed: the operation is aborted immediately when the file is not in the namespace.

The current release improved error handling for file deletion during a scan and fixed a bug in the formatting and handling of cache exception types.

When doing a pool scan, the storage unit information must be recovered for each file. This is done from the chimera attributes. The code checks an internal map for an index number corresponding to the unit it knows about from the PoolMonitor.

However, if the pool selection unit configuration changes such that a storage unit is eliminated, that mapping will also be deleted from resilience. In the case that the attributes stored in Chimera still have the older storage class information, there will be a NoSuchElementException thrown.

This is now fixed: pool scans that encounter this situation no longer get stuck forever in the queue.

webdav

RFC 7230, section 3.2, states that HTTP header names are case insensitive. Despite this, the standard Milton mechanism for acquiring header information is case sensitive, which (in turn) means that the handling of several HTTP headers in dCache was similarly case sensitive. There were reports that this case sensitivity caused problems with certain clients (e.g., go). The current release fixes this issue: headers that control third-party transfers are now case insensitive, so they should work with more clients.
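A minimal sketch of case-insensitive header lookup, written in Python for illustration only. The function name and the sample header names are hypothetical and are not taken from dCache or Milton.

```python
def get_header(headers, name):
    """Return the value of header `name`, ignoring case, or None if absent."""
    wanted = name.lower()
    for key, value in headers.items():
        if key.lower() == wanted:
            return value
    return None


# Hypothetical request headers as different clients might capitalise them.
request_headers = {"SOURCE": "https://example.org/file", "credential": "none"}
source = get_header(request_headers, "Source")
```

Comparing lower-cased names on both sides means the lookup succeeds regardless of how the client capitalised the header.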

Changelog 3.0.27..3.0.28

70c3e94
[maven-release-plugin] prepare release 3.0.28
06fe64a
Update FileOperationHandler.java
398244a
pool: fix data integrity regression for 3rd-party GridFTP pull transfers
c212dd4
pool: fix regression in GridFTP OPTS CKSM command
f86778a
resilience: handle file deletion during scan correctly
1020c75
resilience: add pool operation logging
3927e50
resilience: handle storage unit NoSuchElement failure
b9868d6
webdav: adjust header parsing to be case insensitive
9512b10
resilience: handle all cases where no locations for file may be discovered
05fd04d
resilience: distinguish correctly between file not in repository and file not found
80175bf
resilience: fix bug in formatting and handling of cache exception types
73bd846
PoolManager : set return code to CacheException.PERMISSION_DENIED if staging is not allowed due to stage protection.
04bef34
ftp: update exception logging to include context
a5e8bd2
[maven-release-plugin] prepare for next development iteration

Release 3.0.27

cells

The current release improved handling of rogue domains with badly formatted dCache versions.

configuration

Some ZooKeeper configuration options may affect how well the ZooKeeper cluster works with dCache. Until now, the lack of documentation on how dCache uses ZooKeeper prevented admins from tuning their ZooKeeper instance. The zookeeper configuration has now been updated with hints on how dCache uses ZooKeeper, along with the corresponding ZooKeeper configuration properties.

pool

The current release fixed the loading of a setup that requires queues created by pool.queues.

webdav

The current release fixed the stack-traces on bad client input.

Changelog 3.0.26..3.0.27

2b591bf
[maven-release-plugin] prepare release 3.0.27
fa68a01
cells: better handling of rogue domains with badly formatted dCache versions
c944c47
configuration: update zookeeper configuration with hints
cb033cb
pool: fix loading ‘setup’ that requires queues created by ‘pool.queues’
545d072
webdav: avoid stack-trace on bad user requests
47d18d7
alarms: guard against NPEs on LogEntry getters
7eb6827
[maven-release-plugin] prepare for next development iteration

Release 3.0.26

admin

While migration move tasks on pools worked correctly, the migration info command failed with an error stating that the current user (root) was not allowed to execute anything (due to missing ACLs). This is now fixed.

Changelog 3.0.25..3.0.26

722f000
[maven-release-plugin] prepare release 3.0.26
43fcf92
resilience: restore extractor class value for extractor property
00a2da1
admin: Fix Inconsistent ACL enforcement, RT 9207
2c0cb3a
[maven-release-plugin] prepare for next development iteration

Release 3.0.25

configuration

For some services it was unclear what were the admin’s responsibilities in terms of consistent configuration. The current release updated the documentation describing what is needed to deploy redundant services.

resilience

The current release improved the configuration of namespace provider properties by making them immutable.

srm

The current release improved configuration for srmPing responses.

Changelog 3.0.24..3.0.25

ca0220a
[maven-release-plugin] prepare release 3.0.25
240fa42
configuration: update description for replicable
583eadb
resilience: remove reference to pnfsmanager property
c73bf8e
srm/srmmanager: fix srmPing confusion
da6bbf2
[maven-release-plugin] prepare for next development iteration
99c08d4
resilience: make namespace provider properties immutable

Release 3.0.24

billing

The current release added documentation that describes how a CellAddress and its fields expand.

config

The current release improves documentation for dCache admins by adding obsolete|forbidden annotations for dropped properties. This helps admins to discover changes that might affect their dCache instance, both through manual inspection and through the dcache check-config command.

ftp

It was discovered that timestamp facts reported by the dCache GFTP server were expressed in the server's local time. This is now fixed: timestamp facts are reported in GMT. Clients like globus-url-copy -sync now work properly, and uberftp -ls returns the correct timestamp.
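The changelog refers to RFC 3659, which defines time values of the form YYYYMMDDHHMMSS expressed in UTC. The Python sketch below shows that formatting; the function name is hypothetical and the code is not taken from the dCache FTP door.

```python
from datetime import datetime, timezone


def rfc3659_time_val(epoch_seconds):
    """Format a modification time as an RFC 3659 time value in UTC."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y%m%d%H%M%S")
```

Passing an explicit UTC time zone avoids the original problem, in which the server's local time zone leaked into the reported value.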

srm

The current release added a new configuration property to allow easy modification of the srm root.

It also updated the srm/manager.properties documentation to describe the relationship between the srmmanager.root and srm.loginbroker.root properties.

Changelog 3.0.23..3.0.24

c92bfa3
[maven-release-plugin] prepare release 3.0.24
a775bcf
srm/srmmanager: update documentation about root path
8ce10c1
ftp: convert timestamps to GMT (to follow RFC 3659)
1521ad1
config: use consistent terminology
5f59a0f
billing: update documentation to describe CellAddress
216b127
srm,srmmanager: add configuration property to allow easy modification of srm root
7bbcbd3
logback: make socket appender construction depend on log level
db2048a
dcache: release dcache-view 1.1.5 for dcache 3.0
e14e316
config: add obsolete|forbidden annotation for dropped properties
24c8a5f
[maven-release-plugin] prepare for next development iteration
454bb6d
config: add obsolete|forbidden annotation for dropped properties
ea3c73d
pnfsmanager: remove obsolete comments from properties file

Release 3.0.23

Changes affecting multiple services

Christoph Anton Mitterer submitted several corrections to the comment string in properties files.

zookeeper

A race condition in zookeeper caused irrelevant stacktraces to be logged on shutdowns. Since this problem is not easily solvable with Zookeeper 3.4, we decided to change the log-level threshold for NIOServerCnxnFactory. Note that the suppressed class is only used by ZooKeeper server (i.e., not the client). Therefore, only the ‘zookeeper’ service is affected.

Changelog 3.0.22..3.0.23

2f93fd01a5
[maven-release-plugin] prepare release 3.0.23
9bd1cac672
use correct terminology
c3fe563c9e
fixed several typos in the documentation
d3dfad0fe5
added hint that pnfsmanagers must use the same DB
c19e087029
zookeeper: work-around race-condition in zookeeper server shutdown
f87dd852a8
[maven-release-plugin] prepare for next development iteration

Release 3.0.22

httpd

The “Disk Space Usage” webpage (/usageInfo) contains a table showing information about each pool in the dCache cluster. The “Layout” column showed the capacity usage graphically, with different colours showing how much of that pool’s capacity is being used for different tasks. This release fixes the Layout heading to describe a previously undocumented colour.

Changelog 3.0.21..3.0.22

9f91b2f
[maven-release-plugin] prepare release 3.0.22
74c1bb2
system-test: add series of functional test for frontend service
1cef325
[maven-release-plugin] prepare for next development iteration
872a906
httpd: Fixed table headers in usageInfo

Release 3.0.21

frontend

Several improvements were made to frontend: Restriction usage is improved, and easily configurable JSON is now available via frontend.

srm

The current release enforced restrictions on srmSetPermission operations.

zookeeper

The current release fixed a race condition in which the ZooKeeper server accepts requests before it is fully initialised.

Changelog 3.0.20..3.0.21

6a0a31b
[maven-release-plugin] prepare release 3.0.21
d111193
frontend: refactor static (configuration) data
6a48e13
spacemanager: dont try to release expired spaces
52f2075
srmmanager: use path to support srmSetPermission operations
da19337
frontend: fix Restriction usage
29478e4
zookeeper: work-around SessionTracker racy initialisation on startup
d65d5b8
[maven-release-plugin] prepare for next development iteration

Release 3.0.20

Changes affecting multiple services

dCache previously shipped both log4j and log4j-over-slf4j libraries. This patch simplifies deployment and prevents possible problems due to non-deterministic library loading sequences by removing log4j dependencies.

frontend

If frontend was asked to create a directory listing while a file was being uploaded into the directory being listed, previously, an error would occur. This patch addresses that issue.

This release includes a new dCacheView version.

info

The info cell can occasionally send messages before it is properly registered. This can lead to Cannot send message with callback in state... messages that do not correspond to any valid error condition. This patch ensures that messages are sent only once the cell is properly registered.

pool

If the communication between a pool and PnfsManager times out, the resulting error message is poorly suited to diagnosing the problem: Failed to instantiate mover due to unsupported checksum type: Request to [>PnfsManager@local] timed out. The checksum type plays no real role here, so this patch updates the error message.

zookeeper

Embedded Zookeeper instances would occasionally log stack-traces on startup. This patch adds a work-around.

Zookeeper logs a stack-trace in cases where the client disconnects unexpectedly. This is unnecessary and potentially confusing, so this patch changes Zookeeper’s behaviour to just logging the incident.

Changelog 3.0.19..3.0.20

bb7d0aa838
[maven-release-plugin] prepare release 3.0.20
7af2a80aff
dcache: release dcache-view 1.1.4 for dcache 3.0
3002758a02
zookeeper: work-around racy startup
0ee8801ff5
zookeeper: silence ZK server errors
67fc64b642
dependencies: remove log4j jar
47e98838a4
frontend: fix “Attribute is not defined: SIZE” bug
9f75784527
pool: fix error message on timeout
6407eb9180
info: avoid sending messages too early.
9c518fbedf
frontend,webdav: add supress-wwwauthenticate to allow headers
e65c4bedec
[maven-release-plugin] prepare for next development iteration

Release 3.0.19

frontend

Clients can now send a Suppress-WWW-Authenticate header in order to avoid the native login dialog in browsers.

hsm

A potential space leak in staging requests has been mitigated.

pool

Since there are various reasons for why a mover may be terminated, the previous log message “Transfer was forcefully killed” is not detailed enough to deduce the reason behind the problem. This patch enables more detailed error messages both in the log files and in billing. Note, however, that the door’s report of the failed transfer may duplicate this information.

webdav

A recent update, commit 5abc0e1c, improved the behaviour of the Milton WebDAV libraries if an IOException occurs during an upload. That patch, unfortunately, did not address all issues, and when non-spec-conformant clients are used against dCache, stacktraces can be triggered.

This patch corrects that behaviour. In addition, the status code returned on errors was changed from 400 to 500, which should signal to clients that they are free to retry the transfer after a timeout.

In case of failing transfers, doors will log more accurate information.

Changelog 3.0.18..3.0.19

b5a645ede8
[maven-release-plugin] prepare release 3.0.19
32ee8c4542
authentication: suppress WWW-Authenticate if requested
4506319d50
webdav: improve logging on transfer failures
3758693ce4
pool: log why a transfer was forcefully aborted
3446c4b8da
webdav: make Milton work-around more robust
0f5c9f7b17
script-nearline-spi: fix space leak when polling script is used
4466f0d37c
[maven-release-plugin] prepare for next development iteration

Release 3.0.18

http

In an HA setup with multiple pool managers, the httpd (old) service failed to fetch the list of restore requests from all instances. Instead the service would query one of the instances, possibly alternating depending on the grouping of services in domains. This behaviour has been corrected, so that the http interface now shows all restored entries even in HA configurations.

webdav

File transfers through WebDAV doors could potentially bypass any Restrictions checks in PnfsManager. This patch ensures that Restrictions are always checked and observed, and improves PnfsManager’s logging to give information in case a Restrictions check is not possible.

Changelog 3.0.17..3.0.18

04ac75e09d
[maven-release-plugin] prepare release 3.0.18
335b645fbf
webdav: Fix restriction check when downloading a file
5502db75af
[maven-release-plugin] prepare for next development iteration
79073fcda7
httpd: Fix incomplete restore list in HA setup

Release 3.0.17

frontend

OpenID connect can now be enabled from dCacheView. This change introduces three new properties in the frontend.properties file, frontend.dcache-view.oidc-provider-name-list, frontend.dcache-view.oidc-client-id-list and frontend.dcache-view.oidc-authz-endpoint-list, all of which are documented in the properties file.

pinmanager

Upgrades from dCache 2.13 could occasionally fail during the Liquibase update stage. The database table pinsv3 can not be dropped if it is still referenced by foreign keys of other tables. This modification adds a cascadeConstraints=true modifier to the dropTable command used during the conversion process. Thereby, updates are possible without errors and PinManager starts without issues after an upgrade.

rest-api

Requests to /api/v1/user now include the username of the requestor. This is intended to make working with OpenID Connect tokens easier.

Changelog 3.0.16..3.0.17

44f602a94e
[maven-release-plugin] prepare release 3.0.17
33fd02c298
httpd, admin: Fix some hard-coded cell names
fb5520975e
frontend: expose open-id connect to dcache-view
7fec3318d9
rest-api: include username to the user attributes
3c465934ee
Add cascadeConstraints=“true” to liquibase dropTable action on pinsv3 table
8083ea91bd
[maven-release-plugin] prepare for next development iteration

Release 3.0.16

nfs

Invalidation of VFS caches has been improved, so that there is no stale information left for dot files.

pool

Error handling in pools was improved for systems using CEPH backends.

Changelog 3.0.15..3.0.16

37e2ad1eac
[maven-release-plugin] prepare release 3.0.16
6d56e52d2d
nfs: bind vfs cache invalidation with file’s layout
89a192ae18
ceph: map RadosException to corresponding IOException
63d766b889
[maven-release-plugin] prepare for next development iteration

Release 3.0.15

nfs

Debugging of stuck transfers has been facilitated by adding the status of transfers to the output of the show transfers command.

The assignment of movers to NFS doors has been made more robust.

system-test

The system-test module, used for demonstration or testing purposes, comes with a built-in X.509 infrastructure. With this release, expired certificates are replaced by new ones.

Changelog 3.0.14..3.0.15

7c9d20a042
[maven-release-plugin] prepare release 3.0.15
a3005ede58
nfs: show transfer status when displayed
605db8097b
system-test: update disposable-CA generated credentials
a45eb26bb3
nfs: fix loosing movers due to short timeout
d17e6332f2
[maven-release-plugin] prepare for next development iteration

Release 3.0.14

Changes affecting multiple services

The version of the PostgreSQL driver used by dCache internally was brought up to 9.4.1212. This fixes the issue described in liquibase bug 2939.

chimera

In order to improve IO performance, the way dCache updates atime values was modified.

nfs

An internal change in the NFS door code helps reduce the number of irrelevant exceptions being logged.

srm

SRM no longer times out file requests. This alleviates a rare race condition where timeouts for the file request and file access competed. Requests that time out now always return SRM_REQUEST_TIMED_OUT at request level and SRM_FAILURE for the SURLs in that request.

Changelog 3.0.13..3.0.14

7a614f2ceb
[maven-release-plugin] prepare release 3.0.14
ace6b59de1
ceph: log repository IO error
706795f5b6
update postgresql driver to version 9.4.1212 to address issue with liquibase changeset and postgresql 9.6 (see liquibase bug 2939)
f6b7af5112
srm: remove file-level timeout
95c465df9f
chimera: do not update inode generation on atime only update
08a3b4cb10
nfs: do not add Origin to read-only subject
a3df429db1
[maven-release-plugin] prepare for next development iteration
2434f8375a
parent/dcache: add mail jar to deployment

Release 3.0.13

pool

The current release improves handling of CEPH exceptions, so that pools can react appropriately to a request for a missing file.

Fixed a double close in pool to pool transfers.

Changelog 3.0.12..3.0.13

484a88d
[maven-release-plugin] prepare release 3.0.13
6562643
[maven-release-plugin] prepare for next development iteration
194c54a
pool: handle CEPH exceptions
7f7d6e2
pool: fix double close on p2p

Release 3.0.12

chimera

There was an issue with symbolic links to a directory where the destination contained a trailing slash. This is now fixed.

nfs

The current release fixes interoperability with RHEL7-based clients and the handling of invalid principals in SETATTR requests sent by OSX clients.

Changelog 3.0.11..3.0.12

4770eea
[maven-release-plugin] prepare release 3.0.12
9acdb7e
libs: update to bug fix release nfs4j–0.13.1
bd98067
srmclient: fix handling of checksum options
be5101d
[maven-release-plugin] prepare for next development iteration
0be5408
chimera : handle empty paths elements path2inumber stored procedure

Release 3.0.11

chimera

The current release fixes the database query for storing multiple checksums for a file.

ftp

The Socket read method may return zero to indicate that no bytes were read. Although this is not an error, such occurrences previously caused the transfer to fail.

This is now fixed.

srmclient

Directory listing against CASTOR resulted in a NullPointerException when the request size was too large, because the CASTOR SRM interface limits the maximum number of responses to 1024. This is now fixed.

Changelog 3.0.10..3.0.11

57ebca3
[maven-release-plugin] prepare release 3.0.11
8e607b6
srmclient: fix compatibility with castor
3389013
ftp: prevent execution of most commands when unwrapped
c422b7a
chimera: fixed database query for storing multiple checksums for a file.
71a6ac9
ftp: do not fail proxy transfer if read returns zero bytes
c480738
[maven-release-plugin] prepare for next development iteration

Release 3.0.10

ftp

The current release adds an implementation of the MLSC command. As a result, Globus is able to query the contents of dCache directories using FTP without creating additional TCP connections.

lm

The current release fixed the issue with duplicate location manager connectors.

Changelog 3.0.9..3.0.10

2b4e595
[maven-release-plugin] prepare release 3.0.10
301142f
lm: Do not rely on thread interrupt for shutdown
6b93af9
ftp: implement the MLSC command
363149a
[maven-release-plugin] prepare for next development iteration

Release 3.0.9

ftp

The current release improves compatibility between dCache FTP client and Globus GridFTP server.

resilience

When a file has no readable replicas, the default behavior of resilience is to raise an alarm. Files marked CUSTODIAL were excluded from this. The current release changes this so that an alarm is raised for all files without readable disk copies, whether the file is CUSTODIAL or not.

restful

dCache has the concept of a home directory: each user may have a directory that is somehow “theirs”. This information was not available to clients accessing dCache via the REST interface. The current release fixed this and querying /api/v1/user provides the user’s home directory in addition to other user-centric information.

The RESTful interface exposed some namespace-targeted information through /api/v1/namespace, and QoS information through /api/v1/qos-management/namespace. This was undesirable as dCache exposes namespace-targeted operations in two places. This is now fixed and the client may obtain information by querying /api/v1/namespace.

The current release improves error handling in the RESTful interface. The error JSON object now contains potentially useful information in its message, and bugs are logged.

srm

During an ATLAS stress-test of tape recalls, it was discovered that various sites had relatively short request lifetimes. However, the SRM spec provides the opportunity for the server to inform the client (FTS, in this case) of what lifetime a request actually has. The current release includes the request’s remaining lifetime in the response from the server.

The current release improves the documentation of the SRM request lifetime configuration properties, to help admins better understand how to configure dCache correctly.

Changelog 3.0.8..3.0.9

b243cb0
[maven-release-plugin] prepare release 3.0.9
926bf7c
srm-client: use default GridFTP port if not specified
6b65522
ftp: Add support for SITE WHOAMI command
d3d12b9
resilience: remove check on CUSTODIAL for inaccessible files
8b4ab7c
ftp: update parsing of CLIENTINFO command
5911a18
srm: include remaining request lifetime in various responses
c3662f2
srm: update srm request.*.lifetime configuration properties documentation
0284bac
ftp: modify facts describing namespace ownership
4136c76
ftp: add support for SITE TASKID command
f8c16f5
ftp: add initial support for checksum performance markers
031862e
restful: support querying QoS through namespace
8d74684
[maven-release-plugin] prepare for next development iteration
393ba5d
rest: add home directory to user info
b5c8a70
restful: improve error handling
fccd7bf
ftp: add support in OPTS RETR for specifying performance marker frequency
19367b7
ftp: show SIZE facts for directories
bbf1c08
ftp: add support for paths relative to home directory

Release 3.0.8

frontend

This release of dCache ships with an updated dCache View version.

xrootd

In https://github.com/xrootd/xrootd/issues/459, it became apparent that dCache could improve xrdcp compatibility by sending checksum information in lower case. This release contains this change, which should improve xrootd operations.

Changelog 3.0.7..3.0.8

b856742
[maven-release-plugin] prepare release 3.0.8
2056dd7
xrootd : use lower case for checksum algorithm names when replying to checksum queries.
a87a7c6
[maven-release-plugin] prepare for next development iteration
cb40a59
dcache: update dcache-view version to 1.0.5

Release 3.0.7

cleaner

The list of Delete Notification targets that the Cleaner will inform when a file’s content has been cleaned is admin-configurable. In order to facilitate this configuration, the ‘show info’ command has been modified to also show Delete Notification targets.

srm

The SRM code has been made more robust against races between file deletions and copies.

systemtest

The ‘system-test’ script was updated to ensure anonymous dcap tests succeed.

Changelog 3.0.6..3.0.7

a6c0932
[maven-release-plugin] prepare release 3.0.7
4e25147
cleaner: show delete notification targets
4d70728
systemtest: allow anonymous dcap activity
fa18be9
[maven-release-plugin] prepare for next development iteration
54038a3
srm: fix recovery procedure in internal copy if source is deleted

Release 3.0.6

cells

Fixed a race condition and a logging bug affecting cells that rely on ZooKeeper. This could affect location manager, space manager and pool manager.

gplazma-ldap

Wrong quoting of two gplazma property strings could occasionally cause IllegalArgumentExceptions to be thrown when using LDAP-based logins. This problem has now been fixed.

locationmanager

Fixed a problem in Location Manager in which it would fail to shut down and could leave two overlapping connectors running.

pool

The process of deleting files on pools has been made more robust against potential problems. This further decreases the likelihood of repository corruption during file deletion.

This release improves transactional stability of the built-in Berkeley database in pools.

srmclient

SRM Client v1 received improved error handling. Most noticeably, expired credentials no longer cause a RuntimeException but an IOException.

Changelog 3.0.5..3.0.6

92c752d
[maven-release-plugin] prepare release 3.0.6
52becab
pool: Fix race and nested transaction problem during deletion
cc8eb74
pool: Fix transactional behaviour with Berkeley DB
042c8ef
lm: Fix lost thread interrupt that prevents shutdown
b3d670f
cells: Schedule watcher notifications on cell event thread
b0bab0c
gplazma-ldap: fix default properties for root and home
a21d861
[maven-release-plugin] prepare for next development iteration
ecdf153
srmclient: consolidate credential expiry, don’t use RuntimeException

Release 3.0.5

frontend

Included a new version of dCacheView with minor improvements.

nfs

The handling of NFS requests for offline files has been improved. Such requests will no longer attempt unnecessary retries.

Changelog 3.0.4..3.0.5

6b4ef4c
[maven-release-plugin] prepare release 3.0.5
0b6d777
dcache: update dcache-view version to 1.0.4
123b47c
resilience: fix fairness algorithm bugs in file operation map
3c04c2d
nfs: rework pool selection to get rid of too many retries
61e8ed1
minor typo corrections
44ad5c2
[maven-release-plugin] prepare for next development iteration

Release 3.0.4

dcache

The current release updates the dcache-view version.

pool

The current release changes the hsm ls command so that it lists all configured HSMs if no argument is provided.

The current release fixes a URISyntaxException in the CEPH backend.

poolmanager

The ability to restrict staging to a specific protocol was missing from the stage protection specification. Additionally, it was not possible to specify DNs, FQANs, storage groups and protocols that are not allowed to stage.

This is now fixed and stage protection configuration can be used as a blacklist.

Changelog 3.0.3..3.0.4

21783c2
[maven-release-plugin] prepare release 3.0.4
5314301
stage protection : minor correction for test to pass
9882a8f
3.0 compatibility changes for stage protection
1f7e1c2
pool: fix URISyntaxException in ceph backend
df629f2
PoolManager : stage protection add protocol to the list of discriminators
26c0f5f
dcache: correct the output directory
efd258c
dcache: update dcache-view version
c850ff0
pool: ‘hsm ls’ must show all configured hsm if no argument is provided
03d3434
[maven-release-plugin] prepare for next development iteration

Release 3.0.3

cells

The current release fixed an issue causing doors to log a stack trace on startup.

nfs

A Remote I/O error was generated on the client side, accompanied by a stack trace on the server side, when a user fumbled the .(get) command. This is now fixed: the client sees a ‘No such file or directory’ error and no stack trace is written to the log file.

srmclient

The current release fixes an out-of-memory issue that occurred when srmls -l was used to list particular directories.

srm-common

The current release improved user experience by ensuring that bugs and JVM-critical errors do not trigger a retry of the operation, but favour a fail-fast approach.

transfermanager

The output of info, queue, ls internal and ls external admin commands in RemoteTransferManager has been improved.

Changelog 3.0.2..3.0.3

e0c5ab9
[maven-release-plugin] prepare release 3.0.3
a832e71
srm-common: don’t retry when JVM runs out of memory
104d79a
transfermanager: tidy up output from ‘info’ ‘queue’ and the two ‘ls’ commands
c814f54
cells: fix ‘This stopwatch is already stopped’ error
3859a07
srmclient: use streaming deserialisation, allow more memory.
a1e2ffb
chimera-enstore: specify fields order on insert
90021dd
NFS: throw FileNotFoundHimeraFsException in case of fumbled “.(get)” command
8768838
[maven-release-plugin] prepare for next development iteration

Release 3.0.2

cleaner

Users reported that they wanted space freed up by cleaner processes to be reported as free as soon as possible. This patch sends notifications about freed up space more often, resulting in quicker status updates.

doors

Log when the list of auto-discovered NIC interfaces changes. Support logging the time taken to acquire NIC information. Update logged messages to indicate the context.

nfs

The NFS implementation was updated, improving spec compliance.

pool

A problem was fixed that could cause the csm check to fail on pools containing broken files.

When deleting files, the cleaner submits batches of files to delete to a pool, but this batch is then processed sequentially. In order to improve deletion performance, such deletion tasks are now run concurrently. The increase in concurrency can reduce the overhead of syncing the meta data to disk after every deletion, thus increasing the deletion rate.

srmclient

While srmfs’s ‘-debug’ option already provided lots of information, including the SOAP request sent to the server, it did not provide the server’s response until now. This has been added, making troubleshooting easier for some problem classes.

The SRM client has been updated to work correctly if it has been installed in a path containing spaces.

Glob expansion in SRM client has been overhauled, allowing much faster namespace operations on DPM when globbing is in use.

srmfs would not work with paths that require special characters to be percent-encoded. This included path names with spaces or hash symbols. After this update to srmfs, such path names are permissible and all namespace operations and file transfers on them will succeed.
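
To illustrate the kind of percent-encoding involved, here is a minimal Python sketch using the standard library. It is not the srmfs implementation; the path is made up and the host name is only an example.

```python
from urllib.parse import quote

# Hypothetical path containing a space and a hash symbol.
path = "/data/my file#1.txt"

# quote() leaves "/" untouched by default and percent-encodes
# characters that may not appear unescaped in a URL path.
encoded = quote(path)

surl = "srm://srm.example.org" + encoded
print(surl)  # srm://srm.example.org/data/my%20file%231.txt
```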

srmfs now accepts single commands directly as command-line arguments, e.g. ‘srmfs srm://srm.example.org/path ls’

Exiting will block if there are ongoing transfers. Notifications are printed, which should provide feedback on activity.

The user can force an exit by using Ctrl+C, which will still result in a clean exit.

Changelog 3.0.1..3.0.2

6a23d46
[maven-release-plugin] prepare release 3.0.2
a2f59a5
nfs41: fix open-stateid leak introduced in 0702d73
affa1b4
srmclient: support srmfs with command on command-line
97f2f8d
srmclient: show SOAP response from server when debug enabled
842f8f0
srmclient: percent-encode paths that require it
20ffed0
srmclient: fix stat/ls caching for glob expansion for DPM
d2d4b4c
srmclient: fix deploying srm-client in path with spaces
192462c
srmclient: fix Java 7 compatibility for srmfs
ba8b836
doors: add diagnostic information for NIC auto-discovery
8a8b83d
pool: Delete files concurrently
8d77acc
cleaner: Send notification more often
7a542d2
pool: Fix csm check command in the pressence of broken files
1fdb643
[maven-release-plugin] prepare for next development iteration
a6f5d40
nfs: discover open-stateid during layout return

Release 3.0.1

cells

The current release improves reply handling in batch processing.

ftpclient

The current release improves compatibility between dCache FTP client and Globus GridFTP server.

The dCache GridFTP client did not work with the Globus FTP server. This is now fixed.

pool

The current release introduces a new pool.mover.nfs.thread-policy property, which makes it possible to control how NFS requests are processed. dCache admins can choose one of the available strategies for NFS request processing. Please note that by default the request processing policy is set to the same-thread strategy.

srmclient

srmfs was issuing SRM requests that DPM did not accept. srmfs also did not accept the responses from DPM. This prevented any file transfers from succeeding. This is now fixed and srmPrepareToGet and srmStatusOfGetRequest operations work as expected with DPM.

The two ls commands (lls and ls) did not support multiple arguments or glob expansion. Both took a single, optional argument, which was limiting when discovering what content is available and did not match users’ expectations. The current release fixes this limitation: multiple arguments and glob expansion are now supported for the ls and lls commands.

The information displayed about directories from DPM is now correct.

This release of dCache improves the readability of the error returned to the client when the server declines to add an explanation.

Changelog 3.0.0..3.0.1

0f643c0
[maven-release-plugin] prepare release 3.0.1
3e2e814
cells+dcache: fix Reply handling in batch processing
9f5745f
ftpclient: fix compatibility with Globus FTP server
1670622
pool: optimize checksum channel calculation
b6f591c
pool: simplify checksum channel logic
a2375f1
srmclient: ensure a reasonable error message
330e9f5
srmclient: fix DPM compatibility with SRM transfer requests
6d7b7e6
ftpclient: fix multiline ADAT reponses
9abf787
ftpclient: encrypt SITE CLIENTINFO command
7279b09
pool: add a possibility to control nfs movers threading policy
8db0380
srmclient: add glob support for lls and ls
8cb2378
srmclient: avoid NPE when stat directory with DPM
d12117a
[maven-release-plugin] prepare for next development iteration

Release 3.0.0

The most obvious change is the transition to a new versioning scheme that better reflects our existing release and compatibility policy.

Once a year, dCache.org releases a golden release. The golden release maintains compatibility with pools of the previous golden release and any intermediate version. The release after the golden release drops compatibility with pools older than the golden release it surpasses. The new versioning scheme reflects the incompatibility by incrementing the major version number in the release after a golden release. Thus the golden release marks the last release of a particular major version. Like before, patch level releases are identified by a third number in the version.

As always, this dCache release contains many internal changes, many performance improvements and many cosmetic changes that will not be explicitly described in the release notes. The changelog at the end contains the complete list.

As always in a release after a golden release (an .0 release in the new versioning scheme), configuration properties previously marked obsolete have been dropped. Configuration properties previously marked deprecated are now obsolete. You should run dcache check-config before upgrading and fix any warnings. Rerun dcache check-config after upgrading.

The following are noteworthy changes or changes that you as an admin should know about when upgrading. The changes are described in no particular order.

admin: no longer supports DSA keys

Upon upgrade, the post installation script will automatically generate RSA keys instead. Your ssh client may warn you about the changed server key. The admin.paths.dsa-host-key.private and admin.paths.dsa-host-key.public properties are obsolete.

Automatic detection of current PostgreSQL master

dcache.db.host and derived properties now accept a comma separated list of host names when used with PostgreSQL. This is intended for use with a cluster of PostgreSQL servers using streaming replication. The JDBC driver automatically detects and uses the current master server. In case a slave is promoted to master, the JDBC driver automatically discovers and reconnects to the new master without requiring a dCache restart.

caNl 2.1

Although this change should be transparent, the update of the Common Authentication Library, the Bouncy Castle cryptographic library, the ARGUS library, and the VOMS library is critical and you never know if this breaks something. Keep an eye out for CA and client problems.

On the positive side, this upgrade allowed us to update the Apache SSHD library which hopefully fixes a number of compatibility problems observed with some Python SSH libraries.

billing: empty format strings suppress output

Previously, if a formatting string was undefined, billing records would be logged using a default format. This has been changed such that billing records with an undefined or empty formatting string are not added to the billing file. This allows specific record types to be suppressed.

billing: make billing files self-describing

Upon startup or when rolling over to a new billing file, the billing service now outputs formatting instructions to the billing file. These are lines starting with ## followed by the message type and the configured formatting string. Parsers may make use of the instructions to recognize the format. At the very least, third party parsers should be updated to ignore lines starting with a hash symbol.
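
A third-party parser could handle these instructions along the following lines. This is a minimal Python sketch, not the code of the billing indexing utility. It assumes that an instruction line has the layout "## <message type> <format string>", separated by whitespace; the message type, format string and record in the sample are invented for illustration.

```python
def read_billing(lines):
    """Split billing lines into format instructions and records.

    Assumption: an instruction line looks like
    '## <message type> <format string>'. Any other line starting
    with '#' is treated as a comment and skipped.
    """
    formats = {}   # message type -> most recent format string
    records = []   # raw record lines, in file order
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        if line.startswith("##"):
            parts = line[2:].strip().split(None, 1)
            if len(parts) == 2:
                formats[parts[0]] = parts[1]
            continue
        if line.startswith("#"):
            continue
        records.append(line)
    return formats, records


# Invented sample content; real billing files will differ.
sample = [
    "## example-message $field1$ $field2$",
    "# some other comment",
    "value1 value2",
]
formats, records = read_billing(sample)
print(formats)
print(records)
```

A parser that does not care about the format can simply skip every line starting with a hash symbol, which is the last branch before the record is stored.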

The billing indexing utility has been updated to recognize these formatting instructions and thus can parse the billing files even when the output format is changed and billing files thus contain a mix of different formats.

Configuration properties with the common prefix billing.parser.format were added to define the default format for old billing files without formatting instructions. Thus the output format can be changed while still maintaining compatibility with old billing files.

billing: expose full address in billing record

Previously, some billing records were logged with a full cell address while others only included a cell name. The default format retains this behaviour, however the full cell address is now exposed in the formatting string. The cell name and domain name are also exposed separately, making it possible to output these as separate fields. Check the instructions in billing.properties for details.

billing: make service replicable

It is now possible to add multiple billing services if all components (including pools) have been updated to version 3.0. All instances of billing receive all billing records.

Introduced dcache.queue.billing and dcache.topic.billing properties and marked dcache.service.billing obsolete.

Setup files are now restricted to setup commands

The setup files of various services, in particular poolmanager.conf and setup on pools, are now restricted to setup commands. These are the commands the services themselves use when generating the setup files. Previously any cell command could be included, even though the command would be lost when the setup file was regenerated.

Setup file syntax is now checked before it is loaded

When reloading a setup file at runtime using the reload command, the file is now checked for syntactic errors before being applied to the running service. If an error is found, the file is not loaded and the service is unaffected. Previously, such errors would leave the service in an inoperable state.

Legacy cell topology support has been dropped

dCache 2.16 introduced the use of ZooKeeper for topology discovery, as well as the use of core domains as message brokers to which satellite domains connect. For backwards compatibility, 2.16 included the legacy UDP discovery service in dCacheDomain and dCacheDomain was a core domain by default. This UDP service is no longer included and compatibility with pools older than 2.16 has been dropped. dCacheDomain no longer defaults to being a core domain and it is in fact no longer a special name - you may call it whatever you like.

This however means that unless you define some domain as a core domain, none of your dCache domains will be connected and you will be left with an inoperative dCache. If you haven’t already done so in 2.16, upon upgrade designate one of your domains as a core domain by defining dcache.broker.scheme = core for it. Your former dCacheDomain would be an obvious candidate for such a domain.

Lots of improvements to service life-cycle management

We made lots of changes to the life cycle of dCache cells. These changes are mostly internal, but the short story is that we try to improve how dCache starts and in particular how it can cleanly shut down.

In the past, clean shutdown wasn’t all that important as the service would be inoperable anyway when shut down. With the addition of high availability deployments, this assumption is no longer true and we would like to be able to shut down cleanly. The goal has not been achieved yet in dCache 3.0, as there are still plenty of issues causing exceptions on shutdown, but the situation should be better than in previous releases.

As an admin, the obvious change you will observe is that errors in the log files have changed. In particular we log more stack traces when detecting errors and we log cells that shut down slowly; even though slow shutdown isn’t necessarily an error as many services now do more work while shutting down.

New wire protocol for cell communication

A new wire protocol for interdomain TCP connections for cell communication has been introduced. This protocol has lower overhead than the old protocol. Connections with 2.16 domains are detected automatically and the old protocol is used as a fallback.

dcap: fewer threads and faster

The DCAP door was refactored. Compared to the earlier version:

  • it uses about half the number of threads,
  • it reuses threads to reduce overhead,
  • it detects client disconnects even if the door is busy,
  • it no longer supports the legacy feature of blocking a DCAP door by defining the dcapLock entry in the domain context.

spacemanager: improved latency on non-spacemanager transfers

Space manager is sort of a proxy for pool manager. When enabled, doors send requests they usually would send to pool manager to space manager instead. This had the effect of adding a little extra latency to all transfers, even reads and unmanaged transfers.

The interaction between pool manager, space manager and doors has been heavily refactored. One consequence is that the space manager can inject its own behaviour into the doors. This allows the requests to be sent directly to pool manager if it can be determined that space manager is not involved in processing it.

poolmanager: persist setup file in ZooKeeper

The pool manager setup is now preserved across restarts even without saving the setup to disk. This is achieved by persisting the setup in ZooKeeper whenever it changes. Upon restart, the setup is loaded back from ZooKeeper.

The consequence is that poolmanager.conf is only loaded when either no setup can be found in ZooKeeper, or when the admin explicitly uses the reload command. Restarting pool manager will not reload poolmanager.conf. As before, the save command is used to update poolmanager.conf and the file is not updated automatically.

This feature forms the basis of synchronizing the setup between multiple instances of the pool manager in a high availability deployment of dCache: Whenever the setup changes in one pool manager, the setup is stored in ZooKeeper and other instances discover the change and apply the update locally.

poolmanager: persist pool read-only status in pool manager setup

When using the psu set pool -rdonly command to mark a pool as read only, this restriction was not persisted in the pool manager configuration file. Consequently, the read only status would be reset by a pool manager restart.

The read only status is now stored in both poolmanager.conf and in ZooKeeper and thus the read only status of a pool survives restarts.

Note that the read-only status in pool manager is a separate concept from the pool disable state defined in pools.

poolmanager: make the service replicable

Pool manager is now a replicable service. This means one can create multiple redundant instances of pool manager and the load will be distributed over the available instances. The pool manager configuration is synchronized through ZooKeeper. For a given set of pool managers, a particular file is always processed by the same instance.
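
One way to obtain such a stable mapping from files to instances is to hash the file identifier together with each instance name and pick the highest score. The Python sketch below uses rendezvous (highest-random-weight) hashing purely as an illustration; these notes do not state which scheme dCache uses, and the instance names and file identifier are hypothetical.

```python
import hashlib


def pick_instance(file_id, instances):
    """Return the instance with the highest hash score for file_id."""
    def score(instance):
        digest = hashlib.sha256(f"{instance}:{file_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big")
    return max(instances, key=score)


# Hypothetical pool manager instances and file identifier.
managers = ["PoolManager@domainA", "PoolManager@domainB"]
file_id = "0000ABCDEF"

chosen = pick_instance(file_id, managers)
# The same set of instances, in any order, yields the same choice.
same = pick_instance(file_id, list(reversed(managers)))
print(chosen == same)
```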

poolmanager: allow decentralized pool selection

Pool manager has traditionally been the central place in which all pool selection (assigning transfers to pools) has been performed. Pool manager allows great flexibility in how this selection is performed, including cost based replication, load and space triggered fallbacks, staging from tape or to export pools, etc.

To accurately enforce the different selection strategies and the various cost limits, it was essential that pool manager had a complete picture of the load of all pools. Cost updates from pools are only distributed every 30 seconds, so to obtain the necessary accuracy, pool manager would update its internal cost estimates whenever it selected a pool and it would also intercept messages between doors and pools indicating the beginning and end of a transfer.

This model falls apart if pool selection is distributed over several components as any adjustments to the internal cost state would be local to the component selecting the particular pool. Such distributed pool selection could be found in a high availability deployment of dCache in which multiple redundant pool managers each perform pool selection for a shard of the files. Another, not currently implemented, scenario is delegating pool selection to doors to reduce the latency of pool selection.

dCache 3 reimplements how the various cost limits of pool manager are enforced. Rather than assuming that a pool manager has perfect knowledge, pool selection is performed under the assumption of reasonably, but not perfectly, accurate data. Whenever the pool manager selects a pool it also encodes the assumptions under which this pool was selected (e.g. load is below a certain threshold, pool has enough free space, etc). This assumption is forwarded by the door to the pool and checked by the pool before enqueuing a mover. If the assumption fails, a cost update is prematurely submitted to pool manager and the mover creation request is rejected. The door will resubmit the pool selection request to pool manager, which now has more accurate cost information.
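
The interaction can be sketched as a toy model in Python. This is only an illustration of the select, check, reject and retry cycle described above, not dCache code: the class names, the single load threshold and the pool names are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Assumption:
    """Condition under which pool manager selected the pool."""
    max_load: float


@dataclass
class Pool:
    name: str
    load: float  # the pool's actual, current load

    def start_mover(self, assumption):
        """Accept the mover only if the assumption still holds."""
        return self.load <= assumption.max_load


class PoolManager:
    def __init__(self, pools, max_load):
        self.max_load = max_load
        self.pools = {p.name: p for p in pools}
        # Possibly stale view of each pool's load.
        self.known_load = {p.name: 0.0 for p in pools}

    def cost_update(self, pool):
        self.known_load[pool.name] = pool.load

    def select(self):
        name = min(self.known_load, key=self.known_load.get)
        return self.pools[name], Assumption(self.max_load)


def door_transfer(manager, max_attempts=3):
    """Ask for a pool, retrying when the pool rejects the assumption."""
    for _ in range(max_attempts):
        pool, assumption = manager.select()
        if pool.start_mover(assumption):
            return pool.name
        # Rejected: the pool sends an early cost update, then we retry.
        manager.cost_update(pool)
    return None


pools = [Pool("pool_a", load=0.9), Pool("pool_b", load=0.2)]
manager = PoolManager(pools, max_load=0.5)
selected = door_transfer(manager)
print(selected)  # pool_b
```

In this run the manager first picks pool_a on stale data, the pool rejects the request because its real load exceeds the threshold, and the retry lands on pool_b.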

This change is mostly invisible, although it may certainly affect the dynamic behaviour of pool manager, replication and assignment of transfers to pools.

By the way, the cm info command of pool manager has been removed as there is nothing useful for it to report now. The commands cm set active, cm set update, cm set debug, cm set magic are obsolete. These should be removed from poolmanager.conf on upgrade.

poolmanager: improve stability of hot spot replication

Pool manager can be configured to trigger replication of files when the source pool exceeds a certain load threshold.

Unfortunately, the load was defined in terms of the average relative queue length of the various mover queues on the pool, including the queues of pool to pool transfers. Since pool to pool queues typically (and wisely) have low limits, the relative weight in the load calculation of pool to pool transfers is often high. The consequence is that triggering a replication may itself push the calculated load of the pool higher, triggering even more replications. This is bound to generate an unstable system.

dCache 3 changes how the load of a pool is calculated to exclude the two pool to pool queues as well as the HSM stage queue from the load calculation. Existing cost cut limits may have to be retuned as the calculated performance cost has changed.
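
The feedback effect can be seen in a small worked example. The Python sketch below is a simplification that assumes the load is the plain average of active movers divided by the queue limit; the queue names and numbers are hypothetical and the real calculation may include further terms.

```python
# Queues left out of the load calculation in dCache 3 (names are hypothetical).
EXCLUDED = ("p2p-send", "p2p-receive", "stage")


def cost(queues, excluded=()):
    """Average of active/limit over the queues not in `excluded`."""
    ratios = [active / limit
              for name, (active, limit) in queues.items()
              if name not in excluded]
    return sum(ratios) / len(ratios)


# (active movers, queue limit) per queue.
before = {
    "regular": (25, 100),
    "p2p-send": (1, 2),
    "p2p-receive": (0, 2),
    "stage": (0, 10),
}
# One more replication is started from this pool.
after = {**before, "p2p-send": (2, 2)}

print(cost(before), cost(after))                      # 0.1875 0.3125
print(cost(before, EXCLUDED), cost(after, EXCLUDED))  # 0.25 0.25
```

With the old definition a single extra replication raises the load from 0.1875 to 0.3125, while with the pool to pool and stage queues excluded the load stays at 0.25. The absolute values also differ between the two definitions, which is why existing limits may need retuning.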

pool: drop the limit on pool to pool receive queues

A pool has two queues dedicated to pool to pool transfers; one for sending files and one for receiving files. The receiving end does not limit the number of concurrent transfers to avoid deadlocks in pool to pool transfers. For purposes of calculating a performance cost, the pool did however define a limit anyway (as the load was determined as the average queue length relative to the queue limit).

As described in the previous section, load no longer considers the pool to pool queues and thus the unenforced limit on the receive queue has been dropped. The pool pp set max active command has been deprecated. The queued and max columns of the pool to pool receive queue in webadmin and httpd are now empty.

pool: log why a mover is killed

The pool has been extended with support for injecting a message when killing a mover. Components and services that kill movers make use of this feature to document why the mover was killed. This information is included in billing and depending on the protocol it may also be propagated to the client.

pool: allow continuous flushing to tape

Traditionally, flush to tape on pools was batched by storage class. Once a batch of files began to flush, new files arriving on the pool would not begin flushing until the running batch had completed. For HSM subsystems that do their own batching, the batching in dCache is typically counterproductive as the driver or some external component may wait for more files even though such files would be held back by the pool to be included in the next batch.

To resolve this issue, dCache 3 adds an -open option to the queue define class command in the pool service. This option alters the logic such that newly created files are added to an existing batch immediately.

pool: support polling scripts with the HSM script driver

The classic script HSM driver invokes an external script to flush, stage or remove files from tape. Usually, this operation is blocking, meaning that many concurrent instances of this script will run at the same time.

Some clever admins have found a workaround for this scalability issue by letting the script poll for the completion of the operation. Rather than block, the script terminates with a failure. This relies on dCache pools retrying HSM operations. This works well, except that it fills the billing files and log files with error messages.

To support this type of script, dCache 3 extends the script driver to recognize 72 as a special return code. When the script exits with this return code, the driver places the request back into its queue and retries it without propagating the error to dCache. The result is that the operation is retried without spamming the log files and billing files. The retry period can be configured using the new -p:delay option.

Note that for flush, the retry triggered by exit code 72 is within the same batch. Thus new files will not be added to the queue until the batch completes, except if the flush queue is defined with continuous flushing (see the -open option in the previous section).
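
A polling script built around this return code could be structured roughly as follows. This is a minimal sketch: only the value 72 comes from the description above, while the completion marker file and the argument convention are hypothetical and are not the arguments dCache passes to HSM scripts.

```python
import os

RETRY_LATER = 72  # return code the script driver treats as "requeue and retry"


def exit_code_for(marker_path: str) -> int:
    """Return 0 once the (hypothetical) completion marker exists, else 72."""
    return 0 if os.path.exists(marker_path) else RETRY_LATER


def main(argv) -> int:
    # Hypothetical convention: the single argument names the marker file that
    # an external component creates when the tape operation has finished.
    if len(argv) != 2:
        return 1  # an ordinary failure, reported to dCache as an error
    return exit_code_for(argv[1])
```

A real script would end with `sys.exit(main(sys.argv))`. The point of the pattern is that the script returns immediately instead of blocking, and the driver calls it again after the configured retry delay.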

pool: alter failure semantics for empty files

When upload to a pool fails, the file is usually registered in the name space like any other upload - if a checksum or file size was predefined, the file may optionally be marked broken. In either case, the file would be considered “complete” in the sense that no further modifications to the file are allowed.

One common failure mode for uploads is when no data is uploaded at all. This could be because the mover was killed before it could be started, that is, while it is queuing; or because the client didn’t connect to the mover; or because the mover failed to connect to the client. Networking issues could be a common source of such failures. Due to the failure semantics, such movers could not be retried by the door.

In dCache 3 the failure semantics for such empty uploads has changed such that the replica on the pool is instead deleted and not registered in the name space. The name space entry is left in its “virgin state”. Protocols like NFS and DCAP can open such files for writing, allowing the upload to be retried.

Eventually, other doors will be updated to retry write mover creation, but the current release doesn’t do that yet.

pool: allow runtime management of mover queues

Pools allow movers to be placed on queues. Several queues may be created, e.g. to treat internal and external transfers differently, or to queue transfers by protocol.

Previously, custom queues were defined using the pool.queues configuration property, requiring a pool restart to change the defined queues.

In dCache 3, the admin shell commands mover queue create and mover queue delete have been added instead. The existing pool.queues property is deprecated. The queue definitions are persisted to the pool setup file when issuing the save command.

By the way, the unused mover remove command has been removed.

pool: deprecated commands for HSM limits have been removed

The deprecated rh set max active, st set max active and rm set max active commands are no longer supported in pool setup files. Use the equivalent provider specific options instead. We recommend running the pool save command before upgrading - at least if this has not been done since upgrading to 2.9.

pool: optimize Berkeley DB meta data updates

The part of the pool abstracting file and meta data operations - the repository - has been heavily refactored. One consequence is that Berkeley DB updates are now transactional, improving consistency and hopefully performance too as updates are batched together.

pool: allow CEPH to be used as a storage backend

One consequence of the heavily refactored repository component in pools is that it has become easy to plug in alternative backends. CEPH is a distributed block storage that has gained popularity in recent years. In dCache 3, CEPH can be used as a backend for pools, allowing the file data to be stored in CEPH. The meta data is kept in the traditional Berkeley DB format.

See the documentation of the pool.backend property for choosing an alternative backend. Several CEPH related properties starting with pool.backend.ceph have been added. Consult the documentation in pool.properties for details.

One consequence of the change is that HSM drivers can no longer expect replicas to be stored as files on the local file system. Instead replicas in a pool are now identified by a URI. When developing HSM drivers or HSM scripts, this should be taken into consideration. Backwards compatibility is however important to the dCache team and as long as a local file system is used, the HSM driver and HSM scripts are passed a regular path rather than a URI.

It has to be stressed that CEPH support is considered an experimental feature. The more encouraging news is maybe that this shows that it is fairly easy to hook alternate storage backends into dCache.

pool: sorted listing of movers

In the nice to have category, dCache 3 supports options in the mover ls admin shell command to sort the output by mover’s last access time and transferred data size.

ftp: port door to Netty

Netty is a non-blocking I/O framework used by xrootd doors and for xrootd and http movers in pools. In dCache 3.0 ftp doors make use of Netty too. This is pretty much transparent, but it allows the door to scale to more concurrent connections with lower resource usage.

By the way, we also upgraded to Netty 4.1 with buffer pooling enabled - this should reduce Java garbage collection overhead.

ftp: abort upload if SRM invalidates TURL

Previously, when the transfer URL generated by the SRM was invalidated, either because it expired, the request was aborted, or the SURL was deleted, the underlying FTP upload would be allowed to continue. The upload would eventually fail when the last byte had been transferred, wasting bandwidth and generating a misleading error.

This has been changed such that the SRM will kill an FTP upload when it invalidates a TURL.

srm: added extra fields to the srm access log

The srm access log contains a description of every client request. In dCache 3, several extra fields have been added. In particular for single file requests, the access log now contains the path of the file to which the request applies.

webdav: use relative links in generated HTML

The default templates for the HTML view of the webdav door now use relative links. This should make it easier to use the door behind a reverse proxy or load balancer.

webdav: allow template to be reloaded at runtime

The webdav service generates the HTML view from admin configurable template files. A reload template command was added to trigger the template to be reloaded at runtime. The new webdav.enable.auto-reload.templates configuration property may be used to enable automatic reloading whenever the template changes.

xrootd: select mover queue by application name

The xrootd door can now select a pool mover queue by client application name. See the documentation of the xrootd.app-ioqueue configuration prefix in xrootd.properties for details.

It should be stressed that the application name is submitted by the client and is easy to fake.

xrootd: recognize oss.asize parameter

Xrootd clients submit the file size upon upload as an oss.asize property. dCache now recognizes this property and uses it in space reservations, pool selection, and file verification. If the uploaded file does not match the specified size, the file is marked as broken.
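
The size check can be sketched as follows. The sketch assumes the property arrives as part of the opaque information that xrootd clients append to the path in query string form (for example `oss.asize=1048576`); the function names are hypothetical and do not correspond to dCache internals.

```python
from urllib.parse import parse_qs


def declared_size(opaque: str):
    """Return the size announced via oss.asize, or None if it is absent."""
    values = parse_qs(opaque).get("oss.asize")
    return int(values[0]) if values else None


def is_broken(opaque: str, uploaded_bytes: int) -> bool:
    """A file counts as broken only if a size was declared and does not match."""
    expected = declared_size(opaque)
    return expected is not None and expected != uploaded_bytes
```

An upload without the property is never flagged by this check, which mirrors the fact that the declared size is optional information supplied by the client.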

nfs: autodiscovery of nfs4domain for idmapping

For correct user id mapping, NFSv4 requires that server and client use the same naming scope, called the nfs4domain. This implies a consistent configuration on both sides. To lower deployment overhead, Sun Microsystems introduced an auto-discovery mechanism based on a DNS TXT record. Starting with version 3.0, dCache supports this discovery mechanism. If the nfs.domain property is set, its value is used. If it is left unset, the DNS TXT record for _nfsv4idmapdomain is used; if that record is absent, the default localdomain is used.

See nfsmapid and DNS TXT Records for more information.
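
The resolution order described above can be summarized in a few lines. In this sketch the DNS query is passed in as a function so that the example stays self-contained; `lookup_txt` is a hypothetical stand-in and not part of dCache.

```python
from typing import Callable, Optional


def resolve_nfs4_domain(
    configured: Optional[str],
    lookup_txt: Callable[[str], Optional[str]],
) -> str:
    """Pick the nfs4domain: explicit setting, then DNS TXT record, then default."""
    if configured:
        return configured  # nfs.domain was set explicitly
    discovered = lookup_txt("_nfsv4idmapdomain")
    return discovered or "localdomain"
```

Keeping the explicit property first means that sites with an existing nfs.domain setting see no change in behaviour.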

nfs: improved protocol compatibility and stability

The NFS door can now share a single mover between multiple transfers to a file by the same user coming from the same host. This should improve compatibility with existing clients, as they expect such behavior, which is allowed by the NFS protocol specification.

nfs: added support for flexfile file layout

The flexfile layout type is an extension to pNFS that allows storage devices to be used with only a limited degree of interaction with the metadata server. RHEL 7.2 and its derivatives provide experimental support for the flexfile layout type. With this release, dCache issues flexfile layouts if requested by a client.

Add support for the HAProxy proxy protocol

The ftp, xrootd, webdav, srm, httpd, and frontend services have been extended with support for the HAProxy proxy protocol. HAProxy or similar load balancers are essential components of a high availability deployment.

These load balancers act as frontends for redundant services in dCache. One problem of using such a proxy is that the real IP address of the client is hidden from dCache, thus obscuring dCache log files and hindering network based pool selection.

The HAProxy proxy protocol is an ad hoc standard supported by many load balancers and proxy servers. It adds a small header to the connection between the proxy and the backend service revealing the IP address of the client to the backend. The new configuration properties .enable.proxy-protocol enable support for this protocol. Note: Once enabled, direct access from client to the respective door must be blocked as clients would otherwise be able to conceal their real address from the service.

Check the documentation in the relevant default .properties files for information of which versions of the proxy protocol are supported.
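
To give an impression of what the header looks like, the following sketch parses the human-readable version 1 form of the proxy protocol, a single line such as `PROXY TCP4 192.0.2.10 198.51.100.1 56324 443` terminated by CRLF. It is an illustration only: it is not the parser used by dCache, it ignores the binary version 2 format, and the function name is made up.

```python
def parse_proxy_v1(line: bytes):
    """Return (client_ip, client_port) from a PROXY v1 header, or None if UNKNOWN."""
    if not line.endswith(b"\r\n"):
        raise ValueError("header must end with CRLF")
    fields = line[:-2].decode("ascii").split(" ")
    if len(fields) < 2 or fields[0] != "PROXY":
        raise ValueError("not a PROXY v1 header")
    if fields[1] == "UNKNOWN":
        return None  # the proxy could not or would not report the client address
    if fields[1] not in ("TCP4", "TCP6") or len(fields) != 6:
        raise ValueError("malformed PROXY v1 header")
    _, _, src_ip, _dst_ip, src_port, _dst_port = fields
    return src_ip, int(src_port)
```

Because the backend trusts whatever address this header states, the sketch also shows why direct client access must be blocked once the protocol is enabled: a client connecting directly could send such a line itself.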

New mechanism for determining services to monitor

The httpd service - both the old style and the newer webadmin interfaces - monitors other services in dCache. The mechanism for determining which services to monitor has changed. In the past the list of services was configured in httpd; this has been replaced by a multi-cast topic. Any service subscribing to this topic will be monitored and shown in the web interface.

The name of the topic is defined by the property dcache.topic.watched, although there shouldn’t be a need to change this (maybe if one wants to partition the services to be monitored by multiple httpd instances, one could consider introducing multiple topics, but why one may want to do this eludes me). The important part is that this is the topic a service should subscribe to if it should be monitored. By default, gplazma, pnfsmanager, poolmanager, spacemanager, and srmmanager subscribe to this topic. Doors and pools are discovered through other means and are monitored despite not subscribing to the topic.

Production ready support for high availability deployments

As may have been obvious from several of the changes above, lots of work went into enabling high availability (HA) deployments with dCache. Some benefits of such a deployment are:

  • No single point of failure
  • Rolling updates
  • Horizontal scaling
  • Symmetric deployments

For the most part, there isn’t much to configure for an HA deployment as one can simply deploy several instances of the various services. There are however several pitfalls, e.g. which databases should be shared and which should not. Available documentation should be studied carefully and any deployment should be tested before it is rolled out on the production system.

Several external components are required for a true HA deployment to load balance between doors (e.g. HAProxy) and handle HA for PostgreSQL (repmgr).

Also, there are still a few components that do not handle redundant deployments well, e.g. resilience and replica manager. It may be argued that redundancy of these components is not essential.

Controlled draining of services is currently not automated and requires some knowledge of the inner workings of dCache. Although support for HA may be production ready, it should currently be considered an expert feature.

Changelog from 2.16.0 to 3.0

45fed81
srmclient: by default show only CN of owner in ‘ls -l’ output
10aa51c
resilience: python script for making database changes necessary for migration from old replica manager setup
12a45fa
pool: Reduce lock contention in migration module
564442e
srmmanager: Adjust semantics of concurrent upload detection
90e8d7b
spacemanager: Fix shutdown bug causing a file to leak from a reservation
88ec2b3
spacemanager: Fixes a message bounce bug
bd00ac7
srmclient: add SRM call statistics
4f1d904
srmclient: update command names in srmfs
71192e5
srmclient: fix srmfs ‘get permission’ command
25c313e
srmclient: fix compatibility with Bestman
951fb06
srm: Submit token less releaseFiles requests to all backends
0d6e02f
srmmanager: Support upload detection in clustered environments
28f79b6
srm: Redefine the semantics of active upload checks
e320cee8c
srmmanager: Refactor how various operations handle active transfers
321a380
srmclient: probe for SRM endpoint parameters that the user doesn’t specify
2844429
Resilience: separate out the data structures for handling storage units from those for pool groups
ced4286
Resilience: Support matching the universal and class default storage unit expressions
87b5e57
dcap: don’t create stack-trace if tunnel fails due to bad client
ce0b9e3
PoolManager : stage protection, fix error in stage.fragment
26fa950
webdav: fix thrown NPE when CORS is set
c445044
dcap: expose dcap client version limit
d60ec26
dcap: Register DCAP interpreter as a command listener
8335057
dcap: fix Kerberos dcap if principal contains a ‘-’
fd7b21c
dcap: fix regression in handling old version
1d14d80
cells: Fix NPE in location manager
a1d7b4d
srmmanager: Add interface to query and abort transfers
cee5a6a
cells: Remove resubmission to event thread in location manager
5a08279
cells: Add safe guards to connector create and process update events
cc9b845
srmclient: fix ‘cd’ compatibility with StoRM
c35fc89
poolmanager: Log pool name rather than SelectPool object id
b6855d4
pool: Suppress two stack traces in nearline storage handling
802f2d8
srm: Log correct request token in access log
b75e224
httpd: Enable X-Forwarded-For processing in httpd
4d5eed5
httpd: Don’t insist on using http for unauthenticated webadmin pages
a671d74
nfs: allow dCache to start up without DNS
ad802f8
srmclient: add initial support for glob expansion
f7007f1
billing: fix stacktrace and slow shutdown if in refresh
1388162
common: fix alignment of abbreviated byte column in ColumnWriter
dc23d88
cleaner: Send notifications concurrently
bb4f08b
poolmanager,spacemanager: Do not reply to pool manager query replies
cb41330
doors: Allow setting an empty list of login broker tags
5abed59
dcache: Add column to dcache services for proxy-protocol
2072783
srmclient: add additional help text to srmfs
6b9b5db
common: update ColumnWriter to remove any trailing whitespace
b661232
spacemanger: Fix interception of PoolAcceptFileMessage
4031401
xrootd: Upgrade to xrootd4j 3.2.1
f631308
admin: Send requests to a fully qualified address once connected to a cell
46e4e17
webdav: Fix initialization of html template
39a2bac
common: do not markup ellipses as a user-value
a6ad18a
frontend: reinstate support for truly anonymous access
b567bc7
spacemanager: Propagate PoolManagerHandler errors as is
b07f800
dcache: Fix detection of message errors in doors and space manager
08d69d6
dcache: Fix detection of message errors in poolmanager and admin
54ba7fe
doors: Ensure that pool selection context survives between retries
5458eff
poolmanager: Propagate subscriber failures
61f6ea2
cells: Avoid an infinite loop when executing deferred tasks
494732c
cells: Do not add source name when resending a message
4be45bd
srmclient: fix checksum mismatch when uploading small files
7856802
libs: update to nfs4j-0.13.0
1f6e347
srmclient: fix srmfs to shutdown cleanly if user issues Ctrl-C
98fb6f6
srmclient: clean shutdown srmfs after transfer
d93af74
srmclient: avoid stack-trace if lcd with incorrect path
79db347
Active Transfers: substitute ? for <unknown> on html pages
881a582
dcache: Fix output of domain ports for ‘dcache ports’ command
bffb248
dcap,ftp: Allow root path and socket address to be overridden for loginbroker info
0a40230
cells: Minor optimization to routing manager when processing GetAllDomain requests
666ddf3
cells: Kill tunnel earlier in case of IO failures
f38f906
cells: Fix a couple of locking issues in the routing manager
a274ac8
cells: Less aggressive use of stack traces on slow cell shutdown
290517d
cells: Fix race between publishing and killing a cell
603bbce
srmclient: give meaningful error message if credential is missing
07db10c
common: add support for UserNamePrincipal as user:<name>
b063e96
srmclient: provide better error message if credential has expired
a49cc50
Added ‘explain login’ examples to help text in Gplazma2LoginStrategy.java
3e868c2
info, webadmin: Check for null version
3f56a0e
httpd,srm,statistics: Parameterize billing service name
2529275
billing: Strip format string from attribute name
90bc41c
transferObserverV1: replace Args with Joiner to construct transfers.txt lines
67dea34
billing: Make billing indexer work with custom format strings
2b262f6
cells: Fix NPE in unit tests
99babcd
dcache: update dcache-view version to 1.0.2
417887f
Revert “dcache: update dcache-view version”
4d8d122
dcache: update dcache-view version
e8d6d40
dcap: add support for clients presenting more version metadata
7c2fdda
srmclient: update checkPermissions to be more robust
5fb5f46
srmclient: print friendly message on SRMException
f74b140
srmclient: update ls to be more robust
0a50d58
commons: log bugs with stack-trace and instructions
46f986c
webdav: support template reloading
b3594f0
webdav: add support for ownCloud mtime header
8ade134
gplazma2-xacml: remove erroneous creation of placeholder extensions
91d1832
restful-api: add create, move and rename resources
a63ae6a
cells: add event logging on cell lifecycle events
8aff852
srm: remove trailing dot from reverse lookup result
00d782e
ftp: add support for SRM cancelling an active upload
9969b3b
webdav: add cors support for uploading files
c5aa25e
srmclient: fix async stat command
88a9fcf
srmclient: implement useful subset of local filesystem commands
03cca24
srmclient: fix error message for ls
7ce53ee
srmclient: refrain from adding default port to SURL in srmcp
9f084d5
namespace: fix permissions of auto-generated directories
032f833
cells: New wire and payload protocols for cell message
61d5753
cells: Minor paranoia refactoring in shutdown
f6cdc1f
cells: Deliver cell events from the cell event thread
3b2ca6f
poolmanager: Stop adjusting pool cost
4a70ec1
poolmanager: Define assumptions on pool selection
354aa1a
pool: Add infrastructure for pool assumptions
095f1f5
cells: Stop ZooKeeper recipies in stopping rather than stopped
6f89203
poolmanager: Delay zookeeper publication of pool manager
0d0ea8f
cells: Minor cleaning of location manager
c0ac545
srmclient: fix -debug mode
05d3b56
srmclient: support SIGKILL
03afafa
cells: Guarantee event order for cell event listeners
d5f1fdb
cells: Remove race in routing manager
2e9e81c
poolmanager: do not use string concatenation with logging
e3503f4
replicamanager: Fix race during shutdown
54b3a9c
cells: Fix lost interrupt exception
bd87c3d
cells: Ensure that newly created threads are non-daemon normal priority threads
ff53756
webdav: fix Unauthorized vs Forbidden response
70f6602
pom: Update third party dependencies
ef0aaa1
doors: Fix a minor race in LoginBrokerPublisher
6d43a76
dcache: Fix more cells that do not shut down cleanly
5809df2
cells: Add remote domain to NDC of location manager tunnel
e96c0b2
cells: Increase timeout on tunnel shutdown
ebfb1bc
cells: Propagate startup failures to caller
1509b2b
cells: Fix race when adding routes
e0b1195
cells: Remove delayed default route if tunnel shuts down
c6358e3
cells: Fix race in cell shutdown
bf563f9
srmclient: add support in srmfs for uploading and downloading files
799698c
poolmanager: Log errors in subscriber
9f9bcfa
spacemanager: Revert decision to only import link groups in leader
d0c3536
dcache: Generate proper exit code for check-config command
a474a77
spacemanager: Fix SpaceManagerHandler#toString
0806d87
doors: Prevent PoolMgrGetUpdateHandler storm on shutdown
077d1e5
pool: Drop pretend limit on p2p client queue
3348726
poolmanager: Improve stability of cost cuts
53d185b
cells: Refactor interaction between LocationManager and LoginManager
71796a5
cells: Fix potential deadlock in login manager shutdown
3a251fd
cells: Fix regression in shutdown timeout
a064ba9
srm: support configuring job expiry checking period
cc806ed
gplazma2-ldap: add embedded jdap server for unit testing
05f327a
nfs4: autodiscover nfsv4 domain used by idmapper
2fe6e94
commons: fix Args string parsing and toString method
355fc08
pool: ReplicaStoreCache updated to describe inner ReplicaStore
c08f8fe
vehicles: use FileAttributes when creating namespace entries
366182b
Change version to 3.0.0-SNAPSHOT
4fefd4d
chimera: update postgres driver to recognize alpha/beta/rc builds
9582a71
ceph: make configurable rados pool name
459ddce
dcap: use String.getBytes(UTF_8) to generate error reply
d509d77
ceph: use object’s xattrs to store files creation and atime/mtime
d56eee2
doors: Include IO queue in pool selection request
a29ba61
ceph: cleanly close rados connection on pool shutdown
5609c21
ceph: use RadosClusterInfo to provide total and available sizes
cc48200
ceph: use PoolInfo to check repository status
59a0995
pool: added CEPH back-ended FileStore implementation
5636347
nfs: addresses returned to client must match clients address ‘type’
8a8a2ab
nfs: add support for flexfile layout type
a48496b
srm-manager: explain cancellation of upload
e5068a1
ftp: Fix unit test regression
7bd0848
cells: Drop some dead code
c9d3b2c
cells: Add context information to connector cell
bede5c8
cells: Let tunnel shut down wait for its processing thread to terminate
182a655
cells: Add default methods to CellEventListener interface
78db8cf
pool: Make inner class of PoolCostInfo static
7727b28
pools: Drop legacy field in PoolCostInfo
9854e5d
ftp: Add support for the HAProxy Proxy Protocol
4a8bc29
ftp: Port FTP door to Netty
9f59b05
cells: Decouple StreamEngine from LoginManager
c8a1a38
xrootd: Add support for the HAProxy Proxy Protocol
6c76922
xrootd,pool: Upgrade to Netty 4.1
b0e85c6
webdav: optimise directory creation
f0918e2
vehicles: add fluent and single-item construction to FileAttributes
6612716
cells: Fix more shutdown bugs and reject send with callback on shutdown
2b0cbb4
cells: Use ConcurrentHashMap for callback objects
f509447
pool: Ensure that post transfer service shuts down before repository
135522e
pool: close dcap accepter thread and server socket after client connects
5679b20
pool: allow sort output of ‘mover ls’ by access time and size
26a2bd7
poolmanager,spacemanager: Do not log delivery failure of PoolMgrGetUpdatedHandler
04c5a8c
cells: Fix regression in message ID in replies
c5770fd
cells: Fix several shutdown related problems
e7323a0
dcache-webadmin: synchronize client-side filtering with server-side selection of rows on pages using picnet table filters
71034aa
dcache-webadmin: disable saving table filter settings to browser cookies
c8bc39d
dcache-webadmin: disable AJAX autorefresh on pages using picnet table filter library
dea1764
namespace: support querying information about deleted objects
7318a65
doors: include explanation when killing mover
249cb99
namespace: add nlink as an attribute
f4e5691
pool: explain why a mover was killed
ac50167
alarms: reset count history on reopened alarm
1eec7f5
pool: Resolved circular bean dependencies leading to bugs during shutdown
eb31c0e
cells: Lowered log level of failure to deliver delivery failure notifications
bb4ddd1
cells: Bind pre-removal notification to beforeStop lifecycle callback
d16926e
doors: Announce to login broker subscribers when a door is shutting down
83c62ca
cells: Log killer threads too when cell shutdown is slow
4a2f978
cells: Log when cells have to interrupt threads to shut them down
f2b6b24
doors: Abort transfer if file is deleted during pool selection
ef5ff6a
chimera: Fix IllegalStateException in inode cache
02c5b8b
poolmanager: Fix shutdown regression
9d7493d
cells: Add pre removal callback
9f7752a
cells: Ensure sequential execution of lifecycle callbacks
42d8ccf
cells: Guard shutdown of setup manager against partial initialization
9285483
poolmanager: Fix design flaw in PoolManagerHandler implementations
98f293f
cells: Expose SendFlag to suppress addition of a source path
759a897
cells: Fix problem with duplicate entries in source path
b43a705
Enable automatic detection of Postgresql master
2c8ee66
webdav,frontend: Drop support for https-jglobus
5593c87
webdav,frontend,httpd: Add proxy protocol support
e69b0fa
dcache: Extend CanlConnectorFactoryBean to also support plain connectors
285a673
nfs: Upgrade nfs4j
bbc5ea0
srm: make out-of-date historic data deletion more robust
dffe01a
srm: Add support for the proxy protocol
740f863
cells: Do not call shutdown from message thread
65a044f
xrootd: Remove serialization compatibility with pre 2.11 domains
6e240c2
dcache-vehicles: Remove serialization compatibility with pre 2.15 domains
16907b5
dcache-vehicles: Remove serialization compatibility with pre-2.12 domains
f8b06aa
cells: Remove serialization compatibility with pre-2.16 domains
c919c99
pnfsmanager: Remove support for legacy pools
2e255bb
cells: Drop support for legacy route updates
73d67b8
billing: Make service replicable
b5974a8
webadmin: Use topic to discover services to watch
92b22f7
httpd: Introduce topic for discovering services to monitor
2a3e401
zookeeper: avoid one race in shutdown
ac036b4
zookeeper: work-around slow shutdown of SessionTrackerImpl
d3f34ae
webadmin: silence warning about future change in wicket
1ee61f9
poolmanager: Reduce risk of pool manager handler update during shutdown
ee85984
billing: Optimize regular expressions of billing parser
b56ea8c
billing: Minor optimizations to path prefix computation
4f68ccb
billing: Extend billing parser to recognize formatting headers
d9f086b
billing: Let ParallelizingLineProcessor consider comments as barriers
aa74739
pool: Adjust JavaDoc of ReplicaStore
dd1f493
billing: Make billing text files self describing with format headers
1755681
billing: Do not log record if format string is empty or undefined
8601227
billing: Harmonize representation of cell name
86dd4fe
cells: Prepare to change representation of unqualified cell addresses
92788e9
cells: Fix IllegalArgumentException when reversing a terminal cell path
618c3b6
cells: Fix RejectedExecutionException during shutdown
2954441
billing: Removing erroneous stack trace output
a2fd0be
billing: Wrap cellName attribute to allow access to the cell and domain components
a81dec3
xrootd: add support for oss.asize on upload
e74520d
xrootd: add per-application io-queue
06a8baa
pool: update FileStore interface to handle all back-end specific operations
7d558ae
ftp: improve compatibility with Apache Commons FtpClient
7951b59
srm: refactor container status update and readying turls
ed44be4
dcache-restful-api/pinmanager: modify current QoS from disk+tape to tape
bd52f0a
pool: Adjust failure semantics for empty client uploads
410c9d7
dcache-nearline-spi: Add @since tags for newly added features
07090fc
nfs: Fix pool manager subscriber startup
798b7b6
chimera,nfs: avoid byte -> string -> byte conversions
4e58e4c
alarms: convert log entry handling to use plugin extensions
f8fcd31
pom: Update 3rd party dependencies
27c8720
srm: Include session identifier in error message when srmRm aborts an upload
e2ff948
dcache: IntelliJ code inspection refactoring
e3d552b
chimera: IntelliJ refactoring
c27dcdf
common-security,dcache-vehicles: IntelliJ refactoring
4d132b0
common: Various automated refactoring
6b66b04
cells: More semi automatic refactoring
c7c5286
cells: Various automatic refactoring to clean up the code
aa0bcb9
cells: Reduce reliance on AbstractCellComponent
76cde04
billing: Expose property to subscribe to topics
bc33cdd
dcache-nearline-spi: Move AbstractBlockingNearlineStorage to dcache-nearline-spi
c1b2f5a
cells: IOException is not a bug in create command
4e1ce03
cells: update CellMessage to work with initially empty messages
f90ced0
transfermanager: fix querying of 3rd-party transfer
8e449b6
pool: fix hsm unset command
3f074dc
doors: enforce restriction in all PnfsManager operations
1390c0c
srm: remove ‘priority’ support
818b04b
common: include filename in error message
7f280f6
srmclient: do not rely on -f option of readlink
8f472c0
spacemanager: Refine errors sent on shutdown
2de19e3
poolmanager: Make pool manager replicable
0b94ca0
doors: Update doors to use PoolManagerHandler
c68b781
Upgrade to CANL 2.1 and BC 1.50
73969aa
scripts: update canonicalising function
3b355d1
webadmin: remove error and autorefresh options from alarms page
39306c2
alarms: force alarms-only-option
0d6e9ab
dcache: Add framework for dynamic pool manager handlers
7bcf592
poolmanager: Fix incorrect correction of pool cost
a4ab7ed
dcap: Factor out configuration option parsing
1a60fc0
dcap: Port DCAP door to the LineBasedDoor
d4fb646
cells: handle empty string pool value on staging in TransferObserver
ca19923
dcache: Move components of line based doors to dcache-core
a266f9f
cells: Refactor message forwarding logic
5919c92
cells: Retry RETRY_ON_NO_ROUTE_TO_CELL messages when a route is added
731cf8d
cells: Expose SendFlag in CellStub
a90151e
cells: Add SendFlag parameter to CellEndpoint#sendMessage
b482c56
dcache: Refactor message forward logic
3f8cca5
pool: include the original IOException info DiskErrorCacheException
e9bf69e
cells: Extend CellLifeCycleAware with setup change notification
43c757c
cells: Separate setup notifications from setup providers
73846b1
poolmanager: Persist pool manager setup in ZooKeeper
16e72bc
dcache: Allow UniversalSpringCell to persist setup in zookeeper
d76e7d9
poolmanager: Persist pool read only status in setup file
2e58fb2
cells: Drop support for the setup manager
adb5c04
cells: Check setup file syntax before applying it to the running cell
0254bf4
alarms: add ndc info to alarm info
25a125d
resilience: fix alarm log level
2baf078
packages/dcache-view: deployment of dcache-view through nexus
f356de5
dcache-restful-api: change protocol to HTTP
e388703
dcache-restful-api: RestfulAPI for QoS(CDMI) CHANGE current QoS for the specified file
46d091d
dcache-restful-api: RestfulAPI for QoS(CDMI) CHANGE current QoS for the specified file
574cf49
dcache-nearline-plugin-archetype: Include correct service loader definition
5d0825d
dcap: bump max command size sent by client to 8MB
4a57c27
frontend/dcache-restful-api: remove www-authenticate header
874661c
gplazma2-argus: Update to Argus client 2.2.0 to fix dependency on VOMS library
9b23e68
util: expose timestamp when the instance of Transfer is created
09abff1
namespace/chimera: optionally allow disabling moves to a directory having a different storage class (and cache class)
2cd391f
pom: use project.version instead of derived dcache.version
767cf92
cells: Introduce annotation to flag setup affecting commands
d9316fd
cells: Preserve CommandThrowableException
f113664
cells: Add default methods to CellInfoProvider and CellLifecycleAware
11720e4
pool: Cancel permanent migration jobs when reloading pool setup file
d12ce81
pool: Allow runtime mover queue management
2fdbaf3
dcache-restful-api: fix data type for cdmi_geographic_placement
5a5ed28
dcache-restful-api: RestfulAPI for QoS(CDMI) get current QoS for the specified file
e8f091b
dcache-restful-api: RestfulAPI for QoS(CDMI)
f433881
srm: add hint to escape IDs
f4ad20b
archetype: Add archetype for creating nearline storage plugins
e71d9af
poolmanager: Fix compilation regression
9f991f4
nfs: Fix race condition in transfer startup
ab358ba
poolmanager: Remove dead functionality from cost module
2dcc9cb
pool: Port hsm commands to annotated command structure
c434601
common-cli: Refactor allowAnyOption flag
94af2f9
pool: Remove deprecated nearline concurrency limit commands
b9f9c8c
common-cli: Fix compatibility with Java 7
a97221d
cells: Minor refactoring of command interpreters
87a0046
SrmShell.java: changed ‘rx-’ to ‘r-x’
83f0bff
SrmShell.java: changed ‘rw--’ to ‘rw-’ in case TPermissionMode._RW
5dcfbfb
info: fix broken unit-test
b20cc35
srm: Resolve message thread blocking issues with SRM third party copy
197009c
restful-api: make exception and error handling RESTful
5473084
pool: Fix compilation error
12ab3b1
spacemanager: Work around for doors resubmitting PoolAcceptFileMessage
5a9db09
pool: Rename repository and store related classes
d1fab17
pool: Eliminate thread in mover scheduler
923fbe5
pool: Refactor BerkeleyDB meta data store to enable reuse
dc45cb7
pool: Expose URI rather than File to NearlineStorage implementations
8ebdb0d
pool: Refactor MetaDataStore and FileStore to use Path rather than File
f91e92a
pool: Avoid direct file access in post processing
6282e3e
pool: Extend script driver for polling scripts
b8f961d
vehicles: remove unused constructors
30f36d0
cells: Propagate CommandExceptions as is on cell startup
c246de6
pool: Fix several race conditions in migration module
ca6c531
pool: Fix regression in mover set max active command
9b7b679
pool: Fix mover leak
c03c489
pool: Fix synchronization regression in jtm
856f126
pool: Decouple checksum module from checksum scanner
4deb002
poolmanager: Move selection unit commands back into the selection unit
d20dc63
poolmanager: Drop resilience decorator of pool selection unit
a6ed2e8
pool: Let CellSetupProviders use setter injection
94c738f
cell: Only implement CellSetupProvider if needed
f2f35f1
dcache: Isolate setup file processing from primary cells command interpreter
de9fbf6
pool: Make RepositoryChannel extend SeekableByteChannel
2f24b07
pool: Make checksum calculation independent of file system backend
79138be
pool: Refactor handling of replica size in the repository
4a7a21d
pool: Reduce number of stat calls in Berkeley DB backend
40a06c1
pool: avoid NPE when querying status of a 3rd-party HTTP transfer
c8cf229
resilience: finish the renaming of ‘pnfs(id) operation’ to ‘file operation’
077e8ec
resilience: eliminate unnecessary update calls and fix synchronization of file op registration
776c0f9
resilience: repair (subtle) bug in target selection
e2790b0
resilience: fix error in file operation updating
99d74a9
resilience: fix the way removal of a pool from a resilient group is handled
63bf4f7
resilience: fix several related bugs in transition checking when scheduling scans
3f88a0a
resilience: remove restriction on storage unit linkage
7114a15
resilience: remove pool from operation table when it has been removed from a resilient group
2e75f75
resilience: allow file consumer to propagate Exception
96abb3d
httpd: change condition from numeric inequality to non-null check on TransferInfo value
8c2b57f
poolmanager.properties: mentioned creation of .bak file
239ab6c
alarms: fix NPE in type setter
871e828
src: change literal package strings to org.dcache.util
eca97e9
chimera: update unit-test to log ChimeraFsExceptions
2ef6d44
src: consolidate org.dcache.commons.util, org.dcache.utils and org.dcache.util
ca60c87
admin: Deprecate DSA keys
c43acae
pool: Transactional bulk update meta data state in Berkeley DB
17575e4
pool: Eliminate MetaDataStore#copy method
13a1c7a
pool: Fix bug that disables pools on non-critical errors
1ab5e5c
webdav: avoid NPE if client fails to send a User-Agent header
472acb6
pool: use a single ByteBuffer when calculating the checksum of a hole
7650505
chimera: avoid extra byte array creation during inode2bytes conversion
e4f1b42
cells: Allow local delivery of messages through queue routes
d3d5fc2
zookeeper: avoid race in creating log directories
08ee55b
cells: Improve indentation of help text
b1a9edd
Make FsPath Serializable
3a943ef
pool: Port remaining rep commands to annotated command framework
5fc92a6
pool: Port list commands to new command structure
af43905
build: add code-coverage reports
9830f2d
pool: Eliminate MetaDataRecord#touch
7d600c5
pool: Refactor meta data store interface to allow bulk updates
39c05d1
pool: Refactor IllegalTransitionException hierarchy
259a34a
pool: Refactor callback executor for flush queues
9e9a2ab
cells: Fix local delivery to queues
7e0baec
pool: Report file in transient state as locked when setting sticky flag
92fbe53
admin: Fix compatibility with OpenSSH 7
219b970
pool: Fix documentation error for -storage option of rep ls command
5180b99
script: Do not claim success if meta data conversion failed
942cca1
frontend: add support for user ‘anonymous’
a8d9592
httpd: check for null Subject in Transfer info
dd15a13
resilience: fix incomplete behavior for pools intentionally marked “excluded”
291dbb1
admin: refactor login strategy
1dbfbe5
webdav: fix bean creation in webdav.xml for spnego handler
b21b4b4
frontend: always fail request if wrong credentials are presented
159c422
webdav: add Spnego based Kerberos authentication mechanism for SSO capabilities
7b770bc
common: add ByteUnit enum and ByteUnits utility class
e461d40
restful: add ability to discover information about current user
070a543
srm: enrich access log
c93dc6c
resilience: simplify code for handling resilience requests on pools
d154a5f
resilience: fix classification of file incomplete and other errors
73bb943
pool: Don’t use transient error for a broken file
78ee625
admin: Drop old ssh 1 keys
85811a2
pool: Add error codes to p2p failures
61ee15f
pool: Make final state update after upload atomic
586e62f
pool: Lower log level of certain failures to create mover
56fb782
chimera-nfs: fix broken commit 4682fd3
4682fd3
chimera: move byte <=> FsInode conversion into nfs specific part
e5a0042
pool: fix staging for CopyNearlineStorage
b34c3de
REST-api: fix permission denied.
8b1d1e1
frontend: remove login prompt on loading dCacheView
e717ffe
resilience: add checked location selection exception
8ddded0
resilience: fail when source retry is requested with only one possible source
6f3dfc7
pools: restore correct command names
12eeda4
pool: Add open flush mode
b40b8bd
gplazma: Make gplazma.x509.use-policy-principals obsolete
ddfdbd0
cells: Drop legacy cells communication
459680d
Mark deprecated properties obsolete
0d2f9c0
webdav: Fix error reporting when client is unauthorized
61d13ef
srm: Fix job expiration during service startup
a7860b3
pool: Minor refactoring of flush classes
887b4d3
Remove deprecated code
abdae03
system-test: Enable automatic schema management for replicamanager
b9792c7
billing: use in-memory buffer for hourly aggregate data
13a009b
webdav: make links relative in standard template
95b114e
billing: fix issue with HSQLDB backend
d790aa8
gplazma: don’t generate a stack-trace if htaccess is malformed
ff48c27
pool: Fix race in pool initialization
62fd200
Revert “srmmanager: Enable multiple instances to share a database”
7deb9f0
system-test: Fix Linux compatibility for populate script
43e7ad0
system-test: Fix regression in populate script
86c8c32
pool: Close stores after use in meta data utilities
2900cf9
Revert “pool: Close stores after use in meta data utilities”
504e37f
pool: Close stores after use in meta data utilities
cc2eb79
pool: Fix regressions in meta data utilities
0d10b6d
system-test: Speed up liquibase update
74f0917
Disable OCSP by default
ec6df2f
srm: Fix listing of completed list requests
8e4253d
billing: re-implement population of aggregate tables
97290f6
dcache restful-api: adjust the response header
6cf3f95
billing: Fix NoClassDefFoundError when invoking dcache billing command
5b92ef4
httpd: Fail gracefully in case of missing options
870a5b7
httpd: Update cell info view for named queues and make srmmanager monitored
2072541
cells: Suppress RejectedExecutionException on shutdown
93c223e
cells: Fix NPE during shutdown
a09a461
gplazma: Add PrefixRestriction and use it to validate root and upload paths
1b7fb03
zookeeper: Document ports used by embedded zookeeper
f748eba
ftp: Fix regression in resolving path
a31c5cd
frontend: Expose dCacheView configuration in dCache configuration
89be067
zookeeper: Add automatic purging of old snapshots
6623081
Remove example.org values from configuration defaults
e63d821
dCacheView: new end-user web interface for dCache
f47224c
api: Rename service to frontend
1aa3afd
api: Move rest API to separate service
8cd3d3e
billing: fix issue with database change rollback
370fc72
src: remove unused class AbstractNameSpaceProvider
f7511cb
srmmanager: Make scheduler ID configurable and compatible with older versions
2f6abfd
pool: Upgrade to Berkeley DB Java Edition 6.4.25
b81732d
httpd: restore original fields and format to the TransferObserverV1 ascii output
d8ab725
[maven-release-plugin] prepare for next development iteration