Highlights

  • The nfs door will return error NO-SPACE when all pools are full.

  • Added control to modify behavior of the pool space sweeper.

  • HSM flush queue can be set to behave as either LIFO (last-in-first-out) or FIFO (first-in-first-out).

Incompatibilities

The 6.0 release breaks compatibility of pools with pre-6.0 dcap and srm doors. However, the 6.0 doors will work with existing pre-6.0 pools.

Acknowledgments

We would like to thank HTW-Berlin students David Linke, Lilit Patzwaldt-Sergoyan, Marcel Waldau, Quentin Kuth, Sophie Schauer, Viktoriia Sulimenko, Vladimir Vilenchik, Johann Gillhoff and Dieter Weber from FZ-Jülich for their contributions.

Release 6.0.9

dcache-view

dcache-view version 1.6.0 is released, fixing minor issues and errors.

The new release adds support for server-sent events. Users authenticated with a certificate are now allowed to perform rename operations, error messages are improved, and notification messages are displayed for longer.

dcache-xrootd

The door no longer fails to start if there is no voms directory on the host.

frontend

The current release removed an unnecessary login requirement on restores and transfers.

The current release fixed a deadlock in the frontend that could occur when inotify events are used.

skel

The current release repaired erroneous batch directives before cell creation.

With this fix, a domain is no longer left in a zombie state after a fatal error; it restarts, as it should.

srm

The host IP address is now used for comparison when determining whether a SURL is local.

Changelog 6.0.8..6.0.9

51fa4ed
[maven-release-plugin] prepare release 6.0.9
d09fae7
skel: repair erroneous batch directives before cell creation
a52cf87
dcache-xrootd: don’t initialize delegation provider when there is no gsi module
42838db
dcache-frontend: remove unnecessary login requirement on restores and transfers
f68ad94
srm: use host IP for comparison when determining if SURL is local
2a593d9
frontend: events inotify fix deadlock
cdfd8a3
dcache, frontend: release dcache-view version 1.6.0
f46785a
[maven-release-plugin] prepare for next development iteration

Release 6.0.8

cell

The Curator client was not able to restore the connection to the ZooKeeper server after a network partition. This is now fixed.

skel

The current release fixed the tape-reserved size calculation.

webdav

The current release fixed an issue where the WebDAV door failed to follow RFC 4918 by not including the DAV header in responses to OPTIONS requests. This made some clients reject the dCache WebDAV door as a valid WebDAV endpoint.
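
RFC 4918 requires DAV-compliant resources to return a DAV header listing compliance class 1 on OPTIONS responses, and clients commonly use that header to recognise a WebDAV endpoint. The following is a minimal Python sketch of such a client-side check; the function name `advertises_webdav` and the sample headers are illustrative assumptions, not part of dCache or of any particular client.

```python
def advertises_webdav(headers):
    """Return True if OPTIONS response headers carry a DAV header listing class 1.

    `headers` maps header names to values; names are compared
    case-insensitively, as HTTP requires.
    """
    normalized = {name.lower(): value for name, value in headers.items()}
    dav = normalized.get("dav")
    if dav is None:
        return False
    # The DAV header is a comma-separated list of compliance classes, e.g. "1, 2".
    classes = [token.strip() for token in dav.split(",")]
    return "1" in classes


# Hypothetical responses: one without the header, one with it.
print(advertises_webdav({"Allow": "GET, PUT, OPTIONS"}))              # False
print(advertises_webdav({"Allow": "GET, PUT, OPTIONS", "DAV": "1"}))  # True
```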

Changelog 6.0.7..6.0.8

c226083
[maven-release-plugin] prepare release 6.0.8
6bf99a6
Fix tape-reserved size calculation
cfb0b5d
webdav: include DAV header in OPTIONS requests.
df1f7eb
cells: do not re-define zookeeper watcher
eec3431
gplazma: multimap support files with multiple primary gids
eae2717
[maven-release-plugin] prepare for next development iteration

Release 6.0.7

canl

The current release updated lib version to 2.5.1.

gplazma

The current release fixed URL-prefix SciToken parsing and error handling if JWT contains malformed SciToken scopes.

webdav

When making a cross-origin call to the webdav door, for example from the frontend when using dcache-view, the call would be blocked if the client wanted to forward the user certificate. This is now fixed.
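
Browsers treat a TLS client certificate as a credential, and for credentialed cross-origin requests they only expose the response if the server sends Access-Control-Allow-Credentials: true together with an Access-Control-Allow-Origin that names the requesting origin. A small Python sketch of that browser-side rule follows; the function name and the example origin are hypothetical, and the code does not describe how the dCache door builds its headers.

```python
def allows_credentialed_request(response_headers, origin):
    """Check whether a CORS response permits a request made with credentials.

    Requires Access-Control-Allow-Credentials to be exactly "true" and
    Access-Control-Allow-Origin to equal the requesting origin
    (the wildcard "*" is not accepted for credentialed requests).
    """
    headers = {name.lower(): value for name, value in response_headers.items()}
    allow_credentials = headers.get("access-control-allow-credentials") == "true"
    allow_origin = headers.get("access-control-allow-origin") == origin
    return allow_credentials and allow_origin


origin = "https://frontend.example.org"  # hypothetical dcache-view origin
print(allows_credentialed_request(
    {"Access-Control-Allow-Origin": origin}, origin))                 # False
print(allows_credentialed_request(
    {"Access-Control-Allow-Origin": origin,
     "Access-Control-Allow-Credentials": "true"}, origin))            # True
```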

Changelog 6.0.6..6.0.7

47a3773
[maven-release-plugin] prepare release 6.0.7
b12a440
gplazma: scitoken add unit tests and fix SciTokenScope
a38d286
webdav : set Access-Control-Allow-Credentials to true
49c7a2e
canl: update to version 2.5.1
c76d011
[maven-release-plugin] prepare for next development iteration

Release 6.0.6

chimera

The Enstore client, encp, stores information in layer 4 (level_4). It was natural to assume that files written directly by encp would have NEARLINE/CUSTODIAL AccessLatency/RetentionPolicy. However, setting the access latency interferes with QoS.

This is now fixed: encp no longer interferes with the file access latency (for example, if it is specified to be ONLINE).

config

A typo in the dcap config file was fixed, correcting dcacp.enable.kafka to dcap.enable.kafka.

frontend

The current release fixed a NPE bug (which is then logged as a stack-trace) that is triggered when the reply does not contain a media type.

gplazma

The SciToken plugin will now reject any JWT in which none of the expected scopes is defined. This allows dCache to support both OpenID-Connect and SciTokens.
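
The idea of rejecting tokens that carry none of the expected scopes can be sketched in Python as below. This is an illustration only: the operation names in `EXPECTED_OPERATIONS` are an assumption (these notes do not list the scopes the plugin recognises), and the sketch assumes the token carries a space-separated `scope` claim.

```python
# Assumption: illustrative operation names; the set actually recognised
# by the dCache plugin is not given in these notes.
EXPECTED_OPERATIONS = {"read", "write"}


def has_expected_scope(claims):
    """Return True if the space-separated 'scope' claim holds at least one
    entry of the form <operation> or <operation>:<path> with a known operation."""
    scope = claims.get("scope", "")
    for item in scope.split():
        operation = item.split(":", 1)[0]
        if operation in EXPECTED_OPERATIONS:
            return True
    return False


print(has_expected_scope({"scope": "read:/data write:/data/upload"}))  # True
print(has_expected_scope({"scope": "openid profile email"}))           # False
print(has_expected_scope({}))                                          # False
```

A token with only OpenID-Connect style scopes fails this check, so it can be left to another authentication mechanism instead of being handled as a SciToken.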

webdav

The current release fixed an issue with transfers through dcache-view when the webdav door is configured with an empty webdav.allowed.client.origins value, which is the default.

Changelog 6.0.5..6.0.6

a448915
[maven-release-plugin] prepare release 6.0.6
15add3d
frontend: avoid NPE when request has no media type
f7c9f98
dcache: add null check to pool info collector util
e18b34e
config: fix typo in property name
044fe09
chimera: do not update access latency on updat/insert in level_4
e1655c3
gplazma: scitoken fix two issues with SciToken plugin
39aae1a
webdav: fix CORS when all clients are allowed to connect
f765cb6
[maven-release-plugin] prepare for next development iteration

Release 6.0.5

nfs

The current release fixed an NPE on an error path.

srm

The current release fixed a problem resulting in high CPU use in SrmManager if clients are attempting to pin a file and PinManager is unavailable.

A regression was fixed where SrmManager would reject all QUEUED jobs and INPROGRESS BringOnline requests on restart if no SRM doors were running when SrmManager started.

Changelog 6.0.4..6.0.5

c54a7c5
[maven-release-plugin] prepare release 6.0.5
cf59f51
nfs: avoid NPE on error path
780de79
SrmManager: fix handling of saved requests on start-up
eff62ea
SrmManager: avoid spamming if PinManager is down
e6b3f13
[maven-release-plugin] prepare for next development iteration

Release 6.0.4

doors

The current release fixed a bug where running the lb set tags admin command without any arguments triggers a NullPointerException.

pool

The current release improved the messages shown when a migration job is cancelled.

scripts

The dcache-storage-descriptor command no longer requires a URL argument.

Changelog 6.0.3..6.0.4

aa0eb9d
[maven-release-plugin] prepare release 6.0.4
15155fd
doors: fix “lb set tags” command with no arguments
f8c13bd
pool: improve messages when migration job is cancelled.
8dd5e4f
scripts: fix variable ordering in dcache-storage-descriptor
222a151
docs: TheBook add chapter on SRR
7da7ec0
[maven-release-plugin] prepare for next development iteration

Release 6.0.3

dcache

The current release restored pool compatibility with doors sending Xrootd-2 protocol version.

gplazma

The SciToken gplazma plugin now supports the audience (aud) claim where the claim’s value is an array. This allows dCache to support SciTokens with multiple audience values.
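
RFC 7519 allows the aud claim to be either a single string or an array of strings, so a verifier has to handle both shapes. A minimal Python sketch follows; the function name, the audience values, and the choice to reject tokens without an aud claim are assumptions of this example rather than a description of the plugin.

```python
def audience_matches(claims, accepted):
    """Return True if any value of the 'aud' claim is in `accepted`.

    The claim may be a single string or an array of strings; both forms
    are normalised to a list before comparing.
    """
    aud = claims.get("aud")
    if aud is None:
        return False  # assumption: a missing claim is treated as no match
    values = [aud] if isinstance(aud, str) else list(aud)
    return any(value in accepted for value in values)


accepted = {"https://dcache.example.org"}  # hypothetical audience value
print(audience_matches({"aud": "https://dcache.example.org"}, accepted))    # True
print(audience_matches(
    {"aud": ["https://other.example.org", "https://dcache.example.org"]},
    accepted))                                                              # True
print(audience_matches({"aud": ["https://other.example.org"]}, accepted))   # False
```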

pool

Pool health-check log messages now include the pool’s name.

webdav

On an unsuccessful HTTP-TPC pull request, dCache deletes the incomplete file. If this deletion did not work, an error was logged, even when the incomplete file had already been deleted by some other means. Such failures to delete an already-removed incomplete file are now logged at DEBUG level rather than WARN level.

xrootd

The current release refitted checksum handling after an xrootd4j bug fix.

Changelog 6.0.2..6.0.3

54b4cd2
[maven-release-plugin] prepare release 6.0.3
7077373
dcache-xrootd: refit checksum handling after xrootd4j bug fix
3c80a39
webdav: avoid logging non-error as an error
6626085
pool: include pool name in health-check reports
b53e33f
gplazma: scitoken add support for multiple audience claims
764eeaa
build(deps): bump jackson-databind from 2.9.10 to 2.9.10.1
2b65eb8
dcache: restore pool compatibility with doors sending ‘Xrootd–2’ protocol version
d2ef379
[maven-release-plugin] prepare for next development iteration

Release 6.0.2

frontend

The current release fixed QoS pin semantics.

A bug was fixed in the frontend that resulted in a NullPointerException for billing queries where no limit is specified.

srm

Investigation into tape carousel has highlighted limitations in what dCache logs. In particular, information about the pin lifetime for srmBringOnline is missing.

This is now fixed: dCache SRM access logging contains the client’s desired pin lifetime for srmBringOnline requests.

srmclient

The commands srm-check-permissions, srm-get-permissions, srmls, srmmkdir, srmmv, srm-release-space, srm-reserve-space, srmrm, srmrmdir and srm-set-permissions now support the -extraInfo command-line option.

xrootd

The current release refined error handling by providing more meaningful (or at least consistent) error responses to the client.

Changelog 6.0.1..6.0.2

0e0b409
[maven-release-plugin] prepare release 6.0.2
a92235c
build: avoid shipping log4j with dCache
a34298e
frontend: fix NPE if limit is not specified
b30e6b9
srm: improve access logging of srmBringOnline requests.
57cef0d
srmclient: complete and tidy-up storageSystemInfo support
54c9860
dcache-frontend: fix QoS pin semantics
6ec3f30
dcache-xrootd: refine error handling
3d2d270
[maven-release-plugin] prepare for next development iteration

Release 6.0.1

Changes affecting multiple services

The Apache Commons Compress library used in dCache was updated to version 1.19.

A rare deadlock situation in the Chimera database was eliminated. In cases where, within the same directory, concurrent mkdir and rmdir events happened, transactions within the database could deadlock. This would be indicated by the message

ERROR: deadlock detected

in the logs.

pool

A regression introduced in the last release caused a ClassNotFoundException on pool startup with a message like

[ERROR] Server responded with an error: [3012] Failed to open file (Protocol Xrootd-4.0:123.32.123.32:38374 is not supported [27])

appearing in the logs. This release fixes that issue.

There were reports of extraordinarily high CPU usage on pool nodes with a large number of cached files. Through an optimization of the sweeper, CPU usage was reduced significantly.

xrootd

This release fixes a small regression in the kill mover command that would occasionally fail to kill all targeted movers.

This release fixes a vulnerability in dCache’s XRootD protocol implementation. We recommend that all sites update their XRootD doors. Details will be made available through EGI Security and, in a week’s time, through an update to these release notes.

Changelog 6.0.0..6.0.1

d2d6470bce
[maven-release-plugin] prepare release 6.0.1
adfd06a75d
dcache-xrootd: honor read paths when listing directories
9557e03701
pool: fix the xrootd version number on pool transfer service
4d63b90d01
resilience: don’t compare Integer objects by refference
1cb9f0915a
xrootd: fix kill mover command
c69949bcde
doors: initialize IdentityResolverFactory after dependency injection
de0c138a25
sweeper: use in-memory map instead of repository for histogram data
e0f39a9611
dcache-xrootd: replace constants for version number
02e2e92eaf
dcache-xrootd: update protocol version numbers
371f03a0a9
libs: update apache.commons:commons-compress to 1.19
58ca3e70cf
Update config-PoolManager.md
4a80fb0815
chimera: fix ABBA db deadlock when mkdir and rmdir run concurrently
81a5497044
[maven-release-plugin] prepare for next development iteration

Release 6.0.0

Cells

Fixed core domain discovery. The ZooKeeper path dcache/lm/cores-uri is now always used; the old location dcache/lm/cores is kept for backward compatibility only.

DCAP

Remove strict-size control channel option that was required for legacy ‘Pnfs’.

Frontend

The QoS management in the RESTful API contains a metadata object that provides information about the media QoS class. One of the values this metadata may show is the possible geographic placement of the data. Previously, this value was hardcoded to DE. Now admins can set it to a specific country (or countries) via the frontend.geographic-placement property. The property accepts a comma-separated list of ISO-3166 alpha-2 codes.
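
Before setting the property, a candidate value can be sanity-checked with a few lines of Python. The helper below is a hypothetical sketch: it only checks that each entry looks like a two-letter upper-case code, does not verify that the code is actually assigned in ISO 3166, and its tolerance of spaces around commas is an assumption rather than documented dCache behaviour.

```python
import re

_ALPHA2 = re.compile(r"[A-Z]{2}")


def parse_placement(value):
    """Split a comma-separated property value into two-letter country codes.

    Raises ValueError for any entry that is not two upper-case ASCII letters.
    """
    codes = [item.strip() for item in value.split(",") if item.strip()]
    for code in codes:
        if not _ALPHA2.fullmatch(code):
            raise ValueError(f"not an alpha-2 code: {code!r}")
    return codes


print(parse_placement("DE,FR,NL"))  # ['DE', 'FR', 'NL']
```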

httpd

Update the legend on the old https admin pages to describe pool space usage. Escape the status field string in /poolInfo/restoreHandler/* so that it is not interpreted as HTML markup.

NFS

The nfs door will return error NO-SPACE when all pools are full. However, when a pool is not yet online (disabled due to the startup procedure), the client is instructed to try later.

Updated the underlying nfs code implementation.

The NFS door will automatically retry pool selection when all pools are off-line.

Added the ability to query supported checksum types:

    cat ".(checksums)()"

Added ability to set file checksum(s) via dot command. This is useful for some HSM clients. A regular user can only set one checksum (type, value) pair, and cannot overwrite existing values. A root user is allowed to set multiple types (successively) and also to overwrite values for any type.

    touch foo
    touch ".(fset)(foo)(checksum)(ADLER32)(ffffffff)"
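
For an HSM client that needs to register an ADLER32 value, the checksum and the corresponding dot-command name could be produced as in the Python sketch below. The helper names are hypothetical, and the eight-digit lower-case hex format is an assumption based on the `ffffffff` value in the example above.

```python
import zlib


def adler32_hex(path, chunk_size=1 << 20):
    """Compute the ADLER32 checksum of a file as eight lower-case hex digits."""
    value = 1  # Adler-32 starts from 1
    with open(path, "rb") as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            value = zlib.adler32(chunk, value)
    return f"{value & 0xFFFFFFFF:08x}"


def fset_checksum_name(filename, checksum_hex, checksum_type="ADLER32"):
    """Build the dot-command file name in the form shown in the example above."""
    return f".(fset)({filename})(checksum)({checksum_type})({checksum_hex})"


print(fset_checksum_name("foo", "ffffffff"))
# .(fset)(foo)(checksum)(ADLER32)(ffffffff)
```

The resulting name would then be passed to touch in the directory that holds the file, as in the example above.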

Fix formatting of error message in Checksum.

Pool

Added control to modify the behavior of the pool space sweeper. Normally, when a pool is 100% full, the pool sweeper removes just enough replicas to accommodate a new file. A property has been added to adjust the amount of space released by the sweeper:

    pool.limits.sweeper-margin=0.0

If this is set to a value other than 0.0, that fraction of the total pool space will be released when the pool is 100% full and has to accept a new file replica. For example, a margin of 0.05 on a 100 TiB pool corresponds to releasing about 5 TiB. This behavior is designed to eliminate continuous removal when writing (or staging) to pools that are always full, avoiding disk thrashing at the expense of slightly reduced replica lifetime. A value of 0.0 means no change to existing behavior.

The HSM flush queue can be set to behave as either LIFO (last-in-first-out) or FIFO (first-in-first-out). This can be done statically using the property:

    (one-of?fifo|lifo)pool.flush-controller.queue-order=fifo

or in the setup file:

    flush set queue order lifo

or by using the admin command itself:

    \s <pool-name> flush set queue order lifo
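
The difference between the two queue orders can be shown with a toy Python model. It is a deliberately simplified sketch: it treats the queue as a fixed list of already-queued files and says nothing about how the pool schedules flushes internally; the function and file names are made up.

```python
from collections import deque


def flush_order(arrivals, queue_order="fifo"):
    """Return the order in which queued files would be taken for flushing.

    `arrivals` lists files oldest-first. "fifo" takes the oldest entry
    first, "lifo" the most recently queued entry first.
    """
    if queue_order not in ("fifo", "lifo"):
        raise ValueError("queue order must be 'fifo' or 'lifo'")
    queue = deque(arrivals)
    taken = []
    while queue:
        taken.append(queue.popleft() if queue_order == "fifo" else queue.pop())
    return taken


files = ["file-1", "file-2", "file-3"]
print(flush_order(files, "fifo"))  # ['file-1', 'file-2', 'file-3']
print(flush_order(files, "lifo"))  # ['file-3', 'file-2', 'file-1']
```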

Pool Manager

When all pools holding a requested file are off-line, the “Pool unavailable” message is now logged as a warning. Previously it was treated as a fatal error.

Resilience

Added a command to detect files all of whose replicas are on a given set of pools. This is useful when handling pool decommissioning.

   contained in

The argument to this command is a regular expression defining the set of pools to check. Running the command over the set of pools to be drained and removed provides a list of pnfsids for which some manual action or migration will be necessary, since they would otherwise be alarmed as ‘inaccessible’ by resilience.

Fixed remove count when replica is precious.
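
The check made by the contained in command can be pictured with a short Python sketch. The data, ids, and pool names are invented, the function is not the resilience implementation, and requiring the pattern to match the whole pool name is an assumption of this example.

```python
import re


def contained_in(replicas, pool_pattern):
    """Return ids of files whose every replica is on a pool matching the pattern.

    `replicas` maps a file id to the set of pool names holding a replica.
    """
    pattern = re.compile(pool_pattern)
    return sorted(
        file_id
        for file_id, pools in replicas.items()
        if pools and all(pattern.fullmatch(pool) for pool in pools)
    )


# Hypothetical replica locations.
replicas = {
    "0000A1": {"pool_old_1", "pool_old_2"},  # only on pools to be drained
    "0000B2": {"pool_old_1", "pool_new_1"},  # also has a replica elsewhere
    "0000C3": {"pool_new_2"},
}
print(contained_in(replicas, r"pool_old_\d+"))  # ['0000A1']
```

Only the first file would need manual action here, because draining the matching pools would remove every replica it has.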

WebDAV

OPTIONS method requests are now properly handled even if the site does not allow anonymous operation.

A PUT request that targets an existing collection resource (a directory) will now return the correct status code (405) instead of 500 Internal Error.

Changelog from 5.2.0 to 6.0.0

58ca3e7
Update config-PoolManager.md
4a80fb0
chimera: fix ABBA db deadlock when mkdir and rmdir run concurrently
81a5497
[maven-release-plugin] prepare for next development iteration
c91b832
[maven-release-plugin] prepare release 6.0.0
1ac7ff5
Revert “docker: Add a way to create docker image”
f204094
dcap: restart pool selection on OUT-OF-DATE error
b732136
gplazma-ldap: avoid thread leak by explicitly close NamingEnumeration
1a9b880
[maven-release-plugin] prepare branch 6.0
d293b1e
flush controller: fix wrong setter type for queue order
fd70b68
libs: update jackson-databind to 2.9.10
5b5e789
docs: UserGuide fix formatting issue with worked example
cf61aea
docs: UserGuide add initial version of third-party transfer documentation
0dd6c45
pools: make flush queue configurable as FIFO or LIFO
6a5c3df
core: fix code formatting and javadoc of StickyRecord
665a774
pool: allow margin option on sweeper when reclaiming space
4acfa2b
docs: UserGuide minor fixes to the WebDAV chapter
e7d3771
docs: UserGuide add a description for how to request macaroons
deb5cce
docs: UserGuide mark-up quoted XML.
68b13d0
docs: UserGuide first iteration on WebDAV description
ec7babb
httpd: escape status field in HttpPoolMgrEngineV3
3dca2b5
webdav: add allow header to OPTION method request
cd967a3
docs: describe pools job timeout manager
c8cf843
nfs: fix layoutreturn operation handling regression
26c20f2
chimera: FsInode_SURI, refactor permission checks into ChimeraVfs
d42cb2a
Include information on creating ban.conf
3c44c62
nfs: restart stale transfers
58d479b
cells: reduce dependency on ListenableFuture
0fb511c
nfs: support ability to set file checksum(s) via dot command
fbed8d5
chimera: add command to push tags into subdirectories
c557f28
resilience: add command to detect files all of whose replicas are on a given set of pools
72b90de
libs: use h2 version 1.4.199
a231e87
srm: Remove JVM memory limits
0e31e95
util: explicitly handle POOL_UNAVAILABLE in transfer class
a1ea2d4
util: introduce constant CacheException#POOL_UNAVAILABLE
1f7b9eb
resilience: fix remove count when replica is precious
2d9992c
chimera: update ctime on checksum set and remove
3016320
cells: make UOID object globally unique
12083be
Update intouch.md
270783f
docs: improve formatting of preface.md
e2241eb
docs: make links to corresponding chapters in introduction
fe25f50
cells: remove reference to ListenableFuture in CellGlue
9581b18
Delete next_button.css
cc88dc7
Update intouch.md
1e66851
Update intouch.md
27ffe52
Update intouch.md
883f690
Update intouch.md
6ae1787
Update intouch.md
f524ed7
util: add CompletableFutures#fromCompletableFuture method
b994018
Create next_button.css
069fc83
dcache: fix RepositorySubsystem unit test NPE
da929a0
pool-repository: Improve metadata check speed on pool re-start.
765e95f
nfs: checksum types dot command
6abfd14
frontend: make geographic placement configurable
350aee6
docs: TheBook further updates to describe updated NFS pinning interface
ea3f2fb
common: fix formatting of error message in Checksum
6a8cc96
scitoken: fix remote reading of JSON with UTF–8 CharSet
2a7f502
NFS: fix permissions on the pin dot command.
d5aa2f6
NFS: check whether pin operations were accepted
4a596cf
NFS: avoid stack-traces and provide feedback on invalid pset/fset args
2fbee94
PinManager: support replying as soon as the pin request is accepted
e89de9d
NFS: use client IP address when pinning files
f67ef81
NFS: add dot command to list a file’s pins
a6ec353
docs: UserGuide warn not to use -dcpriv
da5f1ad
docs: Mixed case for headings
a216f89
docs: Added automatic local tables of content
6925c44
docs: TheBook rewrite the install chapter
960960a
docs: TheBook add support dCache versions for SNAPSHOT.
8ea0991
docs: Removed readme.md
873e50d
Motivation: logger name unification
a8b2986
dcap: remove dcap control channel support for ’strict-size`
1e778e7
util: decorate retry executor with CDC-aware wrapper
251ff2e
cells: fix incorrect CellMessage#equals method
bcb4617
docs: Ceph performance considerations
2688761
docs: Share Zookeeper cluster between instances
ac0e2f8
vehicles: add missing FileAttributes$Builder$cacheClass method
a5a9ce3
gplazma2-jaas: Logger name was changed
c312637
dcache-webdav: changed name of Logger
37d7a43
gplazma-nis: updated name of logger to fit java convention
4de1cac
chimera: changed name of Logger
b71030d
Gplazma-multimap: LOG changed to LOGGER
4197d08
gplazma-voms:gplazma-voms: log auf LOGGER geändert
7e987f3
gplazma-htpasswd: unification of LOGGER variable
2a64a5c
gplazma2-argus: changed logger variable name
4283c55
chimera: changed readme
ac6f0ae
dcache: update readme with up-to-date description
5286a17
gplazma2-nis: update readme with up to date date
6399343
acl: update readme with up-to-date description
f27fc7a
nfs: fix NPE on “show transfers” command
10177b9
vehicles: remove PoolCheckable and related messages
477c2bb
dcap,srm: use CacheEntryInfoMessage instead of PoolCheckFileMessage
907eec6
resilience: remove reference to PoolCheckMessage in unit test
ea97bbf
libs: use nfs4j–0.19.0
f02d0ad
Update frontend.md
1892df9
Update frontend.md
d73bdda
poolmanager: do not check stage/p2p reply message type
b60d4c8
docs: Update UserGuide to use guide-specific navigation header
c295da2
docs: TheBook parameterise time; minor format and content tidy-up
59a43b5
docs: TheBook switch user/client console to use markup
c996ccb
docs: TheBook use a shorter console prompt
a438854
docs: TheBook include dCache version in documentation
385b77f
webdav: fix cross origin resources sharing issue
1bf3a98
docs: TheBook language-ini for dCache configuration; fix other minor issues
c7cf6c0
pool: don’t pass checksum module to transfer services
3959ad0
docs: TheBook use root console environment for root activity
a099cb3
docs: TheBook use monospaced font for filenames and paths
1c1de5e
Revert series of 14 commits pushed upstream by accident.
5c7433e
docs: update install chapter of TheBook to use syntactic highlighting
2e10fa8
Further refactoring
85ef0f1
Make cellNameFor return Optional
871b000
Factor our routeThreadGroup discovery method to a utility class
a9cf71c
Remove ThreadId class, use long instead
00023f3
Add diagnostic stuff
1e629e0
Add CPU information to “ps” command
40f3cb6
Remove separate CPU ls command
f50f790
Fix bug in format string
6be26c9
Fix two NPEs
0a683e7
WIP adding some debug output, some minor refactoring
5e911c9
Switch CpuUsage to be immutable
b4ffa34
Refactor code and add basic support for missing functionality in CellGlue
36f8e36
cells: add per-cell CPU usage monitoring
f30b180
Move ThreadGroup#enumerate out as a utility method
21b0efa
pool: remove artifacts from the replica manager
2aeb71b
pool: remove reference to checksum module
1b5c7ad
vehicles: remove historic PoolUpdateCacheStatisticsMessage
bfd512f
Book: avoid long lines
314b8c3
book: update style to match UserGuide
b652f53
nfs: merge AccessLogAware and ProxyIo operation factories
7ebb537
poolmanager: remove unused interface ExtendedRunnable
741e989
http: fix usage info legend color and label
f7222ae
dcache, frontend: release dcache-view version 1.5.5
112f784
nfs: introduce workaround ‘permission deny’ on layout commit
60fe748
pom: fix spotbugs maven plugin initialization ()
4f38bc9
pom: enable spotbugs plugin
5f65152
crypto: fix broken unit tests
ad30a1d
chimera: chimera shell should show output when commands come from stdin.
d27558b
webdav: return 405 status code for PUT requests targeting collections
294e290
doors: improve timeout message waiting for redirection
1258cb7
chimera: return more helpful error message
c88d102
cells: fix handling of plain and tls connections
98d3cc8
pool: update billing log message to say whether cancelled mover was queued
434eee9
crypto: add work-around for SL7 clients
09ed752
pool: remove unused checksum verification method from ChecksumModule
a1cb89e
nfs: handle read and write transfers by different transfer classes
6cf7438
dcache-xrootd: add checksum cgi handling to door query
46f0211
docs: Update of PoolManager config chapter
8974b70
nearline-storage: remove unused field
bb39761
scripts: avoid copy-n-paste error when calculating pool size
91d3cbd
nfs: fail with NO-SPACE error when all pools are full
1939fed
crypto: add support for disabling weak crypo suites
a1e0c1b
httpd: allow file-specific errors to propagate
d49ec20
pool: fix toString method for ReplicaStore decorators
21fc553
frontend: include doors resource as request-scope bean
86d3f6c
cells: add fall-back for discovering to which cell a thread belongs
6f2e53e
dcache-xrootd: check the session for credential on source open
03f18cb
dcache-xrootd: respond to tpc query correctly*
1cc35df
dcache-xrootd: compute VOMs CA refresh interval using unit
8916f6b
dcache, frontend: release dcache-view version 1.5.4
6f5242f
alarms: eliminate logback context from junit test
69e773c
libs: use curator 4.2.0
9417058
pools: make the xrootd tpc response timeout less aggressive
868e77f
system-test: avoid garbage-collecting DB connections
b7f5d88
transfermanager: include pool name in error for ‘mover ls’ failures
5e1b849
ftp: avoid NPE on HA-Proxy probes
f1a277a
libs: update aspectj plugin to up-to-date version
16085a0
multiple: add support for Hikari-specific properties
b0dd3e1
[maven-release-plugin] prepare for next development iteration