Skip to content

Commit

Permalink
Merge pull request #3295 from etschannen/release-6.3
Browse files Browse the repository at this point in the history
Merge release 6.2 into release 6.3
  • Loading branch information
etschannen authored Jun 5, 2020
2 parents ac19ba1 + 7ec1d64 commit 1671997
Show file tree
Hide file tree
Showing 7 changed files with 126 additions and 78 deletions.
68 changes: 67 additions & 1 deletion documentation/sphinx/source/administration.rst
Original file line number Diff line number Diff line change
Expand Up @@ -493,7 +493,7 @@ If a process has had more than 10 TCP segments retransmitted in the last 5 secon
10.0.4.1:4500 ( 3% cpu; 2% machine; 0.004 Gbps; 0% disk; REXMIT! 2.5 GB / 4.1 GB RAM )

Machine-readable status
--------------------------------
-----------------------

The status command can provide a complete summary of statistics about the cluster and the database with the ``json`` argument. Full documentation for ``status json`` output can be found :doc:`here <mr-status>`.
From the output of ``status json``, operators can find useful health metrics to determine whether or not their cluster is hitting performance limits.
Expand All @@ -505,6 +505,72 @@ Durable version lag ``cluster.qos.worst_durability_lag_storage_server`` cont
Transaction log queue ``cluster.qos.worst_queue_bytes_log_server`` contains the maximum size in bytes of the mutations stored on a transaction log that have not yet been popped by storage servers. A large transaction log queue size can potentially cause the ratekeeper to increase throttling.
====================== ==============================================================================================================

Server-side latency band tracking
---------------------------------

As part of the status document, ``status json`` provides some sampled latency metrics obtained by running probe transactions internally. While this can often be useful, it does not necessarily reflect the distribution of latencies for requests originated by clients.

FoundationDB additionally provides optional functionality to measure the latencies of all incoming get read version (GRV), read, and commit requests and report some basic details about those requests. The latencies are measured from the time the server receives the request to the point when it replies, and will therefore not include time spent in transit between the client and server or delays in the client process itself.

The latency band tracking works by configuring various latency thresholds and counting the number of requests that occur in each band (i.e. between two consecutive thresholds). For example, if you wanted to define a service-level objective (SLO) for your cluster where 99.9% of read requests were answered within N seconds, you could set a read latency threshold at N. You could then count the number of requests below and above the threshold and determine whether the required percentage of requests are answered sufficiently quickly.

Configuration of server-side latency bands is performed by setting the ``\xff\x02/latencyBandConfig`` key to a string encoding the following JSON document::

{
"get_read_version" : {
"bands" : [ 0.01, 0.1]
},
"read" : {
"bands" : [ 0.01, 0.1],
"max_key_selector_offset" : 1000,
"max_read_bytes" : 1000000
},
"commit" : {
"bands" : [ 0.01, 0.1],
"max_commit_bytes" : 1000000
}
}

Every field in this configuration is optional, and any missing fields will be left unset (i.e. no bands will be tracked or limits will not apply). The configuration takes the following arguments:

* ``bands`` - a list of thresholds (in seconds) to be measured for the given request type (``get_read_version``, ``read``, or ``commit``)
* ``max_key_selector_offset`` - an integer specifying the maximum key selector offset a read request can have and still be counted
* ``max_read_bytes`` - an integer specifying the maximum size in bytes of a read response that will be counted
* ``max_commit_bytes`` - an integer specifying the maximum size in bytes of a commit request that will be counted

Setting this configuration key to a value that changes the configuration will result in the cluster controller server process logging a ``LatencyBandConfigChanged`` event. This event will indicate whether a configuration is present or not using its ``Present`` field. Specifying an invalid configuration will result in the latency band feature being unconfigured, and the server process running the cluster controller will log a ``InvalidLatencyBandConfiguration`` trace event.

.. note:: GRV requests are counted only at default and immediate priority. Batch priority GRV requests are ignored for the purposes of latency band tracking.

When configured, the ``status json`` output will include additional fields to report the number of requests in each latency band located at ``cluster.processes.<ID>.roles[N].*_latency_bands``::

"grv_latency_bands" : {
0.01: 10,
0.1: 0,
inf: 1,
filtered: 0
},
"read_latency_bands" : {
0.01: 12,
0.1: 1,
inf: 0,
filtered: 0
},
"commit_latency_bands" : {
0.01: 5,
0.1: 5,
inf: 2,
filtered: 1
}

The ``grv_latency_bands`` and ``commit_latency_bands`` objects will only be logged for ``proxy`` roles, and ``read_latency_bands`` will only be logged for storage roles. Each threshold is represented as a key in the map, and its associated value will be the total number of requests in the lifetime of the process with a latency smaller than the threshold but larger than the next smaller threshold.

For example, ``0.1: 1`` in ``read_latency_bands`` indicates that there has been 1 read request with a latency in the range ``[0.01, 0.1)``. For the smallest specified threshold, the lower bound is 0 (e.g. ``[0, 0.01)`` in the example above). Requests that took longer than any defined latency band will be reported in the ``inf`` (infinity) band. Requests that were filtered by the configuration (e.g. using ``max_read_bytes``) are reported in the ``filtered`` category.

Because each threshold reports latencies strictly in the range between the next lower threshold and itself, it may be necessary to sum up the counts for multiple bands to determine the total number of requests below a certain threshold.

.. note:: No history of request counts is recorded for processes that ran in the past. This includes the history prior to restart for a process that has been restarted, for which the counts get reset to 0. For this reason, it is recommended that you collect this information periodically if you need to be able to track requests from such processes.

.. _administration_fdbmonitor:

``fdbmonitor`` and ``fdbserver``
Expand Down
24 changes: 12 additions & 12 deletions documentation/sphinx/source/downloads.rst
Original file line number Diff line number Diff line change
Expand Up @@ -10,38 +10,38 @@ macOS

The macOS installation package is supported on macOS 10.7+. It includes the client and (optionally) the server.

* `FoundationDB-6.3.0.pkg <https://www.foundationdb.org/downloads/6.3.0/macOS/installers/FoundationDB-6.3.0.pkg>`_
* `FoundationDB-6.3.1.pkg <https://www.foundationdb.org/downloads/6.3.1/macOS/installers/FoundationDB-6.3.1.pkg>`_

Ubuntu
------

The Ubuntu packages are supported on 64-bit Ubuntu 12.04+, but beware of the Linux kernel bug in Ubuntu 12.x.

* `foundationdb-clients-6.3.0-1_amd64.deb <https://www.foundationdb.org/downloads/6.3.0/ubuntu/installers/foundationdb-clients_6.3.0-1_amd64.deb>`_
* `foundationdb-server-6.3.0-1_amd64.deb <https://www.foundationdb.org/downloads/6.3.0/ubuntu/installers/foundationdb-server_6.3.0-1_amd64.deb>`_ (depends on the clients package)
* `foundationdb-clients-6.3.1-1_amd64.deb <https://www.foundationdb.org/downloads/6.3.1/ubuntu/installers/foundationdb-clients_6.3.1-1_amd64.deb>`_
* `foundationdb-server-6.3.1-1_amd64.deb <https://www.foundationdb.org/downloads/6.3.1/ubuntu/installers/foundationdb-server_6.3.1-1_amd64.deb>`_ (depends on the clients package)

RHEL/CentOS EL6
---------------

The RHEL/CentOS EL6 packages are supported on 64-bit RHEL/CentOS 6.x.

* `foundationdb-clients-6.3.0-1.el6.x86_64.rpm <https://www.foundationdb.org/downloads/6.3.0/rhel6/installers/foundationdb-clients-6.3.0-1.el6.x86_64.rpm>`_
* `foundationdb-server-6.3.0-1.el6.x86_64.rpm <https://www.foundationdb.org/downloads/6.3.0/rhel6/installers/foundationdb-server-6.3.0-1.el6.x86_64.rpm>`_ (depends on the clients package)
* `foundationdb-clients-6.3.1-1.el6.x86_64.rpm <https://www.foundationdb.org/downloads/6.3.1/rhel6/installers/foundationdb-clients-6.3.1-1.el6.x86_64.rpm>`_
* `foundationdb-server-6.3.1-1.el6.x86_64.rpm <https://www.foundationdb.org/downloads/6.3.1/rhel6/installers/foundationdb-server-6.3.1-1.el6.x86_64.rpm>`_ (depends on the clients package)

RHEL/CentOS EL7
---------------

The RHEL/CentOS EL7 packages are supported on 64-bit RHEL/CentOS 7.x.

* `foundationdb-clients-6.3.0-1.el7.x86_64.rpm <https://www.foundationdb.org/downloads/6.3.0/rhel7/installers/foundationdb-clients-6.3.0-1.el7.x86_64.rpm>`_
* `foundationdb-server-6.3.0-1.el7.x86_64.rpm <https://www.foundationdb.org/downloads/6.3.0/rhel7/installers/foundationdb-server-6.3.0-1.el7.x86_64.rpm>`_ (depends on the clients package)
* `foundationdb-clients-6.3.1-1.el7.x86_64.rpm <https://www.foundationdb.org/downloads/6.3.1/rhel7/installers/foundationdb-clients-6.3.1-1.el7.x86_64.rpm>`_
* `foundationdb-server-6.3.1-1.el7.x86_64.rpm <https://www.foundationdb.org/downloads/6.3.1/rhel7/installers/foundationdb-server-6.3.1-1.el7.x86_64.rpm>`_ (depends on the clients package)

Windows
-------

The Windows installer is supported on 64-bit Windows XP and later. It includes the client and (optionally) the server.

* `foundationdb-6.3.0-x64.msi <https://www.foundationdb.org/downloads/6.3.0/windows/installers/foundationdb-6.3.0-x64.msi>`_
* `foundationdb-6.3.1-x64.msi <https://www.foundationdb.org/downloads/6.3.1/windows/installers/foundationdb-6.3.1-x64.msi>`_

API Language Bindings
=====================
Expand All @@ -58,18 +58,18 @@ On macOS and Windows, the FoundationDB Python API bindings are installed as part

If you need to use the FoundationDB Python API from other Python installations or paths, use the Python package manager ``pip`` (``pip install foundationdb``) or download the Python package:

* `foundationdb-6.3.0.tar.gz <https://www.foundationdb.org/downloads/6.3.0/bindings/python/foundationdb-6.3.0.tar.gz>`_
* `foundationdb-6.3.1.tar.gz <https://www.foundationdb.org/downloads/6.3.1/bindings/python/foundationdb-6.3.1.tar.gz>`_

Ruby 1.9.3/2.0.0+
-----------------

* `fdb-6.3.0.gem <https://www.foundationdb.org/downloads/6.3.0/bindings/ruby/fdb-6.3.0.gem>`_
* `fdb-6.3.1.gem <https://www.foundationdb.org/downloads/6.3.1/bindings/ruby/fdb-6.3.1.gem>`_

Java 8+
-------

* `fdb-java-6.3.0.jar <https://www.foundationdb.org/downloads/6.3.0/bindings/java/fdb-java-6.3.0.jar>`_
* `fdb-java-6.3.0-javadoc.jar <https://www.foundationdb.org/downloads/6.3.0/bindings/java/fdb-java-6.3.0-javadoc.jar>`_
* `fdb-java-6.3.1.jar <https://www.foundationdb.org/downloads/6.3.1/bindings/java/fdb-java-6.3.1.jar>`_
* `fdb-java-6.3.1-javadoc.jar <https://www.foundationdb.org/downloads/6.3.1/bindings/java/fdb-java-6.3.1-javadoc.jar>`_

Go 1.11+
--------
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,15 @@
Release Notes
#############

6.2.22
======

Fixes
-----

* Coordinator class processes could be recruited as the cluster controller. `(PR #3282) <https://github.com/apple/foundationdb/pull/3282>`_
* HTTPS requests made by backup would fail (introduced in 6.2.21). `(PR #3284) <https://github.com/apple/foundationdb/pull/3284>`_

6.2.21
======

Expand Down
2 changes: 1 addition & 1 deletion documentation/sphinx/source/release-notes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
Release Notes
#############

6.3.0
6.3.1
=====

Features
Expand Down
8 changes: 3 additions & 5 deletions fdbclient/HTTP.actor.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -352,6 +352,9 @@ namespace HTTP {
send_start = timer();

loop {
wait(conn->onWritable());
wait( delay( 0, TaskPriority::WriteSocket ) );

// If we already got a response, before finishing sending the request, then close the connection,
// set the Connection header to "close" as a hint to the caller that this connection can't be used
// again, and break out of the send loop.
Expand All @@ -372,11 +375,6 @@ namespace HTTP {
pContent->sent(len);
if(pContent->empty())
break;

if(len == 0) {
wait(conn->onWritable());
wait( delay( 0, TaskPriority::WriteSocket ) );
}
}

wait(responseReading);
Expand Down
2 changes: 1 addition & 1 deletion fdbserver/worker.actor.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1726,7 +1726,7 @@ ACTOR Future<Void> fdbd(
Reference<AsyncVar<ServerDBInfo>> dbInfo( new AsyncVar<ServerDBInfo>(ServerDBInfo()) );

actors.push_back(reportErrors(monitorAndWriteCCPriorityInfo(fitnessFilePath, asyncPriorityInfo), "MonitorAndWriteCCPriorityInfo"));
if (processClass == ProcessClass::TesterClass) {
if (processClass.machineClassFitness(ProcessClass::ClusterController) == ProcessClass::NeverAssign) {
actors.push_back( reportErrors( monitorLeader( connFile, cc ), "ClusterController" ) );
} else if (processClass == ProcessClass::StorageClass && SERVER_KNOBS->MAX_DELAY_STORAGE_CANDIDACY_SECONDS > 0) {
actors.push_back( reportErrors( monitorLeaderRemotelyWithDelayedCandidacy( connFile, cc, asyncPriorityInfo, recoveredDiskFiles.getFuture(), localities, dbInfo ), "ClusterController" ) );
Expand Down
91 changes: 33 additions & 58 deletions flow/Net2.actor.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -296,6 +296,35 @@ class BindPromise {
}
};

struct SendBufferIterator {
typedef boost::asio::const_buffer value_type;
typedef std::forward_iterator_tag iterator_category;
typedef size_t difference_type;
typedef boost::asio::const_buffer* pointer;
typedef boost::asio::const_buffer& reference;

SendBuffer const* p;
int limit;

SendBufferIterator(SendBuffer const* p=0, int limit = std::numeric_limits<int>::max()) : p(p), limit(limit) {
ASSERT(limit > 0);
}

bool operator == (SendBufferIterator const& r) const { return p == r.p; }
bool operator != (SendBufferIterator const& r) const { return p != r.p; }
void operator++() {
limit -= p->bytes_written - p->bytes_sent;
if(limit > 0)
p = p->next;
else
p = NULL;
}

boost::asio::const_buffer operator*() const {
return boost::asio::const_buffer( p->data + p->bytes_sent, std::min(limit, p->bytes_written - p->bytes_sent) );
}
};

class Connection : public IConnection, ReferenceCounted<Connection> {
public:
virtual void addref() { ReferenceCounted<Connection>::addref(); }
Expand Down Expand Up @@ -420,35 +449,6 @@ class Connection : public IConnection, ReferenceCounted<Connection> {
tcp::socket socket;
NetworkAddress peer_address;

struct SendBufferIterator {
typedef boost::asio::const_buffer value_type;
typedef std::forward_iterator_tag iterator_category;
typedef size_t difference_type;
typedef boost::asio::const_buffer* pointer;
typedef boost::asio::const_buffer& reference;

SendBuffer const* p;
int limit;

SendBufferIterator(SendBuffer const* p=0, int limit = std::numeric_limits<int>::max()) : p(p), limit(limit) {
ASSERT(limit > 0);
}

bool operator == (SendBufferIterator const& r) const { return p == r.p; }
bool operator != (SendBufferIterator const& r) const { return p != r.p; }
void operator++() {
limit -= p->bytes_written - p->bytes_sent;
if(limit > 0)
p = p->next;
else
p = NULL;
}

boost::asio::const_buffer operator*() const {
return boost::asio::const_buffer( p->data + p->bytes_sent, std::min(limit, p->bytes_written - p->bytes_sent) );
}
};

void init() {
// Socket settings that have to be set after connect or accept succeeds
socket.non_blocking(true);
Expand Down Expand Up @@ -721,6 +721,10 @@ class SSLConnection : public IConnection, ReferenceCounted<SSLConnection> {

// Writes as many bytes as possible from the given SendBuffer chain into the write buffer and returns the number of bytes written (might be 0)
virtual int write( SendBuffer const* data, int limit ) {
#ifdef __APPLE__
// For some reason, writing ssl_sock with more than 2016 bytes when socket is writeable sometimes results in a broken pipe error.
limit = std::min(limit, 2016);
#endif
boost::system::error_code err;
++g_net2->countWrites;

Expand Down Expand Up @@ -763,35 +767,6 @@ class SSLConnection : public IConnection, ReferenceCounted<SSLConnection> {
NetworkAddress peer_address;
Reference<ReferencedObject<boost::asio::ssl::context>> sslContext;

struct SendBufferIterator {
typedef boost::asio::const_buffer value_type;
typedef std::forward_iterator_tag iterator_category;
typedef size_t difference_type;
typedef boost::asio::const_buffer* pointer;
typedef boost::asio::const_buffer& reference;

SendBuffer const* p;
int limit;

SendBufferIterator(SendBuffer const* p=0, int limit = std::numeric_limits<int>::max()) : p(p), limit(limit) {
ASSERT(limit > 0);
}

bool operator == (SendBufferIterator const& r) const { return p == r.p; }
bool operator != (SendBufferIterator const& r) const { return p != r.p; }
void operator++() {
limit -= p->bytes_written - p->bytes_sent;
if(limit > 0)
p = p->next;
else
p = NULL;
}

boost::asio::const_buffer operator*() const {
return boost::asio::const_buffer( p->data + p->bytes_sent, std::min(limit, p->bytes_written - p->bytes_sent) );
}
};

void init() {
// Socket settings that have to be set after connect or accept succeeds
socket.non_blocking(true);
Expand Down

0 comments on commit 1671997

Please sign in to comment.