Tor
0.4.7.0-alpha-dev
Maintains and analyzes statistics about circuit build times, so we can tell how long we may need to wait for a fast circuit to be constructed.
#include "core/or/or.h"
#include "core/or/circuitbuild.h"
#include "core/or/circuitstats.h"
#include "app/config/config.h"
#include "lib/confmgt/confmgt.h"
#include "feature/control/control_events.h"
#include "lib/crypt_ops/crypto_rand.h"
#include "core/mainloop/mainloop.h"
#include "feature/nodelist/networkstatus.h"
#include "feature/relay/router.h"
#include "app/config/statefile.h"
#include "core/or/circuitlist.h"
#include "core/or/circuituse.h"
#include "lib/math/fp.h"
#include "lib/time/tvdiff.h"
#include "lib/encoding/confline.h"
#include "feature/dirauth/authmode.h"
#include "feature/hs/hs_service.h"
#include "feature/relay/relay_periodic.h"
#include "core/or/crypt_path_st.h"
#include "core/or/origin_circuit_st.h"
#include "app/config/or_state_st.h"
#include <math.h>
Macros
#define CIRCUITSTATS_PRIVATE
#define CBT_BIN_TO_MS(bin) ((bin)*CBT_BIN_WIDTH + (CBT_BIN_WIDTH/2))
#define unit_tests 0
#define CIRCUIT_TIMEOUT_BEFORE_RECHECK_IP (60*3)
#define MAX_TIMEOUT ((int32_t) (INT32_MAX/2))
Variables
static circuit_build_times_t circ_times
Maintains and analyzes statistics about circuit build times, so we can tell how long we may need to wait for a fast circuit to be constructed.
By keeping these statistics, a client learns when it should time out a slow circuit for being too slow, and when it should keep a circuit open in order to wait for it to complete.
The information here is kept in a circuit_build_times_t structure, which is currently a singleton, but doesn't need to be. It's updated by calls to circuit_build_times_count_timeout() from circuituse.c, circuit_build_times_count_close() from circuituse.c, and circuit_build_times_add_time() from circuitbuild.c, and inspected by other calls into this module, mostly from circuitlist.c. Observations are persisted to disk via the or_state_t-related calls.
Definition in file circuitstats.c.
#define CIRCUIT_TIMEOUT_BEFORE_RECHECK_IP (60*3)
How long should we be unreachable before we think we need to check if our published IP address has changed.
Definition at line 1343 of file circuitstats.c.
int circuit_build_times_add_time(circuit_build_times_t *cbt, build_time_t btime)
Add a new build time value btime to the set of build times. Time units are milliseconds.
The build times in cbt are stored in a circular array, so wrap around to the start when the array is full.
Definition at line 756 of file circuitstats.c.
Referenced by circuit_build_times_count_close().
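The wrap-around behavior described above can be illustrated with a minimal ring-buffer sketch. Everything here is hypothetical: the type example_cbt_t, its fields, and the capacity EXAMPLE_NCIRCUITS are made up for illustration and are not taken from circuitstats.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical capacity, chosen small for illustration. */
#define EXAMPLE_NCIRCUITS 4

typedef uint32_t example_build_time_t;

/* Hypothetical stand-in for the circular part of the build-time state. */
typedef struct {
  example_build_time_t times[EXAMPLE_NCIRCUITS]; /* samples in ms */
  int next_idx; /* slot that the next sample is written to */
  int total;    /* number of valid samples, capped at capacity */
} example_cbt_t;

/* Store btime_ms in the ring, overwriting the oldest sample once the
 * array is full. Returns the index of the slot that was written. */
static int
example_add_time(example_cbt_t *cbt, example_build_time_t btime_ms)
{
  assert(cbt);
  int written = cbt->next_idx;
  cbt->times[written] = btime_ms;
  cbt->next_idx = (cbt->next_idx + 1) % EXAMPLE_NCIRCUITS;
  if (cbt->total < EXAMPLE_NCIRCUITS)
    cbt->total++;
  return written;
}
```

With a capacity of four, the fifth sample lands in slot 0 again and replaces the oldest value, while the sample count stays at four.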
STATIC double circuit_build_times_calculate_timeout(circuit_build_times_t *cbt, double quantile)
This is the Pareto Quantile Function. It calculates the point x in the distribution such that F(x) = quantile (i.e., quantile*100% of the mass of the density function is below x on the curve).
We use it to calculate the timeout and also to generate synthetic values of time for circuits that time out before completion.
See https://en.wikipedia.org/wiki/Quantile_function, https://en.wikipedia.org/wiki/Inverse_transform_sampling and https://en.wikipedia.org/wiki/Pareto_distribution#Generating_a_random_sample_from_Pareto_distribution. That's right. I'll cite wikipedia all day long.
Return value is in milliseconds, clamped to INT32_MAX.
Definition at line 1217 of file circuitstats.c.
Referenced by circuit_build_times_set_timeout_worker().
|
static |
Retrieve and bounds-check the cbtclosequantile consensus parameter.
Effect: This is the position on the quantile curve used to set the timeout after which circuits are actually closed. It is a percent (0-99).
Definition at line 292 of file circuitstats.c.
Referenced by circuit_build_times_set_timeout_worker().
double circuit_build_times_close_rate(const circuit_build_times_t *cbt)
Count the number of closed circuits in a set of cbt data.
Definition at line 1631 of file circuitstats.c.
int circuit_build_times_count_close(circuit_build_times_t *cbt, int did_onehop, time_t start_time)
Store a timeout as a synthetic value.
Returns true if the store was successful and we should possibly update our timeout estimate.
Definition at line 1653 of file circuitstats.c.
void circuit_build_times_count_timeout(circuit_build_times_t *cbt, int did_onehop)
Update timeout counts to determine if we need to expire our build time history due to excessive timeouts.
We do not record any actual time values at this stage; we are only interested in recording the fact that a timeout happened. We record the time values via circuit_build_times_count_close() and circuit_build_times_add_time().
Definition at line 1685 of file circuitstats.c.
|
static |
Calculate and return a histogram for the set of build times.
Returns an allocated array of histogram bins representing the frequency of index*CBT_BIN_WIDTH millisecond build times. Also outputs the number of bins in nbins.
The return value must be freed by the caller.
Definition at line 826 of file circuitstats.c.
Referenced by circuit_build_times_get_xm(), and circuit_build_times_update_state().
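A minimal sketch of this kind of binning follows, assuming a made-up bin width EXAMPLE_BIN_WIDTH and a plain array of samples as input. The function name and signature are hypothetical and do not mirror the static function documented here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bin width in milliseconds. */
#define EXAMPLE_BIN_WIDTH 10

/* Build a histogram in which bin i counts the samples that fall in
 * [i*EXAMPLE_BIN_WIDTH, (i+1)*EXAMPLE_BIN_WIDTH). The number of bins is
 * stored in *nbins. Returns a newly allocated array that the caller
 * must free(), or NULL if allocation fails. */
static uint32_t *
example_create_histogram(const uint32_t *times_ms, size_t n, size_t *nbins)
{
  assert(times_ms);
  assert(nbins);

  /* The largest sample decides how many bins are needed. */
  uint32_t max_ms = 0;
  for (size_t i = 0; i < n; i++) {
    if (times_ms[i] > max_ms)
      max_ms = times_ms[i];
  }
  *nbins = (size_t)(max_ms / EXAMPLE_BIN_WIDTH) + 1;

  uint32_t *hist = calloc(*nbins, sizeof(uint32_t));
  if (!hist) {
    *nbins = 0;
    return NULL;
  }
  for (size_t i = 0; i < n; i++)
    hist[times_ms[i] / EXAMPLE_BIN_WIDTH]++;
  return hist;
}
```

With samples of 5, 12, 18 and 31 ms and a width of 10 ms, this yields four bins with counts 1, 2, 0 and 1.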
|
static |
Retrieve and bounds-check the cbtnummodes consensus parameter.
Effect: This value governs how many modes to use in the weighted average calculation of Pareto parameter Xm. Analysis of pairs of geographically near, far, and mixed guards has shown that a value of 10 allows the actual timeout rate to be within 2-7% of the cutoff quantile, for quantiles between 60-80%.
Definition at line 211 of file circuitstats.c.
Referenced by circuit_build_times_get_xm().
int circuit_build_times_disabled(const or_options_t *options)
This function decides if CBT learning should be disabled. It returns true if one or more of the following conditions are met:
Definition at line 117 of file circuitstats.c.
Referenced by circuit_build_times_count_close(), circuit_build_times_count_timeout(), circuit_build_times_handle_completed_hop(), circuit_build_times_init(), circuit_build_times_new_consensus_params(), circuit_build_times_parse_state(), and circuit_build_times_set_timeout().
int circuit_build_times_disabled_(const or_options_t *options, int ignore_consensus)
As circuit_build_times_disabled, but take options as an argument.
Definition at line 124 of file circuitstats.c.
Referenced by circuit_build_times_disabled().
int circuit_build_times_enough_to_compute(const circuit_build_times_t *cbt)
Return true iff cbt has recorded enough build times that we want to start acting on the timeout it implies.
Definition at line 255 of file circuitstats.c.
Referenced by circuit_build_times_needs_circuits(), and circuit_build_times_set_timeout_worker().
void circuit_build_times_free_timeouts(circuit_build_times_t *cbt)
Free the saved timeouts, if the cbtdisabled consensus parameter got turned on or something.
Definition at line 591 of file circuitstats.c.
|
static |
Return the initial default or configured timeout in milliseconds.
Definition at line 514 of file circuitstats.c.
Referenced by circuit_build_times_count_close(), circuit_build_times_count_timeout(), and circuit_build_times_network_check_changed().
STATIC build_time_t circuit_build_times_get_xm(circuit_build_times_t *cbt)
Return the Pareto start-of-curve parameter Xm.
Because we are not a true Pareto curve, we compute this as the weighted average of the 10 most frequent build time bins. This heuristic allowed for the actual timeout rate to be closest to the chosen quantile cutoff, for quantiles 60-80%, out of many variant approaches (see #40157 for analysis).
Definition at line 859 of file circuitstats.c.
Referenced by circuit_build_times_update_alpha().
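The weighted average over the most frequent bins can be sketched as below. This is an illustration under stated assumptions, not the code at line 859: the bin width, the cap EXAMPLE_MAX_MODES, and the function name are hypothetical, and the bin midpoint macro simply copies the shape of CBT_BIN_TO_MS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants for illustration. */
#define EXAMPLE_BIN_WIDTH 10
#define EXAMPLE_MAX_MODES 10
/* Midpoint of a bin in milliseconds, same shape as CBT_BIN_TO_MS. */
#define EXAMPLE_BIN_TO_MS(bin) \
  ((bin)*EXAMPLE_BIN_WIDTH + (EXAMPLE_BIN_WIDTH/2))

/* Estimate Xm as the count-weighted average of the midpoints of the
 * num_modes most frequent bins in hist. Returns 0 if hist is empty. */
static uint32_t
example_get_xm(const uint32_t *hist, size_t nbins, size_t num_modes)
{
  assert(hist);
  assert(num_modes > 0 && num_modes <= EXAMPLE_MAX_MODES);

  size_t top[EXAMPLE_MAX_MODES]; /* bin indices, most frequent first */
  size_t ntop = 0;

  for (size_t i = 0; i < nbins; i++) {
    if (hist[i] == 0)
      continue;
    /* Find where bin i belongs in the descending list. */
    size_t pos = ntop;
    while (pos > 0 && hist[i] > hist[top[pos - 1]])
      pos--;
    if (pos >= num_modes)
      continue; /* not frequent enough to make the list */
    /* Shift less frequent entries down, dropping the last if full. */
    size_t last = (ntop < num_modes) ? ntop : num_modes - 1;
    for (size_t j = last; j > pos; j--)
      top[j] = top[j - 1];
    top[pos] = i;
    if (ntop < num_modes)
      ntop++;
  }

  uint64_t weighted = 0, count = 0;
  for (size_t k = 0; k < ntop; k++) {
    weighted += (uint64_t)hist[top[k]] * EXAMPLE_BIN_TO_MS(top[k]);
    count += hist[top[k]];
  }
  return count ? (uint32_t)(weighted / count) : 0;
}
```

For a histogram with counts {1, 5, 3, 0, 2} and two modes, the two most frequent bins have midpoints 15 ms and 25 ms, so the estimate is (5*15 + 3*25) / 8 = 18 ms after integer division.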
void circuit_build_times_handle_completed_hop(origin_circuit_t *circ)
Perform the build time work that needs to be done when a circuit completes a hop.
This function decides if we should record a circuit's build time in our histogram data and other statistics, and if so, records it. It also will mark circuits that have already timed out as measurement-only circuits, so they can continue to build but not get used.
For this, we want to consider circuits that will eventually make it to the third hop. For circuits longer than 3 hops, we want to record their build time when they reach the third hop, but let them continue (and not count them later). For circuits that are exactly 3 hops, this will count them when they are completed. We do this so that CBT is always gathering statistics on circuits of the same length, regardless of their type.
Definition at line 679 of file circuitstats.c.
void circuit_build_times_init(circuit_build_times_t *cbt)
Initialize the buildtimes structure for first use.
Sets the initial timeout values based on either the config setting, the consensus param, or the default (CBT_DEFAULT_TIMEOUT_INITIAL_VALUE).
Definition at line 565 of file circuitstats.c.
Referenced by circuit_build_times_parse_state().
int32_t circuit_build_times_initial_timeout(void)
Retrieve and bounds-check the cbtinitialtimeout consensus parameter.
Effect: This is the timeout value to use before computing a timeout, in milliseconds.
Definition at line 370 of file circuitstats.c.
void circuit_build_times_mark_circ_as_measurement_only(origin_circuit_t *circ)
Mark this circuit as timed out, but change its purpose so that it continues to build, allowing us to measure its full build time.
Definition at line 637 of file circuitstats.c.
|
static |
Return the maximum circuit build time.
Definition at line 785 of file circuitstats.c.
Referenced by circuit_build_times_create_histogram(), and circuit_build_times_set_timeout_worker().
|
static |
Retrieve and bounds-check the cbtmaxtimeouts consensus parameter.
Effect: When this many timeouts happen in the last 'cbtrecentcount' circuit attempts, the client should discard all of its history and begin learning a fresh timeout value.
Definition at line 182 of file circuitstats.c.
Referenced by circuit_build_times_network_check_changed().
|
static |
Retrieve and bounds-check the cbtmincircs consensus parameter.
Effect: This is the minimum number of circuits to build before computing a timeout.
Definition at line 235 of file circuitstats.c.
Referenced by circuit_build_times_enough_to_compute().
|
static |
Retrieve and bounds-check the cbtmintimeout consensus parameter.
Effect: This is the minimum allowed timeout value in milliseconds. The minimum is to prevent rounding to 0 (we only check once per second).
Definition at line 348 of file circuitstats.c.
Referenced by circuit_build_times_initial_timeout(), and circuit_build_times_set_timeout().
int circuit_build_times_needs_circuits(const circuit_build_times_t *cbt)
Returns true if we need circuits to be built.
Definition at line 1322 of file circuitstats.c.
Referenced by circuit_build_times_needs_circuits_now().
int circuit_build_times_needs_circuits_now(const circuit_build_times_t *cbt)
Returns true if we should build a timeout test circuit right now.
Definition at line 1333 of file circuitstats.c.
STATIC int circuit_build_times_network_check_changed(circuit_build_times_t *cbt)
Returns true if we have seen more than MAX_RECENT_TIMEOUT_COUNT of the past RECENT_CIRCUITS time out after the first hop. Used to detect if the network connection has changed significantly, and if so, resets our circuit build timeout to the default.
Also resets the entire timeout history in this case and causes us to restart the process of building test circuits and estimating a new timeout.
Definition at line 1550 of file circuitstats.c.
Referenced by circuit_build_times_count_timeout().
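One way to picture this check is a fixed-size window of recent outcomes that is counted and then cleared when too many of them were timeouts. The sketch below is only an illustration under assumptions: the window size, the threshold, the type example_recent_t, and both function names are hypothetical, and the strict "more than" comparison follows the wording above rather than the actual source.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical window size and threshold, chosen small for clarity. */
#define EXAMPLE_RECENT_CIRCUITS 5
#define EXAMPLE_MAX_RECENT_TIMEOUTS 3

typedef struct {
  int8_t timed_out[EXAMPLE_RECENT_CIRCUITS]; /* 1 = timeout, 0 = success */
  int next_idx;                              /* next slot to overwrite */
} example_recent_t;

/* Record the outcome of one circuit attempt in the circular window. */
static void
example_record_outcome(example_recent_t *r, int timed_out)
{
  assert(r);
  r->timed_out[r->next_idx] = timed_out ? 1 : 0;
  r->next_idx = (r->next_idx + 1) % EXAMPLE_RECENT_CIRCUITS;
}

/* Return 1 and clear the window if more than the allowed number of
 * recent attempts timed out; otherwise return 0 and change nothing. */
static int
example_network_changed(example_recent_t *r)
{
  assert(r);
  int count = 0;
  for (int i = 0; i < EXAMPLE_RECENT_CIRCUITS; i++)
    count += r->timed_out[i];
  if (count <= EXAMPLE_MAX_RECENT_TIMEOUTS)
    return 0;
  memset(r, 0, sizeof(*r)); /* start learning from scratch */
  return 1;
}
```

In this sketch a caller would also restore the default timeout and discard its build time history when the function returns 1; that part is omitted here.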
int circuit_build_times_network_check_live(const circuit_build_times_t *cbt)
When the network is not live, we do not record circuit build times.
The network is considered not live if there has been at least one circuit build that began and ended (had its close_ms measurement period expire) since we last received a cell.
Also has the side effect of rewinding the circuit time history in the case of recent liveness changes.
Definition at line 1530 of file circuitstats.c.
Referenced by circuit_build_times_count_close().
void circuit_build_times_network_circ_success(circuit_build_times_t *cbt)
Called to indicate that we "completed" a circuit. Because this circuit succeeded, it doesn't count as a timeout-after-the-first-hop.
(For the purposes of the cbt code, we consider a circuit "completed" if it has 3 hops, regardless of its final hop count. We do this because we're trying to answer the question, "how long should a circuit take to reach the 3-hop count".)
This is used by circuit_build_times_network_check_changed() to determine if we had too many recent timeouts and need to reset our learned timeout to something higher.
Definition at line 1410 of file circuitstats.c.
|
static |
A circuit was just forcibly closed. If there has been no recent network activity at all, but this circuit was launched back when we thought the network was live, increment the number of "nonlive" circuit timeouts.
This is used by circuit_build_times_network_check_live() to decide if we should record the circuit build timeout or not.
Definition at line 1471 of file circuitstats.c.
Referenced by circuit_build_times_count_close().
void circuit_build_times_network_is_live(circuit_build_times_t *cbt)
Called to indicate that the network showed some signs of liveness, i.e. we received a cell.
This is used by circuit_build_times_network_check_live() to decide if we should record the circuit build timeout or not.
This function is called every time we receive a cell. Avoid syscalls, events, and other high-intensity work.
Definition at line 1356 of file circuitstats.c.
Referenced by channel_do_open_actions(), and connection_or_launch_v3_or_handshake().
|
static |
A circuit just timed out. If it failed after the first hop, record it in our history for later deciding if the network speed has changed.
This is used by circuit_build_times_network_check_changed() to determine if we had too many recent timeouts and need to reset our learned timeout to something higher.
Definition at line 1439 of file circuitstats.c.
Referenced by circuit_build_times_count_timeout().
void circuit_build_times_new_consensus_params(circuit_build_times_t *cbt, const networkstatus_t *ns)
This function is called when we get a consensus update.
It checks to see if we have changed any consensus parameters that require reallocation or discard of previous stats.
Definition at line 426 of file circuitstats.c.
int circuit_build_times_parse_state(circuit_build_times_t *cbt, or_state_t *state)
Load histogram from state, shuffling the resulting array after we do so. Use this result to estimate parameters and calculate the timeout.
Return -1 on error.
Definition at line 1006 of file circuitstats.c.
double circuit_build_times_quantile_cutoff(void)
Retrieve and bounds-check the cbtquantile consensus parameter.
Effect: This is the position on the quantile curve to use to set the timeout value. It is a percent (10-99).
Definition at line 267 of file circuitstats.c.
Referenced by circuit_build_times_close_quantile(), and circuit_build_times_set_timeout_worker().
|
static |
Retrieve and bounds-check the cbtrecentcount consensus parameter.
Effect: This is the number of circuit build times to keep track of for deciding if we hit cbtmaxtimeouts and need to reset our state and learn a new timeout.
Definition at line 401 of file circuitstats.c.
Referenced by circuit_build_times_init(), and circuit_build_times_new_consensus_params().
void circuit_build_times_reset(circuit_build_times_t *cbt)
Reset the build time state.
Leave estimated parameters, timeout and network liveness intact for future use.
Definition at line 545 of file circuitstats.c.
Referenced by circuit_build_times_network_check_changed().
|
static |
Non-destructively scale all of our circuit success, timeout, and close counts down by a factor of two. Scaling in this way preserves the ratios between succeeded vs timed out vs closed circuits, so that our statistics don't change when we scale.
This is used only in the rare event that we build more than INT32_MAX circuits. Since the num_circ_* variables are uint32_t, we won't even be close to overflowing them.
Definition at line 1389 of file circuitstats.c.
Referenced by circuit_build_times_network_circ_success(), circuit_build_times_network_close(), and circuit_build_times_network_timeout().
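The ratio-preserving scaling can be sketched as follows. The struct and its field names are hypothetical stand-ins for the num_circ_* counters mentioned above, and the trigger condition in example_count_success() is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the success, timeout, and close counters. */
typedef struct {
  uint32_t num_succeeded;
  uint32_t num_timeouts;
  uint32_t num_closed;
} example_counts_t;

/* Halve every counter so that the ratios between them are preserved
 * (up to integer rounding). */
static void
example_scale_counts(example_counts_t *c)
{
  assert(c);
  c->num_succeeded /= 2;
  c->num_timeouts /= 2;
  c->num_closed /= 2;
}

/* Count one more successful circuit, scaling first if the counter has
 * reached INT32_MAX. Since the counters are uint32_t, this leaves a
 * wide margin before any real overflow. */
static void
example_count_success(example_counts_t *c)
{
  assert(c);
  if (c->num_succeeded >= (uint32_t)INT32_MAX)
    example_scale_counts(c);
  c->num_succeeded++;
}
```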
void circuit_build_times_set_timeout(circuit_build_times_t *cbt)
Exposed function to compute a new timeout. Dispatches events and also filters out extremely high timeout values.
Definition at line 1754 of file circuitstats.c.
|
static |
Estimate a new timeout based on history and set our timeout variable accordingly.
Definition at line 1707 of file circuitstats.c.
Referenced by circuit_build_times_set_timeout().
|
static |
Shuffle the build times array.
Adapted from https://en.wikipedia.org/wiki/Fisher-Yates_shuffle
Definition at line 963 of file circuitstats.c.
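For readers unfamiliar with the algorithm, here is a minimal Fisher-Yates sketch over an array of build times. It is not the code at line 963: the function name is hypothetical, and rand() is used only as a stand-in random source. rand() is not cryptographically strong, and taking it modulo (i + 1) introduces a small bias, so a real implementation would want a better generator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Shuffle the n entries of times_ms in place with the Fisher-Yates
 * algorithm: walk from the last element down, swapping each element
 * with a randomly chosen element at or before it. */
static void
example_shuffle(uint32_t *times_ms, size_t n)
{
  assert(times_ms || n == 0);
  if (n < 2)
    return;
  for (size_t i = n - 1; i > 0; i--) {
    /* Stand-in random index in [0, i]; see the caveats above. */
    size_t j = (size_t)rand() % (i + 1);
    uint32_t tmp = times_ms[i];
    times_ms[i] = times_ms[j];
    times_ms[j] = tmp;
  }
}
```

The shuffle only reorders the array, so the set of values it contains is unchanged.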
|
static |
Retrieve and bounds-check the cbttestfreq consensus parameter.
Effect: Describes how often in seconds to build a test circuit to gather timeout values. Only applies if fewer than 'cbtmincircs' have been recorded.
Definition at line 324 of file circuitstats.c.
Referenced by circuit_build_times_needs_circuits_now().
double circuit_build_times_timeout_rate(const circuit_build_times_t *cbt)
Count the number of timeouts in a set of cbt data.
Definition at line 1612 of file circuitstats.c.
STATIC int circuit_build_times_update_alpha(circuit_build_times_t *cbt)
Estimates the Xm and Alpha parameters using https://en.wikipedia.org/wiki/Pareto_distribution#Parameter_estimation
The notable difference is that we use the mode instead of the minimum to estimate Xm. This is because our distribution is Fréchet-like. We claim this is an acceptable approximation because we are only concerned with the accuracy of the CDF of the tail.
Definition at line 1138 of file circuitstats.c.
Referenced by circuit_build_times_set_timeout_worker().
void circuit_build_times_update_state(const circuit_build_times_t *cbt, or_state_t *state)
Output a histogram of current circuit build times to the or_state_t state structure.
Definition at line 917 of file circuitstats.c.
double get_circuit_build_close_time_ms(void)
Return the time to wait before actually closing an under-construction circuit, in milliseconds.
Definition at line 93 of file circuitstats.c.
double get_circuit_build_timeout_ms(void)
Return the time to wait before giving up on an under-construction circuit, in milliseconds.
Definition at line 101 of file circuitstats.c.
Referenced by circuit_build_times_handle_completed_hop().
const circuit_build_times_t* get_circuit_build_times(void)
Return a pointer to the data structure describing our current circuit build time history and computations.
Definition at line 78 of file circuitstats.c.
circuit_build_times_t* get_circuit_build_times_mutable(void)
As get_circuit_build_times, but return a mutable pointer.
Definition at line 85 of file circuitstats.c.
Referenced by channel_do_open_actions(), and connection_or_launch_v3_or_handshake().
|
static |
Global list of circuit build times.
Definition at line 65 of file circuitstats.c.
Referenced by get_circuit_build_close_time_ms(), get_circuit_build_timeout_ms(), get_circuit_build_times(), and get_circuit_build_times_mutable().