randomisation(1) | Generation of whole or part of the RANDOMISATION SET. Also see : RANDOMISATION(3), RE-RANDOMISATION. |

sample size | 1) The number of experimental units on which observations are considered. This may be less than the number of observations in a data-set, due to the possible multiplying effects of multiple variables and/or REPEATED MEASURES within the EXPERIMENTAL DESIGN. 2) The number of elements in a sample from a population. 3) The number of units chosen from a population or an environment. 4) Sample size (volume); see sampling size. |

randomisation(2) | The process of arranging for data-collection, in accordance with the EXPERIMENTAL DESIGN, such that there should be no foreseeable possibility of any systematic relationship between the data and any measurable characteristic of the procedure by which the |

re-randomisation | The process of generating alternative arrangements of given data which would be consistent with the EXPERIMENTAL DESIGN. Also see : BOOTSTRAP, EXACT TEST(2), EXHAUSTIVE RE-RANDOMISATION, MONTE-CARLO, RE-RANDOMISATION STATISTICS. |

randomisation test | The rationale of a RANDOMISATION TEST involves exploring RE-RANDOMISATIONs of the actual data to form the RANDOMISATION DISTRIBUTION of values of the TEST STATISTIC. The OUTCOME VALUE of the TEST STATISTIC is judged in terms of its relative position |

outcome value | The value of the TEST STATISTIC for the data as initially observed, before any RE-RANDOMISATION. |

tail definition policy | This is a defined method for dividing a DISCRETE DISTRIBUTION into a TAIL area and a body area. The scope for differing policies arises due to the non-infinitesimal amount of probability measure which may be associated with the ACTUAL OUTCOME value. The con |

measurement type | This is a distinction regarding the relationship between a phenomenon being measured and the data as recorded. The main distinctions are concerned with the meaningfulness of numerical comparisons of data (NOMINAL SCALE versus ORDINAL SCALE versus INTERVAL |

stratified | This is a feature of an EXPERIMENTAL DESIGN whereby a scheme of observations is repeated entirely using further sets (strata) of experimental units, with each such further set distinguished by a level of a categorical variable which is distinct from any c |

replications | This is a feature of an EXPERIMENTAL DESIGN whereby observations on an experimental unit are repeated under the same conditions. Identification of the position of a particular observation within the sequence of replications is irrelevant. Also see : REPEA |

repeated-measures | This is a feature of an EXPERIMENTAL DESIGN whereby several observations measured on a common scale refer to the same sampling unit. Identification of the relation of the individual observations to the EXPERIMENTAL DESIGN is crucial to this definition. Ex |

gold standard(2) | The idea of a re-randomisation test as a standard of correctness by which to judge other tests which are not based upon principles of RE-RANDOMISATION. |

gold standard(1) | The GOLD STANDARD is the form of test which is most faithful to the RANDOMISATION DISTRIBUTION, for a given TEST STATISTIC and EXPERIMENTAL DESIGN. This involves EXHAUSTIVE RANDOMISATION. Other RANDOMISATION TESTs may reasonably be judged by comparison wi |

tied ranks | In a NONPARAMETRIC TEST involving RANKED DATA, if two data have TIED VALUES then they will deserve to receive the same rank value. It is generally agreed that this should be the average of the ranks which would have been assigned if the values had been di |
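
The averaging convention for tied ranks described above can be sketched as follows. This is a minimal illustration, not part of the original glossary; the function name `average_ranks` is hypothetical.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Mean of the rank positions start+1 .. end+1 that the tied block occupies.
        mean_rank = (start + 1 + end + 1) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks
```

For example, the two tied values in `[10, 20, 20, 30]` would occupy ranks 2 and 3, so each receives 2.5.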

null hypothesis | 1) In order to test whether a supposed interesting pattern exists in a set of data, it is usual to propose a NULL HYPOTHESIS that the pattern does not exist. It is the unexpectedness of the degree of departure of the observed data, relative to the pattern ex 2) In hypothesis testing, the hypothesis we wish to falsify on the basis of the data. The null hypothesis is typically that something is not present, that there is no effect, or that there is no difference between treatment and control. 3) A hypothesis of the form: there is no difference between A and B. This form of the hypothesis is the basis for statistical tests of significance, such as the t-test and the F-test. In the t-test, A and B are mean values. In the F-test, A and B are variances (squares of standard deviations) |

pas2c | One of a number of PROGRAMs for undertaking translations between STANDARD PROGRAMMING LANGUAGES. |

randomisation(3) | One of the arrangements making up the RANDOMISATION SET. These arrangements will be encountered in the act of RANDOMISATION(1). Also see : BRANCH AND BOUND, MINIMAL-CHANGE SEQUENCE. |

p-value | 1) The ALPHA value arising from a statistical test. Also see : EXACT TEST(2) 2) Suppose we have a family of hypothesis tests of a null hypothesis that let us test the hypothesis at any significance level p between 0 and 100% we choose. The P value of the null hypothesis given the data is the smallest significance level p for which any of the tests would have rejected the null hypothesis. |

programmable | The characteristic of a COMPUTER which enables it to be used to undertake a variety of different processes on different occasions. Also see : ALGORITHM(2), PROGRAM, PROGRAMMING LANGUAGE, STANDARD PROGRAMMING LANGUAGE. |

exact test(1) | The characteristic of a RE-RANDOMISATION TEST based upon EXHAUSTIVE RE-RANDOMISATION, that the value of ALPHA will be fixed irrespective of any random sampling of RANDOMISATIONS or upon any distributional assumptions. Notable examples are the EXACT BINOMI |

randomisation set | The collection of possible RE-RANDOMISATIONs of data within the constraints of the EXPERIMENTAL DESIGN. Also see : RANDOMISATION DISTRIBUTION. |

poisson distribution | 1) The distribution of number of events in a given time, arising from a POISSON PROCESS. This differs from the BINOMIAL DISTRIBUTION in that there is no upper limit, corresponding to the parameter 'n' of a BINOMIAL PROCESS, to the number of events which may 2) The Poisson distribution is a discrete probability distribution that depends on one parameter, m. If X is a random variable with the Poisson distribution with parameter m, then the probability that X = k is |

factorial | 1) The FACTORIAL operator is applicable to a non-negative integer quantity. It is notated as the postfixed symbol '!'. The resulting value is the product of the increasing integer values from 1 up to the value of the argument quantity. For instance : 3! is 1 2) For an integer k that is greater than or equal to 1, k! (pronounced "k factorial") is k×(k−1)×(k−2)× …×1. By convention, 0! = 1. There are k! ways of ordering k distinct objects. For example, 9! is the number of batting orders of 9 baseball players, and 52! is the number of different ways a standard deck of playing cards can be ordered. |
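
As a minimal sketch of the definition above (the function name `factorial` is chosen here for illustration; Python's standard library also provides `math.factorial`):

```python
def factorial(k):
    """Product of the integers 1..k, with 0! = 1 by convention."""
    if k < 0:
        raise ValueError("factorial is defined only for non-negative integers")
    result = 1
    for i in range(2, k + 1):
        result *= i
    return result
```

Under this sketch, `factorial(9)` gives the number of batting orders of 9 players mentioned in the entry.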

binomial distribution | 1) This is a special case of the MULTINOMIAL DISTRIBUTION where the number of possible outcomes is 2. It is the distribution of outcomes expected if a certain number of independent trials are undertaken of a single BERNOUILLI PROCESS (e.g. multiple tosses of 2) A random variable has a binomial distribution (with parameters n and p) if it is the number of "successes" in a fixed number n of independent random trials, all of which have the same probability p of resulting in "success." Under these assumptions, the probability of k successes (and n−k failures) is nCk × p^k × (1−p)^(n−k), where nCk is the number of combinations of n objects taken k at a time: nCk = n!/(k!(n−k)!). The expected value of a random variable with the Binomial distribution is n×p, and the standard error of a random variable with the Binomial distribution is (n×p×(1−p))^(1/2). |
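
The probability formula in definition 2) can be sketched directly. The function names below are illustrative only and do not come from the glossary.

```python
from math import comb, sqrt


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_mean(n, p):
    """Expected number of successes, n*p."""
    return n * p


def binomial_se(n, p):
    """Standard error of the number of successes, sqrt(n*p*(1-p))."""
    return sqrt(n * p * (1 - p))
```

Summing `binomial_pmf(k, n, p)` over k = 0..n should give 1, which is a quick check on any implementation.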

binomial test | This is a statistical test referring to a repeated binary process such as would be expected to generate outcomes with a BINOMIAL DISTRIBUTION. A value for the parameter 'p' is hypothesised (null hypothesis) and the difference of the actual value from this |

2-by-2 table | This is a TWO-WAY TABLE where the numbers of levels of the row- and column-classifications are each 2. If the row- and column- classifications each divide the observational units into subsets, then it is likely that it will be useful to analyse the data u |

equivalent test statistic | Within a RANDOMISATION SET, it is possible that two different STATISTICs may be inter-related in a manner which is provably monotonic irrespective of the data. In such a situation a RANDOMISATION TEST performed on either of these TEST STATISTICs will nece |

type-1 error | |

error types | |

statistical significance | |

type-2 error | |

freeman-halton test | |

scale type | |

extended pascal | |

permutation test | |

2-way table | |

chi-squared distribution | Where expected frequencies are sufficiently high, hypothesised distributions of counts may be approximated by a NORMAL DISTRIBUTION rather than an exact BINOMIAL DISTRIBUTION. The corresponding distribution of the CHI-SQUARED STATISTIC can be derived alge |

tied values | Where data are represented by ranks, TIED VALUES lead to TIED RANKS. Whether or not data are represented by ranks, for any TEST STATISTIC the occurrence of TIED VALUES will increase the extent to which a RANDOMISATION DISTRIBUTION will be a DISCRETE DISTR |

ratio scale | This is a type of MEASUREMENT SCALE for which it is meaningful to reason in terms of differences in scores (see INTERVAL SCALE) and also in terms of ratios of scores. Such a scale will have a zero point which is meaningful in the sense that it indicates c |

nominal scale | This is a type of MEASUREMENT SCALE with a limited number of possible outcomes which cannot be placed in any order representing the intrinsic properties of the measurements. Examples : Female versus Male; the collection of languages in which an internatio |

object code | This is the code which a COMPUTER recognises and acts upon as a direct consequence of its electromechanical construction. Typically such code is highly abstract and unsuitable for general use by human programmers. The OBJECT CODE to specify a certa |

multinomial distribution | 1) This is the distribution of outcomes expected if a certain number of independent trials are undertaken of several separate BERNOUILLI PROCESSes, to determine a number of alternative outcomes. A special case, where the number of outcomes is 2, is the BIN 2) Consider a sequence of n independent trials, each of which can result in an outcome in any of k categories. Let pj be the probability that each trial results in an outcome in category j, j = 1, 2, … , k, so |

resampling stats | This is the name of an educational initiative involving the use of a PROGRAMMING LANGUAGE, in the form of an INTERPRETER, allowing the user to specify MONTE-CARLO RESAMPLING of a set of data and accumulation of the RANDOMISATION DISTRIBUTION of a defined |

exact-stats | This is the name of the academic initiative which produced this present glossary. EXACT-STATS is a closed e-mail based discussion group for the development and promulgation of the ideas of re-randomisation statistics. The contact address is : exact-stats@ |

ranked data | This refers to the practice of taking a set of N data, to be regarded as ORDINAL-SCALE, and replacing each datum by its rank (1 .. N) within the set. Also see : WILCOXON RANK-SUM TEST. |

logistic regression | This relates to an EXPERIMENTAL DESIGN for predicting a binary categorical (yes/no) outcome on the basis of predictor variables measured on INTERVAL SCALEs. For each of a set of values of the predictor variables, the outcomes are regarded as representing a |

permutation | 1) This term has a distinct mathematical definition, but is also commonly used as a synonym for RE-RANDOMISATION. 2) A permutation of a set is an arrangement of the elements of the set in some order. If the set has n things in it, there are n! different orderings of its elements. For the first element in an ordering, there are n possible choices, for the second, there remain n−1 possible choices, for the third, there are n−2, etc., and for the nth element of the ordering, there is a single choice remaining. By the fundamental rule of counting, the total number of sequences is thus n×(n−1)×(n−2)×…×1. Similarly, the number of orderings of length k one can form from n≥k things is n×(n−1)×(n−2)×…×(n−k+1) = n!/(n−k)!. This is denoted nPk, the number of permutations of n things taken k at a time. C.f. combinations. |
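
The counting argument in definition 2) can be checked by brute force. The sketch below uses the standard-library `itertools.permutations`; the helper name `n_permute_k` is hypothetical.

```python
from itertools import permutations
from math import factorial


def n_permute_k(n, k):
    """Number of orderings of length k drawn from n things: n!/(n-k)!."""
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    return factorial(n) // factorial(n - k)


# Brute-force count of the orderings of length 2 from 5 distinct objects.
orderings_of_two_from_five = len(list(permutations(range(5), 2)))
```

Here the formula and the enumeration should agree: 5×4 = 20 orderings.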

experimental design | This term overtly refers to the planning of a process of data collection. The term is also used to refer to the information necessary to describe the interrelationships within a set of data. Such a description involves considerations such as number of cas |

wilcoxon rank-sum test | |

bootstrap | [()] This is a form of RANDOMISATION TEST which is one of the alternatives to EXHAUSTIVE RE-RANDOMISATION. The BOOTSTRAP scheme involves generating subsets of the data on the basis of random sampling with replacements as the data are sampled. Such resampl |

fisher test(1) | [Named after the statistician RA Fisher()]. This is an EXACT TEST(1) to examine whether the pattern of counts in a 2x2 cross classification departs from expectations based upon the marginal totals for the rows and columns. Such a test is useful to examine |

chi-squared statistic | [Named by E.S. Pearson ()?]. This is a long-established TEST STATISTIC for measuring the extent to which a set of categorical outcomes depart from a hypothesised set of probabilities. It is calculated as a sum of terms over the available categories, where |

mid-p | [Proposed by H.O. Lancaster(), and further promoted by G.A. Barnard] This is a TAIL DEFINITION POLICY that the ALPHA value should be calculated as the sum of the proportion of the TAIL for data strictly more extreme than the OUTCOME, plus one half of the p |
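
A minimal sketch of the MID-P policy as it is usually stated: the proportion of the distribution strictly more extreme than the outcome, plus half the proportion equal to it. The sketch assumes an upper-tail test on a list of TEST STATISTIC values; the function names are hypothetical.

```python
def mid_p(distribution, outcome):
    """Upper-tail mid-p: P(value > outcome) + 0.5 * P(value == outcome)."""
    n = len(distribution)
    more_extreme = sum(1 for v in distribution if v > outcome)
    equal = sum(1 for v in distribution if v == outcome)
    return (more_extreme + 0.5 * equal) / n


def conventional_p(distribution, outcome):
    """Upper-tail value counting the outcome itself in full, for comparison."""
    return sum(1 for v in distribution if v >= outcome) / len(distribution)
```

With a DISCRETE DISTRIBUTION the two policies differ by half the probability attached to the outcome value itself.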

interval scale | A characteristic of data such that the difference between two values measured on the scale has the same substantive meaning/significance irrespective of the common level of the two values being compared. This implies that scores may meaningfully be added |

randomisation distribution | A collection of values of the TEST STATISTIC obtained by undertaking a number of RE-RANDOMISATIONS of the actual data within the RANDOMISATION SET. Also see : CONFIDENCE INTERVAL, RANDOMISATION TEST. |

relative power | A comparison of two or more statistical tests, for the same EXPERIMENTAL DESIGN, SAMPLE SIZE, and NOMINAL ALPHA CRITERION VALUE, in terms of the respective values of POWER. Also see : BETA. |

algorithm(1) | A formal statement, clear complete and unambiguous, of how a certain process needs to be undertaken. Also see : ALGORITHM(2). |

ordinal scale | A MEASUREMENT TYPE for which the relative values of data are defined solely in terms of being lesser, equal-to or greater as compared with other data on the ORDINAL SCALE. These characteristics may arise from categorical rating scales, or from converting I |

non-parametric test | A number of statistical tests were devised, mostly over the period 1930-1960, with the specific objective of by-passing assumptions about sampling from populations with data supposedly conforming to theoretically modelled statistical distributions such as |

statistic | 1) A number or code derived by a prior-defined consistent process of calculation, from a set of data. Also see : ALGORITHM(1), TEST STATISTIC. 2) A number that can be computed from data, involving no unknown parameters. As a function of a random sample, a statistic is a random variable. Statistics are used to estimate parameters, and to test hypotheses. |

wilcoxon test(1) | [Named after the statistician F. Wilcoxon ()] This test applies to an EXPERIMENTAL DESIGN involving two REPEATED MEASURE observations on a common set of experimental units, which need be only ORDINAL-SCALE. The purpose is to measure shift in scale locatio |

wilcoxon test(2) | [Named after the statistician F. Wilcoxon ()] This is a test for an EXPERIMENTAL DESIGN involving two INDEPENDENT GROUPS of experimental units, where data need be only ORDINAL-SCALE. The purpose is to measure shift in scale location between the two groups |

fisher test(2) | [()] This is also known as the FREEMAN-HALTON TEST. It is an extension of the logic of the FISHER TEST(1), for a 2-way classification of counts where the extent of the cross-classification may be greater than 2x2. The RANDOMISATION SET for an EXHAUSTIVE R |

bernouilli process | [()] This is the simplest probability model - a single trial between two possible outcomes such as a coin toss. The distribution depends upon a single parameter,'p', representing the probability attributed to one defined outcome out of the two possible ou |

stevens' typology | [()] This is a widely observed scheme of distinctions between types of MEASUREMENT SCALEs according to the meaningfulness of arithmetic which may be performed upon data values. The types are : NOMINAL SCALE versus ORDINAL SCALE versus INTERVAL SCALE versus |

shift algorithm | [()]. ALGORITHMs employing BRANCH-AND-BOUND methods for the PITMAN PERMUTATION TEST(1) and the PITMAN PERMUTATION TEST(2). |

normal distribution | 1) [] The NORMAL DISTRIBUTION is a theoretical distribution applicable for continuous INTERVAL-SCALE data. It is related mathematically to the BINOMIAL and CHI-SQUARE(2) distributions and to several named sampling distributions (including Student's t, Fisher 2) A random variable X has a normal distribution with mean m and standard error s if for every pair of numbers a ≤ b, the chance that a < (X−m)/s < b is |

mann-whitney test | [Devised by ()] This is a test of difference in location for an EXPERIMENTAL DESIGN involving two samples with data measured on an ORDINAL SCALE or better. The TEST STATISTIC is a measure of ordinal precedence. For each possible pairing of an observation |

fortran | 1) [Name is an acronym : FORmula TRANslator]. A very long established and widely implemented PROGRAMMING LANGUAGE, specialised substantially for numerical applications. A number of STANDARD PROGRAMMING LANGUAGE versions of FORTRAN have been established at various 2) formula translation (computer language) |

monte-carlo test | [Named after the famous site of gambling casinos] A MONTE-CARLO TEST involves generating a random subset of the RANDOMISATION SET, sampled without replacement, and using the values of the TEST STATISTIC to generate an estimate of the form of the full RAND |
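
A minimal sketch of a MONTE-CARLO TEST, under assumptions that go beyond the truncated entry: two independent groups, DIFFERENCE OF MEANS as the TEST STATISTIC, a two-sided comparison, and a fixed seed for repeatability. Distinct arrangements are tracked so that the RANDOMISATION SET is sampled without replacement, as the entry describes. The function name and parameters are hypothetical.

```python
import random
from math import comb


def monte_carlo_test(group_a, group_b, n_samples=200, seed=0):
    """Estimate a two-sided ALPHA value from a random subset of the randomisation set."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    pooled = list(group_a) + list(group_b)
    n_total, n_a = len(pooled), len(group_a)

    def diff_means(idx_a):
        a = [pooled[i] for i in idx_a]
        b = [pooled[i] for i in range(n_total) if i not in idx_a]
        return sum(a) / len(a) - sum(b) / len(b)

    observed = abs(diff_means(frozenset(range(n_a))))
    # Never ask for more distinct arrangements than the randomisation set holds.
    n_samples = min(n_samples, comb(n_total, n_a))
    seen = set()
    extreme = 0
    while len(seen) < n_samples:
        idx_a = frozenset(rng.sample(range(n_total), n_a))
        if idx_a in seen:  # already drawn: skip, so sampling is without replacement
            continue
        seen.add(idx_a)
        # Small tolerance guards against floating-point noise in the comparison.
        if abs(diff_means(idx_a)) >= observed - 1e-12:
            extreme += 1
    return extreme / n_samples
```

The rejection step is simple but becomes slow when `n_samples` approaches the size of the RANDOMISATION SET; in that case EXHAUSTIVE RE-RANDOMISATION is the more natural choice.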

pascal | 1) [Named after the mathematician Blaise Pascal ( - )]. A PROGRAMMING LANGUAGE designed for clarity of expression when published in human-legible form, and for the teaching of programming. PASCAL is to some extent specialised for numerical work. A developmen 2) Unit of pressure in the metric (SI) system. |

pitman permutation test(1) | [Named after the statistician E.J. Pitman who described this test, and the PITMAN PERMUTATION TEST(2), in 1937; this is one of the earliest instances of an EXACT TEST(1)] An EXACT RE-RANDOMISATION TEST in which the TEST STATISTIC is the DIFFERENCE OF MEAN |

continuous distribution | A probability distribution of a continuous STATISTIC, based upon an algebraic formula, such that for any possible value of the cumulative probability there is an exact corresponding value of the STATISTIC in question. Also see : DISCRETE DISTRIBUTION. |

discrete distribution | A probability distribution of some STATISTIC, based upon an algebraic formula or upon re-randomisation or upon actual data, in which the cumulative probability increases in non-infinitesimal steps corresponding to non-infinitesimal weight associated with po |

poisson process | A process whereby events occur independently in some continuum (in many applications, time), such that the overall density (rate) is statistically constant but that it is impossible to improve any prediction of the position (time) of the next event by ref |

difference of means | A TEST STATISTIC of intuitive appeal for measuring difference in location between two samples with INTERVAL-SCALE data. Employing this TEST STATISTIC in an EXACT TEST defines the PITMAN PERMUTATION TESTs(1 or 2). |

exact test(2) | A test which yields an ALPHA value which does not depend upon the NOMINAL ALPHA CRITERION VALUE which may have been set for ALPHA. This is in contrast to the possible practice of producing only a yes/no decision with regard to a NOMINAL ALPHA CRITERION VA |

rng | 1) Acronym for Random Number Generator. This is a process which uses an arithmetic algorithm to generate sequences of PSEUDO-RANDOM numbers. Also see : SEED. 2) range |

re-randomisation statistics | Also known as PERMUTATION or RANDOMISATION(1) statistics. These are the specific area of concern of this present glossary. |

algorithm(2) | An ALGORITHM(1) expressed in a PROGRAMMING LANGUAGE for a COMPUTER. |

odds ratio | An alternative characterisation of the parameter 'p' for a BINOMIAL PROCESS is the ratio of the incidences of the two alternatives : p/(1-p) ; this quantity is termed the ODDS RATIO; the value may range from zero to infinity. This relates to a possible vi |

pitman permutation test(2) | An EXACT RE-RANDOMISATION TEST in which the TEST STATISTIC is the MEAN DIFFERENCE of a single sample of univariate data measured under two circumstances as REPEATED MEASURES. Also see : PITMAN PERMUTATION TEST(1) |
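
A minimal sketch of this test, assuming that the RE-RANDOMISATIONs consistent with a two-condition REPEATED MEASURES design are the 2^n ways of swapping the two observations within each unit (equivalently, flipping the sign of each difference), and assuming a two-sided comparison. The function name is hypothetical.

```python
from itertools import product


def pitman_paired_test(before, after):
    """Exact two-sided ALPHA value for the mean difference of paired data."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    observed = abs(sum(diffs)) / n
    extreme = 0
    total = 0
    # Exhaustively visit every assignment of signs to the n differences.
    for signs in product((1, -1), repeat=n):
        stat = abs(sum(s * d for s, d in zip(signs, diffs))) / n
        if stat >= observed - 1e-12:  # tolerance for floating-point noise
            extreme += 1
        total += 1
    return extreme / total
```

With four units whose differences all have the same sign, only the observed arrangement and its mirror image are as extreme, giving 2/16 = 0.125, the smallest two-sided value available for n = 4.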

degrees of freedom | An integer value measuring the extent to which an EXPERIMENTAL DESIGN imposes constraints upon the pattern of the mean values of data from various meaningful subsets of data. This value is frequently referred to in the organisation of tables of statistica |

branch-and-bound | Exploration of a RANDOMISATION DISTRIBUTION in such a way as to anticipate the effect of the next RANDOMISATION(3) relative to the present RANDOMISATION(3). This allows selective search of particular zones of a RANDOMISATION DISTRIBUTION; in the context o |

minimal-change sequence | Exploration of a RANDOMISATION DISTRIBUTION in such a sequence that the successive RANDOMISATION(3)s differ in a simple way. In the context of a RANDOMISATION TEST this can mean that the value of the TEST STATISTIC for a particular RANDOMISATION(3) may be |

exact binomial test | A STATISTICAL TEST referring to the BINOMIAL DISTRIBUTION in its exact algebraic form, rather than through continuous approximations which are used especially where sample sizes are substantial. Also see EXACT TEST(1). |

test statistic | 1) A STATISTIC measuring the strength of the pattern which a statistical test undertakes to detect. In the context of RE-RANDOMISATION TESTS one is concerned with the distribution of the values of the TEST STATISTIC over the RANDOMISATION SET. An example of 2) A statistic used to test hypotheses. An hypothesis test can be constructed by deciding to reject the null hypothesis when the value of the test statistic is in some range or collection of ranges. To get a test with a specified significance level, the chance when the null hypothesis is true that the test statistic falls in the range where the hypothesis would be rejected must be at most the specified significance level. The Z statistic is a common test statistic. |

compiler | A PROGRAM supplied especially for a particular type of COMPUTER, to enable the translation of code expressed in some PROGRAMMING LANGUAGE into OBJECT CODE for that COMPUTER. A COMPILER undertakes translation of the whole of the user's PROGRAM to produce a |

standard programming language | A PROGRAMMING LANGUAGE which has a publicly agreed common form across several different types of COMPUTER. Such standardisation allows a PROGRAM to be transported conveniently between the different types of COMPUTER and is thus suitable for communicating |

nominal alpha criterion level | A publicly agreed value for TYPE-1 ERROR, such that the outcome of a statistical test is classified in terms of whether the obtained value of ALPHA is extreme as compared with this criterion level. The fine detail of the comparison involves the TAIL DEFIN |

two-way table | A representation of suitable data in a table organised as rows and columns, such that the rows represent one scheme of alternatives covering the whole of the data represented, the columns represent a further scheme of alternatives covering the whole o |

decision rule | A rule for comparing the OUTCOME VALUE of ALPHA with a NOMINAL ALPHA CRITERION LEVEL (such as 0.05). An OUTCOME VALUE smaller (more extreme) than the NOMINAL ALPHA CRITERION LEVEL leads to a decision of STATISTICAL SIGNIFICANCE of the finding that the TES |

random sample | 1) A SAMPLE drawn from a POPULATION in such a way that every individual of the POPULATION has an equal chance of appearing in the SAMPLE. This ensures that the SAMPLE is REPRESENTATIVE, and provides the necessary basis for virtually all forms of inference fr 2) A random sample is a sample whose members are chosen at random from a given population in such a way that the chance of obtaining any particular sample can be computed. The number of units in the sample is called the sample size, often denoted n. The number of units in the population often is denoted N. Random samples can be drawn with or without replacing objects between draws; that is, drawing all n objects in the sample at once (a random sample without replacement), or drawing the objects one at a time, replacing them in the population between draws (a random sample with replacement). In a random sample with replacement, any given member of the population can occur in the sample more than once. In a random sample without replacement, any given member of the population can be in the sample at most once. A random sample without replacement in which every subset of n of the N units in the population is equally likely is also called a simple random sample. The term random sample with replacement denotes a random sample drawn in such a way that every n-tuple of units in the population is equally likely. See also probability sample. |

program | 1) A sequence of instructions expressed in some PROGRAMMING LANGUAGE. Also see ALGORITHM(2). 2) A programme: a series of interrelated activities for carrying out a project. |

exhaustive re-randomisation | A series of samples from a RANDOMISATION SET which is known to generate every RANDOMISATION. In particular, sampling which generates every RANDOMISATION exactly once. |
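
For a design with two independent groups, generating every RANDOMISATION exactly once amounts to enumerating every way of choosing which units form the first group. The sketch below assumes that design and uses the standard-library `itertools.combinations`; the generator name is hypothetical.

```python
from itertools import combinations


def exhaustive_rerandomisations(data, n_first):
    """Yield each split of data into (first group, second group) exactly once."""
    indices = range(len(data))
    for chosen in combinations(indices, n_first):
        chosen_set = set(chosen)
        first = [data[i] for i in chosen]
        second = [data[i] for i in indices if i not in chosen_set]
        yield first, second
```

Splits are generated by position rather than by value, so TIED VALUES in the data still produce distinct arrangements, and the total count is the number of combinations of N units taken `n_first` at a time.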

pseudo-random | A source of data which is effectively unpredictable although generated by a determinate process. Successive PSEUDO-RANDOM data are produced by a fixed calculation process acting upon preceding data from the PSEUDO-RANDOM sequence. To start the sequence it |

computer program | 1) A specification of how to undertake a certain process, usually expressed via a PROGRAMMING LANGUAGE, for some chosen COMPUTER. Also see : PROGRAM. 2) Machine program. 3) A program for the computer of the SAGE system. |

confidence interval | 1) For a given RE-RANDOMISATION distribution, a family of related distributions may be defined according to a range of hypothetical values of the pattern which the TEST STATISTIC measures. For instance, for the PITMAN PERMUTATION TEST(2) to test for a scale 2) A confidence interval for a parameter is a random interval constructed from data in such a way that the probability that the interval contains the true value of the parameter can be specified before the data are collected. |