Research

Theory of sampling and its application in tissue based diagnosis

Klaus Kayser1*, Holger Schultz2, Torsten Goldmann2, Jürgen Görtler3, Gian Kayser4 and Ekkehard Vollmer2

Author Affiliations

1 UICC-TPCC, Institute of Pathology, Charite, Berlin, Germany

2 Clin. & Exp. Pathology, Research Center Borstel, Borstel, Germany

3 Deep Computing, IBM, Amsterdam, the Netherlands

4 Institute of Pathology, University of Freiburg, Freiburg, Germany


Diagnostic Pathology 2009, 4:6  doi:10.1186/1746-1596-4-6

Published: 16 February 2009



A general theory of sampling and its application in tissue based diagnosis is presented. Sampling is defined as the extraction of information from certain limited spaces and its transformation into a statement or measure that is valid for the entire (reference) space. The procedure should be reproducible in time and space, i.e. give the same results when applied under similar circumstances. Sampling comprises two aspects: the procedure of sample selection and the efficiency of its performance. In practice, sample selection focuses on two tasks: the search for the localization of specific compartments within the basic space, and the search for the presence of specific compartments.


When a sampling procedure is applied in diagnostic processes, two different aspects can be distinguished: I) the diagnostic significance of a certain object, i.e. the probability that the object can be grouped into a certain diagnosis, and II) the probability of detecting these basic units. Sampling can be performed with or without external knowledge, such as the size of the searched objects, neighbourhood conditions, the spatial distribution of objects, etc. If the sample size is much larger than the object size, the application of a translation-invariant transformation results in Krige's formula, which is widely used in the search for ores. Usually, sampling is performed as a series of area (space) selections of identical size. The size can be defined in relation to the reference space or according to interspatial relationships. The first method is called random sampling, the second stratified sampling.


Random sampling does not require knowledge about the reference space, and is used to estimate the number and size of objects. Estimated features include area (volume) fraction and numerical, boundary and surface densities. Stratified sampling requires knowledge of the objects (and their features) and evaluates spatial features in relation to the detected objects (for example, the grey value distribution around an object). It also serves to define the parameters of the probability function in so-called active segmentation.
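As an illustration of random sampling with selections of identical size, the following minimal sketch estimates an area fraction on a synthetic binary image. It is not taken from the paper: the image, the disc-shaped "objects", the tile size and all function names are hypothetical, and tiles are drawn with replacement from a regular grid so that every pixel has the same chance of being sampled.

```python
import random

def make_image(width, height, discs):
    """Synthetic binary image (list of rows): 1 inside any disc (cx, cy, r), else 0."""
    return [[1 if any((x - cx) ** 2 + (y - cy) ** 2 <= r * r for cx, cy, r in discs) else 0
             for x in range(width)]
            for y in range(height)]

def true_area_fraction(image):
    """Exact object area fraction over the whole reference space."""
    return sum(map(sum, image)) / sum(len(row) for row in image)

def random_sampling_area_fraction(image, window, n_windows, rng):
    """Estimate the area fraction from n_windows randomly chosen square tiles.

    Assumption: image width and height are multiples of `window`, so the
    tiles form a regular grid and each pixel is equally likely to be drawn.
    """
    height, width = len(image), len(image[0])
    hits = 0
    for _ in range(n_windows):
        x0 = window * rng.randrange(width // window)   # random tile column
        y0 = window * rng.randrange(height // window)  # random tile row
        hits += sum(sum(image[y][x0:x0 + window]) for y in range(y0, y0 + window))
    return hits / (n_windows * window * window)

rng = random.Random(0)  # fixed seed so the run is repeatable
image = make_image(200, 200, [(50, 50, 20), (140, 60, 30), (100, 150, 25)])
estimate = random_sampling_area_fraction(image, window=10, n_windows=200, rng=rng)
print(f"true fraction: {true_area_fraction(image):.3f}, estimate: {estimate:.3f}")
```

No knowledge of the objects enters the selection of tiles, which is the property attributed to random sampling above; a stratified variant would instead place the tiles relative to objects that have already been detected.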


The method is useful for the standardization of images derived from immunohistochemically stained slides, and is implemented in the EAMUS™ system. It can also be applied to the search for "objects possessing an amplification function", i.e. a rare event with a "steering function". A formula to calculate the efficiency and potential error rate of the described sampling procedures is given.
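The formula itself is not reproduced in this abstract. As a rough intuition for why the search for rare events depends so strongly on the number of samples, the sketch below uses a simple textbook model rather than the authors' formula: it assumes that each sampled field independently contains the searched object with probability `p_field`. The function names and the example values are hypothetical.

```python
import math

def detection_probability(p_field, n_fields):
    """Probability that at least one of n_fields independent fields contains the object."""
    if not 0.0 < p_field < 1.0:
        raise ValueError("p_field must lie strictly between 0 and 1")
    return 1.0 - (1.0 - p_field) ** n_fields

def fields_needed(p_field, target):
    """Smallest number of independent fields giving a detection probability >= target."""
    if not 0.0 < p_field < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("p_field and target must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p_field))

# Example: an object present in 5% of fields, to be found with 95% probability.
n = fields_needed(0.05, 0.95)
print(n, round(detection_probability(0.05, n), 3))
```

Under this independence assumption, the probability of missing the object is (1 - p_field) raised to the number of fields, so the error rate falls geometrically as more fields are sampled.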