
Information-theoretical limits of channel coding


Channel coding theorem and channel capacity


We continue to consider a binary block code with $k$ information bits per block and code words of length $n$, which results in the code rate $R = k/n$ with the unit “information bit / code symbol”.

As early as 1948, the brilliant information theorist Claude E. Shannon studied the correction capability of such codes in depth and specified, for every channel, a limit that follows solely from information-theoretical considerations. To this day no code has been found that exceeds this limit, and none ever will.

$\text{Shannon's channel coding theorem:}$ For every channel with channel capacity $C > 0$ there always exists (at least) one code whose error probability approaches zero, as long as the code rate $R$ is smaller than the channel capacity $C$. The prerequisite is that the block length of this code satisfies $n \to \infty$.

Remarks: 

  • The statement “the error probability approaches zero” is not identical to the statement “the transmission is error-free”. Example: in an infinitely long sequence, a finite number of symbols may be corrupted.
  • With some channels the error probability approaches zero even for $R = C$ (but not with all).


The converse of the channel coding theorem is also true and states:

$\text{Converse:}$ If the code rate $R$ is greater than the channel capacity $C$, then an arbitrarily small error probability cannot be achieved under any circumstances.

To derive and calculate the channel capacity, we first assume a digital channel with $M_x$ possible input values $x$ and $M_y$ possible output values $y$. Then the mean transinformation (English: mutual information) between the random variable $x$ at the channel input and the random variable $y$ at its output is given by:

\[I(x; y) = \sum_{i=1}^{M_x} \sum_{j=1}^{M_y} {\rm Pr}(x_i, y_j) \cdot {\rm log_2}\,\frac{{\rm Pr}(y_j\,|\,x_i)}{{\rm Pr}(y_j)} = \sum_{i=1}^{M_x} \sum_{j=1}^{M_y} {\rm Pr}(y_j\,|\,x_i) \cdot {\rm Pr}(x_i) \cdot {\rm log_2}\,\frac{{\rm Pr}(y_j\,|\,x_i)}{\sum_{k=1}^{M_x} {\rm Pr}(y_j\,|\,x_k) \cdot {\rm Pr}(x_k)}\hspace{0.05cm}.\]

In the transition from the first to the second equation, Bayes' theorem and the theorem of total probability were taken into account.

It should also be noted:

  • The logarithm dualis (binary logarithm) is denoted here by “log2”. In this tutorial we sometimes also write “ld” for it.
  • In contrast to the book Information Theory, in the following we do not distinguish between the random variables (uppercase letters $X$ or $Y$) and their realizations (lowercase letters $x$ or $y$).

$\text{Definition:}$ The channel capacity introduced by Shannon specifies the maximum transinformation $I(x; y)$ between the input variable $x$ and the output variable $y$:

\[C = \max_{{\rm Pr}(x_i)} \hspace{0.1cm} I(x; y) \hspace{0.05cm}.\]

The pseudo-unit “bit / channel access” must be added.

Since the transinformation has to be maximized over all possible (discrete) input distributions $ {\ rm Pr} (x_i) $, the channel capacity is independent of the input and thus a pure channel parameter.
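The maximization in the above definition can be illustrated with a small numerical sketch. The following Python snippet is not part of the original text; the transition values of the asymmetric binary channel and all function names are chosen purely for illustration. It evaluates the double-sum formula for $I(x; y)$ and approximates the capacity by a simple grid search over ${\rm Pr}(x = 0)$:

```python
import numpy as np

def transinformation(p_x, P_y_given_x):
    """I(x;y) in bit for p_x[i] = Pr(x_i) and P_y_given_x[i, j] = Pr(y_j | x_i)."""
    p_xy = p_x[:, None] * P_y_given_x        # joint probabilities Pr(x_i, y_j)
    p_y = p_xy.sum(axis=0)                   # theorem of total probability: Pr(y_j)
    ratio = P_y_given_x / p_y                # Pr(y_j | x_i) / Pr(y_j)
    mask = p_xy > 0                          # skip terms that contribute zero
    return float(np.sum(p_xy[mask] * np.log2(ratio[mask])))

# Illustrative asymmetric binary channel (transition values are assumptions):
P = np.array([[0.95, 0.05],                  # Pr(y = 0 | x = 0), Pr(y = 1 | x = 0)
              [0.20, 0.80]])                 # Pr(y = 0 | x = 1), Pr(y = 1 | x = 1)

# Channel capacity = maximum of I(x;y) over all input distributions Pr(x_i):
grid = np.linspace(0.001, 0.999, 999)
I_vals = [transinformation(np.array([p0, 1.0 - p0]), P) for p0 in grid]
best = int(np.argmax(I_vals))
print(f"C ≈ {I_vals[best]:.4f} bit/channel access at Pr(x=0) ≈ {grid[best]:.3f}")
```

For a symmetric channel the maximum is found at ${\rm Pr}(x = 0) = {\rm Pr}(x = 1) = 1/2$, in agreement with the BSC result of the next section.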

Channel capacity of the BSC model


We now apply these definitions to the BSC model (Binary Symmetric Channel):

\[I(x; y) = {\rm Pr}(y = 0\,|\,x = 0) \cdot {\rm Pr}(x = 0) \cdot {\rm log_2}\,\frac{{\rm Pr}(y = 0\,|\,x = 0)}{{\rm Pr}(y = 0)} + {\rm Pr}(y = 1\,|\,x = 0) \cdot {\rm Pr}(x = 0) \cdot {\rm log_2}\,\frac{{\rm Pr}(y = 1\,|\,x = 0)}{{\rm Pr}(y = 1)} +\]
\[\hspace{1.45cm} {\rm Pr}(y = 0\,|\,x = 1) \cdot {\rm Pr}(x = 1) \cdot {\rm log_2}\,\frac{{\rm Pr}(y = 0\,|\,x = 1)}{{\rm Pr}(y = 0)} + {\rm Pr}(y = 1\,|\,x = 1) \cdot {\rm Pr}(x = 1) \cdot {\rm log_2}\,\frac{{\rm Pr}(y = 1\,|\,x = 1)}{{\rm Pr}(y = 1)}\hspace{0.05cm}.\]

The channel capacity can be obtained from the following considerations (a short numerical sketch follows the list):

  • Maximizing over the input distribution leads to equally probable symbols:
\[{\rm Pr}(x = 0) = {\rm Pr}(x = 1) = 1/2 \hspace{0.05cm}.\]
  • Due to the symmetry recognizable from the model, the following applies at the same time:
\[{\rm Pr}(y = 0) = {\rm Pr}(y = 1) = 1/2 \hspace{0.05cm}.\]
  • We also take into account the BSC transition probabilities:
\[{\rm Pr}(y = 1\,|\,x = 0) = {\rm Pr}(y = 0\,|\,x = 1) = \varepsilon \hspace{0.05cm},\]
\[{\rm Pr}(y = 0\,|\,x = 0) = {\rm Pr}(y = 1\,|\,x = 1) = 1 - \varepsilon \hspace{0.05cm}.\]
  • Combining two terms at a time, one obtains:
\[C = 2 \cdot 1/2 \cdot \varepsilon \cdot {\rm log_2}\,\frac{\varepsilon}{1/2} + 2 \cdot 1/2 \cdot (1 - \varepsilon) \cdot {\rm log_2}\,\frac{1 - \varepsilon}{1/2} = \varepsilon \cdot {\rm log_2}\,2 - \varepsilon \cdot {\rm log_2}\,\frac{1}{\varepsilon} + (1 - \varepsilon) \cdot {\rm log_2}\,2 - (1 - \varepsilon) \cdot {\rm log_2}\,\frac{1}{1 - \varepsilon}\]
\[\Rightarrow \hspace{0.3cm} C = 1 - H_{\rm bin}(\varepsilon).\]
  • Here the binary entropy function is used:
\[H_{\rm bin}(\varepsilon) = \varepsilon \cdot {\rm log_2}\,\frac{1}{\varepsilon} + (1 - \varepsilon) \cdot {\rm log_2}\,\frac{1}{1 - \varepsilon}\hspace{0.05cm}.\]
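As announced above, here is a short numerical sketch of the closed-form result $C(\varepsilon) = 1 - H_{\rm bin}(\varepsilon)$. It is not part of the original text; the function names are ours:

```python
import numpy as np

def h_bin(eps):
    """Binary entropy function in bit; H_bin(0) = H_bin(1) = 0 by convention."""
    eps = np.asarray(eps, dtype=float)
    h = np.zeros_like(eps)
    inner = (eps > 0) & (eps < 1)            # avoid log2(0)
    e = eps[inner]
    h[inner] = -e * np.log2(e) - (1.0 - e) * np.log2(1.0 - e)
    return h

def bsc_capacity(eps):
    """BSC channel capacity C(eps) = 1 - H_bin(eps) in bit / channel access."""
    return 1.0 - h_bin(eps)

print(bsc_capacity(np.array([0.0, 0.1, 0.5, 0.9, 1.0])))
# approximately [1, 0.531, 0, 0.531, 1]  ->  symmetric in eps and 1 - eps
```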

The following graphic on the right shows the BSC channel capacity as a function of the corruption probability $\varepsilon$. On the left, the binary entropy function is shown for comparison; it was already defined in the chapter Memoryless Message Sources of the book "Information Theory".

Channel capacity of the BSC model

One recognizes from this representation:

  • The corruption probability $\varepsilon$ determines the channel capacity $C(\varepsilon)$. According to the channel coding theorem, error-free decoding with the best possible coding is only possible if the code rate $R$ is not greater than $C(\varepsilon)$.
  • With $\varepsilon = 10\%$, error-free decoding is not possible if the code rate $R > 0.531$, because $C(0.1) = 0.531$. At a corruption probability of 50 percent, error-free decoding is impossible even with an arbitrarily small code rate: $C(0.5) = 0$.
  • From the point of view of information theory, $\varepsilon = 1$ (inversion of all bits) is just as good as $\varepsilon = 0$ (error-free transmission). Likewise, $\varepsilon = 0.9$ is equivalent to $\varepsilon = 0.1$. Error-free decoding is achieved here simply by swapping zeros and ones, i.e. by a mapping.

Channel capacity of the AWGN model

We now consider the AWGN channel (Additive White Gaussian Noise). Its output signal is $y = x + n$, where $n$ describes a Gaussian distributed random variable whose expected values (moments) are:

$${\rm E}[n] = 0, \hspace{1cm} {\rm E}[n^2] = P_n.$$

Regardless of the input signal $x$ (analog or digital), the output signal $y$ is thus continuous-valued, and $M_y \to \infty$ must be used in the equation for the transinformation.
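The two moments can be checked with a short simulation of the discrete-time model $y = x + n$. The snippet below is only a sketch; the noise power $P_n = 0.5$ and the bipolar input are assumptions chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(seed=1)

P_n = 0.5                                        # assumed noise power E[n^2]
x = rng.choice([-1.0, +1.0], size=100_000)       # bipolar input symbols (illustrative)
n = rng.normal(loc=0.0, scale=np.sqrt(P_n), size=x.size)  # white Gaussian noise

y = x + n                                        # continuous-valued channel output

print(n.mean(), np.mean(n**2))                   # ≈ 0 and ≈ P_n  (the two moments above)
```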

The calculation of the channel capacity for the AWGN channel is only sketched here. The exact derivation can be found in the fourth main chapter “Discrete Value Information Theory” of the book Information Theory.

  • The input variable $x$ that is optimal with regard to maximum transinformation will certainly be continuous-valued; that is, for the AWGN channel not only $M_y \to \infty$ but also $M_x \to \infty$ applies.
  • While for a discrete-value input all probabilities ${\rm Pr}(x_i)$ have to be optimized, the optimization now takes place over the PDF $f_x(x)$ of the input signal, subject to the constraint of power limitation:
\[C = \max_{f_x(x)} \hspace{0.1cm} I(x; y) \hspace{0.05cm}, \hspace{0.3cm} \text{where} \hspace{0.15cm} {\rm E}\left[x^2\right] \le P_x \hspace{0.15cm} \text{must hold}\hspace{0.05cm}.\]
  • The optimization also yields a Gaussian distribution for the input PDF ⇒ $x$, $n$ and $y$ are Gaussian distributed with density functions $f_x(x)$, $f_n(n)$ and $f_y(y)$. We denote the corresponding powers by $P_x$, $P_n$ and $P_y$.
  • After a lengthy calculation one obtains for the channel capacity, using the logarithm dualis $\log_2(\cdot)$ - again with the pseudo-unit “bit / channel access”:
\[C = {\rm log_2}\,\sqrt{\frac{P_y}{P_n}} = {\rm log_2}\,\sqrt{\frac{P_x + P_n}{P_n}} = {1}/{2} \cdot {\rm log_2}\left(1 + \frac{P_x}{P_n}\right)\hspace{0.05cm}.\]
  • If $x$ describes a discrete-time signal with symbol rate $1/T_{\rm S}$, it must be band-limited to $B = 1/(2T_{\rm S})$, and the same bandwidth $B$ must also be applied to the noise signal $n$ ⇒ “noise bandwidth”:
\[P_x = \frac{E_{\rm S}}{T_{\rm S}}\hspace{0.05cm}, \hspace{0.4cm} P_n = \frac{N_0}{2T_{\rm S}}\hspace{0.05cm}.\]
  • Thus the AWGN channel capacity can also be expressed by the transmission energy per symbol $(E_{\rm S})$ and the noise power density $(N_0)$:
\[C = {1}/{2} \cdot {\rm log_2}\left(1 + {2E_{\rm S}}/{N_0}\right)\hspace{0.05cm}, \hspace{1.9cm} \text{Unit:} \hspace{0.3cm} \frac{\rm bit}{\text{channel access}}\hspace{0.05cm}.\]
  • The following equation gives the channel capacity per unit of time (denoted by $^{\star}$):
\[C^{\star} = \frac{C}{T_{\rm S}} = B \cdot {\rm log_2}\left(1 + {2E_{\rm S}}/{N_0}\right)\hspace{0.05cm}, \hspace{0.8cm} \text{Unit:} \hspace{0.3cm} \frac{\rm bit}{\text{time unit}}\hspace{0.05cm}.\]

$\text{Example 1:}$

  • For $E_{\rm S}/N_0 = 7.5$ ⇒ $10 \cdot \lg\, E_{\rm S}/N_0 = 8.75\,\rm dB$, one obtains the channel capacity $C = {1}/{2} \cdot {\rm log_2}\,(16) = 2$ bit/channel access.
  • For a channel with the (physical) bandwidth $B = 4\,\rm kHz$, which corresponds to the sampling rate $f_{\rm A} = 8\,\rm kHz$, this gives $C^{\star} = 16\,\rm kbit/s$ (see the sketch below).
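The numbers of Example 1 can be reproduced with a few lines of Python. This is only a sketch with our own variable names, assuming the capacity formulas given above:

```python
import math

es_n0 = 7.5                                  # E_S/N_0  =>  10*lg(7.5) ≈ 8.75 dB
C = 0.5 * math.log2(1 + 2 * es_n0)           # = 0.5 * log2(16) = 2 bit / channel access

B = 4_000.0                                  # physical bandwidth 4 kHz
T_S = 1 / (2 * B)                            # symbol duration, i.e. sampling rate 8 kHz
C_star = C / T_S                             # = 16000 bit/s = 16 kbit/s

print(C, C_star)
```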

A comparison of different coding methods at constant $E_{\rm S}$ (energy per transmitted symbol) is not fair, however. Rather, for such a comparison one should fix the energy $E_{\rm B}$ per information bit. The following relationships apply:

\[E_{\rm S} = R \cdot E_{\rm B} \hspace{0.3cm} \Rightarrow \hspace{0.3cm} E_{\rm B} = E_{\rm S}/R \hspace{0.05cm}.\]

$\text{Channel coding theorem for the AWGN channel:}$

Error-free decoding (for infinitely long blocks ⇒ $n \to \infty$) is always possible if the code rate $R$ is smaller than the channel capacity $C$:

\[R < C = {1}/{2} \cdot {\rm log_2}\left(1 + {2 \cdot R \cdot E_{\rm B}}/{N_0}\right)\hspace{0.05cm}.\]

For each code rate $R$, the required $E_{\rm B}/N_0$ of the AWGN channel can thus be determined so that error-free decoding is just barely possible. For the limiting case $R = C$ one obtains:

\[{E_{\rm B}}/{N_0} > \frac{2^{2R} - 1}{2R}\hspace{0.05cm}.\]
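This limit can be evaluated for a few code rates with the following sketch (the function and variable names are ours, not from the original text); it also illustrates the absolute limit $10 \cdot \lg\,(\ln 2) \approx -1.59\,\rm dB$ for $R \to 0$ and the value $0\,\rm dB$ for $R = 0.5$ mentioned below:

```python
import math

def eb_n0_limit_db(R):
    """Required E_B/N_0 in dB at the Shannon limit R = C of the AWGN channel."""
    return 10 * math.log10((2 ** (2 * R) - 1) / (2 * R))

for R in (0.01, 0.25, 0.5, 0.75, 1.0):
    print(f"R = {R:4.2f}:  10*lg(E_B/N_0) > {eb_n0_limit_db(R):6.2f} dB")
# R -> 0 approaches 10*lg(ln 2) ≈ -1.59 dB;  R = 0.5 gives exactly 0 dB.
```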


The graphic summarizes this result, with the ordinate $R$ plotted on a linear scale and the abscissa $E_{\rm B}/N_0$ plotted logarithmically.

Channel capacity of the AWGN channel
  • Error-free coding is not possible outside the blue area.
  • The blue limit curve indicates the channel capacity $ C $ of the AWGN channel.


From this graph and the above equation, the following can be derived:

  • The channel capacity $C$ increases somewhat less than linearly with $10 \cdot \lg\, E_{\rm B}/N_0$. In the graphic, some selected function values are indicated as blue crosses.
  • If $10 \cdot \lg\, E_{\rm B}/N_0 < -1.59\,\rm dB$, error-free decoding is in principle impossible. If the code rate is $R = 0.5$, then $10 \cdot \lg\, E_{\rm B}/N_0 > 0\,\rm dB$ must hold ⇒ $E_{\rm B} > N_0$.
  • For all binary codes, $0 < R \le 1$ holds; code rates $R > 1$ are only possible with non-binary codes. For example, the maximum possible code rate of a quaternary code is $R = \log_2\, M_y = \log_2\, 4 = 2$.
  • All one-dimensional modulation types - that is, methods that use only the in-phase or only the quadrature component, such as 2-ASK, BPSK and 2-FSK - must lie within the blue region of the graphic.
  • As shown in the chapter Maximum Code Rate for QAM Structures in the book “Information Theory”, there is a “friendlier” limit curve for two-dimensional modulation types such as quadrature amplitude modulation.

AWGN channel capacity for binary input signals


In this book we restrict ourselves mainly to binary codes, i.e. to codes over the Galois field ${\rm GF}(2)$. Therefore:

Conditional density functions for AWGN channel and binary input
  • on the one hand, the code rate is limited to the range $R \le 1$,
  • on the other hand, even for $R \le 1$ not the entire blue region is available (see previous page).


The region that is now valid results from the general equation of the transinformation by

  • the parameters $M_x = 2$ and $M_y \to \infty$,
  • bipolar signaling ⇒ $x = 0$ → $\tilde{x} = +1$ and $x = 1$ → $\tilde{x} = -1$,
  • the transition from conditional probabilities ${\rm Pr}(y_j\,|\,x_i)$ to conditional probability density functions,
  • replacing the sum by an integral.

Optimizing the source again leads to equally probable symbols:

\[{\rm Pr}(\tilde{x} = +1) = {\rm Pr}(\tilde{x} = -1) = 1/2 \hspace{0.05cm}.\]

This gives for the maximum of the transinformation, i.e. for the channel capacity:

\[C = {1}/{2} \cdot \int_{-\infty}^{+\infty} \left[ f_{y\,|\,\tilde{x}=+1}(y) \cdot {\rm log_2}\,\frac{f_{y\,|\,\tilde{x}=+1}(y)}{f_y(y)} + f_{y\,|\,\tilde{x}=-1}(y) \cdot {\rm log_2}\,\frac{f_{y\,|\,\tilde{x}=-1}(y)}{f_y(y)} \right] {\rm d}y \hspace{0.05cm}.\]

The integral cannot be solved in a mathematically closed form, but can only be evaluated numerically.
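A possible numerical evaluation of this integral is sketched below. It is not part of the original text; the choice of integration grid and the parameterization via $E_{\rm S}/N_0$, with amplitudes $\tilde{x} = \pm 1$ and noise variance $\sigma^2 = N_0/(2E_{\rm S})$, are our own assumptions:

```python
import numpy as np

def c_bpsk_awgn(es_n0_db):
    """AWGN capacity in bit/channel access for bipolar binary input (numerical)."""
    es_n0 = 10 ** (es_n0_db / 10)
    sigma = np.sqrt(1 / (2 * es_n0))             # noise standard deviation for E_S = 1
    y = np.linspace(-1 - 8 * sigma, 1 + 8 * sigma, 40001)

    def f_cond(y, mean):                         # conditional PDF f_{y | x~ = mean}
        return np.exp(-(y - mean)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

    f_p, f_m = f_cond(y, +1.0), f_cond(y, -1.0)
    f_y = 0.5 * (f_p + f_m)                      # output PDF for equiprobable symbols

    with np.errstate(divide="ignore", invalid="ignore"):
        integrand = 0.5 * (np.where(f_p > 0, f_p * np.log2(f_p / f_y), 0.0)
                           + np.where(f_m > 0, f_m * np.log2(f_m / f_y), 0.0))
    return float(np.sum(integrand) * (y[1] - y[0]))   # simple numerical quadrature

for db in (-2, 0, 2, 6, 10):                     # E_S/N_0 in dB; E_B/N_0 = E_S/(R*N_0)
    print(db, round(c_bpsk_awgn(db), 3))         # approaches 1 bit/channel access
```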

  • The green curve shows the result.
  • For comparison, the blue curve shows the channel capacity derived on the previous page for Gaussian distributed input signals.
AWGN channel capacity for binary input signals


One notices:

  • For $10 \cdot \lg\, E_{\rm B}/N_0 < 0\,\rm dB$, the two capacity curves differ only slightly.
  • With a binary bipolar input, the parameter $10 \cdot \lg\, E_{\rm B}/N_0$ only has to be increased by about $0.1\,\rm dB$ compared to the optimum (Gaussian input) in order to also enable the code rate $R = 0.5$.
  • From about $10 \cdot \lg\, E_{\rm B}/N_0 \approx 6\,\rm dB$ on, the capacity $C = 1$ bit/channel access of the AWGN channel with binary input is (almost) reached.
  • In between, the limit curve increases almost exponentially.

Common channel codes compared to channel capacity


The aim now is to show the extent to which established channel codes approach the BPSK channel capacity (green curve). In the following graphic, the ordinate shows the rate $R = k/n$ of these codes or the capacity $C$ (with the additional pseudo-unit “bit / channel access”). Further assumptions are:

  • the AWGN channel, characterized by $10 \cdot \lg\, E_{\rm B}/N_0$ in dB, and
  • a bit error rate (BER) of $10^{-5}$ for the codes marked by crosses.
Rates and required $E_{\rm B}/N_0$ of various channel codes

$\text{Please note:}$

  • The channel capacity curves always apply to $n \to \infty$ and $\rm BER \to 0$.
  • If one were to apply this strict requirement of error-free transmission to the considered channel codes of finite code length $n$, then $10 \cdot \lg\, E_{\rm B}/N_0 \to \infty$ would always be required.
  • However, this is an academic problem of little practical relevance. For $\rm BER = 10^{-10}$ the graph would be qualitatively and also quantitatively similar.


Here are some explanations of the data, which were taken from the lecture [Liv10][1]:

  • The points $\rm A$, $\rm B$ and $\rm C$ mark Hamming codes with different rates. They all require more than $10 \cdot \lg\, E_{\rm B}/N_0 = 8\,\rm dB$ for $\rm BER = 10^{-5}$.
  • The marking $\rm D$ identifies the binary Golay code with rate $1/2$, and the marking $\rm E$ a Reed-Muller code. This very low-rate code was already used on the Mariner 9 space probe in 1971.
  • The Reed-Solomon codes (RS codes) are treated in detail in the second main chapter. A high-rate RS code $(R = 223/255 > 0.9)$ with a required $10 \cdot \lg\, E_{\rm B}/N_0 < 6\,\rm dB$ is marked with $\rm F$.
  • The markings $\rm G$ and $\rm H$ denote exemplary convolutional codes (CC) of medium rate. The code $\rm G$ was already used on the Pioneer 10 mission in 1972.
  • The channel coding of the Voyager mission at the end of the 1970s is marked with $\rm I$. It is the concatenation of a $\text{(2, 1, 7)}$ convolutional code with a Reed-Solomon code, as described in the fourth main chapter.

Note: With convolutional codes, the third identifier parameter in particular has a different meaning than with block codes. For example, $\text{CC (2, 1, 32)}$ indicates the memory $m = 32$.

Rates and required $E_{\rm B}/N_0$ for iterative coding methods


With iterative decoding, significantly better results can be achieved, as the second graphic shows.

  • That means: the individual marking points are much closer to the capacity curve.
  • The solid blue curve, previously labeled “Gaussian capacity”, is here called “Gaussian (real)”.


Here are a few more explanations about this graphic:

  • Red crosses mark turbo codes according to CCSDS (Consultative Committee for Space Data Systems), each with $k = 6920$ information bits and different code lengths $n$. These codes, invented by Claude Berrou around 1990, can be decoded iteratively. The (red) markings are each less than $1\,\rm dB$ away from the Shannon limit.
  • The LDPC codes (low-density parity-check codes) marked by white rectangles, which have been used with DVB-S(2) (Digital Video Broadcast over Satellite), show similar behavior. Due to the sparse occupancy of the parity-check matrix $\boldsymbol{\rm H}$ (with ones), they are very well suited for iterative decoding using factor graphs and EXIT charts. See [Hag02][2].
  • The black dots mark the LDPC codes specified by CCSDS, which all assume a constant number of information bits $(k = 16384)$. In contrast, the code length $n = 64800$ is constant for all white rectangles, while the number $k$ of information bits changes according to the rate $R = k/n$.
  • Around the year 2000, many researchers had the ambition to approach the Shannon limit to within fractions of a $\rm dB$. The yellow cross marks such a result from [CFRU01][3]: an irregular rate-$1/2$ LDPC code with code length $n = 2 \cdot 10^6$.

$\text{Conclusion:}$ It should be noted that Shannon recognized and proved as early as 1948 that no one-dimensional modulation method can lie to the left of the solid AWGN limit curve “Gaussian (real)”.

  • For two-dimensional methods such as QAM and multi-level PSK, on the other hand, the limit curve “Gaussian (complex)” applies, which is shown here as a dashed line and always lies to the left of the solid curve.
  • You can find more information on this in the section Maximum Code Rate for QAM Structures in the book “Information Theory”.
  • With QAM methods and very long channel codes, this limit curve has by now almost been reached, but it has never been exceeded.

Exercises for the chapter


Exercise 1.17: On the channel coding theorem

Exercise 1.17Z: BPSK channel capacity

Bibliography

  1. Liva, G.: Channel Coding. Lecture manuscript, Chair of Telecommunications, TU Munich and DLR Oberpfaffenhofen, 2010.
  2. Hagenauer, J.: The Turbo Principle in Mobile Communications. In: Int. Symp. on Information Theory and Its Applications, Oct. 2002.
  3. Chung, S.Y.; Forney Jr., G.D.; Richardson, T.J.; Urbanke, R.: On the Design of Low-Density Parity-Check Codes within 0.0045 dB of the Shannon Limit. In: IEEE Communications Letters, vol. 5, no. 2 (2001), pp. 58-60.