**APPLICATIONS OF NUMERICAL METHODS TO AN OPTIMIZATION PROBLEM IN DATA COMMUNICATIONS**

Eric Brahinsky

MAT 3633: Numerical Analysis

Dr. Richardson

The University of Texas at San Antonio

November 1995

In the fall of 1994, this author was presented with a “real-world” problem in data communications by a longtime friend and correspondent, James G. King of Dallas. Mr. King, a director of engineering at a Dallas-area firm, had been working with the dispatching of messages on a data-communications channel, and desired mathematical models and methods that might aid in optimizing controllable parameters and thus ultimately minimize costs. In this paper, the author will describe the problem and explore the application of several numerical methods to its solution.

The problem involves the transmission of a message (the length of which is not known *a priori*) from an originator to a receiver. The message is electronically encoded in bits and is transmitted in one direction on a full-duplex channel (the purpose of the other channel will be addressed shortly).

The crux of the mathematical analysis of the problem involves the size of the “pieces” into which the sender chooses to break up the message in order to facilitate efficient transmission. Each piece, which is called a “frame,” must consist of some number of eight-bit bytes. Once the frame size is determined, it must remain constant throughout a given transmission.

The transmission of each individual frame incurs an overhead cost. This fact in itself would argue for a large frame size: larger frames mean fewer frames, and thus less total overhead. However, there is another consideration, which arises from the relative quality of the transmission channel. No medium is perfect, and there is a probability that some proportion of the bits will be incorrectly transmitted. On a given machine, this “measure of imperfection,” the *bit error rate* (BER), can be assessed -- for example, one transmission system might have a BER of .003, which would mean there would be, on average, three bad bits out of every thousand sent. The operative fact is that if a frame contains a mistake, __the entire frame__ must be re-sent.

Clearly, __this__ situation argues __against__ large frames. The more bits a frame contains, the greater the chance that at least one bit (and it only takes one) will be in error; whatever overhead might have been saved by keeping the number of frames to a minimum will likely have to be expended in repeated attempts to transmit individual frames error-free.

It is evident, then, that the optimal frame size must be somewhere in the middle: large enough to keep down the cumulative total of per-frame expenditures, and small enough to avoid inordinate instances of frame retransmission.

Overhead, which involves frame identification and channel synchronization, is itself assessed in terms of some number of bits, and according to Mr. King, may be assumed to be “encoded in such a way as to be impervious to bit errors.” The system also has a means of checking for bit errors, the “cyclic redundancy check” (CRC). The CRC bits themselves __may__ incur bit errors (*i.e.*, a corrupted CRC may cause a frame that was actually good to be reported as bad); so the CRC’s bit “cost” must be added to the effective frame size in calculations involving the BER. (The CRC process uses the second channel of the duplex system to report success or failure of each frame.)

Before stating an explicit formula for channel efficiency, it is necessary to calculate the probability that a given frame will be sent successfully. By definition, a given bit has a BER probability of being wrong, and thus a 1-BER probability of being correct. A correctly transmitted frame of N bits requires N consecutive transmissions of correct bits; hence this event has a (1-BER)^{N} probability. In practice, the CRC bits lower this probability to (1-BER)^{N+CRC}. Since this is the probability that the frame will be correct, its __reciprocal__, *i.e.*, (1-BER)^{-(N+CRC)}, is the expected number of times the frame must be attempted in order to achieve success.
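As a quick worked check of this reasoning (a Python sketch by this editor -- the author's actual programs were in BASIC -- with a purely illustrative 1000-bit frame and the parameter values quoted later in the paper):

```python
# Probability that every bit of an (N + CRC)-bit frame arrives intact,
# and the expected number of transmission attempts (geometric distribution).
# Illustrative values: BER = 1e-4, N = 1000 message bits, CRC = 16 bits.
BER, N, CRC = 1e-4, 1000, 16

p_success = (1 - BER) ** (N + CRC)      # all N + CRC bits must be correct
expected_attempts = 1 / p_success       # the reciprocal, as argued above

print(f"P(frame correct)  = {p_success:.5f}")
print(f"expected attempts = {expected_attempts:.5f}")
```

Even at this modest frame size, roughly one frame in ten must be re-sent.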

Now channel efficiency may be defined simply as the ratio of the number of bits in one frame as provided by the source, to the total number of bits transmitted in successfully communicating a frame. That is: the numerator is just “the message,” while the denominator takes into account the message, the overhead, the CRC, and the expected number of necessary retransmissions.

Thus the formula is:

E = N / [ (N + C + X) (1 - B)^{-(N+C)} ] = N (1 - B)^{N+C} / (N + C + X) ,

where E = channel efficiency

N = frame length in bits

C = cyclic redundancy check length, in bits

X = overhead cost in bits

B = bit error rate.

The aim then is to maximize this ratio.

In a given application, parameters C, X, and B will be predetermined; thus E will effectively become a function of N.
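The author's programs were written in BASIC (listings follow the paper); an equivalent sketch of the efficiency function in a modern language such as Python might read as follows (the function name `efficiency` and the keyword defaults are this editor's choices):

```python
def efficiency(N, C=16, X=400, B=1e-4):
    """Channel efficiency E: message bits delivered per bit transmitted.

    N: frame length in message bits; C: CRC bits; X: overhead bits;
    B: bit error rate.  Defaults are the typical values quoted in the text.
    """
    return N * (1 - B) ** (N + C) / (N + C + X)

# E rises, peaks at an interior value of N, and falls off again:
for n in (500, 1000, 1842, 3000):
    print(n, round(efficiency(n), 4))
```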

One can examine graphically the results of substituting into the above formula particular values for C, X, and B, plotting E as it varies with N. Mr. King has supplied typical values of

C = 16 bits

X = 400 bits

B = 10^{-4}

This gives a formula of:

E = N (0.9999)^{N+16} / (N + 416) .

The graph does exhibit the predicted behavior: the value of E “peaks” for some positive value of N, a value that appears to lie in the range 1000 to 2500.

In order to apply numerical methods, we would prefer to find zeros, rather than maxima, of functions. Fortunately, we can transform this problem by applying an elementary result of calculus (the interior-extremum theorem): if the differentiable function E(N) attains a maximum in the interior of an interval, then its derivative E'(N) must have a zero there. We have the option of:

1) differentiating E(N), the function graphed above, in which X, C, and B have already been assigned values. The disadvantage here is that in a new application with any of X, C, or B changed (*e.g.*, a different transmitter), a new function would result and a new derivative would need to be calculated;

** or**

2) differentiating E(N,X,C,B) with respect to N. This partial derivative would be applicable as any parameter might be adjusted.

If we choose option 2), we have the partial derivative

E'_N = (1 - B)^{N+C} [ C + X + N (N + C + X) ln(1 - B) ] / (N + C + X)^{2} .
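Carrying out this differentiation by hand yields the closed form coded below (a Python sketch by this editor; `dE_dN` is an illustrative name):

```python
import math

def dE_dN(N, C=16, X=400, B=1e-4):
    """Partial derivative of the channel efficiency E with respect to N."""
    T = N + C + X                                  # total bits per attempt
    bracket = C + X + N * T * math.log(1 - B)      # vanishes at the optimal N
    return (1 - B) ** (N + C) * bracket / T**2

# A sign change across the candidate interval brackets the maximum of E:
print(dE_dN(1000) > 0, dE_dN(2500) < 0)            # prints: True True
```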

Let us examine the graph of this derivative function with the same values as before for X, C, and B:

The graph does appear to cross the N-axis at a value of N for which the graph of the original function had a maximum. The task now is to determine the N-intercept with some precision, and we can use numerical methods to accomplish this.

Let us use the given typical values of X, C, and B, and use [1000, 2500] as our “candidate” interval for containing a zero of E'_N. In all of the following calculations, let us require satisfaction of a residual-norm tolerance of 10^{-9}; for those methods providing an interval of convergence, let us additionally require that an x-norm (actually, here, an “N-norm”) tolerance of 1^{*} be satisfied.

^{*} An uncharacteristically large x-norm tolerance is chosen because 1) N will eventually be converted to bytes (*i.e.*, divided by 8) and rounded to an integer, so greater precision would be of little practical value; 2) with a BER rather imprecisely given as “10^{-4},” an x-norm tolerance even this small is rather presumptuous!

Here are the results of applying **bisection**:

a                 b
1000              2500
1750              2500
1750              2125
1750              1937.5
1750              1843.75
1796.875          1843.75
1820.3125         1843.75
1832.03125        1843.75
1837.89063        1843.75
1840.82031        1843.75
1840.82031        1842.28516
1841.55273        1842.28516
1841.91895        1842.28516
1842.10205        1842.28516
1842.10205        1842.19360
1842.10205        1842.14783
1842.12494        1842.14783

*N* = 1842.12494
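A bisection routine enforcing the same two tolerances can be sketched in Python (the author's implementation was in BASIC; the names here are this editor's):

```python
import math

def dE_dN(N, C=16, X=400, B=1e-4):
    """Derivative of channel efficiency E with respect to N."""
    T = N + C + X
    return (1 - B) ** (N + C) * (C + X + N * T * math.log(1 - B)) / T**2

def bisect(f, a, b, res_tol=1e-9, x_tol=1.0):
    """Halve [a, b] until the midpoint residual and interval width are small."""
    fa = f(a)
    while True:
        m = (a + b) / 2
        fm = f(m)
        if abs(fm) < res_tol and (b - a) < x_tol:
            return m
        if fa * fm <= 0:          # sign change in [a, m]: keep left half
            b = m
        else:                     # otherwise the zero lies in [m, b]
            a, fa = m, fm

print(bisect(dE_dN, 1000, 2500))  # lands near N = 1842.1
```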

**regula falsi**:

[Convergence table: the left endpoint a remained fixed at 1000 through all 35 iterations, while the right endpoint b descended from 2500 toward the root, reaching 1842.13662.]

*N* = 1842.13662
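The stagnation is characteristic of regula falsi on a function of this shape: the chord's zero crossing always falls on the same side of the root, so one endpoint never moves. A Python sketch of the standard algorithm (this editor's illustration):

```python
import math

def dE_dN(N, C=16, X=400, B=1e-4):
    T = N + C + X
    return (1 - B) ** (N + C) * (C + X + N * T * math.log(1 - B)) / T**2

def regula_falsi(f, a, b, res_tol=1e-9, max_iter=200):
    """False position: replace the midpoint with the chord's zero crossing."""
    fa, fb = f(a), f(b)
    for _ in range(max_iter):
        c = (a * fb - b * fa) / (fb - fa)   # where the chord meets the axis
        fc = f(c)
        if abs(fc) < res_tol:
            return c
        if fa * fc < 0:
            b, fb = c, fc                   # root in [a, c]; here a stagnates
        else:
            a, fa = c, fc
    return c

print(regula_falsi(dE_dN, 1000, 2500))
```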

and **modified regula falsi**:

[Convergence table: 12 iterations; the retained endpoint values moved from 1000 to 1866.88395 and then to 1841.59827, closing in on the root from both sides.]

*N* = 1842.13662
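The paper does not spell out which modification was used; one standard variant (the Illinois scheme, which halves the retained endpoint's function value whenever that endpoint survives two iterations running) shows the same kind of acceleration. A Python sketch by this editor:

```python
import math

def dE_dN(N, C=16, X=400, B=1e-4):
    T = N + C + X
    return (1 - B) ** (N + C) * (C + X + N * T * math.log(1 - B)) / T**2

def modified_regula_falsi(f, a, b, res_tol=1e-9, max_iter=200):
    """Illinois variant: halve the stale endpoint's f-value to break stagnation."""
    fa, fb = f(a), f(b)
    side = 0                                # which endpoint was kept last time
    for _ in range(max_iter):
        c = (a * fb - b * fa) / (fb - fa)
        fc = f(c)
        if abs(fc) < res_tol:
            return c
        if fa * fc < 0:
            b, fb = c, fc
            if side == -1:                  # a kept twice in a row
                fa /= 2
            side = -1
        else:
            a, fa = c, fc
            if side == +1:                  # b kept twice in a row
                fb /= 2
            side = +1
    return c

print(modified_regula_falsi(dE_dN, 1000, 2500))
```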

Applying the **secant** method, with N_{-1} = 1000 and N_{0} = 2500 and tolerances as before, gives:

N
______________
2500

2217.98784

1625.40856

1903.41367

1851.7483

1841.69515

1842.13977

*N* = 1842.13977
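The secant method drops the bracketing requirement altogether and simply keeps the two most recent iterates (a Python sketch; names are this editor's):

```python
import math

def dE_dN(N, C=16, X=400, B=1e-4):
    T = N + C + X
    return (1 - B) ** (N + C) * (C + X + N * T * math.log(1 - B)) / T**2

def secant(f, x0, x1, res_tol=1e-9, max_iter=50):
    """Newton-like step with a finite-difference slope from the last two points."""
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if abs(f1) < res_tol:
            return x1
        x0, x1, f0 = x1, x1 - f1 * (x1 - x0) / (f1 - f0), f1
        f1 = f(x1)
    return x1

print(secant(dE_dN, 1000, 2500))
```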

The logical next step would be the **Newton** method. Newton requires the derivative of the function whose zero is sought: in this case, the __second__ derivative of the original function E. If as before we choose to apply partial differentiation with respect to N, we obtain the following:

E''_N = (1 - B)^{N+C} { [ ln(1 - B) - 2/(N + C + X) ] G(N) + (2N + C + X) ln(1 - B) } / (N + C + X)^{2} ,

where G(N) = C + X + N (N + C + X) ln(1 - B), so that E'_N = (1 - B)^{N+C} G(N) / (N + C + X)^{2}.

After substituting numerical values for X, C, and B, we attempt Newton, with N_{0} = 1000, and with residual tolerance as before:

N
______________
1000

1417.89141

1723.17185

1831.99025

1842.06048

1842.13662

*N* = 1842.13662
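Newton's iteration, with both derivatives in closed form, can be sketched as follows (this editor's Python illustration; `newton_E` is an assumed name):

```python
import math

def newton_E(N, C=16, X=400, B=1e-4, res_tol=1e-9, max_iter=50):
    """Newton's method applied to E'_N, using the closed-form E''_N."""
    lnq = math.log(1 - B)
    for _ in range(max_iter):
        T = N + C + X
        G = C + X + N * T * lnq                        # common factor G(N)
        d1 = (1 - B) ** (N + C) * G / T**2             # E'_N
        if abs(d1) < res_tol:
            return N
        d2 = (1 - B) ** (N + C) * ((lnq - 2 / T) * G
                                   + (2 * N + C + X) * lnq) / T**2   # E''_N
        N -= d1 / d2                                   # Newton step
    return N

print(newton_E(1000))
```

From N_{0} = 1000 this reproduces the first step to roughly 1417.9 seen in the table above.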

Predictably, Newton, due to its quadratic convergence, requires the fewest steps -- at the cost of some complicated algebraic manipulation in determining E''_N.

In all the above methods, the resulting N of approximately 1842 would be converted to the nearest integral number of bytes by the formula:

Number of bytes = **int** [N/8 + .5],

*i.e.*, 230 bytes (= 1840 bits). This would be the actual optimum frame size. The value of E would be 0.6774, so the transmission of a message of K bits would require an overall expenditure of K/E, or 1.476 K, bits.

One might then explore the advantage of seeking a better channel (or perhaps the savings on equipment for a __lower__-quality channel!). This would involve fixing X and C as before, and trying different values for B. How is efficiency affected by using a channel with a BER of, say, 3 x 10^{-5} (*i.e.*, about three times better than before)? Or ten times as good (BER = 10^{-5})?

Using values as before for X and C, the resulting values for N and 1/E, as these quantities vary with B, are shown:

__BER.................Optimum # bytes.......1/E__
10^{-4}......................230..............1.476
3 x 10^{-5}..................440..............1.243
10^{-5}......................781..............1.136
3 x 10^{-6}.................1446..............1.073
10^{-6}.....................2524..............1.041
Here is a graph of 1/E (at optimum N) as a function of B:

These were calculated using the secant method, tolerances as before.
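Such a sweep is easy to regenerate: the bracketed factor in E'_N is a quadratic in N, so its positive root gives the optimum in closed form (this editor's Python sketch uses that shortcut in place of the secant iteration; both locate the same zero):

```python
import math

def optimum(B, C=16, X=400):
    """Optimal frame size for a given bit error rate B.

    E'_N vanishes where N**2 + (C+X)*N + (C+X)/ln(1-B) = 0; take the
    positive root, then round to whole bytes as in the text."""
    s = C + X
    lnq = math.log(1 - B)
    N = (-s + math.sqrt(s * s - 4 * s / lnq)) / 2
    n_bytes = int(N / 8 + 0.5)                   # int[N/8 + .5]
    bits = 8 * n_bytes
    E = bits * (1 - B) ** (bits + C) / (bits + s)
    return n_bytes, 1 / E

for B in (1e-4, 3e-5, 1e-5, 3e-6, 1e-6):
    n_bytes, cost = optimum(B)
    print(f"BER {B:g}: {n_bytes} bytes, 1/E = {cost:.3f}")
```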

Actual dollar costs would have to be considered in practice, but the above table and graph could be used for a rough guideline.

Results of the sort calculated here proved useful in the design and implementation of the data communication system described; the foregoing is an illustration of the power and practicality of numerical approximation methods.

** References**:

King, James G. Personal communications with the author: Letters dated 10/18/94 and 11/30/94; telephone conversation, 11/94.

The text of this paper (including formulae and equations) was produced on a Macintosh Performa 630CD computer with the aid of ClarisWorks 2.1 word processing, and printed in Helvetica font on a Hewlett Packard DeskWriter 540 printer.

All programs were run, and the first two graphs, all convergence sequences, and program listings were printed, using a Commodore 64 computer and a Star Micronics SG-10 printer.

All programs (listings of which follow) were written in BASIC by the author.