Automatic Differentiation
|
 |
Differentiation rules can be applied to differentiate any function expressed in terms of composition and arithmetic operations of functions with known derivatives. Suppose  . Then  , where  is generally much more complicated than  . Automatic differentiation is a method for numerically calculating  at a point just using the formula for  . Only simple operations and basic functions are applied at each step. Numeric differentiation by divided differences, on the other hand, involves evaluating the full expression for  and introduces error by dividing by small numbers. The step-by-step calculation of automatic differentiation is illustrated in this Demonstration by means of an old-style reverse polish notation (RPN) calculator. To convert an expression to RPN, write the RPN for the operands of the last operation to perform, and then the operator. E.g.,  converts to  ,  converts to  and  converts to  . To evaluate, regard the RPN notation as a list, and scan it from left to right. For a value, write it on a sheet of paper and put the sheet on a pile. For an operation, pick up sheets for as many operands as it takes, perform the operation on the values, write its value on a sheet of paper, and place on the pile. When you get to the end of the list, the result of the whole expression will be written on the top (the only) sheet on the pile. The pile is a data structure called a stack. Putting sheets on and taking them off the pile is called pushing and popping values onto and from the stack. Stacks are used in this way in evaluating expressions on a computer, and would be used at a low level in implementing automatic differentiation. Automatic differentiation can be defined using dual numbers. Dual numbers are pairs of real numbers satisfying the following algebraic rules:  ,  , which defines negation for  , and hence subtraction,  , and  , which defines the reciprocal for  , and hence division. The dual number  can be written as  , where  , and we identify a real number  with the dual  . With this notation, duals are numbers of the form  where  , much like complex numbers are numbers of the form  , where  . Automatic differentiation is the method of numerically computing  by simply evaluating  at the dual number  , where  , for any differentiable function , is the function from duals to duals defined by  . Amazingly, the result of the computation is  . Thus,  and  can be read off as the real part and the dual part of the result. The Demonstration computes  and  by evaluating an RPN expression for  using a stack. For example, to compute  where  , click "x+ϵ", "x^2", "e^x", "*" to compute the first factor, followed by "x+ϵ", "x^2", "x+ϵ", "x^(1/2)", "+", "sin" to compute the second factor, and finally "*" to compute the product. Clicking "x+ϵ" pushes  on the stack. Clicking "x^2" results in  , etc. The stack contains the resulting dual values at each stage. The values and derivatives are displayed in frames in two stacks, the first showing expressions in x, the second showing values at the slider value of  . For a second example, if  , and you want  , set slider "a" to 10, click "x+ϵ", "a","y^x" and set slider  to 7. (Traditionally, "x" names the top and "y" names the next element on the stack.) We implement the dual  in Mathematica as the expression makedual[a,b]. Operations on duals and the extension of functions to functions on duals are implemented using upvalues. The Demonstration calculates with dual numbers and displays the results as framed pairs, the real part being the value of a function and the dual part being the value of its derivative at a point. That the expression  is obtained by replacing all functions in an expression for  by their extensions and evaluating the result at  can be verified by induction. First note  is the extension of the identity function at  and  is the extension of the constant function  at  . Then check it is true for arithmetic operations on functions and composition of functions. The Demonstration can be used to make these calculations. For example, click "x+ϵ", "f", "x+ϵ", "g", "*" to verify it for  , and click "x+ϵ", "g", "f" to verify it for  Click "clear" to clear the stack and storage locations, or "pop" to remove just the top value. Click "chs" to change the sign of the top value, and "lastx" to retrieve the previous top value. The calculator also computes the partial derivatives of functions of two variables, using makedual[z,{h,k}] to generalize duals to numbers of the form  , where  and  . Automatically,  . Extend  to a function  on these duals by  . Then,  has real part  and "dual" part  . Click  ",  ", "F" to verify this for a general function  . To compute the partial derivatives of  and their values at  and  , set  ,  and click "  ", "  ", "  ", "*", "  ", "  ", "  ", "  ", "+", "/". The values can be read from  and  . To compute  and its value at  , set  ,  , and  and click "  ", "b", "c", "/", "  ", "  ", "  ", "sin", "*". The results, from which the value and derivative of  at  and at  can be read, are  and  . The calculator can be used as an ordinary calculator by entering only constants, obtained by setting and clicking  ,  , or  . Decimal values can be obtained by setting at least one involved slider value to a decimal value. Slider values are only from -10 to 10, good for simple examples. To enter a larger value, write it in scientific notation  where  . Set  and  to be the mantissa and exponent and enter "a", "b", "10^x", "*", "sto1" to store the number in the first storage location. |
 |
 |
 |
 |
|
 |
|
 |
|
 Upvalues for makedual are also used to implement a real to a dual power, and a dual to a dual power, as  ,  . Mathematica is so powerful that these and the above definitions for makedual are sufficient to implement the calculator. L. B. Rall's "The Arithmetic of Differentiation" first interested me in automatic differentiation. L. B. Rall, "The Arithmetic of Differentiation," Mathematics Magazine, 59(5), pp. 275–282.
|
|