Snapshot 1: Logistic data. The phase portrait unfolds with a time lag of 1, suggesting that the delay time could be 1. The average mutual information does not have a local minimum, but the first substantial decrease stops at time lag 1, suggesting that a suitable delay time could be 1. The fraction of false nearest neighbors drops to zero for embedding dimension 1. In summary, the delay coordinates are simply $x_t$.

Snapshot 2: Hénon data. The phase portrait unfolds with a time lag of 1, suggesting that the delay time could be 1. The average mutual information does not have a local minimum, but the first substantial decrease stops at time lag 1, suggesting that a suitable delay time could be 1. The fraction of false nearest neighbors drops to zero for embedding dimension 2 (71.2% of nearest neighbors are false for embedding dimension 1). In summary, the delay coordinates are $(x_t, x_{t-1})$.

Snapshot 3: Lorenz data. The phase portrait unfolds with a time lag of approximately 16, suggesting that the delay time could be 16. The average mutual information has a local minimum at a time lag of approximately 16, suggesting that a suitable delay time could be 16. The fraction of false nearest neighbors drops to zero for embedding dimension 3 (for embedding dimension 1 it is 96.9%; for embedding dimension 2 it is 2.26%). In summary, the delay coordinates are $(x_t, x_{t-16}, x_{t-32})$.

Snapshot 4: NMR laser data. The phase portrait unfolds with a time lag of 1, suggesting that the delay time could be 1. The average mutual information does not have a clear local minimum, but the first substantial decrease stops at time lag 1, suggesting that a suitable delay time could be 1. The fraction of false nearest neighbors drops to zero for embedding dimension 5 (for embedding dimensions 1, 2, 3 and 4 it is 86.2%, 13.6%, 1.18% and 0.1%, respectively). In summary, the delay coordinates are $(x_t, x_{t-1}, x_{t-2}, x_{t-3}, x_{t-4})$.

These four settings are provided as bookmarks in the "Bookmarks/Autorun" menu.

Each of the four datasets contains 4000 values. The logistic and Hénon datasets are generated from the corresponding difference-equation models. These two calculations start with high enough precision that all decimal digits of even the last value are correct (for the logistic and Hénon data, we start with a precision of 2800 and 1200 digits, respectively).
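To illustrate the precision issue for the logistic case, here is a minimal Python sketch using the standard `decimal` module (not the Mathematica code behind this Demonstration); the map parameter $r = 4$ and the initial value $0.1$ are assumptions for illustration and need not match the Demonstration's settings:

```python
from decimal import Decimal, getcontext

def logistic_series(n, x0="0.1", r="4", digits=100):
    """Iterate the logistic map x_{t+1} = r x_t (1 - x_t) with
    high-precision decimal arithmetic.  Chaotic sensitivity means
    each step loses digits, so many extra starting digits are
    needed for the last iterate to remain accurate."""
    getcontext().prec = digits
    x, r = Decimal(x0), Decimal(r)
    out = []
    for _ in range(n):
        x = r * x * (1 - x)
        out.append(float(x))  # keep ordinary double-precision samples
    return out
```

Since the logistic map at $r = 4$ loses roughly one binary digit (about 0.3 decimal digits) of accuracy per iteration, generating thousands of accurate values requires thousands of starting digits, which is why figures like 2800 digits appear above.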

The Lorenz dataset is calculated by sampling the numerical solution of the three Lorenz differential equations. We have tried to get a high-precision numerical solution by using a working precision of 25, and accuracy and precision goals of 18 (with these settings, we need almost … steps).
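For comparison, a rough Python sketch of the sampling idea is given below, using a fixed-step fourth-order Runge–Kutta integrator with the classic parameter values $\sigma = 10$, $\rho = 28$, $\beta = 8/3$; the step size, sampling interval, and initial state are illustrative assumptions, and the result is far less accurate than the high-precision solution described above:

```python
def lorenz_rhs(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

def rk4_step(f, state, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + h / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def lorenz_series(n, dt=0.01, substeps=5, state=(1.0, 1.0, 1.0), drop=500):
    """Sample the x coordinate every dt time units, dropping the
    first `drop` samples as transients (mirroring the text)."""
    xs = []
    for _ in range(n + drop):
        for _ in range(substeps):
            state = rk4_step(lorenz_rhs, state, dt / substeps)
        xs.append(state[0])
    return xs[drop:]
```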

For each of these three datasets, 4500 values are actually generated, but the first 500 values are dropped as transients. The NMR laser dataset was earlier available at http://people.maths.ox.ac.uk/mcsharry/lectures/ndc/ndcworkshop.shtml but now appears to have been removed. This dataset originally contained … values, but we only use the first 4000 values.

**Phase Space Reconstruction**

In phase space reconstruction, we have data in the form of a time series $x_1$, $x_2$, …, and we choose a delay time $\tau$ and an embedding dimension $m$ to get $m$-dimensional delay coordinates $(x_t, x_{t-\tau}, \dots, x_{t-(m-1)\tau})$. In this Demonstration, we consider the estimation of the delay time and the embedding dimension.
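As a concrete illustration (a minimal Python sketch, not the Mathematica code behind this Demonstration), the delay coordinates can be built directly from the index arithmetic above:

```python
def delay_embed(series, tau, m):
    """Return the m-dimensional delay vectors
    (x_t, x_{t-tau}, ..., x_{t-(m-1)tau}) for all valid t."""
    start = (m - 1) * tau  # earliest t whose oldest component exists
    return [tuple(series[t - k * tau] for k in range(m))
            for t in range(start, len(series))]
```

For the Lorenz data, for example, `delay_embed(series, 16, 3)` produces the vectors $(x_t, x_{t-16}, x_{t-32})$.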

There is no rigorous way to determine an optimal value for the delay time, often denoted by $\tau$. In practice, the value of $\tau$ should be such that the values of $x_t$ and $x_{t-\tau}$ are sufficiently independent to be useful as coordinates in a time-delay vector but not so independent as to have no connection with each other at all. If the signal has a strong (almost) periodic component, a good first guess for the delay time is one-quarter of the period. One of the methods suggested for choosing $\tau$ is the use of mutual information. It measures the general dependence of two variables (recall that autocorrelation, well known from time series analysis, measures the linear dependence of two variables). In this method, we calculate the average mutual information $I(\tau)$ of the variables $x_t$ and $x_{t-\tau}$ for various values of $\tau$:

$$I(\tau) = \sum_{i,j} p_{ij}(\tau)\,\log_2 \frac{p_{ij}(\tau)}{p_i\, p_j}.$$

Here, $p_i$ is the relative frequency of the $i$th bin of a histogram of the data, that is, the approximate probability that an observation is inside the $i$th bin. Similarly, $p_{ij}(\tau)$ is the approximate probability that $x_t$ is in the $i$th bin and $x_{t-\tau}$ is in the $j$th bin. The first minimum of average mutual information marks the delay time $\tau$ where $x_{t-\tau}$ adds maximal information to the knowledge we have from $x_t$. Accordingly, it is suggested that this value of $\tau$ is used as the delay time in phase space reconstruction.
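The histogram-based estimate of $I(\tau)$ described above can be sketched in Python as follows; the brute-force binning and the default of 16 bins are illustrative assumptions, not the Demonstration's actual settings:

```python
import math

def average_mutual_information(series, tau, bins=16):
    """Estimate I(tau) from a joint histogram of (x_t, x_{t-tau})."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / bins or 1.0  # guard against a constant series

    def bin_of(v):
        # clamp the maximum value into the last bin
        return min(int((v - lo) / width), bins - 1)

    pairs = [(series[t], series[t - tau]) for t in range(tau, len(series))]
    n = len(pairs)
    joint, px, py = {}, {}, {}
    for a, b in pairs:
        i, j = bin_of(a), bin_of(b)
        joint[(i, j)] = joint.get((i, j), 0) + 1
        px[i] = px.get(i, 0) + 1
        py[j] = py.get(j, 0) + 1
    # I(tau) = sum_{i,j} p_ij log2( p_ij / (p_i p_j) )
    return sum((c / n) * math.log2(c * n / (px[i] * py[j]))
               for (i, j), c in joint.items())
```

To pick the delay time, one would evaluate this for $\tau = 1, 2, \dots$ and look for the first local minimum.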

A widely used method to determine $m$ is the method of false nearest neighbors. The idea is that when the embedding dimension is too small, some points of the data are very close to one another, not on the basis of the dynamics, but because the data is projected onto a too low-dimensional space. In the method of false nearest neighbors, we gradually increase the embedding dimension $m$, and for each value of $m$, we search for any false nearest neighbors for each point of the embedded data. This search proceeds as follows.

Assume that the delay vector $y^{(j)}$ is the nearest neighbor of $y^{(i)}$ in dimension $m$; denote the distance by $R_m$. Also consider these points in dimension $m+1$, where they have the additional components $x_{j-m\tau}$ and $x_{i-m\tau}$, respectively. Now, if $y^{(j)}$ is a true neighbor of $y^{(i)}$, their distance will also be small in dimension $m+1$, but if $y^{(j)}$ is a false neighbor of $y^{(i)}$, their distance will be much larger in dimension $m+1$. This observation leads to the calculation of the ratio $|x_{i-m\tau} - x_{j-m\tau}| / R_m$. If this ratio is larger than a threshold value $R_{\text{tol}}$ like 15, point $y^{(j)}$ is declared to be a false nearest neighbor of $y^{(i)}$.

When we have increased $m$ to a value for which, for the first time, there are essentially no false nearest neighbors left, we have identified the correct embedding dimension. For this dimension, the proportion of false nearest neighbors is either zero or very small.
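The ratio test described above can be sketched in Python with a brute-force nearest-neighbor search; apart from the threshold $R_{\text{tol}} = 15$ mentioned in the text, all implementation details here are generic illustrative choices:

```python
import math

def false_nn_fraction(series, tau, m, r_tol=15.0):
    """Fraction of false nearest neighbors when going from
    dimension m to m+1 (ratio test, brute-force O(n^2) search)."""
    start = m * tau  # ensure the extra (m+1)-th component also exists
    idx = list(range(start, len(series)))

    def vec(t):
        # m-dimensional delay vector (x_t, x_{t-tau}, ...)
        return tuple(series[t - k * tau] for k in range(m))

    false = 0
    for t in idx:
        vt = vec(t)
        # nearest neighbor in dimension m, excluding the point itself
        best, best_d = None, float("inf")
        for s in idx:
            if s == t:
                continue
            d = math.dist(vt, vec(s))
            if d < best_d:
                best, best_d = s, d
        # distance added by the extra component in dimension m+1
        extra = abs(series[t - m * tau] - series[best - m * tau])
        if best_d == 0 or extra / best_d > r_tol:
            false += 1
    return false / len(idx)
```

Increasing `m` until the returned fraction is (essentially) zero reproduces the procedure described above, at quadratic cost in the number of points.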

For details of analysis of chaotic data with Mathematica, see "Analysis of Chaotic Data with Mathematica" in the Related Links, in which we calculate the correlation dimension and the maximal Lyapunov exponent and also consider prediction for the four datasets.

In the other Demonstrations listed in Related Links, we estimate the correlation dimension and the maximal Lyapunov exponent.