.fp 4 M2 .fp 5 M6 .nr Cl 7 .nr Pt 1 .EQ delim $$ gsize 10 define dd % "..." % define sE % font 4 E % .EN .S 10 .am DE .ls 2 .sp .. .nr Hb 7 .nr Hs 7 .nr Hu 7 .ds HP 11 11 11 11 11 11 11 .ds HF 3 3 3 3 3 3 3 .TL The 10\u20\d-th Zero of the Riemann Zeta Function and 70 Million of its Neighbors .AU "A. M. Odlyzko" AMO MH 11218 7286 2C-355 .AS 1 .ls 2 .P This paper presents the results of a computation of almost 79 million consecutive zeros of the Riemann zeta function near zero number $10 sup 20$, as well as of several other large sets of high zeros. These zeros lie about $10 sup 8$ times higher than previously calculated large sets of zeros, and their computation was made possible by a fast new algorithm invented by A. Scho\*:nhage and the author. Although the implementation of this algorithm that was used is not entirely rigorous due to incomplete control of roundoff errors, it appears to be highly accurate as well as fast, and the results indicate that all the computed zeros satisfy the Riemann Hypothesis. Various statistical studies of these zeros are presented. Some of these studies provide numerical evidence about conjectures that go even beyond the Riemann Hypothesis, and relate the distribution of zeros of the zeta function to that of eigenvalues of random matrices studied extensively in physics. Other studies compare the actual behavior of the zeta function to known asymptotic results. The computations described in this paper were carried out on a Cray \%X-MP supercomputer. .ls 1 .AE .MT 4 .ls 2 .H 1 "Introduction" The $10 sup 20$-th zero of the Riemann zeta function equals .DS 2 .EQ 1/2 ~+~ i^1 5202440115920747268.6290299 "..." ~. .EN .DE It and a few of its nearest neighbors are shown in Table\ 1.1. All told, almost 79 million zeros near the $10 sup 20$-th zero were computed. These zeros lie almost $10 sup 8$ times higher than any other large sets of zeros that had been computed before. 
This paper reports the statistics of these and some other high zeros and describes the algorithms that made these calculations possible. .P The Riemann Hypothesis (RH) has been subjected to a series of numerical investigations, starting with unpublished ones by Riemann. (See [Ed,\|Od3] for a history of these computations.) The latest result is that the RH is true for the first $1.5 times 10 sup 9$ zeros (i.e., all zeros up to height $^inf .EN .DE (the Lindelo\*:f conjecture). It would be nice to produce convincing numerical evidence about the actual size of $ zeta ( 1/2 + it ) $. However, this is hard to do. One difficulty is that near the $10 sup 20$-th zero, one has $t^approx^1.5 times 10 sup 19$, so that $t sup 1/6 ^approx^1570$, while $( log ^t) sup 2 ^approx^1950$, so it is hard even to distinguish between these two functions that have entirely different rates of growth. (Throughout the paper, $log^x$ denotes the natural logarithm of $x$.) .P Some of the data from the present computations might also be useful in other number theoretic investigations. For example, the Stark method [St] for obtaining lower bounds for imaginary quadratic number fields with small class numbers depends on knowledge of pairs of zeros of the zeta function that are very close together. (Another method for bounding class numbers, that of Montgomery and Weinberger [MW], depends on zeros of $L$-functions.) .P The final reason for the computations of this paper was to demonstrate that the new algorithm of [OS] is of practical use, and not just a theoretical curiosity. Since this algorithm is fairly complicated, this was not obvious to start with, and a large section of this paper is devoted to a description of the implementation, including various modifications that were made to the basic algorithm described in [OS]. As it turns out, the algorithm is very fast, over $ 10 sup 5 $ times faster than the older algorithms would have been near the $ 10 sup 20 $-th zero. 
Moreover, work on this implementation has suggested many additional modifications, described in Section\ 4.6, which can probably speed up the algorithm by another order of magnitude. .P The main sets of zeros that were computed are listed in Table\ 1.2. The entry for $N=10 sup 20$, for example, means that 78,\|893,\|234 zeros were computed, starting with zero number $10 sup 20^-^30,^769,^710$, and ending with zero number $10 sup 20^+^ 48,^123,^523$, and that all these zeros are of the form $1/2^+^i gamma$ with $gamma ^approx^1.5 times 10 sup 19$. Throughout the paper, references to the $N= 10 sup 20$ data set will denote these 78,\|893,\|234 zeros or some subset of them and similarly for the $N=10 sup 19 ,^dd$, data sets. .P The starting points for the large data sets listed in Table\ 1.2 were chosen to be near zeros of round order (such as $10 sup 20$), so as to be easy to refer to. It was thought that as far as the distribution of zeros is concerned, these intervals would behave like random ones. Another approach is to concentrate on investigating the behavior of $zeta (1/2 + i t)$ near those $t$ where the zeta function might be expected to behave in an unusual fashion (e.g., where it is very large). Some such special values of $t$ were found, and the computations that were carried out there are listed in tables\ 3.1.1 and 3.1.2. (A full explanation of the entries in these tables is given in Section\ 3.) These computations produced many values of the zeta function and of gaps between zeros that are current records. .P While the computations that are described in this paper did obtain values of zeta function zeros at much greater heights than would be feasible with older methods, they do have one fairly serious defect, namely that they are not rigorous. 
The validity of the values for the zeros that have been computed (and also of the assertion that all these zeros satisfy the RH) depends on the assumption that substantial cancellation among the errors due to roundoff takes place. This is due largely to the extremely large sizes of the numbers being handled, and not so much to the new algorithm, and is explained in detail in Section\ 4. At this point we only mention that the values of zeros that have been obtained are believed to be very accurate (to within $+-^10 sup -6$ or even better for $N= 10 sup 20$). This belief is based partially on the expected cancellation of errors in the computation. The strongest argument for the validity of the computations, however, comes from the fact that several large sets of zeros were computed twice, in entirely different ways. The fact that the numbers being computed were the same follows only from deep mathematical analysis, and is not obvious from the numbers being processed. That the resulting sets of values for the zeros agreed to the expected degree is a very strong argument in favor of validity of the results. These issues are discussed in greater detail in Section\ 4.5. .P The remainder of this paper is organized into three sections. Section\ 2 recalls the basic definitions and conjectures, and then presents the statistics of the large sets of zeros given in Table\ 1.2. Section\ 2 is organized into subsections on a variety of topics, such as large values of the zeta function, large and small gaps between consecutive zeros, and many others. .P Section\ 3 is devoted to the zeros listed in Table\ 3.1.1. First the statistics of these zeros and of various properties of the zeta function in those ranges are presented. 
Then some simultaneous Diophantine approximation algorithms (based on the Lova\*'sz lattice basis reduction algorithm [LLL]) are described, as well as the ways in which they have been used to produce the points of Table\ 3.1.1 at which the zeta function was expected to behave pathologically, and where it does exhibit unusual behavior. .P Section\ 4 describes the algorithms and computations on which this paper is based. First the basic algorithm of [OS] is briefly surveyed, and then the various modifications that have been made to it are described. (Some are very minor, while others, such as the use of band-limited function interpolation, are much more substantial.) A discussion of various possible modifications that might be utilized in the future is included (such as the replacement of the crucial rational function evaluation algorithm of [OS] by somewhat similar algorithms that have been proposed in the context of astrophysical and fluid dynamics simulations [GR1], or ways to obtain more rigorous results). There is also a large subsection on the accuracy and validity of the computations of this paper. .H 1 "Large sets of zeros: conjectures and statistics" .HU "2.0\0 Notation and definitions" .P The trivial zeros of the zeta function are $-2,^-4,^-6,^dd$. We will consider only the \f2nontrivial zeros\f1, which lie in the critical strip $0^<^roman Re (s) ^<^1$, and are customarily denoted by $rho$. Since for every nontrivial zero $rho$, $rho bar$ is also a zero, we will consider only zeros $rho$ with $roman Im ( rho ) ^>^0$. (There are no nontrivial zeros $rho$ with $roman Im ( rho ) ^=^0$.) We number these zeros $rho sub 1 ,^rho sub 2 ,^dd$ (counting each according to its multiplicity) so that $0^<^roman Im ( rho sub 1 ) ^<=^roman Im ( rho sub 2 ) ^<=^dd$. 
All the zeros that have been computed so far are simple and lie on the critical line, and so can be written as $rho sub n ^=^1 over 2 ^+^i gamma sub n$, $gamma sub n ^member^R sup +$, with $gamma sub 1 ^=^14.134725 dd$, $gamma sub 2 ^=^21.022039 dd$, $gamma sub 3 ^=^25.010857 dd$, etc. In many definitions throughout the paper we will be tacitly assuming that the RH holds, as otherwise those definitions might not make sense. .P Let $N(t)$ denote the number of zeros $rho$ with $0^<^roman Im ( rho ) ^<=^t$ (counted according to their multiplicity). Then it is known unconditionally [Tit2; Chapter\ 9.3] that .DS 2 .EQ (2.0.1) N(t) ~=~ t over {2 pi} ~ log ~ t over {2 pi e} ~+~ O ( log ^t) ~~~roman as ~~~t^->^inf ^. .EN .DE Since the zeros become denser as the height increases, and the average vertical spacing between zeros at height $t$ is asymptotic to $2 pi / ( log ( t/(2 pi )))$, we define the normalized spacing between consecutive zeros $1/2 + i gamma sub n$ and $1/2 + i gamma sub n+1$ to be .DS 2 .EQ (2.0.2) delta sub n ~=~ ( gamma sub n+1 ^-^ gamma sub n ) ~ {log ( gamma sub n / ( 2 pi ))} over {2 pi} ^. .EN .DE (Here we are assuming that both zeros satisfy the RH.) It then follows from (2.0.1) that the $delta sub n$ have mean value 1 in the sense that for positive integers $N$ and $M$, .DS 2 .EQ (2.0.3) sum from n=N+1 to N+M ~ delta sub n ~=~ M ~+~ O ( log (NM)) ^. .EN .DE .P For $t$ real and positive (as will be the case throughout the paper) we define .DS 2 .EQ (2.0.4) theta (t) ~=~ roman arg [ pi sup -it/2 ^GAMMA ( 1/4 ^+^ it/2 ) ] ^, .EN .DE where the argument is defined by continuous variation of $s$ in $pi sup -s/2 ^GAMMA ( s/2 )$, starting at $s=1/2$ and going up vertically. We also let .DS 2 .EQ (2.0.5) Z(t) ~=~ exp (i theta (t)) ^zeta ( 1/2 + it ) ^. .EN .DE Then it follows from the functional equation of the zeta function that $Z(t)$ is real, and sign changes of $Z(t)$ correspond to zeros of $zeta (s)$ on the critical line. 
Almost all computations of the zeta function on the critical line actually calculate $Z(t)$ (cf.\ Section\ 4). .P The function $theta (t)$ is monotonic increasing for $t^>=^7$. For $n^>=^-1$, we define the \%$n$-th \f2Gram point\f1 $g sub n$ to be the unique solution $>^7$ to .DS 2 .EQ (2.0.6) theta ( g sub n ) ~=~ n pi ^. .EN .DE We have $g sub -1 ^=^ 9.666 dd$, $g sub 0 ^=^ 17.845 dd$, etc. Gram points are about as dense as the zeros of $zeta (s)$ (see Section\ 2.12 for a detailed discussion), but are much more regularly distributed. In graphs, by a \f2Gram point scale\f1 we will refer to labeling Gram point $g sub n$ by $n$ (or $n-M$ for some fixed $M$ as $n$ varies). For example, Fig.\ 2.0.1 shows $Z(t)$ near zero number $10 sup 20$. Figure\ 2.0.3 shows $Z(t)$ over a somewhat wider range. .P We let .DS 2 .EQ (2.0.7) S(t)~=~ pi sup -1 ^roman arg ^zeta (1/2 + it ) ^, .EN .DE where the argument is defined by continuous variation of $s$ in $zeta (s)$, starting at $s=2$, going up vertically to $s=2+it$, and then horizontally to $s=1/2 + it$. (This definition assumes that there are no zeros $rho$ with $roman Im ( rho ) ^=^t$.) The function $S(t)$ has jump discontinuities at heights equal to zeros. We have .DS 2 .EQ (2.0.8) N(t) ~=~ 1 ~+~ pi sup -1 theta (t) ~+~ S(t) ^, .EN .DE so that (2.0.1) is actually a consequence of the asymptotic expansion of $theta (t)$ (which follows from Stirling's formula [HMF]) .DS 2 .EQ (2.0.9) theta (t) ~=~ 1 over 2 ^t ^log (t/( 2 pi e )) ~-~ pi /8 ~+~ O ( t sup -1 ) ~~~roman as ~~~ t^->^ inf .EN .DE and the bound [Tit2; Theorem\ 9.4] .DS 2 .EQ (2.0.10) | S(t) | ~=~ O ( log ^t) ~~~roman as ~~~ t^->^ inf ^. .EN .DE Since $N(t)$ is an integer, and $theta (t)$ is very smooth, (2.0.8) shows that $S(t)$ jumps at zeros and decreases at a very steady rate between zeros. Figure\ 2.0.2 shows $S(t)$ over the same range of values of $t$ as in Fig.\ 2.0.1, in the vicinity of zero number $10 sup 20$. 
This range represents fairly typical behavior of $S(t)$ at that height. (For very unusual behavior of $S(t)$, see Fig.\ 3.1.3.) The function $S(t)$ is of crucial importance in understanding the distribution of zeros, and sections\ 2.4,\|2.12, and 2.13 are devoted largely to its properties. .P In comparing empirical distributions of various functions, such as $S(t)$ and $delta sub n$, to their conjectured distributions, we will rely fairly extensively on comparing the moments of their distributions. The method of moments has fallen into some disrepute in statistics because of its many faults, such as lack of robustness. (For example, a single outlier in the data can have a large effect, something we will see in our data.) However, there are some good reasons for using it. One is that it is easy to apply. A more substantial one is that for many of the statistics of the zeta function, such as those of $S(t)$, or of $Z(t)$, computation of moments is currently essentially the only known tool that can be used to obtain rigorous results. In such cases moments provide the most direct way of comparing empirical distributions to theoretical results. .P If a sequence of probability measures with distribution functions $F sub n (x)$ is such that for every $k^>=^0$, the \%$k$-th moment .DS 2 .EQ mu sub n (k) ~=~ int ~ x sup k ^dF sub n (x) .EN .DE converges to $mu (k)$ as $n^->^inf$, then there is a limiting measure with distribution $F(x)$ whose \%$k$-th moment is $mu (k)$. Furthermore, if the $mu (k)$ determine their measure uniquely, and this measure has distribution function $F(x)$, then the $F sub n (x)$ converge to $F(x)$ (in the weak star sense) [Bil; pp.\ 342-353]. The $mu (k)$ determine $F(x)$ uniquely if they do not grow too fast [Bil],\|[Fel; pp.\ 227-228], so that the normal distribution, for example, is characterized by its moments. 
On the other hand, the log-normal distribution (distribution of $exp ( eta )$, where $eta$ is normal) is not determined uniquely by its moments [Bil],\|[Fel]. .P The standard normal distribution has the density function .DS 2 .EQ (2.0.11) f(x) ~=~ ( 2 pi ) sup {- 1/2} ~e sup {-x sup 2 /2} ^, .EN .DE and so has mean 0 and variance 1. In many cases we will be dealing with quantities (such as $S(t)$) whose known asymptotic distributions are normal, but which have variances on the order of $log^log^N$ (for zeros near zero number $N$). Since $log^log^N$ grows very slowly, it is to be expected that the observed data will have somewhat different variances, as second order terms are likely to be substantial. (For $N=10 sup 20$, $log^log^N^=^3.82976 dd$, so even an additive constant of 1 in the estimate of the variance makes a huge difference.) On the other hand, it is not too unreasonable to hope that the shape of the distribution should be close to the expected one. To carry out such a comparison, we will often use \f2scaled and translated empirical distribution\f1. If $x sub 1 ,^dd ,^x sub n$ are samples (of $delta sub m$, say, or other quantities) with mean $a$ and variance $v= sigma sup 2$ (so that $sigma$ is the \f2standard deviation\f1, or \f2rms value\f1), .DS 3 .EQ (2.0.12) a mark ~=~ 1 over n ~ sum from j=1 to n ~ x sub j ^, .EN .sp .EQ (2.0.13) v lineup ~=~ 1 over n ~ sum from j=1 to n ~ (x sub j -a ) sup 2 ^, .EN .DE then the scaled and translated values will be .DS 2 .EQ (2.0.14) x sub j sup star ~=~ (x sub j -a ) / sigma ^. .EN .DE The $x sub j sup star$ have mean 0 and variance 1. The tables will usually list the \%$k$-th moment of $x sub j sup star$ in the \%$k$-th entry, but there will be entries giving the ordinary mean $a$ and ordinary variance $v$ that will be marked $k=1 sup star$ and $k=2 sup star$, respectively. In a few cases where the mean $a$ is extremely small, we will use $x sub j sup star ^=^x sub j / sigma$. 
(These cases will be easy to distinguish because the scaled \%1-st moment will not be 0.) .P Throughout this paper, numbers which have ``$dd$'' at the end are truncated to the form that is shown, while those without ``$dd$'' are rounded, but the rounding is sometimes up and sometimes down. Thus, for example, $pi$ could be represented as 3.14159..., as 3.14159, or as 3.14160. The log function will always refer to the natural logarithm. References to maximal values of function $f(x)$ will usually mean the values of $f(x)$ for which $|f(x)|$ is maximal. .P Constants such as $n sub 0 ,^n sub 1 ,^dd$, will generally be different in different sections, but will be the same within a section. .H 2 "Validity of the RH and correctness of the computational results" The main question about the validity of the computations described in this paper has to do with size and cancellation of roundoff errors. This issue is discussed in detail in Section\ 4. Even if we assume that roundoff errors are small (as they seem to be), there remains some further lack of rigor. The set of zeros corresponding to $N=10 sup 12$, for example, is claimed to consist of exactly the zeros numbered $10 sup 12 - 6,^032$ to $10 sup 12 + 1,^586,^163$. Those 1,\|592,\|196 values are indeed zeros of the zeta function between Gram points of orders $10 sup 12 - 6,^034$ and $10 sup 12 + 1,^586,^162$, provided all the computational steps were correct. Given the degree of regularity in the locations of those zeros, a theorem of Turing (see [Br5; Theorem\ 3.2] for a modified and corrected version) allows us to conclude, for example, that the \%21-st through the 1,\|592,\|176-th zeros in our set are indeed zeros numbered $10 sup 12 - 6,^012$ through $10 sup 12 + 1,^586,^143$. However, this theorem does not exclude the possibility that, for example, the interval between Gram points $10 sup 12 - 6,^034$ and $10 sup 12 - 6,^014$ might contain some additional zeros. 
Since such additional zeros would violate either Rosser's rule (see Section\ 2.13) or even the RH, they seem unlikely to exist, and in any event would not affect most of the statistics very much, and so were assumed not to exist. .P There are some further cases of nonrigorous computations in this paper. For example, the conjectured distribution of the $delta sub n$ (see Section\ 2.2) is complicated, and (as was done in [Od2]) was computed using Van\ Buren's program [VB], with some modifications by S.\ P. Lloyd and this author. This program uses an involved combination of variational procedures and special function expansions, and no rigorous error analysis for it is known, although it appears to be very accurate (cf.\ [Od2]). .P Other examples of nonrigorous computation are presented by various piecewise linear approximations and other interpolation schemes used in the following sections. They are all thought to produce accurate results, but no proofs are available. .H 2 "Eigenvalues of random matrices and zeros" Over the last few decades, an extensive collection of results about eigenvalues of certain types of random matrices has been obtained by mathematical physicists. The aim of these investigations was to obtain insight into the distribution of energy levels in heavy nuclei, and recently their results have been applied to studies of energy levels in other kinds of many-particle systems. Some of the references for this field are [Be1, Be2, Be3, BG, BGS1, BGS2, BFFMPW, Meh, Por]. Not only are there many beautiful and mathematically rigorous results in this area, but there is also experimental evidence that these results do describe behavior of physical systems [HPB]. (It should be mentioned that due to the difficulty of the experiments, the physical data, which was obtained with a lot of effort over the span of several decades, is very sparse and of poor quality compared to the data that can be obtained for the zeta function.) 
.P The eigenvalue results that will be of greatest interest to us are those of the Gaussian unitary ensemble (GUE), which together with the Gaussian orthogonal ensemble (GOE) and the Gaussian symplectic ensemble (GSE) has been studied very extensively. The GUE consists of $n times n$ complex Hermitian matrices of the form $A^=^ (a sub jk )$, where $a sub jj ^=^2 sup 1/2 ^sigma sub jj$, $a sub jk = sigma sub jk + i eta sub jk$ for $j^<^k$, and $a sub jk = a bar sub kj = sigma sub kj - i eta sub kj$ for $j^>^k$, where the $sigma sub jk$ and $eta sub jk$ are independent standard normal variables. (The GOE consists of real symmetric matrices defined similarly.) The eigenvalues of these matrices are real, and it is their asymptotic distribution, as $n^->^inf$, that is of interest. If we denote the eigenvalues by $lambda sub 1 ^<=^lambda sub 2 ^<=^...^<=^lambda sub n$, then one has the \f2Wigner semi-circle law\f1: if $M(x)$ denotes the expected number of eigenvalues $<=^x$, then for all fixed real $x$, .DS 2 .EQ (2.2.1) lim from {n^->^inf} ~ n sup -1 ^M( x sqrt n ) ~=~ left { matrix { ccol { 1 over {2 pi} ^int from -2 to x ^(4-u sup 2 ) sup 1/2 du ^,~~~ above 0 from "" to "" ^, ~~~ above 1 from "" to "" ^,~~~} lcol {|x|^<^ 2 ^, above x^<=^-2 ^, above x^>=^2 ^.} } .EN .DE This distribution law applies to much more general classes of matrices than those of the GUE and related ensembles. In the case of the GUE (and also of the GOE and GSE) a further step is possible in that one can obtain very precise information about the distribution of spacings between consecutive eigenvalues. The complete distribution of eigenvalues is known, and one can derive many limit laws. To do that one normalizes the eigenvalues (basically by stretching the distance between consecutive eigenvalues $lambda ^<^ lambda sup prime$ by a factor of $(4n - lambda sup 2 ) sup 1/2 / ( 2 pi )$) so as to make the average nearest neighbor spacing equal to 1. 
With this normalization, the distribution of eigenvalues looks the same everywhere (in the limit as $n^->^inf$) and one can in principle determine any desired statistic of the zeros. (Doing so in practice means evaluating a definite multidimensional integral, which is often hard, and gives rise to interesting problems.) For example, if we use $w$ to denote a normalized eigenvalue in the GUE, then one finds that for any fixed $0^<=^alpha ^<^beta ^<^ inf$, .DS 2 .EQ (2.2.2) sE ( | "{" ( w, w sup prime ) ^:~ w^<^w sup prime ^, ~~ w sup prime ^-^ w^member^[ alpha ,^beta ] "}" | ) ~wig~ int from alpha to beta ~ left ( 1^-^ left ( {sin ^pi u} over {pi u} right ) sup 2 right ) ^du .EN .DE as $n^->^inf$, where $sE (z)$ is the expectation of $z$. We say that $1^-^ ( ( sin ^pi u ) / ( pi u ) ) sup 2$ is the .I pair correlation function .R of the GUE. (The pair correlation functions of the GOE and the GSE are different.) Equation\ (2.2.2) shows, for example, that it is rare for GUE eigenvalues to be close together. If the $w$'s were obtained by choosing $n$ points independently and uniformly from the interval $[0,^n]$ and letting $n^->^inf$, the pair correlation function would be identically 1. The GUE pair correlation function in the range $0^<=^u^<=^3$ is drawn as the solid curve in Fig.\ 2.3.1, and is far from a constant. .P If $w$ is a normalized eigenvalue of the GUE, we let $w sup (k)$ denote the \%$k$-th smallest normalized eigenvalue of those that are $>^w$. Then it is known that the \%$k$-th nearest spacings $w sup (k) - w$ satisfy a distribution law; for all $0^<=^alpha ^<^beta ^<^ inf$, .DS 2 .EQ (2.2.3) roman Prob ( w sup (k) ^-^w ^member^ [ alpha ,^beta ] ) ~ wig~ int from alpha to beta ~ p( k-1,^u) du .EN .DE as $n^->^inf$. The probability densities $p(k,^u)$ (referred to as $p sub 2 (k;^u)$ in many publications, such as [CM2; Mch, MdC], where the subscript 2 denotes the GUE) are complicated functions defined in terms of linear prolate spheroidal functions. 
For methods of computing them, see [MdC,\|Od2]. Graphs of $p(0,^u)$ and $p(1,^u)$ are given by the solid lines in figures\ 2.3.4 and 2.3.6, respectively. Those graphs show the ``rigidity'' of the GUE; the eigenvalues repel each other and most of the time are close to the expected distance from their neighbors. For all $u^>=^0$, .DS 2 .EQ (2.2.4) 1 ~-~ left ( {sin ^pi u} over {pi u} right ) sup 2 ~=~ sum from k=0 to inf ~ p(k,^u ) ^. .EN .DE We note for future reference that the $p(k,^u)$ have the following Taylor series expansions around 0 [Meh,\|MdC]: .DS 3 .EQ (2.2.5) p(0,^u) mark ~=~ {pi sup 2} over 3 ^u sup 2 ~-~ {2 pi sup 4} over 45 ^u sup 4 ~+~ {pi sup 6} over 315 ^u sup 6 ^+^ ... ^, .EN .sp .EQ (2.2.6) p(1,^u) lineup ~=~ {pi sup 6} over 4050 ^u sup 7 ~+~ ... ^, .EN .sp .EQ (2.2.7) p(2,^u) lineup ~=~ {pi sup 12} over 5358150000 ^u sup 14 ~+~... ^. .EN .DE .P The normalized eigenvalues in the GUE have (in the limit as $n^->^inf$) a stationary distribution. This means that clusters of eigenvalues have the same distribution no matter where in the spectrum they are located. However, this distribution is not Markovian, so that the distribution of an eigenvalue depends not just on the preceding eigenvalue, but on all previous ones as well. .P The basic results about distribution of GUE eigenvalues are completely rigorous. However, they do have many gaps. One of them is that the results are obtained by averaging over the full ensemble of GUE matrices. It is conjectured that if one considers a large random GUE matrix, the distribution of its eigenvalues will be close to that of the entire ensemble with high probability. Although numerical calculations confirm this conjecture, there is no proof of it. Also, it is thought that entries of the matrix do not have to be of exactly the form specified above for the GUE result to hold. 
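.P
As a small numerical illustration of how the expansions (2.2.5)\-(2.2.7) fit together with (2.2.4), the following Python sketch compares the pair correlation function $1 - (( sin ^pi u ) / ( pi u )) sup 2$ with the three listed terms of $p(0,^u)$ for small $u$. Since $p(1,^u)$ only begins at order $u sup 7$, the two should agree closely near 0. The function names are illustrative choices and are not taken from the paper or from any program used in it.

```python
import math


def pair_correlation(u):
    """GUE pair correlation 1 - (sin(pi u)/(pi u))^2, with its limit 0 at u = 0."""
    if u == 0.0:
        return 0.0
    s = math.sin(math.pi * u) / (math.pi * u)
    return 1.0 - s * s


def p0_taylor(u):
    """The three terms of p(0, u) listed in (2.2.5); higher-order terms are dropped."""
    x2 = (math.pi * u) ** 2
    return x2 / 3.0 - 2.0 * x2 ** 2 / 45.0 + x2 ** 3 / 315.0


def p1_leading(u):
    """Leading term of p(1, u) from (2.2.6)."""
    return math.pi ** 6 / 4050.0 * u ** 7


if __name__ == "__main__":
    # The difference in the last column should shrink rapidly as u decreases.
    for u in (0.05, 0.1, 0.2, 0.4):
        pc = pair_correlation(u)
        p0 = p0_taylor(u)
        print(f"u={u:4.2f}  pair={pc:.10f}  p0~{p0:.10f}  "
              f"p1~{p1_leading(u):.3e}  diff={pc - p0:.3e}")
```

The truncated series is only meaningful for small $u$; for $u$ near 1 the dropped terms dominate and the comparison says nothing.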
.P The main goal of this paper (and of the preceding paper [Od2]) is to test the conjecture, which will be referred to as the \f2GUE hypothesis\f1, the \f2GUE theory\f1, or simply the \f2GUE\f1, that the zeros of the zeta function behave like eigenvalues of the GUE. More precisely, it is conjectured that the $delta sub n$ behave asymptotically like $w sup (1) ^-^w$ in the GUE, so that for any $0^<=^alpha ^<^ beta ^<^ inf$, .DS 2 .EQ (2.2.8) M sup -1 | "{" n^:~ N+1^<=^n^<=^N+M ,~~delta sub n ^member^ [ alpha ,^beta ] "}" | ~wig~ int from alpha to beta ^p(0,^u) du .EN .DE as $M,^N^->^inf$ with $M$ not too small compared with $N$. Similarly, it is conjectured that .DS 2 .EQ (2.2.9) M sup -1 | "{"n^:~ N+1^<=^n^<=^N+M ,~~delta sub n + delta sub n+1 ^member^ [ alpha ,^beta ] "}" | ~wig~ int from alpha to beta ~ p(1,^u) du ^. .EN .DE More generally, the same reasoning leads one to expect that for any $k$, the empirical distribution function of $delta sub n ,^delta sub n+1 ,^dd ,^delta sub n+k$ for $N+1^<=^n^<=^N+M$ approaches the stationary process that holds for the GUE. .P The GUE hypothesis is of interest because if it is true, it might be interpreted as providing some support for the Hilbert and Po\*'lya conjectures [Be2,\|Be3,\|Mon1,\|Od3] which predict that the RH is true because the zeros of the zeta function correspond to eigenvalues of a positive linear operator. The argument is that if such an operator exists, its eigenvalues might be similar to those of a random operator (especially if, as is conjectured for the GUE, most random operators have very similar eigenvalue distributions), and a random linear operator ought to be the limit of a sequence of random matrices. .P If the GUE hypothesis were true, that would also be of interest in physics, as the zeta function could then be used as a model of quantum chaos [Be2,\|Be3]. .P The main theoretical support and inspiration for the GUE hypothesis comes from H. 
Montgomery's work on the pair correlation function of the zeros of the zeta function. Under the assumption of the RH, Montgomery showed [Mon1,\|Mon2] that if we define .DS 2 .EQ (2.2.10) F( alpha ,^T) ~=~ 2 pi (T^log^T) sup -1 ~sum from {pile {0^<^gamma^<=^T above 0^<^gamma sup prime ^<=^T}} ~ T sup {i alpha ( gamma - gamma sup prime )} ~ 4 over {4+( gamma - gamma sup prime ) sup 2} .EN .DE for $alpha$ and $T$ real, $T^>=^2$, then .DS 2 .EQ (2.2.11) F( alpha ,^T) ~~=~ (1+ o(1)) T sup {- 2 alpha} ^log ^T ^+^ alpha ^+^o(1) ~~~roman as~~~ T^->^inf ^, .EN .DE uniformly for $0^<=^alpha ^<=^1$. Montgomery also observed that if the primes are distributed sufficiently uniformly in arithmetic progressions, then .DS 2 .EQ (2.2.12) F( alpha ,^T ) ~=~1^+^ o(1) ~~~~roman as ~~~~T^->^inf .EN .DE uniformly for $alpha ^member^[a,^b]$, where $1^<=^a^<^b^<^inf$ are any constants. If the conjecture (2.2.12) were true, then one would find that for any $0^<^alpha ^<^beta ^<^inf$, .DS 3 .EQ N sup -1 ^| "{" (n,^k) ^:~ mark 1^<=^n^<=^N ,~ k^>=^ 0 ,~~ delta sub n ^+^ delta sub n+1 ~+~...~+~ delta sub n+k ^member^[ alpha ,^beta ] "}" | .EN .sp .5 .EQ (2.2.13) ~~ .EN .sp .5 .EQ lineup ~wig~ int from alpha to beta ~ left ( 1 ^-^ left ( {sin ^pi u} over {pi u} right ) sup 2 right ) ^du .EN .DE as $N^->^inf$. The relation (2.2.13) is known as the Montgomery pair correlation conjecture. It says that the pair correlation of the zeros of the zeta function is the same as that of the GUE. Since the pair correlations of the GOE and GSE are different, (indeed, they are even inconsistent with (2.2.11)), this leads one to expect that the zeros might behave like eigenvalues of the GUE rather than GOE or GSE, and this is the reason that only the GUE distributions were presented above. (One possible implication of this observation is that the hypothetical Hilbert-Po\*'lya operator is likely to be complex.) 
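.P
The two sides of the pair correlation conjecture (2.2.13) can be sketched in a few lines of Python. The left side is a count of pairs $(n,^k)$ whose partial sums $delta sub n ^+^...^+^ delta sub n+k$ fall in $[ alpha ,^beta ]$, and the right side is an integral of $1 - (( sin ^pi u ) / ( pi u )) sup 2$. The sketch below is only an illustration under stated assumptions: the names are hypothetical, the integral is approximated by composite Simpson's rule (not a method taken from this paper), and partial sums that would run past the end of the finite list of spacings are simply dropped.

```python
import math


def gue_pair_density(u):
    """Integrand of (2.2.13): 1 - (sin(pi u)/(pi u))^2, equal to 0 at u = 0."""
    if u == 0.0:
        return 0.0
    s = math.sin(math.pi * u) / (math.pi * u)
    return 1.0 - s * s


def predicted_mass(alpha, beta, steps=1000):
    """Composite Simpson approximation to the right side of (2.2.13)."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = (beta - alpha) / steps
    total = gue_pair_density(alpha) + gue_pair_density(beta)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * gue_pair_density(alpha + i * h)
    return total * h / 3.0


def empirical_mass(deltas, alpha, beta):
    """Left side of (2.2.13) for a finite list of nonnegative normalized spacings."""
    count = 0
    n = len(deltas)
    for start in range(n):
        running = 0.0
        for k in range(start, n):
            running += deltas[k]
            if running > beta:
                break  # spacings are nonnegative, so later sums only grow
            if running >= alpha:
                count += 1
    return count / n


if __name__ == "__main__":
    # Small gaps are rare under the GUE prediction; independent uniform
    # points (pair correlation identically 1) would give mass 0.1 here.
    print(predicted_mass(0.0, 0.1))
```

With real data one would pass the computed $delta sub n$ to `empirical_mass` and compare the result with `predicted_mass` on the same interval.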
.P Montgomery's hypothetical result (2.2.11) and the conjectures (2.2.12) and (2.2.13) are the main theoretical evidence we have in favor of the GUE hypothesis, and the two conjectures depend on far-reaching assumptions about pseudorandom behavior of primes. Some further evidence in favor of the GUE hypothesis was provided by Ozluk [Oz1], who showed that if one considers a function similar to $F( alpha ,^T)$, but where one sums over zeros of many Dirichlet $L$-functions, then under the assumption of the Generalized Riemann Hypothesis for these $L$-functions, the analog of Montgomery's conjecture (2.2.12) is true for $1^<=^alpha ^<=^2$. Some further slight support for the GUE hypothesis is provided by new results of Ozluk [Oz2] on zeros of Dirichlet $L$-functions close to the real axis. .P Extensive numerical evidence in favor of the GUE hypothesis was presented in [Od2]. It was based largely on computed values of $gamma sub n$, with $1^<=^n^<=^10 sup 5$ and $10 sup 12 +1 ^<=^n^<=^10 sup 12 + 10 sup 5$. With some slight exceptions (such as the slight excess of very small $delta sub n$ that was mentioned in the Introduction) this evidence was in excellent agreement with the GUE hypothesis, and the degree of agreement improved dramatically as one went from the first $10 sup 5$ zeros to those near zero number $10 sup 12$. Some numerical evidence for the pair correlation conjecture for Dirichlet $L$-functions has been obtained since then by Hejhal [Hej5]. .P Various theoretical results and conjectures related to the GUE theories and the pair correlation conjecture have been obtained in recent years. Some of the references are [Be2, Be3, Be4, Fu8, Gal2, Gal3, Gal4, GM, Gol, Go2, Go3, GG, GHB, GM, HB1, Mue2]. .H 2 "General distribution of gaps between zeros" Figure\ 2.3.1 shows how well the pair correlation conjecture is satisfied. The solid line is the GUE prediction $y^=^1- (( sin ^pi x ) / ( pi x )) sup 2$. 
The scatterplot is based on approximately $8 times 10 sup 6$ zeros near zero number $10 sup 20$. Let .DS 3 .EQ n sub 1 mark ~=~ 10 sup 20 ~-~ 15,^409,^240, .EN .EQ n sub 2 lineup ~=~ 10 sup 20 ~-~ 13,^366,^460, .EN .EQ n sub 3 lineup ~=~ 10 sup 20 ~-~ 10,^302,^282, .EN .EQ n sub 4 lineup ~=~ 10 sup 20 ~-~ 6,^216,^711, .EN .EQ n sub 5 lineup ~=~ 10 sup 20 ~-~ 48,^778, .EN .EQ n sub 6 lineup ~=~ 10 sup 20 ~+~ 15,^316,^087, .EN .EQ n sub 7 lineup ~=~ 10 sup 20 ~+~ 46,^073,^204, .EN .EQ n sub 8 lineup ~=~ 10 sup 20 ~+~ 47,^098,^588, .EN .DE and .DS 2 .EQ V ~=~ "{" n^:~ n sub i ^<=^n^<^n sub i ^+^10 sup 6 ~~~roman {"for some"} ~~i, ~~~ 1^<=^i^<=^8 "}" ^. .EN .DE Then for each interval $I^=^[ alpha ,^beta )$ with $alpha = k/20$, $beta = alpha + 1/20$, $0^<=^k^<^60$, a star is placed at the point $x = ( alpha + beta ) /2$, $y = a sub {alpha ,^beta}$, where .DS 2 .EQ (2.3.1) a sub {alpha ,^beta} ~=~ 20 over {8 times 10 sup 6} ^left | ^"{"(n,^k) ^:~ n^member^V ,~~k^>=^0 ,^~~ delta sub n ^+^...^+^delta sub n+k ^member^ [ alpha ,^beta ) "}" right | ^. .EN .DE As can be seen, the agreement between the conjectured and observed values is excellent. .P Figure\ 2.3.2 presents similar data, but this time based on just $10 sup 6$ values of $n$; $n sub 9 ^<=^n^<^n sub 9 ^+^10 sup 6$, $n sub 9 = 10 sup 12 - 6,^032$. A comparison of these two graphs with figures\ 1 and 2 of [Od2] is instructive. Those figures show similar graphs, but based in each case on $10 sup 5$ zeros starting with zeros number 1 and $10 sup 12 +1$. The scatterplot of Fig.\ 2.3.2 is much smoother than that of Fig.\ 2 of [Od2], because the former is based on $10 sup 6$ instead of $10 sup 5$ samples, and so the sampling error is smaller. That same reason explains why the scatterplot of Fig.\ 2.3.1 looks smoother than that of Fig.\ 2.3.2.
Even if we make allowances for the different sample sizes, though, it is clear that the agreement between empirical and predicted values improves dramatically from $N=1$ to $N= 10 sup 12$, and improves some more between $N=10 sup 12$ and $N=10 sup 20$. In all cases, the empirical data has more pronounced peaks and troughs than expected, but this effect decreases as the height increases. .P Some of the pair correlation function oscillations can be seen even for normalized spacings that exceed 3. Figure\ 2.3.3 shows a graph based, just like Fig.\ 2.3.1, on $8 times 10 sup 6$ zeros near zero number $10 sup 20$. In this case, though, the scatterplot was smoothed slightly by applying the lowess function of [BC] (an implementation of Cleveland's robust locally weighted regression [Cle]). The reason for this smoothing is that even with $8 times 10 sup 6$ zeros, each of the $a sub {alpha ,^beta}$ defined in (2.3.1) corresponds to about $4 times 10 sup 5$ counts $(n,^k)$. Therefore we can expect random sampling errors on the order of $(4 times 10 sup 5 ) sup 1/2$, which gives a variation of about $1.6 times 10 sup -3$ in the value of $a sub {alpha ,^beta}$. Given the small variation in the GUE prediction $y=1- (( sin ^pi x ) /( pi x )) sup 2$ over the range $3^<=^x^<=^5$, this random sampling error produces a rather confusing picture if the data is not smoothed. (Another, but slightly less effective way to produce a better picture is to use sampling intervals larger than 1/20. The resulting picture is very similar to that of Fig.\ 2.3.3.) .P Figure\ 2.3.3 shows that the empirical pair correlation function, even for $N=10 sup 20$, has peaks and troughs that are more pronounced than those of the conjectured distribution, at least in the range $3^<^x^<^5$. This is also true in the range $5^<^x^<^10$.
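.P
The quantities $a sub {alpha ,^beta}$ of (2.3.1) can be tabulated with a short routine. The sketch below is an illustration only, not the program used for the figures; the function name and parameters are hypothetical, and the sums $delta sub n + ... + delta sub n+k$ are cut off at the end of the supplied list, which is an assumed simplification of the edge handling.

```python
def pair_correlation_histogram(deltas, bins_per_unit=20, num_bins=60):
    """Return a_{alpha,beta} for alpha = k/bins_per_unit, 0 <= k < num_bins.

    deltas is a list of positive normalized spacings delta_n.  For every
    starting index n the partial sums delta_n + ... + delta_{n+k} are
    accumulated until they leave the plotted range, and each sum is
    counted in the bin [alpha, beta) that contains it.
    """
    counts = [0] * num_bins
    n_total = len(deltas)
    for n in range(n_total):
        s = 0.0
        for k in range(n, n_total):
            s += deltas[k]
            idx = int(s * bins_per_unit)
            if idx >= num_bins:
                break  # spacings are positive, so later sums are larger still
            counts[idx] += 1
    scale = bins_per_unit / n_total  # the factor 20 / (8 * 10^6) in (2.3.1)
    return [scale * c for c in counts]


if __name__ == "__main__":
    # Toy input: perfectly regular spacings put all the mass at x = 1 and x = 2.
    a = pair_correlation_histogram([1.0] * 100)
    print(a[20], a[40])
```

With about $4 times 10 sup 5$ counts in a bin, the statistical error in the count is about $(4 times 10 sup 5 ) sup 1/2 ^approx^ 630$, and multiplying by $20 / (8 times 10 sup 6 )$ gives the figure of about $1.6 times 10 sup -3$ quoted above.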
.P Figures\ 2.3.4 and 2.3.5 show the distribution of the normalized spacings $delta sub n$ for $N=10 sup 12$ and $N=10 sup 20$, based on the 1,\|592,\|196 and 78,\|893,\|234 zeros, respectively, that have been computed. Thus, for example, in Fig.\ 2.3.4 a star is plotted at $x= ( alpha + beta ) /2$, $y= b sub {alpha ,^beta }$ for $alpha = k/20$, $beta = alpha + 1/20$, $0^<=^k^<=^59$, where .DS 2 .EQ (2.3.2) b sub {alpha ,^beta} ~=~ 20 over {1592195} ^left | ^ "{"n^:~ 10 sup 12 ^-^ 6032 ^<=^n^<=^10 sup 12 ^+^1586162,~~ delta sub n ^member^[ alpha ,^beta ) "}" right | ^. .EN .DE The solid lines are the GUE predictions, $y= p(0,^x)$. Similarly, figures\ 2.3.6 and 2.3.7 show the distribution of $delta sub n + delta sub n+1$. (Similar graphs based on the first $10 sup 5$ zeros are contained in [Od2].) .P The graphs show very good agreement between conjecture and numerical data, and, as was to be expected, the degree of agreement increases dramatically as one goes from $N=1$ to $N= 10 sup 12$, and then improves a bit more as one goes to $N=10 sup 20$. Moreover, the fact that the disagreement is greater for $delta sub n + delta sub n+1$ than for $delta sub n$ is to be expected, given that $S(t)$ is very small. (See Section\ 2.4 for a discussion of this.) .P A quantitative measure of the agreement between observed and conjectured distributions is shown in tables\ 2.3.1 through 2.3.3 which display moments of distributions. For each set of $M$ zeros, $K^<^n^<=^K+M$ $(M=1,^592,^196$ for $N=10 sup 12$, 78,\|893,\|234 for $N=10 sup 20$, etc.) Table\ 2.3.1 displays .DS 2 .EQ (2.3.3) (M-1) sup -1 ~ sum from {n=K+1} to {K+M-1} ~( delta sub n -1 ) sup k ^, .EN .DE while Table\ 2.3.2 shows .DS 2 .EQ (2.3.4) (M-2) sup -1 ~ sum from n=K+1 to K+M-2 ~ ( delta sub n + delta sub n+1 -2) sup k ^, .EN .DE in each case for $2^<=^k^<=^10$. (The values for $N=1$ are taken from [Od2].) Table\ 2.3.3 shows moments of $log ^delta sub n$, $delta sub n sup -1$, and $delta sub n sup -2$. 
In all cases the values predicted by the GUE are also shown. .P Tables\ 2.3.1 to 2.3.3 do show quite satisfactory agreement between observed values and conjectured ones, with the degree of agreement increasing as the height of the zeros increases. (The slightly anomalous value for the moment of $delta sub n sup -2$ for $N=10 sup 18$ is due to one very small $delta sub n$ that is very unusual and will be discussed in sections\ 2.5 and 2.7.) .P The Kolmogorov test [KS; Section\ 30.49] yields a method for measuring the agreement between the observed distribution of the $delta sub n$ and the GUE predictions. If samples $x sub 1 ,^dd ,^x sub n$ are drawn from a distribution with a continuous cumulative distribution function $F(z)$, let $F sub e (z)$ denote the sample distribution function: .DS 2 .EQ F sub e (z) ~=~ n sup -1 | "{"k^:~ 1^<=^k^<=^n ,~~ x sub k ^<=^z "}" | ^. .EN .DE The Kolmogorov statistic is then .DS 2 .EQ (2.3.5) D ~=~ roman {"sup"} from z ~ |F sub e (z) ^-^F(z) | ^. .EN .DE If the $x sub i$ are drawn from the distribution corresponding to $F(z)$, then [KS; Eq.\ 30.132] .DS 2 .EQ (2.3.6) lim from {n^->^inf} ~ roman Prob (D^>^ un sup {- 1/2} ) ~=~ g(u) ^, .EN .DE where .DS 2 .EQ (2.3.7) g(u) ~=~ 2 ~ sum from r=1 to inf ~ (-1) sup r-1 ~exp (-2r sup 2 u sup 2 ) ^. .EN .DE Table\ 2.3.4 gives the Kolmogorov statistic $D$ for $delta sub n$ and $delta sub n + delta sub n+1$ for several blocks of $10 sup 6$ consecutive values of $n$. The set denoted by $N=10 sup 12$ corresponds to $n sub 9^<=^n^<^n sub 9 ^+^10 sup 6$; the ones denoted by $N=10 sup 20 (a)$, $N=10 sup 20 (b)$, and $N= 10 sup 20 (c)$ start at $n=n sub 6$, $n=n sub 8$ and $n=n sub 5$, respectively, where the $n sub i$ were defined at the beginning of this section. The ``$N=10 sup 12$ vs. GUE'' entry, for example, gives the Kolmogorov statistic of the $N=10 sup 12$ set when it is compared to the GUE distribution. 
For each value of $D$, the ``prob.'' column gives an estimate of the probability that a value of $D$ at least this large would arise if the $delta sub n$ ($delta sub n + delta sub n+1$, respectively) were drawn independently for each $n$ from the GUE distribution. This estimate is obtained by evaluating $g( D^times ^1000)$, since $n sup 1/2 ^=^1000$ for $n^=^10 sup 6$ in (2.3.6). The ``$N=10 sup 20 (a)$ vs. $N=10 sup 20 (b)$'' row of the table was obtained by constructing a continuous distribution from the $N=10 sup 20 (b)$ data and computing the Kolmogorov statistic for the discrete $N=10 sup 20 (a)$ data against this continuous distribution. .P What is apparent from Table\ 2.3.4 is that as the height increases, the empirical distributions of $delta sub n$ and $delta sub n + delta sub n+1$ do approach those of the GUE. In fact, when one computes the $D$ statistic for the $delta sub n$ in the 10 blocks of $10 sup 5$ consecutive zeros that are contained in the $N=10 sup 20 (b)$ set, one obtains values ranging between 0.002 and 0.0031, which correspond to probabilities of between 0.83 and 0.3 of occurring if the $delta sub n$ were drawn from the GUE distribution. Thus for sets of $10 sup 5$ zeros around zero number $10 sup 20$, it is essentially impossible to distinguish the empirical distribution of the $delta sub n$ from the expected one. (For $delta sub n + delta sub n+1$, the corresponding $D$ values are 0.0035 and 0.00555, which give probabilities of 0.17 and 0.004, so the fit here is slightly worse.) .P The comparison of the three different sets of $10 sup 6$ zeros near zero number $10 sup 20$ to each other is quite revealing. The Kolmogorov statistics $D$ are quite small (especially for $delta sub n$), and indicate that all three sets come from essentially the same distribution.
Thus what seems to be happening is that at each height, when we examine large sets of zeros, the $delta sub n$ and $delta sub n + delta sub n+1$ behave as if they were drawn independently from some distributions that depend on $t$, change relatively slowly as $t$ changes, and tend to the GUE distributions as $t^->^inf$. .H 2 "Values of $bold S(t)$" The upper bound (2.0.10) for $S(t)$ is the best that is known unconditionally. The Lindelo\*:f Hypothesis (see Section\ 2.8) implies that $|S(t)| ^=^o( log ^t)$ as $t^->^inf$, while the RH implies [Tit2] that .DS 2 .EQ (2.4.1) |S(t)| ~=~ O left ( {log^t} over {log^log^t} right ) ~~~roman as ~~~ t^->^inf ^. .EN .DE The true rate of growth is thought to be much smaller. The best lower bound that has been proved under the RH is due to Montgomery [Mon3], and gives .DS 2 .EQ (2.4.2) S(t) ~=~ OMEGA sub +- left ( left ( {log^t} over {log^log^t} right ) sup 1/2 right ) ~~~roman as ~~~ t^->^inf ^. .EN .DE (The best unconditional bound, due to Tsang [Ts1,\|Ts2], replaces the square root in (2.4.2) by a cube root.) Montgomery [Mon3] has conjectured that the quantity on the right side of (2.4.2) represents the correct rate of growth of $S(t)$, and Joyner [Joy2] has presented a heuristic argument supporting this conjecture. As we will see in Section\ 2.5, the GUE suggests that $|S(t)|$ might occasionally get as large as $( log ^t) sup 1/2$, which would contradict the Montgomery conjecture. In any case, it is thought likely that .DS 2 .EQ (2.4.3) |S(t)| ~<=~ ( log ^t) sup {1/2 ^+^o(1)} ~~~roman as ~~~ t^->^inf ^. .EN .DE Some lower bounds for $S(t+h) ^-^S(t)$ are also known, see [Ts1,\|Ts2], for example. .P Not only is $S(t)$ small, but its oscillations tend to cancel out. If we define .DS 2 .EQ (2.4.4) S sub 1 (t) ~=~ int from {t sub 0} to t ~ S(u) du ^, .EN .DE then $|S sub 1 (t) | ^=^ O( log ^t)$ unconditionally, and $|S sub 1 (t) |^=^ O( ( log ^t) ( log^log^t) sup -2 )$ on the RH [Tit2]. 
The true maximal order of magnitude of $|S sub 1 (t) |$ is probably again around $( log^t) sup 1/2$. (See [Ts1,\|Ts2] for lower bounds. The estimate $|S sub 1 (t) |^=^ o( log ^t)$ is equivalent to the Lindelo\*:f Hypothesis, see Notes to Chapter\ 13 of [Tit2].) Furthermore, if one chooses $t sub 0$ appropriately, then one obtains .DS 2 .EQ (2.4.5) int from 0 to t ~ S sub 1 (u) du ~=~ O( log ^t) ~~~roman as ~~~ t^->^inf ^. .EN .DE (The same property applies to further iterations of this process.) In addition, .DS 2 .EQ lim from {T^->^inf} ~ T sup -1 ~ int from 0 to T ~ S sub 1 (t) sup 2 dt ~=~ c .EN .DE exists for a constant $c^>^0$ (Theorem\ 14.19 of [Tit2]). .P Selberg [Sel2] proved, under the assumption of the RH, that for every fixed positive integer $k$, .DS 2 .EQ (2.4.6) int from 0 to T ~ S(t) sup 2k dt ~=~ {(2k)!} over {k! (2 pi ) sup 2k} ~T ( log^log^T) sup k ( 1^+^O( ( log ^log^T) sup -1 )) .EN .DE as $T^->^inf$. Later [Sel3] he proved similar estimates unconditionally, with $( log ^log ^T) sup -1$ in the remainder term replaced by $( log ^log ^T) sup -1/2$. Although it was apparently not noticed right away, these results imply (unconditionally) that $S(t)$ is asymptotically normally distributed with mean 0 and variance $(2 pi sup 2 ) sup -1 ^log ^log ^t$ (the case $k=1$ of (2.4.6)), so that for $alpha ^<^ beta$, .DS 2 .EQ (2.4.7) lim from {T^->^inf} ~ T sup -1 ^left | ^ left { t^:~ 0^<=^t^<=^T ,~ {S(t)} over {( (2 pi sup 2 ) sup -1 ^log ^log ^T) sup 1/2} ^member^( alpha ,^beta ) right } right | ~=~ (2 pi ) sup -1/2 ^int from alpha to beta ^e sup {-x sup 2 /2} dx ^.~~~"\0\0\0" .EN .DE For further results on moments and distributions of $S sub 1 (t)$, $S(t+h)^-^S(t)$, and related functions, see [Fu1, Fu2, Fu3, Fu4, Fu8, GM, Gh1, Gh2, Go2, Joy1, Ts1, Ts2].
Goldston [Go2] has improved the estimate (2.4.6) for $k=1$ by showing, under the assumption of the RH, that .DS 2 .EQ (2.4.8) int from 0 to T ~S(t) sup 2 dt ~=~ T over {2 pi sup 2} ~ log ^log ^T ~+~ T over {2 pi sup 2} left ( c sub 1 ^+^ int from 1 to inf ^F( alpha ,T) alpha sup -2 d alpha right ) ~+~ o(T) .EN .DE as $T^->^inf$, where $F( alpha ,^T)$ is defined by (2.2.10), and $c sub 1$ is a constant, .DS 2 .EQ (2.4.9) c sub 1 ~=~ c sub 0 ~+~ sum from m=2 to inf ~ sum from p ~ left ( -^ 1 over m ~+~ 1 over {m sup 2} right ) ^1 over {p sup m} ^, .EN .DE where $c sub 0 ^=^ 0.577^"..."$ is Euler's constant. (The sign of the $m sup -1$ term is wrong in [Go2].) If Montgomery's pair correlation conjecture (2.2.12) holds, then $int from 1 to inf ^F( alpha ,^T) alpha sup -2 ^d alpha$ is asymptotic to the constant 1, but if this conjecture were to fail, it is conceivable that the second order term in the asymptotic expansion of $int from 0 to T ^S(t) sup 2 dt$ might oscillate. .P Table\ 2.4.1 presents data on the moments of $S(t)$. Statistics were collected on two intervals of the form $( gamma sub n ,^gamma sub {n+10 sup 6} )$, where $n=n sub 1 = 10 sup 12 - 6,^032$ for the $N=10 sup 12$ data, and $n=n sub 2 = 10 sup 20 - 48,^778$ for the $N=10 sup 20$ data. The average values of $S(t)$ and $S(t) sup 2$ for these sets are given in the $k=1 sup star$ and $k=2 sup star$ rows. To obtain a good comparison with the asymptotic normal distribution, the other moments were scaled, so that if we let $sigma sup 2$ be the mean value of $S(t) sup 2$, then the $k=1,^2,^dd ,^8$ entries denote the average values of $( sigma sup -1 S(t)) sup k$, and the $k=|1| ,^|3|$, and $|5|$ entries the average values of $| sigma sup -1 S(t) | sup k$. Finally, the last column gives the corresponding values for the standard normal distribution. As we can see, the agreement between empirical values and asymptotic ones is reasonably good, and is somewhat better for $N=10 sup 20$ than for $N=10 sup 12$.
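.P
The comparison values for the standard normal distribution are given by closed formulas, and the scaling by $sigma$ is simple to reproduce. The following sketch uses hypothetical function names and, as an assumed simplification, averages over a finite list of sampled values rather than over a continuous range of $t$.

```python
import math


def normal_moment(k):
    """E[X^k] for a standard normal X: 0 for odd k, (k-1)!! for even k."""
    if k % 2:
        return 0.0
    result = 1.0
    for j in range(1, k, 2):
        result *= j
    return result


def normal_abs_moment(k):
    """E[|X|^k] = 2^(k/2) * Gamma((k+1)/2) / sqrt(pi) for a standard normal X."""
    return 2.0 ** (k / 2.0) * math.gamma((k + 1) / 2.0) / math.sqrt(math.pi)


def scaled_moments(values, powers=range(1, 9)):
    """Averages of (x/sigma)^k, where sigma^2 is the mean of x^2 over values."""
    n = len(values)
    sigma = math.sqrt(sum(x * x for x in values) / n)
    return {k: sum((x / sigma) ** k for x in values) / n for k in powers}


if __name__ == "__main__":
    print([normal_moment(k) for k in range(1, 9)])
    print([normal_abs_moment(k) for k in (1, 3, 5)])
```

By construction the scaled second moment is always 1, so only the remaining entries carry information about the shape of the distribution.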
.P Since $S(t)$ has jump discontinuities of size 1 at zeros and decreases monotonically between zeros with derivative essentially $-1$ (on Gram point scale), and there is asymptotically one zero per Gram point, the smallest mean values of $S(t) sup 2k$ for any $k^member^Z sup +$ that is at all conceivable would be obtained by having a zero exactly halfway between every two neighboring Gram points. This would yield a mean value of $S(t) sup 2$ of 1/12, since $S(t)$ would then decrease linearly from 1/2 to $-1/2$ between consecutive zeros. The values that are observed, 0.23 for $N=10 sup 12$ and 0.26 for $N=10 sup 20$, are not very much larger than that. .P That the distribution of $S(t)$ is close to the normal one can be seen visually in Fig.\ 2.4.1. This figure is based on determining for what fraction of values of $t^member^( gamma sub n sub 2 ,^gamma sub {n sub 2 + 10 sup 6} )$ we have $S(t) ^member^ [k/100 ,^(k+1)/100)$, and then scaling the resulting histogram by $sigma$ to produce a graph that can be compared to that of the standard normal distribution. It is curious that the observed distribution of $S(t)$ is less peaked than the normal one, whereas in most of the other comparisons the empirical distributions have sharper peaks than expected. It is especially interesting to compare Fig.\ 2.4.1 to Fig.\ 2.10.1, which compares the distribution of $log ^| Z(t) |$ (essentially the harmonic conjugate of $S(t)$) to the normal distribution. In both cases the limiting distributions are known to be normal (even without assuming the RH), but the observed deviations from normal behavior are different for $S(t)$ and $log ^|Z(t)|$, and are much more pronounced in the latter case. .P The area between the two curves in Fig.\ 2.4.1 is 0.023. For the corresponding figure using the $N=10 sup 12$ data, the area is 0.029.
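.P
One way to obtain an area of this kind is sketched below. This is an illustration under stated assumptions, not the procedure actually used for Fig.\ 2.4.1: the function name is hypothetical, the input is assumed to be a list of values of $S(t)$ sampled at equally spaced $t$, and the normal density is compared at bin centers only over the range of bins that contain data.

```python
import math
from collections import Counter


def area_vs_normal(samples, bin_width=0.01):
    """Area between a sigma-scaled histogram of samples and the standard normal density.

    Values are counted in bins [k*bin_width, (k+1)*bin_width), the
    histogram is rescaled by sigma (the root mean square of the samples),
    and |empirical density - normal density| is summed over the bins.
    """
    n = len(samples)
    sigma = math.sqrt(sum(s * s for s in samples) / n)
    counts = Counter(math.floor(s / bin_width) for s in samples)
    width_scaled = bin_width / sigma  # bin width in units of sigma
    area = 0.0
    for k in range(min(counts), max(counts) + 1):
        center = (k + 0.5) * bin_width / sigma
        empirical = counts.get(k, 0) / (n * width_scaled)
        normal = math.exp(-0.5 * center * center) / math.sqrt(2.0 * math.pi)
        area += abs(empirical - normal) * width_scaled
    return area


if __name__ == "__main__":
    # Toy input only; real use would pass many sampled values of S(t).
    print(area_vs_normal([0.005, -0.005]))
```

The mean of the samples is not subtracted, in keeping with the definition of $sigma sup 2$ as the mean value of $S(t) sup 2$.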
.P Since both $S(t)$ and its integral $S sub 1 (t)$ are very small, we can expect that $S(t)$ will have many sign changes, and several results in this direction have been proved, with the strongest ones being due to Ghosh [Gh1] and Mueller [Mue1], but they are all quite weak. For example, Mueller proves that gaps between consecutive zeros of $S(t)$ are $O( log^log^log^t)$. Given that $S(t)$ has a limiting normal distribution with standard deviation on the order of $( log^log^t) sup 1/2$ and mean close to 0, and that it cannot vary too widely (in particular, essentially all of the time it is monotone decreasing with derivative of the order of $-^log^t$), we might expect that the ratio of the number of zero crossings of $S(t)$ for $t^member^( gamma sub N ,^gamma sub N+M )$ to $M$ might be roughly the fraction of $t$ in $ ( gamma sub N,^gamma sub N+M ) $ for which $|S(t) |^<=^1$. This therefore suggests that there ought to be on the order of $( log ^log ^t) sup -1/2$ zeros of $S(t)$ per Gram interval. .P The number of sign changes of $S(t)$ in the intervals that have been investigated can be determined quite easily from the statistics of Gram blocks and exceptions to Rosser's rule that have been collected. When $g sub n$ is a good Gram point that is not close to an exception to Rosser's rule, and is not a zero of the zeta function, then $S(g sub n )^=^0$, and $S(t)$ changes sign at $g sub n$. We will count this sign change as occurring in the Gram interval $[g sub n ,^g sub n+1 )$. If $B(n,^k)$ is a Gram block that has exactly $k$ zeros, then an easy accounting shows that $S(t)$ has exactly 2 sign changes in $B(n,^k)$. On the other hand, when $B(n,^k)$ is an exception to Rosser's rule, and $[g sub m, ^g sub m+r )$ is the smallest union of Gram blocks that contains both the exception and the excess zeros, then a similar accounting shows that $[g sub m ,^g sub m+r )$ contains exactly 2 sign changes.
Thus if Gram's law (see Section\ 2.12) held universally, we would have an average of 2 sign changes of $S(t)$ for every zero of $zeta (s)$. Departures from Gram's law lower this average. Table\ 2.4.2 shows the actual averages for the different data sets. There is a steady decrease in the average, but it is very slow. Since the argument in the preceding paragraph suggests a rate of decrease of $( log ^log ^t) sup -1/2$, this is not surprising. .P For every exception $B(n,^k)$ to Rosser's rule (see Section 2.13 for definitions) there is a $t$ nearby with $|S(t)|^>=^2$ (and even $|S(t)|^>^2$, if zeros do not coincide with Gram points, as seems likely). Statistics about these large values of $S(t)$ were collected during investigation of exceptions to Rosser's rule. Large values of $S(t)$ are of special interest because it is only when $S(t)$ is large that unusual behavior of the zeta function can take place. Locally extreme values of $S(t)$ occur at zeros. (Each zero has associated to it two values of $S(t)$, the limits of $S(t)$ as $t$ approaches the zero from the right or the left.) Table\ 2.4.3 shows the values of $S(t)$ for which $|S(t)|$ was largest in absolute value, as well as the number of zeros at which $|S(t)| ^>^2.3$ divided by the number of exceptions to Rosser's rule. The largest value of $|S(t)|$ that was found here is 2.7379, while among the first $1.5 times 10 sup 9$ zeros the largest such value is 2.3137 [LRW2]. (A point $t$ at which $|S(t)|^=^2.8747$ was found later in the computations described in Section\ 3.) Earlier computations established that $|S(t)|^<^1$ for $7^<^t^<=^280$, and $|S(t)|^<^2$ for $7^<^t^<=^6.8 times 10 sup 6$. .P The values of $S sub 1 (t)$ were investigated in the two intervals $( gamma sub n ,^gamma sub {n+10 sup 6} )$, where $n^=^10 sup 12 - 6,^032$ (for the $N=10 sup 12$ set) and $n^=^10 sup 20 - 48,^778$ (for $N=10 sup 20$). 
The values of $S sub 1 ( gamma sub n )$ were assigned so as to make .DS 2 .EQ (2.4.10) int from {gamma sub n} to {gamma sub m} ~ S sub 1 (t) dt ~=~ 0 ^ .EN .DE for $m=n+ 10 sup 6$. The data that was obtained is summarized in Table\ 2.4.4; the mean of $S sub 1 (t) sup 4$, for example, refers to .DS 2 .EQ 1 over {gamma sub m - gamma sub n} ~ int from {gamma sub n} to {gamma sub m} ~ S sub 1 (t) sup 4 dt ^. .EN .DE In addition to the uncertain choice of $S sub 1 ( gamma sub n )$, there were additional problems in these computations due to the accumulating errors due to uncertainties in the values of zeros and $S(t)$. Values computed over shorter intervals suggest that the mean values in Table\ 2.4.4 are accurate. The entry for sign changes of $S sub 1 (t)$ refers to the number of sign changes per Gram interval. This figure also appears to be quite accurate. Changing the initial value of $S sub 1 ( gamma sub n )$ by $+- ^10 sup -4$ varied the number of computed sign changes of $S sub 1 (t)$ for the $N=10 sup 20$ interval only between 73799 and 74089. .H 2 "Extreme values of gaps between zeros" In its weakest form, the GUE hypothesis predicts only that (2.2.8) holds for all $0^<=^alpha ^<^beta^<^inf$, and so it says essentially nothing about the existence of a small number $(o(M))$ of very large or very small $delta sub n$. A double zero of the zeta function, giving $delta sub n =0$, would not by itself contradict this weak hypothesis. On the other hand, it is known (cf.\ (2.2.3) and (2.2.5)) that in the GUE, .DS 2 .EQ (2.5.1) roman Prob ^( delta sub n ^<=^ x) ~=~ {pi sup 2} over 9 ~x sup 3 ~-~ {2 pi sup 4} over 225 ~x sup 5 ~+~ {pi sup 6} over 2205 ~x sup 7 ~+~...^, .EN .DE so very small $delta sub n$ (roughly $o(M sup -1/3 )$ among $M$ samples) are very unlikely in the GUE, and a similar result holds for large $delta sub n$. 
A strong form of the GUE hypothesis would predict that even extreme values of $delta sub n$ (and $delta sub n + delta sub n+1$) for the zeta function would behave roughly as in the GUE model. .P Given the constraints on $S(t)$ described in Section\ 2.4, one can expect that even if the strong form of the GUE hypothesis holds, it would only apply to the zeta function at large heights, and that the lower the region under investigation, the fewer extreme values of $delta sub n$ or $delta sub n + delta sub n+1$ there would be. This is clear for large values of $delta sub n$ and $delta sub n + delta sub n+1$, as these clearly correspond to large values of $| S(t) |$. It is also true for small values of $delta sub n$ and $delta sub n + delta sub n+1$, though, since several zeros clustered close together again force $|S(t)|$ to be relatively large. .P What was observed in [Od2] in a comparison of the first $10 sup 5$ zeros to $10 sup 5$ zeros starting with zero number $10 sup 12$ is that the above predictions were largely satisfied by the data. In general, there was a deficiency of extreme values of $delta sub n$ and $delta sub n + delta sub n+1$ (compared to the GUE prediction), but this deficiency declined as one considered the higher zeros. There was, however, one observation that went counter to expectations. The number of small $delta sub n$ that were observed at large heights was larger than predicted by the GUE theory. This excess was not large, but it was also observed in the data for $10 sup 5$ zeros starting with zero number $2 times 10 sup 11$, as well as by some data based on the first $1.5 times 10 sup 9$ zeros. This excess of small spacings was very counterintuitive, and so gave rise to some suspicions about the validity of the GUE hypothesis. .P Table\ 2.5.1 shows the extremal values of $delta sub n$ and $delta sub n + delta sub n+1$ that were found in each data set. (The number of zeros in each data set is given in Table\ 1.2.) 
The last column in Table\ 2.5.1 gives the probability that the minimal $delta sub n$ would not exceed the values in the second column if all the $delta sub n$ in the data set were drawn independently from the GUE distribution. From (2.5.1), we see that the probability that the smallest $delta sub n$ out of $M$ that are drawn from the GUE satisfies $delta sub n ^<=^x$ is approximately .DS 2 .EQ (2.5.2) 1 ~-~ left ( 1 ^-^ {pi sup 2} over 9 ~x sup 3 right ) sup M ~wig~ 1 ~-~ exp ( - pi sup 2 ^x sup 3 M/9 ) ^. .EN .DE This approximation was used to compute the last column of Table\ 2.5.1. We can see that most of the entries in that column are fairly high (although not too high, which would indicate a severe deficiency of small spacings), while those for $N=10 sup 18$ (where $delta sub n ^=^0.001124$ for $n^=^10 sup 18 ^+^12,^376,^780$, a case that will be discussed in sections\ 2.7 and 4.5) and for $N^=^10 sup 19$ (where $delta sub n ^=^0.000897$ for $n^=^10 sup 19 ^+^ 15,^987,^196$ is the smallest $delta sub n$ that was found) are extremely low. Furthermore, the smallest value of $delta sub n$ that is known is $delta sub n ^=^0.000310$ for $n^=^1,^048,^449,^114$ (found by van\ de\ Lune et\ al. [LRW2]), and the probability of such a small spacing occurring among $1.5 times 10 sup 9$ samples drawn from the GUE is only 0.048. Thus the extremely small values of the $delta sub n$ do appear to be somewhat too frequent. (Some more evidence pointing to this conclusion is presented in Section\ 3.) .P When we consider still very small, but slightly larger spacings, we find essentially no evidence of an excess of small spacings. Table\ 2.5.2 shows the number of $delta sub n ^<=^1/20$ and $<=^1/10$ observed in each set (given in number of cases per million zeros to make comparisons easier). 
If we consider the $N=10 sup 19$ entry for $delta sub n ^<=^1/20$, for example, we see that we are dealing with 2353 cases altogether, so a normal sampling error might be around 50, which is about 2%. Thus the 140.5 figure in the table is quite consistent with the 136.8 expected for the GUE. .P Still another way to judge whether there is any anomaly in the distribution of the $delta sub n$ or the $delta sub n + delta sub n+1$ is through the use of the quantile-quantile $(q-q)$ plots to compare the observed distributions to those of the GUE. Given a sample $x sub 1 ,^dd ,^x sub n$, and a continuous cumulative distribution function $F(z)$ for some distribution, the $q-q$ plot is obtained by plotting $x sub (j)$ against $q sub j$, where $x sub (1) ^<=^x sub (2) ^<=^...^<=^x sub (n)$ are the $x sub i$ sorted in increasing order, and the $q sub j$ are the theoretical quantiles defined by $F(q sub j ) ^=^ (j-1/2) /n$ [CCKT]. The $q-q$ plot is a sensitive method of detecting differences among distributions. In particular, while it does show the outliers that are far away from the expected position, it makes it possible to disregard them and concentrate on the main part of the distribution curve. If the $x sub j$ are drawn from the distribution corresponding to $F(z)$, and the sample size $n$ is large, the $q-q$ plot will be close to the straight line $y=x$. In all of our $q-q$ plots, straight lines $y=x$ are drawn to facilitate comparisons. (By the standards of typical statistical investigations, the sample sizes we deal with are very large, and the degree of agreement between conjecture and numerical evidence is very good, so one has to look at minute deviations.) .P The $q-q$ plots of [Od2] that showed the distribution of small $delta sub n$ indicated a deficiency of small $delta sub n$ for $N=1$, and a slight excess for $N=2 times 10 sup 11$ and $N= 10 sup 12$. These plots were each based on $10 sup 5$ values of $delta sub n$. 
When the new, more extensive data for $N=10 sup 12$ was obtained, the resulting $q-q$ plot was very similar to that of Fig.\ 2.5.1, and did not behave like the plot in Fig.\ 8 of [Od2] (which was based on only $10 sup 5$ zeros). Figures\ 2.5.1 and 2.5.2 show $q-q$ plots of $delta sub n$ drawn from two disjoint sets of $10 sup 6$ zeros near zero number $10 sup 20$. While the plot of Fig.\ 2.5.1 might indicate a very slight excess of small spacings (those in (0.02,\|0.04), roughly), and a slight deficiency of slightly larger spacings (where the scatterplot lies above the straight line), Fig.\ 2.5.2 indicates almost perfect agreement between theory and experiment. Figure\ 2.5.2 is not completely representative of zeros in the $N=10 sup 20$ sets, since it was the one of several $q-q$ plots based on disjoint sets of $10 sup 6$ zeros that gave the best agreement. Figure\ 2.5.1 is more typical in this respect. .P Figures\ 2.5.1 and 2.5.2 provide only a little, if any, support to the theory that there is an excess of small spacings among the zeros. Some further support can be found, however, if we aggregate all the data from the $N=10 sup 18$, $10 sup 19$, and $10 sup 20$ data sets, which contain $112,^314,^006$ zeros, and yield 112,\|314,\|003 values of $delta sub n$. The resulting $q-q$ plot, shown in Fig.\ 2.5.3, does indicate a slight excess of small $delta sub n$ (the two outliers close to the bottom of the graph are the unusually small $delta sub n$ that are minimal in the $N=10 sup 18$ and $10 sup 19$ data sets), but the evidence is not very conclusive. .P When we consider the other extremal values of $delta sub n$ and $delta sub n + delta sub n+1$, the evidence is in much better agreement with expectation. The counts in Table\ 2.5.2 show that the numbers of small $delta sub n + delta sub n+1$, large $delta sub n$, and large $delta sub n + delta sub n+1$ are all smaller than predicted by the GUE theory, but increasing towards that prediction.
The $q-q$ plots of figures\ 2.5.4 through 2.5.7 also support this impression; there are too few extreme values in general, but the deficiency is smaller for $N=10 sup 20$ than for $N=10 sup 12$. .P In view of (2.2.6), one can expect that among the values of $delta sub n + delta sub n+1$ drawn from the GUE, the probability of the minimal value being $<=^x$ is about .DS 2 .EQ 1~-~ exp ( -^pi sup 6 ^x sup 8 M/32400 ) ^. .EN .DE The minimal value of $delta sub n + delta sub n+1$ of 0.1124 in the $N=10 sup 20$ data set would then occur with probability of 0.06 in the GUE, while the corresponding probabilities for the $N=10 sup 12$, $10 sup 14$, $10 sup 16$, $10 sup 18$, and $10 sup 19$ data sets are 0.93, 0.78, 0.25, 0.27, and 0.60, respectively. Thus the only one of these figures that might seem unusually small is that for the minimal $ { delta sub n } + delta sub n+1$ for $N=10 sup 20$. .P The maximal values of $delta sub n$ and $delta sub n + delta sub n+1$ recorded in Table\ 2.5.1 are all somewhat smaller than what the GUE predicts, which is not too surprising given the bounds known to hold for $S(t)$ and $S sub 1 (t)$. For very large spacings in the GUE, des\ Cloizeaux and Mehta [CM2] have proved that .DS 2 .EQ (2.5.3) log ~ p(0,^t) ~wig~ -~ pi sup 2 t sup 2 /8 ~~~roman as ~~~ t^->^ inf ^, .EN .DE which suggests that .DS 2 .EQ (2.5.4) max from {N+1^<=^n^<=^N+M} ~ delta sub n ~wig~ pi sup -1 ( 8 ^log ^M) sup 1/2 .EN .DE as $N,^M^->^inf$ with $M$ reasonably large compared to $N$. This is larger by about a $( log ^log^M) sup 1/2$ factor than the conjecture (2.4.2) of Montgomery allows. Our data is too limited to shed any light on the question of whether that conjecture is right. .P Values of $delta sub n$ and $delta sub n + delta sub n+1$ larger than those of Table\ 2.5.1 have been found in other computations, and are described in Section\ 3.1.
In particular, the largest known values of $delta sub n$ and of $delta sub n + delta sub n+1$ are 5.1454 and 6.0165, respectively. .P Even on the assumption of the RH, it is only known that $delta sub n ^<=^0.5172$ and $delta sub n ^>=^2.337$ each occurs infinitely often [CGG1], and $delta sub n^>=^2.68$ occurs infinitely often on the assumption of the Generalized Riemann Hypothesis for Dirichlet $L$-functions (or at least of a Generalized Lindelo\*:f Hypothesis) [CGG2]. On the assumption of the RH, it is also known that $delta sub n ^<^ 0.77$ and $delta sub n ^>^ 1.33$ each holds for a positive proportion of $n$ [CGGGGH]. The GUE predicts that $delta sub n ^<^ epsilon$ and $delta sub n ^>^ epsilon sup -1$ should each hold for a positive proportion of $n$ for every fixed $epsilon ^>^0$. If one could prove that $delta sub n ^<^ 1/4$ holds for infinitely many $n$, we could obtain effective bounds for class numbers of imaginary quadratic number fields [MW]. The GUE hypothesis predicts that $delta sub n ^<^ 1/4$ for 1.6% of $n$'s, and this is very close to what is observed in numerical data. (For $delta sub n ^<^ 1/2$ the corresponding figure is 11.3%.) .H 2 "Long and short range correlations between zeros and Berry's formula" The distribution of the eigenvalues of the GUE is stationary (but not Markovian). In the limit, that also should be true for the zeros of the zeta function. However, given the slow growth rate of $S(t)$, one cannot expect GUE behavior from joint distributions of $delta sub n ,^delta sub n+1 ,^dd ,^delta sub n+k$ if $k$ is large. Already the data of sections\ 2.3 and 2.5 show that the behavior of $delta sub n$ is much closer to the GUE prediction than that of $delta sub n + delta sub n+1$. That was the main reason for not investigating $delta sub n + delta sub n+1 + delta sub n+2$ and even higher order spacings intensively. 
.P When we investigate long range correlations among the zeros of the zeta function, we find phenomena connected not to the GUE, but rather to the distribution of primes. For example, if we let the autocovariances of a set of $delta sub n$ be defined by .DS 2 .EQ (2.6.1) c sub k ~=~ c sub k (H,^M) ~=~ 1 over M ~ sum from {n=H+1} to {H+M} ~ ( delta sub n -1) ( delta sub n+k -1 ) ^, .EN .DE then it has been conjectured by F.\ J. Dyson (unpublished) that in the GUE, .DS 2 .EQ (2.6.2) c sub k ~ approx ~ -1 over {2 pi sup 2 k sup 2} .EN .DE for $k^>^0$, with the $approx$ indicating some degree of approximation, not asymptotic equality as $N,^M^->^inf$. This result has not been proved for the GUE, but it is intuitively appealing for both the GUE and the zeros of the zeta function, since it says in effect that a large spacing would lead to smaller spacings nearby (and vice versa), and that this effect would diminish as one considered spacings further and further away. .P What was observed in [Od2] for the $delta sub n$ was quite different from the conjecture (2.6.2). Additional data based on the new computations is presented in Table\ 2.6.1. The $N=1$ entries come from the [Od2] computations, and have $H=0$, $M=10 sup 5$. The $N=10 sup 12$ and $10 sup 20$ entries come from the new computations, and both have $M=10 sup 6$, with $H^=^10 sup 12 - 6,^032$ for the $N=10 sup 12$ column and $H=10 sup 20 ^-^ 48,^776$ for the $N=10 sup 20$ column. (A comparison of the $N=10 sup 12$ entries here with those in Table\ 6 of [Od2], which are based on 1/10 as many zeros, indicates the size of the sampling errors.) For small $k$, the data in this table supports Dyson's conjecture (2.6.2). What we observe is that for higher sets of zeros, the agreement with (2.6.2) extends to slightly higher values of $k$. However, for very high $k$, we see totally different behavior.
If $delta sub n$ and $delta sub n+k$ were independent, then, since their mean value is 1 and variance is about 1/6, we would expect a sum of $10 sup 6$ terms of the form $( delta sub n -1) ( delta sub n+k -1 )$ (for $k^>^0$) to be about $10 sup 6/2 / 6 ^approx^170$, and this would correspond to a value of $c sub k$ of $1.7 times 10 sup -4$. The values in Table\ 2.6.1 for $9,980 ^<=^k^<=^10,000$ are usually much larger than that, which indicates that there are fairly strong long range correlations between the $delta sub n$. The pattern of signs of the $c sub k$ also indicates the nonrandom character of the values of the $c sub k$. The $c sub k$ are occasionally positive, and occasionally negative, indicating that for some $k$, a large $delta sub n$ tends to be associated with large $delta sub n+k$, while for other $k$ it tends to be associated with small $delta sub n+k$. .P An explanation for the long range dependencies among the $delta sub n$ was proposed in [Od2]. It implies that the observed correlations come from primes through formulas such as that of Landau [Lan1], which says that for any fixed $y^>^0$, as $N^->^inf$ we have .DS 2 .EQ (2.6.3) sum from n=1 to N ~ e sup {i gamma sub n y} ~=~ left { matrix { lcol {- ^{gamma sub N} over {2 pi} ^e sup {-y/2}^log^p ^+^O(e sup -y/2^log^N) ~~above O(e sup -y/2^log^N)~~} lcol {roman if ~~ y^=^log^p sup m ^, above roman if ~~ y^!=^log^p sup m ^,} } .EN .DE where $p$ denotes a prime and $m^member^Z sup +$. The above statement assumes the RH, but Landau actually proved a similar unconditional result. Improvements on Landau's result (in terms of better error terms and more explicit dependencies of the error terms on $y$) have been obtained by Fujii [Fu5,\|Fu7] and Gonek [Gon2]. (There are many formulas relating primes and zeros, and the ``explicit formulas'' of Guinand [Gu1,\|Gu2] and Weil [We1] are among the most general.)
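.P
The autocovariances (2.6.1) and the two reference levels they are compared with above (Dyson's conjecture (2.6.2) and the size expected for independent spacings) can be computed directly from a list of normalized spacings. The following Python code is a minimal sketch under stated assumptions, not the program used to produce Table\ 2.6.1: the function names are hypothetical, the list of spacings is indexed from zero, and the parameter named start plays the role of $H$ relative to the first spacing in the list.
.DS
```python
import math

def autocovariance(deltas, k, start=0, count=None):
    """c_k of (2.6.1): mean of (d[n] - 1) * (d[n + k] - 1) over count terms."""
    if count is None:
        count = len(deltas) - start - k
    if count <= 0 or start + count + k > len(deltas):
        raise ValueError("not enough spacings for this start, count and k")
    total = 0.0
    for n in range(start, start + count):
        total += (deltas[n] - 1.0) * (deltas[n + k] - 1.0)
    return total / count

def dyson_prediction(k):
    """Right side of (2.6.2), meant for k > 0."""
    return -1.0 / (2.0 * math.pi ** 2 * k ** 2)

def independent_noise_level(count, variance=1.0 / 6.0):
    """Typical size of c_k if the spacings were independent.

    Each product then has standard deviation equal to the variance of one
    spacing, so the mean of count products is about variance / sqrt(count).
    """
    return variance / math.sqrt(count)
```
.DE
For $M=10 sup 6$ the last function returns about $1.7 times 10 sup -4$, the level quoted above; values of $c sub k$ well above that level are what signal the long range correlations.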
.P The paper [Od2] presents a detailed explanation of how Landau's formula (2.6.3) forces the spectrum of the $delta sub n$ to consist largely of point masses at frequencies corresponding to prime powers, which then forces the initially unexpected behavior of the $c sub k$ that is seen in the tables. This explanation will not be repeated here. We will mention only that while it is not rigorous, it is supported by heuristics and numerical evidence. What we will do now is to check how well Landau's formula (2.6.3) fits with the numerical data. The main interest here is to see just how many zeros $gamma sub n$ are needed at various heights to observe the phenomenon of large values occurring at logarithms of prime powers. Some proposals have even been made to use sums like that in (2.6.3) for primality testing and factoring integers. While it seems unlikely that efficient methods could be developed by this approach, it is of some interest to see what happens when one considers a relatively short sum over high zeros. .P Let .DS 2 .EQ (2.6.4) h(y) ~=~ sum from {n=10 sup 20 +1} to {10 sup 20 +4 times 10 sup 4} ~ e sup {i gamma sub n y} ^. .EN .DE Figure\ 2.6.1 shows a graph of $2^log ^| h(y) |$ for $0^<=^y^<=^3$. It is instructive to compare this graph with that of Fig.\ 15 of [Od2], which is drawn on the same scale, but is based on an exponential sum of $4 times 10 sup 4$ zeros starting at zero number $10 sup 12 +1$. Both graphs show sharp peaks precisely at logarithms of prime powers, and the peaks are visibly higher at primes than at proper prime powers, as predicted by Landau's formula. (The heights of the peaks are not represented too accurately on the graph due to limited sampling.) All the prime powers $<~e sup 3 ^=^20.09$ are visible. The main difference between the two graphs is that in Fig.\ 2.6.1 the peaks are slightly lower, and the ``noise'' region between the peaks has somewhat higher values.
Furthermore, the nice regular patterns seen in the ``noise'' regions of Fig.\ 15 of [Od2] (which come from sampling at regular intervals a very rapidly oscillating function whose frequency and amplitude are changing slowly) are not visible in Fig.\ 2.6.1. These differences are probably due partly to the errors in the computed values of the $gamma sub n$ near the $10 sup 20$-th zero and partly to the fact that we are taking a very short sum. $4 times 10 sup 4$ zeros out of the first $10 sup 20$ is a very small proportion, so it is remarkable that the pattern of Fig.\ 2.6.1 is as clear as it is, since this is much better than the proved results of [Fu5,\|Fu7,\|Gon2,\|Lan1] might lead one to expect. .P Figure\ 2.6.2 shows a graph of $2^log^| h(y) |$, where $h(y)$ is again defined by (2.6.4), but this time over the region $8^<=^y^<=^8.05$. (This graph is based on $10 sup 4$ equally spaced values of $y$.) The interval from $e sup 8 = 2980.96$ to $e sup 8.05 ^=^3133.79$ contains the primes 2999, 3001, 3011, 3019, 3023, 3037, 3041, 3049, 3061, 3067, 3079, 3083, 3089, 3109, 3119, and 3121, and the prime power $5 sup 5 =3125$. Figure\ 2.6.2 fails to distinguish between several close pairs of primes. When one graphs a similar sum, but with 10 times as many zeros, as is done in Fig.\ 2.6.3, all the primes can be distinguished, and even 3125 can be easily discerned. .P One of the most elegant long range correlations between zeros was found by Berry [Be4]. If we consider an interval of length $2 pi L ( log ^(T/(2 pi ))) sup -1$ at height $T$, the expected number of zeros in it equals $L$. We define the number variance of the zeros by .DS 2 .EQ (2.6.5) V sub T (L) ~=~ V sub T,H (L) ~=~ H sup -1~int from T to T+H~ left { N left ( t^+^ {2 pi L} over {log^(t/(2 pi ))} right ) ^-^ N(t) ^-^L right } sup 2 dt ^.
.EN .DE In the GUE, one has $V sub T (L)^=^G(L)$, with .DS 3 .EQ G(L) ~=~ pi sup -2 "{" mark log (2 pi L) ^-^ Ci( 2 pi L ) ^-^ 2 pi L ^Si (2 pi L ) .EN .sp .5 .EQ (2.6.6) ~~ .EN .sp .5 .EQ lineup ~~~~+~ pi sup 2 L ~-~ cos ( 2 pi L ) ^+^ 1^+^c sub 0 "}" ^, .EN .DE where $Ci$ and $Si$ are the cosine and sine integrals [HMF] and $c sub 0 ^=^ 0.577^dd$ is Euler's constant. Asymptotically, .DS 2 .EQ (2.6.7) G(L) ~wig~ pi sup -2 ^log (2 pi L ) ~~~~roman as ~~~~ L^->^inf ^, .EN .DE while .DS 2 .EQ (2.6.8) G(L) ~wig~ L ~~~~roman as ~~~~ L^->^0 ^. .EN .DE Gallagher and Mueller [GM] showed that Montgomery's pair correlation conjecture implies $V sub T (L) ^=^ L-L sup 2^+^ o( L sup 2 )$ as $L^->^ 0$, which is consistent with (2.6.8). (See also [Fu8].) On the other hand, the numerical evidence of [Od2] showed that $V sub T (L)$ was small even for moderately large $L$, and so a relation like (2.6.7) appeared impossible. Motivated by this discovery, by the relations between primes and long range correlation between zeros discussed above, and by his earlier work on eigenvalues of Hamiltonians of chaotic dynamical systems [Be1,\|Be2,\|Be3], Berry [Be4] found heuristic arguments which suggested that for any $tau^member^(0,^1)$, and any $L^>^0$, .DS 2 .EQ (2.6.9) V sub T (L) ~approx~ G(L) ~+~ B sub T (L) ^, .EN .DE where for $U^=^T (2 pi ) sup -1$, .DS 3 .EQ B sub T (L) ~=~ pi sup -2 left { 2~{sum from p ~ sum from r=1 to inf} from {p sup r ^<^ U sup tau}~mark {sin sup 2 ( pi L r ( log ^p) /( log ^U))} over {r sup 2 p sup r} .EN .sp .5 .EQ (2.6.10) ~~ .EN .sp .5 .EQ lineup ~+~ Ci( 2 pi L tau ) ~-~ log ( 2 pi L tau ) ~-~ c sub 0 "\b'\(rt\(bv\(rk\(bv\(rb'"^, .EN .DE and $p$ denotes primes. Computations using $10 sup 5$ zeros near zero number $10 sup 12$, using values of $L$ up to 1000, showed excellent agreement between Berry's conjecture (2.6.9) and empirical data, and those results are shown in the graphs in [Be4]. 
Note that the $log (L)$ terms in $G(L)$ and $B sub T (L)$ cancel out, and so for every fixed $L$, one can show that there is a positive function $g(L)$ such that .DS 2 .EQ (2.6.11) G(L) ~+~ B sub T (L) ~=~ g(L) ~+~ o(1) ~~~~roman as ~~~~ T ^->^ inf ^. .EN .DE Moreover, if $tau$ is held fixed, then it is easy to see that .DS 2 .EQ (2.6.12) G(L) ~+~ B sub T (L) ~wig~ pi sup -2 ^log^log^T ~~~~roman as ~~~~ T,^L~->~ inf ^, .EN .DE (with $L$ growing much more slowly than $T$), since the arguments of the sine in the definition (2.6.10) of $B sub T (L)$ will be asymptotically equidistributed modulo $2 pi$. .P The new values of zeros were used to obtain further data. For $N=10 sup 12$, the number variance $V sub T (L) ^=^ V sub T,H (L)$ defined by (2.6.5) was computed with .DS .TS center; r1 l1 l5 r1 l. $T$ $=$ $gamma sub n sub 0$, $n sub 0$ $=~ 10 sup 12 ^-^ 6,^032$, .sp $T+H$ $=$ $gamma sub m sub 0$, $m sub 0$ $=~ n sub 0 ^+^5 times 10 sup 5$. .TE .DE For $N=10 sup 20$, the values that were chosen were .DS .TS center; r1 l1 l5 r1 l. $T$ $=$ $gamma sub n sub 1$, $n sub 1$ $=~ 10 sup 20 ~-~ 48,^778$, .sp $T+H$ $=$ $gamma sub m sub 1$, $m sub 1$ $=~ n sub 1 ^+^ 5 times 10 sup 5$. .TE .DE Berry's function (2.6.10) was computed in each case with $tau ^=^ 1/4$. (Varying $tau$ between 0.2 and 0.3 did not appreciably change the results, as was to be expected.) The results of some of these computations for $N=10 sup 20$ are presented in figures\ 2.6.4 through 2.6.6. In Fig.\ 2.6.4, the dashed line is the graph of the GUE prediction $G(L)$, the solid line is the graph of Berry's prediction $G(L) ^+^ B sub T (L)$, and the scatterplot is that of computed values of $V sub T (L)$. In figures\ 2.6.5 and 2.6.6 the graphs of the computed values of $V sub T (L)$ and of Berry's prediction $G(L) ^+^ B sub T (L)$ were both drawn as solid lines, one superposed on the other. The very slight differences between the two curves show up as slight blotches on the graph. 
(The empirical data is slightly more wiggly than $G(L) ^+^ B sub T (L)$.) We see that even for $L^=^ 5 times 10 sup 5$, the agreement between computed and predicted values is almost perfect. .P A comparison of the graphs of [Be4] (and of similar graphs drawn with the more extensive data that has been obtained for $N=10 sup 12$ in the present computation) with figures\ 2.6.4 through 2.6.6 shows that for $N=10 sup 20$, the number variance oscillates less than for $N=10 sup 12$. The agreement of data with Berry's prediction is better for $N=10 sup 20$. .P While Berry's prediction (2.6.9) for $V sub T (L)$ was based on heuristic arguments, one can prove that a version of the conjecture follows from the RH and the pair correlation conjecture (2.2.12). This will be shown in a separate manuscript [Od4]. .H 2 "Lehmer phenomenon" For the RH to be true, $|Z(t)|$ cannot have any relative minima between two distinct consecutive zeros. Cases where .DS 2 .EQ (2.7.1) v sub n ~=~ max from {gamma sub n ^<^t^<^gamma sub n+1} ~ |Z(t)| .EN .DE is very small (so that in a sense the RH is ``almost violated'') are referred to as Lehmer's phenomenon [Lr2], and provide some of the more interesting heuristics both for and against the RH (cf.\ [Od3]). In this section we present statistics on the frequency of this phenomenon (which does not have a precise definition). .P The zero-locating program printed the largest value of $|Z(t)|$ that had been computed in each stretch of $10 sup 4$ zeros. To provide further information, the program was modified for the $N=10 sup 19$ and $N=10 sup 20$ data sets so as to obtain statistical information about the behavior of $v sub n$. Since getting a very good approximation to $v sub n$ would have required substantial computing time, what the program computed was the midpoint value .DS 2 .EQ (2.7.2) w sub n ~=~ |Z(( gamma sub n + gamma sub n+1 ) / 2) | ^.
.EN .DE When a value of $w sub n ^>^ 250$ or $w sub n ^<^ 5 times 10 sup -4$ was encountered, it was printed together with $n$, $gamma sub n$, and $delta sub n$. (However, $w sub n$ was not computed for a total of roughly 100 zeros at the ends of data sets.) .P To see how good an approximation $w sub n$ was to $v sub n$, the values of .DS 2 .EQ (2.7.3) v sub n sup star ~=~ max from {1^<=^k^<=^39} ~ left |^ Z left ( gamma sub n ^+^ k over 40 ^( gamma sub n+1 - gamma sub n ) right ) ^right | .EN .DE as well as of $w sub n$ were computed for $n sub 0 ^<=^n^<=^n sub 0 ^+^8 times 10 sup 5 -1$, where $n sub 0 ^=^10 sup 20 ^+^15,^316,^087$. Let .DS 2 .EQ (2.7.4) r sub n ~=~ v sub n sup star / w sub n ^. .EN .DE Then the maximal value of $r sub n$ that was found was 1.43. Only 755 out of the $8 times 10 sup 5$ values of $r sub n$ were $>^1.2$, while the rms value of $r sub n -1$ was 0.029. Among the 873 values of $n$ for which $delta sub n ^<^0.1$, the maximal value of $r sub n$ was 1.008, and the rms value of $r sub n -1$ was $5.1 times 10 sup -4$. For the 898 values of $n$ for which $delta sub n ^>^2.5$, the corresponding numbers were 1.29 and 0.036. For the 244 values of $n$ for which $w sub n ^>^100$, these numbers were 1.137 and 0.032, while for the 1426 values of $n$ for which $w sub n ^<^ 0.01$, they were 1.072 and 0.0066. Thus in general the values of $w sub n$ do provide good approximations to $v sub n sup star$, and therefore surely also to $v sub n$. This was to be expected on the basis of the GUE predictions (in particular that the approximation would be exceptionally good when $delta sub n$ is small). In fact, the size of $v sub n$ is determined largely by the few zeros nearest to $gamma sub n$ (cf.\ [Hej5, Hej6]), and so under the assumption of the GUE one can make quantitative predictions about the behavior of $r sub n$.
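.P
The quantities (2.7.2) through (2.7.4) are easy to state in code. The following Python sketch is illustrative only and is not the program used for the computations: the callable Z stands for any routine returning $Z(t)$, and it and the function names are assumptions introduced here.
.DS
```python
def midpoint_value(Z, g0, g1):
    """w_n of (2.7.2): |Z| at the midpoint of two consecutive zeros."""
    return abs(Z(0.5 * (g0 + g1)))

def sampled_maximum(Z, g0, g1, pieces=40):
    """v_n^* of (2.7.3): largest |Z| over the interior points of an even grid."""
    step = (g1 - g0) / pieces
    return max(abs(Z(g0 + k * step)) for k in range(1, pieces))

def ratio(Z, g0, g1, pieces=40):
    """r_n of (2.7.4): sampled maximum divided by the midpoint value."""
    return sampled_maximum(Z, g0, g1, pieces) / midpoint_value(Z, g0, g1)
```
.DE
Since the midpoint is the grid point with $k=20$, the ratio $r sub n$ is never smaller than 1 (apart from rounding), and it equals 1 when $|Z(t)|$ peaks at the midpoint. A lopsided toy function such as $t sup 2 (1-t)$ on $(0,^1)$ gives a ratio of about 1.18.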
.P Table\ 2.7.1 shows the frequency of occurrence of values of $w sub n ^<^ 5 times 10 sup -4$ among the approximately $9.5 times 10 sup 7$ values of $n$ that were checked in the $N=10 sup 19$ and $N=10 sup 20$ data sets. The smallest value of $w sub n$ that was found there was $4.02 times 10 sup -6$, for $n=10 sup 19 ^+^15,^987,^197$ (with $delta sub n ^=^ 0.000897$), while the second smallest was $5.03 times 10 sup -6$. .P One might expect, and one does observe empirically, that the Lehmer phenomenon is associated to small values of $delta sub n$. If $delta sub n$ is small, then one might expect that $w sub n$ is almost proportional to $delta sub n sup 2$, since zeros other than $gamma sub n$ and $gamma sub n+1$ ought to contribute multiplicative factors that behave like a power of $log^gamma sub n$ on the average, and are at most $gamma sub n sup o(1)$ as $n^->^inf$ (assuming the Lindelo\*:f conjecture). Since the probability that $delta sub n ^<=^x$ is about $pi sup 2 x sup 3 /9$ for $x$ small (see sections\ 2.2 and 2.5), one might conjecture that the probability of $w sub n ^<^y$ might be proportional to $y sup 3/2$. This would suggest that among the first $n$ zeros, the smallest $w sub n$ might be on the order of $n sup -2/3+o(1)$ as $n^->^inf$. If true, this relation would settle an old question [Ed] about the number of terms in the asymptotic part of the Riemann-Siegel formula that have to be used to separate the zeros; even the old estimate of Titchmarsh [Tit1] with an error term of $O(t sup -3/4 )$ would suffice at large heights. .P The above heuristic about the behavior of small $w sub n$ is supported very well by empirical data. Among the 1976 values of $n$ in the $N=10 sup 19$ and $N=10 sup 20$ data sets that had $w sub n ^<^ 5 times 10 sup -4$, the ratio $w sub n / delta sub n sup 2$ varies between 0.0136 and 8.56, with a mean of 0.608 and a variance of 0.427. Thus the correlation between $delta sub n sup 2$ and $w sub n$ is only fairly good.
On the other hand, these $w sub n$ follow almost perfectly the rule conjectured above that the fraction of them that are $<^y$ ought to be proportional to $y sup 3/2$. This can be seen from the counts in Table\ 2.7.1, as well as by looking at the ratio of the \%$k$-th smallest $w sub n$ to $5 times 10 sup -4 times (k/1976) sup 2/3$, which varies between 0.715 and 1.267, with a mean of 1.01 and a variance of $9 times 10 sup -4$, and from looking at a $q-q$ plot of the sorted $w sub n$ against $5 times 10 sup -4 times (k/1976) sup 2/3$. Thus on average the influence of the neighboring $gamma sub k$ cancels out. .P The most extreme example of the Lehmer phenomenon that was found during the computations described in this paper occurs for $n= 10 sup 18 ^+^12,^376,^780$, where $v sub n$ and $w sub n ^=^5.28 times 10 sup -7$ and $delta sub n ^=^ 0.001124$. A graph of $Z(t)$ in the vicinity of this point is given in Fig.\ 2.7.1. (Figure\ 2.7.1 shows also what looks like another case of the Lehmer phenomenon near Gram point $n-5$, but in that case the minimum of $Z(t)$ reaches $-0.0094$, and so it does not qualify under our definition.) A much more detailed view of $Z(t)$ in a very small neighborhood of this Lehmer phenomenon is given in Fig.\ 4.5.1. (That picture plays an important role in the discussion of the validity of the present computations that is presented in Section\ 4.5.) .P The most extreme example of the Lehmer phenomenon that is known was found by van\ de\ Lune et\ al. [LRW2]. For $n=1,^048,^449,^114$, they discovered that $delta sub n ^=^ 0.000310$, while $v sub n ^=^ 2.2 times 10 sup -7$ $(>=^w sub n )$. Since the height of this example is only about the square root of that for $n=10 sup 18 ^+^12,^376,^780$, it could be argued that the higher example of this paper is even more extreme. However, the $delta sub n$ found by van\ de\ Lune et\ al. is by far the smallest of any that are known.
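.P
The comparison of the sorted small $w sub n$ with $5 times 10 sup -4 times (k/1976) sup 2/3$ described in this section can be sketched as follows. This is a hedged illustration in Python, not the code used for the paper; the function names and the default exponent argument are assumptions made here. If the fraction of values below $y$ is proportional to $y sup 3/2$, then the \%$k$-th smallest of $m$ values below a cutoff should be close to the cutoff times $(k/m) sup 2/3$, so the ratios returned should cluster near 1.
.DS
```python
def power_law_ratios(values, cutoff, exponent=1.5):
    """Ratios of the sorted values below cutoff to their expected quantiles.

    Assumes the fraction of values below y behaves like (y / cutoff)**exponent,
    so the k-th smallest of m values is compared with
    cutoff * (k / m)**(1 / exponent).
    """
    small = sorted(v for v in values if v < cutoff)
    m = len(small)
    return [w / (cutoff * (k / m) ** (1.0 / exponent))
            for k, w in enumerate(small, start=1)]

def mean_and_variance(xs):
    """Mean and population variance of a nonempty list."""
    mean = sum(xs) / len(xs)
    return mean, sum((x - mean) ** 2 for x in xs) / len(xs)
```
.DE
With the cutoff $5 times 10 sup -4$ and the 1976 recorded values, the mean and variance of the returned ratios would correspond to the figures 1.01 and $9 times 10 sup -4$ quoted above.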
.H 2 "Large values of $fat zeta bold {(1/2 +it)}$" The largest value of $|Z(t) |^=^ | zeta (1/2 + it )|$ that was encountered by van\ de\ Lune et\ al. [LRW2] in their investigation of the first $1.5 times 10 sup 9$ zeros was 117. Table\ 2.8.1 lists the largest values of $|Z(t)|$ that were encountered in each of the data sets computed in this paper. The main zero locating program kept track of the largest value of $|Z(t)|$ that had been computed, but did not attempt to do a careful search for large values. However, since large values are usually associated with large $delta sub n$, the standard zero locating procedure seemed to be quite good at finding the high peaks in $|Z(t)|$. For $N=10 sup 19$ and $N=10 sup 20$ data sets, the more careful procedure described in Section\ 2.7 was employed, which provided even more reliable statistics. The number of values of $n$ in those two data sets for which $w sub n$ (defined in 2.7.2) exceeded various thresholds is given in Table\ 2.8.2. (Section\ 3 lists some values $t$ for which $|Z(t)|$ is much larger and which were found by a different procedure.) .P The rate of growth of $|Z(t)|$ is one of the most intensively studied problems in the theory of the zeta function, since bounds on it provide estimates on the distribution of zeros away from the critical line. It is very easy to show that .DS 2 .EQ (2.8.1) |Z(t)| ~<=~ t sup {alpha + o(1)} ~~~roman as ~~~ t^->^inf .EN .DE with $alpha ^=^ 1/4$. Exponential sum methods were used in the first few decades of this century to show that (2.8.1) holds with $alpha ^=^1/6$, and then to successively lower this value of $alpha$. (See the Notes to Chapter\ 5 of [Tit2] for a list of the improvements.) Until recently, the smallest value of $alpha$ for which (2.8.1) was known to hold was $alpha ^=^ 139/858^=^0.162004 dd$, due to Kolesnik [Ko], and there were indications that this result was close to the limit of what the ``exponent pair'' method that was being used could yield [GK]. 
However, Bombieri and Iwaniec [BI] have obtained a new method that gave $alpha ^=^ 9/56 ^=^ 0.16071 dd$. This method was then developed by Huxley and Watt [HW] and was used very recently by Watt [Wa] to show that (2.8.1) holds with $alpha ^=^89/560 ^=^0.15892 dd$. .P The Lindelo\*:f hypothesis is the statement that (2.8.1) holds with $alpha =0$. The RH yields a slightly stronger bound [Tit2] .DS 2 .EQ (2.8.2) |Z(t)| ~<=~ exp (c( log ^t) ( log ^log ^t) sup -1 ) .EN .DE for some $c>0$. On the other side, Balasubramanian and Ramachandra [Bala,\|BR] have shown that .DS 2 .EQ (2.8.3) max from {0^<=^t^<=^T} ~ |Z(t) |~>=~ exp left ( {3( log ^T) sup 1/2} over {4( log ^log ^T) sup 1/2} right ) .EN .DE if $T$ is large enough and more generally, that if $eta ^>^0$, then for $T^>=^T( eta )$ and $( log ^T) sup eta ^<=^H^<=^T$, we have .DS 2 .EQ (2.8.4) max from {T^<=^t^<=^T+H} ~|Z(t)| ~>=~ exp left ( {3( log ^H) sup 1/2} over {4( log^ log^H) sup 1/2} right ) ^. .EN .DE Montgomery [Mon3] has conjectured that (2.8.3) is close to the real rate of growth of $|Z(t)|$. .P While the data that was collected about large values of $|Z(t)|$ probably does reflect accurately the behavior of the zeta function in these ranges, it does not help in assessing what the true rate of growth of $|Z(t)|$ is. There are two main problems. One is the relatively small number of zeros that were investigated. Since large values of $|Z(t)|$ are very rare, we probably do not even have a good representation of the large values of $|Z(t)|$ for $t^<^ gamma sub n$, $n=10 sup 20$. (This is supported by the results of Section\ 3, where much higher values were found by special methods.) Another problem in using our data to assess the true growth rate of $|Z(t)|$ arises from the slow approach to true asymptotic behavior. As is noted in Section\ 2.10 (see especially Fig.\ 2.10.1), even $log^|Z(t)|$ in the ranges that have been investigated can be rather far from its eventual distribution. 
Furthermore, as was noted in the Introduction, even when one investigates at heights $t^approx^1.5 times 10 sup 19$, it is hard to tell the differences in growth rates between various functions. (The situation is not quite as bleak as might seem from the argument used in the Introduction, since one can use relatively sensitive tools such as ratios of values of a function at different points to estimate its growth rate, but that only helps to a very limited extent.) Note that for $t= gamma sub n$, $n=10 sup 20$, the bound (2.8.3) is only 12.9. .P Before concluding this section, we present some more statistics on the large values of $|Z(t)|$ that were found in the $N=10 sup 19$ and $10 sup 20$ data sets. Altogether 565 values of $n$ were recorded for which $w sub n ^>^250$. The largest is $w sub n ^=^631.7$ for $n^=^10 sup 20 + 13,^704,^916$, for which $delta sub n = 3.1428$. (The maximum of $|Z(t)|$ between $gamma sub n$ and $gamma sub n+1$ is at least 641, and there is no violation of Rosser's rule in the neighborhood of $gamma sub n$.) Of the 565 values, 94 are associated with violations of Rosser's rule. (Of the 28 values of $n$ for which $w sub n ^>^400$, 7 are associated to violations of Rosser's rule.) The smallest value of $delta sub n$ that was found for these 565 values of $n$ was 2.07, and the largest was 4.03. .P There was a fairly substantial correlation between $delta sub n sup 2$ and $w sub n$ among these 565 samples. The ratio $w sub n / delta sub n sup 2$ was in the range (19.47,\|64.74), with a mean of 35.47 and variance 61.32. However, at very large heights one would expect this correlation to diminish, in contrast to the situation for Lehmer's phenomenon (Section\ 2.7). 
In the latter case the GUE theories predict that $delta sub n sup 2$ will occasionally get as small as $n sup -2/3$, so that the influence of the other zeros (likely to be $n sup o(1)$ because of the Lindelo\*:f hypothesis and the separation of the other zeros that is predicted by the GUE) will not affect the size of $Z(t)$ very much. On the other hand, the GUE theories predict that $delta sub n sup 2 ^=^O ( log ^n )$, and since $Z(t)$ is known to get much larger (cf.\ (2.8.3)), this must be due to some longer range imbalances in the locations of the zeros. One model for distribution of $Z(t)$ (first proposed informally by Montgomery, and worked out in detail by Bombieri and Hejhal [BH, Hej5, Hej6]) predicts that at large heights, the size of $Z(t)$ is determined primarily by long ``amplitude'' waves, which are then slightly modulated by local distributions of zeros. This model predicts that there should be clusters of large values of $|Z(t)|$, and that over very wide ranges, $w sub n$ ought to depend mostly on the ``amplitude'' waves, and not on $ delta sub n$. The fact that there is a very strong correlation between the large $w sub n$ and $delta sub n sup 2$ in our data might therefore indicate that we are not seeing the true asymptotic behavior. .H 2 "Moments of $fat zeta bold {(1/2 + it)}$" It is conjectured that for every $lambda^>=^0$, .DS 2 .EQ (2.9.1) lim from {T^->^inf} ~ T sup -1 ( log^T) sup {- lambda sup 2} ~int from 0 to T ~ | Z(t) | sup {2 lambda} ^dt ~=~ c( lambda ) .EN .DE exists, with $c( lambda ) ^>^0$ for all $lambda$. A proof of this conjecture, or even of some much weaker bound, would be very important, since it would prove the Lindelo\*:f conjecture. However, this conjecture is only known to be true for $lambda =0$ with $c(0)=1$ (trivial), $lambda =1$ with $c(1)=1$, and $lambda =2$ with $c(2) = (2 pi sup 2 ) sup -1$ (see the Notes for Chapter\ 7 in [Tit2] for detailed information and references).
No specific values have been conjectured for $c( lambda )$ in general, but under the assumption of the RH, Conrey and Ghosh [CG1] have shown that $c( lambda ) ^>=^c sub 1 ( lambda )$, where .DS 2 .EQ (2.9.2) c sub 1 ( lambda ) ~=~ GAMMA ( 1+ lambda sup 2 ) sup -1 ~ prod from p ^left { left ( 1 - 1 over p right ) sup {lambda sup 2} ~sum from m=0 to inf ~ left ( {GAMMA (m+ lambda )} over {m!^GAMMA ( lambda )} right ) sup 2 ^p sup -m right } ^, .EN .DE and since $c( lambda ) = c sub 1 ( lambda )$ for $lambda =0$ and 1, they suggested that perhaps $c( lambda ) = c sub 1 ( lambda )$ for all $lambda^member^[0,^1]$. Since $c sub 1 (2) = ( 4 pi sup 2 ) sup -1 = c(2)/2$, equality of $c( lambda )$ and $c sub 1 ( lambda )$ is unlikely outside the range $0^<=^lambda ^<=^1$. (There is a mistake on this point in the Notes to Chapter\ 7 of [Tit2].) Conrey and Ghosh [CG3] have shown that the derivatives of $c sub 1 ( lambda )$ and $c( lambda )$ with respect to $lambda$ agree at $lambda =0$ and 1. Also, for $0^<=^lambda ^<^1$, Heath-Brown [HB2] has shown under the assumption of the RH that if $c ( lambda )$ exists, it is not much larger than predicted by the Conrey-Ghosh conjectures. .P One of the purposes of this section is to provide some numerical evidence about possible values of $c( lambda )$. One might expect that if .DS 2 .EQ (2.9.3) r( lambda ,^T,^H) ~=~ H sup -1 ( log ^T) sup {- lambda sup 2}~ int from T to T+H ~ |Z(t)| sup {2 lambda} ^dt ^, .EN .DE then $r( lambda ,^T,^H)^wig^c( lambda )$ as $T^->^inf$, if $H$ grows sufficiently fast with $T$ while $lambda$ is held fixed. Table\ 2.9.1 presents some values of $r( lambda ,^T,^H)$ computed for $T= gamma sub n sub 0$ with $n sub 0 = 10 sup 20 ^+^47,^098,^588$ and $T+H^=^ gamma sub n sub 1 ,^n sub 1 = n sub 0 + 10 sup 6$. Each of the $10 sup 6$ gaps between consecutive zeros was divided into 40 intervals, $Z(t)$ was evaluated at the endpoints of these subintervals, and Simpson's rule was applied to estimate the integral. 
Variations on this procedure showed that it produced estimates that were accurate to at least three decimal places (and more for high moments, as Simpson's rule is least accurate for small $lambda$). However, the values in the tables, especially for large $lambda$, have to be used with caution because even an interval of $10 sup 6$ zeros around zero number $10 sup 20$ is too small to be truly representative. For example, similar data was obtained for $T= gamma sub n sub 2$ with $n sub 2 = 10 sup 20 + 15,^316,^087$ and $T+H^=^gamma sub n sub 3$, $n sub 3 = n sub 2 + 8 times 10 sup 5$, and also for $T= gamma sub n sub 4$, $n sub 4 = 10 sup 20^-^ 15,^409,^244$, $T+H = gamma sub n sub 5$, $n sub 5 = n sub 4 + 10 sup 6$. For $lambda =1$, the values found there differed by less than 0.5% from those in Table\ 2.9.1, but for $lambda = 2.5$ these values were 1.20 and 0.752 times those in Table\ 2.9.1, respectively. The problem is that high moments are determined largely by the few exceptionally large values of $Z(t)$, and those are very rare. (See the next section for some further evidence of this.) To get a good sample, for large $lambda$, one would need to integrate $|Z(t)| sup {2 lambda}$ over much longer intervals. .P The data in Table\ 2.9.1 is reasonably consistent with the Conrey-Ghosh conjectures that $c( lambda ) = c sub 1 ( lambda )$ for $0^<=^lambda ^<=^1$ and that $c( lambda ) = c sub 2 ( lambda )$ for $1^<=^lambda ^<=^2$. Even for $lambda ^=^5/2$, where $c sub 3 ( lambda ) ^=^ 11.802 ^dd ^times^c sub 1 ( lambda )$, the agreement of data with conjecture is very good, as $r( lambda ,^T,^H) / c sub 1 ( lambda ) ^=^11.38 ^dd$. However, given the differences between the empirical data for $lambda =1$ and 2 and the known asymptotic values, it is hard to draw any definitive conclusions. For $lambda =1$, estimates of the second moment of $Z(t)$ are known that are better than (2.9.1).
They are of the form .DS 2 .EQ (2.9.4) int from 0 to T ~Z(t) sup 2 dt ~=~ T( log ^T -1- log (2 pi ) + 2 c sub 0 ) ~+~ E(T) ^, .EN .DE where $c sub 0$ denotes Euler's constant $(=^0.577215^"...")$, and $|E(T)|^=^O(T sup alpha )$ for various $alpha ^<^1/3$. (The best current value of $alpha$ is $139/429 + o(1)$ as $T^->^inf$, due to Kolesnik [Ko] and in a slightly sharper form to Hafner and Ivi\o'c\(hc' [HI]. Note that 139/429$^=^$0.3240....) If we let $r sup star ( lambda ,^T,^H)$ be defined similarly to $r( lambda ,^T,^H)$, but with $log^T$ in (2.9.3) replaced by $log^T - log ( 2 pi ) + 2 c sub 0$, we find that for the values of $T$ and $H$ that were used to compute Table\ 2.9.1, $r sup star (1,^T,^H)^=^1.004$, which is closer to the asymptotic value $c(1) =1$ than the value of $r(1,^T,^H)^=^0.989$. (The other two sets of values that were considered give $r sup star (1,^T,^H)^=^1.0003$ and 0.9995, respectively.) Thus one of the main problems in using the empirical data is that we do not have good conjectures about asymptotics of moments of $Z(t)$, and that second order terms in those asymptotics are likely to be only slightly smaller than the main terms. (See also Section\ 2.10 on deviations between observed and expected behavior of $Z(t)$.) .P Some data were obtained also about the negative moments of $|Z(t)|$. Table\ 2.9.2 shows some values of .DS 2 .EQ 1 over H ~ int from T to T+H ~|Z(t)| sup {- 2 lambda} ^dt .EN .DE for $T$ and $H$ as in Table\ 2.9.1. (The values for $T= gamma sub n sub 2$, $T+H ^=^gamma sub n sub 3$, were essentially identical.) They were obtained by applying Simpson's rule to the inner 38 subintervals in every gap between consecutive zeros, and approximating $|Z(t)|$ by a linear function on the two outer subintervals. 
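.P
As an illustration of the quadrature described in this section, the following Python sketch applies composite Simpson's rule on 40 subintervals of each gap between consecutive zeros and normalizes the result as in (2.9.3). It is not the program that produced the tables: the function names and the callable \f2f\f1 standing in for $Z(t)$ are assumptions made for this example, and only $lambda ^>=^0$ is handled (the linear treatment of the two outer subintervals that was used for the negative moments is not reproduced).

```python
import math


def simpson(values, h):
    """Composite Simpson's rule for equally spaced samples with spacing h."""
    n = len(values) - 1          # number of subintervals
    if n < 2 or n % 2:
        raise ValueError("need an even, positive number of subintervals")
    total = values[0] + values[-1]
    total += 4 * sum(values[1:-1:2])   # odd-indexed interior samples
    total += 2 * sum(values[2:-1:2])   # even-indexed interior samples
    return total * h / 3


def moment_over_gaps(f, zeros, lam, pieces=40):
    """Estimate (1/H) * integral of |f(t)|^(2*lam) over [zeros[0], zeros[-1]].

    Each gap between consecutive entries of `zeros` is cut into `pieces`
    equal subintervals and Simpson's rule is applied gap by gap.
    """
    total = 0.0
    for a, b in zip(zeros, zeros[1:]):
        h = (b - a) / pieces
        samples = [abs(f(a + i * h)) ** (2 * lam) for i in range(pieces + 1)]
        total += simpson(samples, h)
    return total / (zeros[-1] - zeros[0])


def r_estimate(f, zeros, lam, pieces=40):
    """Analogue of r(lambda, T, H) in (2.9.3), with T = zeros[0]."""
    return moment_over_gaps(f, zeros, lam, pieces) / math.log(zeros[0]) ** (lam * lam)


if __name__ == "__main__":
    # Toy check with sin(t) in place of Z(t): the mean of sin^2 is 1/2.
    toy_zeros = [k * math.pi for k in range(1, 6)]
    print(moment_over_gaps(math.sin, toy_zeros, 1.0))
```

The toy check replaces $Z(t)$ by a sine wave only because its zeros and mean square are known exactly; an actual run would need an evaluator for $Z(t)$ and a list of computed zeros.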
.P Conrey and Ghosh [CG2] have shown (assuming the RH) that .DS 2 .EQ (2.9.5) 1 over M ~ sum from m=1 to M ~ max from {gamma sub m ^<^t^<^gamma sub m+1} ~Z(t) sup 2 ~wig~ 1 over 2 (e sup 2 - 5) ^log ^( gamma sub M /( 2 pi )) .EN .DE as $M^->^inf$. Since $c(1) =1$, this means that on average $Z(t) sup 2$ at its maxima is $1+ 1 over 2 (e sup 2 -7) = 1.1945^"..."$ times the average of $Z(t) sup 2$ over the entire range $0^<^t^<=^gamma sub M$. (This surprisingly small factor of 1.1945... is due to the fact that the values of $Z(t) sup 2$ at the critical points where they achieve their maxima are not weighted by the lengths of the intervals on which the maxima are computed. Large values of $Z(t)$ are usually associated to large gaps between consecutive zeros.) Actual computation over the range from $T= gamma sub n sub 2$ to $T+H = gamma sub n sub 3$ yielded a value of 1.224... instead of the asymptotic value of $1.1945 dd ^$. (The value 1.224... is probably a slight underestimate of the actual ratio, since the actual maxima were not determined, but the largest of the values at the 40 evenly spaced points was used.) .P Gonek [Gon1] has shown, again assuming the RH, that .DS 2 .EQ (2.9.6) 1 over M ~ sum from m=1 to M ~ Z( gamma sub m + i alpha DELTA ) sup 2 ~ wig~ left ( 1 ^-^ left ( {sin ^pi alpha} over {pi alpha} right ) sup 2 right ) ^log^( gamma sub M / ( 2 pi )) .EN .DE as $M^->^inf$, when $DELTA = 2 pi ( log ( gamma sub M /( 2 pi ))) sup -1$. Computations for $alpha =0.1,^0.2,^"...",^0.9$ and over the zeros numbered $n sub 4$, $n sub 4 +1 ,^"...",^n sub 5 -1$ showed reasonably good agreement, but with the ratio of empirical data to Gonek's asymptotic estimate declining by 4% as $alpha$ goes from 0.1 to 0.9.
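.P
The two asymptotic constants in (2.9.5) and (2.9.6) are easy to tabulate. The short Python sketch below does only that arithmetic; the function names are chosen for this example, and nothing here reproduces the empirical ratios quoted in the text.

```python
import math


def maxima_ratio():
    """(e^2 - 5)/2, the coefficient of the logarithm in (2.9.5).

    With c(1) = 1 this is the ratio of the average of Z(t)^2 at its
    maxima to the average of Z(t)^2 over the whole range.
    """
    return (math.e ** 2 - 5) / 2


def gonek_factor(alpha):
    """Coefficient 1 - (sin(pi*alpha)/(pi*alpha))^2 of the logarithm in (2.9.6)."""
    if alpha == 0:
        return 0.0               # limiting value as alpha -> 0
    x = math.pi * alpha
    return 1 - (math.sin(x) / x) ** 2


if __name__ == "__main__":
    print(f"maxima ratio = {maxima_ratio():.4f}")
    for k in range(1, 10):
        a = k / 10
        print(f"alpha = {a:.1f}   factor = {gonek_factor(a):.4f}")
```

The factor increases from 0 at $alpha =0$ to 1 at $alpha =1$, so the small-$alpha$ entries are the ones where a fixed absolute error in the empirical sum shows up as the largest relative discrepancy.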
.H 2 "Distribution of values of $fat zeta bold {( 1/2 + it)}$" .P Since .DS 2 .EQ log ^zeta (1/2 + it) ~=~ log ^ | Z (t) | ~+~pi i S(t) ^, .EN .DE it is not surprising that methods that yield the distribution of $S(t)$ should give corresponding results for $log ^| Z(t) |$. In fact, Selberg in unpublished manuscripts studied mean values of $( log ^zeta ( 1/2 + it)) sup h ( log ^zeta (1/2 - it )) sup k$ for nonnegative integers $h$ and $k$, and his results imply, for example, that for rectangles $E$ in $R sup 2$ .DS 2 .EQ (2.10.1) lim from {T^->^inf} ~ 1 over T ^left | ^ left { t^:~ T^<=^t^<=^2T ,~ {log ^zeta (1/2 + it )} over {( 2 sup -1 ^log^log^T) sup 1/2} ^member^E right } ^right | ^=^ ( 2 pi ) sup -1~ {int int} from E ~ e sup {- ( x sup 2 + y sup 2 )/2} dx dy ^,~~~"\0\0\0" .EN .DE so that in particular, for any $alpha ^<^ beta$, .DS 2 .EQ (2.10.2) lim from {T^->^inf} ~ 1 over T ^left | ^ left { t^:~ T^<=^t^<=^2T, ~alpha ^<^ {log^|Z(t)|} over {( 2 sup -1 ^log^log^T ) sup 1/2} ^<^ beta right }^ right | ~=~ (2 pi ) sup -1/2~int from alpha to beta ^e sup {- x sup 2 /2} dx ^.~~~"\0\0\0" .EN .DE Thus the real and imaginary parts of $log ^zeta (1/2 + it )$ behave like independent normal variables with means 0 and variances $( log ^log ^t) /2$. While Selberg's results have not been published, they were known to some mathematicians (see [Hej6,\|Joy1,\|Jut,\|Mon6]), and some extensions of Selberg's results have been obtained by Joyner [Joy1] and Tsang [Ts2]. The weaker result (2.10.2) has been reproved by Laurinchikas [Lau1,\|Lau2,\|Lau3,\|Lau4,\|Lau5]. .P The critical issue is whether the approximation (2.10.2) is accurate even for $T$ fixed and $alpha$ and $beta$ varying over fairly wide ranges. If that is the case, then we are led to expect that something like (2.9.1) holds.
Furthermore, if the approximation is very good even for $alpha$ and $beta$ relatively large (compared to $T$), one would expect that the maximal size of $|Z(t)|$, for $0^<=^t^<=^T$, would be of the order of $exp (( log ^T) sup {1/2 + o(1)} )$, which is conjectured by some to be the true rate of growth of $Z(t)$ (cf. Section\ 2.8). Thus it is of substantial interest to find out more about the tails of the distribution of $log ^|Z(t)|$. .P For $n sub 0 ^<=^n^<=^n sub 1 -1$, $n sub 0 ^=^ 10 sup 12 - 6032$, $n sub 1^=^n sub 0 + 10 sup 6$, each interval $( gamma sub n ,^gamma sub n+1 )$ was partitioned into 40 equal subintervals, $Z(t)$ was evaluated at the endpoints of these subintervals, and a linear approximation to $Z(t)$ between consecutive evaluation points was used to estimate .DS 2 .EQ (2.10.3) b sub {alpha ,^beta} ~=~ 1 over {gamma sub n sub 1 - gamma sub n sub 0} ~ left | ^"{"t^:~ gamma sub n sub 0 ^<=^t^<=^gamma sub n sub 1 , ~~ alpha ^<=^log ^| Z(t) | ^<=^beta "}" ^right | .EN .DE for $beta ^=^ alpha + 1/100$, $alpha^=^ k/100$, $- 1000^<=^k^<^1000$. The mean of this distribution, referred to as $N=10 sup 12$, (as derived from the $b sub {alpha ,^beta}$ data) was $5.29 times 10 sup -4$ and the variance was 2.2930. Similar data was obtained for $n sub 2 ^<=^n^<=^n sub 3 -1$, $n sub 2 ^=^ 10 sup 20 + 15,^316,^087$, $n sub 3 ^=^ n sub 2 + 10 sup 6$, and there the mean was $5.20 times 10 sup -4$ and the variance was 2.5657. (This is the $N=10 sup 20$ distribution.) Based on (2.10.2), one would expect mean values of 0, which is quite close to the calculated values, given the errors in the computation and sampling errors. The values for the variances would be expected to be $( log ^log ^T) /2$, where $T$ is the height of the data set, which equals 1.635 and 1.894 for the two data sets, respectively. 
Since $( log ^log ^T) /2$ is only the asymptotic value and increases very slowly, lower order terms can be expected to be significant, and so the agreement between observed data and theory is reasonably good on this point as well. However, the shapes of the observed distributions of $log ^|Z(t)|$ appear to be quite different from the asymptotic normal distribution. To obtain a good comparison, the two distributions for $N=10 sup 12$ and $10 sup 20$ were each scaled so as to have variance\ 1, and were plotted in Fig.\ 2.10.1 together with the standard normal distribution. We see that while the fit of the $N=10 sup 20$ data is slightly better than that for $N=10 sup 12$, it is not much better. This is in great contrast to the fit of the data for $S (t)$ (which, apart from a factor of $1/ pi$, is the imaginary part of $log ^zeta (1/2 +it)$, while $log ^| Z(t)|$ is the real part of it) which, as we see in Section\ 2.4 and Fig.\ 2.4.1, is much better. It might be of some interest to compute second order terms in the expansion of moments of $log ^| Z(t) |$ to see what is responsible for the deviations from the asymptotic behavior that are visible in the data. In view of Goldston's results [Go2] (mentioned in sections\ 2.4 and 2.6), it seems likely that such higher order terms depend on the pair correlation of zeros, and even on higher order correlations. .P The area between the empirical distribution curve for $N=10 sup 12$ in Fig.\ 2.10.1 and the normal curve is 0.132, while for $N=10 sup 20$ the corresponding area is 0.114. In both cases these areas are much larger than those for the distribution curves for $S(t)$ discussed in Section\ 2.4, which confirms the impression one obtains by comparing Fig.\ 2.4.1 to Fig.\ 2.10.1. .P Table\ 2.10.1 presents fairly extensive data on the moments of $log^|Z(t)|$. The six sets of data summarized in this table were all obtained by choosing $10 sup 6$ random points in an interval of length $1.5 times 10 sup 5$. 
For $N=10 sup 12$, this interval started near zero number $10 sup 12^-^6,^032$. For $N=10 sup 18 (a)$ and $N=10 sup 18 (b)$, the intervals were the same, starting near zero number $10 sup 18 ^-^ 8,^839$ but the random sequences were different, since different seeds were chosen for the pseudorandom number generator. This was done to show the size of the sampling error. For $N=10 sup 20 (c)$, the starting point was near zero number $10 sup 20 ^-^ 48,^776$, while for $N=10 sup 20 (d)$, it was near zero number $10 sup 20 ^+^15,^316,^087$. The mean and second moment for each data set are shown in the $k=1 sup star$ and $k=2 sup star$ entries, respectively. These were then applied to translate and scale the data sets so as to obtain mean equal to 0 and variance equal to 1, for ease of comparison with the standard normal distribution. The $k$-th entry in the table, $1^<=^k^<=^10$, gives the \%$k$-th moment of each scaled data set, and the last column gives the corresponding value for the normal distribution (0 for $k$ odd, $(k-1) ^cdot^ (k-3)^cdot^"..."^cdot^3^cdot^1$ for $k$ even). .P Given that the distribution of $log^|Z(t)|$ differs so much from the expected normal one, one has to treat the data about moments of $|Z(t)|$, for example, with extreme caution, as they may not be very representative of true asymptotic behavior. Furthermore, the general distribution of $Z(t)$ may be even less representative of what happens higher up. .P Figure\ 2.10.2 presents some empirical data on values of $Z(t)$. This figure is based on the values of $Z(t)$ in the three intervals covering $2.8 times 10 sup 6$ zeros that were described in the preceding section. For each interval between consecutive zeros, the function $|Z(t)|$ was approximated on 40 equal-sized subintervals by a linear function, and the length of the interval on which this linear approximation was in each range $[k-1,^k)$ for $k^>=^1$ was computed.
If $A sub k$ denotes the length of all the intervals on which the linear approximations were in $[k-1,^k)$, and .DS 2 .EQ q sub k ~=~ {A sub k} over {sum from k=1 to inf ~A sub k} .EN .DE the fraction of time spent there, then the plot in Fig.\ 2.10.2 shows $log^q sub k$. From this graph and other graphs based on the data from each of the three main intervals separately, it appears that for $k^wig^250$, the behavior of $log^q sub k$ is dominated by a few large peaks of $|Z(t)|$ (which also account for a large part of the values of high moments of $Z(t)$ dealt with in the previous section). In particular, the segments of the graph in Fig.\ 2.10.2 that shoot up are due to high peaks, with the final region $(k^>=^353)$ due to two peaks where $|Z(t)|$ reaches the neighborhood of 460, and the preceding region of increase in $log^q sub k$ being due to a point where $|Z(t)|$ is around 351. .H 2 "Values of $fat {zeta sup prime} bold {( 1/2 + i fat gamma )}$" Under the assumption of the RH and of a weak consequence of the pair correlation conjecture, namely that for some $tau ^>^0$, there is a constant $B$ such that .DS 2 .EQ (2.11.1) roman {"lim sup"} from {N^->^inf} ~ 1 over N ^left |^"{" n^:~ N^<=^n^<=^2N ,~~delta sub n ^<^c "}" right | ~<=~ B c sup tau .EN .DE holds uniformly for all $c^member^(0,^1)$, Hejhal [Hej6] has shown that for all $alpha ^<^ beta$, .DS 3 .EQ (2.11.2) lim from {N^->^inf} ~1 over N ^left | ^left { n^:~ N^<=^n^<=^2N ,~~ {log left | ^{2 pi Z sup prime ( gamma sub n )} over {log ( gamma sub n ( 2 pi ) sup -1 )} right |} over {( 2 sup -1 ^log^log^N) sup 1/2}~member~( alpha ,^beta ) right }^ right | ~=~ ( 2 pi ) sup -1/2 int from alpha to beta ^e sup {- x sup 2 / 2} dx ^. .EN .DE (Note that under the RH, which we assume throughout this section, $| zeta sup prime ( rho ) |^=^ |Z sup prime ( gamma )|$ for $rho = 1/2 + i gamma$.) 
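.P
To make the normalization in (2.11.2) concrete, the following Python sketch evaluates the centering term $log ( log ( gamma /(2 pi )) /(2 pi ))$ and the variance $( log ^log ^N)/2$ that the limit law suggests for $log ^|Z sup prime ( gamma sub n )|$. Reading the limit law as a prediction at a finite height is a heuristic, not part of Hejhal's theorem; the function name is an assumption of this example, and the height of zero number $10 sup 20$ is the value quoted in the Introduction, truncated to double precision.

```python
import math


def hejhal_prediction(gamma, n_index):
    """Centering and variance suggested by (2.11.2) for log|Z'(gamma_n)|.

    gamma   -- height of the zero
    n_index -- index N of the zero, used in (log log N)/2
    """
    centering = math.log(math.log(gamma / (2 * math.pi)) / (2 * math.pi))
    variance = 0.5 * math.log(math.log(n_index))
    return centering, variance


if __name__ == "__main__":
    # Height of zero number 10^20 (from the Introduction), as a float.
    gamma_1e20 = 1.52024401159207e19
    centering, variance = hejhal_prediction(gamma_1e20, 1e20)
    print(f"centering = {centering:.3f}, variance = {variance:.3f}")
```

Both quantities grow like an iterated logarithm, so they change very little over any range that is accessible to computation.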
.P As is the case with the values of $Z(t)$, we would like to obtain more information about the tails of the distribution of $Z sup prime ( gamma sub n )$, and in particular about the moments. Let us define .DS 2 .EQ (2.11.3) J sub lambda (T) ~=~ sum from {pile {n above gamma sub n ^<=^T}} ~| Z sup prime ( gamma sub n ) | sup {2 lambda} ^. .EN .DE Then $J sub lambda (T)$ exists for all $lambda ^>=^0$, and if the zeros of the zeta function are all simple (as they are conjectured to be, and as is the case with all of the zeros that have been computed) then $J sub lambda (T)$ also exists for $lambda ^<^0$. The only nontrivial proved asymptotic result is due to Gonek [Gon1] under the assumption of the RH; .DS 2 .EQ J sub 1 (T) ~wig~ T over {24 pi} ~ ( log ^T) sup 4 ~~~roman as ~~~ T^->^inf ^. .EN .DE It is trivial that $J sub 0 (T) ^wig^T over {2 pi}^log^T$ as $T^->^inf$, and it is known (cf. [Tit2; Section\ 14.27]) that $J sub {- 1/2} (T) /T^->^inf$ as $T^->^inf$. Gonek [Gon3] has also shown that (under the RH) $J sub -1 (T)^>=^cT$ for some $c^>^0$. If the limit law (2.11.2) holds fairly well even for small $N$, and the tails of the distribution of $Z sup prime ( gamma sub n )$ are not too large, then we might expect (as was suggested by Hejhal [Hej6] and stated explicitly by Gonek [Gon3]) that $J sub lambda (T)$ is on the order of .DS 2 .EQ (2.11.4) T( log^T ) sup {( lambda +1) sup 2} ~~~roman as ~~~ T^->^inf ^. .EN .DE Furthermore, Gonek [Gon3] has conjectured that .DS 2 .EQ (2.11.5) J sub -1 (T) ~wig~ 3 pi sup -3 T ~~~roman as ~~~T^->^inf ^. .EN .DE .P Approximate values were obtained for $|Z sup prime ( gamma sub n )|$, $n sub 0 ^<=^n^<=^n sub 0 ^+^10 sup 6 -1$, where $n sub 0 ^=^ 10 sup 20 ^+^15,^316,^107$. 
Since the behavior of $Z(t)$ is determined primarily by zeros close to $t$ (cf.\ [BH, Hej6]), it was assumed that for $t$ near $gamma sub n$, $Z(t)$ is approximated well by .DS 2 .EQ (2.11.6) a ~ prod from j=-20 to 20 ~ (t- gamma sub n+j ) ^, .EN .DE where $a$, representing the influence of zeros far away from $gamma sub n$, is almost a constant, and this led to approximating $Z sup prime ( gamma sub n )$ by .DS 2 .EQ (2.11.7) epsilon sup -1 ^Z( gamma sub n + epsilon ) ~ prod from {pile {j=-20 above j^!=^0}} to 20 ~ {gamma sub n - gamma sub n+j} over {gamma sub n + epsilon - gamma sub n+j} ^, .EN .DE where $epsilon ^=^ ( gamma sub n+1 - gamma sub n ) /40$. Varying the number of terms in the heuristic approximation (2.11.6) as well as varying $epsilon$ suggested that (2.11.7) does produce a good approximation to $Z sup prime ( gamma sub n )$. .P The smallest value of $| Z sup prime ( gamma sub n )|$ that was found was 0.13, while the largest was $2.47 times 10 sup 3$. The values of $log ^|Z sup prime ( gamma sub n )|$ had a mean of 3.35 and a variance of 1.14, in contrast to 1.91 and 1.9, respectively, which are predicted by Hejhal's result (2.11.2). Given the slow rate of growth of these quantities, second order terms in the asymptotic results are likely to be comparatively large, so this difference between expected and observed values is probably not significant. If we let .DS 2 .EQ (2.11.8) v sub n ~=~ ( log ^| Z sup prime ( gamma sub n )| ^-^ m ) / sigma ^, .EN .DE where $m$ is the mean and $sigma$ the standard deviation of our set of $log ^|Z sup prime ( gamma sub n )|$, then Fig.\ 2.11.1 shows a comparison of the distribution of $v sub n$ with the standard normal distribution.
The line is the standard normal density, while the scatterplot represents a histogram of $v sub n$; for each interval $[ alpha ,^beta )$, $alpha ^=^ k/50$, $beta ^=^ alpha + 1/50$, a star is placed at $(x,^y)$, $x^=^(( alpha + beta ) /2-m)/ sigma$, $y ^=^sigma ^ b sub {alpha ,^beta}$, where .DS 2 .EQ (2.11.9) b sub {alpha ,^beta} ~=~ 50 over {10 sup 6} ^left | ^"{" n^:~ v sub n ^member^[ alpha ,^beta ) "}" right | ^. .EN .DE It is worth noting that the distributions of $log ^| Z (t)|$ and $log ^|Z sup prime ( gamma ) |$ are both supposed to be asymptotically normal, but the convergence appears much faster for $log ^| Z sup prime ( gamma )|$, as is revealed by a comparison of Fig.\ 2.11.1 to Fig.\ 2.10.1. This is true even though the asymptotic normality of $log ^|Z(t)|$ is an unconditional theorem, while that of $log^|Z sup prime ( gamma ) |$ depends on unproved assumptions. .P Table\ 2.11.1 gives the moments of the $v sub n$ and of the asymptotic normal distribution. The entry for $k=1 ,^dd ,^10$ denotes the \%$k$-th moment of $v sub n$ and the normal distribution, while the $k=1 sup star$ and $k=2 sup star$ entries give the first two moments of $log ^|Z sup prime ( gamma sub n ) |$, respectively. Comparison with Table\ 2.10.1 again shows much better agreement between empirical and expected values for $log^|Z sup prime ( gamma sub n ) |$ than for $log^|Z(t)|$. .P Moments of $Z sup prime ( gamma sub n )$ for the $10 sup 6$ values that were computed are shown in Table\ 2.11.2. 
The estimate (2.11.4) suggests that for $M$ relatively large, .DS 2 .EQ (2.11.10) J sub lambda sup star (M,^N) ~=~ 1 over M ~ sum from n=N+1 to N+M ~ | Z sup prime ( gamma sub n ) | sup {2 lambda} .EN .DE ought to be of the order of magnitude of .DS 2 .EQ ( log ^gamma sub N ) sup {( lambda +1) sup 2 -1} ^, .EN .DE while for $lambda =1$ and $-1$ we ought to have the more precise relations .DS 3 .EQ (2.11.11) J sub 1 sup star (M,^N) mark ~wig~ 1 over 12 ~ ( log ^gamma sub N ) sup 3 ^, .EN .sp .EQ (2.11.12) J sub -1 sup star (M,^N) lineup ~wig~ 6 pi sup -2 ~ ( log ^gamma sub N ) sup -1 .EN .DE (the asymptotic relations holding as $M,^N^->^inf$ with $M$ relatively large). Table\ 2.11.2 shows the ratio of empirical to expected values, namely .DS 2 .EQ (2.11.13) r sub lambda ~=~ J sub lambda sup star ( 10 sup 6 ,^n sub 0 -1 ) ( log ^gamma sub n sub 0 ) sup {1- ( lambda +1) sup 2} ^. .EN .DE .P The value for $lambda =1$ is in excellent agreement with (2.11.11) (which is a theorem under the assumption of the RH), while the value for $lambda =-1$ is reasonably consistent with that of (2.11.12). Since (2.11.12) is derived from Gonek's conjecture (2.11.5), this supports the conjecture. .P A theorem announced by Fujii [Fu4] (which assumes the RH) states that .DS 3 .EQ sum from {0^<^gamma ^<=^T} ~ zeta sup prime ( 1/2 + i gamma ) mark ~=~ T over {4 pi} ~ log sup 2 ~ T over {2 pi} ~+~ c sub 0 ~ T over {2 pi} ~ log~ T over {2 pi} .EN .sp .5 .EQ (2.11.14) ~~ .EN .sp .5 .EQ lineup ~+~ c sub 1 ~ T over {2 pi} ~+~ O(T sup {9/10^+^o(1)} ) .EN .DE as $T^->^inf$, where $c sub 0$ and $c sub 1$ are explicit constants. This turns out to be in excellent agreement with the empirical result .DS 2 .EQ (2.11.15) sum from {n=n sub 0} to {n sub 0 +10 sup 6 -1} ~ zeta sup prime (1/2 + i gamma sub n ) ~=~ 2.181 times 10 sup 6 ~+~ i ^8.7 times 10 sup 3 ^. .EN .DE .P The approximate procedure that was used to evaluate $Z sup prime ( gamma sub n )$ can be replaced by a much more rigorous and accurate method.
The algorithm of [OS] that was used to compute $Z(t)$ precomputes a set of values from which $Z(t)$ is obtained by interpolation. However, the main interpolation formula (4.3.15) can be differentiated with respect to $t$, which enables one to compute $Z sup prime (t)$ (and therefore also $zeta sup prime (1/2 + it )$) from the basic data. If such a program were written, it could be used to check the speed of convergence of the distribution of $log ^| zeta sup prime (1/2+ it)|$ to the Gaussian limit that has been shown to hold under the assumption of the RH by Hejhal [Hej6]. .H 2 "Gram points and blocks" \f2Gram's law\f1 is the empirical observation that $Z(t)$ usually changes sign in each \f2Gram interval\f1 $G sub n ^=^[g sub n ,^g sub n+1 )$, $n^>=^-1$. (The Gram points $g sub n$ are defined in Section\ 2.0.) Gram [Gram] observed that it held in the range of values he investigated, but he conjectured that it would fail eventually. The first counterexample occurs for $G sub 125$, and was discovered by Hutchison [Hu]. If Gram's law held universally, the RH would be true. However, it is known that this ``law'' fails infinitely often. On the other hand, it does hold for a large fraction of cases. For $n^<=^1.5 times 10 sup 9$, Gram's law holds 72.79% of the time [LRW2]; among $10 sup 6$ Gram intervals near zero number $10 sup 12$, it holds 70.82% of the time; and among $10 sup 6$ Gram intervals near zero number $10 sup 20$, it holds 68.9% of the time. (Under the GUE and some further assumptions to be discussed later, one might expect that asymptotically, Gram's law would hold 66.3% of the time.) .P One barely plausible reason why Gram's law might hold (and why the RH might hold) is that in the Riemann-Siegel formula for $Z(t)$ (see Eq.\ (4.1.2)) the leading term equals $2(-1) sup n$ at $t=g sub n$. If this term, which is the largest, were truly dominant, then Gram's law and the RH would follow.
We now know this to be false, but there is still a lot of interest in the behavior of $Z(t)$ at Gram points, since sign changes of $Z(t)$ correspond to zeros of the zeta function on the critical line. .P A Gram point $g sub n$ is called \f2good\f1 if $(-1) sup n ^Z(g sub n )^>^0$, and \f2bad\f1 otherwise. A \f2Gram block\f1 is an interval $B sub n ^=^[g sub n ,^g sub n+k )$ such that $g sub n$ and $g sub n+k$ are good Gram points, while $g sub n+1 ,^dd ,^g sub n+k-1$ are bad Gram points. The \f2length\f1 of a Gram block $B sub n ^=^ [g sub n ,^g sub n+k )$ is $k$. The \f2pattern of zeros\f1 in a Gram block $B sub n ^=^[g sub n ,^g sub n+k )$ is the string $a sub 1 ^...^a sub k$, where $a sub i$ denotes the number of zeros of $Z(t)$ in $[g sub n+i-1 ,^g sub n+i )$. Since no Gram interval with more than 4 zeros has yet been found, writing $a sub 1 ^...^a sub k$ without comma separators is unambiguous. (Gram intervals with arbitrarily many zeros almost surely exist, but given the GUE predictions about zeros repelling each other, they are likely to be very rare.) .P The statistics that have been collected on Gram intervals and blocks (as well as on exceptions to Rosser's rule, which are discussed in Section\ 2.13) are subject to errors, not only due to the roundoff problems that have been mentioned before and are discussed extensively in Section\ 4, but also to the fact that even if the computations of $Z(t)$ were exact, Gram points were determined only approximately, so that the determinations of the signs of $Z(g sub n )$ were not certain. No special precautions were taken to deal with this problem (such as checking on the size of the computed value of $Z(g sub n )$) as it was felt that this was unlikely to affect general statistics. .P The computations of [LRW2] of the first $1.5 times 10 sup 9$ zeros found only 6 Gram blocks of length 9, and none of lengths $>=^10$. 
In contrast, the maximal lengths of Gram blocks found during the present computations were 9 for $N=10 sup 12$, 9 for $N=10 sup 14$, 11 for $N=10 sup 16$, 13 for $N=10 sup 18$ (1 case), 12 for $N=10 sup 19$, and 14 for $N=10 sup 20$ (1 case, with zero pattern $0111...13110$). .P Table\ 2.12.1 gives the fraction of Gram blocks in given data sets that had given lengths. The $N=1$ and $N=1.4 times 10 sup 9$ data is derived from Table\ 1 of [LRW2], and comes from two sets of $10 sup 8$ Gram intervals each, the first one starting at $g sub 0$, the second at $g sub n$ for $n=1.4 times 10 sup 9$. The $N=10 sup 12$ data is based on only 1,\|590,\|000 Gram intervals. .P The main program did not keep track of Gram blocks according to their pattern of zeros. However, a special study was made of two blocks of $10 sup 6$ Gram intervals each, one starting at $g sub n sub 1$, $n sub 1 ^=^ 10 sup 12 - 6,^034$, the other at $g sub n sub 2$, $n sub 2 ^=^10 sup 20 - 42,^780$, which for the remainder of this section will be referred to as the $N=10 sup 12$ and $N=10 sup 20$ data sets, respectively. .P If a Gram block $B(n,^k)$ contains exactly $k$ zeros (so is not associated with a violation of Rosser's rule, see Section\ 2.13) then its zero pattern must be either $211...110$, or $011...112$, or $011...131...110$ (where any number of 1's in the indicated pattern might be missing). Van\ de Lune et\ al. [LRW2] noted in their computations that for a fixed $k$, the first two zero patterns seemed to be much more frequent than the third one, and that the frequencies seemed stable as the height of zeros increased. The new computations, however, show a steady decrease in the frequency with which the third pattern appears. Table\ 2.12.2 shows the actual numbers. The $N=1$ entry is drawn from Table\ 2 of [LRW2], which is based on statistics of 3 sets of $10 sup 8$ Gram intervals each, starting at $g sub 0$, $g sub {7 times 10 sup 8}$, and $g sub {1.4 times 10 sup 9}$.
In all cases only Gram blocks of length $k$ with exactly $k$ zeros are considered, and the entry in the table gives the fraction of all such Gram blocks that have a zero pattern with a 3 in it. The decrease in the frequency of the third zero pattern is rather puzzling. The GUE theories suggest that this pattern ought to occur a positive proportion of the time. .P Table\ 2.12.3 presents data on the fraction of Gram intervals that contain a given number of zeros. The $N=1$ and $N= 1.4 times 10 sup 9$ data sets are the same as in Table\ 2.12.1, and these entries come from Table\ 5 of [LRW2]. Note that there were no Gram intervals with $>=^4$ zeros in the $N=10 sup 12$ and $N=10 sup 20$ sets (although such intervals did turn up in other data sets around the $10 sup 20$-th zero, for example). .P The GUE entry in Table\ 2.12.3 comes from assuming that a Gram interval does not differ from any other interval of that length, and so the entry in the table for a given $m$ in the GUE row is the probability that an interval of length 1 contains exactly $m$ zeros. Since the averages of $S(t)$ do increase as $t$ increases, it seems reasonable to expect that at large heights the local distribution of the zeros will be independent of Gram points, which leads to the above assumption (cf.\ [Fu4]). In other words, the expectation is that at large heights, any grid of points spaced like the Gram points would exhibit similar behavior with respect to location of zeros. .P If the zeros at large heights are distributed independently of the Gram points, in the sense above, namely that shifting all the Gram points in a large interval by the same amount would not affect the statistics of Gram intervals and blocks, then we can expect that if we define .DS 2 .EQ z sub n ~=~ {gamma sub n - g sub m} over {g sub m+1 - g sub m} ~~~~roman if ~~~~ gamma sub n ^member ^[g sub m ,^g sub m+1 ) ^, .EN .DE then the $z sub n$ will be distributed uniformly in the unit interval [Fu4]. 
(So far, the equidistribution of the $gamma sub n$ has been shown only modulo much coarser grids, see [Hl,\|Fu3].) Figure\ 2.12.1 shows the distribution of $z sub n$ for the two data sets $N=10 sup 12$ and $N=10 sup 20$. In each case a histogram was prepared giving the number of $z sub n ^member^[j/1000$, $(j+1)/1000)$, $0^<=^j^<^10 sup 3$, and this data was used to derive the smooth curve in the picture using the lowess function of [BC]. A perfectly uniform distribution would correspond to a straight horizontal segment at height\ 1, while the most nonuniform distribution (which also would minimize the moments of $|S(t)|$), corresponds to a point mass at 1/2 and 0 elsewhere. The $N=10 sup 20$ curve is much closer to this conjectured uniform behavior than the $N=10 sup 12$ curve, and neither is far away from it. The area between the curve in Fig.\ 2.12.1 and the straight horizontal segment at height 1 is 0.051 for $N=10 sup 12$ and 0.028 for $N=10 sup 20$. .P A quantitative study of the extent to which the sign of $Z(g sub n )$ might coincide with $(-1) sup n$ was initiated by Titchmarsh [Tit0], who showed (as might be expected from the Riemann-Siegel formula (4.1.2)) that as $M^->^inf$, .DS 3 .EQ (2.12.1) M sup -1 ~ sum from n=1 to M ~Z(g sub n ) mark ~=~ o(1) ^, .EN .sp .EQ (2.12.2) M sup -1 ~ sum from n=1 to M ~ (-1) sup n ^Z(g sub n ) lineup ~=~ 2 ~+~ o(1) ^, .EN .sp .5 .nr Eq 1 .EQ "as well as" ~~ .EN .sp .5 .nr Eq 0 .EQ (2.12.3) M sup -1 ~ sum from n=1 to M ~ Z(g sub n ) ^Z(g sub n+1 ) lineup ~=~ -^2(1+c sub 0 ) ~+~ o(1) .EN .DE where $c sub 0 = 0.577 dd$ is Euler's constant. These results have been strengthened and extended considerably by Moser [Mos1-Mos9, Mos11, Mos12, Mos14]. Table\ 2.12.4 presents some averages involving the $Z(g sub n )$ that were computed, using the 2 sets of $10 sup 6$ values each that were specified above. 
For example, the $|Z sup 3 (g sub n ) |$ entry gives the value of .DS 2 .EQ 10 sup -6 ~ sum from n=R to {R+ 10 sup 6 -1} ~ |Z sup 3 (g sub n ) | .EN .DE for the appropriate $R$. We see that the computational results are in excellent agreement with Titchmarsh's results (2.12.1) to (2.12.3). .H 2 "Rosser's rule violations" Rosser's rule, formulated on the basis of empirical evidence, states that a Gram block $B(n,^k)$ contains at least $k$ zeros. It thus requires less regularity than Gram's law, yet if Rosser's rule holds universally, it would imply the RH (just as the validity of Gram's law would), and would also imply that every Gram block $B(n,^k)$ contains exactly $k$ zeros. However, it is easy to see that Rosser's rule holding up to height $T$ is equivalent to the bound $|S(t)|^<^2$ holding for $t^<^T$, which contradicts the unboundedness of $S(t)$. Thus Rosser's rule has to fail infinitely often. .P $S(t)$ grows very slowly, and Rosser's rule holds for most Gram blocks that have been checked. The first exception to Rosser's rule (which is defined as a Gram block $B(n,^k)$ which has fewer than $k$ zeros) is $B(n,^2)$ with $n^=^13,^999,^525$ [Br5]. There are only 15 exceptions to Rosser's rule for $n^<=^7.5 times 10 sup 7$ [Br5], and 3055 exceptions for $n^<=^1.5 times 10 sup 9$ [LRW2]. Among the values of $n$ with $1.4 times 10 sup 9 ^<=^n^<=^1.5 times 10 sup 9$, there were 0.287 exceptions per $10 sup 6$ zeros. .P The new computations found 22521 exceptions to Rosser's rule. Table\ 2.13.1 shows how many occurred in each data set and their density. Not only are the exceptions in the new data sets more frequent, but they also are much more varied than those found among the first $1.5 times 10 sup 9$ zeros. If $B(n,^k)$ is an exception to Rosser's rule, then $k$ will be referred to as the \f2length\f1 of the exception. The pattern of zeros inside this block has to be $011^...^110$. (For notation, see Section\ 2.12.) 
To describe the exception, we have to specify where the two ``missing zeros'' are located. We will use the notation .DS 2 .EQ (2.13.1) kX a sub 1 a sub 2 ^...^a sub m ^, ~~~~~~~X^=^ L ~~roman or ~~ R ^, .EN .DE to denote an exception $B(n,^k)$ where the missing zeros are to the left of $B(n,^k)$ (if $X=L$) or to the right of it (if $X=R$), and where $a sub 1 a sub 2 ^...^a sub m$ denotes the pattern of zeros in the smallest union of Gram blocks that is adjacent to $B(n,^k)$ and contains the missing zeros. Thus, for example, $3L0312$ denotes an exception of length 3, where the pattern of zeros in $[g sub n-4 ,^g sub n+3 )$ is 0312010. This is not a completely unambiguous description, but it suffices for all the cases that have been encountered, as no case of 3 exceptions to Rosser's rule that are close together has been found. We will refer to (2.13.1) as the \f2type\f1 of the exception, and $m$ will be called the \f2length of the excess block\f1. With this notation the 3055 exceptions around the first $1.5 times 10 sup 9$ zeros fall into just 13 types: .DS 3 .EQ mark 2R3 ,~~2L3 ,~~2R40, ~~2L04, ~~2R22 ,~~ 2L22 ,~~2R230, .EN .sp .EQ lineup 2L032, ~~2R410 ,~~3R3, ~~3L3, ~~3R40 ,~~3L04 , .EN .DE with 2715 of them being of types $2R3$ and $2L3$, and only 82 being of length 3. In particular, all lengths of exceptions and lengths of excess blocks are $<=^3$. .P The new exceptions fall into 146 distinct types. The relative frequencies of the most popular types in the new data sets and also in the first $1.5 times 10 sup 9$ zero computations are shown in Table\ 2.13.2. .P The maximal length of an exception that was found is 8, and it occurs in two exceptions, both of type $8R3$. There are 25 exceptions of length 7, and 123 of length 6. The maximal length of an excess block is 8, and occurs in 3 exceptions of types $2R21111130$, $2R21113110$, and $2L01311112$. There are 14 cases of excess blocks of length 7, and 84 of length 6. 
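.P
The bookkeeping behind Gram blocks and exceptions to Rosser's rule can be illustrated with a short Python sketch that works directly on a string of zero counts such as 0312010. It is not the code used in the computations. It assumes that the string starts at a good Gram point, that $Z(g sub n )^!=^0$ at every Gram point, and that every counted zero is a sign change of $Z(t)$; under those assumptions a Gram point has the same status (good or bad) as its predecessor exactly when the Gram interval between them holds an odd number of zeros. The function names are chosen for this example.

```python
def gram_blocks(pattern):
    """Split a string of per-interval zero counts into Gram blocks.

    Assumes pattern[0] is the Gram interval that starts at a good Gram
    point. Returns (blocks, tail), where tail is the unfinished part of
    the pattern that does not yet end at a good Gram point.
    """
    blocks, current, good = [], "", True
    for ch in pattern:
        current += ch
        if int(ch) % 2 == 0:     # an even count flips good <-> bad
            good = not good
        if good:                 # right endpoint is good: block is complete
            blocks.append(current)
            current = ""
    return blocks, current


def rosser_exceptions(pattern):
    """Return the Gram blocks of the pattern that hold fewer zeros than their length."""
    blocks, _ = gram_blocks(pattern)
    return [b for b in blocks if sum(int(c) for c in b) < len(b)]


if __name__ == "__main__":
    blocks, tail = gram_blocks("0312010")
    print(blocks, tail)                  # excess block followed by the exception
    print(rosser_exceptions("0312010"))
```

For the pattern 0312010 of the $3L0312$ example, the sketch splits the string into the excess block 0312 (length 4, six zeros) and the exception 010 (length 3, one zero), so that the seven Gram intervals together still contain seven zeros.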
.P Some of the 22521 exceptions to Rosser's rule that were found in the main computations occur very close to each other. There are several cases where 2 exceptions are separated by a single Gram interval. The smallest such case that was found is that of $B(n,^3)$ and $B(n+4,^5)$, where $n=10 sup 16 + 3,^916,^331$, and the pattern of zeros in $[g sub n ,^g sub n+10 )$ is 0103011103. No case was found where two exceptions are adjacent. (However, Section\ 3 presents results of other computations that found several examples of this phenomenon.) Finally, no example of 3 exceptions close to each other has been found. .H 1 "Special points for the zeta function" .HU "3.0\0 Introduction" The main computations described in Section\ 2 were carried out at heights that were thought likely to be fairly random with regard to the zeta function behavior. Thus the data that was collected was likely to be representative of long-run statistics of the zeta function at these heights. However, what would be most interesting is not to study the typical behavior but rather to look at extreme values. It would be desirable, for example, to determine where the smallest spacing of consecutive zeros up to some height is without finding all the zeros up to that height. No way to do this is known. It is not even known how to find places where consecutive zeros are very close to each other. The problem there is that one would need a way to determine places where both the zeta function and its derivative are small, and this is not feasible currently. On the other hand, there are ways to determine values of $t$ where $zeta (1/2 + it )$ is likely to be very large. Such methods have been used before [KW,\|vdL,\|Od2], and the method described later in this section is basically a development of the method that was mentioned briefly in [Od2].
These methods determine values of $t$ for which the large initial terms in formulas for $zeta (1/2 + it )$ have the same argument, and therefore add up to a large quantity that will hopefully not be cancelled by the remaining terms. .P One reason for the interest in large values of $zeta (1/2 + it )$ is that one could think of a large peak as ``pushing aside'' the zeros that would normally lie in that area, and if these zeros were pushed off of the critical line, one would find a counterexample to the RH. No such counterexamples were found in these computations, but many interesting phenomena were observed. .P Section\ 3.1 presents the results of the computations near the special points. Section\ 3.2 describes the diophantine approximation algorithms that were used to construct these special points. Finally, Section\ 3.3 discusses how these algorithms could be improved, and what other computations could be attempted in the future. .H 2 "Computational results" The computations of this section, which are summarized in tables\ 3.1.1 and 3.1.2, found 5,\|168,\|540 zeros. As is the case with the main sets of zeros described in Section\ 2, even if the programs are correct and roundoff errors do not matter, it is not absolutely certain that the few dozen zeros at the ends of the data sets are indeed all of the zeros in those ranges (cf.\ Section\ 2.1). However, for the purpose of exposition, it will be assumed that they are. .P There were 22 separate computations, and the sets of zeros and special points associated to them will be denoted with the letters A through V. Table\ 3.1.1 shows the first zero of each data set, the number of zeros in that set, and the value of $t sub 0^=^( gamma sub n + gamma sub n+1 ) /2$ for that $n$ for which $|Z(( gamma sub n + gamma sub n+1 )/2)|$ is largest among all $n$ in the data set. 
Table\ 3.1.2 then shows the value of $Z(t)$ at $t=t sub 0$, the largest value of $S(t)$ in a neighborhood of $t sub 0$ (which is in all cases the largest $S(t)$ in a given data set, but which does not always occur at $gamma sub n$ or $gamma sub n+1$), the value of $delta sub n$ (which in all cases is the largest $delta sub m$ in a given data set), and the pattern of zeros in a union of Gram blocks that include $t sub 0$ (see Section\ 2.12 for notation). .P The entries in Table\ 3.1.2 show that the attempt to produce unusual behavior of the zeta function was very successful. The value of $|Z(t)|^approx^1580$ found in set U is far higher than 641, the largest value that was found in the main computations. Similarly, the value of $delta sub n ^=^ 5.1454$ from set C is the largest $delta sub n$ that has been found so far, and the value of $S(t)^=^2.8747$ from set T is a record for this function. Figures\ 3.1.1 and 3.1.2 show graphs of $Z(t)$ near the special value of $t=t sub 0$ from set T, and Fig.\ 3.1.3 shows a graph of $S(t)$ in that same range. .P Figures\ 3.1.1 and 3.1.2 are typical of those for the other sets in that they display a single very high peak of $|Z(t)|$, with other nearby values of $Z(t)$ much smaller. For example, in looking at stretches of about 30 Gram intervals centered at the special points, one finds only 3 peaks among all 22 data sets where the sign of $Z(t)$ was opposite to that at the main peak, and $|Z(t)| ^>^30$ was satisfied. The largest value of $|Z(t)|$ in such secondary peaks was 36. Thus we are probably still not seeing the expected behavior of large values of $Z(t)$ that is discussed at the end of Section\ 2.8. .P The general distribution of zeros as well as other properties of the zeta function in the ranges covered here were not too remarkable, aside from the behavior near the peaks of $|Z(t)|$. 
Exactly 3 midpoint values $w sub n$ (see (2.7.2) for a definition) that were $>^250$ were found away from the special values of $t$, but they were all $<^304$. Exactly 100 values of $w sub n ^<^ 5 times 10 sup -4$ were found, the smallest of them $2.47 times 10 sup -5$. The smallest value of $delta sub n$ that was found was $3.29 times 10 sup -3$, with the second smallest $7.62 times 10 sup -3$. (Since the probability that the minimum of 5,\|168,\|540 values of $delta sub n$ drawn from the GUE ensemble is $<=^3.29 times 10 sup -3$ is about 0.18, this is consistent with the tendency that was observed before of having the minimal $delta sub n$ somewhat smaller than expected.) There were 5459 values of $n$ with $delta sub n ^<^0.1$, and 844 values of $n$ with $delta sub n ^>^2.8$. The largest 22 $delta sub n$ that were found are the ones given in Table\ 3.1.2. The \%23-rd largest $delta sub n$ was 3.50. There were 1861 values of $delta sub n + delta sub n+1 ^<^0.6$, the smallest of them 0.2512, and 525 values of $delta sub n + delta sub n+1 ^>^4$, the largest of these 6.0165. (If $n=35,^200,^636,^070,^992,^305,^894$, so that $delta sub n = 4.3214$ is the largest $delta sub m$ in set V, then for this $n$ we have $delta sub n-1 + delta sub n = 6.0165$.) .P An initial concern about these computations was that they might give a distorted view of various properties of the zeta function, such as the distribution of $delta sub n$, for example, at the heights being investigated. This was due to the fact that the special points $t sub 0$ were chosen so that the initial terms in the Riemann-Siegel formula for $Z(t sub 0 )$ behave as if $t sub 0$ were close to 0. Thus it seemed possible that aside from the vicinity of the special point $t sub 0$, where $Z(t)$ is large, $Z(t)$ might behave as if $t$ were small, and so would be very constrained. However, that appears not to be the case.
The agreement between the distributions of the $delta sub n$ in our small sets and the GUE prediction is quite good when one compares graphs prepared like those of figures\ 2.3.4 and 2.3.6, and also when one prepares $q-q$ plots. In those comparisons, the presence of one relatively huge outlier does not make much of a difference. On the other hand, when comparing moments of $delta sub n -1$, one sees very substantial differences, especially for high moments. These are easy to explain. When computing the mean value of $( delta sub n -1 ) sup 10$ over $2.5 times 10 sup 5$ zeros, for example, a single value of $delta sub n = 5$ will contribute $4 sup 10 ^cdot ^4^cdot^10 sup -6 ^=^ 4.1943$ to the mean, whereas the GUE prediction for that mean is only 0.488. .P It is not always the case that the maximal value of $|S(t)|$ occurs at one of the zeros adjacent to the highest peak of $|Z(t)|$. For example, in set V, if we let $n= 35,^200,^636,^070,^992,^171,^653$, then $w sub n =1329.5$, $delta sub n = 4.3214$, but $delta sub n-1 = 1.6951$, and $S( gamma sub n-1 + ) ^=^ 2.8314$, $S( gamma sub n -) ^=^ 1.1363$, $S( gamma sub n + )^=^2.1363$, $S( gamma sub n+1 - ) ^=^ 2.1851$, $S( gamma sub n+1 + )^=^ -1.1851$. .P Exactly 614 exceptions to Rosser's rule were found. They fall into 45 types, each of which had occurred in the main computations. The longest exception had length 6, and the longest excess block also had length 6. On the other hand, a new phenomenon was observed in 21 of the 22 data sets, namely that of 2 exceptions to Rosser's rule being adjacent to each other. Thus for example, the zero pattern 22000022 near the special point $t sub 0$ for set A corresponds to an exception of type $2L22$ followed immediately by an exception of type $2R22$. This phenomenon has been observed only in the 21 cases exhibited in Table\ 3.1.2. 
.P The basic conclusion to be drawn from the computations of this section is that the idea of looking for special points when the zeta function behaves in unusual ways is sound, and does produce interesting results. It also shows that investigating only a random selection of about $10 sup 8$ out of the first $10 sup 20$ zeros misses some of the most intriguing places. .H 2 "Diophantine approximation algorithms and special points" The Riemann-Siegel formula (Eq.\ (4.1.2)), as well as other ``approximate functional equations'' show that the size of $zeta (1/2 +it)$ is determined by the size of the sum of an initial segment of the divergent Dirichlet series .DS 2 .EQ (3.2.1) sum from n=1 to inf ~ n sup {-1/2 -it} ^. .EN .DE One can also hope that the size of this sum is determined largely by the size of a partial Euler product, .DS 2 .EQ (3.2.2) P sub X (t) ~=~ prod from {p^<=^X} ~ (1-p sup {-1/2 -it} ) sup -1 ^. .EN .DE The basic strategy for finding large values of $zeta (1/2 +it )$ is to find $t$ such that $|P sub X (t)|$ is large, and if it is, compute $zeta (1/2 +it )$ as a check. (In practice, it has turned out to be helpful to first check that $| P sub Y (t)|$ is large for some $Y^>^X$. This eliminated many candidate values of $t$.) There is no guarantee that this approach will succeed, but it appears to work very well. .P To find values of $t$ that make $|P sub X (t)|$ large, we search for values of $t$ such that each of the $p sup it$ is close to 1, as that makes each term in the product maximal. Thus we need to find a $t$ for which there exist integers $m sub 1 ,^dd ,^m sub n$ such that each of $t^log^p sub k ^-^2 pi^m sub k$ is small, $1^<=^k^<=^n$, where $n= pi (X)$ and $p sub 1 ,^p sub 2 ,^dd ,^p sub n$ are the primes $<=^X$. This is an instance of a homogeneous simultaneous diophantine approximation problem. 
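.P
As a hedged illustration of the quantities involved (this is not the code used for the computations), the sketch below evaluates the partial Euler product (3.2.2) and the residues of $t^log^p$ modulo $2 pi$ for a small set of primes. The helper names primes_up_to, euler_product and phases are hypothetical. Since $|1^-^p sup {-1/2 -it} |^>=^1^-^p sup -1/2$, the modulus of the product never exceeds its value at $t=0$, and it approaches that value when all the residues are close to 0.

```python
import cmath
import math


def primes_up_to(x):
    """Primes <= x by trial division (adequate for small x)."""
    ps = []
    for n in range(2, x + 1):
        if all(n % p for p in ps if p * p <= n):
            ps.append(n)
    return ps


def euler_product(t, primes):
    """Partial Euler product P_X(t) of (3.2.2) over the given primes."""
    prod = 1 + 0j
    for p in primes:
        prod /= 1 - p ** -0.5 * cmath.exp(-1j * t * math.log(p))
    return prod


def phases(t, primes):
    """t*log(p) reduced modulo 2*pi to the interval [-pi, pi]."""
    return [math.remainder(t * math.log(p), 2 * math.pi) for p in primes]


ps = primes_up_to(30)
peak = abs(euler_product(0.0, ps))       # the largest possible modulus
sample = abs(euler_product(1234.5, ps))  # an arbitrary t, for comparison
print(peak, sample, max(abs(e) for e in phases(1234.5, ps)))
```

A search procedure would look for values of $t$ at which the residues returned by phases are simultaneously small, with more weight given to the small primes.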
We solve it using the Lova\*'sz lattice basis reduction algorithm [LLL], which has now become the basic tool in solving a variety of diophantine approximation problems in high dimensions. Given a basis for a lattice in which the vectors have integer coordinates, this algorithm produces another basis of fairly short vectors. While the new \f2reduced\f1 basis is not guaranteed to contain the shortest vector in the lattice, the algorithm has polynomial running time and variants of it are quite efficient in practice. The papers [LO2,\|OtR] contain some examples of the applications of this algorithm. .P The lattices to which the Lova\*'sz algorithm was applied have as their basis the rows of the following $(n+1) times (n+1)$ matrix: .DS 2 .EQ (3.2.3) left ( matrix { ccol {[ alpha sub 1 ^2 sup m-r ^log^p sub 1 ]~~~ above [2 pi ^alpha sub 1 ^2 sup m ]~~~ above 0 sub ""~~~ above 3dot ~~~ above 0 sub ""~~~} ccol {[ alpha sub 2 ^2 sup m-r ^log^p sub 2 ]~~~ above 0 sub "" ~~~ above [2 pi ^alpha sub 2 ^2 sup m ] ~~~ above "" ~~~ above 0 sub "" ~~~} ccol {^...~~~ above ^...~~~ above ^...~~~ above "" above ^...~~~} ccol {[ alpha sub n^2 sup m-r ^log^p sub n ]~~~ above 0 sub ""~~~ above 0 sub ""~~~ above "" above [2 pi ^alpha sub n ^2 sup m ]~~~} ccol {1 sub "" above 0 sub "" above 0 sub "" above "" above 0 sub "" } } right ) ^, .EN .DE where $alpha sub k ^=^ p sub k sup -1/4$, and $m^>^r^>^0$ are integers. A typical vector in the reduced basis is then of the form .DS 2 .EQ (3.2.4) (M [ alpha sub 1 ^2 sup m-r ^log ^p sub 1 ] ^-^ m sub 1 [ 2 pi ^alpha sub 1 ^2 sup m ] ,^dd ,^M [ alpha sub n ^2 sup m-r ^log ^p sub n ] ^-^ m sub n [ 2 pi ^alpha sub n^2 sup m ] ,~M) ^,~~~"\0\0\0" .EN .DE where $M , ^m sub 1 ,^dd ,^m sub n$ are integers. For this vector to be relatively short, $M$ and each of .DS 2 .EQ (3.2.5) M [ alpha sub k ^2 sup m-r ^log^p sub k ] ~-~ m sub k [ 2 pi ^alpha sub k ^2 sup m ] .EN .DE have to be relatively small. 
For the difference in (3.2.5) not to be too large, .DS 2 .EQ (3.2.6) M2 sup -r ~log^p sub k ~-~ 2 pi ^m sub k .EN .DE must be small, so that $t^=^M 2 sup -r ,^m sub 1 ,^dd ,^m sub n$ gives a solution to our basic problem. .P The function of the $alpha sub k$ in the definition (3.2.3) of the lattice basis is to take advantage of the fact that in trying to make $P sub X (t)$ large, it is more important that the $p=2$ term be large than that the $p=79$ term be large, say. If .DS 2 .EQ (3.2.7) t^log^p sub k ~-~ 2 pi ^m sub k ~=~ epsilon sub k ^, .EN .DE and the $epsilon sub k$ are small, then .DS 2 .EQ (3.2.8) log ^|P sub X (t) | ~-~ log ^P sub X (0) ~approx~ -^1 over 2 ~sum from {p^<=^X} ~ epsilon sub k sup 2 ^p sub k sup -1/2 ^, .EN .DE and so we really wish to minimize $sum^epsilon sub k sup 2 ^p sub k sup -1/2$. Since the Lova\*'sz algorithm attempts to minimize the Euclidean norm of vectors, the definition of the $alpha sub k$ induces it to produce the desired result. .P The implementation of the Lova\*'sz algorithm that was used in the computations of this section was essentially the same as that of [LO2,\|OtR], and will not be described here. Just like those implementations, it computed the Gram-Schmidt factors in floating point approximations, and not in exact rational arithmetic, in order to make the computations practical. For each initial basis, several iterations were performed; after reducing a given basis, the rows of the reduced basis were permuted, and the Lova\*'sz algorithm was applied to that basis. This had roughly the same effect as the procedure followed in [LO2], in which several permutations of the initial basis were reduced separately, in that additional reductions gave sometimes better and sometimes worse results. .P As in [LO2,\|OtR], the lattice basis reduction algorithm was implemented using Brent's MP multiple precision package [Br4].
The lattice basis of the form (3.2.3) to which it was applied usually had $40^<=^n^<=^85$, $70^<=^m^<=^75$, and $11^<=^ r^<=^16$, and usually about 6 successive reductions were performed. All the values of $t$ from all the reductions (several thousand values in total) were collected and used to compute $|P sub Y (t)|$ with $Y$ on the order of $p sub 95 ^=^ 499$. Those $t$ for which $|P sub Y (t)|$ was largest (in a given range of values of $t$) were then used for the computations described in Section\ 3.1. .H 2 "Possible extensions" One possible way to obtain even better values of $t$ is to speed up the implementation of the Lova\*'sz algorithm. The Brent MP package [Br4] was written to be portable and is not very efficient, and on a machine like the Cray\ \%X-MP is about 10 times slower than a program customized for this machine could be. Also, there are some very nice methods for speeding up the Lova\*'sz algorithm itself that have been developed by Radziszowski and Kreher [RK]. All these improvements could be used to reduce lattices of larger dimensions or reduce more permutations of a given basis. Another approach might be to develop better lattice basis reduction algorithms. Several approaches are available, such as those of Schnorr [Sch1,\|Sch2], but apparently none of them have been implemented yet. Any one of those approaches could also be combined with simpler tricks, such as that of trying to maximize a product like that of (3.2.2), but where some of the large primes are replaced by slightly larger primes. .P All of the above approaches have major limitations. Logarithms of primes are rationally independent, and ought to behave with respect to multidimensional diophantine approximation properties like independent random variables. 
This means that given any fixed subset $S$ of them, values of $t$ for which all the $t^log^p$ for $p^member^S$ are small $roman modulo^2 pi$ are likely to be far apart, and if $S$ is large, the smallest value of $t$ of this kind is likely to be large. Therefore to find values of $t$ for which $zeta (1/2 +it)$ is very large, we would probably need algorithms that could find vectors in extremely high dimensional lattices that are only slightly shorter than usual, as opposed to the method that has been used, which finds very short vectors in relatively low dimensions. It is very doubtful that any of the approaches suggested above could yield such algorithms. .P Computations with some of the values of $t$ that were found during the main computations of Section\ 2 and for which $zeta (1/2 +it )$ is large confirm the suggestion above that such large values arise typically from unpredictable interactions of many large primes and not from an almost perfect lining up of a small set of initial primes. Therefore further searches for values of $t$ with $zeta (1/2 + it )$ large by means of algorithms known or foreseeable today might produce additional interesting phenomena, but are not likely to find all the large values. .P Simultaneous diophantine approximation algorithms could also be applied to find other values of $t$ for which $zeta (1/2+ it )$ is unusual. For example, the values of $t$ in Table\ 3.1.1 all lie close to values of $u$ for which $S(u)$ is large, but that is a by-product of having a large gap between zeros in that region. One could also try to search directly for values of $t$ for which $S(t)$ is large. There are various formulas for $S(t)$, such as those of Selberg (Theorem\ 14.21 of [Tit2]) or Goldston [Go2] (see Section\ 2.6). The main term in Selberg's formula suggests that to make $S(t)$ large, one ought to find $t$ such that .DS 2 .EQ (3.3.1) sum from {p^<=^X} ~ roman Arg ( 1^-^ p sup {- 1/2 -it} ) .EN .DE is large in absolute value.
This task can be formulated easily as a diophantine approximation problem, but in order to obtain large values, it appears that one would have to deal with very large $X$, which would tend to produce impracticably large values of $t$. There are two culprits here. One is that the contribution of the sum in (3.3.1) to $S(t)$ is divided by $pi$. The other one is that the error term in Selberg's formula is fairly large compared to the main term in ranges of $t$ that are of interest. (This is to be expected, since $|S(t)| ^<^2.9$ for all values that have been computed, while the remainder terms in Selberg's formula have to produce the jumps by 1 of $S(t)$ at zeros, since the main term is continuous.) .P The final conclusion to be drawn from the above discussion is that searches for special values of $zeta (1/2 +it)$ do produce interesting results and can be improved somewhat, but there is no method in sight that would produce all the points of interest. .H 1 "Algorithms and their implementation" .HU "4.0\0 Introduction" The main result of [OS], namely Theorem\ 5.1, can be reformulated for the case of computations of $zeta (1/2 +it)$ as follows: .in +3n .P 0 .I For any $a^member^[0,^1/2]$ and any positive constants $delta$ and $c sub 1$, there is an effectively computable constant $c sub 2 ^=^ c sub 2 ( delta ,^c sub 1 ,^a)$ and an algorithm that for every $T^>^0$ will perform $^<=^c sub 2 ^T sup {1/2 + delta}$ operations on numbers of $<=^c sub 2 ^log^T$ bits using $^<=^c sub 2 ^T sup {a+ delta}$ bits of storage and will then be capable of computing any value $Z(t)$ for $T^<=^t^<=^T^+^T sup a$ to within $+- ^T sup {-c sub 1}$ in $<=^c sub 2 ^T sup delta$ operations using the precomputed values. .R .in 0 .P 0 This result is completely rigorous, but implementing it as it is described in [OS] presents difficulties because of the need for very high precision and large storage. 
This section shows a modified version of the algorithm which is practical, but which does sacrifice some of the rigor of the basic result to achieve speed. Many of the choices that were made in the implementation were forced or at least suggested by the hardware and software that was used, and would have been made differently on another machine. .P All the main computations were carried out on a Cray \%X-MP supercomputer with 2 processors and 4 million words of main memory. Although occasionally both processors were used, there was no true parallel processing involved, as the programs did not interact with each other. The Cray computers have \%64-bit words, with \%48-bit mantissas (including the sign bit), which give slightly over 14 decimal digits of precision in the standard single precision $(sp)$ floating point numbers, and slightly over 28 decimal digits in double precision $(dp)$. (See [Od2] for a more extended discussion of this issue.) One of the crucial parts of the algorithm, as will be shown later, involves computing $exp (it^log^n)$ for $n$ ranging up to about $t sup 1/2$. Since $t^approx^1.5 times 10 sup 19$ near the $10 sup 20$-th zero, $t^log^n$ is on the order of $10 sup 20$, and so if we do the computations in $dp$, then after reducing modulo $2 pi$ we are left with only about 8 decimal digits of accuracy, and this is also true for values of $exp (it^log^n)$ that we obtain after exponentiating. This is only barely acceptable, and accounts for most of the lack of rigor in the computations. Attempting this computation in $sp$ would produce a totally meaningless answer. On the other hand, the Cray is designed for $sp$ computations that vectorize. All $dp$ computations are done in software, and although many of them are vectorized by the latest Cray compilers, it is still the case that $dp$ arithmetic operations are often on the order of 100 times slower than $sp$ ones. 
Therefore even though $dp$ computations were by themselves only barely accurate enough, it was necessary to do as much computing as possible in $sp$ to obtain high speed. To achieve this, some hybrid methods described in sections\ 4.1 and 4.2 were used. .P The problems outlined above of getting sufficient accuracy were due not to the nature of the new algorithm but to the large height at which the computations were undertaken. Implementation of any of the older algorithms (such as that of the Riemann-Siegel formula discussed in Section\ 4.1) would have had to cope with the same difficulties. (No matter which algorithm was used, supercomputers like the Cray would be essential in practice, since less powerful machines typically have only 32-bit words, which would necessitate using multiple precision packages, which are prohibitively slow.) The new algorithm does introduce some additional sources of errors, however, which would make rigorous error analysis harder than it would be for the older methods even if higher precision computations were employed. .P The present implementation applies only to computation of the zeta function on the critical line. The algorithm of [OS] can also be used to compute the zeta function on other lines, and this has applications to problems such as that of computing $pi (x)$ [LO4], but no attempt was made to write programs to implement such applications. The method of [OS] also applies to the computation of Dirichlet $L$-functions and related functions. Only very minor modifications to the present implementation would be needed to compute Dirichlet $L$-functions, and this may be done in the future. .P The main computations were all carried out on a Cray \%X-MP supercomputer running the Unicos\ 2.0 operating system, with some of the final statistical computations done under Unicos\ 3.0. The language of the main computations was Fortran, with a number of different compilers being used.
Various UNIX\*(Tm tools, such as the Awk programming language [AKW], were utilized. Many of the statistical studies of zeros were carried out on a DEC\ VAX 8550 computer using Fortran, Awk, or (especially) the S statistical programming language [BC]. S was also used to produce all of the graphs in this paper. .H 2 "Zero-locating program" The program for locating zeros is based on the Riemann-Siegel formula [Ed, Gab, Iv, Sie1, Tit2], which has been the basic tool for all zeta function computations at large height during the last 60 years. This formula says that if .DS 2 .EQ (4.1.1) tau ~=~ t/(2 pi ) ,~~ k sub 1 ~=~ left floor tau sup 1/2 right floor , ~~ z ^=^ 2 ( tau sup 1/2 ^-^ k sub 1 ) ^-^ 1 ^, .EN .DE then for any $m^>=^0$, .DS 3 .EQ Z(t) ~=~ mark 2 ~ sum from k=1 to {k sub 1} ~ k sup -1/2 ~cos ( t^log^k ^-^ theta (t)) .EN .sp .5 .EQ (4.1.2) ~~ .EN .sp .5 .EQ lineup +~ (-1) sup {k sub 1 +1} ^tau sup -1/4 ~ sum from j=0 to m ~ PHI sub j (z) (-1) sup j ^tau sup -j/2 ~+~ R sub m ( tau ) ^, .EN .DE where the $PHI sub j (z)$ are certain entire functions that can be expressed in terms of derivatives of .DS 2 .EQ PHI sub 0 (z) ~=~ {cos "{" pi ( 4z sup 2 ^+^3)/8 "}"} over {cos ( pi z )} ^, .EN .DE and .DS 2 .EQ (4.1.3) R sub m ( tau ) ~=~ O( tau sup {-(2m+3)/4} ) ~~~roman as ~~~ tau ^->^inf ^. .EN .DE .P Gabcke [Gab] has obtained essentially optimal bounds for the remainder terms $R sub m ( tau )$, and the one used in the new computations was .DS 2 .EQ (4.1.4) | R sub 1 ( tau ) | ~<=~ 0.053 ^t sup -5/4 ~~~roman for ~~~ t^>=^ 200 ^. .EN .DE The asymptotic expansion terms $PHI sub 0 (z)$ and $PHI sub 1 (z)$ were computed using their Taylor series expansions [CR,\|Gab]. .P The main difficulty in computing $Z(t)$ by means of the Riemann-Siegel formula is in the evaluation of the cosine sum in (4.1.2). (For $t$ near the $10 sup 20$-th zero, $k sub 1 ^approx^1.5 times 10 sup 9$.) 
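.P
For orientation, the formula (4.1.2) with $m=0$ can be evaluated term by term as in the sketch below. This is only an illustration in ordinary double precision for moderate $t$, not the method used for the computations: the function names are hypothetical, $theta (t)$ is taken from the first few terms of its standard asymptotic expansion (an assumption of this sketch), and the remainder $R sub 0 ( tau )$ is simply dropped. At heights near the $10 sup 20$-th zero neither the precision nor the speed of such a direct evaluation would be adequate.

```python
import math


def theta(t):
    """Riemann-Siegel theta function, leading terms of its asymptotic expansion."""
    return ((t / 2) * math.log(t / (2 * math.pi)) - t / 2 - math.pi / 8
            + 1 / (48 * t) + 7 / (5760 * t ** 3))


def phi0(z):
    """Phi_0(z) = cos(pi (4 z^2 + 3) / 8) / cos(pi z).

    The singularities at z = +-1/2 are removable, but this naive quotient
    is numerically unreliable very close to them.
    """
    return math.cos(math.pi * (4 * z * z + 3) / 8) / math.cos(math.pi * z)


def Z(t):
    """Z(t) from (4.1.2) with m = 0 and the remainder term omitted."""
    tau = t / (2 * math.pi)
    root = math.sqrt(tau)
    k1 = math.floor(root)
    z = 2 * (root - k1) - 1
    th = theta(t)
    # Main cosine sum: 2 * sum_{k <= k1} k^(-1/2) cos(t log k - theta(t)).
    main = 2 * sum(math.cos(t * math.log(k) - th) / math.sqrt(k)
                   for k in range(1, k1 + 1))
    # Leading correction term of the asymptotic expansion.
    return main + (-1) ** (k1 + 1) * tau ** -0.25 * phi0(z)


# Z(t) changes sign between t = 12 and t = 18, and again between 18 and 22,
# bracketing the first two zeros of the zeta function.
print(Z(12.0), Z(18.0), Z(22.0))
```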
In the new implementation it was computed as the sum of two terms, .DS 2 .EQ (4.1.5) Z sub 1 (t) ~=~ 2 ~ sum from k=1 to {k sub 0 -1} ~ k sup -1/2 ^cos ( t^log^k^-^ theta (t)) ^, .EN .DE and .DS 2 .EQ (4.1.6) roman Re ~e sup {-i theta (t)} ~ F(t) ^, .EN .DE where .DS 2 .EQ (4.1.7) F(t) ~=~ F(k sub 0 ,^k sub 1 ;^t) ~=~ sum from {k=k sub 0} to {k sub 1} ~2k sup -1/2 ^exp ( it ^log^k ) ^. .EN .DE The advantage of the new algorithm over the straightforward term-by-term evaluation of the Riemann-Siegel formula is in the method of evaluating $F(t)$, which is an adaptation of the method presented in [OS], and is described in detail in sections\ 4.2 and 4.3. We will now describe the computations of $theta (t)$, of $Z sub 1 (t)$, and of the zero-locating procedure. .P One could take $k sub 0 =1$, in which case $Z sub 1 (t) =0$ identically, but for technical reasons having to do with the speed of the algorithm for computing $F(t)$ it was advantageous not to do this, and in practice one had $100^<=^k sub 0 ^<=^500$. (See Table\ 4.4.1 for some values.) The method used to compute $Z sub 1 (t)$ was essentially the same as that used in [Od2] for computing the entire cosine sum in the Riemann-Siegel formula. The argument $t$ was always maintained as a $dp$ variable. Another $dp$ variable, $t sub 0$, was also maintained, which normally had the property that $| t- t sub 0 | ^<=^ 10$. Three arrays, $d sub n ,^q sub n ,^u sub n$, $1^<=^n^<=^k sub 0 -1$, were also used; $d sub n$ was the $dp$ value of $log^n$, $q sub n$ was the value of $2n sup -1/2$, computed in $dp$ but stored in $sp$, and $u sub n$ was the value of $t sub 0 ^log^n$ reduced modulo $2 pi$, where the computation was again done in $dp$ but the stored value was in $sp$. To compute $Z sub 1 (t)$ for a new value of $t$, $t$ was compared to $t sub 0$. If $|t - t sub 0 | ^>^10$, $t sub 0$ was set to $t$, and the $u sub n$ were recomputed.
At that point (and also if $|t- t sub 0 |^<=^ 10$ was satisfied initially) $delta$ was defined as the $sp$ value of $t-t sub 0$, $t sub 1$ as the $dp$ value of $t sub 0 + delta$, $theta (t sub 1 )$ was computed in $dp$, reduced $roman mod~2 pi$, and converted to an $sp$ variable $v$. Finally, $Z sub 1 (t)$ was computed as the sum (in $sp$) of .DS 2 .EQ w sub n ^=^ q sub n ^cos ( delta e sub n + u sub n -v ) ^, ~~~ 1^<=^n^<=^k sub 0 -1^, .EN .DE where $e sub n$ is the $sp$ value of $d sub n$ (obtained by truncation). .P For the computation of $theta (t)$, another $dp$ variable $t tilde sub 0$ was maintained together with the $dp$ value of $theta ( t tilde sub 0 )$ and with $dp$ or $sp$ (depending on order) values of derivatives of $theta ( t)$ at $t tilde sub 0$. When $|t- t tilde sub 0 |^<=^50$ was satisfied, $theta (t)$ was computed from the stored values using its Taylor series expansion around $t tilde sub 0$, using partially $dp$ and partially $sp$ arithmetic. When $| t- t tilde sub 0 | ^>^ 50$, $t tilde sub 0$ was set to $t$ and $theta (t)$ and its derivatives were computed in $dp$ (or $sp$ for higher derivatives) using Stirling's formula. The reason for this involved procedure was to avoid using the Cray $dp$ logarithm function, which was extremely slow when the program was being written. Later, a new version of the $dp$ logarithm routine was installed in the system libraries which is about $4$ times faster than the old one, so that this procedure does not gain very much. However, this procedure was retained, both because it was still faster, and because of the considerations of accuracy and reliability of the computational results that are described in Section 4.5. .P The procedure for locating zeros was the standard one of finding Gram blocks and searching for the expected number of changes of $Z(t)$ in them. When a violation of Rosser's rule was encountered, the program searched neighboring Gram blocks. 
Once all the zeros were separated, they were located to a nominal accuracy (i.e., disregarding any inaccuracy in the computation) of $+- ^2 times 10 sup -8$ by the Brent combination [Br1] of linear and quadratic interpolation. The sophisticated zero-locating strategies of [LRW1,\|LRW2] were not employed, and about 8.5 evaluations of $Z(t)$ were used on average to compute each zero. (An additional 1 evaluation of $Z(t)$ per zero was performed to determine the value of $Z(t)$ halfway between zeros.) .H 2 "Odlyzko-Sch$bold {o dotdot} bold {nhage~ algorithm}$" The function $F(t) ^=^ F(k sub 0 ,^k sub 1 ;^t)$ is computed in two stages. In the first, precomputation stage, which accounts for most of the computing time, $F(t)$ is computed at a uniform grid of points .DS 2 .EQ (4.2.1) t~=~ T, ~~T^+^delta ,^dd ,~T~+~ (R-1) delta ^. .EN .DE The second stage, described in Section\ 4.3, is very fast, and computes the values of $F(t)$ for $T+A^<=^t^<=^T+(R-1) delta -A$ for a certain constant $A$ from the stored values of $F(T)$, $F(T+ delta ) ,^dd ,^F(T+(R-1) delta )$. This section describes the precomputation phase. It is based on [OS] with only minor modifications, and although it is essentially complete, it is very technical. The description in [OS] does not cover the details of the implementation, but is more conceptual and easier to read, and is therefore likely to be preferable for those interested only in the basic ideas of the algorithm and not in the details. .P Let $r^member^Z sup +$, and define .DS 2 .EQ (4.2.2) R~=~ 2 sup r ^, ~~~~omega ^=^ exp ( 2 pi i / R ) ^. .EN .DE In principle any $R$ for which the Fast Fourier Transform (\f2FFT\f1) can be applied efficiently could be used, but it was convenient to work with powers of 2. The values of $r$ that were used in the main computations were $r^=^17,^19,^23$, and 24. .P For $-R/2 ^<=^h^<^ R/2$, define .DS 2 .EQ (4.2.3) u sub h ~=~ sum from j=0 to R-1 ~ F(T+ j delta ) omega sup -hj ^.
.EN .DE Once the $u sub h$ are computed, the $F(T+ j delta )$ can be obtained from them very fast through the FFT: .DS 2 .EQ (4.2.4) F(T+ j delta ) ~=~ R sup -1 ~ sum from h=-R/2 to R/2-1 ~ u sub h ^omega sup jh ^. .EN .DE This computation takes a negligible amount of time. .P Using the definition (4.1.7) of $F(t)$ in (4.2.3), exchanging the orders of summation, and summing the geometric series that arises, one obtains .DS 2 .EQ (4.2.5) u sub h ~=~ omega sup h ~ sum from {k=k sub 0} to {k sub 1} ~ {a sub k} over {omega sup h ^-^b sub k} ^, .EN .DE where the $beta sub k$ are defined so that $-R/2 ^<=^beta sub k ^<^ R/2$ and .DS 3 .EQ (4.2.6) b sub k mark ~=~ exp ( 2 pi i ^beta sub k /R ) ~=~ exp ( i delta ^log ^k ) ^, .EN .sp .EQ (4.2.7) a sub k lineup ~=~ 2 ^k sup -1/2 ^e sup {iT^log^k} ^(1-e sup {iR delta ^log ^k} ) ^. .EN .DE Write .DS 2 .EQ (4.2.8) f(z) ~=~ sum from {k=k sub 0} to {k sub 1} ~ {a sub k} over {z- b sub k} ^. .EN .DE Then we need to evaluate $f( omega sup h )$ for $-R/2 ^<=^h^<^R/2$. Term-by-term evaluation of the sum in (4.2.8) would require on the order of $k sub 1 ^R$ operations, which would be of the same complexity as evaluating the Riemann-Siegel formula in the standard way at each point $T+j delta$. However, the new algorithm of [OS] leads to much faster evaluation of the $f( omega sup h )$ by means of Taylor series expansions. 
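.P
The identity (4.2.5) can be checked numerically on a toy problem. The Python sketch below is illustrative only and is not part of the programs described in this paper. It assumes that $F(t)$ is the sum over $k sub 0 ^<=^k^<=^k sub 1$ of $2 k sup -1/2 exp (it^log^k)$, which is the form implied by (4.2.6) and (4.2.7); the function name and the small parameter values are hypothetical.
.DS 1
```python
import cmath
import math

def grid_transform_two_ways(k0, k1, T, delta, R):
    """Return u_h for -R/2 <= h < R/2 computed from (4.2.3) and from (4.2.5)."""
    ks = range(k0, k1 + 1)

    def F(t):
        # Assumed form of F(t): sum of 2 k^(-1/2) exp(i t log k) over k0 <= k <= k1.
        return sum(2 * k ** -0.5 * cmath.exp(1j * t * math.log(k)) for k in ks)

    def omega_pow(n):
        # omega^n with omega = exp(2 pi i / R), as in (4.2.2).
        return cmath.exp(2j * math.pi * n / R)

    hs = range(-(R // 2), R // 2)
    samples = [F(T + j * delta) for j in range(R)]
    # (4.2.3): discrete Fourier sum over the grid values.
    direct = [sum(samples[j] * omega_pow(-h * j) for j in range(R)) for h in hs]
    # (4.2.6), (4.2.7): the quantities b_k and a_k.
    b = [cmath.exp(1j * delta * math.log(k)) for k in ks]
    a = [2 * k ** -0.5 * cmath.exp(1j * T * math.log(k))
         * (1 - cmath.exp(1j * R * delta * math.log(k))) for k in ks]
    # (4.2.5): omega^h times the sum of a_k / (omega^h - b_k).
    rational = [omega_pow(h) * sum(ak / (omega_pow(h) - bk) for ak, bk in zip(a, b))
                for h in hs]
    return direct, rational

direct, rational = grid_transform_two_ways(k0=3, k1=10, T=100.0, delta=0.3, R=16)
max_diff = max(abs(x - y) for x, y in zip(direct, rational))
```
.DE
For these toy parameters no $b sub k$ coincides with a power of $omega$, so the two evaluations should agree to within roundoff. The sketch uses term-by-term summation throughout and does not model the Taylor series acceleration.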
Let $langle x rangle$ denote the nearest integer to $x$, let $"||" ^x ^"||" sub {^R}$ denote the ``cyclic distance'' modulo $R$: .DS 2 .EQ "||" ^x ^"||" sub {^R} ~=~ min from m ~ | x^-^mR | ^, .EN .DE and for integers $p,^q$ with $q^>=^0$, $3 sup q ^<=^R/2 +1$, $-R/2^<=^p^<^ R/2$, $| p 3 sup q |^<=^ R/2 -1 ^+^(3 sup q -1 ) /2$, define .DS 2 .EQ (4.2.9) I sub p,q ~=~ left { k^:~ k sub 0 ^<=^k^<=^k sub 1 , ~~ "||"^beta sub k ^-^ p 3 sup q ^"||" sub {^R} ~>=~3 sup q -1 ,~~ "||"^beta sub k ^-^ langle p/3 rangle 3 sup q+1 ^"||" sub {^R} ^<^ 3 sup q+1 -1 right } ^.~~~~"\0\0\0" .EN .DE Then it can be shown easily (\f2cf.\f1 [OS]) that each $k$ belongs to at most 6 different $I sub p,q$ for a fixed $q$. .P Let $Q^=^ left floor log sub 3 ( R/2 +1) right floor$. Then for any $h$, $-R/2^<=^h^<^R /2$, it is easy to see (\f2cf\f1. [OS]) that $"{" k sub 0 ,^k sub 0 +1 ,^dd ,^k sub 1 "}"$ is the disjoint union of the sets $I sub {langle h3 sup -q rangle ,^q}$ for $0^<=^q^<=^Q$. Hence if .DS 2 .EQ (4.2.10) f sub p,q (z) ~=~ sum from {k^member^I sub p,q} ~ {a sub k} over {z^-^b sub k} ^, .EN .DE then for $-R/2^<=^h^<^R/2$, we have .DS 2 .EQ (4.2.11) f ( omega sup h ) ~=~ sum from q=0 to Q ~ f sub {langle h3 sup -q rangle ,^q} ^( omega sup h ) ^. .EN .DE The new algorithm evaluates the functions $f sub p,q (z)$ at points $z= omega sup h$ with $langle h3 sup -q rangle ^=^p$. For $q^<=^Q sub 1$, ordinary evaluation of the sum in (4.2.10) is used. For $Q sub 1 ^<^ q^<=^Q$, the function $f sub p,q (z)$ is expanded in its Taylor series around the point .DS 2 .EQ (4.2.12) z sub p,q ~=~ exp ( 2 pi i p 3 sup q /R ) ^. .EN .DE It is easy to show (\f2cf\f1. [OS]) that these Taylor series converge fast, so not too many terms in them have to be kept. Finally, these Taylor series are used to evaluate the $f sub p,q ( omega sup h )$. .P The threshold $Q sub 1$ was taken to be 3 in all the computations after some experiments showed that it was reasonably close to the optimal choice. 
The Taylor series method is inefficient when $|I sub p,q |$ is small, since its overhead is large. A slight improvement in the program could be obtained by selecting which method to use based on $| I sub p,q |$ and not on $q$ alone. .P The main computation proceeds in stages indexed by integers $m$, $-R/2 ^<=^m^<=^R/2-1$. In stage $m$, only $k^member^S sub m$ are considered, where .DS 2 .EQ (4.2.13) S sub m ~=~ "{" k^:~ k sub 0 ^<=^k^<=^k sub 1 ,~~ beta sub k^member^[m,^m+1 )"}" ^. .EN .DE For any $p$ and $q$, if $k^member^I sub p,q$ for some $k^member^S sub m$, then $S sub m ^!subset^I sub p,q$, which makes bookkeeping for the various computations easy. The distribution of the $beta sub k$ is very nonuniform, with almost all the time being spent in the relatively small fraction of stages $m$ for which $|S sub m |$ is large. .P Each stage $m$ is further subdivided into substages corresponding to a partition of $S sub m$ into blocks $S sub m,j$, $1^<=^j^<=^s$, of consecutive $k$'s with $|S sub m,j |^<=^2560$ for all $j$, and $|S sub m,j |^<^ 2560$ being possible only for $j=s$. (For almost all stages $|S sub m | ^<=^2560$, and so $s=1$.) This was done to keep the sizes of the auxiliary arrays small, and also to have their lengths be multiples of 64, the length of Cray vector registers. .P Suppose that .DS 2 .EQ S sub m,j ~=~ "{"k^:~ k sub 2 ^<=^ k^<=^k sub 3 "}" ^. .EN .DE Several auxiliary arrays are defined. The most important and most time consuming to compute is the $d sub k$ array, $k sub 2 ^<=^ k^<=^k sub 3$, with $d sub k$ being an approximation to the $dp$ value of $log^k$. 
The set $S sub m,j$ is partitioned into blocks of 64 consecutive values of $k$ (with the last block possibly being smaller), and if a block consists of $k$'s with $k sub 4 ^<=^k^<=^k sub 5 ^<=^k sub 4 + 63$, then $d sub k sub 5$ is computed using the Cray $dp$ logarithm routine, and the $d sub k$, $k sub 4^<=^k^<^k sub 5$ are then computed from $d sub k sub 5$ using Taylor series expansions. When the program was first written, this involved procedure was about 6 times faster (for computations near zero number $10 sup 20$) than using the Cray $dp$ logarithm routine, which served to cut the running time of the entire rational function evaluation program by over 30%. (As a result, the computation of the $d sub k$ now takes on the order of 10% of the total running time instead of the roughly half that was required by the earliest version of the program which involved the Cray $dp$ logarithm function.) The latest Cray mathematical subroutine libraries have a $dp$ logarithm routine that is about 4 times faster than the old one, and so the procedure described above is only about 1.5 times as fast as using the standard Cray $dp$ logarithm all the time would be. (Much faster variants of this method are possible, as is shown in Section\ 4.6.1.) .P Once the $d sub k$ are computed, they are used to calculate $T^log^k$ $ roman modulo~2 pi $ in $dp$, which is then converted to $sp$ and used to compute $exp (iT^log^k)$ utilizing the Cray cosine and sine routines. The $2k sup -1/2$ factor is also computed in $sp$ arithmetic. Finally the difference $1^-^exp (iR delta ^log ^k )$ is computed in the form .DS 2 .EQ (4.2.14) -2i ~exp left ( i ~1 over 2 ~ R delta ^log ^k right ) ~~ roman sin left ( 1 over 2 ~ R delta ^log ^k right ) ^, .EN .DE where $sp$ arithmetic is used for the trigonometric functions, but $2 sup -1 ^R delta ^log^k$ is computed in $dp$ and reduced modulo $2 pi$ in $dp$, for reasons that will be explained later. 
All these factors are then combined using $sp$ arithmetic to obtain $a sub k$. The $b sub k$ are also computed in $sp$. .P For $q=2$ and 3, ordinary complex $sp$ arithmetic is used to evaluate the $a sub k / ( omega sup h ^-^b sub k )$ for $k^member^S sub m,j$ and these are added to stored variables corresponding to $f( omega sup h )$. For $q^>=^4$, complex $sp$ arithmetic is used to compute the coefficients $a sub k ( z sub p,q ^-^ b sub k ) sup -n-1$ for $0^<=^n^<=^V$ in the Taylor series expansion .DS 2 .EQ (4.2.15) {a sub k} over {z^-^b sub k} ~=~ sum from n=0 to inf ~ a sub k ( z sub p,q ^-^ b sub k ) sup -n-1 ( z sub p,q ^-^ z ) sup n .EN .DE around $z sub p,q$, and these are added to the arrays holding the Taylor series coefficients of $f sub p,q (z)$. The number of terms $V$ depends on $m,^p,^q$, and is chosen so as to make the \%$V$-th computed coefficient about $10 sup -15$ times the size of the \%$0$-th one. Except for $q$ very close to $Q$, $V$ is almost always $<^50$. After all the $S sub m$ have been processed, the Taylor series of the $f sub p,q (z)$ are used to compute the $f sub {langle h3 sup -q rangle ,^q} ( omega sup h )$ in $sp$ arithmetic for $q^>=^4$, and these numbers are then added to the variables corresponding to $f( omega sup h )$. Since the $a sub k$ are only accurate to 9 to 10 decimal digits in the computations near zero number $10 sup 20$, one could take $V$ much smaller for computations at such large heights, say about 2/3 of the present value, without affecting the accuracy of the final results very much. This would speed up the main program by about 15%. This modification was not made in the programs to keep them the same for all heights. .P For $q=0$ and 1, a special procedure is used, since here $b sub k$ and $z sub p,q^=^ omega sup h$ (for $h=p3 sup q$) are very close to each other, and so computing $z sub p,q ^-^b sub k$ in $sp$ would lead to large errors. 
Instead, we use the expansion .DS 2 .EQ (4.2.16) omega sup h ^-^ b sub k ~=~ 2i ^exp ( pi i (h+ beta sub k )/R ) ~roman sin ( pi (h- beta sub k ) /R ) ^. .EN .DE The $pi ( h- beta sub k )/R$ factor is evaluated in $dp$, reduced modulo $2 pi$, and converted to $sp$ before being used to evaluate the sine. If $(h- beta sub k ) /R$ is very small, the definition (4.2.6) of $beta sub k$ shows that $2 sup -1 ^R delta ^log^k$ reduced modulo $2 pi$ cannot be too large, and the ratio of the two sines in (4.2.14) and (4.2.16) is bounded by $R$ in absolute value. The Cray $sin (x)$ routine is very accurate for small $x$, since it actually computes $x( sin (x) /x )$, and so the quotient of the quantities in (4.2.14) and (4.2.16) is evaluated quite accurately. (The computation of $2 sup -1 ^R delta ^log^k$ modulo $2 pi$, which was mentioned above, is done in $dp$ to make sure that the arguments of sine in these computations are accurate.) .P Aside from the $dp$ operations which are not vectorized by the Cray compilers, most of the computations were written so they would be vectorized automatically by the compiler. (No assembly language routines were used.) This is even true of the Taylor series expansions, since those are almost always performed on large sets of $k$'s simultaneously, so the inner loops are written to run on $k$, and not on the index of the Taylor series term being evaluated. (This does require the use of some auxiliary arrays, but since at most 2560 $k$'s are considered in each stage, storage is not a problem.) As will be described in Section 4.6.1, some of the crucial loops in the program are executed at the rate of over 100 million floating point operations per second, which is very fast for Fortran programs, since the cycle time on the Cray\ \%X-MP is 9.5 nanoseconds. .P The above sketch of the implementation of the rational function evaluation algorithm applies directly only for the runs with $r=17$ and 19. 
For $r=23$ and 24, a modified version of the algorithm had to be used because of space restrictions that are discussed at greater length in Section\ 4.4. In the implementation discussed above, a complex array of length $R$ is kept for the values $f( omega sup h )$, $-R/2^<=^h^<^R/2$, as well as arrays for the Taylor series coefficients of the $f sub p,q (z)$ with $q^>^Q sub 1 ^=^3$. In the versions used for $r^>=^23$, the program works on $2 sup r-17$ segments of values of $h$, each of length $2 sup 17 ^=^131072$. If we denote one such segment by .DS 2 .EQ H~=~ "{"h^:~ h sub 0 ^<=^h^<^h sub 0 ^+^ 2 sup 17 "}" .EN .DE $(h sub 0 ^=^-R/2$, $-R/2 ^+^2 sup 17$, etc.), then the main program computes the contribution to $f( omega sup h )$ for $h^member^H$ of all $k$ such that $k^member^I sub p,q$ with some $0^<=^q^<=^8$, $p^=^langle h sup prime 3 sup -q rangle$ for some $h sup prime ^member ^H$, where these contributions are computed as before, namely directly for $q^<=^Q sub 1$, and through Taylor series expansions for $q^>^Q sub 1$. These values are stored in a file. Another file is also created, which contains the contributions to the Taylor series coefficients for $q^>=^9$ of all the .DS 2 .EQ k^member~ union from {m^member^H} ~ S sub m ^. .EN .DE As different $H$'s are processed, the Taylor series contributions for $q^>=^9$ are added, and at the end they are combined with the previously computed contributions of $q^<=^8$ to obtain the values of $f( omega sup h )$. .P The algorithm is quite involved, and its running time depends on a complex combination of various factors. A rough indication of where most of the time is spent is provided by Table\ 4.2.1. It is based on experiments with the algorithm for $r=17$, when it is applied to evaluate the $f( omega sup h )$ for $k sub 0 ^approx^1.5 times 10 sup 9$, $k sub 1 ^=^k sub 0 ^+^10 sup 6$, $T^approx^1.5 times 10 sup 19$, $delta ^=^0.15$. The total running time in this case was 132 seconds. 
The figures in Table\ 4.2.1 should be treated with caution as only a rough indication of where most of the computational effort was spent. .P The basic FFT routines that were used were those of Bailey [Bail]. They were written especially for the Cray-2, where they are both faster and more accurate than the standard Cray routines. On the Cray\ \%X-MP, Bailey's routines are slightly slower than the standard Cray ones. They were selected because of their greater accuracy, although in view of the errors in the rational function evaluation algorithm, the additional errors introduced by the FFT program are negligible. The time needed for the FFT itself was completely negligible, with a complex transform on $2 sup 19$ points taking under 1 second, much less time than it took to read in the data. .P Due to space limitations on the Cray\ \%X-MP, Bailey's routines could be used directly only for $r=17$ and 19. For $r=23$ and 24 it was necessary to perform extensive reformatting operations. Suppose that we wish to take the FFT of $v sub 0 ,^dd ,^v sub M-1$, where $M^=^2 sup g K$, say, and we can only perform FFTs of length $K$ in core. If $w sub 0 ,^dd ,^w sub M-1$ is the Fourier transform of the $v sub j$, then .DS 3 .EQ w sub h mark ~=~ sum from j=0 to M-1 ~ v sub j ~exp ( 2 pi i hj /M ) .EN .sp .EQ (4.2.17) lineup ~=~ sum from s=0 to {2 sup g -1}~exp ( 2 pi i h s/M) ~ sum from m=0 to K-1 ~v sub {2 sup g m+s} ~exp ( 2 pi i mh/K )^. .EN .DE The inner sum above is just the Fourier transform of a sequence of length $K$, and can be handled by the FFT directly. To implement this, one needs to create new data sets consisting of the subsequences $v sub {2 sup g m+s}$, $0^<=^m^<=^K-1$, perform the FFT on them, and then combine them to obtain the $w sub h$. 
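.P
The decomposition (4.2.17) can be illustrated by a short Python sketch. It is a toy model under stated assumptions and not the program that was used: a naive transform of length $K$ stands in for the in-core FFT routine, the data are held in an ordinary list instead of disk files, and the function names are hypothetical.
.DS 1
```python
import cmath

def dft(x):
    """Naive sum_m x[m] exp(2 pi i m h / K), h = 0..K-1; stand-in for an in-core FFT."""
    K = len(x)
    return [sum(x[m] * cmath.exp(2j * cmath.pi * m * h / K) for m in range(K))
            for h in range(K)]

def split_transform(v, g):
    """Transform of length M = 2**g * K assembled from 2**g transforms of length K."""
    M = len(v)
    S = 2 ** g
    K = M // S
    w = [0j] * M
    for s in range(S):
        # Subsequence v_{2^g m + s}, 0 <= m <= K-1, transformed on its own.
        inner = dft(v[s::S])
        for h in range(M):
            # The inner sum is periodic in h with period K; the twiddle
            # factor exp(2 pi i h s / M) combines the pieces as in (4.2.17).
            w[h] += cmath.exp(2j * cmath.pi * h * s / M) * inner[h % K]
    return w

v = [complex(j % 5, (j * j) % 7) for j in range(32)]
w_split = split_transform(v, 2)   # four transforms of length 8
w_full = dft(v)                   # one transform of length 32
max_diff = max(abs(a - b) for a, b in zip(w_split, w_full))
```
.DE
The two results should agree to within roundoff. In this form every output value needs one contribution from each of the $2 sup g$ short transforms, which corresponds to the repeated passes through the data in the out-of-core setting.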
In the case of the computations with $r=23$, for example, Bailey's algorithm is used with $K= 2 sup 19$, so that for each of the $16=2 sup 4$ FFT's, all $2 sup 23$ values $v sub j$ have to be read, and after all the FFT's are done, 16 passes through the data are performed to compute and store the final linear combination given by (4.2.17). This takes on the order of an hour of elapsed time (the exact length depending on the load on the system), although very little computing time. For $r=24$, the total time is about 4 times longer. For large computations, it would be worthwhile to use more efficient procedures, some of which are discussed in Section\ 4.6. Such procedures would have been advantageous even for computations on the scale described here, and the only reason they were not carried out was the additional programming effort that would have been required, and the limited facilities for data storage that were available. .H 2 "Band-limited function interpolation" Section\ 4.2 shows how $F(T)$, $F(T+ delta ) ,^dd$, $F(T+(R-1) delta )$ are computed. In general, though, we need to compute $F(t)$ for various $t^member^(T,^T^+^(R-1) delta )$ that are not predictable a priori. The approach that was presented in [OS] was to compute several of the derivatives $F sup (h) (t)$ at the grid points $t=T$, $T+ delta ,^dd$, $T^+^(R-1) delta$, and then to compute desired values of $F(t)$ by expanding in a Taylor series around the nearest grid point. Since the derivatives $F sup (h) (t)$ are representable as sums similar to that for $F(t)$, they can be computed by a variant of the algorithm described in Section\ 4.2. However, the need to use a fairly dense grid and to store the derivatives $F sup (h) (t)$ at the grid points makes this approach inefficient. 
Another possible approach is that of interpolating values of $Z(t)$ from the values computed on the grid $T,^T+ delta ,^dd ,^T^+^(R-1) delta$, as is done in [Hej5], for example, where $Z(t)$ is approximated as if it were a polynomial through the Lagrange interpolation formula. This method also appears relatively inefficient, and furthermore it is not rigorous. .P The method that is actually used to compute $F(t)$ for $t$ not a grid point is to use band-limited function interpolation techniques. If .DS 2 .EQ (4.3.1) G(t) ~=~ int from {- tau} to tau ~ g(x) e sup ixt ^dx ^, .EN .DE then it has been known for a long time that $G(t)$ is determined by its samples at the points $n pi / tau$, $n^member^Z$, provided only that $G(t)$ satisfies some fairly mild conditions, and that in fact $G(t)$ is then representable by the ``cardinal series'' .DS 2 .EQ (4.3.2) G(t) ~=~ sum from {n=- inf} to inf ~ G left ( {n pi} over tau right ) ~ {roman sin ( tau t - n pi )} over {tau t - n pi} ^. .EN .DE Results of this type have a long history, going back to E. Borel, Hadamard, de\ la\ Valle\*'e Poussin, E.\ T. and J.\ M. Whittaker, and Ferrar in the mathematical literature, and to Nyquist, Kotelnikov, Shannon, and Someya in engineering (see [Hig] for a history), and are the basis for digital sound transmission and storage, for example. Two comprehensive surveys of the literature in this area are those of Butzer et\ al. [BSS] and Jerri [Jer]. .P The cardinal series in (4.3.2) is not suitable for the interpolation of $F(t)$ because, aside from the question of whether the expansion (4.3.2) is valid for $F(t)$, the sum in (4.3.2) converges very slowly. We use instead a formula for $G(t)$ that involves a sum of $G(n pi / beta )$ for some $beta ^>^ tau$, which thus involves more frequent (and thus less efficient) sampling of $G(t)$, but in which the coefficients of $G( n pi / beta )$ decrease rapidly. 
The basic approach appears to be well-known to many analysts and communications engineers, but no published reference for the result we use was found, so a proof is sketched below. (See [BSS,\|Jer] for other possible approaches.) .P Suppose that $G(t)$ satisfies (4.3.1), where $g(x)$ will be assumed for the moment to be in $L sup 2 (- tau ,^tau )$. Take $beta ^>^tau$ and define $g(x) =0$ for $tau ^<^ | x |^<^ beta$, and then extend $g(x)$ to the entire real line by making it periodic with period $2 beta$. Then we have .DS 2 .EQ (4.3.3) g(x) ~=~ sum from n ~ a sub n ~ exp ( 2 pi in x / ( 2 beta )) ^, .EN .DE where .DS 2 .EQ (4.3.4) a sub n ~=~ ( 2 beta ) sup -1 ~ int from {- beta} to beta ~ g(x) ~exp (-2 pi in x / ( 2 beta )) dx ^. .EN .DE Eq.\ (4.3.1) then shows that .DS 2 .EQ (4.3.5) a sub n ~=~ ( 2 beta ) sup -1 ~ G( - n pi / beta ) ^. .EN .DE Next, choose $lambda ,^epsilon ^>^0$ so that .DS 2 .EQ (4.3.6) tau ~<=~ lambda ^-^epsilon ~<~ lambda ^+^epsilon ~<=~ beta ^, .EN .DE and let $H(x)$ be some continuous function with $H(x) =0$ for $|x|^>^epsilon$, and .DS 2 .EQ (4.3.7) int from {- inf} to inf ~ H(x) dx ~=~ 1 ^. .EN .DE Further, let $chi (x)$ be the characteristic function of the interval $[- lambda ,^lambda ]$, and let $u^star^v$ denote the convolution of the functions $u$ and $v$; .DS 2 .EQ (u^star^v) (x) ~=~ int from {- inf} to inf ~ u(y) v(x-y) dy ^. .EN .DE Then .DS 2 .EQ (4.3.8) ( chi ^star^H) (x) ~=~ left { matrix { lcol {1 ^,~~~ above 0^,~~~} lcol { |x| ^<=^lambda - epsilon ^, above |x| ^>=^ lambda + epsilon ^.} } .EN .DE Therefore .DS 2 .EQ (4.3.9) G(t) ~=~ int from {- tau} to tau ~ g(x) e sup ixt dx ~=~ int from {- inf} to inf ~ g(x) e sup ixt ( chi ^star^ H) (x) dx ^. .EN .DE Substituting the Fourier series (4.3.3) into the last expression above and using (4.3.5) yields .DS 2 .EQ (4.3.10) G(t) ~=~ (2 beta ) sup -1 ~sum from n ~ G( -n pi / beta )~int from {- inf} to inf ~ e sup {ixn pi / beta + ixt} ( chi ^star^H) (x) dx ^. 
.EN .DE The integral above is just the Fourier transform of $chi ^star ^H$ evaluated at $n pi / beta - t$, which is the product of the Fourier transforms of $chi$ and $H$. If $h(t)$ is the Fourier transform of $H(x)$, .DS 2 .EQ (4.3.11) h(t) ~=~ int from {- inf} to inf ~ H(x) e sup ixt ^dx ^, .EN .DE then we obtain .DS 2 .EQ (4.3.12) G(t) ~=~ lambda over beta ~ sum from n ~ G( n pi / beta ) ~ {roman sin ^lambda ( n pi / beta -t)} over {lambda (n pi / beta -t )} ~ h( n pi / beta - t ) ^. .EN .DE The interpolation formula (4.3.12) was derived under the assumption that $g(x)^member^L sup 2 (- tau ,^tau )$, but by taking limits, it is easy to see that this formula holds when $g(x)$ is a finite linear combination of delta functions, as well as in more general settings. .P The formula (4.3.12) can be applied directly with $G(t) ^=^F(t)$ for $tau^=^log^k sub 1$, but since the spectrum of $F(t)$ is actually limited to $[ log ^k sub 0 ,~ log ^k sub 1 ]$, it is more efficient to apply it with .DS 2 .EQ (4.3.13) G(t) ~=~ F(t) e sup {-i alpha t} ^, .EN .DE where .DS 2 .EQ (4.3.14) alpha ~=~ 1 over 2 ^( log ^k sub 1 ~+~ log ^k sub 0 ) ^. .EN .DE Then Eq.\ (4.3.12) yields .DS 2 .EQ (4.3.15) F(t)^=^ lambda over beta ~ sum from n ~F left ( {n pi} over beta right ) e sup {- i alpha ( n pi / beta -t)} ~{sin^lambda (n pi / beta -t)} over { lambda ( n pi / beta -t)}~ h ( n pi / beta -t ) ^,~~~"\0\0" .EN .DE valid for any $beta$ and $lambda$ that satisfy (4.3.6), where we now take .DS 2 .EQ (4.3.16) tau ~=~ 1 over 2 ^( log ^k sub 1 ~-~ log^k sub 0 ) ^. 
.EN .DE We choose .DS 3 .EQ (4.3.17) beta mark ~=~ pi / delta ^, .EN .sp .EQ (4.3.18) lambda lineup ~=~ ( beta + tau ) /2 ^, .EN .sp .EQ (4.3.19) epsilon lineup ~=~ ( beta - tau ) /2 ^, .EN .DE and take .DS 2 .EQ (4.3.20) h(u) ~=~ c over {roman sinh (c)} ~ {roman sinh ( c sup 2 - epsilon sup 2 u sup 2 ) sup 1/2} over {( c sup 2 - epsilon sup 2 u sup 2 ) sup 1/2} ^, .EN .DE where $c$ is a constant that was equal to 30 in most of the computations. A typical set of values is that used for one of the computations of large sets of zeros near zero number $10 sup 20$: .DS 3 .EQ (4.3.21) matrix { rcol { k sub 0 ~=~ above k sub 1 ~=~ above alpha ~=~ above tau ~=~ above delta ~=~ above beta ~=~ above lambda ~=~ above epsilon ~=~ above c ~=~} lcol {450^, above 1,^555,^488,^184^, above 13.637^dd ^, above 7.5279^dd ^, above 0.29^, above 10.833^dd ^, above 9.1804^dd ^, above 1.65258^dd^, above 30 ^.} } .EN .DE Note that the distances between consecutive Gram points are 0.148433..., so there is only about one grid point at which $F(t)$ is evaluated for every two Gram intervals. .P Many different kernels $h(u)$ could have been used for the interpolation. The specific function $h(u)$ of (4.3.20) was suggested by B.\ F. Logan. Logan had discovered a long time ago [Kai] that $h(u)$ is a remarkably good approximation to the principal eigenfunction of the finite Fourier transform, which led to its widespread use in some signal processing applications, as well as in some problems in number theory [MO]. More important for our application are some further optimality properties of $h(u)$ that have been proved by Logan [Log1,\|Log2]. The formula (4.3.15) is evaluated by summing the terms in the series corresponding to $n$ with $n pi / beta$ fairly close to $t$, and neglecting the remainder of the sum. 
If we do not use any special knowledge of the behavior of $F( n pi / beta )$ or of $sin ( lambda ( n pi / beta - t ))$, and we sum the series in (4.3.15) over $n$ with $| n pi / beta - t |^<^ c/ epsilon$, then we basically need to minimize .DS 2 .EQ int from {|u| ^>^c epsilon sup -1} ~ |h(u) u sup -1 | du ^, .EN .DE and Logan's results show that this minimum is achieved by the function defined in (4.3.20), and equals .DS 2 .EQ 2 ~ log ~ {1+ e sup -c} over {1-e sup -c} ^. .EN .DE (For $c=30$, this quantity is $approx^2e sup -30 ^approx^1.9 times 10 sup -13$.) Interpolation using the formula (4.3.15) is performed over approximately the interval $ [ T + c / epsilon , T + (R-1) delta - c / epsilon ] $. .P For the set of parameters listed in (4.3.21), the interpolating sum in (4.3.15) was estimated by explicitly evaluating and adding up approximately 120 terms of the sum. Increasing $delta$ (without changing $k sub 0$) increases the length of the interval over which $F(t)$ can be computed, and therefore increases the number of zeros that can be calculated. This has practically no effect on the running time of the rational function evaluation program (assuming the number of grid points stays the same), but increases the time needed by the zero-locating program, both because of the greater number of zeros to be processed, and because more terms in the interpolation formula (4.3.15) have to be evaluated. Increasing $k sub 0$ allows one to increase $delta$ (and so the number of zeros that can be computed) without changing $epsilon$ (and thus the number of terms that have to be computed in (4.3.15)). Such a change, however, increases the running time of the zero-locating program by increasing the number of terms in the sum $Z sub 1 (t)$. The choice of parameters listed in (4.3.21) was not optimized carefully, and could undoubtedly be modified to obtain a more efficient algorithm. 
.H 2 "Space and time requirements" Table\ 4.4.1 shows the running times of the rational function evaluation program in some of the computations that were carried out. The first column denotes the zero set. Upper case letters refer to the computations near the special points described in Section\ 3 and listed in Tables\ 3.1.1 and 3.1.2. Lowercase letters refer to computations listed in Table\ 4.5.1. These were primarily the large sets that are described in Section\ 2, together with some smaller computations designed to check on the accuracy of the larger ones. (See Section\ 4.5 for a discussion of the reasons for such computations.) The FFT computations were very fast, by comparison, especially for $R^<=^2 sup 19$. (For $R=2 sup 23$ and $2 sup 24$, they took several hours of elapsed time, most of it spent reading and writing disk files to rearrange the data, but only seconds of computing time.) The zero-locating program took slightly under 90 minutes per million zeros when $delta ^approx^0.3$ (and less for smaller $delta$), so that the computation of the roughly $3.3 times 10 sup 7$ zeros in set $n$ took approximately 46 hours (2800 minutes) in addition to the 102 hours for the rational function evaluation program. .P Comparison of entries $g$ and $i$, and also of $k$ and $n$, shows that increasing the number of grid points (and therefore the number of zeros that can be computed) has relatively little effect on the running time of the rational function evaluation program; around the $10 sup 20$-th zero, going from $1.6 times 10 sup 7$ zeros to $3.2 times 10 sup 7$ zeros increases the running time by less than 17%. The reason for not using even larger grids was lack of memory. .P Lack of memory, both core and disk, was the main constraint in planning the program from the very beginning. Computing time was not a major limitation. Around 1000 hours were used for all the computations reported here, which is very substantial. 
At the time these computations were carried out, however, the Cray was relatively lightly utilized, and so although essentially only time that would have been idle otherwise was used, a lot of it was available. As a result, minimizing the running time of the program was not of very high priority. (Various possible improvements are discussed in Section\ 4.6.) .P The Cray X-MP that was available has 2 processors and 4 million words of memory (32Mb, or megabytes). In practice a maximum of 25Mb is available for a single process, and when such a process runs, one of the processors stands idle. The first version of the rational function evaluation program to be implemented had $R^=^2 sup 19$ and maintained all the auxiliary arrays in memory all the time, and as a result required over 15Mb. This program was used to compute the zeros in sets $b,^c$ and $e$ of Table\ 4.5.1, as well as the $N=10 sup 14$ set of Table\ 1.2, but it would not run if there was any other process of over 10Mb that was running. For $R=2 sup 17$, the corresponding program (which was used to compute all the small sets of zeros of Section\ 3) requires only about 5Mb, and so was able to utilize much more of the spare time that was available, since sometimes it would even run when there was one process of $<=^20$Mb running, and all other waiting processes were too large to fit into the remaining memory. (This did not happen in all such cases due to the way the scheduler was working.) Most of the large computations were carried out with the segmented version of the program that is described at the end of Section\ 4.2. For $R=2 sup 23$, it requires 8Mb. (This can be lowered to below 5Mb with some simple rewriting of the program.) The main zero locating program also uses about 8Mb of space. In this case the space requirement can be lowered to below 1Mb very easily, since only small segments of the values of $F(t)$ at grid points are needed at any time. 
The reduction of process size to 8Mb seemed sufficient, however, to take advantage of available time. .P The limitation on core memory was overcome by segmenting the rational function evaluation program. A limitation harder to overcome was the lack of disk storage space. Most of the large computations had $R=2 sup 23$, which meant that $2 sup 23$ complex values of $F(t)$ were being computed and stored, which comes to 128Mb. Moreover, this data had to be reformatted for the FFT application, so that at least for short periods, 256Mb had to be stored. (An in-place FFT program would have eliminated the need for the extra storage, and thus would have led to the computation of twice as many values, but this option was not used since it would have required larger storage of final files. This is discussed further in Section\ 4.6.) Disk space, even for temporary storage, was extremely scarce during these computations, and so this seemed to be close to the limit of what could be easily computed at that time. For $R=2 sup 24$ (set $n$ in Tables\ 4.4.1 and 4.5.1), the peak storage requirement is 512Mb, and was satisfied only because W.\ M. Coughran kindly made available some of his dedicated disk space. .P Some of the ways of overcoming the memory limitations are discussed in Section\ 4.6.1. .P Over 2000Mb of data from these computations (mostly values of $F(t)$ at grid points, but also some listings of zeros, as well as various other data) have been stored on an optical disk, and are available for further studies. While the optical disk storage technology appears to be very reliable, some of the data may have been corrupted in moving it over a local area network, and so may not be usable. (When the possibility of such errors was realized, a system of parity checks was instituted for later data sets, to prevent such problems from arising.) .H 2 "Correctness of computational results" .P The main defect in the computations reported here is that they lack rigorous error bounds. 
This is due to the combination of the height at which the zeta function was computed and the computer hardware that was available. Even if one were to use the standard term-by-term evaluation of the Riemann-Siegel formula, this problem would be severe. The main difficulty there would be in evaluating the first sum in (4.1.2), which is (neglecting the $theta (t)$ term) of the form .DS 2 .EQ (4.5.1) 2 ~ sum from k=1 to {k sub 1} ~ k sup -1/2 ~ roman cos (t^log^k ) ^. .EN .DE Near the $10 sup 20$-th zero, $k sub 1 ^approx^1.5 times 10 sup 9$, $t^approx^1.5 times 10 sup 19$, and so for almost all values of $k$ in the sum, $t^log^k^approx^3 times 10 sup 20$. Therefore, since $dp$ arithmetic on the Cray is performed with about 28 decimal digits of precision, the values of $t^log^k$ that are computed are accurate only to within an error on the order of $+- ^10 sup -8$, and hence the values of $t^log^k$ after reduction modulo $2 pi$ are only likely to be accurate to within $+-^10 sup -8$, and the sum in (4.5.1) is likely to be evaluated with an error of .DS 2 .EQ (4.5.2) E~=~ 2 ^times ^10 sup -8 ~ sum from k=1 to {1.5 times 10 sup 9} ~ epsilon sub k ^k sup -1/2 ^, ~~~~~-1^<=^epsilon sub k ^<=^1 ^, .EN .DE even if the computations of the $k sup -1/2$ and of the sum in (4.5.1) are performed in infinite precision. Given no special knowledge of the $epsilon sub k$, all one can say is that .DS 2 .EQ (4.5.3) |E| ~<=~ 2 ^times ^10 sup -8 ~ sum from k=1 to {1.5 times 10 sup 9} ~ k sup -1/2 ~ approx ~ 1.5 ^times^10 sup -3 ^. .EN .DE In view of the many examples of Lehmer's phenomenon that have been found, where the maximum of $|Z(t)|$ between zeros is substantially less than the bound (4.5.3), one could not be certain even whether some zeros were on the critical line, much less be able to locate them accurately. .P The use of multiprecision arithmetic packages would solve the above roundoff problem, but at a very high price in computing time. 
In the case of the straightforward evaluation of the Riemann-Siegel formula, one can gain a few extra digits of guaranteed accuracy by a method described at the end of Section\ 4.6.1. In the case of the new method that was used for this paper, there are several additional difficulties. The rational function evaluation method computes the $f( omega sup h )$ as sums of a large number of terms, and some of these terms are Taylor series expansions whose coefficients are obtained by adding up numerous other expansions. It would be a very difficult task to obtain good error estimates for all these operations. Further, even if one succeeded, it would be necessary to also bound the errors in the FFT routines and in band-limited function interpolation. The FFT, for example, is known for its good properties in terms of controlling errors, but this applies to only a limited extent when one considers worst-case behavior and has to worry even about roundoff errors in addition. .P The above roundoff problem would not arise if one used machines with larger word sizes than the Cray's 64 bit ones, but such computers are unlikely to become available in the near future. Another solution would be to restrict computations to lower heights. However, it seemed desirable to obtain information from as high up as possible, since the zeta function approaches its asymptotic behavior very slowly. Therefore it was necessary to abandon rigor in the computations. .P While no rigorous error bounds have been obtained, the computational results are thought to be accurate. One reason for thinking this is based on heuristics. The bound (4.5.3) is very conservative in that it is sharp only when essentially all the $epsilon sub k$ in (4.5.2) are close to $+1$ or almost all are close to $-1$. 
In practice, one expects that the $epsilon sub k$ will be quite independent of each other, and if that is so, then even under the assumption that the $epsilon sub k$ take only the extreme values $+-^1$, we find that the rms value of $E$ is only .DS 2 .EQ 2 ^times 10 sup -8 ~ left ( sum from k=1 to {1.5 times 10 sup 9} ~k sup -1 right ) sup 1/2 ~ approx ~ 9 ^times ^10 sup -8 ^. .EN .DE This is the typical error we expect due to cancellation among various roundoff errors. Similarly, one expects substantial cancellation in the rational function evaluation, in the FFT, and in band-limited function interpolation. .P The need to rely on cancellation of roundoff errors introduces another level of uncertainty to the computation. Although it is common and accepted in numerical analysis, statistics, and the physical sciences, it is seldom encountered in pure mathematics. This uncertainty is added to the usual uncertainties about reliability of hardware, the design of hardware floating point units (cf. [Br,\|Od2,\|Schr1,\|Schr2]), the correctness of manufacturers' mathematical subroutines (cf. [SF]), the reliability of compilers, and finally the correctness of the main programs. All these problems occur with reasonably high frequency. To add to the long list of problems that have been found, we mention that with some Cray Fortran compilers, the test program for the Brent MP package [Br4], which computes $pi$, $exp ( pi (163/9) sup 1/2 )$, and $exp ( pi (163) sup 1/2 )$ to about 100 decimal places, produced all the digits of $pi$ correctly, but gave it a negative sign, and produced totally unrecognizable numbers for the remaining two problems. Mathematicians usually insist on a higher standard of rigor than this.
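.P
As a rough numerical illustration of the worst-case bound (4.5.3) and of the rms estimate for $E$, the two sums involved can be approximated in closed form.
The following Python sketch is not part of the computations reported in this paper.
The function names, the use of the first terms of the Euler-Maclaurin expansions, and the default phase error of $10 sup -8$ are assumptions made only for this illustration.
.DS 1
```python
import math

ZETA_HALF = -1.4603545088095868    # zeta(1/2), constant term of the expansion below
EULER_GAMMA = 0.5772156649015329   # Euler's constant

def sum_inv_sqrt(K):
    """Approximate sum_{k=1}^{K} k^(-1/2) by 2*sqrt(K) + zeta(1/2) + 1/(2*sqrt(K))."""
    return 2.0 * math.sqrt(K) + ZETA_HALF + 0.5 / math.sqrt(K)

def harmonic(K):
    """Approximate sum_{k=1}^{K} 1/k by log(K) + gamma + 1/(2K)."""
    return math.log(K) + EULER_GAMMA + 0.5 / K

def roundoff_estimates(K=1.5e9, phase_err=1e-8):
    """Return (worst-case bound, rms estimate) for the error E of (4.5.2).

    Each term contributes at most 2 * phase_err * k^(-1/2).  The worst case
    adds these magnitudes; the rms estimate assumes independent signs.
    """
    scale = 2.0 * phase_err
    worst = scale * sum_inv_sqrt(K)
    rms = scale * math.sqrt(harmonic(K))
    return worst, rms
```
.DE
.P
With the default arguments, the sketch gives a worst-case bound of about $1.5 times 10 sup -3$ and an rms value of about $9 times 10 sup -8$, a gap of more than four orders of magnitude between what can be guaranteed and what is expected under the independence heuristic.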
.P In some mathematical computations it is not vital to have absolute assurance of correctness, since the results are used only to obtain insight into behavior of various functions or systems, and eventually conventional proofs that make no appeal to any computations are constructed (cf.\ [Od0]). In many other cases, though, such as that of the Four Color Theorem [AH], or of some proofs in dynamical systems (cf.\ [Lan]), computational results are an integral part of the proof. There is a school of thought that questions the validity of all such proofs. .P The computations of this paper are in one sense even more questionable than those mentioned above, since they depend not only on the correctness of the hardware and software, but also on quasi-random cancellation of roundoff errors. This is to some extent worse than relying on usual probabilistic algorithms, since in these at least the coin tosses are really independent, so one can talk of rigorous probabilistic results. (This assumes, of course, that one can obtain really random bits, but that is another topic.) In our case there is no true randomness, as the roundoff process is deterministic. Moreover, the zeta function is certainly very nonrandom, and so it is certainly conceivable that the errors in the evaluations of $Z(t)$ might arrange themselves so as to conceal a violation of the RH. It is for this reason that the previous numerical verifications of the RH, such as those of Brent [Br5] and of van\ de\ Lune, te\ Riele, and Winter [LRW2], were done very carefully. For example, those investigators did not even rely on their machines' cosine routines, and were very careful in the analysis of their error terms. As a result, the validity of the verification of the RH for the first $1.5 times 10 sup 9$ zeros by van\ de\ Lune et\ al. 
relies only on the assumptions that the hardware and compilers were reliable, their program was correct (it is available for inspection in [LRW1] and some further modifications are in [WR]), and that their machines' $dp$ cosine routines (used to provide data to the linear interpolation routines that compute cosines) were at least moderately accurate. .P The new programs do not have the same kind of assurance of correctness that those of [LRW2] do. However, in a sense they can be argued to be even more trustworthy. The reason for this is that large parts of the computations were done twice. In general, redoing the same computation on the same machine with the same program provides a check only against certain intermittent errors. In our case, though, the computations were quite different. The grids $T$, $T+ delta$, $T+ 2 delta ,^dd$, at which $F(t)$ was being evaluated were always different. As a result, the rational functions $f(z)$ that were being evaluated at the $R$-th roots of unity (where $R$ was sometimes the same and sometimes different in different computations) were different for the two grids. Therefore the numbers that resulted from the application of the FFT, and were used for band-limited function interpolation, were different. The fact that what was being computed in the two calculations was $Z(t)$ was thus not apparent at all from the numbers being processed, and is a result of the involved analysis of sections\ 4.1 to 4.3. The fact that the two values that were obtained were the same to within the expected error serves as evidence that they are indeed values of $Z(t)$, since it would require a very unusual combination of errors for the two computations to yield the same answers otherwise. This method thus serves as a check not just on roundoff errors, but also on the hardware, compilers, operating systems, and the programs themselves. .P Care was taken to minimize the parts of the computations of $Z(t)$ that were common to the overlapping sets.
The values of $delta$ were always different. The computations of $Z sub 1 (t)$ and of the asymptotic expansion in the Riemann-Siegel formula were harder to make distinct. However, the procedure for computing $theta (t)$ outlined in Section\ 4.1 served to make the values of $theta (t)$ in different computations slightly different, so that in locating zeros, different computations dealt with different values of $t$. .P The above method of computing $Z(t)$ in two different ways that ought to yield the same result only because of deep mathematical results is not novel. In early computations of $pi$, such as that of Shanks and Wrench [SW] (see [BB] for history of this subject and much more efficient modern methods), $pi$ was calculated through two different Machin-like formulas. In the case of [SW], they were .DS 3 .EQ pi mark ~=~ 24 ~roman arctan ^left ( 1 over 8 right ) ~+~ 8~ roman arctan ^left ( 1 over 57 right ) ~+~ 4 ~roman arctan ^left ( 1 over 239 right ) .EN .sp .5 .nr Eq 1 .EQ and ~~ .EN .sp .5 .nr Eq 0 .EQ pi lineup ~=~ 48~ roman arctan ^left ( 1 over 18 right ) ~+~ 32 ~roman arctan ^left ( 1 over 57 right ) ~-~ 20~roman arctan ^left ( 1 over 239 right ) ^. .EN .DE Since again only a very unusual combination of errors could give the same answer by both methods, obtaining the same result provides a very convincing (although nonrigorous) argument in favor of correctness. .P In the case of the new algorithm, the idea of comparing results of overlapping computations did lead to the uncovering of one error in the program that had escaped detection in several earlier runs. After one computation of over $1.6 times 10 sup 7$ zeros near zero number $10 sup 20$ (set $m$ of Table\ 4.5.1, described below), the set below it (set $l$ of Table\ 4.5.1) was computed.
However, a computation of the zeros in the segment overlapping set $m$ showed apparent violations of the RH when the values of $F(t)$ from set $l$ were being used, although no such violations were found using set $m$. Interpolation of values of $F(t)$ from set $m$ to give the values of $F(t)$ on the grid of set $l$ revealed that the computed values of $F(t)$ at the grid points $f sub j ^=^T + j delta$ of set $l$ differed from the (presumably correct) ones derived from set $m$ by $c(-1) sup j$, where $c$ was a certain constant. This immediately suggested that in set $l$, $f(-1)$ was being evaluated incorrectly. An inspection of the code revealed a simple mistake having to do with indexing of the roots of unity $omega sup h$ in the segmented program, and it was easy to correct the data. This bug had not revealed itself before because the unusual combination of having a pole of the rational function $f(z)$ in a certain range close to $-1$, which was required for the error to be activated, had not occurred in the earlier runs. .P How close to each other are the values of $Z(t)$ computed in different runs? Let us consider the neighborhood of the extreme example of Lehmer's phenomenon near $gamma sub n$, $n=10 sup 18 ^+^12,^376,^780$, where the minimum of $Z(t)$ between $gamma sub n$ and $gamma sub n+1$ is only $-5.3 times 10 sup -7$. This is the example that comes closest to violating the RH among all those found in our computations, and the obvious question is whether one can be sure that the RH is indeed satisfied by $rho sub n$ and $rho sub n+1$. This example was found in a computation of $1.7 times 10 sup 7$ zeros, and to confirm the accuracy of the computed values of $Z(t)$, two additional computations were carried out, each of about $1.5 times 10 sup 5$ zeros, and each centered close to $gamma sub n$. 
The starting points of the three computations and the grid spacings $delta$ were distinct in all three computations in order to assure maximal independence in the computed values. When the results of these runs are plotted on the scale of Fig.\ 2.6.1, they are indistinguishable. When one plots $Z(t)$ only in the immediate vicinity of $gamma sub n$ and $gamma sub n+1$, as in Fig.\ 4.5.1, the three graphs are still indistinguishable. It is only when one goes over to the scale of Fig.\ 4.5.2, which shows $Z(t)$ near its minimal value in $( gamma sub n ,^gamma sub n+1 )$, that differences are apparent. This graph was prepared by computing $Z(t)$ from each run at intervals of $10 sup -6$ times the length of a Gram interval (so that Fig.\ 4.5.2 corresponds to about 150 evenly spaced values of $t$) and connecting the points obtained that way by lines. (The function of the lines was to enable the reader to tell which values come from the same data set.) The jagged appearance of the lines is the result of the quantization and roundoff errors. (Note that changing a value of $t^approx^1.7 times 10 sup 17$ by $10 sup -6$ times the length of a Gram interval affects only the last 15 or so bits in the $dp$ representation of $t$.) Given the scale of Fig.\ 4.5.2, the three curves are very close together, and thus provide convincing evidence that the claimed values of $Z(t)$, $gamma sub n$, and $gamma sub n+1$ are indeed highly accurate, and that the RH is not violated in this region. .P Since all indications from preliminary runs were that the new algorithm was highly accurate, and storage of two complete data sets needed to carry out a detailed comparison would have been hard to arrange, it was decided not to recompute all the values of zeros near zero number $10 sup 20$ using different grids, but rather to have different computations cover consecutive ranges with some overlap. Table\ 4.5.1 shows all the different sets of zeros that were computed and that overlapped other ranges.
The three sets of zeros that were referred to above in the discussion of Lehmer's phenomenon, for example, are listed under $f,^g,$ and $h$ in Table\ 4.5.1. (The set $a$ consists of the zeros computed in [Od2] by the standard Riemann-Siegel formula method, and so its values of zeros are very trustworthy.) The four main computations (near $N=10 sup 20$) are those in sets $k,^l,^m$, and $n$, and each one overlaps each of its neighbors in about $10 sup 6$ zeros. The small set $o$ was computed as an additional check, since the smaller grid spacing and fewer grid points were expected to produce more accurate values, and the somewhat different program used was an extra check on programming mistakes. (This was also the motivation behind some of the other computations of small sets of zeros, such as that of $j$. The medium size sets, $b,^c$ and $e$, were computed by the earliest of all versions of the new program.) .P A few large scale statistical comparisons were made of the values of $Z(t)$ produced in different computations. For example, to compare sets $m$ and $o$ of Table\ 4.5.1, the values of $Z(t)$ were computed using data from each set at $5 times 10 sup 6$ points spaced 1/300 apart (approximately 45 per Gram interval) starting at $t^=^1.52024401159207401 times 10 sup 19$. The largest difference (in absolute value) was $1.5 times 10 sup -6$, and the rms one was $5 times 10 sup -8$. .P While the errors made in computing $Z(t)$ are of some interest, the main question is that of accuracy in computing the $gamma sub n$, which depends not only on accuracy of values of $Z(t)$, but on the size of $Z(t)$ and $Z sup prime (t)$ near zeros. Therefore very extensive and fairly careful comparisons were made of the differences in values of $gamma sub n$ computed in different sets. Table\ 4.5.2 summarizes the results of these comparisons. The ``$a$ vs.
$b$'' entry, for example, indicates that the values for the 101,\|053 zeros common to sets $a$ and $b$ differed by no more than $2.5 times 10 sup -9$, and the rms difference was $3.7 times 10 sup -11$. (These are differences in the values of the $gamma sub n$. Should the values of two adjacent zeros in set $l$, for example, each be off by $epsilon$, with one value too small by $epsilon$ and the other too large by $epsilon$, the resulting value of $delta sub n$ would be off by $+- 14 ^ epsilon$.) The maximal differences increase as one looks down the table, as was to be expected. They all stay very small, though, and are the main justification for the claimed validity of the data. .P The rms difference entries in Table\ 4.5.2 should be treated with great caution. One reason is that the zero-locating program was only asked to compute the zeros to a nominal accuracy of $+- 2 times 10 sup -8$ (for zeros near zero number $10 sup 20$; somewhat higher accuracy was specified for lower zeros). Because of the mixture of linear and quadratic interpolation that was used, usually the convergence of the algorithm at the end of a particular search was quadratic, and so accuracy much greater than the specified one was reached in almost all cases. Thus the fact that the rms figures in Table\ 4.5.2 are substantially below the specified accuracy of $2 times 10 sup -8$ is the result of many happy accidents, and not a matter of design. Another reason not to rely on the rms figures is that in many cases they were inflated by the programs that were used for the comparisons. Since there was no reason to expect accuracy better than $+- 10 sup -8$, $sp$ programs were used for most of the computations of Table\ 4.5.2, which led to the loss of the last few bits of precision. (The anomalously large rms value for the ``$l$ vs. $m$'' entry, as compared to the ``$k$ vs. $l$'' and ``$m$ vs.
$n$'' entries, which cover roughly the same number of zeros at about the same height, is almost certainly due to the use of a $sp$ array in a data conversion routine, for example.) Thus in general the rms figures in Table\ 4.5.2 are upper bounds for the rms errors achievable with the new algorithm, but should not be regarded as very accurate estimates. .P One of the sources of errors in the computation of the $gamma sub n$ lies in the method of calculating $theta (t)$. Some of these errors come from the roundoff difficulties associated with handling large numbers within the limited precision that was available. Other errors came from the Taylor series expansion procedure, described in Section\ 4.1, that was used to compute $theta (t)$. Some indication of the errors introduced this way can be obtained from the data produced during the main runs. Because of the inefficient search procedure near exceptions to Rosser's rule (described in Section\ 4.1), usually about ten zeros near each exception were computed twice. The two values were hardly ever identical, since the zero locating program was usually invoked with different arguments. Differences caused by this factor were usually extremely small. Much larger were differences caused by the fact that fairly often the two computations calculated $theta (t)$ for nearby values of $t$ by expanding around different values of $t sub 0$. The largest difference in the computed values of the $gamma sub n$ that was found that is due to this phenomenon is $2.7 times 10 sup -7$ for $n~=~ 10 sup 20 ^+^ 31,^141,^844$. (The second largest was only slightly more than half as large.) This zero is located near two exceptions to Rosser's rule that are close to each other, with the peak value of $Z(t)$ in that region equal to 257.6.
The pattern of zeros (starting at Gram point $g sub n-19$) is 2111110110030101311, and $Z(t)$ is very small in a relatively large neighborhood of $gamma sub n$ ($| Z(t) | ~<~ 0.015$ over approximately the whole Gram interval that contains $gamma sub n$). $Z sup prime ( gamma sub n) ~=~ 0.7$ is quite small, and so the computed location of $gamma sub n$ is quite sensitive to mistakes in the computation of $theta (t)$. .P Other tests to determine the sensitivity of the computed values of the zeros to errors in the computation of $theta (t)$ were also performed. For example, the zeros in set $o$ were computed several times, always using the same rational function evaluation output for the interpolation of band-limited functions, but modifying the strategy of evaluating $theta (t)$ by forcing more frequent recomputations of $t sub 0$, or simply the use of different sets of $t sub 0$'s. The resulting values for the zeros had differences (when compared to the basic computation of the zeros in that set) that were $<= ^10 sup -7$ in absolute value, and $<= ^2 times 10 sup -9$ in rms value. Thus the basic conclusion from these tests is that errors in the computed $theta (t)$ were not a very significant factor. .P Another reason for trusting the computational results of this paper is that the results of the most time-consuming part, the rational function evaluation, are transformed by the FFT before being used for the computation of zeros. This means that any error in this part of the computation affects the computation of all zeros, and so if it is substantial, is likely to give rise to an apparent counterexample to the RH. This is in contrast to the standard methods, such as that of [LRW2], in which a single mistake affects the computation of only one value of $Z(t)$. .P The final, and in some sense very convincing, although very unrigorous argument in favor of the correctness of the computations reported here is that they did not find any counterexamples to the RH. 
This might seem a strange argument. The point of it is that if the RH is true, it is only barely true, in the sense that even very tiny changes in the formulas used to compute the zeta function yield functions that no longer satisfy the RH. Many deliberate as well as accidental experiments were performed in which some of the parameters in the programs were modified very slightly, and they almost invariably ended up giving apparent counterexamples to the RH. For example, in set $V$ of Section\ 3, the minimal $w sub n$ (see Section\ 2.7 for definitions) that was found was $1.43 times 10 sup -4$, so perturbing $Z(t)$ by smaller quantities could not produce apparent counterexamples to the RH. In particular, dropping the asymptotic expansion part of the Riemann-Siegel formula does not produce visible problems in this set, although it does in other ones which have more extreme cases of Lehmer's phenomenon, and in all cases it perturbs the computed values of the zeros. On the other hand, only slightly larger perturbations do produce apparent counterexamples. One also finds counterexamples when one computes .DS 2 .EQ Z(t) ~-~ 2k sup -1/2 ~~roman cos ( t^log^k ^-^theta (t) ) .EN .DE for $k=10 sup 6$. Also, when one computes .DS 2 .EQ Z sub 1 (t) ~+~ roman Re ~ e sup {-i theta (t)} ~ F(t- alpha ) .EN .DE with $alpha = 10 sup -4$ instead of $alpha =0$, apparent counterexamples to the RH appear. (In all these cases, the apparent counterexamples refer to cases where the function being computed has a positive relative minimum or a negative relative maximum.) .H 2 "Possible improvements" At large heights, the new algorithm is much faster than previous methods. The computation of $10 sup 5$ zeros near zero number $10 sup 12$ in [Od2] took approximately 15 hours on a Cray \%X-MP using direct evaluation of the Riemann-Siegel formula.
Set $n$ of Table\ 4.5.1 contains almost $3.3 times 10 sup 7$ zeros near zero number $10 sup 20$, and it was computed in about 150 hours on the same machine. Since the Riemann-Siegel formula involves about $7.5 times 10 sup 3$ times more terms near zero number $10 sup 20$ than near zero number $10 sup 12$, computing all the zeros in set $n$ by the method of [Od2] would have required about .DS 2 .EQ 15~times ~ 330 ~times ~ 7500 ~approx~ 3.7 ~ times ~ 10 sup 7 .EN .DE hours, or more than $2 times 10 sup 5$ times longer than the new algorithm required. .P While the current implementation of the new algorithm is much more efficient than previous algorithms, it is far from optimal. The author's main interest was in demonstrating that the new algorithm was indeed faster than old ones, and in obtaining data about zeros of the zeta function. Since spare computer time was available, saving programming effort was often chosen over efficiency of the program. The following subsections present some of the ways in which the program could be modified so as to run faster or to produce more accurate results. They might be useful in future computations. It seems likely that the ideas in sections\ 4.6.1 and 4.6.2 could be used to increase the speed of the algorithm by another order of magnitude on the Cray \%X-MP. This might make it possible to compute large sets of zeros near zero number $10 sup 22$, for example. .P All the main programs can be parallelized, and one can achieve very high performance this way. (For the rational function evaluation program, there are some examples of similar algorithms, discussed in Section\ 4.6.2, that have been implemented very effectively on parallel computers by Greengard and Gropp [GG], Zhao [Zh], and Zhao and Johnsson [ZJ].) One of the main difficulties in using existing multiprocessors would likely be their relatively low precision. Our discussion will be oriented towards more standard vector processors, however.
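.P
The running-time comparison at the start of this section can be reproduced with a short calculation.
The Python sketch below is only an illustration under stated assumptions: it takes the number of terms in the main Riemann-Siegel sum to be $k sub 1 ~=~ ( t / 2 pi ) sup 1/2$, and it estimates the height $t$ of zero number $N$ from the main terms of the zero-counting formula, $N ~approx~ ( t / 2 pi ) ( log ( t / 2 pi ) ^-^ 1 ) ^+^ 7/8$.
The function names are hypothetical, and the timings are the figures quoted above.
.DS 1
```python
import math

def height_of_zero(n, iterations=50):
    """Approximate height t of zero number n (n large).

    Solves n = x*(log x - 1) + 7/8 for x = t/(2*pi) by Newton's method.
    """
    x = float(n)  # start to the right of the root; the iterates then decrease
    for _ in range(iterations):
        f = x * (math.log(x) - 1.0) + 0.875 - n
        x -= f / math.log(x)  # derivative of x*(log x - 1) is log x
    return 2.0 * math.pi * x

def riemann_siegel_terms(t):
    """Number of terms k_1 = sqrt(t/(2*pi)) in the main sum (assumed form)."""
    return math.sqrt(t / (2.0 * math.pi))

def old_method_hours(zeros_new, n_new, zeros_old=1e5, n_old=1e12, hours_old=15.0):
    """Scale the old running time by the number of zeros and by the term ratio."""
    term_ratio = (riemann_siegel_terms(height_of_zero(n_new))
                  / riemann_siegel_terms(height_of_zero(n_old)))
    return hours_old * (zeros_new / zeros_old) * term_ratio, term_ratio
```
.DE
.P
For $3.3 times 10 sup 7$ zeros near zero number $10 sup 20$, the sketch gives a term ratio of about 7500 and an estimate of about $3.7 times 10 sup 7$ hours for the older method, or roughly $2.5 times 10 sup 5$ times the 150 hours that were actually used.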
.H 3 "Faster and more accurate computations" .P Many parts of the rational function evaluation program can be speeded up. Table\ 4.2.1 shows that the present strategy of computing $dp$ values of $log^k$ takes about 10% of the running time. While this is 1.5 times faster than using the current standard Cray $dp$ routines would be (and 6 times faster than using the old Cray routines), it can be improved very substantially by modifying the program slightly. For example, the following method is about 3.5 times faster (for $k^approx^10 sup 9$) than the one currently used. Group the $k^member^S sub m,j$ into consecutive blocks, say $k sub 4 ^<=^ k^<=^k sub 5$, with $k sub 5 ^-^k sub 4 ^=^0$. For such values of $t$, it would only be necessary to precompute the values of .DS 2 .EQ (4.6.1.1) e sup {i alpha m delta / 1000} ~ {roman sin ( lambda m delta / 1000 )} over {( lambda m delta / 1000 )} ~ h( m delta / 1000 ) .EN .DE for $|m|^=^3/2$. The coefficients $a sub k ( b sub k - z sub m ) sup n$ would again be computed for $0^<=^n^<=^V$, with $V$ of roughly the same order of magnitude as in the present algorithm. Addition of the coefficients $a sub k ( b sub k - z sub m ) sup n$ for $k^member^S sub m$ gives an expansion .DS 2 .EQ (4.6.2.3) sum from {k^member^S sub m} ~ {a sub k} over {z- b sub k} ~=~ sum from n=0 to V ~ A sub n sup (m) ( z - z sub m ) sup -n-1 ~+~ ... .EN .DE that can be used to compute the contribution of the $k^member^S sub m$ at $z= omega sup h$ for $omega sup h$ relatively close to $z sub m$. For $omega sup h$ that are further away, one would combine the contributions of several $S sub l$'s, say $S sub m$, $S sub m+1$, $S sub m+2$, and obtain an expansion around $z sub m+1$.
The crucial point about the Greengard-Rokhlin algorithm is that unlike in the algorithm of [OS] and this paper, this expansion around $z sub m+1$ would not be done by recomputing the contribution of each $k^member^S sub m ^cup^S sub m+1 ^cup^S sub m+2$, but rather by translating the previously computed expansions; e.g., .DS 2 .EQ (4.6.2.4) sum from n=0 to V ~ A sub n sup (m) (z-z sub m ) sup -n-1 ~=~ sum from n=0 to V ~ B sub n sup (m) (z-z sub m+1 ) sup -n-1 ^+^...^, .EN .DE where the $B sub n sup (m)$ are derived from the $A sub n sup (m)$ by linear transformations coming from the binomial expansion, without reference to the $a sub k$ and $b sub k$ for $k^member^S sub m$. The straightforward formulas for the $B sub n sup (m)$ take on the order of $V sup 2$ operations to compute them from the $A sub n sup (m)$. As a result, since there would again be on the order of $log^R$ levels in the hierarchy of expansions, with each level having only 1/3 or 1/2 of the number of expansions in the level below, obtaining all the expansion coefficients would take on the order of .DS 2 .EQ (4.6.2.5) k sub 1 ^V ~+~ V sup 2 R .EN .DE operations. For $k sub 1 ^approx^R$ and $V^approx^log^R$, as would be true for the zeta function algorithm in the absence of memory constraints, and as is the case for the Coulomb or gravitational potential calculations of [GR1] and related papers, this is approximately the same operation count as for the present algorithm (see (4.6.2.1)). However, for present and foreseeable computations of the zeta function at large heights, $R$ is much smaller than $k sub 1$ $(R^approx^1.6 times 10 sup 7$ as compared to $k sub 1 ^approx^1.5 times 10 sup 9$ in the largest computation of this paper), and so the Greengard-Rokhlin algorithm is likely to give much faster rational function evaluation. An order of magnitude improvement is not unlikely in the time needed to evaluate the expansion coefficients. 
At the present time, Taylor series expansions consume about half of the time, so even eliminating them entirely would only double the speed of the program. However, with faster coefficient expansion techniques, one could also use these methods to evaluate contributions to $f( omega sup h )$ of $b sub k$ that are closer to $omega sup h$ than is done at present, and this would give much greater gains in efficiency. Moreover, once these parts of the program were improved, the improvements to other parts (such as that of evaluating $dp$ values of $log^k$) that have been suggested would become much more significant, and all of them together could increase the speed of the entire algorithm by an order of magnitude, especially if assembly language was used as suggested in Section\ 4.6.1. .P The basic idea of translating an expansion that is at the heart of the Greengard-Rokhlin algorithm for pole expansions can also be used for Taylor expansions of the kind that are used in the present algorithm, but it is not as efficient in this setting. Also, since the $B sub n sup (m)$ are derived from the $A sub n sup (m)$ by a convolution, one can carry out this computation in fewer than the $V sup 2$ steps of the straightforward algorithm, by using FFT-based methods. Greengard and Rokhlin [GR2] report some improvements obtained this way, but they are only on the order of 2 or 3 for 2-dimensional problems, and on the order of 8 for 3-dimensional ones. Since the zeta function problem is essentially a 1-dimensional one (with all the poles and points of evaluation on the unit circle), we might expect very small improvements from this source. This might be counteracted to some extent by the fact that the order of expansion $V$ that is used with the zeta function is higher than in [GR2], so the overhead might be comparatively smaller, and noticeable savings might still be obtained. 
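.P
The translation step (4.6.2.4) can be illustrated on a toy example.
The Python sketch below is not the Greengard-Rokhlin implementation and not the program used for this paper.
It assumes the coefficient form $A sub n sup (m) ~=~ sum a sub k ( b sub k - z sub m ) sup n$ stated earlier, and the binomial rule $B sub l sup (m) ~=~ sum from n=0 to l ^C(l,n) ^( z sub m - z sub m+1 ) sup l-n ^A sub n sup (m)$, with $C(l,n)$ the binomial coefficient, which follows from expanding $( b sub k - z sub m+1 ) sup l$.
The pole locations, weights, and function names are hypothetical.
.DS 1
```python
import cmath
from math import comb

def pole_expansion(a, b, center, order):
    """Coefficients A_n = sum_k a_k (b_k - center)^n for 0 <= n <= order."""
    return [sum(ak * (bk - center) ** n for ak, bk in zip(a, b))
            for n in range(order + 1)]

def translate(coeffs, old_center, new_center):
    """Re-expand sum_n A_n (z - old)^(-n-1) in powers of (z - new)^(-1).

    Uses only the old coefficients (about V^2 operations), not the poles.
    """
    d = old_center - new_center
    return [sum(comb(l, n) * d ** (l - n) * coeffs[n] for n in range(l + 1))
            for l in range(len(coeffs))]

def evaluate(coeffs, center, z):
    """Sum of coeffs[n] * (z - center)^(-n-1) over all n."""
    w = 1.0 / (z - center)
    total, power = 0j, w
    for c in coeffs:
        total += c * power
        power *= w
    return total

def example(order=20):
    """Compare direct evaluation with the original and translated expansions."""
    a = [1.0, 0.5 - 0.2j, 2.0j]                          # hypothetical weights
    b = [cmath.exp(1j * x) for x in (0.0, 0.01, 0.02)]   # poles on the unit circle
    z_m, z_next = cmath.exp(0.01j), cmath.exp(0.03j)     # expansion centers
    z = cmath.exp(1.0j)                                  # distant evaluation point
    direct = sum(ak / (z - bk) for ak, bk in zip(a, b))
    A = pole_expansion(a, b, z_m, order)
    B = translate(A, z_m, z_next)
    return direct, evaluate(A, z_m, z), evaluate(B, z_next, z)
```
.DE
.P
Because $B sub l sup (m)$ depends only on the $A sub n sup (m)$ with $n^<=^l$, truncating at order $V$ introduces no additional error in the translated coefficients themselves; the only truncation error is that of the expansion at the evaluation point.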
.P One aspect of the multipole expansions of Greengard and Rokhlin that would have to be investigated carefully before their algorithm could be used for the zeta function computation is their accuracy. However, based on the results reported so far [GR1,\|GR2], that is not likely to be a problem. .H 3 "Computations of low zeros" .P The new algorithm is much more efficient than the implementation of the standard Riemann-Siegel formula evaluation in [Od2] even around zero number $10 sup 12$. However, this advantage might not hold or be as noticeable around zero $1.5 times 10 sup 9$, especially if one were only interested in separating zeros, and not computing them accurately (so that only about 1.2 evaluations of $Z(t)$ would be needed per zero, instead of the 10 or so of the current implementation of the algorithm of [OS]). Thus if one were interested in extending the numerical verification of the RH beyond the $1.5 times 10 sup 9$ zeros of [LRW2], the present implementation might not help very much. This is due to a large extent to the design of the program, which was aimed at computing around the $10 sup 20$-th zero, and so various parameters were chosen with that goal in mind. It is likely that the program could be rewritten to be much faster at lower heights, and with more extensive use of $dp$ arithmetic rigorous error analysis could be performed for it, but this would represent a substantial programming effort. .P What we present here is a combination of several techniques that ought to give a relatively simple algorithm for computing $Z(t)$ that ought to be about an order of magnitude faster than the algorithm of [LRW2], and for which rigorous error analysis could be performed. The basic idea is to again compute $Z sub 1 (t)$ in the standard way, and to compute $F(t)$ on a uniformly spaced grid of points $T$, $T+ delta ,^dd$, and to use band-limited function interpolation to then obtain $F(t)$ at intermediate points, as is explained in Section\ 4.3.
The errors of the band-limited function interpolation method can be bounded rigorously. If $k sub 0 ^approx^t sup 1/4$, it would suffice to compute $F(t)$ once every 4 Gram intervals, but in order to shorten the interpolation computations and to control the errors better (through having to sum fewer terms in the series in (4.3.15)), it could be preferable to sample somewhat more frequently. .P The evaluation of $F(t)$ in the suggested method would be performed not by the method of [OS], but by forming arrays $a sub k$ and $b sub k$, $k sub 0 ^<=^k^<=^k sub 1$, with .DS 2 .EQ (4.6.3.1) a sub k ~=~ 2k sup -1/2 ^e sup {iT^log^k} ^, ~~~~b sub k ~=~ e sup {i delta ^log^k} ^. .EN .DE $F(T)$ would then be the sum of the $a sub k$. Next, we assign to $a sub k$ the value of $a sub k ^cdot^b sub k$, and sum the new $a sub k$, to obtain $F(T+ delta )$. Repeating this operation yields all the $F(T+ j delta )$. Since complex $sp$ multiplications and additions are vectorized by the Cray Fortran compiler, this method would be very fast, as was already noted in [Od2]. (In order to avoid loss of accuracy in repeated multiplications, it would be advisable to use this method only on short stretches of the grid $T,^T+ delta ,^dd$, so that at $t^=^T+100 j delta$, for example, one would recompute $a sub k ^=^ 2k sup -1/2 ^exp ( it^log^k )$ from scratch.) .P A further improvement can be obtained by using the Euler product, as was suggested by A. Scho\*:nhage in a slightly different context. To compute $F(t)$, we do not compute all the $2k sup -1/2 ^exp ( it ^log^k)$ for $k sub 0 ^<=^k^<=^k sub 1$ explicitly. Instead, we compute them for all $k$, $1^<=^k^<=^k sub 1$, such that $(k,P)^=^1$, where $P^=^2 times 3 times 5 times ^...^ times p sub h$ is the product of the first $h$ primes (with $h$ small, say $h=4$ or 5). (This computation would be done by a modification of the method presented above.)
Then, to obtain $F(t)$, we would compute, for each $k$ with $(k,P)^=^1$, $1^<=^k^<=^k sub 1$,
.DS 2
.EQ (4.6.3.2)
2k sup -1/2 ~ e sup {it^log^k} ~ sum from {pile {g^member^Q above k sub 0 ^<=^kg ^<=^k sub 1}} ~ g sup -1/2 ^e sup {it^log^g} ^,
.EN
.DE
where $Q$ is the set of integers all of whose prime factors are $<=^p sub h$, and then sum these quantities over $k$. Since the sum in (4.6.3.2) would be the same for many $k$, this operation would be vectorizable.
.P
The methods presented in this section could also be useful for very accurate computations of high zeros. If one were to find an extreme example of Lehmer's phenomenon at large heights, or even a suspected counterexample to the RH, where it would be necessary to obtain more accurate values of $Z(t)$ than are given by the present implementation of the algorithm of [OS], writing an improved version of this algorithm with a guaranteed error bound would be very laborious, and the resulting program might require a prohibitive amount of time to run. On the other hand, since values of $Z(t)$ in only a relatively short interval would likely be needed, the method of this section (combined with the suggestion at the end of Section\ 4.6.1 about increased accuracy) might be adequate to resolve any uncertainties.
.HU "Acknowledgements"
Since this paper is essentially a continuation of [Od2], the author would like to thank P. Barrucand, R. A. Becker, O. Bohigas, R. P. Brent, W. S. Cleveland, F. J. Dyson, P. X. Gallagher, E. Grosswald, D. A. Hejhal, B. Kleiner, H. J. Landau, S. P. Lloyd, C. L. Mallows, H. L. Montgomery, A. Pandey, H. J. J. te Riele, L. Schoenfeld, N. L. Schryer, D. S. Slepian, and D. T. Winter, all of whom helped in the preparation of that earlier paper, and many of whom helped with this one. Additional help in the preparation of this paper was provided by D.\ H. Bailey, M.\ V. Berry, J.\ B. Conrey, L. Greengard, A. Ivi\o'c\(hc', and B.\ F. Logan. A. Scho\*:nhage is due special thanks for the joint work that resulted in the algorithm that made these computations possible.
.P The author is grateful to W.\ M. Coughran, P. Glick, R.\ H. Knag, and P.\ J. Weinberger for help in obtaining time and storage space on computers and to D.\ H. Bailey for providing fast Fourier transform programs. .bp .ce .B References .R .sp .VL 11 .LI [AKW] A. V. Aho, B. W. Kernighan, and P. J. Weinberger, \f2The AWK Programming Language,\f1 Addison-Wesley, 1988. .LI [AGR] J. Ambrosiano, L. Greengard, and V. Rokhlin, The fast multipole method for gridless particle simulations, \f2Computer Physics Comm. 48\f1 (1988), 117-125. .LI [AH] K. Appel and W. Haken, The four color proof suffices, \f2Math. Intelligencer 8\f1 (no.\ 1) (1986), \%10-20,\|58. .LI [Bail] D. H. Bailey, A high-performance FFT algorithm for vector supercomputers, \f2Intern. J. Supercomputer Appl. 2, no. 1,\f1 (1988), 82-87. .LI [Bala] R. Balasubramanian, On the frequency of Titchmarsh's phenomenon for $ zeta (s) $. IV, \f2Hardy-Ramanujan J. 9\f1 (1986), 1-10. .LI [BR] R. Balasubramanian and K. Ramachandra, On the frequency of Titchmarsh's phenomenon for $ zeta (s) $. III, \f2Proc. Indian Acad. Sci., Sect. A 86\f1 (1977), 341-351. .LI [BC] R. A. Becker and J. M. Chambers, \f2S: An Interactive Environment for Data Analysis and Graphics\f1, Wadsworth, 1984. .LI [Be1] M. V. Berry, Semiclassical theory of spectral rigidity, \f2Proc. Royal Soc. London A 400\f1 (1985), 229-251. .LI [Be2] M. V. Berry, Riemann's zeta function: A model for quantum chaos?, pp. 1-17 in \f2Quantum Chaos and Statistical Nuclear Physics\f1, T. Seligman and H. Nishioka, eds., Lecture Notes in Physics #263, Springer, 1986. .LI [Be3] M. V. Berry, Quantum chaology, \f2Proc. Royal Soc. London A 413\f1 (1987), 183-198. .LI [Be4] M. V. Berry, Semiclassical formula for the number variance of the Riemann zeros, \f2Nonlinearity 1\f1 (1988), 399-407. .LI [Bil] P. Billingsley, \f2Probability and Measure\f1, Wiley, 1979. .LI [BG] O. Bohigas and M.-J. Giannoni, Chaotic motion and random matrix theories, pp.
1-99 in \f2Mathematical and Computational Methods in Nuclear Physics\f1, J. S. Dehesa, J.\ M.\ G. Gomez, and A. Polls, eds., Lecture Notes in Physics #209, Springer 1984. .LI [BGS1] O. Bohigas, M. J. Giannoni, and C. Schmit, Characterization of chaotic quantum spectra and universality of level fluctuation laws, \f2Physical Rev. Letters 52\f1 (1984), 1-4. .LI [BGS2] O. Bohigas, M. J. Giannoni, and C. Schmit, Spectral properties of the Laplacian and random matrix theories, \f2J. Physique-Lettres 45\f1 (1984), L1015-22. .LI [BHP] O. Bohigas, R. U. Haq, and A. Pandey, Higher-order correlations in spectra of complex systems, \f2Phys. Rev. Letters 54\f1 (1985), 1645-1648. .LI [BH] E. Bombieri and D. A. Hejhal, Sur les ze\*'ros des fonctions $roman {z e hat t a}$ d'Epstein, \f2Comp. Rend. Acad. Sci. Paris, Se\*'r. I, 304\f1 (1987), 213-217. .LI [BI] E. Bombieri and H. Iwaniec, On the order of $ zeta (1/2 ~+~ it)$, \f2Ann. Scuola Norm. Sup. Pisa, ser. IV, 13\f1 (1986), 449-472. .LI [BB] J. M. Borwein and P. B. Borwein, \f2Pi and the AGM\f1, Wiley-Interscience, 1987. .LI [Br1] R. P. Brent, \f2Algorithms for Minimization without Derivatives\f1, Prentice-Hall, 1973. .LI [Br4] R. P. Brent, A Fortran multiple precision arithmetic package, \f2ACM Trans. Math. Software 4\f1 (1978), 57-70. .LI [Br5] R. P. Brent, On the zeros of the Riemann zeta function in the critical strip, \f2Math. Comp., 33\f1 (1979), 1361-1372. .LI [BFFMPW] T. A. Brody, J. Flores, J. P. French, P. A. Mello, A. Pandey, and S. S. M. Wong, Random-matrix physics: spectrum and strength fluctuations, \f2Rev. Modern Physics 53\f1 (1981), 385-479. .LI [Br] W. S. Brown, A simple but realistic model of floating-point computations, \f2ACM Trans. Math. Software\f1 \f27\f1 (1981), 445-480. .LI [BSS] P. L. Butzer, W. Splettsto\*:sser, and R. L. Stens, The sampling theorem and linear prediction in signal analysis, \f2Jber. d. Deutschen Math.-Verein. 90\f1 (1988), 1-70. .LI [CGR] J. Carrier, L. Greengard, and V. 
Rokhlin, A fast adaptive algorithm for particle simulations, \f2SIAM J. Stat. Sci. Comp. 9\f1 (1988), 669-686. .LI [CCKT] J. M. Chambers, W. S. Cleveland, B. Kleiner, and P. A. Tukey, \f2Graphical Methods for Data Analysis\f1, Wadsworth, 1983. .LI [Clev] W. S. Cleveland, Robust locally weighted regression and smoothing scatterplots, \f2J. Amer. Statist. Assoc. 74\f1 (1979), 829-836. .LI [CM1] J. des Cloizeaux and M. L. Mehta, Some asymptotic expressions for prolate spheroidal functions and for the eigenvalues of differential and integral equations of which they are solutions, \f2J. Math. Phys. 13\f1 (1972), 1745-1754. .LI [CM2] J. des Cloizeaux and M. L. Mehta, Asymptotic behavior of spacing distributions for the eigenvalues of random matrices, \f2J. Math. Phys. 14\f1 (1973), 1648-1650. .LI [CG1] J. B. Conrey and A. Ghosh, On mean values of the zeta-function, \f2Mathematika 31\f1 (1984), 159-161. .LI [CG2] J. B. Conrey and A. Ghosh, A mean value theorem for the Riemann zeta-function at its relative extrema on the critical line, \f2J. London Math. Soc. (2) 32\f1 (1985), 193-202. .LI [CGGGH] J. B. Conrey, A. Ghosh, D. A. Goldston, S. M. Gonek and D. R. Heath-Brown, On the distribution of gaps between zeros of the zeta function, \f2Quarterly J. of Math. Oxford\f1 (2), \f236\f1 (1985), 43-51. .LI [CGG1] J. B. Conrey, A. Ghosh, and S. M. Gonek, A note on gaps between zeros of the zeta function, \f2Bull. London Math. Soc. 16\f1 (1984), 421-424. .LI [CGG2] J. B. Conrey, A. Ghosh, and S. M. Gonek, Large gaps between zeros of the zeta-function, \f2Mathematika 33\f1 (1986), 212-238. .LI [CR] F. D. Crary and J. B. Rosser, High precision coefficients related to the zeta function, MRC Technical Summary Report #1344, Univ. of Wisconsin, Madison, May 1975, 171 pp.; reviewed by R. P. Brent in \f2Math. Comp. 31\f1 (1977), 803-804. .LI [Dav] D.\ Davies, An approximate functional equation for Dirichlet L-functions, \f2Proc. Royal Soc. Ser. A 284\f1 (1965), 224-236. .LI [Deu] M. 
Deuring, Asymptotische Entwicklungen der Dirichletschen L-Reihen, \f2Math. Ann. 168\f1 (1967), 1-30. .LI [Dy1] F. J. Dyson, Statistical theory of the energy levels of complex systems. II, \f2J. Math. Phys. 3\f1 (1962), 157-165. .LI [Ed] H. M. Edwards, \f2Riemann's Zeta Function\f1, Academic Press, 1974. .LI [Fel] W. Feller, \f2An Introduction to Probability Theory and its Applications,\f1 vol. 2, 2nd ed., Wiley, 1971. .LI [Fu1] A. Fujii, On the zeros of Dirichlet $L$-functions. I. \f2Trans. Amer. Math. Soc.\f1 \f2196\f1 (1974), 225-235. .LI [Fu2] A. Fujii, On the uniformity of the distribution of zeros of the Riemann zeta function, \f2J. reine angew. Math. 302\f1 (1978), 167-205. .LI [Fu3] A. Fujii, A prime number theorem in the theory of the Riemann zeta function, \f2J. reine angew. Math. 307/308\f1 (1979), 113-129. .LI [Fu4] A. Fujii, On the zeros of Dirichlet $L$-functions. II (With corrections to ``On the zeros of Dirichlet $L$-function. I'' and the subsequent papers), \f2Trans. Amer. Math. Soc.\f1 \f2267\f1 (1981), 33-40. .LI [Fu5] A. Fujii, On the uniformity of the distribution of zeros of the Riemann zeta function (II), \f2Comm. Math. Univ. Sancti Pauli 31\f1 (1982), 99-113. .LI [Fu6] A. Fujii, Zeros, primes and rationals, \f2Proc. Japan Acad. Ser. A, 58\f1 (1982), 373-376. .LI [Fu7] A. Fujii, Uniform distribution of the zeros of the Riemann zeta function and the mean value theorems of Dirichlet L-functions, \f2Proc. Japan Acad. Ser. A, 63\f1 (1987), 370-373. .LI [Fu8] A. Fujii, Gram's law for the zeta zeros and the eigenvalues of gaussian unitary ensembles, \f2Proc. Japan Acad. Ser. A, 63\f1 (1987), 392-395. .LI [Fu9] A. Fujii, Zeta zeros and Dirichlet $ L $-functions, \f2Proc. Japan Acad. Ser. A, 64\f1 (1988), 215-218. .LI [Gab] W. Gabcke, \f2Neue Herleitung und explicite Restabscha\*:tzung der Riemann-Siegel-Formel\f1, Ph.D. Dissertation, Go\*:ttingen, 1979. .LI [Gal2] P. X. Gallagher, Pair correlation of zeros of the zeta function, \f2J. 
reine\f1 \f2angew. Math.\f1 \f2362\f1 (1985), 72-86. .LI [Gal3] P. X. Gallagher, Applications of Guinand's formula, pp. 135-157 in \f2Analytic Number Theory and Diophantine Problems\f1, A. C. Adolphson, J. B. Conrey, A. Ghosh, and R. I. Yager, eds., Birkha\*:user, 1987. .LI [Gal4] P. X. Gallagher, A double sum over primes and zeros of the zeta function, in \f2Number Theory, Trace Formulas, and Discrete Groups,\f1 K. E. Aubert, E. Bombieri, and D. M. Goldfeld, eds., Proc. 1987 Selberg Symposium, Academic Press, 1989, pp. 229-240. .LI [GM] P. X. Gallagher and J. H. Mueller, Primes and zeros in short intervals, \f2J. reine angew. Math.\f1 303/304 (1978), 205-220. .LI [Gen] W. M. Gentleman, Fast Fourier transforms \(en for fun and profit, \f2AFIPS Proc. 29\f1 (1966), 563-578. .LI [Gh1] A. Ghosh, On Riemann's zeta-function--sign changes of $S(T)$, pp. 25-46 in \f2Recent Progress in Analytic Number Theory\f1, vol. 1, H. Halberstam and C. Hooley, eds., Academic Press, 1981. .LI [Gh2] A. Ghosh, On the Riemann zeta-function--mean value theorems and the distribution of $| S(t) |$, \f2J. Number Theory 17\f1 (1983), 93-102. .LI [Go1] D. A. Goldston, Prime numbers and the pair correlation of zeros of the zeta function, pp. 82-91 in \f2Topics in Analytic Number Theory\f1, S.\ W. Graham and J.\ D. Vaaler, eds., Univ. Texas Press, 1985. .LI [Go2] D. A. Goldston, On the function $S(T)$ in the theory of the Riemann zeta function, \f2J. Number Theory 27\f1 (1987), 149-177. .LI [Go3] D. A. Goldston, On the pair correlation conjecture for zeros of the Riemann zeta function, \f2J. reine angew. Math. 385\f1 (1988), 24-40. .LI [GG] D. A. Goldston and S. M. Gonek, A note on the number of primes in short intervals, to be published. .LI [GHB] D. A. Goldston and D. R. Heath-Brown, A note on the differences between consecutive primes, \f2Math. Ann.\f1 \f2266\f1 (1984), 317-320. .LI [GM] D. A. Goldston and H. L. Montgomery, Pair correlation of zeros and primes in short intervals, pp. 
183-203 in \f2Analytic Number Theory and Diophantine Problems\f1, A. C. Adolphson, J. B. Conrey, A. Ghosh, and R. I. Yager, eds., Birkha\*:user, 1987. .LI [Gon0] S. M. Gonek, \f2Analytic Properties of Zeta and $L$-Functions\f1, Ph.D. Dissertation, Univ. Michigan, 1979. .LI [Gon1] S. M. Gonek, Mean values of the Riemann zeta-function and its derivatives, \f2Invent. math. 75\f1 (1984), 123-141. .LI [Gon2] S. M. Gonek, A formula of Landau and mean values of $zeta (s)$, pp. 92-97 in \f2Topics in Analytic Number Theory\f1, S. W. Graham and J. D. Vaaler, eds., Univ. Texas Press, 1985. .LI [Gon3] S. M. Gonek, On negative moments of the Riemann zeta function, to be published. .LI [GK] S. W. Graham and G. Kolesnik, One and two dimensional exponential sums, pp. 205-222 in \f2Analytic Number Theory and Diophantine Problems\f1, A. C. Adolphson, J. B. Conrey, A. Ghosh, and R. I. Yager, eds., Birkha\*:user, 1987. .LI [GG] L. Greengard and W. D. Gropp, A parallel version of the fast multipole algorithm, to be published. .LI [GR1] L. Greengard and V. Rokhlin, A fast algorithm for particle simulations, \f2J. Computational Phys. 73\f1 (1987), 325-348. .LI [GR2] L. Greengard and V. Rokhlin, Rapid evaluation of potential fields in three dimensions, to be published. .LI [GR3] L. Greengard and V. Rokhlin, On the efficient implementation of the fast multipole algorithm, to be published. .LI [Gu1] A. P. Guinand, A summation formula in the theory of prime numbers, \f2Proc. London Math. Soc.\f1 (2) \f250\f1 (1948), 107-119. .LI [Gu2] A. P. Guinand, Fourier reciprocities and the Riemann zeta-function, \f2Proc. London Math. Soc. (2) 51\f1 (1949), 401-414. .LI [Gut] M. C. Gutzwiller, Stochastic behavior in quantum scattering, \f2Physica 7D\f1 (1983), 341-355. .LI [HI] J. L. Hafner and A. Ivi\o'c\(hc', On the mean-square of the Riemann zeta-function on the critical line, to be published. (IBM report RJ5729 (57705), 7/9/87.) .LI [HMF] \f2Handbook of Mathematical Functions\f1, M. 
Abramowitz and I. A. Stegun, eds., National Bureau of Standards, 9th printing, 1970. .LI [HPB] R. U. Haq, A. Pandey, and O. Bohigas, Fluctuation properties of nuclear energy levels: Do theory and experiment agree? \f2Physical Rev. Letters 48\f1 (1982), 1086-1089. .LI [HB1] D. R. Heath-Brown, Gaps between primes and the pair correlation of zeros of the zeta-function, \f2Acta Arith.\f1 \f241\f1 (1982), 85-99. .LI [HB2] D. R. Heath-Brown, Fractional moments of the Riemann zeta-function. II, to be published. .LI [Hej1] D. A. Hejhal, The Selberg trace formula and Riemann zeta function, \f2Duke Math. J. 43\f1 (1976), 441-482. .LI [Hej5] D. A. Hejhal, Zeros of Epstein zeta functions and supercomputers, pp. 1362-1384 in \f2Proc. Intern. Congress Math. 1986\f1, Amer. Math. Soc., 1987. .LI [Hej6] D. A. Hejhal, On the distribution of $ log | zeta sup prime (1/2 ~+~ it) | $, in \f2Number Theory, Trace Formulas, and Discrete Groups,\f1 K. E. Aubert, E. Bombieri, and D. M. Goldfeld, eds., Proc. 1987 Selberg Symposium, Academic Press, 1989, pp. 343-370. .LI [Hig] J. R. Higgins, Five short stories about the cardinal series, \f2Bull. Amer. Math. Soc. 12\f1 (1985), 45-89. .LI [Hl] E. Hlawka, U\*:ber die Gleichverteilung gewisser Folgen, welche mit den Nullstellen der Zetafunktion zusammenha\*:ngen, \f2Sitzungsber. O\*:st. Akad. Wiss. Math.-Naturw. Kl. II 184\f1 (1975), 459-471. .LI [HW] M. N. Huxley and N. Watt, Exponential sums and the Riemann zeta function, \f2Proc. London Math. Soc. (3) 57\f1 (1988), 1-24. .LI [Iv] A. Ivi\o'c\(hc', \f2The Riemann Zeta-function\f1, Wiley, 1985. .LI [Jer] A. J. Jerri, The Shannon sampling theorem - its various extensions and applications: a tutorial review, \f2Proc. IEEE 65\f1 (1977), 1565-1596. .LI [Joy1] D. Joyner, \f2Distribution theorems for L-functions\f1, Longman, 1986. .LI [Joy2] D. Joyner, On the Dyson-Montgomery hypothesis, preprint. .LI [Jut] M. Jutila, On the value distribution of the zeta-function on the critical line, \f2Bull.
London Math. Soc. 15\f1 (1983), 513-518. .LI [Kai] J. F. Kaiser, Design methods for sampled data filters, pp. 221-236 in \f2Proc. First Allerton Conf. Circuit and System Theory,\f1 Monticello, Illinois, 1963. .LI [KW] E. Karkoschka and P. Werner, Einige Ausnahmen zur Rosserschen Regel in der Theorie der Riemannschen Zetafunktion, \f2Computing 27\f1 (1981), 57-69. .LI [Kat] J. Katzenelson, Computational structure of the N-body problem, \f2SIAM J. Stat. Sci. Comp.\f1, to appear. .LI [KS] M. G. Kendall and A. Stuart, \f2The Advanced Theory of Statistics\f1, Griffin, 1981. .LI [Ko] G. Kolesnik, On the method of exponent pairs, \f2Acta Arith. 45\f1 (1985), 115-143. .LI [LO2] J. C. Lagarias and A. M. Odlyzko, Solving low-density subset sum problems, \f2J. ACM 32\f1 (1985), 229-246. (Preliminary version in \f2Proc. 24-th IEEE Foundations Computer Science Symp.,\f1 pp. 1-10, 1983.) .LI [LO4] J. C. Lagarias and A. M. Odlyzko, Computing $ pi (x) $ : An analytic method, \f2J. Algorithms 8\f1 (1987), 173-191. .LI [Lan1] E. Landau, U\*:ber die Nullstellen der Zetafunktion, \f2Math. Ann. 71\f1 (1911), 548-564. .LI [Lan] O. E. Lanford III, Computer-assisted proofs in analysis, pp. 1385-1394 in \f2Proc. Intern. Congress Math. 1986\f1, Amer. Math. Soc. 1987. .LI [Lau1] A. Laurinchikas, Riemann zeta function on the critical line, \f2Litovsk. Mat. Sb. 25, no. 2,\f1 (1985), 114-118. (In Russian.) English translation in \f2Lithuanian Math. J. 25\f1 (1985), 145-148. .LI [Lau2] A. Laurinchikas, Moments of the Riemann zeta-function on the critical line, \f2Mat. Zametki 39\f1 (1986), 483-493. (In Russian.) English translation in \f2Math. Notes Akad. Sci. USSR 39\f1 (1986), 267-272. .LI [Lau3] A. Laurinchikas, Limit theorem for the Riemann zeta-function on the critical line. I, \f2Litovsk. Mat. Sb. 27, no. 1,\f1 (1987), 113-132. (In Russian.) English translation in \f2Lithuanian Math. J. 27\f1 (1987), 63-75. .LI [Lau4] A.
Laurinchikas, Limit theorem for the Riemann zeta-function on the critical line. II, \f2Litovsk. Mat. Sb. 27, no. 3,\f1 (1987), 489-500. (In Russian.) English translation in \f2Lithuanian Math. J. 27\f1 (1987), 236-243. .LI [Lau5] A. Laurinchikas, A limit theorem for Dirichlet $L$-functions on the critical line, \f2Litovsk. Mat. Sb. 27, no. 1,\f1 (1987), 699-710. (In Russian.) .LI [LLL] A. K. Lenstra, H. W. Lenstra, Jr., and L. Lova\*'sz, Factoring polynomials with rational coefficients, \f2Math. Ann. 261\f1 (1982), 515-534. .LI [Li] J. E. Littlewood, On the zeros of the Riemann zeta-function, \f2Proc. Cambridge Philos. Soc. 22\f1 (1924), 295-318. .LI [Log1] B. F. Logan, Optimal truncation of the Hilbert transform kernel for bounded high-pass functions, pp. 10-12 in \f2Proc. 5-th Annual Princeton Conf. Information Sci. Systems\f1, Princeton Univ., 1971. .LI [Log2] B. F. Logan, Bounds for the tails of sharp-cutoff filter kernels, \f2SIAM J. Math. Anal. 19\f1 (1988), 372-376. .LI [LRW1] J. van de Lune, H. J. J. te Riele, and D. T. Winter, Rigorous high speed separation of zeros of Riemann's zeta function, Report NW 113/81, Mathematical Center, Amsterdam, 1981. .LI [LRW2] J. van de Lune, H. J. J. te Riele, and D. T. Winter, On the zeros of the Riemann zeta function in the critical strip. IV., \f2Math. Comp. 46\f1 (1986), 667-681. .LI [Meh] M. L. Mehta, \f2Random Matrices\f1, Academic Press, 1967. .LI [MdC] M. L. Mehta and J. des Cloizeaux, The probabilities for several consecutive eigenvalues of a random matrix, \f2Indian J. Pure Appl. Math. 3\f1 (1972), 329-351. .LI [Mon1] H. L. Montgomery, The pair correlation of zeros of the zeta function, pp. 181-193 in \f2Analytic Number Theory,\f1 H. G. Diamond, ed., Proc. Symp. Pure Math. \f224\f1, Amer. Math. Soc., Providence 1973. .LI [Mon2] H. L. Montgomery, Distribution of zeros of the Riemann zeta function, \f2Proc. Int. Congress Math. Vancouver\f1 (1974), 379-381. .LI [Mon3] H. L. 
Montgomery, Extreme values of the Riemann zeta function, \f2Comm. Math. Helv.\f1 \f252\f1 (1977), 511-518. .LI [Mon6] H. L. Montgomery, Selberg's work on the zeta function, in \f2Number Theory, Trace Formulas, and Discrete Groups,\f1 K.\ E. Aubert, E. Bombieri, and D. M. Goldfeld, eds., Proc. 1987 Selberg Symposium, Academic Press, 1989, pp. 157-168. .LI [MO] H. L. Montgomery and A. M. Odlyzko, Gaps between zeros of the zeta function, pp. 1079-1106 in \f2Topics in Classical Number Theory: Coll. Math. Soc. Janos Bolyai 34,\f1 G. Hala\*'sz, ed., North-Holland, 1984. .LI [MW] H. L. Montgomery and P. J. Weinberger, Notes on small class numbers, \f2Acta Arith. 24\f1 (1973/74), 529-542. .LI [Mos1] J. Moser, On a certain sum in the theory of the Riemann zeta function, (in Russian), \f2Acta Arith. 31\f1 (1976), 31-43. .LI [Mos2] J. Moser, On a Hardy-Littlewood theorem in the theory of the Riemann zeta function, (in Russian), \f2Acta Arith. 31\f1 (1976), 45-51. .LI [Mos3] J. Moser, On Gram's law in the theory of the Riemann zeta function, (in Russian), \f2Acta Arith. 32\f1 (1977), 107-113. .LI [Mos4] J. Moser, Proof of a hypothesis of E. C. Titchmarsh in the theory of the Riemann zeta function, (in Russian), \f2Acta Arith. 36\f1 (1980), 147-156. .LI [Mos5] J. Moser, On the roots of the equations $ Z sup prime (t) ~=~ 0 $, (in Russian), \f2Acta Arith. 40\f1 (1981), 79-89. .LI [Mos6] J. Moser, Corrections to the papers: \f2Acta Arith. 31\f1 (1976), pp. 31-43; \f231\f1 (1976), pp. 45-51; \f235\f1 (1979), pp. 403-404, (in Russian), \f2Acta Arith. 40\f1 (1981), 97-107. .LI [Mos7] J. Moser, New consequences of the Riemann-Siegel formula, (in Russian), \f2Acta Arith. 42\f1 (1982), 1-10. .LI [Mos8] J. Moser, On a certain biquadratic sum in the theory of the Riemann zeta function, (in Russian), \f2Acta Math. Univ. Comen. 42-43\f1 (1983), 35-39. .LI [Mos9] J. 
Moser, Properties of the sequence $ {Z[ t sub v ( tau ) ]} $ in the theory of the Riemann zeta function, (in Russian), \f2Acta Math. Univ. Comen. 42-43\f1 (1983), 55-63. .LI [Mos10] J. Moser, On some lower bounds for the distance of consecutive zeros of the function $ zeta (1/2 ~+~ it) $, (in Russian), \f2Acta Math. Univ. Comen. 44-45\f1 (1984), 75-80. .LI [Mos11] J. Moser, On a cubic formula in the theory of the Riemann zeta function, (in Russian), \f2Acta Math. Univ. Comen. 44-45\f1 (1984), 81-89. .LI [Mos12] J. Moser, New mean value theorems for the function $ | zeta (1/2 ~+~ it) | sup 2 $, (in Russian), \f2Acta Math. Univ. Comen. 46-47\f1 (1985), 21-40. .LI [Mos13] J. Moser, On the behavior of positive and negative values of the function $ Z(t) $ in the theory of the Riemann zeta function, (in Russian), \f2Acta Math. Univ. Comen. 46-47\f1 (1985), 41-48. .LI [Mos14] J. Moser, On a cubic sum in the theory of the Riemann zeta function, (in Russian), \f2Acta Math. Univ. Comen. 46-47\f1 (1985), 63-74. .LI [Mue1] J. H. Mueller, On the Riemann zeta function $ zeta (s) $ - gaps between sign changes of $ S(t) $, \f2Mathematika 29\f1 (1982), 264-269. .LI [Mue2] J. H. Mueller, Arithmetic equivalent of essential simplicity of zeta zeros, \f2Trans. Amer. Math. Soc. 275\f1 (1983), 175-183. .LI [Od0] A. M. Odlyzko, Applications of symbolic mathematics to mathematics, in \f2Applications of Computer Algebra\f1, R. Pavelle, ed., Kluwer-Nijhoff, 1985, pp.\ 95-111. .LI [Od1] A. M. Odlyzko, New analytic algorithms in number theory, pp. 466-475 in \f2Proc. Intern. Congress Math. 1986\f1, Amer. Math. Soc. 1987. .LI [Od2] A. M. Odlyzko, On the distribution of spacings between zeros of the zeta function, \f2Math. Comp. 48\f1 (1987), 273-308. .LI [Od3] A. M. Odlyzko, Zeros of the Riemann zeta function: Conjectures and computations, manuscript in preparation. .LI [Od4] A. M. Odlyzko, The number variance of zeros of the Riemann zeta function, manuscript in preparation. .LI [OtR] A. M. 
Odlyzko and H. J. J. te Riele, Disproof of the Mertens conjecture, \f2J. reine angew. Math. 357\f1 (1985), 138-160. .LI [OS] A. M. Odlyzko and A. Scho\*:nhage, Fast algorithms for multiple evaluations of the Riemann zeta function, \f2Trans. Amer. Math. Soc. 309\f1 (1988), 797-809. .LI [Oz1] A. E. Ozluk, \f2Pair correlation of zeros of Dirichlet L-functions\f1, Ph.D. dissertation, Univ. of Michigan, Ann Arbor, Mich., 1982. .LI [Oz2] A. E. Ozluk, On the pair correlation of zeros of Dirichlet L-functions, \f2Proc. First Conf. Canadian Number Theory Assoc. (Banff, 1988)\f1, R.\ A. Mollin, ed., W.\ de Gruyter, 1989, to appear. .LI [Por] C. E. Porter, ed., \f2Statistical Theories of Spectra: Fluctuations\f1, Academic Press, 1965. .LI [RK] S. P. Radziszowski and D. L. Kreher, Solving subset sum problems with the $L sup 3$ algorithm, \f2J. Combin. Math. Combin. Comput. 3\f1, (1988), 49-63. .LI [Sch1] C. P. Schnorr, A more efficient algorithm for a lattice basis reduction, \f2J. Algorithms 9\f1 (1988), 47-62. .LI [Sch2] C. P. Schnorr, A hierarchy of polynomial time lattice basis reduction algorithms, \f2Theoretical Computer Science 53\f1 (1987), 201-224. .LI [Schr1] N. L. Schryer, A test of a computer's floating-point arithmetic unit, AT&T Bell Laboratories Computing Science Technical Report #89, 1981. .LI [Schr2] N. L. Schryer, A case study in testing: floating-point arithmetic, to be published. .LI [Sel2] A. Selberg, On the remainder in the formula for $ N(T) $, the number of zeros of $ zeta (s) $ in the strip $ 0 ~<~ t ~<~ T $, \f2Avh. Norske Vid. Akad. Oslo I. Mat.-Naturvid. Kl., no. 1\f1 (1944), 1-17. .LI [Sel3] A. Selberg, Contributions to the theory of the Riemann zeta-function, \f2Arch. for Math. og Naturv. B, 48\f1 (1946), 89-155. .LI [SW] D. Shanks and J. W. Wrench, Jr., Calculation of $ pi $ to 100,000 decimals, \f2Math. Comp. 16\f1 (1962), 76-99. .LI [Sie1] C. L. 
Siegel, U\*:ber Riemanns Nachlass zur analytischen Zahlentheorie, \f2Quellen und Studien zur Geschichte der Math. Astr. Phys. 2\f1 (1932), 45-80. Reprinted in C. L. Siegel, \f2Gesammelte Abhandlungen\f1, Springer, 1966, Vol. 1, pp. 275-310. .LI [Sie2] C. L. Siegel, Contributions to the theory of the Dirichlet $L$-series and the Epstein zeta functions, \f2Ann. Math. 44\f1 (1943), 143-172. Reprinted in C. L. Siegel, \f2Gesammelte Abhandlungen\f1, Springer, 1966, Vol. 2, pp. 360-389. .LI [SF] E. H. Spafford and J. C. Flaspohler, A report on the accuracy of some floating point math functions on selected computers, Georgia Inst. Tech., School of Inform. Comp. Sci., Report GIT-ICS 85/06, revised Jan. 1986. .LI [St] H. M. Stark, On complex quadratic fields with class number two, \f2Math. Comp. 29\f1 (1975), 289-302. .LI [Tit0] E. C. Titchmarsh, On van der Corput's method and the zeta-function of Riemann. IV, \f2Quart. J. Math. 5\f1 (1934), 98-105. .LI [Tit1] E. C. Titchmarsh, The zeros of the Riemann zeta-function, \f2Proc. Royal Soc. London 151\f1 (1935), 234-255 and \f2157\f1 (1936), 261-263. .LI [Tit2] E. C. Titchmarsh, \f2The Theory of the Riemann Zeta-function\f1, 2nd ed. (revised by D. R. Heath-Brown), Oxford Univ. Press, 1986. .LI [Ts1] K.-M. Tsang, \f2The Distribution of the Values of the Riemann Zeta-function\f1, Ph.D. Dissertation, Princeton, 1984. .LI [Ts2] K.-M. Tsang, Some $ OMEGA $-theorems for the Riemann zeta-function, \f2Acta Arith. 46\f1 (1986), 369-395. .LI [VB] A. L. Van Buren, A Fortran computer program for calculating the linear prolate functions, Report 7994, Naval Research Laboratory, Washington, May 1976. .LI [Wa] N. Watt, Exponential sums and the Riemann zeta function. II, \f2J. London Math. Soc.\f1, to appear. .LI [We1] A. Weil, Sur les ``formules explicites'' de la theorie des nombres premiers, \f2Comm. Sem. Math. Univ. Lund\f1, tome supplementaire (1952), 252-265. .LI [Whit] J. M. Whittaker, \f2Interpolatory Function Theory,\f1 Cambridge Univ.
Press, 1935. .LI [WR] D. Winter and H. te Riele, Optimization of a program for the verification of the Riemann hypothesis, \f2Supercomputer 5\f1 (1985), 29-32. .LI [Zh] Feng Zhao, An $O(N)$ algorithm for three-dimensional $N$-body simulations, MIT AI Lab. report # 995, October 1987. .LI [ZJ] Feng Zhao and L. Johnsson, The parallel multipole method on the Connection Machine, paper in preparation. .LE .PH "" .bp .ce .DS .VL 10 .LI "Table\ 1.1." Several zeros of the Riemann zeta function near zero number $10 sup 20$. All zeros are of the form $1/2 + i gamma sub n$. .LE .sp 2 .TS center; c | c l | n. $n$ $gamma sub n ^-^15,202,440,115,920,740,000$ .sp .3 _ .sp .4 $10 sup 20 - 6$ 7267.894628 .sp .5 $10 sup 20 - 5$ 7267.988948 .sp .5 $10 sup 20 - 4$ 7268.077538 .sp .5 $10 sup 20 - 3$ 7268.258252 .sp .5 $10 sup 20 - 2$ 7268.337163 .sp .5 $10 sup 20 - 1$ 7268.563308 .sp .5 $10 sup 20$ 7268.629029 .sp .5 $10 sup 20 + 1$ 7268.828625 .sp .5 $10 sup 20 + 2$ 7268.972156 .sp .5 $10 sup 20 + 3$ 7269.122460 .sp .5 $10 sup 20 + 4$ 7269.241484 .sp .5 $10 sup 20 + 5$ 7269.313890 .TE .DE .sp 3 .DS .TS center; c s s s c | c | c | c c | c | c | c c | n | l | l. Table\ 1.2.\0 Large computed sets of zeros of the Riemann zeta function. .sp 2 index of first approximate height $N$ number of zeros zero in set of zero no. $N$ .sp .3 _ .sp .4 $10 sup 12$ 1,592,196 $N^-^ 6,032$ \0\|$2.677 times 10 sup 11$ .sp .5 $10 sup 14$ 1,685,452 $N^-^ 736$ \0\|$2.251 times 10 sup 13$ .sp .5 $10 sup 16$ 16,480,973 $N^-^5,946$ \0\|$1.941 times 10 sup 15$ .sp .5 $10 sup 18$ 16,671,047 $N^-^8,839$ \0\|$1.706 times 10 sup 17$ .sp .5 $10 sup 19$ 16,749,725 $N^-^13,607$ \0\|$1.608 times 10 sup 18$ .sp .5 $10 sup 20$ 78,893,234 $N^-^30,769,710$ \0\|$1.520 times 10 sup 19$ .TE .DE .bp .DS .TS center; c s s s s s s c | c | c | c | c | c | c n | n | n | n | n | n | n. Table\ 2.3.1.\0 Moments of $delta sub n -1$. 
.sp 2 $k$ $N=1$ $N= 10 sup 12$ $N=10 sup 16$ $N=10 sup 18$ $N=10 sup 20$ GUE .sp .3 _ .sp .4 2 0.161 0.176 0.177 0.178 0.178 0.180 3 0.031 0.035 0.036 0.036 0.037 0.038 4 0.081 0.096 0.098 0.098 0.099 0.101 5 0.046 0.059 0.061 0.062 0.062 0.066 6 0.075 0.100 0.103 0.105 0.106 0.111 7 0.072 0.109 0.113 0.115 0.116 0.124 8 0.103 0.171 0.178 0.180 0.183 0.197 9 0.126 0.246 0.258 0.261 0.266 0.290 10 0.181 0.408 0.431 0.434 0.444 0.488 .TE .DE .sp 5 .DS .TS center; c s s s s s s c | c | c | c | c | c | c n | n | n | n | n | n | n. Table\ 2.3.2.\0 Moments of $delta sub n + delta sub n+1 - 2$. .sp 2 $k$ $N=1$ $N= 10 sup 12$ $N=10 sup 16$ $N=10 sup 18$ $N=10 sup 20$ GUE .sp .3 _ .sp .4 2 0.207 0.236 0.241 0.242 0.243 0.249 3 0.028 0.027 0.027 0.027 0.028 0.030 4 0.123 0.167 0.173 0.175 0.176 0.185 5 0.047 0.062 0.064 0.065 0.066 0.073 6 0.119 0.204 0.214 0.218 0.220 0.237 7 0.078 0.151 0.159 0.162 0.164 0.185 8 0.155 0.370 0.393 0.401 0.406 0.451 9 0.142 0.423 0.453 0.465 0.470 0.544 10 0.252 0.909 0.985 1.016 1.026 1.178 .TE .DE .bp .DS .TS center; c s s s s s s c | c | c | c | c | c | c c | c | c | c | c | c | c c | n | n | n | n | n | n. Table\ 2.3.3.\0 Moments of $log ^delta sub n$, $delta sub n sup -1$, and $delta sub n sup -2$. .sp 2 moments of $N=1$ $N=10 sup 12$ $N=10 sup 16$ $N=10 sup 18$ $N=10 sup 20$ GUE .sp .3 _ .sp .4 $log^delta sub n$ \-0.0912 \-0.1013 \-.1022 \-.1025 \-.1027 \-0.1035 $delta sub n sup -1$ 1.2363 1.2700 1.2725 1.2733 1.2736 1.2758 $delta sub n sup -2$ 2.2235 2.5277 2.5309 2.5855 2.5461 2.5633 .TE .DE .sp 1i .DS .VL 14 .LI "Table\ 2.3.4." Kolmogorov statistic for $delta sub n$ and $delta sub n + delta sub n+1$, for blocks of $10 sup 6$ zeros. .LE .sp 1.5 .TS center; c || c s || c s c || c s || c s c || c | c || c | c c || n | c || n | c. $delta sub n$ $delta sub n + delta sub n+1$ _ _ .sp .4 $D$ prob. $D$ prob. .sp .3 _ .sp .4 $N=10 sup 12$ vs. GUE 0.00419 $10 sup -15$ 0.00819 $10 sup -58$ .sp .5 $N=10 sup 20 (a)$ vs. 
GUE 0.00180 $3 times 10 sup -3$ 0.00318 $3 times 10 sup -9$ .sp .5 $N=10 sup 20 (b)$ vs. GUE 0.00152 $2 times 10 sup -2$ 0.00399 $3 times 10 sup -14$ .sp .5 $N=10 sup 20 (a)$ vs. $N=10 sup 20 (b)$ 0.00108 0.19 0.00119 0.12 .sp .5 $N=10 sup 20 (a)$ vs. $N=10 sup 20 (c)$ 0.00082 0.51 0.00123 0.10 .sp .5 $N=10 sup 20 (b)$ vs. $N=10 sup 20 (c)$ 0.00089 0.41 0.00096 0.32 .TE .DE .bp .DS .VL 14 .LI "Table\ 2.4.1." Moments of scaled values of $S(t)$ computed from two intervals of $10 sup 6$ zeros each near $N=10 sup 12$ and $10 sup 20$. .LE .sp 1.5 .TS center,delim ($$); c | c | c | c n | n | n | n. $k$ $N=10 sup 12$ $N=10 sup 20$ normal .sp .3 _ .sp .4 1 $1.2 times 10 sup -5$ $-6.3 times 10 sup -6$ 0 2 1.0 1.0 1 3 $3.9 times 10 sup -4$ $-4.7 times 10 sup -4$ 0 4 2.792 2.831 3 5 $4.8 times 10 sup -3$ $-9.1 times 10 sup -3$ 0 6 12.22 12.71 15 7 0.050 \-0.140 0 8 70.98 76.57 105 .sp $|1|$ 0.8058 0.8042 0.79788... $|3|$ 1.3130 1.3458 1.5957... $|5|$ 5.597 5.742 6.3830... .sp 1\(** $5.9 times 10 sup -6$ $-3.2 times 10 sup -6$ 2\(** 0.2330808 0.2606901 .TE .DE .sp 1i .DS .VL 14 .LI "Table\ 2.4.2." Average number of sign changes of $S(t)$ per Gram interval. .LE .sp 1.5 .TS center; c | c c | n. $N$ $S(t)$ sign changes .sp .3 _ .sp .4 $10 sup 12$ 1.600 $10 sup 14$ 1.575 $10 sup 16$ 1.556 $10 sup 18$ 1.538 $10 sup 19$ 1.531 $10 sup 20$ 1.524 .TE .DE .bp .DS .VL 14 .LI "Table\ 2.4.3." Largest values of $|S(t)|$ in various data sets and fraction of exceptions to Rosser's rule that had $|S(t)|^>^2.3$. .LE .sp 1.5 .TS center; c | c | c c | c | c c | n | n. fraction of cases $n$ largest $S(t)$ with $|S(t)|^>^2.3$ .sp .3 _ .sp .4 $10 sup 12$ 2.1918 \(en $10 sup 14$ \-2.2784 \(en $10 sup 16$ \-2.4639 0.0123 $10 sup 18$ 2.6121 0.0175 $10 sup 19$ \-2.5698 0.0162 $10 sup 20$ 2.7379 0.0236 .TE .DE .sp 1i .DS .TS center; c s s c | c | c c | n | n. Table\ 2.4.4.\0 Statistics of $S sub 1 (t)$.
.sp 2 $N=10 sup 12$ $N=10 sup 20$ .sp .3 _ .sp .4 mean of $S sub 1 (t) sup 2$ 0.0793 0.0793 .sp mean of $S sub 1 (t) sup 3$ 0.0058 0.0058 .sp mean of $S sub 1 (t) sup 4$ 0.0148 0.0148 .sp max $S sub 1 (t)$ 0.966 0.996 .sp min $S sub 1 (t)$ \-0.786 \-0.768 .sp no. sign changes 0.120 0.074 .TE .DE .bp .DS .VL 14 .LI "Table\ 2.5.1." Extremal values of $delta sub n$ and $delta sub n + delta sub n+1$, and the probability that the minimum value of $delta sub n$ in the GUE in a sample of the same size would not exceed the minimal value that was found. .LE .sp 1.5 .TS center; c | c | c | c | c | c c | c | c | c | c | c c | n | n | n | n | n. prob. $N$ $min^delta sub n$ $max^delta sub n$ $min ^( delta sub n + delta sub n+1 )$ $max ^( delta sub n + delta sub n+1 )$ $min^delta sub n$ .sp .3 _ .sp .4 $10 sup 12$ 0.00649 3.5098 0.2952 4.5833 0.38 $10 sup 14$ 0.00935 3.4716 0.2723 4.6564 0.78 $10 sup 16$ 0.00454 4.1637 0.1664 4.9921 0.82 $10 sup 18$ 0.00112 3.9869 0.1680 5.0401 0.025 $10 sup 19$ 0.00090 3.8089 0.1918 5.0588 0.013 $10 sup 20$ 0.00243 4.0258 0.1124 5.2125 0.71 .TE .DE .sp 1i .DS .VL 14 .LI "Table\ 2.5.2." Frequencies of very large and very small $delta sub n$ and $delta sub n + delta sub n+1$ (number of cases per million zeros) and the GUE predictions. .LE .sp 1.5 .TS center; c | c | c | c | c | c c | n | n | n | n | n. $N$ $delta sub n ^<=^0.05$ $delta sub n ^<=^0.1$ $delta sub n^>=^2.8$ $delta sub n + delta sub n+1 ^<=^0.6$ $delta sub n + delta sub n+1 ^>=^4$ .sp .3 _ .sp .4 $10 sup 12$ 126.2 1055 157.6 331.0 94.8 $10 sup 14$ 118.7 1103 156.6 329.9 97.9 $10 sup 16$ 130.9 1070 164.4 341.1 107.7 $10 sup 18$ 135.3 1088 169.9 356.1 108.5 $10 sup 19$ 140.5 1084 170.2 362.5 114.0 $10 sup 20$ 133.9 1073 175.1 359.1 112.8 .sp GUE 136.8 1088 196.8 386.3 135.7 .TE .DE .bp .DS .TS center; c s s s c | c | c | c n | n | n | n. Table\ 2.6.1\0 Autocovariances of the $delta sub n$.
.sp 2 $k$ $N=1$ $N=10 sup 12$ $N=10 sup 20$ .sp .3 _ .sp .4 0 .1607429 0.1754737 0.1781405 1 \-.0574023 \-0.0576441 \-0.0566976 2 \-.0126083 \-0.0143034 \-0.0143122 3 \-.0065874 \-0.0055030 \-0.0065465 4 \-.0045317 \-0.0026406 \-0.0028474 5 \-.0031454 \-0.0016681 \-0.0019375 6 \-.0011362 \-0.0013422 \-0.0014018 7 \-.0007084 \-0.0009186 \-0.0006824 8 \-.0013904 \-0.0010702 \-0.0006266 9 .0013483 \-0.0007598 \-0.0005397 10 .0034456 \-0.0006851 \-0.0004818 11 .0018714 \-0.0006116 \-0.0002820 12 \-.0002503 \-0.0004058 \-0.0004115 13 \-.0005412 \-0.0006459 \-0.0003212 14 .0025227 \-0.0005569 \-0.0003363 15 .0046388 \-0.0007091 \-0.0003671 16 .0025451 \-0.0001529 \-0.0001061 17 .0010829 \-0.0000236 \-0.0004597 18 \-.0001093 0.0004387 \-0.0000046 19 \-.0057139 0.0001141 \-0.0003378 20 \-.0133596 \-0.0000075 0.0000028 .sp .3 _ .sp .4 9980 0.0020484 0.0018166 9981 \-0.0037100 0.0012394 9982 \-0.0030168 0.0003898 9983 0.0029465 \-0.0015079 9984 0.0043783 \-0.0019355 9985 \-0.0010326 \-0.0012999 9986 \-0.0034815 0.0001715 9987 0.0000487 0.0014113 9988 \-0.0012679 0.0021382 9989 \-0.0037964 0.0004500 9990 0.0003175 \-0.0005050 9991 0.0048778 \-0.0014679 9992 0.0062130 \-0.0018540 9993 0.0053806 \-0.0002132 9994 0.0011459 0.0014712 9995 \-0.0048852 0.0013364 9996 \-0.0057967 0.0010678 9997 \-0.0056723 0.0001780 9998 \-0.0034737 \-0.0014741 9999 0.0031196 \-0.0020779 10000 0.0074084 \-0.0014374 .TE .DE .bp .DS .VL 14 .LI "Table\ 2.7.1." Frequency of the Lehmer phenomenon among the $N=10 sup 19$ and $N=10 sup 20$ data sets. .LE .sp 1.25 .TS center; c | c n | n. $x$ no. values $<^x$ .sp .3 _ .sp .4 0.0005 1976 0.0004 1407 0.0003 900 0.0002 472 0.0001 173 0.00005 63 0.00002 20 0.00001 8 .TE .DE .bp .DS .ce Table\ 2.8.1\0 Largest values of $| zeta (1/2 +it)|$ that were found. .sp 2 .TS center; c | c c | n. 
$N$ $max ~|Z(t)|$ .sp .3 _ .sp .4 $10 sup 12$ 176 $10 sup 14$ 246 $10 sup 16$ 460 $10 sup 18$ 376 $10 sup 19$ 448 $10 sup 20$ 641 .TE .DE .sp 1i .DS .ce Table\ 2.8.2.\0 Frequency of large values of $| zeta (1/2 + it)|$, $N=10 sup 19$ and $N=10 sup 20$. .sp 2 .TS center; c | c n | n. $x$ no. values $>^x$ .sp .3 _ .sp .4 250 565 300 207 350 84 400 28 450 9 500 5 .TE .DE .bp .DS .TS center; c s s s s c | c | c | c | c n | n | n | n | n. Table\ 2.9.1.\0 Mean values of $| zeta (1/2 + it)|$. .sp 2 $lambda$ $r( lambda ,^H)$ $c sub 1 ( lambda )$ $c sub 2 ( lambda )$ $c( lambda )$ .sp .3 _ .sp .4 .1 1.004 1.0042 .2 1.034 1.0172 .3 1.067 1.0381 .4 1.098 1.0640 .5 1.123 1.0904 .6 1.135 1.1113 .7 1.132 1.1195 .8 1.107 1.1076 .9 1.060 1.0690 1.0 .989 1.0 1.0 1.0 1.1 .896 .901 0.906 1.2 .787 .776 0.795 1.3 .667 .637 0.672 1.4 .544 .494 0.544 1.5 .426 .360 0.421 1.6 .319 .246 0.309 1.7 .229 .157 0.215 1.8 .156 .092 0.142 1.9 .101 .050 0.086 2.0 .0624 .025 0.051 .051 2.1 .0364 .012 2.2 .0201 .0049 2.3 .0105 .0019 2.4 .00522 .00066 2.5 .00239 .00021 .TE .DE .sp 2 .DS .TS center; c s c | c n | n. Table\ 2.9.2\0 Negative moments of $| zeta (1/2 + it)|$. .sp 2 $lambda$ mean values of $|Z(t)| sup {- 2 lambda}$ .sp .3 _ .sp .4 0.1 1.06 .sp .5 0.2 1.27 .sp .5 0.3 1.83 .sp .5 0.4 3.77 .TE .DE .bp .po -.6i .DS .ll 7i .nr W 7i .VL 14 .LI "Table\ 2.10.1." Moments of the scaled distribution of log $| zeta (1/2 + it)|$ obtained from $10 sup 6$ random samples near zero number $N$ and the moments of the normal distribution. .LE .sp 1.5 .TS c | c | c | c | c | c | c n | n | n | n | n | n | n. 
$k$ $N=10 sup 12$ $N=10 sup 18 (a)$ $N=10 sup 18 (b)$ $N=10 sup 20 (c)$ $N=10 sup 20 (d)$ normal .sp .3 _ .sp .4 1 0.0 0.0 0.0 0.0 0.0 0 2 1.0 1.0 1.0 1.0 1.0 1 3 \-0.61867 \-0.54505 \-0.54199 \-0.53625 \-0.55069 0 4 4.1319 3.9441 3.9491 3.9233 3.9647 3 5 \-9.0528 \-7.8024 \-7.8610 \-7.6238 \-7.8839 0 6 44.065 39.717 40.360 38.434 39.393 15 7 \-175.39 \-159.45 \-162.86 \-144.78 \-148.77 0 8 900.06 930.19 930.70 758.57 765.54 105 9 \-4700.06 \-6065.28 \-5692.4 \-4002.5 \-3934.7 0 10 27016.2 48430.0 40818.3 24060.5 22722.9 945 .sp 1\(** \-0.0003725 \-0.0009607 0.00101075 \-0.00159534 0.00054934 2\(** 2.29679 2.52283 2.51805 2.57360 2.51778 .TE .ll 6i .nr W 6i .DE .sp .po +.6i .bp .pl +5 .DS .VL 14 .LI "Table\ 2.11.1" Moments of scaled values of $log ^| zeta sup prime (1/2 + it) |$ computed from $10 sup 6$ zeros for $N=10 sup 20$. .LE .sp 1.5 .TS center; c | c | c c | c | c n | n | n. moments of $k$ scaled moments of $log^|Z sup prime ( gamma ) |$ normal distribution .sp .3 _ .sp .4 1 0.0 0 2 1.0 1 3 \-0.03377 0 4 3.0182 3 5 \-0.59687 0 6 15.4522 15 7 \-9.0568 0 8 115.378 105 9 \-144.031 0 10 1180.33 945 .sp 1\(** 3.34571 2\(** 12.3312 .TE .DE .sp .DS .VL 14 .LI "Table\ 2.11.2." Moments of $| zeta sup prime ( 1/2 +i gamma sub n )|$ divided by conjectured main term, two sets of zeros near zero number $10 sup 20$. .LE .sp 1.5 .TS center; c | c | c n | l | l. $lambda$ first set of $5 times 10 sup 5$ zeros second set of $5 times 10 sup 5$ zeros .sp .3 _ .sp .4 \-5. \0\0$6.08 times 10 sup -22$ \0\0$4.68 times 10 sup -20$ \-4.5 \0\0$1.29 times 10 sup -16$ \0\0$5.40 times 10 sup -15$ \-4. \0\0$4.21 times 10 sup -12$ \0\0$9.45 times 10 sup -11$ \-3.5 \0\0$2.15 times 10 sup -8$ \0\0$2.54 times 10 sup -7$ \-3. \0\0$1.74 times 10 sup -5$ \0\0$1.07 times 10 sup -4$ \-2.5 \0\0$2.29 times 10 sup -3$ \0\0$7.41 times 10 sup -3$ \-2. \0\0$5.29 times 10 sup -2$ \0\0$9.51 times 10 sup -2$ \-1.5 \0\00.275 \0\00.322 \-1. \0\00.640 \0\00.644 \-0.5 \0\01.079 \0\01.078 0. 
\0\01.0 \0\01.0 0.5 \0\00.436 \0\00.436 1. \0\0$8.17 times 10 sup -2$ \0\0$8.17 times 10 sup -2$ 1.5 \0\0$5.31 times 10 sup -3$ \0\0$5.32 times 10 sup -3$ 2. \0\0$9.26 times 10 sup -5$ \0\0$9.48 times 10 sup -5$ 2.5 \0\0$3.53 times 10 sup -7$ \0\0$3.83 times 10 sup -7$ 3. \0\0$2.59 times 10 sup -10$ \0\0$3.10 times 10 sup -10$ 3.5 \0\0$3.38 times 10 sup -14$ \0\0$4.65 times 10 sup -14$ 4. \0\0$7.50 times 10 sup -19$ \0\0$1.22 times 10 sup -18$ 4.5 \0\0$2.73 times 10 sup -24$ \0\0$5.34 times 10 sup -24$ 5. \0\0$1.59 times 10 sup -30$ \0\0$3.79 times 10 sup -30$ .TE .DE .bp .pl -5 .po -.5i .sp .1 .DS .TS center,delim ($$); c s s s s s s s s c | c | c | c | c | c | c | c | c c | n | n | n | n | n | n | n | n. Table\ 2.12.1.\0 Fractions of Gram blocks of various lengths. .sp 2 $N$ $k=1$ $k=2$ $k=3$ $k=4$ $k=5$ $k=6$ $k=7$ $>=^8$ .sp .3 _ .sp .4 1 0.8449 0.1249 0.0258 0.0041 $3.1 times 10 sup -4$ $1.7 times 10 sup -5$ $6.4 times 10 sup -7$ 0 .sp .2 $1.4 times 10 sup 8$ 0.8325 0.1289 0.0305 0.0069 $1.03 times 10 sup -3$ $8.2 times 10 sup -5$ $6.0 times 10 sup -6$ $2.8 times 10 sup -7$ .sp .2 $10 sup 12$ 0.8178 0.1326 0.0356 0.0106 $2.8 times 10 sup -3$ $5.4 times 10 sup -4$ $6.7 times 10 sup -5$ $7.4 times 10 sup -12$ .sp .2 $10 sup 14$ 0.8099 0.1347 0.0380 0.0122 $3.9 times 10 sup -3$ $1.1 times 10 sup -3$ $2.1 times 10 sup -4$ $3.0 times 10 sup -11$ .sp .2 $10 sup 16$ 0.8045 0.1357 0.0393 0.0135 $4.8 times 10 sup -3$ $1.6 times 10 sup -3$ $4.5 times 10 sup -4$ $8.9 times 10 sup -12$ .sp .2 $10 sup 18$ 0.7998 0.1364 0.0407 0.0147 $5.5 times 10 sup -3$ $2.1 times 10 sup -3$ $6.9 times 10 sup -4$ $1.9 times 10 sup -11$ .sp .2 $10 sup 19$ 0.7977 0.1368 0.0412 0.0150 $5.9 times 10 sup -3$ $2.3 times 10 sup -3$ $8.2 times 10 sup -4$ $2.6 times 10 sup -11$ .sp .2 $10 sup 20$ 0.7957 0.1371 0.0417 0.0155 $6.2 times 10 sup -3$ $2.5 times 10 sup -3$ $9.3 times 10 sup -4$ $7.2 times 10 sup -12$ .TE .DE .sp 1i .po +.5i .sp .DS .VL 14 .LI "Table\ 2.12.2." 
Fraction of Gram blocks of given length $k$ that have exactly $k$ zeros and contain a Gram interval with 3 zeros. .LE .sp 2 .TS center; c | c | c | c | c c | n | n | n | n. $N$ $k=3$ $k=4$ $k=5$ $k=6$ .sp .3 _ .sp .4 1 0.0511 0.0799 0.1737 0.5448 .sp .5 $10 sup 12$ 0.0449 0.0541 0.0776 0.1032 .sp .5 $10 sup 20$ 0.0356 0.0413 0.0447 0.0392 .TE .DE .bp .DS .VL 14 .LI "Table\ 2.12.3." Fractions of Gram intervals that contain $m$ zeros, and the GUE prediction. .LE .sp 2 .TS center,delim ($$); c | c | c | c | c | c c | n | n | n | n | n. $N$ $m=0$ $m=1$ $m=2$ $m=3$ $m=4$ .sp .3 _ .sp .4 1 0.13197 0.73772 0.12864 0.00167 $10 sup -8$ .sp .2 $1.4 times 10 sup 9$ 0.13965 0.72254 0.13598 0.00183 $3 times 10 sup -8$ .sp .2 $10 sup 12$ 0.14787 0.70625 0.14388 0.00200 \(en .sp .2 $10 sup 20$ 0.15748 0.68709 0.15339 0.00204 \(en .sp GUE 0.17022 0.66143 0.16649 0.00186 $4 times 10 sup -7$ .TE .DE .sp 5 .DS .TS center,delim ($$); c s s c | c | c c | n | n. Table\ 2.12.4.\0 Averages of $Z(g sub n )$ and related functions. 
.sp 2 average of $N=10 sup 12$ $N=10 sup 20$ .sp .3 _ .sp .4 $Z(g sub n )$ $1.12 times 10 sup -2$ $-8.801 times 10 sup -4$ .sp .3 $|Z(g sub n ) |$ 2.6213 2.952707053204 .sp .3 $(-1) sup n^Z(g sub n )$ 2.0000 2.0000 .sp .3 $Z(g sub n ) sup 2$ 27.65 45.47 .sp .3 $(-1) sup n ^Z(g sub n ) sup 2$ 0.1415 \-0.1945 .sp .3 $Z(g sub n ) sup 3$ 5.539 104.98 .sp .3 $|Z(g sub n ) sup 3 |$ 749.8 2240.4 .sp .3 $(-1) sup n ^Z(g sub n ) sup 3$ 692.7 1919.1 .sp .3 $Z(g sub n ) sup 4$ 37645 238921 .sp .3 $(-1) sup n ^Z(g sub n ) sup 4$ 110.3 31305 .sp .3 $Z(g sub n ) sup 6$ $2.821 times 10 sup 8$ $1.062 times 10 sup 10$ .sp .3 $(-1) sup n ^Z(g sub n ) sup 6$ $1.175 times 10 sup 6$ $5.803 times 10 sup 9$ .sp .3 $Z(g sub n ) ^Z(g sub n+1 )$ \-3.1387 \-3.1606 .sp .3 $|Z(g sub n ) ^Z(g sub n+1 ) |$ 13.028 22.122 .sp .3 $(-1) sup n ^Z(g sub n )^Z(g sub n+1 )$ $-7.73 times 10 sup -3$ 0.282 .sp .3 $Z(g sub n ) sup 2 ^Z(g sub n+1 ) sup 2$ 6068 46070 .TE .DE .bp .pl +4 .DS .TS center; c s s c | c | c c | c | c c | n | n. Table\ 2.13.1.\0 Number of exceptions to Rosser's rule. .sp 2 exceptions per $N$ no. exceptions million zeros .sp .3 _ .sp .4 $10 sup 12$ 38 23.9 .sp .5 $10 sup 14$ 87 51.6 .sp .5 $10 sup 16$ 1539 93.4 .sp .5 $10 sup 18$ 2453 147.1 .sp .5 $10 sup 19$ 2780 166.0 .sp .5 $10 sup 20$ 15624 198.0 .TE .DE .sp 3 .DS .VL 14 .LI "Table\ 2.13.2." Relative frequencies of the most frequent types of exceptions to Rosser's rule. .LE .sp 1.5 .TS center; c | c | c c | c | c l | n | n. first type $1.5 times 10 sup 9$ zeros $N=10 sup 12 ,^10 sup 14 ,^dd ,^10 sup 20$ .sp .3 _ .sp .4 $2L22$ 0.0363 0.1073 $2L3$ 0.4501 0.1043 $2R22$ 0.0373 0.1038 $2R3$ 0.4386 0.1013 $3R3$ 0.0101 0.0550 $3L3$ 0.0151 0.0539 $3R22$ \(en 0.0537 $3L22$ \(en 0.0524 $2L212$ \(en 0.0508 $2R212$ \(en 0.0502 $3L212$ \(en 0.0246 $3R212$ \(en 0.0235 $4L3$ \(en 0.0221 $4R3$ \(en 0.0216 $2R2112$ \(en 0.0197 $4L22$ \(en 0.0194 $4R22$ \(en 0.0182 $2L2112$ \(en 0.0177 $3L2112$ \(en 0.0075 $5R3$ \(en 0.0071 .sp .T& c | n | n.
total 0.9892 0.9142 .TE .DE .bp .pl -4 .DS .TS center,delim ($$); c s s s s c | c | c | c | c c | c | c | c | c c | n | n | n | n. Table\ 3.1.1.\0 Special points for the zeta function. .sp 2 index of approx. index set first zero of first zero no. zeros special point .sp .3 _ .sp .4 A 1789820229889768 $1.8 times 10 sup 15$ 213298 366350755915100.830671 B 3225901860089967 $3.2 times 10 sup 15$ 202337 648244850785931.253497 C 4817290207847018 $4.8 times 10 sup 15$ 224580 956149582979864.127715 D 5097943069948350 $5.1 times 10 sup 15$ 204441 1010102804832220.857487 E 6901069159073074 $6.9 times 10 sup 15$ 206276 1354828108521396.144683 F 18950008168234690 $1.9 times 10 sup 16$ 220040 3609764047662162.288453 G 22460777057881112 $2.2 times 10 sup 16$ 221960 4257232978148261.305478 H 42024941452698132 $4.2 times 10 sup 16$ 230978 7821904288693735.919567 I 51214985107007070 $5.1 times 10 sup 16$ 238512 9478467782100661.935759 J 71764726511399980 $7.2 times 10 sup 16$ 221752 13154657441819662.863688 K 76038726777613110 $7.6 times 10 sup 16$ 242968 13915273262098117.070642 L 76935378855702384 $7.7 times 10 sup 16$ 238556 14074693071712087.957658 M 153808369585296620 $1.5 times 10 sup 17$ 228170 27596944669957270.886813 N 233803646149078564 $2.3 times 10 sup 17$ 242576 41467826318647943.357194 O 253172315703241351 $2.5 times 10 sup 17$ 234879 44805187485720884.423354 P 473670769727688896 $4.7 times 10 sup 17$ 254092 82413269794748757.568756 Q 1250710180558723404 $1.3 times 10 sup 18$ 246054 212059301707021086.999247 R 4710265558902545324 $4.7 times 10 sup 18$ 254632 771729629469964785.437895 S 4795416924536726612 $4.8 times 10 sup 18$ 250812 785323253967853754.707393 T 17623088585596705508 $1.8 times 10 sup 19$ 262932 2793650241983592679.318477 U 32220179491036385680 $3.2 times 10 sup 19$ 263299 5032868769288289111.005891 V 35200636070992171652 $3.5 times 10 sup 19$ 265396 5486648117377526447.759269 .TE .DE .bp .DS .TS center; c s s s s c | c | c | c | c n | n | n | n | n. 
Table\ 3.1.2.\0 Zeta function at special points. .sp 2 set $Z(t)$ large $S(t)$ $max ^delta sub n$ zero pattern .sp .3 _ .sp .4 A \-396.2 \-2.4235 4.5200 22000022 B 459.7 \-2.3202 3.7948 301000122 C \-663.5 \-2.6410 5.1454 220000212 D 598.6 2.7575 4.9347 21120000212 E 571.0 2.3145 3.8490 22010002112 F \-523.7 \-2.4394 3.9353 301000212 G 843.9 2.7022 5.0612 2210000212 H \-555.7 2.3022 4.3147 2120000212 I 581.6 \-2.1748 4.2731 2120000212 J \-720.0 2.3142 4.4956 2120000212 K \-767.4 2.6654 4.8923 2120000122 L \-780.0 2.6238 4.8025 2120000212 M \-831.3 \-2.4475 4.4574 220000122 N 724.0 \-2.7654 4.5515 2111000122 O \-874.6 \-2.6160 4.7116 2120000122 P \-918.8 2.2410 4.4529 3110000212 Q \-971.3 \-2.6178 4.6669 221000012112 R 754.7 2.1360 3.8989 22100010212 S \-1065.2 \-2.5178 4.4694 2120000122 T 1036.7 2.8747 4.8433 3110001022 U 1580.6 \-2.4862 4.5683 220200001212 V \-1329.5 2.8314 4.3214 221100002112 .TE .DE .bp .DS .VL 14 .LI "Table\ 4.2.1." Profile of rational function evaluation algorithm computations for $N=10 sup 20$, $R=2 sup 17$. .LE .sp 1.5 .TS center; c | c l | n. step time .sp .3 _ .sp .4 evaluate Taylor series coefficients 49.1% .sp .5 Taylor series expansions, $q^>=^4$ 6.5% .sp .5 $q=3$ terms 12.2% .sp .5 $q=2$ terms 4.9% .sp .5 $q=1$ terms 7.4% .sp .5 $q=0$ terms 3.8% .sp .5 evaluate $dp^log (k)$ 9.6% .sp .5 compute $a sub k ,^b sub k$, etc. 6.5% .TE .DE .bp .DS .VL 14 .LI "Table\ 4.4.1." Running times (in minutes) of the main rational function evaluation program. .LE .sp 1.5 .TS center,delim ($$); c | c | c | c | c | c | c c | c | c | c | c | c | c c | c | n | n | n | n | n. approx. 
set $R$ $k sub 0$ $k sub 1$ $T$ $delta$ time .sp .3 _ .sp .4 $A$ $2 sup 17$ 500 7,\|635,\|871 $3.7 times 10 sup 14$ 0.323 24 .sp .2 $d$ $2 sup 23$ 100 17,\|577,\|894 $1.9 times 10 sup 15$ 0.37 438 .sp .2 $H$ $2 sup 17$ 500 35,\|283,\|065 $7.8 times 10 sup 15$ 0.319 86 .sp .2 $P$ $2 sup 17$ 500 114,\|527,\|198 $8.2 times 10 sup 16$ 0.3287 261 .sp .2 $g$ $2 sup 17$ 100 164,\|755,\|715 $1.7 times 10 sup 17$ 0.2 380 .sp .2 $f$ $2 sup 23$ 200 164,\|755,\|715 $1.7 times 10 sup 17$ 0.33 1115 .sp .2 $i$ $2 sup 23$ 450 505,\|829,\|004 $1.6 times 10 sup 18$ 0.313 2133 .sp .2 $U$ $2 sup 17$ 500 894,\|989,\|353 $5.0 times 10 sup 18$ 0.3067 1975 .sp .2 $k$ $2 sup 23$ 450 1,\|555,\|488,\|184 $1.5 times 10 sup 19$ 0.28961 5250 .sp .2 $n$ $2 sup 24$ 450 1,\|555,\|488,\|184 $1.5 times 10 sup 19$ 0.2903 6099 .TE .DE .bp .DS .TS center,delim ($$); c s s c | c | c c | c | c c | n | n. Table\ 4.5.1.\0Large sets of zeros, showing duplication of computed values. .sp 2 index of first set zero in set number of zeros .sp .3 _ .sp .4 $a$ $10 sup 12 +1$ 101,\|053 .sp .3 $b$ $10 sup 12 - 6,^032$ 1,\|592,\|196 .sp .3 $c$ $10 sup 16 - 4,^930$ 1,\|584,\|442 .sp .3 $d$ $10 sup 16 - 5,^946$ 16,\|480,\|973 .sp .3 $e$ $10 sup 18 - 8,^394$ 1,\|419,\|501 .sp .3 $f$ $10 sup 18 - 8,^839$ 16,\|671,\|047 .sp .3 $g$ $10 sup 18 + 12,^333,^574$ 157,\|608 .sp .3 $h$ $10 sup 18 + 12,^345,^608$ 140,\|684 .sp .3 $i$ $10 sup 19 - 13,^607$ 16,\|749,\|725 .sp .3 $j$ $10 sup 19 - 45,^597$ 135,\|161 .sp .3 $k$ $10 sup 20 - 30,^769,^710$ 16,\|366,\|702 .sp .3 $l$ $10 sup 20 - 15,^409,^240$ 16,\|341,\|831 .sp .3 $m$ $10 sup 20 - 48,^778$ 16,\|388,\|741 .sp .3 $n$ $10 sup 20 + 15,^311,^688$ 32,\|811,\|834 .sp .3 $o$ $10 sup 20 - 48,^867$ 132,\|188 .TE .DE .sp 4 .DS .VL 14 .LI "Table\ 4.5.2." Comparison of values for zeros obtained in different computations. .LE .sp 1.5 .TS center,delim ($$); c | c | c c | n | n. sets of zeros max. difference rms difference .sp .3 _ .sp .4 $a$ vs.
$b$ $2.5 times 10 sup -9$ $3.7 times 10 sup -11$ .sp .2 $c$ vs. $d$ $5.7 times 10 sup -8$ $3.4 times 10 sup -10$ .sp .2 $e$ vs. $f$ $4.2 times 10 sup -8$ $3.3 times 10 sup -10$ .sp .2 $f$ vs. $g$ $2.2 times 10 sup -8$ $2.3 times 10 sup -10$ .sp .2 $g$ vs. $h$ $3.5 times 10 sup -8$ $1.6 times 10 sup -10$ .sp .2 $i$ vs. $j$ $2.6 times 10 sup -8$ $7.5 times 10 sup -10$ .sp .2 $k$ vs. $l$ $7.3 times 10 sup -7$ $3.8 times 10 sup -9$ .sp .2 $l$ vs. $m$ $7.6 times 10 sup -7$ $5.9 times 10 sup -9$ .sp .2 $m$ vs. $n$ $5.8 times 10 sup -7$ $3.9 times 10 sup -9$ .sp .2 $m$ vs. $o$ $2.7 times 10 sup -7$ $4.2 times 10 sup -9$ .TE .DE .bp .ce .B "Figure Captions" .sp .VL 15 .LI "Figure\ 2.0.1." $Z(t)$ near zero number $10 sup 20$. .nr P 1 .PH "''F-\\\\nP''" The horizontal axis extends from Gram point number $10 sup 20 -8$ to Gram point number $10 sup 20 +4$. .LI "Figure\ 2.0.2." $S(t)$ near zero number $10 sup 20$. The range of $t$ is the same as in Fig.\ 2.0.1, and the jumps by 1 occur at zeros of the zeta function numbered $10 sup 20 -6$ to $10 sup 20 +5$. .LI "Figure\ 2.0.3." $Z(t)$ near zero number $10 sup 20$. The horizontal axis extends from Gram point number $10 sup 20 - 50$ to Gram point number $10 sup 20 + 50$. .LI "Figure\ 2.3.1." Pair correlation of zeros of the zeta function. Solid line: GUE prediction. Scatterplot: empirical data based on $8 times 10 sup 6$ zeros near zero number $10 sup 20$. .LI "Figure\ 2.3.2." Pair correlation of zeros of the zeta function. Solid line: GUE prediction. Scatterplot: empirical data based on $10 sup 6$ zeros near zero number $10 sup 12$. .LI "Figure\ 2.3.3." Pair correlation of zeros of the zeta function. Solid line: GUE prediction. Scatterplot: empirical data based on $8 times 10 sup 6$ zeros near zero number $10 sup 20$. Scatterplot smoothed. .LI "Figure\ 2.3.4." Probability density of the normalized spacings $delta sub n$. Solid line: GUE prediction. 
Scatterplot: empirical data based on 1,\|592,\|196 zeros near zero number $10 sup 12$. .LI "Figure\ 2.3.5." Probability density of the normalized spacings $delta sub n$. Solid line: GUE prediction. Scatterplot: empirical data based on 78,\|893,\|234 zeros near zero number $10 sup 20$. .LI "Figure\ 2.3.6." Probability density of the normalized spacings $delta sub n + delta sub n+1$. Solid line: GUE prediction. Scatterplot: empirical data based on 1,\|592,\|196 zeros near zero number $10 sup 12$. .LI "Figure\ 2.3.7." Probability density of the normalized spacings $delta sub n + delta sub n+1$. Solid line: GUE prediction. Scatterplot: empirical data based on 78,\|893,\|234 zeros near zero number $10 sup 20$. .LI "Figure\ 2.4.1." Comparison of the scaled distribution of $S(t)$ for $N=10 sup 20$ to the asymptotic normal distribution. .LI "Figure\ 2.5.1." Initial segment of the quantile-quantile plot of the normalized spacings $delta sub n$ against the GUE prediction. Data based on $10 sup 6$ consecutive values of $n$, starting with $n=10 sup 20 - 42,^778$. Straight line $y=x$ drawn to facilitate comparisons. .LI "Figure\ 2.5.2." Initial segment of the quantile-quantile plot of the normalized spacings $delta sub n$ against the GUE prediction. Data based on $10 sup 6$ consecutive values of $n$, starting with $n= 10 sup 20 + 15,^316,^087$. .LI "Figure\ 2.5.3." Initial segment of the quantile-quantile plot of the normalized spacings $delta sub n$ against the GUE prediction. Data based on 112,\|314,\|003 values of $n$ from $N=10 sup 18$, $10 sup 19$, and $10 sup 20$ data sets. .LI "Figure\ 2.5.4." Initial segment of the quantile-quantile plot of the normalized spacings $delta sub n + delta sub n+1$ against the GUE prediction. Data based on $10 sup 6$ consecutive values of $n$, starting with $n=10 sup 12 - 6,^032$. .LI "Figure\ 2.5.5." Initial segment of the quantile-quantile plot of the normalized spacings $delta sub n + delta sub n+1$ against the GUE prediction. 
Data based on $10 sup 6$ consecutive values of $n$, starting with $n=10 sup 20 - 42,^778$. .LI "Figure\ 2.5.6." Final segment of the quantile-quantile plot of the normalized spacings $delta sub n$ against the GUE prediction. Data based on $10 sup 6$ consecutive values of $n$, starting with $n=10 sup 12 - 6,^032$. .LI "Figure\ 2.5.7." Final segment of the quantile-quantile plot of the normalized spacings $delta sub n$ against the GUE prediction. Data based on $10 sup 6$ consecutive values of $n$, starting with $n=10 sup 20 - 42,^778$. .LI "Figure\ 2.6.1." Graph of $2^log ^| sum ^exp (i gamma sub n y ) |$, where $n$ runs over $10 sup 20 +1^<=^n^<=^10 sup 20 + 40,^000$, and values $<^0$ and $>^16$ are deleted. .LI "Figure\ 2.6.2." Graph of $2^log ^| sum ^exp (i gamma sub n y ) |$, where $n$ runs over $10 sup 20 +1^<=^n^<=^10 sup 20 + 40,^000$, and values $<^0$ are deleted. .LI "Figure\ 2.6.3." Graph of $2^log ^| sum ^exp (i gamma sub n y ) |$, where $n$ now runs over $10 sup 20 + 1 ^<=^n^<=^10 sup 20 + 400,^000$, and values $<^0$ are deleted. .LI "Figure\ 2.6.4." Variance of the number of zeros in an interval of length $L$ for the GUE (dashed line), for $5 times 10 sup 5$ zeros near zero number $10 sup 20$ (scatterplot), and Berry's prediction (solid line). .LI "Figure\ 2.6.5." Variance of the number of zeros in an interval of length $L$ based on $5 times 10 sup 5$ zeros near zero number $10 sup 20$. .LI "Figure\ 2.6.6." Variance of the number of zeros in an interval of length $L$ based on $5 times 10 sup 5$ zeros near zero number $10 sup 20$. .LI "Figure\ 2.7.1." Neighborhood of an example of Lehmer's phenomenon. Graph of $Z(t)$ between Gram points $n-6$ and $n+6$, where $n=10 sup 18 + 12,^376,^778$. The point between Gram points $n+1$ and $n+2$ where $Z(t)$ is seemingly tangent to the zero line represents 2 zeros with $delta sub n+2 = 0.0011$ and the minimum of $Z(t)$ between those zeros equal to $-5 times 10 sup -7$.
For a smaller scale picture of this phenomenon, see Fig.\ 4.5.1. The other point of near tangency, near Gram point $n-3$, has minimum of $Z(t)$ of $-0.0094$. .LI "Figure\ 2.10.1." Comparison of the distribution of $log ^| zeta ( 1/2 + it )|$ over two ranges of $10 sup 6$ zeros each near zero numbers $10 sup 12$ and $10 sup 20$ to that of the normal distribution. .LI "Figure\ 2.10.2." Plot, for each $k$, of the logarithm of the fraction of time that $| zeta ( 1/2 + it )|^member^[k-1,^k)$. Data obtained from 3 intervals covering $2.8 times 10 sup 6$ zeros near zero number $10 sup 20$. .LI "Figure\ 2.11.1." Scaled distribution of $10 sup 6$ values of $log ^| zeta sup prime ( 1/2 + i gamma ) |$ for $N=10 sup 20$ vs. the conjectured standard normal distribution. .LI "Figure\ 2.12.1." Distribution modulo\ 1 of $gamma sub n$ on Gram point scale, for two sets of $10 sup 6$ zeros each. Curves derived by smoothing a histogram. .LI "Figure\ 3.1.1." $Z(t)$ near the point where the largest known value of $S(t)$ occurs. The horizontal axis extends from Gram point number $n^=^17,^623,^088,^585,^596,^834,^905$ to Gram point number $n+30$. .LI "Figure\ 3.1.2." $Z(t)$ near the point where the largest known value of $S(t)$ occurs. The horizontal axis extends from Gram point number $n+9$ to Gram point number $n+21$, where $n^=^17,^623,^088,^585,^596,^834,^905$. The high peak of $Z(t)$ has been cut off. This is a smaller scale view of the central part of Fig.\ 3.1.1. .LI "Figure\ 3.1.3." $S(t)$ near the point where its largest known value occurs. The range of $t$ is the same as in Fig.\ 3.1.2. .LI "Figure\ 4.5.1." Small neighborhood of an example of Lehmer's phenomenon. Graph of $Z(t)$ on a segment of the interval between Gram points $n$ and $n+1$ (corresponding to 0 and 1 on the scale of the figure), where $n=10 sup 18 + 12,^376,^799$. Enlargement of a section of Fig.\ 2.7.1. .LI "Figure\ 4.5.2." Small scale view of Lehmer's phenomenon. Enlargement of a section of Fig.\ 4.5.1.
The three curves represent three different computations of $Z(t)$ on this segment. .LE .ls 1 .CS