Victor Y. Pan
Computations with Toeplitz and Toeplitz-like matrices are fundamental for many areas of algebraic and numerical computing. The list of computational problems reducible to Toeplitz and Toeplitz-like computations includes, in particular, the evaluation of the gcd, the lcm, and the resultant of two polynomials, computing Pad\'{e} approximation and the Berlekamp-Massey recurrence coefficients, as well as numerous problems reducible to these. Transition to Toeplitz and Toeplitz-like computations is currently the basis for the design of the fastest known parallel (RNC) algorithms for these computational problems.
Our main result is in contructing nearly optimal randomized parallel algorithms for Toeplitz and Toeplitz-like computations and, consequently, for numerous related computational problems (including the computational problems listed above), where all the input values are integers and all the output values are computed exactly. This includes randomized parallel algorithms for computing the rank, the determinant, and a basis for the null-space of an $n \times n$ Toeplitz or Toeplitz-like matrix $A$ filled with integers, as well as a solution ${\bf x}$ to a linear system $A{\bf x}={\bf f}$ if the system is consistent. Our algorithms use $O((\log n) \log (n \log \|A\|))$ parallel time and $O(n \log n)$ processors, each capable of performing (in unit time) an arithmetic operation, a comparision, or rounding a rational number to a closest integer. The cost bounds cover the cost of the verification of the correctness of the output. The computations by these algorithms can be performed with the precision of $O(n \log \|A\|)$ bits, which matches the precision required in order to represent the output, except for the rank computation, where the precision of the computation decreases. The algorithms involve either a single random parameter or at most $2n-1$ ones.
The cited processor bounds are less by roughly factor $n$ than ones supported by the known algorithms that run in polylogarithmic arithmetic time and do not use rounding to the closest integers.
Technically, we first devise new algorithms supporting our old nearly optimal complexity estimates for parallel computations with general matrices filled with integers. Then we decrease dramatically, by roughly factor $n^{1.376}$, the processor bounds required in these algorithms in the case where the input matrix is Toeplitz-like. Our algorithms exploit and combine some new techniques (which may be of independent interest, e.g. in the study of parallel and sequential computation of recursive factorization of integer matrices) as well as our earlier techniques of variable diagonal (relating to each other several known algebraic and numerical methods), stream contraction, and the truncation of displacement generators in Toeplitz-like computations; our development and application of these techniques may be of independent interest.