Bill Harlan
Date: December, 2016
For global consistency, we need to adjust each dataset independently by absolute corrections that best reproduce the relative corrections. Independent time shifts and amplitude scalars are easily estimated with a damped least-squares optimization, which we review.
Phase rotations cannot be optimized in the same fashion without ambiguity in cyclical phase angles. Instead, we begin with phase-shifted impulses as simplified representations of cross-correlations. We solve for simplified transfer functions, which then unambiguously determine the best phase correction for each dataset. A revised objective-function is optimized with a Gauss-Newton optimization.
This paper describes a robust generalized inversion for time, amplitude, and phase corrections of overlapping and intersecting seismic datasets.
In a well-developed area, we may have overlapping 3D seismic surveys, 2D seismic lines, and even 4D repeated surveys that attempt to illuminate the same seismic reflections. Different acquisition parameters and different vintages result in very different wavelet characteristics. Such differences complicate the correlation of seismic sections with sonic logs. Interpreters find it more difficult to track formation boundaries from dataset to dataset.
We will consider approaches that begin by cross-correlating selected traces from different surveys near intersection points. Early work by [Harper, 1991] and [Henry and Mellman, 1988] used these cross-correlations to estimate convolutional wavelets, or transfer functions, that best adjusted each dataset. Later simplifications were independently implemented by [Mackidd, 1998] at EnCana and by [Bishop and Nunns, 1994]. From each cross-correlation they extract relative time shifts, phase rotations, and amplitude scale factors. The time shift is generally located at the peak of the analytic envelope of the cross-correlation. Comparing the strength of the cross-correlation to the strength of a dataset determines a relative amplitude scaling. The phase rotation is the constant phase bias remaining in the frequency domain after removing the time shift. (Equivalently, the best time lag and phase rotation correspond to the slope and intercept of a line fitting phase in the frequency domain.)
Each of the three corrections are inverted separately to find independent adjustments for each dataset. Interpreters find such corrections easy to view for quality control, with minimal risk of damage to spectral information. If the inversion is robust, then repeating the procedure will not result in any further improvements.
[Mackidd, 1998] preferred to construct explicit systems of equations with specific constraints to remove non-uniqueness. [Bishop and Nunns, 1994] used a recursive algorithm (essentially Gauss-Seidel) that converged to the same solution for time shifts and amplitudes.
Both inverted phase rotations with modified versions of their algorithms for time shifts. Phases, however, require “unwrapping” that repeatedly adjusts by full cycles of 360 degrees to minimize errors. Such approaches cannot guarantee that they avoid local minima or that they find a globally optimum solution.
We define all solutions with objection functions that best minimize errors, with damping to handle non-uniqueness. For optimization, we choose a Gauss-Newton optimization as described in [Luenberger, 1973], and as implemented by [Harlan, 2004] for related problems in [Harlan, 2014] and [Harlan, 1989]. In the case of time shifts and amplitudes, we obtain the same least-squares solution as solved by [Bishop and Nunns, 1994].
We prefer, however, to avoid fitting spectral phases directly. Instead we return to the time domain and fit the corresponding phase-shifted impulses. Instead of adding or subtracting phases, we convolve or correlate these simplified transfer functions. In the time domain, we avoid any need to adjust phase by an ambiguous number of cycles. This optimization is a specialization of the cross-balancing algorithm in [Harlan, 2015], which is very similar to a surface-consistent deconvolution as described in [Levin, 1989].
We begin by reviewing the numerically simplest optimization of relative time shifts.
Let us index independent seismic lines or datasets by an integer index or . To align all the datasets, we assume that each line needs to be shifted by a time , in milliseconds (or meters or feet if in depth). A trace amplitude previously appearing at zero time will appear at the sample with this time, after correction. A negative time correction will reduce the times at which a particular amplitude appears.
If a pair of lines cross, then we could make a relative correction to line to make it better tie a line . If shifts are globally consistent, then relative corrections can be computed from absolute corrections:
(1) |
We can directly measure or estimate these relative corrections, but not the absolute corrections. Let us use an integer index to identify each intersection at which we measure a relative shift from the data. Each will index a triplet of indices . At each of these intersections , we will measure a relative time shift that adjusts line to match line better. Selected traces from each line are cross-correlated so that this lag can be measured from the analytic peak.
We will assume that errors in measured follow an uncorrelated Gaussian distribution with zero mean. This assumption would imply a maximum likelihood solution for that minimizes the least-squares error between the measured and computed corrections:
(2) | ||
or | (3) |
Worse, there may be some linear components that are poorly conditioned, where large changes in some combination of result in very small changes to the total error. A larger area of the objective function may have almost the same minimum value within the precision of our measurements.
To handle such poorly conditioned solutions, we add a damping term on the magnitude of absolute corrections. Damping is equivalent to assuming that inverted corrections also have a Gaussian distribution, then maximizing the a priori probability (known as a MAP estimate).
We use a damping factor , where is the number of intersections indexed by , and is the number of independent datasets indexed by and . Additionally we add a factor of , which is equivalent to assuming measurement errors have about one hundredth the magnitude of typical shifts. This factor can be reduced even more without significantly altering the result. Only a little damping is required to eliminate non-uniqueness and poorly conditioned components.This objective function (4) is a purely quadratic function of the unknown values. We use a low-memory iterative conjugate-gradient algorithm as in [Luenberger, 1973] and [Harlan, 2004].
Damping ensures that the average absolute correction will be zero. Instead, we can choose certain lines as reference lines and force their corrections to be any predetermined value, such as zero. To do so, we simply omit these corrections from the optimization. (Programatically, we can remove elements from our solution vector, or we can apply conditioning that zeros out these elements.)
With a few tweaks, we can fit amplitude changes with the same algorithm we used for time shifts.
We assume each line needs to be scaled by an absolute amplitude factor of to best match all other lines. If a pair of lines cross, then we could make a relative amplitude correction to line to make it better tie a line . If amplitude corrections are globally consistent, then relative corrections can be computed from absolute corrections:
(5) |
In most cases, lines will have already been normalized to some degree, perhaps to make their root-mean-square (RMS) amplitudes all the same. Normalization does not result in the best overall ties because one side of a dataset may be stronger in amplitude than in another. But we do expect remaining amplitude scale factors to be reasonably distributed around a value of 1, with no large outliers.
To turn this problem into the same linearized problem we just solved, we will assume that the distributions of scale factors are all log-normal. That is, the logarithm of follows a Gaussian distribution. Multiplying or dividing log-normal variables results in more log-normal variables, including our computed . The resulting MAP objective function is
This function is not quadratic in the unknown and is more difficult to solve. Instead, if we define the following,and | (7) |
Superficially, the problem of fitting phase rotations is very similar to those of previous sections. Relative phase rotations are measured at line intersections, and are modeled as differences of absolute phase rotations for each line.
Some authors ([Bishop and Nunns, 1994,Mackidd, 1998]) solve for absolute phase rotations with least-squares approaches very similar to those just described for time and amplitudes. Unfortunately phase rotations are not uniquely invertible. A phase rotation of has the same effect on a trace as . As a result, phases are usually limited to a branch such as (exclusive) to (inclusive), or equivalently -180 (exclusive) to 180 (inclusive).
When minimizing least-squares errors in phases, a small perturbation can easily cross over a branch cut. Such approaches must update both measured and estimated phases repeatedly during optimization to allow values outside the original branch range. Such “phase unwrapping” does not converge well for larger numbers of intersections and datasets; there is no guarantee of finding a global minimum.
We should first clarify the definition of a phase rotation (also known as a constant phase shift). We will represent all necessary operations in the time domain without Fourier transforms.
We will write a seismic trace as a continuous function of a time . (Discretization is equivalent to assuming a bandlimited “sinc” function as a basis for the continuous function.)
We first compute an analytic trace of complex amplitudes by convolving our real trace with an analytic function :
(9) | |
where | (10) |
and | (11) |
The real part of an analytic trace restores the original trace.
Re | (12) |
The new imaginary component is always orthogonal to the original function, mapping a cosine to a sine, and a sine to a negative cosine, like a derivative, but without affecting the amplitude spectrum.
To apply a constant phase rotation of angle to a trace , we multiply the complex analytic trace by a complex number of unit magnitude , then keep just the real part.
Re | (13) | |
Re | (14) | |
(15) | ||
where | (16) |
Successive convolutional phase rotations can be combined merely by adding their corresponding phase rotations:
(17) |
(18) |
(19) |
(20) | ||
(21) |
If we have an instance of the phase rotation function, we can reestimate its angle:
and | (22) | |
(23) | ||
and | (24) | |
atan2 Arg | (25) |
Rather than optimize phase rotations directly, we can equivalently construct and work with their corresponding phase rotation functions. Phase rotations that differ by exactly 360 have exactly the same representation as functions, so we avoid any non-unique phase unwrapping problems.
We assume each line needs to be adjusted by an absolute phase rotation of to best match all other lines. If a pair of lines cross, then we could make a relative phase correction to line to make it better tie a line . If rotations are globally consistent, then relative corrections can be computed from absolute corrections:
At each intersection , we also measure a relative rotation that makes line more consistent with line . Once again, we want to fit measured corrections by estimating the best absolute corrections .Instead of fitting these phases directly, we will substitute phase rotation functions and model them with correlations rather than differences.
For each line we want to estimate an absolute phase rotation function . We can then compute a relative phase correction function that best corrects line to line by correlating their respective absolute functions.
(27) |
This optimization is a specialization of the cross-balancing algorithm by [Harlan, 2015], which is very similar to a surface-consistent deconvolution as described by [Levin, 1989].
We can optimize for phases in two steps. First we estimate the absolute rotation functions, then we estimate phase from those functions as in equation (25).
(29) | |
where | (30) |
Notice that this objective function, unlike our previous versions, is quartic rather than quadratic in the unknown . Nevertheless, the objective function is well behaved and convex with a single global minimum.
We use a low-memory Gauss-Newton optimization as described in [Luenberger, 1973], and as implemented by [Harlan, 2004]. We iteratively approximate the objective function by linearizing the correlation in perturbations of the unknown rotation functions, which approximates the objective function as a quadratic:
We initialize all unknown with delta functions and solve for perturbations with a standard quadratic conjugate gradient algorithm. Then we use a line-search optimization to find the best scale factor for all perturbations before updating the reference functions. Then we relinearize and solve for perturbations again. Because the objective function is convex, with a single global minimum, convergence is guaranteed.From these functions, we use equation (25) to estimate the best phase rotations for each dataset. We also can use the simple differences in equation (26) to recompute the expected phase shifts at each intersection , after adjusting to the standard branch cut. Any remaining residual errrors are removed by the quadratic least-squares of previous sections.
We find 23 samples more than adequate for correlations and transfer functions, reconstructing phases with less than one degree of error. Because time lags are handled separately, the envelopes of these functions remain centered, reducing the effect of truncations.
We do find that estimated phases can drift somewhat when tens of thousands of lines are inverted without any constraints. That is, the reconstructed differences at intersections are well within measurement error, but the absolute phase shifts can vary more substantially at distant degrees of connectedness. If adjusted lines are no more than a hundred intersections from a constrained line, then inverted phases avoid such drift.
We began with the premise that time shifts, amplitude scaling, and phase rotations are an adequate correction to tie overlapping datasets. These corrections are orthogonal and can be optimized independently.
We first reviewed a least-squares approach to time and amplitude corrections, with few differences from previous work. We do, however, prefer damping to explicit rank reduction. A low-memory iterative optimization scales well and more conveniently incorporates constraints.
We believe phase corrections are best solved as phase-rotated delta functions representing simplified transfer functions. Such an approach avoids ad-hoc phase unwrapping and guarantees convergence to a global minimum.
Data may not always agree with these simplifying assumptions. Phase distortions can change more arbitrarily than just linear time shifts and constant phase rotations. Amplitudes may also scale differently with frequency. Additional preprocessing with deconvolution can help, but does not eliminate all such issues.
It is also possible to fit cross-correlations directly as in [Harlan, 2015]. The three independent corrections can be extracted from full transfer functions estimated for each dataset. Users may find it more difficult to audit and edit such corrections at intersections.
Ultimately, the popularity of the three independent corrections is the best testimonial for their effectiveness.