Lecture 3: Discrete-domain signals and systems

73 minute read

Prof. Alex Bronstein

Discrete-domain Fourier transform

When a function f is sampled on the lattice L, its Fourier transform is periodized on the reciprocal lattice L. Therefore, it is sufficient to describe the value of the transform on the structural element of the lattice, which can be parametrized as ξ=Aω with ω[12,12]. The vector ω is called the normalized or digital frequency.

Substituting the normalized frequency into the periodized and scaled Fourier transform of f yields

F[ω]=volL(IIILF)(Aω)=F(IIILf)(Aω)=pLf(p)F(τpδ)(Aω)=pLf(p)ϕp(Aω)=nZdf(An)e2πiωAAn=nZdf(An)e2πiωn=nZdf[n]e2πiωn,

where we defined f[n]=f(An). In what follows, with some abuse of notation, the discrete sequence of samples, f|L=fn=f[n]nZd, as well as its particular element will be denoted simply as fn or f[n]. Square brackets will emphasize that we are working in the digital (i.e., discetely sampled) domain. We will refer to f[n] as to a discrete signal and to F[ω] as to its (discrete-domain) Fourier transform. By convention, F[ω] is extended periodically to entire Rd.

Note that, as before, the Fourier transform is defined as the orthogonal projection,

F[n]=f,ϕω2(Zd),

only this time the inner product is on the space of sequences

f,g2(Zd)=nZdf[n]g[n].

The inverse transform is given by f[n]=[12,12]dF[ω]e2πiωndω. Since the discrete-domain Fourier transform is defined through its (periodized) continuous counterpart, all the properties of the latter are straightforwardly inherited.

Discrete systems

Let f be a continuous-domain signal, h the impulse response of an LSI system, and let us denote g=fh. Sampling g on the lattice L=AZd yields

IIIL(fh)F volLIIIL(FH)

If either f or h is L0-band limited, so is fh and we can further write IIIL(FH)=(IIILF)(IIILH), hence

IIILg=IIIL(fh)=(IIILf)(IIILh)=(pLf(p)τpδ)(pLh(p)τpδ)=p,qLf(p)h(q)τp+qδ=sL(pLf(p)h(sp))τsδ=mZd(nZdf(An)h(A(mn)))τAmδ=mZd(nZdf[n]h[mn])τAmδ.

On the other hand,

IIILg=mZdg[m]τAmδ.

Hence, we arrive at a discrete form of convolution

g[n]=mZdf[m]h[nm]=(fh)[n].

It is straightforward to show that convolution is transformed to point-wise product in the frequency domain.

Kroenecker’s delta

Recall that Dirac’s delta was a distribution that acted like identity under convolution. The discrete-domain analog is the Kroenecker delta, which is the sequence δ[n] defined as δ[0]=1 and δ[n]=0 for every n0.

Decimation

Let f be a signal sampled on the lattice L=AZd. From the definition of the Fourier transform, the following relation holds between the discrete and continuous domains:

F[n]=nZdf(An)e2πiωn=volLnZdF(A(ωn)).

Let M be an integer d×d matrix such that M=MZdZd (that is, the lattice M is has only integer points). We denote by M0 the (discrete) structural element of M, i.e., the smallest set such that

pM(p+M0)=Zd.

Let us sub-sample f[n] on M creating a new discrete signal g[n]=f[Mn]=f(AMn) (this operation is ofter referred to as compression). Essentially, g[n] can be thought of f(x) sampled on the dilated lattice L=AMZd (with the corresponding dual L=AMZd, where M=M). Using definitions and elementary algebra, we obtain the following expression for the Fourier transform of g[n]:

G[ω]=nZdf(AMn)e2πiωn=volLnZdF(AM(ωn))=volLvolMnZdmM0F(A(M(ωm)n))=volMmM0volLnZdF(A(M(ωm)n))=volMmM0F[M(ωm)].

Firstly, notice that the continous-domain Fourier transform of f does not appear in the final expression; G[ω] is entirely expressed in terms of the discrete-domain Fourier transform F[ω]. Therefore, we do not need to assume any underlying continuos-domain signal f(x) (though it can always be obtained at least in principle via band-limited interpolation of f[n]), and perform the sub-sampling of f[n] entirely in the discrete domain.

Secondly, observe the striking resemblance of the above expression to its continuous counterpart. The discrete-domain Fourier transform of the signal being sampled is periodized on the dual lattice M (recall that sub-sampling is performed on M). As in the continuous case, the support of F[ω] has to be contained within the structural element of the dual lattice M in order to avoid aliasing; in other words, f[n] has to be M-band limited. In that case, the replicas of F[ω] will not overlap, and f[n] can be recovered exactly from g[n] by keeping only the main replica of F[ω] in G[ω] and suppressing the others.

Anti-aliasing

In general, when the signal is not band-limited and an anti-aliasing filter has to be applied before sub-sampling. The frequency response of the ideal low-pass filter HM[ω] has to satisfy supp(HM)M0 (of course, this is a slightly sloppy notation: HM is periodic, hence the support is replicated on M). Such a low-pass filter is achieved by HM[ω]=box(M1ω) (again, periodized on M), whose impulse response is straightforwardly obtained by invoking the stretching property of the Fourier transform

hM[n]=volMsinc(Mn).

The combined action of digital anti-aliasing filtering followed by sub-sampling is usually referred to as decimation. On block diagrams, we will denote sub-sampling as M or M:I, followed by a decimation filter HM.

Interpolation

Let f be a signal sampled, as before, on L, and let N be a d×d matrix such that ZdN=NZd (that is, a sub-sample of the lattice N is the integer lattice). We define the expansion of f[n] to N as

g[n]={f[Nn]:NnZd0:otherwise.

Substituing p=Nn and, inversely, n=Np, we obtain

g[n]=pZdf[p]δ[nNp].

(observe the resemblance to the continuous case). In the frequency domain, this translates to

G[ω]=nZdg[n]e2πiωn=pZdf[p]e2πiωNp=F[Nω].

Since volN<1, the periodic discrete-time Fourier transform F[ω] is shrunk, and a few of its periods alias to higher frequencies of G[ω]. However, unlike the case of sub-sampling where information was lost, all the information about f[n] is still present in g[n]. In order to fully recover it, we just need to remove the superfluous replicas of F[ω] outside N1[12,12]d. The frequency response of the ideal low-pass filter HN[ω] has to satisfy supp(HN)N1[12,12]d.

Denoting L=ANZd, by definition of the discrete-domain Fourier transform,

F[Nω]HN[ω]=volLnZdF(A(Nωn))HN[ω]=volLvolNnZdF(AN(ωN1n))HN[ω]=volLpZdF(AN(ωp)),

since HN(ω) removes all replicas of F except those with integer N1n. Note that we also assumed that the filter has DC gain of volN to make the result consistent with the sampling of f(x) on L. Such a filter is achieved by HN(ω)=volNbox(Nω). The corresponding impulse response is

hN[n]=sinc(Nn).

Convolving hN[n] with g[n] yields the following identity

f[n]=(ghN)[n]=pZdf[p]hN[nNp]=mZdf[Nm]sinc(Nnm).

The combined action of expansion followed by anti-aliasing filtering is called interpolation. Note the striking resemblance to the continuous case. As in the continuous case, here as well practice mandates the use of other interpolation kernels that have finite support, which in the frequency domain is translated to non-ideal low-pass filters. On block diagrams, we will denote expansion by N or I:N, followed by an interpolation filter HN.

Resampling

Let f be a signal sampled, as before, on L, and let Q=QZd be a new lattice onto which we would like to resample f without leaving the digital domain. If Q can be factorized into Q=MN such that ZdNZd and MZdZd, we can first interpolate f[n] to NZd and then decimate it onto MZd as described above. On block diagrams we will denote resamplers as M:N. They are particularly useful when resampling images to a new grid related to the original one by a non-integer (yet, rational) factor.

A few elementary rules can be straightforwardly proved:

  1. Successive sub-sampling cascades can be combined: M1M2=M2M1

  2. Successive expansion cascades can be combined: N1N2=N2N1

  3. Expansion can be exactly inverted: NN=id

  4. Sub-sampling cannot usually be exactly inverted: MMid

Noble identities

Treating sub-sampling and expansion and discrete-domain systems, let us quickly highlight several of their properties.

Linearity

Observe that the sub-sampling operation we defined, (Mf)[n]=f[Mn] is linear, i.e., it commutes with addition and scaling:

(M(αf+βg))[n]=(αf+βg)[Mn]=αf[Mn]+βg[Mn]=α(Mf)[n]+β(Mg)[n].

The same is also true for the expansion operation

(Nf)[n]={f[Nn]:NnZd0:otherwise,

since

(N(αf+βg))[n]={(αf+βg)[Nn]:NnZd0:otherwise=α{f[Nn]:NnZd0:otherwise+β{g[Nn]:NnZd0:otherwise.=α(Nf)[n]+β(Ng)[n].

Translation

When changing the order of translation and sub-sampling, the lattice matrix multiplies the translation vector:

(τpMf)[n]=f[M(np)]=f[MnMp]=(MτMpf)[n].

Similarly, for the expansion operation,

(Nτpf)[n]={f[Nnp]:NnZd0:otherwise,={f[N(nNp)]:NnZd0:otherwise,={f[N(nNp)]:N(nNp)Zd0:otherwise,=(Nf)[nNp]=(τNpNf)[n].

Note that since the translation operator is modified when swapped with the sub-sampling/expansion, these systems are linear but not shift invariant!

Exchanging sub-sampling and filtering

Let H be an LSI system defined by

(Hf)[n]=(hf)[n]=mZdh[m]f[nm]=(mZdh[m]δ[nm])f[n]=(mZdh[m]τm)f[n]

Combining linearity and translation properties of sub-sampling, we obtain

HM=mZdh[m](τmM)=mZdh[m](MτMm)=M(mZdh[m]τMm)=MG.

In other words, applying the system H followed by sub-sampling on M is equivalent to first sub-sampling on M and then applying the system

G=nZdh[n]τMn=pMZdh[Mp]τp,

which can be described by the expanded impulse response

g[n]={h[Mn]:MnZd0:otherwise

The frequency response of this equivalent system is straightforwardly given by

G[ω]=nZdh[n]e2πiωMn=nZdh[n]e2πi(Mω)n=H[Mω]=H[M1ω].

This result is usually known as a noble identity and can be summarized as

h(Mf)=M((Mh)f)

or, alternatively, as the following informal expression

H[ω]M=MH[M1ω]

where with some abuse of notation we represent the action of an LSI system by its Fourier transform.

Exchanging expansion and filtering

Similarly to the previous case, combining linearity and translation properties of expansion, we obtain

NH=mZdh[m](Nτm)=mZdh[m](τNmN)=(mZdh[m]τNm)N=GN.

with

G=nZdh[n]τNn=pNZdh[Np]τp,

that can be described by the expanded impulse response

g[n]={h[Nn]:NnZd0:otherwise

The frequency response of this equivalent system is straightforwardly given by

G[ω]=nZdh[n]e2πiωNn=nZdh[n]e2πi(Nω)n=H[Nω].

This yields our second noble identity that can be summarized as

N(hf)=(Nh)(Nf)

or, alternatively, as the following informal expression

NH[ω]=H[Nω]N

Polyphase representation

Let us fix some lattice M=MZdZd and denote by M0 its structural element. In this notation, we can represent

nZdf[n]=mM0nZdf[Mn+m].

Let H be an LSI system with the impulse response h[n] that can be described as

H=nZdh[n]τn.

We define the polyphase components of h[n] with respect to M as

em[n]=h[Mn+m]

for every mM0. Each such component can be regarded as the impulse response of a system in its own right,

Em=nZdem[n]τn=nZdh[Mn+m]τn.

In these terms, the system H itself can be expressed as

H=mM0nZdh[Mn+m]τMn+m=mM(nZdem[n]τMn)τm.

Note that the inner sum is just an expanded impulse response of Em,

(hf)[n]=mM(Mem)f[nm]

which in the frequency domain assumes the form of

H[ω]=mM0Em[M1ω]e2πiωm.

This decomposition of an LSI system in terms of the polyphase components is known as the type I polyphase decomposition.

Alternatively, we can define the polyphase sequences rm[n]=em[n]=h[Mnm], yielding the type II polyphase decomposition:

H=mMτm(nZdrm[n]τMn).

Efficient implementation of decimators

When H is a system implementing a decimation filter, it is followed by the sub-sampling M. Substituting type I polyphase decomposition with respect to M and applying the noble identity yields

M(hf)=M(mM(Mem)(τmf))=mMM(Mem)(τmf)=mMem(Mτmf)=mMEmMτmf.

or in operator writing

MH=mMEmMτm

Both sides of the identity can be viewed as two different implementations of the decimator system.

Let us compare the computational complexity of these two implementations. For simplicity, let us assume M=MI, so that M0=0,1,,M1d, and furthermore, let us assume that h[n] is supported on the discrete square 0,,K1d, while the input signal is supported on 0,,N1d. To simplify expressions, let us further assume that both N and K are divisible by M. The first system first calculated Nd outputs of the filter H (containing Kd coefficients), each of which requiring Kd multiply-and-accumulate (MACC) operations. This totals in (KN)d MACC operations. If the input signals are images coming at a rate of one image per second, this requires a processor running at (KN)d hertz. For d=2, M=4, K=1, and N=4000 (a 16 megapixel image decimated × in each axis using a 4×4 bicubic kernel), this amounts to 256 MHz! Note that afterwards, only 1/16-th of these output samples are retained by the sub-sampler; the rest are thrown away.

The second implementation first filters Md shifted and sub-sampled versions of the input signal (containing (NM)d samples) with the filters Em (containing (KM)d coefficients). Each filter output sample requires only (KNM2)d MACC operations per filter, and Md(KNM2)d=(KNM)d MACC operations in total. Note the save up by a factor of Md. For the same settings as before, we would need a processor running only at 16 MHz. Also note that the Md filters can be computed in parallel on Md cores running at M2d the frequency of the core required to implement the first system. In our example, if the processor had 16 cores, we could run each core at 1 MHz. Since processor power consumption is roughly proportional to f3 (or worse), the second system requires roughly M5d less power. In our example, 16 1-MHz cores would consume about one million times less power. For this reason, it is not an exaggeration to say that any ASIC implementing signal decimation uses the polyphase implementation.

Efficient implementation of interpolators

We will now repeat the same analysis for the case when H is an interpolation filter preceding expansion N. Substituting the type II polyphase representation with respect to N and applying the noble identity yields

h(Nf)=mNτm(Nrm)(Nf)=mNτmN(rmf)=mNτmNRmf.

or in operator writing

HN=mNτmNRm

As before, both sides of the identity can be viewed as two different implementations of the interpolator system. The second system first filters the inputs with the smaller filters Rm, the expands the results and sums their shifted versions.