Reformulation of the Fresnel Transform
to Introduce Sampling and
Recovery Region Control and
its Acceleration
By
Modesto Guadalupe Medina-Melendrez
A Dissertation
Submitted in partial fulfillment of the requirements for the degree of
DOCTOR IN COMPUTER SCIENCE
at the
National Institute for Astrophysics, Optics and Electronics
January 2010,
Tonantzintla, Puebla, México
Advisors:
Dr. Miguel Arias-Estrada
Computer Science Coordination, INAOE
Dra. Albertina Castro
Optics Coordination, INAOE
© INAOE 2010 All rights reserved
The author hereby grants to INAOE permission to reproduce and to distribute copies of this thesis document in whole or in part.
Summary
The Fresnel transform has been used in several applications of digital holography to recover wave fields from digital holograms, among them three-dimensional (3D) reconstruction, 3D recognition and particle tracking velocimetry. Depending on the application, the recovered wave fields should satisfy certain requirements. Sampling rate control (the ability to choose the distance between samples in the recovered wave field) is a requirement for several digital holography applications; furthermore, recovery region control (the ability to choose the size and position of the recovered wave fields) can be useful, since most applications require only a small region of the wave fields. Nevertheless, none of the current formulations of the Fresnel transform controls both the sampling rate and the recovery region. A few proposals completely control the sampling rate of the wave fields to be recovered, but they compute at least two two-dimensional discrete Fourier transforms with a dependency between them. This dependency limits the minimum execution time that can be achieved.
In this research, it is shown that an implementation of the Fresnel transform with a single two-dimensional discrete Fourier transform can control the sampling rate and the recovery region of the wave fields and, at the same time, reduce the required execution time. In the proposed software alternative, the use of a single two-dimensional discrete Fourier transform achieves shorter execution times than the current alternatives for most practical applications, provided that a small flexibility is permitted in the required sampling rate. Furthermore, a parallel hardware architecture that does not require this flexibility is proposed. The hardware architecture can achieve shorter execution times than any existing alternative for computing the Fresnel transform.
The new formulation of the Fresnel transform may require computing only a few coefficients of the two-dimensional discrete Fourier transform applied to an input array padded with zeros. In order to reduce the execution time required by the new formulation of the Fresnel transform, a pruning method for fast Fourier transforms was proposed. The pruning method avoids computing the multiplications by zero and the Fourier coefficients that are not required. This pruning method can be useful in several applications of digital signal processing.
The proposed hardware architecture uses a bank of fixed-point implementations of the second-order Goertzel algorithm to compute the required Fourier coefficients. The scaling factor required by the classical second-order Goertzel algorithm is in O(1/M²), where M is the length of the input sequence, which yields a small signal-to-noise ratio. Here, a new method is also proposed that uses a scaling factor in O(1/M) to compute any Fourier coefficient with a fixed-point implementation of the second-order Goertzel algorithm. The new scaling factor increases the signal-to-noise ratio. Pipelines at different levels are proposed for implementing the second-order Goertzel algorithm as a bank of filters. The bank of filters is used to compute the two-dimensional discrete Fourier transform required by the proposed formulation of the Fresnel transform.
In conclusion, the new formulation of the Fresnel transform can be used to effectively control the sampling rate and the recovery region of the wave fields. The proposed software alternative requires a shorter execution time than current software alternatives for practical applications, and the proposed hardware alternative can achieve similar reductions in the execution time without sacrificing functionality in comparison with current accelerators.
Acknowledgements
I owe special gratitude to my advisors Dr. Miguel Arias Estrada and Dra. Albertina Castro for their valuable guidance and constructive criticism throughout the development of the thesis research work.
I would like to thank the members of my defense committee for their comments and suggestions. Thanks to Dr. René Armando Cumplido Parra, Dr. Aurelio López López, Dr. Manuel Montes y Gómez, Dr. Gustavo Rodríguez Gómez and Dr. René de Jesús Romero Troncoso.
I want to express my gratitude to all those who gave me encouragement and, above all, their friendship.
It is important to mention the support of my family, especially that of my wife Cristina whose patient love encouraged me to complete this work.
I also want to acknowledge the facilities provided by the technical staff of INAOE, in particular the people of the Computer Science Coordination.
Finally, I thank CONACYT for doctoral scholarship #161826 and the Instituto Tecnológico de Culiacán for the leave granted during my doctoral studies.
Dedication
To my mother, brothers
Contents
Summary
Acknowledgments
Contents
1. Introduction
1.1. Literature Overview
1.1.1. Applications of the FresT in digital holography
1.1.2. Control of the sampling rate of the wave field to be recovered
1.1.3. Hardware Accelerators for DH
1.1.4. Discussion
1.2. Problem Statement
1.3. Motivations for the research work
1.3.1. Research Questions
1.3.2. General Objectives
1.3.3. Requirements
1.4. Description of the Chapters
2. Fresnel transform with sampling and recovery region control through a single 2D discrete Fourier transform
2.1. Introduction
2.2. Fresnel transform (FresT)
2.3. Using the sampling theory of the Fourier transform to introduce sampling control in the FresT
2.3.1. Aliasing Operator
2.3.2. Changing the number of samples to be computed of the Fourier transform
2.3.3. Changing the number of samples of the 2D Fourier transform to be computed
2.3.4. Fresnel transform with sampling control (FresTSC)
2.4. Introducing control of the recovery region in the Fresnel transform
2.5. Implementation of the FresTSRC
2.5.1. Elimination of the non required 1D DFTs
2.5.2. Computing the complex exponential functions
2.5.3. Introduction of flexibility in the sampling rate to reduce the execution time of the required 2D DFT
2.6.1. Test Bench
2.6.2. Sampling and Recovery Region Control
2.6.3. Execution time
2.7. Discussion
3. Input and/or Output Pruning of Composite Length FFTs using a DIF-DIT Transform Decomposition
3.1. Introduction
3.2. Decimation-in-frequency and decimation-in-time transform decomposition
3.3. Implementation of the FFTDIF-DIT-TD
3.3.1. Implementation of the Input Stage
3.3.2. Implementation of the Output stage
3.4. Selection of the decomposition parameters
3.5. Comparison with other pruned FFTs
3.6. The FFTDIF-DIT-TD using the FFTW
3.7. Using the FFTDIF-DIT-TD to compute the 2D DFT required by the FresTSRC
3.8. Discussion
4. Fixed-point implementation of the second-order Goertzel algorithm
4.1. Introduction
4.2. Second-Order Goertzel algorithm
4.3. Overflow Analysis in the Fixed-Point Implementation of the second-order Goertzel algorithm
4.3.1. Analysis of the recursive stage
4.3.2. Analysis of the complete system
4.4. Using the Second-Order Goertzel Algorithm with a scaling factor in O(1/M)
4.5. Noise analysis in the proposed fixed-point implementation of the second-order Goertzel algorithm
4.5.1. Effects of Coefficient Quantization
4.5.2. Noise analysis
4.6. Discussion
5. Parallel Architecture based on FPGAs to Compute the DFT required by the FresTSRC
5.1. Introduction
5.2. Basic considerations
5.3. Pipeline to compute the imaginary and real part of
5.3.3. Final Stage
5.4. Computation of 1D DFTs
5.4.1. Bank of Goertzel filters
5.4.2. Pipeline with the bank of Goertzel filters to compute 1D DFTs
5.5. Using the bank of Goertzel filters to compute 2D DFTs
5.5.1. Managing the data with a single memory unit
5.5.2. Pipeline to compute the 2D DFT
5.6. Implementation of the proposed FPGA based architecture to compute the 2D DFT
5.7. Computation of the FresTSRC with the proposed architecture
5.8. Discussion
6. General conclusions and future work
6.1. Work Summary
6.2. Conclusions
6.3. Publications
6.4. Future work
Appendix A. Implementation of the FFTDIF-DIT-TD
A.1. Implementation of the input stage
A.2. Implementation of the output stage
A.3. Using the FFTW to compute the required DFTs of the FFTDIF-DIT-TD
Appendix B. Overflow analysis in the fixed-point implementation of the first-order Goertzel algorithm
B.1. Introduction
B.2. The first-order Goertzel algorithm
B.3. Implementation of the first-order Goertzel algorithm for complex-valued input sequences
B.4. Overflow analysis in the fixed-point implementation of the first-order Goertzel algorithm for complex-valued input sequences
B.4.1. Necessary and sufficient condition to avoid overflow
B.4.2. Required scaling factor
B.5. Finite-precision numerical effects
B.5.1. Noise analysis
References
Chapter 1. Introduction
The Fresnel transform is used in digital holography to compute, from a digital hologram, wave fields propagated to a certain distance or with a certain wavelength. The requirements that the Fresnel transform should satisfy depend on the application. For instance, in 3D reconstruction, correspondences should be maintained among the recovered wave fields. In this research, a reformulation of the Fresnel transform is proposed to satisfy the main requirements of digital holography applications. Furthermore, two alternatives to reduce the execution time of the reformulated Fresnel transform are proposed, one in software and one in hardware.
In this chapter, a literature overview of the applications of digital holography is presented, and the research problem is stated by analyzing the main requirements of the Fresnel transform. The motivations that guide this research work are described together with the objectives of the thesis. Finally, the organization of the thesis is described.
1.1. Literature Overview
In this section, the existing literature on the use of the Fresnel transform (FresT) in digital holography (DH) is reviewed. First, some applications of DH and their requirements are mentioned. One of the requirements is control of the sampling rate of the wave fields to be recovered; thus, the existing proposals on this topic are described. Afterward, the existing hardware architectures that reduce the execution time of the FresT are described. Finally, a discussion is presented.
1.1.1. Applications of the FresT in digital holography
DH is an interferometric technique that can be used to capture and recover the three-dimensional (3D) information of the objects registered on a digital hologram. The FresT can be used to retrieve the phase and amplitude of the wave field (3D information) propagated to a certain distance and with a certain wavelength. Recently, DH has been gaining importance in applications such as metrology, medicine, and pattern recognition.
A. Microscopy
Digital holographic microscopy (DHM) provides numerical reconstruction of the third dimension by performing a plane-by-plane recovery process that overcomes the small depth of focus of optical microscopy. Microscopes based on the principle of digital holography are emerging as an interesting alternative [1, 2, 3]. The advantage of holographic microscopy is that both the phase and the amplitude of the light are retrieved, while in traditional microscopy only the intensity (the square of the amplitude) is retrieved. Phase information reveals material and object properties that cannot be seen in a traditional microscope, e.g. the refractive index and the 3D structure of an object. In DHM, both opaque and transparent 3D objects can be measured. Additionally, DHM in its simplest setup does not present the aberrations that the use of lenses can cause.
B. 3D Reconstruction, 3D Recognition and Holographic 3D TV.
In [4], a technique for 3D shape reconstruction using a focus analysis in DH is presented. This technique can be referred to as “shape from focus in DH” (SFFDH). In SFFDH, several 2D images of the amplitude of the wave fields at various distances have to be recovered from the hologram using the FresT. Then, a focus measure is applied to small regions of the images. By finding, for each region, the maximum of the focus measure among all the recovered images, an estimate of the distance of each region is obtained. These estimates constitute the 3D shape reconstruction of the analyzed scene. In order to compare the regions of all the images, the sampling rates of all the images should be very close or even the same. The main problem with this technique is that a large number (hundreds) of images have to be recovered with the FresT, resulting in prohibitive execution times for practical applications.
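The depth-estimation step described above can be sketched as follows. This is only a minimal illustration and not the method of [4]: the focus measure (the variance of the amplitude inside each region), the use of non-overlapping square regions, and all function names are assumptions chosen for brevity.

```python
def variance(values):
    """Population variance, used here as a simple focus measure (assumption)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def region_values(image, top, left, size):
    """Flatten the size x size block of `image` whose corner is (top, left)."""
    return [image[r][c]
            for r in range(top, top + size)
            for c in range(left, left + size)]


def shape_from_focus(stack, distances, size):
    """Return one estimated distance per region.

    stack     -- list of 2D amplitude images (lists of rows), one per distance;
                 all images are assumed to share the same shape and sampling rate
    distances -- recovery distance of each image in `stack`
    size      -- side of the square, non-overlapping regions, in pixels
    """
    rows, cols = len(stack[0]), len(stack[0][0])
    depth_map = []
    for top in range(0, rows - size + 1, size):
        depth_row = []
        for left in range(0, cols - size + 1, size):
            # Focus measure of the same region in every recovered image.
            scores = [variance(region_values(img, top, left, size))
                      for img in stack]
            # The image where the region is sharpest gives its distance.
            best = max(range(len(stack)), key=lambda i: scores[i])
            depth_row.append(distances[best])
        depth_map.append(depth_row)
    return depth_map
```

The comparison of the same region across images is only meaningful when all images have the same sampling rate, which is the requirement stated above.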
In [5], real-time 3D imaging and shape-based recognition of microorganisms is addressed. Images of the amplitude of the wave field coming from a hologram of microscopic samples are recovered numerically at different distances. The images are resized, and the objects of interest are segmented in a subsequent stage. Some feature vectors are
future work, the authors proposed to consider four-dimensional imaging (with consideration of time in 3D imaging) in order to study small living organisms in their spatial and temporal living field, requiring a high speed recovery of the images from the holograms.
In [6], a system for image recognition applications was proposed. In this system, a recovered image of the wave field coming from the hologram of the analyzed object is compared with that from a similar digital hologram of a 3D reference object using correlation methods. The authors showed that their system detects the 3D position of the object in the input scene with high accuracy and tolerance to distortions. In a first method, the correlation algorithms are applied directly to the digital holograms. In a second approach, the method is improved by applying 3D correlation algorithms in the object space, which is recovered using the FresT at a set of different recovery distances.
In [7], the development of holographic 3D television (holographic 3DTV) is reviewed. That review states the need to reduce the execution time of the diffraction equation that is used, and it recognizes that the FresT is one of the most widely used diffraction equations in holographic 3DTV.
C. Multi-wavelength Digital Holography
Important applications rely on the possibility of using multi-wavelength digital holography (MWDH) to record and reconstruct colored objects with a simple optical setup. However, the use of MWDH for metrological applications demands some caution when reconstructions at different wavelengths are combined [8-12]. The direct red-green-blue combination (the RGB composition) of the amplitude images of the recovered wave fields is affected by the variation of the sampling rate with the recovery wavelength.
D. Digital Holographic Particle Tracking Velocimetry
Digital holographic techniques can easily capture the time evolution of particles using a high-speed digital camera. In [13], a complete digital holographic particle tracking
amplitude of the wave field by using an algorithm running on a personal computer (PC). The image recovery is done by a fast Fourier transform (FFT) algorithm that implements the FresT. In the same paper [13], it is stated that the procedures for image recovery and particle tracking with 1000 fringe images require 200×1000 FFT computations (eight hours using 5 PCs). In [14], a special-purpose computer system to accelerate the execution of DHPTV was presented. The system developed in [14] is described in section 1.1.3.
1.1.2. Control of the sampling rate of the wave field to be recovered
If the FresT is used directly to recover the wave field, the sampling rate of the recovered wave fields varies in inverse proportion to the product of the recovery distance and the recovery wavelength [15]. The variation of the sampling rate among the recovered wave fields causes problems in shape from focus in digital holography (SFFDH) and in multi-wavelength digital holography (MWDH). A few techniques have been proposed to keep the sampling rate constant among the recovered wave fields.
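The dependence described above can be made concrete with the relation commonly given for the single-FFT discretization of the FresT, in which the distance between recovered samples is λd/(NΔx), with λ the wavelength, d the recovery distance, N the number of hologram samples per side and Δx the hologram pixel pitch. The following sketch assumes that relation, which is taken from the standard digital holography literature and not from this section; the function name and the numerical values are illustrative assumptions.

```python
def recovered_pixel_pitch(wavelength, distance, n_samples, hologram_pitch):
    """Distance between samples of the wave field recovered with the direct FresT.

    Assumes pitch_out = wavelength * distance / (N * pitch_in), the relation
    usually given for the single-FFT discretization. Lengths are in metres.
    The sampling rate is the reciprocal of the returned value.
    """
    return wavelength * distance / (n_samples * hologram_pitch)


if __name__ == "__main__":
    n, pitch_in = 1024, 6.7e-6  # illustrative sensor parameters (assumed)
    for wavelength in (633e-9, 532e-9, 473e-9):
        for distance in (0.20, 0.25, 0.30):
            pitch_out = recovered_pixel_pitch(wavelength, distance, n, pitch_in)
            print(f"lambda = {wavelength * 1e9:4.0f} nm, d = {distance:.2f} m, "
                  f"sample spacing = {pitch_out * 1e6:6.2f} um")
```

Under this assumption, every change of distance or wavelength changes the sample spacing, so images recovered at different distances or wavelengths cannot be compared pixel by pixel without resampling.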
One method uses a convolution approach of the FresT [15]. With the convolution approach, the sampling rate of the recovered wave fields is equal to that of the digital hologram. Thus, changes of the recovery distance or of the recovery wavelength do not affect the sampling rate. Nevertheless, two extended 2D discrete Fourier transforms (DFTs) and one extended 2D inverse discrete Fourier transform are required. Furthermore, the sampling rate is kept constant, but it cannot be changed.
A simple method to control the sampling rate of the wave field is proposed in [16-19]. The sampling rate is controlled by increasing the number of pixels of the digital hologram: the hologram array is zero-padded to a size of N2×N2, where N2 depends on the recovery distance and on the recovery wavelength. Nevertheless, the sampling rate obtained with this method cannot be lower than that obtained with the direct FresT. In the field of digital signal processing, the method proposed in [16] can be interpreted as oversampling [30]. Oversampling has a counterpart known as downsampling. In
sequence, [30]. Nevertheless, if the classical aliasing operator is used, the downsampling can be performed only by integer factors and not by an arbitrary factor.
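The choice of the padded size N2 in the zero-padding method can be sketched as follows. The sketch assumes the usual single-FFT relation in which the recovered sample spacing is λd/(N2·Δx); this relation, the function name and the rounding rule are assumptions for illustration and are not taken from [16-19].

```python
import math


def padded_size(wavelength, distance, hologram_pitch, target_pitch, n_samples):
    """Padded side length N2 that gives a sample spacing of at most target_pitch.

    Assumes spacing = wavelength * distance / (N2 * hologram_pitch).
    Zero padding can only enlarge the array, so a target spacing coarser than
    that of the direct FresT (N2 < n_samples) cannot be reached this way.
    """
    exact = wavelength * distance / (hologram_pitch * target_pitch)
    n2 = math.ceil(exact)  # round up so the spacing does not exceed the target
    if n2 < n_samples:
        raise ValueError("target spacing is coarser than that of the direct "
                         "FresT; zero padding cannot reach it")
    return n2


if __name__ == "__main__":
    # Illustrative values (assumed): 1024 x 1024 hologram with 6.7 um pixels.
    for d in (0.30, 0.35, 0.40):
        print(d, padded_size(633e-9, d, 6.7e-6, 20e-6, 1024))
```

Because N2 must be an integer, the target spacing is in general met only approximately, and N2 grows with the distance and the wavelength, which increases the size of the DFT to be computed.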
A double FresT method (DBFT) is proposed in [20] to adjust the sampling rate of the complex wave field by introducing a transitional plane (TP) and applying the FresT twice. The FresT is considered to be a linear system. Since a linear system can be decomposed into a cascade of subsystems, the FresT is interpreted as a system composed of two subsystems. The hologram is first propagated to a TP at a distance d1 and then to the observation plane at a distance d2. By adjusting the distance parameter of the first propagation, it is possible to control the sampling rate of the wave field to be recovered independently of the distance and the wavelength. The problem with this method is that two FresTs are required.
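The effect of the transitional plane on the sample spacing can be illustrated with a short calculation. The sketch assumes that each of the two steps follows the single-FFT relation spacing_out = λ|d|/(N·spacing_in) and that the total distance is d = d1 + d2 with both distances positive; under these assumptions the final spacing is the hologram pitch multiplied by d2/d1. The exact parameterization used in [20] may differ, and the function names are hypothetical.

```python
def split_distance(total_distance, magnification):
    """Split d into d1 + d2 so that d2 / d1 equals the desired magnification."""
    if magnification <= 0:
        raise ValueError("magnification must be positive")
    d1 = total_distance / (1.0 + magnification)
    d2 = total_distance - d1
    return d1, d2


def two_step_pitch(wavelength, d1, d2, n_samples, hologram_pitch):
    """Sample spacing after two consecutive single-FFT propagations."""
    # Spacing on the transitional plane after the first propagation.
    pitch_tp = wavelength * abs(d1) / (n_samples * hologram_pitch)
    # Spacing on the observation plane after the second propagation.
    return wavelength * abs(d2) / (n_samples * pitch_tp)


if __name__ == "__main__":
    d1, d2 = split_distance(0.30, 2.0)
    # Illustrative values (assumed): 633 nm, 1024 samples, 6.7 um pixels.
    print(d1, d2, two_step_pitch(633e-9, d1, d2, 1024, 6.7e-6))
```

In this simplified model the wavelength and the number of samples cancel out, which is consistent with the statement that the sampling rate is controlled independently of the distance and the wavelength.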
Another alternative to control the sampling rate in digital holography is proposed in [21]. The wave field on either a variable tilted or vertical plane can be recovered with adjustable sampling rate. The wave field on a variable tilted plane can be calculated by a modified FresT that includes geometrical information of the tilted plane. The pixel sizes of the tilted recovered wave field are the same. However, if a small distance is preferred, the modified FresT cannot be directly used. In order to solve this problem, a transitional reconstruction plane is introduced, as in [20]. First, the transitional wave field is calculated using an inverse Fourier transform. Second, by using the transitional wave field, the final tilted wave field is recovered directly from the modified FresT. The main problem is that a fast algorithm to implement the modified FresT does not exist.
1.1.3. Hardware Accelerators for DH
A few special-purpose digital systems have been proposed to reduce the execution time of the FresT.
An implementation of a custom DSP to accelerate algorithms used in DH is described in [22]. The system supports holograms with side lengths between 16 and 1024 pixels. A convolution approach of the FresT is used. An FFT of 2048×2048 points is used to avoid overlapping when the largest hologram size supported by the system is processed. Three FFT computations are required to recover one image of the wave field,
custom hardware accelerator was developed to implement the FFTs. The FFT core contains several computational units based on the pipelined radix-2² decimation-in-frequency algorithm. One additional radix-2 pipeline is required to process the largest FFT (2048×2048). Compared with a multiplexed architecture of the FFT, the number of required clock cycles is decreased from O(N log_r(N)) to O(N), where r denotes the radix. The system was synthesized for a Xilinx Virtex XCV1000-E device. For the largest hologram, at least 2 seconds are required to recover one image of the wave field; for the smallest hologram, approximately 3 images per second can be processed using a clock frequency of 25 MHz. The system requires 73% of the LUTs (≈17500) and 84% of the block RAMs (≈330 Kbits). About 50% of the required resources are used for the FFT processor.
A series of five special-purpose computer systems for electroholography has been developed by a research group in Japan. The system is named HORN (HOlographic ReconstructioN). Electroholography is the generation of the optical field from a computer-generated hologram (CGH). The evolution of the five HORN systems is described in [23]. HORN-5 is a parallelized high-performance computing PCI board for CGH and the first that uses an FPGA. The next generation has a different application: a special-purpose computer system for digital holographic particle tracking velocimetry (DHPTV), presented in [14]. The system was designed using the FFT for the recovery of the wave field (only the amplitude), and it is called the special-purpose computer for DHPTV, or FFT-HORN. The FresT based on convolution is used as the recovery algorithm. A Xilinx LogiCORE is used as the FFT core, and three RAM units are required. FFT-HORN is implemented on a platform with four Xilinx XC2VP70 FPGA chips. Since there is one recovery system in each FPGA chip, one FPGA board can recover four images of the wave field at the same time. The clock frequency of the special-purpose computer is 133 MHz. A comparison between the FFT-HORN and a PC for calculating 100 recovered images of 256×256 pixels is shown in Table 1.1.
In [14], the authors state their plans to extend the FFT-HORN to recover the amplitude of the wave fields from digital holograms of 1024×1024 pixels.
Table 1.1. Comparison between the execution time of the FFT-HORN and a PC.
1.1.4. Discussion
Each DH application requires the wave field recovered with the FresT to have certain characteristics. In DHM, both the amplitude and the phase of the wave field are required. In SFFDH, MWDH and 3D reconstruction, keeping the sampling rate constant among the recovered wave fields is also required. In 3D recognition, controlling the sampling rate can be very useful, since the resizing procedure can be avoided. In the 3D reconstruction and recognition applications, selecting the region of the wave field to be recovered is desirable. Finally, the development of algorithmic or hardware alternatives that reduce the execution time of the computation of the FresT can push the development of practical applications. These requirements are summarized next:
1. Recovery of the phase and amplitude of the wave field.
2. Controlling the recovery region of the wave field to be recovered.
3. Controlling the sampling rate of the recoveries.
4. Fast computation of the FresT.
The complete classical FresT can be used to recover the phase and amplitude of the wave field, but most current implementations eliminate a phase factor, which produces an incorrect estimate of the phase of the wave fields [5, 6, 13].
Even though several applications require only a portion of the wave field to be recovered [2, 6, 13], the computation of only the region of interest has not yet been explored. In general, the complete wave field is computed and the non required regions are
Any of the described methods can be used to control the sampling rate of the wave field to be recovered, but each has limitations. The convolution approach requires three extended 2D DFTs, and its sampling rate is constant and cannot be varied. The FresT with zero-padding can be used only to increase the sampling rate over that of the direct FresT. One possibility for reducing the sampling rate below that of the direct FresT is to introduce aliasing in the hologram. Finally, the techniques described in [20, 21] require two computations of the FresT.
A few hardware alternatives have been developed to accelerate the process of recovering wave fields from holograms. FPGA technology has been adopted because it makes it feasible to implement custom processors with parallelism. In the current hardware accelerators, the convolution approach is used to recover the 2D images from the holograms, which requires applying the 2D DFT at least three times to matrices larger than the original hologram. Since the processed image sizes are powers of two, the 2D FFT is used to implement the 2D DFT. In the FFT-HORN pipeline system [14], the resulting images represent only the amplitude of the computed wave field.
1.2. Problem Statement
There are applications that require recovering several images of the wave field from holograms while fulfilling certain requirements. These requirements can be listed as:
• control of the sampling rate;
• control of the recovery region (its size and position); and
• acceleration of the recovery process.
Since no previous proposal to implement the FresT has fulfilled all of these requirements in a single system, the problem to be solved is the reformulation and implementation of a single FresT that copes with all of them.
1.3. Motivations for the research work
The hypothesis of this work is that, by using zero-padding and the aliasing operator in the FresT, the sampling rate can be completely controlled while computing a single 2D DFT. Furthermore, control of the region of interest in the wave field to be recovered is added. The most time-consuming part of the FresT is the computation of the required 2D DFT; thus, the main effort is focused on the implementation of the 2D DFT. Using the proposed formulation of the FresT, two alternatives to implement it and to reduce its execution time are developed.
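The two operations named in the hypothesis rest on well-known properties of the DFT, which the following one-dimensional sketch checks numerically: zero-padding a sequence samples its spectrum more finely, while folding (aliasing) a sequence in the input domain samples its spectrum more coarsely by an integer factor. The sketch illustrates only these classical identities and not the reformulated FresT of chapter two; the function names are assumptions.

```python
import cmath


def dft(x):
    """Direct O(N^2) DFT, sufficient for a small numerical check."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]


def zero_pad(x, factor):
    """Append zeros so that the length is multiplied by `factor`."""
    return list(x) + [0] * (len(x) * (factor - 1))


def alias(x, factor):
    """Classical aliasing operator: fold x into len(x) // factor samples."""
    n = len(x)
    if n % factor:
        raise ValueError("length must be a multiple of the integer factor")
    short = n // factor
    return [sum(x[m + r * short] for r in range(factor)) for m in range(short)]


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6, 7, 8]
    full = dft(x)
    padded = dft(zero_pad(x, 2))   # twice as many spectral samples
    folded = dft(alias(x, 2))      # half as many spectral samples
    print(all(abs(padded[2 * k] - full[k]) < 1e-9 for k in range(len(x))))
    print(all(abs(folded[k] - full[2 * k]) < 1e-9 for k in range(len(x) // 2)))
```

The restriction to integer factors is visible in `alias`, which is the limitation of the classical operator that the specific research questions below address.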
In this section, the research questions, the objectives and the requirements that guide this research work are mentioned.
1.3.1. Research Questions
General research questions:
• Is there a formulation of the FresT that, using a single 2D DFT, can introduce sampling rate control and recovery region control in the wave field to be recovered?
• What would be an efficient implementation of the proposed FresT that achieves a lower execution time than the current alternatives?
Specific research questions:
• Can the FresT with the aliasing operator be used to reduce the sampling rate of the wave field to be recovered by an arbitrary factor?
• Is there a software implementation alternative to reduce the execution time required by the 2D DFTs used to implement the proposed formulation of the FresT?
• Is there an efficient hardware architecture alternative based on FPGAs to reduce the execution time required by the 2D DFTs used to implement the proposed formulation of the FresT?
1.3.2. General Objectives
To propose a formulation of the FresT with introduction of sampling rate control and recovery region control in the wave fields to be recovered.
To implement the proposed formulation of the FresT, achieving a reduction in the execution time in comparison with that of the current alternatives.
1.3.3. Requirements
The requirements for the implementation of the modified FresT are:
• The hologram can be of any size for the software alternative and of up to 1000×1000 pixels for the hardware alternative based on FPGAs.
• To save at least 20% of the execution time with the software alternative in comparison with that of the convolution approach.
• To reduce the execution time with the hardware architecture by at least two orders of magnitude in comparison with a sequential machine and by at least two thirds of that required by previously reported architectures that compute the FresT.
• The number of gray levels in the amplitude image of the recovered wave fields should be at least 256 (8-bit coding).
1.4. Description of the Chapters
The reformulation and implementation of the FresT comprise various improvements and contributions in different fields of knowledge. This multidisciplinary character gives rise to the need to analyze each contribution independently and to review the related literature in depth in the chapter where each contribution is described.
In the second chapter, the FresT is reformulated. Additionally, the introduction of a slight flexibility in the sampling rate of the wave fields is proposed to reduce the execution time required by the reformulated FresT. In the same chapter, general considerations to efficiently implement the proposed formulation of the FresT are suggested. Finally, the
In chapter three, a pruning method for FFTs of highly composite length (a length that is equal to the product of several positive integers) is proposed in order to avoid the multiplications by zero and the computation of the outputs that are not required. The proposed FFT pruning method increases the efficiency of the implementation of the reformulated FresT. Furthermore, it is demonstrated that the execution time required to compute the pruned FFTs is reduced further than with the current methods in the literature.
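The two sources of wasted work that pruning targets can be seen in a direct evaluation of the DFT. The sketch below is not the pruning method of chapter three, which operates on the FFT decomposition; it is a plain O(L·K) evaluation, with hypothetical names, that simply never multiplies by the padded zeros and never computes the outputs that are not requested.

```python
import cmath


def pruned_dft(nonzero_input, transform_length, wanted_outputs):
    """Selected coefficients of the DFT of a zero-padded sequence.

    nonzero_input    -- the L leading samples; the remaining
                        transform_length - L samples are assumed to be zero
    transform_length -- length N of the (virtually) padded sequence
    wanted_outputs   -- iterable with the K coefficient indices to compute

    Cost: L * K complex multiplications, against N * N for the full direct
    DFT and on the order of N * log2(N) for an unpruned FFT.
    """
    result = {}
    for k in wanted_outputs:
        acc = 0j
        # Input pruning: the loop stops at L, so zeros are never touched.
        for m, value in enumerate(nonzero_input):
            acc += value * cmath.exp(-2j * cmath.pi * k * m / transform_length)
        # Output pruning: only the requested indices are ever evaluated.
        result[k] = acc
    return result


if __name__ == "__main__":
    coefficients = pruned_dft([1, 2, 3], 8, [0, 4])
    for k, value in coefficients.items():
        print(k, value)
```

A direct evaluation of this kind only pays off when L·K is small compared with N·log2(N); a pruned FFT aims to keep the savings over a wider range of L and K.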
In the fourth chapter, an overflow analysis of the fixed-point implementation of the second-order Goertzel algorithm is described; this algorithm is the most promising choice for the architectural alternative. A method is then proposed to implement the second-order Goertzel algorithm with a larger scaling factor, in O(1/M) instead of the classical O(1/M²), where M is the length of the input sequence. This scaling factor increases the signal-to-noise ratio with respect to that achieved by the implementation of the classical second-order Goertzel algorithm.
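For reference, the textbook second-order Goertzel recursion for a single Fourier coefficient can be sketched as follows. This is a floating-point illustration only: it applies no scaling factor and does not reproduce the fixed-point method or the overflow analysis of chapter four. The function name is an assumption.

```python
import cmath
import math


def goertzel(x, k):
    """Coefficient X[k] of the length-M DFT of x, second-order Goertzel form.

    Recursive stage: v[n] = x[n] + 2*cos(w)*v[n-1] - v[n-2], with w = 2*pi*k/M.
    Final stage:     X[k] = exp(1j*w) * v[M-1] - v[M-2].
    The recursive stage uses a single real coefficient, which is what makes
    the algorithm attractive for hardware filter banks.
    """
    m = len(x)
    w = 2.0 * math.pi * k / m
    coeff = 2.0 * math.cos(w)
    v1 = 0j  # v[n-1]
    v2 = 0j  # v[n-2]
    for sample in x:
        v0 = sample + coeff * v1 - v2
        v2, v1 = v1, v0
    return cmath.exp(1j * w) * v1 - v2


if __name__ == "__main__":
    x = [1, 2, 3, 4]
    for k in range(len(x)):
        print(k, goertzel(x, k))
```

In floating point the intermediate values v[n] may grow well beyond the magnitude of the input, which is why a fixed-point implementation needs a scaling factor on the input.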
In chapter five, a parallel architecture to implement the 2D DFT required by the reformulation of the FresT is proposed. This architecture is based on the modification of the second-order Goertzel algorithm reported in chapter four.
In the last chapter, the general conclusions are discussed and the future work is presented. Additionally, two appendices are attached. Appendix A includes the proposed code to implement the pruned FFTs. Finally, in appendix B, an overflow analysis of the fixed-point implementation of the first-order Goertzel algorithm is presented. This analysis is used for comparison with the proposal of chapter four.
Chapter 2. Fresnel Transform with Sampling and Recovery Region Control through a Single 2D Discrete Fourier Transform
In this chapter, a new formulation of the Fresnel transform that introduces control in the
sampling rate as well as in the recovery region (size and position) of the wave fields to be
recovered is proposed. The proposed formulation is evaluated using digital holograms. This
proposal is based on the sampling theory of the Fourier transform. By setting a fixed
sampling rate, a direct correspondence can be maintained among the several wave fields
recovered from digital holograms at different distances or with different wavelengths. The
sampling rate can be increased or decreased with respect to the sampling rate obtained by
the Fresnel transform directly applied to the digital hologram. The main advantage of the
proposed formulation is that it requires the computation of only one two-dimensional
discrete Fourier transform (2D DFT), in contrast with other proposals that require at least
two computations. Nevertheless, if the special characteristics of the required 2D DFT are
not considered, it can be computed inefficiently, thus increasing its execution time. Here, it
is shown that several of the rows and/or columns of the 2D DFT are not needed.
Additionally, this analysis motivates either pruning the computation of the
required 2D DFT or developing special hardware alternatives to compute it.
2.1. Introduction
Sampling control in digital holography usually refers to manipulating or analyzing
the effect of sensing a digital hologram [24-27]. Other works deal with
controlling the sampling rate only in the wave fields to be recovered and not in the recorded
holograms.
In several applications it is necessary to control the sampling rate of the wave fields
recovered by a numerical implementation of the Fresnel transform (FresT). In
multi-wavelength digital holography (DH), the same sampling rate has to be kept in the complex amplitude
distributions recovered with different wavelengths for a proper matching [8-12]. Sampling
control in the retrieved wave field would also be highly desirable for three-dimensional
(3D) imaging and object recognition [4, 6, 28-29]. For instance, in DH
the shape-from-focus technique presented in [4] could reconstruct 3D shapes with extended
depth of field if the wave fields recovered at different propagation distances
had the same sampling rate and therefore maintained correspondences.
The FresT can be implemented using a single two-dimensional discrete Fourier
transform (2D DFT). Nevertheless, if the FresT is directly applied to the digital hologram,
the sampling rate of the recovered wave fields decreases as the recovery distance and/or the
recovery wavelength increases [13]. As a consequence, the wave fields obtained at
different recovery planes do not have the same magnification. The FresT can also be
implemented by a convolution approach to keep the sampling rate equal to that of the
digital hologram and therefore maintain correspondences among the wave fields retrieved
at different distances along the optical axis and/or with different wavelengths; however, this
sampling rate cannot be changed and the exact FresT implemented by this method requires
the computation of three extended 2D DFTs [13].
In [20, 21], two methods based on two-stage propagations were proposed to recover
wave fields and at the same time to control their sampling rate. In the first stage, a wave
field is recovered at an intermediate distance. In the second stage, the previously recovered
wave field is propagated to the desired final distance generating the final wave field. These
methods can control the sampling rate of the wave field to be recovered, independently of
the used wavelength. The main problem in that approach is that two computations of the
FresT are required.
sampling rate of all the recovered wave fields to a reference sampling rate. The methods
proposed in [16-19] can only increase the sampling rate of the fields to be recovered with
respect to the one obtained by the FresT directly applied to the digital hologram.
Additionally, it may not be necessary to recover the complete wave field, but only a
region of interest. Hence, in many practical applications, the use of the classical methods
can be inefficient, [2, 6, 13].
In this chapter, we propose a formulation of the FresT to control the sampling rate as
well as to control the recovery region of the wave field to be recovered. First, a general
formulation of the Fresnel transform is described. Afterwards, the use of the sampling
theory and mathematical shifting to introduce sampling and recovery region control in the
FresT is proposed. The main advantage of the proposed formulation, in contrast with the
methods presented in [16-20], is that the sampling rate can be decreased or increased
through the computation of a single 2D DFT. In the last section, special considerations
about the implementation of the proposed reformulation of the Fresnel transform are described.
2.2. Fresnel transform (FresT)
In this section, a general formulation of the Fresnel transform as used in digital
holography (DH) is described. An object wave field can be estimated from a recorded
digital hologram and a reference wave by using the FresT. To retrieve the wave field, the
FresT is applied to the array Eh[m,n], where Eh[m,n]=h[m,n]×E[m,n]. The sign × represents element-wise multiplication. h[m,n] represents a digital hologram and
E[m,n] describes the reference wave, which is characterized by its wavelength λ and inclination. The parameters of the digital hologram are given by the characteristics of the
recording medium, that is, the electronic sensor. The hologram has a size in number of
elements of Nx×Ny. The distance between two pixels of the hologram is ∆x in the x-axis and
∆y in the y-axis, so the physical size of the hologram is (Nx∆x)×(Ny∆y)=Lx×Ly. Given
Eh[m,n], the FresT can be used to compute the wave field propagated to a certain distance d
from the hologram plane, in the ξ-η plane. The propagated wave field has a size in number of elements of Nξ×Nη. The distance between two consecutive pixels (sampling step) in the
propagated wave field is ∆ξ in the ξ-axis and ∆η in the η-axis.
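The FresT is given by
$$
\begin{aligned}
\Gamma[k,l] ={}& \frac{j}{\lambda d}\exp\!\left(j\frac{2\pi}{\lambda}d\right)
\exp\!\left\{j\frac{\pi}{\lambda d}\left[\left(k-\frac{N_\xi}{2}\right)^{2}\Delta\xi^{2}+\left(l-\frac{N_\eta}{2}\right)^{2}\Delta\eta^{2}\right]\right\}
\exp\!\left[j\pi\left(\frac{kN_x}{N_\xi}+\frac{lN_y}{N_\eta}\right)\right] \\
&\times\sum_{m=0}^{N_x-1}\sum_{n=0}^{N_y-1} Eh[m,n]\,
\exp\!\left\{j\frac{\pi}{\lambda d}\left[\left(m-\frac{N_x}{2}\right)^{2}\Delta x^{2}+\left(n-\frac{N_y}{2}\right)^{2}\Delta y^{2}\right]\right\} \\
&\times\exp\!\left[j\pi\left(m-\frac{N_x}{2}+n-\frac{N_y}{2}\right)\right]
\exp\!\left[-j2\pi\left(\frac{km}{N_\xi}+\frac{ln}{N_\eta}\right)\right].
\end{aligned}
\tag{2.1}
$$
Eq. (2.1) can be implemented with the two-dimensional discrete Fourier transform (2D
DFT) as: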
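$$
\Gamma[k,l] = p[k,l]\times\mathrm{DFT}\{f[m,n]\}, \tag{2.2}
$$
where
$$
f[m,n] = Eh[m,n]\,
\exp\!\left\{j\frac{\pi}{\lambda d}\left[\left(m-\frac{N_x}{2}\right)^{2}\Delta x^{2}+\left(n-\frac{N_y}{2}\right)^{2}\Delta y^{2}\right]\right\}
\exp\!\left[j\pi\left(m-\frac{N_x}{2}+n-\frac{N_y}{2}\right)\right];
$$
$$
p[k,l] = \frac{j}{\lambda d}\exp\!\left(j\frac{2\pi}{\lambda}d\right)
\exp\!\left\{j\frac{\pi}{\lambda d}\left[\left(k-\frac{N_\xi}{2}\right)^{2}\Delta\xi^{2}+\left(l-\frac{N_\eta}{2}\right)^{2}\Delta\eta^{2}\right]\right\}
\exp\!\left[j\pi\left(\frac{kN_x}{N_\xi}+\frac{lN_y}{N_\eta}\right)\right].
$$
In f [m,n], the complex exponential exp[jπ(m-Nx/2+n-Ny/2)] is a shift factor that is used
to perform shifts of Nξ/2 in the k-axis and of Nη/2 in the l-axis in the wave field to be
recovered.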
The complex exponential exp[jπ(kNx/Nξ+lNy/Nη)] in p[k,l] is generally omitted in the
classical version of the FresT [5, 6, 13]. Even with such omission, the required wave field
can be correctly retrieved if Nx=2cNξ and Ny=2cNη, where c is a positive integer. If Nx=Nξ
and Ny=Nη, the phase of the recovered wave fields would present a phase offset of π rad.
The size of the required 2D DFT restricts the sampling step, and then the sampling rate,
of the wave field to be recovered. In the direct implementation of the FresT, the size of the
used 2D DFT is equal to that of the digital hologram, Nx=Nξ and Ny=Nη. And according to
[15], the sampling step is given by
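$$
\Delta\xi = \frac{\lambda d}{N_\xi\Delta x} = \frac{\lambda d}{N_x\Delta x}; \qquad
\Delta\eta = \frac{\lambda d}{N_\eta\Delta y} = \frac{\lambda d}{N_y\Delta y}. \tag{2.3}
$$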
In Eq. (2.3), it can be observed that if Nx=Nξ and Ny=Nη while λd varies, the sampling step
cannot be controlled. According to Eq. (2.3), the sampling step depends on the wavelength,
the recovery distance and the physical size of the hologram. In the next section, a method to
control the sampling step of the wave fields to be recovered is proposed. The proposed
method can deal with holograms of any size.
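The dependence stated by Eq. (2.3) can be illustrated with a minimal sketch. The function name and the sensor values (1024 pixels, 6.4 µm pitch, 532 nm wavelength) are assumptions chosen only for illustration; they are not taken from an experiment in this work.

```python
def direct_sampling_step(wavelength, distance, n_pixels, pixel_pitch):
    """Sampling step of Eq. (2.3): lambda*d / (N * pitch), all lengths in meters."""
    return wavelength * distance / (n_pixels * pixel_pitch)


# Illustrative (assumed) parameters: 1024 pixels, 6.4 um pitch, 532 nm light.
WAVELENGTH = 532e-9
PITCH = 6.4e-6
N_PIXELS = 1024

for d in (0.1, 0.2, 0.4):
    step = direct_sampling_step(WAVELENGTH, d, N_PIXELS, PITCH)
    # The step grows linearly with the recovery distance d.
    print(f"d = {d:.1f} m -> sampling step = {step * 1e6:.2f} um")
```

With the DFT size fixed to the hologram size, doubling the recovery distance doubles the sampling step, which is the lack of control that the next section addresses.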
2.3. Using the sampling theory of the Fourier transform to
introduce sampling control in the FresT
From the work presented in [16], it can be stated that the sampling steps of the wave
field to be recovered can be modified by changing the size of the array to be transformed
with the 2D DFT. The number of elements in the new array is given by
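$$
N_\xi = \mathrm{round}\!\left(\frac{\lambda d}{\Delta\xi\,\Delta x}\right); \qquad
N_\eta = \mathrm{round}\!\left(\frac{\lambda d}{\Delta\eta\,\Delta y}\right). \tag{2.4}
$$
In Eq. (2.4), the sampling steps ∆ξ and ∆η are defined for the ξ-axis and η-axis,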
respectively. The round operator is needed since Nξ and Nη should be integers. If Nξ>Nx
and/or Nη>Ny, the array to be transformed is a zero-padded hologram. In practice, when
[15]. This is true for rough objects since they spread their information all over the
recording medium when the hologram is generated. But for smooth objects, the truncation
of the hologram can eliminate important information in such a way that the wave field
cannot be recovered. In such cases, another alternative should be proposed to increase the
sampling step.
In this section, the sampling theory of the Fourier transform is extended to the context
of the Fresnel transform but in the recovery domain. In this case, the sampling theory is
used to reduce the sampling rate and consequently increase the sampling step in the wave
fields to be recovered. The sampling theory points out that aliasing can be introduced in the
input sequence to down-sample the Fourier transform (FT). In this work, aliasing is
intentionally introduced in the digital hologram in order to reduce the sampling rate of the
wave fields to be recovered. The classical aliasing operator is described, and its major
disadvantage of changing the sampling steps only by integer factors is pointed
out. Then, a general aliasing operator is proposed. The new aliasing operator can be used to
compute any number of samples of the FT. Furthermore, the use of the proposed aliasing
operator with the Fresnel transform is described. Finally, it is shown how to use the
zero-padding operator and the proposed aliasing operator to control the sampling rate of the
wave fields to be recovered.
2.3.1. Aliasing Operator
For the sake of brevity, the analysis in this section is carried out for the one-dimensional
case. Here Nsamp defines the number of required samples (the length) of the Fourier
transform and Ninp defines the number of samples of the input sequence. In Eq. (2.1), it is
considered that Nsamp=Ninp, and in the FresT with zero padding that Nsamp>Ninp. When
Nsamp<Ninp, the DFT can be applied over a new input vector obtained from the superposition
of shifted replicas of the original input sequence [30]. The process of adding superimposed
shifted replicas of a sequence is generally known as aliasing. According to the aliasing
theorem, an aliasing in the space domain produces a down-sampling in the Fourier domain.
The aliasing operator of Eq. (2.5) applied to a sequence of length Ninp is used to
generate a new sequence of length Nsamp, where Nsamp=Ninp/K for a positive integer value of
K. In such a case, it is only possible to decrease the number of samples in integer factors. These Nsamp samples in the Fourier domain are computed with the DFT of the new
sequence.
In order to deduce a general aliasing operator, the following relation is used, [30]:
$$
\mathrm{DFT}(g[m]) = \sum_{m=0}^{N_{samp}-1} g[m]\exp\!\left(-j\frac{2\pi km}{N_{samp}}\right)
= \mathrm{FT}(f[m])\Big|_{u=2\pi k/N_{samp}}, \tag{2.6}
$$
where
$$
g[m] = \sum_{r=-\infty}^{\infty} f[m+rN_{samp}]. \tag{2.7}
$$
Eq. (2.6) relates the DFT to the continuous Fourier transform (FT) of a
discrete input sequence f [m] of length Ninp. Actually, in Eq. (2.6) and Eq. (2.7), f [m] is
considered as an infinite-length sequence where only the elements in the range from 0 to
Ninp-1 can be different from zero. The relation in Eq. (2.6) points out that the DFT can be used
to compute equally spaced samples of the FT; these samples occur at u=2πk/Nsamp with k=0,1,2,…,Nsamp-1. Nevertheless, the DFT should be applied to a new sequence g[m] that
is generated with Eq. (2.7) from the original input sequence f [m]. Eq. (2.7) generates an infinite periodic sequence, but according to Eq. (2.6) only one period of g[m] is required. Then, in this work g[m] is used as a vector of length Nsamp for m=0,1,2,…,Nsamp-1. Next,
Eq. (2.7) is re-expressed for the case Nsamp<Ninp.
In Eq. (2.7), the sum of an infinite number of shifted replicas of f [m] (f [m+rNsamp]) is
performed over the interval 0≤m≤Nsamp-1; the shifts are performed in multiples of Nsamp. If Nsamp<Ninp, several shifted replicas of f [m] contribute to the summation of Eq.
(2.7). Only the elements of f [m] with indexes between 0 and Ninp-1 contribute to g[m].
Then, it can be assured that f [m+rNsamp] contributes to g[m] only when 0 ≤ m + rNsamp ≤ Ninp - 1,
that is, –m/Nsamp ≤ r ≤ (Ninp-1-m)/Nsamp. Since 0 ≤ m ≤ Nsamp-1 and r is an integer, the bounds of
r are
$$
0 \le r \le \left\lfloor \frac{N_{inp}-1}{N_{samp}} - \frac{m}{N_{samp}} \right\rfloor, \tag{2.8}
$$
where ⌊·⌋ denotes the largest integer not greater than the argument. If the interval indicated
in Eq. (2.8) is satisfied, f [m] can be used as a finite sequence (as in practice it is) without the risk of indexing out of its size. Then, g[m] can be computed as
$$
g[m] = \sum_{r=0}^{\left\lfloor \frac{N_{inp}-1}{N_{samp}} - \frac{m}{N_{samp}} \right\rfloor} f[m+rN_{samp}]. \tag{2.9}
$$
The upper bound of the summation in Eq. (2.9) depends on m. When m=0, the upper bound takes its maximum value; and when m=Nsamp-1, the upper bound takes its
minimum value. These two values differ by only one unit. More generally, the maximum
value is obtained when the fraction m/Nsamp does not surpass the fractional part of (Ninp
-1)/Nsamp, that is,
$$
\frac{N_{inp}-1}{N_{samp}} - \left\lfloor \frac{N_{inp}-1}{N_{samp}} \right\rfloor \ge \frac{m}{N_{samp}}. \tag{2.10}
$$
Hence, the maximum value of the upper bound of the summation occurs when
$$
m \le (N_{inp}-1) - N_{samp}\left\lfloor \frac{N_{inp}-1}{N_{samp}} \right\rfloor
= (N_{inp}-1)\ \mathrm{modulo}\ N_{samp} = ((N_{inp}-1))_{N_{samp}}. \tag{2.11}
$$
Conversely, when m>((Ninp-1))Nsamp, the upper bound takes its minimum value. Thus,
the computation of g[m] can be carried out by
$$
g[m] =
\begin{cases}
\displaystyle\sum_{r=0}^{\left\lfloor \frac{N_{inp}-1}{N_{samp}} \right\rfloor} f[m+rN_{samp}], & \text{for } 0 \le m \le ((N_{inp}-1))_{N_{samp}}, \\[3ex]
\displaystyle\sum_{r=0}^{\left\lfloor \frac{N_{inp}-1}{N_{samp}} \right\rfloor - 1} f[m+rN_{samp}], & \text{for } ((N_{inp}-1))_{N_{samp}} < m \le N_{samp}-1.
\end{cases}
\tag{2.12}
$$
Eq. (2.12) can be viewed as a general aliasing operator since Ninp and Nsamp can take any
integer values that fulfill Nsamp<Ninp.
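The operator of Eq. (2.12) can be sketched in a few lines. The following is a minimal illustration in plain Python, not the implementation used in this work; the function names `alias`, `dft` and `ft_sample` are assumptions, and the DFT is evaluated by its direct definition for clarity rather than speed.

```python
import cmath


def alias(f, n_samp):
    """General aliasing operator of Eq. (2.12), valid for 0 < n_samp < len(f)."""
    n_inp = len(f)
    if not 0 < n_samp < n_inp:
        raise ValueError("Eq. (2.12) assumes 0 < Nsamp < Ninp")
    # quotient = floor((Ninp-1)/Nsamp), remainder = (Ninp-1) modulo Nsamp
    quotient, remainder = divmod(n_inp - 1, n_samp)
    g = []
    for m in range(n_samp):
        upper = quotient if m <= remainder else quotient - 1
        g.append(sum(f[m + r * n_samp] for r in range(upper + 1)))
    return g


def dft(g):
    """Direct evaluation of the DFT of g (same sign convention as Eq. (2.6))."""
    n = len(g)
    return [sum(v * cmath.exp(-2j * cmath.pi * k * m / n) for m, v in enumerate(g))
            for k in range(n)]


def ft_sample(f, u):
    """Fourier transform of the discrete sequence f evaluated at frequency u."""
    return sum(v * cmath.exp(-1j * u * m) for m, v in enumerate(f))


f = [1, 2, 3, 4, 4, 4, 3, 2, 1]   # Ninp = 9
g = alias(f, 7)                   # Nsamp = 7, not a divisor of Ninp
spectrum = dft(g)
errors = [abs(spectrum[k] - ft_sample(f, 2 * cmath.pi * k / 7)) for k in range(7)]
print(g, max(errors))
```

For this input, the elements f[7] and f[8] are folded onto g[0] and g[1], and the DFT of g agrees with the samples of the FT at u=2πk/7 up to rounding error, as Eq. (2.6) states.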
2.3.2. Changing the number of samples to be computed of the Fourier
transform
According to the previous discussion, the number of samples in the Fourier transform
can be changed by using a new input sequence instead of the original one. This new
sequence to be transformed is denoted as g[m]. The computation of g[m] depends on the relation between the number of required samples Nsamp and the number of samples Ninp of
the original input sequence. The different equations to generate g[m] are: if Nsamp=Ninp,
$$
g[m] = f[m], \quad \text{for } 0 \le m \le N_{samp}-1; \tag{2.13}
$$
if Nsamp > Ninp,
$$
g[m] =
\begin{cases}
f[m], & \text{for } 0 \le m \le N_{inp}-1, \\
0, & \text{for } N_{inp} \le m \le N_{samp}-1;
\end{cases}
\tag{2.14}
$$
and finally, if Nsamp < Ninp, Eq. (2.12) is used.
Fig. 2.1 presents the generation of g[m] for the three different cases.
Figure 2.1. Different cases to generate g[m]. (a) if Nsamp=Ninp; (b) if Nsamp>Ninp; (c) if Nsamp<Ninp.
When Nsamp=Ninp, the DFT is applied directly to the input vector, since g[m]=f[m]; this is the
usual way to use the DFT.
When Nsamp>Ninp, the DFT is applied to a new vector that is formed with the input
sequence followed by Nsamp-Ninp zeros, as indicated in Eq. (2.14). As
described in [30], a zero padding in the space domain increases the number of samples in
the Fourier domain to be computed.
When Nsamp<Ninp, the DFT is applied to a new input sequence obtained with the
aliasing operator applied to the original input sequence. In Eq. (2.12) the values of Nsamp
and Ninp can be any positive integers with Nsamp<Ninp. Then, Eq. (2.12) can be considered as a form of
aliasing in the space domain that produces a decrease in the sampling rate of the Fourier
domain or, equivalently, an increase in the sampling step. In fact, when Ninp/Nsamp is an
integer, the proposed Eq. (2.12) can be expressed as the well-known aliasing operator in Eq.
(2.5). Then, it can be claimed that Eq. (2.12) is a general expression for the aliasing
operator.
The samples obtained for each of the cases can be related and presented in the same plot
if the indexes of each resulting vector are normalized to 2π. If a very large zero padding is
performed to the input vector, the plot of the resulting samples looks like the complete
continuous Fourier transform of the original input sequence. In Fig. 2.2, we present the
samples obtained in four examples (when Nsamp>>Ninp, Nsamp=Ninp, Nsamp>Ninp and Nsamp<Ninp) using the proposed method, as well as the values obtained when the input
sequence is truncated (Nsamp<Ninp).
Figure 2.2. Samples of the continuous Fourier transform for the input sequence
f[m]=[1,2,3,4,4,4,3,2,1] when g[m] is used with Nsamp=512, Nsamp=9, Nsamp=13 and Nsamp=7; and values obtained when the input sequence is truncated to a length Nsamp=7.
Fig. 2.2 shows that the values obtained with Eq. (2.12), Eq. (2.13) and Eq. (2.14)
correspond to samples of the continuous Fourier transform of the original input sequence.
When the DFT is computed using a truncated input sequence, the obtained values do not
exactly correspond to samples of the continuous Fourier transform of the original input
sequence. This is expected, since every nonzero element in the original input
sequence contributes to every computed sample of the continuous Fourier transform.
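The comparison above can be reproduced numerically with a short sketch. The helper `make_g` is an assumption introduced here: it accumulates each input element at index m modulo Nsamp, which yields the same g[m] as Eq. (2.13), Eq. (2.14) or Eq. (2.12) depending on how Nsamp compares with Ninp. The remaining names are also illustrative.

```python
import cmath


def make_g(f, n_samp):
    """Build g[m] for any n_samp: identity, zero padding or aliasing."""
    g = [0] * n_samp
    for m, value in enumerate(f):
        g[m % n_samp] += value   # folding index; no folding occurs if n_samp >= len(f)
    return g


def dft(g):
    """Direct evaluation of the DFT of g."""
    n = len(g)
    return [sum(v * cmath.exp(-2j * cmath.pi * k * m / n) for m, v in enumerate(g))
            for k in range(n)]


def ft_sample(f, u):
    """Fourier transform of the discrete sequence f evaluated at frequency u."""
    return sum(v * cmath.exp(-1j * u * m) for m, v in enumerate(f))


def max_error(values, f):
    """Largest deviation between DFT values and FT samples at u = 2*pi*k/len(values)."""
    n = len(values)
    return max(abs(values[k] - ft_sample(f, 2 * cmath.pi * k / n)) for k in range(n))


f = [1, 2, 3, 4, 4, 4, 3, 2, 1]          # sequence used in Fig. 2.2
for n_samp in (13, 9, 7):
    print(n_samp, max_error(dft(make_g(f, n_samp)), f))
print("truncated to 7:", max_error(dft(f[:7]), f))
```

The three values obtained through g[m] deviate from the FT samples only by rounding error, whereas the truncated sequence deviates noticeably (for example, its k=0 value is 21 while the FT at u=0 is 24, the sum of all nine elements).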
2.3.3. Changing the number of samples of the 2D Fourier transform to be
computed
The zero-padding operator and the proposed aliasing operator described in the previous
subsections are used to sample the 2D Fourier transform by using the separability property.
First, one of Eq. (2.12), Eq. (2.13) or Eq. (2.14) is applied to each row of the input matrix;
the resulting array is called the intermediate matrix. Then, one of Eq. (2.12), Eq. (2.13) or Eq.
(2.14) is applied to each column of the intermediate matrix; the resulting array from this
step is the new input matrix. The elements of the 2D DFT computed with the new input
matrix correspond to the required samples of the Fourier transform of the original input
matrix.
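The row-then-column procedure can be sketched as follows. This is a minimal illustration under the assumption that matrices are stored as lists of rows; `resize_1d` is a hypothetical helper that reproduces Eq. (2.12), Eq. (2.13) or Eq. (2.14) by accumulating elements at index m modulo the target length.

```python
import cmath


def resize_1d(f, n_samp):
    """Identity, zero padding or aliasing of a 1D sequence to length n_samp."""
    g = [0] * n_samp
    for m, value in enumerate(f):
        g[m % n_samp] += value
    return g


def resize_2d(a, n_rows, n_cols):
    """Apply the 1D operator to each row, then to each column."""
    intermediate = [resize_1d(row, n_cols) for row in a]
    columns = [resize_1d(list(col), n_rows) for col in zip(*intermediate)]
    return [list(row) for row in zip(*columns)]   # back to a list of rows


def dft2(a):
    """Direct evaluation of the 2D DFT of a."""
    rows, cols = len(a), len(a[0])
    return [[sum(a[m][n] * cmath.exp(-2j * cmath.pi * (k * m / rows + l * n / cols))
                 for m in range(rows) for n in range(cols))
             for l in range(cols)] for k in range(rows)]


def ft2_sample(a, u, v):
    """2D Fourier transform of the discrete array a at frequencies (u, v)."""
    return sum(value * cmath.exp(-1j * (u * m + v * n))
               for m, row in enumerate(a) for n, value in enumerate(row))


a = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]   # 3 x 4 input
new_input = resize_2d(a, 2, 6)                       # alias 3 -> 2 rows, pad 4 -> 6 columns
spectrum = dft2(new_input)
error = max(abs(spectrum[k][l] - ft2_sample(a, 2 * cmath.pi * k / 2, 2 * cmath.pi * l / 6))
            for k in range(2) for l in range(6))
print(new_input, error)
```

In this example one axis is aliased and the other is zero-padded, which shows that the two operators can be mixed freely across axes.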
2.3.4. Fresnel transform with sampling control (FresTSC)
In order to control the sampling rate in the wave fields to be recovered at various
recovery distances and/or with different wavelengths, the values of ∆ξ and ∆η have to be
specified. The inverse of the sampling step ∆ξ is recognized as the sampling rate, rξ'=1/∆ξ,
over the ξ-axis, and the inverse of ∆η is recognized as the sampling rate, rη'=1/∆η, over the η-axis. The apostrophe in the parameters rξ' and rη' means that their values can be selected.
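Using the sampling rates, Eq. (2.4) results in
$$
N_\xi = \mathrm{round}\!\left(\frac{r_\xi'\,\lambda d}{\Delta x}\right) \quad\text{and}\quad
N_\eta = \mathrm{round}\!\left(\frac{r_\eta'\,\lambda d}{\Delta y}\right). \tag{2.15}
$$
Since Nξ and Nη are integers, the sampling rates can only be set in increments of
$$
\Delta r_\xi = \frac{\Delta x}{\lambda d} \quad\text{and}\quad
\Delta r_\eta = \frac{\Delta y}{\lambda d}. \tag{2.16}
$$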
Actually, the sampling rates that can be achieved are given by rξ=s∆x/λd and rη=s∆y/λd, where s is a positive integer. Since the value of λd is much larger than the values
of ∆x and ∆y, the values of ∆rξ and ∆rη are very small in comparison with the sampling
rate of the digital hologram (1/∆x and 1/∆y). For instance, if ∆x=∆y=6.4 µm, λ=532 nm and d=200 mm, then ∆rξ=∆rη=0.0601504 samples/mm while 1/∆x=1/∆y=156.250 samples/mm.
Since rξ is approximately equal to rξ' and rη is approximately equal to rη', the values of Nξ
and Nη obtained by Eq. (2.15) are considered as the required values to achieve the desired
sampling rates rξ' and rη'. An alternative to the round operator for choosing the
values of Nξ and Nη is proposed in section 2.5.3.
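The numbers in this example can be checked with a minimal sketch of Eq. (2.15). The function names and the desired rate of 50 samples/mm are assumptions for illustration; note also that Python's built-in `round` resolves exact ties toward the even integer, which may differ from the round operator intended in Eq. (2.15) only in that corner case.

```python
def dft_size_for_rate(rate, wavelength, distance, pixel_pitch):
    """Eq. (2.15): DFT length needed for a desired sampling rate (samples/m)."""
    return round(rate * wavelength * distance / pixel_pitch)


def rate_increment(wavelength, distance, pixel_pitch):
    """Smallest change of the sampling rate, pitch / (lambda * d), in samples/m."""
    return pixel_pitch / (wavelength * distance)


wavelength, distance, pitch = 532e-9, 0.2, 6.4e-6   # values of the example in the text
desired_rate = 50e3                                  # assumed: 50 samples/mm

n_xi = dft_size_for_rate(desired_rate, wavelength, distance, pitch)
increment = rate_increment(wavelength, distance, pitch)
achieved_rate = n_xi * increment

print("increment (samples/mm):", increment / 1e3)
print("N_xi:", n_xi)
print("achieved rate (samples/mm):", achieved_rate / 1e3)
```

The achieved rate differs from the desired one by at most half an increment, which is why the rounded sizes are accepted as the required values.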
The values of Nξ and Nη are the number of equally-spaced samples of the FresT that
have to be computed in each axis to get the specified sampling rate. These samples are
computed with the 2D DFT of a new input matrix g[m, n] generated from f [m, n] (Eq. (2.2)) with the procedure described in section 2.3.3. To compute these equally-spaced
samples, the FresT is implemented as
$$
\Gamma[k,l] = p[k,l]\times\mathrm{DFT}\{g[m,n]\}. \tag{2.17}
$$
We call Eq. (2.17) the FresT with sampling control (FresTSC). In the next section, we
show how f [m,n] and p[k,l] should be modified to provide control of the recovery region of the required wave fields when the FresTSC is used.
2.4. Introducing control of the recovery region in the Fresnel
transform.
The wave field recovery from a digital hologram is carried out according to the
scheme shown in Fig. 2.3.
Figure 2.3. Schematic diagram representing the wave field retrieval
from a hologram.
The digital hologram is placed on the x-y plane, centered at the origin, as can be observed in Fig. 2.3. The region to be recovered at a distance d, also shown in Fig. 2.3, is located in the ξ-η plane. This region is a window in the propagated wave field and is defined by its size and position. The region to be recovered has a size in
number of elements of Nξ'×Nη'. The sampling step in the recovery region is given by ∆ξ in
the ξ-axis and by ∆η in the η-axis. The sampling step is controlled indirectly by defining the
required sampling rates, rξ'=1/∆ξ in the ξ-axis and rη'=1/∆η in the η-axis. Then, the
physical size of the region to be recovered is (Nξ'∆ξ)×(Nη'∆η)=(Nξ'/rξ')×(Nη'/rη')=Lξ×Lη.
Furthermore, the position of this region is defined by the distance from the origin of the ξ-η
plane to the center of the region to be recovered; this distance is Sξ' in the ξ-axis
and Sη' in the η-axis. In the proposed implementation of the FresT, the parameters of
the region to be recovered can be adjusted depending on the application. In this work, the
parameters of the proposed FresT that can be adjusted are identified by an apostrophe.
In Eq. (2.2), a shift of Nξ/2 elements in the k-axis and of Nη/2 in the l-axis of the wave
field to be recovered was introduced. A shift of -Sξ'/∆ξ elements in the k-axis and of -Sη'/∆η
elements in the l-axis is required to place the region to be recovered at the first elements of the final transformed matrix. Then, the required shifts are given by:
$$
\begin{aligned}
Shift_k &= \frac{N_\xi'}{2} - \frac{S_\xi'}{\Delta\xi} = \frac{N_\xi'}{2} - \frac{S_\xi' N_\xi \Delta x}{\lambda d} && \text{for axis } k, \\
Shift_l &= \frac{N_\eta'}{2} - \frac{S_\eta'}{\Delta\eta} = \frac{N_\eta'}{2} - \frac{S_\eta' N_\eta \Delta y}{\lambda d} && \text{for axis } l.
\end{aligned}
\tag{2.18}
$$
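If a shift of Nξ/2 and Nη/2 is introduced by multiplying π by (m-Nx/2+n-Ny/2) in the
argument of the shifting complex exponential in Eq. (2.2), the shift given by Eq. (2.18) can
be introduced by multiplying
$$
2\pi\left(\frac{N_\xi'}{2N_\xi} - \frac{S_\xi'\Delta x}{\lambda d}\right) \text{ by } \left(m-\frac{N_x}{2}\right), \text{ and }\;
2\pi\left(\frac{N_\eta'}{2N_\eta} - \frac{S_\eta'\Delta y}{\lambda d}\right) \text{ by } \left(n-\frac{N_y}{2}\right). \tag{2.19}
$$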
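f [m, n] is modified when the shifts in Eq. (2.19) are introduced. Thus, instead of f [m,n], a new function fshift[m,n] is used, which is defined as
$$
\begin{aligned}
f_{shift}[m,n] ={}& Eh[m,n]\,
\exp\!\left\{j\frac{\pi}{\lambda d}\left[\left(m-\frac{N_x}{2}\right)^{2}\Delta x^{2}+\left(n-\frac{N_y}{2}\right)^{2}\Delta y^{2}\right]\right\} \\
&\times\exp\!\left\{j2\pi\left[\left(\frac{N_\xi'}{2N_\xi}-\frac{S_\xi'\Delta x}{\lambda d}\right)\left(m-\frac{N_x}{2}\right)
+\left(\frac{N_\eta'}{2N_\eta}-\frac{S_\eta'\Delta y}{\lambda d}\right)\left(n-\frac{N_y}{2}\right)\right]\right\}.
\end{aligned}
\tag{2.20}
$$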
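Due to the introduced shift, the symmetry of the square factor of p[k,l] should also be modified. Thus, a new function is defined to be used instead of p[k,l]. Furthermore, the relations ∆ξ=λd/Nξ∆x and ∆η=λd/Nη∆y are used in the definition of the new function. This
function is defined as
$$
\begin{aligned}
p_{shift}[k,l] ={}& \frac{j}{\lambda d}\exp\!\left(j\frac{2\pi}{\lambda}d\right)
\exp\!\left\{j\pi\lambda d\left[\frac{1}{N_\xi^{2}\Delta x^{2}}\left(k-\frac{N_\xi'}{2}+\frac{S_\xi' N_\xi\Delta x}{\lambda d}\right)^{2}
+\frac{1}{N_\eta^{2}\Delta y^{2}}\left(l-\frac{N_\eta'}{2}+\frac{S_\eta' N_\eta\Delta y}{\lambda d}\right)^{2}\right]\right\} \\
&\times\exp\!\left[j\pi\left(\frac{kN_x}{N_\xi}+\frac{lN_y}{N_\eta}\right)\right].
\end{aligned}
\tag{2.21}
$$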
The required values of Nξ and Nη to achieve the desired sampling rates rξ' and rη' are
computed with Eq. (2.15). The required samples are computed with the 2D DFT of a new
input matrix gshift[m,n] generated from fshift[m,n] with the procedure described in section
2.3.3. To compute the required recovery region with sampling control, the FresT is
implemented as
$$
\Gamma[k,l] = p_{shift}[k,l]\times\mathrm{DFT}\{g_{shift}[m,n]\}. \tag{2.22}
$$
We call Eq. (2.22) the FresT with sampling and recovery region control (FresTSRC). In
the next section, we show how the FresTSRC can be implemented efficiently.
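The shifting mechanism behind Eq. (2.18) to Eq. (2.20) can be illustrated in one dimension. The sketch below is a simplified assumption-based example, not the FresTSRC itself: it omits the centering term (m-Nx/2) and the quadratic phase factors, and only shows that modulating the input by exp(j2π·Shift·m/N) places a window centered at a non-integer position S'/∆ξ on the first N' outputs of the DFT. All names and values are hypothetical.

```python
import cmath


def ft_sample(f, u):
    """Fourier transform of the discrete sequence f evaluated at frequency u."""
    return sum(v * cmath.exp(-1j * u * m) for m, v in enumerate(f))


def shifted_window(f, n, shift, n_out):
    """First n_out outputs of the length-n DFT of f[m] * exp(j*2*pi*shift*m/n)."""
    outputs = []
    for k in range(n_out):
        acc = 0
        for m, value in enumerate(f):
            modulated = value * cmath.exp(2j * cmath.pi * shift * m / n)
            acc += modulated * cmath.exp(-2j * cmath.pi * k * m / n)
        outputs.append(acc)
    return outputs


f = [1, 2, 3, 4, 4, 4, 3, 2, 1]
n = 16            # DFT length (plays the role of N_xi)
n_out = 4         # window length (plays the role of N_xi')
center = 5.3      # assumed window center in samples (plays the role of S_xi'/delta_xi)

shift = n_out / 2 - center        # 1D analogue of Eq. (2.18)
window = shifted_window(f, n, shift, n_out)

# Output k corresponds to the (possibly fractional) frequency index k - shift.
error = max(abs(window[k] - ft_sample(f, 2 * cmath.pi * (k - shift) / n))
            for k in range(n_out))
print(shift, error)
```

Output index k=N'/2 then corresponds to the frequency index S'/∆ξ, so the window is centered on the requested position even when that position does not fall on the original sampling grid.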
2.5. Implementation of the FresTSRC.
The 2D DFT required to compute the FresTSRC has a size of Nξ×Nη, while the input
array has a size of Nx×Ny. If Nx<<Nξ and/or Ny<<Nη, the matrix to be transformed contains
a great number of zeros because of the performed zero-padding. Additionally, the size of
the region of interest in the complete wave field is Nξ'×Nη'. If Nξ'<Nξ and/or Nη'<Nη,
several outputs of the 2D DFT are not required. The
computations with zeros and the computation of non-required outputs reduce the efficiency of
the 2D DFT.
The direct computation of fshift[m,n] and pshift[k,l] requires the evaluation of a number of
exponential are independent among them, so they can be factorized and the number of
evaluations of the complex exponential can be reduced.
The execution time of the computation of the 2D DFT depends on the size of the matrix
to be processed. Here, the use of fast Fourier transforms (FFTs) is recommended to compute the
2D DFT. If the length on each axis of the matrix to be transformed is highly composite, the
computation of the 2D DFT with an FFT can require a short execution time.
In this section, it is shown that the number of computations with zeros and of non-required
outputs can be reduced. Moreover, reformulations of the functions fshift[m,n] and pshift[k,l]
are proposed, such that the number of evaluations of the complex exponentials is reduced to
O(Nx+Ny+Nξ'+Nη'). Finally, the introduction of flexibility in the sampling rate for certain
applications is proposed. This flexibility can reduce the execution time
required to compute the FresTSRC.
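The reduction in the number of exponential evaluations relies on the fact that an exponential whose argument is a sum of a term in m and a term in n is a product of two one-dimensional factors. The sketch below illustrates this for a quadratic phase factor of the kind that appears in fshift[m,n]; the function names, the sign convention and the numerical values are assumptions for illustration only.

```python
import cmath


def quadratic_phase_direct(nx, ny, a, b):
    """exp[j(a(m-nx/2)^2 + b(n-ny/2)^2)] with one exponential per element (nx*ny)."""
    return [[cmath.exp(1j * (a * (m - nx / 2) ** 2 + b * (n - ny / 2) ** 2))
             for n in range(ny)] for m in range(nx)]


def quadratic_phase_factored(nx, ny, a, b):
    """Same matrix built from nx + ny exponentials and nx*ny multiplications."""
    along_m = [cmath.exp(1j * a * (m - nx / 2) ** 2) for m in range(nx)]
    along_n = [cmath.exp(1j * b * (n - ny / 2) ** 2) for n in range(ny)]
    return [[fm * fn for fn in along_n] for fm in along_m]


# Assumed values: a = pi * dx^2 / (lambda * d) with dx = 6.4 um, lambda = 532 nm, d = 0.2 m.
a = b = cmath.pi * (6.4e-6) ** 2 / (532e-9 * 0.2)
direct = quadratic_phase_direct(8, 6, a, b)
factored = quadratic_phase_factored(8, 6, a, b)
difference = max(abs(direct[m][n] - factored[m][n]) for m in range(8) for n in range(6))
print(difference)
```

The same factorization applies to the linear shift terms and to the factors of pshift[k,l], which is consistent with the O(Nx+Ny+Nξ'+Nη') count stated above.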
2.5.1. Elimination of the non required 1D DFTs
The 2D DFT required by the FresTSRC is computed using the separability property.
It can be computed with one-dimensional DFTs, first applied to the columns and then to
the resulting rows. The array to be transformed can contain several columns with
only zeros; the transforms of these columns are discarded. Thus, if Nη≥Ny, the DFTs are carried out on
columns 0 to Ny-1 of fshift[m,n]. Otherwise, if Nη<Ny, the DFTs are carried out on
columns 0 to Nη-1 of gshift[m,n]. Additionally, the output matrix can contain more rows than
those required by the recovery region; therefore, the computation of the non-required rows is
avoided. The conditions Nξ'≤Nξ and Nη'≤Nη need to be adopted since the maximum size of
the wave field is Nξ×Nη. Thus, since Nξ'≤Nξ, the DFTs are carried out on rows 0 to Nξ'-1.
The region of interest is obtained in the elements 0≤k≤Nξ'-1 and 0≤l≤Nη'-1 of the final matrix.
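The elimination of non-required 1D DFTs can be sketched for the zero-padding case (Nξ≥Nx and Nη≥Ny). The example below is an assumption-based illustration that evaluates each DFT output directly from its definition; it is not the pruned FFT of chapter 3, and the function names are hypothetical.

```python
import cmath


def partial_dft(x, length, n_out):
    """First n_out outputs of the DFT of size `length` of x, implicitly zero-padded."""
    return [sum(v * cmath.exp(-2j * cmath.pi * k * m / length) for m, v in enumerate(x))
            for k in range(n_out)]


def pruned_dft2(a, n_xi, n_eta, n_xi_out, n_eta_out):
    """2D DFT of size n_xi x n_eta restricted to the first n_xi_out x n_eta_out outputs."""
    ny = len(a[0])
    # Column transforms: only the Ny nonzero columns, only the required outputs k.
    columns = [partial_dft([row[n] for row in a], n_xi, n_xi_out) for n in range(ny)]
    # Row transforms: only the required rows k, only the required outputs l.
    return [partial_dft([columns[n][k] for n in range(ny)], n_eta, n_eta_out)
            for k in range(n_xi_out)]


def full_dft2(a, n_xi, n_eta):
    """Reference: complete 2D DFT of the explicitly zero-padded matrix."""
    padded = [[a[m][n] if m < len(a) and n < len(a[0]) else 0 for n in range(n_eta)]
              for m in range(n_xi)]
    return [[sum(padded[m][n] * cmath.exp(-2j * cmath.pi * (k * m / n_xi + l * n / n_eta))
                 for m in range(n_xi) for n in range(n_eta))
             for l in range(n_eta)] for k in range(n_xi)]


a = [[1, 2], [3, 4], [5, 6]]                 # Nx = 3, Ny = 2
region = pruned_dft2(a, 8, 6, 3, 2)          # 8 x 6 DFT, 3 x 2 region of interest
reference = full_dft2(a, 8, 6)
error = max(abs(region[k][l] - reference[k][l]) for k in range(3) for l in range(2))
print(error)
```

In this sketch only Ny column transforms and Nξ' row transforms are evaluated, instead of the Nη column transforms and Nξ row transforms of the complete 2D DFT.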
The “Fastest Fourier Transform in the West” (FFTW) is used for the computation of the
required DFTs. The FFTW is described in [33, 34]. The reported complexity for the FFTW
is O(N log N). The FFTW computes the N possible output elements of a DFT of length
N and requires an input sequence also of length N. Nevertheless, the DFTs required to implement the FresTSRC do not have such characteristics. In chapter 3, a pruning method