Optical configuration acceleration on a new optically reconfigurable gate array very large scale integration using a negative logic implementation

Retsu Moriwaki; Minoru Watanabe

doi:10.1364/AO.52.001939

1. Introduction

Recently, the use of field programmable gate arrays (FPGAs) [1–3] has been increasing drastically, extending from small-scale production to mass production, while the initial cost of custom very large scale integrations (VLSIs) continues to rise [4–7]. In such applications, demand for high-speed reconfiguration has been increasing so that many functions can be switched on a gate array quickly. However, current FPGAs cannot be reconfigured quickly because of their serial configuration mechanism.

Recently, high-performance optical FPGAs that exploit optical capabilities have been developed [8–10]. Whereas the mainstream applications of such optical FPGA studies are optical communications and optical sensing, an optically reconfigurable gate array (ORGA) uses an optical technique to configure its programmable gate array [11–14]. An ORGA consists of a holographic memory, a laser array, and a programmable gate array VLSI, as shown in Fig. 1. The programmable gate array is a fine-grained gate array similar to that of FPGAs. However, the gate array on an ORGA is reconfigured using information stored in a holographic memory. Therefore, an ORGA can achieve nanosecond-order high-speed dynamic reconfiguration using a parallel configuration mechanism.

Fig. 1. Basic construction of an ORGA.


The optical configuration speed can be accelerated easily by increasing the laser power. However, in that case, the optical power dissipation also increases along with the reconfiguration frequency.

This paper therefore proposes an optical configuration acceleration method for ORGAs that uses a negative logic implementation and requires no increase of laser power. The method exploits a holographic memory property by which the reading speed of a holographic memory is inversely proportional to the number of bright bits included in a configuration context. The proposed method decreases the number of bright bits and thereby increases the reconfiguration frequency. In this paper, a newly fabricated ORGA-VLSI that supports the optical configuration acceleration method is evaluated. The results show that the reconfiguration frequency of the proposed method is 1.97 times higher than that of conventional ORGA architectures with no increase of laser power.

2. Holographic Memory Property

Since an ORGA receives optical configuration contexts on photodiodes and because the photodiode response frequency is typically proportional to the light intensity, high-intensity light increases the optical reconfiguration frequency. Therefore, the easiest way to increase reconfiguration frequency is to use high-powered lasers. However, the use of such high-powered lasers increases the unit’s power consumption and might necessitate the use of a cooling system. Such a cooling system would greatly increase the package size. For that reason, high-powered lasers should be used only as a last resort.

In an ORGA, a holographic memory generates optical configuration contexts. A holographic memory has a property by which the light intensity of each bit diffracted from a holographic memory is inversely proportional to the number of bright bits included in a configuration context. Here, a bright bit means a binary state high. A dark area means a binary state low for a programmable gate array. Figure 2 presents an experimentally obtained result of light intensity of each bit diffracted from a holographic memory. Figures 2(a), 2(b), and 2(c), respectively, show examples of a configuration context including one bright bit, 12 bright bits, and 23 bright bits. In Figs. 2(d)–2(f), it can be confirmed from the experimentally obtained results that the light intensity of each bright bit of a context including fewer bright bits is higher than that of a context including a greater number of bright bits. Consequently, with fewer bright bits, the reconfiguration frequency can be radically increased. The proposed optical configuration acceleration method using a negative logic implementation reduces the number of such bright bits included in a configuration context.
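The inverse proportionality described above can be sketched as a small idealized model. The function names and the unit total power are hypothetical, and the model assumes that the diffracted power is shared equally among bright bits and that the photodiode response time scales inversely with the light intensity; real diffraction efficiency and noise are ignored.

```python
def per_bit_intensity(total_diffracted_power: float, bright_bits: int) -> float:
    """Idealized intensity of one bright bit when power is shared equally."""
    if bright_bits <= 0:
        raise ValueError("a context needs at least one bright bit")
    return total_diffracted_power / bright_bits


def relative_configuration_time(bright_bits: int, reference_bright_bits: int = 1) -> float:
    """Configuration time relative to a reference context.

    Assumes response time is inversely proportional to per-bit intensity,
    so time grows linearly with the number of bright bits.
    """
    if bright_bits <= 0 or reference_bright_bits <= 0:
        raise ValueError("bright-bit counts must be positive")
    return bright_bits / reference_bright_bits


if __name__ == "__main__":
    # Bright-bit counts of the three contexts shown in Fig. 2.
    for n in (1, 12, 23):
        print(n, per_bit_intensity(1.0, n), relative_configuration_time(n))
```

Under these assumptions, a context with 23 bright bits would take 23 times longer to configure than a context with a single bright bit at the same laser power.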

Fig. 2. Light intensity of each bit of optical configuration context diffracted from a holographic memory. Panels (a) and (d) show the light intensity of a configuration context including a single bright bit. Panels (b) and (e) show the light intensity of a configuration context including 12 bright bits. Panels (c) and (f) show the light intensity of a configuration context including 23 bright bits.


3. Negative Logic Implementation

A. Method

The programmable gate array on an ORGA is dynamically reconfigured while its gate array operation is executed. Under such dynamic reconfiguration, the look-up tables (LUTs) of the programmable gate array are reconfigured frequently, whereas the switching matrices are rarely reconfigured. Therefore, the proposed optical configuration acceleration method focuses on the reconfiguration of LUTs on a programmable gate array. An example is shown in Fig. 3: an implementation of a three-input OR circuit on a LUT. In this case, seven bright bits are necessary to program the three-input OR circuit onto a LUT, as shown in the truth table of the OR circuit. However, if a NOR circuit followed by a NOT circuit can be implemented on the programmable gate array as an equivalent circuit, then the number of bright bits inside the LUT can be decreased from seven to one. Such an implementation of a NOR and a NOT operation becomes possible if a selector placed behind the LUT can choose between the inverted and noninverted output of the LUT. Even though one bright bit is necessary for the selector, the total number of bright bits is decreased. This is the proposed optical configuration acceleration method using a negative logic implementation.
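The choice between the normal and the inverted encoding can be sketched as follows. This is a minimal illustration, not the authors' tool flow: the function name `encode_lut` is hypothetical, and the sketch assumes that selecting the inverted output costs exactly one extra bright bit, as stated above.

```python
from typing import List, Tuple


def encode_lut(truth_table: List[int]) -> Tuple[List[int], int, int]:
    """Return (LUT bits, inversion-selector bit, total bright bits).

    The inverted table is used only when it needs fewer bright bits,
    counting one additional bright bit for the inversion selector.
    """
    ones = sum(truth_table)
    zeros = len(truth_table) - ones
    if zeros + 1 < ones:
        inverted = [1 - bit for bit in truth_table]
        return inverted, 1, zeros + 1
    return list(truth_table), 0, ones


# Truth table of a three-input OR circuit, ordered 000, 001, ..., 111.
or3 = [int(a or b or c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]

if __name__ == "__main__":
    print(sum(or3))         # 7 bright bits for the direct OR encoding
    print(encode_lut(or3))  # ([1, 0, 0, 0, 0, 0, 0, 0], 1, 2)
```

For the three-input OR circuit, the LUT itself holds one bright bit (the NOR table) and the selector adds one more, so two bright bits replace seven.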

Fig. 3. Negative logic implementation.


B. Theoretical Analysis

This strategy treats the configuration data used for a LUT. The following discussion uses bit-length $N$ , which represents the number of input signals of a LUT. The number $N$ always takes a value from 4 to 6. Here, when the output of a LUT is inverted, it is assumed that the hardware requires one additional bright bit. Assuming that configuration contexts are given continuously to an ORGA-VLSI and that they include all possible patterns uniformly, the average reduction ratio of ones, which correspond to laser irradiation, is calculated by counting the 1 bits of all possible vectors, as in the following equation:

κ_{new} = \frac{\sum_{r = 1}^{[\frac{N}{2}]} r \cdot {{}_{N}C}_{r} + \sum_{r = [\frac{N}{2} + 1]}^{N} (N - r + 1) \cdot {{}_{N}C}_{r}}{\sum_{r = 1}^{N} r \cdot {{}_{N}C}_{r}} .

In that equation, ${{}_{N}C}_{r}$ denotes the number of combinations of $r$ items chosen from $N$ . Using the estimation presented above, the reduction ratios for a four-input LUT, five-input LUT, and six-input LUT are estimated as 0.391, 0.413, and 0.401, respectively. The average number of bright bits of a four-input LUT, five-input LUT, and six-input LUT can be decreased to 7, 14, and 38 bits per LUT, respectively. A reduction effect of 19.8%–21.8% can be expected.
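The expression for κ_{new} can be evaluated directly with exact rational arithmetic. The sketch below reads the square brackets in the summation limits as the floor function, which is an assumption, and the function name `kappa_new` is hypothetical. The values it returns follow from the formula as written and are not taken from the paper.

```python
from fractions import Fraction
from math import comb


def kappa_new(n: int) -> Fraction:
    """Evaluate the ratio literally, with [N/2] read as floor(N/2)."""
    half = n // 2
    # Vectors with at most floor(N/2) ones keep their r bright bits.
    kept = sum(r * comb(n, r) for r in range(1, half + 1))
    # Vectors with more ones are inverted: (N - r) bright bits plus one selector bit.
    inverted = sum((n - r + 1) * comb(n, r) for r in range(half + 1, n + 1))
    # Bright bits without inversion, summed over all vectors.
    total = sum(r * comb(n, r) for r in range(1, n + 1))
    return Fraction(kept + inverted, total)


if __name__ == "__main__":
    for n in (4, 5, 6):
        ratio = kappa_new(n)
        print(n, ratio, float(ratio))
```

Evaluated this way, the expression gives 25/32 for $N = 4$ and 77/96 for $N = 6$ , so one minus the ratio is roughly 21.9% and 19.8%, respectively.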

4. Design of a 0.18 μm Complementary Metal Oxide Semiconductor Process Optically Reconfigurable Gate Array VLSI

A. Gate Array Design

A new ORGA-VLSI that supports optical configuration acceleration with a negative logic implementation was fabricated using a 0.18 μm standard complementary metal oxide semiconductor (CMOS) process technology. A chip photograph is shown in Fig. 4. Transmission gate cells were designed as custom cells having the same height as a standard cell, and photodiode cells were designed with double height. The gate array design was synthesized by combining these custom cells with standard cells. Logic synthesis was performed using Design Compiler (Synopsys Inc.), and place and route for the synthesized gate array design was executed using Astro (Synopsys Inc.). Finally, the ORGA-VLSI was fabricated at Rohm’s manufacturing facility. Table 1 presents the specifications. The core and I/O cells were designed for supply voltages of 1.8 and 3.3 V, respectively. Photodiodes were constructed between an $N$ -Well and a $P$ -substrate. The computer-aided design (CAD) layout of a photodiode cell is shown in Fig. 5. The junction area of a photodiode was designed as $4.40 μm \times 4.45 μm$ . The photodiode aperture size is $6.08 μm \times 6.08 μm$ . The photodiode cells are arranged at 30.08 μm horizontal intervals and at 30.24 μm vertical intervals. This design incorporates 10,322 photodiodes. The gate array of the ORGA-VLSI uses an island style. The basic functionality of the gate array is fundamentally identical to that of currently available FPGAs. In all, 80 optically reconfigurable logic blocks (ORLBs), 90 optically reconfigurable switching matrices (ORSMs), and eight optically reconfigurable I/O blocks (ORIOBs), each of which includes four programmable I/O bits, were implemented in the gate array VLSI. The ORLBs, ORSMs, and ORIOBs are programmable through 69, 49, and 49 optical connections, respectively. The total gate count is 2720.
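The photodiode count can be cross-checked against the block counts given above. This small sketch assumes that every block uses exactly the stated number of optical connections and that no photodiodes exist outside the three block types; the dictionary and function names are hypothetical.

```python
# (number of blocks, optical connections per block), as stated in the text.
BLOCKS = {
    "ORLB": (80, 69),
    "ORSM": (90, 49),
    "ORIOB": (8, 49),
}


def total_optical_connections(blocks: dict) -> int:
    """Sum of blocks times optical connections per block."""
    return sum(count * per_block for count, per_block in blocks.values())


if __name__ == "__main__":
    print(total_optical_connections(BLOCKS))  # 10322
```

The sum, 5520 + 4410 + 392, agrees with the 10,322 photodiodes reported for the design.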

Fig. 4. Photograph of a 0.18 μm CMOS process ORGA-VLSI.


Table 1. Specifications of the ORGA-VLSI


Fig. 5. CAD layout of a photodiode cell.


B. Optically Reconfigurable Logic Block

A block diagram and CAD layout of an ORLB are presented in Fig. 6. Each ORLB consists of two four-input one-output LUTs, 12 selectors, eight tri-state buffers, and two delay-type flip-flops with a reset function. The input signals from the wiring channel, which arrive through switching matrices and wiring channels from ORIOBs or ORLBs, are transferred to the LUTs through eight selectors. The LUTs are used for implementing Boolean functions. The outputs of a LUT and of the delay-type flip-flop connected to that LUT are connected to a selector. A combinational circuit or a sequential circuit can be chosen by changing the state of the selector, as in FPGAs. In addition, this VLSI supports the proposed optical configuration acceleration method using a negative logic implementation. For that purpose, the output of the selector that chooses between a sequential circuit and a combinational circuit is connected to an additional selector, which chooses whether the output is inverted or not. This selector is also programmed optically. The additional circuit area is only $209.7 {μm}^{2}$ . Since the cell size of a logic block is $288.00 μm \times 192.48 μm$ , the additional area is less than 0.38%, so the increase in implementation area is slight. Finally, the output of the selector is connected to the wiring channel again through eight tri-state buffers. In all, 69 photodiodes are used for programming an ORLB. The ORLB is fully reconfigurable in parallel. The ORLB design is based on a standard cell design, except for the custom transmission gate cells and photodiode cells.
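The area overhead quoted above follows from the two figures given for the additional circuit and for the logic block cell. The sketch below only repeats that arithmetic; the constant and function names are hypothetical.

```python
ADDITIONAL_AREA_UM2 = 209.7   # additional circuit area
CELL_WIDTH_UM = 288.00        # logic block cell width
CELL_HEIGHT_UM = 192.48       # logic block cell height


def area_overhead_percent(extra_um2: float, width_um: float, height_um: float) -> float:
    """Additional area as a percentage of the logic block cell area."""
    return 100.0 * extra_um2 / (width_um * height_um)


if __name__ == "__main__":
    overhead = area_overhead_percent(ADDITIONAL_AREA_UM2, CELL_WIDTH_UM, CELL_HEIGHT_UM)
    print(f"{overhead:.3f} %")  # about 0.378 %
```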

Fig. 6. Block diagram and CAD layout of an ORLB.


C. Optically Reconfigurable Switching Matrix

A block diagram and CAD layout of the ORSM are portrayed in Fig. 7. Its basic construction is the same as that used by Xilinx Inc. Four-directional switching matrices with 48 transmission gates were implemented in the gate array. Each transmission gate can be regarded as a bidirectional switch. A photodiode is connected to each transmission gate. It controls whether the transmission gate is closed or not. The four-direction switching matrices can be programmed using 49 optical connections. The cell size is $197.76 μm \times 192.48 μm$ . Such an ORSM was designed using custom cells of photodiode cells and transmission gate cells, except for some buffers.

Fig. 7. Block diagram and CAD layout of an ORSM.


5. Experimental System

A. Hologram Calculation Method

Here, a holographic memory is constructed on a liquid crystal spatial light modulator (LC-SLM) as a programmable holographic memory. The holographic memory uses gray-level modulation. The aperture plane of the target lasers, the holographic plane, and the ORGA-VLSI plane are parallel to one another. The laser beam is assumed to be a collimated beam. The reference wave propagates into the holographic plane. The holographic medium comprises rectangular pixels of $δ_{x} \times δ_{y}$ on the $x_{1} - y_{1}$ holographic plane. The pixels are assumed to take analog values. The input object comprises rectangular pixels of $d_{x} \times d_{y}$ on the $x_{2} - y_{2}$ object plane. The pixels can be modulated to be either on or off. The intensity distribution of a holographic medium is calculable using the following equation:

H (x_{1}, y_{1}) \propto \int_{- \infty}^{\infty} \int_{- \infty}^{\infty} O (x_{2}, y_{2}) \cos \left( \frac{π}{λ Z_{L}} \left[ {(x_{1} - x_{2})}^{2} + {(y_{1} - y_{2})}^{2} \right] \right) d x_{2} d y_{2} .

Therein, $O (x_{2}, y_{2})$ stands for a binary value of a reconfiguration context, $λ$ represents the wavelength, and $Z_{L}$ denotes the distance between the holographic plane and the object plane. The value $H (x_{1}, y_{1})$ is normalized to the range 0–1 using the minimum intensity $H_{\min}$ and maximum intensity $H_{\max}$ , as follows:

H^{'} (x_{1}, y_{1}) = \frac{H (x_{1}, y_{1}) - H_{\min}}{H_{\max} - H_{\min}} .

Finally, the normalized image $H^{'}$ is used for implementing a holographic memory. Other areas on the holographic plane are opaque to the illumination.
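The hologram calculation and normalization can be sketched numerically. The sketch below replaces the double integral with a sum over the centers of the bright object pixels, which is a point-sampling approximation and an assumption; the function name, the grid sizes, and the distance value in the usage example are also hypothetical, whereas the wavelength and pixel pitches are taken from the text.

```python
import math
from typing import List


def compute_hologram(context: List[List[int]], d_x: float, d_y: float,
                     delta_x: float, delta_y: float, n_x: int, n_y: int,
                     wavelength: float, z_l: float) -> List[List[float]]:
    """Return the normalized pattern H' on an n_y by n_x holographic grid."""
    rows, cols = len(context), len(context[0])
    k = math.pi / (wavelength * z_l)
    raw = []
    for j in range(n_y):
        y1 = (j - (n_y - 1) / 2) * delta_y       # hologram pixel center
        line = []
        for i in range(n_x):
            x1 = (i - (n_x - 1) / 2) * delta_x
            acc = 0.0
            for q in range(rows):
                y2 = (q - (rows - 1) / 2) * d_y  # object pixel center
                for p in range(cols):
                    if context[q][p]:            # only bright bits contribute
                        x2 = (p - (cols - 1) / 2) * d_x
                        acc += math.cos(k * ((x1 - x2) ** 2 + (y1 - y2) ** 2))
            line.append(acc)
        raw.append(line)
    h_min = min(min(line) for line in raw)
    h_max = max(max(line) for line in raw)
    if h_max == h_min:                           # flat pattern: avoid dividing by zero
        return [[0.0] * n_x for _ in range(n_y)]
    return [[(v - h_min) / (h_max - h_min) for v in line] for line in raw]


if __name__ == "__main__":
    pattern = compute_hologram(
        context=[[1, 0], [0, 1]],
        d_x=30.08e-6, d_y=30.24e-6,      # photodiode pitch
        delta_x=8.5e-6, delta_y=8.5e-6,  # LC-SLM pixel size
        n_x=16, n_y=16,                  # small grid for illustration only
        wavelength=532e-9,
        z_l=0.1,                         # assumed distance, not given in the text
    )
    print(min(map(min, pattern)), max(map(max, pattern)))
```

A full-size pattern would use the $700 \times 700$ pixel grid of the actual holographic memory, for which a vectorized implementation would be preferable to these nested loops.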

B. Experimental System

Figure 8 presents a block diagram of an ORGA. Figure 9 portrays a photograph of the experimental system. The ORGA was constructed using a 532 nm, 300 mW laser (torus 532; Laser Quantum), an LC-SLM as a holographic memory, and a newly fabricated 0.18 μm CMOS process ORGA-VLSI. The beam from the laser source, the diameter of which is 1.7 mm, is expanded three times to 5.1 mm using two lenses of 50 mm focal length and 150 mm focal length. The expanded beam is incident on a holographic memory on an LC-SLM. The LC-SLM is a projection TV panel (L3D07U-81G00; Seiko Epson Corp.). It is a 90° twisted nematic device with a thin film transistor. The panel consists of $1920 \times 1080$ pixels, each having a size of $8.5 μm \times 8.5 μm$ . The LC-SLM is connected to an evaluation board (L3B07-E60A; Seiko Epson Corp.) with its video input connected to the external display terminal of a personal computer. The LC-SLM is programmed by displaying a holographic memory pattern with 256 gradation levels on the personal computer display. Each holographic memory pattern was designed as $700 \times 700$ pixels, as shown in Figs. 10–13.

Fig. 8. Block diagram of an experimental system.


Fig. 9. Photograph of the experimental system.


Fig. 10. Holographic memory patterns and CCD-captured configuration context patterns of OR circuits.


Fig. 11. Holographic memory patterns and CCD-captured configuration context patterns of comparator circuits. In these circuits, when all values of three-bit or four-bit inputs are the same, the output becomes binary state low. Otherwise, the output becomes binary state high.


Fig. 12. Holographic patterns and CCD-captured configuration context patterns of a comparator and larger than or equal operation circuits with two ports of two bits. In the comparator circuit of (a), when values of two ports are equal, the output becomes binary state high. Otherwise, the output becomes binary state low. In the larger than or equal operation, if one is larger than another or one is equal to another, the output becomes binary state high. Otherwise, the output becomes binary state low.


Fig. 13. Hologram patterns and CCD-captured configuration context patterns of a three-bit down counter.


6. Experimental Results

To evaluate the optical configuration acceleration method using negative logic implementation, seven combinational circuits and one sequential circuit were implemented on the system described above. The combinational circuits were a two-input OR circuit, a three-input OR circuit, a four-input OR circuit, a three-input comparator with three ports of a single bit, a four-input comparator with four ports of a single bit, an equal comparator with two ports of two bits, and a larger than or equal comparator with two ports of two bits. The sequential circuit was a three-bit down counter. Their holographic memory patterns and CCD-captured configuration context patterns are shown in Figs. 10, 11, 12, and 13. The CCD-captured configuration context patterns were programmed onto the new ORGA-VLSI, and the reconfiguration periods were measured. The numbers of bright bits in normal configuration contexts and in negative logic implementation configuration contexts are shown in Table 2. With the negative logic implementation, the average number of bright bits was reduced from 20.125 to 14.5 bits; that is, the proposed method removed 28% of the bright bits. The reconfiguration frequency improvement is shown in Table 3. Because an average of 28% of the bright bits were removed, reconfiguration periods were shortened to an average of about 50% of those of normal configurations. The average reconfiguration time of a conventional ORGA was 129.1 ns, whereas the average configuration time of the negative logic implementation was 65.5 ns. Reconfiguration with the new negative logic implementation is therefore 1.97 times faster than with conventional implementations. Consequently, the optical configuration acceleration method using negative logic implementation is extremely useful for accelerating the optical reconfiguration frequency of ORGAs.
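The averages and ratios quoted above can be recomputed from the per-circuit values in Tables 2 and 3. The sketch below uses only those table values; the variable and function names are hypothetical.

```python
# Per-circuit values in table order: three OR circuits, four comparators, down counter.
normal_bits = [10, 16, 26, 15, 25, 23, 21, 25]
negative_bits = [9, 11, 13, 12, 14, 16, 18, 23]
normal_ns = [35, 94, 188, 78, 180, 147, 131, 180]
negative_ns = [24, 41, 42, 56, 53, 63, 97, 148]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Fraction of bright bits removed, computed from the two averages.
bit_reduction = 1 - mean(negative_bits) / mean(normal_bits)
# Overall speedup, computed as the ratio of the two average configuration times.
speedup = mean(normal_ns) / mean(negative_ns)
# Per-circuit frequency ratios, as listed in the last column of Table 3.
per_circuit = [round(n / m, 2) for n, m in zip(normal_ns, negative_ns)]

if __name__ == "__main__":
    print(mean(normal_bits), mean(negative_bits), round(100 * bit_reduction, 1))
    print(mean(normal_ns), mean(negative_ns), round(speedup, 2))
    print(per_circuit)
```

Computed this way, the 1.97 figure is the ratio of the two average configuration times rather than the mean of the per-circuit frequency ratios.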

Table 2. Numbers of Bright Bits on Configuration Contexts


Table 3. Configuration Times


7. Conclusion

This paper has proposed an optical configuration acceleration method using negative logic implementation. In addition, an improvement of the reconfiguration clock frequency has been demonstrated using a newly fabricated ORGA-VLSI that fully supports a negative logic implementation. The reconfiguration frequency of the proposed method was confirmed experimentally as 1.97 times higher than that of a normal ORGA architecture. The result was achieved with no increase of laser power. Although a selector and a photodiode to program the selector must be added for each logic block, the additional implementation area is less than 0.38%. Consequently, the gate array’s density is nearly equal to that of conventional gate arrays. Therefore, the optical configuration acceleration method using negative logic implementation is extremely useful for accelerating the optical reconfiguration frequency of ORGAs with no power increase.

This research is supported by the Ministry of Internal Affairs and Communications of Japan under the Strategic Information and Communications R&D Promotion Programme (SCOPE). The VLSI chip in this study was fabricated in the chip fabrication program of VLSI Design and Education Center (VDEC), the University of Tokyo in collaboration with Rohm Co. Ltd. and Toppan Printing Co. Ltd.

References

1. Altera Corporation, “Altera unveils 28 nm Stratix V FPGA family,” http://www.altera.com.

2. X. Wu, P. Gopalan, and G. Lara, “Xilinx next generation 28 nm FPGA technology overview,” http://www.xilinx.com.

3. Xilinx Inc., “Xilinx product data sheets,” http://www.xilinx.com.

4. T. Ghani, M. Armstrong, C. Auth, M. Bost, P. Charvat, G. Glass, T. Hoffmann, K. Johnson, C. Kenyon, J. Klaus, B. McIntyre, K. Mistry, A. Murthy, J. Sandford, M. Silberstein, S. Sivakumar, P. Smith, K. Zawadzki, S. Thompson, and M. Bohr, “A 90 nm high volume manufacturing logic technology featuring novel 45 nm gate length strained silicon CMOS transistors,” in IEEE International Electron Devices Meeting (IEEE, 2003), pp. 11.6.1–11.6.3.

5. V. Barral, T. Poiroux, F. Andrieu, C. Buj-Dufournet, O. Faynot, T. Ernst, L. Brevard, C. Fenouillet-Beranger, D. Lafond, J. M. Hartmann, V. Vidal, F. Allain, N. Daval, I. Cayrefourcq, L. Tosti, D. Munteanu, J. L. Autran, and S. Deleonibus, “Strained FDSOI CMOS technology scalability down to 2.5 nm film thickness and 18 nm gate length with a TiN/HfO2 gate stack,” in IEEE International Electron Devices Meeting (IEEE, 2007), pp. 61–64.

6. E. Culurciello and A. G. Andreou, “Capacitive coupling of data and power for 3D silicon-on-insulator VLSI,” in IEEE International Symposium on Circuits and Systems (IEEE, 2005), pp. 4142–4145.

7. R. Hentschke, G. Flach, F. Pinto, and R. Reis, “3D-Vias aware quadratic placement for 3D VLSI circuits,” in IEEE Computer Society Annual Symposium on VLSI (IEEE, 2007), pp. 67–72.

8. P. Mal, P. D. Patel, and F. R. Beyette, “Design and demonstration of a fully integrated multi-technology FPGA: a reconfigurable architecture for photonic and other multi-technology applications,” IEEE Trans. Circuit. Sys. 56, 1182–1191 (2009).

9. J. V. Campenhout, H. V. Marck, J. Depreitere, and J. Dambre, “Optoelectronic FPGA’s,” IEEE J. Sel. Top. Quantum Electron. 5, 306–315 (1999).

10. F. Breyer, S. C. J. Lee, D. Cardenas, S. Randel, and N. Hanik, “Real-time gigabit ethernet transmission over up to 25 m Step-Index Polymer Optical Fibre using LEDs and FPGA based signal processing,” presented at European Conference on Optical Communication, Vienna, Austria, 20–24 September 2009.

11. H. Morita and M. Watanabe, “Microelectromechanical configuration of an optically reconfigurable gate array,” IEEE J. Quantum Electron. 46, 1288–1294 (2010).

12. M. Nakajima and M. Watanabe, “A four-context optically differential reconfigurable gate array,” IEEE/OSA J. Lightw. Technol. 27, 4460–4470 (2009).

13. D. Seto and M. Watanabe, “A dynamic optically reconfigurable gate array—perfect emulation,” IEEE J. Quantum Electron. 44, 493–500 (2008).

14. M. Watanabe and F. Kobayashi, “Dynamic optically reconfigurable gate array,” Jpn. J. Appl. Phys. 45, 3510–3515 (2006).

Technology	0.18 μm Double-Poly 5-Metal CMOS Process
Chip size	$5.0 mm \times 5.0 mm$
Gate array area	$4289 μm \times 2309 μm$
Supply voltage	Core 1.8 V, I/O 3.3 V
Photodiode size	$4.40 μm \times 4.54 μm$
Photodiode response time	$< 5 ns$
Photodiode sensitivity	$2.12 \times 10^{- 14} J$
Distance between photodiodes	Horizontal 30.08 μm, vertical 30.24 μm
Number of photodiodes	10,322
Number of logic blocks	80
Number of switching matrices	90
Number of wires in a routing channel	8
Number of I/O blocks	8 (32 bit)
Gate count	2,720

Normal Configuration		Inversion Configuration	
Circuit Name	Number of Bright Bits	Circuit Name	Number of Bright Bits
2-input OR	10	2-input NOR + INV	9
3-input OR	16	3-input NOR + INV	11
4-input OR	26	4-input NOR + INV	13
3-input comparator	15	3-input inverted comparator + INV	12
4-input comparator	25	4-input inverted comparator + INV	14
Equal comparator	23	Inverted equal comparator + INV	16
Larger than or equal comparator	21	Inverted larger than or equal comparator + INV	18
3-bit down counter	25	3-bit inverted down counter + INV	23
Average	20.125	Average	14.5

Normal Configuration		Inversion Configuration		
Circuit Name	Configuration Time (ns)	Circuit Name	Configuration Time (ns)	Frequency Ratio
2-input OR	35	2-input NOR + INV	24	1.46
3-input OR	94	3-input NOR + INV	41	2.29
4-input OR	188	4-input NOR + INV	42	4.48
3-input comparator	78	3-input inverted comparator + INV	56	1.39
4-input comparator	180	4-input inverted comparator + INV	53	3.40
Equal comparator	147	Inverted equal comparator + INV	63	2.33
Larger than or equal comparator	131	Inverted larger than or equal comparator + INV	97	1.35
3-bit down counter	180	3-bit inverted down counter + INV	148	1.22
Average	129.125	Average	65.5	1.97



Applied Optics