# A 0.68e-rms Random-Noise 121dB Dynamic-Range Sub-pixel architecture CMOS Image Sensor with LED Flicker Mitigation

S. Iida<sup>1</sup>, Y. Sakano<sup>1</sup>, T. Asatsuma<sup>1</sup>, M. Takami<sup>2</sup>, I. Yoshiba<sup>1</sup>, N. Ohba<sup>2</sup>, H. Mizuno<sup>1</sup>, T. Oka<sup>1</sup>, K. Yamaguchi<sup>1</sup>, A. Suzuki<sup>1</sup>, K. Suzuki<sup>1</sup>, M. Yamada<sup>1</sup>, M. Takizawa<sup>1</sup>, Y. Tateshita<sup>1</sup>, and K. Ohno<sup>1</sup>

<sup>1</sup>Sony Semiconductor Solutions, Kanagawa, Japan. <sup>2</sup>Sony Semiconductor Manufacturing, Kumamoto, Japan.

E-mail: Satoko.Iida@sony.com

*Abstract*—This paper reports a CMOS image sensor with a sub-pixel architecture and a pixel pitch of 3 μm. The sensor achieves both an ultra-low random noise of 0.68e-rms and a high dynamic range of 121 dB in a single exposure, and it also realizes LED flicker mitigation.

## I. INTRODUCTION

Recently, real-time sensing has been creating new businesses and driving social change, particularly in the Internet of Things (IoT) and automotive fields. Image sensing is a critical function in these fields: moving objects and obstacles must be perceived accurately, and scenes must be captured with high color reproducibility under all lighting conditions.

For example, people, objects, and other features must be recognized from high-sensitivity, low-noise images, even when they are moving in darkness.

Additionally, the multiple-exposure method, a conventional high dynamic range (HDR) technique [1-2], can cause motion artifacts for moving subjects because its exposures are sampled at different times. These artifacts lead to misrecognition.

Moreover, light-emitting diode (LED) signal lights actually blink, yet they must appear continuously lit in the captured images. Simply extending the exposure time to capture such blinking signals saturates the signal within a short period, so its luminance and color information are lost. Sampling multiple times during the exposure period has been proposed as a method of LED flicker mitigation (LFM) [3]. However, this method leaves a period that is not sampled and therefore cannot completely mitigate the light flicker effect.

We have developed a new image sensor to address these issues. The sensor features a sub-pixel architecture comprising a single large photodiode, a single small photodiode, and an in-pixel floating capacitor.

## II. SENSOR ARCHITECTURE

### A. Sensor Configuration

Fig.1 shows the block diagram of the image sensor. A pixel array of 1920×1200 pixels, read-out circuits (load MOS transistors, column ADCs, DAC), driver circuits (row driver, row decoder), an image signal processor, and other circuits (PLL, regulator, MIPI I/F, CPU, etc.) are all integrated using a 90-nm process.

### B. Pixel Circuit

Fig.2 shows the pixel schematic of the sub-pixel architecture. The circuit employs a single large photodiode (SP1), a single small photodiode (SP2), an in-pixel floating capacitor (FC), and seven transistors. SP1 has a high sensitivity (green) of 36000e-/lx·s, and SP2 has 1/10 of SP1's sensitivity. SP1's linear full-well capacity (FWC) of 10000e- and SP2's linear FWC of 78500e- are attributed to the FC. The seven transistors are the transfer gate of SP1 (TGL), the transfer gate of SP2 (TGS), the floating diffusion gate (FDG), the floating capacitor gate (FCG), the reset transistor (RST), the select transistor (SEL), and the source follower amplifier (AMP). The floating diffusion (FD) is divided into FD1, FD2, and FD3 by FDG and FCG, which serve as switches connecting FD1 with FD2 and FD2 with FD3, respectively. One electrode of the FC is connected to FD3, and its counter electrode is connected to the supply voltage FCVDD.
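As a rough illustration of how the two photodiodes divide the light range, the quoted sensitivities and linear FWCs can be turned into saturation exposures. This is a minimal sketch that assumes a perfectly linear response and ignores dark signal; the function name `saturation_exposure` is introduced here for illustration only.

```python
# Values quoted in the text (green channel).
SP1_SENSITIVITY = 36000.0                # e-/(lx*s)
SP2_SENSITIVITY = SP1_SENSITIVITY / 10   # e-/(lx*s), 1/10 of SP1
SP1_FWC = 10000.0                        # e-, linear full-well capacity
SP2_FWC = 78500.0                        # e-, linear full-well capacity


def saturation_exposure(fwc_e, sensitivity):
    """Light exposure (lx*s) at which the linear full well is reached.

    Assumes an ideal linear photo response (illustrative only).
    """
    return fwc_e / sensitivity


sp1_sat = saturation_exposure(SP1_FWC, SP1_SENSITIVITY)
sp2_sat = saturation_exposure(SP2_FWC, SP2_SENSITIVITY)

print(f"SP1 saturates at {sp1_sat:.3f} lx*s")
print(f"SP2 saturates at {sp2_sat:.2f} lx*s")
print(f"SP2 extends the usable exposure by {sp2_sat / sp1_sat:.1f}x")
```

Under these assumptions, the extension factor is simply the FWC ratio (7.85) multiplied by the sensitivity ratio (10).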

As shown in the pixel top view of Fig.3, each pixel has a large on-chip micro lens (OCL) and a small OCL. SP2's OCL is located in the gap section of SP1's OCL. This arrangement sets the sensitivity ratio of SP1 to SP2 to 10:1.

Fig.4 shows the pixel cross-sectional view corresponding to the dotted line in Fig.3. As shown in Fig.4, deep trench isolations are employed in the silicon substrate to prevent the leakage of electrical charges from SP1 to SP2.

### C. Pixel Read-out Method

Fig.5 shows the pixel driving sequence. The signals from SP1 and SP2 are output serially. Additionally, the electrical charges accumulated in SP1 are converted to a signal voltage in two modes, high conversion gain (HCG) and low conversion gain (LCG), by switching FDG. In this manner, three types of signals are read out in a single exposure. First, the exposure of SP1 and SP2 begins with the reset of SP1, SP2, and FC. Then, LCG reset level 2 and HCG reset level 1 are sampled. Subsequently, HCG signal level 1 is sampled after switching TGL, and LCG signal level 2 is sampled after switching TGL once again. By performing correlated double sampling (CDS) on each pair of reset and signal levels, two signals are read out: SP1H and SP1L. Next, the signal from SP2, SP2L, is read out by delta reset sampling (DRS), in which signal level 3 is sampled first, followed by reset level 3. Because the signal charges are accumulated in FD3, FD3 cannot be reset prior to sampling signal level 3. The drawback of DRS is that kTC noise cannot be removed; however, it can be suppressed by securing sufficient FC capacitance. FCVDD is set lower in the accumulation period than in the read-out period to reduce the fixed pattern noise (FPN) of SP2L.
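The difference between CDS and DRS described above can be illustrated with a small Monte Carlo sketch. It assumes that reset (kTC) noise is the only noise source and works directly in electrons. The values of `ktc_sigma`, `signal`, and `n_frames` are hypothetical and are not measured data from this sensor.

```python
import random
import statistics


def simulate(n_frames=20000, ktc_sigma=5.0, signal=1000.0, seed=0):
    """Return the residual noise (std, in e-) of CDS and DRS read-outs.

    ktc_sigma and signal are illustrative values, not sensor data.
    Only reset (kTC) noise is modeled.
    """
    rng = random.Random(seed)
    cds, drs = [], []
    for _ in range(n_frames):
        # CDS: the reset level is sampled first, and the signal is then
        # added on top of the SAME reset sample, so the offset cancels.
        reset = rng.gauss(0.0, ktc_sigma)
        cds.append((reset + signal) - reset)

        # DRS: the signal level rides on the reset that started the
        # exposure, while the reference is taken after a NEW reset.
        reset_before = rng.gauss(0.0, ktc_sigma)
        reset_after = rng.gauss(0.0, ktc_sigma)
        drs.append((reset_before + signal) - reset_after)
    return statistics.pstdev(cds), statistics.pstdev(drs)


cds_noise, drs_noise = simulate()
print(f"CDS residual noise: {cds_noise:.2f} e-")
print(f"DRS residual noise: {drs_noise:.2f} e-")
```

In this simplified model the CDS residual is essentially zero, whereas the DRS residual approaches √2 times `ktc_sigma`, because the two reset samples are uncorrelated.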

Fig.6 shows the potential diagram of SP1 for the SP1 driving sequence in Fig.5. The cross section follows the dashed line A-B in Fig.2. At the beginning of the exposure period, switching TGL, FDG, and RST resets the electrical charges of SP1 (Fig.6-a). During the exposure period, FD1 and FD2 are held in reset (Fig.6-b); afterward, LCG reset level 2 is sampled when RST is turned off (Fig.6-c). Subsequently, HCG reset level 1 is sampled when FDG is turned off (Fig.6-d). Then, HCG signal level 1 is sampled after TGL is switched and the electrical charges accumulated in SP1 are transferred to FD1 (Fig.6-e). After FD1 is connected to FD2 by turning FDG on, LCG signal level 2 is sampled when TGL is switched once again and the remaining charges in SP1 are fully transferred to FD1 and FD2 (Fig.6-f). Thus, SP1H and SP1L can be read out from the electrical charges accumulated in SP1.

Fig.7 shows the potential diagram of SP2 for the SP2 driving sequence in Fig.5. The cross section follows the dashed line C-B in Fig.2. At the beginning of the exposure period, switching TGS, FCG, and RST resets the electrical charges of SP2 and FD3 (Fig.7-a). During the exposure period, the electrical charges from SP2 are accumulated in both SP2 and FD3 (Fig.7-b). After FD3 is connected to FD1 and FD2 by turning FCG on, the signal level of SP2 is sampled when TGS is switched and the electrical charges accumulated in SP2 are fully transferred to FD3 (Fig.7-c). RST is then turned on again to reset the electrical charges of FD1, FD2, and FD3, and the reset level of SP2 is sampled when RST is turned off (Fig.7-d). Thus, SP2L can be read out from the electrical charges accumulated in SP2 during the same exposure period as SP1.

## III. SENSOR CHARACTERISTICS

### A. Sensor Characteristics

Fig.8 shows the photo responses for SP1H, SP1L, and SP2L. The linear FWCs of SP1 and SP2 are 10000e- and 78500e-, respectively. As mentioned earlier, the sensitivity ratio of SP1 to SP2 is 10:1. Multiplying SP2L by a gain of 10 makes its linear FWC equivalent to 785000e-, thus achieving a dynamic range of 121 dB.
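The 121 dB figure can be checked with the usual definition of dynamic range, 20·log10(full well / noise floor). The sketch below assumes that the quoted random noise of 0.68e-rms is the noise floor; the helper name `dynamic_range_db` is introduced here for illustration.

```python
import math


def dynamic_range_db(full_well_e, noise_floor_e):
    """Dynamic range in dB from full-well capacity and noise floor (both in e-)."""
    return 20.0 * math.log10(full_well_e / noise_floor_e)


SP1_FWC = 10000.0      # e-, from the text
SP2_FWC = 78500.0      # e-, from the text
SP2_GAIN = 10.0        # sensitivity ratio SP1:SP2, from the text
RANDOM_NOISE = 0.68    # e-rms, assumed here to be the noise floor

equivalent_fwc = SP2_FWC * SP2_GAIN  # 785000 e- in SP1-equivalent units
dr_combined = dynamic_range_db(equivalent_fwc, RANDOM_NOISE)
dr_sp1_only = dynamic_range_db(SP1_FWC, RANDOM_NOISE)

print(f"SP1 only:        {dr_sp1_only:.1f} dB")
print(f"SP1 + SP2 (x10): {dr_combined:.1f} dB")
```

Under this assumption the combined value evaluates to roughly 121 dB, consistent with the reported dynamic range, while SP1 alone would reach only about 83 dB.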

Fig.9 shows the FCVDD dependency of SP2's linear FWC and the FPN of SP2L. The linear FWC is determined by the capacitance of the FC and the difference in electric potential between the reset level of SP2 and the threshold voltage of FCG. The FPN, on the other hand, is caused by the variation of the FD3 dark current. This dark current originates from the electric field between FD3 and its P-well, which is generated by FCVDD. Therefore, FCVDD is set lower in the accumulation period than in the read-out period to reduce the FD3 dark current. Although the linear FWC and the dark current are in a trade-off relationship, a linear FWC of 78500e- can be secured.
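The dependence of the linear FWC on capacitance and potential difference can be sketched with the ideal capacitor relation N = C·ΔV/q. Neither the FC capacitance nor the usable voltage swing is stated in the text, so the swings below are hypothetical, and the resulting capacitances are only order-of-magnitude illustrations of the total storage capacitance such a model would require.

```python
Q_E = 1.602176634e-19  # elementary charge in coulombs


def linear_fwc(cap_farads, delta_v):
    """Electrons stored on an ideal capacitor for a voltage swing: N = C*dV/q."""
    return cap_farads * delta_v / Q_E


def required_capacitance(fwc_e, delta_v):
    """Capacitance (F) needed to hold fwc_e electrons over a swing delta_v (V)."""
    return fwc_e * Q_E / delta_v


# Hypothetical voltage swings; the actual swing is not given in the text.
for dv in (0.5, 1.0, 1.5):
    cap = required_capacitance(78500, dv)
    print(f"dV = {dv:.1f} V -> C = {cap * 1e15:.1f} fF")
```

The sketch also shows the trade-off qualitatively: for a fixed capacitance, a smaller potential difference gives a proportionally smaller linear FWC.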

Fig.10 shows the SNR curve of the synthesized signal, which consists of three types of signals: SP1H is used for low-light scenes, SP1L for medium-light scenes, and SP2L for high-light scenes. An SNR of 20 dB can be maintained at the transition from SP1L to SP2L, even at 60°C.
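The selection among the three signals can be sketched as a simple piecewise rule. This is not the authors' synthesis algorithm, which is not described in the text; in particular, the hard switching and the `SP1H_MAX` threshold are hypothetical, and the read-outs are modeled as noiseless values in electrons.

```python
SP1_FWC = 10000.0   # e-, linear full well of SP1 (from the text)
SP2_FWC = 78500.0   # e-, linear full well of SP2 (from the text)
SP2_GAIN = 10.0     # sensitivity ratio SP1:SP2 (from the text)
SP1H_MAX = 200.0    # e-, HYPOTHETICAL upper limit of the HCG read-out


def read_pixel(sp1_electrons):
    """Idealized noiseless read-outs for a given SP1-equivalent signal."""
    sp1h = min(sp1_electrons, SP1H_MAX)
    sp1l = min(sp1_electrons, SP1_FWC)
    sp2l = min(sp1_electrons / SP2_GAIN, SP2_FWC)
    return sp1h, sp1l, sp2l


def synthesize(sp1h, sp1l, sp2l):
    """Pick one read-out and return it in SP1-equivalent electrons."""
    if sp1h < SP1H_MAX:
        return sp1h            # low light: high conversion gain
    if sp1l < SP1_FWC:
        return sp1l            # medium light: low conversion gain
    return sp2l * SP2_GAIN     # high light: small photodiode, scaled by 10


for level in (50.0, 5000.0, 400000.0):
    print(level, "->", synthesize(*read_pixel(level)))
```

A practical implementation would more likely blend the signals around each transition rather than switch abruptly, to avoid visible steps where the SNR changes.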

### B. Synthesized Image

Fig.11 shows a synthesized image of a high dynamic range scene. Fig.11 (a) shows an image of SP1L with an exposure time of only 0.3 ms; the shadows inside the tunnel are crushed. Fig.11 (b) shows an image of SP1L with an exposure time of 11 ms; the highlights outside the tunnel are blown out. As shown in Fig.11 (c), synthesizing SP1H, SP1L, and SP2L accurately captures both the dark interior of the tunnel and the outside landscape. Additionally, blinking LED signal lights and car headlights are captured without appearing to be switched off.

Fig.12 shows a synthesized image captured in a very low light scene of 0.1 lx. Fig.12 (a) shows the image with SP1H applied, and Fig.12 (b) shows the image without SP1H. Comparing the outer wall sections shows that, without SP1H, the wall is covered in noise and its pattern and white lines cannot be distinguished. Furthermore, in the sections containing people, their figures and contours can be clearly observed when SP1H is applied.

Table 1 shows the performance of the sensor developed in this study, and Table 2 compares its characteristics with those reported previously [4]–[6]. Well-balanced characteristics have been achieved using a CMOS image sensor with a sub-pixel architecture.

## IV. CONCLUSIONS

We have developed a new image sensor using a sub-pixel architecture with a pixel pitch of 3 μm that achieves both an ultra-low random noise of 0.68e-rms and a high dynamic range of 121 dB in a single exposure, while also realizing LFM.

## ACKNOWLEDGMENT

The authors would like to thank M. Torii, T. Machida, Y. Matsumura, and T. Toyofuku for their support and advice regarding the sensor pixel design. The authors also appreciate the support from the members of Sony Semiconductor Solutions and Sony Semiconductor Manufacturing.

## REFERENCES

- [1] Trygve Willassen et al., "A 1280x1080 4.2µm Split-diode Pixel HDR Sensor in 110nm BSI CMOS Process," in IISW 2015.
- [2] Sergey Velichko et al., "140 dB Dynamic Range Sub-electron Noise Floor Image Sensor," in IISW 2017, pp. 294-297.
- [3] Chris Silsby et al., "A 1.2MP 1/3" CMOS Image Sensor with Light Flicker Mitigation," in IISW 2015.
- [4] M. Takase et al., "An over 120 dB wide-dynamic-range 3.0 μm pixel image sensor with in-pixel capacitor of 41.7 fF/um2 and high reliability enabled by BEOL 3D capacitor process," in Symp. VLSI 2018, pp. 71-72.
- [5] K. Nishimura et al., "An Over 120dB Simultaneous-Capture Wide-Dynamic-Range 1.6e- Ultra-Low-Reset-Noise Organic-Photoconductive-Film CMOS Image Sensor," in ISSCC 2016, pp. 110-112.
- [6] Johannes Solhusvik et al., "A 1392x976 2.8µm 120dB CIS with Per-Pixel Controlled Conversion Gain," in IISW 2017, pp. 298-301.







Fig. 6 SP1 potential diagram (A-B in Fig.2)

Fig. 7 SP2 potential diagram (C-B in Fig.2)



Fig. 8 Photo response of each signal



Fig. 10 SNR curve of synthesized signal

Fig. 11 Synthesized image (high dynamic range scene): (b) SP1L, 11 ms; (c) SP1H+SP1L+SP2L, 11 ms

Table 1 Sensor performance

| Parameter                                     | Unit    | Value       |
|-----------------------------------------------|---------|-------------|
| Power Supply                                  | V       | 2.9/1.8/1.2 |
| Process Technology                            | -       | 90nm 4Cu1AL |
| Pixel Array                                   | pixels  | 1920x1200   |
| Pixel Pitch                                   | μm      | 3           |
| SP1L Linear Full-well Capacity                | e-      | 10000       |
| SP2L Linear Full-well Capacity                | e-      | 78500       |
| Sensitivity (Green, 3200K with IR cut filter) | e-/lx·s | 36000       |
| Sensitivity Ratio (SP1:SP2)                   | -       | 10:1        |
| Random Noise                                  | e-rms   | 0.68        |
| Dynamic Range                                 | dB      | 121         |

Table 2 Comparison of sensor characteristics

|                                 | Unit  | This work | VLSI 2018 [4] | ISSCC 2016 [5] | IISW 2017 [6] |
|---------------------------------|-------|-----------|---------------|----------------|---------------|
| Pixel pitch                     | μm    | 3         | 3             | 6              | 2.8           |
| Random noise @RT                | e-rms | 0.68      | 6.2           | 5.4            | 1             |
| Full-well capacity              | e-    | 78500     | 489k          | 600k           | 50000         |
| Dynamic range (single exposure) | dB    | 121       | 121           | 123.8          | 94            |



Fig. 12 Synthesized image (low light scene)