# UNIVERSITÉ DE STRASBOURG



# **ÉCOLE DOCTORALE MSII**

Laboratoire ICube UMR 7375

# THESIS presented by: Rémi BONNARD

defended on: 10 February 2015

to obtain the degree of: Doctor of the Université de Strasbourg

Discipline/Specialty: Mathematics, Information Sciences and Engineering / Optoelectronics

# Burst CMOS Image Sensor with on-chip Analog to Digital Conversion

THESIS supervised by:

Prof. UHRING Wilfried, Professor, Université de Strasbourg / CNRS

**REVIEWERS:**

Prof. MAGNAN Pierre, Head of the Electronics, Optronics and Signal Department, ISAE

Prof. Dr.-Ing. FEY Dietmar, Head of the Computer Science 3 group, Friedrich-Alexander-Universität Erlangen-Nürnberg

#### **OTHER JURY MEMBERS:**

Dr. KOCH Andreas, Scientist, European XFEL GmbH

Dr. DUPRET Antoine, HDR, CEA/Leti

Prof. HEBRARD Luc, Professor, Université de Strasbourg / CNRS



# Rémi BONNARD Burst CMOS Image Sensor with on-chip A/D Conversion



#### Résumé

This work studies what 3D integration technologies bring to ultra-high-speed CMOS imaging. The acquisition speeds considered range from one million to one billion images per second. Beyond about ten thousand images per second, however, conventional image sensor architectures are limited by the bandwidth of the output buffers. To reach higher frame rates, a burst imager architecture is used, in which a sequence of about one hundred images is acquired and stored inside the sensor.

3D integration technologies have attracted strong interest over the past ten years and are regarded as a solution complementary to the work carried out on devices (transistors, passive components) to improve the performance of integrated circuits. We chose a technology in which the integrated circuits are stacked directly on top of one another before packaging (3D-SIC). The interconnection density between the circuits is high enough to allow pixel-level interconnections. 3D integration offers attractive advantages for integrated imaging because it allows the readout electronics to be moved beneath the pixel. It thus maximizes the pixel fill factor while leaving ample room for the signal conditioning circuits. For burst imaging, this technology allows a larger area to be devoted to the memories that store the image sequence, as close as possible to the pixels. It also makes on-chip analog-to-digital conversion of the acquired images possible.

First, we built a model to evaluate the performance of two imager architectures. The first stores the images in analog form and performs the conversion while the memory is read out. This solution offers very high acquisition speeds, up to one billion images per second. However, this storage mode suffers from the drawbacks of analog memories, such as sampling noise and a limited data retention time. The analog-memory architecture can acquire and store hundreds of images, depending on the desired dynamic range. The second architecture converts the image into digital data before storing it in memory. The acquisition speed is then limited by the frequency of the analog-to-digital converters to tens of millions of images per second, depending on the conversion resolution. Using digital memories makes it possible to store thousands of images at each acquisition. This second architecture improves the memory depth by a factor of ten compared with the analog-storage architecture. Such an architecture is made possible by 3D integration, whereas analog-storage imagers can be designed without this technology. The rest of our study therefore focused on burst imagers with digital memories.

While modeling the performance of the digital-storage burst imager, it became apparent that the power consumption of the system was very high. To assess the risk of overheating, a thermal model of the imager was built. It confirms the feasibility of such a structure for the acquisition of a single image sequence and defines operating limits for a repeated acquisition mode.

Particular attention was paid to the design of a readout circuit suited to our imager architecture. A readout circuit inspired by "active pixel sensors" and allowing "global shutter" acquisition was designed. In addition, several solutions were considered to increase the sensitivity of the circuit. A pixel based on a direct injection structure was implemented on the test chip. Finally, a pixel without a current source was studied and designed to reduce the power consumption of the sensor.

After fabrication of the test chip, electrical and optical tests were carried out. The photosensitivity of the photodiode and the characteristics of the pixel inspired by the active pixel sensor and of the direct injection pixel were measured. The speed of these two pixels was also characterized by measuring their electronic aperture. These structures were validated for speeds of 1.6 and 5 million images per second, respectively. Finally, the end of this manuscript presents the design of a digital-storage burst imager prototype. This imager consists of a stack of two pixel arrays. The upper pixel circuit contains the photodiodes, the global shutter readout circuit and the comparator of a single-slope converter. The lower pixel circuit consists of the 8-bit counter of the converter and a digital memory.

Keywords: Image Sensor, CMOS Imager, Ultra-High-Speed Imager, Three-Dimensional Integrated Circuit, Burst Imaging, Analog-to-Digital Conversion

## Summary

This work studies the contribution of 3D integration technology to ultra-high-speed CMOS imaging. The acquisition speeds considered here range from one million to one billion images per second. Above ten thousand images per second, however, classical image sensor architectures are limited by the data bandwidth of the output buffers. To reach higher acquisition frequencies, a burst architecture is used, in which a set of about one hundred images is acquired and stored on-chip.

3D integration technologies became popular more than ten years ago and are considered a complementary solution to technological improvements of the devices themselves. We have chosen a technology in which integrated circuits are stacked on top of each other (3D-SIC). The interconnection density between the circuits is high enough to enable interconnections at the pixel level. 3D integration offers significant advantages because it allows the readout electronics to be moved below the pixel. It thus increases the fill factor of the pixel while offering a wide area to the signal processing circuit. For burst imaging, this technology provides more room for the memory dedicated to image storage while keeping it close to the pixel. It also allows an analog-to-digital converter to be implemented on-chip.

First, we have proposed a model to assess the performance of two image sensor architectures. The first one stores the images in analog memory and then performs the conversion while the memory is read. This architecture enables very high frame rates, up to one billion images per second. However, analog storage has drawbacks such as sampling noise and a limited retention time. The analog storage architecture is able to store hundreds of images, depending on the targeted dynamic range. The second architecture converts images into digital data before storing them. The frame rate is then limited by the analog-to-digital conversion to tens of millions of images per second, depending on the conversion resolution. Using digital memories allows thousands of images to be stored at each acquisition. This architecture improves the memory depth by a factor of ten compared to the analog storage architecture. Such an architecture is possible thanks to 3D integration, whereas the analog storage architecture can be implemented without this technology. The remainder of this PhD work has therefore focused on the burst image sensor with digital storage.

During the performance study of the burst image sensor with digital storage, it became apparent that the power consumption of this sensor is very high. To assess the risk of overheating, a thermal model of the sensor has been built. It has confirmed the feasibility of this structure for the acquisition of a single burst of images and has defined operating limits for the multi-burst acquisition mode.

Specific attention has been paid to the design of a readout circuit suited to our architecture. A circuit that enables global shutter acquisition, inspired by the "active pixel sensor" (APS) pixel, has been designed. Moreover, different solutions have been considered to increase the pixel sensitivity. A pixel with a direct injection circuit has been implemented on a test chip. Finally, a pixel without a current source has been studied and designed to reduce the power consumption of the sensor.

After the manufacturing of the test chip, electrical and optical tests have been carried out. The photo-responsivity of the photodiode and the characteristics of the APS-based pixel and the direct injection pixel have been measured. The speed of both pixels has been evaluated through electronic aperture measurements. These structures have been validated for frame rates of 1.6 and 5 million images per second, respectively. Finally, the end of the report describes the design of a digital storage burst image sensor prototype. This image sensor is made of a stack of two pixel arrays. The top-tier pixel contains the photodiode, the global shutter circuit and the comparator of a single-slope ADC. On the bottom-tier pixel, the 8-bit counter of the converter and the digital memory are implemented.

Keywords: Image Sensor, CMOS Imaging, Ultra-High-Speed Imaging, Three-Dimensional Integrated Circuit, Burst Imaging, Analog-to-Digital Conversion

#### International Conference Communications:

- R. Bonnard, F. Guellec, J. Segura, A. Dupret, W. Uhring; "New 3D-integrated burst image sensor architectures with in-situ A/D conversion", DASIP 2013, Cagliari, Italy, pages 215-222, ECSI (Eds.), IEEE, October 2013
- R. Bonnard, J. Segura Puchades, F. Guellec, W. Uhring; "Signal conditioning circuits for 3D-integrated burst image sensors with on-chip A/D conversion", Electronic Imaging 2015, San Francisco, United States of America
- R. Bonnard, M. Garci, J. Kammerer, W. Uhring; "Electrothermal Analysis of 3D Integrated Ultra-fast Image Sensor With Digital Frame Storage", Therminic 2015, Paris, France

## Thesis Summary: Burst CMOS Image Sensor with On-Chip Analog-to-Digital Conversion

#### 1. Introduction

High-speed imaging is a broad field whose cameras record at rates ranging from a thousand to billions of images per second. These cameras are intended for industrial or scientific applications in which fast events must be recorded. The architecture of the image sensor depends on the targeted acquisition speed. For acquisition speeds up to about ten thousand images per second at a resolution of one million pixels, the conventional CMOS image sensor architecture is used. To avoid spatiotemporal distortion of the image, however, a global shutter acquisition is implemented: for each image of the video, the pixels of the whole array are acquired at the same instant, unlike in rolling shutter acquisition. This type of camera is mainly used to record slow-motion footage of sports or wildlife, to record automotive crash-test sequences, or to perform inspection on production lines. In practice, the acquisition speed is limited by the array readout operation and by the throughput of the output electronics of the imager. The current state of the art in continuous high-speed cameras delivers about 25 billion pixels per second at the imager output. For an 8-bit analog-to-digital conversion, this corresponds to a data rate of 25 GB/s. To get around this limitation, a burst imager architecture is used. With this structure, a video is acquired at very high speed and stored in the imager. The stored video is then sent off the imager at conventional speed, which removes the throughput constraint of the output electronics. However, because the video is stored at the heart of the chip, its length is limited and ranges from about ten to a few hundred images.

This type of imager is mainly used in science to study fast phenomena such as mechanical fracture tests, plasma formation and certain combustion processes. Such cameras record at rates that can exceed one million images per second. To reach higher speeds, up to billions of images per second, streak architectures are used. These cameras reach temporal resolutions of a few hundred picoseconds. However, they produce a video whose images consist of a single column of pixels. A two-dimensional image can be reconstructed, depending on the repeatability of the recorded event. This type of camera is used to record phenomena in optics (laser ablation), photochemistry (measurement of fluorescence time constants) or condensed matter physics.
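The output-bandwidth limit quoted above for continuous high-speed cameras can be checked with a few lines of arithmetic. The sketch below uses only the figures given in the text (25 billion pixels per second, 8 bits per pixel, a one-megapixel array); the function names are ours and purely illustrative.

```python
# Figures quoted in the text for a state-of-the-art continuous camera.
PIXEL_RATE = 25e9        # pixels per second at the imager output
BITS_PER_PIXEL = 8       # resolution of the A/D conversion
RESOLUTION = 1_000_000   # pixels per frame (one megapixel)


def output_data_rate_bytes(pixel_rate: float, bits_per_pixel: int) -> float:
    """Raw output data rate in bytes per second."""
    return pixel_rate * bits_per_pixel / 8


def max_continuous_frame_rate(pixel_rate: float, pixels_per_frame: int) -> float:
    """Highest frame rate the output stage can sustain for a given frame size."""
    return pixel_rate / pixels_per_frame


if __name__ == "__main__":
    rate = output_data_rate_bytes(PIXEL_RATE, BITS_PER_PIXEL)
    fps = max_continuous_frame_rate(PIXEL_RATE, RESOLUTION)
    print(f"output data rate: {rate / 1e9:.0f} GB/s")
    print(f"continuous frame rate at 1 Mpixel: {fps:.0f} fps")
```

At one megapixel this output rate supports a few tens of thousands of images per second, which is why a burst architecture is needed beyond that range.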

The objective of this work is to study what three-dimensional (3D) integration technologies bring to high-speed imaging. The study focuses more specifically on *burst* imager architectures in CMOS technology. Following a review of the state of the art in *burst* imagers, two 3D-integrated imager architectures were evaluated and compared. This study led us to select the digital-storage burst architecture in order to maximize the number of images per *burst*. This architecture was modeled to evaluate the temperature rise in different operating modes. Following this high-level approach, pixels were designed to reduce the power consumption and increase the sensitivity of the sensor. A test chip was fabricated and used to validate the different pixels. Finally, a 3D-integrated digital-storage *burst* imager prototype was fabricated as a stack of two circuits.

#### 2. State of the Art

Before presenting the different burst imager implementations, we briefly review the classical photodetectors. The pixels of burst imagers are large (> 30 × 30 µm²). Consequently, the photodetectors compared here measure 40 × 40 µm². The best-known photodetector is the photodiode, which consists of one or more reverse-biased PN junctions. Incoming photons generate electron/hole pairs in the silicon. The electrons move toward the N-doped regions and the holes toward the P-doped regions. Charges generated inside the space-charge region move under the electric field of the junction (high speed). Charges generated outside the space-charge region must first reach it by diffusion (low speed). The performance of the photodiode therefore depends strongly on the type of junction used (depth and doping). To enlarge the collection region (space-charge region and diffusion region), lightly doped junctions should be preferred. The equivalent capacitance of the junction plays an important role in the sensitivity of the imager because it sets the conversion gain of a classical pixel structure (3T pixel). In addition, this capacitance can limit the bandwidth of more advanced pixel structures (e.g. the direct injection pixel). The characteristics of different photodiodes are summarized in the table below:

| Structure     | Peak Responsivity | Bandwidth | Junction Capacitance |
|---------------|-------------------|-----------|----------------------|
| N+/PSub       | 0.1 A/W           | 70 MHz    | 1 pF                 |
| NWell/PSub    | 0.25 A/W          | 70 MHz    | 100 fF               |
| P+/NWell/PSub | 0.45 A/W          | 100 MHz   | 1.3 pF               |
| P+/NWell      | Poor (~0.1 A/W)   | 2 GHz     | 1 pF                 |
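To make the link between junction capacitance and conversion gain concrete, the sketch below applies the usual relation for a pixel that integrates charge directly on the photodiode, CG = q / C, to the capacitances listed in the table. The function name and the dictionary are illustrative; the relation ignores any parasitic capacitance added by the readout transistors, which is an assumption of this sketch.

```python
ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs


def conversion_gain_uv_per_electron(capacitance_farads: float) -> float:
    """Conversion gain CG = q / C, expressed in microvolts per electron."""
    return ELEMENTARY_CHARGE / capacitance_farads * 1e6


# Junction capacitances taken from the table above (40 x 40 um^2 devices).
JUNCTION_CAPACITANCE = {
    "N+/PSub": 1e-12,
    "NWell/PSub": 100e-15,
    "P+/NWell/PSub": 1.3e-12,
    "P+/NWell": 1e-12,
}

if __name__ == "__main__":
    for structure, cap in JUNCTION_CAPACITANCE.items():
        gain = conversion_gain_uv_per_electron(cap)
        print(f"{structure:14s} {gain:.2f} uV/e-")
```

With capacitances between 100 fF and about 1 pF, the conversion gain of these large photodiodes is on the order of a microvolt per electron or less, which is why a small conversion capacitance is attractive when sensitivity matters.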

The second classical photodetector structure used in imaging is the *pinned* photodiode. This photodetector has a P+/N/P-substrate sandwich structure. The photogenerated electrons are collected in the N region, also called the storage well. At the end of the exposure, the collected electrons are moved to a heavily doped N region (floating diffusion) through a transfer gate. This photodetector collects photogenerated electrons efficiently, with a quantum efficiency that can reach 80 %. This value corresponds to a responsivity of about 0.3 A/W. In addition, the charge-to-voltage conversion is performed by the floating diffusion, which has a very small capacitance (~3 fF). The conversion gain is therefore very high (~70 µV/e-). This structure is optimized for good sensitivity. In terms of speed, it is limited by the transfer of the charges to the floating diffusion. Although the charges move under the transfer gate under the effect of the electric field, their speed inside the storage well is limited by diffusion. This phenomenon worsens for large photodiodes and limits the *frame rate* of imagers using a *pinned* photodiode to 20 Mfps. Photodiodes will therefore be preferred for *burst* imagers with a frame rate above 20 Mfps, whereas *pinned* photodiodes are better suited to *burst* imagers that require high sensitivity.

*Burst* image sensors can be implemented in CCD and CMOS technologies. When CCD technology is used, the imager consists of an array of photodiodes with a CCD shift register vertically aligned with each photodiode, as illustrated in Fig. 1. This register stores about one hundred images per *burst*. The memory is read out through vertical CCD registers (VCCD) and horizontal CCD registers (HCCD). The maximum acquisition speed of CCD *burst* imagers is 16 Mfps and is limited by the charge transfer between the photodiode and the CCD storage register. Solutions using several collection gates have been proposed to raise the acquisition speed to 100 Mfps.

Fig. 1 Burst Imager Architecture in CCD Technology

In CMOS technology, *burst* imagers store the images either in the pixel or outside the array (Fig. 2). Storage relies on a capacitor-based analog memory. If the images are stored in the pixel, the memory depth is limited to about ten images per *burst*. In addition, the sensitivity is reduced by the low fill factor of the pixel, which contains both the memories and the photodiode. However, because the memory is as close as possible to the pixel, memory access is very fast and the acquisition speed can reach 1 Gfps. The second solution is to store the images outside the array in capacitor banks. This architecture offers a memory depth of more than 256 images. Its sensitivity is high thanks to the good fill factor and to the use of a *pinned* photodiode, which provides a good conversion gain (74 µV/e-). The acquisition speed is 20 Mfps and is limited by access to the analog memory banks at the periphery of the array and by the charge transfer of the *pinned* photodiode.

Fig. 2 Burst Imager Architectures in CMOS Technology: imager with in-pixel storage and imager with off-pixel storage

In CMOS technology, an in-pixel storage architecture will be preferred for a high frame rate, whereas an off-pixel storage architecture will be chosen to increase the number of images stored per *burst*. CCD technology, for its part, offers a good trade-off between *burst* memory depth and acquisition speed.

The last ten years have seen the emergence of three-dimensional (3D) integration, which increases the integration density by stacking circuits. Different technologies exist for stacking the circuits and forming the vertical interconnections, and the interconnection density varies from one technology to another. To build 3D-integrated imagers with one or more interconnections per pixel, the technology considered is the stacking of integrated circuits within a single package. The circuits are stacked as illustrated in Fig. 3. In this example the stack consists of three circuits. The top circuit and the middle circuit are connected face to face (F2F), and their top metal levels are joined by bonding. The middle circuit and the bottom circuit are connected face to back (F2B) by bonding, and the signal is routed from one side of the middle circuit to the other by through-silicon vias (TSVs).

Fig. 3 Three-Dimensional Integrated Circuits



#### 3. Proposed Architectures

Two *burst* imager architectures were proposed and analyzed. Both perform the analog-to-digital conversion of the data before it is transferred off the chip. The conversion can be performed before or after the *burst* of images is stored in memory, as illustrated in Fig. 4. In the first case, the conversion is performed at high speed during the acquisition of the *burst*, which is stored in digital memories. In the second case, the *burst* is stored in analog memories and the conversion is performed while these memories are read out.



Fig. 4 Architectures with (a) Analog and (b) Digital Storage

The performance of these models was evaluated in terms of acquisition speed, dynamic range, memory depth and power consumption. The size of the analog memory is constrained by the sampling noise of the signal on the storage capacitor. The main source of power consumption in the analog-storage architecture is the current source of the pixel buffer, which must charge the analog memory. The acquisition speed of the analog-storage architecture is a trade-off against the power consumption and dynamic range of the sensor. In terms of acquisition speed and dynamic range, the digital-storage architecture is strongly constrained by the choice of the analog-to-digital converter, which must combine a high conversion rate with a minimal area. The main source of power consumption in the digital-storage architecture is the analog-to-digital converter, which accounts for 50 % of the total, ahead of the digital memory (~20 %) and the pixel-to-converter multiplexer (~20 %).
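The constraint that sampling noise places on the analog memory can be illustrated with the standard kT/C expression. The sketch below is not the model used in this work: it assumes that kT/C noise is the only noise source, a temperature of 300 K, and a hypothetical 1 V signal swing, and it returns the smallest storage capacitor that meets a given dynamic range.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def ktc_noise_vrms(capacitance_f: float, temperature_k: float = 300.0) -> float:
    """RMS sampling (kT/C) noise voltage on a capacitor, in volts."""
    return math.sqrt(BOLTZMANN * temperature_k / capacitance_f)


def min_storage_capacitance(dynamic_range_db: float, v_swing: float,
                            temperature_k: float = 300.0) -> float:
    """Smallest capacitor whose kT/C noise fits the dynamic range, in farads.

    Assumes kT/C noise is the only noise source and that the full
    swing v_swing (volts) is available to the signal.
    """
    noise_budget = v_swing / 10 ** (dynamic_range_db / 20)
    return BOLTZMANN * temperature_k / noise_budget ** 2


if __name__ == "__main__":
    for dr in (54.0, 60.0, 66.0):
        cap = min_storage_capacitance(dr, v_swing=1.0)  # 1 V swing is hypothetical
        print(f"{dr:.0f} dB -> {cap * 1e15:.2f} fF per stored sample")
```

Under these assumptions every additional 6 dB of dynamic range roughly quadruples the capacitance of each memory cell, so the number of images that fit in a fixed memory area falls as the dynamic range rises.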

The performance evaluations of the two architectures show that, for identical acquisition speed and dynamic range, the analog-storage architecture is more energy-efficient, as illustrated in Fig. 5. This graph also shows that power consumption grows with dynamic range for both architectures. The increase is due to the larger storage capacitance of the memories in the analog-storage architecture and to the larger capacitance of the *global shutter* stage in the digital-storage architecture, respectively. In terms of memory depth, the digital-storage architecture offers a far larger capacity than the analog-storage architecture over the whole dynamic range. Moreover, a dynamic digital memory stores four times as many images as a static digital memory. However, dynamic memories require a mechanism that regularly refreshes the memory cells.

Fig. 5 Power Consumption and Memory Depth for Different Dynamic Ranges



3D integration enables architectures that increase the memory depth of the *burst* imager. Compared with the state of the art, the digital-storage architecture offers a memory depth 10 to 40 times greater, depending on the type of memory used (static or dynamic). We therefore chose this type of architecture. The targeted acquisition speed is 5 Mfps for a dynamic range of 54 dB, i.e. a 9-bit analog-to-digital conversion.
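The correspondence between the 54 dB dynamic range and a 9-bit conversion follows from the ratio between full scale and one quantization step, 20·log10(2^N). A minimal sketch, with illustrative function names:

```python
import math


def dynamic_range_db(bits: int) -> float:
    """Dynamic range of an ideal N-bit quantizer: 20 * log10(2**N)."""
    return 20 * math.log10(2 ** bits)


def bits_for_dynamic_range(dr_db: float) -> int:
    """Smallest number of bits whose ideal dynamic range covers dr_db."""
    return math.ceil(dr_db / (20 * math.log10(2)))


if __name__ == "__main__":
    print(f"9 bits -> {dynamic_range_db(9):.2f} dB")
    print(f"54 dB  -> {bits_for_dynamic_range(54.0)} bits")
```

Each bit contributes about 6.02 dB, so 8 bits (about 48.2 dB) fall short of the 54 dB target while 9 bits cover it.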

#### 4. Thermal Study

Integrating circuits in three dimensions often raises thermal management issues. The power consumption of the system grows with the integration density, while the surfaces that exchange heat with the outside remain constant. Since the previous study showed that the power consumption of the digital-storage architecture is very high, an evaluation of the junction temperatures of the image sensor is presented here. The temperature of the circuit depends on the operating mode of the imager: the burst image sensor can acquire a single burst, or chain the acquisition and readout of several bursts as illustrated in Fig. 6. It is also important to synchronize the burst camera with the recorded event. If the occurrence of the event is deterministic, the start of the recording can be synchronized with the event (pre-event trigger). If the occurrence of the event is poorly controlled or random, the recording must be started before the event and stopped after it has occurred (post-event stop). This latter triggering mode requires the images to be recorded in a cyclic memory.
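The cyclic memory required by the post-event stop mode behaves like a ring buffer: frames are written continuously, the oldest frame is overwritten once the memory is full, and recording stops shortly after the event. The sketch below is a software analogy only; the class and method names are hypothetical and do not describe the on-chip implementation.

```python
class CircularFrameBuffer:
    """Ring buffer that keeps the most recent `depth` frames."""

    def __init__(self, depth: int) -> None:
        if depth <= 0:
            raise ValueError("depth must be positive")
        self._frames = [None] * depth
        self._next = 0    # slot that the next write will use
        self._count = 0   # number of valid frames stored so far

    def write(self, frame) -> None:
        """Store a frame, overwriting the oldest one when the buffer is full."""
        self._frames[self._next] = frame
        self._next = (self._next + 1) % len(self._frames)
        self._count = min(self._count + 1, len(self._frames))

    def read_out(self) -> list:
        """Return the stored frames, oldest first."""
        if self._count < len(self._frames):
            return self._frames[: self._count]
        return self._frames[self._next:] + self._frames[: self._next]


if __name__ == "__main__":
    memory = CircularFrameBuffer(depth=4)
    for frame_index in range(10):   # recording runs until the stop signal
        memory.write(frame_index)
    print(memory.read_out())        # the four most recent frames, oldest first
```

Because the memory is written for as long as the camera waits for the event, the sensor dissipates acquisition power for the whole waiting time, which is why this mode is the most demanding case in the thermal study.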

Fig. 6 Power Consumption States for a Multi-Burst Acquisition



The model assumes that the 3D-integrated imager is placed in a package with a transparent lid. The package is connected to a PCB through a BGA, as illustrated in Fig. 7. The system is cooled by convective exchange with the ambient air. Heat transfer is assumed to take place along the z axis.

Fig. 7 Circuit and PCB for the Thermal Model



A first, static model was built to evaluate the temperature in multi-burst recording. The average power consumption of each layer of the 3D stack is evaluated using the formula presented in Fig. 6. The static simulations show that the pixel current sources must be switched off and the analog-to-digital converter turned off while the memories are read out. In addition, heat transfer is limited by the convective exchange between the printed circuit board and the air. To increase this exchange and lower the junction temperature, a heat sink is placed on the surface of the printed circuit board. With these measures the average power consumption is reduced and the junction temperature is 40 °C for a pre-event trigger mode.
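The formula of Fig. 6 is not reproduced in this summary. As a rough illustration of the static approach, the sketch below assumes a time-weighted average of the power drawn in each state and a one-dimensional chain of thermal resistances along the z axis. All numerical values are hypothetical placeholders, not results of this work, and the function names are ours.

```python
def average_power(states: list[tuple[float, float]]) -> float:
    """Time-weighted average power over (power_watts, duration_seconds) states."""
    total_time = sum(duration for _, duration in states)
    if total_time <= 0:
        raise ValueError("total duration must be positive")
    return sum(power * duration for power, duration in states) / total_time


def junction_temperature(ambient_c: float, power_w: float,
                         thermal_resistances: list[float]) -> float:
    """Steady-state junction temperature for resistances (K/W) in series."""
    return ambient_c + power_w * sum(thermal_resistances)


if __name__ == "__main__":
    # Hypothetical multi-burst cycle: a short acquisition followed by a long readout.
    cycle = [(10.0, 1e-3), (0.5, 99e-3)]   # (watts, seconds), placeholder values
    p_avg = average_power(cycle)
    # Hypothetical junction-to-package, package-to-board and board-to-air resistances.
    t_j = junction_temperature(25.0, p_avg, [2.0, 5.0, 15.0])
    print(f"average power: {p_avg:.3f} W, junction temperature: {t_j:.1f} degC")
```

In this simplified picture, switching blocks off during readout lowers the average power, and adding a heat sink lowers the board-to-air resistance; both reduce the steady-state junction temperature.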

To evaluate the junction temperature during the recording of a single *burst* of images, a finite-element electrothermal simulator was used. Simulations were run for a pre-event trigger and for post-event stops. The maximum temperature reached by the junction is plotted for different acquisition durations in Fig. 8. These simulations show that the digital-storage *burst* imager can record a single *burst* of images in post-event stop mode for up to 1.7 s before the junction temperature exceeds 125 °C. Simulations of the multi-*burst* recording mode were also carried out. They confirm the results of the static study and put the maximum acquisition duration in post-event stop mode at 3 ms.

Fig. 8 Maximum Junction Temperature for Different Acquisition Durations



#### 5. Pixel Circuits

Several pixel circuits were designed for the digital-storage burst imager. The first circuit simply consists of an NWell/Psub photodiode, a global shutter stage and a multiplexer from 9 pixels to 1 ADC. The photocurrent is integrated on the junction capacitance of the photodiode. This pixel circuit was simulated at 5 Mfps and offers a sensitivity of 2.1 V/lux/s and a dynamic range of 70 dB. The power consumption per pixel is 226 µW. Two further pixel circuits were then proposed, one to increase the sensitivity of the circuit and the other to reduce its power consumption. To increase the sensitivity, the chosen solution was to perform the current-to-voltage conversion on a small capacitance (20 fF). A buffered direct injection circuit was used to copy the current onto this capacitance. The circuit consists of an injection transistor in saturation that connects the photodiode to the 20 fF capacitance. An operational amplifier is placed between the photodiode and the gate of the injection transistor, as illustrated in Fig. 9. Without the amplifier, the bandwidth of the current copy is set by the ratio of the transconductance of the injection transistor, $g_{mTinj}$, to the photodiode capacitance, $C_{PD}$. The transconductance depends on the photocurrent that biases the injection transistor. For low photocurrents, the bandwidth is not sufficient for our acquisition speed of 5 Mfps. The amplifier increases the bandwidth of the current copy by its gain A. To meet our specifications, the amplifier gain must be about 60 dB. Measurements were carried out, and this circuit offers a sensitivity of 21 V/lux/s, a tenfold increase over the previous solution. The readout noise depends on the input signal because the photocurrent biases the injection transistor. The dynamic range of the circuit is 55 dB. The drawback of this circuit is that it requires one operational amplifier per pixel, which raises the power consumption to 450 µW per pixel.
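The bandwidth argument for the buffered direct injection pixel can be sketched numerically. The code below is an order-of-magnitude illustration under stated assumptions, not the design equations of this work: the injection transistor is taken to operate in weak inversion at low photocurrent, so that g_m = I / (n·kT/q), the bandwidth is g_m / (2π·C_PD), and the amplifier multiplies it by (1 + A). The slope factor, the 1 nA photocurrent and the 100 fF photodiode capacitance are hypothetical inputs.

```python
import math

THERMAL_VOLTAGE = 0.02585  # kT/q near 300 K, in volts


def db_to_linear(gain_db: float) -> float:
    """Convert a voltage gain in dB to a linear ratio."""
    return 10 ** (gain_db / 20)


def injection_gm(photocurrent_a: float, slope_factor: float = 1.5) -> float:
    """Weak-inversion transconductance g_m = I / (n * kT/q), in siemens."""
    return photocurrent_a / (slope_factor * THERMAL_VOLTAGE)


def injection_bandwidth_hz(photocurrent_a: float, c_pd_f: float,
                           amp_gain: float = 0.0,
                           slope_factor: float = 1.5) -> float:
    """Bandwidth of the current copy, (1 + A) * g_m / (2 * pi * C_PD)."""
    gm = injection_gm(photocurrent_a, slope_factor)
    return (1.0 + amp_gain) * gm / (2 * math.pi * c_pd_f)


if __name__ == "__main__":
    i_ph = 1e-9      # hypothetical low-light photocurrent, amperes
    c_pd = 100e-15   # hypothetical photodiode capacitance, farads
    plain = injection_bandwidth_hz(i_ph, c_pd)
    buffered = injection_bandwidth_hz(i_ph, c_pd, amp_gain=db_to_linear(60.0))
    print(f"direct injection:          {plain / 1e3:.1f} kHz")
    print(f"buffered direct injection: {buffered / 1e6:.1f} MHz")
```

With these placeholder values the unbuffered bandwidth sits in the tens of kilohertz, far below what a 5 Mfps acquisition needs, while a 60 dB amplifier raises it by three orders of magnitude.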

Fig. 9 Buffered Direct Injection Pixel



The third circuit is designed to reduce the power consumption of the pixel circuit. The idea is to remove the current sources of the different pixel buffers, as illustrated in Fig. 10. This circuit nevertheless requires a pre-charge operation, performed by the switches $SW_{buffP}$ and $SW_{buffN}$. Simulations show that the power consumption is reduced by a factor of 20 compared with the first circuit.

Fig. 10 Pixel Without Current Source



However, because the buffers are no longer biased, steady state is not reached before the sampling operation of the *global shutter* stage or of the multiplexer. Since the signal is sampled during its transient, a timing variation of the control signal produces a voltage variation of the sampled signal. The effect of the *jitter* (i.e. timing noise) of the control signal was therefore evaluated. For a *jitter* of 50 ps rms, the dynamic ranges limited by this noise are summarized in the following table for different sampling durations.

Fig. 11 Voltage Variation and Dynamic Range Limit Caused by Jitter on the Sampling Signal



| Sampling Time (ns) | Dynamic Range (dB) |
|--------------------|--------------------|
| 2                  | 42.6               |
| 4                  | 52.0               |
| 6                  | 58.1               |
| 8                  | 62                 |
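To make the table easier to read, the sketch below converts each dynamic range into the rms voltage noise it implies and into the signal slope at the sampling instant that would produce this noise for a 50 ps rms jitter. It assumes that the dynamic range is defined as $20\log_{10}(V_{FS}/\sigma_v)$ and that the noise is the slope multiplied by the jitter. The 1 V full-scale swing is a hypothetical value chosen only for illustration.

```python
JITTER_RMS_S = 50e-12   # rms jitter of the control signal, from the text
FULL_SCALE_V = 1.0      # assumed signal swing (hypothetical)

# Sampling time (ns) -> jitter-limited dynamic range (dB), from the table.
DYNAMIC_RANGE_DB = {2: 42.6, 4: 52.0, 6: 58.1, 8: 62.0}


def jitter_limited_noise_v(dr_db, full_scale_v=FULL_SCALE_V):
    """rms voltage noise implied by DR = 20*log10(V_FS / sigma_v)."""
    return full_scale_v / 10 ** (dr_db / 20.0)


def slope_at_sampling_v_per_s(dr_db, jitter_s=JITTER_RMS_S,
                              full_scale_v=FULL_SCALE_V):
    """Signal slope at the sampling instant, assuming sigma_v = slope * sigma_t."""
    return jitter_limited_noise_v(dr_db, full_scale_v) / jitter_s


if __name__ == "__main__":
    for t_ns, dr_db in DYNAMIC_RANGE_DB.items():
        noise_mv = 1e3 * jitter_limited_noise_v(dr_db)
        slope_v_per_ns = 1e-9 * slope_at_sampling_v_per_s(dr_db)
        print(f"{t_ns} ns: {noise_mv:.2f} mV rms, slope {slope_v_per_ns:.4f} V/ns")
```

Under these assumptions, a longer sampling time gives the unbiased buffer more time to settle, so the slope at the sampling instant is smaller and the same jitter translates into less voltage noise.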

#### 6. 3D Integrated Prototype

A 3D integrated *burst* image sensor prototype operating at 5 Mfps was designed. Since the manufacturer offered a stack of two tiers (Fig. 12), the analog-to-digital converter is split between the top tier (i.e. the pixel array) and the bottom tier (i.e. the digital memory). Moreover, the image acquisition chain is not multiplexed: each pixel is associated with its own converter and its own digital memory.

Fig. 12 Burst Image Sensor Prototype with Digital Storage for a Two-Tier Stack



The pixel consists of a photodiode made of islands and a buffered direct injection circuit that performs the current-to-voltage conversion. A *global shutter* cell is also implemented in the pixel. The analog-to-digital conversion is performed by an 8-bit single-slope converter providing 5 MC/s. The comparator is implemented on the top tier and performs an auto-zero operation (Fig. 13). The counter is implemented on the bottom tier. The analog ramp is generated off-chip and the counter clock is supplied by a PLL. The digital memory is built from 4 shift registers. The memory depth of the sensor is 52 images and the spatial resolution is 20x20 pixels with a pitch of 50 µm by 50 µm. Using SRAM instead of registers for the digital memory would have allowed about a thousand images to be stored per *burst*, but the development time did not allow us to design such a memory. The circuit has been validated in simulation and sent for fabrication. Optical tests will be carried out once the circuit is returned.
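The figures quoted for the prototype can be cross-checked with a few lines of arithmetic. The sketch below assumes that the single-slope counter advances by one code per clock cycle and that a full 8-bit ramp must fit within one conversion period, which gives a lower bound on the counter clock. The text above does not state the PLL frequency, so this bound is an estimate under those assumptions and not a design value.

```python
def single_slope_clock_hz(bits, conversions_per_s, ramp_fraction=1.0):
    """Minimum counter clock for a single-slope ADC, assuming one count per
    clock cycle. ramp_fraction is the share of the conversion period
    available for the ramp (1.0 ignores auto-zero and reset time)."""
    return (2 ** bits) * conversions_per_s / ramp_fraction


def burst_memory_bits(rows, cols, frames, bits_per_sample):
    """Total number of bits needed to store one burst."""
    return rows * cols * frames * bits_per_sample


def burst_duration_s(frames, frame_rate_fps):
    """Time window covered by one burst."""
    return frames / frame_rate_fps


if __name__ == "__main__":
    clock = single_slope_clock_hz(bits=8, conversions_per_s=5e6)
    total_bits = burst_memory_bits(rows=20, cols=20, frames=52, bits_per_sample=8)
    print(f"counter clock lower bound: {clock / 1e9:.2f} GHz")
    print(f"memory per pixel: {52 * 8} bits, total: {total_bits} bits")
    print(f"burst duration: {burst_duration_s(52, 5e6) * 1e6:.1f} us")
```

Any time reserved for the auto-zero phase reduces `ramp_fraction` and raises the required clock accordingly.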

Fig. 13 Top and Bottom Tiers of the 3D Integrated Burst Image Sensor Prototype





#### 7. Conclusion

This study identified the benefits of 3D integration technologies for burst imaging. Following a review of the state of the art, two 3D integrated burst image sensor architectures were proposed. The first architecture performs the analog-to-digital conversion of the video after the burst of images has been stored in memory. The second performs the analog-to-digital conversion during the video acquisition, before storage in memory. A study was carried out to evaluate the performance of both architectures; it showed that 3D integration increases the number of images stored per burst by one order of magnitude compared with the state of the art. Moreover, the digital storage architecture proved the most effective at maximizing the memory depth, which can then reach up to 8000 images. Since this architecture has a high power consumption, a thermal study was carried out to assess its feasibility. It shows that the pixel current sources must be switched off and the ADCs turned off during the memory readout phase, and that a heat sink must be placed on the supporting printed circuit board. With these measures, the acquisition of a single burst and of a succession of bursts is thermally feasible in pre-event or post-event triggering mode. Several pixel circuits were then proposed for the digital storage architecture operating at 5 million frames per second. A first *global shutter* pixel structure was proposed and tested on silicon. Two other pixels were then designed, one to increase the sensitivity by a factor of 10 using a buffered direct injection structure and the other to reduce the power consumption by a factor of 20 by removing the current sources. The direct injection structure was tested and validated on silicon, whereas the second has so far only been validated in simulation. Finally, a 3D integrated burst image sensor prototype was designed. This prototype consists of a stack of two tiers. The pixel is implemented with a buffered direct injection circuit. The converter is an 8-bit single-slope ADC that allows an acquisition rate of 5 million frames per second. The digital memory is built from banks of shift registers.

Several additions could be made to this study of 3D integrated burst image sensors. First, the pixel without current source remains to be tested and noise measurements (photon transfer curve) remain to be performed. From a system point of view, the risk of thermal runaway could be studied: the leakage currents of the digital memories increase with temperature and cause extra power consumption, which in turn raises the temperature. The effect of temperature on image quality could also be studied. From a circuit point of view, designing a pixel based on a pinned photodiode or a phototransistor seems a promising way to increase the sensitivity without adding circuitry at the acquisition rate considered. Finally, the study and design of an ADC tailored to our sensor appear important, because the ADC is the element that determines the frame rate and the dynamic range of the sensor.
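The thermal runaway mechanism mentioned above is a positive feedback loop, which can be sketched as a fixed-point iteration on the junction temperature. The model below is a deliberately simple assumption: a single thermal resistance to ambient and a leakage power that doubles every 10 °C. All numerical values are hypothetical and serve only to show the two possible outcomes, convergence to a steady state or runaway.

```python
def steady_state_temperature(p_dyn_w, p_leak_ref_w, r_th_k_per_w,
                             t_amb_c=25.0, t_ref_c=25.0, doubling_c=10.0,
                             t_max_c=150.0, max_iter=1000, tol=1e-6):
    """Iterate T = T_amb + R_th * (P_dyn + P_leak(T)) until it converges.

    P_leak(T) = p_leak_ref_w * 2 ** ((T - t_ref_c) / doubling_c) is an
    assumed rule-of-thumb leakage model. Returns the steady-state
    temperature in degrees Celsius, or None if the temperature exceeds
    t_max_c (thermal runaway) or the iteration does not converge.
    """
    temperature = t_amb_c
    for _ in range(max_iter):
        p_leak = p_leak_ref_w * 2.0 ** ((temperature - t_ref_c) / doubling_c)
        next_temperature = t_amb_c + r_th_k_per_w * (p_dyn_w + p_leak)
        if next_temperature > t_max_c:
            return None
        if abs(next_temperature - temperature) < tol:
            return next_temperature
        temperature = next_temperature
    return None


if __name__ == "__main__":
    # Hypothetical operating points: (dynamic power, leakage at 25 C, R_th).
    for p_dyn, p_leak, r_th in [(1.0, 0.05, 20.0), (1.0, 0.5, 40.0)]:
        result = steady_state_temperature(p_dyn, p_leak, r_th)
        status = "thermal runaway" if result is None else f"{result:.1f} C"
        print(f"P_dyn={p_dyn} W, P_leak={p_leak} W, R_th={r_th} K/W: {status}")
```

In this simple model, a lower thermal resistance (for example with a heat sink) or a lower leakage power keeps the loop gain below one, so the iteration settles.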

#### **Acknowledgments:**

I would first like to thank the reviewers, Prof. Pierre MAGNAN and Prof. Dr.-Ing. Dietmar FEY, for reading my thesis manuscript and for their constructive comments. I also thank the jury members Dr. Andreas KOCH, Dr. Antoine DUPRET and Prof. Luc HEBRARD, as well as Mr. Fabrice GUELLEC, for their comments on my work and for their availability.

I warmly thank my thesis supervisor, Prof. Wilfried UHRING, for sharing his extensive experience in high speed imaging and his infectious passion for research. I am very happy to have carried out my thesis under his supervision. This thesis was carried out in part thanks to the resources of the ICube laboratory. I particularly wish to thank Prof. Luc HEBRARD, Dr. Jean-Baptiste KAMMERER and Maroua GARCI for their help with the electro-thermal simulations. I was also able to test my circuit thanks to the help of Pascal LEINDECKER and Jérémy BARTRINGER. Finally, I thank the PhD students of the laboratory, Imane, Octavian, Laurent, Vincent, Fitsum, Abdelatif and Thomas, for the welcome they gave to someone from Grenoble.

This thesis was also completed thanks to the resources of the L3I, made available to me by Mr. Michael TCHAGASPANIAN and Mr. Fabrice GUELLEC. I also thank Mr. Josep SEGURA for helping to propose this thesis topic and for supervising me day to day. I was also lucky to be able to talk with all the permanent staff of the L3I, who shared their experience on many subjects. These three years were also a pleasant time thanks to all the PhD students, interns and permanent staff I met in the team: Timothé, William, Amr, Assim, Simon, Nicolas, Camille, Victor, Bertrand and Margaux. I also thank Michele for the feedback on the thesis experience, which helped me put things in perspective in hard times. I was also able to switch off from work and enjoy Grenoble and its surroundings with Cédric, Amélie, Sabine, Jean Michel, Raphaëlle, Davide, Cornelia, Agustin, Andres, Pierre L and Pierre L. These three years were the occasion to get back on the river with the kayakers of the GACK, Thomas, Téo, Julian, GuiGui, Denis, Jérémy, Mark, CriCri and all the others...

Finally, I wish to thank my whole family, who supported me and helped me step back during these thesis years.

## **Table of Contents**

| Section  | Title                                                   |
|----------|---------------------------------------------------------|
|          | Table of Contents                                       |
|          | Table of Figures                                        |
|          | Table of Abbreviations                                  |
| 1.       | Introduction                                            |
| 2.       | Background & Bibliographical Review                     |
| 2.1.     | Classical Image Sensor Architecture                     |
| 2.2.     | High Speed Image Sensor                                 |
| 2.3.     | Burst Image Sensor                                      |
| 2.3.1.   | Architectures                                           |
| 2.3.1.1. | CCD Image Sensor                                        |
| 2.3.1.2. | CMOS Image Sensor                                       |
| 2.3.1.3. | Synthesis                                               |
| 2.4.     | Photodetectors                                          |
| 2.4.1.   | Photodiodes                                             |
| 2.4.2.   | Pinned Photodiode                                       |
| 2.4.3.   | Phototransistor                                         |
| 2.4.4.   | Photodetector Synthesis                                 |
| 2.5.     | Three-Dimensional (3D) Interconnection Technologies     |
| 2.5.1.   | Technology Offers                                       |
| 2.5.2.   | 3D Stacked Integrated Circuit                           |
| 2.5.3.   | 3D Integration and High Speed Image Sensor              |
| 2.6.     | Conclusion                                              |
| 3.       | 3D Integrated Burst Image Sensor                        |
| 3.1.     | Motivations                                             |
| 3.2.     | Architectural General Considerations                    |
| 3.2.1.   | Recording and Triggering Modes                          |
| 3.3.     | Burst Image Sensor with Analog Storage                  |
| 3.3.1.   | Architecture Overview                                   |
| 3.3.1.1. | 3D Integration                                          |
| 3.3.1.2. | Circuit Implementation                                  |
| 3.3.2.   | Performance Evaluation                                  |
| 3.3.2.1. | Analog Memory                                           |
| 3.3.2.2. | Pixel Front-End                                         |
| 3.3.2.3. | Model Results                                           |
| 3.3.3.   | Performance Synthesis                                   |
| 3.4.     | Burst Image Sensor with Digital Storage                 |
| 3.4.1.   | Architecture Overview                                   |
| 3.4.2.   | Performance Evaluation                                  |
| 3.4.2.1. | Pixel Front End and Multiplexer                         |
| 3.4.2.2. | Analog to Digital Conversion                            |
| 3.4.2.3. | Digital Memory                                          |
| 3.4.3.   | Performance Synthesis                                   |
| 3.5.     | Conclusions and Perspectives                            |
| 4.       | Thermal Study of a 3D Integrated Digital Burst Image Sensor |
| 4.1.     | Introduction                                            |
| 4.2.     | Thermal System                                          |
| 4.2.1.   | Package Description and Model Assumptions               |
| 4.2.2.   | Power Consumption                                       |
| 4.3.     | Static Simulation                                       |
| 4.3.1.   | Static Model                                            |
| 4.3.2.   | Simulations                                             |
| 4.4.     | Finite Element Simulations                              |
| 4.4.1.   | Finite Element Model                                    |
| 4.4.2.   | Transient Simulations                                   |
| 4.4.2.1. | Single Burst Recording                                  |
| 4.4.2.2. | Multi-Burst Recording                                   |
| 4.5.     | Thermal Runaway                                         |
| 4.6.     | Conclusion                                              |
| 5.       | Analog Front-End Circuits                               |
| 5.1.     | APS Based Pixel Front-End Circuit                       |
| 5.1.1.   | Description                                             |
| 5.1.2.   | Current to Voltage Conversion                           |
| 5.1.3.   | Global Shutter Stage                                    |
| 5.1.4.   | Multiplexer Stage                                       |
| 5.1.5.   | Full Front-End Performances                             |
| 5.1.6.   | Conclusion                                              |
| 5.2.     | Design Strategies to Increase Sensitivity               |
| 5.2.1.   | Introduction                                            |
| 5.2.2.   | Resistive Trans-Impedance Amplifier Circuit             |
| 5.2.3.   | Capacitive Trans-Impedance Amplifier Circuit            |
| 5.2.4.   | Buffered Direct Injection Circuit                       |
| 5.2.5.   | Implementation and Simulation                           |
| 5.2.6.   | Conclusion                                              |
| 5.3.     | Design Strategies to Reduce Power Consumption           |
| 5.3.1.   | Conclusion                                              |
| 6.       | Circuit Implementations and Tests                       |
| 6.1.     | Test-Chip for Pixel Front-End Evaluation                |
| 6.1.1.   | Test-Chip and Test Board                                |
| 6.1.2.   | Test Results                                            |
| 6.1.2.1. | Photodiode Characterization                             |
| 6.1.2.2. | APS Front-End Circuit                                   |
| 6.1.2.3. | Pixel with BDI Stage                                    |
| 6.1.3.   | Front-End Without Current Source                        |
| 6.2.     | 3D Integrated Circuit                                   |
| 6.2.1.   | Circuit Implementation                                  |
| 6.2.2.   | Simulation Results                                      |
| 6.3.     | Conclusion                                              |
| 7.       | Conclusion and Perspectives                             |
|          | Annex Section                                           |
|          | Annex A                                                 |
|          | Annex B                                                 |
|          | Annex C                                                 |
|          | Annex D                                                 |
|          | Annex E                                                 |
|          | Bibliography                                            |

# **Table of Figures**

| Figure   | Caption |
|----------|---------|
| Fig. 1   | Digital Camera Diagram |
| Fig. 2   | CMOS Image Sensor Architecture |
| Fig. 3   | Rolling Shutter and Global Shutter Acquisitions |
| Fig. 4   | CIS Output Bandwidth and Number of Output Channels versus the CIS Frame Rate |
| Fig. 5   | Frame Rate Requirement for High Speed Image Sensor |
| Fig. 6   | Timing Diagram of Burst and Continuous Recording |
| Fig. 7   | Simplified CCD Burst Image Sensor Architecture |
| Fig. 8   | Back-Side-Illuminated Burst CCD Image Sensor |
| Fig. 9   | Multi-Collection Gate Pixel Architecture and its Timing Diagram |
| Fig. 10  | Burst CIS Pixel with 8 In-Pixel Memories |
| Fig. 11  | Burst CIS with Out-Of-Pixel Storage |
| Fig. 12  | Reverse Biased PN Junction |
| Fig. 13  | Cross-Sectional Views of (a) N+/PSub, (b) NWell/PSub and (c) P+/NWell/PSub Photodiodes |
| Fig. 14  | Photodiodes Responsivity versus the Incident Light Wavelength for 40x40 μm² Photodiodes |
| Fig. 15  | Pinned Photodiode Cross Section and its Potential Diagrams |
| Fig. 16  | PNP Phototransistor and Quad Small Base PNP Phototransistor Cross Sections |
| Fig. 17  | 3D Stacked Integrated Circuit |
| Fig. 18  | Through Silicon Via RLC Model |
| Fig. 19  | Pre-Event Triggering Mode Timing Diagram |
| Fig. 20  | Post-Event Triggering Mode Timing Diagram |
| Fig. 21  | Analog Storage Burst Image Sensor Architecture |
| Fig. 22  | Cluster of Pixels of the Analog Storage Architecture |
| Fig. 23  | Timing Diagram of Analog Storage Architecture |
| Fig. 24  | Analog Memory Architecture |
| Fig. 25  | Analog Front-End Circuit |
| Fig. 26  | (a) Memory Size and (b) Switch Transistor Width versus the Dynamic Range for Different Frame Rates |
| Fig. 27  | Voltage Drop versus the Dynamic Range for Different Frame Rates |
| Fig. 28  | Bias Current of the SF Buffer versus the Dynamic Range |
| Fig. 29  | Pixel Power Consumption versus the Dynamic Range for Different Frame Rates |
| Fig. 30  | Dynamic Range Limitation due to Readout Noise |
| Fig. 31  | Digital Storage Burst Image Sensor Architecture |
| Fig. 32  | Timing Diagram of a Cluster of 4 Pixels for the Digital Storage Architecture |
| Fig. 33  | Pixel Front-End of Digital Storage Architecture |
| Fig. 34  | Pixel Power Consumption versus the Dynamic Range of Different Frame Rates |
| Fig. 35  | Dynamic Range Limitation due to the Readout Noise |
| Fig. 36  | SNDR versus the Frame Rate |
| Fig. 37  | Power Consumption Density versus the Frame Rate |
| Fig. 38  | Random Access Memory Architecture |
| Fig. 39  | (a) 6T SRAM Bit Cell and (b) 1T1C DRAM Bit Cell |
| Fig. 40  | Memory Size versus Dynamic Range of the Digital Storage Architecture for Static and Dynamic Bit Cell in 28 nm technology |
| Fig. 41  | Pixel Power Consumption for Different Pairs of Dynamic Range and Frame Rate |
| Fig. 42  | Image Sensor Package and Printed Circuit Board |
| Fig. 43  | Cross Section View of Thermal Model of the 3D Burst Image Sensor |
| Fig. 44  | 3D Integrated Image Sensor in a BGA Package Cooled by Free Air Convection |
| Fig. 45  | Static Model |
| Fig. 46  | Power Consumption States and Weighted Average Power Consumption for Multi-burst Recording |
| Fig. 47  | Temperature at the Steady State through the z Axis of the Static Model |
| Fig. 48  | Steady State Temperature in Multi-Burst Recording for Different Acquisition Times with a Heat Sink |
| Fig. 49  | Electro-Thermal Simulator |
| Fig. 50  | Thermal Model for Finite Element Simulation |
| Fig. 51  | Thermal Transient Simulation for a Single Burst Acquisition with the average temperature on the first (x), second (+) and third (•) layers |
| Fig. 52  | Maximum Junction Temperature versus the Acquisition for Single Burst Recording in Post-Event Triggering Mode |
| Fig. 53  | Junction Temperature of Multi-Burst Recording in Pre-Event Triggering Mode |
| Fig. 54  | Maximum Junction Temperature versus Time in Post-Event Triggering Mode for Different A[…] |
| Fig. 55  | Cluster of APS Based Pixel Front-End |
| Fig. 56  | (a) Photo-current to voltage characteristic and (b) photo-current to voltage gain of a 41[…] NWell/PSub photodiode |
| Fig. 57  | Integrated Shot Noise and SNR of the Photocurrent to Voltage Conversion of a 41x48.8 µm² N[…] Photodiode |
| Fig. 58  | (a) Schematic and (b) Timing Diagram of the Global Shutter Stage |
| Fig. 59  | DC Characteristic and Gain of the Global Shutter Stage |
| Fig. 60  | Transfer Function and Noise Figure of the Global Shutter SF Buffer |
| Fig. 61  | Schematic of the Multiplexer Stage |
| Fig. 62  | Characteristic and Gain of the Multiplexer SF Buffer |
| Fig. 63  | Transfer Function and Noise Figure of the Multiplexer SF Buffer |
| Fig. 64  | Characteristic and Gain of the APS Pixel Front-End for an Integration Time of 180 ns |
| Fig. 65  | Signal to Noise Ratio of the APS Pixel Front-End Readout Chain |
| Fig. 66  | RTIA Stage and its Transfer Function |
| Fig. 67  | RTIA Stage and Voltage Amplifier Solution |
| Fig. 68  | Timing diagram of the RTIA and amplifier stage |
| Fig. 69  | Photo-Current to Voltage Conversion performed by a Capacitive Trans-Impedance Amplifier ([…] its Timing Diagram |
| Fig. 70  | Bias Current of the Op-Amp versus the Integration Capacitor for a 34 MHz Bandwidth |
| Fig. 71  | Direct Injection Stage and its Transfer Function |
| Fig. 72  | Buffered Direct Injection Stage and its Transfer Function |
| Fig. 73  | Folded Cascode Amplifier |
| Fig. 74  | Integration Voltage versus Photocurrent Characteristic and Sensitivity of the BDI stage |
| Fig. 75  | $I_{int}$ versus $I_{pd}$ Transfer Function of the BDI Stage |
| Fig. 76  | Stability Analysis of the BDI stage |
| Fig. 77  | Direct and Feedback Gains of the Open Loop Circuit |
| Fig. 78  | Direct and Feedback Gains of the Open Loop Circuit with a 25 fF Coupling Capacitor |
| Fig. 79  | $I_{int}$ versus $I_{pd}$ Closed Loop Transfer Function of the BDI Stage with a 25 fF Coupling Capacitor |
| Fig. 80  | Noise on the Integration Node of the BDI Stage Without and With Coupling Capacitor |
| Fig. 81  | Pixel Front-End Without Current Source and its Timing Diagram |
| Fig. 82  | Power Consumption versus the Photocurrent of the Pixel Front-End Without Current Source |
| Fig. 83  | Characteristic and Gain of the Front-End Circuit without Current Source |
| Fig. 84  | Effect of a Variation of the Sampling Time on the Sampled Value |
| Fig. 85  | Sampling Time Jitter versus the Length of the Buffer Daisy Chain for a Power Supply Noise of 10 m[…] |
| Fig. 86  | Histogram of the Sampled Signal Values for 2 ns and 18 ns Sampling Times with a Standard Deviation […] 48 ps |
| Fig. 87  | (a) Voltage Noise versus the Input Signal for a 48 ps Jitter Noise (b) Voltage Noise and Dynamic Range for Different Sampling Time |
| Fig. 88  | Packaged Pixel Front-End Test-chip with an Optical Lid |
| Fig. 89  | Equivalent Electrical Model of the Signal Path from the Chip to the Board |
| Fig. 90  | Chip Buffer Schematic |
| Fig. 91  | Test Board |
| Fig. 92  | Control Signal Generation Circuit based on Delay Lines |
| Fig. 93  | DC Characteristics of the Chip and Board Buffers |
| Fig. 94  | Measured Characteristic of the APS Front-End without Photodiode |
| Fig. 95  | Measurement Circuit for Integration Capacitor Extraction |
| Fig. 96  | Test Bench for Responsivity Measurement |
| Fig. 97  | Responsivity of the NWell/PSub Photodiode |
| Fig. 98  | Characteristic of the APS front-end circuit |
| Fig. 99  | Ideal and Real Response of the Integration Stage to a Light Pulse |
| Fig. 100 | Electronic Aperture of the APS Front-End for an Integration Time of 600 ns |
| Fig. 101 | Characteristic of the Pixel Front-End with BDI Stage |
| Fig. 102 | Electronic Aperture of the BDI Front-End Circuit for an Integration Time of 600 ns |
| Fig. 103 | Electronic Aperture of the BDI Front-End Circuit for an integration time of 200 ns |
| Fig. 104 | 3D Integrated Burst Image Sensor with Two Tiers |
| Fig. 105 | Timing Diagram of the 3D Integrated Burst Image Sensor |
| Fig. 106 | Pixel Front-End and ADC of the Top Tier |
| Fig. 107 | ADC and Memory of the Bottom Tier |
| Fig. 108 | Characteristics and Gain of the BDI Pixel Front-End of the 3D Integrated Burst Image Sensor |
| Fig. 109 | Voltage Noise on the Pixel Front-End Output of the 3D Integrated Burst Image Sensor |
| Fig. 110 | Voltage Error on the Ramp Signal due to the Wire Bonding |
| Fig. 111 | Top and Bottom Tier Pixel Layout and Full Chip Bottom Layout |

#### **Table of Abbreviations**

| Abbreviation | Meaning |
|--------------|---------|
| 3D           | Three spatial Dimensions |
| 3D-IC        | 3D Integrated Circuit |
| 3D-P         | 3D Packaging |
| 3D-SIC       | 3D Stacked Integrated Circuit |
| 3D-WLP       | 3D Wafer Level Packaging |
| 3T pixel     | Three Transistor pixel |
| 5T           | Five Transistor op-amp |
| ADC          | Analog to Digital Converter |
| APS          | Active Pixel Sensor |
| BDI          | Buffered Direct Injection |
| BEOL         | Back-End-Of-Line |
| BSI          | Back Side Illuminated |
| CCD          | Charge-Coupled Device |
| CG           | Collection Gate |
| CIS          | CMOS Image Sensor |
| CMOS         | Complementary Metal Oxide Semiconductor |
| CTIA         | Capacitive Trans-Impedance Amplifier |
| DAC          | Digital to Analog Converter |
| DSP          | Digital Signal Processor |
| F2B          | Face To Back stacking |
| F2F          | Face To Face stacking |
| FD           | Floating Diffusion |
| FIFO         | First In First Out memory |
| FPA          | Focal Plane Array |
| FPN          | Fixed Pattern Noise |
| FWHM         | Full Width at Half Maximum |
| fps          | Frames Per Second |
| GBW          | Gain BandWidth product |
| HCCD         | Horizontal CCD segment |
| h/e pair     | Hole Electron pair |
| IC           | Integrated Circuit |
| IO           | Input Output |
| IP           | Intellectual Property |
| IS           | Image Sensor |
| ITRS         | International Technology Roadmap for Semiconductors |
| KOZ          | Keep-Out-Zone |
| LDO          | Low Dropout Voltage |
| MCG          | Multi Collection Gate |
| MOS          | Metal Oxide Semiconductor |
| MOSFET       | MOS Field Effect Transistor |
| N+           | heavily N-type doped region |
| NWell        | N-type doped well |
| P+           | heavily P-type doped region |
| PN junction  | P-type N-type junction |
| PSD          | Power Spectral Density |
| PSub         | P-type doped substrate |
| PWell        | P-type doped well |
| rms          | Root Mean Square |
| RTIA         | Resistive Trans-Impedance Amplifier |
| SNDR         | Signal to Noise and Distortion Ratio |
| SNR          | Signal to Noise Ratio |
| TSV          | Through Silicon Via |
| VCCD         | Vertical CCD segment |

# 1. Introduction

High speed imaging is a wide domain that covers frame rates from thousands to billions of frames per second (fps). Integrated high speed image sensors are used in industrial cameras as well as in research laboratories to observe fast phenomena. High speed cameras are sorted into three categories depending on their architecture. Continuous video cameras record video at up to 25 kfps. The recording length is not limited by the camera but by the mass storage devices (e.g. hard disks). Such cameras are used for industrial production line control, automotive crash tests, and sport and wildlife slow motion recording [1]. The second type of camera is the burst video camera, which records a video of one hundred images at a frame rate of up to 100 Mfps. These cameras are mainly used in R&D applications to observe phenomena such as crack propagation (1 Mfps), mechanical stress tests (2 Mfps), inkjet droplet formation (5 Mfps), plasma formation (4 Mfps) and combustion [2]. The last type of fast camera is the streak camera, which records at frame rates above 1 Gfps. These cameras shoot a video of a hundred images, each with a single spatial dimension. They are used to record ultra-high speed phenomena in scientific fields such as condensed matter physics, plasma physics, photochemistry, biology and optical communication [3]. Depending on the recorded phenomenon, an image with two spatial dimensions can be reconstructed [4].

As in other fields of visible imaging, charge-coupled device (CCD) and complementary metal oxide semiconductor (CMOS) technologies compete to implement the image sensors of high speed video cameras. For this PhD research work, only solutions based on CMOS technology are considered. This choice first relies on the industrial environment of my laboratory, which is more focused on CMOS technology than on CCD. This choice was also made for a practical reason: the company where our test-chip was fabricated only provides CMOS technologies for its multi project wafers. However, high speed CCD image sensors are still presented and compared with CMOS image sensor solutions.

For 40 years, the technological choices of the microelectronic industry have been driven by transistor shrinking. However, transistor shrinking seems to be reaching its limits, as the transistor dimensions are now about a few hundred atoms and the devices are prone to quantum effects that degrade their performance. The last decade has seen the emergence of disruptive technologies to bypass this issue and keep improving the performance of integrated circuits, and many alternatives have been proposed and developed. At the device level, new transistor structures have been presented to improve the carrier mobility and reduce the leakage currents, such as germanium based transistors, trigate transistors or carbon nanotube transistors [5]. Memory technologies have seen the same development of alternative solutions. These solutions are based on different physical phenomena such as magnetic effects, spin orientation, phase change or resistive materials [6]. These technologies combine fast access times and high storage densities. At the integration level, a promising technology is three dimensional (3D) integration, which offers the possibility to implement transistors in three spatial dimensions instead of two. This technology has been developed at different levels and it is now possible to stack packages, integrated circuits or transistors. It is especially interesting for system in package, as it allows circuits made with different technologies to be placed close to each other and enables high bandwidths between these circuits. 3D integration technology has already proved its efficiency for digital applications with the stacking of memories on a processor [7]. In imaging, 3D integration has been used to implement image sensors with embedded digital processing, image sensors with a large spatial resolution, and hybrid pixel detectors for particle physics [8] [9] [10]. However, only a few works have addressed high speed imaging, and they have been implemented with a non-standard 3D integration technology [11].

The objective of this PhD work is first to identify the contributions of 3D integration technology to the development of high speed image sensors. Then, the aim is to propose and study some architectures of high speed burst image sensors based on 3D integration technology. In this dissertation, we first present a review of high speed image sensors, both CCD and CMOS. The emphasis is put on burst high speed image sensor architectures. We also describe the different characteristics of the photodetectors and of the 3D integration technologies with respect to our high speed imaging concern. Based on this background description, we then propose architectures of burst CMOS image sensors using 3D integration technologies. This chapter highlights two 3D integrated burst image sensors. Both architectures perform an on-chip analog to digital conversion of the images, but the first one stores the burst of images before the conversion (i.e. analog storage) while the second one memorizes the images after the conversion (i.e. digital storage). The performances of both architectures are assessed and compared in terms of frame rate, memory depth and power consumption. Based on this analysis, we chose to focus our work on the burst image sensor with digital storage. The analysis also reveals a high power consumption of the burst image sensor with digital storage during the image acquisition. To estimate the risk of overheating, we then conduct a thermal study of the sensor. This thermal analysis is carried out for the different recording and triggering modes of the camera and demonstrates the feasibility of such an architecture. At this point, we chose to focus our work more on the circuit than on the system, and we present the design of different pixel front-end circuits for a 3D integrated burst image sensor with digital storage. A basic front-end circuit inspired by the active pixel sensor (APS) structure is first designed. This pixel performs the current to voltage conversion, the global shutter acquisition and a multiplexing operation. Two further front-end circuits are then presented and designed. One increases the pixel sensitivity while the other reduces the power consumption. Finally, we present the measurement results of a test-chip where those pixel front-end circuits have been implemented. These measurements include the photodiode responsivity, the pixel front-end characteristic and its electronic aperture. This last chapter also presents a proof of concept of a 3D integrated image sensor. This sensor is made of a stack of two circuits, with the pixel front-end on the top tier and the A/D converter and the burst digital memory on the bottom tier. This image sensor has 20x20 pixels and acquires a video at 5 Mfps.

# 2. Background & Bibliographical Review

## 2.1. Classical Image Sensor Architecture

This section gives a brief description of a digital camera and presents the basic architecture of a CMOS image sensor (CIS). A digital camera is composed of an optical system, an image sensor, and some electronic systems [12] [13]. The optical system is made of a set of lenses and a diaphragm. This system focuses the light emitted by the scene to provide a clear and sharp image on the focal plane and also performs optical zooms. At this point, the camera performs a spatial discretization of the image thanks to an array of pixels, commonly known as the focal plane array (FPA). The FPA includes a micro-lens array that focuses the light on the photosensitive area of each pixel. A color filter array (red, green, blue) is used to produce a color image. The image sensor converts the optical signal into an electrical signal thanks to photodiodes. Several analog to digital converters (ADC) convert the signal into digital data that can be processed by a digital signal processor (DSP). The digital image is then stored in a mass memory. The image sensor can be implemented in two technologies. The first is based on charge-coupled devices (CCD) [14], the second on classic CMOS technology. The first CMOS image sensors (CIS) were designed in the 1960s [15]. However, CIS technology was abandoned in the 1970s because of the small pixel pitch offered by CCD. Since the 1990s, CIS has taken over the market thanks to the continuous shrinking of CMOS technology, which enables small pixels for the mainstream market [16]. Contrary to CCD, CIS embeds the ADCs on the image sensor chip.

Fig. 14 Digital Camera Diagram



A CIS is composed of an array of pixels, a row decoder on the side of the array and a readout circuit at the bottom of it (Fig. 15). To acquire an image, each pixel of the CIS is exposed to the light and collects a number of photons. The pixel then performs the photoelectric conversion and produces an electrical signal proportional to the number of collected photons. The signal is conditioned inside the pixel. In Fig. 15, this operation is performed by a three-transistor (3T) pixel circuit. The row decoder acts as an addressing circuit which connects a row of the pixel array to the column amplifiers through the column bus. The amplified signal is then converted into digital data by a column ADC. Finally, the outputs of the column ADCs are multiplexed and connected to an output buffer.

Fig. 15 CMOS Image Sensor Architecture



Throughout the rest of the CIS description, a pixel with charge integration is considered [13]. An image acquisition is thus composed of three steps. First, each pixel is reset to a common voltage value. Then, the electrical charges received during the exposure time are integrated on a capacitor and converted into a voltage. Finally, the pixel voltage values are transferred to the column ADCs by the read-out circuit. The method used to read out the pixel array plays a significant role in the image acquisition. There are two acquisition modes, known as rolling shutter and global shutter [17]. In rolling shutter mode, an acquisition signal is propagated successively from the first row to the last row of the image sensor. This signal is followed by a reading signal that connects the pixel row to the read-out circuit. This reading signal corresponds to the end of the integration. The timing diagram of the rolling shutter mode is illustrated in Fig. 16. For the pixels of two different rows, the charges collected during the integration process do not correspond to the same time window. Some artifacts, known as the skew effect, can thus appear on the image if an object of the scene is moving. To tackle this issue, the charge integration must be performed at the same time for every pixel of the array. This acquisition method, known as global shutter, is the natural read-out method for CCD [18]. The image sensor control unit sends the reset and the integration stop signals at the same time to every pixel of the CIS. The signal is stored inside the pixel in an analog memory, which is then read through the column bus. Depending on the pixel circuit, another image can be acquired during the reading of the current image. This feature is implemented to remove the dead time between two images of a video and is known as acquire-while-read.
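The difference between the two acquisition modes can be illustrated with a minimal timing sketch. The function names, the row time `t_row` and the numerical values below are assumptions made for illustration only; they do not describe a particular sensor.

```python
def rolling_shutter_windows(n_rows, t_int, t_row):
    """Return the (start, end) integration window of each row.

    Row r starts integrating at r * t_row and is read out t_int later,
    so consecutive rows see windows shifted by t_row.
    """
    return [(r * t_row, r * t_row + t_int) for r in range(n_rows)]


def global_shutter_windows(n_rows, t_int):
    """All rows share the same integration window [0, t_int]."""
    return [(0.0, t_int) for _ in range(n_rows)]


def skew(windows):
    """Time offset between the window starts of the first and last rows."""
    return windows[-1][0] - windows[0][0]


# Hypothetical values: 4 rows, 100 us integration time, 10 us row time.
rolling_windows = rolling_shutter_windows(4, 100e-6, 10e-6)
global_windows = global_shutter_windows(4, 100e-6)
```

In this sketch the rolling shutter skew grows linearly with the number of rows, while the global shutter skew is zero by construction.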

Fig. 16 Rolling Shutter and Global Shutter Acquisitions



## 2.2. High Speed Image Sensor

Nowadays video cameras are used to record many phenomena for the consumer, industrial and scientific markets. Applications such as sport video, machine vision, ballistics and scientific research require fast video cameras. These video cameras make use of high speed image sensors. High speed image sensors (IS) cover a wide range of speeds, from 120 frames per second (fps) for slow motion sport video to more than one thousand fps to record phenomena such as fracture mechanics [19] [20]. High speed ISs usually perform global shutter acquisition to avoid the skew effect presented in the previous section. The main issues in the design of a high speed IS are the bandwidth and the sensitivity [21]. These sensors require high speed circuits for the pixel, the ADCs and the output buffers [22]. The bandwidth of the image sensor circuits increases from the pixel to the output buffers as the multiplexing ratio increases. The CIS architecture allows each column to be read out and converted in parallel, which enables high data bandwidths. Such a feature is not possible with a standard CCD. Moreover, a CIS performs an on-chip analog to digital conversion and works at a low voltage compared to a CCD. Finally, high speed CCD image sensors are prone to artifacts such as (panel-to-panel) fixed pattern noise, blooming, and smear [23]. Therefore, CISs are usually preferred over CCDs for high speed imaging [24]. However, the frame rate of a CIS is also limited by its architecture. The bottleneck is the limited number of output buffers and their data bandwidth DB [25]. The data bandwidth required for a given frame rate FR depends on the CIS spatial resolution (W pixels wide and H pixels high) and on the resolution n of the ADCs (Eq. 1).

$$DB = W \times H \times n \times FR$$
 Eq. 1

The communication protocol commonly used for high speed CIS is Low Voltage Differential Signaling (LVDS). In 2006, a high speed CIS working at more than 1000 fps was designed based on eight 500 Mbit/s LVDS transmitters [26]. Each LVDS driver requires 2 output channels, as it is a differential protocol. The required output bandwidth and the number of output channels versus the CIS frame rate are presented in Fig. 17 for a 400x400 matrix resolution and an 8-bit ADC resolution. Implementing a high number of output drivers could be a solution to increase the frame rate. However, it becomes challenging to design such a circuit, especially in terms of power consumption. It is also challenging to process and store this large data flow. Therefore, the practical frame rate limit is currently about 20 kfps for a megapixel spatial resolution. A common feature in high speed cameras is thus to trade matrix resolution for frame rate. Based on this trick, Vision Research offers a video camera (*Phantom V2511*) recording at 25 kfps for a 1024x1024 matrix resolution [27]. By reducing the matrix resolution to 128x16, the camera reaches 1 Mfps. This CIS has an output bandwidth of 314 Gbit/s.
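As a rough illustration of Eq. 1 and of the LVDS channel count, the sketch below evaluates the required bandwidth for a given resolution and frame rate. The function names are hypothetical, the 1 kfps operating point is chosen arbitrarily, and the 12-bit depth used for the 1024x1024 case is an assumption (the text does not state the ADC resolution of that camera); with this assumption, Eq. 1 gives a value close to the quoted 314 Gbit/s.

```python
import math


def data_bandwidth(width, height, n_bits, frame_rate):
    """Eq. 1: required output data bandwidth in bit/s."""
    return width * height * n_bits * frame_rate


def lvds_pins(bandwidth, link_rate=500e6, pins_per_link=2):
    """Number of output pins needed with LVDS links of a given bit rate.

    Each LVDS link is differential and therefore uses two pins.
    """
    links = math.ceil(bandwidth / link_rate)
    return links * pins_per_link


# Settings of Fig. 17 (400x400 pixels, 8-bit ADC), evaluated at 1 kfps.
db = data_bandwidth(400, 400, 8, 1_000)
pins = lvds_pins(db)

# 1024x1024 at 25 kfps; the 12-bit resolution is an assumption.
db_full = data_bandwidth(1024, 1024, 12, 25_000)
```

The pin count grows linearly with the frame rate, which is why the output stage becomes the bottleneck well before the pixel array does.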

Fig. 17 CIS Output Bandwidth and Number of Output Channels versus the CIS Frame Rate



The second key parameter of a high speed image sensor is the sensitivity. The sensitivity S corresponds to the voltage variation $V_{int}$ created by a given photon flux $\phi$ during one second, Eq. 2 (a). As high speed image sensors have a short exposure time $T_{int}$, a good sensitivity is necessary to avoid using an overly powerful light source to illuminate the scene. The sensitivity is usually given in photometric units, in V/lux.s. There are different ways to increase the pixel sensitivity. The first solution is to increase the pixel photosensitive area A. It is also possible to optimize the photocurrent generated by a given input power (i.e. the responsivity R). The last possibility is to increase the voltage variation created by a photogenerated electron. Usually, the charge to voltage conversion is performed by the integration of charges on a capacitor $C_{int}$. This capacitor thus has to be minimized to maximize the sensitivity. The sensitivity is then given by Eq. 2 (b).

$$\begin{cases} V_{int} = S \times T_{int} \times \phi & (a) \\ S = \frac{R}{C_{int}} \times A & (b) \end{cases}$$
 Eq. 2

For instance, in a high speed X-ray CIS for medical applications, the sensitivity has to be as high as possible to reduce the X-ray dose received by the patient [28]. This CIS, working at 1500 fps, has a pixel of 15x15  $\mu m^2$  with a fill factor of 37%. The pixel sensitivity is 68.5 V/lux.s thanks to a capacitive trans-impedance amplifier (CTIA) implemented with a 0.7 fF integration capacitor.
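Eq. 2 can be evaluated numerically with the short sketch below. It assumes radiometric SI units (responsivity in A/W, irradiance in W/m²) rather than the photometric V/lux.s used above, and every numerical value is hypothetical.

```python
def sensitivity(responsivity, area, c_int):
    """Eq. 2 (b): S = R / C_int * A, with SI inputs (A/W, m^2, F)."""
    return responsivity / c_int * area


def integrated_voltage(s, t_int, irradiance):
    """Eq. 2 (a): V_int = S * T_int * phi."""
    return s * t_int * irradiance


# Hypothetical pixel: R = 0.3 A/W, 30 um x 30 um photodiode, C_int = 10 fF.
s = sensitivity(0.3, 30e-6 * 30e-6, 10e-15)
# Hypothetical exposure: 1 us under an irradiance of 10 W/m^2.
v = integrated_voltage(s, 1e-6, 10.0)
```

With these assumed values the integrated voltage is about 0.27 V, and halving `c_int` doubles the sensitivity, as Eq. 2 (b) states.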

## 2.3. Burst Image Sensor

As stated in the previous section, the frame rate of classical high speed image sensors can reach up to 25 kfps at full resolution. This limitation is due to the bottleneck at the output buffer level. However, some applications require faster frame rates. A survey was conducted among 1000 potential users of very high speed cameras in Japan to identify the needs in terms of frame rate [29] [30]. This study points out the need for Mfps image sensors (Fig. 18). To tackle the frame rate bottleneck of CISs, different solutions can be devised. For instance, compression techniques could be used to reduce the data bandwidth on-chip and thus increase the frame rate. To drastically reduce the data bandwidth, lossy compression has to be considered. However, it degrades the image quality in a way that does not always match the specifications of high speed imaging applications such as scientific imaging.

Fig. 18 Frame Rate Requirement for High Speed Image Sensor



Another solution is to perform the video storage on-chip and thus avoid the output buffer bottleneck. This solution has been developed since the 1990s with CCD and CMOS technologies [31]. These image sensors are known as burst image sensors and enable frame rates above 100 kfps. Burst image sensors usually acquire images in global shutter mode. The drawback of a burst image sensor is the limited number of images per recording. These cameras target applications where the recorded event is time-limited. The timing diagrams of burst acquisition and classical acquisition are shown in Fig. 19.

Fig. 19 Timing Diagram of Burst and Continuous Recording



### 2.3.1. Architectures

#### 2.3.1.1. CCD Image Sensor

Since the early 1990s, an interest in CCD high speed video cameras has emerged. Indeed, a CCD image sensor offers a natural burst memory thanks to the CCD registers. Moreover, CCD was commonly used for high end applications. As high speed image sensors are usually used for scientific applications, CCD seemed to be a good candidate to implement a burst architecture. The first prototype of a CCD burst image sensor was developed in 1993 [32]. The architecture of a CCD burst image sensor is illustrated in Fig. 20. The circuit is composed of a matrix of photodiodes with a memory below each photodiode. The memory is a first-in first-out (FIFO) memory composed of CCD registers. During the burst acquisition, the photodiode performs the photo-detection and generates a number of charges depending on the amount of received light. A collection gate extracts the charges from the photodiode to the first CCD register of the FIFO memory. The charges corresponding to an image are successively moved from one CCD site of the memory to its neighbor. The last element of the CCD FIFO is a drain. If the user records more images than the memory capacity, the first images are discarded in the drain and only the last images are memorized. This feature eases the synchronization of the video recording with the studied phenomenon, which is useful if the occurrence of the phenomenon is not well controlled. When the acquisition operation is finished, the burst reading is performed by a Vertical CCD (VCCD) segment and by a Horizontal CCD (HCCD) segment. One VCCD is attached to each pixel column. The HCCD is placed at the bottom of the pixel matrix to read out the VCCDs. The VCCD is made of segments. Each segment is composed of CCD registers and CCD switches. First, the CCD switch is connected to the last element of the FIFO to load the VCCD with images of the burst. Then the CCD switch is connected to the last CCD register of the previous segment. The VCCD is then read out by the HCCD.

Fig. 20 Simplified CCD Burst Image Sensor Architecture (panel: Burst Acquisition - CCD Register Loading; labels: Photodiode, CCD Register, VCCD segment, CCD switch, HCCD)
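The behaviour of the in-pixel FIFO terminated by a drain can be sketched as a bounded queue. This is only a behavioural model written for illustration: the class name `BurstFifo` and its methods are hypothetical, and the charge packets are represented by plain frame indices.

```python
from collections import deque


class BurstFifo:
    """Behavioural model of an in-pixel CCD FIFO ending in a drain."""

    def __init__(self, depth):
        # deque(maxlen=...) discards the oldest item when full, which
        # mimics the oldest charge packet falling into the drain.
        self._cells = deque(maxlen=depth)

    def acquire(self, charge):
        """Shift a new charge packet into the first register."""
        self._cells.append(charge)

    def read_out(self):
        """Return the stored packets, oldest first."""
        return list(self._cells)


fifo = BurstFifo(depth=4)
for frame in range(10):      # record 10 frames into a 4-deep memory
    fifo.acquire(frame)
stored = fifo.read_out()     # only the last 4 frames remain
```

Because the recording can run indefinitely and simply be stopped after the event, the trigger may arrive after the phenomenon rather than before it.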

As presented before, the sensitivity is an important characteristic of high speed burst image sensors for visible light applications. For a burst CCD image sensor, a significant area of the sensor is used by the burst memories. For instance, the fill factor is only 13 % for a memory depth of 100 images per burst [30]. To keep a good sensitivity, a solution is to use a Back-Side-Illuminated (BSI) architecture [33]. This technology makes it possible to collect almost all the electrical charges generated in the silicon (Fig. 21). The effective fill factor is thus about 100 %. To produce a BSI sensor, the silicon substrate is thinned to 30  $\mu$m. The photons reach the back side of the chip and generate electron/hole pairs in the depletion region of the PN junction. The electric potential profile  $Z_1$  of the junction drives the electrons from the back side to the collection gate on the front side. The CCD memories are placed in a PWell to protect them from the light. The PWell creates a potential barrier  $Z_2$  which prevents the electrons from moving to the CCD memories. If an electron is generated below the CCD memory, the potential profile  $Z_2$  moves it to the collection gate.

Fig. 21 Back-Side-Illuminated Burst CCD Image Sensor



The current trend for CCD burst image sensors is to push further the frame rate limitation. During the acquisition of one image, the electrical charges move from the backside of the sensor to the collection gate. They are then transferred from the collection gate to the CCD memory. The acquisition time of one image is limited by the sum of the collection time and the transfer time. The transfer time is about 10 times higher than the collection time [34]. The frame rate is thus limited in BSI CCD architecture by the transfer time from the collection gate to the CCD memory. For instance, the pixel of ISIS-V40 has a collection time of 62.5 ns, therefore the frame rate is limited to 16 Mfps [34]. A solution to tackle this frame rate limitation is to use a multi-collection gate (MCG) pixel [35]. The back side illuminated pixel is composed of a set of collection gates (CG) surrounding the center of the pixel (Fig. 22). Each collection gate successively collects and transfers to the memories the charges generated in the substrate. The collection gates perform the charge transfers in parallel. Therefore, the frame rate is increased by a factor which is equal to the number of collection gates per pixel.
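The frame rate figures above follow from a simple reciprocal relation, sketched below. The sketch takes the 62.5 ns quoted for ISIS-V40 as the time that limits one frame, and the four-gate pixel is a hypothetical example, not a reported design.

```python
def max_frame_rate(t_frame, n_gates=1):
    """Frame rate limit when n_gates collection gates operate in turn.

    t_frame is the time one gate needs before it can handle a new frame.
    """
    return n_gates / t_frame


single_gate = max_frame_rate(62.5e-9)            # about 16 Mfps
four_gates = max_frame_rate(62.5e-9, n_gates=4)  # hypothetical MCG pixel
```

The second call shows the multiplication by the number of gates described for the MCG pixel; it ignores any overhead in switching between gates.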

Fig. 22 Multi-Collection Gate Pixel Architecture and its Timing Diagram



However, one limitation of CCD burst image sensor is the limited speed of the readout operation. Therefore, it is impossible to implement a high speed continuous mode in CCD burst image sensor. To enable these two acquisition modes, a solution based on hybrid CCD/CMOS image sensor has been proposed [36]. In burst mode, the acquisition and the storage are performed at 5 Mfps in a way similar to the above description [37]. In continuous mode, the image sensor works as a classical high speed image sensor and reaches 1180 fps.

#### 2.3.1.2. CMOS Image Sensor

Burst image sensors have been developed in CMOS technology since the 2000s [11]. CMOS burst image sensors can be sorted into two categories depending on the location of the burst memory: the frame memory can be inside or outside the pixel. For in-pixel storage, capacitive memory points are implemented inside the pixel [38]. This architecture is very limited in terms of memory depth because of the limited pixel size. Moreover, this solution requires a tradeoff between the memory depth and the pixel fill factor. The storage capacitor must have a good density in order to acquire a high number of images per burst. To maximize the number of stored images, the capacitor value must be low. However, the minimal value depends on the targeted sampling noise and leakage, which respectively define the dynamic range and the retention time of the analog data. The capacitor value is thus chosen as a tradeoff between these criteria. For instance, a burst CIS with a pixel size of 37x30 μm<sup>2</sup> is able to store 8 images inside the pixel, as illustrated in Fig. 23 [39]. The pixel fill factor is 9 % and the sampling cell is designed to have a dynamic range of 45 dB. In this architecture the sampling cells also perform the global shutter acquisition. The main drawback of in-pixel storage is the poor pixel fill factor and thus the limited sensitivity of such an approach.
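The link between the sampling capacitor and the dynamic range can be made concrete with the standard kT/C sampling noise expression. The sketch below assumes that kT/C noise is the only noise source and ignores leakage entirely; the 1 V swing and the 60 dB target are hypothetical values, not figures from the cited designs.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def ktc_noise(c, temperature=300.0):
    """rms sampling noise voltage sqrt(kT/C) on a capacitor c (farads)."""
    return math.sqrt(K_B * temperature / c)


def dynamic_range_db(v_swing, c, temperature=300.0):
    """Dynamic range in dB if kT/C noise were the only noise source."""
    return 20.0 * math.log10(v_swing / ktc_noise(c, temperature))


def min_capacitor(v_swing, dr_db, temperature=300.0):
    """Smallest capacitor meeting a target dynamic range (kT/C only)."""
    v_noise = v_swing / 10 ** (dr_db / 20.0)
    return K_B * temperature / v_noise ** 2


# Hypothetical sampling cell: 1 V swing and a 60 dB target.
c_min = min_capacitor(1.0, 60.0)
```

Under these assumptions the minimum capacitor is a few femtofarads, and each additional 6 dB of dynamic range roughly quadruples it, which directly reduces the number of cells that fit in a pixel.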

Fig. 23 Burst CIS Pixel with 8 In-Pixel Memories



With out-of-pixel storage, the burst of images is stored in an analog memory array on the side of the pixel matrix [40]. The design of the sampling cell has the same constraints as the in-pixel memory cell (i.e. sampling noise and leakage), but this solution does not impact the pixel fill factor and thus preserves the pixel sensitivity. The main drawback of out-of-pixel storage is the long path between the pixel and the sampling cells. This metal line must be carefully routed to prevent SNR drops due to crosstalk. Besides, this long line is capacitive. For each column of pixels, an analog buffer is implemented to drive this line. These buffers work at the frame rate bandwidth and are a major source of power consumption. A tradeoff must therefore be made between the power consumption and the frame rate of the burst CIS. The other drawback is the limited resolution of the image sensor. Indeed, the memory array area depends on the required memory depth. For instance, 2/3 of the chip is dedicated to the burst memory to reach 128 images per burst with full custom capacitive memories, as illustrated in Fig. 24 [40]. With the out-of-pixel storage architecture, the frame rate reaches up to 20 Mfps but is limited by the RC delay of the line used to access the memory [41].

Fig. 24 Burst CIS with Out-Of-Pixel Storage



The sensitivity of burst CISs depends on the integrating capacitor and on the fill factor (i.e. photosensitive area) of the pixel. The simplest structure to perform the current to voltage conversion is the 3T pixel, where the photogenerated charges are integrated on the photodiode junction capacitor. This structure is very simple and does not impact the pixel fill factor. Moreover, it has a large bandwidth and can reach up to 1 Gfps. However, the pixel photodiode is usually large for high speed imaging (> 30x30 µm<sup>2</sup>). The junction capacitance is thus high and the sensitivity is poor [39]. Another structure, which increases the conversion gain, is based on a CTIA. This circuit performs the charge integration on a chosen capacitor, which is kept small to increase the sensitivity [38]. The drawback of this structure is the addition of one operational amplifier per pixel, which consumes power and affects the fill factor. To target frame rates above 30 Mfps, it becomes interesting to perform the current to voltage conversion with a resistive trans-impedance amplifier (RTIA) [42]. However, this technique requires a large area due to the large footprint of a resistor in integrated technology [43]. Finally, a pinned photodiode can perform the charge integration. The pinned photodiode is based on the collection gate technology used in CCD image sensors [40]. The charges collected on the photodiode are transferred to a floating diffusion node. The charge to voltage conversion is performed by the small capacitance of the floating diffusion node. The benefit of this solution is that it provides a good sensitivity without impacting the pixel fill factor. As for CCD image sensors, the charge transfer operation limits the frame rate to about 16 Mfps. Moreover, a pinned photodiode requires a specific technology option to perform the charge transfer, which is not available in standard CMOS technology.

To reach frame rates above 1 Gfps, a specific type of burst image sensor is used. These cameras, also known as streak cameras, are composed of a single column of pixels with a memory line alongside each pixel [43]. They reach a time resolution of up to 100 ps, but they acquire an image in one dimension. By scanning the scene, a 2D image can be reconstructed.

#### 2.3.1.3. Synthesis

A summary of the different burst image sensors is presented in Tab. 1. This table contains the performances of the burst image sensors presented above. Some characteristics, such as the conversion gain and the dynamic range, have been computed from the data found in the references. In terms of frame rate, image sensors based on collection gate (CG) technology do not exceed 20 Mfps due to their limited transfer time. Higher frame rates can be reached by using a multi collection gate (MCG) structure as in the hybrid CMOS/CCD image sensor. Recent developments propose an MCG pixel able to reach 1 Gfps, but no prototype has been reported yet [44]. With a basic 3T pixel, a burst CIS has a frame rate of 1 Gfps and can reach 2 Gfps with an RTIA stage. However, this last solution is implemented in a streak camera, where the fill factor is not an issue. Otherwise, an RTIA circuit is usually too large to be implemented inside the pixel. In terms of sensitivity, the image sensors based on CG technology show a conversion gain of more than 74 µV/e-, while the image sensors based on a 3T pixel have a poor conversion gain of 2.2 µV/e-. The same difference appears in terms of dynamic range, with more than 60 dB for CG pixels and 45 dB for the 3T pixel. The power consumption of a burst image sensor strongly depends on the architecture. However, for a given architecture, the power consumption scales with the frame rate. As presented before, only CMOS and hybrid CMOS/CCD image sensors are able to perform high speed continuous acquisition. It is interesting to note that the memory depth hardly exceeds 200 images for both CCD and CMOS architectures. Moreover, all the burst memories of these cameras are analog storages. Analog memories are known to be sensitive to hostile conditions such as high temperatures, strong radiation and space environments.
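The conversion gains compared above can be related to an equivalent integration capacitance through the relation gain = q / C. The sketch below assumes that the gain is set only by the integration node, with a unity-gain buffer after it; the capacitances it returns are therefore indicative values, not figures taken from the references.

```python
Q_E = 1.602176634e-19  # elementary charge, C


def conversion_gain(c_node):
    """Conversion gain in V per electron for a node of c_node farads."""
    return Q_E / c_node


def node_capacitance(gain):
    """Capacitance (farads) implied by a gain in V per electron."""
    return Q_E / gain


c_cg = node_capacitance(74e-6)    # about 2.2 fF for 74 uV/e-
c_3t = node_capacitance(2.2e-6)   # about 73 fF for 2.2 uV/e-
```

The ratio of more than 30 between the two capacitances reflects the difference between a small floating diffusion node and the junction capacitance of a large photodiode.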

Tab. 1 Burst Image Sensor Summary

|                         | [33]      | [40]    | [45]      | [36] [37]   | [39]  | [43]  |
|-------------------------|-----------|---------|-----------|-------------|-------|-------|
| Technology              | CCD       | CMOS    | CCD       | CCD in CMOS | CMOS  | CMOS  |
| Type of Integration     | CG        | CG      | MCG       | CG          | 3T    | RTIA  |
| Frame Rate (fps)        | 16 M      | 20 M    | 100 M     | 2 M         | 1 G   | 2 G   |
| Memory Depth (#)        | 117       | 256     | 126       | 180         | 8     | 128   |
| Matrix Resolution (#)   | 362x456   | 200x256 | 440x320   | 924x764     | 32x32 | 1x64  |
| Pixel Pitch (μm)        | 43        | 32      | 50        | 30          | 37    | 26    |
| Fill Factor             | ~100 %    | 37 %    | NA        | 11 %        | 9 %   | NA    |
| Conversion Gain (μV/e-) | "good"\*  | 74      | "good"\*  | 80          | 2.2   | NA    |
| Dynamic Range (dB)      | ~60       | >60     | ~60       | 61          | 45    | NA    |
| Continuous Mode         | No        | Yes     | No        | Yes         | No    | No    |
| Power Consumption (W)   | NA\*\*    | 24      | NA\*\*    | 12 @ 1 Mfps | NA    | > 0.6 |

<sup>\*</sup> not communicated but should be about 80 μV/e- due to charge transfer operation

<sup>\*\*</sup> not communicated but should be high due to the high voltage required to control CG

#### 2.4. Photodetectors

The photodetector is the first element of the image sensor. It converts the incident photons into electrical charges. Photodetection can be performed with different devices. As we are considering CMOS image sensors, only silicon-based photodetectors are presented here. High speed image sensors have specific requirements in terms of photodetection. The first is the photodetector bandwidth, which must be sufficient not to limit the frame rate. The second is the sensitivity: as presented previously, the exposure time is very short (< 1 $\mu$s). Burst image sensors are thus designed to have a good sensitivity. Their photodetectors are therefore usually large, and photodetectors with a good responsivity are favored.

#### 2.4.1. Photodiodes

The first photodetector presented here is the photodiode. We start with a short presentation of the device and of the physical mechanisms that create a photocurrent across the photodiode. A photodiode is a reverse-biased diode. The PN junction is thus depleted and an electric field is present in the depletion region, as shown in Fig. 25. Under illumination, photons penetrate into the silicon and are absorbed. The photon flux decreases exponentially with depth due to the silicon absorption. An absorbed photon generates a hole/electron (h/e) pair. If an h/e pair is created in the depletion region, the electric field drives the hole to the P region and the electron to the N region. This motion creates a photocurrent known as drift current. If an h/e pair is created in the N region, the hole moves to the depletion region through a diffusion mechanism. Once there, the hole is driven by the electric field to the P region. The same mechanism occurs for the electron of an h/e pair generated in the P region. This motion of an electron or a hole also creates a photocurrent across the PN junction, known as diffusion current.

Fig. 25 Reverse Biased PN Junction



As presented above, the responsivity characterizes the efficiency with which the photodiode converts the input optical power into current and is given in amperes per watt. The responsivity of a photodiode changes with the wavelength of the incident light. For a silicon photodiode, it reaches a maximum around 600-800 nm. As presented before, the incident photons are absorbed in the silicon at a depth that depends on their wavelength. Using the Beer-Lambert law, it is possible to express the number of pairs created per unit volume and time by an incident photon flux $\phi_0$ (photon/m²/s) at a given depth z (Eq. 3). This quantity is known as the generation rate G and depends on the silicon absorption coefficient $\alpha_{Si}$, which itself depends on the wavelength $\lambda$ and the silicon extinction coefficient $k_{Si}$ [46].

$$\begin{cases} G(z) = \alpha_{Si}\phi_0 e^{-\alpha_{Si}z} \\ \alpha_{Si} = \frac{4\pi k_{Si}}{\lambda} \end{cases}$$
 Eq. 3
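
As a rough numerical illustration of Eq. 3, the sketch below evaluates the generation rate and the fraction of the incident flux absorbed between two depths. The function names and the extinction coefficient used in the example are illustrative assumptions and are not taken from [46]. The exponent is negative because the photon flux decays with depth.

```python
import math


def absorption_coefficient(k_si: float, wavelength_m: float) -> float:
    """Absorption coefficient alpha_Si = 4*pi*k_Si / lambda, in 1/m."""
    return 4.0 * math.pi * k_si / wavelength_m


def generation_rate(z_m: float, phi0: float, alpha: float) -> float:
    """Generation rate G(z) = alpha * phi0 * exp(-alpha * z), in pairs/m^3/s."""
    return alpha * phi0 * math.exp(-alpha * z_m)


def absorbed_fraction(z_top_m: float, z_bottom_m: float, alpha: float) -> float:
    """Fraction of the incident flux absorbed between two depths.

    Obtained by integrating G(z) / phi0 from z_top_m to z_bottom_m.
    """
    return math.exp(-alpha * z_top_m) - math.exp(-alpha * z_bottom_m)


# Placeholder extinction coefficient (assumed for illustration, not from [46]).
K_SI_EXAMPLE = 0.02
alpha = absorption_coefficient(K_SI_EXAMPLE, 630e-9)

near_surface = absorbed_fraction(0.0, 0.5e-6, alpha)
deeper = absorbed_fraction(0.5e-6, 3.0e-6, alpha)
print(f"1/alpha = {1e6 / alpha:.2f} um")
print(f"absorbed in 0-0.5 um: {near_surface:.1%}, in 0.5-3 um: {deeper:.1%}")
```

With a real $k_{Si}(\lambda)$ table in place of the placeholder, the same two functions show how the absorption moves closer to the surface as the wavelength shortens.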

However, only the h/e pairs generated in the depletion region and the diffusion regions can contribute to the photocurrent. The size of these regions depends on the junction structure and the dopant concentrations, which vary from one technology to another. Moreover, an accurate expression of the photocurrent would require a description of the charge recombination mechanisms. We will not go into such detail here. Instead, the responsivity is studied based on measurements made on different photodiodes [47]. The three analyzed photodiodes are presented in Fig. 26. The N+/PSub and NWell/PSub structures have a single depletion region, whereas the P+/NWell/PSub structure has two depletion regions.

Fig. 26 Cross-Sectional Views of (a) N+/PSub, (b) NWell/PSub and (c) P+/NWell/PSub Photodiodes



The responsivity of these photodiodes is plotted versus the incident light wavelength in Fig. 27. The shape of the responsivity can be analyzed with the generation rate formula of Eq. 3, which shows that photons with a short wavelength are absorbed closer to the silicon surface than photons with a long wavelength. The responsivity of the N+/PSub photodiode is flat at short wavelengths and then decreases at mid-range wavelengths because its depletion region is close to the surface. The NWell/PSub and P+/NWell/PSub photodiodes have a deeper depletion region and are thus more sensitive to mid-range wavelengths. This explains the triangular shape of their responsivity. For the three photodiodes, the peak responsivity is reached at 630 nm. The P+/NWell/PSub photodiode has the highest responsivity, with a maximum of 0.45 A/W due to its two depletion regions. The NWell/PSub and N+/PSub photodiodes have maximum responsivities of 0.25 A/W and 0.11 A/W respectively. This difference can be explained by the larger depletion region of the NWell/PSub junction compared with the N+/PSub junction.

Fig. 27 Photodiodes Responsivity versus the Incident Light Wavelength for 40x40 μm² Photodiodes



We will now present the frequency response of the photodiodes. The physical phenomena that create the photocurrent have their own time constants, which limit the bandwidth of the photodiode. This bandwidth is known as the intrinsic bandwidth. Moreover, the photodiode has its own electrical characteristics that affect the bandwidth of the pixel circuit. This bandwidth is known as the extrinsic bandwidth. The intrinsic bandwidth corresponds to the -3 dB bandwidth of the responsivity and depends on the nature of the charge motion. Under a given illuminance, the photocurrent across the photodiode is the sum of the drift current created by the electric field and the current due to the diffusion process. Studies of their respective bandwidths show that the intrinsic frequency response of a photodiode is limited by the time constant of the diffusion phenomena. However, the bandwidths also depend on the nature of the PN junction; they are summarized in Tab. 2 for the photodiode structures presented in Fig. 26. They are given for finger photodiode structures at a wavelength of 650 nm [48]. The NWell/PSub and P+/NWell/PSub photodiodes have intrinsic bandwidths of 70 MHz and 100 MHz respectively. For these structures, the frequency response is limited by the diffusion current in the P-substrate. The N+/PSub structure is not analyzed in this reference. However, its photocurrent is the sum of the N+ diffusion current, the P-substrate diffusion current and the depletion drift current. If we suppose that the N+ diffusion current and the NWell diffusion current have the same frequency response, one can assume that its -3 dB bandwidth is close to that of the NWell/PSub photodiode and should thus be limited to about 70 MHz. However, the photodiode frequency response does not correspond to the response of a first order system: above the cutoff frequency, it rolls off by much less than 20 dB/dec. For instance, the NWell/PSub photodiode can be used at 10 times its cutoff frequency with an attenuation of only 55 %. A solution to increase the bandwidth is to use a P+/NWell structure. There is then no substrate diffusion current and the -3 dB bandwidth reaches 2 GHz. To first order, the intrinsic bandwidth is independent of the photodiode area.

**Tab. 2 Intrinsic Frequency Response of Different Photodiode Structures** 

|               | -3 dB bandwidth (Hz) | gain loss (dB/dec) |
|---------------|----------------------|--------------------|
| NWell/PSub    | 70 M                 | -5                 |
| P+/NWell/PSub | 100 M                | -5                 |
| N+/PSub       | ~70 M                | ~-5                |
| P+/NWell      | 2 G                  | -10                |
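
To get a feel for the slopes listed in Tab. 2, the sketch below converts a roll-off slope into the amplitude remaining at a given multiple of the cutoff frequency and compares it with a first order system. It assumes, as a simplification that is not stated in [48], that the slope is constant above the -3 dB frequency, that the result is referenced to the amplitude at cutoff, and that decibels follow the 20·log10 amplitude convention. The function name is hypothetical.

```python
import math


def relative_amplitude(freq_ratio: float, slope_db_per_dec: float) -> float:
    """Amplitude at freq_ratio * f_cutoff relative to the amplitude at f_cutoff.

    Assumes a constant slope (negative value, in dB per decade) above cutoff.
    """
    loss_db = slope_db_per_dec * math.log10(freq_ratio)
    return 10.0 ** (loss_db / 20.0)


# One decade above cutoff: Tab. 2 slope versus a first order system.
photodiode = relative_amplitude(10.0, -5.0)
first_order = relative_amplitude(10.0, -20.0)
print(f"-5 dB/dec: {photodiode:.2f}, -20 dB/dec: {first_order:.2f}")
```

Under these assumptions, the photodiode keeps a much larger share of its response one decade above cutoff than a first order system would, which is why operation beyond the -3 dB frequency remains practical.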

The extrinsic bandwidth corresponds to the bandwidth of the system formed by the photodiode and the pixel circuit. It is thus not possible to express this bandwidth without considering a given current to voltage conversion stage. However, the bandwidth of this stage usually depends on the equivalent capacitance of the photodiode and decreases as this capacitance increases. Moreover, if the current to voltage conversion is performed with a 3T stage, the sensitivity decreases for high values of the photodiode capacitance. For these reasons, photodiode structures with a small capacitance are preferred for high speed imaging. The photodiode capacitance is due to the depletion region of the reverse-biased PN junction. Its value depends on the silicon permittivity $\varepsilon_{Si}$, the junction width W and the diode area A, and is given by Eq. 4. The width of the junction depends on the elementary charge q, the dopant concentrations in the N region $N_n$ and in the P region $N_p$, the built-in potential $V_{BI}$ and the bias voltage $V_A$. The junction capacitance strongly depends on the dopant concentrations and thus on the considered CMOS technology.

$$C_{Junction} = \frac{A \varepsilon_{Si}}{W} = A \sqrt{\frac{q \varepsilon_{Si}}{2(V_{BI} - V_A)} \frac{N_p N_n}{N_p + N_n}}$$
 Eq. 4
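
The sketch below evaluates Eq. 4 numerically for an abrupt junction. The doping levels, built-in potential and bias in the example are hypothetical values chosen only for illustration, and the relative permittivity of silicon is taken as 11.7. Side wall capacitances are ignored.

```python
import math

Q = 1.602176634e-19          # elementary charge (C)
EPS0 = 8.8541878128e-12      # vacuum permittivity (F/m)
EPS_SI = 11.7 * EPS0         # silicon permittivity (F/m)


def depletion_width(n_n: float, n_p: float, v_bi: float, v_a: float) -> float:
    """Junction width W in metres; dopant concentrations in 1/m^3, voltages in V.

    v_a is negative for a reverse-biased junction.
    """
    return math.sqrt(2.0 * EPS_SI * (v_bi - v_a) / Q * (n_n + n_p) / (n_n * n_p))


def junction_capacitance(area_m2: float, n_n: float, n_p: float,
                         v_bi: float, v_a: float) -> float:
    """Bottom-plate junction capacitance of Eq. 4, in farads."""
    return area_m2 * math.sqrt(
        Q * EPS_SI / (2.0 * (v_bi - v_a)) * (n_p * n_n) / (n_p + n_n)
    )


# Hypothetical example: 40x40 um^2 diode, lightly doped side at 1e15 cm^-3.
AREA = (40e-6) ** 2
w = depletion_width(1e23, 1e21, 0.7, -1.8)
c = junction_capacitance(AREA, 1e23, 1e21, 0.7, -1.8)
print(f"W = {w * 1e6:.2f} um, C = {c * 1e15:.1f} fF")
```

The two functions are consistent by construction, since the capacitance equals $A\varepsilon_{Si}/W$; lowering the doping of either side widens the junction and reduces the capacitance.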

A study that takes into account the bottom and side wall capacitances of the depletion region has been carried out for large photodiodes [47]. The junction capacitance of each photodiode structure, computed for a UMC 180 nm technology, is given in Tab. 3. As expected, the junction capacitance scales with the photodiode area. It is interesting to note that the junction capacitance of the NWell/PSub photodiode structure is about 10 times smaller than that of the other structures. This is due to the low dopant concentration of the NWell, which creates a wider depletion region than in the other, heavily doped junctions. This trend is confirmed by measurements in a Tower 180 nm technology [49].

Tab. 3 Junction Capacitance (fF) for Different Photodiodes Structures and Different Areas

|               | 10x10 μm² | 20x20 μm² | 40x40 μm² |
|---------------|-----------|-----------|-----------|
| NWell/PSub    | 7.84      | 28.85     | 110.5     |
| P+/NWell/PSub | 89.64     | 367.15    | 1341.73   |
| N+/PSub       | 74.6      | 289.99    | 1140      |

Finally, it is important to consider some non-idealities of the photodiode when designing high speed image sensors. A photodiode generates a current even without incident light. This current is known as dark current and is due to the thermal generation of h/e pairs. There are different sources of dark current, but the main one is located at the Si/SiO<sub>2</sub> interface at the surface of the photodiode. Its value increases strongly with temperature and doubles for every 8 K rise. Tab. 4 gives the dark current density of each photodiode structure [50] for a 0.5 $\mu$m CMOS process. The N+/PSub and P+/NWell/PSub photodiodes show the lowest dark current densities.

Tab. 4 Dark Current Density for Different Photodiodes Structures at 300 K

|               | Dark Current Density (nA/cm²) |
|---------------|-------------------------------|
| N+/PSub       | 96.2                          |
| NWell/PSub    | 363.4                         |
| P+/NWell/PSub | 90.3                          |

The dark current acts as an offset added to the signal. For a photocurrent to voltage conversion performed with an integrator, the dark current reduces the full well capacity of the pixel and thus reduces the dynamic range of the sensor [51]. However, high speed image sensors require large photocurrents, and the effect of the dark current on the dynamic range can be neglected.
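
The order of magnitude can be checked with a short sketch that applies the doubling rule to the densities of Tab. 4 and counts the dark electrons collected during one exposure. The 40 $\mu$m photodiode side and the 1 $\mu$s exposure are assumed example values, and combining them with densities measured in a 0.5 $\mu$m process is only indicative.

```python
Q = 1.602176634e-19  # elementary charge (C)

# Dark current densities at 300 K from Tab. 4, in nA/cm^2.
DARK_CURRENT_300K = {
    "N+/PSub": 96.2,
    "NWell/PSub": 363.4,
    "P+/NWell/PSub": 90.3,
}


def dark_current_density(j_300k: float, temperature_k: float,
                         doubling_k: float = 8.0) -> float:
    """Scale a 300 K density assuming it doubles every `doubling_k` kelvin."""
    return j_300k * 2.0 ** ((temperature_k - 300.0) / doubling_k)


def dark_electrons(j_na_per_cm2: float, side_um: float, exposure_s: float) -> float:
    """Number of dark electrons collected by a square photodiode in one exposure."""
    area_cm2 = (side_um * 1e-4) ** 2
    current_a = j_na_per_cm2 * 1e-9 * area_cm2
    return current_a * exposure_s / Q


for name, j300 in DARK_CURRENT_300K.items():
    j_hot = dark_current_density(j300, 340.0)
    n_dark = dark_electrons(j_hot, side_um=40.0, exposure_s=1e-6)
    print(f"{name}: {j_hot:.0f} nA/cm^2 at 340 K, {n_dark:.0f} e- per exposure")
```

Comparing the resulting electron counts with the full well capacity of a given pixel indicates whether the dark current can indeed be neglected for that design.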

#### 2.4.2. Pinned Photodiode

The pinned photodiode was developed for CIS, inspired by the CCD charge transfer operation. It is composed of a P+/N/PSub structure, a collection gate CG and a floating diffusion (FD) region [52]. At the beginning of an exposure, the N region is depleted and is at a high potential known as the pinning voltage $V_p$. Thanks to its potential profile along the z axis, the P+/N/PSub structure collects the photogenerated charges in the storage well (SW), as illustrated in Fig. 28. The collected charges are held in the SW by the potential barrier $V_B$ created by the low voltage applied to the CG. At the end of the exposure time, the potential barrier is removed by applying a high voltage to the CG. A monotonic potential slope then appears between the SW and the FD, and the electric field drives the collected charges to the FD region. The charges $Q_{ph}$ transferred onto the FD junction capacitance $C_{FD}$ create a voltage shift $\Delta V_{ph}$ that is read by the pixel. The conversion gain of a classical pinned photodiode is about 50 $\mu$V/e-.

Fig. 28 Pinned Photodiode Cross Section and its Potential Diagrams
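
The relation between the floating diffusion capacitance and the conversion gain can be made explicit with a minimal sketch. It assumes the ideal relation $\Delta V_{ph} = Q_{ph}/C_{FD}$, so that one electron produces a shift of $q/C_{FD}$, and ignores the gain of the readout stage. The function names are hypothetical.

```python
Q = 1.602176634e-19  # elementary charge (C)


def conversion_gain_uv_per_e(c_fd_farad: float) -> float:
    """Conversion gain q / C_FD, in microvolts per electron."""
    return Q / c_fd_farad * 1e6


def fd_capacitance_farad(gain_uv_per_e: float) -> float:
    """Floating diffusion capacitance that yields a given conversion gain."""
    return Q / (gain_uv_per_e * 1e-6)


def voltage_shift_mv(n_electrons: float, c_fd_farad: float) -> float:
    """Voltage shift on the floating diffusion for a packet of electrons, in mV."""
    return n_electrons * Q / c_fd_farad * 1e3


c_fd = fd_capacitance_farad(50.0)  # 50 uV/e- as quoted in the text
print(f"C_FD = {c_fd * 1e15:.2f} fF")
print(f"shift for 1000 e-: {voltage_shift_mv(1000, c_fd):.1f} mV")
```

Under this ideal relation, a conversion gain of 50 $\mu$V/e- corresponds to a floating diffusion capacitance of a few femtofarads.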



We prefer to express the photoelectrical conversion efficiency of the pinned photodiode with the quantum efficiency (QE) rather than the responsivity. The quantum efficiency is the ratio between the number of photogenerated charges and the number of received photons. The QE can be improved by placing the pinned photodiode in a weakly doped, epitaxially grown P region (PEpi). In doing so, the depletion region is enlarged and more photogenerated charges are collected. The QE then increases to 0.4-0.8 depending on the PEpi thickness [53]. The bandwidth of the pinned photodiode is limited by the charge transfer operation from the SW to the FD. For a small pixel, the charge motion during the transfer is caused by the electric field. However, in a large pixel, the potential in the SW region is flat and the charge transfer is limited by diffusion across the pinned photodiode. In such a case, the transfer time is proportional to the ratio between the square of the length $l^2$ and the electron diffusion coefficient $D_n$. For a pinned photodiode of 40 $\mu$m, the transfer time is about 600 ns. Some work has been done to improve the transfer time for high speed applications. It consists in creating a graded doping of the SW along the x axis. The graded doping creates an electric field that drives the charges towards the collection gate region [54]. In doing so, the photogenerated charges are fully transferred in less than 100 ns [55].
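
The quadratic dependence on the photodiode length can be illustrated by scaling the reference point quoted above (about 600 ns for 40 $\mu$m). This sketch assumes a purely diffusion-limited transfer with the same proportionality constant and the same $D_n$ at every size, which is a simplification; the function name is hypothetical.

```python
REF_LENGTH_UM = 40.0     # reference photodiode length from the text
REF_TRANSFER_NS = 600.0  # corresponding transfer time from the text


def diffusion_transfer_time_ns(length_um: float) -> float:
    """Transfer time scaled as l^2 from the 40 um reference point."""
    return REF_TRANSFER_NS * (length_um / REF_LENGTH_UM) ** 2


for length in (10.0, 20.0, 40.0, 50.0):
    print(f"l = {length:.0f} um -> {diffusion_transfer_time_ns(length):.0f} ns")
```

Halving the length divides the diffusion-limited transfer time by four, which shows why large burst pixels need a built-in field such as the graded doping described above.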

### 2.4.3. Phototransistor

The last photodetector presented here is the phototransistor. This device can be implemented with different structures in CMOS technology, such as vertical or horizontal bipolar phototransistors and photo-MOSFETs [56] [57]. We present here the mode of operation of a vertical bipolar PNP phototransistor [58]. This phototransistor is made of a heavily doped P region in an NWell, as illustrated in Fig. 29. The N+ and P+ contacts are placed on the base and the collector of the bipolar transistor respectively, and the emitter is formed by the P+ region. The phototransistor is biased by a voltage $V_{CE}$ applied across the collector and the emitter. The base is floating. Under illumination, the NWell/PSub junction is reverse biased and collects the photogenerated electrons in the NWell. The electrons accumulated in the base lower its potential, which increases the base-emitter voltage $V_{BE}$. Therefore, the collector current $I_C$ is proportional to the photogenerated charge collected in the base, i.e. to the photocurrent $I_{ph}$. The proportionality coefficient is the well-known common emitter current gain $\beta$. As the substrate is the collector, the PNP phototransistor can only be used in common collector mode.

Fig. 29 PNP Phototransistor and Quad Small Base PNP Phototransistor Cross Sections




The vertical bipolar phototransistor has a frequency response that is limited by the base/emitter capacitance. As for the other silicon-based photodetectors, the maximum sensitivity is reached in red light. The responsivity of this phototransistor also depends on the collector/emitter bias voltage $V_{CE}$. The responsivity of such a structure is about 0.34 A/W with a cutoff frequency of 150 kHz at 850 nm under a $V_{CE}$ of 2 V [59]. A solution to increase the bandwidth is to reduce the base/emitter capacitance by downsizing the emitter. Moreover, the responsivity can be improved by placing the phototransistor in an epitaxially grown P region. As the PEpi is weakly doped, the depletion region of the NWell/PEpi junction is large and collects more photogenerated charges. Based on these observations, a quad small base PNP phototransistor made of four small P+/NWell junctions in a PEpi collector is proposed in [58]. For a 40x40 $\mu$m² phototransistor, its responsivity is about 2.7 A/W and its -3 dB bandwidth is 13.8 MHz at a wavelength of 675 nm for a $V_{BC}$ of -2 V. By increasing the $V_{BC}$ voltage to -10 V, the bandwidth reaches 72 MHz. In terms of dimensions, the cutoff frequency slightly decreases with the size of the phototransistor.

### 2.4.4. Photodetector Synthesis

Three different photodetectors have been discussed in this section. First, the classical photodiodes present very high bandwidths, from 70 MHz to 2 GHz depending on their structure. Due to the -5 dB/dec roll-off of their responsivity frequency response, the classical photodiodes offer a range of tradeoffs between responsivity and bandwidth. The responsivity of the photodiode depends on the considered structure but can reach up to 0.45 A/W for the P+/NWell/PSub photodiode. The classical photodiodes are well suited for very high speed image sensors of 100 Mfps and beyond. The pinned photodiodes seem interesting for mid-range frame rate applications (~10 Mfps). Their frequency response is limited by the time constant of the charge transfer operation. However, much work has been done in the past few years to improve the speed of this operation. All the photogenerated charges can be transferred in less than 100 ns thanks to a graded doping of the storage well. By using a P epitaxial layer, the pinned photodiode can reach a quantum efficiency of 80 %. The floating diffusion node performs the charge to voltage conversion and offers a high conversion gain (~50 $\mu$V/e-). However, the design of pinned photodiodes requires a specific technology option. Finally, the phototransistor offers a very good responsivity, up to 2.7 A/W, due to the current gain provided by the PNP structure. The bandwidth of the phototransistor is about 14 MHz, which enables low-range frame rates (~5 Mfps). It can be increased up to 72 MHz by using a high bias voltage.

# 2.5. Three-Dimensional (3D) Interconnection Technologies

# 2.5.1. Technology Offers

For years now, technologists have faced difficulties in shrinking the transistor dimensions as predicted by Moore's law. Different technologies have been developed to bypass the limits of transistor shrinking and keep increasing the integrated circuit density. 3D integration technology seems a good candidate to increase the integration density by implementing transistors in three spatial dimensions instead of two. The International Technology Roadmap for Semiconductors (ITRS) gives a general presentation of 3D interconnection technology. The ITRS distinguishes different technological approaches depending on the interconnection hierarchy [60]. This hierarchy is based on the interconnection level between the different chips of the stack (Tab. 5). 3D integration requires the development of different interconnections. 3D bonding is used to join two chip surfaces together, while Through Silicon Vias (TSVs) are used to route the signal from one side of a chip to the other. 3D packaging (3D-P) technology uses "classical" packaging technology (wire bonding and package-on-package) to create a stack of chips. This technology requires less technological development than the other 3D interconnection technologies, but its vertical interconnection density is poor. In 3D wafer level packaging (3D-WLP), the 3D interconnections (TSVs and 3D bonding) are manufactured at the bond-pad level. The requirements on the pitch of the inter-chip bonding and the TSVs are relaxed, and the TSVs are usually placed at the chip periphery. 3D stacked integrated circuits (3D-SIC) use direct interconnections between the chips of the stack. Depending on the pitch of the interconnections, the 3D integration can be performed at the IP-block level or the gate level. 3D integrated circuits (3D-IC) are composed of front-end devices (e.g. transistors) that are stacked on top of each other. These devices share common back-end interconnect layers and the 3D interconnection is performed at the transistor level. This technology offers the highest integration level but requires complex technological developments.

Tab. 5 3D Interconnection Technology depending on the Hierarchy Level

|                        | 3D-P      | 3D-WLP   | Global 3D-SIC  | Intermediate 3D-SIC | Local 3D-IC      |
|------------------------|-----------|----------|----------------|---------------------|------------------|
| Interconnect Hierarchy | Package   | Bond-pad | IP-block level | Gate level          | Transistor level |
| Interconnection Pitch  | Pin pitch | ~100 μm  | <20 μm         | <4 μm               | <1 μm            |

In imaging, the pixel array is placed on the top tier of the 3D stack. An interconnection hierarchy at the pixel level requires an interconnection pitch smaller than the pixel pitch. For burst imaging, the pixel pitch varies between 30 $\mu$m and 50 $\mu$m. Therefore, 3D-P and 3D-WLP do not provide a sufficient interconnection density. Moreover, 3D-IC technology is not mature enough to produce 3D integrated image sensors in the coming years. In contrast, some CIS prototypes have been designed in the past few years using 3D stacked integrated circuits [9] [61]. 3D-SIC is thus the technology selected to design advanced burst CISs.
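
The pitch criterion used in this selection can be written as a small filter. The sketch below takes the pitch bounds of Tab. 5 as representative values (3D-P is left out because its pin pitch has no numeric bound in the table) and keeps the technologies whose interconnection pitch is smaller than the pixel pitch. Technological maturity, which rules out 3D-IC in the text, is not modelled.

```python
# Representative interconnection pitches from Tab. 5, in micrometres.
INTERCONNECTION_PITCH_UM = {
    "3D-WLP": 100.0,
    "Global 3D-SIC": 20.0,
    "Intermediate 3D-SIC": 4.0,
    "Local 3D-IC": 1.0,
}


def pixel_level_candidates(pixel_pitch_um: float) -> list[str]:
    """Technologies whose interconnection pitch is below the pixel pitch."""
    return [
        name
        for name, pitch in INTERCONNECTION_PITCH_UM.items()
        if pitch < pixel_pitch_um
    ]


for pixel_pitch in (30.0, 50.0):
    print(pixel_pitch, pixel_level_candidates(pixel_pitch))
```

For the 30 $\mu$m to 50 $\mu$m pixel pitches of burst imagers, only the 3D-SIC and 3D-IC levels pass the pitch test, consistent with the discussion above.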

# 2.5.2. 3D Stacked Integrated Circuit

There are two stacking topologies, as illustrated in Fig. 30. The chips are stacked Face-to-Face (F2F) if their back-end-of-line (BEOL) layers face each other. In F2F stacking, the BEOLs of the two chips are bonded together with solder micro-bumps or thermo-compression bonding. In Face-to-Back (F2B) stacking, the front side of one chip faces the back side of the other, and the signal is routed from the front side to the back side of the chip through TSVs. The chip is usually thinned for technical reasons. A ReDistribution Layer (RDL) then routes the signal on the back side of the first chip to match the interconnection positions on the other chip. Finally, both chips are bonded together.

Fig. 30 3D Stacked Integrated Circuit



3D stacked integration has an impact on the circuit design. The TSV fabrication process produces mechanical stress on the silicon surface. This stress causes variations in the characteristics of the transistors close to the TSVs [62]. The threshold voltage and the carrier mobility are affected through the piezoresistive effect. These variations depend on the distance between the transistor and the TSV. A transistor Keep-Out-Zone (KOZ) can be defined in the TSV neighborhood to avoid large variations of the transistor characteristics due to mechanical stress. Moreover, the TSVs used to route transient signals create transient substrate coupling. To reduce substrate coupling, some work has been done on TSV oxide isolation [63]. This work also shows that a KOZ is not effective against substrate coupling, as its dimensions would be too large.

From an electrical point of view, some models have been proposed for the TSV [64]. They give an evaluation of the equivalent resistance $R_{TSV}$, capacitance $C_{TSV}$ and self-inductance $L_{TSV}$ of a TSV, as illustrated in Fig. 31. The RLC model depends on the TSV dimensions (length $l$ and radius $r$), the oxide thickness and the material properties (resistivity, permittivity and doping concentration). For the designer, the main electrical characteristic of a TSV is the capacitance $C_{TSV}$ between the TSV and the substrate. Its value depends on the oxide capacitance and on the depletion capacitance due to the Metal/Oxide/Semiconductor (MOS) structure. The worst case occurs when the TSV voltage is below the threshold voltage of the MOS structure. The electrical characteristics of a TSV are summarized with Fig. 31 for two pairs of diameter and length and an oxide thickness of 100 nm.

Fig. 31 Through Silicon Via RLC Model



| 2r / l    | $R_{TSV}$ | $L_{TSV}$ | $C_{TSV}$ |
|-----------|-----------|-----------|-----------|
| 5 / 50 μm | 46 mΩ     | 10 pH     | 69 fF     |
| 2 / 20 μm | 132 mΩ    | 14 pH     | 52 fF     |
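
To see why the capacitance is the dominant element for the designer, the sketch below compares the magnitude of the three impedances of the RLC model using the values listed with Fig. 31. The 1 GHz evaluation frequency is an assumed example, and the lumped, frequency-independent treatment of each element is a simplification of the models in [64].

```python
import math

# (R in ohm, L in henry, C in farad) from the table given with Fig. 31.
TSV_MODELS = {
    "5 / 50 um": (46e-3, 10e-12, 69e-15),
    "2 / 20 um": (132e-3, 14e-12, 52e-15),
}


def inductive_reactance(l_henry: float, freq_hz: float) -> float:
    """Magnitude of the series inductive reactance, in ohms."""
    return 2.0 * math.pi * freq_hz * l_henry


def capacitive_reactance(c_farad: float, freq_hz: float) -> float:
    """Magnitude of the reactance of the TSV-to-substrate capacitance, in ohms."""
    return 1.0 / (2.0 * math.pi * freq_hz * c_farad)


FREQ_HZ = 1e9  # assumed evaluation frequency
for name, (r, l, c) in TSV_MODELS.items():
    x_l = inductive_reactance(l, FREQ_HZ)
    x_c = capacitive_reactance(c, FREQ_HZ)
    print(f"{name}: R = {r * 1e3:.0f} mOhm, X_L = {x_l * 1e3:.0f} mOhm, "
          f"X_C = {x_c:.0f} Ohm")
```

At this frequency the series resistance and inductance stay in the milliohm range, whereas the capacitance to the substrate presents a load that the driving stage must charge and discharge.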

Contrary to the TSV, the chip-to-chip bonding is placed on top of the BEOL and does not prevent active devices from being placed below it. From an electrical point of view, a bonding is mainly a resistive element. The RDL is implemented on the back side of the chip and can be characterized as a BEOL metal layer with its own resistance per unit length. Like a TSV, an RDL forms a MOS capacitor with the substrate. High frequency signals therefore have to be carefully routed on the RDL, taking into account the electrical coupling with the substrate.

### 2.5.3. 3D Integration and High Speed Image Sensor

A study has been carried out on the benefits that 3D integration technology brings to burst CISs [41]. It demonstrates that 3D stacking enhances the frame rate by a factor of 400 % for the burst CIS presented in [40]. It also makes the acquisition rate independent of the memory size for memory depths below 1024 images. Moreover, it offers a significant reduction in power consumption.

### 2.6. Conclusion

We have presented here a review of the burst image sensors implemented in CCD and CMOS technologies. It appears that burst CISs provide the highest frame rates (>100 Mfps), but with a limited memory depth or matrix resolution and a limited sensitivity. Burst CCD image sensors and burst CISs based on pinned photodiodes have a good responsivity and dynamic range while reaching frame rates up to 100 Mfps. We have then described the different silicon-based photodetectors and given some insights into their responsivity and frequency response. It appears that the classical photodiode is well suited for very high speed applications, with bandwidths up to 2 GHz. The pinned photodiode offers a charge to voltage conversion of about 50 µV/e- thanks to its floating diffusion node, but its bandwidth is limited to about 16 MHz by the charge transfer operation. The PNP phototransistor provides a higher responsivity of 2.7 A/W due to the current gain of the PNP structure. However, the phototransistor bandwidth is limited to 14 MHz by its junction capacitances. Finally, the 3D integration technologies have been reviewed. It appears that 3D-SIC is the best candidate to implement 3D integrated burst image sensors due to its interconnection pitch and its technological maturity. We have thus presented this technology in more detail and described the different interconnections and their electrical models. We are now going to present different architectures of 3D integrated burst CIS and assess their performance.

# 3. 3D Integrated Burst Image Sensor

### 3.1. Motivations

In 1965, Gordon Moore proposed a conjecture: "the circuit complexity increases at a rate of roughly a factor of two per year". This assumption held until the 2000s thanks to the shrinking of transistor dimensions. However, these dimensions are now so small that physicists are facing quantum effects which limit further shrinking. Researchers have proposed a paradigm shift to overcome this issue: 3D integration technology. With this technology, dies are stacked and interconnected on top of each other to make a single chip. It improves the integration density of the chip while keeping the same footprint. The 3D interconnections were initially developed to increase the data bandwidth between digital circuits, reduce the latency between processor and memory, and reduce power consumption [65]. For CIS, this technology enables the implementation of transistors below the pixel without impacting the fill factor. Pixel-level ADCs, image processing or compression units can thus be implemented without affecting the sensor sensitivity. In 2005, a vertically integrated sensor array program was presented [66]. This work relies on technological developments made for military applications. Only one 3D integrated CIS for burst imaging has been reported so far, in 2004 [11]. However, its implementation uses a nonstandard 3D integration technology and requires the stacking of as many circuits as there are columns in the matrix. 3D integration appears to be a good solution to increase the performance of burst CISs. Indeed, it makes it possible to embed more memory in the image sensor without impacting the fill factor. Moreover, the CIS can include ADCs that perform on-chip A/D conversion. After some general considerations about the technology and the CIS triggering modes, we present here two architectures of 3D integrated burst image sensors and assess their performance.

#### 3.2. Architectural General Considerations

In this chapter, we present 3D integrated burst image sensors that perform the A/D conversion of the images on-chip. There are two potential implementations. In the first one, the A/D conversion is performed after the image storage, during the memory reading. The burst of images is thus stored in an analog memory. The second solution is to perform the A/D conversion before the image storage, during the burst acquisition. The images are then stored in a digital memory. In this study, the same kind of pixel front-end electronics (i.e. the photodiode and the signal conditioning) is considered for both solutions. The pixel pitch is set to 50 $\mu$m and an 80 % fill factor is targeted. At the other end of the chain, an issue in high speed imaging is the limited data bandwidth of the output, which constrains the system. For this study, the output buffers are implemented with the Low Voltage Differential Signaling (LVDS) standard to minimize the power consumption. The pixel front-end is assumed to be designed in a 130 nm technology, and a 28 nm technology is considered for the digital circuits. The performance evaluation of both architectures is based on these assumptions. The 3D stack is made of three chips, as illustrated in Fig. 30. The top chip is back side illuminated (BSI) and is connected face-to-face with the middle chip. The interconnections between these two chips are made by bonding. The middle chip and the bottom chip are connected face-to-back, and their interconnections require bonding and TSVs.

### 3.2.1. Recording and Triggering Modes

High speed burst image sensors are designed to record very fast phenomena. These phenomena are commonly of short duration and their occurrence can be deterministic or random. Therefore, the camera triggering has to be carefully designed so that the recording window coincides with the occurrence of the phenomenon. Two triggering modes are presented here. The pre-event triggering mode is suited to a phenomenon that occurs at a deterministic time after the event triggering time $t_{TSevent}$ (Fig. 32). The recording window is synchronized with the event occurrence thanks to the recording triggering time $t_{TSrecording}$.

Fig. 32 Pre-Event Triggering Mode Timing Diagram



However, for some applications, the occurrence of the phenomenon is not well controlled. If the occurrence time is not deterministic, the CIS cannot be triggered as described previously, because the recording window could end before the phenomenon occurs. The post-event triggering mode is then implemented to bypass this problem [67] (Fig. 33). In this mode, the CIS continuously acquires images that are stored in a cyclic memory. When the cyclic memory is full, the new images overwrite the oldest stored images. The burst CIS is thus already recording when the phenomenon occurs. A triggering signal $t_{TSstop}$ is generated from this occurrence and stops the acquisition. The memory is frozen and contains only the last recorded burst of images.

Fig. 33 Post-Event Triggering Mode Timing Diagram
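
The behavior of the cyclic memory in post-event triggering mode can be illustrated with a short software analogy. The following Python sketch is only a toy model: the class name `PostEventRecorder`, the memory depth of 4 frames and the use of integers as frames are assumptions made for illustration and do not describe the sensor circuits.

```python
from collections import deque


class PostEventRecorder:
    """Toy model of post-event triggering with a cyclic frame memory."""

    def __init__(self, depth):
        # A deque with maxlen drops its oldest item when a new one is
        # appended to a full buffer, which mimics the cyclic overwrite.
        self.memory = deque(maxlen=depth)
        self.frozen = False

    def acquire(self, frame):
        """Store one frame unless the stop trigger has already fired."""
        if not self.frozen:
            self.memory.append(frame)

    def stop(self):
        """Model the stop trigger: the memory content is frozen."""
        self.frozen = True

    def read_out(self):
        """Return the retained burst, oldest frame first."""
        return list(self.memory)


recorder = PostEventRecorder(depth=4)
for frame_index in range(10):
    recorder.acquire(frame_index)
    if frame_index == 6:  # hypothetical: the phenomenon is detected here
        recorder.stop()
burst = recorder.read_out()
```

With these hypothetical values, the memory keeps only the four frames that precede the stop trigger, whatever the number of frames acquired before.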



Moreover, one can distinguish two recording modes. In single burst recording mode, a burst of images is acquired and read out of the burst CIS. In multi-burst recording mode, the burst acquisition and reading are repeated over time. These different triggering and recording modes have an effect on the power consumption and the thermal management of the CIS.

### 3.3. Burst Image Sensor with Analog Storage

#### 3.3.1. Architecture Overview

The 3D integrated burst CIS with analog storage is the adaptation of the classical burst CIS to 3D integration. However, this architecture offers an extra feature: an on-chip A/D conversion as shown in Fig. 34. The sensor works in two phases. First, the pixel front-end and the analog memory capture and store the burst of images. Then, the ADC converts the frames into digital data which are serialized and read out of the chip by the output buffer. The analog storage architecture is composed of three tiers. The first one includes the photodiodes and the pixel front-ends. The front-end circuit performs the photocurrent to voltage conversion. The analog memories are set on the second tier. Each pixel front-end is connected to its own memory in order to maximize the bandwidth and enable a global shutter acquisition. A set of memories (i.e. a cluster) is multiplexed to one ADC. The ADCs are on the third tier with the output buffers. For this study, one considers output buffers based on Low Voltage Differential Signaling (LVDS). This standard is used for high speed data transmissions (655 Mbits/s) and is implemented in high speed continuous cameras [68]. There are many ways to implement the pixel front-end, the analog memory and the ADC. For a preliminary study, simple structures are chosen to analyze this architecture. Then, the analysis can be carried out with more evolved structures.

Fig. 34 Analog Storage Burst Image Sensor Architecture

#### 3.3.1.1. 3D Integration

The first step is to validate the signal path through the 3D stack. Between the pixel front-end and the memory, only one connection is required. The pitch of the bonding for a face-to-face stacking is below ten micrometers. For a pixel pitch of 50 µm, each pixel can thus have its own connection to the memory. The outputs of a cluster of analog memories must be multiplexed to one ADC due to surface considerations. Indeed, an ADC does not fit below a single memory of 50x50 μm<sup>2</sup>. The multiplexer is placed in the second tier to minimize the number of interconnections between the second and the third tier. Therefore, the signal path between a cluster of pixels and the ADC requires one interconnection made of a TSV and a bonding. The second step is to check the distribution of the control signals through the 3D stack. There are too many control signals to route them to each pixel independently. However, all the pixel front-ends, memories, and ADCs work in parallel and are driven by the same control signals. The control signals have to be mutualized between groups of pixels for the routing. Different routing strategies can be considered. A group of pixels can incorporate some additional interconnections for the control signals. If this is not feasible, some pixels in the array can be sacrificed to route signals from one chip to the other. This solution causes dead pixels across the matrix that have to be reconstructed by post-processing. Another solution is to route the signals on the sides of the pixel array from the bottom to the top. This solution does not allow extending the spatial resolution with a lattice of sensors.

#### 3.3.1.2. Circuit Implementation

As said previously, basic circuit structures have been chosen to simplify the analysis. Moreover, as the frame rate is the key performance, it seems a good starting point to use simple circuits as they are generally faster. A diagram of a cluster of pixels of the analog storage architecture is presented in Fig. 35. The pixel front-end is implemented with a basic 3T pixel inspired by the active pixel sensor (APS). The photocurrent integration is performed on the photodiode junction capacitor. In order to maximize the sensitivity of the CIS, a photodiode made of an N-well in a P-substrate is chosen. This structure shows a very good ratio between the responsivity and the junction capacitor, which maximizes the conversion gain and the sensitivity. The photodiode area is  $50 \times 40~\mu\text{m}^2$  to reach a fill factor of 80 %. The impedance of the interconnection between the first and the second tiers (F2F bonding) is mainly resistive and will be neglected here. The source follower (SF) buffer of the 3T pixel loads the input of the analog memory. The analog memory is made of an array of analog memory cells (MC). We prefer to use an array structure rather than a vector structure because it reduces the number of control signals (*CS* and *RS*) and the input capacitor of the memory. Each MC has one access transistor for both writing and reading operations as in [40]. An analog multiplexer connects a cluster of analog memories to an ADC which performs the conversion during the memory reading.

Fig. 35 Cluster of Pixels of the Analog Storage Architecture



A timing diagram of the analog storage architecture is given in Fig. 36. The frame rate is equal to the inverse of the time  $T_{FR}$  between two resets. The reset time  $T_{rst}$  defines the settling time for the reset operation. The integration time  $T_{int}$  defines the settling time of the signal at the analog memory input. We consider here that 90 % of  $T_{FR}$  is dedicated to the photocurrent integration and 10 % of  $T_{FR}$  is dedicated to the reset operation.



Fig. 36 Timing Diagram of Analog Storage Architecture

#### 3.3.2. Performance Evaluation

In this section, a coarse analysis of the analog storage burst CIS architecture is presented. The aim is to assess the performances of this architecture. The sensor characteristics studied here are the frame rate, the number of images per burst, the power consumption and the dynamic range. A MATLAB model has been designed to evaluate those performances. For a given frame rate and dynamic range, the model computes the memory size, the power consumption and the readout noise.

#### 3.3.2.1. Analog Memory

The analog memory is a critical part of the image sensor. It defines the number of images per burst but can also have an impact on the frame rate. As said before, the analog memory is an array of memory cells. Each memory cell is made of a writing/reading access switch SW and a storage capacitor  $C_S$ . The writing operation is done by selecting a row thanks to an RS control signal and a column thanks to a CS control signal. Once the analog MC is written, the next column is selected. When a full row is written, the next row is selected by incrementing RS. To read an image, the MC is connected to the memory bus and its value is read by the column buffer  $T_{Col}$ . The reading of the analog memory is done in the same order as the writing (i.e. row by row). It is important to note that the signal charges are shared between the column equivalent capacitor  $C_C$  and the storage capacitor  $C_S$  with this reading method.

The voltage at the input of the column buffer is then given by Eq. 5. In order not to affect the signal reading, the column bus is preloaded to the ground voltage (i.e.  $V_C=0$ ) thanks to the switch  $SW_{PL}$ . Moreover, in order not to significantly reduce the voltage range during the reading operation, the storage capacitor must be larger than the column capacitor.

$$V_{reading} = \frac{C_S}{C_S + C_C} V_{Stored} + \frac{C_C}{C_S + C_C} V_C$$
 Eq. 5
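
As an illustration of Eq. 5, the following Python sketch evaluates the reading voltage for two capacitor ratios. The capacitor values are hypothetical and chosen only to show the attenuation caused by the charge sharing; they are not results of the MATLAB model.

```python
def reading_voltage(v_stored, c_s, c_c, v_c=0.0):
    """Eq. 5: column bus voltage after charge sharing between C_S and C_C."""
    return (c_s * v_stored + c_c * v_c) / (c_s + c_c)


# Hypothetical values: the column bus is preloaded to ground (v_c = 0).
v_equal = reading_voltage(v_stored=1.5, c_s=3e-15, c_c=3e-15)
v_large_cs = reading_voltage(v_stored=1.5, c_s=30e-15, c_c=3e-15)
```

When both capacitors are equal, half of the voltage range is lost during the reading. With a storage capacitor ten times larger than the column capacitor, about 91 % of the stored voltage is recovered.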

One drawback of the analog memory is the limited retention time of the data due to the current leakage of the access switch *SW*. The required retention time is defined by the reading speed, which depends on the ADCs and the global data bandwidth of the output drivers. The number of output drivers is limited by architectural considerations such as the number of Input Output (IO) pads. Moreover, the user of the burst CIS has to be able to handle the data flow at the output of the sensor. For this study, we consider ten output drivers that provide a data flow of 6.55 Gbit/s for a spatial resolution of 400x400 pixels. The ADCs thus have to provide a bit flow density of 164 Mbit/s/mm<sup>2</sup>. For instance, this bit flow requires implementing one ADC of 16 MC/s in 1 mm<sup>2</sup> if we consider a 10 bit conversion. Such a bit flow density is quite manageable by the matrix of ADCs. Therefore, the memory reading operation is limited by the number of output buffers and not by the ADC conversion frequency. The voltage range  $V_{RAM}$  of the analog memory is 0 V - 1.5 V.

Fig. 37 Analog Memory Architecture



The MATLAB model evaluates the memory size for a given frame rate FR and dynamic range  $DR_{AM}$ . The dynamic range of the analog memory is defined by the ratio of the voltage range  $V_{RAM}$  of the memory (1.5 V) and the dynamic range noise floor ( $v_{DRN}$ ) as presented in Eq. 6.

$$DR_{AM} = 20 \times log\left(\frac{V_{RAM}}{v_{DRN}}\right)$$
 Eq. 6

The storage capacitor sets the value of the sampling noise, which can limit the dynamic range of the analog memory. The model then chooses a storage capacitor that ensures a standard deviation of the sampling noise equal to half the dynamic range noise floor. Moreover, the analog memory has a limited bandwidth due to the low pass filter formed by the access switch and the storage capacitor. The MC must be loaded to more than 95 % of the final value in a time equal to  $T_{int}$ . The time constant of the low pass filter must thus be three times lower than  $T_{int}$ . The sum of the MC access switch resistor  $R_{SW}$  and the column access switch resistor  $R_{SWC}$  is then given by Eq. 7.

$$R_{SW} + R_{SW_C} = \frac{T_{int}}{3 \times C_S}$$
 Eq. 7

Knowing the value of the storage capacitor and the on resistor of the switch, the model can compute the surface of a memory cell and then the number of images stored in one analog memory. To compute the surface of the storage capacitor  $S_C$ , a capacitance density  $d_C$  of 3 fF/ $\mu$ m² is used as in [40]. The switch is implemented with an N-type MOSFET. The channel length L of the transistor is 0.5  $\mu$ m and the width W is computed using the quadratic equation of the MOSFET. The surface  $S_{SW}$  of the switch is the product of the transistor length by its width. The memory size MS is computed by dividing the pixel surface  $S_{pix}$  by the MC surface  $S_{MC}$ . A part of the pixel surface is dedicated to the column buffers and the control circuits. Therefore, a fill factor FF of 0.8 is applied to the pixel surface to compute the memory size as presented in Eq. 8.

$$MS = \frac{S_{Pix} \times FF}{S_{MC}}$$
 with  $S_{MC} = WL + \frac{C_S}{d_C}$  Eq. 8

For small values of dynamic range, the storage capacitor value is below 1 fF. However, it seems difficult to design an analog memory with such values. Indeed, these storage capacitors would be very sensitive to mismatch as their dimensions would be sub-micrometer. Moreover, the memory reading operation would be difficult to perform due to the charge sharing between the storage capacitor and the column capacitor  $C_C$ . Therefore, a minimum value of 3 fF is implemented in the MATLAB model.
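
The sizing flow of Eq. 6 to Eq. 8 can be sketched in a few lines of Python. This is not the MATLAB model of this work but a simplified reconstruction under stated assumptions: the sampling noise is taken as the kT/C noise at 300 K, and the switch width is fixed to 0.3 µm instead of being derived from the quadratic MOSFET equation. The function names are hypothetical.

```python
K_BOLTZMANN = 1.380649e-23  # J/K


def storage_capacitor(dr_db, v_ram=1.5, temperature=300.0, c_min=3e-15):
    """Capacitor whose kT/C noise equals half the noise floor of Eq. 6."""
    v_drn = v_ram / 10 ** (dr_db / 20)       # Eq. 6 solved for v_DRN
    sigma = v_drn / 2                        # targeted sampling noise (V rms)
    c_s = K_BOLTZMANN * temperature / sigma ** 2  # assumption: kT/C noise
    return max(c_s, c_min)                   # 3 fF floor discussed above


def switch_resistance_budget(frame_rate, c_s, int_fraction=0.9):
    """Eq. 7: maximum R_SW + R_SWC for a 95 % settling within T_int."""
    t_int = int_fraction / frame_rate
    return t_int / (3 * c_s)


def memory_size(c_s, width_um=0.3, length_um=0.5, pitch_um=50.0,
                fill_factor=0.8, cap_density=3e-15):
    """Eq. 8: number of memory cells under one pixel (areas in um^2)."""
    s_mc = width_um * length_um + c_s / cap_density
    return pitch_um ** 2 * fill_factor / s_mc


ms_50db = memory_size(storage_capacitor(50))   # capacitor clamped to 3 fF
ms_70db = memory_size(storage_capacitor(70))
r_budget = switch_resistance_budget(1e9, storage_capacitor(50))
```

Because the switch width is fixed here, the sketch is only expected to land close to the memory depths reported by the model, not to reproduce them exactly.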

The model also computes the voltage drop due to the current leakage  $I_{leak}$  of the switch SW. Knowing the memory size MS, the dynamic range DR and the data flow DF of the LVDS output drivers (6.55 Gbit/s), the retention time  $T_{ret}$  of the signal in the memory cell is computed thanks to Eq. 9. The retention time is given for a spatial resolution of 400x400 pixels.

$$T_{ret} = \frac{MS \times 400^2 \times R_{bit}}{DF}$$
 with  $R_{bit} = \left[\frac{DR - 1.76}{6.02}\right]$  Eq. 9

We consider that the switch is implemented with a low leakage transistor structure as presented in [40]. The current leakage  $I_{leakRef}$  is then equal to 0.015 fA for a switch transistor width of 0.3 µm. We assume that the leakage  $I_{leak}$  is linearly proportional to the transistor width W. The maximum voltage drop due to the leakage is then given by Eq. 10.

$$V_{drop} = I_{leak} \frac{T_{ret}}{C_S}$$
 with  $I_{leak} = I_{leakRef} \times \frac{W}{0.3}$  Eq. 10
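
Eq. 9 and Eq. 10 can be chained as in the following Python sketch. It assumes that the bracket of Eq. 9 denotes a rounding up to the next integer number of bits, which is an interpretation and not stated in the text. The input values (memory size of 1740 images, storage capacitor of 3 fF, switch width of 0.3 µm) are taken from this chapter, and the function names are hypothetical.

```python
import math


def bit_resolution(dr_db):
    """R_bit of Eq. 9, reading the bracket as a ceiling (assumption)."""
    return math.ceil((dr_db - 1.76) / 6.02)


def retention_time(memory_size, dr_db, n_pixels=400 * 400, data_flow=6.55e9):
    """Eq. 9: time to read the whole burst through the output drivers (s)."""
    return memory_size * n_pixels * bit_resolution(dr_db) / data_flow


def voltage_drop(t_ret, c_s, width_um=0.3, i_leak_ref=0.015e-15):
    """Eq. 10: worst-case voltage drop on the storage capacitor (V)."""
    i_leak = i_leak_ref * width_um / 0.3  # leakage scales with the width
    return i_leak * t_ret / c_s


t_ret = retention_time(memory_size=1740, dr_db=56)
v_drop = voltage_drop(t_ret, c_s=3e-15)
```

Under these assumptions, the retention time is a few hundred milliseconds and the voltage drop is of the order of 2 mV.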

#### 3.3.2.2. Pixel Front-End

The pixel front-end is composed of a photocurrent to voltage conversion stage (i.e. integration stage) and a buffer stage that copies the signal in the analog memory as presented in Fig. 38. The integration of the photocurrent  $I_{pd}$  is performed on the photodiode junction capacitor  $C_{pd}$ . The reset operation is performed by the transistor  $M_{rst}$ . A high reset voltage is preferred to reduce the value of the junction capacitor and increase the conversion gain. The junction capacitor of a NWell/PSub photodiode is evaluated to 200 fF thanks to simulations. The conversion gain is then equal to 800 nV/e-. The buffer between the integration stage and the memory is implemented with a source follower (SF) circuit. We chose a SF buffer with a negative offset based on N-type MOSFET M1 to match the voltage range of the analog memory. The electrical characteristics of this buffer are given in Tab. 6 with the threshold voltage  $V_{th1}$ , the product  $K_1$  of the charge carrier mobility by the gate oxide capacitor density, the MOSFET gate and bulk trans-conductance respectively  $g_{m1}$  and  $g_{mb1}$ , and the on resistor  $r_{onSW}$  of the access path to the memory cell. In the model, the DC gain is set to 0.8. The bandwidth of the SF buffer given in this table takes into account only the gate trans-conductance. Its bandwidth depends on the SF buffer ( $C_{load}/g_{m1}$ ) and the analog memory ( $r_{onSW}C_{load}$ ) time constant. For the analog front-end, the model only considers the bandwidth limited by the trans-conductance as the inner bandwidth of the analog memory is already discussed in Eq. 7.

Fig. 38 Analog Front-End Circuit



Tab. 6 Electrical Characteristics of the SF buffer

| Offset (V)                                     | DC gain                                        | Bandwidth -3dB (Hz)                                                          |
|------------------------------------------------|------------------------------------------------|------------------------------------------------------------------------------|
| $V_{th1} + \sqrt{\frac{2I_{Bias}L_1}{K_1W_1}}$ | $\frac{1}{1+\frac{g_{mb_1}}{g_{m_1}}}$ | $\frac{1}{2\pi \left(\frac{C_{load}}{g_{m_1}} + r_{on_{SW}}C_{load}\right)}$ |

The MATLAB model first computes the load capacitor  $C_{load}$  which is a key parameter to design the analog front-end. As the model has already computed the analog memory, the storage capacitor  $C_S$  and the memory size MS are known. We consider a square form factor for the analog memory. The number of lines and columns is given by the square root N of the memory size. The buffer has to load the line capacitor, which is the sum of the wire capacitor  $C_{wire}$  and the source/bulk capacitors  $C_{SB}$  of the switches connected to the line. The total load capacitor is given by Eq. 11. The factor 2xN is the number of switches on a column and a row of the analog memory. The value of the source/bulk capacitor is computed based on the model presented in [69], which depends on the surface and perimeter of the N+ implant of the transistor source.

$$C_{load} = C_S + C_{wire} + 2 \times N \times C_{SB}$$
 Eq. 11

Then the trans-conductance of the SF buffer is computed. It depends on the targeted frame rate FR and the input capacitance of the analog memory  $C_{load}$ . The source follower transfer function is a first order low pass filter. We chose to load the analog memory to 95 % of the final value. The time constant of the first order system must thus be three times lower than  $T_{int}$ . The trans-conductance  $g_m$  of the transistor M1 is then given by Eq. 12.

$$g_m = \frac{3 \times C_{load}}{T_{int}}$$
 Eq. 12

The required bias current of the SF buffer depends on its trans-conductance and the slew rate constraint as presented in Eq. 13. The trans-conductance defines a current  $I_{gm}$  that depends on the overdrive voltage  $V_{ov}$  which is set to 0.2 V in our model. Moreover, the slew rate defines a minimum current to load the output which depends on the load capacitor, the settling time  $T_{int}$ , and the voltage range of the analog memory  $V_{RAM}$ . This current is multiplied by a factor 2 to be sure that no slew rate limitation occurs and gives the current  $I_{SR}$ . Knowing both currents, the model chooses the maximum of them as bias current  $I_{bias}$ .

$$I_{bias} = \max\big(I_{gm}, I_{SR}\big)$$
 with 
$$\begin{cases} I_{gm} = \frac{g_m \times V_{ov}}{2} \\ I_{SR} = 2 \times C_{load} \frac{V_{RAM}}{T_{int}} \end{cases}$$
 Eq. 13
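
A minimal Python sketch of Eq. 12 and Eq. 13 is given below. The load capacitor of 100 fF is a hypothetical value chosen for illustration, and the function name `sf_bias_current` is not taken from the MATLAB model.

```python
def sf_bias_current(c_load, frame_rate, v_ov=0.2, v_ram=1.5, int_fraction=0.9):
    """Return (I_bias, I_gm, I_SR) of the SF buffer following Eq. 12 and 13."""
    t_int = int_fraction / frame_rate      # 90 % of the frame period
    g_m = 3 * c_load / t_int               # Eq. 12, 95 % settling in T_int
    i_gm = g_m * v_ov / 2                  # bandwidth constraint
    i_sr = 2 * c_load * v_ram / t_int      # slew rate constraint, margin of 2
    return max(i_gm, i_sr), i_gm, i_sr


# Hypothetical load capacitor of 100 fF at 1 Mfps.
i_bias, i_gm, i_sr = sf_bias_current(c_load=100e-15, frame_rate=1e6)
```

With  $V_{ov}$  = 0.2 V and  $V_{RAM}$  = 1.5 V, the ratio  $I_{SR}/I_{gm}$  equals 10 whatever the load capacitor and the frame rate, so the slew rate constraint always sets the bias current in this sketch.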

The model also computes the on resistor  $R_{rst}$  value of the reset transistor  $M_{rst}$ . This transistor forms a low pass filter with the junction capacitor of the photodiode  $C_{PD}$ . As before, the reset operation is performed at 95 % of the final value and the on resistor of the transistor is given by Eq. 14.

$$R_{Rst} = \frac{T_{rst}}{3 \times C_{PD}}$$
 Eq. 14

All the elements of the analog front-end have been sized and the power consumption of the pixel can be computed. The power consumption is the sum of the static power due to the biasing of the SF buffer and the dynamic power due to the reset operation and the load of the analog memory. The total power consumption for a pixel  $P_{pix}$  is then given by Eq. 15 with a power supply  $V_{supply}$  of 2.5 V and a SF buffer DC gain  $G_{SF}$ . The evaluation is a worst case where the saturation is reached.

$$P_{pix} = I_{bias}V_{supply} + \left(\frac{1}{2}C_{PD}\left(\frac{V_{RAM}}{G_{SF}}\right)^2 + \frac{1}{2}C_{load}V_{RAM}^2\right) \times FR$$
 Eq. 15
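
The power budget of Eq. 15 can be sketched as follows. The load capacitor of 100 fF is again a hypothetical value, the bias current is taken from the slew rate term of Eq. 13, and the function name is an assumption. The other values (200 fF junction capacitor, DC gain of 0.8, 2.5 V supply) are those given in this chapter.

```python
def pixel_power(c_load, frame_rate, c_pd=200e-15, v_ram=1.5, g_sf=0.8,
                v_supply=2.5, int_fraction=0.9):
    """Eq. 15: return (total, static, dynamic) pixel power in watts."""
    t_int = int_fraction / frame_rate
    i_bias = 2 * c_load * v_ram / t_int     # slew rate limited bias (Eq. 13)
    static = i_bias * v_supply
    reset_energy = 0.5 * c_pd * (v_ram / g_sf) ** 2   # photodiode reset
    load_energy = 0.5 * c_load * v_ram ** 2           # analog memory load
    dynamic = (reset_energy + load_energy) * frame_rate
    return static + dynamic, static, dynamic


# Hypothetical load capacitor of 100 fF at 1 Mfps.
p_total, p_static, p_dynamic = pixel_power(c_load=100e-15, frame_rate=1e6)
dynamic_share = p_dynamic / p_total
```

For this hypothetical load, the total power is of the order of one microwatt at 1 Mfps and scales linearly with the frame rate, since both the bias current and the dynamic term are proportional to FR.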

The noise of the pixel front-end is also computed to check that it does not limit the targeted dynamic range. A simple noise analysis is carried out to assess the signal to noise ratio (SNR) of the pixel front-end. The model considers two different sources of noise: the photon shot noise which is signal dependent and the readout noise due to the reset operation and the source follower buffer. The details of the computation are presented in Annex A.

#### 3.3.2.3. Model Results

The MATLAB model has been run for different dynamic ranges and frame rates. We present here the evaluations of the memory size, the voltage drop, the bias current and the power consumption. At the end of this analysis, we verify that the readout noise does not limit the targeted dynamic range. First, the model computes the performances of the analog memory. As expected, the memory size strongly depends on the dynamic range due to the value of the sampling capacitor as illustrated in Fig. 39 (a). The memory depth increases as the dynamic range decreases down to 56 dB. Below this value, the sampling capacitor reaches its limit of 3 fF and the memory depth reaches its maximum of 1740 images. Moreover, the figure shows that the memory depth does not depend on the frame rate. Indeed, Fig. 39 (b) demonstrates that the width of the switch transistor SW is not a limitation for frame rates below 500 Mfps and for dynamic ranges below 60 dB. Above these values, the on resistor of the switch requires increasing the transistor width. However, at such dynamic ranges, the storage capacitor is large and is the major source of surface consumption in the memory cell.

Fig. 39 (a) Memory Size and (b) Switch Transistor Width versus the Dynamic Range for Different Frame Rates



The model of the analog memory gives some insights on the voltage drops due to current leakages as illustrated in Fig. 40. As expected at high dynamic ranges, the voltage drop decreases as the dynamic range and the storage capacitor increase. When the dynamic range is below 56 dB, the storage capacitor stays at 3 fF but the voltage drop starts decreasing. This inversion of the slope is due to the digital reading. As the dynamic range keeps decreasing, the bit resolution of the A/D conversion decreases as presented in Eq. 9. Therefore, each MC is read faster and the retention time decreases as the memory size stays constant. The values of the drop do not exceed 2.3 mV and are not an issue here. The effect of the frame rate on the voltage drop only occurs at high frame rate and dynamic range. Indeed, the current leakages increase with the width of the switch transistor. However, the effect on the voltage drop is not significant here.

Fig. 40 Voltage Drop versus the Dynamic Range for Different Frame Rates



As expected, the bias current strongly depends on the frame rate and varies from sub-microampere to hundreds of microamperes for frame rates of 1 Mfps and 1 Gfps respectively (Fig. 41). The bias current of the SF buffer is limited by the slew rate constraint over the whole frame rate range. However, it is interesting to take a look at the bias current versus the dynamic range as illustrated in Eq. 12. For low dynamic ranges, the memory size is constant. The load capacitor is then constant and the bias current only depends on the frame rate. However, when the dynamic range is higher than 56 dB, the load capacitor  $C_{load}$  starts decreasing. Indeed, the reduction of the column bus capacitor  $C_C$  due to the reduction of the memory size is stronger than the increase of the storage capacitor  $C_S$ . This trend is reversed around 62 dB and the bias current increases.

Fig. 41 Bias Current of the SF Buffer versus the Dynamic Range



The total pixel power consumption follows the same trend as illustrated in Fig. 42. The SF buffer is the main source of power consumption, as the dynamic power contributes only 30 % to 45 % of the total power depending on the dynamic range.

Fig. 42 Pixel Power Consumption versus the Dynamic Range for Different Frame Rates



Finally, the MATLAB model computes the dynamic range limited by the readout noise. It first appears that the dynamic range limitation depends only slightly on the frame rate. The noise limited dynamic range is thus plotted versus the targeted dynamic range in Fig. 43. At low dynamic range, the dynamic range limitation is constant due to the memory size limitation. Above 52 dB, the dynamic range limitation increases but stays compliant with the targeted dynamic range up to 70-80 dB. The main source of noise is then the sampling noise of the analog memory. For higher dynamic range values, the sampling noise of the reset operation becomes predominant and limits the dynamic range to 82 dB. The signal to noise ratio of the analog storage architecture is limited by the photon shot noise and reaches 64 dB at the pixel saturation.

[Plot: dynamic range limitation (dB) versus targeted dynamic range (dB), one curve per frame rate from 1 Mfps to 1 Gfps]

Fig. 43 Dynamic Range Limitation due to Readout Noise
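
The orders of magnitude quoted above can be cross-checked with a rough first order calculation. The following Python sketch is not the computation of Annex A: it assumes a kT/C reset noise at 300 K referred to the memory input through the SF gain of 0.8, and a photodiode swing at saturation of  $V_{RAM}/G_{SF}$  as used in Eq. 15. The function names are hypothetical.

```python
import math

Q_ELECTRON = 1.602176634e-19  # C
K_BOLTZMANN = 1.380649e-23    # J/K


def conversion_gain(c_pd=200e-15):
    """Conversion gain of the integration node in volts per electron."""
    return Q_ELECTRON / c_pd


def shot_noise_snr_db(v_ram=1.5, g_sf=0.8, c_pd=200e-15):
    """Shot noise limited SNR at saturation (assumed swing V_RAM / G_SF)."""
    n_sat = (v_ram / g_sf) / conversion_gain(c_pd)  # electrons at saturation
    # Shot noise standard deviation is sqrt(N), hence SNR = sqrt(N).
    return 20 * math.log10(math.sqrt(n_sat))


def reset_noise_dr_db(v_ram=1.5, g_sf=0.8, c_pd=200e-15, temperature=300.0):
    """Dynamic range when only the kT/C reset noise is considered."""
    v_noise = g_sf * math.sqrt(K_BOLTZMANN * temperature / c_pd)
    return 20 * math.log10(v_ram / v_noise)


cg = conversion_gain()
snr_sat = shot_noise_snr_db()
dr_reset = reset_noise_dr_db()
```

Under these assumptions, the sketch gives a conversion gain close to 800 nV/e-, a shot noise limited SNR close to 64 dB and a reset noise limited dynamic range close to 82 dB, which appears consistent with the model results.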

## 3.3.3. Performance Synthesis

The integration stage of the pixel front-end is implemented with a 3T pixel. The integration capacitor is evaluated to 200 fF due to the junction capacitor of a NWell/PSub photodiode. The conversion gain is then 800 nV/e-. A model of this architecture has been implemented with MATLAB and gives an estimation of the memory size, the voltage drop, the power consumption, and the readout noise for a given dynamic range and frame rate. The memory depth of the analog storage architecture strongly depends on the targeted dynamic range. The value of the storage capacitor is defined by its sampling noise with respect to the dynamic range. As the analog memory bandwidth does not significantly limit the frame rate, the memory size does not depend on the image acquisition speed. Using a capacitor density of 3 fF/µm<sup>2</sup>, the memory depth is equal to 82 and 800 images for dynamic ranges of 70 dB and 60 dB respectively. For dynamic ranges below 56 dB, the memory size is limited to 1740 images due to technological considerations such as mismatch. Moreover, at such memory depths, the storage capacitor is of the same order of magnitude as the column bus capacitor  $C_C$  and the voltage range of the reading operation is significantly reduced due to the charge sharing. As presented in this section, the A/D conversion is not a critical element of the analog storage architecture as it only requires a conversion density of 164 Mbit/s/mm<sup>2</sup>. However, the number of LVDS buffers sets the retention time and thus the value of the voltage drop due to current leakages. For ten output drivers and the switch technology presented in [40], the current leakage is 0.015 fA and the voltage drops are not significant. However, this model can be useful to evaluate the effect of the current leakages if another switch technology is used. The pixel power consumption strongly depends on the frame rate. It varies from 1  $\mu$W to 1 mW per pixel for frame rates from 1 Mfps to 1 Gfps at low dynamic range. Moreover, as the dynamic range increases above 62 dB, the total power consumption rises due to larger storage capacitors. The static power of the SF buffer contributes 55 % to 70 % of the total power consumption depending on the targeted dynamic range. The readout noise is mainly due to the sampling noise of the analog memory for low dynamic ranges. However, above 60-70 dB, the reset noise prevails and limits the dynamic range of the front-end circuit to 82 dB. The dynamic range is independent of the frame rate for a first order model. For high illumination, the photocurrent shot noise limits the signal to noise ratio of the front-end to 64 dB.

## 3.4. Burst Image Sensor with Digital Storage

3D integration offers another possible architecture. Indeed, it is possible to use high speed A/D converters to perform the conversion during the image acquisition. With this architecture, the images are stored in digital memories. In this section, one studies the 3D integrated burst image sensor with digital storage. Firstly, the architecture implemented using 3D integration technology is described. Then, one assesses the performances of the image sensor in terms of frame rate, dynamic range, memory depth, and power consumption.

#### 3.4.1. Architecture Overview

The digital storage architecture differs from the classical burst CIS because an A/D conversion is performed before the frame storage as shown in Fig. 44. The sensor works in two phases. First, the photodiode signal is integrated into a voltage that is converted into a digital signal, and the digital signal is stored in a digital memory. Then, the memory is read out of the image sensor through the LVDS driver. The photodiodes and the analog pixel front-ends are implemented on the top tier of the stack. The pixel front-end performs the current to voltage conversion. It also performs a global shutter acquisition to avoid the aberrations due to skew. A correlated double sampling (CDS) can be implemented to remove the fixed pattern noise of the front-end and reduce the temporal noise caused by the reset operation. A matrix of high speed ADCs is set on the middle tier. As we will see later, a cluster of pixels is connected to one ADC because the surface of one A/D converter is larger than the pixel footprint. The digital memory is implemented on the bottom tier. For floorplan reasons, several ADCs can be connected to one digital memory which enables parallel writing accesses. The digital memory is read out through an output buffer. The output buffers are also implemented on the bottom tier. The buffers are implemented with the LVDS standard as in the analog storage architecture.

[Diagram labels: Tier 1, pixel front-ends; Tier 2, multiplexers; Tier 3, digital memory]

Fig. 44 Digital Storage Burst Image Sensor Architecture

For a better understanding of this architecture, the image acquisition process is described with a timing diagram (Fig. 45). Each acquisition starts with a reset of the integration stage of each pixel during a time  $T_{rst}$ . Then the photocurrent is integrated on a capacitor during a time  $T_{int}$ . At the end of the integration, the voltage value is stored in a sampling cell in each pixel. This sampling cell performs the global shutter acquisition. At this point, the integration stages are reset and the integration of the next image starts. During this second integration, the ADC successively converts the signals of the previous integration stored in the sampling stage of each pixel of the cluster. The digital values are written into the digital memory.



Fig. 45 Timing Diagram of a Cluster of 4 Pixels for the Digital Storage Architecture

Finally, one checks the signal path through the 3D stack. If the multiplexer that connects the pixel front-end to the ADC is set on the top tier, only one bonding per cluster is required between the top and the middle tier. Between the middle and the bottom tier, the bonding is face-to-back and the interconnection requires a TSV. Depending on the technology, the TSV diameter can be large. For instance, the TSV diameter is 40  $\mu$m with a Leti Open 3D technology [70]. As a TSV uses some active area, it is better to limit the number of TSVs to give more room for the transistors on the middle tier. As for the analog storage architecture, the control signals have to be mutualized between groups of pixels for the routing. To compare the digital storage architecture to the analog one, it is necessary to assess the frame rate, dynamic range, memory depth and power consumption of this architecture. The next section presents the elements that can limit the performances of the image acquisition. These three elements are the pixel front-end, the ADC and the digital memory.

### 3.4.2. Performance Evaluation

### 3.4.2.1. Pixel Front End and Multiplexer

The pixel front-end has to perform three tasks: the current to voltage conversion, the global shutter acquisition and the multiplexing. As in the analog storage architecture, the photocurrent integration is performed on the photodiode junction capacitor (200 fF) and the conversion gain is 800 nV/e-. The global shutter operation is performed by a basic sampling cell. A source follower buffer implemented with a P-type MOSFET copies the voltage at the integration node into a sampling capacitor. The gain of this stage is 0.8 and the offset is 0.8 V. The multiplexer stage that connects the pixels of the cluster to the ADC is implemented with SF buffers and switches. In order to stay below the power supply headroom, these SF buffers are implemented with N-type MOSFETs (i.e. with a negative offset). One current source is shared by the SF buffers to reduce the power consumption of the cluster of pixels. The pixel front-end characteristic is set to match the input voltage range of the ADC. The characteristic is tuned with the reset voltage of the integration stage and with the offsets of the SF buffers (i.e. with the dimensions of the transistors).

Fig. 46 Pixel Front-End of Digital Storage Architecture



As for the analog storage architecture, a MATLAB model has been designed to evaluate the performances of the pixel front-end of the digital storage architecture. The model has the frame rate and the dynamic range as inputs. The user also gives information about the ADC, such as its input capacitor  $C_{ADC}$  and its dimensions that set the multiplexing ratio. The model starts by computing the value of the global shutter capacitor  $C_{GS}$  using the same method as the one presented for the analog memory of the previous architecture. The user gives the percentage of the A/D conversion time dedicated to the signal sampling and the percentage of the integration time dedicated to the global shutter operation. The model then computes the bias currents of the SF buffers with the bandwidth and slew rate constraints. In terms of speed, the critical operation is the loading of the ADC input.

Finally, the model computes the readout noise and checks that its value does not limit the dynamic range. For this study, one considers that the reset and the global shutter are performed in 10 % of the frame period. For the multiplexing operation, one considers a multiplexing ratio of 9:1 and a load capacitor  $C_{ADC}$  of 1 pF. The sampling at the ADC input is performed in one-fifth of the conversion time.

When running the model, it is first interesting to look at the on-resistance of the switches of the multiplexer, global shutter, and reset stages. Indeed, as the timing constraints on these stages are strong, their on-resistances must be small (i.e. the switches have a large footprint). This is particularly true for the multiplexer, which has only a fraction of the conversion time to load the ADC. For frame rates above 50 Mfps, the multiplexer must have switches with an on-resistance below 500  $\Omega$ . Such switches are very large and difficult to implement in the pixel without drastically reducing the fill factor. This is a first limitation of the digital storage architecture in terms of frame rate. The model then computes the pixel power consumption. The power consumption depends on the frame rate and the dynamic range, as illustrated in Fig. 47. Below 55 dB, the power is independent of the dynamic range as the capacitor of the global shutter stays constant. For higher values, the global shutter capacitor increases with the dynamic range. In terms of power consumption, it seems difficult to exceed frame rates of 100 Mfps as the pixel power consumption exceeds 1 mW for high dynamic ranges.



Fig. 47 Pixel Power Consumption versus the Dynamic Range for Different Frame Rates

As for the analog storage architecture, the computation of the readout noise is independent of the frame rate. The readout noise limits the dynamic range, as illustrated in Fig. 48. For dynamic ranges above 77 dB, the sampling noise of the reset operation limits the dynamic range to 79 dB. Above this dynamic range, the main source of noise is the sampling noise of the global shutter operation. The photocurrent shot noise limits the signal-to-noise ratio (SNR) of the front-end for dynamic ranges above 55 dB. The SNR is then limited to 62 dB under high illumination.



Fig. 48 Dynamic Range Limitation due to the Readout Noise

# 3.4.2.2. Analog to Digital Conversion

The A/D conversion is a challenging element of the digital storage architecture. Indeed, the ADCs have to convert the analog signal coming from the pixels at the acquisition speed. A review of the ADC state of the art shows that it is difficult to implement such an ADC below each 50  $\mu$ m pixel. Therefore, the pixels of a cluster are multiplexed and connected to a single ADC, as presented in Fig. 44. From a layout point of view and due to the 3D stack architecture, the footprint of the ADC must be equal to the pixel cluster footprint. The multiplexing ratio is thus equal to the ratio of the ADC area  $A_{ADC}$  to the pixel area  $A_{pixel}$ . Therefore, the ADC must work at the frame rate FR times the multiplexing ratio (Eq. 16). To maximize the frame rate, the ADC must have a high conversion frequency  $F_{ADC}$  and a small footprint.

$$F_{ADC} = FR \times \frac{A_{ADC}}{A_{pixel}}$$
 Eq. 16
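
Eq. 16 can be sketched numerically as follows. The values are those retained at the end of this chapter (a 150x150 µm² SAR ADC running at 45 MHz under pixels with a 50 µm pitch); the function names are illustrative, and both the ADC and the pixel are assumed to be square.

```python
# Sketch of Eq. 16: the ADC sits under a cluster of pixels, so the
# multiplexing ratio is the ADC footprint divided by the pixel footprint.
# ASSUMPTION: square ADC and square pixel.


def multiplexing_ratio(adc_side_um: float, pixel_pitch_um: float) -> float:
    """Ratio A_ADC / A_pixel."""
    return (adc_side_um ** 2) / (pixel_pitch_um ** 2)


def max_frame_rate(f_adc_hz: float, ratio: float) -> float:
    """Frame rate FR allowed by an ADC running at f_adc_hz (Eq. 16 solved for FR)."""
    return f_adc_hz / ratio


ratio = multiplexing_ratio(adc_side_um=150.0, pixel_pitch_um=50.0)
frame_rate = max_frame_rate(f_adc_hz=45e6, ratio=ratio)
print(f"Multiplexing ratio: {ratio:.0f}:1")
print(f"Frame rate: {frame_rate / 1e6:.0f} Mfps")
```

With these inputs the ratio is 9:1 and the frame rate is 5 Mfps.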

Analog to digital conversion is a broad topic. It is difficult to make relevant decisions on converter architectures without long-term experience in this field of study. To overcome this problem and choose an architecture for a given set of performances, a possible solution is to base the choice on the analysis of a database of ADCs. For instance, a statistical analysis of a database has been presented in [71] to analyze power consumption and area with respect to the technology node. One proposes here an analysis to find the architecture best suited to our application. This study is carried out on the Murmann database [72], which gathers the performances of the ADCs presented at the ISSCC and VLSI conferences from 1997 to the present date. First, one evaluates the Signal to Noise and Distortion Ratio (SNDR) for a given frame rate for different ADC architectures in Fig. 49. This graph is plotted for a pixel pitch of 50  $\mu$ m. The Effective Number Of Bits (ENOB) is computed from the SNDR by Eq. 17.

$$ENOB = \frac{SNDR_{dB} - 1.76}{6.02}$$
 Eq. 17
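
A minimal sketch of Eq. 17 and its inverse is given below. The two sample values (40 dB and 8 bits) are taken from this chapter; the function names are illustrative.

```python
# Sketch of Eq. 17 (SNDR in dB to effective number of bits) and its inverse.


def enob_from_sndr(sndr_db: float) -> float:
    """Eq. 17: ENOB = (SNDR_dB - 1.76) / 6.02."""
    return (sndr_db - 1.76) / 6.02


def sndr_from_enob(bits: float) -> float:
    """Inverse of Eq. 17: SNDR in dB needed for a given ENOB."""
    return 6.02 * bits + 1.76


print(f"SNDR 40 dB -> ENOB {enob_from_sndr(40.0):.2f} bits")
print(f"ENOB 8 bits -> SNDR {sndr_from_enob(8.0):.2f} dB")
```

An SNDR of 40 dB therefore corresponds to slightly more than 6 effective bits.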

As expected, the flash architecture reaches high frame rates but only for low SNDR. It hardly exceeds an SNDR of 40 dB, i.e. an ENOB of about 6 bits. It is not a good candidate for the digital storage architecture as other architectures reach a better SNDR for the same frame rate. The sigma-delta architecture provides the highest SNDR but limits the frame rate to about 5 Mfps. This architecture is better suited to reaching a high SNDR than a high frame rate. The pipeline architecture reaches frame rates up to 40 Mfps. The SAR architecture offers a wide range of frame rates, up to 1 Gfps. The SAR and pipeline architectures seem to be two good candidates to target frame rates between 10 and 100 Mfps. However, the pipeline architecture is more subject to missing code errors than the SAR architecture. Therefore, the SAR architecture is preferred for this range of frame rates.

Fig. 49 SNDR versus the Frame Rate



Another important aspect of the A/D conversion is the power consumption. Indeed, all the ADCs of the chip work simultaneously. The A/D conversion is therefore a major cause of power consumption in the digital storage architecture. The same analysis has thus been done on the Murmann database for the power consumption density versus the conversion frequency density in Fig. 50. Based on this plot, the SAR architecture appears to be a well-suited solution to lower the power consumption for a conversion frequency up to 1 Mfps. For the highest conversion densities, the flash architecture becomes power efficient. However, it is important to notice that the power consumption of the buffer that loads the ADC input capacitor is not included in the database. This buffer power consumption strongly depends on the ADC architecture as it depends on its input impedance. A more accurate analysis of the power consumption would take into account the power consumption of the input buffer.

[Plot: power density (W/mm²) versus frame rate (Hz) for the SAR, sigma-delta, pipeline, and flash architectures]

Fig. 50 Power Consumption Density versus the Frame Rate

# 3.4.2.3. Digital Memory

The digital storage burst image sensor requires a high density memory to maximize the number of images per burst. Moreover, the sensor requires a memory with a short write access time to store the bit stream generated by the ADCs. The memory reading operation is performed at low speed. The third tier of the stack also contains the output drivers and some control logic. Therefore, the memory technology must be compatible with a standard CMOS technology. A digital memory is implemented to store the burst of images, as shown in Fig. 51. Different memory technologies can be used to implement the burst memory. The digital memory can be implemented with registers. A memory based on registers has a very short access time. However, registers are not an effective way to store a large amount of data due to their poor bit density (bit/µm²). In integrated circuits, mass storage is generally implemented with random access memory (RAM) technologies. The RAM architecture is made of an array of bit cells, also known as memory cells (MC), composed of 2<sup>n</sup> rows and 2<sup>m</sup> columns [73]. A row contains a word of 2<sup>m</sup> bits that is usually made of 8, 16, 32 or 64 bits. The RAM architecture is designed to write or read a full word at each memory access. To write into the RAM, a word line is selected by the row decoder. The bit line conditioning circuit drives the memory cells of the selected word line. The same operation is applied for reading, except that the memory cells drive the bit lines. The bit lines are read at the bottom of the matrix by a sense amplifier *SA*. Different RAM architectures have been defined in order to optimize different parameters such as the power consumption or the access times. RAMs are generally implemented by software (memory generators) that change the shape of the memory or divide it into several sub-arrays. However, for our application, the memory is written and read sequentially. The addressing circuitry can thus be simplified into a simple shift register that propagates a token from one word line to the next.

Fig. 51 Random Access Memory Architecture



The memory cell can be implemented as a static or a dynamic memory cell. A basic Static-RAM (SRAM) bit cell is made of 6 transistors (Fig. 52). Two cross-coupled CMOS inverters store the bit value. Two switch transistors give access to the inverters to read the bit value or write a new one. The SRAM technology offers a good memory density that scales with the CMOS technology shrinking. For instance, the Intel SRAM density is about 1 bit/ $\mu$ m² with a 90 nm technology and about 6.25 bit/ $\mu$ m² with a 32 nm technology [73]. The access time to write and read an SRAM depends on the size of the memory array and is on the order of a few nanoseconds. The SRAM stores the data as long as the circuit is powered and does not require any refresh operation. Moreover, the SRAM is compatible with the standard CMOS technology. Another benefit of SRAM is its simple implementation procedure thanks to the IPs and SRAM generators provided by microelectronics companies. A Dynamic-RAM (DRAM) bit cell is made of a capacitor and one transistor (Fig. 52). The transistor is a switch that gives access to the capacitor. The bit value is stored in the capacitor. DRAM technology has a very good density, above 100 bit/µm² [74]. For instance, the bit cell capacitor can be a compact trench capacitor to reach high density. The capacitor is then made in the substrate with a vertical finger of polysilicon surrounded by a dielectric layer. Other capacitor structures have been designed for DRAM applications. However, they require specific processes that are not available in standard CMOS technology. Companies offer DRAMs that can be embedded in a standard CMOS ASIC. These memories, known as embedded DRAM (eDRAM), have a good density compared to SRAM. For instance, the eDRAM density reaches up to 28 bit/µm² with a 28 nm CMOS technology [75]. Several access times characterize the memory reading or writing, but they are on the order of tens of nanoseconds. A drawback of DRAM/eDRAM is the data retention time, which depends on the leakage current of the switch. Thus DRAM must be periodically refreshed. The refresh period varies from 1 to 10 ms.

Fig. 52 (a) 6T SRAM Bit Cell and (b) 1T1C DRAM Bit Cell



Both SRAM and eDRAM are suitable for our application. SRAM allows an easy and reliable implementation of the sensor memory. eDRAM pushes the memory depth of the sensor even further. A simple evaluation of the image burst size has been done for an SRAM and an eDRAM memory. The memory size N of the RAM can be predicted by Eq. 18 (a) from the memory surface  $A_{RAM}$ , the bit cell surface  $A_{bit}$ , and the array efficiency E [73]. The array efficiency represents the fraction of the memory surface devoted to the bit cells. The rest of the memory area is devoted to the row and column circuits. The array efficiency is estimated at 0.7. The bit cell area for an SRAM is given by Eq. 18 (b), with  $\lambda$  half of the technology node and 600 a factor that depends on the layout strategy. Here one considers a RAM designed with a 28 nm CMOS technology. The SRAM bit cell surface is 0.118  $\mu$m<sup>2</sup> and the eDRAM bit cell surface is 0.035  $\mu$m<sup>2</sup> [75].

$$\begin{cases} N = E \frac{A_{RAM}}{A_{bit}} & (a) \\ A_{bit\ SRAM} = 600 \lambda^2 & (b) \end{cases}$$
 Eq. 18

For the targeted pixel pitch of 50  $\mu$ m, the memory size is plotted versus the dynamic range of the image sensor in Fig. 53. The memory depth varies over the dynamic range from 8333 to 3846 images for the dynamic bit cell and from 2480 to 1145 images for the static bit cell. Based on manufacturer data, the power consumption of an SRAM implemented with a 28 nm technology can be evaluated to 8  $\mu$W/Byte/MHz for a typical process. This figure does not take into account the transistor leakages that strongly increase the power consumption of the memory, especially for worst case conditions (high temperature and corner parameters).
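
The memory depth estimate of Eq. 18 can be sketched as follows. Two points are assumptions rather than statements from the text: the memory available to one pixel is taken equal to the pixel footprint (50x50 µm²), and the pixel word lengths of 6 and 13 bits are illustrative values chosen because they reproduce the quoted range of memory depths.

```python
# Sketch of Eq. 18: bits stored under one pixel and resulting burst depth.
# ASSUMPTIONS: memory area per pixel = pixel footprint; 6 and 13 bits per
# pixel are illustrative word lengths, not values stated in the text.


def sram_bit_cell_area_um2(node_nm: float, layout_factor: float = 600.0) -> float:
    """Eq. 18 (b): A_bit = 600 * lambda^2, with lambda = half the node."""
    half_node_um = (node_nm / 2.0) * 1e-3
    return layout_factor * half_node_um ** 2


def memory_bits(area_um2: float, bit_area_um2: float, efficiency: float = 0.7) -> float:
    """Eq. 18 (a): N = E * A_RAM / A_bit."""
    return efficiency * area_um2 / bit_area_um2


pixel_area = 50.0 * 50.0               # µm², memory area under one pixel
a_sram = sram_bit_cell_area_um2(28.0)  # about 0.118 µm²
a_edram = 0.035                        # µm², eDRAM bit cell quoted in the text

for name, a_bit in (("SRAM", a_sram), ("eDRAM", a_edram)):
    bits = memory_bits(pixel_area, a_bit)
    for bits_per_pixel in (6, 13):
        depth = round(bits / bits_per_pixel)
        print(f"{name}: {bits_per_pixel} bit/pixel -> {depth} images")
```

Under these assumptions, the sketch returns 2480 and 1145 images for the static bit cell and 8333 and 3846 images for the dynamic bit cell.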

Fig. 53 Memory Size versus Dynamic Range of the Digital Storage Architecture for Static and Dynamic Bit Cell in 28 nm technology



### 3.4.3. Performance Synthesis

This study confirms that the tradeoff between the dynamic range and the frame rate due to the A/D conversion limits the performance of the digital storage architecture. Indeed, for the same frame rate, the pixel front-end can have a better dynamic range than the state-of-the-art ADCs. The A/D conversion appears as the bottleneck of this architecture. Therefore, the design of the front-end will be constrained by the ADC specifications and performances. Moreover, 100 Mfps seems to be a limit to the frame rate of this architecture. Indeed, at such a frame rate, the ADC conversion frequency is about 1 GHz. The pixel front-end also has a power consumption of hundreds of microwatts and requires large switches for the different sampling operations. As for the analog storage architecture, the conversion gain is about 800 nV/e- and depends on the junction capacitor of the photodiode. There is a tradeoff between the memory depth and the dynamic range. The memory depth of the digital storage architecture depends on the choice of a static or dynamic bit cell. Static bit cells provide a lower memory depth but are easier to implement and more reliable than dynamic bit cells. On the contrary, dynamic bit cells provide a very high memory depth but require refresh operations and are available only with some specific technological options. The power consumption of the digital storage architecture depends on the targeted frame rate. Performing the A/D conversion during the burst acquisition is a major source of power consumption. The pixel power consumption of the digital storage architecture during the burst recording is plotted in Fig. 54. The power consumption is given for three different choices of ADC, each defining a pair of dynamic range and frame rate. These figures confirm that the main source of power consumption in the digital storage architecture is the A/D conversion.



Fig. 54 Pixel Power Consumption for Different Pairs of Dynamic Range and Frame Rate

# 3.5. Conclusions and Perspectives

In this chapter, the analog and digital storage architectures of burst image sensors are compared. Both architectures are limited by a tradeoff between the dynamic range and the frame rate, which depends on the power consumption for the former and on the A/D conversion for the latter. However, for a given dynamic range and power consumption, the analog storage architecture has a frame rate about fifty times higher than the digital storage architecture. For instance, at 60 dB, the maximum frame rate of the digital storage architecture is limited by the ADCs to 10 Mfps, whereas the analog storage architecture is limited by the analog memory to 500 Mfps for the same power consumption. In terms of number of images per burst, the digital storage architecture is to be preferred. Indeed, using static digital memory, the digital storage architecture has a higher memory depth than the analog memory, especially for high dynamic ranges (>60 dB). Besides, with a dynamic digital memory, the digital storage architecture has a memory capacity more than 10 times higher than the analog storage architecture. Moreover, we have to keep in mind that the design of analog memory using small storage capacitors is difficult and less reliable. In addition, most companies currently focus their R&D work on digital memory. Therefore, one can expect very dense digital memories in the coming years thanks to emerging memory technologies. In terms of power consumption, the analog storage architecture clearly shows better figures over the whole range of frame rates as it does not perform the A/D conversion during the image acquisition.

Before going any further, it is necessary to refocus our work on a given architecture. The main advantage of the analog storage architecture is a high frame rate. However, high frame rates can be reached with or without 3D integration technology and many works have been presented to improve this parameter [76] [77] [78]. The digital storage architecture is chosen to take full advantage of the 3D integration technology. Indeed, this architecture makes it possible to design a burst IS with a memory depth of thousands of images, which is not reachable without 3D integration. Therefore, throughout the rest of this work, the digital storage architecture will be studied in more detail. A study of the published ADCs shows that the converters that permit such frame rates are generally SAR ADCs. Here, one considers a SAR ADC based on a capacitive digital-to-analog converter (DAC). Such an ADC can have a conversion frequency of 100 MHz with a resolution of 10 bits for a surface of 155x165 μm<sup>2</sup> [79]. However, this reference is a state-of-the-art ADC. For this work, we consider that the ADC is implemented with the same architecture but reaches a conversion frequency of 45 MHz with a resolution of 8 bits for a surface of 150x150 μm<sup>2</sup>. We also consider that the value of its input capacitor is 1 pF. Therefore, the multiplexing ratio is 9:1 and the frame rate is 5 Mfps. Before going deeper into the study of this burst image sensor, one takes a look at the effect of the power consumption of the digital storage architecture. Indeed, this chapter has identified a state of high power consumption during the burst acquisition, which can cause an overheating of the 3D stack. Therefore, one presents in the next chapter a thermal analysis of the digital storage architecture for different recording and triggering modes.

# 4. Thermal Study of a 3D Integrated Digital Burst Image Sensor

### 4.1. Introduction

The previous chapter presents a highly integrated and highly parallel image sensor architecture made possible by 3D integration technology. In this chapter, we focus on the 3D integrated burst image sensor with digital storage architecture. In addition, the considered burst image sensor works at high speed to record a 5 Mfps video. Such an architecture consumes a lot of electrical power. Therefore, the system will be subject to significant Joule heating that will raise the junction temperatures in the chips. The temperature can have a significant effect on the image quality of a digital camera, as the signal-to-noise ratio degrades when the temperature increases. Moreover, high temperatures degrade the circuit lifetime and can destroy the circuit for extreme values. In this chapter, we propose a thermal analysis of the chip in order to prevent overheating and define some operating limits for the image sensor. First, a thermal description of the 3D integrated image sensor and its package is given. The electrical power consumption in the chips is also presented for the acquisition and the reading operations. Then, the static and transient simulation results are presented and discussed for the image sensor working in different triggering and recording modes. Finally, some perspectives are given to evaluate the risk of thermal runaway.

# 4.2. Thermal System

# 4.2.1. Package Description and Model Assumptions

This section describes the structure of the image sensor that will be simulated. To perform a relevant thermal study, it is important to define the physical structure of the image sensor and the thermal properties of the different materials. First of all, the stacking of the 3D integrated image sensor is briefly recalled. The top chip (pixel front-end circuits) is back side illuminated with a thinned substrate. This chip is connected to the middle chip (ADCs) with a face-to-face micro-bonding. The middle chip is connected to the bottom one (memories and LVDS drivers) with a face-to-back micro-bonding. Thus, on the middle and the bottom chips, the signals are routed from one side to the other using through silicon vias (TSV). The interconnections (micro-bonding and TSV) are made with the Leti Open-3D technology [70]. Then, the 3D integrated image sensor is placed in a BGA package for flip-chip. The package has an optical lid. The sensor is connected to the package, again using the bonding of the Open-3D technology. The thermal characteristics and dimensions of the package are given by the manufacturers [80] [81]. The sides and the top of the sensor are not in contact with the package. Finally, the package is mounted on a printed circuit board (PCB) made of FR-4 material, as shown in Fig. 55.

Fig. 55 Image Sensor Package and Printed Circuit Board



Each chip is modeled by a stack of three slices, as shown in Fig. 56. The chip substrate is modeled by a slice of silicon (blue). The TSVs are not taken into account to compute the thermal properties of the substrate. In fact, due to the high conductivity of copper, TSVs will only increase the conductivity of the substrate, which is already the layer with the highest conductivity. The active area where the transistors dissipate the power is modeled by a thin slice of silicon (red). The back-end of line (BEOL) is a mixed slice of metal and silicon dioxide (yellow). Its thermal conductivity and its volumetric heat capacity are evaluated with the weighted average method (25 % copper and 75 % SiO<sub>2</sub>). The interconnection layer between two chips is also modeled by a slice of mixed material (green). As two consecutive chips are connected with the Leti Open-3D technology [70], the micro-bonding is made of a copper micro-pillar soldered with an SnAg alloy to a copper micro-bump. The diameter of the bonding is 25 μm and the pitch is 50 μm. For this study, we consider that the chip-to-chip bonding is used at its maximum density (i.e. one bonding element every 50 µm). The heat transfer mainly takes place through the micro-bonding. The thermal conductivity and the volumetric heat capacity of the interconnection slice are computed with the weighted average method (20 % micro-bonding and 80 % air). Pillars are used to connect the sensor to the flip-chip package. The diameter of these pillars is 70 μm for a minimum pitch of 140 μm. They are connected to the package with a solder alloy. The number of interconnections is 784 to fit the package specification. As sub-LVDS drivers are differential, 10 LVDS drivers require 20 interconnections. All the other interconnections are used to route the control signals, the voltage references and the power supplies.
Using the previously mentioned method, the thermal conductivity and the volumetric heat capacity of the chip-to-package bonding are respectively 15.6 W/K/m and 691 kJ/K/m<sup>3</sup>. The flip-chip package is modeled by a slice of Al<sub>2</sub>O<sub>3</sub> ceramic. The package is connected to the PCB with a BGA made of a tin-lead alloy. For a package of 30x30 mm<sup>2</sup> and a ball pitch of 1 mm, the number of balls is 784 [82]. The thermal conductivity and the volumetric heat capacity of the BGA are respectively 10 W/K/m and 294 kJ/K/m<sup>3</sup>. The values of thermal conductivity  $\lambda$, volumetric heat capacity  $C_{PV}$  and height h of each slice are summarized in Annex B.
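
The weighted average method can be sketched as below. The bulk conductivities of copper and SiO<sub>2</sub> are typical textbook values assumed for illustration, not figures taken from this work. The second function checks that round bonds of 25 µm diameter on a 50 µm square grid cover roughly 20 % of the area, assuming the weighting is done by area fraction.

```python
import math

# Sketch of the weighted-average method for a mixed slice.
# ASSUMPTION: the bulk conductivities below are typical textbook values,
# not figures taken from the thesis.
K_COPPER = 385.0  # W/K/m, assumed
K_SIO2 = 1.3      # W/K/m, assumed


def weighted_average(fractions_and_values):
    """Fraction-weighted average of a property over several materials."""
    return sum(fraction * value for fraction, value in fractions_and_values)


def bond_area_fraction(diameter_um: float, pitch_um: float) -> float:
    """Fraction of the area occupied by round bonds on a square grid."""
    return math.pi * (diameter_um / 2.0) ** 2 / pitch_um ** 2


k_beol = weighted_average([(0.25, K_COPPER), (0.75, K_SIO2)])
fill = bond_area_fraction(diameter_um=25.0, pitch_um=50.0)
print(f"BEOL conductivity: {k_beol:.1f} W/K/m")
print(f"Micro-bond area fraction: {fill:.1%}")
```

With the assumed material values, the BEOL conductivity comes out near 97 W/K/m and the micro-bond fill factor near 19.6 %.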

Fig. 56 Cross Section View of Thermal Model of the 3D Burst Image Sensor



| Slice                         | λ<br>(W.K <sup>-1</sup> .m <sup>-1</sup> ) | C <sub>P,V</sub><br>(J.K <sup>-1</sup> .m <sup>-3</sup> ) | h<br>(μm) |
|-------------------------------|--------------------------------------------|-----------------------------------------------------------|-----------|
| BEOL                          | 97.25                                      | 2.5 M                                                     | 15        |
| Device                        | 163                                        | 1.64 M                                                    | 5         |
| Substrate<br>(top)            | 163                                        | 1.64 M                                                    | 15        |
| Substrate<br>(middle, bottom) | 163                                        | 1.64 M                                                    | 40        |
| Interco bonds                 | 19.5                                       | 864 k                                                     | 20        |
| Interco balls                 | 19.5                                       | 864 k                                                     | 40        |

For many applications, the system will be placed vertically to record an experiment, as illustrated in Fig. 57. The system is in contact with the air and is cooled by natural convection. This free convection of the air mainly occurs on the surfaces of the PCB and the lid. This heat transfer can be modeled by a thermal resistor. The equivalent resistor is given by Eq. 19 where *h* is the convective heat transfer coefficient and *A* the exchange surface.

$$R = \frac{1}{h A}$$
 Eq. 19

For a plate in contact with the air, the value of h depends on the plate orientation, vertical or horizontal. The convective heat transfer coefficient also depends on the temperature difference between the plate  $T_p$  and the air  $T_a$  and on the air speed [83]. It can vary by several orders of magnitude depending on the air speed. The convective heat transfer coefficient can be approximated by Eq. 20 for natural convection on a vertical plate of height *H* lower than 1 m in contact with the air.

$$h \cong 5.6 \left(\frac{T_p - T_a}{T_a \times H}\right)^{0.25}$$
 Eq. 20

For an air temperature of 300 K, a PCB plate temperature of 320 K and a height of 2 cm, the heat transfer coefficient is about 7.57 W/K/m². The equivalent thermal resistor is 330.4 K/W. For a lid temperature of 310 K, the heat transfer coefficient is 6.36 W/K/m². The equivalent thermal resistor is 392.9 K/W.
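
The two cases above can be reproduced with the following sketch of Eq. 19 and Eq. 20. The exchange surface is assumed to be 20x20 mm², the sensor surface used for the static model; this value is an assumption made here because it is consistent with the quoted thermal resistors.

```python
# Sketch of Eq. 19 and Eq. 20 for natural convection on a vertical plate.
# ASSUMPTION: the exchange surface is 20 x 20 mm² (sensor surface).


def convective_coefficient(t_plate_k: float, t_air_k: float, height_m: float) -> float:
    """Eq. 20: h ~ 5.6 * ((Tp - Ta) / (Ta * H)) ** 0.25, in W/K/m²."""
    return 5.6 * ((t_plate_k - t_air_k) / (t_air_k * height_m)) ** 0.25


def convective_resistance(h: float, area_m2: float) -> float:
    """Eq. 19: R = 1 / (h * A), in K/W."""
    return 1.0 / (h * area_m2)


AREA = 20e-3 * 20e-3  # m²
for label, t_plate in (("PCB", 320.0), ("lid", 310.0)):
    h = convective_coefficient(t_plate, t_air_k=300.0, height_m=0.02)
    r = convective_resistance(h, AREA)
    print(f"{label}: h = {h:.2f} W/K/m², R = {r:.1f} K/W")
```

The resistances come out near 330 K/W for the PCB and 393 K/W for the lid.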

Fig. 57 3D Integrated Image Sensor in a BGA Package Cooled by Free Air Convection



## 4.2.2. Power Consumption

This section describes the power dissipated in each layer. This information is mandatory to perform the thermal simulation of the image sensor. The power density is evaluated during the burst acquisition and the burst reading operation, from the top chip to the bottom chip. It is reasonable to consider that each chip dissipates the power in a homogeneous way as this digital burst IS has a matrix structure. In the top chip, each pixel is composed of a 3T circuit and a sample and hold cell as described in the previous chapter. Moreover, the multiplexer stage that connects 9 pixels to one A/D converter is also implemented on this layer. During the burst acquisition, the sources of power consumption in the 3T circuit are the photodiode reset operation and the static power consumption of the source follower (SF) buffer. The photodiode reset operation consumes a power  $P_{PD\ Reset}$  that depends on the photodiode junction capacitance, the reset voltage and the reset frequency. The SF buffers of the 3T circuit and of the multiplexer consume respectively the static powers  $P_{SF\ 3T}$  and  $P_{SF\ Multiplexer}$ , which depend on the required speed (bandwidth and slew rate) and dynamic range. For the targeted speed and dynamic range, the static power due to the current bias dominates the dynamic power. During the burst reading operation, the photodiodes are not reset and the power consumption depends only on the SF buffers. Thus the power density on the top chip  $P_{TopChip}$  is given by Eq. 21 (a) and Eq. 21 (b) respectively during the burst acquisition and the burst reading. The power density also depends on the cluster surface  $A_{Cluster}$ , i.e. the surface of 9 pixels. The SF buffers are the main sources of power consumption. For this architecture, the power density dissipated on the top chip is about 88.5 mW/mm² during both the acquisition and the reading of the burst. It is also possible to turn off the current sources of the pixels during the reading operation with some extra control logic. This case will be considered later in this study.

$$\begin{cases} P_{TopChipAcquisition} = \frac{9 \times (P_{PD\ Reset} + P_{SF\ 3T}) + P_{SF\ Multiplexer}}{A_{cluster}} & (a) \\ P_{TopChipReading} = \frac{9 \times P_{SF\ 3T} + P_{SF\ Multiplexer}}{A_{cluster}} & (b) \end{cases}$$
 Eq. 21

The middle chip of the stack is composed of ADCs that convert the signals coming from the clusters of pixels. As presented in the previous chapter, the A/D conversion is performed with a successive approximation register (SAR) ADC with a capacitive DAC. The ADC works at 45 MC/s with an 8 bit resolution. Based on the ADC power and its surface, the power density dissipated on the second layer is 80 mW/mm<sup>2</sup> during the burst acquisition. For the burst reading, the ADC is idle. The power consumption is then 0 W/mm<sup>2</sup> during the reading operation. The bottom chip is composed of memories based on 28 nm SRAM bit cells and ten LVDS drivers. The size of one cluster memory has been evaluated to 12 kilobytes in the previous chapter. Such a memory consumes 8 µW/MHz/byte for writing and reading operations based on manufacturer's data. During the acquisition, each SRAM stores the data coming from one ADC. Based on the conversion rate and the resolution of the ADC, the power consumption of one SRAM is 360 µW during the acquisition. Moreover, the LVDS driver has a static current of 3.5 mA. Under a 1.2 V power supply, each LVDS driver consumes 4.2 mW. This figure is supported by a 28 nm LVDS IP datasheet. Therefore, the bottom chip dissipates 16 mW/mm<sup>2</sup> during the acquisition. During the burst reading operation, the readout speed is limited by the number of LVDS drivers. Therefore, the reading of one memory consumes 2.3 µW. The total power density dissipated in the third layer during the reading operation is 0.207 mW/mm<sup>2</sup>. The power densities are summed up in Tab. 7.

Tab. 7 Power Densities on Each Chip for both Burst Acquisition and Reading

| Chip                               | Top  | Middle | Bottom |
|------------------------------------|------|--------|--------|
| Acquisition power density (mW/mm²) | 88.5 | 80     | 16     |
| Reading power density (mW/mm²)     | 88.5 | 0      | 0.207  |
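
The bottom-chip figures of Tab. 7 can be cross-checked with the sketch below. Two values are assumptions made for this check: the memory of one cluster occupies the cluster footprint (9 pixels at 50 µm pitch, i.e. 150x150 µm²), and the power of the ten LVDS drivers is spread over a 20x20 mm² sensor surface.

```python
# Sketch: bottom-chip power density from the per-cluster SRAM power and
# the LVDS drivers.
# ASSUMPTIONS: cluster area of 150 x 150 µm² and a 20 x 20 mm² sensor
# surface over which the LVDS power is spread.

SRAM_UW_PER_MHZ_PER_BYTE = 8.0
CLUSTER_AREA_MM2 = 0.150 * 0.150
SENSOR_AREA_MM2 = 20.0 * 20.0


def sram_write_power_uw(rate_mhz: float, bits_per_sample: int) -> float:
    """SRAM power when storing one sample of bits_per_sample at rate_mhz."""
    return SRAM_UW_PER_MHZ_PER_BYTE * rate_mhz * bits_per_sample / 8.0


def lvds_power_mw(current_ma: float = 3.5, supply_v: float = 1.2) -> float:
    """Static power of one LVDS driver."""
    return current_ma * supply_v


def bottom_density_mw_mm2(sram_power_uw: float, n_lvds: int = 10) -> float:
    """Power density of the bottom chip: SRAM per cluster plus LVDS drivers."""
    sram_density = sram_power_uw * 1e-3 / CLUSTER_AREA_MM2
    lvds_density = n_lvds * lvds_power_mw() / SENSOR_AREA_MM2
    return sram_density + lvds_density


p_sram = sram_write_power_uw(rate_mhz=45.0, bits_per_sample=8)
print(f"SRAM write power: {p_sram:.0f} µW")
print(f"Acquisition: {bottom_density_mw_mm2(p_sram):.1f} mW/mm²")
print(f"Reading: {bottom_density_mw_mm2(2.3):.3f} mW/mm²")
```

Under these assumptions, the sketch gives 360 µW per SRAM, about 16 mW/mm² during the acquisition and about 0.207 mW/mm² during the reading.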

#### 4.3. Static Simulation

#### 4.3.1. Static Model

First, a study is carried out on the system to evaluate the temperatures at the thermal equilibrium. Analyzing the steady state temperatures is only relevant for multi-burst recording. Indeed, for single burst recording, the temperatures always return to the room temperature at the steady state as the average power consumption tends towards zero. The static model is based on the system presented in Fig. 57, where the sensor and the PCB are placed vertically. The ambient air is a thermal reservoir which is in contact with the top of the glass lid and the PCB. The ambient air temperature  $T_A$  is 300 K. It is assumed that the heat flows (red arrows in Fig. 57) propagate from the sensor to the air through a constant section (z axis). For this model, the section area is equal to the sensor surface, i.e. 20x20 mm<sup>2</sup>. This assumption is a worst case as the PCB surface in contact with the air is wider than the sensor surface. It is also assumed that the lateral heat flows (x, y axis) are negligible compared with the flows along the z axis [84]. The equivalent static model of the system is presented in Fig. 58. The heat sources  $P_{top}$,  $P_{mid}$, and  $P_{bot}$  respectively model the power dissipated in the top, middle, and bottom device layers.  $R_{top-mid}$  and  $R_{mid-bot}$  are respectively the thermal resistors corresponding to the materials between the top and the middle device layers and between the middle and the bottom device layers. The materials between the top device layer and the inner air of the package are modeled by  $R_{air-top}$. The inner air and the glass lid are modeled by  $R_{air}$  and  $R_{glass}$. The materials between the bottom device layer and the package are modeled by  $R_{bot\text{-}package}$. The package and the PCB are modeled by  $R_{package}$  and R<sub>PCB</sub>. The values of the thermal resistances are summarized in Annex B.
The glass lid and the PCB are cooled by the natural convection of the air as presented above. These heat transfers are modeled by the equivalent thermal resistors  $R_{CAglass}$  and  $R_{CApcb}$.

Fig. 58 Static Model



In steady state simulation, the model needs the average power consumption of each chip of the sensor. These averages are computed from the acquisition and reading power consumptions, $P_{acq}$ and $P_{read}$, weighted by their respective durations $T_{acq}$ and $T_{read}$, as illustrated in Fig. 59. The reading duration is 85 ms.

Fig. 59 Power Consumption States and Weighted Average Power Consumption for Multi-burst Recording



Simulations quickly demonstrate that the overall power consumption must be reduced to avoid extreme steady state temperatures above one thousand degrees Celsius. As said before, the power consumption can be reduced by turning off the current biases of the pixel front-ends and the ADCs during the burst reading operation. In doing so, the average power densities of the top chip and the middle chip are reduced to reasonable values. Taking this into account, the average power densities are summarized in Tab. 8 for different acquisition times.

Tab. 8 Multi-burst Recording Average Power Densities for Different Acquisition Times

| Acquisition time (ms)              | 0.270 | 0.540 | 1.35  | 2.7   | 5.4  | 13.5 |
|------------------------------------|-------|-------|-------|-------|------|------|
| Power density Top Chip (mW/mm²)    | 0.28  | 0.559 | 1.38  | 2.72  | 5.29 | 12.1 |
| Power density Middle Chip (mW/mm²) | 0.253 | 0.505 | 1.25  | 2.46  | 4.78 | 10   |
| Power density Bottom Chip (mW/mm²) | 0.25  | 0.3   | 0.447 | 0.686 | 1.14 | 2.37 |
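The duty-cycle weighting behind Tab. 8 can be sketched in a few lines of Python. The sketch assumes that the top chip dissipates about 88.5 mW/mm² during acquisition and nothing during the 85 ms reading phase. Both values are inferred here from the top chip row of Tab. 8 and are not stated explicitly in this section, and the function name is purely illustrative.

```python
def average_power_density(p_acq, t_acq, p_read, t_read):
    """Duty-cycle weighted average power density over one acquisition/reading cycle."""
    return (p_acq * t_acq + p_read * t_read) / (t_acq + t_read)


T_READ_MS = 85.0    # reading duration given in the text
P_ACQ_TOP = 88.5    # mW/mm^2, ASSUMPTION inferred from the top chip row of Tab. 8
P_READ_TOP = 0.0    # mW/mm^2, ASSUMPTION: top chip biases switched off while reading

ACQ_TIMES_MS = [0.270, 0.540, 1.35, 2.7, 5.4, 13.5]  # acquisition times of Tab. 8
top_chip = [
    average_power_density(P_ACQ_TOP, t_acq, P_READ_TOP, T_READ_MS)
    for t_acq in ACQ_TIMES_MS
]

for t_acq, density in zip(ACQ_TIMES_MS, top_chip):
    print(f"T_acq = {t_acq:6.3f} ms -> {density:6.3f} mW/mm^2")
```

Under these assumptions the computed values agree with the top chip row of Tab. 8 to within rounding, which suggests that the table follows this simple weighting.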

### 4.3.2. Simulations

The steady state simulations are first done for the pre-event triggering mode. The inner temperature of the sensor is 352 K, which is excessive. Examining the different temperatures of the model shows that the convective exchanges on the PCB surface and the air layer between the top chip and the glass lid limit the sensor cooling. Therefore, a heat sink is placed on the PCB surface to increase the heat transfer with the room air. Based on a data sheet, a heat sink of $20x20 \text{ mm}^2$ is equivalent to a thermal resistor of 15 K/W. The heat transfer is now limited by the PCB and the air conductivity as shown in Fig. 60. The junction temperatures of the chips are close to each other and reach a maximum of 309 K at the top chip level. It is also interesting to note that 90 % of the heat (i.e. 228.8 mW) flows from the sensor to the PCB and 10 % (i.e. 26 mW) from the sensor to the glass lid. For the rest of this study, we consider that the sensor is cooled on its bottom side by the heat sink presented above.
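As a rough cross-check of these figures, the temperature drop across the heat sink alone follows from the thermal analogue of Ohm's law. This worked example assumes that all of the heat flowing towards the PCB leaves through the heat sink. The symbols $\Delta T_{HeatSink}$ and $P_{PCB}$ are introduced here for illustration only.

$$\Delta T_{HeatSink} = R_{HeatSink} \cdot P_{PCB} = 15\ \mathrm{K/W} \times 0.2288\ \mathrm{W} \approx 3.4\ \mathrm{K}$$

Under this assumption, the heat sink accounts for roughly 3.4 K of the 9 K difference between the room temperature (300 K) and the maximum junction temperature (309 K). The remainder is dropped across the PCB, the package and the sensor stack.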

Fig. 60 Temperature at the Steady State through the z Axis of the Static Model



Then, steady state simulations have been carried out for the post-event triggering mode. In this mode the acquisition time is random and can vary from 270 $\mu$s to a few seconds (Fig. 61). The simulations are done for different durations and the steady state temperatures are plotted versus the acquisition time. The acquisition time can reach up to 4 ms while keeping the junction temperature below 125 °C.

Fig. 61 Steady State Temperature in Multi-Burst Recording for Different Acquisition Times with a Heat Sink



These steady state simulations demonstrate the need to switch off the top and middle chips during the reading operation and to use a heat sink to increase the convective heat transfer. They also demonstrate the feasibility of multi-burst recording for an acquisition time up to 4 ms. Finally, this static analysis shows that the junction temperatures of the top, middle and bottom chips are close to each other. Therefore, moving a source of power consumption from one chip to another will not strongly impact the temperature. Consequently, it relaxes the floorplanning, which can be constrained by other considerations such as power management. For instance, it can be interesting to move the multiplexer current source from the top chip to the middle chip to ease the power distribution strategy. However, for a sharper analysis of the temperature, a finite element model (fem) has been developed. The fem takes into account the heat capacity of the materials to perform transient simulations. Transient simulations make it possible to analyze single burst recording and to assess the time constants of multi-burst recording. The fem also allows us to construct a model with different sections for the sensor, the package, and the PCB. It thus provides a more realistic model.

## 4.4. Finite Element Simulations

#### 4.4.1. Finite Element Model

To run our finite element model (fem) simulations, we use the electro-thermal simulation tool presented and validated in [85] [86]. This simulator has been developed by a team of the ICube laboratory and is the result of a previous PhD work. Researchers and PhD students of this team helped me to use the simulator. This tool provides means to perform direct electro-thermal simulations in a standard CAD environment (e.g. Cadence®). Thanks to scripts written in SKILL®, a thermal network which represents the circuit dies is automatically generated from the analysis of the circuit layout as described in Fig. 62. This thermal network is made of interconnected Verilog-A entities and is linked to the circuit schematic. In addition, the conventional devices used in the original schematic are replaced by their electro-thermal counterparts that can inject the heat they generate and sense the local temperature using an additional thermal terminal. Using this tool, we generated the thermal network of our image sensor. Then, the network is connected at the top and the bottom of the stack to the room temperature (300 K) through the convective thermal resistor $R_{CAglass}$ and the heat sink thermal resistor $R_{HeatSink}$. As in the static model, the lateral interfaces of the stack with the air are considered adiabatic, i.e. without heat transfer.

Fig. 62 Electro-Thermal Simulator



As in the static model, the system is modeled by a stack of layers. However, contrary to the previous model, the dimensions of the cross-sections are not the same for all layers as shown in Fig. 63. The cross-section of the sensor is 20x20 mm² and the cross-sections of the package and the PCB are 30x30 mm². As this interface is the thermal bottleneck of the system, the fem provides a more accurate evaluation of the temperature. We could have modeled the PCB with wider dimensions. However, it would have slowed the transient simulation. By underestimating the PCB dimensions, we obtain a worst case for the temperature evaluation as the thermal resistor to the air is increased.

Fig. 63 Thermal Model for Finite Element Simulation



#### 4.4.2. Transient Simulations

Only static analyses have been presented so far. However, the image sensor power consumption is time dependent as presented in the previous chapter. This section first presents transient analyses of the sensor temperatures for single burst recording depending on the triggering mode. Then the effect of multi-burst recording on the temperatures is studied. These transient analyses define some operating limits for both recording modes.

### 4.4.2.1. Single Burst Recording

Simulations have been carried out to evaluate the temperature rise during the acquisition and the reading of a single burst recording. Firstly, the simulation is run for the burst CIS working in pre-triggering mode (Fig. 64). After a sharp rise in temperature due to the acquisition, the chips cool down and return to room temperature. The inner temperatures draw close to each other during the burst reading. As expected, the temperatures of the chips are close to each other. Indeed, the chips are mainly made of silicon, which is a very good thermal conductor, and the time constants of the BEOL and the interconnections are small compared to the reading time.

Fig. 64 Thermal Transient Simulation for a Single Burst Acquisition with the average temperature on the first (x), second (+) and third (•) layers



Then the fem is simulated for the post-event triggering mode. In this mode the acquisition duration is random. Therefore, the simulation is done for different acquisition durations from 1.335 ms to 1.5 s. These durations correspond to the recording of 5 and 5555 bursts of images until the activation of the stop triggering signal. The maximum temperatures reached in the system during the acquisition and reading of one burst in post-event triggering mode are presented in Fig. 65. This validates the operation of single burst recording in post-event triggering mode for acquisition times up to 1.5 s. For longer acquisition times, the junction temperatures will exceed 125 °C, which is above the worst case temperature defined by the manufacturer.

Fig. 65 Maximum Junction Temperature versus the Acquisition for Single Burst Recording in Post Event Triggering Mode



If the application requires a longer acquisition time, a potential solution is to use thermal buffers. Thermal buffers were first designed to address an issue of embedded chips (e.g. mobile applications) [87]. High transient power consumption peaks occur in those chips but it is not possible to cool the system with heat sinks or fans. Therefore, these buffers are placed inside the package in contact with the chip. They are made of a phase change material (PCM) such as paraffin wax or metallic alloys. When a power peak occurs, the PCM stores the heat by changing phase. This heat storage avoids a strong rise of the junction temperatures inside the chip during the consumption peak. The PCM then slowly returns to its previous phase while dissipating the stored heat in the package. In our case, the thermal buffers would be placed between the image sensor and the package at the chip-to-package bonding level.

# 4.4.2.2. Multi-Burst Recording

However, the application can sometimes require successive acquisitions and readings of bursts of images. Therefore, the evolution of the temperatures in multi-burst recording is studied here. A set of simulations has been carried out for multi-burst recording. First, the simulation has been done with an acquisition time of 270 $\mu$s for pre-event triggering. Then the simulations have been done with different acquisition times for the post-event triggering mode. As expected, the inner temperatures of the IC follow a sequence of rises and decreases (Fig. 66).

Fig. 66 Junction Temperature of Multi-Burst Recording in Pre-Event Triggering Mode



However, on a large time scale, the system temperatures reach a steady state. The junction temperature at steady state depends on the acquisition duration. For pre-event triggering mode, the steady state temperature is 40 °C. This confirms the results of the static simulation and limits the acquisition time in multi-burst recording mode to 2.7 ms to prevent overheating. These transient simulations give more information than the static model. Indeed, the time constant of the system can be extracted from the graph. The response of the system is close to that of a first order system: sixty-three percent of the final value is reached after 45 seconds. This value of the time constant matches the hand calculation of the time constant of the PCB layer, which limits the heat transfer. If only a few repetitions are needed, the acquisition time can be higher than 2.7 ms. However, the multi-burst recording has to be stopped before the steady state is reached to avoid the destruction of the image sensor. This time is assessed with the transient simulations. In some applications, it is also possible to increase the acquisition time by setting an idle time between two consecutive burst acquisitions.
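The first order behaviour described above can be illustrated with a minimal sketch. It assumes an ideal single time constant response with the 45 s value extracted from the simulation. The function names and the numerical case at the end (a 150 K steady state rise and a 98 K allowed rise) are hypothetical and are only there to show how a stop time could be estimated.

```python
import math


def junction_rise(t, delta_t_final, tau=45.0):
    """Temperature rise (K) above ambient at time t (s) for a first order response."""
    return delta_t_final * (1.0 - math.exp(-t / tau))


def time_to_reach(delta_t_limit, delta_t_final, tau=45.0):
    """Time (s) at which the rise reaches delta_t_limit.

    Returns None when the steady state rise stays below the limit,
    i.e. when the recording can go on indefinitely.
    """
    if delta_t_limit >= delta_t_final:
        return None
    return -tau * math.log(1.0 - delta_t_limit / delta_t_final)


# HYPOTHETICAL case: steady state rise of 150 K, allowed rise of 98 K
# (roughly 125 degC junction limit with a 27 degC ambient).
stop_time = time_to_reach(98.0, 150.0)
print(f"multi-burst recording should stop after about {stop_time:.0f} s")
```

Such an estimate is only indicative: the simulated response is described as close to, not exactly, first order, so the transient simulation remains the reference.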

Fig. 67 Maximum Junction Temperature versus Time in Post-Event Triggering Mode for Different Acquisition Times



### 4.5. Thermal Runaway

The digital burst image sensor architecture embeds a large number of digital memories. The aim of the digital burst IS architecture is to increase the number of images per burst. Therefore, the memories are designed with advanced technologies such as 28 nm. However, these technologies are prone to leakage currents. These leakages increase the static power consumption and thus the junction temperatures. In turn, a rise in temperature increases the leakage currents. A feedback loop therefore exists between the leakage currents and the temperature. If the loop gain is greater than one, the temperature does not converge towards a finite value and the system is destroyed. This point should be examined in detail to prevent a potential destruction of the IS working in multi-burst recording. Some models to study the risk of thermal runaway of integrated circuits are provided in [88].
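The feedback loop between leakage and temperature can be illustrated with a toy model. This is not the model of [88]: it is a minimal sketch that assumes a single lumped thermal resistance and a leakage power that doubles every fixed temperature step. All parameter values and function names are hypothetical.

```python
import math


def leakage_power(temp, p_leak_ref, t_ref=300.0, t_double=20.0):
    """Leakage power (W), ASSUMED to double every t_double kelvin above t_ref."""
    return p_leak_ref * 2.0 ** ((temp - t_ref) / t_double)


def loop_gain(temp, r_th, p_leak_ref, t_ref=300.0, t_double=20.0):
    """Electro-thermal loop gain R_th * dP_leak/dT evaluated at temp."""
    slope = leakage_power(temp, p_leak_ref, t_ref, t_double) * math.log(2.0) / t_double
    return r_th * slope


def steady_state_temperature(r_th, p_dyn, p_leak_ref, t_amb=300.0,
                             t_limit=1000.0, tol=1e-6, max_iter=10_000):
    """Fixed-point iteration of T = T_amb + R_th * (P_dyn + P_leak(T)).

    Returns the converged temperature (K), or None when the iteration
    diverges past t_limit, which is read here as thermal runaway.
    """
    temp = t_amb
    for _ in range(max_iter):
        new_temp = t_amb + r_th * (p_dyn + leakage_power(temp, p_leak_ref))
        if new_temp > t_limit:
            return None
        if abs(new_temp - temp) < tol:
            return new_temp
        temp = new_temp
    return None


# HYPOTHETICAL values: 0.2 W dynamic power, 50 mW leakage at 300 K.
print(steady_state_temperature(r_th=20.0, p_dyn=0.2, p_leak_ref=0.05))   # converges
print(steady_state_temperature(r_th=200.0, p_dyn=0.2, p_leak_ref=0.05))  # None: runaway
```

With the smaller thermal resistance the loop gain at the operating point is well below one and the temperature settles. With the larger one no equilibrium exists and the iteration diverges, which mirrors the runaway condition described above.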

#### 4.6. Conclusion

This thermal study brought some insights for the design of digital burst image sensors. From a thermal management point of view, it demonstrates the necessity to turn off the pixel front-end circuits and the ADCs during the reading operation to prevent overheating. Moreover, it demonstrates the need to set up a heat sink on the PCB surface to increase the convective heat exchange. Based on these facts, this study defines the operating limitations of the image sensor for different recording and triggering modes. The simulation validates the single burst recording for pre-event triggering mode and for post-event triggering mode as long as the acquisition time stays below 1.5 s. It also validates the multi-burst recording in pre-event triggering mode and assesses the temperature rise to 10 °C. For post-event triggering mode, the multi-burst recording can have an acquisition time up to 4 ms while keeping the junction temperatures below 125 °C. This chapter also gives some clues to further increase the acquisition time, such as the implementation of thermal buffers for single burst recording. For multi-burst recording, a transient management of the junction temperatures is possible thanks to the transient simulations. At the end of this study, we reveal a potential risk of thermal runaway in the chip due to the memory leakage currents. This risk should be evaluated with the method presented in [88] before designing a full burst CIS with digital storage. Finally, the ICube laboratory is currently working on an add-on to the simulator to provide an electro-mechanical-thermal simulator [89]. As mechanical constraints are present in a 3D integrated circuit at the TSV and bonding levels, it could be interesting to complete this study with the impact of the power consumption on the dimensions of the TSVs and bonding elements due to thermal expansion.

# **5. Analog Front-End Circuits**

In this chapter, we study in more depth the analog front-end circuit of the 3D integrated burst image sensor with digital storage. The pixel front-end performs the photocurrent to voltage conversion, the global shutter acquisition and the multiplexing operation between the pixels and the ADC. To present the specifications of the pixel front-end circuit, it is necessary to define the performances of the ADC. Based on the conclusion of chapter 3, the ADC would be a SAR ADC with a capacitive DAC. It has a conversion frequency of 45 MHz and a footprint of 150x150 μm<sup>2</sup>. The frame rate is set to 5 Mfps with an integration time of 180 ns and a reset time of 20 ns. The multiplexer ratio is thus 9 pixels for one ADC. The pixel front-end has 4 ns to load the input of the ADC, which is capacitive and evaluated to 1 pF. The ADC has a resolution of 8 bits and a signal-to-noise-and-distortion ratio (SNDR) of 50 dB. We consider that the input range of the ADC is 0.2 V to 1 V. Our pixel front-end must then have a dynamic range of at least 50 dB. Even if we target a back side illuminated structure, the fill factor requirement is 80 % to keep a large collection area. Firstly, the pixel front-end is implemented with a circuit inspired by the active pixel sensor (APS) circuit. The aim of this implementation is to benchmark this basic pixel structure and to identify its limitations for our specifications. Based on it, our work then focuses on the improvement of the sensitivity and the reduction of the power consumption. We propose solutions to increase the conversion gain without using a pinned photodiode and the transfer gate technology. Finally, we present a method to reduce the power consumption based on circuit considerations. The simulation results presented in this chapter have been obtained with an ST HCMOS9 technology (130 nm) using the eldoD simulator.

#### 5.1. APS Based Pixel Front-End Circuit

### 5.1.1. Description

Firstly, a basic pixel circuit inspired by the APS has been designed to evaluate its performances under our specifications. This structure will also serve as a reference to evaluate other pixel front-end circuits. The architecture of the APS based pixel front-end is presented in Fig. 68. It is the same architecture as the one presented in chapter 3. Some extra switches ($SW_{RStGS1}$ and $SW_{RStMux1}$) are added to prevent image lag. Two source follower (SF) buffers are implemented in this analog front-end circuit. It is mandatory to use one SF buffer based on an N-type MOSFET and one based on a P-type MOSFET so that their offsets compensate each other. The SF buffer $T_{BuffMux1}$ of the multiplexer has a higher speed constraint than the SF buffer $T_{BuffGS1}$ of the global shutter stage. Indeed, the SF buffer $T_{BuffMux1}$ has to operate at the frame rate frequency times the multiplexing ratio while the SF buffer $T_{BuffGS1}$ operates at the frame rate frequency. As N-type MOSFETs have a better charge carrier mobility than P-type MOSFETs, the SF buffer of the multiplexer is implemented with an N-type MOSFET, which is more $g_m$-efficient for a given bias current. The SF buffer of the global shutter stage is then implemented with a P-type MOSFET. The output load is modeled by a capacitor $C_{ADC}$ of 1 pF. This capacitor is reset by $SW_{RStADC1}$ before each sampling.

Fig. 68 Cluster of APS Based Pixel Front-End



## 5.1.2. Current to Voltage Conversion

For this APS based pixel front-end, the integration of the photocurrent is performed on the capacitor of the photodiode junction. A NWell/PSub structure is chosen for the photodiode to maximize the front-end sensitivity as explained in chapter 3. The dimensions of the photodiode are 41x48.8 $\mu m^2$. An area of 9x50 $\mu m^2$ is left free to implement the pixel circuit. The fill factor is thus 80 %. A reverse-biased photodiode is prone to dark current. For our 130 nm technology, we evaluate the dark current density to 363 nA/cm² as presented in 2.4.1. The dark current of our photodiode is then about 10 pA at 300 K. This value is far below our minimum observable photocurrent (~nA). Therefore, the dark current will not be considered throughout the rest of the analysis. Simulations have been carried out to evaluate the performances of our photodiode. However, we did not have access to a photodiode model of the targeted technology. Therefore, the simulations have been done using a nonlinear model of the parasitic NWell/PSub diode. Based on this model, the junction capacitor value is about 200 fF and depends on its bias voltage. The sensitivity of the current to voltage conversion is thus reduced and its linearity is degraded. The integration node $V_{PD}$ is reset to 0.9 V by the reset transistor $SW_{rst1}$. This transistor is dimensioned so that its time constant with the photodiode is far below the reset duration of 20 ns.
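The fill factor and dark current figures above can be checked with a short calculation. The sketch assumes a 50x50 µm² pixel, which is inferred from the 41 µm photodiode width plus the 9 µm circuit strip and is not stated explicitly. Variable names are illustrative.

```python
PD_WIDTH_UM = 41.0
PD_HEIGHT_UM = 48.8
PIXEL_PITCH_UM = 50.0       # ASSUMPTION: 41 um photodiode + 9 um circuit strip
J_DARK_A_PER_CM2 = 363e-9   # dark current density quoted in the text
UM2_TO_CM2 = 1e-8           # 1 um^2 = 1e-8 cm^2

pd_area_um2 = PD_WIDTH_UM * PD_HEIGHT_UM
fill_factor = pd_area_um2 / PIXEL_PITCH_UM ** 2
dark_current_a = J_DARK_A_PER_CM2 * pd_area_um2 * UM2_TO_CM2

print(f"photodiode area: {pd_area_um2:.1f} um^2")
print(f"fill factor:     {fill_factor:.1%}")
print(f"dark current:    {dark_current_a * 1e12:.1f} pA")
```

The result is a fill factor of about 80 % and a dark current of a few picoamperes, the same order of magnitude as the roughly 10 pA quoted above and about three orders of magnitude below a nanoampere photocurrent.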

Firstly, the photocurrent to voltage characteristic has been extracted and is plotted in Fig. 69. As expected, the current to voltage gain is not constant over the dynamic range. It decreases when the voltage across the photodiode drops (i.e. when the integration capacitance increases). The current to voltage gain is about 115 dB as it varies from 0.75 V/$\mu$A to 0.48 V/$\mu$A.
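For reference, the decibel figure follows from expressing the gain in V/A (i.e. in ohms) before taking the logarithm. The symbol $G_{dB}$ is introduced here for illustration only.

$$G_{dB} = 20 \log_{10}\left(\frac{G}{1\ \mathrm{V/A}}\right), \qquad 20 \log_{10}\left(0.75 \times 10^{6}\right) \approx 117.5\ \mathrm{dB}, \qquad 20 \log_{10}\left(0.48 \times 10^{6}\right) \approx 113.6\ \mathrm{dB}$$

The two extreme values bracket the quoted figure of about 115 dB.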

Fig. 69 (a) Photo-current to voltage characteristic and (b) photo-current to voltage gain of a $41x48.8 \mu m^2$ NWell/PSub photodiode



In terms of noise, the total noise $\sigma_{ItoV}$ of this stage is the root sum of squares of the sampling noise of the reset operation $\sigma_{Rst}$ and the shot noise $\sigma_{ShotNoise}$ [90]. These noises are given by Eq. 22 with the Boltzmann constant $k$, the temperature $T$, the electron charge $q$, the photocurrent $i_{ph}$, the integration time $t_{int}$ and the photodiode junction capacitor $C_{PD}$.

$$\sigma_{ItoV} = \sqrt{\sigma_{Rst}^2 + \sigma_{ShotNoise}^2} \ [V_{rms}] \quad \text{with} \quad \begin{cases} \sigma_{Rst} = \sqrt{\dfrac{kT}{C_{PD}}} \\ \sigma_{ShotNoise} = \sqrt{\dfrac{q\, i_{ph}\, t_{int}}{C_{PD}^2}} \end{cases}$$
 Eq. 22

A simulation has been carried out to extract the voltage noise and the signal to noise ratio due to shot noise for different photocurrents (Fig. 70). This simulation result confirms that the shot noise depends on the photocurrent and matches the value given by Eq. 22. Regarding the values of the SNR, the shot noise can limit the noise performance of the front-end for high photocurrents, which confirms the discussion of 3.3.2.1. The dynamic range is defined as the ratio of the saturation voltage to the noise level under no illumination.

Here, the reset noise is 143  $\mu$ Vrms at 300 K and sets the dynamic range of this stage to 78 dB. This value is above the requirement of 50 dB.
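The noise terms of Eq. 22 are easy to evaluate numerically. The following sketch assumes a constant junction capacitance of 200 fF, whereas the simulated capacitance depends on the bias voltage, and reads the shot noise term as $\sqrt{q\, i_{ph}\, t_{int}}/C_{PD}$. Function names are illustrative.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant (J/K)
Q_E = 1.602176634e-19   # electron charge (C)


def reset_noise(c_pd, temp=300.0):
    """kT/C sampling noise of the reset operation (V rms)."""
    return math.sqrt(K_B * temp / c_pd)


def shot_noise(i_ph, t_int, c_pd):
    """Shot noise integrated on the photodiode capacitor (V rms)."""
    return math.sqrt(Q_E * i_ph * t_int) / c_pd


def shot_limited_snr_db(i_ph, t_int):
    """SNR (dB) of the integrated signal i_ph*t_int/C when only shot noise is present."""
    return 10.0 * math.log10(i_ph * t_int / Q_E)


C_PD = 200e-15    # F, ASSUMPTION: constant junction capacitance
T_INT = 180e-9    # s, integration time from the specification

print(f"reset noise: {reset_noise(C_PD) * 1e6:.1f} uVrms")
for i_ph in (1e-9, 10e-9, 100e-9, 1e-6):
    sigma = shot_noise(i_ph, T_INT, C_PD)
    snr = shot_limited_snr_db(i_ph, T_INT)
    print(f"i_ph = {i_ph * 1e9:7.1f} nA -> shot noise {sigma * 1e6:6.1f} uVrms, SNR {snr:4.1f} dB")
```

With these assumptions the reset noise comes out close to the 143 µVrms quoted above. The shot-noise-limited SNR does not depend on the capacitance, since signal and noise are both divided by $C_{PD}$.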

Fig. 70 Integrated Shot Noise and SNR of the Photocurrent to Voltage Conversion of a 41x48.8  $\mu m^2$  NWell/PSub Photodiode



# 5.1.3. Global Shutter Stage

The global shutter stage is composed of a voltage buffer $T_{BuffGS1}$, a switch transistor $SW_{GS1}$ and a sampling capacitor $C_{GS1}$. This stage performs the global shutter acquisition of the image. The voltage buffer is a P-type MOSFET used in common drain configuration as illustrated in Fig. 71. The sampling operation is performed by $SW_{GS1}$ as shown on the timing diagram of Fig. 71.

Fig. 71 (a) Schematic and (b) Timing Diagram of the Global Shutter Stage



The DC characteristic of the global shutter stage is presented in Fig. 72. The DC gain varies from 0.82 to 0.3 for an input range of 0 V to 2 V. The input voltage is limited to 1.5 V to keep a good linearity. The offset of the SF buffer varies between 1 V and 0.5 V. As the output voltage lies between 1 V and 2.5 V, the switch $SW_{GS1}$ is implemented with a transmission gate to keep a small on-resistor ($r_{ONsw}$) over the whole voltage range.

Fig. 72 DC Characteristic and Gain of the Global Shutter Stage





To match the precision constraint with a sampling time of 20 ns, the -3dB bandwidth of the global shutter stage must be above 48 MHz. A sampling capacitor $C_{GS}$ of 200 fF is implemented to keep a low sampling noise. The analysis of this stage, made in chapter 3, shows that the bandwidth depends on the sum of the time constants of the sampling cell $(r_{ONsw}C_{GS})$ and of the common drain transistor $(C_{GS}/g_m)$. Therefore, the on-resistor of the switch is designed so that it does not limit the bandwidth of the stage. The bandwidth constraint imposes a transistor transconductance of 60 $\mu$A/V. The slew rate of 50 V/$\mu$s imposes a minimum bias current of 10 $\mu$A. This stage is limited by the slew rate. The bias current $I_{BiasGS1}$ is set to 34 $\mu$A so that the slew rate has no visible effect during transient simulation. Based on AC simulation, the transfer function of the SF buffer is plotted in Fig. 73 and shows a -3dB bandwidth of 158 MHz.
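The sizing figures of this stage can be reproduced with a first order settling argument. The sketch below assumes that the precision constraint amounts to about six time constants within the sampling window. This number is inferred from the quoted bandwidths and is not stated in this section. The 1 V and 0.8 V output steps are also assumptions, chosen so that the slew rates match the quoted 50 V/µs and the 200 V/µs given for the multiplexer stage. The function name is illustrative.

```python
import math


def sf_buffer_requirements(t_sample, c_load, v_step, n_tau=6.0):
    """First order sizing of a source follower driving a sampling capacitor.

    Returns (bandwidth in Hz, transconductance in A/V,
             slew rate in V/s, minimum bias current in A).
    """
    tau = t_sample / n_tau                    # allowed settling time constant
    bandwidth = 1.0 / (2.0 * math.pi * tau)   # -3 dB bandwidth of a single pole
    gm = c_load / tau                         # from tau = C / gm (switch neglected)
    slew_rate = v_step / t_sample             # full step within the sampling time
    i_bias_min = slew_rate * c_load           # I = C * dV/dt
    return bandwidth, gm, slew_rate, i_bias_min


# Global shutter stage: 20 ns sampling on 200 fF, ASSUMED 1 V step.
gs = sf_buffer_requirements(20e-9, 200e-15, 1.0)
# Multiplexer stage: 4 ns sampling on 1 pF, ASSUMED 0.8 V step (0.2 V to 1 V).
mux = sf_buffer_requirements(4e-9, 1e-12, 0.8)

for name, (bw, gm, sr, i_min) in (("global shutter", gs), ("multiplexer", mux)):
    print(f"{name:15s} BW {bw / 1e6:6.1f} MHz, gm {gm * 1e6:7.1f} uA/V, "
          f"SR {sr / 1e6:5.0f} V/us, I_min {i_min * 1e6:5.0f} uA")
```

Under these assumptions the global shutter stage needs about 48 MHz, 60 µA/V and 10 µA, in line with the text. The same reasoning applied to the multiplexer gives about 238 MHz and 1.5 mA/V.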

Fig. 73 Transfer Function and Noise Figure of the Global Shutter SF Buffer



The simulated input referred noise power spectral density (PSD) and the input referred cumulative noise of the global shutter stage are plotted versus the frequency in Fig. 73. The output noise $\sigma_{SFBuffer}$ integrated through the SF buffer, which cuts off at 158 MHz, is 114 $\mu$Vrms.

The sampling noise  $\sigma_{sampling}$  is 144  $\mu$ Vrms and the total output noise  $\sigma_{GS\ Stage}$  of the global shutter stage given by Eq. 23 is 184  $\mu$ Vrms.

$$\sigma_{GS\ Stage} = \sqrt{\sigma_{sampling}^2 + \sigma_{SF\ Buffer}^2}$$
 Eq. 23

Transient simulations have been done to evaluate the characteristic of this stage. The input of the stage is driven by a voltage ramp of 180 ns between the reset voltage and the integration voltage. To avoid lag from one image to the next, the global shutter memory is preloaded to $V_{rstGS}$ for 10 ns before the sampling as illustrated in Fig. 71. The signal is then sampled from 160 ns to 180 ns. The simulation is repeated to cover the whole dynamic range of the stage. The transient characteristic is extracted from this set of simulations. The transient gain confirms the results of the DC analysis. The preload of the sampling capacitor to its higher potential boosts the transient response of the system. Indeed, during the sampling operation, the "common drain" transistor of the buffer always starts with a high conductance value. Moreover, the retention time of the global shutter memory has been checked and is compliant with the specification. Finally, a transient noise analysis has been done on 500 runs over a bandwidth of 10 GHz and confirms the AC noise simulation results over the input dynamic range. Based on these simulations, the dynamic range of the stage is 76 dB.

# 5.1.4. Multiplexer Stage

The multiplexer stage connects nine pixels to one ADC. The multiplexer is composed of one transistor $T_{BuffMux1}$ in "common drain" mode connected through two switches to the ADC input capacitor $C_{ADC}$ as illustrated in Fig. 74. The switch $SW_{Mux1}$ selects the pixel which is read and the second switch $SW_{ADC}$ performs the sampling operation. The current source $I_{BiasMux}$ is connected between $SW_{Mux1}$ and $SW_{ADC}$ to be shared between the pixels of the cluster. The switches are composed of an N-type MOSFET. A complementary switch is not necessary because the output voltage varies between 0.2 V and 1 V. However, dummy transistors (drain-source connected) are used to reduce the effect of the charge injection. The switch $SW_{rstMux1}$ is connected during the global shutter operation to limit the effect of the gate/source coupling capacitor of $T_{BuffMux1}$ on the sampling operation.

Fig. 74 Schematic of the Multiplexer Stage



Based on DC simulation, the characteristic of the SF buffer is plotted in Fig. 75. Over the targeted input range (1 V to 2 V) the gain of the buffer is about 0.85. The negative offset compensates the positive offset of the global shutter stage.

Fig. 75 Characteristic and Gain of the Multiplexer SF Buffer



As presented at the beginning of this chapter, the sampling time is 4 ns. To match the precision constraint, the -3dB bandwidth of the stage must be 238 MHz. As for the previous stage, the bandwidth depends on the time constants of the sampling cell and of the "common drain" transistor. Therefore, the on-resistors of the switches $SW_{Mux1}$ and $SW_{ADC}$ are designed to be compliant with the bandwidth of the stage. The bandwidth constraint imposes a transistor transconductance of 1.5 mA/V. However, the bias current is imposed by the slew rate constraint (200 V/$\mu$s). In simulation, the bias current $I_{BiasMux}$ must be 510 $\mu$A to be unaffected by the slew rate. Based on AC simulation, the transfer function of the SF buffer is plotted in Fig. 76 and shows a -3dB bandwidth of 398 MHz.

Fig. 76 Transfer Function and Noise Figure of the Multiplexer SF Buffer



The noise PSD and the cumulative noise at the buffer input are plotted in Fig. 76. The integration of the noise PSD through the transfer function of the multiplexer gives an output noise of 128 $\mu$Vrms. By adding the sampling noise of 64 $\mu$Vrms, the total noise at the output of the stage is 143 $\mu$Vrms. Transient simulations have been carried out and confirm the DC characteristic. The dynamic range of the multiplexer is 75 dB.

### 5.1.5. Full Front-End Performances

The full APS based pixel front-end has been simulated in transient mode. The DC characteristics and the gain of the front-end have been extracted from these simulations as shown in Fig. 77. The average current to voltage gain is 0.42 V/$\mu$A. It reaches a maximum of -0.525 V/$\mu$A for a photocurrent of 0.2 $\mu$A and decreases to -0.15 V/$\mu$A for high photocurrents. This is due to the increase of the junction capacitor at low bias voltage.

Fig. 77 Characteristic and Gain of the APS Pixel Front-End for an Integration Time of 180 ns



By taking the rms sum of the output noises weighted by the gains of the global shutter and multiplexer stages, the readout noise is 212 $\mu$Vrms. By taking into account the reset noise, the readout noise is 233 $\mu$Vrms. This value is confirmed by a transient noise analysis. Moreover, a set of transient noise simulations has been done for different photocurrents to evaluate the signal to noise ratio (SNR) of the APS based front-end readout chain as shown in Fig. 78. These simulations confirm that the readout noise is independent of the input photocurrent. By comparing the SNR due to the readout chain and the SNR due to the photon shot noise (Fig. 70), the SNR of the APS pixel front-end is limited by the photon shot noise.
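The noise budget of the readout chain can be retraced from the per-stage figures given earlier. The sketch below is one possible reading of the weighting: the global shutter noise is referred to the output through the multiplexer gain of 0.85, and the reset noise through both stage gains. The global shutter gain of 0.8 is an assumption picked within the simulated range of 0.3 to 0.82.

```python
import math


def rss(*sigmas):
    """Root sum of squares of uncorrelated noise contributions."""
    return math.sqrt(sum(s * s for s in sigmas))


GS_GAIN = 0.80    # ASSUMPTION: within the simulated 0.3 to 0.82 range
MUX_GAIN = 0.85   # multiplexer buffer gain quoted in the text

# Per-stage output noise (uVrms): sampling noise and SF buffer noise.
gs_stage = rss(144.0, 114.0)
mux_stage = rss(64.0, 128.0)

# Refer the global shutter noise to the front-end output, then add the reset noise.
readout = rss(gs_stage * MUX_GAIN, mux_stage)
total = rss(readout, 143.0 * GS_GAIN * MUX_GAIN)

print(f"global shutter stage: {gs_stage:.0f} uVrms")
print(f"multiplexer stage:    {mux_stage:.0f} uVrms")
print(f"readout chain:        {readout:.0f} uVrms")
print(f"with reset noise:     {total:.0f} uVrms")
```

With these gains the four results round to the 184, 143, 212 and 233 µVrms quoted in the text, which suggests that the budget is a plain root sum of squares of gain-weighted contributions.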

Fig. 78 Signal to Noise Ratio of the APS Pixel Front-End Readout Chain



To check that there is no lag due to the pixel front-end, a simulation is done where the pixels are saturated during the first integration and in the dark during the second one. The output value of the second integration is compared with the case where the first integration is done under dark condition. This simulation has been done on our pixel front-end and no lag has been detected. Moreover, a post layout transient simulation has been done to verify that there is no crosstalk from one pixel to another due to the readout chain. To do so, the first pixel is saturated and the other pixels are in the dark. The values of the other pixels are measured and compared to the case where every pixel is in the dark. The simulation shows that there is no crosstalk due to the pixel front-end. Monte-Carlo simulation has been done to check the effect of the mismatch on the front-end circuit. The performances of the circuit are confirmed for variations up to three sigma.

#### 5.1.6. Conclusion

This basic pixel front-end inspired by the APS structure is compliant with our requirements. Simulations confirm that the targeted speed of 5 Mfps is achieved with this front-end without any lag. The gain of this circuit is 0.42 V/$\mu$A or 112 dB due to the photocurrent to voltage conversion stage. As the current bias of the multiplexer is shared between nine pixels, the power consumption of one pixel is 226 $\mu$W. Finally, the dynamic range is 70 dB based on noise simulations. These results show that the sensitivity is low. Based on a photodiode responsivity of 0.25 A/W at 630 nm, the responsivity of the sensor is 0.105 V/$\mu$W and the sensitivity is equal to 2.1 V/lux/s. To reach the saturation, the incident light flux must be 3.04 kW/m² or 2 Mlux on the pixel surface. Such fluxes are high and require a strong halogen light source or a laser. Therefore, we are going to propose some solutions to increase the sensitivity in the next section of this chapter. The second limitation of this front-end is its high power consumption. Indeed, for a matrix resolution of 400x400 pixels, the power consumption due to the front-end would be 36 W. This issue will be discussed at the end of this chapter.
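The figures of this conclusion can be retraced with a few multiplications. The sketch assumes a 2.5 V supply, a 50x50 µm² pixel and a 0.8 V usable swing (the 0.2 V to 1 V ADC input range). The first two values are inferred from the surrounding numbers and are not stated in this section.

```python
GAIN_V_PER_UA = 0.42          # average front-end gain
PD_RESPONSIVITY = 0.25        # A/W at 630 nm (equivalently uA/uW)
I_BIAS_GS = 34e-6             # A, global shutter buffer bias
I_BIAS_MUX = 510e-6           # A, multiplexer bias shared by the cluster
MUX_RATIO = 9                 # pixels per ADC
N_PIXELS = 400 * 400

V_SUPPLY = 2.5                # V, ASSUMPTION inferred from the 226 uW figure
V_SWING = 0.8                 # V, ASSUMPTION: 0.2 V to 1 V ADC input range
PIXEL_AREA_M2 = (50e-6) ** 2  # ASSUMPTION: 50 um pixel pitch

responsivity_v_per_uw = GAIN_V_PER_UA * PD_RESPONSIVITY
pixel_power_w = (I_BIAS_GS + I_BIAS_MUX / MUX_RATIO) * V_SUPPLY
array_power_w = 226e-6 * N_PIXELS                     # using the quoted per-pixel value
saturation_power_w = V_SWING / responsivity_v_per_uw * 1e-6
saturation_irradiance = saturation_power_w / PIXEL_AREA_M2

print(f"sensor responsivity:   {responsivity_v_per_uw:.3f} V/uW")
print(f"power per pixel:       {pixel_power_w * 1e6:.0f} uW")
print(f"power for 400x400:     {array_power_w:.1f} W")
print(f"saturation irradiance: {saturation_irradiance / 1e3:.2f} kW/m^2")
```

Under these assumptions the results stay within about one percent of the 0.105 V/µW, 226 µW, 36 W and 3.04 kW/m² quoted above.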

# 5.2. Design Strategies to Increase Sensitivity

#### 5.2.1. Introduction

One major limitation of the previous pixel front-end is its limited sensitivity of 2.1 V/lux/s. There are different solutions to increase the sensitivity. We can modify the responsivity of the photodetector, but this requires working on the physics of the device by changing its structure or its material. Such solutions are not considered here as a standard CMOS process is targeted for the realization of the sensor. The other way to increase the sensitivity of the image sensor is to increase the gain of the readout chain. This gain can be split into a current to voltage gain (i.e. conversion gain) and a voltage to voltage gain. For noise considerations, it is usually preferable to increase the gain at the beginning of a readout chain. Therefore, we focus our attention on the conversion gain and not on the voltage to voltage gain of the pixel front-end. A pixel based on transfer gate technology seems a very interesting solution to increase the sensitivity while keeping a good fill factor [91]. A research team has worked on the design of pinned photodiodes and transfer gates suited for high speed image sensors [92]. As this solution has already been studied, we propose here to increase the sensitivity with solutions that do not require a specific technology such as the transfer gate. Different circuits make it possible to perform the photocurrent integration on a chosen capacitor [93]. In this section, we present different circuits to implement the current to voltage stage.

# 5.2.2. Resistive Trans-Impedance Amplifier Circuit

The resistive trans-impedance amplifier (RTIA) seems an interesting circuit for high speed photocurrent to voltage conversion as the conversion does not require any integration or reset operation. This solution has already proved its efficiency for high speed imaging using BiCMOS technology [94] [95]. An RTIA is made of an operational amplifier whose negative input is connected to the photodiode, with a resistor in the feedback loop as shown in Fig. 79. The current to voltage conversion gain is defined by the value of the conversion resistor $R_{conv}$ as presented in Eq. 24. We target a conversion resistor of 10 M$\Omega$ to provide a current to voltage gain of 140 dB and increase the sensitivity by a factor 10 compared to the APS integration stage. We first consider that the photodiode is implemented with a NWell/PSub structure (i.e. $C_{PD}$=200 fF). For an amplifier gain A of 200, the RTIA bandwidth is 16 MHz, which is compliant with the targeted frame rate of 5 Mfps.
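As a quick check of these two figures, the sketch below evaluates the trans-impedance gain in dB and the bandwidth expression $(1+A)/(2\pi R_{conv} C_{PD})$ that appears in Eq. 25. The function names are illustrative only.

```python
import math

def rtia_gain_db(r_conv_ohm):
    """Trans-impedance gain in dB relative to 1 V/A."""
    return 20 * math.log10(r_conv_ohm)

def rtia_bandwidth_hz(a, r_conv_ohm, c_pd_farad):
    """-3 dB bandwidth (1 + A) / (2*pi*R_conv*C_PD), as written in Eq. 25."""
    return (1 + a) / (2 * math.pi * r_conv_ohm * c_pd_farad)

gain_db = rtia_gain_db(10e6)                       # 10 MOhm conversion resistor
bw_hz = rtia_bandwidth_hz(200, 10e6, 200e-15)      # A = 200, C_PD = 200 fF

print(f"gain      : {gain_db:.0f} dB")
print(f"bandwidth : {bw_hz / 1e6:.1f} MHz")
```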

Fig. 79 RTIA Stage and its Transfer Function



However, a quick look at the design rule manual of the considered technology shows that such a resistor has a very large footprint. By using a coil pattern to lay out the resistor, the density is about $400~\Omega/\mu\text{m}^2$. It is obviously not possible to implement a $10~\text{M}\Omega$ resistor within the pixel. An alternative is to use two stages to reach the gain of 140 dB. The first stage is an RTIA circuit implemented with a conversion resistor of $100~\text{k}\Omega$, which provides a gain of 100 dB. The second stage is a basic voltage amplifier with a gain of 40 dB. This stage can replace the second stage of the front-end circuit and performs the global shutter acquisition as presented in Fig. 80. The global shutter acquisition is performed through the switch $SW_{GS}$ on the capacitor $C_{GS}$.
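To give an order of magnitude of the footprint, the sketch below converts the quoted resistor density into a silicon area and checks the two-stage gain budget. It assumes the area scales linearly with the resistance at the quoted density; the names are illustrative.

```python
import math

DENSITY_OHM_PER_UM2 = 400.0   # coil-pattern resistor density quoted in the text

def resistor_area_um2(r_ohm):
    """Area needed for a resistor, assuming a constant density."""
    return r_ohm / DENSITY_OHM_PER_UM2

area_10m = resistor_area_um2(10e6)     # single-stage solution
area_100k = resistor_area_um2(100e3)   # first stage of the two-stage solution

# Gain budget of the two-stage solution: RTIA gain plus 40 dB voltage gain.
total_gain_db = 20 * math.log10(100e3) + 40

print(f"10 MOhm : {area_10m:.0f} um^2 (square of side {math.sqrt(area_10m):.0f} um)")
print(f"100 kOhm: {area_100k:.0f} um^2 (square of side {math.sqrt(area_100k):.1f} um)")
print(f"two-stage gain: {total_gain_db:.0f} dB")
```

Under this assumption the 10 MΩ resistor alone would occupy 25000 µm², a hundred times the area of the 100 kΩ one.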

Fig. 80 RTIA Stage and Voltage Amplifier Solution



The sampling time $T_{GS}$ of the voltage amplifier stage is 20 ns. The RTIA stage cannot correctly record a light pulse shorter than $T_{Frame}$ if it does not occur during the sampling time, as illustrated in Fig. 81. The bandwidth of the RTIA stage therefore has to be of the same order of magnitude as the frame rate. In doing so, the RTIA stage is able to follow the variations of a continuous light that occur at the frame rate. Besides, it filters light pulses. Therefore, the RTIA stage bandwidth is set to 5 MHz. To match this bandwidth, the gain of the op-amp is set to 3 and we change the photodiode for a P+/NWell/PSub structure with a junction capacitor $C_{PD}$ of 1.34 pF. The op-amp trans-conductance is 100 mA/V to ensure the RTIA stability as presented in Annex C. The power consumption of this stage is then 25 $\mu$W for a power supply of 2.5 V.



Fig. 81 Timing diagram of the RTIA and amplifier stage

The global shutter operation is performed in 20 ns. The bandwidth of the voltage amplifier is set to 24 MHz to load the sampling capacitor $C_{GS}$ to more than 95 %. For a voltage amplifier, the gain bandwidth (GBW) product of the amplifier is equal to the GBW product of the op-amp [96]. The op-amp must then exhibit a GBW product of 2.4 GHz. The op-amp of the voltage amplifier is implemented with a 5T structure as presented in Annex C. The bandwidth and gain requirements impose a minimal bias current of 600 $\mu$A for the op-amp. With such a value, the slew rate constraint is not a limitation. It is interesting to note that using a P+/NWell/PSub photodiode instead of an NWell/PSub structure changes the photodetector responsivity (0.45 A/W at 630 nm) as presented in 2.4.1. The RTIA stage then reaches a peak responsivity of 4.5 V/$\mu$W at 630 nm. In terms of power consumption, with a 2.5 V power supply, this stage consumes 1.53 mW.
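The link between the 24 MHz bandwidth, the 95 % settling and the 2.4 GHz GBW product can be sketched as follows. The sketch assumes a first-order (single time constant) response of the amplifier, and it assumes that the quoted 1.53 mW includes both the 600 µA amplifier and the 25 µW RTIA stage; neither assumption is stated explicitly in the text.

```python
import math

def settled_fraction(bw_hz, t_s):
    """Fraction of the final value reached after t_s by a first-order system."""
    tau = 1 / (2 * math.pi * bw_hz)
    return 1 - math.exp(-t_s / tau)

T_GS = 20e-9          # sampling time
BW_AMP = 24e6         # closed loop bandwidth of the voltage amplifier
GAIN_DB = 40          # closed loop voltage gain
VDD = 2.5             # power supply

frac = settled_fraction(BW_AMP, T_GS)
gbw_hz = 10 ** (GAIN_DB / 20) * BW_AMP
total_power_w = 600e-6 * VDD + 25e-6   # amplifier bias plus RTIA stage

print(f"settling after 20 ns : {100 * frac:.1f} %")
print(f"required GBW         : {gbw_hz / 1e9:.1f} GHz")
print(f"power                : {total_power_w * 1e3:.3f} mW")
```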

An analysis of the noise has been carried out on the RTIA stage in Annex C. Considering a noiseless op-amp, the output referred thermal noise of the feedback resistor is given by Eq. 25. The transfer function used to refer the resistor noise to the output of the RTIA is similar to Eq. 24. The $BW_{-3dB}$ is then the bandwidth of the RTIA transfer function. For the targeted bandwidth of 5 MHz and a gain of 100 dB, the value of the output referred noise due to the conversion resistor is 166 $\mu$Vrms.

$$\sigma_{Ith\ Rf} = \frac{A}{1+A} \sqrt{R_{conv} 4KT \times \frac{\pi}{2} BW_{-3dB}} \quad with \quad BW_{-3dB} = \frac{1+A}{2\pi R_{conv} C_{PD}}$$
 Eq. 25

By rearranging Eq. 25 with the expression of the bandwidth, the dynamic range is given by Eq. 26. As we target a voltage range $V_{range}$ of 10 mV for this stage, the dynamic range is limited to 35.6 dB by the RTIA stage. This value is not enough for our application. Increasing the dynamic range requires increasing the voltage range (i.e. the conversion resistor) or reducing the RTIA bandwidth below 5 MHz. Neither of these solutions is possible for our application, as explained above. Given its power consumption and its limited dynamic range, the RTIA circuit is not retained to increase the sensitivity of the pixel front-end.

$$DR_{dB} = 20 \log \left( \frac{V_{range}}{A \sqrt{\frac{KT}{C_{PD}} \times \frac{1}{(A+1)}}} \right)$$
 Eq. 26
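The rearrangement from Eq. 25 to the noise term of Eq. 26 can be sketched as follows. The intermediate steps are a reconstruction based only on the two expressions of Eq. 25; they are not quoted from Annex C.

$$
\begin{aligned}
\sigma_{Ith\ Rf} &= \frac{A}{1+A} \sqrt{R_{conv}\, 4KT \times \frac{\pi}{2} \times \frac{1+A}{2\pi R_{conv} C_{PD}}} \\
&= \frac{A}{1+A} \sqrt{\frac{KT\,(1+A)}{C_{PD}}} \\
&= A \sqrt{\frac{KT}{C_{PD}} \times \frac{1}{(A+1)}}
\end{aligned}
$$

In this form the conversion resistor cancels out of the noise expression: for a given gain $A$ and photodiode capacitor $C_{PD}$, $R_{conv}$ only acts on the dynamic range through the voltage range and the bandwidth.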

# 5.2.3. Capacitive Trans-Impedance Amplifier Circuit

The current to voltage conversion can also be performed by a capacitive trans-impedance amplifier (CTIA). This circuit is composed of an operational amplifier and a capacitor that provides a negative feedback as shown in Fig. 82. This circuit performs the conversion thanks to an integration operation. The CTIA is reset by switching on the reset switch $SW_{rst}$. The circuit is then a voltage follower and the output voltage and the inverting input are set to the reset voltage $V_{rst}$. The integration starts by switching off the reset switch. When light reaches the photodiode, the photocurrent discharges the inverting input of the op-amp. In reaction, the op-amp injects some current at its output, which increases the output voltage. In doing so, the feedback capacitor, which forms a capacitive divider with the photodiode junction capacitor, raises the inverting input of the op-amp to stabilize the circuit. A timing diagram is presented in Fig. 82. The conversion gain depends on the integration capacitor C<sub>int</sub>. To reach a current to voltage gain of 140 dB for a frame rate of 5 Mfps, the integration capacitor value is 20 fF. The photodiode is implemented with a NWell/PSub structure; the junction capacitor $C_{PD}$ is then 200 fF. We choose here a photodiode with a minimum junction capacitor because its value affects the stage bandwidth. The load capacitor $C_{L}$ is set to 20 fF and corresponds to the input impedance of the next stage.

Fig. 82 Photo-Current to Voltage Conversion performed by a Capacitive Trans-Impedance Amplifier (CTIA) with its Timing Diagram



In terms of speed, the CTIA stage is constrained by the integration and the reset operations. The transfer function during the integration operation and its bandwidth are given by Eq. 27, assuming that the output resistor of the op-amp can be neglected compared to the load impedance. This assumption is reasonable at our working frequency.

$$\frac{V_{int}}{I_{pd}} = \frac{-1}{sC_{Int}} \times \frac{1}{1 + s \frac{(C_L C_{PD} + C_L C_{Int} + C_{Int} C_{PD})}{g_m C_{Int}}}$$
 Eq. 27

The integration time is 180 ns and the targeted bandwidth is 34 MHz. To evaluate the power consumption of the CTIA, we plot in Fig. 83 the bias current of the op-amp required to reach the 34 MHz bandwidth for different integration capacitors. The details of the computation are presented in Annex D. It is interesting to note that when the integration capacitor $C_{int}$ decreases (i.e. when the conversion gain increases), the op-amp load increases and thus the power consumption of the op-amp increases. Moreover, the reset operation imposes a slew rate constraint on the op-amp. Indeed, the op-amp has to reset the photodiode and the load capacitor in 20 ns. In the worst case (i.e. saturation of the previous image), the op-amp has to provide 14 $\mu$A to reset the integration capacitor. It is then the slew rate constraint that sets the current bias of the op-amp to 28 $\mu$A.
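The growth of the op-amp load when $C_{int}$ decreases can be read directly from the pole of Eq. 27. The sketch below computes the trans-conductance needed to place this pole at 34 MHz. It only uses Eq. 27; the conversion from trans-conductance to bias current is done in Annex D and is not reproduced here, and the function names are illustrative.

```python
import math

def equivalent_load_f(c_int, c_pd, c_l):
    """Equivalent capacitance seen in the pole of Eq. 27."""
    return (c_l * c_pd + c_l * c_int + c_int * c_pd) / c_int

def required_gm(bw_hz, c_int, c_pd, c_l):
    """Trans-conductance that puts the pole of Eq. 27 at bw_hz."""
    return 2 * math.pi * bw_hz * equivalent_load_f(c_int, c_pd, c_l)

def ctia_pole_hz(gm, c_int, c_pd, c_l):
    """Pole frequency of Eq. 27 for a given trans-conductance."""
    return gm / (2 * math.pi * equivalent_load_f(c_int, c_pd, c_l))

C_PD, C_L, BW = 200e-15, 20e-15, 34e6

gm_20f = required_gm(BW, 20e-15, C_PD, C_L)
gm_10f = required_gm(BW, 10e-15, C_PD, C_L)

print(f"C_int = 20 fF: load {equivalent_load_f(20e-15, C_PD, C_L) * 1e15:.0f} fF, "
      f"gm {gm_20f * 1e6:.0f} uA/V")
print(f"C_int = 10 fF: load {equivalent_load_f(10e-15, C_PD, C_L) * 1e15:.0f} fF, "
      f"gm {gm_10f * 1e6:.0f} uA/V")
```

Halving the integration capacitor raises the equivalent load from about 420 fF to about 620 fF, which is consistent with the trend described for Fig. 83.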

Fig. 83 Bias Current of the Op-Amp versus the Integration Capacitor for a 34 MHz Bandwidth



To evaluate the dynamic range of the CTIA stage, a noise analysis has been carried out in Annex D. The total noise and the dynamic range are given by Eq. 28 with  $V_{range}$  the voltage range of the CTIA stage (i.e. 1.2V). For an integrator, the total noise is the root mean square (rms) sum of the reset noise  $\sigma_{rst}$  and the readout noise  $\sigma_{readout}$ . The expression of the reset noise is based on the analysis presented in [97]. The total noise of the CTIA is 1.8 mVrms and the dynamic range is then 56 dB.



Unlike the RTIA circuit, the CTIA is implementable in the pixel. Indeed, the circuit does not require large components such as mega-ohm resistors, as the integration is performed on a 20 fF capacitor. With such a value, the current to voltage gain is 9 V/$\mu$A, which corresponds to a responsivity of 2.25 V/$\mu$W and a sensitivity of 45.7 V/lux/s for the considered photodiode at 630 nm. The bandwidth constraint due to the integration operation imposes a bandwidth of 34 MHz on the op-amp and the slew rate constraint imposes a biasing current of 28 $\mu$A. The noise analysis demonstrates that the dynamic range of the CTIA stage is 56 dB, which is compliant with our specifications.
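The CTIA figures above can be cross-checked with the short sketch below. It assumes an ideal integrator, for which the current to voltage gain is the integration time divided by the integration capacitor; the variable names are illustrative.

```python
import math

T_INT = 180e-9                   # integration time
C_INT = 20e-15                   # integration capacitor
PD_RESPONSIVITY_A_PER_W = 0.25   # NWell/PSub photodiode at 630 nm
V_RANGE = 1.2                    # voltage range of the CTIA stage
NOISE_VRMS = 1.8e-3              # total noise quoted in the text

gain_v_per_a = T_INT / C_INT                              # ideal integrator gain
gain_db = 20 * math.log10(gain_v_per_a)
responsivity_v_per_uw = gain_v_per_a * PD_RESPONSIVITY_A_PER_W * 1e-6
dynamic_range_db = 20 * math.log10(V_RANGE / NOISE_VRMS)

print(f"gain          : {gain_v_per_a / 1e6:.0f} V/uA ({gain_db:.1f} dB)")
print(f"responsivity  : {responsivity_v_per_uw:.2f} V/uW")
print(f"dynamic range : {dynamic_range_db:.1f} dB")
```

The ideal gain comes out at about 139 dB, close to the 140 dB target, and the dynamic range at about 56 dB.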

### 5.2.4. Buffered Direct Injection Circuit

Another solution to perform the integration on a chosen capacitor is the direct injection circuit. The circuit is composed of a single MOSFET as shown in Fig. 84. The idea is to use the transistor in "common gate" mode. In this mode, the transistor performs a current copy. The transistor is connected by its drain and source between the photodiode and an integration capacitor $C_{int}$. The transistor gate is set to a voltage that ensures saturation. The image acquisition starts with the reset of the integration capacitor to the reset voltage $V_{rst}$ thanks to the reset switch $SW_{rst}$. The integration starts when $SW_{rst}$ is switched off. The saturation level depends on the injection transistor. When it leaves the saturation region for the ohmic region (i.e. $V_{ds}$<$V_{gs}$-$V_{th}$), the photocurrent is no longer integrated on $C_{int}$ but on the sum of $C_{int}$ and $C_{pd}$. Moreover, the injection transistor is in weak inversion because it is biased by the photocurrent, which is low. The expression of the trans-conductance $g_{mTinj}$ for a MOSFET in weak inversion is given in Annex E. One benefit of the direct injection circuit is a current to voltage conversion performed without any op-amp. One drawback is the uneven bias of the injection transistor. Indeed, the operating point of the injection transistor varies with the photocurrent. Therefore, the transistor parameters such as its trans-conductance $g_{mTinj}$ depend on the incoming light. As the trans-conductance depends on the photocurrent, the system bandwidth varies with it (Eq. 29). As a result, the bandwidth varies from 21 kHz to 2.1 MHz for photocurrents from 1 nA to 100 nA. For our application, one targets a frame rate of 5 Mfps. Therefore, this stage must be able to follow variations of the photocurrent at a speed higher than the frame rate. It clearly appears that the bandwidth values are not sufficient for our application, even with a high photocurrent. The bandwidth issue is a major limitation of the direct injection circuit.
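The order of magnitude of these bandwidths can be recovered with a textbook weak inversion model, sketched below. This is not the expression of Annex E or Eq. 29, which are not reproduced here: the sketch assumes $g_m = I/(n U_T)$, a bandwidth of $g_m/(2\pi C_{PD})$ with $C_{PD}$ = 200 fF, a temperature of 300 K and a hypothetical slope factor n = 1.5.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant (J/K)
Q_E = 1.602176634e-19   # elementary charge (C)

def gm_weak_inversion(i_d, n=1.5, temp_k=300.0):
    """Trans-conductance in weak inversion; n and temp_k are assumptions."""
    u_t = K_B * temp_k / Q_E          # thermal voltage
    return i_d / (n * u_t)

def direct_injection_bw_hz(i_pd, c_pd=200e-15, n=1.5, temp_k=300.0):
    """Assumed input pole of the direct injection stage: gm / (2*pi*C_PD)."""
    return gm_weak_inversion(i_pd, n, temp_k) / (2 * math.pi * c_pd)

bw_1na = direct_injection_bw_hz(1e-9)
bw_100na = direct_injection_bw_hz(100e-9)

print(f"1 nA   : {bw_1na / 1e3:.1f} kHz")
print(f"100 nA : {bw_100na / 1e6:.2f} MHz")
```

With these assumptions the bandwidth is proportional to the photocurrent, which reproduces the factor 100 between the two quoted values, and both results stay below the 5 MHz required.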

Fig. 84 Direct Injection Stage and its Transfer Function



To tackle the bandwidth limitation, a solution is to use an op-amp. This idea was first proposed for CCD in [98] and then for CIS in [99]. The injection transistor is placed in the feedback loop of an op-amp as illustrated in Fig. 85. This circuit is named buffered direct injection (BDI). At steady state, the inverting input of the op-amp is equal to the reference voltage $V_{ref}$ applied to the positive input. When the photodiode is illuminated, the photocurrent starts to reduce the voltage at the inverting input of the op-amp. In reaction, the op-amp increases its output voltage, which controls the gate of the injection transistor. Therefore, the current that flows through the transistor increases to compensate the voltage drop at the inverting input. The feedback loop stabilizes the circuit. The gain A of the op-amp increases the bandwidth of the BDI circuit (Eq. 30). It is interesting to note that the bandwidth is independent of the integration capacitor, unlike in the CTIA stage. For our application, the gain of the op-amp must be higher than 48 dB to reach the targeted bandwidth with a photocurrent of 1 nA. Therefore, the op-amp must have a bandwidth higher than 5 MHz to ensure a gain of 48 dB over the working frequency.
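The 48 dB requirement can be related to the 21 kHz bandwidth of the unbuffered circuit at 1 nA. The sketch below assumes that the feedback multiplies the direct injection bandwidth by (1 + A), which is the usual result for this topology; Eq. 30 itself is not reproduced here, so this form is an assumption.

```python
import math

BW_DI_1NA = 21e3     # direct injection bandwidth at 1 nA, quoted in the text
BW_TARGET = 5e6      # bandwidth required by the 5 Mfps frame rate

def required_gain_db(bw_target_hz, bw_di_hz):
    """Op-amp gain needed if the BDI bandwidth is (1 + A) * bw_di_hz."""
    return 20 * math.log10(bw_target_hz / bw_di_hz - 1)

def bdi_bandwidth_hz(gain_db, bw_di_hz):
    """BDI bandwidth under the same (1 + A) assumption."""
    return (1 + 10 ** (gain_db / 20)) * bw_di_hz

min_gain_db = required_gain_db(BW_TARGET, BW_DI_1NA)
bw_at_48db = bdi_bandwidth_hz(48, BW_DI_1NA)

print(f"minimum gain      : {min_gain_db:.1f} dB")
print(f"bandwidth at 48 dB: {bw_at_48db / 1e6:.2f} MHz")
```

Under this assumption a gain of about 47.5 dB is the minimum, and 48 dB gives a bandwidth of about 5.3 MHz at 1 nA.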

Fig. 85 Buffered Direct Injection Stage and its Transfer Function



A simple analysis of the circuit noise is carried out to assess the dynamic range of this integration stage (cf. Annex E). The total noise is the rms sum of the standard deviation of the readout noise $\sigma_{BDI}$ and the reset noise $\sigma_{rst}$, Eq. 31. The readout noise is caused by the noise of the injection transistor $PSD_{inj-tran}$ and the op-amp noise $PSD_{op-amp}$. These PSDs are integrated over frequency through the BDI stage of bandwidth BW. The expressions of $H_{op\text{-}amp0}$ and $H_{inj0}$ are presented in Annex E. For our application, one considers an op-amp implemented with a folded cascode structure [100] and an integration capacitor of 20 fF. It appears that the readout noise is not constant over the dynamic range, as the bandwidth depends on the photocurrent. The main source of noise is the op-amp, which contributes more than 90 % of the readout noise. Based on the model presented in Annex E, the readout noise varies between 450 $\mu$Vrms and 4.4 mVrms for photocurrents of 1 nA and 100 nA respectively. The reset noise corresponds to the sampling of the output noise of the BDI stage $\sigma_{BDI}$ and of the thermal noise of the switch $SW_{rst}$ on the integration capacitor $C_{int}$. The reset noise varies between 640 $\mu$Vrms and 4.4 mVrms for photocurrents of 1 nA and 100 nA respectively. One can expect a dynamic range of 64 dB for a voltage range $V_{range}$ of 1.2 V. However, the SNR of the BDI stage worsens under high illuminances.

$$DR_{dB} = 20 \log \left( \frac{V_{range}}{\sqrt{\sigma_{rst}^2 + \sigma_{readout}^2}} \right) \quad with \quad \begin{cases} \sigma_{readout}^2 = \sigma_{BDI}^2 = \left( PSD_{op-amp}H_{op-amp0}^2 + PSD_{inj-tran}H_{inj0}^2 \right) \times \frac{\pi}{2}BW \\ \sigma_{rst}^2 = \sigma_{BDI}^2 + \frac{KT}{C_{int}} \end{cases}$$
 Eq. 31
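The noise figures quoted for the BDI stage can be cross-checked against the expressions above with the sketch below. It takes the readout noise values from the text as inputs rather than recomputing them from the PSDs of Annex E, and it assumes a temperature of 300 K, which the text does not state.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant (J/K)
TEMP_K = 300.0       # assumed temperature
C_INT = 20e-15       # integration capacitor
V_RANGE = 1.2        # voltage range

def reset_noise(sigma_bdi):
    """sigma_rst from sigma_rst^2 = sigma_BDI^2 + kT/C_int."""
    return math.sqrt(sigma_bdi ** 2 + K_B * TEMP_K / C_INT)

def range_to_noise_db(sigma_bdi):
    """20*log10(V_range / total noise), with sigma_readout = sigma_BDI."""
    total = math.sqrt(reset_noise(sigma_bdi) ** 2 + sigma_bdi ** 2)
    return 20 * math.log10(V_RANGE / total)

sigma_rst_1na = reset_noise(450e-6)       # readout noise at 1 nA
dr_db = range_to_noise_db(450e-6)         # noise floor sets the dynamic range
ratio_100na_db = range_to_noise_db(4.4e-3)

print(f"reset noise at 1 nA      : {sigma_rst_1na * 1e6:.0f} uVrms")
print(f"dynamic range            : {dr_db:.1f} dB")
print(f"range to noise at 100 nA : {ratio_100na_db:.1f} dB")
```

The 1 nA case gives a reset noise of about 640 µVrms and a dynamic range of about 64 dB. The last value is not a dynamic range in the strict sense; it only illustrates how much the noise grows at high photocurrent.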

This solution increases the sensitivity through a reduction of the integration capacitor. For a frame rate of 5 Mfps and an integration capacitor of 20 fF, the conversion gain is 8 $\mu$V/e-. It corresponds to a gain of 6 V/$\mu$A, or a responsivity of 1.5 V/$\mu$W at a wavelength of 630 nm. The expected dynamic range is sufficient for our application, which targets 50 dB. Although the BDI circuit uses an op-amp like the CTIA circuit, the constraints on it seem lower. Indeed, in the BDI stage the reset operation is performed by a reset transistor and not by the op-amp. The load of the op-amp is the gate of the injection transistor, which is small compared to the load of the op-amp in the CTIA stage. Moreover, the output of the BDI op-amp does not have to cover a large voltage range, unlike in the CTIA. Therefore, the op-amp bandwidth and slew rate constraints are lower for the BDI than for the CTIA. Based on these points, we chose a BDI circuit to increase the sensitivity of the current to voltage stage.

## 5.2.5. Implementation and Simulation

The first design choice for the BDI circuit is the op-amp architecture. Here, one first targets an op-amp gain of 48 dB with a bandwidth higher than 5 MHz as explained above. The folded cascode architecture is a good candidate as it has a high gain and only one stage, as shown in Fig. 86. We used a cascode stage to increase the gain of the long tailed pair and a cascode mirror to provide a sufficient load. These stages are biased by the voltages $V_C$ and $V_{CM}$ respectively. The folded cascode op-amp is biased by two current sources of 38 $\mu$A each. As the power supply is 2.5 V, the op-amp consumes 190 $\mu$W. The DC gain is 54.7 dB for a -3 dB bandwidth of 4 MHz. The op-amp bandwidth is just below the specification of 5 MHz. However, the gain is higher than the targeted 48 dB and stays above this value up to 5 MHz. Therefore, the op-amp bandwidth should not limit the bandwidth of the BDI.
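The claim that the gain stays above 48 dB up to 5 MHz can be checked with a single-pole approximation of the op-amp, sketched below. The single-pole model is an assumption made for this check; the simulated response of the folded cascode may differ.

```python
import math

def single_pole_gain_db(dc_gain_db, f_pole_hz, f_hz):
    """Gain magnitude of a single-pole amplifier at frequency f_hz."""
    return dc_gain_db - 10 * math.log10(1 + (f_hz / f_pole_hz) ** 2)

DC_GAIN_DB = 54.7    # simulated DC gain
F_POLE = 4e6         # simulated -3 dB bandwidth

gain_at_5mhz_db = single_pole_gain_db(DC_GAIN_DB, F_POLE, 5e6)
power_w = 2 * 38e-6 * 2.5     # two 38 uA current sources under 2.5 V

print(f"gain at 5 MHz: {gain_at_5mhz_db:.1f} dB")
print(f"power        : {power_w * 1e6:.0f} uW")
```

With this model the gain at 5 MHz is about 50.6 dB, which leaves a margin of roughly 2.6 dB over the 48 dB target.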

Fig. 86 Folded Cascode Amplifier



The BDI architecture has been designed and simulated using this op-amp. The integration is performed on the junction capacitors of the drain/substrate diodes of the reset and injection transistors. The input capacitor of the next stage (SF buffer) also contributes to the total integration capacitor. However, junction capacitors are voltage dependent and create some non-linearity in the DC characteristic. The total integration capacitor is evaluated to about 20 fF. The characteristic and the sensitivity of the BDI stage are plotted in Fig. 87. The characteristic is extracted from a set of transient simulations. For a photocurrent from 0 nA to 90 nA, the gain increases with the photocurrent from 10 V/$\mu$A to 16 V/$\mu$A and has an average value of 13 V/$\mu$A. These values of sensitivity correspond to an integration capacitor of 18 fF and 11 fF respectively. As expected, the integration capacitor changes with the voltage. For a higher photocurrent (>90 nA), the injection transistor leaves the saturation region. The photocurrent is then integrated on the photodiode junction capacitor $C_{PD}$ and the integration capacitor $C_{Int}$. As the value of $C_{PD}$ is about 200 fF, the gain drops by a factor 10.
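The correspondence between the simulated gains and the two quoted capacitor values (11 fF and 18 fF) can be sketched with an ideal integrator model. The sketch assumes an integration time of 180 ns, the value quoted earlier for the CTIA stage at 5 Mfps; the text does not restate it for the BDI simulation.

```python
T_INT = 180e-9   # assumed integration time (value quoted for the CTIA stage)

def effective_capacitor_f(gain_v_per_a, t_int=T_INT):
    """Integration capacitor of an ideal integrator with the given gain."""
    return t_int / gain_v_per_a

c_low_gain = effective_capacitor_f(10e6)    # gain of 10 V/uA
c_high_gain = effective_capacitor_f(16e6)   # gain of 16 V/uA
c_average = effective_capacitor_f(13e6)     # average gain of 13 V/uA

# Gain drop once the photocurrent is also integrated on C_PD (about 200 fF),
# taking the 20 fF estimate for the integration capacitor.
gain_drop = (200e-15 + 20e-15) / 20e-15

print(f"10 V/uA -> {c_low_gain * 1e15:.1f} fF")
print(f"16 V/uA -> {c_high_gain * 1e15:.1f} fF")
print(f"13 V/uA -> {c_average * 1e15:.1f} fF")
print(f"gain drop in saturation: x{gain_drop:.0f}")
```

The higher gain corresponds to the smaller effective capacitor, and the drop in saturation is a factor of about 11, in line with the factor 10 stated in the text.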

Fig. 87 Integration Voltage versus Photocurrent Characteristic and Sensitivity of the BDI stage



The bandwidth of the BDI has been simulated thanks to an AC analysis. As expected, the AC response of the BDI integration stage depends on the photocurrent value. As extracted from Fig. 88, the bandwidth varies from 10 MHz to 550 MHz for photocurrents from 1 nA to 100 nA.

Fig. 88 I<sub>int</sub> versus I<sub>pd</sub> Transfer Function of the BDI Stage



The stability of the system is studied thanks to an open loop analysis. The feedback loop is opened at the node between the photodiode and the injection transistor as illustrated in Fig. 89. In this figure, the transfer function $V_{PD}$ versus $V_G$ is given without taking into account the coupling capacitor $C_C$. The source of the injection transistor is loaded by a capacitor $C_L$ equal to the sum of the photodiode capacitor and the op-amp input capacitor. The positive input of the op-amp is grounded.

Fig. 89 Stability Analysis of the BDI stage



The AC simulation results are plotted in Fig. 90. The op-amp transfer function (direct gain) and the inverse of the injection transistor transfer function (feedback gain) are plotted. The difference between these two plots corresponds to the open loop gain in decibels. The point where the two curves intersect corresponds to an open loop gain equal to one (i.e. 0 dB). The corresponding frequency defines the gain-bandwidth product of the open loop transfer function. At this frequency, the phase margin of the open loop transfer function must be above 45° to ensure stability. The system seems to be stable as the phase margin varies from 74° to 55° depending on the photocurrent. However, a closer look at the figure for a photocurrent of 100 nA shows that the phase reaches a minimum of 33° at 16 MHz while the open loop gain is still higher than one. This low margin may jeopardize the stability of the system.

Fig. 90 Direct and Feedback Gains of the Open Loop Circuit



By looking at the transfer function of the injection transistor, we can see that it is composed of a "low frequency" pole and a "high frequency" zero. The pole is due to the trans-conductance of the transistor and the load capacitor $C_L$. The zero is due to the coupling capacitor $C_C$ between the gate and the source of the injection transistor, which creates a capacitive divider with $C_L$. To improve the stability of the loop, a solution is to increase the coupling capacitor $C_C$. In doing so, the gain of the capacitive divider ($C_C$ and $C_L$) becomes higher and the frequency of the zero is reduced, as plotted in Fig. 91. Therefore the crossing of the transfer functions occurs at 130 MHz. The phase margin is then increased and never falls below 59° before the crossing. The drawback of this solution is a reduction of the closed loop bandwidth, as presented in Fig. 92. Using a coupling capacitor requires a trade-off between stability and bandwidth. Therefore, for our application we do not use a coupling capacitor to improve the stability.

Fig. 91 Direct and Feedback Gains of the Open Loop Circuit with a 25 fF Coupling Capacitor

Fig. 92  $\rm I_{int}$  versus  $\rm I_{pd}$  Closed Loop Transfer Function of the BDI Stage with a 25 fF Coupling Capacitor



Transient noise simulations have been carried out to evaluate the SNR and the dynamic range of the BDI stage. These simulations take the reset noise into account. They have been run for six photocurrents that cover the input range of the circuit. The standard deviation of the voltage noise on the integration capacitor C<sub>int</sub> is plotted in Fig. 93. As predicted by the model, the output noise increases with the photocurrent and the main source of noise is the op-amp. The noise values are of the same order of magnitude as the values of the model for photocurrents above 60 nA. It is interesting to note that the noise drops at the end of the integration for photocurrents of 80 nA and 100 nA. One can explain this phenomenon by the fact that saturation is reached for high photocurrents. As presented before, the injection transistor then enters the ohmic region. In this condition, its trans-conductance is small and consequently the system bandwidth decreases. Therefore, the op-amp noise is filtered by a lower bandwidth and the output noise is reduced. The same simulations have been done with the 25 fF coupling capacitor. The output noise is reduced compared to the version without coupling capacitor. One can explain this by the reduced closed loop bandwidth, which cuts off the noise PSD earlier. Based on these simulations, the dynamic range of the BDI stage is 58 dB for the solution without coupling capacitor.

[Plot: voltage noise (Vrms) versus photocurrent (A), without and with coupling capacitor]

Fig. 93 Noise on the Integration Node of the BDI Stage Without and With Coupling Capacitor

The saturation of the integration capacitor is an issue as the photocurrent is then integrated on the junction capacitor of the photodiode. This changes the biasing of the photodiode. In our design, the reset time is not long enough to let the op-amp bring its negative input back to the reference voltage $V_{ref}$. Therefore, we have to keep the photocurrent low enough to prevent saturation. Otherwise, some lag will appear in the next image.

#### 5.2.6. Conclusion

Different solutions have been studied to increase the sensitivity of the current to voltage conversion without using transfer gate technology. Using an RTIA stage is not possible for our application as it does not offer a sufficient dynamic range at the targeted frame rate. A CTIA stage working at 5 Mfps with a current to voltage gain of 140 dB can be implemented in the pixel. It has a dynamic range of 56 dB, which is enough for our specifications. However, a burst CIS based on a CTIA stage has already been designed in [38]. We then chose to study the direct injection circuit, which increases the sensitivity without using any op-amp. However, this solution is not suited for our application as its bandwidth cannot reach 5 MHz over our span of photocurrent. Inspired by this circuit, we found that a buffered direct injection circuit can increase the sensitivity of our pixel front-end. The BDI stage offers a gain of 13 V/$\mu$A. At 630 nm, the sensitivity is then 66 V/lux/s, which is a significant improvement compared to the sensitivity of the APS integration stage. The bandwidth of the BDI stage depends on the photocurrent and we show that this structure is compliant with the 5 Mfps specification. The dynamic range of the stage without coupling capacitor is 58 dB, which is compliant with the specification. The BDI stage should not limit the dynamic range of the sensor. However, it is important to keep in mind that the SNR is not constant over the dynamic range as it depends on the photocurrent.

## 5.3. Design Strategies to Reduce Power Consumption

The second limitation of the pixel front-end inspired by the APS is its high power consumption, which is mainly due to the SF buffers of the global shutter and the multiplexer. There are different ways to reduce the power consumption, based on architecture or circuit solutions. Firstly, the power consumption of the SF buffers can be reduced by using two memories for the global shutter stage. In doing so, the SF buffer can write the current pixel value in one sampling cell while the previous value stored in the other memory is converted by the ADC. The speed constraint on the SF buffer, and thus the power consumption, is divided by almost a factor ten. However, this solution reduces the fill factor as it requires embedding two sampling cells in the pixel. We propose here circuit modifications that reduce the power consumption without impacting the fill factor.

One solution to reduce the power consumption is to remove the current bias of the SF buffers in the front-end circuit as illustrated in Fig. 94. Without the bias, the buffer can only load (resp. unload) its output if the MOSFET $T_{buff}$ is an N type (resp. P type). A switch transistor $SW_{buff}$ is connected to preset the output voltage to the minimum (resp. maximum) value of the output voltage range. The transistor $T_{buff}$ works in saturation ($V_{DS}>V_{GS}-V_{th}$). At the beginning, the transistor is in the strong inversion region but it progressively reaches the weak inversion region. Therefore, the trans-conductance of $T_{buff}$ keeps decreasing. The time constant of the buffer is then very large and the buffer output is sampled before the steady state. The buffer gain is about 0.8 and slightly depends on the sampling time. The characteristic of the pixel front-end without current source is plotted in Fig. 96.

Fig. 94 Pixel Front-End Without Current Source and its Timing Diagram



The power consumption $P_{buff}$ of this buffer is only due to the loading of the output capacitor $C_{load}$ at a given working frequency F and is thus given by Eq. 32. The power consumptions of the pixel and multiplexer buffers are presented in Fig. 95. The power of the pixel buffer increases with the photocurrent because this buffer unloads its output. The power of the multiplexer buffer decreases with the photocurrent because this buffer loads its output. The total pixel power consumption varies from 4 $\mu$W to 10 $\mu$W depending on the input photocurrent. It is an improvement of more than a factor 20 compared to the front-end inspired by the APS. The multiplexer power is divided by 9 for the computation of the total power as this buffer is shared between 9 pixels.
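The improvement factor and its consequence at the matrix level can be sketched as follows. The 226 µW per pixel and the 400x400 resolution are the values given earlier in this chapter for the APS-inspired front-end; the extrapolation to the full matrix is an illustration and is not a figure quoted in the text.

```python
APS_PIXEL_POWER_W = 226e-6       # APS-inspired front-end, per pixel
NEW_PIXEL_POWER_W = (4e-6, 10e-6)  # front-end without current source (min, max)
N_PIXELS = 400 * 400

def total_pixel_power(p_pixel_buffer, p_mux_buffer, shared_pixels=9):
    """Per-pixel power when the multiplexer buffer is shared between pixels."""
    return p_pixel_buffer + p_mux_buffer / shared_pixels

# Improvement factors for the worst (10 uW) and best (4 uW) cases.
worst_case_factor = APS_PIXEL_POWER_W / max(NEW_PIXEL_POWER_W)
best_case_factor = APS_PIXEL_POWER_W / min(NEW_PIXEL_POWER_W)

# Front-end power extrapolated to the full matrix.
matrix_power_w = tuple(N_PIXELS * p for p in NEW_PIXEL_POWER_W)

print(f"improvement  : x{worst_case_factor:.1f} to x{best_case_factor:.1f}")
print(f"matrix power : {matrix_power_w[0]:.2f} W to {matrix_power_w[1]:.2f} W")
```

Even in the worst case the improvement exceeds a factor 20, which would bring the front-end power of the matrix from 36 W down to the order of 1 W.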

Fig. 95 Power Consumption versus the Photocurrent of the Pixel Front-End Without Current Source



The characteristic of the front-end circuit without current source has been plotted thanks to a set of transient simulations in Fig. 96. The reset voltage of the photodiode is 1.23 V. The gain of the circuit, extracted from the characteristic, varies between -1.2 V/$\mu$A and -0.6 V/$\mu$A. The gain is slightly better than that of the front-end circuit with the SF buffers.

Fig. 96 Characteristic and Gain of the Front-End Circuit without Current Source



A set of transient noise simulations has demonstrated that the output voltage noise of the pixel front-end without current source is about 158 $\mu$Vrms. The buffers are not very noisy due to their poor trans-conductance and bandwidth. The main sources of noise are the samplings on $C_{GS}$ and $C_{ADC}$. This result is confirmed by hand calculations. However, it is necessary to take into account another source of noise. As said before, the output signal of the buffer is sampled before the steady state. A variation of the sampling time will then produce a variation of the sampled value as illustrated in Fig. 97. The temporal derivative of the signal acts as a conversion gain between a temporal variation and a voltage variation. As the sampling time variations are random, they can be considered as noise.

Fig. 97 Effect of a Variation of the Sampling Time on the Sampled Value



The sampling signals are propagated across the matrix with a chain of buffers, like a clock signal. Therefore, one considers that the sources of temporal noise are the same as those of a clock signal [101]. The temporal variation of a signal is named skew. The sources of the skew can be systematic, random, drift and jitter. The systematic skew is due to the propagation delays through the lines and the signal buffers across the matrix. The random skew is due to the process variations of the wires and the buffers, which create a delay variation. The systematic and random skews produce a delay on the sampling signal that varies from one pixel to another but is constant over time. The voltage noise produced by such skews can be seen as a fixed pattern noise (FPN) and can be removed by a calibration operation. The drift and jitter skews produce time-dependent variations of the delay of the sampling signal. The variations of the delay caused by the drift occur slowly. The main source of drift is the temperature variation that affects the signal buffer performances. For this study, one considers that the temperature variation during a single shot acquisition in pre-event triggering mode is too small to have an effect on the sampling signal, as demonstrated in 4.4.2.1. For multi-burst acquisition or post-event triggering mode, the drift should be considered. The jitter is a high frequency variation of the sampling signal due to environmental variations. The main source of jitter is the power supply noise that creates some delay variations in the buffers. The jitter frequency is too high for compensation circuits, so the only remedy is to reduce the effect of the power supply noise. The voltage noise produced by the jitter is a temporal noise. The jitter of an inverter can be calculated for a given power supply noise, load, rising time and technology [102]. In practice, the sampling signal goes through a daisy chain of inverters before reaching the pixel. Therefore, the jitter on the sampling signal is the root sum of squares of the jitters of the individual inverters. The total jitter versus the daisy chain length is plotted in Fig. 98 for different loads. The power supply noise is assessed to 10 mVrms. We consider a daisy chain of 50 buffers and a load capacitor of 200 fF. For the rest of this study, the standard deviation of the sampling time is set to 48 ps.
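The accumulation of jitter along the daisy chain can be sketched as below. The sketch assumes that all inverters are identical and that their jitters are independent, so that they add in quadrature; the per-inverter value is derived from the 48 ps total and is not a figure given in the text.

```python
import math

def chain_jitter(per_stage_jitter_s, n_stages):
    """Total rms jitter of n identical, independent stages (quadrature sum)."""
    return per_stage_jitter_s * math.sqrt(n_stages)

TOTAL_JITTER_S = 48e-12   # standard deviation retained in the text
CHAIN_LENGTH = 50         # number of buffers in the daisy chain

# Per-inverter jitter implied by the total, under the assumptions above.
per_stage_s = TOTAL_JITTER_S / math.sqrt(CHAIN_LENGTH)

print(f"implied per-inverter jitter: {per_stage_s * 1e12:.1f} ps rms")
print(f"chain of 25 inverters      : {chain_jitter(per_stage_s, 25) * 1e12:.1f} ps rms")
```

Because the jitter grows with the square root of the chain length, halving the chain only reduces the total jitter by a factor of about 1.4.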

Fig. 98 Sampling Time Jitter versus the Length of the Buffer Daisy Chain for a Power Supply Noise of 10 mVrms



The jitter produces more voltage noise when the temporal derivative of the sampled signal is high. For instance, with a jitter of 48 ps rms, the noise on the sampled signal is 5.9 mVrms for a sampling time of 2 ns and 0.1 mVrms for a sampling time of 18 ns (Fig. 99). Therefore, the main source of noise is the multiplexer buffer ($T_{buff\ N}$ and $SW_{buff\ N}$), as its sampling time is 2 ns. We study here the effect of the jitter on the multiplexer buffer.

Fig. 99 Histogram of the Sampled Signal Values for 2 ns and 18 ns Sampling Times with a Standard Deviation of 48 ps
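The link between timing jitter and voltage noise can be sketched with the usual first-order approximation, in which the voltage error equals the signal slope at the sampling instant multiplied by the timing error. The code below is only an illustration under that assumption. It uses the noise values quoted in the text to estimate the slope they imply, and the function names are hypothetical.

```python
def jitter_voltage_noise(slope_v_per_s: float, jitter_rms_s: float) -> float:
    """First-order estimate: sigma_v = |dV/dt| * sigma_t."""
    return abs(slope_v_per_s) * jitter_rms_s


def implied_slope(noise_rms_v: float, jitter_rms_s: float) -> float:
    """Signal slope at the sampling instant implied by a given voltage noise."""
    return noise_rms_v / jitter_rms_s


JITTER_RMS_S = 48e-12  # standard deviation of the sampling time (from the text)

# (sampling time in ns, reported noise in Vrms), both taken from the text.
for t_sample_ns, noise_v in ((2, 5.9e-3), (18, 0.1e-3)):
    slope = implied_slope(noise_v, JITTER_RMS_S)
    print(f"{t_sample_ns:>2} ns sampling: implied slope = {slope * 1e-6:.1f} mV/ns")
```

The ratio between the two implied slopes illustrates why a longer sampling time, which lets the signal settle before it is sampled, relaxes the jitter constraint.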





The voltage noise produced by the sampling jitter depends on the sampled value. The SNR due to the jitter is plotted for different inputs in Fig. 100 (a). The SNR is worse at low photocurrents, as the temporal derivative at the sampling time is higher than for high photocurrents. It is possible to increase the sampling time at the input of the ADC to relax the constraint on the multiplexer buffer. Indeed, increasing the sampling time reduces the temporal derivative of the signal when the sampling occurs. The dynamic ranges of the pixel front-end are summarized in Fig. 100 (b) for different sampling times. Another solution is to reduce the power supply noise, which is the source of the jitter. The power supply noise can be filtered by implementing decoupling capacitors. However, this solution consumes a large chip area.

Fig. 100 (a) Voltage Noise versus the Input Signal for a 48 ps Jitter Noise (b) Voltage Noise and Dynamic Range for Different Sampling Time



| Sampling Time (ns) | Dynamic Range (dB) |
|--------------------|--------------------|
| 2                  | 42.6                  |
| 4                  | 52.0                  |
| 6                  | 58.1                  |
| 8                  | 62                    |

This work demonstrates that the pixel front-end without current sources drastically reduces the power consumption, by a factor of 20, and offers a slightly better sensitivity than the pixel front-end inspired from APS. The drawback of this solution is a dynamic range limited to 42 dB for a sampling time of 2 ns with a standard deviation of 48 ps on the sampling signal.

#### 5.3.1. Conclusion

In this chapter, we have presented a pixel front-end for a burst image sensor with digital storage, inspired from the APS structure. Its performance has been evaluated by simulation, which shows that this structure complies with our specifications in terms of speed (5 Mfps) and dynamic range (50 dB). However, the sensitivity of this front-end is poor (2.1 V/lux/s) due to the high integration capacitor of the photodiode, and the power consumption at such a frame rate is high (226 µW/pixel). We have then studied different solutions to increase the sensitivity by reducing the integration capacitor. We have chosen to design an integration stage based on a buffered direct injection circuit. This circuit integrates the photocurrent on a 20 fF capacitor, which increases the sensitivity to about 66 V/lux/s. This circuit has been validated for a frame rate of 5 Mfps with a dynamic range of 58 dB. One drawback of this stage is a poor signal to noise ratio, which depends on the signal and reaches 47 dB at saturation. Finally, we have proposed a front-end that reduces the pixel power consumption. We have analyzed a pixel front-end without current source. This structure drastically reduces the pixel power to 10 μW, which increases the power efficiency by a factor of 20 compared to the APS front-end. One drawback of this solution is that its unbiased buffers are very sensitive to the jitter on the sampling signals. This lowers the SNR and reduces the dynamic range to 42 dB. In the next chapter, we are going to present some experimental measurements carried out on a test-chip. This test-chip includes the three pixel front-end circuits presented previously. We are also going to present a 3D integrated test-chip with an implementation of the burst image sensor with digital storage.

# 6. Circuit Implementations and Tests

In this chapter, we present the test-chips that have been manufactured and tested during this doctoral work. The first one, designed in ST 130 nm technology, is pictured in Fig. 101. This test-chip aims to evaluate the three different pixel front-ends presented above. Two clusters of 3x3 pixels of the front-end inspired from APS have been implemented. One has electrical inputs and the other one has photodiodes. Two other clusters with photodiodes have been implemented: one for the front-end with BDI stage and one for the front-end without current source. A test board and a static light test bench have been designed to characterize the pixel front-ends. This bench allows us to measure the responsivity of the photodiode and plot the characteristic of the front-end. However, the bench uses halogen light sources that are not powerful enough to characterize the sensor at full speed. In order to complete these tests and validate the front-ends at 5 Mfps, another test bench with dynamic light has been designed. This bench uses a pulsed laser to measure the electrical opening window (electronic aperture) of the front-end circuits. These tests have been performed with the help of the Heterogeneous Systems and Microsystems team of the ICube laboratory.

Fig. 101 Packaged Pixel Front-End Test-chip with an Optical Lid

Afterward, we present in this chapter a second circuit that has been designed using 3D integration technology. A full burst image sensor with digital storage has been implemented on this chip. The resolution of this image sensor is 20x20 pixels and it is composed of two tiers. The pixel front-end and the analog part of the ADC are implemented on the top tier. The second tier contains the digital part of the ADC and the burst memory. This circuit has been developed with a team of one researcher and two engineers of the L3I laboratory. Unfortunately, this image sensor was designed and manufactured at the end of the third year and has not yet been tested. We will therefore only present some simulation results for this test-chip.

### **6.1. Test-Chip for Pixel Front-End Evaluation**

## 6.1.1. Test-Chip and Test Board

As presented in the introduction, the first test-chip has been designed to evaluate the different pixel front-ends. This chip is composed of four clusters of pixels. We did not have access to the imaging technology. We have thus designed our own N-Well/P-Sub photodiode. In order to let the light reach the photodiode, the metal and the salicide layers have been excluded above the photosensitive region. The front-end inspired from APS has been implemented in two versions. The first one does not have a photodiode. It gives us the possibility to validate this front-end and measure its voltage gain. It also serves as a safeguard to identify a problem of blind photodiodes. The version with photodiodes is used to characterize the full front-end. Moreover, knowing the gain of the front-end, we will be able to extract the photodiode responsivity from this measurement. The front-ends with BDI and without current source have been implemented with photodiodes.

During the design of the test-chip, we faced a problem in driving the analog signal out of the chip. Indeed, the path to reach the test board is composed of an input/output (IO) pad, a wire bonding that connects the chip to the package and a metallic trace from the package to the board buffer. The IO pad is mainly capacitive and is about 1 pF in our case. The 3 mm wire bonding is resistive and inductive. The package is a quad flat non-leaded model with 64 pins (QFN64). The board load is the sum of the trace capacitance and the input capacitance of the board buffer. The equivalent model of the load is presented in Fig. 102.

Fig. 102 Equivalent Electrical Model of the Signal Path from the Chip to the Board



The front-end circuits are not able to drive this load. Therefore, a voltage buffer has been implemented at the end of the front-end to drive the signal from the chip to the board, as shown in Fig. 103. This circuit has been provided by an engineer of the L3I laboratory. The buffer is made of a P-type MOSFET $T_{buff}$ that copies the voltage on its source. Moreover, this buffer includes a feedback loop. If the input voltage increases, the drain voltage of $T_{buff}$ decreases, as does the gate voltage of the MOSFET $M_5$. As this transistor is in common-drain configuration, its source voltage decreases, which increases the $V_{GS}$ of the MOSFET $M_3$. The current increases in this branch and is copied by the current mirror $M_1/M_2$. This current increase boosts the drive of the output load and acts as a positive feedback. The input capacitance of this buffer is 1 pF. Fortunately, this value corresponds to the load for which the pixel front-end has been designed. The buffer has an almost unity gain thanks to the bulk-source connection of the MOSFET $T_{buff}$. The -3 dB bandwidth of this buffer is 300 MHz, which is enough to drive the load in less than 16 $\mu$s.

Fig. 103 Chip Buffer Schematic



The test board is composed of three PCBs, as shown in Fig. 104. The FPGA evaluation board and the test-chip board are connected to a mother board. The test-chip board is a socket for the test-chip with decoupling capacitors. The mother board contains low-dropout (LDO) voltage regulators, which perform the power management of the test board. The mother board contains the delay generator circuits that drive the test-chip as presented above. It also contains the voltage followers that copy the test-chip outputs to the oscilloscope inputs.

Fig. 104 Test Board



We implemented a digital circuit on the FPGA that triggers and generates the control signals to acquire a video with the test-chip. This circuit is composed of a counter which is reset at the end of each image acquisition. For each control signal, the user provides the FPGA with a sequence of numbers. Whenever the counter reaches one of these numbers, the digital circuit changes the state (low/high) of the corresponding signal. As the counter is clocked by the FPGA clock, the maximum signal frequency is equal to one-half the frequency of the FPGA clock. Therefore, we faced a problem in generating some control signals with short high states. The test board provides a 25 MHz clock to the FPGA. Using the embedded PLL of the FPGA, the inner clock can reach 200 MHz. The FPGA can then generate control signals with a rising/falling edge every 5 ns, but the ADC sampling signal has a pulse width of 4 ns and the pixel front-end without current source requires an ADC preload signal with a pulse width of 2 ns. These signals cannot be generated with the FPGA.

To bypass this issue, these critical signals are generated with the circuit presented in Fig. 105. The FPGA sends the control signal to two delay lines. The lines add a delay $T_{D1}$ and $T_{D2}$, respectively, to the signal. The delayed signals arrive at a XOR gate, which generates a signal with a high state duration equal to the difference between $T_{D1}$ and $T_{D2}$. The delay lines are implemented on the test board and the XOR gate on the test-chip. In doing so, the XOR output signal propagates only on the chip and is not degraded by the signal path presented in Fig. 102. The delay lines are configured by the FPGA. The proper operation of the control signal generation has been verified by measurements.

Fig. 105 Control Signal Generation Circuit based on Delay Lines
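The two generation mechanisms described above can be sketched behaviourally. The code below is an illustration, not the FPGA design itself: `counter_waveform` models a signal that toggles whenever the counter reaches a user-supplied number, and `xor_waveform` models the XOR of two delayed copies of a single rising edge. All names and example values are hypothetical, and times are expressed in integer picoseconds to avoid rounding issues.

```python
def counter_waveform(toggle_counts, n_cycles, initial=0):
    """State of one control signal for each counter value in [0, n_cycles)."""
    toggles = set(toggle_counts)
    state, wave = initial, []
    for count in range(n_cycles):
        if count in toggles:
            state ^= 1  # change the state (low/high) of the signal
        wave.append(state)
    return wave


def xor_pulse_width(t_d1, t_d2):
    """Width of the XOR output pulse for one input edge."""
    return abs(t_d1 - t_d2)


def xor_waveform(edge_time, t_d1, t_d2, times):
    """XOR of two delayed copies of a rising edge occurring at edge_time."""
    return [int((t >= edge_time + t_d1) != (t >= edge_time + t_d2)) for t in times]


# With a 200 MHz clock, one counter step lasts 5 ns: the shortest pulse is 5 ns.
print(counter_waveform([2, 5], 8))

# Two delay lines set 2 ns apart (hypothetical 3 ns and 5 ns) give a 2 ns pulse.
print(xor_pulse_width(3000, 5000), "ps")
print(xor_waveform(0, 3000, 5000, range(0, 8000, 1000)))
```

Only one input edge is modelled here. In a real XOR-based generator every edge of the input signal, rising or falling, produces an output pulse.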



The sampled signal is read out through a chip buffer and a board buffer. The board buffers are *AD8074* video buffers. These circuits have a near-unity gain (0.995) and a low offset (3.5 mV). The test-chip makes it possible to characterize the on-chip buffer. Its DC characteristic has been measured by driving its input with a voltage generator. The chip buffer has a gain of 0.95 and an offset of 737 mV in its linear region (0.2 V to 1.5 V). The chip and the board buffers have been chosen for their -3 dB bandwidths of 300 MHz and 500 MHz, respectively. The DC characteristics are plotted in Fig. 106.

Fig. 106 DC Characteristics of the Chip and Board Buffers
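Since both buffers are linear over their operating range, the front-end output can in principle be recovered from the oscilloscope reading by inverting the two DC characteristics. The sketch below assumes a simple model, output = gain × input + offset, with positive offsets and the chip buffer placed before the board buffer. These modelling choices and all names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearBuffer:
    """Assumed DC model of a buffer: v_out = gain * v_in + offset."""
    gain: float
    offset_v: float

    def forward(self, v_in: float) -> float:
        return self.gain * v_in + self.offset_v

    def inverse(self, v_out: float) -> float:
        return (v_out - self.offset_v) / self.gain


# Gains and offsets quoted in the text; the sign of the offsets is assumed.
CHIP_BUFFER = LinearBuffer(gain=0.95, offset_v=0.737)
BOARD_BUFFER = LinearBuffer(gain=0.995, offset_v=0.0035)


def front_end_output(v_scope: float) -> float:
    """Undo the board buffer, then the chip buffer."""
    return CHIP_BUFFER.inverse(BOARD_BUFFER.inverse(v_scope))


v_scope = BOARD_BUFFER.forward(CHIP_BUFFER.forward(1.0))
print(f"scope reading: {v_scope:.4f} V -> front-end output: {front_end_output(v_scope):.4f} V")
```

The correction is only meaningful inside the linear region of the chip buffer (0.2 V to 1.5 V according to the text).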



### 6.1.2. Test Results

# 6.1.2.1. Photodiode Characterization

We have first tested and characterized the photodiode by measuring its responsivity. The responsivity is measured with the following method, based on the APS front-end circuit. The circuit is illuminated with a calibrated light source at a given wavelength and its voltage response is measured. Knowing the current to voltage gain of the front-end circuit and the input light power at the circuit surface, the photodiode responsivity can be deduced. However, this method requires knowing the current to voltage gain of the circuit. We will thus start by measuring it. The current to voltage gain is the combination of the conversion gain, which depends on the photodiode junction capacitor, and the voltage gain of the circuit. The voltage gain is measured with the APS front-end without photodiode. The input of the circuit is driven by a voltage reference and the output voltage is measured after the board buffer. The characteristic of the front-end is extracted from these measurements and is plotted in Fig. 107. The voltage gain of the circuit is 0.44 over the 0 V to 1.5 V voltage range. For higher inputs, the voltage gain starts to decrease.

Fig. 107 Measured Characteristic of the APS Front-End without Photodiode



Now that we have the voltage gain, it is necessary to measure the integration capacitor of the photodiode. Different methods exist to measure this capacitor, such as the "noise squared versus signal" method, which relies on the dependency of the shot noise on the photocurrent. Here, we present a method described in [103]. The measurement is performed using the circuit presented in Fig. 108. It starts by connecting the measurement capacitor $C_M$ (written $C_{ext}$ in Eq. 33) and the photodiode to the reset voltage $V_{rst}$. Then the external switch $SW_{ext}$ and the reset switch $SW_{rst}$ are opened and the photocurrent starts to be integrated on the photodiode junction capacitor. At the end of the integration, the output voltage $V_{int}$ is measured through the front-end circuit and the photodiode is reset using only $SW_{rst}$. A small quantity of charge is withdrawn from $C_M$ to perform the reset. It creates a small voltage drop across the capacitor $C_M$, Eq. 33 (a). This drop is very small. However, by repeating this operation many times ($N_{images}$) and taking into account all the pixels of the cluster (i by j), a significant voltage drop $\Delta V_{ext\ tot}$ appears, Eq. 33 (b). The value of the integration capacitor $C_{int}$ can then be extracted. The parasitic and the trace capacitors can be neglected by using a large enough measurement capacitor (~1 μF). Unfortunately, we did not have time to set up this experiment and we use the integration capacitor value extracted from a simulation (i.e. 200 fF). We then know that the charge to voltage gain of the front-end is 0.32 $\mu$V/e-.

$$\begin{cases} \Delta V_{ext} = V_{int} \frac{C_{int}}{C_{ext} + C_{int}} \approx V_{int} \frac{C_{int}}{C_{ext}} & (a) \\ \Delta V_{ext\ tot} = \sum_{n=1}^{N_{images}} \sum_{i=1}^{3} \sum_{j=1}^{3} V_{int\ n,i,j} \frac{C_{int}}{C_{ext}} & (b) \end{cases}$$

Fig. 108 Measurement Circuit for Integration Capacitor Extraction
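The extraction step implied by Eq. 33 (b) can be sketched as follows. This is a minimal illustration on synthetic numbers, assuming that the external capacitor is much larger than the integration capacitor so that the approximation of Eq. 33 (a) holds. The function name and the example values (other than the 200 fF and 1 μF orders of magnitude quoted in the text) are hypothetical.

```python
def integration_capacitance(delta_v_ext_tot, v_int_samples, c_ext):
    """Solve Eq. 33 (b) for C_int: C_int = dV_tot * C_ext / sum(V_int)."""
    total_v_int = sum(v_int_samples)
    if total_v_int == 0:
        raise ValueError("no integrated signal: C_int cannot be extracted")
    return delta_v_ext_tot * c_ext / total_v_int


# Synthetic experiment: 3x3 pixels, 10 000 images, 1 V integrated per pixel.
C_EXT = 1e-6          # measurement capacitor, about 1 uF as in the text
C_INT_TRUE = 200e-15  # value used to generate the synthetic voltage drop
v_int = [1.0] * (9 * 10_000)
delta_v = sum(v * C_INT_TRUE / C_EXT for v in v_int)

print(f"accumulated drop: {delta_v * 1e3:.1f} mV")
print(f"extracted C_int: {integration_capacitance(delta_v, v_int, C_EXT) * 1e15:.1f} fF")
```

The example shows why many acquisitions are needed: a single reset of one pixel only lowers the voltage of the external capacitor by a fraction of a microvolt.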



Finally, we perform measurements with the APS front-end with photodiodes under a constant illuminance at different wavelengths. To do so, we use the test bench presented in Fig. 109 with a halogen white light source. A given wavelength is selected by a monochromator. The monochromator provides a light beam centered on the selected wavelength with a spectral bandwidth of 20 nm. The spectral bandwidth corresponds to the full width at half maximum of the output light spectrum. Then the light goes through an integrating sphere that generates an isotropic light. The test-chip is placed at the output of the integrating sphere together with a calibrated photodiode. This photodiode is used to measure the illuminance received on the surface of the test-chip. The test bench is driven by a LabVIEW program that controls and monitors the halogen source, the monochromator and the calibrated photodiode.

Fig. 109 Test Bench for Responsivity Measurement



The response of the APS front-end is measured at different wavelengths. Knowing the illuminance at the surface of the photodiode and the current to voltage gain (as the charge to voltage gain is known), the responsivity is extracted and plotted versus the wavelength in Fig. 110. The responsivity increases from 450 nm to 800 nm to reach a maximum of 0.22 A/W and then decreases for higher wavelengths. The shape and the maximum value of the responsivity are those expected for a NWell/PSub photodiode based on [47]. The same experiment has been done with the BDI front-end and the same shape and maximum value have been measured. One hypothesis to explain the small drops of the responsivity at 450 nm and 750 nm is a destructive interference due to the interfaces of the back-end of line (BEOL).

Fig. 110 Responsivity of the NWell/PSub Photodiode



### 6.1.2.2. APS Front-End Circuit

The characteristic of the APS front-end circuit has been measured using the same test bench. The illuminance is adjusted with the LabVIEW software. The illuminance is not strong enough to operate at full speed. Therefore, the characteristic is measured with an integration time of 10 ms at a wavelength of 650 nm. The reset voltage is 2 V. The responsivity of the APS front-end at such a frame rate is $-12\ \text{V/(W/m}^2)$ or $-47\ \text{kV/}\mu\text{A}$. If we convert this value for a frame rate of 5 Mfps, it corresponds to a responsivity of $-0.24\ \text{mV/(W/m}^2)$ or $-0.94\ \text{V/}\mu\text{A}$, which is consistent with the simulations.

Fig. 111 Characteristic of the APS Front-End Circuit
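The conversion between the two frame rates can be reproduced with a simple proportionality rule. The sketch below assumes that the output voltage of an integrating front-end scales linearly with the integration time and that, at 5 Mfps, the integration lasts the whole 200 ns frame period. Both are assumptions made here because they reproduce the figures quoted in the text, and the function name is illustrative.

```python
def scale_responsivity(value, t_int_measured_s, t_int_target_s):
    """Rescale a responsivity measured at one integration time to another."""
    return value * t_int_target_s / t_int_measured_s


T_INT_MEASURED_S = 10e-3    # integration time used on the static bench
T_INT_5MFPS_S = 1 / 5e6     # 200 ns, assuming the full frame period is used

# Measured values from the text: -12 V/(W/m^2) and -47 kV/uA.
per_irradiance = scale_responsivity(-12.0, T_INT_MEASURED_S, T_INT_5MFPS_S)
per_current = scale_responsivity(-47e3, T_INT_MEASURED_S, T_INT_5MFPS_S)

print(f"{per_irradiance * 1e3:.2f} mV/(W/m^2)")
print(f"{per_current:.2f} V/uA")
```

The same rule applies to any integrating front-end, as long as the front-end stays in its linear range at both integration times.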



The speed of the APS front-end has been evaluated by measuring its electronic aperture. The aperture corresponds to the temporal window during which the photocurrent is integrated (i.e. the integration window). Ideally, this window is a rectangular function that starts instantly at the end of the reset operation and stops instantly at the end of the global shutter acquisition. This corresponds to a steep rising edge (resp. falling edge) of the aperture characteristic. For an ideal integration, the contribution of the incoming light to the integrated signal is independent of its arrival time as long as it falls within the integration window. This corresponds to a flat top of the aperture characteristic. The aperture characteristic is extracted from the impulse response of the system. The input pulse is shifted from the beginning of the integration to the end, and the response is measured for each position. The aperture graph is obtained by plotting the response versus the arrival time of the pulse. The timing diagram is presented in Fig. 112 for an ideal system and for a system with a limited time response. For a system with a limited time response, if the light pulse arrives at the end of the integration window, a part of the photocurrent is not integrated. The falling edge of the aperture characteristic is then not steep. This method amounts to reconstructing the aperture characteristic by convolving the system response with a Dirac input.

Fig. 112 Ideal and Real Response of the Integration Stage to a Light Pulse
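The shape of a non-ideal aperture can be illustrated with a deliberately simple model. The sketch below assumes that the charge generated by a Dirac light pulse is collected with a single-pole (exponential) response, and that any charge arriving before the window is fully cleared by the reset. The 40 ns time constant is purely hypothetical and does not come from the measurements.

```python
import math


def aperture(t_pulse_s: float, t_window_s: float, tau_s: float) -> float:
    """Fraction of the pulse charge integrated before the window closes."""
    if t_pulse_s < 0 or t_pulse_s > t_window_s:
        return 0.0  # outside the integration window (reset assumed ideal)
    remaining_s = t_window_s - t_pulse_s
    return 1.0 - math.exp(-remaining_s / tau_s)


T_WINDOW_S = 600e-9  # integration time used for the measured apertures
TAU_S = 40e-9        # hypothetical collection time constant

# Shift the pulse across the window, as the delay generator does on the bench.
for t_ns in range(0, 601, 100):
    value = aperture(t_ns * 1e-9, T_WINDOW_S, TAU_S)
    print(f"pulse at {t_ns:3d} ns -> normalized response {value:.3f}")
```

With this model the top of the aperture is flat as long as the pulse arrives several time constants before the end of the window, and the falling edge softens over a duration of a few time constants.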



The following test bench has been used to measure the aperture window. The test-chip is illuminated by a picosecond laser diode source driven by the generator described in [104]. The laser is pulsed and generates light pulses with a full width at half maximum (FWHM) of 100 ps. Compared with the integration time, this duration is short enough for the laser light pulse to be seen as a Dirac impulse. The laser trigger is generated by the FPGA and synchronized with the APS front-end. The laser trigger is shifted with a delay generator. The light pulse is thus moved over the integration window and the output signal is measured. In doing so, the electronic aperture is measured point by point. The delay generator can generate a shift of less than 5 ps. This precision is sufficient to measure our aperture window, even on its edges.

The electronic aperture of the APS pixel front-end is plotted in Fig. 113 for an integration time of 600 ns. As expected, the rising edge of the electronic aperture is steeper than the falling edge. Indeed, if the light pulse arrives at the beginning of the integration window, the photodiode and the APS integration stage have the whole integration time to perform the charge collection and the voltage copy, respectively. However, at the end of the aperture, the photodiode does not have enough time to collect all the photogenerated charges. This is the case for the charges generated in the substrate, which are collected by diffusion. This diffusion mechanism has a time constant of a few hundred nanoseconds [48]. The decrease at the end of the aperture window, which loses 10% in about 100 ns, could be caused by this phenomenon.



Fig. 113 Electronic Aperture of the APS Front-End for an Integration Time of 600 ns

We did not have time to perform the noise measurements. We briefly present the experimental protocol that we intended to follow to measure the readout noise. The test-chip is placed in the dark and several images are acquired. In dark conditions, the shot noise can be neglected with respect to the readout noise. The readout noise is measured by computing the standard deviation over the different images for a given pixel. It is important to compute it over different image acquisitions, in order to take the sampling noise into account, and for the same pixel, in order to exclude the fixed pattern noise.
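The protocol above can be sketched in a few lines. The code computes, for each pixel, the standard deviation of its value over a set of dark images, so that pixel-to-pixel offsets (fixed pattern noise) do not contribute. The frame layout (a list of images, each a list of rows) and the synthetic dark frames are assumptions made for the illustration.

```python
import random
import statistics


def readout_noise_per_pixel(frames):
    """Temporal standard deviation of each pixel over a list of dark images."""
    n_rows, n_cols = len(frames[0]), len(frames[0][0])
    return [
        [statistics.stdev([frame[r][c] for frame in frames]) for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Synthetic 3x3 dark frames: a fixed per-pixel offset plus 1 mV rms temporal noise.
rng = random.Random(0)
offsets = [[rng.uniform(0.0, 0.05) for _ in range(3)] for _ in range(3)]
frames = [
    [[offsets[r][c] + rng.gauss(0.0, 1e-3) for c in range(3)] for r in range(3)]
    for _ in range(500)
]

noise = readout_noise_per_pixel(frames)
print(f"pixel (0, 0): {noise[0][0] * 1e3:.2f} mVrms")
```

In this synthetic example the per-pixel offsets are fifty times larger than the temporal noise, yet they do not affect the result because each pixel is only compared with itself.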

### 6.1.2.3. Pixel with BDI Stage

The cluster of pixels with a BDI integration stage has been tested with the previous test benches. The characteristic of the BDI stage is plotted in Fig. 114. Under the same conditions as for the APS front-end (integration time of 10 ms, wavelength of 650 nm), the responsivity of the BDI front-end is $-117\ \text{V/(W/m}^2)$ or $-470\ \text{kV/}\mu\text{A}$. If we convert this value for a frame rate of 5 Mfps, it corresponds to $-2.34\ \text{mV/(W/m}^2)$ or $-9.4\ \text{V/}\mu\text{A}$. This value is ten times higher than the measured responsivity of the APS front-end circuit. As expected, the responsivity increases with the illuminance. This is due to the voltage dependence of the integration capacitor. The saturation is reached for an illuminance of 6 mW/m². For higher values, the injection transistor acts like a switch and the photocurrent is integrated on both the integration capacitor and the photodiode junction capacitor.

Fig. 114 Characteristic of the Pixel Front-End with BDI Stage



The electronic aperture of the BDI front-end is measured for the same integration time as the APS pixel front-end (Fig. 115). The electronic aperture shows a sharp rising edge and a flat top. However, the BDI front-end has difficulty integrating the photogenerated charges at the end of the aperture. Indeed, the characteristic takes about 100 ns to turn off. As for the APS pixel front-end, this slow closing may be caused by the limited bandwidth of the photodiode.

Fig. 115 Electronic Aperture of the BDI Front-End Circuit for an Integration Time of 600 ns



The electronic aperture of the BDI front-end has also been measured for an integration time of 200 ns and plotted in Fig. 116. The rising edge of the aperture is steep and occurs in less than 10 ns. The reset transistor is able to maintain the reset voltage on the integration node until the beginning of the integration. The top is not flat and decreases from 90% to 78% during the integration window. This can be explained by the limited bandwidth of the BDI stage, which has difficulty performing the current copy at high speed. Therefore, a part of the photogenerated charges is not integrated. This quantity of non-integrated charges becomes larger as the light pulse approaches the end of the integration window.

Moreover, the top of the aperture characteristic presents some oscillations with an amplitude of 20% and a frequency of 50 MHz. To identify the cause of these oscillations, we have extracted the aperture characteristic from a set of transient simulations. This simulated aperture does not contain such oscillations. Therefore, these oscillations are not due to an instability of the BDI stage. They might be caused by a crosstalk between the integration capacitor and the ADC sampling signal. To confirm this hypothesis, it would be necessary to extract the aperture from post-layout simulations that take parasitic capacitors into account. These oscillations are not observed on the electronic aperture with an integration time of 600 ns. This may be because the sampling step during the top state is 50 ns. This value is almost equal to the period of the expected oscillations (~60 ns for this integration time).

Fig. 116 Electronic Aperture of the BDI Front-End Circuit for an Integration Time of 200 ns



As for the APS pixel front-end, no noise measurement has been carried out. It would be interesting to measure the readout noise for different illuminances. This would make it possible to check whether the readout noise evolves with the photocurrent, as seen in simulation.

### 6.1.3. Front-End Without Current Source

The pixel front-end without current sources has not yet been tested due to a lack of time. However, we want to carry out the following experiments on this structure. Firstly, the characteristic and the aperture of this front-end circuit can be measured using the same test bench as for the other front-ends. The advantage of this front-end is a low power consumption, which depends on the input signal as presented in the previous chapter. It would be interesting to measure the power consumption for different illuminances and verify whether the measurements match the simulations. The power consumption can be measured from the average supply current of the front-end without current source during the quiescent state and during an acquisition. The difference between the two gives the power consumed by the front-end during an acquisition.

# **6.2. 3D Integrated Circuit**

# 6.2.1. Circuit Implementation

A company has given us the opportunity to design a 3D integrated burst image sensor. The manufacturer proposes a 3D integrated circuit made of two tiers. The top tier is manufactured with an imaging technology and the bottom tier with a 40 nm digital technology, as shown in Fig. 117. The top tier is BSI and the IOs are on the periphery of the circuit. The implementation of the burst IS with digital storage differs from the one presented in chapter 3, as there are two tiers instead of three. The pixel front-ends are still implemented on the top tier but the ADCs are shared between the top and the bottom tiers. The burst memories are implemented on the bottom tier. This tier also contains the IS control logic and the IO circuits.

Fig. 117 3D Integrated Burst Image Sensor with Two Tiers



The photodiode is implemented with NWell/PSub junctions. It is made of three NWell/PSub junctions connected in parallel. In doing so, there are more lateral depletion regions in the pixel. Therefore, more photogenerated charges contribute to the drift current, which increases the bandwidth of the photodetector. The drawback is a larger equivalent capacitance of the photodetector. The front-end performs the current to voltage conversion with a BDI circuit. The integration capacitor is 50 fF and thus the conversion gain is 3.2 $\mu$V/e-. The operational amplifier of the BDI is a 5T structure. The front-end also performs a global shutter acquisition thanks to a sample and hold stage. This stage also enables the acquire-while-read mode.
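The conversion gain quoted above follows directly from the ratio of the elementary charge to the integration capacitor. The short sketch below reproduces this arithmetic. It ignores any parasitic capacitance and any voltage gain of the following stages, and the function name is illustrative.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19  # exact SI value of the elementary charge


def conversion_gain_uv_per_electron(c_int_f: float) -> float:
    """Conversion gain q / C_int, expressed in microvolts per electron."""
    return ELEMENTARY_CHARGE_C / c_int_f * 1e6


# 50 fF: integration capacitor of the 3D integrated sensor (from the text).
print(f"{conversion_gain_uv_per_electron(50e-15):.2f} uV/e-")
```

Halving the integration capacitor doubles the conversion gain, which is the reason for the sensitivity improvements discussed in the previous chapter.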



Fig. 118 Timing Diagram of the 3D Integrated Burst Image Sensor

The ADC has to be compact as the 3D stack is composed of only two tiers. The ADC is based on a single slope architecture. It is a very small structure, which makes it possible to implement one ADC per pixel front-end and leaves enough room on the top and bottom tiers to keep a good fill factor and implement a memory. The single slope architecture is composed of a comparator stage and a counter. The comparator is placed on the top tier (i.e. implemented with 130 nm transistors) and the counter on the bottom tier (i.e. implemented with 40 nm transistors). The comparator stage is composed of a capacitive divider and a comparator. In order to have a high gain, the comparator is made of two inverter comparators with auto-zero circuits. The comparison reference corresponds to the threshold voltage of the inverters. The comparison starts with an auto-zero operation, which removes the offset of the comparator. The capacitive divider $(C_1, C_2)$ generates the difference $V_{diff}$ between the pixel signal and the voltage ramp $V_{ramp}$. In doing so, the divider also performs a double sampling of the pixel signal that removes the fixed pattern noise (FPN) of the pixel front-end. As soon as the counter starts, the voltage ramp and thus the $V_{diff}$ signal start increasing. When the $V_{diff}$ signal crosses the comparison reference, the comparator toggles and sends a stop signal to the counter.

The counter requires a working frequency equal to $2^n$ divided by the comparison time, where n is the ADC resolution in bits. Here the ADC targets a frame rate of 5 Mfps and a resolution of 8 bits. Half of the conversion time (200 ns) is used to perform the auto-zero operation. The other half is dedicated to the comparison. The working frequency of the counter is thus 2.4 GHz. This frequency is too high to be generated outside and sent to the counter through an IO pad. Therefore, the counter clock is generated by a PLL circuit implemented on-chip. The PLL is an IP block provided by the manufacturer. This clock signal is distributed to each ADC of the matrix through a clock tree. This architecture is not well suited to reach high frame rates. Indeed, it seems difficult to reach a frame rate higher than 5 Mfps with the same resolution, as 2.4 GHz seems to be a limit in terms of clock frequency. The reset is maintained during the auto-zero operation. The integration time is thus 100 ns.

Fig. 119 Pixel Front-End and ADC of the Top Tier
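The relation between resolution, comparison time and counter frequency can be checked with a small behavioural model of an ideal single slope conversion. The sketch below is an illustration only: the ramp range and the function names are hypothetical, and the comparator, the auto-zero phase and the capacitive divider are not modelled. Applied to 8 bits and a 100 ns comparison window, the expression gives 2.56 GHz, which is of the same order as, but not identical to, the 2.4 GHz quoted in the text. The sketch does not try to explain that difference.

```python
def counter_frequency_hz(resolution_bits: int, comparison_time_s: float) -> float:
    """Clock frequency needed to count 2^n steps during the comparison time."""
    return (2 ** resolution_bits) / comparison_time_s


def single_slope_convert(v_in, v_start, v_stop, resolution_bits):
    """Ideal conversion: count clock cycles until a linear ramp reaches v_in."""
    n_codes = 2 ** resolution_bits
    lsb = (v_stop - v_start) / n_codes
    code = 0
    # The counter runs until the ramp crosses the input or the last code is reached.
    while code < n_codes - 1 and v_start + (code + 1) * lsb <= v_in:
        code += 1
    return code


print(f"{counter_frequency_hz(8, 100e-9) / 1e9:.2f} GHz")

# Hypothetical 0 V to 1 V ramp, 8-bit resolution.
for v in (0.0, 0.25, 0.5, 1.0):
    print(f"{v:.2f} V -> code {single_slope_convert(v, 0.0, 1.0, 8)}")
```

Each additional bit of resolution doubles the required counter frequency for a given comparison time, which illustrates why this architecture is hard to push beyond the targeted frame rate.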



The burst memory stores the output value of the ADC at each conversion, as illustrated in Fig. 120. The memory is not implemented with SRAM bit cells due to the limited development time. The burst memory is thus a FIFO memory (i.e. a shift register) made of D flip-flops. This implementation is unfavorable in terms of power consumption, as every register of the FIFO memory is written at each image storage. In order to reduce this power consumption, the memory is divided into four FIFOs. The FIFOs are written one after the other under the control of the pixel control logic. The memories are connected to the column bus through a 4:1 tristate multiplexer. This solution offers a memory depth of 52 images. It is far below the memory depth that we could expect with static memory bit cells.

Fig. 120 ADC and Memory of the Bottom Tier
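The benefit of splitting the burst memory can be illustrated with a behavioural model. The sketch below assumes that the 52 images are split equally between the four FIFOs and that successive images are written to the FIFOs in round-robin order. Neither detail is stated in the text, and the class and attribute names are hypothetical. The number of register writes is used as a crude proxy for the dynamic power.

```python
from collections import deque


class BurstMemory:
    """Shift-register burst memory split into several FIFOs (behavioural model)."""

    def __init__(self, n_fifos: int = 4, total_depth: int = 52):
        if total_depth % n_fifos:
            raise ValueError("total depth must be a multiple of the FIFO count")
        self._depth = total_depth
        self._fifos = [deque(maxlen=total_depth // n_fifos) for _ in range(n_fifos)]
        self._next = 0             # FIFO that receives the next image
        self._stored = 0           # number of images written so far
        self.register_writes = 0   # every register of a shifting FIFO is written

    def store(self, code: int) -> None:
        fifo = self._fifos[self._next]
        fifo.append(code)          # the oldest entry is dropped when the FIFO is full
        self.register_writes += fifo.maxlen
        self._next = (self._next + 1) % len(self._fifos)
        self._stored += 1

    def read_out(self) -> list:
        """Return the stored codes, oldest first."""
        n = len(self._fifos)
        start = self._next if self._stored >= self._depth else 0
        longest = max(len(f) for f in self._fifos)
        ordered = []
        for i in range(longest):
            for j in range(n):
                fifo = self._fifos[(start + j) % n]
                if i < len(fifo):
                    ordered.append(fifo[i])
        return ordered


split, single = BurstMemory(4, 52), BurstMemory(1, 52)
for image_index in range(54):
    split.store(image_index)
    single.store(image_index)

print(split.read_out()[:3], len(split.read_out()))
print(single.register_writes // split.register_writes)
```

In this model both organizations keep the last 52 images, but the split version writes four times fewer registers per stored image.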



# 6.2.2. Simulation Results

Unfortunately, the 3D integrated burst image sensor has not yet been tested and only simulation results are presented here. The characteristic of the pixel front-end has been simulated and is plotted in Fig. 121. The responsivity at the BDI stage output is -3 V/$\mu$A, which is about four times lower than that of the BDI stage implemented in the test-chip. This is due to the integration capacitor of 50 fF and the integration time of 100 ns, which are respectively two times higher and two times lower than in the test-chip BDI front-end. This front-end therefore requires four times more illuminance to reach saturation. Indeed, the plots show that the saturation is reached for 370 nA, i.e. when the injection transistor starts acting like a switch. This BDI stage has a good linearity thanks to the integration capacitor, which is implemented with a metal/oxide/metal structure.

Fig. 121 Characteristics and Gain of the BDI Pixel Front-End of the 3D Integrated Burst Image Sensor



The transient noise simulations over a 10 GHz bandwidth show that the readout noise at the $V_{AZin}$ node depends on the input signal (Fig. 122). This is due to the BDI stage, which is biased by the photocurrent as presented previously. The readout noise reaches a maximum of 835 $\mu$Vrms for a photocurrent of 350 nA. The dynamic range of the front-end is 64 dB. This circuit confirms that the dynamic range of the sensor is limited by the dynamic range of the ADC and not by the pixel front-end, as demonstrated in 5.2.6.

Fig. 122 Voltage Noise on the Pixel Front-End Output of the 3D Integrated Burst Image Sensor



Different simulations, not presented here, have been carried out on the single slope ADC to validate its design. The ramp generation is an important element that affects the performance of the single slope ADC. Different architectures exist, based on a resistive or capacitive DAC, a switched capacitor circuit or a CTIA [105]. However, due to the limited development time for this image sensor, the voltage ramp is generated outside the chip and is propagated from left to right through the pixel array. Therefore, the ramp signal arrives with a different delay depending on the pixel. As the counters of the pixels are synchronized by the clock tree, this delay creates an offset in the A/D conversion characteristic. This offset can be removed by calibration and post-processing. In layout, the line that propagates the ramp has to be protected from potential aggressors (clock, control signals) to prevent noise on the ramp signal. Moreover, as the ramp is generated off-chip, the signal goes through the wire bonding, as shown in Fig. 102. The inductance of the wire bonding creates oscillations at the corner point at the beginning of the voltage ramp. The error between the ideal ramp and the actual ramp is plotted in Fig. 123. These oscillations should not be a limitation for our application, as their maximum amplitude (1.5 mV) is below the voltage LSB for an 8-bit resolution.

[Plot: ramp voltage error (vertical scale x 10<sup>-4</sup>) versus time (s, x 10<sup>-7</sup>, from 3 to 3.2)]

Fig. 123 Voltage Error on the Ramp Signal due to the Wire Bonding

The layout of the two tiers of the pixel is presented in Fig. 124. The fill factor of the pixel is 80 % and on the digital tier, the major part of the pixel is dedicated to the register memory.

Fig. 124 Top and Bottom Tier Pixel Layout and Full Chip Bottom Layout



#### 6.3. Conclusion

This section has presented the measurements performed on a test-chip to evaluate different pixel front-end circuits. First, the photodiode responsivity has been measured over the visible spectrum. The measurements correspond to the expected responsivity of a NWell/PSub photodiode and reach a maximum of 0.22 A/W at 800 nm. Then, the tests conducted on the APS pixel front-end show that this structure is compliant with our requirements. It offers a responsivity of -0.24 mV/W.m<sup>2</sup> at 5 Mfps. The electronic aperture has been measured and demonstrates that this front-end is able to work at 1.6 Mfps. This measurement has not been done at 5 Mfps, but the APS-based front-end should work at such a frequency. The same experiments have been carried out on the BDI pixel front-end. As expected, this structure improves the sensitivity, with a responsivity of -2.34 mV/W.m<sup>2</sup> at 5 Mfps. However, this front-end has a poor linearity because the value of the integration capacitor varies with the input signal. The measurement of the electronic aperture demonstrates that this front-end can work at 5 Mfps, but the aperture window is not flat. Such an aperture causes a time-dependent response of the sensor to a light pulse shorter than the integration time. Finally, an implementation of a 3D integrated burst image sensor is presented in this chapter. This proof of concept demonstrates the possibility of acquiring a burst of images at 5 Mfps while converting the signal into digital data with only two tiers. This image sensor has a memory depth of 52 images and a resolution of 8 bits.

To complete these tests, it would be interesting to measure the readout noise of the different pixel front-end circuits of the test-chip. It would also be necessary to test the pixel front-end without current source and to measure its power consumption. For the 3D integrated image sensor, several improvements could be made. A first step would be to generate the ramp signal on-chip to suppress the oscillations created by the wire bonding inductances. It would also be interesting to design a burst memory based on RAM bit cells to drastically increase the memory depth. Moreover, if the technology is available, it would be interesting to study an integration stage based on a transfer gate. Indeed, this technology is suited to this range of frame rates and can drastically increase the conversion gain and the sensitivity.

# 7. Conclusion and Perspectives

The aim of this PhD work was to identify and study burst image sensor architectures based on 3D integration technologies. After a review of the state of the art of burst image sensors and of 3D integration technology, it appeared that this technology enables on-chip A/D conversion of the burst of images. We then proposed two 3D integrated burst image sensor architectures, one with analog storage of the image burst and another with digital storage. An assessment of the performance of those architectures showed that, for a given dynamic range, the burst CIS with analog storage reaches a frame rate twenty times higher than the burst CIS with digital storage. This speed limitation of the digital storage architecture is due to the limited conversion density of current ADCs. However, for a dynamic range of 50 dB, the burst CIS with digital storage offers a larger memory depth, which almost reaches two thousand images with static bit cells and can exceed six thousand images with dynamic bit cells. We then chose to study the digital storage architecture as it takes full advantage of the 3D integration technology and would be impossible without it. One main drawback of this architecture is its high power consumption, which can cause overheating. To prevent this risk, we carried out a thermal study which demonstrates the need to use a heat sink and to turn off the ADCs and the pixel current sources during the burst reading operation. This work confirms the feasibility of such an architecture in single burst recording as long as the acquisition time is below 1.5 s. For multi-burst recording, it ensures a junction temperature below 125 °C for the pre-event and post-event triggering modes as long as the acquisition time does not exceed 4 ms. We then took a closer look at the pixel front-end circuit. A pixel inspired by the APS structure has been designed. This front-end is compliant with the specifications of 5 Mfps for a dynamic range of 50 dB. However, its sensitivity is 2.1 V/lux/s and its power consumption is 226 μW/pixel. Therefore, we have proposed a solution based on a buffered direct injection circuit to increase the sensitivity to 45.7 V/lux/s. This circuit acquires images at 5 Mfps but for a limited dynamic range of 58 dB and has a signal-dependent SNR. To reduce power consumption, we have proposed a pixel without current sources. This structure consumes only 10 μW/pixel but has a poor SNR due to the jitter effect on the sampling signals. The dynamic range of this front-end circuit is limited to 42 dB. Finally, we have carried out tests on a chip that implements the designed pixel front-end circuits. The responsivity of the NWell/PSub photodiode has been measured and reaches a maximum of 0.22 A/W at 800 nm. The APS pixel front-end characteristic has been measured at low frame rate and shows a responsivity of 0.2 mV/W.m<sup>2</sup> at 5 Mfps. The sensitivity is 1.6 V/lux/s. The electronic aperture has been measured and demonstrates the proper functioning of the circuit at 1.6 Mfps. The APS pixel front-end should function well at 5 Mfps. The same measurements have been carried out on the BDI pixel front-end. This front-end circuit improves the responsivity to 2.34 mV/W.m<sup>2</sup> at 5 Mfps but has a poor linearity. The sensitivity is 19 V/lux/s. The measured electronic aperture demonstrates that this front-end can record at 5 Mfps. However, the aperture window is not flat and shows some oscillations, likely due to a capacitive coupling of a storage node with a control signal. At the end of this work, we have presented a proof of concept of a 3D integrated burst image sensor with digital storage. This circuit demonstrates the possibility of acquiring a burst of images at 5 Mfps while converting the signal into 8-bit digital data with only two tiers.

There are different ways to continue and improve this work on 3D integrated burst CIS with digital storage. The first step should be to complete the thermal study with an evaluation of the thermal runaway risk. This work could be done by applying the method presented in [88] with some inputs from the memory manufacturer about the leakage currents. Moreover, the sensor could be modeled by considering the mechanical stress due to thermal expansion, thanks to a mechanical add-on to the electro-thermal simulator [89].

A part of this PhD work has been to improve the sensitivity of the burst image sensor by increasing the conversion gain with a buffered direct injection pixel. However, an alternative is to work on the photodetector rather than on the pixel circuit. Indeed, at the targeted frame rate of 5 Mfps, the pinned photodiode and the PNP phototransistor have a sufficient bandwidth and respectively provide an improved conversion gain and an improved responsivity. Even if these solutions require specific technology options, the benefit in terms of sensitivity is very attractive, as it does not require extra power consumption, contrary to circuit-based solutions. We also proposed a pixel without current source to reduce the power consumption. However, with this solution the pixel and the multiplexer stage are no longer biased and the signal to noise ratio of the front-end circuit is degraded. An alternative could be to implement two global shutter storages in the pixel. The first memory is written with the current image while the other memory, which contains the previous image, is converted by the ADC. In doing so, the SF buffer that performs the global shutter has the whole integration time to settle its output on the pixel memory and can have a smaller bias current. The same idea could be used for the load of the ADC input to reduce the power consumption of the multiplexer stage. The advantage of this solution is a reduction of the power consumption without SNR loss.

The ADC is the key element of the 3D integrated burst CIS with digital storage, as it sets the frame rate and the resolution of the sensor. The SAR architecture is an interesting solution to implement the ADC. This architecture has been selected for a burst image sensor project carried out by the laboratory. This burst CIS acquires a video at 10 Mfps with an 8-bit resolution. In this project, the ADC is shared between ten pixels. A binning feature has been implemented between the pixels. The pixel photodiodes can be connected in parallel to a single current to voltage conversion stage. A large pixel is then created, which is sampled at the ADC conversion frequency (i.e. 100 MHz). This feature allows a tradeoff between the frame rate and the spatial resolution. Another idea for the A/D conversion is to design a tunable Sigma Delta (SD) ADC. A short study of a Sigma Delta ADC has shown that it is possible to implement one SD ADC per pixel (50x50 $\mu$m²). One feature of the Sigma Delta ADC is a possible tradeoff between conversion frequency and resolution. Indeed, for a given oversampling rate, the ADC resolution and the conversion frequency are set by the digital decimator. By tuning the decimator, it would be possible to increase the frame rate for a reduced pixel resolution.

Finally, some work on the digital burst memory could be done. Discussions with industry engineers working on SRAM design have shown that there is more to gain from working on the memory architecture than on the bitcell. Indeed, standard SRAM or DRAM blocks are usually not suited to our application, which requires small blocks of memory that sequentially access the same bitcell address. Some work can be done to design such a memory architecture tailored for our 3D integrated burst CIS.

# **Annex Section**

#### Annex A

Like every photodetector, the photodiode is prone to shot noise. Moreover, the pixel front-end of the analog storage architecture is affected by the shot noise and the readout noise.



# **Shot Noise:**

The shot noise is due to the discrete nature of the electrons that create the photocurrent. Its variance is equal to the average number *N* of electrons that generate the photocurrent. Its standard deviation in number of charges is thus given by:

$$\sigma_{ShotNoise\ IR} = \sqrt{N} = \sqrt{\frac{I_{ph}}{q}\ t_{int}}\ \ [\#rms]$$

With $I_{ph}$ the photocurrent value (the dark current shot noise is neglected), q the charge of the electron and $t_{int}$ the integration time. As the photocurrent is integrated on a capacitor $C_{pd}$, the shot noise can be expressed in voltage thanks to the conversion gain $q/C_{pd}$:

$$\sigma_{VShotNoise\ IR} = \sqrt{\frac{I_{ph}}{q} t_{int}} \times \frac{q}{C_{pd}} [Vrms]$$
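As a numerical illustration of the two shot noise expressions, the following minimal sketch evaluates them for an example operating point. The function names and the chosen values (100 nA, 100 ns, 200 fF) are assumptions made for this illustration and are not results from this annex.

```python
import math

Q_E = 1.602176634e-19  # electron charge (C)


def shot_noise_electrons(i_ph, t_int):
    """Shot noise standard deviation in electrons: sqrt(I_ph * t_int / q)."""
    return math.sqrt(i_ph * t_int / Q_E)


def shot_noise_volts(i_ph, t_int, c_pd):
    """Same noise converted to volts with the conversion gain q / C_pd."""
    return shot_noise_electrons(i_ph, t_int) * Q_E / c_pd


# Illustrative operating point (assumed values, not taken from the annex)
sigma_e = shot_noise_electrons(100e-9, 100e-9)
sigma_v = shot_noise_volts(100e-9, 100e-9, 200e-15)
print(f"{sigma_e:.1f} e- rms, {sigma_v * 1e6:.1f} uV rms")
```

For this example point, about 62 000 electrons are integrated, so the noise is close to 250 electrons rms, i.e. roughly 200 $\mu$Vrms on 200 fF.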

# **Reset Noise Contribution:**

The reset noise is due to the sampling process of the reset voltage on the junction capacitor  $C_{pd}$  of the photodiode. During this operation, the thermal noise of the switch is also sampled through the first order system made by the on resistor of the switch and the junction capacitor. The noise value is independent of the switch resistor and is given by:

$$\sigma_{Rst} = \sqrt{\frac{KT}{C_{pd}}} \quad [Vrms]$$

### SF Buffer Noise Contribution:

The common drain MOSFET of the source follower buffer has two noise contributions, the thermal noise and the flicker noise. The input referred power spectral density is given by:

$$PSD_{SF\ IR} = \frac{4KT\gamma}{g_{m1}} + \frac{1}{f} \frac{K_f}{C_{ox}WL} \left[ V^2 / Hz \right]$$

The first term corresponds to the thermal noise with K the Boltzmann constant, T the temperature, $\gamma$ a constant which depends on the process and the transistor mode (2/3 in our case), and $g_{m1}$ the transistor trans-conductance. The second term corresponds to the flicker noise with $K_f$ a process-dependent constant, $C_{ox}$ the gate capacitance per unit area and WL the gate area of the transistor. Only the contribution of the thermal noise will be taken into account to compute the output referred readout noise.

### **Output Referred Readout Noise:**

To compute the readout noise, we consider here that the SF buffer transfer function  $H_{SF}$  has a DC gain  $H_0$  of 0.8 and a bandwidth  $BW_{SF}$  limited by its trans-conductance. The output referred noise is thus given by:

$$\sigma_{Vout} = \sqrt{\sigma_{VShotNoise\,IR}^2 \times H_0^2 + \sigma_{Rst}^2 \times H_0^2 + \int PSD_{SF\,IR} \times H_{SF}(f)^2 df}$$

The upper limit of the integration is set by the SF bandwidth:

$$BW_{SF} = \frac{g_{m1}}{2\pi C_{load}}$$

The PSDs are integrated over the full bandwidth, which is a worst case since the finite integration time should set a lower frequency bound on the integration.

$$\sigma_{Vout} \cong \sqrt{\left(\frac{I_{ph}qt_{int}}{C_{pd}^{2}} + \frac{KT}{C_{pd}}\right) \times H_{0}^{2} + \left(\frac{4KT\gamma}{g_{m1}}\right) \times H_{0}^{2} \times \frac{\pi}{2}BW_{SF}} [Vrms]$$

By injecting the bandwidth expression in the previous formula, it appears that the readout noise does not depend on the transistor trans-conductance.
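The cancellation can be made explicit with a short worked step that uses only the symbols defined above. Substituting $BW_{SF}$ into the thermal noise term of the source follower gives:

$$\frac{4KT\gamma}{g_{m1}} \times H_{0}^{2} \times \frac{\pi}{2} \times \frac{g_{m1}}{2\pi C_{load}} = \frac{KT\gamma H_{0}^{2}}{C_{load}}$$

so that the output referred noise reduces to:

$$\sigma_{Vout} \cong \sqrt{\left(\frac{I_{ph}qt_{int}}{C_{pd}^{2}} + \frac{KT}{C_{pd}}\right) \times H_{0}^{2} + \frac{KT\gamma H_{0}^{2}}{C_{load}}} \quad [Vrms]$$

In this form, the source follower contribution behaves like a KT/C noise on the load capacitor, scaled by $\gamma H_{0}^{2}$.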

#### Annex B

Each chip of the stacked integrated circuit is composed of a silicon substrate, a device layer that corresponds to the place where the power is dissipated inside the substrate, and a back-end-of-line layer. Due to the good conductivity of the silicon, we do not take into account the TSVs for the conductivity and heat capacity of the substrate layer. The chips are connected together by chip-to-chip bonding. The chip is placed in a flip chip ball grid array (FCBGA) package from Amkor. The stacked integrated circuit is connected to the package by a chip to package bonding. The package body size is 29x29 mm² and the BGA is composed of 784 balls. We consider that the package is made of Al<sub>2</sub>O<sub>3</sub> ceramic. An optical lid made of borosilicate glass is placed above the package cavity. This package is then mounted on a printed circuit board made of FR4 material. An illustration of the thermal model used for finite element simulation is presented below. Please note that the illustration is not to scale.



# **Back End Of Line:**

The BEOL can be seen as a successive stack of copper layers and silicon dioxide (dielectric) layers. The height of the BEOL is 15 $\mu$m, based on the design rule manual of a 130 nm technology. We considered that the BEOL is a mix of copper (25%) and silicon dioxide (75%) materials. Their thermal conductivities are respectively 385 and 1.4 W/K/m. The equivalent thermal conductivity is computed with a weighted average:

$$\lambda_{BEOL} = 0.25 \times \lambda_{Cu} + 0.75 \times \lambda_{SiO2} = 97.25 \left[ \frac{W}{K m} \right]$$

The equivalent volumetric heat capacitance is also computed with a weighted average method based on the density and the heat capacity of the two materials.

$$C_{th\,V\,BEOL} = 0.25 \times C_{th\,Cu} \times d_{Cu} + 0.75 \times C_{th\,SiO2} \times d_{SiO2} = 2.5 \left[ \frac{MJ}{K\,m^3} \right]$$
 with  $C_{th\,Cu} = 386 \frac{J}{kg\,K}$ ,  $d_{Cu} = 8960 \frac{kg}{m^3}$ ,  $C_{th\,SiO2} = 1000 \frac{J}{kgK}$ ,  $d_{SiO2} = 2200 \frac{kg}{m^3}$ 
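The two weighted averages can be reproduced with a few lines of code. This is a minimal sketch that only uses the material data quoted above; the helper name `volume_weighted` is hypothetical.

```python
def volume_weighted(pairs):
    """Weighted average of (volume_fraction, value) pairs."""
    return sum(fraction * value for fraction, value in pairs)


# Material data quoted in the text
LAMBDA_CU, LAMBDA_SIO2 = 385.0, 1.4      # thermal conductivity (W/K/m)
CP_CU, RHO_CU = 386.0, 8960.0            # heat capacity (J/kg/K), density (kg/m^3)
CP_SIO2, RHO_SIO2 = 1000.0, 2200.0
F_CU, F_SIO2 = 0.25, 0.75                # assumed BEOL composition

lambda_beol = volume_weighted([(F_CU, LAMBDA_CU), (F_SIO2, LAMBDA_SIO2)])
cv_beol = volume_weighted([(F_CU, CP_CU * RHO_CU), (F_SIO2, CP_SIO2 * RHO_SIO2)])
print(f"lambda = {lambda_beol:.1f} W/K/m, C_v = {cv_beol / 1e6:.2f} MJ/K/m^3")
```

With these inputs the sketch gives roughly 97.3 W/K/m and 2.5 MJ/K/m³, close to the values quoted above.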

#### Chip to chip bonding:



Based on the Open 3D offer of the CEA Leti, the chip to chip bonding is made of copper micro bumps (top die) soldered with a tin-silver alloy to copper micro pillars (bottom die). The dimensions of the micro bumps and micro pillars are given below. The bonding pitch is 50 $\mu$m and its diameter is 25 $\mu$m. The bonding is used at its maximal density (i.e. one bond per pixel). Therefore, 20 % of this layer is made of bonding elements. We consider that there is no thermal conduction through the air between the bonds. The material that limits the thermal conductivity of the bonding is SnAg. Therefore, only SnAg is considered for the computation of the equivalent conductivity of the bonding. Copper has a higher heat capacity than SnAg. Therefore, only copper is considered for the computation of the equivalent heat capacity of the bonding. We consider that the bonding layer has an equivalent height of 20 $\mu$m and a cylindrical shape.

$$\lambda_{C2Cbonding} = 0.2 \times \lambda_{SnAg} = 15.6 \left[ \frac{W}{K m} \right]$$

$$with \ \lambda_{SnAg} = 78 \frac{W}{K m}$$

$$C_{th \ V \ C2Cbonding} = 0.2 \times C_{th \ Cu} \times d_{Cu} = 691 \left[ \frac{kJ}{K \ m^3} \right]$$

#### Chip to package bonding:





The chip bonding pitch is set to 140 $\mu m$, which is above the minimum pitch specification of the Open 3D technology. This choice is made to match the specifications of the Amkor FCBGA package. The chip bonding diameter is set to 70 $\mu m$, which is above the minimum diameter specification. Therefore, 20 % of this layer is made of bonding elements. The bonding bump is made of copper and SnAg alloy. The thermal conductivity and the heat capacity of this layer are computed using the same approximations as for the chip to chip bonding. The thermal conductivity and the volumetric heat capacity are equal to those of the chip to chip bonding layer. The equivalent height of this layer is 40 $\mu m$.

# **Ball Grid Array:**



The Amkor FCBGA package specifications set the dimensions and the material of the BGA. We consider here a BGA made of tin-lead alloy. The pitch of the BGA is 1 mm and the diameter of a ball is 500  $\mu$ m. Therefore, 20 % of this layer is made of balls. As for the previous layers, we consider that there is no thermal conduction through the air. The equivalent height of this layer is 300  $\mu$ m.

$$\lambda_{BGA} = 0.2 \times \lambda_{SnPb} = 10 \left[ \frac{W}{K m} \right]$$

$$with \ \lambda_{SnPb} = 50 \frac{W}{K m}$$

$$C_{th \ V \ BGA} = 0.2 \times C_{th \ SnPb} \times d_{SnPb} = 294 \left[ \frac{kJ}{K \ m^3} \right]$$

$$with \ C_{th \ SnPb} = 167 \frac{J}{kg \ K}, d_{SnPb} = 8800 \frac{kg}{m^3}$$
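The 20 % fill fraction used for the three bonding layers and the resulting equivalent properties can be checked with the short sketch below. It assumes circular bonds on a square grid, which is an assumption of this illustration; the function names are hypothetical.

```python
import math


def fill_fraction(pitch, diameter):
    """Area fraction covered by circular bonds placed on a square grid."""
    return math.pi * (diameter / 2) ** 2 / pitch ** 2


def layer_equivalents(fill, lam, cp, rho):
    """Equivalent conductivity (W/K/m) and volumetric heat capacity (J/K/m^3)
    of a layer where only the bonds conduct (air is neglected)."""
    return fill * lam, fill * cp * rho


# Pitch and diameter of each layer, as given in the text (metres)
for name, pitch, diameter in [("chip to chip", 50e-6, 25e-6),
                              ("chip to package", 140e-6, 70e-6),
                              ("BGA", 1e-3, 500e-6)]:
    print(f"{name}: fill = {fill_fraction(pitch, diameter):.3f}")

# Equivalent properties with the rounded 20 % fill used in the equations above
lam_c2c, cv_c2c = layer_equivalents(0.2, lam=78.0, cp=386.0, rho=8960.0)
lam_bga, cv_bga = layer_equivalents(0.2, lam=50.0, cp=167.0, rho=8800.0)
```

Since the diameter is half the pitch in all three layers, the geometric fill is π/16, about 19.6 %, which is rounded to 20 % in the equations of this annex.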

The thermal conductivity and the volumetric heat capacity of each layer are summed up in the next table.

| Slice                              | λ<br>(W.K <sup>-1</sup> .m <sup>-1</sup> ) | C <sub>P,V</sub> (J.K <sup>-1</sup> .m <sup>-3</sup> ) | h<br>(μm) |
|------------------------------------|--------------------------------------------|--------------------------------------------------------|-----------|
| BEOL                               | 97.25                                      | 2.5 M                                                  | 15        |
| Device                             | 163                                        | 1.64 M                                                 | 5         |
| Substrate<br>(top chip)            | 163                                        | 1.64 M                                                 | 15        |
| Substrate<br>(middle, bottom chip) | 163                                        | 1.64 M                                                 | 40        |
| Interco bonds                      | 19.5                                       | 864 k                                                  | 20        |
| Interco balls                      | 19.5                                       | 864 k                                                  | 40        |
| Package<br>(Al <sub>2</sub> O₃)    | 32                                         | 3.1 M                                                  | 500       |
| BGA                                | 12.5                                       | 367 k                                                  | 300       |
| PCB FR-4                           | 0.3                                        | 2.6 M                                                  | 2000      |
| Borosilicate Glass                 | 1.14                                       | 1.78 M                                                 | 300       |
| Air                                | 0.0271                                     | 1.132 k                                                | 200       |

Based on the FEM data values, the thermal resistors of the static model are summarized here.

|                          | Thermal Resistor<br>(K/W) |
|--------------------------|---------------------------|
| R <sub>CApcb</sub>       | 330                       |
| R <sub>pcb</sub>         | 16.5                      |
| R <sub>package</sub>     | 0.1                       |
| R <sub>bot-package</sub> | 5.6 m                     |
| R <sub>mid-bot</sub>     | 3.5 m                     |
| R <sub>top-mid</sub>     | 2.9 m                     |
| R <sub>air-top</sub>     | 0.3 m                     |
| R <sub>air</sub>         | 17.5                      |
| R <sub>glass</sub>       | 0.6575                    |
| R <sub>CAairtop</sub>    | 392                       |

# **Annex C**

### • Circuit and Stability Analysis:

The pixel front-end based on a resistive current to voltage conversion is composed of two stages. The first stage is an RTIA circuit with a current to voltage gain of 100 dB and the second stage is a voltage amplifier with a gain of 40 dB.



We first study the RTIA stage. The conversion resistor $R_{conv}$ is equal to 100 k$\Omega$ to reach the gain of 100 dB and the junction capacitor $C_{PD}$ of the NWell/PSub photodiode is 200 fF. For the reasons presented in Section 5.2.2, we target a bandwidth of 5 MHz for the RTIA stage. The transfer function of the RTIA stage is given by the following equation, where A is the gain of the amplifier.

$$\frac{V_{conv}}{I_{pd}} = R_{conv} \frac{A}{(1+A)} \times \frac{1}{1 + s \frac{R_{conv} C_{PD}}{(1+A)}}$$

The computation of the bandwidth with an op-amp gain of three gives a value of 32 MHz which is above our aim. A simple solution to reduce the bandwidth is to increase the photodiode junction capacitor  $C_{PD}$ . By using a P+/NWell/PSub photodiode which has a junction capacitor of 1.34 pF (c.f. 2.4.1), the bandwidth of the RTIA is then 4.7 MHz. Another advantage of using P+/NWell/PSub photodiode is a higher responsivity of the photodetector compared to the NWell/PSub photodiode. An op-amp with a gain of three can be implemented with the following structure.
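The two bandwidth values can be checked from the pole of the transfer function above, $BW_{-3dB} = (1+A)/(2\pi R_{conv} C_{PD})$. The sketch below is a minimal illustration; the function name is hypothetical.

```python
import math


def rtia_bandwidth(r_conv, c_pd, a):
    """-3 dB bandwidth of the RTIA: (1 + A) / (2 * pi * R_conv * C_PD)."""
    return (1 + a) / (2 * math.pi * r_conv * c_pd)


R_CONV = 100e3   # conversion resistor (ohm)
A = 3            # op-amp gain

bw_nwell = rtia_bandwidth(R_CONV, 200e-15, A)    # NWell/PSub photodiode
bw_pplus = rtia_bandwidth(R_CONV, 1.34e-12, A)   # P+/NWell/PSub photodiode
print(f"{bw_nwell / 1e6:.1f} MHz, {bw_pplus / 1e6:.2f} MHz")
```

This gives about 31.8 MHz with the 200 fF photodiode and about 4.75 MHz with the 1.34 pF photodiode, in line with the 32 MHz and 4.7 MHz quoted in the text.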



The stability of the RTIA stage depends on the op-amp bandwidth and the feedback loop. To study the stability, the feedback loop is opened at the negative input of the op-amp. The direct gain A is given by the transfer function of the op-amp on its load. The feedback gain B is given by the low pass filter made by the conversion resistor and the photodiode junction capacitor. The load capacitor $C_L$ depends on the input impedance of the voltage amplifier stage. We consider here a $C_L$ of 10 fF, but for such a value the transfer function of the op-amp has a bandwidth limited by the photodiode junction capacitor $C_{PD}$. For an op-amp trans-conductance $g_m$ of 100 $\mu$A/V, the cutoff frequency $f_A$ of the op-amp is 12 MHz. As the cutoff frequency $f_B$ of the feedback loop is 1.1 MHz, the cutoff frequency $f_{OL}$ of the open-loop transfer function is about 5 MHz. The system is thus stable, as the phase has only lost 90° at 5 MHz. In terms of power consumption, to provide a trans-conductance of 100 $\mu$A/V, the op-amp requires a current of 10 $\mu$A.



The second stage is made of a voltage amplifier with a gain of 40 dB. The sampling operation is performed in 20 ns and defines the -3dB bandwidth of this stage. We chose to charge the output to 95 % of its final value, i.e. over three time constants. The required bandwidth is then 24 MHz. For a voltage amplifier, the gain bandwidth (GBW) product of the amplifier is equal to the GBW product of the op-amp [96].



For an op-amp implemented with a 5-transistor structure, the gain bandwidth product *GBW* of this voltage amplifier is given by the following equation, with $g_m$ the trans-conductance of the long-tailed pair and $C_{gs}$ the load. The trans-conductance depends on the drain current $I_d$ (half of the bias current) and on the difference between the gate-source voltage $V_{GS}$ and the threshold voltage $V_{th}$.



For a sampling capacitor $C_{GS}$ of 200 fF, the 24 MHz bandwidth and the 40 dB gain impose a trans-conductance of 3 mA/V. The bias current $I_{Bias}$ depends on the trans-conductance and the overdrive voltage ($V_{GS}$-$V_{th}$). For this study, we chose an overdrive voltage of 0.2 V. The bias current is then 600 $\mu$A. With this bias current value, the slew rate constraint is verified.
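The sizing chain of this second stage can be sketched numerically. The relations used here are assumptions of this illustration, since the corresponding equations are not reproduced in this extract: a single-pole op-amp with $GBW = g_m/(2\pi C_{GS})$, and a square-law pair with $g_m = 2I_d/(V_{GS}-V_{th})$ and $I_{Bias} = 2I_d$. The function names are hypothetical.

```python
import math


def settling_bandwidth(t_sample, n_tau=3):
    """-3 dB bandwidth so that t_sample spans n_tau time constants."""
    tau = t_sample / n_tau
    return 1 / (2 * math.pi * tau)


def required_gm(bandwidth, gain_db, c_load):
    """gm needed for a given closed-loop bandwidth and gain,
    assuming GBW = gm / (2 * pi * C_load)."""
    gbw = bandwidth * 10 ** (gain_db / 20)
    return 2 * math.pi * gbw * c_load


def bias_current(gm, v_overdrive):
    """Tail current of the pair: gm = 2*Id/Vov and Ibias = 2*Id, so Ibias = gm*Vov."""
    return gm * v_overdrive


bw = settling_bandwidth(20e-9)                    # sampling in 20 ns
gm = required_gm(bw, gain_db=40, c_load=200e-15)  # 40 dB gain, 200 fF load
i_bias = bias_current(gm, 0.2)                    # 0.2 V overdrive
print(f"BW = {bw / 1e6:.1f} MHz, gm = {gm * 1e3:.2f} mA/V, Ibias = {i_bias * 1e6:.0f} uA")
```

Under these assumptions the sketch returns about 23.9 MHz, 3 mA/V and 600 $\mu$A, consistent with the figures given above.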

### Noise Analysis:

For this noise analysis, we first evaluate the effect of the thermal noise  $I_{NThR}$  of the conversion resistor at the output of the RTIA stage. We then compute the RTIA dynamic range due to this noise. We consider that the op-amp is noiseless. The PSD of the thermal noise in the resistor is given by the following equation and depends on the Boltzmann constant K and the temperature T:

$$PSD_{I_{NThR}} = \frac{4KT}{R_{conv}} \quad \left[ \frac{A^2}{Hz} \right]$$

The power spectral density  $PSD_{Out}$  at the output is given by the following equation. It is interesting to note that the transfer function to express the resistor thermal noise at the output is the RTIA transfer function. This transfer function depends on the conversion resistor  $R_{conv}$ , the junction capacitor of the photodiode  $C_{PD}$  and the opamp gain A.

$$PSD_{Out} = \frac{4KT}{R_{conv}} \times |H_{RTIA}|^2 \qquad \left[\frac{V^2}{Hz}\right] \quad with \ |H_{RTIA}|^2 = \left(R_{conv} \frac{A}{1+A} \frac{1}{1+s\frac{R_{conv}C_{PD}}{1+A}}\right)^2$$

This PSD is integrated over the frequency through a first order low pass filter (i.e. RTIA transfer function). The output voltage standard deviation is given by:

$$\sigma_{out} = \frac{A}{1+A} \sqrt{R_{conv} 4KT \times \frac{\pi}{2} BW_{-3dB}} \quad with \quad BW_{-3dB} = \frac{1+A}{2\pi R_{conv} C_{PD}}$$

The standard deviation of the thermal noise at the output is then 166 $\mu$Vrms. By rearranging the noise expression of the previous equation, the dynamic range is given by the following equation, with $V_{range}$ the output voltage range. Saturation is reached on the global shutter stage for a photocurrent of 100 nA. For such a value, the voltage range at the RTIA output is 10 mV. The dynamic range of the RTIA stage considering only the thermal noise of the conversion resistor is then 35.6 dB.

$$DR_{dB} = 20 \log \left( \frac{V_{range}}{A \sqrt{\frac{KT}{C_{PD}} \times \frac{1}{(A+1)}}} \right)$$

#### Annex D

# Power Consumption Analysis:

The power consumption of the CTIA stage mainly depends on the targeted bandwidth and gain (i.e. the $C_{int}$ value). We compute here the bias current of the op-amp for a given bandwidth. The transfer function of the CTIA stage is given by the following equation.



For a bandwidth BW<sub>-3dB</sub>, the required trans-conductance is given by:

$$g_m = BW_{-3dB} \times 2\pi \frac{(C_L C_{PD} + C_L C_{Int} + C_{Int} C_{PD})}{C_{Int}}$$

We consider here an op-amp implemented with a 5-transistor structure (c.f. Annex C). The trans-conductance of this op-amp is:

$$g_m = \frac{2I_D}{V_{GS} - V_{th}}$$

For an overdrive voltage ( $V_{GS}$ - $V_{th}$ ) of 0.2V, the bias current  $I_{Bias}$  of the op-amp is then given by:

$$I_{Bias} = 0.2 \times BW_{-3dB} \times 2\pi \frac{(C_L C_{PD} + C_L C_{Int} + C_{Int} C_{PD})}{C_{Int}}$$
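The bias current expression can be evaluated directly. In the minimal sketch below, the capacitor values and the 5 MHz bandwidth are purely illustrative assumptions, not the design values of the thesis, and the function names are hypothetical.

```python
import math


def ctia_gm(bw, c_l, c_pd, c_int):
    """Trans-conductance required for a -3 dB bandwidth bw."""
    return bw * 2 * math.pi * (c_l * c_pd + c_l * c_int + c_int * c_pd) / c_int


def ctia_bias_current(bw, c_l, c_pd, c_int, v_overdrive=0.2):
    """Bias current of the 5-transistor op-amp: Ibias = Vov * gm."""
    return v_overdrive * ctia_gm(bw, c_l, c_pd, c_int)


# Illustrative values only (assumed): 5 MHz, C_L = 100 fF, C_PD = 200 fF, C_int = 10 fF
gm = ctia_gm(5e6, 100e-15, 200e-15, 10e-15)
i_bias = ctia_bias_current(5e6, 100e-15, 200e-15, 10e-15)
print(f"gm = {gm * 1e6:.1f} uA/V, Ibias = {i_bias * 1e6:.1f} uA")
```

The expression shows that a smaller $C_{Int}$ (higher gain) increases the equivalent capacitance seen by the op-amp, and therefore the bias current needed for the same bandwidth.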

#### Noise Analysis:

For an acquisition performed without correlated double sampling operation, the total noise at the end of the acquisition is the rms sum of the reset noise  $\sigma_{rst}$  and the readout noise  $\sigma_{readout}$ .



The reset noise is due to the sampling of the reset transistor thermal noise and to the opamp noise. In [97], the reset noise of the CTIA stage is computed and expressed in charge at the input of the stage. This reset noise can be expressed at the output of the stage using the conversion gain of the CTIA and is then given by the following equation:

$$\sigma_{rst}^2 = \frac{KT}{C_{int}^2} \left( C_{int} + \frac{C_{PD}C_L}{C_L + C_{PD} + C_{int}} + \frac{\gamma C_{PD}^2}{C_L + C_{PD} + C_{int}} \right)$$

$\gamma$ is a technology-dependent parameter and is here assumed to be 2/3. The first and second terms of the reset noise correspond to the thermal noise in the reset transistor, while the third term corresponds to the op-amp noise. Here, only the noise contribution of the long-tailed pair of the op-amp is considered. In our case, the reset noise value is 1.2 mVrms.

We are now going to compute the readout noise of the CTIA stage. The source of this noise is the op-amp. We consider here an op-amp implemented with a 5-transistor structure. The power spectral density $PSD_{opamp}$ of the op-amp at its input is then given by the following equation:

$$PSD_{opamp} = 2 \times \frac{4KT\gamma}{g_m} \quad \left[ \frac{V^2}{Hz} \right]$$

We consider that the main source of noise in the op-amp is the long-tailed pair. This power spectral density corresponds to a white noise (independent of the frequency) and is equal in our case to 0.24 fV$^2$/Hz. This noise source can be expressed at the CTIA output thanks to the $V_{int}$ versus $V_{+}$ transfer function. It is interesting to note that this transfer function has the same -3dB bandwidth as the CTIA transfer function.



Integrating the power spectral density over the frequency, the standard deviation of the output noise is then given by:

$$\sigma_{readout} = \sqrt{PSD_{opamp} \times \left(\frac{C_{PD} + C_{int}}{C_{int}}\right)^2 \times \frac{\pi}{2}BW_{-3dB}} \quad \text{with} \quad BW_{-3dB} = \frac{g_mC_{int}}{2\pi(C_LC_{int} + C_{PD}C_{int} + C_{PD}C_L)}$$
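As a worked step, not taken from the original derivation, substituting $PSD_{opamp} = 8KT\gamma/g_m$ and the expression of $BW_{-3dB}$ into the equation above shows that the trans-conductance cancels out, so that the readout noise depends only on the capacitances:

$$\sigma_{readout}^2 = \frac{8KT\gamma}{g_m} \times \frac{(C_{PD} + C_{int})^2}{C_{int}^2} \times \frac{g_m C_{int}}{4(C_LC_{int} + C_{PD}C_{int} + C_{PD}C_L)} = \frac{2KT\gamma\,(C_{PD} + C_{int})^2}{C_{int}(C_LC_{int} + C_{PD}C_{int} + C_{PD}C_L)}$$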

In our case the readout noise value is 1.3 mVrms. The dynamic range of the CTIA stage due to the reset and readout noise is given by the following equation:

$$DR_{dB} = 20 \log \left( \frac{V_{range}}{\sqrt{\sigma_{rst}^2 + \sigma_{readout}^2}} \right)$$

The output voltage range $V_{range}$ of the CTIA stage is here equal to 1.2 V. The dynamic range of the stage is then 56 dB.
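The dynamic range figure can be checked directly from the values quoted in this annex (1.2 mVrms reset noise, 1.3 mVrms readout noise, 1.2 V output range). The short sketch below uses a hypothetical helper name; it gives about 56.6 dB, consistent with the 56 dB stated above.

```python
import math


def dynamic_range_db(v_range, sigma_rst, sigma_readout):
    """Dynamic range in dB, with the two noise terms summed in rms."""
    total_noise = math.hypot(sigma_rst, sigma_readout)
    return 20.0 * math.log10(v_range / total_noise)


# Values quoted in the text for the CTIA stage.
dr = dynamic_range_db(v_range=1.2, sigma_rst=1.2e-3, sigma_readout=1.3e-3)
print(f"DR = {dr:.1f} dB")
```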

#### Annex E

### • Weak Inversion Trans-conductance :

For a MOSFET in the weak inversion region, the drain current depends exponentially on the gate-source voltage [106]. The trans-conductance depends on the drain current $I_d$, the thermal voltage $V_T$ and a technological parameter $\kappa$, which corresponds to the capacitive divider formed by the gate-to-channel capacitor $C_{ox}$ and the depletion capacitor $C_{dep}$ of the channel.

$$g_m = \frac{\kappa}{V_T} \times I_d$$
 with  $V_T = \frac{KT}{q}$  and  $\kappa = \frac{C_{ox}}{C_{ox} + C_{dep}}$ 
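To give an order of magnitude, the weak inversion trans-conductance can be evaluated as follows. This is only a sketch: the drain current, the capacitance ratio (giving $\kappa = 0.7$) and the 300 K temperature are assumed values, and the function names are hypothetical.

```python
K_BOLTZMANN = 1.380649e-23   # Boltzmann constant, J/K
Q_ELECTRON = 1.602176634e-19  # elementary charge, C


def thermal_voltage(temperature=300.0):
    """Thermal voltage V_T = KT/q in volts."""
    return K_BOLTZMANN * temperature / Q_ELECTRON


def weak_inversion_gm(i_d, c_ox, c_dep, temperature=300.0):
    """Weak inversion trans-conductance g_m = kappa * I_d / V_T in siemens."""
    kappa = c_ox / (c_ox + c_dep)  # capacitive divider; only the ratio matters
    return kappa * i_d / thermal_voltage(temperature)


# Assumed example: 1 uA drain current and C_ox : C_dep = 7 : 3 (kappa = 0.7).
gm = weak_inversion_gm(i_d=1e-6, c_ox=7e-3, c_dep=3e-3)
print(f"V_T = {thermal_voltage() * 1e3:.2f} mV, g_m = {gm * 1e6:.1f} uS")
```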

#### Noise Analysis:



The image acquisition is performed without correlated double sampling. The total noise at the end of the acquisition is then the rms sum of the reset noise $\sigma_{rst}$ and the readout noise $\sigma_{readout}$. We first analyze the reset noise, which has two contributions: the thermal noise of the reset transistor $\sigma_{RstTrans}$ and the noise of the BDI stage $\sigma_{BDI}$. The contribution of the reset transistor thermal noise on the integration capacitor is well known and is given by:

$$\sigma_{RstTrans} = \sqrt{\frac{KT}{C_{int}}} [Vrms]$$

The BDI noise is due to the thermal noise of the injection transistor $\sigma_{InjTrans}$ and the noise of the op-amp $\sigma_{opamp}$. As the two noise sources are uncorrelated, each contribution is computed separately. The power spectral density $PSD_{inj}$ of the injection transistor is given by:

$$PSD_{inj} = 4KT\gamma g_{mTinj} \quad \left[\frac{A^2}{Hz}\right]$$

To express the noise of the injection transistor at the output of the BDI stage, this PSD is integrated over the frequency through the transfer function  $H_{inj}$ .

$$H_{inj}(s) = \frac{-C_{PD}}{g_{mTinj}C_{int}(1+A)} \times \frac{1}{1+s\frac{C_{PD}}{g_{mTinj}(1+A)}}$$

The standard deviation of the noise $\sigma_{InjTrans}$ due to the injection transistor is then given by:

$$\sigma_{InjTrans} = \sqrt{4KT\gamma g_{mTinj} \times H_{inj0}^{2} \times \frac{\pi}{2}BW}$$

$$\text{with} \begin{cases} BW = \frac{g_{mTinj}(1+A)}{2\pi C_{PD}} \\ H_{inj0} = \frac{C_{PD}}{g_{mTinj}C_{int}(1+A)} \end{cases}$$
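As a worked step added for clarity, substituting $BW$ and $H_{inj0}$ into the expression of $\sigma_{InjTrans}$ cancels the trans-conductance of the injection transistor and leaves a compact form:

$$\sigma_{InjTrans}^2 = 4KT\gamma g_{mTinj} \times \frac{C_{PD}^2}{g_{mTinj}^2 C_{int}^2 (1+A)^2} \times \frac{g_{mTinj}(1+A)}{4 C_{PD}} = \frac{KT\gamma\, C_{PD}}{C_{int}^2 (1+A)}$$

In this form the contribution of the injection transistor is reduced by the factor $(1+A)$ provided by the op-amp.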

The op-amp is implemented with a folded cascode structure. For this study, we only consider the thermal noise of the long-tailed pair, with trans-conductance $g_{mP}$, and of the cascode stage, with trans-conductance $g_{mN}$. The power spectral density of the op-amp $PSD_{opamp}$ referred to its input is given by:

$$PSD_{opamp} = 2 \times \frac{4KT\gamma}{g_{mP}} + 2 \times \frac{4KT\gamma g_{mN}}{g_{mP}^2} \quad \left[\frac{V^2}{Hz}\right]$$

The transfer function to express this PSD at the output of the BDI stage is the following:

$$H_{op-amp}(s) = \frac{AC_{PD}}{C_{int}(1+A)} \times \frac{1}{1 + s \frac{C_{PD}}{g_{mTinj}(1+A)}}$$

The op-amp PSD integrated over the frequency gives the output referred noise due to the op-amp:

$$\sigma_{opamp} = \sqrt{PSD_{opamp} \times H_{op-amp0}^2 \times \frac{\pi}{2}BW}$$

$$\text{with} \begin{cases} BW = \frac{g_{mTinj}(1+A)}{2\pi C_{PD}} \\ H_{op-amp0} = \frac{AC_{PD}}{C_{int}(1+A)} \end{cases}$$
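Carrying out the same substitution for the op-amp term (a worked step, not part of the original derivation) gives:

$$\sigma_{opamp}^2 = PSD_{opamp} \times \frac{A^2 C_{PD}^2}{C_{int}^2 (1+A)^2} \times \frac{g_{mTinj}(1+A)}{4 C_{PD}} = PSD_{opamp} \times \frac{A^2\, C_{PD}\, g_{mTinj}}{4\, C_{int}^2 (1+A)}$$

Unlike the injection transistor term, this contribution still depends on $g_{mTinj}$, because the latter sets the noise bandwidth.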

The output voltage standard deviation due to the BDI stage is then given by:

$$\sigma_{BDI} = \sqrt{\sigma_{InjTrans}^2 + \sigma_{opamp}^2}$$
 [Vrms]

The total reset noise is the rms sum of the standard deviation of the reset transistor and BDI noise:

$$\sigma_{Rst} = \sqrt{\sigma_{RstTrans}^2 + \sigma_{BDI}^2}$$
 [Vrms]

During the integration, the injection transistor and the op-amp contribute to the readout noise. The computation of the readout standard deviation $\sigma_{readout}$ gives the same result as the computation of the BDI noise contribution $\sigma_{BDI}$ during the reset operation. The total output noise at the end of the acquisition is the rms sum of the reset and the readout noise and is given by:

$$\sigma_{total} = \sqrt{\sigma_{Rst}^2 + \sigma_{readout}^2}$$
 with  $\sigma_{readout} = \sigma_{BDI}$ 
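The complete noise budget of the BDI stage can be sketched numerically by chaining the expressions of this annex. All component values below (capacitances, trans-conductances, gain $A$, 300 K temperature) are assumptions chosen for illustration only, and the function and key names are hypothetical.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann constant, J/K


def bdi_noise_budget(c_int, c_pd, gm_tinj, gm_p, gm_n, gain_a,
                     gamma=2.0 / 3.0, temperature=300.0):
    """Output-referred noise terms (V rms) of the BDI stage."""
    kt = K_BOLTZMANN * temperature
    # Thermal noise of the reset transistor sampled on C_int.
    sigma_rst_trans = math.sqrt(kt / c_int)
    # Single-pole bandwidth shared by both transfer functions, and the
    # corresponding equivalent noise bandwidth (pi/2 * BW).
    bw = gm_tinj * (1.0 + gain_a) / (2.0 * math.pi * c_pd)
    enbw = math.pi / 2.0 * bw
    # Injection transistor: current noise PSD through H_inj0.
    psd_inj = 4.0 * kt * gamma * gm_tinj
    h_inj0 = c_pd / (gm_tinj * c_int * (1.0 + gain_a))
    sigma_inj = math.sqrt(psd_inj * h_inj0 ** 2 * enbw)
    # Op-amp: input-referred voltage noise PSD through H_op-amp0.
    psd_opamp = (2.0 * 4.0 * kt * gamma / gm_p
                 + 2.0 * 4.0 * kt * gamma * gm_n / gm_p ** 2)
    h_op0 = gain_a * c_pd / (c_int * (1.0 + gain_a))
    sigma_opamp = math.sqrt(psd_opamp * h_op0 ** 2 * enbw)
    # rms sums, following the equations of this annex.
    sigma_bdi = math.hypot(sigma_inj, sigma_opamp)
    sigma_rst = math.hypot(sigma_rst_trans, sigma_bdi)
    sigma_readout = sigma_bdi
    sigma_total = math.hypot(sigma_rst, sigma_readout)
    return {"rst_trans": sigma_rst_trans, "inj": sigma_inj,
            "opamp": sigma_opamp, "bdi": sigma_bdi, "rst": sigma_rst,
            "readout": sigma_readout, "total": sigma_total}


# Assumed example values, not the design values of this work.
budget = bdi_noise_budget(c_int=10e-15, c_pd=20e-15, gm_tinj=10e-6,
                          gm_p=50e-6, gm_n=50e-6, gain_a=100.0)
for name, value in budget.items():
    print(f"{name:>9s}: {value * 1e3:.3f} mVrms")
```

Since $\sigma_{readout} = \sigma_{BDI}$, the BDI contribution is counted twice in the total, once at reset and once during integration.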

# **Bibliography**

- [1] (2015, December) Photron Web Site Gallery. [Online]. http://www.photron.com/?cmd=gallery
- [2] Specialised Imaging. (2015, December) specialised-imaging.com. [Online]. http://specialised-imaging.com/applications-and-results
- [3] Hamamatsu, "Guide to Streak Cameras," White paper 2008.
- [4] A Velten, E Lawson, A Bardagjy, M Bawendi, and R Raskar, "Slow art with a trillion frames per second camera," in *SIGGRAPH*, 2011, p. Talks.
- [5] C. Claeys, "Trends and Challenges in Micro- and Nanoelectronics for the Next Decades," in *International Conference for Mixed Design of Integrated Circuits and Systems*, 2012, pp. 37-42.
- [6] J-S Meena, S Min Sze, U Chand, and T-Y Tseng, "Overview of emerging nonvolatile memory technologies," *Nanoscale Research Letters*, vol. 9, no. 1, pp. 1-33, 2014.
- [7] J Roullard et al., "Evaluation of 3D Interconnect Routing and Stacking Strategy to Optimize High Speed Signal Transmission for Memory on Logic," in *Electronic Components and Technology Conference*, 2012, pp. 8-13.
- [8] P Coudrain et al., "3D Integration of CMOS image sensor with coprocessor using TSV last and micro-bumps technologies," in *Electronic Components and Technology Conference*, 2013, pp. 674-682.
- [9] V. Suntharalingam et al., "A 4-Side Tileable Back Illuminated 3D-Integrated," in *International Solid State Circuit Conference*, 2009.
- [10] D. Henry et al., "3D integration technology for hybrid pixel detectors designed for particle physics and imaging experiments," in *Electronic System-Integration Technology Conference*, 2012, pp. 1-5.
- [11] Stuart Kleinfelder, Yandong Chen, Kris Kwiatkowski, and Ashish Shah, "High-Speed CMOS Image Sensor Circuits With In Situ Frame Storage," IEEE TRANSACTIONS ON NUCLEAR SCIENCE, pp. 1648-1656, 2004.
- [12] A. El Gamal and H. Eltoukhy, "CMOS Image Sensors," *Circuits and Devices Magazine*, pp. 6-20, May-June 2005.

- [13] Nakamura J.,.: Taylor & Francis, 2006, ch. Chapter 1, pp. 16-17.
- [14] W. S. Boyle and G. E. Smith, "Charge Coupled Semiconductor Devices," *Bell System Technical Journal*, vol. 49, pp. 587-593, 1970.
- [15] S. R. Morrison, "A new type of photosensitive junction device," *Solid State Electronics*, vol. 6, no. 5, pp. 485-494, 1963.
- [16] E.R. Fossum, "CMOS image sensors: electronic camera on a chip," in *International Electron Devices Meeting*, 1995, pp. 17-25.
- [17] QImaging, "Rolling Shutter vs. Global Shutter," 2014.
- [18] Aptina, Global Shutter Pixel Technologies and CMOS Image Sensors A Powerfull Combination, 2012.
- [19] (2015, December) GoPro. [Online]. http://gopro.com/
- [20] (2015, December) Photron Corporation. [Online]. http://www.photron.com/index.php?cmd=product\_general&product\_id=32&product\_t\_name=FASTCAM+SA6
- [21] Cypress, High speed Cmos image sensors, 2006.
- [22] G. Meynants, G. Lepage, J. Bogaerts, G. Vanhorebeek, and X. Wang, "Limitations to the frame rate of high speed image sensors," Antwerp, 2008.
- [23] M. Jung, Y. Reibel, B. Cunin, and C. Draman, "RDS and IRDS filters for high-speed CCD video sensors," *IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing*, vol. 47, no. 9, pp. 958-965, 2000.
- [24] A.I. Krymski and Nianrong Tu, "A 9-V/Lux-s 5000-Frames/s 512×512 CMOS Sensor," *IEEE Transactions on Electron Devices*, vol. 50, no. 1, pp. 136-143, 2003.
- [25] G. Meynants, G. Lepage, J. Bogaerts, G. Vanhorebeek, X. Wang, "Limitations to the frame rate of high speed image sensors," in *International Image Sensor Workshop*, 2009.
- [26] A. Krymski and K. Tajima, "CMOS Image Sensor with integrated 4Gb/s Camera Link Transmitter," in *IEEE International Solid-State Circuits Conference*, San Francisco, 2006.
- [27] (2015, December) Visionresearch. [Online]. http://www.visionresearch.com/Products/-Phantom-Camera-Products/v2511

- [28] Ruoyu Xu, Bing Liu, and Jie Yuan, "A 1500 fps Highly Sensitive 256 x 256 CMOS Imaging Sensor With In-Pixel Calibration," *IEEE Journal of Solid-State Circuits*, vol. 47, no. 6, pp. 1408-1418, 2012.
- [29] Etoh T. and Takehara K., "Needs, requirements and new proposals for ultra-high-speed videocameras in japan," in *International Congress High Speed Photography and Photonics*, 1994.
- [30] Goji Etoh T. et al., "An Image Sensor Which Captures 100 Consecutive Frames at 1 000 000 Frames/s," *IEEE Transaction on Electron Devices*, vol. 50, no. 1, pp. 144-151, 2003.
- [31] M. Elloumi et al., "Study of a photosite for snapshot video," in *International Congress on High-Speed Photography and Photonics*, 1995.
- [32] T G Etoh and K Takehara, "Ultrahigh-speed multiframing camera with an automatic trigger," in *Ultrahigh- and High-Speed Photography, Videography and Photonics*, 1993.
- [33] T.G. Etoh et al., "A 16 Mfps 165kpixel backside-illuminated CCD," in *IEEE International Solid-State Circuits Conference Digest of Technical Papers*, 2011.
- [34] H.D. Nguyen, V.T.S. Dao, T. Yamada, and T.G. Etoh, "Toward 1 Gfps: A multi-collection-gate BSI imager," in *International Conference on Communications and Electronics (ICCE)*, 2012.
- [35] T.G. Etoh et al., "Toward 1Gfps: Evolution of ultra-high-speed image sensors -ISIS, BSI, multi-collection gates, and 3D-stacking," in *IEEE International Electron Devices Meeting (IEDM)*, 2014.
- [36] J. Crooksa et al., "Ultra- high speed imaging at megaframes per second with a megapixel CMOS image sensor," in *International Image Sensor Workshop*, 2013.
- [37] Lahav A. et al., "Cmos Image Sensor pixel with 2D CCD memory bank for ultra high speed imaging," in *International Image Sensor Workshop*, 2013.
- [38] S. Kleinfelder, Y. Chen, K. Kwiatkowski, and A. Shah, "High-Speed CMOS Image Sensor Circuits With In Situ Frame Storage," *Nuclear Science, IEEE Transactions on*, vol. 51, no. 4, pp. 1648-1656, 2004.
- [39] M.M. El-Desouki, O. Marinov, M.J. Deen, and Qiyin Fang, "CMOS Active-Pixel Sensor With In-Situ Memory for Ultrahigh-Speed Imaging," *IEEE Sensors Journal*, vol. 11, no. 6, pp. 1375-1379, 2011.
- [40] Y. Tochigi et al., "A global-shutter CMOS image sensor with readout speed of 1Tpixel/s burst and 780Mpixel/s continuous," *IEEE Journal of Solid-State Circuits*, vol. 48, no. 1, pp. 329-338, 2013.
- [41] P. Martin-Gonthier, F. Raymundo, and P. Magnan, "High-density 3D interconnects Technology: The key for burst-mode very high speed imaging?," in *International Image Sensor Workshop*, 2015.
- [42] Uhring W and Zlatanski M., "Ultrafast Imaging in Standard (Bi)CMOS Technology," in "Photodetectors".: InTech, 2012, ch. 13.
- [43] M. Zlatanski and W. Uhring, "Streak-mode optical sensor in standard BiCMOS technology," in *IEEE Sensor*, 2011, pp. 1604-1607.
- [44] T.G. Etoh, D.V.T. Son, T. Yamada, and E. Charbon, "Toward One Giga Frames per Second Evolution of in Situ Storage Image Sensors," *Sensors*, vol. 4, no. 13, pp. 4640-4658, 2013.
- [45] Dao Vu Truong Son et al., "Toward 100 Mega-Frames per Second: Design of an Ultimate," *Sensor*, no. 10, pp. 16-35, December 2009.
- [46] C. Honsberg and S. Bowden. (2015, December) pveducation. [Online]. http://www.pveducation.org/pvcdrom/pn-junction/generation-rate
- [47] G. Koklu, R. Etienne-Cummings, Y. Leblebici, G. De Micheli, and S. Carrara, "Characterization of standard CMOS compatible photodiodes and pixels for Lab-on-Chip devices," in *IEEE International Symposium on Circuits and Systems*, 2013.
- [48] S. Radovanovic, A.J. Annema, and B. Nauta, "Physical and electrical bandwidths of integrated photodiodes in standard CMOS technology," in *IEEE Electron Devices and Solid-State Circuits*, 2003, pp. 95-98.
- [49] T. Reiner et al., "CMOS Image Sensor 3T Nwell Photodiode Pixel Spice Model," in *IEEE Electrical and Electronics Engineers in Israel*, 2004, pp. 161-164.
- [50] K. Murari, R. Etienne-Cummings, Nitish Thakor, and G. Cauwenberghs, "Which Photodiode to Use: A Comparison of CMOS-Compatible Structures," *IEEE Sensor*, vol. 9, no. 7, pp. 752-760, 2009.
- [51] Nakamura J., "Basics of Image Sensors," in *Image Sensors and Signal Processing for Digital Still Cameras*.: Taylor & Francis Group, 2006, ch. 3, p. 68.
- [52] E. Fossum and Hondongwa D., "A review of pinned photodiode for CCD and CMOS Image Sensors," *IEEE Journal of the Electron Devices Society*, vol. 2, no. 3, pp. 33-43, May 2014.
- [53] C. Tubert et al., "High Speed Dual Port Pinned-photodiode for Time-Of-Flight Imaging," in *International Image Sensor Workshop*, 2009.
- [54] Fink J., Hosticka J.B., and Mahdi R., "Lateral Drift-Field Photodetector for High Speed 0.35μm CMOS Imaging Sensors Based on Non-Uniform Lateral Doping Profile," in *Conference on Ph.D. Research in Microelectronics and Electronics*, 2010, pp. 1-4.
- [55] F.W. et al. Kosonocky, "360x360 Element Three-Phase Very High Frame Rate Burst Image Sensor: Design, Operation, and Performance," *IEEE Transactions on Electron Devices*, vol. 44, no. 10, pp. 1617-1624, October 1997.
- [56] R.W. Sandage and J.A. Connelly, "A fingerprint opto-detector using lateral bipolar phototransistors in a standard CMOS process," in *Electron Devices Meeting*, 1995, pp. 171-174.
- [57] Z. Weiquan, C. Mansun, and P.K. Ko, "A novel high-gain CMOS image sensor using floating N-well/gate tied PMOSFET," in *Electron Devices Meeting*, 1998, pp. 1023-1025.
- [58] P. Kostov, W. Gaberl, and H. Zimmermann, "High-speed bipolar phototransistors in a 180 nm CMOS process," *Journal of Optics & Laser Technology*, no. 46, pp. 171-174, April 2012.
- [59] A.C. Carusone, H. Yasotharan, and T. Kao, "CMOS Technology Scaling Considerations for Multi-Gbps Optical Receivers With Integrated Photodetectors," *IEEE Journal of Solid-State Circuits*, vol. 48, no. 8, pp. 1832-1842, 2011.
- [60] "International Technology Roadmap for Semiconductors Interconnect," ITRS, White Paper 2013.
- [61] Z. Xiangyu, S. Chen, and E. Culurciello, "A second generation 3D integrated feature-extracting image sensor," in *IEEE Sensor*, 2011, pp. 1933-1936.
- [62] L Yu et al., "Methodology for Analysis of TSV Stress Induced Transistor Variation and Circuit Performance," in *IEEE 13th International Symposium on Quality Electronic Design*, 2012.

- [63] M. Rousseau, M. Jaud, P. Leduc, A. Farcy, and A. Marty, "Impact of substrate coupling induced by 3D-IC architecture on advanced CMOS technology," in *European Microelectronics and Packaging Conference*, Rimini, 2009.
- [64] G Katti, M Stucchi, K De Meyer, and W Dehaene, "Electrical Modeling and Characterization of Through Silicon Via for Three Dimensional ICs," *IEEE TRANSACTIONS ON ELECTRON DEVICES*, vol. 57, no. 1, pp. 256-262, 2010.
- [65] M. SADAKA, I. RADU, and DI CIOCCIO L., "3D Integration: Advantages, Enabling Technologies & Applications," in *IEEE International Conference on IC Design and Technology*, Grenoble, 2010, pp. 106-109.
- [66] R. Balcerak and S. Horn, "Progress in the development of vertically integrated sensor arrays," in SPIE Infrared Technology and Applications XXXI, Orlando, 2005.
- [67] P. Karimov et al., "Phototriggering system for an ultrahigh-speed video microscopy," *Review of Scientific Instruments*, vol. 11, no. 78, October 2007.
- [68] T Cools et al., "An SXGA CMOS image sensor with 8 Gbps LVDS serial link," in *International Image Sensor Workshop*, Ogunquit, 2007.
- [69] P.E. Allen, "Chapter 3 Models for CMOS Components," Georgia Tech Institute of Technology, CMOS Analog Circuit Design Lecture 2006.
- [70] Leti. (2015, August) Open 3D. [Online]. http://www-leti.cea.fr/fr/Travaillons-ensemble/Offres-Specifiques/Open-3D
- [71] M Verhelst and B Murmann, "Area scaling analysis of CMOS ADCs," *Electronics Letters*, vol. 48, no. 6, pp. 314-315, 2012.
- [72] B Murmann. ADC Performance Survey 1997-2015. [Online]. http://web.stanford.edu/~murmann/adcsurvey.html
- [73] N Weste and D Harris, "Chapter 12 Array Subsystems," in *CMOS VLSI Design: A Circuits and Systems Perspective 4th Ed.*: Addison-Wesley, 2011, p. 497.
- [74] TechInshights, "Memory (DRAM) Technology & Roadmap," 2015.
- [75] K C Huang, Y W Ting, et al., "A High-Performance, High-Density 28nm eDRAM Technology with High-K/Metal-Gate," in *International Electron Device Meeting (IEDM)*, Washington, 2011.

- [76] Shigetoshi Sugawa et al., "A 20Mfps Global Shutter CMOS Image Sensor with Improved Sensitivity and Power Consumption," in *International Image Sensor Workshop*, 2015.
- [77] V. T. S. Dao, T. G. Etoh, K. Shimonomura, E. Charbon C. Zhang, "Designing pixel parallel, localized drivers of a 3D 1Gfps image sensor family," in *International Image Sensor Workshop*, 2015.
- [78] Dao V. T. S. et al., "Toward 10 Gfps: Factors Limiting the Frame Rate of the BSI MCG Image Sensor," in *International Image Sensor Workshop*, 2015.
- [79] Chun-Cheng Liu et al., "A 10b 100MS/s 1.13mW SAR ADC with Binary-Scaled Error Compensation," in *International Solid-State Circuit Conference*, San Francisco, 2010.
- [80] Amkor Website. [Online]. http://www.amkor.com/
- [81] Kyocera Website. [Online]. http://global.kyocera.com/prdct/semicon/index.html
- [82] Amkor Technology, "Flip-chip BGA package data sheet," october 2013.
- [83] Daniel Roux, *Echange Thermique Appliqués à l'Electronique*.: Ecole des Mines de Saint-Etienne, 2009.
- [84] C. Kwanyeob, S. Mukhopadhyay D. Lie, "Analysis of the Performance, Power, and Noise Characteristics of a CMOS Image Sensor With 3-D Integrated Image Compression Unit," *IEEE Transactions on Components, Packaging and Manufacturing Technology*, vol. 4, 2014.
- [85] J.B. Kammerer, Y.Hervé, L. Hebrard J.C. Krencker, "Electro-thermal high-level modeling of integrated circuits," in 18th Int. Workshop on Thermal investigations of ICs and Systems, 2012.
- [86] J.B. Kammerer, Y.Hervé and L. Hebrard J.C. Krencker, "Direct Electro-thermal Simulation of Integrated Circuits using standard CAD tools," in *16th Int. Workshop on thermal investigations of ICs and system*, 2010.
- [87] Lei Shao et al., "On-chip phase change heat sinks designed for computational sprinting," in *Semiconductor Thermal Measurement and Management Symposium (SEMI-THERM)*, San Jose, 2014.
- [88] A. Vassighi and M. Sachdev, "Thermal Runaway in Integrated Circuits," *IEEE Transactions On Device And Materials Reliability*, vol. 6, no. 2, pp. 300-305, June 2006.

- [89] M. Garci, J-B Kammerer, and Hebrard L., "Towards electro-thermo-mechanical simulation of integrated circuits in standard CAD environment," *Microelectronics Journal*, vol. 46, no. 12, pp. 1121-1128, December 2015.
- [90] Tian H., Fowler B., and El Gamal A., "Analysis of Temporal Noise in CMOS Photodiode Active Pixel Sensor," *IEEE Journal of Solid State Circuits*, vol. 36, no. 1, pp. 92-101, 2001.
- [91] Vincent Goiffon et al., "Pixel Level Characterization of Pinned Photodiode and Transfer Gate Physical Parameters in CMOS Image Sensors," *Journal of Electron Devices Society*, vol. 2, no. 4, p. 65, 2014.
- [92] Dao V. T. S. et al., "Toward 10 Gfps: Factors Limiting the Frame Rate of the BSI MCG Image Sensor," in *International Image Sensor Workshop*, Vaals, 2015.
- [93] Kulah H. and Akin T., "A Current Mirroring Integration Based Readout Circuit for High Performance Infrared FPA Applications," *IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS*, vol. 50, no. 4, pp. 181-186, 2003.
- [94] M. Zlatanski and W. Uhring, "Streak-mode optical sensor in standard BiCMOS technology," in *IEEE Sensors*, Limerick, 2011.
- [95] M. Zlatanski, W Uhring, and J Le Normand, "Sub-500 ps Temporal Resolution Streak-mode Optical Sensor," *IEEE Sensors Journal*, vol. 15, no. 11, pp. 6570-6583, July 2015.
- [96] Jim Stiles, "Lecture: Closed-Loop Bandwidth," University of Kansas Department of EECS, 2011.
- [97] Van Blerkom D A, "Analysis and Simulation of CTIA-based Pixel Reset Noise," in *SPIE Infrared Technology and Applications XXXVII*, 2011.
- [98] N. Bluzer and R. Stehlik, "Buffered Direct Injection of Photocurrents into Charge-Coupled Devices," *IEEE Transactions on Electron Devices*, vol. 25, no. 2, pp. 160-166, 1978.
- [99] Shih U. and Wu C., "The design of high performance 128x128 CMOS Image sensors using new current-readout techniques," in *IEEE International Symposium on Circuits and Systems*, Orlando, 1999, pp. 168-171.
- [100] Sam Palermo, "Lecture 14: Folded Cascode OTA," 2012.

- [101] Weste N. H. E. and Harris D. M., "Special-Purpose Subsystems," in *CMOS VLSI DESIGN A Circuits and Systems Perspective*.: Addison-Wesley, ch. 13, p. 566.
- [102] L.H. Chen, M. Marek-Sadowska, and F. Brewer, "Buffer Delay Change in the Presence of Power and Ground Noise," *IEEE Transactions on Very Large Scale Integration Systems*, vol. 11, no. 3, pp. 461-473, 2003.
- [103] G. Finger et al., "Interpixel capacitance in large format CMOS hybrid arrays," in SPIE High Energy, Optical, and Infrared Detectors for Astronomy II, Orlando, 2006.
- [104] W Uhring, V Zint, and J Bartringer, "A Low-Cost High-Repetition-Rate Picosecond Laser Diode Pulse Generator," in *Photonics Europe*, 2004, pp. 583-590.
- [105] Sang-Soo Lee, Kwang Oh Kim Yibing (Michelle) Wang, "Comparison of Several Ramp Generator Designs For Column-Parallel Single Slope ADCs," in *International Image Sensor Workshop (IISW)*, 2009.
- [106] R.R. Harrison, "MOSFET Operation in Weak and Moderate Inversion," University of Utah, Electrical Engineering Lecture 2010.
- [107] T.G. Etoh et al., "An image sensor which captures 100 consecutive frames at 1 000 000 frames/s," *IEEE Transactions on Electron Devices*, vol. 50, no. 1, pp. 144-151, 2003.
- [108] Abbas El Gamal, "High Dynamic Range Image Sensors," 2002.
- [109] Lahav A. et al., "Cmos Image Sensor Pixel with 2D CCD Memory Bank for Ultra high Speed Imaging with Large Pixel Count," in *International Image Sensor Workshop*, 2013.
- [110] Nakamura, *Image Sensors and Signal Processing for Digital Still Cameras*.: Taylor & Francis, 2006.
- [111] J.M. Pimbley and G.J. Michon, "The output power spectrum produced by correlated double sampling," *IEEE Transactions on Circuits and Systems*, vol. 38, no. 9, pp. 1086-1090, 1991.
- [112] G. Koklu, R. Etienne-Cummings, Y. Leblebici, G. De Micheli, and S. Carrara, "Characterization of Standard CMOS Compatible Photodiodes and Pixels for Lab-on-Chip Devices," in *IEEE International Symposium on Circuits and Systems*, 2013.
- [113] Murari K., Etienne-Cummings R., Thakor N., and Cauwenberghs G., "Which Photodiode to Use: A Comparison of CMOS-Compatible Structures," *IEEE Sensor*, vol. 9, no. 7, p. 752, 2009.
- [114] Tian H., Liu X., Lim S., Kleinfelder S., and El Gamal A., "Active Pixel Sensors Fabricated in a Standard 0.18 $\mu$m," in *Sensors and Camera Systems for Scientific, Industrial, and Digital Photography Applications*, San Jose, 2001.