# An 8-to-1 bit 1-MS/s SAR ADC With VGA and Integrated Data Compression for Neural Recording

Vikram Chaturvedi, Student Member, IEEE, Tejasvi Anand, Student Member, IEEE, and Bharadwaj Amrutur, Member, IEEE

Abstract—Low power consumption per channel and data rate minimization are two key challenges which need to be addressed in future generations of neural recording systems (NRS). Power consumption can be reduced by avoiding unnecessary processing whereas data rate is greatly decreased by sending spike timestamps along with spike features as opposed to raw digitized data. Dynamic range in NRS can vary with time due to change in electrode-neuron distance or background noise, which demands adaptability. An analog-to-digital converter (ADC) is one of the most important blocks in an NRS. This paper presents an 8-bit SAR ADC in 0.13-µm CMOS technology along with input and reference buffer. A novel energy efficient digital-to-analog converter switching scheme is proposed, which consumes 37% less energy than the present state-of-the-art. The use of a ping-pong input sampling scheme is emphasized for multichannel input to alleviate the bandwidth requirement of the input buffer. To reduce the data rate, the A/D process is only enabled through the in-built background noise rejection logic to ensure that the noise is not processed. The ADC resolution can be adjusted from 8 to 1 bit in 1-bit steps based on the input dynamic range. The ADC consumes 8.8 $\mu$W from a 1-V supply at 1 MS/s. It achieves an effective number of bits of 7.7 bits and a FoM of 42.3 fJ/conversion-step.

*Index Terms*—Asynchronous, biomedical, digital-to-analog converter (DAC) switching, multichannel, neural, preamplifier, ping-pong, successive approximation register (SAR), threshold, variable gain amplifier, variable resolution.

## I. INTRODUCTION

NEURAL recording and stimulation are two indispensable entities in designing efficient brain machine interfaces (BMI). Many previous works have shown progress toward designing a low power NRS [1]–[5]. However, these works were largely focused on integrating multiple functions, such as low noise amplifiers (LNA), analog-to-digital converters (ADC), and telemetry, on a single chip. Although researchers have worked extensively toward designing an energy efficient neural LNA [6], [7], other blocks have followed conventional design techniques. Power consumption per

Manuscript received June 14, 2012; revised September 14, 2012; accepted December 27, 2012. Date of publication February 1, 2013; date of current version September 23, 2013. This work was supported by the Department of Information Technology, Ministry of Communication and Information Technology, Government of India.

V. Chaturvedi and B. Amrutur are with the Microelectronics Lab, Department of Electrical and Computer Engineering, Indian Institute of Science, Bangalore 560012, India (e-mail: vikram@ece.iisc.ernet.in; amrutur@ece.iisc.ernet.in).

T. Anand is with Oregon State University, Corvallis, OR 97330 USA (e-mail: tejasvianand@gmail.com).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TVLSI.2013.2238957

Fig. 1. Typical N-channel NRS employing a K-bit ADC.

channel [8] and output data rate (ODR) are two key challenges currently faced by the second generation NRS. The continuous increase in demand for data from a larger number of neurons is further exacerbating the problem. Dynamic range in NRSs can vary with time due to change in electrode-neuron distance or background noise [4]. This requires intelligent blocks to mitigate unnecessary signal processing, thus reducing power consumption. The constraint on the implant size necessitates a low-area solution for each block. Motivated by the need for energy efficient solutions for each block in an NRS, we present a solution for the ADC and input buffer (VGA) in this paper. However, many of the presented design techniques can be equally useful in analog front-ends for other applications.

Fig. 1 shows an N-channel NRS in which a single K-bit ADC digitizes the N channels in a time-division-multiplexed fashion. The VGA and ADC need to support larger bandwidth than other blocks. An NRS typically utilizes a SAR ADC [9], [10] due to its high energy efficiency and the moderate resolution requirement (8–10 bits) of the application. One of the reasons for the popularity of the SAR ADC is its highly digital nature, which benefits from CMOS scaling. However, the digital-to-analog converter (DAC) power consumption does not scale as well as the logic. In this paper, we propose a novel energy-efficient DAC switching technique [11]–[13], FlipDAC switching, which makes energy consumption in the DAC small compared to digital switching energy. The resolution of the ADC can be configured as per the dynamic range to prevent wastage of energy.

Typically, the VGA (Fig. 1) needs to track the input on the capacitive DAC (CDAC) in a short time (2–3 bit cycles), which demands large bandwidth in the VGA. In this paper, we emphasize the use of dedicated sampling capacitors in analog front ends to reduce the power consumption of the system, rather than using the CDAC for the input sampling.



Fig. 2. Block diagram of the proposed SAR ADC with an on-chip voltage reference buffer (REFBFR).

We also recommend employing a ping-pong input sampling architecture [14] to alleviate the bandwidth requirement of the VGA. It is explained in more detail in Section II-B. The amplitude of the maximum detectable signal varies with neuron-electrode distance and probe impedance, which is frequency dependent. A fixed gain in an NRS will either under-utilize the ADC dynamic range or cause clipping of the peaks. Hence, we emphasize a large programmable range in the voltage gain of the VGA. The designed VGA has a voltage gain programmable from 8 to 35 dB in eight steps.

NRSs generate a tremendous amount of data due to chronic recording from a large number of neurons [3]. The transmission of this large amount of data through a wireless link poses a serious threat to the scalability and power efficiency of BMIs. Spike feature extraction [1] and simple thresholding [2] are two popular ways through which researchers have tried to reduce the output data rate. However, the former needs extra hardware and power whereas the latter can cause loss of information for spike sorting. The present low ODR NRSs typically need an extra spike detection block, e.g., one DAC per channel [2], which consumes large area. In this paper, we merge the spike detection logic into the ADC, reusing the CDAC in it. This scheme not only avoids processing the background noise, which reduces the ODR, but also preserves important spike features, which are required for spike sorting [15]. It also helps in reducing the power consumption and area of the NRS.

This paper is organized as follows. Section II presents the architecture of the ADC, ping-pong input sampling scheme to relax VGA specification and activity-dependent A/D scheme to reduce ODR. Section II also explains the energy efficient DAC switching technique, FlipDAC. Section III discusses the circuit implementation of various blocks. Section IV presents the measured results from a chip fabricated in UMC 0.13- $\mu$ m CMOS technology. Section V concludes this paper.

## II. ADC ARCHITECTURE

The block diagram of the proposed SAR ADC is shown in Fig. 2. Fully differential input and reference voltages are used. The reference DAC and sampling capacitors are segregated to gain advantages explained in Section II-B. A four-input preamplifier is used in the comparator to mitigate the effect of kickback noise on sampling capacitors ( $C_{in0,1}$ ). A master clock



Fig. 3. (a) DAC switching in a 3-bit SAR ADC [11]. (b) FlipDAC switching. In the figure,  $V = 2 \cdot V_{cm} = V_{ref}$  and the unit of energy consumption is  $C.V_{ref}^2$ .



Fig. 4. Input digitization through the proposed FlipDAC switching technique. In the second case, output 1001 is resolved indirectly by tracking 1110 by  $V_{dacp} - V_{dacm}$ .

of only half the sampling speed is used due to the ping-pong input sampling architecture. It reduces the power consumption in clock buffers by  $2\times$  and decreases the total system power consumption. To reduce output data rate, the detection of the neural spikes is achieved using spike threshold  $S_{\text{TH}}$ , which is calculated based on the background noise and is stored in registers. This is explained more in Section II-C. The dynamic range (DR) decides the resolution setting (N) of the ADC. An on-chip voltage reference buffer is implemented to provide clean and stable voltage reference to the ADC.

### A. FlipDAC Switching

Of late, there has been a lot of interest in energy efficient DAC switching techniques for CDAC [11]–[13]. It has been shown that the DOWN transitions take more energy than



Fig. 5. Proposed FlipDAC switching scheme for a 4-bit SAR ADC and energy cost comparison with [11] for each step in the binary search tree.

UP transitions in the digitization process [16]. This is the reason that the energy consumption for a code near the center of the ADC dynamic range is greater than that of a code toward extremes. For a 10-bit ADC, code 511 and code 512 require maximum energy, whereas code 0 and code 1023 take minimum energy [11]. The energy drawn from the reference can be reduced if it is possible to resolve an input through fewer DOWN transitions. Even if the number of DOWN transitions is not smaller than that of UP transitions, the energy consumption can be lowered by pushing DOWN transitions toward LSBs.

Fig. 3(a) shows the switching scheme proposed in [11] for a 3-bit SAR ADC. The DOWN transition draws 5 times more energy from the reference compared to the UP transition. Fig. 3(b) presents the proposed switching technique. For the DOWN transition step, the energy drawn from the reference is 5 times smaller than that in Fig. 3(a). This step is performed by switching C and re-arranging DAC reference rails so as not to degrade digital switching energy and is explained in detail in Section III-B. The proposed scheme is overall 33% more energy efficient than Fig. 3(a). Note that the DAC voltage achieved in the DOWN transition is negative of the desired value ( $V_{dac} = -V/4$ ). The sign of the DAC voltage is corrected by either interchanging the two DAC inputs to

the comparator or by comparing this negative DAC voltage with the negative of the sampled input voltage. The former approach is preferred as the latter will flip the offset of the comparator, which can affect the linearity of the ADC.

This reduction in the energy consumption is achieved by mapping the input voltage to a digital code that is more energy efficient to resolve than the actual code for the input. Fig. 4 explains this for a 4-bit SAR ADC. If $b_2$ is detected as logic HIGH, $V_{dacp} - V_{dacm}$ tracks $V_{ip} - V_{in}$ in the conventional way. However, if $b_2$ is detected as logic LOW, Flip goes HIGH, and $V_{dacp}$ and $V_{dacm}$ interchange their roles. The remaining tracking of the input $V_{ip} - V_{in}$ is then carried out by $V_{dacm} - V_{dacp}$. It is equivalent to resolving $V_{ref} - (V_{ip} - V_{in})$ by $V_{dacp} - V_{dacm}$ in the remaining bit cycles. This maps the input to a higher digital code and helps in reducing the number of times the CDAC is discharged, especially during the MSB decisions. As shown in Fig. 4, output code 1001 is resolved indirectly by tracking code 1110 with the CDAC. This results in fewer discharging steps than when the CDAC resolves 1001 directly.
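The code mapping described here can be sketched behaviorally. This is a minimal sketch and not the authors' logic: the function name `flipdac_tracked_code` is hypothetical, and the rule that the flip complements every bit below the MSB is inferred from the 1001 to 1110 example of Fig. 4 (it also agrees with the statement elsewhere in the paper that codes 511 and 512 of a 10-bit ADC are tracked as codes 0 and 1023). The text only describes the case $V_{ip} > V_{in}$; the mirrored condition for the lower half of the range is an assumption based on symmetry.

```python
def flipdac_tracked_code(code: int, n_bits: int) -> tuple[int, bool]:
    """Return (code tracked by the CDAC, flip flag) for a given output code.

    Assumption: the flip step is taken when the (MSB-1) decision is the
    first DOWN transition, i.e. when the two most significant bits differ.
    """
    msb = (code >> (n_bits - 1)) & 1
    msb_minus_1 = (code >> (n_bits - 2)) & 1
    flip = msb != msb_minus_1
    if flip:
        # Inferred rule: complement all bits below the MSB.
        tracked = code ^ ((1 << (n_bits - 1)) - 1)
    else:
        tracked = code
    return tracked, flip


if __name__ == "__main__":
    tracked, flip = flipdac_tracked_code(0b1001, 4)
    print(f"{tracked:04b}", flip)
```

Under this reading, the flip is taken for codes between one quarter and three quarters of full scale, which is where a conventional scheme would spend the most energy on DOWN transitions.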

Fig. 5 represents the switching scheme for a 4-bit ADC for $V_{ip} > V_{in}$. Note that the second-MSB capacitor (2C) is the replica of the remaining two LSB capacitors. The first DOWN transition in Fig. 5 again illustrates the concept of the FlipDAC switching technique. The FlipDAC step does not take any extra clock



Fig. 6. Comparison of CDAC switching energy with [11] and [17] for a 10-bit SAR ADC.

TABLE I DAC Switching Energy Comparison for a 10-b SAR ADC

| Spec.                       | [17]   | [18]   | [11]   | This Scheme |
|-----------------------------|--------|--------|--------|-------------|
| Avg. Energy $(C.V_{ref}^2)$ | 142    | 255.5  | 170    | 106         |
| Energy Saving               | 25.3 % | 58.6 % | 37.5 % | -           |

cycle and hence speed is not compromised. The flipping of the CDAC is done only for the first DOWN transition by making use of the symmetric structure of the CDAC, for both UP and DOWN transitions, from this node. The splitting of the (MSB – 1)th capacitor helps in implementing the binary search algorithm, after flipping, without incurring extra time and switching. The energy consumption during various steps for a 4-bit ADC is also compared with [11] in Fig. 5. The number in the circle represents the total number of unit capacitors connected to $V_{ref}$. The relative energy costs are shown on the arrows.

Fig. 6 compares the energy drawn from the reference for each code in a 10-bit SAR ADC for the schemes in [11] and [17] and for the FlipDAC switching scheme. Note that the proposed switching technique achieves minima at codes 511 and 512, where [11] has its maxima. This happens because code 511 and code 512 are resolved by tracking code 0 and code 1023, respectively, which have no DOWN transitions. Table I compares the proposed scheme with the present state-of-the-art CDAC switching schemes for a 10-bit ADC, and shows the energy savings achieved by the proposed scheme over them. This scheme necessitates the use of separate sampling capacitors, which upon investigation is found to be favorable for reducing the power consumption in the VGA, as explained in Section II-B.
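The savings row of Table I follows directly from the average energies. The short sketch below recomputes it from the rounded table entries; the variable and function names are illustrative only. Because the inputs are rounded, the recomputed percentages agree with the listed ones only to within about 0.1 percentage point.

```python
# Average switching energies from Table I, in units of C * Vref^2.
AVG_ENERGY = {"[17]": 142.0, "[18]": 255.5, "[11]": 170.0}
FLIPDAC_AVG_ENERGY = 106.0


def energy_saving_percent(reference_energy: float,
                          proposed_energy: float = FLIPDAC_AVG_ENERGY) -> float:
    """Relative saving of the proposed scheme over a reference scheme."""
    return 100.0 * (reference_energy - proposed_energy) / reference_energy


savings = {ref: energy_saving_percent(e) for ref, e in AVG_ENERGY.items()}

if __name__ == "__main__":
    for ref, value in savings.items():
        print(f"saving over {ref}: {value:.1f} %")
```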

### B. Ping-Pong Input Sampling

Fig. 7(a) depicts the timing diagram of a conventional 8-bit SAR ADC. Typically, 2–3 bit cycles or an equivalent delay $(T_{vga})$ are dedicated to the tracking of the input on the CDAC $(C_{DAC})$. This is followed by eight bit cycles for the digitization of the sample. Such a short tracking window demands large bandwidth (current) in the VGA. The power consumption in the VGA can be reduced by giving more time for the input tracking, but this conflicts



Fig. 7. Input sampling in a SAR ADC. (a) Conventional sampling. (b) Ping-pong input sampling.



Fig. 8. Architecture of the ping-pong input sampling scheme to relax the bandwidth requirement of VGA and reference buffer.  $V_{in,cm}$  is the output common mode voltage of the VGA.

with the design of the reference buffer. In this ADC, separate sampling and DAC capacitance is used to decouple the design of two blocks and employ ping-pong input sampling [14].

Figs. 7(b) and 8 illustrate the sampling scheme employed in the ADC. In this scheme, inputs are sampled on capacitors $(C_{in0} \text{ and } C_{in1})$ rather than $C_{DAC}$. There are two sets of sampling capacitors: while one tracks the input, the other is used to digitize the previous sample. This enables the use of the complete sample period $[T_S \text{ in Fig. 7(b)}]$ for the input tracking, which relaxes the bandwidth requirement of the VGA. It also reduces the power consumption in the reference buffer as comparatively more time is available for bit cycling. Ping-pong sampling in the ADC enables the use of two half-rate clocks for even- and odd-numbered channels. This halves the required clock frequency relative to asynchronous schemes that employ a clock at the sampling rate, which reduces power consumption in the clock buffers. The following analysis quantifies the advantage of the sampling scheme. The settling mechanism is assumed to be of first order and the analysis is done for a settling error of 1 LSB = $V_{\rm ref}/2^N$.

1) Power Saving in VGA: If  $T_{vga} = \alpha_{vga} \cdot T_S$ , N-bit settling error is given by

$$V_{\text{err1}} = \frac{V_{\text{ref}}}{2^N} = V_{\text{ref}} \cdot \exp\left(\frac{-T_{\text{vga}}}{R_{\text{eq}} \cdot C_{\text{DAC}}}\right). \tag{1}$$

Assuming output resistance  $R_{eq} = \beta/I_D$  where  $\beta$  is a constant dependent on the architecture of the driver and  $I_D$  is the current consumed in the driver

$$I_{D,\text{vga},1} = \frac{N \cdot \ln(2) \cdot \beta_{\text{vga}} \cdot C_{\text{DAC}}}{\alpha_{\text{vga}} \cdot T_S}. \tag{2}$$

Now for ping-pong input sampling  $\alpha_{vga} = 1$ 

$$I_{D,\text{vga},2} = \frac{N \cdot \ln(2) \cdot \beta_{\text{vga}} \cdot C_{\text{DAC}}}{T_S}. \tag{3}$$

The percentage power saving can be calculated as

$$\frac{I_{D,\text{vga},1} - I_{D,\text{vga},2}}{I_{D,\text{vga},1}} = (1 - \alpha_{\text{vga}}) \cdot 100\%. \tag{4}$$

2) Power Saving in Reference Buffer: If  $T_{ref} = \alpha_{ref} \cdot T_S$ , N-bit settling error due to the reference buffer is given by

$$V_{\rm err2} = \frac{V_{\rm ref}}{2^N} = V_{\rm ref} \cdot \exp\left(\frac{-T_{\rm ref}/(2N)}{R_{\rm eq} \cdot C_{\rm eq}}\right) \tag{5}$$

where  $C_{eq}$  is the equivalent capacitance seen by the reference buffer and 50 % of each bit cycle ( $T_{ref}/N$ ) is given for CDAC settling

$$I_{D,\text{ref},1} = \frac{N \cdot \ln(2) \cdot \beta_{\text{ref}} \cdot C_{\text{eq}}}{\alpha_{\text{ref}} \cdot T_S} \cdot 2N. \tag{6}$$

For ping-pong input sampling  $\alpha_{ref} = 1$ 

$$I_{D,\text{ref},2} = \frac{N \cdot \ln(2) \cdot \beta_{\text{ref}} \cdot C_{\text{eq}}}{T_S} \cdot 2N. \tag{7}$$

The percentage power saving can be calculated as

$$\frac{I_{D,\text{ref},1} - I_{D,\text{ref},2}}{I_{D,\text{ref},1}} = (1 - \alpha_{\text{ref}}) \cdot 100\%. \tag{8}$$

Hence, for an 8-bit ADC, with 2 cycles given for sampling in the conventional approach, $\alpha_{vga} = 0.2$ and $\alpha_{ref} = 0.8$. Consequently, 80% of the power can be saved in the VGA and 20% in the reference buffer by using the ping-pong sampling scheme. In practice, the power saving in the VGA is larger still, as the sampling capacitors ($C_{in0}$ and $C_{in1}$ in Fig. 8) are much smaller than $C_{DAC}$ (which is used as the sampling capacitor in the conventional sampling approach).
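The numbers in this paragraph can be reproduced from (2)–(8) with a few lines of Python. This is a sketch under the assumption that each conventional sampling cycle lasts as long as one bit cycle, so that the sample period is split into `n_sampling_cycles + n_bits` equal slots; the function names and the placeholder values of $\beta$ and the load capacitance are illustrative, not taken from the design.

```python
import math


def sampling_fractions(n_bits: int, n_sampling_cycles: int) -> tuple[float, float]:
    """Return (alpha_vga, alpha_ref) for conventional sampling."""
    total = n_bits + n_sampling_cycles
    return n_sampling_cycles / total, n_bits / total


def driver_current(n_bits: int, beta: float, c_load: float,
                   alpha: float, t_sample: float) -> float:
    """Driver current for N-bit settling, following the form of Eq. (2)."""
    return n_bits * math.log(2) * beta * c_load / (alpha * t_sample)


def ping_pong_saving(alpha: float) -> float:
    """Fractional current saving when alpha becomes 1, Eqs. (4) and (8)."""
    return 1.0 - alpha


if __name__ == "__main__":
    alpha_vga, alpha_ref = sampling_fractions(n_bits=8, n_sampling_cycles=2)
    print(f"VGA saving:    {100 * ping_pong_saving(alpha_vga):.0f} %")
    print(f"REFBFR saving: {100 * ping_pong_saving(alpha_ref):.0f} %")
```

The saving depends only on $\alpha$, since $N$, $\beta$, the load capacitance, and $T_S$ cancel in the ratio of the two currents.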

The drawback of this architecture is the need for extra sampling capacitors. However, as their sizes are determined by thermal noise and leakage at the top-plate switch rather than by matching, the area penalty is not significant for moderate resolution (8–10 b) ADCs. The leakage from sampling capacitors becomes a more important consideration for higher resolution ADCs in technologies with larger leakage. This architecture also requires good matching between the two sampling paths for a single-channel application and may need calibration [19]. No such requirement is imposed for multichannel input, as in an NRS, where each channel (even or odd) traverses a fixed path every time. It also alleviates the concern of duty cycle distortion due to the half-rate clocking, as even- and odd-numbered channels are sampled by two different nonoverlapping clocks.



Fig. 9. Data-rate reduction in NRS through the proposed activity-dependent A/D scheme.



Fig. 10. Effective activity factor. (a) Spike approximated as a triangular waveform. (b) Important spike features for spike-sorting [15], [21].

### C. Activity-Dependent A/D

There is a great need to reduce the amount of data to be transmitted to enable chronic recording from a larger number of channels. The information in an extracellular action potential (EAP) is essentially encoded in spike time-stamps, but the amplitude information in EAPs is also important for spike sorting [3]. Simple thresholding is found to be as effective as other, more complex spike detection algorithms [20]. However, representing a spike as a point event causes loss of information required for spike sorting. Another way to reduce the ODR is the transmission of important spike features [1], [3]. Fig. 10(b) shows important spike features, $A_{max}$ (the maximum positive spike amplitude), $A_{min}$ (the minimum negative spike amplitude) and $T_{pp}$ (the time between $A_{max}$ and $A_{min}$), that should be kept intact in the output of the ADC in a low ODR NRS [15], [21].

In this paper, we propose an activity-dependent A/D conversion scheme to obviate the processing of background noise but preserve important spike features. Figs. 9 and 10(a) illustrate the proposed scheme. In this scheme, the digitization process is enabled only when the input is larger than the spike detection threshold $S_{\text{TH}}$. The threshold is decided based on the magnitude of the background noise ($\sigma_n$): $S_{\text{TH}} = \mu_{\text{sig}} + k \cdot \sigma_n$, where $\mu_{\text{sig}}$ is the mean of the input and $k$ = 3–4 so that the probability of falsely detecting noise as a spike is small [22]. Note that this scheme is exclusively for the neural recording application, where the information is in spike time-stamps.

Fig. 10(a) depicts the concept of the activity-dependent A/D. The spike is approximated as a triangular waveform with maximum amplitude $A_{\text{max}}$ and spike duration $T_{\text{spike}}$. Referring to Fig. 10(a), the slope of the spike can be calculated as

$$m = \frac{dV}{dt} = \frac{A_{\text{max}}}{T_{\text{spike}}/2} = \frac{S_{\text{TH}}}{x}. \tag{9}$$

Hence

$$x = \frac{S_{\text{TH}} \cdot T_{\text{spike}}}{2 \cdot A_{\text{max}}} \tag{10}$$

which is the duration of a spike for which the ADC does not digitize the input [Fig. 10(a)].

If the total number of spikes in time  $T_{exp}$  is  $\alpha = S_R \cdot T_{exp}$ where  $S_R$  is the spike rate in spikes/s, the effective time ( $T_{eff}$ ) for which the N bit ADC operates is given by

$$T_{\rm eff} = \alpha (T_{\rm spike} - 2x) + \frac{T_{\rm exp} - \alpha (T_{\rm spike} - 2x)}{N+1}. \tag{11}$$

The first term indicates the time for which the ADC behaves as a free running ADC. The second term in the above equation indicates that the ADC operates only for one cycle for spike detection and is idle for remaining N bit cycles if the spike is not detected. The effective activity factor (EAF) of an N-bit ADC working on activity-based A/D scheme is given by

$$\text{EAF} = \frac{T_{\text{eff}}}{T_{\text{exp}}} = \frac{N}{N+1} \left( \frac{1}{N} + S_R \cdot T_{\text{spike}} \cdot \left( 1 - \frac{S_{\text{TH}}}{A_{\text{max}}} \right) \right). \tag{12}$$

Equation (12) gives the fraction of time for which an activity-dependent ADC is active relative to a free-running ADC. It represents the reduction in both the power consumption of the ADC and the ODR of the system. The typical values of $T_{\rm spike}$ and $S_R$ are 1 ms and 100 spikes/s, respectively. Assuming $S_{\rm TH}/A_{\rm max} = 0.2$, the EAF for an 8-bit ADC can be calculated from (12) as ~0.18, which corresponds to an 82% saving in energy and ODR over a free-running ADC.
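As a numerical check, the sketch below evaluates (12) directly and cross-checks it against (10) and (11) for the values quoted in this paragraph. The function names and the 1-s observation window are illustrative choices; the window length cancels in the ratio.

```python
def effective_activity_factor(n_bits: int, spike_rate: float,
                              t_spike: float, sth_over_amax: float) -> float:
    """EAF from Eq. (12)."""
    return (n_bits / (n_bits + 1)) * (
        1.0 / n_bits + spike_rate * t_spike * (1.0 - sth_over_amax)
    )


def eaf_from_teff(n_bits: int, spike_rate: float, t_spike: float,
                  sth_over_amax: float, t_exp: float = 1.0) -> float:
    """EAF computed via Eqs. (10) and (11) as a cross-check."""
    x = sth_over_amax * t_spike / 2.0           # Eq. (10)
    n_spikes = spike_rate * t_exp               # alpha in the text
    active = n_spikes * (t_spike - 2.0 * x)     # free-running portion
    t_eff = active + (t_exp - active) / (n_bits + 1)  # Eq. (11)
    return t_eff / t_exp


if __name__ == "__main__":
    eaf = effective_activity_factor(8, 100.0, 1e-3, 0.2)
    print(f"EAF = {eaf:.3f}, saving = {100 * (1 - eaf):.0f} %")
```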

As shown in Fig. 9, the logic is built into the ADC. The CDAC is reused for spike thresholding, which obviates the use of a separate DAC for each channel [2]. Equation (12) shows that $S_{\text{TH}}$ can be increased to reduce the EAF and to provide more immunity against the background noise, but this may cause loss of information by missing a spike. Hence, the value of $S_{\text{TH}}$ should be decided based on the spike sorter's requirement [23] in addition to the background noise. The proposed SAR ADC is designed to be programmable to operate either in this mode or in free-running mode to transmit raw data.

## III. CIRCUIT BLOCKS

### A. Comparator

Unlike many previous publications on SAR ADCs, in which no preamplifier is used before the clocked latch, we prefer to use a preamplifier for offset and kickback noise mitigation. Kickback noise is an important concern in this architecture due to the use of small sampling capacitors for reducing power consumption in the VGA. The preamplifier in a SAR ADC is subjected to step inputs only and needs to amplify the error just enough for the clocked latch to detect its sign. This relaxes the settling requirement of the preamplifier.

The schematic diagram of the preamplifier is shown in Fig. 11. A partial positive feedback is employed to reduce the effective output conductance. The load transistors are



Fig. 11. Schematic diagram of the four-input preamplifier. The preamplifier is employed to mitigate the effect of kickback noise on small sampling capacitors.

sized  $(W_2 < W_1)$  to prevent the effective output conductance from becoming negative, even in presence of mismatch. If  $W_2 = \eta \cdot W_1$ , the dc voltage gain  $A_{v0}$  and the bandwidth  $\omega_p$ , for a load capacitance  $C_L$ , can be calculated as

$$A_{v0} = \frac{g_{m,in}}{g_{m,w1}} \frac{1}{1-\eta} \qquad \omega_p = \frac{g_{m,w1} \cdot (1-\eta)}{C_L}$$

By increasing $\eta$ toward unity, more voltage gain can be achieved, but this makes the preamplifier more sluggish. We have chosen $\eta = 0.8$ as a tradeoff between voltage gain, speed, and stability. The simulated values are $A_{v0} = 14$ V/V and $\omega_p = 8$ MHz. The inputs $A_{dacp}$ and $A_{dacm}$ are interchangeable to correct the error in the sign of the DAC voltage caused during the flip step; this is explained in Section III-B. Individual input-referred offsets of the two differential pairs in the preamplifier vary with the dynamic operating point of each input pair [24], [25]. This variation is desensitized by the current source $I_0$ [18], especially when the effective input to the preamplifier is close to 1 LSB. The latch employed in the comparator is a conventional sense-amplifier-based latch as in [16].
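The gain and bandwidth tradeoff set by $\eta$ can be made concrete with the two expressions above. In the sketch below, the transconductance and load values are arbitrary placeholders chosen only to show the scaling (they are not the design values), and the function name is hypothetical.

```python
import math


def preamp_gain_and_pole(gm_in: float, gm_w1: float,
                         eta: float, c_load: float) -> tuple[float, float]:
    """Return (A_v0, omega_p) for the partial-positive-feedback load."""
    if not 0.0 <= eta < 1.0:
        raise ValueError("eta must satisfy 0 <= eta < 1 (W2 < W1)")
    net_load_gm = gm_w1 * (1.0 - eta)   # positive feedback cancels part of gm_w1
    return gm_in / net_load_gm, net_load_gm / c_load


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    gm_in, gm_w1, c_load = 50e-6, 20e-6, 100e-15
    for eta in (0.0, 0.5, 0.8):
        gain, pole = preamp_gain_and_pole(gm_in, gm_w1, eta, c_load)
        print(f"eta={eta:.1f}: gain={gain:.1f} V/V, pole={pole:.3e} rad/s")
```

With $\eta = 0.8$ the gain is five times that of the same stage without positive feedback and the pole is five times lower, so the gain-bandwidth product $g_{m,in}/C_L$ is unchanged.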

### B. CDAC Manipulation and Sign Correction

The FlipDAC switching scheme is explained in Section II-A. During the first DOWN transition, flipping of the CDAC is found to reduce energy consumption in it. The flip step comprises two parts: switching of the (MSB - 1)th capacitor and manipulating the DAC reference rails. This keeps the DAC capacitance switched in each bit cycle the same as in [11]. During the flip step, the voltage reference rails in the CDAC are manipulated to achieve the desired magnitude of the DAC voltage; the logic for this is shown in Fig. 12. When the Flip signal (Fig. 4) goes HIGH, rails V and G are shorted together to the $V_{cm}$ input. Based on the sign of the input, the $V_{cm1}$ and $V_{cm2}$ rails are each shorted to either REFP or REFM, which are outputs of the reference buffer. The proposed ADC has an overhead only in digital switching energy over [11], due to the manipulation of the DAC reference rails; otherwise, equal capacitances are switched in every bit cycle. However, as this manipulation requires a single driver (FLIP), which switches only once in a sample period when the first MSB is logic LOW, the overhead is small.



Fig. 12. Energy-efficient implementation of the FlipDAC step by manipulating DAC reference rails. The sign of the DAC voltage is corrected by interchanging DAC inputs to the preamplifier.

As discussed in Section II-A, the flip step causes the effective DAC voltage ($V_{dacp} - V_{dacm}$) to become the negative of the desired value. This is compensated by interchanging the $V_{dacp}$ and $V_{dacm}$ inputs to the preamplifier when the Flip signal goes HIGH. Charge sharing between the input parasitic capacitance of the comparator and the CDAC occurs during the flip step and introduces an error. The error caused by this charge sharing is $\sim 2 \cdot (C_{p,cmp}/C_{dac}) \cdot (\text{REFP} - \text{REFM})/4$ differential. With $C_{ox} \sim 10 \text{ fF}/\mu\text{m}^2$ and a W/L of the input transistor of 1 $\mu\text{m}$/0.2 $\mu\text{m}$, $C_{p,cmp}$ is approximately 2 fF. The error comes out to be $\sim 0.26 \text{ mV}$, which is much smaller than the LSB of the ADC. During layout, care was taken to keep the input parasitic capacitance of the comparator as small as possible. The matching of the parasitic capacitances of the two paths to the comparator is also important to keep the gain error constant.
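The quoted error can be reproduced numerically. The sketch below uses the 3.84-pF total DAC capacitance reported in Section IV and assumes REFP − REFM = 1 V, a value that is not stated in this paragraph but is consistent with the 1-$V_{pp}$ differential full-scale range and with the 0.26-mV result; the function names are illustrative.

```python
def gate_capacitance(c_ox_f_per_um2: float, w_um: float, l_um: float) -> float:
    """Parallel-plate estimate of the input transistor gate capacitance."""
    return c_ox_f_per_um2 * w_um * l_um


def flip_charge_sharing_error(c_p_cmp: float, c_dac: float,
                              v_ref_diff: float) -> float:
    """Differential error ~ 2 * (C_p,cmp / C_dac) * (REFP - REFM) / 4."""
    return 2.0 * (c_p_cmp / c_dac) * v_ref_diff / 4.0


if __name__ == "__main__":
    c_p = gate_capacitance(10e-15, w_um=1.0, l_um=0.2)     # about 2 fF
    err = flip_charge_sharing_error(c_p, c_dac=3.84e-12,
                                    v_ref_diff=1.0)         # assumed 1 V
    lsb = 1.0 / 2 ** 8                                      # 1 Vpp FSR, 8 bits
    print(f"C_p = {c_p * 1e15:.1f} fF, error = {err * 1e3:.2f} mV, "
          f"LSB = {lsb * 1e3:.2f} mV")
```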

### C. Asynchronous Logic and CDAC

The asynchronous scheme employed is shown in Fig. 13. The individual bit cycling phases are generated once the comparator makes a decision after STB; the decision is detected by a NAND gate, which generates the NEXT signal. A shift register of depth 11 is used to advance a pulse, after each decision, to enable the extraction of the next bit. The first flip-flop in the shift register is preset by start-of-conversion (SOC) to generate the pulse, which is propagated to the next flip-flop on each rising edge of the NEXT signal. A programmable delay line (PDL) is used to generate the RST and PST signals to, respectively, reset and preset STB by introducing delays $t_{RST}$ and $t_{PST}$ ($t_{PST} > t_{RST}$). The input HS controls the PDL to modulate the delays for operation at higher speeds.

Once the pulse reaches the final flip-flop, OVER signal halts the conversion until the next SOC. The number of bit cycles



Fig. 13. Block diagram of the employed asynchronous scheme.



Fig. 14. Architecture of the 8-bit capacitive DAC.

is decided based on the resolution requirement (*N*) through a digital MUX. It implements variable resolution without complicating the layout and logic. As the resolution is reduced by simply halting the binary search algorithm in-between, it enables the resolution reconfiguration from 8 to 1 bit in 1-bit steps. This scheme of variable resolution saves power linearly with the resolution, and is finally limited by the static power consumption in the ADC. Two extra bit cycles are used to include the logic for  $S_{\text{TH}}$ , and are controlled by the signal sthen (Fig. 13). This state can be bypassed for the free running mode of the ADC.
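The variable-resolution mechanism, halting the binary search after $N$ decisions, can be illustrated with an idealized behavioral model. This is a sketch only: `sar_convert` is a hypothetical name, the input is a normalized unipolar value in [0, 1) rather than the differential signal of the actual design, and the comparator, CDAC, and spike-threshold cycles are not modeled.

```python
def sar_convert(vin_norm: float, full_bits: int = 8, resolution: int = 8) -> int:
    """Ideal SAR binary search that stops after `resolution` bit cycles.

    Returns a `resolution`-bit code, i.e. the top bits of the full search.
    """
    if not 1 <= resolution <= full_bits:
        raise ValueError("resolution must be between 1 and full_bits")
    code = 0
    # MSB first; fewer bit cycles are run when the resolution is reduced.
    for bit in range(full_bits - 1, full_bits - 1 - resolution, -1):
        trial = code | (1 << bit)
        if vin_norm >= trial / (1 << full_bits):
            code = trial
    return code >> (full_bits - resolution)


if __name__ == "__main__":
    for n in (8, 3, 1):
        print(n, sar_convert(0.6, resolution=n))
```

Each reduction of the resolution by one bit removes one bit cycle, which is why the dynamic power falls roughly linearly with the resolution setting.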

Fig. 14 depicts the architecture of the 8-bit CDAC used in the ADC. Separate sub-DACs for DOWN and UP transitions are used. As the linearity of a binary weighted CDAC is determined by the total capacitance switched to the reference, irrespective of its position [16], this structure does not compromise the linearity of the CDAC.

### D. VGA and Reference Buffer

The architecture of the VGA block is shown in Fig. 15. The OTA is shared between two sets of capacitors for employing ping-pong input sampling (Section II-B). The $G_m$ block in Fig. 15 is a two-stage Miller-compensated transconductance amplifier, which consumes only 3 $\mu$A when driving a 1-MS/s 8-bit SAR ADC. The load to the VGA is 300 fF, which was chosen as the size of the sampling capacitors. In simulation, the VGA achieved a THD of 0.035% for a 1-$V_{p-p}$ output swing.

For large programmable range in voltage gain, two stages of the VGA block are used. One of these stages can be put into the sleep mode if the amplitude of the detectable signal is large. Each VGA uses  $C_S = 300$  fF and  $C_f = 40$  fF



Fig. 15. Architecture of each VGA stage. For better matching, the OTA is shared between even- and odd-numbered channels.



Fig. 16. Die photograph of the chip designed in UMC 0.13- $\mu m$  CMOS technology.

(fixed) + 80 fF (programmable in four steps). Voltage gains that can be achieved with one VGA stage are 2.5, 3.75, 5, and 7.5 V/V. The programmable voltage gain range is 2.5–56.25 V/V in eight steps. We have also designed an on-chip voltage reference buffer for the ADC. The OTA designed for the reference buffer is a two-stage trans-conductance amplifier consuming 4  $\mu$ A and is load compensated.
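The quoted gain settings are consistent with a closed-loop gain of $C_S/C_f$ per stage. The sketch below makes that check explicit; the gain expression and the individual programmable capacitor values (0, 20, 40, and 80 fF) are inferred from the listed gains rather than stated in the text, and the names are illustrative.

```python
import math

C_S_FF = 300.0                            # sampling capacitor, fF
C_F_FIXED_FF = 40.0                       # fixed feedback capacitor, fF
C_F_PROG_FF = (0.0, 20.0, 40.0, 80.0)     # assumed programmable settings, fF


def stage_gains() -> list[float]:
    """Single-stage gains, assuming A_v = C_S / C_f."""
    return sorted(C_S_FF / (C_F_FIXED_FF + c) for c in C_F_PROG_FF)


def to_db(gain: float) -> float:
    """Voltage gain in dB."""
    return 20.0 * math.log10(gain)


if __name__ == "__main__":
    gains = stage_gains()
    g_min, g_max = gains[0], gains[-1] ** 2   # one stage asleep vs. two at max
    print(gains)
    print(f"range: {g_min} to {g_max} V/V "
          f"({to_db(g_min):.1f} to {to_db(g_max):.1f} dB)")
```

The resulting range of 2.5 to 56.25 V/V corresponds to roughly 8 to 35 dB, matching the range given in the Introduction.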



Fig. 17. Measured DNL and INL plots. *Y*-axis is in LSB and *X*-axis is the output digital code.



Fig. 18. Measured SNDR versus  $F_{in}$  for -1 dBFS input at 1 MS/s speed.

## IV. EXPERIMENTAL RESULTS

### A. ADC Characterization

Fig. 16 shows a die photo of the chip fabricated in UMC 0.13-$\mu$m CMOS technology. The ADC occupies an area of 390 $\mu$m × 420 $\mu$m. The charge redistribution DAC employs custom-made unit capacitors of ~15 fF using metal-oxide-metal (MOM) technology with six metal layers. This value of the unit capacitance is larger than the unit capacitance required for an 8-bit ADC with $\sigma_C/C_0 = 0.5\%$. The total DAC capacitance is 3.84 pF. The full scale range (FSR) of the ADC is 1 $V_{pp}$ differential.

The maximum measured INL and DNL (Fig. 17) are found to be 0.6 LSB/-0.7 LSB and 0.26 LSB/-0.67 LSB, respectively. These maxima occur at 0.25 FSR and 0.75 FSR because of a mismatch between the two paths (Fig. 12) used to interchange the two DAC inputs to the preamplifier once they are flipped. Fig. 19 shows a 16384-point FFT of the ADC output for a -1 dBFS input at $F_{in} = 62.439$ kHz. Fig. 18 presents the measured signal-to-noise-and-distortion ratio (SNDR) for different input frequencies at -1 dBFS input. The ADC achieves an SNDR of 48.1 dB for a near-Nyquist input (499.939 kHz), which translates to an effective number of bits (ENOB) of 7.7 bits.
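As a check on the conversion from SNDR to ENOB, the standard sine-wave relation can be applied to the measured value. This is a worked example under the assumption that no correction is made for the -1 dBFS input level, which the text does not state explicitly:

$$\mathrm{ENOB} = \frac{\mathrm{SNDR} - 1.76\ \text{dB}}{6.02\ \text{dB/bit}} = \frac{48.1 - 1.76}{6.02} \approx 7.7\ \text{bits}$$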

Fig. 19. 16384-point FFT of the ADC output for a -1 dBFS input at $F_{\rm in} = 62.439$ kHz.

Fig. 20. Relative power dissipation versus resolution ($P_{\text{max}} = 8.8\ \mu\text{W}$).

The ADC consumes a total power of 8.8 $\mu$W with $V_{DD} = 1$ V at the 8-bit setting. Based on parasitic-extracted simulations, the CDAC consumes only 0.4 $\mu$W and the preamplifier consumes 1.5 $\mu$W, which are ~5% and ~17% of the total power consumption, respectively. The power consumption is dominated by digital switching (~78%). The digital logic consumes more power than normal SAR ADC logic, primarily because of the in-built logic for spike thresholding in the ADC.

As the power consumption is dominated by digital switching, it decreases linearly with decreasing resolution, as shown in Fig. 20. The FoM of the ADC is found to be 42.3 fJ/conversion-step in the free-running mode at the 8-bit setting. The FoM would be even smaller ($\sim 35.1$ fJ/conversion-step) if the ADC did not employ the preamplifier to mitigate the effect of kickback noise on the small sampling capacitors. Larger sampling capacitors could have been used to obviate the preamplifier, but that would defeat the purpose of ping-pong sampling, which is to reduce power consumption in the VGA. Table III compares this work with state-of-the-art SAR ADCs with similar speed of operation.
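The two FoM figures above can be reproduced with the commonly used definition $\mathrm{FoM} = P / (2^{\mathrm{ENOB}} \cdot f_s)$. The sketch below assumes that this is the definition used in the paper (it matches the reported numbers) and that removing the preamplifier would leave the ENOB unchanged. The function name is hypothetical.

```python
def fom_fj_per_step(power_w, enob_bits, fs_hz):
    """Figure of merit P / (2**ENOB * fs), returned in fJ per conversion-step."""
    return power_w / (2.0 ** enob_bits * fs_hz) * 1e15


P_TOTAL_W = 8.8e-6   # measured total power at the 8-bit setting
P_PREAMP_W = 1.5e-6  # simulated preamplifier power
ENOB_BITS = 7.7      # measured near Nyquist
FS_HZ = 1e6          # sampling rate

fom = fom_fj_per_step(P_TOTAL_W, ENOB_BITS, FS_HZ)
# Assumption: ENOB is unchanged when the preamplifier power is excluded.
fom_no_preamp = fom_fj_per_step(P_TOTAL_W - P_PREAMP_W, ENOB_BITS, FS_HZ)

print(f"FoM with preamplifier:    {fom:.1f} fJ/conversion-step")
print(f"FoM without preamplifier: {fom_no_preamp:.1f} fJ/conversion-step")
```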

#### B. Activity-Dependent A/D

Fig. 21 shows the measured output of the ADC working under the activity-dependent A/D scheme at 1 MS/s. For this experiment, neural data recorded *in vitro* from the hippocampal culture of a Wistar rat is fed to the ADC using an Agilent 81150A pulse function generator. The values of $S_{\text{TH}}$ and EAF are shown in Fig. 21, where the first bit S is the sign bit. Based on the value of input noise $\sigma_n$, a proper value of $S_{\text{TH}}$ can be found, which reduces power consumption and ODR but preserves the three important features of a spike, $A_{\text{max}}$, $A_{\text{min}}$, and $T_{pp}$ [Fig. 10(b)].

TABLE II VGA Comparison

| Spec.                            | [1]           | [8]           | [26]            | This Paper  |
|----------------------------------|---------------|---------------|-----------------|-------------|
| ADC                              | 9 b, 640 kS/s | 10 b, 16 kS/s | 8 b, 31.25 kS/s | 8 b, 1 MS/s |
| $I_B$                            | 40.6 µA       | 0.55 µA       | 1.45 µA         | 3 µA        |
| $K_{\rm eff,vga}$ (pF$^{-1}$)    | 49.5 m        | 91.4 m        | 67.7 m          | 1.05        |

TABLE III SAR ADC Comparison

|                          | [27]     | [28]     | [13]     | [23]     | [29]         | This Paper |
|--------------------------|----------|----------|----------|----------|--------------|------------|
| Technology               | 0.25 µm  | 0.18 µm  | 0.18 µm  | 0.13 µm  | 65 nm        | 0.13 µm    |
| Supply                   | 1 V      | 1 V      | 1 V      | 1 V      | 0.4–1 V      | 1 V        |
| Power Consumption (µW)   | 3.1      | 25       | 7.75     | 0.9      | 0.2 @ 0.55 V | 8.8        |
| Speed                    | 100 kS/s | 100 kS/s | 500 kS/s | 100 kS/s | 20 kS/s      | 1 MS/s     |
| ENOB                     | 7.0      | 10.55    | 7.5      | 7.55     | 8.84         | 7.7        |
| FoM (fJ/conversion-step) | 310      | 165      | 86       | 48       | 22.4         | 42.3       |

Another experiment was performed to find the reduction in power consumption and ODR as a function of $S_{\text{TH}}$. For this experiment, the spike input to the ADC is approximated by a triangular waveform with noise, generated using an Agilent 81150A pulse function generator. Fig. 22(a) and (b) show the relative reduction in power consumption and ODR as a function of $S_{\text{TH}}$ under different noise ($\sigma_n$) conditions. When configured in this mode, the ADC takes two extra cycles for spike detection and purging of the CDAC. If $S_{\text{TH}}$ is set to 0 in this mode, the ADC runs freely with no thresholding but with the two extra cycles, and hence consumes more power (10.7 $\mu$W). The power savings become evident once $S_{\text{TH}}$ is increased to the desired value.
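The trend in output data rate can be illustrated with a simple behavioral model. The sketch below is not the on-chip logic. It assumes that a sample is passed to the output only when its magnitude is at least $S_{\text{TH}}$, and it uses a hypothetical noisy triangular spike train with arbitrary amplitude, width, and noise values expressed in LSB.

```python
import random


def noisy_spike_train(n_samples=4000, period=200, width=40,
                      amplitude=100.0, sigma_n=5.0, seed=0):
    """Triangular spikes on a zero baseline with additive Gaussian noise (hypothetical test signal)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    samples = []
    for i in range(n_samples):
        k = i % period
        if k < width:
            # Triangle rising from 0 to the peak amplitude and back to 0
            tri = amplitude * (1.0 - abs(2.0 * k / width - 1.0))
        else:
            tri = 0.0
        samples.append(tri + rng.gauss(0.0, sigma_n))
    return samples


def relative_odr(samples, s_th):
    """Fraction of samples kept, assuming a sample is output only if |x| >= s_th."""
    kept = sum(1 for x in samples if abs(x) >= s_th)
    return kept / len(samples)


signal = noisy_spike_train()
odr_curve = [(s_th, relative_odr(signal, s_th)) for s_th in (0, 5, 10, 15, 20)]

for s_th, odr in odr_curve:
    print(f"S_TH = {s_th:2d} LSB -> relative ODR = {odr:.3f}")
```

With a threshold of 0 every sample is kept, which corresponds to the free-running case described above. Raising the threshold can only reduce the fraction of samples kept. The model does not account for the two extra cycles or for power.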

#### C. VGA

The VGA can only be characterized at low frequencies ($\sim 1$ kHz), as it is not designed to drive a large capacitance ($\sim 5$ pF for the I/O pads) at higher frequencies. The low-frequency voltage gain matches the expected 8–35 dB. In addition, when testing the ADC, inputs at different frequencies (up to the Nyquist frequency) are applied both directly and through the VGA. Similar performance is achieved in both cases, which indirectly indicates that the VGA functions properly up to the Nyquist frequency.

To evaluate the power efficiency of the VGA, we have used the metric  $K_{\text{eff},\text{vga}}$  defined by

$$K_{\rm eff, vga} = \frac{2 \cdot \pi \cdot F_N \cdot V_{\rm sw}}{I_B} \cdot 10^{-12} \text{ pF}^{-1}$$
(13)

where $V_{\rm sw}$ is the output swing, $F_N$ is the Nyquist frequency of the ADC, and $I_B$ is the current consumption in the VGA. $K_{\rm eff,vga}$ is a measure of how efficiently the current is utilized in the VGA for a given slew-rate requirement, and is the inverse of the effective load capacitance that has to be charged by the VGA.



Fig. 21. Measured digital output of the activity-dependent ADC. The x-axis and y-axis represent the time in  $\mu$ s and the output code, respectively. The asymmetric rejection of the background noise is due to the fact that only five bits were used to encode  $S_{\text{TH}}$ .



Fig. 22. Relative reduction in power consumption ( $P_{\text{max}} = 10.7 \ \mu W$ ) and ODR with different  $\sigma_n$  and  $S_{\text{TH}}$ .

The higher the $K_{\rm eff,vga}$, the more power efficient the VGA. Table II compares the VGA with those in three state-of-the-art NRSs.
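The $K_{\rm eff,vga}$ entries in Table II can be reproduced from (13) using the sampling rates and bias currents listed in the table. The sketch below assumes $F_N$ is half the ADC sampling rate and an output swing of $V_{\rm sw} = 1$ V for every design. The common swing value is an assumption, chosen because it matches the tabulated values, and the function name is hypothetical.

```python
import math


def k_eff_vga(f_s_hz, i_b_a, v_sw_v=1.0):
    """Eq. (13): 2*pi*F_N*V_sw / I_B scaled by 1e-12, in pF^-1.

    F_N is taken as half the sampling rate; v_sw_v = 1 V is an assumption.
    """
    f_n_hz = f_s_hz / 2.0
    return 2.0 * math.pi * f_n_hz * v_sw_v / i_b_a * 1e-12


# (ADC sampling rate in S/s, VGA bias current in A), taken from Table II
designs = {
    "[1]": (640e3, 40.6e-6),
    "[8]": (16e3, 0.55e-6),
    "[26]": (31.25e3, 1.45e-6),
    "This paper": (1e6, 3e-6),
}

k_values = {name: k_eff_vga(fs, ib) for name, (fs, ib) in designs.items()}

for name, k in k_values.items():
    print(f"{name}: K_eff,vga = {k:.4f} pF^-1")
```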

### V. CONCLUSION

We presented an 8-to-1 bit, 1 MS/s SAR ADC in UMC 0.13-$\mu$m CMOS technology. An energy-efficient DAC switching scheme was proposed, which consumes the lowest power to date without using extra capacitors or clock cycles. The DAC switching scheme consumes 37% less energy than the present state-of-the-art. For multichannel input, the use of ping-pong input sampling was emphasized to save power in the VGA and reference buffer. The proposed ADC also consumes less power in the clock buffers, as it employs a clock at half the sampling rate. In addition, the resolution of the ADC can be varied based on the dynamic range to avoid unnecessary processing and save power.

For neural recording applications, the ODR was reduced using the proposed activity-dependent A/D scheme, which preserves important spike features in the ADC output. This scheme saves both power and area when compared to spike feature extraction schemes that employ complicated on-chip DSP. The savings in ADC power consumption and ODR due to the scheme are calculated and expressed in terms of EAF. An experiment with real neural data was carried out to show the usefulness of the scheme for the application. Emphasis was placed on the requirement of a large programmable voltage gain in the VGA for NRS. The presented VGA provides a voltage gain of 8–35 dB in eight steps.

The presented SAR ADC consumes 8.8 $\mu$W from a 1 V supply and achieves an ENOB of 7.7 bits for a near-Nyquist input at 1 MS/s. The FlipDAC switching technique makes the energy consumption in the DAC small compared to the digital switching energy. The DAC switching scheme will prove more beneficial in higher-resolution and higher-speed SAR ADCs, where the DAC switching energy is more comparable to the digital switching energy. The ADC achieves a FoM of 42.3 fJ/conversion-step. The power consumption in the ADC is dominated by digital switching, which will only improve with voltage and technology scaling.

#### REFERENCES

- [1] M. S. Chae, Z. Yang, M. Yuce, L. Hoang, and W. Liu, "A 128-channel 6 mW wireless neural recording IC with spike feature extraction and UWB transmitter," *IEEE Trans. Neural Syst. Rehabil. Eng.*, vol. 17, no. 4, pp. 312–321, Aug. 2009.
- [2] R. Harrison, R. J. Kier, C. A. Chestek, V. Gilja, P. Nuyujukian, S. Ryu, B. Greger, F. Solzbacher, and K. V. Shenoy, "Wireless neural recording with single low-power integrated circuit," *IEEE Trans. Neural Syst. Rehabil. Eng.*, vol. 17, no. 4, pp. 322–329, Aug. 2009.
- [3] A. Sodagar, G. Perlin, Y. Yao, K. Najafi, and K. Wise, "An implantable 64-channel wireless microsystem for single-unit neural recording," *IEEE J. Solid-State Circuits*, vol. 44, no. 9, pp. 2591–2604, Sep. 2009.
- [4] M. Yin and M. Ghovanloo, "A flexible clockless 32-ch simultaneous wireless neural recording system with adjustable resolution," in *Proc. IEEE Int. Solid-State Circuits Conf., Dig. Tech. Papers*, Feb. 2009, pp. 432–433.
- [5] M. Azin, D. Guggenmos, S. Barbay, R. Nudo, and P. Mohseni, "A battery-powered activity-dependent intracortical microstimulation IC for brain-machine-brain interface," *IEEE J. Solid-State Circuits*, vol. 46, no. 4, pp. 731–745, Apr. 2011.
- [6] R. Harrison and C. Charles, "A low-power low-noise CMOS amplifier for neural recording applications," *IEEE J. Solid-State Circuits*, vol. 38, no. 6, pp. 958–965, Jun. 2003.
- [7] V. Chaturvedi and B. Amrutur, "An area-efficient noise-adaptive neural amplifier in 130 nm CMOS technology," *IEEE J. Emerg. Sel. Topics Circuits Syst.*, vol. 1, no. 4, pp. 536–545, Dec. 2011.
- [8] W.-S. Liew, X. Zou, L. Yao, and Y. Lian, "A 1-V 60-μW 16-channel interface chip for implantable neural recording," in *Proc. IEEE Custom Integr. Circuits Conf.*, Sep. 2009, pp. 507–510.
- [9] J. McCreary and P. Gray, "All-MOS charge redistribution analog-to-digital conversion techniques. I," *IEEE J. Solid-State Circuits*, vol. 10, no. 6, pp. 371–379, Dec. 1975.
- [10] G. Promitzer, "12-bit low-power fully differential switched capacitor noncalibrating successive approximation ADC with 1MS/s," *IEEE J. Solid-State Circuits*, vol. 36, no. 7, pp. 1138–1143, Jul. 2001.
- [11] Y. Zhu, C.-H. Chan, U.-F. Chio, S.-W. Sin, U. Seng-Pan, R. P. Martins, and F. Maloberti, "A 10-bit 100-MS/s reference-free SAR ADC in 90 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 45, no. 6, pp. 1111–1121, Jun. 2010.
- [12] M. van Elzakker, E. van Tuijl, P. Geraedts, D. Schinkel, E. Klumperink, and B. Nauta, "A 1.9 μW 4.4 fJ/conversion-step 10 b 1 MS/s charge-redistribution ADC," in *Proc. IEEE Int. Solid-State Circuits Conf., Dig. Tech. Papers*, Feb. 2008, pp. 244–610.
- [13] Y.-K. Chang, C.-S. Wang, and C.-K. Wang, "A 8-bit 500-KS/s low power SAR ADC for bio-medical applications," in *Proc. IEEE Asian Solid-State Circuits Conf.*, Nov. 2007, pp. 228–231.
- [14] B. Malki, T. Yamamoto, B. Verbruggen, P. Wambacq, and J. Craninckx, "A 70 dB DR 10b 0-to-80MS/s current-integrating SAR ADC with adaptive dynamic range," in *Proc. IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, Feb. 2012, pp. 470–472.
- [15] R. H. Olsson and K. Wise, "A three-dimensional neural recording microsystem with implantable data compression circuitry," *IEEE J. Solid-State Circuits*, vol. 40, no. 12, pp. 2796–2804, Dec. 2005.
- [16] B. Ginsburg and A. Chandrakasan, "500-MS/s 5-bit ADC in 65-nm CMOS with split capacitor array DAC," *IEEE J. Solid-State Circuits*, vol. 42, no. 4, pp. 739–747, Apr. 2007.
- [17] T. Anand, V. Chaturvedi, and B. Amrutur, "Energy efficient asymmetric binary search switching technique for SAR ADC," *Electron. Lett.*, vol. 46, no. 22, pp. 1487–1488, Oct. 2010.
- [18] C.-C. Liu, S.-J. Chang, G.-Y. Huang, and Y.-Z. Lin, "A 10-bit 50-MS/s SAR ADC with a monotonic capacitor switching procedure," *IEEE J. Solid-State Circuits*, vol. 45, no. 4, pp. 731–740, Apr. 2010.
- [19] A. Petraglia and S. Mitra, "Analysis of mismatch effects among A/D converters in a time-interleaved waveform digitizer," *IEEE Trans. Instrum. Meas.*, vol. 40, no. 5, pp. 831–835, Oct. 1991.
- [20] I. Obeid and P. D. Wolf, "Evaluation of spike-detection algorithms for a brain-machine interface application," *IEEE Trans. Biomed. Eng.*, vol. 51, no. 6, pp. 905–911, Jun. 2004.
- [21] J. Vibert and J. Costa, "Spike separation in multiunit records: A multivariate analysis of spike descriptive parameters," *Electroencephalograph. Clinical Neurophysiol.*, vol. 47, no. 2, pp. 172–182, 1979.

- [22] H. Semmaoui, J. Drolet, A. Lakhssassi, and M. Sawan, "Setting adaptive spike detection threshold for smoothed-teo based on robust statistics theory," *IEEE Trans. Biomed. Eng.*, vol. 59, no. 2, pp. 474–482, Feb. 2012.
- [23] S. O'Driscoll, K. Shenoy, and T. Meng, "Adaptive resolution ADC array for an implantable neural sensor," *IEEE Trans. Biomed. Circuits Syst.*, vol. 5, no. 2, pp. 120–130, Apr. 2011.
- [24] L. Sumanen, M. Waltari, and K. Halonen, "A mismatch insensitive CMOS dynamic comparator for pipeline A/D converters," in *Proc. 7th IEEE Int. Conf. Electron., Circuits Syst.*, vol. 1. Dec. 2000, pp. 32–35.
- [25] V. Katyal, R. L. Geiger, and D. J. Chen, "A new high precision low offset dynamic comparator for high resolution high speed ADCs," in *Proc. IEEE Asia Pacific Conf. Circuits Syst.*, Dec. 2006, pp. 5–8.
- [26] W. Wattanapanitch and R. Sarpeshkar, "A low-power 32-channel digitally programmable neural recording integrated circuit," *IEEE Trans. Biomed. Circuits Syst.*, vol. 5, no. 6, pp. 592–602, Dec. 2011.
- [27] M. Scott, B. Boser, and K. Pister, "An ultralow-energy ADC for smart dust," *IEEE J. Solid-State Circuits*, vol. 38, no. 7, pp. 1123–1129, Jul. 2003.
- [28] N. Verma and A. Chandrakasan, "An ultra low energy 12-bit rate-resolution scalable SAR ADC for wireless sensor nodes," *IEEE J. Solid-State Circuits*, vol. 42, no. 6, pp. 1196–1205, Jun. 2007.
- [29] M. Yip and A. Chandrakasan, "A resolution-reconfigurable 5-to-10b 0.4-to-1 V power scalable SAR ADC," in *Proc. IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, Feb. 2011, pp. 190–192.



Vikram Chaturvedi (S'10) received the B.E. degree in electronics and telecommunication from the Shri Govindram Seksaria Institute of Technology and Science, Indore, India, in 2006. He is currently pursuing the Ph.D. degree with the Electrical Communication Engineering Department, Indian Institute of Science, Bangalore, India.

He was with ERC, Tata Motors Ltd., where he was involved in research on remote keyless entry systems from 2006 to 2007. He presented a paper at the SRP session of ISSCC 2010. His current research interests include adaptive biomedical systems, energy-efficient data converters, and high-speed serial interfaces.

Mr. Chaturvedi was a recipient of the first Student Travel Grants by IEEE Solid-State Circuit Society (SSCS) as a recognition of early career accomplishments on solid-state circuits. He is a member of the SSCS and the IEEE Circuits and Systems Society.



Tejasvi Anand (S'12) received the B.Tech. degree in electronics and communication from Guru Gobind Singh Indraprastha University, New Delhi, India, and the M.Tech. degree in electronics design and technology from the Indian Institute of Science, Bangalore, India, in 2006 and 2008, respectively. He is currently pursuing the Ph.D. degree with Oregon State University, Corvallis.

He was with the Cosmic Circuits and Microelectronics Laboratory, Indian Institute of Science. His current research interests include high-speed serial interfaces, frequency synthesizers, and data converters.



**Bharadwaj Amrutur** (M'08) received the B.Tech. degree in computer science and engineering from the Indian Institute of Technology Bombay, Mumbai, India, in 1990, and the M.S. and Ph.D. degrees in electrical engineering from Stanford University, Palo Alto, CA, in 1994 and 1999, respectively.

He was with Bell Labs, Agilent Labs, and Greenfield Networks. He is currently an Associate Professor with the Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore, India, where he is involved in research on VLSI circuits and systems.