Development of an FPGA-based high resolution TDC using Xilinx Kintex-7

T.N. Takahashi

Research Center for Nuclear Physics (RCNP), Osaka University, Ibaraki, Osaka 567-0047, Japan

Time-of-flight (TOF) measurement is one of the basic tools for distinguishing particles in nuclear physics experiments. The future J-PARC E50 experiment [1] needs a low-cost time-to-digital converter (TDC) with a short dead time and a time resolution better than 30 ps. A promising candidate is a multi-channel high-resolution TDC based on a field-programmable gate array (FPGA). In the previous work [2], 16+1 TDC channels were implemented in a Xilinx Spartan-6 FPGA, and a time resolution of 20–30 ps (σ) was achieved. One advantage of an FPGA-based high-resolution TDC is that, once the implementation method and the operating principle are understood, the TDC performance and channel density can be readily extended by moving to a newer FPGA. This article describes a study of a TDC implementation in a Xilinx Kintex-7 FPGA (XC7K160T-1FFG676C).

The TDC uses two types of counter to measure the signal arrival time. One is a coarse time counter, which counts clock cycles. The other is a fine time counter, which interpolates the clock interval using a tapped delay line. To maximize the number of channels in the FPGA, the delay line must be short. The sampling clock (i.e. the coarse counter clock) is therefore set to 625 MHz, the highest frequency that can drive the global clock buffer of the target device. The tapped delay line is built from a carry chain in the FPGA. The delay of each tap, the so-called bin width, becomes smaller as the FPGA manufacturing process shrinks, which in general leads to better time resolution. In [3], however, it is reported that many zero-width bins appear when a Xilinx 7 series FPGA is used for a tapped-delay-line TDC: about half of the delay line is occupied by zero-width bins. To obtain a finer bin width, the authors of [3] re-order the delay line outputs based on the place-and-route report. Another approach to reducing the number of zero-width bins is the tuned delay line described in [4]. The Xilinx CARRY4 element has two types of output port, COi and Oi (i = 0, 1, 2, 3); the former is a simple pass-through of the carry signal, while the latter is the sum output formed by an XOR element. Whether or not the signal passes through the XOR introduces a subtle difference in propagation delay, which helps to recover effective bins. The authors of [4] studied the optimal combination of carry logic outputs and found that heterogeneous sampling of COi and Oi gives a finer bin width than homogeneous sampling, which uses only COi outputs or only Oi outputs. The present work therefore adopts heterogeneous sampling as shown in Fig. 1, where the outputs O0, CO1, O2, and CO3 are captured by the D flip-flops that follow.
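
The way a coarse count and a fine time combine into a single timestamp can be illustrated with a minimal sketch. The function names and the sign convention are assumptions made for illustration and are not taken from the firmware: the fine time is assumed to be the interval from the hit to the next sampling clock edge, so it is subtracted from the coarse time. Only the 1.6 ns period follows from the 625 MHz clock stated above.

```python
CLOCK_PERIOD_PS = 1600.0  # 625 MHz sampling clock -> 1.6 ns period


def hit_time_ps(coarse_count, fine_time_ps, period_ps=CLOCK_PERIOD_PS):
    """Combine a coarse clock count and a fine time into one timestamp.

    Assumption (not from the source): fine_time_ps is the interval between
    the hit and the following clock edge, so it is subtracted.
    """
    if not 0.0 <= fine_time_ps < period_ps:
        raise ValueError("fine time must lie within one clock period")
    return coarse_count * period_ps - fine_time_ps


def time_difference_ps(hit_a, hit_b):
    """Time of hit_b relative to hit_a; each hit is (coarse_count, fine_time_ps)."""
    return hit_time_ps(*hit_b) - hit_time_ps(*hit_a)
```

With this convention, two hits latched in the same clock cycle differ only by their fine times, and a hit that crosses a clock boundary is handled by the coarse count.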

![Figure 1: Tuned delay line. CARRY4 element is surrounded by a white line.](image)

![Figure 2: Typical look-up table for bin ID to fine time conversion (a) and each bin width (b).](image)

The output of the tuned delay line is followed by a transition edge detector. Both the leading edge and the trailing edge of the input signal are detected using a 6-input, 2-output look-up table element (LUT6_2), which is equivalent to two 5-input, 1-output look-up tables that share the same inputs but have different truth tables; the two outputs correspond to the leading edge and the trailing edge, respectively. The leading and trailing edges are then processed in parallel: each is encoded in binary, converted to a fine time count, buffered in a FIFO, and merged into a level-1 buffer to wait for a trigger signal. The delay line calibration logic is implemented with a block RAM and a dedicated clock signal generated by cascading the clock synthesizers in the FPGA. The calibration sequence is integrated into the FPGA and performed automatically.
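
The idea behind the edge detector can be shown with a small software model. This is a sketch only: the two-tap patterns, the tap ordering (index 0 nearest the input), and the function name are hypothetical, whereas the hardware evaluates up to five taps per LUT6_2 with truth tables that are not given here.

```python
def find_edges(taps):
    """Locate edges in one snapshot of the delay-line taps.

    taps: sequence of 0/1 values latched by the flip-flops, with taps[0]
    assumed to be the tap nearest the input.
    Returns (leading_bins, trailing_bins) as lists of bin IDs.
    """
    leading, trailing = [], []
    for i in range(len(taps) - 1):
        pair = (taps[i], taps[i + 1])
        if pair == (1, 0):
            # A rising input has propagated as far as tap i.
            leading.append(i)
        elif pair == (0, 1):
            # A falling input has propagated as far as tap i.
            trailing.append(i)
    return leading, trailing
```

Two separate pattern tests applied to the same inputs mirror the two outputs of a single LUT6_2, which is why both edge types can be found in the same clock cycle.
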
The performance was checked using the RPV-260 [5] (Fig. 3), a universal logic board in the KEK-VME 6U form factor. Figure 2(a) shows typical look-up table contents for the conversion from bin ID to fine time count, and the width of each bin is plotted in Fig. 2(b). A bump is observed in the middle of the delay line (bin ID ∼80). It can be attributed to the internal structure of the FPGA: the bump position corresponds to the horizontal clock row, which is located in the middle of each clock region. About 145 effective bins interpolate the clock interval of 1.6 ns, so the bin width is ∼11 ps on average. A preliminary time resolution was measured with a pulse generator implemented inside the FPGA. The generated pulse was split into two channels, and its arrival time was measured by each channel. The distribution of the time difference is shown in Fig. 4. The red curve is a Gaussian fit to the distribution, which gives σ ∼14 ps. Dividing by √2, on the assumption that the two channels contribute equally and independently, gives a single-channel resolution of ∼10 ps.
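
The numbers in this paragraph can be reproduced with a short sketch. The calibration method used in the firmware is not specified in the text; the code below assumes a code-density approach, a common choice in which hits uncorrelated with the clock populate each bin in proportion to its width. The function names and the toy histogram are hypothetical.

```python
import math

CLOCK_PERIOD_PS = 1600.0  # clock interval of 1.6 ns


def build_fine_time_lut(hit_counts, period_ps=CLOCK_PERIOD_PS):
    """Code-density calibration (assumed method, not confirmed by the source).

    hit_counts[i] is the number of calibration hits seen in bin i.
    Returns (widths, lut), where lut[i] is the centre of bin i in ps.
    """
    total = sum(hit_counts)
    if total == 0:
        raise ValueError("no calibration hits")
    widths = [period_ps * n / total for n in hit_counts]
    lut, lower_edge = [], 0.0
    for width in widths:
        lut.append(lower_edge + width / 2.0)  # bin centre
        lower_edge += width
    return widths, lut


def single_channel_sigma(sigma_diff_ps):
    """Resolution of one channel from the width of a two-channel difference,
    assuming equal and independent contributions."""
    return sigma_diff_ps / math.sqrt(2.0)


# Toy histogram with two zero-width bins (bins 0 and 3).
widths, lut = build_fine_time_lut([0, 30, 10, 0, 40])
print(widths)  # [0.0, 600.0, 200.0, 0.0, 800.0]
print(round(CLOCK_PERIOD_PS / 145, 1))       # 11.0 (mean width, ~145 effective bins)
print(round(single_channel_sigma(14.0), 1))  # 9.9
```

Zero-width bins never collect hits in such a calibration, so they do not contribute to the effective bin count that enters the mean width.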

Further studies, such as a performance test with an external signal and an investigation of how to increase the number of channels (to 64 + 1, for example), are ongoing.

References

[5] https://www.repic.co.jp/product/module/vme/rpv-260.html