PCIe Prototype firmware
This is a proposal for a quick prototype firmware that uses Xillybus as the PCIe endpoint.
Basic structure
The firmware will be based on the USB3 branch of the Rhythm firmware, as it supports up to 512 channels, which will be useful for testing. Main changes include:
- DAC, ADC, LED and TTL controllers will be stripped out. (Control signals may stay)
- All WireIns will be replaced by 16-bit registers, which will be written through a memory-mapped Xillybus file
- Although most of the WireIns only use the lower 16 bits, some of them use the higher ones. Those are mainly DAC-related, and thus nonexistent here. The few that aren't are being remapped
- Instead of being written by registers and triggers, all three auxcmd memory banks will be written through memory-mapped Xillybus files of size 16x1024 16-bit words
- Each file has a 14-bit address
- Address bits [13:10] will select the bank and bits [9:0] the address within the bank
- The main state machine will still write into a 16-to-32 bit FIFO, but the output of that FIFO will be fed into a Xillybus FIFO output instead of the DDR controller. The existing w16_4096-r32_2048 one is a good choice for size.
- DDR and, of course, USB pipe out will be stripped out
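The bank/address split of the auxcmd files can be sketched as follows. This is only an illustration: the function names are hypothetical, and the byte offset assumes the host-side byte addressing of 16-bit files described later in this document (two bytes per word).

```python
BANK_BITS = 4        # address bits [13:10] select the bank
INDEX_BITS = 10      # address bits [9:0] select the word within the bank
BYTES_PER_WORD = 2   # 16-bit words


def auxcmd_word_address(bank: int, index: int) -> int:
    """Return the 14-bit word address for a bank and an index within it."""
    if not 0 <= bank < (1 << BANK_BITS):
        raise ValueError("bank must fit in 4 bits")
    if not 0 <= index < (1 << INDEX_BITS):
        raise ValueError("index must fit in 10 bits")
    return (bank << INDEX_BITS) | index


def auxcmd_byte_offset(bank: int, index: int) -> int:
    """Return the host-side byte offset to seek to (hypothetical helper)."""
    return auxcmd_word_address(bank, index) * BYTES_PER_WORD


print(hex(auxcmd_word_address(3, 5)), auxcmd_byte_offset(3, 5))
```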
Xillybus files required
1x32bit FIFO (data acquisition file), output direction. Bandwidth of 36 MB/s (512 channels at 30 kHz, with some leeway), will use DMA. Name: xillybus_neural_data_32.
3x16bit write-only memory mapped files with 14bit addresses, for the auxcmd memory banks. Name: xillybus_auxcmdN_membank_16 (with N being 1-3, following Rhythm name convention on the auxcmd banks)
1x16bit read-write memory mapped file with 6bit addresses, for control registers. Name: xillybus_control_regs_16
1x16bit read-only memory-mapped file with 4bit addresses, for status registers. Name: xillybus_status_regs_16
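The bandwidth figure requested for the data acquisition file can be checked with a short calculation. The sketch assumes 16-bit samples (the width the main state machine writes into its FIFO) and counts only the channel payload, ignoring any per-frame overhead, so it is a lower bound rather than an exact figure.

```python
CHANNELS = 512
SAMPLE_RATE_HZ = 30_000
BYTES_PER_SAMPLE = 2                  # assumption: one 16-bit word per sample
REQUESTED_BANDWIDTH = 36_000_000      # 36 MB/s requested for the file


def payload_bytes_per_second(channels: int, rate_hz: int, sample_bytes: int) -> int:
    """Raw sample payload rate, without any frame overhead."""
    return channels * rate_hz * sample_bytes


payload = payload_bytes_per_second(CHANNELS, SAMPLE_RATE_HZ, BYTES_PER_SAMPLE)
headroom = REQUESTED_BANDWIDTH / payload
print(f"payload: {payload / 1e6:.2f} MB/s, headroom factor: {headroom:.3f}")
```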
Register addresses
To ease the transition, most register addresses are the same as their WireIn counterparts, just doubled (since they're 16bit registers, but unlike the WireIns, the memory mapped files use normal byte addressing).
The write-only control register map ends as follows:
- 0x00: ResetRun (DAC and TTL settings can be ignored)
- 0x02: MaxTimeStepLsb
- 0x04: MaxTimeStepMsb
- 0x06: DataFreqPll (Instead of using the trigger, we can use the write logic of this address to trigger the PLL update. See clock note)
- 0x08: MisoDelay
- 0x0A-0x0E: Unused
- 0x10-0x14: AuxCmdBank1-3
- 0x16-0x1A: AuxCmdLength1-3
- 0x1C-0x20: AuxCmdLoop1-3
- 0x22: Unused
- 0x24: DataStreamSel1234
- 0x26: DataStreamSel5678
(The following three registers have been moved to addresses that previously belonged to DAC registers)
- 0x28: DataStreamSel9ABC
- 0x2A: DataStreamSelDEF10
- 0x2C: DataStreamEn
- 0x2E: AuxOutput: A new register. Bits 0-3 control the binary output of the three generic SMAs
- 0x30-0x3C: Unused
- 0x3E: Start Trigger (writing to bit 0 of this address does not store anything in a register; it only triggers the start of acquisition)
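As an illustration of how software might use this map, the sketch below collects the addresses listed above into a dictionary and writes one 16-bit value by seeking to the register's byte address. The helper name `write_register`, the use of `io.BytesIO` as a stand-in for the device file, and the little-endian host are assumptions, not part of the proposal.

```python
import io
import struct

# Byte addresses of the control registers, as listed above.
CONTROL_REGS = {
    "ResetRun": 0x00, "MaxTimeStepLsb": 0x02, "MaxTimeStepMsb": 0x04,
    "DataFreqPll": 0x06, "MisoDelay": 0x08,
    "AuxCmdBank1": 0x10, "AuxCmdBank2": 0x12, "AuxCmdBank3": 0x14,
    "AuxCmdLength1": 0x16, "AuxCmdLength2": 0x18, "AuxCmdLength3": 0x1A,
    "AuxCmdLoop1": 0x1C, "AuxCmdLoop2": 0x1E, "AuxCmdLoop3": 0x20,
    "DataStreamSel1234": 0x24, "DataStreamSel5678": 0x26,
    "DataStreamSel9ABC": 0x28, "DataStreamSelDEF10": 0x2A,
    "DataStreamEn": 0x2C, "AuxOutput": 0x2E, "StartTrigger": 0x3E,
}


def write_register(dev, name: str, value: int) -> None:
    """Write one full 16-bit word at the register's byte address."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value must fit in 16 bits")
    dev.seek(CONTROL_REGS[name])
    dev.write(struct.pack("<H", value))  # assumption: little-endian host


# Stand-in for the memory-mapped file, so the sketch runs without hardware.
fake_dev = io.BytesIO(bytes(64))
write_register(fake_dev, "MisoDelay", 5)
print(fake_dev.getvalue()[0x08:0x0A].hex())
```

On real hardware the same helper would presumably be handed the memory-mapped device file opened unbuffered for writing; the exact path under which the operating system exposes xillybus_control_regs_16 is not specified here.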
And the read-only registers:
- 0x00: NumWordsLsb
- 0x02: NumWordsMsb
- 0x04: SpiRunning
- 0x06: Unused
- 0x08: DataClkLocked
- 0x0A: BoardId
- 0x0C: BoardVersion
- 0x0E: Unused
Of these, NumWords would refer to the middle FIFO, but its contents are largely irrelevant and would only be used for debugging. BoardId and BoardVersion are largely irrelevant too: they were originally used to check that the correct firmware is loaded, but here the presence of the correct Xillybus interfaces should be indication enough. Since we need at least the SpiRunning register, it doesn't hurt to keep all the others, though.
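A minimal sketch of reading these registers, again using `io.BytesIO` as a stand-in for the status file. The helper names are hypothetical, the host is assumed to be little-endian, and the sketch assumes that NumWordsMsb holds the upper 16 bits of a 32-bit count; the register contents are made-up demo values.

```python
import io
import struct

# Byte addresses of the status registers, as listed above.
STATUS_REGS = {
    "NumWordsLsb": 0x00, "NumWordsMsb": 0x02, "SpiRunning": 0x04,
    "DataClkLocked": 0x08, "BoardId": 0x0A, "BoardVersion": 0x0C,
}


def read_register(dev, name: str) -> int:
    """Read one full 16-bit word from the register's byte address."""
    dev.seek(STATUS_REGS[name])
    raw = dev.read(2)
    if len(raw) != 2:
        raise IOError("short read from status file")
    return struct.unpack("<H", raw)[0]  # assumption: little-endian host


def read_num_words(dev) -> int:
    """Combine the two halves (assumes Msb is the upper 16 bits)."""
    lsb = read_register(dev, "NumWordsLsb")
    msb = read_register(dev, "NumWordsMsb")
    return (msb << 16) | lsb


# Eight 16-bit words standing in for the status file (demo values only).
fake_status = io.BytesIO(struct.pack("<8H", 0x5678, 0x1234, 1, 0, 1, 700, 1, 0))
print(hex(read_num_words(fake_status)), read_register(fake_status, "SpiRunning"))
```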
Notes on endianness and word sizes > 8 bits
Xillybus performs endianness translation when the data size is greater than 8 bits. For example, a 16-bit word written by the host with the value 0x0100 arrives in the FPGA with only bit 8 set, regardless of the order in which its two bytes are stored in the computer's memory. Note that full words must be written at once, so for 16-bit words the write operations must be multiples of two bytes.
File operations on the host have a granularity of one byte. For a memory-addressed file whose word size is larger than one byte, the address range seen by the host is therefore the number of word addresses inside the FPGA multiplied by the number of bytes per word, with the lower address bits being discarded. For example, a Xillybus memory file created with a 16-bit data size and a 5-bit address size in the IP configuration behaves as follows:
- Inside the FPGA, the IP will report 5 address bits for that file, that is, 32 words
- The data bus inside the FPGA will be 16 bits wide, so all operations are done with that data size
- The host will be able to address 64 bytes (6 bits) of data: 32 words x 2 bytes per word
- Every write or read operation shall be done in multiples of 2 bytes (16-bit operations)
- The Xillybus driver will discard the last address bit when sending the address to the FPGA. That way, accessing addresses 0 and 1 will yield the same result, as will 2 and 3, 4 and 5, and so on up to 62 and 63, reflecting the actual space of 32 16-bit words.
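The example above can be reproduced with a few lines of arithmetic. The function names are hypothetical; the sketch only models the address mapping and does not touch any Xillybus file.

```python
def host_size_bytes(addr_bits: int, data_bits: int) -> int:
    """Bytes addressable from the host: number of words times bytes per word."""
    return (1 << addr_bits) * (data_bits // 8)


def fpga_word_address(host_offset: int, data_bits: int) -> int:
    """Word address seen inside the FPGA once the low address bits are dropped."""
    return host_offset // (data_bits // 8)


# 16-bit data, 5-bit addresses: 32 words, 64 host bytes.
print(host_size_bytes(5, 16))
print([fpga_word_address(offset, 16) for offset in (0, 1, 2, 3, 62, 63)])
```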
Clock configuration
Since the Kintex-7 uses different clock generation primitives, a new table for dynamic generation of the data clock is needed. The Kintex-7 PLL features a double divider, with an output of Fout = Fin x M / (O x D). Since the KC705 board features a fixed 200 MHz clock, the conversion table is as shown below.
The DataFreqPll signal must be modified. It ends as follows:
- DataFreqPll[7:0]: Value of the D parameter
- DataFreqPll[14:8]: Value of the M parameter
- DataFreqPll[15]: '1' means the O parameter is 8, '0' means it's 4
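The bit layout above can be sketched as a small packing function. The name `pack_data_freq_pll` is hypothetical; the example values are the multiplier 42 with an 8-bit divider of 25 and a 4/8 divider of 4, and the multiplier 21 with dividers 125 and 8.

```python
def pack_data_freq_pll(m: int, d: int, o: int) -> int:
    """Pack M, D and O into the 16-bit DataFreqPll register value."""
    if not 0 <= m < (1 << 7):
        raise ValueError("M must fit in bits [14:8] (7 bits)")
    if not 0 <= d < (1 << 8):
        raise ValueError("D must fit in bits [7:0] (8 bits)")
    if o not in (4, 8):
        raise ValueError("O can only be 4 or 8")
    o_bit = 1 if o == 8 else 0  # bit 15: '1' selects O = 8, '0' selects O = 4
    return (o_bit << 15) | (m << 8) | d


print(hex(pack_data_freq_pll(42, 25, 4)))
print(hex(pack_data_freq_pll(21, 125, 8)))
```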
M  | D   | O | dataclk    | Sampling rate | Sampling period |
7  | 125 | 4 | 2.80 MHz   | 1.00 kS/s     | 1000.0 usec     |
7  | 100 | 4 | 3.50 MHz   | 1.25 kS/s     | 800.0 usec      |
21 | 125 | 8 | 4.20 MHz   | 1.50 kS/s     | 666.7 usec      |
14 | 125 | 4 | 5.60 MHz   | 2.00 kS/s     | 500.0 usec      |
35 | 125 | 8 | 7.00 MHz   | 2.50 kS/s     | 400.0 usec      |
21 | 125 | 4 | 8.40 MHz   | 3.00 kS/s     | 333.3 usec      |
14 | 75  | 4 | 9.33 MHz   | 3.33 kS/s     | 300.0 usec      |
28 | 125 | 4 | 11.20 MHz  | 4.00 kS/s     | 250.0 usec      |
7  | 25  | 4 | 14.00 MHz  | 5.00 kS/s     | 200.0 usec      |
7  | 20  | 4 | 17.50 MHz  | 6.25 kS/s     | 160.0 usec      |
56 | 125 | 4 | 22.40 MHz  | 8.00 kS/s     | 125.0 usec      |
14 | 25  | 4 | 28.00 MHz  | 10.00 kS/s    | 100.0 usec      |
7  | 10  | 4 | 35.00 MHz  | 12.50 kS/s    | 80.0 usec       |
21 | 25  | 4 | 42.00 MHz  | 15.00 kS/s    | 66.7 usec       |
28 | 25  | 4 | 56.00 MHz  | 20.00 kS/s    | 50.0 usec       |
35 | 25  | 4 | 70.00 MHz  | 25.00 kS/s    | 40.0 usec       |
42 | 25  | 4 | 84.00 MHz  | 30.00 kS/s    | 33.3 usec       |
28 | 15  | 4 | 93.33 MHz  | 33.33 kS/s    | 30.0 usec       |
56 | 25  | 4 | 112.00 MHz | 40.00 kS/s    | 25.0 usec       |
14 | 5   | 4 | 140.00 MHz | 50.00 kS/s    | 20.0 usec       |
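The table rows can be cross-checked against the formula Fout = Fin x M / (O x D) with Fin = 200 MHz. In every row the sampling rate equals dataclk divided by 2800; the sketch below takes that ratio from the table itself rather than from any statement in this proposal. Since the formula is symmetric in the two dividers, they are passed simply as the two divider columns of a row.

```python
FIN_MHZ = 200.0            # fixed KC705 clock
CLOCKS_PER_SAMPLE = 2800   # ratio dataclk / sampling rate observed in the table


def dataclk_mhz(m: int, div_a: int, div_b: int) -> float:
    """Fout = Fin x M / (divider x divider), in MHz."""
    return FIN_MHZ * m / (div_a * div_b)


def sampling_rate_ksps(m: int, div_a: int, div_b: int) -> float:
    """Per-channel sampling rate in kS/s implied by the table's ratio."""
    return dataclk_mhz(m, div_a, div_b) * 1000.0 / CLOCKS_PER_SAMPLE


# A few rows of the table: multiplier and the two divider columns.
for row in [(7, 125, 4), (21, 125, 8), (42, 25, 4), (14, 5, 4)]:
    print(row, f"{dataclk_mhz(*row):.2f} MHz", f"{sampling_rate_ksps(*row):.2f} kS/s")
```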
Status LEDs
The firmware uses some of the KC705 evaluation board LEDs as status indicators:
- LEDs 0-3 are Xillybus indicators. LED 0 should blink to indicate the system's correct operation while LEDs 1-3 are indicators of PCIe transfer
- LED4 is currently unused
- LED5 indicates an overflow of the transmission FIFO. This happens if the system generates data faster than the software can consume it. To avoid block corruption, the FIFO will not accept more data until acquisition has been reset.
- LED6 indicates that the acquisition state machine is working and acquiring data from the headstages.
- LED7 indicates the reset status of the acquisition machine. It should be ON unless the device has been opened by the GUI.