PCI BARs and other means of accessing the GPU
Nvidia GPU BARs, IO ports, and memory areas
The nvidia GPUs expose the following areas to the outside world through PCI:
- PCI configuration space / PCIE extended configuration space
- MMIO registers: BAR0 - memory, 0x1000000 bytes or more depending on card type
- VRAM aperture: BAR1 - memory, 0x1000000 bytes or more depending on card type [NV3+ only]
- indirect memory access IO ports: BAR2 - 0x100 bytes of IO port space [NV3 only]
- ???: BAR2 [only NV1x IGPs?]
- ???: BAR2 [only NV20?]
- RAMIN aperture: BAR2 or BAR3 - memory, 0x1000000 bytes or more depending on card type [NV40+]
- indirect memory access IO ports: BAR5 - 0x80 bytes of IO port space [G80+]
- PCI ROM aperture
- PCI INTA interrupt line
- legacy VGA IO ports: 0x3b0-0x3bb and 0x3c0-0x3df [can be disabled in PCI config]
- legacy VGA memory: 0xa0000-0xbffff [can be disabled in PCI config]
PCI/PCIE configuration space
Nvidia GPUs, like all PCI devices, have PCI configuration space. Its contents are described in PCI configuration space.
BAR0: MMIO registers
This is the main control space of the card - all engines are controlled through it, and it contains alternate means to access most of the other spaces. This, along with the VRAM / RAMIN apertures, is everything that’s needed to fully control the cards.
This space is a 16MB area of memory sparsely populated with areas representing individual engines, which in turn are sparsely populated with registers. The list of engines depends on card type. While there are no known registers outside the 16MB range, the BAR itself can have a larger size on NV40+ cards if configured so by straps.
Its address is set up through PCI BAR 0. The BAR uses 32-bit addressing and is non-prefetchable memory.
The registers inside this BAR are 32-bit, with the exception of areas that are aliases of the byte-oriented VGA legacy IO ports. They should be accessed through aligned 32-bit memory reads/writes. On pre-NV1A cards, the registers are always little-endian; on NV1A+ cards, the endianness of the whole area can be selected by a switch in PMC. The endianness switch, however, only affects BAR0 accesses to the MMIO space - accesses from inside the card are always little-endian.
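As a rough illustration of these access rules, the following Python sketch reads a 32-bit register from a bytes-like view of BAR0 (for example a memory mapping of the BAR, which is not set up here). The helper name `read32` and the `big_endian` flag are hypothetical: the flag merely stands in for the state of the PMC endianness switch and is not read from hardware.

```python
import struct

def read32(bar0, offset, big_endian=False):
    """Return the 32-bit register at `offset` in a bytes-like view of BAR0."""
    if offset % 4 != 0:
        raise ValueError("MMIO registers must be accessed at 4-byte aligned offsets")
    # Assumption: the caller knows how the endianness switch is set.
    fmt = ">I" if big_endian else "<I"
    return struct.unpack_from(fmt, bar0, offset)[0]

# Stand-in for a mapped BAR0: two made-up little-endian registers.
fake_bar0 = bytearray(struct.pack("<II", 0x12345678, 0xCAFEF00D))
print(hex(read32(fake_bar0, 0)))  # 0x12345678
print(hex(read32(fake_bar0, 4)))  # 0xcafef00d
```

The alignment check mirrors the requirement that registers be accessed through aligned 32-bit reads and writes; a real driver would typically rely on the platform's MMIO accessors rather than on a byte buffer.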
A particularly important subarea of MMIO space is PMC, the card’s master control. This subarea is present on all nvidia GPUs at addresses 0x000000 through 0x000fff. It contains GPU id information, Big Red Switches for engines that can be turned off, and master interrupt control. It’s described in more detail in PMC: Master control unit.
For the full list of MMIO areas, see MMIO register ranges.
BAR1: VRAM aperture
This is an area of prefetchable memory that maps to the card’s VRAM. On native PCIE cards, it uses 64-bit addressing; on native PCI/AGP ones, it uses 32-bit addressing.
On non-TURBOCACHE pre-G80 cards and on G80+ cards with BAR1 VM disabled, BAR addresses map directly to VRAM addresses. On TURBOCACHE cards, BAR1 is made of controllable VRAM and GART windows [see NV44 host memory interface]. G80+ cards have a mode where all BAR references go through the card’s VM subsystem, see G80:GF100 host memory interface and GF100- host memory interface.
On NV3 cards, this BAR also contains the RAMIN access aperture at address 0xc00000 [see NV3 VRAM structure and usage].
Todo
map out the BAR fully
The BAR size depends on card type:
- NV3: 16MB [with RAMIN]
- NV4: 16MB
- NV5: 32MB
- NV10:NV17: 128MB
- NV17:G80: 64MB-512MB, set via straps
- G80-: 64MB-64GB, set via straps
Note that BAR size is independent of actual VRAM size, although on pre-NV30 cards the BAR is guaranteed not to be smaller than VRAM. This means it may be impossible to map all of the card’s memory through the BAR on NV30+ cards.
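The consequence of a BAR that is smaller than VRAM can be sketched in a few lines of Python. This only models the direct-mapped case (BAR addresses equal to VRAM addresses, no VM), and the function name and the sizes used are made up for illustration.

```python
MB = 1 << 20

def bar1_coverage(vram_size, bar1_size):
    """Return (bytes reachable through BAR1, bytes left unreachable).

    Assumes direct mapping: BAR1 offset N corresponds to VRAM address N.
    """
    reachable = min(vram_size, bar1_size)
    return reachable, vram_size - reachable

# Hypothetical card: 512MB of VRAM behind a 256MB BAR1.
reachable, hidden = bar1_coverage(vram_size=512 * MB, bar1_size=256 * MB)
print(reachable // MB, hidden // MB)  # 256 256
```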
BAR2/BAR3: RAMIN aperture
RAMIN is, on pre-G80 cards, a special area at the end of VRAM that contains various control structures. RAMIN starts from the end of VRAM and the addresses go in the reverse direction, thus it needs a special mapping to access it the way it’ll be used. While pre-NV40 cards limited its size to 1MB and could fit the mapping in BAR0, or BAR1 for NV3, NV40+ cards allow much bigger RAMIN addresses. The RAMIN BAR provides such a RAMIN mapping on NV40 family cards.
G80 did away with the special RAMIN area, but it kept the BAR around. It works like BAR1, but is independent of it and can use a distinct VM DMA object. As opposed to BAR1, all accesses done to BAR3 will be automatically byte-swapped in 32-bit chunks like BAR0 if the big-endian switch is on. It’s commonly used to map control structures for kernel use, while BAR1 is used to map user-accessible memory.
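To make "byte-swapped in 32-bit chunks" concrete, here is a small Python sketch of that transformation applied to a buffer. It is a software model only; the function name is hypothetical and nothing here talks to hardware.

```python
import struct

def swap32_chunks(data):
    """Byte-swap `data` independently within each 32-bit chunk."""
    if len(data) % 4 != 0:
        raise ValueError("length must be a multiple of 4 bytes")
    n = len(data) // 4
    # Read n little-endian words, write them back big-endian.
    return struct.pack(f">{n}I", *struct.unpack(f"<{n}I", data))

raw = bytes(range(8))            # 00 01 02 03 04 05 06 07
print(swap32_chunks(raw).hex())  # 0302010007060504
```

Note that in this model a 64-bit quantity is not fully reversed (that would give 0706050403020100); each 32-bit half is swapped on its own, which matters when laying out structures wider than 32 bits.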
The BAR uses 64-bit addressing on native PCIE cards, 32-bit addressing on native PCI/AGP. Since a 64-bit BAR1 occupies two BAR slots, the RAMIN BAR uses the BAR3 slot on native PCIE cards and the BAR2 slot on native PCI/AGP. It is non-prefetchable memory on cards up to and including G200, prefetchable memory on MCP77+. The size is at least 16MB and is set via straps.
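The addressing width and prefetchability of a BAR are encoded in the low bits of the BAR register in PCI configuration space, following the standard PCI layout. The Python sketch below decodes those bits; the function name and the sample values are illustrative and not taken from any particular card.

```python
def decode_bar(low_dword):
    """Decode the low 32 bits of a PCI Base Address Register."""
    if low_dword & 0x1:
        # IO space BAR: bits 1:0 are flags, the rest is the port base.
        return {"kind": "io", "base": low_dword & 0xFFFFFFFC}
    bar_type = (low_dword >> 1) & 0x3
    return {
        "kind": "memory",
        # 0 = 32-bit, 2 = 64-bit; other encodings are reported as None.
        "addr_bits": {0: 32, 2: 64}.get(bar_type),
        "prefetchable": bool(low_dword & 0x8),
        # For a 64-bit BAR this is only the low half of the base address;
        # the high half lives in the next BAR slot.
        "base": low_dword & 0xFFFFFFF0,
    }

print(decode_bar(0xFD00000C))  # made-up 64-bit prefetchable memory BAR
print(decode_bar(0x0000E001))  # made-up IO port BAR
```

Because a 64-bit BAR consumes two consecutive slots, the slot number of every later BAR shifts by one on cards that use 64-bit addressing.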
BAR2: NV3 indirect memory access
An area of IO ports used to access BAR0 or BAR1 indirectly by real mode code that cannot map high memory addresses. Present only on NV3.
Todo
RE it. or not.
BAR5: G80 indirect memory access
An area of IO ports used to access BAR0, BAR1, and BAR3 indirectly by real mode code that cannot map high memory addresses. Present on G80+ cards. On earlier cards, the indirect access feature of VGA IO ports can be used instead. This BAR can also be disabled via straps.
Todo
It’s present on some NV4x
This area is 0x80 bytes of IO ports, but only the first 0x20 bytes are actually used; the rest are empty. The ports are all treated as 32-bit ports. They are:
- BAR5+0x00: when read, signature: 0x2469fdb9. When written, master enable: write 1 to enable remaining ports, 0 to disable. Only bit 0 of the written value is taken into account. When remaining ports are disabled, they read as 0xffffffff.
- BAR5+0x04: enable. If bit 0 is 1, the “data” ports are active, otherwise they’re inactive and merely store the last written value.
- BAR5+0x08: BAR0 address port. Bits 0-1 and 24-31 are ignored.
- BAR5+0x0c: BAR0 data port. Reads and writes are translated to BAR0 reads and writes at the address specified by the BAR0 address port.
- BAR5+0x10: BAR1 address port. Bits 0-1 are ignored.
- BAR5+0x14: BAR1 data port. Reads and writes are translated to BAR1 reads and writes at the address specified by the BAR1 address port.
- BAR5+0x18: BAR3 address port. Bits 0-1 and 24-31 are ignored.
- BAR5+0x1c: BAR3 data port. Reads and writes are translated to BAR3 reads and writes at the address specified by the BAR3 address port.
BAR0 addresses are masked to the low 24 bits, allowing access to exactly 16MB of MMIO space. The BAR1 addresses aren’t masked, and the window actually allows access to more BAR space than BAR1 itself - up to 4GB of VRAM or VM space can be accessed this way. BAR3 addresses, on the other hand, are masked to the low 24 bits even though the real BAR3 is larger.
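The port behaviour described above can be summarised as a toy software model. The Python sketch below is not driver code: the class and method names are hypothetical, the backing stores are plain dicts, and several details that the text does not specify are filled in as labeled assumptions (writes to other ports are dropped while the master enable is off, the enable and address ports read back their stored value, and unpopulated addresses read as 0).

```python
SIGNATURE = 0x2469FDB9
# Address port -> mask applied to written addresses (bits 0-1 always dropped).
ADDR_MASK = {0x08: 0x00FFFFFC, 0x10: 0xFFFFFFFC, 0x18: 0x00FFFFFC}

class Bar5Model:
    """Toy model of the BAR5 port block; offsets are relative to BAR5."""

    def __init__(self, bar0, bar1, bar3):
        # Data port -> backing store, a dict of {address: 32-bit value}.
        self.space = {0x0C: bar0, 0x14: bar1, 0x1C: bar3}
        self.master = False       # bit 0 of the last write to BAR5+0x00
        self.data_enable = False  # bit 0 of BAR5+0x04
        self.addr = {port: 0 for port in ADDR_MASK}
        self.latch = {port: 0 for port in self.space}

    def outl(self, port, value):
        value &= 0xFFFFFFFF
        if port == 0x00:
            self.master = bool(value & 1)
        elif not self.master:
            pass  # assumption: writes are dropped while the block is disabled
        elif port == 0x04:
            self.data_enable = bool(value & 1)
        elif port in ADDR_MASK:
            self.addr[port] = value & ADDR_MASK[port]
        elif port in self.space:
            if self.data_enable:
                self.space[port][self.addr[port - 4]] = value
            else:
                self.latch[port] = value  # inactive: only remember the value
        else:
            raise ValueError("port not modelled")

    def inl(self, port):
        if port == 0x00:
            return SIGNATURE
        if not self.master:
            return 0xFFFFFFFF
        if port == 0x04:
            return int(self.data_enable)  # assumption: reads back bit 0
        if port in ADDR_MASK:
            return self.addr[port]        # assumption: reads back masked value
        if port in self.space:
            if self.data_enable:
                # assumption: unpopulated addresses read as 0 in this model
                return self.space[port].get(self.addr[port - 4], 0)
            return self.latch[port]
        raise ValueError("port not modelled")

bar0 = {0x000000: 0x0A1B2C3D}  # made-up register contents
io = Bar5Model(bar0, {}, {})
io.outl(0x00, 1)               # master enable
io.outl(0x04, 1)               # activate the data ports
io.outl(0x08, 0xFF000003)      # masked down to address 0x000000
print(hex(io.inl(0x0C)))       # 0xa1b2c3d
```

The usage lines show the effect of the BAR0 address mask: an address with bits 0-1 and 24-31 set still lands on offset 0 of the MMIO space.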
BAR6: PCI ROM aperture
Todo
figure out size
Todo
figure out NV3
Todo
verify G80
The nvidia GPUs expose their BIOS as a standard PCI ROM. The exposed ROM aliases either the actual BIOS EEPROM, or the shadow BIOS in VRAM. This setting is exposed in PCI config space. If the “shadow enabled” PCI config register is 0, the PROM MMIO area is enabled, and both PROM and the PCI ROM aperture will access the EEPROM. Disabling the shadowing has the side effect of disabling video output on pre-G80 cards. If shadow is enabled, the EEPROM is disabled, PROM reads will return garbage, and the PCI ROM aperture will access the VRAM shadow copy of the BIOS. On pre-G80 cards, the shadow BIOS is located at address 0 of RAMIN; on G80+ cards, the shadow BIOS is pointed to by the PDISPLAY.VGA.ROM_WINDOW register - see G80 VGA emulation for details.
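The routing rules in this paragraph amount to a small decision table, sketched here in Python. The function and key names are hypothetical, and the function only restates the text; it does not read or write any config register.

```python
def rom_routing(shadow_enabled, pre_g80):
    """Summarise where ROM accesses land for a given shadow setting."""
    if shadow_enabled:
        return {
            "pci_rom_aperture": "VRAM shadow copy",
            "prom_mmio": "garbage (EEPROM disabled)",
            "video_output_side_effect": False,
        }
    return {
        "pci_rom_aperture": "EEPROM",
        "prom_mmio": "EEPROM",
        # Disabling the shadow turns off video output on pre-G80 cards.
        "video_output_side_effect": pre_g80,
    }

print(rom_routing(shadow_enabled=False, pre_g80=True))
```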
INTA: the card interrupt
Todo
MSI
The GPU reports all interrupts through the PCI INTA line. The interrupt enable and status registers are located in the PMC area - see Interrupts.
Legacy VGA IO ports and memory
The nvidia GPU cards are backwards compatible with VGA and expose the usual VGA ranges: IO ports 0x3b0-0x3bb and 0x3c0-0x3df, memory at 0xa0000-0xbffff. The VGA ranges can however be disabled in PCI config space. The VGA registers and memory are still accessible through their aliases in BAR0, and disabling the legacy ranges has no effect on the operation of the card. The IO range contains an extra top-level register that allows indirect access to the MMIO area for use by real mode code, as well as many nvidia-specific extra registers in the VGA subunits. For details, see VGA registers and memory.
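For quick reference, the legacy ranges listed above can be expressed as two range checks. This Python sketch uses hypothetical helper names and treats the bounds as inclusive, as written in the text.

```python
VGA_IO_RANGES = ((0x3B0, 0x3BB), (0x3C0, 0x3DF))  # inclusive bounds
VGA_MEM_RANGE = (0xA0000, 0xBFFFF)                # inclusive bounds

def is_legacy_vga_port(port):
    """True if `port` falls in one of the legacy VGA IO port ranges."""
    return any(lo <= port <= hi for lo, hi in VGA_IO_RANGES)

def is_legacy_vga_mem(addr):
    """True if `addr` falls in the legacy VGA memory window."""
    return VGA_MEM_RANGE[0] <= addr <= VGA_MEM_RANGE[1]

print(is_legacy_vga_port(0x3C0), is_legacy_vga_port(0x3BC))   # True False
print(is_legacy_vga_mem(0xB8000), is_legacy_vga_mem(0xC0000))  # True False
```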