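Starting next week, AMD will hold Tech Day events in several locations around the world, including London, Paris, and Munich, where it will give the first official briefings on the technology behind the 28nm Radeon HD 7000 series graphics cards.
The Radeon HD 6000 series mixed two different architectures, VLIW5 and VLIW4. The Radeon HD 7000 series will carry both forward while adding the genuinely new GCN, which replaces the traditional SIMD arrays with MIMD-capable compute units (CUs) and greatly strengthens parallel compute acceleration. The result is three architectures side by side in a single generation.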
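The Radeon HD 7900 series is codenamed Tahiti and will comprise two models, the Tahiti XT Radeon HD 7970 and the Tahiti Pro Radeon HD 7950, both based on the new GCN architecture.
The Radeon HD 7970 has the full 32 CUs, for a total of 2048 processing cores, 128 texture units, and 64 ROPs, along with nearly 5MB of cache: 512KB of L1 data cache, 384KB of shared L1 cache, 2MB of local data share (LDS), and 2MB of L2 cache. The exact size of the Tahiti die is not yet known, but even on a 28nm process it is unlikely to be small.
Beyond its sheer size, the Radeon HD 7970 will have a default core clock of 1GHz, making it only the second GPU ever to reach the 1GHz mark, after the Radeon HD 4890. The latter did so only as a limited-edition upgraded version, whereas this time 1GHz is the standard clock.
The memory subsystem is equally impressive: the bus widens to 384 bits, matching the GeForce GTX 580, and is paired with 3GB of GDDR5 at an effective 5500MHz, for a record 264GB/s of bandwidth.
The Radeon HD 7950 is cut down to 30 CUs, 1920 cores, 120 texture units, and 60 ROPs, with the core clock lowered to 900MHz and the memory clock to an effective 5000MHz. Its bus width is not yet confirmed; if it remains 384-bit, bandwidth will be 240GB/s.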
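The Radeon HD 7970/7950 will launch during CES 2012 in Las Vegas, specifically on January 6, 2012, in the $349-449 price range.
Above them sits the dual-GPU New Zealand Radeon HD 7990, with 4096 cores and 6GB of GDDR5 memory, and its clocks will not be low either. This new flagship will not arrive until March 2012, at an expected price of $699, the same as the current Radeon HD 6990.
Below them, the Pitcairn Radeon HD 7800 series will not use the new architecture but will stay on VLIW4. Put plainly, it is a 28nm version of the Radeon HD 6900 series (possibly with the same stream processor count), with minor clock adjustments and substantially lower power consumption. It is expected to occupy the $199-249 segment.
Further down, the Cape Verde Radeon HD 7700/7600/7500 are also VLIW4 and are largely new-process versions of existing low-end and mid-range chips, though detailed specifications are not yet known.
Devastator, the integrated graphics core in the Trinity APU, is also based on VLIW4. It is expected to carry names such as Radeon HD 7450D/7550D and to support Dual Graphics (CrossFire) with the Radeon HD 7600/7500 series.
With the 28nm Krishna/Wichita APUs cancelled, AMD has no new product for the low-power platform for now and will simply rebrand the Brazos 2.0 platform, with its graphics cores named the Radeon HD 7300/7200 series. These will be the last remnants of the VLIW5 architecture.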
Starting next week, AMD is going to organize Tech Days in several destinations around the globe, such as London and Paris, during which the company is going to present the 28nm Radeon HD 7000 series.
There are a lot of rumors flying around the web, some of which are spun by AMD itself to sow confusion, as the Radeon HD 7000 series is going to mix the existing VLIW4 and VLIW5 architectures with "Graphics Core Next" (GCN), introduced during June's Fusion Development Summit held in Bellevue, WA.
Radeon HD 7000 Series with the old VLIW4 and VLIW5 Architectures
A couple of years ago, AMD and favorable media were all over NVIDIA for mixing different GPU architectures within the same product line. Then, with the Radeon HD 6000 series, all of a sudden nobody questioned why AMD mixed two distinct GPU architectures within a single series (the new VLIW4 architecture powered only three high-end parts). With the Radeon HD 7000 series, the situation is set to become even more complicated, with AMD mixing no fewer than three distinct GPU architectures within a single generation of products.
Given the recent cancellation of the 28nm Krishna and Wichita APUs, AMD will rebrand the graphics of the Brazos 2.0 APU platform as the Radeon HD 7200 and 7300 series. A rebranded AMD E-Series APU, for instance, will be powered by Radeon HD 7200 or 7300 series graphics (all based on the Evergreen GPU, i.e. VLIW5).
The higher-end Trinity APU, the heir to the successful Llano A-Series APU, will be powered by a Devastator GPU core based on the contemporary "Northern Islands" VLIW4 architecture, with product names such as Radeon HD 7450(D), 7550(D) and so on.
When it comes to discrete parts, those codenamed Cape Verde (HD 7500, 7600, and 7700) and Pitcairn (HD 7800) are all based on the VLIW4 architecture. The "Graphics Core Next" architecture is reserved just for the 7900 series. Desktop parts carry Southern Islands codenames, while mobile parts are codenamed after parts of London (read: Cape Verde becomes Lombok, Pitcairn becomes Thames etc.).
If you compare the VLIW4-based HD 6900 and the upcoming HD 7800 series, there isn't much difference between the two. According to our sources, HD 7800 "Pitcairn" is a 28nm die shrink of the popular HD 6900 "Cayman" GPU with minor performance adjustments. This will bring considerable compute power into the price-sensitive $199-$249 bracket, and we expect a lot of headaches for NVIDIA in that respect.
Welcome Graphics Core Next: Powering the Tahiti and New Zealand (HD 7900)
AMD spoke about Graphics Core Next (GCN) quite openly, a move we can only commend them for. During his keynote session in June, Eric Demers of AMD explained the reasoning behind the move to GCN: compute is graphics, graphics is compute. There is no doubt that the future of GPUs is enhanced compute capability, and we already hear from game developers who are using the computational power of the GPU to create detail inside games instead of relying on gigabytes and gigabytes of textures.
Truly functional Virtual Memory coming to GPU with AMD Radeon HD 7900 Series
The new GCN architecture brings numerous innovations to GPU architecture, of which we see x86 virtual memory as perhaps one of the most important. While GPU manufacturers have promised functional virtual memory for ages, this is the first time we're seeing a working implementation. This is not a marketing gimmick: the IOMMU is a fully functional GPU feature, supporting page faults and over-allocation, and even accepting 64-bit x86 memory pointers for 100% compatibility with 64-bit CPUs. Virtual memory is going to be a large part of next-gen Fusion APUs (2013) and FireStream GPGPU cards (2012), and we can only commend the effort made in making this possible.
All of this required expanding the GPU memory controller by two additional channels for a grand total of 384 bits, identical to the GeForce GTX 580, for example. However, AMD's timings are much more aggressive than those of the conservative NVIDIA, so expect the memory clock to remain higher on AMD GPUs.
A rumor recently exploded that the HD 7900 series will come with Rambus XDR2 memory. Given that AMD has a memory development team and has been the driving force behind the creation of the GDDR3, GDDR4 and GDDR5 memory standards, we were skeptical of the rumor.
Bear in mind that going with Rambus is not an easy decision, as a lot of engineers inside AMD flat out refuse to even consider using Rambus products due to the company's litigious behavior. However, our sources are telling us that AMD is frustrated that the DRAM industry didn't make good on AMD's very large investment in creating two GDDR5 memory standards: Single Ended (S.E. GDDR5) and Differential GDDR5. Thus, the company applied pressure on the memory industry by considering XDR2 memory as a bridge between GDDR5 and the future memory standard. The production Tahiti part will use GDDR5 memory, though.
Is AMD going to continue investing in future memory standards? We would say yes, but with all the changes that have happened, it just might make the executive decision to use technologies already available on the market rather than spend time and money on future iterations of GDDR memory. After all, AMD recently reshuffled its memory design task force. In any case, Differential GDDR5 offers very interesting bandwidth figures, and those figures are something AMD wants to utilize "as soon as possible".
Fusion System Architecture or How Southern Islands Pave the Road for AMD
Ticking boxes off as GPUs become more Compute-like: AMD wants to make this vision complete by 2014
AMD is pushing forward with their Fusion System Architecture (FSA) and the goals of that architecture will take some time to implement - we won't see a full implementation before 2014. However, Southern Islands brings several key features which AMD lacked when compared to NVIDIA Fermi and the upcoming Kepler architectures.
The GPU itself replaces the SIMD array with MIMD-capable Compute Units (CU), which bring support for C++ in the same way NVIDIA did with Fermi, but AMD went beyond Fermi's capabilities with the aforementioned IOMMU. There is also a link between power management for the CPU and the GPU, which should reduce power consumption (currently, any single action the GPU makes will wake up the CPU, even if it's something as simple as a screen refresh).
AMD Compute Unit: Completely redesigned compute core features dedicated L1 and L2 memory, as well as shared L1 for MIMD functionality
As you can see in the image above, a single CU block consists of a single Scalar unit and 64 Vector units, which are fed through multiple layers of cache. Overall, the Compute Unit comes with 16KB of L1 Data cache and 64KB of LDS memory (i.e. scratch memory), with an additional 48KB shared between four CU blocks. Each CU connects to 64KB of dedicated L2 cache.
With Tahiti packing 32 Compute Units in its maximum configuration, a 32 CU GPU with 2048 processing cores features almost 5MB of on-die memory: 512KB of L1 Data cache, 384KB of Shared L1 cache, 2MB of LDS and 2MB of L2 cache. This is a record amount of cache for a GPU so far, and you can expect this trend to continue.
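The "almost 5MB" figure can be reproduced from the per-CU numbers quoted above. The following is only a back-of-the-envelope sketch in Python; the function and constant names are made up for illustration, and the per-CU sizes are the rumored figures from this article, not confirmed specifications.

```python
# Per-CU cache sizes as quoted in the article (rumored figures, in KB).
L1_DATA_PER_CU_KB = 16
LDS_PER_CU_KB = 64
L2_PER_CU_KB = 64
SHARED_L1_PER_GROUP_KB = 48  # one block shared between four CUs
CUS_PER_GROUP = 4


def on_die_memory_kb(num_cus):
    """Return a breakdown (in KB) of on-die memory for a GPU with num_cus CUs."""
    groups = num_cus // CUS_PER_GROUP
    breakdown = {
        "l1_data": num_cus * L1_DATA_PER_CU_KB,
        "shared_l1": groups * SHARED_L1_PER_GROUP_KB,
        "lds": num_cus * LDS_PER_CU_KB,
        "l2": num_cus * L2_PER_CU_KB,
    }
    # The sum is evaluated before the "total" key is added to the dict.
    breakdown["total"] = sum(breakdown.values())
    return breakdown


tahiti_xt = on_die_memory_kb(32)
print(tahiti_xt)
print(tahiti_xt["total"] / 1024, "MB")
```

For 32 CUs this gives 512KB + 384KB + 2048KB + 2048KB = 4992KB, or 4.875MB, which matches the article's "almost 5MB".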
Graphics Core Next: A True MIMD
AMD adopted a smart compute approach. Graphics Core Next is a true MIMD (Multiple-Instruction, Multiple-Data) architecture. With the new design, the company opted for "fat and rich" processing cores that occupy more die space but can handle more data. AMD cites loading the CU with multiple command streams instead of the conventional GPU load: "fire a billion instructions off, wait until they all complete". A single Compute Unit can handle 64 FMAD (Fused Multiply Add) or 40 SMT (Simultaneous Multi-Thread) waves. Wonder how many MIMD instructions GCN can take? Four threads. Four-thread MIMD or 64 SIMD instructions, your call. As Eric explained, Southern Islands is a "MIMD architecture with a SIMD array".
These compute units are paired with conventional fixed-function hardware. AMD tried the non-fixed-function hardware route with the R600 in 2007 (Radeon HD 2000 series), and after that experiment the company saw no value in avoiding fixed-function hardware. Thus, Southern Islands will continue to have up to 64 fixed Raster Ops (ROP), Z units, up to 128 Texture Mapping Units, FSAA logic etc.
Tahiti becomes HD 7950 and 7970, New Zealand becomes HD 7990
Now that we're properly introduced to the GPU core, the time has come to pay more attention to the lineup itself. Given that the memory bus was extended to 384 bits, i.e. the same as the GeForce GTX 580, 3GB of GDDR5 is used across the board, and we would not exclude a 1.5GB or even 896MB "7930" part coming as the number of partially functional GPUs increases.
AMD kept the unified clock concept, and given that the Radeon HD 7970 is based on the fully configured "Tahiti XT" GPU, its 2048 cores (32 Compute Units) operate at a 1GHz clock. The 3GB of GDDR5 memory operates in Quad Data Rate mode, i.e. 1.375GHz QDR ("effective 5.5GHz"). This results in record video memory bandwidth for a single GPU: 264GB/s.
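The 264GB/s figure follows from the bus width and the effective data rate. As a rough sketch (the helper names below are illustrative, and the inputs are the rumored figures quoted in this article), the arithmetic in Python looks like this; a quad-pumped memory clock of about 1.375GHz is what yields the quoted 5.5GHz effective rate:

```python
def effective_rate_ghz(memory_clock_ghz, transfers_per_clock=4):
    """Effective data rate per pin for quad-data-rate signalling."""
    return memory_clock_ghz * transfers_per_clock


def memory_bandwidth_gb_s(bus_width_bits, effective_ghz):
    """Peak bandwidth in GB/s: bits per transfer times rate, divided by 8 bits per byte."""
    return bus_width_bits * effective_ghz / 8


rate = effective_rate_ghz(1.375)                 # 5.5 (GHz effective)
print(memory_bandwidth_gb_s(384, rate))          # rumored HD 7970 configuration
```

With a 384-bit bus at an effective 5.5GHz, this works out to 384 × 5.5 / 8 = 264GB/s.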
The HD 7950 is based on "Tahiti Pro" and packs 30 Compute Units for 1920 cores operating at 900MHz. The number of ROPs decreases to 60, while Texture units are naturally reduced to 120 (as every CU connects to 2 ROPs and 4 TMUs). Our sources did not disclose whether the memory controller is still 384-bit or a 256-bit one, but the memory clock was decreased to 1.25GHz, i.e. the same clock as previous-gen models. Should the 384-bit controller stay, that clock is good for 240GB/s of bandwidth.
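The unit counts for both Tahiti parts follow from the per-CU ratios the article gives (64 cores, 4 TMUs and 2 ROPs per CU). The sketch below tabulates them and also shows what the HD 7950's bandwidth would be under either possible bus width. The names are hypothetical, the ratios are the article's claims rather than confirmed specifications, and the 256-bit case is purely a what-if.

```python
# Per-CU ratios as stated in the article (rumored, not confirmed).
CORES_PER_CU = 64
TMUS_PER_CU = 4
ROPS_PER_CU = 2


def tahiti_config(num_cus):
    """Derive core, TMU and ROP counts from the number of Compute Units."""
    return {
        "cores": num_cus * CORES_PER_CU,
        "tmus": num_cus * TMUS_PER_CU,
        "rops": num_cus * ROPS_PER_CU,
    }


def bandwidth_gb_s(bus_width_bits, effective_ghz):
    """Peak memory bandwidth in GB/s."""
    return bus_width_bits * effective_ghz / 8


for name, cus in (("Tahiti XT (HD 7970)", 32), ("Tahiti Pro (HD 7950)", 30)):
    print(name, tahiti_config(cus))

# HD 7950 at 1.25GHz QDR (5.0GHz effective), for both possible bus widths.
for bus in (384, 256):
    print(f"{bus}-bit:", bandwidth_gb_s(bus, 5.0), "GB/s")
```

If the controller were cut to 256 bits, the same memory clock would give 160GB/s instead of 240GB/s, which is why the undisclosed bus width matters for this part.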
Both products are expected to be released at CES 2012 in Las Vegas, NV, occupying the $349-449 price bracket. Those additional gigabytes of memory (and processing cores) will certainly cost a lot of $$$.
As for the dual-GPU "New Zealand", its 6GB of GDDR5 is expected to be clocked at the same level as on the HD 6990/7970, meaning you will be getting full performance out of the dual-GPU part.
Unlike the HD 7950 and HD 7970, the Radeon HD 7990 will debut in March 2012, and the target price is the same as the original price of its predecessor: $699.