In the run-up to its Developer Forum next month, which is way the heck over in Shanghai, Intel dropped a few factoids about Nehalem, Dunnington and Larrabee on the press this week.
This being an even-numbered year, Intel is going to move to a new microarchitecture, represented by Nehalem. It uses a HyperTransport-like point-to-point interconnect called QuickPath, which gets Intel out of the front-side bus business, and an integrated memory controller, another AMD borrowing. The memory controller is Nehalem’s single biggest enhancement, by Intel’s lights, and will get data to the cores faster.
The controller has three memory channels per processor and what Intel describes as massive amounts of bandwidth. The QuickPath interconnect is supposed to be good for up to 25.6 GB/sec total bandwidth per link, and there are two links per CPU socket.
Memory bandwidth has been an AMD advantage for the last four years. The jump between Intel’s current Harpertown Xeon, with its 1600MHz FSB, and Nehalem, with its three channels of DDR3-1333, is supposed to be a 4x increase in bandwidth.
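For a rough feel for those numbers, here is a back-of-the-envelope sketch of peak theoretical bandwidth per socket. It assumes a 64-bit (8-byte) data path for both the front-side bus and each DDR3 channel, and the function name is made up for illustration. These are paper peaks, not measured throughput.

```python
def peak_bandwidth_gb_s(mega_transfers_per_sec, bus_width_bytes=8, channels=1):
    """Peak theoretical bandwidth in GB/s (1 GB = 10**9 bytes).

    Assumption: every transfer moves bus_width_bytes on each channel.
    """
    return mega_transfers_per_sec * 1e6 * bus_width_bytes * channels / 1e9


# Harpertown: one front-side bus at 1600 MT/s (assumed 8 bytes wide).
harpertown_fsb = peak_bandwidth_gb_s(1600)

# Nehalem: three DDR3-1333 channels (assumed 8 bytes wide each).
nehalem_ddr3 = peak_bandwidth_gb_s(1333, channels=3)

print(f"Harpertown FSB peak: {harpertown_fsb:.1f} GB/s")
print(f"Nehalem DDR3 peak:   {nehalem_ddr3:.1f} GB/s")
print(f"Ratio:               {nehalem_ddr3 / harpertown_fsb:.2f}x")
```

On paper, that works out to roughly 2.5x per socket rather than 4x, so Intel’s figure presumably rests on some other basis, such as sustained rather than peak bandwidth, that it did not spell out.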
Nehalem is supposed to scale from two to eight cores, and it appears that the architecture, for everything from laptops to servers, is outfitted with two-way simultaneous multithreading, which chip groupie Nathan Brookwood says will make a four-core chip look like eight processors to Windows and an eight-core chip look like 16. The architecture, he said, also makes fully buffered DIMMs history.
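Brookwood’s arithmetic is simply physical cores times hardware threads per core. A minimal sketch, with a hypothetical helper name:

```python
def logical_processors(physical_cores, threads_per_core=2):
    """Logical processors the OS sees with simultaneous multithreading.

    threads_per_core=2 models the two-way SMT described for Nehalem.
    """
    return physical_cores * threads_per_core


for cores in (2, 4, 8):
    print(f"{cores} cores -> {logical_processors(cores)} logical processors")
```

Those are scheduling slots, not extra execution hardware, so the operating system seeing 16 processors does not mean 16 cores’ worth of throughput.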
Bound for production in Q4, the 45nm Nehalem has a shared and inclusive 8MB L3 cache, which will minimize snoop traffic. The low-latency L2 cache runs 256KB per core. The L1 cache is the same as in the Intel Core microarchitecture: 32KB instruction/32KB data.
Nehalem goes into the Tylersburg platform, which Intel said can be configured as either a one-socket high-end desktop or a two-socket HPC or dual-processing server.
Now then, Larrabee, Intel’s first many-core chip. Exactly how many cores is still unclear. Intel’s graphic suggests at least 12. It plans to demonstrate the thing later this year, though it may not hit the market until 2010.
Larrabee is where computing and graphics are supposed to come together on the same silicon.
Since programming such beasts can be a problem – and not daring to repeat its Itanium experience – Intel is relying on good ole x86 to save the day. It will, however, be introducing a specialized component called a vector processing unit, which means extending x86 with vector processing instructions, including integer and floating-point arithmetic, vector memory operations and conditional instructions.
These new instructions are there to improve the performance of graphics and video applications.
Larrabee, scalable into the teraflops, also includes a major new hardware-coherent cache design that enables the many-core architecture. Larrabee will support familiar APIs such as DirectX and OpenGL.
Sandy Bridge, which used to be code-named Gesher and is another microarchitecture change slated for 2010, will also use a set of vector processing instructions that Intel calls Advanced Vector Extensions (AVX) or “SSE on steroids.” AVX will increase performance in floating-point, media and processor-intensive software, Intel said, by using 256-bit vectors instead of 128-bit ones. It can also increase energy efficiency. And, fear not, it is backwards-compatible, an odd point for Intel to call out.
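What the wider vectors buy is more data elements per instruction. The sketch below counts how many values fit in one vector register at each width, assuming standard 32-bit single-precision and 64-bit double-precision floats. The helper name is hypothetical, and this is a model of register capacity, not of actual speedup.

```python
def lanes(vector_bits, element_bits):
    """Number of elements that fit in one vector register."""
    if vector_bits % element_bits != 0:
        raise ValueError("element size must divide the vector width evenly")
    return vector_bits // element_bits


for name, width in (("SSE (128-bit)", 128), ("AVX (256-bit)", 256)):
    singles = lanes(width, 32)  # 32-bit single-precision floats
    doubles = lanes(width, 64)  # 64-bit double-precision floats
    print(f"{name}: {singles} floats or {doubles} doubles per instruction")
```

Doubling the lanes doubles the peak work per vector instruction, but real gains depend on whether the code vectorizes cleanly and whether memory can keep the units fed.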