Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
587 views
in Technique[技术] by (71.8m points)

x86 - How does the indexing of the Ice Lake's 48KiB L1 data cache work?

The Intel manual optimization (revision September 2019) shows a 48 KiB 8-way associative L1 data cache for the Ice Lake microarchitecture.

Ice Lake's 48KiB L1 Data cache and its 8-way associativity 1 Software-visible latency/bandwidth will vary depending on access patterns and other factors.

This baffled me because:

  • There are 96 sets (48 KiB / 64 / 8), which is not a power of two.
  • The indexing bits of a set and the indexing bits of the byte offset add to more than 12 bits, this makes the cheap-PIPT-as-VIPT-trick not available for 4KiB pages.

All in all, it seems that the cache is more expensive to handle but the latency increased only slightly (if it did at all, depending on what Intel means exactly with that number).

With a bit of creativity, I can still imagine a fast way to index 96 sets but point two seems an important breaking change to me.

What am I missing?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

The optimization manual is wrong.

According to the CPUID instruction, the associativity is 12 (on a Core i5-1035G1). See also uops.info/cache.html and en.wikichip.org/wiki/intel/microarchitectures/ice_lake_(client).

This means that there are 64 sets, which is the same as in previous microarchitectures.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...