Closing Remarks: TCS23 Promises Improved Performance and Power Efficiency

Coming from Arm's latest client technology day, my impression is that this is a generation where, even more so than usual, Arm's primary focus is on improving power efficiency. Much of the work on the latest cores, including the large Cortex-X4, the middle Cortex-A720, and the little Cortex-A520, isn't about reinventing the wheel, but about greasing it to squeeze out improvements over the previous Armv9-based microarchitectures. All the while, Arm has gone the whole nine yards in ensuring SoC vendors and the broader market are prepared for a complete switch from the current hybrid 64/32-bit mobile market to a full AArch64 world.

Comparing TCS22 to TCS23 with an identical core complex configuration of 1+3+4, Arm claims gains of up to 33% at iso-frequency in the Speedometer benchmark. Elsewhere, things get slightly skewed, as Arm compares its latest TCS23 cluster in a 1+5+2 configuration against TCS22 in 1+3+4. Arm is claiming a 27% improvement in Geekbench 6's multi-threaded benchmark, but swapping two little cores for two additional middle cores has a major impact on that figure.

The critical takeaway from Arm's Armv9.2 announcement is that its IP is now fully committed to a complete 64-bit ecosystem, and Arm is looking to harness all of the benefits of a more unified marketplace.

From a technical standpoint, the refinement of the existing TCS22 IP, for all of the efficiency gains it brings, isn't avant-garde. It's more about refining the current IP to align with a broader market focus on efficiency. From what we can see, much of the gain has come from cutting power to specific structures through features such as RAM power down and slice power down, saving energy where possible so that the power saved can be used in other areas, or not used at all to extend device battery life.

Arm has further improved energy efficiency across all three of its new cores, from the Cortex-X4, the fastest core Arm has ever created, down to the middle Cortex-A720 and the little Cortex-A520. Finding power and performance gains in each of these cores adds up to a much more significant effect on overall efficiency, which is precisely what Arm has been doing for many years. Even the latest DynamIQ Shared Unit (DSU-120) plays its part, with measures targeting dynamic power consumption and various power modes for idle. These make the cluster more efficient, especially when the workload isn't intensive: it can be allocated to the right cores, and specific slices of logic can be powered down to maximize efficiency.

Meanwhile, the shift to the pure, 64-bit only AArch64 ISA yields improvements in several areas. For Arm's IP teams, it allows Arm to focus its efforts on one specific ISA going forward. It also enables parity within its software, which matters given that Arm's software engineering staff makes up 45% of its overall engineering team. That's a serious chunk of manpower working on refinements in the gap between software, hardware, and IP, and one that is driving the 64-bit ISA ecosystem further toward a completely unified market space. The performance and security benefits of the jump from 32-bit to 64-bit are well established; what remains is adopting a unified system, and Arm is undoubtedly encouraging the market to shift away from previous products, including through the newly announced Armv9.2 architecture.

While Chinese companies such as OPPO have been notoriously slow in making the move to 64-bit, the number of 64-bit applications within the Chinese market has grown exponentially over the last year. The application transition from 32-bit to 64-bit has been primarily driven by Google and its Play Store, which has required developers to compile a 64-bit version of their apps for many years. This requirement ensures that software developers, especially those partnered with Arm to optimize their software for the latest Arm IP, are having a positive effect on slower adopters and markets, pushing them to finally make the switch to 64-bit.

The next step for 64-bit is squeezing out more advantages over 32-bit; for one, security plays an even more significant part in driving things forward. Not only does AArch64 outperform AArch32, but the 64-bit ISA also provides more security options. And streamlining the whole process from IP to hardware, to software, to the device, to market should hopefully reduce costs by dropping 32-bit entirely from companies' plans. Even vendors of devices such as the next wave of digital TVs (DTVs), which is a growing market, can undoubtedly apply the benefits of both increased performance and security integrity to their products.

All of this goes hand in hand with the manufacturing side of things as well. For as much as Arm has improved its designs at the IP level and delivered gains on an iso-process basis, node shrinks are still the most potent way to improve chip performance, especially when it comes to energy efficiency. That Arm's TCS23 IP is the first IP to tape out on TSMC's N3E process is no fluke, and it marks the final ingredient of Arm's PPA design philosophy.

Overall, while Arm's 2023 CPU and system IP don't bring any radical microarchitectural changes at any one level, on the whole it's a solid slate of new IP offerings. After getting the ball rolling on the switch to pure 64-bit CPU cores with Cortex-A715 last year, this year's final and full shift will still take some getting used to, but it should be a pretty smooth transition overall. And by doing this at the same time as focusing on those aspects of PPA that their SoC customers really care about, namely small die sizes that keep down energy consumption, Arm is giving their partners two very good reasons to keep moving forward with the rest of the company. Just what kind of chips Arm's partners ultimately build remains to be seen, but we are looking forward to seeing how things pan out later this year.


52 Comments

  • tipoo - Sunday, May 28, 2023 - link

    6 years after iOS went 64 bit only. I'm guessing the cores have also been 64 bit only there for a while?
  • goatfajitas - Monday, May 29, 2023 - link

    iOS is an OS from one company that is made for a few specific products from that one company. You can't fairly compare an open platform to a narrow closed market like that.
  • iAPX - Monday, May 29, 2023 - link

    Yes, Apple SoCs seem to have been 64bit-only for years, which simplifies their own design and gives more efficiency.

    As the 64bit ARM ISA has nothing in common with the 32bit ARM ISA, contrary to x86 and AMD64, they basically started with a blank page, profiting from the experience of various preceding 64bit ISAs. I feel it was the right way to go.
  • dotjaz - Monday, May 29, 2023 - link

    No it's not, Apple is not allowed to modify ARM ISA. If it's ARMv8 compliant, it CANNOT possibly be 64bit only.
  • Doug_S - Monday, May 29, 2023 - link

    ARMv8 makes execution of AArch32 optional. Apple may have been responsible for that as they were involved in the spec of ARMv8 and AArch64 - they would have known they'd want to drop 32 bit code as soon as it was practical.
  • dotjaz - Tuesday, May 30, 2023 - link

    That's factually UNTRUE, Aarch32 execution is mandatory in **hardware implementation**, Aarch64 **OS** can choose not to execute Aarch32 codes
  • Doug_S - Tuesday, May 30, 2023 - link

    Sorry but you are wrong, ARMv8 specifically makes support for AArch32 optional for hardware implementations.
  • Jaybird99 - Monday, May 29, 2023 - link

    Apple is a founding partner with an architectural license. They can change anything they wish on the CPU design, then have it fabricated. I thought this was known because of the wildly different core design from Apple. They take the ISA, then pick and choose and add/delete what they need. They actually help ARM in the long run, as ARM sees how Apple uses 64bit and finds solutions to their issues, because as stated above 64bit was a blank slate for ARM. I'm fairly certain of this, but if you know something I don't? (I might not..)
  • Doug_S - Monday, May 29, 2023 - link

    An architectural license allows them to implement the ISA, but they can't delete things from it. They are able to add things to it (i.e. TSO, their AMX instructions, etc.) but it still has to pass ARM's conformance tests to show it is capable of running ARM code.

    They were able to "delete" AArch32 because ARMv8 allows that. ARMv9 goes further and makes AArch32 a special license addition or something like that - basically Aarch32 is deprecated with ARMv9 and will probably go away entirely with ARMv10.
  • dotjaz - Tuesday, May 30, 2023 - link

    No, they were not able to "delete" AArch32. They can disallow AArch32 codes execution in their OS just like Google Pixel 7-series, they cannot remove the support from hardware.

    And Apple did not add anything to ARM ISA. AMX is masked as a co-processor only available through frameworks, it doesn't directly execute any code other than a "firmware".

    TSO is not an instruction. It's a **mode**. It pertains to HOW the CPU reorders L/S queue. It has nothing to do with the ISA.