# IBM_PROLOG_BEGIN_TAG
# This is an automatically generated prolog.
#
# bos720 src/bos/usr/lpp/bos/README.HMT.sh 1.2
#
# Licensed Materials - Property of IBM
#
# Restricted Materials of IBM
#
# COPYRIGHT International Business Machines Corp. 2000
# All Rights Reserved
#
# US Government Users Restricted Rights - Use, duplication or
# disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
#
# IBM_PROLOG_END_TAG

*************************
       README.HMT
*************************

* Overview
* Limitations
* AIX Behavior Differences
* Performance Considerations
* Usage
* Note to Developers

========
Overview
========

The purpose of this README is to announce support of Hardware
Multithreading (HMT) on the RS/6000 Enterprise Server M80 and IBM
eServer pSeries p680. This new AIX feature provides a performance
tuning technique which benefits some workloads, regresses others,
limits some system functionality, and changes expected behavior of
some AIX commands. Customers are advised to review this README in its
entirety and the referenced White Paper before enabling HMT.

Hardware Multithreading (HMT) should not be confused with software
"multithreading", which is basically one task subdivided into multiple
threads of execution. In HMT, the "threads" may or may not be related,
as in the subdivided task definition.

The basic technique of HMT is that a processor simultaneously holds
the state of 2 threads. For example, when a cache miss occurs (L1 or
L2), which would normally delay the processor for many cycles, the
processor switches to another state and attempts to execute
instructions from the other thread. This may keep the CPU more fully
utilized and thereby improve processor throughput.

When HMT is enabled, the system will appear to have twice as many
processors as it actually physically has. For example, an 8-way SMP
will appear to be a 16-way SMP.
It should be noted that the performance improvements that may be
gained by having HMT would not be as great as actually adding 8
processors to the system.

===========
Limitations
===========

1) When HMT is enabled, unpredictable results (including system
   crashes) may occur when vendor software has built-in dependencies
   upon the number of processors in the system.

2) HMT is not supported on UP systems nor on the
   /usr/lib/boot/unix_up kernel.

3) Automatic Runtime Predictive Deconfiguration of Processors (RPDP)
   is NOT supported when HMT is enabled. If RPDP has been configured
   before HMT is enabled and then HMT is enabled, the RPDP subsystem
   will NOT be configured on the next system boot and AIX will not
   deallocate CPUs. When HMT is disabled, the RPDP will be configured
   on the next system boot (assuming RPDP's configuration has not been
   changed).

4) When HMT is enabled, CPU utilization metrics at the thread, process
   or processor level will be skewed by a factor heavily dependent on
   the workload running on the machine. This is due to the hardware
   threading of a single processor into "two processors" and the way
   CPU utilization metrics are currently updated. In particular, some
   of the sampling based tools are less accurate using HMT with low
   CPU utilization workloads. Measuring scaled throughput when HMT is
   enabled will not produce useful data.

5) The lsdev -C and lscfg commands will still report the actual number
   of physical processors present. That is, the number of processors
   reported by these commands will not be doubled when HMT is enabled.

6) The diag command will still test the actual number of physical
   processors present. That is, the number of processors tested by
   diagnostics will not be doubled when HMT is enabled.

7) The bindintcpu subroutine may now return the error EXDEV.

8) The bindintcpu command may now report the following error message:

        Unable to assign interrupt level to specified processor.

9) The i_int2cpu_ppc() service may now return the error EXDEV.

10) Capacity Upgrade On Demand (CUoD) constraints will be respected
    when HMT is enabled.

11) Kernel developers should take note of additional limitations
    indicated in the "Note to Developers" section of this README.

=========================
AIX Behavior Differences
=========================

1) When the system is booting and the "Welcome to AIX" message is
   displayed, the number of processors reported will be the actual
   number of physical processors in the system.

2) While the system is booting with HMT enabled and the debugger is
   enabled, the following may be observed:

        Starting NODE#000 physical CPU#000 as logical CPU#001... done.
        Starting NODE#000 physical CPU#001 as logical CPU#002... done.
        Starting NODE#000 physical CPU#001 as logical CPU#003... done.
        Starting NODE#000 physical CPU#002 as logical CPU#004... done.
        Starting NODE#000 physical CPU#002 as logical CPU#005... done.
        ....

   The apparent starting of a physical CPU number twice is normal when
   HMT is enabled.

3) The "netstat -m" command will report twice as many CPUs as the
   actual number of physical processors.

4) The "bindprocessor -q" command will report twice as many CPUs as
   the actual number of physical processors.

5) With HMT active, the system will appear to have CPUs that can run
   at variable rates. The amount of CPU time reported as used by a
   thread performing any given task may vary significantly from run to
   run.

6) With HMT active, the amount of CPU time reported as used by a
   thread performing a given task will be greater than the amount
   reported with HMT inactive.

==========================
Performance Considerations
==========================

IBM has seen mixed results in its performance evaluation of HMT. A
report of this evaluation is available at

        ftp://ftp.software.ibm.com/aix/tools/perftools/perfpmr/misc_documents/HMT_wp.ps

In general, HMT increases raw throughput of "heavy" workloads where
CPU utilization is over 90%.
Light workloads may see a decline in throughput when HMT is enabled.

=====
Usage
=====

1) RS/6000 Enterprise Server 7026-M80 customers need to verify that
   the system firmware has been updated since 11/27/2000 and is at or
   beyond Version MM001108. See

        http://www.rs6000.ibm.com/support/micro

   for instructions on verifying, downloading, and installing
   firmware.

2) To enable HMT, issue the following command:

        bosdebug -H on

   When HMT is enabled, the following is displayed:

        HMT on

   The bosboot command must be run and the system rebooted in order
   for HMT to be activated.

3) To disable HMT, issue the following command:

        bosdebug -H off

   The "HMT on" indicator message will no longer be displayed.

   The bosboot command must be run and the system rebooted in order
   for HMT to be deactivated.

4) To check if HMT is enabled, issue the following command:

        bosdebug

   If HMT is enabled, the following will be displayed:

        HMT on

   If HMT is disabled, no HMT message will be displayed.

5) To check if HMT is active, issue the following command:

        bindprocessor -q

   If HMT is active, twice the number of physical CPUs will be listed.

   HMT being active means that bosdebug -H on, bosboot, and a
   subsequent system reboot have all been completed successfully, and
   that the system is currently running with the HMT processor
   technique.

==================
Note to Developers
==================

When HMT is enabled, please note the following:

1) Developers who use the low-level debugger (lldb) that can be
   enabled with the /usr/lib/boot/unix_mp kernel should be aware that
   this debugger does NOT support switching between CPUs when HMT is
   enabled. While in the debugger, switching between CPUs may cause
   the debug session to hang, in which case the system must be
   rebooted.

2) Developers who use the Kernel Debugger (KDB) that can be enabled
   with the /usr/lib/boot/unix_mp_kdb kernel should be aware that this
   debugger's default toggle settings do not support switching between
   CPUs.
   To be able to switch between CPUs, the IPI_enable toggle flag must
   be set to false. To do this, issue:

        set IPI_enable

   until the following is displayed:

        IPI_enable is false

   If the IPI_enable flag is not set to false and switching between
   CPUs is attempted, the debug session may hang, in which case the
   system must be rebooted.

3) Timing a code segment by time stamping the start and end of the
   segment cannot be expected to yield the true CPU time required by
   the segment, even if that segment runs with all interrupts
   disabled.
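
The check described in Usage item 5 (twice the number of physical CPUs
listed by "bindprocessor -q") could be scripted roughly as follows.
This is only a sketch, not an IBM-supplied tool. It assumes that the
command prints a single line of the form "The available processors
are:  0 1 2 3", which this README does not state, and that the caller
already knows the physical processor count (for example from lsdev -C
or lscfg, which are not doubled under HMT). The function names
logical_cpu_ids and hmt_appears_active are hypothetical.

```python
import re


def logical_cpu_ids(bindprocessor_output):
    """Return the CPU numbers listed in `bindprocessor -q` output.

    Assumption: the output looks like
    "The available processors are:  0 1 2 3".
    """
    # Everything after the first colon is treated as the CPU list.
    _, _, cpu_list = bindprocessor_output.partition(":")
    return [int(token) for token in re.findall(r"\d+", cpu_list)]


def hmt_appears_active(bindprocessor_output, physical_cpus):
    """True if twice as many CPUs are listed as physically exist."""
    listed = len(logical_cpu_ids(bindprocessor_output))
    return physical_cpus > 0 and listed == 2 * physical_cpus


if __name__ == "__main__":
    # Hypothetical output captured on a 4-way system.
    sample = "The available processors are:  0 1 2 3 4 5 6 7"
    print(hmt_appears_active(sample, physical_cpus=4))
```

The physical count is passed in rather than derived, so the comparison
does not depend on any particular lsdev or lscfg output layout.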