{ @(#)22 1.10 src/bos/usr/sbin/perf/pmapi/libpmapi/POWER3-II.evs, pmapi, bos720 12/1/05 12:17:44
{ IBM_PROLOG_BEGIN_TAG
{ This is an automatically generated prolog.
{
{ bos720 src/bos/usr/sbin/perf/pmapi/libpmapi/POWER3-II.evs 1.10
{
{ Licensed Materials - Property of IBM
{
{ COPYRIGHT International Business Machines Corp. 1999,2005
{ All Rights Reserved
{
{ US Government Users Restricted Rights - Use, duplication or
{ disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
{
{ IBM_PROLOG_END_TAG
51,40,31,32,26,23,24,16
{ counter 1 }
#0,v,n,n,n,n,PM_CYC,Processor cycles
Processor clock cycles.
Note: This count may vary from one run of a program to another due to speculative execution and system activity.
#1,v,n,n,n,n,PM_INST_CMPL,Instructions completed
Number of instructions completed.
Note: The count begins as soon as the counters are enabled. If an interface code sequence is used to enable counting, the count will include those instructions in the interface code executed after the counters are enabled. If an interface code sequence is used to disable counting, the count will include those instructions in the interface code executed before the counters are disabled.
Note: The number of instructions completed may vary from one run to another if supervisor mode instructions are counted.
Note: Speculatively executed instructions may be counted.
#2,u,n,n,n,n,PM_TB_BIT_TRANS,Time Base bit transition
Selected Time Base output.
#3,u,n,n,n,n,PM_INST_DISP,Instructions dispatched
Number of instructions dispatched (Max = 4 per cycle)
#4,v,n,n,n,n,PM_LD_CMPL,Loads completed
Number of load instructions completed (Max = 4 per cycle)
#5,v,n,n,n,n,PM_IC_MISS,Instruction cache misses
L1 Instruction cache misses.
#6,v,n,n,n,n,PM_LD_MISS_L2HIT,Load miss occurred in L1, but hit in L2
A load miss occurred in L1 but hit in L2.
Note: Some dcb instructions are also treated/counted as load instructions (re: Book II Data Cache Instructions.)
#7,u,n,t,n,n,PM_LD_MISS_EXCEED_NO_L2,Load D cache misses without lateral L2 cache intervention and exceeding threshold
Load D cache misses without lateral L2 cache intervention and exceeding threshold.
#8,u,n,n,n,n,NUSED,Not used
Reserved.
#9,u,n,t,n,n,PM_ST_MISS_EXCEED_NO_L2,Store D cache misses without lateral L2 cache intervention and exceeding threshold
Store D cache misses without lateral L2 cache intervention and exceeding threshold.
#10,u,n,n,n,n,PM_BURSTRD_MISS_L2_INT,L2 burst read miss & another processor has a modified copy
L2 burst read miss & another processor has a modified copy
#11,u,n,n,n,n,NUSED,Not used
Reserved.
#12,u,n,n,n,n,PM_IC_MISS_USED,An I cache miss line was brought in and used
Brought/wrote a line into the I cache and used it
#13,u,n,n,n,n,PM_DU_ECAM_RCAM_OFFSET_HIT,The ECAM/RCAM logic detected an offset hit from DU
The ECAM/RCAM logic detected an offset hit from DU
#14,u,n,n,n,n,PM_GLOBAL_CANCEL_INST_DEL,Instructions deleted on global cancel
Number of instructions deleted on global cancel
#15,u,n,n,n,n,PM_CHAIN_1_TO_8,Chain PMC1 to PMC8
Chain counter History Mode with PMC1[msb] chained to PMC8[lsb].
#16,u,n,n,n,n,PM_FPU0_BUSY,Floating Point Unit 0 busy
FPU0 was busy.
#17,u,n,n,n,n,PM_DSLB_MISS,Data SLB misses
D cache SLB miss occurred.
#18,u,n,n,n,n,PM_LSU0_ISS_TAG_ST,LSU0 issued a tagged store request to D cache
LSU0 issued a tagged store request to D cache.
#19,v,n,n,n,n,PM_TLB_MISS,TLB misses
TLB misses. Includes both D cache and I cache misses.
#20,u,n,n,n,n,PM_EE_OFF,Cycles MSR(EE) bit off
Number of cycles the MSR(EE) bit is off.
#21,u,n,n,n,n,PM_BRU_IDLE,Number of cycles the branch unit is idle
Branch unit is idle (no conditional branches exec.).
#22,u,n,n,n,n,PM_SYNCHRO_INST,Executing a single instruction serialization
A single instruction serialization is executing.
#23,u,n,n,n,n,NUSED,Not used
Reserved.
#24,u,n,n,n,n,PM_CYC_1STBUF_OCCP,Number of cycles that 1 store buffer is occupied
Number of cycles 1 and only 1 store buffer is occupied
#25,u,n,n,n,n,PM_SNOOP_L1_M_TO_E_OR_S,Snoop based L1 transitions from M to E or S
Number of snoop based L1 transitions from M to E or S
#26,u,n,n,n,n,PM_ST_CMPLBF_AT_GC,Stores in the completion buffer at global cancel
Number of stores in the completion buffer at global cancel.
Note: In order for events 26, 29, and 30 to be counted correctly, the following kind of sequence should be used to clear out residual values in the internal performance monitor counters and state machines:
    ISYNC           /* complete previous activities */
    MTSPR MMCR0,    /* turn on counter 1 */
#27,u,n,n,n,n,PM_LINK_STACK_FULL,Link register stack is full
Link register stack full.
#28,u,n,n,n,n,PM_CBR_RESOLV_DISP,A conditional branch was resolved at dispatch
A conditional branch was resolved at dispatch.
#29,u,n,n,n,n,PM_LD_CMPLBF_AT_GC,Loads in the completion buffer at global cancel
Number of loads in the completion buffer at global cancel.
Note: In order for events 26, 29, and 30 to be counted correctly, the following kind of sequence should be used to clear out residual values in the internal performance monitor counters and state machines:
    ISYNC           /* complete previous activities */
    MTSPR MMCR0,    /* turn on counter 1 */
#30,u,n,n,n,n,PM_ENTRY_CMPLBF,Number of entries in the completion buffer
Number of entries in the completion queue. Number of loads in the completion buffer at global cancel.
Note: In order for events 26, 29, and 30 to be counted correctly, the following kind of sequence should be used to clear out residual values in the internal performance monitor counters and state machines:
    ISYNC           /* complete previous activities */
    MTSPR MMCR0,    /* turn on counter 1 */
#31,u,n,n,n,n,NUSED,Not used
Reserved.
#32,u,n,n,n,n,PM_BIU_ST_RTRY,6xx master transaction retried on bus for store op
6xx master transaction retried on bus for store op.
#33,u,n,n,n,n,PM_EIEIO_WT_ST,Cycles EIEIO waits on a store
Number of cycles eieio stalls a store.
#34,u,n,n,n,n,NUSED,Not used
Reserved.
#35,u,n,n,n,n,PM_I_1_ST_TO_BUS,Number of I=1 store operations to bus
Number of I=1 store operations to bus.
#36,u,n,n,n,n,PM_CRB_BUSY_ENT,Number of CRB busy block entries
Number of CRB busy block entries.
#37,v,n,n,n,n,PM_DC_PREF_STREAM_ALLOC_BLK,D cache prefetch stream allocations blocked
Number of D cache prefetch data stream allocations blocked due to four streams.
#38,u,n,n,n,n,PM_W_1_ST,Number of W=1 stores
Number of W=1 accesses (store).
#39,u,n,n,n,n,PM_LD_CI,Number of cache inhibit (I=1) loads
Number of I=1 loads.
#40,u,n,n,n,n,PM_4MISS,Cycles with 4 and only 4 outstanding misses
Number of cycles with 4 and only 4 outstanding misses.
#41,u,n,n,n,n,PM_ST_GATH_BYTES,Number of store bytes gathered
Number of bytes gathered (store).
#42,u,n,n,n,n,PM_DC_HIT_UNDER_MISS,Number of D cache hit under misses
Number of D cache hits under misses (max of 2 per cycle)
#43,u,n,n,n,n,PM_INTLEAVE_CONFL_STALLS,Cycles of store stalls due to interleave conflict
Number of cycles a store stalls due to interleave conflict or other resource conflict (conflict of RA port)
#44,u,n,n,n,n,PM_DU1_REQ_ST_ADDR_XTION,DU1 requested a store address translation
A store address translation was requested from DU1
#45,u,n,n,n,n,PM_BTC_BTL_BLK,Number of cycles BTC/BTL is blocked from dispatch
Number of cycles branch-to-count/branch-to-link is blocked from dispatch due to operand conflict (prior instruction is updating the count/link register)
#46,u,n,n,n,n,PM_FPU_SUCCESS_OOO_INST_SCHED,Number of FPU successful out-of-order instruction scheduling
Number of FPU successful out-of-order instruction scheduling to both FPU units.
#47,v,n,n,n,n,PM_FPU_LD_ST_ISSUES,Number of FPU loads and stores issued by LSU to DU
Number of FPU loads and stores issued by the LSU to the DU (category 6 Cray megaflops).
Note: The count will be incremented by one each time an effective address is calculated; thus, misaligned operations will cause the count to be larger than the actual number of load/store instructions in the code stream.
#48,v,n,n,n,n,PM_FPU_FPSCR,FPU executed FPSCR instruction
FPU executed an FPSCR instruction (count both FPUs, category 9 Cray megaflops).
Note: Speculative instructions are counted and this may produce counts that are greater than the number of operations actually appearing in the static code stream.
#49,c,n,n,n,n,PM_FPU0_FSQRT,FPU0 executed FSQRT instruction
FPU0 executed an FSQRT instruction (category 3 Cray megaflops).
Note: Only short-circuit FSQRTs are counted. A square root operation that requires iterative computation is counted as one FMADD operation.
#50,v,n,n,n,n,PM_FPU0_FEST,FPU0 executed FEST instruction
FPU0 executed Estimate instructions, FRSQRTE, FRES (category 7 Cray megaflops)
Note: Speculative instructions are counted and this may produce counts that are greater than the number of operations actually appearing in the static code stream.
$$$$
{ counter 2 }
#0,v,n,n,n,n,PM_INST_CMPL,Instructions completed
Number of instructions completed.
Note: The count begins as soon as the counters are enabled. If an interface code sequence is used to enable counting, the count will include those instructions in the interface code executed after the counters are enabled. If an interface code sequence is used to disable counting, the count will include those instructions in the interface code executed before the counters are disabled.
Note: The number of instructions completed may vary from one run to another if supervisor mode instructions are counted.
Note: Speculatively executed instructions may be counted.
#1,u,n,n,n,n,PM_CYC,Processor cycles
Processor clock cycles.
Note: This count may vary from one run of a program to another due to speculative execution and system activity.
#2,u,n,n,n,n,PM_TB_BIT_TRANS,Time Base bit transition
Selected Time Base output.
#3,u,n,n,n,n,PM_INST_DISP,Instructions dispatched
Number of instructions dispatched (Max = 4 per cycle)
#4,u,n,n,n,n,PM_SNOOP_L2ACC,Snoop operation which accessed the L2
Snooped operation which accesses L2.
#5,u,n,n,n,n,PM_DU0_REQ_ST_ADDR_XTION,DU0 requested a store address translation
A store address translation was requested from DU0.
#6,u,n,n,n,n,PM_TAG_BURSTRD_L2MISS,L2 misses caused by tagged burst read
Tagged burst read caused L2 miss
#7,u,n,n,n,n,PM_FPU_IQ_FULL,Number of cycles the FPU instruction queue is full
Number of cycles the FPU instruction queue is full.
#8,v,n,n,n,n,PM_BR_PRED,A conditional branch was predicted
A conditional branch was predicted.
#9,v,n,n,n,n,PM_ST_MISS_L1,L1 D cache store misses
Store miss occurred in L1.
Note: Some dcb instructions are also treated/counted as stores (re: Book II Data Cache Instructions.)
#10,u,n,t,n,n,PM_LD_MISS_EXCEED_L2,Load D cache misses with lateral L2 cache intervention and exceeding threshold
Load D cache misses with lateral L2 cache intervention and exceeding threshold.
#11,u,n,n,n,n,PM_L2ACC_BY_RWITM,RWITM caused an L2 access
RWITM caused an L2 access
#12,u,n,t,n,n,PM_ST_MISS_EXCEED_L2,Store D cache misses with lateral L2 cache intervention and exceeding threshold
Store D cache misses with lateral L2 cache intervention and exceeding threshold.
#13,u,n,n,n,n,PM_ST_COND_FAIL,Store conditional failed
Store conditional (stcx) instruction failed.
#14,u,n,n,n,n,PM_ST_CI_WT_ST_CI,Cycles a cache inhibited store is waiting on a cache inhibited store
Number of cycles a cache-inhibited store waited on a cache-inhibited store
#15,u,n,n,n,n,PM_CHAIN_2_TO_1,Chain PMC2 to PMC1
Chain counter History Mode with PMC2[msb] chained to PMC1[lsb].
#16,u,n,n,n,n,PM_TAG_BURSTRD_MISS_L2_INT,L2 misses with intervention caused by tagged burst read
Tagged L2 burst read miss and another processor has a modified copy
#17,u,n,n,n,n,PM_FXU2_IDLE,FXU2 idle
FXU2 idle
#18,u,n,n,n,n,PM_SC_INST,System calls
Number of system calls.
#19,u,n,n,n,n,PM_DSLB_MISS,Data SLB misses
D cache SLB miss occurred.
#20,u,n,n,n,n,PM_2CASTOUT_BF,Cycles exactly 2 castout buffers occupied
Number of cycles 2 and only 2 store buffers are occupied
#21,u,n,n,n,n,PM_BIU_LD_NORTRY,Master generated load operation is not retried
6xx master transaction not retried on bus (load)
#22,u,n,n,n,n,PM_LARX,Larx executed
Number of larx instructions executed (non-speculative)
#23,u,n,n,n,n,PM_SNOOP_E_TO_S,Snoop based L1/L2 transitions from eXclusive to Shared
Number of snoop based L2 transitions from E to S.
#24,u,n,n,n,n,NUSED,Not used
Reserved.
#25,u,n,n,n,n,PM_IBUF_EMPTY,Empty instruction buffer
Instruction buffer empty this cycle.
#26,u,n,n,n,n,PM_SYNC_CMPLBF_CYC,Cycles a sync instr is at bottom of the completion buffer
Number of cycles a sync instruction is at the bottom of the completion buffer.
#27,u,n,n,n,n,PM_TLBSYNC_CMPLBF_CYC,Cycles a tlbsync instr is at bottom of completion buffer
Number of cycles a tlbsync instruction is at the bottom of the completion buffer.
#28,c,n,n,n,n,PM_DC_PREF_L2_INV,D cache lines inval in L2 due to prefetch LD data
Number of D cache lines invalidated in L2 due to a prefetch load data.
Note: This event is currently counted for only one of the four load ports.
#29,u,n,n,n,n,PM_DC_PREF_FILT_1STR,Cycles D cache prefetch filter has 1 and only 1 stream entry
Number of cycles D cache prefetch filter has 1 and only 1 stream entry.
#30,u,n,n,n,n,PM_ST_CI_PREGATH,Cache inhibited (I=1) stores before gathering
Number of I = 1 stores (before gathering).
#31,u,n,n,n,n,PM_ST_GATH_HW,Number of store halfwords gathered
Number of halfwords gathered (store).
#32,c,n,n,n,n,PM_LD_WT_ADDR_CONF,Cycles load stalls due to interleave conflict
Number of cycles load stalls due to interleave conflict.
Note: PM2_32 is triggered by either of two signals that come from two different units, DU and LSU.
DU: the signal daa_intf_miss_confl_stall_load triggers PM2_32 once when a collision occurs.
LSU: the signal laa_itrlv_cnflct_stal_ld from LSU triggers PM2_32. If the stall is generated by LSU due to an interleave conflict, the signal laa_itrlv_cnflct_stal_ld will be triggered once. If LSU execution is held for another reason (e.g., a DU hold) and LSU is holding an interleave conflict, then the signal laa_itrlv_cnflct_stal_ld will be triggered over and over again.
#33,u,n,n,n,n,PM_TAG_LD_DATA_RECV,LSU1 receives data from memory side
LSU1 received data from memory side.
#34,u,n,n,n,n,PM_FPU1_DENORM,FPU1 received denormalized data
FPU1 received denormalized data.
#35,v,n,n,n,n,PM_FPU1_CMPL,FPU1 produced a result
FPU1 produced a result.
#36,v,n,n,n,n,PM_FPU_FEST,FPU executed FEST instruction
FPU executed Estimate instructions FRSQRTE, FRES (count both FPUs, category 7 Cray megaflops).
Note: Speculative instructions are counted and this may result in counts that are greater than the number of operations actually appearing in the static code stream.
#37,v,n,n,n,n,PM_FPU_LD,FPU executed a load floating point instruction
Number of FPU loads issued by the LSU to the DU (category 6 Cray megaflops).
#38,c,n,n,n,n,PM_FPU0_FDIV,FPU0 executed FDIV instruction
FPU0 executed a divide (category 1 Cray megaflops).
#39,v,n,n,n,n,PM_FPU0_FPSCR,FPU0 executed FPSCR instruction
FPU0 executed an FPSCR instruction (category 9 Cray megaflops).
Note: Speculative instructions are counted and this may produce counts that are greater than the number of operations actually appearing in the static code stream.
$$$$
{ counter 3 }
#0,u,n,n,n,n,PM_IC_MISS_USED,An I cache miss line was brought in and used
Brought/wrote a line into the I cache and used it.
#1,u,n,n,n,n,PM_CYC,Processor cycles
Processor clock cycles.
Note: This count may vary from one run of a program to another due to speculative execution and system activity.
#2,v,n,n,n,n,PM_INST_CMPL,Instructions completed
Number of instructions completed.
Note: The count begins as soon as the counters are enabled.
If an interface code sequence is used to enable counting, the count will include those instructions in the interface code executed after the counters are enabled. If an interface code sequence is used to disable counting, the count will include those instructions in the interface code executed before the counters are disabled.
Note: The number of instructions completed may vary from one run to another if supervisor mode instructions are counted.
Note: Speculatively executed instructions may be counted.
#3,u,n,n,n,n,PM_TB_BIT_TRANS,Time Base bit transition
Selected Time Base outputs.
#4,u,n,n,n,n,PM_INST_DISP,Instructions dispatched
Number of instructions dispatched (Max = 4 per cycle)
#5,u,n,n,n,n,PM_LD_MISS_L1,L1 D cache load misses
A load miss occurred in L1.
Note: Some dcb instructions are also treated/counted as load instructions (re: Book II Data Cache Instructions.)
#6,u,n,n,n,n,PM_TAG_ST_MISS_L2,Tagged RWITM caused L2 miss
Tagged RWITM caused L2 miss
#7,u,n,n,n,n,PM_BRQ_FULL_CYC,Cycles branch queue full
The number of cycles the branch queue is full.
#8,u,n,n,n,n,PM_TAG_ST_MISS_L2_INT,A RWITM in the L2 with intervention
Tagged L2 RWITM miss, another processor has a modified copy
#9,v,n,n,n,n,PM_ST_CMPL,Store instruction completed
Number of store instructions completed.
#10,u,n,n,n,n,PM_TAG_ST_CMPL,Tagged store completed
Number of tagged stores completed.
#11,u,n,n,n,n,PM_LD_NEXT,Load instruction is the next instr. to complete
Load instruction is at the bottom of the completion buffer.
#12,v,n,n,n,n,PM_ST_MISS_L2,RWITM caused L2 miss
RWITM caused L2 miss
#13,u,n,n,n,n,PM_TAG_BURSTRD_L2ACC,L2 access caused by tagged burst read
Tagged burst read caused L2 access
#14,u,n,n,n,n,PM_ST_CI_WT_ST_CI,Cycles a cache inhibited store is waiting on a cache inhibited store
Number of cycles a cache-inhibited store waited on a cache-inhibited store
#15,u,n,n,n,n,PM_CHAIN_3_TO_2,Chain PMC3 to PMC2
Chain counter History Mode with PMC3[msb] chained to PMC2[lsb].
#16,u,n,n,n,n,PM_UNALIGNED_ST,Unaligned stores
Number of unaligned stores (One occurrence per valid unaligned store AGEN)
#17,u,n,n,n,n,PM_CORE_ST_N_COPYBACK,Core-originated stores and copybacks
Number of all core-originated stores and copybacks
#18,u,n,n,n,n,PM_SYNC_RERUN,Sync rerun operations
Number of sync rerun operations initiated by the master
#19,u,n,n,n,n,PM_3CASTOUT_BF,Cycles exactly 3 castout buffers occupied
Number of cycles 3 and only 3 store buffers are occupied
#20,u,n,n,n,n,PM_BIU_RETRY_DU_LOST_RES,BIU retries but DU lost reservation
Number of times BIU sends a retry but DU has already lost the reservation (I=1 only).
#21,u,n,n,n,n,PM_SNOOP_L2_E_OR_S_TO_I,Snoop based L2 transitions E or S to I
Number of snoop based L2 transitions from E or S to I.
#22,c,n,n,n,n,PM_FPU_FDIV,FPU executed FDIV instruction
FPU divides executed (count both FPUs, category 1 Cray megaflops).
#23,u,n,n,n,n,NUSED,Not used
Reserved.
#24,u,n,n,n,n,PM_IO_INTERPT,Number of I/O interrupts
Number of I/O interrupts detected.
#25,u,n,n,n,n,PM_DC_PREF_HIT,D cache prefetch hit
Number of D cache prefetch requests with data in the prefetch buffer
#26,u,n,n,n,n,PM_DC_PREF_FILT_2STR,Cycles D cache prefetch filter has 2 stream entries
Number of cycles D cache prefetch filter has 2 and only 2 stream entries.
#27,u,n,n,n,n,PM_PREF_MATCH_DEM_MISS,Prefetch matches a demand miss
Prefetch matches demand miss.
#28,u,n,n,n,n,PM_LSU1_IDLE,Cycles LSU1 idle
Number of cycles LSU1 is idle.
#29,u,n,n,n,n,NUSED,Not used
Reserved.
#30,u,n,n,n,n,NUSED,Not used
Reserved.
$$$$
{ counter 4 }
#0,u,n,n,n,n,NUSED,Not used
Reserved.
#1,u,n,n,n,n,PM_CYC,Processor cycles
Processor clock cycles.
Note: This count may vary from one run of a program to another due to speculative execution and system activity.
#2,v,n,n,n,n,PM_INST_CMPL,Instructions completed
Number of instructions completed.
Note: The count begins as soon as the counters are enabled.
If an interface code sequence is used to enable counting, the count will include those instructions in the interface code executed after the counters are enabled. If an interface code sequence is used to disable counting, the count will include those instructions in the interface code executed before the counters are disabled.
Note: The number of instructions completed may vary from one run to another if supervisor mode instructions are counted.
Note: Speculatively executed instructions may be counted.
#3,u,n,n,n,n,PM_TB_BIT_TRANS,Time Base bit transition
Selected Time Base outputs.
#4,u,n,n,n,n,PM_INST_DISP,Instructions dispatched
Number of instructions dispatched (Max = 4 per cycle)
#5,u,n,n,n,n,PM_LD_CMPL,Loads completed
Number of load instructions completed (Max = 4 per cycle)
#6,u,n,n,n,n,PM_FPU0_DENORM,FPU0 received denormalized data
FPU0 received denormalized data.
#7,u,n,n,n,n,PM_LSU0_ISS_TAG_LD,LSU0 issued a tagged load request to D cache
LSU0 issued a tagged load request to D cache.
#8,u,n,n,n,n,PM_TAG_ST_L2ACC,Tagged RWITM caused L2 access
Tagged RWITM caused L2 access.
#9,u,n,n,n,n,PM_LSU0_LD_DATA,LSU0 received data (for a load)
LSU0 received data from memory side (L1/L2/6xx).
#10,u,n,n,n,n,PM_ST_MISS_L2_INT,RWITM in L2 with intervention
L2 RWITM miss, another processor has modified copy.
#11,u,n,n,n,n,PM_SYNC,SYNC instructions completed
Sync request was made to the BIU.
#12,u,n,n,n,n,NUSED,Not used
Reserved.
#13,u,n,n,n,n,PM_FXU2_BUSY,FXU2 is busy
FXU2 was busy executing an instruction.
#14,u,n,n,n,n,PM_BIU_ST_NORTRY,Master generated store operation is not retried
6xx master transaction not retried on bus for store op
#15,u,n,n,n,n,PM_CHAIN_4_TO_3,Chain PMC4 to PMC3
Chain counter History Mode with PMC4[msb] chained to PMC3[lsb].
#16,u,n,n,n,n,PM_DC_ALIAS_HIT,Aliased hit in D cache
ECAM/RCAM logic detected an aliased hit.
#17,u,n,n,n,n,PM_FXU1_IDLE,FXU1 idle
FXU1 idle.
#18,u,n,n,n,n,PM_UNALIGNED_LD,Unaligned loads
Number of unaligned loads.
(One occurrence per valid load AGEN)
#19,u,n,n,n,n,PM_CMPLU_WT_LD,Cycles completion unit is stalled for a load instruction
Completion unit is stalled on load operations.
#20,u,n,n,n,n,PM_BIU_ARI_RTRY,A master-generated Bus operation received an ARespin (ARI) retry
A master-generated Bus operation received an ARespIn (ARI) Retry
#21,c,n,n,n,n,PM_FPU_FSQRT,FPU executed FSQRT instruction
Number of FPU FSQRT executed (count both FPUs, category 3 Cray megaflops).
#22,v,n,n,n,n,PM_BR_CMPL,Branches completed
Branches executed.
#23,u,n,n,n,n,PM_DISP_BF_EMPTY,Dispatch buffer is empty this cycle
Dispatch buffer empty.
#24,u,n,n,n,n,PM_LNK_REG_STACK_ERR,Link register stack error
Link register stack error.
#25,u,n,n,n,n,PM_CRLU_PROD_RES,CR logical unit produced a result
CR logical unit produced a result.
#26,u,n,n,n,n,PM_TLBSYNC_RERUN,TLBSYNC rerun operations
Number of tlbsync rerun operations initiated by the master
#27,u,n,n,n,n,PM_SNOOP_L2_M_TO_E_OR_S,Snoop based L2 transitions from M to E or S
Number of snoop based L2 transitions from M to E or S.
#28,u,n,n,n,n,NUSED,Not used
Reserved.
#29,u,n,n,n,n,PM_DEM_FETCH_WT_PREF,Demand fetch blocked by outstanding prefetch
Number of demand fetches blocked by an outstanding prefetch.
#30,v,n,n,n,n,PM_FPU0_FRSP_FCONV,FPU0 executed FRSP or FCONV instructions
FPU0 executed FRSP or FCONV (category 8 Cray megaflops).
Note: Speculative instructions are counted and this may produce counts that are greater than the number of operations actually appearing in the static code stream.
#31,u,n,n,n,n,NUSED,Not used
Reserved.
$$$$
{ counter 5 }
#0,u,n,n,n,n,NUSED,Not used
Reserved.
#1,u,n,n,n,n,PM_IC_HIT,The IC was accessed and a block was fetched
I cache was accessed and a block was fetched.
#2,u,n,n,n,n,PM_0INST_CMPL,No instructions were completing
No instructions were completed.
#3,u,n,n,n,n,PM_FPU_DENORM,FPU received denormalized data
FPU sent denormalized data (count both FPUs).
#4,u,n,n,n,n,PM_BURSTRD_L2ACC,A burst read caused an L2 access
Burst read caused L2 access
#5,v,n,n,n,n,PM_FPU0_CMPL,Floating Point unit produced a result
FPU0 produced a result.
#6,u,n,n,n,n,PM_LSU_IDLE,Cycles LSU is idle
Number of cycles LSU is idle (count both LSUs).
#7,u,n,n,n,n,PM_BTAC_HITS,BTAC Hits
Number of hits in the BTAC.
#8,u,n,n,n,n,PM_STQ_FULL,Store queue is full
Store queue is full.
#9,u,n,n,n,n,PM_BIU_WT_ST_BF,A master-generated store stalled for a store buffer
A master-generated store operation is stalled waiting for a store buffer
#10,u,n,n,n,n,PM_SNOOP_L2_M_TO_I,Snoop based L2 transitions from M to I
Number of snoop based L2 transitions from M to I
#11,v,n,n,n,n,PM_FPU_FRSP_FCONV,FPU executed FRSP or FCONV instructions
Float FRSP or FCONV executed (count both FPUs, category 8 Cray megaflops).
Note: Speculative instructions are counted and this may produce counts that are greater than the number of operations actually appearing in the static code stream.
#12,u,n,n,n,n,PM_CYC,Processor cycles
Processor clock cycles.
Note: This count may vary from one run of a program to another due to speculative execution and system activity.
#13,u,n,n,n,n,PM_BIU_ASI_RTRY,A master-generated Bus operation received an AStatin (ASI) retry
A master-generated Bus operation received an AStatIn (ASI) Retry
#14,u,n,n,n,n,NUSED,Not used
Reserved.
#15,u,n,n,n,n,PM_CHAIN_5_TO_4,Chain PMC5 to PMC4
Chain counter History Mode with PMC5[msb] chained to PMC4[lsb].
#16,u,n,n,n,n,PM_DC_REQ_HIT_PREF_BUF,D cache request hit on prefetch buffer
Number of D cache request hits on the prefetch buffer
#17,u,n,n,n,n,PM_DC_PREF_FILT_3STR,Cycles D cache prefilter has 3 and only 3 stream entries
Number of cycles D cache prefetch filter has 3 and only 3 stream entries.
#18,u,n,n,n,n,PM_3MISS,Cycles with 3 and only 3 outstanding misses
Number of cycles the D cache has 3 and only 3 outstanding misses.
#19,u,n,n,n,n,PM_ST_GATH_WORD,Number of store words gathered
Number of words gathered (store).
#20,u,n,n,n,n,PM_LD_WT_ST_CONF,Cycles load stalls due to store conflict
Number of cycles load stalls due to store conflict.
#21,u,n,n,n,n,PM_LSU1_ISS_TAG_ST,LSU1 issued a tagged store request to D cache
LSU1 issued a tagged store request to D cache.
#22,u,n,n,n,n,PM_FPU1_BUSY,FPU1 busy
FPU1 was busy.
#23,c,n,n,n,n,PM_FPU0_FMOV_FEST,FPU0 executed FMOV or FEST instructions
FPU0 executed MOVE (category 5 Cray megaflops) or EST (category 9 Cray megaflops) or FSEL (category 4 Cray megaflops).
Note: Cray category 5 includes float move and estimate instructions. The counter should not count fsel instructions on FPU0.
Note: Speculative instructions are counted and this may produce counts that are greater than the number of operations actually appearing in the static code stream.
#24,u,n,n,n,n,PM_4CASTOUT_BUF,Cycles 4 and only 4 castout buffers used
Number of cycles 4 and only 4 castout push buffers used
#25,u,n,n,n,n,NUSED,Not used
Reserved.
$$$$
{ counter 6 }
#0,u,n,n,n,n,PM_DSLB_MISS,Data SLB misses
D cache SLB miss occurred.
#1,u,n,n,n,n,PM_ST_HIT_L1,L1 D cache store hits
A store hit occurred in L1.
#2,v,n,n,n,n,PM_FXU2_PROD_RESULT,FXU2 produced a result
FXU2 produced a result.
#3,u,n,n,n,n,PM_BTAC_MISS,BTAC Misses
A BTAC miss was detected.
#4,u,n,n,n,n,NUSED,Not used
Reserved.
#5,u,n,n,n,n,PM_CBR_DISP,Conditional branch dispatched
A conditional branch was dispatched.
#6,u,n,n,n,n,PM_LQ_FULL,Load queue is full
Miss (load) queue is full.
#7,u,n,n,n,n,PM_6XXBUS_CMPL_LOAD,Instr load op completed on 6xx bus & bus op to receive instr
Instruction load op completed on 6xx bus and bus op to receive instruction
#8,u,n,n,n,n,PM_SNOOP_PUSH_INT,A snoop caused a push or an intervention
Number of snoop pushes and interventions
#9,u,n,n,n,n,PM_EE_OFF_EXT_INT,Cycles MSR(EE) bit off and external interrupt pending
EE bit is off and an external interrupt is pending.
#10,u,n,n,n,n,PM_BIU_LD_RTRY,Master generated load is retried
A master generated load operation is retried. See also 2(21).
#11,c,n,n,n,n,PM_FPU_FCMP,FPU executed FCMP instruction
Float FCMP executed (count both FPUs, category 4 Cray megaflops).
Note: Cray category 5 includes float move and estimate instructions. The counter should count float move and estimate instructions on both FPUs. The counter should not count fcmp instructions on both FPUs.
Note: Speculative instructions are counted and this may produce counts that are greater than the number of operations actually appearing in the static code stream.
#12,u,n,n,n,n,PM_CYC,Processor cycles
Processor clock cycles.
Note: This count may vary from one run of a program to another due to speculative execution and system activity.
#13,u,n,n,n,n,PM_DC_PREF_BF_INV,D cache prefetch buffer invalidates
Number of D cache prefetch buffer invalidates
#14,u,n,n,n,n,PM_DC_PREF_FILT_4STR,D cache prefetch filter has 4 and only 4 stream entries
Number of cycles D cache prefetch filter has 4 and only 4 stream entries.
#15,u,n,n,n,n,PM_CHAIN_6_TO_5,Chain PMC6 to PMC5
Chain counter History Mode with PMC6[msb] chained to PMC5[lsb].
#16,u,n,n,n,n,PM_1MISS,Cycles with 1 and only 1 outstanding miss
Number of cycles with 1 and only 1 outstanding miss.
#17,u,n,n,n,n,PM_ST_GATH_DW,Number of store doublewords gathered
Number of doublewords gathered (store).
#18,u,n,n,n,n,PM_LSU1_ISS_TAG_LD,LSU1 issued a tagged load request to D cache
LSU1 issued a tagged load request to D cache.
#19,u,n,n,n,n,PM_FPU1_IDLE,FPU1 idle
FPU1 idle.
#20,v,n,n,n,n,PM_FPU0_FMA,FPU0 executed multiply-add instruction
FPU0 executed a Multiply-Add (category 2 Cray megaflops).
Note: Speculative instructions are counted and this may produce counts that are greater than the number of operations actually appearing in the static code stream.
#21,u,n,n,n,n,PM_SNOOP_PUSH_BUF,Cycles snoop push buffer used
Number of cycles snoop push buffer used
#22,u,n,n,n,n,NUSED,Not used
Reserved.
$$$$
{ counter 7 }
#0,u,n,n,n,n,PM_IC_MISS,Instruction cache misses
L1 Instruction cache misses.
#1,v,n,n,n,n,PM_FXU0_PROD_RESULT,FXU0 produced a result
FXU0 produced a result.
#2,u,n,n,n,n,PM_BR_DISP,Instructions dispatched to the branch unit
A branch was dispatched (any).
#3,v,n,n,n,n,PM_BR_MPRED_GC,Global cancel due to mispredicted branch
Global cancel due to a branch guessed wrong.
#4,u,n,n,n,n,PM_SNOOP,Snoop requests received
A snoop occurred (any).
#5,u,n,n,n,n,NUSED,Not used
Reserved.
#6,u,n,n,n,n,PM_0INST_DISP,No instructions dispatched
No instructions were dispatched.
#7,c,n,n,n,n,PM_FXU_IDLE,FXU idle
Counts the number of cycles that both of the fxu single cycle execution units (fxu0 and fxu1) are idle and the single cycle instruction queue is empty.
Note: The fact that fxu1 is executing is ignored.
#8,u,n,n,n,n,PM_6XX_RTRY_CHNG_TRTP,Bus retried transaction that change trans. type
6xx bus retried transaction that change transaction type
#9,v,n,n,n,n,PM_FPU_FMA,FPU executed multiply-add instruction
Float Multiply-Adds executed (count both FPUs, category 2 Cray megaflops).
Note: Speculative instructions are counted and this may produce counts that are greater than the number of operations actually appearing in the static code stream.
#10,u,n,n,n,n,PM_ST_DISP,Store instructions dispatched
Number of store instructions which were dispatched.
#11,u,n,n,n,n,PM_CYC,Processor cycles
Processor clock cycles.
Note: This count may vary from one run of a program to another due to speculative execution and system activity.
#12,u,n,n,n,n,PM_TLBSYNC_CMPLBF_CYC,Cycles a tlbsync instr is at bottom of completion buffer
Number of cycles a tlbsync instruction is at the bottom of the completion buffer.
#13,u,n,n,n,n,NUSED,Not used
Reserved.
#14,u,n,n,n,n,PM_DC_PREF_L2HIT,D cache prefetch request and data in L2
Number of D cache prefetch requests with data in L2
#15,u,n,n,n,n,PM_CHAIN_7_TO_6,Chain PMC7 to PMC6
Chain counter History Mode with PMC7[msb] chained to PMC6[lsb].
#16,u,n,n,n,n,PM_DC_PREF_BLOCK_DEMAND_MISS,Cycles demand miss blocked with 1 or more prefetches outstanding
Number of cycles demand miss blocked with 1 or more prefetches outstanding.
#17,u,n,n,n,n,PM_2MISS,Cycles with 2 and only 2 outstanding misses
Number of cycles with 2 and only 2 outstanding misses.
#18,v,n,n,n,n,PM_DC_PREF_USED,D cache prefetched and used
Number of D cache prefetches that were used
#19,u,n,n,n,n,PM_LSU_WT_SNOOP_BUSY,Cycles LSU is stalled due to snoop busy
Number of cycles load/store unit stalled due to snoop busy.
#20,u,n,n,n,n,PM_IC_PREF_USED,I cache prefetched and used
Number of I cache prefetch misses in progress that changed to a normal, non-prefetch (demand) miss.
#21,u,n,n,n,n,NUSED,Not used
Reserved.
#22,c,n,n,n,n,PM_FPU0_FADD_FCMP_FMUL,FPU0 executed FADD, FCMP or FMUL
FPU0 executed an Add, Compare, Multiply, Subtract (category 4 Cray megaflops).
Note: Cray category 4 includes float add, multiply, subtract, compare, and fsel instructions. The counter should count fsel instructions on FPU0.
Note: Speculative instructions are counted and this may produce counts that are greater than the number of operations actually appearing in the static code stream.
#23,u,n,n,n,n,PM_1WT_THRU_BUF_USED,Cycles 1 write-through buffer used
Number of cycles 1 write-through buffer used
$$$$
{ counter 8 }
#0,u,n,n,n,n,PM_TLB_MISS,TLB misses
TLB misses. Includes both D cache and I cache misses.
#1,v,n,n,n,n,PM_SNOOP_L2HIT,Snoop hits in L2
Snoop hit occurred and L2 has the valid block.
#2,v,n,n,n,n,PM_BURSTRD_L2MISS,A burst read caused an L2 miss
Burst read caused L2 miss
#3,u,n,n,n,n,PM_STCX_SUCCESS,Successful conditional stores with reservation
Store conditional (stcx) instruction executed successfully
#4,v,n,n,n,n,PM_FXU1_PROD_RESULT,FXU1 produced a result
FXU1 produced a result.
#5,u,n,n,n,n,PM_RETRY_BUS_OP,Retry 6xx bus operation
Retry 6xx bus operation
#6,u,n,n,n,n,PM_FPU_IDLE,Number of cycles the FP unit is idle
FPU idle (count both FPUs).
#7,u,n,n,n,n,PM_FETCH_CORR_AT_DISPATCH,Fetch corrections made at dispatch stage
Fetch corrections made at dispatch stage.
#8,u,n,n,n,n,PM_CMPLU_WT_ST,Cycles completion unit is stalled for a store instruction
Completion unit is stalled on store operations.
#9,c,n,n,n,n,PM_FPU_FADD_FMUL,FPU executed FADD or FMUL instruction
Float Add, Multiply, Subtract executed (count both FPUs, category 4 Cray megaflops).
Note: Cray category 4 includes float add, multiply, subtract, compare, and fsel instructions. The counter should count float compare and fsel instructions on both FPUs.
Note: Speculative instructions are counted and this may result in counts that are greater than the number of operations actually appearing in the static code stream.
#10,u,n,n,n,n,PM_LD_DISP,Load instr dispatched
Number of load instructions dispatched (lm & lstr counted as 1).
#11,u,n,n,n,n,PM_ALIGN_INT,An alignment interrupt was executed
Exception due to unaligned data.
#12,u,n,n,n,n,PM_CYC,Processor cycles
Processor clock cycles.
Note: This count may vary from one run of a program to another due to speculative execution and system activity.
#13,u,n,n,n,n,PM_SYNC_CMPLBF_CYC,Cycles a sync instr is at bottom of the completion buffer
Number of cycles a sync instruction is at the bottom of the completion buffer.
#14,u,n,n,n,n,PM_2WT_THRU_BUF_USED,Cycles 2 write-through buffers used
Number of cycles 2 write-through buffers used
#15,u,n,n,n,n,PM_CHAIN_8_TO_7,Chain PMC8 to PMC7
Chain counter History Mode with PMC8[msb] chained to PMC7[lsb].