Troubleshooting the SNMP-based status commands (clstat, cldump, cldisp)

This troubleshooting procedure addresses all of the common problems (and some uncommon ones) that can cause the PowerHA SNMP-based status commands to fail. The procedure starts by looking at two known problems. Most of the time, fixing these problems will solve the issue and you will not need to go through the entire procedure. However, even if the first two items are correct, there may be other things that are still interfering with the SNMP-based status commands. The rest of the procedure goes through the entire configuration for the SNMP-based status commands in step-by-step fashion, making sure that everything is correct.

1. Check for access permission to the PowerHA portion of the SNMP Management Information Base (MIB) in the SNMP configuration file.

Find the defaultView entries in the /etc/snmpdv3.conf file:

    # grep defaultView /etc/snmpdv3.conf
    #VACM_VIEW defaultView internet - included -
    VACM_VIEW defaultView 1.3.6.1.4.1.2.2.1.1.1.0 - included -
    VACM_VIEW defaultView 1.3.6.1.4.1.2.6.191.1.6 - included -
    VACM_VIEW defaultView snmpModules - excluded -
    VACM_VIEW defaultView 1.3.6.1.6.3.1.1.4 - included -
    VACM_VIEW defaultView 1.3.6.1.6.3.1.1.5 - included -
    VACM_VIEW defaultView 1.3.6.1.4.1.2.6.191 - excluded -
    VACM_ACCESS group1 - - noAuthNoPriv SNMPv1 defaultView - defaultView -
    VACM_ACCESS director_group - - noAuthNoPriv SNMPv2c defaultView - defaultView -

Beginning in AIX 6.1, as a security precaution, the snmpdv3.conf file is shipped with the internet access commented out. The example shows the unmodified configuration file: the internet descriptor is commented out, which means that there is no access to most of the MIB, including the PowerHA information. (Other included entries provide access to other limited parts of the MIB.) By default in AIX 6.1 and later, the PowerHA SNMP-based status commands will not work, unless you edit the snmpdv3.conf file.
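
The check in step 1 can also be scripted. The following is a minimal sketch, not part of PowerHA or AIX: has_powerha_view is a hypothetical helper name, and the sketch assumes the VACM_VIEW field order shown in the grep output above (view name, subtree, mask, type, storage). It only recognizes the internet descriptor, the risc6000clsmuxpd descriptor, and the OID 1.3.6.1.4.1.2.3.1.2.1.5; it does not evaluate other OID prefixes or excluded entries that might override an included one.

```shell
#!/bin/sh
# has_powerha_view CONF [VIEW]
# Hypothetical helper (an assumption for illustration, not an AIX command).
# Succeeds (exit status 0) if CONF has an uncommented VACM_VIEW entry for VIEW
# that includes either the whole internet subtree or the PowerHA MIB.
has_powerha_view() {
    conf=$1
    view=${2:-defaultView}
    awk -v view="$view" '
        # Ignore commented-out entries such as "#VACM_VIEW defaultView internet ..."
        /^[ \t]*#/ { next }
        # Fields: VACM_VIEW viewName subtree mask type storage
        $1 == "VACM_VIEW" && $2 == view && $5 == "included" &&
        ($3 == "internet" || $3 == "risc6000clsmuxpd" || $3 == "1.3.6.1.4.1.2.3.1.2.1.5") { found = 1 }
        END { exit found ? 0 : 1 }
    ' "$conf"
}
```

On a cluster node the helper might be called as: has_powerha_view /etc/snmpdv3.conf || echo "PowerHA MIB is not in defaultView"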
There are two ways to provide access to the PowerHA MIB:

- Uncomment the internet line in snmpdv3.conf. This will provide access to the entire MIB:

    VACM_VIEW defaultView internet - included -

- If you do not want to provide access to the entire MIB, you can provide access to only the PowerHA part of the MIB. Add a line to snmpdv3.conf that provides access only to the PowerHA MIB:

    VACM_VIEW defaultView risc6000clsmuxpd - included -

After editing the SNMP configuration file, you must stop and restart snmpd, and then refresh the cluster manager:

    stopsrc -s snmpd
    startsrc -s snmpd
    refresh -s clstrmgrES

Retry the SNMP-based status commands. If they work now, you are done.

2. If you are using PowerHA 7.1.2 or later, check for the correct IPv6 entries in the configuration files for clinfoES and snmpd.

In 7.1.2, an entry is added to the /usr/es/sbin/cluster/etc/clhosts file to support IPv6. However, the required corresponding entry was not added to the /etc/snmpdv3.conf file. This causes intermittent problems with clstat. There are two ways to address this problem:

- If you do not plan on using IPv6, you can simply comment out the line in the /usr/es/sbin/cluster/etc/clhosts file:

    # ::1 # PowerHA SystemMirror

  and restart clinfoES:

    stopsrc -s clinfoES
    startsrc -s clinfoES

  Retry the SNMP-based status commands. If they work now, you are done.

- If you are planning on using IPv6, or might be using it in the future, add the following line to the /etc/snmpdv3.conf file:

    COMMUNITY public public noAuthNoPriv :: 0 -

  If you are using a different community (other than public), substitute that community name for the word public. After editing the SNMP configuration file, you must stop and restart snmpd, and then refresh the cluster manager:

    stopsrc -s snmpd
    startsrc -s snmpd
    refresh -s clstrmgrES

  Retry the SNMP-based status commands. If they work now, you are done.

If you still have problems with the SNMP-based status commands, continue with the following steps.

3. Is snmpd running?

    lssrc -s snmpd

If not, start it:

    startsrc -s snmpd

4. Are cluster services running?

    lssrc -ls clstrmgrES | grep state

You are looking for a state of ST_STABLE. If cluster services are not running, none of the SNMP status commands will work. You need to start cluster services.

5. If using clstat: Is the /usr/es/sbin/cluster/etc/clhosts file correct?

The clhosts file should contain a list of IP labels/addresses of any PowerHA nodes with which the clinfoES daemon may communicate. Persistent addresses are a good choice. If the file contains addresses that do not belong to a cluster node, this will cause problems. If you edit the file on a system, you must restart clinfoES on that system.

- If this is a cluster node: By default, the clhosts file is pre-populated with the localhost address. You may want to add entries for all the nodes in the cluster, so that clstat will work as long as cluster services are running on any node.
  - Starting in PowerHA 7.1.2, an entry for the IPv6 loopback address is added to the default clhosts file. As described in step 2, you can either comment out this line or add a line for the IPv6 loopback address to the SNMP configuration file.
- If this is a client system: By default the clhosts file is empty. You must add addresses for the cluster nodes.

6. If using clstat: Is clinfoES running?

    lssrc -s clinfoES

If not, start it:

    startsrc -s clinfoES

You may also want to have clinfoES started every time you start cluster services.

7. Is snmpd listening at the smux port and is the cluster manager connected?
Use the following netstat command to list active sockets using the smux port:

    # netstat -Aa | grep smux
    f1000e0002988bb8 tcp 0 *.smux *.* LISTEN
    f1000e00029d8bb8 tcp4 0 0 loopback.smux loopback.32776 ESTABLISHED
    f1000e00029d4bb8 tcp4 0 0 loopback.32776 loopback.smux ESTABLISHED
    f1000e000323fbb8 tcp4 0 0 loopback.smux loopback.34266 ESTABLISHED
    f1000e0001b86bb8 tcp4 0 0 loopback.34266 loopback.smux ESTABLISHED

If you do not see a socket in the LISTEN state, stop and start snmpd:

    stopsrc -s snmpd
    startsrc -s snmpd

Once you have a smux socket in the LISTEN state, look for a socket pair in the ESTABLISHED state with one of the sockets owned by the cluster manager. The rmsock command can be used to find which process owns the sockets.

If you just restarted snmpd, ensure that there is a LISTEN socket at the smux port. If you do not see any smux socket in the ESTABLISHED state, you can either refresh the cluster manager (refresh -s clstrmgrES), or wait a couple of minutes and then retry the netstat -Aa command. The cluster manager tries to connect to snmpd when services are started and then periodically every few minutes. The refresh command causes the cluster manager to retry immediately. Do not use stopsrc and startsrc on the cluster manager.

Use rmsock to find the owners of the smux sockets in the ESTABLISHED state. Use the first field in the netstat output, which is the memory address of the socket, as an argument to rmsock. For example:

    # rmsock f1000e00029d4bb8 tcpcb
    The socket 0xf1000e00029d4808 is being held by proccess 4063356 (muxatmd).
    # rmsock f1000e0001b86bb8 tcpcb
    The socket 0xf1000e0001b86808 is being held by proccess 18546850 (clstrmgr).

In this example, there are two ESTABLISHED socket pairs: one between snmpd and muxatmd, and one between snmpd and the cluster manager.

8. Try the SNMP status commands again. If they work, you are done.

9. If the SNMP status commands are still not working, you need to check the SNMP configuration file.

First determine which version of snmpd is running. There are two versions, which have different configuration files: snmpdv1 uses /etc/snmpd.conf and snmpdv3 uses /etc/snmpdv3.conf. Use the following command to find which version is being used:

    # ls -l /usr/sbin/snmpd
    lrwxrwxrwx 1 root system 9 May 14 22:19 /usr/sbin/snmpd -> snmpdv3ne

The rest of these instructions assume the snmpdv3 daemon, which is the most common. For information regarding the syntax of the /etc/snmpd.conf file, search for /etc/snmpd.conf in the AIX information center.

Next, check the configuration in the snmpdv3.conf file. First check the authentication, then check access control (authorization).

clinfoES, cldump and cldisp use community-based authentication. They use the first community listed in the configuration file. It is possible, although rare, to specify a community to clinfoES explicitly. To check for this:

    odmget SRCsubsys | grep -p clinfo

You are looking for the value of the cmdargs field. If the field is empty, clinfoES will use the first COMMUNITY entry in the configuration file. If the field is set to "-c community_name", clinfoES will use community_name. If you want to change the community used by clinfoES, use the chssys command. After making a change, you need to restart clinfoES.

10. Find the first SNMP community in the snmpdv3.conf file.

    # grep -i comm /etc/snmpdv3.conf | grep -v ^#
    COMMUNITY powerha powerha noAuthNoPriv 0.0.0.0 0.0.0.0 -
    COMMUNITY test test noAuthNoPriv 0.0.0.0 0.0.0.0 -

In this example, the first community is powerha. If there are no uncommented community entries, you will need to add one. You can use these entries as a template. You can use any text string as the community name, although public is probably not a good choice because it is so well known. The community name needs to be the second and third fields in the line.
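
Because the order of the COMMUNITY entries matters, it can help to extract the first uncommented one directly rather than reading the grep output by eye. The following is a small sketch under stated assumptions: first_community is a hypothetical helper name (not an AIX or PowerHA command), and the community name is taken from the second field, as described above.

```shell
#!/bin/sh
# first_community CONF
# Hypothetical helper (an assumption for illustration).
# Prints the community name of the first uncommented COMMUNITY entry in CONF,
# or prints nothing if the file has no such entry.
first_community() {
    awk '
        # Skip commented-out lines
        /^[ \t]*#/ { next }
        # Fields: COMMUNITY communityName securityName securityLevel netAddr netMask storage
        $1 == "COMMUNITY" { print $2; exit }
    ' "$1"
}
```

For the example file above, first_community /etc/snmpdv3.conf would print powerha. An empty result means a COMMUNITY entry needs to be added.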
You will need to restart snmpd after editing the file, but wait until the rest of the file is checked in case there are other changes.

The snmpdv3 daemon uses VACM (View-based Access Control Model) for access control. Find the VACM_GROUP, VACM_ACCESS and VACM_VIEW entries associated with the community being used.

11. Find the group associated with the first community.

Search in the configuration file for the community name. For example:

    # grep powerha /etc/snmpdv3.conf
    VACM_GROUP group1 SNMPv1 powerha -
    TARGET_PARAMETERS trapparms1 SNMPv1 SNMPv1 powerha noAuthNoPriv -
    COMMUNITY powerha powerha noAuthNoPriv 0.0.0.0 0.0.0.0 -
    VACM_GROUP director_group SNMPv2c powerha -

In this example the VACM_GROUP is group1. The VACM_GROUP entry for director_group can be ignored; it is used by IBM Systems Director.

12. Find the view associated with this group by searching for the group you just identified. The view is listed in a VACM_ACCESS entry.

    # grep group1 /etc/snmpdv3.conf
    VACM_GROUP group1 SNMPv1 powerha -
    VACM_ACCESS group1 - - noAuthNoPriv SNMPv1 defaultView - defaultView -

The syntax of a VACM_ACCESS entry is:

    VACM_ACCESS groupName contextPrefix contextMatch securityLevel securityModel readView writeView notifyView storageType

You are looking for the name of the view for readView access. In this example, defaultView is used for readView and notifyView access for group group1. The writeView and storageType fields are set to -, so no write view is configured.

13. Find the VACM_VIEW entries associated with this community by searching for the view you just identified:

    # grep defaultView /etc/snmpdv3.conf
    #VACM_VIEW defaultView internet - included -
    VACM_VIEW defaultView 1.3.6.1.4.1.2.2.1.1.1.0 - included -
    VACM_VIEW defaultView 1.3.6.1.4.1.2.6.191.1.6 - included -
    VACM_VIEW defaultView snmpModules - excluded -
    VACM_VIEW defaultView 1.3.6.1.6.3.1.1.4 - included -
    VACM_VIEW defaultView 1.3.6.1.6.3.1.1.5 - included -
    VACM_VIEW defaultView 1.3.6.1.4.1.2.6.191 - excluded -
    VACM_ACCESS group1 - - noAuthNoPriv SNMPv1 defaultView - defaultView -
    VACM_ACCESS director_group - - noAuthNoPriv SNMPv2c defaultView - defaultView -

You are looking for a VACM_VIEW entry which gives access to the PowerHA MIB. Locations in the MIB are identified either by a string of numbers (object identifier, or OID) or by a name (object descriptor). For example, the first entry shown here uses the object descriptor internet, which corresponds to the OID 1.3.6.1. If this line were uncommented, it would allow access to 1.3.6.1 and everything that starts with 1.3.6.1, which is effectively the entire SNMP MIB. However, in this example, the internet descriptor is commented out, which means that there is no access at that level.

Beginning in AIX 6.1, as a security precaution, the snmpdv3.conf file is shipped with the internet access commented out. This means that by default in AIX 6.1 and later, the PowerHA SNMP-based status commands will not work, unless you edit the snmpdv3.conf file.

You also need to make sure the relevant VACM_VIEW entry has the word included, not excluded, in the second-to-last field.

There are two ways to provide access to the PowerHA MIB:

- Uncomment the internet line in snmpdv3.conf. This will provide access to the entire MIB. However, many IT departments will object to that.
- Add a line which only provides access to the PowerHA MIB.
  The PowerHA MIB can be identified by object descriptor (risc6000clsmuxpd) or by the OID (1.3.6.1.4.1.2.3.1.2.1.5). An example is shown in the next step.

14. Edit the snmpdv3.conf file to ensure that the PowerHA MIB is accessible for the first community.

You need to make sure that the first COMMUNITY entry in the file maps to a VACM_GROUP entry that maps to a VACM_ACCESS entry that maps to a VACM_VIEW that includes the PowerHA MIB. In this example, the only change needed is to add a VACM_VIEW entry for the risc6000clsmuxpd object descriptor:

    VACM_VIEW defaultView risc6000clsmuxpd - included -

15. If you edited the snmpdv3.conf file, restart snmpd. Note that you must use stopsrc and startsrc, not refresh, for snmpd.

    stopsrc -s snmpd
    startsrc -s snmpd

16. Repeat step 7 to ensure the cluster manager is connected to snmpd.

17. Retry the PowerHA SNMP status commands.
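
The COMMUNITY to VACM_GROUP to VACM_ACCESS mapping described in steps 11, 12 and 14 can be traced mechanically. The following is a rough sketch, not a PowerHA or AIX tool: trace_read_views is a hypothetical helper name, and it assumes the field positions shown in the examples above (the community name is the fourth field of VACM_GROUP, and readView is the seventh field of VACM_ACCESS). It reads the file twice so that the output follows the order of the VACM_ACCESS entries in the file.

```shell
#!/bin/sh
# trace_read_views CONF COMMUNITY
# Hypothetical helper (an assumption for illustration).
# Prints one "groupName readView" line for each VACM_ACCESS entry whose group
# is tied to COMMUNITY by an uncommented VACM_GROUP entry.
trace_read_views() {
    conf=$1
    community=$2
    awk -v comm="$community" '
        # Skip commented-out lines in both passes
        /^[ \t]*#/ { next }
        # First pass over the file: collect the groups for this community.
        # Fields: VACM_GROUP groupName securityModel securityName storage
        NR == FNR {
            if ($1 == "VACM_GROUP" && $4 == comm) groups[$2] = 1
            next
        }
        # Second pass: report the readView (field 7) of matching VACM_ACCESS entries.
        $1 == "VACM_ACCESS" && ($2 in groups) { print $2, $7 }
    ' "$conf" "$conf"
}
```

With the example configuration, trace_read_views /etc/snmpdv3.conf powerha would list group1 and director_group, both with defaultView. Each view name reported can then be checked with grep, as in step 13, for an included entry that covers the PowerHA MIB.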