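Today I wrote a database monitoring script. While testing whether it could raise alerts properly, I found that the emails were not going out.
The environment looks like this: most machines in the system sit in the 192.168.1 subnet, the trust area, and the database host is in this subnet too. A few other machines sit in the 192.168.3 subnet, the DMZ area. Mail cannot be sent directly from the database host; it has to be relayed through an SMTP server in the 192.168.3 subnet. Let's assume this SMTP server's IP is 192.168.3.99. Below we configure the db host so that mail sent with the mailx command on the db host is relayed to the SMTP server for delivery.
Here the db host runs AIX 5.3, and the file to configure is /etc/sendmail.cf. We back this file up first and then edit it.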
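The backup step could be scripted roughly as follows. This is a minimal sketch: `backup_cf` is a hypothetical helper name, and the timestamped `.bak` suffix is just one possible naming convention, not something the original setup prescribes.

```shell
# Hypothetical helper: copy a config file to a timestamped backup next to it
# and print the name of the backup that was created.
backup_cf() {
    src=$1
    bak="${src}.$(date +%Y%m%d%H%M%S).bak"
    # -p keeps the original permissions and modification time
    cp -p "$src" "$bak" && echo "$bak"
}

# On the db host this would typically be run as root:
# backup_cf /etc/sendmail.cf
```

If the edit goes wrong, copying the printed backup file back over /etc/sendmail.cf restores the previous state.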
In this file, find the following lines (add them yourself if they are missing):
```
# for sendmail
DSsmtp:[192.168.3.99]
DwMYDAB02
Cwlocalhost
```
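Here DSsmtp:[192.168.3.99] means the SMTP server used for relaying has the IP 192.168.3.99.
Dw is followed directly by the local host name (MYDAB02 in this example).
Cw is followed by localhost.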
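After editing, it can help to list just these three kinds of lines and confirm they read as intended. The sketch below assumes nothing beyond standard `grep`; `show_relay_settings` is a hypothetical helper name.

```shell
# Hypothetical helper: print the DS (relay host), Dw (host name) and
# Cw (host class) lines from a sendmail.cf-style file.
# Comment lines start with '#', so they are not matched.
show_relay_settings() {
    grep -E '^(DS|Dw|Cw)' "$1"
}

# On the db host:
# show_relay_settings /etc/sendmail.cf
```

With the example values from this post, the output would be the three lines DSsmtp:[192.168.3.99], DwMYDAB02 and Cwlocalhost. Any extra DS or Dw line in the output is worth a second look, since it may conflict with the one just added.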
After changing the parameters above, refresh the sendmail service so that it picks up the new configuration:
```
refresh -s sendmail
```
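Once sendmail has picked up the new configuration, a quick manual test can confirm that the relay works before relying on it for alerts. The following is a sketch only: `send_test_mail` is a hypothetical helper, the subject text is arbitrary, and the recipient address is a placeholder to be replaced with a real mailbox.

```shell
# Assumption: the mail client is mailx at the same path the monitoring
# script uses; override MAIL_TOOL if it lives elsewhere.
MAIL_TOOL=${MAIL_TOOL:-/usr/bin/mailx}

# Hypothetical helper: send a one-line test message to the given address.
send_test_mail() {
    to=$1
    echo "relay test from $(uname -n) at $(date)" |
        "$MAIL_TOOL" -s "sendmail relay test" "$to"
}

# Example (placeholder address):
# send_test_mail someone@example.com
```

If the message does not arrive, running `mailq` on the db host shows whether it is still sitting in the local queue, which usually points to a problem reaching the SMTP server rather than a problem with mailx itself.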
Now the db host can use the mailx command to relay monitoring alert mail to the SMTP server, which then sends it out centrally:
```
#!/usr/bin/sh
###################################################################
# This script is written by username@cn.ibm.com at 2010-08-12.
# Because HQ monitor can not cover all the db parameters,
# it need to by monitor by this script.
# main_normal.sh monitor the normal process and run every 2 hours
# main_crital.sh monitor the crital process and run every 2 mins
####################################################################

#### PARAMETER AND WORKING PATH SETTING
export ORACLE_BASE=/u01/app/oracle
export ORACLE_HOME=/u01/app/oracle/oracle/product/10.2.0/db_1
export PATH=$ORACLE_HOME/bin:$PATH
export ORACLE_SID=MDBPRD

WORKPATH=/u03/db_monitor
LOGPATH=${WORKPATH}/log
SRPTPATH=${WORKPATH}/bin
MAILPATH=${WORKPATH}/mailresult

CLOG=${LOGPATH}/db_monitor_${ORACLE_SID}_$(date +%Y%m%d).clog
NLOG=${LOGPATH}/db_monitor_${ORACLE_SID}_$(date +%Y%m%d).nlog
MRESULT=${MAILPATH}/mail_result_${ORACLE_SID}_$(date +%Y%m%d).mresult

MAIL_TOOL=/usr/bin/mailx
TO_MAIL=jianminh@cn.ibm.com
CC_MAIL=jianminh@cn.ibm.com

cd ${WORKPATH}

#### CHECKING CRITAL PROCESS
v_lsnr=`ps -ef |grep tns |grep -v grep |wc -l`
v_process1521=`netstat -an |grep 1521|grep -v grep |wc -l`
v_crit_process=`ps -ef |grep ora_ |grep ${ORACLE_SID} |grep -v grep|wc -l`

#### WRITE CHECKING RESULT TO LOG
echo "#################################################">>$CLOG
echo "============= CRITAL REPORT BEGIN =============">>$CLOG
date>>$CLOG
echo "====THE NUMBER OF LNSR====">>$CLOG
echo $v_lsnr>>$CLOG
echo " ">>$CLOG
echo "====THE NUMBER OF PROCESS USING PORT 1521====">>$CLOG
echo $v_process1521>>$CLOG
echo " ">>$CLOG
echo "====THE NUMBER OF ORACLE BGPROCESS====">>$CLOG
echo $v_crit_process>>$CLOG
echo " ">>$CLOG
echo "============== CRITAL REPORT END ==============">>$CLOG

if [ $v_lsnr -lt 1 ]
then
    cat /dev/null > $MRESULT
    tail -12 $CLOG>$MRESULT
    $MAIL_TOOL -s "IMPORTANT! MYDAB02 LSNR DOWN!" -c $CC_MAIL $TO_MAIL < $MRESULT
else
    echo "ok"
fi

if [ $v_process1521 -lt 10 ]
then
    cat /dev/null > $MRESULT
    tail -12 $CLOG>$MRESULT
    $MAIL_TOOL -s "IMPORTANT! MYDAB02 PORT 1521 DOWN!" -c $CC_MAIL $TO_MAIL < $MRESULT
else
    echo "ok"
fi

if [ $v_crit_process -lt 5 ]
then
    cat /dev/null > $MRESULT
    tail -12 $CLOG>$MRESULT
    $MAIL_TOOL -s "IMPORTANT! MYDAB02 ORACLE DOWN!" -c $CC_MAIL $TO_MAIL < $MRESULT
else
    echo "OK"
fi
```
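The script header says the critical check is meant to run every 2 minutes. One way to schedule that is a cron entry, sketched below. Several things here are assumptions rather than facts from the original setup: the script path /u03/db_monitor/bin/main_crital.sh is guessed from the SRPTPATH setting and the file name in the header, and the minutes are spelled out as an explicit list because not every cron implementation accepts step syntax such as `*/2`.

```shell
# Build the minute list 0,2,4,...,58 for the first crontab field.
minutes=""
m=0
while [ "$m" -lt 60 ]; do
    minutes="${minutes:+$minutes,}$m"
    m=$((m + 2))
done

# Assumed script location; adjust to where the script actually lives.
# Output is discarded so that cron does not mail the "ok" lines.
cron_line="$minutes * * * * /u03/db_monitor/bin/main_crital.sh >/dev/null 2>&1"
echo "$cron_line"
```

The printed line could then be added with `crontab -e` under the account that owns the Oracle processes.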
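2 comments
This deserves a bump! When I set this up the year before last it took a lot of effort, and in the end I forgot to write it up. I have to do it again in a few days, so this is a handy reference!
Do you run this script periodically as a background job?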