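A proactive alerting system typically collects alarm information in five ways:
1. Ping each device from the alerting server to determine whether it is alive and to measure its packet-loss rate.
2. Receive the system logs (syslog) that devices send and match them against a rule base of regular expressions to decide whether an alarm is needed.
3. Receive the SNMP traps that devices send and evaluate them.
4. Extract alarm information from the network management system.
5. Fetch the value of a given OID over SNMP and evaluate it.

What is SNMP:
The Simple Network Management Protocol (SNMP) provides a set of "simple" operations that make it easier to monitor and manage network devices such as routers, switches, servers, and printers. With SNMP you can monitor many things, for example port traffic, the temperature inside a router, or CPU utilization. Learning SNMP is not actually that simple; consult other material for the details, especially concepts such as MIB and OID. The book Essential SNMP, 2nd Edition is recommended.

How to collect data:
To install NET-SNMP, the RPM packages and the source code are available from http://net-snmp.sourceforge.net/. After downloading and unpacking:
su -
cd ucd-snmp-4.2.3
./configure --prefix=/usr   <-- the default is /usr/local
make clean all
make install
Then query a device:
snmpget <target> public system.sysDescr.0
You should see a short description of the system, similar to this:
system.sysDescr.0 = Sun SNMP Agent, Ultra-60
The "public" in the command above can be thought of as the password of the SNMP agent; the proper term is "community string". Many network devices and operating systems use "public" as the default community string, which is a potential security problem, so this default should be changed. The command can also be written as:
snmpget <target> public .1.3.6.1.2.1.1.1.0
"system.sysDescr.0" is just another way of writing ".1.3.6.1.2.1.1.1.0"; in the end it is always converted to the numeric form of the OID (object identifier).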
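To make the relationship between the symbolic and the numeric form concrete, here is a minimal sketch of such a translation. The class `OidNames` and its hand-written three-entry table are assumptions made for illustration; real tools such as NET-SNMP load these mappings from MIB files rather than from a hard-coded map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: resolves a few symbolic MIB-2 names to numeric OIDs. */
final class OidNames {
    // Hand-written subset of the MIB-2 "system" group (illustration only).
    private static final Map<String, String> PREFIXES = new LinkedHashMap<>();
    static {
        PREFIXES.put("system.sysDescr", ".1.3.6.1.2.1.1.1");
        PREFIXES.put("system.sysUpTime", ".1.3.6.1.2.1.1.3");
        PREFIXES.put("system.sysName", ".1.3.6.1.2.1.1.5");
    }

    /** Returns the numeric form, or the input unchanged if it is unknown or already numeric. */
    static String toNumeric(String oid) {
        for (Map.Entry<String, String> e : PREFIXES.entrySet()) {
            String name = e.getKey();
            if (oid.equals(name)) {
                return e.getValue();
            }
            if (oid.startsWith(name + ".")) {
                // Keep the instance suffix, e.g. ".0" for a scalar object.
                return e.getValue() + oid.substring(name.length());
            }
        }
        return oid;
    }

    public static void main(String[] args) {
        System.out.println(toNumeric("system.sysDescr.0")); // prints .1.3.6.1.2.1.1.1.0
    }
}
```

The trailing ".0" is the instance index of a scalar object, which is why it is carried over unchanged.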
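snmpget returns a single value, whose type may be a number, a string, and so on. There is also an snmpwalk operation, which returns every value under a given OID subtree, roughly an array of results. This system is implemented in Java on top of an open-source SNMP implementation downloaded from the web. Suppose we have the following utility class: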
import java.io.IOException;
import java.util.Map;

public class Poller
{
    public Poller( String host, String community, int version )
        throws IOException
    {
        // ...
    }

    // Fetch the value of a single OID (snmpget)
    public String get( String oid )
            throws IOException
    {
        // ...
        return null;
    }

    // Fetch the values under a base OID (snmpwalk)
    public Map<String, String> walk( String base, int startIndex,
            int indexCount )
    {
        // ...
        return null;
    }

    public void close()
    {
    }

    public static void main( String[] args )
            throws IOException
    {
        Poller poller = new Poller( /* host, community, version */ ); // the device at this IP is a cisco-6509

        // 1. CPU alarm
        String cpuStr = poller.get( "1.3.6.1.4.1.9.9.109.1.1.1.1.5.9" ); // CPU utilization of the cisco-6509
        long cpu = Long.parseLong( cpuStr );

        if ( cpu > 85 )
        {
            System.out.println( "Alarm! CPU utilization of the cisco-6509 exceeds 85%" );
        }

        // 2. Module (line card) alarm
        String statusStr = poller.get( "1.3.6.1.4.1.9.5.1.3.1.1.10.1" ); // status of the first module of the cisco-6509
        long status = Long.parseLong( statusStr );

        if ( status != 2 && status != 1 ) // 1: unknown, 2: normal, 3: minorFault, 4: majorFault
        {
            System.out.println( "Alarm! The first module of the cisco-6509 is not in a normal state" );
        }

        // 3. Traffic alarm
        String octetStr = poller.get( "ifHCInOctets.10" ); // inbound traffic of interface 10 of the cisco-6509, in bytes
        long value = Long.parseLong( octetStr );
        long time = System.currentTimeMillis() / 1000;
        long lastValue = getLastValue( ); // previous counter value, read from a database or file
        long lastTime = getLastTime( ); // time of the previous sample, read from a database or file

        // Traffic is usually expressed in bit/s, hence the factor of 8;
        // skip the check when no time has elapsed to avoid dividing by zero
        if ( time > lastTime && (value - lastValue) * 8 / (time - lastTime) > 800000000L )
        {
            System.out.println( "Alarm! Inbound traffic on interface 10 of the cisco-6509 exceeds 800M" );
        }

        poller.close();
    }
}
With the main function above we can already implement basic SNMP alerting, but it is quite inflexible: everything is hard-coded, and every new SNMP alarm requires a new block of code.
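The traffic check is the only one of the three that depends on the previous sample, so it is worth isolating. The following is a minimal sketch, not part of the original system; the class `TrafficRate` and its method names are assumptions. It multiplies before dividing to limit integer truncation and reports "no rate" when no time has elapsed or the counter went backwards (for example after a device restart).

```java
/** Hypothetical helper: average bit rate between two samples of an octet counter. */
final class TrafficRate {
    /**
     * @param value     current counter value, in bytes
     * @param time      current sample time, in seconds
     * @param lastValue previous counter value, in bytes
     * @param lastTime  previous sample time, in seconds
     * @return the average rate in bit/s, or -1 if no rate can be computed
     */
    static long bitsPerSecond(long value, long time, long lastValue, long lastTime) {
        long seconds = time - lastTime;
        long octets = value - lastValue;
        if (seconds <= 0 || octets < 0) {
            return -1; // no elapsed time, or the counter was reset
        }
        return octets * 8 / seconds; // multiply first to limit truncation
    }

    /** True when a rate could be computed and it is above the threshold. */
    static boolean exceeds(long thresholdBitsPerSecond, long value, long time,
                           long lastValue, long lastTime) {
        return bitsPerSecond(value, time, lastValue, lastTime) > thresholdBitsPerSecond;
    }

    public static void main(String[] args) {
        // 1,000,000 bytes in 10 seconds
        System.out.println(bitsPerSecond(1_000_000L, 110L, 0L, 100L)); // prints 800000
    }
}
```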
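After some analysis, most SNMP collect-and-alert checks follow the same process:
1. Get the object ID (OID) for the device.
2. Fetch the value of that OID over SNMP and assign it to the variable value.
3. Take the current time in seconds and assign it to the variable time.
4. Take the value and time of the previous collection and assign them to lastValue and lastTime.
5. Based on what the OID's value means, build an expression that may use only the four variables value, time, lastValue, and lastTime (not all of them need to be used). The expression must return a boolean; if it is true, an alarm is needed.
6. Save value and time as lastValue and lastTime, to be used by the next collection.

At this point things are clearer: if some dynamic language or script can run inside the Java environment, SNMP alerting can be implemented flexibly. There is no need to hard-code every alarm case; adding or editing an alarm expression in the UI is enough. A search on http://www.open-open.com and http://java-source.net turned up the BeanShell project, whose official site is http://www.beanshell.org/

BeanShell is a small, free, downloadable, embeddable Java source interpreter, written in Java, with object scripting language features. BeanShell executes standard Java statements and expressions and adds some scripting commands and syntax. It supports scripted objects as simple method closures, as in Perl and JavaScript.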
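The six-step cycle can be sketched independently of any scripting engine. In the sketch below, which is an illustration under stated assumptions rather than the system's actual code, the class `PollCycle` and the interface `AlarmExpr` are hypothetical names, and a plain Java lambda stands in for the scripted expression. The value is passed in directly, so the SNMP fetch of steps 1 and 2 is left out.

```java
/** Hypothetical sketch of the collect-and-alert cycle with a pluggable expression. */
final class PollCycle {
    /** Stand-in for a scripted alarm expression over the four variables. */
    interface AlarmExpr {
        boolean test(long value, long time, long lastValue, long lastTime);
    }

    private final AlarmExpr expr;
    private long lastValue;
    private long lastTime;
    private boolean hasLast; // false until the first sample has been stored

    PollCycle(AlarmExpr expr) {
        this.expr = expr;
    }

    /**
     * Takes one sample (steps 2 and 3), evaluates the expression against the
     * previous sample (steps 4 and 5), then stores the sample (step 6).
     * The expression is not evaluated on the first sample.
     */
    boolean sample(long value, long time) {
        boolean alarm = hasLast && expr.test(value, time, lastValue, lastTime);
        lastValue = value;
        lastTime = time;
        hasLast = true;
        return alarm;
    }

    public static void main(String[] args) {
        // Threshold check that only looks at the current value.
        PollCycle cpu = new PollCycle((value, time, lastValue, lastTime) -> value > 85);
        System.out.println(cpu.sample(90, 1000)); // prints false (first sample)
        System.out.println(cpu.sample(90, 1060)); // prints true
    }
}
```

Replacing the lambda with an expression stored as text and evaluated at run time is what turns each new alarm into a configuration change instead of a code change.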
Below is the SNMP alerting module rewritten with BeanShell:
package com.kelefa.warnlet.job;

import java.io.IOException;
import java.util.Date;

import org.apache.log4j.Logger;
import org.hibernate.HibernateException;
import org.hibernate.classic.Session;

import bsh.EvalError;
import bsh.Interpreter;

import com.kelefa.common.util.HibernateUtil;
import com.kelefa.warnlet.dao.WarningDAO;
import com.kelefa.warnlet.interpreter.SimpleInterpreter;
import com.kelefa.warnlet.snmp.Poller;
import com.kelefa.warnlet.vo.Device;
import com.kelefa.warnlet.vo.SnmpObject;
import com.kelefa.warnlet.vo.Warning;

public class SnmpTask
        implements Runnable
{
    private final static Logger log = Logger.getLogger( SnmpTask.class );

    private SnmpObject snmpObject;

    private WarningDAO warningDAO;

    private static final String BSH = "bsh://";

    public SnmpTask( SnmpObject snmpObject, WarningDAO warningDAO )
    {
        this.snmpObject = snmpObject;
        this.warningDAO = warningDAO;
    }

    public void run()
    {
        log.debug( "----snmpObject.id=" + snmpObject.getId() );
        try
        {
            Session session = HibernateUtil.currentSession();
            HibernateUtil.beginTransaction();
            session.refresh( snmpObject ); // reload the monitored object's state from the database

            doSnmpTask();

            HibernateUtil.commitTransaction();
        }
        catch ( Exception ex )
        {
            HibernateUtil.rollbackTransaction();
            log.warn( ex.getMessage(), ex ); // keep the stack trace in the log
        }
        finally
        {
            HibernateUtil.closeSession();
        }
        log.debug( "++++snmpObject.id=" + snmpObject.getId() );
    }

    /**
     * Runs the SNMP task:
     * 1. Fetch the value of the OID over SNMP; stop on a network error or a misconfigured OID
     * 2. Stop if the returned string is not a number
     * 3. Evaluate the alarm expression with BSH; stop if the expression is invalid
     * 4. If the alarm expression returns true, raise the alarm
     * 5. Update the last time and value
     */
    private void doSnmpTask()
    {
        Device device = snmpObject.getDevice();

        String valueStr;
        try
        {
            valueStr = snmpget( device.getIp(), device.getCommunity(), device
                    .getSnmpVersion(), snmpObject.getOid() );
        }
        catch ( IOException e )
        { // 1. Stop on a network error or a misconfigured OID
            log.warn( e.getMessage() );
            return;
        }

        if ( valueStr == null || valueStr.trim().length() == 0 )
            return;

        Long value = null;
        try
        {
            value = Long.valueOf( valueStr.trim() );
        }
        catch ( NumberFormatException ex )
        { // 2. Stop if the returned string is not a number
            log.warn( "NumberFormatException: " + ex.getMessage() + "\t"
                    + device.getCommunity() + "@" + device.getIp() + ": "
                    + snmpObject.getOid() );
            return;
        }

        Date now = new Date();
        Long time = Long.valueOf( (now.getTime() + 500) / 1000 ); // milliseconds rounded to the nearest second

        if ( snmpObject.getLastValue() > 0 && snmpObject.getLastTime() > 0 )
        { // The BSH script is not run on the first collection
            Long lastValue = Long.valueOf( snmpObject.getLastValue() );
            Long lastTime = Long.valueOf( snmpObject.getLastTime() );

            boolean doWarn = false;
            try
            { // 3. Evaluate the alarm expression with BSH
                doWarn = evalExpr( value, time, lastValue, lastTime );
            }
            catch ( EvalError ex )
            {
                log.warn( ex.getMessage(), ex );
                updateSnmpObject( value, time );
                return;
            }

            if ( log.isDebugEnabled() )
            {
                logResult( time, value, lastValue, lastTime, doWarn );
            }

            if ( doWarn )
            { // 4. The alarm expression returned true, so raise the alarm
                Warning warning = newWarning( now, time, value, lastValue, lastTime );

                try
                {
                    warningDAO.insertWarning( warning );
                }
                catch ( Exception ex )
                {
                    throw new HibernateException( ex.getMessage(), ex );
                }
            }
        }

        // 5. Update the last time and value
        updateSnmpObject( value, time );
    }

    /**
     * Updates the monitored object's last execution time (lastTime) and latest value (lastValue)
     *
     * @param value
     * @param time
     */
    private void updateSnmpObject( Long value, Long time )
    {
        snmpObject.setLastTime( time.longValue() );
        snmpObject.setLastValue( value.longValue() );
    }

    /**
     * Evaluates the dynamic BSH expression and returns its result
     *
     * @param value
     * @param time
     * @param lastValue
     * @param lastTime
     * @return true if an alarm is needed
     * @throws EvalError
     */
    private boolean evalExpr( Long value, Long time, Long lastValue, Long lastTime )
            throws EvalError
    {
        Interpreter bsh = new Interpreter();

        bsh.set( "value", value );
        bsh.set( "time", time );
        bsh.set( "lastValue", lastValue );
        bsh.set( "lastTime", lastTime );

        // Run the BSH script; a result of true means an alarm is needed.
        // A null or non-boolean result is treated as "no alarm"
        Object result = bsh.eval( snmpObject.getWarnExpr() );

        return Boolean.TRUE.equals( result );
    }

    /**
     * Fetches the value for the snmpObject's oid with snmpget or snmpwalk. The oid may be a single OID such as 1.3.6.1.4.5,
     * or an expression that uses functions such as sum, count, max, min, avg. For a single OID, the snmpget value is returned;
     * for a compound expression, snmpwalk is used, the expression is computed, and the final result is returned
     *
     * @param ip
     * @param community
     * @param snmpversion
     * @param oid
     *          a single OID or a function expression
     * @return the value as a string, or null if there is no result
     * @throws IOException
     */
    public static String snmpget( final String ip, final String community,
            final int snmpversion, final String oid )
            throws IOException
    {
        String valueStr = null;
        Poller poller = null;
        try
        {
            poller = new Poller( ip, community, snmpversion, 100 );

            log.debug( "polling " + oid );

            if ( oid.indexOf( '(' ) == -1 )
            { // A single OID
                valueStr = poller.get( oid );
                if ( log.isDebugEnabled() )
                    log.debug( "snmpget(" + oid + ")=" + valueStr );
            }
            else
            { // An expression using functions such as sum, count, max, min, avg, for example:
                // sum(ippoolSize)*100/sum(ippoolUse)
                SimpleInterpreter si = new SimpleInterpreter( poller, oid );
                Long result = si.interprete();
                if ( log.isDebugEnabled() )
                    log.debug( oid + "=" + result );
                if ( result != null )
                    valueStr = result.toString();
            }
        }
        finally
        {
            if ( poller != null )
                poller.close();
        }

        return valueStr;
    }

    private Warning newWarning( Date now, Long time, Long value, Long lastValue,
            Long lastTime )
    {
        Warning warning = new Warning();
        warning.setDeviceID( snmpObject.getDeviceID() );
        warning.setWarnType( snmpObject.getWarnType() );
        warning.setWarnLevel( snmpObject.getWarnLevel() );
        warning.setPrimarykey( snmpObject.getOid() );

        String sms = snmpObject.getWarnSms();
        sms = getBshWarnMsg( sms, value, lastValue, time, lastTime );
        if ( sms == null || sms.trim().length() == 0 )
            warning.setWarnSms( snmpObject.getWarnType() );
        else
            warning.setWarnSms( sms.trim() );

        String email = snmpObject.getWarnEmail();
        email = getBshWarnMsg( email, value, lastValue, time, lastTime );
        if ( email == null || email.trim().length() == 0 )
            warning.setWarnEmail( snmpObject.getWarnType() );
        else
            warning.setWarnEmail( email.trim() );

        warning.setWarnTTS( snmpObject.getWarnTTS() );

        warning.setFirstTime( now );
        warning.setLastTime( now );
        warning.setSuggestion( snmpObject.getSuggestion() );
        return warning;
    }

    private void logResult( Long time, Long value, Long lastValue, Long lastTime,
            boolean doWarn )
    {
        StringBuffer buf = new StringBuffer();
        buf.append( "OID=" ).append( snmpObject.getOid() );
        buf.append( ",time=" ).append( time );
        buf.append( ",value=" ).append( value );
        buf.append( ",lastTime=" ).append( snmpObject.getLastTime() );
        buf.append( ",lastValue=" ).append( snmpObject.getLastValue() );
        buf.append( "\n\t" ).append( snmpObject.getWarnExpr() ).append( "=" )
                .append( doWarn );

        // For rate expressions, also log the computed rate;
        // skipped when no time has elapsed to avoid dividing by zero
        if ( snmpObject.getWarnExpr().indexOf( "(value-lastValue)/(time-lastTime)" ) > -1
                && time.longValue() != lastTime.longValue() )
        {
            buf.append( "\n\t(value-lastValue)/(time-lastTime)=" ).append(
                    ((value - lastValue) / (time - lastTime)) );
        }

        log.debug( buf.toString() );
    }

    /**
     * If the argument starts with "bsh://", evaluates it as a string expression with BSH and returns the result; otherwise returns it unchanged.
     * The expression may use value, lastValue, time, lastTime, for example:
     * bsh://"Port 45 traffic exceeds 800M: "+((value-lastValue)/(time-lastTime)*8/1000000)+"M"
     *
     * @param msgExpr
     *          the string expression
     * @return String
     */
    private static String getBshWarnMsg( String msgExpr, Long value,
            Long lastValue, Long time, Long lastTime )
    {
        if ( msgExpr == null || !msgExpr.startsWith( BSH ) )
            return msgExpr;

        msgExpr = msgExpr.substring( BSH.length() );
        try
        {
            Interpreter bsh = new Interpreter();

            bsh.set( "value", value );
            bsh.set( "time", time );
            bsh.set( "lastValue", lastValue );
            bsh.set( "lastTime", lastTime );

            // Run the BSH script; the result is the actual alarm message
            msgExpr = (String) bsh.eval( msgExpr );
        }
        catch ( EvalError ex )
        {
            log.warn( ex.getMessage() );
        }

        return msgExpr;
    }
}
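The listing relies on a SimpleInterpreter class for expressions such as sum(...) that is not shown. As a rough illustration only, the aggregation step over a walk result might look like the sketch below. The class `WalkAggregate` is a hypothetical name, it is not the author's SimpleInterpreter, and it assumes the walk result is a map from OID to value string, as in the `walk` signature of the Poller utility class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: applies an aggregate function to the values of a walk result. */
final class WalkAggregate {
    /**
     * @param function one of sum, count, max, min, avg
     * @param walk     OID to value string, as a walk operation might return it
     * @return the aggregate, or null when max, min or avg has no numeric input
     */
    static Long apply(String function, Map<String, String> walk) {
        long sum = 0;
        long count = 0;
        long max = Long.MIN_VALUE;
        long min = Long.MAX_VALUE;
        for (String raw : walk.values()) {
            if (raw == null) {
                continue;
            }
            long v;
            try {
                v = Long.parseLong(raw.trim());
            } catch (NumberFormatException e) {
                continue; // skip non-numeric values
            }
            sum += v;
            count++;
            max = Math.max(max, v);
            min = Math.min(min, v);
        }
        switch (function) {
            case "sum":
                return sum;
            case "count":
                return count;
            case "max":
                return count == 0 ? null : Long.valueOf(max);
            case "min":
                return count == 0 ? null : Long.valueOf(min);
            case "avg":
                return count == 0 ? null : Long.valueOf(sum / count);
            default:
                throw new IllegalArgumentException("unknown function: " + function);
        }
    }

    public static void main(String[] args) {
        Map<String, String> walk = new LinkedHashMap<>();
        walk.put("1", "10");
        walk.put("2", "30");
        System.out.println(apply("sum", walk)); // prints 40
    }
}
```

A full expression such as sum(a)*100/sum(b) would additionally need parsing and arithmetic on top of this step, which the sketch leaves out.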
Comments:
-
# re: Implementing SNMP alerting in a proactive alerting system for network devices
Posted @ 2007-01-18 12:56
Very good, I learned a lot.
-
# re: Implementing SNMP alerting in a proactive alerting system for network devices
Posted @ 2007-05-21 23:01
Very good, really good.
-
# re: Implementing SNMP alerting in a proactive alerting system for network devices
Posted @ 2007-05-25 23:52
Yes, I got a lot out of this.
-
# re: Implementing SNMP alerting in a proactive alerting system for network devices
Posted @ 2007-09-27 17:10
This is really great. I learned a lot.
-
# re: Implementing SNMP alerting in a proactive alerting system for network devices
Posted @ 2007-10-18 09:27
Big thumbs up, this is great, you are a lifesaver!
-
# re: Implementing SNMP alerting in a proactive alerting system for network devices
Posted @ 2007-11-07 15:32
Great.
Can all alarms for network devices be implemented this way?
It would be even better if you could post them.
Thanks.
-
# re: Implementing SNMP alerting in a proactive alerting system for network devices
Posted @ 2008-10-13 13:10
Please post the code for the following as well, otherwise this cannot run:
import com.kelefa.common.util.HibernateUtil;
import com.kelefa.warnlet.dao.WarningDAO;
import com.kelefa.warnlet.interpreter.SimpleInterpreter;
import com.kelefa.warnlet.snmp.Poller;
import com.kelefa.warnlet.vo.Device;
import com.kelefa.warnlet.vo.SnmpObject;
import com.kelefa.warnlet.vo.Warning;