Error 1
#A push script was defined on the open-falcon agent side

    root@hypereal-test-10:/home# cat test
    #!/usr/bin/python
    # -*- coding: utf-8 -*-
    import requests
    import time
    import json

    ts = int(time.time())

    def kk():
        payload = [
            {
                "endpoint": "test-endpoint-dm/",
                "metric": "camera0.interfram_avg",
                "timestamp": ts,
                "step": 60,
                "value": 336,
                "counterTpye": "GAUGE",
                "tags": "cluster=detection-machine",
            },
            {
                "endpoint": "test-endpoint-dm/",
                "metric": "test-metric",
                "timestamp": ts,
                "step": 60,
                "value": 1,
                "counterType": "GAUGE",
                "tags": "cluster=detection-machine",
            },
        ]
        r = requests.post("http://127.0.0.1:1988/v1/push", data=json.dumps(payload))
        print(r.text)

    while True:
        kk()
        time.sleep(1)

But only one of the two metrics showed up on the dashboard.
#Check the logs

    Oct 29 19:09:12 hypereal-test-10 falcon-agent[30171]: 2018/10/29 19:09:12 var.go:95: <= <Total=1, Invalid:1, Latency=0ms, Message:ok>
    Oct 29 19:09:12 hypereal-test-10 falcon-agent[30171]: 2018/10/29 19:09:12 var.go:88: => <Total=2> <Endpoint:test-endpoint-dm/70:85:c2:81:d5:0e, Metric:camera0.interfram_avg, Type:, Tags:cluster=detection-machine,cluster=detection-machine, Step:60, Time:1540811050, Value:336>
    Oct 29 19:09:12 hypereal-test-10 falcon-agent[30171]: 2018/10/29 19:09:12 var.go:95: <= <Total=2, Invalid:1, Latency=0ms, Message:ok>
    Oct 29 19:09:13 hypereal-test-10 falcon-agent[30171]: 2018/10/29 19:09:13 var.go:88: => <Total=2> <Endpoint:test-endpoint-dm/70:85:c2:81:d5:0e, Metric:camera0.interfram_avg, Type:, Tags:cluster=detection-machine,cluster=detection-machine, Step:60, Time:1540811050, Value:336>

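#Later, with guidance from an expert, I learned that Invalid counts the items with a format error, and ok means the request went through. So one of the two items had a problem. Note also the empty Type: field in the log.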
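Reading the Invalid counter by eye is easy to get wrong when the log scrolls fast. Below is a minimal sketch that pulls the counters out of a response line. The regular expression is modeled only on the log lines quoted in this post, so treat it as an assumption about the format, and `parse_push_response` is a made-up helper name, not part of open-falcon.

```python
import re

# Pattern modeled on the "<=" response lines quoted in this post (an assumption
# about the agent's log format, not an official specification).
RESPONSE_RE = re.compile(
    r"<Total=(?P<total>\d+), Invalid:(?P<invalid>\d+), "
    r"Latency=(?P<latency>\d+)ms, Message:(?P<message>\w+)>"
)

def parse_push_response(line):
    """Return the counters from a response log line, or None if it does not match."""
    match = RESPONSE_RE.search(line)
    if match is None:
        return None
    return {
        "total": int(match.group("total")),
        "invalid": int(match.group("invalid")),
        "latency_ms": int(match.group("latency")),
        "message": match.group("message"),
    }

sample = (
    "Oct 29 19:09:12 hypereal-test-10 falcon-agent[30171]: "
    "2018/10/29 19:09:12 var.go:95: <= <Total=2, Invalid:1, Latency=0ms, Message:ok>"
)
result = parse_push_response(sample)
if result is not None and result["invalid"] > 0:
    print("%d of %d items rejected" % (result["invalid"], result["total"]))
```

Message:ok alone is not enough to conclude that everything was accepted; the Invalid counter has to be 0 as well.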
#Check my script again

    "counterTpye": "GAUGE",  # What is "Tpye"? Oops. Change it to "Type".

Check the logs again

    Oct 29 19:47:29 hypereal-test-10 falcon-agent[30171]: 2018/10/29 19:47:29 var.go:88: => <Total=2> <Endpoint:test-endpoint-dm/70:85:c2:81:d5:0e, Metric:camera0.interfram_avg, Type:GAUGE, Tags:cluster=detection-machine,cluster=detection-machine, Step:60, Time:1540813625, Value:336>
    Oct 29 19:47:29 hypereal-test-10 falcon-agent[30171]: 2018/10/29 19:47:29 var.go:95: <= <Total=2, Invalid:0, Latency=0ms, Message:ok>
    Oct 29 19:47:30 hypereal-test-10 falcon-agent[30171]: 2018/10/29 19:47:30 var.go:88: => <Total=2> <Endpoint:test-endpoint-dm/70:85:c2:81:d5:0e, Metric:camera0.interfram_avg, Type:GAUGE, Tags:cluster=detection-machine,cluster=detection-machine, Step:60, Time:1540813625, Value:336>
    Oct 29 19:47:30 hypereal-test-10 falcon-agent[30171]: 2018/10/29 19:47:30 var.go:95: <= <Total=2, Invalid:0, Latency=0ms, Message:ok>  #Invalid:0, no problems at all now
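A misspelled key like this can be caught before anything is sent to the agent. The sketch below compares each item's keys against an expected set. The set is copied from the payload used in this post, so it is an assumption rather than the official field list, and `check_item` / `check_payload` are hypothetical helper names.

```python
# Field names copied from the payload in this post. Treat this set as an
# assumption and compare it with the open-falcon docs for your version.
EXPECTED_FIELDS = {
    "endpoint", "metric", "timestamp", "step", "value", "counterType", "tags",
}

def check_item(item):
    """Return (missing, unexpected) field names for one metric item."""
    keys = set(item)
    return sorted(EXPECTED_FIELDS - keys), sorted(keys - EXPECTED_FIELDS)

def check_payload(payload):
    """Return (index, missing, unexpected) for every item whose keys look wrong."""
    problems = []
    for index, item in enumerate(payload):
        missing, unexpected = check_item(item)
        if missing or unexpected:
            problems.append((index, missing, unexpected))
    return problems

payload = [
    {"endpoint": "test-endpoint-dm/", "metric": "camera0.interfram_avg",
     "timestamp": 0, "step": 60, "value": 336,
     "counterTpye": "GAUGE",  # the typo from Error 1
     "tags": "cluster=detection-machine"},
    {"endpoint": "test-endpoint-dm/", "metric": "test-metric",
     "timestamp": 0, "step": 60, "value": 1,
     "counterType": "GAUGE", "tags": "cluster=detection-machine"},
]

for index, missing, unexpected in check_payload(payload):
    print("item %d: missing %s, unexpected %s" % (index, missing, unexpected))
```

Running a check like this before `requests.post` points straight at the bad item instead of leaving only an Invalid counter in the agent log.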

Error 2

I wrote a script to push data

    #!/usr/bin/python
    # -*- coding: utf-8 -*-
    import requests
    import time
    import json

    ts = int(time.time())

    def kk():
        payload = [
            {
                "endpoint": "test-endpoint-dm",
                "metric": "cam.interfram_avg",
                "timestamp": ts,
                "step": 60,
                "value": 336,
                "counterType": "GAUGE",
                "tags": "cluster=detection-machine",
            },
        ]
        r = requests.post("http://127.0.0.1:1988/v1/push", data=json.dumps(payload))
        print(r.text)

    while True:
        kk()
        time.sleep(1)
#The log shows Invalid:0, which suggests nothing is wrong, yet the dashboard has no data at all, and none of the component logs report an error.

I asked the expert again
and we went through the logs of each component

    Oct 30 17:34:57 hypereal-test-10 falcon-agent[30171]: 2018/10/30 17:34:57 var.go:88: => <Total=2> <Endpoint:test-endpoint-dm, Metric:camera0.interfram_avg, Type:GAUGE, Tags:cluster=detection-machine,cluster=detection-machine, Step:60, Time:1540885244, Value:336>
    Oct 30 17:34:57 hypereal-test-10 falcon-agent[30171]: 2018/10/30 17:34:57 var.go:95: <= <Total=2, Invalid:0, Latency=0ms, Message:ok>
    Oct 30 17:34:58 hypereal-test-10 falcon-agent[30171]: 2018/10/30 17:34:58 var.go:88: => <Total=2> <Endpoint:test-endpoint-dm, Metric:camera0.interfram_avg, Type:GAUGE, Tags:cluster=detection-machine,cluster=detection-machine, Step:60, Time:1540885244, Value:336>
    Oct 30 17:34:58 hypereal-test-10 falcon-agent[30171]: 2018/10/30 17:34:58 var.go:95: <= <Total=2, Invalid:0, Latency=0ms, Message:ok>
    Oct 30 17:34:59 hypereal-test-10 falcon-agent[30171]: 2018/10/30 17:34:59 var.go:88: => <Total=2> <Endpoint:test-endpoint-dm, Metric:camera0.interfram_avg, Type:GAUGE, Tags:cluster=detection-machine,cluster=detection-machine, Step:60, Time:1540885244, Value:336>
    Oct 30 17:34:59 hypereal-test-10 falcon-agent[30171]: 2018/10/30 17:34:59 var.go:95: <= <Total=2, Invalid:0, Latency=0ms, Message:ok>
    Oct 30 17:35:00 hypereal-test-10 falcon-agent[30171]: 2018/10/30 17:35:00 var.go:88: => <Total=2> <Endpoint:test-endpoint-dm, Metric:camera0.interfram_avg-, Type:GAUGE, Tags:cluster=detection-machine,cluster=detection-machine, Step:60, Time:1540885244, Value:336>

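Time:1540885244. Time has this same value in every entry, even though the log timestamps advance by one second each time. What on earth?
Time should change as time passes.
So I rewrote the code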

    #!/usr/bin/python
    # -*- coding: utf-8 -*-
    import requests
    import time
    import json

    # ts used to be assigned here, at module level. It was computed once and
    # never changed between calls, so the dashboard showed no new data.
    def kk():
        ts = int(time.time())  # moved inside the function so the time is refreshed on every call
        payload = [
            {
                "endpoint": "test-endpoint-dm",
                "metric": "cam.interfram_avg",
                "timestamp": ts,
                "step": 60,
                "value": 336,
                "counterType": "GAUGE",
                "tags": "cluster=detection-machine",
            },
        ]
        r = requests.post("http://127.0.0.1:1988/v1/push", data=json.dumps(payload))
        print(r.text)

    while True:
        kk()
        time.sleep(1)

The script itself was the problem. Scripts still have to be written carefully.
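One way to make this kind of mistake harder to repeat is to build the payload in a small function that reads the clock on every call. This is only a sketch: `build_payload` and its `now` parameter are names made up here, and `now` exists purely so the behaviour can be checked without waiting for the clock.

```python
import time

def build_payload(value, now=None):
    """Build a one-item payload; the timestamp is read at call time."""
    # `now` is a hypothetical override for testing; normally leave it as None.
    ts = int(time.time()) if now is None else int(now)
    return [
        {
            "endpoint": "test-endpoint-dm",
            "metric": "cam.interfram_avg",
            "timestamp": ts,
            "step": 60,
            "value": value,
            "counterType": "GAUGE",
            "tags": "cluster=detection-machine",
        },
    ]

first = build_payload(336, now=1540885244)
second = build_payload(336, now=1540885245)
# The two payloads carry different timestamps because ts is computed per call.
print(first[0]["timestamp"], second[0]["timestamp"])
```

The push loop can then call `build_payload(336)` each iteration and pass the result to `json.dumps`, so there is no module-level `ts` left to go stale.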

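Error 3

I configured an alarm,
open-falcon errors encountered 1 - Figure 1
#then set up the trigger, deliberately choosing a value that should fire
#But no alarm appeared
open-falcon errors encountered 1 - Figure 2
#Frustrating and discouraging. What now?
#Don't rush; rushing doesn't solve the problem
#Let's look at the architecture and sort out our thinking
open-falcon errors encountered 1 - Figure 3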
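1. Make sure the data was uploaded successfully from the agent
2. Make sure transfer passed the data on to judge
3. Make sure judge passed the data on to redis
4. Make sure alarm received the data from redis
##Steps 1, 2, 3, 4 above are basically the order in which I checked, bottom-up. You can also go top-down: 4, 3, 2, 1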
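##Going through the logs, I finally found an error in alarm.err (by default this log is /alarm/log/alarm.log)
##6379 is the redis port, as everyone knows, and that is where the problem was
open-falcon errors encountered 1 - Figure 4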
##Next comes the troubleshooting: why can't it connect?
1. Network
2. Domain name, address
3. Permissions
4. Firewall
...
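The network and firewall items in this list can be checked quickly from the machine that runs alarm. Here is a minimal Python 3 sketch using only the standard library. `can_connect` is a made-up helper name, and the 127.0.0.1 address is a placeholder; use whatever redis address your alarm configuration points at.

```python
import socket

def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False

if __name__ == "__main__":
    # Placeholder address; replace with the redis address from your alarm config.
    print("redis reachable:", can_connect("127.0.0.1", 6379))
```

This only tells you whether the TCP port is reachable. A True result does not rule out a permission or authentication problem on the redis side, so that item still has to be checked separately.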
#After fixing it, I checked the dashboard and it worked.
open-falcon errors encountered 1 - Figure 5