NebulaSolarDash: a distributed server monitoring tool

Overview:

github:   

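The tool has two parts, a client and a server. The server uses bottle as its web framework and Echarts to render the charts; the client collects server resource data using only the Python standard library.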
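To illustrate what collecting resources with nothing but the standard library can look like, here is a minimal sketch that turns `/proc/meminfo`-style text into a used-memory percentage. The function names and the formula (total minus available) are assumptions made for illustration; this is not the code in ns_agent.py.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field name: value in kB}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "name: value" shape
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def mem_used_percent(info):
    """Used memory in percent; falls back to MemFree if MemAvailable is absent."""
    total = info["MemTotal"]
    available = info.get("MemAvailable", info.get("MemFree", 0))
    return round(100.0 * (total - available) / total, 2)


# Hypothetical sample so the sketch also works on non-Linux machines.
SAMPLE = """MemTotal:        1000000 kB
MemFree:          200000 kB
MemAvailable:     750000 kB
"""

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as fh:
            text = fh.read()
    except OSError:
        text = SAMPLE
    print(mem_used_percent(parse_meminfo(text)))
```

A real agent would sample CPU, disk and load in a similar way and send the values to the server on each interval.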

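* With a client collection interval of 120 s, a single node writes roughly 4 MB to the database every 24 hours.
* Each sample a client sends to the server adds about 5-6 KB to the database. Choose the monitoring interval yourself, based on the number of servers, how long you intend to monitor, and the storage available on the server.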
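The two figures above are consistent with each other, and the same arithmetic helps when picking an interval. The sketch below is only a back-of-the-envelope estimate; the function name is hypothetical and the 5.5 KB payload is simply the midpoint of the 5-6 KB range quoted above.

```python
def daily_storage_mb(interval_s, payload_kb, nodes=1):
    """Estimate how many MB are written to the database per day."""
    samples_per_day = 24 * 60 * 60 // interval_s  # samples per node per day
    return samples_per_day * payload_kb * nodes / 1024.0


if __name__ == "__main__":
    # One node, 120 s interval: 720 samples a day, a little under 4 MB.
    print(round(daily_storage_mb(120, 5.5), 2))
    # Two nodes, 60 s interval, as in the configuration used later on.
    print(round(daily_storage_mb(60, 5.5, nodes=2), 2))
```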

1. Download the NebulaSolarDash package and unzip it:

[root@nginx1 ~]# unzip toddlerya-NebulaSolarDash-2.0.1-0-g58fe715.zip 

Archive:  toddlerya-NebulaSolarDash-2.0.1-0-g58fe715.zip

58fe71551f72441964ebcb7bb30fc0e436c9868c

   creating: toddlerya-NebulaSolarDash-58fe715/

  inflating: toddlerya-NebulaSolarDash-58fe715/LICENSE  

  inflating: toddlerya-NebulaSolarDash-58fe715/__init__.py  

   creating: toddlerya-NebulaSolarDash-58fe715/assets/

   creating: toddlerya-NebulaSolarDash-58fe715/assets/css/

  inflating: toddlerya-NebulaSolarDash-58fe715/assets/css/bootstrap.min.css  

  inflating: toddlerya-NebulaSolarDash-58fe715/assets/css/ns_tb.css  

   creating: toddlerya-NebulaSolarDash-58fe715/assets/js/

  inflating: toddlerya-NebulaSolarDash-58fe715/assets/js/bootstrap.min.js  

  inflating: toddlerya-NebulaSolarDash-58fe715/assets/js/dark.js  

  inflating: toddlerya-NebulaSolarDash-58fe715/assets/js/echarts.min.js  

   creating: toddlerya-NebulaSolarDash-58fe715/assets/picture/

  inflating: toddlerya-NebulaSolarDash-58fe715/assets/picture/NebulaSolarDash.gif  

  inflating: toddlerya-NebulaSolarDash-58fe715/assets/picture/NebulaSolarDash2.0.gif  

   creating: toddlerya-NebulaSolarDash-58fe715/conf/

  inflating: toddlerya-NebulaSolarDash-58fe715/conf/__init__.py  

  inflating: toddlerya-NebulaSolarDash-58fe715/conf/ns.ini  

  inflating: toddlerya-NebulaSolarDash-58fe715/init_db.py  

   creating: toddlerya-NebulaSolarDash-58fe715/lib/

  inflating: toddlerya-NebulaSolarDash-58fe715/lib/__init__.py  

  inflating: toddlerya-NebulaSolarDash-58fe715/lib/bottle.py  

  inflating: toddlerya-NebulaSolarDash-58fe715/lib/common_lib.py  

  inflating: toddlerya-NebulaSolarDash-58fe715/manager.py  

  inflating: toddlerya-NebulaSolarDash-58fe715/ns_agent.py  

  inflating: toddlerya-NebulaSolarDash-58fe715/ns_server.py  

  inflating: toddlerya-NebulaSolarDash-58fe715/readme.md  

  inflating: toddlerya-NebulaSolarDash-58fe715/release-note.txt  

  inflating: toddlerya-NebulaSolarDash-58fe715/run.py  

  inflating: toddlerya-NebulaSolarDash-58fe715/start_agent.sh  

  inflating: toddlerya-NebulaSolarDash-58fe715/start_insall_app.sh  

  inflating: toddlerya-NebulaSolarDash-58fe715/stop.py  

  inflating: toddlerya-NebulaSolarDash-58fe715/stop_uninstall_app.sh  

  inflating: toddlerya-NebulaSolarDash-58fe715/uninstall_app.sh  

   creating: toddlerya-NebulaSolarDash-58fe715/views/

  inflating: toddlerya-NebulaSolarDash-58fe715/views/agent_info.tpl  

  inflating: toddlerya-NebulaSolarDash-58fe715/views/each_agent_detail.tpl  

2. Edit the configuration file to set up the server and the clients:

[root@nginx1 toddlerya-NebulaSolarDash-58fe715]# pwd

/root/toddlerya-NebulaSolarDash-58fe715

[root@nginx1 toddlerya-NebulaSolarDash-58fe715]# vim conf/ns.ini 

[root@nginx1 toddlerya-NebulaSolarDash-58fe715]# cat conf/ns.ini 

[server]

; Server IP

ip = 172.25.254.130

; Server port

port = 8081

debug = True

; Alert thresholds, in percent

; Example:

; cpu_yellow = 80 means CPU usage is flagged in yellow once it reaches 80%

; cpu_red = 95 means CPU usage is flagged in red once it reaches 95%

mem_yellow = 80

mem_red = 95

cpu_yellow = 80

cpu_red = 95

[agent]

; Client collection interval, in seconds

interval = 60

install_path = /home/RunTimeNSDash

; IPs of all nodes to monitor, separated by ASCII commas

[all_agent_ip]

ips = 172.25.254.134,172.25.254.135
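To make the meaning of these settings concrete, the following sketch reads the same sections and keys with Python 3's `configparser` and applies the yellow/red thresholds. It is an illustration only: `load_settings` and `alert_level` are hypothetical names, and the project's own loader may work differently.

```python
import configparser

# Same keys as conf/ns.ini above, inlined so the sketch is self-contained.
SAMPLE_INI = """
[server]
ip = 172.25.254.130
port = 8081
debug = True
mem_yellow = 80
mem_red = 95
cpu_yellow = 80
cpu_red = 95

[agent]
interval = 60
install_path = /home/RunTimeNSDash

[all_agent_ip]
ips = 172.25.254.134,172.25.254.135
"""


def load_settings(text):
    """Parse the INI text into a plain dict of the values used below."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    server = parser["server"]
    raw_ips = parser["all_agent_ip"]["ips"]
    return {
        "server": (server["ip"], server.getint("port")),
        "interval": parser["agent"].getint("interval"),
        "ips": [ip.strip() for ip in raw_ips.split(",") if ip.strip()],
        "cpu": (server.getint("cpu_yellow"), server.getint("cpu_red")),
        "mem": (server.getint("mem_yellow"), server.getint("mem_red")),
    }


def alert_level(percent, yellow, red):
    """Return the colour a usage value gets once it reaches a threshold."""
    if percent >= red:
        return "red"
    if percent >= yellow:
        return "yellow"
    return "normal"


if __name__ == "__main__":
    settings = load_settings(SAMPLE_INI)
    print(settings["ips"])
    print(alert_level(85, *settings["cpu"]))
```

To read the real file instead of the inlined sample, pass the contents of conf/ns.ini to `load_settings`.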

3. The first install attempt fails SSH authentication, so set up passwordless (key-based) SSH login to each node:

[root@nginx1 toddlerya-NebulaSolarDash-58fe715]# python manager.py -install

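[+] Installing the client on every node and automatically starting the clients and the server

[+] Install directory set: /home/RunTimeNSDash

[+] Historical data deleted

[+] Server started

[+] Nodes in this install: 2

[09/06/17 18:38:44] : INFO    : Checking server connectivity: 172.25.254.134

[09/06/17 18:38:44] : INFO    : Starting deployment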

Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).

[09/06/17 18:38:44] : ERROR   : can not logon 172.25.254.134 without passwd.

[09/06/17 18:38:44] : INFO    : Checking server connectivity: 172.25.254.135

[09/06/17 18:38:44] : INFO    : Starting deployment

Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).

[09/06/17 18:38:44] : ERROR   : can not logon 172.25.254.135 without passwd.

[root@nginx1 toddlerya-NebulaSolarDash-58fe715]# ssh-keygen 

Generating public/private rsa key pair.

Enter file in which to save the key (/root/.ssh/id_rsa): 

/root/.ssh/id_rsa already exists.

Overwrite (y/n)? n

[root@nginx1 toddlerya-NebulaSolarDash-58fe715]# ssh-copy-id root@172.25.254.134

/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed

/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys

root@172.25.254.134's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'root@172.25.254.134'"

and check to make sure that only the key(s) you wanted were added.

[root@nginx1 toddlerya-NebulaSolarDash-58fe715]# ssh-copy-id root@172.25.254.135

/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed

/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys

root@172.25.254.135's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'root@172.25.254.135'"

and check to make sure that only the key(s) you wanted were added.
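Before re-running the installer it is worth confirming that key-based login now works on every node. The sketch below is one way to do that from Python; it relies on the standard OpenSSH options `BatchMode=yes` (never prompt for a password) and `ConnectTimeout`. The function names are hypothetical, and nothing here is taken from manager.py.

```python
import subprocess


def ssh_check_command(ip, user="root", timeout_s=5):
    """Build an ssh command that fails instead of prompting for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",                    # disable password prompts
        "-o", "ConnectTimeout=%d" % timeout_s,    # give up quickly if unreachable
        "%s@%s" % (user, ip),
        "true",                                   # harmless remote command
    ]


def can_login_without_password(ip, user="root"):
    """Return True if `ssh user@ip true` succeeds non-interactively."""
    try:
        result = subprocess.run(
            ssh_check_command(ip, user),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=15,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0
```

Calling `can_login_without_password("172.25.254.134")` and the same for 172.25.254.135 should return True for both nodes once `ssh-copy-id` has been run as shown above.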

4. Install and deploy

Command-line options:

[root@nginx1 toddlerya-NebulaSolarDash-58fe715]# python manager.py -h

usage: manager.py [-h] [-install] [-uninstall] [-startall] [-stopall]

                  [-start START_ONE] [-stop STOP_ONE]

Manager Tool

optional arguments:

  -h, --help        show this help message and exit

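  -install          Install the client on every node and automatically start the clients and the server

  -uninstall        Stop the client on every node, clean up the installed files, and stop the server

  -startall         Start the client on every node and set up the crond watchdog

  -stopall          Stop the client on every node and remove the crond watchdog

  -start START_ONE  Start the client on one specified node and set up the crond watchdog

  -stop STOP_ONE    Stop the client on one specified node and remove the crond watchdog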
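The usage line shows single-dash long options, four of them plain switches and two taking a node address. As a rough sketch of how such an interface can be declared with `argparse`, assuming nothing about how manager.py is actually written (only the option names come from the help output):

```python
import argparse


def build_parser():
    """Declare a parser with the same option names as the help output."""
    parser = argparse.ArgumentParser(prog="manager.py", description="Manager Tool")
    # Plain switches: present means True, absent means False.
    parser.add_argument("-install", action="store_true",
                        help="install and start the clients and the server")
    parser.add_argument("-uninstall", action="store_true",
                        help="stop and remove the clients, stop the server")
    parser.add_argument("-startall", action="store_true",
                        help="start the client on every node")
    parser.add_argument("-stopall", action="store_true",
                        help="stop the client on every node")
    # Options that take one node address; dest gives the START_ONE/STOP_ONE metavar.
    parser.add_argument("-start", dest="start_one",
                        help="start the client on one node")
    parser.add_argument("-stop", dest="stop_one",
                        help="stop the client on one node")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["-start", "172.25.254.134"])
    print(args.start_one, args.install)
```

In this sketch `-start` and `-stop` act on a single node, which is handy when only one agent needs restarting.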

[root@nginx1 toddlerya-NebulaSolarDash-58fe715]# python manager.py -install

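[+] Installing the client on every node and automatically starting the clients and the server

[+] Install directory set: /home/RunTimeNSDash

[+] Historical data deleted

[+] Server started

[+] Nodes in this install: 2

[09/06/17 18:39:25] : INFO    : Checking server connectivity: 172.25.254.134

[09/06/17 18:39:25] : INFO    : Starting deployment

[09/06/17 18:39:27] : INFO    : Checking server connectivity: 172.25.254.135

[09/06/17 18:39:27] : INFO    : Starting deployment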

[root@nginx1 toddlerya-NebulaSolarDash-58fe715]# python manager.py -startall

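[+] Starting the client on every node and setting up the crond watchdog

[+] Nodes in this install: 2

[09/06/17 18:40:18] : INFO    : Checking server connectivity: 172.25.254.134

[09/06/17 18:40:18] : INFO    : Starting deployment

[09/06/17 18:40:20] : INFO    : Checking server connectivity: 172.25.254.135

[09/06/17 18:40:20] : INFO    : Starting deployment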

[root@nginx1 toddlerya-NebulaSolarDash-58fe715]# lsof -i:8081

COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME

python  7588 root    4u  IPv4  42008      0t0  TCP *:tproxy (LISTEN)
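The lsof output confirms that a python process is listening on port 8081 (lsof prints the port as `tproxy` because that is the service name usually registered for 8081). The same check can be done from any machine that can reach the server with a few lines of Python; this is a generic TCP connect test with a hypothetical function name, not part of NebulaSolarDash.

```python
import socket


def is_port_open(host, port, timeout_s=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For the setup in this walkthrough the call would be `is_port_open("172.25.254.130", 8081)`.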

5. Verify the result

Nebula-Solar server resource monitoring: node list

No.  Hostname  IP address      Memory  CPU
1    host3     172.25.254.134  -109%   0.08%
2    web       172.25.254.135  1%      0.45%

Node basic information -- every chart can be dragged with the mouse and zoomed with the scroll wheel

Hostname:        host3
IP address:      172.25.254.134
CPU:             2 x AMD Athlon(tm) X4 730 Quad Core Processor
Memory (MB):     977
SWAP (MB):       0
OS:              CentOS Linux 7.2.1511 Core
Kernel version:  3.10.0-327.el7.x86_64
Uptime:          27 days, 4:51:37
Current time:    20170901-18:35:42

Node disk storage statistics

No.  Filesystem               Size  Used  Avail  Use%  Mounted on
1    /dev/mapper/centos-root  18G   2.2G  16G    13%   /
2    devtmpfs                 479M  0     479M   0%    /dev
3    tmpfs                    489M  0     489M   0%    /dev/shm
4    tmpfs                    489M  50M   440M   11%   /run
5    tmpfs                    489M  0     489M   0%    /sys/fs/cgroup
6    /dev/sda1                497M  126M  372M   26%   /boot
7    tmpfs                    98M   0     98M    0%    /run/user/0

CPU chart (20170901-18:24:26): USAGE 0.08%, NICE 0%, USER 0.01%, SYSTEM 0.06%, IOWAIT 0.01%

Unlabeled chart value: 0.0668

Load average chart (20170901-18:31:27): 0

Node basic information -- every chart can be dragged with the mouse and zoomed with the scroll wheel

Hostname:        web
IP address:      172.25.254.135
CPU:             2 x AMD Athlon(tm) X4 730 Quad Core Processor
Memory (MB):     1823
SWAP (MB):       0
OS:              CentOS Linux 7.2.1511 Core
Kernel version:  3.10.0-514.26.2.el7.x86_64
Uptime:          41 days, 4:19:10
Current time:    20170906-19:02:46

Node disk storage statistics

No.  Filesystem               Size  Used  Avail  Use%  Mounted on
1    /dev/mapper/centos-root  18G   12G   6.1G   66%   /
2    devtmpfs                 897M  0     897M   0%    /dev
3    tmpfs                    912M  144K  912M   1%    /dev/shm
4    tmpfs                    912M  99M   814M   11%   /run
5    tmpfs                    912M  0     912M   0%    /sys/fs/cgroup
6    /dev/sda1                497M  190M  307M   39%   /boot
7    tmpfs                    183M  32K   183M   1%    /run/user/0
8    /dev/sr0                 4.1G  4.1G  0      100%  /run/media/root/CentOS

CPU chart (20170906-18:57:32): USAGE 0.45%, NICE 0.01%, USER 0.12%, SYSTEM 0.31%, IOWAIT 0%

Unlabeled chart (20170906-18:57:32)

Load average chart (20170906-18:55:31): 0