这是很久以前遇到的一个问题,做下记录
之前公司网站总出现502,进过排查发现,是因为php-fpm配置的原因。
502 一般是服务器资源耗尽,导致nginx(我们用的是nginx)无法创建新的进程(限制于php-fpm的配置),无法提供服务。
这里主要介绍几个参数
pm = dynamic pm.max_children = 100 pm.start_servers = 30 pm.min_spare_servers = 30 pm.max_spare_servers = 100 pm.max_requests = 500 request_terminate_timeout = 100 request_slowlog_timeout = 5s slowlog = /home/wwwlogs/slow.log pm.status_path = /nginx_status
dynamic说明nginx是动态增加进程的,最大是max_children=100(502就是因为同时有100个进程在使用,nginx无法提供服务而返回的),但这个参数也要根据服务器内存来设置,一般来说一个进程占用20~30M,可以大概算一下。若你的并发链接经常大于这个数,你就要考虑换大一点的内存了(在加了max_requests参数的前提下)。
重点:max_requests表示每个进程使用多少次后关闭。因为nginx每个进程打开后都会存在一个类似进程池的东西里面,不会释放内存,这样就会造成内存浪费,而且一个进程使用久后可能会内存泄漏,max_requests就是用来解决这个问题的,后面的500可以自行设置,不可太小,太小也会造成502,最好是在每天凌晨1点reload一下nginx。
假设服务器可用内存为 8G
如何查看可用内存?
free -m total used free shared buffers cached Mem: 988 927 61 0 56 348 -/+ buffers/cache: 522 466 Swap: 1983 152 1831
以我的1G虚拟机为例
实际可用内存为:buffers+cached+free = 462 ,还有462M可以用(其实看第二行466也是可使用内存)
单个进程消耗内存为 20M
如何查看每个fpm进程占用的内存大小?
ps --no-headers -o "rss,cmd" -C php-fpm | awk '{ sum+=$1 } END { printf ("%d%s\n", sum/NR/1024,"M") }' 20M
8 *1024 / 20 = 409.6
那么 max_children 设为400即可(注意,这里的8要为上面计算出的可用内存)
配置 pm.status_path 来实时查看进程使用情况(生产环境不推荐使用)
pm.status_path = /nginx_status
nginx 配置中做相应配置
server{ listen 80 ; location ~ ^/(nginx_status)$ { include fastcgi_params; fastcgi_pass unix:/tmp/php-cgi.sock; fastcgi_param SCRIPT_FILENAME $fastcgi_script_name; } }
fastcgi_params 文件内容(将它放到配置文件的同级目录)
fastcgi_param QUERY_STRING $query_string; fastcgi_param REQUEST_METHOD $request_method; fastcgi_param CONTENT_TYPE $content_type; fastcgi_param CONTENT_LENGTH $content_length; fastcgi_param SCRIPT_NAME $fastcgi_script_name; fastcgi_param REQUEST_URI $request_uri; fastcgi_param DOCUMENT_URI $document_uri; fastcgi_param DOCUMENT_ROOT $document_root; fastcgi_param SERVER_PROTOCOL $server_protocol; fastcgi_param HTTPS $https if_not_empty; fastcgi_param GATEWAY_INTERFACE CGI/1.1; fastcgi_param SERVER_SOFTWARE nginx/$nginx_version; fastcgi_param REMOTE_ADDR $remote_addr; fastcgi_param REMOTE_PORT $remote_port; fastcgi_param SERVER_ADDR $server_addr; fastcgi_param SERVER_PORT $server_port; fastcgi_param SERVER_NAME $server_name; # PHP only, required if PHP was built with --enable-force-cgi-redirect fastcgi_param REDIRECT_STATUS 200; # PHP only, required if PHP was built with --enable-force-cgi-redirect fastcgi_param REDIRECT_STATUS 200;
重启nginx ,访问 http://192.168.1.136/nginx_status,其中 192.168.1.136改为你的ip地址即可
有四种输出格式 full,json,xml,html
http://192.168.1.136/nginx_status?full 将显示所有进程的详细信息
pool: www process manager: dynamic start time: 22/Nov/2017:10:07:06 +0800 start since: 2878 accepted conn: 62 listen queue: 0 max listen queue: 0 listen queue len: 0 idle processes: 10 active processes: 1 total processes: 11 max active processes: 1 max children reached: 0 slow requests: 0 ************************ pid: 48176 state: Idle start time: 22/Nov/2017:10:07:06 +0800 start since: 2878 requests: 6 request duration: 534 request method: GET request URI: /nginx_status?json&full content length: 0 user: - script: /nginx_status last request cpu: 0.00 last request memory: 262144 ************************ pid: 48177 state: Running start time: 22/Nov/2017:10:07:06 +0800 start since: 2878 requests: 7 request duration: 141 request method: GET request URI: /nginx_status?full content length: 0 user: - script: /nginx_status last request cpu: 0.00 last request memory: 0 ....
其中 full 又可以和其它三个参数并用如
http://192.168.1.136/nginx_status?json&full 将以json的方式显示输出所有的进程信息
自己写了个分析统计代码,以我的虚拟机为例
$j = myCurl('http://192.168.1.136/nginx_status?json&full'); $arr = json_decode($j, true); $requestTime = !empty($_GET['rtime']) ? $_GET['rtime'] : ''; $requests = !empty($_GET['rs']) ? $_GET['rs'] : ''; $pmsg = array( 'pool' => 'fpm池子名称,大多数为www', 'process manager' => '进程管理方式,值:static, dynamic or ondemand. dynamic', 'start time' => '启动日期,如果reload了php-fpm,时间会更新', 'start since' => '运行时长', 'accepted conn' => '当前池子接受的请求数', 'listen queue' => '请求等待队列,如果这个值不为0,那么要增加FPM的进程数量', 'max listen queue' => '请求等待队列最高的数量', 'listen queue len' => 'socket等待队列长度', 'idle processes' => '空闲进程数量', 'active processes' => '活跃进程数量', 'total processes' => '总进程数量', 'max active processes' => '最大的活跃进程数量(FPM启动开始算)', 'max children reached' => '大道进程最大数量限制的次数,如果这个数量不为0,那说明你的最大进程数量太小了,请改大一点。', 'slow requests' => '慢请求' ); $pinfomsg = array( 'pid' => '进程PID,可以单独kill这个进程', 'state' => '当前进程的状态', 'start time' => '进程启动的日期', 'start since' => '当前进程运行时长', 'requests' => '当前进程处理了多少个请求', 'request duration' => '请求时长(微妙)', 'request method' => '请求方法 (GET, POST, …)', 'request uri' => '请求URI', 'content length' => '请求内容长度 (仅用于 POST)', 'user' => '用户 (PHP_AUTH_USER) (or ‘-’ 如果没设置)', 'script' => 'PHP脚本', 'last request cpu' => '最后一个请求CPU使用率。', 'last request memory' => '上一个请求使用的内存' ); $run = array(); foreach ($arr as $k => $p) { if ($k == 'processes') { $red = ''; if ($requestTime) { echo '---------------------------------------------------------------请求时长------------------------------------------------------------------------<br>'; $processes = _multi_array_sort($p, 'request duration', SORT_DESC); $red = 'request duration'; } elseif ($requests) { echo '-------------------------------------------------------------进程使用次数----------------------------------------------------------------------<br>'; $processes = _multi_array_sort($p, 'requests', SORT_DESC); $red = 'requests'; } else { $processes = $p; } foreach ($processes as $pinfo) { if (strtolower($pinfo['state']) == 'running') { $run[] = $pinfo; } $script[] = $pinfo['script']; $scriptUrl[$pinfo['script']][] = urldecode($pinfo['request uri']); echo '-----------------------------------------------------------------------------------------------------------------------------------------------<br>'; foreach ($pinfo as $k1 => $pinfo1) { switch ($k1) { case 'request duration': $pinfo2 = ($pinfo1 / 1000000) . ' 秒'; $light = $red == 'request duration' ? 1 : 0; break; case 'start time': $pinfo2 = date('Y-m-d H:i:s', $pinfo1); break; case 'start since': $pinfo2 = time2day($pinfo1); break; case 'requests': $pinfo2 = $pinfo1; $light = $red == 'requests' ? 1 : 0; break; case 'request uri': $pinfo2 = urldecode($pinfo1); break; default : $pinfo2 = $pinfo1; $light = ''; break; } style($pinfo2, $pinfomsg[$k1], $light); } } } else { $pred = ''; switch ($k) { case 'start time': $p1 = date('Y-m-d H:i:s', $p); break; case 'start since': $p1 = time2day($p); break; case 'listen queue': $p1 = $p; $pred = 1; break; case 'max children reached': $p1 = $p; $pred = 1; break; default : $p1 = $p; break; } style($p1, $pmsg[$k], $pred); } } echo '-----------------------------------------访问路径------------------------------------------------------'; echo "<pre>"; print_r(array_count_values($script)); print_r($scriptUrl); echo "</pre>"; echo '-----------------------------------------正在执行------------------------------------------------------'; echo "<pre>"; print_r($run); echo "</pre>"; function style($val, $k, $redk = '') { $style = ''; if ($redk) { $style = 'style="color:red;"'; } echo '<div ' . $style . '><span style="display:inline-block;width:40%;">' . $val . '</span><span style="display:inline-block;width:60%;">' . $k . '</span></div>'; } function time2day($str) { $min = $str / 60; $hours = $min / 60; $days = floor($hours / 24); $hours = floor($hours - ($days * 24)); $min = floor($min - ($days * 60 * 24) - ($hours * 60)); $s = $str % 60; if ($days !== 0) $day = $days . " 天 "; if ($hours !== 0) $day .= $hours . " 小时 "; $day .= $min . " 分钟 "; $day .= $s . " 秒 "; return $day; } function _multi_array_sort($multi_array, $sort_key, $sort = SORT_ASC) { if (is_array($multi_array)) { foreach ($multi_array as $row_array) { if (is_array($row_array)) { $key_array[] = $row_array[$sort_key]; } else { return false; } } } else { return false; } array_multisort($key_array, $sort, $multi_array); return $multi_array; } function myCurl($url) { $ch = curl_init(); curl_setopt($ch, CURLOPT_TIMEOUT, 5); curl_setopt($ch, CURLOPT_URL, $url); curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); $dataBlock = curl_exec($ch); curl_close($ch); return $dataBlock; }
具体的对应信息上面已经有了,还做了访问路径/页面的统计,最后显示出正在执行的进程
效果图
简单的统计