Lua-Nginx-Module-Issues.md

This page records problems encountered while using lua_nginx_module, covering lua_nginx_module itself, the lua-resty-* libraries, flame graphs, and so on.

ERROR: read fault [man error::fault] at 0x000000006174654e (addr) near operator '@cast' at stapxx-qpIGEAkT/luajit.stp:496:17

sample:

[root@kuro samples]# ./lj-lua-stacks.sxx -x 2888 --arg time=60
Found exact match for libluajit: /data/apps/luajit_debug/lib/libluajit-5.1.so.2.1.0
WARNING: Start tracing 2888 (/data/apps/nginx/sbin/nginx)
WARNING: Please wait for 60 seconds...
ERROR: read fault [man error::fault] at 0x000000006174654e (addr) near operator '@cast' at stapxx-qpIGEAkT/luajit.stp:496:17
WARNING: Number of errors: 1, skipped probes: 19
WARNING: /usr/bin/staprun exited with status: 1
Pass 5: run failed.  [man error::pass5]

This problem came from not reading the README carefully. The README of the stapxx project states it explicitly:

If you see "read faults" errors, they are usually very normal and you can just try specifying the --skip-badvars option to ignore them.

Adding the --skip-badvars option to the command resolves it:

./lj-lua-stacks.sxx -x 2888  --arg time=60 --skip-badvars -D MAXSKIPPED=10240 > /tmp/2888.bt

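Note that this problem may not appear if the load during testing is too low. Lowering the load just to avoid the problem is not acceptable, however, because only heavy load captures the real behavior. Put differently, when sampling a production system, do not change the live traffic in any way, since that distorts the observation.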
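
When the sampling run is scripted, stderr can be saved to a file and checked for this error before deciding to rerun. The following is a minimal sketch under that assumption; the function name needs_skip_badvars and the log path /tmp/2888.err are hypothetical and are not part of stapxx.

```shell
# Hypothetical helper, not part of stapxx: succeeds (exit status 0) when a
# saved stderr log contains the "read fault" error shown in the sample above.
needs_skip_badvars() {
    grep -q 'ERROR: read fault' "$1"
}

# Usage sketch (the .err path is an assumption):
#   ./lj-lua-stacks.sxx -x 2888 --arg time=60 > /tmp/2888.bt 2> /tmp/2888.err
#   needs_skip_badvars /tmp/2888.err && echo "rerun with --skip-badvars"
```

Keeping stderr in its own file also keeps the warnings out of the .bt output that stdout is redirected to.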

ERROR: Skipped too many probes, check MAXSKIPPED or try again with stap -t for more details.

sample:

[root@kuro samples]# ./lj-lua-stacks.sxx -x 2888  --arg time=60 --skip-badvars  > /tmp/2888.bt
Found exact match for libluajit: /data/apps/luajit_debug/lib/libluajit-5.1.so.2.1.0
WARNING: Start tracing 2888 (/data/apps/nginx/sbin/nginx)
WARNING: Please wait for 60 seconds...
ERROR: Skipped too many probes, check MAXSKIPPED or try again with stap -t for more details.
WARNING: Number of errors: 0, skipped probes: 101
WARNING: /usr/bin/staprun exited with status: 1
Pass 5: run failed.  [man error::pass5]

The message says "Skipped too many probes, check MAXSKIPPED", so raising the MAXSKIPPED value resolves it. For example:

./lj-lua-stacks.sxx -x 2888  --arg time=60 --skip-badvars -D MAXSKIPPED=10240 > /tmp/2888.bt

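Note that this problem may not appear if the load during testing is too low. Lowering the load just to avoid the problem is not acceptable, however, because only heavy load captures the real behavior. Put differently, when sampling a production system, do not change the live traffic in any way, since that distorts the observation.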
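
To confirm from a script that a run hit the skip limit, the count can be read from the "skipped probes" line of a saved stderr log. This is only a sketch; the function name skipped_probes is hypothetical and is not part of stapxx or SystemTap, and it assumes the log uses the same wording as the sample above.

```shell
# Hypothetical helper, not part of stapxx or SystemTap: prints the last
# "skipped probes" count found in a saved stderr log, or nothing if the
# log has no such line.
skipped_probes() {
    sed -n 's/.*skipped probes: \([0-9][0-9]*\).*/\1/p' "$1" | tail -n 1
}
```

For the sample log above this prints 101. The run aborts as soon as the limit is exceeded, so the number only shows that the limit was hit; it does not tell how large MAXSKIPPED needs to be.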