arthas使用指南

本文最后更新于:2023年9月18日 下午

arthas使用指南

零、原理简介

arthas = Instrumentation + java agent + attach api + ASM

官方开源代码仓库:https://github.com/alibaba/arthas

Arthas-MVEL:https://github.com/XhinLiang/arthas-mvel

Instrumentation

Instrumentation 是 Java 提供的一个来自 JVM 的接口,该接口提供了一系列查看和操作 Java 类定义的方法,例如修改类的字节码、向 classLoader 的 classpath 下加入 jar 文件等。使得开发者可以通过 Java 语言来操作和监控 JVM 内部的一些状态,进而实现 Java 程序的监控分析,甚至实现一些特殊功能(如 JVM 版本的 AOP、热部署等)。

Java agent

Java agent 是一种特殊的Java程序(Jar文件),它是 Instrumentation 的客户端。与普通 Java 程序通过 main 方法启动不同,agent 并不是一个可以单独启动的程序,而必须依附在一个 Java 应用程序上,通过 Instrumentation API 与 JVM 交互。

Java agent 与 Instrumentation 二者需要在一起使用。

attach api

Java agent 可以在 JVM 启动后再加载,而这是通过 attach api 实现的。attach api 不仅仅是为了实现动态加载 agent,attach api 其实是跨 JVM 进程通讯的工具,能够将某种指令从一个 JVM 进程发送给另一个 JVM 进程。

加载 agent 只是 attach api 发送的各种指令中的一种, 诸如 jstack 打印线程栈、jps 列出 Java 进程、jmap 做内存 dump 等功能,都属于 attach api 可以发送的指令。

ASM 字节码操作和分析框架

arthas 通过 ASM 这个 Java 字节码操作和分析框架来对目标类的字节码进行修改。在字节码增强的过程中,arthas 会在目标类的方法入口和出口处插入特定的字节码,用于收集方法的调用次数、响应时间等性能指标。同时,arthas 可以通过修改字节码来实现条件断点、异常捕获等调试功能。

总而言之,arthas 首先利用 attach api 将自己作为 Java agent 附加到目标 JVM 进程中。然后,arthas 通过 Java Instrumentation 获取目标进程的类和对象等运行时信息,并借助 ASM 对目标类的字节码进行动态修改。最后,arthas 与目标 JVM 进程进行进程间通信,接收用户的命令并返回诊断结果

Karthas

  1. mvel 支持

arthas 动态表达式的引擎采用的是 ognl,ognl 的语法不太直观,为了方便用户的使用,karthas 增加了对 mvel 语法的支持,支持 Spring Bean 的自动加载,因此可以类似 python 等交互式语言一样,直接简单地交互执行当前 arthas 注入的 Java 进程的任意方法。具体的使用见 3.9 节。

  1. web console 协作

arthas本身是一个命令行工具,因此使用的主要场景主要是单个开发人员独自 debug 线上问题。但在实际应用场景中,往往需要多个开发人员共同 debug 同一个现场,这时候如果能把 arthas 的现场通过 web url 分享出来,就会使得这种场景的处理更加方便。

karthas 在这个方面针对 arthas 原生的 tunnel-server 进行了增强,使得 web console 能够穿透快手的 access proxy 环境,同时申请资源对 tunnel-server 在快手的环境进行了集群化部署,让快手用户能够开箱可用。

一、进入方式

流水线配置时,设置使用 kbox:

之后打开容器 webssh,使用 jps 找到 java 进程的 pid,使用命令 /opt/kbox/latest/karthas/kas.sh pid 得到 Karthas 的 url 链接,进入即可到达 Karthas 界面

二、辅助工具

因 Karthas 部分命令形式较为复杂,推荐安装 idea 插件 arthas idea:

安装后,右击想要观察的类字段、方法上,即可快捷生成相关命令

三、常用命令及功能

3.1 jad 命令——查看反编译代码

jad 命令后添加类全路径即可查看字节码反编译的 java 代码

3.2 查看 kconf 取值

依靠 Karthas 的 MVEL,可以像交互式解释器那样,使用赋值语句获得 kconf 对象,之后直接调用 get() 即可,也可调用 defaultValue()、configKeyNameSpace()、configKey() 等获取更多信息

1
2
3
4
5
6
7
8
[arthas@70]$ kconf = com.kuaishou.kwaishop.merchant.janus.center.core.config.kconf.MerchantJanusStringListConfigKey.ocpcSplitCharList
@MerchantJanusStringListConfigKey[ocpcSplitCharList]
[arthas@70]$ kconf.get()
@ArrayList[
@String[,],
@String[%2c],
@String[%2C],
]

3.3 watch 命令——监视方法调用

使用 watch 命令监视方法调用,可查看调用时间、入参出参以及抛出的异常。

参数说明:

class-pattern: 类名表达式匹配

method-pattern: 方法名表达式匹配

express: 观察表达式,默认值:{params, target, returnObj}

condition-express: 条件表达式

[b]: 在函数调用之前观察

[e]: 在函数异常之后观察

[s]: 在函数返回之后观察

[f]: 在函数结束之后(正常返回和异常返回)观察

[E]: 开启正则表达式匹配,默认为通配符匹配

[x:]: 指定输出结果的属性遍历深度,默认为 1,最大值是 4

观察表达式与条件表达式均为一个 ognl 表达式,其围绕以下对象进行:

1
2
3
4
5
6
7
8
9
10
11
12
13
public class Advice {
private final ClassLoader loader; // 本次调用类所在的类加载器
private final Class<?> clazz; // 本次调用方法所在类的 Class 引用
private final ArthasMethod method; // 本次调用方法反射引用
private final Object target; // 本次调用类的实例
private final Object[] params; // 本次调用参数列表,是一个数组,如果方法是无参方法则为空数组
private final Object returnObj; // 本次调用返回的对象。当且仅当 isReturn==true 成立时候有效,表明方法调用是以正常返回的方式结束。如果当前方法无返回值 void,则值为 null
private final Throwable throwExp; // 本次调用抛出的异常。当且仅当 isThrow==true 成立时有效,表明方法调用是以抛出异常的方式结束。
private final boolean isBefore; // 标志:当前的通知节点有可能是在方法一开始就通知,此时 isBefore==true 成立,同时 isThrow==false 和 isReturn==false,因为在方法刚开始时,还无法确定方法调用将会如何结束。
private final boolean isThrow; // 标志:当前的方法调用以抛异常的形式结束。
private final boolean isReturn; // 标志:当前的方法调用以正常返回的形式结束。
// getter/setter
}

因此默认会观察调用类的实例、入参列表以及返回值。

观察正常返回的出入参

下面的例子观察 KimServiceImpl#convertMsgRequest 方法的出入参,使用了参数 “-n 1” 表示只观察一次,“-x 3” 表示将出入参的属性仅展开三级。另外因入参被看作一个数组,所以其为 Object[] 类型。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
[arthas@71]$ watch com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.KimServiceImpl convertMsgRequest '{params,returnObj}'  -n 1  -x 3
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 309 ms, listenerId: 34
method=com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.KimServiceImpl.convertMsgRequest location=AtExit
ts=2023-05-04 19:58:09; [cost=0.050595ms] result=@ArrayList[
@Object[][
@KimRequest[
appKey=@String[18eb1702-c0b9-4eea-9455-506daf42ce85],
appSecret=@String[cde904cb-9169-4e9f-a32d-ce4a1200603a],
groupId=null,
userId=null,
userName=@String[yanghao12],
sendType=@Integer[0],
param=@HashMap[isEmpty=false;size=1],
],
],
@KimMsgRequest[
username=@String[yanghao12],
userId=null,
groupId=null,
msgType=null,
markdown=null,
text=null,
],
]
Command execution times exceed limit: 1, so command will exit. You can set it with -n option.

观察异常情况下的出入参及异常

对 ChatServiceImpl#chatDomainConversation 方法构造了一个抛出异常的执行,然后 watch 其详细信息,调用后得到如下结果。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
[arthas@71]$ watch com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl chatDomainConversation '{params,returnObj,throwExp}'  -n 1  -x 3
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 264 ms, listenerId: 35
method=com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl.chatDomainConversation location=AtExceptionExit
ts=2023-05-04 20:05:03; [cost=1.676912ms] result=@ArrayList[
@Object[][
@String[yanghao12],
null,
@String[hello],
@String[],
],
null,
com.kuaishou.kwaishop.merchant.answerize.center.common.exception.AnswerizeBaseException: 目前没有对应领域存在
at com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl.chatDomainConversation(ChatServiceImpl.java:139)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
......
,
]
Command execution times exceed limit: 1, so command will exit. You can set it with -n option.

限定条件的 watch

可使用条件表达式,例如想限定入参属性 userName 为 “test”,则可以使用如下语句:

1
watch com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.KimServiceImpl convertMsgRequest '{params,returnObj,throwExp}' 'params[0].getUserName().equals("test")' -x 3

如想限定第一个参数,则可使用 params[0] 对第一个参数进行限定:

1
2
3
4
5
6
7
8
9
10
11
12
13
[arthas@71]$ watch com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl chat '{params,returnObj,throwExp}' 'params[0].equals("yanghao12")' -x 3
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 252 ms, listenerId: 50
method=com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl.chat location=AtExit
ts=2023-05-04 20:37:03; [cost=3167.408299ms] result=@ArrayList[
@Object[][
@String[yanghao12],
@String[你是谁?],
@String[],
],
@String[我是快手公司商家增长团队研发的智能小助手,专门为用户提供帮助和答疑服务。如果您有任何关于快手平台的问题需要咨询,随时欢迎向我提问。],
null,
]

可采用 &&、||、^ 等逻辑操作符进行多个条件的连接,例如通过 && 连接两个条件,其中 #cost 表示方法的耗时(ms):

1
2
3
4
5
6
7
8
9
10
11
12
13
[arthas@71]$ watch com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl chat '{params,returnObj,throwExp}' 'params[0].equals("yanghao12")&&#cost > 3000' -x 3
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 257 ms, listenerId: 55
method=com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl.chat location=AtExit
ts=2023-05-04 20:42:27; [cost=7961.660798ms] result=@ArrayList[
@Object[][
@String[yanghao12],
@String[讲个鬼故事吧],
@String[],
],
@String[好的,以下是一个短小的鬼故事:\n\n有一个人在深夜走在路上,突然看到前方有一个女人,女人的脸被头发遮住了,只露出一双眼睛。那个人觉得很奇怪,但还是走了过去,当他走过女人身边时,女人突然转过头来,露出了一张恐怖的面孔,那个人吓得立刻逃跑了。\n\n据说这个女人是一个被杀害的鬼魂,她会在深夜出现在路上,引诱路人靠近,然后杀死他们。因此,人们在深夜出行时要特别小心,避免遇到这个可怕的鬼魂。],
null,
]

监视 sql 语句

若使用了 mybatis 或 mybatis-plus,可使用如下命令简单监视任意 sql 语句

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
[arthas@71]$ watch org.apache.ibatis.mapping.BoundSql getSql '{params,returnObj,throwExp}' -x 3
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 368 ms, listenerId: 64
method=org.apache.ibatis.mapping.BoundSql.getSql location=AtExit
ts=2023-05-05 10:49:10; [cost=0.055517ms] result=@ArrayList[
@Object[][isEmpty=true;size=0],
@String[SELECT id,user,messages,create_time,update_time FROM history_info \n \n WHERE (user = ?) ORDER BY create_time DESC limit 1],
null,
]
method=org.apache.ibatis.mapping.BoundSql.getSql location=AtExit
ts=2023-05-05 10:49:12; [cost=0.014351ms] result=@ArrayList[
@Object[][isEmpty=true;size=0],
@String[UPDATE history_info SET user=?,\nmessages=?,\n\nupdate_time=? \n \n WHERE (user = ?)],
null,
]

监视 http 请求

以下会跟踪耗时大于 100ms 的 http 请求的 url 及耗时情况

1
watch org.springframework.web.servlet.DispatcherServlet doService '{params[0].getRequestURI()+" "+ #cost}'  -n 5  -x 3 '#cost>100'  -f

可获取指定 header 信息,例如以下获取 trace-id

1
watch org.springframework.web.servlet.DispatcherServlet doService '{params[0].getRequestURI()+"  header="+params[1].getHeaders("trace-id")}'  -n 5  -x 3 -f

3.4 thread 命令

参数说明

id: 线程 id

[n:]: 指定 CPU 占比最高的前 N 个线程并打印堆栈

[b]: 找出当前阻塞其他线程的线程

[i ]: 指定 cpu 使用率统计的采样间隔,单位为毫秒,默认值为 200

[–all]: 显示所有匹配的线程

排查CPU占比高的线程

使用 thread -n 命令,例如查询前 3 个最忙的线程:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
[arthas@70]$ thread -n 3
"consumer-holder-[commercialize_sync_topic][c_commercialize_sync_group-lane-PRT.test]-0" Id=1110 cpuUsage=37.22% deltaTime=76ms time=11317ms RUNNABLE (in native)
at java.base@11.0.14-internal/sun.nio.ch.EPoll.wait(Native Method)
at java.base@11.0.14-internal/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
at java.base@11.0.14-internal/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
......


"consumer-holder-[kwaishop_ecologic_questionnaire_user_submit][c_kwaishop_ecologic_questionnaire_user_submit_merchant_janus-lane-PRT.test]-0#5" Id=1141 cpuUsage=5.99% deltaTime=12ms time=19619ms RUNNABLE (in native)
at java.base@11.0.14-internal/sun.nio.ch.EPoll.wait(Native Method)
at java.base@11.0.14-internal/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
at java.base@11.0.14-internal/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
......


"arthas-command-execute" Id=1851 cpuUsage=2.48% deltaTime=5ms time=107ms RUNNABLE
at java.management@11.0.14-internal/sun.management.ThreadImpl.dumpThreads0(Native Method)
at java.management@11.0.14-internal/sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:494)
at com.taobao.arthas.core.command.monitor200.ThreadCommand.processTopBusyThreads(ThreadCommand.java:206)
......

可以附加 -i 属性指定采样时间间隔,例如 thread -n 3 -i 1000 会统计近 1000 ms 内前三个最消耗 CPU 的线程。

排查阻塞线程

使用 thread -b 直接查询

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
[arthas@70]$ thread -b
"consumer-holder-[invite_call_back_topic][c_invite_call_back_group-lane-PRT.test]-0" Id=1175 RUNNABLE
at java.base@11.0.14-internal/sun.nio.ch.EPoll.wait(Native Method)
at java.base@11.0.14-internal/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
at java.base@11.0.14-internal/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
- locked sun.nio.ch.Util$2@3903b099
- locked sun.nio.ch.EPollSelectorImpl@494f3bc9
at java.base@11.0.14-internal/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
at org.apache.kafka.common.network.Selector.select(Selector.java:645)
at org.apache.kafka.common.network.Selector.poll(Selector.java:396)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:430)
at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:226)
- locked org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient@7ad2abea <---- but blocks 1 other threads!
at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnceWithCostInfo(KafkaConsumer.java:1256)
at org.apache.kafka.clients.consumer.KafkaConsumer.pollWithCostInfo(KafkaConsumer.java:1095)
at org.apache.kafka.clients.consumer.KafkaConsumer.pollWithCostInfo(KafkaConsumer.java:1130)
at com.kuaishou.framework.kafka.n.KafkaJavaConsumerHolder$ConsumerHolder.loopPollAndConsume(KafkaJavaConsumerHolder.java:1080)
at com.kuaishou.framework.kafka.n.KafkaJavaConsumerHolder$ConsumerHolder.lambda$createAndExecuteConsumeTask$7(KafkaJavaConsumerHolder.java:920)
at com.kuaishou.framework.kafka.n.KafkaJavaConsumerHolder$ConsumerHolder$$Lambda$1817/0x00000008413f4c40.run(Unknown Source)
at java.base@11.0.14-internal/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base@11.0.14-internal/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base@11.0.14-internal/java.lang.Thread.run(Thread.java:829)

Number of locked synchronizers = 1
- java.util.concurrent.ThreadPoolExecutor$Worker@7120dbac

可看到当前消费者组线程 “consumer-holder-[invite_call_back_topic][c_invite_call_back_group-lane-PRT.test]-0” 锁定了对象 org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient@7ad2abea,导致阻塞了另外一个线程。

查询特定线程堆栈

使用 thread id,查看特定 id 线程的堆栈信息

1
2
3
4
5
6
7
8
9
10
11
12
13
14
[arthas@71]$ thread 221
"http-nio-8080-exec-1" Id=221 WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@36bdd6d8
at java.base@11.0.14-internal/jdk.internal.misc.Unsafe.park(Native Method)
- waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@36bdd6d8
at java.base@11.0.14-internal/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
at java.base@11.0.14-internal/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2081)
at java.base@11.0.14-internal/java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:433)
at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:108)
at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:33)
at java.base@11.0.14-internal/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1054)
at java.base@11.0.14-internal/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1114)
at java.base@11.0.14-internal/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.base@11.0.14-internal/java.lang.Thread.run(Thread.java:829)

查看所有线程

使用 thread -all 显示所有匹配线程信息

查看特定状态所有线程

使用 thread –state xxx,例如:

3.5 trace 命令——查看方法调用链路和执行耗时

trace 命令能主动搜索 class-pattern/method-pattern 对应的方法调用路径,渲染和统计整个调用链路上的所有性能开销和追踪调用链路。

参数说明

class-pattern: 类名表达式匹配

method-pattern: 方法名表达式匹配

condition-express: 条件表达式

[E]: 开启正则表达式匹配,默认为通配符匹配

[n:]: 命令执行次数

#cost: 方法执行耗时

简单示例

一个简单的示例如下,可以观察到耗时的操作在 ChatService#chat 和 KimService#sendTextMessage 两个方法内。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
[arthas@71]$ trace com.kuaishou.kwaishop.merchant.answerize.center.core.handler.consume.ChatConsumeHandler handler
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 346 ms, listenerId: 16
`---ts=2023-05-04 14:16:36;thread_name=UNI_c_answerize_chat_topic_22;id=765;is_daemon=false;priority=5;TCCL=org.springframework.boot.loader.LaunchedURLClassLoader@749ab7b4
`---[1833.242546ms] com.kuaishou.kwaishop.merchant.answerize.center.core.handler.consume.ChatConsumeHandler:handler()
+---[0.035623ms] kuaishou.kwaishop.merchant.invite.center.AnswerizeMsgDto$AnswerizeFlowMsg:getUser() #37
+---[0.017509ms] kuaishou.kwaishop.merchant.invite.center.AnswerizeMsgDto$AnswerizeFlowMsg:getExtra() #38
+---[0.120501ms] com.kuaishou.framework.util.ObjectMapperUtils:fromJson() #40
+---[0.039254ms] org.apache.commons.lang3.StringUtils:isEmpty() #44
+---[1522.793878ms] com.kuaishou.kwaishop.merchant.answerize.center.core.service.ChatService:chat() #47
+---[0.03056ms] com.kuaishou.kwaishop.merchant.answerize.center.common.util.KimOpenUtil:msgParamConvertor() #48
+---[0.059335ms] com.kuaishou.kwaishop.merchant.answerize.center.core.config.kconf.MerchantAnswerizeMapConfigKey:get() #51
+---[0.023643ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.kim.KimRequest:builder() #52
+---[0.021792ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.kim.KimRequest$KimRequestBuilder:appKey() #53
+---[0.019118ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.kim.KimRequest$KimRequestBuilder:appSecret() #54
+---[0.014695ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.kim.KimRequest$KimRequestBuilder:param() #55
+---[0.017362ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.kim.KimRequest$KimRequestBuilder:groupId() #56
+---[0.0151ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.kim.KimRequest$KimRequestBuilder:userName() #57
+---[0.016715ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.kim.KimRequest$KimRequestBuilder:sendType() #58
+---[0.027505ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.kim.KimRequest$KimRequestBuilder:build() #59
`---[309.269577ms] com.kuaishou.kwaishop.merchant.answerize.center.core.service.KimService:sendTextMessage() #61

多层 trace

可见 trace 命令只会 trace 匹配到的函数里的子调用,并不会向下 trace 多层。为了实现 trace 多层的效果,可以使用 -E 参数开启正则表达式匹配,从而用正则表达式匹配路径上的多个类和函数,一定程度上达到多层 trace 的效果,以下跟踪了handler() 方法中调用的 ChatService#chat 方法,以及 chat() 内部调用的 ChatServiceImpl#chatNormalConversation 方法:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
[arthas@71]$ trace -E com.kuaishou.kwaishop.merchant.answerize.center.core.handler.consume.ChatConsumeHandler|com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl handler|chat|chatNormalConversation
Press Q or Ctrl+C to abort.
Affect(class count: 2 , method count: 3) cost in 562 ms, listenerId: 18
`---ts=2023-05-04 14:34:05;thread_name=UNI_c_answerize_chat_topic_24;id=795;is_daemon=false;priority=5;TCCL=org.springframework.boot.loader.LaunchedURLClassLoader@749ab7b4
`---[1872.237546ms] com.kuaishou.kwaishop.merchant.answerize.center.core.handler.consume.ChatConsumeHandler:handler()
+---[0.01762ms] kuaishou.kwaishop.merchant.invite.center.AnswerizeMsgDto$AnswerizeFlowMsg:getUser() #37
+---[0.00891ms] kuaishou.kwaishop.merchant.invite.center.AnswerizeMsgDto$AnswerizeFlowMsg:getExtra() #38
+---[0.107442ms] com.kuaishou.framework.util.ObjectMapperUtils:fromJson() #40
+---[0.040396ms] org.apache.commons.lang3.StringUtils:isEmpty() #44
+---[1151.319768ms] com.kuaishou.kwaishop.merchant.answerize.center.core.service.ChatService:chat() #47
| `---[1151.257163ms] com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl:chat()
| +---[12.101286ms] com.kuaishou.kwaishop.merchant.answerize.center.core.service.HistoryManageService:queryHistoryByUser() #50
| +---[0.032386ms] org.apache.commons.collections4.CollectionUtils:isEmpty() #51
| +---[0.025957ms] org.apache.commons.lang3.StringUtils:isEmpty() #58
| `---[1139.019001ms] com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl:chatNormalConversation() #59
| `---[1138.955373ms] com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl:chatNormalConversation()
| +---[0.024002ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.chat.MessageDTO:builder() #117
| +---[0.019647ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.chat.MessageDTO$Role:getName() #118
| +---[0.017317ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.chat.MessageDTO$MessageDTOBuilder:role() #95
| +---[0.018757ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.chat.MessageDTO$MessageDTOBuilder:content() #119
| +---[0.014803ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.chat.MessageDTO$MessageDTOBuilder:build() #120
| +---[0.012072ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.chat.ChatCompletionDTO$Model:getName() #123
| +---[2.446956ms] com.kuaishou.kwaishop.merchant.answerize.center.common.util.TikTokensUtil:tokens() #95
| +---[0.019998ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.chat.ChatCompletionDTO$Model:getMaxToken() #124
| +---[1129.891215ms] com.kuaishou.kwaishop.merchant.answerize.center.client.client.OpenAIClient:chatCompletion() #127
| +---[0.028956ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.chat.ChatCompletionResponseDTO:getChoices() #128
| +---[0.023938ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.chat.ChatChoice:getMessageDTO() #95
| +---[6.006524ms] com.kuaishou.kwaishop.merchant.answerize.center.core.service.HistoryManageService:updateUserHistory() #131
| `---[0.017471ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.chat.MessageDTO:getContent() #134
+---[0.016075ms] com.kuaishou.kwaishop.merchant.answerize.center.common.util.KimOpenUtil:msgParamConvertor() #48
+---[0.03273ms] com.kuaishou.kwaishop.merchant.answerize.center.core.config.kconf.MerchantAnswerizeMapConfigKey:get() #51
+---[0.014531ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.kim.KimRequest:builder() #52
+---[0.011929ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.kim.KimRequest$KimRequestBuilder:appKey() #53
+---[0.009899ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.kim.KimRequest$KimRequestBuilder:appSecret() #54
+---[0.012046ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.kim.KimRequest$KimRequestBuilder:param() #55
+---[0.009794ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.kim.KimRequest$KimRequestBuilder:groupId() #56
+---[0.009603ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.kim.KimRequest$KimRequestBuilder:userName() #57
+---[0.009825ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.kim.KimRequest$KimRequestBuilder:sendType() #58
+---[0.010477ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.kim.KimRequest$KimRequestBuilder:build() #59
`---[720.206435ms] com.kuaishou.kwaishop.merchant.answerize.center.core.service.KimService:sendTextMessage() #61

限制trace条件

为了防止采集到大量结果,可以使用 -n 和 #cost 来进行采集结果数量的限定和耗时条件的限定,例如下面这个例子将只采集耗时大于 3000ms 的第一次调用,之后便会直接退出命令:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
[arthas@71]$ trace com.kuaishou.kwaishop.merchant.answerize.center.core.handler.consume.ChatConsumeHandler handler -n 1 '#cost > 3000'
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 252 ms, listenerId: 21
`---ts=2023-05-04 14:47:23;thread_name=UNI_c_answerize_chat_topic_27;id=7ba;is_daemon=false;priority=5;TCCL=org.springframework.boot.loader.LaunchedURLClassLoader@749ab7b4
`---[6064.372975ms] com.kuaishou.kwaishop.merchant.answerize.center.core.handler.consume.ChatConsumeHandler:handler()
+---[0.020583ms] kuaishou.kwaishop.merchant.invite.center.AnswerizeMsgDto$AnswerizeFlowMsg:getUser() #37
+---[0.012811ms] kuaishou.kwaishop.merchant.invite.center.AnswerizeMsgDto$AnswerizeFlowMsg:getExtra() #38
+---[0.094053ms] com.kuaishou.framework.util.ObjectMapperUtils:fromJson() #40
+---[0.017264ms] org.apache.commons.lang3.StringUtils:isEmpty() #44
+---[5727.814648ms] com.kuaishou.kwaishop.merchant.answerize.center.core.service.ChatService:chat() #47
+---[0.016821ms] com.kuaishou.kwaishop.merchant.answerize.center.common.util.KimOpenUtil:msgParamConvertor() #48
+---[0.036667ms] com.kuaishou.kwaishop.merchant.answerize.center.core.config.kconf.MerchantAnswerizeMapConfigKey:get() #51
+---[0.037161ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.kim.KimRequest:builder() #52
+---[0.011403ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.kim.KimRequest$KimRequestBuilder:appKey() #53
+---[0.01047ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.kim.KimRequest$KimRequestBuilder:appSecret() #54
+---[0.016492ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.kim.KimRequest$KimRequestBuilder:param() #55
+---[0.010732ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.kim.KimRequest$KimRequestBuilder:groupId() #56
+---[0.007971ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.kim.KimRequest$KimRequestBuilder:userName() #57
+---[0.010299ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.kim.KimRequest$KimRequestBuilder:sendType() #58
+---[0.011872ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.kim.KimRequest$KimRequestBuilder:build() #59
`---[335.348733ms] com.kuaishou.kwaishop.merchant.answerize.center.core.service.KimService:sendTextMessage() #61

Command execution times exceed limit: 1, so command will exit. You can set it with -n option.

也可如前“watch”命令所示,使用更为复杂的条件语句进行条件限制。

通配符匹配

trace 默认采用通配符匹配,下面的示例中,将会匹配 ChatServiceImpl 内 chat* 的方法,可见其进行了两层 trace

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
[arthas@71]$ trace com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl chat* -n 1 '#cost > 3000'
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 3) cost in 305 ms, listenerId: 22
`---ts=2023-05-04 14:51:18;thread_name=UNI_c_answerize_chat_topic_29;id=7cf;is_daemon=false;priority=5;TCCL=org.springframework.boot.loader.LaunchedURLClassLoader@749ab7b4
`---[4986.447081ms] com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl:chat()
+---[6.610857ms] com.kuaishou.kwaishop.merchant.answerize.center.core.service.HistoryManageService:queryHistoryByUser() #50
+---[0.00605ms] org.apache.commons.collections4.CollectionUtils:isEmpty() #51
+---[0.004414ms] org.apache.commons.lang3.StringUtils:isEmpty() #58
`---[4979.744882ms] com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl:chatNormalConversation() #59
`---[4979.709312ms] com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl:chatNormalConversation()
+---[0.007337ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.chat.MessageDTO:builder() #117
+---[0.005473ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.chat.MessageDTO$Role:getName() #118
+---[0.006898ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.chat.MessageDTO$MessageDTOBuilder:role() #95
+---[0.007451ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.chat.MessageDTO$MessageDTOBuilder:content() #119
+---[0.005921ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.chat.MessageDTO$MessageDTOBuilder:build() #120
+---[0.004486ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.chat.ChatCompletionDTO$Model:getName() #123
+---[2.528856ms] com.kuaishou.kwaishop.merchant.answerize.center.common.util.TikTokensUtil:tokens() #95
+---[0.005295ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.chat.ChatCompletionDTO$Model:getMaxToken() #124
+---[4970.527234ms] com.kuaishou.kwaishop.merchant.answerize.center.client.client.OpenAIClient:chatCompletion() #127
+---[0.0063ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.chat.ChatCompletionResponseDTO:getChoices() #128
+---[0.006029ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.chat.ChatChoice:getMessageDTO() #95
+---[6.438786ms] com.kuaishou.kwaishop.merchant.answerize.center.core.service.HistoryManageService:updateUserHistory() #131
`---[0.00694ms] com.kuaishou.kwaishop.merchant.answerize.center.common.dto.chat.MessageDTO:getContent() #134

Command execution times exceed limit: 1, so command will exit. You can set it with -n option.

默认 trace 会跳过 jdk 中的方法,如 toString() 等,使用 –skipJDKMethod false 关闭跳过。

动态trace

  1. 打开终端 1,trace 上面 demo 里的run函数,可以看到打印出 listenerId: 1

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    [arthas@59161]$ trace demo.MathGame run
    Press Q or Ctrl+C to abort.
    Affect(class count: 1 , method count: 1) cost in 112 ms, listenerId: 1
    `---ts=2020-07-09 16:48:11;thread_name=main;id=1;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@3d4eac69
    `---[1.389634ms] demo.MathGame:run()
    `---[0.123934ms] demo.MathGame:primeFactors() #24 [throws Exception]

    `---ts=2020-07-09 16:48:12;thread_name=main;id=1;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@3d4eac69
    `---[3.716391ms] demo.MathGame:run()
    +---[3.182813ms] demo.MathGame:primeFactors() #24
    `---[0.167786ms] demo.MathGame:print() #25

  2. 现在想要深入子函数primeFactors,可以打开一个新终端 2,使用telnet localhost 3658连接上 arthas,再 trace primeFactors时,指定listenerId。这时终端 2 打印的结果,说明已经增强了一个函数:Affect(class count: 1 , method count: 1),但不再打印更多的结果。

    1
    2
    3
    4
    [arthas@59161]$ trace demo.MathGame primeFactors --listenerId 1
    Press Q or Ctrl+C to abort.
    Affect(class count: 1 , method count: 1) cost in 34 ms, listenerId: 1

  3. 再查看终端 1,可以发现 trace 的结果增加了一层,打印了primeFactors函数里的内容:

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    `---ts=2020-07-09 16:49:29;thread_name=main;id=1;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@3d4eac69
    `---[0.492551ms] demo.MathGame:run()
    `---[0.113929ms] demo.MathGame:primeFactors() #24 [throws Exception]
    `---[0.061462ms] demo.MathGame:primeFactors()
    `---[0.001018ms] throw:java.lang.IllegalArgumentException() #46

    `---ts=2020-07-09 16:49:30;thread_name=main;id=1;is_daemon=false;priority=5;TCCL=sun.misc.Launcher$AppClassLoader@3d4eac69
    `---[0.409446ms] demo.MathGame:run()
    +---[0.232606ms] demo.MathGame:primeFactors() #24
    | `---[0.1294ms] demo.MathGame:primeFactors()
    `---[0.084025ms] demo.MathGame:print() #25

3.6 stack 命令——查看方法调用堆栈

有时不仅仅想知道当前方法的情况,也想知道是谁调用了当前方法,此时可以查看方法的调用堆栈。

其使用方法和 watch 及 trace 类似,均可使用条件表达式进行有选择性的查看调用堆栈,参数说明为:

参数说明

class-pattern: 类名表达式匹配

method-pattern: 方法名表达式匹配

condition-express: 条件表达式

[E]: 开启正则表达式匹配,默认为通配符匹配

[n:]: 执行次数限制

一个简单的示例:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
[arthas@70]$ stack com.kuaishou.kwaishop.merchant.janus.center.core.service.impl.brand.ShopBrandInternalServiceImpl getTradeMarks  -n 1
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 654 ms, listenerId: 3
ts=2023-05-05 14:40:05;thread_name=krpc-server-worker-25;id=515;is_daemon=true;priority=5;TCCL=org.springframework.boot.loader.LaunchedURLClassLoader@6433a2
@com.kuaishou.kwaishop.merchant.janus.center.core.service.impl.brand.ShopBrandInternalServiceImpl.getTradeMarks()
at com.kuaishou.kwaishop.merchant.janus.center.core.service.impl.brand.ShopBrandInternalServiceImpl.getShopBrandBaseByBrandIds(ShopBrandInternalServiceImpl.java:854)
at com.kuaishou.kwaishop.merchant.janus.center.core.krpc.brand.ShopBrandKRpcService.getShopBrandBaseByBrandIds(ShopBrandKRpcService.java:498)
at com.kuaishou.kwaishop.merchant.janus.center.core.krpc.brand.ShopBrandKRpcService$$FastClassBySpringCGLIB$$c4da3140.invoke(<generated>:-1)
at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:750)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:88)
......
Command execution times exceed limit: 1, so command will exit. You can set it with -n option.

3.7 monitor 命令——监控方法执行情况

这个命令可以实时监控方法的执行成功次数和执行失败率

监控项说明

timestamp: 时间戳

class: Java 类

method: 方法(构造方法、普通方法)

total: 调用次数

success: 成功次数

fail: 失败次数

avg-rt: 平均 RT

fail-rate: 失败率

使用 -c 10 限定统计周期为 10 秒,-n 2 表示只监控两个周期

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
[arthas@70]$ monitor com.kuaishou.kwaishop.merchant.janus.center.core.krpc.category.MerchantShopCategoryLimitKRpcService isUserPassCategoryLimit  -n 2  -c 10
Press Q or Ctrl+C to abort.
Affect(class count: 2 , method count: 2) cost in 1182 ms, listenerId: 5
timestamp class method total success fail avg-rt(ms) fail-rate
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2023-05-05 15:21:54 com.kuaishou.kwaishop.merchant.janus.center.core.krpc.cat isUserPassCategoryLimit 4 4 0 26.10 0.00%
egory.MerchantShopCategoryLimitKRpcService
2023-05-05 15:21:54 com.kuaishou.kwaishop.merchant.janus.center.core.krpc.cat isUserPassCategoryLimit 4 4 0 27.41 0.00%
egory.MerchantShopCategoryLimitKRpcService$$EnhancerBySpr
ingCGLIB$$5395b8d6

timestamp class method total success fail avg-rt(ms) fail-rate
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2023-05-05 15:22:04 com.kuaishou.kwaishop.merchant.janus.center.core.krpc.cat isUserPassCategoryLimit 2 2 0 18.29 0.00%
egory.MerchantShopCategoryLimitKRpcService
2023-05-05 15:22:04 com.kuaishou.kwaishop.merchant.janus.center.core.krpc.cat isUserPassCategoryLimit 2 2 0 18.65 0.00%
egory.MerchantShopCategoryLimitKRpcService$$EnhancerBySpr
ingCGLIB$$5395b8d6

3.8 tt 命令——方法执行数据的时空隧道

watch 和 trace 命令虽然很方便和灵活,但需要提前想清楚观察表达式,因为很多时候并不清楚问题在哪里,只能靠蛛丝马迹进行猜测。所以此时若能记录下当时方法调用的所有入参和返回值、抛出的异常等信息,会对整个问题的思考与判断非常有帮助。

参数说明

class-pattern: 类名表达式匹配

method-pattern: 方法名表达式匹配

condition-express: 条件表达式

[t]: 开启记录模式

[l]: 列出当前所有时间片记录

[s]: 对时间片记录进行搜索筛选

[i]: 对某一时间片查看详细调用信息

[p]: 重做调用,通常可配合 –replay-times 指定调用次数,通过 –replay-interval 指定多次调用间隔 (单位 ms, 默认 1000ms)

[w]: 观察时空隧道

[E]: 开启正则表达式匹配,默认为通配符匹配

[n:]: 执行次数限制

借用插件可以快捷生成命令:

下面是一个示例:

采集信息

下面采集了 ChatServiceImpl#chatNormalConversation 方法的 5 次调用

1
2
3
4
5
6
7
8
9
10
11
[arthas@71]$ tt -t com.kuaishou.kwaishop.merchant.answerize.center.core.service.ChatService chat -n 5
Press Q or Ctrl+C to abort.
Affect(class count: 2 , method count: 1) cost in 436 ms, listenerId: 75
INDEX TIMESTAMP COST(ms) IS-RET IS-EXP OBJECT CLASS METHOD
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1011 2023-05-05 16:15:52 1572.648716 true false 0x7469e76f ChatServiceImpl chat
1012 2023-05-05 16:16:20 8623.702255 true false 0x7469e76f ChatServiceImpl chat
1013 2023-05-05 16:16:54 3600.742563 true false 0x7469e76f ChatServiceImpl chat
1014 2023-05-05 16:18:53 6.255935 false true 0x7469e76f ChatServiceImpl chat
1015 2023-05-05 16:19:08 978.456102 true false 0x7469e76f ChatServiceImpl chat
Command execution times exceed limit: 5, so command will exit. You can set it with -n option.

其中 INDEX 为该时间片的唯一标识,IS-RET 为 true 表示该方法正常返回,IS-EXP 为 true 表示该方法抛出异常。

查看时间片信息

查看上面非正常返回的一次调用的信息

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
[arthas@71]$ tt -i 1014
INDEX 1014
GMT-CREATE 2023-05-05 16:18:53
COST(ms) 6.255935
OBJECT 0x7469e76f
CLASS com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl
METHOD chat
IS-RETURN false
IS-EXCEPTION true
PARAMETERS[0] @String[yanghao12]
PARAMETERS[1] @String[hello]
PARAMETERS[2] @String[invite]
THROW-EXCEPTION com.kuaishou.kwaishop.merchant.answerize.center.common.exception.AnswerizeBaseException: 目前没有对应领域存在
at com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl.chatDomainConversation(ChatServiceImpl.java:139)
at com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl.chat(ChatServiceImpl.java:61)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
......

使用 -w 对该时间片进行观察

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
[arthas@71]$ tt -w '{method.name,params,returnObj,throwExp}' -x 3 -i 1014
@ArrayList[
@String[chat],
@Object[][
@String[yanghao12],
@String[hello],
@String[invite],
],
null,
com.kuaishou.kwaishop.merchant.answerize.center.common.exception.AnswerizeBaseException: 目前没有对应领域存在
at com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl.chatDomainConversation(ChatServiceImpl.java:139)
at com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl.chat(ChatServiceImpl.java:61)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
......
,
]
Affect(row-cnt:1) cost in 3 ms.

重新调用,返回现场

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
[arthas@71]$ tt -p -i 1014
RE-INDEX 1014
GMT-REPLAY 2023-05-05 16:25:24
OBJECT 0x7469e76f
CLASS com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl
METHOD chat
PARAMETERS[0] @String[yanghao12]
PARAMETERS[1] @String[hello]
PARAMETERS[2] @String[invite]
IS-RETURN false
IS-EXCEPTION true
THROW-EXCEPTION com.kuaishou.kwaishop.merchant.answerize.center.common.exception.AnswerizeBaseException: 目前没有对应领域存在
at com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl.chatDomainConversation(ChatServiceImpl.java:139)
at com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl.chat(ChatServiceImpl.java:61)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
......

3.9 jvm 相关命令——查看当前 JVM 信息

dashboard——当前系统的实时数据面板,包含JVM线程、内存信息

jvm——查看当前 JVM 的信息

可以查看到JVM运行时信息、类加载器、垃圾收集、内存分布及占用、线程信息


sc——查看JVM已加载的类信息(-d 查看详细信息,-f 查看 field)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
[arthas@70]$ sc com.kuaishou.kwaishop.merchant.janus.center.core.config.kconf.KconfSupplier
com.kuaishou.kwaishop.merchant.janus.center.core.config.kconf.KconfSupplier
com.kuaishou.kwaishop.merchant.janus.center.core.config.kconf.MerchantInviteJsonConfigKey
com.kuaishou.kwaishop.merchant.janus.center.core.config.kconf.MerchantJanusBooleanConfigKey
com.kuaishou.kwaishop.merchant.janus.center.core.config.kconf.MerchantJanusObjectConfigKey
com.kuaishou.kwaishop.merchant.janus.center.core.config.kconf.MerchantJanusStringListConfigKey
com.kuaishou.kwaishop.merchant.janus.center.core.config.kconf.MerchantJanusTailNumberKey
Affect(row-cnt:6) cost in 143 ms.

[arthas@70]$ sc -d com.kuaishou.kwaishop.merchant.janus.center.core.config.kconf.MerchantJanusStringListConfigKey
class-info com.kuaishou.kwaishop.merchant.janus.center.core.config.kconf.MerchantJanusStringListConfigKey
code-source file:/home/web_server/kuaishou-runner/webapps/kwaishop-merchant-janus-center-core.jar!/BOOT-INF/classes!/
name com.kuaishou.kwaishop.merchant.janus.center.core.config.kconf.MerchantJanusStringListConfigKey
isInterface false
isAnnotation false
isEnum true
isAnonymousClass false
isArray false
isLocalClass false
isMemberClass false
isPrimitive false
isSynthetic false
simple-name MerchantJanusStringListConfigKey
modifier final,public
annotation
interfaces com.kuaishou.kwaishop.merchant.janus.center.core.config.kconf.KconfSupplier
super-class +-java.lang.Enum
+-java.lang.Object
class-loader +-org.springframework.boot.loader.LaunchedURLClassLoader@6433a2
+-jdk.internal.loader.ClassLoaders$AppClassLoader@55054057
+-jdk.internal.loader.ClassLoaders$PlatformClassLoader@352e4b6d
classLoaderHash 6433a2

Affect(row-cnt:1) cost in 90 ms.

sm——查看已加载类的方法信息(可使用 sm -d 查看方法详细信息)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
[arthas@71]$ sm com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl chat
com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl chat(Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;)Ljava/lang/String;
Affect(row-cnt:1) cost in 28 ms.

[arthas@71]$ sm com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl chat -d
declaring-class com.kuaishou.kwaishop.merchant.answerize.center.core.service.impl.ChatServiceImpl
method-name chat
modifier public
annotation
parameters java.lang.String
java.lang.String
java.lang.String
return java.lang.String
exceptions
classLoaderHash 749ab7b4

Affect(row-cnt:1) cost in 32 ms.

mc——内存编译器,内存编译.java文件为.class文件

retransform——加载外部的.class文件,retransform到JVM里

redefine——加载外部的.class文件,redefine到JVM里

dump——dump 已加载类的 byte code 到特定目录

classloader——查看classloader的继承树,urls,类加载信息,使用classloader去getResource

sysprop——查看和修改JVM的系统属性

sysenv——查看JVM的环境变量

vmoption——查看和修改JVM里诊断相关的option

perfcounter——查看当前 JVM 的Perf Counter信息

logger——查看和修改logger

getstatic——查看类的静态属性

ognl——执行ognl表达式

mbean——查看 Mbean 的信息

heapdump——dump java heap, 类似jmap命令的heap dump功能

vmtool——从jvm里查询对象,执行forceGc

3.9 手动调用方法

Karthas 中集成了 MVEL,MVEL 是一种基于 Java 的表达式语言(expression language),本质上就是对 Java 反射的抽象。同时内部已写好函数并自动加载 bean:

1
getBeanByName = def (name) { com.kuaishou.framework.spring.BeanFactory.getBean(name) }

spring bean 方法

可通过以下的方式,来对 spring bean 进行直接方法调用:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
[arthas@70]$ ShopBrandService.getByShop(5004175295, 0)
@ArrayList[
@ShopBrand[
shopId=@Long[5004175295],
brandId=@Long[15131],
authType=@Integer[4],
brandType=@Integer[2],
status=@Integer[1],
validDate=@Long[1704009221536],
upgradingFlag=@Integer[0],
id=@Long[1220270],
createTime=@Long[1683193994173],
updateTime=@Long[1683193994173],
],
@ShopBrand[
shopId=@Long[5004175295],
brandId=@Long[346976],
authType=@Integer[4],
brandType=@Integer[2],
status=@Integer[1],
validDate=@Long[1704009352828],
upgradingFlag=@Integer[0],
id=@Long[1220271],
createTime=@Long[1683193994209],
updateTime=@Long[1683193994209],
],
]

类静态方法

使用全类名+方法进行调用

1
2
3
4
5
6
7
[arthas@71]$ GenerateIdUtil.generateId()
Failed to get static, exception message: [Error: could not access: GenerateIdUtil; in class: java.util.HashMap]
[Near : {... GenerateIdUtil.generateId() ....}]
^
[Line: 1, Column: 1], please check $HOME/logs/arthas/arthas.log for more details.
[arthas@71]$ com.kuaishou.kwaishop.merchant.answerize.center.common.util.GenerateIdUtil.generateId()
"483040520382451800"

非 spring bean 的非静态方法

构造对象后进行调用

1
2
3
4
5
6
[arthas@71]$ chatCompletionDTO = new com.kuaishou.kwaishop.merchant.answerize.center.common.dto.chat.ChatCompletionDTO()
{}
[arthas@71]$ chatCompletionDTO.tokens()
0
[arthas@71]$ chatCompletionDTO.maxToken()
4096

3.10 retransform动态加载指定类

jad 反编译文件class -> java

mc 在把java编译为class

retransform 把指定的class动态加载到jvm中

1
2
3
4
5
6
7
8
9
10
11
## 将class文件反编译为java文件
jad --source-only com.example.demo.arthas.user.UserController > /tmp/UserController.java

## 查看当前的类加载器,spring项目通常是LaunchedURLClassLoader
classloader

## 根据指定的类加载器,编译java文件
mc --classLoaderClass org.springframework.boot.loader.LaunchedURLClassLoader /tmp/BranchTeamService.java -d /tmp

## https://arthas.aliyun.com/doc/retransform.html
retransform /tmp/com/example/demo/arthas/user/UserController.class

需要注意jad不保证完成成果反编译class https://github.com/alibaba/arthas/issues/1158

第一个更加推荐使用本地的java文件修改并上传到服务上(yum install lrzsz 安装sz和rz)

然后通过mc在java文件编译为class文件

3.11 异步后台执行,并把日志写入指定文件

1
watch com.kuaishou.kdev.common.service.repo.RepoCommitService getCommitsPage '{params,returnObj,throwExp}'  -n 5  -x 3 'throwExp != null' > /tmp/log.txt &
  • throwExp != null发生异常才记录
  • > 写入文件
  • & 后台异步执行

这个时候使用quit命令 退出arthas控制台,让arthas 服务继续运行(stop 会停止arthas 服务),等一段时间之后就可以来查看这个文件是否有异常信息进行相关的bug诊断。

3.12 获取类的静态属性

1、通过getstatic

1
getstatic com.kuaishou.kdev.common.service.RepoService X_NEXT_PAGE -x 3

2、通过ognl

1
ognl --classLoaderClass org.springframework.boot.loader.LaunchedURLClassLoader -x 3 '@com.kuaishou.kdev.common.service.RepoService@X_NEXT_PAGE'

3.13 使用tt获取spring context上下文(获取spring bean)

1、原理

  • SpringMVC 的请求会通过 RequestMappingHandlerAdapter 执行 invokeHandlerMethod 到达目标接口上进行处理
  • 而在 RequestMappingHandlerAdapter 类中有 getApplicationContext()

2、执行过程

  • arthas 执行 tt,任意执行一次 web 请求,tt 即可捕获
1
tt -t org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter invokeHandlerMethod
  • 根据目标的索引,执行自定义 OGNL 表达式即可
1
tt -i 1000 -w 'target.getApplicationContext().getBean("targetBean").targetMethod()'

3、一个可执行的命令demo

  • 方法定义public UserDetailVO getSingleUserDetail(String userName){}
  • 类定义 public class PermissionUserServiceImpl extends ServiceImpl<PermissionUserMapper, PermissionUser> implements IPermissionUserService
  • tt执行命令
1
2
3
4
5
## 固定命令
tt -t org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter invokeHandlerMethod

## 根据不同的bean和method而改变
tt -i 1109 -w 'target.getApplicationContext().getBean("permissionUserServiceImpl").getSingleUserDetail("xieshijie")'
  • 注意事项
    • getBean的参数,需要根据类名来获取,把首字母小写
    • 方法有参数需要指定具体指,如果是字符串,则需要用双引号

参考文档

arthas官方文档

arthas idea plugin 使用文档

arthas的一些高级使用用法

Arthas实践–获取到Spring Context


arthas使用指南
http://coder-xieshijie.cn/2023/06/01/其他技能/问题定位/arthas使用指南/
作者
谢世杰
发布于
2023年6月1日
更新于
2023年9月18日
许可协议