侧边栏壁纸
  • 累计撰写 7 篇文章
  • 累计创建 7 个标签
  • 累计收到 0 条评论

目 录CONTENT

文章目录

通过jstack排查死锁、CPU占用高等问题

小白
2022-08-12 / 0 评论 / 0 点赞 / 166 阅读 / 28878 字

前言

jstackJDK的一个查看Java线程堆栈的工具,可以看到线程堆栈了自然用处就很多了,就像我们平时发生了异常需要打印堆栈出来看一样,可以很方便的找到相关代码的问题。

死循环、死锁等问题也许在普通的程序中不常见但是还是要得会基本的使用的,因为我觉得他是排查线上问题的终极手段(也许远程debug也算一个?)。并且也这个功能也算是比较常用的工具,我们不能不了解它,他也不一定排查死锁问题,它会显示线程的状态,用来比如分析性能程序性能瓶颈也是很方便的。之所以来写这篇笔记是因为在公司里面遇到一个项目在启动的时候死锁导致Web服务器没有启动起来,导致我们应用也没有启动。表面上看起来就是什么日志都没有,就是没启动,要是不会看线程快照那不就束手无策了吗,因为死锁问题还没有遇到过,之前用jstack学习排查过死循环,今天就顺便一起记一篇笔记然后开始吧。

我们直接从问题着手,通过问题再去排查代码。

服务卡住了一直不执行

一个接口卡住了一直不响应,也没有日志打印了,就像突然挂了一样。

这个时候按照平时在本地就是打个断点调一下,看看走到哪里卡住了有问题,在现实虽然说也能远程debug,但是线上一般不这么做,因为通过线程堆栈也能看到线程的执行情况那就是jstack命令。

格式

Usage:
    jstack [-l][-e] <pid>
        (to connect to running process)
Options:
    -l  除堆栈外,显示关于锁的附加信息,在发生死锁时可以用jstack -l pid来观察锁持有情况
    -e  打印有关线程的附加信息

我们看线程堆栈不用附加选项指定pid就可以了

~ ps -ef| grep java
username    147    92  0 23:35 pts/0    00:00:00 java -jar study-1.0-SNAPSHOT-jar-with-dependencies.jar
username    238   171  0 23:56 pts/1    00:00:00 grep --color=auto java

这样我们看java进程那个是我们的程序的进程,然后username后面那个就是pid

当然其实还可以用JDKpsjps

~ jps
147 study-1.0-SNAPSHOT-jar-with-dependencies.jar
239 Jps

可以看到结果是一样的pid是147

然后执行 jstack 147 得到结果

2022-08-11 23:59:57
Full thread dump OpenJDK 64-Bit Server VM (17.0.4+8-Ubuntu-120.04 mixed mode, sharing):
Threads class SMR info:
javathread_list=0x00007ff4ec001da0, length=14, elements={
0x00007ff53818f650, 0x00007ff538190a30, 0x00007ff5381985d0, 0x00007ff538199980,
0x00007ff53819ad90, 0x00007ff53819c7c0, 0x00007ff5381a5ce0, 0x00007ff5381a7150,
0x00007ff5381ae770, 0x00007ff5381b1d00, 0x00007ff5381ea970, 0x00007ff5381eb890,
0x00007ff538011da0, 0x00007ff4ec000df0
}
"Reference Handler" #2 daemon prio=10 os_prio=0 cpu=0.07ms elapsed=1485.36s tid=0x00007ff53818f650 nid=0x9b waiting on condition  [0x00007ff51422b000]
   java.lang.Thread.State: RUNNABLE
        at java.lang.ref.Reference.waitForReferencePendingList(java.base@17.0.4/Native Method)
        at java.lang.ref.Reference.processPendingReferences(java.base@17.0.4/Reference.java:253)
        at java.lang.ref.Reference$ReferenceHandler.run(java.base@17.0.4/Reference.java:215)
"Finalizer" #3 daemon prio=8 os_prio=0 cpu=0.13ms elapsed=1485.36s tid=0x00007ff538190a30 nid=0x9c in Object.wait()  [0x00007ff51412a000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(java.base@17.0.4/Native Method)
        - waiting on <0x0000000716202fa0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(java.base@17.0.4/ReferenceQueue.java:155)
        - locked <0x0000000716202fa0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(java.base@17.0.4/ReferenceQueue.java:176)
        at java.lang.ref.Finalizer$FinalizerThread.run(java.base@17.0.4/Finalizer.java:172)
"Signal Dispatcher" #4 daemon prio=9 os_prio=0 cpu=0.53ms elapsed=1485.36s tid=0x00007ff5381985d0 nid=0x9d waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
"Service Thread" #5 daemon prio=9 os_prio=0 cpu=0.07ms elapsed=1485.36s tid=0x00007ff538199980 nid=0x9e runnable  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
"Monitor Deflation Thread" #6 daemon prio=9 os_prio=0 cpu=14.99ms elapsed=1485.36s tid=0x00007ff53819ad90 nid=0x9f runnable  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" #7 daemon prio=9 os_prio=0 cpu=4.78ms elapsed=1485.36s tid=0x00007ff53819c7c0 nid=0xa0 waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
   No compile task
"C1 CompilerThread0" #15 daemon prio=9 os_prio=0 cpu=4.59ms elapsed=1485.36s tid=0x00007ff5381a5ce0 nid=0xa1 waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
   No compile task
"Sweeper thread" #19 daemon prio=9 os_prio=0 cpu=0.05ms elapsed=1485.36s tid=0x00007ff5381a7150 nid=0xa2 runnable  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
"Notification Thread" #20 daemon prio=9 os_prio=0 cpu=0.05ms elapsed=1485.36s tid=0x00007ff5381ae770 nid=0xa3 runnable  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
"Common-Cleaner" #21 daemon prio=8 os_prio=0 cpu=0.77ms elapsed=1485.36s tid=0x00007ff5381b1d00 nid=0xa5 in Object.wait()  [0x00007ff4f739c000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(java.base@17.0.4/Native Method)
        - waiting on <0x00000007162180a8> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(java.base@17.0.4/ReferenceQueue.java:155)
        - locked <0x00000007162180a8> (a java.lang.ref.ReferenceQueue$Lock)
        at jdk.internal.ref.CleanerImpl.run(java.base@17.0.4/CleanerImpl.java:140)
        at java.lang.Thread.run(java.base@17.0.4/Thread.java:833)
        at jdk.internal.misc.InnocuousThread.run(java.base@17.0.4/InnocuousThread.java:162)
"Thread-0" #22 prio=5 os_prio=0 cpu=15.84ms elapsed=1485.34s tid=0x00007ff5381ea970 nid=0xa6 waiting for monitor entry  [0x00007ff4f729b000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at ren.qita.study.deadlock.ThreadA.run(ThreadA.java:22)
        - waiting to lock <0x0000000716226cb8> (a ren.qita.study.deadlock.ThreadA)
        - locked <0x0000000716228a50> (a ren.qita.study.deadlock.ThreadB)
"Thread-1" #23 prio=5 os_prio=0 cpu=16.21ms elapsed=1485.34s tid=0x00007ff5381eb890 nid=0xa7 waiting for monitor entry  [0x00007ff4f719a000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at ren.qita.study.deadlock.ThreadB.run(ThreadB.java:22)
        - waiting to lock <0x0000000716228a50> (a ren.qita.study.deadlock.ThreadB)
        - locked <0x0000000716226cb8> (a ren.qita.study.deadlock.ThreadA)
"DestroyJavaVM" #24 prio=5 os_prio=0 cpu=15.74ms elapsed=1485.34s tid=0x00007ff538011da0 nid=0x94 waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
"Attach Listener" #25 daemon prio=9 os_prio=0 cpu=0.32ms elapsed=1453.98s tid=0x00007ff4ec000df0 nid=0xcf waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
"VM Thread" os_prio=0 cpu=16.47ms elapsed=1485.37s tid=0x00007ff53818b730 nid=0x9a runnable
"GC Thread#0" os_prio=0 cpu=0.15ms elapsed=1485.37s tid=0x00007ff53807e5b0 nid=0x95 runnable
"G1 Main Marker" os_prio=0 cpu=0.07ms elapsed=1485.37s tid=0x00007ff53808f520 nid=0x96 runnable
"G1 Conc#0" os_prio=0 cpu=0.04ms elapsed=1485.37s tid=0x00007ff538090480 nid=0x97 runnable
"G1 Refine#0" os_prio=0 cpu=0.05ms elapsed=1485.37s tid=0x00007ff53815a7e0 nid=0x98 runnable
"G1 Service" os_prio=0 cpu=86.47ms elapsed=1485.37s tid=0x00007ff53815b6d0 nid=0x99 runnable
"VM Periodic Task Thread" os_prio=0 cpu=302.19ms elapsed=1485.36s tid=0x00007ff5381b00b0 nid=0xa4 waiting on condition
JNI global refs: 17, weak refs: 0
Found one Java-level deadlock:
=============================
"Thread-0":
  waiting to lock monitor 0x00007ff4c4001bf0 (object 0x0000000716226cb8, a ren.qita.study.deadlock.ThreadA),
  which is held by "Thread-1"
"Thread-1":
  waiting to lock monitor 0x00007ff4c4000e60 (object 0x0000000716228a50, a ren.qita.study.deadlock.ThreadB),
  which is held by "Thread-0"
Java stack information for the threads listed above:
===================================================
"Thread-0":
        at ren.qita.study.deadlock.ThreadA.run(ThreadA.java:22)
        - waiting to lock <0x0000000716226cb8> (a ren.qita.study.deadlock.ThreadA)
        - locked <0x0000000716228a50> (a ren.qita.study.deadlock.ThreadB)
"Thread-1":
        at ren.qita.study.deadlock.ThreadB.run(ThreadB.java:22)
        - waiting to lock <0x0000000716228a50> (a ren.qita.study.deadlock.ThreadB)
        - locked <0x0000000716226cb8> (a ren.qita.study.deadlock.ThreadA)
Found 1 deadlock.

然后惊奇的发现他打印了一句话Found one Java-level deadlock难道这是帮我们找到了死锁???这么方便?答案是是的,它帮我们找到了死锁就是这么方便。

然后我们看

Thread-0 是在waiting to lock monitor状态也就是在等待锁定监视器,也就是在等待一个同步区然后下面再告诉我们which is held by "Thread-1"Thread-1持有,也就是说在等待Thread-1释放掉之后它才执行。然后下面就打印了Thread-1的状态它也在waiting to lock monitor 并且说which is held by "Thread-0"意思是不是就是说他也在等待进入同步区,但是目前Thread-0在持有这个锁,这样就是Thread-1在等Thread-0释放,然后Thread-0在等Thread-1释放,就是死锁了。

然后下面就要看具体这两个线程的堆栈来根据代码来排查了

然后我们找到Thread-0Thread-1的堆栈,就在前面打印的

"Thread-0" #22 prio=5 os_prio=0 cpu=15.84ms elapsed=1485.34s tid=0x00007ff5381ea970 nid=0xa6 waiting for monitor entry  [0x00007ff4f729b000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at ren.qita.study.deadlock.ThreadA.run(ThreadA.java:22)
        - waiting to lock <0x0000000716226cb8> (a ren.qita.study.deadlock.ThreadA)
        - locked <0x0000000716228a50> (a ren.qita.study.deadlock.ThreadB)
"Thread-1" #23 prio=5 os_prio=0 cpu=16.21ms elapsed=1485.34s tid=0x00007ff5381eb890 nid=0xa7 waiting for monitor entry  [0x00007ff4f719a000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at ren.qita.study.deadlock.ThreadB.run(ThreadB.java:22)
        - waiting to lock <0x0000000716228a50> (a ren.qita.study.deadlock.ThreadB)
        - locked <0x0000000716226cb8> (a ren.qita.study.deadlock.ThreadA)

刚才我们就知道它阻塞住了,这个时候我们只需要看它阻塞在哪里了

看起来Thread-0是阻塞在ThreadA.java:22而Thread-1阻塞在ThreadB.java:22,这个时候打开代码找到对应的行

主程序:

package ren.qita.study.deadlock;

public class BootStrapClass {
    public static void main(String[] args) {
        ThreadA threadA = new ThreadA();
        ThreadB threadB = new ThreadB();
        threadA.setThreadB(threadB);
        threadB.setThreadA(threadA);
        threadA.start();
        threadB.start();
    }
}

ThreadA:

package ren.qita.study.deadlock;

import java.util.concurrent.TimeUnit;

public class ThreadA extends Thread {
    private ThreadB threadB;

    public void setThreadB(ThreadB threadB) {
        this.threadB = threadB;
    }

    @Override
    public void run() {
        synchronized (threadB) {
            try {
                System.out.println("ThreadA:Good morning!");
                TimeUnit.SECONDS.sleep(3);
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
            synchronized (this) {
                System.out.println("ThreadA:Good night!");
            }
        }
    }
}

我们看到22行是一行打印语句,其实不是这一行是前面一行synchronized (this)这个地方阻塞住了它这个地方是在等待this对象也就是ThreadB的一个对象进入同步代码块刚才分析死锁日志看到了是它已经获取到了一个同步锁所以再等待释放,所以我们就要看它在什么地方还取得了同步锁,死锁日志也说了是另外一个线程那我们看下面他的代码

ThreadB:

package ren.qita.study.deadlock;

import java.util.concurrent.TimeUnit;

public class ThreadB extends Thread {
    private ThreadA threadA;

    public void setThreadA(ThreadA threadA) {
        this.threadA = threadA;
    }

    @Override
    public void run() {
        synchronized (threadA) {
            try {
                System.out.println("ThreadB:Good morning!");
                TimeUnit.SECONDS.sleep(3);
            } catch (InterruptedException e) {
                throw new RuntimeException(e);
            }
            synchronized (this) {
                System.out.println("ThreadB:Good night!");
            }
        }
    }
}

我们发现另外一个线程也在等待锁,在等待this对象也就是ThreadB的对象获取同步锁,但是前面看几行代码就能看到它以及锁住了ThreadA的对象,然后要再锁住ThreadB的对象,但是ThreadA的代码里面ThreadB已经得到了一把同步锁,所以这样就卡住了,这谁写的代码太垃圾了哈哈哈哈哈哈哈。演示代码比较简单现实中肯定还是需要找一会哪里锁住了什么时机下会锁住,最终导致死锁。当然jstack的工具用法就是这样的了。我们就这样排查出了一个死锁,并通过代码找到了代码的bug,大功告成了!

java进程占用大量CPU

上线之后线上服务器的CPU突然增高都是JAVA占用了,怎么办~这个在这里一起来写其实是都可以借助jstack来排查,我们直接进入正题。

执行top命令

得到结果如下

top - 14:10:41 up 11 min,  0 users,  load average: 0.89, 0.55, 0.28
Tasks:   9 total,   1 running,   8 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.4 us,  5.3 sy,  0.0 ni, 93.6 id,  0.0 wa,  0.0 hi,  0.7 si,  0.0 st
MiB Mem :  15955.3 total,  15316.7 free,    493.5 used,    145.1 buff/cache
MiB Swap:   4096.0 total,   4096.0 free,      0.0 used.  15228.4 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  236 wangwei   20   0 6975964  36960  25344 S 100.0   0.2   0:12.47 java
    9 root      20   0    2112    352      0 S   2.3   0.0   0:09.13 init
    1 root      20   0    1756   1072   1020 S   0.0   0.0   0:00.00 init
    8 root      20   0    2112    344      0 S   0.0   0.0   0:00.00 init
   10 user   20   0   10304   5488   3540 S   0.0   0.0   0:00.03 bash
  149 root      20   0    2112    344      0 S   0.0   0.0   0:00.00 init
  150 root      20   0    2112    352      0 S   0.0   0.0   0:00.00 init
  151 user   20   0   10172   5224   3424 S   0.0   0.0   0:00.06 bash
  213 user   20   0   10872   3628   3112 R   0.0   0.0   0:00.02 top

看到大量CPU被java进程占用着,那么我们用jstack来看看

~$ jstack 236
2022-08-25 14:14:28
Full thread dump OpenJDK 64-Bit Server VM (17.0.4+8-Ubuntu-120.04 mixed mode, sharing):

Threads class SMR info:
_java_thread_list=0x00007f31ec001ee0, length=12, elements={
0x00007f3238011da0, 0x00007f323818f650, 0x00007f3238190a30, 0x00007f32381985d0,
0x00007f3238199980, 0x00007f323819ad90, 0x00007f323819c7c0, 0x00007f32381a5ce0,
0x00007f32381a7150, 0x00007f32381ae770, 0x00007f32381b1d00, 0x00007f31ec000f30
}

"main" #1 prio=5 os_prio=0 cpu=41181.70ms elapsed=41.19s tid=0x00007f3238011da0 nid=0xed runnable  [0x00007f323ec6b000]
   java.lang.Thread.State: RUNNABLE
        at ren.qita.study.infiniteloop.LoopTest.main(LoopTest.java:8)

"Reference Handler" #2 daemon prio=10 os_prio=0 cpu=0.09ms elapsed=41.18s tid=0x00007f323818f650 nid=0xf4 waiting on condition  [0x00007f321838e000]
   java.lang.Thread.State: RUNNABLE
        at java.lang.ref.Reference.waitForReferencePendingList(java.base@17.0.4/Native Method)
        at java.lang.ref.Reference.processPendingReferences(java.base@17.0.4/Reference.java:253)
        at java.lang.ref.Reference$ReferenceHandler.run(java.base@17.0.4/Reference.java:215)

"Finalizer" #3 daemon prio=8 os_prio=0 cpu=0.15ms elapsed=41.18s tid=0x00007f3238190a30 nid=0xf5 in Object.wait()  [0x00007f321828d000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(java.base@17.0.4/Native Method)
        - waiting on <0x0000000716202fa0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(java.base@17.0.4/ReferenceQueue.java:155)
        - locked <0x0000000716202fa0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(java.base@17.0.4/ReferenceQueue.java:176)
        at java.lang.ref.Finalizer$FinalizerThread.run(java.base@17.0.4/Finalizer.java:172)

"Signal Dispatcher" #4 daemon prio=9 os_prio=0 cpu=0.31ms elapsed=41.18s tid=0x00007f32381985d0 nid=0xf6 waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Service Thread" #5 daemon prio=9 os_prio=0 cpu=0.06ms elapsed=41.18s tid=0x00007f3238199980 nid=0xf7 runnable  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Monitor Deflation Thread" #6 daemon prio=9 os_prio=0 cpu=0.39ms elapsed=41.18s tid=0x00007f323819ad90 nid=0xf8 runnable  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" #7 daemon prio=9 os_prio=0 cpu=2.83ms elapsed=41.18s tid=0x00007f323819c7c0 nid=0xf9 waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
   No compile task

"C1 CompilerThread0" #15 daemon prio=9 os_prio=0 cpu=1.83ms elapsed=41.18s tid=0x00007f32381a5ce0 nid=0xfa waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
   No compile task

"Sweeper thread" #19 daemon prio=9 os_prio=0 cpu=0.06ms elapsed=41.18s tid=0x00007f32381a7150 nid=0xfb runnable  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Notification Thread" #20 daemon prio=9 os_prio=0 cpu=0.06ms elapsed=41.17s tid=0x00007f32381ae770 nid=0xfc runnable  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Common-Cleaner" #21 daemon prio=8 os_prio=0 cpu=0.09ms elapsed=41.17s tid=0x00007f32381b1d00 nid=0xfe in Object.wait()  [0x00007f3208cdd000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(java.base@17.0.4/Native Method)
        - waiting on <0x00000007162180a8> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(java.base@17.0.4/ReferenceQueue.java:155)
        - locked <0x00000007162180a8> (a java.lang.ref.ReferenceQueue$Lock)
        at jdk.internal.ref.CleanerImpl.run(java.base@17.0.4/CleanerImpl.java:140)
        at java.lang.Thread.run(java.base@17.0.4/Thread.java:833)
        at jdk.internal.misc.InnocuousThread.run(java.base@17.0.4/InnocuousThread.java:162)

"Attach Listener" #22 daemon prio=9 os_prio=0 cpu=0.19ms elapsed=0.10s tid=0x00007f31ec000f30 nid=0x115 waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"VM Thread" os_prio=0 cpu=0.58ms elapsed=41.18s tid=0x00007f323818b730 nid=0xf3 runnable

"GC Thread#0" os_prio=0 cpu=0.09ms elapsed=41.19s tid=0x00007f323807e5b0 nid=0xee runnable

"G1 Main Marker" os_prio=0 cpu=0.07ms elapsed=41.19s tid=0x00007f323808f520 nid=0xef runnable

"G1 Conc#0" os_prio=0 cpu=0.04ms elapsed=41.19s tid=0x00007f3238090480 nid=0xf0 runnable

"G1 Refine#0" os_prio=0 cpu=0.06ms elapsed=41.18s tid=0x00007f323815a7e0 nid=0xf1 runnable

"G1 Service" os_prio=0 cpu=1.91ms elapsed=41.18s tid=0x00007f323815b6d0 nid=0xf2 runnable

"VM Periodic Task Thread" os_prio=0 cpu=6.65ms elapsed=41.17s tid=0x00007f32381b00b0 nid=0xfd waiting on condition

JNI global refs: 17, weak refs: 0

我们看到了java的各个线程在干嘛,但是我们不知道具体是那个线程占用了大量CPU(这里测试程序很简单估计你都才出来了),那我们需要借助linux的top -Hp 236来查看线程的暂用情况,将236换成具体java进程的pid就对了

结果如下

~$ top -Hp 236
Threads:  20 total,   1 running,  19 sleeping,   0 stopped,   0 zombie
%Cpu(s):  6.2 us,  0.0 sy,  0.0 ni, 93.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  15955.3 total,  15490.2 free,    319.7 used,    145.4 buff/cache
MiB Swap:   4096.0 total,   4096.0 free,      0.0 used.  15402.1 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  237 wangwei   20   0 7042528  43076  25344 R  99.9   0.3   3:47.24 java
  236 wangwei   20   0 7042528  43076  25344 S   0.0   0.3   0:00.00 java
  238 wangwei   20   0 7042528  43076  25344 S   0.0   0.3   0:00.00 GC Thread#0
  239 wangwei   20   0 7042528  43076  25344 S   0.0   0.3   0:00.00 G1 Main Marker
  240 wangwei   20   0 7042528  43076  25344 S   0.0   0.3   0:00.00 G1 Conc#0
  241 wangwei   20   0 7042528  43076  25344 S   0.0   0.3   0:00.00 G1 Refine#0
  242 wangwei   20   0 7042528  43076  25344 S   0.0   0.3   0:00.01 G1 Service
  243 wangwei   20   0 7042528  43076  25344 S   0.0   0.3   0:00.00 VM Thread
  244 wangwei   20   0 7042528  43076  25344 S   0.0   0.3   0:00.00 Reference Handl
  245 wangwei   20   0 7042528  43076  25344 S   0.0   0.3   0:00.00 Finalizer
  246 wangwei   20   0 7042528  43076  25344 S   0.0   0.3   0:00.00 Signal Dispatch
  247 wangwei   20   0 7042528  43076  25344 S   0.0   0.3   0:00.00 Service Thread
  248 wangwei   20   0 7042528  43076  25344 S   0.0   0.3   0:00.00 Monitor Deflati
  249 wangwei   20   0 7042528  43076  25344 S   0.0   0.3   0:00.00 C2 CompilerThre
  250 wangwei   20   0 7042528  43076  25344 S   0.0   0.3   0:00.00 C1 CompilerThre
  251 wangwei   20   0 7042528  43076  25344 S   0.0   0.3   0:00.00 Sweeper thread
  252 wangwei   20   0 7042528  43076  25344 S   0.0   0.3   0:00.00 Notification Th
  253 wangwei   20   0 7042528  43076  25344 S   0.0   0.3   0:00.03 VM Periodic Tas
  254 wangwei   20   0 7042528  43076  25344 S   0.0   0.3   0:00.00 Common-Cleaner
  277 wangwei   20   0 7042528  43076  25344 S   0.0   0.3   0:00.00 Attach Listener

我们可以看到是pid为237这个线程占用了大量CPU

然后继续看我们刚才jstack的结果结合起来jstack里面的nid就是top -Hp 236pid的十六进制表示。

我们把237换成16进制结果为ED,那我们看jstack的结果发现是

~$ jstack 236
2022-08-25 14:14:28
Full thread dump OpenJDK 64-Bit Server VM (17.0.4+8-Ubuntu-120.04 mixed mode, sharing):

Threads class SMR info:
_java_thread_list=0x00007f31ec001ee0, length=12, elements={
0x00007f3238011da0, 0x00007f323818f650, 0x00007f3238190a30, 0x00007f32381985d0,
0x00007f3238199980, 0x00007f323819ad90, 0x00007f323819c7c0, 0x00007f32381a5ce0,
0x00007f32381a7150, 0x00007f32381ae770, 0x00007f32381b1d00, 0x00007f31ec000f30
}

"main" #1 prio=5 os_prio=0 cpu=41181.70ms elapsed=41.19s tid=0x00007f3238011da0 nid=0xed runnable  [0x00007f323ec6b000]
   java.lang.Thread.State: RUNNABLE
        at ren.qita.study.infiniteloop.LoopTest.main(LoopTest.java:8)

"Reference Handler" #2 daemon prio=10 os_prio=0 cpu=0.09ms elapsed=41.18s tid=0x00007f323818f650 nid=0xf4 waiting on condition  [0x00007f321838e000]
   java.lang.Thread.State: RUNNABLE
        at java.lang.ref.Reference.waitForReferencePendingList(java.base@17.0.4/Native Method)
        at java.lang.ref.Reference.processPendingReferences(java.base@17.0.4/Reference.java:253)
        at java.lang.ref.Reference$ReferenceHandler.run(java.base@17.0.4/Reference.java:215)

"Finalizer" #3 daemon prio=8 os_prio=0 cpu=0.15ms elapsed=41.18s tid=0x00007f3238190a30 nid=0xf5 in Object.wait()  [0x00007f321828d000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(java.base@17.0.4/Native Method)
        - waiting on <0x0000000716202fa0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(java.base@17.0.4/ReferenceQueue.java:155)
        - locked <0x0000000716202fa0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(java.base@17.0.4/ReferenceQueue.java:176)
        at java.lang.ref.Finalizer$FinalizerThread.run(java.base@17.0.4/Finalizer.java:172)

"Signal Dispatcher" #4 daemon prio=9 os_prio=0 cpu=0.31ms elapsed=41.18s tid=0x00007f32381985d0 nid=0xf6 waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Service Thread" #5 daemon prio=9 os_prio=0 cpu=0.06ms elapsed=41.18s tid=0x00007f3238199980 nid=0xf7 runnable  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Monitor Deflation Thread" #6 daemon prio=9 os_prio=0 cpu=0.39ms elapsed=41.18s tid=0x00007f323819ad90 nid=0xf8 runnable  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" #7 daemon prio=9 os_prio=0 cpu=2.83ms elapsed=41.18s tid=0x00007f323819c7c0 nid=0xf9 waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
   No compile task

"C1 CompilerThread0" #15 daemon prio=9 os_prio=0 cpu=1.83ms elapsed=41.18s tid=0x00007f32381a5ce0 nid=0xfa waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
   No compile task

"Sweeper thread" #19 daemon prio=9 os_prio=0 cpu=0.06ms elapsed=41.18s tid=0x00007f32381a7150 nid=0xfb runnable  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Notification Thread" #20 daemon prio=9 os_prio=0 cpu=0.06ms elapsed=41.17s tid=0x00007f32381ae770 nid=0xfc runnable  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Common-Cleaner" #21 daemon prio=8 os_prio=0 cpu=0.09ms elapsed=41.17s tid=0x00007f32381b1d00 nid=0xfe in Object.wait()  [0x00007f3208cdd000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(java.base@17.0.4/Native Method)
        - waiting on <0x00000007162180a8> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(java.base@17.0.4/ReferenceQueue.java:155)
        - locked <0x00000007162180a8> (a java.lang.ref.ReferenceQueue$Lock)
        at jdk.internal.ref.CleanerImpl.run(java.base@17.0.4/CleanerImpl.java:140)
        at java.lang.Thread.run(java.base@17.0.4/Thread.java:833)
        at jdk.internal.misc.InnocuousThread.run(java.base@17.0.4/InnocuousThread.java:162)

"Attach Listener" #22 daemon prio=9 os_prio=0 cpu=0.19ms elapsed=0.10s tid=0x00007f31ec000f30 nid=0x115 waiting on condition  [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"VM Thread" os_prio=0 cpu=0.58ms elapsed=41.18s tid=0x00007f323818b730 nid=0xf3 runnable

"GC Thread#0" os_prio=0 cpu=0.09ms elapsed=41.19s tid=0x00007f323807e5b0 nid=0xee runnable

"G1 Main Marker" os_prio=0 cpu=0.07ms elapsed=41.19s tid=0x00007f323808f520 nid=0xef runnable

"G1 Conc#0" os_prio=0 cpu=0.04ms elapsed=41.19s tid=0x00007f3238090480 nid=0xf0 runnable

"G1 Refine#0" os_prio=0 cpu=0.06ms elapsed=41.18s tid=0x00007f323815a7e0 nid=0xf1 runnable

"G1 Service" os_prio=0 cpu=1.91ms elapsed=41.18s tid=0x00007f323815b6d0 nid=0xf2 runnable

"VM Periodic Task Thread" os_prio=0 cpu=6.65ms elapsed=41.17s tid=0x00007f32381b00b0 nid=0xfd waiting on condition

JNI global refs: 17, weak refs: 0

我们发现是主线程占用了大量CPU,并且我们多次执行jstack 236看它几乎都在执行那几行代码我们去找到代码看看有什么问题

下面就是分析代码时间了

这太明显了,这是一个死循环啊,好的接下来改掉代码就好了。

总结

上面将的两个例子都是通过jstack分析线程的堆栈来确定问题的,核心就是这个,能看到服务器在跑哪一行代码了,那我们就相当于知道了程序运行到哪一行代码了,再根据代码来分析找出原因。比如也还可以用来做瓶颈分析,看到大量线程在等待文件IO,那么可能就是硬盘速度造成了瓶颈,如果大量线程在等待网络IO那么网络可能造成了瓶颈。

那么就结束了。

0

评论区