[zeromicro/go-zero]使用durationLogger打印日志,崩溃

2024-01-10 535 views
6

部分调用栈信息如下:
at /usr/local/go/src/runtime/signal_unix.go:846 7 0x0000000000437a4b in runtime.fatalpanic at /usr/local/go/src/runtime/panic.go:1216 8 0x00000000004373b0 in runtime.gopanic at /usr/local/go/src/runtime/panic.go:1064 9 0x000000000044d013 in runtime.panicmem at /usr/local/go/src/runtime/panic.go:212 10 0x000000000044d013 in runtime.sigpanic at /usr/local/go/src/runtime/signal_unix.go:720 11 0x00000000005e7ee6 in log/syslog.(Writer).Write at /usr/local/go/src/log/syslog/syslog.go:175 12 0x000000000051c6ab in encoding/json.(Encoder).Encode at /usr/local/go/src/encoding/json/stream.go:231 13 0x00000000005eb938 in github.com/tal-tech/go-zero/core/logx.outputJson at /root/go/pkg/mod/github.com/tal-tech/go-zero@v0.0.0-20210421090637-a7ec5f0cf7d5/core/logx/logs.go:380 14 0x00000000005e9eab in github.com/tal-tech/go-zero/core/logx.(durationLogger).write at /root/go/pkg/mod/github.com/tal-tech/go-zero@v0.0.0-20210421090637-a7ec5f0cf7d5/core/logx/durationlogger.go:66 15 0x00000000005e9a75 in github.com/tal-tech/go-zero/core/logx.(durationLogger).Infof at /root/go/pkg/mod/github.com/tal-tech/go-zero@v0.0.0-20210421090637-a7ec5f0cf7d5/core/logx/durationlogger.go:41

回答

9

能否给一下测试代码?

3

@kevwan 你好,这个是我们生产环境偶现的,业务代码大致如下: logger :=logx.WithDuration(time.Second) 在并发场景下复用logger对象打印日志

4

我的判断是这里对象里有非法指针,导致json encoding的时候panic了

func panicmem() {
    panicCheck2("invalid memory address or nil pointer dereference")
    panic(memoryError)
}
5

上层逻辑 func (l durationLogger) write(writer io.Writer, level, content string) { l.Timestamp = getTimestamp() l.Level = level l.Content = content outputJson(writer, logEntry(l)) } 这里没有引用任何指针相关的对象呢。复用durationLogger对象,用法不正确,但是也不会导致panic吧

3

我们线上用了好多年,确实没遇到这样的问题,你看看能不能找到啥重现代码,我怀疑哪里使用错了?传了nil指针啥的

4

好的,感谢 我们这边再尝试排查下,目前逻辑看起来 只是简单的打印一条字符串日志

7

@wangyanglong 有进展嘛?

0

@kevwan 已提重现建议给相关方,目前暂无进展哈

4

好的,有了这里同步一下哈

6

进程在运行过程中,外部发送kill信号,崩溃堆栈和本issue类似。该产品有多个无厘头的崩溃堆栈,猜测都是外部发送信号有关,还在深入验证中。

8

有个新的报错堆栈,就是简单的打印日志,也许和我们的程序运行环境有关,服务部署在嵌套虚拟化机器上。 0 0x0000000000473fe1 in runtime.raise at /usr/local/go/src/runtime/sys_linux_amd64.s:165 1 0x0000000000450c1d in runtime.dieFromSignal at /usr/local/go/src/runtime/signal_unix.go:754 2 0x0000000000451271 in runtime.sigfwdgo at /usr/local/go/src/runtime/signal_unix.go:968 3 0x000000000044f8d4 in runtime.sigtrampgo at /usr/local/go/src/runtime/signal_unix.go:409 4 0x0000000000474383 in runtime.sigtramp at /usr/local/go/src/runtime/sys_linux_amd64.s:409 5 0x00007ffa462c1390 in ??? at ?:-1 6 0x000000000043a52b in runtime.crash at /usr/local/go/src/runtime/signal_unix.go:846 7 0x000000000043a52b in runtime.fatalpanic at /usr/local/go/src/runtime/panic.go:1216 8 0x0000000000439e90 in runtime.gopanic at /usr/local/go/src/runtime/panic.go:1064 9 0x00000000004fd4e5 in strings.(Builder).copyCheck at /usr/local/go/src/strings/builder.go:42 10 0x00000000004fd4e5 in strings.(Builder).Write at /usr/local/go/src/strings/builder.go:89 11 0x00000000005256eb in encoding/json.(*Encoder).Encode at /usr/local/go/src/encoding/json/stream.go:231 12 0x00000000005f5bd7 in github.com/tal-tech/go-zero/core/logx.outputJson at /root/go/pkg/mod/github.com/tal-tech/go-zero@v0.0.0-20210713033546-a1d1e89a651e/core/logx/logs.go:483 13 0x00000000005f58e5 in github.com/tal-tech/go-zero/core/logx.output at /root/go/pkg/mod/github.com/tal-tech/go-zero@v0.0.0-20210713033546-a1d1e89a651e/core/logx/logs.go:451 14 0x00000000005f59bf in github.com/tal-tech/go-zero/core/logx.outputWithCaller at /root/go/pkg/mod/github.com/tal-tech/go-zero@v0.0.0-20210713033546-a1d1e89a651e/core/logx/logs.go:456 15 0x00000000005f4985 in github.com/tal-tech/go-zero/core/logx.infoSync at /root/go/pkg/mod/github.com/tal-tech/go-zero@v0.0.0-20210713033546-a1d1e89a651e/core/logx/logs.go:430 16 0x00000000005f4985 in github.com/tal-tech/go-zero/core/logx.Info

9

这里真得不到什么有效信息,能否贴下相关调用代码?

另外,其他还有谁遇到这个问题嘛?请跟帖

0

func outputJson(writer io.Writer, info interface{}) { // V1: 原始代码,对特殊字符会转义 // if content, err := json.Marshal(info); err != nil { // log.Println(err.Error()) // } else if atomic.LoadUint32(&initialized) == 0 || writer == nil { // log.Println(string(content)) // } else { // writer.Write(append(content, '\n')) // }

// V2: 修改后,对特殊字符不转义
if atomic.LoadUint32(&initialized) == 0 || writer == nil {
    if content, err := json.Marshal(info); err != nil {
        log.Println(err.Error())
    } else {
        log.Println(string(content))
    }

    return
}

var builder strings.Builder
enc := json.NewEncoder(&builder)
enc.SetEscapeHTML(false)
if err := enc.Encode(info); err != nil {
    log.Println(err.Error())
}
writer.Write([]byte(builder.String()))

}

内部魔改了这个函数,目前初步确认是内存逃逸相关,将最后一行改为 buf := append([]byte(nil), builder.String()...) writer.Write(buf) 有待继续观察

6

@kevwan 问题确认是使用的内核有bug,合入了最新的内核补丁,已解决。