Re: lost stderr in console


Laszlo Ersek
 

On 03/17/21 04:23, wenyi,xie via groups.io wrote:
Hi,everyone,

I used to set makeflags like below,so that only message from stderr will be shown in console while compiling, and at the same time message from stderr and stdout will all be saved in xxx.log.
COMMAND MAKEFLAGS= set -eo pipefail && sh xxx.sh 2>&1 >> ${LOG_FILE_DIR}/xxx.log | tee -a ${LOG_FILE_DIR}/xxx.log
Redirections are processed in the following order:
- pipe first (redirects stdout)
- then left to right, as seen on the command line

So what we have for "xxx.sh" is:

step#1: file descriptor 1 refers to the (nameless) pipe's file description

step#2: file descriptor 2 refers to the same (nameless) pipe's file
description (i.e., to the file description that file descriptor 1
*currently* refers to)

step#3: file descriptor 1 now refers to a file description that refers
to the inode ("file") originally looked up by the name "xxx.log". At the
file description level, the O_APPEND status flag is set.

So "tee" will only see stderr from "xxx.sh". Furthermore, the stdout of
"xxx.sh" will only go to the log file ("xxx.log").


Let's consider "tee" then: "tee" opens the inode (the "file") looked up
by the name "xxx.log" separately from when the shell opens "xxx.log",
for the "xxx.sh" redirection. This means that, in the kernel, a separate
file description exists, for the "xxx.log" inode. This file description
also has the O_APPEND status flag set, but it doesn't matter -- the file
description that "xxx.sh" writes through, and the file description that
"tee" writes through, are independent. The "file offset" property is at
the file description level. Therefore "tee" and "xxx.sh" do not share
the file offset (and O_APPEND is useless, in both file descriptions),
and they will mutually overwrite parts of each other's output.

In other words, your command line is buggy.


In general:

file descriptor --> file description --> file (inode)

When you open() the same file by name, you get this:

file descriptor --> file description \
--> file (inode)
file descriptor --> file description /

Whereas, if you use fork() or dup(), this is what you get:

file descriptor \
--> file description --> file (inode)
file descriptor /

O_APPEND and the file offset both exist in the *file description*
object. So in the first case, you get no coordination from the kernel,
and in the second case, you do.

Note that even in the second case, that is, when both file descriptors
refer to the same file description, it is not guaranteed that
*concurrent* writes will not be *interleaved*. No data will be
*overwritten*, for sure, but the granularity of "atomic" writes is not
an easy question. If the file description refers to a pipe, then there
are some guarantees from POSIX, as long as the writes are "small enough"
(PIPE_BUF):
<https://pubs.opengroup.org/onlinepubs/9699919799/functions/write.html>.


So... what you want to do is actually difficult. There are two
approaches, but both are broken.

Approach (1): fuse stdout and stderr into a single stream, capture the
common stream in the log file, and print the "diagnostic" lines to the
console:

sh xxx.sh 2>&1 | tee -- xxx.log | grep -- 'what exacty?'

This is broken because you cannot identify the diagnostic-only lines
(the original stderr) by content.

Approach (2): duplicate the original stderr into two streams. The first
instance will go to the console. The second instance, together with the
script's stdout, will be written to the log file.

# fd 1: original stdout
# fd 2: original stderr
# make fd 3 point to the original stdout as well

exec 3>&1

(
# fd 1: pipe leading to the log file
# fd 2: original stderr
# fd 3: original stdout
# make fd 4 too point to the pipe leading to the log file

exec 4>&1

(
# fd 1: pipe carrying the main script's stderr
# fd 2: original stderr
# fd 3: original stdout
# fd 4: pipe leading to the log file

# main script's stderr goes to the duplicator pipe
# main script's stdout goes to the log file

xxx.sh 2>&1 >&4
) | (

# fd 0: pipe carrying the main script's stderr
# fd 1: pipe leading to the log file
# fd 2: original stderr
# fd 3: original stdout
# fd 4: pipe leading to the log file

# duplicate the main script's stderr to the original stdout (3)
# and to the log file (1)

tee /dev/fd/3
)
) | cat > xxx.log


In short form (no comments and no useless subshells):

exec 3>&1
(
exec 4>&1
xxx.sh 2>&1 >&4 | tee /dev/fd/3
) | cat > xxx.log

This will not lose messages. It will also not interleave write()
syscalls that do not exceed PIPE_BUF (usually 4KB) individually -- this
is the justification for the outermost pipe, and "cat".

However, it is still broken: dependent on process scheduling between
"xxx.sh" and "tee", it's now possible that stdout and stderr lines from
"xxx.sh" will be reordered, relative to each other. Let's say line#1 is
a diagnostic message, while line#2 is a normal message. "xxx.sh" writes
them in line#1, line#2 order. With the above, line#1 goes from "xxx.sh"
to "tee" to "cat", while line#2 goes from "xxx.sh" to "cat". If "tee" is
"slow", then "cat" could see the messages in line#2, line#1 order.

Thanks
Laszlo

Join discuss@edk2.groups.io to automatically receive all group messages.