With the restoration of corrupt utf8 multibyte editing
to the library, it's important to establish the proper
locale before calling that 'fatal_proc_unmounted' guy.
He calls the original 'look_up_our_self' who, in turn,
will invoke 'stat2proc' which then calls 'escape_str'.
Once there, it's too late to effect changes to locale.
[ the result would be lots and lots of '?' displayed ]
Signed-off-by: Jim Warner <james.warner@comcast.net>
Much of what was represented in the commit message for
the reference shown below was revisited in this patch.
It also means that the assertion in the last paragraph
of that message will only now be true with LANG unset.
[ and forget all the bullshit about not altering any ]
[ kernel supplied data. sometimes we must to avoid a ]
[ corrupt display due to a string we can not decode. ]
And while this commit still avoids the overhead of the
'mbrtowc', 'wcwidth' 'isprint, & 'iswprint' functions,
we achieve all the benefits with simple table lookups.
Plus such benefits are extended to additional strings.
For example, both PIDS_EXE and PIDS_CMD fields are now
also subject to being 'escaped'. If a program name did
contain multibyte characters, potential truncation may
corrupt it when it's squeezed into a 15/63 byte array.
Now, all future users of this new library only need to
deal with the disparities between string and printable
lengths. Such strings themselves are always printable.
[ the ps program now contains some unnecessary costs ]
[ with the duplicated former 'escape' functions. But ]
[ we retain that copied escape.c code for posterity. ]
[ besides, in a one-shot guy it's of little concern. ]
Note: Proper display of some multibyte strings was not
possible at the linux console. It would seem a concept
of zero length chars (like a 'combining acute accent')
is not recognized. Thus the display becomes corrupted.
But if utf8 decoding is disabled (via LANG=), then all
callers will now see '?', restoring correct alignment.
Reference(s):
. Dec 2020, newlib 'escape' logic refactored
commit a221b9084a
Signed-off-by: Jim Warner <james.warner@comcast.net>
This new library provides callers with pure strings or
string vectors. It is up to those callers to deal with
potential utf8 multibyte characters and any difference
between strlen and the corresponding printable widths.
So, it makes no sense for the library to go to all the
trouble of invoking those rather expensive 'mbrtowc' &
'wcwidth' functions to ultimately yield total 'cells'.
Thus, this patch will eliminate all the code and parms
that are involved with such possible multibyte issues.
[ Along the way we'll lose the ability to substitute ]
[ '?' for an invalid/unprintable multibyte sequence. ]
[ We will, however, replace ctrl chars with the '?'. ]
[ This presents no problem for that ps program since ]
[ it now duplicates all of the original escape code. ]
[ And, we'll no longer be executing that code twice! ]
[ As for the top program, it takes the position that ]
[ it is wrong to alter kernel supplied data. So with ]
[ potential invalid/unprintable stuff, he'll rely on ]
[ terminal emulators to properly handle such issues! ]
[ Besides, even using a proper multibyte string, not ]
[ all terminals generate the proper printable width. ]
[ This is especially true when it comes to an emoji. ]
[ And should callers chose not to be portable to all ]
[ locales by calling setlocale(LC_ALL, ""), they can ]
[ expect to see lots of "?", regardless of what this ]
[ library fixes in a faulty multibyte string anyway. ]
Signed-off-by: Jim Warner <james.warner@comcast.net>
When any process' command line contains multibyte utf8
characters, two separate display problems could arise.
1. If that COMMAND column is not displayed as the very
last field, then field(s) to the right are misaligned.
2. Even when last, should utf8 string length (not that
display length) exceed allowable screen width, it will
nonetheless suffer from improper premature truncation.
Number 1 is less of a concern since the cmdline column
is likely to always be the last field to be displayed,
if only to enable right and left scrolling provisions.
Number 2 is much more likely to occur, especially with
additional fields which might be shown before COMMAND.
Or, forest view child tasks can yield the same effect.
So, this commit will permit the correct utf8 multibyte
display regardless of field position or string length.
And, we'll bring top into line with the ps program for
additional fields potentially subject to utf8 display.
Signed-off-by: Jim Warner <james.warner@comcast.net>
Form its inception (back in May of 2011), escaped_copy
has always been a flawed function. It does not operate
on 'escaped' strings but instead treats all input as a
regular string incapable of containing utf8 sequences.
As such, it should only be used for strings guaranteed
to NOT embody multibyte characters (like SUPGIDS). For
all other strings, which could contain utf8 stuff, the
correct function should have been that escape_str guy.
So this commit changes nearly every escaped_copy call.
Reference(s):
. May 2011, original escaped_copy (cmdline, cgroup)
commit 7b0fc19e9d
Signed-off-by: Jim Warner <james.warner@comcast.net>
After 'errno' management was standardized, a couple of
fields were added to the <pids> api. Only 1 (PIDS_EXE)
involved dynamic memory and, unfortunately, it did not
conform to that expected normalized ENOMEM convention.
Reference(s):
. Jun 2018, added 'exe' to library
commit ad4269f118
. Nov 2017, standardized 'errno' management
commit 06be33b43e
Signed-off-by: Jim Warner <james.warner@comcast.net>
Under the above #define this commit now also addresses
2 additional possible toe stubbers involving 'select'.
If some readproc.h constants were uncoupled from their
pids.h enumerators a 'make check-lib' will now detect.
Signed-off-by: Jim Warner <james.warner@comcast.net>
The Fieldstab uses the full pids_item enumerator names
but also shows top's cryptic relative enumerator names
as comments. So, this commit will mirror that approach
in task_show, adding full pids_item names as comments.
Signed-off-by: Jim Warner <james.warner@comcast.net>
I'm not sure if anyone actually uses these things, but if you
selected test fields on the command line ps would crash.
$ ps/pscommand -o _left
Signal 11 (SEGV) caught by pscommand (3.3.11.877-0488).
/home/csmall/Projects/procps/procps/ps/.libs/pscommand:ps/display.c:66: please report this bug
Segmentation fault
Anyway, it doesn't now:
$ ps/pscommand -o pid,_left,_left2,_right,_unlimited 1
PID LLLLLLLL L2L2L2L2 RRRRRRRRRRR U
1 tty7 3270/tty4 59:59 [123456789-12345] <defunct>
This is part of !118 where @tt.rantala found a memory leak.
The other part of !118 may come later if the performance change
is significant.
References:
procps-ng/procps!118
With glibc, each time the strftime() function is used (twice per process
in a typical ps -fe run), a stat("/etc/localtime") system call is used
to determine the timezone. Not only does this add extra system call
overhead, but when multiple ps processes are trying to access this
file (or multiple glibc programs using strftime) in parallel, this can
trigger significant lock contention within the OS kernel.
Since ps is not intended to run for long periods of time as a
daemon (during which the system timezone could be altered and PS might
reasonably be expected to adapt its output), there is no benefit to
repeatedly doing this stat(). To stop this behavior, explicitly set the
TZ variable to its default value (:/etc/localtime) whenever it is unset.
glibc will then cache the stat() result.
While sysctl did change the order of /run and /etc to match
systemd in the referenced commit, the Debian bug report that
brought it to light was not documented.
References:
commit 24a1574f0ahttps://bugs.debian.org/950788
So as to not obscure the results from this new target,
we'll redirect that final 'make clean' output to null.
Signed-off-by: Jim Warner <james.warner@comcast.net>
If a hash results report is output (via UNREF_RPTHASH)
a portion is devoted to occupied table entries ordered
by depth. There is a possibility that some depths will
not be found among existing occupied table entries and
to avoid any confusion probably should not be printed.
[ to illustrate the potential for confusion prior to ]
[ this patch, force a very small table size (like 8) ]
[ & then trigger the procps_pids_unref() eoj report. ]
So this patch ensures only 'in use' entries are shown.
[ admittedly, all of the remaining logic in the loop ]
[ could/should be subordinate to this new 'if' test, ]
[ but we will keep the change to a minimum. besides, ]
[ there's no harm subtracting/adding a zero numdepth ]
[ especially since the chance of a zero is very low. ]
Signed-off-by: Jim Warner <james.warner@comcast.net>
free, slabtop and uptime would happily take extra command line
arguments and doing nothing about them. The programs now check
optind after option processing and will give you usage screen
if there is anything extra.
References:
procps-ng/procps#181
For long lines from a process, watch would wrap them around to the
next. While this default option has it uses, sometimes you want to
just cut those long lines down.
watch has a -w flag which will truncate the lines to the number
of columns. A few simple lines to do this new trick.
I think I caught all the ANSI state correctly but there might be
a chance it bleeds to the next row.
References:
procps-ng/procps#182
noinst_PROGRAMS are built with "make" even though we had the
test programs in there and only needed them for "make check".
In theory the check target should depend on check_PROGRAMS as
check-am target does and the document states it should, but for
reasons understood by the automake whisperers only, it doesn't
build them.
check only depends on BUILT_SOURCES for some reason.
check-am: all-am
$(MAKE) $(AM_MAKEFLAGS) $(check_PROGRAMS)
$(MAKE) $(AM_MAKEFLAGS) check-TESTS
check: $(BUILT_SOURCES)
$(MAKE) $(AM_MAKEFLAGS) check-recursive
References:
https://www.gnu.org/software/automake/manual/html_node/Scripts_002dbased-Testsuites.html
The referenced commits created the library infrastructure and test
program to validate that the structures and macros line up with
each other.
The library needs to be (re)built with -DITEMTABLE_DEBUG and then
the test program ran. We clean before and after so we are not
testing a non-debug library or having a debug library hanging around
to cause future problems.
Due to test_Itemtables depending on the library, we don't need to
explicitly build the library.
To validate the library structure/header corrospondence run:
make check-lib
References:
commit e616409aa4
commit 92d0297e1ehttps://www.freelists.org/post/procps/keep-on-patchin,19
This patch just raises the size of the hash table used
to calculate elapsed task stuff. The net result should
be less need for 'chaining' under pid hash collisions.
[ the hash scheme is intentionally kept as primitive ]
[ and, therefore, as fast as possible. it employs an ]
[ 'and' approach versus a 'mod' operation since both ]
[ yield similar distribution but the former approach ]
[ was 4 fewer cpu instructions in terms of overhead. ]
[ additionally, for hash collisions, 'chaining' uses ]
[ an array index rather than the usual pointer since ]
[ the HST_t guys may move when they are reallocated. ]
Signed-off-by: Jim Warner <james.warner@comcast.net>
This patch separates the memory allocations into those
used initially from those used in later reallocations.
Thus, we can reduce that iterative realloc() overhead.
Additionally, we'll correct a long standing oops where
multiple history_info structures were created at 'new'
time when only one should have been allocated (jeeze).
[ originally the allocation was strangely based upon ]
[ number of 'items' (???) & later a #define constant ]
Reference(s):
. May, 2016 - subsequent bad history_info logic
commit 9ebadc1438
. Aug, 2015 - original faulty history_info code
commit 7e6a371d8a
Signed-off-by: Jim Warner <james.warner@comcast.net>
The referenced commit the comm length was increased from 16 to 64
characters to handle the larger command names for things like kernel
threads.
However most user processes are limited to 15 characters which means
if you try something like ps -C myprogramisbiggerthansixteen this would
fail to match because /proc/<PID>/comm would only be myprogramisbigg
ps now checks the comm length and if it is 15 and if the given match
is 15 or more, it will only match the first 15 characters.
This is also how killall has worked for about a year.
Thanks to Jean Delvare <jdelvare@suse.de> for the note.
Copy of commit from master.
References:
commit 14005a371e
commit psmisc/psmisc@1188315cd0
commit 3e1c00d051
Signed-off-by: Craig Small <csmall@dropbear.xyz>
This patch is an outgrowth of that commit shown below.
Many additional potential segmentation faults might be
encountered if interactive commands are opened up to a
user when a '-p' switch has a single non-existent pid.
[ always the 'k', 'L', 'r', 'Y' keys & maybe 'v' too ]
So, this patch will restrict such a loser (oops, user)
to a reduced subset of normal commands until he/she/it
quits then restarts top with something to be displayed
or issues the '=' command overriding that '-p' switch.
Reference(s):
commit d3203d99dd
Signed-off-by: Jim Warner <james.warner@comcast.net>
I'm not sure why, but the make check will now fail for vmstat
Running ./vmstat.test/vmstat.exp ...
FAIL: vmstat disk information (-d option)
With the _new function returning the error.
In vmstat all other structures are set to NULL before calling _new
except the diskstat ones. This has been corrected.
This patch fixes a nearly decade old bug discovered by
Frederik Deweerdt. His merge request shown below would
be an adequate solution except for iterative overhead.
This alternate patch will represent substantially less
overhead for an admittedly extremely rare possibility.
Reference(s):
https://gitlab.com/procps-ng/procps/-/merge_requests/114
And-thanks-to: Frederik Deweerdt <fdeweerdt@fastly.com>
Signed-off-by: Jim Warner <james.warner@comcast.net>