This commit just ensures that the relatively expensive
ID to name conversions aren't performed unless they're
explicitly requested. It also internalizes those flags
that required the PROC_FILLSTATUS flag to also be set.
[ requiring a caller, in our case pids.c, to provide ]
[ two flags when a single field was the objective is ]
[ wrong & represents a future potential toe-stubber. ]
[ moreover, what's worse is that those two flags are ]
[ seemingly unrelated. but, without both, a SEGV can ]
[ can be expected when a result.str pointer is NULL. ]
[ by contrast, in the master branch those fields are ]
[ arrays which, when set to zeroes, produce an empty ]
[ string. So, there is no abend (but no name either) ]
[ when one of those two required flags were omitted. ]
[ and worth noting, in that branch it's not just one ]
[ caller required to observe a two flag requirement. ]
Signed-off-by: Jim Warner <james.warner@comcast.net>
This commit adapts our man page for a new consolidated
'misc.h' header file. Along the way, some descriptions
were shortened, others lengthened and whitespace added
in an effort to (hopefully) improve readability a bit.
[ the #include subdirectory was also corrected while ]
[ rearranging & grouping functions into 3 categories ]
Signed-off-by: Jim Warner <james.warner@comcast.net>
With the 4 header files removed in the previous patch,
this commit just changes all those obsolete references
to that new consolidated 'misc.h' header file instead.
Signed-off-by: Jim Warner <james.warner@comcast.net>
Prior to this patch, we had four separate header files
dealing with miscellaneous functions. Those four files
were documented in the single man page: procps_misc.3.
Now, we will have just a single header file documented
in a single man page (that is what's called symmetry).
[ and while we're at it, we will shorten that overly ]
[ long struct 'procps_namespaces' name to agree with ]
[ the function naming conventions, e.g. 'procps_ns'. ]
Signed-off-by: Jim Warner <james.warner@comcast.net>
The preceding commit made that 'esc_all' function more
efficient by using direct pointer manipulation instead
of an indexed string reference approach within a loop.
This commit applies the same approach to the 'esc_ctl'
function. Now we'll save 12 more iterated instructions
while decreasing the function's code size by 43 bytes.
Signed-off-by: Jim Warner <james.warner@comcast.net>
While this patch has some cosmetic whitespace changes,
more importantly it makes that 'esc_all' function more
efficient. By abandoning the indexed loop approach for
a direct pointer manipulation, we will save 9 iterated
machine instructions, for a total of 33 bytes of code.
Signed-off-by: Jim Warner <james.warner@comcast.net>
The Inspection feature already offered an INSP_SLIDE_1
provision. This patch now offers similar extensions to
variable width column scrolling (assuming SCROLLVAR_NO
isn't defined). Such a provision was useful during the
development of some recent library UTF-8 enhancements.
Signed-off-by: Jim Warner <james.warner@comcast.net>
With the restoration of corrupt utf8 multibyte editing
to the library, it's important to establish the proper
locale before calling that 'fatal_proc_unmounted' guy.
He calls the original 'look_up_our_self' who, in turn,
will invoke 'stat2proc' which then calls 'escape_str'.
Once there, it's too late to effect changes to locale.
[ the result would be lots and lots of '?' displayed ]
Signed-off-by: Jim Warner <james.warner@comcast.net>
Much of what was represented in the commit message for
the reference shown below was revisited in this patch.
It also means that the assertion in the last paragraph
of that message will only now be true with LANG unset.
[ and forget all the bullshit about not altering any ]
[ kernel supplied data. sometimes we must to avoid a ]
[ corrupt display due to a string we can not decode. ]
And while this commit still avoids the overhead of the
'mbrtowc', 'wcwidth' 'isprint, & 'iswprint' functions,
we achieve all the benefits with simple table lookups.
Plus such benefits are extended to additional strings.
For example, both PIDS_EXE and PIDS_CMD fields are now
also subject to being 'escaped'. If a program name did
contain multibyte characters, potential truncation may
corrupt it when it's squeezed into a 15/63 byte array.
Now, all future users of this new library only need to
deal with the disparities between string and printable
lengths. Such strings themselves are always printable.
[ the ps program now contains some unnecessary costs ]
[ with the duplicated former 'escape' functions. But ]
[ we retain that copied escape.c code for posterity. ]
[ besides, in a one-shot guy it's of little concern. ]
Note: Proper display of some multibyte strings was not
possible at the linux console. It would seem a concept
of zero length chars (like a 'combining acute accent')
is not recognized. Thus the display becomes corrupted.
But if utf8 decoding is disabled (via LANG=), then all
callers will now see '?', restoring correct alignment.
Reference(s):
. Dec 2020, newlib 'escape' logic refactored
commit a221b9084a
Signed-off-by: Jim Warner <james.warner@comcast.net>
This new library provides callers with pure strings or
string vectors. It is up to those callers to deal with
potential utf8 multibyte characters and any difference
between strlen and the corresponding printable widths.
So, it makes no sense for the library to go to all the
trouble of invoking those rather expensive 'mbrtowc' &
'wcwidth' functions to ultimately yield total 'cells'.
Thus, this patch will eliminate all the code and parms
that are involved with such possible multibyte issues.
[ Along the way we'll lose the ability to substitute ]
[ '?' for an invalid/unprintable multibyte sequence. ]
[ We will, however, replace ctrl chars with the '?'. ]
[ This presents no problem for that ps program since ]
[ it now duplicates all of the original escape code. ]
[ And, we'll no longer be executing that code twice! ]
[ As for the top program, it takes the position that ]
[ it is wrong to alter kernel supplied data. So with ]
[ potential invalid/unprintable stuff, he'll rely on ]
[ terminal emulators to properly handle such issues! ]
[ Besides, even using a proper multibyte string, not ]
[ all terminals generate the proper printable width. ]
[ This is especially true when it comes to an emoji. ]
[ And should callers chose not to be portable to all ]
[ locales by calling setlocale(LC_ALL, ""), they can ]
[ expect to see lots of "?", regardless of what this ]
[ library fixes in a faulty multibyte string anyway. ]
Signed-off-by: Jim Warner <james.warner@comcast.net>
When any process' command line contains multibyte utf8
characters, two separate display problems could arise.
1. If that COMMAND column is not displayed as the very
last field, then field(s) to the right are misaligned.
2. Even when last, should utf8 string length (not that
display length) exceed allowable screen width, it will
nonetheless suffer from improper premature truncation.
Number 1 is less of a concern since the cmdline column
is likely to always be the last field to be displayed,
if only to enable right and left scrolling provisions.
Number 2 is much more likely to occur, especially with
additional fields which might be shown before COMMAND.
Or, forest view child tasks can yield the same effect.
So, this commit will permit the correct utf8 multibyte
display regardless of field position or string length.
And, we'll bring top into line with the ps program for
additional fields potentially subject to utf8 display.
Signed-off-by: Jim Warner <james.warner@comcast.net>
Form its inception (back in May of 2011), escaped_copy
has always been a flawed function. It does not operate
on 'escaped' strings but instead treats all input as a
regular string incapable of containing utf8 sequences.
As such, it should only be used for strings guaranteed
to NOT embody multibyte characters (like SUPGIDS). For
all other strings, which could contain utf8 stuff, the
correct function should have been that escape_str guy.
So this commit changes nearly every escaped_copy call.
Reference(s):
. May 2011, original escaped_copy (cmdline, cgroup)
commit 7b0fc19e9d
Signed-off-by: Jim Warner <james.warner@comcast.net>
After 'errno' management was standardized, a couple of
fields were added to the <pids> api. Only 1 (PIDS_EXE)
involved dynamic memory and, unfortunately, it did not
conform to that expected normalized ENOMEM convention.
Reference(s):
. Jun 2018, added 'exe' to library
commit ad4269f118
. Nov 2017, standardized 'errno' management
commit 06be33b43e
Signed-off-by: Jim Warner <james.warner@comcast.net>
Under the above #define this commit now also addresses
2 additional possible toe stubbers involving 'select'.
If some readproc.h constants were uncoupled from their
pids.h enumerators a 'make check-lib' will now detect.
Signed-off-by: Jim Warner <james.warner@comcast.net>
The Fieldstab uses the full pids_item enumerator names
but also shows top's cryptic relative enumerator names
as comments. So, this commit will mirror that approach
in task_show, adding full pids_item names as comments.
Signed-off-by: Jim Warner <james.warner@comcast.net>
I'm not sure if anyone actually uses these things, but if you
selected test fields on the command line ps would crash.
$ ps/pscommand -o _left
Signal 11 (SEGV) caught by pscommand (3.3.11.877-0488).
/home/csmall/Projects/procps/procps/ps/.libs/pscommand:ps/display.c:66: please report this bug
Segmentation fault
Anyway, it doesn't now:
$ ps/pscommand -o pid,_left,_left2,_right,_unlimited 1
PID LLLLLLLL L2L2L2L2 RRRRRRRRRRR U
1 tty7 3270/tty4 59:59 [123456789-12345] <defunct>
This is part of !118 where @tt.rantala found a memory leak.
The other part of !118 may come later if the performance change
is significant.
References:
procps-ng/procps!118
With glibc, each time the strftime() function is used (twice per process
in a typical ps -fe run), a stat("/etc/localtime") system call is used
to determine the timezone. Not only does this add extra system call
overhead, but when multiple ps processes are trying to access this
file (or multiple glibc programs using strftime) in parallel, this can
trigger significant lock contention within the OS kernel.
Since ps is not intended to run for long periods of time as a
daemon (during which the system timezone could be altered and PS might
reasonably be expected to adapt its output), there is no benefit to
repeatedly doing this stat(). To stop this behavior, explicitly set the
TZ variable to its default value (:/etc/localtime) whenever it is unset.
glibc will then cache the stat() result.
While sysctl did change the order of /run and /etc to match
systemd in the referenced commit, the Debian bug report that
brought it to light was not documented.
References:
commit 24a1574f0ahttps://bugs.debian.org/950788