By the way, I don't believe that ">&2" is POSIX compliant, but that's trivial to fix.
The redirection operator:
[n]>&word
shall duplicate one output file descriptor from another, or shall close one. If word evaluates to one or more digits, the file descriptor denoted by n, or standard output if n is not specified, shall be made to be a copy of the file descriptor denoted by word
It seems too far to go to say that because a system library holds some implementation details that the responsibility doesn't lie with the program using them. There's all sorts of complex interdependent details that make those kind of boundary distinctions difficult in many operating systems.
> So we rely on different libc projects to provide this, and work with them when needed.
> This ends up being more flexible as there are different needs from a libc, and for us to "pick one" wouldn't always be fair.
> And yes, you can just use a "nolibc" type implementation if you like.
> I know I do that for new syscalls when working on them, there's nothing stopping anyone else from doing that as well.
You can trash the entire GNU system and rewrite it all in Rust or Lisp if you wanted. It doesn't have to be some POSIX-like thing either, it could be whatever you wanted it to be. It doesn't need to have things like PATH. You could write a static freestanding application and boot Linux directly into it.
Nobody does stuff like this; it's a lifetime of work. But it could be done.
That is indeed one of the more well defined boundaries in the system. Also worth understanding is that programs aren't generally invoking system calls directly, for example calling interrupt 0x80, glibc provides wrapper functions that invoke system calls, blurring the boundary a bit. Further blurring the boundary is the vDSO layer that intercepts some system call wrappers for more efficient access.
At issue in this article and comment thread is the boundary between the shell, environment, and Linux. This is a blurrier boundary still because the shell sets up the environment, which is passed through the kernel, and interpreted for downstream processes, generally (but not necessarily) by that shared system library.
> vDSO layer that intercepts some system call wrappers for more efficient access.
Technically the vDSO library doesn't intercept. libc chooses to use either the vDSO or the syscall. This can happen either in the wrapper itself, or through a special PLT helper where the linker asks libc to resolve the symbol to populate the GOT entry. vDSO symbols have the prefix __kernel_ or __vdso_.
That's fair, sorry for my casual language in a technically nuanced discussion. I hadn't looked at this in quite a while, but it was good to review. Thanks for the prompting.
I also double-checked the glibc and musl code to make sure I wasn't misremembering, and ended up learning about IFUNC.[1] Previously I had avoided going down the rabbit hole to understand what glibc's libc_ifunc was doing. I don't think musl uses IFUNC, at least not for clock_gettime; it seems to always link the wrapper which calls the vdso through an internally managed pointer.[2]
And now I'm wondering how safe all this indirection is. For the PLT/GOT approach I think you can disable lazy binding and force the GOT to be read-only so exploits can't overwrite the symbol addresses. But for musl's approach it doesn't seem like you can make its internal function pointer read-only, though maybe it's harder to find the address of than GOT table slots.
> Also worth understanding is that programs aren't generally invoking system calls directly
They don't generally do that but they absolutely can. I wrote a Lisp interpreter that does just that. It's completely static, has zero dependencies and talks to the kernel directly. The idea is to implement every primitive on top of Linux, and everything else on top of the primitives.
From the kernel's perspective, every program is talking to it directly. They just typically use glibc routines to do it for them. There's no actual need for glibc to be there though.
At some point I even tried adding Linux system call builtins to GCC so that the compiler itself would generate the code in the correct calling convention. Lost that work due to a hard disk crash but on the mailing list I didn't get the impression the maintainers favored merging it anyway.
> for example calling interrupt 0x80, glibc provides wrapper functions that invoke system calls, blurring the boundary a bit
Not all of them. It still doesn't support all of the clone system calls.
It's really annoying how these glibc wrappers get confused with the actual Linux system calls, which work very differently. The most notable difference is that there's no thread-local errno nonsense with the real system calls; the kernel just gives you a perfectly normal return value in a register. There's also a ton of glibc machinery related to system call cancellation that gets linked in if you use it.
Documentation out there conflates the two. I expected the man page above to describe only the Linux system call but it also describes the glibc specific stuff. That way people get the impression they are one and the same.
> Further blurring the boundary is the vDSO layer that intercepts some system call wrappers for more efficient access.
The vDSO is a documented stable Linux kernel interface:
It's just a perfectly normal ELF shared object that the kernel maps into the address space of every process on certain architectures. Its address is passed via the auxiliary vector which is located immediately after the environment vector. Glibc merely finds it and uses it. I can make my interpreter use it too.
It's completely optional. Its purpose is making certain system calls faster by eliminating the switch to kernel mode. This is useful for time/date system calls which are invoked frequently. The original system calls are still available though.
> This is a blurrier boundary still because the shell sets up the environment, which is passed through the kernel, and interpreted for downstream processes, generally (but not necessarily) by that shared system library.
The shell passes the environment to the execve system call but the kernel does not interpret it in any way. It doesn't even enforce the "key=value" format since this is just a convention. It's essentially an opaque array of strings and it's up to user space to make sense of whatever it contains. Glibc chooses to parse those strings into program state in the form of environment variables whose values programmers can query.
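A small illustration from the shell, using env(1) to control exactly which strings get passed through execve (a sketch, not from the comment above):

```shell
# The kernel just ferries these strings to the new process;
# the child's shell/libc is what parses NAME=value pairs.
env -i FOO=bar /bin/sh -c 'echo "FOO is: $FOO"'
# prints: FOO is: bar
```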
A tangent: Robert Clausecker, the guy who submitted the proposal for adding tcgetwinsize() and SIGWINCH to POSIX, apparently did it because it "is probably the easiest way to get glibc to implement a feature you want" [0].
My use of "blurry" is because you asserted a clear boundary between user and kernel space. While I agree that this boundary is well-defined, it is indeed "blurred" (made less clear) by the glibc function wrappers and vDSO injected functions. Because the glibc library is a system library and the vDSO is a blob of library code mapped in the kernel. It's not a simple interrupt to context switch and return when complete with state having been updated from "over the fence."
To me the description of a "clear boundary" should avoid the amount of nuance around whether the application's call lands in a library or the kernel's syscall handler. The fact that it doesn't means that the boundary is less clear, or blurry as was the term I adopted here.
Those functions aren't the real system calls provided by Linux, they're just glibc wrappers with added functionality. Linux kernel execve has absolutely no concept of PATH, it just opens the file at the provided pathname. That's a good thing too, user space might want to customize that stuff.
Yes. Shells typically do their own path resolution as well. I know GNU bash does, at least. I customized that logic in order to make a little library system for shell scripts.
Why would strace cat be useful here? By the time cat runs, it was obviously already found.
It is basic knowledge that PATH is used by a command interpreter to locate the pathname of binaries. This is true for Windows' cmd.exe as well. I never heard of a system where locating files for execution was performed by a kernel.
The kernel's job is to execute executable files, while the shell's job is to bridge the gap between a user-facing command name ("cat") and an executable file (/usr/bin/cat).
The PATH environment variable provides such a good general and transparent way to control this task that most shells on most operating systems work that way.
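You can watch this division of labour from the shell itself; `command -v` shows the absolute path that will eventually be handed to execve(2) (a sketch, using cat as the example):

```shell
# The shell resolves the bare name "cat" against PATH in userspace;
# the kernel's execve(2) only ever sees the resulting absolute path.
resolved=$(command -v cat)
echo "will exec: $resolved"
case $resolved in
    /*) echo "kernel gets an absolute path" ;;
esac
```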
> I never heard of a system where locating files for execution was performed by a kernel.
Also true for MS/PC-DOS... which also holds the distinction of having some rare "truly monolithic" API-compatible variants that put the kernel, drivers, and shell in a single binary, so that may satisfy your criteria.
In the [exec][1] family of POSIX functions, if the command path doesn't contain a slash, then it's looked up in the PATH.
> If the file argument contains a slash character, the file argument shall be used as the pathname for this file. Otherwise, the path prefix for this file is obtained by a search of the directories passed as the environment variable PATH [...]
The Linux kernel also doesn't have any concept of shared libraries, which are resolved by ld.so, a program that's usually shipped as part of libc.
I like this approach of shunting off functionality that's important, necessary, and omnipresent across all OSes to userspace, rather than giving into the temptation to put everything and the kitchen sink into the kernel. It seems to make a more versatile and future proof OS, that's easy to work with in spite of uncertainty.
I've worked with "both sides" and the way ELF shared libraries on Linux work is an absolute bloody mess compared to how Windows' PE works. On Windows the same executable format and dynamic linker are usable in both user and kernel mode.
This is even reflected in the ELF format itself. There's this really arcane dichotomy between sections and segments.
Sections are very detailed metadata that all sorts of things use for all sorts of purposes. Compilers use them. Debuggers use them. Static and dynamic linkers use them. Anyone can use them for any purpose whatsoever. You can easily add your own custom sections to any executable using tools like objcopy. It's completely arbitrary, held together by convention.
Segments, on the other hand, don't even have names. They are just a list of file extents required for the program to actually execute and their address space locations. The program header table is essentially a sorted list of arguments for the mmap system call.
It basically just mmaps in the PT_LOAD segments of the ELF file, copies stuff like arguments and environment and then starts a thread at the entry point specified in the ELF header.
It's just that when loading dynamic ELFs it jumps into the dynamic linker instead of the actual program. It's as though every single program had a #!/lib/ld.so shebang line. The absolute path is even hardcoded into the executable itself.
readelf -a $(which cat) | grep -i interpreter
[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
When an "interpreter" is requested, Linux will load it alongside the actual program and will run it instead of the actual program. This "ELF interpreter" then does an absurd amount of work by recursively loading and linking libraries, linking the actual executable and only then jumping into its entry point.
I'm not kidding about the "absurd amount of work" part. These linkers even have to topologically sort dependencies like a package manager so they can be initialized properly.
Ignorance leading to assumptions. Their eureka moment: "The shell, not the Linux kernel, is responsible for searching for executables in PATH!" makes it obvious they haven't read up on operating systems. Shame because you should know how the machine works to understand what is happening in your computer. I always recommend reading Operating Systems: Three Easy Pieces.
https://pages.cs.wisc.edu/~remzi/OSTEP/
The thing is, though, that PATH being a userspace concept is a contingent detail, an accident of history, not something inherent to the concept of an operating system. You can imagine a kernel that does path searches. Why not?
There's a difference between something being a certain way because it has to be that way in order to implement the semantics of the system (e.g. interrupt handlers being a privilege transition) and something being a certain way as a result of an arbitrary implementation choice.
OSes differ on these implementation choices all the time. For example,
* in Linux, the kernel is responsible for accepting a list of execve(2) argument-words and passing them to the exec-ed process with word boundaries intact. On Windows, the kernel passes a single string instead and programs chop that string up into argument words in userspace, in libc
* in Linux, the kernel provides a 32-bit system call API for 32-bit programs running on 64-bit kernels; on Windows, the kernel provides only a 64-bit system call API and it's a userspace program that does long-mode switching and system call argument translation
* on Windows, window handles (HWNDs, via user32.dll) in IPC message passing (ALPC, in ntoskrnl) are implemented in the kernel, whereas the corresponding concepts on most Linux systems are pure user-space constructs
And that's not even getting into weirder OSes! Someone familiar with operating systems in general can nevertheless be surprised at how a particular OS chooses to implement this or that feature.
> The thing is, though, that PATH being a userspace concept is a contingent detail, an accident of history, not something inherent to the concept of an operating system. You can imagine a kernel that does path searches. Why not?
Right. You can't be sure that someone didn't stick $PATH expansion into glibc, or something. Because someone did.
QNX gets program loading entirely out of the kernel. When QNX is booted, initial programs and .so files in the boot image are loaded into memory. That's how things get started. Disk drivers, etc. come in that way, assuming the system has a disk.
Calling "exec.." or ".. spawn" merely links to a .so file that knows how to open and read an executable image. Program loading is done entirely by userspace code. Tiny microkernel. The "exec.." functions do not use the PATH variable.[1]
However, "posix_spawn" does read the PATH environment variable, in both QNX [2] and Linux.[3] Linux, for historical reasons, tends not to use "spawn" as much, but those are the defined semantics for it. QNX normally uses "spawn", because it lacks the legacy that encouraged fork/exec type process startup. "posix_spawn" is apparently faster in modern Linux, especially when the parent process is large, but there's a lot of fork/exec legacy code out there.
"posix_spawn" comes from FreeBSD in 2009, but I think the QNX implementation precedes that, because QNX's architecture favors "spawn" over "exec.." It may go back to UCLA Locus.
Windows has different program startup semantics. Someone from Windows land can address that. MacOS has a built in search path if you don't have a PATH variable.[5]
> * in Linux, the kernel is responsible for accepting a list of execve(2) argument-words
Yes it does, but the more surprising thing is (coming from AmigaOS with its dos.library function ReadArgs()) that the shell does this. The shell is also responsible for argument expansion - madness!
On AmigaOS, when you type "delete foo#? force", the shell passes the entire command line to the delete command. The delete command calls ReadArgs() with a template (FILE/M/A, ALL/S, QUIET/S, FORCE/S), and the standard OS function parses it into lists of files, flags, keyword arguments, etc. The "file" passed is "foo#?", and the command uses MatchFirst()/MatchNext() to do file pattern matching.
Every command (that uses ReadArgs() and didn't plump for "standard C" parsing) has the same behaviour: running the command with "?" gives you the template, which tells you how to use it. Args are parsed consistently across all programs.
Then you get "standard C", which, because of K&R and main(), ignores this standard Amiga parsing function and just does naive splits. Across multiple Amiga C compilers, quoting rules are inconsistent. An Amiga C compiler has to produce an executable that knows it'll be called with a full command line, so the executable itself has to break that line into words before it can call main(), and it's up to each compiler writer how to do that. Urgh.
In unix-land, it's up to the shell to parse the command line, and pass only the words... hence why the shell naturally does all the filename globbing, and why you have gotchas like when these two commands are sometimes the same and sometimes they're not:
find . -name foo*
find . -name 'foo*'
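The gotcha is easy to reproduce (a sketch; options like nullglob or failglob change this behaviour):

```shell
# With default shell options, an unmatched glob is passed through to the
# command literally; a matched one is expanded by the shell first.
dir=$(mktemp -d)
cd "$dir"
printf 'echo "args: $@"\n' > show-args.sh
sh show-args.sh foo*    # no foo* files yet: prints  args: foo*
touch foo1
sh show-args.sh foo*    # now a match exists: prints  args: foo1
```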
Then we have Windows, which is like Amiga C programs: each program is passed a full command string, and its C runtime parses it for main() to consume. There's a vague expectation that it'll do quoting "like COMMAND", which itself has very odd quoting rules. At least most people use the same C compiler on Windows, so it's mostly just MSVCRT's implementation, and therefore fairly consistent.
Prepend for all paths on a command line? Or just for the executable?
For all paths it could be dangerous and should very probably not be done. But for executables it's less dangerous and can easily be done by putting '.' into $PATH.
> I often wish there was a convenient way of doing such an operation in the shell: if path start with "/", leave it, otherwise prepend "./"
Both bash and zsh have enough functionality exposed via shell functions and variables for you to define a keybinding that does exactly this, interactively. Good idea.
Did you mean an interactive command? Or something else?
I meant non-interactive, for use in scripts which take user input. We already have "--" for end of options, but the support for it is not universal and even with that some programs will interpret certain strings in a special way. On the other hand, prepending the dot-slash should work for any program or argument passing style.
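A sketch of that dot-slash trick as a tiny helper (the name `qualify` is invented here, not from the thread):

```shell
# Leave absolute paths alone; anchor everything else to the current
# directory, so a user-supplied name can't be mistaken for an option.
qualify() {
    case $1 in
        /*) printf '%s\n' "$1" ;;
        *)  printf './%s\n' "$1" ;;
    esac
}

qualify /usr/bin/cat    # -> /usr/bin/cat
qualify -rf             # -> ./-rf  (now safe to pass to rm, etc.)
```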
One thing I was surprised to learn a couple years ago is that users and groups aren't really tracked much by the Linux kernel: they're just numeric IDs that track process and file ownership. So if you setuid() to a user ID that doesn't exist in /etc/passwd or anywhere else, the kernel won't stop you.
If I have a file on machineA with uid10001 and I copy the file to machineB, I might want it to retain that uid, but it shouldn't matter to machineB that it doesn't map to a real user.
The question is why the author wrote such a clickbait title and drew such an odd conclusion. Legitimate question IMO, nothing rude there.
It's not about knowledge, but about assumptions. The title and conclusion hint that there are some obvious assumptions, but these are not spelled out. Maybe the author assumed that, because of the ubiquitous use of PATH across shells, it had to be managed centrally.
I don't think it's an odd assumption at all! The lines between shell, exec calls, globbing, etc, are very blurry if you don't already know how it all fits together.
Why not? Every executable is started with the execve(2) syscall, which takes an array of environment variables that the kernel uses to replace the environment the process inherited from its parent, so obviously the kernel has full knowledge of the environment variables of every process in the system.

Now, there is a reason why the kernel actually does not retain such knowledge, but it's not at all unreasonable to assume that it does.
The thing that really blows minds is the fact Linux does not do name resolution at all. Getting rid of glibc breaks a lot of stuff because everyone depends on glibc to do it.
You and I and bunch of other people know it and take it to be self-evident, but someone discovered it (maybe recently, maybe they have known it for a while) and did the nice write up for people who had not have known that yet. https://xkcd.com/1053/
The lucky 10,000 is a positive take on the situation. But the article using "real," which I think would connote "legitimate" to most, seems a little more polarizing than sharing a discovery.
That's not a truth that'd come from first principles, never mind a trivial truth; it's extremely trivial to imagine a kernel that does parse PATH where it wouldn't be true.
As such, it's a thing one has to explicitly look up to know, which the author did.
Well, execve(2) and execvp(3) are both "system" functions. C (which is already black magic for some people) invokes both by calling into functions exported from libc. If you're not super dorky^Wfamiliar with low-level systems stuff, you might guess that the two functions are implemented in the same place and in the same way. That the latter is just a libc wrapper around the former that does a PATH search is arcane detail you don't have to care about 99% of the time.
It's hard to appreciate how the world looks before you learn a fact. You can't unsee things.
Using "#!sh" at the top of the file does work, but not predictably. It may execute sh in your current directory, which is what Linux does, but your shell may override that (zsh does if the first attempt fails). So it works, but not the way you want it to.
The title is nonsense. PATH is the name of an environment variable (a Real Thing(TM)) which lists a set of directories to search for an executable. It is used by shells (including those running on Linux) to locate an executable when the full path to the executable is not supplied by the user.
This is needed because the exece()/execve() [2] kernel system call is unaware of things like environment variables so it will not have any idea how or where to execute a program 'cat' unless it is given the full path to 'cat', so the shell has to look it up (again if the user doesn't pass the full path). It's the same on every POSIX system and the original UNIXes. It's been this way for at least 50 years. (edit 60 years, it's from Multics [1])
Kids today really need to learn the fundamentals of computer operating systems. Or do that boring old-person thing we did before StackOverflow, and read all the manual pages, which tell you all this [3] [4].
The fact that the Linux kernel does not track environment variables of the processes is not a "fundamental". The setenv/getenv could very well have been syscalls, it's simply a design decision that they are not. One can make a kernel with such tracking, and it'd still be POSIX compliant as long as you supply setenv(3)/getenv(3) wrappers with expected signatures in your system libc.
It's fine to understand what the code is doing in a shallow way. But this leaves out a lot of important information. Information that isn't immediately obvious just by reading code, that can help you avoid problems and understand the system in-depth, without years of trial and error. Which is why they wrote a manual. You can also both read the code and the manual, but the manual will give you much more knowledge in a smaller amount of time.
This actually helps explain some behaviors I’ve encountered. It was never a serious issue, since the answer is to use a full path. But is slightly annoying none the less. Understanding helps a lot.
I was trying to understand what the lede was here, and it turns out the author assumed that PATH was something understood by the kernel, which is rather an odd assumption, but perhaps one that others make.
I did get one thing out of this though. I had honestly wondered for the longest time why we need to call env to get the same functionality as PATH in a shebang.
Ironically, thanks to either an article I read here (or on the crustacean site) recently, I already knew that the shebang is something which is parsed by the kernel, but had not put two and two together at all.
Much like the author. So goes to show the benefits of exploring and thinking about seemingly "obvious" concepts.
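A sketch of why the env trick works: the kernel takes the shebang path literally, so pointing it at env (which lives at a fixed, well-known path) lets env do the PATH lookup the kernel won't:

```shell
# Build and run a script whose interpreter is located via PATH by env,
# not by the kernel (the kernel only resolves /usr/bin/env itself).
script=$(mktemp)
cat > "$script" <<'EOF'
#!/usr/bin/env sh
echo "interpreter located via PATH"
EOF
chmod +x "$script"
"$script"
rm -f "$script"
```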
config BINFMT_SCRIPT
    tristate "Kernel support for scripts starting with #!"
    default y
    help
      Say Y here if you want to execute interpreted scripts starting with
      #! followed by the path to an interpreter.
Doing this is what made programming fun again for me. I made a freestanding Lisp interpreter instead of a shell. No C library, just Linux system calls. I've written quite a bit of ELF code too, no linker yet though.
Uh, yeah, duh. But I kept waiting for him to discover hash in the shell. No such luck. Guess it's in the magic somewhere. (Do man hash or something if you have no idea what I'm talking about.)
Nobody talks about vfs path resolution here? There are too many layers in the whole process, even the path from strace can be resolved to another path.
Accessing environment variables from kernel space isn't even all that easy, because the information lives in userspace in the process's VM. Here's how it's done for the purpose of showing it in `/proc/[pid]/environ`:
> The shell, not the Linux kernel, is responsible for searching for executables in PATH!
I mean, no shit, Sherlock? The exec family of system calls requires a path to a file, not a filename with an implicit path from the environment; of course PATH is handled by the shell.
All members of the exec family of system calls, which consists of only two syscalls, namely, execve(2) and execveat(2), literally have the envp parameter which is supposed to have all the environment variables for the process.
Now, the semantics of this parameter is that kernel does not use it for path resolution when searching for the executable — but it could.
Fun fact: if you've ever had bash (or another shell) complain that a file doesn't exist, even though it's on $PATH, check if it's been cached by `hash`. If the file is moved elsewhere on $PATH and bash has the old path cached, you will get an ENOENT. The entire cache can be invalidated with `hash -r`.
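A quick way to poke at this cache from bash (a sketch; `-t` and `-d` are bash-specific options of the `hash` builtin, and exact output varies by version):

```shell
# Inspect and manipulate bash's command-location cache.
hash -r                 # start with an empty table
ls /tmp >/dev/null      # running an external command caches its path
hash -t ls              # prints the cached path, e.g. /usr/bin/ls
hash -d ls              # drop just this one entry
hash -t ls 2>/dev/null || echo "ls is no longer cached"
```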
I'm not sure of the exact situation, but I've had cases where a script didn't have the right shebang, and as such I had to resort to `alias --save hash='#'` to make sure the script worked.
you just solved a bug I couldn't explain like 6 years ago
I think bash has an alias “rehash” that does the same as hash -r too. But zsh doesn’t have it, so “hash -r” has entered my muscle memory, as it works in both shells.
Edit: wrong shell, zsh has rehash, bash does not.
the odd thing is, at some point I ended up with `hash -R` as muscle memory that I always type before I correct it to a lower case r, and I'm not sure why, I can't remember any shell that uses `-R`.
but zsh has "rehash"? for as long as I remember.
Bah, you’re right! I got it backwards, it’s zsh that has rehash, bash does not. And hash -r works in both.
I guess I’ve been using zsh longer than I thought, because I learned about rehash first, then made the switch to hash -r later. I started using zsh 14 years ago, and bash 20+ years ago, so my brain assumed “I learned about rehash first” must have been back when I was using bash. zsh is still “that new thing” in my head.
If there's something nice that one shell has and the other doesn't, zsh is the one that has it.
Counterpoint, /dev/udp pseudo-devices in bash.
Is this an old behavior? I would think ENOENT would invalidate the cache entry at least.
It's still a thing in bash 5.2.37.
It's just how bash works. If there's an entry in the session cache, it uses it. Since executable paths only get cached when you run a command successfully, this only happens when it gets moved from one directory in your PATH to another after you run it once, which isn't that common.
Setting PATH or calling hash -r will clear the session cache or one could run set +h which will disable it altogether.
> this only happens when it gets moved from one directory in your PATH to another after you run it once
It also happens when you have two executables in different directories and then you delete the one with the higher priority. Happens regularly for me after I uninstall a Linux Homebrew package.
Isn't cache invalidation one of the hard problems?
Sure but not doing it on ENOENT suggests they’re just being completely lazy. Not to mention that they do have the tools (eg inotify watches) to proactively remove stale entries based on HD changes. Of course I’d be careful about the proactive one as it’s really easy to screw things up more (eg 100 bash instances all watching the same PATH directories might get expensive or forgetting to only do this for interactive mode connected to a TTY)
Retested with bash 5.2.37(1)-release from Debian testing, it still does this :(
Changing $PATH does wipe it at least.
Not working, as intended
Ah, so that's where sudo texhash -r comes from when installing a latex package!
The other typical cause is when an interpreter or library is compiled with the wrong libc version.
Wtf. TIL about hash.
Using "hash" is arguably the best way to determine if a command is available in a BASH script
e.g.
If you want to be compatible across all shells, use command -v. POSIX mandates that it exists and has that return-code behaviour, whereas it doesn't mandate the hash, which, or where commands.
...and of course, if you're going to run the command anyway, and you know an invocation that does nothing and always exits with success, you can do that too. I like running "--version" or equivalent in CI systems, because it has the side effect of printing what actual versions were in use during the run.

Yeah, if you're targeting POSIX shells, then "command -v" may be more reliable.
If you're targeting BASH, then "hash" is a builtin so maybe slightly quicker (not that it's likely to be an issue), and it caches the location of "java" or whatever you're looking for, so possibly marginally quicker when you do want to run the command.
Whilst running "java -version" may be useful in some scripts (my scripts often put the output into a debug function so it only runs it when I set LOG_LEVEL to a suitable value, but it writes output to a file and STDERR), you run into an issue of "polluting" STDOUT which then means that you're not going to be using your script in a pipeline without some tinkering (ironically you're putting the failure message into STDERR when you probably don't care as the script is exiting and hopefully breaking the pipeline). Also, it can take some research to figure out what invocation to use for a specific command, whereas the "hash" version can just be used with little thought.
By the way, I don't believe that ">&2" is POSIX compliant, but that's trivial to fix.
As far as I know, it's POSIX compliant?
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V...
Fair enough - I usually just target BASH and don't worry about POSIX compliance.
Holy shit
Path globbing, pipes, redirection, job control (fg/bg), and all shell variables -- not just $PATH -- are all handled by the shell.
The kernel has no idea what the current process' environment $PATH is, and doesn't even parse any process environment variables at all.
PATH isn't just handled by the shell though. Many (but not all!) of the exec* family of functions in libc respect PATH.
It seems too far to go to say that because a system library holds some implementation details that the responsibility doesn't lie with the program using them. There's all sorts of complex interdependent details that make those kind of boundary distinctions difficult in many operating systems.
On Linux the main boundary between user space and kernel is quite clear: the system call layer. It is stable and well documented.
https://github.com/torvalds/linux/blob/master/Documentation/...
System libraries like glibc are not part of the kernel, they are just components that can be replaced.
I wrote an article about it:
https://www.matheusmoreira.com/articles/linux-system-calls
I even asked Greg Kroah-Hartman about it:
https://old.reddit.com/r/linux/comments/fx5e4v/im_greg_kroah...
> So we rely on different libc projects to provide this, and work with them when needed.
> This ends up being more flexible as there are different needs from a libc, and for us to "pick one" wouldn't always be fair.
> And yes, you can just use a "nolibc" type implementation of you like.
> I know I do that for new syscalls when working on them, there's nothing stopping anyone else from doing that as well.
You can trash the entire GNU system and rewrite it all in Rust or Lisp if you wanted. It doesn't have to be some POSIX-like thing either, it could be whatever you wanted it to be. It doesn't need to have things like PATH. You could write a static freestanding application and boot Linux directly into it.
Nobody does stuff like this; it's a lifetime of work. But it could be done.
That is indeed one of the more well defined boundaries in the system. Also worth understanding is that programs aren't generally invoking system calls directly, for example calling interrupt 0x80, glibc provides wrapper functions that invoke system calls, blurring the boundary a bit. Further blurring the boundary is the vDSO layer that intercepts some system call wrappers for more efficient access.
At issue in this article and comment thread is the boundary between the shell, environment, and Linux. This is a blurrier boundary still because the shell sets up the environment, which is passed through the kernel, and interpreted for downstream processes, generally (but not necessarily) by that shared system library.
> vDSO layer that intercepts some system call wrappers for more efficient access.
Technically the vDSO library doesn't intercept. libc chooses to use either the vDSO or the syscall. This can happen either in the wrapper itself, or through a special PLT helper where the linker asks libc to resolve the symbol to populate the GOT entry. vDSO symbols have the prefix __kernel_ or __vdso_.
That's fair, sorry for my casual language in a technically nuanced discussion. I hadn't looked at this in quite a while, but it was good to review. Thanks for the prompting.
https://github.com/bminor/glibc/blob/glibc-2.41/sysdeps/unix...
https://github.com/bminor/glibc/blob/glibc-2.41/sysdeps/unix...
I also double-checked the glibc and musl code to make sure I wasn't misremembering, and ended up learning about IFUNC.[1] Previously I had avoided going down the rabbit hole to understand what glibc's libc_ifunc was doing. I don't think musl uses IFUNC, at least not for clock_gettime; it seems to always link the wrapper which calls the vdso through an internally managed pointer.[2]
And now I'm wondering how safe all this indirection is. For the PLT/GOT approach I think you can disable lazy binding and force the GOT table to be read-only so exploits can't overwrite the symbol addresses. But for musl's approach it doesn't seem like you can make its internal function pointer read-only, though maybe it's more difficult to find the address of than GOT table slots.
[1] https://sourceware.org/glibc/wiki/GNU_IFUNC [2] https://git.musl-libc.org/cgit/musl/tree/src/time/clock_gett...
Modern binaries use the syscall instruction instead of int 0x80. The latter still works though.
> Also worth understanding is that programs aren't generally invoking system calls directly
They don't generally do that but they absolutely can. I wrote a Lisp interpreter that does just that. It's completely static, has zero dependencies and talks to the kernel directly. The idea is to implement every primitive on top of Linux, and everything else on top of the primitives.
From the kernel's perspective, every program is talking to it directly. They just typically use glibc routines to do it for them. There's no actual need for glibc to be there though.
At some point I even tried adding Linux system call builtins to GCC so that the compiler itself would generate the code in the correct calling convention. Lost that work due to a hard disk crash but on the mailing list I didn't get the impression the maintainers favored merging it anyway.
> for example calling interrupt 0x80, glibc provides wrapper functions that invoke system calls, blurring the boundary a bit
Not all of them. It still doesn't support all of the clone system calls.
https://www.man7.org/linux/man-pages/man2/clone.2.html
It's not just niche system calls either. It took years for glibc to provide getrandom.
https://www.man7.org/linux/man-pages/man2/getrandom.2.html
https://lwn.net/Articles/711013/
It's really annoying how these glibc wrappers get confused with the actual Linux system calls which work very differently. The most notable difference is there's no global thread local errno nonsense with the real system calls, the kernel just gives you a perfectly normal return value in a register. There's also a ton of glibc machinery related to system call cancellation that gets linked in if you use it.
Documentation out there conflates the two. I expected the man page above to describe only the Linux system call but it also describes the glibc specific stuff. That way people get the impression they are one and the same.
> Further blurring the boundary is the vDSO layer that intercepts some system call wrappers for more efficient access.
The vDSO is a documented stable Linux kernel interface:
https://github.com/torvalds/linux/blob/master/Documentation/...
It's just a perfectly normal ELF shared object that the kernel maps into the address space of every process on certain architectures. Its address is passed via the auxiliary vector which is located immediately after the environment vector. Glibc merely finds it and uses it. I can make my interpreter use it too.
It's completely optional. Its purpose is making certain system calls faster by eliminating the switch to kernel mode. This is useful for time/date system calls which are invoked frequently. The original system calls are still available though.
> This is a blurrier boundary still because the shell sets up the environment, which is passed through the kernel, and interpreted for downstream processes, generally (but not necessarily) by that shared system library.
The shell passes the environment to the execve system call but the kernel does not interpret it in any way. It doesn't even enforce the "key=value" format since this is just a convention. It's essentially an opaque array of strings and it's up to user space to make sense of whatever it contains. Glibc chooses to parse those strings into program state in the form of environment variables whose values programmers can query.
> It took years for glibc to provide getrandom.
A tangent: Robert Clausecker, the guy who submitted the proposal for adding tcgetwinsize() and SIGWINCH to POSIX, apparently did it because it "is probably the easiest way to get glibc to implement a feature you want" [0].
[0] https://news.ycombinator.com/item?id=42041467
My use of "blurry" is because you asserted a clear boundary between user and kernel space. While I agree that this boundary is well-defined, it is indeed "blurred" (made less clear) by the glibc function wrappers and vDSO injected functions. Because the glibc library is a system library and the vDSO is a blob of library code mapped in the kernel. It's not a simple interrupt to context switch and return when complete with state having been updated from "over the fence."
To me the description of a "clear boundary" should avoid the amount of nuance around whether the application's call lands in a library or the kernel's syscall handler. The fact that it doesn't means that the boundary is less clear, or blurry as was the term I adopted here.
There is also paths.h usually located at /usr/include/paths.h. It contains the default PATH macro _PATH_DEFPATH.
Those functions aren't the real system calls provided by Linux, they're just glibc wrappers with added functionality. Linux kernel execve has absolutely no concept of PATH, it just opens the file at the provided pathname. That's a good thing too, user space might want to customize that stuff.
Sure, but it is also not the same thing as the shell.
Yes. Shells typically do their own path resolution as well. I know GNU bash does, at least. I customized that logic in order to make a little library system for shell scripts.
Why would strace cat be useful here? By the time cat runs, it was obviously already found.
It is basic knowledge that PATH is used by a command interpreter to locate the pathname of binaries. This is true for Window's cmd.exe as well. I never heard of a system where locating files for execution was performed by a kernel.
True... `strace bash -c cat` would give more the series of stat calls they're intending to see:
newfstatat(AT_FDCWD, ".", {st_mode=S_IFDIR|0700, st_size=4096, ...}, 0) = 0
newfstatat(AT_FDCWD, "/usr/local/sbin/cat", 0x7fffcec2f3b8, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/usr/local/bin/cat", 0x7fffcec2f3b8, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/usr/sbin/cat", 0x7fffcec2f3b8, 0) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/usr/bin/cat", {st_mode=S_IFREG|0755, st_size=68536, ...}, 0) = 0
The kernel's job is to execute executable files, while the shell's job is to bridge the gap between a user-facing command name ("cat") and an executable file (/usr/bin/cat). The PATH environment variable provides such a good general and transparent way to control this task that most shells on most operating systems work that way.
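That bridging can be sketched in a few lines of portable shell: a naive re-implementation of the lookup, which is essentially the stat/access probing visible in the strace output above.

```shell
# Naive re-implementation of the shell's PATH lookup (ignores the
# hash cache, builtins, functions, and empty PATH components).
lookup() {
    cmd=$1
    IFS=:
    for dir in $PATH; do            # split PATH on ':'
        if [ -x "$dir/$cmd" ] && [ ! -d "$dir/$cmd" ]; then
            printf '%s\n' "$dir/$cmd"
            return 0
        fi
    done
    return 127                      # the classic "command not found" status
}

lookup cat                          # prints e.g. /usr/bin/cat
```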
I never heard of a system where locating files for execution was performed by a kernel.
Also true for MS/PC-DOS... which also holds the distinction of having some rare "truly monolithic" API-compatible variants that put the kernel, drivers, and shell in a single binary, so that may satisfy your criteria.
In the [exec][1] family of POSIX functions, if the command path doesn't contain a slash, then it's looked up in the PATH.
> If the file argument contains a slash character, the file argument shall be used as the pathname for this file. Otherwise, the path prefix for this file is obtained by a search of the directories passed as the environment variable PATH [...]
[1]: https://pubs.opengroup.org/onlinepubs/009695399/functions/ex...
The Linux kernel also doesn't have any concept of shared libraries, which are resolved by ld.so, a program that's usually shipped as part of libc.
I like this approach of shunting off functionality that's important, necessary, and omnipresent across all OSes to userspace, rather than giving into the temptation to put everything and the kitchen sink into the kernel. It seems to make a more versatile and future proof OS, that's easy to work with in spite of uncertainty.
I've worked with "both sides" and the way ELF shared libraries on Linux work is an absolute bloody mess compared to how Windows' PE works. On Windows the same executable format and dynamic linker are usable in both user and kernel mode.
to be fair to linux, elf was bolted on after a few years, the original linux used a variant of coff without shared library support.
This is even reflected in the ELF format itself. There's this really arcane dichotomy between sections and segments.
Sections are very detailed metadata that all sorts of things use for all sorts of purposes. Compilers use them. Debuggers use them. Static and dynamic linkers use them. Anyone can use them for any purpose whatsoever. You can easily add your own custom sections to any executable using tools like objcopy. It's completely arbitrary, held together by convention.
Segments, on the other hand, don't even have names. They are just a list of file extents required for the program to actually execute and their address space locations. The program header table is essentially a sorted list of arguments for the mmap system call.
This is Linux kernel's ELF loader:
https://github.com/torvalds/linux/blob/master/fs/binfmt_elf....
It basically just mmaps in the PT_LOAD segments of the ELF file, copies stuff like arguments and environment and then starts a thread at the entry point specified in the ELF header.
It's just that when loading dynamic ELFs it jumps into the dynamic linker instead of the actual program. It's as though every single program had a #!/lib/ld.so shebang line. The absolute path is even hardcoded into the executable itself.
When an "interpreter" is requested, Linux will load it alongside the actual program and will run it instead of the actual program. This "ELF interpreter" then does an absurd amount of work by recursively loading and linking libraries, linking the actual executable and only then jumping into its entry point.

I'm not kidding about the "absurd amount of work" part. These linkers even have to topologically sort dependencies like a package manager so they can be initialized properly.
https://blogs.oracle.com/solaris/post/init-and-fini-processi...
This is really great information. Thanks for this.
Why would the author think that the PATH environment variable is being used by the kernel? What an odd assumption.
Ignorance leading to assumptions. Their eureka moment: "The shell, not the Linux kernel, is responsible for searching for executables in PATH!" makes it obvious they haven't read up on operating systems. Shame because you should know how the machine works to understand what is happening in your computer. I always recommend reading Operating Systems: Three Easy Pieces. https://pages.cs.wisc.edu/~remzi/OSTEP/
The thing is, though, that PATH being a userspace concept is a contingent detail, an accident of history, not something inherent to the concept of an operating system. You can imagine a kernel that does path searches. Why not?
There's a difference between something being a certain way because it has to be that way in order to implement the semantics of the system (e.g. interrupt handlers being a privilege transition) and something being a certain way as a result of an arbitrary implementation choice.
OSes differ on these implementation choices all the time. For example,
* in Linux, the kernel is responsible for accepting a list of execve(2) argument-words and passing them to the exec-ed process with word boundaries intact. On Windows, the kernel passes a single string instead and programs chop that string up into argument words in userspace, in libc
* in Linux, the kernel provides a 32-bit system call API for 32-bit programs running on 64-bit kernels; on Windows, the kernel provides only a 64-bit system call API and it's a userspace program that does long-mode switching and system call argument translation
* on Windows, window handles (HWNDs, via user32.dll) in IPC message passing (ALPC, in ntoskrnl) are implemented in the kernel, whereas the corresponding concepts on most Linux systems are pure user-space constructs
And that's not even getting into weirder OSes! Someone familiar with operating systems in general can nevertheless be surprised at how a particular OS chooses to implement this or that feature.
> The thing is, though, that PATH being a userspace concept is a contingent detail, an accident of history, not something inherent to the concept of an operating system. You can imagine a kernel that does path searches. Why not?
Right. You can't be sure that someone didn't stick $PATH expansion into glibc, or something. Because someone did.
QNX gets program loading entirely out of the kernel. When QNX is booted, initial programs and .so files in the boot image are loaded into memory. That's how things get started. Disk drivers, etc. come in that way, assuming the system has a disk.
Calling "exec.." or ".. spawn" merely links to a .so file that knows how to open and read an executable image. Program loading is done entirely by userspace code. Tiny microkernel. The "exec.." functions do not use the PATH variable.[1]
However, "posix_spawn" does read the PATH environment variable, in both QNX [2] and Linux.[3] Linux, for historical reasons, tends not to use "spawn" as much, but those are the defined semantics for it. QNX normally uses "spawn", because it lacks the legacy that encouraged fork/exec type process startup. "posix_spawn" is apparently faster in modern Linux, especially when the parent process is large, but there's a lot of fork/exec legacy code out there.
"posix_spawn" comes from FreeBSD in 2009, but I think the QNX implementation precedes that, because QNX's architecture favors "spawn" over "exec.." It may go back to UCLA Locus.
Windows has different program startup semantics. Someone from Windows land can address that. MacOS has a built in search path if you don't have a PATH variable.[5]
[1] https://www.qnx.com/developers/docs/8.0/com.qnx.doc.neutrino...
[2] https://www.qnx.com/developers/docs/8.0/com.qnx.doc.neutrino...
[3] https://www.man7.org/linux/man-pages/man3/posix_spawn.3.html
[4] https://www.whexy.com/posts/fork
[5] https://developer.apple.com/library/archive/documentation/Sy...
* in Linux, the kernel is responsible for accepting a list of execve(2) argument-words
Yes it does, but the more surprising thing is (coming from AmigaOS with its dos.library function ReadArgs()) that the shell does this. The shell is also responsible for argument expansion - madness!
On AmigaOS, when you type "delete foo#? force", the shell passes the entire command line to the delete command. The delete command calls ReadArgs() with a template (FILE/M/A, ALL/S, QUlET/S, FORCE/S), and the standard OS function parses it into lists of files, flags, keyword arguments, etc. The "file" passed is "foo#?", and the command uses MatchFirst()/MatchNext() to do file pattern matching.
Every command (that uses ReadArgs() and didn't plump for "standard C" parsing) has the same behaviour: running the command with "?" gives you the template, which tells you how to use it. Args are parsed consistently across all programs.
Then you get "standard C", which, because of K&R and main(), ignores this standard Amiga parsing function and just does naive splits. Across multiple Amiga C compilers, quoting rules are inconsistent. Amiga C compilers have to produce an executable, and it knows it'll be called with a full command line, so the executable itself has to break that into words before it can call main(), and it's up to each compiler writer how they're going to do that. Urgh.
In unix-land, it's up to the shell to parse the command line, and pass only the words... hence why the shell naturally does all the filename globbing, and why you have gotchas like when these two commands are sometimes the same and sometimes they're not:
Then we have Windows, which is like Amiga C programs - it's being passed a full command string and will have its C runtime parse it for main() to consume. There's a vague expectation that it'll do quoting "like COMMAND", which itself has very odd quoting rules. At least, most people are all using the same C compiler on Windows, so it's mostly only MSVCRT's implementation, so it's mostly consistent.

Username checks out.
I think one of the most surprising things I learned about bash is that you can do this:
And now you have rm -rf'd. :)

Indeed, always prefer ./* to *
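The classic demonstration (a sketch; run it only in a throwaway directory). A file whose name looks like an option gets glob-expanded straight into rm's argument list:

```shell
mkdir -p /tmp/globdemo && cd /tmp/globdemo
touch -- -rf important.txt   # create a file literally named "-rf"
rm *            # the glob expands to: rm -rf important.txt
ls              # important.txt is gone; "-rf" survives, parsed as options
rm -- ./*       # safe: "./-rf" can never be mistaken for an option
```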
I often wish there was a convenient way of doing such an operation in the shell: if path start with "/", leave it, otherwise prepend "./"
Prepend for all paths on a command line? Or just for the executable?
For all paths it could be dangerous and should very probably not be done. But for executables it's less dangerous and can easily be done by putting '.' into $PATH.
> I often wish there was a convenient way of doing such an operation in the shell: if path start with "/", leave it, otherwise prepend "./"
Both bash and zsh have enough functionality exposed via shell functions and variables for you to define a keybinding that does exactly this, interactively. Good idea.
Did you mean an interactive command? Or something else?
I meant non-interactive, for use in scripts which take user input. We already have "--" for end of options, but the support for it is not universal and even with that some programs will interpret certain strings in a special way. On the other hand, prepending the dot-slash should work for any program or argument passing style.
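Such a normalization is a one-liner in portable shell; `anchor_path` here is a hypothetical helper name:

```shell
# Hypothetical helper: leave absolute paths alone, anchor anything
# else with "./" so it can never be parsed as an option.
anchor_path() {
    case $1 in
        /*) printf '%s\n' "$1" ;;
        *)  printf './%s\n' "$1" ;;
    esac
}

anchor_path /etc/passwd    # -> /etc/passwd
anchor_path -rf            # -> ./-rf
```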
We should use "--" more, but who has all this time to waste? :)
One thing I was surprised to learn a couple years ago is that users and groups aren't really tracked much by the Linux kernel: they're just numeric IDs that track process and file ownership. So if you setuid() to a user ID that doesn't exist in /etc/passwd or anywhere else, the kernel won't stop you.
If I have a file on machineA with uid10001 and I copy the file to machineB, I might want it to retain that uid, but it shouldn't matter to machineB that it doesn't map to a real user.
Hopefully that user actually doesn’t exist on the second machine!
You’ll see this observation all the time building containers.
Don't if you only run them with root user.
or with ipa-esque authentication schemes and shared mounts
And NFS!
Unnecessarily rude. There was also a time when you didn’t know this. I can guarantee it!
The question is why the author wrote such a clickbait title and drew such an odd conclusion. Legitimate question IMO, nothing rude there.
It's not about knowledge, but about assumptions. The title and conclusion hint that there are some obvious assumptions, but these are not detailed. Maybe the author assumed that because of the ubiquitous use of PATH across shells, it had to be managed centrally.
I don't think it's an odd assumption at all! The lines between shell, exec calls, globbing, etc, are very blurry if you don't already know how it all fits together.
Why not? Every executable is started with the execve(2) syscall, which takes an array of environment variables that the kernel uses to replace the environment the process inherited from its parent, so obviously the kernel has full knowledge of the environment variables of all of the processes in the system.

Now, there is a reason why the kernel actually does not have such knowledge, but it's not at all unreasonable to assume that it does.
The thing that really blows minds is the fact Linux does not do name resolution at all. Getting rid of glibc breaks a lot of stuff because everyone depends on glibc to do it.
https://wiki.archlinux.org/title/Domain_name_resolution
https://en.wikipedia.org/wiki/Name_Service_Switch
https://man.archlinux.org/man/getaddrinfo.3
You and I and bunch of other people know it and take it to be self-evident, but someone discovered it (maybe recently, maybe they have known it for a while) and did the nice write up for people who had not have known that yet. https://xkcd.com/1053/
The lucky 10,000 is a positive take on the situation. But the article using "real," which I think would connote "legitimate" to most, seems a little more polarizing than sharing a discovery.
Click bait title for sure
That's not a truth that'd come from first principles, never mind a trivial truth; it's extremely trivial to imagine a kernel that does parse PATH where it wouldn't be true.
As such, it's a thing one has to explicitly look up to know, which the author did.
Well, execve(2) and execvp(3) are both "system" functions. C (which is already black magic for some people) invokes both by calling into functions exported from libc. If you're not super dorky^Wfamiliar with low-level systems stuff, you might guess that the two functions are implemented in the same place and in the same way. That the latter is just a libc wrapper around the former that does a PATH search is arcane detail you don't have to care about 99% of the time.
It's hard to appreciate how the world looks before you learn a fact. You can't unsee things.
But the man page section tells you which one is a kernel syscall (2) and which is a C library function (3)...
Which is the universally known convention everyone is born with inherent knowledge of. Also, people read man-pages.
What person diving into their shell's source code on Linux doesn't read manpages? Or even man's manpage at least once?
Daniel Huang, the one that wrote TFA? People are different, I don't know what else to tell you. But generally, people don't read man pages or docs.
Using "#!sh" at the top of the file does work, but not predictably. It may execute sh in your current directory, which is what Linux does, but your shell may override that (zsh does if the first attempt fails). So it works, but not the way you want it to.
And I'm sure other kernels do other things too.
I don't get the logical passage from "PATH is handled by the shell" to "isn't real on Linux".
It's real, it's just implemented by the shell -- same as all Unix-like operating systems. Heck, same as Windows.
Might be more accurate to say "Linux doesn't know about PATH, but your shell does"
The title is nonsense. PATH is the name of an environment variable (a Real Thing(TM)) which lists a set of directories to search for an executable. It is used by shells (including those running on Linux) to locate an executable when the full path to the executable is not supplied by the user.
This is needed because the exece()/execve() [2] kernel system call is unaware of things like environment variables so it will not have any idea how or where to execute a program 'cat' unless it is given the full path to 'cat', so the shell has to look it up (again if the user doesn't pass the full path). It's the same on every POSIX system and the original UNIXes. It's been this way for at least 50 years. (edit 60 years, it's from Multics [1])
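A quick way to convince yourself of this from the shell (a sketch): with no usable PATH, bare command names stop resolving, while full paths, which go straight to execve(), keep working:

```shell
# The shell needs PATH to resolve bare names; execve() never reads it.
bash -c 'PATH=/nonexistent; cat </dev/null'       # command not found (127)
bash -c 'PATH=/nonexistent; /bin/cat </dev/null'  # works: execve got a full path
```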
Kids today really need to learn the fundamentals of computer operating systems. Or do that boring old-person thing we did before StackOverflow, and read all the manual pages, which tell you all this [3] [4].
[1] https://en.wikipedia.org/wiki/PATH_(variable) [2] https://man7.org/linux/man-pages/man2/execve.2.html [3] https://www.man7.org/linux/man-pages/man1/dash.1.html [4] https://www.man7.org/linux/man-pages/man1/intro.1.html https://www.man7.org/linux/man-pages/man2/intro.2.html https://www.man7.org/linux/man-pages/man7/man-pages.7.html https://www.man7.org/linux/man-pages/man7/standards.7.html
The fact that the Linux kernel does not track environment variables of the processes is not a "fundamental". The setenv/getenv could very well have been syscalls, it's simply a design decision that they are not. One can make a kernel with such tracking, and it'd still be POSIX compliant as long as you supply setenv(3)/getenv(3) wrappers with expected signatures in your system libc.
Reading the code to things is perfectly fine, actually.
It's fine to understand what the code is doing in a shallow way. But this leaves out a lot of important information. Information that isn't immediately obvious just by reading code, that can help you avoid problems and understand the system in-depth, without years of trial and error. Which is why they wrote a manual. You can also both read the code and the manual, but the manual will give you much more knowledge in a smaller amount of time.
This actually helps explain some behaviors I've encountered. It was never a serious issue, since the answer is to use a full path. But it is slightly annoying nonetheless. Understanding helps a lot.
A few others are saying, "well yeah, duh!", but this to me demonstrates a mental fault that arises in calling GNU+Linux, "Linux".
Is it right to assume that the PATH env variable and the context in which it is used are two different things? The PATH variable fundamentally is the same as other env variables like HOME / USER, but how PATH is interpreted will change from context to context?
Silly tangentially related question; I like to think of myself as fairly competent in the Linux and unix world.
In the unix systems of the past was it easier to hold a more complete understanding of the system and its components in your head?
I was trying to understand what the lede was here, and it turns out the author assumed that PATH was something understood by the kernel, which is rather an odd assumption, but perhaps one that others make.
I did get one thing out of this though. I had honestly wondered for the longest time why we need to call env to get the same functionality as PATH in a shebang.
Ironically, thanks to either an article I read here (or on the crustacean site) recently, I already knew that the shebang is something which is parsed by the kernel, but had not put two and two together at all.
Much like the author. So goes to show the benefits of exploring and thinking about seemingly "obvious" concepts.
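For anyone connecting the same dots, here is a quick demo of why the `env` trick works (assuming a Unix-ish system with /usr/bin/env and a writable /tmp): the kernel's shebang handler does no PATH search, so the interpreter line must be an absolute path; env(1) lives at a known location and does the PATH lookup itself before exec'ing the real interpreter.

```shell
# Write a script whose shebang uses env(1) to find the interpreter.
cat > /tmp/env_shebang_demo.sh <<'EOF'
#!/usr/bin/env sh
echo "hello from env shebang"
EOF
chmod +x /tmp/env_shebang_demo.sh

# The kernel parses "#!", sees /usr/bin/env (absolute path, no PATH
# search needed), and env then finds "sh" on $PATH and execs it.
/tmp/env_shebang_demo.sh
```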
Another bit of trivia about shebang support in Linux is that it is possible to build the kernel without it. https://github.com/torvalds/linux/blob/master/fs/Kconfig.bin...
It's real in GNU/Linux tho...
Legitimately, if you're interested, try writing a shell, your own libc, even an ELF loader. It's fun! C is good and cool!
Doing this is what made programming fun again for me. I made a freestanding Lisp interpreter instead of a shell. No C library, just Linux system calls. I've written quite a bit of ELF code too, no linker yet though.
Uh, yeah, duh. But I kept waiting for him to discover `hash` in the shell. No such luck. Guess it's in the magic somewhere. (Do `man hash` or something if you have no idea what I'm talking about.)
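For anyone following along, a quick way to see the cache in action (assuming bash):

```shell
# Run a command once so bash caches its full path, then inspect the
# cache with the `hash` builtin and flush it with `hash -r`.
bash -c '
  ls /tmp >/dev/null   # first use: bash walks $PATH and caches the hit
  hash                 # prints the cache, including the path to ls
  hash -r              # flush; the next ls triggers a fresh PATH walk
'
```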
Nobody talks about VFS path resolution here? There are many layers in the whole process; even the path you see in strace can be resolved to another path.
I'm telling you, that environment variables you have are NOT real!
Accessing environment variables from kernel space isn't even all that easy, because the information lives in userspace, in the process's VM. Here's how it's done for the purpose of showing it in `/proc/[pid]/environ`:
https://elixir.bootlin.com/linux/v6.14.4/source/fs/proc/base...
Doesn't that go without saying?
what about rehash?
> The shell, not the Linux kernel, is responsible for searching for executables in PATH!
I mean, no shit, Sherlock? The exec family of system calls requires a path to a file, not a filename with an implicit path from the environment; of course PATH is handled by the shell.
All members of the exec family of system calls, which consists of only two syscalls, namely, execve(2) and execveat(2), literally have the envp parameter which is supposed to have all the environment variables for the process.
Now, the semantics of this parameter are that the kernel does not use it for path resolution when searching for the executable. But it could.