How (memory) safe is Zig? (2021)

38 points by vortex_ape 2 months ago

90s_dev 2 months ago

> But it does not nearly approach the level of systematic prevention of memory unsafety that rust achieves.

Unless I gravely misunderstood Zig when I learned it, the Zig approach to memory safety is to just write a ton of tests fully exercising your functions and let the test allocators find and log all your bugs for you. Not my favorite approach, but your article doesn't seem to take into account this entirely different mechanism.

josephg 2 months ago

Yes, testing is Zig's answer. But that quote is right. Testing doesn't achieve the same kind of systematic prevention of memory bugs that rust does. (Or GC based languages like Go, Java, JS, etc.).
You can write tests to find bugs in any language. C + Valgrind will do most of the same thing for C that the debug allocator will do for zig. But that doesn't stop the avalanche of memory safety bugs in production C code.
I used to write a lot of javascript. At the time I swore by testing. "You need testing anyway - why not use it to find other kinds of bugs too?". Eventually I started writing more and more typescript, and now I can't go back. One day I ported a library I wrote for JSON based operational transform from javascript to typescript. That library has an insane 3:1 test:code ratio or something, and deterministic constraint testing. Despite all of that testing, the typescript type checker still found a previously unknown bug in my codebase.
As the saying goes, tests can only prove the presence of bugs. They cannot prove your code does not have bugs. For that, you need other approaches - like rust's borrow checker or a runtime GC.
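A minimal Rust sketch of the kind of bug in question (the function name is made up for illustration): holding a reference into a `Vec` while pushing to it can leave the reference dangling after a reallocation. A test only catches this if it happens to trigger the reallocation, whereas the borrow checker rejects the pattern outright.

```rust
/// Returns the first element and appends a copy of it to the vector.
/// Hypothetical helper, for illustration only.
fn push_copy_of_first(items: &mut Vec<i32>) -> Option<i32> {
    // Rejected at compile time: `first` borrows `items` immutably while
    // `push` needs a mutable borrow, and `push` may reallocate the buffer
    // that `first` points into.
    //
    //     let first = items.first()?;
    //     items.push(*first);
    //     Some(*first)

    // Accepted: copy the value out so no reference outlives the push.
    let first = *items.first()?;
    items.push(first);
    Some(first)
}
```

The commented-out version is the one a test suite could easily miss, since it only misbehaves when the push actually moves the buffer.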
- throwawaymaths 2 months ago
  
  static code analysis tools can also do it. there's no reason why the borrow checker must be in the compiler proper.
  
  alpaca128 2 months ago
  
  There's also no reason to have a separate borrow checker if it could just be integrated in the compiler.
  When a compiler has a borrow checker that means the language was already designed to enable borrow checking in the first place. And if a language can let you do borrow checking why would you use a separate tool?
  
  throwawaymaths 2 months ago
  
  because it gets it out of the fast path compile cycle. do you need a borrow checker for `ls`? Probably not. don't use it. do you need it every time you work through intermediate ideas in a refactor? probably not. just turn it on in CI.
  
  josephg 2 months ago
  
  > do you need a borrow checker for `ls`? Probably not.
  Does ls use references and objects with lifetimes? I bet it does. And if so, the answer is yes. You do need the borrow checker in rust to make sure it uses memory and lifetimes correctly.
  If your program somehow doesn’t use references or owned objects, then the borrow checker doesn’t have any work to do. So there’s no harm done in leaving it on.
  
  alpaca128 2 months ago
  
  The borrow checker is not the slow part of the Rust compiler and lets me avoid bugs, why would I not always want to use it?
  And if you put the borrow checker in the CI you massively increased the latency between writing the code and getting all relevant feedback from the compiler/tooling. This would do the opposite of what you intended.
  
  throwawaymaths 2 months ago
  
  You don't have to query all checks at all times. You only have to borrow check when you PR to main. What is the sum latency of the coder dealing with the conceptual problem of satisfying the borrow checker at every stage during an arduous refactor where they don't fully have the destination model in mind (and might want to try things out only to discard them), versus just ignoring it, seeing how the chips fall, and fixing ownership errors in one or two passes at the end once you've settled on the code structure? Of course you don't have to take the full round-trip-to-someone-else's-computer latency either. If you can run it in CI, I presume you could run it locally too, and borrow checking is not particularly slow for the computer.
  You're also not (because one can't) quantifying the problem of a developer getting exasperated and saying, fuck it, it passes the borrow checker, good enough, instead of actually taking the time to make their code legible first and then making sure it is memory safe. This absolutely happens.
  
  steveklabnik 2 months ago
  
  The borrow checker is very fast.
  
  throwawaymaths 2 months ago
  
  computationally. But it slows down the programmer. It is not a zero-cost human operation. If it were, we wouldn't need computers to do it.
  
  steveklabnik 2 months ago
  
  I don’t think it slows me down, it speeds me up. It’s well known that surfacing issues earlier in a process saves time compared to later in a process.
  
  josephg 2 months ago
  
  Also it’s a great way to make sure every library in the ecosystem passes the borrow checker.
  
  throwawaymaths 2 months ago
  
  why do you expect compile time static analysis to fail at this? unless you're loading a precompiled asset?
  
  josephg 2 months ago
  
  I don't. I think compile time static analysis is great. Upthread you said this:
  > there's no reason why the borrow checker must be in the compiler proper.
  On a technical front, I completely agree. But there's an ecosystem benefit to having the borrow checker as part of the compiler. If the borrow checker wasn't in the compiler proper, lots of people would "accidentally forget" to run it. As a result, lots of libraries would end up on crates.io which fail the borrow checker's checks. And that would be a knock on disaster for the ecosystem.
  But yes, there's nothing stopping you writing a rust compiler without a borrow checker. It wouldn't change the resulting binaries at all.
  
  throwawaymaths 2 months ago
  
  > lots of people would "accidentally forget" to run it
  yeah, like how the sel4 guys accidentally forget to run their static analysis all the time.
  You put a badge on CI. If you "forget to run" the static analysis, then people get on you for not running it. Or people get on you if you don't have the badge. Just like how people get on people for not writing programs in rust.
  
  josephg 2 months ago
  
  Q: If we can get men on the moon, why is my lawnmower such a piece of junk?
  A: The engineers who got us into space didn’t design your lawnmower.
  The SeL4 team won’t forget to run their static analysis checks. But most people aren’t at their level. Most people just want to get on with it. The borrow checker is a pain in the neck to learn - and if it were optional, you better believe lots of people would avoid it forever. I may well have been in that camp too - I found it really hard to get my head around it!
  If you want a modern C-like language without rust’s borrow checker, Zig or Odin is probably a better bet. They’re both fine languages.
  > Just like how people get on people for not writing programs in rust.
  Who? Where?
  
  throwawaymaths 2 months ago
  
  your argument doesn't hold water, because you are appealing to the existence of a schroedinger's programmer that simultaneously would use rust but also wouldn't turn on a borrow checker if it existed in another language.
  My point is the fact that people use Rust means that people do care. And those people will run static analysis checks (as long as those static analysis checks aren't shitty or overly difficult to run), and they will notice that your library isn't checked. Or, better yet, they will integrate it into their project, run the checker on their project, find the problem in the lazy person's dependency and PR a fix or fork it if the maintainer is an asshole.
  Here's the thing. Rust is currently "the only game in town" for spatiotemporal memory safety and that is sucking the air out of the atmosphere, because no one wants to invest time into trying other ideas. And that's a shame, because everyone makes assumptions that Rust must be the only way to do spatiotemporal safety (cough cough Ada/Spark). And it gives cover for people to claim things that are presumptuous and totally untested like our schroedinger's programmer here.
  Rust has a lot lot lot of shitty things in it, like RAII, and proc macros, and pushing you into some incredibly complicated (and privileged) templated types that are doing some things under the hood (remember when rust switched from jemalloc to malloc?), and Rust hasn't ever gotten to keyword generics yet. Those of us who don't want that kind of BS in our language have to constantly be accused of not caring about memory safety which is hardly the truth.
  
  josephg 2 months ago
  
  > your argument doesn't hold water, because you are appealing to the existence of a schroedinger's programmer that simultaneously would use rust but also wouldn't turn on a borrow checker if it existed in another language.
  I'm appealing to the existence of people who would like rust very much, but find the borrow checker to be cumbersome / annoying / difficult. I suspect a lot of people would fall into this camp. Including a lot of people who admire Zig, Jai, Odin, dmd. And probably a lot of C++ developers.
  In my opinion, most of the best parts of rust have nothing to do with the borrow checker. For example, sum types, the rust standard library, cargo & crates.io, traits, Result / Option, editions, rust's unit testing, etc.
  If someone released a borrow checker for zig, would it get 100% adoption? No way. Especially if it was hard to learn, required adding unfamiliar "lifetime annotations" throughout your code and sometimes required you to refactor functions that you know are correct anyway.
  > Here's the thing. Rust is currently "the only game in town" for spatiotemporal memory safety and that is sucking the air out of the atmosphere, because no one wants to invest time into trying other ideas. [...] Those of us who don't want that kind of BS in our language have to constantly be accused of not caring about memory safety which is hardly the truth.
  I think we've found agreement. I've been saying this for years - Rust is humanity's first serious attempt at a programming language like this. It's like the first iPad. I think there's lots of ways a rust successor could learn from rust's mistakes and make a better language. I'm really looking forward to what a borrow checker might look like a few more languages down the line.
  
  saagarjha 2 months ago
  
  Because you're always going to write some code that the tools can't reason about.
  
  throwawaymaths 2 months ago
  
  that's true for rust too (hence "unsafe")
  
  tialaramex 2 months ago
  
  "But seatbelts would also work if everybody was just choosing to use them rather than us mandating their fitment and use, so I don't understand why facts are true"
  Amusingly this is even true for the linter, nobody ran the C linter, more or less everybody runs the Rust linter, the resulting improvement in code quality is everything you'd hope. All humans love to believe they're above average, most are not and average is by definition a mediocre aspiration. Do better.
  
  throwawaymaths 2 months ago
  
  what the hell are you talking about. if you are writing security conscious software you should turn on a static checker and proudly show a badge that says "this code is memory safe". if you're writing a custom data pipeline to be used in a niche scientific field where the consumers are you and anyone that wants to repro your pipeline, and everything is in arenas, who the fuck cares. don't bother with static analysis.
  
  josephg 2 months ago
  
  If everything is in arenas, lifetimes get much easier.
  But, the borrow checker doesn't just check lifetimes. It also checks ownership, and that variables either have a single mutable reference or immutable references. The optimizer assumes those invariants are maintained in the code. Many of its optimizations wouldn't be sound otherwise.
  So, if you could compile code which fails the borrow checker, there's all sorts of weird and wonderful sources of UB eagerly waiting to give you a really bad day - from aliasing issues to thread safety problems to use-after-free bugs. The borrow checker has been around forever in rust. So I don't think anyone has any idea what the implications would be of compiling "bad" code.
  
  throwawaymaths 2 months ago
  
  Point being, there are many many individual programs where none of those things you talk about exist. So why not have a programming system where you can actually turn those things off for development velocity.
  I'm rejecting the idea that "opt-in" is bad. Opt-out is of course better, but "no choice" is not good.
  
  josephg 2 months ago
  
  > there are many many individual programs where none of those things you talk about exist
  Which things? Programs without mutable references and aliasing concerns? Can you give an example?
  Having worked in rust for a few years now, I’m not convinced you’d gain much velocity by disabling the borrow checker. Some code would be a little simpler without lifetime annotations, but you’d also end up spending a lot more time debugging your code. The borrow checker and rust type system are insanely good at finding bugs at compile time. It just takes a while of working in rust before you stop stubbing your toe on the borrow checker’s (sometimes silly) rules.
  If you want easy to write rust, you can always just lean heavily on Box and Rc, and .clone() everywhere. The trade off is your code won’t be as performant - but that doesn’t matter much if you’re prototyping.
  If you care that much, rust is opensource. Fork it and turn the borrow checker off when compiling. It’s probably not even that hard to do. I’d love to hear about the experience - and what it’s like using rust without the borrow checker.
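
For what it's worth, the "lean on `Rc` and `.clone()`" style mentioned above could look roughly like this. The `Config` type and both functions are invented for the example. The point is only that shared ownership and copying let you skip lifetime annotations, at some runtime cost.

```rust
use std::rc::Rc;

/// Hypothetical shared configuration, invented for this example.
#[derive(Debug, Clone, PartialEq)]
struct Config {
    name: String,
    retries: u32,
}

/// Takes shared ownership instead of a borrowed `&Config`, so a caller can
/// hand out as many `Rc` handles as it likes without tracking lifetimes.
fn describe(cfg: Rc<Config>) -> String {
    format!("{} (retries: {})", cfg.name, cfg.retries)
}

/// Clones the whole value instead of borrowing it mutably. Slower than an
/// in-place update, but it sidesteps aliasing questions while prototyping.
fn with_more_retries(cfg: &Config) -> Config {
    let mut copy = cfg.clone();
    copy.retries += 1;
    copy
}
```

Calling `describe(Rc::clone(&shared))` only bumps a reference count, while `with_more_retries` copies the whole struct, which is the performance trade-off described above.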
nine_k 2 months ago

I suppose you can even ship the test/logging allocator with your production build, and instruct your users to run your program with some option / env var set to activate it. This would allow to repro a problem right where it happens, hopefully with some info helpful for debugging attached.
Not a great approach for critical software, but may be much better than what C++ normally offers for e.g. game software, where the development speed definitely trumps correctness.
KerrAvon 2 months ago

What that means, though, is that you have a choice between defining memory unsafety away completely with Rust or Swift, or trying to catch memory problems by writing a bunch of additional code in Zig.
- TimSchumann 2 months ago
  
  I’d argue that ‘a bunch of additional code’ to solve for memory safety is exactly what you’re doing in the ‘defining memory safety away’ example with Rust or Swift.
  It’s just code you didn’t write and thus likely don’t understand as well.
  This can potentially lead to performance and/or control flow issues that get incredibly difficult to debug.
  
  agarren 2 months ago
  
  That sounds a bit unfair. All that code that we neither wrote nor understood, I think in the case of Rust, it’s either the borrow checker or the compiler itself doing something it does best - i.e., “defining memory safety away”. If that’s the case, then labeling such tooling and language-enforced memory safety mechanisms as “a bunch of additional code…you didn’t write and…don’t understand” appears somewhat inaccurate, no?
  
  rk06 2 months ago
  
  It is quite fair as far as rust is concerned. Even simple data structures, like a doubly linked list, are hard problems for rust.
  
  josephg 2 months ago
  
  So? That wasn't the claim. The GP poster said this:
  > This can potentially lead to performance and/or control flow issues that get incredibly difficult to debug.
  Writing a linked list in rust isn't difficult because of control flow issues, or because rust makes code harder to debug. (If you've spent any time in rust, you quickly learn that the opposite is true.) Linked lists are simply a bad match up for the constraints rust's borrow checker puts on your code.
  In the same way, writing an OS kernel or a high performance b-tree is a hard problem for javascript. So what? Every language has things it's bad at. Design your program differently or use a different language.
  
  josephg 2 months ago
  
  > This can potentially lead to performance and/or control flow issues that get incredibly difficult to debug.
  The borrow checker only runs at compile-time. It doesn't change the semantic meaning - or the resulting performance - of your code.
  The borrow checker makes rust a much more difficult and frustrating language to learn. The compiler will refuse to compile your code entirely if you violate its rules. But there's nothing magical going on in the compiler that changes your program. A rust binary is almost identical to the equivalent C binary.
- ajross 2 months ago
  
  Weird that Swift is your totem for "managed/collected runtime" and not Java (or C#/.NET, or Go, or even Javascript). I mean, it fits the bill, but it's hardly the best didactic choice.
  
  saagarjha 2 months ago
  
  I don't think they said anything about that?
  
  ajross 2 months ago
  
  The point was that basically no one knows Swift, and everyone knows Java. If you want to point out a memory safe language in the "managed garbage-collected runtime" family, you probably shouldn't pick Swift.
  
  orobio 2 months ago
  
  I wouldn’t put Swift in the same ‘managed garbage-collected runtime’ family as Java, C#/.NET, Go, and Javascript, so maybe they weren’t trying to do what you think.
  Swift is more like a native systems programming language that makes it easy to trade performance for ergonomics (and does so by default).
- 90s_dev 2 months ago
  
  What if -- stay with me now -- what if we solved it by just writing vastly less code, and having actually reusable code, instead of reinventing every type of wheel in every project? Maybe that's the real secret to sound code. Actual code reuse. I know it's a pipedream, but a man can dream, can't he?
  
  codr7 2 months ago
  
  The way we've done code reuse up to this point rarely lives up to its promises.
  I don't know what the solution is, but these days I'm a lot more likely to simply copy code over to a new project rather than try to build general purpose libraries.
  I feel like that's part of the mess Rust/Swift are getting themselves tangled up in, everything depends on everything which turns evolution into more and more of an uphill struggle.
  
  josephg 2 months ago
  
  Why? In C I'd understand. But cargo and the swift package manager work great.
  By all means, rewrite little libraries instead of pulling in big ones. But if you're literally copy+pasting code between projects, it doesn't take much work to pull that code out into a shared library.
  
  saagarjha 2 months ago
  
  No, this doesn't solve the problem. Libraries have security issues like every other codebase.
  
  0x6c6f6c 2 months ago
  
  Yeah that is the opposite take of recent posts that the Cargo/npm package dependence is way too heavy.
  Saying we should rely on reusable modules is great and all, but that reusable code is going to be maintained by who now?
  There's no sustainable pattern for this yet, most things are good graces of businesses or free time development, many become unmaintained over time- people who actually want to survive on developing and supporting reusable modules alone might actually be more rare than the unicorn devs.
  
  90s_dev 2 months ago
  
  I meant in programming in general, not specific to Rust or Cargo.

pizlonator 2 months ago

> it seems impossible to secure c or c++

False. Fil-C secures C and C++. It’s more comprehensively safe than Rust (Fil-C has no escape hatches). And it’s compatible enough with C/C++ that you can think of it as an alternate clang target.

NIckGeek 2 months ago

Fil-C is impressive and neat, but it does add a runtime to enforce memory safety which has a (in most cases acceptable) cost. That's a reasonable strategy, Java and many other langs took this approach. In research, languages like Dala are applying this approach to safe concurrency.
Rust attempts to enforce its guarantees statically which has the advantage of no runtime overhead but the disadvantage of no runtime knowledge.
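As a rough illustration of the static/dynamic trade-off inside Rust itself: `RefCell` from the standard library enforces the same "one mutable borrow at a time" rule at run time instead of compile time. The two function names below are made up for the example.

```rust
use std::cell::RefCell;

/// Tries to take two mutable borrows of the same cell at once.
/// With plain `&mut` references this would be a compile error. With
/// `RefCell` the rule is checked at run time, so the second attempt
/// returns an error while the first borrow is still alive.
fn second_borrow_fails(cell: &RefCell<Vec<i32>>) -> bool {
    let _first = cell.borrow_mut();
    let second_failed = cell.try_borrow_mut().is_err();
    second_failed
}

/// Once the first borrow has been dropped, borrowing again succeeds.
fn sequential_borrows_succeed(cell: &RefCell<Vec<i32>>) -> bool {
    {
        let mut first = cell.borrow_mut();
        first.push(1);
    } // `first` is dropped here, releasing the borrow.
    let second_ok = cell.try_borrow_mut().is_ok();
    second_ok
}
```

The runtime check costs a counter update per borrow, but it can accept programs whose borrows only conflict on paths that never execute, which a static check cannot know.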
- pizlonator 2 months ago
  
  Rust attempts to enforce guarantees statically, but in practice fails, because of pervasive use of `unsafe`.
  Fil-C doesn't "add a runtime". C already has a runtime (loader, crt, compiler runtime, libc, etc)
  
  NIckGeek 2 months ago
  
  > but in practice fails, because of pervasive use of `unsafe`.
  Yes, in `unsafe` code typically dynamic checks or careful manual review is needed. However, most code is not `unsafe` and `unsafe` code is wrapped in safe APIs.
  I'm aware C already has a runtime, this adds to it.
  
  pizlonator 2 months ago
  
  > Yes, in `unsafe` code typically dynamic checks or careful manual review is needed. However, most code is not `unsafe` and `unsafe` code is wrapped in safe APIs.
  Those are the excuses I heard from C++ programmers for years.
  Memory safety is about guarantees enforced by the compiler. `unsafe` isn't that.
  
  Rusky 2 months ago
  
  The stuff Fil-C adds is on the same footing as `unsafe` code in Rust- its implementation isn't checked, but its surface area is designed so that (if the implementation is correct) the rest of the program can't break it.
  Whether the amount and quality of this kind of code is comparable between the two approaches depends on the specific programs you're writing. Static checking, which can also be applied in more fine-grained ways to parts of the runtime (or its moral equivalent) is an interesting approach, depending on your goals.
  
  pizlonator 2 months ago
  
  > The stuff Fil-C adds is on the same footing as `unsafe` code in Rust- its implementation isn't checked, but its surface area is designed so that (if the implementation is correct) the rest of the program can't break it.
  It’s not the same.
  The Fil-C runtime is the same runtime in every client of Fil-C. It’s a single common trusted compute base and there’s no reason for it to grow.
  On the other hand Rust programmers use unsafe all over the place, not just in some core libraries.
  
  Rusky 2 months ago
  
  Yeah, that's what I meant by "depends on the specific programs you're writing." Confining unsafe Rust to core libraries is totally something people do.
  
  pizlonator 2 months ago
  
  You're equating a core runtime that doesn't grow with libraries written by anyone.
  There's no world in which a Fil-C user would write unsafe code. That's not a thing you can do in Fil-C.
  Rust users write unsafe code a lot and the language allows it and encourages it even.
  
  steveklabnik 2 months ago
  
  > Rust users write unsafe code a lot
  This isn't the case.
  
  pizlonator 2 months ago
  
  Over 170 uses of unsafe in sudo-rs. That’s just one example.
  That’s “a lot” in my book.
  
  steveklabnik 2 months ago
  
  There’s no reason to believe that one program is inherently representative. sudo-rs eschews dependencies and so is likely to be higher than most programs.
  Furthermore, 170 uses in a 200 line program vs a one million line program are very different. I don’t know off hand how big sudo-rs is.
  Even in embedded OS kernels, it’s often around 1%-5% of code. Many programs have no direct unsafe code at all.
  
  Rusky 2 months ago
  
  I mean, again, yeah. I specifically compared the safe API/unsafe implementation aspect, not who writes the unsafe implementation.
  To me the interesting thing about Rust's approach is precisely this ability to compose unrelated pieces of trusted code. The type system and dynamic semantics are set up so that things don't just devolve into a yolo-C-style free-for-all when you combine two internally-unsafe APIs: if they are safe independently, they are automatically safe together as well.
  The set of internally-unsafe APIs you choose to compose is a separate question on top of that. Maybe Rust, or its ecosystem, or its users, are too lax about this, but I'm not really trying to have that argument. Like I mentioned in my initial comment, I find this interesting even if you just apply it within a single trusted runtime.
  
  pizlonator 2 months ago
  
  It’s not about who writes it.
  The important question is: is this a per-program recurring cost, or a per-language-implementation fixed cost.
  Rust’s unsafe is a recurring cost.
  Fil-C’s runtime is a fixed cost.
  
  Rusky 2 months ago
  
  Yeah, you're still responding to something I'm not saying, and not saying anything I'm trying to argue with.
  I wrote "who" as shorthand for "the language implementation vs the individual programs."
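
The "safe API over an unsafe implementation" pattern argued about in this subthread can be sketched as follows. The standard library already provides this operation as `split_at_mut`. It is re-implemented here, under the made-up name `split_halves`, only to show the shape of the pattern.

```rust
/// Returns the two halves of a slice as independent mutable slices.
/// The public signature is safe; the body uses `unsafe` internally.
fn split_halves(data: &mut [u8]) -> (&mut [u8], &mut [u8]) {
    let len = data.len();
    let mid = len / 2;
    let ptr = data.as_mut_ptr();
    // SAFETY: `mid <= len`, so both ranges lie inside `data`, and the two
    // ranges do not overlap, so the returned `&mut` slices cannot alias.
    unsafe {
        (
            std::slice::from_raw_parts_mut(ptr, mid),
            std::slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}
```

Callers never write `unsafe` themselves, but the correctness of the `SAFETY` comment is checked by a human rather than by the compiler, which is the crux of the disagreement above.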
Ygg2 2 months ago

> It's more comprehensively safe than Rust
Yeah. By adding a runtime.
> Fil-C achieves this using a combination of concurrent garbage collection and invisible capabilities (each pointer in memory has a corresponding capability, not visible to the C address space)
https://github.com/pizlonator/llvm-project-deluge/tree/delug...
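To make "each pointer has a corresponding capability" a bit more concrete, here is a toy model in Rust. It is not how Fil-C is implemented. The `CheckedPtr` type and its methods are hypothetical, and the "capability" is reduced to the bounds of one allocation.

```rust
/// Toy model of a pointer that travels with a capability (here just the
/// bounds of its allocation). All names are hypothetical.
struct CheckedPtr<'a> {
    allocation: &'a [u8], // stands in for the capability
    offset: usize,        // the "address" within the allocation
}

impl<'a> CheckedPtr<'a> {
    fn new(allocation: &'a [u8]) -> Self {
        CheckedPtr { allocation, offset: 0 }
    }

    /// In this model pointer arithmetic itself is unchecked, so it may
    /// produce a pointer that lies outside the allocation.
    fn add(&self, delta: usize) -> Self {
        CheckedPtr {
            allocation: self.allocation,
            offset: self.offset.wrapping_add(delta),
        }
    }

    /// A load is checked against the capability at the time of access.
    fn load(&self) -> Result<u8, String> {
        self.allocation
            .get(self.offset)
            .copied()
            .ok_or_else(|| format!("out-of-bounds load at offset {}", self.offset))
    }
}
```

In this sketch `CheckedPtr::new(&memory).add(3).load()` on a three-byte allocation returns an error instead of reading past the end. The check on every access is where the runtime cost comes from.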
- pizlonator 2 months ago
  
  > Yeah. By adding a runtime.
  So? That doesn't make it any less safe or useful.
  In almost all uses of C and C++, the language already has a runtime. In the Gnu universe, it's the combination of libgcc, the loader, the various crt entrypoints, and libc. In the Apple version, it's libcompiler_rt and libSystem.
  Fil-C certainly adds more to the runtime, but it's not like there was no runtime before.
  
  jandrewrogers 2 months ago
  
  It makes it a lot less performant and there is no avoiding or mitigating that downside. C++ is often selected as a language instead of safer options for its unusual performance characteristics even among systems languages in practice.
  Fil-C is not a replacement for C++ generally, that oversells it. It might be a replacement for some C++ software without stringent performance requirements or a rigorously performance-engineered architecture. There is a lot of this software, often legacy.
  
  pizlonator 2 months ago
  
  > It makes it a lot less performant and there is no avoiding or mitigating that downside.
  You can’t possibly know that.
  > C++ is often selected as a language instead of safer options for its unusual performance characteristics even among systems languages in practice.
  Is that why sudo, bash, coreutils, and ssh are written in C?
  Of course not.
  C and C++ are often chosen because they make systems programming possible at all due to their direct access to syscall ABI.
  > Fil-C is not a replacement for C++ generally, that oversells it.
  I have made no such claim.
  Fil-C means you cannot claim - as TFA claims - that it’s impossible to make C and C++ safe. You have to now hedge that claim with additional caveats about performance. And even then you’re on thin ice since the top perf problems in Fil-C are due to immaturity of its implementation (like the fact that linking is hella cheesy and the ABI is even cheesier).
  > It might be a replacement for some C++ software without stringent performance requirements or a rigorously performance-engineered architecture. There is a lot of this software, often legacy.
  It’s the opposite in my experience. For example, xzutils and simdutf have super low overhead in Fil-C. In the case of SIMD code it’s because using SIMD amortizes Fil-C’s overheads.
  
  comex 2 months ago
  
  > C and C++ are often chosen because they make systems programming possible at all due to their direct access to syscall ABI.
  Surely Fil-C cannot provide direct access to syscalls without violating the safety guarantee. There must be something ensuring that what the kernel interprets as a pointer is actually a valid pointer.
  > Fil-C means you cannot claim - as TFA claims - that it’s impossible to make C and C++ safe. You have to now hedge that claim with additional caveats about performance. And even then you’re on thin ice since the top perf problems in Fil-C are due to immaturity of its implementation (like the fact that linking is hella cheesy and the ABI is even cheesier).
  The world of compilers is littered with corpses of projects that spent years claiming faster performance was right around the corner.
  I believe you can make it faster, but how much faster? We'll see.
  I think these types of compatibility layers will be a great option moving forward for legacy software. But I have a hard time seeing the case for using Fil-C for new code: all the known disadvantages of C and C++, now combined with performance closer to Java than Rust (if not worse), and high difficulty interoperating with other native code (normally C and C++'s strength!), in exchange for marginal safety improvements over Rust (minus Rust's more general safety culture).
  edit: I feel bad writing such a dismissive comment, but it's hard to avoid reacting that way when I see unrealistically rosy portrayals of projects.
  
  pizlonator 2 months ago
  
  > Surely Fil-C cannot provide direct access to syscalls without violating the safety guarantee. There must be something ensuring that what the kernel interprets as a pointer is actually a valid pointer.
  This is exactly what Fil-C does.
  > all the known disadvantages of C and C++
  The main disadvantage of C and C++ is unsafety and Fil-C comprehensively fixes that.
  > edit: I feel bad writing such a dismissive comment, but it's hard to avoid reacting that way when I see unrealistically rosy portrayals of projects.
  How is my portrayal unrealistically rosy?
  Even the fact that you know what the current perf costs are is the result of me being brutally honest about its perf.
  I suspect something else is going on.
  
  comex 2 months ago
  
  > This is exactly what Fil-C does.
  Okay, I just checked. It does not. I wrote: "There must be something ensuring that what the kernel interprets as a pointer is actually a valid pointer." And sure enough, your runtime manually wraps each Linux syscall to do exactly that:
  https://github.com/pizlonator/llvm-project-deluge/blob/6804d...
  For harder cases like fcntl, where arguments can be either pointers or integers, you have to enumerate the possible fcntl arguments:
  https://github.com/pizlonator/llvm-project-deluge/blob/6804d...
  ioctl is even harder because some ioctls take pointers to structs that themselves contain pointers; on Linux that includes v4l2 and mmc. It looks like you don't handle that properly, judging by:
  https://github.com/pizlonator/llvm-project-deluge/blob/6804d...
  --
  My point is: having to go through wrapper functions is not what I'd call "direct" access to "ABI". (Also, the wrappers don’t even wrap the syscall ABI directly; they wrap the libc ABI that in turn wraps syscalls.)
  You might object that the wrappers are thin enough that they still count as direct. While that's a matter of definitions, I think my previous comment made it clear what _I_ meant when I questioned "direct", given my followup sentence about "actually a valid pointer".
  But beyond quibbles about who meant what, this lack of directness matters because it implicates the portability of your approach.
  At least as currently implemented, you rely on compiling almost everything (even libc) inside the sandbox, while having a ‘narrow waist’ of syscall wrappers mediating access between the sandbox and the outside world. That should work for most use cases on Linux, where there's already an assumption that different processes can have completely different library stacks, and static linking is common. Even if you want to do a GUI application you should be able to recompile the entire GTK or Qt stack inside the sandbox, and it doesn’t matter if other apps are using different versions of GTK or Qt.
  But what about other operating systems? For server and CLI software you can probably still get away with exposing a small syscall/libc API, similar to Cosmopolitan (though that will still require significant effort for each OS). But for GUIs and platform integration more broadly, you’re expected to use the platform-provided libraries that live in your address space. They are usually proprietary, and even when they’re not, the system isn’t designed for multiple versions of the libraries to coexist.
  I know I’m not telling you anything you don’t already know. But it’s an important point, because aside from performance, the _other_ big reason that Rust relies on user-written unsafe code is for FFI. If anyone can write their own FFI bindings, as opposed to making all FFI bindings live in a centralized runtime, then it becomes more feasible to scale the mammoth task of writing safe wrappers for all those ABIs. Your approach explicitly rejects user-written unsafe code, so I don’t know how you can possibly end up with reasonable OS library coverage.
  Now sure, you didn’t claim anything about GUIs or portability. Perhaps this is more on topic for my previous comment’s point about “high difficulty interoperating with other native code” which you didn’t rebut. But it’s also relevant to “direct access to syscall ABI”, because if there _were_ some way to provide direct access to syscall ABI while remaining memory-safe, then the same approach would probably extend to other system ABIs. For example, a fully CHERI-aware system actually would allow for that. It’s an unfair comparison, because CHERI assumes cooperation from both the hardware and the OS, while you’re trying to run on existing hardware and OSes. I have no idea if we’ll ever see CHERI in general purpose systems. But in exchange, CHERI achieves something that’s otherwise impossible: combining direct system access and memory safety. And I originally read your comment as claiming to do the impossible.
  
  pizlonator 2 months ago
  
  It’s true that the Fil-C approach focuses on having a memory safe userland with no exceptions, which means no FFI to unsafe code (aside from the Fil-C runtime itself, which is just syscall wrappers and a small handful of other things).
  I have considered what FFI to native could look like, and have so far rejected it because it hurts the “no unsafe code” purism.
  You’re right that this limits me to those OSes where the syscalls themselves are an adequate ABI. Linux isn’t the only such OS. It just happens to be the only OS Fil-C supports right now.
  
  dustbunny 2 months ago
  
  > a lot less performant
  Is this just you speculating? How much is "a lot"? Where's the data? Let's get some benchmarks!
  
  pizlonator 2 months ago
  
  I mean bro isn’t totally wrong.
  Fil-C’s perf sucks on some workloads. And it doesn’t suck on others.
  Extreme examples to give you an idea:
  - xzutils had about 1.2x overhead. So slower but totally usable.
  - no noticeable overhead in shells, systems utilities, ssh, curl, etc. But that’s because they’re IO bound.
  - 4x or sometimes maybe even higher overheads for things like JS engines, CPython, Lua, Tcl, etc. Also OpenSSL perf tests are around 4x I think.
  But you’re on thin ice if you say that this is a reason why Fil-C will fail. So much of Fil-C’s overhead is due to known issues that I will fix eventually, like the function call ABI (which is hella cheesy right now because I just wanted to have something that works and haven’t actually made it good yet).
nzeid 2 months ago

I love this shameless self-promotion. ;)
Fil-C is in the cards for my next project.
- pizlonator 2 months ago
  
  Thank you for considering it :-)
  Hit me up if you have questions or issues. I’m easy to find.
  
  90s_dev 2 months ago
  
  One of these days, a project will catch on that's vastly simpler than any memory solution today, yet solves all the same problems, and more robustly too, just like how it took humanity thousands of years to realize how to use levers to build complex machines. The solution is probably sitting right under our noses. I'm not sure it's your project (maybe it is) but I bet this will happen.
  
  pizlonator 2 months ago
  
  That’s a really great attitude! And I think you’re right!
  I think in addition to possibly being the solution to safety for someone, Fil-C is helping to elucidate what memory safe systems programming could look like, and that might lead to someone building something even better.

dang 2 months ago

How safe is Zig? - https://news.ycombinator.com/item?id=31850347 - June 2022 (254 comments)

How Safe Is Zig? - https://news.ycombinator.com/item?id=26537693 - March 2021 (274 comments)

How Safe Is Zig? - https://news.ycombinator.com/item?id=26527848 - March 2021 (1 comment)

How Safe Is Zig? - https://news.ycombinator.com/item?id=26521539 - March 2021 (1 comment)

nanolith 2 months ago

There is a third category of memory and other software safety mechanisms: model checking. While it does involve compiling software to a different target -- typically an SMT solver -- it is not a compile-time mechanism like in Rust.

Kani is a model checker for Rust, and CBMC is a model checker for C. I'm not aware of one (yet!) for Zig, but it would not be difficult to build a port. Both Kani and CBMC compile down to goto-c, which is then converted to formulas in an SMT solver.
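
To make the contrast with testing concrete, here is a minimal sketch of what a Kani proof harness can look like. The function `checked_get` and the harness name are hypothetical, made up for illustration; `#[kani::proof]` and `kani::any()` are part of Kani's API. The harness is gated behind `cfg(kani)`, so an ordinary build simply ignores it.

```rust
/// Returns the element at `index`, or `None` when the index is out of range.
/// This is the kind of property a model checker is asked to prove for all inputs.
pub fn checked_get(data: &[u8; 8], index: usize) -> Option<u8> {
    if index < data.len() {
        Some(data[index])
    } else {
        None
    }
}

// Only compiled when the crate is built by Kani, which sets `cfg(kani)`.
#[cfg(kani)]
mod verification {
    use super::*;

    #[kani::proof]
    fn checked_get_never_panics() {
        // `kani::any()` produces a symbolic value that stands for every possible input.
        let data: [u8; 8] = kani::any();
        let index: usize = kani::any();
        let result = checked_get(&data, index);
        // The solver has to show this for every index, not just for sampled ones.
        assert_eq!(result.is_some(), index < 8);
    }
}
```

A unit test would call `checked_get` with a handful of indices; the harness instead asks the solver to cover all of them at once, which is why this sits in a different category from both testing and the borrow checker.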

dnautics 2 months ago

There isn't a real one yet, but to scratch an itch I tried to build one for Zig. It's not complete nor do I have plans to complete it. https://github.com/ityonemo/clr
If Zig locks down the AIR (intermediate representation at the function level) it would be ideal for running model checking of various sorts. Just by looking at AIR I found it possible to:
- identify stack pointer leakage
- do basic borrow checking
- detect memory leaks
- assign units to variables and track when units are incompatible
DrNosferatu 2 months ago

Any good primers on SMT solvers?
- nanolith 2 months ago
  
  Start with this.
  https://smt.st/SAT_SMT_by_example.pdf
  The algorithms behind SAT / SMT are actually pretty straightforward. One of these days, I'll get around to publishing an article to demystify them.

Dwedit 2 months ago

If you're filling uninitialized pointers with AAAAAAAA, it might be best to also reserve that memory page and mark it as no-access.

I'm not even joking. Any pattern used by magic numbers that fill pointers (such as HeapFree filling memory with FEEEEEEE on Windows) should have a corresponding no-access page, so that the program fails instantly instead of finding a valid memory allocation mapped there. For 32-bit programs, everything past 0x80000000 used to be reserved as kernel memory and raised an access violation when accessed, so the magic numbers were all above 0x80000000. But with large address aware programs you don't get that anymore; only manually reserving the 4K memory pages containing the magic numbers will give you the same effect.
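
As a rough sketch of which pages that means, the snippet below computes the 4K page containing a poison pattern and checks whether a 32-bit address falls in the upper 2 GiB. The function names are hypothetical, and the actual reservation call is left out on purpose because it is OS-specific.

```rust
/// Page size assumed for this sketch (4 KiB, as in the comment above).
const PAGE_SIZE: u64 = 4096;

/// Start address of the page that contains `addr`.
pub fn page_base(addr: u64) -> u64 {
    addr & !(PAGE_SIZE - 1)
}

/// True when a 32-bit address lies at or above 0x80000000, the range that
/// used to be off-limits to 32-bit user code by default.
pub fn in_upper_half_32(addr: u64) -> bool {
    addr >= 0x8000_0000 && addr <= 0xFFFF_FFFF
}
```

With these, `page_base(0xAAAA_AAAA)` is `0xAAAA_A000` and `page_base(0xFEEE_EEEE)` is `0xFEEE_E000`. Those are the pages one would then reserve with no access rights, for example with `VirtualAlloc` and `PAGE_NOACCESS` on Windows or `mmap` with `PROT_NONE` on POSIX systems. Field accesses through a poisoned pointer add an offset, so reserving the following page as well may be worth considering.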

throwawaymaths 2 months ago

that only happens in debug-builds.
https://ziglang.org/documentation/master/#undefined

BrouteMinou 2 months ago

I don't know why we are still having this topic going on. Zig is not safe, period.

Zig gives you the control you need if that is what you want, safety isn't something Zig is chasing.

Safer than C, yeah, but not safe.

Rust = safe, Zig = control

Pick your weapon for the foe in front of you.

saagarjha 2 months ago

I don't think Zig gives you significantly more control than Rust.
- Zambyte 2 months ago
  
  Maybe not Zig the language, but the fact that all allocating functions in the standard library accept an allocator (and community libraries follow this precedent) does give you much more control in practice.
  For example, how would you use a Vec using stack memory for elements, instead of the heap? For the equivalent data structure in Zig (std.ArrayList), it's just a matter of using a stack allocator instead of using a heap allocator, which is an explicit decision either way.
  
  tialaramex 2 months ago
  
  > For example, how would you use a Vec using stack memory for elements, instead of the heap? For the equivalent data structure in Zig (std.ArrayList), it's just a matter of using a stack allocator instead of using a heap allocator, which is an explicit decision either way.
  In Rust it would likewise "just" be a matter of using the allocator we want. In this specific case we can see that's nonsense: there's a single stack pointer in the CPU, so while it's perfectly possible to make two growable arrays (Vec or ArrayList depending on the language) on the heap, if we use the stack instead they're both obliged to somehow share that single stack pointer when growing. Whether in Zig or Rust, this idea can't actually work, but for examples which do work the Rust and the Zig don't look that different.
  
  Zambyte 2 months ago
  
  Welcome to the magic of Zig, where ideas that can't actually work actually do work :-)
      const std = @import("std");

      pub fn main() !void {
          var not_heap_memory: [1024]u8 = undefined;
          var fixed_buffer_allocator = std.heap.FixedBufferAllocator.init(&not_heap_memory);
          const allocator = fixed_buffer_allocator.allocator();

          var my_list = std.ArrayList(i32).init(allocator);
          defer my_list.deinit();

          // notice the try! allocation may fail!
          try my_list.append(1);
          try my_list.append(2);

          for (my_list.items) |d| {
              std.debug.print("{}\n", .{d});
          }
      }
  The value here is not that you have unbounded stack memory - obviously. The value is that you can use the same API for a growable list (std.ArrayList) backed by stack memory, as you would for a growable list backed by heap memory. This could be useful if you have a function that accepts an ArrayList as an argument and will append to it, but you have a known maximum size for the ArrayList in the context that you're using it. You cannot use the standard Vec type in the same way in Rust. You cannot use any data structures from the Rust standard library that do allocations in the same way.
  Yes, of course, you can make data structures that accept custom allocators in the same way in Rust. Maybe there are even community libraries that do it already. The problem is that because they're not in the standard library, you're going to have a hard time using those data structures with any other community libraries. And thus, in practice, you have less control over your allocation strategies in Rust than you do in Zig.
  
  tialaramex 2 months ago
  
  > The problem is that because they're not in the standard library, you're going to have a hard time using those data structures with any other community libraries.
  No, all the Rust standard library collections do in fact have the same feature: you can in fact write Vec::new_in(SomeAllocator). My guess is that you'll say "Ah, but that's not yet in stable Rust", and that'd make sense in other contexts, but it's a weird objection when the entire Zig language still isn't 1.0.
  If you wanted to write a function which takes a container and adds things to it, which I wouldn't recommend, in Rust you'd write that as a polymorphic function, so it'll just get monomorphized for the SomeAllocator variant.
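
  A minimal sketch of that polymorphic style, written for stable Rust. Since the allocator parameter on Vec is still nightly-only, this version is generic over the standard `Extend` trait instead of over an allocator; that swap is an assumption of the sketch, and `push_squares` and `demo` are hypothetical names.

```rust
use std::collections::VecDeque;

/// Appends the squares of 1..=n to any container that can be extended with i32s.
/// The compiler emits a separate, specialized copy for each concrete `C` used.
pub fn push_squares<C: Extend<i32>>(out: &mut C, n: i32) {
    out.extend((1..=n).map(|x| x * x));
}

/// Calls the same generic function with two different container types.
pub fn demo() -> (Vec<i32>, VecDeque<i32>) {
    let mut v = Vec::new();
    let mut d = VecDeque::new();
    push_squares(&mut v, 3);
    push_squares(&mut d, 3);
    (v, d)
}
```

  The caller picks the container, and the function body never names it. As far as I know a nightly `Vec<T, A>` also implements `Extend`, so the same function should accept it without changes.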
  The use of FixedBufferAllocator here is absurd; indeed it's hard to think of non-absurd uses for this allocator. It's a toy because it has no reclamation.
  ArrayList has a slightly weird variation on the 1.5x growth pattern: for a large T the ArrayList<T> will grow something like 1, 2, 4, 7, 11, 17, 26, etc. But with your 1024-byte FixedBufferAllocator the older sizes are just discarded, so somewhere around 62 items it'll blow up: all the rest of the space was thrown away, and so the growth fails.
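
  The arithmetic can be checked with a small simulation. It is a sketch under stated assumptions, not a model of the real Zig code: capacity starts at 1 and grows as cap + cap/2 + 1 (which reproduces the 1, 2, 4, 7, 11, 17, 26 sequence), every growth step takes a fresh block, and old blocks are never reclaimed. An allocator that can extend its most recent block in place, or a different initial capacity for small T, would give different numbers. The function names are hypothetical.

```rust
/// Capacity sequence described above: 1, 2, 4, 7, 11, 17, 26, ...
pub fn next_capacity(cap: usize) -> usize {
    cap + cap / 2 + 1
}

/// Simulates appending to a growable array on top of a bump allocator that
/// never reclaims old blocks. Returns how many items fit before a growth
/// request no longer fits in the buffer.
pub fn items_before_failure(buffer_bytes: usize, item_bytes: usize) -> usize {
    let mut used = 0; // bytes handed out so far, never returned
    let mut cap = 0; // current capacity in items
    let mut len = 0; // items stored so far
    loop {
        if len == cap {
            let new_cap = if cap == 0 { 1 } else { next_capacity(cap) };
            let request = new_cap * item_bytes;
            if used + request > buffer_bytes {
                return len; // the allocation needed for the next item fails
            }
            used += request; // assumption: the old block stays allocated
            cap = new_cap;
        }
        len += 1;
    }
}
```

  For a 1024-byte buffer and 4-byte items, `items_before_failure(1024, 4)` returns 61: the blocks for capacities 1 through 61 consume 676 bytes, and the next request (92 items, 368 bytes) exceeds the 348 bytes left, so the 62nd append is the one that fails.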
  Overall this not only doesn't do what you said you were doing originally (it's not actually a stack† allocated growable array, those simply don't exist), it also doesn't do the thing you ostensibly claim it's useful for either. So much of Zig feels like this.
  † Edited: For a few minutes this said "heap" due to a thinko