Memory Safety for Skeptics (acm.org)
87 points by steveklabnik 1 day ago | 179 comments




> If you're tired of hearing about memory safety, this article is for you.

Tell me more about memory safety, any time; just hold the Rust.

Rust skeptics are not memory safety skeptics. Hopefully, there are no memory safety skeptics, other than rhetorical strawmen.


I've spoken with quite a few C++ developers who swear that they don't need memory safety tooling because they're good enough to achieve it on their own. More than a few also expect that the worst that can happen from breaking memory safety is a SEGFAULT, which also suggests that they don't understand memory safety.

So, I'd say that there is still some outreach to do on the topic.

On the other hand, you're absolutely right that Rust is only one of the many ways to get there.


OK, so those are skeptics about tooling, not about memory safety per se.

And not even about tooling per se, since achieving safety on their own doesn't literally mean on their own; they are relying on tooling.

True Scotsman's "on your own" means working in assembly language, in which you have to carefully manage even just calling a function and returning from it: not leaving stray arguments on the stack, saving and restoring all callee-saved registers that are used in the function, and so on.

Someone who thinks that their job is not to have segfaults is pretty green, obviously.


I'm not entirely certain what you mean.

These C++ developers I'm mentioning are all at least senior (some of them staff+), which makes their remarks on segfaults scary, because clearly, they haven't realized that a segfault is the best case scenario, not the worst one. This means that they very much need a refresher course on memory safety and why it matters.

The fact that they assume that they're good enough to avoid memory errors without tooling, despite the fact that most of these errors are invisible and may remain invisible for years before being discovered, is a second red flag, because it strongly suggests that they misunderstand the difficulty.

Of course, the conversation between you and me is complicated by the fact that the same words "memory safety" could apply to the programming language/tooling or to the compiled binary.


The whole discussion sucks, because Rust is not memory safe, and can easily be worse than C and C++ regarding memory safety. Memory unsafety is entirely possible in Rust[0].

[0]: https://materialize.com/blog/rust-concurrency-bug-unbounded-...


Not sure where that comes from. We're not even discussing Rust in this thread.

> True Scotsman's "on your own" means working in assembly language, in which you have to carefully manage even just calling a function and returning from it: not leaving stray arguments on the stack, saving and restoring all callee-saved registers that are used in the function, and so on.

I've actually been trying to figure out if it's practical to write assembly but then use a proof assistant to guarantee that it maintains safety. Weirdly, it feels easier than C insofar as you don't have (C's) undefined behavior to contend with. But formal methods are a whole thing and I'm very very new to them, so it remains to be seen whether this is actually a good idea, or awful and stupid :) (Edit: ...or merely impractical, for that matter)


It would be impractical. It would be monumental to even specify what 'safe' means in that degree of freedom, let alone allow/reject arbitrary assembly code according to the spec.

E.g. trying to reject string operations which write beyond the trailing \0. At assembly level, \0 is only one of many possible conventions for bounding a string. E.g. maybe free() is allowed to write past the \0. So you'd need to decide whether it's safe depending on context.


Well yeah, I assumed if I was doing formal verification that that would necessarily involve fully specifying/tracking data sizes, which would mean that all strings would be Pascal strings (at least internally; obviously it might be necessary to tack on a \0 for compatibility with external code).

You should take a look at TAL (the Typed Assembly Language) and DTAL (the Dependently Typed Assembly Language). The latter was designed specifically to support co-writing assembly and formal proofs.

You could do that and some very expensive aerospace tooling exists in that direction (for verifying compiler outputs), but an easier way is to use sanitizers on a best effort basis. You can either use something like AFL++ with QAsan or use retrowrite to insert whatever you want into the binary as if it was a normal compiled program.

That doesn’t sound so different from using a programming language that guarantees safety. The compiler proves that its translations are safe.

Sounds like a straw-man. I know developers who are good enough to achieve it on their own, but they use the tooling anyway, because one can’t write perfect code always: feature requests might be coming in too fast, team members have different skill levels, dev turnover happens, etc.

Furthermore, memory bugs still can be considered by teams as just another bug, so they might not get prioritised.

The only significant difference is that there’s lots of criminal energy targeting them, otherwise nobody would care much.


I wish memory safety skepticism was nothing more than a rhetorical strawman. It's not hard to find prominent people who think differently though. Take Herb Sutter for example, who argues that "memory safety" as defined in this article is an extreme goal and we should instead focus on a more achievable 95% safety instead to spend the remaining effort on other types of safety.

I can also point to more extreme skeptics like Dan O'Dowd, who argue that memory safety is just about getting gud and you don't actually need language affordances.

Discussions about this topic would be a lot less heated if everyone was on the same page to start. They're not. It's taken advocates years of effort to get to the point where we can start talking about memory safety without immediate negative reactions and that process is still ongoing.


> Take Herb Sutter for example, who argues that "memory safety" as defined in this article is an extreme goal and we should instead focus on a more achievable 95% safety instead to spend the remaining effort on other types of safety.

One thing I've noticed when people make these arguments is that they tend to ignore the fact that most (all?) of these other safeties they're talking about depend on being able to reason about the behaviour of the program. But when you violate memory safety a common outcome is undefined behaviour, which has unpredictable effects on program behaviour.

These other safeties have a hard dependency on memory safety. If you don't have memory safety, you cannot guarantee these other safeties because you can no longer reason about the behaviour of the program.


Herb Sutter's article on this.[1]

For C/C++, memory safety is a retrofit to a language never designed for it. Many people, including me, have tried to improve the safety of C/C++ without breaking existing code. It's a painful thing to attempt. It doesn't seem to be possible to do it perfectly. Sutter is taking yet another crack at that problem, hoping to save C/C++ from becoming obsolete, or at least disfavored. Read his own words to see where he's coming from and where he is trying to go.

Any new language should be memory safe. Most of them since Java have been.

The trouble with thinking about this in terms of "95% safe" is that attackers are not random. They can aim at the 5%.

[1] https://herbsutter.com/2024/03/11/safety-in-context/


> Most of them since Java have been

The most popular ones have not necessarily been. Notably Go, Zig, and Swift are not fully memory safe (I’ve heard this may have changed recently for Swift).


Could you expand on how Go isn’t memory safe?

Go's memory safety blows up under concurrency. Non-trivial data races are Undefined Behaviour in Go, violating all safety considerations including memory safety.

Go maps don't have enough locking to be thread safe, as I understand it. That was true at one time. Was it fixed?

I would not expect it to make sense to provide this as the default for Go's hash table type. My understanding is that modern Go has at least a best-effort "fail immediately" detector for this particular case, so when you've screwed this up your code will exit in production, reporting the bug. I guess you can curse "stupid" Go for not allowing you to write nonsense if you like, or you could use the right tool for the job.

Similar to how Rust clearly is not memory safe, Go is also not memory safe.

Not sure what this is saying but you can create a trivial concurrent program that violates the memory safety guarantees Go is supposed to provide [1]: https://biggo.com/news/202507251923_Go_Memory_Safety_Debate

That is not true of Rust.


> [You can] create a trivial concurrent [Go] program that violates the memory safety guarantees ...

> That is not true of Rust.

It's not supposed to be, but Rust does have plenty of outstanding soundness bugs: https://github.com/rust-lang/rust/issues?q=state%3Aopen%20la...

Rust as intended is safe and sound unless you write u-n-s-a-f-e, but as implemented has a ways to go.


It's hard to imagine that, if a memory problem were reported to Sutter about one of his own programs, he would not prioritize fixing it over most other work.

However, I imagine he would probably take into consideration the context. Who and what is the program for? And does the issue only reproduce if the program is misused? Does the program handle untrusted inputs? Or are there conceivable situations in which a user of the program could be duped by a bad actor into feeding the program a malicious input?

Imagine Sutter wrote a C compiler, and someone found a way to crash it. But the only way to reproduce that crash is via code that invokes undefined behavior. Why would Herb prioritize fixing that over other work?

Suppose the user insists that he's running the compiler as a CGI script, allowing unauthenticated visitors to their site to compile programs, making it a security issue.

How should Herb reasonably reply to that?


It's worth differentiating the case of a specific program from the more general case of memory safety as a language feature. A specific program might take additional measures appropriate to the problem domain like static analysis or using a restricted subset of the language. Memory safety at the language level has to work for most or all code written using that language.

Herb is usually talking about the latter because of the nature of his role, like he does here [0]. I'm willing to give him the benefit of the doubt on his opinions about specific programs, because I disagree with his language opinions.

[0] https://herbsutter.com/2024/03/11/safety-in-context/


> Imagine Sutter wrote a C compiler, and someone found a way to crash it. But the only way to reproduce that crash is via code that invokes undefined behavior. Why would Herb prioritize fixing that over other work?

Because submitting code that invokes undefined behavior to one's compiler is a very normal thing that most working C developers do dozens of times per day, and something that a decent C compiler should behave reasonably in response to. (One could argue that crashing is acceptable, but erasing the developer's hard drive is not - but by definition that means undefined behaviour in this situation is not acceptable.)

> Suppose the user insists that he's running the compiler as a CGI script, allowing unauthenticated visitors to their site to compile programs, making it a security issue. How should Herb reasonably reply to that?

By fixing the bug?


Yeah, what kind of Crazy Person would make a web site where unauthenticated visitors can write programs and it just compiles them?

What would you even call such a thing? "Compiler Explorer"?

I guess maybe if Herb had helped the guy who owned that web site, say, Matt Godbolt, to enable his "Syntax 2 for C++" compiler cppfront on that site, it would feel like Herb ought to take some responsibility, right?

Or maybe I am being unreasonable?


The problem in this conversation is that you are equivocating between "fixing memory safety bugs" and "preventing memory safety bugs statically." When this blog post refers to "memory safety skeptics," it refers to people who think the second is not a good way to expend engineering resources, not your imagined flagrantly irresponsible engineer who is satisfied to deliver a known nonfunctional product.

> Take Herb Sutter for example, who argues that "memory safety" as defined in this article is an extreme goal and we should instead focus on a more achievable 95% safety

I wonder how you figure out when your codebase has reached 95% safety? Or is it OK to stop looking for memory unsafety when you hit, say, 92% safe?


Anything above 90% safety is acceptable because attackers look at that and say “look they’ve tried hard. We shouldn’t attack them, it’ll only discourage further efforts from them.” When it comes to software security, it’s the thought that counts.

> Take Herb Sutter for example, who argues that "memory safety" as defined in this article is an extreme goal and we should instead focus on a more achievable 95% safety instead to spend the remaining effort on other types of safety.

I don't really see how that's a) a scepticism of memory safety or b) how it's not seen as a reasonable position. Just because someone doesn't think X is the most important thing ever doesn't mean they are skeptical of it, but rather that the person holding the 100% viewpoint is probably the one with the extreme position.


Look at the definition quoted in the article:

    [A] program execution is memory safe so long as a particular list of bad things, called memory-access errors, never occur
"95% memory safety" is not a meaningful concept under this definition! That's very much skepticism of memory safety as defined in this article, to highlight the key phrase in the comment you're quoting.

It's also not a meaningful concept within the C++ language standard written by the committee Herb Sutter chairs. Memory unsafety is undefined behavior (UB). C++ code containing UB has no defined semantics and is inherently incorrect, whether that's 1 violation or 1000.

Now, we can certainly discuss the practical ramifications of 95% vs 100%, but even here Herb's arguments have fallen notoriously flat. I'll link Sean Baxter's piece on why Herb's actual proposals fail to achieve even these more modest goals as an entry point [0]. No need to rehash the volumes of digital ink already spilled on this subject in this particular comment thread.

[0] https://www.circle-lang.org/draft-profiles.html


Skepticism of an absolutist binary take on memory safety is not the same as skepticism of memory safety in general and it's important to distinguish the two.

It's like saying that people skeptical of formal verification are actually skeptical of eliminating bugs. Most people are not skeptical of eliminating bugs, but they might be skeptical of extreme approaches to do so.


As I explained in a sibling comment, memory safety violations aren't comparable to logic bugs. Avoiding them isn't an absolutist take, it's just a basic requirement in common programming languages like C and C++. That's not debatable, it's written right into the language standards, core guidelines, and increasingly government standards too.

If you think that's impossibly difficult, you're starting to understand the basic problem. We already know from other languages that memory safety is possible. I've already linked one proposal to retrofit similar safety onto C++. The author of Fil-C is elsewhere in these comments arguing for another way.


Everything you say about memory safety issues applies to logic bugs too. And likewise in reverse - you can have a memory safety issue that doesn't result in a vulnerability or crash. So I don't buy it that memory safety is so different from other types of bugs that it should be considered a binary issue and not on a spectrum like everything else!

> Everything you say about memory safety issues applies to logic bugs too.

It doesn't, because logic bugs generally have, or can be made to have, limited scope.

> And likewise in reverse - you can have a memory safety issue that doesn't result in a vulnerability or crash.

No you can't, not in standard C. Any case of memory unsafety is undefined behaviour, therefore a conforming implementation may implement it as a vulnerability and/or crash. (You can have a memory safety issue that happens to not result in a vulnerability or crash in the current version of gcc/clang, but that's a lot less reassuring)


This whole memory-bugs-are-magical thinking just comes from the Rust community and is not an axiomatic truth.

It’s also trivial to discount, since the classical evaluation of bugs is based on actual impact, not some nebulous notions of scope or what-may-happen.

In practice, the program will crash most of the time. Maybe it will corrupt or erase some files. Maybe it will crash the Windows kernel and cause 10 billion in damages; just like a Rust panic would, by the way.


We simply don't treat "gcc segfaults on my example.c file" the same way as "libssl has an exploitable buffer overflow". That's a synopsis of the nuance.

Materials to be consumed by engineers are often unsafe when misused. Not just programs like toolchains with undefined behaviors, but in general. Steel beams buckle if overloaded. Transistors overheat and explode outside of their SOA (safe operating area).

When engineers make something for the public, their job is to combine the unsafe bits, but make something which is safe, even against casual misuse.

When engineers make something for other engineers, that is less so; engineers are expected to read the data sheet.


> engineers are expected to read the data sheet

Even if you know what the data sheet says, it's easier said than done, especially when the tool gives you basically no help. You are just praying people will magically git gud.


I prefer to treat testing like insurance. You purchase enough insurance to get the coverage you need, and not a penny more. Anything beyond that could be invested better.

Same thing with tests: get the coverage you need to build confidence in your codebase, but don't tie yourself in knots trying to get that last 10%. It's not worth it. Create some manual and integration tests and move on.

I feel like type safety, memory safety, thread safety, etc. are all similar. Building a physics core to simulate the stability of your nuclear stockpile? The typing should be second to none. Building yet another CSV exporter? Who gives a damn.

Context is so damn important.


This is a perfectly reasonable argument if memory safety issues are essentially similar to logic bugs, but memory unsafety isn't like a logic bug.

A logic bug in a library doesn't break unrelated code. It's meaningful to talk about the continued execution of a program in the presence of logic bugs. Logic bugs don't time travel. There are ways to exhaustively prove the absence of logic bugs, e.g. MC/DC or state space exploration, even if they're expensive.

None of these properties are necessarily true of memory safety. A single memory safety violation in a library can smash your stack, or allow your code to be exploited. You can't exhaustively defend against this with error handling either. In C and C++, it's not meaningful to even talk about continued execution in the presence of memory safety violations. In C++, memory safety violations can time travel. You typically can't prove the absence of memory safety violations, except in languages designed to allow that.

With appropriate caveats noted (Fil-C, etc), we don't have good ways to retrofit memory safety onto languages and programs built without it or good ways to exhaustively diagnose violations. All we can do is structurally eliminate the possibility of memory unsafety in any code that might ever be used in a context where it's an important property. That's most code.


All of that stuff doesn’t matter though. If you look close enough everything is different to everything, but in real life we only take significant differences into consideration otherwise we’d go nuts.

Memory bugs have a high risk of exploitability. That’s it; the threat model will tell the team what they need to focus on.

Nothing in software or engineering is absolute. Some projects have decided they need compile-time guarantees about memory safety, others are experimenting with it, many still use C or C++ and the Earth keeps spinning.


If your attacker controls the data you're exporting to a CSV file, they can take advantage of a memory safety issue in your CSV exporter to execute arbitrary code on your machine.

https://georgemauer.net/2017/10/07/csv-injection.html


> Building yet another CSV exporter? Who gives a damn.

The problem with memory unsafe code is that it can have unexpected and unpredictable side-effects, such as subtly altering the critical data you're exporting, or letting an attacker take control of your CSV exporter.

In other words, you need quite a lot of context to figure out that a memory bug in your CSV exporter won't be used for escalation. Figuring out that context, documenting it, and making sure that the context never changes for the lifetime of your code? That sounds like a much more complex proposition than using memory-safe tools in the first place.


I’m curious, what memory safe alternative is there for a C/C++ codebase that doesn’t give up performance?

Also for what it’s worth Rust ports tend to perform faster according to Russinovich. Part of that may be second system syndrome although the more likely explanation is that the default std library is just better optimized (eg hash tables in Rust are significantly better than unordered_map)


Ada has been around for years. Its approach to memory safety isn't as strong as Rust's, but it is a lot stronger than C's or C++'s. C++ is also adding a lot of memory safety; it is a lot easier to bypass than it is in Rust (though I've seen Rust code where everything is marked unsafe), but you still get some memory safety if you try.

All benchmarks between Ada, C, C++, and Rust (and others) should come down to a wash. A skilled programmer can find a difference but it won't be significant. A skilled C++ programmer wouldn't be using unordered_map so it is unfair to point out you can use something bad.


It has, but you need SPARK too to avoid the runtime overhead. And I haven’t seen adoption of Ada in the broader industry, so I wouldn’t pick it based on that. I would need to understand why it remains limited to industries that mandate government certification.

> A skilled C++ programmer wouldn't be using unordered_map so it is unfair to point out you can use something bad.

Pretending defaults don’t matter is naive, especially in a language where adding third-party dependencies is so painful (and even without that, defaults matter).


> A skilled C++ programmer wouldn't be using unordered_map so it is unfair to point out you can use something bad.

C++ isn't my primary language. Pray tell - what's wrong with unordered_map, and what's the alternative?


std::unordered_map basically specifies a bucket-based hashtable implementation (read: lots of extra pointer chasing). Most high-performance hashtables are based on probing.

Bluntly: exactly why does Ada matter, at all? The code footprint of software (1) written in Ada and (2) of concern when we talk about memory safety has measure zero. Is Ada better than C++? It turns out, I don't have to care: to go from C++ to Ada, one needs to rewrite, and if one is going to rewrite for memory safety, they're not going to rewrite to Ada.

If I'm going to rewrite I'm going to look at whether formal proofs offer me anything, something Ada can give. Ada is tiny, I'll grant, but it has always been there.

A lot of Rust versus C or C++ comparisons be like: "Yo, check this Rust rewrite of Foo, which runs 2.5× faster¹ than the C original²".

---

1. Using 8 cores.

2. Single-threaded


1. Amdahl's law

2. That's a language feature too. Writing non-trivial multi-core programs in C or C++ takes a lot of effort and diligence. It's risky, and subtle mistakes can make programs chronically unstable, so we've had decades of programmers finding excuses for why a single thread is just fine, and people can find other uses for the remaining cores. OTOH Rust has enough safety guarantees and high-level abstractions that people can slap .par_iter() on their weekend project, and it will work.
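To make that concrete, here is a minimal sketch of the "slap .par_iter() on it" idea, assuming the third-party rayon crate (which provides par_iter); the project setup is hypothetical:

  // Assumes a Cargo.toml dependency on rayon (e.g. rayon = "1").
  use rayon::prelude::*;

  fn main() {
    let inputs: Vec<u64> = (0..1_000_000).collect();

    // Swapping iter() for par_iter() spreads the map/sum over a thread
    // pool, and the borrow checker ensures the closure can't data-race.
    let total: u64 = inputs.par_iter().map(|x| x * x).sum();

    println!("{total}");
  }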


Given most machines have cores to spare, and people want answers faster, is that a bad thing?

I think the complaint is that the C version isn’t multithreaded, ignoring that Rust makes it much easier to have a correct multithreaded implementation. OP is also conveniently ignoring that the Rust ports I reference Russinovich talking about are MS-internal code bases where it’s a 1:1 port, not a rearchitecture or an attempt to improve performance. The defaults being better, aliasing guarantees that the compiler takes advantage of, and automatic struct layout optimization largely explain why it ends up being 5-20% faster having done nothing other than rewrite it.

But critics seem to often never engage with the actual data and just blindly get knee jerk defensive.


> Part of that may be second system syndrome

It may be that they've implemented it differently in a way that is more performant but has fewer features. A "rust port" is not automatically or apparently a 1:1 comparison.


It could be, but it's often just that the things you got in the box were higher quality and so your results are higher quality by default.

Better types like VecDeque<T>, better implementations of common ideas like sorting, even better fundamental concepts like providing the destructive move, or the owning Mutex by default.

Even the unassuming growable array type, Rust's Vec<T>, is just plain better than C++ std::vector<T>. It's not a huge difference and for many applications it won't matter, but that's the sort of pervasive quality difference I'm talking about and so I can well believe that in practice this ends up showing through.
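As a concrete illustration of the owning-Mutex point (my sketch, not a claim about any particular codebase): Rust's std::sync::Mutex owns the data it protects, so the only way to reach the data is through the lock, whereas C++'s std::mutex guards nothing in particular by construction.

  use std::sync::Mutex;

  fn main() {
    // The Mutex owns the counter; lock() is the only way to touch it.
    let counter = Mutex::new(0u32);

    {
      let mut guard = counter.lock().unwrap();
      *guard += 1;
    } // guard dropped here, releasing the lock

    println!("{}", counter.lock().unwrap());
  }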


There are also compiler optimizations that aren’t available in C++ - noalias automatically applied everywhere, plus the compiler automatically laying out structs optimally, which is probably a non-trivial perf add as well.

> Tell me more about memory safety

What is it you want to hear about memory safety? If you’re willing to accept the tradeoffs of an automatic garbage collector, memory safety has been a solved problem for decades, and there’s not a whole lot to tell you about it other than learning about widespread, mature technology.

But if you have some good reason to avoid that technology, then your options are far more limited, and there are good reasons that Rust dominates that conversation.

So the question stands - what is it you want to hear more about?


> Hopefully, there are no memory safety skeptics, other than rhetorical strawmen.

There are plenty of such skeptics. It's why Google, Microsoft, etc all needed to publish things like "70% of our vulnerabilities are memory-safety linked".

Even today, the increasing popularity of Zig indicates that memory-safety is not taken as a baseline.


Good point. There are even two posts about Zig on the front page alongside this post.

> Rust skeptics are not memory safety skeptics

Definitely not all of them, yes.

> Hopefully, there are no memory safety skeptics, other than rhetorical strawmen.

You'll find the reality disappointing then…


There was an article about Zig on the front page just a few hours ago that attracted many "Why do I need memory safety?" comments. The fact that new languages like Zig aren't taking memory safety as a foundational design goal should be evidence enough that many people are still skeptical about its value.

Zig's approach to memory safety is an open question. I don't like it (obviously, a very subjective statement), but as more software is written in it, we'll get empirical data about whether Zig's bet pays off. It very well might.

The post in question had early empirical data comparing bug reports from Node, Bun and Deno. It wasn't the main focus of the article, and I would love to see a deeper dive, but it showed that Bun had almost 8x the amount of "crash" or "segfault" bug reports on Github as Deno, despite having almost the same amount of total issues created. (4% of bug reports for Deno are crashes, 26% of bug reports for Bun are crashes).

This matches my experience with the runtimes in question—I tried Bun out for a very small project and ran into 3 distinct crashes, often in very simple parts of the code like command line processing. Obviously not every crash / null-pointer dereference is a memory safety issue, but I think most people would agree that Zig does not seem to make bug-free software easier to write.


Go programs "segmentation fault" all the time. They're still memory-safe.

I don't think so...

You're wrong.

Rust certainly was not the first "systems programming" language that was memory safe; Ada was aiming for the same title and I think achieved it way before Rust.

While Ada is a great and sadly underused language, if my memory serves, it's not out-of-the-box memory-safe by today's definitions. I seem to recall that it takes Spark to make it memory-safe.

SPARK is practically just a restricted version of Ada with a few added features for formal verification. You can write a program primarily in SPARK but disable said restrictions based on circumstance by setting the `SPARK_Mode` pragma/aspect to `off` on a package, procedure, function, etc. Mixing Ada and SPARK is trivial.

I guess it is similar to Rust code that uses `unsafe {}` as the other poster mentioned (maybe `unsafe fn` for a closer analogy). My knowledge of Ada/SPARK is much greater than what I know about Rust, so I might be guessing wrong.


So in Rust an unsafe block and an unsafe function mean two different things. An unsafe block allows you to do things that are unsafe, such as dereference raw pointers, access union fields, calling unsafe functions, etc.

Unsafe functions mark that the caller is responsible for upholding the invariants necessary to avoid UB. In the 2021 and earlier editions, they also implicitly created an unsafe block in the body, but don't in 2024.

Or, in a more pithy tone: an unsafe block is the "hold my beer" block, while an unsafe function is a "put down your beer" function.
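A minimal sketch of that distinction (illustrative only, names are mine):

  /// "Put down your beer": callers must guarantee `p` is valid and
  /// properly aligned, or calling this is undefined behaviour.
  unsafe fn read_raw(p: *const i32) -> i32 {
    // In the 2024 edition the body is no longer implicitly unsafe,
    // so the dereference still needs its own unsafe block.
    unsafe { *p }
  }

  fn main() {
    let x = 42;
    // "Hold my beer": the unsafe block is where we take responsibility.
    let y = unsafe { read_raw(&x as *const i32) };
    println!("{y}");
  }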


> by today's definitions.

What are today's definitions? If Ada simply had more thorough rules but introduced an "unsafe {}" construct, then what would the practical difference actually be? Compiler defaults?


Also Go (and arguably Java). Managed languages work and are regularly used for systems programming.

That depends on your definition of systems programming. Go and Java are not fit for low-level OS or embedded development. They are too memory hungry. A lot of software runs on hardware other than servers. IoT, auto, TVs, phones, watches, etc. all have much tighter requirements on acceptable software overhead from the languages employed. Also, the more memory and CPU you use in the systems layers, the less is available for the higher-level layers, which are often written in higher-level languages and want the majority of the device's resources.

There are entire kernels and userlands written in Go that are used in production today, in rather constrained environments no less. Technically SIM cards run a kind of Java derivative too. I dislike both languages but that doesn’t mean they can’t be used to build foundational systems.

I'm aware of Java being used for embedded, but that wasn't really proper Java and it requires specialized hardware. You can't just run Java on an ARM microcontroller and expect it to work out. I'm curious to learn more about production-quality Go kernels; I've never heard about this before. Personally, if a language doesn't give you the flexibility required to run it on microcontroller-grade hardware, I find it difficult to ascribe it systems programming language status. Anything with a large runtime makes that difficult without some sort of hardware assistance to offload the runtime (e.g. Java Card).

Sure, those languages are not suited for all systems programming tasks. Just like application programming languages are typically not suited for every kind of application, or database management systems are not suitable for any kind of database. But there isn't a big disagreement about terminology in the field I think. And of course Go was launched specifically as a systems programming language.

What is the justification for calling `null pointer dereference` a memory safety issue?

1) Null pointer derefs can sometimes lead to privilege escalation (look up "mapping the zero page", for instance). 2) As I understand it (could be off base), if you're already doing static checking for other memory bugs, eliminating null derefs comes "cheap". In other words, it follows pretty naturally from the systems that provide other memory safety guarantees (such as the famous "borrow checker" employed by Rust).

Both Linux and Windows forbid mapping the zero page.

Thank you.

It's worse than a memory safety issue, it's undefined behaviour (at least in C, C++, and Rust)

UB is in fact not worse than a memory safety issue, and the original question is a good one: NULL pointer dereferences are almost never exploitable, and preventing exploitation is the goal of "memory safety" as conceived of by this post and the articles it references.

> UB is in fact not worse than a memory safety issue

The worst case of UB is worse than the worst case of most kinds of non-UB memory safety issues.

> NULL pointer dereferences are almost never exploitable

Disagree; we've seen enough cases where they become exploitable (usually due to the impact of optimisations) that we can't say "almost never". They may not be the lowest hanging fruit, but they're still too dangerous to be acceptable.


What is the worst case of UB that you're thinking of that is worse than the worst memory safety issue?

Essentially Descartes' evil demon, since there are no limits at all on what UB can do.

Can I ask you to be specific here? The worst memory corruption vulnerabilities enable trivial remote code execution and full and surreptitious reliable takeovers of victim machines. What's a non-memory-corruption UB that has a worse impact? Thanks!

I know we've talked about this before! So I figure you have an answer here.


> Can I ask you to be specific here? The worst memory corruption vulnerabilities enable trivial remote code execution and full and surreptitious reliable takeovers of victim machines. What's a non-memory-corruption UB that has a worse impact?

I guess just the same kind of vulnerability, plus the fact that there are no possible countermeasures even in theory. I'm not sure I have a full picture of what kind of non-UB memory-corruption cases lead to trivial remote code execution, but I imagine them as being things like overwriting a single segment of memory. It's at least conceivable that someone could, with copious machine assistance, write a program that was safe against any single segment overwrite at any point during its execution. Even if you don't go that far, you can reason about what kinds of corruption can occur and do things to reduce their likelihood or impact. Whereas UB offers no guarantees like that, so there's no way to even begin to mitigate its impact (and this does matter in practice - we've seen people write things like defensive null checks that were intended to protect their programs against "impossible" conditions, but were optimised out because the check could only ever fail on a codepath that had been reached via undefined behaviour).


I'm sorry, I'm worried I've cost us some time by being unclear. It would be easy for me to cite some worst-case memory corruption vulnerabilities with real world consequences. Can you do that with your worst-case UB? I'm looking for, like, a CVE.

> It would be easy for me to cite some worst-case memory corruption vulnerabilities with real world consequences.

Could you do that for a couple of non-UB ones then? That'll make things a lot more concrete. As far as I can remember most big-name memory safety vulnerabilities (e.g. the zlib double free or, IDK, any random buffer overflow like CVE-2020-17541) have been UB.


Wasn't CVE-2020-17541 a bog-standard stack overflow? Your task is to find a UB vulnerability that is not a standard memory corruption vulnerability, or one caused by (for instance) an optimizer pass that introduces one into code that wouldn't otherwise have a vulnerability.


Cases that are both memory corruption and UB tell us nothing about one being worse than the other. My initial claim in this thread was "the worst case of UB is worse than the worst case of most kinds of non-UB memory safety issues" and I stand by that; if your position is that memory corruption is worse then I'd ask you to give examples of non-UB memory corruption having worse outcomes.


UB can lead to memory safety issues[0], among other terrible outcomes. Hence it’s worse than memory safety issues.

0: https://lwn.net/Articles/342330/


No, that doesn't hold logically.

I believe the point is if something is UB, like NULL pointer dereference, then the compiler can assume it can't happen and eliminate some other code paths based on that. And that, in turn, could be exploitable.

Yes, that part was clear. The certainty of a vulnerability is worse than the possibility of a vulnerability, and most UB does not in fact produce vulnerabilities.

Most UB results in miscompilation of intended code by definition. Whether or not they produce vulnerabilities is really hard to say given the difficulty in finding them and that you’d have to read the machine code carefully to spot the issue and in c/c++ that’s basically anywhere in the codebase.

You stated explicitly it isn’t, but the compiler optimizing away null pointer checks or otherwise exploiting accidental UB literally is a thing that’s come up several times in known security vulnerabilities. Its probability of incidence is less than just crashing, in your experience, but that doesn’t necessarily mean it’s not exploitable either - it could just mean it takes a more targeted attack to exploit, and thus your Bayesian prior for exploitability is incorrectly trained.


> by definition

But not in reality. For example a signed overflow is most likely (but not always) compiled in a way that wraps, which is expected. A null pointer dereference is most likely (but not always) compiled in a way that segfaults, which is expected. A slightly less usual thing is that a loop is turned into an infinite one or an overflow check is elided. An extremely unusual thing and unexpected is that signed overflow directly causes your x64 program to crash. A thing that never happens is that your demons fly out of your nose.

You can say "that's not expected because by definition you can't expect anything from undefined behaviour" but then you're merely playing a semantic game. You're also wrong, because I do expect that. You're also wrong, because undefined behaviour is still defined to not shoot demons out of your nose - that is a common misconception.

Undefined behaviour means the language specification makes no promises, but there are still other layers involved, which can make relevant promises. For example, my computer manufacturer promised not to put demon-nose hardware in my computer, therefore the compiler simply can't do that. And the x64 architecture does not trap on overflow, and while a compiler could add overflow traps, compiler writers are lazy like the rest of us and usually don't. And Linux forbids mapping the zero page.


Doesn't null-pointer-dereference always crash the application?

Is it only undefined behavior because program-must-crash is not explicitly required by these languages' specs?


> Doesn't null-pointer-dereference always crash the application?

No. It's undefined behaviour, it may do anything or nothing.

> Is it only an undefined-behavior because program-must-crash is not the explicitly required by these languages' specs?

I don't understand the question here. It's undefined behaviour because the spec says it's undefined behaviour, which is some combination of because treating it as impossible allows many optimisation opportunities and because of historical accidents.


> No. It's undefined behaviour, it may do anything or nothing.

This is clearly nonsense.


> This is clearly nonsense.

It is indeed. Unfortunately it's also the C language standard.


It is not nonsense: see https://lwn.net/Articles/575563/

Compilers are allowed to assume undefined behavior doesn't happen, and dereferencing an invalid pointer is undefined behavior. You don't have to like it, but that's how it is.


No, it does not always crash. This is a common misconception caused by thinking about the problem on the MMU (hardware) level, where reading a null pointer predictably results in a page fault. If this was the only thing we had to contend with, then yes, it would immediately terminate the process, cutting down the risk of a null pointer dereference to just a crash.

The problem is instead in software - it is undefined behavior, so most compilers may optimize it out and write code that assumes it never happens, which often causes nightmarish silent corruption / control flow issues rather than immediately crashing. These optimizations are common enough for it to be a relatively common failure mode.

There is a bit of nuance that on non-MMU hardware such as microcontrollers and embedded devices, reading null pointers does not actually trigger an error at the hardware level, but instead gives you access to position 0 in memory. This is usually either a feature (because it's a nice place to put global data) or a gigantic pitfall of its own (because it's the most likely place for accidental corruption to cause a serious problem, and reading it inadvertently may reveal sensitive global state).


> No, it does not always crash.

Can you give me an example that I can reproduce?


This crashes, but after doing something unexpected (printing "Wow" 4 times): https://godbolt.org/z/GPc7bEMn5

Only if that memory page is unmapped, and only if the optimizer doesn't detect that it's a null pointer and start deleting verification code because derefing null is UB, and UB is assumed to never happen.

How common is this in practice?

Compilers regularly delete null pointer checks when they can see that the pointer is dereferenced.

(GCC controls this with `-fno-delete-null-pointer-checks` https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#ind... )

This is a good article.

Small nit: As someone curious about a definition of memory safety, I had come across Michael Hicks' post. He does not use the list of errors as the definition, and argues that such a definition lacks rigor, and he is right. He says:

> Ideally, the fact that these errors are ruled out by memory safety is a consequence of its definition, rather than the substance of it. What is the idea that unifies these errors?

He then offers a technical definition (model) involving pointers that come with capability of accessing memory (as if carrying the bounds), which seems like one way to be precise about it.

I have come to the conclusion that language safety is about avoiding untrapped errors, also known as "undefined behavior". This is not at all new, it just seems to have been forgotten or was never widely known somehow. If interested, find the argument here https://burakemir.ch/post/memory-safety-the-missing-def/


What's important is the context in which the term is used today: it's specifically about security and software vulnerabilities, not about a broader notion of correctness and program reliability. Attempts to push past that have the effect of declaring languages like Java and Python memory-unsafe, which is not a defensible claim.

This is a false dichotomy. Language design choices are the causes of security and software vulnerabilities. It is possible to recognize the value of GC languages and have precise technical terminology at the same time. We can invent new words.

I believe everyone who cares about memory safety appreciates that certain bugs cannot occur in Java and Go, and if the world calls that memory safe, that is ok.

There are hard, well-defined guarantees that a language and implementation must make, and a space of trade-offs. We need language and recognition for the ability to push the boundary of hard, well-defined guarantees further. That, too, is memory safety and it will be crucial for moving the needle beyond what can be achieved with C and C++.

No one has a problem with applications being ported from low-level to GC-ed languages, the challenge is the ones where this is not possible. We need to talk about memory safety in this specific context, and mitigations and hardening will not solve the entire problem, only pieces of it.


You can invent whatever word you want, but "memory safety" is a term of art that very definitely includes GC'd languages.

Did you read what I wrote up there?

There is art and there is science. What both have in common is that their protagonists do not intend to become obstacles of progress.

I'm afraid GC'd languages have been around for a very long time and yet we continue to talk about memory safety as an urgent problem. Now what?

How does pretending that low-level memory safety is not its own complex domain deserving of its own technical definitions help with anything?


The urgent problem is the problem settings where GC'd languages are not a good fit, including kernels and userland-kernels (AKA browsers). The problem is not that GC'd languages are insufficiently memory-safe.

1. The numbers on memory safety should nowadays distinguish between spatial issues (bounds-checked in most languages with sane flags) and temporal ones. Temporal ones will be lower than 70%.

2. The article does not mention the (compilation) time costs of static checks and their influence on growing code bases, which is a more fundamental system trade-off on scalability and development velocity/efficiency.

> Wrap unsafe code with safe interfaces

3. That sounds like an equivalent of using permission-based separation logic, hopefully soon available for LLVM.

> "Get good" is not a strategy

4. Being good is all about knowing exactly which techniques, processes, and tools to apply and with what trade-offs; so I would expect teaching here to cover process knowledge about static and dynamic analysis strategies, tooling, and tactics to eliminate bug classes. However, we have neither sane overviews of bug classes nor a way to generate sane probabilities/statistics of occurrence based on source code and the available static and dynamic analysis, even when ignoring functionality requirements. This reads somewhat like developers should "stay mediocre" and "trust the tools", even though "get good and improve processes for groups" is probably the intended strategy here.


What do you mean by "temporal ones will be lower than 70%"? Are you suggesting --- referring to the article's cite --- that subcomponent ports to Rust reduce spatial vulnerabilities more than temporal ones? If so, why do you believe that?

I would expect more of the opposite for languages offering easy-to-enable bounds checks. Only observing the general trend for when bounds checks are available (and used) would be helpful information. Temporal ones will remain tricky to debug until at least static + dynamic analysis via scheduling in kernel, hypervisor/simulation and/or run-time becomes commonplace. Probably even longer, because analysis is cumbersome for bigger programs without Separation Logic (like Rust has).

If you're talking about the impact of replacing a subcomponent in a larger C/C++ codebase with a memory safe language and saying you'd expect that to make less of an impact on the temporal memory safety issues latent in the remaining C/C++ code, I guess I get that.

If you're saying that you think memory safe languages are less successful at dealing with their own temporal memory safety concerns than spatial memory safety concerns, that doesn't make sense to me and I would push back on it.


I do agree with point 1 and with most of point 2 besides some more arcane things like intentionally racy memory access and some write-cache eviction instruction not being properly modeled (in for example Rust).

> a roughly 70 percent reduction in memory-safety vulnerabilities.

Couldn't find this in the reference text. Is it my interpretation? https://www.memorysafety.org/docs/memory-safety/#how-common-...


This article comes up with yet another definition of memory safety. Thankfully, it does not conflate thread safety with memory safety. But it does a thing that makes it both inaccurate (I think) and also not helpful for having a good discussion:

TFA hints at memory safety requiring static checking, in the sense that it's written in a way that would satisfy folks who think that way, by saying thingys like "never occur" and including null pointer safety.

Is it necessary for the checking to be static? No. I think reasonable folks would agree that Java is memory safe, yet it does so much dynamic checking (null and bounds). Even Rust does dynamic checking (for bounds).

But even setting that aside, I don't like how the way that the definition is written in TFA doesn't even make it unambiguous if the author thinks it should be static or dynamic, so it's hard to debate with what they're saying.

EDIT: The definition in TFA has another problem: it enumerates things that should not happen from a language standpoint, but I don't think that definition is adequate for avoiding weird execution. For example, it says nothing about bad casts, or misuses of esoteric language features (like misusing longjmp). We need a better definition of memory safety.


I want to be there with you, but the definition this piece uses is, I think, objectively the correct one --- "memory safety", at least as used in things like "The Case For Memory Safe Roadmaps" government guidance, is simply the property of not admitting to memory corruption vulnerabilities.

I don't see where you're seeing the article drawing a line between static and dynamic defenses. The article opens by noticing Rust isn't the first memory safe language. It is by implication referring to things like Java, which have dynamic, runtime-based protections against memory corruption.


> I want to be there with you, but the definition this piece uses is, I think, objectively the correct one --- "memory safety", at least as used in things like "The Case For Memory Safe Roadmaps" government guidance, is simply the property of not admitting to memory corruption vulnerabilities.

This piece does not define memory safety as "not admitting memory corruption vulnerabilities". If it was using that definition, then:

- You and I would be on the same page.

- I would have a different complaint, which is that now we have to define "memory corruption vulnerability". (Admittedly, that's maybe not too hard, but it does get a bit weird when you get into the details.)

The definition in TFA is quoted from Hicks, and it enumerates a set of things that should never happen. It's not defining memory safety the way you want.


From this, I think we probably just mostly read the piece differently, and you probably read it more carefully than I did. (I know your background in this space).

I'm always a little guarded about message board definitions of "memory safety", because they tend to be axiomatically derived from the words "memory" and "safety", and they tend to have an objective of saying that there's only one mainstream language that provides memory safety.


> and they tend to have an objective of saying that there's only one mainstream language that provides memory safety.

Yeah!

I agree it's hard to do it without some kind of axioms or circularity, but it's also important to get the definition right, because at some point, we'll need to be able to objectively say whether some piece of critical software is memory safe, or not.

So we have to keep trying to find a good definition. I think that means rejecting the bad definitions.


Are you responding to an AI hallucinated version of this article? It doesn't say what you're saying it does.

TFA is too long, like all articles since the arrival of you know what. So the definitions are scattered. Here it claims:

Rust's big step function was to offer memory safety at compile time through the use of static analysis borrowed and grown out of prior efforts such as Cyclone, a research programming language formulated as a safe subset of C.

In other words, Rust has solved the halting problem since the static checking of array bounds is undecidable in the general case!


> In other words, Rust has solved the halting problem

No one is making that claim.


"offer memory safety at compile time through the use of static analysis"

For arrays, this problem is not computable at compile time, hence the sarcastic remark that, IF THE ABOVE DEFINITION IS TAKEN AT FACE VALUE, Rust must have solved the halting problem. Downvoters are so dumb here.


> IF THE ABOVE DEFINITION IS TAKEN AT FACE VALUE

Why are you shouting? That's what twits do. You don't want to be a twit, do you? Read the site guidelines, emphasis is done with italicized text marked up with an * at the beginning and another * at the end.

But to respond to the topic at hand: Are you familiar with the distinction between sound (what Rust aims for) and complete analyses?


Are you familiar with the fact that Rust does array bounds checking at runtime, contrary to the cited claim? And that this was kind of the topic of this subthread?

https://stackoverflow.com/questions/28389371/why-does-rust-c...


Are you aware you’ve built a straw man and replied to it? Literally you quoted:

> Rust's big step function was to offer memory safety at compile time through the use of static analysis borrowed and grown out of prior efforts such as Cyclone, a research programming language formulated as a safe subset of C.

It is absolutely true that memory safety is offered at compile time in Rust which is a novel thing. You then pivoted this to start talking about bounds safety of arrays which is a strawman interpretation of what was written.


I did not pivot. TFA itself defines:

"Memory safety—the property that makes software devoid of weaknesses such as buffer overflows, double-frees, and similar issues—has been a popular topic in software communities over the past decade"

And the buffer overflows are not detected statically except for the cases when the compiler can prove them. And Rust proponents keep ignoring the topic of this subthread, which is memory safety by static analysis.


As others have pointed out to you. Memory safety is enforced at compile time through static analysis. This does not mean the absence of runtime checks - that is a property you are injecting into the definition and arguing about.

RefCell enforces memory safety too, by doing the lifetime enforcement at runtime, using UnsafeCell to provide certain semantics. But the compiler ensures that the RefCell itself still has correct lifetimes and is used correctly, resulting in a compile-time guarantee that the code that’s run is memory safe.
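A minimal illustration of that runtime enforcement (my sketch, not anyone's quoted code):

  use std::cell::RefCell;

  fn main() {
    let cell = RefCell::new(vec![1, 2, 3]);

    {
      let _shared = cell.borrow(); // runtime-tracked shared borrow
      // Calling cell.borrow_mut() here would panic ("already borrowed"):
      // the aliasing rules are checked at runtime rather than compile time.
    }

    cell.borrow_mut().push(4); // fine: the shared borrow was released
    println!("{:?}", cell.borrow());
  }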


Since it seems that Rust programmers are unable to provide a definition for "memory safety", here is what more mature and academically minded people think:

https://blog.adacore.com/memory-safety-in-rust

Notice that in-bounds indexing is included, same as in the definition from this submission that I quoted at you.


Again, no one is arguing that bounds checking isn’t required for memory safety. You are intentionally continuing to argue against a position no one has taken anywhere - that Rust somehow always has zero runtime safety checking and everything happens statically at compile time. Literally RefCell and array bounds. I keep trying to clarify the wording that’s in the article and what it’s intended to mean that you take issue with, and instead you keep insisting it means the wrong thing that literally no one has argued for.

Bounds checking is part of "memory safety" and Rust people don't get the monopoly on that term. The definition of "memory safety" is literally the topic of this subthread.

Nope. You don't need to "solve the halting problem" and I guess I'll explain why here.

First, let's attack the direct claim. The reason you'd reduce to the halting problem is via Rice's theorem. But Rice only matters if we want to accept exactly the programs with the desired semantics. In practice what you do is either allow some incorrect programs too (like C++ does; that's what IFNDR is about: now some programs have no defined behaviour at all, but oh well, at least they compiled) or you reject some correct programs too (as Rust does). Now what we're doing is merely difficult rather than provably impossible, and people do difficult things all the time.

This is an important choice (and indeed there's a third option, but it's silly: you could do both, rejecting some correct programs while allowing some incorrect ones, the worst of both worlds), but in neither case do we need to solve an impossible problem.

Now, a brief aside on what Rust is actually doing here, because it'll be useful in a moment. Rust's compiler does not need to perform a static bounds check on an array index; what it needs to statically check is only that somebody wrote the runtime check. This satisfies the requirement.
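
A minimal sketch of that division of labor (my own example, assuming only the standard slice API):

  fn read(a: &[i32], i: usize) -> i32 {
    // The library's indexing code contains the runtime bounds check;
    // the compiler only verifies that this checked path is what gets called.
    a[i] // panics if i >= a.len()
  }

  fn read_unchecked(a: &[i32], i: usize) -> i32 {
    // Opting out of the check requires an `unsafe` block; proving the index
    // is in bounds becomes the programmer's obligation, not the compiler's.
    // (A real API would mark this whole fn unsafe for that reason.)
    unsafe { *a.get_unchecked(i) }
  }

  fn main() {
    let a = [1, 2, 3, 4, 5];
    println!("{}", read(&a, 2));           // 3
    println!("{}", read_unchecked(&a, 2)); // 3, but with no bounds check
  }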

But now back to laughing at you. While in Rust it's common to have fallible bounds checks at runtime (only their presence being tested at compile time), in WUFFS it's common for all the checks to be at compile time, and so to have your code rejected if the tools can't see why your indexing is always in bounds.

When WUFFS sees you've written arr[k], it treats this as a claim that you've proved 0 <= k < arr.len(); if it can't see how you've proved that, your code is wrong and you get an error. The result is that you're going to write a bunch of math when you write software, but the good news is that instead of the math going unread because nobody reviewing the code bothered, the machine reads it and runs a proof checker, so if you were wrong it won't compile.

Edited: Fix off-by-one error


I'm glad that I provide amusement, but rustc-1.63.0 still compiles this program that panics at runtime. My experience with Rust is 5 min, so the style may provide further amusement:

  fn f(n : usize) -> usize {
    if n == 0 {
      10
    } else if n % 32327 == 0 {
      1
    }
    else {
      f(n-1)
    }
  }

  fn main() {
    let a = [1,2,3,4,5];
    let n = f(12312);
    println!("{:?}", a[n]);
  }

At compile time, Rust guaranteed memory safety here by ensuring a runtime check was present, which your code triggered, resulting in a panic.

A panic is not a violation of memory safety; if you wanted to violate memory safety you'd need to have caused that deref to succeed, and println to spit out the result.

EDIT: let me guess, did the panic message include "index out of bounds: the len is 5 but the index is 10"?



> but rustc-1.63.0 still compiles this program that panics at runtime.

An index OOB error? Here, it's important to remember that a Rust panic is still memory safe. Perhaps you should read the article, or read up on what undefined behavior is?[0] Here, the Rust behavior is very well defined. It will either abort or unwind.[1]

If you prefer different behavior, there is the get method on slices.[2]
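
For example (a sketch of my own, reusing the array from the program above), get returns an Option instead of panicking:

  fn main() {
    let a = [1, 2, 3, 4, 5];

    // `get` returns None for an out-of-bounds index instead of panicking.
    match a.get(10) {
      Some(x) => println!("{}", x),
      None => println!("index 10 is out of bounds; handled without a panic"),
    }
  }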

[0]: https://en.wikipedia.org/wiki/Undefined_behavior

[1]: https://doc.rust-lang.org/reference/panic.html

[2]: https://doc.rust-lang.org/std/primitive.slice.html#method.ge...


This subthread is about claims of STATIC checking of memory safety. A panic is not static checking. Perhaps you should read what you respond to.

Statically asserting at compile time that all memory accesses are either in bounds, or will result in a controlled unwind or exit of the process, guarantees there are no memory safety violations.

We understand you're saying it's not possible in the general case to assert that all memory accesses are in bounds. Instead, if you ensure every memory access either stays in bounds or at least fails in a way that does not violate memory safety, you've achieved the requirement of "memory safety", regardless of runtime inputs.


> This subthread is about claims of STATIC checking of memory safety. A panic is not static checking. Perhaps you should read what you respond to.

Oh, I read it.

Rust, and for that matter the person to whom you are replying above, never claimed that Rust could statically check array bounds. You created that straw man. Yes, Rust does use static analysis as one important method to achieve memory safety, but Rust doesn't require static analysis in every instance for that to be its big step forward, or for it to achieve memory safety.

Yes, certain elements of memory safety can only be achieved at runtime. Fine. But using static analysis to achieve what elements of memory safety you can at compile time is obviously better, where possible, than doing it only at runtime, as in Java or Fil-C.


A panic is memory-safe, so static checking of memory safety holds. Perhaps you should understand your own claims.

I'll henceforth refer to the process of using vector.at(0) instead of vector[0] in C++ as "providing memory safety by static analysis".

Static analysis has a specific meaning, and rote insertion of bounds checking isn't it.


If the only way of triggering spatial memory unsafety in C++ was vector[i] and that operation was defined to always interrupt execution, then yes, C++ would be considered memory safe. But that is not the case.

The equivalent of vector[i] in Rust is Vec::get_unchecked, which is marked as unsafe and is not the default that people reach for normally.


We are, however, talking in this subthread about the compiler inserting bounds checks and (incorrectly) calling the process "static checking".

I refuted that point by pointing out that the same process, if done manually in C++, would not be considered "static analysis that provides memory safety for array access".


A memory safety violation has a specific meaning, and a panic isn't one.

C++ can have UB; compilable non-unsafe Rust can't. That's what static analysis for memory safety gets you.

The main point here is that you don't know this (and refuse to learn it).


> In other words, Rust has solved the halting problem since the static checking of array bounds is undecidable in the general case!

Pfft, the simply typed lambda calculus solved the halting problem back in 1940.


What if, instead of fighting the symptom of people writing faulty, exploitable software, by explaining to them that Rust (as a stand in) is good, we try to fix the cause?

Imagine if for every data breach, every ransomware attack, every intrusion that used privilege escalation, etc., there were an investigation into whether the company in question is at fault for using insecure software (i.e., software written in unsafe languages and not properly vetted). And if the company is found guilty, it is punished accordingly, so that every company is motivated to prevent such cases.

Of course people can still write C / C++ programs as a hobby or publish their software under open source licenses. But the people or organizations using that software must make sure (within reason) that it is safe, or be held liable if it inflicts damage. That would automatically make software written in safe languages much more attractive to them, because safety is much easier to prove – no evangelist needed.


Ugh. I know all that and I am still sick of hearing about memory safety. My teammates spend way more time fixing security issues in "safe" languages than in C/C++/whatever. It simply doesn't matter…

It's hard to compare the two. A low-level memory safety issue can intersect with security. So can a flaw in logic that touches on security, but is reproducible and not undefined in any way.

The latter can often be more easily exploited than the former, but the former can remain undetected longer, affect more components and installations, and be harder to reproduce in order to identify and resolve.

As an example of "more easily exploited": say that you have a web application that generates session cookies that are easy to forge, leading to session hijacking. Not much skill is needed to do that, compared to exploiting a memory safety problem (particularly if the platform has some layered defenses against it: address space randomization, non-executable stacks, and whatnot).


What security issues are biting you in safe languages that wouldn't also appear in C/C++ ?

Perhaps he's talking about risk compensation (https://en.wikipedia.org/wiki/Risk_compensation) --- e.g. maybe safe languages structurally excluding memory corruption and concurrency problems tempts developers to let their guard down with respect to correctness generally and produce security vulnerabilities that wouldn't occur in a language with C's need for rigor.

Doubtlessly, there is some of that going on. I doubt the risk compensation erases the benefit of memory safety, but let's not kid ourselves.


A large number of real world security issues are attacks on humans not software. No programming language can solve social engineering problems.

If the Five Eyes agencies recommend memory safety, can we conclude that they already get all their information via "AI" data harvesting in Office 365 and similar? What will happen to Pegasus? Or do they have yet-unknown backdoors in Rust?

I believe they also recommend encryption but I wouldn’t conclude that encrypted data doesn’t make their job harder.

They do indeed recommend encryption, sometimes with deliberate backdoors: https://en.wikipedia.org/wiki/Dual_EC_DRBG

So viewing their recommendations with caution seems wise.


It seems obvious that with hardware-level memory safety on the way[1], just gradually modernizing existing C and C++ codebases to take advantage of safer constructs (like smart pointers or checked arithmetic) makes much more sense than rewriting everything in Rust. Even better, thanks to GCC, is that you don't need to sacrifice any portability to take advantage of even bleeding-edge features, due to its front-end/back-end separation and multitude of supported platform back-ends. Fish shell had to drop support for some platforms when it was rewritten in Rust[2].

[1] https://community.intel.com/t5/Blogs/Tech-Innovation/open-in...

[2] http://fishshell.com/blog/rustport/


> Fish shell had to drop support for some platforms when it was rewritten in Rust

Weird misrepresentation of your source... they had to drop support for only the most obscure platforms, and concluded "We don't see a big problem here".

Rust has enormous platform support. See https://doc.rust-lang.org/nightly/rustc/platform-support.htm... for a list. It also has the backend/frontend separation you describe, because it uses LLVM as its backend. There is also ongoing work to plug it into GCC, as well as Rust compilers that can output C code directly to target dead embedded platforms that only have a single proprietary C compiler.


> It seems obvious that with hardware-level memory safety on the way[1], just gradually modernizing existing C and C++ codebases to take advantage of safer constructs (like smart pointers or checked arithmetic) makes much more sense than rewriting everything in Rust

I read this in the opposite way: if the hardware is going to be stricter about memory accesses being valid, that suggests that software is going to have to meet a higher standard in order to successfully run.

Imagine if you had to satisfy the Rust borrow checker, except you're still writing C and don't have additional tooling during compilation to show how a problem could trigger; you just have more crashes.


Ya ever heard of this thing called a debugger? They have this amazing ability to show you what the problem is right when it happens!

Crashes can be difficult to repro, especially if they occur in rare error paths and your software distribution mechanism doesn't give you in-field telemetry. Of course even Rust isn't going to catch all issues at compile time (a lot of checks at runtime result in panics), but it does seem to catch many if not most, which is very helpful. This is much like the argument for static typing.

I've used many, debugging both dumps and live processes. And I'll take a compiler that highlights the problem at build time any day.

Hardware level memory safety doesn’t solve bugs caused by the compiler seeing UB in your code and choosing to emit different code than you intended.

I think memory safety is fine, but I plan to do it in C++, not Rust. Nothing in this article is remotely new either; it just repeats the same tired stuff.

It seems pretty clear statistical hardware-level memory safety is coming (Apple has it; Intel/AMD have a document saying they plan to add it). The safety zealots can turn that on, or you can use Fil-C if you need absolute memory safety, but really C++ with various hardening features turned on is already fairly solid.

Also, I think that memory safety is generally less important than thread safety (because memory safety is rather easy to achieve, and violations are easy to detect), and almost all the languages recommended by this article blow chunks at thread safety. Rust could actually make a solid argument here, instead of wasting time yammering about memory safety.


Rust’s thread safety story is a subset of broader memory safety - it just guarantees that concurrent programs still are memory safe. This also happens to correspond to the most frequent sources of bugs in concurrent programs, but it’s not all there is to thread safety.
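
A minimal sketch of what that guarantee looks like in practice (my own example, not the parent's): handing the same mutable state to several threads only compiles once the types prove it is safe, e.g. via Arc<Mutex<_>>.

  use std::sync::{Arc, Mutex};
  use std::thread;

  fn main() {
    // Sharing a plain `&mut i32` across threads is rejected at compile time;
    // Arc<Mutex<_>> is one way to satisfy the Send/Sync requirements.
    let counter = Arc::new(Mutex::new(0));

    let handles: Vec<_> = (0..4)
      .map(|_| {
        let counter = Arc::clone(&counter);
        thread::spawn(move || {
          *counter.lock().unwrap() += 1;
        })
      })
      .collect();

    for h in handles {
      h.join().unwrap();
    }

    println!("{}", *counter.lock().unwrap()); // prints 4
  }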

It’s talked about all the time but whenever people talk about memory safety, thread safety is implied. Statistical hardware memory safety is more a security feature. But knowing that your code is correct before you even try it is a huge productivity boost and valuable in and of itself for all sorts of reasons.

The pushback from C++ people is weird, considering that Rust makes high-performing code easier to achieve because of its rules around aliasing and auto-optimized struct layouts, among other things. This is stuff C++ will never get.


A few days ago I took another look at Rust.

I don't like the syntax at all, and coming from a Scala background that should say a lot. I will never like it, unless they simplify it.

However, they implemented at compile time all the rules about memory that someone wrote down in the C++ Core Guidelines, or in dozens of books about how to write safe code.

Now you can say: so you see, if you are good at C++, you achieve the same as Rust.

True. But I don't buy it. I don't believe that even hardcore C++ developers never make a memory-related mistake. I have been in the industry for a while and shit does happen, even to the best of us, unless you have "guardrails" around, and even then, so many things can go wrong...

So a few days ago I asked on HN why not use hardened allocators, which would basically block such mistakes from being exploitable, a bit like the OpenBSD allocator.

It seems that 1) people who develop low-level stuff don't even need the heap, or anyway use a limited subset of things, and 2) such allocators slow down the runtime significantly.

So what I hear is:

1) I know what I am doing, I don't need a compiler to tell me that.

2) I anyway need a subset of that, so ... who cares.

3) runtime tooling helps me.

Except for 2 (like writing something for Arduino, or using some very specific subsets of C/C++), everything else sounds pretty hypocritical to me, and the proof is that even the very best engineers have to fix double frees and the like.

If C++ broke the ABI and decided tomorrow to adopt memory safety at compile time, you would see that most people would use it. However, this holy war against Rust "because" is really antithetical to our industry, which should be driven by educated people who use the right tool for the job. As I said, OK for the very low-level cases and restricted subsets, but why would you start a CLI tool in C/C++ today (if not for some very specific reasons)?

For me this has more to do with fear of learning/doing something new. It makes no sense otherwise to reject a language that catches when a random developer got the lifetime of an object wrong, etc. A language that is only improving, not dying anytime soon, with one of the fastest-growing ecosystems, not "one person's weekend project" or things like that.

As I mentioned multiple times, I particularly dislike Rust's syntax. I love the C syntax the most, although C++ has deviated from that significantly with templates etc. But if something is better suited, sorry, we need to use it. If the C/C++ committees are the slowest things on the planet, why should my company/software pay for that? Everyone said Python would die after the backward-incompatible 2->3 upgrade. Sure.


I have found Rust syntax a breath of fresh air compared with C/C++, and I was doing C/C++ for over 15 years before I started with Rust. No arbitrary semicolons; everything is an expression, leading to more concise code; no need to worry about the interior layout and excess padding of a struct; function declarations that are trivial to read; inference all over the place; variable shadowing that's actually safe; traits that let you duck-type behavior elegantly with zero overhead; a proper macro system. And no weird language rules like having to know the informal rule of 0/5, or code that fails to manage resources in an exception-safe way.
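
Two of those in a tiny sketch of my own (nothing from the thread): the if branch produces the value directly, and shadowing lets you rebind a name to a new type with full compiler checking.

  fn describe(n: i32) -> String {
    // `if` is an expression, so the branch itself yields the value.
    let kind = if n % 2 == 0 { "even" } else { "odd" };

    // Shadowing: rebinding `n` to a String is checked and safe.
    let n = n.to_string();

    format!("{n} is {kind}")
  }

  fn main() {
    println!("{}", describe(7)); // prints "7 is odd"
  }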

It's alien and unfamiliar, but I actually don't have problems with the syntax itself. Swift has the "everything is an expression" property. Python's list comprehensions felt weird and confusing the first time I encountered them. TypeScript (and Go) have traits after a fashion. In fact I find Go syntax particularly ugly even though it's "simpler".


Efficiently implementing a doubly linked list in C or C++ is easy. In Rust, less so.[0]

And the prevalence and difficulty of unsafe mean both that Rust is not memory safe [1], and that Rust is sometimes less memory safe than C or C++.

[0]: https://rust-unofficial.github.io/too-many-lists/

[1]: For an example of memory unsafety in Rust: https://materialize.com/blog/rust-concurrency-bug-unbounded-...


You can write a linked list in Rust the same way as in C or C++ with unsafe, so it's at least as easy.

While I think it's a troll account, it is technically true that the rules around unsafe Rust are a little harder to get exactly right to avoid UB, because you still have to uphold the much larger surface area of rules in safe Rust without any of the compiler's help. C++, by contrast, has fewer such rules and they're easier to reason about, but there's no compiler warning when you violate them.

On the other hand, that line of argument is kind of weak sauce, because the vast majority of bugs aren't in complicated recursive data structures. And you can always implement a doubly linked list in pure safe Rust, just with slightly more overhead, by using Rc/Arc if you want guarantees (and you can also verify the unsafe implementation using Miri, which is a significantly stronger checker than anything C++ offers, where you only have ASAN/UBSAN). A sketch of the safe version is below.
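
A minimal sketch of such a safe doubly linked node (my own illustration, assuming Rc/Weak/RefCell; a real list would wrap this in a List type with push/pop):

  use std::cell::RefCell;
  use std::rc::{Rc, Weak};

  // Strong references point forward, weak references point back,
  // so the forward/backward cycle doesn't leak memory.
  struct Node {
    value: i32,
    next: Option<Rc<RefCell<Node>>>,
    prev: Option<Weak<RefCell<Node>>>,
  }

  fn main() {
    let first = Rc::new(RefCell::new(Node { value: 1, next: None, prev: None }));
    let second = Rc::new(RefCell::new(Node { value: 2, next: None, prev: None }));

    first.borrow_mut().next = Some(Rc::clone(&second));
    second.borrow_mut().prev = Some(Rc::downgrade(&first));

    // Follow the strong link forward and the weak link backward.
    let forward = Rc::clone(first.borrow().next.as_ref().unwrap());
    let back = second.borrow().prev.as_ref().unwrap().upgrade().unwrap();
    println!("{} <-> {}", back.borrow().value, forward.borrow().value); // 1 <-> 2
  }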


Did you actually create this account just to hate on Rust?


