Since this is about C declarations: for anyone who (like me) had the misfortune of learning the so-called "spiral rule" in college rather than being taught how declarations in C work, below are some links that explain the "declaration follows use" idea that (AFAIK) is the true philosophy behind C declaration syntax (and significantly easier to remember/read/write).
TL;DR: you declare a variable in C _in exactly the same way you would use it:_ if you know how to use a variable, then you know how to read and write a declaration for it.
Using x, or dereferencing p, or subscripting the array arr, or declaring a function that can be called with fn, or dereferencing the function pointer pfn then calling it, all these things would produce an int.
It's the intended way to read/write declarations/expressions. As a consequence, asterisks end up placed next to the identifiers. The confused ones will think it's a stylistic choice and won't understand any of this.
Yes, the () operator dereferences function pointers automatically for you, for convenience. There's also the surprise that you can dereference function pointers as many times as you like, since each dereference just yields another function pointer.
One baffling thing I see people do when typedefing function pointers is insisting on baking the pointer part into the typedef, which just complicates and hides things.
If you want to typedef a function pointer, make a completely ordinary function declaration, then slap 'typedef' at the beginning, done.
This does require you to do "foo_func *f" instead of "foo_func f" when declaring variables, but that is just clearer imo.
typedef int foo_func(int); // nice
typedef int (*foo_func)(int); // why?
I once considered Wren for a situation where I (to a first approximation) wanted to allow users to write 'plugins' that link against my internal C application symbols but using a language focusing more on ease of use (rather than C).
Unfortunately, neither Wren nor any of the other major 'embeddable scripting languages' (e.g., Lua) were really a good fit for this, because they commit fully to the 'all-numbers-are-floats' thing and generally don't seem to even try to provide a general equivalent to the C++ `extern "C" { ... }` thing.
Of course, I know this isn't really the target use case of Wren/Lua/etc., but if anyone knows of a good embeddable scripting language for this I'd love to hear about it. Eventually I went with CPython (which provides ctypes to solve my problem) but it's a huge pain to embed properly.
LuaJIT does provide C-compatible types through its FFI. I generally prefer LuaJIT over normal Lua for this reason. It also makes embedding super trivial as you only need to use the Lua API to bootstrap and call the first Lua function, after that you can just use the FFI which lets you work directly with extern host functions.
Ones that might be of interest to you are Umka, Tcl, and Berry.
There are also a lot of others listed that range from someone's experimental side project to professional-grade and well-supported languages. Kinda fun to see different people's approaches to things, and no matter what your preferred programming style, there are probably a few in there that will mesh pretty well.
Given the goal is to work with existing C programs (which already have free(...) calls "carefully" placed), and you're already keeping separate bounds info for every pointer, I wonder why you chose to go with a full GC rather than lock-and-key style temporal checking[1]? The latter would make memory usage more predictable and avoid the performance overhead and scheduling headaches of a GC.
Perhaps storing the key would take too much space, or checking it would take too much time, or storing it would cause race condition issues in a multithreaded setting?
I think the lock-and-key approaches don’t have Fil-C’s niftiest property: the capability model is totally thread safe and doesn’t require fancy atomics or locking in common cases.
Also find it interesting that you're allowing out-of-bounds pointer arithmetic as long as no dereference happens, which is a class of UB compilers have been known to exploit ( https://stackoverflow.com/questions/23683029/is-gccs-option-... ). Do you disable such optimizations inside LLVM, or does Fil-C avoid this entirely by breaking pointers into pointer base + integer offset (in which case I wonder if you're missing out on any optimizations that work specifically on pointers)?
For starters, llvm is a lot less willing to exploit that UB.
It’s also weird that GCC gets away with this at all, as many C programs on Linux that compile with GCC make deliberate use of out-of-bounds pointers.
But yeah, if you look at my patch to llvm, you’ll find that:
- I run a highly curated opt pipeline before instrumentation happens.
- FilPizlonator drops flags in LLVM IR that would have permitted downstream passes to perform UB driven optimizations.
- I made some surgical changes to clang CodeGen and some llvm passes to fix some obvious issues from UB.
But also let’s consider what would happen if I hadn’t done any of that except for dropping UB flags in FilPizlonator. In that case, a pass before pizlonation would have done some optimization. At worst, that optimization would be a logic error or it would induce a Fil-C panic. FilPizlonator strongly limits UB to its “memory safe subset” by construction.
I call this the GIMSO property (garbage in, memory safety out).
Not knowing the exact language used by the C standard, I suspect the reason GCC doesn't cause these issues with most programs is that the wording of "array object" refers specifically to arrays with compile-time-known sizes, i.e. `int arr[4]`. Most programs that do out-of-bounds pointer arithmetic are doing so with pointers from malloc/mmap/similar, which might have similar semantics to arrays but are not arrays.
> FilPizlonator drops flags in LLVM IR that would have permitted downstream passes to perform UB driven optimizations.
Does this work reliably, or did your patches have to fix bugs here? There are LLVM bugs with floating point where the backend doesn't properly respect passed attributes during codegen, violating the behavior of user-level flags. I imagine the same thing exists for UB.
LLVM is engineered to be usable as a backend for type-safe/memory-safe languages. And those flags are engineered to work right for implementing the semantics of those languages, provided that you also do the work to avoid other LLVM pitfalls (and FilPizlonator does that work by inserting aggressive checks).
Of course there could be a bug though. I just haven't encountered this particular kind of bug, and I've tested a lot of software (see https://fil-c.org/programs_that_work).
To add another suggestion for understanding the Fourier transform: personally, the first explanation that ever clicked with me was the one in the Aho/Hopcroft/Ullman algorithms textbook.
Rather than talking about sine and cosine waves, they motivate the Fourier transform entirely in terms of polynomials. Imagine you want to multiply two polynomials (p(x) and q(x)). The key is to recognize that there are two ways to represent each polynomial:
1. "Coefficient form," as a set of coefficients [p_0, p_1, p_2, ..., p_d] where p(x) = p_0 + p_1x + p_2x^2 + ... + p_dx^d, OR
2. "Sample form," as a set of sampled points from each polynomial, like [(0, p(0)), (1, p(1)), (2, p(2)), ..., (d, p(d))]
Now, naive multiplication of p(x) and q(x) in coefficient form takes O(d^2) scalar multiplications to get the coefficients of p(x)q(x). But if you have p(x) and q(x) in sample form, it's clear that the sample form of p(x)q(x) is just [(0, p(0)q(0)), (1, p(1)q(1)), ...], which requires only O(d) multiplications!
As long as you have enough sample points relative to the degree, these two representations are equivalent (two points uniquely define a line, three a quadratic, four a cubic, etc.). The (inverse) Fourier transform is just a function that witnesses this equivalence, i.e., maps from representation (1) to representation (2) (and vice-versa). If the sample points are chosen cleverly (not just 1/2/3/...) it actually becomes possible to compute the Fourier transform in O(d log d) time with a divide-and-conquer algorithm (the FFT).
So, long story short, if you want to multiply p(x) and q(x), it's best to first convert them to "sample" form (O(d log d) time using the FFT), then multiply the sample forms pointwise to get the sample form of p(x)q(x) (O(d) time), and then finally convert them back to the "coefficient" form (O(d log d) using the inverse FFT).
> This is the leverage paradox. New technologies give us greater leverage to do more tasks better. But because this leverage is usually introduced into competitive environments, the result is that we end up having to work just as hard as before (if not harder) to remain competitive and keep up with the joneses.
Off-topic, but in biology circles I've heard this type of situation (where "it takes all the running you can do, to keep in the same place" because your competitors are constantly improving as well) called a "Red Queen's race" and really like the picture that analogy paints.
I feel that I understand the leverage paradox concept, and the induced demand concept, but I don't understand how they are the same concept. Can you explain the connection a little more?
More leverage = more productivity = more supply of good and services
The induced demand for more goods and services therefore fills the gap, and causes people to work just as hard as before -- similarly to how a highway remains full after adding a lane.
I assume the poster is referencing Citizens United v. FEC, specifically about the government's use of the 2002 Bipartisan Campaign Reform Act to restrict showing of political documentaries (apparently, called "Hillary: The Movie" and "Celsius 41.11").
While (as far as I know) the law was never actually used to ban books (only documentaries), the case became infamous because the government argued that it had the right to ban books if it wanted to. See, e.g., the NYTimes article below: "The [government's] lawyer, Malcolm L. Stewart, said Congress has the power to ban political books, signs and Internet videos, if they are paid for by corporations and distributed not long before an election.".
Yeah, I made a mistake. There were a couple of films the FEC went after, and they claimed the power extended to books, as you pointed out. I was under-caffeinated.
Absolutely not a perfect solution, and maybe you're already using it within your Makefiles, but for anyone who doesn't yet know about it there's Latexmk[1] which is supposed to automate all of this hassle. I think at least on Debian it's included with texlive-full. In addition it has some nice flags like `-outdir` which lets you send all the crazy LaTeX intermediate build/aux files to a separate directory that's easy to gitignore.
> LaTeX needs several passes to compile because it was designed with minicomputers of the 80s in mind, i.e. tiny memory constraints.
That's certainly part of it, but any typesetting program will need multiple passes to properly handle tables of contents—you can't know a section's page number until you've compiled everything before that section (including the table of contents), but adding a new section to the contents could push everything ahead by another page. The only unique thing about LaTeX here is that it directly exposes these multiple passes to the user.
> Let me guess, they published a bunch of papers, did a bunch of experiments like "lets do X but gui" "what if you didn't have to learn syntax" and then nobody ever did anything with any of the work because it was a total dead end.
This response is very confusing to me, and it seems you have a very different understanding of what STEPS did than I do.
In my understanding, the key idea of STEPS was that you can make systems software orders of magnitude simpler by thinking in terms of domain-specific languages, i.e., rather than write a layout engine in C, first write a DSL for writing layout engines and then write the engine in that DSL. See also, the "200LoC TCP/IP stack" https://news.ycombinator.com/item?id=846028
You seem to think they're advocating a Scratch-like block programming environment, but I'm not sure that's accurate. Can you point to where in their work you're finding this focus?
I too believe STEPS was basically a doomed project, but I don't think it's for the reason you've said (moreso just the extreme amount of backwards compatibility users expect from modern systems).
(--- edit: ---)
> You don't make leaps from paying grad students to play around with "how can we make programming better", you get it from all of a sudden an AI can just generate code.
I think this is a more compelling point, but it doesn't seem to explain things like the rise of Git as "a way to make programming (source control) better," and it's not clear how to determine when something counts as an "all of a sudden" sort of technology. They would probably say their OMeta DSL-creation language was this sort of "all of a sudden" technological advance that lets you do things in orders of magnitude less code than before.
That is not a TCP/IP stack in 200 LoC. The thing described is a TCP/IP segment parser with response packet encoder.
The “stack” described [1] cannot transmit a payload stream. That also means it avoids the “costly (in terms of lines)” problems of transmit flow control and “reliable send” guarantees in the presence of transmit pipelining needed for even modest throughput. For that matter, it does not even reconstruct the receive stream, which is, again, one of the more “costly” elements due to data allocation and gap detection. It also does not appear to handle receive-side flow control, but that could be hidden in the TCP receive engine state, which is not described.
These are not minor missing features. The thing described does not bear any resemblance to an actual TCP implementation, instead being more similar to just the receive half of a bespoke stateful UDP protocol.
Now it is possible that the rest of the TCP/IP stack exists in the other lines, as only about 25-ish lines are written down, but you can conclude almost nothing from the described example. The equivalent C code supporting the same features would be similar-ish (under 100 lines) in length, not 10,000 lines. That is not to say it is not a tight implementation of the feature set, but it is not reasonable to use it as evidence of multiple order of magnitude improvements due to representation superiority.
I agree --- I mostly think it's interesting as one of the most concrete examples of what they claim to have actually done that I've been able to find.
In general, it's frustrating that, as far as I can tell, they don't seem to have made any of the code from the project open source. Widespread skepticism about their claims due to this is (IMO) justified.
I guess I'm making a wider claim about the effectiveness of funding 'directed innovation'.
Very often innovations can happen if you fund accomplishing some previously unaccomplished task. Building a nuclear bomb, sending a man to the moon, decoding the genome. The innovations come about because you have smart people trying to solve something nobody has ever tried to solve before, and to do that they slightly revolutionize things.
I'm not aware of a single case where the goal was defined in terms of innovation itself, as in "find a better way to program" and anything useful or even slightly innovative resulted. They are by definition doing something that lots of people are trying to do all the time. It's just very unlikely that you are creating conditions which are novel enough to produce even a slightly new idea or approach.
Generally what you get is a survey of how things are currently done, and some bad ideas about how you could tweak the current state of affairs a little. But if there was a way to just patch up what we already know how to do then it's very likely someone already tried it, really it's likely 1000 people already tried it.
Sorry, added an edit to my above post before I saw this, just to summarize:
I think that's a more reasonable complaint, but I fear it's too vague to be applicable.
The STEPS folks would probably say that a modern computing environment in ~20kloc is something that was previously unaccomplished and thought to be unaccomplishable, but you're writing that off/not counting it as such, presumably because it failed.
On the other end of the spectrum, things like Git (to my knowledge) did come out of the "find a better way to source control" incremental improvement mindset. (Of course, you can say the distributed model was "previously unaccomplished," but the line here is blurry.)
I don't think Git itself is a revolution or new technology. It took what people were trying to do (but with an extremely frustrating user experience, like taking an hour to change branches locally) and just did it very well, by focusing on what is important and throwing away what isn't. It's an achievement of fitting a solution to a problem.
I don't think they DID build a modern computing environment at all. They built something that kind of aped one, but unlike Git they missed the parts that made a computing environment useful for users. It's more like one of those demo-scene demos where you say "Wow, how did they get a Commodore 64 to do that!?"
If they did build a modern computing environment with 20k LOC, that is a trillion dollar technology. Imagine how much faster Microsoft or Apple would be able to introduce features and build new products if they could do the same work with 2 orders of magnitude less code! That is strong evidence that this wasn't actually the result.
> I don't think they DID build a modern computing environment at all.
I agree that, now, after they've tried and failed, we can say they didn't build a modern computing environment in 20kloc.
My point is just that, when they were pitching the project for funding, there was no real way to know that this "trillion dollar technology" goal would fail whereas nukes/moon mission/etc. would succeed. Hindsight is 20/20, but at the beginning, I don't think any of these projects defined themselves as "doing something that lots of people are trying to do all the time;" instead they would probably say "nobody else has tried for a 20kloc modern computing system, we're going to be the first."
Given they all promised to try something revolutionary, I'm not sure it's fair to claim after-the-fact how obvious it is that one would fail to achieve that vs. another.
But I do take your point that it's important in general not to fall into the trap of "do X but tweak it to be a bit better" and expect grand results from that recipe.
https://eigenstate.org/notes/c-decl https://news.ycombinator.com/item?id=12775966