I’ve been working on a very similar sync problem and hit this too. I think the way forward is to use the Web Locks API to elect a single leader tab, which then communicates with all the other contexts over a BroadcastChannel.
The Web Locks API is supported by all modern browsers. You just have all the tabs try to lock the same resource and return a promise from the lock callback. The first one wins, and when that tab closes, the next one in line gets automatically elected.
The leader can then use a broadcast channel and act as a server for all the other tabs to serialize access to any shared resources.
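A minimal sketch of the election step, assuming a browser with the Web Locks API. The lock name `"app-leader"` is arbitrary, and `locks` is passed in as a parameter here just to keep the sketch testable; in a real tab you'd pass `navigator.locks`:

```javascript
// Every tab requests the same named lock; requests queue up in order.
function electLeader(locks, onElected) {
  return locks.request("app-leader", () => {
    onElected(); // this tab is now the leader
    // Hold the lock forever. The browser releases it automatically when
    // this tab closes, and the next queued tab's callback runs.
    return new Promise(() => {});
  });
}

// In a browser:
//   electLeader(navigator.locks, () => {
//     const channel = new BroadcastChannel("app-requests");
//     channel.onmessage = (e) => { /* serve the other tabs */ };
//   });
```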
The filler tokens actually do make them think more. Even just allowing the models to output "." until they are confident enough to output something increases their performance. Of course, training the model to do this (use pause tokens) on purpose works too: https://arxiv.org/pdf/2310.02226
OK, that effect is super interesting, though if you assume all the computational pathways happen in parallel on a GPU, it doesn't necessarily increase the time the model spends thinking about the question; it just conditions the model to generate a better output when it finally emits a non-pause answer. If you condition models to generate pauses, they aren't really "thinking" about the problem while they generate pauses; they just learn to generate pauses and do the actual thinking only at the last step, when the non-pause output is generated, utilizing the additional pathways.
If however there were a way to keep passing hidden states to future autoregressive steps and not just the final tokens from the previous step, that might give the model true "thinking" time.
> if you assume all the computational pathways happen in parallel on a GPU, that doesn't necessarily increase the time the model spends thinking about the question
The layout of the NN is actually quite complex: a large amount of information is computed besides the tokens themselves and the weights (think "latent vectors").
The biggest benefit to this is that it makes one of the slowest parts of virtual DOM diffing (longest common subsequence) no longer required. After this becomes supported in the mainstream, not even signal-based frameworks will have to include VDOM algorithms. I know this because I remember pushing for this to be supported a few years ago — a nice reminder that the standards are evolving and that nothing stops you from being a small part of the effort.
Next up — DOM parts and standardized signals, unified.
How do you even bring this up in a way that it gets noticed by the right people? There are so many times I run into niche parts of the DOM that I feel need serious enhancements. I use Blazor (WASM) a lot more these days, so a lot of that is masked from me now.
VDOM hasn't been needed for a long while, if ever.
You can do a lot better by just checking whether the template that's rendering to a spot in the DOM is the same as the previous template. If it's the same, just update the bindings from the template; if it's different, re-render the whole thing.
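A sketch of that check, with a hypothetical `slot` object standing in for whatever per-position bookkeeping the framework keeps (`update`, `clear`, and `render` are illustrative names, not a real library's API):

```javascript
// Commit a template + its dynamic values to one spot in the DOM.
function commit(slot, template, values) {
  if (slot.template === template) {
    // Same template as the previous render: only refresh the bindings.
    slot.update(values);
  } else {
    // Different template: throw away the old DOM and render from scratch.
    slot.clear();
    slot.template = template;
    slot.render(values);
  }
}
```

Because templates are compared by identity, the check is a single `===` rather than a tree diff.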
That's simpler and faster, but the one thing it leaves out is stateful reordering of lists. You can add a specific re-ordering code path, which is still simpler than a full VDOM, but if you want to preserve state the way moveBefore() does, even that reordering gets pretty complicated: you can only preserve state for one contiguous range of nodes that you don't actually move, and instead you move all the other nodes around them. moveBefore() just eliminates all that extra complexity.
There are also a couple of standards issues open for native reordering of siblings, like reorderChildren(). That would eliminate the re-ordering code completely.
I mean that's fine, but when mixed into a conversation on web standards and APIs it leads to a place where you don't care about any web-specific evolution including most progress on the DOM. You end up only caring about a future of WASM + WebGPU, etc. This is exactly where some of the original React core team said they wanted to go - they just wanted a network connected framebuffer.
Again, fine, but I personally prefer a vibrant web platform that evolves to meet the needs of web developers.
You can still have a framework-specific render tree that maps to the DOM that tracks changes with signals instead of diffing. We’re just saying that there’s no requirement for diffing algorithms anymore to performantly and correctly reconcile state to the DOM. Keyed children was the last place it was needed.
Signal-based libraries already don't need a VDOM. It's possible that it'll allow faster reordering of elements that are already connected, but the added checks needed to see if an element is connected (you need to use insertBefore if it's not) might also cause an overall perf regression. The main thing is that it allows elements to be moved without triggering disconnect/connect.
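The connectedness check might look like this (a sketch assuming a browser that ships moveBefore(); the helper name `placeBefore` is made up):

```javascript
// Place `node` before `ref` inside `parent`, preserving state when possible.
function placeBefore(parent, node, ref) {
  if (node.isConnected && parent.moveBefore) {
    // Node is already in the tree: state-preserving move
    // (focus, iframes, animations survive).
    parent.moveBefore(node, ref);
  } else {
    // Detached node (or no moveBefore support): a plain insert is fine,
    // since there is no live state to preserve.
    parent.insertBefore(node, ref);
  }
}
```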
With ~2/3 (and growing) of Earth’s land area being arid, you’d expect there to be a lot more interest in reverse-desertification. It’s pretty well established by now that it’s almost entirely a labor allocation issue given that there’s success stories all over the world [1] (including in the US [2]) to just create earthworks that slow and divert water. Water seems to be the biggest single issue and so many deserts actually get enough of it during flash floods. With how much Elon is interested in terraforming Mars why not just start here?
The key is topological sorting of the dependency graph. This can be done implicitly by storing a reactive node’s depth when it is created, and just making sure that updates are enqueued in separate queues per depth.
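A minimal sketch of that depth-bucketed scheduling (all names are illustrative, not any particular library's API). Each node records its depth at creation time as one more than the deepest dependency, and flushing proceeds depth by depth, so every node recomputes at most once per update:

```javascript
function createGraph() {
  const queues = []; // queues[d] = set of dirty nodes at depth d

  function node(deps, compute) {
    const n = {
      depth: deps.length ? Math.max(...deps.map(d => d.depth)) + 1 : 0,
      compute,
      value: undefined,
      subs: [],
    };
    deps.forEach(d => d.subs.push(n));
    if (compute) n.value = compute();
    return n;
  }

  function markDirty(nodes) {
    for (const n of nodes) (queues[n.depth] ??= new Set()).add(n);
  }

  function flush() {
    // Nodes only ever enqueue subscribers at greater depths,
    // so a single forward pass settles the whole graph.
    for (let d = 0; d < queues.length; d++) {
      for (const n of queues[d] ?? []) {
        n.value = n.compute();
        markDirty(n.subs);
      }
      if (queues[d]) queues[d].clear();
    }
  }

  function set(source, value) {
    source.value = value;
    markDirty(source.subs);
    flush();
  }

  return { node, set };
}
```

With a diamond graph (a → b, a → c, b,c → d), `d` sits at depth 2 and only runs after both depth-1 nodes have settled.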
My memory says that there’s a “Chinchilla” paper showing how to make the best model with a given training budget. There’s a trade-off between the amount of training data and the size of the model itself. Chinchilla under-training would mean that the model is too big for the amount of training data used. Llama is Chinchilla over-trained in that there is a ton of data relative to the small size of the model.
Note that this is still desirable for inference because you want the most possible training on whatever model you can actually fit in your memory.
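As rough arithmetic: the "~20 training tokens per parameter" rule of thumb is from the Chinchilla paper, and the 7B-parameter / ~1T-token figures are from the LLaMA paper; everything here is ballpark, not a precise scaling law.

```javascript
// Chinchilla rule of thumb: compute-optimal training uses
// roughly 20 tokens per model parameter.
const chinchillaOptimalTokens = params => 20 * params;

const llamaParams = 7e9;  // LLaMA 7B
const llamaTokens = 1e12; // trained on ~1 trillion tokens

const optimal = chinchillaOptimalTokens(llamaParams); // ~1.4e11 tokens
const actualRatio = llamaTokens / llamaParams;        // ~143 tokens/param
// ~7x past the Chinchilla-optimal point: "over-trained" for its size,
// which is exactly what you want for cheap inference.
```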
Exactly - the Twitter thread would be really improved if everyone read an introduction to information theory. There are 1024 bits of information in a kibibit, and that's enough information for 2^1024 unique messages. The interpreter of the messages judges what the message information means, but the Kolmogorov complexity of a message is inescapable: you would need to modify the interpreter to squeeze out more messages (which is nothing more than pre-transferring information to the receiver).
Shannon’s insight into the hyperspheres of transmission blew my mind 15 years ago, and I’m still not over it.
The idea of an ideal communication channel rivals most ideas ever had. The original paper is surprisingly accessible. Highly recommended if anyone hasn’t had a chance to give it a tour.
I guess I did take for granted that it includes a bit of calculus and discrete math. I have a little tip.
Once you refresh your knowledge of the greek alphabet, it’s not that scary. Math just needs a lot of symbols.
A good starting point would be infinite series, since they’re the basis on which we imagine things bigger than a human lifetime. They also remind us why we had to define a “limit” for that strange-looking summation symbol, ∑.
It took me years to see the significance of arithmetic series, but as with any ∑, you have to start somewhere! (Often at -∞ and +∞)
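The limit-of-a-summation idea above is concrete in the geometric series, which is probably the friendliest first example:

```latex
s_n = \sum_{k=0}^{n} r^k = \frac{1 - r^{n+1}}{1 - r},
\qquad
\sum_{k=0}^{\infty} r^k \;=\; \lim_{n \to \infty} s_n \;=\; \frac{1}{1 - r}
\quad \text{for } |r| < 1.
```

The infinite sum is literally defined as the limit of the finite partial sums, which is why the limit had to come first.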
The trick to all of it is that there are patterns declared by the underlying structures. There is not really a way to understand them without playing around with equalities and diagrams until you reach a certain zenlike moksha. Trust me, it is quite fulfilling!
I made a library whose state management started the same way. A simple approach is great for performance-sensitive parts and it's easy to write, but there is very little helpful structure to it. Reactivity fails very easily; consider a dependency graph like this:
a -> b
a -> c
b,c -> d
In this case, updating `a` would cause `d` to be calculated (and its reactions run) twice. Also, update batching is absolutely necessary for anything non-trivial.
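A minimal sketch of that double computation with naive eager propagation (all names illustrative): each update notifies subscribers immediately, so `d` runs once via `b` and again via `c`, and the first run even sees a stale `c`:

```javascript
let dRuns = 0;
const state = { a: 1, b: 2, c: 2, d: 4 };

function updateD() { state.d = state.b + state.c; dRuns++; }
function updateB() { state.b = state.a + 1; updateD(); }
function updateC() { state.c = state.a * 2; updateD(); }

function setA(v) {
  state.a = v;
  updateB(); // recomputes d with the OLD c (a "glitch")
  updateC(); // recomputes d again with the new c
}

setA(5);
// dRuns is now 2, and d was briefly 8 (6 + stale 2) before settling at 16.
```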
At least for what I made, I don't think it's too complex for what it provides. It transparently integrates with the event loop and only runs what it needs to, at a cost of 432 bytes (minified + brotli) at the moment.
Compare the "naive" SimpleReactive with the complete FunctionalReactive version here:
https://github.com/Technical-Source/bruh/blob/main/packages/...
Hangul covers the Korean language; IPA covers all languages. For example, some languages use clicks, though to be fair, IPA currently only covers 5 of the 6 clicks. Again, the point is not that IPA is the solution, but that having a common notation would make learning languages much easier, since only one written notation would need to be learned.
I just wanted to use a concrete example of a featural writing system. It would of course not be Hangul if it had to accommodate all phonemes. The other direction to take would be to create a sophisticated (probably ML) model that could enable certain features to be optimized, like speed of syllable recognition. If we are constructing a modern writing system then that would definitely be a good approach.
I initially thought that “if” should be the obvious choice, ignoring Sass entirely. However, it’s clear that “when” makes more sense than “if” given CSS’s interpretation anyway. “if” sort of implies a one-time use, like in a template/macro or preprocessor. “when” really captures the temporal invariance of CSS’s conditionals.
I think it's fair to acknowledge the clash, but "if" is our word for that thing; "when" could have been that word. Every time you do a conditional jump, you're only doing it when certain conditions are met: "when x > 5 { y += 3; } elsewise { y -= 1; }" reads perfectly fine, but it's common convention to use "if" for this.
I don't really have a strong opinion on the when vs. if fight w.r.t. yielding support to Sass, but I think that "if" is just a better keyword to use in this instance based on language usage alone.
I think the difference is that, if x later becomes > 5, y doesn't suddenly get increased by 3.
By contrast, in CSS, if the condition later becomes true, the declarations in it do start to apply. So I also think "when" (in the sense of "whenever") makes more sense than "if".
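Media queries already behave this "whenever" way: the declarations start and stop applying as the condition changes, without the author re-running anything.

```css
/* Applies whenever the viewport is at least 600px wide, and stops
   applying when it no longer is. An "evaluated once at load" reading
   of this condition would get the semantics wrong. */
@media (min-width: 600px) {
  .sidebar { display: block; }
}
```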
CSS is evaluated (probably) in some sort of event loop in the browser, and you can also evaluate the code I included above in an event loop. There is nothing inherently instantaneous or continuous about CSS; given that it's deterministic, it's only actually recalculated on certain event triggers.