Hacker News | pmc00's comments

You can do this in Windows too, useful if you want tiny executables that use minimum resources.

I wrote this little systemwide mute utility for Windows that way. It's annoying to be missing some parts of the CRT, but not bad. Code here: https://github.com/pablocastro/minimute


I thought windows had an unstable syscall interface?


Pretty much yeah.

You have your usual Win32 API functions found in libraries like Kernel32, User32, and GDI32, but in Windows versions after XP, those don't make system calls themselves. The actual system calls are found in NTDLL and Win32U. There are lots of functions you can import, and each is basically a tiny stub: it loads the system call number and executes SYSENTER (SYSCALL on x64) for the native version, or switches back to 64-bit mode for a WOW64 DLL. The names of these functions always begin with Nt, like NtCreateFile. There's a corresponding kernel-mode call that starts with Zw instead, so in kernel mode you have ZwCreateFile.

But the system call numbers used with SYSENTER are indeed reordered every time there's a major version change to Windows, so you just call into NTDLL or Win32U instead if you want to directly make a system call.


It looks like that project does link against the usual Windows DLLs, it just doesn't use a static or dynamic C runtime.


Windows isn’t quite like Linux in that typically apps don’t make syscalls directly. Maybe you could say what’s in ntdll is the system call contract, but in practice you call the subsystem specific API, typically the Win32 API, which is huge compared to the Linux syscall list because it includes all sorts of things like UI, COM (!), etc.

The project has some of the properties discussed above such as not having a typical main() (or winmain), because there’s no CRT to call it.


Fair point on latency, we (Azure AI Search) target both scenarios with different features. For instant search you can just do the usual hybrid + rerank combo, or if you want query rewriting to improve user queries, you can enable QR at a moderate latency hit. We evaluated this approach at length here: https://techcommunity.microsoft.com/blog/azure-ai-foundry-bl...

Of course, agentic retrieval is just better quality-wise for a broader set of scenarios, usual quality-latency trade-off.

We don't do SPLADE today. We've explored it and may get back to it at some point, but we ended up investing more in reranking to boost precision, since we've found we have fewer challenges on the recall side.


Code-golfed Game of Life in Javascript we wrote a while back with some friends. Always surprising how much abuse Javascript can take: https://shorterlife.github.io/challenge/


I stared at this with more intensity and interest than any movie in recent times.


This is such a good way of presenting code-golf — well done!


For another set of measurements that support RRF + Hybrid > vectors, we (Azure AI Search team) did a bunch of evaluations a few months ago: https://techcommunity.microsoft.com/t5/ai-azure-ai-services-...

We also included supporting data in that write up showing you can improve significantly on top of Hybrid/RRF using a reranking stage (assuming you have a good reranker model), so we shipped one as an optional step as part of our search engine.
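For readers unfamiliar with RRF (Reciprocal Rank Fusion), here is a minimal sketch of the general technique, not the Azure AI Search implementation. It assumes each retriever returns a ranked list of document ids; the function name, the sample ids, and the constant `k=60` (a commonly used default) are illustrative choices, not taken from the write-up.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1 for the top hit of each list.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from a keyword retriever and a vector retriever.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists (`doc_a`, `doc_c`) end up ahead of documents that only one retriever found, which is the intuition behind using the fused list as the candidate set for a reranker.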


(Disclaimer: I work in this team)

More details, including old vs new limits: https://techcommunity.microsoft.com/t5/ai-azure-ai-services-...


Here's a bit of a quantification of the point Doug makes. Indeed, for a number of scenarios you get better results if you combine vector search and keyword search into a hybrid retrieval step, and do reranking on top of that.

https://techcommunity.microsoft.com/t5/ai-azure-ai-services-...

(disclaimer: I work in that team)
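A rough sketch of what "hybrid retrieval, then reranking" looks like as a two-stage pipeline. Everything here is a stand-in: `DOCS`, `keyword_retriever`, `vector_retriever`, and the word-overlap `overlap` scorer are hypothetical toys used in place of a real keyword index, embedding index, and reranker model, and the fusion step uses RRF as one possible way to merge the lists.

```python
# Hypothetical toy corpus.
DOCS = {
    "d1": "reset the router to factory settings",
    "d2": "router firmware update steps",
    "d3": "how to restart your modem",
    "d4": "factory reset for phones",
}


def overlap(query, text):
    """Toy relevance score: number of distinct query words present in text."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def keyword_retriever(query):
    """Stand-in for a keyword index: rank docs sharing at least one query word."""
    hits = [d for d in DOCS if overlap(query, DOCS[d]) > 0]
    return sorted(hits, key=lambda d: (-overlap(query, DOCS[d]), d))


def vector_retriever(query):
    """Stand-in for an embedding index: returns a fixed, made-up ranking."""
    return ["d3", "d1", "d2"]


def hybrid_search(query, retrievers, reranker, candidates=50, top=3, k=60):
    # Stage 1: run every retriever and fuse the ranked lists (RRF here).
    fused = {}
    for retrieve in retrievers:
        for rank, doc_id in enumerate(retrieve(query)[:candidates], start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    shortlist = sorted(fused, key=lambda d: (-fused[d], d))[:candidates]
    # Stage 2: rescore only the shortlist with the (more expensive) reranker.
    return sorted(shortlist, key=lambda d: -reranker(query, DOCS[d]))[:top]


results = hybrid_search(
    "factory reset router",
    retrievers=[keyword_retriever, vector_retriever],
    reranker=overlap,  # a real system would call a reranker model here
)
print(results)
```

The point of the split is cost: the cheap retrievers narrow the corpus to a shortlist, and the reranker only has to score that shortlist.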


This is interesting. I recently built a search tool that needed to locate documents by keyword or by semantics, so I implemented a hybrid search straight away: BM25 + embeddings (from `gte-base`), with a cross-encoder for reranking.

I found that the lexical search was adding nothing; the embeddings alone produced almost identical results for keyword queries. (The re-ranker, however, made a big difference.)

Is this unusual?


It depends on the scenario. For example, for concept-seeking queries, vectors tend to do better (less likely to be an overlap in words between query and content), whereas for keyword searches (a product name, a serial number, project codenames, etc.) BM25 + keywords does much better. If your workload is all concept-seeking queries, it's reasonable that keywords don't add much.

If you look at the table in the section "3. Hybrid Retrieval brings out the best of Keyword and Vector Search" of that article, we shared there the significant variability of metrics as a function of query types.
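To make the keyword-query case concrete, here is a small, self-contained BM25 sketch (a common formulation with a smoothed IDF, not the scoring code of any particular engine). The documents, the `xk-4471` model number, and the parameter values `k1=1.2`, `b=0.75` are made up for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with a basic BM25 formula."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue  # terms absent from the document contribute nothing
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "replacement pump for model xk-4471 dishwasher",
    "how to fix a dishwasher that will not drain",
    "quiet energy efficient dishwasher buying guide",
]
scores = bm25_scores("xk-4471", docs)
print(scores)
```

Only the document containing the exact token gets a non-zero score, which is why exact identifiers such as product names or serial numbers tend to favor the keyword side, while a query sharing no words with the relevant content would score zero everywhere and has to rely on vectors.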


Agreed, vector search is great but it's only one of many tools you can use to create a great search solution.

We recently did a bunch of evaluation work to quantify the differences between keyword search, vector search, hybrid, reranking, etc. across a few datasets. We shared the results here: https://techcommunity.microsoft.com/t5/azure-ai-services-blo...

Disclosure - I work in the Azure Search team.


Check out FeatureBase, when you get a chance. Vectors and super fast operations on sets. I'm using it for managing keyterms extracted from the text and stored along with the vectors.


Microsoft (Azure Cognitive Search team) | Engineering Manager | Full-Time | Onsite (multiple locations)

The Cognitive Search team continues to grow. We're looking for engineering managers for multiple locations. Join us to work at the intersection of cloud services, information retrieval and machine learning.

To apply:
Seattle: https://careers.microsoft.com/us/en/job/1167446/Principal-So...
Atlanta: https://careers.microsoft.com/us/en/job/1018927/Principal-So...
Bangalore: https://careers.microsoft.com/us/en/job/1188511/Principal-So...
Hyderabad: https://careers.microsoft.com/us/en/job/1157546/Principal-So...

For any questions reply here or ping me at pablo DOT castro AT company name.


A friend forked the original I posted and did exactly that: https://nbruno234.github.io/hello-world/lifewar.html


Microsoft (Azure Synapse Analytics and Azure Cognitive Search teams) | Software Engineers | Full-Time | ONSITE (Aliso Viejo CA, Atlanta GA, Hyderabad India) or REMOTE (US, Mexico, Chile, Argentina, Peru, Colombia, Costa Rica) | https://careers.microsoft.com/us/en/search-results?keywords=... and https://careers.microsoft.com/us/en/search-results?keywords=...

We're growing our teams in a number of different areas. If you have expertise in database systems internals (query processing, optimization, distributed query execution, storage engines), search (lucene, elasticsearch, infrastructure) or cloud systems (any cloud platform), we probably have a role that could be interesting for you.

Our current positions are all for folks with at least 2-3 years of experience and go up to fairly senior. We don't have entry-level positions in these teams right now (other parts of Microsoft most likely have some, though).

If interested apply on the links above, or email me and I'll route to the right hiring manager. pablo <dot> castro <at> company's domain.

