Hacker News | sgc's comments

I really don't understand what in his usage pattern would have triggered that obviously automated ban. Can somebody let me know what they think might be adversarial enough to be considered 'hacking' or similar by a bot?

I can get multiple sets of footnotes (critical + content notes) reliably recognized and categorized using gemini-3-flash-preview. I took 15-20 hours to iterate on my prompt for a specific format. Otherwise it would not produce good enough results. It was a slow process because results from batch did not mirror what I was getting from the chat mode, and you have to wait for batch results while analyzing the last set. There was also a bit of debugging of the batch protocol going on at the same time. Flash is also surprisingly affordable for the results I am getting, 4-5x less than I had anticipated. I gave up on gemini-3-pro pretty quickly because it overthinks and messes things up.

The best leaderboard I have used is ocrarena.ai. I agree it is not detailed enough. I wish people could rate which parts of the OCR went well or badly (layout, text recognition, etc). However, my more specific results using custom prompts and my own images on their playground page are fairly closely aligned with the rankings others have voted on.

What more are you looking for?


Electrical upgrades are almost always required, and price is more like 7k-9k around here. It's going to be seriously painful for a lot of people.

If you were in the market for a resistive electric heat pump, you likely had the service for it already. A heat pump version will almost always require less power.

My bad, read too quickly. I was thinking of the forced change over from gas water heaters, which is already happening in the California Bay Area and will only expand.

If you currently have an electric resistive water heater, a heat pump water heater with the same heating capacity will use 3-4x less power, which means you can use a much smaller circuit.

A 6kW 240V EWH draws 25A; it’ll need #8 wire and a 35A or 40A breaker.

An equivalent HPWH would use 1.5kW at 240V, or 6.25A. You can use #14s and a 15A breaker.
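
For anyone who wants to check the arithmetic, here is a rough sketch in Python. It assumes a purely resistive load (I = P / V), sizes the breaker at 125% of the load as if it were a continuous load, and picks from a short list of common breaker ratings. The 125% factor, the ratings list, and the function names are my own assumptions for illustration, not something stated in the comments above, and none of this is a substitute for your local electrical code.

```python
# Illustrative only: the 125% continuous-load factor and the list of
# breaker ratings below are assumptions, not values from the thread.
COMMON_BREAKER_RATINGS_A = [15, 20, 25, 30, 35, 40, 45, 50]
CONTINUOUS_LOAD_FACTOR = 1.25


def current_amps(power_watts: float, voltage_volts: float) -> float:
    """Current drawn by a load, using I = P / V."""
    return power_watts / voltage_volts


def smallest_breaker(load_amps: float) -> int:
    """Smallest listed rating that covers the load times the assumed factor."""
    required = load_amps * CONTINUOUS_LOAD_FACTOR
    for rating in COMMON_BREAKER_RATINGS_A:
        if rating >= required:
            return rating
    raise ValueError(f"no listed breaker covers {required:.1f} A")


if __name__ == "__main__":
    for label, watts in [("6 kW resistive", 6000), ("1.5 kW heat pump", 1500)]:
        amps = current_amps(watts, 240)
        print(f"{label}: {amps:.2f} A, breaker >= {smallest_breaker(amps)} A")
```

Under those assumptions the 6 kW unit lands on a 35A breaker and the 1.5 kW unit on a 15A breaker, which lines up with the figures quoted above.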


Ehh 120v models exist. My 65 gal runs fine on a standard 20a breaker.

This is why I won't use random distros, even if they have better features. It's just one more point of failure, one more point of unnecessary trust. I would rather fight to deal with specific problems with specific apps on one of the handful of core distros with long histories.

Agreed, I just installed Fedora 43. I don’t even trust CachyOS at this point.

I feel like Cachy is even more fragile than Archlinux.

I feel this way about open source generally.

Lots of cool stuff that I happily use, but the bar to installing something that gets to see my password (OS, terminal, input handler, etc) is very high.

Not a popular take, but I'd rather run something from Valve or Google for the same reason. I trust there to be more vetting if a corporation is putting its reputation on the product than a toy I found on GitHub.

It's a bit of a myth that open source leads to more eyes on the software. Most people just install it and trust that somebody else did the audit.

Something with a vibrant community of maintainers? Maybe.

Something that's too big to personally audit but too small for that community? I'll pass.


That's not an open source problem, though; that's a supply chain problem. Some random little proprietary freeware isn't better.

Semantics, but yes.

The problem isn't the open source (in fact, that's better). The problem is downloading random shit from the internet, and the biased assumption that open-source == trustworthy.


Open source does not equal trustworthy, but open source repositories usually are trustworthy, because they're trusted repositories.

Debian repos are not NPM. Yes, the packages are actually vetted to some degree.


Exactly. I remember there was a case where Louis Rossmann covered a repair tool that was hacking its customers if they did something the developer didn't like.

At least with open source you have a chance to prevent this. With proprietary it's pure trust.


And then there was Gaggiuino, which was intentionally bricking displays if you tried to use your own. A project with open source roots.

It can happen anywhere, really.


I'll take "can be inspected" over the alternative.

I agree, there are companies I'd trust but most software isn't made by Valve and Google. There are plenty of developers also not auditing their dependencies.


The most surprising thing on this page for me was:

> The areas of least atmospheric humidity ... a large area of ‘dry’ atmosphere also covers part of the South Atlantic Ocean (centre of image).

This area is not that far south as to basically indicate the antarctic, and it is warm season in the southern hemisphere. I did not even think it would be possible to have a larger area of low humidity over a massive ocean like that.


If I were extremely cynical, I would suspect they might have intentionally falsified that response to make it seem like they were more naive than they actually were.


I suspect the more likely scenario is they don't actually care how accurate these nominal categorizations are. The information they're ultimately trying to extract is, given your history, how likely you are to click through a particular ad and engage in the way the advertiser wants (typically buying a product), and I would be surprised if the way they calculate that was human interpretable. In the Facebook incident where they were called out for intentionally targeting ads at young girls who were emotionally vulnerable, Facebook clarified that they were merely pointing out to customers that this data was available to Facebook, and that advertisers couldn't intentionally use it.[0] Of course, the result is the same, the culpability is just laundered through software, and nobody can prove it's happening. The winks and nudges from Facebook to its clients are all just marketing copy, they don't know whether these features are invisibly determined any more than we do. Similarly, your Google labels may be, to our eyes, entirely inaccurate, but the underlying data that populates them is going to be effective all the same.

[0] https://about.fb.com/news/h/comments-on-research-and-ad-targ...


This. They would have been better off just tagging you with a GUID and it would have been less confusing. "This GUID is your bubble"


I think it's their currently targeted ad demographic or whatever. It's probably a "meaningless" label to humans, but to the computer it makes more sense. He probably watches the same content / googles the same things as some random person who got that label originally, and then anyone else who matched it.


Yeah somewhat like "likes football" might just be a proxy for "male".


male, lives in this region, has an income between X and X+40000, and has used the following terms in chat or email, regardless of context, in the last 6 months: touchdown, home run, punt, etc. etc.

the ad game is not about profiling you specifically, it's about how many people in a group are likely to click and convert to a sale; they're targeting 6 million people, not you specifically, and that's balanced by how much the people who want the ads are willing to pay.

palantir or chinese social credit, etc., is targeting you specifically, and they don't care about costs if it means they can control the system, forever.


So better, richer content twitter? I never used that site except for a brief period at the beginning of the current Ukrainian war, but it sounds like it appeals to the same type of person.


This is like the education or gun debates, or basically any quality of life message you might have. It's almost impossible to get your message heard. There will always be some non-reason why everything is oh-so-different in the US. It's very frustrating to live here with all the matter-of-fact head-in-the-sand know-it-all bloviating.

Meanwhile our teachers are suffering enormously, our education is terrible, our roads are terrible, we are poisoning ourselves with substandard food, we have extremely expensive but relatively poor healthcare to deal with the problems that creates, we have no time off and are labor slaves where maximum effort for minimum pay is the norm, and half the country has become violently oppressive to the point of absolutely thriving off the suffering they perceive inflicted on others. And still, we know better - of course - because we are Americans.


There are some very wealthy people who have spent massive amounts of time and money making things the way they are. They've got things set up in a way that benefits them. They go to great lengths to keep Americans convinced that the way things are can't be changed, and it's an uphill battle trying to convince Americans otherwise. Even if most Americans wise up, they'll still use the resources they have to stop the changes we want from happening. I don't know what the solution is, but I do know it won't be easy.


How does this compare to dots.ocr? I got fantastic results when I tested dots.

https://github.com/rednote-hilab/dots.ocr


Ocrbase is CUDA only while dots.ocr uses vLLM, so should support ROCm/AMD cards?


How about CPU?


dots.ocr requires a considerable amount of computational resources. If you have a Mac with an ARM CPU (M series), you can try my dots.ocr.runner (https://github.com/jason-ni/app.dots.ocr.runner).

There is a pipeline solution with multiple small specialized models that can run on CPU alone: https://github.com/RapidAI/RapidOCR


Jason, your runner looks interesting. I am using debian linux on my laptop with an intel cpu and nvidia gpu (proprietary nvidia cuda drivers). Should I be able to get it working? What is your speed per page at this point? Thank you

