palsecam's comments

I do something quite similar with nginx:

  # Nothing to hack around here, I’m just a teapot:
  location ~* \.(?:php|aspx?|jsp|dll|sql|bak)$ { 
      return 418; 
  }
  error_page 418 /418.html;

No hard block; instead, reply to bots with the funny HTTP 418 code (https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/...). That makes filtering logs easier.

Live example: https://FreeSolitaire.win/wp-login.php (NB: /wp-login.php is WordPress login URL, and it’s commonly blindly requested by bots searching for weak WordPress installs.)
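
To illustrate why a dedicated status code makes log filtering easier, here is a minimal JavaScript sketch that splits access-log lines into 418 probes and everything else. It assumes nginx's default "combined" log format. The function names (parseLine, splitTeapots) and the sample lines are made up for this example and are not taken from the actual setup.

```javascript
// Assumes nginx's default "combined" access-log format:
//   $remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent"
const COMBINED = /^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)" (\d{3}) /;

// Parse one log line into { ip, time, request, status }, or null if it does not match.
function parseLine(line) {
  const m = COMBINED.exec(line);
  return m ? { ip: m[1], time: m[2], request: m[3], status: Number(m[4]) } : null;
}

// Split parsed entries into teapot (418) probes and everything else.
function splitTeapots(lines) {
  const probes = [], rest = [];
  for (const line of lines) {
    const entry = parseLine(line);
    if (entry === null) continue; // skip lines that do not parse
    (entry.status === 418 ? probes : rest).push(entry);
  }
  return { probes, rest };
}

// Hypothetical sample lines (documentation IP ranges), for illustration only.
const sample = [
  '203.0.113.7 - - [10/Oct/2025:13:55:36 +0000] "GET /wp-login.php HTTP/1.1" 418 475 "-" "curl/8.5.0"',
  '198.51.100.2 - - [10/Oct/2025:13:55:40 +0000] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
];
const { probes, rest } = splitTeapots(sample);
console.log(probes.map(p => p.request), rest.length);
```

With a hard block or a generic 403/404, the same split would need path heuristics; here a single status comparison is enough.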


418? Nice I'll think about it ;-) I would, in addition, prefer that "402 Payment Required" would be instantiated for scrapers ...

https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/...


nginx also has "return 444", a special code that makes it drop the connection altogether. This is quite useful if you don't even want to waste any bandwidth serving an error page. You have an image on your error page, which some crappy bots will download over and over again.


Yes @ 444 (https://http.cat/status/444). That’s indeed the lightest-weight option.

> You have an image on your error page, which some crappy bots will download over and over again.

Most bots won’t download subresources (almost none of them do, actually). The HTML page itself is lean (475 bytes); the image is an Easter egg for humans ;-) Moreover, I use a caching CDN (Cloudflare).


Beware of nginx 444 if your webserver is behind a load balancer.

The LB will see the unresponded requests and think your webserver is failing.

Ideal would be to respond at the webserver and let the LB drop the response.


Does it also tell the kernel to drop the socket? Or is a TCP FIN packet still sent?

Be better if the scraper is left waiting for a packet that'll never arrive (till it times out obviously)


Exactly. The Atom feed of my website declares an XSLT stylesheet which transforms it to HTML. That way it can be served directly to, and rendered prettily by, a web browser (see https://paul.fragara.com/feed.xml). For the curious, the XSLT can be found here: https://gitlab.com/PaulCapron/paul.fragara.com/-/blob/master...

Btw, you can also apply an XSLT sheet to an XML document using standard JavaScript: https://developer.mozilla.org/en-US/docs/Web/API/XSLTProcess...


Btw, a Tor relay can be relatively lightweight. I run one on a $5/mo VPS (which does many other things). You need 1 GiB of RAM, but a single basic CPU core largely suffices. My relay sends/receives ~150 GiB of traffic per day (~15 Mbits/s). It’s not an exit node, so no legal worries.
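
As a quick sanity check of the "~150 GiB per day is ~15 Mbit/s" figure, here is a small JavaScript sketch of the conversion. It assumes the 150 GiB is the total volume over 24 hours and that the rate is a plain average; the function name gibPerDayToMbps is made up for the example.

```javascript
// Convert a daily traffic volume in GiB into an average rate in Mbit/s.
function gibPerDayToMbps(gibPerDay) {
  const bits = gibPerDay * 2 ** 30 * 8; // 1 GiB = 2^30 bytes, 8 bits per byte
  const seconds = 24 * 60 * 60;         // seconds in a day
  return bits / seconds / 1e6;          // 1 Mbit = 10^6 bits
}

console.log(gibPerDayToMbps(150).toFixed(1)); // prints "14.9"
```

So the average sits well below the 80 megabits RelayBandwidthRate cap in the torrc that follows.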

Here’s my torrc:

  SocksPort  0
  ExitRelay  0

  ORPort     NNNN
  DirPort    NNNN

  Nickname     X
  ContactInfo  X@X.com

  RelayBandwidthRate    80 megabits
  RelayBandwidthBurst  120 megabits

  MaxMemInQueues  384 megabytes

  AvoidDiskWrites  1
  HardwareAccel    1
  NoExec           1
  NumCPUs          1

Here’s my override config for systemd (Ubuntu 24.04):

  $ sudo systemctl edit tor@default
  [Service]
  Nice=15
  CPUAffinity=0
  CPUWeight=60
  StartupCPUWeight=6
  IOWeight=60
  TimerSlackNSec=100us

  MemoryMax=896M
  MemoryHigh=800M
  OOMScoreAdjust=1000

  LimitAS=2G
  LimitNPROC=512
  LimitNOFILE=10240

  PrivateDevices=true
  ProtectSystem=true
  ProtectHome=true


Oh but yes, Firefox (and Chrome) do support XSLT natively! See https://paul.fragara.com/feed.xml as an example (the Atom feed of my website, styled with XSLT).

FTR, there’s also XSLTProcessor (https://developer.mozilla.org/en-US/docs/Web/API/XSLTProcess...) available from JavaScript in the browser. I use that on my homepage, to fetch and transform-to-HTML said Atom feed, then embed it:

  // Fetch the Atom feed and the XSLT stylesheet in parallel.
  const atom = new XMLHttpRequest(), xslt = new XMLHttpRequest();
  atom.open("GET", "feed.xml"); xslt.open("GET", "atom2html.xsl");
  atom.onload = xslt.onload = function() {
    // Runs once per request; proceed only when both are DONE (readyState 4).
    if (atom.readyState !== 4 || xslt.readyState !== 4) return;
    const proc = new XSLTProcessor();
    proc.importStylesheet(xslt.responseXML);
    const frag = proc.transformToFragment(atom.responseXML, document);
    // Embed only the element with role='feed' from the transformed output.
    document.getElementById("feed").appendChild(frag.querySelector("[role='feed']"));
  };
  atom.send(); xslt.send();

Server-side, I’ve leveraged XSLT (2.0) in the build process of another website, to slightly transform (X)HTML pages before publishing (canonicalize URLs, embed JS & CSS directly in the page, etc.): https://github.com/PaulCapron/pwa2uwp/blob/master/postprod.x...


Interesting, it's more than 20 years since I tried anything in that space. Thanks for the correction.


> and directly running it? How much more personally involved can you get?

Ms. Farmer is the CEO of BNSF Railway[1], she is the one “directly running it”.

Moreover, Berkshire/Buffett is notorious for _not_ micro-managing, for letting its subsidiaries enjoy greater autonomy than they would in a usual conglomerate.

1: https://en.wikipedia.org/wiki/Kathryn_Farmer


Right, but what is the point of letting companies own other companies if this doesn't also imply transitive accountability? It's hard to imagine we can't find reasons to imprison the board of BlackRock, for instance. Such a massive company has disproportionately small social responsibility.


We should extend this logic of accountability. If your kid does something wrong every living parent and grandparent should be held accountable. The world would instantly be better.

Not sarcasm.


> My pet theory is the BigCo's are walking a tightrope of model safety and are intentionally incorporating some uncanny valley into their products, since if people really knew that AI could "talk like Pete" they would get uneasy. The cognitive dissonance doesn't kick in when a bot talks like a drone from HR instead of a real person.

FTR, Bruce Schneier (famed cryptologist) is advocating for such an approach:

We have a simple proposal: all talking AIs and robots should use a ring modulator. In the mid-twentieth century, before it was easy to create actual robotic-sounding speech synthetically, ring modulators were used to make actors’ voices sound robotic. Over the last few decades, we have become accustomed to robotic voices, simply because text-to-speech systems were good enough to produce intelligible speech that was not human-like in its sound. Now we can use that same technology to make robotic speech that is indistinguishable from human sound robotic again.

https://www.schneier.com/blog/archives/2025/02/ais-and-robot...


Reminds me of the robot voice from The Incredibles[1]. It had an obviously-robotic cadence where it would pause between every word. Text-to-speech at the time already knew how to make words flow into each other, but I thought the voice from The Incredibles sounded much nicer than the contemporaneous text-to-speech bots, while also still sounding robotic.

[1] https://www.youtube.com/watch?v=_dxV4BvyV2w


Like adding the 'propane smell' to propane.


That doesn't sound like ring modulation in a musical sense (IIRC it has a modulator above 30 Hz, or inverts the signal instead of attenuating?), so much as crackling, cutting in and out, or an overdone tremolo effect. I checked in Audacity and the signal only gets cut out, not inverted.
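
For what it's worth, the "inverts instead of attenuating" distinction can be sketched in a few lines of JavaScript. This is only an illustration under simple assumptions (a sine carrier, a sine LFO, made-up function names and parameters), not a description of how any particular effect in the video was produced.

```javascript
// Ring modulation: multiply the input by a bipolar carrier (swings between -1 and +1),
// so the output's polarity flips whenever the carrier is negative.
function ringModulate(samples, carrierHz, sampleRate) {
  return samples.map((x, n) => x * Math.sin(2 * Math.PI * carrierHz * n / sampleRate));
}

// Tremolo: multiply by a unipolar gain in [1 - depth, 1], so the input is only attenuated.
function tremolo(samples, rateHz, sampleRate, depth = 1) {
  return samples.map((x, n) => {
    const lfo = 0.5 * (1 + Math.sin(2 * Math.PI * rateHz * n / sampleRate)); // 0..1
    return x * (1 - depth + depth * lfo);
  });
}

// A constant positive input makes the difference easy to see.
const dc = new Array(8).fill(1);
const ring = ringModulate(dc, 1, 8); // one carrier cycle over 8 samples
const trem = tremolo(dc, 1, 8);
console.log(ring.some(y => y < 0)); // true: the signal gets inverted
console.log(trem.some(y => y < 0)); // false: the signal only gets quieter
```

A signal that only cuts in and out without ever changing sign, as seen in Audacity, matches the tremolo case rather than the ring-modulation one.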



My favorite writing school haha


“Reverse brain drain”, so to say.

Well, good for Europe. Like it was good for the USA when (Jewish) scientists emigrated there in the 1930s–40s.


There is a certain irony in that, considering op. paperclip two decades later.


Ironic indeed.

For ref: https://en.wikipedia.org/wiki/Operation_Paperclip (I didn’t get it at first)

Operation Paperclip was a secret United States intelligence program in which more than 1,600 German scientists, engineers, and technicians were taken from former Nazi Germany to the US for government employment after the end of World War II in Europe, between 1945 and 1959; several were confirmed to be former members of the Nazi Party, including the SS or the SA.


The “Lost Decades” (1990–2010), during which Japan’s GDP hardly grew and deflation occurred: https://en.wikipedia.org/wiki/Lost_Decades


Prior to that they had an 'economic miracle' from WW2 till then which is probably more what people refer to. The lost decades are quite easy to understand with conventional economics - they had a property and stock bubble which got everyone left with huge debts when it collapsed.


You have a point, for sure; the _whole_ previous century (couple of centuries, even) of Japan’s economic history is one-of-a-kind (as is Argentina’s).

However, wrt. “lost decades are quite easy to understand with conventional economics”: that’s more debatable. For example, stagflation is common(ly explained), but stag-deflation (as in Japan) is more unusual and has weirder effects. The US subprime crisis was also a real estate bubble (which, here also, rippled to the financial markets), but its burst had quite different fallout.

Anyway, thanks for adding to the discussion, and disclaimer: IANAE (I Am Not An Economist, thank God ^^)


> I'm positively surprised by some deputies remarks.

Not personally surprised, but I agree, they talk sense and understand the matter quite well. Thanks for finding and sharing the debate video, btw!

> Retailleau is either totally incompetent technically, with his advisors as well, or is a liar.

Sadly, gotta agree here too. M. Retailleau seems off the mark. He looks (to me) like he is speaking in good faith though, and “just” doesn’t get it.

Nevertheless, he has a point when he mentions Apple and CSAM (Child Sexual Abuse Material). Apple has shown that using homomorphic encryption could be 1) actually practical, and 2) helpful on such matters. Cf. https://www.apple.com/child-safety/pdf/CSAM_Detection_Techni... & https://machinelearning.apple.com/research/homomorphic-encry... & https://repositorio.fgv.br/items/047aca31-ccdc-45bd-a7d3-6c0....

