Hacker Newsnew | past | comments | ask | show | jobs | submit | rapier1's commentslogin

It's still possible, but we only suggest doing it on private, known-secure networks or when it's data you don't care about. Authentication is still fully encrypted - we just rekey post-authentication with a null cipher.


If you want to see the impact that the flow control buffer size has on OpenSSH I put up a graph based on data collected last week. Basically, it has a huge impact on throughput.

https://gist.github.com/rapier1/325de17bbb85f1ce663ccb866ce2...


Keep in mind that SCP/SSH might be faster in some cases than SFTP, but in both cases it is still limited to a 2MB application layer receive window, which is drastically undersized in a lot of situations. It doesn't matter what the TCP window is set to because the OpenSSH window overrides that value. Basically, if your bandwidth-delay product is more than 2MB (e.g. 1Gbps @ 17ms RTT) you're going to be application-limited by OpenSSH. HPN-SSH gets most of the performance benefit by normalizing the application layer receive window to the TCP receive window (up to 128MB). In some cases you'll see 100x throughput improvement on well-tuned hosts on a high delay path.

If your BDP is less than 2MB you still might get some benefit if you are CPU limited and use the parallel ciphers. However, the fastest cipher is AES-GCM and we haven't parallelized that as of yet (that's next on the list).
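The 2MB threshold above is easy to sanity-check with a quick bandwidth-delay product calculation. The helper function below is purely illustrative; the numbers are the ones from the comment:

```python
# Back-of-the-envelope check: bandwidth-delay product = link rate x RTT.
# If the BDP exceeds the 2MB OpenSSH channel window, the transfer is
# application-limited rather than network-limited.
def bdp_bytes(gbps: float, rtt_ms: float) -> float:
    """Bytes in flight needed to keep the pipe full."""
    return gbps * 1e9 / 8 * (rtt_ms / 1e3)

bdp = bdp_bytes(1.0, 17.0)   # 1 Gbps at 17 ms RTT
print(bdp / 2**20)           # just over 2 MiB: right at the OpenSSH cap
```

Anything faster or farther than that (10Gbps, transcontinental RTTs) blows well past the stock window.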


When I need speed, I drop down to FTP/rcp or some other cleartext protocol.

Moving a terabyte database during an upgrade, I have connected three ports directly (no switch), then used xargs to keep all three connections busy transferring the 2GB data files. I can get the transfer done in under an hour this way.

I don't currently have a performance need for an encrypted transfer, but one may arise.
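The xargs pattern described above can be sketched roughly as follows. All paths and file names here are made up, and local cp stands in for rcp/scp over the direct links; -P 3 is what keeps three transfers in flight at once:

```shell
# Illustrative sketch only: fan a list of data files out across three
# parallel workers. With rcp in place of cp, each worker would keep one
# of the three direct links busy.
mkdir -p /tmp/xfer/src /tmp/xfer/dst
cd /tmp/xfer/src
for f in db1.dat db2.dat db3.dat db4.dat db5.dat db6.dat; do
    head -c 1024 /dev/zero > "$f"   # stand-in for a 2GB data file
done
# -n 1: one file per invocation; -P 3: three transfers at a time
ls *.dat | xargs -n 1 -P 3 -I {} cp {} /tmp/xfer/dst/
```

In the real setup each destination would be a different link-local address so the three NICs share the load.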


I fully understand that. We're using this, along with parsyncfp2 (which you should check out), to move 1.5PB of data a month across a 40Gb link. Not saying that HPN-SSH is only useful in that context but different people certainly do have different needs.


If encryption were absolutely required, I might try it over s_client/s_server or two stunnels, one in client=yes mode.

I'm assuming that would have different limits than you outlined above, although I don't think they do multi-core.


To be honest, there was a period of time around 2010 to 2012 where I simply wasn't maintaining it as well as I should have been. I wouldn't have upstreamed it then either. That's changed a lot since then.

As an aside - you only really need HPN-SSH on the receiving side of the bulk data transfer to get the buffer normalization performance benefits. It turns out the bottleneck is almost entirely on the receiver, and the client will send out data as quickly as you like. At least it was like that until OpenSSH 8.8. At that point changes were made where the client would crash if the send buffer exceeded 16MB. So we had to limit OpenSSH-to-HPN-SSH flows to a maximum of 16MB of receive space. Which is annoying, but that's still going to be a win for a lot of users.


We've been looking at using QUIC as the transport layer in HPN-SSH. It's more of a pain than you might think because it breaks the SSH authentication paradigm and requires QUIC layer encryption - so a naive implementation would end up encrypting the data twice. I don't want to do that. Mostly what we are thinking about doing is changing the channel multiplexing for bulk data transfers in order to avoid the overhead and buffer issues. If we can rely entirely on TCP for that then we should get even better performance.


Yeah, my naive implementation thought experiment was oriented towards a side channel brokered by the ssh connection using nginx and curl. Something like source opens nginx to share a file and tells sink via ssh to curl the file from source with a particular cert.

However, I observed that curl [0] uses OpenSSL's QUIC implementation (for one of its experimental backends). Another backend for curl is Quiche [1], which has client and server components already, has the userspace crypto, etc. It's a little confusing to me, but Cloudflare also has a project called quiche [2], which is a Rust crate with a CLI to share and consume files.

0. https://curl.se/docs/http3.html

1. https://github.com/google/quiche/tree/main/quiche/quic

2. https://github.com/cloudflare/quiche


Yeah, it's an issue because there is also the per channel application layer flow control. So when you are using SFTP you have the TCP flow control, the SSH layer flow control, and then the SFTP flow control. The maximum receive buffer ends up being the minimum of all three. HPN-SSH (I'm the dev) normalizes the SSH layer flow control to the TCP receive buffer but we haven't done enough work on SFTP except to bump up the buffer size/outstanding requests. I need to determine if this is effective enough or if I need some dynamism in there as well.
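The stacked-windows point can be made concrete with a toy calculation. The window sizes below are illustrative assumptions (only the 2MB SSH channel window is a figure from this thread), but the min() relationship is the point:

```python
# Illustrative only: with SFTP over SSH over TCP, each layer imposes its
# own receive window, and the smallest one governs throughput.
TCP_WINDOW  = 64 * 1024 * 1024   # example: autotuned kernel buffer
SSH_WINDOW  = 2 * 1024 * 1024    # stock OpenSSH channel window
SFTP_WINDOW = 64 * 32768         # example: outstanding requests x request size

effective = min(TCP_WINDOW, SSH_WINDOW, SFTP_WINDOW)
print(effective)  # a well-tuned TCP stack can't help if an upper layer is small
```

This is why normalizing only the SSH layer to the TCP buffer isn't the whole story - the SFTP request pipeline can still end up as the new minimum.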


More than 2 decades at this point. The primary reasons are that the full patch set would be a burden for them to integrate and that they don't prioritize performance for bulk data transfers. Which is perfectly understandable from their perspective. HPN-SSH builds on the expertise of OpenSSH and we follow their work closely - when they make a new release we incorporate it and follow with our own release inside of a week or two (depending on how long the code review and functionality/regression testing takes). We focus on throughput performance, which involves receive buffer normalization, private key cipher speed, code optimization, and so forth. We tend to steer clear of anything involving authentication and we never roll our own when it comes to the ciphers.


Rsync commonly uses SSH as the transport layer so it won't necessarily be any faster than SFTP unless you are using the rsync daemon (usually on port 873). However, the rsync daemon won't provide any encryption and I can't suggest using it unless it's on a private network.


I'm the lead developer. I can go into this a bit more when I get from an appointment if people are interested.


I’m interested. Mainly to update the documentation on it for Gentoo; people have asked about it over the years. Also, TIL that HN apparently has a sort of account dormancy status, which it appears you are in.


For Gentoo I should put you in touch with my co-developer. He's active in Gentoo and has been maintaining a port for it. I'll point him at this conversation. That said, documentation wise, the HPN-README goes into a lot of detail about the HPN-SSH specific changes. I should point out that while HPN-SSH is a fork we follow OpenSSH. Whenever they come out with a new release we come out with one that incorporates their changes - usually we get this out in about a week.


The parallel ciphers are built using OpenSSL primitives. We aren't reimplementing the cipher itself in any way. Since counter ciphers use a monotonically increasing counter you can precompute the blocks in advance. Which is what we do - we have a cache of keystream data that is precomputed and we pull the correct block off as needed. This gets around the need to have the application compute the blocks serially, which can be a bottleneck at higher throughput rates.
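A rough sketch of the idea (not HPN-SSH's actual code): in a counter-mode cipher each keystream block depends only on the key and its counter value, so blocks are independent and can be computed ahead of time in parallel. SHA-256 stands in for the AES block function here purely to keep the example stdlib-only:

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

BLOCK = 32  # SHA-256 digest size; AES would use 16-byte blocks

def keystream_block(key: bytes, counter: int) -> bytes:
    # Each block is a pure function of (key, counter), so order of
    # computation doesn't matter - this is what makes precomputation work.
    return hashlib.sha256(key + counter.to_bytes(16, "big")).digest()

def encrypt_ctr(key: bytes, data: bytes, workers: int = 4) -> bytes:
    nblocks = (len(data) + BLOCK - 1) // BLOCK
    # Precompute the keystream cache in parallel, then XOR with the data.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        stream = b"".join(pool.map(lambda i: keystream_block(key, i),
                                   range(nblocks)))
    return bytes(a ^ b for a, b in zip(data, stream))

msg = b"bulk data over a long fat pipe"
ct = encrypt_ctr(b"example key", msg)
assert encrypt_ctr(b"example key", ct) == msg  # CTR is its own inverse
```

The real implementation keeps a rolling cache of precomputed AES-CTR blocks so the main transfer thread never waits on keystream generation.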

The main performance improvement is from the buffer normalization. This can provide, on the right path, a 100x improvement in throughput performance without any compromise in security.

