Hacker News | seleniumbase's comments

Looks like that one hasn't received any major updates in a while.

The secret ingredient is to use Playwright's `connect_over_cdp()` method to connect to an existing browser that is already stealthy. Playwright can then perform its usual actions without being detected by anti-bot services. Example setup: `browser = playwright.chromium.connect_over_cdp("http://localhost:9222")`
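A minimal sketch of that setup, assuming a stealthy browser is already running with remote debugging on port 9222 (the port from the example above). The helper names `cdp_endpoint` and `open_page_over_cdp` are made up for illustration; `connect_over_cdp()` itself is part of Playwright's Python API.

```python
def cdp_endpoint(host: str = "localhost", port: int = 9222) -> str:
    """Build the HTTP endpoint of a browser started with remote debugging."""
    return f"http://{host}:{port}"


def open_page_over_cdp(url: str, endpoint: str) -> str:
    """Attach Playwright to an already-running browser and return the page title."""
    # Imported lazily so cdp_endpoint() works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as playwright:
        browser = playwright.chromium.connect_over_cdp(endpoint)
        # Reuse the existing context and tab if there is one (an assumption:
        # the stealthy browser was launched with at least one window open).
        context = browser.contexts[0] if browser.contexts else browser.new_context()
        page = context.pages[0] if context.pages else context.new_page()
        page.goto(url)
        return page.title()
```

With the browser up, `open_page_over_cdp("https://example.com", cdp_endpoint())` would drive the existing browser rather than launching a new one.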

Oh, the irony of using SeleniumBase to make Playwright stealthy!

Mix in some Python, PyAutoGUI, SeleniumBase, ThreadPoolExecutor, and then you get some serious multi-threaded CAPTCHA-bypass.
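The threading part could look roughly like this. It's only a sketch: `fake_task` is a hypothetical stand-in, and a real task would open its own browser and do the page work there.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_threads(task, urls, max_workers=3):
    """Run task(url) for every URL in a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(task, urls))


def fake_task(url):
    # Hypothetical placeholder: a real task would launch a browser,
    # load the URL, and return whatever it scraped.
    return f"visited {url}"
```

Calling `run_in_threads(fake_task, urls)` hands each URL to a worker thread and collects the return values in the same order as the input list.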


You can set `browser="firefox"`, but there's no stealth mode for it.


Here's an example that bypasses Kasada on the Hyatt website: https://github.com/seleniumbase/SeleniumBase/blob/master/exa...


The biggest issue with moving from a home machine to a server is that you may lose your residential IP address, which you'll want in order to keep automation from being blocked outright. Hence the popularity of residential proxies. However, some servers live in a residential IP space, which makes them well suited to running web automation. As was partially covered in https://www.youtube.com/watch?v=Mr90iQmNsKM, GitHub Actions appears to live in a residential IP space, which makes it a good server choice for web automation.


IP is definitely not the biggest issue in my experience, as proxies are required at scale regardless, unless you get into more theoretical areas like p0f.

The biggest issues are the ones that aren't obvious or easily tested for: missing a particular font, running an unusual graphics driver that produces an unrecognized hash for certain fingerprinting methods, or lacking certain APIs that require browser patches. These aspects also differ between anti-bot vendors and the data sets they have.

The reason they can be hard to test for is that everything is based on a trust score, which is potentially influenced by anything from website load to things tied to your personal session and for some vendors optionally even input data.


The "Python vs Java" debate is probably one for a different Hacker News post. :)


I meant that some of the code reminds me of enterprise Python. The kicker is that code that works > pretty code. People here act as if ugly code is somehow lesser just because it’s ugly. Meanwhile there’s a lot of ugly code making millions of dollars.

Didn’t mean to bash your project. Sorry if it came across that way.


It's OK. No offense was taken. It almost looked like the conversation was expanding into a "Python vs Java" debate, but (thankfully) it did not. I've seen both worlds. I've seen advantages to both. I decided to stay in the Python world.


Same. Although enterprise Python is akin to wrestling a boa constrictor.


SeleniumBase CDP Mode uses `DOM.scrollIntoViewIfNeeded` (https://chromedevtools.github.io/devtools-protocol/tot/DOM/#...), so it only scrolls when elements are offscreen, rather than always scrolling. This reduces the number of scrolls needed. Also, it seems that most anti-bot services are not looking at scrolling as a way of identifying users.
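For anyone curious what that call looks like on the wire, here is a rough sketch of the raw CDP message. The function name `scroll_into_view_message` is made up, and this is not necessarily how SeleniumBase builds the command internally; it just shows the standard id/method/params shape of a CDP request using the `nodeId` parameter.

```python
import json


def scroll_into_view_message(message_id: int, node_id: int) -> str:
    """Serialize a DOM.scrollIntoViewIfNeeded command for a CDP WebSocket."""
    return json.dumps({
        "id": message_id,  # lets the caller match the response to this request
        "method": "DOM.scrollIntoViewIfNeeded",
        "params": {"nodeId": node_id},  # node to bring into view if offscreen
    })
```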


Patching chromedriver is a lot easier than patching the browser. Plus, if you're just using a regular Chrome browser for the automation, then there's nothing to patch. Automated CDP calls aren't detectable if they don't leave any trace of automation activity. However, since Google created CDP, they might be able to detect automated CDP in ways that other services cannot.


What about faking mouse movement from inside the browser? PyAutoGUI doesn't seem like the right tool for this, since the page's JavaScript has no way of inspecting the user's OS-level GUI interactions.

And it seems like it would be important to try to adopt user-like mouse movement, since JavaScript has access to this information.


PyAutoGUI is the optimal tool for clicking things inside of closed shadow-root elements, which are hidden from JavaScript. You can use CDP for clicking other elements.
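A rough sketch of the PyAutoGUI side, under some loud assumptions: you already know the element's viewport rectangle, the browser window's screen position, and the height of the browser toolbar above the page, and the display scale factor is 1. The helpers `screen_point` and `gui_click` are hypothetical names, not SeleniumBase functions; `pyautogui.click(x, y)` is the real PyAutoGUI call.

```python
def screen_point(window_x, window_y, toolbar_height, rect):
    """Return screen coordinates for the center of a viewport-relative rect.

    rect is a dict with "x", "y", "width", and "height" keys.
    Assumes a display scale factor of 1.
    """
    x = window_x + rect["x"] + rect["width"] / 2
    y = window_y + toolbar_height + rect["y"] + rect["height"] / 2
    return round(x), round(y)


def gui_click(x, y):
    """Click at screen coordinates with a real OS-level mouse event."""
    # Imported lazily so screen_point() works without PyAutoGUI installed.
    import pyautogui

    pyautogui.click(x, y)
```

Because the click happens at the operating-system level, it does not depend on the page's JavaScript being able to reach the element.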


There are live demos on YouTube of SeleniumBase bypassing CAPTCHAs (if you want to see it first): https://www.youtube.com/watch?v=Mr90iQmNsKM

