Hello all, I’m still a bit new to proxies and I have some questions about a few terms I came across while researching them. I don’t understand what ‘ports’, ‘single thread mode’, and ‘multithreaded mode’ mean, or how they relate to proxies. Any help will be appreciated.
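Ports are essentially the “gates” that data travels through on a network. One computer can run many services at once, and the port number tells it which service a connection is meant for. That’s why a proxy is usually given as an address plus a port, something like 203.0.113.10:8080. The address gets you to the proxy server, and the port (8080 here) gets you to the proxy service running on it. You normally don’t make the port up yourself, you use whatever port your proxy provider lists.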
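If it helps to see it, here’s a rough Python sketch of pulling the address and the port out of a proxy string. The proxy address is a made-up placeholder and split_proxy is just a name I picked for the example.

```python
from urllib.parse import urlsplit

def split_proxy(proxy_url):
    """Return the (host, port) pair from a proxy URL such as http://host:port."""
    parts = urlsplit(proxy_url)
    return parts.hostname, parts.port

# Placeholder proxy from a documentation-only IP range, not a real server.
host, port = split_proxy("http://203.0.113.10:8080")
print("host:", host)
print("port:", port)
```

The host says which machine to talk to, and the port says which “gate” on that machine the proxy is listening on.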
Single thread and multithread don’t really relate to proxies themselves. They relate more to the programs that use proxies. Let me stick to an analogy: picture a restaurant kitchen. Single threaded is one chef cooking every order by themselves, one at a time, while multithreaded is more like a team of chefs working on different orders at the same time.
To put that in more technical terms, let’s say you’re scraping a thousand pages. A single-threaded scraper will scrape the pages one by one. A multithreaded scraper can work on several pages at the same time, one per thread, so with 10 threads it can be scraping up to 10 pages at once.
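Here’s a small Python sketch of the difference, in case code makes it clearer. It’s only an illustration: fetch_page is a made-up stand-in that sleeps instead of doing a real request, and the URLs are placeholders.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_page(url):
    # Stand-in for a real HTTP request: sleep to simulate waiting on the network.
    time.sleep(0.05)
    return f"content of {url}"

def scrape_single(urls):
    # One "chef": pages are fetched one after another.
    return [fetch_page(url) for url in urls]

def scrape_multi(urls, threads=5):
    # A "team of chefs": up to `threads` pages are fetched at the same time.
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(fetch_page, urls))

urls = [f"https://example.com/page/{i}" for i in range(10)]

start = time.perf_counter()
single_results = scrape_single(urls)
print(f"single thread: {time.perf_counter() - start:.2f}s")

start = time.perf_counter()
multi_results = scrape_multi(urls, threads=5)
print(f"5 threads:     {time.perf_counter() - start:.2f}s")
```

Both versions return the same pages in the same order. The multithreaded one should usually finish sooner because the waiting overlaps.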
Oh okay. Yeah I get you. So like… how exactly does that work when it comes to proxies then? I’m still a bit lost.
If you go multithreaded, the website you’re scraping will get suspicious if one proxy (one IP) is requesting hundreds or thousands of pages at the same time. So use either a rotating proxy or several different proxies to spread the requests out and lower the chance of getting detected. Keep in mind that threads are a resource, so how many you can run really depends on what your computer can handle. But if you’re only scraping a handful of pages, say 5 or 10, single threaded is fine. Hope this helps!
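A rough sketch of the “several different proxies” idea in Python, assuming you already have a list of proxies. Everything here is a placeholder: the proxy addresses are made up, and fetch_via_proxy only returns which proxy a page would have gone through instead of sending a real request.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle

# Placeholder proxies from a documentation-only IP range, not real servers.
PROXIES = [
    "http://203.0.113.10:8080",
    "http://203.0.113.11:8080",
    "http://203.0.113.12:8080",
]

def fetch_via_proxy(job):
    url, proxy = job
    # A real scraper would send the request for `url` through `proxy` here.
    return url, proxy

def scrape_with_rotation(urls, proxies, threads=3):
    # Pair each URL with the next proxy in the list, looping back to the start,
    # so no single proxy handles every request.
    jobs = list(zip(urls, cycle(proxies)))
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(fetch_via_proxy, jobs))

urls = [f"https://example.com/page/{i}" for i in range(6)]
results = scrape_with_rotation(urls, PROXIES)
for url, proxy in results:
    print(url, "via", proxy)
```

With 6 pages and 3 proxies, each proxy ends up with 2 pages. A paid rotating proxy does this kind of switching for you behind a single address.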