Valery Vasiliev | 11/26/2019
Web scraping is the process of searching websites for data of interest and extracting it on behalf of whoever organized the collection.
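To make the extraction step concrete, here is a minimal sketch using only Python's standard library. The HTML snippet and the `name`/`price` class names are invented for illustration; a real scraper would first download the page and match the target site's actual markup.

```python
from html.parser import HTMLParser

# Invented sample markup standing in for a downloaded product-listing page.
SAMPLE_HTML = """
<div class="product"><span class="name">Drill</span><span class="price">1990</span></div>
<div class="product"><span class="name">Hammer</span><span class="price">450</span></div>
"""

class ProductParser(HTMLParser):
    """Collects (name, price) pairs from spans with known class names."""

    def __init__(self):
        super().__init__()
        self._field = None    # which field the current <span> holds, if any
        self._current = {}    # fields gathered for the product being read
        self.products = []    # accumulated (name, price) tuples

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            cls = dict(attrs).get("class")
            if cls in ("name", "price"):
                self._field = cls

    def handle_data(self, data):
        if self._field:
            self._current[self._field] = data.strip()
            self._field = None
            if "name" in self._current and "price" in self._current:
                self.products.append(
                    (self._current["name"], self._current["price"])
                )
                self._current = {}

parser = ProductParser()
parser.feed(SAMPLE_HTML)
print(parser.products)  # [('Drill', '1990'), ('Hammer', '450')]
```

In practice teams reach for dedicated libraries rather than hand-rolled parsers, but the principle is the same: locate the markup that carries the product attributes and pull out the text.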
Russian company Xmldatafeed parses data mainly in the field of e-commerce, then processes and sells it. Of interest are product names, their consumer characteristics, descriptions, prices, related products, availability on shelves... Most often, such data is ordered by competitors, although the results of parsing can be useful to the owners of the sites subjected to parsing. Xmldatafeed processes about 600 large sites daily, including Beru!, Ozon, Avito, Leroy Merlin, Eldorado, 220 Volt, etc.
According to Xmldatafeed commercial director Maxim Kulgin, it takes 6-7 seconds to collect data on one item. As a result, a regional online store (Leroy Merlin, for example) can be parsed in about a day. Technically, a company could run several collection sessions per day, but that would already look like a DDoS attack, which professional teams (unlike amateurs) do not allow themselves.
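The pacing described above can be sketched as a simple politeness delay between requests. This is an illustrative assumption, not Xmldatafeed's actual code: `fetch_item` is a hypothetical stand-in for the real download logic, and 6.5 seconds is just the midpoint of the quoted 6-7 second range.

```python
import time

DELAY_SECONDS = 6.5  # assumed per-item pace, midpoint of the quoted 6-7 s

def fetch_item(item_id):
    """Hypothetical stand-in for an actual HTTP request and parse."""
    return f"data for {item_id}"

def crawl(item_ids, delay=DELAY_SECONDS, sleep=time.sleep):
    """Fetch items one by one, pausing between requests.

    At ~6.5 s per item, roughly 13,000 items fit in 24 hours,
    which is why one full pass over a store takes about a day.
    """
    results = []
    for i, item_id in enumerate(item_ids):
        results.append(fetch_item(item_id))
        if i < len(item_ids) - 1:
            sleep(delay)  # politeness delay keeps load far below DDoS levels
    return results
```

The `sleep` parameter is injected so the pacing policy can be swapped or disabled in tests; a production crawler would typically add jitter and per-host limits on top of a fixed delay.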
As explained by Qrator Labs CTO Artem Gavrichenkov, dozens of teams of varying size and skill can be parsing the websites of companies like those mentioned above at the same time. Qualified teams try not to harm the source they are processing during collection. However, some website owners, whose sites are parsed so aggressively that the process comes to resemble a DDoS attack, deploy anti-parsing protection and turn to professionals to verify its quality.
According to Maxim Kulgin, it is impossible to protect a site from parsing completely: professionals, he claims, can collect data even from protected sites. However, anti-parsing tools (DDoS protection tools also serve this purpose) can greatly complicate the job, especially for unskilled amateurs.