Our team is working on a unique service that combines the capabilities of the Web Archive system (archive.org) and a search engine.
Our experience building the Archivarix site recovery service allowed us to take on something bigger.
We classify and index all retrieved data in order to make it convenient to search.
The data is not deleted and is stored in a convenient format for further processing.
Saved sites are technically static. Tools such as Archivarix CMS allow you to see and edit them as a single site, add a dynamic part, combine data from different sites and do the necessary optimization without having technical knowledge.
Since launching the Archivarix site restoration project in 2017, we have also been collecting live site data in parallel.
We have been collecting historical site metrics and domain information since 2009, and we update them every day.
The site content that we process for full-text search and content classification dates back to 1996.
Our database contains historical data for more than 350 million domains.
The number of Spider and Archivarix processing servers involved already exceeds 50.
Our servers download over 100GB of website content from the Internet every day.
Every day we collect and analyze about 50GB of metrics data for domains and sites from various sources. Some of them are listed below.
Launched its own backlink index in 2010; today its spiders crawl up to 8 billion pages per day.
Alexa Internet has been collecting website traffic statistics, global rankings and other information since 1996. In 1999, Amazon bought the service.
Founded by Brewster Kahle, who also founded Alexa Internet. It has kept copies of web pages since 1996 and archives material in various formats for free public access.
A company that develops and sells network equipment. It also provides useful security data derived from statistics gathered by its equipment.
An independent international organization that regulates domain names, IP addresses and other important aspects of the Internet.
Originally called MajesticSEO, the service has provided many useful tools for webmasters since 2008.
The service (originally called SEOmoz) was founded in 2004 as a blog and online community about search engine optimization. It now provides many useful tools for webmasters.
An American company that operates two of the thirteen DNS root servers and manages the registries for two of the most important top-level domains on the Internet, .com and .net.