SEO Hostings and the significance of IP addresses in PBNs by Dominik Wojcik
We’d like to present the profile of another of the speakers at SearchMarketingDay.com. Dominik is one of the best-known German SEOs as well as the author of the widely read blog boeserseo.com. He has also been a speaker at countless German conferences. In May, he gave a presentation in his parents’ homeland, Poland, during SearchMarketingDay.com. We asked Dominik a few questions to give you a taste of what his presentation was about. We are also publishing a translation of Dominik’s article in which he explains the significance of IP numbers to the SEO industry in an understandable, easy-to-read manner.
Dominik Wojcik – interview
Maciej Janas: Your name and surname sound so Polish it’s hard to believe you don’t speak Polish. Surely you have some connections with Poland, though. What are they? Where do you live and work?
Dominik Wojcik: I actually speak a little bit of Polish, but not much at all. I was born and raised in Germany. I never attended a Polish school, but my parents are from Poland. I understand almost everything, but I have a very hard time speaking it. I read in Polish like a 4-year-old, and I can’t write at all 😉
Maciej Janas: In the German SEO industry, you are known as BoeserSEO, which basically means “diabolical positioner”. What is it that you do that’s so evil?
Dominik Wojcik: First, they called me GreyHatSeo. I have a lot of technical knowledge and a bit of experience in hacking and cracking, so at the beginning I used a lot of black hat techniques. So basically, I spammed Google a lot. I am also a big fan of Darth Vader, and that’s why I chose the nickname boeserseo.
Maciej Janas: Dominik, how well do you sleep at night? Don’t you feel guilty towards Matt Cutts after a day spent using black hat techniques? Is it even possible to effectively position competitive industry websites using only the techniques Matt allows?
Dominik Wojcik: I sleep very well, especially because I haven’t really been concentrating on black hat methods for a few years. Now, I mainly just watch what the real black hats do. I think that a combination of active link acquisition and organic growth is the future, and that those two things should be enough to rank well in the search results in competitive industries. However, in extremely competitive industries, I think it would be risky to rely on white hat methods alone. They need to be helped along a little if you want to achieve a high position in Google.
Maciej Janas: I’m kind of kidding around asking if you feel guilty towards Matt Cutts, but there is a more serious issue at hand as well: doesn’t black hat ruin the internet? We don’t have to care about Google Inc.’s profits, but we should probably care about all the internet users out there, shouldn’t we? Shouldn’t we use black hat techniques only temporarily, so as not to utterly fall behind the competition, but simultaneously cheer Google on as it fights to make the internet better and more useful?
Dominik Wojcik: Not every black hat technique is bad. Take a look at the IP Cloaking delivery technique, for example. It’s very popular in display marketing. One way or another, positioning is a game, and as long as it remains profitable, there will always be someone looking to cheat Google.
Maciej Janas: You like to use buying expired domains as an SEO technique. Why is this so effective? Can’t Google simply notice that the domain has been taken over and revoke all of the advantages a domain history offers?
Dominik Wojcik: Yes, Google is smart enough to detect expired domains, but you can still gain from skillfully “reviving” such domains. However, I would prefer to focus on acquiring good quality links. If there is little time to spare, it is more effective to acquire natural-looking links than to play around with domains like that.
Maciej Janas: What can people learn from your SearchMarketingDay.com presentation?
Dominik Wojcik: There is a lot of information about the most recent Penguin Update in my presentation, and I basically explain what has changed in the algorithm. I also talk about a great little black hat trick I know 😉
Article: “The IP Issue”
Some of the words you hear the most in the SEO industry are IP, domain, host and class. This has led me to conclude that most people simply have no idea what they’re talking about. Most of them don’t even know what an IP address is exactly, but they love to sound all smart talking about the decreasing value of PBNs, which just makes me sick. Of course, it is easy for me to say, seeing as how I was already managing clusters and server loads in a very big data center ten years ago. Despite this, I firmly believe that understanding how the internet is built and how this enormous web even works is absolutely necessary to understanding SEO. And all of this is connected with IP addresses.
What is an IP address?
In general, IP is an abbreviation for Internet Protocol. Thanks to IP, all devices that are capable of communication connect to make a web.
This can happen locally via intranet (at home/at work), during so-called LAN parties, or globally, via the internet. An IP address is therefore a network address via which we are available and that allows us to communicate in this network. We can therefore state that it is the basis of every computer network without which it would be impossible for the internet to exist as we know it.
IPv4 is the best-known type of address. In short, it is made up of 4 numbers separated by periods, each ranging from 0 to 255. If you take a closer look, you will notice that an IP address is a 32-bit binary number.
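To make the "4 numbers, 32 bits" point concrete, here is a minimal sketch using Python's standard ipaddress module. The helper name to_binary is my own, and the sample address is simply the local address used later in this article.

```python
import ipaddress

def to_binary(dotted: str) -> str:
    """Return the 32-bit binary form of a dotted IPv4 address, grouped by octet."""
    value = int(ipaddress.IPv4Address(dotted))  # the address as one 32-bit integer
    bits = format(value, "032b")                # zero-padded to 32 binary digits
    return ".".join(bits[i:i + 8] for i in range(0, 32, 8))

print(to_binary("192.168.1.90"))  # 11000000.10101000.00000001.01011010
```

Each of the four numbers is just one 8-bit slice of the same 32-bit value, which is why no number can exceed 255.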
This means that every time you surf the Web (reading this article, for instance) your internet provider (TP/Orange in Poland, Vodafone or Telekom in Germany, etc.) assigns you a specified IP address. If your connection was made via DSL modem, that means you were assigned a direct IP address. If on the other hand your connection was made via router, the internet provider assigns an appropriate IP address to your router, which then manages all the devices connected to it. Usually this is done using DHCP (Dynamic Host Configuration Protocol). What this really means is that the router assigns you a dynamic IP address. This address is usually in the following format: 192.168.x.x. Anyone who wants to find out what their current IP address is should use the Windows command line that can, in Windows XP, be opened by clicking Start and entering the command cmd.
A good old DOS console should come up then 😉 Nice… Here, we enter the command ipconfig and confirm by pressing enter. Within a few moments, we get a look at the entire network interface card along with its very own local IP address. My local (!) IP address is 192.168.1.90. This is a typical IP address, one which was assigned to me by my router and that makes me available only locally. If I were connected directly via DSL modem, I would have an IP address assigned to me by my internet provider instead, meaning I would be connected directly to the internet.
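Whether an address is one of those local-only addresses can also be checked programmatically. The following is a small sketch with Python's ipaddress module. The function name is_local_only is hypothetical, and 8.8.8.8 is merely a public address I picked for contrast.

```python
import ipaddress

def is_local_only(address: str) -> bool:
    """True if the address lies in a private range such as 192.168.x.x."""
    return ipaddress.ip_address(address).is_private

print(is_local_only("192.168.1.90"))  # True: assigned by the router, not routable on the internet
print(is_local_only("8.8.8.8"))       # False: a publicly routable address
```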
If I wanted to find out what IP address my internet provider assigned me, I could check this in the router management panel or use one of those “what’s my IP” internet sites. This address usually looks different. For example, mine is:
I am the only person in the world using that IP address. Theoretically, I could set up a server on this IP address and host domains. However, things get problematic when the internet provider assigns a new IP address every 24 hours. This would make it necessary to update the DNSes every 24 hours, which is of course extremely impractical. There are some services that pretty much solve this problem, though—Dyndns is a good example. Getting back to the topic at hand…
The hoster also has IP addresses on which it sets up its servers; these are virtually always permanent IP addresses that change very rarely. So how does that look in practice?
The simplest setup is one web server connected directly to the web. Smaller hosters are usually connected this way.
Big hosters (e.g. the German Strato) have load balancers to balance the server load, which makes it possible for them to manage many domains and a lot of traffic. CDNs (Content Delivery Networks) work in a similar fashion, just a little bit smarter: they add DNS-based location and routing to the nearest data center. The connection between the load balancers ensures that even if one of them should stop working for any reason, another would automatically take over the IP address, thus preventing total failure. In practice, the load balancers monitor each other.
I should definitely add that many domains can be hosted on one IP address. In Bing, we have a great command: “IP:”, thanks to which you can check how many domains/web pages are attached to the same IP address.
So how does this look with a Googlebot? It functions the same way as every computer that is connected to the internet. Below I’ve presented a very simplified diagram. Generally, this is much more complicated, but it works more or less according to the following rule:
DNSes are actually the heart of the internet, and the world wide web would not exist as we know it today without them.
A/B/C address classes
There are a lot of misconceptions about this topic. What most people now know to be class C used to in fact be a subnetwork of A/B. Is that clear? Not really, it seems…
Alright then. So what does the concept of a class C network mean to us?
It is generally believed that the first 3 numbers of an IP address are class C. Only if two addresses differ somewhere in those first 3 numbers do they count as coming from different class C networks. So mathematically, there are 16,777,216 (256 × 256 × 256) possible class C networks, not 256 😉 At least that’s how we currently understand class C. There are however a few older books that pay tribute instead to the old IP class theory.
Previously, there were 5 IP address classes:
1. class A [0.0.0.0 to 127.255.255.255]
2. class B [128.0.0.0 to 191.255.255.255]
3. class C [192.0.0.0 to 223.255.255.255]
4. class D [224.0.0.0 to 239.255.255.255]
5. class E [240.0.0.0 to 255.255.255.255]
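In the old scheme, the class followed directly from the first number of the address: below 128 was class A, below 192 class B, below 224 class C, below 240 class D, and everything else class E. A minimal sketch of that lookup follows. The function name address_class is my own, and this is only an illustration of the historical rule, not of how addresses are assigned today.

```python
def address_class(dotted: str) -> str:
    """Return the historical class (A-E) of an IPv4 address from its first number."""
    first = int(dotted.split(".")[0])
    if first < 128:
        return "A"
    if first < 192:
        return "B"
    if first < 224:
        return "C"
    if first < 240:
        return "D"
    return "E"

print(address_class("192.168.1.90"))  # C
```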
Those classes were all used differently and their use was strictly regulated. Currently, the situation is much different, and IANA (Internet Assigned Numbers Authority) is much more flexible when it comes to using classes.
Currently, IP addresses are assigned according to the classless IP assignment method (Classless Inter-Domain Routing, CIDR). This can be easily observed when you take a look at abbreviated IP addresses that have that silly slash at the end of them. This is how subnetwork masks (IP address ranges) are defined: the number after the slash says how many leading bits of the binary address are fixed. For example, for 10.0.0.0/8 the IP range is from 10.0.0.0 to 10.255.255.255, and for 10.0.0.0/16 it is from 10.0.0.0 to 10.0.255.255. This isn’t easy, but this is a relatively good explanation. However, this is only interesting from the point of view of assigning IP ranges, and much less so for traditional SEO. In this way, we have come to the conclusion that the general understanding of class C variants is correct. But we should remember that there are not just 255 different class C networks! There are basically about 16 million more 😉
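The slash notation is easier to grasp when you let a program expand it. Here is a short sketch with Python's ipaddress module. The helper cidr_range is hypothetical, and the 10.0.0.0 blocks are private-range examples I chose for illustration.

```python
import ipaddress

def cidr_range(cidr: str):
    """Return (first address, last address, number of addresses) for a CIDR block."""
    net = ipaddress.ip_network(cidr)
    return str(net.network_address), str(net.broadcast_address), net.num_addresses

print(cidr_range("10.0.0.0/8"))   # ('10.0.0.0', '10.255.255.255', 16777216)
print(cidr_range("10.0.0.0/16"))  # ('10.0.0.0', '10.0.255.255', 65536)
print(cidr_range("10.0.0.0/24"))  # ('10.0.0.0', '10.0.0.255', 256)
```

A /24 fixes the first three numbers, which is exactly what SEOs mean when they say "class C" today.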
Exhaustion of IPv4 addresses
You’ve definitely heard that the available amount of IP addresses is slowly being exhausted. That’s true. And that’s why all IPv4 addresses will be assigned soon! Below is a link to a useful counter:
IPv6: the solution to the problem
I’ve been hearing this prediction for about 10 years, and maybe it will finally come true… IPv4 addresses are 32-bit numbers, which allows 4,294,967,296 unique addresses to be created. IPv6 addresses are 128-bit numbers written in the hexadecimal number system, not the decimal system like IPv4. As a result, the groups of an IPv6 address are separated by colons, not periods. A typical IP address looks like this:
Such an IP address offers all of the possibilities that IPv4 addresses have. Even Google has its own IPv6 search. When entering an IPv6 address as a URL, the IP address must be in square brackets, e.g. http://[2001:0db8:85a3:08d3:1319:8a2e:0370:7344] Thanks to this form, the last colon is not interpreted as introducing a port number. A port can still be appended after the closing bracket: http://[2001:0db8:85a3:08d3:1319:8a2e:0370:7344]:8080/ would be a request to the server on port 8080.
This is quite a good solution, and it will become fact sooner or later. That’s simply inevitable. Heise online, a German IT industry web site, organized an IPv6 day and published the results of its tests, according to which IPv6 really does work well. You can read more about this here: http://www.heise.de/netze/meldung/IPv6-Tag-bei-heise-de-Erste-Ergebnisse-1081201.html
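The role of the square brackets can be checked with Python's standard library. This is a sketch only: the helper split_host_port is my own name, and the URL with the :8080 suffix is an assumed example of the kind of request described above.

```python
import ipaddress
from urllib.parse import urlsplit

ADDRESS = "2001:0db8:85a3:08d3:1319:8a2e:0370:7344"

def split_host_port(url: str):
    """Return (host, port) of a URL; the brackets keep the address colons apart from the port."""
    parts = urlsplit(url)
    return parts.hostname, parts.port

# The same address with leading zeros dropped in each group.
print(ipaddress.IPv6Address(ADDRESS).compressed)  # 2001:db8:85a3:8d3:1319:8a2e:370:7344

print(split_host_port(f"http://[{ADDRESS}]/"))       # port is None
print(split_host_port(f"http://[{ADDRESS}]:8080/"))  # port is 8080

# Address space: 32 bits for IPv4 versus 128 bits for IPv6.
print(2 ** 32)  # 4294967296
```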
Class C—the best (linking) currency on the web?
Let’s talk a little more about current IPv4 addresses, though, and then move on to matters connected directly to SEO. Let’s take a look at a typical linking diagram:
I will explain and summarize some of the points.
1) 430 hostnames
What is a hostname? A hostname is the name of a host/computer. Basically, it’s everything found between the “http://” protocol and the first “/”.
2) 397 domains
A domain is the registered name together with its top-level domain (TLD), without any subdomains.
3) 338 IP addresses
Using DNS, every domain gets one IP address; after all, we surf the web on IP addresses. DNS exists so we don’t have to remember that damn long string of IP address numbers, which makes our lives quite a bit easier. By checking the IP address of every domain, we obtain information about the popularity of a given IP address.
4) 257 masked IP addresses/24 (class C)
This is how we managed to keep a proper recording system. Here, the IP addresses are taken, reduced to their class C network (/24) and counted again.
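The four counts described above can be reproduced in a few lines. The sketch below is only an illustration: the backlink URLs and the hostname-to-IP table are hypothetical (in practice the IPs would come from DNS lookups), and treating the last two labels of a hostname as the domain is a naive assumption that fails for suffixes such as .co.uk.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical backlink URLs and a hypothetical hostname -> IP table.
BACKLINKS = [
    "http://www.example.com/page-a",
    "http://blog.example.com/post",
    "http://www.example.org/",
    "http://example.net/links",
    "http://www.example.com/page-b",
]
RESOLVED = {
    "www.example.com": "192.0.2.10",
    "blog.example.com": "192.0.2.10",
    "www.example.org": "192.0.2.77",
    "example.net": "198.51.100.5",
}

def pyramid(urls, resolved):
    """Count unique hostnames, domains, IP addresses and /24 networks."""
    hostnames = {urlsplit(u).hostname for u in urls}
    domains = {".".join(h.split(".")[-2:]) for h in hostnames}  # naive: last two labels
    ips = {resolved[h] for h in hostnames}
    class_c = {ipaddress.ip_network(f"{ip}/24", strict=False) for ip in ips}
    return len(hostnames), len(domains), len(ips), len(class_c)

print(pyramid(BACKLINKS, RESOLVED))  # (4, 3, 3, 2)
```

Each step can only merge entries, never split them, which is why every count is at most as large as the one before it.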
Looking at the “linking diagram”, you can see an arrow pointed down on the right side. This demonstrates the decreasing value of certain items, and this is how things work in reality. The pyramid will never be the other way around. According to this logic, the number of backlinks is being reduced to the lowest common denominator by tools such as Sistrix (an SEO tool for checking backlinks). If one of those numbers turns out to be disproportionate when compared with the rest, there is a chance there is some unnatural linking going on. According to my observations, a 15% decrease in value of each of these items is perfectly normal. If this percentage is bigger, however, you need to take a closer look at your linking methods.
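The step-by-step decrease is simple arithmetic. Here is a minimal sketch applied to the four counts from the list above (430, 397, 338, 257). The function names are my own, and the 15% threshold is the rule of thumb from this article, not a value published by any tool.

```python
def level_drops(counts):
    """Percentage decrease from each level of the pyramid to the next."""
    return [round(100 * (a - b) / a, 1) for a, b in zip(counts, counts[1:])]

def unusual_drops(counts, threshold=15.0):
    """Drops larger than the threshold, which deserve a closer look."""
    return [d for d in level_drops(counts) if d > threshold]

counts = [430, 397, 338, 257]  # hostnames, domains, IP addresses, /24 networks
print(level_drops(counts))     # [7.7, 14.9, 24.0]
print(unusual_drops(counts))   # [24.0]
```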
Ok, so I’m not a talented graphic designer… but getting back to the topic. Even the most steadfast link traders, sellers and buyers usually pay attention to class C differentiation.
Reverse-IP and SEO hostings…
Everyone who has a PBN (a linking network) or has created one knows the problem that is differentiating class C IP addresses for their own domains. Previously, it was enough to simply buy a certain area on the main server and link the domains amongst each other properly. Later, you had to contact the hoster and ask him nicely to just set up your domains on as many of his servers as possible. Today we have quite different demands: you are far more likely to say to the hoster, “I need more addresses with different C classes!”
Google hires the best programmers in the world. You didn’t think it would be that easy to cheat it, did you?
Looking at German companies that offer SEO hosting, I’ve come to the conclusion that most of them make mistakes that their colleagues in the United States have already learned not to make.
IANA gives every IP range to the appropriate regional administrators (e.g. RIPE for Europe and APNIC for Asia), who then publish information about specific IP ranges. For Europe, this can be easily found here: ftp://ftp.ripe.net/ripe/stats/membership/alloclist.txt RIPE is a trusted source. Thanks to this data, specific IP addresses can be attributed to specific hosters. This way, it is easy to check whether a class C network belongs to just one hosting company. DomainTools shows us that this works pretty well: http://whois.domaintools.com/22.214.171.124.
Google knows everything and it also knows who a given IP belongs to. Google is two steps ahead of us when it comes to detecting PBNs.
I would need half a day to filter through the data I get from RIPE.
Still, hypothetically, let’s assume that I’m a client of, say, Hosteurope and I want to play it smart by getting cheap hosting from Hosteurope in each of the different class C networks. How many different class C networks does Hosteurope have? To figure this out, you have to take a look at this RIPE data (link above).
This is what we have:
The first number on the left is the date on which RIPE delegated certain IP networks, including IPv6. I will omit IPv6 addresses at this time, as there were not many real sites at these addresses.
The second column contains the addresses in CIDR notation.
The third column is RIPE status.
So let’s check how many C class networks we could theoretically get from Hosteurope.
BTW: the last IP address is 254, because 255 is always the class C network broadcast address.
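Counting the class C networks inside an allocation is mechanical once the CIDR block is known. Below is a sketch with Python's ipaddress module. The block 10.32.0.0/19 is a hypothetical allocation chosen for illustration, not real Hosteurope data, and the helpers assume a block of size /24 or larger.

```python
import ipaddress

def class_c_networks(cidr: str):
    """Split an allocation (assumed to be /24 or larger) into its /24 networks."""
    return list(ipaddress.ip_network(cidr).subnets(new_prefix=24))

def usable_range(network):
    """Return (first host, last host, broadcast address) of a /24 network."""
    hosts = list(network.hosts())  # excludes the network and broadcast addresses
    return str(hosts[0]), str(hosts[-1]), str(network.broadcast_address)

nets = class_c_networks("10.32.0.0/19")  # hypothetical allocation
print(len(nets))              # 32
print(usable_range(nets[0]))  # ('10.32.0.1', '10.32.0.254', '10.32.0.255')
```

Summing the result over every block listed for one hoster gives the theoretical maximum of class C networks that hoster can offer.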
What are the chances that for every class C belonging to Hosteurope, we’ll be able to get a backlink? Not too good, that’s for sure. If on the other hand we combine the aspect of one hoster and different class C networks, we can cause an extremely unnatural situation:
This is how this would look in Sistrix. Naturally, having the same value (616) in the first 5 types of backlinks is pretty much impossible. Scraper sites, SERPs or social bookmarking could seem like a good way out of this situation at first glance. However, if these are very low-quality websites, they are ignored by Google anyway, with sites with a high trust rank being much more important and valuable. But who really knows what criteria Google uses when assessing links…
However, I would also like to show that there is a factor that makes building PBNs much easier than it is based on class C! This factor is the attachment of a certain IP to a certain hoster!
Maybe Google isn’t all that advanced when it comes to devaluing backlinks and such extreme cases are just exceptions. Still, you have to keep this in mind if you are planning to build a PBN.
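The attachment of an IP to a hoster can be illustrated by matching addresses against published allocations. The following is a sketch under stated assumptions: the hoster names, the allocation blocks and the linking IPs are all hypothetical, and nothing here claims to reflect how Google actually evaluates links.

```python
import ipaddress

# Hypothetical allocations (hoster -> CIDR blocks), in the spirit of the RIPE list.
ALLOCATIONS = {
    "hoster-a": ["10.32.0.0/19"],
    "hoster-b": ["10.64.0.0/20", "172.16.0.0/22"],
}

def owner_of(ip, allocations):
    """Return the hoster whose allocation contains the IP, or None if unknown."""
    addr = ipaddress.ip_address(ip)
    for hoster, blocks in allocations.items():
        if any(addr in ipaddress.ip_network(block) for block in blocks):
            return hoster
    return None

def diversity(ips, allocations):
    """Return (number of distinct /24 networks, number of distinct hosters)."""
    class_c = {ipaddress.ip_network(f"{ip}/24", strict=False) for ip in ips}
    hosters = {owner_of(ip, allocations) for ip in ips}
    return len(class_c), len(hosters)

linking_ips = ["10.32.1.5", "10.32.7.9", "10.32.20.1", "10.64.3.3"]
print(diversity(linking_ips, ALLOCATIONS))  # (4, 2)
```

Four different class C networks look diverse, yet they collapse to only two hosters, and that is precisely the footprint a network builder would want to avoid.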
How to build the perfect PBN
It isn’t enough to buy a large SEO package from one hoster. A perfect PBN is so well separated that even you find it hard to keep track of all the hosters, internet providers, colocations, DNSes and domain registrants involved. Pretty much every domain needs to be built completely separately from the rest. To do this, you have to use a variety of solutions: for example, you can change DNSes, hoster, etc… Basically cover your tracks whenever and wherever you can.
PBNs are a wonderful thing that can be used to achieve many different SEO goals. Once we’ve built one, it gives us a feeling of security. Thanks to all this it is possible to build a completely independent SEO area where you do not need to trade links or rely on other SEOs.
Of course, I know that building and managing such a PBN takes a lot of time and work—and the bigger the PBN, the more time and work it takes. That’s why if someone has 1 or 2 web pages, he can create another 2 for SEO purposes. In most cases it is enough to move those 2 domains to different hosters.
The more pages we have, the more we see how wise it is to make use of a PBN. It doesn’t pay for just one or two projects, though, so if that’s all you have to do, then other sensible link-building solutions are likely better for you.