Nowadays, when most people talk about search engines, they really mean World Wide Web search engines such as Google, whose work goes on largely out of sight. But before the Web became the most visible part of the Internet, there were already search engines in place to help people find information on the Net. Programs with names like "gopher" and "Archie" kept indexes of files stored on servers connected to the Internet, and dramatically reduced the amount of time required to find programs and documents.
At first I had no idea what a Google search engine spider was, and I wanted to learn. The first and best way I found was to search on Google itself. The first result said: “Crawling is the process by which Googlebot discovers new and updated pages to be added to the Google index. We use a huge set of computers to fetch (or "crawl") billions of pages on the web. The program that does the fetching is called Googlebot (also known as a robot, bot, or spider).”
Before talking about the Google search engine spider, let me first briefly explain what a search engine is and how it works.
What is a search engine?
Basically, a search engine is a program that searches documents for keywords entered by the user and returns a list of the documents in which those keywords were found. A search engine is really a general class of programs. However, the term is often used specifically to describe systems such as Google, Bing and Yahoo! Search, which let users search for documents on the Web.
How does it work? A Web search engine works by sending out a spider to fetch as many documents as possible. Another program, called an indexer, then reads those documents and creates an index based on the words contained in each document. Each search engine uses a proprietary algorithm to create its indices so that, ideally, only meaningful results are returned for each query. As many website owners rely on search engines to send traffic to their websites, an entire industry has grown around the idea of optimizing Web content to improve placement in search engine results.
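The fetch-then-index flow described above can be illustrated with a toy inverted index. This is only a sketch: the `build_index` and `search` helpers and the sample pages are hypothetical, and real search engines use proprietary and far more elaborate algorithms.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical pages that a spider might have fetched.
pages = {
    "page1": "Gopher and Archie kept indexes of files",
    "page2": "A spider fetches documents for the indexer",
    "page3": "The indexer reads documents and builds an index",
}
index = build_index(pages)
print(sorted(search(index, "indexer documents")))
```

The index is built once, so answering a query only needs a few set lookups instead of rereading every document.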
Google's spider follows the same process, with more advanced optimization. Google's spiders constantly crawl the web to rebuild the index. Crawls are based on many factors, for example PageRank, links to a page, and crawling constraints such as the number of parameters in a URL.
How does the spider work on our site? It is a program that visits our website and reads all of its pages and other resources in order to create entries for a search engine index. Such a program is also called a "crawler" or a "bot".
If you have built a website and finished the hosting part, search for it on Google. Google will return your pages based on the search keywords. Google handles billions of web pages every day. Today I am going to give you a better idea of what Google is doing with our websites. Let's look at the list of 100 factors below: some are proven, some are controversial, and some are SEO nerd speculation.
1. Latent Semantic Indexing (LSI) Keywords in Content: LSI keywords help search engines extract the intended meaning from a word that has more than one meaning. The presence or absence of LSI keywords probably also acts as a content quality signal for search engines.
2. Keyword in the Title Tag: The title tag is an HTML element that specifies the title of a web page. Title tags are displayed on search engine results pages (SERPs) as the clickable headline for a result. After the content of the page itself, the title tag is a web page's most important piece of content, so it sends a strong on-page SEO signal.
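To make the title tag concrete, here is a minimal sketch that pulls the `<title>` text out of a page and checks it for a keyword, using only Python's standard `html.parser` module. The helper names and the sample page are hypothetical, and this says nothing about how Google itself weighs the title.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text found inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)


def extract_title(html):
    """Return the page title, or an empty string if there is none."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


def title_contains_keyword(html, keyword):
    """Case-insensitive check for the keyword inside the title."""
    return keyword.lower() in extract_title(html).lower()


# Hypothetical sample page.
sample = (
    "<html><head><title>Cat Shaving Techniques | Example Blog</title></head>"
    "<body><p>Welcome</p></body></html>"
)
print(extract_title(sample))
print(title_contains_keyword(sample, "cat shaving"))
```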
3. Page Loading Speed via HTML: Google uses page loading speed as a ranking factor. Search engine spiders can estimate your site speed fairly accurately based on a page's code and file sizes.
4. Page Loading Speed via Chrome: Similarly, Google may use Chrome user data to get a better idea of a page's loading time, as this takes into account server speed and CDN usage.
5. Keyword Order: An exact match of the searcher's keyword in a page's content will generally rank better than the same keywords in a different order.
6. Recency of Content Updates: Google favors recently updated content, especially for time-sensitive searches. Highlighting the importance of this factor, Google shows the date of a page's last update for certain pages.
To illustrate keyword order (factor 5), consider a search for “cat shaving techniques”. A page optimized for the phrase “cat shaving techniques” will rank better than a page optimized for “techniques for shaving a cat”. This is a good illustration of why keyword research is a must.
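The difference between an exact phrase match and the same words in another order can be shown with a short sketch. The two helper functions and the sample page texts are hypothetical, and this is not a claim about how Google actually matches phrases.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def has_exact_phrase(content, phrase):
    """True if the words of the phrase appear consecutively and in order."""
    words = tokenize(content)
    target = tokenize(phrase)
    n = len(target)
    if n == 0:
        return False
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


def has_all_words(content, phrase):
    """True if every word of the phrase appears somewhere, in any order."""
    return set(tokenize(phrase)) <= set(tokenize(content))


query = "cat shaving techniques"
page_a = "Our guide to cat shaving techniques for beginners"
page_b = "Five techniques for shaving a cat safely"

for name, page in (("page_a", page_a), ("page_b", page_b)):
    print(name, has_exact_phrase(page, query), has_all_words(page, query))
```

Both pages contain all three words, but only the first contains them as an exact phrase.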
7. Outbound Links: According to Moz, search engines may use the content of the pages you link to as a relevancy signal.
8. Keyword in H2, H3 Tags: Having your keyword appear as a subheading in H2 or H3 format may be another weak relevancy signal.
9. Image Results: Google sometimes elbows organic listings aside in favor of image results for searches commonly used on Google Image Search.
10. Helpful Supplementary Content: According to Google's now-public rater guidelines document, helpful supplementary content is an indicator of a page's quality. Examples include loan interest calculators, currency converters and interactive recipes.
11. Reading Level: There is little doubt that Google estimates the reading level of webpages. In fact, Google used to show reading level stats.
What they do with that information is up for debate. Some say that a basic reading level will help you rank, because it appeals to the masses.
12. HTML Errors and W3C Validation: Lots of HTML errors or sloppy coding may be a sign of a poor-quality site. Many SEOs consider W3C validation a weak quality signal.
13. Affiliate Links: Affiliate links by themselves probably won't hurt your rankings. But if you have too many, Google's algorithm may pay closer attention to other quality signals.
14. Page Category: The category a page appears in is a relevancy signal. A page that is part of a closely related category should get a relevancy boost compared to a page filed under an unrelated or less related category.
15. URL Path: A page closer to the homepage may get a slight authority boost.
16. URL String: The URL string is read by Google and may provide a thematic signal as to what a page is about.
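The two URL factors above can be sketched with Python's standard `urllib.parse`. Counting path segments is only a rough proxy for "closeness to the homepage" and is an assumption made for illustration. The URL and helper names are hypothetical.

```python
from urllib.parse import urlparse


def path_segments(url):
    """Return the non-empty segments of the URL path."""
    return [segment for segment in urlparse(url).path.split("/") if segment]


def depth_from_homepage(url):
    """Number of path segments; the homepage itself has depth 0."""
    return len(path_segments(url))


# Hypothetical URL whose path reads like a chain of categories.
url = "https://example.com/pets/cats/shaving-techniques"
print(path_segments(url))
print(depth_from_homepage(url))
print(depth_from_homepage("https://example.com/"))
```

Here the segments `pets` and `cats` read like categories, which is the kind of thematic hint the URL string may carry.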
17. Sources and References: Google's quality guidelines state that reviewers should keep an eye out for sources when looking at certain pages:
- Point: For some topics, expertise and/or authoritative sources are important.
However, Google has denied that they use external links as a ranking signal.
18. Bullets and Numbered Lists: Bullets and numbered lists help break up your content for readers.
19. Unique Insights and Value of Content: Google is on the hunt for sites that offer nothing new or useful, especially “thin affiliate” sites.
20. Contact Us Page: Google's quality document states that they prefer sites with an “appropriate amount of contact information”. It may also help if your basic contact information matches what is published about your site elsewhere.
21. Site Architecture: A well put-together site architecture, especially a silo structure, helps Google thematically organize your content.
22. Domain Trust: Domain trust is measured by how many links away your site is from highly trusted sites. This is a very important factor to consider.
23. Website Architecture: As with factor 21, a silo structure helps Google thematically organize your content.
24. Mobile Optimized: The usual way to optimize for mobile is to create a responsive site. It’s likely that responsive sites get an edge in searches from a mobile device. In fact, Google now adds “Mobile friendly” tags to sites that display well on mobile devices. Google has also started penalizing sites in mobile search that aren’t mobile friendly.
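One common ingredient of a responsive page is a viewport meta tag. The sketch below checks for it with Python's standard `html.parser`. It is an assumption that this check is a useful hint: the tag alone does not prove that a page is responsive, and this is not how Google's own mobile-friendliness test is defined. The helper names and sample pages are hypothetical.

```python
from html.parser import HTMLParser


class ViewportParser(HTMLParser):
    """Remember the content of <meta name="viewport">, if present."""

    def __init__(self):
        super().__init__()
        self.viewport = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "viewport":
            self.viewport = attributes.get("content") or ""


def has_responsive_viewport(html):
    """True if the page declares a viewport with width=device-width."""
    parser = ViewportParser()
    parser.feed(html)
    parser.close()
    if parser.viewport is None:
        return False
    return "width=device-width" in parser.viewport.replace(" ", "").lower()


# Hypothetical sample pages.
responsive_page = (
    '<html><head><meta name="viewport" '
    'content="width=device-width, initial-scale=1"></head><body></body></html>'
)
fixed_page = "<html><head><title>Old site</title></head><body></body></html>"

print(has_responsive_viewport(responsive_page))
print(has_responsive_viewport(fixed_page))
```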
25. Terms and Conditions, Terms of Service and Privacy Pages: These pages help tell Google that a site is a trustworthy member of the internet.
26. Server Location: The Server location may influence where the site ranks in different geographical regions. Especially important for geo-specific searches.
27. Breadcrumb Navigation: This is a style of user-friendly site architecture that helps users identify where they are on a site.
28. Site Uptime: Lots of downtime from site maintenance or server issues may hurt your ranking (and can even result in deindexing if not corrected).
29. YouTube: YouTube videos are given preferential treatment in the SERPs. Indeed, traffic to YouTube.com from Google search has reportedly increased.
30. Quantity of Other Keywords Page Ranks for: If the page ranks for several other keywords it may give Google an internal sign of quality.
31. Site Usability: A site that’s difficult to use or to navigate can hurt ranking by reducing time on site and pages viewed and by increasing bounce rate. This may be an independent algorithmic factor gleaned from massive amounts of user data.
32. Page’s PageRank: Not perfectly correlated. But in general higher PR pages tend to rank better than low PR pages.
33. Keyword in Description Tag: Another relevancy signal. Not especially important now, but still makes a difference.
34. Use of Google Analytics and Google Webmaster Tools: Some think that having these two programs installed on your site can improve your page’s indexing. They may also directly influence rank by giving Google more data to work with (for example, more accurate bounce rate, whether or not you get referral traffic from your backlinks, etc.).
35. Linking Domain Age: Backlinks from aged domains may be more powerful than new domains.
36. Social Shares of Referring Page: The amount of page-level social shares may influence the link’s value.
37. User Reviews/Site Reputation: A site’s reputation on review sites like Yelp.com and RipOffReport.com likely plays an important role in the algorithm. Google even posted a rarely candid outline of their approach to user reviews after an eyeglass site was caught ripping off customers in an effort to get backlinks.
38. Linking Root Domains: The number of referring domains is one of the most important ranking factors in Google’s algorithm.
39. Links from .edu or .gov Domains: Matt Cutts has stated that TLD doesn’t factor into a site’s importance. However, that doesn’t stop SEOs from thinking that there’s a special place in the algo for .gov and .edu TLDs.
40. Authority of Linking Domain: The referring domain’s authority may play an independent role in a link’s importance (ie. a PR2 page link from a site with a homepage PR3 may be worth less than a PR2 page link from PR8 Yale.edu).
41. Links from Separate C-Class IPs: Links from separate class-c IP addresses suggest a wider breadth of sites linking to you.
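A "class C" block corresponds to a /24 network, so counting the distinct blocks among linking IP addresses can be sketched with Python's standard `ipaddress` module. The IP addresses below come from ranges reserved for documentation, and the helper names are hypothetical.

```python
import ipaddress


def class_c_block(ip):
    """Return the /24 network (the classic class C block) that contains an IPv4 address."""
    return ipaddress.ip_network(f"{ip}/24", strict=False)


def count_distinct_blocks(ips):
    """Count how many different /24 blocks the given addresses fall into."""
    return len({class_c_block(ip) for ip in ips})


# Hypothetical IP addresses of sites linking to a page.
linking_ips = ["203.0.113.10", "203.0.113.77", "198.51.100.5", "192.0.2.200"]
print(count_distinct_blocks(linking_ips))  # 3
```

The first two addresses share one block, so four linking hosts count as three separate blocks.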
43. Links From Competitors: Links from other pages ranking in the same SERP may be more valuable for a page’s rank for that particular keyword.
44. Guest Posts: Although guest posting can be part of a white hat SEO campaign, links coming from guest posts — especially in an author bio area — may not be as valuable as a contextual link on the same page.
45. Links to Homepage Domain that Page Sits On: Links to a referring page’s homepage may play special importance in evaluating a site’s — and therefore a link’s — weight.
46. Nofollow Links: One of the most controversial topics in SEO. Google’s official word on the matter is:
“In general, we don’t follow them.”
This suggests that they do follow them, at least in certain cases. Having a certain percentage of nofollow links may also indicate a natural vs. unnatural link profile.
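The share of nofollow links in a set of link tags can be computed with a short sketch based on Python's standard `html.parser`. The sample markup and helper names are hypothetical, and what share counts as "natural" is not something this code can tell you.

```python
from html.parser import HTMLParser


class LinkRelParser(HTMLParser):
    """Count <a href> links and how many of them carry rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.total = 0
        self.nofollow = 0

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        if "href" not in attributes:
            return
        self.total += 1
        # rel may hold several space-separated tokens, e.g. "nofollow noopener".
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollow += 1


def nofollow_share(html):
    """Return the fraction of links marked nofollow (0.0 if there are no links)."""
    parser = LinkRelParser()
    parser.feed(html)
    parser.close()
    if parser.total == 0:
        return 0.0
    return parser.nofollow / parser.total


# Hypothetical markup of links pointing at a site.
links = """
<a href="https://example.com/">editorial link</a>
<a href="https://example.com/a" rel="nofollow">blog comment</a>
<a href="https://example.com/b" rel="noopener">partner</a>
<a href="https://example.com/c">directory</a>
"""
print(nofollow_share(links))  # 0.25
```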
47. Diversity of Link Types: Having an unnaturally large percentage of your links come from a single source (for example, forum profiles or blog comments) may be a sign of web spam. On the other hand, links from diverse sources are a sign of a natural link profile.
48. Contextual Links: Links embedded inside a page’s content are considered more powerful than links on an empty page or found elsewhere on the page. A good example of contextual links are backlinks from guestographics.
49. Internal Link Anchor Text: Internal link anchor text is another relevancy signal, although probably weighed differently than backlink anchor text.
50. Backlink Anchor Text: As noted in this description of Google’s original algorithm:
“First, anchors often provide more accurate descriptions of web pages than the pages themselves.”
Obviously, anchor text is less important than before (and likely a web spam signal). But it still sends a strong relevancy signal in small doses.
51. Link Title Attribution: The link title (the text that appears when you hover over a link) is also used as a weak relevancy signal.
52. Link Location on Page: Where a link appears on a page is important. Generally, links embedded in a page’s content are more powerful than links in the footer or sidebar area.
53. Linking Domain Relevancy: A link from a site in a similar niche is significantly more powerful than a link from a completely unrelated site. That’s why any effective SEO strategy today focuses on obtaining relevant links.
54. Text around Link Sentiment: Google has probably figured out whether or not a link to your site is a recommendation or part of a negative review. Links with positive sentiments around them likely carry more weight.
55. Linked to as Wikipedia Source: Although the links are nofollow, many think that getting a link from Wikipedia gives you a little added trust and authority in the eyes of search engines.
56. Links from Real Sites vs. Splogs: Due to the proliferation of blog networks, Google probably gives more weight to links coming from “real sites” than from fake blogs. They likely use brand and user-interaction signals to distinguish between the two.
57. Reciprocal Links: Google’s Link Schemes page lists “Excessive link exchanging” as a link scheme to avoid.
58. User Generated Content Links: Google is able to identify links generated from UGC vs. the actual site owner. For example, they know that a link from the official WordPress.com blog at en.blog.wordpress.com is very different than a link from besttoasterreviews.wordpress.com.
59. Schema.org Microformats: Pages that support microformats may rank above pages without them. This may be a direct boost, or it may result from the fact that pages with microformatting have a higher SERP CTR.
60. Number of Outbound Links on Page: PageRank is finite. A link on a page with hundreds of OBLs passes less PR than a page with only a few OBLs.
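The "PageRank is finite" idea can be made concrete with a small sketch. It follows the originally published PageRank formula, in which a page's score is divided evenly among its outbound links and scaled by a damping factor. The 0.85 default is the value suggested in the original paper. Whether Google still computes things this way is an assumption, and the function name is hypothetical.

```python
def pagerank_passed_per_link(page_rank, outbound_links, damping=0.85):
    """Share of a page's PageRank passed through each outbound link.

    Based on the original published PageRank formula: damping * PR / C,
    where C is the number of outbound links on the page.
    """
    if outbound_links <= 0:
        return 0.0
    return damping * page_rank / outbound_links


# Same PageRank, very different number of outbound links (OBLs).
few = pagerank_passed_per_link(page_rank=1.0, outbound_links=5)
many = pagerank_passed_per_link(page_rank=1.0, outbound_links=200)
print(few > many)  # True
```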
61. Forum Profile Links: Because of industrial-level spamming, Google may significantly devalue links from forum profiles.
62. Bounce Rate: Not everyone in SEO agrees bounce rate matters, but it may be a way for Google to use their users as quality testers (pages where people quickly bounce are probably not very good).
63. Organic CTR for All Keywords: The page’s (or site’s) organic CTR for all keywords it ranks for may be a human-based, user interaction signal.
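Bounce rate and organic CTR are both simple ratios, as the sketch below shows. It assumes the common analytics definitions: a bounce is a visit with exactly one page view, and CTR is clicks divided by impressions. The function names and sample numbers are hypothetical.

```python
def bounce_rate(page_views_per_visit):
    """Fraction of visits that viewed exactly one page."""
    if not page_views_per_visit:
        return 0.0
    bounces = sum(1 for views in page_views_per_visit if views == 1)
    return bounces / len(page_views_per_visit)


def click_through_rate(clicks, impressions):
    """Fraction of SERP impressions that led to a click."""
    if impressions <= 0:
        return 0.0
    return clicks / impressions


# Hypothetical data: pages viewed in each of eight visits.
visits = [1, 3, 1, 5, 2, 1, 1, 4]
print(bounce_rate(visits))  # 0.5

# Hypothetical data: 30 clicks from 1000 impressions.
print(click_through_rate(30, 1000))
```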
64. Direct Traffic: It’s confirmed that Google uses data from Google Chrome to determine whether or not people visit a site (and how often). Sites with lots of direct traffic are likely higher quality than sites that get very little direct traffic.
65. Repeat Traffic: They may also look at whether or not users go back to a page or site after visiting. Sites with repeat visitors may get a Google ranking boost.
66. Google Toolbar Data: Search Engine Watch’s Danny Goodwin reports that Google uses toolbar data as a ranking signal. However, besides page loading speed and malware, it’s not known what kind of data they glean from the toolbar.
67. Chrome Bookmarks: We know that Google collects Chrome browser usage data. Pages that get bookmarked in Chrome might get a boost.
68. Blocked Sites: Google has discontinued this feature in Chrome. However, Panda used this feature as a quality signal.
70. User Comments: Pages with an active comments section may send a signal of user interaction and site quality.
71. Dwell Time: Google pays close attention to “dwell time”, which is how long users spend on a page when they arrive from a Google search.
72. Query Deserves Diversity: Google may add diversity to a SERP for ambiguous keywords.
73. Search History: Earlier searches influence the results of later ones. For example, if you search for “reviews” and then search for “toasters”, Google is more likely to show toaster review sites higher in the SERPs.
74. Geo Targeting: Google gives preference to sites with a local server IP address and a country-specific domain name extension.
75. Internet Browsing History: Sites that you visit frequently may get a boost in the SERPs for your own searches.
77. Google Local Search Results: For local searches, Google often shows Google+ Local results alongside the organic SERPs, with location information for each result.
78. Transactional Search Results: As with Google Shopping results, Google sometimes displays different results for transactional keywords, for example when you search for flight information.
79. Google Shopping Results: Similarly, Google sometimes displays Google Shopping results within the organic SERPs.
80. Google Image Search Results: Google elbows organic listings aside for image results on searches commonly used on Google Image Search, because images for those queries are easy to find there.
81. Easter Egg Results: Google has a number of Easter egg results. For example, certain searches in Google Image Search return a hidden, unexpected result instead of the usual listings.
82. Single Site Results for Brands: Domain or brand-oriented keywords bring up several results from the same site.
83. Authority of Facebook User Accounts: As with Twitter, Facebook shares and likes coming from popular Facebook pages may pass more weight.
84. Number of Facebook Likes: The number of likes that Facebook pages or status posts receive may count. Likes coming from profiles with a large following probably carry more weight than likes from new accounts.
85. Authority of Twitter Accounts: Google often cannot see most Twitter profiles. Still, it likely considers the number of tweets a page receives as a ranking signal.
86. Linking Domain Relevancy: An insightful analysis by MicroSiteMasters.com found that sites with an unnaturally high number of links from unrelated sites were more susceptible to Penguin.
87. Number of Google +1’s: Matt Cutts has gone on the record as saying that Google+ has no direct effect on rankings, but it would be surprising if Google ignored its own social network.
88. Authority of Google+ Accounts: It is logical that Google would weigh +1’s from authoritative accounts more heavily than +1’s from accounts without many followers.
89. Authorship: There was a notable story about this in February 2013. The Google+ authorship program has since been terminated, but it is likely that Google uses some form of authorship to identify influential content producers, which may help boost their rankings.
90. Branded Searches: People search for brands. If people search for your site by name in Google, Google likely takes this into consideration when determining whether you are a brand.
91. Disavow Tool: Use of the Disavow Tool may remove a manual or algorithmic penalty for sites that were the victims of negative SEO.
92. Google News Box: Certain keywords trigger a Google News box in the results.
93. Legitimacy of Social Media Accounts: Many social media accounts have large follower counts. An account with countless followers but few posts is probably interpreted differently from an equally large account with lots of interaction.
94. Employees Listed on LinkedIn: Rand Fishkin has suggested that LinkedIn profiles of people who say they work for your company may be an effective brand signal.
95. Brand Mentions on News Sites: Big brands get mentioned on news sites all the time. In fact, some brands even have their own Google News feed on the first page.
96. Number of RSS Subscribers: Google runs the Feedburner RSS service, so it could plausibly use RSS subscriber data as a site quality signal.
97. Brick and Mortar Location With Google Local Listing: Real businesses have offices. It is possible that Google fishes for location data to determine whether or not a site is a real business.
98. Autogenerated Content: Google isn’t a big fan of autogenerated content. If it suspects that your site is pumping out computer-generated content, the result could be a penalty or de-indexing.
99. Meta Tag Spamming: Keyword stuffing can also happen in meta tags. If Google thinks you are loading keywords into your meta tags to game the algorithm, it may lower your site's ranking.
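A rough self-check for meta tag keyword stuffing can be sketched with Python's standard `html.parser`. The rule used here (more than ten keywords, or any keyword repeated) is an arbitrary assumption chosen for illustration and not a threshold published by Google. The helper names and sample tag are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser


class MetaKeywordsParser(HTMLParser):
    """Collect the comma-separated entries of <meta name="keywords">."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "keywords":
            content = attributes.get("content") or ""
            self.keywords.extend(
                entry.strip().lower() for entry in content.split(",") if entry.strip()
            )


def meta_keyword_report(html, max_keywords=10):
    """Summarize the meta keywords; max_keywords is an arbitrary cut-off."""
    parser = MetaKeywordsParser()
    parser.feed(html)
    parser.close()
    counts = Counter(parser.keywords)
    repeated = sorted(keyword for keyword, n in counts.items() if n > 1)
    return {
        "total": len(parser.keywords),
        "repeated": repeated,
        "looks_stuffed": len(parser.keywords) > max_keywords or bool(repeated),
    }


# Hypothetical page head with a repeated keyword.
head = (
    '<meta name="keywords" content="cheap toasters, toasters, '
    'cheap toasters, best toasters, cheap toasters">'
)
print(meta_keyword_report(head))
```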
100. High- and Low-Quality Links: Links from sources commonly used by black hat SEOs (such as blog comments and reviews) may be treated as low quality.