Google’s algorithm uses over 200 factors to determine where a page will show up in the rankings, but Matt Cutts, head of Google’s Webspam team, has pointed time and again to quality content and inbound links from other sites as the two most important criteria for high rankings.
“Having original and useful content and making your site search engine friendly is the best strategy for better ranking. With an appealing site, you’ll be recognized by the web community as a reliable source and links to your site will build naturally.”
- Google Webmaster Central
Ideally, a successful page will have great content and as many inbound links as possible from highly regarded pages with related content. That said, it’s still very possible for a page to rank well for certain terms using content alone. This happens all the time with new blog posts.
We can refer to the text content, the title tags, the heading tags, the internal linking of pages, and other factors arising from the code itself as on-page factors. These factors are within our direct control, and our SEO audit looks at the following on-page factors for areas of improvement.
Google has published a Search Engine Optimization Starter Guide that is a great place to start researching SEO.
Basic, publicly available information about the server running the web site is reported. We’ll tell you what OS it is running, whether it is up to date, where it is physically located, and so on. If you use shared hosting, we’ll let you know how many other domains are hosted on that IP address. We’ll check your server headers to make sure your pages are being sent with the correct HTTP status code.
Your server’s physical location, as determined by its IP address, can affect your rankings: a site hosted on a server in Germany, for example, is more likely to show up in searches by German users of Google. You can also use Google’s Webmaster Tools to target a specific geographic market.
Matt Cutts addresses this in his video “Can the geographic location of a web server affect organic rankings?”
You might also want to obtain a dedicated IP address. Many webhosts put hundreds if not thousands of sites on a single IP address. You can see how many sites are sharing your IP address by performing a Reverse IP lookup.
Web pages are written in a ‘markup language’ called HTML, short for Hypertext Markup Language. It provides a means to describe the structure of text-based information in a document – by denoting certain text as headings, paragraphs, lists, and so on – and to supplement that text with interactive forms, embedded images, and other objects. HTML is written in the form of labels (known as tags), surrounded by less-than (<) and greater-than (>) signs.
As the technology of the web evolves, new versions of HTML are developed to improve upon older versions. Most of the time, the latest version will be the best choice. A web page can identify which version of HTML it uses with a line of code called a DOCTYPE Declaration (DTD).
Internet Explorer uses the presence or absence of a DOCTYPE Declaration to switch between rendering the page in either ‘standards mode’ or ‘quirks mode’, potentially causing very unexpected results.
We’ll report on your pages’ use of a DTD and make recommendations about which one seems to be the best fit. Including a DOCTYPE Declaration and valid HTML doesn’t directly influence rankings, but it’s part of building a site “the right way”.
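For reference, here are two commonly used DOCTYPE Declarations – the XHTML 1.0 Strict DTD and the simpler HTML5 declaration – either of which appears as the very first line of the HTML file:

    <!-- XHTML 1.0 Strict -->
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

    <!-- HTML5 -->
    <!DOCTYPE html>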
Keywords in the URL:
The URL of a page should contain keywords for that page, as the URL is something Google looks at when determining what a page is about. But Google has also warned against keyword stuffing in URLs, so keep the URL concise and easy to remember and type.
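For example (the domain and page names here are only placeholders), the first URL below tells both visitors and Google far more about the page than the second:

    http://www.example.com/blue-widgets/
    http://www.example.com/products.php?id=1432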
The title tag is a required element in an HTML document and is among the most important parts of a web page because it contains the most heavily weighted text on the page. You can view the contents of the title tag in the blue bar at the top of your browser.
The title serves a number of purposes. For many potential visitors, it is their first exposure to your site: the search engines present page titles in their results pages, where they are displayed as the links that visitors click to reach each site. Effectively written titles will therefore increase your traffic, while poorly written ones will result in potential visitors passing over your page. Do not use the same title tag across all of your pages. It is important to find a balance between a title that humans will find compelling and one that incorporates targeted keywords for the search engines. Matt Cutts has a good video about not going overboard when tweaking title and description tags.
The audit includes a list of all spiderable pages on your site and their corresponding title tags. We will make recommendations based on a number of criteria. You should review this list carefully, as adjustments here may greatly influence your rankings and traffic.
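As a quick illustration – the wording and business name are made up – the title tag sits in the document’s head and might look like this:

    <head>
      <title>Blue Widgets – Custom Widget Manufacturing | Example Widget Co.</title>
    </head>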
A pretty concise explanation of the title tag can be found at http://www.seologic.com/faq/title-tags.php
Description Meta Tag:
Because the description meta tag may appear in the SERPs underneath the page’s title, a concise summary of the page’s content is an important part of every page. Resist the urge to have this tag automatically generated, or to just copy and paste from the body text – it’s another opportunity to put your message in front of potential customers.
In terms of SEO, the tag is of little importance to Google, is of moderate importance to MSN/Live Search, and is still important to Yahoo!. If the tag is missing, the search engines may create one by excerpting body text near the searched-for terms.
Description meta tags of 125–150 characters should be safe from truncation. I’d recommend using one or two sentences, and do use a period at the end. I’ve seen Google truncate longer descriptions (e.g., 210 characters) to 140–157 characters, breaking on whole words. Such a wide range makes it difficult to predict what will be displayed in the SERPs if your tag exceeds 150 characters.
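A hypothetical example, written as one short sentence plus a call to action and kept under 150 characters:

    <meta name="description" content="Example Widget Co. builds custom blue widgets and ships them worldwide. Request a free quote or browse the catalog online." />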
Keywords Meta Tag:
The keywords meta tag is optional. In terms of SEO, some minor search engines still consider its contents while the major ones ignore it completely. Because it has been and may still be used by Google to identify ‘spam’ pages, careful attention must be paid to the keywords in this tag. Omitting the tag entirely is acceptable, and doing so will reduce the risk that Google will use it to incorrectly identify a page as spam.
Read more about the keywords meta tag’s waning usefulness at http://searchengineland.com/meta-keywords-tag-101-how-to-legally-hide-words-on-your-pages-for-search-engines-12099
The H1 tag indicates the top-level heading of a page, and most developers argue that there should be only one H1 tag per page. Google won’t penalize for using more than one H1 tag on a page, but has warned against its overuse. The tag should contain text of primary importance and that is highly relevant to the content of the page, comparable to the headline of a newspaper. Search engines place some value on the text within the H1 tag.
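In the markup, that headline is simply (the wording is hypothetical):

    <h1>Custom Blue Widgets for Industrial Applications</h1>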
Each image used on a web page should have an ALT tag that provides a text alternative to the image. Once, it was possible to stuff keywords into these ALT tags to influence rankings, but those days are long gone. Search engines do consider ALT tag text, but the recommendation is to use these tags only as they were intended, to describe the image, and limit that description to less than 140 characters.
We’ll let you know if your images are using ALT tags correctly.
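A quick sketch of the intended use – the file name and description are made up – where the ALT text simply describes what the image shows:

    <img src="/images/blue-widget-2000.jpg" alt="The Blue Widget 2000 mounted on a workbench" />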
Pay particular attention to the text in your internal site links. Google considers link text when determining what a page is about, so all of your internal links should use text that is descriptive of the target page. Try to use a likely search phrase as your link text wherever possible. Do not use “click here”, for example, as link text, because it doesn’t communicate anything about the target page.
It has been suggested that Google considers only the link text of the first link to a resource on each page, and that the link text of any subsequent links to the same resource on that page is ignored. If this is true, one might benefit from pushing the navigation below the content in the HTML (as is done on many blogs with sidebar navigation) so that the usually more descriptive links in the content of the page are parsed first, rather than the snappy one-word navigation links that are in vogue.
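As a simple illustration (the page and wording are hypothetical), descriptive link text beats a generic label:

    <!-- Tells the search engines what the target page is about -->
    <a href="/blue-widgets/">Browse our blue widget catalog</a>

    <!-- Says nothing about the target page -->
    <a href="/blue-widgets/">Click here</a>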
Absolute and Relative URLs:
Links to pages on the same domain can be written in two ways, using either absolute or relative URLs. For example, a link appearing on http://example.com/index.html using an absolute URL would be written as http://example.com/bar.html while the same link using a relative URL would be written as bar.html.
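In the HTML of http://example.com/index.html, the two forms would look like this (the link text is arbitrary):

    <!-- Absolute URL -->
    <a href="http://example.com/bar.html">Widget specifications</a>

    <!-- Relative URL -->
    <a href="bar.html">Widget specifications</a>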
Google has recommended using absolute URLs in all of your internal linking. We’ll tell you if this is happening, and if you wish, fix it where it is not.
Using valid HTML code in web pages is the first step toward maximizing compatibility with the greatest number of browsers and other user agents, such as PDAs and cell phones. Using valid HTML means having HTML code that correctly follows one of the DTDs of the HTML specification. You can check whether a page uses valid HTML by passing the page through an HTML validator, a program that checks the correctness of your document against the declared DOCTYPE. The most widely used validator is located at http://validator.w3.org/.
Validation is not a guarantee of the quality of the code and a valid page may look very different in different browsers. But validation is the fastest and most convenient metric available to an owner or end-user of a page who is concerned about the degree of care with which a page has been created. In my opinion, a page isn’t finished until it validates.
As the W3C people put it, “Validity is one of the quality criteria for a Web page, but there are many others. In other words, a valid Web page is not necessarily a good web page, but an invalid Web page has little chance of being a good web page.”
If you are paying for a web site, insist that it validates – ideally against the most strict DTD available (currently, XHTML 1.0 Strict). With respect to using valid code, Google has said, ‘make it easy on us, and we’ll make it easy on you’.
Learn more at http://validator.w3.org/
A modern web page typically consists of multiple files – including an HTML file and a CSS file – that allow the separation of the content elements from the presentational elements. The HTML file contains the tags (the basic structural elements, e.g., the title tag, the H1 tag, etc.) and the content (the text) of the page. The CSS file contains all of the presentational information, such as the color, size, and font face of the text, the amount of space between each line, and the background color or image of the page. While the two can be combined in a single file, it’s preferable to keep them separate: the external CSS file is downloaded once and then cached by the browser, reducing bandwidth consumption and page load times.
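A minimal sketch of that split – the file name and style rules are arbitrary – with the HTML referencing an external stylesheet that carries the presentational rules:

    <!-- In the HTML document's head -->
    <link rel="stylesheet" type="text/css" href="/css/styles.css" />

    /* In /css/styles.css */
    body { font-family: Georgia, serif; color: #333; background: #fff; }
    h1   { font-size: 2em; margin-bottom: 0.5em; }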
A CSS file can be validated in the same way as HTML.
Learn more at http://jigsaw.w3.org/css-validator/
In the early days of the web, when HTML was a more primitive and limited language and before the advent of CSS, web pages were often structured using tables. Graphic designers looking for ways to precisely control the visual appearance of Web pages used tables to create page layouts that were dependably identical in all browsers.
This table-based method causes a number of problems, however. Complex pages are typically designed with tables nested within tables, resulting in large HTML documents that require more bandwidth than documents with simpler formatting. In addition, a web browser usually has to download all of the content within a table before displaying it on a page, resulting in slower-seeming load times. Table-based layouts require more markup than CSS-based layouts, increasing the ratio of code to content. Furthermore, when a table-based layout is linearized, for example, when being parsed by a screen reader or a search engine, the resulting order of the content can be somewhat jumbled and confusing. This can negatively affect how the search engines prioritize the page’s content.
CSS was developed to improve the separation between design and content and move back towards a semantic organization of content on the web. With CSS, the placement of text on a page doesn’t necessarily correspond to where that text exists in the code. This means that you can display the navigation above the text on the page, but put the text before the navigation in the code – an impossible situation with table-based layouts. This gives CSS-based layouts a terrific advantage over table-based layouts in terms of front-loading important content near the top of the document.
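A minimal sketch of the idea (the class names and widths are arbitrary): the content block comes first in the source, and two CSS floats render the navigation as a sidebar beside it:

    <div id="content">Main article text comes first in the code...</div>
    <div id="nav">Home | Products | Contact</div>

    /* styles.css */
    #content { float: right; width: 75%; }
    #nav     { float: left;  width: 20%; }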
Canonicalization (www redirect):
Canonicalization is the process of selecting a single URL from several possible choices. For example, most people would consider these URLs to be the same:
http://example.com
http://www.example.com
http://example.com/index.html
http://www.example.com/index.html
And most of the time, a visitor would see the same page for each URL. But technically, all of these URLs are different. A web server could return completely different content for all the URLs above, and so search engines treat a page at any one of these URLs as distinct and separate from a page at any of the other URLs.
One of the factors that influences PageRank is the number of incoming links to a page. A problem arises when a page has incoming links pointing to multiple URLs, for example, to both www.example.com/mypage.html and example.com/mypage.html. Google may determine that this is actually two distinct pages, and each of them starts to accrue its own PageRank, which is disadvantageous. To consolidate PageRank, it is recommended that a site’s owner first pick one URL and use it consistently across the entire site. On Apache servers, an .htaccess file is then used to redirect requests for the non-favored URL to the favored URL, eliminating the ‘other’ page.
In practice, when a browser or spider follows a link to the page at www.example.com/mypage.html, the server responds that the page has moved permanently (a 301 redirect) and sends the requestor to example.com/mypage.html instead. This ensures that the search engines are only ever served one page and that that page gets the PageRank credit for all the inbound links, no matter how the links are written.
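One common form of that .htaccess rule (assuming mod_rewrite is available, and substituting your own domain) is a 301 ‘moved permanently’ redirect – here sending the www form to the non-www form to match the example above:

    # .htaccess – permanently redirect www.example.com to example.com
    RewriteEngine On
    RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
    RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]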
No page should be more than three clicks from the home page of your site. One way to accomplish this is by creating a sitemap page that contains links to all of the pages on the site, or at least to the pages that you are prioritizing, and then linking to that page from the home page.
An XML Sitemap is an XML file that lists the URL of each page of a site, along with additional metadata about each page (when it was last updated, how often it usually changes, and how important it is relative to other pages in the site) so that search engines can more intelligently crawl the site.
Web crawlers normally discover pages of a web site by crawling the links within the site. Sitemaps supplement this data to allow crawlers that support Sitemaps to pick up all the URLs in the sitemap and learn about those URLs using the associated metadata. Using the Sitemap protocol does not guarantee that web pages are included in search engines, but it helps robots do a better job of spidering your site.
XML Sitemaps should not be confused with HTML sitemaps. HTML sitemaps are web pages that are accessible to human visitors, while XML Sitemaps are only ever seen by web robots.
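A minimal sitemap following the protocol looks like this (the URLs, dates, and priorities are placeholders):

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://example.com/</loc>
        <lastmod>2010-06-01</lastmod>
        <changefreq>weekly</changefreq>
        <priority>1.0</priority>
      </url>
      <url>
        <loc>http://example.com/about.html</loc>
        <lastmod>2010-05-15</lastmod>
        <changefreq>monthly</changefreq>
        <priority>0.5</priority>
      </url>
    </urlset>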
Learn more at http://www.sitemaps.org/
A robots.txt file is a text file that resides in the root directory of your site and contains instructions for the robots about whether they are allowed to crawl your site and which folders or file types, if any, are off-limits to them. A robots.txt file can also point robots to your XML Sitemap.
An example of a robots.txt file that excludes problem robots: http://www.webmarketingnow.com/robots.txt
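A simple robots.txt might look like this (the disallowed folders are just examples):

    # Allow all robots to crawl everything except two private folders,
    # and point them to the XML Sitemap
    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /private/
    Sitemap: http://example.com/sitemap.xml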
Page load times:
One way to reduce page load times is to reduce the number of HTTP requests. Each file (JS, CSS, images, etc.) referenced by the HTML causes an HTTP request. Combining multiple JS and CSS files is an easy way to reduce the number of requests. Combining multiple images into a single file is a bit trickier. Luckily, there is a neat bookmarklet called SpriteMe that will help you do this.
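The sprite technique in a nutshell (the file name and coordinates are hypothetical): several small images are combined into one file, and CSS background-position displays just the slice you need, so the browser makes one request instead of many:

    /* One combined image serves many icons */
    .icon        { background: url(/images/sprite.png) no-repeat; width: 16px; height: 16px; }
    .icon-home   { background-position: 0 0; }
    .icon-search { background-position: -16px 0; }
    .icon-rss    { background-position: -32px 0; }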
Google Webmaster Tools:
Google Webmaster Tools is a free online suite of tools that gives website owners an easy way to make their sites more Google-friendly. The tools can show you how Google views a site, help you diagnose problems, and let you share information with Google to help improve your site’s visibility.
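Before the tools will report on your site, you must verify that you own it. One common method is to add a verification meta tag to your home page’s head; Google generates the actual content value for you, so the one below is only a placeholder:

    <meta name="google-site-verification" content="your-unique-verification-string" />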