Robots.txt Generator

Create a robots.txt file

What Is a Robots.txt File?

A robots.txt file is a simple, plain text file that sits at the root of a website. Its core function is to tell search engine crawlers which parts of the site they may crawl, and to keep them away from content you don't want crawled and indexed.

If you’re not certain whether your website or your client’s website has a robots.txt file, it’s easy to check.

Simply enter yourdomain.com/robots.txt into your browser. You'll either see an error page, meaning no file exists, or a plain text page listing directives. Want to learn what those directives mean? Read on and find out.

The most common directives you'll find within a robots.txt file include:

User-agent:

Since each search engine has its own crawler (the most common being Googlebot), the 'User-agent:' line lets you tell a specific search engine that the following set of instructions is for it.

You’ll commonly find ‘user-agent’ followed by a *, otherwise known as a wildcard. This indicates that all search engines should take note of the next set of instructions.
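For example, a rule aimed only at Google's main crawler would look like this (the /example-directory/ path is just a placeholder):

User-agent: Googlebot
Disallow: /example-directory/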

You may also find a default rule following the wildcard that tells all search engines not to crawl any page on your site.

That rule disallows the path '/', which blocks crawlers from every page on the site, homepage included. It's important that you check for this rule and remove it from your robots.txt file immediately, unless you genuinely want the whole site kept out of search engines.

It will look something like this:

User-agent: *

Disallow: /
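By contrast, a 'Disallow' line with an empty value places no restrictions at all, so the safe, crawl-everything version of the file looks like this:

User-agent: *
Disallow: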

Disallow:

The 'Disallow' directive, followed by a URL path of any kind, gives strict instructions to the user-agent named on the line above.

For instance, you're able to block certain pages from search engines that are of no use to users. These commonly include WordPress login pages or cart pages, which is why you'll often see the following lines within the robots.txt files of WordPress sites:

User-agent: *

Disallow: /wp-admin/
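Many WordPress sites pair this rule with an 'Allow' exception, as WordPress's own default output does, so that crawlers can still reach the admin-ajax.php endpoint that some front-end features rely on:

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php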

XML Sitemap:

Another directive you may see is a reference to the location of your XML sitemap file. It's usually placed on the last line of a robots.txt file, and it points search engines to your sitemap, which makes for easier crawling and indexing.

You're able to make this optimization on your own website by adding the following simple line:

Sitemap: https://yourdomain.com/sitemap.xml (substitute the exact URL of your XML sitemap file; the directive expects a full, absolute URL).
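Putting the pieces together, a minimal robots.txt file for a typical WordPress site might look like this (with yourdomain.com standing in for your real domain):

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Sitemap: https://yourdomain.com/sitemap.xml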

Our Robots.txt Generator

Our robots.txt generator tool is designed to help webmasters, and even general site owners, learn how to create robots.txt files without needing huge amounts of technical knowledge.

Although this tool is pretty straightforward, we would suggest you familiarize yourself with the lines of code we mentioned above before using it. This is because incorrect implementation can lead to search engines being unable to crawl critical pages on your site or even your entire domain.

Now that you've learned about the directives, let's delve into some of the features our tool provides.

The first option you'll be presented with when you use the tool is the 'default – all robots are' drop-down menu. This menu allows you to decide whether you want your website to be crawled at all; if you choose to block all robots, make sure you have a good reason.

The second option is to add a ‘crawl delay’. This is a feature typically found on major websites like Twitter and Facebook, since their content is frequently changing.

Adding a crawl delay tells bots to wait a set number of seconds between successive requests, rather than crawling your website all at once. It is essentially a method of preventing an overload of requests to the web server, so unless you're creating a robots.txt file for an enterprise site, or a site that will receive a large number of requests from crawlers, adding a crawl delay is unnecessary.
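If you do need one, the directive is a single line placed under the relevant user-agent, and its value is the number of seconds a compliant crawler should wait between requests. Keep in mind that support varies: Bing honors 'Crawl-delay', for example, while Googlebot ignores it.

User-agent: *
Crawl-delay: 10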

The third option you'll receive is whether to add your XML sitemap file, as mentioned earlier. Simply enter its location within this field. You're also able to choose whether to block or allow specific crawlers from accessing your web pages. These include Googlebot, Google Image, Google Mobile, MSN, Yahoo, and DMOZ, to name a few.
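For example, blocking only Google's image crawler (whose user-agent token is Googlebot-Image, which is presumably what sits behind the tool's 'Google Image' option) would produce something like this:

User-agent: Googlebot-Image
Disallow: /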

Finally, you're given the option to block certain pages or directories from being crawled by search engines. This is typically done for pages that don't provide any useful information to users, such as login, cart, and parameter pages.
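The output for this option is simply one 'Disallow' line per page or directory. For a site with a login page, a cart, and sortable parameter URLs, the generated lines might look like this (all paths hypothetical; note that the * wildcard in the last line is honored by major crawlers like Google and Bing, though it wasn't part of the original robots.txt standard):

User-agent: *
Disallow: /login/
Disallow: /cart/
Disallow: /*?sort=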

Sound useful? We hope so!

Give our robots.txt tool a try and let us know how it works for you.
