Sitemaps and everything we need to know about sitemap.xml
Jan 26, 2021
The main goal of most website owners is to optimize their websites so that the Google search engine notices them and their ranking improves.
Have you ever wondered how search engines measure the value of a website? How they crawl between different pages of a website and index them? Here are the answers to these questions.
You may have heard of sitemaps. It is worth knowing how much a sitemap matters for site optimization and for ranking in search engine results: it acts just like a map for bots.
What is a sitemap?
A sitemap is a file that contains information about the pages, images, and other important content of a website and the relationships between them. Search engines such as Google read this file to crawl the website more intelligently.
In other words, a sitemap is an XML file that lists the URLs of a website. It can carry additional information about each URL and tells search engines which content is more important, which makes the website easier to index. Search engines like Google use a program called a spider.
A search engine spider is also known as a web crawler. It is an Internet bot that scans websites and stores important information for search engine indexing.
What is the reason for using a sitemap?
A sitemap lets Google bots reach the entire website regardless of its size. Bots cannot browse and categorize a website the way a user can, so an XML sitemap, which as mentioned is a list of the website's URLs, helps search engine optimization: it allows search engines to find all the content that the webmaster wants discovered and indexed.
Where is the sitemap located?
The sitemap file should be placed on your web server, ideally in the root directory of the website (the main folder on the host), so that all URLs listed in the sitemap fall under the sitemap's location.
Types of tags in the XML sitemap:
The <urlset> tag is required. It encapsulates the file and references the current protocol standard.
The <url> tag is required. It is the parent tag for each URL entry, and the remaining tags are children of this one.
The <loc> tag is required as well. It holds the URL of the web page, which must be shorter than 2,048 characters.
The <lastmod> tag is optional and shows the date the file was last modified. It must be in the W3C Datetime format, which allows you to omit the time portion and use YYYY-MM-DD if you wish.
The <changefreq> tag is optional as well and indicates how frequently a page is likely to change. The valid values are always, hourly, daily, weekly, monthly, yearly, and never. For example, the value “never” should be used for archived URLs.
The <priority> tag is optional and shows the importance of a page relative to the other pages of the website. Valid values range from 0.0 to 1.0 and tell search engines which pages you consider most important for crawlers. The default priority of a page is 0.5.
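The tag constraints above can be checked before an entry is written to the file. The following is a minimal sketch in Python, not part of any library: the helper name `validate_entry` is hypothetical, it assumes the strict reading that a URL must be shorter than 2,048 characters, and it only checks the date-only YYYY-MM-DD form of the last-modified date.

```python
import re
from datetime import date

# Valid <changefreq> values listed in the text above.
VALID_CHANGEFREQ = {"always", "hourly", "daily", "weekly", "monthly", "yearly", "never"}


def validate_entry(loc, lastmod=None, changefreq=None, priority=None):
    """Return a list of problems found in one sitemap entry (hypothetical helper)."""
    problems = []
    if not loc:
        problems.append("<loc> is required")
    elif len(loc) >= 2048:
        problems.append("<loc> must be shorter than 2,048 characters")
    if lastmod is not None:
        # Only the date-only form YYYY-MM-DD is checked in this sketch.
        well_formed = re.fullmatch(r"\d{4}-\d{2}-\d{2}", lastmod) is not None
        if well_formed:
            try:
                date.fromisoformat(lastmod)  # rejects dates such as 2021-13-40
            except ValueError:
                well_formed = False
        if not well_formed:
            problems.append("<lastmod> is not a YYYY-MM-DD date")
    if changefreq is not None and changefreq not in VALID_CHANGEFREQ:
        problems.append("<changefreq> has an unknown value")
    if priority is not None and not 0.0 <= priority <= 1.0:
        problems.append("<priority> must be between 0.0 and 1.0")
    return problems
```

An empty list means the entry passed every check, so a caller could skip or log any entry for which the list is not empty.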
Sitemap file format:
Search engines have adopted XML as the format of the sitemap protocol. They also accept sitemap feeds in RSS 2.0 or Atom 1.0 format, as well as plain text files.
Generally, a sitemap should:
Begin with an opening <urlset> tag and end with a closing </urlset> tag.
Specify the namespace (protocol standard) within the <urlset> tag.
Include a <url> entry for each URL, as a parent XML tag.
Include a <loc> child entry for each <url> parent tag.
All other tags are optional, and support for them may differ between search engines.
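The structure listed above can be sketched with Python's standard `xml.etree.ElementTree` module. This is a minimal illustration rather than a complete generator: the function name `build_sitemap` and the example.com URLs are assumptions, and the namespace string is the one published for version 0.9 of the Sitemap protocol.

```python
import xml.etree.ElementTree as ET

# Namespace published for version 0.9 of the Sitemap protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) for an iterable of page URLs."""
    # Root element carrying the namespace declaration.
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_url in urls:
        url = ET.SubElement(urlset, "url")  # one parent entry per URL
        loc = ET.SubElement(url, "loc")     # required child entry
        loc.text = page_url                 # ElementTree escapes &, < and > on output
    return ET.tostring(urlset, encoding="unicode")


print(build_sitemap(["https://www.example.com/", "https://www.example.com/about"]))
```

When the result is written to disk it should be saved as UTF-8, for example with `open("sitemap.xml", "w", encoding="utf-8")`.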
You should also pay attention to the points mentioned below:
1. The <urlset>, <url>, and <loc> tags are mandatory, and the <lastmod>, <changefreq>, and <priority> tags are optional.
2. All URLs in the sitemap must belong to the same domain.
3. A sitemap cannot be larger than 50 MB uncompressed and can contain a maximum of 50,000 URLs. If a website has more URLs or a larger file, split the sitemap into several files and create a sitemap index. A sitemap index file can list up to 50,000 sitemaps and is subject to the same 50 MB size limit.
Note: You can compress sitemap files with gzip to save bandwidth, but the size limit applies to the uncompressed file.
In general, the points that should be considered in the sitemap are mentioned below:
Sitemap file size:
As mentioned above, a sitemap file should not contain more than 50,000 URLs, and its size should not exceed 50 MB uncompressed. If you have more than 50,000 URLs, use a sitemap index file.
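Splitting a large URL list and pointing a sitemap index at the resulting files can be sketched as follows. This is only an illustration: the function names and the `sitemap-N.xml` file names are assumptions, while the `<sitemapindex>`, `<sitemap>`, and `<loc>` element names come from the Sitemap protocol.

```python
import gzip
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # URL limit quoted in the text above


def split_urls(urls, chunk_size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into chunks that each fit in one sitemap file."""
    return [urls[i:i + chunk_size] for i in range(0, len(urls), chunk_size)]


def build_sitemap_index(sitemap_urls):
    """Return sitemap index XML listing the location of each sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = sitemap_url
    return ET.tostring(index, encoding="unicode")


def compress_sitemap(xml_text):
    """Gzip a sitemap; the size limit still applies to the uncompressed text."""
    return gzip.compress(xml_text.encode("utf-8"))


# Example: 120,000 hypothetical URLs need three sitemap files.
all_urls = [f"https://www.example.com/page/{n}" for n in range(120_000)]
chunks = split_urls(all_urls)
index_xml = build_sitemap_index(
    [f"https://www.example.com/sitemap-{n}.xml" for n in range(1, len(chunks) + 1)]
)
```

Each chunk would then be written out as its own sitemap file, and only the index file needs to be submitted to the search engine.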
Use entity-escaped characters in the URL:
The sitemap format consists of XML tags, so all data values in a sitemap, including URLs, must be entity-escaped: the characters &, ', ", > and < are written as &amp;, &apos;, &quot;, &gt; and &lt;.
The website and all URLs in the sitemap must use the same syntax. For example, if the URLs of your site start with the www prefix, you cannot mix in addresses that omit it. Also, do not include URLs that contain session IDs.
A sitemap file placed in a directory can only list URLs located in that directory or its subdirectories. URLs in parent directories, in parallel directories, or on a different protocol or host are not considered valid references. Placing the sitemap in the root directory avoids this problem.
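As a small illustration of entity escaping, the sketch below uses `escape` from Python's standard `xml.sax.saxutils` module. The wrapper name `escape_url` and the sample URL are assumptions made for the example.

```python
from xml.sax.saxutils import escape

# escape() handles &, < and > by default; quotes are added through this mapping.
EXTRA_ENTITIES = {'"': "&quot;", "'": "&apos;"}


def escape_url(url):
    """Entity-escape a URL so it can be placed inside a <loc> element."""
    return escape(url, EXTRA_ENTITIES)


print(escape_url("https://www.example.com/view?item=12&desc=vacation_hawaii"))
# https://www.example.com/view?item=12&amp;desc=vacation_hawaii
```

Libraries that serialize XML for you, such as `xml.etree.ElementTree`, usually perform this escaping on their own, so escaping by hand is mainly needed when the file is assembled from plain strings.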
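The location rule described above can be checked with a short Python sketch. The function name `in_sitemap_scope` is hypothetical, and the check is deliberately simple: it compares protocol and host exactly and then tests whether the page path lies in the sitemap's directory.

```python
from urllib.parse import urlsplit


def in_sitemap_scope(sitemap_url, page_url):
    """Return True if page_url may be listed in the sitemap at sitemap_url.

    The page must use the same protocol and host as the sitemap and live
    in the sitemap's directory or one of its subdirectories.
    """
    sitemap = urlsplit(sitemap_url)
    page = urlsplit(page_url)
    if (sitemap.scheme, sitemap.netloc) != (page.scheme, page.netloc):
        return False  # e.g. www vs. non-www, or http vs. https
    # Directory that holds the sitemap file, always ending in "/".
    base_dir = sitemap.path.rsplit("/", 1)[0] + "/"
    page_path = page.path or "/"  # treat a bare host as the root path
    return page_path.startswith(base_dir)
```

For example, a sitemap stored at `https://www.example.com/catalog/sitemap.xml` covers `https://www.example.com/catalog/show?item=1` but not `https://www.example.com/images/logo.png`, whereas a sitemap in the root directory covers both.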
Valid and accurate links:
Your sitemap is considered accurate when its link error rate is below 1%. Otherwise, the sitemap is considered unreliable and should be corrected or discarded. So how does a link error occur? A link error is any URL that returns an HTTP response code other than 200, such as 404 for a broken link or 301 and 302 for a redirected link. Bing is known to apply this rule. If you want SEO that search engines approve of, you must list only accurate, working links. You can check the HTTP response code of a single URL with a header checker tool. For multiple URLs on a website, you can use tools such as the Find Broken Links, Redirects & Google Sitemap Generator Free Tool.
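The error rate described above can also be estimated with a short script. The sketch below uses only Python's standard library; the names `fetch_status` and `error_rate` are hypothetical, redirects are deliberately not followed so that 301 and 302 count as errors, and network failures such as timeouts are not handled.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Redirect handler that refuses to follow 301/302 responses."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # the redirect then surfaces as an HTTPError


def fetch_status(url, timeout=10):
    """Return the HTTP status code of url using a HEAD request."""
    opener = urllib.request.build_opener(NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 404, 301, 302, ...


def error_rate(status_by_url):
    """Share of URLs whose status code is not 200."""
    if not status_by_url:
        return 0.0
    errors = sum(1 for status in status_by_url.values() if status != 200)
    return errors / len(status_by_url)


# Example with status codes that were already collected (made-up values).
statuses = {
    "https://www.example.com/": 200,
    "https://www.example.com/old-page": 301,
    "https://www.example.com/missing": 404,
    "https://www.example.com/about": 200,
}
print(f"{error_rate(statuses):.0%} of the listed URLs return an error")
```

In practice the dictionary would be filled with `fetch_status(url)` for every URL in the sitemap, and any result at or above 0.01 would mean the sitemap needs cleaning up.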
Types of sitemaps:
To get the best results from website optimization, use the different types of sitemaps in combination. Visual sitemaps, XML sitemaps, and HTML sitemaps each have their own features and serve a separate goal. Using all three together makes the website more coherent, raises its value in the eyes of search engines, and helps ensure that the site contains the useful content you intend and that visitors can find the information they are looking for.
Sitemaps were introduced by Google in 2005, and MSN and Yahoo! adopted the protocol in 2006. Sitemaps are a URL inclusion protocol: they advise search engines on what to crawl. The opposite is robots.txt, an exclusion protocol that tells search engines what not to crawl. An XML sitemap is a powerful tool because it helps major search engines like Google understand the structure of a website while crawling it. In other words, with correctly built sitemaps, search engines can more easily reach the areas of your website that you want to appear for relevant searches. Creating an XML sitemap for your website is recommended to improve SEO, since it also helps Google discover the pages on your website.