Clean URLs (also known as user-friendly URLs, pretty URLs, search-engine–friendly URLs or RESTful URLs) are web addresses or Uniform Resource Locators (URLs) intended to improve the usability and accessibility of a website, web application, or web service by being immediately and intuitively meaningful to non-expert users. Such URL schemes tend to reflect the conceptual structure of a collection of information and decouple the user interface from a server's internal representation of information. Other reasons for using clean URLs include search engine optimization (SEO),[1] conforming to the representational state transfer (REST) style of software architecture, and ensuring that individual web resources remain consistently at the same URL. This makes the World Wide Web a more stable and useful system, and allows more durable and reliable bookmarking of web resources.[2]
Clean URLs also do not contain implementation details of the underlying web application. This carries the benefit of reducing the difficulty of changing the implementation of the resource at a later date. For example, many URLs include the filename of a server-side script, such as example.php, example.asp or cgi-bin. If the underlying implementation of a resource is changed, such URLs would need to change along with it. Likewise, when URLs are not "clean", moving or restructuring the site database can cause broken links, both internally and from external sites, the latter of which can lead to removal from search engine listings. The use of clean URLs presents a consistent location for resources to user agents regardless of internal structure. A further potential benefit is that concealing internal server or application information can improve the security of a system.[1]
A URL will often comprise a path, script name, and query string. The query string parameters dictate the content to show on the page, and frequently include information opaque or irrelevant to users, such as internal numeric identifiers for values in a database, illegibly encoded data, session IDs, implementation details, and so on. Clean URLs, by contrast, contain only the path of a resource, in a hierarchy that reflects some logical structure that users can easily interpret and manipulate.
| Original URL | Clean URL |
| --- | --- |
| http://example.com/about.html | http://example.com/about |
| http://example.com/user.php?id=1 | http://example.com/user/1 |
| http://example.com/index.php?page=name | http://example.com/name |
| http://example.com/kb/index.php?cat=1&id=23 | http://example.com/kb/1/23 |
| http://en.wikipedia.org/w/index.php?title=Clean_URL | http://en.wikipedia.org/wiki/Clean_URL |
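The contrast in the table can be seen programmatically. The following is a minimal sketch using Python's standard urllib.parse module, applied to the example URLs above; it is illustrative only, not part of any particular site's implementation:

```python
from urllib.parse import urlparse, parse_qs

# A query-string URL exposes implementation details (the .php script,
# the internal numeric id) that a clean URL hides in its path.
original = urlparse("http://example.com/user.php?id=1")
clean = urlparse("http://example.com/user/1")

print(original.path)             # /user.php
print(parse_qs(original.query))  # {'id': ['1']}
print(clean.path)                # /user/1
print(clean.query)               # empty -- no query string at all
```

The clean form carries the identifier as a path segment, so nothing about the server-side technology leaks into the address.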
The implementation of clean URLs involves URL mapping via pattern matching or transparent rewriting techniques. As this usually takes place on the server side, the clean URL is often the only form seen by the user.
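As a sketch of such server-side mapping, a clean path can be translated to an internal script-and-query form with simple pattern matching. The rewrite rules below are hypothetical examples modeled on the table above, not the API of any real framework or server:

```python
import re

# Hypothetical rewrite rules: each public, clean-URL pattern maps to an
# internal representation, so only the left-hand form is ever exposed.
REWRITE_RULES = [
    (re.compile(r"^/user/(\d+)$"), r"/user.php?id=\1"),
    (re.compile(r"^/kb/(\d+)/(\d+)$"), r"/kb/index.php?cat=\1&id=\2"),
]

def rewrite(path: str) -> str:
    """Return the internal URL for a clean path, or the path unchanged."""
    for pattern, template in REWRITE_RULES:
        match = pattern.match(path)
        if match:
            return match.expand(template)
    return path

print(rewrite("/user/1"))   # /user.php?id=1
print(rewrite("/kb/1/23"))  # /kb/index.php?cat=1&id=23
```

Production servers perform the equivalent translation declaratively (for example with Apache's mod_rewrite or nginx rewrite rules), but the principle is the same: the user-visible address and the internal handler are decoupled by a mapping layer.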
For search engine optimization purposes, web developers often take this opportunity to include relevant keywords in the URL and remove irrelevant words. Common words that are removed include articles and conjunctions, while descriptive keywords are added to increase user-friendliness and improve search engine rankings.[1]
A fragment identifier can be included at the end of a clean URL for references within a page, and need not be user-readable.[3]
Some systems define a slug as the part of a URL that identifies a page in human-readable keywords.[4][5] It is usually the end part of the URL (specifically of the path / pathinfo part), which can be interpreted as the name of the resource, similar to the basename in a filename or the title of a page. The name is based on the use of the word slug in the news media to indicate a short name given to an article for internal use.
Slugs are typically generated automatically from a page title but can also be entered or altered manually, so that while the page title remains designed for display and human readability, its slug may be optimized for brevity or for consumption by search engines, as well as providing recipients of a shared bare URL with a rough idea of the page's topic. Long page titles may also be truncated to keep the final URL to a reasonable length.
Slugs may be entirely lowercase, with accented characters replaced by letters from the Latin script and whitespace characters replaced by a hyphen or an underscore to avoid being encoded. Punctuation marks are generally removed, and some systems also remove short, common words such as conjunctions. For example, the title This, That, and the Other! An Outré Collection could have a generated slug of this-that-other-outre-collection.
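The transformation described above can be sketched in Python using only the standard library. The stop-word list here is illustrative; real systems vary in exactly which words and characters they strip:

```python
import re
import unicodedata

# Illustrative stop-word list; actual slug generators differ.
STOP_WORDS = {"a", "an", "and", "the", "or", "of"}

def slugify(title: str) -> str:
    # Replace accented characters with their closest Latin-script letters.
    normalized = unicodedata.normalize("NFKD", title)
    ascii_title = normalized.encode("ascii", "ignore").decode("ascii")
    # Lowercase, strip punctuation, and drop short common words.
    words = re.findall(r"[a-z0-9]+", ascii_title.lower())
    kept = [w for w in words if w not in STOP_WORDS]
    # Join with hyphens so no whitespace needs percent-encoding.
    return "-".join(kept)

print(slugify("This, That, and the Other! An Outré Collection"))
# this-that-other-outre-collection
```

Running it on the article's example title reproduces the slug given above: the accent in "Outré" is folded to "Outre", the punctuation disappears, and "and", "the", and "An" are dropped as stop words.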
Another benefit of URL slugs is that they make it easier to find a desired page in a long list of URLs without page titles, such as a minimal list of opened tabs exported using a browser extension, and that they let a reader preview the approximate title of a target page when a bare hyperlink is shared without one.
If a tool that saves web pages locally uses the string after the last slash as the default file name, as wget does, a slug makes the file name more descriptive.
Websites that make use of slugs include the Stack Exchange Network, which places the question title after the final slash, and Instagram, with its ?taken-by=username URL parameter.[6][7]