Saturday, June 4, 2011

hash-bang URL

Hash-bang URLs are URLs in which '#!' follows directly after the domain: for example, twitter.com/anoop becomes twitter.com/#!/anoop. The part of the URL that uniquely identifies the content of the page is then added at the end. The technique is aimed at improving performance by avoiding a full page reload when only a small piece of the page needs to change. But it does come with serious downsides.

Uniform Resource Locator
A Uniform Resource Locator specifies the location of a certain resource, such as a web page. Since a location also identifies a resource, any URL is also a URI, or Uniform Resource Identifier. However, a URL specifies not just the location of a resource but also the method for accessing it: the scheme or protocol.

Here, we focus on web addresses that use the HTTP protocol, and ignore such things as MAILTO, FTP or FILE, as well as ports, embedded usernames and passwords. An HTTPS address is the same as any regular HTTP URL, with the added requirement that it uses a secure connection.
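
As a rough illustration of the parts discussed in the sections that follow, the sketch below splits an HTTP address with Python's standard urllib.parse module. The address itself is a made-up example, not one taken from a real site.

```python
from urllib.parse import urlsplit

# Hypothetical address, used only for illustration.
parts = urlsplit("http://www.domain.com/blog/1975/10/?sort=date#comments")

print(parts.scheme)    # http
print(parts.hostname)  # www.domain.com
print(parts.path)      # /blog/1975/10/
print(parts.query)     # sort=date
print(parts.fragment)  # comments
```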

Domain
The www is not part of the domain. It's merely a subdomain that is commonly used by websites. Whether you use www.domain.com or just domain.com, both addresses should get visitors to one and the same website.
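
One way to make both addresses lead to the same place is to pick one form and rewrite the other to it. The function below is a minimal sketch of that idea, assuming you have chosen the bare domain as the canonical form; the name strip_www is hypothetical, and ports or embedded credentials are ignored, as in the rest of this post.

```python
from urllib.parse import urlsplit, urlunsplit

def strip_www(url: str) -> str:
    """Return the URL with a leading 'www.' removed from the host."""
    parts = urlsplit(url)
    host = parts.netloc
    if host.lower().startswith("www."):
        host = host[len("www."):]
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))

print(strip_www("http://www.domain.com/about"))  # http://domain.com/about
print(strip_www("http://domain.com/about"))      # http://domain.com/about
```

A server would typically answer the non-canonical form with a permanent redirect to the address this function returns.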

Path
The path is one of the most important parts of URL design and should be created like a folder structure, using forward slashes, regardless of your backend server setup. Each unique page of your website or web application should have its own unique path.
Keep your paths as short as possible.

Query strings
The majority of websites enable visitors to search. This is what query strings are best for, as well as related actions such as filtering and sorting the contents of a page.
Many server-side systems have misused query string parameters to serve different pages of a site, such as domain.com/index.php?p=aboutme. Other sites went one step further and rewrote search query strings as a path. These are both bad practices. A query string should be treated as an optional addition to the page; the URL should work to produce a valid and useful page even when it is removed.
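
A minimal sketch of treating the query string as optional: every parameter has a default, so the page still renders when the query string is removed. The parameter names (sort, filter) and their default values are assumptions made up for this example.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical options a listing page might accept, with their defaults.
DEFAULTS = {"sort": "date", "filter": "all"}

def page_options(url: str) -> dict:
    """Read optional settings from the query string, falling back to defaults."""
    query = parse_qs(urlsplit(url).query)
    return {name: query.get(name, [default])[0] for name, default in DEFAULTS.items()}

print(page_options("http://domain.com/news"))
# {'sort': 'date', 'filter': 'all'}
print(page_options("http://domain.com/news?sort=title"))
# {'sort': 'title', 'filter': 'all'}
```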

Fragment Identifiers
The fragment identifier is the only part of the URL that doesn't get sent to the server hosting the page. Instead, it is meant to identify a specific location inside the resulting page.
Browsers can navigate between multiple fragment identifiers without reloading the page. Since this is a desirable user experience, browser vendors created the HTML5 history API, which is the appropriate technique for navigating around sites without triggering page reloads.
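
To make the first point concrete, the sketch below computes what a browser would put in the request line for a given address. The function name request_target is hypothetical; the point is only that the fragment is dropped before anything is sent.

```python
from urllib.parse import urlsplit

def request_target(url: str) -> str:
    """Return the path and query a browser would send; the fragment is left out."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target

print(request_target("http://twitter.com/anoop"))     # /anoop
print(request_target("http://twitter.com/#!/anoop"))  # /
```

For the hash-bang address the server only ever sees a request for the site root, so it cannot know which content was asked for.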

Breaking the Agreement
A Uniform Resource Locator is a Uniform Resource Identifier that specifies where an identified resource is available and the mechanism for retrieving it. A hash-bang based URL insufficiently specifies the mechanism for retrieving the content, as it requires a JavaScript round trip to the server after the server has already sent the browser an HTML page, a page that doesn't yet have the content associated with the requested URL. And that's all assuming the JavaScript is not filtered out by some proxy server or firewall, and doesn't contain any errors anywhere in the page. When users turn JavaScript off in their browser, these sites stop working. As if having the entire site rely on fragile techniques weren't bad enough, hash-bangs are a one-way street to permanent maintenance and support.
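
The mapping between a hash-bang address and the plain path it stands for is simple, as the sketch below shows. The function name is hypothetical, and it only illustrates the logic: since the fragment never reaches the server, a real site would have to run the equivalent code in the browser, which is exactly the dependency described above.

```python
from urllib.parse import urlsplit, urlunsplit

def hashbang_to_path(url: str) -> str:
    """Turn domain/#!/some/page into domain/some/page; leave other URLs unchanged."""
    parts = urlsplit(url)
    if not parts.fragment.startswith("!"):
        return url
    path = parts.fragment[1:]
    if not path.startswith("/"):
        path = "/" + path
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))

print(hashbang_to_path("http://twitter.com/#!/anoop"))  # http://twitter.com/anoop
```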

Bad Practices
There are many different ways to design your URLs. We should know what makes for bad URL design in order to fully understand and appreciate good URL design.

Page Identification Hashes
Some content management systems or blog engines identify each unique page with a long string of random characters. If your CMS or site engine generates such URLs, find out how to override or turn off that behaviour immediately. There are only downsides to these URLs.

Session Hashes
While not as bad as when used for pages, hashes used for sessions on your site are still bad. They negatively affect SEO. The bigger concern is that most systems employing them use SHA-1, which is relatively insecure, certainly for user sessions or logins containing any sensitive data.

File Extensions
URLs should be free of .php, .aspx and so forth. File extensions are not forward compatible, so if you change backend systems and all your URLs contain .aspx, you are forced to do server-side rewriting for every single page on your site. The .html extension isn't really recommended either, but if you're confident you'll only serve the pages you're building as static files, it's an acceptable technique.
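
If you are stuck with old addresses that do carry extensions, the rewriting could start from something like the sketch below, which maps a legacy path to its extension-free form. The function name and the list of extensions are assumptions chosen for the example.

```python
import posixpath

# Backend-specific extensions to drop; an illustrative, not exhaustive, list.
LEGACY_EXTENSIONS = {".php", ".aspx", ".html"}

def strip_extension(path: str) -> str:
    """Return the path without a known backend file extension."""
    root, ext = posixpath.splitext(path)
    return root if ext.lower() in LEGACY_EXTENSIONS else path

print(strip_extension("/about/team.aspx"))  # /about/team
print(strip_extension("/about/team"))       # /about/team
```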

Non ASCII Characters
Sites whose primary content language uses a non-Latin script are somewhat excused, but accented Latin characters and non-basic punctuation are best avoided.

Underscores
These have poor usability and SEO value, and no tangible benefits over hyphens.
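
Both of the last two points can be handled when a title is turned into a path segment. The function below is one possible sketch, assuming Latin-script titles: it drops accents, lowercases the text and joins the words with hyphens. The name slugify is a common convention, not something prescribed here.

```python
import re
import unicodedata

def slugify(title: str) -> str:
    """Build a lowercase, ASCII-only, hyphen-separated path segment from a title."""
    # Decompose accented letters, then drop everything outside ASCII.
    ascii_text = unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    # Collapse each run of spaces, underscores or punctuation into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")

print(slugify("Café_Société: Opening Night!"))  # cafe-societe-opening-night
```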

Keyword Stuffing
Adding multiple keywords to URLs may help with SEO, but it will confuse users. Also, you will very soon be marked as a keyword spammer.

Good Practices
While it is important to know what techniques to avoid, it is obviously more worthwhile to know which you should use.

Robust URL Mapping
When a URL is passed on to someone else, the recipient's environment might wrap it across two lines. This is most common with blog posts that include a full date and long title in the URL.
One good solution is to keep your URLs shorter than 70 characters, but that is not always ideal. Furthermore, relational database systems are typically quick to look up ID values but slower to look up strings.
With large amounts of traffic, this can become a bottleneck serious enough to bring the server down.
Robust mapping can solve these problems for you. By embedding a unique ID early on in your path, you can have long, fully descriptive URLs when needed but still enjoy the reliability of shorter URLs and ID lookups.
Take the URL domainname.com/news/1975-this-post-needs-two-lines. In this example, '1975' is the ID value of the database record for this particular post. Your CMS should need only this part of the URL to do a successful lookup: domainname.com/news/1975.
Everything after that is optional: good for SEO, but it won't matter if it gets wrapped onto two lines.
The only downside is that IDs are not very human-friendly.
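
A minimal sketch of how such a lookup might read the ID, assuming posts live under /news/ and the ID is followed by an optional hyphenated title. The pattern and the function name are illustrative; your CMS will have its own routing.

```python
import re

# Digits directly after /news/, optionally followed by "-any-descriptive-text".
POST_PATH = re.compile(r"^/news/(\d+)(?:-[^/]*)?/?$")

def post_id(path: str):
    """Return the numeric post ID embedded in the path, or None if there is none."""
    match = POST_PATH.match(path)
    return int(match.group(1)) if match else None

print(post_id("/news/1975-this-post-needs-two-lines"))  # 1975
print(post_id("/news/1975-this-post-"))                 # 1975
print(post_id("/news/1975"))                            # 1975
print(post_id("/news/about"))                           # None
```

The second call shows the benefit: a URL cut off at a line wrap still resolves to the same record.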

Hackable URLs
A good hackable URL is one in which a human can adjust or remove parts of the path and get expected results from your site. Such URLs give your visitors better orientation around your pages and enable them to easily move up levels. Take, for example, domain.com/blog/1975/10/12/article. Trimming it back to each forward slash should produce predictable results: domain.com/blog/1975/10/12/ should return all posts published on 12th October 1975, while domain.com/blog/1975/10/ should return an overview of October 1975 posts.
How detailed you should be about designing such URLs depends on the site's content and audience. The more topical content is, the more it benefits from publication dates in the URL; the more frequently new content gets published, the more it benefits from finer granularity. No matter how detailed your URLs end up being, they should ultimately be completely hackable.
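
One way to check a design for hackability is to list every address a visitor could reach by trimming the path back one level at a time, then confirm that each one returns something sensible. The helper below is a sketch of that listing; its name is hypothetical.

```python
def parent_paths(path: str) -> list:
    """List the paths obtained by cutting the path back to each forward slash."""
    segments = [segment for segment in path.split("/") if segment]
    return ["/" + "/".join(segments[:count]) + "/"
            for count in range(len(segments) - 1, 0, -1)]

for parent in parent_paths("/blog/1975/10/12/article"):
    print(parent)
# /blog/1975/10/12/
# /blog/1975/10/
# /blog/1975/
# /blog/
```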

Namespaces
The top-level section of the path is the most valuable real estate in a URL. If the site enables users to sign up and have their own profile at this level, you should create a blacklist of usernames containing all current and possible future features you may wish to have.
Namespacing features behind the username, such as /lists or /followers, is a great solution for public features that belong to each user individually. Private things, such as account settings, should never be namespaced behind the username, and should appear after /account or /settings.
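
A sign-up check against such a blacklist could look like the sketch below. The reserved names are only examples; the real list depends on the features your own site has or may grow.

```python
# Illustrative reserved names; extend with every current and planned top-level feature.
RESERVED_NAMES = frozenset({
    "account", "settings", "login", "logout", "signup",
    "search", "help", "about", "blog", "news", "api",
})

def is_username_available(username: str) -> bool:
    """Reject usernames that would collide with a top-level section of the site."""
    return username.strip().lower() not in RESERVED_NAMES

print(is_username_available("anoop"))     # True
print(is_username_available("Settings"))  # False
```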