By Rob Station
PEER 1's Content Delivery Network (CDN) is remarkably easy to set up and use. Of course, once it is set up, you have many tools at your disposal to fine-tune the CDN to your distribution needs. You can also use our online portal to check your stats and keep your settings up to date. The following guide will help you get your new, faster CDN-enabled site up and running.
Picking A Hostname
Every site starts somewhere, and the first step to getting your site running on the CDN is the hostname. There are two basic rules to follow:
i. The site needs to exist within a single hostname. For example, if you set up 'www.example.com' PEER 1 will cache elements under www.example.com only. It will not cache its linked banner site banner.example.com or its development blog at blog.example.com. You can always add additional hostnames or change existing ones by contacting your sales representative.
ii. The site should not be a bare domain name. Attempts to set up a cache of 'mydomain.com' will break the DNS records for 'mydomain.com'. Only leaf hostnames such as 'content.mydomain.com' or 'www.mydomain.com' will work by default.
If you have an issue with either of these points, there are some slightly more involved technical workarounds that can be employed - just let us know you have a special requirement and we will do our best to meet your needs.
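The single-hostname rule can be illustrated with a short Python sketch. The function name served_by_cdn is hypothetical and is not part of any PEER 1 tool; it only shows which URLs share the exact hostname that was placed on the CDN.

```python
from urllib.parse import urlsplit


def served_by_cdn(url: str, cdn_hostname: str) -> bool:
    """Return True when the URL's host is exactly the CDN hostname.

    Hypothetical helper for illustration: only objects under the one
    configured hostname are cached, so sibling hosts do not match.
    """
    host = urlsplit(url).hostname or ""
    return host.lower() == cdn_hostname.lower().rstrip(".")
```

With 'www.example.com' on the CDN, a URL under www.example.com matches, while one under banner.example.com or blog.example.com does not.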
Setting Up Your Origin Server
Setting up your origin server is simple. PEER 1's CDN only requires the following:
i. Your origin server must be listening on a publicly available IP address on port 80. Our caching network must be able to reach your origin server over the Internet.
ii. Your origin server must have an entry for the hostname you are placing on the CDN. It can be a virtual host or share an IP address with other sites, as long as there is an entry for your CDN hostname.
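One way to check both requirements yourself is to request a page from the origin's IP address on port 80 while sending your CDN hostname in the Host header. The following Python sketch uses only the standard library; the function names are hypothetical, and it assumes that a HEAD request for "/" is a reasonable probe for your site.

```python
import http.client


def origin_probe_headers(cdn_hostname: str) -> dict:
    """Headers that make the origin select the virtual host for the CDN hostname."""
    return {"Host": cdn_hostname, "Connection": "close"}


def probe_origin(origin_ip: str, cdn_hostname: str, timeout: float = 5.0) -> int:
    """Send HEAD / to the origin IP on port 80 and return the HTTP status code.

    A connection error suggests the origin is not publicly reachable; an
    unexpected status may mean there is no entry for the CDN hostname.
    """
    conn = http.client.HTTPConnection(origin_ip, 80, timeout=timeout)
    try:
        conn.request("HEAD", "/", headers=origin_probe_headers(cdn_hostname))
        return conn.getresponse().status
    finally:
        conn.close()
```

For example, probe_origin("123.123.123.123", "www.mydomain.com") would return the status code your origin sends for that hostname, if the origin is reachable from where you run it.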
Taking Stock
After submitting your hostname and origin and having it quickly set up by the Peer 1 team, you will receive a confirmation email containing this document and other important information. It may be a good idea to make note of this in a safe place. The following is relevant to your RapidEdge CDN site and will be referenced throughout this document:
- Your Hostname:
- Your Origin Server's IP:
- Your Cluster Hostname (ex: mysite.cache.peer1.net):
- Mercury Username:
- Mercury Password:
Your Mercury Online Portal
One of the best features of PEER 1's CDN is the online portal, which you can now log into using the username and password provided in your "welcome" email. On the portal, you can do the following:
- Check Reports, including usage history and our remarkably configurable graphs.
- Use Tools, including our Loader and Purger services.
- View and fine tune your cache settings in our Settings tab.
- Read up on various How-Tos in our Resources section.
Require Further Help?
Don't hesitate to contact PEER 1. Our very capable NOC team can answer your questions or escalate your concerns to an engineer.
24/7 Peer 1 NOC
[email protected]
1.866.484.2588
Quick Start Guide
1) Origin Server - make sure your Origin Server is up and accepting requests for your CDN hostname on the IP address you provided to us!
2) The Quick Test - if you'd like, you can test your site before your users see it. You can fool your computer into connecting through Peer 1's CDN without going live to the whole internet!
The following instructions are for Windows, but most modern OSes have a similar feature called a hosts file.
i. From the 'Start Menu' choose 'Run' and enter: 'cmd'.
ii. At the DOS prompt type 'ping cache.peer1.net'. Write down the IP address it gives you. This is the closest Peer 1 CDN cache server to your location.
iii. Text-edit the static hosts file.
In WindowsXP enter the following command:
edit \windows\system32\drivers\etc\hosts
In Windows Server or NT try:
edit \winnt\system32\drivers\etc\hosts
iv. Create a new line in this file for your site's hostname and the closest cache IP address you found in step ii. For example:
69.90.155.252 www.mydomain.com
v. Save the file and test out the cached version of your site. You may need to restart your browser before it works.
vi. DO NOT FORGET - Remove the entry from your hosts file when you are done!
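The hosts file edits in steps iv and vi can also be sketched in Python. This is a minimal illustration that works on the text of a hosts file rather than on the file itself; the function names are hypothetical, and reading or writing the real file (which normally needs administrator rights) is left out.

```python
def add_hosts_entry(hosts_text: str, ip: str, hostname: str) -> str:
    """Return hosts file text with a new 'IP hostname' line appended."""
    lines = hosts_text.splitlines()
    lines.append(f"{ip} {hostname}")
    return "\n".join(lines) + "\n"


def remove_hosts_entry(hosts_text: str, hostname: str) -> str:
    """Return hosts file text without any line that maps the given hostname."""
    kept = []
    for line in hosts_text.splitlines():
        # Ignore trailing comments; the first field is the IP, the rest are names.
        fields = line.split("#", 1)[0].split()
        if hostname in fields[1:]:
            continue
        kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")
```

Adding the test entry and then removing it again should leave the hosts file text exactly as it was before the test.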
3) Going Live - once you are happy with the cache, it is time to get the site running off the cache for the rest of the internet. To do this, internet users need to be told to consult our name servers for the hostname in question. If you don't manage your own DNS then you will have to ask your DNS service provider to do this step for you. (If your DNS provider can't or won't do it, talk to PEER 1 about moving your DNS over to us.)
i. Create a CNAME alias from your hostname to 'cache.peer1.net'. This 'canonical name record' basically instructs DNS to use the IP address for 'cache.peer1.net' as the IP address for your hostname.
ii. Set the TTL for the CNAME record to a reasonably low value (like 5 minutes). The 'time to live' of a record determines how quickly changes to DNS records will propagate. For existing sites, your current TTL will determine how long the switchover will take. The new TTL will determine how long a switchback would take. A five minute TTL would mean a switchover of five minutes. There are some ISPs who force a minimum TTL of five minutes. A BIND zone file example follows:
--
;start
@ IN SOA ns1.mydomain.com. hostmaster.mydomain.com. (
2004090101 ; Serial
10800 ; Refresh
3600 ; Retry
3600000 ; Expire 1000 hours
86400 ) ; Default TTL
IN NS ns1.mydomain.com.
IN NS ns2.mydomain.com.
; origin server IP
IN A 123.123.123.123
; alias to cache server, with a 5 minute TTL.
www 300 IN CNAME cache.peer1.net.
;end
--
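To make the TTL reasoning concrete, here is a small Python sketch. The helper names are hypothetical; the arithmetic simply restates the rule above that the TTL in effect before a change bounds how long resolvers may keep the old answer.

```python
def cname_record(label: str, target: str, ttl_seconds: int) -> str:
    """Format a zone file CNAME line such as 'www 300 IN CNAME cache.peer1.net.'."""
    fqdn = target if target.endswith(".") else target + "."
    return f"{label} {ttl_seconds} IN CNAME {fqdn}"


def worst_case_minutes(ttl_seconds: int) -> float:
    """Longest time, in minutes, that a cached DNS answer may outlive a change."""
    return ttl_seconds / 60
```

If the existing record used the zone's default TTL of 86400 seconds, the switchover could take up to worst_case_minutes(86400), or 1440 minutes (24 hours). With the new 300 second TTL, a later switchback would take about 5 minutes.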
Making It Better
One of the advantages of PEER 1's CDN is that it is so easy to get started with. However, a CDN site can benefit from some additional optimization work. Becoming familiar with the following topics can send you on your way to being a CDN power-user.
Testing Tools
Having a site on a CDN is good, but knowing how it is being cached helps you plan. A few tools let you inspect HTTP headers, the metadata every web server sends along with requested data. These headers contain Cache-Control responses, which show how your origin server is telling our caching network to cache your content. You can also see whether your files are HITs or MISSes on the cache (a HIT was cached in memory; a MISS had to be downloaded from your origin server).
Example
HTTPHeaders is an Internet Explorer plugin that lets you view the headers of the sites you browse.
http://www.blunck.se/iehttpheaders/iehttpheaders.html
LiveHTTPHeaders is a Firefox plugin that does the same thing.
http://livehttpheaders.mozdev.org/
wget is a command-line tool available on most *NIX platforms. Invoke it with the "-S" option to print the server's response headers.
http://www.gnu.org/software/wget/
In addition, you may find checking the [In/Out] comparison graph in your Reports section of your Mercury login helpful.
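If you prefer to script the check, the headers collected by any of these tools can be summarized with a few lines of Python. Note the assumption: this guide does not name the header that reports HIT or MISS, so the sketch looks for "X-Cache", a name many caches use; confirm the actual header in your own responses. The function name is hypothetical.

```python
def summarize_cache_headers(headers: dict) -> dict:
    """Pick out Cache-Control and a HIT/MISS status from response headers.

    Assumption: the cache reports its status in an 'X-Cache' header.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    x_cache = lowered.get("x-cache", "").upper()
    if "HIT" in x_cache:
        status = "HIT"
    elif "MISS" in x_cache:
        status = "MISS"
    else:
        status = "unknown"
    return {"cache_control": lowered.get("cache-control"), "status": status}
```

The input is a plain dictionary of header names and values, for example one built from the output of wget -S.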
Defaults
By default, most content will be configured appropriately for caching. Web servers know that a dynamic file (such as output from a CGI or ASP page) should not be cached, and they mark it appropriately when sending out such objects. For static files (images and HTML pages), a refresh interval based on the creation time of the file can be used. By default, the Peer 1 CDN considers an object fresh for 20% of its age (up to a maximum of 3 days). For example, an image that was created 2 hours ago will be rechecked after 24 minutes, and an HTML page that was created a month ago will be rechecked every 3 days.
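The default rule is simple arithmetic, sketched here in Python. The function and constant names are hypothetical; the 20% figure and the 3 day cap come from the paragraph above.

```python
from datetime import timedelta

# Cap on the default freshness interval, as stated above.
MAX_FRESHNESS = timedelta(days=3)


def default_freshness(age: timedelta) -> timedelta:
    """Return how long an object of the given age is considered fresh.

    20% of the age is one fifth, limited to MAX_FRESHNESS.
    """
    return min(age / 5, MAX_FRESHNESS)
```

A 2 hour old image gives 24 minutes. A 30 day old page would give 6 days, which the cap reduces to 3 days.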
Headers and Dynamic Content
A web server typically marks a dynamically generated page with a Cache-Control response header that marks the page as non-cacheable. While some scripts really are dynamic (meaning that they return a different response for every request), many (like search engines and database-driven sites) can benefit from being cache-friendly. If a script produces output that is reproducible with the same request at a later time (whether it be minutes or days later), it should be cacheable. If the content of the script changes only depending on what's in the URL, it is cacheable; if the output depends on a cookie, authentication information, or other external criteria, it probably isn't.
Headers and What They Do
Cache-Control response headers can do much more than simply mark content as uncacheable or cacheable. Other useful Cache-Control response headers include:
- max-age=[seconds] -- specifies the maximum amount of time that an object will be considered fresh. This directive is relative to the time of the request, rather than an absolute.
- s-maxage=[seconds] -- similar to max-age, except that it only applies to a client ISP's proxy cache. In other words, this directive is ignored by RED, but not by AOL's client proxy cache.
- public -- marks authenticated responses as cacheable; normally, if HTTP authentication is required, responses are automatically uncacheable.
- no-cache -- forces caches to submit the request to the origin server for validation before releasing a cached copy, every time. This is useful to assure that authentication is respected (in combination with public), or to maintain rigid freshness, without sacrificing all of the benefits of caching. This is also the directive commonly used by webservers for dynamic content from CGIs and ASPs.
- no-store -- instructs caches not to keep a copy of the representation under any conditions.
- must-revalidate -- tells caches that they must obey any freshness information you give them about an object. RED caches will serve stale representations under special conditions; specifying this header tells the cache that you want it to strictly follow your rules.
- proxy-revalidate -- similar to must-revalidate, except that it only applies to a client ISP's proxy caches (like s-maxage above).
All of these headers can be defined on your origin server, either by using webserver controls, or through languages like perl, ASP or PHP.
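When checking what your origin actually sends, it can help to split a Cache-Control value into its directives. The following Python sketch is a deliberately simple parser with a hypothetical name. It splits on commas, so it does not handle the rarer quoted forms that contain commas, such as no-cache="Set-Cookie, X-Other".

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control header value into a directive dictionary.

    Directives without an argument map to True; others map to their
    argument as a string, for example {'max-age': '3600'}.
    """
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else True
    return directives
```

For the value "public, max-age=3600, must-revalidate", the result contains public and must-revalidate as flags and max-age with the argument "3600".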
Setting Cache-Control Headers
Apache HTTP Server
^^^^^^^^^^^^^^^^^^
Apache uses optional modules to include headers, including both Expires and Cache-Control. The modules need to be built into Apache; although they are included in the distribution, they are not turned on by default. To find out if the modules are enabled in your server, find the httpd binary and run 'httpd -l'; this should print a list of the available modules. The modules you're looking for are mod_expires and mod_headers. See the module documentation for more information, or send PEER 1 a quick email if you're having trouble.
For Cache-Control headers, the most flexible option is the mod_headers module, which allows you to specify arbitrary HTTP headers for an object.
Here's a quick example .htaccess file that demonstrates the use of some headers.
--
### activate mod_expires
ExpiresActive On
### Expire .gif's 1 month from when they're accessed
ExpiresByType image/gif A2592000
### Expire everything else 1 day from when it's last modified
### (this uses the Alternative syntax)
ExpiresDefault "modification plus 1 day"
### Apply a Cache-Control header to index.html
<Files index.html>
Header append Cache-Control "public, must-revalidate"
</Files>
--
Apache 2.0's configuration is very similar to that of 1.3; see the 2.0 mod_expires and mod_headers documentation for more information.
Microsoft IIS
^^^^^^^^^^^^
Microsoft's Internet Information Server makes it very easy to set headers in a somewhat flexible way. Note that this is only possible in version 4 of the server, which will run only on NT Server.
For versions 5 and 6, here is a good starting point: http://support.microsoft.com/kb/247404
To specify headers for an area of a site, select it in the Administration Tools interface, and bring up its properties. After selecting the HTTP Headers tab, you should see two interesting areas; Enable Content Expiration and Custom HTTP headers. The first should be self-explanatory, and the second can be used to apply Cache-Control headers.
CGI
^^^^
CGI scripts are one of the most popular ways to generate content. You can easily append HTTP response headers by adding them before you send the body; most CGI implementations already require you to do this for the Content-Type header. For instance, in Perl:
#!/usr/bin/perl
print "Content-type: text/html\n";
print "Expires: Thu, 29 Oct 1998 17:04:19 GMT\n";
print "\n";
### the content body follows...
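The Perl example uses a fixed date. As a variation, a script can compute the Expires value relative to the current time. The sketch below shows the idea in Python using the standard library; the function name and the lifetime parameter are hypothetical, and a real CGI script would print the returned header block followed by the content body.

```python
import time
from email.utils import formatdate
from typing import Optional


def cgi_headers(lifetime_seconds: int, now: Optional[float] = None) -> str:
    """Build a CGI header block whose Expires date is lifetime_seconds from now.

    'now' is a Unix timestamp; it defaults to the current time and is
    exposed only so the output can be reproduced.
    """
    now = time.time() if now is None else now
    # usegmt=True yields an HTTP-style date ending in 'GMT'.
    expires = formatdate(now + lifetime_seconds, usegmt=True)
    return f"Content-Type: text/html\nExpires: {expires}\n\n"
```

Calling cgi_headers(86400) produces headers that let caches treat the response as fresh for one day.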
PHP
^^^
By default, objects processed by PHP are non-cacheable. However, developers can set HTTP headers by using the Header() function. For example, this will create a Cache-Control header:
<?php
Header("Cache-Control: must-revalidate");
?>
Remember that the Header() function MUST come before any other output.
See also the cgi_buffer library, which automatically generates cache-friendly headers as well as useful Content-Length generation and gzip Content-Encoding for PHP scripts with a one-line include.
ASP
^^^^
To set cache-related headers in ASP, you can use the properties of the Response object:
<% Response.CacheControl="public" %>
When setting HTTP headers from ASPs, make sure you either place the Response method calls before any HTML generation, or use Response.Buffer to buffer the output. Also, note that some versions of IIS set a Cache-Control: private header on ASPs by default; such pages must be declared public to be cacheable by PEER 1's CDN.
In ASP.NET, Response.Expires is deprecated; the proper way to set cache-related headers is with Response.Cache:
Response.Cache.SetCacheability ( HttpCacheability.Public ) ;
See the MSDN documentation for more information.
Using The Online Portal to Set Caching Defaults
A final way to set caching and freshness information for your site is to use the online portal.
Using this method you can set broad freshness information for your entire site. While not giving you the fine grained control of other methods, using the online portal is a quick and easy way to set some basic site defaults.
To find the Freshness Pattern section of the online portal, log in and click the Settings tab. Select Cache from the services menu.
Under the Sitewide Freshness Pattern heading, you can set a Maximum Freshness level. By default, it is set at 1440 minutes (24 hours). You can keep this setting, or you can change it to another value that suits your site better. Once you have entered a number that you are happy with, click the Create button.
This new change will take about 20 minutes to propagate.