The analysis program has three main functions:
Now that we've covered the executive summary, let's move on to the introduction:
http://groucho.gcal.ac.uk/SupportStuff/mac-specific.html#macsupport -> /ISO/ISOmain.html
http://web.mit.edu/mugs/www/fmug.htm -> /ISO/ISOmain.html
http://www.mindspring.com/~fmpro/reference.html -> /ISO/ISOmain.html
[Actually, it's not all the references. We try to ignore the references within a site because we're most interested in the links that brought a browser into your site from somewhere outside of it.]
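If you want to poke at entries like these yourself, here is a minimal Python sketch of how a line in the "referring URL -> target" format could be split, and how within-site references could be skipped. It is not the code our analysis program uses. The function names and the `own_host` parameter are made up for illustration, and it assumes the plain format shown above, without the access time.

```python
from urllib.parse import urlsplit


def parse_referer_line(line):
    """Split 'referring-URL -> /target/path' into a (referer, target) pair."""
    referer, sep, target = line.strip().partition(" -> ")
    if not sep:
        return None  # no ' -> ' separator, so not a referer entry
    return referer, target


def external_entries(lines, own_host):
    """Yield (referer, target) pairs whose referer is not on own_host.

    own_host is a hypothetical parameter: the host name of your own site.
    """
    for line in lines:
        entry = parse_referer_line(line)
        if entry is None:
            continue
        referer, target = entry
        host = urlsplit(referer).hostname or ""
        if host != own_host:
            yield referer, target


SAMPLE = [
    "http://web.mit.edu/mugs/www/fmug.htm -> /ISO/ISOmain.html",
    "http://www.mindspring.com/~fmpro/reference.html -> /ISO/ISOmain.html",
    "http://your.server/index.html -> /ISO/ISOmain.html",  # within-site, skipped
]

for referer, target in external_entries(SAMPLE, "your.server"):
    print(referer, "->", target)
```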
At NetYard we've modified the above format to include the access time, so that you can link the referer.log entry into the access logs and track a particular browser through the site from their first contact...
If your site is an online brochure that you personally refer people to, then the referer log may not be very useful to you.
If your site is used to bring in prospective clients, then you are probably interested in finding out how people got to your site, so that you can try to bring in more people.
Knowing where your visitors are coming from can help you tailor your site to match the visitor.
Knowing which links they are following in can tell you whether an exchanged link or a purchased link is working.
Knowing what queries people are entering into search engines can help you write your pages to fit those queries so that you can rank higher in the search results.
http://your.server/sec-bin/referer.pl
Don't forget that it's password protected with your ftp userid and password (sorry visitors, examples are coming).
Now a little sample output:
Sample Referer stats for netyard server:
Host Counts:
28 webcrawler.com
19 guide-p.infoseek.com
15 lycos.com
13 altavista.digital.com
12 excite.com
<SNIP>
URL Counts:
19 http://guide-p.infoseek.com/Titles
15 http://www.webcrawler.com/cgi-bin/WebQuery
15 http://www.lycos.com/cgi-bin/pursuit
13 http://webcrawler.com/cgi-bin/WebQuery
12 http://www.excite.com/search.gw
7 http://www.altavista.digital.com/cgi-bin/query
6 http://altavista.digital.com/cgi-bin/query
<SNIP>
TEXT search WORD Counts:
27 web
25 hosting
8 Web
7 Hosting
5 resell
5 host
4 service
3 server
2 virtual
2 Service
<SNIP>
Sigh, I hate it when two-day-old pages are out of date :-). The URLs in the listing above are
now shown as active <A HREF=...> links, and if you turn on the raw log entries,
each URL is shown completely and can be followed back to the search page or
remote page containing the link that the client followed.
You say there's a discrepancy between the host report and the URL report? (Good eyes :-) Right and wrong. The host report is a summary, and we dropped any www. prefixes from host names, so www.webcrawler.com and webcrawler.com got combined for the host counts, but are reported separately in the URL listing.
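To make the difference concrete, here is a small Python sketch of the two ways of counting. It is an illustration and not the analysis program itself: `host_key` and `url_key` are made-up names, and dropping the query string for the URL counts is our assumption from the sample output above.

```python
from collections import Counter
from urllib.parse import urlsplit


def host_key(url):
    """Host name with any leading 'www.' dropped, as in the host report."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def url_key(url):
    """URL without its query string (assumed from the URL report above)."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}{parts.path}"


# Referring URLs taken from the sample URL listing.
referers = [
    "http://www.webcrawler.com/cgi-bin/WebQuery",
    "http://webcrawler.com/cgi-bin/WebQuery",
    "http://www.altavista.digital.com/cgi-bin/query",
]

host_counts = Counter(host_key(u) for u in referers)
url_counts = Counter(url_key(u) for u in referers)

# The two webcrawler entries share one host count but keep separate URL counts.
for host, count in host_counts.most_common():
    print(count, host)
for url, count in url_counts.most_common():
    print(count, url)
```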
The "TEXT search WORD Counts" indicate that the searches leading folks into this area are very heavily weighted towards "web hosting" which makes sense. It's a good short summary of what people are looking for.
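A word count like this can be sketched in a few lines of Python. This is a guess at the general idea, not the program's actual logic. Each search engine puts the search terms in a differently named query parameter, so the sketch simply takes the words from every parameter value. The function name `search_words` and the example URLs (on the placeholder host search.example) are made up. Counting is case-sensitive, which matches the sample above, where "web" and "Web" are listed separately.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def search_words(referer):
    """Return the words found in the query-string values of a referring URL."""
    query = urlsplit(referer).query
    words = []
    for _name, value in parse_qsl(query):  # parse_qsl decodes '+' and %xx
        words.extend(value.split())
    return words


# Hypothetical referring URLs; real engines use their own parameter names.
referers = [
    "http://search.example/cgi-bin/query?q=web+hosting",
    "http://search.example/cgi-bin/query?q=Web+hosting+service",
]

word_counts = Counter()
for referer in referers:
    word_counts.update(search_words(referer))

for word, count in word_counts.most_common():
    print(count, word)
```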
Now, can you see the correspondence between the three check boxes on the form, and the three output areas (host count, URL count, word count)?
Some comments about the require and exclude fields. These are actually space-separated lists, and the match is simply a substring match.
Usually you would list a host name (or several) that you either wanted to analyze, or exclude from the analysis, in the "Refering URL" column. But this is more flexible.... You could enter a require value of "hosting" or "cgi-bin" (both part of the URLs of referer entries from the search engines) to get some of the search engine entries.
The Target URL column behaves the same way, but controls the destination of the reference ... which is obviously on your server, so the host name isn't even part of the log entry... just the directory/file name component.
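Here is a rough Python sketch of how such a require/exclude test could work on one column. It is our illustration only. The function name `matches` is made up, and we assume that a value passes when it contains at least one of the require terms (if any are given) and none of the exclude terms.

```python
def matches(value, require="", exclude=""):
    """Substring filter using space-separated require and exclude lists.

    Assumption: any one require term is enough; any exclude term rejects.
    """
    required = require.split()
    excluded = exclude.split()
    if required and not any(term in value for term in required):
        return False
    return not any(term in value for term in excluded)


referer = "http://www.lycos.com/cgi-bin/pursuit"
target = "/ISO/ISOmain.html"

# Keep the entry only if both columns pass their own filters.
keep = matches(referer, require="hosting cgi-bin") and matches(target, exclude="/images/")
print(keep)
```

The same function serves both columns: the referring URL is tested against one pair of lists and the target path against the other.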
The final two check boxes are for detail work.
The "Display Query Terms" check box will display each actual search string as it is encountered.
The "Display Raw Log Entry" check box will display the complete line from the referer log.
This is probably best used with restrictions on the referring or target URLs.... if
you wanted the complete log file it would be faster to just download it :-).
I think that's it!