# This is a sample configuration file for the Webalizer (ver 2.01).
# Lines starting with a pound sign '#' are comment lines and are ignored.
# Blank lines are skipped as well.
# Other lines have the form "ConfigOption  Value", where ConfigOption is a
# valid configuration keyword and Value is the value to assign to that
# option.  Invalid keywords/values are ignored, with appropriate warnings
# being displayed.  There must be at least one space or tab between the
# keyword and its value.
#
# The Webalizer looks for a 'default' configuration file named
# "webalizer.conf" in the current directory and, if it is not found there,
# for "/etc/webalizer.conf".


# VisitTimeout sets the timeout for a visit (aka session).  Visits are
# determined by comparing the time of the current request with the time of
# the last request from the same site.  If the difference is greater than
# the VisitTimeout value, the request is considered a new visit, and visit
# totals are incremented.  The value is the timeout in seconds
# (default=1800=30min).
#VisitTimeout	1800
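#
# Illustration only (the request times are made up, not from a real log):
# with VisitTimeout 1800, requests from one site at 10:00:00, 10:20:00 and
# 11:00:00 count as two visits.  The second request arrives 1200 seconds
# after the first (same visit); the third arrives 2400 seconds after the
# second, which exceeds 1800, so it starts a new visit.
# A sketch of a 15 minute timeout instead (900 seconds):
#VisitTimeout	900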

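# PageType defines which types of URLs are considered a 'page'.  This
# permits excluding things such as images and audio files from page counts.
# If no types are specified, defaults are used ('htm*', 'cgi' and
# HTMLExtension if different for web logs, 'txt' for ftp logs).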
PageType	htm*
PageType	cgi
#PageType	phtml
#PageType	php3
#PageType	pl

# The Hide*, Group*, Ignore* and Include* keywords change the way Sites,
# URLs, Referrers, User Agents and Usernames are manipulated.
#
# Ignore* keywords cause records to be ignored completely, as if they
# didn't exist (and thus they are not counted in the main site totals).
# Hide* keywords prevent items from being displayed in the 'Top' tables,
# but the items are still counted in the main totals.
# Group* keywords allow similar objects to be grouped as if they were one.
#   Grouped records are displayed in the 'Top' tables and can optionally
#   be displayed in BOLD and/or shaded.
#   Groups cannot be hidden, and are not counted in the main totals.
#   Group* keywords do not, by default, hide the items that they match.
#   To hide the matching records (so that just the grouping record is
#   displayed), follow the Group* keyword with a Hide* keyword that has
#   the same value (see the GroupAgent/HideAgent example below).
#   Group* keywords may have an optional label, which is displayed instead
#   of the keyword's value.  The label should be separated from the value
#   by at least one 'white-space' character, such as a space or tab.
#
# The value can have either a leading or trailing '*' wildcard character.
# If no wildcard is found, a match can occur anywhere in the string.
# Given a string "www.yourmama.com", the values "your", "*mama.com" and
# "www.your*" will all match.

# Your own site should be hidden
#HideSite	*mrunix.net
#HideSite	localhost
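#
# A short illustration of the three value forms described above.  The name
# example.com is a hypothetical placeholder, and the reading of each form
# is inferred from the "www.yourmama.com" example rather than quoted from
# the program's documentation:
#   *example.com   leading wildcard, matches names ending in example.com
#   www.example*   trailing wildcard, matches names starting with www.example
#   example        no wildcard, matches anywhere in the name
#HideSite	*example.com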

# Your own site gives most referrals
#HideReferrer	mrunix.net/

# This one hides non-referrers ("-" Direct requests)
#HideReferrer	Direct Request

# Usually you want to hide these
HideURL		*.gif
HideURL		*.GIF
HideURL		*.jpg
HideURL		*.JPG
HideURL		*.png
HideURL		*.PNG
HideURL		*.ra

# Hiding agents is kind of futile
#HideAgent	RealPlayer

# You can also hide based on authenticated username
#HideUser	root
#HideUser	admin

# Grouping options
#GroupURL	/cgi-bin/*	CGI Scripts
#GroupURL	/images/*	Images

#GroupSite	*.aol.com
#GroupSite	*.compuserve.com

#GroupReferrer	yahoo.com/	Yahoo!
#GroupReferrer	excite.com/     Excite
#GroupReferrer	infoseek.com/   InfoSeek
#GroupReferrer	webcrawler.com/ WebCrawler

#GroupUser      root            Admin users
#GroupUser      admin           Admin users
#GroupUser      wheel           Admin users

# The following gives an overall total for each browser, without
# displaying all the detail records.
#GroupAgent	MSIE		Micro$oft Internet Exploder
#HideAgent	MSIE
#GroupAgent	Mozilla		Netscape
#HideAgent	Mozilla
#GroupAgent	Lynx*		Lynx
#HideAgent	Lynx*

# HideAllSites forces individual sites to be hidden in the report.  This
# is particularly useful with the "GroupDomains" feature, but could be
# useful in other situations as well, such as when you only want to
# display grouped sites (with the GroupSite keywords...).  The default
# value is 'no', which allows individual sites to be displayed.
#HideAllSites	no

# GroupDomains groups individual hostnames into their respective domains.
# The value specifies the level of grouping to perform, and can be thought
# of as 'the number of dots' that will be displayed.  For example, if a
# visiting host is named cust1.tnt.mia.uu.net, a domain grouping of 1 will
# result in just "uu.net" being displayed, while a 2 will result in
# "mia.uu.net".  The default value of zero disables this feature.  Domains
# will only be grouped if they do not match any existing "GroupSite"
# records, which allows you to override this feature with your own
# GroupSite records if desired.
#GroupDomains	0
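#
# Continuing the cust1.tnt.mia.uu.net example, the grouping levels work
# out as follows.  Level 3 is extrapolated from the 'number of dots' rule
# and is not stated in the text above:
#   GroupDomains 1  ->  uu.net
#   GroupDomains 2  ->  mia.uu.net
#   GroupDomains 3  ->  tnt.mia.uu.net
# A sketch that reports only domain totals, combining this keyword with
# HideAllSites as suggested earlier:
#GroupDomains	1
#HideAllSites	yes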

# GroupShading allows grouped rows to be shaded in the report.  This is
# useful if you have lots of groups and individual records that
# intermingle in the report, and you want to differentiate the group
# records a little more.  The default is 'yes'.
#GroupShading	yes

# GroupHighlight allows the group record to be displayed in BOLD.
#GroupHighlight	yes

# The Ignore* keywords allow you to completely ignore log records based
# on hostname, URL, user agent, referrer or username.  I hesitated in
# adding these, since the Webalizer was designed to generate _accurate_
# statistics about a web server's performance.  By choosing to ignore
# records, the accuracy of reports becomes skewed, negating why I wrote
# this program in the first place.  However, due to popular demand, here
# they are.  Use them the same way as the Hide* keywords, where the value
# can have a leading or trailing wildcard '*'.  Use at your own risk ;)
#IgnoreSite	bad.site.net
#IgnoreURL	/test*
#IgnoreReferrer	file:/*
#IgnoreAgent	RealPlayer
#IgnoreUser     root

# The Include* keywords allow you to force the inclusion of log records
# based on hostname, URL, user agent, referrer or username.  They take
# precedence over the Ignore* keywords.  Note: using Ignore/Include
# combinations to selectively process parts of a web site is _extremely
# inefficient_!!!  Avoid doing so if possible (ie: grep the records to a
# separate file if you really want that kind of report).

# Example: Only show stats on Joe User's pages...
#IgnoreURL	*
#IncludeURL	~joeuser*

# Or based on an authenticated username
#IgnoreUser     *
#IncludeUser    someuser
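#
# A sketch of the grep alternative mentioned above, run from a shell
# instead of using IgnoreURL/IncludeURL.  The log path is a hypothetical
# placeholder.  Since input defaults to STDIN when no log file is given,
# the filtered records can be piped straight in:
#
#   grep '~joeuser' /path/to/access_log | webalizer
#
# This is only a rough filter: grep keeps any record that contains the
# string, including records where it appears only in the referrer field.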

# LogFile defines the web server log file to use.  If not specified here
# or on the command line, input will default to STDIN.  If the log
# filename ends in '.gz' (ie: a gzip compressed file), it will be
# decompressed on the fly as it is being read.
#LogFile        /var/lib/httpd/logs/access_log

# LogType defines the log type being processed.  Normally, the Webalizer
# expects a CLF or Combined web server log as input.  Using this option,
# it can process ftp logs as well (xferlog as produced by wu-ftp and
# others), or Squid native logs.  Values can be 'clf', 'ftp' or 'squid',
# with 'clf' the default.
#LogType	clf
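#
# A sketch for processing a gzip compressed ftp transfer log.  The path is
# a hypothetical placeholder.  With ftp logs, the default PageType
# described earlier becomes 'txt':
#LogFile	/path/to/xferlog.gz
#LogType	ftp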

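# OutputDir is the directory where the output files are written.  This
# should be a full path name, although relative paths might work as well.
# If no output directory is specified, the current directory will be used.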
#OutputDir      /var/lib/httpd/htdocs/usage

# HistoryName specifies the name of the history file produced by the
# Webalizer.  The history file keeps the data for up to 12 months worth
# of logs, used for generating the main HTML page (index.html).  The
# default is a file named "webalizer.hist", stored in the specified
# output directory.  If you specify just the filename (without a path),
# it will be kept in the specified output directory.  Otherwise, the path
# is relative to the output directory, unless it is absolute (leading /).
#HistoryName	webalizer.hist

# Incremental processing allows multiple partial log files to be used.
# The Webalizer will save its internal state before exiting, and restore
# it the next time it is run, in order to continue processing where it
# left off.  This mode also causes the Webalizer to scan for and ignore
# duplicate records.  The value may be 'yes' or 'no', with a default of
# 'no'.  The file 'webalizer.current' is used to store the current state
# data, and is located in the output directory of the program (unless
# changed with the IncrementalName option below).
#Incremental	no

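# IncrementalName specifies the name of the file used to save the
# incremental state data.  It is similar to the HistoryName option, in
# that the name is relative to the specified output directory unless an
# absolute filename is specified.  The default is "webalizer.current".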
#IncrementalName	webalizer.current
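#
# A sketch of an incremental setup for partial (rotated) logs.  Both paths
# are hypothetical placeholders.  Each run reads the newest partial log
# and resumes from the state saved in the output directory:
#LogFile		/path/to/access_log.1.gz
#OutputDir		/path/to/usage
#Incremental		yes
#IncrementalName	webalizer.current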

# ReportTitle is the text to display as the report title.  The hostname
# is appended to the end of this string to generate the final full title
# string.  The default is "Usage Statistics for".
#ReportTitle    Usage Statistics for

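# HostName defines the hostname for the report.  It is prepended to the
# URL table items, which allows clicking on URLs in the report to go to
# the proper location in the event the report is run on a 'virtual' web
# server, or for a server different than the one the report resides on.
# If not specified, the hostname is obtained via a system call or, failing
# that, defaults to "localhost".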
#HostName       localhost

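# HTMLExtension specifies the filename extension to use for generated HTML
# pages.  It defaults to "html", but can be changed if needed (for
# example, for PHP embedded pages).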
#HTMLExtension  html

# UseHTTPS specifies that links to URLs should use 'https://' instead of
# 'http://'.  This only changes the behaviour of the 'Top URLs' table.
# The default is 'no'.
#UseHTTPS       no
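#
# A sketch for a report on a secure virtual host.  The name
# www.example.com and the URL are hypothetical placeholders.  Assuming a
# logged URL of /docs/page.html, the 'Top URLs' table entry would be
# expected to link to https://www.example.com/docs/page.html:
#HostName       www.example.com
#UseHTTPS       yes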

# DNSCache specifies the DNS cache filename to use for reverse DNS
# lookups.  This file must be specified if you wish to perform name
# lookups on the IP addresses found in the log file.  If an absolute path
# is not given as part of the filename (ie: one starting with a leading
# '/'), then the name is relative to the default output directory.  See
# the DNS.README file for additional information.
#DNSCache	dns_cache.db

# DNSChildren specifies the number of "child" processes used to perform
# DNS lookups in order to create or update the DNS cache file.  If
# specified, the DNS cache file will be created/updated each time the
# Webalizer is run, prior to processing.  The default value of zero (0)
# disables DNS cache file creation/updates at run time.  Reasonable
# values are between 5 and 20.  See the DNS.README file for additional
# information.
#DNSChildren	0
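#
# A sketch that enables reverse DNS lookups at run time.  Both keywords
# are needed: DNSCache names the cache file, and a non-zero DNSChildren
# turns on cache creation/updates.  The value 10 is simply one choice from
# the 5 to 20 range suggested above:
#DNSCache	dns_cache.db
#DNSChildren	10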

# HTMLPre defines HTML code to insert at the very beginning of the file.
# The default is a DOCTYPE line.  Max line length is 80 characters;
# multiple entries are permitted.
#HTMLPre 

# HTMLHead defines HTML code to insert within the <HEAD></HEAD> block.
# Maximum line length is 80 characters; multiple entries are permitted.
#HTMLHead 

# HTMLBody defines the HTML code to be inserted, starting with the <BODY>
# tag.  If not specified, a default is used.  Maximum line length is 80
# characters; multiple entries are permitted.
#HTMLBody 

# HTMLPost defines the HTML code to insert immediately before the first
# <HR> on the document, which is just after the title and
# "summary period"-"Generated on:" lines.  It can be used to clean up in
# case an image was inserted with HTMLBody.  As with HTMLHead, you can
# define as many of these as needed and they will be inserted in the
# output stream in order of appearance.  Max string size is 80 characters.
#HTMLPost

# HTMLTail defines the HTML code to insert at the bottom of each HTML
# document, usually to include a link back to your home page or insert a
# small graphic.  It is inserted as a table data element (ie: <TD> your
# code here </TD>) and is right aligned with the page.  Max string size
# is 80 characters.
#HTMLTail 100% Micro$oft free!

# HTMLEnd defines the HTML code to add at the very end of the generated
# files.  If used, you must specify the </BODY> and </HTML> closing tags
# as the last lines.  Max string length is 80 characters.
#HTMLEnd

# The Quiet option suppresses output messages...  Useful when run as a
# cron job to prevent bogus e-mails.  Values can be either "yes" or "no".
# Default is "no".  It does not suppress warnings and errors, which are
# sent to stderr.
#Quiet		no

# ReallyQuiet suppresses all messages, including errors and warnings.
# The default is 'no'.  If 'yes' is used here, it cannot be overridden
# from the command line, so use with caution.  A value of 'no' has no
# effect.
#ReallyQuiet	no

# TimeMe forces the display of timing information at the end of
# processing.  A value of 'yes' will force the timing information to be
# displayed.
#TimeMe		no

# GMTTime allows reports to show GMT (UTC) time instead of local time.
# Default is to display the time the report was generated in the timezone
# of the local machine, such as EDT or PST.
#GMTTime		no

# Debug prints additional information for error messages.  It causes the
# Webalizer to dump bad records/fields instead of just telling you it
# found a bad one.  The default is "no".  It shouldn't be needed unless
# you start getting a lot of Warning or Error messages and want to see
# why.  Warning and error messages are printed to stderr.
#Debug		no

# FoldSeqErr forces the Webalizer to ignore sequence errors.  This is
# useful for web servers that do not guarantee that log records will be
# in chronological order.  It causes out of sequence log records to be
# treated as if they had the same time stamp as the last valid record.
# Default is to ignore out of sequence log records.
#FoldSeqErr	no

# IgnoreHist shouldn't be used in a config file, but it is here just
# because it might be useful in certain situations.  If the history file
# is ignored, the main "index.html" file will only report on the current
# log file's contents.  Useful only when you want to reproduce the
# reports from scratch.  USE WITH CAUTION!  Default is "no".
#IgnoreHist	no

# The following keywords allow individual graphs and statistics tables
# to be disabled.  The default for each is 'yes'.
#CountryGraph	yes
#DailyGraph	yes
#DailyStats	yes
#HourlyGraph	yes
#HourlyStats	yes

# GraphLegend allows the color coded legends to be turned on or off.
# The default is for them to be displayed.  This only toggles the color
# coded legends; the other legends are not changed.
#GraphLegend	yes

# GraphLines allows you to have index lines drawn behind the graphs.
# The value is the number of lines displayed.  Default is 2; the lines
# can be disabled by using a value of zero ('0').  [max is 20]
# The lower the better, with 1,2,3,4,6 and 10 producing nice results.
#GraphLines	2

# The "Top" options below define the number of entries for each table.
# Defaults are Sites=30, URLs=30, Referrers=30, Agents=15 and
# Countries=30.  TopKSites and TopKURLs (the by-KByte tables) both
# default to 10, as do the top entry/exit tables (TopEntry/TopExit).
# The top search strings and usernames default to 20.  Tables may be
# disabled by using zero (0) for the value.
#TopSites	30
#TopKSites	10
#TopURLs	30
#TopKURLs	10
#TopReferrers	30
#TopAgents	15
#TopCountries	30
#TopEntry	10
#TopExit	10
#TopSearch	20
#TopUsers	20

# The All* keywords allow the display of all URLs, Sites, Referrers,
# User Agents, Search Strings and Usernames.  If enabled, a separate HTML
# page will be created, and a link will be added to the bottom of the
# appropriate "Top" table.  There are a couple of conditions for this to
# occur.  First, there must be more items than will fit in the "Top"
# table (otherwise it would just be duplicating what is already
# displayed).  Second, the listing will only show those items that are
# normally visible, which means it will not show any hidden items.
# Grouped entries will be listed first, followed by individual items.
# The default for each keyword is 'no'.  Note that these pages can be
# quite large, particularly the sites page, and separate pages are
# generated for each month, which can consume a lot of disk space.
#AllSites	no
#AllURLs	no
#AllReferrers	no
#AllAgents	no
#AllSearchStr	no
#AllUsers	no

# Normally, the string 'index.' is stripped off the end of URLs in order
# to consolidate URL totals.  For example, the URL /somedir/index.html is
# turned into /somedir/ which is really the same URL.  IndexAlias allows
# specifying additional strings to treat in the same way.  The string
# 'index.' is always scanned for.  Multiple entries will degrade
# performance.  The string is scanned for anywhere in the URL, so a
# string of 'home' would turn the URL /somedir/homepages/brad/home.html
# into just /somedir/ which is probably not what was intended.
#IndexAlias	home.htm
#IndexAlias	homepage.htm

# The MangleAgents keyword allows you to specify how much, if any, the
# Webalizer should mangle user agent names.  This allows several levels
# of detail to be produced when reporting user agent statistics.  There
# are six levels that can be specified, which define different levels of
# detail suppression.  Level 5 shows only the browser name (MSIE or
# Mozilla) and the major version number.  Level 4 adds the minor version
# number (single decimal place).  Level 3 displays the minor version to
# two decimal places.  Level 2 will add any sub-level designation (such
# as Mozilla/3.01Gold or MSIE 3.0b).  Level 1 will attempt to also add
# the system type if it is specified.  The default Level 0 displays the
# full user agent field without modification and produces the greatest
# amount of detail.  User agent names that can't be mangled will be
# left unmodified.
#MangleAgents	0

# The SearchEngine keywords allow specification of search engines and
# their query strings on the URL.  These are used to locate and report
# what search strings are used to find your site.
# The first word is a substring to match in the referrer field that
# identifies the search engine, and the second is the URL variable used
# by that search engine to define its search terms.
SearchEngine	yahoo.com	p=
SearchEngine	altavista.com	q=
SearchEngine	google.com	q=
SearchEngine	eureka.com	q=
SearchEngine	lycos.com	query=
SearchEngine	hotbot.com	MT=
SearchEngine	msn.com		MT=
SearchEngine	infoseek.com	qt=
SearchEngine	webcrawler	searchText=
SearchEngine	excite		search=
SearchEngine	netscape.com	search=
SearchEngine	mamma.com	query=
SearchEngine	alltheweb.com	query=
SearchEngine	northernlight.com	qr=

# The Dump* keywords allow the dumping of Sites, URLs, Referrers, User
# Agents, Usernames and Search strings to separate tab delimited text
# files, suitable for import into most database or spreadsheet programs.

# DumpPath specifies the path to dump the files.  If not specified, it
# will default to the current output directory.  Do not use a trailing
# slash ('/').
#DumpPath	/var/lib/httpd/logs

# DumpHeader specifies whether a header record is written to the file.
# The header is written as the first record of the file, and contains
# the labels for each field written.  Files that are intended to be
# imported into a database system will not need a header record, while
# spreadsheets usually do.  Value can be either 'yes' or 'no', with 'no'
# being the default.
#DumpHeader	no

# DumpExtension specifies the filename extension to use for the dump
# files.  The default is "tab", but since some programs are particular
# about the filenames they use, you may change it here (for example, use
# "csv").
#DumpExtension	tab

# These control the dumping of each individual table.  The value can be
# either 'yes' or 'no'.  The default is 'no'.
#DumpSites	no
#DumpURLs	no
#DumpReferrers	no
#DumpAgents	no
#DumpUsers	no
#DumpSearchStr	no

# End of configuration file...

# Sample Webalizer configuration file
# Copyright 1997-2000 by Bradford L. Barrett (brad@mrunix.net)
#
# Distributed under the GNU General Public License.  See the
# files "Copyright" and "COPYING" provided with the webalizer
# distribution for additional information.