MT-MostVisited Plugin
SYNOPSIS
<ol>
<MTMostVisited blogurl="/blog/archives"
logfile="/var/log/httpd/access_log*"
count="10">
<li> <a href="<$MTEntryPermalink$>"><$MTEntryTitle$></a>
of <MTEntryDate format="%B %e, %Y">
(<$MTMostVisitedCount$> hits)
</li>
</MTMostVisited>
</ol>
DESCRIPTION
This plugin determines the most popular entries on your blog, based upon the results of the Apache webserver access logs. It makes this information available via the MTMostVisited container tag, which can contain MTMostVisitedCount and MTMostVisitedLink tags, as well as any MTEntry-type tags.
This information can be worth displaying if some of your most popular entries have few comments, or if they have fallen off your index page.
Please note: Your webserver must be Apache for this to work, and Apache must be configured to write access logs. (This was the default for Red Hat Linux, at least, and I expect it to be true for the majority of sites running Movable Type.)
Tags made available through this plugin:
- MTMostVisited - Container tag, which loops through the most-requested pages on your blog. It has a number of attributes, which are described under CONFIGURATION.
- MTMostVisitedCount - Returns the number of hits that a particular entry has received. Used in conjunction with MTMostVisited.
- MTMostVisitedLink - The URL of the particular entry. Used in conjunction with MTMostVisited. The MTEntryLink or MTEntryPermalink tags can be used instead.
- MTEntryXXX - MTEntry tags can be used within the MTMostVisited container, too.
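As a rough illustration of the idea behind these tags, the following Python sketch tallies requests per entry from Apache access-log lines and returns the most-requested entries. It is not the plugin's code (the plugin is written in Perl and relies on Apache::ParseLog). The function name top_entries, the hard-coded /blog/archives prefix, and the sample log lines are assumptions made for this example.

```python
import re
from collections import Counter

# Matches the request field of an access-log line, e.g.
# "GET /blog/archives/000043.html HTTP/1.1", and captures the path.
REQUEST_RE = re.compile(r'"(?:GET|HEAD|POST) (\S+) HTTP/[\d.]+"')

# Assumption for this sketch: individual entries live under /blog/archives
# and are named by an all-digit entry ID, e.g. 000043.html.
ENTRY_RE = re.compile(r"^/blog/archives/(\d+)\.html$")


def top_entries(log_lines, count=10):
    """Return [(entry_id, hits), ...] for the most-requested entries."""
    hits = Counter()
    for line in log_lines:
        request = REQUEST_RE.search(line)
        if not request:
            continue  # malformed line or a method this sketch ignores
        entry = ENTRY_RE.match(request.group(1))
        if entry:
            hits[entry.group(1)] += 1  # non-entry pages are skipped
    return hits.most_common(count)


sample = [
    '1.2.3.4 - - [19/May/2003:08:31:00 -0500] "GET /blog/archives/000043.html HTTP/1.1" 200 5120',
    '1.2.3.5 - - [19/May/2003:08:32:00 -0500] "GET /blog/archives/000043.html HTTP/1.1" 200 5120',
    '1.2.3.6 - - [19/May/2003:08:33:00 -0500] "GET /blog/archives/000001.html HTTP/1.0" 200 2048',
    '1.2.3.7 - - [19/May/2003:08:34:00 -0500] "GET /blog/index.html HTTP/1.1" 200 9000',
]
print(top_entries(sample))
# -> [('000043', 2), ('000001', 1)]
```

Each pair in the result corresponds roughly to one pass through the MTMostVisited container: the entry ID selects the entry, and the hit count is what MTMostVisitedCount would report.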
INSTALLATION
The plugin is available as a zip archive; the download URL is listed under AUTHOR / SUPPORT.
To install, place the mt-mostvisited.pl file in your Movable Type 'plugins' directory.
This plugin also requires Akira Hangai's Apache::ParseLog module, available via CPAN (www.cpan.org). It can either be installed within your server's Perl libraries (if root access is available), or placed within the extlib subdirectory of your Movable Type home. That is, the complete installation should look like:
- (mt home)/plugins/mt-mostvisited.pl
- (mt home)/extlib/Apache/ParseLog.pm
Refer to the Movable Type documentation for more information regarding plugins.
The apachelog.pl script, included with this distribution, is a command-line Perl script that also parses the Apache log. It outputs to the console, thus making it easier to check the configuration of the plugin. It is NOT needed to implement the Movable Type functionality.
CONFIGURATION
The configuration of the plugin is done via attributes of the MTMostVisited element.
- filetype - This specifies the extension of your individual archive files. It defaults to "html". This can also be a regex-like expression, like "html|php". (Thanks to Eric James Stone for the regex suggestion, and sample implementation.)
- blogurl - This specifies the URL of your individual archive directory, relative to your domain name. For example, if your individual entry archives can be found at http://www.borlik.net/blog/archives/000001.html, then blogurl should be "/blog/archives". Note that there should be no trailing slash. Also note that the value is used within a regular expression, so multiple directories can be searched with the right wildcards. For example, if you store your individual entries in category directories (e.g. "/blog/archives/work" and "/blog/archives/play"), then you can use a regex wildcard like "/blog/archives/.*". (Please let me know if that functionality does not work as intended.)
- logfile - This plugin needs to know the location of the webserver access logs. The logfile attribute specifies that location on the server. Wildcards are acceptable if you have multiple logfiles (e.g. rotating logs). An example would be "/var/log/httpd/access_log*". If you are not running your own server, ask your host for help.
- ziplogfiles - This attribute is similar to the "logfile" attribute and is used in the same way. Some webservers automatically gzip older log files. Logs specified with this attribute will be unzipped and then read like ordinary logs. Please note that this assumes that the server has gunzip installed in the default path and that a temporary file can be written to the current directory.
- count - This specifies the number of entries to enumerate. By default, only the top ten will be listed.
- cleanbuild - If set to "1", plugin errors will be written out as HTML and template processing will not be interrupted. This can be considered a debugging aid. Some errors, such as missing requirements or Perl compilation errors, will still screw things up.
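To make the attribute semantics more concrete, here is a hedged Python sketch of how blogurl and filetype could be combined into one regular expression, and how the logfile and ziplogfiles wildcards could be expanded. It is an illustration only. The helper names entry_pattern and iter_log_lines are invented for this example, the plugin's real implementation is Perl, and this sketch reads gzipped logs in place instead of calling gunzip and writing a temporary file.

```python
import glob
import gzip
import re


def entry_pattern(blogurl="/blog/archives", filetype="html"):
    """Build a regex for entry URLs from the two attributes.

    Both values are inserted unescaped, so regex syntax such as
    "/blog/archives/.*" or "html|php" keeps working.
    """
    return re.compile(r"^(?:%s)/(\d+)\.(?:%s)$" % (blogurl, filetype))


def _expand(wildcard):
    """Return the sorted list of files matching a wildcard (or nothing)."""
    return sorted(glob.glob(wildcard)) if wildcard else []


def iter_log_lines(logfile=None, ziplogfiles=None):
    """Yield lines from every plain and gzipped log matching the wildcards."""
    for path in _expand(logfile):
        with open(path, "r", errors="replace") as handle:
            yield from handle
    for path in _expand(ziplogfiles):
        with gzip.open(path, "rt", errors="replace") as handle:
            yield from handle


pattern = entry_pattern("/blog/archives/.*", "html|php")
print(pattern.match("/blog/archives/work/000012.php").group(1))  # 000012
print(pattern.match("/blog/archives/work/about.html"))           # None
```

In this sketch a logfile wildcard such as "access_log*" would also match a file named access_log.2.gz and read it as plain text, so the two wildcards should be chosen so that they do not overlap.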
PERFORMANCE NOTES AND BUGS
Some webserver log files can be quite large, especially on a popular site. This plugin may slow down your blog rebuilding. It hasn't been a problem for me, but then again, nobody likes me.
One technique, suggested by Eric James Stone on the gday mate website, is to create a separate template just for the MTMostVisited section, and use the MTInclude directive to include it in one's index page. The extra template can be rebuilt separately (maybe once a day or something, and possibly via the various build scripts that are out there). Such a technique is a necessity if you wish to include MTMostVisited stuff on individual archive pages. The MTFastInclude plugin may be useful for this, too.
Monthly archives are not handled well, nor are popup-image pages. (A critical assumption of this plugin is that all of the files within the archive directory - $root - are named according to their entry IDs.) Because the monthly archives are named differently, the plugin cannot load them in order to process MTEntry-type tags. This hasn't been an issue for me, since my most-accessed files are individual archives.
THE MOST COMMON MISTAKE is not installing Apache::ParseLog. Without that module, this plugin will break. (Does anyone know of a way of detecting if a module exists?) This often results in a compile error, with a strange error message (such as "You used an MTEntryLink tag outside the context of an entry" or some such nonsense). (The correct Perl compilation error is dumped into the webserver error log, but not shown to the MT user.)
Any assistance or suggestions would be appreciated. In particular, I would like to offer this plugin to bloggers who use entry titles rather than entry IDs for the naming of the archive web page.
COPYRIGHT
Copyright 2004, Jeffrey Borlik. All rights reserved.
This program is free software; you can redistribute it and/or modify it, as long as this copyright notice is kept intact and the original author is given credit.
DISCLAIMER
This package is so distributed WITHOUT ANY WARRANTY in that any use of the data generated by this package must be used at the user's own discretion, and the author shall not be held accountable for any results from the use of this package.
CREDITS
I would like to give credit to Timothy Appnel, whose MT plugin article was quite helpful, and to Brad Choate, whose MT-SQL plugin served as a useful starting point for integrating the MTEntry tag functionality. Akira Hangai's Apache::ParseLog made this plugin possible, of course. And there have been many users who have offered useful ideas and sample implementations, including Ian Fenn, Zack, and Eric James Stone.
CHANGELOG
- 1.0 - Initial release
- 1.1 - Removed unneeded $conf parameter
- 1.2 - Ignore nonentry pages (via a test for only digits)
- 1.3 - Removed pod, and greatly improved error messages
- 2.0 - Moved configuration out of plugin. Allow wildcards for most configs.
AUTHOR / SUPPORT
MT-MostVisited was written and is maintained by Jeffrey Borlik. Please feel free to email me with problems or suggestions.
The latest version of the plugin archive can be found at http://www.borlik.net/~jborlik/mt-mostvisited.zip
Webpage: http://www.borlik.net
Webpage for this plugin: http://www.borlik.net/blog/archives/000043.html
Email: jborlik.DONTSPAM.ATSYMBOL.earthlink.IGNORE.net
Support is also available at the Movable Type forums: http://www.movabletype.org/support/index.php?act=ST&f=20&t=20333&s=63aa7fee167f1be5388a03bfc8c7be0c
And at the mt-plugins site: http://mt-plugins.org/archives/entry/mostvisited.php
Comments
what if you dont have the ID in the URL?
Posted by: Faf | May 19, 2003 08:31 AM
Faf - There needs to be some sort of mapping between the URL and the EntryID. URL==EntryID is the simplest, and is the default for MT. However, I think that there are many ways of loading up the MT::Entry object. (For example, if the title of the article is unique, and the URL is based on that, then the relationship could possibly be established.)
Posted by: Jeff | May 19, 2003 08:07 PM
this plugin doesnt work on my index is it suppose to go in the archives?
Posted by: iced glare | May 22, 2003 01:56 PM
iced glare - I run the plugin in my index, but I don't see any reason why it shouldn't run in an archive template. For some debugging help, check out the Movable Type support forum. If you have access, try running the apachelog.pl Perl script and see if you get real output. Also note that you will probably have to do a little configuration of the plugin.
Posted by: Jeff | May 22, 2003 11:23 PM
i am on linux and I checked what you posted at the forum and still nothing
on my index it says "no hits to archives".
Posted by: iced glare | May 23, 2003 06:43 AM
my archives are not /blog/archives just /archives
Posted by: iced glare | May 23, 2003 06:46 AM
OK I am going to post to the forum like implied!!
Posted by: iced glare | May 23, 2003 11:28 AM
whats new?
Posted by: iced glare | June 27, 2003 02:00 PM
Hi Jeffrey,
I have my movabletype posts archived into directories named after their category so I modified your script to include:
if (exists $args->{"root"}) { $root = $args->{"root"}; }
Then I've set the root via the movabletype tag itself so that I can have category specific listings. Does that make sense?
All the best,
--
Ian
Posted by: Ian Fenn | August 1, 2003 04:13 PM
Ian - That is an excellent way of doing it! Actually, I should make the plugin so that all of the configuration is via the tags. Not only would it simplify installation for some users, but it would also allow clever uses like yours. I'll make the change in the next version.
Posted by: Jeff | August 5, 2003 10:31 PM
when's the next version coming out? I found my paths and it still doesn't work :-/
Posted by: iced glare | August 16, 2003 08:05 PM
I think I got the strangest error output for this. Here it is:
"Build error in template 'Main Index': Error in tag: Error in MTMostVisited, while loading EntryID "000014": This does not appear to be a valid EntryID. It should be all numbers."
Any idea how to fix this? The entry is all number.
Thanks!
Posted by: mentor | September 6, 2003 06:25 AM
Works great. One thing I would like to see is the ability to provide the log path as a tag parameter instead of hard coding it in the script. I have multiple blogs running and they have different Apache log files. So right now I can only use the plugin on one site or the other.
Posted by: Brandon Fuller | September 15, 2003 10:46 AM
Thanks for this plugin. Works great.
Do you know if "search_files" can be set to look for access logs like "access.log.2003-10-03.gz"?
Posted by: Zack | October 3, 2003 04:12 AM
I think its amusing that this post is now the most visited post listed on your home page. ;)
Seriously though, its a great idea for a plugin. Unfortunately, my archives are named thusly: http://www.flexistentialist.org/archives/2003/10/15/comment_spam_the.shtml where the filename is the title dirified and trimmed to 16 characters. Without the entry id in the URL, its hard to figure out how I would make this script work... Perhaps a combination of the date and the title, but that would be a bit of a kludge, and would not be futureproof (which is why I set the urls up this way in the first place).
Anyway, I'll check back in periodically to see if you come up with anything, and if I have any great ideas, I'll be sure to let you know. :)
Posted by: sam | October 15, 2003 10:21 PM
WHat causes the following error:
"Build error in template 'Main Index': Error in tag: You used an 'MTEntryLink' tag outside of the context of an entry; perhaps you mistakenly placed it outside of an 'MTEntries' container?
"
??
Is it something in the PL or PM file being out of whack?
Posted by: John F | October 29, 2003 10:02 PM
I am running windows 2003 server with ActiveState Perl 5.8 and Apache 2.047.
Currently there isn't support for the Apache::ParseLog module for ActivePerl 5.8.
Is there any other solution for this ?
Regards,
-m
Posted by: morpheous | November 4, 2003 05:26 AM
Morpheous - You should be able to download and use Apache::ParseLog off of CPAN (see the link in the text), as it is just a single Perl file (I think). I'm not sure where you would place the file in the ActiveState setup, but I suppose that you could put it where the rest of the Perl library files reside. - J
Posted by: Jeff | November 4, 2003 02:07 PM
Looks like someone either didn't install MT Blacklist OR someone found a way around it :(
Posted by: John F | November 13, 2003 12:32 PM
Hi there I just wanted to let you know that I finally got this to work :-)
Posted by: iced glare | December 15, 2003 06:24 PM
"$logdir The directory where your site's webserver log files are written."
Well my server said that that dir is deep in the root of the server and that they cannot give out that info. So I'm at a bust here. There must be a simpler way...
Any ideas?
Thanks.
Posted by: Eliah Holiday | December 27, 2003 03:08 PM
2.0 works swell as well :-)
Posted by: iced glare | January 8, 2004 08:42 AM
for some reason my hits seem to be reseting everyday ever since i put 2.0
why is that?
Posted by: iced glare | January 9, 2004 10:15 PM
Iced glare - It is probably because your webserver rotates the logs every day. I doubt that your server discards the older logs, though.... Try adding a wildcard to the "logfile" attribute. (E.g., if your log is normally at /x/access_log, but you also have /x/access_log.1 and /x/access_log.2, then those files probably contain the older data. Use "/x/access_log*" as your logfile attribute.)
Posted by: Jeff | January 10, 2004 05:22 PM
OK, I believe I've figured out a way for this to work with blog entry archives that do not follow the usual numeric pattern.
It involves:
1. Creating a special index template to generate a file of Perl code that creates an "ID Map" hash. The keys in this hash are the paths to each individual entry, and the values are the entry ID's. (I'm sure there must be an easier way to do this from within the plugin itself, instead of using an index template, but I don't know how to do it.)
2. Adapting the mt-mostvisited.pl file to read in the file with the hash code. Then, it uses the keys to determine whether an entry has been visited, and the values to assign the visit to the correct entry.
3. Adding a "idmap" attribute to the MTMostVisited tag, with the path to the file generated by the template.
I've posted the code and some limited documentation at: http://ericjamesstone.com/script_archive/mt_mostvisited.htm
Jeff, feel free to incorporate any of this into your next version. What I have done is backwards compatible for people who do not need this functionality. Thanks for all the work you've put into this.
Posted by: Eric James Stone | January 14, 2004 04:49 PM
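The ID-map idea in the comment above can be sketched as follows. This is a Python illustration under stated assumptions, not Eric's Perl code: the map contents, the entry IDs, and the function name count_hits_by_id are made up for the example.

```python
from collections import Counter

# Hypothetical ID map: archive path -> entry ID. In the approach described
# above, this table would be generated by an index template.
ID_MAP = {
    "/archives/2003/10/15/comment_spam_the.shtml": 57,
    "/archives/2003/10/16/another_entry.shtml": 58,
}


def count_hits_by_id(requested_paths, id_map):
    """Tally hits per entry ID, skipping paths that are not in the map."""
    hits = Counter()
    for path in requested_paths:
        entry_id = id_map.get(path)
        if entry_id is not None:
            hits[entry_id] += 1
    return hits


paths = [
    "/archives/2003/10/15/comment_spam_the.shtml",
    "/archives/2003/10/15/comment_spam_the.shtml",
    "/archives/2003/10/16/another_entry.shtml",
    "/index.html",
]
print(count_hits_by_id(paths, ID_MAP).most_common())
# -> [(57, 2), (58, 1)]
```

Because the lookup is by exact path, any naming scheme works as long as the map is regenerated whenever entries are added or renamed.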
There is an error in version 2.0. Line 52 in mt-mostvisited.pl is:
require File::Copy;
That gives an error for me when rebuilding. It should be:
use File::Copy;
Thanks for the update.
Posted by: Zack | January 26, 2004 02:26 AM
HI,
I am getting the following error:
Build error in template 'Main Index': Error in tag: Error in MTMostVisited, while loading EntryID "000133": This does not appear to be a valid EntryID. It should be all numbers.
No idea, what this means?
Also I would like to note, that if readme.txt and the other perlscript are in the plugin directory, mt.cgi dies ...*ugly*
Posted by: arved | January 26, 2004 07:56 AM
I also get the darn "Build error in template 'Main Index': Error in tag: Error in MTMostVisited, while loading EntryID "000515": This does not appear to be a valid EntryID. It should be all numbers." error. The Entry is OBVIOUSLY just a number. What causes this?
Posted by: - - e r i k - - | January 30, 2004 04:10 AM
That "Invalid EntryID" comes about because MT was unable to find an entry with that EntryID. Could it be that the file in your archives directory is actually NOT an individual entry archive? Let me know if that is true.
A number of people have mentioned that problem. I suppose that I could change the program to ignore that particular entry entirely. Thoughts?
Posted by: Jeff | January 30, 2004 06:47 PM
How would I go about showing the number of views per entry?
Like 100 views 17 comments 4 trackbacks
Posted by: iced glare | February 4, 2004 08:52 PM
iced glare - If I understand you correctly, you are suggesting that each entry could have a "views" property (based upon the number of hits from the webserver log). That's a good idea, but really isn't addressed by this plugin as it is set up right now. Because it would be done on a per-entry basis, the webserver log certainly couldn't be parsed on each request. (It takes too long now!) Your idea is definitely worth thinking about for the next version. - Jeff
Posted by: Jeff | February 4, 2004 09:51 PM
My blog is on a subdomain. MT is installed to /mt/ for instance:
http://mydomain.com/mt/mt.cgi
Blog:
http://blog.mydomain.com/
Can I put the full path (URL)? (http://blog.mydomain.com/archives)
Posted by: Rob | March 26, 2004 12:49 PM
Rob - The "blogurl" parameter points to your individual archive directory. The subdomain doesn't matter, I think.
It sounds like it should be "/archives".
Make sure your "logfile" parameter is set correctly, as it should point to the access log file somewhere on your host's filesystem.
Posted by: Jeff | March 26, 2004 07:21 PM
Hi there, wonder if you have any clues here. I'm getting an error:
Build error in template 'Main Index': Error in tag: Error in MTMostVisited. A webserver log was not found. The plugin was looking for log files at "logfile" (i.e. "/var/log/httpd/access_log*"). Please double-check the location of your Apache webserver access log, and possibly change the "logfile" attribute tag.
I'm on Mac OS X, I have an apache logfile at /private/var/log/httpd/access_log but if I use that path in the attribute I get the same error (except it mentions the /private path). MT is installed at /Library/Webserver/CGI-executables. Is this a permissions problem?
Posted by: tim | July 7, 2004 04:41 PM
Does anyone have any luck getting this to work on Pair Network servers? It's driving me insane - I cannot get the plugin to recognize the path to the logfiles...
Posted by: Nick | November 22, 2005 04:33 AM
Hello,
any chance you could give a few hints on how to make this work with MT3.3x?
cheers
paul
Posted by: Paul | January 20, 2007 02:35 AM