2011-08-06

Multiple Virtual Hosts on a Cloud Server

I have spent some time setting up an inexpensive cloud server to host my hobby domains, and I have cleared enough hurdles to think I can document the process better than any single document I've found. (I have received lots of help from various open source project mailing lists, particularly from Mark Sapiro at Mailman-Users@python.org.)  Follow me as I retrace my steps.

The domains:
  • mygnus.com (the "real" domain)
  • usafa-1965.org (virtual: site for my college classmates)
  • highlandsprings61.org (virtual: site for my high school classmates)
Domain registrar:

Cloud server hosting:
The goals:
  • use SSL/TLS encryption for all web access
  • host multiple mailing lists for each virtual domain with one GNU Mailman instance
  • use an RDBMS for Apache2 Digest authentication for access to user contact data (also on an RDBMS)
    • I first used MySQL, but could not get the current Ubuntu Apache2 mod_dbd to work
    • now I'm using the latest release of PostgreSQL (9.0.4), built from source, with which I am very satisfied
The cloud server:
  • Ubuntu 10.04 LTS (64-bit), 256 MB RAM, 10 GB disk space, single CPU
The primary software needing configuration (all from Ubuntu packages):
Other resources required:
To be continued...

2011-06-16

Parsing Apache Access Logs with Mixed Formats

I recently changed the LogFormat for my Apache server from the default common log format (CLF) to the "vhost_common" format, and I couldn't find an easy way to handle access logs that mix the two formats. The easiest Perl module I found to use is "Parse::AccessLogEntry" (from CPAN.org), but it only works with the common log format. However, with a simple hack I can now parse logs with either or both formats. The trick is that the "vhost_common" format has exactly one more token than the CLF, and that token is the first one on the log line. It is easily detected, so it can be removed if present, and the rest of the line, which is then in CLF, can be handled normally.

Following is a fragment of code from the log line parsing loop showing the hack:

# the incoming line may be in CLF or vhost_common format
# split the line on space to tokenize it
my @d = split(' ', $line);
next if !defined $d[0];
my $vhost = $d[0];
# the vhost token is in format "servername:port"
# and the next token (the first in the CLF format) is
# the remote host address in format "xxx.xxx.xxx.xxx" (IPv4),
# so the presence of the ':' tells us which format we have
my $idx = index $vhost, ':';
if ($idx >= 0) {
  # we have detected the vhost info
  # I remove the port info, you may want it
  $vhost = substr $vhost, 0, $idx;
  # remove the vhost token from the list
  shift @d;
  # reconstitute the log line into the CLF
  $line = join(' ', @d);
}
else {
  # we didn't find a vhost so set it to zero
  $vhost = 0;
}
# parse the CLF
my $href = $p->parse($line);
if ($vhost) {
  # add the vhost to the hash
  $href->{vhost} = $vhost;
}

I'm sure the code can be improved, but it does work as is.
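
For anyone who wants to try the idea without installing Parse::AccessLogEntry, here is a minimal standalone sketch of just the detection step. It is a variation on the fragment above, not the code I actually run: the function name split_vhost and the sample log lines are made up for illustration, and it uses a regular expression instead of index/substr. Because the server name part may not contain a ':', an IPv6 client address at the start of a plain CLF line should not be mistaken for a vhost token.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): split an optional
# leading "servername:port" token off a log line.
# Returns ($vhost, $clf_line); $vhost is undef for a plain CLF line.
sub split_vhost {
    my ($line) = @_;
    # vhost lines begin with "servername:port " where the port is numeric
    # and the server name contains no ':' characters
    if ($line =~ /^([^\s:]+):\d+\s+(.*)$/) {
        return ($1, $2);
    }
    return (undef, $line);
}

# Made-up sample lines, one in each format
my @samples = (
    'example.org:443 192.0.2.10 - - [16/Jun/2011:10:00:00 -0500] "GET / HTTP/1.1" 200 512',
    '192.0.2.10 - - [16/Jun/2011:10:00:00 -0500] "GET / HTTP/1.1" 200 512',
);
for my $line (@samples) {
    my ($vhost, $clf) = split_vhost($line);
    printf "%-12s | %s\n", defined $vhost ? $vhost : '(none)', $clf;
}
```

The second return value is always a CLF line, so it can be passed straight to $p->parse() as in the loop above.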

2011-06-08

SSL Certificates, Virtual Web Sites, Perl for Apache AuthDigest

Note that I have changed two of my favorite links: (1) Debian is now my Linux distribution of choice and (2) I had to fall back to the more powerful Apache HTTPD server because of features I could not find elsewhere.

I have just finished two Perl scripts to aid in Apache htdigest handling for large numbers of users (a database solution would be better, but I don't have the time for that at the moment). The first uses the *nix program pwgen to generate reasonably secure clear-text passwords for an input list of user names. The second takes an input list of user name and clear-text password pairs and produces an Apache htdigest file. I will post them if I get any interested responses.
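
Until I post the scripts, here is a rough sketch of the core of the second one. It is not the actual script. Each line of an htdigest file has the form user:realm:hash, where the hash is the MD5 hex digest of the string "user:realm:password". The function name htdigest_line, the realm, and the user/password pairs below are all made up for illustration; a real script would read the pairs from a file.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Build one htdigest line: user:realm:MD5(user:realm:password)
# (hypothetical helper name, ASCII input assumed)
sub htdigest_line {
    my ($user, $realm, $password) = @_;
    my $ha1 = md5_hex(join ':', $user, $realm, $password);
    return join ':', $user, $realm, $ha1;
}

# Hypothetical realm and input pairs; the realm must match the
# AuthName used in the Apache configuration
my $realm = 'Members Only';
my @pairs = (['alice', 'secret1'], ['bob', 'secret2']);
print htdigest_line($_->[0], $realm, $_->[1]), "\n" for @pairs;
```

Redirecting the output to a file should give something usable with AuthUserFile, assuming the realm matches.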

I have been working part time on multiple virtual hosts on my single cloud httpd server and have learned much--with lots to go. The main sites are two for my college and high school graduating classes and one for my company (and I would be very interested in critiques of them):

In the process of building the web sites I found a source for a reasonably priced SSL certificate for multiple virtual hosts on a single server: StartSSL, which I highly recommend.

2011-02-05

An Aid for DocBook Authors

I have a Perl program in progress which is being discussed on the docbook-apps mailing list. Executing it at the CLI shows:

$ ./make_index_markup.pl
Usage: ./make_index_markup.pl [-h |--help][options...]  \
    <DocBook xml file(s)...>
Use the '-h' or '--help option for details.


The current version of the program, four companion class modules, and a SHA-512 checksum file for each of them are available at my Dropbox site (http://dropbox.com/) at these links (note the class module names have changed):

  http://dropbox.com/u/18447611/make_index_markup.pl
  http://dropbox.com/u/18447611/DocBookSearch.pm
  http://dropbox.com/u/18447611/DocBookToken.pm
  http://dropbox.com/u/18447611/DocBookTokenList.pm
  http://dropbox.com/u/18447611/DocBookWordHash.pm

The links are the same for the checksum files but with the extension ".sha512" added.
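
If you want to check a download against its checksum file from Perl, something like the following sketch should do, using the core Digest::SHA module. The function names are made up, and I am assuming the first whitespace-separated token in the ".sha512" file is the hex digest (the layout the sha512sum utility writes); adjust if a file looks different.

```perl
use strict;
use warnings;
use Digest::SHA;

# Return the lowercase hex SHA-512 digest of a file, read in binary mode
sub sha512_of_file {
    my ($path) = @_;
    return Digest::SHA->new(512)->addfile($path, 'b')->hexdigest;
}

# Compare a file against its checksum file.
# Assumption: the first whitespace-separated token on the first line
# of the checksum file is the hex digest.
sub verify_sha512 {
    my ($path, $sum_path) = @_;
    open my $fh, '<', $sum_path or die "Cannot open $sum_path: $!";
    my $first = <$fh>;
    close $fh;
    my ($expected) = defined $first ? split(' ', $first) : ();
    return 0 unless defined $expected;
    return lc($expected) eq sha512_of_file($path) ? 1 : 0;
}

# Usage: perl verify.pl <file> <file.sha512>
if (@ARGV == 2) {
    print verify_sha512(@ARGV) ? "OK\n" : "MISMATCH\n";
}
```

For example, after downloading make_index_markup.pl and make_index_markup.pl.sha512 into the same directory, pass both names on the command line.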

Note: Some people have reported problems getting the files. I discovered that the links posted here didn't always match the correct file names, which probably caused the problem. Please let me know if you can't get the files.

The text-stripping function works nicely, I think, and the index-making function is almost ready for you to try. It's not completely documented, and it needs more error checking, but maybe you can suggest fixes.

Comments encouraged.