Thursday, August 28, 2008

Meta Can Be Readable

Stumbled upon this post today on why you should avoid meta programming. The author contends that meta programming in ruby can, one, slow down your code, and two, make your code too hard to follow.

If you read the article comments, you'll see that some responders have shown various ways of speeding up the code example he gave, so I'll leave that part alone. Instead let's focus on the readability issue. While you can definitely take things too far and overgeneralize a solution, there are some steps you can take to make the code easier to follow. Consider this refactoring:

module TimeMethods
  def self.define_time(sym, secs)
    module_eval <<-EOC, __FILE__, __LINE__
      def #{sym}; self * #{secs} end
      alias_method :#{sym}s, :#{sym}
    EOC
  end
  
  Numeric.send(:include, self)
  
  define_time :second, 1
  define_time :minute, 60.seconds
  define_time :hour,   60.minutes
  define_time :day,    24.hours
  define_time :week,   7.days
  define_time :year,   365.25.days
end

puts 2.years

All I did was move the tricky "magic" part into its own method. Any class or module methods that we create will be available to us inside of that Class or Module definition. That makes the final part, where we use define_time, very readable because it's declarative and intention revealing.
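As a variation on the same idea, here is a minimal sketch that swaps the string eval for define_method. The DurationSketch module and the define_unit helper are names made up for this illustration, not part of the original example, and the multipliers are spelled out as plain numbers to keep it short:

```ruby
# Hypothetical variation on TimeMethods: same declarative style,
# but built on define_method instead of evaluating a string.
module DurationSketch
  # Defines `name` and its plural on whatever includes this module.
  def self.define_unit(name, secs)
    define_method(name) { self * secs }
    alias_method "#{name}s", name
  end

  define_unit :second, 1
  define_unit :minute, 60
  define_unit :hour,   60 * 60
  define_unit :day,    24 * 60 * 60
end

Numeric.send(:include, DurationSketch)

puts 2.hours
puts 1.5.days
```

The declarative part at the bottom of the module reads the same either way; only the "magic" method changes.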

Don't be afraid of ruby meta programming. I think Giles Bowkett puts it perfectly when he says that it's not meta programming, it's just programming. We should use the same tools to refactor it as any other code.

Friday, February 1, 2008

Ruby: Trivial Memoisation with Hashes

Using the block form of ruby's Hash constructor makes memoisation really easy. Here's some code that has a hash do recursive lookups upon itself. Pretty nifty, eh?

require 'logger'

logger = Logger.new($stdout)

fib = Hash.new {|fib, x|
  logger.debug "Calculating fib(#{x})..."
  fib[x] = if [1,2].include? x
    1
  else
    fib[x-1] + fib[x-2]
  end
}

puts fib[5]
puts fib[7]
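To see the memoisation at work without a logger, here is a small sketch that counts how often the block runs. The calls counter and the memo block parameter are names picked for this illustration only:

```ruby
calls = 0 # counts how often the default block actually runs

fib = Hash.new do |memo, n|
  calls += 1
  # Storing the result means this key never reaches the block again.
  memo[n] = n <= 2 ? 1 : memo[n - 1] + memo[n - 2]
end

puts fib[30] # computed once, filling in every smaller entry on the way
puts fib[30] # answered straight from the hash
puts calls   # one block call per distinct key
```

The second lookup never reaches the block, so the counter stays at 30, one call for each key from 1 to 30.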

Saturday, March 3, 2007

Splat!!

Oh the things you can learn by reading _why's code. It really is like being in the mind of a genius.

Over the past few days I've been playing around with the Camping “micro-framework.” Like Rails, Camping is a web framework built on the MVC concept. While Rails aims to be feature rich, however, Camping blazes off in the direction of keeping small and lightweight. Oh, and did I mention exceedingly clever? Take the time to read through the source (the unabridged version is better for this); there are lots of goodies in there.

Anyway. One of the cool tidbits in there is defining a to_a method for use with the splat operator. The splat "*" is used to explode an array into the parameter list of a method. For example, assume the following function:

def foo a,b
  puts "You gave me a(n) #{a}, and also a(n) #{b}.  How very generous of you!"
end

Normally the function would be called passing it two parameters:

foo "Apple", "Banana"

If you use the splat you can break up an array into the method:

fruit = %w|Kiwi Grape|
foo *fruit

Now, none of this is all too earth shattering; however, it does lead to some clever uses when paired with a class that defines a suitable to_a method.

class ContinentalBreakfast
  def initialize(*offerings)
    @offerings = offerings
  end
  def to_a # Pick some random fruit...
    plate = []
    table = @offerings.dup
    2.times do
      f = table[rand(table.length)]
      table.delete(f)
      plate << f
    end
    plate
  end
end

Cool. Now we can ask somebody to bring us back a bit of a snack:

foo *ContinentalBreakfast.new('grape','pear','melon','strawberry')

Pretty neat, eh?
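Since the breakfast example is random, here is a deterministic sketch of the same mechanism. Point and describe are hypothetical names used only for this illustration:

```ruby
# Hypothetical class, used only to show the splat calling to_a.
class Point
  def initialize(x, y)
    @x, @y = x, y
  end

  # The splat operator calls this to get the argument list.
  def to_a
    [@x, @y]
  end
end

def describe(x, y)
  "x=#{x}, y=#{y}"
end

puts describe(*Point.new(3, 4))
```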

Thursday, February 22, 2007

Ruby Lib Path Manipulations

Mucking around with the library search path in ruby isn't exactly the hardest thing in the world. It's just a matter of fiddling around with a horrible looking perlish global variable and using __FILE__ to munge the right path together. Again, nothing too difficult but it's annoying and ugly—totally out of place next to all that beautiful ruby code.

For the most part, this isn't much of a problem for libraries. In fact, you don't normally want to hard code paths anyway. Typically a flag is passed to the ruby interpreter:

$ ruby -Isome/path foo.rb

However, in some cases you still need to change the path in the code. Likely the most common place this shows up is within test/spec files. Does this look familiar?

# some_cool_spec.rb
$:.unshift File.join(File.dirname(__FILE__), '..', 'lib')

require 'lib_under_test'

context '...' do
# snip
end

It’s so ugly, right? To me it's quite the eyesore so I wrote the following really simple library (rubylib.rb) to fix it up a bit:

module LibCommands
  def lib(*path_segs)
    $:.unshift File.join(*path_segs)
  end

  def lib_rel(*path_segs)
    file = caller.first.split(':').first
    lib File.dirname(file), *path_segs
  end
end

Object.send :include, LibCommands

Now when you want to add something to the load path, you can just use the lib method:

lib 'path', 'to', 'library'

It's even better when dealing with relative paths. lib_rel specifies the directory to add relative to the file it's called from—rather than the current working directory, which is probably different. That makes the above rspec:

# some_cool_spec.rb
lib_rel '..', 'lib'

require 'lib_under_test'

context '...' do
# snip
end

Nice, eh? Sure, but you still need to require rubylib.rb first. You can fix that up with a little trick used by rubygems. First, create a ubylib.rb that just requires rubylib.rb. That way it looks nicer when you add it in as a flag to ruby.

$ ruby -rubylib foo.rb

The next level of laziness is to add it to your RUBYOPT. With the environment variable set, the two commands are available without any explicit requires:

$ export RUBYOPT="${RUBYOPT} -rubylib"
$ ruby foo.rb
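One caveat with lib_rel as written: splitting the caller string on ':' goes wrong when the path itself contains a colon, such as a Windows drive letter. Here is a minimal sketch of one way around that, assuming a ruby new enough to have caller_locations. LibPathSketch and lib_path_for are made-up names for this illustration:

```ruby
# Sketch only: LibPathSketch and lib_path_for are made-up names.
module LibPathSketch
  # Pure helper: resolve path segments relative to the directory of `file`.
  def self.lib_path_for(file, *path_segs)
    File.expand_path(File.join(File.dirname(file), *path_segs))
  end

  # Same idea as lib_rel, but asks ruby for the caller's path directly
  # instead of parsing a "file:line" string.
  def lib_rel(*path_segs)
    caller_file = caller_locations(1, 1).first.path
    $LOAD_PATH.unshift LibPathSketch.lib_path_for(caller_file, *path_segs)
  end
end

puts LibPathSketch.lib_path_for('/projects/demo/spec/some_cool_spec.rb', '..', 'lib')
```

Keeping the path arithmetic in its own helper also makes it easy to check without touching the real load path.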

Thursday, February 1, 2007

Blogger: the Switch

As you may have noticed, the blog is looking a little different today. Previously, I was hosting a typo blog on my VPS account, which was working out perfectly well enough, but it brought up a few issues: every time I went to go publish an article I would think about how typo didn't really work the same way I did.

This problem is pervasive with me. There's likely always something I should be doing at the moment, but I get it in my head that it's not exactly the way I'd like it so I go off to write my own program to do X—I really need to cut that out.

So I decided rather than try to rewrite my own blog engine, or hack on something to get it into shape, I could better spend my time actually writing articles. We'll see how this works out. It's rather nice that blogspot allows you to use your own domain name as well, which was one of the last sticking points.

We'll see how the little experiment goes. Sure it's going to kill me to write out these posts in the web form for a bit, but I'm sure I'll figure out the api quickly. I've really crippled myself with vim. I can't even type anymore without it.

Tuesday, December 12, 2006

Shell Tasks? In Haskell?

No. Your eyes are not deceiving you: I am in fact suggesting that Haskell is suitable for tasks that are normally relegated to shell scripts.

Recently, I was asked by a colleague to come up with a simple shell script to rename some files. Basically, the files were being moved from a Windows machine to a *nix environment, which meant that case sensitivity was going to become an issue. The request was simple enough: we needed to rename all the specified items to lowercase names.

You might be thinking, "classic tr territory here." And, of course, you'd be absolutely correct in going in this direction. The important bits go something like this:

for file in "${@}"; do
  downcase=$(echo "${file}" | tr 'A-Z' 'a-z')
  if ! [[ "${downcase}" == "${file}" ]]; then
    echo "Moving ${file} to ${downcase}"
    mv "${file}" "${downcase}"
  fi
done

So what's wrong with that?

Absolutely nothing. It's both simple and effective. So why bother writing it in any other language, least of all Haskell? Because it's not beautiful.

The great thing about the above code is that you don't really need to put it into a script. You can just bang away at it from the command line -- yes, you can type loops on the command line, *smirks*.

When at the command line, however, you just do the simplest thing that works: you aren't worried about wrapping the variable names in curly brackets, and so on. If you do want to keep scripts like this around though, after a while you'll start to see them grow ugly. Nobody likes ugly code, so let's see how it looks if we port it to Haskell.

First let's start by importing a few functions:

module Main (main) where
import System.Environment (getArgs) -- so we can get the file names
import Data.Char (toLower) -- will do 'tr's work
import System.Directory (renameFile) -- so we can do the renames
import Control.Monad -- gives us when and liftM2

Those will definitely come in handy later. Okay, so let's talk about the data for this "application." Obviously we're going to have a bunch of filepaths, but we also need to keep track of the downcased names as well. We'll gather these together with the original names in pairs. Since there will be multiple such pairings, they will be gathered together into a list, which looks like this: [(String,String)].

There are two actions to be done for each pair,

  1. Each pair of directories should be printed to the screen, and
  2. The file should be moved from the first name to the second.

Let's define a function to let the user know what's going on. This is easy enough, and as the type of this function suggests we take a pair and do some IO:

putDirs :: (String,String) -> IO ()
putDirs pair = putStrLn $ "Moving: " ++ fst pair ++ " " ++ snd pair

The second action is so simple that we won't even make a function for it. It's by-and-large already done for us -- remember the import from the Directory module?

One other thing the original bash version did is check that the file actually has to be moved. This is simple in Haskell. We'll use the higher order function filter to remove the pairs where the two elements are already equal:

rmDups = filter (uncurry (/=))

Here uncurry takes a function and returns a function which acts on a pair.

Now all that is left is to do the main part: check the parameters and loop our actions over each of the files.

main = do
 files <- getArgs
 let usage = "Usage: dcfiles [files]"
 when (length files < 1) $
   putStrLn ("Argument Error: no arguments specified\n" ++ usage)
   >> return ()
 let pairs = (rmDups.zip files) $ map (map toLower) files
 mapM_ (liftM2 (>>) putDirs $ uncurry renameFile) pairs

I think all of this reads very well. The only tricky part is the last two lines. A lot of stuff is going on in a little amount of space, but it's still not too hard. toLower takes a character and returns its lowercase equivalent, so this just gets mapped over the entire string. We do this for each of the file names passed in. Next we use zip to form our pairs from the original list and the mapping, and finally we filter it using our rmDups function.

The ultimate line is just a fancy way of saying that we need to map our two actions one after the other. The mapM_ construct just says that we'll be mapping actions rather than regular functions.

So, now that that's done, what have we gained? Well, if we never added onto the script, it probably would have been just as well to leave it alone. However, if we wanted to make it a little more robust it would be very easy to do and you'd have all the power of Haskell at your disposal.

What I wanted to show here is that Haskell needn't be reserved for massively complex projects. It doesn't get any more simple than this. Don't fear Haskell, love it :)

Wednesday, April 26, 2006

Missing Perl Modules

Hey everybody. Sorry it's been so long since an update. I've had Oblivion taking up all my time of late. It's a great game; check it out if you get the chance. But that is really neither here nor there. This post is about perl.

Soon I'll be moving to a new position at work. I'm going to be taking a break from system administration for a bit and doing some coding, which is really a breath of fresh air. You know, it's the whole problem of dealing with users: they drive you crazy, but they also provide you with a job.

Anyway, the new projects I'll be working on will be web stuff, mostly in perl. So, I figured I'd lay off the ruby for a while and get back on the camel. I started out looking at a simple script to do ldap lookups for co-workers. We do have the capability to look up phone extensions using our VoIP enabled telephone sets, but it can really be a pain to key in the information.

It all comes down to the interface: I will never be able to key in a query against somebody's last name on the phone as fast as I can on the keyboard. Even if I could remember which letter mapped to which number, you are still required to wait for a timeout between two characters. Otherwise, there would be no way to differentiate between entering 22 meaning "aa" or 22 meaning "b".

So the script is really simple, it just takes any number of query strings as command line arguments and plugs them into the ldap query filter using the Net::LDAP module. As it stands, everything worked fine on my workstation because I have a whole slew of useful perl modules installed there from the CPAN. Unfortunately, the other machines are missing a lot of these modules. I could have installed the missing modules locally on each machine, but that would have taken more effort than it would be worth.

Rather than go at it that way, I decided to just work around the problem. The missing functions were some of the really nice list processing routines from List::MoreUtils. We'll look at the function all() as an example. all() takes a reference to a function and a list of values. It uses the function as a predicate, returning true only when the function is true for each value in the list. Here is an example of its usage:

if (all { $_ >= 0 } (1,5,4,0,3)) {
    print "The list contains no negative values.\n";
}
else { 
    print "The list has at least one negative value.\n";
}

which of course would execute the first branch because none of the values are negative. It turns out that this function is pretty simple to write in perl, but there are some nice points to using the module version. First, the standard reason: you don't have to reinvent the wheel. There's less chance that the module version has bugs in it than our hand rolled version. Second, the module version is written as a C extension, so it's really fast.

In the script we want to use the module if it is available, but with a plain use statement on a machine where the module is missing, perl will not be able to find it in the include path (@INC) and the program will blow up. What we need to do is require the module file instead. require will throw an exception if the module is not found, and it also does not call the module's import function; we're going to want to do the import manually anyway, as you'll see in a moment.

To catch the error, you can use an eval block. If an exception is thrown, it will get stuffed into the special variable $@. We can test this variable to see if the module couldn't be loaded. If it couldn't, we define the function ourselves. Otherwise, we just import the module's version into our current namespace. This looks something like the following:

BEGIN {
    no strict 'refs';
    eval { require List::MoreUtils };

    if ($@) {
        *all = sub (&@) {
            my $func_ref = shift;
            my @items    = @_;
            foreach (@items) {
                return unless $func_ref->();
            }
            return 1;
        }
    }
    else { *all = \&List::MoreUtils::all }
}

We put this inside of a BEGIN block because we want this to happen at compile time rather than runtime. The (&@) part after the sub definition is a function prototype. It tells the compiler that we want to look for a subroutine reference as the first parameter, which is what gives us a calling interface like the builtins grep and map.

The only other tricky part in there is the assignment to the all typeglob (*all). Since we are setting the value to a function reference, the all() function in the current namespace gets set to that function. Now when we call all() in the program we should get either the module version or the one we defined, depending on whether it could find the module. If I run it on my workstation it uses the List::MoreUtils version. On the other machines, however, I don't have to worry about it not being available. The script will still run; it will just use our above definition instead.
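For those of us who haven't quite laid off the ruby yet, the same load-or-fall-back idea carries over. This is only a rough sketch: 'fancy_list_utils' and FancyListUtils are hypothetical names standing in for an optional library, and all_of is a made-up helper (ruby already ships Enumerable#all?, so this is purely to show the pattern):

```ruby
# Sketch only: 'fancy_list_utils' / FancyListUtils are hypothetical names
# for an optional library that may or may not be installed.
begin
  require 'fancy_list_utils'

  # Library found: delegate to its (assumed) implementation.
  def all_of(items, &pred)
    FancyListUtils.all(items, &pred)
  end
rescue LoadError
  # Library missing: define a plain fallback with the same interface.
  def all_of(items)
    items.each { |item| return false unless yield(item) }
    true
  end
end

puts all_of([1, 5, 4, 0, 3]) { |n| n >= 0 }
```

Here rescue LoadError plays the part of checking $@ after the eval, and the two def bodies stand in for the two typeglob assignments.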

Thursday, March 16, 2006

Appreciate Your GNU Tools

It is often remarked that one never knows what one has until it is lost. While I think that most people do believe something along those lines, I suspect that such feelings are normally reserved for the big things in life: security, companionship, et cetera.

Personally, I get this feeling every time I log onto a proprietary UNIX machine. Perhaps you recognize the situation: for every one of your clever little shell incantations, the machine responds with a brutally cold, "illegal option." "I know that this command worked yesterday!" you say to yourself.

And then it dawns on you. You were on the Linux box yesterday.

These are the times that make us realize just how good we have it. Not only have the fine people at the FSF provided us with tools which are free--as in software--but also tools which are superior with an incredible amount of robustness and functionality.

Case in point: Today I was writing a script for our users to generate public keys for use with ssh (scp). The ssh implementation on Tru64 uses a different key format than OpenSSH. Fine, no big deal. That's not where the problem was.

The problem was with humble little grep. I wanted to capture the name of the key created from the ssh-keygen command. With the GNU version it's easy to pick out just the matching pattern, rather than the entire line, with the -o option:

$ grep -o 'pattern'
looking
for a pattern
> pattern
find the pattern in the line
> pattern
many little patterns. find all of the patterns
> pattern
> pattern

On the Tru64 version, there's no such option. In fact, the number of available options (17) falls far short of those made available by the GNU version (41). I'm not going to argue whether all of those options are necessary, but you'll be happy they are there when you need them most.

Luckily perl is installed on this machine, so I was still able to solve the problem:

$ perl -ne '/id_dsa_\d+_[^.]+/ && print $&, qq/\n/'

Of course, this solution presents its own difficulties. Namely, you might be writing a script that needs to be maintained by someone that's taken a few Korn Shell classes, yet knows absolutely zero perl. Besides, this is only an example. Sometimes the workaround doesn't come so easily. (Not to mention that perl is under an open source license as well.)

So the moral of the story? Be grateful for what you have: especially fantastic software, provided to you for free by volunteers.

Monday, March 13, 2006

RPMs by Hand: Ouch!

Watching Debian's Advanced Package Tool (apt) at work is an awesome thing to behold. Armed with apt-get and friends, installing a new package is only ever a few keystrokes away:

# apt-get install mypackage

The above command will have apt happily run off, fetch the package from the repository and install it on the system. All the dependency checking is taken care of--for the most part--automatically. But what does this have to do with the Red Hat Package Manager (RPM)? Well, nothing actually.

RPM does not intend to solve the same problem as apt. It's actually analogous to dpkg, Debian's package format and associated tools. Therefore, you can't really fault rpm for the fact that

# rpm -i mypackage.rpm

will most likely come screaming back at you that you are missing a bunch of dependencies.

No. The gripe here is that you really need to have something like apt. If you have a system descended from Debian, based on dpkg, you'll most certainly have apt as well; this is not necessarily the case with some Red Hat relatives. Many of the rpm based distros do in fact have an acceptable replacement, while others--*hack* *cough* SuSE--do not.

No. YaST does not count. I am (unfairly?) limiting the field to command line applications. There is no way I'm firing up cruddy old yast just to install a simple package.

This, combined with the fact that you have packages which depend not only on other packages but also on individual files, leaves you in a sticky situation. Thus you are forced to do something ugly and inefficient like the following:

$ find /path/to/rpms -name '*.rpm' | while read package
  do
    rpm -qpl "${package}" | while read file; do
      echo -e "${file}\t${package}"
    done
  done > files-per-package
$ grep 'somestupidlib.so' files-per-package

to find out what you actually need to install. Once you get that list you can go right back to it with

# rpm -i some-other-package-with-1x10^7-deps.rpm

Fun, no?

P.S. If any SuSE fans out there have a better way of doing this, I'd love to hear it. The Administration Guide provided with SuSE Linux Pro 9.1—I don't have the more recent ones handy—suggests doing something remarkably similar to the above.

Sunday, March 5, 2006

It's All in the Shell

Shell scripts are supposed to make our lives easier, right? Scribble down a bit of bash magic and all your busy work will vanish faster than you can type #!/bin/bash. Too good to be true?

Well, perhaps, but it will make life much easier in the long run. As your scripts become more advanced, you'll spend less time doing boring repetitive tasks, and more time...eh, writing scripts.

The best way to start out is to just play around at the command line for a while. This will get you on your way, but here is a pair of tips that will help you train your scripting muscle:

  • Read a man page every day. The man pages are there to help you, but you can't take advantage of them unless you read them. You'll thank yourself later.

    Seriously. I'm not kidding. Do it for pleasure. Do it for power. Do it for the babes. Let's face it, ladies dig geeks, and this command will lead you to everlasting glory:

    man $(ls -1 /usr/share/man/man1 | \
    sed -nr "$(($RANDOM % $(ls /usr/share/man/man1/ | wc -l) +1 ))p" | \
      sed "s/\.1\.gz//")

    What the heck is that, you ask? That will give you the low-down on a random command. Just think of it as "word of the day" for geeks.

  • Read other quality scripts. There are plenty of places you can look for them, but some good starting points are the scripts provided by your distribution (You are running an open source operating system, aren't you?). Try looking through the system startup scripts under /etc/init.d/. You can learn a lot of good tricks just by trying to follow along with the code.

Saturday, March 4, 2006

"Stupid" SSH Tricks

SSH is typically used to log into a remote machine and start up a shell session. This lets users run command line programs just as if they were sitting at a local terminal. That is a very useful ability to have in and of itself. However, some of the coolest things you can do with SSH don't involve starting an interactive session.

Here's a shortened version of the options ssh takes:

localhost$ ssh [misc. options] [user@]host [commands]

Normally, you'll see something simple along the lines of

localhost$ ssh remotehost

but, you can tack on an arbitrary command afterward to run it remotely. Let's say you need to get a list of all the users on a particular machine; there's no need to pull up a remote bash session. Just run this off:

localhost$ ssh -x remotehost 'cut -d: -f1 /etc/passwd'
    ...
    gandalf
    frodo
    ...
    localhost$

As indicated in the last line, this will leave you in your original session on the local machine. The -x option is used to turn off X forwarding; this will speed things up significantly.

Just using this method opens up a lot of options. If you have your authentication keys set up correctly, you can start running commands like this in shell scripts as well.

Also, you can poll many machines all at once. Let's say you want to check your group memberships on a bunch of machines. Use your shell's looping mechanism to do the trick.

localhost$ for box in host0{1,2,3,4}; do
  ssh -x ${box} groups
done

Nice.

Bonus stupid trick: if you want to run a remote command that requires user interaction—perhaps an ncurses application—you can have ssh request a pseudo terminal. For instance, you can open an editor on a remote file thus:

localhost$ ssh -xt remotehost vi somefile.txt

I'm sure you can come up with more interesting examples.