• Alan Storm
• the professional weblog; because we have to

Running Unix Text Filters on Windows

As I’ve moved more and more away from the independent, “we
don’t care how you do it as long as it gets done” career
path and more towards the “getting it done is less
important than doing it how we tell you” path, I’ve found
myself having to become reacquainted with Windows. This
presents two problems. Despite expectations, I still want to get things done, and for the past 4 years most of my
personal workflow has revolved around OS X, more
specifically around the ability to apply Unix shell
scripts to any arbitrary piece of text I want to.

Most of my Independent Work Day™ is spent in
BBEdit, where I make heavy use of Unix Text Filters.
Everything from using something like Markdown to speed up
my XHTML production time to quick little scripts I’ve
whipped up to automate other mundane, repetitive tasks.

The Obstacles

Out of the box, Windows
doesn’t have any of the popular script interpreters you’d
expect from a modern OS. Perl, PHP and Python are nowhere
to be found. There’s Windows Script Host, but its
ability to manipulate text is less than ideal, and outside
of scripts for managing Windows Server, there
aren’t a lot of developers doing interesting things with it (if
you consider managing Windows Server interesting).

The next problem is the lack of a quality, well-supported
text editor on Windows. I’ve yet to find a native app that
has support for sending a chunk of text off to an external
process and receiving the results back. There’s JEdit, but
I’ve found the TextFilter plugin to be kind of buggy on Windows.

Plus, I mean, it’s JEdit. Ugh.

Finally, the Windows shell environment is a horrible afterthought. Any time I have to drop into the command line on
Windows a little part of me dies (fortunately, it’s the same
part that’s already died from having to work with
computers all day). The shell is based on DOS, and just
hasn’t been a priority for Microsoft developers. You can’t even copy/paste properly.

Of course, I wouldn’t be writing this if I hadn’t hashed out
some kind of solution that meets at least 90% of my needs.

The Goal

Within any Windows text editor, allow the user (me) to
invoke an external process that will take the currently
selected text, pass it through whatever Unix-like process I
want, and then replace the aforementioned selected text with
the results.

Materials Used

• A computer running Windows XP SP2

• A GUI text editor that supports copy/paste with a ctrl-key. JEdit and EditPlus come to mind.

• Cygwin

• AutoHotKey

• A Unix text filter. We’ll use Markdown.pl in our examples.

• A bit of text to test our filter with.

#A parable

The quick brown fox, as he jumped over
the lazy dog, suddenly realized that

1. He'd been wasting his life at the ad agency

2. He wanted something more out of life

3. Lorum Ipsum was not real latin

The fox, temporarily distracted by his impending
mid-life crisis was then eaten by the dog. Lazily.


We’ll assume you have your OS and text editor all taken
care of, so the first thing is to get some proper Unix tools on your machine.

Installing Cygwin

Cygwin is a collection of open source programs, including a
bash shell and the standard make tools, which will give you
a Linux-like environment on your PC. We’re not going to
hold your hand through the installation process, but here
are a few pointers I would have liked when I started monkeying about with it.

1. Download all your source files to the local disk and
install from there

Installing from a Network Resource is a wonderful
convenience, but the second you have a network problem you’re
in trouble. Better to create a local copy of all the packages
and install from there.

2. Expand those options.

The first few times I installed Cygwin I wondered where all
the packages were. I assumed the complete lack of anything
except the basic shell commands was a combined result of my
stupidity and my older OS (Windows ’98). Turns out the install wizard
just sucks, and doesn’t recursively install your selections
like I assumed.

So it was just my stupidity.

3. Install the SSH Daemon for a better shell experience

This isn’t needed for our tutorial, but it’s something I
like to do. The default Cygwin terminal application sucks a
lot like the pseudo-DOS shell terminal application. No easy copy/paste, no easy
window resizing, etc. There’s a graphical shell package called rxvt,
but it sucks like X-windows.

You could run X-windows, but you could also pluck out your eyes with your sharpened toenails if you really wanted to.

So, I install an SSH daemon/service, and then use my SSH client
(PuTTY, F-Secure) to connect to localhost. You won’t be able to do
everything you could with a local terminal, but you can do
most of what you need and be in an environment that doesn’t suck.

4. Make sure you install whatever interpreter you need from the package list.

In our case, this is Perl.
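
Once the installer finishes, a quick sanity check from the Cygwin bash shell can confirm that the tools this tutorial leans on actually made it onto your path. The sketch below is only an illustration: the `missing_tools` function name is made up for this example, and it assumes the three programs we need later are `perl`, `getclip`, and `putclip`.

```shell
#!/bin/bash
# Print the name of every command that cannot be found on the PATH.
# missing_tools is a hypothetical helper name, not part of Cygwin.
missing_tools() {
    local tool
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name.
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# The tools this tutorial relies on; no output means nothing is missing.
missing_tools perl getclip putclip
```

If anything is printed, re-run the Cygwin installer and tick the package that provides the missing command.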

Install AutoHotKey

AutoHotKey is an open-source program that acts a lot like
Type-It-For-Me or Textpander on OS X. You can
define snippets of text that when typed anywhere
will be auto-expanded. You can also assign keyboard
combinations to do the same thing.

AutoHotKey goes a step further, and each snippet/keyboard
combination can perform a variety of scripting actions.

When AutoHotKey is installed and running, you’ll see a
little green H in your system tray.

The Approach

Michael Sippey teaches us that Cygwin comes with two utility
programs called getclip and putclip, which give you access
to the Windows clipboard. He also lays out instructions
for using these programs to “Markdown” text that’s on your
clipboard.

We’re going to go a step further, and use AutoHotKey to
automate the copying, pasting, and re-inserting of the text.
This way, we never have to leave the comfort of our
favorite text editor.

First, create a batch file that contains the following (the whole pipeline should be on a single line, with echo off at the top):

@echo off
rem Clipboard -> Markdown.pl -> clipboard (adjust the paths for your machine)
d:\cygwin\bin\getclip | d:\cygwin\bin\perl d:\path\to\markdown.pl | d:\cygwin\bin\putclip


As outlined in Sippey’s tutorial, this command will take
whatever is on the clipboard, run it through your
script/filter (Markdown here), and then return the results
of your script to the clipboard.
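
If you’d rather stay in bash than write a batch file, the same pipeline can be sketched as a shell function. This is an illustration rather than Sippey’s method: the `filter_clipboard` name and the `GETCLIP`, `PUTCLIP`, and `FILTER` variables are invented for the example, and `/path/to/Markdown.pl` is a placeholder. Unlike a straight one-line pipeline, it only writes back to the clipboard when the filter succeeds, so a broken script doesn’t wipe out what you copied.

```shell
#!/bin/bash
# Run the clipboard through a text filter and put the result back.
# GETCLIP, PUTCLIP and FILTER are hypothetical overrides; the defaults
# assume Cygwin's getclip/putclip and a placeholder path to Markdown.pl.
filter_clipboard() {
    local get="${GETCLIP:-getclip}"
    local put="${PUTCLIP:-putclip}"
    local filter="${FILTER:-perl /path/to/Markdown.pl}"
    local output

    # Left unquoted on purpose so "perl /path/..." splits into words.
    # The pipeline's exit status is the filter's, so a failing filter
    # leaves the clipboard untouched.
    output="$($get | $filter)" || return 1

    # Command substitution trims trailing newlines; add one back.
    printf '%s\n' "$output" | $put
}
```

Swapping in a different filter is then a matter of setting `FILTER` before calling the function, for example `FILTER="tr a-z A-Z"` to upper-case whatever is on the clipboard.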

Let’s make sure everything is working.

1. Copy our test text

2. Double click or Run the batch file

3. Execute a paste command in a text editor

You should now see the test text formatted as HTML.

Not too shabby, but it’s a little, um, obvious that running that batch
file each time we want to transform some text is going to
get tedious. Time for AutoHotKey.

Right click on the green H, and select the “Edit This
Script” menu option. A notepad window will open with a bunch of code.

Any scripts entered in this file will
be loaded into memory whenever you start or refresh the
AutoHotKey application. Copy and paste the following
script into this file, and then refresh AutoHotKey by right
clicking on the green H and selecting the “Reload This
Script” option.

^!m::
SetKeyDelay 100
original := clipboard
Send ^c
RunWait, C:\path\to\your\batch\file.bat
Send ^v
clipboard := original
return


Let’s go over this script line by line.

^!m::
This defines the key combination that will be used to
invoke our AutoHotKey script. In this case, Ctrl-Alt-M.

SetKeyDelay 100
AutoHotKey works by literally sending typed input to the OS.
This command sets the delay between keystrokes. More on
this later.

original := clipboard
This stores the current contents of
the clipboard in a variable so we can restore it once we’re
done. Note the := operator. A plain = would store the literal
word “clipboard” rather than what’s on the clipboard. This isn’t
necessarily needed, but it’s always good
practice to leave as little sign of disruption as possible.

Send ^c
The Send command will send keystrokes to your
computer. In this case, a Ctrl-C combination. This is for
copying our text. Keep the c lowercase: AutoHotKey treats an
uppercase ^C as Ctrl-Shift-C, which many editors won’t
recognize as copy. If your text editor of choice uses a
different key combination, you’ll want to change this (see
the AutoHotKey documentation for more on this).

RunWait, C:\path\to\your\batch\file.bat
This command will run our batch
script. RunWait will wait until the process it launches is
completed before continuing script execution.

Send ^v
Much like our Send ^c command, this one sends a
paste command to our application. Since we just ran our
batch file, this should be our “markdowned” text.

clipboard := original
This returns the clipboard to its
original state. We are ninja, we were never here. Those are someone else’s footprints in your butter.

So, with that out of the way, let’s give this thing a try.
Select our test passage in your text editor and hit Ctrl-Alt-M.

Most of you (assuming all your ducks were in a row) saw your
selection replaced by an HTML-formatted passage.

A few of you probably saw your selected passage replaced by
the wrong text (stale clipboard contents, say), a blank passage, or
saw nothing happen at all.

Here’s what’s going on.

After AutoHotKey sends Ctrl-V, it immediately executes the next statement,
which restores the clipboard. Sometimes there’s not enough
time for the OS to complete its paste before this command
runs. So the command runs, and THEN the OS makes the
paste. The same thing can happen with the Ctrl-C
keystroke. Send the stroke, run the batch file and THEN
copy the text to the clipboard.

Makes AppleScript seem like a wonder, doesn’t it?

If you’re running into this problem, just increase the
SetKeyDelay until it’s high enough that you no longer see the
problem. SetKeyDelay sets how long AutoHotKey waits between keystrokes.
By setting it to a higher number, you’re giving the OS time
to do its copy and paste operations before moving on.

That’s it. The only two things I’m dissatisfied with in this approach are
the brief flash of the console window as the batch file
runs, and the need to add different scripts if an application doesn’t use Ctrl-C and Ctrl-V for copy/paste.

Still, I think it’s a marked improvement over
manually invoking the command either via a batch file or
console window.

Originally published March 1, 2006