Running Unix Text Filters on Windows

As I’ve moved more and more away from the independent, “we don’t care how you do it as long as it gets done” career path and more towards the “getting it done is less important than doing it how we tell you” path, I’ve found myself having to become reacquainted with Windows. This presents two problems. Despite expectations, I still want to get things done, and for the past 4 years most of my personal workflow has revolved around OS X. More specifically, around the ability to apply Unix shell scripts to any arbitrary piece of text I want to.

Most of my Independent Work Day™ is spent in BBEdit, where I make heavy use of Unix Text Filters. Everything from using something like Markdown to speed up my XHTML production time to quick little scripts I’ve whipped up to automate other mundane, repetitive tasks.

The Obstacles

Out of the box, Windows doesn’t have any of the popular script interpreters you’d expect from a modern OS. Perl, PHP and Python are nowhere to be found. There’s Windows Scripting Host, but its ability to manipulate text is less than ideal, and outside of scripts for managing Windows Server, there aren’t a lot of developers doing interesting things with it (if you consider managing Windows Server interesting).

The next problem is the lack of a quality, well-supported text editor on Windows. I’ve yet to find a native app that has support for sending a chunk of text off to an external process and receiving the results back. There’s JEdit, but I’ve found the TextFilter plugin to be kind of buggy on Windows.

Plus, I mean, it’s JEdit. Ugh.

Finally, the Windows shell environment is a horrible afterthought. Anytime I have to drop into the command line on Windows a little part of me dies (fortunately, it’s the same part that’s already died from having to work with computers all day). The shell is based on DOS, and just hasn’t been a priority for Microsoft developers. You can’t even copy/paste properly.

Of course, I wouldn’t be writing this if I hadn’t hashed out some kind of solution that meets at least 90% of my needs.

The Goal

Within any Windows text editor, allow the user (me) to invoke an external process that will take the currently selected text, pass it through whatever Unix-like process I want, and then replace the selected text with the results.
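
For anyone who hasn’t written one, a “Unix text filter” is nothing fancier than a program that reads text on standard input and writes the transformed text to standard output. Here’s a minimal sketch, using a made-up filter called bulletize (purely an illustration, not one of the scripts mentioned above) that turns each line into a Markdown list item:

```shell
# bulletize is a hypothetical example filter: it reads lines on stdin
# and writes each one back to stdout prefixed with "* ".
bulletize() {
    sed 's/^/* /'
}
```

From a shell, piping a couple of lines into it (say, `printf 'one\ntwo\n' | bulletize`) should hand them back as list items. Anything that behaves this way, Markdown.pl included, is a candidate for the goal described here.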

Materials Used

  • A computer running Windows XP SP2

  • A GUI text editor that supports copy/paste with a ctrl-key. JEdit and EditPlus come to mind.

  • Cygwin

  • AutoHotKey

  • A Unix text filter. We’ll use Markdown.pl in our examples.

  • A bit of text to test our filter with.

    #A parable
    
    The quick brown fox, as he jumped over 
    the lazy dog, suddenly realized that
    
    1. He'd been wasting his life at the ad agency
    
    2. He wanted something more out of life
    
    3. Lorum Ipsum was not real latin
    
    The fox, temporarily distracted by his impending 
    mid-life crisis was then eaten by the dog. Lazily. 
    

We’ll assume you have your OS and text editor all taken care of, so the first thing is to get some proper Unix tools on your machine.

Installing Cygwin

Cygwin is a collection of open source programs, including a bash shell and the standard make tools, which will give you a Linux-like environment on your PC. We’re not going to hold your hand through the installation process, but here are a few pointers I would have liked when I started monkeying about with it.

1. Download all your source files to the local disk and install from there

Installing from a Network Resource is a wonderful convenience, but the second you have a network problem you’re in trouble. Better to create a local copy of all the packages and install from there.

2. Expand those options.

The first few times I installed Cygwin I wondered where all the packages were. I assumed the complete lack of anything except the basic shell commands was a combined result of my stupidity and an older OS (Windows ‘98). Turns out the install wizard just sucks, and doesn’t recursively install your selections like I assumed.

Cygwin Installer

So it was just my stupidity.

3. Install the SSH Daemon for a better shell experience

This isn’t needed for our tutorial, but it’s something I like to do. The default Cygwin terminal application sucks a lot like the pseudo-DOS shell terminal application. No easy copy/paste, no easy window resizing, etc. There’s a graphical shell package called rxvt, but it sucks like X-windows.

You could run X-windows, but you could also pluck out your eyes with your sharpened toenails if you really wanted to.

So, I install an SSH daemon/service, and then use my SSH client (PuTTY, F-Secure) to connect to localhost. You won’t be able to do everything you could with a local terminal, but you can do most of what you need and be in an environment that doesn’t suck.

4. Make sure you install whatever interpreter you need from the package list.

In our case, this is Perl.
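
Once the installer finishes, a quick check from the Cygwin shell can confirm that the commands you’re counting on actually landed on your PATH. This is only a sketch: missing_tools is a made-up helper name, and it assumes nothing beyond a POSIX-style shell.

```shell
# missing_tools (hypothetical helper): print one line for every
# command named in the arguments that the shell cannot find.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "missing: $tool"
    done
}
```

Running `missing_tools perl` should print nothing if everything is in place. A “missing:” line means another trip through the installer.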

Install AutoHotKey

AutoHotKey is an open-source program that acts a lot like Type-It-For-Me or Textpander on OS X. You can define snippets of text that when typed anywhere will be auto-expanded. You can also assign keyboard combinations to do the same thing.

AutoHotKey goes a step further, and each snippet/keyboard combination can perform a variety of scripting actions.

When AutoHotKey is installed and running, you’ll see a little green H in your system tray.

Auto Hot Keys in System Tray

The Approach

Michael Sippey teaches us that Cygwin comes with two utility programs called getclip and putclip, which allow you access to the windows clipboard. He also lays out instructions for using these programs to “Markdown” text that’s on your clipboard.

We’re going to go a step further, and use AutoHotKey to automate the copying, pasting, and re-inserting of the text. This way, we never have to leave the comfort of our favorite text editor.

First, create a batch file that contains the following (the pipeline should be all on a single line, with @echo off on its own line at the top):

@echo off
d:\cygwin\bin\getclip | d:\cygwin\bin\perl d:\path\to\markdown.pl | d:\cygwin\bin\putclip

As outlined in Sippey’s tutorial, this command will take whatever is on the clipboard, run it through your script/filter (Markdown here), and then return the results of your script to the clipboard.
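
If you’d rather keep the plumbing on the Cygwin side, the same pipeline can be sketched as a small shell function. clipfilter is a made-up name; getclip and putclip are the same Cygwin utilities the batch file calls, and the filter command is passed in as arguments so it’s easy to swap Markdown for something else.

```shell
# clipfilter (hypothetical helper): pipe the clipboard through any
# filter command and put the result back on the clipboard.
# Assumes getclip and putclip are on the PATH, as in a Cygwin shell.
clipfilter() {
    # "$@" is the filter and its arguments,
    # e.g. perl followed by the path to your Markdown.pl
    getclip | "$@" | putclip
}
```

With that defined, `clipfilter tr a-z A-Z` would upper-case whatever text is on the clipboard.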

Let’s make sure everything is working.

  1. Copy our test text

  2. Double click or Run the batch file

  3. Execute a paste command in a text editor

You should now see the test text formatted as HTML.
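
If the paste comes back empty or unchanged, it can help to take the clipboard out of the picture and test the filter by itself from the Cygwin shell. A rough sketch, with check_filter being a made-up helper name: it feeds the first line of our test text to whatever command you give it and complains if the command fails or prints nothing.

```shell
# check_filter (hypothetical helper): run a filter on one sample line.
check_filter() {
    # The exit status tested here is that of the filter, since it is
    # the last command in the pipeline.
    out=$(printf '%s\n' '#A parable' | "$@") || {
        echo "filter exited with an error" >&2
        return 1
    }
    # A filter that runs but prints nothing is also suspicious.
    [ -n "$out" ] || {
        echo "filter produced no output" >&2
        return 1
    }
    printf '%s\n' "$out"
}
```

Calling check_filter with perl and the path to your copy of Markdown.pl should print some HTML. If it doesn’t, the problem is with Perl or the script rather than with getclip, putclip, or the batch file.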

Not too shabby, but it’s a little, um, obvious that running that batch file each time we want to transform some text is going to get tedious. Time for AutoHotKey.

Right click on the green H, and select the “Edit This Script” menu option. A notepad window will open with a bunch of code.

Any scripts entered in this file will be loaded into memory whenever you start or refresh the AutoHotKey application. Copy and paste the following script into this file, and then refresh AutoHotKey by right clicking on the green H and selecting the “Reload This Script” option.

^!m::
  ; Pause after each keystroke so the OS can keep up.
  SetKeyDelay 100
  ; Save the clipboard text. The percent signs read the variable;
  ; without them AutoHotKey stores the literal word "clipboard".
  original = %clipboard%
  ; Lowercase c and v: uppercase letters would add Shift.
  Send ^c
  RunWait, C:\path\to\your\batch\file.bat
  Send ^v
  ; Put the saved text back.
  clipboard = %original%
return

Let’s go over this script line by line.

^!m::
This defines the key combination that will be used to invoke our AutoHotKey script. In this case, Ctrl-Alt-M (^ is Ctrl, ! is Alt).

SetKeyDelay 100
AutoHotKey works by literally sending typed input to the OS. This command sets the delay, in milliseconds, after each keystroke. More on this later.

original = %clipboard%
This stores the current text contents of the clipboard in a variable so we can restore it once we’re done. The percent signs matter: without them, AutoHotKey would store the literal word “clipboard” rather than what’s on the clipboard. This isn’t strictly needed, but it’s always good practice to leave as little sign of disruption as possible.

Send ^c
The Send command will send keystrokes to your computer. In this case, a Ctrl-C combination. This is for copying our text. Keep the c lowercase; an uppercase C would send Ctrl-Shift-C instead. If your text editor of choice uses a different key combination, you’ll want to change this (see the AutoHotKey documentation for more on this).

RunWait, C:\path\to\your\batch\file.bat
This command will run our batch script. RunWait will wait until the process it launches is completed before continuing script execution.

Send ^v
Much like our Send ^c command, this one sends a paste command to our application. Since we just ran our batch file, this should be our “markdowned” text.

clipboard = %original%
This returns the clipboard to its original state. We are ninja, we were never here. Those are someone else’s footprints in your butter.

So, that out of the way, let’s give this thing a try. Select our test passage in your text editor and hit Ctrl-Alt-M.

Most of you (assuming all your ducks were in a row) saw your selection replaced by an HTML formatted passage.

A few of you probably saw your selected passage replaced by whatever was on your clipboard beforehand, a blank passage, or nothing appeared to happen at all.

Here’s what’s going on.

After AutoHotKey sends Ctrl-V, it immediately executes the next statement, which restores the old clipboard contents. Sometimes there’s not enough time for the OS to complete its paste before this command runs. So the command runs, and THEN the OS makes the paste. The same thing can happen with the Ctrl-C keystroke. Send the stroke, run the batch file and THEN copy the text to the clipboard.

Makes AppleScript seem like a wonder, doesn’t it?

If you’re running into this problem, just increase the SetKeyDelay until it’s high enough that you don’t see the problem. SetKeyDelay sets how long AutoHotKey waits between keystrokes. By setting it to a higher number, you’re giving the OS time to do its copy and paste operations before moving on.

That’s it. The only two things I’m dissatisfied with in this approach are the brief flash of the console window as the batch file runs, and the need to add different scripts if an application doesn’t use Ctrl-C and Ctrl-V for copy/paste.

Still, I think it’s a marked improvement over manually invoking the command either via a batch file or console window.

Originally published March 1, 2006