I’ve recently taken a renewed interest in my blog client, blogtool. A big part of that renewal is due to unfinished business: I’d always meant to release it into the wild but had never taken the time to learn how to package it. I finally took that plunge a few weeks ago. Since then, I’ve come up with a series of improvements, fine-tunings and new ideas to make it a more capable tool and a better piece of software in general.
The original concept was to keep it as simple as possible. At the time, most other clients were GUI based, with custom editors designed specifically for the act of blogging. Once I’d been blogging for a couple of months, I realized that all of that was overkill for my needs. That was when it seemed obvious that a blog client that relied on a text editor, with support for plain text markup, was all I really needed. In fact, the original concept was that a blog file could really be a lot like an email which, at its core, consists of a file with a header section and a content section.
The idea was appealing because it leveraged a precedent for dealing with metadata that had proven effective and extensible over a long period of time. In an email, the header takes some basic info from the user and the rest is supplied by an MUA and all the MTAs that touch the file along its path from originator to recipient. The content is the main thing, as far as a user is concerned. The act of writing a blog post was really very similar: provide some basic info for connecting to a blog along with some basic info about the post and, of course, the content.
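To make the email analogy concrete, here is a minimal sketch of what such a post file could look like and how it could be split into header fields and content. It leans on Python’s standard `email` module purely for illustration; the field names (`TITLE`, `BLOG`, `CATEGORIES`) and the `parse_post` helper are assumptions made up for this example, not blogtool’s actual format or code.

```python
from email import message_from_string

# A hypothetical post file: header lines, a blank line, then the content.
POST = """\
TITLE: Hello from a text editor
BLOG: example-blog
CATEGORIES: software, blogging

This is the *content* of the post, written in Markdown.
"""


def parse_post(text):
    """Split an email-style post into a dict of header fields and a body."""
    msg = message_from_string(text)
    headers = dict(msg.items())   # e.g. {"TITLE": "...", "BLOG": "..."}
    body = msg.get_payload()      # everything after the first blank line
    return headers, body


headers, body = parse_post(POST)
print(headers["TITLE"])
print(body)
```

The appeal is the same as with email: the header stays small and human-editable, and everything after the first blank line is left alone as content.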
The header concept was also appealing for its seeming simplicity. It wasn’t until I actually started coding that I realized there was a lot more to it than I’d originally thought. It’s not that the overall concept was faulty, just that there were tons of details to work out. Should I make it really flexible, and potentially messy? Or should I put stricter limits on header interpretation and potentially shortchange a user on flexibility where it’s needed? How many header fields should I support? How should multiple blogs be supported? Should a user have to enter the same information over and over, or should I allow for a configuration file? There were tons of questions like that to answer along the way. Some were obvious; many others became evident only as the usage possibilities came into view.
Eventually, I had a rough infrastructure that closely resembles how it works today. I don’t recall the details, but the code was a mess at that point, so I rewrote the whole thing to make it cleaner. The rewrite made it possible to add functionality at a lower cost in coding time. Thus, blogtool’s capabilities grew.
More recently, I’ve been updating the XMLRPC interface. When I originally wrote the code, WordPress didn’t have its own XMLRPC API for things like posting, editing and deleting posts. Rather, it borrowed from other APIs like Blogger, Movable Type and MetaWeblog. Now that WordPress does have its own API, I decided to take the time to update blogtool to put myself in a position to take advantage of any new functional possibilities.
Lo and behold, a couple of opportunities presented themselves, in particular the ability to add an excerpt field to a post. I’ve never used excerpts before, but now that I’m using Twitter, I felt an excerpt could improve the tweets that go out to announce new blog posts. Besides, now that blogtool has been released into the wild, others might want to take advantage of it.
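As a rough illustration of where an excerpt fits, the sketch below builds the XML-RPC request for WordPress’s `wp.newPost` method using Python’s standard `xmlrpc.client` module, with the excerpt carried in the `post_excerpt` member of the content struct. The helper name `build_new_post_call`, the credentials and the sample values are all made up for this example, and this is not blogtool’s actual code. Nothing is sent over the network here; the request is only serialized.

```python
import xmlrpc.client


def build_new_post_call(blog_id, username, password, title, body, excerpt):
    """Serialize a wp.newPost request; the excerpt rides in the content struct."""
    content = {
        "post_title": title,
        "post_content": body,
        "post_excerpt": excerpt,   # the field of interest here
        "post_status": "draft",    # assumption: keep the example harmless
    }
    params = (blog_id, username, password, content)
    return xmlrpc.client.dumps(params, methodname="wp.newPost")


request_xml = build_new_post_call(
    1, "someuser", "not-a-real-password",
    "A new post", "The full post body.", "A short teaser for the tweet.",
)
print(request_xml)
```

To actually publish, the same arguments could be passed through `xmlrpc.client.ServerProxy` pointed at a blog’s XML-RPC endpoint, whose URL depends on the installation.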
But how to implement the excerpt? My initial plans revolved around some kind of in-post markup that would set off the excerpt from the post content. I could just pre-process the text, find that markup and deal with it from there. But now I had to figure out some kind of suitable markup, and that’s where I ran into a stumbling block: I didn’t like any of the options I could conceive of. There were ideas like setting the excerpt apart from the rest of the post with a line of special characters, like the ~ or some such. I also considered using a prepend character for each line. I even considered creating a directive, like what’s used in reStructuredText.
But none of these felt right. Choosing a new character for plain text markup is always problematic since it won’t necessarily be obvious what it means. Directives would require a whole new code infrastructure, and I didn’t like what they’d do to the file. A prepend character seemed a real possibility, but I already used Markdown for content markup and I felt it might be confusing. Plus, I’m strongly considering adding the ability to use other plain text markups like reST, so there might be collisions.
I finally, reluctantly, decided to leverage what I already had: the header. I say reluctantly because my header parsing is line-based, and in order to support an excerpt in my header, I’d have to somehow provide the capability for multiline value assignment. I couldn’t expect users to string potentially two or three hundred characters on a single line. The idea of mucking with the header parsing was unappealing, since I didn’t want to break the code.
The obvious choices were some kind of end-of-line escape, or just quotes. But again, both had their drawbacks. Escaping the end of a line is non-obvious for a user and looks ugly. Quotes meant I lost another character from potential usage in the excerpt content.
Ultimately, I opted to try using quotes, and figured I’d introduce backslash escapes so quotes could still be used as part of the text. While implementing the functionality, I was staring at my code when I realized that a solution was sitting right there in my source code.
Blogtool is written entirely in Python, and Python has a string quoting mechanism, the triple-quoted string, best known from its use in docstrings. A docstring is delimited by a very simple, very obvious sequence:
"""This is a docstring in python and those three consecutive double-quotes
mark the beginning of it. The next sequence of 3 double-quotes will mark
the end of the docstring.
Everything you see here would still be considered part of the string,
including all of these characters: "',.<>/?!
I'm about to close this docstring..."""
Using a sequence of characters to delimit a long string, I realized, solved a number of problems with little drawback. Multiline support is a snap: once the opening delimiter is found, everything up to the next delimiter is part of the string, regardless of the characters in between. There is no need for escaping characters, because the three-quote sequence is specific enough that it is unlikely to turn up in ordinary text. It’s obvious and clean from a usage standpoint. And it was general enough that I didn’t have to contrive the code specifically for one new keyword in the header. The only drawback was mucking with the header parsing code. Yet even there, it wasn’t all bad: the concept was simple, so it wasn’t necessary to rip apart the existing parsing code.
The implementation only took a few minutes of coding. I tested it, found a couple of edge cases and improved upon it. I now had the capability to use any character for header values while also gaining the ability to support multi-line values, all while maintaining a simple syntax structure.
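The idea can be sketched in a few lines of Python. This is an illustration of the technique under stated assumptions, not blogtool’s actual implementation: the `parse_header` function, the `KEY: value` layout and the field names are made up for the example. A value that starts with three double-quotes keeps absorbing lines until the closing three double-quotes turn up; every other value is handled line by line as before.

```python
DELIM = '"""'


def parse_header(text):
    """Parse 'KEY: value' lines; a value wrapped in triple quotes may span lines."""
    fields = {}
    lines = iter(text.splitlines())
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines between fields
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a header line: {line!r}")
        key, value = key.strip(), value.strip()
        if value.startswith(DELIM):
            rest = value[len(DELIM):]
            parts = []
            # Keep consuming lines from the same iterator until the closing delimiter.
            while DELIM not in rest:
                parts.append(rest)
                try:
                    rest = next(lines)
                except StopIteration:
                    raise ValueError(f"unterminated value for {key!r}") from None
            parts.append(rest.split(DELIM, 1)[0])
            value = "\n".join(parts)
        fields[key] = value
    return fields


# Hypothetical header with a two-line excerpt containing quotes and a colon.
SAMPLE = '''TITLE: A post
EXCERPT: """First line, with "quotes" and: colons
second line."""
CATEGORIES: code
'''

fields = parse_header(SAMPLE)
print(fields["EXCERPT"])
```

Because the multiline case is just a small branch inside the existing per-line loop, single-line fields are untouched, which matches the point above about not having to rip apart the line-based parser.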
And none of that would have been possible if I hadn’t gotten everything wrong to start.