Python

Fun With Numbers

Periodically, I try to take a look at our home finances to see if there’s something that can be done to find some hidden stash of money. So far, my efforts have been for naught.

One expense I always investigate is our mortgage payment. I’ve always tried to pay ahead on the mortgage to save on future interest payments. So yesterday I got curious about the best time to pay the curtailment (the extra principal payment): at the same time as the regular payment, halfway through the month, or on some other day of the month? I could have resorted to a web page that calculates amortization tables, but what fun is that?

So I wrote some python code that can be used to generate a repayment table.

Here’s the meat of it:

# Days in each month of a non-leap year, and the index of the current month.
Months = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
Month = 0

def setPaymentParameters(payment, rate, day = 0):

    monthlyrate = rate / 100.0 / 12

    def _calcMonth(principal, curtailment = 0):

        def _calcADB(principal, curtailment, day):
            global Month, Months

            dim = Months[Month]
            Month += 1
            if Month == 12:
                Month = 0
            return ((day * (principal + curtailment)) + ((dim - day) * principal))/dim

        # _calcMonth code starts here...
        adb = _calcADB(principal, curtailment, day)
        interest = adb * monthlyrate
        return (principal,
                adb,
                interest, 
                payment-interest,  # principal payment
                curtailment,
                (payment-interest)+curtailment,  # total principal payment
                principal-((payment-interest)+curtailment))

    return _calcMonth

So the setPaymentParameters function returns a function that will calculate the monthly interest, principal payment and so forth for a single month. The function returned is a closure over the set monthly payment, the interest rate and the day of month a theoretical curtailment payment is made. No curtailment is necessary for the function to work.

In order to determine the effect of curtailments made separately from the normal payment, the calculation uses an average daily balance method. For instance, a normal payment is typically made on the 1st of the month and a separate curtailment payment is made on the 15th. The average is calculated by counting the days the principal sits at the post-payment level, multiplying by that balance, and adding the days the principal sits at the post-curtailment level multiplied by that lower balance. Then divide by the total number of days in the month to get the average daily balance. In the absence of a curtailment, the calculation simplifies to the principal balance at the beginning of the period.
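To make the arithmetic concrete, here is a minimal sketch of the average daily balance calculation on its own. The function name `average_daily_balance` and its parameter names are mine, not part of the code above, and the sketch is written for Python 3. It takes the balance before the curtailment, so the curtailment is subtracted for the days after it is paid.

```python
def average_daily_balance(balance, curtailment, day, days_in_month):
    """Average balance over a month when `curtailment` is paid on `day`.

    `balance` is the principal after the regular payment but before the
    curtailment (hypothetical parameter names, for illustration only).
    """
    days_before = day                  # days spent at the higher balance
    days_after = days_in_month - day   # days spent at the reduced balance
    weighted = days_before * balance + days_after * (balance - curtailment)
    return weighted / days_in_month


# A $500 curtailment paid on the 15th of a 28-day month.
adb = average_daily_balance(149656.25, 500.00, 15, 28)
print(format(adb, ".2f"))   # prints 149424.11
```

With a curtailment of zero, both terms use the same balance and the function simply returns that balance.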

Following is an example of how to use the function:

Rate = float(5.25)
Payment = float(1000.00)

Amortization = []
CalcMonth = setPaymentParameters(Payment, Rate, 15)
Principal = float(150000.00)
while (Principal > 0):
    t = CalcMonth(Principal)
    Amortization.append(t)
    Principal = t[6]

print len(Amortization)
interestTotal = 0.0
for row in Amortization:
    print [format(x, ".2f") for x in row]
    interestTotal += row[2]

print format(interestTotal, ".2f")

The output won’t be particularly pretty, but it will list the total number of payments made to pay off the loan, followed by a breakdown of the effect of each monthly payment, followed by a calculation of the total interest paid. A monthly payment line will look like this:

['150000.00', '150000.00', '656.25', '343.75', '0.00', '343.75', '149656.25']

From left to right, we have the beginning principal, the average daily balance, the interest for the month, the principal paydown, the principal curtailment, the total principal paydown and finally the principal balance after the payment is applied. Each subsequent month uses this final principal balance number as the beginning balance.
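Since the positions in the tuple are easy to mix up, one option is to give them names. The sketch below uses `collections.namedtuple` and is written for Python 3; the `MonthRow` type and its field names are my own labels for the seven columns described above, not something the original code defines.

```python
from collections import namedtuple

# Hypothetical field names for the seven values returned for each month.
MonthRow = namedtuple("MonthRow", [
    "start_principal",
    "average_daily_balance",
    "interest",
    "principal_payment",
    "curtailment",
    "total_principal_payment",
    "end_principal",
])

# The sample line from above, as numbers.
row = MonthRow(150000.00, 150000.00, 656.25, 343.75, 0.00, 343.75, 149656.25)

print(format(row.interest, ".2f"))   # prints 656.25
print(row.end_principal == row[6])   # prints True
```

A namedtuple is still a tuple, so code that indexes the row by position keeps working.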

The above snippet doesn’t use a curtailment payment to accelerate the paydown of the mortgage. To do that, the while loop needs to be modified slightly:

Curtailment = float(500.00)
while (Principal > 0):
    if len(Amortization) == 0:
        temp = CalcMonth(Principal)
        t = (temp[0],
             temp[1], 
             temp[2],
             temp[3], 
             Curtailment, 
             Payment-temp[2]+Curtailment,
             Principal-(Payment-temp[2]+Curtailment))
    else:
        t = CalcMonth(Principal, Curtailment)
    Amortization.append(t)
    Principal = t[6]

The modification is needed for the first payment. Since it’s the first payment, no curtailment is made, so the interest is calculated on the entire loan amount. The returned payment info needs to be modified then, manually inserting the curtailment payment. Thereafter, all calculations use the curtailment.

Here are the first couple of payment output lines:

['150000.00', '150000.00', '656.25', '343.75', '500.00', '843.75', '149156.25']
['149156.25', '149424.11', '653.73', '346.27', '500.00', '846.27', '148309.98']

The curtailment payment is included and the ending principal balance includes the extra payment. Notice the second line’s average daily balance number, which is higher than the starting principal balance. To fully understand that, first notice that setPaymentParameters was called with the day set to 15, meaning the curtailment payment is applied on the 15th of the month, not the same day as the normal payment. Therefore, there are 15 days where the principal sits without the curtailment payment applied. Then the curtailment is applied for the remainder of the month. The end result is that the ADB, which is used to calculate interest, is slightly higher than the principal balance after the curtailment.

The final answer to my question about the optimal day to apply the curtailment turned out to be this: it saves the most money if the curtailment is paid on the same day as the normal payment. This makes sense since, in general, paying earlier means the outstanding principal is reduced more quickly and therefore interest is minimized.

But, that’s not the whole picture. Sometimes, for monthly household cash flow purposes, it is preferable to make multiple smaller payments. Will that result in a big difference in total interest paid? The answer there turns out to be no, it won’t. Depending on the amount owed and repayment length, the difference is only a few hundred dollars.
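For anyone who wants to try the comparison themselves, here is a self-contained sketch in Python 3 that follows the same model as the code above: average daily balance interest, no curtailment counted in the first month's interest, and an untrimmed final payment. The function name `total_interest` and its parameters are my own, and the loan figures are just the example values used earlier.

```python
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def total_interest(principal, payment, rate, curtailment, day):
    """Total interest paid when the curtailment lands on `day` each month."""
    monthly_rate = rate / 100.0 / 12
    total = 0.0
    month = 0
    while principal > 0:
        dim = DAYS_IN_MONTH[month % 12]
        # No curtailment has been paid yet going into the first month.
        pending = curtailment if month > 0 else 0.0
        # `principal` is already net of the last curtailment, so add it
        # back for the `day` days before it was actually paid.
        adb = (day * (principal + pending) + (dim - day) * principal) / dim
        interest = adb * monthly_rate
        total += interest
        # Like the original loop, the last payment is not trimmed to the
        # remaining balance, so the final month slightly overpays.
        principal -= (payment - interest) + curtailment
        month += 1
    return total


same_day = total_interest(150000.00, 1000.00, 5.25, 500.00, day=0)
mid_month = total_interest(150000.00, 1000.00, 5.25, 500.00, day=15)
print(format(same_day, ".2f"), format(mid_month, ".2f"))
print(format(mid_month - same_day, ".2f"))
```

The last line prints the extra interest paid over the life of the loan when the curtailment is made on the 15th instead of with the regular payment.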

Design Is Not a Straight Line

I’ve recently regained an interest in my blog client blogtool. A big part of that renewal is due to unfinished business- I’d always meant to release it into the wild but had never taken the time to learn how to package it. I finally took that plunge a few weeks ago. Ever since, I’ve come up with a series of improvements, fine tunings and new ideas to make it a more capable tool and a better piece of software in general.

Release Announce- blogtool v1.1.0

I’ve just uploaded blogtool v1.1.0 to pypi.

The minor release number bump is due to switching the option parser library as well as adding the ability to process information from the standard input. The comment option has also been modified to take a couple of arguments.

I’ve added some spiffy, new web based documentation to help with getting up and running with blogtool. The documentation was generated with the help of sphinx, a very cool tool that uses a different plain-text markup format (reStructuredText) that I’ll be exploring adding support for in blogtool.

Announce- blogtool v1.0.1

I’ve released blogtool version 1.0.1 into the wild.

This is a bug fix version. It fixes an error in HTML output where tags like <img> were not being properly closed. It also takes care of stray ‘&’ characters that need to be escaped.

It also fixes some bugs in the getpost option related to converting the post HTML into its markdown equivalent. Nested inline elements were not properly accounted for and escaping of a number of characters was also added.

Release Announcement- blogtool

I wrote a blog client a couple years ago and have been developing it on and off ever since. One of the reasons I hadn’t done anything public with it is I needed to take the time to organize it appropriately for something like pypi.

I’ve finally taken those steps and have put it out into the wild. The source code is on github, here. I’ve also used python’s setuptools to publish it on pypi, here.

It works with my self-hosted WordPress blog and I’ve used it for all but a handful of the posts I’ve written there, so I consider it reasonably well tested for those purposes. It won’t support all of WordPress’s features, but I plan on changing that as I migrate some of the functionality over to using more of the WordPress API. When I originally wrote blogtool, WordPress didn’t have its own API for posting, so that’s why that shortcoming exists.

There are a couple of nice features to blogtool that I thought I’d mention here. One, it uses python-markdown to mark-up post text. It’s proven very capable for my style of blogging, which is 90% text. It handles pictures as well, and I’ve added a little wrinkle for that purpose. Rather than supply a URL or some such for markdown's syntax, simply supply a file path to the picture. Then, blogtool will take care of the rest.

The other nice feature is that posts can be retrieved and edited from a blog. When retrieving, it will reformat the HTML into markdown style format. This is useful for editing comments as well as posts.

So, there it is. My first published code project.

Dealing with Unicode in Python

I haven’t touched the code for the blog client I’d written in quite a while. This is largely because it works well for my purposes and I haven’t had the need to add further support for other features.

There has been one major shortcoming for it, however, that I hadn’t taken the time to investigate and correct. Oftentimes, when quoting text from an article on the web, I would get a unicode decode error related to the blob of text I’d copied from the browser.

Now, I understood in general terms what the problem was: stray characters within the copied text were not ASCII characters and markdown chokes on those characters. I had an inelegant workaround that kept me from properly dealing with the problem: I’d scan the text for offending characters, typically punctuation, and replace them with reasonable ASCII equivalents. It was a pain, but it worked.

Like all workarounds, this method had limitations. Specifically, certain special characters, like letters with umlauts, tildes, or grave or acute accents over them, cannot be duplicated in ASCII. The fact that I didn’t run into that problem a lot kept me from dealing with it sooner. Also, scanning a block of text for unicode violators is tedious.

What I failed to understand at the time was that the characters on a web page were encoded in some kind of format, like UTF-8 for example. For plain ASCII characters (the letters without umlauts and the like), the UTF-8 bytes and the unicode code points are identical. The problem comes in when characters don’t line up so neatly. What I finally came to understand was that the encoded web page text needed to be decoded into unicode prior to processing. The concept seems so blisteringly obvious, now, that I’m actually perplexed as to how I never grasped it originally.
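A short illustration of that distinction, written as a sketch for Python 3, where encoded text is `bytes` and decoded text is `str`. The sample strings are my own.

```python
plain = "cafe"
accented = "caf\u00e9"   # "café", with an e-acute

# For ASCII characters, each UTF-8 byte equals the code point.
print(list(plain.encode("utf-8")) == [ord(ch) for ch in plain])   # True

# The accented letter is one code point but two UTF-8 bytes.
encoded = accented.encode("utf-8")
print(len(accented), len(encoded))   # 4 5

# Decoding with the right encoding recovers the text...
print(encoded.decode("utf-8") == accented)   # True

# ...while the ASCII codec cannot handle the extra bytes.
try:
    encoded.decode("ascii")
except UnicodeDecodeError as err:
    print("ascii failed:", err.reason)
```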

So I finally fixed the problem. Or, perhaps better put, I came up with a solution with a better set of trade-offs. Because in order to actually “fix” the problem, it would be necessary to always know how text had been encoded. Unfortunately, from the program’s perspective, it can’t be done.

But it can make some educated guesses.

Here’s the basic code that fixes the problem:

for encoding in ['ascii', 'utf-8', 'utf-16', 'iso-8859-1']:
    try:
        xhtml = markdown.convert(text.decode(encoding))
    except (UnicodeDecodeError, UnicodeError):
        continue
    except:
        print "Unexpected Error: %s\n" % sys.exc_info()[0]
        sys.exit(1)
    else:
        return helperfunc(xhtml)

In this case, markdown is an object for marking up markdown formatted text. Prior to passing the text to the markdown object, I decode it using the encodings I’m most likely to run into. If a decoding fails, a UnicodeDecodeError gets raised, which is caught by the first except clause. That clause merely passes control back to the for loop where the next encoding is selected and tried. Rinse, repeat. When no exception is raised, control passes to the else clause where normal program flow continues on the returned xhtml from markdown.

This section of code eliminates, in my case, almost all occurrences of the unicode problems explained above. But that’s because the vast majority of webpages I use are encoded using UTF-8. I’ve since added a command line option to specify the encoding to use for decoding purposes. This should provide a means to cover all other situations that arise. In this instance, when the user specifies the encoding on the command line, the user specification supersedes all other encodings and is used. The presumption is the user knows what they are doing.

The code to support that looks like this:

if charset:
    encodings = [charset]
else:
    encodings = ['ascii', 'utf-8', 'utf-16', 'iso-8859-1']

for encoding in encodings:
 .
 .
 .

The rest of the code looks identical to the above snippet.
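Pulled together, the fallback loop and the override can be sketched as a single function. This is a Python 3 rendition under my own naming (`decode_text`, `DEFAULT_ENCODINGS`), with the markdown conversion left out so that it only shows the decoding step; it is not the code from blogtool itself.

```python
DEFAULT_ENCODINGS = ["ascii", "utf-8", "utf-16", "iso-8859-1"]


def decode_text(raw, charset=None):
    """Return (text, encoding) for the first encoding that decodes `raw`.

    `raw` is a bytes object; `charset`, if given, is the only encoding tried.
    """
    encodings = [charset] if charset else DEFAULT_ENCODINGS
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeError:
            # UnicodeDecodeError is a subclass of UnicodeError.
            continue
    raise ValueError("could not decode text with any of %r" % (encodings,))


raw = "na\u00efve caf\u00e9".encode("utf-8")
text, used = decode_text(raw)
print(used)   # utf-8
```

One thing worth knowing about the default list: iso-8859-1 maps every possible byte to a character, so it never fails and effectively acts as the catch-all at the end. The ValueError can only be reached when a specific charset is forced.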

It was a good exercise for me to muddle through, as I now fully comprehend the unicode problems that can arise and how to deal with them. The basic rules are:

  1. Decode text going into the program.
  2. Encode text coming out of the program.
  3. Use unicode for the string literals within the program.

These should help keep me out of unicode trouble in the future.
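The three rules can be sketched in a few lines. This is written for Python 3, where string literals are already unicode; the function name `shout` and the sample bytes are made up for illustration.

```python
def shout(raw, encoding="utf-8"):
    """Decode incoming bytes, work on unicode text, encode the result."""
    text = raw.decode(encoding)          # rule 1: decode text going in
    result = text.upper() + "!"          # rule 3: unicode inside the program
    return result.encode(encoding)       # rule 2: encode text coming out


incoming = "d\u00e9j\u00e0 vu".encode("utf-8")
outgoing = shout(incoming)
print(outgoing.decode("utf-8"))
```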

python-markdown Typed-List Extension

I’ve contributed a bit to the python-markdown project in the form of bug fixes. Today I finished creating an extension that allows python-markdown to recognize list types and generate code with the appropriate list markers. For instance, lists can be marked with upper or lower case letters, or upper or lower case Roman numerals.

The git repository for the extension is available here.

Down and Dirty Mail Notification

Following is a simple new mail notification implementation for the awesome window manager that leverages procmail. Its main virtue is simplicity: there are about 20 lines of python code, 1 procmail recipe and several lines of code required in the rc file for awesome. The result is a numeric count of new email displayed in the statusbar.

Python 2.7 Upgrade

This is one of those posts where I’m setting a marker for when the sh** hits the fan, so to speak. There are no deep dark secrets revealed here, though there is some geek talk after the jump. Don’t say you weren’t warned.

Python Class Attributes

While working on a piece of code, I ran into an unexpected bug- in my program not Python. I’ve since done some investigating and I’ve learned something about how Python deals with class attributes and instance attributes. Following is a little exposition of what I’ve learned.

uzbl and dmenu

I’ve been playing around with uzbl again and decided it was high time I tried out dmenu. What I’d read made it sound pretty slick, but I was leery of having to learn how to work with another application. Thankfully, dmenu is extremely easy to use. It isn’t available as a deb package, but the source is readily available. I built it with the vertical patch.

As practice I figured I’d rewrite one of the stock uzbl scripts as a python script. I chose the load_url_from_bookmarks.sh script since it was pretty easy for me to decipher. That’s not sayin’ much since my bash scripting foo is, well- ‘miserable’ is probably the right word.

The exercise proved valuable for a couple of reasons.

Dynamic Modules in Python

I have defined a rather simple class for an XML proxy server to facilitate interacting with a blog. Right now, I’ve implemented a WordPress version so I can work with my own blog. Theoretically, it should be possible to implement other classes specific to other blog types thus making my program more general, and useful to others.

What I hadn’t figured out was how to structure the code so as to minimize monkeying around with multiple files when adding the new class. My goal was to come up with a structure that simply called for adding a module to extend the functionality.

Today, I finally came up with something and it utilizes the dynamic module loading capabilities of python.
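As a rough idea of what dynamic module loading looks like in general, here is a minimal sketch using the standard library's `importlib.import_module` (Python 3). The function name `load_class` is hypothetical, and the example loads a standard-library class only so that it runs anywhere; it is not blogtool's actual code.

```python
import importlib


def load_class(module_name, class_name):
    """Import `module_name` at runtime and return the named class from it."""
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Stand-in for something like a per-blog-type proxy module.
OrderedDict = load_class("collections", "OrderedDict")
print(OrderedDict.__name__)   # OrderedDict
```

With this kind of arrangement, supporting a new blog type could come down to dropping in one new module and passing its name as a string.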

A Personal Milestone

Having used open source software for years now, I’d never really done anything to try and reciprocate, as it were. Well, I’ve finally gotten my chance and contributed a 3-line bug fix to the python-markdown package.

Here’s the fix, marked for posterity’s sake.

Python Unicode

I’ve been working on a piece of code to convert a blogpost into Markdown text. Yes, I’m aware of the html2text.py module (mine is html2md, so nyeah!) All I’ll say is how the heck does one learn anything if they keep relying on other people’s work?

Anyway, I’ve got a naive implementation working now (it won’t handle more complicated nestings like lists in blockquotes), but I ran into a snag involving unicode. Upon retrieval of a particular post, I got the following error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 174: ordinal not in range(128)

It came up in the context of passing a string to the iterparse function of the cElementTree module. The character in question is a ‘non-breaking space.’ Frankly, I wasn’t sure how it got there, but I verified its presence in the string and then set to figuring a way to deal with it. I believe this is an instance of mixing strings and unicode together, rather than dealing solely in one or the other.

I found a solution in the unicodedata module. There is a function called normalize which, given the NFKD form, replaces unicode characters with their compatibility equivalents. In this case, the non-breaking space becomes a plain ASCII space. Now this solution is far from perfect, since normalization only helps when such an equivalent exists: characters with tildes above them are merely split into a base letter plus a combining mark, and letters from, say, the Russian alphabet are left untouched.

The following code additions fixed my problem:

import unicodedata as ud
    .
    .
    .
uhtml = ud.normalize('NFKD', unicode(html))

Where html is a string of html directly from the website. I can then pass uhtml to the iterparse function and the error is gone (because the u'\xa0' characters are translated to ASCII space characters) and the rest of the program is able to do its thing.
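A quick check of what NFKD normalization does to a few characters, as a sketch in Python 3 (where every `str` is unicode, so the `unicode()` call is not needed). The sample strings are my own.

```python
import unicodedata

html = "one\u00a0two"   # contains a non-breaking space (U+00A0)
normalized = unicodedata.normalize("NFKD", html)
print(normalized == "one two")   # True

# An accented letter is split into a base letter plus a combining mark...
print([hex(ord(ch)) for ch in unicodedata.normalize("NFKD", "\u00e9")])
# ['0x65', '0x301']

# ...and a Cyrillic letter has no decomposition, so it is left unchanged.
print(unicodedata.normalize("NFKD", "\u0436") == "\u0436")   # True
```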

I don’t know if this will be the final solution, but it allows me to continue with the development I had been originally interested in. I was aware of the potential for unicode issues, but had hoped they wouldn’t crop up. This gives me a simple 2 line way to deal with it for now.

Learned Something New

I’m in the process of learning how to use lxml- a fast, powerful XML parser for python that relies on libxml2. I don’t know much in the way of details regarding xml so I got stuck as soon as I got started.

I was passing straight markdown generated XHTML into the various xml parser methods and objects for lxml. All of them were dying with the same error. The only parser that worked was the HTML parser.

The xml parsers kept coughing up the following error:

lxml.etree.XMLSyntaxError: Extra content at the end of the document, line 1, column 222

I really had no clue as to what this might mean. Since the output was from markdown, I reasoned there was little likelihood of malformed XHTML or some such. Besides, I knew that it rendered fine on web pages. Likely, there was some significant detail I was missing. Unfortunately, my lack of xml knowledge meant I didn’t have anything to fall back on for solving the problem.

Finally, I turned up a comment thread where I learned that xml documents require a “root element.” I had seen the term “root” but only had a vague notion of what it meant. Now, I know exactly what is meant.

Prior to passing the markdown string to the parser, I performed the following operation:

rootedhtml = "<post>%s</post>" % html

Where html was the markdown output. I then passed rootedhtml to the parser and it no longer chokes.
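The same behaviour can be reproduced without lxml. The sketch below uses the standard library's `xml.etree.ElementTree` as a stand-in (Python 3), so the exact error text differs from the lxml message above, and the sample markup is my own.

```python
import xml.etree.ElementTree as ET

html = "<p>first paragraph</p><p>second paragraph</p>"

# Two top-level elements: not a well-formed XML document.
try:
    ET.fromstring(html)
except ET.ParseError as err:
    print("parse failed:", err)

# Wrapped in a single root element, the same markup parses.
rooted = "<post>%s</post>" % html
root = ET.fromstring(rooted)
print(root.tag, len(root))   # post 2
```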

Now I can get back to solving my original problem.

Python-markdown Strike

I’ve incorporated markdown into my blog client and in the course of doing so, I saw that it could be extended. So I thought I’d give it a go and see if I could add a ‘strike’ extension to the markdown module.

It turned out to be almost trivially simple thanks to the documentation. I chose to use a double hyphen around a word to create the strike. I looked elsewhere and double tildes seem to be another nice way to do it. The code below shows it for the double hyphen but the code would be identical in either case except for the RE definition.

Here’s the code:

import re

import markdown
from markdown import etree

STRIKE_RE = r'(-{2})(.+?)\2'

class StrikeExtension(markdown.Extension):
    def extendMarkdown(self, md, md_globals):
        md.inlinePatterns.add('strike', markdown.inlinepatterns.SimpleTagPattern(STRIKE_RE, 'strike'), '>strong')

def makeExtension(configs = None):
    return StrikeExtension(configs = configs)

That’s it.

The SimpleTagPattern object is a general purpose object that’s part of the markdown library and it’s used to create text rules involving inline patterns such as emphasis or strong. That’s all my version of the strike rule is.
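To see what the regular expression matches without involving markdown at all, here is a standalone sketch using only the `re` module (Python 3). As far as I can tell, python-markdown wraps inline patterns in an extra leading group, which is why the extension's pattern refers to the delimiter as `\2`; in a standalone regex the delimiter is group 1, so the sketch uses `\1`. The `strike` helper and the constant names are mine.

```python
import re

# Standalone versions of the pattern: the delimiter is group 1 here.
HYPHEN_STRIKE_RE = r'(-{2})(.+?)\1'
TILDE_STRIKE_RE = r'(~{2})(.+?)\1'


def strike(text, pattern):
    """Wrap each delimited run in <strike> tags."""
    return re.sub(pattern, r'<strike>\2</strike>', text)


print(strike("this is --wrong-- right", HYPHEN_STRIKE_RE))
# this is <strike>wrong</strike> right
print(strike("this is ~~wrong~~ right", TILDE_STRIKE_RE))
# this is <strike>wrong</strike> right
```

The tilde version differs only in the delimiter character, which matches the point above that only the RE definition would change.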

I implemented my extension as a module, so I had to put it in the ‘extensions’ directory of my markdown library. It’s also possible to just incorporate the extension in project code and make markdown aware of it, but I haven’t tried that yet.

Pythonic Code

I came up with another nice little piece of python code to solve a problem while working on my blog client.
