Normalizing a MAC address string

Over the last few days, I have been spending some time working on my Python - reading the sections of Dive Into Python that I had never got around to and refactoring parts of some of my Python scripts to make better use of the language's features and, ultimately, to make them more robust (i.e. usable by people other than me).

The script I have started with is a simple one for registering hosts for DHCP access. Basically, it takes two command line arguments - a fully qualified hostname and a MAC address - and then does some validation, checks that neither address is already in use, normalizes the output to the correct format, constructs a properly formatted host stanza and appends it to the end of our ISC DHCP server's dhcpd.conf configuration file.

I have made improvements to various parts of the code, but the changes I am most conflicted about are those I have made to the MAC address normalization function. It works reliably, so it probably isn't a good candidate for refactoring, but (to me anyway) it looks inelegant - something which I think matters in an elegant language like Python.

The normalization function takes as input a MAC address in one of three (string) formats - unix (00:11:22:33:aa:55), Windows (00-11-22-33-AA-55) and Cisco (0011.2233.aa55) - and returns the same MAC address in the lowercase unix format.

Since I am trying to move towards a more test-driven development approach, I started out by writing a very basic unit test to make sure my new function is at least as reliable as the old function. Here is the code for the unit test (test.py):


#!/usr/bin/env python
import unittest
import sys
sys.path.append(".")
import oldmac as mac
#import newmac as mac

class Test(unittest.TestCase):
 """ Unit test/s for MAC address normalization function """
 normalize_values = (
  ('00:11:22:33:aa:55', '00:11:22:33:aa:55'),
  ('0011.2233.aa55', '00:11:22:33:aa:55'),
  ('00-11-22-33-aa-55', '00:11:22:33:aa:55'),
  ('00:11:22:33:AA:55', '00:11:22:33:aa:55')
 )

 def testMacNormalize(self):
  """ Normalize MAC addresses to lowercase unix format """
  for addr, expected in self.normalize_values:
   result = mac.normalize(addr)
   self.assertEqual(expected, result)

if __name__ == "__main__":
 unittest.main()

Here is my original normalization function (oldmac.py):


#!/usr/bin/env python
""" Old MAC normalization function """
import re

def normalize(m):
 """ Normalize a MAC address to lower case unix style """
 m = re.sub("[.:-]", "", m)
 m = m.lower()
 n = "%s:%s:%s:%s:%s:%s" % (m[0:2], m[2:4], m[4:6], m[6:8], m[8:10], m[10:])
 return n

There are two things I don't like about this code:

Use of a regular expression for something as simple as eliminating the delimiters from the string. There must be a simpler way to do this.
And the bit I find inelegant - the construction of the normalised string, n. It looks ugly :)
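As an aside on the first point, one regex-free way to drop several different characters in a single pass is str.translate. The following is only a minimal sketch, assuming Python 3 (where str.maketrans takes a third argument listing characters to delete); the helper name strip_delimiters is made up for illustration and is not part of the original script.

```python
# Hypothetical helper, not part of the original script.
# Assumes Python 3: the third argument to str.maketrans lists
# the characters that translate() should delete.
_DELIMITERS = str.maketrans("", "", ".:-")

def strip_delimiters(addr):
    """Remove '.', ':' and '-' from a MAC address string."""
    return addr.translate(_DELIMITERS)

print(strip_delimiters("00:11:22:33:aa:55"))
print(strip_delimiters("0011.2233.AA55"))
```

Note that this only removes the delimiters; lowercasing and re-grouping the digits still have to happen separately.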

First off, here are the results of running my unit tests against this old function:


$ ./test.py -v
Normalize MAC addresses to lowercase unix format ... ok
----------------------------------------------------------------------
Ran 1 test in 0.000s
OK

So, I set out to rewrite the normalize function more elegantly. Here is the result (newmac.py):


#!/usr/bin/env python
""" New MAC normalization function """

def normalize(addr):

 # Determine which delimiter style our input is using
 if "." in addr:
  delimiter = "."
 elif ":" in addr:
  delimiter = ":"
 elif "-" in addr:
  delimiter = "-"

 # Eliminate the delimiter
 m = addr.replace(delimiter, "")

 m = m.lower()

 # Normalize
 n = ":".join(["%s%s" % (m[i], m[i+1]) for i in range(0,12,2)])

 return n

The differences between this version and the old version are:

I replaced the regular expression with a simple string.replace. Why use a (something-big) when a (something-small) will do?
I replaced the normalization expression with one that does the same thing but uses a list comprehension.
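The list comprehension can arguably be made a little easier on the eye by slicing two characters at a time rather than formatting each pair with "%s%s". This is a small sketch of that idea only; group_pairs is a hypothetical name, and it assumes the delimiters have already been removed so that the input is exactly 12 characters long.

```python
def group_pairs(m):
    """Join a 12-character hex string into colon-separated pairs.

    Hypothetical helper: assumes delimiters were stripped beforehand.
    """
    # m[i:i + 2] picks out characters i and i+1 as one string.
    return ":".join(m[i:i + 2] for i in range(0, 12, 2))

print(group_pairs("00112233aa55"))  # prints 00:11:22:33:aa:55
```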

And my unit test runs against this version:


$ ./test.py -v
Normalize MAC addresses to lowercase unix format ... ok
----------------------------------------------------------------------
Ran 1 test in 0.000s
OK

So, what do you think? An improvement or not?

Have I over-engineered the normalization code by replacing something static, simple and fast with something more complex, less readable and probably slower?
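The "probably slower" part is something that can be measured rather than guessed. Below is a rough sketch using the standard timeit module; the two functions are condensed copies of the listings above, renamed old_normalize and new_normalize so they can live in one file, and the sample input and call count are arbitrary choices. The numbers it prints will vary with machine and Python version, so treat them as indicative only.

```python
import re
import timeit

def old_normalize(m):
    """Regex plus fixed slices, condensed from oldmac.py."""
    m = re.sub("[.:-]", "", m).lower()
    return "%s:%s:%s:%s:%s:%s" % (m[0:2], m[2:4], m[4:6], m[6:8], m[8:10], m[10:])

def new_normalize(addr):
    """Delimiter detection plus list comprehension, condensed from newmac.py."""
    if "." in addr:
        delimiter = "."
    elif ":" in addr:
        delimiter = ":"
    elif "-" in addr:
        delimiter = "-"
    m = addr.replace(delimiter, "").lower()
    return ":".join(["%s%s" % (m[i], m[i + 1]) for i in range(0, 12, 2)])

SAMPLE = "0011.2233.AA55"  # arbitrary sample input

for func in (old_normalize, new_normalize):
    # Time 100,000 calls of each function on the same input.
    seconds = timeit.timeit(lambda: func(SAMPLE), number=100000)
    print("%s: %.3f s" % (func.__name__, seconds))
```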

Is there a better way of normalizing a MAC address string?
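One possible direction, offered as a sketch rather than a definitive answer, is to fold a basic sanity check into the normalization so that malformed input is rejected instead of silently reshaped. The function name normalize_strict and the choice of ValueError are assumptions made for this example. It does not check that a single delimiter style is used consistently, so an input that mixes ":" and "-" would still be accepted.

```python
import string

def normalize_strict(addr):
    """Hypothetical stricter variant: normalize or raise ValueError."""
    # Strip the three delimiter characters used by the supported formats.
    m = addr.replace(":", "").replace("-", "").replace(".", "").lower()
    # A MAC address is 48 bits, i.e. exactly 12 hexadecimal digits.
    if len(m) != 12 or not all(c in string.hexdigits for c in m):
        raise ValueError("not a valid MAC address: %r" % addr)
    return ":".join(m[i:i + 2] for i in range(0, 12, 2))

print(normalize_strict("00-11-22-33-AA-55"))  # prints 00:11:22:33:aa:55
```

Unlike the delimiter-detecting version, this also copes with an address that has no delimiters at all, such as 00112233aa55.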

Note: I have just presented one aspect of the refactoring on this script - the MAC address normalization. This is the part I am most conflicted about because the old code worked fine but the new code looks "cooler" :). The other improvements, such as the more robust validation, I have left out (perhaps for another blog post).


Craig Balfour's Blog
