Thursday, December 2, 2010

Personal income taxes and job creation

With elections over, the issue of whether to extend the Bush-era personal income tax cuts to families earning over $250,000 a year is back in the news. Currently, there is a 33% tax bracket that affects income in excess of $209,250 and a 35% tax bracket for income in excess of $373,650 (for joint filers). So, in effect, under the Democratic proposal, a new tax bracket will be created between the two existing brackets like so:

33%      $209,250 - $250,000
36%      $250,000 - $373,650
39.6%    $373,650+


The exact numbers may still change as the details are worked out, but bear with me.

The Republicans are currently pushing to extend the Bush-era personal income tax cuts to families earning up to $1,000,000 a year, or to extend them to everyone. If the former were to come to pass, again an extra tax bracket would be created, but it would look something like:

33%      $209,250 - $373,650
35%      $373,650 - $1,000,000
39.6%    $1,000,000+


Again, the exact numbers aren't all that important. For the sake of example, though, let's look at a scenario of a family making a cool half-a-million dollars a year in personal income. Currently, this family would be paying approximately $141,563 in federal income tax (assuming the standard deduction of $10,700 for 2010).

Bracket   Range                  Dollars Taxed in this Bracket   Tax Amount
10%       $0 – $16,750           $16,750                         $1,675.00
15%       $16,751 – $68,000      $51,250                         $7,687.50
25%       $68,001 – $137,300     $69,300                         $17,325.00
28%       $137,301 – $209,250    $71,950                         $20,146.00
33%       $209,251 – $373,650    $164,400                        $54,252.00
35%       $373,651 +             $115,650                        $40,477.50
Total:                           $489,300                        $141,563.00

Wow, that's a big number. It is funny to think we're having all this argument over people earning so much money that their taxes are triple what the average American grosses in a year.

Anyway, our poor put-out example family is paying $141,563 out of their $500,000 annual income in federal income tax. That is an effective tax rate of 28.3%.

Under the Republicans' proposed plan, there would be no change in the amount of federal income tax paid by our hypothetical family since they earn less than 1 million dollars a year.

Under the Democrats' proposed plan, our hypothetical family would have to pay $150,592 in tax, which is 30.1% of their income.


Bracket   Range                  Dollars Taxed in this Bracket   Tax Amount
10%       $0 – $16,750           $16,750                         $1,675.00
15%       $16,751 – $68,000      $51,250                         $7,687.50
25%       $68,001 – $137,300     $69,300                         $17,325.00
28%       $137,301 – $209,250    $71,950                         $20,146.00
33%       $209,251 – $250,000    $40,750                         $13,447.50
36%       $250,001 – $373,650    $123,650                        $44,514.00
39.6%     $373,651 +             $115,650                        $45,797.40
Total:                           $489,300                        $150,592.40


So the difference between the two proposals amounts to a 1.8% tax increase in this example. With a little hand-waving, let's just say the argument is over a 2% tax increase affecting families earning between $250,000 and $1,000,000 a year.
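
The bracket arithmetic in the two tables can be checked with a few lines of Python. This is only a sketch: the function name tax_owed and the two bracket lists are my own labels for the figures quoted in this post, and the $10,700 deduction is the one assumed above.

```python
from decimal import Decimal

# (lower bound of bracket, marginal rate), in ascending order.
# Figures are the ones quoted in this post, not an official schedule.
CURRENT_LAW = [
    (0, '0.10'), (16750, '0.15'), (68000, '0.25'),
    (137300, '0.28'), (209250, '0.33'), (373650, '0.35'),
]
DEMOCRATIC_PROPOSAL = [
    (0, '0.10'), (16750, '0.15'), (68000, '0.25'),
    (137300, '0.28'), (209250, '0.33'), (250000, '0.36'),
    (373650, '0.396'),
]


def tax_owed(taxable_income, brackets):
    """Sum the tax due in each bracket for the given taxable income."""
    total = Decimal(0)
    for index, (lower, rate) in enumerate(brackets):
        if taxable_income <= lower:
            break
        # The top bracket has no upper bound.
        if index + 1 < len(brackets):
            upper = brackets[index + 1][0]
        else:
            upper = taxable_income
        total += (min(taxable_income, upper) - lower) * Decimal(rate)
    return total


gross = 500000
taxable = gross - 10700  # standard deduction assumed in this post
difference = (tax_owed(taxable, DEMOCRATIC_PROPOSAL)
              - tax_owed(taxable, CURRENT_LAW))
print(difference, 100 * difference / gross)
```

The difference comes out to a little over $9,000, or about 1.8% of the family's gross income.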

The Republicans claim that this tax will curb job creation. In response to the vote in the House of Representatives approving the Democrats' proposal, Republican representative Gary Miller of California issued a statement saying:
"During these difficult economic times, raising taxes on any American family or small business will not help our economy recover nor foster the private-sector job growth needed to achieve economic recovery. The only thing that Democrats have accomplished by today's vote is yet more uncertainty for our nation's job creators."


Certainly, no one will dispute that the U.S. could use more jobs. But is a 2% personal income tax increase going to materially affect job creation? Returning to the example above, a small business owner making $500,000 a year would see a difference of about $9,000 in their take-home pay. That isn't enough to create even one job.

In any event, this ignores the elephant in the room: the issue being debated is a tax rate on personal income tax, not corporate tax. Why would a business owner pay out income from their business to themselves, incurring personal income tax, only to reinvest that money into their business? Wouldn't it make more sense to create those jobs using *before* tax dollars? And that is what any business owner can do -- and is doing -- right now, under the current tax law. And it is what they'll be able to continue doing no matter what happens with regard to personal income tax.

So it is patently silly to think that a decrease of any kind in the personal income tax is going to affect job creation. The money that creates jobs isn't taxed. You don't get lower than a zero percent tax rate. Businesses are not directly affected by the personal income tax rate.

No, the Republicans' cherished 2% tax cut on families making more than $250,000 a year only helps wealthy people put more money in their pockets. At best, businesses may benefit indirectly by virtue of the fact that wealthy people have more disposable income.

Sunday, November 21, 2010

Where are the protesters?

I'm waiting with bated breath to see whether there will be any protests over the airport nudey scanners and TSA groping this Thanksgiving holiday. I mean, we all remember the riots over universal health care, right?

If the government putting their nose in our personal business rankled people, surely the government putting their hands in our privates will unleash true rage.

Surely people will be up in arms over the TSA taking naked pictures of them and keeping the good ones for fun. Surely people will be angry that their wives and daughters are being felt up by convicted sex offenders. Surely people will be rabid over the loss of their Fourth Amendment rights.

Given the public shows of anger and disgust -- the public backlash -- in town hall meetings regarding something as mundane as health care, I expect the protests against the government's full assault of our privacy and dignity to be something to see. I can't wait.
Update 2010/11/24:
Well, the busiest travel day of the year has passed without any notable protests. I'm not quite sure what to make of that.

Sunday, November 14, 2010

New Chapter

Well, tomorrow is a big day for me: I'll be starting a new job. I'll be working for a super-secretive company in Cupertino which means that I won't be writing any more tech-related posts. I'm sure the next few weeks will be hectic, so I doubt I'll be up to writing much anyway. But I've got some ideas of what I'd like to write about once things settle down.

I'd say "stay tuned" but I'm pretty sure no one is tuned in to my blog as it is. :)

Friday, November 12, 2010

Python: Enumerating IP Addresses on FreeBSD

As promised in my earlier post on enumerating local interfaces and their IP addresses on MacOS X, this time I'll cover how to do the same on FreeBSD and other operating systems that implement the getifaddrs API. Basically, this is just a python wrapper around the getifaddrs interface using ctypes.

The code is a bit longer than I typically like to include in a blog post, but here it goes:
"""
Wrapper for getifaddrs(3).
"""

import socket
import sys

from collections import namedtuple
from ctypes import *

class sockaddr_in(Structure):
    _fields_ = [
        ('sin_len', c_uint8),
        ('sin_family', c_uint8),
        ('sin_port', c_uint16),
        ('sin_addr', c_uint8 * 4),
        ('sin_zero', c_uint8 * 8)
    ]

    def __str__(self):
        assert self.sin_len >= sizeof(sockaddr_in)
        # Pack the address bytes into a byte string for inet_ntop.
        data = bytes(bytearray(self.sin_addr))
        return socket.inet_ntop(socket.AF_INET, data)

class sockaddr_in6(Structure):
    _fields_ = [
        ('sin6_len', c_uint8),
        ('sin6_family', c_uint8),
        ('sin6_port', c_uint16),
        ('sin6_flowinfo', c_uint32),
        ('sin6_addr', c_uint8 * 16),
        ('sin6_scope_id', c_uint32)
    ]

    def __str__(self):
        assert self.sin6_len >= sizeof(sockaddr_in6)
        # Pack the address bytes into a byte string for inet_ntop.
        data = bytes(bytearray(self.sin6_addr))
        return socket.inet_ntop(socket.AF_INET6, data)

class sockaddr_dl(Structure):
    _fields_ = [
        ('sdl_len', c_uint8),
        ('sdl_family', c_uint8),
        ('sdl_index', c_short),
        ('sdl_type', c_uint8),
        ('sdl_nlen', c_uint8),
        ('sdl_alen', c_uint8),
        ('sdl_slen', c_uint8),
        ('sdl_data', c_uint8 * 12)
    ]

    def __str__(self):
        assert self.sdl_len >= sizeof(sockaddr_dl)
        addrdata = self.sdl_data[self.sdl_nlen:self.sdl_nlen+self.sdl_alen]
        return ':'.join('%02x' % x for x in addrdata)

class sockaddr_storage(Structure):
    _fields_ = [
        ('sa_len', c_uint8),
        ('sa_family', c_uint8),
        ('sa_data', c_uint8 * 254)
    ]

class sockaddr(Union):
    _anonymous_ = ('sa_storage', )
    _fields_ = [
        ('sa_storage', sockaddr_storage),
        ('sa_sin', sockaddr_in),
        ('sa_sin6', sockaddr_in6),
        ('sa_sdl', sockaddr_dl),
    ]

    def family(self):
        return self.sa_storage.sa_family

    def __str__(self):
        family = self.family()
        if family == socket.AF_INET:
            return str(self.sa_sin)
        elif family == socket.AF_INET6:
            return str(self.sa_sin6)
        elif family == 18: # AF_LINK
            return str(self.sa_sdl)
        else:
            raise NotImplementedError("address family %d not supported" % family)


class ifaddrs(Structure):
    pass
ifaddrs._fields_ = [
    ('ifa_next', POINTER(ifaddrs)),
    ('ifa_name', c_char_p),
    ('ifa_flags', c_uint),
    ('ifa_addr', POINTER(sockaddr)),
    ('ifa_netmask', POINTER(sockaddr)),
    ('ifa_dstaddr', POINTER(sockaddr)),
    ('ifa_data', c_void_p)
]

# Define constants for the most useful interface flags (from if.h).
IFF_UP = 0x0001
IFF_BROADCAST = 0x0002
IFF_LOOPBACK = 0x0008
IFF_POINTTOPOINT = 0x0010
IFF_RUNNING = 0x0040
if sys.platform == 'darwin' or 'bsd' in sys.platform:
    IFF_MULTICAST = 0x8000
elif sys.platform.startswith('linux'):
    # sys.platform is 'linux2' on Python 2, so match on the prefix.
    IFF_MULTICAST = 0x1000

# Load library implementing getifaddrs and freeifaddrs.
if sys.platform == 'darwin':
    libc = cdll.LoadLibrary('libc.dylib')
else:
    libc = cdll.LoadLibrary('libc.so')

# Tell ctypes the argument and return types for the getifaddrs and
# freeifaddrs functions so it can do marshalling for us.
libc.getifaddrs.argtypes = [POINTER(POINTER(ifaddrs))]
libc.getifaddrs.restype = c_int
libc.freeifaddrs.argtypes = [POINTER(ifaddrs)]


def getifaddrs():
    """
    Get local interface addresses.

    Returns generator of tuples consisting of interface name, interface flags,
    address family (e.g. socket.AF_INET, socket.AF_INET6), address, and netmask.
    The tuple members can also be accessed via the names 'name', 'flags',
    'family', 'address', and 'netmask', respectively.
    """
    # Get address information for each interface.
    addrlist = POINTER(ifaddrs)()
    if libc.getifaddrs(pointer(addrlist)) < 0:
        raise OSError

    X = namedtuple('ifaddrs', 'name flags family address netmask')

    # Iterate through the address information.
    ifaddr = addrlist
    while ifaddr and ifaddr.contents:
        # The following is a hack to work around a bug in FreeBSD
        # (PR kern/152036) and MacOSX wherein the netmask's sockaddr may be
        # truncated. Specifically, AF_INET netmasks may have their sin_addr
        # member truncated to the minimum number of bytes necessary to
        # represent the netmask. For example, a sockaddr_in with the netmask
        # 255.255.254.0 may be truncated to 7 bytes (rather than the normal
        # 16) such that the sin_addr field only contains 0xff, 0xff, 0xfe.
        # All bytes beyond sa_len bytes are assumed to be zero. Here we work
        # around this truncation by copying the netmask's sockaddr into a
        # zero-filled buffer.
        if ifaddr.contents.ifa_netmask:
            netmask = sockaddr()
            memmove(byref(netmask), ifaddr.contents.ifa_netmask,
                    ifaddr.contents.ifa_netmask.contents.sa_len)
            if netmask.sa_family == socket.AF_INET and netmask.sa_len < sizeof(sockaddr_in):
                netmask.sa_len = sizeof(sockaddr_in)
        else:
            netmask = None

        try:
            yield X(ifaddr.contents.ifa_name,
                    ifaddr.contents.ifa_flags,
                    ifaddr.contents.ifa_addr.contents.family(),
                    str(ifaddr.contents.ifa_addr.contents),
                    str(netmask) if netmask else None)
        except NotImplementedError:
            # Unsupported address family.
            yield X(ifaddr.contents.ifa_name,
                    ifaddr.contents.ifa_flags,
                    None,
                    None,
                    None)
        ifaddr = ifaddr.contents.ifa_next

    # When we are done with the address list, ask libc to free whatever memory
    # it allocated for the list.
    libc.freeifaddrs(addrlist)

__all__ = ['getifaddrs'] + [n for n in dir() if n.startswith('IFF_')]
As always, this code is released under a BSD-style license.
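
The netmask workaround is the least obvious part of the code above, so here is a small standalone sketch of just that step. It does not call getifaddrs at all; it fakes a truncated 7-byte sockaddr_in as a byte string and copies it into a zero-filled structure. The names SockaddrIn and expand_truncated_netmask are made up for this illustration, and the layout assumes the BSD-style sockaddr_in with a leading length byte.

```python
import ctypes
import socket


class SockaddrIn(ctypes.Structure):
    # BSD-style sockaddr_in: note the leading length byte.
    _fields_ = [
        ('sin_len', ctypes.c_uint8),
        ('sin_family', ctypes.c_uint8),
        ('sin_port', ctypes.c_uint16),
        ('sin_addr', ctypes.c_uint8 * 4),
        ('sin_zero', ctypes.c_uint8 * 8),
    ]


def expand_truncated_netmask(raw):
    """Return the dotted-quad netmask from a possibly truncated sockaddr_in.

    raw is a byte string whose first byte is the sockaddr's own length.
    """
    # Never copy more than the structure can hold or than we were given.
    length = min(raw[0], len(raw), ctypes.sizeof(SockaddrIn))
    full = SockaddrIn()  # ctypes zero-fills new structures
    ctypes.memmove(ctypes.byref(full), raw, length)
    full.sin_len = ctypes.sizeof(SockaddrIn)
    return socket.inet_ntop(socket.AF_INET, bytes(full.sin_addr))


# A 255.255.254.0 netmask truncated to 7 bytes, as described above.
truncated = bytes([7, socket.AF_INET, 0, 0, 0xff, 0xff, 0xfe])
print(expand_truncated_netmask(truncated))
```

Because the destination structure starts out zeroed, the missing trailing byte of the address reads back as 0 without any special casing.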

Friday, October 22, 2010

Euklas

Carnegie Mellon University has produced a tool called Euklas (Eclipse Users' Keystrokes Lessened by Attaching from Samples). From the description:

Euklas enhances Eclipse's JavaScript editor to help users to more successfully employ copy-and-paste strategies for reuse.

I think I'm going to cry.

Wednesday, October 20, 2010

Python: Enumerating IP Addresses on MacOS X

How do you enumerate the host's local IP addresses from python? This turns out to be a surprisingly common question. Unfortunately, there is no pretty answer; it depends on the host operating system. On Windows, you can wrap the IP Helper GetIpAddrTable using ctypes. On modern Linux, *BSD, or MacOS X systems, you can wrap getifaddrs(). Neither is trivial, though, so I'll save those for a future post.

Luckily, MacOS X provides a simpler way to get the local IP addresses: the system configuration dynamic store. Using pyObjC, which comes pre-installed on every Mac, we can write a straight port of Apple's example in Technical Note TN1145 for retrieving a list of all IPv4 addresses assigned to local interfaces:

from SystemConfiguration import * # from pyObjC
from collections import namedtuple
import socket

def GetIPv4Addresses():
    """
    Get all IPv4 addresses assigned to local interfaces.
    Returns a generator object that produces information
    about each IPv4 address present at the time that the
    function was called.

    For each IPv4 address, the returned generator yields
    a tuple consisting of the interface name, address
    family (always socket.AF_INET), the IP address, and
    the netmask. The tuple elements may also be accessed
    by the names: "ifname", "family", "address", and
    "netmask".
    """
    ds = SCDynamicStoreCreate(None, 'GetIPv4Addresses', None, None)
    # Get all keys matching pattern State:/Network/Service/[^/]+/IPv4
    pattern = SCDynamicStoreKeyCreateNetworkServiceEntity(None,
                                                          kSCDynamicStoreDomainState,
                                                          kSCCompAnyRegex,
                                                          kSCEntNetIPv4)
    patterns = CFArrayCreate(None, (pattern, ), 1, kCFTypeArrayCallBacks)
    valueDict = SCDynamicStoreCopyMultiple(ds, None, patterns)

    ipv4info = namedtuple('ipv4info', 'ifname family address netmask')

    for serviceDict in valueDict.values():
        ifname = serviceDict[u'InterfaceName']
        for address, netmask in zip(serviceDict[u'Addresses'], serviceDict[u'SubnetMasks']):
            yield ipv4info(ifname, socket.AF_INET, address, netmask)

One interesting point regarding this code is that it doesn't actually inspect interface information in the system configuration dynamic store. The interface-related keys are stored under State:/Network/Interface/, but this code (and Apple's example on which it is based) inspects keys under State:/Network/Service/ instead. However, if you want to get IPv6 addresses then you do have to inspect the system configuration's interface information:

from SystemConfiguration import * # from pyObjC
from collections import namedtuple
import socket
import re
ifnameExtractor = re.compile(r'/Interface/([^/]+)/')

def GetIPv6Addresses():
    """
    Get all IPv6 addresses assigned to local interfaces.
    Returns a generator object that produces information
    about each IPv6 address present at the time that the
    function was called.

    For each IPv6 address, the returned generator yields
    a tuple consisting of the interface name, address
    family (always socket.AF_INET6), the IP address, and
    the prefix length. The tuple elements may also be
    accessed by the names: "ifname", "family", "address",
    and "prefixlen".
    """
    ds = SCDynamicStoreCreate(None, 'GetIPv6Addresses', None, None)
    # Get all keys matching pattern State:/Network/Interface/[^/]+/IPv6
    pattern = SCDynamicStoreKeyCreateNetworkInterfaceEntity(None,
                                                            kSCDynamicStoreDomainState,
                                                            kSCCompAnyRegex,
                                                            kSCEntNetIPv6)
    patterns = CFArrayCreate(None, (pattern, ), 1, kCFTypeArrayCallBacks)
    valueDict = SCDynamicStoreCopyMultiple(ds, None, patterns)

    ipv6info = namedtuple('ipv6info', 'ifname family address prefixlen')

    for key, ifDict in valueDict.items():
        ifname = ifnameExtractor.search(key).group(1)
        for address, prefixlen in zip(ifDict[u'Addresses'], ifDict[u'PrefixLength']):
            yield ipv6info(ifname, socket.AF_INET6, address, prefixlen)

In fact, you could easily adapt the above function to be able to fetch IPv4 addresses from the interface configuration.
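
The one piece that any such adaptation would reuse is pulling the interface name out of the dynamic store key, since the interface dictionaries are keyed by path rather than carrying an InterfaceName entry. Here is a sketch of that step on its own, runnable without pyObjC. The helper name interface_name_from_key and the sample key strings are assumptions based on the State:/Network/Interface/[^/]+/IPv6 pattern mentioned in the code comment above.

```python
import re

# Same pattern as ifnameExtractor in the listing above.
ifname_extractor = re.compile(r'/Interface/([^/]+)/')


def interface_name_from_key(key):
    """Return the interface name embedded in a dynamic store key, or None."""
    match = ifname_extractor.search(key)
    return match.group(1) if match else None


# Illustrative keys only; real keys come from SCDynamicStoreCopyMultiple.
print(interface_name_from_key('State:/Network/Interface/en0/IPv6'))
print(interface_name_from_key('State:/Network/Global/IPv4'))
```

Returning None for a key that does not match avoids the AttributeError that calling .group(1) on a failed search would raise.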

Friday, October 1, 2010

Caffeine Deficiency

My wife has been complaining that she always feels nauseous after taking her daily vitamin. On the theory that it was perhaps a result of taking them first thing in the morning on an empty stomach, she tried taking them at night with dinner. I saw with my own eyes what she was talking about: she almost vomited.

So, I told her to start taking mine instead and I'd finish her vitamins off. Well, I'm not doing that again. I don't know what it is about One A Day Women's Active Metabolism vitamins, but they made me sick too. While we hate wasting the money, we just chucked the rest of the bottle because there was no way either of us was ever taking those again. Before we tossed the bottle in the trash, though, we were examining the label for differences between her One A Day Women's Active Metabolism and my One A Day Men's vitamins that might account for the nausea. In particular, we were looking for ingredients that were in higher concentration in the women's vitamins.

There are a few: vitamins D, K, B1, B2, B6, Calcium, and Iron to be exact. But I couldn't find any hard evidence that any of these vitamins could cause the sort of nausea we experienced (at least not in quantities we are likely to be exposed to). But that is when we noticed what makes the women's vitamins earn the "active metabolism" moniker:


Caffeine, and lots of it. Guarana seed is just another source of caffeine, so it is possible that its 50mg is included in the total of 120mg of caffeine in each "vitamin" pill. But if not, that means that one One A Day Women's Active Metabolism pill has more caffeine (~170mg) than a cup of coffee (100-150mg) and approaches that of a single Vivarin pill (200mg).

While neither of us drinks much coffee and we try to avoid caffeinated drinks, I doubt that 170mg of caffeine in the morning would be enough to induce nausea. After all, millions of people have a cup of coffee in the morning and that is approximately the same amount of caffeine.

But, we never expected to have caffeine added to vitamin pills. I guess that is what we get for not having looked at the label before buying them. We certainly won't be making that mistake again.

All that background out of the way, this experience got me thinking: just how much caffeine do people consume every day now? Here my wife was getting a base 170mg a day without even knowing it. The marketing for caffeinated products tends to emphasize that it helps you stay "mentally aware"; how long before being constantly caffeinated is considered the norm and the symptoms of caffeine deficiency are "mentally sluggish"?

At what point does caffeine go from "booster" to baseline? Is there a point in our future wherein caffeine belongs in vitamins and even has an FDA-recommended daily allowance?

Monday, September 27, 2010

Getting the APNS device token

As alluded to in my previous post, this time I'm covering how to get the APNS device token for a given iOS client. Actually, it is pretty straightforward. First, call registerForRemoteNotificationTypes from your application's didFinishLaunchingWithOptions UIApplicationDelegate callback. You need to specify which type of notifications your application will accept. Here is an example:
- (BOOL)application:(UIApplication *)application didFinishLaunchingWithOptions:(NSDictionary *)launchOptions
{
    ...

    // Register with the Apple Push Notification Service
    // so we can receive notifications from our server
    // application. Upon successful registration, our
    // didRegisterForRemoteNotificationsWithDeviceToken:
    // delegate callback will be invoked with our unique
    // device token.
    UIRemoteNotificationType allowedNotifications = UIRemoteNotificationTypeAlert
                                                  | UIRemoteNotificationTypeSound
                                                  | UIRemoteNotificationTypeBadge;
    [[UIApplication sharedApplication] registerForRemoteNotificationTypes:allowedNotifications];

    ...

    return YES;
}


As mentioned in the code comments, the application will then talk to Apple's Push Notification Service in the background and, when the device's unique token has been issued, your application delegate's didRegisterForRemoteNotificationsWithDeviceToken callback is invoked. This is where you actually get the device token.
- (void)application:(UIApplication *)application
    didRegisterForRemoteNotificationsWithDeviceToken:(NSData *)deviceToken
{
    NSLog(@"didRegisterForRemoteNotificationsWithDeviceToken: %@", deviceToken);
    // Stash the deviceToken data somewhere to send it to
    // your application server.
    ...
}

Once your iOS client application knows its own device token, it needs to send it to your application server so that the application server can push notifications back to the client later. How you do this depends on the architecture of your client-server communication.
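
On the server side, it helps to store the token in one canonical form no matter how the client serialized it. The following Python sketch is not part of any Apple or pyapns API: the function name normalize_device_token is hypothetical, and it assumes the client sends either plain hex or the bracketed, space-separated form that NSLog prints for an NSData object. The 32-byte token length is also an assumption, which is why it is a parameter.

```python
import re

_HEX_RE = re.compile(r'^[0-9a-f]+$')


def normalize_device_token(raw, expected_bytes=32):
    """Return the token as a lowercase hex string without separators.

    Raises ValueError if the input is not hex or has the wrong length.
    """
    # Drop the angle brackets and whitespace of the NSData description form.
    token = re.sub(r'[<>\s]', '', raw).lower()
    if len(token) != expected_bytes * 2 or not _HEX_RE.match(token):
        raise ValueError('not a valid device token: %r' % (raw,))
    return token


# A made-up token in the style that NSLog prints for NSData.
sample = '<' + ' '.join(['740F4707'] * 8) + '>'
print(normalize_device_token(sample))
```

Rejecting malformed input at this point keeps bad tokens out of the database instead of discovering them when a push fails later.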

At this point, I should add that device tokens can change. So, I recommend repeating the above logic every time your iOS client application starts so your application server always gets the latest device token for the user's terminal.

I would be remiss to not mention error handling. It is possible that the registerForRemoteNotificationTypes call will fail. The most obvious way it could fail is if the user does not have access to the 3G or WiFi networks and, as a result, cannot communicate with Apple's Push Notification Service; for example, when the device is in Airplane Mode with the wireless signals turned off.

In this case, the didFailToRegisterForRemoteNotificationsWithError delegate callback is invoked instead of didRegisterForRemoteNotificationsWithDeviceToken, and you probably want to retry the registration later when network connectivity is restored.

Monday, September 20, 2010

Using pyapns with django

There is a handy daemon for sending push notifications to iOS-based mobile clients via Apple's Push Notification Service; it is called pyapns. It is implemented in python but, since it runs as a standalone XML-RPC server process, that fact is largely irrelevant. The important facts are that:
  • It properly and fully implements the client interface to APNs, including the requirement for maintaining a persistent connection with Apple's servers rather than repeatedly setting-up and tearing-down SSL connections.

  • It includes client libraries for communicating with the pyapns daemon from python and ruby, although any language that can speak XML-RPC (including C) will work too.
The way it works is: you first start the pyapns daemon process. This process acts as an XML-RPC server handling requests from your application(s), packing them into Apple's binary APNS protocol, and sending them to Apple to deliver to the iPhone, iPad, or iPod.

In order for your applications to send a push notification request, though, they must first tell pyapns which client certificate it should use to authenticate with the APNS servers. Here is a decent guide on how to obtain a client certificate. Once you have a certificate, you have to use the pyapns client library's configure and provision APIs to tell the pyapns daemon process to use your certificate.

If you are implementing your application in django, you can accomplish the configuration and provisioning directly from your django settings.py file like so:
# Configuration for connecting to the local pyapns daemon,
# including our certificate for pushing notifications to
# mobile terminals via APNS.
PYAPNS_CONFIG = {
    'HOST': 'http://localhost:7077/',
    'TIMEOUT': 15,
    'INITIAL': [
        ('MyAppName', 'path/to/cert/apns_sandbox.pem', 'sandbox'),
    ]
}

The pyapns python client library will automatically configure and provision itself from these settings. So, assuming you know the APNS device token of the mobile device you want to send a notification to, all you need to do to send a push notification is to call the pyapns.client.notify() function.

If only it were so easy. One complication arises in that the pyapns provisioning and configuration state is split between the client library and the pyapns daemon process. As a result, there are two scenarios to be wary of:

  1. The django application is restarted. In this case, the client library, which is part of your django application, loses its state and tries to re-configure and re-provision itself from your django settings. Luckily, the client library will re-read the configuration and provisioning settings from settings.py and seamlessly resume communication with the pyapns daemon.

    However, as noted in the pyapns documentation, "attempts to provision the same application id multiple times are ignored." As a result, if you change pyapns configuration in the settings.py file and restart django, you need to restart the pyapns daemon too for the new settings to take effect. Otherwise, if the settings are unchanged, the client library will seamlessly resume communication with the pyapns daemon.

  2. The pyapns daemon is restarted. In this case, the client library thinks it has already configured and provisioned the daemon, but the daemon has lost this configuration due to restart. As a result, any attempt to send a push notification will fail as the daemon does not know how to establish the connection with Apple's Push Notification service.

As I mentioned above, the first scenario isn't a big deal. If you have to restart your web application or the web server for some reason, the connection between the pyapns client library and the daemon process will automatically resume right where it left off. In the rare case that you changed the pyapns settings in your django settings.py file, you need to restart both the django application and the pyapns daemon process for the new settings to take effect.

The latter scenario, though, is a bigger problem because it is impossible to detect until it is too late: that is, it doesn't manifest itself until you try to send a push notification and fail. Luckily, however, we can catch the failure condition and resolve the problem automatically. Specifically, if the pyapns client library fails to send a push notification to the daemon process due to the daemon process not being configured or provisioned, we can force the client library to re-configure and re-provision and retry.

So here you go, a wrapper around the pyapns client library to automatically recover when the backend pyapns daemon has been restarted:
"""
Wrappers for the pyapns client to simplify sending APNS
notifications, including support for re-configuring the
pyapns daemon after a restart.
"""

import pyapns.client
import time
import logging

log = logging.getLogger('APNS')

def notify(apns_token, message, badge=None, sound=None):
    """Push notification to device with the given message

    @param apns_token - The device's APNS-issued unique token
    @param message - The message to display in the
                     notification window
    """
    notification = {'aps': {'alert': message}}
    if badge is not None:
        notification['aps']['badge'] = int(badge)
    if sound is not None:
        notification['aps']['sound'] = str(sound)
    for attempt in range(4):
        try:
            # The app id must match the one provisioned in
            # PYAPNS_CONFIG['INITIAL'] in settings.py.
            pyapns.client.notify('MyAppName', apns_token,
                                 notification)
            break
        except (pyapns.client.UnknownAppID,
                pyapns.client.APNSNotConfigured):
            # This can happen if the pyapns server has been
            # restarted since django started running. In
            # that case, we need to clear the client's
            # configured flag so we can reconfigure it from
            # our settings.py PYAPNS_CONFIG settings.
            if attempt == 3:
                # Out of retries: log the traceback and give up.
                log.exception('failed to send APNS notification')
                break
            pyapns.client.OPTIONS['CONFIGURED'] = False
            pyapns.client.configure({})
            time.sleep(0.5)
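
To see the recovery pattern in isolation, here is a self-contained sketch that swaps the real daemon for a stand-in object. Everything in it (FakeDaemon, NotConfigured, notify_with_recovery) is hypothetical and only mimics the restart scenario described above; it is not the pyapns API.

```python
class NotConfigured(Exception):
    """Stand-in for the errors raised when the daemon lost its state."""


class FakeDaemon(object):
    """Toy daemon that forgets its provisioning when restarted."""

    def __init__(self):
        self.provisioned = False

    def provision(self):
        self.provisioned = True

    def restart(self):
        self.provisioned = False

    def notify(self, token, payload):
        if not self.provisioned:
            raise NotConfigured()
        return (token, payload)


def notify_with_recovery(daemon, token, payload, attempts=4):
    """Send a notification, re-provisioning the daemon if it was restarted."""
    for attempt in range(attempts):
        try:
            return daemon.notify(token, payload)
        except NotConfigured:
            if attempt == attempts - 1:
                raise  # out of retries; let the caller decide
            daemon.provision()


daemon = FakeDaemon()
daemon.provision()
daemon.restart()  # simulate the daemon being restarted behind our back
result = notify_with_recovery(daemon, 'token', {'aps': {'alert': 'hi'}})
```

Unlike the wrapper above, this sketch re-raises on the final attempt rather than just logging, which may be preferable when the caller needs to know that the push was never sent.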

Since I glossed over it in this post, I'll cover how to get the APNS device token for a mobile device in my next post. The device token acts as an address, telling Apple's Push Notification service which mobile device it should deliver your notification message to.

Tuesday, September 7, 2010

Calculating degree deltas for distances on the surface of the Earth

Here is the scenario: you've got your GPS coordinates (latitude & longitude in degrees) of your current position and you want to find the number of degrees north/south and east/west needed to contain an area of some size around you. I know, this sounds contrived, but it comes up if you use the iPhone's MapKit framework and want to zoom a MKMapView to a level where only a certain distance around a location is displayed. In that case, you need a MKCoordinateRegion to pass to the setRegion:animated: method.

One might think that if you know the rate of conversion between meters (or if you prefer, miles) and degrees, this would be a straightforward conversion. The problem is that it isn't so simple. Since the earth is a sphere, the number of meters in one degree of longitude depends on your latitude. For example, at the equator, there are 111.32 kilometers per degree of longitude; at the poles, however, there are 0 meters per degree.

A MKCoordinateRegion is comprised of two components: the coordinates of the center of the region and a span of latitudinal and longitudinal deltas. Let's assume the center is known; for example, it could be your user's current location. To calculate the span, here is a simple function that takes into account the curvature of the earth:
/*!
* Calculate longitudinal and latitudinal deltas in
* degrees for the given linear horizontal and vertical
* distances in kilometers. Longitudinal degrees per
* kilometer vary with latitude, so a coordinate is
* needed as a frame of reference.
*
* @param coord - point of reference.
* @param xDistance - east-west distance in kilometers.
* @param yDistance - north-south distance in kilometers.
* @return MKCoordinateSpan representing the distances
* in degrees at the given coordinate.
*/
static
MKCoordinateSpan
spanForDistancesAtCoordinate(CLLocationCoordinate2D coord,
                             double xDistance,
                             double yDistance)
{
    const double kilometersPerDegree = 111.0;

    MKCoordinateSpan span;

    // Calculate the latitude and longitude deltas that
    // correspond to the distance (in kilometers) at
    // the given coordinate. Note that the longitude
    // degrees calculation is complicated by virtue of
    // the fact that the number of kilometers per degree
    // varies depending on the coordinate's latitude.
    // The north-south distance (yDistance) determines the
    // latitude delta; the east-west distance (xDistance)
    // determines the longitude delta.
    span.latitudeDelta = yDistance / kilometersPerDegree;
    span.longitudeDelta = xDistance / (kilometersPerDegree * cos(coord.latitude * M_PI / 180.0));
    return span;
}

The user's current location, which will become the center of the MKCoordinateRegion, should be passed as the point-of-reference coord argument. This is used to calculate the number of meters per degree at the user's current latitude.

The xDistance and yDistance parameters are the number of kilometers east-west and north-south, respectively, that define the region.

Note that the constant kilometersPerDegree represents the number of kilometers per degree of latitude or the number of kilometers per degree of longitude at the equator; it is only an estimate. Since the Earth isn't a perfect sphere, the actual number varies, but for the sake of most iPhone apps, the estimate of 111.0 kilometers/degree should be sufficient.
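
For anyone who wants to sanity-check the numbers away from Xcode, here is the same arithmetic as a Python sketch. The names CoordinateSpan and span_for_distances are my own, not MapKit's, and the 111.0 constant is the same rough estimate discussed above. East-west distance maps to the longitude delta and north-south distance to the latitude delta.

```python
import math
from collections import namedtuple

CoordinateSpan = namedtuple('CoordinateSpan',
                            'latitude_delta longitude_delta')

KILOMETERS_PER_DEGREE = 111.0  # rough estimate, as in the text above


def span_for_distances(latitude, x_distance_km, y_distance_km):
    """Convert east-west (x) and north-south (y) distances to degrees.

    latitude is the reference latitude in degrees.
    """
    latitude_delta = y_distance_km / KILOMETERS_PER_DEGREE
    # A degree of longitude shrinks with the cosine of the latitude.
    km_per_longitude_degree = (KILOMETERS_PER_DEGREE
                               * math.cos(math.radians(latitude)))
    longitude_delta = x_distance_km / km_per_longitude_degree
    return CoordinateSpan(latitude_delta, longitude_delta)


equator = span_for_distances(0.0, 111.0, 111.0)
north = span_for_distances(60.0, 111.0, 111.0)
print(equator, north)
```

At 60 degrees of latitude the same 111 kilometers east-west spans about two degrees of longitude instead of one. Very close to the poles the cosine approaches zero and the longitude delta blows up, so a real application would want to clamp the result.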

Tuesday, August 17, 2010

RFC 3339-compliant Unicode date format pattern

Here is a quick note just to say that if you need to generate RFC 3339 timestamps or ISO 8601-compliant combined date/time representations, here is the Unicode date format pattern to do so:
yyyy-MM-dd'T'HH:mm:ss.SSSSZ

This could come in handy if, for example, you are using Apple's NSDateFormatter class. NSDateFormatter has no predefined style corresponding to RFC 3339 / ISO 8601 format so you'll need to use a format specifier string instead; NSDateFormatter format strings comply with the Unicode date format patterns. So you can use the format pattern above to parse or output strings that can be exchanged with other RFC 3339 / ISO 8601 compliant systems.

For example, the following Objective-C code will print out the current timestamp as an RFC 3339 / ISO 8601 compliant combined date/time string:
NSDateFormatter *rfc3339 = [[NSDateFormatter alloc] init];
[rfc3339 setDateFormat:@"yyyy-MM-dd'T'HH:mm:ss.SSSSZ"];

NSDate *now = [NSDate date];
NSLog(@"%@", [rfc3339 stringFromDate:now]);
[rfc3339 release];

At the time of this writing, the output looked like "2010-08-17T16:38:21.9640-0700".
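For comparison, here is a rough Python sketch that produces a similar timestamp using only the standard library. The helper name format_rfc3339 is made up for this example. One difference worth noting: Python's isoformat() writes the UTC offset with a colon ("-07:00"), which is the form the RFC 3339 grammar uses for numeric offsets, while the output above has no colon.

```python
from datetime import datetime, timedelta, timezone


def format_rfc3339(moment):
    """Format a timezone-aware datetime as a combined date/time string.

    Hypothetical helper for illustration; it refuses naive datetimes
    because the result must carry a UTC offset.
    """
    if moment.tzinfo is None:
        raise ValueError("a timezone-aware datetime is required")
    # timespec="milliseconds" keeps three fractional digits.
    return moment.isoformat(timespec="milliseconds")


# Reuse the moment from the sample output above, at a fixed -07:00 offset.
pacific_daylight = timezone(timedelta(hours=-7))
sample = datetime(2010, 8, 17, 16, 38, 21, 964000, tzinfo=pacific_daylight)
print(format_rfc3339(sample))
```

To format the current local time instead, pass datetime.now(timezone.utc).astimezone() to the helper.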

Friday, July 16, 2010

HTML encoding of form inputs

I suppose this is common knowledge amongst professional web developers, but I just discovered for myself that if a user enters characters into an HTML form input that are not representable in the character set of the page containing the form, browsers will HTML-encode the non-representable characters when the form is submitted. I just spent over an hour helping a coworker track down a bug in one of our web applications that was due to this poorly documented -- but reasonable -- behavior.

I say "reasonable" because, as obscure as it is, this is really the best thing I think a browser can do given the situation.

To recap, here is the scenario:
  • You have a web page with a form in it that is served using some locale-specific encoding. In our case it was Shift-JIS, but the default ISO-8859-1 encoding leads to the same problem.
  • The user enters text into a form input field that is not representable in the displayed page's character set or encoding. For example, entering Cyrillic characters into a form displayed on an ISO-8859-1 page.
  • When the user submits the form, the browser tries to convert the inputs to the encoding of the page. Any character not representable in that page's character set or encoding has its Unicode character code point encoded as an HTML numeric character reference (e.g. &#0452;).
  • The web application or CGI receiving this input needs to a) know the character encoding of the page that was used to submit the form data so it knows how to interpret the data as characters and b) be prepared to convert any embedded HTML numeric character references back to their corresponding characters.

I like that last part where web applications (or CGIs) have to know the encoding of an HTML page served to the client in order to be able to properly parse input from that client. This fact shatters any remaining fantasies I had of HTTP being stateless.

Anyway, the real surprise is that a web application or CGI needs to be prepared to decode HTML character references in form input. A quick check of perl's CGI.pm and python's cgi module indicates that neither of them does entity decoding of inputs automatically. And considering that information on the web regarding this behavior is sparse, I suspect that most web developers are unaware of it. At the time of writing, I can only find two references [1][2] that document HTML character reference encoding in the scenario described above.
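As an illustration of what the receiving side has to do, here is a minimal Python sketch that converts decimal numeric character references in an already-decoded form value back into characters. The function name decode_numeric_refs is made up for this example, and it assumes the value has already been decoded from the page's encoding into a string. Note that the server has no way to tell a reference generated by the browser from one the user typed literally, so this is a best-effort repair.

```python
import re

# Matches decimal numeric character references such as "&#1055;".
_NUMERIC_REF = re.compile(r"&#(\d+);")


def decode_numeric_refs(value):
    """Replace decimal numeric character references with their characters.

    Named entities such as "&amp;" are deliberately left alone; only the
    numeric form described in the scenario above is converted.
    """
    def _replace(match):
        code_point = int(match.group(1))
        try:
            return chr(code_point)
        except (ValueError, OverflowError):
            # Not a valid Unicode code point: keep the original text.
            return match.group(0)

    return _NUMERIC_REF.sub(_replace, value)
```

For example, the input "&#1055;&#1088;&#1080;&#1074;&#1077;&#1090;" comes back as the Cyrillic word "Привет".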

Luckily, there is a really simple solution: always serve pages in UTF8 encoding and always expect form input to be in UTF8 encoding. One of the many great things about UTF8 encoding is that all characters are representable, so you never have to worry about the browser resorting to HTML character reference encoding.

Monday, July 12, 2010

Sharp pointy sticks

This past Saturday my wife and I bought recurve bows with the intent to make a hobby out of archery. It turns out there are a number of parks with free-to-the-public archery ranges in the Bay Area. A couple of the ranges we have found are:
We actually got started a couple of weeks ago when we went for a free lesson from the Kings Mountain Archery Club. That was a lot of fun and our instructors were very friendly, helpful, and patient. They give free two-hour introductory lessons about once a month; information about how to sign up is available on their web site.

We picked up our bows and arrows at a shop intimidatingly called Predator's Archery down in Gilroy. Much like our experience with Kings Mountain Archery, the staff at Predator's Archery were pretty friendly and helpful and they offer a great "starter package" which includes everything you need as well as 5 lessons.

Friday, July 2, 2010

In other news: Subversion still sucks

OK, I've had 4 months to get used to Subversion now. And it is growing on me. Or perhaps it is Stockholm Syndrome. But there are still a lot of annoyances.

After being bashed as a "troll" by one of Subversion's authors after daring to suggest it wasn't all ponies and rainbows, I thought I would check to see if others were sharing my pain transitioning from CVS to Subversion.

Not surprisingly, they were. I found a wonderful summary of all the frustrations I've been experiencing, thoughtfully compiled by no less than David O'Brien of the FreeBSD community.

I would add to his list, as my friend John pointed out in comments to my previous post, that it is really annoying to have to depend on external tools (ironically, CVS) to see commits across branches.

Thursday, July 1, 2010

Serving file downloads with non-ASCII filenames

Recently, while helping out one of my coworkers, it came to my attention that there is no universally agreed-upon way to download a file to a web browser while suggesting a filename that contains non-ASCII characters.

The common way to tell a browser to download a file (rather than try to display it in-browser) is to include a Content-Disposition header in the HTTP response; the header's value should be "attachment". Additionally, the server can include a filename parameter in the Content-Disposition header as a suggestion to the browser for what filename to save the file as.

As a bit of history, the Content-Disposition header was originally defined in RFC 1806 which was obsoleted and replaced by RFC 2183. However, the Content-Disposition header was originally defined for use in MIME messages and, while RFC 2616 (HTTP 1.1) makes reference to the Content-Disposition header, it does so only to note that:
Content-Disposition is not part of the HTTP standard, but since it is widely implemented, we are documenting its use and risks for implementors.

Luckily, while not officially standardized for use in HTTP, the Content-Disposition header is "widely implemented" indeed; it seems that all modern browsers implement the header. If the web server responds to a request with application/octet-stream data and a Content-Disposition header of "attachment", your browser will display the familiar "Save As..." dialog. If the server included a filename parameter in that Content-Disposition header, your browser will likely pre-fill the filename input field of the "Save As..." dialog with the specified filename.

But here is where things start getting murky.

RFC 2183 skirts the issue of international filenames by disclaiming responsibility:
Current [RFC 2045] grammar restricts parameter values (and hence Content-Disposition filenames) to US-ASCII. We recognize the great desirability of allowing arbitrary character sets in filenames, but it is beyond the scope of this document to define the necessary mechanisms.

So, as long as the downloaded files' names are always representable in the ASCII character set, any browser should properly display the filename (although I've seen rumors that some browsers, such as IE, do enforce a limit on the length of the filename). However, I work at a Japanese company, making products largely for the Japanese market, so we don't have the privilege of assuming the whole world is ASCII.

By the way, in case you are curious, even ISO-8859-1 (latin1) isn't consistently supported across browsers, so Europeans are left high-and-dry too.

You are probably thinking, like I was, that surely this is a solved problem. And actually, it is. Kind of. The Content-Disposition header originates with the MIME protocol which, since the publication of RFC 2231 in 1997, now supports non-ASCII character encodings for header values. So, for example, the filename "foo-ä.html" can be represented in the Content-Disposition header like so:
Content-Disposition: attachment; filename*=UTF-8''foo-%c3%a4.html


The problem is that few browsers actually implement this RFC 2231 syntax. Firefox 3.6 and Opera 10 appear to support it. For Internet Explorer, on the other hand, Microsoft's developers chose to simply perform URL-style percent-decoding and then interpret the result as bytes of UTF8-encoded characters. So a server would need to send the Content-Disposition header as
Content-Disposition: attachment; filename="foo-%c3%a4.html"
for an MSIE user to see "foo-ä.html" in the "Save As..." dialog.

Despite requests from IETF working group members to fix it, Google's Chrome browser also does not comply with RFC 2231, preferring to follow Microsoft's lead and use simple URL-style percent-decoding.

As a result, there is no consistent cross-browser way to suggest a non-ASCII filename for a file download. I'm sure it doesn't help that the Content-Disposition header has never formally been part of the HTTP specification, but yet it is used by all major browsers to implement file download functionality.

Julian Reschke has compiled a test suite and publishes a nifty page illustrating all of the incompatibilities between browsers regarding handling of the Content-Disposition header. In addition, as part of the IETF Network Working Group, he is working on an RFC to formally define the interpretation of the Content-Disposition header in the HTTP context.

Unfortunately, because the ambiguity has been left unresolved for so long, some web servers have adopted the MSIE/Chrome encoding technique for their non-ASCII filenames. Actually, my gut feeling is that probably most have, although I don't have any hard numbers to back up that claim. The good news is that since the MSIE/Chrome encoding is only used for parameters in the form filename="..." while the RFC 2231-style encoding used by Firefox, Opera, and Julian's proposal uses filename*=... it is possible for the two to coexist in the same Content-Disposition header (note the presence of the * in the RFC 2231 format to differentiate it).

In fact, probably the most important section of Julian's proposal is section 4.2, where he defines the HTTP client's behavior when the server responds with both filename=... and filename*=..., allowing for an easy upgrade path for MSIE and Chrome.

For now, however, Julian's test results show that when presented both traditional and extended formats, only Firefox and Opera will select the extended filename*=... format.

This opens an opportunity for those of us that need to serve file downloads containing non-ASCII filenames: we can include the filename in the non-standard encoding supported by MSIE and Chrome first in the Content-Disposition header, followed by the filename in extended RFC 2231 encoding. According to Julian's tests, MSIE and Chrome will always take the first parameter while Firefox and Opera will properly select the extended-syntax parameter, no matter what order it appears in.

For example, if the server includes the header:
Content-Disposition: attachment; filename="foo-%c3%a4.html"; filename*=UTF-8''foo-%c3%a4.html
all four major browsers should properly display the filename "foo-ä.html" in the "Save As..." dialog. Unfortunately, WebKit-based browsers, like Apple's Safari browser, would display the raw percent-encoded value "foo-%c3%a4.html" as the filename. At least for now, though, I'm afraid this is the best we can do.
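For what it is worth, here is a small Python sketch of how a server might build such a header value. The function name content_disposition is made up for this example; it relies only on urllib.parse.quote from the standard library, which emits uppercase hex digits (%C3%A4 rather than %c3%a4), and percent-encoded octets are case-insensitive.

```python
from urllib.parse import quote


def content_disposition(filename):
    """Build a Content-Disposition value carrying both filename forms.

    The plain filename="..." parameter comes first for browsers that
    simply percent-decode it; the RFC 2231-style filename*=UTF-8''...
    parameter follows for browsers that understand the extended syntax.
    """
    # safe="" percent-encodes everything except letters, digits and "_.-~",
    # using the UTF-8 bytes of each character.
    encoded = quote(filename, safe="")
    return "attachment; filename=\"{0}\"; filename*=UTF-8''{0}".format(encoded)


print("Content-Disposition: " + content_disposition("foo-\u00e4.html"))
```

Both parameters carry the same percent-encoded bytes; only the syntax around them differs, and the semicolon between them is required by the header's parameter grammar.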

Monday, June 28, 2010

Sharon Jones and the Dap Kings

This past Friday we had the chance to see Sharon Jones and the Dap Kings at the Warfield in San Francisco. The music was fun and Sharon was an amazing performer and entertainer. In addition to her own high-energy dancing around the stage, she invited a few of the audience onto the stage to dance with her. This was sort-of a crap-shoot as at least one person's dance skills ranked somewhere between Elaine from Seinfeld and epileptic seizure victim. However, one guy who referred to himself as merely "The Gator" got up on stage in full 70's era leisure suit and that boy could dance. Sharon even turned the spotlight over to him and let him go to town.

I'll be damned if someone didn't manage to get a video of it and has already uploaded it to Youtube:



The opening act was The Heavy, who are perhaps best known for their song "How Do You Like Me Now?" that Kia uses in car ads featuring bad-ass stuffed animals. Except for the horns, I wouldn't have thought that they would have been a good match for Sharon Jones and the Dap Kings. But they were another high-energy act that really got the crowd riled up for the main act.

Overall it was a great show. We had some punk kids causing trouble in front of us, but eventually security took care of them and we were able to enjoy the rest of the show.

Tuesday, June 22, 2010

Experiences with Clipper

While I was away in Japan, Caltrain and the other Bay-Area transit agencies teamed up to introduce the Translink Clipper payment system. I was excited at the prospect of an easy-to-use IC-card payment system like the Suica system I enjoyed using in Tokyo, and I was tired of having to buy new 8-ride tickets every few days, so I got a Clipper card and tried it out.

The card is easy enough to use, but the experience refilling my balance on the card has been less than stellar. I'm not about to tie my Clipper card to my credit card account, effectively turning it into a backdoor for a thief to clean me out (the "autoload" feature they push on their website). I was hoping to be able to refill it manually using an automated machine, just like I always did with my Suica card.

It turns out that there are a whopping 7 "Add Value" machines in the entire Bay Area: two in San Rafael, one in Sausalito, one in Oakland, two in downtown San Francisco, and one on the freaking Golden Gate Bridge (I guess for the toll-booth workers?). Fortunately, for those of us whose commute doesn't include one of those 7 locations, you can also add to your Clipper card balance at most Walgreens locations.

And here is where my experience with Clipper takes a nose-dive. First, not all Walgreens locations are equipped to add money to your Clipper card balance. There is, however, a map of the locations on the Clipper website. Unfortunately for me, the Walgreens one block from my office in San Mateo is not on the list and, when I asked in person, I received the same look one would expect had I asked them if they sold live Jabberwockies.

So, each time I need to refill the balance on my card, I hop in the car and drive two miles to the Walgreens nearest my home. Two miles isn't particularly far, but it does feel a little odd to have to drive somewhere so I can pay for public transportation. Anyway, I only have to do it once every couple of weeks, so it isn't a big deal; just a minor annoyance.

What is more annoying about the experience is that apparently only managers have the specialized training necessary to work the Clipper card "add value" device they keep behind the counter. So the poor checkout clerk has to page for the manager who, after finishing his dooby, moseys up to the front counter and fumbles with the machine until (hopefully) the card actually has the balance on it that I paid for. This past time, my wife and I had to repeatedly tell other customers that they might want to get in the other line because we were going to be a while. Ten minutes, in fact.

The good news is that, with the name change from Translink to Clipper, it looks like they have added an option to refill your card balance online. I'm planning on trying that the next time I need to update my balance. I might miss the biweekly visit to the bloodshot-eyed Walgreens manager, though.

Monday, May 3, 2010

Crappy Code Hopscotch


I'm officially coining the term "crappy code hopscotch" to refer to the stupid games you have to play to work around crappy code. I guess it could equally well refer to that feeling of being surprised by the effects of crappy code in an otherwise simple task, which might not be altogether unlike the feeling of unpleasant surprise you would get if someone were to throw a pile of dog poo in your hopscotch square.

The term popped into my head today while doing some MySQL wrangling; I was testing a stored function that called LOWER() on the results of CONCAT_WS(). Sounds simple enough: lower-case the result of concatenating strings with a separator. Check this output from MySQL 5.1:

mysql> SELECT LOWER(CONCAT_WS(' ', 'MySQL', 'scores', 'a', 0));
+--------------------------------------------------+
| LOWER(CONCAT_WS(' ', 'MySQL', 'scores', 'a', 0)) |
+--------------------------------------------------+
| MySQL scores a 0                                 |
+--------------------------------------------------+

Silent lower-case fail.

The problem, it seems, is that CONCAT_WS() doesn't convert the numeric argument to a string, but rather decides to convert *all* of the parameters to BINARY types and, as a result, returns a BINARY value. To its credit, at least LOWER() is documented as being a no-op on BINARY values, hence the useless output shown above. What amazed me is not just the undocumented, unintuitive behavior of CONCAT_WS() but that MySQL did not emit a single warning when LOWER() returned a value without, you know, lower-casing it.

So out of nowhere, I find myself playing crappy code hopscotch. I can either explicitly cast the numeric argument to CONCAT_WS() to a string or else let CONCAT_WS() return a BINARY value and explicitly convert that back to a string before passing it to LOWER().

Two crappy boxes to pick from and I got to put my foot in one of them.

Thursday, April 22, 2010

Invent with Python

Al Sweigart gave a presentation of his book, Invent Your Own Computer Games with Python, to Baypiggies tonight. His book is aimed at kids that are interested in learning how to program. He said he didn't have a particular age range in mind, but I would say from experience it would probably be fine for anyone age 8 to 15 with an interest in computers.

I was impressed with the overall tone and layout of his book. And his choice of teaching programming via writing simple computer games is right on the money. He mentioned that what got him hooked on programming was tinkering around creating simple games when he was a kid; I'd venture that was what got a great many of the best engineers I've met started. In addition, he chose to teach programming concepts via Python, which I also agree is a great language for learning because it is expressive, easy to understand, and yet powerful enough to build professional applications.

In his presentation, Al accurately pointed out that there is a trend to try and simplify programming for kids until it resembles building with Duplo blocks, and that this is neither helpful nor interesting for kids. I concur enthusiastically. Projects like Scratch are neat, but seem patronizing to me. I learned on BASIC and Pascal and I don't doubt kids today are just as capable. That said, BASIC and Pascal are dated now; Python is just as easy to learn yet more powerful and modern so it is a great choice.

Another difference between Al's book and many intro books is that his programs are short, fun, and mostly text. Of course, every kid dreams of writing graphical games like the video games they play but that is, frankly, not realistic. Again, Al doesn't lie to his audience; he presents fun text-based games that kids can tinker with. He starts with a simple guessing game, then hangman, tic-tac-toe, and othello. Towards the end of the book, he does introduce pygame and shows how to use it to make simple graphical games, but the vast majority of the book focuses on teaching fundamental concepts via text-based games.

I actually taught introductory computer programming to high school students, age 14 and 15, a number of years ago. Back then we used C, but I was surprised to find that Al presents software concepts in the same order I did and even uses the same games to drive those concepts home. If I could teach the same class today in Python, Al Sweigart's Invent Your Own Computer Games with Python is the book I'd want to use.

Overall I was quite impressed with the great job Al did with the book and appreciate him taking the time to talk about it, and the process of writing it, to us at Baypiggies tonight.

P.S. I should mention that he has published the book under the Creative Commons license so it is free to read; you can even download the latest edition off his website. Amazon sells it in dead-tree format too, but you should hold off because the second edition will be going to print soon.

Monday, April 19, 2010

Good Day

All of our sea mail from Japan arrived and was just delivered. In addition, my new MacBook Pro I ordered last week (right after the revision bump) arrived today too. Looks like I'm going to be busy this week. I just hope I can find time to make it to BayPiggies this Thursday.

Friday, April 16, 2010

Subversion sucks

While I was away in Tokyo, NTTMCL switched from CVS to Subversion for their version control system. Perhaps it is just that I'm too accustomed to CVS's eccentricities, but so far I have to say that Subversion sucks. While I'm sitting here waiting for the checkout of one of our heavily-branched repositories to complete (45 minutes and counting!), I took the opportunity to read a little about how much more wonderful Subversion is than CVS.

So far, the best I've come up with is that Subversion is newer, therefore it is better. Yay. With Subversion, I just get the delight of knowing I'm playing with a fresh(er) turd.

Sure, CVS sucks too. What bothers me about Subversion is that it sucks at least as much as CVS without giving anything in return. At least with CVS, I can tag a release or create a branch without having to make a whole other copy of the repository (on each developer's machine, no less!). At least with CVS, I can diff and merge files between branches or tags without developing a Repetitive Strain Injury. At least with CVS, the repo files are text so I can recover when it screws up. At least with CVS, I don't have to run a friggin' web server just to do revision control.

So, yeah, maybe I'm just an old fogey. Or maybe Subversion sucks so much, it actually makes me long for CVS. Wouldn't that be sad.

Thursday, April 15, 2010

Help me, I'm in RPM Hell

Dear Internet,

Surely there must be a better way to install packages on Linux. For better or worse I've been locked in the FreeBSD ivory tower for the better part of 10 years. But now I am assigned to a project that is using Linux and I find myself yearning for BSD-style package management. Specifically, I am looking for:

  • A way to find a package for my OS (in this case, Red Hat Enterprise Linux 5, 64-bit).
  • Download that package and the packages for all of its dependencies.
  • Install a package and all of its dependencies onto a host that is not connected to the Internet.
  • Is command-line based.

In FreeBSD, you can use the FreeBSD packages database to locate the desired package and download it to your system in a format that you can install offline. Getting all of the dependencies is a little trickier, but I wrote a simple Perl script some years ago that does that using the FreeBSD ports collection.

Obviously Linux has the rpm format for its packages, but I'm finding that searching for rpms and identifying all of their dependencies is just as manual a process as it was 10 years ago. For example, the yum command can install a package and its dependencies, but does not seem to support downloading the rpms for those packages to reproduce the process on an offline machine. Surely there must be a way.

If anyone could point me to a tool for Linux that satisfies the 4 goals above, I would be grateful.

P.S. In a case of "it's a small world after all", I went to elementary school through high school with the author of yum, Seth Vidal.

Monday, March 29, 2010

Soil and "Pimp" Sessions

Soil and "Pimp" Sessions is a Japanese jazz band that I've taken quite a liking to. I first learned about them when they released a single with Shiina Ringo called "カリソメ乙女" (Karisome Otome) back in 2006. They have had a number of videos on YouTube which I enjoyed, but I just couldn't get my hands on one of their albums. It doesn't look like they have a distributor in the U.S. and none of the record stores I visited in Japan stocked them either.





So you can imagine my surprise and delight when iTunes started carrying their albums around November of last year. The U.S. iTunes store even. I've since stocked up, buying 20 or so songs of theirs, and have yet to be disappointed. As best I can tell, the U.S. iTunes store is selling all of their albums.

To top it all off, Shiina Ringo's latest album, 三文ゴシップ (sanmon gossip) not only includes the 2006 Karisome Otome single, but includes another awesome collaboration with Soil and Pimp Sessions: マヤカシ優男 (Mayakashi Yasaotoko). There isn't a video for it, but you can preview the song on YouTube. Don't worry, the lyrics for this song are all in English.

Even though they are a Japanese band, since it is jazz, most of the songs have no lyrics. That said, what few words there are do tend to be English, which makes them a very approachable band for international audiences. In fact, right now they are in the middle of a European tour. They have a live show schedule on their web site; I know I'll be watching to see when they come to San Francisco.

Back in California

Well, we're back in California and getting settled in. I had a month off to get moved out of our place in Tokyo and moved into our new place in Mountain View. It has been a hectic month but the last few days have been relaxing. That said, I'm ready to get back to work.

Anyway, I've got a whole backlog of things I've been meaning to blog about so hopefully I'll be a little more frequent with the updates for a while.

Thursday, March 4, 2010

No place like home

Well, I've wrapped up my time here in Tokyo and am preparing to move back to California. Barring a few small obstacles -- like just noticing that our passports had expired and frantically visiting the U.S. Embassy to get new ones -- we should be back in the U.S. by this time next week.

My coworkers gave me a nice farewell party which was quite fun. I'm not sure why, but I got pretty choked up on my last day at the office. I'm sure part of it was the relief of finally going home, but a lot of it was that I really liked my coworkers. They were the best part of working in Japan.

This past week I've had more time and finally gotten a chance to see more of the neighborhood where we have been living. It is actually a pretty nice area. It is a shame that it took over 2 years before I could actually enjoy it. Nonetheless I'm really looking forward to going home.

Tuesday, February 9, 2010

Yattsuke Shigoto

My friend Matt introduced me to a Japanese singer/songwriter named Shiina Ringo a few years back. I've taken quite a liking to her music, but one of her songs has really grown on me since I've been in Tokyo. It is a song called 「やっつけ仕事」(Yattsuke Shigoto), which means a job done half-assed. For me, the lyrics reflect the general feeling of apathy and soullessness that Tokyo seems to emanate. I guess it may be different for natives, but this song speaks to me. It speaks to me every time I get on the train and stare blankly off into the distance or pretend to sleep like everyone around me.

Feeling down today, I thought I would check to see if there were any English translations of the lyrics. All I could find was one terrible almost-literal translation that killed the voice of the song. So I whipped up a translation of my own (below).

I'm pretty confident that this translation captures the spirit of the song, even if I did stray from the literal translation by a wide margin on a couple of lines. The only line I'm not happy with is the one about the high-speed traffic jam. In Japan, if there is a traffic jam on a highway (which is literally called a "high-speed road"), they just say high-speed (road) traffic-jam, omitting the word "road". Hence producing the contradictory, "high-speed" traffic jam.

The only other part I'm not satisfied with is the pronouns. Japanese routinely omits pronouns and it isn't clear who is saying what in the song. I just assumed that everything was being said from a first-person perspective, since that is normally the case in most songs.

In case you are interested, the original Japanese lyrics are here. There are a couple versions of the song on iTunes also.


Yattsuke Shigoto - Shiina Ringo

Every day I'm assaulted by the ring of phones
I just want some peace and quiet

You call it a "high-speed" traffic jam, but isn't it slow?
I'm indifferent to reasoning contrary to reality

I can't think of anything good
But I'm not indignant about anything either
What day was it today?
I guess it doesn't much matter
Ah, all I want is something memorable

This consistency wears down my individuality
Maybe I'll just get an arranged marriage

Please control me
I hate boredom
When's the last Ginza-line train?
I guess it's not a big deal
Ah, I wish I could be a machine

Hey, what was "love" again?
I can't remember
I can't remember

I can't think of anything good
But I'm not indignant about anything either
What day was it today?
I guess it doesn't much matter
Ah, all I want is something memorable

Please control me
I hate boredom
When's the last Ginza-line train?
I guess it's not a big deal

I can't think of anything good
I can't think of anything good
I can't think of anything good

Hey, what was "love" again?

Sunday, February 7, 2010

When Standards Collide

I can't help but shake my head in disappointment every time I run across some specification that is blatantly in violation of the standards it is built on. When I encounter a specification published by an ostensibly-reputable professional organization that contradicts my understanding of the underlying protocols, at first I'm confused. Is my understanding faulty? My memory bad (hint: it is)? Is there a new RFC that updated the protocol when I wasn't looking?

As a recent example, I finally got my hands on the WRIX Interconnect 1.03 specification from the Wireless Broadband Alliance. In it, they specify that partner ISPs must support "passwords up to 253 characters which contain a mixture of alphabetic, numeric, and special characters"; this is written in the comments for the User-Password attribute in a RADIUS Access-Request packet.

There are two problems with this text that I'm amazed industry professionals did not correct (assuming they noticed):

  1. RFC 2865 section 5.2 clearly states that the maximum length of the User-Password attribute is 130 octets, including the 2-byte attribute header. In other words, the encrypted password text cannot be longer than 128 octets. If this were the only issue, I'd be inclined to believe that the 128 octet limit has been relaxed in a later RFC.

  2. Which leads to the second issue: RFC 2865 says the *encrypted* password text cannot be longer than 128 octets. Section 5.2 also lays out the encryption algorithm, which is a 16-byte block cipher. Being a block cipher, the password plaintext is padded out to a multiple of 16 bytes before the encryption is applied. Which means that a 7 octet password will encrypt to 16 octets of encrypted text. The 253 character password specified by WRIX would encrypt to 256 octets of encrypted text. RADIUS allocates 8 bits per attribute to represent the length of that attribute in octets, including the attribute's 2-byte header. So, if you have 256 octets of encrypted text, the total length of the attribute would be 258 octets.



So, in order to comply with the WRIX Interconnect 1.03 specification, it would appear you are required to violate both RFC 2865 and physics (to represent the value 258 with just 8 bits).

What baffles me is that the difficulty of implementing the specification as written should have become obvious in compatibility testing. Surely someone has written unit tests based on the spec to test their implementation. At least once. Right? Right?

Anyway, I imagine the intent of the specification's authors was only to violate RFC 2865 and leave the violation of physics to the hardware guys. It looks like they wanted partner ISPs to allow the longest password representable by the User-Password attribute, ignoring the stated 128-octet limit. Since the encryption algorithm always produces encrypted text that is a multiple of 16 bytes in length, and the longest encrypted text that can be stored in the User-Password attribute is 255 - 2 = 253 octets, the question becomes: what is the largest multiple of 16 less than or equal to 253?

240. 240 octets.

Any plaintext 241 octets long or longer will encrypt to at least 256 bytes of encrypted data, which is not representable by a RADIUS User-Password attribute. So, a more reasonable specification would require implementers to support "passwords up to 240 characters which contain a mixture of alphabetic, numeric, and special characters."
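The arithmetic is easy to check by brute force. This is only a sketch: the constant and function names are mine, it assumes one octet per password character, and it assumes an empty password still pads out to one full block.

```python
BLOCK = 16          # passwords are padded to a multiple of 16 octets
HEADER = 2          # attribute Type octet plus Length octet
MAX_ATTR_LEN = 255  # largest value a one-octet Length field can hold


def attribute_length(password_len: int) -> int:
    """Total User-Password attribute length for a given plaintext length."""
    padded = max(BLOCK, -(-password_len // BLOCK) * BLOCK)  # round up to 16s
    return HEADER + padded


# Longest plaintext whose attribute still fits in the Length field.
longest = max(n for n in range(300) if attribute_length(n) <= MAX_ATTR_LEN)

print(longest)                # 240
print(attribute_length(240))  # 242
print(attribute_length(241))  # 258
```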

Maybe in revision 1.04.

Thursday, January 21, 2010

The Value of a Comfortable Office

As I sat at my desk sweating in the dead of winter, I got to thinking about how people's comfortable working temperature must be cultural. Offices are hot in Japan. In the summer, their CoolBiz campaign has businesses setting the thermostats to 28 degrees Celsius (82.4 degrees Fahrenheit). I just checked: it is winter, and the thermostat in my office is reading 30 degrees Celsius (86 degrees Fahrenheit). In contrast, offices in the U.S. have traditionally been regulated to 72 degrees Fahrenheit.

I know my productivity suffers when I'm uncomfortable. It suffers doubly when I can only type with one hand because I'm fanning myself with the other. But is this just because of differences in cultural sensitivity to heat?

Out of curiosity, I did a quick search and found this interesting opinion piece written by Professor Shin-ichi Tanabe of Waseda University. Some of the interesting points in his opinion piece are:

  • A guidebook recently published by the Federation of European Heating, Ventilating and Air Conditioning Associations (REHVA) reports that 21.8°C is the optimal room temperature to foster intellectual productivity.

    21.8°C is a little over 71 degrees Fahrenheit...almost exactly what offices in the U.S. set their temperatures to. Perhaps we have a hint why U.S. and European workers are the most productive in the world?

  • ...28°C seems a little too high for a room temperature setting in summer. The most comfortable temperature when sleeping naked is 29°C. People burn more calories in the workplace than at home where they are more relaxed, and however casually they may dress, they are still not naked in the workplace.

    While there are studies showing some variance in comfortable working temperatures depending on culture and gender, it would seem that 28°C can't be comfortable for anyone.

  • Raising the cooling temperature of a standard building in Tokyo from 25°C to 28°C could increase energy efficiency by 15%, which is equivalent to saving ¥72 per square meter of office space during the COOLBIZ campaign. On the other hand, the resulting decrease in working efficiency could cause a loss of 13,000 yen per square meter of office space.

    What kind of company loses 13,000 yen to save 72 yen? A Japanese company, apparently.

1.1.1.1

A number of captive portal implementations, including products from Cisco and Nomadix, use 1.1.1.1 as a virtual IP address; HTTP requests to it are redirected to the access control server's logout page. A quick Google search turns up numerous network service providers, mostly wireless ISPs, that use 1.1.1.1 to access their logout pages.

This trick has worked because the 1.1.1.1 IP address resided in an IP block that was reserved by the IANA, so there could be no server that actually used that IP address.

However, this month the IANA assigned the 1.0.0.0/8 IP block to the Asia-Pacific NIC. As its name implies, APNIC is responsible for the allocation of IP addresses in Asia and the Pacific, meaning that there may come a day when a company in China, Australia, or elsewhere is allocated a subnet containing the 1.1.1.1 IP address.

In short, the 1.1.1.1 IP address no longer resides in reserved IP space. Network access servers should stop using it.
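If you want to check whether your own portal's virtual IP falls inside the newly assigned block, Python's standard `ipaddress` module will do it. This is just a sketch; the variable and function names are mine, and it tests nothing more than block membership.

```python
import ipaddress

VIRTUAL_IP = ipaddress.ip_address("1.1.1.1")       # the captive portal address
ASSIGNED_BLOCK = ipaddress.ip_network("1.0.0.0/8")  # block handed to APNIC


def in_block(addr, block) -> bool:
    """True if addr lies inside block (plain membership test)."""
    return addr in block


print(in_block(VIRTUAL_IP, ASSIGNED_BLOCK))  # True
```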

Thursday, January 7, 2010

The Public Option

Every time I have to take off my shoes at the airport, I'm grateful that Richard Reid stuck explosives in his shoes rather than his ass. Waiting in line while people take off their loafers and flip-flops is silly, but waiting in line while people get full cavity searches might make me think twice about flying.

Anyway, as everyone has surely heard by now, an idiot on a flight to Detroit lit his nuts on fire for Christmas. So now the TSA will be installing new full-body scanners to see if we have anything stuffed in our trousers. I haven't heard the details, but I suppose you are privy to a more intimate inspection if something in your pants draws attention.

And then today I received an e-mail from my friend Matt alerting me that, sure enough, suicide bombers are now stuffing explosives in their rectums. Unless these new airport scanners can see clear into my bowels and distinguish between a Taco Bell lunch and an IED, I wonder how far off we are from getting free colon exams whenever we fly.

Actually, I think I may have stumbled upon health care reform that everyone can get behind. The TSA can be our delivery method for socialized health care. If the TSA is going to be probing every orifice looking for explosives anyway, with a modicum of medical training, they could alert us to any potential medical conditions we might not have been aware of while they are in there. Everybody wins: planes are safer, people are healthier, and we save money on colon exams, prostate checks, and gynecologist visits.

It might cost a little money to implement, but who would object? Only terrorists could be against making our skies safer.