Competition inside corporations?
Having observed Hewlett-Packard from the inside for almost 18 months now,
I'm struck by a paradox: our economy is a chaotic marketplace of capitalist
competition, practiced and championed by corporations, but internally, companies
are run as top-down, centrally-planned dictatorships. Why is that? Why isn't a
company simply a microcosm of the larger economy?
Take the case of IT services: inside HP, there is a large IT organization,
and they provide services to the rest of the company. When my group joined HP,
we had no choice about how to get, for example, email service. The IT group
provided email, and we used it. When we need to buy a laptop, there is one
group that provides that service. When we need servers hosted, we have only
one place to turn.
I'm sure the reason for this is the efficiency gained by eliminating redundancy.
If there were two groups providing email services, surely one group could do the
job of both, with less total staff, equipment, and so on.
That's certainly true, but then why don't we apply the same logic to the larger
economy? After all, HP's email group has a huge overlap with Dell's, IBM's, Sun's,
Microsoft's, and so on. Couldn't our economy gain by eliminating the overlap?
When these questions are considered at the national level, we tout the increased
efficiency produced by competition. The economy as a whole gains from the pressure
competition puts on each company. Without competition, there is no incentive to
improve, no reason to do your best. In a centrally-planned nationalized economy,
incompetence is not punished, incentives are mis-aligned, and apathy takes over.
There's no reason to improve because your customers have nowhere else to turn,
poor service will not lead to loss of business, there's no price pressure, and
your existence is guaranteed by the state.
That's logic that every capitalist believes, and we laugh at economies that
have tried central planning and failed. So why doesn't the same logic hold inside
companies? Why are monopolies and lack of competition not just accepted, but
enforced? Don't we believe the same forces will be at work? Is there any compelling
reason to improve if you have no competition?
Why couldn't a company have three IT groups (call them Red, Green, and Blue)?
Each is separate, and lives or dies based on its ability to attract business
from the rest of the company. When my group needs servers hosted, we shop
around. Maybe Red is the deluxe service, and Blue is economy, and we've heard from
friends that Green has the best service. For whatever reason, we choose one of
them, and spend our internal dollars with them. The groups will compete, and
that competition will force them to optimize and find the best solutions for their
customers. If they don't, they will go out of business.
I know it seems wasteful to have all that going on inside a company. There will
be duplication. But remember the capitalist logic: without competition, there's
no reason to do your best. Just as with the larger economy, the duplication will
be worth it because of the increased efficiency forced by competition. And without
competition, your only option will be a poor one.
Of course, not all work inside corporations could be run this way. For
example, legal departments deal with the outside world, and the corporation must
speak with one voice there. But couldn't competition be used in at least some
parts of large companies?
Where's the flaw in this logic? Why isn't competition inside corporations a
good idea?
Honda Civic hybrid.
I've just bought a new car: a Honda Civic hybrid.
I don't buy cars that often. The car I just replaced was a 1994 Civic. To keep
the same pace, I'll add an entry to my calendar for 2022 to buy my next car.
I like the Civic for its gas mileage, 45 mpg highway. The extra expense over
a non-hybrid Civic is actually more than I'll save on gas over the life of the
car, but I like being the change I want to see in the world.
One thing that surprised me about this car is how familiar it felt after having
driven a 1994 Civic. Lots of extra bells and whistles that I'd gotten used to
in my wife's larger cars are still absent in this car.
Features in the hybrid I didn't have in my 1994 Civic (other than the hybrid engine):
- A temperature setting in the climate control
- Front seat map lights
- A chime to alert me that I've left my headlights on
- An auxiliary jack for the stereo
- Electronic dashboard with thermometer, etc
Things that work in the hybrid that used to work in the 1994 Civic, but no longer do:
- Remote entry buttons
- Reliable low-speed wipers
- Rear left passenger door handle
- Exhaust system. The last thing that failed on the 94 was the exhaust. For
its last two days, it sounded like a four-door Harley.
Fancy features the Hybrid doesn't have that my wife's car does:
- Motorized seat adjustments with memory
- Heated seats
- Lighted mirrors in visors
- Fold-in side mirrors
- Leather seats
- Separate temperature settings for driver and passenger
- Individual lights for rear passengers
I'm pleased to have a new car that just works, and especially one that does
so well on gas.
Evil apple.
I really don't know what Apple is thinking. First they release a really
cool phone, good. Then they release an SDK for it, also good. But developers
aren't allowed to talk to each other about developing for the phone. That's
bad: doesn't Apple realize how developers learn? Then Apple sets up a store
and keeps control over what apps can be sold there. Partly good (no malware can
pollute the ecosystem), but partly bad (no one knows how Apple will decide what
can be sold).
Then Apple started to reject apps from the app store, which is bad, because
app developers only find out they've been rejected after they've expended all the
effort to build the app, and it can be hard to predict whether an app will be
rejected or not, making it risky to build iPhone apps.
After this breathtaking descent into cluelessness, Apple has topped itself by
deciding that app rejections are subject to the non-disclosure agreement, making
it illegal for developers to talk about the fact that their app has been rejected!
Is Apple actively trying
to discourage app development? Is there any other company that could act this
way without raising the ire of the development community? This is the company
that used Gandhi in an ad?
What exactly is Apple thinking?
Cisco minus t.
One of those simple typos that turns into an embarrassing public mistake:
Cisco home page FAIL,
where (it is theorized) a regex that should have had \t had only t, and as a
result, all lowercase t's were removed from the page, breaking it completely.
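Here's a small sketch of how that one missing backslash would play out. The page text is made up for illustration, and nobody outside Cisco knows what their code actually looked like:

```python
import re

# A made-up fragment of a page, with a stray tab that needs cleaning up.
page = "<title>\tCisco Systems</title>"

# What was presumably intended: strip tab characters.
intended = re.sub(r"\t", "", page)

# What the typo does: strip every lowercase letter t.
actual = re.sub(r"t", "", page)

print(intended)
print(actual)
```

The second substitution eats the t's in the tag names too, which is why the whole page would break rather than just look odd.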
A server memory leak.
We pushed new code to our production
servers last week. There were a lot of changes, including our upgrade to
Django 1.0. As soon
as the servers restarted, they immediately suffered, with Python processes bloated
to 2 GB or more of memory each. Yikes! We reverted to the old code, and began the process of finding the
leak.
These are details on what we (Dave,
Peter, and I, mostly them)
did to find and fix the problem.
We used Guppy, a very capable
Python memory diagnostic tool. It showed that the Python heap was much smaller than
the memory footprint of the server process, so the leak seemed to be in memory
managed by C extensions.
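To make that comparison concrete, here's a rough sketch of the same idea using only the standard library instead of Guppy. It assumes Python 3 on Linux or macOS, and memory_snapshot is a name I made up for illustration. A large gap between the two numbers hints that memory is being held outside the Python heap:

```python
import resource
import sys
import tracemalloc

def memory_snapshot():
    """Return (python_heap_bytes, peak_rss_bytes) for this process."""
    # Memory allocated through Python's allocators since tracemalloc.start().
    heap_bytes, _peak = tracemalloc.get_traced_memory()
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    rss_bytes = rss if sys.platform == "darwin" else rss * 1024
    return heap_bytes, rss_bytes

tracemalloc.start()
data = [bytearray(1024) for _ in range(1000)]  # about 1 MB of Python-level data
heap, rss = memory_snapshot()
print("Python heap: %d bytes, peak process footprint: %d bytes" % (heap, rss))
```

This only counts allocations made after tracemalloc.start(), and it reports the peak footprint rather than the current one, so treat it as a coarse first check rather than a replacement for a real heap profiler.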
We identified these C extensions:
We tried to keep these possibilities in mind as we worked through our next steps.
PIL and PDFlib in particular seemed likely given how heavily we use them, and because
they traffic in large data (high-res images).
We had some unit tests that showed fat memory behavior. We ran
valgrind on them hoping they would demonstrate
a leak that we could fix. Valgrind is a very heavy-weight tool, requiring
re-compiling
the Python interpreter to get good results, and even so, we were overwhelmed
with data and noise. The tests took long enough to run that other techniques
proved more productive.
Our staging server had been running the code for over a week, and showed no ill
effects. We tried to reason out the important difference between the staging server and the production
server. We figured the biggest difference was the traffic they each receive. We
tried to load up the staging server with traffic. An aggressive test downloading
many dynamic PDFs quickly ballooned the memory on the staging server, so we suspected PDFlib
as the culprit.
Closely reading the relevant code, we realized we had a memory leak if an exception occurred between these two calls:
p = PDF_new()
# ... work with p that could raise an exception ...
PDF_delete(p)
We felt pretty good about finding that, and fixed it up with a lot of unfortunate
try/finally clauses. We put the code on our staging server, and it behaved much
better. Lots of PDF downloads would still cause the memory to grow, but when the
requests were done, it would settle back down again. So we liked the theory that
this was the fix. The only flaw in the theory was it didn't provide a reason why
our old code was good and our new code was bad. We put the fixed code on
the production server: boom, the app server processes ballooned immediately.
Apparently as good as this exception fix was for our PDFlib code, it wasn't the real problem.
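For the record, the try/finally fix looks roughly like this. It's a sketch, not our actual code: PDF_new and PDF_delete here are stand-in stubs so the example runs without PDFlib, and render_pdf is a hypothetical name:

```python
freed = []  # handles released so far, recorded so the cleanup is visible

def PDF_new():
    """Stand-in stub for allocating a PDFlib context."""
    return object()

def PDF_delete(p):
    """Stand-in stub for freeing a PDFlib context."""
    freed.append(p)

def render_pdf(fail=False):
    p = PDF_new()
    try:
        if fail:
            raise ValueError("something went wrong while building the PDF")
        return "pdf bytes"
    finally:
        # Runs on success and on failure, so the context is always freed.
        PDF_delete(p)

render_pdf()
try:
    render_pdf(fail=True)
except ValueError:
    pass
print(len(freed))  # both contexts were freed, even the one that failed
```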
We tried chopping out functionality to isolate the problem. Certain subsets of URLs
were removed from the URL map to remove the traffic from the server. We ran the code
for short five-minute bursts to see the behavior under real traffic, and it was no better.
To be sure it wasn't still PDFlib somehow, we tried removing PDFlib by raising an
exception at the one place in our code where PDF contexts are allocated. Memory
still exploded. We tried removing PIL by writing a dummy Image.py that raises
exceptions unconditionally. It didn't help.
We tried logging requests and memory footprints, but correlations were elusive. We
tried changing the process architecture to use only one thread per process, no luck.
We tried reverting all the Django 1.0 changes, to move back to the Django version
we had been using before. This changed back the Django code, and the adaptations
we'd made to that code, but (in theory) left in place all of the feature work
and bug fixes we had done.
We pushed that to the servers, and everything performed beautifully: the
server processes used reasonable amounts of memory, and didn't grow and shrink.
So now we knew the leak was either in the Django 1.0 code, or in our botched
adaptation to it, or in some combination of the two. Many people are using
Django 1.0, so it seemed unlikely to be as simple as a Django leak, so we
focused on our Django-intensive code.
Now that we'd narrowed it down to the Django upgrade, how to find it? We went
back to the request logs, examining them more closely for any clues. We found
one innocuous-seeming URL that appeared near a number of the memory explosions.
We took one app server out of rotation, so that it wasn't serving any live requests.
Our nginx load balancer is configured so that a URL parameter can direct a request
to a particular app server. We used that to hit the isolated app server once with the
suspect request. Sure enough, the process ballooned to 1 GB, and stayed there.
Then we killed that process, and did it again. The Python process grew
to 1 GB again. Yay! We had a single URL that reproduced the problem!
Now we could review the code that handled that URL, and eyeball everything for
suspects. We found this:
@memoize()
def getRecentStories(num=5):
    """ Return num most recent stories. Only public stories are returned.
    """
    stories = (
        Story.objects.published(access=kAccess.public)
        .exclude(type=kStoryType.personal)
        .order_by('-published_date')
    )
    if num:
        stories = stories[:num]
    return stories
Our @memoize decorator here caches the result of the function, based on its
argument values. The result of the function is a QuerySet. Most of the code that
calls getRecentStories uses a specific num value, so it returns a QuerySet for a small
number of stories, and the caller simply uses that value (for example, in a template
context variable).
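Our real @memoize isn't shown here, but a minimal sketch of how a decorator like it might work follows. The dict is a stand-in for a real cache backend, and the details are assumptions for illustration only. The point to notice is that storing a result means pickling it:

```python
import functools
import pickle

def memoize():
    """Hypothetical stand-in for a caching decorator like @memoize."""
    def decorator(func):
        cache = {}  # stand-in for a real cache backend

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # The cache key is built from the function name and argument values.
            key = pickle.dumps((func.__name__, args, sorted(kwargs.items())))
            if key not in cache:
                # Storing the result means pickling it, as a real cache would.
                cache[key] = pickle.dumps(func(*args, **kwargs))
            return pickle.loads(cache[key])
        return wrapper
    return decorator

calls = []  # records each time the wrapped function really runs

@memoize()
def square(n):
    calls.append(n)
    return n * n

print(square(4), square(4), calls)  # the function body runs only once
```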
However, in this case, the getRecentStories function is called like this:
next_story = getRecentStories(0).filter(published_date__lt=the_date)[0]
The QuerySet is left unlimited until after it is filtered by published_date,
and then the first story is sliced off.
Now we're getting to the heart of one of our mysteries: why was the old
Django code good, and the new Django code bad? The Django ORM changed a great
deal in 1.0, and one of the changes was in what happened when you pickle a QuerySet.
To cache a QuerySet, you have to pickle it. Django's QuerySets are lazy: they
only actually query the database when they need to. For as long as possible, they simply
collect up the parameters that define the query. In Django 0.96, pickling a
QuerySet didn't force the query to execute, you simply got a pickled version
of the query parameters. In Django 1.0, pickling the query causes it to query the database,
and the results of the query are part of the pickle.
Looking at how the getRecentStories function is called, you see that it returns
a QuerySet for all the public stories in the database, which is then narrowed by
the caller first on the published_date, but more importantly, with the [0] slice.
In Django 0.96, the query wasn't executed against the database until the [0] had
been applied, meaning the SQL query had a "LIMIT 1" clause added. In Django 1.0,
the query is executed when cached, meaning we request a list of all public stories
from the database, then cache that result list. Then the caller further filters
the query, and executes it again to get just one result.
So in Django 0.96, this code resulted in one query to the database, with a LIMIT 1
clause included, but in Django 1.0, this code resulted in two queries. The first
was executed when the result was cached by the @memoize decorator, the second
when that result was further refined in the caller. The second query is the same
one the old code ran, but the first query is new, and it returns a lot
of results because it has no LIMIT clause at all.
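The two-query behavior can be imitated with a toy class. This is not Django's implementation: LazyQuery, first_row, and the SQL strings are all invented for illustration, and the only thing borrowed from the description above is that pickling forces the query to run:

```python
import pickle

executed = []  # SQL strings "sent to the database", in order

class LazyQuery(object):
    """Toy stand-in for a lazy QuerySet (not Django's code)."""
    def __init__(self, where=None, limit=None):
        self.where = where
        self.limit = limit
        self._rows = None  # filled in only when the query actually runs

    def filter(self, where):
        # Refining returns a new, not-yet-executed query.
        return LazyQuery(where=where, limit=self.limit)

    def sql(self):
        text = "SELECT * FROM story"
        if self.where:
            text += " WHERE " + self.where
        if self.limit:
            text += " LIMIT %d" % self.limit
        return text

    def _run(self):
        if self._rows is None:
            executed.append(self.sql())
            self._rows = ["row"]  # pretend result
        return self._rows

    def first_row(self):
        # Like the [0] slice: adds LIMIT 1, then runs the query.
        return LazyQuery(where=self.where, limit=1)._run()[0]

    def __getstate__(self):
        # The 1.0-style behavior described above: pickling runs the query.
        self._run()
        return self.__dict__

unlimited = LazyQuery()                          # like getRecentStories(0)
cached = pickle.loads(pickle.dumps(unlimited))   # what caching does to it
cached.filter("published_date < '2008-10-01'").first_row()
for statement in executed:
    print(statement)
```

The first statement printed is the unlimited query triggered by pickling, and the second is the narrow LIMIT 1 query the caller actually wanted.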
The fix to reduce the database query was to split getRecentStories
into two functions: one that caches its result, and is used when the result will not be
filtered further, and another uncached function to use when it will be filtered:
def getRecentStories(num=5):
    """ Return num most recent stories. Only public stories are returned.
        Use this function if you want to filter the results yourself.
        Otherwise use getCachedRecentStories.
    """
    stories = (
        Story.objects.published(access=kAccess.public)
        .exclude(type=kStoryType.personal)
        .order_by('-published_date')
    )
    if num:
        stories = stories[:num]
    return stories

@memoize()
def getCachedRecentStories(num=5):
    """ Return num most recent stories. Only public stories are returned.
        If you need to filter the results further, use getRecentStories.
    """
    # list() runs the (limited) query now, so only its results get cached.
    return list(getRecentStories(num=num))
One last point about the Django change: should we have known this from reading
the docs? Neither the
QuerySet refactoring notes
nor the 1.0 backwards incompatible changes
pages mention this change, or address the question of pickled QuerySets directly.
Interestingly, an older version of the docs
does describe this exact behavior. This change was explicitly made and discussed,
but seems to have been misplaced in the 1.0 doc refactoring. Of course, we may not
have realized we had this behavior even if we had read about the change.
So we've found a big difference in the queries made using the old code and the
new code. But why the leak? The theory is that
MySQLdb has a leak which has been fixed on its trunk.
Looking at the MySQLdb code, it's pretty clear that they've been developing for
a while since releasing version 1.2.2. Unfortunately, the MySQLdb trunk doesn't
work under Django yet, so we can't verify the theory that MySQLdb is the source
of the leak.
Ironically, MySQLdb was not on our list of C extensions to look at. If it had been,
we might have identified it as the culprit with a Google search. Since the MySQLdb
trunk doesn't work under Django, I guess we would have hacked MySQLdb or Django
to get them to work together. We would have run leak-free, but would be
unknowingly executing the giant database query.
The last mystery: why didn't the problem appear on our staging server? Because it was running
with a much smaller database than our production servers, so the "all public stories"
query wasn't a big deal. We learned a lesson there: sometimes a subtle difference
can make all the difference. We need to keep the staging server's database as
current as we can to make sure it's replicating the production environment as
much as possible. It's impossible to make them identical (for example, the staging
server doesn't get traffic from search bots), but at times like this, it's important
to understand what all the differences are, and minimize them where you can.
Switching python versions on windows.
I forget what software first set up these associations, but I have .py files
registered with Windows so that they can execute directly. The registry defines
.py as a Python.File, which has a shell open command of:
"C:\Python24\python.exe" "%1" %*
My PATHEXT environment variable includes .py, so the command prompt will
attempt to execute .py files, using the registry associations to find the
executable.
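If the "%1" and %* placeholders are unfamiliar: Windows substitutes the path of the file being opened for %1, and the rest of the command-line arguments for %*. A rough imitation, with a made-up helper name and none of Windows' actual quoting rules:

```python
def expand_open_command(template, script, args):
    """Roughly mimic how a shell open command is filled in (illustrative only)."""
    return template.replace("%1", script).replace("%*", " ".join(args))

template = '"C:\\Python24\\python.exe" "%1" %*'
command = expand_open_command(template, "hello.py", ["--verbose", "3"])
print(command)
```

So typing hello.py --verbose 3 at the prompt ends up running python.exe with the script as its first argument and the remaining arguments passed along.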
But: I wanted to switch from Python 2.4 to Python 2.5. That meant updating
the registry in a handful of places. A Python script to the rescue!
""" Change the .py file extension to point to a different
    Python installation.
"""
import _winreg as reg
import sys

# The Python installation directory to switch to, given on the command line.
pydir = sys.argv[1]

# Pairs of (key path under HKEY_CLASSES_ROOT, new default value).
# PYDIR is replaced with the chosen installation directory.
todo = [
    ('Applications\\python.exe\\shell\\open\\command',
        '"PYDIR\\python.exe" "%1" %*'),
    ('Applications\\pythonw.exe\\shell\\open\\command',
        '"PYDIR\\pythonw.exe" "%1" %*'),
    ('Python.CompiledFile\\DefaultIcon',
        'PYDIR\\pyc.ico'),
    ('Python.CompiledFile\\shell\\open\\command',
        '"PYDIR\\python.exe" "%1" %*'),
    ('Python.File\\DefaultIcon',
        'PYDIR\\py.ico'),
    ('Python.File\\shell\\open\\command',
        '"PYDIR\\python.exe" "%1" %*'),
    ('Python.NoConFile\\DefaultIcon',
        'PYDIR\\py.ico'),
    ('Python.NoConFile\\shell\\open\\command',
        '"PYDIR\\pythonw.exe" "%1" %*'),
    ]

classes_root = reg.OpenKey(reg.HKEY_CLASSES_ROOT, "")
for path, value in todo:
    key = reg.OpenKey(classes_root, path, 0, reg.KEY_SET_VALUE)
    reg.SetValue(key, '', reg.REG_SZ, value.replace('PYDIR', pydir))
Invoke this with your desired Python installation directory, and the registry
is updated to point to it.
Note that this doesn't affect what the command python means; that's determined
by your PATH environment variable. These registry settings change which Python
executable is found when you invoke a .py file as a command.
Python registry grepper.
In writing the python registry switcher,
I needed to search the registry for references to my old Python version.
Another good use for a Python script:
""" Search the Windows registry.
"""
import _winreg as reg
import itertools

# Printable names for the root keys we search.
RegRoots = {
    reg.HKEY_CLASSES_ROOT: 'HKEY_CLASSES_ROOT',
    reg.HKEY_CURRENT_USER: 'HKEY_CURRENT_USER',
    reg.HKEY_LOCAL_MACHINE: 'HKEY_LOCAL_MACHINE',
    reg.HKEY_USERS: 'HKEY_USERS',
    }

class RegKey:
    """ A handy wrapper around the raw stuff in the _winreg module.
    """
    def __init__(self, rawkey, root, path):
        self.key = rawkey
        self.root = root
        self.path = path

    def __str__(self):
        return "%s\\%s" % (RegRoots.get(self.root, hex(self.root)), self.path)

    def close(self):
        reg.CloseKey(self.key)

    def values(self):
        """ Enumerate the values in this key.
        """
        for ikey in itertools.count():
            try:
                yield reg.EnumValue(self.key, ikey)
            except EnvironmentError:
                # Raised when we run past the last value.
                break

    def subkey_names(self):
        """ Enumerate the names of the subkeys in this key.
        """
        for ikey in itertools.count():
            try:
                yield reg.EnumKey(self.key, ikey)
            except EnvironmentError:
                # Raised when we run past the last subkey.
                break

    def subkeys(self):
        """ Enumerate the subkeys in this key.
        """
        for subkey_name in self.subkey_names():
            if self.path:
                sub = self.path + '\\' + subkey_name
            else:
                sub = subkey_name
            yield OpenRegKey(self.root, sub)

def OpenRegKey(root, path):
    """ Open a key, returning None if it can't be opened.
    """
    try:
        rawkey = reg.OpenKey(root, path)
    except Exception:
        return None
    return RegKey(rawkey, root, path)

def grep_key(key, target):
    """ Print every string value under key that contains target.
    """
    for name, value, typ in key.values():
        if isinstance(value, basestring) and target in value:
            print "%s\\%s = %r" % (key, name, value)
    for subkey in key.subkeys():
        if not subkey:
            continue
        grep_key(subkey, target)
        subkey.close()

def grep_registry(args):
    for root in RegRoots.keys():
        grep_key(OpenRegKey(root, ""), args[1])

if __name__ == '__main__':
    import sys
    grep_registry(sys.argv)
Most of this is a pythonic wrapper around the _winreg module, with a few simple
functions at the end to actually search the registry.
Aptus 2.0.
Aptus 2.0, the latest version of my Mandelbrot
explorer, is now available. It's got a lot of improvements over the previous
version, including speed improvements, multiple top-level windows, tool windows
for displaying information, and Julia set support.

It's built with wxPython, so it runs on Windows,
Linux, and Mac.
Five thirty eight.
We are in full swing now in the presidential campaign, and we are constantly
bombarded with poll numbers. Funny thing is, most of those polls are just
national polls, a prediction of how the nation-wide popular vote will turn out.
But as the 2000 election underscored, that doesn't matter at all: what matters
is the electoral vote. To predict that, you'd have to track individual
state-by-state polls to see who wins the popular vote in each state, and compute
the electoral vote totals. Sounds like a lot of work, but
FiveThirtyEight.com
(Electoral Predictions Done Right) has done all the work already. They also
run statistical simulations to predict the likelihood of various outcomes (for
example: the chance of McCain losing the popular vote but winning the election
is 1.7%).
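The arithmetic behind that kind of prediction is simple enough to sketch. Everything below is invented for illustration: the states, the electoral vote counts, the poll leads, and the polling error. It says nothing about FiveThirtyEight's actual model, which is far more elaborate:

```python
import random

# Hypothetical states: (electoral votes, candidate A's polled lead in points).
polls = {
    "State A": (20, 3.0),
    "State B": (27, -1.5),
    "State C": (9, 0.5),
}

def tally(leads):
    """Winner-take-all: each state's electoral votes go to whoever leads there."""
    a = sum(ev for ev, lead in leads.values() if lead > 0)
    b = sum(ev for ev, lead in leads.values() if lead <= 0)
    return a, b

def simulate(polls, runs=10000, error=4.0, seed=538):
    """Estimate how often candidate A wins when every poll is off by random noise."""
    rng = random.Random(seed)
    total_ev = sum(ev for ev, _ in polls.values())
    wins = 0
    for _ in range(runs):
        noisy = dict((state, (ev, lead + rng.gauss(0, error)))
                     for state, (ev, lead) in polls.items())
        a, _b = tally(noisy)
        if a > total_ev / 2.0:
            wins += 1
    return wins / float(runs)

print(tally(polls))      # electoral votes if every poll is exactly right
print(simulate(polls))   # fraction of simulated elections won by candidate A
```

The tally is why national polls can mislead: a narrow lead in a big state is worth all of its electoral votes, while a landslide elsewhere earns nothing extra.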
Add extensive tables of data detailing the poll data, the simulations,
their predictions, maps of outcomes, more of the same for congressional races,
and so on, and you have a quantitative political junkie's dream site.
BTW, as of this moment, they predict an Obama win, with 339 electoral votes to McCain's 199.
And they aren't the only game in town: there's also Electoral-vote.com
(currently predicting a 329 over 194 win for Obama), and Election Projection
(364 to 174 for Obama).
3 down, 47 to go.
Connecticut has joined
the ranks of states allowing gay marriage. Good for them. The process was
similar to that in Massachusetts and California: couples sue for the right to marry,
eventually the state Supreme Court finds that either existing laws don't preclude
gay marriage, or the state constitution won't allow distinguishing between
straight and gay couples. I for one am glad. I believe that eventually this
will be accepted across the country, and people will wonder what the fuss was
about. Those predicting the downfall of society will be proven wrong. We continue
to have thriving families here in Massachusetts even after four years of gay marriage.
For a vibrant "debate" on the issue, check out the comments on
Hot Air's post about the news.
The post itself, while disagreeing with the decision, does a good job analyzing
the legal arguments in it. The comments, though, consist mostly of people hurling
invective at each other, no one being swayed by either side's arguments.
This decision will bring the usual complaints of judicial activism (actually,
they were interpreting the constitution, that's their job), the collapse of
morality (how exactly?), harm to families (by creating more of them? I don't get
it), the disenfranchisement of the people (the whole point of judges is to
decide independently of public opinion) and so on. To all of them I say, open
your eyes and close your mouths. Everything is fine.
The boogey-man of gay marriage simply doesn't exist.