Posted by: bjb

I had to look at a 9-MB json file this weekend. Here’s how I converted it from one-line to indented multi-line:

$ sudo apt-get install python-simplejson
$ dpkg -L python-simplejson
$ less /usr/share/pyshared/simplejson/tests/

$ python
>>> import simplejson as json
>>> f = open ('big.json', 'r')
>>> oneline = f.read ()
>>> f.close ()
>>> ds = json.loads (oneline)                  # ds = "data structure"
>>> multiline = json.dumps (ds, indent="  ")   # two spaces / indent level
>>> f = open ('formatted.json', 'w')
>>> f.write (multiline)
>>> f.close ()
>>> ^D

In reviewing the steps for this blog post, I note that there is also a “load” function, that might be even easier.

>>> import simplejson as json
>>> dir (json)
['Decimal', 'JSONDecodeError', 'JSONDecoder', 'JSONEncoder',\
 'OrderedDict', '__all__', '__author__', '__builtins__', '__doc__',\
'__file__', '__name__', '__package__', '__path__', '__version__',\
'_default_decoder', '_default_encoder', '_import_OrderedDict',\
'_import_c_make_encoder', '_speedups', '_toggle_speedups',\
'decoder', 'dump', 'dumps', 'encoder', 'load', 'loads', 'ordered_dict',\

Example input:

[{"pk": "5", "model": "theModel", "fields": {"two": "two", "one": "one"}}]

Example output:

[
  {
    "pk": "5",
    "model": "theModel",
    "fields": {
      "two": "two",
      "one": "one"
    }
  }
]
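
The “load” function mentioned above reads straight from a file object, and “dump” writes straight to one, so the intermediate strings can be skipped. Here is a minimal sketch of that idea. It assumes the standard-library json module (which offers the same load/dump calls as simplejson), and the function name reindent_json is made up for illustration:

```python
import json  # assumption: stdlib json, whose load/dump mirror simplejson's


def reindent_json(src_path, dst_path, indent=2):
    """Read JSON from src_path and write it to dst_path, indented."""
    with open(src_path, "r") as src:
        ds = json.load(src)  # parse directly from the open file
    with open(dst_path, "w") as dst:
        json.dump(ds, dst, indent=indent)  # serialize directly to the open file
        dst.write("\n")  # finish with a newline so the file ends cleanly
```

Calling reindent_json('big.json', 'formatted.json') should then do the whole conversion in one step. The whole file is still parsed into memory, so this is no lighter on RAM than the interactive session.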

07/30: hpet_info

Posted by: bjb

HPET: High Precision Event Timer. My team is going to use it to measure some kernel activity and for starters I’ve written a userspace program that uses the timer. Well that turned out to be a bit more difficult than I thought. It’s not that well documented.

HPET is quite PC oriented, so you don’t get it on other architectures. When a machine has one, it might take over the task of the Real Time Clock and the OS timer. In each of the two machines I examined, there was one HPET block and it had three “timers” in it. Actually there was one counter and three comparators. But for the purposes of software we can consider it to be three separate but related timers. The HPET spec waxes eloquent about how you can have bunches of timers and comparators in a machine, but I haven’t seen it yet. Maybe in specialized hardware. Update: I have another machine with 4 comparators; the first two are reserved by Linux and the other two are available for when /dev/hpet is opened. If you open /dev/hpet a third time (without closing any) it will say it’s busy.

The Linux kernel uses the first two timers for itself, and makes the third one available to entities that ask for it. Userspace can use it via the /dev/hpet device. You open /dev/hpet, then you can make fcntl and ioctl calls on it. Eventually you close the device. You can open the device in read-only mode, even if you’re going to set the timer, because you will never call write on it.

The API for using the HPET is rather narrow; you can’t examine all the registers or anything. You can ask the driver to fill in hpet_info for you with the INFO ioctl call. That fills in a data structure like this (I was looking at kernel 2.6.35):

struct hpet_info {
        unsigned long hi_ireqfreq;      /* Hz */
        unsigned long hi_flags; /* information */
        unsigned short hi_hpet;
        unsigned short hi_timer;
};

I could find NO documentation on what goes into this structure, aside from reading the code. And you have to read the code pretty closely and in conjunction with the HPET spec … Anyway:

hpet_info gets filled with the info for the timer device in question. In this case, with timer 2 (because that’s the “leftover” timer that gets used for /dev/hpet requests).

  • hi_ireqfreq is (not surprisingly) the requested frequency of the periodic timer. The device driver does a little arithmetic to give you a value in Hz, rather than the raw numbers from the register.

  • hi_flags contains 0 if the timer is capable of periodic (repeated) interrupts in hardware and 2 if not (kind of a waste of a 32 bit field, oh well)

  • hi_hpet contains the id of the timer (2 in the case of timer 2) … in this block, I think. I only have one block.

  • hi_timer contains the address offset between datastructures in the kernel. I have no idea why they thought that might be interesting to userspace … Update: It is supposed to be counting the number of structures, rather than giving an address offset. But it should have given 2, and it actually returned 0x40 (twice the structure size). ???
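
To poke at these fields without writing C, the structure can be mirrored with ctypes. What follows is a rough sketch, not tested against real hardware. The ioctl number is built on the assumption that HPET_INFO is defined as _IOR('h', 0x03, struct hpet_info) and that the machine uses the common Linux ioctl bit layout (as on x86); check both against linux/hpet.h for your kernel. The names HpetInfo, hpet_info_request and read_hpet_info are made up here.

```python
import ctypes


class HpetInfo(ctypes.Structure):
    """Mirror of struct hpet_info as quoted above (kernel 2.6.35)."""
    _fields_ = [
        ("hi_ireqfreq", ctypes.c_ulong),  # requested frequency, in Hz
        ("hi_flags", ctypes.c_ulong),     # information flags
        ("hi_hpet", ctypes.c_ushort),
        ("hi_timer", ctypes.c_ushort),
    ]


def hpet_info_request():
    """Build the ioctl number, assuming _IOR('h', 0x03, struct hpet_info).

    Assumption: direction in bits 30-31, size in bits 16-29, type in
    bits 8-15, number in bits 0-7. Some architectures encode it differently.
    """
    ioc_read = 2
    size = ctypes.sizeof(HpetInfo)
    return (ioc_read << 30) | (size << 16) | (ord("h") << 8) | 0x03


def read_hpet_info(path="/dev/hpet"):
    """Open the device read-only and ask the driver to fill in hpet_info."""
    import fcntl  # imported here because it only exists on Unix-like systems
    import os

    info = HpetInfo()
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, hpet_info_request(), info)  # driver writes into info
    finally:
        os.close(fd)
    return info
```

On a machine with a free comparator, info = read_hpet_info() followed by printing info.hi_ireqfreq, info.hi_flags, info.hi_hpet and info.hi_timer should show the same values discussed in the list above.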

It turns out that (at least on my hardware) you can’t set a one-shot timer. When triggered, the interrupt handler adds the period into the comparator and the timer will trigger again. And again, and again, until you turn it off. I haven’t tried turning it off in the interrupt handler yet. Not sure if that’s a safe operation for the interrupt handler. I guess I’ll find out tomorrow! Update: Turning the interrupt off in the interrupt handler worked. That time. YMMV.

As it turns out, it would do that even if the timer was capable of doing periodic interrupts in hardware. To use the hardware periodic interrupts, you have to stop, reset and start the whole timer block (with all the comparators). Kinda catastrophic for the block with the system clock in it, so Linux just doesn’t use the hardware periodic timers.

Well I’m not done looking at this, I’ll update the post when I learn more.

Posted by: bjb
I created a django project and application and the associated database. I created the tables with syncdb over a couple of development iterations. I can still run syncdb in the original project with no output. I tried copying the project to another directory, creating a new (empty) database (db-dev) and adjusting the settings in the new project. When I run syncdb in the new project, it says:
psycopg2.ProgrammingError: relation "fileshare_language" does not exist
Shouldn’t django create the relation (table) as part of the syncdb operation?
    class Language (models.Model):
        """Class to represent the choice of languages available so far"""
        language = models.CharField (max_length = LANGUAGE_LEN)
        def __repr__ (self):
            return self.language
    class Clients (models.Model):
        """
        Class to represent the clients.  This class is associated
        with the Django User Model (where the name and email address are stored).
        """
        user = models.ForeignKey (User, unique = True)
        filedir = models.CharField (max_length=FILE_PATH_LEN)
        language = models.ForeignKey (Language)

    class AddClientForm (forms.Form):
        """
        Form for adding a client
        Also adds a django user and creates a directory
        """
        username = forms.CharField (label = ugettext_lazy (u'Username'),
            widget = forms.TextInput (attrs = {'class' : 'form_object_bg' }),
            required = True)
        firstname = forms.CharField (label = ugettext_lazy (u'First Name'),
            widget = forms.TextInput (attrs = {'class' : 'form_object_bg' }))
        lastname = forms.CharField (label = ugettext_lazy (u'Last Name'),
            widget = forms.TextInput (attrs = {'class' : 'form_object_bg' }))
        email = forms.EmailField (label = ugettext_lazy (u'Email'),
            widget = forms.TextInput (attrs = {'class' : 'form_object_bg' }))
        filedir = forms.CharField (label = ugettext_lazy (u'Files Location'),
            required = True)

        language_qs = Language.objects.all ().order_by ('id')
        language_choices = []
        for ll in language_qs:
            language_choices.append ((ll.id, ll.language))
        language = forms.ChoiceField (choices = language_choices,
                                      label = ugettext_lazy (u'Language'))
        is_admin = forms.BooleanField (label =
                                       ugettext_lazy (u'Is administrator'),
                                       required = False)
        password = forms.CharField (label = ugettext_lazy (u'Password'),
            widget = forms.PasswordInput (attrs = {'class' : 'form_object_bg' }),
            required = True)
        pwd_confirm = forms.CharField (label = ugettext_lazy (u'Password Confirmation'),
            widget = forms.PasswordInput (attrs = {'class' : 'form_object_bg' }),
            required = True)
It turns out that the attempt to put the language choices in a dropdown list in the form is causing syncdb (and every other management command) to fail with that traceback. The body of the form class runs as soon as the module is imported, so the Language query executes before syncdb has had a chance to create the table. I suppose the quick fix is to create the table and populate it manually in the empty database, and then run syncdb. Later I can fix up the form so it doesn’t have code in the middle of the field declarations. Oops.
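
The import-time behaviour is plain Python rather than anything Django-specific, so it can be shown without a database. This is only an illustration: fetch_language_choices is a made-up stand-in for the Language query, and EagerForm and LazyForm are hypothetical classes, not the real form.

```python
queries = []  # records each pretend database hit


def fetch_language_choices():
    """Stand-in for Language.objects.all(); notes that a query happened."""
    queries.append("SELECT")
    return [(1, "English"), (2, "French")]


class EagerForm(object):
    # Runs while the class statement is executed, i.e. at import time,
    # before any management command has had a chance to create tables.
    language_choices = fetch_language_choices()


class LazyForm(object):
    def __init__(self):
        # Runs only when a form instance is built, i.e. per request.
        self.language_choices = fetch_language_choices()
```

Right after the definitions, queries already holds one entry (from EagerForm) even though no form has been created. In a real Django form the equivalent move would probably be to fill in self.fields['language'].choices inside __init__.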
Posted by: bjb

I wrote a small app to allow people to sign up to declare publicly, in a theme-based community, their intention of completing a project by a certain date. It was good practice to learn about django users and authentication and forms.

I wanted to allow the users to sign up and make their own accounts — but obviously I didn’t want them to mistakenly use a username that was already taken. But there is no obvious way to do that in the Django framework. The form validation takes place in a class that does not have access to the request information (where the user id is kept).

Fortunately for me, Ian Ward has already run into that problem and has solved it, both for plain old forms and for modelforms, in a very neat and elegant way.

Posted by: bjb

I run Debian Sid on my laptop. Sid is the unstable or “Still in development” release of Debian. Just recently (on Jun 14, likely) grub was updated in a way that broke my boot. The symptom was that my laptop passed the BIOS, but printed something like “unaligned pointer 0x4c19a146” and then stopped. No grub menu, no graphics, no console, nothing. And of course, no response when you type stuff.

The (short-term, non-participative) solution is to downgrade grub until the problem is fixed. I got a question as to how to do the downgrade when your system won’t boot. So ….

First you find a CD that you can boot from. I used the Debian netinst CD. I had an iso image lying around for Debian 5.02, but pretty much any live CD that lets you have a shell will do. Stick the CD in the machine and reboot. If necessary, change the BIOS settings to boot from the CD first (not the hard disk).

The Debian netinst CD wants to do an install. Press Ctl-Alt-F2 to get another console, and press enter to get a prompt.

If you have one of those fancy-pantsy live CDs that mounts your disks for you, you might have to unmount them first. At least you’ll know what the names are, in that case …

umount /dev/sda1
umount /dev/sda5
umount /dev/sda6
umount /dev/sda7
umount /dev/sda8

Now you want to mount your hard disks onto the running system and chroot to the root of the hard disk hierarchy. In my case, I have several partitions. I can never remember what the partitions are called — are they hda, hdb, sda, sdb or something else? So I look for disk-related entries in the output from dmesg:

dmesg | egrep '[sh]d'

or

dmesg | grep disk

or even

dmesg | grep '<'

Don’t forget to quote the angle-bracket on that last one.

It turns out my partitions are:

/dev/sda1  /boot
/dev/sda2  /
/dev/sda5  /usr
/dev/sda6  /var
/dev/sda7  /home
/dev/sda8  /srv

So I created a directory /mnt/target, and mounted the partitions:

mount -t ext3 /dev/sda2 /mnt/target
mount -t ext3 /dev/sda1 /mnt/target/boot
mount -t ext3 /dev/sda5 /mnt/target/usr
mount -t ext3 /dev/sda6 /mnt/target/var
mount -t ext3 /dev/sda7 /mnt/target/home
mount -t ext3 /dev/sda8 /mnt/target/srv

Then chrooted to the top of the disk hierarchy:

cd /mnt/target
chroot .

Then mount the proc filesystem (needed for dpkg later):

mount -t proc proc /proc

I looked in the logs to see what version was broken and what version it replaced. Looked in /var/log/dpkg.log … found that grub-pc_1.98-20100614-2 was the one that had just been installed and it was replacing grub-pc_1.98-20100602-1 which used to work just fine. I have been running apt-cacher on the laptop, and looked in the cache to see if the old package was still there — it was.
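
If the log is long, a few lines of Python can pull out just the upgrade records. This sketch assumes dpkg.log upgrade lines have the form "DATE TIME upgrade PACKAGE OLD-VERSION NEW-VERSION"; the function name find_upgrades is made up.

```python
def find_upgrades(log_lines, name_prefix):
    """Return (package, old_version, new_version) for matching upgrade lines."""
    upgrades = []
    for line in log_lines:
        parts = line.split()
        # Assumed layout: DATE TIME upgrade PACKAGE OLD_VERSION NEW_VERSION
        if (len(parts) == 6 and parts[2] == "upgrade"
                and parts[3].startswith(name_prefix)):
            upgrades.append((parts[3], parts[4], parts[5]))
    return upgrades


def find_upgrades_in_file(path, name_prefix):
    """Same search, reading the lines from a log file such as dpkg.log."""
    with open(path, "r") as log:
        return find_upgrades(log, name_prefix)
```

Inside the chroot, find_upgrades_in_file('/var/log/dpkg.log', 'grub') should list each grub package together with the version it replaced and the version that came in.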

Then I was able to run dpkg to install the older version of grub. I found it with

ls /var/cache/apt-cacher/packages/*grub*

because that directory has a _lot_ of files in it and I was only interested in grub packages.

grub-pc depends on grub-common so I installed them both:

cd /var/cache/apt-cacher/packages
dpkg -i grub-pc_1.98-20100602-1-686.deb grub-common_1.98-20100602-1-686.deb

That command took care of the grub-install command for me ….

Then I rebooted and when it came up again, I saw a nice grub, and grub started up Linux and all was good. This way of rebooting will unmount the disk partitions nicely. It doesn’t matter that the netinst disk thinks you’re still in the middle of an install.

shutdown -r now

The last step is to tell dpkg and apt and the whole package management chain not to ever install a different grub version. Edit /etc/apt/preferences to contain:

               Package: grub-pc
               Pin: version 1.98-20100602-1-686
               Pin-Priority: 1001

               Package: grub-common
               Pin: version 1.98-20100602-1-686
               Pin-Priority: 1001

You’ll have to remember to change that at some time in the future so you’ll get grub updates. Personally, I’m not in a rush for that.

Note that I wrote this mostly from memory so paths and version numbers might not be exactly correct. Keep your eyes open, if you’re following these instructions!

Posted by: bjb

Not surprisingly, the blog spam commenters found my blog. Now I am investigating ways to ensure that only people comment on my blog.

With a little searching, I found that recaptcha is the recommended method these days. However, since Google has bought the recaptcha organization, you have to agree to a large legal agreement between yourself and Google to use it.

I’m not really up for that. It’s too bad as I might have liked to contribute to the “digitize old books” effort. So now I’m looking into regular captchas.

Posted by: bjb

db migration:

When creating several tables in one migration, put them all in the same class. Give the class the same name as the migration. For example:

file: 001_create_tables.rb

class CreateTables < ActiveRecord::Migration

  def self.up

    create_table :sites do |t|
      t.string      :name
      t.integer     :group_id,      :null => true
      t.integer     :moh_id
      t.integer     :playlist_id
    end

    create_table :songs do |t|
      t.string         :title
      t.integer        :site_id,     :null => true
    end

    create_table :groups do |t|
      t.string        :name
    end

    create_table :playlists do |t|
      t.string       :name
      t.integer      :song_id,    :null => true
    end
  end

  def self.down
    drop_table :sites
    drop_table :songs
    drop_table :groups
    drop_table :playlists
  end
end


If several migrations are placed into one file like this:

file: 001_create_tables.rb

class CreateSites < ActiveRecord::Migration
  def self.up

    create_table :sites do |t|
      t.string            :name
      t.integer           :group_id,         :null => true
      t.integer           :moh_id
      t.integer           :playlist_id
    end
  end

  def self.down
    drop_table :sites
  end
end

class CreateSongs < ActiveRecord::Migration
  def self.up
    create_table :songs do |t|
    end
  end

  def self.down
    drop_table :songs
  end
end


… etc

Then this error is likely: uninitialized constant CreateTables

(I guess because there wasn’t a class called “CreateTables” in the create_tables migration.)
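
To check what class name a migration file is expected to contain, here is a rough Python approximation of the mapping. It is only a sketch of the rule as I understand it (drop the numeric prefix, then camel-case the rest), not Rails’ actual code, and the function name is made up.

```python
import os
import re


def expected_migration_class(filename):
    """Guess the class name a migration loader would look for in this file."""
    stem = os.path.splitext(os.path.basename(filename))[0]  # drop dir and .rb
    stem = re.sub(r"^\d+_", "", stem)  # drop the leading sequence number
    # Camel-case the remaining words: create_tables -> CreateTables
    return "".join(word.capitalize() for word in stem.split("_"))
```

For 001_create_tables.rb this gives CreateTables, which is the constant the error message complains about when the file only defines CreateSites, CreateSongs and so on.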

Posted by: bjb

Glen Newton writes about a study done on bug-fixing speed and reliability for econometric software packages. Five proprietary packages were measured in 2004 with a common set of tests, those five packages and one open source package were measured again in 2010.

The study concludes that the open-source software has fewer known bugs at any given time (they are fixed shortly after being found), while more than half of the proprietary packages still had many of the bugs discovered in 2004 open in 2010. For example, after applying the basic set of tests to the open-source project, they found: “all of the errors were corrected within a week of our reporting.” But after all those tests were applied to the proprietary projects in 2004, only two of the vendors had solved all of those problems by 2010.

They might have picked a model open-source project for this study, but still it shows what can happen if you pick a good open-source product.

Still, when you are choosing your suite of software, choose carefully (even if every candidate for your suite is free). Try to pick the lively, active projects (but not the ridiculously lively ones, that don’t bother to go back and fix their bugs at all).

Posted by: bjb

The pressure to know stuff quickly in high tech is enormous. I was glad to run across this article about the absurdity of it.

Posted by: bjb

The byteflow community has a user mailing list and a hacker mailing list:

  1. users mailing list
  2. hackers mailing list

There is more info on the byteflow wiki and in the docs directory of the sources.

Also they talk on jabber: (logs)