Finding Average File Size with Perl

I was creating a new filesystem on a NetBSD box today, and wondered about the appropriate value for the “average file size” parameter. The first question I had was “what is the average file size in my data set?”, which I figured I could answer, since I had some representative data handy. I put together a quick one-liner to answer this question:

find /export -type f -print | perl -ne 'chomp; $count++; $total += (stat($_))[7]; END { print "$count files $total bytes total ", $total/$count, " byte average\n"; }'

After running this, I was surprised at just how large the “average” size was, which led me to wonder just which average they were looking for here: mean or median? While I was at it, I decided to calculate the mode as well.
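For anyone who wants to compare the three averages on their own data, here is a minimal sketch. It is written in Python rather than Perl and is not the script from the full entry; the function names `file_sizes` and `summarize` are made up for illustration. It assumes that, like `find -type f`, only regular files should be counted and symlinks should not be followed.

```python
import os
import stat
import statistics
from collections import Counter


def file_sizes(root):
    """Yield the size in bytes of each regular file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode):
                yield st.st_size


def summarize(sizes):
    """Return count, total, mean, median and mode, or None if empty."""
    sizes = sorted(sizes)
    if not sizes:
        return None  # avoids dividing by zero on an empty tree
    total = sum(sizes)
    return {
        "count": len(sizes),
        "total": total,
        "mean": total / len(sizes),
        "median": statistics.median(sizes),
        # With ties, this picks the smallest of the most common sizes,
        # because the input is sorted and Counter keeps first-seen order.
        "mode": Counter(sizes).most_common(1)[0][0],
    }
```

Calling `summarize(file_sizes("/export"))` returns a dictionary with all five values. A few very large files pull the mean well above the median, which is one plausible reason the "average" from the one-liner looks surprisingly large.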


Stack Overflow!

Jeff Atwood of Coding Horror and Joel Spolsky of Joel on Software, two of my favorite programming bloggers, recently created a community-driven programming Q&A site, stackoverflow.com. It’s been open to the public for a few days now and seems like it will be an extremely useful resource for developers of all types.

$10,000-per-Minute Phone

A man in Topeka, Kansas decided to write a book about churches around the country. He started by flying to San Francisco and worked his way east from there.

Going to a very large church, he began taking photographs and making notes. He spotted a golden telephone on the vestibule wall and was intrigued with a sign, which read ‘Calls: $10,000 a minute.’

Seeking out the pastor, he asked about the phone and the sign. The pastor answered that this golden phone is, in fact, a direct line to heaven and if he pays the price he can talk directly to God.

The man thanked the pastor and continued on his way. As he continued to visit churches in Seattle, Dallas, St. Louis, Chicago, Milwaukee, and around the United States, he found more phones, with the same sign, and the same answer from each pastor.

Finally, he arrived in Pennsylvania. Upon entering a church in Pittsburgh, Pa., behold: he saw the usual golden telephone. But THIS time, the sign read ‘Calls: 35 cents.’

Fascinated, he asked to talk to the pastor, ‘Reverend, I have been in cities all across the country and in each church I have found this golden telephone and have been told it is a direct line to Heaven and that I could talk to God, but in the other churches the cost was $10,000 a minute. Your sign reads only 35 cents a call. Why?’

I love this part...

The pastor, smiling benignly, replied, ‘Son, you’re in Pittsburgh, Pennsylvania, home of the Pittsburgh Steelers, now. You’re in God’s Country. It’s a local call.’

(American by Birth – A Steelers Fan by the Grace of God.)

GO STEELERS!!!!!!

Quotes are back!

I finally got around to fixing the Vertigo theme to include a random quote in the sidebar. While I was at it I opened up the layout a bit, making it easier to read code snippets and such in the main text. The newspaper-style narrow column layout is great for prose but it’s a bit constraining for technical postings.

VMware network devices and udev

I’ve been struggling for a while trying to get udev to maintain device names for vmxnet devices when running in a virtual machine. Well, I finally figured it out. The 75-persistent-net-generator.rules script in Gentoo was making rules that looked like:

SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{address}=="00:0c:29:0b:02:d2", NAME="eth0"

These seemed to be silently ignored.

After an emerge --sync; emerge -u world last night, the vmware devices started getting IDs following the highest-numbered eth device. These rules would be added to 70-persistent-net.rules, and on the next boot the devices would move up even higher, causing all network device config settings to be ignored.

I noticed that the new rules added to 70-persistent-net.rules were of the form:

SUBSYSTEM=="net", DRIVERS=="?*", ATTR{address}=="00:0c:29:0b:02:d2", NAME="eth0"

Here ATTRS was replaced with ATTR, which just means the match is done on the device itself rather than also checking all of its parent devices.

After a lot of painful attempts to fix this, I finally found the problem. Apparently the DRIVERS key is unset at this point for the vmxnet driver. I removed that test, so that I have:

SUBSYSTEM=="net", ATTR{address}=="00:0c:29:0b:02:d2", NAME="eth0"

which now works.

Phew!
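If there are many generated rules to fix, the edit can be scripted. The following is a rough sketch in Python, not part of udev or Gentoo; `drop_match_key` is a hypothetical helper name. It assumes each rule sits on one line and that values contain no escaped double quotes, which holds for the rules shown above.

```python
import re

# One udev key/operator/value token, e.g. ATTR{address}=="00:0c:..."
# Assumption: values are double-quoted and contain no escaped quotes.
TOKEN = re.compile(r'(\w+(?:\{[^}]*\})?)\s*(==|!=|\+=|:=|=)\s*"([^"]*)"')


def drop_match_key(rule, key):
    """Return the rule with every match (== or !=) on `key` removed."""
    kept = []
    for name, op, value in TOKEN.findall(rule):
        if name == key and op in ("==", "!="):
            continue  # skip the unwanted match, e.g. DRIVERS=="?*"
        kept.append(f'{name}{op}"{value}"')
    return ", ".join(kept)


rule = ('SUBSYSTEM=="net", DRIVERS=="?*", '
        'ATTR{address}=="00:0c:29:0b:02:d2", NAME="eth0"')
print(drop_match_key(rule, "DRIVERS"))
```

Run on the rule above, this prints the same DRIVERS-free rule I ended up with by hand. Assignments such as NAME="eth0" are left alone because only match operators are filtered.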