Categories
Posts in this category
- Iron Man Challenge - Am I a Stone Man?
- Why Design By Contract Does Not Replace a Test Suite
- Doubt and Confidence
- Fun and No-Fun with SVG
- Goodby Iron Man
- Harry Potter and the Methods of Rationality
- Introducing my new project: Quelology organizes books
- Keep it stupid, stupid!
- My Diploma Thesis: Spin Transport in Mesoscopic Systems
- Why is my /tmp/ directory suddenly only 1MB big?
Sun, 26 Jun 2011
Introducing my new project: Quelology organizes books
For about half a year I've been working on a website called quelology, which collects book series and translations.
It is intended to answer questions of the form: I've now read "Harry Potter and the Order of the Phoenix", what is the next book in that series? or What's the name of the French translation of that book?
The website and the data mining behind it are written in Perl, and it is based on book metadata from isfdb, amazon and worldcat.
I'm working on importing data from more sources; next up will be the Swedish National Library.
After completing the data mining stage, I'll add an interface that allows visitors to edit the book, series and translation data, so that users can extend the data body.
Tue, 14 Jun 2011
Why is my /tmp/ directory suddenly only 1MB big?
Today I got a really weird error on my Debian "Squeeze" Linux box:
a process tried to write a temp file, and it complained that there was
No space left on device.
The weird thing is, just yesterday my root partition was full, and I had freed about 7GB of space on it.
I checked, and there was still plenty of room today. But behold:
$ df -h /tmp/
Filesystem            Size  Used Avail Use% Mounted on
overflow              1.0M  632K  392K  62% /tmp
So, suddenly my /tmp/ directory was a ram disc with just 1MB of space. And
it didn't show up in /etc/fstab, so I had no idea what caused
it.
After googling around a bit, I found the likely reason: as a protection against low disc space, some daemon automatically "shadows" the current /tmp/ dir with a ram disc if the root partition runs out of disc space. Sadly there's no automatic reversion of that process once enough disc space is free again.
To remove the mount, you can say (as root)
umount -l /tmp/
And to permanently disable this feature, use
echo 'MINTMPKB=0' > /etc/default/mountoverflowtmp
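To check for this situation from a script, one option is to scan the mount table. The following is only a sketch in Python: the function name overflow_tmp_mounted is made up for illustration, and it assumes the usual whitespace-separated /proc/mounts layout (device, mount point, filesystem type, options) and that the shadowing mount shows up with the device name "overflow", as in the df output above.

```python
def overflow_tmp_mounted(mounts_text):
    """Return True if mounts_text lists an 'overflow' filesystem on /tmp.

    mounts_text is expected to be in /proc/mounts format (an assumption):
    device, mount point, filesystem type, options, ... separated by spaces.
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        # Only the first two fields matter here: device name and mount point.
        if len(fields) >= 2 and fields[0] == "overflow" and fields[1] == "/tmp":
            return True
    return False
```

On a Linux box you would feed it the contents of /proc/mounts, for example overflow_tmp_mounted(open("/proc/mounts").read()).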
Mon, 22 Nov 2010
Harry Potter and the Methods of Rationality
What if Harry Potter had been raised by a loving stepmother? What if his stepfather was a scientist? What happens when somebody tries to analyze magic with scientific methods? What happens if an eleven-year-old boy is too smart for his own good?
A piece of fan fiction, Harry Potter and the Methods of Rationality by "Less Wrong" answers those questions - and makes quite a good read. If you are into fantasy books and science, you might really love it. I did.
But be warned: only read this if you've read all seven Harry Potter books by J. K. Rowling, because the fan fiction piece contains lots of spoilers.
So far 60 chapters of varying length have been published, with just a few more to be written before the first year ends. I look forward to the final chapters.
Tue, 08 Dec 2009
Keep it stupid, stupid!
How hard is it to build a good search engine? Very hard. Until now I thought that only one company had managed to build a search engine that's not only decent, but good.
Sadly, they seem to have overdone it. Today I searched for tagged dfa. I was looking for a technique used in regex engines. On the front page three out of ten results actually dealt with that subject; in the others, dfa meant dog friendly area, department of foreign affairs or other unrelated things.
That's neither bad nor unexpected. But I wanted more specific results, so I decided against using the abbreviation, and searched for the full form: tagged deterministic finite automaton. You'd think that would give better results, no?
No. It gave worse ones. On the first result page only one of the hits actually dealt with the DFAs I was looking for. In fact the first hit contained none of my search terms. None. It just contained a phrase which is also sometimes abbreviated dfa.
WTF? Google seemed to have internally converted my query into an ambiguous, abbreviated form, and then used that to find matches, without filtering. So it attempted to be very smart, and came out very stupid.
I doubt that any Google engineer is ever going to read this rant. But if one is: Please, Google, keep it stupid, stupid.
I'm fine with getting automatic suggestions on how to improve my search query; but please don't automatically "improve" it for me. I want to find what I search for. I'm not interested in dog friendly areas.
Sat, 05 Dec 2009
Doubt and Confidence
<meta>From my useless musings series.</meta>
As a programmer you have to have confidence in your skills, to some extent, and at the same time you have to constantly doubt them. Weird, eh?
Confidence
You need some level of confidence to do anything efficiently. Planning ahead requires confidence that you can achieve the steps on your way.
As a programmer you also need some confidence with the language, libraries and other tools you're using.
If you program for money, you also have to assess what kind of programs you can write, and where you might have problems.
Doubt
In the process of programming you make a lot of assumptions, some of them explicit, some of them implicit. If you want to write a good program, it's essential that you are aware of as many assumptions as possible.
When you find a bug in your program, you have to challenge previous assumptions, and that's where doubt comes in. You not only suspect, but you know that at least one of the assumptions was false (or maybe just a bit too specific), and you know that you did something wrong.
Sometimes programmers make really stupid mistakes which are rather tricky to track down. That's when you have to question your own sanity.
One example (that luckily doesn't happen all that often to me) is when I edit my program, and nothing seems to change. Nothing at all. Depending on the setup it might be some cache, but sometimes it is something even more devious: for example I didn't notice that the console where I edit and the console where I test are on different hosts, and thus the edits actually have no effect at all.
After having done such a thing once or twice I adopted the habit of just
adding a die('BOOM'); instruction to my code, to verify that
the part I'm looking at is actually run.
These are moments when I question my own sanity, thinking "how could I have possibly done such a stupid thing?". Doubt.
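The die('BOOM') habit carries over to other languages. As a rough sketch in Python (the function name tripwire is hypothetical, and reporting the host name and working directory is my own addition, aimed at the "wrong host" mistake described above):

```python
import os
import socket


def tripwire(label="BOOM"):
    """Abort right here and report where this code is actually running."""
    # SystemExit with a string argument prints the message and exits non-zero.
    raise SystemExit(
        f"{label}: host={socket.gethostname()} cwd={os.getcwd()}"
    )
```

Dropping a tripwire() call into the code path under suspicion either stops the program with a message, or proves that the code you are editing is not the code that runs.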
The same phenomenon applies when doing scientific research: since you usually do things that nobody has done before (or at least nobody has published about yet), you can't know the results beforehand -- if you could, your research would be rather boring. So you have no external reference for verification, only your intuition and discussion with peers.
Sat, 10 Oct 2009
Fun and No-Fun with SVG
Lately I've been playing a lot with SVG, and all in all I greatly enjoyed it. I wrote some Perl 6 programs that generate graphical output, and since Perl 6 is a new programming language, it doesn't have many bindings to graphic libraries. Simply emitting a text description of a graphic and then viewing it in the browser is a nice and simple way out.
I also enjoy getting visual feedback from my programs. I'd enjoy it even more if the feedback were more consistent.
I generally test my SVG images with three different viewers: Firefox 3.0.6-3, inkscape (or inkview) 0.46-2.lenny2 and Opera 10.00.4585.gcc4.qt3. Often they produce two or more different renderings of the same SVG file.
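To illustrate the "just emit text" approach, here is a small sketch. It is in Python rather than Perl 6, and the function bar_chart_svg, its parameters and the colour are all invented for this example; it is not the code behind the images in this post.

```python
import xml.etree.ElementTree as ET


def bar_chart_svg(values, bar_width=30, height=100):
    """Return an SVG document (as a string) with one bar per value.

    Assumes values is a non-empty list of positive numbers.
    """
    peak = max(values)
    svg = ET.Element(
        "svg",
        xmlns="http://www.w3.org/2000/svg",
        width=str(bar_width * len(values)),
        height=str(height),
    )
    for i, value in enumerate(values):
        bar_height = height * value / peak  # tallest bar fills the image
        ET.SubElement(
            svg,
            "rect",
            x=str(i * bar_width),
            y=str(height - bar_height),  # SVG's y axis points downwards
            width=str(bar_width - 2),    # leave a 2px gap between bars
            height=str(bar_height),
            fill="steelblue",
        )
    return ET.tostring(svg, encoding="unicode")
```

Writing the returned string to a file with an .svg extension and opening it in a browser is all the "graphics binding" that is needed.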
Consider this seemingly simple SVG file:
<svg
    width="400"
    height="250"
    xmlns="http://www.w3.org/2000/svg"
    xmlns:svg="http://www.w3.org/2000/svg"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    style="background-color: white"
>
    <defs>
        <path id="curve"
              d="M 20 100 C 60 30 320 30 380 100"
              style="fill:none; stroke:black; stroke-width: 2"
        />
    </defs>
    <text font-size="40" textLength="390" >
        <use xlink:href="#curve" />
        <textPath xlink:href="#curve">SPRIXEL</textPath>
    </text>
</svg>
If your browser supports SVG, you can view it directly here.
This SVG file first defines a path, and then references it twice: once to place a text on the path, and once to simply draw the path itself, with some styling information.
Rendered by Firefox:
Rendered by Inkview:
Rendered by Opera:
Three renderers, three outputs. Neither Firefox nor Inkview supports the
textLength attribute, which is a real pity, because it's the only
way you can make a program emit SVG files where text is guaranteed not to
overlap.
If you scale text in Inkscape and then put it onto a path, the scaling is
lost. I found no way to reproduce Opera's output with Inkscape without
resorting to really evil trickery (like decomposing the text into paths, and
then cutting the letters apart and placing them manually). (Equally useful is
the dominant-baseline attribute, which Inkscape doesn't support
either.)
The second difference is that only Firefox shows the shape of the path.
Firefox is correct here. The SVG specification clearly
states about the use element:
For user agents that support Styling with CSS, the conceptual deep cloning of the referenced element into a non-exposed DOM tree also copies any property values resulting from the CSS cascade [CSS2-CASCADE] on the referenced element and its contents. CSS2 selectors can be applied to the original (i.e., referenced) elements because they are part of the formal document structure. CSS2 selectors cannot be applied to the (conceptually) cloned DOM tree because its contents are not part of the formal document structure.
Sadly it seems to be a coincidence that Firefox works correctly here. If
the styling information is moved from the path to the
use element the curve is still displayed - even though it should
not be.
Using SVG feels like writing HTML and CSS for 15-year-old browsers, which had their very own, idiosyncratic idea of how to render what, and what to support and what not.
Just like with HTML I have high hopes that the overall state will improve;
indeed I've been told that Firefox 3.5 now supports the
textLength attribute. I'd also love to see wide-spread support
for SVG animations, which could replace some inaccessible Flash
applications.
Tue, 04 Aug 2009
Goodby Iron Man
<update> (from 2009-08-23) It turned out that my disappearance from the ironman blog feed was due to a broken RSS feed. Matt S. Trout tried to inform me by blog comment, but my blog marked it as spam and swallowed it.
So now we talked on IRC, clarified things, and I'm back in the game. </update>
So I accepted the Iron Man blogging challenge a few months ago. And last week I discovered that my blog was gone from their feed. For the second time. Without any notification.
Image: rusty iron man, by courtesy of artvixn, available under a Creative Commons non-commercial by-attribution license.
The first time they had a good reason: the date tags in my RSS feed were goofed; still, I'd have thought it would be nice to at least notify me of such a removal. After some mails back and forth I was able to fix it; after the second removal without any notification I'm simply fed up and don't want to invest any more energy into this.
Still, I'll continue to follow the collected RSS feed; there are still many interesting blogs to be read there.
Tue, 23 Jun 2009
Iron Man Challenge - Am I a Stone Man?
Gabor asked what I'm missing from the Iron Man blogging challenge. Gabor focused on the contents of the blog posts, I'll talk about the challenge itself.
I'm missing the things announced on their website: a way to find out to which level you made it, a monthly selection of best blog posts, and all these other things that were designed to create some competition, and more fun.
Don't get me wrong, I like to read the blogs of my fellow Perl programmers, and it motivates me to write more often myself. But that's not all that was promised to us.
One thing I'd like to add about the content, though: So far most of what I read was very good and informative, but it was all text. I know it's not easy to find nice on-topic programming pictures, and use.perl.org doesn't even allow the inclusion of pictures in posts, and I don't do it often myself, but having more pictures or charts would be nice.
Mon, 01 Jun 2009
Why Design By Contract Does Not Replace a Test Suite
"Design By Contract" (DBC) usually refers both to very sophisticated assertion systems (for example in which assertions are inherited along with the methods to which they belong), and to the practice of using such assertions extensively, not only for quality assurance but also as a form of documentation.
When I was mostly programming in Eiffel some years ago, I liked DBC very much, and I still think that it's a very good idea, and that more programming languages should offer good support for it.
However there's one comment that I've seen frequently on the web, in blogs and on IRC. Often DBC evangelists say something along these lines: "We have DBC, we don't need a test suite". I find such comments incredibly stupid, and here I want to write down why.
Code needs to run
If you want to verify that the code does what you want, you have to actually run it - otherwise the assertions won't be triggered, and are worthless as a verification tool.
You shouldn't just run it once, but should, when possible, cover every code path, just like you would when writing tests. Doing that manually requires much work, so you still need a test suite that you can run to verify that some changes didn't break anything.
Examples are easy, general rules are hard
Test cases are just example input, paired with the expected output. Usually it's pretty easy to come up with examples, so writing tests is also easy, even for corner cases.
On the other hand assertions are rules that have to hold for all possible input data, so to formulate them, you have to consider the general case. That's usually rather hard, so the lazy programmer leaves out the hard cases.
A simple example: suppose you've written a subroutine that adds two numbers (for example for a bignum library). Writing assertions for the general case of addition is quite hard if you can't trust your subtraction routine; so the only thing you can really do is to check the signs (positive number plus positive number is positive etc.), but that won't catch any off-by-one errors.
So you should also write tests; tests like add(3, 4) == 7 are
trivial to come up with, and catch potential errors.
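The addition example can be sketched in a few lines. This is Python rather than Eiffel, plain assert statements stand in for a real contract system, and all names (checked_add, buggy_add, correct_add, passes_example_test) are invented for the illustration.

```python
def checked_add(a, b, add_impl):
    """Call add_impl(a, b) and check the sign contract on the result."""
    result = add_impl(a, b)
    # Contract: positive plus positive is positive,
    # negative plus negative is negative.
    if a > 0 and b > 0:
        assert result > 0, "sign contract violated"
    if a < 0 and b < 0:
        assert result < 0, "sign contract violated"
    return result


def correct_add(a, b):
    return a + b


def buggy_add(a, b):
    return a + b + 1  # deliberate off-by-one error


def passes_example_test(add_impl):
    """The trivial example-based test from the text: add(3, 4) == 7."""
    return checked_add(3, 4, add_impl) == 7
```

Here checked_add(3, 4, buggy_add) returns 8 without tripping the sign contract, while passes_example_test(buggy_add) is False: the one-line test catches what the assertion cannot.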
Conclusions
Design by Contract and testing should go hand in hand so that the tests exercise as many code paths as possible, and should cover those areas that are hard to validate with assertions.
DBC should not be viewed as a replacement for tests.
Thu, 18 Dec 2008
My Diploma Thesis: Spin Transport in Mesoscopic Systems
Sometimes people ask me what I'm doing right now, and I tell them "I'm writing my diploma thesis on mesoscopic spin transport", and they know just as much as before. So here I want to explain what that means.
Mesoscopic systems
A mesoscopic system is one that is larger than a few nanometers, but still small enough that you have to care about quantum effects.
That's not a very precise definition, so I'll try again: Consider a metallic wire. For macroscopic systems (i.e. the ones that we are used to in day-to-day life) you might know that the electrical resistance of such a wire increases linearly with its length, and is inversely proportional to its cross section.
This is very intuitive, because electrical resistance describes how hard it is for an electron to travel through our wire. If the wire is longer, the electron encounters more obstacles, so the resistance is higher. If the wire has a larger cross section, it's easier for the electron to find a way that's not blocked, so the resistance is smaller. That's called Ohm's law.
These relations aren't true anymore for rather small systems. If you have a very thin wire, say 20 nanometers, and increase its diameter by another nanometer, the resistance might not change at all. Increase its diameter by yet another nanometer, and the resistance suddenly drops by a few percent.
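Written as a formula, the macroscopic relation described above is commonly given as follows. The symbols are introduced here for illustration and are not used elsewhere in this post: R is the resistance, \rho the resistivity of the material, L the length of the wire and A its cross-sectional area.

```latex
R = \rho \, \frac{L}{A}
```

Doubling L doubles R, while doubling A halves it.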
All these systems that are too small for Ohm's law to apply are called mesoscopic. All mesoscopic effects have to be explained with quantum physics, at least at some point.
Electron Spin
Electrons have a property called spin. An electron carries a charge, and it acts as if it rotated around its own axis very fast. So it looks like a current running in a circle, and that creates a small magnetic field.
If you try to measure the magnetic field of one electron, you will only ever get two possible values, which we call spin up and spin down.
Spin Optics
In a semiconductor, one can split up a beam of electrons into two beams of spin-up and spin-down electrons, just like in optics with polarized light. That splitting can be influenced by an external voltage, like a classical transistor.
The topic of my diploma thesis is to figure out how such spin polarized electron beams behave in certain semiconductor systems.