Monday, July 24, 2006

namei

I stumbled today on namei, an executable found on Linux. I'm not sure why I had never come across it before (FreeBSD not packaging it comes to mind), but it is truly useful. In a world of softlinks, with all their benefits and drawbacks, namei follows the trail until it reaches a real end point.

Here's a sample output on a RHEL box:

# namei `which java`
f: /usr/bin/java
d /
d usr
d bin
l java -> /etc/alternatives/java
d /
d etc
d alternatives
l java -> /usr/lib/jvm/jre-1.4.2-gcj/bin/java
d /
d usr
d lib
d jvm
l jre-1.4.2-gcj -> java-1.4.2-gcj-1.4.2.0/jre
d java-1.4.2-gcj-1.4.2.0
d jre
d bin
- java

The best part is that I know exactly how this was coded ... and that I don't have to do it myself.
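
For the curious, here is a rough Ruby sketch of the idea, not the actual namei source. The names trace_path, walk and MAX_LINKS are made up for illustration, the output is left unindented to match the sample above, and a missing path component simply raises Errno::ENOENT instead of being reported.

```ruby
# Hypothetical namei-like walk; names and limits are illustrative only.
MAX_LINKS = 40 # arbitrary guard against symlink loops

# Walks `path` one component at a time starting from `dir`, recording one
# line per component, and returns the location reached at the end.
def walk(path, dir, lines, depth)
  raise "too many levels of symbolic links" if depth > MAX_LINKS

  if path.start_with?("/")
    dir = "/" # an absolute path (or link target) restarts at the root
    lines << "d /"
  end

  path.split("/").reject(&:empty?).each do |name|
    full = File.join(dir, name)
    stat = File.lstat(full) # lstat looks at the link itself, not its target
    if stat.symlink?
      target = File.readlink(full)
      lines << "l #{name} -> #{target}"
      # Follow the trail, then continue from wherever the link led.
      dir = walk(target, dir, lines, depth + 1)
    else
      lines << "#{stat.directory? ? 'd' : '-'} #{name}"
      dir = full
    end
  end
  dir
end

def trace_path(path)
  lines = ["f: #{path}"]
  walk(path, Dir.pwd, lines, 0)
  lines
end
```

Calling `puts trace_path("/usr/bin/java")` on a box like the one above should print a trail of the same shape.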

Friday, July 21, 2006

Thinking versus doing

Ever had the feeling that:
  • the task at hand would be easier if you mastered the topic more?

  • you had researched a topic thoroughly but had nothing to show for it?

In the last few months, I've become more aware of the tradeoffs required to program. There is a constant battle between thinking and doing. More importantly, there is a need to be careful about over-thinking and over-doing.

Thinking
  • research
  • RTFM
  • planning
  • designing

Doing
  • implementing
  • experimenting
  • practicing
  • typing (!)

Pitfalls of "thinking" too much?
  • less/no results
  • less/no practice
  • less/no hands-on experience to guide

Pitfalls of "doing" too much?
  • re-inventing the wheel
  • less than optimal solutions
  • less/no benefits from accepted best-practice (other people's mistakes)

The key is to constantly switch modes. Code a little bit, research a little bit, refactor/readjust your code, research with your newly acquired experience, and so on.

...

A typical scenario: ASCII downcast

On one of the projects I am working on, we ask for city names and display their peer-produced restaurant reviews. The problem comes with cities like "Montréal", whose names contain accents. We capture the string as UTF-8; that's not the problem. Inserting it into a URL is another matter: it works, but it makes for ugly URLs such as Montr%E9al.
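
To see where the ugliness comes from, here is a small Ruby sketch using the standard library's URI.encode_www_form_component. It is only an illustration, not the project's code. The escaped form depends on which bytes get escaped: a UTF-8 é takes two bytes, while the %E9 form corresponds to the single ISO-8859-1 byte.

```ruby
require "uri"

city = "Montréal" # a UTF-8 string literal

# Percent-encoding works on bytes, so the result depends on the encoding.
utf8_escaped   = URI.encode_www_form_component(city)
latin1_escaped = URI.encode_www_form_component(city.encode("ISO-8859-1"))

puts utf8_escaped   # Montr%C3%A9al
puts latin1_escaped # Montr%E9al
```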

The problem becomes: how do I convert "Montréal" to "Montreal"? More broadly, how do I remove all the accents? This is something I call "ASCII downcast". I would be happy to learn the real name of this process -- it would surely guide my efforts towards a solution.

Here are the steps I followed:
  • I googled for a few keywords
    (ruby, conversion, ASCII, UTF-8, downcast)
  • I researched the Ruby standard library
    (especially: Iconv)
  • I experimented a little bit in irb
    (utf-8 to ASCII, utf-8 to iso8859-1 to ASCII)
  • I googled for more keywords, including problem-specific keywords
    (e-acute, escape sequences)
  • I investigated the tr function of String
  • I researched tr and multi-byte encoding
Time elapsed: ~1 hour.

This could have taken a LOT longer. Consider that I was already quite familiar with encodings and conversions, that I had sample files with the same content in multiple encodings, and that I knew the command-line versions of tr and iconv.

The solution that I came up with is:
  • convert UTF-8 to ISO-8859-1 (raise exception if not possible)
  • use the tr function to map all accented characters to their unaccented counterparts
  • manually map characters -- a one-shot investment
  • detect other "evil" characters (raise exception if needed)
It works, but it feels like a hack. I feel like I'm missing a library somewhere that already does this. I cannot believe that this problem has not been solved before, which implies I didn't do my research well enough.
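
The steps above can be sketched roughly as follows. This is not the original code: it assumes a Ruby recent enough to have String#encode (standing in for Iconv), and the names ascii_downcast and ACCENT_MAP, as well as the mapping table itself, are made up for illustration.

```ruby
# Illustrative table: base letter => accented variants (all within Latin-1).
ACCENT_MAP = {
  "a" => "àáâãäå", "c" => "ç", "e" => "èéêë", "i" => "ìíîï",
  "n" => "ñ", "o" => "òóôõö", "u" => "ùúûü", "y" => "ýÿ",
  "A" => "ÀÁÂÃÄÅ", "C" => "Ç", "E" => "ÈÉÊË", "I" => "ÌÍÎÏ",
  "N" => "Ñ", "O" => "ÒÓÔÕÖ", "U" => "ÙÚÛÜ", "Y" => "Ý"
}.freeze

# Build the two parallel strings that tr expects, one character per mapping.
ACCENTED   = ACCENT_MAP.values.join.encode("ISO-8859-1")
UNACCENTED = ACCENT_MAP.map { |base, variants| base * variants.length }
                       .join.encode("ISO-8859-1")

def ascii_downcast(text)
  # Step 1: UTF-8 to ISO-8859-1; raises Encoding::UndefinedConversionError
  # when a character has no Latin-1 equivalent.
  latin1 = text.encode("ISO-8859-1")

  # Steps 2 and 3: map accented characters through the hand-built table.
  plain = latin1.tr(ACCENTED, UNACCENTED)

  # Step 4: anything still outside ASCII is an unmapped "evil" character.
  raise ArgumentError, "unmapped character in #{text.inspect}" unless plain.ascii_only?

  plain.encode("UTF-8")
end

puts ascii_downcast("Montréal") # Montreal
```

Newer Ruby versions also offer String#unicode_normalize, so decomposing with :nfd and stripping the combining marks might be a less manual route, although I have not compared the two here.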

This reminds me of a similar situation I faced recently with Flickr-style tags. We implemented a "naive" version of tags for the above-mentioned project, only to find this.

It was a learning experience, both coding it and being humbled by a better solution.

Tuesday, July 18, 2006

JavaScript: The Final Frontier

While I was in California, I stopped by Borders to have a look. I thought it would be bigger and more exciting. But it was just another bookstore, much like the ones we already have in Montreal. Incidentally, I did find a great bookstore a few days later: DigitalGuru.

However, Borders had a sale on a few books, one of which was Head Rush AJAX. I mentioned before that I have great respect for the Head-First series. I could complain that their books are quite verbose, but that's a small price to pay -- they get the message across.

Once, in high school, at the very beginning of the semester, our English (ESL) teacher told us to write an essay. At the end of the semester, she asked us what we had learned this year. We made a point of explaining to her that we hadn't really learned anything and that it had been a waste of time. We probably used gentler words though. She took out the essays we had completely forgotten about. She asked us to review them. What a difference! We cringed at the mistakes we had made, at our poor vocabulary, at our incredible lack of skills -- at least in comparison with our current level. Thank you for opening my eyes, Mme Lacasse!

It is too easy to think that we aren't improving because we lack the perspective of the before and after.

When I closed the "Head Rush AJAX" book, I thought: "Thanks for nothing..." But I was mistaken. Really mistaken. I only realized afterwards, when faced with my first AJAX problem, that I understood the concepts and I knew what to do. And I knew it wasn't like that before I had read the book. There is skill in teaching without people realizing how much they are learning.

Now I can play with AJAX, feed XML back in the requests and manipulate the DOM. But JavaScript is one of the topics I don't master when it comes to web development. I can "get by" but I definitely don't feel in control. I am waiting anxiously for "Head First JavaScript" as hinted inside their last book.

I could definitely go for more effortless learning.