Wednesday, January 18, 2012

A Brief Detour into JavaScript, XHTML, and HTML5

We will return to Ubuntu and Hadoop later...

My wife helps maintain a website for a volunteer organization, and I am powerless to help much because, despite years of programming, I have essentially zero knowledge of JavaScript and XHTML.  Sure, I've read some JavaScript and XHTML, but all my writing has been like my SQL "writing": take some existing code, make an extremely minor change, and hope for the best.  What I saw of XHTML just felt "strange".  Not as bad as early editions of J2EE, with Factories, FactoryFactories, and FactoryFactoryStrategies (o.k., I'm kidding some), but pretty bad.

But, it would be useful to actually know a bit more.  Also, I want to learn HTML5, so it seemed like a good time to learn some basics of JavaScript and XHTML.  So, to get started, I visited the local library and checked out JavaScript and AJAX for Dummies by Andy Harris. I already own a couple of HTML5 books.

From Andy's book, here is the XHTML version of the classic "Hello World" program.  I have removed all the < and > characters because they confuse Blogger.


!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"
html lang="EN" dir="ltr" xmlns="http://www.w3.org/1999/xhtml"
  head
    meta http-equiv="content-type" content="text/xml; charset=utf-8" /
    titleHelloWorld.html/title
    script type = "text/javascript"
      //![CDATA[
        // Hello, world!
        alert("Hello, World!");
      //]]
    /script
  /head
  body  
  /body
/html


Now, I'm a Java programmer and used to some verbosity, but this really pushed the limits.  Speaking as a complete outsider from Mars looking at this, what the heck were they thinking?  There's the !DOCTYPE tag that I guess you just get used to typing, or copy from a pre-existing file.  Same for the html lang="EN" attributes, etc.

There's the meta tag.  What does that even mean?  That I'm UTF-8 and text/xml.  I know about internationalization, where it might not be UTF-8.  But I already said that I'm English via lang="EN".  And UTF-8 is the natural encoding for English, so it should be the default, shouldn't it?  Again, what were they thinking?

Finally, the really mysterious //CDATA stuff.  Now, CDATA means "Character data".  If the XML wizards are going to force me to write 174 characters (for the DOCTYPE and html tags) just to get started, why are they suddenly trying to save a few characters here???  Just call it CHARACTER_DATA or whatever.  Secondly, why is this commented out???  I think I finally understand: it's commented out with "//" so that JavaScript will ignore it.  And, fortunately, XML parsers do not treat "//" as a comment, so they still see the CDATA markers.  What a hack!  Now, I'd probably appreciate the hack in some deep part of the Java Runtime or the Linux kernel.  But as something out in the open that everybody is supposed to use around every piece of JavaScript, all I can say, yet again, is what were they thinking?  It's quite obvious that whoever wrote the XHTML standard had no thought for usability, elegance, or thought.  Whenever they saw a hoop for coders to jump through, they added two just to make sure.  I expect someday to see it revealed that XHTML was actually a social experiment like the Stanford Prison Experiment: given a big-shot committee, just how much pain and agony will users tolerate?

By contrast, here is the HTML5 version.  (Again, < and > removed)

!DOCTYPE HTML
html
   head
   titleHello World/title
   script type="text/javascript"
      alert("Hello World");
   /script
   /head
   body
   /body
/html   
     

Notice something?  All that crap is gone!  Wonderful.

As it turns out, my wife might want a Calendar thingamabob on her page.  You can find lots of JavaScript code to do this.  I found some code, a mere 220 lines long (sarcasm intended), that you can cut and paste into your page.  With HTML5, you can do it with 0 lines of code: just say input type="date".

In conclusion, I am exceedingly glad that I never bothered to learn XHTML.  I hope to remain, as much as possible, proudly ignorant of XHTML.  HTML5 has made an instant convert of me.

Tuesday, January 3, 2012

Adventures in Ubuntu and Hadoop Part 4

Let's recap Rocky and Bullwinkle's adventures with Hadoop and Ubuntu on an old desktop computer. They fairly easily installed Ubuntu, Java and Eclipse, then installed an SSH server and set the IPv4 address, and then, in their greatest adventure, finally got X-Windows (and VNC) working. Now we are finally ready to install Hadoop. I plan to follow the blog post by Michael Noll and the O'Reilly Hadoop book by Tom White (esp. Appendix A). According to Michael Noll, I have more than satisfied the prereqs.

Recent Hadoop releases are here or here. Some strange domain names. As of today, 0.20.203 is the latest stable release, so I grabbed it and put it in /opt/hadoop. Then, per Michael's instructions, run these two commands from that directory:

sudo tar xzf hadoop-0.20.203.0rc1.tar.gz
sudo chown -R hduser:hadoop hadoop-0.20.203.0

Then edit /home/hduser/.bashrc (don't forget to type sudo!) to add the following lines at the end. YMMV depending on exactly where your Hadoop and Java are installed.


# Set Hadoop-related environment variables
export HADOOP_HOME=/opt/hadoop/hadoop-0.20.203.0

# Set JAVA_HOME
export JAVA_HOME=/opt/java/32/jdk1.6.0_30

# Add Hadoop bin/ directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin
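If you end up re-running these steps, it is easy to paste the same export into .bashrc twice. Here is a rough sketch of one way to avoid that. The helper name append_once is my own invention (it is not from Michael's post or the Hadoop book), and it assumes you run it as a user who is allowed to write to the target file.

```shell
# append_once FILE LINE
# Hypothetical helper: appends LINE to FILE unless an identical line
# is already there, so repeated runs do not pile up duplicates.
append_once() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    if ! grep -qxF -e "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}

# Example use (commented out so that sourcing this file changes nothing):
# append_once /home/hduser/.bashrc 'export HADOOP_HOME=/opt/hadoop/hadoop-0.20.203.0'
# append_once /home/hduser/.bashrc 'export JAVA_HOME=/opt/java/32/jdk1.6.0_30'
# append_once /home/hduser/.bashrc 'export PATH=$PATH:$HADOOP_HOME/bin'
```

The single quotes matter: they keep $PATH and $HADOOP_HOME from being expanded when the line is written, so the expansion happens later, each time .bashrc runs.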

The Hadoop book suggests that you test if it will run by typing hadoop version. Before this will work, either log in again so that .bashrc runs, or manually do all three exports in your current shell. If you forget to export JAVA_HOME, you'll see a useful, informative message

Error: JAVA_HOME is not set.
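To see which of the three settings is missing in the current shell, a small check along these lines might help. This is only a sketch: check_hadoop_env is a made-up name, not something that ships with Hadoop, and it only looks at the variables set in .bashrc above.

```shell
# check_hadoop_env
# Hypothetical helper: prints one line per problem it finds and
# returns non-zero if anything is missing.
check_hadoop_env() {
    rc=0
    for name in HADOOP_HOME JAVA_HOME; do
        # Look up the value of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set"
            rc=1
        fi
    done
    # The hadoop command only resolves if $HADOOP_HOME/bin is a PATH entry.
    case ":$PATH:" in
        *":${HADOOP_HOME:-unset}/bin:"*) ;;
        *) echo "PATH does not contain \$HADOOP_HOME/bin"; rc=1 ;;
    esac
    return "$rc"
}
```

Running check_hadoop_env with everything in place prints nothing and returns 0. Any line it prints names the export that still needs doing.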

But, once you set all three, you'll see something like

mpc@mpc-desktop:/home/hduser$ hadoop version
Hadoop 0.20.203.0
Subversion http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-203 -r 1099333
Compiled by oom on Wed May 4 07:57:50 PDT 2011


Woohoo! Our work is done! Well, not really, there's still a whole bunch to go, like configuring the Hadoop Distributed File System (HDFS). But let's declare victory for now and return to that on a later day.