Friday, December 07, 2007
Apple envy
Scaling
What if pedantic civil servants wrote software?
Sunday, November 25, 2007
WideFinder - late entrant
Snagged me a log file and started cranking out some code tonight. I had some free time to think about this a couple of weeks ago, and got some sketches down, but I've only recently got the data to start seeing how the code can fly. So, I'm starting out on my work Dell laptop, Dual Core Pentium 2GHz with 2GB RAM and a really shitty disk, judging by the slowness and noises it makes (or is that just Vista?) [stay on topic! - Ed]. The Ruby version runs in just over a minute, once all of the caches are warmed up. My initial naive Java version runs in 14 seconds (I haven't figured out yet how to run it using time as per *nix environments - Cygwin says it can't find the time command when I pipe zcat output into it).
Now to start implementing my ideas. I think I have the shared update of the accumulator about as good as I'm going to get it. I'm hypothesising that most of the updates are uncontended and so don't require the full weight of Java's locking capabilities. Now I just need to parallelize the I/O and determine the most efficient matching algorithm, which seems to be Boyer-Moore from reading the Wide-Finder series. That particular algorithm seems to pop up fairly regularly in searching. Might be interesting to see what else is available in that field, but it should be in a library, surely?
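For reference, the skipping idea behind Boyer-Moore can be sketched in a few lines. This is the simplified Horspool variant (bad-character shifts only), not the full algorithm, and it is written in Python for brevity rather than the Java used for the real entry. The function name and the sample log line are made up for illustration.

```python
def horspool_find(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # Shift table: for each character in the pattern (ignoring its final
    # position), how far the pattern can slide so that the character's
    # last occurrence lines up with the text.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    pos = 0
    while pos + m <= n:
        if text[pos:pos + m] == pattern:
            return pos
        # Decide the jump from the text character under the pattern's last
        # position; characters not in the pattern allow a full-length jump.
        pos += shift.get(text[pos + m - 1], m)
    return -1


# Hypothetical log line, only to show the call.
LINE = "GET /ongoing/When/200x/2007/06/17/Web3S HTTP/1.1"
print(horspool_find(LINE, "/ongoing/When/"))
```

The win over a naive scan is that most mismatches let the search jump several characters at once, which matters when the haystack is a multi-gigabyte log.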
Development updates
Just logging the early signs. Connor was walking to nursery with Al the other day, and kicked a cat. There is a history there, the cat had previously scratched him, but even so! The other sign to be aware of is from when the boys recently watched the middle Star Wars trilogy. Afterwards, Callum wanted to role play and be Luke. Connor wanted to be Darth Vader. Enough said.
Oakley For Ever
I love Oakley [1], and not just for the quality of the purchased product.
- In his book, Lance Armstrong recounts telling his Oakley sponsor that he's been diagnosed with cancer, but he's not got health insurance. The guy from Oakley tells Armstrong not to worry and he sorts it out by telling the insurance company that Armstrong had better damn well be covered, otherwise Oakley will stop doing their company health insurance scheme through that insurance company. I'm happy spending money with a company that employs that sort of human being.
- When I recently lost one of the rubber noseclips from one of my pairs of Oakleys, I phoned up customer service and they posted some out the same day, no charge. That's just great service, and ensures that I'll continue to buy Oakleys and will probably be getting them from my three sons over time as well. Why aren't more companies that switched on about having a great long-term relationship with customers?
Saturday, November 10, 2007
Albarracin trip day six - return to Sol
Went back to Sol, to go to the other end of it. Really struggled warming up; my arms would not work, and I have very mushy thin tips on all of my fingers and thumbs. Scott did a nice roof, first or second go. I really struggled to link it until I just got really angry, and just pissed up it. Why didn't I do that the first time and save all of the skin and energy, fool? Found it hard to get psyched today; tiredness and missing the family is starting to tell. Next we tried a really cool roof traverse into a slopey top-out. Great moves, but a bit hard for me in that state. Around 7a again, but I split a tip on about fourth go and that was me done. I'd got overlapping halves on it apart from topping out, which I'd split my tip on. Scottie got overlapping halves and topped out, but then seemed to lose power, the sun came around a bit more or something and he didn't look as smooth on it. I then tucked into the baguettes while he thrashed himself on a couple of other things.
While we were in the area, we went over to Masia. That looks like it's got a few nice things to do as well, so something else to go back for. Final night, we went back to El Molina del Gato, which serves nice beer, has strange music and all of the climbing topos, including French ones so we could work out where we'd gone wrong in looking for Tierra Media. And then another disappointing meal in a place off the main square. Not sure about Spanish food; it must be better than that. Maybe we couldn't speak the language that well, but people couldn't recommend stuff either, and we were getting cold peppers out of a tin served with bad fries and indeterminate meat. Not inspiring. I was already dreaming of Thai green chicken curry.
Albarracin trip day five - return to Arrastradero
Had some unfinished business with a sit down start to an arete here, so wanted to go back. Mint conditions, quite cold and hard to warm up. Skin still felt trashed. Met up with the two guys from Bristol that we'd met at the car park earlier in the week. I think it was their first time to the area, so they were running around dead keen. Managed to do a few nice things; there was a place round the back quite similar to l'Elephant with sloping flared cracks. There was a really good-looking 7a rib (8a sit start) which I didn't try due to trying to preserve skin, and the original walls that we'd seen on our first visit to this area that we still hadn't tackled. Oh well...
Albarracin trip day four - rest day
Day four for the first rest day? Well, not really. The first day wasn't much; an hour or so of tinkering. So we've pretty much done the 2 days on, 1 day off thing. It seems to have worked out well, as this was a planned rest day, and it started raining last night.
So a very late start today, after a lot of vino tinto last night. We decided to go exploring and work out where best to go on our last two days. First off, up towards Techo. This was the first area that we found using the topo and didn't get lost! This looks like quite a brutal hardcore place. OK, so it was a rest day and we're feeling (or I am anyway) quite shredded, but most stuff here looks like you need to be doing 7b to get much out of coming here. There were some inspiring roof lines that looked around 8a; I'll save them until I've lost 10 kg and got strong again.
We carried on up the hill towards Madriles and Pyschokiller; more big stepped roofs again. Although apparently Pyschokiller might not allow climbing there. Some of the areas are restricted, but not at the time we were here.
Then we went looking for the visitor's centre that was listed on the national park signs. Unfortunately, our written Spanish was enough to understand that it was only open at weekends and holiday periods. But the drive to get there was quite something. The park seems to be raised above the plateau, and we had some fantastic views of the flood plain and the sandstone towers elsewhere. It looked like going for a walk in that area would be worthwhile on the rest day as well, if we were there for a longer period.
To round off the day, we tried to find Tierra Media, but our Spanish wasn't up to the task. We did find some striking red trees though; not sure if they were seasonally that colour, or that was their natural plumage, so to speak.
Friday, November 09, 2007
Albarracin trip day three - Sol
Well, a bit more about where we're staying. We're at Camping Albarracin. It's a somewhat strange place in terms of requiring an international bank transfer to pay the deposit, but they accepted Visa when I came to pay the balance on arrival. We're in a bungalow advertised as being suitable for four, and it would do that, but it feels very bijou with just the two of us. Not sure how Andy, Emma et al will find it next week!
Went to Sol today, and again had slight problems finding our way there. We followed our Spanish topo, but went a bit far before breaking up the hill parallel to the road. Sol was good, lots of stuff to warm-up on. My skin is feeling it already though. Trying a big roof today; I couldn't quite use Scott's sequence due to not having sixty foot arms, but need to work on press moves a bit more. I couldn't quite press out over the lip enough.
Albarracin trip day two - Arrastradero
Had the most awful food at Hotel Albarracin last night. It was steak, but well-done at one end and completely raw at the other (proper raw, not just the raw that I've had in certain French establishments). So, late start before getting up to go climbing. Lovely day again, got a bit lost going to where we intended. We went up to the main car park, and then along the road a bit before striking left into the forest, past a collection of cave paintings. This was thanks to our reading of the topo, and we had inadvertently gone to the wrong side of the hill. We noticed some enticing looking walls on the left (minimum 7a?, so a bit hard to warm up on) and eventually got to where we wanted to be. What we should have done is walked up past the swing park, until you see the climbing area on your right. But our Spanish was non-existent. Another good day, good rock and lovely area. I'm not getting any power onto the rock really, been pulling on blobs too much and got no topping out skills for these slopey topouts, so I'm struggling on everything except the crimpy overhanging walls. Scott is climbing really well, but then he climbs on rock lots! Jealous, moi!?
Albarracin trip day one
Flew into Madrid, managed to get all of the bouldering kit including my unfeasibly large extending stick on the plane and then drove to Albarracin. No map in the hire car (thanks Europcar!) but a combination of the AA and Google Maps directions got us there without any mishaps. We got there at about six in the evening and Scott was mad keen. I was more up for a beer, but we went for a drive, found some rocks and had a quick play. The rock and national park is really cool, and we were climbing as the moon came up. Beautiful. Sadly Al's got the camera this week, so I hope that my phone will suffice. Think we're going to have a good week.
Saturday, October 20, 2007
How to remove a key when a 2 year old has snapped the key off in the lock
Google to the rescue again. In case this helps anyone else avoid locksmith callout charges on a Saturday, I started off here and here, and was sufficiently enlightened as to solve it myself. Try checking that the key is in the correct orientation to be removed, twisting using a screwdriver where necessary. Then you should hopefully be able to use another key to push it out from the other side, and use tweezers to coax it out the rest of the way. The suggested superglue method wasn't necessary, but is a nice trick.
Monday, October 08, 2007
Java Mock Objects tip
Discovered while using JMock, but I would imagine it's also good for EasyMock, RMock, ...
checking(new Expectations() {
    {
        one(httpServletRequest).getParameter("c");
        will(returnValue("-2"));
        one(httpServletRequest).getParameterNames();
        // StringTokenizer implements Enumeration. A bit cheeky!
        will(returnValue(new StringTokenizer("c")));
    }
});
Thursday, October 04, 2007
Surgery
How I love the smell of burning vas in the evening! Recovery tips:
- treat it as a soft tissue injury and ICE it!
- Arnica; there's going to be a lot of bruising, so any old snake oil is worth a try.
Tuesday, October 02, 2007
Novarra makes a play for End Game
So there's been a bit of a kerfuffle over here. What's happened is that Vodafone UK have started using Novarra's content adaptation technology to transcode the internet for mobile phones. Why has this caused such irate responses in some quarters? Well, a history lesson seems to be in order...
The mobile internet started out with devices unable to render nearly all of the internet due to hardware constraints. WML was introduced as a very limited markup which phones were able to use to display content. Early WML sites were, needless to say, fairly limited in functionality, and tended to be exposed as distinct resources; e.g. Apache user-agent detection would be used to send you to http://wap.example.com/ if you accessed http://www.example.com/ with your phone. Moore's law still holds in this arena; devices got a bit more capable and iMode (CHTML) was introduced in Japan. This was initially something like HTML 2.0 / 3.2 without <table/>s, but with the better networks over there, this was somewhat more successful in the marketplace. The W3C stuck in their thumb and pulled out XHTML Basic, and then there was the related XHTML Mobile Profile.
So who is so offended by what Vodafone UK and Novarra have done? Well, mainly the mobile internet community. The solutions that the community provide have evolved over time from hand-coded WML sites, iMode sites, XHTML Basic sites, sites where the view renders the model to the appropriate markup and DIAL processors / Drutt / Volantis's offerings in this market. There is an entire industry sector dedicated to providing solutions in this area and nearly all of those solutions rely on the HTTP header User-Agent being present and being a reliable indicator of the requesting client. Novarra / Vodafone UK have introduced a fairly disruptive change which could be viewed as an attempt to change the rules of the game. They're making some pretty provocative statements along the way.
2.3.1.7 A content transformation server can do a better job of following
mobile best practices
The "Mobile Web Best Practices 1.0" W3C Proposed Recommendation [1]
contains many recommendations for authoring content that is intended for
viewing on a mobile device. A well-designed content transformation
server can do a better job of following the mobile best practices than a
human author, especially when taking into account the capabilities of
the many different mobile devices. The result will be a more
consistent, uniform experience.
I call BS. The incumbent mobile content industry is feeling the pain, but this could just be a game-changing move like Google upping the storage limits on webmail. Time will tell whether the market (mobile phone customers) feels that the Vodafone UK solution is good enough, or whether a more open market will be preferred. I'm all for the Ubiquitous Web when it's good for the customer. So here's a little gedankenexperiment. What happens to the mobile content industry if all carriers start using a content-adaptation proxy? How else is your company adding value? Evolving markets are hardly a new phenomenon, although maybe the rate of change is a little faster in these modern times...
Sunday, September 30, 2007
Connor responding to discipline
We were out in a restaurant for Al's birthday meal and Connor was playing with the door where we were sitting. We warned him not to do that or the waitress would come over and tell him off. Five minutes later, the waitress came over and the boy's first reaction?
"I'm not touching it!"
Key log
Not blogged for a while. Usual issues:
- Been busy at work.
- Been on holiday and had a lot of catching up (reading blogs!)
- Been ill
Saturday, September 29, 2007
Happy Birthday Babe
Friday, September 07, 2007
Rugby World Cup - France vs Argentina
I don't think much of the English chances, but bring it on!
Update 1: 9 - 17. The French seem to be choking. Lots of basic errors and not converting their chances. The Pumas are really going for it. We could be on for a massive upset and blow to the hosts' (and favourites in some eyes) chances... (Loving Will Greenwood as a pundit as well).
All consuming
Saturday, August 18, 2007
Amazon losing capability in core function?
I was watching Steve Yegge's talk from OSCON about branding, and one of the things he touched on was Amazon's brand is books, and they want to get out of that since Jeff Bezos thinks that certain consumables will become totally digitized. My current thinking is that they are focusing on other things anyway (hosting, anyone?) and maybe this is taking away from their original core competency. Recent orders of mine have taken a lot longer than they used to. Previously, I'd order stuff on a Monday and it would arrive on Wednesday or Thursday the same week, with Super Saver Delivery. Now it takes a lot longer. So I'm still waiting for the Erlang, ANTLR, Haskell and REST books to arrive that I'd hoped to read on holiday.
Thursday, August 02, 2007
Hello Cameron
Jython UnicodeData hacking in the CDS
Got numeric and decimal working in between contractions in the Central Delivery Suite at Frimley Park Hospital today. Nearly got categories working as well, except I'm not clear how CPython has implemented unicodedata.category for undefined codepoints.
e.g. Python 2.5 uses Unicode 4.1 for the Unicode database. The integer codepoint 13313 (decimal) / 3401 (hex) is not defined within Unicode 4.1.
3400;;Lo;0;L;;;;;N;;;;;
4DB5;;Lo;0;L;;;;;N;;;;;
It isn't defined in Unicode 5.0 either, which is what I've been using to do the Jython implementation.
3400;;Lo;0;L;;;;;N;;;;;
4DB5;;Lo;0;L;;;;;N;;;;;
So how does CPython define unicodedata.category(unichr(13313)) to be 'Lo'? And it doesn't seem to be just 'Lo' in all cases of undefined items. I'm speculating that it might be falling back to the preceding valid codepoint category. Think I need to post to a CPython list to confirm.
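One possible explanation, offered as a guess about the data format rather than a statement about CPython's internals: UnicodeData.txt does not list every codepoint in the large ideograph blocks. Instead, a pair of records whose name fields end in ", First>" and ", Last>" bracket a range, and every codepoint in between shares those properties. The 3400 and 4DB5 records are such a pair in the published file (their name fields show up empty above). A minimal Python 3 sketch of a lookup that honours those ranges follows; the function names and the "Cn" default for unassigned codepoints are assumptions made for the example.

```python
def load_categories(lines):
    """Split UnicodeData.txt-style lines into exact entries and ranges.

    Assumes well-formed input: every ', First>' record is followed by
    its matching ', Last>' record.
    """
    singles, ranges = {}, []
    range_start = None
    for line in lines:
        fields = line.strip().split(";")
        code, name, category = int(fields[0], 16), fields[1], fields[2]
        if name.endswith(", First>"):
            range_start = code
        elif name.endswith(", Last>"):
            ranges.append((range_start, code, category))
            range_start = None
        else:
            singles[code] = category
    return singles, ranges


def category(codepoint, singles, ranges, default="Cn"):
    """Look up the general category, falling back to enclosing ranges."""
    if codepoint in singles:
        return singles[codepoint]
    for first, last, cat in ranges:
        if first <= codepoint <= last:
            return cat
    return default  # nothing covers this codepoint


SAMPLE = [
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "3400;<CJK Ideograph Extension A, First>;Lo;0;L;;;;;N;;;;;",
    "4DB5;<CJK Ideograph Extension A, Last>;Lo;0;L;;;;;N;;;;;",
]
singles, ranges = load_categories(SAMPLE)
print(category(13313, singles, ranges))
```

If that reading is right, it would also account for the result not being 'Lo' for every unlisted codepoint: only those that fall inside a First/Last pair inherit a category.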
Tuesday, July 31, 2007
Google Reader Bug report - use Atom <id/> elements
I am directly subscribed to Sam Ruby's feed. I recently added Planet Intertwingly as well, which contains Sam's blog. Both feeds are served as application/atom+xml. In Google Reader, duplicate items show up (for Sam, Steve Loughran, and others that overlap from my other subscriptions). I don't want to remove individual subscriptions in case Sam removes them from Planet Intertwingly.
From my readings of the spec a while ago, that was an explicit rationale for having id's associated with each entry. As you would expect, the id's are the same. From Planet Intertwingly:
<entry>
<id>tag:intertwingly.net,2004:2619</id>
<link href="http://intertwingly.net/blog/2007/07/31/Agile-Financial-Publishing" rel="alternate" type="text/html"/>
<link href="http://intertwingly.net/blog/2619.atom" rel="replies" type="text/html"/>
<title>Agile Financial Publishing</title>
...
</entry>
From Intertwingly:
<entry>
<id>tag:intertwingly.net,2004:2619</id>
<link href="2007/07/31/Agile-Financial-Publishing"/>
<link rel="replies" href="2619.atom" thr:count="1" thr:updated="2007-07-31T12:15:28-04:00"/>
<title>Agile Financial Publishing</title>
...
</entry>
Those atom:id element IRIs appear to be the same to me...
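To make the request concrete, here is a rough Python 3 sketch of what de-duplicating on atom:id could look like when merging feeds. It says nothing about how Google Reader is actually built; the function name and the two cut-down feed documents are invented for the example.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def merge_entries(*feed_documents):
    """Return (id, title) pairs, keeping the first entry seen per atom:id."""
    seen = set()
    merged = []
    for document in feed_documents:
        root = ET.fromstring(document)
        for entry in root.iter(ATOM + "entry"):
            entry_id = entry.findtext(ATOM + "id", default="")
            # Entries without an id are skipped here; Atom requires one.
            if not entry_id or entry_id in seen:
                continue
            seen.add(entry_id)
            merged.append((entry_id, entry.findtext(ATOM + "title", default="")))
    return merged


# Cut-down stand-ins for the two feeds quoted above.
PLANET = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>tag:intertwingly.net,2004:2619</id>
    <title>Agile Financial Publishing</title>
  </entry>
</feed>"""

DIRECT = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>tag:intertwingly.net,2004:2619</id>
    <title>Agile Financial Publishing</title>
  </entry>
</feed>"""

print(merge_entries(PLANET, DIRECT))
```

The ids are compared as plain strings, so differing link elements or relative hrefs between the two copies make no difference to the match.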
Wednesday, July 25, 2007
Objective review of why Vista pisses me off (or Why isn't Vista more like Ubuntu?)
So I'm being completely upfront in the title as to where I am on this one, but I think that it's worth giving some airtime to a few of these. Readers might care to note that they should route around the "grumpy Yorkshireman that doesn't like change" for some of these. You never know, I might get some comments explaining why I'm a bozo when it comes to Windows.
- Avalon and the new Windows Aero UI.
- I gave it two weeks and then I'd had enough. Does it really make me more productive having all that shit, or is it just effects for the sake of it? I have the same opinion on Compiz - I haven't yet seen a compelling reason for it to exist, beyond being secretly sponsored by Nvidia / ATI to make people get shiny new graphics cards. Let the conspiracy theorists chew on that one. And despite paying Dell more for a fancy graphics card in this laptop, apparently it's not that good.
- Blue screen of Death
- No really, I had one in my first week, and I've had one since then. Lovely. Occasionally (once a quarter?) when running Ubuntu Dapper, I've had X lock up on me and be completely unresponsive, to the extent that I couldn't even switch to a virtual terminal or ssh into the box and do something to it. The first time this happened, I went climbing at lunchtime, came back and it was still borked, so just a hard reboot to fix that. A reboot once a quarter on a development machine doesn't strike me as too bad. Vista is managing once a week at the moment, and in neither case am I writing C or any system level code. It's all Java / Python / Ruby and that sort of level.
- Black screen of death
- That was a Vista new feature for me; I've never had that on previous versions of Windows. This is progress people!
- Still no decent shell
- I'm using Cygwin, but it doesn't seem to let me tail files and press Return to get some space in between lines. Minor, but annoying.
- Continual swapping
- Previously I was on a 2GB RAM Dell workstation, and now I have a 2GB RAM Dell laptop. The laptop is always swapping. What's changed? Well, I'm now on Vista rather than Ubuntu and I'm not running Oracle XE anymore, but otherwise the services running are much the same. IDEA / Eclipse, Tomcat, MySQL, Firefox, intermittent email client and a text editor. Don't know why Vista is always swapping (TaskMngr thinks it has 700MB free) but it's bloody annoying.
- Broken file permissions
- Doing a release today, the VPN crashed (don't know if that was Vista, BT, the Windows 2003 Server or something else). That appears to have left me with the following undeletable file.
So Vista has let me create a file that I don't have the rights to delete. That's smashing!
- Random security policies
- Or that's what I'm guessing is causing this anyway. If so, then a learning mode like AppArmor has would be nice. See
and
No, when I ask for a large amount of heap to be allocated for my Java process, I don't really know what I'm doing, so please stop me. Thank you Vista, you're my hero.
Update: LazyWeb to the rescue, at least about the disk thrashing issue.
Tuesday, July 24, 2007
The Hustler
There was a bag of sweets up on the kitchen top. Connor spotted these and was after them, but I told him "Not for breakfast". After I'd gone to work, he asked Al the same and got the same response. His reaction?
- Connor:
- OK Mummy, me just hold it, OK?
- Al:
- OK Connor.
(Al continues ironing. Some time passes)
- Connor: (Coming back into the kitchen)
- Look Mummy, me just hold it, OK?
- Al:
- Well done Connor.
(More time passes)
- Connor: (Comes back into the kitchen again)
- Look Mummy, me still hold it
- Al:
- Good boy, Connor!
After an hour, the ironing is done and Al's about to phone me to say how well-disciplined the boy's been, just holding the sweets. So she takes the ironing upstairs and comes back down to see Connor doing his Muttley laugh and just shy of shovelling all of the sweets into his mouth! Gamed by a 2 year old.
Friday, July 20, 2007
Good ETag support requires thinking about it up-front
I posted a comment on this but I thought it worthwhile going a little deeper.
Blogger doesn't support Trackback so I'll just post and link.
My point was not to argue about how little code is required to implement sending an ETag and checking an ETag based on the MD5 hash of your content (that's pretty much a library issue which should level out to be equal over time) but to go a little deeper into ETags.
I've been reading Sam Ruby long enough to have had the benefit of ETags drummed into me. The posts that Bill links to are focused on the network savings aspect of conditional GET. But you can also save server processing power, if you put a little more thought into your application model.
So we come back to the requirements for Java frameworks to support ETags such that it is possible to avoid doing the bulk of the server side processing. Caveat: this could well be premature optimization, and is merely me thinking out loud. Struts is the one I'm most familiar with and I think with the struts-chain RequestProcessor, this approach could be used, but anything that works as a chain would do for this (so pure Filters would also work).
public void doFilter(ServletRequest request, ServletResponse response,
        FilterChain filterChain) throws IOException, ServletException {
    HttpServletRequest httpRequest = (HttpServletRequest) request;
    // setStatus and addHeader live on HttpServletResponse, not ServletResponse.
    HttpServletResponse httpResponse = (HttpServletResponse) response;
    /*
     * Do something that works out what is required to render a response
     * for this request and generate an ETag based on that. So here we
     * have moved away from the approach of generating ETags from MD5
     * hashes of the response body.
     */
    ETag currentResourceETag = calculateETag(httpRequest);
    ETag incomingETag = extractETag(httpRequest);
    if (currentResourceETag.equals(incomingETag)) {
        httpResponse.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
    } else {
        // Add the header before running the chain; headers added after
        // the response has been committed are ignored.
        httpResponse.addHeader("ETag", currentResourceETag.stringValue());
        filterChain.doFilter(request, response);
    }
}
You would need to be able to obtain the items responsible for determining the ETag value reasonably early in the request processing, before any really expensive operations. Not sure what implications that has for the layers in your application, or if you were using strict MVC how disruptive / worthwhile it would be to try this approach...
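The same idea, stripped of the servlet machinery, can be sketched in a few lines of Python 3. Everything here is hypothetical: the version number stands in for whatever cheap-to-fetch value determines the response (a row version, a last-modified timestamp), and only a single If-None-Match value is handled, whereas the real header can carry a list or "*".

```python
import hashlib


def cheap_etag(resource_version, representation):
    """Derive an ETag from inputs that are cheap to obtain, not the body."""
    raw = f"{resource_version}:{representation}".encode("utf-8")
    return '"' + hashlib.md5(raw).hexdigest() + '"'


def handle_get(if_none_match, resource_version, representation, render):
    """Return (status, headers, body); render() runs only for stale clients."""
    etag = cheap_etag(resource_version, representation)
    if if_none_match == etag:
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, render()


render_calls = []


def render_page():
    # Stand-in for the expensive part of request processing.
    render_calls.append(1)
    return b"<html>expensive page</html>"


first = handle_get(None, 42, "text/html", render_page)
second = handle_get(first[1]["ETag"], 42, "text/html", render_page)
third = handle_get(first[1]["ETag"], 43, "text/html", render_page)
print(first[0], second[0], third[0], len(render_calls))
```

The second request never reaches render_page, which is the server-side saving being argued for; the third shows that bumping the version invalidates the client's copy.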
Wednesday, July 18, 2007
Jython - UnicodeData mirrored is complete!
from test_support import verify, verbose
import sha

encoding = 'utf-8'

def test_mirrored():
    h = sha.sha()
    for i in range(65536):
        c = unichr(i)
        h.update(str(unicodedata.mirrored(c)))
        print "%i : %i%c" % (i, unicodedata.mirrored(c), unichr(10)),
    # Value returned by Python 2.5, which uses Unicode 4.1
    #verify('91cd30c6c81911835dbcbed083f99fc9fc073e4a' == h.hexdigest(),
    #       h.hexdigest())
    # Value returned by current Jython implementation, which uses Unicode 5.0
    verify('595795a212ca0ac629d6b2dfb09c703a472adb03' == h.hexdigest(),
           h.hexdigest())

# Add next test!

if __name__ == '__main__':
    import unicodedata
    test_mirrored()
OK, it's only for the BMP, but it's a good start. Supporting supplementary characters (in Java terminology) or the other sixteen planes would need a more fundamental change to PyUnicode, methinks. Now I need to start adding the other unicodedata methods which should be fairly straightforward. Then I'll have a working implementation to post to the dev list. Maybe end of this month, unless Baby comes and I lose my late night hacking time?
Jython UnicodeData mirroring
for i in range(65536):
    c = unichr(i)
    print "%i : %i%c" % (i, unicodedata.mirrored(c), unichr(10)) ,
jabley@miq-jabley ~/work/eclipse/workspaces/personal/jython-trunk/jython
$ diff jython-mirrored.txt python-mirrored.txt
10177c10177
< 10176 : 0
---
> 10176 : 1
10180,10183c10180,10183
< 10179 : 0
< 10180 : 0
< 10181 : 0
< 10182 : 0
---
> 10179 : 1
> 10180 : 1
> 10181 : 1
> 10182 : 1
11779,11782c11779,11782
< 11778 : 0
< 11779 : 0
< 11780 : 0
< 11781 : 0
---
> 11778 : 1
> 11779 : 1
> 11780 : 1
> 11781 : 1
11786,11787c11786,11787
< 11785 : 0
< 11786 : 0
---
> 11785 : 1
> 11786 : 1
11789,11790c11789,11790
< 11788 : 0
< 11789 : 0
---
> 11788 : 1
> 11789 : 1
11805,11806c11805,11806
< 11804 : 0
< 11805 : 0
---
> 11804 : 1
> 11805 : 1
I was hoping to use java.lang.Character.isMirrored(char), but the above is the result of running my test under both jython and python and diffing the output. Looking in more detail, Java 1.4 supports UCD 3.2, then Java 5 and Java 6 both only have support for UCD 4.0.
jabley@miq-jabley ~/work/eclipse/workspaces/personal/jython-trunk/jython
$ python
Python 2.5.1 (r251:54863, May 18 2007, 16:56:43)
[GCC 3.4.4 (cygming special, gdc 0.12, using dmd 0.125)] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import unicodedata
>>> unicodedata.unidata_version
'4.1.0'
And I'm getting the sneaking feeling that I've done something like that before, but it's been so long since I did any development in this area that I've forgotten it!
Friday, July 13, 2007
The Graduate
Sunday, July 08, 2007
Our little philosopher
- Me
- Callum, we don't need to take your toys into Legoland since there's enough things in there to keep you busy. How does playing with your toys make you feel versus going on a ride?
- Callum
- Happy.
- Me
- So what's a ride like then...like playing with toys.
- Callum
- So Daddy, when you're on the ride, it's like the ride is playing with you. You're the toy!
- Me (slightly shocked)
- Yes, that's right Callum. Good analogy.
- Me
- So Callum, your homework is to write an essay on the inside of a brick.
- Callum
- Silly Daddy, that's just pretend!
He's sharp, that one.
Saturday, June 30, 2007
The Negotiator
A breakfast discussion with Connor:
- Me
- What would you like, Golden Nuggets?
- Connor
- Schweetie.
- Me
- No Connor, no sweetie for breakfast. Crispies?
- Connor
- OK Daddy.
- Connor
- Wee schweetie (holding his thumb and middle finger to indicate the size).
Linux goodness
Tuesday, June 26, 2007
Career Development
Steve Yegge's recent post about the second most important course in CS made me smile. I discovered Eric Raymond's How To Become A Hacker fairly early on in my career, and to a Mathematics graduate with a job doing Visual Basic 6.0, that was eye-opening stuff. But you always have to remember that all authors have an agenda (including this one, obviously - Ed) and it has been commented by people other than myself that esr's HOWTO could be alternatively titled 'How To Be Like Eric Raymond'. Well, so be it. It's always the journey that's the interesting part. So can we consider Steve's post to be an equivalent 'How To Be Like Steve Yegge'? Doesn't matter. Again, it's the journey that has value.
As an aside, I had an email from Amazon today.
Your order #xxx-xxxxxxx-xxxxxxx (received 27-March-2007)
-------------------------------------------------------------------------
Ordered Title Price Dispatched Subtotal
---------------------------------------------------------------------
Amazon.co.uk items (Sold by Amazon EU S.a.r.L.):
1 Compilers: Principles, Tec... £47.49 1 £47.49
Shipped via Home Delivery Network Limited (estimated arrival date:
28-June-2007).
Note the order date as well. Three months to get that book. Ouch! But at least it's given me time to read these two, and I can interpret Steve's post (and Joe's comment) as a good barometer of where I'm heading.
Thursday, June 14, 2007
Seasonal Suffering
I took the boys to the park yesterday and Callum was very chatty. He noticed that I was quite snottery* and talked about how I had B-fever, rather than hayfever. Clever!
* I must have been married to a Scot for too long to corrupt the English language so.
Jython Update - actually doing some work on UnicodeData
So I had put this on hold to finish reading Josh Bloch and Neal Gafter's Java Puzzlers. Great book; highlighted some new things for me, which is all I ask for in any book. Partly that is since I'm still not doing Java 5 apart from at home for various minor things, but it was good.
So now back to Jython, finally. Well, reasons for my procrastination first:
- New job - busy as hell.
- New job comes with a new laptop, which I was hoping would be a big step up from my nearly six year old self-built machine. The new laptop has reasonably impressive hardware specification, but it's running Vista. What an absolute pile of shit. I honestly don't know how developers are productive using that OS. I've given it nearly two months, just to be sure that it's not the fact that I've been off Windows for three years that is causing me all of the problems, but really. It's got to the point where I'm looking seriously at Xen on Ubuntu for the odd application that I do need to run Windows for. The other alternative would be to install XP, put up with the half life cost and initial downtime of getting the laptop set up for development all over again. I'll enumerate my grievances in a separate post. I don't think XP would get in my way as much as Vista (I did knock up the xmlunit XMLSchema validation patch on my wife's XP machine over Christmas and it wasn't that painful), but I'm not a fan of Windows after using GNU/Linux exclusively for three years.
So I've been doing a little work on UnicodeData again. Since I've not touched it for so long, I wanted to get some code up and running to start seeing how many tests were failing. Until Jython goes to Java 5 and above, I can't use java.lang.Character to do parts of it, or I could do a piecemeal approach of using java.lang.Character for the BMP, and then implement a new part for supplementary characters, or try to provide behaviour based on the running JVM. All a bit more work than I wanted to do, laziness and hubris being key. Instead, go for the brute force approach of the simplest thing that will possibly work. So I wrote a Python script (what else - it's a nice way of bootstrapping this problem) to parse the UnicodeData.txt file and generate some Java classes. The initial approach was to partition the UnicodeData.txt into a class for each plane in Unicode. Anyone that knows Unicode and the assigned codepoints will know that the BMP will take up most of this, but I was interested in getting something working, and then maybe refine it once the tests are passing. Well, my first cut was to have a simple interface:
interface UnicodePlane {

    /**
     * Return a UnicodeCodepoint for the specified codepoint.
     *
     * @param codepoint the Unicode codepoint
     *
     * @return a UnicodeCodepoint, or null if there is no match
     */
    UnicodeCodepoint getCodepoint(int codepoint);
}
I would have a class that implements this interface for each plane and a static initializer within each class that fills a Map of UnicodeCodepoint classes keyed by Integer codepoint.
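For what it's worth, the partitioning step in the generator can be sketched in a few lines of Python. This is only an illustration and not the actual script: the names plane_of and partition_by_plane are made up here, and the three records are a hand-picked sample in UnicodeData.txt format. The plane is just the codepoint shifted right by 16 bits.

```python
from collections import defaultdict

# Three sample records in UnicodeData.txt format, one from each of three planes.
SAMPLE_LINES = [
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "1D11E;MUSICAL SYMBOL G CLEF;So;0;L;;;;;N;;;;;",
    "E0001;LANGUAGE TAG;Cf;0;BN;;;;;N;;;;;",
]


def plane_of(codepoint):
    """Each plane holds 0x10000 codepoints, so the plane is the top bits."""
    return codepoint >> 16


def partition_by_plane(lines):
    """Group raw records by plane number (hypothetical helper)."""
    planes = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # The first field of every record is the codepoint in hex.
        codepoint = int(line.split(";", 1)[0], 16)
        planes[plane_of(codepoint)].append(line)
    return dict(planes)


if __name__ == "__main__":
    for plane, records in sorted(partition_by_plane(SAMPLE_LINES).items()):
        print(plane, len(records))
```

Each group would then be written out as one generated class, which is exactly where the BMP group gets too big for a single static initializer.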
Eclipse gives me this error:
The code for the static initializer is exceeding the 65535 bytes limit
Whereas ANT gave me this variation:
[javac] Compiling 2 source files to c:\Users\jabley\work\eclipse\workspaces\personal\jython-trunk\jython\build
[javac] c:\Users\jabley\work\eclipse\workspaces\personal\jython-trunk\jython\UnicodeData\generated-src\org\python\modules\unicodedata\UnicodeCharacterDataBasicMultilingualPlane.java:11: code too large
[javac] private static final Map CODEPOINTS = new HashMap();
[javac] ^
[javac] 1 error
BUILD FAILED
c:\Users\jabley\work\eclipse\workspaces\personal\jython-trunk\jython\build.xml:456: Compile failed; see the compiler error output for details.
So I need to think a bit harder about the data structures. Turning to Bentley's Programming Pearls, the sparseness of certain items stands out, like the mirrored property.
I'll have a think. At least with the Python script that I have to cut up the UnicodeData.txt file, it's very easy to add another list comprehension to it, to see how many items in the file exhibit a certain property. The other way I'm considering is to just generate a properties file and lazily populate a Map as required. That's probably what I'll try next, rather than thinking too hard about how to compress a 1038607 bytes data file into something more reasonable.
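A rough sketch of both ideas, under some assumptions: the four records are a tiny hand-picked sample rather than the real file, and read_records, count_mirrored and LazyTable are names invented for this illustration. The first part is the "another list comprehension" style of counting a sparse property; the second keeps the raw lines around and only splits a record when somebody asks for it.

```python
import io

# A tiny sample in UnicodeData.txt format (semicolon-separated fields).
SAMPLE = """\
0028;LEFT PARENTHESIS;Ps;0;ON;;;;;Y;OPENING PARENTHESIS;;;;
0029;RIGHT PARENTHESIS;Pe;0;ON;;;;;Y;CLOSING PARENTHESIS;;;;
0039;DIGIT NINE;Nd;0;EN;;9;9;9;N;;;;;
0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
"""

MIRRORED = 9  # index of the mirrored field within a record


def read_records(stream):
    """Split every non-blank line into its list of fields."""
    return [line.rstrip("\n").split(";") for line in stream if line.strip()]


def count_mirrored(records):
    """Count the records whose mirrored field is 'Y'."""
    return len([r for r in records if r[MIRRORED] == "Y"])


class LazyTable:
    """Hold raw lines keyed by codepoint; parse a record on first lookup."""

    def __init__(self, stream):
        self._raw = {}
        self._parsed = {}
        for line in stream:
            if line.strip():
                line = line.rstrip("\n")
                self._raw[int(line.split(";", 1)[0], 16)] = line

    def get(self, codepoint):
        """Return the field list for a codepoint, or None if unassigned."""
        if codepoint not in self._parsed:
            raw = self._raw.get(codepoint)
            self._parsed[codepoint] = None if raw is None else raw.split(";")
        return self._parsed[codepoint]


if __name__ == "__main__":
    print(count_mirrored(read_records(io.StringIO(SAMPLE))))
    print(LazyTable(io.StringIO(SAMPLE)).get(0x41))
```

The Java version would do the same thing with a properties file and a Map, so that nothing large has to live in a static initializer.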
Sleekit plumbing
Happy Birthday!
Wednesday, May 09, 2007
Connor PeePee
Connor used his potty today for the first time.
Wednesday, April 25, 2007
Cheeky Mouse
The boys had both been given a chocolate log and some sweets. Connor troughed his as normal and was watching CBeebies when Callum came to tell me about his day. We spotted Connor going for Callum's chocolate log in his bowl on the couch and told him to leave it. He grabbed it and did it in one with a chuckle!
Friday, April 20, 2007
Broken Pipes?
I have what I think is a reasonable use case for Pipes. This blog was started to talk about the kids, so that I didn't forget things about them as they were growing up. But then over time I've obviously started blogging about technical subjects as well, since that's very dear to me! (Blogging is evidently very egocentric at times!)
So my desire was to create a pipe that only contained posts about Jython-related topics, so that Jython developers wouldn't get all the noise about family. This would have seemed a fairly simple thing to do.
- Take an XML document.
- Apply an XPath filter to it (//item/category/term = 'jython').
Sadly, I haven't been able to get this to work, and posting to the Pipes support forum didn't elucidate any way of doing it. So I've fallen back to filtering based on the title of each post, which seems a bit crappy to me. Tags / Labels / Categories are meta-data about my post, and I should be able to use that meta-data. The category filter I tried can handle a post with only a single tag, but doesn't handle posts that have multiple categories. So that leaves me with the option to either always have certain text in the title of each post, or only tag posts with a single category, which kind of misses the point of categorising items. What am I missing?
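To be clear about the behaviour I'm after, here it is outside Pipes as a small Python sketch. The feed below is invented for the example (plain RSS-style category elements rather than whatever structure Pipes sees internally), and titles_with_category is a hypothetical helper. The point is that an item should match if any of its categories is the wanted one.

```python
import xml.etree.ElementTree as ET

# A made-up feed: one post has two categories, the others have one each.
FEED = """<rss><channel>
<item><title>Jython 101</title><category>jython</category><category>java</category></item>
<item><title>Cheeky Mouse</title><category>kids</category></item>
<item><title>WideFinder</title><category>java</category></item>
</channel></rss>"""


def titles_with_category(feed_xml, wanted):
    """Return titles of items where any category matches the wanted one."""
    root = ET.fromstring(feed_xml)
    return [
        item.findtext("title")
        for item in root.iter("item")
        if any(cat.text == wanted for cat in item.findall("category"))
    ]


if __name__ == "__main__":
    print(titles_with_category(FEED, "jython"))
```

A post tagged with both jython and java still comes through, which is exactly the case that falls over for me in Pipes.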
Jython progress report
Wednesday, March 14, 2007
The little literate one
Connor's so funny. He can't read, but he's seen that Callum has a book in his bed when he wakes up, so Connor wants to have the same. He's had Bob the Builder and Fireman Sam in with him the last few nights, and you can hear him blethering away to himself before he drops off.
Jython - contributing
Got the go-ahead from my employer late last week that it's OK for me to contribute to Jython. Cool, I can start properly doing the unicodedata implementation (and I don't have to ask Stefan to back out my xmlunit patch!). The only annoying thing is that I started doing the Josh Bloch / Neal Gafter Java Puzzlers book while I was waiting for the approval to come through, so I want to finish that rather than having too many things on the go at one time. So it'll probably be a week before I'm ramping up on unicodedata again.
Monday, February 26, 2007
Jython stalled
I haven't made as much progress with this as I would like, since I'm having some troubles. Not ones of a technical nature, but instead ones of a legal nature. My employment contract is apparently fairly standard and says that my employer owns all of my thoughts (even the ones when I'm writing this!). As what I thought was a courtesy, I asked my manager whether it was OK to contribute to open source projects, so that there would be no shady areas about who owned what code. Well, I'm still waiting for an answer and a bit of paper from my employer. So until I get that, I'm holding off writing any real code.
What I do have is a bit of sketching around the general area. First off, I wasn't sure about some aspects of the CPython implementation, so I asked. It got bounced by the python-dev moderator with the advice that I should post on the python-list. Which I did, but as I expected, it was more of a question for the python-dev list and the only (private) response that I've had is from Martin V. Loewis, who did the last change to that part of CPython. Maybe people are busy with PyCon. I was pointed at the C implementation, which doesn't generate that part from the UnicodeData.txt file, but instead is a horrible case statement, which looks bad. I think I need to raise a bug.
The other thing I've done with it is some Learning Tests about java.lang.Character, to see what it offers me. Obviously, this is attractive since it's a core library, is well tested, debugged and used by millions of people and all the other reasons Josh Bloch enumerates in Effective Java. It seems to have a few little idiosyncrasies, which I have captured in my tests. (Note to self: I haven't seen any JUnit (or TestNG - Cedric!) tests in the Jython source tree. Must ask the dev list about that.) Then maybe I should check whether JRuby needs this sort of thing, and make it re-usable, with a Jython wrapper for the API that it needs, etc.
Tuesday, February 13, 2007
Jython unicodedata - how complete is java.lang.Character anyway?
Aye, there's the rub! My initial reaction to java.lang.Character is the extensive use of char in the API. That obviously wouldn't cover all of Unicode. So as a best scenario, maybe the BMP plus a bit can be covered by Character, and then something extra would need to be implemented to support the rest.
A related issue is that all of the nice int overloaded versions of the methods are Java 5, and that's not what I'm targeting here. I'm hoping to get away with a target environment of Java 4, since haven't Sun end-of-lifed Java 3? Java 4 Character only has implemented Unicode 3.0 anyway, so there's a bit of a gap. Python 2.3 contains version 3.2.0 of UnicodeData. There's going to be a gap that I need to fill somewhere.
Thirdly, the API that Character offers seems to be rather different from what Python and I have interpreted from the Unicode specification.
>>> unicodedata.digit(u'\u2468')
9
but then in Java:
assertEquals(true, Character.isDigit('\u2468'));
fails. Closer inspection of the isDigit API documentation shows that this is in fact a test for is DECIMAL DIGIT, so it equates to unicodedata.decimal rather than unicodedata.digit. Hopefully, Character will allow me to get a fair way into implementing and making some of the tests run, before I have to start thinking too hard about creating lookup tables and bit-masking the 11 most significant bits.
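The distinction is easy to see from the Python side. A small sketch, assuming the reading of the isDigit documentation above is right: is_decimal_digit is a made-up helper that tests for general category Nd, which is my stand-in for what the Java method checks, and it is compared with what unicodedata reports for the same characters.

```python
import unicodedata


def is_decimal_digit(ch):
    """True when the general category is Nd (decimal digit number)."""
    return unicodedata.category(ch) == "Nd"


if __name__ == "__main__":
    for ch in ("9", "\u2468"):
        print(
            "U+%04X" % ord(ch),
            unicodedata.category(ch),
            is_decimal_digit(ch),
            unicodedata.decimal(ch, None),
            unicodedata.digit(ch, None),
        )
```

CIRCLED DIGIT NINE has a digit value but is not in the decimal digit category, so a category-based test rejects it even though unicodedata.digit is happy with it.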
Monday, February 12, 2007
Jython unicodedata initial overview
So it looks like the problem breaks down into creating a suitable data structure from the contents of the UnicodeData.txt file. CPython uses a python script (what else?) to create a C header file with the contents of various data structures. So all I need to do is probably one of the following:
- Use the same approach, generate a very similar structure and port the existing C code that accesses the data structures into Java. Not very appealing.
- Do something similar, and create a class for each code point. Probably not very good from a resource perspective (OK, potentially premature optimisation since I haven't measured it, but the UnicodeData.txt file is 817k, so that's some data structure). That could be useful from a LearningTest perspective though; e.g. can I use java.lang.Character, or do I need something else entirely.
- Use something in existing core Java libraries.
- Use a third-party library.
- Something else, that I haven't bothered to think about what it could be.
Just one problem. I don't fully understand what is required yet. Reading the UnicodeData commentary explains to me why the tests below are fine.
2468;CIRCLED DIGIT NINE;No;0;EN;<circle> 0039;;9;9;N;;;;;
(UnicodeData.txt entry for code-point 0x2468)
verify(unicodedata.decimal(u'\u2468',None) is None)
verify(unicodedata.digit(u'\u2468') == 9)
verify(unicodedata.numeric(u'\u2468') == 9.0)
and those tests pass (for CPython - I haven't implemented the Jython version yet!). From the file entry and commentary, that code-point appears to have no decimal digit value, a digit value of 9 and a numeric value of 9. The tests confirm that. I don't understand why these don't also pass.
325F;CIRCLED NUMBER THIRTY FIVE;No;0;ON;<circle> 0033 0035;;;35;N;;;;;
(UnicodeData.txt entry for code-point 0x325F)
verify(unicodedata.decimal(u'\u325F',None) is None)
verify(unicodedata.digit(u'\u325F', None) is None)
verify(unicodedata.numeric(u'\u325F') == 35.0)
The last one fails with:
Traceback (most recent call last):
File "
ValueError: not a numeric character
Evidently I need to delve deeper into the spec, or start asking more knowledgeable people some questions.
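To check my reading of the file format, here is a minimal sketch that pulls the three numeric fields straight out of a record. The function name numeric_fields is invented for this, and I'm assuming the decimal, digit and numeric values are the seventh, eighth and ninth semicolon-separated fields, which is how I read the commentary. It says nothing about why CPython raises; it only shows what the data file itself claims.

```python
from fractions import Fraction

# The two records discussed above, as they appear in UnicodeData.txt.
CIRCLED_NINE = "2468;CIRCLED DIGIT NINE;No;0;EN;<circle> 0039;;9;9;N;;;;;"
CIRCLED_35 = "325F;CIRCLED NUMBER THIRTY FIVE;No;0;ON;<circle> 0033 0035;;;35;N;;;;;"


def numeric_fields(record):
    """Return (decimal, digit, numeric) for a record; None for empty fields."""
    fields = record.split(";")
    decimal = int(fields[6]) if fields[6] else None
    digit = int(fields[7]) if fields[7] else None
    # The numeric field may be a fraction such as 1/2, so go via Fraction.
    numeric = float(Fraction(fields[8])) if fields[8] else None
    return decimal, digit, numeric


if __name__ == "__main__":
    print(numeric_fields(CIRCLED_NINE))
    print(numeric_fields(CIRCLED_35))
```

On that reading the file gives 0x325F a numeric value of 35 and nothing for decimal or digit, which is what my tests expected.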
Friday, February 09, 2007
Jython unicodedata proceedings
So a little background as to why I'm doing this. Well, I don't know Unicode as well as I'd like, and I know Python a lot better than I know Ruby, so no temptation to start hacking JRuby at this point (well, maybe just a little).
I've implemented the methods that were missing and now I'm getting failures in the test. For the first implementation, I grabbed the existing Python source. Shit, C programming rots your brain. I learned C at Uni and then again via K&R, but this is a little different. But it's enough to give me the method signatures for everything that I need to stub.
*sys-package-mgr*: processing modified jar, '/home/jabley/work/workspaces/main/jython/dist/jython.jar'
Testing Unicode Database...
Methods: 38ef24ef104d52e24f9b7c942676c6961f9233cc
Functions: 97f3b4a034c7d9a0d0c1f387e216d6b8bf309442
API:Traceback (innermost last):
File "dist/Lib/test/test_unicodedata.py", line 91, in ?
File "/home/jabley/work/workspaces/main/jython/dist/Lib/test/test_support.py", line 125, in verify
TestFailed: test failed
Bit tired tonight (Connor's been throwing up the last two days), so I won't dig much into this. It feels slightly weird to be running the tests as Python tests against a Java implementation, but that's what you get for implementing a library like this. No JUnit / TestNG in sight. I have a feeling that it's going to require a lot of reading, which is good in that I might learn something, but I also wanted to get back to Stefan with a DocBook example for an xmlunit proposal.
Thursday, February 08, 2007
Jython 101
Inspired by Joe Gregorio, I thought I'd take a gander at Jython.
The proposed JythonSprint seemed to suggest that unicodedata was required, so I thought I'd have a play and maybe even contribute something. We'll see.
So, grab the main trunk from subversion and off we go. TDD all the way, as much as possible. Are you sitting comfortably? Then I'll begin...
Step 1 - create an ant.properties file. This is in .cvsignore, so no worries about clashing with anyone else's settings. Straight off the developer guide.
build.compiler=modern
debug=on
optimize=off
Step 2 - build it using ANT.
Step 3 - make it accessible. I don't have jython on my machine already, so I take the dirty approach.
sudo vi /usr/bin/jython
#!/bin/sh
export JYTHON_HOME=/home/jabley/work/workspaces/main/jython
exec java -Dpython.home="${JYTHON_HOME}/dist/" -jar "${JYTHON_HOME}/dist/jython.jar" "$@"
Slight tweak from the development guide, and yes, I'm using Eclipse.
sudo chmod 755 /usr/bin/jython
Now see what it looks like:
jython
*sys-package-mgr*: processing new jar, '/home/jabley/work/workspaces/main/jython/dist/jython.jar'
*sys-package-mgr*: processing new jar, '/usr/lib/jvm/java-1.5.0-sun-1.5.0.06/jre/lib/rt.jar'
*sys-package-mgr*: processing new jar, '/usr/lib/jvm/java-1.5.0-sun-1.5.0.06/jre/lib/jsse.jar'
*sys-package-mgr*: processing new jar, '/usr/lib/jvm/java-1.5.0-sun-1.5.0.06/jre/lib/jce.jar'
*sys-package-mgr*: processing new jar, '/usr/lib/jvm/java-1.5.0-sun-1.5.0.06/jre/lib/charsets.jar'
*sys-package-mgr*: processing new jar, '/usr/lib/jvm/java-1.5.0-sun-1.5.0.06/jre/lib/ext/sunjce_provider.jar'
*sys-package-mgr*: processing new jar, '/usr/lib/jvm/java-1.5.0-sun-1.5.0.06/jre/lib/ext/sunpkcs11.jar'
*sys-package-mgr*: processing new jar, '/usr/lib/jvm/java-1.5.0-sun-1.5.0.06/jre/lib/ext/localedata.jar'
*sys-package-mgr*: processing new jar, '/usr/lib/jvm/java-1.5.0-sun-1.5.0.06/jre/lib/ext/dnsns.jar'
Jython 2.2b1 on java1.5.0_06 (JIT: null)
Type "copyright", "credits" or "license" for more information.
>>>
Cool.
Next run the tests.
jython dist/Lib/test/test_unicodedata.py
Testing Unicode Database...
Methods: 38ef24ef104d52e24f9b7c942676c6961f9233cc
Traceback (innermost last):
File "dist/Lib/test/test_unicodedata.py", line 84, in ?
ImportError: no module named unicodedata
That's expected. So I added a class org.python.modules.unicodedata and added an entry in org.python.modules.Setup to have a unicodedata module. I'll probably go back to the class name and maybe create its own package longer-term, but that's the simplest thing for now. Tests again.
jython dist/Lib/test/test_unicodedata.py
*sys-package-mgr*: processing modified jar, '/home/jabley/work/workspaces/main/jython/dist/jython.jar'
Testing Unicode Database...
Methods: 38ef24ef104d52e24f9b7c942676c6961f9233cc
Functions:Traceback (innermost last):
File "dist/Lib/test/test_unicodedata.py", line 86, in ?
File "dist/Lib/test/test_unicodedata.py", line 62, in test_unicodedata
AttributeError: class 'org.python.modules.unicodedata' has no attribute 'digit'
I seem to recall reading about some script that will generate stubs for these things - gexpose.py? I'll have a look, but otherwise it looks like my next step will be implementing stubs for the required methods and see where the tests fail next.
Saturday, February 03, 2007
Roaches in the sun
Mooching around under Inertial Reel discussing who had climbed it. Martin Dearden was just next to us, apparently enjoying being the subject of our historical wanderings when we tried to remember who it was that had taken eight years on it. Mind you, that's positively fast given that I first tried Stefan Grossman eleven years ago, before DB had done it. I came close six years ago, and haven't really been back. There's a long-term goal!
Did a bit, felt a bit fat and not moving well, I think I was a bit tired from introducing the weight belt back into the training programme. The Banks did well, getting Teck Crack Superdirect third go or thereabouts. I couldn't get my foot on the starting hold - bit of flexibility work required maybe!
Backed off Stretch and Mantel - I can't mantel and it was a bit high. Then puntered about on Calcutta Buttress before team effort on the world's hardest 6a+ slab. Andy came close; Adam Long complained about old boots not being up to it, Fiona crimped like a beast but couldn't quite get it.
Flashed Stretch Armstrong then it was time to go. I didn't really do much, but had a good day out with the guys. On the way home, Al rang and said it was OK for me to stay over, but I was already halfway home. Will get an overnight pass properly sorted out for the future.
Thursday, January 25, 2007
Tool User 2
The fittings are oriented at 90° more due to my ineptitude at figuring out how to install them without any instructions, rather than any conscious effort to prevent them spotting any usage patterns.
Sunday, January 21, 2007
Tool user
Thursday, January 11, 2007
Where's Baby?
Had the twelve week scan today - all looks fine at this stage. Exciting, and I can announce it to the world, both here and in publishing ages old drafts.
So here's a little ego experiment for me. I'm not sure how many of my mates read this blog - some have asked about it.
I'm also curious as to how many are au fait with Atom and RSS. Well OK, I don't expect them to be that au fait with RFC 4287, but they maybe have come across the general concept. Plus I'm tight (did I mention I'm from Yorkshire?) and want them all to ring me rather than the other way round!