Tuesday, December 29, 2009

Objective-C for Java Developers

I'm coming from an Eclipse on Ubuntu background, but this is equally applicable for IDEA on Windows. What are the equivalents for iPhone development?

Java | iPhone | Notes
JUnit (unit testing framework) | ? | It is possible to use TDD for Swing apps, although I've been predominantly a server-side guy with client stuff happening in the browser for quite a while now. Cucumber with iPhone looks worth exploring...
Hudson (continuous integration tool) | ? | On my first iPhone app, it rapidly became apparent how easy it was for people to do bad merges and delete classes from the Xcode project file / strings from the UTF-16 l10n Localizable.strings file. You can argue that people should take more care; yeah, that'll fix it. git bisect is great, but a tool that builds on each commit is better.

Saturday, December 26, 2009

ScaleCamp - Queue PUBSUB

For some reason I went to this thinking PubSubHubbub, but it was more a refresher about making an app asynchronous, why you'd want to do that and how implementation complexity goes up as you go after certain desirable behaviours.

ScaleCamp - How do you scale Activity Feeds?

Popular session this - standing room only, so no notes from me.

The short answer is Redis, courtesy of Simon Willison. Since the consensus was that Redis would do the trick, we then touched on Simon's other new favourite technology - node.js.

Digression: Alex from mediamolecule commented that 100MB of data in a key-data structure store should only require 100MB of memory to store in such a server app (plus a little extra for housekeeping, but it shouldn't be a 1:10 ratio or similar). I didn't take that as a direct criticism of Redis but more as a reminder about choosing good data structures and the importance of CompSci (says this mathematician). I mention that since it pricked me to investigate a suspected bad data structure in one of our apps. Coupled with the Eclipse Memory Analyser recommended by the Guardian guys, I found something that was using far too much of the heap on our Tomcat nodes, and had a change rolled out within 4 days of attending this conference. That reduced the memory footprint for that data structure from 250MB to 16MB. Ouch, shocker, but great to have found, prioritised and fixed.

Monday, December 07, 2009

ScaleCamp - Scaling Java and Oracle for the Guardian

Guardian.co.uk

Graham and various other people from the development and operations team pitching in.

3 years ago - published static files with apache SSI to fill-in gaps. Moved to a fully dynamic system. Now, they're somewhere in between.

Stack -
  • apache

  • resin

  • spring / hibernate / velocity

  • Oracle DB backend (not recommended!)


Measured the application - 1300 requests to DB just to render homepage.

Added ehcache to hibernate as 2nd level cache and added a warmup script before putting into load balancer

30m unique users per month
270m pages per month
250 requests/second at lunchtime
1500 requests/second peak.

GC tools



Google weakref cache (part of Google Collections)

Eclipse memory analyser - what's using all my memory?

Cacti for monitoring - DB usage was killing it.

8 app servers in each co-lo (London and Manchester).

400MB used by cache - churn meant it was pretty ineffective.

Tried or considered ehcache distribution and jboss cache distribution.

Rejected since cache eviction via replication would have thrashed it.

memcached



massive improvement in response times, but DB load still high.

went to caching every query for 5 minutes. DB load vanished and is flat even as more app servers come on-line.

servlet filter writing to memcached made it stink fast.

took a day's worth of logs and Hadoop to see how long the cache should be. 1 minute was the sweet spot.
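To make the "cache every query for N minutes" idea concrete, here's a minimal sketch of a time-to-live cache in Java. It's not what the Guardian run (they use memcached); it's an in-process stand-in, and the names TtlQueryCache and loader are mine. The clock is passed in so that expiry can be exercised without sleeping.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical helper: caches the result of each query for a fixed time-to-live. */
class TtlQueryCache<V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;

        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable, e.g. System::currentTimeMillis

    TtlQueryCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the cached result, or runs the loader and caches it if missing or expired. */
    V get(String query, Function<String, V> loader) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(query);
        if (entry == null || entry.expiresAtMillis <= now) {
            entry = new Entry<>(loader.apply(query), now + ttlMillis);
            entries.put(query, entry);
        }
        return entry.value;
    }
}
```

With a one-minute TTL, new TtlQueryCache<>(60_000L, System::currentTimeMillis) would hit the database at most about once a minute per distinct query on each node, however many requests arrive in between (concurrent misses for the same query can still each call the loader in this sketch).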

Emergency switch to serve a static copy of the site, minus personalization features.

Daemon or script scrapes the site; they can handle 700req/s/node when the site's operating in this mode.

new content published in this mode has a copy pressed so it can be served from disk - publish is slower than with the other system but updates still possible

Highly recommend that this sort of emergency degrade read-only mode should be built-in from the off - they've used this approach with the MPs Expenses apps built to crowd-source investigations.

ScaleCamp - Scaling Java with Shared Nothing

Thoughtworks guys again.

Basic Servlet overview - in-memory sessions don't scale, duh!

Preferred options - state is serialized and goes into cookies. Security, legal aspects? Pretty well common to most frameworks these days.
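On the security question, one common approach is to sign the cookie value so the server can detect tampering. Here's a rough sketch using HMAC-SHA256 from the JDK. The SignedCookie class and its method names are my own invention, not something shown in the session. Note that this only signs the state, it doesn't encrypt it, so the client can still read it.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper: session state carried in a cookie, protected by an HMAC signature. */
class SignedCookie {
    private final byte[] secret;

    SignedCookie(byte[] secret) {
        this.secret = secret.clone();
    }

    /** Produces "base64(state).base64(hmac)" suitable for use as a cookie value. */
    String encode(String state) {
        String payload = b64(state.getBytes(StandardCharsets.UTF_8));
        return payload + "." + b64(hmac(payload));
    }

    /** Returns the original state, or null if the value is malformed or the signature does not match. */
    String decode(String cookieValue) {
        int dot = cookieValue.indexOf('.');
        if (dot < 0) {
            return null;
        }
        String payload = cookieValue.substring(0, dot);
        try {
            byte[] given = Base64.getUrlDecoder().decode(cookieValue.substring(dot + 1));
            // Constant-time comparison, so timing does not leak how much of the signature matched.
            if (!MessageDigest.isEqual(hmac(payload), given)) {
                return null;
            }
            return new String(Base64.getUrlDecoder().decode(payload), StandardCharsets.UTF_8);
        } catch (IllegalArgumentException e) {
            return null; // not valid Base64
        }
    }

    private byte[] hmac(String payload) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            return mac.doFinal(payload.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    private static String b64(byte[] bytes) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }
}
```

Every app server needs the same secret, but nothing else has to be shared between them, which is the point of the shared-nothing approach.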

Page composition in the server with proxy server holding StringTemplate objects. Interesting idea - seen variants of this in other places. I'm curious as to whether doing this could mean having a poor man's macro system for Java, since XSLTs can be written to create XSLTs; maybe Velocity templates could similarly generate Velocity templates or StringTemplate -> StringTemplate?

Again, application developers need to have a good idea of caching directives for this to work. One objection I had with this approach is that you potentially increase your hardware requirement and can open the app up to liveness failures here. Request A comes in and is serviced by Thread 1. As part of that, it makes a request to the proxy server for a template. At the proxy server, a cache miss means that another request needs to be made to the app server. Then Request A is tying up 2 app server threads. What about applications which parallelize the requests? They might use more than 2 app server request-handling threads at a time, etc.
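The liveness worry can be shown in miniature. This is my own toy model, not anything presented in the session: a fixed-size thread pool stands in for the app server's request-handling threads, and the "outer" request needs an "inner" request served by the same pool before it can finish. The class and method names are made up for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Toy model of a request that calls back into the same bounded pool of app server threads. */
class NestedRequestDemo {
    /** Returns true if the inner request finished within the timeout, false if the outer one gave up. */
    static boolean outerRequestCompletes(int poolSize, long timeoutMillis) {
        ExecutorService appServerThreads = Executors.newFixedThreadPool(poolSize);
        try {
            Callable<Boolean> outerRequest = () -> {
                // The cache miss at the proxy turns into a second request to the app server.
                Callable<String> innerRequest = () -> "rendered fragment";
                Future<String> inner = appServerThreads.submit(innerRequest);
                try {
                    inner.get(timeoutMillis, TimeUnit.MILLISECONDS);
                    return true;
                } catch (TimeoutException e) {
                    return false; // no free thread was available to serve the inner request
                }
            };
            return appServerThreads.submit(outerRequest).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            appServerThreads.shutdownNow();
        }
    }
}
```

With a pool of one thread, the outer request holds the only thread while the inner one sits in the queue, so it times out. With two threads it completes. A real server has many more threads, but the same thing can happen under load once every thread is held by an outer request.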

Thoughtworks seem to do fun, interesting work.

ScaleCamp - LittleBigPlanet

Alex and James from mediamolecule.com. Fascinating perspective of embedded developers coming to server-programming and refusing to accept commonly held views on best practices for doing so. This was the surprise hit of the conference for me; I just elected to go since there wasn't anything else in that slot that I was really passionate about. I'd been talking to them both in the queue for tea earlier and made a poorly judged joke about Map-Reduce (we pretty much had this conversation). The session they ran was an awesome talk about scaling server-side within the games sector - Little Big Planet is theirs.

They've written their own C-based key-data structure store, a category in which we're spoilt for choice just now. Alex commented that he's looked at Redis and it has some nice stuff, but when they came to need it, there wasn't anything that met their needs, and experience with running the recommended Java stack had left them with the impression that they should stick to what they know. What they know is writing very tight code in constrained environments, so applying that mind-set to server-side development seemed to have yielded some very pleasing numbers. Other parts are in Ruby (presumably 1.9, since they're using Fibers?). I didn't get around to asking James how well that works or which implementation they're using. Very happy with that programming model though - James is or was a Java guy - funny how nice Ruby feels coming from there!

ScaleCamp - Varnish

Artur Bergman talking about Varnish; this followed on from the Squid talk and most of the same crowd hung around for this one.

Purges have gone from multicast to RabbitMQ (damn, can't remember if I got that right or what it means!)
2 8-core servers in London data centre 350MB/s with 5000 requests/sec. Intel X25 SSDs have changed a certain class of application. If disk is the new tape, then it's probably still acceptable to go to disk if you're running those babies. See also Last.fm's experience with them.

Attempt cache hits in all data centres (UK -> US) before going to the app. Much better performance.

CDN gets broken with query string parameters - common misconfiguration which can be defended against.

Varnish protects against thundering herd. Interesting - need to read more about that to better understand it.
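As I understand the general idea (this is not Varnish's code, just my own sketch of request coalescing), when lots of clients ask for the same missing item at once, only the first request goes to the backend and the rest wait for its answer. CoalescingFetcher and backendCallsFor are names I've made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Lets only one caller per key reach the backend; concurrent callers share its result. */
class CoalescingFetcher<V> {
    private final ConcurrentMap<String, CompletableFuture<V>> inFlight = new ConcurrentHashMap<>();
    private final Function<String, V> backend;

    CoalescingFetcher(Function<String, V> backend) {
        this.backend = backend;
    }

    V fetch(String key) {
        CompletableFuture<V> mine = new CompletableFuture<>();
        CompletableFuture<V> existing = inFlight.putIfAbsent(key, mine);
        if (existing != null) {
            return existing.join(); // someone else is already fetching this key
        }
        try {
            V value = backend.apply(key);
            mine.complete(value);
            return value;
        } catch (RuntimeException e) {
            mine.completeExceptionally(e);
            throw e;
        } finally {
            inFlight.remove(key, mine);
        }
    }

    /** Demo: how many backend calls do `clients` simultaneous requests for one URL cause? */
    static int backendCallsFor(int clients) {
        AtomicInteger backendCalls = new AtomicInteger();
        CountDownLatch release = new CountDownLatch(1);
        Function<String, String> slowBackend = key -> {
            backendCalls.incrementAndGet();
            try {
                release.await(); // stay "busy" until the demo lets us finish
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "page for " + key;
        };
        CoalescingFetcher<String> fetcher = new CoalescingFetcher<>(slowBackend);
        ExecutorService pool = Executors.newFixedThreadPool(clients);
        try {
            Callable<String> client = () -> fetcher.fetch("/home");
            List<Future<String>> results = new ArrayList<>();
            for (int i = 0; i < clients; i++) {
                results.add(pool.submit(client));
            }
            Thread.sleep(200); // crude: give every client time to arrive while the backend is busy
            release.countDown();
            for (Future<String> result : results) {
                result.get();
            }
            return backendCalls.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Without the inFlight map, each of those clients would hit the backend for the same page at the same moment, which is the herd.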

ScaleCamp - scaling with Squid

Summarising a recent Thoughtworks experience with this, from Chris Read and Sam Newman.

This was for a high volume retailer. A Proxy / caching solution was supposed to be provided and TW would do the app. At t-4, it turned out that TW would also have to provide the proxy / cache, so this was a hasty investigation into Squid.

1 hardware LB
2 Squid boxes - 8 core machine

1 carp process
lots of child processes

going to 16 app servers.

16,000 requests per second = 5% of traffic

TTL for items ranged from 5 minutes to 1 hour for stable furniture.

whole site does 250 million requests per day

peak 24,000 requests per second

importance of good HTTP Caching directives. Discussion of making the application (and by implication, the application developers) aware of considering ETag / Expires / Vary for all parts of the application, versus just adding it via apache ReWrite or similar. Most people in the room (including me, very strongly; RFC2616 is my favourite RFC) were in the former camp.
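To show what "the application is aware of its caching headers" might look like, here's a small framework-free sketch. The CachingHeaders class is hypothetical and deliberately simplified: it only handles a single exact ETag in If-None-Match, not the lists, wildcards or weak validators that the RFC allows.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helpers an application could use to emit its own caching headers. */
class CachingHeaders {
    /** ETag derived from the response body: a quoted hex SHA-256 digest. */
    static String etagFor(String body) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(body.getBytes(StandardCharsets.UTF_8));
            StringBuilder tag = new StringBuilder("\"");
            for (byte b : digest) {
                tag.append(String.format("%02x", b & 0xff));
            }
            return tag.append('"').toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }

    /** Cache-Control value for a response that any cache may keep for maxAgeSeconds. */
    static String cacheControl(int maxAgeSeconds) {
        return "public, max-age=" + maxAgeSeconds;
    }

    /** Status for a conditional GET: 304 if the client's If-None-Match equals the current ETag, else 200. */
    static int statusFor(String ifNoneMatch, String currentEtag) {
        return currentEtag.equals(ifNoneMatch) ? 304 : 200;
    }
}
```

For the 5 minute TTL mentioned above, cacheControl(300) gives "public, max-age=300". The point is that the code rendering the page is the code best placed to know how long that page stays valid.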

heap LFUDA was a good change to make.

Tried Varnish, but couldn't get good numbers out of it, in the timescales available. Artur opined that Varnish should provide better numbers than Squid; he's arguably conflicted, but seemed pretty convincing! Another factor in that was very likely that they were running RHEL old shit, and Varnish works best with a shiny new kernel (cite?)

Update: they also presented a version of this talk at DevOps 2010.

ScaleCamp - State of the Nation for Monitoring

Where are we now and what's broken with it?


People are almost getting to the point of needing more powerful machines to do the monitoring than the app servers! Maybe something's broken somewhere...


RRDTool - overall, the consensus seemed to be that this was a little dated.



  1. Does lots of writes due to the way it stores data

  2. throws away data by the way it aggregates - to see fine-grained data of last year's sales, you need to keep a backup of the files / graphs, rather than being able to query it.

  3. can't be cleansed of bad data, or it's a bitch of a job to do so.
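The second complaint is easier to see with numbers. RRDTool consolidates older samples into coarser archives using functions such as AVERAGE or MAX. The sketch below is my own simplified version of averaging consolidation, not RRDTool's actual code.

```java
/** Simplified illustration of averaging consolidation as done by round-robin archives. */
class Consolidation {
    /** Averages each run of `step` consecutive samples into one value (the last run may be shorter). */
    static double[] consolidate(double[] samples, int step) {
        int buckets = (samples.length + step - 1) / step;
        double[] out = new double[buckets];
        for (int i = 0; i < buckets; i++) {
            int from = i * step;
            int to = Math.min(from + step, samples.length);
            double sum = 0;
            for (int j = from; j < to; j++) {
                sum += samples[j];
            }
            out[i] = sum / (to - from);
        }
        return out;
    }
}
```

Consolidating the samples 10, 20, 30, 100, 0, 20 in steps of three gives 20 and 40. The spike of 100 and the dip to 0 are gone for good, which is why you can't ask for last year's fine-grained figures after the fact.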


Alternative options



  • hbase

  • reconnoiter

  • Tokyo Tyrant / Cabinet

  • timesplicedb looks to be an interesting attempt to provide a replacement. More language bindings needed, don't be shy!

A good start to the conference for me and it gave me a flavour of the depth and breadth of discussions available.

ScaleCamp UK 2009

On Friday I was fortunate enough to attend the inaugural ScaleCamp UK event, organised by Michael Brunton-Spall at the Guardian. This was a great conference. It was a BarCamp-style approach (not that I've been to BarCamp yet!) with the schedule evolving over conversations and planned on a board in the morning. Some of the sessions I took notes at; others were standing room only, so I'll try to remember what was talked about. Obviously, this is a personal perspective focused on my interests; others should be blogging about JavaScript and the like.

I met lots of very passionate, smart people doing cool stuff. That bodes well for the economy; if you want to do interesting work, then hooking up with any of the people that attended there wouldn't be a bad place to start.

Thursday, December 03, 2009

iMac - Guide for Linux Users

Got a MacBook Pro recently at work, for doing more iPhone stuff. I've long admired Macs as hardware, but haven't ever owned one due to an irrational distrust of Steve Jobs. Oh well, lots of friends recommend them and have told me it's the best computing experience going. I'm expecting a learning curve, but here goes:

  1. Go through basic setup for my user account. No friction so far apart from the keyboard. I know which keys to check when I'm installing a Linux, so I do the same. SHIFT+2 and SHIFT+' give me @". WTF? Need to remap certain stuff; that's not the British English layout I'm used to; none of my other computers are Macs and I have 10 years of muscle memory when it comes to typing. I'll come back to fixing that. First off, Apple | System Preferences | Keyboard | Modifier Keys and sort out Caps Lock and CTRL, for good emacs usage.

  2. Where's a bloody terminal? Spend 5 minutes learning the nuances of the trackpad (I'm used to a nipple) and then drag one out of Applications | Utilities onto the Dock.

  3. Good, ssh is available. Copy my SSH keys and config off the Dell laptop running Ubuntu 9.10. Test all ssh stuff and grin like a maniac.

  4. Right, best install any system updates before I start configuring the arse off it. For a Unix, Mac OS X seems to need a lot of restarts for simple stuff like iTunes updates.

  5. WTF!? Nothing like aptitude? That seems like a glaring omission. What are my options? Googling seems to point to Macports, Fink and Homebrew as the available options.

  6. IRC client - download Colloquy and start talking to real people about their experiences.

  7. After that very small skewed sample, decide to go with Macports for now with an intention to properly evaluate Homebrew Real Soon Now.

  8. Install git with git-svn support. $ sudo port install git-core +svn gitX

  9. Checkout my github stuff. Git is missing something - completion!

  10. $ sudo port install bash_completion
    $ curl -o git "http://repo.or.cz/w/git.git/blob_plain/HEAD:/contrib/completion/git-completion.bash"
    $ sudo mv git /opt/local/etc/bash_completion.d/

  11. Download the behemoth that is Xcode from the Apple Developer site and start checking out Objective-C



That'll do as a minimally usable system for now. Hardware-wise, it's a delight. Being able to watch all of InfoQ content without the teething problems that I always seem to have on Ubuntu is just a major relief - I've got a lot of stuff in delicious tagged from there that I've never managed to watch, so I can start getting through that backlog as well.

Wednesday, November 11, 2009

Maturity

Got home from Font around midnight on Saturday night (Halloween) to see a single carved pumpkin at home with a light inside. In the morning it transpired that someone that evening had taken Connor's pumpkin and smashed it just around the corner. I was taken with his reaction - very laissez-faire and musing on how he had the enjoyment from making the pumpkin. He was just a bit disappointed that I didn't get to see it. He's almost human sometimes!

Font 2009 summary

This was a bonus trip. I had a bad back for a month beforehand thanks to a Connor Kung-Fu Panda drop on me when stretching after a run. So no climbing and whacked out on painkillers - not ideal before a bouldering trip.

Traditional Friday night / Saturday morning drive, ferry and drive, only to find wet Font greeting us.

Sunday

Apremont

Monday

Cuisinière

Tuesday

Rocher Canon - running

Wednesday

Hautes Plaines / Isatis

Thursday

Cuvier - rest day

Friday

95.2

Saturday

Gorge aux Chats



Joined the inveterate tickers club - that and I can't really remember what I've done in Font apart from Carnage, l'Abattoir and a few others.

Great trip; too hot for getting on some stuff, but rather that than rain all week. Lucky to get stuff done too, with my lack of preparation. Not done 7b in the forest for a few years!

Google Gears for Firefox 3.5 on Ubuntu Karmic 9.10

Built my own - seems to work fine so far.


jabley@miq-jabley:~/work/gears-read-only$ svn info
Path: .
URL: http://gears.googlecode.com/svn/trunk
Repository Root: http://gears.googlecode.com/svn
Repository UUID: fe895e04-df30-0410-9975-d76d301b4276
Revision: 3410
Node Kind: directory
Schedule: normal
Last Changed Author: gears.daemon
Last Changed Rev: 3410
Last Changed Date: 2009-11-10 01:49:08 +0000 (Tue, 10 Nov 2009)


Made some changes:



make mode=OPT

and then install the resulting xpi.

Friday, October 23, 2009

Monday, October 05, 2009

Setting up Git mirrors of SVN

At work, I've been using git-svn for quite a while. I like the workflow options and the better merging capabilities. I'm the only one using git where I work, but I love the workflow that it gives me and going back to SVN is a no-no. As a casualty of the recent laptop hard drive failure, all of my git repositories, checked out from our main SVN server, had gone. Each had previously been created by doing a

git svn clone -s svn://svn.example.com/module

That took ages (3 days for all of the stuff I need to work on) and was quite slow when doing commits. My backups are mildly corrupt too, so I've started over, and set it up properly this time. Thanks to the guide here.

On the server, I created a directory to hold the git mirrors, and a text file containing the SVN modules that I wanted initially. Then a simple bash script to loop through the file and create a mirror of each SVN module:

for f in `cat svn-modules.txt` ; do svn2git.sh $f ; done



Then just make the repositories available:

git-daemon --export-all --base-path=/opt/git --verbose

and create a cron job to refresh the git mirrors periodically.

git --bare svn fetch --all

There are other ways, but that's the quick-n-dirty approach. Then a similar script on the client, which used the same list of modules that I wanted to check out.



Benefits of this approach:
  • Much faster to set up - it took just over a couple of hours this time.

  • Available to other people to try out - not just me.

  • Provides a migration path off SVN (my long-term aim, muhahaha)

Monday, September 21, 2009

Separation of concerns

The shower control described here resonated with me. Having a separate control for temperature and volume appeals to me as a software developer. But most sinks don't work this way. Instead, you have a tap for hot water and one for cold. I've seen a sink which did have controls as per the shower. I liked it; most other people complained and eventually it was replaced with a more typical Western arrangement. I wonder why the idea doesn't transport well from shower to sink, or if there are places where that is the norm?

Wednesday, September 16, 2009

JRuby on Rails Tomcat logging

We have a mixed dev team - Windows and Linux currently, although I'm considering a Mac.

In your environment:

os = java.lang.System.get_property 'os.name'

config.logger = Logger.new('/var/log/my-company/my-app/rails.log', 5, 104857) if os.downcase =~ /linux/

Maven Woes 2

More time spent fighting with maven. Projects which used hibernate started failing yesterday. We use XDoclet to generate the hibernate mapping files.

Unable to find resource 'xdoclet-plugins:xdoclet-plugin-qtags:jar:1.0.4-SNAPSHOT' in repository
central (http://repo1.maven.org/maven2)
[INFO] ------------------------------------------------------------------------
[ERROR] BUILD ERROR
[INFO] ------------------------------------------------------------------------
[INFO] Failed to resolve artifact.

Missing:
----------
1) xdoclet-plugins:xdoclet-plugin-qtags:jar:1.0.4-SNAPSHOT
Try downloading the file manually from the project website.

Then, install it using the command:
mvn install:install-file -DgroupId=xdoclet-plugins -DartifactId=xdoclet-plugin-qtags -Dversion=1.0.4-SNAPSHOT -Dpackaging=jar -Dfile=/path/to/file

Alternatively, if you host your own repository you can deploy the file there:
mvn deploy:deploy-file -DgroupId=xdoclet-plugins -DartifactId=xdoclet-plugin-qtags -Dversion=1.0.4-SNAPSHOT -Dpackaging=jar -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]

For whatever reason, maven wouldn't use the SNAPSHOT version that I had locally. I couldn't find a 1.0.4 released version on t'Internet, so I just deployed the snapshot that I had in my local repository as the 1.0.4 released version. Naughty me. Had to do the same for the xdoclet-plugin-hibernate plugin as well.

UPDATE: And today it's working with the SNAPSHOT versions again. Still just doing mvn clean install, on Ubuntu, Windows Vista and within Hudson on Red Hat. Silently breaks, and then fixes itself. WTF is that all about?

Maven Woes 1

My mental model of Maven is that there is a small kernel and lots of plugins which provide functionality. On a daily basis, maven will try to update plugins that it uses; e.g. for dependency resolution. You can configure maven to not upgrade certain core plugins, but people don't tend to do this. Perhaps they should...

At the beginning of August this year, I started getting this:

Unable to find resource 'bouncycastle:bctsp-jdk14:jar:138' in repository
central (http://repo1.maven.org/maven2)
[INFO] ------------------------------------------------------------------------
[ERROR] BUILD ERROR
[INFO] ------------------------------------------------------------------------
[INFO] Failed to resolve artifact.

Missing:
----------
1) bouncycastle:bctsp-jdk14:jar:138

Try downloading the file manually from the project website.

Then, install it using the command:
mvn install:install-file -DgroupId=bouncycastle -DartifactId=bctsp-jdk14 -Dversion=138 -Dpackaging=jar -Dfile=/path/to/file

Alternatively, if you host your own repository you can deploy the file there:

mvn deploy:deploy-file -DgroupId=bouncycastle -DartifactId=bctsp-jdk14 -Dversion=138 -Dpackaging=jar -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]

This broke not only all of our trunk builds, but also previously released items. Bouncy Castle jars presumably aren't in the main repositories due to crypto-export issues for some countries. We don't need to ship them. The cause of the problem? We had a dependency on jasperreports. This has an open-ended dependency:

<dependency>
<groupId>com.lowagie</groupId>
<artifactId>itext</artifactId>
<version>[1.02b,)</version>
<scope>compile</scope>
</dependency>

Version 2.1.7 of com.lowagie itext (released a couple of months ago) introduced a dependency on bouncycastle. Before that time, maven had been resolving the com.lowagie itext version to use version 1.3.1. Presumably a plugin was updated to fix a known bug in open-ended dependencies like the one in jasperreports and it exposed us to this problem. We were OK until the bug was fixed! Our current solution is to explicitly define the com.lowagie itext version as 2.1.5, which doesn't have the bouncycastle dependency. The closer dependency wins over the transitive dependency, yada...
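To spell out why an open-ended range is risky, here's a toy model of the resolution in Java. It is emphatically not Maven's real algorithm: it assumes purely numeric dotted versions, so I've used 1.0 as a stand-in for the real lower bound of 1.02b, and OpenEndedRange is a made-up name. The rule it models is that a range with no upper bound picks the highest available version that satisfies it.

```java
import java.util.List;

/** Toy model of resolving an open-ended version range such as "[1.0,)". */
class OpenEndedRange {
    /** Compares dotted numeric versions such as "2.1.7" segment by segment. */
    static int compareVersions(String a, String b) {
        String[] as = a.split("\\.");
        String[] bs = b.split("\\.");
        for (int i = 0; i < Math.max(as.length, bs.length); i++) {
            int x = i < as.length ? Integer.parseInt(as[i]) : 0;
            int y = i < bs.length ? Integer.parseInt(bs[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    /** Highest available version that is >= lowerBound, or null if none qualifies. */
    static String resolve(String lowerBound, List<String> available) {
        String best = null;
        for (String version : available) {
            boolean inRange = compareVersions(version, lowerBound) >= 0;
            if (inRange && (best == null || compareVersions(version, best) > 0)) {
                best = version;
            }
        }
        return best;
    }
}
```

With 1.3.1 and 2.1.5 available, the range resolves to 2.1.5. The day 2.1.7 is published, the same unchanged build resolves to 2.1.7 and inherits whatever new dependencies it brings, which is how a release made months ago can stop building.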

Monday, September 07, 2009

Technology predictions revisited

An email that I sent internally about some of our RSS and Atom feed processing, on 2008-03-04:

"The push aspect is a little annoying. We can do polling quite well now, but push via SFTP is messy. One day people will wake up and do this over Jabber, using something based on Atom, but until then, I guess we need to have a scheduled task that polls the SFTP directory and copies content about. Plus ça change, plus c’est la même chose..."

That would be PubSubHubbub then.

Thursday, July 02, 2009

Ubuntu Firefox 3.5 update

I've switched to 3.5 as my main Firefox (using Google Chrome / Chromium Web Browser as well). Just needed to update the symlink in /usr/bin/firefox so that applications using the system default browser were opening the right version of Firefox.

$ sudo rm /usr/bin/firefox
$ sudo ln -s /usr/bin/firefox-3.5 /usr/bin/firefox

Wednesday, May 27, 2009

Scripting JMX via JRuby

I'm gradually growing some operations scripts which use JMX to alter application behaviour according to system load / required maintenance, etc. JRuby is a nice easy way to create these scripts.


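require 'java'

# Placeholders: point these at the JVM whose JMX agent you want to talk to.
host = 'some-ip'
port = 'port-number'
serviceUrl = javax.management.remote.JMXServiceURL.new("service:jmx:rmi:///jndi/rmi://#{host}:#{port}/jmxrmi")
connector = javax.management.remote.JMXConnectorFactory.connect(serviceUrl)
begin
  remote = connector.getMBeanServerConnection()
  # Proxy for the remote JVM's RuntimeMXBean; other platform MXBeans work the same way.
  remoteRuntime = java.lang.management.ManagementFactory.newPlatformMXBeanProxy(remote,
    java.lang.management.ManagementFactory::RUNTIME_MXBEAN_NAME,
    java.lang.management.RuntimeMXBean.java_class)
  p remoteRuntime.getName()
ensure
  # Close the connection even if one of the calls above raises.
  connector.close()
end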

Sunday, April 19, 2009

Seriously - another standard plugin for viewing video on the web?


I was trying to view some content via Gilad Bracha but it wasn't easy enough to do and I can't be bothered with it.

FAIL!

Monday, April 06, 2009

Indoctrination

So it begins. Took the boys climbing today, of sorts. I thought Callum was ready for it; Connor less so, but organisational issues being how they are, I had the big two while Al was off with the little one.

The venue was Brant Fell. I'd gone there the previous day and done pretty much everything apart from the traverse and a fingery eliminate at about Font 7b (similar to Perfect Day direct at Gardoms, but smaller holds). Failed with a bad split tip, same as the last time I tried Perfect Day! I think that's related to Callum's steroid cream thinning my finger-tip skin. Callum was pretty happy, romped up 3 short things. Connor, not so happy. He tied on, but then didn't like it so bailed. He then tried to solo stuff instead. That boy!

Monday, March 30, 2009

Java doesNotUnderstand-like behaviour in Eclipse

Kent Beck tweeted this recently. I've been doing this for years, but I guess it's not as widely used as I assumed.

Window | Preferences | Java | Code Style | Code Templates | Code | Method Body


// ${todo} Auto-generated method stub
throw new UnsupportedOperationException("Not implemented");


Then add a breakpoint to your Debugger Breakpoint view:

New Java Exception for UnsupportedOperationException that hasn't been caught.

IDEA supports something similar; I had it set up there as well, but I've not used IDEA for a couple of years.

I tend not to use debuggers; I prefer tests, but sometimes a debugger's the thing.

Sunday, February 01, 2009

Java REPL

Obviously, most good dynamic languages for the JVM have this, but still, it's sweet.


$ jirb
irb(main):001:0> require 'lib/org.restlet.jar'
=> true
irb(main):002:0> http = Java::OrgRestletData::Protocol::HTTP
=> #<Java::OrgRestletData::Protocol:0x59cbda @java_object=#>
irb(main):003:0> client = Java::OrgRestlet::Client.new http
=> #<Java::OrgRestlet::Client:0x800aa1 @java_object=#>
irb(main):004:0> r = client.get 'http://www.apache.org/'
01-Feb-2009 22:54:08 org.restlet.engine.http.StreamClientHelper start
INFO: Starting the HTTP client
=> #<Java::OrgRestletData::Response:0x12a416a @java_object=#>

Monday, January 26, 2009

Body Swerve

Cameron making us laugh again. I picked him up from nursery the other day and got home. Al was in the kitchen with open arms looking for a hug. He started off towards her, then dropped his shoulder, sold a beautiful dummy and went straight for the drawer with the pans to get some out and start banging! Not bad considering he's only been walking properly for less than a month.

JAXP pipelines using SAX

XProc looks handy, but is not in a usable state yet. So I had to roll my own pipeline as a one-off recently, and struggled more than I expected. The requirement was fairly simple:

  1. Ingest some HTML and convert to well-formed XML for processing;

  2. filter that XML to remove unwanted content;

  3. convert the XML to a different format.


Step 2 was a new bit - I already had well-tested code for the other two parts. So I wanted to re-use that as much as possible. JAXP pipelines using SAX looked to be (and are!) very nice for this, but examples seemed a bit thin on the ground. I've put a version of it here, in the hope that others may find it useful.

/*
 * Parser is org.ccil.cowan.tagsoup.Parser; OutputFormat and XMLSerializer come from
 * org.apache.xml.serialize (Xerces). StringUtils.getBytes and getResourceAsStream are
 * our own helpers.
 */

/* Create an InputSource for the pipeline input document. */
InputSource in = new InputSource(new ByteArrayInputStream(StringUtils.getBytes(text, "utf-8")));

/* Step 1. TagSoup parsing to get well-formed XML */
XMLReader reader = new Parser();

try {
    SAXTransformerFactory stf = (SAXTransformerFactory) SAXTransformerFactory.newInstance();

    /* The serializer at the end of the pipeline writes the result into this buffer. */
    StringWriter sb = new StringWriter();

    OutputFormat outputFormat = new OutputFormat();
    outputFormat.setOmitXMLDeclaration(true);

    XMLSerializer serializer = new org.apache.xml.serialize.XMLSerializer(outputFormat);
    serializer.setOutputCharStream(sb);

    /* Step 2. Remove unwanted markup from the well-formed XML. */
    InputStream stripContent = getResourceAsStream("strip-content.xslt");
    XMLFilter removeUnwanted = stf.newXMLFilter(new StreamSource(stripContent));

    /* Step 3. Convert to preferred markup format. */
    InputStream xsltResourceInputStream = getResourceAsStream("xhtml2dial.xslt");
    XMLFilter xhtml2dial = stf.newXMLFilter(new StreamSource(xsltResourceInputStream));

    /* Chain the stages: reader -> removeUnwanted -> xhtml2dial -> serializer. */
    removeUnwanted.setParent(reader);
    xhtml2dial.setParent(removeUnwanted);
    xhtml2dial.setContentHandler(serializer.asContentHandler());

    /* Start the parse from the last filter in the chain; each filter asks its parent for events. */
    xhtml2dial.parse(in);

    return sb.toString();
} catch (TransformerException | IOException | SAXException e) {
    throw new ConversionException(e.getMessage(), e);
}