James' Pad

VelocityConf EU 2014 – Day 1

2014-11-21T16:23:00.001+00:00

TL;DR: Velocity is a great conference for web and operations people, and why didn't you go already?

Is TLS Fast Yet?

This was a talk by Ilya Grigorik full of practical, actionable things that you can do to serve your site over TLS, and make it fast.

My notes on Ilya's talk.

Monitoring: The Math Behind Bad Behavior

Theo Schlossnagle gave an excellent talk (which didn't involve much maths) about the problems that Circonus see when handling massive amounts of data and reliably detecting anomalies. I found it quite hard to take notes on this one, and it wasn't as practical in my context as the first talk, but it was still really interesting.

My notes on Theo's talk.

Design Reviews for Operations

Mandi Walls of Chef showed us how operations should be involved early on. She did a great job of emphasising the importance of having the right people having the right conversations at the right time.

I felt a little over-qualified for this talk, given that I've worked with Gareth Rushgrove for most of the last 3 years, and helped write some of the user stories for operations that GDS published on GOV.UK. Not everyone has had that privilege though!

My notes on Mandi's talk.

What Ops Can Learn From Design

Rob Treat of Omniti brought together The Design of Everyday Things and The Art of UNIX Programming to show how designing with empathy to create intuitive interfaces can be easy to overlook, but can have a massive impact on people using your stuff.

My notes on Rob's talk.

Statistical Learning-based Automatic Anomaly Detection @Twitter

Anomaly Detection seemed to be quite popular this year (see Theo's talk and Baron's proposed talk). Here, Arun Kejariwal talked about the state of the art, how it didn't quite fit Twitter's usage, and what they did about it. The tools and code should be open-sourced in a few weeks, so people can apply them to their own problems.

My notes on Arun's talk.

Your Place or Mine: A Discussion of Where to Host Your Site

This was an emergency panel convened since the originally planned speaker had something come up. Nice end to the day, talking about cloud and similar issues. Michael did a nice job of not answering someone who seemed to be either aggrieved or trolling quite hard. He's a proper civil servant.

Things GDS doesn't tell you

2014-04-13T12:26:00.000+01:00

Working at the Government Digital Service (GDS) changes a person. It’s not a thing we highlight; just acknowledge internally in furtive conversations. This post should expose a few truths about that.

Pedantry

Words are really important. A common effect of working at GDS is that large parts of the internet become unusable for you, since the writing is so poor. Caring about serial commas is the norm.

New ideas

Content design. User research. These are all things that were new to me, and it turns out they have a massive impact in creating award-winning web sites. Another portion of the internet becomes blacklisted since it fails to meet your minimum standards for user experience.

Intolerance

Working with amazing people every day has a horrible effect on an individual. Working with less talented people becomes very unattractive. This is a deliberate retention strategy, and seems to work very well.

Elitism

Presenting well is a learned skill. Once you’ve learned it from one of the best there is, you start to notice things. Bad things. Powerpoint things. You cannot unsee these things.

If these side-effects repulse you, make sure you don’t apply to work here.

How to check the encryption used in a zip file

2012-09-27T09:27:00.002+01:00

Sadly, I've not found a nice CLI way of doing this, but I recently had to validate that a 3rd party was transferring files to us in an approved way (AES 256-bit) and this is what I did:

Open the zip file in emacs.
Use fundamental-mode to stop showing a listing of the zip contents. (M-x fundamental-mode)
Use hexl-mode to get a binary view of the file. (M-x hexl-mode)
Search for the string "0199 0700" to find the AES Extra header field. (C-s 0199 0700)
Check that the 2 bytes after the 0700 and the 2-byte vendor version (e.g. 0200) are 4145 (the characters AE), followed by 01, 02 or 03 representing the AES encryption strength. In our case, we wanted 03, or AES-256.
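
The same check can probably be scripted. This is a minimal Python sketch, assuming the extra field layout described in the steps above (header ID 0x9901, a 2-byte vendor version, the vendor ID AE, one strength byte, then the real compression method), and assuming that Python's zipfile module will list the entries of an encrypted archive without needing the password. The function names are my own and don't come from any existing tool.

```python
import struct
import zipfile

AES_EXTRA_ID = 0x9901  # shows up as "0199" in the hex view (little-endian)
STRENGTHS = {1: "AES-128", 2: "AES-192", 3: "AES-256"}


def aes_strength(extra):
    """Return the AES strength named in a zip extra field, or None if absent."""
    pos = 0
    # The extra field is a sequence of (id, size, data) records.
    while pos + 4 <= len(extra):
        header_id, size = struct.unpack_from("<HH", extra, pos)
        data = extra[pos + 4:pos + 4 + size]
        if header_id == AES_EXTRA_ID and len(data) >= 7:
            # vendor version, vendor ID, strength, actual compression method
            _version, vendor, strength, _method = struct.unpack("<H2sBH", data[:7])
            if vendor == b"AE":
                return STRENGTHS.get(strength)
        pos += 4 + size
    return None


def report(path):
    """Map each member of the zip at `path` to its AES strength (or None)."""
    with zipfile.ZipFile(path) as zf:
        return {info.filename: aes_strength(info.extra) for info in zf.infolist()}
```

Calling report() on a received archive (the path is whatever file you were sent) gives one entry per member, so a None or anything other than AES-256 stands out straight away.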

How to convert an Oracle .dmp into a more portable format

2012-09-13T13:21:00.001+01:00

One of the things that I've come across has been legacy applications which use Oracle; where we don't have access to the database, but do get provided with Oracle .dmp files. These aren't very helpful when all you have (or are used to!) is MySQL and PostgreSQL.
One approach that I've had success with is to download a VirtualBox image of Oracle, and then play with the data in there. I chose the Database App Development VM since I wasn't sure what parts of Oracle I'd need, having strenuously, and pretty successfully, avoided Oracle for most of my time in the industry.
I then imported this into VirtualBox (running on OSX Lion) and configured networking:

One adapter running NAT so that I can browse the internet in the guest OS. This let me download the .dmp file to the guest OS filesystem.
Set up Port Forwarding for the NAT interface so that I can ssh to port 2022 on the host which will go to port 22 on the guest, thus allowing ssh access.
Optionally, I set up the VM to have another adapter (Host-only) so that I can set up NFS shares to mount part of the host filesystem under the guest.

Next, I needed to get the .dmp data onto the guest OS (and later get the transformed data off the guest). ssh-copy-id helps here: it puts an SSH public key into authorized_keys for the oracle user on the guest, so files can then be copied over SSH without a password prompt. You can also get data into and out of the guest using python -m SimpleHTTPServer (python3 -m http.server on Python 3), run in the appropriate directory, which let me browse the host or guest filesystem as needed. Running ifconfig on the host or guest tells you which IP address to use.
Now, I needed to create a tablespace and user to allow me to import the data. I advise doing this since (for me at least!) importing the data is an iterative process, and a separate tablespace (with separate data files) avoids bloating the system tablespace and means that disk space can be reclaimed. That's pretty much the only Oracle knowledge I have!
Before you create the tablespace, check the size of your .dmp against the available space on the filesystem. I had a 1.4GB .dmp which didn't fit into the space left on the filesystem, and I burnt a bit of time deciphering Oracle error messages for the failed import before I worked out the filesystem wasn't big enough. In this case, I created a symlink in $ORACLE_HOME/dbs/ which pointed to a large enough partition and set the owner / permissions as required. Creating the tablespace was just a case of running:

$ sqlplus / as sysdba
...
SQL> CREATE BIGFILE TABLESPACE mytablespace DATAFILE 'mytablespace/f1.dat' SIZE 20M AUTOEXTEND ON;

Tablespace created.

SQL> CREATE USER myuser IDENTIFIED BY password DEFAULT TABLESPACE mytablespace;

User created.

SQL> GRANT CREATE SESSION,CREATE SYNONYM,CONNECT,RESOURCE,CREATE VIEW,IMP_FULL_DATABASE to myuser;

Grant succeeded.

SQL> exit

We should now be in a position to try to import the data.

$ time imp myuser/password file=path/to/data.dmp full=yes

If this fails because the data was exported as a different user from the one you created, stop the import and clear out the user and tablespace.

$ sqlplus / as sysdba

SQL> DROP USER myuser CASCADE;

User dropped.

SQL> DROP TABLESPACE mytablespace INCLUDING CONTENTS AND DATAFILES;

Then re-create the tablespace and the new user and try the import again.
Once the import has succeeded, you want to get the data out of the database into a less proprietary format. One way is to use SQL Developer (a GUI tool included in the VM image).
Open SQL Developer and define a new database connection:

Connection name: myuser
User name: myuser
Password: password
Save Password?: Checked

SID is orcl rather than xe, in the Developer Days VM that I used.
Test the connection. It should work. Then open the connection and examine the tables.

In the menu, click Tools | Database Export
Choose to export the data only, as CSV.
Choose the connection, choose the tables, choose the destination file.

For large databases, this can take a while to process (2.5 hours in my case). It may be faster to write your own export routine using Perl, PL/SQL or similar. In the end, that's what I did, so that I could script the entire process.
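
The export script itself isn't reproduced in this post. As a rough idea of the shape such a routine can take, here is a minimal sketch in Python (rather than the Perl or PL/SQL mentioned above) that dumps one table to CSV through the generic DB-API cursor interface. sqlite3 is used only so that the sketch runs without Oracle; passing a connection from an Oracle DB-API driver instead is an assumption that I haven't tested here, and the table and function names are made up for illustration.

```python
import csv
import sqlite3
import sys


def export_table(conn, table, out, batch_size=1000):
    """Write every row of `table` to the file-like `out` as CSV, header first."""
    cur = conn.cursor()
    # Table names can't be bound as query parameters, so only pass trusted names.
    cur.execute("SELECT * FROM %s" % table)
    writer = csv.writer(out)
    writer.writerow([column[0] for column in cur.description])
    while True:
        # Fetch in batches so a large table never has to fit in memory.
        rows = cur.fetchmany(batch_size)
        if not rows:
            break
        writer.writerows(rows)


if __name__ == "__main__":
    # Stand-in database so the sketch can be run as-is.
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE people (id INTEGER, name TEXT)")
    demo.execute("INSERT INTO people VALUES (1, 'alice')")
    export_table(demo, "people", sys.stdout)
```

Looping over a list of table names and writing each to its own file turns this into something that can run unattended after the import.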

Joining GDS

2012-05-08T06:35:00.000+01:00

Basically echoing what others have said. I've held off working in London forever, not wanting to spend a large portion of my day on a train. But then something like this comes along, with an opportunity to transform how Government delivers services (work on stuff that matters), and working with a ridiculously talented set of people. Chances like that don't come along often. It's going to be an exciting ride, and I'm grateful to my wife and kids for letting me get on.

My reaction to Raganwald's "How to do what you love"

2012-03-03T21:39:00.005+00:00

You should buy this book. I know (all?) the content is available online already and if you've been reading raganwald's output over the years, you might have already read the articles collated in this slim volume. I still suggest you should buy the book; my only nitpick was that (at the time I purchased it) the maximum payable price seemed lower than what I would have paid.

I guess for me there were 2 reasons to buy it. One is partly a reflection on my evolving personal philosophy, that people who create great stuff should be somehow rewarded, so that they can carry on creating great stuff. In Renaissance times, this would be patronage. These days, tip jars or similar can be simple, low-friction ways of allowing a much larger potential audience to support an artist. Also, I prefer to buy stuff that is free, because I am fortunate enough to be in a position to do that, and to try to ensure that the supply of free stuff doesn't dry up.

The second reason is that I am grateful to Reg for providing me with so many hours of stimulating thought.

I don't believe I had previously read all of the compositions, and 3 things struck me upon reading this book.

First, I don't have a publicly viewable portfolio demonstrating that I am in any way a competent professional. There are the odd normal bunch of patches littered in various libraries that I use or have used, and one former employer released a large chunk of their code as open source (but with all identification / attribution removed) but there is nothing meaty that is mine (apparently, apart from vbunitfree, which is very dead). I have in the past railed against walled gardens in terms of mobile carriers and their view of the web; in this case I have been working with other walled gardens, in terms of writing code that is proprietary, for corporate entities. My github account needs some TLC to showcase my skills.

Second, in recent years I have neglected communication and other soft skills, choosing instead to focus on technical skills for quite some time. That is a mistake. As I've got older, I've come to think that communication is more important; it's all about the conversations you have with people. Reg certainly seems to share that viewpoint. This blog was initially created since all of my blogging output was going onto an internal, employer-owned blog and I wanted to develop those skills further (and stop putting all of the good stuff in a walled garden!). I need to dedicate some time to this.

Finally, NDAs are evil. In that instance, not only is your professional output (in terms of code at least) locked up in a walled garden so that no-one can view it, but neither can you even talk about it. I agonised for a long time about the last NDA that I signed. No more. If you need me to sign an NDA, I suggest that perhaps you need to examine why you are asking me to do that. Surely you should have confidence in your ability to execute on a plan, and the speed at which you will iterate?

Merging Subversion trunk into a branch; how to deal with merge conflicts

2012-02-24T16:24:00.007+00:00

TLDR meh.

I'd inherited a 4-month-old branch which needed to be merged back into trunk at some point. As a first step, I wanted to merge the (hopefully smaller) changeset from trunk back into the branch. I tried git-svn. It didn't work for me. This has not been a pretty task.


$ pushd path/to/svn/repo
$ svn sw https://example.com/svn/project/branches/my-branch
$ svn up
$ svn merge https://example.com/svn/project/trunk --accept postpone
...
svn: One or more conflicts were produced while merging r3097:4432 into
'.' --
resolve all conflicts and rerun the merge to apply the remaining
unmerged revisions

At this point I have my working copy in a partially merged state, with various file-level and tree/directory-level conflicts. As an example of how a repository might get into this state, imagine this happening in the branch:


$ svn mv dir1 dir2
$ mkdir dir1
...
# add files to dir1
# and commit a few times.

Meanwhile, in trunk:


...
# add and modify files in dir1
# commit a few times.

Since the changes hadn't been cherrypicked, you get tree conflicts. Let's take a look at those conflicts.


$ svn stat | grep 'C '
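
The grep is crude, since it matches any line containing "C " anywhere. A slightly more careful version can be sketched in Python, assuming the usual svn status layout of seven one-character status columns followed by the path, with content conflicts flagged in column 1, property conflicts in column 2 and tree conflicts in column 7. The function names are mine, and I haven't checked the layout against every svn version.

```python
import subprocess


def classify(line):
    """Return the kinds of conflict flagged on a single `svn status` line."""
    kinds = []
    if line[0:1] == "C":
        kinds.append("content")
    if line[1:2] == "C":
        kinds.append("property")
    if line[6:7] == "C":
        kinds.append("tree")
    return kinds


def conflicts(status_output):
    """Map each conflicted path in `svn status` output to its conflict kinds."""
    found = {}
    for line in status_output.splitlines():
        kinds = classify(line)
        if kinds:
            # The path starts after the seven status columns and a space.
            found[line[8:].strip()] = kinds
    return found


def working_copy_conflicts(path="."):
    """Run `svn status` in `path` and report the conflicts it lists."""
    output = subprocess.check_output(["svn", "status", path], universal_newlines=True)
    return conflicts(output)
```

Splitting the list this way makes it easier to see how many conflicts are plain file edits and how many are tree conflicts that need handling by hand.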

I had 46 issues listed for the merge up to this point. File level conflicts can be easily resolved using fmresolve which I've written about previously.


$ fmresolve path/to/conflicted/file

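and then mark the conflict as resolved with whichever of these fits (keep your edited working copy, take trunk's version, or keep the branch's version):


$ svn resolve --accept working path/to/conflicted/file


$ svn resolve --accept theirs-full path/to/conflicted/file


$ svn resolve --accept mine-full path/to/conflicted/file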

Tree conflicts can only be resolved using the working copy, so I needed to checkout / copy the relevant file and edit until I was happy with each one, and then mark each conflict as resolved, accepting the working copy. 21 of these needed attention at this stage.
Then you can proceed with the merge.


$ svn merge https://example.com/svn/project/trunk --accept postpone

Repeat until done.
Hopefully merging the branch back into trunk will go a little easier.

svn merging on OSX

2011-12-16T13:02:00.008+00:00

I don't always use svn as a version control system with which I'll need to merge branches [1], but when I do, I use fmdiff.

$ brew install fmdiff

One minor annoyance - fmmerge (used for interactive conflict resolution) doesn't work. The number of arguments passed to the script has changed since it was first written. I patched it locally, but it still didn't work. FileMerge was launched, I could edit files, etc; but it kept saying that the merge needed resolving. Instead, I just postpone all merge conflicts during the merge, and then use fmresolve and svn resolve to resolve any individual merge conflicts.

[1] I like to branch by feature typically, but occasionally, branch by VCS is used.

Minifying Javascript at runtime

2011-10-04T18:31:00.003+01:00

Steve Souders pointed at this today; I've done something similar in the past, but I struggled somewhat with the documentation. Hopefully this might be useful to others.

As part of a product that serves as a rendering runtime for mobile, this allows authors to create Javascript, and the runtime can optimise and cache on the fly. We like it!

Installing Graphite on OSX (Snow Leopard)

2011-09-27T11:20:00.002+01:00

This wasn't entirely straightforward, so in the hope that it's useful for others:

python on Snow Leopard doesn't seem to come with the development headers, so we need to address that, since pycairo needs them.

$ brew install python

edit the PATH to have /usr/local/share/python at the start
open a new shell to recognise the new PATH
install cairo as per http://stackoverflow.com/questions/6886578/how-to-install-pycairo-1-10-on-mac-osx-with-default-python
I do --use-gcc since it doesn't work with LLVM / LLVM-based GCC.

$ brew install cairo --use-gcc
$ wget http://cairographics.org/releases/py2cairo-1.10.0.tar.bz2
$ tar xjf py2cairo-1.10.0.tar.bz2
$ pushd py2cairo-1.10.0
$ emacs wscript
$ export CC=/usr/bin/gcc
$ export PKG_CONFIG_PATH=/usr/local/Cellar/cairo/1.10.2/lib/pkgconfig/
$ python waf configure
$ python waf build
$ python waf install

Now install python dependencies

$ pip install django
$ pip install django-tagging
$ pip install twisted
$ pushd path/to/graphite/
$ pushd whisper
$ python setup.py install
$ popd
$ pushd carbon
$ python setup.py install
$ popd 
$ python check-dependencies.py
$ python setup.py install
$ pushd /opt/graphite/webapp
$ export PYTHONPATH=${PYTHONPATH}:/opt/graphite/webapp
$ pushd graphite
$ python manage.py syncdb

Grabbed this file and put it in /opt/graphite/bin. That means I don't need to set up Apache httpd locally.

$ wget https://raw.github.com/tmm1/graphite/d0f76a659f4f2dea67f19902002710f601f534aa/bin/run-graphite-devel-server.py
$ python /opt/graphite/bin/carbon-cache.py start
$ python /opt/graphite/bin/run-graphite-devel-server.py /opt/graphite

Browse to http://localhost:8080/ and I have a graphite webapp running

$ python path/to/graphite/examples/example-client.py

I now have a script putting data into graphite. Might want to tweak local_settings.py (make it Europe/London, for example), and conf/carbon.conf to have reasonable retention periods / file sizes for the whisper data files.
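
For feeding in your own numbers, carbon's plaintext listener accepts one metric per line in the form "path value timestamp". A minimal Python sketch, assuming carbon-cache is listening on its default plaintext port of 2003 on localhost; the metric name in the usage note is made up.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one line of carbon's plaintext protocol: 'path value timestamp'."""
    if timestamp is None:
        timestamp = int(time.time())
    return "%s %s %d\n" % (path, value, timestamp)


def send_metric(path, value, host="localhost", port=2003):
    """Send a single data point to carbon-cache over TCP."""
    line = format_metric(path, value)
    sock = socket.create_connection((host, port), timeout=5)
    try:
        sock.sendall(line.encode("ascii"))
    finally:
        sock.close()
```

Something like send_metric("test.deploys", 1) should then show up in the webapp's metric tree once carbon has written the whisper file.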

Content Negotiation on Mobile considered harmful

2011-05-31T23:17:00.000+01:00

This is kind of a follow up to my previous post on using Amazon CloudFront.

This post is to cover conneg, or Content Negotiation.

TL;DR - use HTTP as designed and follow the rules.

Alice tries to access http://example.com/ on her iPhone. She gets back some HTML which references some images.

The markup returned to Alice contains references to 3 images. I'll just look at the first one.

Alice's iPhone made a request for /images/1 and got back a 320x80px PNG image, since we have a clever server-side component which knows about different user-agents and tries to serve the most suitable version of an image for each client.

Along comes Bob. Bob is using a Google Nexus One. He similarly requests our home page and gets back a link to /images/1. When the Nexus One requests that resource though, it gets back a 480x120px PNG, again thanks to our fancy server-side detection.

What does this do to our caching? Well, it stuffs it up completely.

We're using a canonical URI for the image - /images/1.
We're serving different representations of the image from the same URL.
We cannot easily specify good HTTP expiration directives.

Going into the third point in more detail, we cannot use the Vary header in our response to help proxy caches work more efficiently. Vary can only name a request header, which makes User-Agent the obvious candidate, and it doesn't work well:

There are many thousand User-Agent strings in existence.
How different are these 2 anyway?
- Mozilla/5.0 (iPhone; U; CPU iPhone OS 3_0_1 like Mac OS X; en-us) AppleWebKit/528.18 (KHTML, like Gecko)
- Mozilla/5.0 (iPhone; U; CPU iPhone OS 3_0_1 like Mac OS X; en-us) AppleWebKit/528.18 (KHTML, like Gecko) Mobile/7A400

Let's be conservative and say that there are 5000 different User-Agents hitting our site. That means that we could be telling proxy servers to cache 5000 copies of /images/1, rather than the perhaps 7 different sizes that our application might produce.

So alternatives to conneg?

Use a distinct URI for every bag of bytes that the application can serve. This means that your server-side markup generation needs to be a little smarter, so that you render markup containing /images/1/320x80 and /images/1/480x120 for example (see the point in my CloudFront post about not wanting to use query string parameters for this information).
It's still possible to support the old canonical URLs, either by continuing to perform conneg, or redirecting appropriately.
Rev your URIs so that you can happily set Far Future Expires directives on these resources; i.e. if the image changes, give it a new URI.
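
The first and third alternatives can be sketched in a few lines of Python. This is only an illustration, assuming the two renditions mentioned above (320x80 and 480x120), a screen width that the server-side detection has already worked out, and a content hash as the way to rev a URI; the function names are hypothetical.

```python
import hashlib

# Renditions the application can produce (two of the "perhaps 7" sizes).
SIZES = [(320, 80), (480, 120)]


def image_uri(image_id, screen_width):
    """Pick the widest rendition that fits the screen, else the smallest one."""
    fitting = [size for size in SIZES if size[0] <= screen_width]
    width, height = max(fitting) if fitting else min(SIZES)
    return "/images/%s/%dx%d" % (image_id, width, height)


def revved_uri(uri, content):
    """Append a short content hash so that changed bytes get a new URI."""
    digest = hashlib.sha1(content).hexdigest()[:8]
    return "%s/%s" % (uri, digest)
```

With this, Alice's and Bob's markup point at different URIs, each of which always returns the same bytes, so caches need no Vary header and can keep each one for as long as the Expires directive allows.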

I'm still musing on what this means for progressive enhancement. Andrea's post looks like a step in the right direction though.

Objective-C concurrency issues

2011-02-08T20:32:00.002+00:00

Disclaimer - I've not shipped Java Swing / SWT apps. I'm a server guy where markup is the UI. Consequently, I don't have in-depth knowledge of Java to compare against. I'm aware of SwingUtilities.invokeLater(Runnable) but otherwise just assume I'm clueless about Swing.

First rule of GUI programming - don't block the main thread.

Second rule of GUI programming - don't block the main thread, etc.

Quantifying this, you have a device running at a refresh rate of 60Hz. So if you do anything in the main thread, you need it to complete in under 16ms (1/60 of a second is roughly 16.7ms), or your UI will not be smooth and responsive.

In Java, I would normally look at Executors, Callables, Runnables and related APIs to do things off the main thread. In Objective-C, we have NSOperationQueue and NSOperation. Learn, use and love them. In particular, don't do what I did and start porting java.util.concurrent classes to Objective-C. I wrote a CountdownLatch, which was very nice and taught me about various low-level concurrency primitives. Unfortunately it was completely the wrong solution for the language. What I should have done was to use [NSOperation addDependency:] to chain tasks together.

Objective-C tooling

2011-02-08T20:19:00.003+00:00

Java development tools are top of the pile out of anything I've used. The IDEs are massively powerful (they have to be, with the warts on the language). I'm also an emacs user day to day, and pragmatically use vim as well. But Eclipse / IDEA / Netbeans are pretty amazing tools for Java The Language development.

By contrast, for Objective-C development, Xcode isn't.

If Apple Ts&Cs allow, IntelliJ could probably make some impressive inroads into that market.

clang is a good (and getting better all the time) addition. The debugger needs some love; I don't find gdb as powerful as Java debuggers.

In Java-land, one can use maven, ant, ivy, Make, etc to build a project. For iOS development, the IDE rules a lot from the off. There is a command-line tool (xcodebuild) which can potentially be driven by Jenkins or ThoughtWorks Go. That would be my preferred option going forward; in my view, building in an IDE is not a repeatable build process.

Creating a Custom Origin Server for Amazon CloudFront

2011-01-07T10:18:00.008+00:00

At the time of writing, tool support for this seems to be limited to the REST API. The intention with this piece of work was to take content being served by our origin server, e.g. http://example.com/images/foo.png, and serve it via Amazon CloudFront on http://cdn.example.com/images/foo.png.

Download cfcurl.pl.
Get any dependencies from CPAN (the cfcurl.pl script tells you how to do that in case you aren't sure).

Create $HOME/.aws-secrets and chmod 600.

$ cat /Users/jabley/.aws-secrets
%awsSecretAccessKeys = (
 # my personal account
 'james-personal' => {
     id => 'foo',
     key => 'bar',
 },

 # my corporate account
 'james-work' => {
     id => 'AWS-ID',
     key => 'AWS-Secret-Key',
 },
);

Create a file with the request data

$ cat create-distribution.xml
<?xml version="1.0" encoding="UTF-8"?>
<DistributionConfig xmlns="http://cloudfront.amazonaws.com/doc/2010-11-01/">
<CustomOrigin>
   <DNSName>example.com</DNSName>
   <OriginProtocolPolicy>http-only</OriginProtocolPolicy>
</CustomOrigin>
<CallerReference>20110106103700</CallerReference>
<CNAME>cdn.example.com</CNAME>
<Comment>example.com CloudFront CDN</Comment>
<Enabled>true</Enabled>
<Logging>
   <Bucket>accesslogs-example.com.s3.amazonaws.com</Bucket>
   <Prefix>cdn.example.com/</Prefix>
</Logging>
</DistributionConfig>

POST the file

   perl cfcurl.pl --keyname james-work -- -X POST -H "Content-Type: text/xml;charset=utf-8" --upload-file \
    create-distribution.xml https://cloudfront.amazonaws.com/2010-11-01/distribution

Poll to see when it has finished creating the distribution:

   perl cfcurl.pl --keyname james-work -- https://cloudfront.amazonaws.com/2008-06-30/distribution

Configure DNS so that cdn.example.com is a CNAME for the DomainName value of your newly created CloudFront Distribution.

You should now be able to request a resource using the new CDN name:

$ curl -v -s "http://cdn.example.com/images/foo.png" -o /dev/null
* About to connect() to cdn.example.com port 80 (#0)
*   Trying 192.168.1.1... connected
* Connected to cdn.example.com (192.168.1.1) port 80 (#0)
> GET /images/foo.png HTTP/1.1
> Host: cdn.example.com
> Accept: */*
> User-Agent: curl
>
* HTTP 1.0, assume close after body
< HTTP/1.0 200 OK
< Date: Thu, 06 Jan 2011 18:51:35 GMT
< Server: Apache/2.2.3 (Red Hat)
< Content-Length: 1233
< Cache-Control: max-age=86400
< Content-Type: image/png
< Age: 13311
< X-Cache: Hit from cloudfront
< X-Amz-Cf-Id: b769b423c54e2ffb0a6fb60369e2e0f7b103251ef3e2c549084fb4abe4ef9a236052f8eec40b3a14,80416de274eb8ee87c21ee41c863f9f6f9ef1c251823d9c12c46ab13dc33759dcbb04175b7d4a5a7
< Via: 1.0 83eb7919a5076e946a3a2d59d7f4415b.cloudfront.net:11180 (CloudFront), 1.0 26fb80d2abd86d7f52358cd1c2efd787.cloudfront.net:11180 (CloudFront)
< Connection: close
< 
{ [data not shown]
* Closing connection #0
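
The interesting parts of that response are the X-Cache and Age headers. If you want to check a batch of URLs rather than eyeball curl output, something along these lines might do. It is a minimal Python sketch, assuming the header names and the "Hit from cloudfront" style of value shown above; the function names are my own.

```python
import urllib.request


def cache_status(headers):
    """Return (status, age_in_seconds) for a CloudFront response, else None."""
    lowered = {name.lower(): value for name, value in headers.items()}
    x_cache = lowered.get("x-cache", "")
    if "cloudfront" not in x_cache.lower():
        return None
    # The first word of e.g. "Hit from cloudfront" is the cache status.
    status = x_cache.split()[0].lower()
    age = int(lowered.get("age", "0"))
    return status, age


def fetch_headers(url):
    """Issue a HEAD request and return the response headers as a dict."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request) as response:
        return dict(response.headers.items())
```

For the response above, cache_status(fetch_headers("http://cdn.example.com/images/foo.png")) would be expected to report a hit with an age of 13311 seconds.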

Note that Amazon CloudFront doesn't support query strings on resources, so you might need to use some Apache mod_rewrite stuff.

Apple AppStore feedback

2010-08-04T08:45:00.003+01:00

This morning the AppStore on my iPod Touch pointed out that the Twitter app had an update. I installed it and launched the app to see what was new. Crash! After 5 crashes in a row without a successful launch, it was obvious that a bad build had snuck into the App Store.

 Incident Identifier: 61EE3335-1AD0-4099-8EC6-FAB4B6160A43
 CrashReporter Key:   001e29bd4aa2ef81d42701ce5325da94b364e27b
 Process:         Twitter [8470]
 Path:            /var/mobile/Applications/8E13E345-CDD3-4CA4-899D-8E38BA6661C5/Twitter.app/Twitter
 Identifier:      Twitter
 Version:         ??? (???)
 Code Type:       ARM (Native)
 Parent Process:  launchd [1]
 
 Date/Time:       2010-08-04 07:58:35.994 +0100
 OS Version:      iPhone OS 3.1.3 (7E18)
 Report Version:  104
 
 Exception Type:  EXC_BREAKPOINT (SIGTRAP)
 Exception Codes: 0x00000001, 0xe7ffdefe
 Crashed Thread:  0
 
 Dyld Error Message:
   Symbol not found: __NSConcreteGlobalBlock
   Referenced from: /var/mobile/Applications/8E13E345-CDD3-4CA4-899D-8E38BA6661C5/Twitter.app/Twitter
   Expected in: /usr/lib/libSystem.B.dylib
   Dyld Version: 149
 
 Binary Images:
     0x1000 -   0x14ffff +Twitter armv6  <43ca857e309a61ba8c5da3ab83e42218> /var/mobile/Applications/8E13E345-CDD3-4CA4-899D-8E38BA6661C5/Twitter.app/Twitter

As an iPhone app developer, I think I know what this problem is. We saw this problem in one of our apps. IIRC, the new, preferred llvm compiler has a bug with the new blocks language construct, and gcc doesn't, and the bug only shows up at runtime, in certain environments. To fix it, Twitter are going to have to recompile and use gcc rather than llvm, and then wait for the wheels at Apple to turn.

Other people have talked about the frustration of not being able to iterate at web speed or do continuous deployment, but that's part of the ecosystem that you operate in with Apple.

Testing, either by the Twitter team, or by Apple when they review the app prior to approving it, should have caught this issue. But these things happen.

We had a similar thing happen with an update to one of our apps recently. An update went live and thanks to the apparent difficulty in doing your own testing of the binary that gets sent to Apple, an issue only became apparent when the new version was available through iTunes. To me, this is where the ecosystem is broken. If I have a webapp and I deploy an update, then find an issue (via my cluster-immune system - one day!), I roll it back.

iTunesConnect has no rollback, even though it seems like a highly desirable feature. I know in our case, we would have liked the option to roll back to the last known good version and then wait for Apple to review an update, rather than having the world upgrade to a version that we didn't want them to be running. I imagine Twitter would appreciate a similar feature right about now.

Capturing Mobile Network Traffic On OS X

2010-07-08T10:13:00.004+01:00

Recently had to audit an app to ensure that it wasn't leaking any unwanted details over the network. This was an iPhone app, but the same process can be used for Android, etc.

Ensure Macbook Pro is plugged into Ethernet.
Open System Preferences
Internet & Wireless | Sharing (in Snow Leopard).
Click Internet Sharing.
Share your connection from: Ethernet
To computers using: AirPort
Close System Preferences
Click Airport
Select Create Network...
On the phone, open the WiFi controls and connect to the network that you've just created.
Run Wireshark.
Start capturing traffic on the wireless card.
Check stuff is using SSL that should be, etc.

mod_python in apache on OS X with Homebrew

2010-05-20T14:03:00.004+01:00

Recently had to install mod_python to test something for a customer. It needed some nudging, so including it here. Snow Leopard, Homebrew and default httpd.

Objective-C - the language

2010-01-13T10:21:00.004+00:00

First off, I read the Objective-C Primer and Objective-C Programming Language guides. I collect languages, so there was some underlying familiarity there. Ruby, Smalltalk and C obviously shone through for me. Second off, I re-read Smalltalk Best Practice Patterns. I first read that book maybe 6 years ago and it had a massive impact on my Java style. Objective-C is the most Smalltalk-like language that the 'masses' will actually use professionally. Sadly, it's not enough Smalltalk for me, and the C abstractions leak quite a bit.

Java Developers Guide to Objective-C on the iPhone

2010-01-13T10:17:00.005+00:00

This will be a place-holder page containing links to the other entries that I create in this series. I've got a lot of commercial experience with Java, some Python and Ruby. This has all been server-side; I've not really touched GUIs (apart from GWT, HTML and Javascript) for a while, so this series will necessarily reflect that. Hopefully it will prove useful to others.

Topics that I hope to cover:

Initial questions coming from a mainly Java background.
Objective-C the language, including comparisons with Java.
OO with Objective-C, covering how a typical Java app would use interfaces and how Objective-C might approach the problem.
Collections in Java and Objective-C.
Concurrency utils in Java and Objective-C.
Tools.

Objective-C for Java Developers

2009-12-29T23:20:00.006+00:00

I'm coming from an Eclipse on Ubuntu background, but this is equally applicable for IDEA on Windows. What are the equivalents for iPhone development?

Java	iPhone	Notes
JUnit (unit testing framework)	?	It is possible to use TDD for Swing apps, although I've been predominantly a server-side guy with client stuff happening in the browser for quite a while now. Cucumber with iPhone looks worth exploring...
Hudson (continuous integration tool)	?	On my first iPhone app, it rapidly became apparent how easy it was for people to do bad merges and delete classes from the Xcode project file / strings from the UTF-16 l10n Localizable.strings file. You can argue that people should take more care; yeah, that'll fix it. git bisect is great, but a tool that builds on each commit is better.

ScaleCamp - Queue PUBSUB

2009-12-26T23:53:00.003+00:00

For some reason I went to this thinking PubSubHubbub, but it was more a refresher about making an app asynchronous, why you'd want to do that and how implementation complexity goes up as you go after certain desirable behaviours.

ScaleCamp - How do you scale Activity Feeds?

2009-12-26T23:17:00.006+00:00

Popular session this - standing room only, so no notes from me.

The short answer is Redis, courtesy of Simon Willison. Since the consensus was that Redis would do the trick, we then touched on Simon's other new favourite technology - node.js.

Digression: Alex from mediamolecule commented that 100MB of data in a key-data structure store should only require 100MB of memory in such a server app (plus a little extra for housekeeping), and that it shouldn't be a 1:10 ratio or similar. I didn't take that as a direct criticism of Redis but more as a reminder about choosing good data structures and the importance of CompSci (says this mathematician). I mention that since it pricked me to investigate a suspected bad data structure in one of our apps. Coupled with the Eclipse Memory Analyser recommended by the Guardian guys, I found something that was using far too much of the heap on our Tomcat nodes, and had a change rolled out within 4 days of attending this conference. That reduced the memory footprint for that data structure from 250MB to 16MB. Ouch, shocker, but great to have found, prioritised and fixed.

ScaleCamp - Scaling Java and Oracle for the Guardian

2009-12-07T12:11:00.003+00:00

Guardian.co.uk

Graham and various other people from the development and operations team pitching in.

3 years ago - published static files with Apache SSI to fill in the gaps. Moved to a fully dynamic system. Now, they're somewhere in between.

Stack -

apache

resin

spring / hibernate / velocity

Oracle DB backend (not recommended!)

Measured the application - 1300 requests to DB just to render homepage.

Added ehcache to hibernate as 2nd level cache and added a warmup script before putting into load balancer

30m unique users per month
270m pages per month
250 requests/second at lunchtime
1500 requests/second peak.

GC tools

Google weakref cache (part of Google Collections)

Eclipse memory analyser - what's using all my memory?

Cacti for monitoring - DB usage was killing it.

8 app servers in each co-lo (London and Manchester).

400MB used by cache - churn meant it was pretty ineffective.

Tried or considered ehcache distribution and jboss cache distribution.

Rejected since cache eviction via replication would have thrashed it.

memcached

massive improvement in response times, but DB load still high.

went to caching every query for 5 minutes. DB load vanished and is flat even as more app servers come on-line.

servlet filter writing to memcached made it stinking fast.

took a day's worth of logs and ran them through Hadoop to see how long the cache lifetime should be. 1 minute was the sweet spot.
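The "cache every query for a fixed time" idea in these notes can be sketched in plain Java. This is only an in-process illustration, not what the Guardian runs (they used memcached): the class name TtlCache, the injectable clock and the loader function are all my own assumptions for the example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical time-bounded cache: each entry is served for at most ttlMillis. */
final class TtlCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAt;

        Entry(V value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final ConcurrentMap<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so expiry can be exercised without sleeping

    TtlCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the cached value, calling loader only if the entry is missing or expired. */
    V get(K key, Function<K, V> loader) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(key);
        if (entry == null || now >= entry.expiresAt) {
            // Racing threads may both load the same key; fine for a sketch, last write wins.
            entry = new Entry<>(loader.apply(key), now + ttlMillis);
            entries.put(key, entry);
        }
        return entry.value;
    }
}
```

In real use the clock would be System::currentTimeMillis and the loader would run the database query. The point is that the database sees at most one load per key per lifetime (give or take the race noted in the comment), however many app servers or requests there are, which fits the "DB load is flat" observation.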

Emergency switch to serve a static copy of the site, minus personalization features.

Daemon or script scrapes the site; they can handle 700req/s/node when the site's operating in this mode.

new content published in this mode has a copy pressed so it can be served from disk - publish is slower than with the other system but updates still possible

Highly recommend that this sort of emergency degrade read-only mode should be built-in from the off - they've used this approach with the MPs Expenses apps built to crowd-source investigations.
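To make the emergency switch idea concrete, here's a rough sketch of how such a degrade mode might look. Everything in it is assumed for illustration: the class name DegradableSite, the in-memory map standing in for copies pressed to disk, and the renderer function standing in for the full dynamic stack. I don't know how the Guardian actually implemented theirs.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Function;

/** Hypothetical read-only degrade switch: serve pre-rendered pages when the flag is on. */
final class DegradableSite {
    private final AtomicBoolean emergencyMode = new AtomicBoolean(false);
    // Stands in for static copies on disk (assumption for the sketch).
    private final Map<String, String> staticCopies = new ConcurrentHashMap<>();
    // Stands in for the dynamic app: templates, database, personalization.
    private final Function<String, String> dynamicRenderer;

    DegradableSite(Function<String, String> dynamicRenderer) {
        this.dynamicRenderer = dynamicRenderer;
    }

    void setEmergencyMode(boolean on) {
        emergencyMode.set(on);
    }

    /** Stores a pre-rendered copy, as a scraper or the publish step would. */
    void pressStaticCopy(String path, String html) {
        staticCopies.put(path, html);
    }

    /** In emergency mode the dynamic stack is never touched; unknown paths come back empty. */
    Optional<String> serve(String path) {
        if (emergencyMode.get()) {
            return Optional.ofNullable(staticCopies.get(path));
        }
        return Optional.of(dynamicRenderer.apply(path));
    }
}
```

The design choice worth copying is that the switch sits in front of everything expensive, so flipping it removes database and rendering load in one go, while publishing can still press new copies.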

ScaleCamp - Scaling Java with Shared Nothing

2009-12-07T12:09:00.002+00:00

Thoughtworks guys again.

Basic Servlet overview - in-memory sessions don't scale, duh!

Preferred option - state is serialized and goes into cookies. Security, legal aspects? Pretty well common to most frameworks these days.
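On the security question: one common way to stop clients tampering with cookie-held state is to sign it. The sketch below is my own illustration rather than anything shown in the session. The SignedCookie class and its payload.signature format are assumptions; the HMAC and Base64 calls are standard JDK APIs. Signing only gives integrity, so anything confidential would need encrypting as well.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import java.util.Optional;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical signed cookie value: base64url(state) + "." + base64url(HMAC-SHA256(state)). */
final class SignedCookie {
    private final SecretKeySpec key;

    SignedCookie(byte[] secret) {
        this.key = new SecretKeySpec(secret, "HmacSHA256");
    }

    String encode(String state) {
        byte[] payload = state.getBytes(StandardCharsets.UTF_8);
        Base64.Encoder enc = Base64.getUrlEncoder().withoutPadding();
        return enc.encodeToString(payload) + "." + enc.encodeToString(hmac(payload));
    }

    /** Returns the state only if the signature matches; empty for malformed or tampered values. */
    Optional<String> decode(String cookie) {
        int dot = cookie.indexOf('.');
        if (dot < 0) {
            return Optional.empty();
        }
        try {
            Base64.Decoder dec = Base64.getUrlDecoder();
            byte[] payload = dec.decode(cookie.substring(0, dot));
            byte[] signature = dec.decode(cookie.substring(dot + 1));
            // MessageDigest.isEqual is a time-constant comparison in current JDKs.
            if (!MessageDigest.isEqual(signature, hmac(payload))) {
                return Optional.empty();
            }
            return Optional.of(new String(payload, StandardCharsets.UTF_8));
        } catch (IllegalArgumentException malformedBase64) {
            return Optional.empty();
        }
    }

    private byte[] hmac(byte[] payload) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(key);
            return mac.doFinal(payload);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }
}
```

Every app server that shares the secret can verify the cookie without talking to any other node, which is what keeps the approach shared-nothing.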

Page composition in the server with proxy server holding StringTemplate objects. Interesting idea - seen variants of this in other places. I'm curious as to whether doing this could mean having a poor man's macro system for Java, since XSLTs can be written to create XSLTs; maybe Velocity templates could similarly generate Velocity templates or StringTemplate -> StringTemplate?

Again, application developers need to have a good idea of caching directives for this to work. One objection I had with this approach is that you potentially increase your hardware requirement and can open the app up to liveness failures here. Request A comes in and is serviced by Thread 1. As part of that, it makes a request to the proxy server for a template. At the proxy server, a cache miss means that another request needs to be made to the app server. Then Request A is tying up 2 app server threads. What about applications which parallelize the requests? They might use more than 2 app server request-handling threads at a time, etc.
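The liveness worry can be reproduced in miniature with a bounded thread pool. This is a toy model under my own assumptions, not the Thoughtworks architecture: the fixed pool stands in for the app server's request threads, and the nested submit stands in for the proxy's cache miss calling back into the app server. The names NestedRequestDemo and requestCompletes are made up for the example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Toy model: a request that needs a second app-server thread to finish. */
final class NestedRequestDemo {
    /** Returns true if Request A gets its template within the timeout. */
    static boolean requestCompletes(int appThreads, long timeoutMillis) {
        ExecutorService appServer = Executors.newFixedThreadPool(appThreads);
        try {
            Callable<Boolean> requestA = () -> {
                // Cache miss at the proxy: the template fetch lands back on the same pool.
                Future<String> template = appServer.submit(() -> "template");
                try {
                    template.get(timeoutMillis, TimeUnit.MILLISECONDS);
                    return true;
                } catch (TimeoutException starved) {
                    return false; // no free thread ever picked up the nested request
                }
            };
            return appServer.submit(requestA).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            appServer.shutdownNow();
        }
    }
}
```

With one app thread the request can never get its template, because it is holding the only thread the nested call could run on. With two threads it completes. Scale that up and N concurrent requests that each miss the cache can exhaust an N-thread pool in the same way.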

Thoughtworks seem to do fun, interesting work.

ScaleCamp - LittleBigPlanet

2009-12-07T12:08:00.004+00:00

Alex and James from mediamolecule.com. Fascinating perspective of embedded developers coming to server-programming and refusing to accept commonly held views on best practices for doing so. This was the surprise hit of the conference for me; I just elected to go since there wasn't anything else in that slot that I was really passionate about. I'd been talking to them both in the queue for tea earlier and made a poorly judged joke about Map-Reduce (we pretty much had this conversation). The session they ran was an awesome talk about scaling server-side within the games sector - Little Big Planet is theirs.

They've written their own C-based key-data structure store, a category in which we're spoilt for choice just now. Alex commented that he's looked at Redis and it has some nice stuff, but when they came to need it, there wasn't anything that met their needs, and experience with running the recommended Java stack had left them with the impression that they should stick to what they know. What they know is writing very tight code in constrained environments, so applying that mind-set to server-side development seemed to have yielded some very pleasing numbers. Other parts are in Ruby (presumably 1.9, since they're using Fibers?). I didn't get around to asking James how well that works or which implementation they're using. Very happy with that programming model though - James is or was a Java guy - funny how nice Ruby feels coming from there!