The Neo4j 1.9 to 2.0 Upgrade

So, with Neo4j 2.0 on the near horizon (RC1 is out), it seemed appropriate to upgrade my whole codebase from 1.9 to 2.0. On the whole the conversion has been pretty smooth. There are some pre-2.0 gaffes I’ve made which have made it slightly harder than it needed to be – but the ability to use 1.9 Cypher against a 2.0 db is incredibly useful.

So – I’m (as you can probably tell) a .NET developer, which means I use the excellent Neo4jClient for pretty much all of my interaction with the database. The project (though I can’t go into specifics) has been using the client since 1.0.0.395, and initially I was using the Node<T> and NodeReference classes A LOT.

The BANG approach.

Some stats, I guess: the project isn’t huge, only 10,000 LOC (remember size doesn’t matter), and as a consequence my first thought was the “let’s just do it all” approach. So, start up a new DB and just see how it runs.

INSTANT FAIL

Of course – my queries are using ‘START’, ooooooooooooops. So I start doing a ‘Find/Replace’, hit some ridiculous number of build errors, swear a bit, back out those changes and start again.

The Gradual approach.

Or, the “should’ve done this in the first place” approach. Neo4j has the option to switch the Cypher parser – put ‘CYPHER X.X’ (where X.X is the version you want to use) in front of a query and that query runs under that parser version, so you can run 1.9 queries against the 2.0 database.
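To make the prefix concrete, here’s a tiny sketch of what it does to the raw query text. The CypherParser class and WithVersion method are names I’ve made up for illustration, they aren’t part of Neo4j or Neo4jClient:

```csharp
public static class CypherParser
{
    // Hypothetical helper: prefixes a raw Cypher query with the parser
    // version keyword, e.g. "CYPHER 1.9 START n=node(1) RETURN n".
    public static string WithVersion(string version, string query)
    {
        return "CYPHER " + version + " " + query;
    }
}
```

So CypherParser.WithVersion("1.9", "START n=node(1) RETURN n") gives back the same query with ‘CYPHER 1.9 ’ stuck on the front, which is all the switch really is at the text level.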

Neo4jClient also supports this via the ‘ParserVersion’ method, so my initial queries of:

gc.Cypher
    .Start(new {n = new NodeReference(1)})
    .Return(n => n.As<Blah>())

change to:

gc.Cypher
    .ParserVersion(1,9)
    .Start(new {n = new NodeReference(1)})
    .Return(n => n.As<Blah>())

Compile and run – we’re in a happier place.

Labels, NodeReferences et al.

Now, those of you with eyes or indeed screen readers will have noted the ‘NodeReference’ being used above (even though it’s only an example!!) but still – naughty. A NodeReference in Neo4jClient holds the actual underlying Neo4j id for the node. There are various reasons not to use it, but one of the biggest is that it’s not something I / we have control over. Ids can be reused, and there is no guarantee that id 1 is still id 1 after a day of activity.

Tatham Oddie (the primary author of the awesomeness that is Neo4jClient) is moving the client away from NodeReference and Node<T> usage, effectively deprecating them as and when he can. Good news – this makes the code simpler as we deal with POCOs (Plain Old CLR Objects) exclusively. Bad news – some chumps (*ahem* me *ahem*) used Node<T> quite a bit in their code and would now be better off removing it.

The other thing I want to take advantage of is Labels, which means a bit of an update to the MATCH clauses. I say a bit. I mean a lot. If (for example) I have a match statement like:

.Match("n")

I need to update it to something more like:

.Match("(n:Label)")

AND if I have lots of match statements like that, I have lots of updates to make. Basically, I’m paying for my naiveté at the beginning of the project.
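For the simplest cases a regex can do the donkey work. What follows is a rough sketch rather than the scripts I actually ran: MatchRewriter, AddLabels and the identifier-to-label dictionary are all made-up names, and it assumes the calls are written as .Match("n") with nothing but a bare identifier inside the quotes.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MatchRewriter
{
    // Finds .Match("x") calls whose whole pattern is a single bare identifier.
    // Patterns that already contain parentheses or relationships are not matched.
    private static readonly Regex BareMatch =
        new Regex(@"\.Match\(\s*""(?<id>[A-Za-z_][A-Za-z0-9_]*)""\s*\)");

    // 'labels' is a hypothetical map of identifier -> label, e.g. n -> Blah.
    public static string AddLabels(string source, IDictionary<string, string> labels)
    {
        return BareMatch.Replace(source, m =>
        {
            string id = m.Groups["id"].Value;
            string label;
            // Leave the call untouched when no label is known for the identifier.
            return labels.TryGetValue(id, out label)
                ? ".Match(\"(" + id + ":" + label + ")\")"
                : m.Value;
        });
    }
}
```

Feed it a source file’s text plus a dictionary mapping n to Blah, and every .Match("n") comes back as .Match("(n:Blah)"). Anything more involved than a bare identifier is deliberately left alone for a human to sort out.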

Unit Tests

I’m doing quite a bit of query building based on given inputs (vague eh?), which in practice means I need to test that the right query is generated for a given input. This is where I’ve been bitten the hardest. See – I care a lot about how these queries are produced, and I need to make sure they match the expected queries.

So for every MATCH n I’ve changed to MATCH (n:Label) I’ve had a test somewhere which then fails. OK, it’s not quite that bad, but you get the drift.
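To give a flavour of the kind of test I mean, here’s a stripped-down sketch with no Neo4jClient or test framework involved. BuildMatch is a made-up stand-in for the real query-building code, and the test is just a method that throws when the text doesn’t match:

```csharp
using System;

public static class QueryText
{
    // Hypothetical stand-in for real query-building code:
    // builds a labelled MATCH clause from its inputs.
    public static string BuildMatch(string identifier, string label)
    {
        return "MATCH (" + identifier + ":" + label + ")";
    }
}

public static class QueryTextTests
{
    // Fails loudly if the generated text drifts from the expected query.
    public static void MatchIncludesLabel()
    {
        string actual = QueryText.BuildMatch("n", "Blah");
        if (actual != "MATCH (n:Blah)")
        {
            throw new Exception("Unexpected query text: " + actual);
        }
    }
}
```

The expected string is hard-coded, which is exactly why moving from MATCH n to MATCH (n:Label) means touching the test as well as the code.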

Conclusions

It’s taken some time (and some pretty impressive regex find/replace scripts, if I do say so myself) but I finally managed to be 100% Cypher 2.0, and on the day Neo4j 2.0 stable was released, so how’s that for timing :)

Converting any codebase from a previous version to a new one is always costly. There were two main changes. The first is the change to Cypher, which was a doddle: being able to put ‘CYPHER 1.9’ at the beginning meant the code worked straight away against Neo4j 2.0, and this is HUGELY helpful. The second change (for me) is removing all ‘Node<T>’ and ‘NodeReference’ usages. This was more complex, but with a caveat – only because of the way I had written my code.

Posted Thursday, December 12, 2013 4:30 PM
