cosmic realms

a coder blog

Benchmark of node.js JSON Validation Modules

I need to validate JSON in a node.js project I am working on.

Speed is critical in my project, so I decided to benchmark several node.js JSON validation modules.

I only included modules that support the JSON schema described here: http://json-schema.org/

The following modules were benchmarked:

Two JSON objects and schemas were used, one basic and one advanced. The source code for these is at the bottom of this post.
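For anyone wanting to run a similar comparison, here is a minimal sketch of a timing harness. It is not the harness used for the results in this post: `standInValidate` is a hypothetical placeholder for a real module's validate call, and the iteration counts are arbitrary.

```javascript
// Hypothetical stand-in validator (an assumption, not one of the benchmarked
// modules): checks that `name` is a string and `age` is a number.
function standInValidate(data) {
  return typeof data.name === 'string' && typeof data.age === 'number';
}

// Time `iterations` calls of `validate(data)` and report throughput.
function benchmark(validate, data, iterations) {
  // Warm-up pass so the JIT can optimize before timing starts.
  for (let i = 0; i < 1000; i++) validate(data);

  const start = process.hrtime.bigint();
  let validCount = 0;
  for (let i = 0; i < iterations; i++) {
    if (validate(data)) validCount++;
  }
  const elapsedNs = Number(process.hrtime.bigint() - start);

  return {
    iterations,
    validCount,
    elapsedMs: elapsedNs / 1e6,
    opsPerSec: iterations / (elapsedNs / 1e9),
  };
}

const result = benchmark(standInValidate, { name: 'Ada', age: 36 }, 100000);
console.log(result.opsPerSec.toFixed(0) + ' validations/sec');
```

Any schema preparation a module needs should happen outside the timed loop if you want to measure validation alone.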

Due to varying module support, I had to create both a v2 and v3 schema document.

Results






Conclusion

Looks like JSV is the slowest, even more so when dealing with v3 of the JSON schema. Also note that these results use a pre-processed JSV JSON schema. Without the pre-processing, JSV was an additional 20 times slower.

So easy conclusion right? Use json-schema or schema as they are the fastest?

Well, it turns out that both json-schema and schema lack support for several properties mentioned in the spec, such as divisibleBy, uniqueItems and format.
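To give an idea of what two of those properties check, here is a rough hand-rolled sketch. The function names are made up for illustration and this is not how any of the modules above implement them. Comparing items via `JSON.stringify` is a shortcut I'm assuming is good enough for a demo; it treats objects with the same keys in a different order as different.

```javascript
// divisibleBy: the value must divide by the divisor with no remainder.
// Note: the % operator can misbehave for some decimals (0.3 % 0.1 is not 0).
function isDivisibleBy(value, divisor) {
  if (divisor === 0) return false; // a zero divisor can never divide evenly
  return value % divisor === 0;
}

// uniqueItems: no two items in the array may be equal.
// Items are compared by their JSON text (a simplification, see above).
function hasUniqueItems(items) {
  const seen = new Set();
  for (const item of items) {
    const key = JSON.stringify(item);
    if (seen.has(key)) return false;
    seen.add(key);
  }
  return true;
}

console.log(isDivisibleBy(10, 5));                  // true
console.log(isDivisibleBy(10, 3));                  // false
console.log(hasUniqueItems([1, 2, 3]));             // true
console.log(hasUniqueItems([{ a: 1 }, { a: 1 }]));  // false
```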

For my project I’m not currently using any unsupported properties, so I’ll be choosing json-schema for its speed.

In the future if I need to use an unsupported property, I can always just add it to json-schema myself and send a pull request :)

Modules not used

These modules were NOT tested due to each having a custom JSON schema format:

Lastly this module was also not tested because it lacks support for the ‘required’ attribute:

Source Code

Here are the JSON data objects and schema used for the benchmarks:

Web Traffic Time Analysis by Server, Client and Client IP

Traffic analysis for World of Solitaire shows more visits on the weekdays compared to the weekends.

I suspect that this is due to people playing solitaire from work and school. I wanted to see at what hour of the day people are visiting to see if it is during normal work and school hours.

Measuring with just the server time however wouldn’t give an accurate measurement of what time of day it truly was at the visitor’s geographic location in the world.

I measured each visit three different ways:

  • The server’s time
  • The client’s actual local time (read by javascript in their browser)
  • The client’s IP address (converted to a city/time zone using GeoIP)
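The three measurements above can be sketched roughly like this. It is an illustration, not the code that collected the data: it assumes the server logs timestamps in UTC, that the browser reports `Date#getTimezoneOffset()` (minutes, positive for zones behind UTC), and that the GeoIP lookup returns an IANA time zone name. The function names are hypothetical.

```javascript
// Hour of day (0-23) from a UTC timestamp plus the browser's
// getTimezoneOffset() value (minutes, positive when behind UTC).
function hourFromClientOffset(utcMs, offsetMinutes) {
  const shifted = new Date(utcMs - offsetMinutes * 60000);
  return shifted.getUTCHours();
}

// Hour of day (0-23) in an IANA time zone, such as one a GeoIP
// lookup might return for the visitor's IP address.
function hourFromTimeZone(utcMs, timeZone) {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    hour: 'numeric',
    hourCycle: 'h23',
  }).formatToParts(new Date(utcMs));
  return Number(parts.find((p) => p.type === 'hour').value);
}

// Example visit: Mon Oct 17 2011, 20:00 UTC.
const visit = Date.UTC(2011, 9, 17, 20, 0, 0);

console.log(new Date(visit).getUTCHours());               // 20 (server time, UTC assumed)
console.log(hourFromClientOffset(visit, 240));            // 16 (client reports UTC-4)
console.log(hourFromTimeZone(visit, 'America/New_York')); // 16 (GeoIP says New York)
```

The time zone version needs a node.js build with full time zone data, which recent releases include by default.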

The data was collected over a 1 week period from Sun Oct 16 to Sat Oct 22.

Here are the visits broken down by day:

All three show pretty much the same thing. Most of my traffic comes from the States and my server is located here, so it looks like at a ‘day’ level, using the server time is reliable.

Let’s break it down by hour:

Now this is much more interesting! It shows that measuring with just the server time is not an accurate reading of what time of day people visit.

There is a 20% drop in visits between 4PM and 6PM, right about the time people would leave work and school.
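For reference, a drop like this is just the change relative to the earlier hour. A tiny sketch, with placeholder visit counts that are made up for the example and are NOT the real traffic numbers:

```javascript
// Percentage drop from `before` to `after`, relative to `before`.
function percentDrop(before, after) {
  if (before <= 0) throw new RangeError('before must be positive');
  return ((before - after) / before) * 100;
}

// Placeholder hourly visit counts keyed by local hour (made-up values).
const visitsByHour = { 16: 200, 17: 170, 18: 150 };

console.log(percentDrop(visitsByHour[16], visitsByHour[18])); // 25
```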

The above graph was for the entire week. What about looking only at weekdays and weekends, will we see something different?

The weekday chart shows an even bigger drop between 4PM and 6PM, 27%.

The weekend only shows a tiny 3% drop during this same period. On the weekends the traffic doesn’t drop until about 9PM, bed time for many people.

It looks like using the GeoIP version of the client’s IP address is an accurate way to measure the client’s actual local time.

Can we also say that people are visiting from work and school? The data seems to point in that direction, but it isn’t definitive proof. People may be playing from home while their significant others go to work and then stop when they arrive home.

Massive Decrease in Memory Usage With Redis 2.4

I love redis. It’s blazingly fast and wonderfully atomic.

Back in December 2010 I converted the database behind World of Solitaire to redis 2.2

It’s currently holding over 21 million keys and handling over 1,100 commands per second.

Over the past 9 months RAM usage has slowly been increasing as more and more keys are inserted. A few days ago I realized I was very short on available RAM and had to do something sooner, rather than later.

Before throwing more RAM into the box, I decided to try updating to redis 2.4 as I had read a blog post that it was more efficient at storing sorted sets.

I was SHOCKED at the reduction in RAM usage:

It was such a dramatic reduction, I questioned whether or not data had been lost in some way.

Comparing the same dump in both 2.2 and 2.4 yielded the exact same key count: 21,085,659

Am I really storing that many sorted sets? I decided to write some code to see how many keys of each type I was storing and what the average length of each type was.

My first attempt in node.js ended pretty quickly with a memory allocation fault, so I decided to code it in C.
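The tallying itself is simple. Here is a minimal sketch of it in JavaScript, not the C program from this post. It assumes something else has already asked redis for each key's type (TYPE) and length (using the matching length command) and hands over `{ type, length }` records one at a time, so only the running totals are held in memory. `tallyKeyStats` and `sampleRecords` are hypothetical names, and the sample records are made up.

```javascript
// Which redis command reports the "length" of each key type.
const LENGTH_COMMAND = {
  string: 'STRLEN',
  list: 'LLEN',
  set: 'SCARD',
  zset: 'ZCARD',
  hash: 'HLEN',
};

// Consume { type, length } records one by one and keep only running totals.
function tallyKeyStats(records) {
  const totals = new Map();
  for (const { type, length } of records) {
    const entry = totals.get(type) || { count: 0, totalLength: 0 };
    entry.count += 1;
    entry.totalLength += length;
    totals.set(type, entry);
  }

  const summary = {};
  for (const [type, { count, totalLength }] of totals) {
    summary[type] = { count, averageLength: totalLength / count };
  }
  return summary;
}

// Made-up sample records standing in for a real key scan.
function* sampleRecords() {
  yield { type: 'zset', length: 1 };
  yield { type: 'zset', length: 3 };
  yield { type: 'string', length: 10 };
}

console.log(LENGTH_COMMAND.zset); // ZCARD
console.log(tallyKeyStats(sampleRecords()));
```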

Data type breakdown:

Average Length:

Over 10 million sorted sets with only 1.5 entries on average per set.

Thanks to redis 2.4, I won’t have to worry about RAM for a while :)

Here is the hacky C code I coded up to gather redis key type stats:

First Post!

Over the past decade or so of coding I’ve had several coding-related ideas I wanted to research or experiment with. In many cases I felt that the outcome of the research or experiment would be interesting to others. This blog will be a place where I can post the outcomes of these endeavors.

I hope they prove to be useful to someone someday :)