TL;DR If you’re using Ruby, want to perform operations on the values of a hash only at specific locations in the hash, check out hash_walker

A while back I worked at a Tokyo-based startup focused on building a crowd-sourced translation platform. The great thing about Gengo was that we (and they still do) offered a translation API that allowed clients to integrate and thus order huge amounts of translations, receive translated content back via callbacks and do whatever it is that clients do with translated materials.

On one of the projects I worked on for a potential client, as part of a trial project, I needed to process a huge amounts (ok, not huge, maybe around 2000 or so) JSON responses and grab the data, throw it over the Gengo API and once it was translated, re-insert the translated content into the JSON responses and send all the JSON responses in text files to the client.

Not too hard, but at the same time not entirely trivial because the client wanted to make sure that only specific pieces in the original JSON were translated. For example given the following JSON response:

Contrived JSON example (hash_walker_orig.json) download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
{
    "title": "Hilarious article",
    "subtitle": "Something people might find funny",
    "description": "super funny stuff",
    "system": {
        "categories": [
            "high ratings",
            "link bait"
        ],
        "description": "really not that funny"
    },
    "author": {
        "name": "Some Body",
        "body": "Not too bad"
    },
    "categories": [
        "Life",
        "Happiness",
        "Jokes"
    ],
    "content": {
        "categories": [
            "Life",
            "Happiness",
            "Jokes"
        ],
        "body": [
        "lots of jokes here",
        "lots of jokes here again !",
        "last of the jokes !",
        ]
    }
}

The client wanted title, subtitle, description, content.categories, content.body translated, but NOT system.categories, system.descriptions etc. So in other words, it wouldn’t be good enough to just traverse the JSON and look for specific keys, it had to be dependent on where the key was. To make things even more interesting, there would be cases where keys were inside objects that were inside arrays inside objects, inside arrays, etc.

Since I like using Ruby and in Ruby, the natural data structure that JSON maps to is the Hash, I decided to build an app that would allow me to define some sort of “template” if you will, for the purpose of traversing the JSON responses involved in this project.

I came up with the hash_walker gem. Use it like you would any gem, require it, and you get a each_primitive_value_at method on all hash objects. For full details on how to use it, check out the project’s Github page; I’ll just be explaining the basics here.

Since a code example says a thousand words:

Code example for hash_walker (hash_walker_hash.rb) download
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
require 'hash_walker'

json_response = {
    "title"=> "Hilarious article",
    "subtitle"=> "Something people might find funny",
    "description"=> "super funny stuff",
    "system"=> {
        "categories"=> [
            "high ratings",
            "link bait"
        ],
        "description"=> "really not that funny"
    },
    "author"=> {
        "name"=> "Some Body",
        "body"=> "Not too bad"
    },
    "categories"=> [
        "Life",
        "Happiness",
        "Jokes"
    ],
    "content"=> {
        "categories"=> [
            "Life",
            "Happiness",
            "Jokes"
        ],
        "body"=> [
            "lots of jokes here",
            "lots of jokes here again !",
            "last of the jokes !",
        ]
    }
}

keys_to_read = [
        "title",
        "author" => ["name"],
        "categories",
        "content" => [
            "body"
        ]
    ]

json_response.each_primitive_value_at(keys_to_read){|value,path|
    puts "#{value}, #{path}"
}

# Output

# Hilarious article, ["title"]
# Some Body, ["author", "body"]
# Life, ["a_array", 0]
# Happiness, ["a_array", 1]
# Jokes, ["a_array", 2]
# lots of jokes here, ["content", body", 0]
# lots of jokes here again !, ["content", body", 1]
# last of the jokes !, ["content", body", 2]

If you have any suggestions, contributions or issues, please feel free to leave a comment here or on the repo itself !

Comments