Creed, Cult, and Code: June 2008

I wish I had a clip reel I could roll after a dude with a really deep voice said, "Previously on Parsing YAML files in Ruby". But I don't. So here's a link.

Ruby is Narnia. I spend my real life in C#, a perfectly serviceable language. I'm comfortable there, I kind of know my way around, and I've come to depend on it to make my living. But when I have a few spare moments here and there, I get to wander off into this magical fairy-land and have adventures with strange and wonderful creatures. Like the YAML. In case you forgot what our YAML looks like, here he is:

---
shared paths:
  build share : \\buildshare.mydomain.com\Builds

local paths:
  references  : \references

custom assemblies:
  - location: \Dev\Components\Business\Core\Trunk\Latest\Debug
    assemblies:
      - name : MyNamespace.Core
        files:
          - binary    : MyNamespace.Core.dll
          - debug     : MyNamespace.Core.pdb
          - document  : MyNamespace.Core.xml

  - location: \Dev\Components\Framework\Trunk\Debug
    assemblies:
      - name : MyNamespace.Framework.Core
        files:
          - binary    : MyNamespace.Framework.Core.dll
          - debug     : MyNamespace.Framework.Core.pdb
          - document  : MyNamespace.Framework.Core.xml

vendor assemblies:
  - location: \vendor\DotNet Commons\Logging\2.0
    assemblies:
      - name : Dotnet.Commons.Logging
        files:
          - binary    : Dotnet.Commons.Logging.dll

testing assemblies:
  - location: \vendor\Nunit\2.4.3
    assemblies:
      - name : NUnit.Framework
        files:
          - binary    : nunit.framework.dll

  - location: \vendor\Rhino.Mocks\3.3.0.906
    assemblies:
      - name : Rhino.Mocks
        files:
          - binary    : Rhino.Mocks.dll
          - document  : Rhino.Mocks.xml
...
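As a quick refresher, here's a minimal sketch of what Ruby hands back when it loads a file shaped like this one. The YAML is a trimmed copy inlined as a string so the snippet runs on its own, and the constant names SAMPLE_YAML and SAMPLE_REFS are just ones I made up for the example.

```ruby
require 'yaml'

# A trimmed copy of the file above, inlined so the sketch is self-contained.
# The quoted heredoc tag ('YML') keeps the backslashes literal.
SAMPLE_YAML = <<~'YML'
  ---
  shared paths:
    build share : \\buildshare.mydomain.com\Builds

  vendor assemblies:
    - location: \vendor\DotNet Commons\Logging\2.0
      assemblies:
        - name : Dotnet.Commons.Logging
          files:
            - binary    : Dotnet.Commons.Logging.dll
YML

SAMPLE_REFS = YAML.load(SAMPLE_YAML)

puts SAMPLE_REFS.class                                # the top level is a Hash
puts SAMPLE_REFS['shared paths']['build share']       # a nested Hash lookup
puts SAMPLE_REFS['vendor assemblies'].class           # a list in YAML becomes an Array
puts SAMPLE_REFS['vendor assemblies'][0]['location']  # each list element is a Hash again
```

Note that the padding before each colon doesn't end up in the keys, so 'build share' and 'binary' can be looked up without trailing spaces.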

Last time I told you how easy it was to access data in *.yml files in Ruby. I've taken that idea a little further, and I cooked up this class:

   1: require 'yaml'
   2:
   3: class References
   4:   attr_accessor :debug_mode
   5:   def initialize(references_file_name='references.yml',debug_mode=true)
   6:     @refs = open(references_file_name) {|f| YAML.load(f) }
   7:     @debug_mode = debug_mode
   8:   end
   9:   def shared_root_directory
  10:     @refs['shared paths']['build share']
  11:   end
  12:   def get_filenames(assembly_list_name, *file_types)
  13:     get_node(@refs, assembly_list_name) do |assembly_list|
  14:       assembly_list.each do |packing_list|
  15:         get_node(packing_list, 'assemblies'){|assembly| get_names(assembly, packing_list['location'],file_types){|filename| yield filename}}
  16:       end
  17:     end
  18:   end
  19:   private
  20:   def concatenate(*locators)
  21:     concatenated = String.new
  22:     locators.each { |locator| concatenated << (locator =~ /\A(?!\\)/ ? '\\' : '') << locator.sub(/\\\Z/, '') }
  23:     return concatenated
  24:   end
  25:   def get_names(assembly, path,file_types)
  26:     get_node(assembly, 'files') do |file|
  27:       parse_filenames(file,file_types){|filename| yield concatenate(shared_root_directory,path,filename)}
  28:     end
  29:   end
  30:   def get_node(data_store, find_key)
  31:     yield data_store[find_key] if data_store.kind_of? Hash
  32:     data_store.each{|node| get_node(node, find_key){|subnode| yield subnode}} if data_store.kind_of? Array
  33:   end
  34:   def parse_filenames(file_node,file_types)
  35:     file_node.keys.each {|key| yield file_node[key] unless filter(key,file_types)} if file_node.kind_of? Hash
  36:     file_node.each{|element| parse_filenames(element,file_types){|value| yield value}} if file_node.kind_of? Array
  37:   end
  38:   def filter(key,file_types)
  39:     (!@debug_mode && key=="debug") || (!file_types.include?(key) unless file_types.empty?)
  40:   end
  41: end

With this References class, you can do something like this (pay attention - here's where it starts to get cool):

refs = References.new
refs.get_filenames("custom assemblies","binary"){|filename| puts filename}

And you get something like this:

\\buildshare.mydomain.com\Builds\Dev\Components\Business\Core\Trunk\Latest\Debug\MyNamespace.Core.dll
\\buildshare.mydomain.com\Builds\Dev\Components\Framework\Trunk\Debug\MyNamespace.Framework.Core.dll

Let's start with get_node() on line 30. This method is an iterator. I don't know why, but it took a long time for the lightbulb to go off in my head over Ruby's usage of the yield keyword. Turns out, it works just like all the Ruby books say it does. Really, why would they lie? In this case, on line 31, we're getting the value located in an element in the data_store hash picked out by the find_key variable, and yielding that value back to the calling method. And that calling method better have a code block to execute once it receives a value, or we're gonna get a big ol' runtime exception. For the get_filenames() call in our script, on line 13 we're saying, "Look in the top-most hash in the references.yml file and find me a node with a key called 'custom assemblies'". Remember: the way our YAML file is laid out, it's just a big, weird hash of arrays and hashes. We have to write code to ferret out the info we want, and in this case ultimately we want a list of filenames.
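If yield is still fuzzy, here's a stripped-down sketch of the same idea. The fetch_node method and the SETTINGS hash are hypothetical stand-ins I invented for this example, not part of the References class.

```ruby
# Hypothetical stand-in for the loaded references data.
SETTINGS = { 'shared paths' => { 'build share' => '\\\\server\\Builds' } }

# Hands the value stored under key to whatever block the caller supplied.
def fetch_node(data_store, key)
  yield data_store[key]
end

# The block receives the yielded value as its parameter.
fetch_node(SETTINGS, 'shared paths') { |node| puts node['build share'] }

# Calling without a block: yield has nowhere to go, so Ruby raises LocalJumpError.
begin
  fetch_node(SETTINGS, 'shared paths')
rescue LocalJumpError => e
  puts "no block given: #{e.class}"
end
```

That LocalJumpError is the runtime exception you'd hit if the caller forgot its block.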

There's another interesting thing happening on line 31. There's an if statement at the end of the line. If you tried to get away with something like that in C# land, they'd lock you up and throw away the key. Ahh, but here in Narnia, animals talk, trees walk, and all sorts of silly things happen. You can even say "hey, do this thing if this other thing is true", the way people do. No fussy brackets, or parentheses, or overly strict formatting rules to worry about. Line 31 takes care of the case when data_store is a hash. If data_store is not a hash, it's an array of hashes, and we take care of that on line 32, using a little recursion magic to get at the hash in each element of the array.
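Here's a small standalone sketch of that hash-or-array trick, written the same way as lines 31 and 32. The method name each_value_for and the PACKING_LISTS data are made up for illustration.

```ruby
# Same shape as get_node: a Hash is looked up directly,
# an Array is walked element by element with a recursive call.
def each_value_for(data_store, key)
  yield data_store[key] if data_store.kind_of?(Hash)
  data_store.each { |node| each_value_for(node, key) { |sub| yield sub } } if data_store.kind_of?(Array)
end

# Hypothetical data: an array of hashes, like a list of packing lists.
PACKING_LISTS = [
  { 'location' => '\Dev\A', 'assemblies' => ['A.Core'] },
  { 'location' => '\Dev\B', 'assemblies' => ['B.Core'] }
]

# Passing the Array: the recursive branch visits each Hash in turn.
each_value_for(PACKING_LISTS, 'location') { |location| puts location }

# Passing a single Hash: the first line handles it and the second is skipped.
each_value_for(PACKING_LISTS.first, 'location') { |location| puts location }
```

The trailing if is just a compact way of writing a one-line if ... end block; either form behaves the same.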

I think of the contents of 'custom assemblies' as an assembly_list, and each assembly_list contains packing_lists, each with a location and a list of assemblies at that location. Each assembly can have more than one file associated with it - in this case I've listed the binary dll, the debug symbol pdb file, and the xml document associated with our custom assemblies. I'm asking References to get just the binary files in 'custom assemblies'. Now that I've got the 'custom assemblies' node, I already know that the assembly_list is an array, so I can just iterate through it with .each to get each packing_list. (That's the reason I wrote get_node() in the first place - as I was learning about YAML and Ruby, I wasn't sure what object types I was dealing with as I drilled down through the YAML file. I could probably simplify the get_node() iterator now that I understand the structure of the file better, but I'll save refactoring for another time.) A packing_list is a hash containing a 'location' that can have multiple 'assemblies'. Line 15 says, "Get me the filenames for every assembly in this packing_list, and I really only want the ones that match this list of file_types".
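To make the nesting concrete, here's a rough sketch that walks the same structure with plain .each calls, assuming every level has exactly the shape described above. It isn't the refactoring itself, and list_files and CUSTOM_YAML are hypothetical names I'm using only for this example.

```ruby
require 'yaml'

# One packing list from the file, inlined so the sketch runs on its own.
CUSTOM_YAML = <<~'YML'
  custom assemblies:
    - location: \Dev\Components\Framework\Trunk\Debug
      assemblies:
        - name : MyNamespace.Framework.Core
          files:
            - binary    : MyNamespace.Framework.Core.dll
            - debug     : MyNamespace.Framework.Core.pdb
YML

# Hypothetical helper: returns [location, file type, file name] triples.
# Assumes assembly list -> packing lists -> assemblies -> files, with no surprises.
def list_files(refs, assembly_list_name)
  rows = []
  refs[assembly_list_name].each do |packing_list|
    packing_list['assemblies'].each do |assembly|
      assembly['files'].each do |file|
        # Each file entry is a one-pair Hash such as { 'binary' => '...dll' }.
        file.each { |type, name| rows << [packing_list['location'], type, name] }
      end
    end
  end
  rows
end

list_files(YAML.load(CUSTOM_YAML), 'custom assemblies').each { |row| puts row.join(' | ') }
```

The trade-off is that this version breaks as soon as a level is missing or shaped differently, which is exactly the uncertainty get_node() was written to absorb.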

The * in front of file_types in the get_filenames() signature is Ruby's splat: it gathers any extra arguments into an array, which makes the file types optional. You don't have to specify the type of file you're looking for, and if you don't then you get back everything. But if you do specify a list of file_types, that list is used in the filter() method, which is called on line 35. Another cool Ruby-ish way of saying something: using the unless keyword. Line 35 says, "Take a look at all the keys in the file_node hash, and give me back the file_node value for each key unless the key should be filtered out."
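Here's a little sketch of the splat and the unless working together. It leaves out the debug_mode check, and pick_files and FILES are hypothetical names for this example only.

```ruby
# Hypothetical file list in the same shape as a 'files' node.
FILES = [
  { 'binary'   => 'MyNamespace.Core.dll' },
  { 'debug'    => 'MyNamespace.Core.pdb' },
  { 'document' => 'MyNamespace.Core.xml' }
]

# *file_types collects any extra arguments into an Array (empty if none are given).
def pick_files(files, *file_types)
  picked = []
  files.each do |file|
    file.each do |type, name|
      # Skip only when a list was given and this type is not on it.
      skip = !file_types.empty? && !file_types.include?(type)
      picked << name unless skip
    end
  end
  picked
end

p pick_files(FILES)                        # no types given: everything comes back
p pick_files(FILES, 'binary')              # just the dll
p pick_files(FILES, 'binary', 'document')  # the dll and the xml
```

Inside the method file_types is always an array, so the empty? check is all it takes to tell "no filter" apart from "filter by these types".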

I'm hoping to put plain old Ruby classes and YAML together with Rake so that I can sweep angle brackets out of my life forever, and I'll post my progress as I learn. Now I'm sure that there are better ways of expressing these things in Ruby. But I'm new here. I'm still enjoying my Turkish Delight and hot tea. I still have a lot to learn about Narnia, but for now it's back to the real world.



Wednesday, June 4, 2008

Parsing YAML files in Ruby - Part 2
