Aaron Van Bokhoven / aaronvb

I am a Software Developer and a Portrait Film Photographer.
  • I love coding with Ruby and Ruby on Rails, and I love shooting with film.
  • I currently live in Honolulu, Hawaii, but frequent Chicago and California.
  • I believe that programming is a form of art, like painting and photography, where you can express your ideas and logic and see it transform into something real.
Aaron Van Bokhoven
photo: kipkeston
Email me at bokhoven@gmail.com. View my Photography Portfolio and my Photo Blog (tumblr). I'm also on Twitter.

My current work:

Hawaii Photo Rental
http://www.hawaiicamera.com
Buoy Alarm
http://buoyalarm.com
Flux Hawaii Magazine
Summer 2011, Fall 2011, Winter 2011(cover) issues
http://fluxhawaii.com

Open source contributions:

Coffee-Resque (my fork)
https://github.com/aaronvb/coffee-resque
Pow
http://pow.cx (github)
My Gists
http://gist.github.com/aaronvb

You can also find me at:

Github
http://github.com/aaronvb
Photography portfolio
http://aaronvb.com/photo
Junkparty
http://junkparty.com
Flickr
http://flickr.com/photos/aaronvb

Jul 20, 2009

One of the main reasons I first used BackgrounDRb was for it's ability to use a cron like schedule that could reoccur at whatever time or date I specified. It was very easy infact, by just writing the cron schedule in the config file for the worker. I was satisfied and stuck with BDRb for just that reason, until I realized how much memory BDRb was using.

Not only was I relying on BDRb to handle a worker that would reoccur nightly, I had also moved a lot of features that didn't require immediate user feedback, to the background(ie. the message system). So not only was I looking for an alternative to handle my daily worker, but also everything else that needed regular background processing.

I did some research on the current background processing methods and came across an excellent blog post that provided me with a clean solution to all my problems. http://transfs.com/devblog/2009/04/06/goodbye-backgroundrb-hello-workling-starling.

Starling and Workling. There's tons of information on getting it setup. I recommend watching the Railscast by Ryan Bates - Starling and Workling.

Starling is the queue server, which Workling polls and then executes workers depending on the task. Easy enough to understand. Even better, Starling is a separate process outside of your rails application. Meaning, if your rails application gets restarted or stopped, or if Workling is off, everything in the Starling queue gets saved. That's great and all, but the fact that Starling is a separate process and is using the the same design as memcache, means I can slip anything I want into the Starling queue outside of my rails application! To rephrase that, I'm able to start a background process in my rails application without having to be in my rails application.

Let's continue with some code examples.

In my project I have a TemporaryShoppingCart model which handles temporary shopping carts that a user might create while visiting my site. At the start of each day(00:00:00 on my server), I want a background worker to clear the temporary shopping carts, but also create a record of the event and record how many carts were removed. The record would be stored in another model, say RemovedTemporaryShoppingCart so I can access that data later through an admin page. Note, I'm not sure how practical this is for a real application, I'm just creating this example off the top of my head.

The first step is to create a method in TemporaryShoppingCart model that will do everything I stated above.


def self.empty_self
#create a new record in RemovedTemporaryShoppingCart model with the total carts removed
RemovedTemporaryShoppingCart.create(:total => self.all.length)
#deletes all records in this model. http://api.rubyonrails.org/classes/ActiveRecord/Base.html#M002275
delete_all
end

Nothing fancy there. Trying to keep as much code in the model rather than in the worker. Next I create the Workling worker.


#app/workers/daily_worker.rb
class DailyWorker < Workling::Base
def empty_temporary_shopping_carts(options)
TemporaryShoppingCart.empty_self
end
end

Make sure to restart Workling and the rails application to generate the new worker, and that Starling is running as well. Normally I can start the worker through the rails app by 'DailyWorker.async_empty_temporary_shopping_carts()' but that's not what I want to do in my case.

If you watched the Starling Workling Railscast by Ryan Bates, near the end of the cast, Ryan shows you how Starling works by connecting to it through a memcached protocol. My next example is the exact same except with real data to start the worker.

Load up irb from the terminal by typing 'irb'.


>> require 'rubygems'
=> false *returned false because rubygems is already initialized*
>> require 'memcache'
=> true
>> starling = MemCache.new('localhost:22122') *change ip/port if not default*
=>
>> starling
=>
>> starling.set 'daily_workers__empty_temporary_shopping_carts', {}
=> "STORED\r\n"
>>

There should be a new record in RemovedTemporaryShoppingCart model which means the worker did the job. Great I just started a worker inside my rails application from outside my rails application. So now how do I get it to run on a schedule? Thankfully in that first link that I mentioned, the author realized they could use the built in cron feature in the linux operating system to run a ruby script similar to what we did in the irb above.


#!/usr/bin/env ruby
#lib/nightly_worker_for_cron.rb
require "rubygems"
require "memcache"

host = "localhost"
port = 22122
starling = MemCache.new "#{host}:#{port}"

case ARGV[0]
when 'empty_temp_carts'
starling.set 'daily_workers__empty_temporary_shopping_carts', {}
end

This file can be anywhere but for ease of editing access I threw it in my 'lib' folder and also named it 'nightly_worker_for_cron.rb'.

Now to add it to my crontab I just type 'crontab -e' in my console and insert '0 0 * * * ruby /the_path_to_my_rails_app/lib/nightly_worker_for_cron.rb empty_temp_carts' on any line then save. Done!

I wrote this to mainly show people how useful Starling and Workling can be. It might not be as easy to setup as delayed_job or straight forward, but there are a lot of neat things that it can do that delayed_job cannot. Instead of trying to compare which one of these background methods is better, decide on what you NEED your application to do and figure out which one will work for you.


Jul 19, 2009

I spent several days searching for ANY information on background uploads to Amazon S3 with Paperclip and I just couldn’t find anything concrete. There were a few posts on the Paperclip Google Code board that talked about it but with no clear examples of how to do it.

Anyway, we know it’s very easy to add S3 support into Paperclip, and there’s tons of information on setting that up. It works flawless and it’s simple, yet your upload now takes twice as long in comparison to using your servers filesystem for storage. The problem is pretty obvious and I’ll explain why in this example in case you’re wondering.

Example A(filesystem)

1) User uploads a file
2) Application re sizes file into thumbnails

Example B(Amazon S3)

1) User uploads a file
2) Application re sizes file into thumbnails
3) Application then connects to Amazon S3 and uploads thumbnail
4) Repeat 3 until each thumbnail is uploaded

As you can see in Example B, the more thumbnails you need the longer your upload is going to take because Paperclip needs to send each file to Amazon S3 one at time. All the while a user is still waiting for the website to respond back. If you use New Relic RPM you’ll notice that your Apdex gets destroyed by this. Which is basically telling you that a user is waiting longer than a second(more like 3-5 seconds) for an upload process.

Although there’s nothing that can be done about the initial upload, why does the user need to wait around while the application uploads to Amazon S3(which could take 2-5 seconds longer)?

I worked on several ways to get Paperclip to process it’s Amazon S3 storage in the background that led me to more complications than a solution. I realized I wanted to try my best to not modify the original Paperclip code so there wouldn’t be any problems updating Paperclip in the future.

The next idea was to have some sort of ‘syncing’ to Amazon S3 system. Let Paperclip upload to the filesystem, then have a background process move the files to Amazon S3, then delete the files off the filesystem.

Let’s assume you already have your rails app, starling/workling installed and running, and paperclip installed and working on the model of your choice. Let’s also assume we’re applying this to a User model and the attachment is an avatar. You’ll also need the aws-s3 gem(http://amazon.rubyforge.org) and add that to your environment.rb.


sudo gem install aws-s3

#environment.rb
config.gem 'aws-s3', :lib => 'aws/s3'

First let’s modify the Paperclip settings in our User model.


has_attached_file :avatar,
:styles => { :large => "800x800&>", :medium => "300x300&>", :thumb => "100x100#", :tiny => "50x50#", :really_tiny => "25x25#"},
:storage => :filesystem,
:url => "http://s3.amazonaws.com/webapps3bucketname/:attachment/:id/:style/:basename.:extension",
:path => "tmp/paperclip_uploads/:attachment/:id/:style/:basename.:extension"

The :url is set to the Amazon’s S3 link so that when we generate urls with object.avatar.url(:medium) it will output the Amazon S3 url instead of the filesystem path. I also set the :path to the location I want the file to be uploaded to and re sized. I chose the tmp folder since it seemed appropriate. (Note: I’m considering moving the location to system because it’s a shared folder across capistrano.)

A new column is needed for this model to let us know when the background uploading is being processed.


script/generate migration add_processing_upload_to_user processing_upload:boolean
rake db:migrate

It’s your choice if you want to set the default to true, assuming upon creation of a User record that an upload will be included. In my case, a user gets created first and if the user decides to upload an avatar, it’s done through the user settings later, so I set the default to false.

Next we’ll setup the worker which will handle the upload to Amazon S3.


#app/workers/paperclip_background_upload_worker.rb
class PaperclipBackgroundUploadWorker < Workling::Base
def upload_to_amazon_s3(options)
logger.info("Begin uploading #{options[:type]} to Amazon S3...")
#find the user and the avatar
@user = User.find_by_id(options[:id])
@avatar = @user.avatar

#connect to amazon s3
logger.info("Connecting to Amazon S3...")
AWS::S3::Base.establish_connection!(
:access_key_id => 'ACCESSKEYID',
:secret_access_key => 'SECRETACCESSKEY'
)
#first we store the original
logger.info("Uploading original image...")
AWS::S3::S3Object.store(@avatar.path_s3(:original), open("#{RAILS_ROOT}/#{@avatar.path(:original)}"), 'YOUR_AMAZONS3_BUCKET_NAME', :access => :public_read)
#then thumbnails
@avatar.styles.each do |key, value|
logger.info("Uploading #{key} image...")
AWS::S3::S3Object.store(@avatar.path_s3(key), open("#{RAILS_ROOT}/#{@avatar.path(key)}"), 'YOUR_AMAZONS3_BUCKET_NAME', :access => :public_read)
end
#start clean up, files first then directories
logger.info("Removing original from local filesystem...")
#remove original file
File.delete("#{RAILS_ROOT}/#{@avatar.path(:original)}")
#remove thumbnail files
@avatar.styles.each do |key, value|
logger.info("Removing #{key} from local filesystem...")
File.delete("#{RAILS_ROOT}/#{@avatar.path(key)}")
end
#delete directories
logger.info("Removing directories...")
#remove original folder
Dir.delete("#{RAILS_ROOT}/tmp/paperclip_uploads/avatars/#{@user.id}/original")
#remove thumbnail folders
@avatar.styles.each do |key, value|
Dir.delete("#{RAILS_ROOT}/tmp/paperclip_uploads/avatars/#{@user.id}/#{key}")
end
#remove avatar user_id folder
Dir.delete("#{RAILS_ROOT}/tmp/paperclip_uploads/avatars/#{@user.id}")

#set processing to false now that we are finished
@user.processing_upload = false
@user.save

logger.info("Finished uploading to Amazon S3.")
end
end

This is pretty simple. The worker uses the aws-s3 gem to upload the files and once it finishes they get removed from the filesystem. You’ll need to include your Amazon S3 Access Key ID and Secret Access Key and also fill in your Amazon S3 Bucket name that the files are getting uploaded to where it says ‘YOUR_AMAZONS3_BUCKET_NAME". You may also notice a new method named ’path_s3’ which I had to add to Paperclips attachment file.

Open up the ‘attachment.rb’ file in the ‘vendor/plugins/paperclip/lib/paperclip/’ folder. Under ‘def initialize’ add the two ‘@path_s3 =’ lines code(around line 47):


@dirty = false
#a new hash value and key for the s3 path we define in the model
@path_s3 = options[:path_s3]
@path_s3 = @path_s3.call(self) if @path_s3.is_a?(Proc)

normalize_style_definition
initialize_storage

Then scroll down, in the the same file, to about line 113 where there’s a method named ‘path’(def path style = nil #:nodoc:). Under that method we are going to make a new one.


def path style = nil #:nodoc:
original_filename.nil? ? nil : interpolate(@path, style)
end

#Added this for background uploading.
#This returns the path to the attachment in the s3 bucket.
def path_s3 style = nil
original_filename.nil? ? nil : interpolate(@path_s3, style)
end

This will let us define a new key value(:path_s3) in the has_attached_file hash in the User model. If you’re wondering what the ‘path_s3’ does, it determines the path within the Amazon S3 Bucket that the file gets saved to and by having it in our has_attached_file options, we can change this if we need to later on. Which I’ll show you next.

Back to the User model, we add a new line in the has_attached_file hash:


has_attached_file :avatar,
:styles => { :large => "800x800&>", :medium => "300x300&>", :thumb => "100x100#", :tiny => "50x50#", :really_tiny => "25x25#"},
:storage => :filesystem,
:url => "http://s3.amazonaws.com/webapps3bucketname/:attachment/:id/:style/:basename.:extension",
:path => "tmp/paperclip_uploads/:attachment/:id/:style/:basename.:extension",
:path_s3 => ":attachment/:id/:style/:basename.:extension"

Don’t forget to restart the server since the Paperclip plugin was modified and also restart workling to generate the new workling we created earlier.

The last part is to call the worker through Starling. In this example, the file upload is being handled normally through the User controller and the Update action.


def update
@user.attributes = params[:user]
#only set the processing_upload to true if an avatar file was uploaded
#if the file doesn't pass valiations, @user wont get saved and processing_upload stays false
@user.processing_upload = true if params[:user][:avatar]
if @user.save
flash[:notice] = "Your profile was successfully updated!"
#tell starling to start the background worker only if processing_upload is true
#the id of the user gets passed to let our worker know which files to upload
PaperclipBackgroundUploadWorker.async_upload_to_amazon_s3(:id => @user.id) if @user.processing_upload
#redirect to the edit page
redirect_to edit_user_path
else
#save failed
redirect_to :action => 'edit'
flash[:notice] = "Something went wrong while trying to update your profile."
end
end

That’s it! The next step is to create a way to let your users know that the file is being processed. All you have to do is check if processing_upload = true.

This is my first tutorial type of post so bear with me if I missed anything and please let me know asap! Also if you have any experience doing something similar to this, let me know!


Apr 27, 2009

It's been a while since I've developed some black and whites, mostly shooting color these days. Here's a few from Cappadocia and Istanbul.

Hasselblad + Distagon + T-Max 100

img465

img454

img468

img473

img475


Apr 20, 2009

The second roll from Istanbul. I hate dust. I know most of these are portraits, but wait until I start scanning the slide film I took on this trip!

hasselblad + planar/distagon + fuji pro 400h. click the photos for bigger view and captions.

img451

img450

img448

img447

img445

img452

img444

img442


Apr 15, 2009

Finished scanning my first roll from Turkey. It ended up going through this old scanner at the Aya sofya, and I'm guessing that's the reason behind the strange colors and artifacts. No worries though.

hasselblad + planar/distagon + fuji pro 400h. click the photos for bigger view and captions.

img361

img362

img363

img365

img371



© Aaron Van Bokhoven