We are attempting to employ network tools this week. My “voices” on the Santa Barbara Oil Spill did not seem to connect in useful ways, so I turned to a different source: the Biographical Directory of Federal Judges. This database contains extensive biographical information on more than 3,000 Federal Judges dating back to the colonial era, so it is extremely useful. Unfortunately, it is also inordinately messy and untidy, containing over 200 columns (“cleaning” the data was an exercise in Clio3). With my R data cleaning skills and OpenRefine (formerly Google Refine), I was able to arrive at the columns I would need for some simple network analysis of the judges.
I chose to examine only non-white judges. They made up about 10% of the entire Federal Judgeship. I expected that, due to legacies of segregated schooling, most of the non-white judges would come from a limited number of traditionally African American colleges and law schools. These networks of judges would provide their school peers with clerkships and other opportunities, a pattern that would be quite common within African American church and business communities. However, this did not appear to be the case at all for Federal judges. The 368 non-white judges represented over 200 schools (my data). Due to the nature of the Directory data, the schools might represent a judge’s undergraduate degree or their law degree, though attending Harvard Law or Harvard College was normalized into Harvard University. Even with this normalization, there were few school “networks” according to Palladio:
The schools network in RAW was similarly attractive, yet fragmented:
Ultimately, I had major problems with all of the networking applications. They had strict data requirements, were buggy, and some aspects did not work on my computer. Gephi’s data requirements made it unusable for this particular project, though it clearly has some power if one is able to scale its steep learning curve. RAW provided code to embed in blog posts, but WordPress stripped all HTML tags, so the embed appeared as just a list of colleges (this might be a WP problem instead of a RAW problem). Fortunately, I don’t expect to need much network analysis in my work on the Santa Barbara Oil spill.
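For anyone who wants to check how fragmented a judge-to-school network is without going through Palladio, RAW, or Gephi, here is a minimal Python sketch. It is not the workflow used above (that was R and OpenRefine). The rows, the judge and school labels, and the function names `schools_to_judges` and `shared_schools` are all hypothetical placeholders, not Directory data. The idea is simply to group judges by school and see how many schools are shared by more than one judge.

```python
from collections import defaultdict

# Hypothetical (judge, school) rows; placeholders only, not Directory data.
rows = [
    ("Judge A", "School X"),
    ("Judge B", "School X"),
    ("Judge C", "School Y"),
    ("Judge D", "School Z"),
    ("Judge E", "School Y"),
    ("Judge F", "School W"),
]


def schools_to_judges(pairs):
    """Group judges under each school, trimming stray whitespace."""
    by_school = defaultdict(set)
    for judge, school in pairs:
        by_school[school.strip()].add(judge.strip())
    return dict(by_school)


def shared_schools(by_school, min_judges=2):
    """Keep only schools attended by at least `min_judges` judges."""
    return {
        school: sorted(judges)
        for school, judges in by_school.items()
        if len(judges) >= min_judges
    }


by_school = schools_to_judges(rows)
clusters = shared_schools(by_school)
# Schools with a single judge cannot link anyone to anyone else.
singletons = len(by_school) - len(clusters)

print(clusters)
print("schools with only one judge:", singletons)
```

If most schools land in the single-judge count, the network will look like the scattered picture described above, whichever tool draws it.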
I revisited some of my Santa Barbara Voyant visualizations from last week. I had only begun to play with the categories and word frequency visualizations, so I thought I would embed a few more to showcase this tool more fully. Many of these visualizations will make more sense after reading the background to last week’s practicum.
Last week I checked “oil”, “spill” and some other similar words in various word frequency counters. Voyant also mapped these frequencies:
It was also interesting, in light of this week’s network analysis, to see networks between the words in the letters to the editor:
While the network tools evaluate the actual connections between two nodes, Voyant’s collocate tool simply evaluates how often two words appear near each other. While these visualizations look similar, they rest on very different statistical analyses “inside the box.”
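To make that difference concrete, here is a rough Python sketch of window-based collocate counting. It is a simplified illustration of the general idea, not Voyant’s actual algorithm, and the function name `collocate_counts`, the `window` parameter, and the sample sentence are all made up for the example. Note that a “link” between two words here is nothing more than a proximity count; no recorded relationship between two entities is involved.

```python
import re
from collections import Counter


def collocate_counts(text, window=2):
    """Count how often two different words occur within `window` tokens of each other."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, word in enumerate(tokens):
        # Look only ahead so each nearby pair is counted once per occurrence.
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                counts[tuple(sorted((word, other)))] += 1
    return counts


# Made-up sample text, not taken from the letters to the editor.
sample = "The oil spill ruined the beach. The oil spill angered the town."
counts = collocate_counts(sample, window=1)

for pair, n in counts.most_common(3):
    print(pair, n)
```

This sketch ignores sentence boundaries and stop words, so common words like “the” pair with almost everything, which is one reason collocate graphs can look densely connected even when the underlying texts are not.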