Tuesday 11 December 2012

Dusting off my climate authors site

Oof!

A new open letter about climate change, published last month in Canada's Financial Post, got me busy going back into my website covering who signs such things and how seriously we should take them. I'm preparing a new post on this letter in particular, but I'm also going back and dusting the cobwebs off the rest of my web listing of climate scientists, statements, petitions and so forth. 

After having worked at length in 2009 and 2010 to get the site to its current state, I gave myself a well-earned rest from updating it for a while. I devoted free time to other pursuits I also value, such as upgrading the energy efficiency of our 1920 Toronto home, in time to claim some energy-retrofit tax credits that were set to expire. I made it in time, and had fun in the process. I upgraded the insulation in my attic (after struggling mightily to seal air leaks around the fixtures in the 2nd floor ceiling). By renting the machine to blow in the shredded fiberglass and doing the job D-I-Y, I actually got back the full cost of that upgrade.

I also had our foundation dug out and waterproofed, then insulated the basement and headers, which were big areas of heat loss. I had new high-performance windows installed in the basement as well (we had the first and second floor windows done already some years back). The basement was labour-intensive and I felt like I was holding down two jobs, but I did get some help by hiring a couple of neighbourhood youth (at or above the youth minimum wage). When it came time for drywalling, I hired some tradesmen. I took digital photos at each stage, and presented a before-and-after album to the inspector who verifies the work for the government rebates. He said mine was the best-documented job he'd ever inspected 8-) And yes, our heating bill was considerably lower the next winter - but then, the weather was a good deal milder too, so I don't yet have a clear read on how much good this did (I should probably do some math involving degree-days).
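The degree-day math mentioned above could look something like the following minimal Python sketch. It divides each winter's heating fuel use by that winter's heating degree-days, so that a mild winter does not get credit for savings that really belong to the weather. The function names are hypothetical and every figure is an invented placeholder, not a reading from the house described here.

```python
def use_per_degree_day(fuel_used, heating_degree_days):
    """Return fuel consumed per heating degree-day for one season."""
    if heating_degree_days <= 0:
        raise ValueError("heating_degree_days must be positive")
    return fuel_used / heating_degree_days


def weather_adjusted_saving(before, after):
    """Fractional drop in fuel per degree-day between two seasons.

    `before` and `after` are (fuel_used, heating_degree_days) pairs.
    """
    rate_before = use_per_degree_day(*before)
    rate_after = use_per_degree_day(*after)
    return 1.0 - rate_after / rate_before


# Placeholder numbers only (say, cubic metres of gas and degree-days).
winter_before = (3000.0, 4000.0)   # colder winter, before the retrofit
winter_after = (2100.0, 3500.0)    # milder winter, after the retrofit

saving = weather_adjusted_saving(winter_before, winter_after)
print(f"Weather-adjusted saving: {saving:.0%}")
```

With these invented figures the raw fuel use falls by 30%, but only 20% of that remains once the milder weather is taken into account.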

Once that huge job was done, I had some actual leisure to catch up on reading, both fiction and non-fiction. I'm a pretty voracious reader, and now with e-books my ever-growing pile of pending 'must reads' extends into cyberspace (and forks between Kobo, Kindle, and Goodreader... sigh. I realised today I have a problem remembering which e-book space a given unfinished title is languishing in.)

So when I started looking over my website, I found a lot of links had gone stale. First I reviewed my list of sources - links back to the original documents from which I noted who had signed which statements and letters. Sadly, several more of these had gone missing; fortunately each lost document is still housed at www.archive.org, so I've updated my links to that. I don't actually understand how they pay for this free service, and I got to wondering what will happen to people like me if that site ever goes under. {Tangent} Even worse would be the loss of a site like bit.ly, tinyurl, or any of the big URL shortening services. If one of them simply folded without donating their existing link base to the public good, a *whole* lot of web content would start to unravel. Just try not to think about this... {/Tangent}

Next I went back to my big long list of names, citation stats and website links for authors. I thought of several new tidbits to start collecting: who has a Twitter account now? (32 found so far, many more to come I'm sure.) Who is written up at Wikipedia? (a column I started some time back and am just getting rolling on filling in.) Who has ties to coal, oil & gas interests? Who has an author profile page at Google Scholar? (Neat new feature, solves the problem of separating an author's own works from those of others with similar names; also shows Google's results on handy stats like h-index.) Who is written up in SourceWatch or in DeSmogBlog's reference database?

After going back to data entry mode for a number of days, I thought I'd look at the scripts I use to extract the data from Excel, format it for both HTML and JSON/jQuery, and sort and summarize the results. I enjoy this kind of coding immensely, but I hadn't run this for quite some time and had to get my bearings a bit. After getting errors from code I know ran fine at the last batch run, I realised I'm not supposed to run it under Solaris, but on a Linux host. I then found that the code in Perl's Spreadsheet::XLSX module doesn't like it when your spreadsheet contains any formula error warnings - so I learned to use the "Review errors" button in the Formulas tab. (Turns out I'd just typed a stray letter into a column where numbers belong, which another cell referenced in a calculation. Cleared it - problem solved.)
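A stray letter in a numeric column is the sort of thing a small pre-import check can catch before the formulas ever complain. Here is a minimal sketch in Python (not the Perl of the actual scripts), assuming the rows have already been read into a list of dictionaries; the function name, the "citations" column and the sample rows are all invented for illustration.

```python
def find_non_numeric(rows, column):
    """Return (row_number, value) pairs where `column` is not a number.

    `rows` is a list of dicts, one per spreadsheet row; row numbers
    start at 2 on the assumption that row 1 holds the headers.
    Blank cells are skipped rather than reported.
    """
    problems = []
    for row_number, row in enumerate(rows, start=2):
        value = row.get(column)
        if value is None or str(value).strip() == "":
            continue
        try:
            float(value)
        except (TypeError, ValueError):
            problems.append((row_number, value))
    return problems


# Invented sample rows; "citations" is a hypothetical column name.
sample = [
    {"name": "Author A", "citations": "1520"},
    {"name": "Author B", "citations": "q"},      # the stray letter
    {"name": "Author C", "citations": ""},
    {"name": "Author D", "citations": 87},
]

for row_number, value in find_non_numeric(sample, "citations"):
    print(f"Row {row_number}: expected a number, found {value!r}")
```

Reporting the spreadsheet row number rather than the list index makes it easier to jump straight to the offending cell in Excel.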

Finally I had phase 1 of my script back up and running. I decided to activate the routine I'd added to check all the URLs for broken links as they are imported. This takes a lot longer, but with two years since the last pass, they really needed re-checking. 

The results were stark. Some 28% of the URLs I collected in 2009-10 have gone 404 in the intervening two years. A large majority of the URL base is on academic websites. I guess everyone feels compelled to "improve" their sites with a big re-org once in a while... sigh. That means lots of manual searches to see what new paths everyone got reorged off to. Here are the specific numbers:

Finished URL checks. Found 3029 okay, 1153 broken
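For anyone curious what a link-checking pass like this involves, here is a minimal sketch in Python, using only the standard library. It is not the Perl routine that produced the line above: the function names are hypothetical, it assumes a HEAD request is enough to tell a live page from a dead one (some servers refuse HEAD, so a real run might need to fall back to GET), and it treats any 2xx or 3xx status as okay.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for `url`, or None if unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code          # e.g. 404 for a page that has moved
    except (urllib.error.URLError, OSError, ValueError):
        return None                # DNS failure, timeout, malformed URL


def check_urls(urls, fetch=fetch_status):
    """Split `urls` into (okay, broken) lists using `fetch`."""
    okay, broken = [], []
    for url in urls:
        status = fetch(url)
        if status is not None and 200 <= status < 400:
            okay.append(url)
        else:
            broken.append(url)
    return okay, broken


def summarize(okay_count, broken_count):
    """Format a one-line summary plus the broken share of the total."""
    total = okay_count + broken_count
    share = broken_count / total if total else 0.0
    return (f"Finished URL checks. Found {okay_count} okay, "
            f"{broken_count} broken ({share:.1%} broken)")


# The two counts reported above.
print(summarize(3029, 1153))
```

Applied to the two counts reported above, the broken share works out to 27.6%. Passing the fetch function in as a parameter means the sorting logic can be tried out with a stand-in before turning it loose on thousands of real URLs.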

I checked URLs for homepage and mugshot photo that I'd collected. I may just give up on trying to have a photo link for each person, though I thought that was nice to include. It's just a lot of added search time that might be better spent on other tasks.

I see I'll also want to add an option to my script to verify the other types of URL I've started gathering: Wikipedia, DeSmogBlog and SourceWatch, plus Twitter, since any Twitter handle can be formed into a URL for the person's Twitter profile. These at least should have a lot less turnover than the university homepage ones turn out to have.
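Turning a handle into a checkable URL is nearly a one-liner. A minimal Python sketch, assuming handles may be stored with or without the leading "@" and that profiles live at twitter.com followed by the handle; the function name and the sample handle are placeholders.

```python
def twitter_profile_url(handle):
    """Build a profile URL from a handle, with or without a leading '@'."""
    cleaned = handle.strip().lstrip("@")
    if not cleaned:
        raise ValueError("empty Twitter handle")
    return f"https://twitter.com/{cleaned}"


# "example_handle" is a placeholder, not a real entry from the list.
print(twitter_profile_url("@example_handle"))
```

The resulting URLs can then go through the same broken-link check as the homepage links.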

Watch this space for more on this big push, including a "top climate science tweeps" report - some really big names are showing up on Twitter now. It's quite exciting.
