And we’re off
Welcome to ReadCongress! Our site’s now officially in beta, and since you’re here, we should introduce ourselves.
ReadCongress is a not-for-profit, volunteer-driven open-government website dedicated to opening up the Congressional Record to serious inspection. That’s a lot of words and a lot of hyphens, so you might be legitimately wondering what all that means.
- What is the Congressional Record? It’s the official transcript of Congressional business. All the speeches, motions, votes, everything that happens in the House and the Senate, and transcripts of all the committee hearings.
- And this matters because?
- If you want to make sense of what actually goes down in Congress, this is your primary source.
- It’s also the kind of information that might be useful in an election season like the one that spawned this site: how a legislator’s position changes over time or what positions he or she took in the past that might not be so useful now (see McCain, John, on deregulation).
- Finally, the Congressional Record can be used in judicial proceedings to figure out what the original legislative intent was, and hence what the law means and how it should be applied.
- Isn’t it already available? The Government Printing Office has published the Congressional Record and the committee hearing transcripts in electronic format since 1994. (That’s where we got them.) Some of the record is also available on C-Span’s website, where it’s linked to their video.
- So why another site? There are drawbacks to both of those sites. The GPO has the complete records, but they’re stored as text files, so you can’t do searches on a particular person or get specific results without reading through the entire returned records. C-Span has done some of the parsing by individual span and links to the video (which is really cool), but they simply don’t seem to have the entire record (they show 4 results for McCain on abortion, whereas our full import has 17). Our goal is to improve on those sites by taking the complete Congressional Record and making it smart: offering full text search, the ability to look deeply into the text and thoroughly through a specific legislator’s record, a pleasant, usable interface, and tools to tag, clip quotes, and otherwise identify the interesting stuff out of all those words.
- So the whole Congressional Record is here? So far, we’ve loaded the Senate’s floor speeches and committee hearings since 1995: 1.4 million individual speeches. We’ll be loading the 1994 record soon (it’s in a different format). Once that’s done, the next step is adding the House data, and the final step is code to automatically retrieve new content each day so the site is always up to date.
- Awesome! I can’t wait to dig in and find some stunning quotes that’ll change the course of the election. Good luck :) What we have here is great research material, a solid source on what our legislators actually do in their day jobs. There isn’t a lot of shocking, headline-worthy material in here (you don’t confess that you hate small-town or big-city America on the floor of the Senate), but there is a lot of discussion of what our legislators believe and why they voted as they did. (So while you won’t find Ted Stevens admitting he took kickbacks for his influence, you’ll definitely find McCain opposing abortion or explaining why he supports deregulation, both useful things to know.)
We’ve mentioned ourselves in passing several times in the course of all this, so you might also be wondering who we are. Personally, I’m Alex, the main developer for ReadCongress. I started the site about seven months ago, after California voted in the primaries, as a way to help out for the general election and to make something that other people might find useful (and to burn off some election-related stress, too). I’ve been very excited to see how a small, McCain-focused research site has turned into a functional resource for the entire Congress. It’s been a lot of fun and an incredible learning opportunity, as well as a chance to work with some talented people (props to Luke, Alex, and Adam, the others who’ve helped out on areas like the upcoming Facebook integration, the user interface, and so on).
Finally, a few technical details. We built ReadCongress using Ruby on Rails running on MySQL, with Ferret providing the engine for full text searching. On the front end, it’s all Prototype and Scriptaculous, plus, of course, all the Javascript, CSS, and HTML we wrote by hand. With just the Senate imported, we already have 113,052 documents in our database, containing 1,432,130 speeches. We also have a lot of work to do to make the site better. Here are our top priorities:
- Finding better servers. Now that the site’s in beta, budget hosting’s not really gonna cut it. You deserve better response times.
- Getting a graphic designer (any volunteers?). We know the site ain’t pretty.
- Upgrading the search engine, particularly so that you can search someone’s name plus a search term (e.g. “Obama abortion”) and get the right results. Ferret, the search technology we’re using, has a lot of cool parameters, and I’m looking forward to playing with them.
- Adding the remaining data: the House’s records, the Senate’s 1994 record, and a few records that errored out in the import.
- Adding Facebook Connect, a fun little project to connect the site to Facebook so that you can log in easily and share anything neat that you find.
- Making Internet Explorer work. The site’s just not as nice on that platform, and that really needs to be fixed, even though it’ll be painful.
- Building additional record-tweaking tools to fix any import errors or other problems without needing to monkey with the database.
- Making this blog site look more like the main site, and less fugly.
- Lots of general site improvements, bug fixes, and tweaks.
So, that’s about it. Take a look around, read some records, and enjoy! We’d love to hear any thoughts, comments, complaints, love, hate, whatever, at alex _at_ readcongress.org. If you’re a Rails developer or a graphic designer and want to help out, that would be great. We want this to succeed and become a useful open-government resource, and there’s more than enough work to go around.
Cheers,
Alex