Under the Hood of Web 2.0 : the top ten programming concepts for librarians to understand

So, you “get” the Library 2.0 thing, and you’ve been through the 23 things. You are up-to-date on web services like blogs, wikis and the like. You even know a bit of html. But you feel there is something missing in your understanding of Web 2.0. You do not want to get into programming your own web service, but you are the sort who liked those field trips you took to the local print shop. You want to see what’s under the hood of a Web 2.0 service so you can understand what all this stuff is really about. You saw The Machine is Us/ing Us and it got you part of the way there, but it seemed a little too dumbed-down for you. You want to *really* look under the hood of web services.

That said, your mind doesn’t work like a programmer’s. You are not going to be coding in PHP any time soon. You don’t like manipulating variables and handling loops and all that other weird systems-oriented stuff that the big O’Reilly books tell you about. And when you are honest with yourself, you understand that you will probably always choose television over web development.

The question you have in your mind is “Will I understand what is happening with emerging trends better if I understand more about programming?” My answer to this question is “yes.” But you do not have to learn how to program to understand programming. Most programming concepts are fairly benign; the devil is always in the details. The good news is that Google and Wikipedia are chock full of good information about most technology. The bad news is that the language techies use to describe what they do is shrouded in three-letter acronyms, technical jargon and obscure references to text-based adventure games. It’s as if they do not want you to know what they are talking about.

Well, I decided to come up with 10 programming concepts that could help you get a better understanding of what brings Web 2.0 about. That way, you don’t have to cover a lot of ground that may not matter to you. I wouldn’t exactly call this mandatory reading, but it is meaningful, trust me. On the other hand, you’ll have to accept some jargon at the same time — so this is not exactly beginner stuff. Either way, you’ll benefit from Googling at least one or two of these concepts, I guarantee it.

So here it is. 10 programming concepts that would benefit a librarian who spent five minutes reading about them.

  • Object-oriented Programming (OOP)

In a nutshell: Creating code that can be re-used in multiple programs.

Why it matters: OOP marked an important change in computing and is probably the main reason you are using a mouse instead of text-based commands. Back when I had my Commodore 64, I was taught to create programs in linear fashion. Do this. Do that. If this is the case, then do that; otherwise do something else. Programs had line numbers to show the order of operations (ahem, notice how, er, MARC still uses numbers?).

OOP is different. Instead of “do this, do that,” I create an object. I say “here is a cat. It has whiskers, and fur and a tail. It does some funny things like meow and scratch and puke furballs.” Then I can put that cat into my old-fashioned programs. I can call my cat George. George has long whiskers, blue fur and a bushy tail. He meows when he sees food and scratches when you pull his tail. That’s neat. Except, the real trick is that other programmers can make their own cats too with their own names, shapes and behaviors.

But that’s not all either. Set up a form and you can make it so that your users can create cats too. Aha! Now you know why OOP connects to library 2.0. OOP has been around for a while, but it is the kind of coding practice that makes open source and user-driven products possible.
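Here is a minimal sketch of the cat idea in Python (my choice of language for illustration; any OOP language would do). The class, the method names and the `cat_from_form` helper are all hypothetical, and the “form” is just a dictionary standing in for whatever a real web form would send.

```python
class Cat:
    """The reusable 'object': what every cat has and what every cat does."""

    def __init__(self, name, whiskers, fur, tail):
        self.name = name
        self.whiskers = whiskers
        self.fur = fur
        self.tail = tail

    def see_food(self):
        return f"{self.name} meows"

    def pull_tail(self):
        return f"{self.name} scratches"


def cat_from_form(form):
    """Build a cat from values a user typed into a (hypothetical) web form."""
    return Cat(form["name"], form["whiskers"], form["fur"], form["tail"])


# George, as described above.
george = Cat("George", whiskers="long", fur="blue", tail="bushy")

# A user-made cat: same class, different details, no new code needed.
fluffy = cat_from_form(
    {"name": "Fluffy", "whiskers": "short", "fur": "orange", "tail": "stubby"}
)

print(george.see_food())
print(fluffy.pull_tail())
```

The point to notice is that the meowing and scratching are written once, in the class, and every cat anyone creates gets them for free.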

How can you learn more (besides Wikipedia)? : Any good programming book will have a chapter on OOP.

  • Client-side scripting (AJAX)

In a nutshell: Client-side scripting means JavaScript. JavaScript is code that makes things happen by giving instructions to a user’s (or “client’s”) browser (like Internet Explorer or Mozilla Firefox). The major advantage of JavaScript is that it can change what is on the page without reloading the whole page.

Why it matters: AJAX (Asynchronous JavaScript and XML) is the major driver behind a lot of Web 2.0 services. The key to AJAX is that it loads up XML data from some server, then it uses that data while the user interacts with the page. Jazzed-up RSS feeds are one result of AJAX. Another common AJAX service is a “suggestion” service for searches. Meebo uses AJAX to provide online instant messaging via the web. Though often criticized for being too slow, AJAX can also be used for online gaming.

How can you learn more (besides Wikipedia)? : I like this article by PC Magazine.

  • Relational Database

In a nutshell: Have you ever used a spreadsheet and wished you could take the cells of a particular column and break them into further columns? A relational database makes that sort of thing possible, except it does it through a series of tables that are interconnected (related) and are later accessed through something called a query (a search).

Why it matters: I think most library school students are learning about relational databases in their MLIS classes, but it is important enough that it deserves mention. When David Weinberger speaks about everything being miscellaneous, he is really affirming the value of the relational database. Because the relational database uses relationships instead of a tree-style taxonomy, it reduces needless repetition of data and opens the door to unforeseen connections between different sets of data. The many-to-many relationship, in particular, really makes it possible to mash up huge masses of data from projects like Wikipedia in “god” databases like Freebase (while relational databases have been around for a long time, they have become fast and big enough now to handle huge amounts of data quickly).
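To make the many-to-many idea concrete, here is a small sketch using Python’s built-in `sqlite3` module. The table names, book titles and subjects are all made up for illustration. One book can have many subjects, one subject can cover many books, and a third “linking” table records which goes with which, so no title or subject is ever typed twice.

```python
import sqlite3


def build_catalogue():
    """Create a tiny in-memory database with made-up sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE subjects (id INTEGER PRIMARY KEY, name TEXT);
        -- The linking table is what makes the relationship many-to-many.
        CREATE TABLE book_subjects (
            book_id INTEGER REFERENCES books(id),
            subject_id INTEGER REFERENCES subjects(id)
        );
        INSERT INTO books VALUES
            (1, 'Caring for Cats'), (2, 'Pets on the Web'), (3, 'Building Web Pages');
        INSERT INTO subjects VALUES (1, 'Cats'), (2, 'Web design');
        INSERT INTO book_subjects VALUES (1, 1), (2, 1), (2, 2), (3, 2);
        """
    )
    return conn


def titles_for_subject(conn, subject):
    """The query: follow the relationships from a subject to its books."""
    rows = conn.execute(
        """
        SELECT books.title
        FROM books
        JOIN book_subjects ON book_subjects.book_id = books.id
        JOIN subjects ON subjects.id = book_subjects.subject_id
        WHERE subjects.name = ?
        ORDER BY books.title
        """,
        (subject,),
    )
    return [title for (title,) in rows]


catalogue = build_catalogue()
print(titles_for_subject(catalogue, "Cats"))
print(titles_for_subject(catalogue, "Web design"))
```

Notice that “Pets on the Web” turns up under both subjects without being stored twice. In a tree-style taxonomy it would have to live in one branch or be duplicated.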

How can you learn more (besides Wikipedia)? Hopefully library school taught you about relational databases. If not, you might want to play with a GUI(graphical)-style database product like Access. (For geeks: Yes, I know MySQL pwns Access, but I’m talking about using it to learn what a relational database is).

  • Server-side scripting (PHP, Perl, Java, Ruby etc.)

In a nutshell: Server-side scripts give instructions to the computer (the server) that hosts a webpage, instead of to the browser on the user’s desktop. Server-side scripts are usually faster than client-side ones, and most AJAX programming requires at least some server-side scripts.

Why it matters: It matters mostly for the same reasons that AJAX matters. Server-side scripts make the majority of the web happen. What has happened is that scripting languages like PHP are becoming easier for the neophyte programmer to understand (PHP uses a lot of natural language) and are therefore creating an environment where a lot more people are coding. Another thing that many people do not realize is that when you “view source” on a webpage, most frequently it is not a hand-written “page” of html that you are seeing, but code that is output by a server-side script like PHP.
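The “view source” point is easier to see with a sketch. The function below is a hypothetical stand-in for a server-side script, written in Python rather than PHP to keep one language across these examples. Nobody typed the resulting html by hand; the script assembles it from data each time a page is requested.

```python
from html import escape


def render_page(heading, titles):
    """Assemble an html page from data, the way a server-side script would."""
    # escape() keeps characters like & and < from breaking the markup.
    items = "\n".join(f"      <li>{escape(title)}</li>" for title in titles)
    return (
        "<html>\n"
        "  <body>\n"
        f"    <h1>{escape(heading)}</h1>\n"
        "    <ul>\n"
        f"{items}\n"
        "    </ul>\n"
        "  </body>\n"
        "</html>"
    )


# Made-up data; on a real site this would usually come out of a database.
print(render_page("New arrivals", ["Caring for Cats", "Cats & Dogs"]))
```

What the browser receives, and what “view source” shows, is only the printed output. The script itself never leaves the server.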

How can you learn more (besides Wikipedia)? : Pick up a book on Java or PHP. Or go for the W3 Schools tutorials.

  • HTTP Protocol

In a nutshell: These are the rules a browser and a web server follow to request and deliver web pages over the internet.

Why it matters: I’m not sure how much this does really matter. I guess I just think librarians should have a little more in-depth knowledge about how computers talk to each other.
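If you are curious what that conversation between computers looks like, it is mostly short lines of plain text. The sketch below builds the kind of request a browser sends and picks apart a canned reply. The host name, path and sample response are invented for illustration, and nothing here actually touches the network.

```python
def build_request(host, path):
    """Return the text a browser sends to ask a server for one page."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    )


def parse_response(raw):
    """Split a server's reply into status code, reason, headers and body."""
    head, _, body = raw.partition("\r\n\r\n")  # a blank line ends the headers
    status_line, *header_lines = head.split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body


# A canned reply, written by hand rather than captured from a real server.
SAMPLE_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<html><body>Hello, library!</body></html>"
)

print(build_request("example.org", "/catalogue"))
print(parse_response(SAMPLE_RESPONSE))
```

The “200” is the server saying everything went fine; the famous “404” is the same kind of code reporting that the page was not found.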

How can you learn more (besides Wikipedia)? : The James Marshall Tutorial is a classic.

  • Open Source Software (OSS)

In a nutshell: Software that is released without protections that prevent users from viewing the “source code” of the product, and with licenses that encourage users (within certain boundaries) to distribute the product freely and change the product to their own needs.

Why it matters: If websites were all hosted using expensive proprietary software, you would never see the kind of boom in internet services that you see now. Many of the products I have mentioned (PHP, Perl, Ruby, Firefox, MySQL) are open source. But it’s not just that the products are free; it’s that communities have built themselves around the development of these open source products. And many open source products have built themselves into Web 2.0 services. Take WordPress, for instance. See, when you have huge communities working together, they need advanced services that promote collaboration. Look at the documentation for any open source product (how about Drupal?) and you are almost certain to find a wiki hanging around somewhere.

How can you learn more (besides Wikipedia)? : The Open Source Initiative is a good resource.

  • The Document Object Model (DOM)

In a nutshell: Basically the DOM is the rules for retrieving data from XML or html documents. AJAX uses the html object model to make its magic happen, and other scripts use the DOM to extract information from RSS feeds.

Why it matters: XML is kind of like the universal solvent for data. If two services do not play well together, you always have XML to fall back on. Even so, XML is about taxonomies, which are at the core of the library profession. We should be aware of how our taxonomies are turned into web interfaces, if only in a superficial way.
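As a rough illustration of extracting information from an RSS feed with the DOM, here is a sketch using Python’s built-in `xml.dom.minidom`. The feed itself is invented and much shorter than a real one. The calls `getElementsByTagName` and `firstChild` are standard DOM vocabulary, so the same steps look very similar in JavaScript.

```python
from xml.dom.minidom import parseString

# A made-up, stripped-down RSS feed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Library News</title>
    <item><title>New cat books</title><link>http://example.org/cats</link></item>
    <item><title>Holiday hours</title><link>http://example.org/hours</link></item>
  </channel>
</rss>"""


def item_titles(feed_xml):
    """Walk the document tree and collect the title of every item."""
    document = parseString(feed_xml)
    titles = []
    for item in document.getElementsByTagName("item"):
        # Look for <title> inside this item only, so the channel's own
        # <title> ("Library News") is not picked up.
        title_node = item.getElementsByTagName("title")[0]
        titles.append(title_node.firstChild.data)
    return titles


print(item_titles(SAMPLE_FEED))
```

The tree structure is doing the work here: “the title inside an item” and “the title of the whole channel” use the same tag, and only their position in the hierarchy tells them apart.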

How can you learn more (besides Wikipedia)? : I think the W3 schools covers the DOM fairly well. There is a separate tutorial on the html DOM (same principles, just a tad different approach).

  • Encryption and Digital Rights Management (DRM)

In a nutshell: Encryption is nothing new. You have information; you have a code to represent that information; you have a key or solution that will turn one into the other. Maybe you need to look at tools like MD5 or SSL, but in the end it’s all still like the Little Orphan Annie Ovaltine-advertising decoder ring from A Christmas Story. DRM uses encryption to prevent users from breaking copyright on products (software, DVDs, CDs, e-books etc.).

Why it matters: Copyright and DRM are the real battlefield of the Web 2.0 movement. Many Web 2.0 folks are outspoken against DRM-protected products, saying that they put needless restrictions on paying customers. Others steadfastly argue that DRM is necessary to keep people from stealing their intellectual property. You oughta know about this stuff, because where the chips fall will say a lot about the future of the Web and the business of business in general.
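The decoder ring from the nutshell can be sketched in a few lines of Python. This is a toy letter-shifting cipher of my own, not how SSL or any DRM scheme actually works, and the function names are hypothetical. The MD5 function at the end is included because it was mentioned above; strictly speaking MD5 is a one-way “fingerprint” with no key to turn it back, and it is no longer considered safe for security purposes.

```python
import hashlib
import string

ALPHABET = string.ascii_uppercase


def ring_encode(message, key):
    """Shift every letter along the alphabet by `key` places."""
    shift = key % 26
    shifted = ALPHABET[shift:] + ALPHABET[:shift]
    return message.upper().translate(str.maketrans(ALPHABET, shifted))


def ring_decode(secret, key):
    """Turning the ring backwards by the same key recovers the message."""
    return ring_encode(secret, -key)


def fingerprint(text):
    """MD5 digest of the text: easy to compute, not meant to be reversed."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


secret = ring_encode("Be sure to drink your Ovaltine", 3)
print(secret)
print(ring_decode(secret, 3))
print(fingerprint("Be sure to drink your Ovaltine"))
```

Information, code, key: the three pieces from the nutshell are the message, the scrambled text and the number 3. Real encryption swaps the shifting for much harder mathematics, but the shape is the same.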

How can you learn more (besides Wikipedia)? : Cory Doctorow is a good source. Though he obviously sits to one side of the issue, Cory always points out the cases where the bad guys (to him) are making their arguments.

  • Platforms

In a nutshell: All computer software requires a platform to make the things you do on the keyboard and mouse make sense to a computer. Windows is a platform. So is Linux.

Why it matters: In the end, the platform you use will dictate how the software will perform. That said, there are three more reasons why I added this to the list. 1) One of the tenets of Web 2.0 is that of the “Web as platform.” Basically, this means that what can be done on Windows can be done on the web. If you look at Google Documents, you will get an idea of what this means. 2) I always think it’s good to remind people that Windows is just one of many possible platforms that people can use to make software work. 3) Platforms are changing. The battle of platforms continues onto handheld devices, and both the iPhone and the upcoming “Surface Computer” appear to be offering widely different platforms for the future.

How can you learn more (besides Wikipedia)? : Try Linux out. Or a Mac.

  • Stylesheets (CSS & XSLT)

In a nutshell: Let’s say someone offers to do dishes for you, but after every plate he cleans, he asks you where it goes. Annoying, right? You want all plates to go in the same spot, right? A cascading stylesheet (CSS) is a way for a programmer to say “ok. All headers are going to be green, 12 point Times New Roman font.” An XSLT stylesheet is a way for a programmer to say “ok. All data with ‘this’ tag is going to appear at the top of the page.” Either way, you are looking at the rules for style being separated from the rules for content.
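Here is a toy model of that separation, in Python. To be clear, this is not real CSS or XSLT, just a sketch of the principle with invented names: the content is kept in one place, the style rules in another, and a small function brings them together at display time.

```python
# Content: what the page says. No colours or fonts in here.
CONTENT = [
    ("header", "New arrivals"),
    ("body", "Three new cat books this week."),
]

# Two interchangeable sets of style rules. The first uses the rule from the
# text above; the second is made up to show the content need not change.
GREEN_STYLE = {"header": {"color": "green", "font": "12pt Times New Roman"}}
BLUE_STYLE = {"header": {"color": "blue", "font": "14pt Arial"}}


def apply_stylesheet(content, stylesheet):
    """Attach the matching style rules to each piece of content."""
    styled = []
    for kind, text in content:
        rules = stylesheet.get(kind, {})  # no rule for this kind: leave it plain
        styled.append({"kind": kind, "text": text, **rules})
    return styled


print(apply_stylesheet(CONTENT, GREEN_STYLE))
print(apply_stylesheet(CONTENT, BLUE_STYLE))
```

Swapping one set of rules for the other changes how every header looks without touching a word of the content, which is the dish-stacking rule stated once instead of plate by plate.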

Why it matters: Stylesheets are behind any service that lets you change the “skin” of the webpage you view. They are also behind all the fancy templates that WordPress and other services offer. If you are a Horizon user, you may want to know that the current Horizon Information Portal uses XSLT stylesheets to display the available data.

How can you learn more (besides Wikipedia)? : Back to the W3 Schools.

That’s it. Just a challenge for you to brush up on your technology if you so desire. I don’t think it’s mandatory to learn this stuff, but taking a bit of time to familiarize yourself with these concepts does help you make some extra sense of where the web, and a whole whack of other IT services, are going in the future.



UPDATE: Nicolas Morin has a nice response to the list. He suggests adding Unix, SQL, Apache (and web servers) and the notion of API to the list. He is also curious about the order I put them in. Responses below:

The order: no real priorities there. The list was intended as a grab bag of things to look through.

I’m inclined to think Unix and Apache are the sort of things you can leave to systems people. If you are installing products on your lonesome, maybe you want to learn more about these.

SQL falls under “relational database” — it’d be nice if librarians knew more SQL, but I think it’s too much like programming. Suffice it to say that you need to give instructions to a computer to tell it how to search, store & organize your data. This is called a “query.” SQL is the language to do that. For most people I know, SQL is intimidating. I don’t want the list to be intimidating — learn what a relational database is first, then if you get enthusiastic, go for SQL.

API goes along with “Object oriented programming” although I agree it’s a pretty big miss there. Using my analogy above, once you’ve created your cats, you can give people the list of things the cat will do. That list is the API. Once you share an API, people can do all kinds of wonderful things with your cat. Maybe even mash it up with a dog. Then you can see the fur fly! 🙂

He suggests replacing DOM with XML. Probably a good suggestion, although I find most tutorials on XML talk about syntax more than they explain why you would use it. DOM explains a bit more about the structure of an XML document and how you might extract data from it. It’s pretty much 6 of one, 1/2 a dozen of the other.

He also suggests taking out encryption and platforms. For one, Unix is a platform, so that’s why I put platforms in. I wanted to leave the typical geeky acronyms and product names out as much as possible. Also, as I said before, I think platforms are going to change in the near future. This is a heads up.

Encryption is there mostly because DRM is such a big issue these days. Also, if we are helping clients evaluate good websites, we ought to know a little more about how secure sites work. There are other, more mind-blowing things that are interesting about encryption as well. Some of it is covered in a couple of great books I’m reading: The Man Who Knew Too Much (biography of Alan Turing, the man who invented the computer) and Decoding the Universe (about information theory).

31 thoughts on “Under the Hood of Web 2.0 : the top ten programming concepts for librarians to understand”

  1. Thanks Ryan. Great post. Just what I need to know.

    By the way, it was the plastic bottle making factory fieldtrip that had me enthralled….just how did they turn those little balls that looked like they came out of my beanbag into Toilet Ducks ?

  2. Great post Ryan! Excellent job explaining these concepts – and I do agree that more people working in libraries need to understand these things.

    I do think the http protocol does matter – and is important. We all use this protocol on a daily basis. I would add that people should be able to distinguish between http and https protocol – and that librarians also need to understand FTP and Telnet (and that they are different from http).

  3. I am benefitting a lot from your blog. I really liked the Infodoodad article that you pointed to the other day on top 13… and the one today on under the hood is fantastic. My desk is littered with CSS for dummies etc. and I’m not quite certain how much of a code geek I have to actually become. Ideally I’d rather not but as I try to improve the library’s website where I work I’m thinking CSS might end up on the must learn list. Some of the stuff you wrote about today I had never heard of so this was *very* helpful. The comment about preferring to watch television than to do programming also made me LOL!

  4. I think this should be near the top of your list:

    SOA: Service Oriented Architecture.

    Traditional computer software development involves vertical programming architecture where everything required by the program including data, the core application, and the interface are all created and fixed within that program. The online catalog is a great example. The bibliographic data, the application which searches that data, and the customer interface are also fixed within the online catalog system.

    In service oriented architecture (SOA), the data, application, and interface are separated so that each can be implemented using the best technologies for the task. The pieces can be interchanged or repurposed.

    If one were to build an online catalog using this concept, each of the pieces of the online catalog would be separate software modules. Each would be designed using the best technology for the task. One could then replace the interface module without disrupting the application and bibliographic data processing modules.

  5. I think one of the reasons it’s important to understand http better is that it makes it a lot easier to understand what’s important about stuff like AJAX. Because web browsers are the most widely deployed more or less standard platform for accessing services across the Internet and they work by making http requests, we’re pretty much stuck with using http as a primary delivery mechanism. It does, however, have a major downside when it comes to delivering pages that perform like a desktop application: its lack of state. What this means is that there is no continuous control connection between the client’s web browser and the web server. The client requests a document, the server returns one. What AJAX allows for is the simulation of state: javascript running in the browser responds to user interaction by sending requests for additional information to the server and then modifying the content of the page.

    Anyway, it’s great to see you folks thinking about this stuff.

  6. Hello,

    I wish it were that easy 😦

    I think your selection for topics on the top 10 is great! As Eric commented, there are others (UTF-8, character sets in general).

    My problem is that even if you could get someone (a library decision maker) to spend 5 minutes looking at these topics, which would be great, it is just nowhere near being effective.

    Peter Norvig has a great essay about how it takes 10 years to understand some of these concepts that you have mentioned. (What’s the Rush)

    What we need to impress on practitioners and students alike is that you have to be willing to put in some work! While you will be better off if you’ve got the buzz words under your belt, you must realize that you are a long way from doing something useful.

  7. Hi Brian,

    I think I can understand where you and Norvig are coming from (though I’ll have to read the Norvig essay to be sure), but the point of this particular essay is just to give people a brief look at some of the processes behind Web development, kind of like knowing what a combustion engine is without having to properly construct, repair or maintain one.

    I think we need people who are willing to put in their 10 years (and frankly, I think the coding part of the equation is minor in comparison to the systems logic and math-ish thing involved), but I have no solution there. Open source development has some potential, but it won’t turn librarians into Linus Torvalds anytime soon.

    That said, this is one step forward — and I’m young enough that I’ll see my readers in 10 years. 🙂

  8. It sort of pisses me off that he mentioned Linux as being a Platform OVER even mentioning Mac OS X… i dont know why… i should probably get over it though… even though most concepts in Windows (heavily mentioned product) were based on it… 😦

  9. Hey Silentfart,

    Who’s “he?” One of the reasons I open up comments is because I can’t think of everything. No need to get angry — just comment and show me where I’ve gone wrong.
