Metadata over Content - How to Build an Unusable Archive

I like to troll for new academic sites in my various lists, especially the Internet Scout Project, but I find that a large percentage of online archival repositories are so unusable I've stopped adding them to my own links library. They are just more trouble to use than they're worth ... which is sad because I'm bypassing great information on civil rights, diversity, Pennsylvania history and more. And I bet most people who aren't professional archivists/scholars are too.

There are a lot of symptoms of an unusable archive (I'll get to those), but I think the main problem is that the content experts are so focused on "metadata" that they've forgotten the context. The result is that you pretty much already have to know exactly what to look for before you can find anything. And by exact I mean keywords like "Jackson, Andrew -- 1767-1845 -- Inauguration, 1828" - got that?

The Quest for Lady Bird Johnson

So to clarify my angst, let me pick on a site which has some great resources, but isn't quite so user friendly - the Library of Congress Portraits of Presidents and First Ladies, 1780 - Present. First, let me say that my inner instructional designer salivates because I am fairly confident that somewhere in here is a public domain photograph of President Bush 43. How many bloggers need a legal image of President Bush? Or President Clinton? Or maybe a student would like a picture of Abraham Lincoln? I do believe this is our tax dollars at work.

Since I'm not a professional historian, I'm really hoping there's a list of Presidents in chronological order (or maybe a second alphabetical order, just in case). Then I can find my target and download my image. So let's try the Browse link.

As you can see I do not get a list of presidents, but a list of all keywords (president, location, event, etc) in alphabetical order. A lazy searcher (the vast majority of us) will stop here. A more persistent searcher will notice that keywords include things like "Adams, Abigail" - meaning that if I target the approximate location of the presidential family name, I may get a hit.

So I will actually look for Lady Bird Johnson, who looks to be in the "Grant..." to "Photographic..." range. Fortunately there is one entry for "Johnson, Ladybird."

And finally, I get my photo (very striking if I may say so), along with lots of supporting metadata, like the date (interesting), call number (will I be going to DC anytime soon?), the medium (a photographic print - duh), digital ID (I may actually need this if I lose the URL), and, best of all, alternate subject headers (if I found it, do I need these now?).

What I don't find is any information on what Lady Bird did, what her husband did, when she lived or died, when she lived in the White House or even a link to any sites a historical society might maintain. I can infer that her husband was Lyndon Baines Johnson, because she is "Mrs. Lyndon Baines Johnson", but not all readers are good "inferers."

I'm sure this information is useful to the LOC staff, but I'm not sure about the rest of us. They do say that the Reproduction Number can be used to order high quality prints, but this is really not clear on the page as is.

Spelling Out the Steps to Unusability

I'm picking on the Library of Congress, but it really is a common problem. I cannot tell you how many sites:

Honestly, if I need a free photo, a lot of these sites might be very useful. But I feel like I'm missing so much more in all that metadata.

I should add that if you really want to make an educational archive especially unusable, you'll have to add these steps:

  • Call the list of external links "Resources" (it's more academic that way)
  • Classify artifacts with abstract scholarly categories like "Discover" vs. "Learn" (not all of this can be bad translation issues).
  • Mention all sources of funding on the front page (that's why they give out the big bucks), but skip the table of contents, integrated tutorials or anything that a non-professional could use.

I think I'll be visiting Wikipedia now. They might have a free photo I can use as well.

Sites that Do Work

You can build archives/educational sites for non-scholars. The key is to use comprehensible search terms and put information into more context. Some of my favorite archives.

I live in hope that I may learn some of these lessons myself.

Learning and Surprise

A Harrisburg colleague, Carol McQuiggan, pointed out an interesting article on teaching from the New York Times, "Geek Lessons" (Sep 21, 2008). One of the interesting points is that the author, Mark Edmundson, argues that the role of the instructor is to introduce some surprise into the student's life.

An astronomer has to explain that the Earth is actually the furthest from the sun during summer in North America. A classics professor may explain that the classic Greek drama Oedipus Rex was partly a political response against contemporary Athenian-Spartan politics. And linguists get to explain that people "who ain't speaking right" aren't stupid, just speaking a different dialect.

Edmundson argues that the role of surprise isn't just to "open student eyes" but rather to keep them open and combat the assumption that "you're on top of things and in charge." In other words, Edmundson asks whether you can question what underlies conventional wisdom - even when conventional wisdom is converging on James Dean. Trickier than it looks.

This reminds me of another secret principle of mine which is that a research method isn't really valid unless the data can surprise you. I've been exposed to lots of valid methods (statistical, ethnographic, traditional lab techniques), but they all share one thing in common - the ability to pull up data you weren't expecting. The examples I cited above are based on research, some of which pulled up counterintuitive data.

I think that is one of the great rewards of learning - finding out new information you weren't expecting, then re-evaluating what you think you know. Sometimes I do a mini-research project and find an answer I don't like (i.e. one that contradicts my initial assumption). But it's acknowledging that sometimes what you don't like is probably true that hopefully makes me a better learner/researcher. I may even learn more about how the world really works - if I can "handle the truth."

Remembering Your Assumptions

One of the most challenging questions I ever got in a class wasn't an advanced question, but rather a very elementary one on sentence structure. Almost all linguists assume that in a sentence like The Queen fed the corgi you separate the subject from the rest of the sentence (the predicate) instead of grouping the subject and the verb together (see below).

Right: [S [NP The Queen] || [VP fed [NP the corgi]]].
Wrong: [S [VP [NP The Queen] fed] || [NP the corgi]].

Why is this? On the surface, it appears to be an arbitrary division, but there is a reason behind this. After a good 30 second pause, I remembered what it was, which is that linguists assume that sentence constituents are meaningful units by themselves (yes we do work with fragments). Thus you can have an exchange like "What did the Queen do?" "Feed the corgi", but an answer "The queen fed" is not as natural. Hence the assumption that verbs and direct objects form a unit apart from the subject.
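
For the programmers in the audience, here is a minimal Python sketch of the two bracketings above. The nested-tuple encoding and the names RIGHT, WRONG and constituents are my own assumptions for illustration, not any standard linguistics tool. It simply lists which word strings each analysis treats as a unit.

```python
# A tree is a tuple: (label, child, child, ...). A child that is a plain
# string is a word. This encoding is an illustrative assumption.
RIGHT = ("S", ("NP", "The", "Queen"), ("VP", "fed", ("NP", "the", "corgi")))
WRONG = ("S", ("VP", ("NP", "The", "Queen"), "fed"), ("NP", "the", "corgi"))


def words(node):
    """Return the words under a node, in left-to-right order."""
    if isinstance(node, str):
        return [node]
    result = []
    for child in node[1:]:
        result.extend(words(child))
    return result


def constituents(tree):
    """Return the set of word strings spanned by each bracketed node."""
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        found.add(" ".join(words(node)))
        for child in node[1:]:
            visit(child)

    visit(tree)
    return found


if __name__ == "__main__":
    print(sorted(constituents(RIGHT)))
    print(sorted(constituents(WRONG)))
```

Only the first analysis yields "fed the corgi" as a unit, which matches the natural answer to "What did the Queen do?"; the second yields "The Queen fed" instead.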

The above is interesting, but the point isn't really about linguistics but whether an instructor can remember why their discipline makes the assumptions that it does. To me it makes the difference between teaching your course as a coherent set of related concepts versus a random list of rules and facts. I was both relieved and thrilled that I could answer her question - another student convinced that we knew what we were talking about.

Every now and again a student asks why I torture them with analyzing a set of random words with ridiculous sounds from languages they've never heard of. When I remind them that sometimes this is all the data on a language that an archaeologist or anthropologist may ever get, they realize that the homework isn't just a torture device but a way to join a community of active researchers.

Discussing the Whiteboard

I see that yesterday's post about our office white board did hit some nerves. There's been continuing discussion of it on our director Cole Camplese's blog. Thanks Cole for responding and cross-posting - I appreciate his willingness to put his perspective out there in front of all of us, his ETS community.

Office Twitter vs Office Whiteboard...Is there a difference?

Being a modern office worker means you must master both old communication channels like the community bulletin board as well as the new forms like Twitter...so here's my take on an ongoing set of negotiations.

Recently there was a blank whiteboard installed in our hallway (probably a good idea), but almost immediately, there has been tension on what should go on the whiteboard. Is it an informal brainstorming tool? A place to post official office news? Although I have my preferences, I can see an argument for all three options.

Interestingly, one of the main problems for the whiteboard has been the erasure policy. If an entire whiteboard is covered with a project sketch, how long does it remain? Similarly, what if we have a run of poll questions (e.g. water filter preferences, disco ball preferences, sci-fi preferences, presidential preferences)? Are these appropriate?

I know that they were recently erased (and continue to be erased), so I do think someone is questioning their appropriateness. I know in the past we've been asked to not post political material in the office, so I wonder, are we trying to be more "formal" here?

Which leads me to my annoying Twitter question... We all (more or less) tacitly agree that individuals can post whatever they want to their Twitter accounts - despite the fact that we know many of our colleagues will see them. And yet our Twitter posts are being displayed in the hallway on a monitor.

Maybe it's me, but given the fact that Twitter is in the hall and is being viewed by the same people who see the whiteboard, I do think of Twitter as a workplace communication channel.

That's not to say that I've been letting work stop me from asking annoying questions, but I do think it's interesting that our comfort level with "informality" on Twitter does not extend to the Whiteboard.

Is Independent Learning Really Learning?

A few summers ago, I was very interested in getting FileMaker to convert hex numbers to decimal numbers and vice versa. This was an arcane enough question that I could not find a ready answer either online or in the user's manual. I was on my own.

What I did was create a solution based on lookup tables...pretty much on my own. The question is - Did I learn anything?
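
For the curious, here is a rough Python sketch of the lookup-table idea. To be clear, this is not my original FileMaker solution; the table and function names are assumptions made up for this illustration. Each hex digit is looked up in a table to get its value, and the reverse table turns remainders back into digits.

```python
# Lookup tables: position in the string gives a digit's value, and the
# dictionary gives the value for each digit. Names are illustrative only.
HEX_DIGITS = "0123456789ABCDEF"
HEX_TO_VALUE = {digit: value for value, digit in enumerate(HEX_DIGITS)}


def hex_to_decimal(hex_text: str) -> int:
    """Convert a hex string such as 'FF' to an integer via table lookup."""
    total = 0
    for digit in hex_text.upper():
        # Shift what we have one hex place left, then add the new digit.
        total = total * 16 + HEX_TO_VALUE[digit]
    return total


def decimal_to_hex(number: int) -> str:
    """Convert a non-negative integer to an uppercase hex string."""
    if number < 0:
        raise ValueError("only non-negative numbers are handled here")
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        number, remainder = divmod(number, 16)
        digits.append(HEX_DIGITS[remainder])
    # Remainders come out least-significant first, so reverse them.
    return "".join(reversed(digits))


if __name__ == "__main__":
    print(hex_to_decimal("FF"))
    print(decimal_to_hex(255))
```

Python can of course do this directly with int("FF", 16) and format(255, "X"); the tables are spelled out only to mirror the approach described above.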

This may sound like a trick question, but consider that modern pedagogical theory places a premium on "human interaction", "joint activity" and "sociocultural practice." Consider this definition of learning and knowledge:

  • Knowledge is ability to participate in a community of practice.
  • Learning is becoming a member of a community of practice.

William J. Clancey, A Tutorial on Situated Learning

So according to this, I've learned only if I become a member of the community of practice (I'll call it FileMaker usage). But am I in the FileMaker Community of Practice (CoP)? Most definitions I see assume some sort of collaboration. For instance the "signs" of a CoP (according to Etienne Wenger) all involve interpersonal communication - none of which I did. I did not ask for help, only researched it and experimented on my own. The most I may do is read some articles and lurk on a Listserv. Otherwise I may be experimenting completely on my own.

So again I ask, according to this theory, am I really "learning" if I don't collaborate with someone else? Think about it.

P.S. The Standard Workaround

The standard workaround to this "paradox" is that my learning is "culturally" mediated - which in this case means I am using man-made software, learning from books written by humans and building on one FileMaker lecture seminar...but few theorists seem to really regard this as adequately "social."

By the way, I don't discount the need to get feedback from other people, especially when you are working to analyze a tough problem. But as a colleague once asked, why does modern pedagogy assume that no learning can happen until two people are in the room?

Is iTouch a Low-Cost Solution?

Jamie's coffee read on the costs of purchasing tech is one that's close to my heart. I came from a single-income household, but my mother decided we needed to invest in a computer - which was a Coleco Adam with a whopping 80K of memory (this was before I met Mac). Thanks Mom!

But I'm still in a single-income household (for 2 people) and once again I found myself still on the divide when it comes to ... mobile tech. Truthfully, it's not the price of the phone that bugs me ($100-400) but the high monthly fee ($840 per year on the iPhone plan), especially since I rarely find need of a cell phone. At least I watch cable TV most days.

So after some research, I did go with the iTouch, and even though I'm not regularly on the Internet, I'm finding the potential uses very interesting.

First, I found an amazing array of calculators (programming, basic, thermal units, financial,...). And of course, I checked out the games - hours of entertainment from one little software package. I've also been checking music (great sound) and even used iPhoto to import photos. Not only can I show off the corgi, but I can show pieces I've stitched - a great mini portfolio. And if you plan carefully, even a small iPod may have plenty of room for you, and most utilities are cheap ($0-1.99). Once you get over that initial hurdle, there are a lot of good options out there.

And once I did connect to the Internet at home, I was able to tap into the other cool apps like YouTube, Pandora and Wikipedia and the all-important movie time schedule. Cool.

You may already have discovered all of this, but I am relieved that I will be enjoying my iTouch without the extra $70 per month. It's much more affordable this way.

Living with Plagiarism

As I have been reminding people recently, I both maintain a plagiarism Web site and teach the occasional linguistics course. This is one of those times I'm glad to see an issue like plagiarism from multiple points of view.

Interestingly, after teaching a few times, I have decided that the real solution isn't last-minute comparisons, but frequent interaction. So my tips, such as they are, include:

  1. Frequent assignments - It is true that the more you see a student's work, the more likely you will spot an anomaly. In fact, blogging is one of the better tools because students really write in their own voices, and instructors see them, but may not have to grade the content in too much detail.

    I know this assumes a reasonably low student:faculty ratio which does not always happen here. Even so, I have been in a class of 50+ where plagiarism was detected - Overworked TAs can smell a rat even in a large data set.

  2. The early scare - Like John Harwood and others, I include a statement in the syllabus and discuss the issue on the first day of class. The ultimate weapon of course is "I maintain the plagiarism site."

  3. Laying out collaboration rules - The great thing about collaboration is that students can learn from each other, but the bad thing is that they can get lazy also. My own personal rule has been "use your own words" (so that each student has to process some information). If nothing else, I learn who is studying together up front...in case anything weird happens later.

I think the ultimate lesson for me though is that plagiarism really may not pay for the student, even in the short term.

For instance, I questioned a student about copying a transcription from a second student, but even if I hadn't caught it, that person would have scored worse...because the two dialects did not mesh. The original transcription was correct for the original speaker's dialect, but wrong for the other person. I knew that the student with the suspicious case totally missed the concept.

Another interesting case was a paper in which significant portions were cut and pasted from another source; I scored it as "missing quotations" since the reference was in the bibliography. Even if I had missed that one though - the paper would have scored low because the source materials were not meshed in well and were ultimately not very comprehensible.

I suspect I have been hosed a few times (for instance, there will be no more bathroom breaks for in-class exams), but overall I feel that I can worry less, because the results of plagiarism are amazingly shoddy in many cases.

Course Hero - The Study Site that's a Pyramid Scheme?

The original discovery of this little gem goes to one of my Harrisburg colleagues who learned about it from an instructor. The site in question is Course Hero or "An Open Online Study Community", but note how the home page features quizzes, exam solutions and homework answers along with some actual lecture notes. Yes, I am a little paranoid especially since I have seen many suspicious study aids over the years.

But, since this was a new model, I thought I should investigate. First, I was interested to see that you can use your Facebook account to log in - I knew there was a reason to sign up. Once you log in, you can create a study profile identifying course number and instructor (presumably to find other online study mates). You can also enter in textbook information by ISBN-13 number (always get a textbook for class).

The interesting part happens when you click the Search button. At that point you find out that you have to "upgrade to a standard account" to view search results, and it offers several ways to do so. The first way is to upload your "study aid documents first" (5 for 1 month's access, 50 for unlimited access); the second is to invite your Facebook or AIM buddies (50 friends for one month or 200 friends for one year); and the third is to pay a monthly fee. And this is where I feel that "pyramid scheme" applies, because to avoid paying a fee you have to contribute resources (content or people), but if your friends want to avoid paying, they have to find more friends or content...or else. The only thing missing is your cut of the profits (although presumably you will have access to an ever-growing set of resources, possibly forever).

This model is interesting, and it probably works, but I would be leery of joining any service before I had gotten a chance to really look at the search results first. For one thing, I was seriously considering uploading 5 junk documents just to get an in-depth view of my hypothetical search results, and I may not be the only person with this idea. Even worse, I could have "joined" only to find that my search results were empty AFTER I uploaded/paid/sucked in friends. Seems like a real rip-off to me.

The other questionable aspect, of course, is the posting of exam and homework solutions. Hmmm. Sample tests can be helpful study tools...if the instructor chooses to post them, but since the sources on the homepage are set to "anonymous", I'm not sure the instructor is posting anything. Which is where another colleague mentioned copyright issues.

But I suspect that Course Hero is structured like YouTube in that they let users post anything they want and wait for any take-down notices to arrive at their doorstep (I'm sure it's all stated in the user agreement somewhere). In the meantime, all the solutions are available to you under a "Creative Commons" license...assuming that you ever get access to them.

Scheduling and the August Blog Project

I'm not officially attempting the 1 blog post per day feat, but if I were, I could write up a bunch over the week-end and use the schedule feature to separate the appearance by 24 hours.

Use this to avoid reader overload.