Publishing for the Scholarly Web

Christian Vinten-Johansen

Penn State University

Scholarly Articles are Unique

Rich in data and research
Deep social and organizational context
Add to the body of knowledge, in perpetuity
Findable, Usable, Accessible

Scholarly Articles Have Special Requirements

Human readable
Excellent typography, visual design, reading aids
Print stylesheet with displayed URLs
Machine readable
Structured content: articles, references, bibliographies, footnotes and tables
Rich (really, REALLY rich) metadata
Permanent URL
Archives, repositories, bookmarks
Summary: Findable, Usable, Accessible

Scholarly Articles Have Special Requirements

Machine readability satisfied by:

Semantic markup - make full use of standards!
Lost tags found - obscure, little-known or abused
Semantic tables
Footnotes - from text to footer and back
Metadata - It's under-used and unloved, but vitally important!
<meta... /> tags
Microformats - embedding metadata in content
ARIA - Accessible Rich Internet Applications

Lost Tags

Many (X)HTML tags are obscure and rarely used, or used incorrectly.

Proper use of markup gives documents richer semantic meaning.

More "lost tags":

Semantic tables

Use the summary attribute in the <table...> tag

    <table summary="Table 23. Ability of W3C markup
           specifications to create well-formed XML markup.">
      <tr>
        <th colspan="3" id="property">Valid XML?</th>
      </tr>
    </table>

Semantic tables

Use the <caption...> tag nested in the <table...> tag

<table ...>
  <caption>Table 23. Ability of W3C markup specifications to
    create well-formed XML markup.</caption>
  <tr>
    <th colspan="3" id="property">Valid XML?</th>
  </tr>
</table>

Semantic tables

Associating Header Information with Data Cells

	<tr>
	  <th id="html401">HTML 4.01 Transitional</th>
	  <th id="xhtmltrans">XHTML 1.0 Transitional</th>
	  <th id="xhtmlstrict">XHTML 1.0 Strict</th>
	</tr>
	<tr>
	  <td headers="html401">No</td>
	  <td headers="xhtmltrans">Yes</td>
	  <td headers="xhtmlstrict">Yes</td>
	</tr>
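
What a machine gains from this association can be sketched in a few lines of Python. This is a minimal illustration using only the standard library's html.parser; the class name HeaderResolver, the function resolve_headers, and the SAMPLE table are hypothetical and not part of the original slides. A headers value that matches no th id is reported as "?", which is how a mistyped id would show up.

```python
from html.parser import HTMLParser


class HeaderResolver(HTMLParser):
    """Collect <th id="..."> labels and the headers ids of each <td>."""

    def __init__(self):
        super().__init__()
        self.labels = {}      # th id -> header text
        self.cells = []       # [list of header ids, cell text] per <td>
        self._current = None  # (tag, key) of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "th" and "id" in attrs:
            self._current = ("th", attrs["id"])
            self.labels[attrs["id"]] = ""
        elif tag == "td":
            # headers holds a space-separated list of th ids
            ids = (attrs.get("headers") or "").split()
            self.cells.append([ids, ""])
            self._current = ("td", None)

    def handle_data(self, data):
        if self._current is None:
            return
        if self._current[0] == "th":
            self.labels[self._current[1]] += data
        else:
            self.cells[-1][1] += data

    def handle_endtag(self, tag):
        if self._current and tag == self._current[0]:
            self._current = None


def resolve_headers(markup):
    """Return (header texts, cell text) for every data cell."""
    parser = HeaderResolver()
    parser.feed(markup)
    parser.close()
    return [
        ([parser.labels.get(i, "?").strip() for i in ids], text.strip())
        for ids, text in parser.cells
    ]


# Hypothetical sample table, shortened from the slide
SAMPLE = """
<table>
<tr><th id="html401">HTML 4.01 Transitional</th>
    <th id="xhtmlstrict">XHTML 1.0 Strict</th></tr>
<tr><td headers="html401">No</td>
    <td headers="xhtmlstrict">Yes</td></tr>
</table>
"""

for headers, value in resolve_headers(SAMPLE):
    print(", ".join(headers), "->", value)
```

A screen reader does something comparable when it announces the header text before reading each data cell.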

Footnotes: From Text to Footer and Back

Content section

<sup class="footnote">
  <a id="footnote-3-referrer" href="#footnote-3">[3]</a>
</sup>

Footnote section

  <li><a id="footnote-3" href="#footnote-3-referrer">[3]</a>
  Content of footnote.</li>

Example document with footnotes
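
The round trip only works if every referrer points at a footnote that points back. One possible way to check this, assuming the id and href pattern shown on this slide, is the sketch below. AnchorCollector, broken_footnotes, and SAMPLE are hypothetical names chosen for illustration.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Record every <a id="..." href="#..."> pair in a document."""

    def __init__(self):
        super().__init__()
        self.targets = {}  # anchor id -> fragment its href points to

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        href = attrs.get("href") or ""
        if tag == "a" and attrs.get("id") and href.startswith("#"):
            self.targets[attrs["id"]] = href[1:]


def broken_footnotes(markup):
    """Return ids of anchors whose target does not link back to them."""
    collector = AnchorCollector()
    collector.feed(markup)
    collector.close()
    targets = collector.targets
    return sorted(a for a, b in targets.items() if targets.get(b) != a)


# Hypothetical sample built from the two snippets on this slide
SAMPLE = """
<p>Claim.<sup class="footnote">
  <a id="footnote-3-referrer" href="#footnote-3">[3]</a></sup></p>
<ol>
  <li><a id="footnote-3" href="#footnote-3-referrer">[3]</a>
  Content of footnote.</li>
</ol>
"""

print(broken_footnotes(SAMPLE))
```

An empty list means every link has a partner; any id in the result marks a reference or footnote that leads nowhere or is never linked back.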

Metadata in Web Pages

Unused and unloved - a typical example

  <title>Metadata in Web Pages</title>
  <link rel="stylesheet" type="text/css" 
       href="styles.css" />

Metadata Matters!!!

<html xmlns:DC="">
  <meta http-equiv="Content-type"
        content="text/html; charset=utf-8" />
  <meta http-equiv="Content-Language" content="en-us" />
  <meta name="DC.title"
        content="Metadata: Unused and Unloved" />
  <meta name="DC.description" content="Metadata: Improve
                 usability with metadata." />
  <meta name="DC.rights"
        content="Copyright (c)2005 Christian Johansen" />
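
To see why these tags matter, here is a rough sketch of how an indexing agent might read them. It assumes the DC. name prefix used above and relies only on Python's standard html.parser. DublinCoreReader, read_dublin_core, and SAMPLE are hypothetical names, and a real harvester would handle many more cases.

```python
from html.parser import HTMLParser


class DublinCoreReader(HTMLParser):
    """Collect <meta name="DC.*" content="..."> pairs."""

    def __init__(self):
        super().__init__()
        self.fields = {}  # e.g. "title" -> "Metadata: Unused and Unloved"

    def handle_starttag(self, tag, attrs):
        # Also called for XHTML-style self-closing <meta ... /> tags
        attrs = dict(attrs)
        name = attrs.get("name") or ""
        if tag == "meta" and name.startswith("DC."):
            content = attrs.get("content") or ""
            # Collapse line breaks inside wrapped attribute values
            self.fields[name[3:]] = " ".join(content.split())


def read_dublin_core(markup):
    reader = DublinCoreReader()
    reader.feed(markup)
    reader.close()
    return reader.fields


# Sample taken from two of the meta tags on this slide
SAMPLE = """
<head>
  <meta name="DC.title"
        content="Metadata: Unused and Unloved" />
  <meta name="DC.rights"
        content="Copyright (c)2005 Christian Johansen" />
</head>
"""

for field, value in read_dublin_core(SAMPLE).items():
    print(field, "=", value)
```

A page with only a title and a stylesheet link gives such an agent an empty dictionary to work with.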

Why You Should Care About Structure and Metadata

Smart agents can do their job, and do it well

The CiteSeer search engine uses metadata to parse and index scholarly documents, then presents them in detailed search results.

Microformats: Proposed and Built by Developer and Expert Communities

XFN on Molly Holzschlag's Blog

<h3>Roll Roll Roll</h3>
<ul>
    <li><a href=""
           rel="contact colleague">Channy Young</a></li>
    <li><a href=""
           rel="friend met colleague">Dan Rubin</a></li>
</ul>
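
Because rel holds a space-separated list of relationship values, a script can query a blogroll like a tiny social graph. The following is a minimal sketch, not how any particular XFN tool works; XFNReader, people_with, and SAMPLE are hypothetical names.

```python
from html.parser import HTMLParser


class XFNReader(HTMLParser):
    """Collect link text and rel values from <a rel="..."> elements."""

    def __init__(self):
        super().__init__()
        self.links = []     # [link text, set of rel values] per link
        self._open = False  # True while inside an <a rel="..."> element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("rel"):
            self.links.append(["", set(attrs["rel"].split())])
            self._open = True

    def handle_data(self, data):
        if self._open:
            self.links[-1][0] += data

    def handle_endtag(self, tag):
        if tag == "a":
            self._open = False


def people_with(markup, relationship):
    """Return the link texts whose rel includes the given value."""
    reader = XFNReader()
    reader.feed(markup)
    reader.close()
    return [name.strip() for name, rels in reader.links if relationship in rels]


# Sample taken from the blogroll on this slide (href values left empty)
SAMPLE = """
<ul>
  <li><a href="" rel="contact colleague">Channy Young</a></li>
  <li><a href="" rel="friend met colleague">Dan Rubin</a></li>
</ul>
"""

print(people_with(SAMPLE, "colleague"))
print(people_with(SAMPLE, "met"))
```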

Why You Should Care

“Operator”: An extension for Firefox

Screen shot of Operator (a Firefox extension) reading and locally saving an address from Yahoo! Local

ARIA Makes Web 2.0 Accessible

ARIA: Roles, States and Properties

Roles identify page regions
Menus, navigation bars, content, header, footer
Mapping page regions to APIs
States and properties
Assign to interactive page elements and widgets
Tree controls expanded/collapsed (state information)
User-settable properties render content dynamically

Role Applied To Navigation Region

<html xmlns="">
  <div role="navigation">
    <ul>
      <li><a href="">WAI</a></li>
      <li><a href="">IBM Accessibility ...</a></li>
      <li><a href="">Opera Software</a></li>
      <li><a href="">Mozilla ...</a></li>
      <li><a href="">UB Access</a></li>
    </ul>
  </div>
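
Assistive technology can use role values to build a list of page regions to jump between. A simplified sketch of that lookup follows. RoleFinder, find_roles, and SAMPLE are hypothetical, and the banner and main roles in the sample are other ARIA landmark roles added here for illustration only.

```python
from html.parser import HTMLParser


class RoleFinder(HTMLParser):
    """Record each element that carries a role attribute."""

    def __init__(self):
        super().__init__()
        self.regions = []  # (role, tag name) in document order

    def handle_starttag(self, tag, attrs):
        role = dict(attrs).get("role")
        if role:
            self.regions.append((role, tag))


def find_roles(markup):
    finder = RoleFinder()
    finder.feed(markup)
    finder.close()
    return finder.regions


# Hypothetical page skeleton with three landmark regions
SAMPLE = """
<div role="banner"><h1>Site title</h1></div>
<div role="navigation">
  <ul><li><a href="">WAI</a></li></ul>
</div>
<div role="main"><p>Article text.</p></div>
"""

for role, tag in find_roles(SAMPLE):
    print(role, "region marked on", tag)
```

Without the role attribute, all three regions would look like anonymous div elements to the script.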

Short (non-permanent!!!) URL