To start this entry, I went to images.google.com and searched for "nsfnet". Here's the first picture that came up (at least for me).

[Image: first Google Images result for "nsfnet"]

I can't seem to have a meeting or conversation on campus without someone wanting to ask about the cloud, insist that the cloud is THE future, claim that the cloud is dead, or make a case that the cloud is the wrong direction. That picture sure looks like a cloud to me and it is from 1991. For me, "the cloud" first appeared in meetings at either the Cornell National Supercomputing Facility or the John von Neumann Center when supercomputer center staff would talk about the emerging network that was being created to connect the National Science Foundation's supercomputing centers. So - in the cloud conversations I find myself getting into, I usually take the position that the cloud has been around for a long time and it will continue to be around. It is neither a silver bullet nor a neutron bomb, but it is changing what we do and how we do it every bit as profoundly as that "first" cloud did. We're experiencing and creating a set of influences that are making the cloud much richer than "just" TCP/IP connected nodes, all leveraging a rich software stack to make it all go. Personal identity is being pushed into the cloud, GENI promises a future where the cloud itself is an innovation space (as opposed to innovation happening at the edge or node), Web2.0 services feel like they are certainly a part of the cloud, and storage and transactions themselves are every bit as much a part of the cloud as well (here, here, and here).

As it is and was with NSFnet and eventually the Internet, responsibilities were/are somewhat clearly delineated. Campuses know what they need to do (run campus data backbones), federal organizations and companies know what they need to do (run national backbones), and building occupants/research labs know what they need to do (run local backbones). Protocols are mostly interoperable, there are lots of hardware options, and while there are varying degrees of difficulty in running a backbone of any size - it has been successfully done hundreds of thousands if not millions of times over during the course of the last twenty years.

Zeroing in on Penn State...

The most important candidate to make as fundamental, interoperable, reliable, secure, and well funded (well understood at least) as the network in the cloud is storage. With apologies to those who know much, much more than I do about the nitty gritty details of the history of the Integrated Backbone at Penn State, I believe we're faced with the challenge of doing for storage what was done for data networking. What would an integrated storage backbone (ISB) look like at Penn State? Now that we're getting better at routing - securely and privately - attributes about people around the cloud, how do we make attributes about information discoverable and routable? What does the campus have to do? research labs? business units?

There are gobs and gobs of conversations going on about this at Penn State, each gob with its own spin and concern. I believe we need a guide post to serve all conversations because there can be no one conversation to rule them all - there's just too much happening at once to hope for that kind of conversational model. At the moment, I'm throwing out the idea of talking about and defining the Integrated Storage Backbone at Penn State for at least a little while to see what takes hold. If you have any ideas about the concepts or the conversation construction, I do hope you'll share them with everyone here.

I attended the sciencecommons presentation at CNI 2008 and was struck by how similar it was to presentations I gravitated to at IEEE/ACM Supercomputing conferences, years ago.

Neuroscience is, apparently, a (the) poster child for sciencecommons. Listening to this presenter (John Wilbanks) revealed that - as it was years ago with both vectorization and parallelization on large scale computers - he is a scientist who has endeavored to become an expert in tools in a different domain, in this case in data construction and deconstruction. He used Dublin Core terms with the ease that physicists years ago would talk about loop ordering for Cray supercomputers. He knows RDF instead of cache structures, he knows SQL statements instead of FORTRAN or C. He knows copyright, ontology, provenance - as opposed to loop unrolling, compiler directives, memory management.

A major difference here is - in the high performance computing area we have to acquire resources to develop capacity and therefore we can see a training or awareness need hit many months out. With data oriented discovery, the asset already exists; what we lack are properly skilled faculty and staff to do this for themselves. What can IT organizations and libraries do to help close this skills gap? I'm not suggesting that there aren't already people who possess these skills, clearly there are. But we may be at a point in time when, just as was done with tool development and expertise with supercomputers, what's needed is programmatic training and awareness raising about what is possible with the tools of the day. What can IT organizations and libraries do to help close these gaps? Is that where the responsibility lies? If not there, where? Tools exist, but they are difficult to use. Policy and law are barriers as much as they are protectors. Waiting for the tools to become simpler to use and for policy to catch up with the discovery needs of our communities would be foolhardy. Taking up knitting or Tetris might be more productive than waiting.

This was a really exciting presentation that represented lots of possibilities, and challenges. I aim to make it a point of emphasis in the months ahead.

I attended a session at CNI 2008 that described the Data Audit Framework, a JISC funded initiative. Four UK universities are participating in the pilot. If you don't want to read the rest of these notes, I'd say it all comes down to one bullet for me - taking on something like a DAF "enables the development of a data strategy." Sounds painfully simple, but this is something I spend a fair amount of time worrying about. As faculty develop and create data associated with their funded research, how can we establish a culture of concern (thanks Candy Yekel) regarding the stewardship and curation of that data? In my travels around campus, I believe that many people think that these issues are simply being "handled" and in some cases they are; in many others I expect they aren't. I liked the concept and execution of this framework because as developed it is non-threatening but can yield positive outcomes for better managing (protecting, sharing, curating, archiving) research data. The raw notes follow.

"Enables the development of a data strategy"
Two modes:
1. improve awareness of assets

  • capacity planning
  • facilitate sharing and reuse
  • avoid leaks

2. recognition of practices

  • more efficient use and improved research workflows
  • enables development of a strategy
  • risk management enabled, important as more agencies start to require a curation plan

Plan the audit:

  • Identifying and classifying assets
  • Assessing mgt of data assets
  • Reporting and recommendations

Three of four pilot schools (didn't get the fourth one):

  • Geosciences at Edinburgh
  • Innovative Design and Manufacturing Research Centre at Bath
  • Glasgow University Archaeological Research Division

Main findings:

  • lots of data being created
  • few policies for creation, storage and mgt
  • researchers unsure where to begin
  • unaware of available campus or federal support
  • often no place of deposit or funds for preservation

Implementation tips:

  • scope audit carefully
  • timing needs to be appropriate
  • find an advocate in the department
  • collect information at once where possible
  • general discussion to build rapport, communicate purpose of audit and understand organization

Conclusions:

  • Premise of project is demonstrated, that is, lack of knowledge about what one has and poor understanding about how to curate is prevalent
  • Key areas for future work:
      • support on policies, stds, best practice
      • skills are lacking - basic training & professional workforce needed
      • develop robust, sustainable federal infrastructure for curation
      • recognition of data and active push for sharing and reuse critical

I attended the kickoff session of this year's fall CNI meeting in Washington, D.C. Some of what follows are direct quotes from Cliff Lynch's comments; other comments are my own. If what you read is brilliant, it is right from Cliff's mouth. Anything questionable is my paraphrasing.

Cliff separated his comments into two categories: advances in information and advances in information technology infrastructure.

Cliff commented on the emphasis placed upon areas such as escience and cyberscience - embodied in the U.S. by the NSF Datanet solicitation. It was recognized that what is proposed by Datanet must be met by equal investments at the campus level. Datanet might expose a "last mile" set of challenges at the campus level. We were challenged to think about and act on those gaps. It was recognized that federated identity management would be a linchpin upon which the success of such endeavors depends. The future of InCommon is of critical importance to these kinds of initiatives.

At the lowest layers of the stack, it was recognized that a 10Gb backbone seems so "last year." Backbones are pressing into the 100Gb level - with some applications requiring the capabilities of dedicated waves to facilitate the type of sharing and movement necessary for discovery.

There was a recognition of advancement in object reuse, which will enable us to move from repository as stovepipe to an ecosystem of repositories. These advances are important for integration into eresearch and authoring workflows.

New policy issues are taking on a growing emphasis. Ownership of information by nations, tribes, churches - sometimes with an orientation towards reparation and restoration of cultural heritage - is taking on a new intensity. It will have an impact, and it comes from a different place than traditional copyright discussions.

Mandates for access to information that are both campus based and agency based will shape our services and policies. These conversations all move forward steadily and require some observation by campus leaders. This raises the broader issue of faculty working together in advancement of the common good, sometimes institutional, sometimes national or beyond.

There was time spent thinking about activities "below the surface."

Everyone recognizes and is even tired of the web2.0 buzz (interactivity, user generated content). Yet this has great impact in how we host dialogue about sharing collections. Where is that conversation to be hosted? The Library of Congress' Flickr hosted projects were cited as an example of moving that hosting in hopes of greater fidelity of information. Interestingly, oral historians might be a resource for learning what's right and wrong here. What do public access interfaces look like? Text mining trends raise interesting access and policy issues. What does it mean to have someone compute on a collection as well as read it? Libraries need to be thinking about this.

It was recognized that cyberlearning is more than "just" the conversation about open educational resources. Where are the boundaries between access and learning? This line stands to get more confusing, not clearer, in the near term. Large scale lecture capture can be a driver as there are unclear goals in doing this capture, but there are real outcomes we don't yet fully understand.

The nature of the digital library is changing. In the beginning, digital libraries were mostly about the digitization of existing collections. Critical mass and scholarly demand beg for "re-unification" of themes, discipline specific themes. The can of worms lies in sustainability and responsibility. The scientific sphere is a place where this is happening too. Both mergers and fragmentation are being observed here.

We're at a time when it would be fair to look at institutional repositories. Have they succeeded? Have they fallen short? Now is probably a fair point in time to take a look at them in this manner.

Cloud storage is something to pay attention to. The hype:reality ratio needs to be understood, however, when thinking about this solution.

Lastly, the current economic climate was discussed. Networked information can make a difference in these difficult times. Maybe we should think about fund raising for the building of access, not just buildings (hear, hear!). Belt tightening should inspire us to collaborate, not withdraw and protect. We also need to do thoughtful and tough triage on what's to be preserved. It is a time to think about doing things radically less expensively, as well. Cliff has spent a lot of time thinking about sustainability as has NSF, Mellon, JISC, etc. Bottom line is that institutional passion for preservation is a requirement going forward. In a related line of thinking, the new administration should consider knowledge, educational and cultural infrastructure as a way to help stimulate recovery as much as it is thinking about civic infrastructures. What would the Civilian Conservation Corps look like for knowledge, educational and cultural infrastructure improvement?

This is only my second CNI meeting, but I find myself enjoying how it forces me to think differently about data, in all aspects.

Geek speak

A couple of things happened to me in recent weeks that I never thought would have driven me to write (again, finally).

I had the opportunity to speak to a group of people last week that I had never addressed before. Not only had I prepared a conversation different than what they were expecting, but I had grotesquely overestimated their knowledge of technology. And in my prepared comments (even though they were wrong) I had spent considerable time making sure the words and my mindset were non-technical and very high-level. As I got started with the presentation, I was caught in a recursive humility loop (recursive humility loop?!?!?).

After I realized (was told) that the group was expecting different material, I headed into the new direction that they had hoped for, and was starting to make some headway. I got the group talking about their personal experiences on the topic and that set up lots of grist for the rest of our time together. In explaining a concept that came up with the 2nd question, I used the word "network." I was halted and asked what a network was - more humble pie. I think I managed to explain what a network was, at least that was the perception with which I was left.

Bottom line is, I think everyone in IT - and I mean everyone - dramatically and regularly underestimates how rarely "our" shared vocabulary is anyone else's. But even more importantly, we regularly project our usually deep understanding of how things work and connect onto our non-IT peers and audiences. People rarely halt us when that happens; they let us keep moving in hopes that we'll say something that they can relate to or that we'll finish so other conversation can pick up the slack. Maybe I should just assign this to myself, but I really do see this happen with alarming frequency. I think the best thing we can do is keep each other honest and listen carefully as we speak in such settings, and offer friendly criticism about how it is we might approach conversations differently.

The second event that happened was getting a link to an article in Governing.com magazine from my father-in-law, "Lighten Up on Language." The article mostly addresses what can happen in higher stakes conversations, but the point is basically the same - we need to think more critically about how we talk and who we are talking to if we want to have the impact we think we can have in our organizations.

Not a new thought, but one that was driven home for me in very real ways recently.

Philippe Petit was a hot commodity when I was growing up.  My memory of the intensity of his place in our culture in the 70's is probably greater than it really was, and for some reason he really sticks out for me.  The image in my head of him tightrope walking between the Twin Towers in NYC is breathtaking.

This week, I had the pleasure and honor of being a part of the most recent graduate ceremony for Penn State participants in our local Information Technology Leadership Program, ably produced by MOR Associates.  I've been to quite a few graduations at this point and I can say that each one has its own personality, this one being no different.  Brian MacDonald is a master at putting everyone on the spot in just that way that they are most uncomfortable, so I was furiously taking notes during the various presentations to make sure I was prepared for the Brian moment.  In less than 40 minutes of presentation I had a full page of notes and could have truly riffed for *hours* on what I was listening to - there was so much journey packed into each presentation that I wanted to talk about it all.

The core of what I heard was about balance.  Or at least, the core of what I digested was about balance.

Balance in perspective.
Balance in problem solving.
Balance in leading, following, fighting, facilitating.
Balance in periodicity (be a specialist, be a generalist at varying frequencies over time).

It was a room full of people who were reflecting on what it was going to take for them all to maintain balance and supporting each other to have the strength to maintain balance.  It was breathtaking.

Imaginiff is one of my family's favorite games to play. I like it because no matter who wins, I can always claim I win.

Imagine if you could write the contract - or rewrite existing contracts - for software that we use. What would you insist upon in terms of integration with existing systems? adherence to standards? terms of use? termination conditions?

I'm participating in such an exercise with some others in the CIC, and if you have a moment to chime in - it would be most helpful.

Thanks.

More slots have been set up for continuing the "Coffee with Kevin" breakfasts. The web site has been set up to accept new registrations. Over the last year, 60 people were able to participate and I'm hopeful that some of you will still take advantage of the opportunity to meet some others from ITS and talk about the issues of the day. If you've already been able to participate and would like to do so again, I ask that you wait a little while to give others a chance - and if after a couple of weeks there are still slots available, go for it.

Looking forward to getting to know some of you a little bit better.

An ANGEL update

On Monday and Tuesday of finals week (fall '07), Penn State's course management system experienced extreme performance problems, a surprising anomaly in a system that had seen few interruptions of service. These problems were very disruptive to faculty, students, and staff during finals week. Our staff worked feverishly over those trying 48 hours to make the system as usable as possible under unprecedented and unpredicted load. Still, the impact to some faculty and students was profound.

Since 2001, we have done rigorous analysis on a semester by semester basis and upgraded hardware, software, and processes accordingly to meet what we believe will be the coming academic year's demands. Heading into this semester there were over 75,000 students signed up for over 250,000 course section enrollments. This finals week, we saw a demand for use not seen before and it appears that there has been a fundamental shift in how it is faculty and staff are using the service as well. This combination of factors pushed demand over the headroom built into the systems.

ITS has been working non-stop since the problems occurred to make the system improvements necessary for a successful spring semester and beyond. What we need to improve falls into two categories: system performance and communications in emergency situations. We are in contact with all of the vendors that provide the various pieces of the system to aid in the analysis of finals week and put in place non-disruptive enhancements to dramatically improve the performance headroom heading into the semester and particularly for finals week. Additionally, we are having intense conversation about how it is we can better reach faculty and students should such a situation confront us all again. It is our goal to have an update on both aspects of the challenges before us on January 7, 2008.

Penn State's course management system is arguably our best supported application/service. There are a number of first-rate personnel that support the service from the technical plumbing to training and to support in the classroom - there isn't an aspect of the service that we don't put a lot of energy and passion into, and we've been doing it for years. And even with that commitment, we've been humbled by the intensity and volume of the anger and frustration voiced by those faculty and students that were most greatly impacted. We deeply regret the disruptions that were caused during finals week, and we are doing all that we can to make the spring a great success so that faculty, staff and students can teach, learn, discover and not have the technology get in the way.

It takes years to build trust and confidence and only days or even moments to lose it. We understand it will take time to regain trust and confidence, and we've tightened up our boot straps to take that journey, however long it may be. We hope that recognition of our historic commitment to ANGEL will reduce the time it takes to regain that trust, but if it doesn't we'll keep making it better until we get back to where we were and beyond.

UPDATE, January 6, 2008:

Staff in ITS have been hard at work to continue to address the issues of finals week.

We are increasing the computing capacity for web transactions by adding 100% more Web servers. Additionally, we are in receipt of a larger, faster database server and will begin acceptance testing as soon as we possibly can.

We are working with our vendors to develop methodologies for increasing computing capacity for finals week at the end of spring semester. There have been countless emails and two conference calls to get to the bottom of the issues. There are tentative plans to conduct a summit of sorts between all parties in the coming weeks - once the limit of what telephone calls can accomplish has been reached.

We are refining technical mitigation strategies should system degradation reoccur. Stay tuned to the ANGEL log on page for a description of those strategies.

We are developing guidelines for faculty about importing their courses and course materials prior to the start of a semester, to spread out the system load due to import/export during the first few days of classes.

We are outlining a process for rapid crisis communications, proactively using a variety of vehicles. We also plan to meet with a subset of ANGEL users so we can better anticipate their usage at critical periods during a semester. Our future crisis communications also will outline ways the community can minimize the load, including alternatives to accomplish some tasks outside of the ANGEL system.

"If history teaches any lesson it is that no nation has an inherent right to greatness. Greatness has to be earned and continually re-earned."

"Only by providing leading-edge human capital and knowledge capital can America continue to maintain a high standard of living - including providing national security - for its citizens."
- Norman Augustine

A review of the report titled, "Is America Falling Off the Flat Earth?" is compelling reading, as is a summary of the report's recommendations. Mr. Augustine also recently briefed congressional staff on the report. His testimony can be found on the National Academies Press web site.

Let's hope our government is listening. There are strong cases made for new (or renewed) investments in education, research and infrastructure - all of which would have an impact on research institutions and their IT organizations.
