February 2008 Archives

Why the Numbers are Wrong

Polls in primary states where Senator Barack Obama does well have consistently understated his results. Look at the numbers in the Potomac primaries last Tuesday. Most of the polls in Virginia and Maryland had him winning with percentages in the mid-50s. He won with almost two-thirds of the vote in VA and 60% in MD. What are the pollsters missing?

If you're under 30, do you have a phone? Of course you do. What someone who asks that (pollsters included) really means is, "Do you have a landline?" Many of you would now say no. The fact is that many (most?) younger voters will not be captured in a representative way by traditional "phone polls" because they don't have traditional "phones." It's said that young people don't vote, but apparently they are voting for Senator Obama.

The question now is, how do traditional polling companies include younger voters without landlines? How can they do last minute polling without relying on landlines? Internet polls often capture the zealots rather than the "average" voter. I don't know the answer.

Last year we formed the ITS-ITANA focus groups, which represent vertical but broad topics in information architecture (e.g. Storage and IAM). In our charge to these groups, we asked them to consider the effects of "horizontal" enabling/disruptive forces on their "verticals." One of these "horizontals" is "mobility and mobile devices." How do we deal with the fact that an increasing number of our students are carrying very capable mobile devices with them? Most of us call these "cell phones," but I include things like Apple iPods and Amazon Kindles too. These devices can send/receive phone calls (though not the iPod Touch or Amazon Kindle, for instance), send/receive text messages, "do Web", and provide/consume location-aware information. As we develop new architectures for our Penn State customers, we need to take this into account.

Sonification of Data

What often happens to me is that I'll hear about something or hear a word that I'm sure I've never thought about before, and then very soon after that I'll hear about it again. I think to myself, well if it's this common, how come you haven't heard about it before...

One such "theme cluster" started when I was at Caltech working with the LIGO folks. A couple of the research scientists and post-docs were talking about how data streams are being broadcast into the control rooms of these experiments while data is being collected. By "broadcast," I assumed that the raw data feeds were available real-time or near real-time from the instruments. While that is the main meaning, one of the grad students said that someone has made an audio stream out of the data and that it too is available to anyone who would like to listen to it. She also said that a couple of the people have been trained with simulated data to "listen for" gravitational wave detections by (or, more commonly, problems with) the instruments. Like the noise accompanying a heart rate monitor, I thought, one could engage another sense during data collection and instrument monitoring.

So I filed it away in the "cool idea" category and talked to a few people about it last Monday and Tuesday when I got back into the office. On Thursday of last week I went to an all-day session which included some research presentations by our faculty. Dr. Mark Ballora, Associate Professor in Art & Music, presented on "sonification of data." So now I have a term for that thing in my "cool idea" category. As Dr. Ballora explains it, for sonification to be useful, you have to sonify the right quantity or quantities. For his thesis research, he sonified the interval between similar points (often peaks) in a heart rhythm. As Dr. Ballora's Web page says:

Here we propose a novel diagnostic method based in music technology. Digital music software is employed to transform the sequence of intervals between consecutive heartbeats into an electroacoustic soundtrack. The results show promise as a diagnostic tool and also provide the basis of an interesting musical soundscape.

When you listen to Dr. Ballora's samples, you hear two distinct sounds in the NN+50NN samples (which are the ones he played for us). The NN is the interval between beats, represented by one set of tones, and the NN50 tones are additional sounds which occur if the interbeat variation is more than 50 ms (milliseconds). In a healthy heart, the latter happens more than I would have guessed (I guess I'll add an "I am not a physician, IANAP" to go with my "I am not a lawyer, IANAL" category). Check out the samples for yourself on Dr. Ballora's Web pages. This could be a very powerful, non-invasive diagnostic tool.
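
For the curious, here is a rough Python sketch of the idea as I understand it, not Dr. Ballora's actual software. It turns a list of beat times into NN intervals, flags the places where consecutive intervals differ by more than 50 ms, and maps each interval to a pitch. The function names, the linear pitch mapping, and the note range are all my own assumptions for illustration.

```python
def nn_intervals(beat_times_s):
    """Intervals in milliseconds between consecutive beats (times in seconds)."""
    return [round((b - a) * 1000.0, 3)
            for a, b in zip(beat_times_s, beat_times_s[1:])]


def nn50_flags(intervals_ms, threshold_ms=50.0):
    """True wherever an interval differs from the previous one by more than the threshold."""
    return [abs(b - a) > threshold_ms
            for a, b in zip(intervals_ms, intervals_ms[1:])]


def interval_to_midi(interval_ms, lo_ms=400.0, hi_ms=1200.0, lo_note=48, hi_note=84):
    """Hypothetical mapping: shorter interval (faster heart) -> higher MIDI note."""
    clamped = min(max(interval_ms, lo_ms), hi_ms)
    frac = (hi_ms - clamped) / (hi_ms - lo_ms)
    return round(lo_note + frac * (hi_note - lo_note))


def sonify(beat_times_s):
    """Return one (midi_note, play_extra_nn50_tone) pair per NN interval."""
    intervals = nn_intervals(beat_times_s)
    # The first interval has no predecessor, so it never triggers the extra tone.
    flags = [False] + nn50_flags(intervals)
    return [(interval_to_midi(iv), flag) for iv, flag in zip(intervals, flags)]
```

Feeding the resulting pairs to any synthesizer would give one tone per beat interval plus an extra sound wherever the flag is set, which is roughly the two-layer effect heard in the samples.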

File that away in your "cool idea" category!

Einstein@Home and "Latent" Computing

At a strategic planning session on research support recently, I suggested we look into using our "latent computing" resources to work on research computing problems during off-peak times. What I meant by this is that we have thousands of very fast processors which are sitting idle in our student labs between 11PM and 7:45AM every day. That's a chunk of almost nine hours for almost half of the machines in our labs. Originally, I was thinking that we should run Condor clustering on these after any updates are done to them each night. I was told that it's not clear that we want to reconfigure them each night and that there is only a certain class of problems which run this way. While I agree with the latter point, I can think of a few massively parallel (MPP) or embarrassingly parallel (EPP) processing problems off the top of my head in which Penn State researchers are actively engaged (e.g. Monte Carlo simulations of radiative transport).
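
To show what I mean by "embarrassingly parallel," here is a toy Python sketch, not anyone's real research code. It estimates how many photons pass straight through a purely absorbing slab of a given optical depth, a drastically simplified stand-in for a radiative transport Monte Carlo. Each chunk of photons gets its own seed and needs nothing from any other chunk, so chunks could be handed to idle lab machines in any order. All names and parameters here are hypothetical.

```python
import math
import random


def simulate_chunk(work_unit):
    """Run one independent work unit: (seed, photon_count, optical_depth)."""
    seed, n_photons, tau = work_unit
    rng = random.Random(seed)  # private generator, so chunks never interact
    # A photon gets through if its random free path exceeds the slab depth.
    transmitted = sum(1 for _ in range(n_photons) if rng.expovariate(1.0) > tau)
    return transmitted, n_photons


def analytic_transmission(tau):
    """Exact answer for this toy case, useful as a sanity check."""
    return math.exp(-tau)


def estimate_transmission(tau, chunks=20, photons_per_chunk=5000, map_fn=map):
    """Split the job into chunks, run them via map_fn, and combine the tallies."""
    work_units = [(seed, photons_per_chunk, tau) for seed in range(chunks)]
    results = list(map_fn(simulate_chunk, work_units))
    transmitted = sum(t for t, _ in results)
    total = sum(n for _, n in results)
    return transmitted / total
```

The only shared step is adding up the tallies at the end. Swapping the built-in `map` for the `map` method of a `concurrent.futures.ProcessPoolExecutor` would spread the chunks over local cores, and a system like Condor or BOINC plays the same role across many machines.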

When I was at Caltech last week, one of those problems landed in my lap -- Einstein@Home. The great thing about Einstein@Home is that it works on a set of data which Penn State researchers have helped gather and are in the process of going through and cataloging. Einstein@Home operates on the LIGO Scientific Collaboration's data runs, looking for observations of gravitational waves. Rather than using something more tightly coupled like Condor, Einstein@Home uses BOINC -- the Berkeley Open Infrastructure for Network Computing. BOINC was developed for SETI@Home but has been generalized for problems which can run over networks of loosely coupled computers. Some of you have probably already used BOINC without knowing it when you loaded the SETI@Home screen saver or contributed to the Protein@Home effort. BOINC is lightweight and gives priority to everything else. What you don't use on your computer, BOINC can use.

We again may be confronted with the question, "But is it Green?" As I said in a previous post, if we design a system to lower peaks or even out energy utilization over the course of a day, while the system may use extra power, it is making use of power at times of traditionally low power consumption. Let's try to take advantage of our latent computing power with the goal of aiding Penn State researchers.

I would be interested in other BOINC-enabled, MPP, or EPP problems which you think would fit nicely into our down times, particularly if those problems would directly advance the cause of Penn State research. Please drop me a line or comment on this entry.

About this Archive

This page is an archive of entries from February 2008 listed from newest to oldest.
