Research and Publications

Modern life sciences research increasingly relies on information technology: data storage, custom algorithms, advanced search and query mechanisms, automation of recurring tasks, and presentation and visualization interfaces.

I aim to combine agile software development methodologies with the latest advances in computational infrastructure development. A paradigm shift is underway in computational resource management and I'm excited to be a participant in it. Forget about the rooms full of computers and the tedious management that comes with them. Using Amazon's Elastic Compute Cloud and Simple Storage Service and Google's App Engine, we will be able to build a low-cost, efficient and scalable infrastructure that can handle the scientific data analysis and visualization challenges of the bioinformatics era.

I'm most interested in applying modern and sophisticated computational solutions to diverse biological problems. I have led the development of several software projects, some with teams, others as the sole developer. Here are some quotes that best capture my philosophy:

A true object oriented program is designed without a "top" (Meyer 1997), that is, without a well-defined single high level "here's how the whole thing works". What the user sees as an application program is but one particular entry point that invokes the objects necessary for the application. This now implies that what the user perceives as the application could be dramatically altered by augmenting and combining the available objects in a different way.
From The Pragmatic Programmer by Andrew Hunt and David Thomas

One of the big insights in the last few years, through work by the internet search engines but also tools like Udi Manber's glimpse, is that data with no meaningful structure can still be very powerful if the tools to help you search the data are good. In fact, structure can be bad if the structure you have doesn't fit the problem you're trying to solve today, regardless of how well it fit the problem you were solving yesterday. So I don't much care any more how my data is stored; what matters is how to retrieve the relevant pieces when I need them ... Expect more liberation as searching replaces structure as the way to handle data.
From an interview with Rob Pike


Software Projects

  • 2008 - GeneTrack - bioinformatics software package for storing, querying and visualizing interval-oriented data.

  • 2008 - BooleanNet - Boolean network simulation software for the life sciences.

  • 2007 - Bioinformatics Project Director of the Genome Cartography Project.

  • 2006 - Lead developer of MiniDB, a data storage system for microarray research. A collaboration with Frank Pugh (folded into the Genome Cartography Project).

  • 2005-2006 - Lead developer of Galaxy, a web-based data analysis framework (funded by NSF, served as Co-PI from 2004 to 2005). A collaboration with Anton Nekrutenko and Ross Hardison (2005).

  • 2004-2005 - Lead developer of LionDB, a laboratory data management system (in continuous operation since September 2004, it serves the data exchange needs of life science researchers at Penn State). A collaboration with Naomi Altman and Craig Praul (2004).


LionDB

Galaxy

MiniDB

Nucleosome Predictions

Nucleosome Browser

 

Publications

I have worked and published in three distinct scientific fields:

  • 10 publications in bioinformatics at the Pennsylvania State University - as a Research Associate (2004-2006), then as an Assistant Professor of Bioinformatics from 2006 to the present

  • 5 publications in computer science, data mining and collaborative filtering at the Computer Science Department of the University of Minnesota - as a Research Manager from 2001 to 2003

  • 8 publications in computational and experimental physics at the Physics Department of the University of Notre Dame - as a Graduate Student from 1996 to 2001

List of Publications

  1. Nucleosome organization in the Drosophila genome,
    Travis N. Mavrich, Cizhong Jiang, Ilya P. Ioshikhes, Xiaoyong Li, Bryan J. Venters, Sara J. Zanton, Lynn P. Tomsho, Ji Qi, Robert Glaser, Stephan C. Schuster, David S. Gilmour, Istvan Albert, and B. Franklin Pugh
    Nature (2008)

  2. A barrier nucleosome model for statistical positioning of nucleosomes throughout the yeast genome
    Travis N. Mavrich, Ilya P. Ioshikhes, Bryan J. Venters, Cizhong Jiang, Lynn P. Tomsho, Ji Qi, Stephan C. Schuster, Istvan Albert, and B. Franklin Pugh
    Genome Research (2008)

  3. GeneTrack – a genomic data processing and visualization framework
    Istvan Albert, Shinichiro Wachi, Cizhong Jiang, and B. Franklin Pugh
    Bioinformatics (2008) - software website

  4. Translational and rotational settings of H2A.Z nucleosomes across the Saccharomyces cerevisiae genome
    Istvan Albert, Travis N. Mavrich, Lynn P. Tomsho, Ji Qi, Sara J. Zanton, Stephan C. Schuster, and B. Franklin Pugh
    Nature 446, 572-576, (2007) - see Genome Cartography

  5. A framework for collaborative analysis of ENCODE data: Making large-scale analyses biologist-friendly
    Daniel Blankenberg, James Taylor, Ian Schenk, Jianbin He, Yi Zhang, Matthew Ghent, Narayan Veeraraghavan, Istvan Albert, Webb Miller, Kateryna Makova, Ross C. Hardison, and Anton Nekrutenko
    Genome Research 17:960-964, (2007) - website

  6. Nucleosome positions predicted by comparative genomics
    Ilya P. Ioshikhes, Istvan Albert, Sara J. Zanton, and B. Franklin Pugh
    Nature Genetics 38, 1210-1215 (2006) - supporting website

  7. Rapid and Asymmetric Divergence of Duplicate Genes in the Human Gene Coexpression Network
    Wen-Yu Chung, Reka Albert, Istvan Albert, Anton Nekrutenko and Kateryna D. Makova
    BMC Bioinformatics 7:46 (2006)

  8. Galaxy: A platform for interactive large-scale genome analysis
    Belinda Giardine, Cathy Riemer, Ross Hardison, Richard Burhans, Laura Elnitski, Prachi Shah, Yi Zhang, Daniel Blankenberg, Istvan Albert, Webb Miller, James Kent and Anton Nekrutenko
    Genome Research 15:1451-1455, 2005

  9. Conserved network motifs allow protein-protein interaction prediction
    Istvan Albert and Reka Albert
    Bioinformatics, 12 December 2004; Vol. 20, No. 18

  10. HyBrow - A Prototype System for Computer Aided Hypothesis Evaluation
    Stephen Racunas, Nigam Shah, Istvan Albert and Nina Fedoroff
    Bioinformatics 4 Aug 2004; Vol 20, Suppl 1:I257-I264
    related website: http://www.hybrow.org

  11. Structural Vulnerability of the North American Power Grid
    Reka Albert, Istvan Albert and Gary L. Nakarado
    Physical Review E, 69, 025103(R) (2004)

  12. Experiences with a Recommender System on Four Mobile Devices
    Brad Miller, Istvan Albert, Shyong Lam, Joe Konstan & John Riedl
    17th Annual Human-Computer Interaction Conference (HCI'03), Bath, England, 8-12 September 2003

  13. Is Seeing Believing? How Recommender Systems Influence Users' Opinions
    Cosley, D., Lam, S.K., Albert, I., Konstan, J., & Riedl, J.
    Proceedings of CHI 2003 Conference on Human Factors in Computing Systems (CHI'03), Fort Lauderdale, FL, pp. 585-592

  14. MovieLens Unplugged: Experiences with an Occasionally Connected Recommender System
    Bradley N. Miller, Istvan Albert, Shyong K. Lam, Joseph A. Konstan, John Riedl
    Proceedings of ACM 2003 International Conference on Intelligent User Interfaces (IUI'03)

  15. Getting to Know You: Learning New User Preferences in Recommender Systems
    Rashid, A.M., Albert, I., Cosley, D., Lam, S.K., McNee, S., Konstan, J.A., & Riedl, J.
    Proceedings of the 2002 International Conference on Intelligent User Interfaces (IUI'02), San Francisco, CA, pp. 127-134.

  16. On the Recommending of Citations for Research Papers
    McNee, S., Albert, I., Cosley, D., Gopalkrishnan, P., Lam, S.K., Rashid, A.M., Konstan, J.A., & Riedl, J.
    Proceedings of ACM 2002 Conference on Computer Supported Cooperative Work (CSCW'02), New Orleans, LA, pp. 116-125.

  17. Modeling relaxation and jamming in granular media
    B. Kahng, I. Albert, P. Schiffer, and A.-L. Barabási
    Physical Review E 64, 051303 (2001)

  18. Granular Drag on a Discrete Object: Shape Effects on Jamming
    I. Albert, J. G. Sample, A. J. Morss, S. Rajagopalan, A.-L. Barabási, and P. Schiffer
    Physical Review E 64, 061303-1 - 061303-4 (2001)

  19. The Drag Force in Granular Media
    P. Schiffer, I. Albert, J. G. Sample, and A.-L. Barabási
    Proceedings of Fourth International Conference on Micromechanics of Granular Media, ed. Y. Kishino, (A. A. Balkema, Lisse, 2001)

  20. Stick-Slip Fluctuations in Granular Drag
    I. Albert, P. Tegzes, R. Albert, J. G. Sample, A.-L. Barabási, T. Vicsek, and P. Schiffer
    Physical Review E 64, 031307-1 - 031307-9 (2001)

  21. An Experimental Study of the Fluctuations in Granular Drag
    István Albert, Pál Tegzes, Réka Albert, John Sample, Albert-László Barabási, Tamás Vicsek, B. Kahng and Peter Schiffer
    Proc. Materials Research Society Symposium Series 627, ed. S. Sen, M. Hunt (2000)

  22. Jamming and Fluctuations in Granular Drag
    I. Albert, P. Tegzes, B. Kahng, R. Albert, J. Sample, M. Pfiefer, A.-L. Barabási, T. Vicsek, and P. Schiffer
    Physical Review Letters 84, 5122-5 (2000)

  23. Maximum Angle of Stability in Wet and Dry Spherical Granular Media
    R. Albert, I. Albert, D. J. Hornbaker, P. Schiffer, and A.-L. Barabási
    Physical Review E 56, R6271 (1997)

  24. What Keeps Sandcastles Standing
    D. J. Hornbaker, R. Albert, I. Albert, A.-L. Barabási, and P. Schiffer
    Nature 387, 765 (1997)

Blast from the Past

Software projects that I have worked on in the past:

  • Lead developer of MovieLens, a movie recommendation site maintained by the GroupLens research group at the University of Minnesota and used to test novel prediction algorithms and user interface elements. The site has over 30 thousand registered users and manages millions of ratings. I was in charge of implementing the database and server infrastructure (2001-2003). The site was built with XML/XSLT and JavaServer (Apache Tomcat) technologies.

  • Written for fun:
