
Prasenjit Mitra,
Assistant Professor,

Intelligent Information Systems Laboratory
Cyber Security Laboratory
North-East Visualization and Analytics Center

School of Information Sciences and Technology,
Office: 313F IST Building,
The Pennsylvania State University,
University Park, PA 16802.
Office Phone: +1 (814) 865-4454
Email: pmitra AT ist.psu.edu



Biography:

Prasenjit Mitra received his Doctor of Philosophy degree in Electrical Engineering from Stanford University in 2004. Prior to that, he received a Master of Science degree in Computer Science from The University of Texas at Austin in December 1994. He received his Bachelor of Technology (with Honours) degree in Computer Science and Engineering from the Indian Institute of Technology, Kharagpur, in May 1993.

From 1995, he worked for five years at Oracle Corporation in Redwood Shores, CA, as a senior member of the technical staff in the Server Technologies division, developing database software. He also worked part-time as a senior engineer at Narus and DBWizards.

(Old) Curriculum Vita:
(Including Publication List): [MS-Word] [Postscript]


Research Interests:


Database Systems, Digital Libraries, Data Mining, Semantic Web, Information Retrieval, Artificial Intelligence.

My primary research focus is on information extraction from documents, especially documents retrieved from the World Wide Web. Apart from extracting information from the web, we have started looking into automatically extracting information from tables and images. Of special interest to me is automated geo-spatial information extraction. Typically, I work with domain scientists who have various applications for the extracted information.

 

For a more detailed idea of my research interests, please refer to my publication list and list of sponsored projects.


My three major research projects are as follows:

  1. ChemXSeer (co-PI): In this project, we are investigating the issues involved in constructing an integrated database and digital library for chemical kinetics data. We have developed a chemical name and formula search engine. We are investigating novel information extraction, document segmentation, and indexing schemes. We have also developed a table search engine, TableSeer, which uses a novel ranking function, TableRank, to rank tables extracted automatically from digital documents. Experimental data is often presented as two-dimensional plots in the figures of digital documents, and we aim to extract the data from these 2-D plots automatically. Other topics of interest are web crawling (especially focused crawling), query expansion, and analysis of blogs and social networks.
  2. NEVAC (co-PI): I am a co-PI of the North-East Visualization and Analytics Center. The objective is to allow for efficient processing of large text corpora. We are pursuing research on machine learning algorithms for relationship extraction between named entities, geographic disambiguation, etc. We have developed the FactXtractor system, which extracts relationships between entities from text. We have also designed FEMARepViz, which extracts information (such as topic) from daily FEMA situation update reports, performs geographical disambiguation, and visualizes the extracted information on a map.
  3. GeoCam (co-PI): This project aims to extract “accounts of movement” automatically from text documents, disambiguate descriptions of motion, and combine the extracted information with a geographic information system. See link.

Teaching:

IST 512: Information Processing Technologies and Architectures, Spring 2007, 2008
IST 461: Database Systems Management and Administration, Fall 2006
IST 220: Computer Networks and Telecommunications, Spring 2004-2006, Fall 2007
IST 402: Emerging Topics in Database Systems, Fall 2004, 2005

Other Interesting Links:

Some Maps