Alexander G. Ororbia II
Dept. of Information Science & Technology
Advisor: Dr. C. Lee Giles
Co-Advisor: Dr. David Reitter
Intelligent Information Systems Lab
Applied Cognitive Science Lab
Pennsylvania State University
University Park, PA, 16802
ago109 AT ist DOT psu DOT edu
"The whole thinking process is still rather mysterious to us, but I believe that the attempt to make a thinking machine will help us greatly in finding out how we think ourselves."
B.S.E, Computer Science & Engineering, Bucknell University
Philosophy Minor, Bucknell University
Mathematics Minor, Bucknell University
Ph.D. (Graduate Candidate), Information Science & Technology, Pennsylvania State University
My current research is in developing semi-supervised connectionist models for lifelong learning. In my work, the approach that I have proposed and developed is called deep hybrid learning, where (representation-learning) models are constructed by balancing both generative and discriminative objectives. This allows the exploitation of potentially useful information found in large pools of unlabeled data in tandem with a few task-relevant labeled samples (which are usually costly and difficult to obtain). An example of a model I have proposed under the hybrid learning framework is the deep hybrid Boltzmann machine, shown to the right. My specific focus (w.r.t. task domain) is in language learning and generation.
To empirically demonstrate and analyze the practical viability of these models, I engineer intelligent tools that handle large-scale, mostly unlabeled, scholarly text, character recognition, and legislative speech data-sets. In addition, I have worked at the intersection of Machine Learning and Crowd-sourcing, developing models that learn to error-correct and aggregate the "wisdom of the crowd". I have also analyzed Intelligent Systems, such as CiteSeerX, from a variety of perspectives (e.g., as a dynamical system, as a knowledge construction engine).
Furthermore, I am a philosopher of mind by training. As such, I find it important (and of course quite fun) to ponder the deeper questions related to Artificial General Intelligence (AGI) as well as to reflect on what has been learned over decades of research, including that in (Computational) Cognitive Psychology and Neuroscience.
A deep hybrid Boltzmann machine (DHBM)!
Here is a link to my current Curriculum Vitae (updated as of May 3, 2017). My Google Scholar profile can be found here. I'm also quasi-active on Quora (a question-and-answer forum/website), where I pass on my knowledge by answering questions related to neural architectures, particularly restricted Boltzmann machines and recurrent neural networks.
Tutorials, Talks, & Posters
Alexander G. Ororbia II, David Reitter, and C. Lee Giles. The Temporal Neural Coding Network: Towards Lifelong Language Learning. 11th Annual Machine Learning Symposium. (Peer-reviewed & accepted poster and spotlight talk).
- We present a novel lifelong neural architecture, the Temporal Neural Coding Network (TNCN), and its learning algorithm, for uncovering multiple levels of distributed representations for language data streams. The TNCN model adapts its parameters iteratively as new samples are observed without resorting to the popular, but expensive back-propagation through time procedure often used to calculate gradients for recurrent neural networks. We discuss how the proposed TNCN works specifically on language-based tasks formulated in the streaming setting as well as how the architecture may be adapted to other real-time tasks, such as video sequence modeling. This creates a simple, promising framework for online multi-modal and online semi-supervised learning.
Alexander G. Ororbia II. Deep Learning Applied. SBP-BRIMS 2016.
- In this tutorial, I taught the audience the basics of training a neural architecture (with many hidden layers), covering everything from the needed linear algebra basics and the ideas of reverse-mode differentiation to some tricks of the trade needed to reach the performance seen in modern literature. To motivate the use of these expressive models in social science research, an application in partially annotated, automated content-coding was developed throughout the talk.
C. Lee Giles and Alexander G. Ororbia II. Recurrent Neural Networks: State Machines and Pushdown Automata. ICML 2016: Neural Nets Back To the Future Workshop.
- In this talk, which I co-authored with my advisor, valuable concepts from the long-forgotten '80s and '90s era of recurrent neural networks were presented and discussed. Some of these key ideas, developed by Lee Giles back in the day, included the extraction of finite-state automata from temporal neural models in an effort to uncover what these normally "black-box" architectures learned from data. Here's a link to our slides.
Alexander G. Ororbia II, Tomas Mikolov, and David Reitter. Learning Simpler Language Models with the Delta Recurrent Neural Network Framework. arXiv:1703.08864 [cs.LG].
- Learning useful information across long time lags is a critical and difficult problem for temporal neural models in tasks like language modeling. Existing architectures that address the issue are often complex and costly to train. The Delta Recurrent Neural Network (Delta-RNN) framework is a simple and high-performing design that unifies previously proposed gated neural models. The Delta-RNN models maintain longer-term memory by learning to interpolate between a fast-changing data-driven representation and a slowly changing, implicitly stable state. We show empirically that one concrete instantiation of our proposed Delta-RNN can actually outperform popular complex architectures, such as the Long Short Term Memory (LSTM) and the Gated Recurrent Unit (GRU), in language modeling at the character and word levels, and yields comparable performance at the subword level.
Iulian Serban, Alexander G. Ororbia II, Joelle Pineau, and Aaron Courville. Multi-modal Variational Encoder-Decoders. arXiv:1612.00377 [cs.LG]. (Note: First two authors contributed equally.)
- Many real-world data distributions are complex and multi-modal. Current neural variational inference approaches assume simple, uni-modal priors, which hinders the overall expressivity of the learned models. In this work, we proposed the piecewise-constant prior, a simple, flexible, and efficient prior distribution for capturing a potentially exponential number of modes of an unknown target distribution and developed several. Under our framework for variational encoder-decoder neural models, we investigated the effectiveness of our prior in both document modeling and dialogue modeling.
Alexander G. Ororbia II, C. Lee Giles, and Daniel Kifer. Unifying Adversarial Training Algorithms with Flexible Deep Data Gradient Regularization. Neural Computation. (Accepted).
- In this work, we propose a general framework for learning deep neural models that are more robust to the dreaded "blind-spots" (or adversarial samples). Here, in comparison to previously proposed approaches, we offer a direct approximation of the deep Jacobian term needed for properly regularizing the final model.
Alexander G. Ororbia II, Fridolin Linder, and Joshua Snoke. Privacy Protection for Natural Language: Neural Generative Models for Synthetic Text Data. arXiv:1606.01151 [cs.LG].
- In this work, we propose a neural generative model for synthesizing text data, investigating the trade-off between user privacy and data utility (for downstream ML tasks).
Alexander G. Ororbia II, C. Lee Giles, and David Reitter. Online Semi-Supervised Learning with Deep Hybrid Boltzmann Machines and Denoising Autoencoders. arXiv:1511.06964 [cs].
- In this paper, we propose a framework for jointly training deep hybrid models in the online semi-supervised learning setting, and specifically propose two new architectures learnable under this framework: the Deep Hybrid Boltzmann Machine and the Deep Hybrid Denoising Autoencoder.
Alexander G. Ororbia II, C. Lee Giles, and David Reitter. Learning a Deep Hybrid Model for Semi-Supervised Text Classification. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP). Lisbon, Portugal.
- The Bottom-Up-Top-Down learning procedure is proposed for pseudo-jointly learning a Stacked Boltzmann Expert Network model. We investigate this model's performance in the semi-supervised text categorization setting and compare it against the original (bottom-up only) version proposed in the ECML 2015 paper as well as several other competitive semi-supervised learning approaches.
- Errata: While what appears in the original published paper will work (or did in the experiments I conducted at the time), the correct way to propagate the error signals down the model in Ensemble Back-prop is to replace the Hadamard product with a simple addition. (I checked this by using only the top-down phase of BUTD and obtained better, i.e., more correct, results.)
Alexander G. Ororbia II, David Reitter, Jian Wu, and C. Lee Giles. Online Learning of Deep Hybrid Architectures for Semi-Supervised Categorization. In: Proc. European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD). Porto, Portugal: Springer.
- The Stacked Boltzmann Expert Network and the Hybrid Stacked Denoising Autoencoder models are developed in order to learn deep architectures that are trainable in semi-supervised learning problems. A greedy layer-wise procedure is proposed to learn from finite data-sets, and a relaxed bottom-up variant (that dispenses with the greediness) is investigated in an online learning experiment.
- Note: The slides of the talk I gave on this paper can be found here.
Alexander G. Ororbia II, Jian Wu, Madian Khabsa, Kyle Williams, and C. Lee Giles. Big Scholarly Data in CiteSeerX: Information Extraction from the Web. In: BigScholar, The Second WWW Workshop on Big Scholarly Data: Towards the Web of Scholars.
Jian Wu, Kyle Mark Williams, Hung-Hsuan Chen, Madian Khabsa, Cornelia Caragea, Suppawong Tuarob, Alexander Ororbia, Douglas Jordan, Prasenjit Mitra, and C. Lee Giles. CiteSeerX: AI in a Digital Library Search Engine. (2015) AI Magazine 36(3): 35-48.
Alexander G. Ororbia II, Jian Wu, and C. Lee Giles. CiteSeerX: Intelligent Information Extraction and Knowledge Creation from Web-Based Data. In: 4th Workshop on Automated Knowledge Base Construction (AKBC) (held at NIPS 2014).
Alexander G. Ororbia II, Yang Xu, David Reitter, Vito D'Orazio. Error-correction and Aggregation in Crowd-Sourcing of Geopolitical Incident Information. In: Social Computing, Behavioral Modeling and Prediction. Ed. by N. Agarwal et al. Vol. 9021. Springer, pp. 381-387.
Jian Wu, Alexander G. Ororbia II, Kyle Williams, Madian Khabsa, Zhaohui Wu, and C. Lee Giles. Utility-Based Control Feedback in a Digital Library Search Engine: Cases in CiteSeerX. The 9th International Workshop on Feedback Computing.
Hung-Hsuan Chen, Alexander G. Ororbia II, and C. Lee Giles. ExpertSeer: a Keyphrase Based Expert Recommender for Digital Libraries. [Under Review].
Jian Wu, Kyle Williams, Hung-Hsuan Chen, Madian Khabsa, Cornelia Caragea, Alexander Ororbia, Douglas Jordan, and C. Lee Giles. CiteSeerX: AI in a Digital Library Search Engine. Twenty-Sixth Annual Conference on Innovative Applications of Artificial Intelligence (IAAI '14). (Won "Most Innovative Application of AI" Award).
Zhaohui Wu, Jian Wu, Madian Khabsa, Kyle Williams, Hung-Hsuan Chen, Wenyi Huang, Suppawong Tuarob, Sagnik Ray Choudhury, Alexander Ororbia, Prasenjit Mitra, and C. Lee Giles. Towards Building a Scholarly Big Data Platform: Challenges, Lessons and Opportunities. International Conference on Digital Libraries 2014 (DL '14). (I presented this paper at the JCDL conference itself on September 9, 2014).
Elaina Miller, Alexander G. Ororbia II, Bonnie Reiff. Follow Automata Paper Analysis and Implementation. (Bucknell Tech. Report #12-1). Department of Computer Science, College of Engineering, Bucknell University. (Note: All authors carried equal contribution weight in this work).
Language Modeling (June 2016 - August 2016) - I was a summer intern at the company Interactions L.L.C., where I was tasked with exploring the development of neural language models for a variety of internal tasks (mentored by Ryan Price). I prototyped and investigated the performance of various ideas against well-tuned n-gram language model baselines. Here I got to work with many of the guys from AT&T Bell Labs, including Patrick Haffner (in Machine Learning, who worked and published with the likes of Yann LeCun, Yoshua Bengio, Corinna Cortes, Vladimir Vapnik, and Leon Bottou, to name a few) and Srinivas Bangalore (in Natural Language Processing, Machine Translation, & Speech Understanding).
Language/Document Modeling (June 2015 - present) - I began my work with Dr. Andrew McCallum on building a connectionist architecture for modelling documents at multiple levels of representation. Our work on this is not yet concluded, but keep your eyes open for our upcoming publication on our findings! You can find me on his "People" page here.
Deep Hybrid Learning (December 2013 - present) - This is the subject of my core thesis research, which focuses on developing connectionist architectures for lifelong learning and builds on my contributions to semi-supervised learning.
Connectionist Hierarchical Models of Political Text (October 2014 - present) - This is another application domain, in which I am developing representation learning-based approaches to analyzing political text. I am collaborating with Dr. Burt Monroe of the Penn State Political Science Department on this project.
The Third Eye Project (May 2014 - present) - I have been involved with various aspects of this very useful application case of deep neural architectures. You can find my graduate student profile here.
Human-Aided Machine Learning (September 2013 - 2015) - Graphical models for error-correcting crowd-sourced annotations in complex, computational social science tasks. In particular, I worked on developing a supervised learning system for error-correcting crowd-sourced annotations in a variety of tasks utilizing the Militarized Interstate Dispute (MID) data-set. Recently, our first published work on the topic has aided us in obtaining an NSF grant to fund students to work further on this project.
Artificial Creativity (May 2013 - present) - This is an ongoing exploratory research project that I am conducting in collaboration with Dr. Joseph V. Tranquillo (Bucknell University). The aim of this work is to develop a unified framework for creativity, and my specific goal is to leverage my experience in Computer Science and Engineering to develop an exemplar computational model of creativity. This research has involved studying and extending the fundamental mechanisms of Complex Adaptive Systems laid forth by thinkers such as John Holland and Herbert Simon, the concept of Self-Organized Criticality as developed by Per Bak, and scale-free structure and behaviour (of networks) as investigated by Albert-László Barabási. A more detailed description of the motivation and initial development of the research can be found on Dr. Joseph V. Tranquillo's page describing his work in creativity. One way to view this ongoing research endeavour is to think of it as the development of a candidate ecorithm (as coined by Dr. Leslie Valiant in his book "Probably Approximately Correct").
Academic Home Page Finding, Scalability Study (July 2013 - present) - An in-progress research project on leveraging the classification techniques developed by Sujatha Das Gollapalli, Ph.D. (a recent graduate of the Penn State IST Intelligent Systems Laboratory), to identify academic home pages during a crawl, create a large-scale dataset of academic home pages, and ultimately examine the data for interesting (research) trends in the academic community.
Record Linkage (August 2013 - 2014) - A research project on data cleaning (a critical step in the machine learning/data mining processing pipeline), done in collaboration with Sagnik Ray Choudhury (a Ph.D. student, Penn State IST, Intelligent Information Systems Laboratory).
BisonTax: Design and Implementation of a Lambda Calculus Interpreter (January 2013 - June 2013) - This was an interesting independent research project (design-driven) that I conducted in the last semester of my undergraduate education at Bucknell University (advisor: Dr. Lea Wittie, co-advisor: Dr. Benoit Razet). This work involved a detailed study of the Lambda Calculus (a model for functional programming languages, developed by Stephen Kleene, Alonzo Church, and Alan Turing) and designing and implementing a working interpreter (that supported multiple evaluation strategies) for this model language (as well as my own flavor of Lambda Calculus syntax). The interpreter is to be used in educating future undergraduate Computer Science students about the core concepts of Lambda Calculus in a programming languages course. I designed this software system (back-end interpreter and front-end GUI editor for writing Lambda Calculus code) to be clean and flexible so that the code may be maintained and extended to allow for more advanced functionality (to be done by future undergraduate researchers). It was an exciting project that even required the use and (slight) extension of techniques commonly found in higher-order reasoning systems (think of automated theorem provers), such as De Bruijn indices.
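For readers unfamiliar with De Bruijn indices, here is a minimal Python sketch of how an untyped lambda calculus interpreter can use them to support more than one evaluation strategy. It is only an illustration and is not the BisonTax code: the names (Var, Lam, App, shift, subst, beta, step_normal, step_cbv, evaluate) and the choice of normal-order and call-by-value as the two strategies are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Union


@dataclass(frozen=True)
class Var:
    index: int  # De Bruijn index: number of binders between use and binding lambda


@dataclass(frozen=True)
class Lam:
    body: "Term"


@dataclass(frozen=True)
class App:
    fn: "Term"
    arg: "Term"


Term = Union[Var, Lam, App]


def shift(term: Term, d: int, cutoff: int = 0) -> Term:
    """Add d to every free variable (index >= cutoff) in term."""
    if isinstance(term, Var):
        return Var(term.index + d) if term.index >= cutoff else term
    if isinstance(term, Lam):
        return Lam(shift(term.body, d, cutoff + 1))
    return App(shift(term.fn, d, cutoff), shift(term.arg, d, cutoff))


def subst(term: Term, j: int, s: Term) -> Term:
    """Replace variable j in term with s, shifting s under each binder."""
    if isinstance(term, Var):
        return s if term.index == j else term
    if isinstance(term, Lam):
        return Lam(subst(term.body, j + 1, shift(s, 1)))
    return App(subst(term.fn, j, s), subst(term.arg, j, s))


def beta(lam: Lam, arg: Term) -> Term:
    """Beta-reduce (lam arg) without any variable renaming."""
    return shift(subst(lam.body, 0, shift(arg, 1)), -1)


def step_normal(term: Term) -> Optional[Term]:
    """One leftmost-outermost step, or None if term is in normal form."""
    if isinstance(term, App):
        if isinstance(term.fn, Lam):
            return beta(term.fn, term.arg)
        reduced = step_normal(term.fn)
        if reduced is not None:
            return App(reduced, term.arg)
        reduced = step_normal(term.arg)
        return App(term.fn, reduced) if reduced is not None else None
    if isinstance(term, Lam):
        reduced = step_normal(term.body)
        return Lam(reduced) if reduced is not None else None
    return None


def step_cbv(term: Term) -> Optional[Term]:
    """One call-by-value step (arguments first, never under a lambda)."""
    if isinstance(term, App):
        reduced = step_cbv(term.fn)
        if reduced is not None:
            return App(reduced, term.arg)
        reduced = step_cbv(term.arg)
        if reduced is not None:
            return App(term.fn, reduced)
        if isinstance(term.fn, Lam) and isinstance(term.arg, Lam):
            return beta(term.fn, term.arg)
    return None


def evaluate(term: Term, step: Callable[[Term], Optional[Term]],
             max_steps: int = 1000) -> Term:
    """Apply step repeatedly until no redex remains or the budget runs out."""
    for _ in range(max_steps):
        nxt = step(term)
        if nxt is None:
            return term
        term = nxt
    raise RuntimeError("no normal form reached within max_steps")


I = Lam(Var(0))                        # \x. x
K = Lam(Lam(Var(1)))                   # \x. \y. x
W = Lam(App(Var(0), Var(0)))           # \x. x x
OMEGA = App(W, W)                      # diverging term

result = evaluate(App(App(K, I), OMEGA), step_normal)
```

Under normal-order reduction the term K I OMEGA discards the diverging argument and reduces to the identity, whereas the call-by-value stepper keeps reducing OMEGA and exhausts its step budget. Because variables are numbers rather than names, substitution needs no alpha-renaming, which is the usual reason for adopting this representation.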