Archimedes: A Probabilistic Master Knowledge Base System
The Archimedes project aims to build a probabilistic master knowledge base system by combining novel system components and algorithms that we are designing and building at UF. Within Archimedes, the UF Data Science Research (DSR) group pursues a spectrum of research directions, including query-driven and scalable statistical inference, probabilistic data models, state-parallel and data-parallel analytics frameworks, multimodal (e.g., text, image) information extraction, and KB schema enrichment. This line of research on supporting large-scale, automatically extracted knowledge bases has high impact for many application domains, from medical informatics to ecology. We have received funding from both the federal government and industry, including DARPA, EMC, Amazon, and Google. Related projects include DeepDive from Stanford, YAGO from the Max Planck Institute, NELL from CMU, WikiData/Freebase, and Google Knowledge Vault.
ProbKB | Large-scale Probabilistic Reasoning over Uncertain Knowledge Bases |
HypoGator | Distinct Hypotheses and Claims Retrieval with Stance Detection on Controversial Topics |
DBlytics/MADlib | Textual Retrieval/Analytics in distributed MPP frameworks over hybrid hardware |
Archer | Query-Driven Machine Learning |
CAMeL | Leverage Crowd Support in Probabilistic Databases |
SigmaKB | Knowledge fusion, cleaning and knowledge base integration |
Rose | Knowledge Extraction and Exchange from Electronic Health Records |
SMARTeR | Smarter information retrieval system |
VITA | Multimodal knowledge extraction and fusion |
Selected Publications
- ArchimedesOne: Query Processing over Probabilistic Knowledge Bases
Xiaofeng Zhou, Yang Chen, Daisy Zhe Wang
Proceedings of the VLDB Endowment, 2016
- Ontological Pathfinding: Mining First-Order Knowledge from Large Knowledge Bases
Yang Chen, Sean Goldberg, Daisy Zhe Wang, Soumitra Siddharth Johri
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2016
- UDA-GIST: An In-database Framework to Unify Data-Parallel and State-Parallel Analytics
Kun Li, Daisy Zhe Wang, Alin Dobra, Christopher Dudley
Proceedings of the 41st Very Large Data Bases (VLDB) Endowment, 2015
- Efficient In-Database Analytics with Graphical Models
Daisy Zhe Wang, Yang Chen, Christan Grant, Kun Li
IEEE Data Engineering Bulletin, 2014
- Knowledge Expansion over Probabilistic Knowledge Bases
Yang Chen, Daisy Zhe Wang
Proceedings of the ACM SIGMOD International Conference on Management of Data, 2014
- CASTLE: Crowd-Assisted System for Textual Labeling and Extraction
Sean Goldberg, Daisy Zhe Wang, Tim Kraska
Proceedings of the First AAAI Conference on Human Computation and Crowdsourcing (HCOMP-13)
- The MADlib Analytics Library or MAD Skills, the SQL
Joseph M. Hellerstein, Christopher Ré, Florian Schoppmann, Daisy Zhe Wang, Eugene Fratkin, Aleks Gorajek, Kee Siong Ng, Caleb Welton, Xixuan Feng, Kun Li, Arun Kumar
Proceedings of the 38th Very Large Data Bases (VLDB) Endowment, 2012