- Dec 2016, our journal paper by Sean Goldberg et al., pi-CASTLE: A Probabilistically Integrated System for Crowd-Assisted Text Labeling and Extraction, is published in ACM JDIQ (Journal of Data and Information Quality), 2017.
- March 2016, the UF DSR Lab is invited to participate in the NIST Data Science pre-pilot evaluation workshop 2016 and will present (1) the results of UF's participation in the 2015 NIST Data Science pre-pilot evaluation and (2) a proposal for a new Data Science evaluation on Computational Ecology using remote sensing and data from the NSF NEON program.
- March 2016, Consensus Maximization Fusion of Probabilistic Information Extractors by Miguel Rodriguez et al. is accepted at HLT-NAACL 2016. This CMF algorithm participated in the TAC KBP SFV evaluation organized by NIST in 2015 and achieved top-3 ranked results in the CSSF/CSKB and overall ensemble runs.
- Feb 2016, I visited Computer Science at the University of Miami and the Information Sciences Institute at the University of Southern California, and gave talks on different aspects of Archimedes. I also visited UC Irvine to discuss research projects.
- Jan 2016, I participated in the OneFlorida Clinical Research Consortium’s Second Annual Stakeholder meeting. UF CTSI is drawing on the NLP expertise in the UF DSR Lab to support the newly funded OneFlorida Clinical Research Consortium, which was recently designated as one of the nation’s 13 clinical data research networks (CDRNs) by the Patient-Centered Outcomes Research Institute (PCORI) to accelerate the translation of promising research findings into improved patient care.
- Spring 2016, I am advising four student projects in CAP4773/CAP6779 Project in Data Science: (1) contributing to Apache MADlib; (2) legal citation graph analytics and case predictions; (3) automatically extracting biomedical knowledge bases; and (4) a distributed RDF store for query processing over large knowledge bases.
- Nov 2015, Ontological Pathfinding: Mining First-Order Knowledge from Large Knowledge Bases by Yang Chen et al. is accepted at SIGMOD 2016.
- In October 2015, MADlib, the open-source library for scalable in-database analytics, became an Apache Software Foundation Incubator project: MADlib@ASF. Students from the DSR Lab and my Data Science courses are excited to continue contributing!
- My research on “Efficient Query Processing over Large Probabilistic Knowledge Bases” is funded by the NSF Division of Information & Intelligent Systems (IIS), starting Sept 1, 2015.
- Fall 2015, I will be teaching Introduction to Data Science and giving guest lectures in informatics courses in other disciplines, such as Foundations of Biomedical Informatics taught by Dr. William Hogan from UF CTSI.
- As a PC group leader for SIGMOD 2016, I will be part of the effort led by Dr. Sam Madden from MIT to use online conference calls to enhance the paper review process.
- Summer 2015, the DSR Lab proudly graduated two Ph.D.s: Dr. Kun Li and Dr. Christan Grant. One is currently at Google and the other is starting as an assistant professor at the University of Oklahoma with their Data Science and Analytics program.
- In Spring and Summer 2015, I gave different versions of the talk “Archimedes: A Master Probabilistic Knowledge Base System” at Google Research, Berkeley AMP Lab, Sandia Livermore Lab, University of Toronto and Harris Corporation.
- UDA-GIST: An In-database Framework to Unify Data-Parallel and State-Parallel Analytics by Kun Li et al. is accepted at VLDB 2015.
- Fall 2014, I am co-teaching Projects in Data Science with Dr. Sanjay Ranka. It is the second course in the three-course UF CISE Data Science Curriculum.
- Together with Dr. Tyson Condie at UCLA, I serve as the Proceedings Chair for VLDB 2015.
- The paper Knowledge Expansion over Probabilistic Knowledge Bases, with my student Yang Chen, was accepted and presented at SIGMOD 2014. I also gave an invited talk at the WACCK workshop (Workshop on Automatic Creation and Curation of Knowledge Bases) at SIGMOD 2014.
- I gave a talk on Knowledge Base Construction from Big Text, Images and Crowds at a WISE event in June 2014, organized by TRUST at Cornell University with Big Data research as the central theme.