• Home
  • Blog
  • People
  • Projects
  • Publications
  • Seminars
  • DSR Expo
  • Courses

Data Science Research

  • Efficient Conditional Rule Mining over Knowledge Bases

    December 12, 2018     publications, research directions

    Current web-scale knowledge bases (KBs) incorporate a substantial amount of information in a structured format. The availability of this readily machine-digestible data has made KBs a desirable resource for other applications. This has motivated many to explore learning on KBs. Graph embeddings and inference rule learning are examples of such methods. This paper concerns the latter, mostly because

    Read more »

  • Mining Rules Incrementally over Large Knowledge Bases

    February 12, 2018     publications, research directions

    Xiaofeng Zhou

    Multiple web-scale knowledge bases (e.g., Freebase, YAGO, NELL) have been constructed using semi-supervised or unsupervised information extraction techniques and many of them, despite their large sizes, are continuously growing. Much research effort has been put into mining inference rules from these knowledge bases. To address the task of rule mining over evolving web-scale knowledge bases,

    Read more »

  • Archimedes: Efficient Query Processing over Probabilistic Knowledge Bases

    June 26, 2017     publications, research directions

    We present the ARCHIMEDES system for efficient query processing over probabilistic knowledge bases. We design ARCHIMEDES for knowledge bases containing incomplete and uncertain information due to limitations of information sources and human knowledge. Answering queries over these knowledge bases requires efficient probabilistic inference. In this paper, we describe ARCHIMEDES’s efficient knowledge expansion and query-driven inference over UDA-GIST, an

    Read more »

  • Extracting Visual Knowledge from the Web with Multimodal Learning

    May 26, 2017     publications, research directions

    We consider the problem of automatically extracting visual objects from web images. Despite the extraordinary advancement in deep learning, visual object detection remains a challenging task. To overcome the deficiency of pure visual techniques, we propose to make use of meta text surrounding images on the Web for enhanced detection accuracy. In this work we present

    Read more »

  • The ArchimedesOne Knowledge Base System

    October 20, 2016     publications, research directions

    Yang Chen, Xiaofeng Zhou

    Recent developments in information extraction and data management systems have spurred growing efforts to construct large knowledge bases (KBs). These knowledge bases store information in a structured format, facilitating efficient processing and querying. Examples of these knowledge bases include DBpedia, DeepDive, Freebase, Google Knowledge Graph, Knowledge Vault, NELL, OpenIE, ProBase, ProbKB, and

    Read more »

  • NIST 2015 Knowledge Base Population – Ensemble

    April 15, 2016     NIST and open eval, research directions

    Miguel Rodriguez

    The Text Analysis Conference (TAC) is a series of evaluation workshops organized by NIST to encourage research in Natural Language Processing and related applications. TAC is focused on Knowledge Base Population (KBP), automated systems that discover information about entities found in a large corpus and incorporate it into a knowledge base. The TAC-KBP evaluation is composed

    Read more »

  • NIST 2015 Pre-Pilot Data Science Evaluation

    March 21, 2016     courses, NIST and open eval, research directions

    Dihong Gong and Daisy Zhe Wang

    We participated in the 2015 Pre-Pilot Data Science Evaluation organized by the National Institute of Standards and Technology (NIST). The primary goal of the pre-pilot evaluation is to develop and exercise the evaluation process in the context of data science. The evaluation consists of four tasks, including data cleaning,

    Read more »

  • SMART Electronic Discovery: System Evaluation

    December 27, 2015     research directions

    Clint P. George

    This is an extension to our earlier post, SMART Electronic Discovery (SMARTeR), which describes a framework for electronic discovery (e-discovery). Discovery is a pre-trial procedure in a lawsuit or legal investigation in which each party can obtain evidence from other parties (typically via a request for production) according to the laws of civil procedure in the

    Read more »

  • Multimodal Ensemble Fusion for Disambiguation and Retrieval

    November 25, 2015     publications, research directions

    Yang Peng

    A huge amount of multimedia data is generated on the Internet every day. To make use of multimodal data, researchers have developed many multimodal machine learning models that integrate data of multiple modalities, including text, images, audio, and video. In the multimedia analysis community, multimodal fusion is widely employed for various multimedia analysis

    Read more »

  • MADlib now an ASF incubator project! Do you have MAD Skills?

    October 16, 2015     publications, research directions

    Daisy Zhe Wang

    MADlib is an open-source library (licensed under the 2-clause BSD license) for scalable in-database analytics. It provides data-parallel implementations of mathematical, statistical and machine learning methods for structured and unstructured data. The MADlib mission is to foster widespread development of scalable analytic skills, by harnessing efforts from commercial practice, academic research, and open

    Read more »


Recent Posts

  • DBSim: Extensible Database Simulator for Fast Prototyping In-Database Algorithms
  • DrugEHRQA: A Question Answering Dataset on Structured and Unstructured Electronic Health Records For Medicine Related Queries
  • A Brief Overview of Weak Supervision
  • DRUM: End-To-End Differentiable Rule Mining On Knowledge Graphs
  • IDTrees Data Science Challenge: 2017

