SIGMOD '16: Proceedings of the 2016 International Conference on Management of Data
SESSION: Keynote - Jeff Dean
Building Machine Learning Systems that Understand
Jeff Dean
SESSION: Session 1 - Scalable Analytics and Machine Learning
Learning Linear Regression Models over Factorized Joins
Maximilian Schleich
Dan Olteanu
Radu Ciucanu
To Join or Not to Join?: Thinking Twice about Joins before Feature Selection
Arun Kumar
Jeffrey Naughton
Jignesh M. Patel
Xiaojin Zhu
Real-time Video Recommendation Exploration
Yanxiang Huang
Bin Cui
Jie Jiang
Kunqian Hong
Wenyu Zhang
Yiran Xie
Towards Globally Optimal Crowdsourcing Quality Management: The Uniform Worker Setting
Akash Das Sarma
Aditya Parameswaran
Jennifer Widom
Building the Enterprise Fabric for Big Data with Vertica and Spark Integration
Jeff LeFevre
Rui Liu
Cornelio Inigo
Lupita Paz
Edward Ma
Malu Castellanos
Meichun Hsu
Truss Decomposition of Probabilistic Graphs: Semantics and Algorithms
Xin Huang
Wei Lu
Laks V.S. Lakshmanan
Efficient and Progressive Group Steiner Tree Search
Rong-Hua Li
Lu Qin
Jeffrey Xu Yu
Rui Mao
SESSION: Session 2 - Privacy and Security
Publishing Attributed Social Graphs with Formal Privacy Guarantees
Zach Jorgensen
Ting Yu
Graham Cormode
Publishing Graph Degree Distribution with Node Differential Privacy
Wei-Yen Day
Ninghui Li
Min Lyu
Principled Evaluation of Differentially Private Algorithms using DPBench
Michael Hay
Ashwin Machanavajjhala
Gerome Miklau
Yan Chen
Dan Zhang
PrivTree: A Differentially Private Algorithm for Hierarchical Decompositions
Jun Zhang
Xiaokui Xiao
Xing Xie
Adaptive Indexing over Encrypted Numeric Data
Panagiotis Karras
Artyom Nikitin
Muhammad Saad
Rudrika Bhatt
Denis Antyukhov
Stratos Idreos
Practical Private Range Search Revisited
Ioannis Demertzis
Stavros Papadopoulos
Odysseas Papapetrou
Antonios Deligiannakis
Minos Garofalakis
Privacy Preserving Subgraph Matching on Large Graphs in Cloud
Zhao Chang
Lei Zou
Feifei Li
SESSION: Session 3 - Logical and Physical Database Design
The Snowflake Elastic Data Warehouse
Benoit Dageville
Thierry Cruanes
Marcin Zukowski
Vadim Antonov
Artin Avanes
Jon Bock
Jonathan Claybaugh
Daniel Engovatov
Martin Hentschel
Jiansheng Huang
Allison W. Lee
Ashish Motivala
Abdul Q. Munir
Steven Pelley
Peter Povinec
Greg Rahn
Spyridon Triantafyllis
Philipp Unterbrunner
Closing the Functional and Performance Gap between SQL and NoSQL
Zhen Hua Liu
Beda Hammerschmidt
Doug McMahon
Ying Liu
Hui Joe Chang
Have Your Data and Query It Too: From Key-Value Caching to Big Data Management
Dipti Borkar
Ravi Mayuram
Gerald Sangudi
Michael Carey
Ambry: LinkedIn's Scalable Geo-Distributed Object Store
Shadi A. Noghabi
Sriram Subramanian
Priyesh Narayanan
Sivabalan Narayanan
Gopalakrishna Holla
Mammad Zadeh
Tianwei Li
Indranil Gupta
Roy H. Campbell
SQL Schema Design: Foundations, Normal Forms, and Normalization
Henning Köhler
Sebastian Link
SQLShare: Results from a Multi-Year SQL-as-a-Service Experiment
Shrainik Jain
Dominik Moritz
Daniel Halperin
Bill Howe
Ed Lazowska
Automatic Generation of Normalized Relational Schemas from Nested Key-Value Data
Michael DiScala
Daniel J. Abadi
SESSION: Session 4 - New Storage and Network Architectures
Data Blocks: Hybrid OLTP and OLAP on Compressed Storage using both Vectorization and Compilation
Harald Lang
Tobias Mühlbauer
Florian Funke
Peter A. Boncz
Thomas Neumann
Alfons Kemper
GeckoFTL: Scalable Flash Translation Techniques For Very Large Flash Devices
Niv Dayan
Philippe Bonnet
Stratos Idreos
SHARE Interface in Flash Storage for Relational and NoSQL Databases
Gihwan Oh
Chiyoung Seo
Ravi Mayuram
Yang-Suk Kee
Sang-Won Lee
Accelerating Relational Databases by Leveraging Remote Memory and RDMA
Feng Li
Sudipto Das
Manoj Syamala
Vivek R. Narasayya
FPTree: A Hybrid SCM-DRAM Persistent and Concurrent B-Tree for Storage Class Memory
Ismail Oukid
Johan Lasperas
Anisoara Nica
Thomas Willhalm
Wolfgang Lehner
Micro-architectural Analysis of In-memory OLTP
Utku Sirin
Pinar Tözün
Danica Porobic
Anastasia Ailamaki
SESSION: Session 5 - Graphs 1: Infrastructure and Processing on Modern Hardware
iBFS: Concurrent Breadth-First Search on GPUs
Hang Liu
H. Howie Huang
Yang Hu
Tornado: A System For Real-Time Iterative Analysis Over Evolving Data
Xiaogang Shi
Bin Cui
Yingxia Shao
Yunhai Tong
EmptyHeaded: A Relational Engine for Graph Processing
Christopher R. Aberger
Susan Tu
Kunle Olukotun
Christopher Ré
GTS: A Fast and Scalable Graph Processing Method based on Streaming Topology to GPUs
Min-Soo Kim
Kyuhyeon An
Himchan Park
Hyunseok Seo
Jinwook Kim
Graph Analytics Through Fine-Grained Parallelism
Zechao Shang
Feifei Li
Jeffrey Xu Yu
Zhiwei Zhang
Hong Cheng
Hybrid Pulling/Pushing for I/O-Efficient Distributed and Iterative Graph Computing
Zhigang Wang
Yu Gu
Yubin Bao
Ge Yu
Jeffrey Xu Yu
SESSION: Session 6 - Streaming 1: Systems and Outlier Detection
Scalable Pattern Sharing on Event Streams
Medhabi Ray
Chuan Lei
Elke A. Rundensteiner
How to Win a Hot Dog Eating Contest: Distributed Incremental View Maintenance with Batch Updates
Milos Nikolic
Mohammad Dashti
Christoph Koch
Sharing-Aware Outlier Analytics over High-Volume Data Streams
Lei Cao
Jiayuan Wang
Elke A. Rundensteiner
THEMIS: Fairness in Federated Stream Processing under Overload
Evangelia Kalyvianaki
Marco Fiscato
Theodoros Salonidis
Peter Pietzuch
SABER: Window-Based Hybrid Stream Processing for Heterogeneous Architectures
Alexandros Koliousis
Matthias Weidlich
Raul Castro Fernandez
Alexander L. Wolf
Paolo Costa
Peter Pietzuch
Range Thresholding on Streams
Miao Qiao
Junhao Gan
Yufei Tao
SESSION: Session 7 - Approximate Query Processing
Bridging the Archipelago between Row-Stores and Column-Stores for Hybrid Workloads
Joy Arulraj
Andrew Pavlo
Prashanth Menon
An Effective Syntax for Bounded Relational Queries
Yang Cao
Wenfei Fan
Wander Join: Online Aggregation via Random Walks
Feifei Li
Bin Wu
Ke Yi
Zhuoyue Zhao
Quickr: Lazily Approximating Complex AdHoc Queries in BigData Clusters
Srikanth Kandula
Anil Shanbhag
Aleksandar Vitorovic
Matthaios Olma
Robert Grandl
Surajit Chaudhuri
Bolin Ding
A Study of Sorting Algorithms on Approximate Memory
Shuang Chen
Shunning Jiang
Bingsheng He
Xueyan Tang
Distributed Wavelet Thresholding for Maximum Error Metrics
Ioannis Mytilinis
Dimitrios Tsoumakos
Nectarios Koziris
Sample + Seek: Approximating Aggregates with Distribution Precision Guarantee
Bolin Ding
Silu Huang
Surajit Chaudhuri
Kaushik Chakrabarti
Chi Wang
SESSION: Session 8 - Networks and the Web
Stop-and-Stare: Optimal Sampling Algorithms for Viral Marketing in Billion-scale Networks
Hung T. Nguyen
My T. Thai
Thang N. Dinh
Spheres of Influence for More Effective Viral Marketing
Yasir Mehmood
Francesco Bonchi
David García-Soriano
Continuous Influence Maximization: What Discounts Should We Offer to Social Network Users?
Yu Yang
Xiangbo Mao
Jian Pei
Xiaofei He
Holistic Influence Maximization: Combining Scalability and Efficiency with Opinion-Aware Models
Sainyam Galhotra
Akhil Arora
Shourya Roy
Potential and Pitfalls of Domain-Specific Information Extraction at Web Scale
Astrid Rheinländer
Mario Lehmann
Anja Kunkel
Jörg Meier
Ulf Leser
Robust and Noise Resistant Wrapper Induction
Tim Furche
Jinsong Guo
Sebastian Maneth
Christian Schallhart
SESSION: Session 9 - Data Discovery and Extraction
Goods: Organizing Google's Datasets
Alon Halevy
Flip Korn
Natalya F. Noy
Christopher Olston
Neoklis Polyzotis
Sudip Roy
Steven Euijong Whang
Multi-Source Uncertain Entity Resolution at Yad Vashem: Transforming Holocaust Victim Reports into People
Tomer Sagi
Avigdor Gal
Omer Barkol
Ruth Bergman
Alexander Avram
A Hybrid Approach to Functional Dependency Discovery
Thorsten Papenbrock
Felix Naumann
Ontological Pathfinding
Yang Chen
Sean Goldberg
Daisy Zhe Wang
Soumitra Siddharth Johri
Extracting Databases from Dark Data with DeepDive
Ce Zhang
Jaeho Shin
Christopher Ré
Michael Cafarella
Feng Niu
Estimating the Impact of Unknown Unknowns on Aggregate Query Results
Yeounoh Chung
Michael Lind Mortensen
Carsten Binnig
Tim Kraska
SESSION: Session 10 - Data Integration / Cleaning
Constraint-Variance Tolerant Data Repairing
Shaoxu Song
Han Zhu
Jianmin Wang
Interactive and Deterministic Data Cleaning: A Tossed Stone Raises a Thousand Ripples
Jian He
Enzo Veltri
Donatello Santoro
Guoliang Li
Giansalvatore Mecca
Paolo Papotti
Nan Tang
Sequential Data Cleaning: A Statistical Approach
Aoqian Zhang
Shaoxu Song
Jianmin Wang
Learning-Based Cleansing for Indoor RFID Data
Asif Iqbal Baba
Manfred Jaeger
Hua Lu
Torben Bach Pedersen
Wei-Shinn Ku
Xike Xie
PrivateClean: Data Cleaning and Differential Privacy
Sanjay Krishnan
Jiannan Wang
Michael J. Franklin
Ken Goldberg
Tim Kraska
RDFind: Scalable Conditional Inclusion Dependency Discovery in RDF Datasets
Sebastian Kruse
Anja Jentzsch
Thorsten Papenbrock
Zoi Kaoudi
Jorge-Arnulfo Quiané-Ruiz
Felix Naumann
Cost-Effective Crowdsourced Entity Resolution: A Partial-Order Approach
Chengliang Chai
Guoliang Li
Jian Li
Dong Deng
Jianhua Feng
SESSION: Session 11 - Spatio / Temporal Databases
Topic Exploration in Spatio-Temporal Document Collections
Kaiqi Zhao
Lisi Chen
Gao Cong
ParTime: Parallel Temporal Aggregation
Markus Pilman
Martin Kaufmann
Florian Köhl
Donald Kossmann
Damien Profeta
Data Polygamy: The Many-Many Relationships among Urban Spatio-Temporal Data Sets
Fernando Chirigati
Harish Doraiswamy
Theodoros Damoulas
Juliana Freire
Distributed Evaluation of Top-k Temporal Joins
Julien Pilourdault
Vincent Leroy
Sihem Amer-Yahia
AT-GIS: Highly Parallel Spatial Query Processing with Associative Transducers
Peter Ogden
David Thomas
Peter Pietzuch
Towards Best Region Search for Data Exploration
Kaiyu Feng
Gao Cong
Sourav S. Bhowmick
Wen-Chih Peng
Chunyan Miao
Simba: Efficient In-Memory Spatial Analytics
Dong Xie
Feifei Li
Bin Yao
Gefei Li
Liang Zhou
Minyi Guo
SESSION: Session 12 - Distributed Data Processing
Realtime Data Processing at Facebook
Guoqiang Jerry Chen
Janet L. Wiener
Shridhar Iyer
Anshul Jaiswal
Ran Lei
Nikhil Simha
Wei Wang
Kevin Wilfong
Tim Williamson
Serhat Yilmaz
SparkR: Scaling R Programs with Spark
Shivaram Venkataraman
Zongheng Yang
Davies Liu
Eric Liang
Hossein Falaki
Xiangrui Meng
Reynold Xin
Ali Ghodsi
Michael Franklin
Ion Stoica
Matei Zaharia
VectorH: Taking SQL-on-Hadoop to the Next Level
Andrei Costea
Adrian Ionescu
Bogdan Răducanu
Michał Switakowski
Cristian Bârca
Juliusz Sompolski
Alicja Łuszczak
Michał Szafrański
Giel de Nijs
Peter Boncz
Adaptive Logging: Optimizing Logging and Recovery Costs in Distributed In-memory Databases
Chang Yao
Divyakant Agrawal
Gang Chen
Beng Chin Ooi
Sai Wu
Big Data Analytics with Datalog Queries on Spark
Alexander Shkapsky
Mohan Yang
Matteo Interlandi
Hsuan Chiu
Tyson Condie
Carlo Zaniolo
An Efficient MapReduce Cube Algorithm for Varied Data Distributions
Tova Milo
Eyal Altshuler
SESSION: Session 13 - Graphs 2: Subgraph-based Optimization Techniques
Diversified Top-k Subgraph Querying in a Large Graph
Zhengwei Yang
Ada Wai-Chee Fu
Ruifeng Liu
Graph Indexing for Shortest-Path Finding over Dynamic Sub-Graphs
Mohamed S. Hassan
Walid G. Aref
Ahmed M. Aly
Efficient Subgraph Matching by Postponing Cartesian Products
Fei Bi
Lijun Chang
Xuemin Lin
Lu Qin
Wenjie Zhang
Adding Counting Quantifiers to Graph Patterns
Wenfei Fan
Yinghui Wu
Jingbo Xu
DUALSIM: Parallel Subgraph Enumeration in a Massive Graph on a Single Machine
Hyeonji Kim
Juneyoung Lee
Sourav S. Bhowmick
Wook-Shin Han
JeongHoon Lee
Seongyun Ko
Moath H.A. Jarrah
Distributed Set Reachability
Sairam Gurajada
Martin Theobald
SESSION: Session 14 - Main Memory Analytics
Fast Multi-Column Sorting in Main-Memory Column-Stores
Wenjian Xu
Ziqiang Feng
Eric Lo
Elastic Pipelining in an In-Memory Database Cluster
Li Wang
Minqi Zhou
Zhenjie Zhang
Yin Yang
Aoying Zhou
Dina Bitton
Page As You Go: Piecewise Columnar Access In SAP HANA
Reza Sherkat
Colin Florendo
Mihnea Andrei
Anil K. Goel
Anisoara Nica
Peter Bumbulis
Ivan Schreter
Günter Radestock
Christian Bensberg
Daniel Booss
Heiko Gerwens
Hybrid Garbage Collection for Multi-Version Concurrency Control in SAP HANA
Juchang Lee
Hyungyu Shin
Chang Gyoo Park
Seongyun Ko
Jaeyun Noh
Yongjae Chuh
Wolfgang Stephan
Wook-Shin Han
UpBit: Scalable In-Memory Updatable Bitmap Indexing
Manos Athanassoulis
Zheng Yan
Stratos Idreos
SESSION: Session 15 - Interactive Analytics
FluxQuery: An Execution Framework for Highly Interactive Query Workloads
Roee Ebenstein
Niranjan Kamat
Arnab Nandi
iOLAP: Managing Uncertainty for Efficient Incremental OLAP
Kai Zeng
Sameer Agarwal
Ion Stoica
Dynamic Prefetching of Data Tiles for Interactive Visualization
Leilani Battle
Remco Chang
Michael Stonebraker
Expressive Query Construction through Direct Manipulation of Nested Relational Results
Eirik Bakke
David R. Karger
Shasta: Interactive Reporting At Scale
Gokul Nath Babu Manoharan
Stephan Ellner
Karl Schnaitter
Sridatta Chegu
Alejandro Estrella-Balderrama
Stephan Gudmundson
Apurv Gupta
Ben Handy
Bart Samwel
Chad Whipkey
Larysa Aharkava
Himani Apte
Nitin Gangahar
Jun Xu
Shivakumar Venkataraman
Divyakant Agrawal
Jeffrey D. Ullman
Datometry Hyper-Q: Bridging the Gap Between Real-Time and Historical Analytics
Lyublena Antova
Rhonda Baldwin
Derrick Bryant
Tuan Cao
Michael Duller
John Eshleman
Zhongxian Gu
Entong Shen
Mohamed A. Soliman
F. Michael Waas
SESSION: Session 16 - Streaming 2: Sketches
Time Adaptive Sketches (Ada-Sketches) for Summarizing Data Streams
Anshumali Shrivastava
Arnd Christian Konig
Mikhail Bilenko
Streaming Algorithms for Robust Distinct Elements
Di Chen
Qin Zhang
Augmented Sketch: Faster and More Accurate Stream Processing
Pratanu Roy
Arijit Khan
Gustavo Alonso
Matrix Sketching Over Sliding Windows
Zhewei Wei
Xuancheng Liu
Feifei Li
Shuo Shang
Xiaoyong Du
Ji-Rong Wen
Graph Stream Summarization: From Big Bang to Big Crunch
Nan Tang
Qing Chen
Prasenjit Mitra
Scalable Approximate Query Tracking over Highly Distributed Data Streams
Nikos Giatrakos
Antonios Deligiannakis
Minos Garofalakis