Abstract

Domain

DATA SCIENCE

Title

Dead or Alive: Continuous Data Profiling for Interactive Data Science

Abstract

AutoProfiler is a system that streamlines data profiling by automatically generating interactive visual summaries and statistics of the data and updating them in real time. This helps analysts quickly understand their data and verify transformations without writing extra code. The system supports insight discovery by continuously displaying data distributions and summary statistics, with live updates ensuring that the profiles always reflect the current state of the data. It also supports follow-up analysis and documentation by generating relevant code snippets. In a user study with 16 participants, both versions of AutoProfiler, reactive (real-time updates) and on-demand (updates only when requested), significantly enhanced insight discovery: 91% of insights came from the tool rather than from manual profiling. Participants appreciated the intuitive live updates for verifying transformations and, with the on-demand version, the ability to review past visualizations. These findings highlight the system's potential to improve automated data analysis support in future tools.
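
To make the distinction between reactive and on-demand profiling concrete, here is a minimal sketch in plain Python. It is not AutoProfiler's implementation: the names `profile_column`, `profile_table`, and `ReactiveProfiler` are hypothetical, a table is assumed to be a dict mapping column names to lists, and missing values are assumed to be `None`. It only illustrates the idea of recomputing summary statistics either on every data update or only when requested.

```python
from collections import Counter
from statistics import mean


def profile_column(values):
    """Summarize one column: row count, missing count, distinct count, basic stats."""
    present = [v for v in values if v is not None]  # assumption: None marks missing
    summary = {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
    }
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    if present and len(numeric) == len(present):
        # All present values are numeric: report range and mean.
        summary.update(min=min(numeric), max=max(numeric), mean=mean(numeric))
    else:
        # Otherwise report the most frequent values.
        summary["top"] = Counter(present).most_common(3)
    return summary


def profile_table(table):
    """Profile every column of a table given as {column_name: list_of_values}."""
    return {name: profile_column(col) for name, col in table.items()}


class ReactiveProfiler:
    """Hypothetical helper contrasting the two update modes described above."""

    def __init__(self, reactive=True):
        self.reactive = reactive  # True: refresh on every update; False: on request
        self.latest = None        # most recently computed profile
        self._pending = None      # most recent table seen

    def update(self, table):
        """Record a new version of the data; re-profile immediately if reactive."""
        self._pending = table
        if self.reactive:
            self.refresh()

    def refresh(self):
        """Recompute the profile from the most recent table and return it."""
        if self._pending is not None:
            self.latest = profile_table(self._pending)
        return self.latest
```

In this sketch, a profiler created with `reactive=True` has an up-to-date `latest` profile after each `update` call, while one created with `reactive=False` keeps showing its previous profile until `refresh` is called, which loosely mirrors how the on-demand version lets users review earlier results.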