Forum: Analytic Provenance

Comments during the workshop

Here is a place for anyone to put comments or discussion points that come up during the workshop sessions. Please list the session the comment is related to so we can better organize the information.

Session 1 discussion notes:

Show scientists the vis with their own data so they can have an aha moment; then they will be convinced of its value. SciVis learned this years ago.

We should move our research emphasis from composability of tools to composability of tasks.

Record provenance at a fine level vs. show it at a broad level.
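
One way to read the point above, as a minimal sketch: log every interaction as a fine-grained event, and collapse events into coarse steps only when displaying them. The event fields (`t`, `action`, `target`) and the `summarize` helper are hypothetical, made up for illustration; they do not come from any tool discussed in the session.

```python
from itertools import groupby

# Hypothetical fine-grained log: one record per user interaction.
events = [
    {"t": 0.0, "action": "filter", "target": "year"},
    {"t": 1.2, "action": "filter", "target": "year"},
    {"t": 2.5, "action": "zoom", "target": "map"},
    {"t": 3.1, "action": "zoom", "target": "map"},
    {"t": 3.9, "action": "zoom", "target": "map"},
    {"t": 5.0, "action": "filter", "target": "region"},
]

def summarize(log):
    """Collapse consecutive events with the same action and target into one step."""
    steps = []
    key = lambda e: (e["action"], e["target"])
    for (action, target), run in groupby(log, key=key):
        run = list(run)
        steps.append({
            "action": action,
            "target": target,
            "count": len(run),
            "start": run[0]["t"],
            "end": run[-1]["t"],
        })
    return steps

# The full log is kept; the broad view is derived from it on demand.
broad_view = summarize(events)
for step in broad_view:
    print(f'{step["action"]} {step["target"]} x{step["count"]}')
```

Because the summary is computed from the stored log rather than replacing it, a different grouping can be applied later without losing detail.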

Trade off personal research directions against compatibility, so that we can better communicate as a community and share provenance.
We have some formats, e.g. pset, VisTrails, and the Open Provenance Model. Agreeing on a single format is unlikely, but agreeing on how to transform between formats is possible.
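
As a rough illustration of transforming between formats rather than agreeing on one, the sketch below converts between two invented record shapes: a flat list of steps with parent pointers and a node/edge graph. Both shapes and the function names `steps_to_graph` and `graph_to_steps` are assumptions made for this example; they are not the actual pset, VisTrails, or Open Provenance Model schemas.

```python
def steps_to_graph(steps):
    """Hypothetical format A (flat steps with a parent id) -> format B (nodes and edges)."""
    nodes = [{"id": s["id"], "label": s["op"]} for s in steps]
    edges = [
        {"src": s["parent"], "dst": s["id"]}
        for s in steps
        if s["parent"] is not None
    ]
    return {"nodes": nodes, "edges": edges}

def graph_to_steps(graph):
    """Format B -> format A. Assumes each node has at most one incoming edge."""
    parent_of = {e["dst"]: e["src"] for e in graph["edges"]}
    return [
        {"id": n["id"], "op": n["label"], "parent": parent_of.get(n["id"])}
        for n in graph["nodes"]
    ]

# Made-up example history: one load step with two branches.
steps = [
    {"id": 1, "op": "load", "parent": None},
    {"id": 2, "op": "filter", "parent": 1},
    {"id": 3, "op": "color-map", "parent": 1},
]

graph = steps_to_graph(steps)
round_trip = graph_to_steps(graph)
print(round_trip == steps)
```

A round-trip check like this is one simple way to see what a translation preserves; real formats would likely lose some information in one direction or the other.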

Session 2 discussion notes:

To Chris Weaver: how do you deal with abstracting value ranges? Sets are easier because they are discrete; continuous dimensions will be more difficult. Also consider the ability to query the question space, not just the data space.

Linking the Haber and Weaver papers could enable that functionality. In a sense, these two are coming at the same problem from opposite directions.

To Jean: Where is the data? The difficulty is figuring out who owns the data and getting permission!

Session 3 discussion notes:

People cluster by different categories based on the context of their analysis. We see this also in image clustering. We need to understand that context to understand the reasoning behind the categories.

1. Recording: see how users are using tools.
2. Sharing: how analysis is done.
3. Prescriptive: tell the user what we advise them to do.
The first doesn't get to a standard method of doing the exploration.

We should be reductive, add more to the process vs. converge to some point.

We look at analysis in a single application. But how do people move between different apps? How do we capture and integrate the provenance across many tools?
Need provenance of workflow.
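
A minimal sketch of one way to integrate provenance across tools: merge per-tool logs into a single timeline and count how often the user switches applications. It assumes each tool writes time-sorted records against a shared clock, which is itself a strong assumption; the tool names and log entries are invented for illustration.

```python
import heapq

# Hypothetical per-tool logs as (timestamp, tool, action), each already sorted by time.
vis_tool_log = [(1.0, "vis", "load data"), (4.0, "vis", "brush selection")]
notebook_log = [(2.5, "notes", "write hypothesis"), (5.0, "notes", "paste screenshot")]

# Merge the sorted logs into one workflow-level timeline.
merged = list(heapq.merge(vis_tool_log, notebook_log, key=lambda e: e[0]))

# Count transitions where consecutive actions happen in different tools.
switches = sum(1 for a, b in zip(merged, merged[1:]) if a[1] != b[1])

for timestamp, tool, action in merged:
    print(f"{timestamp:>4} {tool:<5} {action}")
print("tool switches:", switches)
```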

How can we relate the provenance data to collaborative activities?
Used custom tools to help find those relationships.
Being able to distinguish in the provenance what the user did that was useful or not useful.
Hard to predict when an activity ends or continues.
Not until you see the outcome do you understand their actions, so you have to propagate back, which implies that you have to be able to know the future to recognize their activities.
Need to highlight which part of the video you should look at, what's the important part. But there is also the need to look at the part that others ignored.
The problem is that we impose our own judgement on their motivation. It is too difficult to do by watching video. The only way to truly discern it is to interview them.
But in post-interviews the users have often forgotten their motives. And if you interrupt them to ask their motives, that will affect their analytic process.
Circuitous routes make it very difficult to follow.

One goal is to model behavior. But it's very specific to the type of collaboration and the discipline. So we need to think about what the purpose of the modeling is.
