Art and Narrative Research in Humanitarianism, Development, and the Study of Complex Systems: Sabbatical Study for 2015-16


Seeing Narrative platform user flow
1st rough draft
Andrew Freiband May 2016
 
1 - ingest/logging
            -raw video media or pre-edited sequences (QuickTime files, etc.) are imported into the system and can be organized in folders/bins.  From there, ingest/logging begins: dynamic metadata is applied (dynamic metadata varies across the duration of the clip, and can also vary in intensity depending on the nature of the data)
                        -automatic metadata application - face detection, chronology, geographic location, speech to text, etc - many categories of metadata can be applied automatically (user switchable?  could these be added like plug-ins?)
            -user-enterable dynamic metadata - the platform should enable the user to enter their own categories of 'subjective' metadata and apply this metadata through a few different input mechanisms.  I'm thinking of sliders, dials, or other analogous graduated input objects (down the road the platform could be compatible with external console-type input devices, like a colorist's console or mixing board, or even gestural inputs?).  A slider or dial could be labeled with the category of metadata (for example, a thematic category such as 'subject's emotional state').  The dial could then be used either for grading that quality (on a 1-10 scale, for example) or for differentiating variations (a dial set with 8 different emotional states at different positions).
                        As the user watches video playback they can manipulate sliders/dials in real time, and metadata values are applied to the clip concurrently.  Once metadata has been applied to media, either automatically or manually, it can be considered 'ingested'.  (A rough sketch of how an ingested clip and its dynamic metadata might be structured follows at the end of this section.)
                        At this stage perhaps there can also be a way to filter out the chaff in the footage (focus failures, camera pre- and post-roll, accidental triggers, etc.), effectively creating a 'cut-down' of the media.  (This could be done in an NLE such as Premiere in advance of using the Seeing Narrative tool, but reducing the steps to ingest seems ideal, and I like the idea of being able to salvage, for example, relevant audio even if the image isn't focused.  The premise is that there is potential narrative content even in footage that isn't intentional.)
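Below is a minimal sketch of one way the ingest stage described above could be modeled in code: a clip carrying time-sampled dynamic metadata tracks, populated either by plug-in-style automatic analyzers or by a slider/dial graded during playback, plus a 'cut-down' list of usable ranges.  All class, function, and field names here are hypothetical illustrations, not a prescribed implementation.

```python
# A minimal sketch (all names hypothetical) of an ingested clip with dynamic
# metadata: each track is a series of (time, value) samples, whether it came
# from an automatic analyzer or from a slider/dial moved during playback.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class MetadataTrack:
    category: str                       # e.g. "face_presence", "emotional_state"
    source: str                         # "automatic" or "manual"
    samples: list[tuple[float, float]] = field(default_factory=list)  # (seconds, value)

    def add_sample(self, t: float, value: float) -> None:
        """Record a value at time t, e.g. from a dial moved during playback."""
        self.samples.append((t, value))

    def value_at(self, t: float) -> float:
        """Return the most recent sample at or before t (0.0 if none yet)."""
        current = 0.0
        for time, value in self.samples:
            if time > t:
                break
            current = value
        return current

@dataclass
class Clip:
    path: str
    duration: float                     # seconds
    tracks: dict[str, MetadataTrack] = field(default_factory=dict)
    usable_ranges: list[tuple[float, float]] = field(default_factory=list)  # the 'cut-down'

    def ingest(self, analyzers: list[Callable[["Clip"], MetadataTrack]]) -> None:
        """Run plug-in style automatic analyzers; the clip is then ready for manual grading."""
        for analyze in analyzers:
            track = analyze(self)
            self.tracks[track.category] = track

def face_presence_analyzer(clip: Clip) -> MetadataTrack:
    """Hypothetical automatic analyzer stub; a real one would wrap face detection."""
    track = MetadataTrack(category="face_presence", source="automatic")
    track.add_sample(0.0, 0.0)          # placeholder value
    return track

if __name__ == "__main__":
    clip = Clip(path="interview_01.mov", duration=120.0)
    clip.ingest([face_presence_analyzer])
    # Manual grading: a 1-10 'emotional state' dial moved during playback
    mood = MetadataTrack(category="emotional_state", source="manual")
    for t, v in [(0.0, 3), (14.5, 6), (42.0, 9)]:
        mood.add_sample(t, v)
    clip.tracks[mood.category] = mood
    clip.usable_ranges = [(2.0, 55.0), (61.0, 118.0)]   # chaff trimmed out
    print(mood.value_at(20.0))          # -> 6
```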
                       
 
2 - visualization
            -as media is ingested and tagged with a constellation of dynamic metadata sets, it appears in the visualization space, which is composed of 'nodes' of media and the pathways that connect them to others.  (See the framegrab from the risd.tv site below for reference, though that layout is only 2-dimensional and its linkages are singular/static.)
 

RISD.tv framegrab - concept for visualizing media nodes in relation to each other
 
            Dynamic and multidimensional metadata requires a more complex environment than the risd.tv one - and one that can be visualized in a number of different ways.  For example, different categories of metadata can be given 'elevation' to highlight the presence of that metadata in the media set, if desired.  The 'threads' that connect nodes will also need variation - in color and density, for example - to illustrate the strength (if graded) of the connection.  Categories of metadata could be toggled on/off to see different networks.  Visualizations could be combinable and selectable (chart layouts, landscape, constellation-based), offering different views of the dataset.
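To make the node/thread idea concrete, here is a rough sketch of how the visualization graph could be represented: threads carry a strength per metadata category, so categories can be toggled on and off and the remaining weight can drive a thread's color or density on screen.  The structures and names are hypothetical, offered only as one possible shape for the data.

```python
# A rough sketch (names hypothetical) of the visualization graph: media nodes
# joined by 'threads' whose strength is tracked per metadata category, so that
# categories can be toggled on/off and thread weight can drive color/density.
from dataclasses import dataclass, field

@dataclass
class Thread:
    a: str                              # node (clip) id
    b: str
    strengths: dict[str, float] = field(default_factory=dict)   # category -> 0.0..1.0

@dataclass
class Landscape:
    nodes: set[str] = field(default_factory=set)
    threads: list[Thread] = field(default_factory=list)

    def connect(self, a: str, b: str, category: str, strength: float) -> None:
        self.nodes.update((a, b))
        for t in self.threads:
            if {t.a, t.b} == {a, b}:
                t.strengths[category] = strength
                return
        self.threads.append(Thread(a, b, {category: strength}))

    def visible_threads(self, active_categories: set[str]) -> list[tuple[Thread, float]]:
        """Threads to draw given the toggled-on categories, with a combined
        weight that could map to opacity/density in the rendered landscape."""
        out = []
        for t in self.threads:
            weights = [w for c, w in t.strengths.items() if c in active_categories]
            if weights:
                out.append((t, max(weights)))
        return out

if __name__ == "__main__":
    land = Landscape()
    land.connect("clip_01", "clip_02", "geography", 0.9)
    land.connect("clip_01", "clip_02", "emotional_state", 0.4)
    land.connect("clip_02", "clip_07", "chronology", 0.7)
    # Toggle only geography + chronology on and see which threads remain visible
    for thread, weight in land.visible_threads({"geography", "chronology"}):
        print(thread.a, "<->", thread.b, round(weight, 2))
```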
 
            -much remains to be developed in the visualization space, and I'm open to new ideas here - admittedly my thinking is based on previous exposure to more rudimentary tools, and very likely there are breakthroughs to be made in how this environment looks and functions.
 
            -comparative narrative metadata - this is a second tier of metadata application.  It could happen during the ingest stage, but I think it makes sense to enable it in the visualization environment, because the visualization will greatly enhance an editor's view of the relationships between media clips.
            The user (editor) can also assign comparative values for 'narrative causality' between any moment (or bracketed timeframe) and another in the media database.  This can relate two moments in a single clip, or moments from one clip to another - it corresponds to the assembly/sequencing process an editor undertakes with footage in a traditional NLE.  I imagine the footage, as it is ingested, is fixed with 'hooks' at regular intervals - these could just be frame numbers, for example.  Somehow the editor should be able to connect hooks from one media location to another.  Once the media is in the visualization space, metadata relationships will be illustrated, and this will facilitate the search for linkages via 'perceived causality' - knowing chronological, thematic, or subject relationships between clips, for example, will increase our ability to see possible narrative linkage.
                        In the visualization space the editor could draw the connections directly between clips, or 'zoom' in on a node and draw causal-links between moments (perhaps this would 'break' the node into multiple nodes, connected by new threads, including the perceived-causality thread).
                        Narrative (perceived causality) threads could be binary (connected/not-connected) or could be graduated like dynamic metadata, and the resulting connective threads could be similarly transparent/bold based on the value.
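The following is a small sketch of how the 'hooks' and the perceived-causality links drawn between them might be represented, with a strength value so links can be either binary or graded like the other dynamic metadata.  The names, the frame interval, and the data shapes are all hypothetical.

```python
# A small sketch (hypothetical names) of regularly spaced 'hooks' and the
# perceived-causality links an editor might draw between them, either binary
# or graded like the other dynamic metadata.
from dataclasses import dataclass

@dataclass(frozen=True)
class Hook:
    clip_id: str
    frame: int                # hooks fixed at regular frame intervals during ingest

@dataclass
class CausalLink:
    cause: Hook
    effect: Hook
    strength: float = 1.0     # 1.0 = simple binary link; lower values draw fainter threads

def make_hooks(clip_id: str, total_frames: int, interval: int = 24) -> list[Hook]:
    """Lay down a hook every `interval` frames across a clip."""
    return [Hook(clip_id, f) for f in range(0, total_frames, interval)]

if __name__ == "__main__":
    a_hooks = make_hooks("clip_01", total_frames=240)
    b_hooks = make_hooks("clip_07", total_frames=480)
    links = [
        CausalLink(a_hooks[3], b_hooks[0], strength=0.8),   # a moment in clip_01 'causes' the opening of clip_07
        CausalLink(b_hooks[5], b_hooks[12]),                 # binary link between two moments in the same clip
    ]
    for link in links:
        print(f"{link.cause.clip_id}@{link.cause.frame} -> "
              f"{link.effect.clip_id}@{link.effect.frame} ({link.strength})")
```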
 
3 - interaction/ navigation
 
            Once the visualization environment is populated by a web of connected media, the user can watch their way down varying pathways. On one hand we maintain a view of the visualization landscape, so that we can see where we have been and where we are going.  On the other we present a 'cinematic' window, a framed video viewer, where the highlighted media can be played back.
                        User-adjustable filters (e.g. play only along pathways graded 8 or higher, or create a priority list for certain types of metadata, including narrative linkage) help the system determine direction, and these can be loosely controlled (relying on metadata-driven algorithms a la Korsakow) or tightly controlled (a high degree of user interaction, 'flying' through the landscape).
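As a concrete illustration of this kind of filter-driven navigation, the sketch below gathers candidate next steps from the current node along threads whose grade clears a user-set threshold, ranked by a priority list of metadata categories.  In a loosely controlled mode the system would auto-follow the top candidate; in a tightly controlled mode it would present all of them for the viewer to choose.  The data shapes and names are hypothetical.

```python
# Navigation sketch (hypothetical data shapes): candidate next steps from the
# current node, filtered by a minimum grade and ranked by category priority.
# threads: (node_a, node_b, category, strength on a 0-10 scale)
threads = [
    ("clip_01", "clip_02", "narrative", 9.0),
    ("clip_01", "clip_05", "geography", 8.5),
    ("clip_01", "clip_09", "emotional_state", 6.0),
    ("clip_02", "clip_07", "narrative", 7.5),
]

def next_candidates(current, threads, min_grade=8.0,
                    priority=("narrative", "geography", "emotional_state")):
    rank = {cat: i for i, cat in enumerate(priority)}
    candidates = []
    for a, b, category, strength in threads:
        if current not in (a, b) or strength < min_grade:
            continue
        other = b if current == a else a
        candidates.append((rank.get(category, len(rank)), -strength, other, category))
    # Sort by category priority first, then by strength (strongest first)
    return [(node, cat) for _, _, node, cat in sorted(candidates)]

if __name__ == "__main__":
    print(next_candidates("clip_01", threads))                  # pathways graded 8 or higher
    print(next_candidates("clip_01", threads, min_grade=5.0))   # looser filter opens more directions
```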
                        Like some layouts in Korsakow, media in a node plays back, and as possible pathways open off of it, the system can be set to display these directions (both as clickable frames appearing adjacent to the player, and as highlights of the connective threads in the visualization landscape).
                        Viewing the media will likely lead to new ideas about connections, so the viewer should have some access to tagging/connecting tools (placing flags, setting in-out points for narrative connection, etc.), making the landscape changeable.
                        (Additionally, the system should be savable at any stage; new media can be ingested at any time and will populate the landscape as its metadata determines, so the saved landscapes will represent different stages of the database, prior to the introduction of new data.)
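A minimal sketch of what such a save could be, assuming a simple timestamped JSON snapshot of the landscape (the format and names are hypothetical):

```python
# A minimal sketch (hypothetical format) of saving the landscape at a given
# stage: each snapshot is timestamped JSON, so successive saves mark the state
# of the database before new media was introduced.
import json
import time

def save_snapshot(nodes, threads, path):
    snapshot = {
        "saved_at": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "nodes": sorted(nodes),
        "threads": threads,             # e.g. [(a, b, category, strength), ...]
    }
    with open(path, "w") as f:
        json.dump(snapshot, f, indent=2)

if __name__ == "__main__":
    save_snapshot({"clip_01", "clip_02"},
                  [("clip_01", "clip_02", "narrative", 9.0)],
                  "landscape_stage1.json")
```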
                        Finally, the pathways followed by the viewer should themselves be savable and 'printable' - exportable as linear QuickTime/digital video files.  These outputs could conceivably be used as presentation edits, or as offline files for comparing different viewpoints of the system ('historical perspective' or 'John Doe's story', etc.).
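One possible way to 'print' a followed pathway - an assumption on my part, not something the platform prescribes - is to write the visited clips to an ffmpeg concat list and render a single linear file:

```python
# Sketch: export a followed pathway as one linear video file via ffmpeg's
# concat demuxer. Requires ffmpeg on the system; "-c copy" assumes the clips
# share the same codec/format, otherwise they would need to be re-encoded.
import subprocess

def export_pathway(clip_paths, output="pathway_edit.mov"):
    with open("pathway_concat.txt", "w") as f:
        for path in clip_paths:
            f.write(f"file '{path}'\n")
    subprocess.run(["ffmpeg", "-f", "concat", "-safe", "0",
                    "-i", "pathway_concat.txt", "-c", "copy", output], check=True)

if __name__ == "__main__":
    export_pathway(["clip_01.mov", "clip_07.mov", "clip_02.mov"],
                   output="john_does_story.mov")
```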
            
 
