
MMA 2011

The Multimedia Analytics community presents a VAC Workshop

MultiMedia Analytics 2011 "Pow Wow"

Sponsored by Intel

July 26-27, 2011

At the Intel campus in Santa Clara, CA



Questions? Contact: Luciano.C.Oviedo at intel.com




MMA Pow Wow Report Out

• Summary of Key Learning/Next Steps on Friday, 05.Aug.2011 at 8am
LINK HERE (WIP)

Agenda/Presentations

Tuesday, 26 July 2011 from 1 - 5pm
Luciano C. Oviedo, Visual & Parallel Computing Group: Welcome/Introduction
Mike Smith, Visual & Parallel Computing Group: Intel Visual Computing Research Overview
Cindy Pickering, IT Strategy Architecture & Innovation: Distributed Enterprise: Knowledge Work and Collaboration Use Cases
Ashwini Asokan, Intel Labs, User Experience Research: User Experience Methods, Frameworks and Modeling + Case --> LINK HERE (WIP)
Steve Guerin, Redfish Group: Multi-modal/Ambient Computing Use Cases + Demo of Interactive Sand Table & Simulation
• Networking/dinner

Wednesday, 27 July 2011 from 9am - 5pm
Ken Salsman, New Technology, Aptina Imaging: Emerging Challenges in Capture/Display Technologies

• Use Cases: Consumer, enterprise, SMB, education, research, government?
• Ground Truth, Multi-Modal Data Sets: Mix of audio, video, text, image?
• Benchmarks: F-score, average precision, probabilistic Rand index?
• Development Tools & Technologies: APIs, libraries, standards?
• Sensors & Displays: x, y, z-distance, pixel color depth, stereo?
• Future Compute Requirements: GFLOPS, memory size, memory bandwidth?
• Future Codecs: Existing vs. future codecs, roadmaps, gaps, limitations?
LINK HERE (WIP)

Summary

The flood of both premium and user-generated digital media content is driving demand from individuals and organizations for sense-making capabilities such as multimedia capture, navigation, search, and analytics. The problem is that most tools were created with only one mode in mind (image, video, audio, or text, but not all) and therefore consciously trade off comprehension of the full range of digital media signals. This “pow wow” aims to bring people from different contexts together, in a casual and open gathering, to share and look for new ways of addressing the challenges of developing this emerging technology domain.

Goals

To share and identify cross-market “grand challenges” for cultivating a multimedia analytics future; the intent is to co-develop a first-draft roadmap.

Motivation

A friendly gathering of top subject matter experts in Industry, Government, or Academia working on different aspects of multimedia analytics technologies.

Topics

Share perspectives on where we are today (As-Is) versus where we are going (To-Be). Proposed areas include:
• Use Cases: Consumer, enterprise, SMB, education, research, government?
• Technological Grand Challenges: Unknown category discovery, user presentation?
• Ground Truth, Multi-Modal Data Sets: Mix of audio, video, text, image?
• Benchmarks: F-score, average precision, probabilistic Rand index? (a small F-score sketch follows this list)
• Development Tools & Technologies: APIs, libraries, standards?
• Sensors & Displays: x, y, z-distance, pixel color depth, stereo?
• Future Compute Requirements: GFLOPS, memory size, memory bandwidth?
• Future Codecs: Existing vs. future codecs, roadmaps, gaps, limitations?
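
As an illustration only (not part of the workshop materials), here is a minimal Python sketch of one candidate benchmark metric named above, the F-score, computed from hypothetical retrieval counts; average precision and the probabilistic Rand index would be treated analogously.

    # Minimal sketch of the F-score, one candidate benchmark metric.
    # The counts below are hypothetical, purely for illustration.

    def f_score(true_positives, false_positives, false_negatives, beta=1.0):
        """Return the F-beta score; beta=1.0 gives the balanced F1 score."""
        precision = true_positives / (true_positives + false_positives)
        recall = true_positives / (true_positives + false_negatives)
        if precision + recall == 0:
            return 0.0
        return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)

    # Hypothetical retrieval run: 80 relevant items returned, 20 irrelevant
    # items returned, 40 relevant items missed.
    print(f_score(true_positives=80, false_positives=20, false_negatives=40))  # ~0.727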

Participants and Registration

Open to developers, end users, and researchers of image, video, audio, or text analytics products, tools, and technologies in Industry, Government, or Academia.

To register, sign up here. Seating is limited, so selection will be on a first-come, first-served basis. There is no registration fee, and participants are responsible for covering all costs associated with attending.


Workshop Organizers

Luciano Oviedo is on the Strategy and Planning team of the Visual and Parallel Computing Group. His interests are in the visual and multi-modal analytics domains.

Scott Krig is a pioneer in image processing, visualization, and computer graphics, with experience ranging across aerospace simulators, 2D/3D image processing on supercomputers and workstations, and embedded process control & vision systems. Scott founded Krig Research in 1988 to provide the world’s first integrated imaging & visualization platforms based on high-end super workstations such as SGI and Apollo, with distributors in 25 countries. Scott has authored or co-authored about 40 patent applications and studied computer security at Stanford.

Shawn Bohn is a scientist and engineer at the Pacific Northwest National Laboratory working in the fields of information retrieval, extraction, and visualization. Mr. Bohn has been recognized by his peers and clients as an expert in the specific field of Item Authority and a significant contributor to the Text Analytics and Information Visualization fields. He has over two decades of experience applying research to the commercial sector, bringing commercial-sector practices back into research, and managing and leading research programs. Mr. Bohn is currently leading work in the field of Multimedia Analytics, including a team of researchers and engineers developing a visual analytics tool for interacting with large quantities of text, image, and video data. In addition, he leads the technical direction and management of the Information Synthesis and Fusion project, which is working toward unifying information spaces and producing new signatures that serve as a basis for visual analytics platforms.

Mark Hasegawa-Johnson has developed machine learning and opportunistic sensing algorithms for automatic speech recognition and multimedia analytics, with applications in HCI problems including dialect adaptation, computer-assisted language learning, assistive technology, and the study of group dynamics. Mathematically, Dr. Hasegawa-Johnson's work focuses on the tight relationship between feature transformations and probability density estimation, e.g., he directed a 2004 summer research program at Johns Hopkins on landmark-based speech recognition, and was co-advisor of the ICPR 2008 "Best Student Paper" in which MAP-adapted mixture Gaussian supervectors were first used to classify natural images. Dr. Hasegawa-Johnson's teaching and service activities create channels of communication between engineering and the social sciences, e.g., he was General Chair of Speech Prosody 2010, flagship conference of the Speech Prosody Special Interest Group. Dr. Hasegawa-Johnson is co-author of 32 journal articles and book chapters, 116 conference papers, four patents, and three widely used databases. He is currently Associate Editor of the Journal of the Acoustical Society of America and of the Journal of Laboratory Phonology. He received his Ph.D. from MIT in 1996.

Russ Burtner is a Senior User Experience Lead on the Visual Analytics team with over 20 years of experience in HCI, software design, and vision exploration. Past work experience includes extensive product development with Microsoft, EA Sports, Disney, and Oxygen Media. His current work areas at PNNL are user experience for multimedia analysis, decision support systems, collaborative and adaptive environments, and emergency response. His research interests are in human-computer interaction, visual design, ethnography and usability, information visualization, vision development, and future technology trends.

John David Miller is a principal engineer in the Intel IT Strategy, Research and Innovation group. He works on business intelligence and visualization.

Richard May is a chief scientist at the Pacific Northwest National Laboratory and Director of the National Visualization and Analytics Center (NVAC). His research focus for the past several years has been in visual analytics and interaction methodologies. His particular interest is the logical and physical aspects of interacting with information for analytical tasks using visual analytic techniques. He manages both research and development projects as well as outreach programs to government, industry, and academia.

