CVDB 2004 Panel: Future Applications and Solutions

While the technical solutions developed by Computer Vision and Database researchers are often elegant and well designed, it is not clear that they always solve the actual problems faced by users of image and multimedia databases. Users range from professionals to casual users, although with the improvements in digital cameras, even casual users may quickly accumulate tens of thousands of images. Overall, these users are likely to vary significantly in what they are trying to achieve, what data they manipulate, how much data they deal with, which tools they use, and so on. Much work in Computer Vision and Databases, however, deals with only a single application, frequently using artificially generated data. Conversely, users may not be aware of existing technical solutions that, if appropriately applied, might well solve some of their problems.

The goal of this panel is therefore to be a forum for exchanging ideas on the applications of image and video data. The panel will include professional users who deal every day with huge volumes of data but use that data in very different ways. These people can clearly describe what kind of tools they need to facilitate the management of their large volumes of multimedia data. The panel will also include Computer Vision and Database researchers, who typically address technical issues such as enhancing image recognition or designing faster systems.

Get the slides presented by:

Tamer Özsu

Izabela Grasland

Jean Carrive

Sébastien Gilles

Moderator: M. Tamer Özsu, University of Waterloo

Dr. M. Tamer Özsu is Professor of Computer Science and University Research Professor at the University of Waterloo where he leads research groups in distributed data and object management, multimedia data management and structured document management. Prior to his current position, he was with the Department of Computing Science of the University of Alberta between 1984 and 2000.

Dr. Özsu's multimedia research has focused on image databases within the context of the DISIMA system. DISIMA investigated an object-oriented database approach whereby images and other related data were stored and managed by an object DBMS. It defined a high-level declarative query language, which allowed users to query over image content and syntactic features, and included a query engine, which was augmented with logical and physical optimizations. More recent work focuses on similarity search over moving object trajectories, primarily as they are found in videos.

Panelist: Jean Carrive, INA

Dr. Jean Carrive received a Ph.D. in Computer Science from Paris 6 University in 2000, in collaboration with INA. The subject of the thesis was Classification of Audiovisual Sequences. Since then, he has been in charge of research at the Research and Experimentation Department of INA in the field of Description of Audiovisual Documents. He is mainly interested in automatic structure discovery of audiovisual documents through symbolic techniques and in languages for representing descriptions of broadcasts. He has been involved in several national and European projects (DiVAN, AGIR, CHAPERON) and is currently project manager for the French FERIA project.

The Institut National de l'Audiovisuel or INA is a public organization with an industrial and commercial role that was set up through the reform of the audiovisual sector conducted in 1974 and finally implemented on January 6, 1975. INA is the first audiovisual archive center in the world and the first digital image data-bank in Europe. The main missions of INA are to preserve the national audiovisual heritage, to make it more available and to keep abreast of changes in the audiovisual sector through its research, production and training activities. The research work at INA is mainly directed towards finalization of digital tools in the field of restoration and indexing of audiovisual documents. INA archives more than 1,110,000 hours of radio and TV broadcasts (60 years of radio and 50 years of TV). Overall, about 144,000 hours are on-line for professional users. Because of cable and satellite TV, the archive will soon grow by 500,000 hours each year.

Panelist: Sébastien Gilles, LTU Technologies

Dr. Sébastien Gilles holds a Ph.D. from Oxford University in computer vision, focusing on information-theoretic algorithms applied to image matching and recognition. After completing his Ph.D., Sébastien joined the Image and Multimedia Indexing Group at INRIA, France. In 1999, he co-founded LTU technologies, and has been Chief Scientist of the company since then. He has strong expertise in the following areas: image matching and retrieval, image processing, statistical image recognition, and information theory applied to image understanding.

LTU technologies is the leading provider of industrial software for organizing, searching, filtering and exploiting visual content. LTU is the standard for image-based forensics and investigation analysis and is deployed at law enforcement, defense and intelligence agencies worldwide. LTU software is also used to monitor misuse and violations of over 100 brands from Global 2000 companies and to enable patent offices to carry out image-based searches in IP databases. LTU is based in Paris, France and Washington D.C. The company was founded in 1999 by veteran scientists from the MIT Media Lab, Oxford University and INRIA. LTU's technology is patented worldwide. LTU now indexes more than 6,000,000 images and has customers in France, the U.S.A., Canada, Germany, Italy, Spain and the U.K. Over the years, LTU has been involved in a number of European projects (Adequate, Annapurna) and is an active member of the European Network of Excellence (NoE) "Muscle", focusing on technologies for high-level access to multimedia databases.

Panelist: Izabela Grasland, Thomson

Izabela Grasland received a DESS in Human-Machine Interaction (psychology, linguistics, computer science) in 1994 at Université Le Mirail in Toulouse. Before that, she studied linguistics, and her main reports concerned terminologies used in the professional domain. In 1995, she co-created a small web agency and focused on the usability of web sites. In 1999, she joined the French R&D center of Thomson and worked on various subjects: vocal interfaces (constitution of language models), content search & access in personal databases, and home networking systems. Her main goal is to obtain "real" user requirements, and she uses various evaluation tools, such as in-home observations, interviews, questionnaires, and user tests in home labs (users performing specific tasks with prototypes). She is also involved in several national and European projects (Annapurna, Ozone, aceMedia, Amigo).

Thomson is the leading provider of technology and service solutions for integrated entertainment and media companies. By capitalizing on and expanding its leadership positions at the intersection of entertainment, media and technology, Thomson provides end-to-end solutions to content creators, video network operators and manufacturers and retailers through its Technicolor, Grass Valley, RCA and THOMSON brands.

Panelist: Roger Mohr

Dr. Roger Mohr has been a Professor of Computer Science at Institut National Polytechnique de Grenoble since 1988. He has chaired the Computer Science Engineering School since 2003. From 2000 to 2002, he was with Xerox Research Centre Europe, where he led the Grenoble lab.

His research interests are mainly in the area of computer vision. Dr. Mohr has authored over 120 publications, including papers in international journals such as IEEE-PAMI, Artificial Intelligence, IJCV, and Pattern Recognition. His major personal contributions are the optimal consistency algorithm for constrained problems, the introduction of projective geometry in computer vision, and image indexing through local invariant features. His present research interest is visual learning.

Dr. Mohr is a member of the GRAVIR lab. Half of this lab's activity is centred on vision, with a particular focus on visual learning, human activity recognition, three-dimensional perception with cameras, and video interpretation. He is involved in many national, industrial and international projects, including the EU networks of excellence and the EU integrated projects.

Panelist: Thomas Seidl, RWTH Aachen University

Dr. Thomas Seidl is a full professor of Computer Science at RWTH Aachen University, Germany, where he has held a chair for Data Management and Data Exploration since 2002. His research interests include methods for managing and mining large databases of complex objects. Typical applications include spatial, temporal, and multimedia databases in the domains of medical imaging, molecular biology, mechanical engineering, and geographic information systems. Current activities in the data management area address relational indexing techniques to efficiently support multimedia queries in relational database systems.

Dr. Seidl finished his MS in 1992 at the Technische Universitaet Muenchen with a thesis supervised by Prof. Rudolf Bayer, Ph.D., and received his Ph.D. in 1997 from the University of Munich, Germany, under the supervision of Prof. Dr. Hans-Peter Kriegel. From 2001 to 2002, he held a guest lectureship at the University of Augsburg and a substitute professorship for Databases, Data Mining and Visualization at the University of Constance.