Saving data by adding visual knowledge priors to Deep Learning

Call for papers

We invite researchers to submit their recent work on data-efficient computer vision.

Accepted works are included in the program.

Present a poster

We invite researchers to present their recent papers on data-efficient computer vision.

Accepted posters are included in the program.

VIPriors challenges

Including data-efficient action recognition, classification, detection and segmentation.

Final rankings are out now.

Watch the recording

Thanks to all for attending

We enjoyed two interesting workshop sessions at ECCV 2020. We thank all presenters for their efforts and all participants for their attention. This website will keep a record of all presented materials. We hope to see you all next year (venue TBD) for the next workshop!

About the workshop

This workshop focuses on how to pre-wire deep networks with generic visual inductive knowledge structures, which allow us to incorporate hard-won, existing generic knowledge from physics, such as light reflection or geometry. Visual inductive priors are data efficient: what is built in no longer has to be learned, saving valuable training data.

Data fuels deep learning, yet data is costly to gather and expensive to annotate. Training on massive datasets also consumes huge amounts of energy, adding to our carbon footprint. In addition, only a select few deep learning behemoths have billions of data points and thousands of expensive GPUs at their disposal. This workshop looks beyond these few very large companies to the long tail of smaller companies and universities with smaller datasets and smaller hardware clusters. We focus on data efficiency through visual inductive priors.

Excellent recent research investigates data efficiency in deep networks by exploiting other data sources, such as unsupervised learning, re-using existing datasets, or synthesizing artificial training data. Not enough attention is given to overcoming the data dependency by adding prior knowledge to deep nets. As a consequence, all knowledge has to be (re-)learned implicitly from data, making deep networks hard-to-understand black boxes that are susceptible to dataset bias and require huge data and compute resources. This workshop aims to remedy this gap by investigating how to flexibly pre-wire deep networks with generic visual innate knowledge structures, which allow us to incorporate hard-won, existing knowledge from physics, such as light reflection or geometry.

The great power of deep neural networks is their incredible flexibility to learn. The direct consequence of such power is that small datasets can simply be memorized, and the network will likely not generalize to unseen data. Regularization aims to prevent such over-fitting by adding constraints to the learning process. Much work is done on regularizing internal network properties and architectures. In this workshop we focus on regularization methods based on innate priors. There is strong evidence that an innate prior benefits deep nets: adding convolution to deep networks yields a convolutional neural network (CNN), which is hugely successful and has permeated the entire field. While convolution was initially applied to images, it has since been generalized to graph networks, speech, language, 3D data, video, etc. Convolution models translation invariance in images: an object may occur anywhere in the image, so instead of learning separate parameters at each image location, convolution considers only local relations while sharing parameters over all locations. This saves a huge number of parameters to learn and thus allows a strong reduction in the number of examples to learn from. This workshop aims to build on the great success of convolution by exploiting innate regularizing structures that yield a significant reduction in required training data.
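
To make the parameter saving concrete, here is a minimal sketch (assuming PyTorch; the layer sizes are illustrative and not taken from the workshop) comparing the number of learnable parameters of a fully connected layer and a convolutional layer operating on a 32x32 RGB image:

```python
# Minimal sketch, assuming PyTorch; layer sizes are illustrative only.
import torch.nn as nn

n_inputs = 3 * 32 * 32  # a 32x32 RGB image flattened to 3072 values

# Fully connected: learns a separate weight for every (input, output) pair.
fc = nn.Linear(n_inputs, n_inputs)

# Convolutional: one small 3x3 filter bank, shared over all image locations.
conv = nn.Conv2d(in_channels=3, out_channels=3, kernel_size=3, padding=1)

def n_params(module):
    return sum(p.numel() for p in module.parameters())

print(f"fully connected: {n_params(fc):,} parameters")  # 9,440,256
print(f"convolutional:   {n_params(conv):,} parameters")  # 84
```

The convolutional layer is so much smaller precisely because the translation-invariance prior lets it reuse the same local filter at every position in the image.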

Workshop program

Our live program featured a panel discussion with our invited speakers, playback of the recorded talks for all presentations, and live Q&A. All keynotes, papers and presentations were made available through the ECCV Workshops and Tutorials website.

Time (UTC+1)    Session          Description
8:00 / 18:00    Keynote session  Panel discussion with invited speakers + Q&A
8:40 / 18:40    Break
8:45 / 18:45    Oral session     Oral presentations
9:10 / 19:10                     Q&A
9:25 / 19:25    Challenges       Awards & winners presentations
9:35 / 19:35    Poster session   Poster presentations
9:45 / 19:45                     Q&A (for posters & challenges)
9:50 / 19:50                     External poster presentations
9:55 / 19:55    Closing

Oral session

  1. Lightweight Action Recognition in Compressed Videos.
    Yuqi Huo, Xiaoli Xu, Yao Lu, Yulei Niu, Mingyu Ding, Zhiwu Lu, Tao Xiang, Ji-Rong Wen
  2. On sparse connectivity, adversarial robustness, and a novel model of the artificial neuron.
    Sergey Bochkanov
  3. Injecting Prior Knowledge into Image Caption Generation.
    Arushi Goel, Basura Fernando, Thanh-Son Nguyen, Hakan Bilen
  4. Learning Temporally Invariant and Localizable Features via Data Augmentation for Video Recognition.
    Taeoh Kim, Hyeongmin Lee, MyeongAh Cho, Hoseong Lee, Dong heon Cho, Sangyoun Lee
  5. Unsupervised Learning of Video Representations via Dense Trajectory Clustering.
    Pavel Tokmakov, Martial Hebert, Cordelia Schmid

Poster session

  1. Distilling Visual Priors from Self-Supervised Learning.
    Bingchen Zhao, Xin Wen
  2. Unsupervised Image Classification for Deep Representation Learning.
    Weijie Chen, Shiliang Pu, Di Xie, Shicai Yang, Yilu Guo, Luojun Lin
  3. TDMPNet: Prototype Network with Recurrent Top-Down Modulation for Robust Object Classification under Partial Occlusion.
    Mingqing Xiao, Adam Kortylewski, Ruihai Wu, Siyuan Qiao, Wei Shen, Alan Yuille
  4. What leads to generalization of object proposals?
    Rui Wang, Dhruv Mahajan, Vignesh Ramanathan
  5. A Self-Supervised Framework for Human Instance Segmentation.
    Yalong Jiang, Wenrui Ding, Hongguang Li, Hua Yang, Xu Wang
  6. Multiple interaction learning with question-type prior knowledge for constraining answer search space in visual question answering.
    Tuong Do, Binh Nguyen, Huy Tran, Erman Tjiputra, Quang Tran, Thanh Toan Do
  7. A visual inductive priors framework for data-efficient image classification.
    Pengfei Sun, Xuan Jin, Wei Su, Yuan He, Hui Xue’, Quan Lu

External poster session

  1. Select to Better Learn: Fast and Accurate Deep Learning using Data Selection from Nonlinear Manifolds.
    Mohsen Joneidi, Saeed Vahidian, Ashkan Esmaeili, Weijia Wang, Nazanin Rahnavard, Bill Lin, and Mubarak Shah
  2. Compositional Convolutional Neural Networks: A Deep Architecture with Innate Robustness to Partial Occlusion.
    Adam Kortylewski, Ju He, Qing Liu, Alan Yuille
  3. On Translation Invariance in CNNs: Convolutional Layers can Exploit Absolute Spatial Location.
    Osman Semih Kayhan, Jan C. van Gemert

VIPriors Challenges

We presented the “Visual Inductive Priors for Data-Efficient Computer Vision” challenges: four challenges in which models must be trained from scratch and the number of training samples is reduced to a fraction of the full set.
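
For context, the sketch below shows the kind of training-set reduction involved (assuming PyTorch/torchvision; CIFAR-10 and the 10% fraction are illustrative assumptions, not the official challenge toolkit or data):

```python
# Minimal sketch, assuming PyTorch/torchvision; the dataset and fraction are
# illustrative only, not the official challenge setup.
import random
from torch.utils.data import Subset
from torchvision import datasets, transforms

full_train = datasets.CIFAR10(root="data", train=True, download=True,
                              transform=transforms.ToTensor())

fraction = 0.1  # keep only a fraction of the full training set
keep = random.sample(range(len(full_train)), int(fraction * len(full_train)))
small_train = Subset(full_train, keep)

# Models are then trained from scratch (randomly initialized, no pretrained
# weights) on small_train only.
```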

Please see the challenges page for the results of the challenges.

Invited speakers

prof. Matthias Bethge

Bethge Lab

prof. Charles Leek

University of Liverpool

prof. Daniel Cremers

TU München

Organizers

dr. Jan van Gemert

Delft University of Technology

dr. Anton van den Hengel

University of Adelaide

Attila Lengyel

Delft University of Technology

Robert-Jan Bruintjes

Delft University of Technology

Osman Semih Kayhan

Delft University of Technology

Marcos Baptista Ríos

University of Alcala

Contact

Email us at vipriors-ewi AT tudelft DOT nl