Literate models for computer vision: Combining vision, language and reading

Lecturer

Dimosthenis Karatzas, dimos@cvc.uab.es

Lluis Gomez

Ernest Valveny

Andres Mafla

Ali Biten

Ruben Perez

Sergi Garcia

Organizer/s

Computer Vision Center, Autonomous University of Barcelona

Content and organization

Written information in the world around us is a fundamental cue for a multitude of everyday tasks. From shopping at the supermarket to finding our destination in an unknown urban space, written text helps us perform many tasks that would otherwise be much more complex.

Computer vision systems, on the other hand, have been practically illiterate for the first half century of their lifetime. Specific research on reading systems has been going on for decades, but the semantic information that image text conveys was not incorporated into higher-level computer vision tasks until very recently. This is gradually changing, enabled by the great success achieved in the field of scene text recognition in recent years.

Through this short interactive course, doctoral students will have a chance to get acquainted with the state of the art in reading systems, especially scene text recognition, and explore how image text enables us to tackle new and exciting computer vision tasks such as fine-grained image classification, cross-modal retrieval, captioning and visual question answering.

Level

MSc / PhD

Course Duration

4 hours

Course Type

Short Course

Participation terms

Both AIDA and non-AIDA students are encouraged to participate in this short course.

If you are an AIDA Student* already, please:
Step (a): Register for the course by filling in the form on the course website.

AND

Step (b): Enroll in the same course in the AIDA system using the “Enroll on this Course” button below, so that this course enters your AIDA Certificate of Course Attendance.

If you are not an AIDA Student, do only step (a).

*The International AI Doctoral Academy (AIDA) has 73 members, which are top AI universities, research centers and industries: https://www.i-aida.org/

AIDA Students should already be registered in the AIDA system (they are PhD students or postdocs who belong to one of the AIDA Members listed on the AIDA Members page).

Language

English

Modality (online/in person):

Hybrid

Notes

The course will take place in hybrid mode. Onsite attendance is possible, subject to prior confirmation, at the Computer Vision Center, Barcelona. Availability for onsite attendees is limited. If you are interested in attending onsite, please indicate so during registration and expect to receive a confirmation. The online link will be provided by the lecturer after registration/enrollment. Basic deep learning knowledge is expected. To participate in the demo session, you should be able to use Google Colab and have a basic understanding of PyTorch / TensorFlow.
