Vision Transformers: A Review — Part I

  • Part I — Introduction to Transformer & ViT
  • Part II & III — Key problems of ViT and its improvement

1. What is a Transformer?

“Jane is a travel blogger and also a very talented guitarist.”

Figure 1. The architecture of the Transformer model (image from [1])

2. Vision Transformer

Figure 2. The architecture of ViT (image from [2])

3. Summary

References

Leading big data and AI-powered solution company https://www.sertiscorp.com/
