Deep Learning from Large-Scale Weakly Labeled Data (Kuiyuan Yang)

2020-02-27, 57 views

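  • 1. Deep Learning from Large-Scale Weakly Labeled Data. Kuiyuan Yang, Microsoft Research Asia
  • 2. Deep learning: deep models, data, computing resources
  • 3. Data annotation • Image classification • Fine-grained image classification: Beagle, Corgi, Papillon
  • 4. Data annotation • Image classification • Fine-grained image classification: Laysan Albatross, Hooded Oriole, Bohemian Waxwing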
  • 5. Image → sub-images. Chihuahua. Bottom-up proposals
  • 6. Labeling object regions. Husky? Boxer? Basenji? + Dog? Labrador? Object-level FilterNet
  • 7. Labeling part regions. Filters learned in conv4. Part 1 detector, Part 2 detector
  • 8. Two-level attention. Bottom-up proposals → object-level filtering → object-level classifier; part-level filtering (part 1, part 2) → part-level classifier; object-level classifier + part-level classifier → classification result
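As a rough illustration of the pipeline on slide 8, the sketch below keeps only the proposals that an object-level filter scores as relevant, averages their class scores, and adds the part-level class scores to pick a label. The function names, the 0.5 threshold, the averaging over patches, and the toy scores are all assumptions made for illustration; the slides do not specify them.

```python
from typing import Dict, List

def filter_proposals(proposals: List[dict], threshold: float = 0.5) -> List[dict]:
    """Object-level filtering: keep patches the filter judges to contain the object."""
    return [p for p in proposals if p["object_score"] >= threshold]

def average_scores(patches: List[dict]) -> Dict[str, float]:
    """Average the per-class scores over the retained patches."""
    classes = patches[0]["class_scores"].keys()
    return {c: sum(p["class_scores"][c] for p in patches) / len(patches) for c in classes}

def two_level_predict(object_scores: Dict[str, float], part_scores: Dict[str, float]) -> str:
    """Add object-level and part-level scores, then take the best class."""
    combined = {c: object_scores[c] + part_scores[c] for c in object_scores}
    return max(combined, key=combined.get)

# Toy proposals (hypothetical values): filter score plus per-class scores.
proposals = [
    {"object_score": 0.9, "class_scores": {"husky": 0.6, "boxer": 0.3, "basenji": 0.1}},
    {"object_score": 0.2, "class_scores": {"husky": 0.1, "boxer": 0.8, "basenji": 0.1}},
    {"object_score": 0.7, "class_scores": {"husky": 0.5, "boxer": 0.2, "basenji": 0.3}},
]
kept = filter_proposals(proposals)
object_level = average_scores(kept)
part_level = {"husky": 0.4, "boxer": 0.5, "basenji": 0.1}  # hypothetical part-level output
prediction = two_level_predict(object_level, part_level)
print(prediction)
```

The low-scoring second proposal is discarded before classification, so its strong "boxer" vote never reaches the combined score.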
  • 9. ImageNet-Dog. Training set: 154K images, 118 dog classes. Top-1 error rate (%) by method: image-level label 40.1; object-level attention 30.3; part-level attention 35.2; two-level attention 28.1
  • 10. Caltech-UCSD Bird. Training set: 12K images, 200 bird classes. Accuracy (%) by method, with the annotations used in the training phase in parentheses: DeCAF [7] (BBox) 58.8; Part RCNN [26] (BBox, part) 73.5; Part RCNN [26] (BBox, part) 76.7; Part Discovery [17] 53.8; Object-level 67.6; Part-level 64.9; Two-level (AlexNet) 69.7; Two-level (VGGNet) 77.9. Testing phase (BBox info, part info): √ √
  • 11. Click data. different types of french braids Beyomce; facebook filipino cookinh chow chow dog for adoption; puppy blue prints of jupiter old english sheepdog bay windown sheers beyonce black and white funny nurse happy fathers day sweetheart hyundai santa fe dalmatian timing belt change puppies; cute pet cutlass supreme miniature pinscher puppies neru akita pomeranian puppies
  • 12. Vector representations of images and query words • Word vector: $w \mapsto v_w$, $w \in V$ • Vector of the query words of a clicked image $I$: $v_{Q_I} = \sum_{w_j \in Q_I} \alpha_j v_{w_j}$ • Image vector: $I \mapsto f(I)$
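A minimal sketch of the query-side representation on slide 12: each query word has a weight and a vector, and the query vector is their weighted sum. The weights reuse the example from slide 14 (small 0.1, dog 0.6, white 0.05, chihuahua 0.9); the two-dimensional word vectors and the function name `query_vector` are made up for illustration.

```python
from typing import Dict, List

def query_vector(weights: Dict[str, float], word_vectors: Dict[str, List[float]]) -> List[float]:
    """Weighted sum of word vectors over the query words of one clicked image."""
    dim = len(next(iter(word_vectors.values())))
    v = [0.0] * dim
    for word, alpha in weights.items():
        for k, x in enumerate(word_vectors[word]):
            v[k] += alpha * x
    return v

# Term weights taken from the slide 14 example.
weights = {"small": 0.1, "dog": 0.6, "white": 0.05, "chihuahua": 0.9}
# Toy 2-d word vectors (hypothetical; real ones would be learned).
word_vectors = {
    "small": [1.0, 0.0],
    "dog": [0.0, 1.0],
    "white": [1.0, 1.0],
    "chihuahua": [0.0, 2.0],
}
v_q = query_vector(weights, word_vectors)
print(v_q)
```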
  • 13. Image-query association: $\mathrm{sim}(Q_I, I) = \langle v_{Q_I}, f(I) \rangle = \langle \sum_{w_j \in Q_I} \alpha_j v_{w_j}, f(I) \rangle = \langle q, W^\top x \rangle$. To prevent trivial solutions, the inner product is replaced by the normalized cosine similarity: $\mathrm{sim}(Q_I, I) = \frac{\langle q, W^\top x \rangle}{\lVert q \rVert \cdot \lVert W^\top x \rVert}$
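One way to read the "trivial solution" remark on slide 13: a raw inner product can be inflated simply by scaling the image-side output, whereas cosine similarity ignores scale. The sketch below shows this with toy numbers; the image-side vector `y` is hypothetical and stands in for whatever the network projects into the word space.

```python
import math
from typing import List

def inner(a: List[float], b: List[float]) -> float:
    """Plain inner product."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a: List[float], b: List[float]) -> float:
    """Inner product divided by the product of the norms (0.0 for a zero vector)."""
    na = math.sqrt(inner(a, a))
    nb = math.sqrt(inner(b, b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return inner(a, b) / (na * nb)

q = [0.1, 0.6, 0.05, 0.9]        # BoW term weights from the slide 14 example
y = [0.2, 0.5, 0.0, 0.7]         # hypothetical image output in the word space
y_scaled = [10.0 * v for v in y]  # same direction, ten times larger

print(inner(q, y), inner(q, y_scaled))    # inner product grows with the scale
print(cosine(q, y), cosine(q, y_scaled))  # cosine similarity does not
```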
  • 14. Deep network modeling (BoWDNN). Image $I$; queries $Q_I$: small dog, white dog, chihuahua. BoW tf-idf representation $q$ (term: weight): small 0.1, dog 0.6, white 0.05, chihuahua 0.9. [Diagram: image $I$ → deep network → word vector space, scored against $q$ by $\mathrm{sim}(Q_I, I)$]
  • 15. Objective function • Click dataset: $D = \{(I_i, Q_{I_i})\}_{i=1}^{N}$ • Objective: $L_0(W) = -\frac{1}{N} \sum_{i=1}^{N} \mathrm{sim}(Q_{I_i}, I_i) + \frac{\lambda}{2} \lVert W \rVert_2^2$ • Avoiding learning duplicate patterns: $L(W) = L_0(W) + \mu \sum_{l=1}^{M} \sum_{i=1}^{K_l} \sum_{j=i+1}^{K_l} \langle w_{l,i}, w_{l,j} \rangle$
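The objective on slide 15 can be sketched numerically as follows: a base term (negative mean similarity plus an L2 weight penalty) and an extra term summed over pairs of filters in the same layer, which grows when filters duplicate each other. The pairwise term is taken here to be a plain inner product, and the names `base_objective`, `cross_filter_penalty`, `lam`, and `mu` are assumptions; the exact form used by the author is not recoverable from the slide.

```python
from typing import List

Vector = List[float]

def inner(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))

def base_objective(sims: List[float], weights: List[Vector], lam: float) -> float:
    """Negative mean similarity plus (lam / 2) times the squared L2 norm of the weights."""
    l2_squared = sum(inner(row, row) for row in weights)
    return -sum(sims) / len(sims) + 0.5 * lam * l2_squared

def cross_filter_penalty(layers: List[List[Vector]]) -> float:
    """Sum of pairwise terms over filter pairs (i < j) within each layer.
    Assumption: the pairwise term is the inner product of the two filters."""
    total = 0.0
    for filters in layers:
        for i in range(len(filters)):
            for j in range(i + 1, len(filters)):
                total += inner(filters[i], filters[j])
    return total

sims = [0.8, 0.6]                    # toy per-sample similarities
weights = [[1.0, 0.0], [0.0, 1.0]]   # toy weight matrix
l0 = base_objective(sims, weights, lam=0.1)

duplicated = [[[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]]  # one layer, two identical filters
distinct = [[[1.0, 0.0], [0.0, 1.0]]]                # one layer, orthogonal filters
mu = 0.5
loss_duplicated = l0 + mu * cross_filter_penalty(duplicated)
loss_distinct = l0 + mu * cross_filter_penalty(distinct)
print(loss_duplicated, loss_distinct)
```

With these toy values the layer containing two identical filters is penalized, while the orthogonal pair adds nothing to the loss.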
  • 16. Learning curves. Without cross-filter regularization; with cross-filter regularization
  • 17. Learning results. Without cross-filter regularization; with cross-filter regularization
  • 18. Image and word representations • Visually similar words are closely distributed in a meaningful way. MM 2015
  • 19. MSR Bing Grand Challenge
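  • 20. Automatic dataset construction. Step 2: Query formation. Step 3: Noisy image removal
  • 21. Automatic dataset construction • For each category, construct query words • Remove noisy images
  • 22. Experimental results • AutoSet-10: uses the 10 categories of CIFAR-10 (airplane, automobile, bird, cat, etc.) • AutoSet-1K: 1,000 common categories (cats, cars, electronics, video games, etc.)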
  • 23. Cross-dataset generalization. [Figure: test on ImageNet-10; series: CIFAR-10, ImageNet-10, AutoSet-10]
  • 24. AutoSet-1K http://ylbai.asiteof.me/autoset • Scale: 1,000 classes, 2.5 million images • Accuracy: 95.2% (ImageNet: 99.7%) • Diversity. MM 2015
  • 25. AutoSet-1K examples. MM 2015
  • 26. Automatically building a dog dataset. 40M Clickture logs, example queries: different types of french braids; beyomce; filipino cookinh; chow chow dog for adoption; blue prints of jupiter; old english sheepdog; bay windown sheers; beyonce black and white; funny nurse; happy fathers day sweetheart; hyundai santa fe timing belt change; dalmatian puppies; cutlass supreme; miniature pinscher puppies; neru akita; pomeranian puppies. Dog related words: dog:1.00, puppy:0.68, hound:0.66, dogss:0.65, breed:0.65, dosg:0.63, puppie:0.62, spaniel:0.61, dopg:0.61, breeder:0.60, mix:0.59, boxer:0.58, cocker:0.57, pupy:0.57, retriever:0.57, dane:0.56, beagle:0.54, mastiff:0.52, sheepdog:0.52, dogpic:0.51, puppo:0.51, cataho:0.51, ovcharka:0.51, retreiver:0.50, doggy:0.50, retrever:0.50, dogd:0.50. Dog-related queries: old english sheepdog; chow chow dog for adoption; dalmatian puppies; miniature pinscher puppies; pomeranian puppies. Result: 95K + 68K dog images
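The query selection on slide 26 can be sketched as a two-step filter: keep the words whose score against "dog" clears a threshold, then keep the queries that contain one of those words. The scores and queries below are copied from the slide; the 0.5 threshold, the naive trailing-"s" plural handling, and the function names are assumptions made for illustration.

```python
from typing import Dict, List, Set

# A subset of the word scores listed on slide 26.
word_scores: Dict[str, float] = {
    "dog": 1.00, "puppy": 0.68, "hound": 0.66, "breed": 0.65, "puppie": 0.62,
    "spaniel": 0.61, "boxer": 0.58, "sheepdog": 0.52, "doggy": 0.50,
}

# A subset of the example queries on slide 26.
queries: List[str] = [
    "different types of french braids",
    "chow chow dog for adoption",
    "blue prints of jupiter",
    "old english sheepdog",
    "funny nurse",
    "dalmatian puppies",
    "cutlass supreme",
    "miniature pinscher puppies",
    "pomeranian puppies",
]

def related_words(scores: Dict[str, float], threshold: float) -> Set[str]:
    """Words whose score is at least the threshold."""
    return {w for w, s in scores.items() if s >= threshold}

def is_dog_query(query: str, vocab: Set[str]) -> bool:
    """True if any token, or the token minus a trailing 's', is a related word."""
    for token in query.lower().split():
        if token in vocab or (token.endswith("s") and token[:-1] in vocab):
            return True
    return False

vocab = related_words(word_scores, threshold=0.5)  # threshold is an assumption
kept = [q for q in queries if is_dog_query(q, vocab)]
print(kept)
```

Note that misspelled log words such as "puppie" are what let "puppies" match here, which suggests why the noisy variants in the word list are useful rather than harmful.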
  • 27. MSR Bing Grand Challenge: online classification of 344 dog breeds. [Figure: Top-5 accuracy (%), weak supervision vs. strong supervision]
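For readers unfamiliar with the metric on slide 27, top-5 accuracy counts a prediction as correct when the true class is among the five highest-scoring classes. The sketch below computes it on made-up scores; it does not reproduce the challenge's evaluation code, and the function name `top_k_accuracy` is hypothetical.

```python
from typing import List

def top_k_accuracy(scores: List[List[float]], labels: List[int], k: int = 5) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)

# Two toy samples over six classes (hypothetical scores).
scores = [
    [0.10, 0.20, 0.30, 0.05, 0.15, 0.20],
    [0.50, 0.10, 0.10, 0.10, 0.10, 0.10],
]
labels = [1, 0]
top5 = top_k_accuracy(scores, labels, k=5)
top1 = top_k_accuracy(scores, labels, k=1)
print(top5, top1)
```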
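  • 28. Conclusions • The weaker the supervision, the more data is available. • Deep models can learn from weakly labeled data and automatically produce strong annotations.
  • 29. Thank you! kuyang@microsoft.com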
  • 30. References • Bag-of-Words Based Deep Neural Network for Image Retrieval. ACM Multimedia, 2014. • The Application of Two-level Attention Models in Deep Convolutional Neural Network for Fine-grained Image Classification. CVPR, 2015. • Automatic Image Dataset Construction from Click-through Logs Using Deep Neural Network. ACM Multimedia, 2015. • Improve Dog Recognition By Mining More Information From Both Click-through Logs and Pre-trained Models. ICME workshop, 2016.