Scene and View

  • Scene:
      • A single arrangement of objects in the bin.
      • Different environments.
      • Static vs. falling objects.
  • View:
      • 9 static viewpoints (45° + top view).
      • Random viewpoints.

  • Note: The number of objects is independent of the number of scenes and the number of views.
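
The two view modes above can be sketched as camera positions on a hemisphere over the bin. This is only an illustration under assumptions that the notes do not state: the 9 static views are read as 8 cameras spaced 45° apart in azimuth plus one top view, and the ring elevation (`elevation_deg`), the `radius`, and the function names are hypothetical.

```python
import math
import random

def static_viewpoints(radius=1.0, elevation_deg=45.0, azimuth_step_deg=45.0):
    """Ring of cameras plus one top view (assumed layout, not from the notes)."""
    elev = math.radians(elevation_deg)
    n_ring = int(round(360.0 / azimuth_step_deg))  # 8 cameras for a 45° step
    views = []
    for k in range(n_ring):
        az = math.radians(k * azimuth_step_deg)
        views.append((
            radius * math.cos(elev) * math.cos(az),
            radius * math.cos(elev) * math.sin(az),
            radius * math.sin(elev),
        ))
    views.append((0.0, 0.0, radius))  # top view, looking straight down
    return views

def random_viewpoint(rng, radius=1.0):
    """Sample one camera position uniformly on the upper hemisphere."""
    z = rng.uniform(0.0, 1.0)              # uniform in height gives uniform area
    az = rng.uniform(0.0, 2.0 * math.pi)
    r_xy = math.sqrt(1.0 - z * z)
    return (radius * r_xy * math.cos(az), radius * r_xy * math.sin(az), radius * z)

views = static_viewpoints()
print(len(views))  # 9
```

Each position would still need a look-at orientation toward the bin center, which is omitted here.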


Dataset Scenarios

  • a) level 1: no occlusion.
  • b) level 2: some occlusion.
  • c) level 3: incomplete objects (e.g., the scissors are cut across by the banana).
  • d) level 4: multiple instances of the same object class.
  • e) level 5: includes all difficult characteristics (incomplete and non-unique objects).

Formula

- #images = #scenes * #views
- dataset size = #images * 42 MB
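
As a quick check of the two formulas, the arithmetic can be sketched as follows. The function names are hypothetical, and decimal units (1 TB = 1,000,000 MB) are an assumption; the 42 MB per image comes from the formula above.

```python
MB_PER_IMAGE = 42  # per-image footprint from the formula above

def num_images(num_scenes: int, num_views: int) -> int:
    """#images = #scenes * #views."""
    return num_scenes * num_views

def dataset_size_tb(n_images: int, mb_per_image: float = MB_PER_IMAGE) -> float:
    """Dataset size in TB, assuming decimal units (1 TB = 1e6 MB)."""
    return n_images * mb_per_image / 1_000_000

images = num_images(6000, 1000)
print(images)                    # 6000000
print(dataset_size_tb(images))   # 252.0
print(dataset_size_tb(134_000))  # 5.628
```

With these assumptions, 134K images come to about 5.6 TB.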

Version  #Images              #Scenes       #Views  Size                         Splitting (Tr, Vl, Ts)
1        20K (LineMOD)        15            1200    800 GB
2        ~60K (FAT)           15 (3S * 5L)  -       2.5 TB
3        80K (T-LESS)         20            -       3.2 TB
4        97K (Grasp-NET)      190           -       4.1 TB
5        134K (YCB-V)         92            -       5.6 TB
6        206K (BIN-P)         -             -       8.6 TB
7        396K (StereOBJ-1M)   183s, 11e     -       ~16 TB, X => (2t > 8.6 TB)
8        600K (ObjSynth)      6             200     ~42 TB, Y => (3t > 8.6 TB)
9 (X10)  6M (Ours)            6000          1000    10t > Y

Splitting

  • 15% / 80% (LineMOD), same sequence
  • 80% / 20% (YCB-V), different sequences
  • 90% / 10% (T-LESS)
  • 50% / 20% / 30% (REAL)
  • By scenes {HOPE: (10, 10, 40)}
  • Randomly (DOPose)
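
The difference between splitting by scenes and splitting randomly can be sketched as below. This is a minimal illustration, not the procedure used by any of the listed datasets: the function names are hypothetical, and reading a tuple such as (10, 10, 40) as (train, validation, test) scene counts is an assumption based on the (Tr, Vl, Ts) order in the table header.

```python
import random

def split_by_scene(scene_ids, counts, seed=0):
    """Assign whole scenes to train/val/test so no scene spans two subsets."""
    n_train, n_val, n_test = counts
    ids = sorted(set(scene_ids))
    if n_train + n_val + n_test != len(ids):
        raise ValueError("counts must add up to the number of scenes")
    random.Random(seed).shuffle(ids)
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]

def split_randomly(image_ids, fractions, seed=0):
    """Shuffle individual images; views of one scene may land in several subsets."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)
    n_train = int(len(ids) * fractions[0])
    n_val = int(len(ids) * fractions[1])
    # The test subset takes whatever remains after train and validation.
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]

train, val, test = split_by_scene(range(60), (10, 10, 40))
print(len(train), len(val), len(test))  # 10 10 40
```

A per-scene split keeps near-duplicate views of the same arrangement out of the test set, whereas a random split over images does not.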