One important element of deep learning and machine learning at large is the dataset. A good dataset contributes to a model with good precision and recall. In the realm of object detection in images or video, there are some household names commonly used and referenced by researchers and practitioners, including Pascal, ImageNet, SUN, and COCO. In this post, we will briefly discuss the COCO dataset, in particular its distinct feature and its labeled objects.
tl;dr The COCO dataset labels from the original paper and the released versions in 2014 and 2017 can be viewed and downloaded from this repository.
A Dataset with Context
COCO stands for Common Objects in Context. As hinted by the name, images in the COCO dataset are taken from everyday scenes, thus attaching “context” to the objects captured in the scenes. We can use an analogy to explain this further. Let’s say we want to detect a person in an image. A non-contextual, isolated image would be a close-up photograph of the person. Looking at the photograph, we can only tell that it is an image of a person. However, it would be challenging to describe the environment where the photograph was taken without other supplementary images that capture not only the person but also the studio or surrounding scene.
COCO was an initiative to collect natural images, i.e. images that reflect everyday scenes and provide contextual information. In an everyday scene, multiple objects can be found in the same image, and each should be labeled as a distinct object and segmented properly. The COCO dataset provides the labeling and segmentation of the objects in the images. A machine learning practitioner can take advantage of the labeled and segmented images to create a better-performing object detection model.
Objects in COCO
As written in the original research paper, there are 91 object categories in COCO. However, only 80 object categories with labeled and segmented images were released in the first publication in 2014. Currently, there are two releases of the COCO dataset with labeled and segmented images: the 2014 release and the subsequent 2017 release. The COCO dataset is available from the download page.
To compare and confirm the available object categories in the COCO dataset, we can run a simple Python script that outputs the list of object categories. This can be replicated by following these steps on Ubuntu or other GNU/Linux distros.
1. Download 2014 train/val annotation file
$ wget http://images.cocodataset.org/annotations/annotations_trainval2014.zip
2. Download 2017 train/val annotation file
$ wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
3. Extract both zip files using unzip
$ unzip annotations_trainval2014.zip
$ unzip annotations_trainval2017.zip
This will create a directory named “annotations” that contains the dataset annotations.
4. Create a Python file named coco-object-categories.py and type the following code.
Note: This should be considered merely functional code rather than production code.
#!/usr/bin/python
# Print the COCO object categories found in the 2014 or 2017 annotation file.
import sys
import getopt
import json

cat_2014 = './annotations/instances_val2014.json'
cat_2017 = './annotations/instances_val2017.json'


def main(argv):
    json_file = None
    try:
        opts, args = getopt.getopt(argv, "hy:")
    except getopt.GetoptError:
        print('coco-object-categories.py -y <year>')
        sys.exit(2)
    for opt, arg in opts:
        if opt == '-h':
            print('coco-object-categories.py -y <year>')
            sys.exit()
        elif opt == '-y':
            # Pick the annotation file that matches the requested year.
            if arg == '2014':
                json_file = cat_2014
            else:
                json_file = cat_2017
    if json_file is not None:
        with open(json_file, 'r') as COCO:
            js = json.loads(COCO.read())
            print(json.dumps(js['categories']))


if __name__ == "__main__":
    main(sys.argv[1:])
5. Run the Python file
$ python coco-object-categories.py -y 2014
$ python coco-object-categories.py -y 2017
6. Observe the JSON output
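Each entry in the categories list carries the category id, its name, and its supercategory. An abbreviated excerpt of the output (reformatted for readability) looks like this:

[
  {"supercategory": "person", "id": 1, "name": "person"},
  {"supercategory": "vehicle", "id": 2, "name": "bicycle"},
  {"supercategory": "vehicle", "id": 3, "name": "car"},
  ...
]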
After the observation, we can build the following table comparing the object category lists of the original paper and the two dataset releases.
ID | Object (Paper) | Object (2014 Rel.) | Object (2017 Rel.) | Super Category |
---|---|---|---|---|
1 | person | person | person | person |
2 | bicycle | bicycle | bicycle | vehicle |
3 | car | car | car | vehicle |
4 | motorcycle | motorcycle | motorcycle | vehicle |
5 | airplane | airplane | airplane | vehicle |
6 | bus | bus | bus | vehicle |
7 | train | train | train | vehicle |
8 | truck | truck | truck | vehicle |
9 | boat | boat | boat | vehicle |
10 | traffic light | traffic light | traffic light | outdoor |
11 | fire hydrant | fire hydrant | fire hydrant | outdoor |
12 | street sign | - | - | outdoor |
13 | stop sign | stop sign | stop sign | outdoor |
14 | parking meter | parking meter | parking meter | outdoor |
15 | bench | bench | bench | outdoor |
16 | bird | bird | bird | animal |
17 | cat | cat | cat | animal |
18 | dog | dog | dog | animal |
19 | horse | horse | horse | animal |
20 | sheep | sheep | sheep | animal |
21 | cow | cow | cow | animal |
22 | elephant | elephant | elephant | animal |
23 | bear | bear | bear | animal |
24 | zebra | zebra | zebra | animal |
25 | giraffe | giraffe | giraffe | animal |
26 | hat | - | - | accessory |
27 | backpack | backpack | backpack | accessory |
28 | umbrella | umbrella | umbrella | accessory |
29 | shoe | - | - | accessory |
30 | eye glasses | - | - | accessory |
31 | handbag | handbag | handbag | accessory |
32 | tie | tie | tie | accessory |
33 | suitcase | suitcase | suitcase | accessory |
34 | frisbee | frisbee | frisbee | sports |
35 | skis | skis | skis | sports |
36 | snowboard | snowboard | snowboard | sports |
37 | sports ball | sports ball | sports ball | sports |
38 | kite | kite | kite | sports |
39 | baseball bat | baseball bat | baseball bat | sports |
40 | baseball glove | baseball glove | baseball glove | sports |
41 | skateboard | skateboard | skateboard | sports |
42 | surfboard | surfboard | surfboard | sports |
43 | tennis racket | tennis racket | tennis racket | sports |
44 | bottle | bottle | bottle | kitchen |
45 | plate | - | - | kitchen |
46 | wine glass | wine glass | wine glass | kitchen |
47 | cup | cup | cup | kitchen |
48 | fork | fork | fork | kitchen |
49 | knife | knife | knife | kitchen |
50 | spoon | spoon | spoon | kitchen |
51 | bowl | bowl | bowl | kitchen |
52 | banana | banana | banana | food |
53 | apple | apple | apple | food |
54 | sandwich | sandwich | sandwich | food |
55 | orange | orange | orange | food |
56 | broccoli | broccoli | broccoli | food |
57 | carrot | carrot | carrot | food |
58 | hot dog | hot dog | hot dog | food |
59 | pizza | pizza | pizza | food |
60 | donut | donut | donut | food |
61 | cake | cake | cake | food |
62 | chair | chair | chair | furniture |
63 | couch | couch | couch | furniture |
64 | potted plant | potted plant | potted plant | furniture |
65 | bed | bed | bed | furniture |
66 | mirror | - | - | furniture |
67 | dining table | dining table | dining table | furniture |
68 | window | - | - | furniture |
69 | desk | - | - | furniture |
70 | toilet | toilet | toilet | furniture |
71 | door | - | - | furniture |
72 | tv | tv | tv | electronic |
73 | laptop | laptop | laptop | electronic |
74 | mouse | mouse | mouse | electronic |
75 | remote | remote | remote | electronic |
76 | keyboard | keyboard | keyboard | electronic |
77 | cell phone | cell phone | cell phone | electronic |
78 | microwave | microwave | microwave | appliance |
79 | oven | oven | oven | appliance |
80 | toaster | toaster | toaster | appliance |
81 | sink | sink | sink | appliance |
82 | refrigerator | refrigerator | refrigerator | appliance |
83 | blender | - | - | appliance |
84 | book | book | book | indoor |
85 | clock | clock | clock | indoor |
86 | vase | vase | vase | indoor |
87 | scissors | scissors | scissors | indoor |
88 | teddy bear | teddy bear | teddy bear | indoor |
89 | hair drier | hair drier | hair drier | indoor |
90 | toothbrush | toothbrush | toothbrush | indoor |
91 | hair brush | - | - | indoor |
As you can see, the object lists of the 2014 and 2017 releases are identical: 80 objects out of the original 91 object categories in the paper.
If you need to have the object list as a text file, you can view and download it from this repository.
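If you prefer to generate the text file yourself, a small variation of the script above writes the category names one per line (a minimal sketch; the output file name coco-labels-2017.txt is arbitrary):

#!/usr/bin/python
# Dump the COCO 2017 category names, one per line, to a text file.
import json

with open('./annotations/instances_val2017.json', 'r') as f:
    categories = json.load(f)['categories']

with open('coco-labels-2017.txt', 'w') as out:
    for cat in categories:
        out.write(cat['name'] + '\n')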
Moving to the discrepancies between the object list in the paper and the dataset releases, the missing object categories / labels are identical in the 2014 and 2017 releases. So, let’s compile this data and create another table that lists those missing labels.
ID | Object | Super Category |
---|---|---|
12 | street sign | outdoor |
26 | hat | accessory |
29 | shoe | accessory |
30 | eye glasses | accessory |
45 | plate | kitchen |
66 | mirror | furniture |
68 | window | furniture |
69 | desk | furniture |
71 | door | furniture |
83 | blender | appliance |
91 | hair brush | indoor |
What does this mean? Practically, you need to source images from other datasets if your objective is to build a model that also supports detection of the missing object categories / labels.
Beyond COCO Objects
Despite providing a sufficient list of objects, there can be circumstances where the object you want to identify is not included in the COCO label list. This is especially true when building models for another / more specific domain or when adding context to the object identification. There are at least two common approaches to this:
- Manual labeling and modeling: Objects are labeled using a bounding box or segmentation technique, and a neural network for object recognition (e.g. R-CNN, Fast R-CNN, Faster R-CNN) is applied to generate a new model for object detection
- Transfer learning: An existing pre-trained model is adapted to perform object recognition in a new domain. A prevalent technique is to reuse the hidden layers of the pre-trained model to extract object features and replace the final / output layer with a classifier that is specific to the new domain (see the sketch after this list).
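To make the transfer learning idea more concrete, here is a minimal sketch using torchvision (an assumption on my part; the post does not prescribe a specific library). It reuses the hidden layers of a pre-trained ResNet-50 as a frozen feature extractor and replaces the final layer with a classifier for a hypothetical new domain.

import torch
import torch.nn as nn
from torchvision import models

NUM_NEW_CLASSES = 5  # hypothetical number of classes in the new domain

# Load a model pre-trained on ImageNet.
model = models.resnet50(pretrained=True)

# Freeze the pre-trained hidden layers; they act as a feature extractor.
for param in model.parameters():
    param.requires_grad = False

# Replace the final / output layer with a classifier for the new domain.
model.fc = nn.Linear(model.fc.in_features, NUM_NEW_CLASSES)

# Only the parameters of the new head are trained.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)

The frozen layers keep the generic features learned from the source dataset, while the new head learns the mapping to the labels of the new domain.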
Transfer learning can be considered an advanced topic in computer vision. Reading various arXiv papers can be helpful in gaining a better understanding of transfer learning. Additionally, you can enroll in an online course to supplement the theoretical knowledge with practical know-how. The course Advanced Computer Vision elaborates on transfer learning and provides a sample implementation that can be referenced when building your own model. If you have just recently started your journey in machine learning / AI and want to develop not only theoretical but also implementation skills, you can also consider the TensorFlow AI course or the PyTorch AI course.
Brilliant. I have been looking for this list everywhere. Thank you for publishing.
Is there a trained model for fish detection (it would be really helpful if it performs semantic/instance segmentation rather than just bounding boxes)?
CIFAR-100 has a fish superclass, even though its object classes are rather limited. ImageNet also has a fish superclass and more object classes. If you only want to identify fish, but not the species, using segmentation, you can build the fish model with Detectron.
great!
I need a trained model for clothes/fashion. Is there any?
If you want to differentiate the clothes by their features, you can use a semantic segmentation approach and train a new model.
Is there a model that detects industrial materials like bolts, spanners, screwdrivers, etc.?
It is more likely that you will end up building the model with transfer learning.
I need only the vehicle classes; how do I remove the others?
You may find this project helpful for your need: https://github.com/cocodataset/cocoapi
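For illustration, here is a minimal sketch with pycocotools (the Python API from that repository), assuming the 2017 annotation file from the steps above has already been downloaded:

from pycocotools.coco import COCO

coco = COCO('./annotations/instances_val2017.json')

# Category ids that belong to the 'vehicle' supercategory.
vehicle_cat_ids = coco.getCatIds(supNms=['vehicle'])

# getImgIds(catIds=...) intersects categories, so take the union over the
# vehicle categories to get images containing at least one vehicle.
img_ids = set()
for cat_id in vehicle_cat_ids:
    img_ids.update(coco.getImgIds(catIds=[cat_id]))

# Keep only the vehicle annotations for those images.
ann_ids = coco.getAnnIds(imgIds=list(img_ids), catIds=vehicle_cat_ids)
vehicle_anns = coco.loadAnns(ann_ids)
print(len(img_ids), 'images,', len(vehicle_anns), 'vehicle annotations')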
Hi, is there a pre-trained model / weights class for shoe wear?
Is there a model that detects fruit and vegetable plants and a few birds?
Have you checked the Open Model Zoo? You can choose one of the pre-trained models and train it with your custom objects. Here is a sample implementation for fruits.
Hi, I need your help. Please tell me how to apply an ANN on the COCO dataset as well as on the MSRC datasets.
Hi, do you know if there is a trained model for detecting tadpoles (toad larvae)?
amazing…… 🙂
Hi Michael, I have read this blog and you are simply awesome, because you have a solution to everyone’s problem.
Now please help me too. I have tried but I couldn’t do it.
I’m working on a project where I need to detect specific brands of objects in images, even when there are similar objects from different brands present. How much data would I need for effective detection? Additionally, how many annotations would be necessary?
In a typical image, there are 3-6 target objects and around 20 other objects. What would be the best approach for annotating these images?