List of datasets for machine-learning research

These datasets are used in machine-learning research and are cited in academic papers. They are mostly oriented toward classification and recognition tasks and contain digitized images, video, text, signals, audio, and similar data.

Images

Facial recognition

Face images are widely used in developing computer-vision and facial-recognition systems, and in related image-classification tasks.

Dataset Name Brief description Preprocessing Instances Format Default Task Created (updated) Reference Creator
Face Recognition Technology (FERET) 11338 images of 1199 individuals in different positions and at different times. None. 11,338 Images Classification, face recognition 2003 [1][2] United States Department of Defense
CMU Pose, Illumination, and Expression (PIE) 41,368 color images of 68 people in 13 different poses. Images labeled with expressions. 41,368 Images, text Classification, face recognition 2000 [3][4] R. Gross et al.
SCFace Color images of faces at various angles. Location of facial features extracted. Coordinates of features given. 4,160 Images, text Classification, face recognition 2011 [5][6] M. Grgic et al.
YouTube Faces DB Videos of 1,595 different people gathered from YouTube. Each clip is between 48 and 6,070 frames. Identity of those appearing in videos and descriptors. 3,425 videos Video, text Video classification, face recognition 2011 [7][8] L. Wolf et al.
300 Videos in the Wild (300-VW) 114 videos annotated for facial landmark tracking. The 68-landmark mark-up is applied to every frame. None. 114 videos, 218,000 frames. Video, annotation file. Facial landmark tracking. 2015 [9] Shen, Jie et al.
Grammatical Facial Expressions Dataset Grammatical Facial Expressions from Brazilian Sign Language. Microsoft Kinect features extracted. 27,965 Text Facial gesture recognition 2014 [10] F. Freitas et al.
CMU Face Images Dataset Images of faces. Each person is photographed multiple times to capture different expressions. Labels and features. 640 Images, Text Face recognition 1999 [11][12] T. Mitchell
Yale Face Database Faces of 15 individuals in 11 different expressions. Labels of expressions. 165 Images Face recognition 1997 [13][14] J. Yang et al.
Cohn-Kanade AU-Coded Expression Database Large database of images with labels for expressions. Tracking of certain facial features. 500+ sequences Images, text Facial expression analysis 2000 [15][16] T. Kanade et al.
FaceScrub Images of public figures scrubbed from image searching. Name and m/f annotation. 107,818 Images, text Face recognition 2014 [17][18] H. Ng et al.
BioID Face Database Images of faces with eye positions marked. Manually set eye positions. 1521 Images, text Face recognition 2001 [19][20] BioID
Skin Segmentation Dataset Randomly sampled color values from face images. B, G, R values extracted. 245,057 Text Segmentation, classification 2012 [21][22] R. Bhatt.
Bosphorus 3D face image database. 34 action units and 6 expressions labeled; 24 facial landmarks labeled. 4652 Images, text Face recognition, classification 2008 [23][24] A. Savran et al.
UOY 3D-Face Neutral face and 5 expressions: anger, happiness, sadness, eyes closed, eyebrows raised. Labeling. 5250 Images, text Face recognition, classification 2004 [25][26] University of York
CASIA Expressions: anger, smile, laugh, surprise, closed eyes. None. 4624 Images, text Face recognition, classification 2007 [27][28] Institute of Automation, Chinese Academy of Sciences
CASIA Expressions: anger, disgust, fear, happiness, sadness, surprise. None. 480 Annotated visible-spectrum and near-infrared video captured at 25 frames per second Face recognition, classification 2011 [29] Zhao, G. et al.
BU-3DFE Neutral face and 6 expressions: anger, happiness, sadness, surprise, disgust, fear (4 intensity levels). 3D images extracted. None. 2500 Images, text Facial expression recognition, classification 2006 [30] Binghamton University
Face Recognition Grand Challenge Dataset Up to 22 samples for each subject. Expressions: anger, happiness, sadness, surprise, disgust, puffy. 3D Data. None. 4007 Images, text Face recognition, classification 2004 [31][32] National Institute of Standards and Technology
GavabDB Up to 61 samples for each subject. Expressions: neutral face, smile, frontal accentuated laugh, frontal random gesture. 3D images. None. 549 Images, text Face recognition, classification 2008 [33][34] King Juan Carlos University
3D-RMA Up to 100 subjects, expressions mostly neutral. Several poses as well. None. 9971 Images, text Face recognition, classification 2004 [35][36] Royal Military Academy (Belgium)
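Several of the datasets above, such as the Skin Segmentation Dataset, are distributed as plain text feature files rather than image files. As a minimal sketch, the snippet below parses a skin-segmentation-style file, assuming the UCI layout of tab-separated B, G, R columns followed by a class label (1 = skin, 2 = non-skin); the sample rows and the helper name `parse_skin_samples` are illustrative, not taken from the real dataset.

```python
# Illustrative rows in the UCI Skin Segmentation layout:
# tab-separated B, G, R values, then a class label (1 = skin, 2 = non-skin).
sample = """\
74\t85\t123\t1
73\t84\t122\t1
255\t107\t67\t2
"""

def parse_skin_samples(text):
    """Parse tab-separated B, G, R, label rows into (features, labels)."""
    features, labels = [], []
    for line in text.strip().splitlines():
        b, g, r, label = (int(v) for v in line.split("\t"))
        features.append((b, g, r))
        labels.append(label)
    return features, labels

X, y = parse_skin_samples(sample)
print(len(X), X[0], y)  # 3 (74, 85, 123) [1, 1, 2]
```

The resulting feature tuples can be fed directly to any tabular classifier, which is how the dataset's segmentation/classification default task is usually approached.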

Object detection and recognition

Dataset Name Brief description Preprocessing Instances Format Default Task Created (updated) Reference Creator
Visual Genome Images and their descriptions. 108,000 Images, text Image captioning 2016 [37] R. Krishna et al.
DAVIS: Densely Annotated VIdeo Segmentation 2017 150 video sequences containing 10,459 frames with a total of 376 objects annotated. Dataset released for the 2017 DAVIS Challenge with a dedicated workshop co-located with CVPR 2017. The videos contain several types of objects and humans with high-quality segmentation annotation. In each video sequence multiple instances are annotated. 10,459 Frames annotated Video object segmentation 2017 [38] Pont-Tuset, J. et al.
DAVIS: Densely Annotated VIdeo Segmentation 2016 50 video sequences containing 3455 frames with a total of 50 objects annotated. Dataset released with the CVPR 2016 paper. The videos contain several types of objects and humans with a high quality segmentation annotation. In each video sequence a single instance is annotated. 3,455 Frames annotated Video object segmentation 2016 [39] Perazzi, F. et al.
T-LESS: An RGB-D Dataset for 6D Pose Estimation of Texture-less Objects 30 industry-relevant objects. 39K training and 10K test images from each of three sensors. Two types of 3D models for each object. 6D poses for all modeled objects in all images. Per-pixel labelling can be obtained by rendering of the object models at the ground truth poses. 49,000 RGB-D images, 3D object models 6D object pose estimation, object detection 2017 [40] T. Hodan et al.
Berkeley 3-D Object Dataset 849 images taken in 75 different scenes. About 50 different object classes are labeled. Object bounding boxes and labeling. 849 labeled images, text Object recognition 2014 [41][42] A. Janoch et al.
Berkeley Segmentation Data Set and Benchmarks 500 (BSDS500) 500 natural images, explicitly separated into disjoint train, validation and test subsets + benchmarking code. Based on BSDS300. Each image segmented by five different subjects on average. 500 Segmented images Contour detection and hierarchical image segmentation 2011 [43] University of California, Berkeley
Microsoft Common Objects in Context (COCO) Complex everyday scenes of common objects in their natural context. Object highlighting, labeling, and classification into 91 object types. 2,500,000 Labeled images, text Object recognition 2015 [44][45][46] T. Lin et al.
SUN Database Very large scene and object recognition database. Places and objects are labeled. Objects are segmented. 131,067 Images, text Object recognition, scene recognition 2014 [47][48] J. Xiao et al.
ImageNet Labeled object image database, used in the ImageNet Large Scale Visual Recognition Challenge Labeled objects, bounding boxes, descriptive words, SIFT features 14,197,122 Images, text Object recognition, scene recognition 2009 (2014) [49][50][51] J. Deng et al.
Open Images A large set of images listed as having a CC BY 2.0 license, with image-level labels and bounding boxes spanning thousands of classes. Image-level labels, bounding boxes 9,178,275 Images, text Classification, object recognition 2017 [52]
TV News Channel Commercial Detection Dataset TV commercials and news broadcasts. Audio and video features extracted from still images. 129,685 Text Clustering, classification 2015 [53][54] P. Guha et al.
Statlog (Image Segmentation) Dataset The instances were drawn randomly from a database of 7 outdoor images and hand-segmented to create a classification for every pixel. Many features calculated. 2310 Text Classification 1990 [55] University of Massachusetts
Caltech 101 Pictures of objects. Detailed object outlines marked. 9146 Images Classification, object recognition. 2003 [56][57] F. Li et al.
Caltech-256 Large dataset of images for object classification. Images categorized and hand-sorted. 30,607 Images, Text Classification, object detection 2007 [58][59] G. Griffin et al.
SIFT10M Dataset SIFT features of Caltech-256 dataset. Extensive SIFT feature extraction. 11,164,866 Text Classification, object detection 2016 [60] X. Fu et al.
LabelMe Annotated pictures of scenes. Objects outlined. 187,240 Images, text Classification, object detection 2005 [61] MIT Computer Science and Artificial Intelligence Laboratory
Cityscapes Dataset Stereo video sequences recorded in street scenes, with pixel-level annotations. Metadata also included. Pixel-level segmentation and labeling 25,000 Images, text Classification, object detection 2016 [62] Daimler AG et al.
PASCAL VOC Dataset Large number of images for classification tasks. Labeling, bounding box included 500,000 Images, text Classification, object detection 2010 [63][64] M. Everingham et al.
CIFAR-10 Dataset Many small, low-resolution, images of 10 classes of objects. Classes labelled, training set splits created. 60,000 Images Classification 2009 [50][65] A. Krizhevsky et al.
CIFAR-100 Dataset Like CIFAR-10, above, but 100 classes of objects are given. Classes labelled, training set splits created. 60,000 Images Classification 2009 [50][65] A. Krizhevsky et al.
CINIC-10 Dataset A unified contribution of CIFAR-10 and Imagenet with 10 classes, and 3 splits. Larger than CIFAR-10. Classes labelled, training, validation, test set splits created. 270,000 Images Classification 2018 [66] Luke N. Darlow, Elliot J. Crowley, Antreas Antoniou, Amos J. Storkey
Fashion-MNIST A MNIST-like fashion product database Classes labelled, training set splits created. 60,000 Images Classification 2017 [67] Zalando SE
notMNIST Some publicly available fonts and extracted glyphs from them to make a dataset similar to MNIST. There are 10 classes, with letters A-J taken from different fonts. Classes labelled, training set splits created. 500,000 Images Classification 2011 [68] Yaroslav Bulatov
German Traffic Sign Detection Benchmark Dataset Images from vehicles of traffic signs on German roads. These signs comply with UN standards and therefore are the same as in other countries. Signs manually labeled 900 Images Classification 2013 [69][70] S Houben et al.
KITTI Vision Benchmark Dataset Autonomous vehicles driving through a mid-size city captured images of various areas using cameras and laser scanners. Many benchmarks extracted from data. >100 GB of data Images, text Classification, object detection 2012 [71][72][73] A Geiger et al.
Linnaeus 5 dataset Images of 5 classes of objects. Classes labelled, training set splits created. 8000 Images Classification 2017 [74] Chaladze & Kalatozishvili
FieldSAFE Multi-modal dataset for obstacle detection in agriculture including stereo camera, thermal camera, web camera, 360-degree camera, lidar, radar, and precise localization. Classes labelled geographically. >400 GB of data Images and 3D point clouds Classification, object detection, object localization 2017 [75] M. Kragh et al.
11K Hands 11,076 hand images (1600 × 1200 pixels) of 190 subjects of varying ages between 18–75 years old, for gender recognition and biometric identification. None 11,076 hand images Images and (.mat, .txt, and .csv) label files Gender recognition and biometric identification 2017 [76] M. Afifi
CORe50 Specifically designed for continuous/lifelong learning and object recognition; a collection of more than 500 videos (30 fps) of 50 domestic objects belonging to 10 different categories. Classes labelled, training set splits created based on a 3-way, multi-runs benchmark. 164,866 RGB-D images Images (.png or .pkl) and (.pkl, .txt, .tsv) label files Classification, object recognition 2017 [77] V. Lomonaco and D. Maltoni
THz and thermal video data set This multispectral data set includes terahertz, thermal, visual, near-infrared, and three-dimensional videos of objects hidden under people's clothes. 3D lookup tables are provided for projecting images onto 3D point clouds. More than 20 videos; each is about 85 seconds long (about 345 frames). AP2J Experiments with hidden object detection 2019 [78][79] Alexei A. Morozov and Olga S. Sushkova
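Many of the classification datasets above, notably CIFAR-10 and CIFAR-100, ship in a compact binary layout: the "python version" batches are pickled dicts whose data array stores each image as a flat 3072-byte row (1024 red, then 1024 green, then 1024 blue values, each plane row-major 32×32). A minimal sketch of decoding one such row follows; the flat vector and the helper name `decode_cifar_row` are illustrative stand-ins, since loading a real batch would require downloading it first.

```python
import numpy as np

def decode_cifar_row(row):
    """Reshape a flat (3072,) CIFAR-10 row into a (32, 32, 3) HWC image.

    The row stores the red plane first, then green, then blue,
    each as a row-major 32x32 block.
    """
    assert row.shape == (3072,)
    planes = row.reshape(3, 32, 32)   # channel-first: R, G, B planes
    return planes.transpose(1, 2, 0)  # to height x width x channel

# Synthetic stand-in for one image row of a pickled batch's data array.
flat = (np.arange(3072) % 256).astype(np.uint8)
img = decode_cifar_row(flat)
print(img.shape)  # (32, 32, 3)
```

Most deep-learning frameworks bundle loaders that hide this step, but knowing the raw layout is useful when working with the batch files directly.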

References

  1. Phillips, P. Jonathon, et al. "The FERET database and evaluation procedure for face-recognition algorithms." Image and vision computing 16.5 (1998): 295-306.
  2. Wiskott, Laurenz, et al. "Face recognition by elastic bunch graph matching."Pattern Analysis and Machine Intelligence, IEEE Transactions on 19.7 (1997): 775-779.
  3. Sim, Terence, Simon Baker, and Maan Bsat. "The CMU pose, illumination, and expression (PIE) database." Automatic Face and Gesture Recognition, 2002. Proceedings. Fifth IEEE International Conference on. IEEE, 2002.
  4. Schroff, Florian, et al. "Pose, illumination and expression invariant pairwise face-similarity measure via doppelgänger list comparison."Computer Vision (ICCV), 2011 IEEE International Conference on. IEEE, 2011.
  5. Grgic, Mislav, Kresimir Delac, and Sonja Grgic. "SCface–surveillance cameras face database." Multimedia tools and applications 51.3 (2011): 863-879.
  6. Wallace, Roy, et al. "Inter-session variability modelling and joint factor analysis for face authentication." Biometrics (IJCB), 2011 International Joint Conference on. IEEE, 2011.
  7. Schroff, Florian, Dmitry Kalenichenko, and James Philbin. "Facenet: A unified embedding for face recognition and clustering." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
  8. Wolf, Lior, Tal Hassner, and Itay Maoz. "Face recognition in unconstrained videos with matched background similarity." Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on. IEEE, 2011.
  9. Shen, Jie, et al. "The first facial landmark tracking in-the-wild challenge: Benchmark and results." 2015 IEEE International Conference on Computer Vision Workshop (ICCVW). IEEE, 2015.
  10. de Almeida Freitas, Fernando, et al. "Grammatical Facial Expressions Recognition with Machine Learning." FLAIRS Conference. 2014.
  11. Mitchell, Tom M. "Machine learning. WCB." (1997).
  12. Xiaofeng He and Partha Niyogi. Locality Preserving Projections. NIPS. 2003.
  13. Georghiades, A. "Yale face database." Center for Computational Vision and Control at Yale University, http://cvc.yale.edu/projects/yalefaces/yalefa 2 (1997).
  14. Nguyen, Duy, et al. "Real-time face detection and lip feature extraction using field-programmable gate arrays." Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on 36.4 (2006): 902-912.
  15. Kanade, Takeo, Jeffrey F. Cohn, and Yingli Tian. "Comprehensive database for facial expression analysis." Automatic Face and Gesture Recognition, 2000. Proceedings. Fourth IEEE International Conference on. IEEE, 2000.
  16. Zeng, Zhihong, et al. "A survey of affect recognition methods: Audio, visual, and spontaneous expressions." Pattern Analysis and Machine Intelligence, IEEE Transactions on 31.1 (2009): 39-58.
  17. Ng, Hong-Wei, and Stefan Winkler. "A data-driven approach to cleaning large face datasets." Image Processing (ICIP), 2014 IEEE International Conference on. IEEE, 2014.
  18. RoyChowdhury, Aruni; Lin, Tsung-Yu; Maji, Subhransu; Learned-Miller, Erik (2015). "One-to-many face recognition with bilinear CNNs". arXiv:1506.01342 [cs.CV].
  19. Jesorsky, Oliver, Klaus J. Kirchberg, and Robert W. Frischholz. "Robust face detection using the hausdorff distance." Audio-and video-based biometric person authentication. Springer Berlin Heidelberg, 2001.
  20. Huang, Gary B., et al. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. Vol. 1. No. 2. Technical Report 07-49, University of Massachusetts, Amherst, 2007.
  21. Bhatt, Rajen B., et al. "Efficient skin region segmentation using low complexity fuzzy decision tree model." India Conference (INDICON), 2009 Annual IEEE. IEEE, 2009.
  22. Lingala, Mounika, et al. "Fuzzy logic color detection: Blue areas in melanoma dermoscopy images." Computerized Medical Imaging and Graphics 38.5 (2014): 403-410.
  23. Maes, Chris, et al. "Feature detection on 3D face surfaces for pose normalisation and recognition." Biometrics: Theory Applications and Systems (BTAS), 2010 Fourth IEEE International Conference on. IEEE, 2010.
  24. Savran, Arman, et al. "Bosphorus database for 3D face analysis." Biometrics and Identity Management. Springer Berlin Heidelberg, 2008. 47-56.
  25. Heseltine, Thomas, Nick Pears, and Jim Austin. "Three-dimensional face recognition: An eigensurface approach." Image Processing, 2004. ICIP'04. 2004 International Conference on. Vol. 2. IEEE, 2004.
  26. Ge, Yun, et al. "3D Novel Face Sample Modeling for Face Recognition."Journal of Multimedia 6.5 (2011): 467-475.
  27. Wang, Yueming, Jianzhuang Liu, and Xiaoou Tang. "Robust 3D face recognition by local shape difference boosting." Pattern Analysis and Machine Intelligence, IEEE Transactions on 32.10 (2010): 1858–1870.
  28. Zhong, Cheng, Zhenan Sun, and Tieniu Tan. "Robust 3D face recognition using learned visual codebook." Computer Vision and Pattern Recognition, 2007. CVPR'07. IEEE Conference on. IEEE, 2007.
  29. Zhao, G., Huang, X., Taini, M., Li, S. Z., & Pietikäinen, M. (2011). Facial expression recognition from near-infrared videos. Image and Vision Computing, 29(9), 607-619.
  30. Soyel, Hamit, and Hasan Demirel. "Facial expression recognition using 3D facial feature distances." Image Analysis and Recognition. Springer Berlin Heidelberg, 2007. 831-838.
  31. Bowyer, Kevin W., Kyong Chang, and Patrick Flynn. "A survey of approaches and challenges in 3D and multi-modal 3D+ 2D face recognition." Computer vision and image understanding 101.1 (2006): 1-15.
  32. Tan, Xiaoyang, and Bill Triggs. "Enhanced local texture feature sets for face recognition under difficult lighting conditions." Image Processing, IEEE Transactions on 19.6 (2010): 1635–1650.
  33. Mousavi, Mir Hashem, Karim Faez, and Amin Asghari. "Three dimensional face recognition using SVM classifier." Computer and Information Science, 2008. ICIS 08. Seventh IEEE/ACIS International Conference on. IEEE, 2008.
  34. Amberg, Brian, Reinhard Knothe, and Thomas Vetter. "Expression invariant 3D face recognition with a morphable model." Automatic Face & Gesture Recognition, 2008. FG'08. 8th IEEE International Conference on. IEEE, 2008.
  35. İrfanoğlu, M. O., Berk Gökberk, and Lale Akarun. "3D shape-based face recognition using automatically registered facial surfaces." Pattern Recognition, 2004. ICPR 2004. Proceedings of the 17th International Conference on. Vol. 4. IEEE, 2004.
  36. Beumier, Charles, and Marc Acheroy. "Face verification from 3D and grey level clues." Pattern recognition letters 22.12 (2001): 1321–1329.
  37. Krishna, Ranjay; Zhu, Yuke; Groth, Oliver; Johnson, Justin; Hata, Kenji; Kravitz, Joshua; Chen, Stephanie; Kalantidis, Yannis; Li, Li-Jia; Shamma, David A; Bernstein, Michael S; Fei-Fei, Li (2017). Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations. International Journal of Computer Vision 123: 32–73. arXiv:1602.07332. doi:10.1007/s11263-016-0981-7.
  38. Pont-Tuset, Jordi; Perazzi, Federico; Caelles, Sergi; Arbeláez, Pablo; Sorkine-Hornung, Alex; Luc Van Gool (2017). "The 2017 DAVIS Challenge on Video Object Segmentation". arXiv:1704.00675 [cs.CV].
  39. Perazzi, Federico; Pont-Tuset, Jordi; McWilliams, Brian; Van Gool, Luc; Gross, Markus; Sorkine-Hornung, Alex (2016). A Benchmark Dataset and Evaluation Methodology for Video Object Segmentation.
  40. Hodan, T., et al. "T-LESS: An RGB-D Dataset for 6D Pose Estimation of Texture-less Objects." Winter Conference on Applications of Computer Vision (WACV) 2017.
  41. Karayev, S., et al. "A category-level 3-D object dataset: putting the Kinect to work." Proceedings of the IEEE International Conference on Computer Vision Workshops. 2011.
  42. Tighe, Joseph, and Svetlana Lazebnik. "Superparsing: scalable nonparametric image parsing with superpixels." Computer Vision–ECCV 2010. Springer Berlin Heidelberg, 2010. 352–365.
  43. Arbelaez, P.; Maire, M; Fowlkes, C; Malik, J (May 2011). Contour Detection and Hierarchical Image Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 33 (5): 898–916. PMID 20733228. doi:10.1109/tpami.2010.161. Retrieved 27 February 2016.
  44. Lin, Tsung-Yi, et al. "Microsoft coco: Common objects in context." Computer Vision–ECCV 2014. Springer International Publishing, 2014. 740–755.
  45. Russakovsky, Olga (2015). Imagenet large scale visual recognition challenge. International Journal of Computer Vision 115 (3): 211–252. arXiv:1409.0575. doi:10.1007/s11263-015-0816-y.
  46. COCO - Common Objects in Context. cocodataset.org.
  47. Xiao, Jianxiong, et al. "Sun database: Large-scale scene recognition from abbey to zoo." Computer vision and pattern recognition (CVPR), 2010 IEEE conference on. IEEE, 2010.
  48. Donahue, Jeff; Jia, Yangqing; Vinyals, Oriol; Hoffman, Judy; Zhang, Ning; Tzeng, Eric; Darrell, Trevor (2013). "DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition". arXiv:1310.1531 [cs.CV].
  49. Deng, Jia, et al. "Imagenet: A large-scale hierarchical image database."Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on. IEEE, 2009.
  50. Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012.
  51. Russakovsky, Olga; Deng, Jia; Su, Hao; Krause, Jonathan; Satheesh, Sanjeev, et al. (11 April 2015). ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision 115 (3): 211–252. arXiv:1409.0575. doi:10.1007/s11263-015-0816-y.
  52. Ivan Krasin, Tom Duerig, Neil Alldrin, Andreas Veit, Sami Abu-El-Haija, Serge Belongie, David Cai, Zheyun Feng, Vittorio Ferrari, Victor Gomes, Abhinav Gupta, Dhyanesh Narayanan, Chen Sun, Gal Chechik, Kevin Murphy. "OpenImages: A public dataset for large-scale multi-label and multi-class image classification, 2017. Available from https://github.com/openimages."
  53. Vyas, Apoorv, et al. "Commercial Block Detection in Broadcast News Videos." Proceedings of the 2014 Indian Conference on Computer Vision Graphics and Image Processing. ACM, 2014.
  54. Hauptmann, Alexander G., and Michael J. Witbrock. "Story segmentation and detection of commercials in broadcast news video." Research and Technology Advances in Digital Libraries, 1998. ADL 98. Proceedings. IEEE International Forum on. IEEE, 1998.
  55. Tung, Anthony KH, Xin Xu, and Beng Chin Ooi. "Curler: finding and visualizing nonlinear correlation clusters." Proceedings of the 2005 ACM SIGMOD international conference on Management of data. ACM, 2005.
  56. Jarrett, Kevin, et al. "What is the best multi-stage architecture for object recognition?." Computer Vision, 2009 IEEE 12th International Conference on. IEEE, 2009.
  57. Lazebnik, Svetlana, Cordelia Schmid, and Jean Ponce. "Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories."Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on. Vol. 2. IEEE, 2006.
  58. Griffin, G., A. Holub, and P. Perona. Caltech-256 object category dataset California Inst. Technol., Tech. Rep. 7694, 2007 [Online]. Available: http://authors.library.caltech.edu/7694 , 2007.
  59. Baeza-Yates, Ricardo, and Berthier Ribeiro-Neto. Modern information retrieval. Vol. 463. New York: ACM press, 1999.
  60. Fu, Xiping, et al. "NOKMeans: Non-Orthogonal K-means Hashing." Computer Vision—ACCV 2014. Springer International Publishing, 2014. 162–177.
  61. Heitz, Geremy (2009). Shape-based object localization for descriptive classification. International Journal of Computer Vision 84 (1): 40–62. doi:10.1007/s11263-009-0228-y.
  62. M. Cordts, M. Omran, S. Ramos, T. Scharwächter, M. Enzweiler, R. Benenson, U. Franke, S. Roth, and B. Schiele, "The Cityscapes Dataset." In CVPR Workshop on The Future of Datasets in Vision, 2015.
  63. Everingham, Mark (2010). The pascal visual object classes (voc) challenge. International Journal of Computer Vision 88 (2): 303–338. doi:10.1007/s11263-009-0275-4.
  64. Felzenszwalb, Pedro F. (2010). Object detection with discriminatively trained part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence 32 (9): 1627–1645. PMID 20634557. doi:10.1109/tpami.2009.167.
  65. Gong, Yunchao, and Svetlana Lazebnik. "Iterative quantization: A procrustean approach to learning binary codes." Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on. IEEE, 2011.
  66. CINIC-10 dataset. Luke N. Darlow, Elliot J. Crowley, Antreas Antoniou, Amos J. Storkey (2018). CINIC-10 is not ImageNet or CIFAR-10. 9 October 2018. Retrieved 13 November 2018.
  67. fashion-mnist: A MNIST-like fashion product database. Zalando Research. 7 October 2017. Retrieved 7 October 2017.
  68. notMNIST dataset. Machine Learning, etc. 8 September 2011. Retrieved 13 October 2017.
  69. Houben, Sebastian, et al. "Detection of traffic signs in real-world images: The German Traffic Sign Detection Benchmark." Neural Networks (IJCNN), The 2013 International Joint Conference on. IEEE, 2013.
  70. Mathias, Mayeul, et al. "Traffic sign recognition—How far are we from the solution?." Neural Networks (IJCNN), The 2013 International Joint Conference on. IEEE, 2013.
  71. Geiger, Andreas, Philip Lenz, and Raquel Urtasun. "Are we ready for autonomous driving? the kitti vision benchmark suite." Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012.
  72. Sturm, Jürgen, et al. "A benchmark for the evaluation of RGB-D SLAM systems." Intelligent Robots and Systems (IROS), 2012 IEEE/RSJ International Conference on. IEEE, 2012.
  73. The KITTI Vision Benchmark Suite on YouTube (in English)
  74. Chaladze, G., Kalatozishvili, L. (2017). Linnaeus 5 dataset. Chaladze.com. Retrieved 13 November 2017, from http://chaladze.com/l5/
  75. Kragh, Mikkel F. (2017). FieldSAFE – Dataset for Obstacle Detection in Agriculture. Sensors 17 (11): 2579. PMC 5713196. PMID 29120383. doi:10.3390/s17112579.
  76. Afifi, Mahmoud (2017-11-12). «Gender recognition and biometric identification using a large dataset of hand images». arXiv:1711.04322 [cs.CV].
  77. Lomonaco, Vincenzo; Maltoni, Davide (2017-10-18). «CORe50: a New Dataset and Benchmark for Continuous Object Recognition». arXiv:1705.03550 [cs.CV].
  78. Morozov, Alexei; Sushkova, Olga (13 June 2019). THz and thermal video data set. Development of the multi-agent logic programming approach to a human behaviour analysis in a multi-channel video surveillance. Moscow: IRE RAS. Retrieved 19 July 2019.
  79. Morozov, Alexei; Sushkova, Olga; Kershner, Ivan; Polupanov, Alexander (9 July 2019). Development of a method of terahertz intelligent video surveillance based on the semantic fusion of terahertz and 3D video images. CEUR 2391: paper19. Retrieved 19 July 2019.