We pursue our research on Tracking-by-Detection (TbD) by improving the performance of particle filter trackers using feedback from deep object detectors.
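The interaction between a particle filter and a detector can be illustrated with a toy one-dimensional example. This is only a minimal sketch under stated assumptions, not the tracker used in the project: the function name `track_step`, the Gaussian motion and likelihood models, and all parameter values are hypothetical, and a plain number stands in for the position reported by a deep object detector.

```python
import math
import random
from typing import List, Tuple


def track_step(
    particles: List[float],
    detection: float,
    rng: random.Random,
    motion_std: float = 1.0,    # assumed random-walk motion noise
    detector_std: float = 2.0,  # assumed detector localization noise
) -> Tuple[List[float], float]:
    """Run one predict / update / resample cycle of a 1-D particle filter."""
    # Predict: diffuse each particle with a random-walk motion model.
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]

    # Update: weight particles by how well they agree with the detection.
    weights = [
        math.exp(-0.5 * ((p - detection) / detector_std) ** 2)
        for p in predicted
    ]
    total = sum(weights)
    if total == 0.0:
        # Every weight underflowed; fall back to uniform weights.
        weights = [1.0] * len(predicted)
        total = float(len(predicted))

    # Estimate: weighted mean of the particle positions.
    estimate = sum(w * p for w, p in zip(weights, predicted)) / total

    # Resample: draw a new particle set in proportion to the weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate


rng = random.Random(0)
particles = [rng.uniform(0.0, 100.0) for _ in range(500)]
estimate = 0.0
for _ in range(10):
    # The detector is assumed to report the target at position 50.0.
    particles, estimate = track_step(particles, 50.0, rng)
```

After a few iterations the particle cloud concentrates around the detected position, which is the feedback effect described above.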
TAVOT (Target Aware Visual Object Tracking) is a visual object tracker that improves accuracy while significantly decreasing the false alarm rate.
This project describes training the YOLOv3 object detection system and presents results obtained on the WIDER Face dataset.
In this project, state-of-the-art real-time deep object detectors are adapted to mobile phones. The aim is to improve the standalone performance of these algorithms by using data acquired by mobile sensors.
We have introduced an incremental non-negative matrix factorization (INMF) scheme to overcome the difficulties that conventional NMF has in online processing of large data sets. Unlike conventional NMF, the introduced INMF, with its incremental nature and weighted cost function, adapts to dynamic content changes at low computational complexity.
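The incremental idea can be sketched with multiplicative updates that process one sample at a time. The exact weighting of the cost function is not given here, so this is only an illustration under assumptions: the forgetting factor `alpha`, the accumulated statistics `A` and `B`, and the function names `encode` and `inmf_update` are all hypothetical.

```python
import random
from typing import List

Matrix = List[List[float]]
EPS = 1e-9  # guards against division by zero


def matvec(W: Matrix, h: List[float]) -> List[float]:
    """Return W h for an n-by-r matrix W and a length-r vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def matvec_t(W: Matrix, v: List[float]) -> List[float]:
    """Return W^T v for an n-by-r matrix W and a length-n vector v."""
    r = len(W[0])
    return [sum(W[i][k] * v[i] for i in range(len(W))) for k in range(r)]


def encode(W: Matrix, v: List[float], iters: int = 50) -> List[float]:
    """Fit nonnegative h with W fixed: h <- h * (W^T v) / (W^T W h)."""
    h = [1.0] * len(W[0])
    num = matvec_t(W, v)
    for _ in range(iters):
        den = matvec_t(W, matvec(W, h))
        h = [hk * nk / (dk + EPS) for hk, nk, dk in zip(h, num, den)]
    return h


def inmf_update(W: Matrix, A: Matrix, B: Matrix, v: List[float],
                alpha: float = 0.9) -> List[float]:
    """Absorb one new sample v; W, A and B are updated in place."""
    h = encode(W, v)
    n, r = len(W), len(h)
    # Down-weight the history by alpha (assumed) and add the new sample,
    # so that A tracks V H^T and B tracks H H^T without storing V or H.
    for i in range(n):
        for k in range(r):
            A[i][k] = alpha * A[i][k] + v[i] * h[k]
    for k in range(r):
        for l in range(r):
            B[k][l] = alpha * B[k][l] + h[k] * h[l]
    # One multiplicative step on the basis: W <- W * A / (W B).
    for i in range(n):
        wb = [sum(W[i][k] * B[k][l] for k in range(r)) for l in range(r)]
        W[i] = [W[i][l] * A[i][l] / (wb[l] + EPS) for l in range(r)]
    return h


rng = random.Random(0)
n, r = 4, 2
W = [[rng.random() + 0.1 for _ in range(r)] for _ in range(n)]
A = [[0.0] * r for _ in range(n)]
B = [[0.0] * r for _ in range(r)]
for _ in range(20):
    sample = [rng.random() for _ in range(n)]  # synthetic nonnegative data
    inmf_update(W, A, B, sample)
```

Because only the small matrices `A` and `B` are kept, the cost per new sample does not grow with the number of samples already seen, which is the motivation for an incremental scheme.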
In this project, we have developed a digital audio content identification system that enables online monitoring of multiple radio/TV channels. The proposed system consists of two main modules: a digital audio watermarking module and an audio fingerprinting module. To obtain high identification accuracy, spread spectrum techniques will be used in the design of the modules.
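As a rough illustration of spread spectrum watermarking in general, the sketch below embeds one bit by adding a keyed pseudo-noise sequence to the signal and recovers it by correlation. It is not the project's design: the additive scheme, the function names, the key, and the embedding strength are assumptions, and the strength is chosen for clarity rather than inaudibility.

```python
import random
from typing import List


def pn_sequence(length: int, key: int) -> List[int]:
    """Generate a keyed pseudo-noise sequence of +1/-1 chips."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(length)]


def embed_bit(host: List[float], bit: int, key: int,
              strength: float = 0.5) -> List[float]:
    """Add the PN sequence (bit 1) or its negation (bit 0) to the host."""
    pn = pn_sequence(len(host), key)
    sign = 1 if bit else -1
    return [x + sign * strength * c for x, c in zip(host, pn)]


def detect_bit(signal: List[float], key: int) -> int:
    """Recover the bit from the sign of the correlation with the PN sequence."""
    pn = pn_sequence(len(signal), key)
    corr = sum(x * c for x, c in zip(signal, pn)) / len(signal)
    return 1 if corr > 0 else 0


# Synthetic zero-mean noise stands in for an audio frame.
host_rng = random.Random(1)
host = [host_rng.gauss(0.0, 1.0) for _ in range(4096)]

marked_one = embed_bit(host, 1, key=42)
marked_zero = embed_bit(host, 0, key=42)
bit_one = detect_bit(marked_one, key=42)
bit_zero = detect_bit(marked_zero, key=42)
```

Spreading the bit over many samples is what lets the correlation rise above the host signal, which acts as noise at the detector.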
It is well known that human speech conveys not only linguistic content but also the emotional state of the speaker. Therefore, in applications that require human-machine interaction, it is important that emotional states in human speech are fully perceived by computers. The classification step in emotion recognition is well advanced; however, determining a set of well-distinguishing features is a difficult task that requires selection among hundreds of different features.
Recently, efforts to digitize paintings and store them in databases have become widespread. Automatically identifying the art movement of a digital painting and, in addition to its extracted features, labelling the painting with its proper art movement for indexing and storage is a significant improvement in the construction of online museums.
The ITU MSPR Group participates in the TREC Video Retrieval Evaluation (TRECVID) in the Content Based Copy Detection (CBCD) task. The system proposed by ITU MSPR consists of two main modules: extraction of video fingerprints and search/retrieval. We propose a feature extraction scheme based on Nonnegative Matrix Factorization (NMF), which is an efficient dimension reduction technique in video processing. The video fingerprint generation module takes the factorization matrices generated by NMF as its input and converts them to binary hashes by differential coding.
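The hashing step can be illustrated with a minimal sketch. The exact differential coding rule is not specified here, so the rule below (one bit per pair of adjacent values, set when the value increases) is an assumption, as are the function names and the made-up coefficient values. The Hamming distance is shown as one plausible way to compare hashes during search.

```python
from typing import List, Sequence


def differential_hash(values: Sequence[float]) -> List[int]:
    """Encode a sequence as bits: 1 where the next value is larger, else 0."""
    return [1 if b > a else 0 for a, b in zip(values, values[1:])]


def hamming_distance(h1: Sequence[int], h2: Sequence[int]) -> int:
    """Count the positions where two equal-length bit sequences differ."""
    if len(h1) != len(h2):
        raise ValueError("hashes must have the same length")
    return sum(x != y for x, y in zip(h1, h2))


# Hypothetical NMF coefficients for one video segment and for a slightly
# distorted copy of it (illustrative numbers only).
original = [0.10, 0.40, 0.35, 0.80, 0.20]
distorted = [0.12, 0.38, 0.36, 0.75, 0.25]

h_orig = differential_hash(original)
h_copy = differential_hash(distorted)
distance = hamming_distance(h_orig, h_copy)
```

Because only the direction of change between neighbouring values is kept, small perturbations of the coefficients leave the hash unchanged in this example, so the distance between the original and the copy is zero.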