Conference paper Open Access

SPEECH DETECTION ON BROADCAST AUDIO

Zubari, Unal; Ozan, Ezgi Can; Acar, Banu Oskay; Ciloglu, Tolga; Esen, Ersin; Ates, Tugrul K.; Onur, Duygu Oskay


JSON-LD (schema.org)

{
  "@context": "https://schema.org/", 
  "@id": 92905, 
  "@type": "ScholarlyArticle", 
  "creator": [
    {
      "@type": "Person", 
      "affiliation": "TUBITAK UZAY, Video & Audio Proc Grp, TR-06531 Ankara, Turkey", 
      "name": "Zubari, Unal"
    }, 
    {
      "@type": "Person", 
      "name": "Ozan, Ezgi Can"
    }, 
    {
      "@type": "Person", 
      "affiliation": "TUBITAK UZAY, Video & Audio Proc Grp, TR-06531 Ankara, Turkey", 
      "name": "Acar, Banu Oskay"
    }, 
    {
      "@type": "Person", 
      "affiliation": "METU, Dept Elect & Elect Engn, TR-06531 Ankara, Turkey", 
      "name": "Ciloglu, Tolga"
    }, 
    {
      "@type": "Person", 
      "affiliation": "TUBITAK UZAY, Video & Audio Proc Grp, TR-06531 Ankara, Turkey", 
      "name": "Esen, Ersin"
    }, 
    {
      "@type": "Person", 
      "name": "Ates, Tugrul K."
    }, 
    {
      "@type": "Person", 
      "affiliation": "TUBITAK UZAY, Video & Audio Proc Grp, TR-06531 Ankara, Turkey", 
      "name": "Onur, Duygu Oskay"
    }
  ], 
  "datePublished": "2010-01-01", 
  "description": "Speech boundary detection contributes to the performance of speech-based applications such as speech recognition and speaker recognition. The speech boundary detector implemented in this study works on broadcast audio as a pre-processor module of a keyword spotter. Speech boundary detection is handled in three steps. In the first step, audio data is segmented into homogeneous regions in an unsupervised manner. After an ACTIVITY/NON-ACTIVITY decision is made for each region, ACTIVITY regions are classified as Speech/Non-speech via Gaussian Mixture Model (GMM) based classification. The GMMs are trained using a novel feature, Spectral Flow Direction (SFD), and an improved multi-band harmonicity feature, in addition to the widely used Mel Frequency Cepstral Coefficients (MFCCs).", 
  "headline": "SPEECH DETECTION ON BROADCAST AUDIO", 
  "identifier": 92905, 
  "image": "https://aperta.ulakbim.gov.tr/static/img/logo/aperta_logo_with_icon.svg", 
  "license": "http://www.opendefinition.org/licenses/cc-by", 
  "name": "SPEECH DETECTION ON BROADCAST AUDIO", 
  "url": "https://aperta.ulakbim.gov.tr/record/92905"
}
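
The abstract outlines a pipeline in which each segmented region first receives an ACTIVITY/NON-ACTIVITY decision and ACTIVITY regions are then labelled Speech or Non-speech by comparing GMMs. The following is a minimal sketch of those last two steps, not the authors' implementation. It assumes that segmentation has already produced per-region feature matrices, that the activity decision is a plain energy threshold (the record does not say how it is made), and that generic feature vectors stand in for the MFCC, SFD and harmonicity features. All function names, the diagonal-covariance EM fit and the threshold parameter are hypothetical.

```python
import numpy as np

def _component_log_probs(X, weights, means, variances):
    # Log of weight * Gaussian density per frame and component: (n_frames, n_components)
    diff = X[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                        + np.sum(np.log(2 * np.pi * variances), axis=1))
    return np.log(weights) + log_gauss

def fit_diag_gmm(X, n_components=2, n_iter=50, seed=0):
    """Fit a diagonal-covariance GMM with plain EM (illustrative, not optimized)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    means = X[rng.choice(n, n_components, replace=False)]
    variances = np.tile(X.var(axis=0) + 1e-6, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each frame
        log_p = _component_log_probs(X, weights, means, variances)
        log_norm = np.logaddexp.reduce(log_p, axis=1, keepdims=True)
        resp = np.exp(log_p - log_norm)
        # M-step: re-estimate weights, means and variances
        nk = resp.sum(axis=0) + 1e-10
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        variances = np.maximum((resp.T @ X ** 2) / nk[:, None] - means ** 2, 1e-6)
    return weights, means, variances

def avg_log_likelihood(X, gmm):
    """Mean per-frame log-likelihood of a region under a fitted GMM."""
    log_p = _component_log_probs(X, *gmm)
    return float(np.mean(np.logaddexp.reduce(log_p, axis=1)))

def classify_region(features, energy, speech_gmm, nonspeech_gmm, energy_threshold):
    """Label one region; the energy threshold is an assumed activity criterion."""
    if energy < energy_threshold:
        return "NON-ACTIVITY"
    speech_score = avg_log_likelihood(features, speech_gmm)
    nonspeech_score = avg_log_likelihood(features, nonspeech_gmm)
    return "Speech" if speech_score > nonspeech_score else "Non-speech"
```

To use the sketch, one would fit `fit_diag_gmm` once on labelled speech frames and once on labelled non-speech frames, then call `classify_region` on every segmented region. Comparing average log-likelihoods makes the decision independent of region length.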
Views 29
Downloads 10
Data volume 1.8 kB
Unique views 28
Unique downloads 10
