Technology Overview
eMedia Monitor is a technology company that continually advances the state of the art in Natural Language Processing.
With our roots in spoken language processing, we also apply the statistical algorithms used
on spoken language to text-oriented information, e.g. for semantic analysis.
Our velocity in research and development of natural language technology allows us to maintain a clear lead over our closest rivals.
This technology advantage lets us introduce technology-based services ahead of our competitors and open new markets.
For example, we pioneered real-time analysis of broadcast content in 2007, a full two years ahead of anybody else in the
content management industry.
Spoken Language Processing
Spoken language differs considerably from other information sources.
Audio recordings vary significantly in signal-to-noise ratio, bandwidth, audio encoding, language,
dialect, speaking style, gender and pitch, to name just a few factors, each of which poses its own challenge.
At the same time, the redundancy of speech makes spoken language processing with a focus on
information retrieval a more promising field than, e.g., a transcription task.
Spoken language processing includes technologies like Speech Recognition, Speech Understanding,
Speaker Recognition and Language Recognition.
We look at different levels of spoken language processing: acoustic, lexical, syntactic and semantic,
as well as the communication process itself.
Our work is inherently multidisciplinary, requiring competence in signal processing, acoustics,
phonetics and phonology, linguistics and computer science.
Having a production environment in-house allows us to validate any research results immediately on live data.
This is important, as too many scientists work on artificially created, years-old test beds to prove their
results, and lack the expertise and resources to validate them on live data.
Audio Segmentation
When listening to a stream of audio coming in e.g. from a TV or Radio station, the first step in analyzing the
data is to segment it according to acoustic characteristics.
Is it a person speaking, or a song?
Is it a person speaking for a long time (e.g. reading a novel), or is it a dialog between two or more speakers?
Audio segmentation gives us these answers.
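A minimal sketch of the idea behind acoustic segmentation (toy thresholds and synthetic data, not our production segmenter): classify fixed-length frames by zero-crossing rate, a cheap acoustic characteristic, then merge runs of equal labels into segments.

```python
import math

def zero_crossing_rate(frame):
    # Fraction of adjacent sample pairs whose signs differ.
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0)
    return crossings / max(len(frame) - 1, 1)

def segment(samples, frame_len=160, zcr_threshold=0.15):
    # Label each frame, then merge consecutive frames with the same label.
    labels = []
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[i:i + frame_len]
        labels.append("tone" if zero_crossing_rate(frame) < zcr_threshold else "noise")
    segments = []
    for lab in labels:
        if segments and segments[-1][0] == lab:
            segments[-1][1] += 1
        else:
            segments.append([lab, 1])
    return [(lab, n) for lab, n in segments]

# Synthetic stream: a low-frequency tone followed by a rapidly
# oscillating, noise-like signal with a much higher zero-crossing rate.
tone = [math.sin(2 * math.pi * 200 * t / 8000) for t in range(1600)]
noise = [math.sin(12345.678 * t) for t in range(1600)]
stream = tone + noise
print(segment(stream))  # → [('tone', 10), ('noise', 10)]
```

A production segmenter would use far richer acoustic features and statistical models, but the chunk-and-merge structure is the same.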
Speech Recognition
On all audio segments where humans speak, we can run Speech Recognition to give us a rough transcript of what has been said.
Our systems are based on Hidden Markov Models (HMMs) which generate best-of-breed results on continuous speech as typically
found in broadcast monitoring applications.
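The decoding step of an HMM-based recogniser can be illustrated with the classic Viterbi algorithm; the states, observation symbols and probabilities below are toy values chosen for the example, not real acoustic models.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    # Dynamic programming over time: V[t][s] is the probability of the
    # best state sequence ending in state s after observing obs[:t+1].
    V = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            prob, prev = max(
                (V[t - 1][p] * trans_p[p][s] * emit_p[s][obs[t]], p) for p in states
            )
            V[t][s] = prob
            back[t][s] = prev
    # Trace back the best path from the most probable final state.
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path))

# Toy model: two phone-like states emitting coarse acoustic symbols.
states = ("vowel", "consonant")
start_p = {"vowel": 0.5, "consonant": 0.5}
trans_p = {"vowel": {"vowel": 0.7, "consonant": 0.3},
           "consonant": {"vowel": 0.4, "consonant": 0.6}}
emit_p = {"vowel": {"low": 0.8, "high": 0.2},
          "consonant": {"low": 0.1, "high": 0.9}}
print(viterbi(["low", "low", "high"], states, start_p, trans_p, emit_p))
# → ['vowel', 'vowel', 'consonant']
```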
Speaker Recognition and Speaker Clustering
Audio segments can be clustered according to the audio fingerprints of the persons speaking.
This allows us to analyse an interview and identify which person is speaking in each segment.
If you have a voice fingerprint of a person, you can use Speaker Recognition to label audio segments with
person names.
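One way to sketch speaker clustering (hypothetical fingerprint vectors and threshold, not our actual voice-fingerprint representation): greedily attach each segment to the most similar existing cluster centroid, measured by cosine similarity, or open a new cluster when nothing is similar enough.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def cluster_segments(fingerprints, threshold=0.9):
    # Greedy clustering: attach each segment to the closest existing
    # cluster centroid, or open a new cluster if none is similar enough.
    clusters = []  # each: {"centroid": vector, "members": [segment indices]}
    for i, fp in enumerate(fingerprints):
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(fp, c["centroid"])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append({"centroid": list(fp), "members": [i]})
        else:
            # Update the running mean of the cluster centroid.
            n = len(best["members"])
            best["centroid"] = [(cx * n + x) / (n + 1)
                                for cx, x in zip(best["centroid"], fp)]
            best["members"].append(i)
    return [c["members"] for c in clusters]

# Toy fingerprints: segments 0 and 2 resemble one voice, 1 and 3 another.
prints = [[1.0, 0.1], [0.1, 1.0], [0.9, 0.2], [0.2, 0.9]]
print(cluster_segments(prints))  # → [[0, 2], [1, 3]]
```

With a labelled reference fingerprint per person, the same similarity test turns clustering into recognition: a cluster inherits the name of the closest known voice.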
Named Entity Detection
Subjects of news stories are usually Persons, Locations and Organisations. Identifying the names of these entities
in the text stream allows us to do a first-level semantic analysis and gives us a powerful means of focussing our search
results.
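The simplest illustration of entity labelling is a gazetteer lookup; the names below are invented examples, and a real detector would use statistical models rather than fixed word lists.

```python
# Toy gazetteer; a production system learns entity patterns from
# annotated text instead of enumerating names.
GAZETTEER = {
    "Vienna": "Location",
    "Austria": "Location",
    "Reuters": "Organisation",
    "Angela Merkel": "Person",
}

def tag_entities(text):
    # Toy exact-match scan: report each gazetteer name found in the text,
    # ordered by its position in the stream.
    entities = []
    for name, label in GAZETTEER.items():
        pos = text.find(name)
        if pos != -1:
            entities.append((name, label, pos))
    return sorted(entities, key=lambda e: e[2])

sentence = "Angela Merkel met Reuters journalists in Vienna."
print(tag_entities(sentence))
```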
Story Segmentation and Topic Detection
As in Audio Segmentation, a continuous stream becomes more manageable once we have the ability to chunk it into
segments.
On textual data we can use a semantic analysis tool to find boundaries between stories in the text.
A story is defined as a region of text which is about a particular topic.
Topic detection and story segmentation work hand-in-hand to define stories and give those stories a semantic label called a topic.
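The intuition behind lexical story segmentation can be sketched in the style of TextTiling (the sentences, stop-word list and threshold are toy values for illustration): place a boundary wherever word overlap between neighbouring sentences drops.

```python
def words(sentence, stop={"the", "a", "in", "of", "and", "is", "to"}):
    # Content words of a sentence, ignoring case, punctuation and stop words.
    return {w.lower().strip(".,") for w in sentence.split()} - stop

def segment_stories(sentences, threshold=0.1):
    # Start a new story wherever the lexical overlap (Jaccard similarity)
    # between neighbouring sentences falls below the threshold.
    stories, current = [], [sentences[0]]
    for prev, cur in zip(sentences, sentences[1:]):
        overlap = len(words(prev) & words(cur)) / max(len(words(prev) | words(cur)), 1)
        if overlap < threshold:
            stories.append(current)
            current = []
        current.append(cur)
    stories.append(current)
    return stories

news = [
    "The election results were announced in the capital.",
    "The winning party celebrated the election victory.",
    "Heavy storms hit the coast overnight.",
    "Flooding from the storms closed several roads.",
]
print([len(s) for s in segment_stories(news)])  # → [2, 2]
```

The same cohesion signal that locates boundaries also suggests a topic label: the words shared within a story characterise what it is about.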
Multi-Language Approach
Our operations are worldwide and span many languages, so our technologies need to keep up with this requirement.
Algorithms that work only in English are not sufficient for us.
Whatever we develop, we keep in mind that there are many languages out there, and all developments
are immediately scrutinized for applicability across the languages in our portfolio.
Our language know-how includes, but is not limited to, English, French, German, Spanish, Arabic and Polish.
With all the technology and know-how in-house, we are capable of developing support for virtually any language your business requires.
Real Time
All our systems are engineered to work in real time.
Using know-how from real-time systems research, our systems deliver information within less
than 30 seconds from the time a keyword is aired to the moment we send out an alert to our users.
Keep in mind that one of the characteristics of spoken language is that we need to wait for a sentence to complete
before we can analyse it.
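That sentence-completion wait can be sketched as a small streaming loop (the keyword, sentence and boundary rule are illustrative, not our alerting pipeline): words are buffered until a sentence boundary arrives, and only then is the complete sentence checked against the watch list.

```python
def keyword_alerts(word_stream, keywords):
    # Buffer incoming words until a sentence boundary, then check the
    # complete sentence for watched keywords. The buffering time is an
    # unavoidable part of the end-to-end alert latency.
    buffer = []
    for word in word_stream:
        buffer.append(word)
        if word.endswith((".", "!", "?")):
            sentence = " ".join(buffer)
            hits = [k for k in keywords if k.lower() in sentence.lower()]
            if hits:
                yield (hits, sentence)
            buffer = []

stream = "The markets opened higher today . Shares of Acme rose sharply .".split()
for hits, sentence in keyword_alerts(stream, {"Acme"}):
    print(hits, "->", sentence)
```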