They likely use Information Extraction techniques for this.
Here is a demo of Stanford’s SUTime tool:
http://nlp.stanford.edu:8080/sutime/process
You would extract attributes (features) for each n-gram (a run of consecutive words) in a document, for example:
- numberOfLetters
- numberOfSymbols
- length
- previousWord
- nextWord
- nextWordNumberOfSymbols
…
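As a rough illustration, the attributes listed above could be computed for single words (1-grams) along these lines. This is only a sketch: the function name `extract_features`, the whitespace tokenization, and the choice to count ASCII punctuation as "symbols" are assumptions, not something SUTime or any particular tool does.

```python
import string

def count_symbols(word):
    # Assumption: "symbols" means ASCII punctuation characters.
    return sum(ch in string.punctuation for ch in word)

def extract_features(tokens, i):
    """Return a feature dict for the token at position i."""
    word = tokens[i]
    prev_word = tokens[i - 1] if i > 0 else ""
    next_word = tokens[i + 1] if i + 1 < len(tokens) else ""
    return {
        "numberOfLetters": sum(ch.isalpha() for ch in word),
        "numberOfSymbols": count_symbols(word),
        "length": len(word),
        "previousWord": prev_word,
        "nextWord": next_word,
        "nextWordNumberOfSymbols": count_symbols(next_word),
    }

# Naive whitespace tokenization, good enough for a demo.
tokens = "Wed Feb. 29th".split()
features = extract_features(tokens, 1)  # features for "Feb."
```

For "Feb." this gives 3 letters, 1 symbol, and length 4, with "Wed" and "29th" as neighbours.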
Then you would pick a classification algorithm and feed it positive and negative examples:
Observation  nLetters  nSymbols  length  prevWord   nextWord  isPartOfDate
"Feb."       3         1         4       "Wed"      "29th"    TRUE
"DEC"        3         0         3       "company"  "went"    FALSE
...
You might get away with 50 examples of each, but the more the merrier. The algorithm then learns from those examples and can apply what it learned to examples it hasn’t seen before.
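To make the "feed it examples" step concrete, here is a minimal sketch of one possible classifier: a categorical naive Bayes written from scratch. The choice of algorithm is an assumption (any classifier would do), the class name `NaiveBayes` is made up, and only the first and third training rows come from the table above; the other two rows are invented toy data.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Tiny categorical naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, rows, labels):
        self.label_counts = Counter(labels)
        self.value_counts = defaultdict(Counter)  # (label, feature) -> value counts
        self.vocab = defaultdict(set)             # feature -> values seen in training
        for row, label in zip(rows, labels):
            for feat, val in row.items():
                self.value_counts[(label, feat)][val] += 1
                self.vocab[feat].add(val)
        return self

    def predict(self, row):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            score = math.log(n / total)  # log prior
            for feat, val in row.items():
                count = self.value_counts[(label, feat)][val]
                # The extra +1 in the denominator leaves room for unseen values.
                score += math.log((count + 1) / (n + len(self.vocab[feat]) + 1))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

def obs(n_letters, n_symbols, length, prev_word, next_word):
    return {"nLetters": n_letters, "nSymbols": n_symbols, "length": length,
            "prevWord": prev_word, "nextWord": next_word}

rows = [
    obs(3, 1, 4, "Wed", "29th"),      # "Feb." from the table
    obs(3, 1, 4, "Tue", "3rd"),       # invented positive example
    obs(3, 0, 3, "company", "went"),  # "DEC" from the table
    obs(3, 0, 3, "and", "report"),    # invented negative example
]
labels = [True, True, False, False]

model = NaiveBayes().fit(rows, labels)
unseen_date = model.predict(obs(3, 1, 4, "Mon", "5th"))
unseen_other = model.predict(obs(3, 0, 3, "the", "merger"))
```

With four training rows the model is only picking up on `nSymbols` and `length`, which is why a real system needs far more examples and richer features.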
It might learn rules such as:
- if the previous word contains only letters and maybe periods…
- and the current word is in “february”, “mar.”, “the” …
- and the next word is in “twelfth”, any_number …
- then it is part of a date
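Written out by hand, a rule like the one above might look roughly like the following. This is a sketch for intuition only: a trained model would not normally produce readable code like this. The word sets contain just the items named in the list (the elided entries are left out), and the function name `looks_like_date` is hypothetical.

```python
import re

# Only the entries named in the rule above; the real lists would be longer.
CURRENT_WORDS = {"february", "mar.", "the"}
NEXT_WORDS = {"twelfth"}

def looks_like_date(prev_word, word, next_word):
    # Previous word: only letters and, optionally, periods.
    prev_ok = re.fullmatch(r"[A-Za-z.]+", prev_word) is not None
    # Current word: in the learned word list (case-insensitive).
    word_ok = word.lower() in CURRENT_WORDS
    # Next word: in the learned list, or any plain number such as "12".
    next_ok = next_word.lower() in NEXT_WORDS or next_word.isdigit()
    return prev_ok and word_ok and next_ok

hit = looks_like_date("Wed", "February", "12")
miss = looks_like_date("company", "DEC", "went")
```

Note that this version treats only plain digits as "any_number", so an ordinal like "29th" would need its own pattern.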
Here is a decent video by a Google engineer on the subject